非專業(yè)人員可以學(xué)會的
程序語言
桌面數(shù)據(jù)處理和分析的
工具集
操作簡單、函數(shù)豐富
批量和重復(fù)性任務(wù)很和繁瑣
對復(fù)雜的運(yùn)算支持不足
使用簡單、界面流暢美觀
計(jì)算功能單一,只能做死板的多維分析
完整的編程能力,天然內(nèi)置于Excel
對表格計(jì)算支持太差,簡單任務(wù)也要大段代碼
支持表格計(jì)算,交互性好,培訓(xùn)班遍地
表格計(jì)算不符合自然思維,太多種類表格需要掌握,難度超出非專業(yè)人員能力
df = pd.read_csv("../login_data.csv")
df["ts"] = pd.to_datetime(df["ts"]).dt.date
grouped = df.groupby("userid")
aligned_dates = pd.date_range(start=df["ts"].min(), end=df["ts"].max(), freq='D')
user_date_wether_con3days = []
for uid, group in grouped:
group = group.drop_duplicates('ts')
aligned_group = group.set_index("ts").reindex(aligned_dates)
consecutive_logins = aligned_group.rolling(window=7)
n = 0
date_wether_con3days = []
for r in consecutive_logins:
n += 1
if n<7:
continue
else:
ds = r['userid'].isna().cumsum()
cont_login_times = r.groupby(ds).userid.count().max()
wether_cont3days = 1 if cont_login_times>=3 else 0
date_wether_con3days.append(wether_cont3days)
user_date_wether_con3days.append(date_wether_con3days)
arr = np.array(user_date_wether_con3days)
day7_cont3num = np.sum(arr,axis=0)
result = pd.DataFrame({'dt':aligned_dates[6:],'cont3_num':day7_cont3num})
真正的記錄集合,有交互性
環(huán)境復(fù)雜,不能直接處理桌面文件,非專業(yè)人員難以使用
過程式運(yùn)算復(fù)雜度太高,非專業(yè)人員難以掌握
WITH all_dates AS ( SELECT DISTINCT TRUNC(ts) AS login_date FROM login_data), user_login_counts AS ( SELECT userid, TRUNC(ts) AS login_date, (CASE WHEN COUNT(*)>=1 THEN 1 ELSE 0 END) AS login_count FROM login_data GROUP BY userid, TRUNC(ts)), whether_login AS ( SELECT u.userid, ad.login_date, NVL(ulc.login_count, 0) AS login_count FROM all_dates ad CROSS JOIN ( SELECT DISTINCT userid FROM login_data) u LEFT JOIN user_login_counts ulc ON u.userid = ulc.userid AND ad.login_date = ulc.login_date ORDER BY u.userid, ad.login_date), whether_login_rn AS ( SELECT userid,login_date,login_count,ROWNUM AS rn FROM whether_login), whether_eq AS( SELECT userid,login_date,login_count,rn, (CASE WHEN LAG(login_count,1) OVER (ORDER BY rn)= login_count AND login_count =1 AND LAG(userid,1) OVER (ORDER BY rn)=userid THEN 0 ELSE 1 END) AS wether_e FROM whether_login_rn ), numbered_sequence AS ( SELECT userid,login_date,login_count,rn, wether_e, SUM(wether_e) OVER (ORDER BY rn) AS lab FROM whether_eq), consecutive_logins_num AS ( SELECT userid,login_date,login_count,rn, wether_e,lab, (SELECT (CASE WHEN max(COUNT(*))<3 THEN 0 ELSE 1 END) FROM numbered_sequence b WHERE b.rn BETWEEN a.rn - 6 AND a.rn AND b.userid=a.userid GROUP BY b. lab) AS cnt FROM numbered_sequence a) SELECT login_date,SUM(cnt) AS cont3_num FROM consecutive_logins_num WHERE login_date>=(SELECT MIN(login_date) FROM all_dates)+6 GROUP BY login_date ORDER BY login_date;
| A | |
| 1 | =file("login_data.csv").import@tc() |
| 2 | =periods(date(A1.ts),date(A1.m(-1).ts)) |
| 3 | =A1.group(userid).(~.align(A2,date(ts)).(if(#<7,null,(cnt=~[-6:0].group@i(!~).max(count(~)),if(cnt>=3,1,0))))) |
| 4 | =msum(A3).~.new(A2(#):dt,int(~):cont3_num).to(7,) |
1. 按車型、月份、年份分組并計(jì)數(shù)
2. 計(jì)算同期比:if(Model==Model[-1] && Month==Month[-1],SalesVol/SalesVol[-1],null)
3. 重新按車型、年、月排序
4. 計(jì)算比上期: if(Model==Model[-1] ,SalesVol/SalesVol[-1],null)
1. 讀入3個(gè)表格
2. 用standard關(guān)聯(lián)absent和performance
3. 按照公式計(jì)算工資
| A | B | C | |
| 1 | =file("data.xlsx").xlsimport@t() | ||
| 2 | for A1.fname() | =A1.field(A2) | |
| 3 | =B2.sum() | =B2.sum(int(~)) | |
| 4 | =if(B3==C3,B2.mode(),B2.avg()) | ||
| 5 | =B2.(if(~,~,B4)) | >A1.field(A2,B5) | |
| 6 | >file("dataNew.xlsx").xlsexport@t(A1) | ||
| A | B | C | |
| 1 | [ID,Name,Sex,Postion,Birthday,Phone,Address,PostCode] | ||
| 2 | [C1,C2,F2,C3,C4,D5,C7,C8] | ||
| 3 | =directory@p("data/*.xlsx") | ||
| 4 | for A3 | =file(A4).xlsopen() | =B2.(B4.xlscell(~)) |
| 5 | =@|C4 | ||
| 6 | =create(${A1.concat@c()}).record(B5) | ||
| 7 | >file("all.xlsx").xlsexport@t(A6) | ||