Re: Reply: Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-16 Thread Enrico Minack
of columns to a new data frame. It seems that there is no direct API to do this. - Original message - From: Sean Owen To: ckgppl_...@sina.cn Cc: user Subject: Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame Date: 2022-03-16 11:55 Are you just

Reply: Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-16 Thread ckgppl_yan
columns and one specific column after groupby the spark data frame Date: 2022-03-16 11:55 Are you just trying to avoid writing the function call 30 times? Just put this in a loop over all the columns instead, which adds a new corr col every time to a list. On Tue, Mar 15, 2022, 10:30 PM wrote

Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-15 Thread Sean Owen
Are you just trying to avoid writing the function call 30 times? Just put this in a loop over all the columns instead, which adds a new corr col every time to a list.

On Tue, Mar 15, 2022, 10:30 PM wrote:
> Hi all,
>
> I am stuck at a correlation calculation problem. I have a dataframe like

calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-15 Thread ckgppl_yan
Hi all, I am stuck at a correlation calculation problem. I have a dataframe like below:

groupid  datacol1  datacol2  datacol3  datacol*  corr_col
1        1         2         3         4         5
1        2         3         4         6         5
2        4         2         1         7         5
2        8         9         3         2         5
3        7         1         2         3         5
3        3         5         3         1         5

I want to calculate the correlation between all datacol columns and corr_col column by each