Reply: Re: Build SPARK from source with SBT failed

2023-03-07 Thread ckgppl_yan
No, I haven't installed Apple Developer Tools. I installed Zulu OpenJDK 11.0.17 manually. So do I need to install Apple Developer Tools? - Original message - From: Sean Owen To: ckgppl_...@sina.cn Cc: user Subject: Re: Build SPARK from source with SBT failed Date: 2023-03-07 20:58 This says you don't have

Reply: Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread ckgppl_yan
Oh, I get it. I thought Spark could pick up the local Scala version. - Original message - From: Sean Owen To: ckgppl_...@sina.cn Cc: user Subject: Re: Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2 Date: 2022-08-26 21:08 Spark is built with and ships with a copy of Scala. It doesn't use

Spark got incorrect scala version while using spark 3.2.1 and spark 3.2.2

2022-08-26 Thread ckgppl_yan
Hi all, I found something strange. I ran the prebuilt Spark 3.2.1 in local mode. The Scala version installed on my OS is 2.13.7. But when I ran spark-submit and checked the Spark UI, the web page showed my Scala version as 2.13.5. I also used spark-shell, and it showed Scala version 2.13.5 as well. Then I tried

Reply: Re: Reply: Re: Reply: Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-16 Thread ckgppl_yan
Thanks, Jayesh and all. I finally got the correlation data frame by using agg with a list of functions. I think passing a list of functions that each generate a column should have a more detailed description. Liang - Original message - From: "Lalwani, Jayesh" To: "ckgppl_...@sina.cn", Enrico Minack, Sean Owen

Reply: Re: Reply: Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-16 Thread ckgppl_yan
Thanks, Enrico. I just found that I need to group the data frame first and then calculate the correlation, so I will get a list of data frames, not columns. So I used the following solution: use the following code to create a mutable data frame df_all. I used the first datacol to calculate the correlation.

Reply: Re: calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-16 Thread ckgppl_yan
Thanks, Sean. I modified the code and have generated a list of columns. I am now working on converting the list of columns into a new data frame. It seems that there is no direct API to do this. - Original message - From: Sean Owen To: ckgppl_...@sina.cn Cc: user Subject: Re: calculate correlation between multiple

calculate correlation between multiple columns and one specific column after groupby the spark data frame

2022-03-15 Thread ckgppl_yan
Hi all, I am stuck on a correlation calculation problem. I have a data frame like the one below:

groupid  datacol1  datacol2  datacol3  datacol*  corr_col
1        1         2         3         4         5
1        2         3         4         6         5
2        4         2         1         7         5
2        8         9         3         2         5
3        7         1         2         3         5
3        3         5         3         1         5

I want to calculate the correlation between all datacol columns and corr_col column by each
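
To make the question concrete, here is a minimal plain-Python sketch of the result being asked for: one Pearson correlation per group and per data column against a single target column. It is not Spark code and not the poster's solution. The helper names (`pearson`, `grouped_corr`) and the sample rows are hypothetical, chosen only for illustration. In Spark, the same shape of result would presumably come from grouping and then aggregating with one `corr` expression per data column, as the later replies in the thread describe.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists.

    Returns None when either list has zero variance, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def grouped_corr(rows, group_key, target, data_cols):
    """Correlate every column in data_cols with target, separately per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        group: {
            col: pearson([r[col] for r in members],
                         [r[target] for r in members])
            for col in data_cols
        }
        for group, members in groups.items()
    }


# Hypothetical sample rows (not the data from the original post).
rows = [
    {"groupid": 1, "datacol1": 1, "datacol2": 6, "corr_col": 2},
    {"groupid": 1, "datacol1": 2, "datacol2": 4, "corr_col": 4},
    {"groupid": 1, "datacol1": 3, "datacol2": 2, "corr_col": 6},
    {"groupid": 2, "datacol1": 5, "datacol2": 1, "corr_col": 3},
    {"groupid": 2, "datacol1": 7, "datacol2": 1, "corr_col": 1},
]

result = grouped_corr(rows, "groupid", "corr_col", ["datacol1", "datacol2"])
for group, correlations in sorted(result.items()):
    print(group, correlations)
```

Note that a column which is constant within a group has no defined correlation, which is why the sketch returns None in that case rather than a number.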