Faisal created SPARK-19519:
------------------------------

             Summary: Groupby for multiple columns not working
                 Key: SPARK-19519
                 URL: https://issues.apache.org/jira/browse/SPARK-19519
             Project: Spark
          Issue Type: Bug
          Components: Java API
    Affects Versions: 1.5.0
            Reporter: Faisal
            Priority: Blocker

DataFrame joinModCtypeAsgns = modCtypeAsgnsDf.as("mod")
    .join(moduleCodeDf.as("mc"),
        moduleCodeDf.col("EntityCode").equalTo(modCtypeAsgnsDf.col("charValCode")))
    .join(dictDfCharCode.as("dc"),
        dictDfCharCode.col("EntityCode").equalTo(modCtypeAsgnsDf.col("charCode")))
    .join(dictDfIsAChar,
        dictDfIsAChar.col("EntityCode").equalTo(modCtypeAsgnsDf.col("charCode")));

joinModCtypeAsgns.select(
        col("mc.propVal").as("mcaModCode"),
        col("dc.propVal").as("mcaCtypeCode"),
        max(col("mod.updatedDate")).as("mcaLastChangedDate"),
        coalesce(
            max(when(col("mndtryInd").equalTo("Y"), "Y")),
            max(when(col("mndtryInd").equalTo("N"), "N")),
            max(col("mndtryInd"))).as("mcaMandatoryFlg"),
        lit("N").as("mcaLockedFlg"),
        coalesce(
            max(when(col("fldColInd").equalTo("Y"), "F")),
            max(when(col("fldColInd").equalTo("N"), "I")),
            max(col("fldColInd"))).as("mcaFieldCollectionFlg")
    )
    .groupBy(col("mc.propVal"), col("dc.propVal"))
    .agg(col("mc.propVal"), col("dc.propVal"), max(col("mod.updatedDate")));

This throws the following exception:

User class threw exception: org.apache.spark.sql.AnalysisException: expression 'propVal' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() if you don't care which value you get.