[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Attachment: (was: testDF.parquet)

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Jorge Machado
> Priority: Major
> Fix For: 2.3.0
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
> The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.
>
> The code below is not 100% the same, but I think there is really a bug there:
>
> {code:java}
> BigDecimal[] objects = new BigDecimal[]{
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D)};
> Row dataRow = new GenericRow(objects);
> Row dataRow2 = new GenericRow(objects);
> StructType structType = new StructType()
>     .add("id1", DataTypes.createDecimalType(38,10), true)
>     .add("id2", DataTypes.createDecimalType(38,10), true)
>     .add("id3", DataTypes.createDecimalType(38,10), true)
>     .add("id4", DataTypes.createDecimalType(38,10), true);
> final Dataset dataFrame = sparkSession.createDataFrame(Arrays.asList(dataRow, dataRow2), structType);
> System.out.println(dataFrame.schema());
> dataFrame.show();
> final Dataset df1 = dataFrame.groupBy("id1", "id2")
>     .agg(mean("id3").alias("projection_factor"));
> df1.show();
> final Dataset df2 = df1
>     .groupBy("id1")
>     .agg(max("projection_factor"));
> df2.show();
> {code}
>
> df2 should have:
> {code:java}
> +------------+----------------------+
> |         id1|max(projection_factor)|
> +------------+----------------------+
> |3.5714285714|          3.5714285714|
> +------------+----------------------+
> {code}
> instead it returns:
> {code:java}
> +------------+----------------------+
> |         id1|max(projection_factor)|
> +------------+----------------------+
> |3.5714285714|      0.00035714285714|
> +------------+----------------------+
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
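The attached testDF.parquet is not reproduced here, so below is a minimal, self-contained Scala sketch of the same two-step aggregation on a DecimalType(38,10) column. The schema, sample values, and session setup are assumptions chosen only to mirror the report, not taken from the attachment.

{code:java}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{max, mean, min}
import org.apache.spark.sql.types._

object Spark24401ReproSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-24401 repro sketch")
      .getOrCreate()

    // Hypothetical stand-in for the attached parquet file: a few rows with a
    // DecimalType(38, 10) measure, using the same value as the Java example above.
    val schema = StructType(Seq(
      StructField("id1", StringType),
      StructField("id2", StringType),
      StructField("start_date", StringType),
      StructField("projection_factor", DecimalType(38, 10))
    ))
    val rows = Seq(
      Row("a", "b", "2018-01-01", BigDecimal("3.5714285714")),
      Row("a", "b", "2018-01-02", BigDecimal("3.5714285714"))
    )
    val fact_df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

    // Same two-step aggregation as in the report: mean first, then max/min on the result.
    val first_agg = fact_df.groupBy("id1", "id2", "start_date")
      .agg(mean("projection_factor").alias("projection_factor"))
    first_agg.show()

    val second_agg = first_agg.groupBy("id1", "id2")
      .agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
    second_agg.show() // on an affected version, maxf/minf reportedly come back with a wrong scale

    spark.stop()
  }
}
{code}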
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Affects Version/s: (was: 2.3.0)
    Fix Version/s: 2.3.0

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Jorge Machado
> Priority: Major
> Fix For: 2.3.0
>
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
> The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.
>
> The code below is not 100% the same, but I think there is really a bug there:
>
> {code:java}
> BigDecimal[] objects = new BigDecimal[]{
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D)};
> Row dataRow = new GenericRow(objects);
> Row dataRow2 = new GenericRow(objects);
> StructType structType = new StructType()
>     .add("id1", DataTypes.createDecimalType(38,10), true)
>     .add("id2", DataTypes.createDecimalType(38,10), true)
>     .add("id3", DataTypes.createDecimalType(38,10), true)
>     .add("id4", DataTypes.createDecimalType(38,10), true);
> final Dataset dataFrame = sparkSession.createDataFrame(Arrays.asList(dataRow, dataRow2), structType);
> System.out.println(dataFrame.schema());
> dataFrame.show();
> final Dataset df1 = dataFrame.groupBy("id1", "id2")
>     .agg(mean("id3").alias("projection_factor"));
> df1.show();
> final Dataset df2 = df1
>     .groupBy("id1")
>     .agg(max("projection_factor"));
> df2.show();
> {code}
>
> df2 should have:
> {code:java}
> +------------+----------------------+
> |         id1|max(projection_factor)|
> +------------+----------------------+
> |3.5714285714|          3.5714285714|
> +------------+----------------------+
> {code}
> instead it returns:
> {code:java}
> +------------+----------------------+
> |         id1|max(projection_factor)|
> +------------+----------------------+
> |3.5714285714|      0.00035714285714|
> +------------+----------------------+
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
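If the decimal handling in the second aggregation really is the culprit, one diagnostic sketch worth trying (an assumption, not verified against the attached data) is to cast the averaged column to double before the second groupBy, so that max/min no longer run on a decimal column. It reuses first_agg from the reproduction above.

{code:java}
import org.apache.spark.sql.functions.{col, max, min}

// Assumption: casting to DoubleType sidesteps the suspected decimal-scale issue in max/min,
// at the cost of exact decimal semantics, so this is only a diagnostic aid, not a fix.
val second_agg_double = first_agg
  .withColumn("projection_factor", col("projection_factor").cast("double"))
  .groupBy("id1", "id2")
  .agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg_double.show()
{code}

If the double-based maxf/minf come back around 3.57 while the decimal version still returns the huge or tiny values, that would point at the decimal aggregation path specifically.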
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Description: 
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.

The code below is not 100% the same, but I think there is really a bug there:

{code:java}
BigDecimal[] objects = new BigDecimal[]{
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D)};
Row dataRow = new GenericRow(objects);
Row dataRow2 = new GenericRow(objects);
StructType structType = new StructType()
    .add("id1", DataTypes.createDecimalType(38,10), true)
    .add("id2", DataTypes.createDecimalType(38,10), true)
    .add("id3", DataTypes.createDecimalType(38,10), true)
    .add("id4", DataTypes.createDecimalType(38,10), true);
final Dataset dataFrame = sparkSession.createDataFrame(Arrays.asList(dataRow, dataRow2), structType);
System.out.println(dataFrame.schema());
dataFrame.show();
final Dataset df1 = dataFrame.groupBy("id1", "id2")
    .agg(mean("id3").alias("projection_factor"));
df1.show();
final Dataset df2 = df1
    .groupBy("id1")
    .agg(max("projection_factor"));
df2.show();
{code}

df2 should have:
{code:java}
+------------+----------------------+
|         id1|max(projection_factor)|
+------------+----------------------+
|3.5714285714|          3.5714285714|
+------------+----------------------+
{code}
instead it returns:
{code:java}
+------------+----------------------+
|         id1|max(projection_factor)|
+------------+----------------------+
|3.5714285714|      0.00035714285714|
+------------+----------------------+
{code}

  was:
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.

The code below is not 100% the same, but I think there is really a bug there:

{code:java}
BigDecimal[] objects = new BigDecimal[]{
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D)};
Row dataRow = new GenericRow(objects);
Row dataRow2 = new GenericRow(objects);
StructType structType = new StructType()
    .add("id1", DataTypes.createDecimalType(38,10), true)
    .add("id2", DataTypes.createDecimalType(38,10), true)
    .add("id3", DataTypes.createDecimalType(38,10), true)
    .add("id4", DataTypes.createDecimalType(38,10), true);
final Dataset dataFrame = sparkSession.createDataFrame(Arrays.asList(dataRow, dataRow2), structType);
System.out.println(dataFrame.schema());
dataFrame.show();
final Dataset df1 = dataFrame.groupBy("id1", "id2")
    .agg(mean("id3").alias("projection_factor"));
df1.show();
final Dataset df2 = df1
    .groupBy("id1")
    .agg(max("projection_factor"));
df2.show();
{code}

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2",
>
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Description: 
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.

The code below is not 100% the same, but I think there is really a bug there:

{code:java}
BigDecimal[] objects = new BigDecimal[]{
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D),
    new BigDecimal(3.5714285714D)};
Row dataRow = new GenericRow(objects);
Row dataRow2 = new GenericRow(objects);
StructType structType = new StructType()
    .add("id1", DataTypes.createDecimalType(38,10), true)
    .add("id2", DataTypes.createDecimalType(38,10), true)
    .add("id3", DataTypes.createDecimalType(38,10), true)
    .add("id4", DataTypes.createDecimalType(38,10), true);
final Dataset dataFrame = sparkSession.createDataFrame(Arrays.asList(dataRow, dataRow2), structType);
System.out.println(dataFrame.schema());
dataFrame.show();
final Dataset df1 = dataFrame.groupBy("id1", "id2")
    .agg(mean("id3").alias("projection_factor"));
df1.show();
final Dataset df2 = df1
    .groupBy("id1")
    .agg(max("projection_factor"));
df2.show();
{code}

  was:
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
> The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.
>
>
> The code below is not 100% the same, but I think there is really a bug there:
>
> {code:java}
> BigDecimal[] objects = new BigDecimal[]{
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D),
>     new BigDecimal(3.5714285714D)};
> Row dataRow = new GenericRow(objects);
> Row dataRow2 = new GenericRow(objects);
> StructType structType = new StructType()
>     .add("id1", DataTypes.createDecimalType(38,10), true)
>     .add("id2", DataTypes.createDecimalType(38,10), true)
>     .add("id3", DataTypes.createDecimalType(38,10), true)
>     .add("id4", DataTypes.createDecimalType(38,10), true);
> final Dataset dataFrame = sparkSession.createDataFrame(Arrays.asList(dataRow, dataRow2), structType);
> System.out.println(dataFrame.schema());
> dataFrame.show();
> final Dataset df1 = dataFrame.groupBy("id1", "id2")
>     .agg(
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Description: 
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.

  was:
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 5. The result should not be bigger than 5, but we get 265820543091454 back.

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
> The dataset has circa 800 rows and projection_factor has values from 0 to 100. The result should not be bigger than 5, but we get 265820543091454 back.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Description: 
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
The dataset has circa 800 rows and projection_factor has values from 0 to 5. The result should not be bigger than 5, but we get 265820543091454 back.

  was:
Hi,
I think I found a really ugly bug in Spark when performing aggregations with Decimals.
To reproduce:

{code:java}
val df = spark.read.parquet("attached file")
val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
first_agg.show
val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
second_agg.show
{code}
The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.
> The dataset has circa 800 rows and projection_factor has values from 0 to 5. The result should not be bigger than 5, but we get 265820543091454 back.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Attachment: testDF.parquet

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Attachment: (was: testDF.parquet)

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-24401) Aggregate on Decimal Types does not work
[ https://issues.apache.org/jira/browse/SPARK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Machado updated SPARK-24401:
----------------------------------
    Attachment: testDF.parquet

> Aggregate on Decimal Types does not work
> ----------------------------------------
>
> Key: SPARK-24401
> URL: https://issues.apache.org/jira/browse/SPARK-24401
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Jorge Machado
> Priority: Major
> Attachments: testDF.parquet
>
>
> Hi,
> I think I found a really ugly bug in Spark when performing aggregations with Decimals.
> To reproduce:
>
> {code:java}
> val df = spark.read.parquet("attached file")
> val first_agg = fact_df.groupBy("id1", "id2", "start_date").agg(mean("projection_factor").alias("projection_factor"))
> first_agg.show
> val second_agg = first_agg.groupBy("id1", "id2").agg(max("projection_factor").alias("maxf"), min("projection_factor").alias("minf"))
> second_agg.show
> {code}
> The first aggregation works fine; the second aggregation seems to be summing instead of taking the max value. I tried with Spark 2.2.0 and 2.3.0, same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org