[ https://issues.apache.org/jira/browse/SPARK-38823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-38823. ---------------------------------- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36183 [https://github.com/apache/spark/pull/36183] > Incorrect result of dataset reduceGroups in java > ------------------------------------------------ > > Key: SPARK-38823 > URL: https://issues.apache.org/jira/browse/SPARK-38823 > Project: Spark > Issue Type: Bug > Components: Java API > Affects Versions: 3.3.0, 3.4.0 > Reporter: IKozar > Assignee: Bruce Robbins > Priority: Major > Labels: correctness > Fix For: 3.3.0 > > > {code:java} > @Data > @NoArgsConstructor > @AllArgsConstructor > public static class Item implements Serializable { > private String x; > private String y; > private int z; > public Item addZ(int z) { > return new Item(x, y, this.z + z); > } > } {code} > {code:java} > List<Item> items = List.of( > new Item("X1", "Y1", 1), > new Item("X2", "Y1", 1), > new Item("X1", "Y1", 1), > new Item("X2", "Y1", 1), > new Item("X3", "Y1", 1), > new Item("X1", "Y1", 1), > new Item("X1", "Y2", 1), > new Item("X2", "Y1", 1)); > Dataset<Item> ds = spark.createDataFrame(items, > Item.class).as(Encoders.bean(Item.class)); > ds.groupByKey((MapFunction<Item, Tuple2<String, String>>) item -> > Tuple2.apply(item.getX(), item.getY()), > Encoders.tuple(Encoders.STRING(), Encoders.STRING())) > .reduceGroups((ReduceFunction<Item>) (item1, item2) -> > item1.addZ(item2.getZ())) > .show(10); > {code} > result is > {noformat} > +--------+----------------------------------------------+ > | key|ReduceAggregator(poc.job.JavaSparkReduce$Item)| > +--------+----------------------------------------------+ > |{X1, Y1}| {X2, Y1, 2}|-- expected 3 > |{X2, Y1}| {X2, Y1, 2}|-- expected 3 > |{X1, Y2}| {X2, Y1, 1}| > |{X3, Y1}| {X2, Y1, 1}| > +--------+----------------------------------------------+{noformat} > pay attention that key doesn't mach with value -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org