[ https://issues.apache.org/jira/browse/SPARK-32809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-32809.
----------------------------------
    Resolution: Incomplete

Please write in English, which the community uses to communicate.

> RDD分区数对于计算结果的影响 (Effect of the number of RDD partitions on the computation result)
> ---------------
>
>                 Key: SPARK-32809
>                 URL: https://issues.apache.org/jira/browse/SPARK-32809
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: zhangchenglong
>            Priority: Major
>
> class Exec3 {
>   private val exec: SparkConf = new SparkConf().setMaster("local[1]").setAppName("exec3")
>   private val context = new SparkContext(exec)
>   context.setCheckpointDir("checkPoint")
>
>   /**
>    * Get the total per key.
>    * The desired results are ("苹果",25) ("华为",20),
>    * but in fact I get ("苹果",150) ("华为",20).
>    * When I change the master to local[3], the result is correct.
>    * I want to know what causes this and how to solve it.
>    */
>   @Test
>   def testError(): Unit = {
>     val rdd = context.parallelize(Seq(("苹果", 10), ("苹果", 15), ("华为", 20)))
>     rdd.aggregateByKey(1.0)(
>       seqOp = (zero, price) => price * zero,
>       combOp = (curr, agg) => curr + agg
>     ).collect().foreach(println(_))
>     context.stop()
>   }
> }
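
On the reporter's question of why local[1] and local[3] give different results: a likely explanation is that seqOp multiplies while combOp adds. aggregateByKey applies seqOp to values of the same key within a partition, starting from the zero value, and applies combOp to merge the per-partition results. With one partition both 苹果 records are multiplied together (1.0 * 10 * 15 = 150). With three partitions each record probably sits in its own partition, so only the addition in combOp combines them (10 + 15 = 25). The sketch below simulates this with plain Scala collections and no Spark dependency. simulateAggregateByKey and the partition layouts are assumptions made for illustration and are not Spark APIs.

```scala
// Plain-Scala simulation of the aggregateByKey semantics; no Spark needed.
// simulateAggregateByKey is a hypothetical helper, not part of Spark.
object AggregateByKeySketch {

  def simulateAggregateByKey(
      partitions: Seq[Seq[(String, Int)]],
      zero: Double
  )(
      seqOp: (Double, Int) => Double,
      combOp: (Double, Double) => Double
  ): Map[String, Double] = {
    // Within each partition, fold each key's values starting from `zero`.
    val perPartition: Seq[Map[String, Double]] = partitions.map { part =>
      part.groupBy(_._1).map { case (key, records) =>
        key -> records.map(_._2).foldLeft(zero)(seqOp)
      }
    }
    // Across partitions, merge the partial results of each key with combOp.
    perPartition.flatten.groupBy(_._1).map { case (key, partials) =>
      key -> partials.map(_._2).reduce(combOp)
    }
  }

  val data: Seq[(String, Int)] = Seq(("苹果", 10), ("苹果", 15), ("华为", 20))

  // Assumed layouts: everything in one partition (as with local[1]),
  // or one record per partition (as with local[3] and three records).
  val onePartition: Seq[Seq[(String, Int)]] = Seq(data)
  val threePartitions: Seq[Seq[(String, Int)]] = data.map(record => Seq(record))

  val multiply: (Double, Int) => Double = (acc, price) => acc * price
  val add: (Double, Int) => Double = (acc, price) => acc + price
  val merge: (Double, Double) => Double = (a, b) => a + b

  // The reported combination: zero = 1.0, multiply inside, add across.
  // 苹果 -> 150.0, 华为 -> 20.0
  val reportedOnePartition: Map[String, Double] =
    simulateAggregateByKey(onePartition, 1.0)(multiply, merge)
  // 苹果 -> 25.0, 华为 -> 20.0
  val reportedThreePartitions: Map[String, Double] =
    simulateAggregateByKey(threePartitions, 1.0)(multiply, merge)

  // A consistent sum: zero = 0.0 and addition in both functions.
  // 苹果 -> 25.0, 华为 -> 20.0 for either layout.
  val fixedOnePartition: Map[String, Double] =
    simulateAggregateByKey(onePartition, 0.0)(add, merge)
  val fixedThreePartitions: Map[String, Double] =
    simulateAggregateByKey(threePartitions, 0.0)(add, merge)

  def main(args: Array[String]): Unit = {
    println(reportedOnePartition.toSeq.sortBy(_._1))
    println(reportedThreePartitions.toSeq.sortBy(_._1))
    println(fixedOnePartition.toSeq.sortBy(_._1))
    println(fixedThreePartitions.toSeq.sortBy(_._1))
  }
}
```

If this reading is right, the result in Spark itself should stop depending on the partition count once the zero value is 0.0 and both functions add, for example rdd.aggregateByKey(0.0)(_ + _, _ + _), or rdd.reduceByKey(_ + _) when a Double result is not required.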