[ https://issues.apache.org/jira/browse/SPARK-12470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068693#comment-15068693 ]
Pete Robbins commented on SPARK-12470:
--------------------------------------

I'm fairly sure the code in my PR is correct, but it is causing an ExchangeCoordinatorSuite test to fail, and I'm struggling to see why. The failure is:

    determining the number of reducers: aggregate operator *** FAILED ***
    3 did not equal 2 (ExchangeCoordinatorSuite.scala:316)

Putting some debug into the test, I see that before my change the pre-shuffle partition sizes are 600, 600, 600, 600, 600 and after my change they are 800, 800, 800, 800, 720, but I have no idea why. I'd really appreciate it if anyone with knowledge of this area could a) check my PR and b) help explain the failing test.

> Incorrect calculation of row size in o.a.s.sql.catalyst.expressions.codegen.GenerateUnsafeRowJoiner
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12470
>                 URL: https://issues.apache.org/jira/browse/SPARK-12470
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.2
>            Reporter: Pete Robbins
>            Priority: Minor
>
> While looking into https://issues.apache.org/jira/browse/SPARK-12319 I noticed that the row size is incorrectly calculated.
> The "sizeReduction" value is calculated in words:
>
>     // The number of words we can reduce when we concat two rows together.
>     // The only reduction comes from merging the bitset portion of the two rows, saving 1 word.
>     val sizeReduction = bitset1Words + bitset2Words - outputBitsetWords
>
> but then it is subtracted from the size of the row in bytes:
>
>     | out.pointTo(buf, ${schema1.size + schema2.size}, sizeInBytes - $sizeReduction);
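
[Editor's illustration, not part of the original report: a minimal, self-contained Scala sketch of the words-vs-bytes mismatch described above. UnsafeRow lays rows out in 8-byte words, with one 64-bit bitset word per 64 fields; sizeReduction is a word count, so it must be scaled by 8 before being subtracted from a byte count. The field counts and object/method names here are made up for the example.]

    object RowSizeArithmetic {
      // UnsafeRow is laid out in 8-byte words.
      val WordSizeBytes = 8

      // Null-tracking bitset: one 64-bit word per 64 fields.
      def bitsetWords(numFields: Int): Int = (numFields + 63) / 64

      def main(args: Array[String]): Unit = {
        val fields1 = 3 // hypothetical field counts for the two input rows
        val fields2 = 5
        val bitset1Words = bitsetWords(fields1)
        val bitset2Words = bitsetWords(fields2)
        val outputBitsetWords = bitsetWords(fields1 + fields2)

        // As in the quoted snippet, the saving is computed in WORDS...
        val sizeReductionWords = bitset1Words + bitset2Words - outputBitsetWords

        // ...so before subtracting from a byte count it must be scaled.
        // Subtracting sizeReductionWords directly is the reported bug.
        val sizeReductionBytes = sizeReductionWords * WordSizeBytes

        // Fixed-width rows: bitset words plus one word per field.
        val row1SizeBytes = (bitset1Words + fields1) * WordSizeBytes
        val row2SizeBytes = (bitset2Words + fields2) * WordSizeBytes
        val sizeInBytes = row1SizeBytes + row2SizeBytes

        // 8 fields fit in one bitset word, so the join saves exactly one
        // word: 32 + 48 - 8 = 72 bytes (9 words), not 80 - 1 = 79.
        println(s"joined row size = ${sizeInBytes - sizeReductionBytes} bytes")
      }
    }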