[ https://issues.apache.org/jira/browse/SPARK-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052603#comment-15052603 ]
Adam Roberts commented on SPARK-9858:
-------------------------------------

Modifying the UnsafeRowSerializer to always write and read in little-endian (LE) byte order fixes the problem, so Tungsten features can be fully exploited regardless of platform endianness. I'm not yet sure why only the aggregate functions are impacted; I'd have expected plenty of other test failures.

We can use LittleEndianDataInputStream and LittleEndianDataOutputStream to achieve this; they are part of the same package as ByteStreams. I'll make sure the regular SparkSqlSerializer is OK too.

We're hitting a similar problem with the DatasetAggregatorSuite (instead of 1 we get 9, instead of 2 we get 10, etc.), and I expect the root cause to be the same.

I'll get to work on the pull request, cheers.

> Introduce an ExchangeCoordinator to estimate the number of post-shuffle partitions.
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-9858
>                 URL: https://issues.apache.org/jira/browse/SPARK-9858
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 1.6.0
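
For illustration only, here is a minimal sketch of why pinning the byte order on both the write and the read side removes the dependence on platform endianness. It is not the UnsafeRowSerializer code and it does not use the LittleEndianDataInputStream/LittleEndianDataOutputStream classes named in the comment; it relies only on java.nio.ByteBuffer from the JDK, and the class and method names (EndianSketch, write, read) are hypothetical. It is also not meant to reproduce the specific 1-versus-9 discrepancy seen in the DatasetAggregatorSuite.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, for illustration only.
public class EndianSketch {

    // Serialize a 4-byte int using an explicitly chosen byte order.
    static byte[] write(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    // Deserialize a 4-byte int using an explicitly chosen byte order.
    static int read(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt();
    }

    public static void main(String[] args) {
        // Writer and reader disagree on byte order: the value is corrupted.
        int mismatched = read(write(1, ByteOrder.BIG_ENDIAN), ByteOrder.LITTLE_ENDIAN);

        // Writer and reader both fixed to little-endian: the value survives,
        // whatever ByteOrder.nativeOrder() happens to be on this machine.
        int matched = read(write(1, ByteOrder.LITTLE_ENDIAN), ByteOrder.LITTLE_ENDIAN);

        System.out.println("mismatched = " + mismatched); // 16777216 (0x01000000)
        System.out.println("matched    = " + matched);    // 1
    }
}
```

The point of the sketch is that neither call consults the native byte order, so the same bytes are produced and consumed on little-endian and big-endian hosts alike.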