[jira] [Commented] (SPARK-9858) Introduce an ExchangeCoordinator to estimate the number of post-shuffle partitions.

Adam Roberts (JIRA) Fri, 11 Dec 2015 02:46:25 -0800

    [ https://issues.apache.org/jira/browse/SPARK-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052603#comment-15052603 ]


Adam Roberts commented on SPARK-9858:
-------------------------------------

Modifying the UnsafeRowSerializer to always write/read in little-endian (LE) 
byte order fixes the problem, so Tungsten features can be fully exploited 
regardless of platform endianness. (I'm not yet sure why only the aggregate 
functions are impacted; I thought we'd have plenty of test failures.) We can 
use Guava's LittleEndianDataInputStream/LittleEndianDataOutputStream to 
achieve this; they are part of the same package as ByteStreams. I will ensure 
the regular SparkSqlSerializer is OK too.
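
To illustrate the idea (this is not the Spark code, just a minimal sketch in 
Python with hypothetical helper names): if the writer and the reader both pin 
the byte order to LE, an 8-byte long round-trips on any platform, whereas a 
reader that assumes the opposite order decodes a different value from the same 
bytes.

```python
import struct

# Hypothetical helpers for illustration only; they are not Spark or Guava APIs.

def write_long_le(value: int) -> bytes:
    """Encode a signed 64-bit integer with an explicit little-endian order."""
    return struct.pack("<q", value)

def read_long_le(buf: bytes) -> int:
    """Decode a signed 64-bit integer, assuming little-endian order."""
    return struct.unpack("<q", buf)[0]

def read_long_be(buf: bytes) -> int:
    """Decode the same bytes as big-endian, i.e. a mismatched reader."""
    return struct.unpack(">q", buf)[0]

payload = write_long_le(1)

# Both sides agree on LE, so the value survives regardless of host order.
round_tripped = read_long_le(payload)

# A reader assuming the opposite order sees the bytes reversed.
mismatched = read_long_be(payload)

print(round_tripped, mismatched)
```

The point of fixing the order in the serializer itself is that neither side 
then depends on the native byte order of the machine it happens to run on.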

We're hitting a similar problem with the DatasetAggregatorSuite (instead of 1 
we get 9, instead of 2 we get 10, etc.); I expect the root cause to be the same.

I'll get to work on the pull request. Cheers.

> Introduce an ExchangeCoordinator to estimate the number of post-shuffle 
> partitions.
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-9858
>                 URL: https://issues.apache.org/jira/browse/SPARK-9858
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 1.6.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
