[ 
https://issues.apache.org/jira/browse/SPARK-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068176#comment-15068176
 ] 

Tim Preece commented on SPARK-12319:
------------------------------------

[~marmbrus]
Hi Michael,
I think this may be a problem with the new Dataset API, in particular the new 
"as" function of DataFrame, which I see is tagged as Experimental.

When we run the DatasetAggregatorSuite test "typed aggregation: class input 
with reordering", the implementation seems to confuse the ordering of the data 
in the UnsafeRow (string, int) with that of the schema (int, string). This 
results in a test case failure that shows up on BE platforms (although the data 
is also corrupted on LE platforms).
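
To illustrate the kind of mismatch described above, here is a minimal sketch 
in plain Java. It is not Spark code: the class FieldOrderSketch, the helpers 
ordinalsByName and project, and the field names "a" and "b" are all 
hypothetical. It only shows that reading a (string, int) row by position 
against an (int, string) schema yields the wrong values, while resolving 
fields by name first does not.

```java
import java.util.List;

class FieldOrderSketch {
    // Hypothetical helper: for each schema field, find its position in the
    // physical row layout by name instead of assuming the same order.
    static int[] ordinalsByName(List<String> rowFields, List<String> schemaFields) {
        int[] ordinals = new int[schemaFields.size()];
        for (int i = 0; i < ordinals.length; i++) {
            int pos = rowFields.indexOf(schemaFields.get(i));
            if (pos < 0) {
                throw new IllegalArgumentException("missing field: " + schemaFields.get(i));
            }
            ordinals[i] = pos;
        }
        return ordinals;
    }

    // Reorders the physical row so that it matches the schema order.
    static Object[] project(Object[] row, int[] ordinals) {
        Object[] out = new Object[ordinals.length];
        for (int i = 0; i < ordinals.length; i++) {
            out[i] = row[ordinals[i]];
        }
        return out;
    }
}
```

With a row laid out as (b: string, a: int) and a schema declared as 
(a: int, b: string), a positional read would hand the string to the int 
field; projecting through ordinalsByName first puts each value in its schema 
slot.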

At the moment I'm not sure how to fix this, so any pointers would be helpful.

> Address endian specific problems surfaced in 1.6
> ------------------------------------------------
>
>                 Key: SPARK-12319
>                 URL: https://issues.apache.org/jira/browse/SPARK-12319
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: Problems apparent on BE, LE could be impacted too
>            Reporter: Adam Roberts
>            Priority: Critical
>
> JIRA to cover endian specific problems - since testing 1.6 I've noticed 
> problems with DataFrames on BE platforms, e.g. 
> https://issues.apache.org/jira/browse/SPARK-9858
> [~joshrosen] [~yhuai]
> Current progress: using com.google.common.io.LittleEndianDataInputStream and 
> com.google.common.io.LittleEndianDataOutputStream within UnsafeRowSerializer 
> fixes three test failures in ExchangeCoordinatorSuite, but I'm concerned 
> about performance and wider functional implications.
> "org.apache.spark.sql.DatasetAggregatorSuite.typed aggregation: class input 
> with reordering" fails: we expect "one, 1" but instead get "one, 9". We 
> believe the issue lies within BitSetMethods.java, specifically around: return 
> (wi << 6) + subIndex + java.lang.Long.numberOfTrailingZeros(word); 
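
For readers unfamiliar with the expression quoted above, the following is a 
minimal sketch of a next-set-bit search over 64-bit words. It is an 
illustration only, not the actual BitSetMethods.java source and not a 
diagnosis of the failure: the class NextSetBitSketch and the helper readWord 
are hypothetical. The readWord helper shows why byte order matters here: the 
same eight bytes decode to different long values, and hence different bit 
indices, depending on the order used to read them.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class NextSetBitSketch {
    // Returns the index of the first set bit at or after fromIndex, or -1.
    static int nextSetBit(long[] words, int fromIndex) {
        int wi = fromIndex >> 6;              // which 64-bit word
        if (wi >= words.length) {
            return -1;
        }
        int subIndex = fromIndex & 0x3f;      // bit offset inside that word
        long word = words[wi] >>> subIndex;   // drop bits below fromIndex
        if (word != 0) {
            return (wi << 6) + subIndex + Long.numberOfTrailingZeros(word);
        }
        for (wi++; wi < words.length; wi++) { // scan the remaining words
            word = words[wi];
            if (word != 0) {
                return (wi << 6) + Long.numberOfTrailingZeros(word);
            }
        }
        return -1;
    }

    // Hypothetical helper: decode 8 bytes as one long in the given byte order.
    static long readWord(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }
}
```

For example, the bytes {1, 0, 0, 0, 0, 0, 0, 0} decode to a word whose lowest 
set bit is bit 0 when read as little-endian, but bit 56 when read as 
big-endian, so a bitset written with one byte order and read with the other 
would report a different set of bits.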


