[ 
https://issues.apache.org/jira/browse/PIG-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000233#comment-16000233
 ] 

Rohini Palaniswamy commented on PIG-5228:
-----------------------------------------

bq.  I think it may be dependent on HashMap implementation and thus JDK version 
as well.
   If you are running on different JDK versions, it could be an issue, as the 
HashMap implementation changed between 1.6 and 1.7.  For the same JDK version, 
insertion behavior is usually consistent.

bq. My wild guess is that either MR or Spark makes an extra filling of a map 
somewhere under the hood and that's where the difference comes from.
  Ordering of entries in a map is not something we guarantee. But can you still 
try to find out why it is happening? I am surprised you are running into this 
with just simple load and store statements on the same JDK version. It could 
have something to do with serialization as well. 
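
  To illustrate why entry order should not matter to a test, here is a minimal 
Java sketch (not Pig's actual test code). The class name MapOrderSketch and the 
helper canonical() are hypothetical, and the rendering only mimics the 
[key#value] form shown in the issue description. It builds the two orderings 
from the report and compares them by content, and by a key-sorted string, 
rather than by iteration order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MapOrderSketch {

    // Hypothetical helper: renders a map as [key#value, ...] with keys sorted,
    // so the string no longer depends on the map's iteration order.
    public static String canonical(Map<String, Object> m) {
        Map<String, Object> sorted = new TreeMap<>(m);
        return sorted.entrySet().stream()
                .map(e -> e.getKey() + "#" + e.getValue())
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, which lets us reproduce both
        // orderings from the issue description deterministically.
        Map<String, Object> actual = new LinkedHashMap<>();
        actual.put("name", "alice");
        actual.put("age", 18);

        Map<String, Object> expected = new LinkedHashMap<>();
        expected.put("age", 18);
        expected.put("name", "alice");

        // Map.equals compares key/value pairs and ignores iteration order.
        System.out.println(actual.equals(expected));

        // Comparing key-sorted renderings is also order-independent.
        System.out.println(canonical(actual).equals(canonical(expected)));
    }
}
```

  Either comparison treats the two maps from the report as equal, whereas a 
plain string comparison of the stored output would not.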
  
bq. so I've created a new a test case where we project by each key from the map.
   If figuring out the cause of the change takes more time, I am fine with the 
current patch, which has a separate test for Spark.

> Orc_2 is failing with spark exec type
> -------------------------------------
>
>                 Key: PIG-5228
>                 URL: https://issues.apache.org/jira/browse/PIG-5228
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>             Fix For: spark-branch
>
>         Attachments: PIG-5228.0.patch
>
>
> This test is failing due to mismatch in the actual and expected result. The 
> difference is only related to the order of entries in Pig maps such as:
> Actual:
> {code}
> [name#alice, age#18]...
> {code}
> Expected:
> {code}
> [age#18, name#alice]...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
