[ 
https://issues.apache.org/jira/browse/SPARK-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049284#comment-15049284
 ] 

Adam Roberts edited comment on SPARK-9858 at 12/9/15 7:53 PM:
--------------------------------------------------------------

Yep, I added System.identityHashCode(serializer) prints both in the method 
that creates the serializer and at the point where it's used (both are in the 
Exchange class):


Creating new unsafe row serializer
ADAMTEST. myUnsafeRowSerializer identity hash: -555078685
Creating new unsafe row serializer
ADAMTEST. myUnsafeRowSerializer identity hash: 1088823803
preparing shuffle dependency
ADAMTEST. In needToCopy function and serializer hash is: 1088823803
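
For anyone wanting to repeat this kind of check, here is a minimal sketch of the identity-hash technique on its own. DemoSerializer, create and use are hypothetical stand-ins and not the actual Exchange or UnsafeRowSerializer code; only System.identityHashCode is the real JDK call.

```scala
// Hypothetical stand-in for the serializer class; NOT Spark's UnsafeRowSerializer.
final class DemoSerializer

object IdentityHashDemo {
  // Mirrors the "creation" print: tag each new instance with its identity hash.
  def create(): DemoSerializer = {
    val s = new DemoSerializer
    println(s"created serializer, identity hash: ${System.identityHashCode(s)}")
    s
  }

  // Mirrors the "use" print: the hash shows which instance reached this point.
  def use(s: DemoSerializer): Int = {
    val h = System.identityHashCode(s)
    println(s"using serializer, identity hash: $h")
    h
  }

  def main(args: Array[String]): Unit = {
    val first = create()
    val second = create()
    // The same reference always reports the same identity hash, so this print
    // matches the second creation print.
    assert(use(second) == System.identityHashCode(second))
    // Reference equality is the authoritative check that two instances differ.
    assert(!(first eq second))
  }
}
```

Identity hashes are not guaranteed to be unique, so matching values only strongly suggest the same instance; an eq comparison is the definitive check when both references are in scope.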


New development: on Intel (an LE platform), if we take the first 200 elements 
and print them, we get 20 consecutive rows containing 
(3,[0,13,5,ff00000000000000]). On our BE platforms this isn't the case; every 
row is (3,[0,13,5,0]), the same as the rest of the RDD on Intel. I added this 
print in DAGScheduler's submitMapStage method:

  val rdd = dependency.rdd
  rdd.take(200).foreach(println)
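
As background for the LE/BE difference, the sketch below shows only how the same eight bytes decode to different long values under the two byte orders. The byte array is made-up illustrative data and ByteOrderDemo is a hypothetical name. It is not a diagnosis of what happens in the unsafe row code, and it does not reproduce the values above.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object ByteOrderDemo {
  // Reads the same 8 bytes as a long under the given byte order.
  def readLong(bytes: Array[Byte], order: ByteOrder): Long =
    ByteBuffer.wrap(bytes).order(order).getLong(0)

  def main(args: Array[String]): Unit = {
    // Seven zero bytes followed by 0xff; illustrative data only.
    val bytes = Array[Byte](0, 0, 0, 0, 0, 0, 0, 0xff.toByte)

    // Little-endian: the last byte is the most significant one.
    println(java.lang.Long.toHexString(readLong(bytes, ByteOrder.LITTLE_ENDIAN)))
    // Big-endian: the last byte is the least significant one.
    println(java.lang.Long.toHexString(readLong(bytes, ByteOrder.BIG_ENDIAN)))
    // Byte order of the platform this JVM is running on.
    println(ByteOrder.nativeOrder())
  }
}
```

Code that reads memory in native byte order (as opposed to a fixed order like the ByteBuffer default) will interpret identical bytes differently on the two kinds of platform, which is why comparing the printed rows on both is a useful data point.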


was (Author: aroberts):
Yep, I added System.identityHashCode(serializer) prints in both the creation 
method and when it's used (both in the Exchange class)


Creating new unsafe row serializer
ADAMTEST. myUnsafeRowSerializer identity hash: -555078685
Creating new unsafe row serializer
ADAMTEST. myUnsafeRowSerializer identity hash: 1088823803
preparing shuffle dependency
ADAMTEST. In needToCopy function and serializer hash is: 1088823803


New development, on Intel (LE platform) if we take the 200 elements and print 
them, we get 20 rows containing (3,[0,13,5,ff00000000000000]) in a row. On our 
BE platforms this isn't the case, everything is 
(3,[0,13,5,0]) - the same as the rest of the file on Intel. This print is in 
DAGScheduler's submitMapStage method:

  val rdd = dependency.rdd
  rdd.take(200).foreach(println)

> Introduce an ExchangeCoordinator to estimate the number of post-shuffle 
> partitions.
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-9858
>                 URL: https://issues.apache.org/jira/browse/SPARK-9858
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 1.6.0
>
>



