[ https://issues.apache.org/jira/browse/SPARK-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134917#comment-16134917 ]
cen yuhai edited comment on SPARK-13330 at 8/21/17 9:00 AM:
------------------------------------------------------------

[~zjffdu] [~al.johri] can you also look at this issue: https://issues.apache.org/jira/browse/SPARK-21796


was (Author: cenyuhai):
[~zjffdu] can you also look at this issue: https://issues.apache.org/jira/browse/SPARK-21796

> PYTHONHASHSEED is not propagated to python worker
> -------------------------------------------------
>
>                 Key: SPARK-13330
>                 URL: https://issues.apache.org/jira/browse/SPARK-13330
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.6.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>             Fix For: 2.2.0
>
>
> When using Python 3.3, PYTHONHASHSEED is only set in the driver and is not
> propagated to the executors, which causes the following error.
> {noformat}
>   File "/Users/jzhang/github/spark/python/pyspark/rdd.py", line 74, in portable_hash
>     raise Exception("Randomness of hash of string should be disabled via PYTHONHASHSEED")
> Exception: Randomness of hash of string should be disabled via PYTHONHASHSEED
>         at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:166)
>         at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:207)
>         at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:125)
>         at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
>         at org.apache.spark.api.python.PairwiseRDD.compute(PythonRDD.scala:342)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:77)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:45)
>         at org.apache.spark.scheduler.Task.run(Task.scala:81)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> {noformat}
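
For context on why the worker insists on PYTHONHASHSEED: since Python 3.3, string hashes are randomized per interpreter process unless that variable is set, so a driver and an executor-side Python worker can disagree on hash('key') and therefore on which partition a key belongs to. The sketch below is illustrative only and is not the fix that went into 2.2.0. The helper names are hypothetical; it uses only the standard library to start fresh interpreters with and without a fixed seed, and it builds a configuration entry under `spark.executorEnv.PYTHONHASHSEED`, assuming the documented `spark.executorEnv.[EnvironmentVariableName]` setting is available in the Spark version in use.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(text, seed=None):
    """Return hash(text) as computed by a newly started Python process.

    Hypothetical helper for illustration. When seed is None the child runs
    without PYTHONHASHSEED, so its string hashing is randomized.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # do not inherit a seed from the parent
    if seed is not None:
        env["PYTHONHASHSEED"] = str(seed)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


def executor_env_conf(seed=0):
    """Build a Spark conf entry that exports PYTHONHASHSEED to executors.

    Hypothetical helper; assumes the spark.executorEnv.* mechanism applies.
    """
    return {"spark.executorEnv.PYTHONHASHSEED": str(seed)}


if __name__ == "__main__":
    # Two processes that share a seed agree on the hash of the same string.
    a = string_hash_in_fresh_interpreter("spark", seed=0)
    b = string_hash_in_fresh_interpreter("spark", seed=0)
    print("seeded hashes equal:", a == b)

    # Without a seed, two processes will almost always disagree.
    c = string_hash_in_fresh_interpreter("spark")
    d = string_hash_in_fresh_interpreter("spark")
    print("unseeded hashes equal:", c == d)

    print(executor_env_conf(0))
```

On an affected version, the resulting entry could be passed with `--conf` to spark-submit or set on a SparkConf, so that the executor-side workers see the same seed as the driver.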