Re: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-12-02 Thread Andy Davidson
Hi Ted and Felix From: Ted Yu <yuzhih...@gmail.com> Date: Sunday, November 29, 2015 at 10:37 AM To: Andrew Davidson <a...@santacruzintegration.com> Cc: Felix Cheung <felixcheun...@hotmail.com>, "user @spark" <user@spark.apache.org> Subject: Re: possib

Re: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-29 Thread Andy Davidson
uzintegration.com>, "user @spark" <user@spark.apache.org> Subject: Re: possible bug spark/python/pyspark/rdd.py portable_hash() > > Ah, it's there in spark-submit and pyspark. > Seems like it should be added for spark_ec2

Re: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-29 Thread Ted Yu
tor-memory 2G \ > $extraPkgs \ > $* > > From: Felix Cheung <felixcheun...@hotmail.com> > Date: Saturday, November 28, 2015 at 12:11 AM > To: Ted Yu <yuzhih...@gmail.com> > Cc: Andrew Davidson <a...@santacruzintegration.com>, "user @spark" >

RE: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-29 Thread Felix Cheung
HSEED")? Date: Sun, 29 Nov 2015 09:48:19 -0800 Subject: Re: possible bug spark/python/pyspark/rdd.py portable_hash() From: a...@santacruzintegration.com To: felixcheun...@hotmail.com; yuzhih...@gmail.com CC: user@spark.apache.org Hi Felix and Ted This is how I am starting spark Should I

Re: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-28 Thread Felix Cheung
Ah, it's there in spark-submit and pyspark. Seems like it should be added for spark_ec2 _ From: Ted Yu <yuzhih...@gmail.com> Sent: Friday, November 27, 2015 11:50 AM Subject: Re: possible bug spark/python/pyspark/rdd.py portable_hash() To: Felix

RE: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-27 Thread Felix Cheung
May I ask how you are starting Spark? It looks like PYTHONHASHSEED is being set: https://github.com/apache/spark/search?utf8=%E2%9C%93=PYTHONHASHSEED Date: Thu, 26 Nov 2015 11:30:09 -0800 Subject: possible bug spark/python/pyspark/rdd.py portable_hash() From: a...@santacruzintegration.com To:

Re: possible bug spark/python/pyspark/rdd.py portable_hash()

2015-11-27 Thread Ted Yu
ec2/spark-ec2 calls ./ec2/spark_ec2.py. I don't see PYTHONHASHSEED defined in any of these scripts. Andy reported this for ec2 cluster. I think a JIRA should be opened. On Fri, Nov 27, 2015 at 11:01 AM, Felix Cheung wrote: > May I ask how you are starting Spark? >
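
For readers unfamiliar with why PYTHONHASHSEED matters in this thread: on Python 3.3 and later, string hashes are randomized per interpreter process unless PYTHONHASHSEED is fixed, so separate worker processes can disagree on the hash of the same key. The sketch below is only an illustration of that behavior with plain Python subprocesses; it does not use Spark, and the helper name hash_in_subprocess is hypothetical, not part of PySpark or the ec2 scripts.

```python
import os
import subprocess
import sys


def hash_in_subprocess(value, seed):
    """Return hash(value) as computed by a fresh Python interpreter.

    `seed` is exported as PYTHONHASHSEED for the child process. The value
    "0" disables hash randomization; any other integer string fixes the seed.
    (Hypothetical helper, for illustration only.)
    """
    env = dict(os.environ)
    env["PYTHONHASHSEED"] = seed
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", value],
        env=env,
    )
    return int(out.decode().strip())


if __name__ == "__main__":
    # Two separate interpreters with the same seed agree on the hash.
    print(hash_in_subprocess("spark", "0"))
    print(hash_in_subprocess("spark", "0"))
    # A different seed will usually give a different value.
    print(hash_in_subprocess("spark", "123"))
```

With the same seed in every process, the printed values for the first two calls match, which is the property a hash-based partitioner needs across workers. How the variable should be propagated on a spark_ec2 cluster is the open question in the thread and is not answered by this sketch.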