Re: Pyspark Error when broadcast numpy array

2014-11-12 Thread bliuab

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread Davies Liu
This PR fixes the problem: https://github.com/apache/spark/pull/2659
cc @josh
Davies
On Tue, Nov 11, 2014 at 7:47 PM, bliuab bli...@cse.ust.hk wrote:
> In spark-1.0.2, I came across an error when trying to broadcast a fairly large numpy array (with 35M dimensions). The error information except

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread bliuab
Dear Liu: Thank you very much for your help. I will update with that patch. By the way, I have succeeded in broadcasting an array of size 30M, and the log said that such an array takes around 230 MB of memory. As a result, I think the numpy array that leads to the error is much smaller than 2 GB. On Wed, Nov 12, 2014
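The size estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes the array holds 8-byte floats (numpy's default float64), which the thread does not state, and it counts raw data only, ignoring any serialization overhead. The helper name `array_size_mib` is made up for illustration.

```python
# Assumption: 8-byte elements (numpy's default float64); the dtype is not given in the thread.
BYTES_PER_ELEMENT = 8
MIB = 1024 ** 2


def array_size_mib(num_elements, bytes_per_element=BYTES_PER_ELEMENT):
    """Raw data size of a dense array in MiB (excludes pickling overhead)."""
    return num_elements * bytes_per_element / MIB


size_30m = array_size_mib(30_000_000)  # the array that broadcast successfully
size_35m = array_size_mib(35_000_000)  # the array that triggered the error
two_gib_in_mib = 2 * 1024

print(f"30M elements: {size_30m:.1f} MiB")
print(f"35M elements: {size_35m:.1f} MiB")
print(f"35M array below 2 GiB: {size_35m < two_gib_in_mib}")
```

Under that assumption the 30M-element array comes out just under 230 MiB, consistent with the figure in the log, and the 35M-element array is still far below 2 GiB.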

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread Davies Liu

Re: Pyspark Error when broadcast numpy array

2014-11-11 Thread bliuab