… whether using any of these built-in functions helps or not.
Regards,
Gourav
On Thu, Apr 12, 2018 at 3:25 AM, surender kumar wrote:
Thanks Matteo, this should work!
-Surender
On Thursday, 12 April, 2018, 1:13:38 PM IST, Matteo Cossu wrote:
I don't think it's trivial. A …, so you have a new RDD of (userID, [sample_items]).
- flatten all the lists in the previously created RDD and join them back with the (itemID, index) RDD, using index as the join attribute.
You can do the same thing with DataFrames using UDFs.
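The index-join idea above can be sketched in plain Python (the variable names and the per-user sample size `K` are illustrative; in Spark the same steps would use `flatMap` and `join` on pair RDDs):

```python
import random

# Toy stand-ins for the two datasets; in the real job these are RDDs
# with ~1 million entries each.
items = [0.5, 1.25, 3.75, 2.0, 9.5]
indexed_items = list(enumerate(items))   # (index, item) pairs
users = ["u1", "u2", "u3"]

K = 2  # items to sample per user (assumed parameter)

# Step 1: per user, sample K item *indices* instead of the items
# themselves, so the full item list is never shipped to every task.
user_samples = [(u, random.sample(range(len(items)), K)) for u in users]
# -> analogue of an RDD of (userID, [sampled_indices])

# Step 2: flatten the index lists into (index, userID) pairs
# (the flatMap step in Spark).
flat = [(idx, u) for u, idxs in user_samples for idx in idxs]

# Step 3: join back with the (index, item) pairs on the index
# to recover the sampled values.
item_by_index = dict(indexed_items)
joined = [(u, item_by_index[idx]) for idx, u in flat]

for u, val in joined:
    print(u, val)
```

The point of the detour through indices is that each task only needs the indices it sampled plus its partition of the item data, never the whole broadcast list.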
On 11 April 2018 at 23:01, surender kumar wrote:
right,
… AM IST, Matteo Cossu wrote:
Why broadcast this list then? You should use an RDD or DataFrame. For example, RDD has a method sample() that returns a random sample from it.
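Note that `RDD.sample(withReplacement, fraction, seed)` takes an approximate *fraction*, not an exact count (for an exact count there is `takeSample`). A plain-Python analogue of the Bernoulli sampling it performs:

```python
import random

items = [float(i) for i in range(1000)]  # stand-in for the item RDD

def bernoulli_sample(data, fraction, seed=42):
    """Keep each element independently with probability `fraction`,
    mirroring RDD.sample(withReplacement=False, fraction, seed)."""
    rng = random.Random(seed)
    return [x for x in data if rng.random() < fraction]

sampled = bernoulli_sample(items, fraction=0.1)
print(len(sampled))  # roughly 100; the size varies run to run
```

Because each element is kept independently, the result size is only approximately `fraction * len(data)`, which is exactly how Spark's `sample()` behaves.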
On 11 April 2018 at 22:34, surender kumar wrote:
I'm using PySpark. I have a list of 1 million items (all float values) and 1 million users. For each user I want to randomly sample some items from the item list. Broadcasting the item list results in an OutOfMemory error on the driver; I tried setting driver memory up to 10G. I tried to persist this array …