I don't think there is enough information here. Where is the program
spending its time? Where does it "stop"? How many partitions are
there?

On Wed, Mar 11, 2015 at 7:10 AM, Akhil Das <[email protected]> wrote:
> You need to set spark.cores.max to a number, say 16, so that the tasks get
> distributed evenly across all 4 machines. Another thing to try would be
> setting spark.default.parallelism, if you haven't already.
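
(For reference, a minimal sketch of what those two settings look like from a
Scala driver; the app name and the parallelism value here are placeholders,
not recommendations:)

    import org.apache.spark.{SparkConf, SparkContext}

    // Cap the total cores the app may take so the scheduler spreads tasks
    // across all 4 workers instead of filling up one machine.
    val conf = new SparkConf()
      .setAppName("fpgrowth-job")             // placeholder name
      .set("spark.cores.max", "16")           // value suggested above
      .set("spark.default.parallelism", "64") // illustrative; tune to the cluster
    val sc = new SparkContext(conf)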
>
> Thanks
> Best Regards
>
> On Wed, Mar 11, 2015 at 12:27 PM, Sean Barzilay <[email protected]>
> wrote:
>>
>> I am running on a 4-worker cluster, each machine having between 16 and 30
>> cores and 50 GB of RAM.
>>
>>
>> On Wed, 11 Mar 2015 8:55 am Akhil Das <[email protected]> wrote:
>>>
>>> Depending on your cluster setup (cores, memory), you need to specify the
>>> parallelism or repartition the data.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Wed, Mar 11, 2015 at 12:18 PM, Sean Barzilay <[email protected]>
>>> wrote:
>>>>
>>>> Hi, I am currently using Spark 1.3.0-SNAPSHOT to run the FP-Growth
>>>> algorithm from the MLlib library. When I try to run the algorithm over a
>>>> large basket (over 1000 items), the program never seems to finish. Has
>>>> anyone found a workaround for this problem?
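
(A minimal sketch of calling MLlib's FPGrowth with an explicit partition
count, which is the knob that spreads the FP-Growth work across the cluster;
the minSupport and numPartitions values below are illustrative only:)

    import org.apache.spark.mllib.fpm.FPGrowth
    import org.apache.spark.rdd.RDD

    // transactions: one Array[String] of items per basket
    def mineFrequentItemsets(transactions: RDD[Array[String]]): Unit = {
      val model = new FPGrowth()
        .setMinSupport(0.2)   // illustrative support threshold
        .setNumPartitions(64) // illustrative; distributes the mining work
        .run(transactions)

      model.freqItemsets.collect().foreach { fi =>
        println(fi.items.mkString("[", ",", "]") + " -> " + fi.freq)
      }
    }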
>>>
>>>
>
