I don't think there is enough information here. Where is the program spending its time? Where does it "stop"? How many partitions are there?
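If it helps, here is a quick way to check the partition count from spark-shell (a minimal sketch; the toy transactions RDD stands in for whatever you pass to FPGrowth.run()):

// In spark-shell, sc is predefined. Build a toy transactions RDD
// and check how many partitions it has.
val transactions = sc.parallelize(Seq(
  Array("a", "b", "c"),
  Array("a", "b"),
  Array("b", "c")))
println("partitions: " + transactions.partitions.length)

The Spark UI on the driver (port 4040 by default) also shows per-stage task counts, which tells you where the time is going.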
On Wed, Mar 11, 2015 at 7:10 AM, Akhil Das <[email protected]> wrote:
> You need to set spark.cores.max to a number, say 16, so that the tasks
> get distributed evenly across all 4 machines. Another thing to try would
> be setting spark.default.parallelism, if you haven't already.
>
> Thanks
> Best Regards
>
> On Wed, Mar 11, 2015 at 12:27 PM, Sean Barzilay <[email protected]> wrote:
>>
>> I am running on a 4-worker cluster, each worker having between 16 and
>> 30 cores and 50 GB of RAM.
>>
>> On Wed, 11 Mar 2015 8:55 am Akhil Das <[email protected]> wrote:
>>>
>>> Depending on your cluster setup (cores, memory), you need to specify
>>> the parallelism / repartition the data.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Wed, Mar 11, 2015 at 12:18 PM, Sean Barzilay <[email protected]> wrote:
>>>>
>>>> Hi, I am currently using Spark 1.3.0-SNAPSHOT to run the FP-Growth
>>>> algorithm from MLlib. When I try to run the algorithm over a large
>>>> basket (over 1000 items), the program never seems to finish. Did
>>>> anyone find a workaround for this problem?
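Putting the suggestions above together, here is a minimal sketch of what that could look like against the Spark 1.3 MLlib Scala API. The app name, core/parallelism counts, min-support threshold, and toy transactions are all illustrative, not recommendations:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.fpm.FPGrowth

// Cap the total cores used and raise the default shuffle parallelism
// (illustrative values; tune them to your cluster).
val conf = new SparkConf()
  .setAppName("FPGrowthExample")
  .set("spark.cores.max", "16")
  .set("spark.default.parallelism", "64")
val sc = new SparkContext(conf)

// Toy baskets; substitute your real transactions RDD here.
val transactions = sc.parallelize(Seq(
  Array("r", "z", "h", "k", "p"),
  Array("z", "y", "x", "w", "v", "u", "t", "s"),
  Array("s", "x", "o", "n", "r"),
  Array("x", "z", "y", "m", "t", "s", "q", "e"),
  Array("z"),
  Array("x", "z", "y", "r", "q", "t", "p")))

val fpg = new FPGrowth()
  .setMinSupport(0.2)   // illustrative threshold
  .setNumPartitions(64) // spread the work across the cluster
val model = fpg.run(transactions)

// In Spark 1.3, freqItemsets is an RDD[(Array[Item], Long)].
model.freqItemsets.collect().foreach { case (items, count) =>
  println(items.mkString("[", ",", "]") + ": " + count)
}

setNumPartitions is usually the most direct lever for this algorithm, since it controls how FP-Growth distributes its per-item conditional databases, independently of how the input RDD itself is partitioned.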
