Re: FPGrowth Model is taking too long to generate frequent item sets

2017-03-14 Thread Raju Bairishetti
Hi Yuhao, I have tried setting numPartitions to (numExecutors * numExecutorCores), 1000, 2000 and 1. I did not see much improvement. Having more partitions solved some performance issues, but I did not see any improvement when I lowered minSupport. It is generating 260 million frequent item sets with 6
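
The observation above, that lowering minSupport is what hurts, can be illustrated with a small sketch. This is a brute-force count in plain Python on made-up toy data, not Spark's FPGrowth implementation; the function name `count_frequent_itemsets` and the transactions are hypothetical. It only shows how the number of frequent item sets grows as the support threshold drops, which no amount of repartitioning can undo.

```python
from itertools import combinations
from math import ceil


def count_frequent_itemsets(transactions, min_support):
    """Brute-force count of item sets contained in at least
    ceil(min_support * number_of_transactions) transactions."""
    min_count = ceil(min_support * len(transactions))
    items = sorted({item for t in transactions for item in t})
    total = 0
    for size in range(1, len(items) + 1):
        found = 0
        for candidate in combinations(items, size):
            candidate_set = set(candidate)
            support = sum(1 for t in transactions if candidate_set <= t)
            if support >= min_count:
                found += 1
        if found == 0:
            # If no item set of this size is frequent, no larger one can be.
            break
        total += found
    return total


# Hypothetical toy data: 6 transactions over 4 items.
transactions = [
    {"a", "b", "c", "d"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]

for min_support in (0.8, 0.5, 0.3):
    print(min_support, count_frequent_itemsets(transactions, min_support))
```

With only 4 items the count is capped at 15 item sets; with many more distinct items the same growth can reach very large numbers, so the output size itself may dominate the runtime.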

Re: FPGrowth Model is taking too long to generate frequent item sets

2017-03-14 Thread Yuhao Yang
Hi Raju, Have you tried setNumPartitions with a larger number? 2017-03-07 0:30 GMT-08:00 Eli Super : > Hi > > It's a whole area of knowledge; you will need to read about it online > for several hours > > What is your programming language? > > Try searching online for "machine learning binning %my_programming_la

Re: FPGrowth Model is taking too long to generate frequent item sets

2017-03-07 Thread Eli Super
Hi, It's a whole area of knowledge; you will need to read about it online for several hours. What is your programming language? Try searching online for "machine learning binning %my_programming_language%" and "machine learning feature engineering %my_programming_language%". On Tue, Mar 7, 2017 at 3:39 AM, Raju Ba

Re: FPGrowth Model is taking too long to generate frequent item sets

2017-03-06 Thread Raju Bairishetti
@Eli, Thanks for the suggestion. If you do not mind, can you please elaborate on these approaches? On Mon, Mar 6, 2017 at 7:29 PM, Eli Super wrote: > Hi > > Try to implement binning and/or feature engineering (smart feature > selection for example) > > Good luck > > On Mon, Mar 6, 2017 at 6:56 AM, Raju Ba

Re: FPGrowth Model is taking too long to generate frequent item sets

2017-03-06 Thread Eli Super
Hi, Try to implement binning and/or feature engineering (smart feature selection, for example). Good luck. On Mon, Mar 6, 2017 at 6:56 AM, Raju Bairishetti wrote: > Hi, > I am new to Spark MLlib. I am using the FPGrowth model for finding related > items. > > The number of transactions is 63K and the t
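
The suggestion above does not spell out the approaches, so the following is only one possible reading of it, sketched in plain Python. Binning replaces many distinct numeric values with a few range labels, and a simple form of feature selection drops items that occur in too few transactions before mining. The helper names `bin_value` and `drop_rare_items`, the bin edges, and the thresholds are all hypothetical.

```python
from collections import Counter


def bin_value(name, value, edges):
    """Map a numeric value to a coarse range label.

    `edges` is assumed to be a non-empty, ascending list of bin boundaries.
    """
    lower = None
    for edge in edges:
        if value < edge:
            low_label = "-inf" if lower is None else lower
            return f"{name}:[{low_label},{edge})"
        lower = edge
    return f"{name}:[{lower},inf)"


def drop_rare_items(transactions, min_count):
    """Remove items that appear in fewer than `min_count` transactions."""
    counts = Counter(item for t in transactions for item in set(t))
    return [[item for item in t if counts[item] >= min_count]
            for t in transactions]


# Hypothetical usage: prices collapse into three items instead of one per value.
print(bin_value("price", 5, [10, 50]))
print(bin_value("price", 10, [10, 50]))
print(bin_value("price", 75, [10, 50]))

# Hypothetical usage: "c" occurs only once and is dropped before mining.
print(drop_rare_items([["a", "b"], ["a", "c"], ["a", "b"]], min_count=2))
```

Both steps shrink the set of distinct items, which is what limits how many candidate item sets a frequent-pattern miner has to consider.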

FPGrowth Model is taking too long to generate frequent item sets

2017-03-05 Thread Raju Bairishetti
Hi, I am new to Spark MLlib. I am using the FPGrowth model for finding related items. The number of transactions is 63K and the total number of items in all transactions is 200K. I am running the FPGrowth model to generate frequent item sets. It is taking a huge amount of time to generate frequent itemse