Hi Yuhao,
I have tried numPartitions values of (numExecutors * numExecutorCores),
1000, 2000, and 1, and did not see much improvement.
Having more partitions solved some perf issues, but I see no
improvement when I lower minSupport.
It is generating 260 million frequent itemsets with 6
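For reference, here is a minimal sketch of the knobs under discussion, using the RDD-based spark.mllib FPGrowth API (the parameter values and the transactions RDD are illustrative, not recommendations):

    import org.apache.spark.mllib.fpm.FPGrowth
    import org.apache.spark.rdd.RDD

    // transactions: RDD[Array[String]], one item basket per record
    // (assumed to be loaded elsewhere)
    def mine(transactions: RDD[Array[String]]): Long = {
      val model = new FPGrowth()
        .setMinSupport(0.06)     // lower support => combinatorially more itemsets
        .setNumPartitions(1000)  // one of the partition counts tried above
        .run(transactions)
      model.freqItemsets.count() // forces computation of the frequent itemsets
    }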
Hi Raju,
Have you tried setNumPartitions with a larger number?
2017-03-07 0:30 GMT-08:00 Eli Super:
Hi
It's an area of knowledge; you will need to read online for several hours
about it.
What is your programming language?
Try searching online for: "machine learning binning %my_programming_language%"
and
"machine learning feature engineering %my_programming_language%"
On Tue, Mar 7, 2017 at 3:39 AM, Raju Bairishetti wrote:
@Eli, thanks for the suggestion. If you do not mind, could you please
elaborate on these approaches?
On Mon, Mar 6, 2017 at 7:29 PM, Eli Super wrote:
Hi
Try implementing binning and/or feature engineering (smart feature
selection, for example).
Good luck
On Mon, Mar 6, 2017 at 6:56 AM, Raju Bairishetti wrote:
Hi,
I am new to Spark MLlib. I am using the FPGrowth model for finding related
items.
The number of transactions is 63K, and the total number of items across all
transactions is 200K.
I am running the FPGrowth model to generate frequent itemsets. It is taking
a huge amount of time to generate frequent itemsets.
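For context, a baseline run along the lines described would look something like the standard spark.mllib example below (the input path and minSupport value are placeholders):

    import org.apache.spark.mllib.fpm.FPGrowth

    // Hypothetical input: ~63K lines, one space-separated basket of items per line
    val transactions = sc.textFile("data/transactions.txt").map(_.trim.split(' '))

    val model = new FPGrowth()
      .setMinSupport(0.1)
      .run(transactions)

    model.freqItemsets.collect().foreach { itemset =>
      println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
    }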