Hi Arun,

We have been running into the same issue (with only ~1,000 unique items across 
100MM transactions), but have not investigated the root cause. We moved the 
job to a cluster instead (4*16 / 64 GB RAM), after which the OOM issue went 
away. However, we then found that the FPGrowth implementation starts spilling 
to disk, and we had to enlarge the /tmp partition.
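
If it is useful: an alternative to enlarging /tmp is pointing Spark's scratch 
space at a bigger volume via spark.local.dir (shuffle spill files go there, 
/tmp by default). A rough sketch of that; the path is a placeholder, and note 
that on YARN this setting is overridden by yarn.nodemanager.local-dirs:

    import org.apache.spark.{SparkConf, SparkContext}

    // Redirect shuffle/spill scratch files away from /tmp.
    // "/data/spark-tmp" is a placeholder; any disk with enough free space works.
    val conf = new SparkConf()
      .setAppName("fpgrowth-market-basket")
      .set("spark.local.dir", "/data/spark-tmp")
    val sc = new SparkContext(conf)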

Hope it helps.

BR,
-patrick



On 05/04/2017, 10:29, "asethia" <sethia.a...@gmail.com> wrote:

    Hi,
    
    We are currently working on a Market Basket Analysis, deploying the FP-Growth
    algorithm on Spark to generate association rules for product recommendations.
    We are running on close to 24 million invoices over an assortment of more
    than 100k products. However, whenever we relax the support threshold below a
    certain level, the stack overflows. We are using Spark 1.6.2 but can invoke
    1.6.3 to work around that error. The problem, though, is that even when we
    invoke Spark 1.6.3 and increase the stack size to 100M, we still run out of
    memory. We believe the FP-tree grows exponentially and is held entirely in
    memory, which causes this problem. Can anyone suggest a solution to this
    issue, please?
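
    For reference, this is roughly how we call it (sc is our SparkContext; the
    path and parameter values below are illustrative, not our exact settings):

        import org.apache.spark.mllib.fpm.FPGrowth
        import org.apache.spark.rdd.RDD

        // One invoice per line; each becomes a basket of distinct product ids
        // (FPGrowth rejects baskets containing duplicate items).
        val transactions: RDD[Array[String]] =
          sc.textFile("hdfs:///path/to/invoices")   // placeholder path
            .map(_.trim.split(',').distinct)

        val model = new FPGrowth()
          .setMinSupport(0.001)    // the threshold we keep trying to lower
          .setNumPartitions(200)   // illustrative partition count
          .run(transactions)

        // Association rules above a minimum confidence, e.g. 0.8.
        val rules = model.generateAssociationRules(0.8)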
    
    Thanks
    Arun
    
    
    
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
