from:"Patrick Plaatje"

Re: Market Basket Analysis by deploying FP Growth algorithm

2017-04-05 Thread Patrick Plaatje

Hi Arun, We have been running into the same issue (with only 1,000 unique items in 100MM transactions), but have not investigated its root cause. We decided to run this on a cluster instead (4*16 / 64 GB RAM), after which the OOM issue went away. However, we ran into the issue that

Re: FP growth - Items in a transaction must be unique

2017-02-02 Thread Patrick Plaatje

Hi, This indicates you have duplicate products per row in your dataframe. The FP implementation only allows unique products per row, so you will need to remove the duplicate products from each row before running the FPGrowth algorithm. Best, Patrick From: "Devi P.V" Date:
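
As a rough illustration of the per-row dedupe described above, here is a minimal plain-Python sketch. It is not the FPGrowth API itself: the function name `dedupe_transaction` and the sample baskets are made up for the example, and it assumes each transaction is simply a list of item names. The same per-row step could presumably be applied to each row of the real dataset before fitting the model.

```python
def dedupe_transaction(items):
    """Return the items of one transaction with duplicates removed.

    dict.fromkeys keeps the first occurrence of each item, so the
    original item order is preserved.
    """
    return list(dict.fromkeys(items))


# Hypothetical sample baskets; the first and third contain repeated items.
transactions = [
    ["milk", "bread", "milk"],
    ["eggs"],
    ["bread", "eggs", "bread", "eggs"],
]

deduped = [dedupe_transaction(t) for t in transactions]

# Every basket now holds each item at most once.
for basket in deduped:
    print(basket)
```

Using `set(items)` would also remove the duplicates, but it does not keep the item order, which can make results harder to compare between runs.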

Re: newbie unable to write to S3 403 forbidden error

2016-02-13 Thread Patrick Plaatje

Not sure if it’s related, but in our Hadoop configuration we’re also setting sc.hadoopConfiguration().set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem"); Cheers, -patrick From: Andy Davidson Date: Friday, 12 February 2016 at 17:34 To: Igor

Getting top distinct strings from arraylist

2016-01-25 Thread Patrick Plaatje

Hi, I’m quite new to Spark and MR, but have a requirement to get all distinct values with their respective counts from a transactional file. Let’s assume the following file format: 0 1 2 3 4 5 6 7 1 3 4 5 8 9 9 10 11 12 13 14 15 16 17 18 1 4 7 11 12 13 19 20 3 4 7 11 15 20 21 22 23 1 2 5 9 11
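
For reference, the requirement above (distinct values with their counts) can be sketched in plain Python as follows. This is only a single-machine illustration, not a Spark job. It assumes each line of the file is one transaction of whitespace-separated item IDs; the sample lines and the name `count_items` are invented for the example.

```python
from collections import Counter


def count_items(lines):
    """Count how often each item appears across all transaction lines."""
    counts = Counter()
    for line in lines:
        # split() with no argument splits on any run of whitespace
        counts.update(line.split())
    return counts


# Hypothetical sample: one transaction per line, items separated by spaces.
sample_lines = [
    "0 1 2 3",
    "1 3 4",
    "3 4 7",
]

counts = count_items(sample_lines)

# most_common(n) returns the n items with the highest counts.
top_items = counts.most_common(2)
print(top_items)
```

On a cluster, the same idea is usually expressed as a split of each line into items followed by a count per key, so the per-item counts are aggregated across partitions.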
