ORC file writing hangs in pyspark

2016-02-23 Thread James Barney
I'm trying to write an ORC file after running the FPGrowth algorithm on a dataset of only about 2GB. The algorithm performs well, and I can display results if I take(n) the freqItemsets() of the result after converting it to a DF. I'm using Spark 1.5.2 on HDP 2.3.4 and Python 3.4.2 on YARN

Re: ORC file writing hangs in pyspark

2016-02-24 Thread James Barney
es quickly. Thank you again for the suggestions

On Tue, Feb 23, 2016 at 9:28 PM, Zhan Zhang wrote:
> Hi James,
>
> You can try to write with another format, e.g., parquet, to see whether it is
> an ORC-specific issue or a more generic issue.
>
> Thanks.
>
> Zhan Zhang
>
> O
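
The suggestion quoted above (write the same DataFrame in another format to see whether the problem is ORC-specific) could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names `try_write` and `compare_formats` and the output paths are hypothetical, and `df` is assumed to be the DataFrame built from the frequent itemsets. The only Spark API relied on is the `DataFrameWriter` chain `df.write.format(...).mode(...).save(...)`.

```python
import time


def try_write(df, fmt, path):
    """Write df in the given format and return the elapsed wall-clock seconds."""
    start = time.time()
    # DataFrameWriter chain; "overwrite" replaces output from earlier attempts.
    df.write.format(fmt).mode("overwrite").save(path)
    return time.time() - start


def compare_formats(df, base_path, formats=("parquet", "orc")):
    """Write df once per format under base_path/<format> and collect timings."""
    timings = {}
    for fmt in formats:
        # Hypothetical layout: one output directory per format.
        timings[fmt] = try_write(df, fmt, "{}/{}".format(base_path, fmt))
    return timings
```

A call such as `compare_formats(df, "/tmp/fpgrowth_out")` (path chosen only for illustration) writes Parquet first, so if that returns and the ORC write never does, the hang is more likely specific to ORC. If both stall, the cause probably lies upstream of the writer.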