Re: Java Heap Memory OOM when using ORCNewInputFormat in MR

2016-08-31 Thread Hank baker
No, I am not using dynamic partitioning.

On Wed, Aug 31, 2016 at 4:00 AM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
> In hive 1.2.1 the automatic estimation of buffer size happens only if
> column count is >1000.
> You need https://issues.apache.org/jira/browse/HIVE-11807 for

Re: Java Heap Memory OOM when using ORCNewInputFormat in MR

2016-08-30 Thread Prasanth Jayachandran
In Hive 1.2.1 the automatic estimation of buffer size happens only if the column count is >1000. You need HIVE-11807 (https://issues.apache.org/jira/browse/HIVE-11807) for automatic estimation by default, or a release >= Hive 2.0. Are you using dynamic partitioning in Hive?

Thanks
Prasanth

On Aug 30, 2016, at
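For a sense of why the buffer size matters at this scale, here is a rough back-of-envelope sketch. The column count and streams-per-column below are assumptions for illustration only, not figures reported in this thread; the 256 KB default compress buffer size is the Hive 1.2.1 default.

// Back-of-envelope only; the column count and streams-per-column here are
// assumed values for illustration, not numbers from this thread.
public class OrcBufferEstimate {
  public static void main(String[] args) {
    int columns = 3000;            // assumed table width
    int streamsPerColumn = 2;      // e.g. PRESENT + DATA; varies by column type
    int bufferBytes = 256 * 1024;  // Hive 1.2.1 default compress buffer size
    long perWriterBytes = (long) columns * streamsPerColumn * bufferBytes;
    // ~1500 MB of compression buffers for a single open ORC writer,
    // before any automatic estimation or manual override kicks in.
    System.out.printf("~%d MB per open ORC writer%n", perWriterBytes >> 20);
  }
}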

Re: Java Heap Memory OOM when using ORCNewInputFormat in MR

2016-08-30 Thread Hank baker
Hi Prasanth,

Thanks for your quick reply. As I mentioned in the previous mail, this was the same stack trace in about 60 failed reducers. I am using Hive 1.2.1; I am not sure which newer version you are referring to. But exactly as you pointed out, when I tried to reproduce this issue on my local

Re: Java Heap Memory OOM when using ORCNewInputFormat in MR

2016-08-30 Thread Prasanth Jayachandran
Under memory pressure, the OOM stack trace can differ depending on which code path is requesting more memory once the heap is already full. That is why you are seeing the OOM in writeMetadata (it may happen in other places as well). When dealing with thousands of columns, it's better to set
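The preview cuts off here. Given the buffer-size discussion elsewhere in the thread, a plausible reading of the advice is to set a smaller compression buffer explicitly. A minimal sketch follows, assuming the Hive 1.2.1 properties hive.exec.orc.default.buffer.size and hive.exec.orc.memory.pool are picked up by the plain-MR ORC writer path (worth verifying for your setup); the 32 KB value is only illustrative.

// Hedged sketch (not from the thread): shrink ORC's per-stream compression
// buffer before submitting the job so thousands of column streams fit in the
// reducer heap. 262144 (256 KB) is the Hive 1.2.1 default.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class OrcBufferSizeConfig {
  public static Job configure(String jobName) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("hive.exec.orc.default.buffer.size", 32 * 1024);  // illustrative value
    // Optional: cap the fraction of the heap ORC writers may use (default 0.5).
    conf.setFloat("hive.exec.orc.memory.pool", 0.5f);
    return Job.getInstance(conf, jobName);
  }
}

Smaller buffers trade some compression efficiency for a bounded per-writer footprint, which is usually the right trade for very wide tables.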

Java Heap Memory OOM when using ORCNewInputFormat in MR

2016-08-30 Thread Hank baker
Hi all,

I'm trying to run a MapReduce job to convert CSV data into ORC using OrcNewOutputFormat (a reduce phase is required to satisfy some partitioning logic), but I am getting an OOM error in the reduce phase (during the merge, to be exact) with the stack trace attached below, for one particular table which has
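The rest of the message (the stack trace and the table's column count) is truncated in the archive. For readers unfamiliar with this output format, below is a minimal, hypothetical sketch of the kind of job being described: CSV in, ORC out via OrcNewOutputFormat, with rows keyed in the map phase so a reduce step can apply partitioning logic. The class names, the two-column schema, and the choice of partition key are illustrative assumptions, not the poster's code.

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat;
import org.apache.hadoop.hive.ql.io.orc.OrcSerde;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CsvToOrcDriver {

  // Mapper keys each CSV line by a stand-in partition value (first field here)
  // so the partitioning logic mentioned above can run in the reduce phase.
  public static class PartitionKeyMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String partitionKey = line.toString().split(",", -1)[0];
      context.write(new Text(partitionKey), line);
    }
  }

  // Reducer serializes each CSV line into an ORC row with OrcSerde and emits
  // (NullWritable, Writable), which is what OrcNewOutputFormat expects.
  public static class OrcWriterReducer extends Reducer<Text, Text, NullWritable, Writable> {
    private final OrcSerde serde = new OrcSerde();
    private ObjectInspector inspector;

    @Override
    protected void setup(Context context) {
      // Struct inspector describing the output schema; a real job would build
      // this from the table's (possibly very wide) column list.
      inspector = ObjectInspectorFactory.getStandardStructObjectInspector(
          Arrays.asList("col1", "col2"),
          Arrays.<ObjectInspector>asList(
              PrimitiveObjectInspectorFactory.javaStringObjectInspector,
              PrimitiveObjectInspectorFactory.javaStringObjectInspector));
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      for (Text value : values) {
        // Assumes each line has at least the two columns declared above.
        String[] fields = value.toString().split(",", -1);
        context.write(NullWritable.get(),
            serde.serialize(Arrays.<Object>asList(fields[0], fields[1]), inspector));
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "csv-to-orc");
    job.setJarByClass(CsvToOrcDriver.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setMapperClass(PartitionKeyMapper.class);
    job.setReducerClass(OrcWriterReducer.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(Writable.class);
    job.setOutputFormatClass(OrcNewOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}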