Re: OOM with Hive on Tez

2014-08-26 Thread Suma Shivaprasad
I am using Tez 0.4.0, and the counters for the query run are as below.

2014-08-26 14:06:41,203 INFO  [Thread-13]: exec.Task (TezTask.java:execute(171)) - org.apache.tez.common.counters.DAGCounter:
2014-08-26 14:06:41,205 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -NUM_FAILED_TASKS: 67
2014-08-26 14:06:41,205 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -NUM_KILLED_TASKS: 312
2014-08-26 14:06:41,205 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -TOTAL_LAUNCHED_TASKS: 259
2014-08-26 14:06:41,205 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -DATA_LOCAL_TASKS: 59
2014-08-26 14:06:41,205 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -RACK_LOCAL_TASKS: 27
2014-08-26 14:06:41,207 INFO  [Thread-13]: exec.Task (TezTask.java:execute(171)) - File System Counters:
2014-08-26 14:06:41,208 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -FILE: BYTES_READ: 0
2014-08-26 14:06:41,208 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -FILE: BYTES_WRITTEN: 3201156949
2014-08-26 14:06:41,208 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -FILE: READ_OPS: 0
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -FILE: LARGE_READ_OPS: 0
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -FILE: WRITE_OPS: 0
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -HDFS: BYTES_READ: 30052072845
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -HDFS: BYTES_WRITTEN: 0
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -HDFS: READ_OPS: 768
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -HDFS: LARGE_READ_OPS: 0
2014-08-26 14:06:41,209 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -HDFS: WRITE_OPS: 0
2014-08-26 14:06:41,211 INFO  [Thread-13]: exec.Task (TezTask.java:execute(171)) - org.apache.tez.common.counters.TaskCounter:
2014-08-26 14:06:41,211 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -GC_TIME_MILLIS: 148639
2014-08-26 14:06:41,211 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -CPU_MILLISECONDS: 1420020
2014-08-26 14:06:41,211 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -PHYSICAL_MEMORY_BYTES: 304725393408
2014-08-26 14:06:41,211 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -VIRTUAL_MEMORY_BYTES: 440084279296
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -COMMITTED_HEAP_BYTES: 337806557184
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -INPUT_RECORDS_PROCESSED: 722420718
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -OUTPUT_RECORDS: 144488481
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -OUTPUT_BYTES: 6876509984
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -OUTPUT_BYTES_WITH_OVERHEAD: 7165487118
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -OUTPUT_BYTES_PHYSICAL: 3201154197
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(171)) - org.apache.hadoop.hive.ql.exec.FilterOperator$Counter:
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -FILTERED: 863123081
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -PASSED: 215782564
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(171)) - org.apache.hadoop.hive.ql.exec.MapOperator$Counter:
2014-08-26 14:06:41,212 INFO  [Thread-13]: exec.Task (TezTask.java:execute(173)) -DESERIALIZE_ERRORS: 0
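
For anyone who wants to work with these numbers in a script, here is a minimal Python sketch that reads counter lines of the shape shown above into a dictionary and derives two rough ratios. It is not part of Hive or Tez: `parse_counters` and the regular expression are illustrative assumptions based only on the line shape in this log, and the two ratios are informal indicators, not official Tez metrics.

```python
import re

# Assumed line shape, taken from the log above:
#   "... (TezTask.java:execute(173)) -GC_TIME_MILLIS: 148639"
# Group headers such as "... - File System Counters:" end without a
# number, so they do not match and are skipped.
COUNTER_RE = re.compile(r"execute\(\d+\)\)\s*-\s*(.+):\s*(\d+)\s*$")


def parse_counters(log_text):
    """Return {counter name: integer value} for each counter line found."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


# A few lines copied from the log above (wrapped or unwrapped both work).
SAMPLE = """\
2014-08-26 14:06:41,207 INFO  [Thread-13]: exec.Task
(TezTask.java:execute(171)) - File System Counters:
(TezTask.java:execute(173)) -HDFS: BYTES_READ: 30052072845
(TezTask.java:execute(173)) -GC_TIME_MILLIS: 148639
(TezTask.java:execute(173)) -CPU_MILLISECONDS: 1420020
(TezTask.java:execute(173)) -OUTPUT_BYTES: 6876509984
(TezTask.java:execute(173)) -OUTPUT_BYTES_PHYSICAL: 3201154197
"""

counters = parse_counters(SAMPLE)

# Rough indicator only: GC time relative to CPU time across all tasks.
gc_share = counters["GC_TIME_MILLIS"] / counters["CPU_MILLISECONDS"]

# Bytes written after compression relative to logical output bytes.
physical_ratio = counters["OUTPUT_BYTES_PHYSICAL"] / counters["OUTPUT_BYTES"]

print(f"GC time / CPU time: {gc_share:.3f}")
print(f"physical / logical output bytes: {physical_ratio:.3f}")
```

Counter names that contain a colon, such as `HDFS: BYTES_READ`, are kept whole as dictionary keys because the pattern only treats the last colon before the number as the separator.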

Thanks
Suma



OOM with Hive on Tez

2014-08-26 Thread Suma Shivaprasad
I am trying to run a query on Tez with the following configurations:


set hive.tez.container.size=5120
set mapreduce.map.child.java.opts=-Xmx5120M
set hive.tez.java.opts=-Xmx4096M
set hive.auto.convert.join.noconditionaltask.size=805306000
set tez.am.resource.memory.mb=5120
set tez.am.java.opts=-Xmx4096M

The above config settings were chosen after running
https://github.com/hortonworks/hdp-configuration-utils/blob/master/2.1/hdp-configuration-utils.py
to get the right memory configs.
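
As a quick check on how these values relate to each other, the Python sketch below extracts the `-Xmx` value from each java.opts string and compares it with a container size. The helper `xmx_mb` is hypothetical, the pairing of each heap option with a container setting is an assumption made for illustration, and the script only reports proportions; it does not claim what the right ratio is for this workload.

```python
import re

# Values copied from the settings listed above.
settings = {
    "hive.tez.container.size": "5120",
    "hive.tez.java.opts": "-Xmx4096M",
    "tez.am.resource.memory.mb": "5120",
    "tez.am.java.opts": "-Xmx4096M",
}

# Assumed pairing: which heap option lives inside which container size.
PAIRS = {
    "hive.tez.java.opts": "hive.tez.container.size",
    "tez.am.java.opts": "tez.am.resource.memory.mb",
}


def xmx_mb(java_opts):
    """Return the -Xmx value in MB, or None if the option is absent."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    return value * 1024 if match.group(2) in "gG" else value


report = {}
for opts_key, container_key in PAIRS.items():
    heap = xmx_mb(settings[opts_key])
    container = int(settings[container_key])
    report[opts_key] = (heap, container, heap / container)

for opts_key, (heap, container, ratio) in report.items():
    print(f"{opts_key}: heap {heap} MB in a {container} MB container ({ratio:.0%})")
```

The difference between the container size and the heap is what remains for non-heap JVM memory, so it is worth knowing the figure before changing either value.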

I tried with both

set tez.runtime.io.sort.mb=512
set mapreduce.task.io.sort.mb=512

and

set tez.runtime.io.sort.mb=2048
set mapreduce.task.io.sort.mb=2048
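
To put the two sort-buffer values in proportion, the arithmetic below expresses each as a share of the 4096 MB task heap from hive.tez.java.opts. This is plain arithmetic on the numbers in this message, written as a small Python sketch; `sort_buffer_share` is a hypothetical helper, and whether the sort buffer has anything to do with the failure is not established here.

```python
TASK_HEAP_MB = 4096  # from hive.tez.java.opts=-Xmx4096M in this message


def sort_buffer_share(sort_mb, heap_mb=TASK_HEAP_MB):
    """Fraction of the task heap that a sort buffer of sort_mb would occupy."""
    return sort_mb / heap_mb


for sort_mb in (512, 2048):  # the two tez.runtime.io.sort.mb values tried
    share = sort_buffer_share(sort_mb)
    remaining = TASK_HEAP_MB - sort_mb
    print(f"io.sort.mb={sort_mb}: {share:.1%} of heap, {remaining} MB left for everything else")
```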


The query I am trying to run is:

select sum(tab1.m1), sum(tab1.m2)
from tab1 join tab2 dm on tab1.col1=tab2.col1
where tab1.dt = '2014-06-01'
and tab2.col2 = '..'
and tab2.col3 IN ('..')
group by TAB1.col1

Here TAB1.col1 has high cardinality (around 700-800 million distinct values).

It is going OOM during the shuffle phase:

errorMessage=Fetch failed
Container released by application,
AttemptID:attempt_1407396011310_1577_1_01_00_4 Info:Error:
exceptionThrown=java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
    at org.apache.tez.runtime.library.shuffle.common.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
    at org.apache.tez.runtime.library.shuffle.common.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:137)
    at org.apache.tez.runtime.library.shuffle.common.Fetcher.fetchInputs(Fetcher.java:252)
    at org.apache.tez.runtime.library.shuffle.common.Fetcher.call(Fetcher.java:184)
    at org.apache.tez.runtime.library.shuffle.common.Fetcher.call(Fetcher.java:59)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)


Please advise whether the configurations look OK. Do I need to change anything?



Thanks
Suma