Hi Anandha,

| reducers are still running but "DAG finished successfully" and inserted
| zero records.

The "containers" used by reducers can be still running even after the DAG
has completed , due to the container re-use, they will be waiting for
another "task" to be assigned to the same container.
Can you please check the task logs of the container where you see the
reducer still running? I suspect the reduce attempt has already completed
and the container is just waiting for a new task.

If that is the case, can you also check the Hive query that might be
causing zero records to be inserted?
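
A quick way to confirm what actually landed, using the table from your log
(running the SELECT side of your INSERT on its own is worth doing as well):

    -- Count what the load actually wrote to the target table.
    SELECT COUNT(*) FROM dw_insight.d_experiment;

    -- The table parameters here should match the numRows=0 from your log.
    DESCRIBE FORMATTED dw_insight.d_experiment;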

Thanks,
Ravi

On Mon, May 22, 2017 at 11:51 AM, Anandha L Ranganathan <
[email protected]> wrote:

> Thanks Ravi for the information. We will update
> tez.session.am.dag.submit.timeout.secs to a lower value in our config.
>
> Our main concern is that the reducers were still running, but the DAG
> reported "DAG finished successfully" and inserted zero records.
>
> [2017-04-21 18:29:38,597] {hive_hooks.py:130} INFO - INFO  : Table 
> dw_insight.d_experiment stats: [numFiles=0, numRows=0, totalSize=0, 
> rawDataSize=0]
>
>
> This resulted in an incorrect result set, and the data got corrupted.
>
> Thanks
> Anand
>
> On Sun, May 21, 2017 at 7:18 AM, Ravi Teja Chilukuri <[email protected]
> > wrote:
>
>> Hi Anandha,
>>
>> If your Hive query has completed but the containers are still running,
>> you might want to set the parameter tez.session.am.dag.submit.timeout.secs
>> to a lower value so that the YARN application shuts down completely. In
>> session mode, the AM container by default waits up to 5 minutes for
>> further queries to be submitted.
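>>
>> For example (a sketch; 30 seconds is just an illustrative value, pick
>> whatever suits your workload):
>>
>>     -- in the Hive session before the query, or cluster-wide in tez-site.xml
>>     set tez.session.am.dag.submit.timeout.secs=30;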
>>
>>
>> Thanks,
>> Ravi
>>
>> On Sat, May 20, 2017 at 4:44 AM, Anandha L Ranganathan <
>> [email protected]> wrote:
>>
>>> We hit a unique problem. The Tez containers are still running, but Hive
>>> moved the files from the staging location to the actual location. Is this
>>> a bug in the Tez engine?
>>>
>>> I have attached the logs; you can see that Map 1, Reducer 3, Reducer 4,
>>> and Reducer 7 are still running, but the DAG is SUCCEEDED.
>>>
>>>
>>>
>>> [2017-04-21 18:08:57,108] {hive_hooks.py:130} INFO - INFO  : Map 1: 0(+2)/8 
>>> Map 12: 1/1     Map 5: 1/1      Map 8: 1/1      Map 9: 1/1      Reducer 11: 
>>> 0/1 Reducer 3: 0/308        Reducer 4: 0/2  Reducer 7: 0/1
>>> [2017-04-21 18:09:57,116] {hive_hooks.py:130} INFO - INFO  : Map 1: 0(+2)/8 
>>> Map 12: 1/1     Map 5: 1/1      Map 8: 1/1      Map 9: 1/1      Reducer 11: 
>>> 0/1 Reducer 3: 0/308        Reducer 4: 0/2  Reducer 7: 0/1
>>> [2017-04-21 18:29:38,595] {hive_hooks.py:130} INFO - INFO  : Map 1: 0(+2)/8 
>>> Map 12: 1/1     Map 5: 1/1      Map 8: 1/1      Map 9: 1/1      Reducer 11: 
>>> 0/1 Reducer 3: 0/308        Reducer 4: 0/2  Reducer 7: 0/1
>>> [2017-04-21 18:29:38,595] {hive_hooks.py:130} INFO - INFO  :
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  : Status: DAG 
>>> finished successfully in 1560.07 seconds
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  :
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO -
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  : METHOD         
>>>                 DURATION(ms)
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  : parse          
>>>                           0
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  : 
>>> semanticAnalyze                          0
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  : TezBuildDag    
>>>                           0
>>> [2017-04-21 18:29:38,596] {hive_hooks.py:130} INFO - INFO  : 
>>> TezSubmitToRunningDag               12,957
>>> [2017-04-21 18:29:38,597] {hive_hooks.py:130} INFO - INFO  : TotalPrepTime  
>>>                      53,372
>>> [2017-04-21 18:29:38,597] {hive_hooks.py:130} INFO -
>>> [2017-04-21 18:29:38,597] {hive_hooks.py:130} INFO - INFO  : Loading data 
>>> to table dw_insight.d_experiment from 
>>> hdfs://dfs-nameservices/data/insight/dwuser/d_experiment/.hive-staging_hive_2017-04-21_18-02-31_741_4202396962843353301-4404/-ext-10000
>>> [2017-04-21 18:29:38,597] {hive_hooks.py:130} INFO - INFO  : Table 
>>> dw_insight.d_experiment stats: [numFiles=0, numRows=0, totalSize=0, 
>>> rawDataSize=0]
>>> [2017-04-21 18:29:38,597] {hive_hooks.py:130} INFO - No rows affected 
>>> (1626.877 seconds)
>>> [2017-04-21 18:29:38,604] {hive_hooks.py:130} INFO - 0: 
>>> jdbc:hive2://usw2prdpmn01:10000/dw_insight>
>>> [2017-04-21 18:29:38,611] {hive_hooks.py:130} INFO - 0: 
>>> jdbc:hive2://usw2prdpmn01:10000/dw_insight> ANALYZE TABLE 
>>> ${hiveconf:target_t
>>>
>>>
>>
>
