[ 
https://issues.apache.org/jira/browse/HIVE-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747776#action_12747776
 ] 

Zheng Shao commented on HIVE-797:
---------------------------------

The mapper script can write to stderr to avoid being killed.
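
As a rough illustration of the stderr approach, here is a minimal sketch of a TRANSFORM script that emits a heartbeat line on stderr at a fixed interval. Everything in it is an assumption made for illustration: the `my_boolean_function` body is a hypothetical stand-in for the real computation, and the 60-second interval and the heartbeat message text are arbitrary choices, not something Hive requires.

```python
import sys
import time

# Assumed interval; it should stay well below the cluster's task timeout.
HEARTBEAT_SECONDS = 60.0


def my_boolean_function(value):
    """Hypothetical stand-in for the real per-row computation."""
    return len(value) % 2 == 0


def run(rows, out, err, interval=HEARTBEAT_SECONDS, clock=time.monotonic):
    """Write one result per input row to out, plus periodic lines to err."""
    last_beat = clock()
    count = 0
    for count, line in enumerate(rows, start=1):
        value = line.rstrip("\n")
        out.write("true\n" if my_boolean_function(value) else "false\n")
        now = clock()
        if now - last_beat >= interval:
            # Heartbeat on stderr so the task is seen as alive.
            err.write("still working: %d rows processed\n" % count)
            err.flush()
            last_beat = now
    return count


if __name__ == "__main__":
    run(sys.stdin, sys.stdout, sys.stderr)
```

The heartbeat is tied to elapsed time rather than row count so that slow rows still produce output on stderr often enough; the interval would need tuning against the timeout configured on the cluster.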

Alternatively, you can do the following to achieve the same result:

{code}
set hive.script.auto.progress=true;
{code}


> mappers should report life in ways other than emitting data
> -----------------------------------------------------------
>
>                 Key: HIVE-797
>                 URL: https://issues.apache.org/jira/browse/HIVE-797
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: S. Alex Smith
>
> Mappers which are performing a great deal of aggregation can be killed by a 
> timeout even if they are running successfully.  For example, in the 
> following query the group by operator stops the mapper from returning any 
> rows of data until the map is entirely finished.  If the data processing 
> takes longer than the time-out limit, the job will fail.  The mapper should 
> instead offer the tracker some indication that it is busy working.  
> Alternatively, the tracker could ping the mapper with an appropriate question 
> / warning before it sends a kill signal.
> FROM (
>   FROM my_table
>   SELECT TRANSFORM(my_data)
>   USING 'my_boolean_function'
>   AS boolean_output) a
> SELECT boolean_output, COUNT(1)
> GROUP BY boolean_output

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.