[ https://issues.apache.org/jira/browse/HIVE-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747776#action_12747776 ]
Zheng Shao commented on HIVE-797:
---------------------------------

The mapper script can write to stderr to avoid being killed.

Alternatively, you can do the following to achieve the same result:
{code}
set hive.script.auto.progress=true;
{code}

> mappers should report life in ways other than emitting data
> -----------------------------------------------------------
>
>                 Key: HIVE-797
>                 URL: https://issues.apache.org/jira/browse/HIVE-797
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: S. Alex Smith
>
> Mappers which are performing a great deal of aggregation can be killed by
> a timeout even if they are running successfully. For example, in the
> following query the group by operator stops the mapper from returning any
> rows of data until the map is entirely finished. If the data processing
> takes longer than the time-out limit, the job will fail. The mapper should
> instead offer the tracker some indication that it is busy working.
> Alternatively, the tracker could ping the mapper with an appropriate question
> / warning before it sends a kill signal.
> FROM (
>   FROM my_table
>   SELECT TRANSFORM(my_data)
>   USING 'my_boolean_function'
>   AS boolean_output) a
> SELECT boolean_output, COUNT(1)
> GROUP BY boolean_output

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
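
For illustration, the stderr suggestion in the comment above could look roughly like the following TRANSFORM script. This is only a sketch under stated assumptions: the predicate body, the `run` helper, the heartbeat message text, and the 60-second interval are all hypothetical and not taken from the issue; only the idea of writing to stderr while working comes from the comment.

```python
import sys
import time

# Assumed interval; it should sit well below the cluster's task timeout.
HEARTBEAT_SECONDS = 60.0


def my_boolean_function(value):
    """Hypothetical predicate standing in for the real per-row logic."""
    return bool(value.strip())


def run(lines, out, err, interval=HEARTBEAT_SECONDS, clock=time.monotonic):
    """Emit one boolean per input row and a periodic note on stderr."""
    last_beat = clock()
    rows = 0
    for line in lines:
        value = line.rstrip("\n")
        out.write("true\n" if my_boolean_function(value) else "false\n")
        rows += 1
        now = clock()
        if now - last_beat >= interval:
            # The stderr write is the sign of life suggested in the comment.
            err.write("heartbeat: %d rows processed\n" % rows)
            err.flush()
            last_beat = now
    return rows


if __name__ == "__main__":
    run(sys.stdin, sys.stdout, sys.stderr)
```

The heartbeat is time-based rather than row-based so that a few slow rows still produce output on stderr. Setting hive.script.auto.progress as shown in the comment is the alternative that needs no change to the script.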