[ https://issues.apache.org/jira/browse/HIVE-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joydeep Sen Sarma resolved HIVE-277.
------------------------------------

    Resolution: Duplicate

> Files created by redundant tasks that have been killed are taken into consideration
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-277
>                 URL: https://issues.apache.org/jira/browse/HIVE-277
>             Project: Hadoop Hive
>          Issue Type: Bug
>         Environment: Hive with Hadoop 0.19.0 on a Linux cluster
>            Reporter: Rodrigo Schmidt
>            Priority: Minor
>
> Hadoop starts redundant tasks (mappers, by default) if some particular
> mappers are taking too long to execute. When one of the redundant tasks
> finishes, the others are killed. Killed tasks may generate output files
> (usually empty), and Hive treats them as part of the job output.
>
> In my case, I'm profiling one of the mappers in an INSERT OVERWRITE TABLE ...
> SELECT (map-only) query, and the extra time added by the profiler makes
> Hadoop start a second mapper for the same part of the input. When one of
> these redundant mappers finishes, the other is killed, and
> /tmp/hive-xxxx/xxxxxxxxx.10000.insclause-0/ will contain the following files:
>
> _tmp.attempt_XX....XX_XXXX_m_000000_0
> attempt_XX....XX_XXXX_m_000000_0
> attempt_XX....XX_XXXX_m_000000_1
> attempt_XX....XX_XXXX_m_000000_2
> ...
>
> The first file is empty, but Hive considers it part of the generated
> output and tries to load it into the destination table, giving the following
> error message:
>
> Loading data to table output_table partition {p=p1}
> Failed with exception Cannot load text files into a table stored as SequenceFile.
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
>
> I'm not sure whether the files generated by killed tasks will always be empty.
> If not, this bug might render the data inconsistent.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
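To illustrate the failure mode described in the report, here is a minimal sketch of filtering a directory listing so that leftover temporary files are not treated as job output. It is not Hive's actual fix. It assumes that the `_tmp.` prefix seen in the listing marks output that was never committed. The function name `committed_outputs` and the sample file names are hypothetical.

```python
# Assumption: files whose names start with "_tmp." were never committed
# by a finished task, so they should not be loaded into the table.
TMP_PREFIX = "_tmp."


def committed_outputs(filenames):
    """Return only the file names that do not carry the temporary prefix."""
    return [name for name in filenames if not name.startswith(TMP_PREFIX)]


# Hypothetical listing shaped like the one in the report.
listing = [
    "_tmp.attempt_example_m_000000_0",
    "attempt_example_m_000000_0",
    "attempt_example_m_000000_1",
]

for name in committed_outputs(listing):
    print(name)
```

The filter works on names rather than file sizes because, as the reporter notes, it is not certain that files left by killed tasks are always empty.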