[ 
https://issues.apache.org/jira/browse/PIG-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561934#action_12561934
 ] 

eric baldeschwieler commented on PIG-71:
----------------------------------------

Two different issues are being conflated here.

1) Metadata files in output directories.  Hadoop reserves the right to put 
metadata in the output directory, in files and directories whose names start 
with '_'.  Pig should behave consistently with this.  We should document this 
in both Hadoop and Pig.

2) Speculative execution seems to be leaving task directories behind.  If this 
is happening on successful runs, something is broken and should be fixed.
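
To make point 1 concrete, here is a rough sketch of the kind of filtering a
loader could do: skip any entry in the output directory whose name starts with
'_'. This is not Pig or Hadoop code. The class name, the method names, and the
sample paths are all made up for illustration, and it assumes the listing 
holds only the immediate children of the output directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Pig or Hadoop.
final class UnderscoreFilter {

    // True when the last path component starts with '_', the prefix
    // reserved for metadata files and directories.
    static boolean isReserved(String path) {
        // Drop trailing slashes so "out/_logs/" is treated like "out/_logs".
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--;
        }
        String trimmed = path.substring(0, end);
        String name = trimmed.substring(trimmed.lastIndexOf('/') + 1);
        return name.startsWith("_");
    }

    // Keep only the entries a loader should actually read.
    static List<String> loadable(List<String> entries) {
        List<String> kept = new ArrayList<>();
        for (String entry : entries) {
            if (!isReserved(entry)) {
                kept.add(entry);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample listing; these names are assumptions, not real job output.
        List<String> listing = Arrays.asList(
                "out/part-00000", "out/_logs", "out/_task_tmp/", "out/part-00001");
        System.out.println(loadable(listing)); // [out/part-00000, out/part-00001]
    }
}
```

The check looks only at the final path component, so it would not catch a file 
nested inside a '_' directory if the listing were recursive.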

> Support for Hadoop Speculative Execution
> ----------------------------------------
>
>                 Key: PIG-71
>                 URL: https://issues.apache.org/jira/browse/PIG-71
>             Project: Pig
>          Issue Type: New Feature
>         Environment: Hadoop
>            Reporter: Amir Youssefi
>            Priority: Minor
>
> If speculative execution is used in Hadoop while creating a data set, Pig 
> scripts that load this data set may fail. The reason is the temp directories 
> generated in the process. 
> Pig can filter out these temp directories to solve the problem. Here is a 
> sample error:
> [main] ERROR org.apache.pig - Error message from task (map) tip_..._0001_m_002735 java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>         at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1524)
>         at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1590)
>         at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1626)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1712)
>         at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
>         ...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
