[ 
https://issues.apache.org/jira/browse/PIG-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831400#action_12831400
 ] 

Hadoop QA commented on PIG-1231:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12435230/PIG-1231-1.patch
  against trunk revision 907760.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/206/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/206/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/206/console

This message is automatically generated.

> Default DataBagIterator.hasNext() should be idempotent in all cases
> -------------------------------------------------------------------
>
>                 Key: PIG-1231
>                 URL: https://issues.apache.org/jira/browse/PIG-1231
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1231-1.patch
>
>
> DefaultDataBagIterator.hasNext() is not repeatable when the conditions below 
> are met:
> 1. There are no more tuples in the last spill file.
> 2. There are no tuples in memory (all contents have been spilled to files).
> This is not acceptable because the name hasNext() implies that the call is 
> idempotent. In BagFormat, we misuse DataBagIterator.hasNext() by assuming 
> that it is always idempotent, which leads to some mysterious errors. 
> Condition 2 seems very restrictive, but when the databag is really big and 
> memory can hold fewer than a couple of tuples, the chance of hitting it is 
> high enough.
> Here is one error we saw:
> Caused by: java.io.IOException: Stream closed
>         at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:189)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readByte(DataInputStream.java:248)
>         at org.apache.pig.data.DefaultTuple.readFields(DefaultTuple.java:278)
>         at org.apache.pig.data.DefaultDataBag$DefaultDataBagIterator.readFromFile(DefaultDataBag.java:237)
>         ... 20 more
> This happens because we call hasNext(), which reaches EOF and closes the 
> file. We then call hasNext() again on the assumption that it is idempotent, 
> but the stream is already closed, so we get this error.
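
For illustration only, and not the contents of PIG-1231-1.patch: a minimal Java sketch of one common way to make hasNext() idempotent. It caches a one-element lookahead and records that the source is exhausted, so a stream that was closed at EOF is never read again. The names IdempotentHasNextSketch and CachingIterator, and the use of ints in place of tuples, are assumptions made for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example class; it is not part of Pig.
class IdempotentHasNextSketch {

    // Reads ints from a stream, standing in for tuples read from a spill file.
    static final class CachingIterator implements Iterator<Integer> {
        private final DataInputStream in;
        private Integer lookahead;   // element already read but not yet returned
        private boolean exhausted;   // set once EOF has been seen

        CachingIterator(InputStream raw) {
            this.in = new DataInputStream(new BufferedInputStream(raw));
        }

        @Override
        public boolean hasNext() {
            if (lookahead != null) return true;   // repeated call: no I/O
            if (exhausted) return false;          // repeated call after EOF: no I/O
            try {
                lookahead = in.readInt();
                return true;
            } catch (EOFException eof) {
                exhausted = true;                 // remember EOF before closing
                try { in.close(); } catch (IOException ignored) { }
                return false;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override
        public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            Integer value = lookahead;
            lookahead = null;                     // consume the cached element
            return value;
        }
    }

    // Encodes ints in the big-endian layout that DataInputStream.readInt() expects.
    static byte[] encode(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length);
        for (int v : values) buf.putInt(v);
        return buf.array();
    }

    public static void main(String[] args) {
        Iterator<Integer> it =
            new CachingIterator(new ByteArrayInputStream(encode(7, 9)));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
        // Calling hasNext() again after EOF does not touch the closed stream.
        System.out.println(it.hasNext());
        System.out.println(it.hasNext());
    }
}
```

With this shape, a caller that checks hasNext() more than once before calling next(), or again after the end of the data, gets the same answer each time and performs no further reads.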

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
