[ https://issues.apache.org/jira/browse/PIG-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-1231:
----------------------------

    Attachment: PIG-1231-1.patch

DefaultDataBagIterator is the only DataBag iterator that has this problem. The other data bags handle this through different mechanisms.

> DataBagIterator.hasNext() should be idempotent
> ----------------------------------------------
>
>                 Key: PIG-1231
>                 URL: https://issues.apache.org/jira/browse/PIG-1231
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1231-1.patch
>
>
> DataBagIterator.hasNext() is not repeatable in some situations. This is not
> acceptable because the name hasNext() implies that it is idempotent. While
> hasNext() returns true, it is repeatable, but once hasNext() returns false, it
> is not. In BagFormat, we misuse DataBagIterator.hasNext() on the assumption
> that hasNext() is always idempotent, which leads to some mysterious errors.
> Here is one error we saw:
> Caused by: java.io.IOException: Stream closed
>         at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:189)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readByte(DataInputStream.java:248)
>         at org.apache.pig.data.DefaultTuple.readFields(DefaultTuple.java:278)
>         at org.apache.pig.data.DefaultDataBag$DefaultDataBagIterator.readFromFile(DefaultDataBag.java:237)
>         ... 20 more
> This happens because we call hasNext(), which reaches EOF, and we close the
> file. Then we call hasNext() again on the assumption that it is idempotent.
> However, the stream is already closed, so we get this error message.
> This fix will go to DefaultDataBagIterator, DistinctDataBagIterator,
> CachedBagIterator, SortedDataBagIterator.
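
For illustration, here is a minimal sketch of one way to make hasNext() idempotent after EOF. It is not the attached patch and not Pig's actual code: the class names IdempotentStreamIterator and HasNextDemo, the int-per-record format, and the spill() helper are all assumptions made up for this example. The idea is to remember that EOF was seen, so later calls return false without reading from the closed stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a spill-file iterator; not Pig's DefaultDataBagIterator. */
class IdempotentStreamIterator implements Iterator<Integer> {
    private final DataInputStream in;
    private Integer buffered;   // record read ahead by hasNext(), or null
    private boolean exhausted;  // set once EOF has been reached

    IdempotentStreamIterator(InputStream source) {
        this.in = new DataInputStream(source);
    }

    @Override
    public boolean hasNext() {
        if (buffered != null) return true;  // repeatable while a record is pending
        if (exhausted) return false;        // repeatable after EOF: no read on the closed stream
        try {
            buffered = in.readInt();
            return true;
        } catch (EOFException eof) {
            exhausted = true;               // remember EOF before closing the stream
            try { in.close(); } catch (IOException ignored) { }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public Integer next() {
        if (!hasNext()) throw new NoSuchElementException();
        Integer result = buffered;
        buffered = null;
        return result;
    }
}

class HasNextDemo {
    /** Builds an in-memory "spill file" of ints (an assumption for this sketch). */
    static InputStream spill(int... values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (int v : values) out.writeInt(v);
            out.flush();
            // BufferedInputStream throws "Stream closed" if read after close().
            return new BufferedInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        IdempotentStreamIterator it = new IdempotentStreamIterator(spill(7));
        System.out.println(it.next());
        System.out.println(it.hasNext());  // reaches EOF and closes the stream
        System.out.println(it.hasNext());  // answered from the flag, no read attempted
    }
}
```

Without the exhausted flag, the second hasNext() call after EOF would try to read from the closed BufferedInputStream and fail with "Stream closed", as in the trace above.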