Some comments below.

On Jun 21, 2011, at 10:26 AM, Dan Berindei wrote:
> On Mon, Jun 20, 2011 at 11:42 PM, Sanne Grinovero <sa...@infinispan.org> wrote:
>> 2011/6/20 Manik Surtani <ma...@jboss.org>:
>>> Oddly enough, I don't see any other tests exhibiting this behaviour.
>>> Let me know if you see it in more recent CI runs, and we'll
>>> investigate in detail.
>>
>> In fact there are not many tests in core which verify that a full
>> stream is received; but, as I mentioned in another thread, I was
>> seeing the following exception relatively often (it never caught my
>> attention some months ago):
>>
>> Caused by: java.io.EOFException: The stream ended unexpectedly.
>> Please check whether the source of the stream encountered any issues
>> generating the stream.
>>   at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:193)
>>   at org.infinispan.statetransfer.StateTransferManagerImpl.processCommitLog(StateTransferManagerImpl.java:218)
>>   at org.infinispan.statetransfer.StateTransferManagerImpl.applyTransactionLog(StateTransferManagerImpl.java:245)
>>   at org.infinispan.statetransfer.StateTransferManagerImpl.applyState(StateTransferManagerImpl.java:284)
>>   ... 27 more
>> Caused by: java.io.EOFException: Read past end of file
>>   at org.jboss.marshalling.SimpleDataInput.eofOnRead(SimpleDataInput.java:126)
>>   at org.jboss.marshalling.SimpleDataInput.readUnsignedByteDirect(SimpleDataInput.java:263)
>>   at org.jboss.marshalling.SimpleDataInput.readUnsignedByte(SimpleDataInput.java:224)
>>   at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)
>>   at org.jboss.marshalling.AbstractObjectInput.readObject(AbstractObjectInput.java:37)
>>   at org.infinispan.marshall.jboss.GenericJBossMarshaller.objectFromObjectStream(GenericJBossMarshaller.java:191)
>>   at org.infinispan.marshall.VersionAwareMarshaller.objectFromObjectStream(VersionAwareMarshaller.java:191)
>>   ... 30 more
>>
> The line "at org.jboss.marshalling.river.RiverUnmarshaller.doReadObject(RiverUnmarshaller.java:209)"
> suggests EOF was reached while reading the lead byte of the object,
> not partway through an object. This is consistent with
> StateTransferManagerImpl.generateTransactionLog() getting a timeout
> while trying to acquire the processing lock (at
> StateTransferManagerImpl.java:192) and closing the stream in the
> middle of the transaction log, rather than with a transmission error.
> We could probably get rid of the exception in the logs by inserting
> another delimiter here.

You mean inserting a particular delimiter in case of a timeout acquiring the processing lock? That sounds like a good idea. I'm just back from holidays and can't remember well, but were previous EOFs related to processing lock acquisition? If so, I think your idea makes even more sense, because it would fall into a known problem and would provide the receiver side with that bit more information.

> Back to the original problem: if this were a stream corruption issue,
> I'd expect many more deserialization errors, because the length of the
> buffer would be smaller or larger than the number of bytes following
> it, and the next object deserialized from the stream would find
> garbage instead.
>
> This looks to me more like an index segment was created with size x on
> node A and also on node B, then updated with size y > x on node A, but
> only the metadata got to node B; the segment's byte array remained the
> same.
>
> I don't know anything about the Lucene directory implementation yet,
> so I have no idea if/how this could happen, and I haven't been able to
> reproduce it on my machine. Is there a way to see the Jenkins test
> logs?

There's the console log in http://goo.gl/JFh5R, but if this happens relatively often in the Lucene dir impl, we could create a small Jenkins run and pass a -Dlog4j.configuration configuring TRACE to be printed on the console.
The testsuite is small and would not generate a lot of logging. There's always a chance of not encountering the issue when TRACE is enabled, particularly if it's a race condition, but I think it's worth doing. IOW, I can set it up.

> Dan

>> This looks like a suspicious correlation to me, as I think the
>> reported errors are similar in nature.
>>
>> Cheers,
>> Sanne
>>
>>> On 18 Jun 2011, at 20:18, Sanne Grinovero wrote:
>>>
>>>> Hello all,
>>>> I'm not in a state to fully debug the issue this week, but even
>>>> though this failure happens in the Lucene Directory, it looks like
>>>> it's reporting an issue with Infinispan core:
>>>>
>>>> https://infinispan.ci.cloudbees.com/job/Infinispan-master-JDK6-tcp/90/org.infinispan$infinispan-lucene-directory/testReport/junit/org.infinispan.lucene/SimpleLuceneTest/org_infinispan_lucene_SimpleLuceneTest_testIndexWritingAndFinding/
>>>>
>>>> In this test we're writing to the index and then asserting on the
>>>> expected state on both nodes; while it succeeds on the same node as
>>>> the writer, it fails with "java.io.IOException: Read past EOF" on
>>>> the second node.
>>>>
>>>> This exception can mean only one thing: the value, which is a
>>>> buffer[], was not completely transferred to the second node, which
>>>> seems quite critical as the caches are using sync.
>>>> I can't reproduce the error locally, but it's not the first time it
>>>> has been reported by CI: builds 60, 62, 65, for example (and more),
>>>> show the same test case fail in the same manner.
>>>>
>>>> Cheers,
>>>> Sanne
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev@lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> --
>>> Manik Surtani
>>> ma...@jboss.org
>>> twitter.com/maniksurtani
>>>
>>> Lead, Infinispan
>>> http://www.infinispan.org

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
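P.S. On Dan's delimiter idea above: a minimal sketch of what a sender-side end-of-log marker could look like, so the receiver stops cleanly on a timeout instead of hitting an EOFException mid-read. All names here (END_OF_LOG, writeLog, readLog) are hypothetical illustrations, not the actual StateTransferManagerImpl API.

```java
import java.io.*;

public class TransactionLogStreamSketch {
    // Hypothetical sentinel written when the sender must stop early,
    // e.g. on a timeout acquiring the processing lock.
    static final int END_OF_LOG = -1;
    static final int ENTRY = 1;

    // Sender: always terminate the stream with the delimiter, even when
    // the log was cut short by a lock-acquisition timeout.
    static void writeLog(DataOutputStream out, byte[][] entries,
                         boolean timedOut) throws IOException {
        for (byte[] e : entries) {
            if (timedOut) break; // stop writing entries mid-log
            out.writeInt(ENTRY);
            out.writeInt(e.length);
            out.write(e);
        }
        out.writeInt(END_OF_LOG); // receiver sees a clean end, not EOF
        out.flush();
    }

    // Receiver: stop on the delimiter instead of reading until an
    // EOFException is thrown; returns the number of entries read.
    static int readLog(DataInputStream in) throws IOException {
        int count = 0;
        while (in.readInt() == ENTRY) {
            byte[] buf = new byte[in.readInt()];
            in.readFully(buf);
            count++;
        }
        return count;
    }
}
```

The point is that a timed-out sender still produces a well-formed stream; the receiver can then log "sender aborted" instead of a scary stack trace.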
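P.P.S. For the -Dlog4j.configuration idea, a minimal log4j.properties of the sort I had in mind might look like the following (the category names are assumptions; Infinispan's actual log categories may differ):

```properties
# Console-only appender, TRACE for Infinispan classes (assumed category)
log4j.rootLogger=INFO, CONSOLE
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p [%c{1}] (%t) %m%n
log4j.logger.org.infinispan=TRACE
```

It would be passed to the Jenkins run as `-Dlog4j.configuration=file:/path/to/log4j.properties`.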