[ 
https://issues.apache.org/jira/browse/SOLR-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006519#comment-15006519
 ] 

Renaud Delbru commented on SOLR-8292:
-------------------------------------

I have reviewed all the methods writing records to the tlog file, and all of 
them are properly synchronised with the flushing of the output stream. 
However, the access to the input stream is not synchronised. Could it be that 
one concurrent thread changed the fis position while another one was trying to 
read a record ? The CdcrLogReader#resetToLastPosition could interfere with the 
TransactionLog.LogReader#next.


> TransactionLog.next() does not honor contract and return null for EOF
> ---------------------------------------------------------------------
>
>                 Key: SOLR-8292
>                 URL: https://issues.apache.org/jira/browse/SOLR-8292
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>
> This came to light in CDCR testing, which stresses this code a lot, there's a 
> stack trace showing this line (641 trunk) throwing an EOF exception:
> o = codec.readVal(fis);
> At first I thought to just wrap reading fis in a try/catch and return null, 
> but looking at the code a bit more I'm not so sure, that seems like it'd mask 
> what looks at first glance like a bug in the logic.
> A few lines earlier (633-4) there's these lines:
> // shouldn't currently happen - header and first record are currently written 
> at the same time
> if (fis.position() >= fos.size()) {
> Why are we comparing the the input file position against the size of the 
> output file? Maybe because the 'i' key is right next to the 'o' key? The 
> comment hints that it's checking for the ability to read the first record in 
> input stream along with the header. And perhaps there's a different issue 
> here because the expectation clearly is that the first record should be there 
> if the header is.
> So what's the right thing to do? Wrap in a try/catch and return null for EOF? 
> Change the test? Do both?
> I can take care of either, but wanted a clue whether the comparison of fis to 
> fos is intended.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to