[ 
https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739664#comment-13739664
 ] 

Uwe Schindler commented on SOLR-5150:
-------------------------------------

Nice catch! As there is a positioned readFully, we can handle that in a good way 
without losing performance. Otherwise I would have suggested using an 
approach like the one in NIOFSDir (we use chunking + a while (remaining) loop 
and update the position pointer).
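
For illustration, the chunking + while (remaining) idea could be sketched roughly like this. It is not the actual NIOFSDir code: the class name, the CHUNK_SIZE value and the use of a plain java.nio FileChannel are assumptions made only to keep the example self-contained.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChunkedPositionalRead {
  // Hypothetical chunk size, chosen only for this sketch.
  static final int CHUNK_SIZE = 8192;

  /** Reads exactly len bytes starting at file offset pos into dst[off .. off+len). */
  static void readFully(FileChannel ch, long pos, byte[] dst, int off, int len)
      throws IOException {
    int remaining = len;
    while (remaining > 0) {
      int toRead = Math.min(CHUNK_SIZE, remaining);
      ByteBuffer bb = ByteBuffer.wrap(dst, off, toRead);
      // Positional read: leaves the channel's own position untouched and
      // may return fewer bytes than requested, hence the loop.
      int n = ch.read(bb, pos);
      if (n < 0) {
        throw new EOFException("read past EOF at position " + pos);
      }
      // Advance our own position pointer by what was actually read.
      pos += n;
      off += n;
      remaining -= n;
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("chunked", ".bin");
    try {
      byte[] data = new byte[20000];
      for (int i = 0; i < data.length; i++) {
        data[i] = (byte) i;
      }
      Files.write(tmp, data);
      try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
        byte[] out = new byte[15000];
        readFully(ch, 100, out, 0, out.length);
        System.out.println(out[0] == data[100] && out[14999] == data[15099]);
      }
    } finally {
      Files.delete(tmp);
    }
  }
}
```

The return value of every read is checked, so a short read just means another trip through the loop instead of silently missing bytes.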

bq. I also noticed that the seekInternal impl should not seek but be a no-op 
since we are seeking on the read.

Right! I don't know why seekInternal in BufferedIndexInput is still there. 
IMHO, it should be removed from the base class, as it is no longer used 
anywhere (at least it should default to an empty method). No IndexInput in 
Lucene implements it anymore, because with positional reads it is not 
applicable, and in the case of separate seek/read, the seek and read must be 
synchronized because of clones (unless every IndexInput has a separate file 
descriptor).
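
To make the clone problem concrete, here is a rough sketch of the two styles against one shared file descriptor. It uses plain JDK classes (RandomAccessFile, FileChannel) rather than Lucene or HDFS code, and the class and method names are made up for the example.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

final class SharedDescriptorReads {
  /**
   * Separate seek + read: both calls use the shared file pointer, so a clone
   * could move it between the two steps unless they are synchronized.
   */
  static void seekThenRead(RandomAccessFile shared, long pos, byte[] dst)
      throws IOException {
    synchronized (shared) {
      shared.seek(pos);
      shared.readFully(dst);
    }
  }

  /**
   * Positional read: the position is an argument and the shared file pointer
   * is never touched, so no lock and no seekInternal-style step is needed.
   */
  static void positionalRead(FileChannel shared, long pos, byte[] dst)
      throws IOException {
    ByteBuffer bb = ByteBuffer.wrap(dst);
    while (bb.hasRemaining()) {
      int n = shared.read(bb, pos + bb.position());
      if (n < 0) {
        throw new EOFException("read past EOF");
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("shared", ".bin");
    try {
      byte[] data = new byte[256];
      for (int i = 0; i < data.length; i++) {
        data[i] = (byte) i;
      }
      Files.write(tmp, data);
      try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r")) {
        byte[] a = new byte[16];
        byte[] b = new byte[16];
        seekThenRead(raf, 32, a);
        positionalRead(raf.getChannel(), 32, b);
        System.out.println(java.util.Arrays.equals(a, b));
      }
    } finally {
      Files.delete(tmp);
    }
  }
}
```

With the second style the only per-clone state is the logical position each clone tracks itself, which is why a seek on the underlying stream has nothing left to do.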
                
> HdfsIndexInput may not fully read requested bytes.
> --------------------------------------------------
>
>                 Key: SOLR-5150
>                 URL: https://issues.apache.org/jira/browse/SOLR-5150
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.4
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-5150.patch
>
>
> Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - 
> the read call we are using may not read all of the requested bytes - it 
> returns the number of bytes actually read - which we ignore.
> Blur moved to using a seek and then readFully call - synchronizing across the 
> two calls to deal with clones.
> We have seen that really kills performance, and using the readFully call that 
> lets you pass the position, rather than first doing a seek, performs much 
> better and does not require the synchronization.
> I also noticed that the seekInternal impl should not seek but be a no op 
> since we are seeking on the read.
