[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742831#comment-13742831 ]
Mark Miller commented on SOLR-5150:
-----------------------------------

bq. If the positional readFully approach would be slower, then this would be clearly a bug in Hdfs.

Right - if that turns out to be the case, I'd raise an issue with the hdfs team. The performance difference actually looks fairly large at first glance though, so it might be worth working around for a while if possible. I don't really know yet.

> HdfsIndexInput may not fully read requested bytes.
> --------------------------------------------------
>
>                 Key: SOLR-5150
>                 URL: https://issues.apache.org/jira/browse/SOLR-5150
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.4
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-5150.patch
>
>
> Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read call we are using may not read all of the requested bytes - it returns the number of bytes actually read - which we ignore.
> Blur moved to using a seek and then readFully call - synchronizing across the two calls to deal with clones.
> We have seen that this really kills performance, and that using the readFully call that lets you pass the position, rather than first doing a seek, performs much better and does not require the synchronization.
> I also noticed that the seekInternal impl should not seek but be a no-op, since we are seeking on the read.
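
For readers following the issue, here is a minimal sketch of the two read strategies described above. It is not the code in SOLR-5150.patch. To keep it runnable without Hadoop, it assumes the JDK's RandomAccessFile and FileChannel as stand-ins for an HDFS stream, and the class and method names (PositionalReadSketch, seekThenReadFully, readFullyAt) are hypothetical. On the Hadoop side, the positional call the issue refers to would presumably be FSDataInputStream.readFully(long position, byte[] buffer, int offset, int length).

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical illustration only; JDK classes stand in for an HDFS stream.
final class PositionalReadSketch {

    // Approach 1: seek, then readFully. The file pointer is shared state, so
    // the two calls are guarded by a lock in case clones share the handle.
    static void seekThenReadFully(RandomAccessFile file, long pos,
                                  byte[] buf, int off, int len) throws IOException {
        synchronized (file) {
            file.seek(pos);
            file.readFully(buf, off, len);
        }
    }

    // Approach 2: positional read. The position travels with each call and the
    // channel's own position is never modified, so no lock is taken here.
    // A single read may return fewer bytes than requested, so the return
    // value is checked and the read is repeated until the range is filled.
    static void readFullyAt(FileChannel channel, long pos,
                            byte[] buf, int off, int len) throws IOException {
        ByteBuffer dst = ByteBuffer.wrap(buf, off, len);
        long cur = pos;
        while (dst.hasRemaining()) {
            int n = channel.read(dst, cur);
            if (n < 0) {
                throw new EOFException(
                    "end of file with " + dst.remaining() + " bytes still unread");
            }
            cur += n;
        }
    }
}
```

With reads done this way, a seekInternal-style hook has nothing left to do, because every read already carries its own position. Whether the positional variant is actually faster on HDFS is the open question in this thread. The sketch only shows why it does not need the lock.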