[ https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742432#comment-13742432 ]
Mark Miller edited comment on SOLR-5150 at 8/16/13 5:38 PM:
------------------------------------------------------------

I've held off on committing this because some performance tests indicate the upstream Blur patch may have been more performant for merging/flushing, while the current patch is *much* more performant for queries. We might be able to use one or the other based on the IOContext.

I'm waiting until I can get some more results and testing done though - I've seen lots of random deadlock situations in some of my testing with the upstream Blur fix (synchronization around two calls).

was (Author: markrmil...@gmail.com):
    I've held off on committing this because some performance tests indicate the upstream blur patch may have been more performant for merging/flushing while the current patch is *much* more performant for queries. We might be able to use one or the other based on the IOContext.

I'm waiting until I can get some more results and testing done though - I've seen lots of random deadlock situations in some of my testing with the upstream blue fix (synchronization around two calls).

> HdfsIndexInput may not fully read requested bytes.
> --------------------------------------------------
>
>                 Key: SOLR-5150
>                 URL: https://issues.apache.org/jira/browse/SOLR-5150
>             Project: Solr
>          Issue Type: Bug
>   Affects Versions: 4.4
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-5150.patch
>
>
> Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here -
> the read call we are using may not read all of the requested bytes - it
> returns the number of bytes actually read - which we ignore.
> Blur moved to using a seek and then readFully call - synchronizing across the
> two calls to deal with clones.
> We have seen that really kills performance, and using the readFully call that
> lets you pass the position rather than first doing a seek performs much
> better and does not require the synchronization.
> I also noticed that the seekInternal impl should not seek but be a no-op,
> since we are seeking on the read.
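
The difference between a plain read and a positional readFully can be sketched as follows. This is only an illustration, not the code in SOLR-5150.patch: it uses java.nio's FileChannel as a stand-in for the HDFS stream so that it runs without Hadoop on the classpath, and the class name PositionalRead and its helper are hypothetical. Hadoop's FSDataInputStream offers a comparable readFully(long position, byte[] buffer, int offset, int length); the loop below shows what such a call has to do, under the assumption that a single read may return fewer bytes than requested.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration class; not part of Solr, Lucene, or Blur.
public class PositionalRead {

    // Reads exactly `length` bytes starting at absolute `position`.
    // A single read may return fewer bytes than requested, so we loop
    // and use the returned count instead of ignoring it.
    // The position is passed on every call, so the channel's own file
    // pointer is never moved: no seek, and no lock shared between clones.
    public static void readFully(FileChannel ch, long position,
                                 byte[] dst, int offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(dst, offset, length);
        long pos = position;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos);   // positional read, may be short
            if (n < 0) {
                throw new EOFException("EOF at position " + pos
                        + " with " + buf.remaining() + " bytes still requested");
            }
            pos += n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("positional-read", ".bin");
        try {
            Files.write(tmp, new byte[] {10, 11, 12, 13, 14, 15, 16, 17});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                byte[] out = new byte[4];
                readFully(ch, 3, out, 0, out.length);
                System.out.println(Arrays.toString(out)); // prints [13, 14, 15, 16]
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because every call carries its own position, a seek on such an input only needs to record the new offset, which is consistent with seekInternal being a no-op.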