[ 
https://issues.apache.org/jira/browse/SOLR-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742441#comment-13742441
 ] 

Mark Miller commented on SOLR-5150:
-----------------------------------

To describe that more fully: not deadlock - just really long pauses - no CPU or 
hard drive usage by either the HDFS processes or Solr for a *long* time - threads 
apparently hanging out in socket waits of some kind.

That is how I first saw the slowdown with the Blur fix - I was running one of 
the HdfsDirectory tests on my Mac and it took 10 min instead of 14 seconds. On 
Linux, that test was still fast. Some other perf tests around querying took a 
nose dive on Linux as well though. Meanwhile, some tests involving indexing 
sped up.

The current patch sped that test back up on my Mac and fixed the query perf 
test.

We might be able to get the best of both worlds, or the synchronized version 
might not be worth it.
                
> HdfsIndexInput may not fully read requested bytes.
> --------------------------------------------------
>
>                 Key: SOLR-5150
>                 URL: https://issues.apache.org/jira/browse/SOLR-5150
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.4
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.5, 5.0
>
>         Attachments: SOLR-5150.patch
>
>
> Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - 
> the read call we are using may not read all of the requested bytes - it 
> returns the number of bytes actually read - which we ignore.
> Blur moved to using a seek and then readFully call - synchronizing across the 
> two calls to deal with clones.
> We have seen that this really kills performance. Using the readFully call 
> that lets you pass the position, rather than first doing a seek, performs 
> much better and does not require the synchronization.
> I also noticed that the seekInternal impl should not seek but be a no-op, 
> since we are seeking on the read.
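
For illustration, a minimal sketch of the difference described above, not the
actual patch. It does not use Hadoop: PositionedSource and shortReads are
hypothetical stand-ins for a stream whose positional read may return fewer
bytes than requested. The readFully helper loops until the whole range has
arrived and takes the position as an argument, so it never touches a shared
stream position and clones would not need a lock around it. In Hadoop the
corresponding call is the positional
readFully(long position, byte[] buffer, int offset, int length) on
FSDataInputStream.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class PositionalReadSketch {

    /** Hypothetical stand-in: a positional read that may return fewer bytes than asked. */
    interface PositionedSource {
        int read(long position, byte[] buf, int off, int len) throws IOException;
    }

    /** Keeps reading until len bytes have arrived; no seek, no shared position. */
    static void readFully(PositionedSource src, long position,
                          byte[] buf, int off, int len) throws IOException {
        int done = 0;
        while (done < len) {
            int n = src.read(position + done, buf, off + done, len - done);
            if (n < 0) {
                throw new EOFException("hit end of data after " + done + " of " + len + " bytes");
            }
            done += n; // the return value must be used, not ignored
        }
    }

    /** Hypothetical test source: serves data from an array, at most maxChunk bytes per call. */
    static PositionedSource shortReads(byte[] data, int maxChunk) {
        return (position, buf, off, len) -> {
            if (position >= data.length) {
                return -1; // end of data
            }
            int n = (int) Math.min(Math.min(len, maxChunk), data.length - position);
            System.arraycopy(data, (int) position, buf, off, n);
            return n;
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        PositionedSource src = shortReads(data, 3);

        byte[] once = new byte[8];
        int n = src.read(2, once, 0, 8);   // a single call delivers only part of the range

        byte[] full = new byte[8];
        readFully(src, 2, full, 0, 8);     // the loop delivers all of it

        // prints "3 23456789"
        System.out.println(n + " " + new String(full, StandardCharsets.US_ASCII));
    }
}
```

The seek-then-read variant would have to hold a lock across both calls,
because clones share the underlying stream and its position; passing the
position with each read avoids that shared state.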

