[ https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228649#comment-13228649 ]
Mikhail Khludnev commented on SOLR-3011:
----------------------------------------

James,

bq. So it seems that for this to work, not only does the core (DocBuilder etc) need to be thread-safe, but every component in a given DIH configuration needs to be also.

I doubt that statement. I believe it's possible to have a bunch of non-thread-safe classes synchronized by some smart orchestrator.

bq. There also is quite a bit of code duplication in DocBuilder and classes

Yep, agreed. ThrdEPWrapper duplicates the DocBuilder code, for full import only.

bq. Mikhail, you've just noticed that MockDataSource was not designed to test a multi-threaded scenario in a valid fashion.

Not really; they are just odd mocks. With a real DS, every query gives you the full result set from the beginning, but once you reach EOF in MockDS's result set, re-querying gives you the same EOF.

bq. Take a look at TestDocBuilderThreaded.

I've never seen it, actually.

bq. 1. Keep 3.x as-is, and make any quick fixes to threads for common use-cases there, as possible.

No quick fix for any "common" use case is possible, I'm sure.

bq. 2. In 4.0 (or a separate branch), remove threading from DIH.

I suggest the opposite way:
* be honest with users and remove "threads" from 3.6. Zero impact here: nobody uses it, because it just doesn't work.
* I have also already spent enormous effort on fixing it in 4.0. I hope I will complete the fix anyway (it will live on GitHub at least). Btw, the reason I am fixing 4.0 is SOLR-2382. Actually, I waited some time for it to be completed.

bq. 4. Make DocBuilder, etc threadsafe. 5. Create a marker interface or annotation

I don't see how that is possible, or how it would really help.

bq. The SOLR-3011 patches work on 4.x .. But I can probably help with porting (some of?) this patch back to 3.x.

Petr found a case where the patch doesn't work. Once (if) I have fixed it, all commits around SOLR-2382 can be cherry-picked to 3.x. Porting the fix without DIHCacheSupport will take more time.
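
To make the "smart orchestrator" point above more concrete, here is a minimal Java sketch of one way it could look. It is only an illustration under assumptions: `UnsafeRowSource` and `OrchestratorSketch` are hypothetical names, not DIH classes, and this is not the approach taken in the SOLR-3011 patches. The idea shown is thread confinement: only the orchestrating thread ever touches the non-thread-safe component, and the worker threads receive nothing but immutable rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a non-thread-safe component (not a Solr class).
class UnsafeRowSource {
    private final Iterator<String> rows; // unsynchronized, mutable cursor

    UnsafeRowSource(List<String> data) {
        this.rows = data.iterator();
    }

    // Returns the next row, or null at end of data.
    String nextRow() {
        return rows.hasNext() ? rows.next() : null;
    }
}

// Hypothetical orchestrator: it alone reads from the unsafe source.
class OrchestratorSketch {
    static List<String> run(UnsafeRowSource source, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            String row;
            // Only the calling thread advances the unsafe cursor.
            while ((row = source.nextRow()) != null) {
                final String current = row;
                // Per-row work is stateless, so it can run on any worker.
                Callable<String> task = () -> current.toUpperCase();
                pending.add(pool.submit(task));
            }
            // Collect results in submission order, i.e. source order.
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("import sketch failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> out = run(new UnsafeRowSource(Arrays.asList("a", "b", "c")), 4);
        System.out.println(out);
    }
}
```

The trade-off in this sketch is that reading rows stays single-threaded; only the per-row processing is parallel. Whether that fits nested DIH entities is exactly the kind of question the design discussion would need to settle.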

In addition to my opposite proposals above, I think we really need to start designing a new Ultimate DIH. I propose:
# to pick up use cases (you are experienced in extreme caching, I did throughput maximization via an async producer-consumer, Peter will give us his cases, etc.)
# sketch a design in PlantUML, check that it's bulletproof
# cut in

> DIH MultiThreaded bug
> ---------------------
>
>                 Key: SOLR-3011
>                 URL: https://issues.apache.org/jira/browse/SOLR-3011
>             Project: Solr
>          Issue Type: Sub-task
>          Components: contrib - DataImportHandler
>    Affects Versions: 3.5, 4.0
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-3011.patch, SOLR-3011.patch,
> patch-3011-EntityProcessorBase-iterator.patch,
> patch-3011-EntityProcessorBase-iterator.patch
>
>
> The current DIH design is not thread-safe. See the last comments at SOLR-2382 and
> SOLR-2947. I'm going to provide a patch that makes the DIH core thread-safe. Mostly
> it's the SOLR-2947 patch from 28th Dec.