[ 
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228649#comment-13228649
 ] 

Mikhail Khludnev commented on SOLR-3011:
----------------------------------------

James,

bq. So it seems that for this to work, not only does the core (DocBuilder etc) 
need to be thread-safe, but every component in a given DIH configuration needs 
to be also.

To me that is a doubtful statement. I believe it's possible to have a bunch of 
thread-unsafe classes synchronized by some smart orchestrator. 
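
A minimal sketch of that idea, assuming a single lock is enough. All class 
names here (UnsafeRowSource, Orchestrator) are hypothetical and are not DIH 
code; the point is only that the component itself stays unsynchronized while 
the orchestrator serializes every call into it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a component that is NOT thread-safe. */
class UnsafeRowSource {
    private int next = 0;
    private final int limit;

    UnsafeRowSource(int limit) { this.limit = limit; }

    // Unsynchronized on purpose; returns null once the data is exhausted.
    Integer nextRow() { return next < limit ? Integer.valueOf(next++) : null; }
}

/** Hypothetical orchestrator: the only place where locking happens. */
class Orchestrator {
    private final UnsafeRowSource source;
    private final Object lock = new Object();

    Orchestrator(UnsafeRowSource source) { this.source = source; }

    // Every access to the unsafe component goes through this one lock.
    Integer pull() {
        synchronized (lock) { return source.nextRow(); }
    }

    /** Lets `workers` threads pull rows concurrently and returns all rows seen. */
    List<Integer> drain(int workers) {
        final List<Integer> seen = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                Integer row;
                while ((row = pull()) != null) {
                    synchronized (seen) { seen.add(row); }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (seen) { return new ArrayList<>(seen); }
    }
}
```

Each row is handed to exactly one worker even though UnsafeRowSource has no 
synchronization of its own. Whether one coarse lock is acceptable for 
throughput is a separate question.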

bq. There also is quite a bit of code duplication in DocBuilder and classes

Yep, agreed. ThrdEPWrapper is a full-import-only duplicate of DocBuilder code.

bq. Mikhail, you've just noticed that MockDataSource was not designed to test a 
multi-threaded scenario in a valid fashion.

Not really, they are just odd mocks. With a real DataSource you get a full 
result set from the beginning every time, but after you reach EOF in 
MockDataSource's result set, re-querying gets you the same EOF.
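
To illustrate the difference, here is a small sketch. Both classes are 
hypothetical, and the assumption is that the mock hands back one shared 
iterator per query while a real source opens a fresh one each time.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical: like a real data source, every query starts a fresh result set. */
class FreshResultSource {
    private final List<String> rows;

    FreshResultSource(List<String> rows) { this.rows = rows; }

    Iterator<String> query() { return rows.iterator(); }
}

/** Hypothetical: like the mock described above, every query returns the same iterator. */
class SharedIteratorSource {
    private final Iterator<String> shared;

    SharedIteratorSource(List<String> rows) { this.shared = rows.iterator(); }

    Iterator<String> query() { return shared; }
}

class ResultSetDemo {
    /** Consumes the iterator and returns how many rows it produced. */
    static int count(Iterator<String> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }
}
```

With two rows, counting FreshResultSource.query() twice gives 2 and 2, while 
counting SharedIteratorSource.query() twice gives 2 and 0, because the second 
query starts already at EOF.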

bq. Take a look at TestDocBuilderThreaded.

I've never seen it actually.

bq. 1. Keep 3.x as-is, and make any quick fixes to threads for common use-cases 
there, as possible.

No quick fix for any "common" use case is possible, I'm sure.

bq. 2. In 4.0 (or a separate branch), remove threading from DIH.

I suggest the opposite way:
* be honest with users and remove "threads" from 3.6. Zero impact here: nobody 
uses it, it just doesn't work.
* I have also already spent enormous effort fixing it in 4.0, and I hope to 
complete the fix anyway (it will live on GitHub at least). Btw, the reason I am 
fixing 4.0 is SOLR-2382; I actually waited some time for it to be completed. 

bq. 4. Make DocBuilder, etc threadsafe. 5. Create a marker interface or 
annotation

I don't see how that is possible or how it would really help.

bq.  The SOLR-3011 patches work on 4.x .. But I can probably help with porting 
(some of?) this patch back to 3.x.

Petr found a case where the patch doesn't work. Once (if) I get it done, all 
commits around SOLR-2382 can be cherry-picked to 3.x. Porting the fix without 
DIHCacheSupport will take more time.

In addition to my opposite proposals above, I think we really need to start 
designing a new Ultimate DIH. I propose:
# to pick up use cases (you are experienced in extreme caching, I did 
throughput maximization via an async producer-consumer, Peter will give us his 
cases, etc.)
# sketch a design in PlantUML and check that it's bulletproof 
# cut in 
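
For reference, the async producer-consumer use case mentioned above has 
roughly the following shape. This is only a sketch with hypothetical names 
and a bounded queue as an assumed hand-off, not the code I actually used: one 
thread reads rows, another transforms them, and the queue decouples the two.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ProducerConsumerSketch {
    // Sentinel marking the end of input; compared by identity, never by value.
    private static final String EOF = new String("EOF");

    /** Pushes rows through a bounded queue and returns the transformed rows in order. */
    static List<String> run(List<String> rows, int queueSize) {
        final BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueSize);
        final List<String> indexed = new ArrayList<>();

        // Producer: stands in for reading rows from a data source.
        Thread producer = new Thread(() -> {
            try {
                for (String row : rows) {
                    queue.put(row); // blocks while the queue is full
                }
                queue.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumer: stands in for transforming and indexing rows.
        Thread consumer = new Thread(() -> {
            try {
                for (String row = queue.take(); row != EOF; row = queue.take()) {
                    indexed.add(row.toUpperCase(Locale.ROOT));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join(); // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return indexed;
    }
}
```

The queue size bounds memory use: a slow consumer makes the producer block 
instead of letting rows pile up.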


 
                
> DIH MultiThreaded bug
> ---------------------
>
>                 Key: SOLR-3011
>                 URL: https://issues.apache.org/jira/browse/SOLR-3011
>             Project: Solr
>          Issue Type: Sub-task
>          Components: contrib - DataImportHandler
>    Affects Versions: 3.5, 4.0
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-3011.patch, SOLR-3011.patch, 
> patch-3011-EntityProcessorBase-iterator.patch, 
> patch-3011-EntityProcessorBase-iterator.patch
>
>
> The current DIH design is not thread-safe; see the last comments on SOLR-2382 
> and SOLR-2947. I'm going to provide a patch that makes the DIH core 
> thread-safe. It is mostly the SOLR-2947 patch from 28th Dec. 


        
