[
https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228649#comment-13228649
]
Mikhail Khludnev commented on SOLR-3011:
----------------------------------------
James,
bq. So it seems that for this to work, not only does the core (DocBuilder etc)
need to be thread-safe, but every component in a given DIH configuration needs
to be also.
To me that's a doubtful statement. I believe it's possible to have a bunch of
thread-unsafe classes synchronized by some smart orchestrator.
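To illustrate what I mean by an orchestrator, here is a minimal sketch in plain Java. It is not DIH code: UnsafeCounter and Orchestrator are hypothetical names made up for this example, and thread confinement (one component instance per task) is just one possible strategy, assumed here for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical stand-in for a thread-unsafe component; not a real Solr class.
class UnsafeCounter {
    private int count = 0; // plain mutable state, no synchronization

    int process(int rows) {
        count += rows;
        return count;
    }
}

// Hypothetical orchestrator: every task creates its own component instance,
// so no instance is ever shared between two threads.
class Orchestrator {
    static int runAll(int tasks, int rowsPerTask, Supplier<UnsafeCounter> factory) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> {
                    UnsafeCounter own = factory.get(); // confined to this task
                    int last = 0;
                    for (int r = 0; r < rowsPerTask; r++) {
                        last = own.process(1);
                    }
                    return last;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // wait for each task and sum its row count
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling Orchestrator.runAll(8, 1000, UnsafeCounter::new) sums the per-task counts. The component itself stays unsynchronized; only the orchestrator decides which thread touches which instance.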
bq. There also is quite a bit of code duplication in DocBuilder and classes
Yep, agreed. ThrdEPWrapper is a full-import-only duplicate of the DocBuilder code.
bq. Mikhail, you've just noticed that MockDataSource was not designed to test a
multi-threaded scenario in a valid fashion.
Not really, they are just odd mocks. With a real DataSource, every query gives
you a full result set from the beginning, but once you reach EOF in MockDS's
result set, re-querying gives you the same EOF.
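A small sketch of the difference, in plain Java. FreshSource and SharedCursorMock are hypothetical names for this illustration only and are not the actual MockDataSource code; the assumption is simply that the mock hands back one shared cursor on every query.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical: behaves like a real data source, new cursor on every query.
class FreshSource {
    private final List<String> rows;

    FreshSource(List<String> rows) {
        this.rows = rows;
    }

    Iterator<String> query() {
        return rows.iterator(); // starts from the first row each time
    }
}

// Hypothetical: behaves like the odd mock, one cursor shared by all queries.
class SharedCursorMock {
    private final Iterator<String> cursor;

    SharedCursorMock(List<String> rows) {
        this.cursor = rows.iterator();
    }

    Iterator<String> query() {
        return cursor; // once exhausted, every later query is already at EOF
    }
}

class SourceDemo {
    // Consumes the iterator and returns how many rows it produced.
    static int drain(Iterator<String> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }
}
```

Draining FreshSource twice yields all the rows both times, while the second drain of SharedCursorMock yields nothing, which is why a test built on such a mock does not say much about re-querying.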
bq. Take a look at TestDocBuilderThreaded.
I've never seen it actually.
bq. 1. Keep 3.x as-is, and make any quick fixes to threads for common use-cases
there, as possible.
No quick fixes are possible for any "common" use cases, I'm sure.
bq. 2. In 4.0 (or a separate branch), remove threading from DIH.
I suggest the opposite way:
* be honest with users and remove "threads" from 3.6. Zero impact here: nobody
uses it, it just doesn't work.
* I have also already spent enormous effort fixing it in 4.0, and I hope to
complete the fix anyway (it will live on GitHub at least). Btw, the reason
I'm fixing 4.0 is SOLR-2382; I actually waited some time for it to be completed.
bq. 4. Make DocBuilder, etc threadsafe. 5. Create a marker interface or
annotation
I don't see how that is possible or how it would really help.
bq. The SOLR-3011 patches work on 4.x .. But I can probably help with porting
(some of?) this patch back to 3.x.
Petr found a case where the patch doesn't work. After (if) I get that fixed, all
commits around SOLR-2382 can be cherry-picked to 3.x. Porting the fix without
DIHCacheSupport will take more time.
In addition to my opposite proposals above, I think we really need to start
designing a new Ultimate DIH. I propose:
# to pick up use cases (you are experienced in extreme caching, I did
throughput maximization via async producer-consumer, Peter will give us his
cases, etc.)
# sketch a design in PlantUML, check that it's bulletproof
# cut in
> DIH MultiThreaded bug
> ---------------------
>
> Key: SOLR-3011
> URL: https://issues.apache.org/jira/browse/SOLR-3011
> Project: Solr
> Issue Type: Sub-task
> Components: contrib - DataImportHandler
> Affects Versions: 3.5, 4.0
> Reporter: Mikhail Khludnev
> Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3011.patch, SOLR-3011.patch,
> patch-3011-EntityProcessorBase-iterator.patch,
> patch-3011-EntityProcessorBase-iterator.patch
>
>
> The current DIH design is not thread-safe. See the last comments on SOLR-2382
> and SOLR-2947. I'm going to provide a patch that makes the DIH core
> thread-safe. Mostly it's the SOLR-2947 patch from 28th Dec.