[ https://issues.apache.org/jira/browse/SOLR-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242310#comment-13242310 ]
Bernd Fehling commented on SOLR-3011:
-------------------------------------

I just tried the multi-threaded mode. It creates the requested number of threads (verified in the debugger), but the import only runs once. My configuration is:

    <dataConfig>
      <dataSource name="filetraverser" type="FileDataSource" encoding="UTF-8" />
      <document>
        <entity name="basedata" processor="FileListEntityProcessor" threads="4"
                rootEntity="false" fileName="\.xml$" recursive="true"
                dataSource="null" baseDir="/srv/www/solr/DATA/OAI" >
          <entity name="records" processor="XPathEntityProcessor" threads="4"
                  rootEntity="true" dataSource="filetraverser" stream="true"
                  forEach="/documents/document" url="${basedata.fileAbsolutePath}" >
            <field column="id" xpath="/documents/document/@id" />
            <field column="dctitle" xpath="/documents/document/element[@name='dctitle']/value" />
          </entity>
        </entity>
      </document>
    </dataConfig>

It should read all files below baseDir and build documents from the records inside the files. This works fine without multi-threading, but in multi-threaded mode only the first file is read. Any idea?

One more thing to mention: TestThreaded.java contains these lines:

    @Test
    public void testCachedThreadless_FullImport() throws Exception {
      runFullImport(getCachedConfig(random.nextBoolean(), random.nextBoolean(), 0));
    }

    @Test
    public void testCachedSingleThread_FullImport() throws Exception {
      runFullImport(getCachedConfig(random.nextBoolean(), random.nextBoolean(), 1));
    }

    @Test
    public void testCachedThread_FullImport() throws Exception {
      int numThreads = random.nextInt(9) + 1; // between one and 10
      String config = getCachedConfig(random.nextBoolean(), random.nextBoolean(), numThreads);
      runFullImport(config);
    }

These tests cover 0 threads, 1 thread, and a random count from 1 to 9 (nextInt(9) + 1 never returns 10, despite the inline comment). But 1 is already covered by the single-thread test. So wouldn't it be better to use "random.nextInt(8) + 2" for the range 2 to 9?
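To double-check the two ranges, here is a small standalone sketch. The class name ThreadCountRange and the helper observedRange are made up for illustration and are not part of TestThreaded.java; it only relies on java.util.Random.nextInt(int), which returns a value from 0 (inclusive) to the bound (exclusive).

```java
import java.util.Random;

// Hypothetical helper class, not part of the Solr test suite.
public class ThreadCountRange {

    // Draws rnd.nextInt(bound) + offset many times and returns {min, max} seen.
    static int[] observedRange(Random rnd, int bound, int offset, int draws) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < draws; i++) {
            int v = rnd.nextInt(bound) + offset;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        Random rnd = new Random(42L);
        // Expression currently used in testCachedThread_FullImport
        int[] current = observedRange(rnd, 9, 1, 100_000);
        // Expression proposed in the comment above
        int[] proposed = observedRange(rnd, 8, 2, 100_000);
        System.out.println("nextInt(9) + 1 -> " + current[0] + ".." + current[1]);
        System.out.println("nextInt(8) + 2 -> " + proposed[0] + ".." + proposed[1]);
    }
}
```

With this many draws both endpoints should show up, so the first expression spans 1 to 9 and the second 2 to 9, which is why the second one would avoid repeating the single-thread case.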
> DIH MultiThreaded bug
> ---------------------
>
>                 Key: SOLR-3011
>                 URL: https://issues.apache.org/jira/browse/SOLR-3011
>             Project: Solr
>          Issue Type: Sub-task
>          Components: contrib - DataImportHandler
>   Affects Versions: 3.5
>            Reporter: Mikhail Khludnev
>            Assignee: James Dyer
>            Priority: Minor
>             Fix For: 3.6
>
>         Attachments: SOLR-3011.patch, SOLR-3011.patch, SOLR-3011.patch,
> SOLR-3011.patch, SOLR-3011.patch,
> patch-3011-EntityProcessorBase-iterator.patch,
> patch-3011-EntityProcessorBase-iterator.patch
>
>
> The current DIH design is not thread-safe. See the last comments at SOLR-2382 and
> SOLR-2947. I'm going to provide a patch that makes the DIH core thread-safe. Mostly
> it's the SOLR-2947 patch from 28th Dec.