[jira] [Commented] (SOLR-3360) Problem with DataImportHandler multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255044#comment-13255044 ]

Claudio R commented on SOLR-3360:
---------------------------------

Hi James Dyer and Mikhail Khludnev,

I added the line below to Tomcat's logging.properties:

    org.apache.solr.handler.dataimport.JdbcDataSource.level=FINE

Then I ran the import again in 3.6.0 with 10 threads. The select below was executed 10 times:

    select url from video where indice_id_indice = '257933'

This sub-entity select should have been executed only once.

> Problem with DataImportHandler multi-threaded
> ---------------------------------------------
>
>                 Key: SOLR-3360
>                 URL: https://issues.apache.org/jira/browse/SOLR-3360
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.6
>         Environment: Solr 3.6.0, Apache Tomcat 6.0.20, jdk1.6.0_15, Windows XP
>            Reporter: Claudio R
>
> Hi,
> If I use dataimport with 1 thread, I get:
>
>     5001
>     1000
>     0
>     2012-04-16 11:21:57
>     Indexing completed. Added/Updated: 1000 documents. Deleted 0 documents.
>     2012-04-16 11:23:19
>     1000
>     0:1:22.390
>
> If I use dataimport with 10 threads, I get:
>
>     0
>     1
>     0
>     2012-04-16 11:31:43
>     Indexing completed. Added/Updated: 1 documents. Deleted 0 documents.
>     2012-04-16 11:41:50
>     1
>     0:10:7.586
>
> The configuration with 10 threads took more than 7 times as long as the
> configuration with 1 thread (0:10:7.586 versus 0:1:22.390).
> I have 1000 records in the database.
> My db-data-config.xml is shown below:
>
>     url="jdbc:sqlserver://200.XXX.XXX.XXX:1433;databaseName=test"
>     user="user"
>     password="pass"/>
>
>     transformer="RegexTransformer,TemplateTransformer"
>     query="select top 1000 i.id_indice, i.a, i.b from indice i where i.status = 'I'"
>     deltaImportQuery="i.id_indice, i.a, i.b from indice i where id_indice in ('${dataimporter.delta.id_indice}')"
>     deltaQuery="select id_indice from indice where status='I' and data_hora_modificacao >= convert(datetime, '${dataimporter.last_index_time}', 120)"
>     deletedPkQuery="select id_indice from indice where status='D' and data_hora_modificacao >= convert(datetime, '${dataimporter.last_index_time}', 120)">
>
>     transformer="RegexTransformer,TemplateTransformer"
>     query="select categoria, sub_categoria from filtro where indice_id_indice = '${indice.id_indice}'">
>
>     template="${filtro.categoria}|${filtro.sub_categoria}" />
>
> Thanks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
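The repeated sub-entity select described in the comment above can be checked mechanically once FINE logging is enabled. The following is only a minimal sketch: it assumes each executed statement appears in the log after the text "Executing SQL: ", which is an assumption about the log format and should be adjusted to match the real catalina.out. The function names are hypothetical.

```python
from collections import Counter

# ASSUMPTION: the FINE-level message prefixes each statement with this text.
# Adjust SQL_MARKER to whatever the actual log lines contain.
SQL_MARKER = "Executing SQL: "


def count_queries(log_lines):
    """Count how many times each SQL statement appears after SQL_MARKER."""
    counts = Counter()
    for line in log_lines:
        _, found, sql = line.partition(SQL_MARKER)
        if found:
            counts[sql.strip()] += 1
    return counts


def repeated_queries(log_lines):
    """Return only the statements that were logged more than once."""
    return {sql: n for sql, n in count_queries(log_lines).items() if n > 1}


if __name__ == "__main__":
    # Hypothetical log excerpt built from the select quoted in the comment.
    sample = [
        "FINE: Executing SQL: select url from video where indice_id_indice = '257933'"
    ] * 10
    for sql, n in repeated_queries(sample).items():
        print(f"{n}x {sql}")
```

With a real log, pass the lines of the file (for example `open(path).readlines()`) to `repeated_queries`; any sub-entity select listed with a count above 1 for the same parent id was executed more often than expected.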
[jira] [Commented] (SOLR-3360) Problem with DataImportHandler multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254875#comment-13254875 ]

Claudio R commented on SOLR-3360:
---------------------------------

I ran version 3.6.0 with 20 threads and got 2 documents processed:

    0
    2
    0
    2012-04-16 15:10:22
    Indexing completed. Added/Updated: 2 documents. Deleted 0 documents.
    2012-04-16 15:24:04
    2
    0:13:42.110
[jira] [Commented] (SOLR-3360) Problem with DataImportHandler multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254855#comment-13254855 ]

Claudio R commented on SOLR-3360:
---------------------------------

Hi James,

With version 3.5.0, I got unstable behavior with 10 threads. The first full-import succeeded:

    0
    1000
    0
    2012-04-16 14:12:08
    Indexing completed. Added/Updated: 1000 documents. Deleted 0 documents.
    2012-04-16 14:13:21
    2012-04-16 14:13:21
    1000
    0:1:12.875

But in the second and third full-imports I got:

    Indexing failed. Rolled back all changes.
    0:0:6.906
    0
    12
    11
    0
    2012-04-16 14:15:38
    Indexing failed. Rolled back all changes.
    2012-04-16 14:15:43

In catalina.out, I got:

SEVERE: Full Import failed:java.lang.RuntimeException: Error in multi-threaded import
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:265)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:187)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select categoria, sub_categoria from filtro where indice_id_indice = '257346'
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
    at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
    at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
    at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
    at org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper.nextRow(ThreadedEntityProcessorWrapper.java:84)
    at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:446)
    at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:399)
    at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:466)
    at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.access$000(DocBuilder.java:353)
    at org.apache.solr.handler.dataimport.DocBuilder$EntityRunner$1.run(DocBuilder.java:406)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: socket closed
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1368)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1355)
    at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1532)
    at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:3274)
    at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:4433)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:784)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:685)
    at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4026)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1416)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:185)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:160)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:658)
    at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246)
    ... 13 more

In version 3.6.0 with 10 threads I did not see the unstable behavior I got in version 3.5.0.

In version 3.6.0 I tried without transformers:

    0
    1
    0
    2012-04-16 14:30:39
    Indexing completed. Added/Updated: 1 documents. Deleted 0 documents.
    2012-04-16 14:39:45
    1
    0:9:5.719

In version 3.6.0, with just the parent entity:

    0
    1
    0
    2012-04-16 14:42:49
    Indexing completed. Added/Updated: 1 documents. Deleted 0 documents.
    2012-04-16 14:49:05
    1
    0:6:16.0

It is strange to get only 1 document processed when I have 1000 records in the database.

Thanks.
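To compare the elapsed times reported in this thread, the status strings can be converted to seconds. This is a minimal sketch under one assumption: the value has the form hours:minutes:seconds.milliseconds, with the part after the dot being a plain millisecond count (so "16.0" means 16 seconds and 0 ms). The function name and the `RUNS` table are hypothetical; the labels and values are copied from the comments in this thread.

```python
def parse_time_taken(text):
    """Convert an 'H:M:S.ms' string such as '0:10:7.586' to seconds.

    ASSUMPTION: the digits after the dot are a millisecond count,
    not a zero-padded decimal fraction.
    """
    hours, minutes, rest = text.strip().split(":")
    seconds, _, millis = rest.partition(".")
    total_ms = ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000
    total_ms += int(millis or 0)
    return total_ms / 1000.0


# Elapsed times as reported in the thread.
RUNS = {
    "3.6.0, 1 thread": "0:1:22.390",
    "3.6.0, 10 threads": "0:10:7.586",
    "3.6.0, 20 threads": "0:13:42.110",
    "3.5.0, 10 threads (first full-import)": "0:1:12.875",
    "3.6.0, without transformers": "0:9:5.719",
    "3.6.0, parent entity only": "0:6:16.0",
}

if __name__ == "__main__":
    baseline = parse_time_taken(RUNS["3.6.0, 1 thread"])
    for label, taken in RUNS.items():
        secs = parse_time_taken(taken)
        print(f"{label}: {secs:.3f} s ({secs / baseline:.2f}x the 1-thread run)")
```

Under that assumption, the 3.6.0 run with 10 threads took about 7.4 times as long as the single-threaded run, and the run with 20 threads took about 10 times as long, while processing 1 and 2 documents respectively instead of 1000.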