[
https://issues.apache.org/jira/browse/SOLR-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923765#action_12923765
]
Lance Norskog edited comment on SOLR-2186 at 10/23/10 8:47 PM:
---------------------------------------------------------------
This is the dataConfig.xml. It is very simple: it walks a directory and indexes
every PDF file it finds.
If you change threads='4' to threads='1', it still fails. If you remove the
threads attribute entirely, the import runs.
{noformat}
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <entity name="jc" dataSource="null"
            pk="id"
            processor="FileListEntityProcessor"
            fileName="^.*\.pdf$" recursive="false"
            baseDir="/lucid/private_pdfs/10.pdfs"
            transformer="TemplateTransformer"
            threads='4'
            >
      <field column="id" template="${jc.fileAbsolutePath}"/>
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${jc.fileAbsolutePath}"
              parser="org.apache.tika.parser.pdf.PDFParser"
              onError="skip"
              >
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
{noformat}
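For comparison, here is a minimal sketch of the variant that the note above says runs. It assumes the only change is dropping the threads attribute from the outer entity; everything else is copied from the failing config.
{noformat}
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- Same outer entity as above, with the threads attribute removed -->
    <entity name="jc" dataSource="null"
            pk="id"
            processor="FileListEntityProcessor"
            fileName="^.*\.pdf$" recursive="false"
            baseDir="/lucid/private_pdfs/10.pdfs"
            transformer="TemplateTransformer"
            >
      <field column="id" template="${jc.fileAbsolutePath}"/>
      <entity name="tika-test" processor="TikaEntityProcessor"
              url="${jc.fileAbsolutePath}"
              parser="org.apache.tika.parser.pdf.PDFParser"
              onError="skip"
              >
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
{noformat}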
> DataImportHandler multi-threaded option throws exception
> --------------------------------------------------------
>
> Key: SOLR-2186
> URL: https://issues.apache.org/jira/browse/SOLR-2186
> Project: Solr
> Issue Type: Bug
> Components: contrib - DataImportHandler
> Reporter: Lance Norskog
>
> The multi-threaded option for the DataImportHandler throws an exception and
> the entire operation fails. This is true even if only 1 thread is configured
> via *threads='1'*.