You refer to dataSource="bin", but I don't see that data source defined anywhere. E.g.:

<dataSource type="BinURLDataSource" name="bin"/>

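So, there might be some weird default fallback that just causes
strange problems. For instance, if the Tika entity ends up on the JDBC
data source, that would fit the "Unable to execute query: <URL>" message.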
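
To make that concrete, here is a rough sketch of how I would expect the whole DIH config to look, with both data sources named explicitly. I have not seen your attached files, so treat everything here as an assumption: the data source name "db", the entity name "root", the table "pages", its columns "id" and "url", and the SQLite path are all made up for illustration. Only the entity name "extract" is taken from your stack trace.

```xml
<dataConfig>
  <!-- JDBC source for the outer entity. The name "db", the driver and
       the url are placeholders; substitute your Oracle or SQLite settings. -->
  <dataSource name="db" type="JdbcDataSource"
              driver="org.sqlite.JDBC"
              url="jdbc:sqlite:/path/to/example.db"/>

  <!-- Binary source that the Tika entity refers to by name. -->
  <dataSource name="bin" type="BinURLDataSource"/>

  <document>
    <!-- Outer entity: reads one row per URL from the (hypothetical) table. -->
    <entity name="root" dataSource="db"
            query="SELECT id, url FROM pages">
      <field column="id" name="id"/>

      <!-- Inner entity: fetches the row's URL through "bin" and lets
           Tika extract the body text into the "text" field. -->
      <entity name="extract" processor="TikaEntityProcessor"
              dataSource="bin" url="${root.url}" format="text">
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The point is that each entity names its data source explicitly, so the inner entity should never be handed the JDBC one by accident.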

Regards,
    Alex.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 10 October 2014 14:17, Dan Davis <dansm...@gmail.com> wrote:
>
> What I want to do is to pull a URL out of an Oracle database, and then use
> TikaEntityProcessor and BinURLDataSource to go fetch and process that URL.
> I'm having a problem with this that seems general to JDBC with Tika - I get
> an exception as follows:
>
> Exception in entity :
> extract:org.apache.solr.handler.dataimport.DataImportHandlerException:
> Unable to execute query: http://www.cdc.gov/healthypets/pets/wildlife.html
> Processing Document # 14
>       at
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:71)
> ...
>
> Steps to reproduce any problem should be:
>
> 1. Try it with the XML and verify you get two documents and they contain text
>    (schema browser with the text field)
> 2. Try it with a JDBC sqlite3 dataSource and verify that you get an exception,
>    and advise me what may be the problem in my configuration ...
>
> Now, I've tried this 3 ways:
>
> 1. My Oracle database - fails as above
> 2. An SQLite3 database to see if it is Oracle specific - fails with "Unable to
>    execute query", but doesn't have the URL as part of the message.
> 3. An XML file listing two URLs - succeeds without error.
>
> For the SQL attempts, setting onError="skip" lets the data from the
> database be indexed, but the exception is logged for each root entity.
> I can tell that nothing is indexed from the text extraction by browsing the
> "text" field from the schema browser and seeing how few terms there are.
> The exceptions also sort of give it away, but it is good to be careful :)
>
> This is using:
>
> - Tomcat 7.0.55
> - Solr 4.10.1
> - JDBC drivers:
>   - ojdbc7.jar
>   - sqlite-jdbc-3.7.2.jar
>
> Excerpt of solrconfig.xml:
>
>   <!-- Data Import Handler for Health Topics -->
>   <requestHandler name="/dih-healthtopics" class="solr.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">dih-healthtopics.xml</str>
>     </lst>
>   </requestHandler>
>
>   <!-- Data Import Handler that imports a single URL via Tika -->
>   <requestHandler name="/dih-smallxml" class="solr.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">dih-smallxml.xml</str>
>     </lst>
>   </requestHandler>
>
>   <!-- Data Import Handler that imports a single URL via Tika -->
>   <requestHandler name="/dih-smallsqlite" class="solr.DataImportHandler">
>     <lst name="defaults">
>       <str name="config">dih-smallsqlite.xml</str>
>     </lst>
>   </requestHandler>
>
>
> The data import handlers and a copy-paste from Solr logging are attached.
