Dear Solr Team,

I am trying to index Word and PDF documents with Solr using SolrJ, but most of the examples I found online use the SolrServer class, which I believe is deprecated. The connection to Solr itself works, because I can add SolrInputDocuments to the index, but indexing rich documents fails with an exception.


package com.mycompany.app;

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class Main {

    public static void main(String[] args) throws IOException, SolrServerException {
        String urlString = "http://localhost:8983/solr/localDocs16";
        HttpSolrClient solr = new HttpSolrClient.Builder(urlString).build();

        // is working
        for (int i = 0; i < 1000; ++i) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("cat", "book");
            doc.addField("id", "book-" + i);
            doc.addField("name", "The Legend of the Hobbit part " + i);
            solr.add(doc);
            if (i % 100 == 0) solr.commit();  // periodically flush
        }

        // is not working
        File file = new File("path\\testfile.pdf");
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("update/extract");
        req.addFile(file, "application/pdf");
        req.setParam("literal.id", "doc1");
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        try {
            solr.request(req);
        } catch (IOException e) {
            logFailure("IO", e);
        } catch (SolrServerException e) {
            logFailure("SolrServer", e);
        } catch (Exception e) {
            logFailure("UnknownException", e);
        } finally {
            solr.commit();
        }
    }

    // Writes the stack trace to a file and prints a short message to the console.
    private static void logFailure(String label, Exception e) throws IOException {
        try (PrintWriter out = new PrintWriter("C:\\Users\\mareike\\Desktop\\filename.txt")) {
            e.printStackTrace(out);
        }
        System.out.println(label + " message: " + e.getMessage());
    }
}


I am using Maven (pom.xml attached) and created a JAR file, which I then tried to execute from the command line, and this is the output I get:

    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    SLF4J: Failed to load class "org.slf4j.impl.StaticMDCBinder".
    SLF4J: Defaulting to no-operation MDCAdapter implementation.
    SLF4J: See http://www.slf4j.org/codes.html#no_static_mdc_binder for further details.
    UnknownException message: Error from server at http://localhost:8983/solr/localDocs17: Bad contentType for search handler :application/pdf request={wt=javabin&version=2}
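
One possible reading of this error, offered as an assumption and not a confirmed diagnosis: the handler path in the code above is "update/extract" without a leading slash, and the server complains about the search handler, so the client may be falling back to a default search path such as /select when the path does not start with "/". The sketch below is a simplified, hypothetical model of that rule. HandlerPathSketch, effectivePath and DEFAULT_PATH are illustrative names, not SolrJ API.

```java
// Hypothetical, simplified model of how a client might choose the request path.
// This is not SolrJ code; all names here are made up for illustration.
final class HandlerPathSketch {

    // Assumed fallback handler, matching the "search handler" in the error message.
    static final String DEFAULT_PATH = "/select";

    // Returns the path the request would be sent to under the assumed rule:
    // a path that is null or lacks a leading slash is replaced by the default.
    static String effectivePath(String requestedPath) {
        if (requestedPath == null || !requestedPath.startsWith("/")) {
            return DEFAULT_PATH;
        }
        return requestedPath;
    }

    public static void main(String[] args) {
        // Path as written in the code above (no leading slash).
        System.out.println(effectivePath("update/extract"));
        // Same path with a leading slash.
        System.out.println(effectivePath("/update/extract"));
    }
}
```

If this assumption holds, constructing the request with "/update/extract" would be the first thing to try, provided the extracting request handler is configured for the core.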



I hope you can help me with this. I also posted this question on Stack Overflow <https://stackoverflow.com/questions/56149903/indexing-rich-documents-with-solrj-bad-contenttype-for-search-handler>.

Cheers,
Mareike Glock

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.app</groupId>
  <artifactId>solr-search</artifactId>
  <packaging>jar</packaging>
  <version>1.0</version>
  <name>solr-search</name>
  <url>http://maven.apache.org</url>

  <properties>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
  </properties>

  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.mycompany.app.Main</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.solr</groupId>
      <artifactId>solr-solrj</artifactId>
      <version>7.7.0</version>
    </dependency>
  </dependencies>

</project>
