I am using the extract URL and renamed the file to test.txt, but it is still parsed with the XML parser. Can I force the txt parser for all .txt files?
Sent from my Samsung device.

-------- Original message --------
From: Shawn Heisey <apa...@elyograg.org>
Date: 04.01.17 17:10 (GMT+01:00)
To: solr-user@lucene.apache.org
Subject: Re: update/extract override ExtractTyp

On 1/4/2017 8:12 AM, sn0...@ulysses-erp.com wrote:
> Is it possible to override the ExtractClass for a specific document?
> I would like to upload an XML document, but this XML is not well-formed.
>
> I need this XML because it is part of a project where a corrupt XML is
> needed for testing purposes.
>
> The update/extract process fails every time with a 500 error.
>
> I tried to override the Content-Type with "text/plain" but still get
> the XML parse error.

If you send something to the /update handler, and don't tell Solr that it is another format that it knows like CSV, JSON, or Javabin, then Solr assumes that it is XML -- and that it is the *specific* XML format that Solr uses. "text/plain" is not one of the formats that the update handler knows how to handle, so it will assume XML.

If you send some other arbitrary XML content, even if that XML is otherwise correctly formed (which apparently yours isn't), Solr will throw an error, because it is not the type of XML that Solr is looking for.

On this page are some examples of what Solr is expecting when you send XML:

https://wiki.apache.org/solr/UpdateXmlMessages

If you want to parse arbitrary XML into fields, you probably need to send it using DIH and the XPathEntityProcessor.

If you want the XML to go into a field completely as-is, then you need to encode the XML into one of the update formats that Solr knows (XML, JSON, etc) and set it as the value of one of the fields.

Thanks,
Shawn
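
As a rough illustration of that last suggestion, here is a minimal Python sketch that wraps the raw (possibly malformed) XML text as a string value inside a JSON update document. The function name `build_update_payload` and the field name `raw_xml_s` are made up for this example, and it assumes the schema has a string field that matches that name. The sketch only builds the request body; sending it to the /update handler with a JSON content type is left out because that depends on the installation.

```python
import json


def build_update_payload(doc_id, raw_xml, field="raw_xml_s"):
    """Return a JSON update body holding the XML text as an opaque string.

    `field` is a hypothetical field name; adjust it to a string field
    that exists in your schema.
    """
    # The XML is never parsed here, so it does not need to be well-formed.
    doc = {"id": doc_id, field: raw_xml}
    # json.dumps escapes quotes, backslashes and control characters,
    # so the XML text survives as a plain string value.
    return json.dumps([doc])


if __name__ == "__main__":
    broken_xml = '<root><item name="a">unclosed</root>'
    print(build_update_payload("test-1", broken_xml))
```

Because the XML travels as an ordinary string, Solr's own XML update parser never sees it, which is the point of encoding it into a known update format.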