This thread started on [email protected], but it appears to be a more general problem with GSearch, so I'm also copying it to fedora-users.

On Mar 18, 2013, at 8:06 PM, Peter Murray <[email protected]> wrote:

> Does default configuration of GSearch for Islandora-7.x index the FULL_TEXT
> datastream of objects created by the PDF Solution Pack? The search engine
> appears to index the metadata without fail. I've even gone into the GSearch
> updateIndex web screen and updated all of the FOXML files. I'm using the
> GSearch 2.5 (the version previous to the one released today)
> 'fgsconfig-basic-for-islandora.properties' updated with the passwords and
> locations specific to my setup.

I've dug a little deeper on this and am still stymied. It looks like objects with PDFs are not getting indexed. GSearch is showing this error:

DEBUG 2013-03-18 21:34:09,583 (Config) insertSystemProperties propertyValue=http://localhost:8080/solr
DEBUG 2013-03-18 21:34:09,594 (OperationsImpl) closeIndexSearcher indexName=FgsIndex
DEBUG 2013-03-18 21:34:09,595 (OperationsImpl) closeIndexReader indexName=FgsIndex docCount=45
ERROR 2013-03-18 21:34:09,597 (UpdateListener) Unable to perform index update due to Exception: Mon Mar 18 21:34:09 EDT 2013 Connection error (is Solr running at http://localhost:8080/solr/update ?): java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:8080/solr/update
dk.defxws.fedoragsearch.server.errors.GenericSearchException: Mon Mar 18 21:34:09 EDT 2013 Connection error (is Solr running at http://localhost:8080/solr/update ?): java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:8080/solr/update
    at dk.defxws.fgssolr.OperationsImpl.postData(OperationsImpl.java:653)
    at dk.defxws.fgssolr.OperationsImpl.indexDoc(OperationsImpl.java:473)
    at dk.defxws.fgssolr.OperationsImpl.fromPid(OperationsImpl.java:413)

This correlates with the following Solr error in catalina.out:

Mar 18, 2013 9:34:09 PM org.apache.solr.common.SolrException log
SEVERE: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0xc) not a valid XML character
 at [row,col {unknown-source}]: [1668,5]
    at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:45)
    at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:729)
    at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3659)
    at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
    at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:315)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:156)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)

The discussions I'm seeing on Stack Exchange about the "…not a valid XML character" message point to XML that is being generated with characters that are invalid in XML. (In this case it is 0xC, the "form feed" character.)

Before I start tracing around the guts of GSearch, does this sound familiar to anyone?

Peter
-- 
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
[email protected]
+1 678-235-2955
800.999.8558 x2955
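
To make the "invalid XML character" point concrete, here is a minimal Python sketch of the kind of filtering those discussions describe: replacing every character outside the XML 1.0 Char range (0xC, the form feed, is one of them) with a space before the text is placed in an XML document. The function name strip_invalid_xml_chars is hypothetical and chosen only for illustration. This is not GSearch code, and it makes no claim about where in the GSearch/Solr pipeline such a filter would belong.

```python
import re

# XML 1.0 allows only: #x9 | #xA | #xD | #x20-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF.
# This pattern matches any single character outside those ranges.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = " ") -> str:
    """Return text with characters that XML 1.0 forbids replaced.

    Hypothetical helper for illustration; the default replacement is a
    space so that words on either side of a form feed stay separated.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # A form feed (0x0C) between two pieces of extracted text.
    sample = "page one\x0cpage two"
    print(repr(strip_invalid_xml_chars(sample)))
```

Tabs, line feeds and carriage returns are left alone because XML 1.0 permits them; only the other control characters and out-of-range code points are replaced.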
