-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I was one of the people who urged Gert to add that functionality. The motivation is to be able to extract technical assertions about binary datastreams and use them in indexing. It's not about extracting content from images, although it could extract content from PDF files or other text-containing formats.

On perhaps a more useful note, you should definitely expect to alter the default indexing stylesheets or, even better, to create your own that are suited to your particular purposes.

- ---
A. Soroka
The University of Virginia Library

On Jul 24, 2013, at 8:32 AM, Alistair Young wrote:

> sorted it by removing the Apache Tika extraction from:
>
> WEB-INF/classes/fgsconfigFinal/index/FgsIndex/foxmlToSolrGenerated.xslt
>
> it seems it extracts the content and tries to index it. Not sure why it would
> want to extract the content of an image but when it does it causes Solr to
> fail to index the resource:
>
> SEVERE: org.apache.solr.common.SolrException: Illegal character (NULL,
> unicode 0) encountered: not valid in any content
>
> Seems to only think some jpg files are not jpg files.
>
> Alistair
>
> --
> mov eax,1
> mov ebx,0
> int 80h
>
> From: Alistair Young <[email protected]>
> Reply-To: "Support and info exchange list for Fedora users."
> <[email protected]>
> Date: Wednesday, 24 July 2013 11:03
> To: "Support and info exchange list for Fedora users."
> <[email protected]>
> Subject: Re: [fcrepo-user] Does gsearch index content with solr?
>
> sorry should have mentioned, it's the content datastream, i.e. image/jpeg
>
> Alistair
>
> --
> mov eax,1
> mov ebx,0
> int 80h
>
> From: Alistair Young <[email protected]>
> Reply-To: "Support and info exchange list for Fedora users."
> <[email protected]>
> Date: Wednesday, 24 July 2013 10:59
> To: "Support and info exchange list for Fedora users."
> <[email protected]>
> Subject: [fcrepo-user] Does gsearch index content with solr?
>
> I have a weird problem. I dropped a foxml file into
> FgsConfig/indexingXsltGenerator/foxml and configured etc but certain files,
> when uploaded cause solr to crash:
>
> SEVERE: org.apache.solr.common.SolrException: Illegal character (NULL,
> unicode 0) encountered: not valid in any content
>
> If I don't include datastream in the foxml it doesn't cause the crash, i.e.
> remove this:
>
> <foxml:datastream ID="AUDIT" STATE="A" CONTROL_GROUP="X" VERSIONABLE="false">
>
> Should the foxml used to configure gsearch only contain 'metadata', i.e. DC,
> RDF etc and not datastreams?
>
> thanks,
>
> Alistair
>
> _______________________________________________
> Fedora-commons-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-users

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)
Comment: GPGTools - http://gpgtools.org

iQEcBAEBAgAGBQJR78vHAAoJEATpPYSyaoIk8dsIALihgJB0b4OABcOcOnk2qthk
79JqHouayvOFwTNMHsHZMIPXQ9KlD7h/zrHVYPPOqXV8fvNb3+EeQEal5WJxs4Z3
mMevFpEpBlOWUOBAiEqayNNfnxNCGQ3ARCRXNzeiaheM43ouFCluOGkX9p3fjqSV
qq6QG862vDFvYF69rMH1NiFIUIA/QP8w/K/QzyI8qoblrzWCX2LmQ8NaH5b0oN1j
Nb0NXIQv+XOVJZeHFvbHNEzGMGMEWHKs2QsZ1auirOKaO3ccV74+gVTuvDkmmuXL
VjQQoxNBTqbkhSpoDsWPCkHE+fVGuWyFS/ffJQ/0heX1rWOkiOFgJhhGuwJOl2Y=
=s4aM
-----END PGP SIGNATURE-----
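
The Solr error quoted in the thread ("Illegal character (NULL, unicode 0) encountered") is what one would expect when text extracted from a binary datastream is passed into an XML update document unfiltered. As an illustration only, here is a minimal Python sketch of filtering out characters that XML 1.0 does not allow. The function name strip_invalid_xml_chars is hypothetical, and this is not how GSearch, Tika, or Solr handle the problem; the thread's own fix was to remove the Tika extraction from the stylesheet.

```python
import re

# Code points permitted by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# Everything else (including NULL, U+0000) is rejected by XML parsers.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with every character that XML 1.0 forbids removed.

    Hypothetical helper for illustration; not part of any library.
    """
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # Simulated output of a text extractor run against binary data.
    extracted = "JFIF\x00\x01 some text\n"
    print(repr(strip_invalid_xml_chars(extracted)))
```

A filter like this would only hide the symptom; for image datastreams, skipping content extraction altogether (as done in the thread) is the more direct approach.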
