I think he's saying that the field is called 'Quality & File Format', and gsearch replaces the whitespace with underscores but leaves the ampersand unmodified. The resulting Solr XML document is then malformed, because the ampersand isn't escaped.
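
To make the failure concrete, here is a minimal sketch in Python. It is only an illustration: gsearch itself does this work in its own templates, not in Python, and the function names naive_field_name and safe_field_name are hypothetical helpers I made up for the example. The first mimics the behaviour described above, the second is one possible workaround (replacing every character that is not a letter, digit, or underscore), and the escaped variant shows the other option of keeping the ampersand but encoding it.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def naive_field_name(raw):
    # Mimics the behaviour described above (an assumption about gsearch):
    # whitespace becomes underscores, '&' is left untouched.
    return re.sub(r"\s+", "_", raw)


def safe_field_name(raw):
    # Hypothetical workaround: collapse every run of characters outside
    # [A-Za-z0-9_] into a single underscore.
    return re.sub(r"[^A-Za-z0-9_]+", "_", raw).strip("_")


def is_well_formed(xml_text):
    # True if the text parses as XML, False otherwise.
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "Quality & File Format"

# Unescaped '&' inside the attribute value: not well-formed XML.
broken = '<add><doc><field name="%s">JPEG</field></doc></add>' % naive_field_name(raw)

# Option 1: keep the name, but escape it (quoteattr adds the quotes itself).
escaped = "<add><doc><field name=%s>JPEG</field></doc></add>" % quoteattr(naive_field_name(raw))

# Option 2: strip the special characters out of the name entirely.
sanitised = '<add><doc><field name="%s">JPEG</field></doc></add>' % safe_field_name(raw)

for label, doc in (("broken", broken), ("escaped", escaped), ("sanitised", sanitised)):
    print(label, is_well_formed(doc))
```

Escaping makes the document parse, but the field would still be named with a literal ampersand on the Solr side, so sanitising the name is probably the more practical of the two if the schema relies on predictable field names.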

On Wed, Jul 18, 2012 at 4:19 AM, Richard Green <[email protected]> wrote:
> Could you be more specific about “XML special chars”?
>
> Richard Green
>
> From: Rich d'Rich [mailto:[email protected]]
> Sent: 18 July 2012 6:21 AM
> To: [email protected]
> Subject: [fcrepo-user] SOLR indexing fails when XML chars in EXIF fields extracted in gsearch
>
> We have a large repository with image files that include EXIF data.
>
> Some of these have fields (e.g. 'Quality & File Format') that contain XML special chars in the field name.
>
> When the getDatastreamFromTika function in the gsearch template extracts these fields, the resulting document has a badly formed entity tag &_File_Format that causes SOLR to fail to index the document.
>
> Is this a known issue? Any workarounds?

_______________________________________________
Fedora-commons-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
