Re: How to avoid the unexpected character error?
I am sorry, but I don't understand what you mean. I tried the HTMLStripCharFilter and the PatternReplaceCharFilter, but neither works. Could you give me an example? Thanks!

    <fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>

I also tried:

    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([^a-z])" replacement="" maxBlockChars="1" blockDelimiters="|"/>

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-avoid-the-unexpected-character-error-tp3824726p3831064.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to avoid the unexpected character error?
That's not the right place to fix it. When you run

    java -Durl=http://... -jar post.jar data.xml

the data.xml file must be a valid XML file, so you should escape the special characters in it. I don't know how you generate this file. If you generate it with a Java program (or some other script), you should use XML tools to write it. But if you build it by hand like this:

    StringBuilder buf = new StringBuilder();
    buf.append("<add>");
    buf.append("<doc>");
    buf.append("<field name=\"fname\">text content</field>");

then you have to escape the special characters yourself. If you use Java, you can make use of the org.apache.solr.common.util.XML class.

[In reply to neosky neosk...@yahoo.com, Fri, Mar 16, 2012 at 2:03 PM]
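To illustrate the "use XML tools" advice, here is a minimal sketch that writes the same kind of add message with the JDK's own StAX writer (javax.xml.stream), which escapes text and attribute values as it writes them. The class name SolrAddXmlWriter, the method buildAddXml and the field name "fname" are only illustrative assumptions, not anything the original poster used.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: builds a one-document <add> message for Solr.
public class SolrAddXmlWriter {

    public static String buildAddXml(String fieldName, String value) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartElement("add");
        xml.writeStartElement("doc");
        xml.writeStartElement("field");
        xml.writeAttribute("name", fieldName);
        xml.writeCharacters(value); // the writer escapes '<' and '&' in the text here
        xml.writeEndElement(); // </field>
        xml.writeEndElement(); // </doc>
        xml.writeEndElement(); // </add>
        xml.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // The raw value contains characters that would break hand-built XML.
        System.out.println(buildAddXml("fname", "a < b & c"));
    }
}
```

The output of such a program can be saved as data.xml and posted with post.jar as before; the point is only that the escaping happens in the writer rather than in hand-written string concatenation.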
How to avoid the unexpected character error?
I use XML to index the data. One field might contain some characters like '<', '>' or '='. It seems that these produce the error. I changed that field so that it is not indexed, but that doesn't work. I need to store the field, but it does not have to be indexed. Thanks!
Re: How to avoid the unexpected character error?
There is a class org.apache.solr.common.util.XML in Solr. You can use this wrapper:

    import java.io.IOException;
    import java.io.StringWriter;
    import org.apache.solr.common.util.XML;

    // Returns s with the XML special characters in character data escaped.
    public static String escapeXml(String s) throws IOException {
        StringWriter sw = new StringWriter();
        XML.escapeCharData(s, sw);
        return sw.toString();
    }

[In reply to neosky neosk...@yahoo.com, Wed, Mar 14, 2012 at 4:34 PM]
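If the Solr jar is not on the classpath of the program that writes the file, the same idea can be sketched by hand. The following is an assumption-laden stand-in, not Solr's implementation: the class XmlEscapeSketch and its escape method are hypothetical names, and the sketch only replaces the five predefined XML entities. It does not deal with control characters that are illegal in XML.

```java
// Hypothetical, dependency-free stand-in for an XML escaping helper.
public class XmlEscapeSketch {

    // Replaces the characters that are unsafe in element text or attribute values.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escape the field value before it is appended between <field> tags.
        System.out.println("<field name=\"fname\">" + escape("a < b & c") + "</field>");
    }
}
```

Only the field value is passed through escape; the surrounding tags are written as-is, otherwise the markup itself would be escaped too.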
Re: How to avoid the unexpected character error?
Thanks! Does the schema.xml support this parameter? I am using the example post.jar to index my file.
Re: How to avoid the unexpected character error?
No, it has nothing to do with schema.xml. post.jar just posts the file; it doesn't parse it. Solr uses an XML parser to parse the file. If you don't escape the special characters, it's not a valid XML file and Solr will throw exceptions.

[In reply to neosky neosk...@yahoo.com, Thu, Mar 15, 2012 at 12:33 AM]
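One way to see this before posting is to run the file content through any XML parser. The sketch below uses the JDK's DOM parser, which is an assumption for illustration and not necessarily the parser Solr uses internally; the class name WellFormedCheck and the sample field are hypothetical as well.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-flight check: is this string well-formed XML?
public class WellFormedCheck {

    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the input
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "<add><doc><field name=\"fname\">a < b & c</field></doc></add>";
        String escaped = "<add><doc><field name=\"fname\">a &lt; b &amp; c</field></doc></add>";
        System.out.println("raw:     " + isWellFormed(raw));
        System.out.println("escaped: " + isWellFormed(escaped));
    }
}
```

The unescaped version fails in the parser no matter how the field is declared in schema.xml, because the parser never gets as far as looking at the schema.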