Re: How to avoid the unexpected character error?

2012-03-16 Thread neosky
I'm sorry, but I don't understand what you mean.
I tried HTMLStripCharFilter and PatternReplaceCharFilter, but neither works.
Could you give me an example? Thanks!

<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

I also tried:

<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="([^a-z])"
            replacement="" maxBlockChars="1" blockDelimiters="|"/>

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-avoid-the-unexpected-character-error-tp3824726p3831064.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to avoid the unexpected character error?

2012-03-16 Thread Li Li
That's not the right place to fix this.
When you run java -Durl=http://... -jar post.jar data.xml,
data.xml must be a valid XML file, so you should escape the special
characters in it.
I don't know how you generate this file.
If you use a Java program (or some other script) to generate it, you should
use XML tools to write it.
But if you build it by hand like this:

    StringBuilder buf = new StringBuilder();
    buf.append("<add>");
    buf.append("<doc>");
    buf.append("<field name=\"fname\">text content</field>");

then you have to escape the special characters yourself.
In Java, you can use the org.apache.solr.common.util.XML class for that.
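
To illustrate the "use XML tools" suggestion, here is a minimal sketch that writes the same add/doc/field structure with the JDK's XMLStreamWriter, which escapes character data for you. The class name AddXmlBuilder and the method buildAddXml are made up for this example; they are not part of Solr or post.jar.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper (not part of Solr): builds a one-document <add> message.
final class AddXmlBuilder {

    static String buildAddXml(String fieldName, String value) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("add");
            w.writeStartElement("doc");
            w.writeStartElement("field");
            w.writeAttribute("name", fieldName);
            w.writeCharacters(value); // the writer escapes '<' and '&' here
            w.writeEndElement();      // </field>
            w.writeEndElement();      // </doc>
            w.writeEndElement();      // </add>
            w.close();
        } catch (XMLStreamException e) {
            // Writing to an in-memory StringWriter is not expected to fail.
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints an <add> message whose field text is escaped.
        System.out.println(buildAddXml("fname", "a < b && c > d"));
    }
}
```

The output of buildAddXml could be saved as data.xml and posted with post.jar as before; the field value itself needs no manual escaping.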




How to avoid the unexpected character error?

2012-03-14 Thread neosky
I index my data from an XML file. One field might contain some characters
like '' =
It seems that these produce the error.
I changed that field so that it is not indexed, but that doesn't help. I need
to store the field, but it doesn't have to be indexed.
Thanks!



Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
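There is a class, org.apache.solr.common.util.XML, in Solr.
You can use this wrapper:

    import java.io.IOException;
    import java.io.StringWriter;
    import org.apache.solr.common.util.XML;

    // Escapes XML special characters in s so it can be used as element text.
    public static String escapeXml(String s) throws IOException {
        StringWriter sw = new StringWriter();
        XML.escapeCharData(s, sw);
        return sw.toString();
    }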
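
If the Solr jar is not on the classpath of the program that generates the file, something along these lines might do as a stand-in. This is only a sketch: XmlEscape, escapeXml and fieldXml are names invented for this example, it is not Solr's implementation, and it assumes the text contains no control characters that are illegal in XML.

```java
// Hypothetical stand-in for the wrapper above; plain JDK, no Solr dependency.
final class XmlEscape {

    // Replaces the characters that are special in XML element text.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;");  break;
                case '>': sb.append("&gt;");  break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds one <field> element. The name is assumed to be a plain
    // identifier, so only the value is escaped.
    static String fieldXml(String name, String value) {
        return "<field name=\"" + name + "\">" + escapeXml(value) + "</field>";
    }

    public static void main(String[] args) {
        System.out.println(fieldXml("fname", "a<b & c>d"));
    }
}
```

The ampersand has to be handled in the same pass as the other characters, as it is here; running a separate replace for '&' after the others would escape the entities a second time.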




Re: How to avoid the unexpected character error?

2012-03-14 Thread neosky
Thanks!
Does schema.xml support this parameter? I am using the example post.jar
to index my file.



Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
No, it has nothing to do with schema.xml.
post.jar just posts the file; it doesn't parse it.
Solr uses an XML parser to parse the file. If you don't escape the special
characters, the file is not valid XML and Solr will throw an exception.
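
You can check a file before posting it. The sketch below uses the JDK's own DOM parser, not the parser inside Solr, so the error messages will differ, but any conforming XML parser rejects an unescaped '<' or '&' in field text. XmlCheck and isWellFormed are names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical checker (not part of Solr): tests whether text is well-formed XML.
final class XmlCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the input
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<add><doc><field name=\"fname\">a < b & c</field></doc></add>";
        String escaped = "<add><doc><field name=\"fname\">a &lt; b &amp; c</field></doc></add>";
        System.out.println(isWellFormed(raw));     // false
        System.out.println(isWellFormed(escaped)); // true
    }
}
```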
