Hi Everyone,

I set up a Solr server and began indexing my data. I have two questions I am 
hoping someone can help me with. Many of my files index without any problems; 
for others, I get a variety of errors. I am indexing primarily web-based 
content and have defined my text field type as follows:
 
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
        <charfilter class="solr.HTMLStripCharFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
</fieldtype>


q1) Errors while indexing.

* SimplePostTool: WARNING: Unexpected response from Solr: '<result 
status="0"></result>' does not contain '<int name="status">0</int>'

* SEVERE: Error processing "legacy" update 
command:com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' 
(code 32) in content after '<' (malformed start element?). at [row,col 
{unknown-source}]: [1591,90] at 
com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:648)

* Although I can't find the actual message, I recall Solr reporting an error 
when it came across the string "&What". The error was something like: expecting 
a semicolon after "What".
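
The second and third errors above look like what an XML parser reports when 
field text contains a bare "<" or "&". The following is only a minimal sketch 
of that assumption, written in Python rather than against Solr itself. The 
helper names (make_add_xml, is_well_formed) and the sample strings are 
hypothetical and are not taken from the actual files.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_add_xml(text, escape_text):
    """Wrap text in a Solr-style <add><doc> message (hypothetical helper)."""
    body = escape(text) if escape_text else text
    return '<add><doc><field name="text">' + body + '</field></doc></add>'


def is_well_formed(xml_string):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


# Made-up samples: a "<" followed by a space, and an "&" with no semicolon.
samples = ["price < 10 dollars", "Tom &What next"]

for sample in samples:
    raw_ok = is_well_formed(make_add_xml(sample, escape_text=False))
    escaped_ok = is_well_formed(make_add_xml(sample, escape_text=True))
    print(sample, "| raw parses:", raw_ok, "| escaped parses:", escaped_ok)
```

If the real documents behave like these samples, escaping "<" and "&" in the 
field text (or wrapping it in CDATA) before posting should avoid both parser 
errors.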


q2) If my file has 1000 documents and I submit it with post.jar, and Solr hits 
any of the above errors, will that abort processing of the whole file, 
or only the document with the error?


Thanks in advance. 
Your help is very much appreciated.

Charlie

  
