exception with xml file processing

2010-12-26 Thread xu cheng
Hi all,
 I use Solr to index my documents, and I put my text in a CDATA
section. However, Solr always throws an exception complaining about
the XML file processing.
 It seems that I can still index the documents successfully (actually, I'm
not sure about that, because there are too many documents to check).


The exception stack trace looks like this, and all the exceptions are the same:

 Error processing "legacy" update command:com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ''' (code 39) in prolog; expected '<'
 at [row,col {unknown-source}]: [1,1]
    at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:648)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2047)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
    at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:130)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:79)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:286)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:619)

Any suggestions or references are appreciated! Thanks
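
A note on reading the trace: code 39 is a single quote, and [1,1] is the very first character of the request body, so the parser saw ' where it expected <. The thread does not show how the documents were posted, so the cause is an assumption here; one possibility is shell quoting that leaves a literal quote at the start of the body. A minimal Python sketch of building the update message with a CDATA section and checking it locally before posting follows. The field names "id" and "text" and the helper names are hypothetical, not taken from the poster's schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def cdata(text: str) -> str:
    """Wrap text in a CDATA section, splitting any ']]>' so it cannot close the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def build_add_xml(doc_id: str, body: str) -> str:
    """Build a Solr <add> message. Field names are placeholders for the real schema."""
    return (
        "<add><doc>"
        f'<field name="id">{escape(doc_id)}</field>'
        f'<field name="text">{cdata(body)}</field>'
        "</doc></add>"
    )


def check_payload(payload: str) -> ET.Element:
    """Fail early if the body does not start with '<' or is not well-formed XML."""
    if not payload.startswith("<"):
        raise ValueError(f"body starts with {payload[:1]!r}, expected '<'")
    return ET.fromstring(payload)  # raises ParseError on malformed XML
```

Running check_payload on the exact string that is about to be sent should catch a stray leading quote before Solr does.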


Re: about the string type field, will it be analyzed?

2010-11-24 Thread xu cheng
By the way,
I only defined an analyzer for the fieldType text, not for string.

2010/11/25 xu cheng 

> Hi all,
> I have a Solr app with *a field named filePath* whose type is
> *string*,
> and the filePath values in the documents are *unique* (they are supposed
> to be unique)
> cos filePath
> Now I want to query on this field, but it finds nothing.
> I just wonder whether a field defined as a string type will be analyzed?
> The query looks like q=filePath:blablabla, and it finds nothing (there is
> definitely a unique document whose filePath field is blablabla!).
>
> (I first declared this field as a text type, but each time I indexed the
> same document, a duplicate doc was indexed into the system. Then I
> defined it as a string type and it became unique, so I think the string
> type is not analyzed. But why does the query find nothing?)
>
> Can anyone help?
> Thanks
>


about the string type field, will it be analyzed?

2010-11-24 Thread xu cheng
Hi all,
I have a Solr app with *a field named filePath* whose type is
*string*,
and the filePath values in the documents are *unique* (they are supposed
to be unique)
cos filePath
Now I want to query on this field, but it finds nothing.
I just wonder whether a field defined as a string type will be analyzed?
The query looks like q=filePath:blablabla, and it finds nothing (there is
definitely a unique document whose filePath field is blablabla!).

(I first declared this field as a text type, but each time I indexed the
same document, a duplicate doc was indexed into the system. Then I
defined it as a string type and it became unique, so I think the string
type is not analyzed. But why does the query find nothing?)

Can anyone help?
Thanks
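
On the question itself: a field of Solr's string type is not analyzed; the whole value is indexed as a single token, so a query has to match it exactly, including case and punctuation. The thread only shows the placeholder blablabla, so the following is an assumption about the cause: if the real filePath values contain spaces, colons or backslashes, the query parser may split or misread them unless the value is quoted and escaped. A minimal Python sketch of building such an exact-match query (the helper names are hypothetical):

```python
from urllib.parse import urlencode


def exact_match_query(field: str, value: str) -> str:
    """Build field:"value" with backslashes and double quotes escaped for the query parser."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def select_params(field: str, value: str) -> str:
    """URL-encode the query so special characters survive the HTTP request."""
    return urlencode({"q": exact_match_query(field, value)})
```

For example, select_params("filePath", "/data/my file.txt") yields a q parameter that can be appended to the select URL of the Solr instance.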


Re: sort desc and out of memory exception

2010-11-21 Thread xu cheng
Thanks for replying.

But when I sort asc, it runs pretty well;
only when I sort desc does it hit the out-of-memory exception.

2010/11/17 Peter Karich 

>  Are you applying the sort against a (tokenized) text field?
> You should rather sort against a number or a string field, probably using
> the copyField directive.
>
> Regards,
> Peter.
>
>
>  Hi all,
>> I configured a Solr application with a field of type text whose values
>> look like 123456, that is, a string of digits,
>> and I want Solr to sort the results on this field.
>> However, when I sort asc it works perfectly, but when I sort
>> desc the application becomes unacceptably slow
>> and finally an OutOfMemoryError is thrown.
>> Has anyone had the same kind of problem, or any suggestions?
>>
>> thanks
>>
>>
>
> --
> http://jetwick.com twitter search prototype
>
>


sort desc and out of memory exception

2010-11-16 Thread xu cheng
Hi all,
 I configured a Solr application with a field of type text whose values
look like 123456, that is, a string of digits,
and I want Solr to sort the results on this field.
However, when I sort asc it works perfectly, but when I sort
desc the application becomes unacceptably slow
and finally an OutOfMemoryError is thrown.
Has anyone had the same kind of problem, or any suggestions?

thanks
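
Peter's reply suggests sorting against a number or a string rather than a tokenized text field. Which of the two is chosen changes the order for digit strings like 123456, so a small Python illustration may help. It only shows the ordering semantics and does not model Solr's internals or the memory problem; the sample values and function names are made up.

```python
def sort_desc_as_strings(values):
    """Descending lexicographic order, as a plain string field would sort."""
    return sorted(values, reverse=True)


def sort_desc_as_numbers(values):
    """Descending numeric order, as a numeric field would sort."""
    return sorted(values, key=int, reverse=True)


def zero_pad(values, width):
    """Left-pad with zeros so that string order agrees with numeric order."""
    return [v.zfill(width) for v in values]


values = ["123456", "99", "1000", "42"]
print(sort_desc_as_strings(values))               # ['99', '42', '123456', '1000']
print(sort_desc_as_numbers(values))               # ['123456', '1000', '99', '42']
print(sort_desc_as_strings(zero_pad(values, 6)))  # ['123456', '001000', '000099', '000042']
```

If the values stay in a string field, zero-padding them at index time is one way to keep the sort order numeric.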


Re: encoding messy code

2010-11-16 Thread xu cheng
Hi,
The problem was in the web server that interacts with the Solr server, and
after some transformation it works now.
Thanks

2010/11/16 Peter Karich 

>  On 16.11.2010 07:25, xu cheng wrote:
>
>  Hi all,
>> I configured an app with Solr to index documents,
>> and some of the documents contain Chinese content.
>> I've set Tomcat's URIEncoding to UTF-8,
>> and I use curl to send the documents in XML format.
>> However, when I query the documents, all the Chinese content comes back
>> garbled ("messy code"). It has cost me a lot of time.
>>
>>
>
> Solr handles only UTF-8. Is the XML properly encoded in UTF-8?
> If you are on Linux you can easily convert it via iconv,
> or detect the encoding (based on some heuristics) using enca or similar.
>
> Regards,
> Peter.
>
> --
> http://jetwick.com twitter search prototype
>
>


encoding messy code

2010-11-15 Thread xu cheng
Hi all,
I configured an app with Solr to index documents,
and some of the documents contain Chinese content.
I've set Tomcat's URIEncoding to UTF-8,
and I use curl to send the documents in XML format.
However, when I query the documents, all the Chinese content comes back
garbled ("messy code"). It has cost me a lot of time.
Does anyone have any idea about it?
Thanks
By the way, the Tomcat version is 6.0.20 and Solr is 1.4.
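
Following Peter's question about whether the XML is really UTF-8, here is a minimal Python sketch of the check and of an iconv-style conversion. The source encoding is an assumption (GB18030 is used only as an example of a common Chinese encoding; the thread does not say what the files were in), and the function names are hypothetical.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8, like iconv -f SRC -t UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def ensure_utf8(data: bytes, fallback_encoding: str = "gb18030") -> bytes:
    """Pass UTF-8 through unchanged; otherwise convert from the assumed fallback encoding."""
    return data if is_utf8(data) else to_utf8(data, fallback_encoding)
```

The bytes returned by ensure_utf8 are what should be sent as the request body; the client also needs to declare the charset it is sending.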


Re: How can I delete the entire contents of the index?

2010-09-22 Thread xu cheng
Send a delete-by-query command containing the query that fetches the data
you want to delete.
That is how I deleted my data.
Best regards

2010/9/23 Igor Chudov 

> Let's say that I added a number of elements to Solr (I use
> Webservice::Solr as the interface to do so).
>
> Then I change my mind and want to delete them all.
>
> How can I delete all contents of the database, but leave the database
> itself, just empty?
>
> Thanks
>
> i
>
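
The delete-by-query approach described in the reply can be sketched in Python as follows. Solr's XML update syntax accepts <delete><query>...</query></delete>, and the query *:* matches every document, which empties the index while leaving it in place. The update URL below is an assumption (a default local install), and the helper names are hypothetical.

```python
import urllib.request
from xml.sax.saxutils import escape

# Assumed location of the update handler; adjust to the actual installation.
UPDATE_URL = "http://localhost:8983/solr/update"


def build_delete_xml(query: str = "*:*") -> str:
    """Build a delete-by-query message; the default query matches all documents."""
    return f"<delete><query>{escape(query)}</query></delete>"


def post_update(xml: str, url: str = UPDATE_URL) -> bytes:
    """POST an XML update message to Solr and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def delete_all(url: str = UPDATE_URL) -> None:
    """Delete every document, then commit so the change becomes visible."""
    post_update(build_delete_xml("*:*"), url)
    post_update("<commit/>", url)
```

Passing a narrower query to build_delete_xml removes only the matching documents instead of everything.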