Re: ExternalRequestHandler and ContentStreamUpdateRequest usage

javaxmlsoapdev Tue, 24 Nov 2009 10:19:07 -0800

Following is the Luke response. <lst name="fields" /> is empty. Can someone
help me find out why the file content isn't being indexed?


  <?xml version="1.0" encoding="UTF-8" ?> 
 <response>
 <lst name="responseHeader">
  <int name="status">0</int> 
  <int name="QTime">0</int> 
  </lst>
 <lst name="index">
  <int name="numDocs">0</int> 
  <int name="maxDoc">0</int> 
  <int name="numTerms">0</int> 
  <long name="version">1259085661332</long> 
  <bool name="optimized">false</bool> 
  <bool name="current">true</bool> 
  <bool name="hasDeletions">false</bool> 
  <str name="directory">org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/home/tomcat-solr/bin/docs/data/index</str> 
  <date name="lastModified">2009-11-24T18:01:01Z</date> 
  </lst>
  <lst name="fields" /> 
 <lst name="info">
 <lst name="key">
  <str name="I">Indexed</str> 
  <str name="T">Tokenized</str> 
  <str name="S">Stored</str> 
  <str name="M">Multivalued</str> 
  <str name="V">TermVector Stored</str> 
  <str name="o">Store Offset With TermVector</str> 
  <str name="p">Store Position With TermVector</str> 
  <str name="O">Omit Norms</str> 
  <str name="L">Lazy</str> 
  <str name="B">Binary</str> 
  <str name="C">Compressed</str> 
  <str name="f">Sort Missing First</str> 
  <str name="l">Sort Missing Last</str> 
  </lst>
  <str name="NOTE">Document Frequency (df) is not updated when a document is
marked for deletion. df values include deleted documents.</str> 
  </lst>
  </response>
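
Note that numDocs and maxDoc are both 0 in the response above, so the empty fields list may simply mean that the index Luke is reading holds no documents at all (whether that is the same index the test case writes to is a guess, not something the response confirms). A minimal sketch for checking this programmatically follows. It uses only the JDK XML parser; the class name LukeSummary and its methods are hypothetical helpers, not part of Solr or SolrJ.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not a Solr/SolrJ class): summarizes a Luke XML response.
class LukeSummary {
    final int numDocs;     // value of <int name="numDocs">, or -1 if absent
    final int fieldCount;  // number of child elements under <lst name="fields">

    LukeSummary(int numDocs, int fieldCount) {
        this.numDocs = numDocs;
        this.fieldCount = fieldCount;
    }

    // True when Luke reports zero documents in the index.
    boolean indexIsEmpty() {
        return numDocs == 0;
    }

    // Parses the XML text of a Luke response. The text must start directly
    // with the XML declaration or the root element (no leading whitespace).
    static LukeSummary parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Find <int name="numDocs">.
            int numDocs = -1;
            NodeList ints = doc.getElementsByTagName("int");
            for (int i = 0; i < ints.getLength(); i++) {
                Element e = (Element) ints.item(i);
                if ("numDocs".equals(e.getAttribute("name"))) {
                    numDocs = Integer.parseInt(e.getTextContent().trim());
                }
            }

            // Count the element children of <lst name="fields">.
            int fieldCount = 0;
            NodeList lsts = doc.getElementsByTagName("lst");
            for (int i = 0; i < lsts.getLength(); i++) {
                Element e = (Element) lsts.item(i);
                if ("fields".equals(e.getAttribute("name"))) {
                    NodeList children = e.getChildNodes();
                    for (int j = 0; j < children.getLength(); j++) {
                        if (children.item(j).getNodeType() == Node.ELEMENT_NODE) {
                            fieldCount++;
                        }
                    }
                }
            }
            return new LukeSummary(numDocs, fieldCount);
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse Luke response", ex);
        }
    }
}
```

Calling LukeSummary.parse(...) on the response text and checking indexIsEmpty() separates "nothing was committed to this index" from "documents exist but the content field is empty", which are different problems.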

javaxmlsoapdev wrote:
> 
> I was able to configure the /docs index separately from my db data index.
> 
> Still, I am seeing the same behavior, where it only puts docName and its
> size in the "content" field (I have renamed the field to "content" in this
> new schema).
> 
> Below are the only two fields I have in schema.xml:
> <field name="key" type="slong" indexed="true" stored="true"
> required="true" /> 
> <field name="content" type="text" indexed="true" stored="true"
> multiValued="true"/>   
> 
> Following is the updated code from the test case:
> 
> File fileToIndex = new File("file.txt");
> 
> ContentStreamUpdateRequest up = new
> ContentStreamUpdateRequest("/update/extract");
> up.addFile(fileToIndex);
> up.setParam("literal.key", "8978");
> up.setParam("literal.docName", "doc123.txt");
> up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
> NamedList list = server.request(up);
> assertNotNull("Couldn't upload .txt",list);
>                       
> QueryResponse rsp = server.query( new SolrQuery( "*:*") );
> assertEquals( 1, rsp.getResults().getNumFound() );
> System.out.println(rsp.getResults().get(0).getFieldValue("content"));
> 
> Also, from the Solr admin UI, only when I search for "doc123.txt" does it
> return the following response. I'm not sure why it's not indexing the
> file's content into the "content" field.
> <result name="response" numFound="1" start="0">
> <doc>
> <arr name="content">
>   <str>702</str> 
>   <str>text/plain</str> 
>   <str>doc123.txt</str> 
>   <str /> 
>   </arr>
>   <long name="key">8978</long> 
>   </doc>
>   </result>
> 
> Any idea?
> 
> Thanks,
> 
> 
> javaxmlsoapdev wrote:
>> 
>> http://machinename:port/solr/admin/luke gives me a 404 error, so it seems
>> it's not able to find Luke.
>> 
>> I am reusing a schema that is used for indexing another entity from the
>> database, which has no relevance to documents. That was my next question:
>> what do I put in a schema if my documents don't need any column mappings
>> or anything? Also, I want to keep the file documents index separate from
>> the database entity index. What's the best way to achieve this?
>> 
>> thanks,
>> 
>> 
>> 
>> Grant Ingersoll-6 wrote:
>>> 
>>> 
>>> On Nov 23, 2009, at 5:33 PM, javaxmlsoapdev wrote:
>>> 
>>>> 
>>>> *:* returns a count of 1, but when I search for a specific word (which
>>>> was part of the .txt file I indexed before) it doesn't return anything.
>>>> I don't have Luke set up on my end.
>>> 
>>> http://localhost:8983/solr/admin/luke should give you some info.
>>> 
>>> 
>>>> let me see if I can set that up quickly but otherwise do
>>>> you see anything I am missing in solrconfig mapping or something?
>>> 
>>> What's your schema look like and how are you querying?
>>> 
>>>> which maps
>>>> document "content" to wrong attribute?
>>>> 
>>>> thanks,
>>>> 
>>>> Grant Ingersoll-6 wrote:
>>>>> 
>>>>> 
>>>>> On Nov 23, 2009, at 5:04 PM, javaxmlsoapdev wrote:
>>>>> 
>>>>>> 
>>>>>> The following code is from my test case, where it tries to index a
>>>>>> file (of type .txt):
>>>>>> ContentStreamUpdateRequest up = new
>>>>>> ContentStreamUpdateRequest("/update/extract");
>>>>>> up.addFile(fileToIndex);
>>>>>> up.setParam("literal.key", "8978"); //key is the uniqueId
>>>>>> up.setParam("ext.literal.docName", "doc123.txt");
>>>>>> up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);   
>>>>>> server.request(up);              
>>>>>> 
>>>>>> The test case doesn't give me any error and "I think" it's indexing
>>>>>> the file, but when I search for text (which was part of the .txt
>>>>>> file) the search doesn't return anything.
>>>>> 
>>>>> What do your logs show?  Else, what does Luke show, or doing a *:*
>>>>> query (assuming this is the only file you added)?
>>>>> 
>>>>> Also, I don't think you need ext.literal anymore, just literal.
>>>>> 
>>>>>> 
>>>>>> Following is the config from solrconfig.xml, where I have mapped
>>>>>> content to the "description" field (the default search field) in the
>>>>>> schema.
>>>>>> 
>>>>>> <requestHandler name="/update/extract"
>>>>>> class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
>>>>>>   <lst name="defaults">
>>>>>>     <str name="map.content">description</str>
>>>>>>     <str name="defaultField">description</str>
>>>>>>   </lst>
>>>>>> </requestHandler>
>>>>>> 
>>>>>> Clearly it seems I am missing something. Any idea?
>>>>> 
>>>>> 
>>>>> 
>>>>> --------------------------
>>>>> Grant Ingersoll
>>>>> http://www.lucidimagination.com/
>>>>> 
>>>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>>>>> using
>>>>> Solr/Lucene:
>>>>> http://www.lucidimagination.com/search
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/ExternalRequestHandler-and-ContentStreamUpdateRequest-usage-tp26486817p26487320.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>> 
>>> 
>>> --------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com/
>>> 
>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>>> Solr/Lucene:
>>> http://www.lucidimagination.com/search
>>> 
>>> 
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/ExternalRequestHandler-and-ContentStreamUpdateRequest-usage-tp26486817p26499908.html
Sent from the Solr - User mailing list archive at Nabble.com.
