Any thoughts on this?  I would like to get the id back in the response
after indexing.  My initial thought was to run a search after indexing to
look up the document id by attr_stream_name, but rereading my message, I
mentioned that the attr_stream_name (file name) may differ for the same
content, so that lookup is unreliable.  My only option seems to be to
return the id in the XML response somehow.  Any guidance is greatly
appreciated.

-Bill

On Wed, Feb 24, 2010 at 12:06 PM, Bill Engle <billengle...@gmail.com> wrote:

> Hi -
>
> New Solr user here.  I am using Solr Cell to index files (PDF, doc, docx,
> txt, htm, etc.) and there is a good chance that a new file will have
> duplicate content but not necessarily the same file name.  To avoid
> indexing duplicates I am using the deduplication feature of Solr.
>
>   <updateRequestProcessorChain name="dedupe">
>     <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
>       <bool name="enabled">true</bool>
>       <str name="signatureField">id</str>
>       <bool name="overwriteDupes">true</bool>
>       <str name="fields">attr_content</str>
>       <str name="signatureClass">org.apache.solr.update.processor.</str>
>     </processor>
>     <processor class="solr.LogUpdateProcessorFactory" />
>     <processor class="solr.RunUpdateProcessorFactory" />
>   </updateRequestProcessorChain>
>
> How do I get the "id" value after Solr has processed the document?  Is
> there some way to modify the response to the curl request so that the id
> is returned?  I need this id because I would like to rename the file to
> the id value.  I could probably do a Solr search after the fact to get the
> id field based on the attr_stream_name, but I would like to do only one
> request.
>
> curl 'http://localhost:8080/solr/update/extract?uprefix=attr_&fmap.content=attr_content&commit=true' -F "myfi...@myfile.pdf"
>
> Thanks,
> Bill
>
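
For reference, the two-request fallback discussed in this thread (index
first, then search for the id by attr_stream_name) might look roughly like
the sketch below.  It is not a single-request solution, and it inherits the
weakness already noted: the file name may not identify the content.  The
helper names build_lookup_url and extract_id are hypothetical, and the code
assumes the standard /select handler is available, that attr_stream_name is
indexed and searchable, and that id is a stored field.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Host and port are taken from the curl command in this thread.
SOLR_BASE = "http://localhost:8080/solr"


def build_lookup_url(stream_name):
    """Build a /select URL that requests only the id of the matching document."""
    # Escape backslashes and quotes so the file name is sent as one quoted phrase.
    escaped = stream_name.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": 'attr_stream_name:"%s"' % escaped,
        "fl": "id",   # return only the id field
        "rows": 1,    # at most one document is needed
    }
    return SOLR_BASE + "/select?" + urlencode(params)


def extract_id(response_xml):
    """Return the first id in a Solr XML response, or None if nothing matched."""
    root = ET.fromstring(response_xml)
    node = root.find("./result/doc/str[@name='id']")
    return node.text if node is not None else None
```

The URL from build_lookup_url would be fetched after the extract request
has committed, and the response body passed to extract_id.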
