Any thoughts on this? I would like to get the id back in the response after indexing. My initial thought was to run a search after indexing to get the doc id based on attr_stream_name, but rereading my message, I mentioned that attr_stream_name (the file name) may differ between files with the same content, so that lookup is unreliable. It looks like my only option is to somehow return the id in the XML response. Any guidance is greatly appreciated.
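For reference, the two-request fallback mentioned above would look roughly like the sketch below. It is only an illustration: the /solr/select URL, the helper names, and the sample response are assumptions rather than anything taken from my setup, and it inherits the problem already described (attr_stream_name is not guaranteed to identify the document).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_lookup_url(select_url, stream_name):
    """Build a select URL that asks only for the id of a given stream name.

    select_url is assumed to be the core's standard search handler,
    e.g. http://localhost:8080/solr/select (hypothetical for this setup).
    """
    # Escape backslashes and quotes, then wrap the name in a quoted phrase
    # so dots and spaces in the file name are matched literally.
    escaped = stream_name.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": 'attr_stream_name:"%s"' % escaped, "fl": "id", "rows": "1"}
    return select_url + "?" + urlencode(params)


def extract_ids(response_xml):
    """Pull the id values out of a Solr XML search response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall("./result/doc/str[@name='id']")]


# Made-up response in the shape of Solr's XML writer output.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc><str name="id">0a1b2c</str></doc>
  </result>
</response>"""
```

The second request would fetch build_lookup_url(...) and pass the body to extract_ids(...), which is exactly the extra round trip I would like to avoid.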
-Bill

On Wed, Feb 24, 2010 at 12:06 PM, Bill Engle <billengle...@gmail.com> wrote:

> Hi -
>
> New Solr user here. I am using Solr Cell to index files (PDF, doc, docx,
> txt, htm, etc.) and there is a good chance that a new file will have
> duplicate content but not necessarily the same file name. To avoid this I
> am using the deduplication feature of Solr.
>
> <updateRequestProcessorChain name="dedupe">
>   <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
>     <bool name="enabled">true</bool>
>     <str name="signatureField">id</str>
>     <bool name="overwriteDupes">true</bool>
>     <str name="fields">attr_content</str>
>     <str name="signatureClass">org.apache.solr.update.processor.</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
>
> How do I get the "id" value after Solr has processed the document? Is there
> some way to modify the curl response so that the id is returned? I need this
> id because I would like to rename the file to the id value. I could probably
> do a Solr search after the fact to get the id field based on the
> attr_stream_name, but I would like to do only one request.
>
> curl 'http://localhost:8080/solr/update/extract?uprefix=attr_&fmap.content=attr_content&commit=true' -F "myfi...@myfile.pdf"
>
> Thanks,
> Bill