Thanks for the responses. This is exactly what I had to resort to. I will definitely put in a feature request to get the generated ID back from the extract request.
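A minimal sketch of that workaround, in Python for illustration only (the setup described below uses PHP cURL and the PECL Solr extension). It tags each extract request with a one-off token in a secondary field, then queries on that token to read back the uniqueKey and the signature. The base URL and the field names `upload_token_s`, `signature` and `id` are assumptions, not values from this thread; substitute whatever your schema and signatureField configuration use. Documents are only visible to queries after a commit, so the sketch passes `commit=true` on the extract request.

```python
from urllib.parse import urlencode
import uuid

# All of these are assumptions for illustration; adjust to your own setup.
SOLR_BASE = "http://localhost:8983/solr"
TOKEN_FIELD = "upload_token_s"   # hypothetical secondary field holding the one-off token
SIGNATURE_FIELD = "signature"    # hypothetical field written by the signature processor
UNIQUE_KEY_FIELD = "id"          # hypothetical uniqueKey field name


def new_token() -> str:
    """Return a one-off value to tag a single extract request with."""
    return uuid.uuid4().hex


def extract_url(token: str) -> str:
    """URL to POST the file to; literal.<field>=<value> sets the secondary field."""
    params = {f"literal.{TOKEN_FIELD}": token, "commit": "true"}
    return f"{SOLR_BASE}/update/extract?{urlencode(params)}"


def lookup_url(token: str) -> str:
    """URL that fetches the uniqueKey and signature of the doc tagged with token."""
    params = {
        "q": f"{TOKEN_FIELD}:{token}",
        "fl": f"{UNIQUE_KEY_FIELD},{SIGNATURE_FIELD}",
        "wt": "json",
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


def id_and_signature(response: dict):
    """Pull (uniqueKey, signature) out of a parsed JSON select response.

    Returns None unless exactly one document matched. The signature may be
    None if the field is missing from the returned document.
    """
    docs = response.get("response", {}).get("docs", [])
    if len(docs) != 1:
        return None
    doc = docs[0]
    return doc.get(UNIQUE_KEY_FIELD), doc.get(SIGNATURE_FIELD)
```

The caller would POST the file to `extract_url(token)`, GET `lookup_url(token)`, and pass the decoded JSON to `id_and_signature`. Keeping a missing signature as `None` rather than an empty string lets the caller tell "no hash returned" apart from "hash is unique".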
I am doing this with PHP cURL for extraction and the PECL Solr extension for querying. I then save the unique id and dupe hash in a MySQL table, which I check against after the doc is indexed in Solr. If it is a dupe, I delete the Solr record and discard the file.

My problem now is that the dupe hash sometimes comes back NULL from Solr, although when I check it through Solr Admin it is there. I am working through this now to isolate it.

I had to set Solr to ALLOW duplicates because I have to somehow know that the file is a dupe and then remove the duplicate files on my filesystem. Based on the extract response, I have no way of knowing this if duplicates are disallowed.

-Bill

On Tue, Mar 2, 2010 at 2:11 AM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

> : To quote from the wiki,
> ...
> That's all true ... but Bill explicitly said he wanted to use
> SignatureUpdateProcessorFactory to generate a uniqueKey from the content
> field post-extraction so he could dedup documents with the same content
> ... his question was how to get that key after adding a doc.
>
> Using a unique literal.field value will work -- but only as the value of
> a secondary field that he can then query on to get the uniqueKeyField
> value.
>
> : > : You could create your own unique ID and pass it in with the
> : > : literal.field=value feature.
> : >
> : > By which Lance means you could specify a unique value in a different
> : > field from your uniqueKey field, and then query on that field:value pair
> : > to get the doc after it's been added -- but that query will only work
> : > until some other version of the doc (with some other value) overwrites it.
> : > so you'd essentially have to query for the field:value to lookup the
> : > uniqueKey.
> : >
> : > it seems like it should definitely be feasible for the
> : > Update RequestHandlers to return the uniqueKeyField values for all the
> : > added docs (regardless of whether the key was included in the request, or
> : > added by an UpdateProcessor) -- but i'm not sure how that would fit in with
> : > the SolrJ API.
> : >
> : > would you mind opening a feature request in Jira?
> : >
> : > -Hoss
> :
> : --
> : Lance Norskog
> : goks...@gmail.com
>
> -Hoss