: I've been trying to use the UUIDField in solr to maintain ids of the
: pages I've crawled with nutch (as per
: http://wiki.apache.org/solr/UniqueKey). The use case is that I want to
: have the server able to use these ids in another database for various
: statistics gathering. So I want the link url to act like a primary key
: for determining if a page exists, and if it doesn't exist to generate a
: new uuid.

I'm confused ... if you want the URL to be the primary key, then use the
URL as the primary key. Why use the UUIDField at all?

: 2. Looking at the code for UUIDField (relevant bit pasted below), it
: seems that the UUID is just generated randomly. There is no check if the
: generated UUID has already been used.

Correct ... if you specify "NEW" then it generates a new UUID for you --
if you want to update an existing doc with an existing UUID then you need
to send the real, existing value of the UUID for the doc you are updating.

: I can sort of solve this problem by generating the UUID myself, as a
: hash of the link url, but that doesn't help me for those random cases
: when the hash might happen to generate the same UUID.
:
: Does anyone know if there is a way for solr to only add a uuid if the
: document doesn't already exist?

I don't really understand your second sentence, but based on that first
sentence it sounds like what you want may be to use something like the
SignatureUpdateProcessor to generate a hash based on the URL...

https://wiki.apache.org/solr/Deduplication

-Hoss
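
For illustration only, here is a minimal client-side sketch of the "derive the id from the URL" idea quoted above. It is not the SignatureUpdateProcessor and uses nothing Solr-specific: it assumes Python's standard uuid module, and the helper name url_to_uuid and the example URL are hypothetical. The same URL always yields the same UUID, so re-crawling a page reproduces its existing id instead of generating a new random one.

```python
import uuid


def url_to_uuid(url: str) -> uuid.UUID:
    """Hypothetical helper: derive a stable id from a page URL.

    uuid5 is a name-based UUID (SHA-1 of namespace + name), so the
    result depends only on the URL, not on when or where it is computed.
    """
    return uuid.uuid5(uuid.NAMESPACE_URL, url)


# Example URL is an assumption, not taken from the thread.
page_id = url_to_uuid("http://example.com/some/page")

# The string form is what you would send as the document's id field
# and store in the other database.
page_id_str = str(page_id)
```

Two different URLs could in principle map to the same value, since the SHA-1 digest is truncated to fit a UUID, but with 122 usable bits that is far less likely than any other failure in the pipeline. If even that is unacceptable, using the URL itself as the key avoids the question entirely.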