Re: Duplicate documents being added even with unique key
Changing my field type to string for my uniquekey field solved the problem. Thanks to Jack and Erik for the fix!

On May 18, 2012, at 5:33 PM, Jack Krupansky wrote:

Typically the uniqueKey field is a string field type (your schema uses text_general), although I don't think it is supposed to be a requirement. Still, it is one thing that stands out.

Actually, you may be running into some variation of SOLR-1401:

https://issues.apache.org/jira/browse/SOLR-1401

In other words, stick with string and stay away from a tokenized (text) key.

You could also get duplicates by merging cores or if your add has allowDups = true or overwrite=false.

-- Jack Krupansky

-----Original Message-----
From: Parmeley, Michael
Sent: Friday, May 18, 2012 5:50 PM
To: solr-user@lucene.apache.org
Subject: Duplicate documents being added even with unique key

I have a uniquekey set in my schema; however, I am still getting duplicated documents added. Can anyone provide any insight into why this may be happening?

This is in my schema.xml:

<!-- Field to use to determine and enforce document uniqueness.
     Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>uniquekey</uniqueKey>

<field name="uniquekey" type="text_general" indexed="true" stored="true" required="true" />

On startup I get this message in catalina.out:

INFO: unique key field: uniquekey

However, you can see I get multiple documents:

<result name="response" numFound="7" start="0">
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
</result>
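
For readers landing on this thread, the fix reported above (switching the uniqueKey field from text_general to string) might look roughly like the following in schema.xml. This is a sketch, not the poster's actual file: the fieldType line is assumed to match the one in the stock Solr example schema, and the attribute values on the field are carried over from the original post.

```xml
<!-- Assumed to be present already, as in the stock example schema:
     a non-tokenized type backed by solr.StrField -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />

<!-- The uniqueKey field, now using the non-tokenized string type
     instead of text_general -->
<field name="uniquekey" type="string" indexed="true" stored="true" required="true" />

<uniqueKey>uniquekey</uniqueKey>
```

Documents indexed while the field was still text_general will most likely need to be deleted and reindexed after the change, since a field type change does not rewrite what is already in the index.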
Duplicate documents being added even with unique key
I have a uniquekey set in my schema; however, I am still getting duplicated documents added. Can anyone provide any insight into why this may be happening?

This is in my schema.xml:

<!-- Field to use to determine and enforce document uniqueness.
     Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>uniquekey</uniqueKey>

<field name="uniquekey" type="text_general" indexed="true" stored="true" required="true" />

On startup I get this message in catalina.out:

INFO: unique key field: uniquekey

However, you can see I get multiple documents:

<result name="response" numFound="7" start="0">
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
  <doc>
    <str name="abbreviation">PSR3</str>
    <int name="clientid">1</int>
    <str name="entitytype">Skill</str>
    <int name="id">510</int>
    <str name="name">Body and Soul</str>
    <int name="projectid">1</int>
    <int name="skillnumber">281</int>
    <str name="uniquekey">Skill510</str>
  </doc>
</result>
DataImportHandler only importing 3 fields on all entities
I have a data-config.xml declaring some entities and no matter what fields I declare in the entities the only ones it will index are id, name, and description. So fields like firstname, email, url don't appear in the index. They also don't appear in the schema browser. Am I doing something wrong? What is so special about id, name, and description that they always appear?

<document>
  <entity name="Project" processor="SqlEntityProcessor" query="select * from project">
    <field column="projectid" name="id" />
    <field column="name" name="name" />
    <field column="description" name="description" />
  </entity>
  <entity name="Person" processor="SqlEntityProcessor" query="select * from person">
    <field column="firstname" name="firstName" />
    <field column="lastname" name="lastName" />
    <field column="email" name="email" />
    <field column="phonenumber" name="phoneNumber" />
    <field column="login" name="login" />
    <field column="displayname" name="displayName" />
    <field column="updateduser" name="updatedUser" />
  </entity>
  <entity name="Script" processor="SqlEntityProcessor" query="select * from script">
    <field column="scriptid" name="id" />
    <field column="name" name="name" />
    <field column="description" name="description" />
    <field column="url" name="url" />
  </entity>
</document>
Re: DataImportHandler only importing 3 fields on all entities
I discovered the schema.xml file about 2 minutes before I got your response. It was very enlightening :-) Thanks for the tips about dynamicFields!

On May 3, 2012, at 1:02 PM, Jack Krupansky wrote:

Those three field names are already in the Solr example schema. Either manually add your desired fields to the schema, change their names (column vs. sourceColName) to fields that do exist in your Solr schema, give them names that end with one of the dynamicField suffixes (such as *_s), or enable the * dynamicField rule in the Solr schema.

-- Jack Krupansky

-----Original Message-----
From: Parmeley, Michael
Sent: Thursday, May 03, 2012 1:39 PM
To: solr-user@lucene.apache.org
Subject: DataImportHandler only importing 3 fields on all entities

I have a data-config.xml declaring some entities and no matter what fields I declare in the entities the only ones it will index are id, name, and description. So fields like firstname, email, url don't appear in the index. They also don't appear in the schema browser. Am I doing something wrong? What is so special about id, name, and description that they always appear?

<document>
  <entity name="Project" processor="SqlEntityProcessor" query="select * from project">
    <field column="projectid" name="id" />
    <field column="name" name="name" />
    <field column="description" name="description" />
  </entity>
  <entity name="Person" processor="SqlEntityProcessor" query="select * from person">
    <field column="firstname" name="firstName" />
    <field column="lastname" name="lastName" />
    <field column="email" name="email" />
    <field column="phonenumber" name="phoneNumber" />
    <field column="login" name="login" />
    <field column="displayname" name="displayName" />
    <field column="updateduser" name="updatedUser" />
  </entity>
  <entity name="Script" processor="SqlEntityProcessor" query="select * from script">
    <field column="scriptid" name="id" />
    <field column="name" name="name" />
    <field column="description" name="description" />
    <field column="url" name="url" />
  </entity>
</document>
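
As an illustration of two of the options Jack lists, the fragments below sketch what the changes could look like. They are assumptions rather than the poster's actual configuration: the string type is only a placeholder (pick types that suit the data), and the *_s rule is assumed to be defined as in the stock example schema.

```xml
<!-- schema.xml, option 1: declare each missing field explicitly.
     The "string" type here is a placeholder assumption. -->
<field name="firstName" type="string" indexed="true" stored="true" />
<field name="lastName"  type="string" indexed="true" stored="true" />
<field name="email"     type="string" indexed="true" stored="true" />
<field name="url"       type="string" indexed="true" stored="true" />

<!-- schema.xml, option 2: rely on a dynamicField rule such as *_s
     (assumed to exist already, as in the example schema) -->
<dynamicField name="*_s" type="string" indexed="true" stored="true" />

<!-- data-config.xml, with option 2: map each column to a name
     that ends with the dynamicField suffix -->
<field column="firstname" name="firstName_s" />
<field column="email"     name="email_s" />
```

Either way, the names in the name attribute of data-config.xml have to resolve to a field or dynamicField rule in schema.xml, which is why only id, name, and description were being indexed.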