Noble, Thanks for continuing to assist me on trying to come up a config that works. A couple of questions/clarifications: 1) I had to introduce the "artificial" comboID and the transformer because of a conflict with a parallel entity on the "id" ("vets" and "owners"). 2) I don't think there is a conflict with the petID because prior to the introduction of "vets" I had "owners" with no "id" issues regarding "pets". 3) The conflict was introduced the moment I tried to add "vets". Unfortunately by introducing the transformer, for "owners", the "pets" relationships stopped working.
Below are 3 specifications. The first 2 work in isolation, when combined(last one) it doesn't work. * CASE 1 (Works -nested entities -no conflicts on ids -no transformer) <document name="doc-1"> <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners"> <field column="id" name="id"/> <field column="first_name" name="userName"/> <field column="last_name" name="userName"/> <entity name="pets" pk="id" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}"> <field column="id" name="id"/> <field column="name" name="name"/> <field column="birth_date" name="birthDate"/> </entity> </entity> </document> * CASE 2 (Works -parallel independent entities -introduced transformer to avoid id conflicts) <document name="doc-1"> <entity name="vets" pk="id" query="select id,first_name,last_name FROM vets" transformer="TemplateTransformer"> <field column="id" name="comboId" template="vets-${vets.id}"/> <field column="first_name" name="userName"/> <field column="last_name" name="userName"/> </entity> <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners" transformer="TemplateTransformer"> <field column="id" name="comboId" template="owners-${owners.id}"/> <field column="first_name" name="userName"/> <field column="last_name" name="userName"/> </entity> </document> * CASE 3 (Commented out "vets" to simplify case. Nested entities don't work: "Document [null] missing required field: id") <document name="doc-1"> <!--entity name="vets" pk="id" query="select id,first_name,last_name FROM vets" transformer="TemplateTransformer"> <field column="id" name="comboId" template="vets-${vets.id}"/> <field column="first_name" name="userName"/> <field column="last_name" name="userName"/> </entity--> <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners" transformer="TemplateTransformer"> <field column="id" name="comboId" template="owners-${owners.id}"/> <field column="first_name" name="userName"/> <field column="last_name" name="userName"/> <entity name="pets" pk="id" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}"> <field column="id" name="id"/> <field column="name" name="name"/> <field column="birth_date" name="birthDate"/> </entity> </entity> </document> The debug output for one row from the dataImporter while iterating over pets where the first row owner_id=1 (which gets transformed to 'owners-1' -where owner_id is a fk to owners id column) shows as follows: "SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1' I believe the issue on somehow having to "untransform" the owners-id prior to comparison with pets foreign key owner_id Thanks again ** julio -----Original Message----- From: Noble Paul ??????? ?????? [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 03, 2008 10:30 PM To: solr-user@lucene.apache.org Subject: Re: How to describe 2 entities in dataConfig for the DataImporter? The id in pet should be aliased to 'petid' , because id is coming from both entities there is a conflict <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners" transformer="TemplateTransformer"> <field column="comboId" template="owners-${owners.id}"/> <field column="id" /> <field column="first_name" name="userName"/> <field column="last_name" name="userName"/> <entity name="pets" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}"> <field column="id" name="petid"/> <field column="name" name="name"/> <field column="birth_date" name="birthDate"/> </entity> </entity> On Wed, Jun 4, 2008 at 10:37 AM, Noble Paul ??????? ?????? <[EMAIL PROTECTED]> wrote: > hi julio, > You must create an extra field for 'comboid' because you really need > the 'id' for your sub-entities. Your data-config must look as follows. > The pet also has a field called 'id' . It is not a good idea. call it > 'petid' or something (both in dataconfig and schema.xml). Please make > sure that the field names are unique . > > > <entity name="owners" pk="id" > query="select id,first_name,last_name FROM owners" > transformer="TemplateTransformer"> > <field column="comboId" template="owners-${owners.id}"/> > <field column="id" /> > <field column="first_name" name="userName"/> > <field column="last_name" name="userName"/> > > <entity name="pets" pk="id" > query="SELECT id,name,birth_date,type_id FROM pets WHERE > owner_id='${owners.id}'" > parentDeltaQuery="SELECT id FROM owners WHERE > id=${pets.owner_id}"> > <field column="id" name="id"/> > <field column="name" name="name"/> > <field column="birth_date" name="birthDate"/> > </entity> > </entity> > > > On Wed, Jun 4, 2008 at 5:50 AM, Julio Castillo <[EMAIL PROTECTED]> wrote: >> Hi Noble, >> I had forgotten to also list comboId as a uniqueKey in the schema.xml file. >> But that didn't make a difference. >> It still complained about the "Document [null] missing required field: id" >> for each row it ran into of the outer entity. >> >> If you look at the debug output of the entity:pets (see below on >> original message). >> The query looks like this: >> "SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1' >> >> This is the problem lies, because, the owner_id in the pets table is >> currently a number and thus will not match the modified combo id >> generated for the owners' id column. >> >> So, somehow, I need to be able to either remove the 'owners-' suffix >> before comparing, or append the same suffix to the pets.owner_id >> value prior to comparing. >> >> Thanks >> >> ** julio >> >> -----Original Message----- >> From: Noble Paul ??????? ?????? [mailto:[EMAIL PROTECTED] >> Sent: Monday, June 02, 2008 9:20 PM >> To: solr-user@lucene.apache.org >> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter? >> >> hi Julio, >> delete my previous response. In your schema , 'id' is the uniqueKey. >> make 'comboid' the unique key. Because that is the target field name >> coming out of the entity 'owners' >> >> --Noble >> >> On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul ??????? ?????? >> <[EMAIL PROTECTED]> wrote: >>> The field 'id' is repeated for pet also rename it to something else >>> say <entity name="pets" pk="id" >>> query="SELECT id,name,birth_date,type_id FROM pets >>> WHERE owner_id='${owners.id}'" >>> parentDeltaQuery="SELECT id FROM owners WHERE >>> id=${pets.owner_id}"> >>> <field column="id" name="petid"/> >>> </entity> >>> >>> --Noble >>> >>> On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo >>> <[EMAIL PROTECTED]> >> wrote: >>>> Shalin, >>>> I experimented with it, and the null pointer exception has been >>>> taken care of. Thank you. >>>> >>>> I have a different problem now. I believe it is a >>>> syntax/specification problem. >>>> >>>> When importing data, I got the following exceptions: >>>> SEVERE: Exception while adding: >>>> SolrInputDocumnt[{comboId=comboId(1.0)={owners-9}, >>>> userName=userName(1.0)={[David, Schroeder]}}] >>>> >>>> org.apache.solr.common.SolrException: Document [null] missing >>>> required >>>> field: id >>>> at >>>> >> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.jav >> a:289) >>>> at >>>> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataI >>>> mp >>>> ortHand >>>> ler.java:263) >>>> ... >>>> >>>> The problem arises the moment I try to include nested entities (e.g. >>>> pets -the problem does not occur if I don't use the transformer, >>>> but I have to use the transformer because other unrelated entities >>>> also have >> id's). >>>> My data config file looks as follows. >>>> >>>> <dataConfig> >>>> <document name="doc-1"> >>>> <entity name="owners" pk="id" >>>> query="select id,first_name,last_name FROM owners" >>>> transformer="TemplateTransformer"> >>>> <field column="id" name="comboId" >> template="owners-${owners.id}"/> >>>> <field column="first_name" name="userName"/> >>>> <field column="last_name" name="userName"/> >>>> >>>> <entity name="pets" pk="id" >>>> query="SELECT id,name,birth_date,type_id FROM pets >>>> WHERE owner_id='${owners.id}'" >>>> parentDeltaQuery="SELECT id FROM owners WHERE >>>> id=${pets.owner_id}"> >>>> <field column="id" name="id"/> >>>> <field column="name" name="name"/> >>>> <field column="birth_date" name="birthDate"/> >>>> </entity> >>>> </entity> >>>> </document> >>>> </dataConfig> >>>> >>>> The debug output of the data import looks as follows: >>>> >>>> .... >>>> - <lst name="verbose-output"> >>>> - <lst name="entity:owners"> >>>> - <lst name="document#1"> >>>> <str name="query">select id,first_name,last_name FROM owners</str> >>>> <str name="time-taken">0:0:0.15</str> >>>> <str>----------- row #1-------------</str> >>>> <int name="id">1</int> >>>> <str name="first_name">George</str> >>>> <str name="last_name">Franklin</str> >>>> <str>---------------------------------------------</str> >>>> - <lst name="transformer:TemplateTransformer"> >>>> <str>---------------------------------------------</str> >>>> <str name="id">owners-1</str> >>>> <str name="first_name">George</str> >>>> <str name="last_name">Franklin</str> >>>> <str>---------------------------------------------</str> >>>> - <lst name="entity:pets"> >>>> <str name="query">SELECT id,name,birth_date,type_id FROM >>>> pets WHERE owner_id='owners-1'</str> >>>> <str name="time-taken">0:0:0.0</str> >>>> </lst> >>>> </lst> >>>> </lst> >>>> + <lst name="document#1"> >>>> .... >>>> >>>> Thanks again >>>> >>>> ** julio >>>> >>>> >>>> -----Original Message----- >>>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED] >>>> Sent: Saturday, May 31, 2008 10:26 AM >>>> To: solr-user@lucene.apache.org >>>> Subject: Re: How to describe 2 entities in dataConfig for the >> DataImporter? >>>> >>>> Hi Julio, >>>> >>>> I've fixed the bug, can you please replace the exiting >>>> TemplateTransformer.java in the SOLR-469.patch and use the attached >>>> TemplateTransformer.java file. We'll add the changes to our next patch. >>>> Sorry for all the trouble. >>>> >>>> On Sat, May 31, 2008 at 10:31 PM, Noble Paul ??????? ?????? >>>> <[EMAIL PROTECTED]> wrote: >>>>> julio, >>>>> Looks like it is a bug. >>>>> We can give u a new TemplateTransformer.java which we will >>>>> incorporate in the next patch --Noble >>>>> >>>>> On Sat, May 31, 2008 at 12:24 AM, Julio Castillo >>>>> <[EMAIL PROTECTED]> wrote: >>>>>> I'm sorry Shalin, but I still get the same Null Pointer exception. >>>>>> This is my complete dataconfig.xml (I remove the parallel entity >>>>>> to narrow down the scope of the problem). >>>>>> <dataConfig> >>>>>> <document name="doc-1"> >>>>>> <entity name="vets" pk="id" >>>>>> query="select id as idAlias,first_name,last_name FROM vets" >>>>>> deltaQuery="SELECT id as idAlias FROM vets WHERE >>>>>> last_modified > '${dataimporter.last_index_time}'" >>>>>> transformer="TemplateTransformer"> >>>>>> <field column="id" name="id" >>>>>> template="vets-${vets.idAlias}"/> >>>>>> <field column="first_name" name="userName"/> >>>>>> <field column="last_name" name="userName"/> >>>>>> </entity> >>>>>> </document> >>>>>> </dataConfig> >>>>>> >>>>>> Thanks again. >>>>>> >>>>>> ** julio >>>>>> >>>>>> -----Original Message----- >>>>>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED] >>>>>> Sent: Friday, May 30, 2008 11:38 AM >>>>>> To: solr-user@lucene.apache.org >>>>>> Subject: Re: How to describe 2 entities in dataConfig for the >>>> DataImporter? >>>>>> >>>>>> The surname is used just as an example of a field. >>>>>> >>>>>> The NullPointerException is because the same field "id" tries to >>>>>> use it's own value in a template. The template cannot contain the >>>>>> same field on which it is being applied. I'd suggest that you get >>>>>> the id aliased to another name, for example using a query "select >>>>>> id as idAlias from vets" and then >>>>>> use: >>>>>> <field column="id" template="vets-${vets.idAlias}" /> >>>>>> >>>>>> That should work, let me know if you face a problem. >>>>>> >>>>>> On Fri, May 30, 2008 at 10:40 PM, Julio Castillo >>>>>> <[EMAIL PROTECTED]> >>>>>> wrote: >>>>>>> Thanks for all the leads. >>>>>>> I did get however a null pointer exception while implementing it: >>>>>>> >>>>>>> May 30, 2008 9:57:50 AM >>>>>>> org.apache.solr.handler.dataimport.EntityProcessorBase >> applyTransformer >>>>>>> WARNING: transformer threw error java.lang.NullPointerException >>>>>>> at >>>>>>> >> org.apache.solr.handler.dataimport.TemplateTransformer.transformRow(T >> emplate >> Transformer.java:55) >>>>>>> at >>>>>>> >> org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransform >> er(Enti >> tyProcessorBase.java:186) >>>>>>> >>>>>>> Looking at the source code, it appears that the resolverMap is null. >>>>>>> The resolver returned null given the entityName. >>>>>>> >>>>>>> >>>>>>> ** julio >>>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Noble Paul ??????? ?????? [mailto:[EMAIL PROTECTED] >>>>>>> Sent: Thursday, May 29, 2008 11:10 PM >>>>>>> To: solr-user@lucene.apache.org >>>>>>> Subject: Re: How to describe 2 entities in dataConfig for the >>>>>> DataImporter? >>>>>>> >>>>>>> Sorry I forgot to mention that. >>>>>>> http://wiki.apache.org/solr/DataImportHandler#head-a6916b30b5d76 >>>>>>> 05 >>>>>>> a9 >>>>>>> 90 >>>>>>> fb03c4 >>>>>>> ff461b3736496a9 >>>>>>> --Noble >>>>>>> >>>>>>> On Fri, May 30, 2008 at 11:37 AM, Shalin Shekhar Mangar >>>>>>> <[EMAIL PROTECTED]> wrote: >>>>>>>> You need to enable TemplateTransformer for your entity. For example: >>>>>>>> <entity name="owners" pk="id" query="...." >>>>>>>> transformer="TemplateTransformer"> >>>>>>>> >>>>>>>> On Fri, May 30, 2008 at 11:31 AM, Julio Castillo >>>>>>>> <[EMAIL PROTECTED]> wrote: >>>>>>>>> Noble, >>>>>>>>> I tried the template setting for the "id" field, but I didn't >>>>>>>>> notice any different behavior. I also didn't see where this >>>>>>>>> would be >>>> reflected. >>>>>>>>> I looked at the fields and the debug output for the >>>>>>>>> dataImporter and couldn't see any reference to a modified id >>>>>>>>> name (per the template instructions). >>>>>>>>> >>>>>>>>> The behavior in the end seemed to be the same. Did I miss anything? >>>>>>>>> I assume that the <uniqueKey>id</uniqueKey> setting in the >>>>>>>>> schema.xml remains the same? >>>>>>>>> >>>>>>>>> Thanks again >>>>>>>>> >>>>>>>>> ** julio >>>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Noble Paul ??????? ?????? [mailto:[EMAIL PROTECTED] >>>>>>>>> Sent: Thursday, May 29, 2008 9:46 PM >>>>>>>>> To: solr-user@lucene.apache.org >>>>>>>>> Subject: Re: How to describe 2 entities in dataConfig for the >>>>>>> DataImporter? >>>>>>>>> >>>>>>>>> Consider constructing the id concatenating an extra string for >>>>>>>>> each document . You can construct that field using the >>>> TeplateTransformer. >>>>>>>>> in the entity owners keep the id as >>>>>>>>> >>>>>>>>> <field column="id" name="id" template="owners-${owners.id}"/> >>>>>>>>> and in vets <field column="id" name="id" >>>>>>>>> template="vets-${vets.id}"/> >>>>>>>>> >>>>>>>>> or anything else which can make it unique >>>>>>>>> >>>>>>>>> --Noble >>>>>>>>> >>>>>>>>> On Fri, May 30, 2008 at 10:05 AM, Shalin Shekhar Mangar >>>>>>>>> <[EMAIL PROTECTED]> wrote: >>>>>>>>>> That will happen only if id is the uniqueKey in Solr and the >>>>>>>>>> id coming from both your tables have same values. In that >>>>>>>>>> case, they will overwrite each other. You will need a >>>>>>>>>> separate uniqueKey (on other than id field). >> >> > > > > -- > --Noble Paul > -- --Noble Paul