Hello,

I was able to reproduce this behaviour in Solr 3.1.0. The procedure is:

- rename the directory example-DIH/rss to example-DIH/gcontacts
- modify solrconfig.xml to only load gcontacts
- rename rss-data-config.xml to gcontacts-data-config.xml and modify it (see content below)
- modify schema.xml

This is from my schema.xml:

<field name="source" type="text" indexed="true" stored="true" />
<field name="source-link" type="string" indexed="false" stored="true" />
<field name="title" type="string" indexed="true" stored="true" />
<field name="link" type="string" indexed="true" stored="true" />
<field name="email" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
<field name="phoneNumber" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
<field name="organization" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
<field name="postalAddress" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
<field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />

<copyField source="title" dest="all_text" />
<copyField source="email" dest="all_text" />
<copyField source="phoneNumber" dest="all_text" />
<copyField source="organization" dest="all_text" />
<copyField source="postalAddress" dest="all_text" />

This is my gcontacts-data-config.xml file:

<dataConfig>
  <dataSource type="URLDataSource" />
  <document>
    <entity name="gcontacts"
            pk="link"
            url="http://172.16.0.30/sayt2/contacts/testtim.xml"
            processor="XPathEntityProcessor"
            forEach="/feed/entry">
      <field column="source" xpath="/feed/entry/id" commonField="true" />
      <field column="source-link" xpath="/feed/entry/link[@rel='edit']/@href" commonField="true" />
      <field column="title" xpath="/feed/entry/title" commonField="true"/>
      <field column="link" xpath="/feed/entry/link[@rel='edit']/@href" />
      <field column="email" xpath="/feed/entry/email/@address" commonField="true"/>
      <field column="phoneNumber" xpath="/feed/entry/phoneNumber" commonField="true"/>
      <field column="organization" xpath="/feed/entry/organization" commonField="true"/>
      <field column="postalAddress" xpath="/feed/entry/postalAddress" commonField="true"/>
    </entity>
  </document>
</dataConfig>

This is from my solrconfig.xml file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<solr sharedLib="lib" persistent="true">
  <cores adminPath="/admin/cores">
    <core default="false" instanceDir="gcontacts" name="gcontacts"/>
  </cores>
</solr>

Thanks for your help.

Regards

On Fri, Apr 1, 2011 at 4:27 AM, Stefan Matheis <matheis.ste...@googlemail.com> wrote:

> Marcelo,
>
> could you paste the relevant parts of your DIH config?
>
> Regards
> Stefan
>
> On Thu, Mar 31, 2011 at 9:55 PM, Marcelo Iturbe <marc...@santiago.cl> wrote:
> > Hello,
> > I have an XML which contains personal contacts. Not all contacts have the
> > same fields (email, phone, postal).
> >
> > The problem is that when certain fields are NOT present, SOLR is injecting
> > the previous contact's data.
> >
> > For example, assume the following from the XML feed:
> > <entry>
> >   <title type='text'>Jane Doe</title>
> >   <gd:email rel='http://schemas.google.com/g/2005#work' address='jane....@gmail.com' primary='true'/>
> >   <gd:postalAddress rel='http://schemas.google.com/g/2005#home'>Santiago
> > Region Metropolitana
> > Chile</gd:postalAddress>
> > </entry>
> > <entry>
> >   <title type='text'>Jeff Smith</title>
> >   <gd:email rel='http://schemas.google.com/g/2005#work' address='jeff.sm...@gmail.com' primary='true'/>
> > </entry>
> > <entry>
> >   <title type='text'>Ana Mercurio</title>
> >   <gd:phoneNumber rel='http://schemas.google.com/g/2005#mobile' primary='true'>+56912345678</gd:phoneNumber>
> > </entry>
> >
> > The second contact will have the first contact's postal address.
> > The third contact will have Jane's postal address and Jeff's email address:
> >
> > <lst>
> >   <arr name="title">
> >     <str>Ana Mercurio</str>
> >   </arr>
> >   <arr name="phoneNumber">
> >     <str>+5612345678</str>
> >   </arr>
> >   <arr name="email">
> >     <str>jeff.sm...@gmail.com</str>
> >   </arr>
> >   <arr name="postalAddress">
> >     <str>Santiago
> > Region Metropolitana
> > Chile</str>
> >   </arr>
> > </lst>
> >
> > This is how I have the fields specified in the schema.xml file:
> > <field name="email" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
> > <field name="phoneNumber" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
> > <field name="postalAddress" type="string" indexed="true" stored="true" multiValued="true" default=" "/>
> >
> > What did I miss?
> >
> > Thanks for your help.
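
A possible lead, offered as an assumption rather than a confirmed diagnosis: almost every field in the gcontacts entity is marked commonField="true". The DataImportHandler documentation describes commonField as copying a value, once encountered in a record, to the records that follow, which would match the carry-over described in the thread. A minimal sketch of the same entity with commonField left off the per-contact fields (untested against this feed, and all other attributes are unchanged from the config in this message):

```xml
<!-- Sketch only: same entity as in gcontacts-data-config.xml, with
     commonField removed from every field. Whether this stops values
     from carrying over between entries is an assumption. -->
<dataConfig>
  <dataSource type="URLDataSource" />
  <document>
    <entity name="gcontacts"
            pk="link"
            url="http://172.16.0.30/sayt2/contacts/testtim.xml"
            processor="XPathEntityProcessor"
            forEach="/feed/entry">
      <field column="source" xpath="/feed/entry/id" />
      <field column="source-link" xpath="/feed/entry/link[@rel='edit']/@href" />
      <field column="title" xpath="/feed/entry/title" />
      <field column="link" xpath="/feed/entry/link[@rel='edit']/@href" />
      <field column="email" xpath="/feed/entry/email/@address" />
      <field column="phoneNumber" xpath="/feed/entry/phoneNumber" />
      <field column="organization" xpath="/feed/entry/organization" />
      <field column="postalAddress" xpath="/feed/entry/postalAddress" />
    </entity>
  </document>
</dataConfig>
```

After changing the config, a full re-import with a clean index would be needed to see whether Jeff Smith and Ana Mercurio still pick up the earlier contacts' values.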