Re: How to make Relationships work for Multi-valued Index Fields?
Hey Gunaranjan,

I have the same scenario as you. A Lucene index is denormalized; it should not contain entity relationships. When I need to do something like what you are doing, I group the related values into one field.

Let's say we have 2 credit cards: the first has id 30459673 and taxes at 1.5%/month, and the second has id 56305 and taxes at 2.5%. I create a multivalued field and index each value as "id ^ taxes". On the client side I put the logic to parse the string into a form that is convenient to work with.

I hope that helps you.

2009/1/25 Gunaranjan Chandraraju chandrar...@apple.com
> Paul
>
> It's not just about merging the fields or resource usage. If you look at my scenario, the issue is that it mixes up my fields (shipping and billing address, for instance). I can't merge them and still keep the 'distinction' for search. Your case is a 'generalization' field, so the search will work. I know mine is a trivial example and can be overcome with just two fields (shipping_address, billing_address), but I am talking about cases where we have many such 'groups of fields'. In general, such one-to-many relationships for indices in a 'document' are also really, really common :).
>
> Again, I am not trying to argue a point - I would be happy to get some idea of how to do it and be corrected if I'm wrong.
>
> Lastly (while that's not my worry point right now), I tend to be careful with resources. When dealing with very large data, I will avoid any unnecessary overhead as far as possible and take every optimization I can get :)
>
> Guna
>
> On Jan 25, 2009, at 1:50 AM, Paul Libbrecht wrote:
>> Guna,
>>
>> It's really, really normal to duplicate content so that it can be merged into a field. We do this all the time, for example to have a field text-in-any-language while a field text-in-english is also there, and the queries boost matches in text-in-any-language less than matches in text-in-english (if the user is in English). This difference in weighting is the gold of Lucene, I feel (and of retrieval generally).
>>
>> Also, depending on the field, you can index differently while still copying it in Solr (for example, use a different analyzer per language).
>>
>> paul
>>
>> PS: don't be scared about resources, this is the side of the world where resources are the least of the problems! (Typically a catch-all field wouldn't be stored, though, as this would then load the memory.)
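The composite-value idea from the top of this message (one multivalued field whose values are "id ^ taxes" strings, parsed on the client) could be sketched roughly as follows. This is only an illustration: the `CardEntry` type, the `encode`/`decode` helpers and the assumption that ids never contain the `^` character are all hypothetical and not taken from the thread.

```python
from typing import NamedTuple

# Assumed delimiter, following the "id ^ taxes" convention described above.
SEPARATOR = "^"


class CardEntry(NamedTuple):
    """One related group of values kept together in a single field value."""
    card_id: str
    monthly_rate_percent: float


def encode(card_id: str, rate: float) -> str:
    """Build the string that would be indexed as one value of the multivalued field."""
    return f"{card_id} {SEPARATOR} {rate}"


def decode(value: str) -> CardEntry:
    """Client-side parsing of one stored value back into its parts."""
    card_id, rate = (part.strip() for part in value.split(SEPARATOR, 1))
    return CardEntry(card_id, float(rate))


# The two cards from the example: each value keeps id and rate together,
# so the pairing survives even though the field is multivalued.
stored_values = [encode("30459673", 1.5), encode("56305", 2.5)]
cards = [decode(v) for v in stored_values]
```

Because each pair travels as one value, the index never has to know about the relationship; the cost is that the client must parse every value and that searching on just one half of the pair depends on how the field is tokenized.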
Re: How to make Relationships work for Multi-valued Index Fields?
Thanks.

This sounds redundant to me - storing the fields separately and then concatenating all of them into one copy field again.

My XML is like this:

    <address street="XYZ" state="CA" country="1" type="shipping" ... />

I am currently using XPath or XSL to separate them into individual indexed fields like address_state_1, address_type_1 etc. in Solr. From what you say, it looks to me that I might as well just treat the entire address as a single 'text field' and search within the text after tokenizing. This way I don't need the _1, _2 suffixes, as the single text field will contain the information together (and thus grouped - so I know which is shipping/billing etc?).

Will there be any performance difference between this and the copy field approach?

Is there no other (programmatic) way to search across multiple fields? I did take a quick look at dismax, but again it needs the field names to be specifically mentioned in the config file or in the query. I can't do this as I am not able to predict the number of fields (e.g. how many credit cards a person can have).

I like Solr, but to me this seems to be a very common and simple search scenario/pattern, yet its implementation in Solr does not appear to be very straightforward. (My apologies if I am on the wrong track here because I don't understand Solr well.)

Regards,
Guna

On Jan 24, 2009, at 10:54 PM, Noble Paul നോബിള് नोब्ळ् wrote:
> For searching you need to put them in a single field. Use copyField in schema.xml to achieve that.
Re: How to make Relationships work for Multi-valued Index Fields?
On Sun, Jan 25, 2009 at 2:05 PM, Gunaranjan Chandraraju chandrar...@apple.com wrote:
> Thanks. This sounds redundant to me - storing the fields separately and then concatenating all of them into one copy field again.

Sometimes that may be the only way. For example, if you want to facet on some of those fields as well as search them all.

> My XML is like this:
>
>     <address street="XYZ" state="CA" country="1" type="shipping" ... />
>
> I am currently using XPath or XSL to separate them into individual indexed fields like address_state_1, address_type_1 etc. in Solr. From what you say, it looks to me that I might as well just treat the entire address as a single 'text field' and search within the text after tokenizing. This way I don't need the _1, _2 suffixes, as the single text field will contain the information together (and thus grouped - so I know which is shipping/billing etc?). Will there be any performance difference between this and the copy field approach?

No, I think one field may even be better since you are creating fewer fields. If you never need to do faceting and you don't need to get the contents of each address field separately, this is your best option.

> Is there no other (programmatic) way to search across multiple fields? I did take a quick look at dismax, but again it needs the field names to be specifically mentioned in the config file or in the query. I can't do this as I am not able to predict the number of fields (e.g. how many credit cards a person can have).
>
> I like Solr, but to me this seems to be a very common and simple search scenario/pattern, yet its implementation in Solr does not appear to be very straightforward. (My apologies if I am on the wrong track here because I don't understand Solr well.)

There has been some discussion on having wildcards in field names, but I guess nobody contributed (or had the need?) for the complete proposal. Copy fields give a lot of flexibility, which is what most people use.

http://wiki.apache.org/solr/FieldAliasesAndGlobsInParams

--
Regards,
Shalin Shekhar Mangar.
Re: How to make Relationships work for Multi-valued Index Fields?
Hello,

I am also a newbie and was wanting to do almost the exact same thing. I was planning on doing the equivalent of:

    <dataConfig>
      <dataSource type="FileDataSource" encoding="UTF-8" />
      <document>
        <entity name="f" processor="FileListEntityProcessor" baseDir="***" fileName=".*xml" rootEntity="false" dataSource="null">
          <!-- changed: rootEntity="false" added -->
          <entity name="record" processor="XPathEntityProcessor" stream="false" rootEntity="false" forEach="/record" url="${f.fileAbsolutePath}">
            <!-- changed: commonField="true" added -->
            <field column="ID" xpath="/record/@id" commonField="true" />
            <!-- Address -->
            <entity name="record_adr" processor="XPathEntityProcessor" stream="false" forEach="/record/address" url="${f.fileAbsolutePath}">
              <field column="address_street" xpath="/record/address/@street" />
              <field column="address_state" xpath="/record/address//@state" />
              <field column="address_type" xpath="/record/address//@type" />
            </entity>
          </entity>
        </entity>
      </document>
    </dataConfig>

ID is no longer unique within Solr; there would be multiple documents with a given ID, one for each address. You can then search on ID and get the three addresses, and you can also search on an address more sensibly.

I have not been able to try this yet as other issues are still to be dealt with. Comments?

> Hi
>
> I may be completely off on this, being new to Solr, but I am not sure how to index related groups of fields in a document and preserve their 'grouping'. I would appreciate any help on this. Detailed description of the problem below.
>
> I am trying to index an entity that can have multiple occurrences in the same document - e.g. Address. The address could be Shipping, Home, Office etc. Each address element has multiple values in it, like street, state etc. Thus each address element is a group, with the state and street in one address element being related to each other.
>
> It looks like this in my source XML:
>
>     <record>
>       <coreInfo id="123" ... />
>       <address street="XYZ1" state="CA" ... type="home" />
>       <address street="XYZ2" state="CA" ... type="Office" />
>       <address street="XYZ3" state="CA" type="Other" />
>     </record>
>
> I have set up my DIH to treat these as entities, as below:
>
>     <dataConfig>
>       <dataSource type="FileDataSource" encoding="UTF-8" />
>       <document>
>         <entity name="f" processor="FileListEntityProcessor" baseDir="***" fileName=".*xml" rootEntity="false" dataSource="null">
>           <entity name="record" processor="XPathEntityProcessor" stream="false" forEach="/record" url="${f.fileAbsolutePath}">
>             <field column="ID" xpath="/record/@id" />
>             <!-- Address -->
>             <entity name="record_adr" processor="XPathEntityProcessor" stream="false" forEach="/record/address" url="${f.fileAbsolutePath}">
>               <field column="address_street" xpath="/record/address/@street" />
>               <field column="address_state" xpath="/record/address//@state" />
>               <field column="address_type" xpath="/record/address//@type" />
>             </entity>
>           </entity>
>         </entity>
>       </document>
>     </dataConfig>
>
> The problem is as follows. DIH seems to treat these as entities, but Solr seems to flatten them out on indexing into fields of a document (losing the entity part).
>
> So when I search for an ID, in the response all the street fields are bunched together, followed by all the state fields, then the types, etc. Thus I can't tell which street corresponds to which address type in the response.
>
> What seems harder is this: say I need to query on street = XYZ1 and type = Office. This should NOT return a document, since the street for the office address is XYZ2 and not XYZ1. However, when I query for address_street:XYZ1 and address_type:Office I get back this document.
>
> The problem seems to be that while DIH allows 'entities' within a document, the Solr schema does not preserve them - it 'flattens' all of them out as indices for the document.
>
> I could work around the problem by creating Solr fields like home_address_street and office_address_street and doing some XPath mapping. However, I don't want to do that, as we can have multiple 'other' addresses. Also, I have other fields whose type is not as easily distinguished as it is for addresses.
>
> As I mentioned, being new to Solr I might have completely goofed on the way to set it up - I would much appreciate any direction on it. I am using Solr 1.3.
>
> Regards,
> Guna

--
===
Fergus
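The cross-matching problem described in the quoted question, and the one-document-per-address layout proposed in this message, can be illustrated with a small simulation. This is a sketch in plain Python, not Solr: the `flatten`, `one_doc_per_address` and `matches` helpers are hypothetical, and `matches` only mimics "every clause must match some value of its field", which is the behaviour the question reports.

```python
# Source record from the question: three addresses under one id.
record = {
    "id": "123",
    "addresses": [
        {"street": "XYZ1", "state": "CA", "type": "home"},
        {"street": "XYZ2", "state": "CA", "type": "Office"},
        {"street": "XYZ3", "state": "CA", "type": "Other"},
    ],
}


def flatten(rec):
    """One document per record: each attribute becomes an independent multivalued field."""
    doc = {"ID": rec["id"], "address_street": [], "address_state": [], "address_type": []}
    for addr in rec["addresses"]:
        doc["address_street"].append(addr["street"])
        doc["address_state"].append(addr["state"])
        doc["address_type"].append(addr["type"])
    return doc


def one_doc_per_address(rec):
    """One document per address, repeating the (no longer unique) record id."""
    return [
        {
            "ID": rec["id"],
            "address_street": addr["street"],
            "address_state": addr["state"],
            "address_type": addr["type"],
        }
        for addr in rec["addresses"]
    ]


def matches(doc, **criteria):
    """True if every criterion equals at least one value of its field."""
    for field, wanted in criteria.items():
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        if wanted not in values:
            return False
    return True


# Flattened layout: XYZ1 and Office both occur somewhere in the document,
# so the query matches even though they belong to different addresses.
flat_hit = matches(flatten(record), address_street="XYZ1", address_type="Office")

# Per-address layout: no single document holds both values, so nothing matches.
split_hits = [
    d for d in one_doc_per_address(record)
    if matches(d, address_street="XYZ1", address_type="Office")
]
```

Here `flat_hit` comes out true (the false positive from the question) while `split_hits` is empty. The trade-off is the one noted above: the record id stops being unique, so anything that needs "one hit per record" has to regroup the results by id.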
Re: How to make Relationships work for Multi-valued Index Fields?
Nesting an XPathEntityProcessor inside another XPathEntityProcessor is possible only if a field in the XML is a filename/URL.

What is the purpose of nesting like this? Is it because you have multiple addresses? The possible solutions are discussed elsewhere in this thread.

On Sat, Jan 24, 2009 at 2:41 PM, Fergus McMenemie fer...@twig.me.uk wrote:
> Hello,
>
> I am also a newbie and was wanting to do almost the exact same thing. I was planning on doing the equivalent of:
>
>     <dataConfig>
>       <dataSource type="FileDataSource" encoding="UTF-8" />
>       <document>
>         <entity name="f" processor="FileListEntityProcessor" baseDir="***" fileName=".*xml" rootEntity="false" dataSource="null">
>           <!-- changed: rootEntity="false" added -->
>           <entity name="record" processor="XPathEntityProcessor" stream="false" rootEntity="false" forEach="/record" url="${f.fileAbsolutePath}">
>             <!-- changed: commonField="true" added -->
>             <field column="ID" xpath="/record/@id" commonField="true" />
>             <!-- Address -->
>             <entity name="record_adr" processor="XPathEntityProcessor" stream="false" forEach="/record/address" url="${f.fileAbsolutePath}">
>               <field column="address_street" xpath="/record/address/@street" />
>               <field column="address_state" xpath="/record/address//@state" />
>               <field column="address_type" xpath="/record/address//@type" />
>             </entity>
>           </entity>
>         </entity>
>       </document>
>     </dataConfig>
>
> ID is no longer unique within Solr; there would be multiple documents with a given ID, one for each address. You can then search on ID and get the three addresses, and you can also search on an address more sensibly.
>
> I have not been able to try this yet as other issues are still to be dealt with. Comments?
Re: How to make Relationships work for Multi-valued Index Fields?
Hi Fergus,

XPathEntityProcessor can read multivalued fields easily, e.g.:

    <dataConfig>
      <dataSource type="FileDataSource" encoding="UTF-8" />
      <document>
        <entity name="f" processor="FileListEntityProcessor" baseDir="***" fileName=".*xml" rootEntity="false" dataSource="null">
          <entity name="record" processor="XPathEntityProcessor" forEach="/record" url="${f.fileAbsolutePath}">
            <!-- change -->
            <field column="ID" xpath="/record/@id" commonField="true" />
            <field column="address_street" xpath="/record/address/@street" />
            <field column="address_state" xpath="/record/address/@state" />
            <field column="address_type" xpath="/record/address/@type" />
          </entity>
        </entity>
      </document>
    </dataConfig>

In this case address_street, address_state and address_type will each be returned as a separate list while parsing. If you wish to put them into multiple fields, you can write a transformer that iterates through the lists and puts the values into separate fields.

If there are 3 address tags, then you get a List<String> for each field, where the length of the list == 3. If an item is missing, it will be added as a null.

Ensure that the fields are marked as multiValued="true" in schema.xml; otherwise a List<String> is not returned. If there is no corresponding mapping in schema.xml, you can put it explicitly in dataconfig.xml, e.g.:

    <field column="address_state" multiValued="true" xpath="/record/address/@state" />

I saw the syntax '/record/address//@state'. '//' is not supported. You will have to give the full path explicitly.

--Noble

On Sat, Jan 24, 2009 at 2:57 PM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote:
> Nesting an XPathEntityProcessor inside another XPathEntityProcessor is possible only if a field in the XML is a filename/URL.
>
> What is the purpose of nesting like this? Is it because you have multiple addresses? The possible solutions are discussed elsewhere in this thread.
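The "iterate through the lists and put them into separate fields" step suggested in this message can be sketched as below. This is plain Python working on a dictionary that stands in for a parsed row; it is not the DataImportHandler transformer API (which is Java), and the function name `split_parallel_lists` and the `_1`, `_2` suffix scheme are assumptions for illustration.

```python
from typing import Any, Dict, List


def split_parallel_lists(row: Dict[str, Any], list_fields: List[str]) -> Dict[str, Any]:
    """Turn each list-valued field into numbered single-valued fields.

    The n-th entry of every list gets the suffix _n, so values that came from
    the same <address> element end up sharing a suffix. Missing items
    (None, as described above) are skipped rather than indexed.
    """
    # Keep all other fields (e.g. ID) unchanged.
    out = {name: value for name, value in row.items() if name not in list_fields}
    for name in list_fields:
        for position, value in enumerate(row.get(name) or [], start=1):
            if value is not None:
                out[f"{name}_{position}"] = value
    return out


# A row shaped like the parse result described above: three equally long lists,
# with a None where the second address had no state attribute.
row = {
    "ID": "123",
    "address_street": ["XYZ1", "XYZ2", "XYZ3"],
    "address_state": ["CA", None, "CA"],
    "address_type": ["home", "Office", "Other"],
}

result = split_parallel_lists(row, ["address_street", "address_state", "address_type"])
```

Because missing items are reported as nulls instead of being dropped, the positions in the three lists stay aligned, which is what makes the shared suffix meaningful.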
Re: How to make Relationships work for Multi-valued Index Fields?
I made this approach work with XPath and XSL. However, it creates multiple fields like this:

address_state_1
address_state_2
...
address_state_10

and

credit_card_1
credit_card_2
credit_card_3

How do I search for a credit card? The query syntax does not seem to support wildcards in field names. For example, I can't seem to do this:

credit_card*:1234 4567 7890 1234

On the search side I would not know how many credit card fields were created for a document, so I need that to be dynamic.

-g

On Jan 22, 2009, at 11:54 PM, Shalin Shekhar Mangar wrote:
> Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.
Re: How to make Relationships work for Multi-valued Index Fields?
For searching you need to put them in a single field. Use copyField in schema.xml to achieve that.

On Sun, Jan 25, 2009 at 7:39 AM, Gunaranjan Chandraraju chandrar...@apple.com wrote:
> I made this approach work with XPath and XSL. However, it creates multiple fields like this:
>
> address_state_1
> address_state_2
> ...
> address_state_10
>
> and
>
> credit_card_1
> credit_card_2
> credit_card_3
>
> How do I search for a credit card? The query syntax does not seem to support wildcards in field names. For example, I can't seem to do this:
>
> credit_card*:1234 4567 7890 1234
>
> On the search side I would not know how many credit card fields were created for a document, so I need that to be dynamic.
>
> -g

--
--Noble Paul
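The effect of the copyField suggestion in this message can be pictured with a small simulation. In Solr the copy is declared in schema.xml and happens at index time; the Python below only imitates the outcome on a dictionary. The helper `apply_copy_field`, the destination name `all_credit_cards` and the use of a `credit_card_*` pattern as the source are assumptions made for this sketch.

```python
import fnmatch
from typing import Any, Dict


def apply_copy_field(doc: Dict[str, Any], source_glob: str, dest: str) -> Dict[str, Any]:
    """Return a copy of doc with every value of fields matching source_glob
    also collected into one multivalued destination field."""
    copied = []
    for name, value in doc.items():
        if fnmatch.fnmatchcase(name, source_glob):
            copied.extend(value if isinstance(value, list) else [value])
    result = dict(doc)
    if copied:
        result[dest] = copied
    return result


# A document with an unpredictable number of numbered credit card fields.
doc = {
    "ID": "123",
    "credit_card_1": "1234 4567 7890 1234",
    "credit_card_2": "1111 2222 3333 4444",
    "address_state_1": "CA",
}

indexed = apply_copy_field(doc, "credit_card_*", "all_credit_cards")

# The search side now needs to know only one field name, however many
# credit_card_N fields a given document happened to have.
found = "1234 4567 7890 1234" in indexed["all_credit_cards"]
```

The numbered fields stay available for faceting or display, while queries go against the single catch-all field. As discussed elsewhere in the thread, the merged field no longer says which numbered field a value came from.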
Re: How to make Relationships work for Multi-valued Index Fields?
I thought 1.3 supported dynamic fields in schema.xml?

Guna

On Jan 22, 2009, at 11:54 PM, Shalin Shekhar Mangar wrote:
> Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.
Re: How to make Relationships work for Multi-valued Index Fields?
Yes Solr does. But DataImportHandler with the 1.3 release does not support it. However, you can use the trunk data import handler jar with Solr 1.3 if you do not feel comfortable using Solr 1.4 trunk. On Fri, Jan 23, 2009 at 1:36 PM, Gunaranjan Chandraraju chandrar...@apple.com wrote: I thought 1.3 supported dynamic fields in schema.xml? Guna On Jan 22, 2009, at 11:54 PM, Shalin Shekhar Mangar wrote: Oops, one more gotcha. The dynamic field support is only in 1.4 trunk. On Fri, Jan 23, 2009 at 1:24 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Fri, Jan 23, 2009 at 1:08 PM, Gunaranjan Chandraraju chandrar...@apple.com wrote: record coreInfo id=123 , .../ address street=XYZ1 State=CA ...type=home / address street=XYZ2 state=CA ... type=Office/ address street=XYZ3 state=CA type=Other/ /record I have setup my DIH to treat these as entities as below dataConfig dataSource type=FileDataSource encoding=UTF-8 / document entity name =f processor=FileListEntityProcessor baseDir=*** fileName=.*xml rootEntity=false dataSource=null entity name=record processor=XPathEntityProcessor stream=false forEach=/record url=${f.fileAbsolutePath} field column=ID xpath=/record/@id / !-- Address -- entity name=record_adr processor=XPathEntityProcessor stream=false forEach=/record/address url=${f.fileAbsolutePath} field column=address_street xpath=/record/address/@street / field column=address_state xpath=/record/address//@state / field column=address_type xpath=/record/address//@type / /entity /entity /entity /document /dataConfig I think the only way is to create a dynamic field for each attribute (street, state etc.). Write a transformer to copy the fields from your data config to appropriately named dynamic field (e.g. street_1, state_1, etc). To maintain this counter you will need to get/store it with Context#getSessionAttribute(name, val, Context.SCOPE_DOC) and Context#setSessionAttribute(name, val, Context.SCOPE_DOC). I cant't think of an easier way. 
-- Regards, Shalin Shekhar Mangar.
Re: How to make Relationships work for Multi-valued Index Fields?
On Fri, Jan 23, 2009 at 1:08 PM, Gunaranjan Chandraraju chandrar...@apple.com wrote:

<record>
  <coreInfo id="123" , .../>
  <address street="XYZ1" state="CA" ... type="home" />
  <address street="XYZ2" state="CA" ... type="Office" />
  <address street="XYZ3" state="CA" type="Other" />
</record>

I have set up my DIH to treat these as entities, as below:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="f" processor="FileListEntityProcessor" baseDir="***" fileName=".*xml" rootEntity="false" dataSource="null">
      <entity name="record" processor="XPathEntityProcessor" stream="false" forEach="/record" url="${f.fileAbsolutePath}">
        <field column="ID" xpath="/record/@id" />
        <!-- Address -->
        <entity name="record_adr" processor="XPathEntityProcessor" stream="false" forEach="/record/address" url="${f.fileAbsolutePath}">
          <field column="address_street" xpath="/record/address/@street" />
          <field column="address_state" xpath="/record/address//@state" />
          <field column="address_type" xpath="/record/address//@type" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

I think the only way is to create a dynamic field for each attribute (street, state, etc.). Write a transformer to copy the fields from your data config to an appropriately named dynamic field (e.g. street_1, state_1, etc.). To maintain this counter you will need to read and store it with Context#getSessionAttribute(name, Context.SCOPE_DOC) and Context#setSessionAttribute(name, val, Context.SCOPE_DOC). I can't think of an easier way.

-- Regards, Shalin Shekhar Mangar.
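
The renaming step suggested above could look roughly like the sketch below. It is not a real DIH transformer: to keep it runnable on its own, a plain Map stands in for the document-scoped session that a transformer would reach through Context#getSessionAttribute and Context#setSessionAttribute with Context.SCOPE_DOC. The class name AddressFieldRenamer, the key "addressCounter" and the "address_" prefix handling are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class AddressFieldRenamer {

    // Hypothetical session key under which the per-document counter is kept.
    static final String COUNTER_KEY = "addressCounter";
    // Prefix used by the column names in the data config above.
    static final String PREFIX = "address_";

    /**
     * Renames address_street, address_state, ... to street_N, state_N, ...
     * where N counts the address rows seen so far for the current document.
     * docSession stands in for the document-scoped DIH session (assumption).
     */
    static Map<String, Object> transformRow(Map<String, Object> row,
                                            Map<String, Object> docSession) {
        // Read the counter, starting at 1 for the first address of a document.
        Integer previous = (Integer) docSession.get(COUNTER_KEY);
        int index = (previous == null) ? 1 : previous + 1;
        // Store it back so the next address row gets the next suffix.
        docSession.put(COUNTER_KEY, index);

        Map<String, Object> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            String name = e.getKey();
            if (name.startsWith(PREFIX)) {
                // address_street -> street_1, address_state -> state_1, ...
                renamed.put(name.substring(PREFIX.length()) + "_" + index, e.getValue());
            } else {
                // Leave unrelated columns (e.g. ID) untouched.
                renamed.put(name, e.getValue());
            }
        }
        return renamed;
    }

    public static void main(String[] args) {
        // One session per document; a new document would start with a fresh map.
        Map<String, Object> docSession = new HashMap<>();

        Map<String, Object> home = new LinkedHashMap<>();
        home.put("address_street", "XYZ1");
        home.put("address_state", "CA");
        home.put("address_type", "home");

        Map<String, Object> office = new LinkedHashMap<>();
        office.put("address_street", "XYZ2");
        office.put("address_state", "CA");
        office.put("address_type", "Office");

        System.out.println(transformRow(home, docSession));
        // {street_1=XYZ1, state_1=CA, type_1=home}
        System.out.println(transformRow(office, docSession));
        // {street_2=XYZ2, state_2=CA, type_2=Office}
    }
}
```

The schema would then need matching dynamic fields (for example patterns such as street_*, state_* and type_*), so that a query can pair street_2 with state_2 instead of mixing values from different addresses.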
Re: How to make Relationships work for Multi-valued Index Fields?
Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.

On Fri, Jan 23, 2009 at 1:24 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

On Fri, Jan 23, 2009 at 1:08 PM, Gunaranjan Chandraraju chandrar...@apple.com wrote:

<record>
  <coreInfo id="123" , .../>
  <address street="XYZ1" state="CA" ... type="home" />
  <address street="XYZ2" state="CA" ... type="Office" />
  <address street="XYZ3" state="CA" type="Other" />
</record>

I have set up my DIH to treat these as entities, as below:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="f" processor="FileListEntityProcessor" baseDir="***" fileName=".*xml" rootEntity="false" dataSource="null">
      <entity name="record" processor="XPathEntityProcessor" stream="false" forEach="/record" url="${f.fileAbsolutePath}">
        <field column="ID" xpath="/record/@id" />
        <!-- Address -->
        <entity name="record_adr" processor="XPathEntityProcessor" stream="false" forEach="/record/address" url="${f.fileAbsolutePath}">
          <field column="address_street" xpath="/record/address/@street" />
          <field column="address_state" xpath="/record/address//@state" />
          <field column="address_type" xpath="/record/address//@type" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

I think the only way is to create a dynamic field for each attribute (street, state, etc.). Write a transformer to copy the fields from your data config to an appropriately named dynamic field (e.g. street_1, state_1, etc.). To maintain this counter you will need to read and store it with Context#getSessionAttribute(name, Context.SCOPE_DOC) and Context#setSessionAttribute(name, val, Context.SCOPE_DOC). I can't think of an easier way.

-- Regards, Shalin Shekhar Mangar.