Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-26 Thread Alexander Ramos Jardim
Hey Gunaranjan,

I have the same scenario as you.

A Lucene index is denormalized, so it should not contain entity relationships.
When I need to do something like what you are doing, I group the related values
in one field.

Let's say we have 2 credit cards. The first has id 30459673 and taxes at
1.5%/month, and the second has id 56305 and taxes at 2.5%. What I do is
create a multivalued field in which I index each value as id ^ taxes. On the
client side I put the logic to parse the string into a convenient form to work
with the values. I hope that helps you.
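
A minimal sketch of the client-side part of this idea, assuming "^" is the separator and that each composite value is a plain string. The function names encode_card and decode_card are hypothetical and not part of Solr or Lucene:

```python
from typing import List, Tuple

SEPARATOR = "^"  # assumed delimiter; it must never occur inside an id or a rate


def encode_card(card_id: str, rate: str) -> str:
    """Build one value for the multivalued field, e.g. '30459673^1.5'."""
    return f"{card_id}{SEPARATOR}{rate}"


def decode_card(value: str) -> Tuple[str, str]:
    """Split a stored composite value back into (id, rate) on the client."""
    card_id, rate = value.split(SEPARATOR, 1)
    return card_id.strip(), rate.strip()


cards: List[Tuple[str, str]] = [("30459673", "1.5"), ("56305", "2.5")]

# Values that would be sent to the multivalued field at index time.
field_values = [encode_card(card_id, rate) for card_id, rate in cards]

# What the client would do with the values returned in a search response.
decoded = [decode_card(value) for value in field_values]
```

Because the id and the rate travel together in one value, the pairing survives even though the index itself knows nothing about the relationship.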

2009/1/25 Gunaranjan Chandraraju chandrar...@apple.com

 Paul
 It's not just about merging the fields or resource usage.  If you look at
 the scenario below, the issue is that it mixes up my fields (shipping and
 billing address, for instance).  I can't merge them and still keep the
 'distinction' for search.  Your case is a 'generalization' field, so
 the search will work.  I know mine is a trivial example and can be overcome
 by just two fields (shipping_address and billing_address), but I am
 talking about cases where we have many such 'groups of fields'.

 In general, such a one-to-many relationship for indices in a 'document' is
 also really, really common :).  Again, I am not trying to argue a point - I
 would be happy to get some ideas on how to do it and to be corrected if I'm
 wrong.

 Lastly (while that's not my main worry right now), I tend to be careful
 with resources. When dealing with very large data, I avoid any
 unnecessary overhead as far as possible and take every optimization I can get :)

 Guna


 On Jan 25, 2009, at 1:50 AM, Paul Libbrecht wrote:

  Guna,

 it's really, really normal to duplicate content so it can be merged into a field.

 We do this all the time, for example to have a field
 text-in-any-language alongside a field text-in-english, where the
 queries boost matches in text-in-any-language less than matches in
 text-in-english (if the user is in English).

 This difference in weighting is the gold of Lucene, I feel (and of retrieval
 generally).
 Also, depending on the field, you can index differently while still
 copying it in Solr (for example, use a different analyzer per language).

 paul

 PS: don't be scared about resources; this is the side of the world where
 resources are the least of the problems! (Typically a catch-all field
 wouldn't be stored, though, as that would then load the memory.)



Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-25 Thread Gunaranjan Chandraraju

Thanks.
This sounds redundant to me: storing the fields separately and then
concatenating all of them into one copy field again.

My XML is like this:
<address street="XYZ" state="CA" country="1" type="shipping" ... />

I am currently using XPath or XSL to separate them into individual
indexed fields like address_state_1, address_type_1, etc. in SOLR.

From what you say, it looks to me that I might as well just treat the
entire address as a single 'text field' and search within the text
after tokenizing.  This way I don't need the _1, _2 suffixes, as the
single text field will contain the information together (and thus
grouped, so I know which is shipping/billing etc.).  Will there be
any performance difference between this and the copy field approach?

Is there no other (programmatic) way to search across multiple
fields?  I did take a quick look at dismax, but again it needs the
field names to be specifically mentioned in the config file or in the
query.  I can't do this as I am not able to predict the number of
fields (e.g. how many credit cards a person can have).

I like SOLR, but to me this seems to be a very common and simple
search scenario/pattern, yet its implementation in SOLR appears to be
not very straightforward.  (My apologies if I am on the wrong track here
because I don't understand SOLR well.)

Regards,
Guna

Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-25 Thread Shalin Shekhar Mangar
On Sun, Jan 25, 2009 at 2:05 PM, Gunaranjan Chandraraju 
chandrar...@apple.com wrote:

 Thanks.
 This sounds redundant to me: storing the fields separately and then
 concatenating all of them into one copy field again.


Sometimes that may be the only way, for example if you want to facet on
some of those fields as well as search them all.



 My XML is like this:
 <address street="XYZ" state="CA" country="1" type="shipping" ... />

 I am currently using XPath or XSL to separate them into individual indexed
 fields like address_state_1, address_type_1, etc. in SOLR.

 From what you say, it looks to me that I might as well just treat the
 entire address as a single 'text field' and search within the text after
 tokenizing.  This way I don't need the _1, _2 suffixes, as the single text
 field will contain the information together (and thus grouped, so I know
 which is shipping/billing etc.).  Will there be any performance difference
 between this and the copy field approach?


No, I think one field may even be better, since you are creating fewer
fields. If you never need to do faceting and you don't need to get the
contents of each address field separately, this is your best option.



 Is there no other (programmatic) way to search across multiple fields?  I
 did take a quick look at dismax, but again it needs the field names to be
 specifically mentioned in the config file or in the query.  I can't do this
 as I am not able to predict the number of fields (e.g. how many credit cards
 a person can have).

 I like SOLR, but to me this seems to be a very common and simple search
 scenario/pattern, yet its implementation in SOLR appears to be not
 very straightforward.  (My apologies if I am on the wrong track here because
 I don't understand SOLR well.)


There has been some discussion on having wildcards in field names, but I
guess nobody contributed (or had the need for) the complete proposal. Copy
fields give a lot of flexibility, which is why most people use them.

http://wiki.apache.org/solr/FieldAliasesAndGlobsInParams

-- 
Regards,
Shalin Shekhar Mangar.


Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-24 Thread Fergus McMenemie
Hello,

I am also a newbie and wanted to do almost exactly the same thing.
I was planning on doing the equivalent of:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="***"
            fileName=".*xml"
            rootEntity="false"
            dataSource="null">
      <!-- ***changed***: rootEntity="false" on this entity -->
      <entity name="record"
              processor="XPathEntityProcessor"
              stream="false"
              rootEntity="false"
              forEach="/record"
              url="${f.fileAbsolutePath}">
        <!-- ***changed***: commonField="true" on the ID field -->
        <field column="ID" xpath="/record/@id" commonField="true" />
        <!-- Address -->
        <entity name="record_adr"
                processor="XPathEntityProcessor"
                stream="false"
                forEach="/record/address"
                url="${f.fileAbsolutePath}">
          <field column="address_street" xpath="/record/address/@street" />
          <field column="address_state"  xpath="/record/address//@state" />
          <field column="address_type"   xpath="/record/address//@type" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

ID is no longer unique within Solr. There would be multiple documents
with a given ID, one for each address. You can then search on ID and get
the three addresses, and you can also search on an address more sensibly.
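
To make the intended result concrete, here is a small sketch of the "one document per address" idea in plain Python, independent of DIH. The record layout and the helper name explode_record are assumptions made for illustration only:

```python
from typing import Dict, List


def explode_record(record: Dict) -> List[Dict[str, str]]:
    """Turn one source record into one flat document per address.

    The record ID is repeated on every document, so it is no longer a
    unique key, but street, state and type stay together per address.
    """
    return [
        {
            "ID": record["id"],
            "address_street": address["street"],
            "address_state": address["state"],
            "address_type": address["type"],
        }
        for address in record["addresses"]
    ]


# Hypothetical in-memory form of the sample record from this thread.
record = {
    "id": "123",
    "addresses": [
        {"street": "XYZ1", "state": "CA", "type": "home"},
        {"street": "XYZ2", "state": "CA", "type": "Office"},
        {"street": "XYZ3", "state": "CA", "type": "Other"},
    ],
}

documents = explode_record(record)
```

Searching on ID would then return three documents, and a query on street plus type can only match within a single address.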

I have not been able to try this yet as other issues are still to be
dealt with.

Comments?

Hi
I may be completely off on this, being new to SOLR, but I am not sure
how to index related groups of fields in a document and preserve
their 'grouping'.  I would appreciate any help on this.  A detailed
description of the problem is below.

I am trying to index an entity that can have multiple occurrences in
the same document, e.g. Address.  The address could be Shipping,
Home, Office etc.  Each address element has multiple values in it,
like street, state etc.  Thus each address element is a group, with
the state and street in one address element being related to each other.

It looks like this in my source XML:

<record>
  <coreInfo id="123" ... />
  <address street="XYZ1" state="CA" ... type="home" />
  <address street="XYZ2" state="CA" ... type="Office" />
  <address street="XYZ3" state="CA" type="Other" />
</record>

I have set up my DIH to treat these as entities, as below:

<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="***"
            fileName=".*xml"
            rootEntity="false"
            dataSource="null">
      <entity name="record"
              processor="XPathEntityProcessor"
              stream="false"
              forEach="/record"
              url="${f.fileAbsolutePath}">
        <field column="ID" xpath="/record/@id" />

        <!-- Address -->
        <entity name="record_adr"
                processor="XPathEntityProcessor"
                stream="false"
                forEach="/record/address"
                url="${f.fileAbsolutePath}">
          <field column="address_street" xpath="/record/address/@street" />
          <field column="address_state"  xpath="/record/address//@state" />
          <field column="address_type"   xpath="/record/address//@type" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>


The problem is as follows.  DIH seems to treat these as entities, but
Solr seems to flatten them out on indexing to fields in a document
(losing the entity part).

So when I search for an ID, in the response all the street fields
are bunched together, followed by all the state fields, type fields etc.
Thus I can't tell which street address corresponds to which
address type in the response.

What seems harder is this: say I need to query on 'Street' = XYZ1 and
type=Office.  This should NOT return a document, since the street for
the office address is XYZ2 and not XYZ1.  However, when I query for
address_street:XYZ1 and address_type:Office, I get back this document.

The problem seems to be that while DIH allows 'entities' within a
document, the SOLR schema does not preserve them - it 'flattens' all
of them out as indices for the document.
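
The false match described above can be reproduced with a small simulation. This is only a sketch of the matching logic in plain Python, not Solr code; flat_match and grouped_match are hypothetical helpers that model a flattened multi-valued document and a grouped one:

```python
from typing import Dict, List

addresses = [
    {"street": "XYZ1", "type": "home"},
    {"street": "XYZ2", "type": "Office"},
    {"street": "XYZ3", "type": "Other"},
]

# What the index effectively holds after flattening: two independent
# multi-valued fields with no link between positions.
flattened: Dict[str, List[str]] = {
    "address_street": [a["street"] for a in addresses],
    "address_type": [a["type"] for a in addresses],
}


def flat_match(doc: Dict[str, List[str]], street: str, addr_type: str) -> bool:
    """Each clause is checked against its own field, independently."""
    return street in doc["address_street"] and addr_type in doc["address_type"]


def grouped_match(groups: List[Dict[str, str]], street: str, addr_type: str) -> bool:
    """Both clauses must hold within the same address group."""
    return any(g["street"] == street and g["type"] == addr_type for g in groups)


false_positive = flat_match(flattened, "XYZ1", "Office")   # matches, although it should not
expected = grouped_match(addresses, "XYZ1", "Office")      # no single address satisfies both
```

The flattened document matches because XYZ1 and Office each occur somewhere in their own field, even though they belong to different addresses.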

I could work around the problem by creating SOLR fields like
home_address_street and office_address_street and doing some XPath
mapping.  However, I don't want to do that, as we can have multiple
'other' addresses.  Also, I have other fields whose type is not as easily
distinguished as address.

As I mentioned, being new to SOLR I might have completely goofed on the
way to set it up - I would much appreciate any direction on it. I am using
SOLR 1.3.

Regards,
Guna

-- 

===
Fergus 

Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
Nesting an XPathEntityProcessor inside another XPathEntityProcessor
is possible only if a field in the XML is a filename/URL.
What is the purpose of nesting like this?
Is it because you have multiple addresses? The possible solutions are
discussed elsewhere in this thread.


Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Fergus,
XPathEntityProcessor can read multivalued fields easily,

e.g.
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="***"
            fileName=".*xml"
            rootEntity="false"
            dataSource="null">
      <entity name="record"
              processor="XPathEntityProcessor"
              forEach="/record"
              url="${f.fileAbsolutePath}">
        <field column="ID" xpath="/record/@id" commonField="true" />
        <field column="address_street" xpath="/record/address/@street" />
        <field column="address_state"  xpath="/record/address/@state" />
        <field column="address_type"   xpath="/record/address/@type" />
      </entity>
    </entity>
  </document>
</dataConfig>


In this case address_street, address_state and address_type will each be
returned as a separate list while parsing. If you wish to put them into
multiple fields, you can write a transformer that iterates through the
lists and puts the items into separate fields. If there are 3 address tags,
then you get a List<String> for each field, where the length of the
list == 3. If an item is missing, it will be added as a null.
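
As a rough illustration of what such a transformer would do, here is the list-to-numbered-field step in plain Python. It is not the DIH Transformer API; the row layout and the name number_fields are assumptions, and nulls are simply skipped:

```python
from typing import Any, Dict


def number_fields(row: Dict[str, Any]) -> Dict[str, Any]:
    """Spread each list-valued column over numbered fields.

    'address_street': ['XYZ1', 'XYZ2'] becomes address_street_1 and
    address_street_2. Because the lists are parallel, the same suffix
    always refers to the same address. Missing items (None) are skipped.
    """
    out: Dict[str, Any] = {}
    for column, values in row.items():
        if isinstance(values, list):
            for position, value in enumerate(values, start=1):
                if value is not None:
                    out[f"{column}_{position}"] = value
        else:
            out[column] = values
    return out


# Hypothetical row as it might look after parsing three address tags.
row = {
    "ID": "123",
    "address_street": ["XYZ1", "XYZ2", "XYZ3"],
    "address_state": ["CA", None, "CA"],
    "address_type": ["home", "Office", "Other"],
}

numbered = number_fields(row)
```

Keeping the null placeholders in the parsed lists is what makes the positions line up, so address_type_2 and address_street_2 describe the same address.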

Ensure that the fields are marked as multiValued="true" in
schema.xml; otherwise it does not return a List<String>. If there is
no corresponding mapping in schema.xml, you can explicitly put it here
in the dataconfig.xml, e.g.:
<field column="address_state" multiValued="true" xpath="/record/address/@state" />


I saw the syntax '/record/address//@state'. '//' is not supported.
You will have to explicitly give the full path.
--Noble




Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-24 Thread Gunaranjan Chandraraju
I made this approach work with XPath and XSL.  However, this approach
creates multiple fields like this:


address_state_1
address_state_2
...
address_state_10

and

credit_card_1
credit_card_2
credit_card_3


How do I search for a credit_card?  The query syntax does not seem
to support wildcards in field names.  For example, I can't seem to do
this:   credit_card*:1234 4567 7890 1234


On the search side I would not know how many credit card fields got
created for a document, so I need that to be dynamic.


-g


On Jan 22, 2009, at 11:54 PM, Shalin Shekhar Mangar wrote:


Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.

On Fri, Jan 23, 2009 at 1:24 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:



I think the only way is to create a dynamic field for each attribute
(street, state etc.). Write a transformer to copy the fields from
your data config to appropriately named dynamic fields (e.g. street_1,
state_1, etc.).

To maintain this counter you will need to get/store it with
Context#getSessionAttribute(name, Context.SCOPE_DOC) and
Context#setSessionAttribute(name, val, Context.SCOPE_DOC).

I can't think of an easier way.
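
The counter idea can be sketched outside of DIH as follows. This is plain Python, not the DataImportHandler API: DocSession is a hypothetical stand-in for document-scoped session attributes, and transform_row stands in for the transformer that renames each address row:

```python
from typing import Any, Dict, Optional


class DocSession:
    """Hypothetical stand-in for per-document session attributes."""

    def __init__(self) -> None:
        self._attrs: Dict[str, Any] = {}

    def get(self, name: str) -> Optional[Any]:
        return self._attrs.get(name)

    def set(self, name: str, value: Any) -> None:
        self._attrs[name] = value


def transform_row(row: Dict[str, str], session: DocSession) -> Dict[str, str]:
    """Rename the columns of one address row using a per-document counter."""
    count = (session.get("address_count") or 0) + 1
    session.set("address_count", count)
    return {f"{column}_{count}": value for column, value in row.items()}


session = DocSession()  # one session per document being built
rows = [
    {"street": "XYZ1", "state": "CA"},
    {"street": "XYZ2", "state": "CA"},
]

document: Dict[str, str] = {}
for row in rows:
    document.update(transform_row(row, session))
```

The counter has to be scoped to the document so that numbering restarts at 1 for every new record.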
--
Regards,
Shalin Shekhar Mangar.





--
Regards,
Shalin Shekhar Mangar.




Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
For searching, you need to put them in a single field. Use copyField
in schema.xml to achieve that.
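
A rough simulation of the effect, in plain Python rather than schema.xml: every field whose name matches a pattern is copied into one catch-all field that can then be searched by a single name. The glob credit_card_* and the destination name all_credit_cards are assumptions for illustration, and apply_copy_field is a hypothetical helper:

```python
from fnmatch import fnmatchcase
from typing import Any, Dict


def apply_copy_field(doc: Dict[str, Any], source_glob: str, dest: str) -> Dict[str, Any]:
    """Return a copy of doc with all matching field values gathered in dest."""
    copied = [value for name, value in doc.items() if fnmatchcase(name, source_glob)]
    result = dict(doc)
    result[dest] = copied  # behaves like a multi-valued catch-all field
    return result


doc = {
    "ID": "123",
    "credit_card_1": "1234 4567 7890 1234",
    "credit_card_2": "1111 2222 3333 4444",
    "credit_card_3": "5555 6666 7777 8888",
}

indexed = apply_copy_field(doc, "credit_card_*", "all_credit_cards")

# The search side only needs the one field name, whatever the card count.
found = "1234 4567 7890 1234" in indexed["all_credit_cards"]
```

The numbered fields stay available for display, while queries go against the single combined field.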


-- 
--Noble Paul


Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-23 Thread Gunaranjan Chandraraju


I thought 1.3 supported dynamic fields in schema.xml?

Guna

On Jan 22, 2009, at 11:54 PM, Shalin Shekhar Mangar wrote:


Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.





Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-23 Thread Shalin Shekhar Mangar
Yes, Solr does. But the DataImportHandler in the 1.3 release does not
support it.

However, you can use the trunk data import handler jar with Solr 1.3 if you
do not feel comfortable using Solr 1.4 trunk.

On Fri, Jan 23, 2009 at 1:36 PM, Gunaranjan Chandraraju 
chandrar...@apple.com wrote:


 I thought 1.3 supported dynamic fields in schema.xml?

 Guna


 On Jan 22, 2009, at 11:54 PM, Shalin Shekhar Mangar wrote:

  Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.


-- 
Regards,
Shalin Shekhar Mangar.


Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-22 Thread Shalin Shekhar Mangar
On Fri, Jan 23, 2009 at 1:08 PM, Gunaranjan Chandraraju 
chandrar...@apple.com wrote:


 <record>
   <coreInfo id="123" , ... />
   <address street="XYZ1" State="CA" ... type="home" />
   <address street="XYZ2" state="CA" ... type="Office" />
   <address street="XYZ3" state="CA" type="Other" />
 </record>

 I have set up my DIH to treat these as entities as below

 <dataConfig>
   <dataSource type="FileDataSource" encoding="UTF-8" />
   <document>
     <entity name="f" processor="FileListEntityProcessor"
             baseDir="***"
             fileName=".*xml"
             rootEntity="false"
             dataSource="null">
       <entity name="record"
               processor="XPathEntityProcessor"
               stream="false"
               forEach="/record"
               url="${f.fileAbsolutePath}">
         <field column="ID" xpath="/record/@id" />

         <!-- Address -->
         <entity name="record_adr"
                 processor="XPathEntityProcessor"
                 stream="false"
                 forEach="/record/address"
                 url="${f.fileAbsolutePath}">
           <field column="address_street" xpath="/record/address/@street" />
           <field column="address_state" xpath="/record/address//@state" />
           <field column="address_type" xpath="/record/address//@type" />
         </entity>
       </entity>
     </entity>
   </document>
 </dataConfig>


I think the only way is to create a dynamic field for each attribute
(street, state etc.). Write a transformer to copy the fields from your data
config to appropriately named dynamic fields (e.g. street_1, state_1, etc.).
To maintain the counter behind those numeric suffixes you will need to
get/store it with
Context#getSessionAttribute(name, Context.SCOPE_DOC) and
Context#setSessionAttribute(name, val, Context.SCOPE_DOC).

I can't think of an easier way.
-- 
Regards,
Shalin Shekhar Mangar.
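
The numbering idea in the reply above can be sketched roughly as follows.
This is only an illustration of the counter logic, not a real DIH
transformer: it uses plain Java maps so that it runs without the Solr
jars. The class name, the COUNTER_KEY name, and the docScope map (a
stand-in for the document-scoped session attributes on Context) are all
assumptions made for the example. It also appends the number to the
existing column names (address_street_1) rather than using the shorter
street_1 form mentioned in the reply.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class AddressNumberingSketch {

    // Hypothetical key under which the per-document counter is kept.
    static final String COUNTER_KEY = "address_counter";

    /**
     * Renames every column of one address row by appending a running
     * number, so that the attributes of the same address share a suffix.
     * docScope stands in for the document-scoped session attributes
     * (an assumption; a real transformer would go through Context).
     */
    static Map<String, Object> transformRow(Map<String, Object> row,
                                            Map<String, Object> docScope) {
        // Read the counter stored for the current document (none yet = 0).
        Integer previous = (Integer) docScope.get(COUNTER_KEY);
        int current = (previous == null ? 0 : previous) + 1;
        // Store it back so the next address row of this document sees it.
        docScope.put(COUNTER_KEY, current);

        Map<String, Object> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            renamed.put(e.getKey() + "_" + current, e.getValue());
        }
        return renamed;
    }

    public static void main(String[] args) {
        // One docScope per document; a new document would start a new map.
        Map<String, Object> docScope = new HashMap<>();
        String[][] addresses = {
            {"XYZ1", "CA", "home"},
            {"XYZ2", "CA", "Office"},
            {"XYZ3", "CA", "Other"},
        };
        for (String[] a : addresses) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("address_street", a[0]);
            row.put("address_state", a[1]);
            row.put("address_type", a[2]);
            System.out.println(transformRow(row, docScope));
        }
    }
}
```

With names like these, the street and type of the same address end up
under the same suffix (address_street_2 with address_type_2), which is
what keeps the grouping intact. The schema would presumably need
matching dynamic field patterns for the suffixed names.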


Re: How to make Relationships work for Multi-valued Index Fields?

2009-01-22 Thread Shalin Shekhar Mangar
Oops, one more gotcha. The dynamic field support is only in 1.4 trunk.


-- 
Regards,
Shalin Shekhar Mangar.