Shawn,
Thanks for your response.
Due to security requirements, I do need the name and domain parts of the email 
address stored in separate Lucene indexes.
How do you recommend doing this?  What are the challenges?
Once the name and domain parts of the email address are in different Lucene 
indexes, would I need to modify my Solr search string?
Thanks,
Roger


-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Friday, June 20, 2014 10:19 AM
To: solr-user@lucene.apache.org
Subject: Re: Indexing a term into separate Lucene indexes

On 6/19/2014 4:51 PM, Huang, Roger wrote:
> If I have documents with a person and his email address: 
> u...@domain.com
>
> How can I configure Solr (4.6) so that the email address source field 
> is indexed as
>
> - the user part of the address (e.g., "user") is in Lucene index X
>
> - the domain part of the address (e.g., "domain.com") is in a separate
>   Lucene index Y
>
> I would like to be able to search as follows:
>
> - Find all people whose email addresses have user part = "userXyz"
>
> - Find all people whose email addresses have domain part = "domainABC.com"
>
> - Find the person with exact email address = "user...@domainabc.com"
>
> Would I use a <copyField> declaration in my schema?
> http://wiki.apache.org/solr/SchemaXml#Copy_Fields

I don't think you actually want the data to end up in entirely different 
indexes.  Although it is possible to search more than one separate index, 
that's very likely NOT what you want to do, and it comes with its own 
challenges.  What you most likely want is to put this data into different 
fields within the same index.
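
As a rough illustration of how searches might look with separate fields in one index, here is a minimal sketch that builds field-scoped query strings. The field names email_user and email_domain are hypothetical, not something from your schema, and how matching behaves (case sensitivity, for example) depends on the field type you define.

```python
def field_query(field, value):
    """Build a Solr phrase query restricted to a single field."""
    # Escape backslashes and double quotes so the value stays inside the phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '{}:"{}"'.format(field, escaped)


# Hypothetical field names; substitute whatever your schema defines.
print(field_query("email_user", "userXyz"))
print(field_query("email_domain", "domainABC.com"))
```

The resulting strings would go in the q (or fq) parameter of the request, so the search string changes only in that it names the field to search.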

You'll need to write custom code to accomplish this, especially if you need the 
stored data to contain only the parts rather than the complete email address.  
A copyField can get the data to additional fields, but I'm not aware of 
anything built-in to the schema that can trim the unwanted information from the 
new fields, and even if there is, any stored data will be the original data for 
all three fields.  It's up to you whether this custom code is in a user 
application that does your indexing or in a custom update processor that you 
load as a plugin to Solr itself.  Extending whatever user application you are 
already using for indexing is very likely to be a lot easier.
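
If the splitting lives in the indexing application, it could be as small as the sketch below. This is only an assumption about what such code might look like: the function name split_email and the field names email_user and email_domain are made up for illustration, and the returned dictionary would still need to be merged into whatever document your application sends to Solr.

```python
def split_email(address):
    """Split an email address into user and domain parts for separate fields."""
    cleaned = address.strip()
    # Split on the last "@" so the domain is whatever follows it.
    user, sep, domain = cleaned.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError("not a usable email address: {!r}".format(address))
    # Only the parts are returned, so the complete address is never stored.
    # Field names here are hypothetical.
    return {"email_user": user, "email_domain": domain}


print(split_email("userXyz@domainABC.com"))
```

Because only the two parts are returned, the complete address never reaches the index unless you choose to add it as a third field.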

Thanks,
Shawn
