Re: Relational data

2012-03-12 Thread André Maldonado
Thanks, Ahmet and Tomás. It worked like a charm.

--
"And you shall know the truth, and the truth shall set you free." (John 8:32)

andre.maldonado@gmail.com
(11) 9112-4227

On Mon, Mar 12, 2012 at 2:54 PM, Ahmet Arslan  wrote:

> > The problem is that the same house can have different prices for
> > different dates.
> >
> > If I denormalize this data, the same house will show up multiple
> > times in the result set, and I don't want that.
> >
> > So, for example:
> >
> > House  Holiday      Price per day
> > 1      Xmas         $ 75.00
> > 1      July 4       $ 50.00
> > 1      Valentine's  $ 15.00
> > 2      Xmas         $ 50.00
> > 2      July 4       $ 10.00
> >
> > If I query for all data, I'll get 3 documents for the same house
> > (house 1), but I just want to show it once to the end user.
> >
> > Is there some way to do this in Solr (without processing it in my
> > app)?
>
>
> http://wiki.apache.org/solr/FieldCollapsing could work.
>


Re: Relational data

2012-03-12 Thread Tomás Fernández Löbbe
You could use the grouping feature, depending on your needs:
http://wiki.apache.org/solr/FieldCollapsing
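
For reference, a rough sketch of what a grouped request for the house example might look like. The core URL, the query, and the field name `house_id` are assumptions made up for illustration; `group`, `group.field`, and `group.limit` are the result grouping parameters described on the FieldCollapsing wiki page, and whether they are available depends on the Solr version in use.

```python
from urllib.parse import urlencode

def grouped_query_url(base_url, query, group_field, per_group=1):
    """Build a Solr select URL that collapses results on one field."""
    params = [
        ("q", query),
        ("group", "true"),             # turn result grouping on
        ("group.field", group_field),  # one group per distinct value of this field
        ("group.limit", per_group),    # documents returned inside each group
        ("wt", "json"),
    ]
    return base_url + "/select?" + urlencode(params)

# Hypothetical core URL and field names, for illustration only.
url = grouped_query_url("http://localhost:8983/solr", "holiday:Xmas", "house_id")
print(url)
```

With one document per house/holiday pair and `group.limit=1`, each house should appear once in the grouped response, no matter how many price rows it has.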

2012/3/12 André Maldonado 

> Hi.
>
> I need to set up an index that has relational data. This index will be for
> houses to rent, where the user will search by date, price, holidays (by
> name), etc.
>
> The problem is that the same house can have different prices for different
> dates.
>
> If I denormalize this data, the same house will show up multiple times in
> the result set, and I don't want that.
>
> So, for example:
>
> House  Holiday      Price per day
> 1      Xmas         $ 75.00
> 1      July 4       $ 50.00
> 1      Valentine's  $ 15.00
> 2      Xmas         $ 50.00
> 2      July 4       $ 10.00
>
> If I query for all data, I'll get 3 documents for the same house (house 1),
> but I just want to show it once to the end user.
>
> Is there some way to do this in Solr (without processing it in my app)?
>
> Thanks
>


Re: Relational data

2012-03-12 Thread Ahmet Arslan
> The problem is that the same house can have different prices for
> different dates.
>
> If I denormalize this data, the same house will show up multiple
> times in the result set, and I don't want that.
>
> So, for example:
>
> House  Holiday      Price per day
> 1      Xmas         $ 75.00
> 1      July 4       $ 50.00
> 1      Valentine's  $ 15.00
> 2      Xmas         $ 50.00
> 2      July 4       $ 10.00
>
> If I query for all data, I'll get 3 documents for the same house
> (house 1), but I just want to show it once to the end user.
>
> Is there some way to do this in Solr (without processing it in my
> app)?


http://wiki.apache.org/solr/FieldCollapsing could work.


Relational data

2012-03-12 Thread André Maldonado
Hi.

I need to set up an index that has relational data. This index will be for
houses to rent, where the user will search by date, price, holidays (by
name), etc.

The problem is that the same house can have different prices for different
dates.

If I denormalize this data, the same house will show up multiple times in
the result set, and I don't want that.

So, for example:

House  Holiday      Price per day
1      Xmas         $ 75.00
1      July 4       $ 50.00
1      Valentine's  $ 15.00
2      Xmas         $ 50.00
2      July 4       $ 10.00

If I query for all data, I'll get 3 documents for the same house (house 1),
but I just want to show it once to the end user.

Is there some way to do this in Solr (without processing it in my app)?

Thanks



Re: how to handle large relational data in Solr

2011-10-23 Thread Erick Erickson
In addition to Otis' suggestion, think about using multivalued fields
with a position increment gap of, say, 100 (assuming each accessory
entry is shorter than 100 terms). Then proximity searches with a slop
< 100 (e.g. "red swing"~90) would not match across your multiple
entries.
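
To make the gap idea a bit more concrete, here is a small stand-alone model of it. This is not Solr code: it only imitates how token positions are assigned across the values of a multivalued field (in Solr the gap is set with the `positionIncrementGap` attribute on the field type) and how a two-term sloppy phrase compares positions. The tokenizer (lowercase, split on whitespace) and the function names are assumptions for illustration.

```python
def index_positions(values, gap):
    """Assign token positions across the values of a multivalued field.

    Each new value starts `gap` positions further on, which imitates a
    position increment gap between field values.
    """
    positions = {}
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # jump ahead between values
        for token in value.lower().split():
            pos += 1
            positions.setdefault(token, []).append(pos)
    return positions

def sloppy_match(positions, first, second, slop):
    """Simplified two-term sloppy phrase check for the query "first second"~slop.

    The distance is the spread of (position - query offset), so an exact
    in-order phrase has distance 0 and a swapped adjacent pair has distance 2.
    """
    for p1 in positions.get(first, []):
        for p2 in positions.get(second, []):
            if abs(p1 - p2 + 1) <= slop:
                return True
    return False

accessories = ["Blue Swing", "Red Ball"]
with_gap = index_positions(accessories, gap=100)
without_gap = index_positions(accessories, gap=0)

print(sloppy_match(with_gap, "red", "swing", 90))     # gap keeps the two values apart
print(sloppy_match(without_gap, "red", "swing", 90))  # no gap: terms from different values match
```

In this model the "Blue Swing" / "Red Ball" product stops matching "red swing"~90 once the gap is larger than the slop, while a product that really has a "Red Swing" accessory still matches.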

If this is clear as mud, write back with what you've tried and maybe we can help

Best
Erick

On Thu, Oct 20, 2011 at 7:23 PM, Jonathan Carothers
 wrote:
> Actually, that's the root of my concern.  It looks like each product will
> average ~20,000 associated accessories, still workable, but starting to look 
> painful.  Coming back the other way, I would guess each accessory would be 
> associated with 100 products on average.
>
> Given that there would be searchable fields in both the product and accessory 
> data, I assume I would have to either split them  into separate indexes and 
> merge the results, or have one document per product/accessory combo so that I 
> don't get a mix of accessories matching the search term.  For example, if a 
> product had two accessories, one with the description of "Blue Swing" and 
> another with "Red Ball" and I did a search for "Red Swing" it would rank 
> about the same as a document that actually had a "Red Swing".
>
> So it sounds like you are suggesting the external map, in which case is there 
> a good way to merge the two searches?  Basically one search on product
> attributes and a second search on the attributes of related accessories?
>
> many thanks,
> Jonathan
> 
> From: Robert Stewart [bstewart...@gmail.com]
> Sent: Thursday, October 20, 2011 12:05 PM
> To: solr-user@lucene.apache.org
> Subject: Re: how to handle large relational data in Solr
>
> If your "documents" are products, then 100,000 documents is a pretty small 
> index for solr.  Do you know approximately how many accessories are related 
> to each product on average?  If # is relatively small (around 100 or less),
> then it should be ok to create product documents with all the related 
> accessories as fields on the document, something like:
>
> 
>        PRODUCT_ID
>        PRODUCT_NAME
>        accessory one
>        accessory two
>        
>        accessory N
> 
>
>
> And then you can search for products by accessory, and show accessory facets 
> over products, etc.
>
> Even if # of accessories per product is large (1000 or more), you can still 
> do it this way, but it may be better to store some small accessory ID as 
> integers instead of larger names, and maybe use some external mapping to 
> resolve names for search and display.
>
> Bob
>
>
> On Oct 20, 2011, at 11:08 AM, Jonathan Carothers wrote:
>
>> Agreed, this will just be a read only view of the existing database for 
>> search purposes.  Sorry for the confusion.
>> 
>> From: Brandon Ramirez [brandon_rami...@elementk.com]
>> Sent: Thursday, October 20, 2011 10:50 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: how to handle large relational data in Solr
>>
>> I would not recommend removing your relational database altogether.  You 
>> should treat that as your system of record.  By replacing it, you are 
>> forcing Solr to store the unmodified value for everything even when not 
>> needed.  You also lose normalization.   And if you ever need to add some 
>> data to your system that isn't search-related, you have no choice but to add 
>> it to your search index.
>>
>>
>> Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
>> Software Engineer II | Element K | www.elementk.com
>>
>>
>> -----Original Message-----
>> From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
>> Sent: Thursday, October 20, 2011 10:12 AM
>> To: solr-user@lucene.apache.org
>> Subject: how to handle large relational data in Solr
>>
>> All,
>>
>> We are attempting to convert a fairly large relational database into Solr 
>> index(es).
>>
>> There are ~100,000 products with ~1,000,000 accessories that can be related 
>> to any number of the products.  So if I include the search terms and the 
>> relationships in the same index, we're looking at a pretty huge index.
>>
>> If we break it out into three indexes, one for the product search, one for 
>> the accessories search, and one for their relationship, is there a good way 
>> to merge the results?
>>
>> Is there a better way to structure the indexes?
>>
>> We will have a relational database available if it makes sense to do some 
>> sort of a hybrid approach.
>>
>> many thanks,
>> Jonathan
>>
>
>


Re: how to handle large relational data in Solr

2011-10-20 Thread Otis Gospodnetic
Hi Jonathan,

Not sure which version of Solr you are using, but look into Join functionality 
- hit #1: http://search-lucene.com/?q=join&fc_project=Solr
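
A rough sketch of what a join query could look like for the product/accessory case, assuming a Solr version that ships the join query parser. The field names (`product_id` on accessory documents, `id` on product documents, `description`) are made up for illustration and do not come from this thread.

```python
from urllib.parse import urlencode

def join_query(from_field, to_field, inner_query):
    """Build a Solr join query string.

    Documents matching inner_query contribute their from_field values;
    the result is the set of documents whose to_field holds one of them.
    """
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)

# Hypothetical schema: accessory docs carry the ids of related products.
q = join_query("product_id", "id", 'description:"red swing"')
query_string = urlencode({"q": q})
print(q)
print(query_string)
```

Read this as: find accessories whose description matches "red swing", then return the products those accessories point at.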

Otis


Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


>
>From: Jonathan Carothers 
>To: "solr-user@lucene.apache.org" 
>Sent: Thursday, October 20, 2011 1:23 PM
>Subject: RE: how to handle large relational data in Solr
>
>Actually, that's the root of my concern.  It looks like each product will
>average ~20,000 associated accessories, still workable, but starting to look 
>painful.  Coming back the other way, I would guess each accessory would be 
>associated with 100 products on average.
>
>Given that there would be searchable fields in both the product and accessory 
>data, I assume I would have to either split them  into separate indexes and 
>merge the results, or have one document per product/accessory combo so that I 
>don't get a mix of accessories matching the search term.  For example, if a 
>product had two accessories, one with the description of "Blue Swing" and 
>another with "Red Ball" and I did a search for "Red Swing" it would rank about 
>the same as a document that actually had a "Red Swing".
>
>So it sounds like you are suggesting the external map, in which case is there 
>a good way to merge the two searches?  Basically one search on product
>attributes and a second search on the attributes of related accessories?
>
>many thanks,
>Jonathan
>
>From: Robert Stewart [bstewart...@gmail.com]
>Sent: Thursday, October 20, 2011 12:05 PM
>To: solr-user@lucene.apache.org
>Subject: Re: how to handle large relational data in Solr
>
>If your "documents" are products, then 100,000 documents is a pretty small 
>index for solr.  Do you know approximately how many accessories are related to 
>each product on average?  If # is relatively small (around 100 or less), then
>it should be ok to create product documents with all the related accessories 
>as fields on the document, something like:
>
>
>        PRODUCT_ID
>        PRODUCT_NAME
>        accessory one
>        accessory two
>        
>        accessory N
>
>
>
>And then you can search for products by accessory, and show accessory facets 
>over products, etc.
>
>Even if # of accessories per product is large (1000 or more), you can still do 
>it this way, but it may be better to store some small accessory ID as integers 
>instead of larger names, and maybe use some external mapping to resolve names 
>for search and display.
>
>Bob
>
>
>On Oct 20, 2011, at 11:08 AM, Jonathan Carothers wrote:
>
>> Agreed, this will just be a read only view of the existing database for 
>> search purposes.  Sorry for the confusion.
>> 
>> From: Brandon Ramirez [brandon_rami...@elementk.com]
>> Sent: Thursday, October 20, 2011 10:50 AM
>> To: solr-user@lucene.apache.org
>> Subject: RE: how to handle large relational data in Solr
>>
>> I would not recommend removing your relational database altogether.  You 
>> should treat that as your system of record.  By replacing it, you are 
>> forcing Solr to store the unmodified value for everything even when not 
>> needed.  You also lose normalization.   And if you ever need to add some 
>> data to your system that isn't search-related, you have no choice but to add 
>> it to your search index.
>>
>>
>> Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
>> Software Engineer II | Element K | www.elementk.com
>>
>>
>> -----Original Message-----
>> From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
>> Sent: Thursday, October 20, 2011 10:12 AM
>> To: solr-user@lucene.apache.org
>> Subject: how to handle large relational data in Solr
>>
>> All,
>>
>> We are attempting to convert a fairly large relational database into Solr 
>> index(es).
>>
>> There are ~100,000 products with ~1,000,000 accessories that can be related 
>> to any number of the products.  So if I include the search terms and the 
>> relationships in the same index, we're looking at a pretty huge index.
>>
>> If we break it out into three indexes, one for the product search, one for 
>> the accessories search, and one for their relationship, is there a good way 
>> to merge the results?
>>
>> Is there a better way to structure the indexes?
>>
>> We will have a relational database available if it makes sense to do some 
>> sort of a hybrid approach.
>>
>> many thanks,
>> Jonathan
>>
>
>
>
>

RE: how to handle large relational data in Solr

2011-10-20 Thread Jonathan Carothers
Actually, that's the root of my concern.  It looks like each product will average
~20,000 associated accessories, still workable, but starting to look painful.  
Coming back the other way, I would guess each accessory would be associated 
with 100 products on average.

Given that there would be searchable fields in both the product and accessory 
data, I assume I would have to either split them  into separate indexes and 
merge the results, or have one document per product/accessory combo so that I 
don't get a mix of accessories matching the search term.  For example, if a 
product had two accessories, one with the description of "Blue Swing" and 
another with "Red Ball" and I did a search for "Red Swing" it would rank about 
the same as a document that actually had a "Red Swing".

So it sounds like you are suggesting the external map, in which case is there a 
good way to merge the two searches?  Basically one search on product attributes
and a second search on the attributes of related accessories?

many thanks,
Jonathan

From: Robert Stewart [bstewart...@gmail.com]
Sent: Thursday, October 20, 2011 12:05 PM
To: solr-user@lucene.apache.org
Subject: Re: how to handle large relational data in Solr

If your "documents" are products, then 100,000 documents is a pretty small 
index for solr.  Do you know approximately how many accessories are related to 
each product on average?  If # is relatively small (around 100 or less), then
it should be ok to create product documents with all the related accessories as 
fields on the document, something like:


PRODUCT_ID
PRODUCT_NAME
accessory one
accessory two

accessory N



And then you can search for products by accessory, and show accessory facets 
over products, etc.

Even if # of accessories per product is large (1000 or more), you can still do 
it this way, but it may be better to store some small accessory ID as integers 
instead of larger names, and maybe use some external mapping to resolve names 
for search and display.

Bob


On Oct 20, 2011, at 11:08 AM, Jonathan Carothers wrote:

> Agreed, this will just be a read only view of the existing database for 
> search purposes.  Sorry for the confusion.
> 
> From: Brandon Ramirez [brandon_rami...@elementk.com]
> Sent: Thursday, October 20, 2011 10:50 AM
> To: solr-user@lucene.apache.org
> Subject: RE: how to handle large relational data in Solr
>
> I would not recommend removing your relational database altogether.  You 
> should treat that as your system of record.  By replacing it, you are forcing 
> Solr to store the unmodified value for everything even when not needed.  You 
> also lose normalization.   And if you ever need to add some data to your 
> system that isn't search-related, you have no choice but to add it to your 
> search index.
>
>
> Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
> Software Engineer II | Element K | www.elementk.com
>
>
> -----Original Message-----
> From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
> Sent: Thursday, October 20, 2011 10:12 AM
> To: solr-user@lucene.apache.org
> Subject: how to handle large relational data in Solr
>
> All,
>
> We are attempting to convert a fairly large relational database into Solr 
> index(es).
>
> There are ~100,000 products with ~1,000,000 accessories that can be related 
> to any number of the products.  So if I include the search terms and the 
> relationships in the same index, we're looking at a pretty huge index.
>
> If we break it out into three indexes, one for the product search, one for 
> the accessories search, and one for their relationship, is there a good way 
> to merge the results?
>
> Is there a better way to structure the indexes?
>
> We will have a relational database available if it makes sense to do some 
> sort of a hybrid approach.
>
> many thanks,
> Jonathan
>



Re: how to handle large relational data in Solr

2011-10-20 Thread Robert Stewart
If your "documents" are products, then 100,000 documents is a pretty small 
index for solr.  Do you know approximately how many accessories are related to 
each product on average?  If # is relatively small (around 100 or less), then
it should be ok to create product documents with all the related accessories as 
fields on the document, something like:


PRODUCT_ID
PRODUCT_NAME
accessory one
accessory two

accessory N



And then you can search for products by accessory, and show accessory facets 
over products, etc. 
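
As an illustration of the document layout described above, here is a small sketch that builds one product document in Solr's XML update format, with a multivalued accessory field. The field names `id`, `name`, and `accessory` and the sample values are assumptions; the tags of the original example did not survive in this archive.

```python
import xml.etree.ElementTree as ET

def product_doc(product_id, name, accessories):
    """Build a <doc> element: one product with its accessories as repeated fields."""
    doc = ET.Element("doc")

    def add(field, value):
        el = ET.SubElement(doc, "field", name=field)
        el.text = str(value)

    add("id", product_id)
    add("name", name)
    for accessory in accessories:
        add("accessory", accessory)  # multivalued: one <field> per accessory
    return doc

# Hypothetical product, for illustration only.
add_request = ET.Element("add")
add_request.append(product_doc(42, "Swing Set", ["Blue Swing", "Red Ball"]))
xml_body = ET.tostring(add_request, encoding="unicode")
print(xml_body)
```

The `accessory` field would need to be declared multivalued in the schema for this to index as intended.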

Even if # of accessories per product is large (1000 or more), you can still do 
it this way, but it may be better to store some small accessory ID as integers 
instead of larger names, and maybe use some external mapping to resolve names 
for search and display.  

Bob


On Oct 20, 2011, at 11:08 AM, Jonathan Carothers wrote:

> Agreed, this will just be a read only view of the existing database for 
> search purposes.  Sorry for the confusion.
> 
> From: Brandon Ramirez [brandon_rami...@elementk.com]
> Sent: Thursday, October 20, 2011 10:50 AM
> To: solr-user@lucene.apache.org
> Subject: RE: how to handle large relational data in Solr
> 
> I would not recommend removing your relational database altogether.  You 
> should treat that as your system of record.  By replacing it, you are forcing 
> Solr to store the unmodified value for everything even when not needed.  You 
> also lose normalization.   And if you ever need to add some data to your 
> system that isn't search-related, you have no choice but to add it to your 
> search index.
> 
> 
> Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
> Software Engineer II | Element K | www.elementk.com
> 
> 
> -----Original Message-----
> From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
> Sent: Thursday, October 20, 2011 10:12 AM
> To: solr-user@lucene.apache.org
> Subject: how to handle large relational data in Solr
> 
> All,
> 
> We are attempting to convert a fairly large relational database into Solr 
> index(es).
> 
> There are ~100,000 products with ~1,000,000 accessories that can be related 
> to any number of the products.  So if I include the search terms and the 
> relationships in the same index, we're looking at a pretty huge index.
> 
> If we break it out into three indexes, one for the product search, one for 
> the accessories search, and one for their relationship, is there a good way 
> to merge the results?
> 
> Is there a better way to structure the indexes?
> 
> We will have a relational database available if it makes sense to do some 
> sort of a hybrid approach.
> 
> many thanks,
> Jonathan
> 



RE: how to handle large relational data in Solr

2011-10-20 Thread Jonathan Carothers
Agreed, this will just be a read only view of the existing database for search 
purposes.  Sorry for the confusion.

From: Brandon Ramirez [brandon_rami...@elementk.com]
Sent: Thursday, October 20, 2011 10:50 AM
To: solr-user@lucene.apache.org
Subject: RE: how to handle large relational data in Solr

I would not recommend removing your relational database altogether.  You should 
treat that as your system of record.  By replacing it, you are forcing Solr to 
store the unmodified value for everything even when not needed.  You also lose 
normalization.   And if you ever need to add some data to your system that 
isn't search-related, you have no choice but to add it to your search index.


Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848
Software Engineer II | Element K | www.elementk.com


-----Original Message-----
From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com]
Sent: Thursday, October 20, 2011 10:12 AM
To: solr-user@lucene.apache.org
Subject: how to handle large relational data in Solr

All,

We are attempting to convert a fairly large relational database into Solr 
index(es).

There are ~100,000 products with ~1,000,000 accessories that can be related to 
any number of the products.  So if I include the search terms and the 
relationships in the same index, we're looking at a pretty huge index.

If we break it out into three indexes, one for the product search, one for the 
accessories search, and one for their relationship, is there a good way to 
merge the results?

Is there a better way to structure the indexes?

We will have a relational database available if it makes sense to do some sort 
of a hybrid approach.

many thanks,
Jonathan



RE: how to handle large relational data in Solr

2011-10-20 Thread Brandon Ramirez
I would not recommend removing your relational database altogether.  You should 
treat that as your system of record.  By replacing it, you are forcing Solr to 
store the unmodified value for everything even when not needed.  You also lose 
normalization.   And if you ever need to add some data to your system that 
isn't search-related, you have no choice but to add it to your search index.


Brandon Ramirez | Office: 585.214.5413 | Fax: 585.295.4848 
Software Engineer II | Element K | www.elementk.com


-----Original Message-----
From: Jonathan Carothers [mailto:jonathan.caroth...@amentra.com] 
Sent: Thursday, October 20, 2011 10:12 AM
To: solr-user@lucene.apache.org
Subject: how to handle large relational data in Solr

All,

We are attempting to convert a fairly large relational database into Solr 
index(es).

There are ~100,000 products with ~1,000,000 accessories that can be related to 
any number of the products.  So if I include the search terms and the 
relationships in the same index, we're looking at a pretty huge index.

If we break it out into three indexes, one for the product search, one for the 
accessories search, and one for their relationship, is there a good way to 
merge the results?

Is there a better way to structure the indexes?

We will have a relational database available if it makes sense to do some sort 
of a hybrid approach.

many thanks,
Jonathan



how to handle large relational data in Solr

2011-10-20 Thread Jonathan Carothers
All,

We are attempting to convert a fairly large relational database into Solr 
index(es).

There are ~100,000 products with ~1,000,000 accessories that can be related to 
any number of the products.  So if I include the search terms and the 
relationships in the same index, we're looking at a pretty huge index.

If we break it out into three indexes, one for the product search, one for the 
accessories search, and one for their relationship, is there a good way to 
merge the results?

Is there a better way to structure the indexes?

We will have a relational database available if it makes sense to do some sort 
of a hybrid approach.

many thanks,
Jonathan


Re: Unified search of relational data on Solr?

2009-02-19 Thread Kalidoss MM
It's for searching: we search on almost all of the fields, and we use the
stats to list the most-viewed images (galleries).

thanks,
kalidoss.m,

On Thu, Feb 19, 2009 at 12:50 PM, Noble Paul നോബിള്‍ नोब्ळ् <
noble.p...@gmail.com> wrote:

> Do you wish to search on the image names, or do you only wish
> to read the image details?
> --Noble
>
> On Thu, Feb 19, 2009 at 12:31 PM, Kalidoss MM 
> wrote:
> > Even in my case, we can't flatten it, because we are managing complete
> > image gallery information in Solr. Each image gallery contains around 20
> > images, along with image descriptions, thumbnail info, width, height,
> > etc. We also want to store/update the stats along with the image gallery.
> >
> > If we flatten the XML, then for every visit to an image gallery I need
> > to update the whole lengthy record in Solr again. We have around 30 lakh
> > (3 million) image galleries, and around 50K image gallery stats are
> > supposed to be updated per day.
> >
> > So we are thinking of splitting the image gallery and its stats and
> > comments into separate XML documents.
> >
> > 1) If anybody has used ParallelReader (Lucene), let me know how it would
> > be useful for us.
> > 2) If anybody has used multicore, let me know how it would be useful for
> > us.
> > 3) Would "MultipleIndexes" be useful or not?
> > http://wiki.apache.org/solr/MultipleIndexes
> >
> > Please advise,
> >
> > Thanks,
> > kalidoss.m,
> >
> > On Thu, Feb 19, 2009 at 11:24 AM, Otis Gospodnetic <
> > otis_gospodne...@yahoo.com> wrote:
> >
> >> Hi,
> >>
> >> Just flatten it - create a single Person + Address entity (document) and
> >> index it.
> >>
> >> Otis
> >> --
> >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >>
> >>
> >>
> >>
> >> 
> >> From: Senthil Kumar 
> >> To: solr-user@lucene.apache.org
> >> Sent: Thursday, February 19, 2009 1:20:23 PM
> >> Subject: Unified search of relational data on Solr?
> >>
> >> Hi,
> >>
> >>  How can we index relational data in Solr that cannot be merged into
> >> a single file for some reason?
> >>  We have two kinds of XML documents indexed in Solr:
> >> 
> >>   1_persona
> >>   
> >>   
> >>   
> >> 
> >>
> >> 
> >>   1_addr
> >>   washington
> >> 
> >>
> >>  Our aim is to get a list of persons living in Washington. Can anyone
> >> suggest the best approach for this, and for indexing relational data
> >> in general?
> >>
> >>
> >> Senthil Kumar P
> >>
> >
>
>
>
> --
> --Noble Paul
>


Re: Unified search of relational data on Solr?

2009-02-18 Thread Noble Paul നോബിള്‍ नोब्ळ्
Do you wish to search on the image names, or do you only wish
to read the image details?
--Noble

On Thu, Feb 19, 2009 at 12:31 PM, Kalidoss MM  wrote:
> Even in my case, we can't flatten it, because we are managing complete
> image gallery information in Solr. Each image gallery contains around 20
> images, along with image descriptions, thumbnail info, width, height,
> etc. We also want to store/update the stats along with the image gallery.
>
> If we flatten the XML, then for every visit to an image gallery I need
> to update the whole lengthy record in Solr again. We have around 30 lakh
> (3 million) image galleries, and around 50K image gallery stats are
> supposed to be updated per day.
>
> So we are thinking of splitting the image gallery and its stats and
> comments into separate XML documents.
>
> 1) If anybody has used ParallelReader (Lucene), let me know how it would
> be useful for us.
> 2) If anybody has used multicore, let me know how it would be useful for
> us.
> 3) Would "MultipleIndexes" be useful or not?
> http://wiki.apache.org/solr/MultipleIndexes
>
> Please advise,
>
> Thanks,
> kalidoss.m,
>
> On Thu, Feb 19, 2009 at 11:24 AM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
>
>> Hi,
>>
>> Just flatten it - create a single Person + Address entity (document) and
>> index it.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>>
>> ____
>> From: Senthil Kumar 
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, February 19, 2009 1:20:23 PM
>> Subject: Unified search of relational data on Solr?
>>
>> Hi,
>>
>>  How can we index relational data in Solr that cannot be merged into a
>> single file for some reason?
>>  We have two kinds of XML documents indexed in Solr:
>> 
>>   1_persona
>>   
>>   
>>   
>> 
>>
>> 
>>   1_addr
>>   washington
>> 
>>
>>  Our aim is to get a list of persons living in Washington. Can anyone
>> suggest the best approach for this, and for indexing relational data in
>> general?
>>
>>
>> Senthil Kumar P
>>
>



-- 
--Noble Paul


Re: Unified search of relational data on Solr?

2009-02-18 Thread Kalidoss MM
Even in my case, we can't flatten it, because we are managing complete
image gallery information in Solr. Each image gallery contains around 20
images, along with image descriptions, thumbnail info, width, height,
etc. We also want to store/update the stats along with the image gallery.

If we flatten the XML, then for every visit to an image gallery I need
to update the whole lengthy record in Solr again. We have around 30 lakh
(3 million) image galleries, and around 50K image gallery stats are
supposed to be updated per day.

So we are thinking of splitting the image gallery and its stats and
comments into separate XML documents.

1) If anybody has used ParallelReader (Lucene), let me know how it would
be useful for us.
2) If anybody has used multicore, let me know how it would be useful for
us.
3) Would "MultipleIndexes" be useful or not?
http://wiki.apache.org/solr/MultipleIndexes

Please advise,

Thanks,
kalidoss.m,

On Thu, Feb 19, 2009 at 11:24 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi,
>
> Just flatten it - create a single Person + Address entity (document) and
> index it.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
>
> 
> From: Senthil Kumar 
> To: solr-user@lucene.apache.org
> Sent: Thursday, February 19, 2009 1:20:23 PM
> Subject: Unified search of relational data on Solr?
>
> Hi,
>
>  How can we index relational data in Solr that cannot be merged into a
> single file for some reason?
>  We have two kinds of XML documents indexed in Solr:
> 
>   1_persona
>   
>   
>   
> 
>
> 
>   1_addr
>   washington
> 
>
>  Our aim is to get a list of persons living in Washington. Can anyone
> suggest the best approach for this, and for indexing relational data in
> general?
>
>
> Senthil Kumar P
>


Re: Unified search of relational data on Solr?

2009-02-18 Thread Otis Gospodnetic
Hi,

Just flatten it - create a single Person + Address entity (document) and index 
it.
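
A minimal sketch of the flattening step, done before anything is sent to Solr. The record layout and the `person_id` link between the two record types are assumptions; the ids and the city value mirror the example in the question.

```python
def flatten(persons, addresses):
    """Merge each person with its address into one flat document."""
    address_by_person = {a["person_id"]: a for a in addresses}
    docs = []
    for person in persons:
        address = address_by_person.get(person["id"], {})
        docs.append({
            "id": person["id"],
            "name": person["name"],
            "city": address.get("city"),  # None when no address is known
        })
    return docs

# Hypothetical source records, for illustration only.
persons = [{"id": "1_persona", "name": "Person One"}]
addresses = [{"id": "1_addr", "person_id": "1_persona", "city": "washington"}]

flat_docs = flatten(persons, addresses)
print(flat_docs)
```

Once each person document carries its own `city` field, "persons living in Washington" becomes a plain query on that field.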

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





From: Senthil Kumar 
To: solr-user@lucene.apache.org
Sent: Thursday, February 19, 2009 1:20:23 PM
Subject: Unified search of relational data on Solr?

Hi,

  How can we index relational data in Solr that cannot be merged into a
single file for some reason?
  We have two kinds of XML documents indexed in Solr:

   1_persona
   
   
   



   1_addr
   washington


  Our aim is to get a list of persons living in Washington. Can anyone
suggest the best approach for this, and for indexing relational data in
general?


Senthil Kumar P


Unified search of relational data on Solr?

2009-02-18 Thread Senthil Kumar
Hi,

  How can we index relational data in Solr that cannot be merged into a
single file for some reason?
  We have two kinds of XML documents indexed in Solr:

   1_persona
   
   
   



   1_addr
   washington


  Our aim is to get a list of persons living in Washington. Can anyone
suggest the best approach for this, and for indexing relational data in
general?


Senthil Kumar P


Re: Best practice for storing relational data in Solr

2008-01-08 Thread Ryan Grange
I've found that Solr running on modest hardware (a 2.4 GHz PC running 
Windows XP Pro for testing changes) is able to index about 23,000 
records in under three minutes.  Assuming you aren't going to make too 
many typos in your naming, you should be fine just doing the 
re-indexing.  Try timing your system.  Make a change to about a thousand 
records and see how long it takes to index them.


When indexing, I've found it's better to do them in batches for larger 
updates.  I get up to a few hundred updates ready at a time and commit 
them at once.  Goes much faster than committing each update document 
individually.
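
The batching described above could be sketched roughly as follows. This is an illustration, not the code used in the thread: `send` and `commit` are hypothetical callables standing in for whatever client posts update documents to Solr and issues a commit, and the batch size of 300 is an arbitrary stand-in for "a few hundred".

```python
from typing import Callable, Dict, Iterable, Iterator, List


def batches(docs: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` documents."""
    batch: List[Dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


def index_in_batches(docs: Iterable[Dict], size: int,
                     send: Callable[[List[Dict]], None],
                     commit: Callable[[], None]) -> int:
    """Send documents batch by batch, committing once per batch."""
    count = 0
    for batch in batches(docs, size):
        send(batch)   # hypothetical: post the whole batch to Solr
        commit()      # one commit per batch, not per document
        count += 1
    return count


# Stub callables that only record what would have been sent.
sent_sizes: List[int] = []
commits: List[int] = []
docs = ({"id": i} for i in range(1000))
n_batches = index_in_batches(docs, 300,
                             send=lambda b: sent_sizes.append(len(b)),
                             commit=lambda: commits.append(1))
print(n_batches, sent_sizes)
```

With real callables plugged in, the number of commits drops from one per document to one per batch.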


Ryan Grange, IT Manager
DollarDays International, LLC
[EMAIL PROTECTED]
480-922-8155 x106



steve.lillywhite wrote:

Hi all,

 


This is a (possibly very naive) newbie question regarding Solr best practice...

 

I run a website that displays/stores data on job applicants, together with 
information on where they came from (e.g. which recruiter), which office they 
are applying to, etc. This data is stored in a mySQL database. I currently have 
a basic search facility, but I plan to introduce Solr to improve this, by also 
storing applicant data in a Solr schema. 

 


My problem is that *related* applicant data can also be updated in the web GUI 
(e.g. if there was a typo, a recruiter could be changed from “My Rcruiter” to 
“My Recruiter”), and I don’t know how best to reflect this in the Solr schema.

Example:

We may have 2 applicants that came from recruiter “My Recruiter”. If the 
name of this recruiter is altered in the GUI then I would have to reindex all 
2 of those applicants in the Solr schema, which seems like overkill. The 
alternative would be if I didn’t store the recruiter name in the Solr schema, 
and instead only stored its mySQL database identifier. Then, I would need to 
parse any search results from Solr to put in the recruiter name before 
displaying the data in the GUI.

 


So I guess I’m asking which of these is the better approach;

 


1.   Use Solr to store the text value of related applicant data that exists 
in a relational mySQL database. Whenever that data is updated in the database 
reindex all dependent entries in the Solr schema. Advantage of this approach I 
guess is that search results can be returned from Solr and displayed as is (if 
XSLT is used). E.g. search result for “John Smith” of recruiter “My Recruiter” 
could be returned in the required HTML format from Solr, and displayed in the 
web GUI without any reformatting or further processing.

2.   Use Solr to store database Ids of related applicant data that exists 
in a relational mySQL database. When that data is updated in the database there 
is no need to reindex Solr. However, search results from Solr will need to be 
parsed before they can be output in the web GUI. E.g. if Solr returns “John 
Smith” of recruiter with database ID 143, then 143 will need to be mapped back 
to “My Recruiter” by my application before it can be displayed.

 


Can anyone offer any guidance here?

 


Regards

 


Steve

 



No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.516 / Virus Database: 269.17.13/1208 - Release Date: 03/01/2008 15:52
 

  


Re: Best practice for storing relational data in Solr

2008-01-04 Thread Robert Young
Short answer: It depends.
Long answer: It depends on what you want to be able to search on.
If you need to search by recruiter name then obviously you'll need to
index it; if you don't, you only really need to index the most relevant
db identifier, then work out the relations from that in MySQL (it's
what it's good at, after all).
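
A rough sketch of that second case, under stated assumptions: the search index returns only a recruiter identifier, and the application resolves names from the database before display. The in-memory SQLite database stands in for MySQL so the example is self-contained, and the table, column and field names (`recruiter`, `recruiter_id`, `applicant`) are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE recruiter (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO recruiter VALUES (143, 'My Recruiter')")

# Hypothetical search hits carrying only the database identifier.
hits = [{"applicant": "John Smith", "recruiter_id": 143}]


def attach_recruiter_names(hits, db):
    """Look up all needed recruiter names in one query and add them to the hits."""
    ids = sorted({h["recruiter_id"] for h in hits})
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT id, name FROM recruiter WHERE id IN ({placeholders})", ids
    ).fetchall()
    names = dict(rows)  # id -> name
    return [{**h, "recruiter": names.get(h["recruiter_id"])} for h in hits]


results = attach_recruiter_names(hits, db)
print(results[0]["applicant"], "/", results[0]["recruiter"])
```

Because names are resolved at display time, renaming a recruiter in the database needs no reindexing, at the cost of not being able to search on the name.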

Cheers
Rob

On Jan 4, 2008 11:39 AM, steve.lillywhite
<[EMAIL PROTECTED]> wrote:
> Hi all,
>
>
>
> This is a (possibly very naive) newbie question regarding Solr best 
> practice...
>
>
>
> I run a website that displays/stores data on job applicants, together with 
> information on where they came from (e.g. which recruiter), which office they 
> are applying to, etc. This data is stored in a mySQL database. I currently 
> have a basic search facility, but I  plan to introduce Solr to improve this, 
> by also storing applicant data in a Solr schema.
>
>
>
> My problem is that *related* applicant data can also be updated in the web 
> GUI (e.g. if there was a typo, a recruiter could be changed from "My Rcruiter" 
> to "My Recruiter"), and I don't know how best to reflect this in the Solr 
> schema.
>
> Example:
>
> We may have 2 applicants that came from recruiter "My Recruiter". If the 
> name of this recruiter is altered in the GUI then I would have to reindex all 
> 2 of those applicants in the Solr schema, which seems like overkill. The 
> alternative would be if I didn't store the recruiter name in the Solr schema, 
> and instead only stored its mySQL database identifier. Then, I would need to 
> parse any search results from Solr to put in the recruiter name before 
> displaying the data in the GUI.
>
>
>
> So I guess I'm asking which of these is the better approach;
>
>
>
> 1.   Use Solr to store the text value of related applicant data that 
> exists in a relational mySQL database. Whenever that data is updated in the 
> database reindex all dependent entries in the Solr schema. Advantage of this 
> approach I guess is that search results can be returned from Solr and 
> displayed as is (if XSLT is used). E.g. search result for "John Smith" of 
> recruiter "My Recruiter" could be returned in the required HTML format from 
> Solr, and displayed in the web GUI without any reformatting or further 
> processing.
>
> 2.   Use Solr to store database Ids of related applicant data that exists 
> in a relational mySQL database. When that data is updated in the database 
> there is no need to reindex Solr. However, search results from Solr will need 
> to be parsed before they can be output in the web GUI. E.g. if Solr returns 
> "John Smith" of recruiter with database ID 143, then 143 will need to be 
> mapped back to "My Recruiter" by my application before it can be displayed.
>
>
>
> Can anyone offer any guidance here?
>
>
>
> Regards
>
>
>
> Steve
>
>
>
>
>
>


Best practice for storing relational data in Solr

2008-01-04 Thread steve.lillywhite
Hi all,

 

This is a (possibly very naive) newbie question regarding Solr best practice...

 

I run a website that displays/stores data on job applicants, together with 
information on where they came from (e.g. which recruiter), which office they 
are applying to, etc. This data is stored in a mySQL database. I currently have 
a basic search facility, but I  plan to introduce Solr to improve this, by also 
storing applicant data in a Solr schema. 

 

My problem is that *related* applicant data can also be updated in the web GUI 
(e.g. if there was a typo, a recruiter could be changed from “My Rcruiter” to 
“My Recruiter”), and I don’t know how best to reflect this in the Solr schema.

Example:

We may have 2 applicants that came from recruiter “My Recruiter”. If the 
name of this recruiter is altered in the GUI then I would have to reindex all 
2 of those applicants in the Solr schema, which seems like overkill. The 
alternative would be if I didn’t store the recruiter name in the Solr schema, 
and instead only stored its mySQL database identifier. Then, I would need to 
parse any search results from Solr to put in the recruiter name before 
displaying the data in the GUI.

 

So I guess I’m asking which of these is the better approach;

 

1.   Use Solr to store the text value of related applicant data that exists 
in a relational mySQL database. Whenever that data is updated in the database 
reindex all dependent entries in the Solr schema. Advantage of this approach I 
guess is that search results can be returned from Solr and displayed as is (if 
XSLT is used). E.g. search result for “John Smith” of recruiter “My Recruiter” 
could be returned in the required HTML format from Solr, and displayed in the 
web GUI without any reformatting or further processing.

2.   Use Solr to store database Ids of related applicant data that exists 
in a relational mySQL database. When that data is updated in the database there 
is no need to reindex Solr. However, search results from Solr will need to be 
parsed before they can be output in the web GUI. E.g. if Solr returns “John 
Smith” of recruiter with database ID 143, then 143 will need to be mapped back 
to “My Recruiter” by my application before it can be displayed.
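
For comparison, option 1 could look roughly like the sketch below: when a recruiter is renamed, the application selects every dependent applicant and rebuilds its search document. This is an illustration under assumptions, not an existing implementation. In-memory SQLite stands in for MySQL, the table and column names are hypothetical, and actually posting the rebuilt documents to Solr is left out.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE recruiter (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE applicant (id INTEGER PRIMARY KEY, name TEXT, recruiter_id INTEGER);
INSERT INTO recruiter VALUES (143, 'My Rcruiter');
INSERT INTO recruiter VALUES (7, 'Other Agency');
INSERT INTO applicant VALUES (1, 'John Smith', 143);
INSERT INTO applicant VALUES (2, 'Jane Doe', 143);
INSERT INTO applicant VALUES (3, 'Ann Lee', 7);
""")


def rename_recruiter(db, recruiter_id, new_name):
    """Rename a recruiter and return the documents that must be reindexed."""
    db.execute("UPDATE recruiter SET name = ? WHERE id = ?",
               (new_name, recruiter_id))
    rows = db.execute(
        "SELECT a.id, a.name, r.name FROM applicant a "
        "JOIN recruiter r ON r.id = a.recruiter_id "
        "WHERE r.id = ? ORDER BY a.id",
        (recruiter_id,),
    ).fetchall()
    # Each dict is a flat document carrying the recruiter's text value.
    return [{"id": i, "name": n, "recruiter": r} for i, n, r in rows]


to_reindex = rename_recruiter(db, 143, "My Recruiter")
print(len(to_reindex), "documents to reindex")
```

Only the applicants tied to the renamed recruiter are touched, so the cost of a rename scales with the number of dependents rather than with the size of the index.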

 

Can anyone offer any guidance here?

 

Regards

 

Steve

 

