On Feb 8, 2012, at 10:14 AM, Danil ŢORIN wrote:
> For example if you only query data for 1 month intervals, and you
> partition by date, you can calculate in which shard your data can be
> found, and query just that shard.
This is what one calls "partition pruning" in database terms.
http://en.
It also depends on your queries.
For example if you only query data for 1 month intervals, and you
partition by date, you can calculate in which shard your data can be
found, and query just that shard.
If you can find a partition key that is always present in the query,
you can create a gazillion shards and still only have to query the relevant ones.
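A minimal sketch of that idea, assuming one Lucene index directory per month and a hypothetical "index-YYYY-MM" naming scheme (the paths, field name, and query here are placeholders, and the APIs shown are from a current Lucene release rather than the 3.x line of 2012):

import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.YearMonth;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class MonthShardSearch {

    // Map the partition key (a month) to the directory holding that shard's index.
    // The "index-YYYY-MM" layout is an assumption for this sketch.
    static Path shardPathFor(YearMonth month) {
        return Paths.get("/data/shards", "index-" + month);   // e.g. /data/shards/index-2012-02
    }

    public static void main(String[] args) throws Exception {
        YearMonth month = YearMonth.of(2012, 2);               // the month the query is restricted to

        // Open only the shard that can contain matching documents (partition pruning).
        try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(shardPathFor(month)))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            TopDocs hits = searcher.search(new TermQuery(new Term("customerId", "42")), 10);
            System.out.println("hits in " + month + ": " + hits.totalHits);
        }
    }
}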
It's up to your machines. In our application we index about
30,000,000 (30M) docs/shard, and the response time is about 150ms. Our
machine has about 48GB of memory; about 25GB is allocated to Solr and the rest
is used for the Linux disk cache.
Going by our application's numbers, an index of 1.25T docs won't all fit in memory.
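For a rough sense of scale, at about 30M docs per shard, 1.25 trillion documents would work out to roughly 1.25e12 / 3e7 ≈ 41,700 shards.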
That's a great resource you reference. Thanks so much, The Captn
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, 8 February 2012 1:38 PM
To: java-user@lucene.apache.org
Subject: Re: How best to handle a reasonable amount to data (25TB+)
> From: Peter Miller [mailto:peter.mil...@objectconsulting.com.au]
> Sent: Wednesday, 8 February 2012 12:20 PM
> To: java-user@lucene.apache.org
> Subject: RE: How best to handle a reasonable amount to data (25TB+)
>
> Whoops! Very poor basic maths, I should have written it down. I was thinking
> 13 shards. But yes,
From: Peter Miller [mailto:peter.mil...@objectconsulting.com.au]
Sent: Wednesday, 8 February 2012 12:20 PM
To: java-user@lucene.apache.org
Subject: RE: How best to handle a reasonable amount to data (25TB+)
Whoops! Very poor basic maths, I should have written it down. I was thinking 13
shards. But yes, 13,000 is a bit different. Now I'm in even more doubt about how to
search across seven years of data.
Thanks a lot,
The Captn
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Wednesday, 8 February 2012 12:39 AM
To: java-user@lucene.apache.org
Subject: Re: How best to handle a reasonable amount to data (25TB+)
I'm curious
> From: ppp c [mailto:peter.c.e...@gmail.com]
> Sent: Monday, 6 February 2012 5:29 PM
> To: java-user@lucene.apache.org
> Subject: Re: How best to handle a reasonable amount to data (25TB+)
>
> It sounds like this is not a Lucene issue but a question of your application's logic.
> If you're worried about having too many docs in one index, you can create multiple indexes.
llion documents, so that
makes the 1.25 trillion number look reasonable.
Any other thoughts?
Thanks,
The Captn.
-Original Message-
From: ppp c [mailto:peter.c.e...@gmail.com]
Sent: Monday, 6 February 2012 5:29 PM
To: java-user@lucene.apache.org
Subject: Re: How best to handle a reasonable amount to data (25TB+)
It sounds like this is not a Lucene issue but a question of your application's logic.
If you're worried about having too many docs in one index, you can create multiple indexes,
then search across them and merge the results.
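A rough sketch of that multi-index approach using Lucene's MultiReader, which presents several indexes as one logical index so a single search already returns merged, globally ranked results (directory paths and the field name are placeholders, and the API shown is from a current Lucene release rather than the 3.x line of 2012):

import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class MultiIndexSearch {
    public static void main(String[] args) throws Exception {
        // Open each per-shard index separately.
        DirectoryReader r1 = DirectoryReader.open(FSDirectory.open(Paths.get("/data/index1")));
        DirectoryReader r2 = DirectoryReader.open(FSDirectory.open(Paths.get("/data/index2")));

        // MultiReader wraps the shards as one composite reader; closing it
        // also closes the sub-readers.
        try (MultiReader all = new MultiReader(r1, r2)) {
            IndexSearcher searcher = new IndexSearcher(all);
            TopDocs top = searcher.search(new TermQuery(new Term("body", "lucene")), 10);
            for (ScoreDoc sd : top.scoreDocs) {
                System.out.println(sd.doc + " score=" + sd.score);
            }
        }
    }
}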
On Mon, Feb 6, 2012 at 10:50 AM, Peter Miller <peter.mil...@objectconsulting.com.au> wrote:
> Hi,
>
> I have
Hi,
I have a little bit of an unusual set of requirements, and I am looking for
advice. I have researched the archives, and seen some relevant posts, but they
are fairly old and not specifically a match, so I thought I would give this a
try.
We will eventually have about 50TB raw, non-searchable data