So, is that a clear yes or a clear no for Aleksey's use case - tens of millions of cores, not all active, but each loadable on demand?

I asked this same basic question months ago and there was no answer forthcoming.

-- Jack Krupansky

-----Original Message----- From: Erick Erickson
Sent: Thursday, June 06, 2013 3:53 PM
To: solr-user@lucene.apache.org
Subject: Re: LotsOfCores feature

100K isn't really a hard limit; it's just difficult to imagine
100K cores on a single machine unless some of them were
used only rarely. And that figure is per node, not cluster-wide.

The current state is that everything is in place, including
transient cores, auto-discovery, etc. So you should be
able to go ahead and try it out.

The next bit that will help with efficiency is sharing named
config sets. The intent is that <solrhome>/configs will
contain sub-dirs like "conf1", "conf2", etc. Your cores can
then reference configName=conf1, and only one copy of
the configuration data will be used rather than a separate
copy being loaded each time a core comes up.
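
For illustration, the layout under this scheme might look like the
sketch below (directory and property names are assumptions based on
the description above, not settled conventions):

  <solrhome>/
    configs/
      conf1/
        solrconfig.xml
        schema.xml
      conf2/
        ...
    core1/
      core.properties   # name=core1, configName=conf1
    core2/
      core.properties   # name=core2, configName=conf1

Here both cores point at configName=conf1, so that configuration
would be parsed once and shared instead of loaded per core.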

Do note that the _first_ query to one of the not-yet-loaded
cores will be slow. The model here is that you can tolerate
some queries taking more time at first than you might like
in exchange for the hardware savings. This presupposes that
you simply cannot fit all the cores into memory at once.
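
As a rough sketch, the on-demand behavior is driven by two per-core
properties plus a cache-size cap in solr.xml (property names per the
Solr 4.x core discovery docs; treat exact names and values here as
placeholders):

  # core.properties for a rarely used, load-on-demand core
  name=customer12345        # hypothetical core name
  configName=conf1
  transient=true            # core may be evicted from the transient cache
  loadOnStartup=false       # load only when the first request arrives

  <!-- solr.xml: cap how many transient cores stay loaded at once -->
  <solr>
    <int name="transientCacheSize">1000</int>
  </solr>

With settings like these, the first query to an unloaded core pays
the load cost, and the least recently used transient cores are
closed once the cache is full.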

The "won't fix" bits are there because, as we got farther into this
process, the approach changed and the functionality of the
won't fix JIRAs was subsumed by other changes by and large.

I've got to update that documentation sometime but just haven't
had time yet. If you go down this route, we'd be happy to
add your name to the wiki's list of authorized editors if you'd
like.

Best
Erick

On Thu, Jun 6, 2013 at 3:08 PM, Aleksey <bitterc...@gmail.com> wrote:
I was looking at this wiki and linked issues:
http://wiki.apache.org/solr/LotsOfCores

They mention a limit of 100K cores. Is that per server or for the
entire fleet, since ZooKeeper needs to manage all of it?

I was considering a use case where I have tens of millions of indices,
but fewer than a million need to be active at any time, so they need
to be loaded on demand and evicted when not used for a while.
Also, since the number one requirement is efficient loading, I
assume I would store a prebuilt index somewhere so Solr can just
download it and strap it in, right?

The root issue is marked as "won't fix", but some other important
sub-issues are marked as resolved. What's the overall status of the
effort?

Thank you in advance,

Aleksey
