Thanks for the response, guys.
Let's say I have two fields, X and Y, and the field type of both is
*text*. Now I want to use the whitespace analyzer for field X and the
standard analyzer for field Y.
In Elasticsearch, we can specify a different analyzer for the same field
type. Is this feature a
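For what it's worth, the usual way to get this in Solr is to declare two field types in the schema, each with its own analyzer chain, and bind each field to the type it needs. A minimal sketch (the type names here are made up):

```xml
<!-- schema.xml sketch; type names text_ws/text_std are illustrative -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_std" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>

<field name="X" type="text_ws" indexed="true" stored="true"/>
<field name="Y" type="text_std" indexed="true" stored="true"/>
```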
Hi Shawn and Jack,
Thank you for your reply.
Yes, I want to run the data import handler independently and sync it to
SolrCloud, because my current DIH node handles not only DB fetch & join but
also a lot of preprocessing.
Thanks,
Chunki.
On Aug 30, 2014, at 1:34 AM, Jack Krupansky wrote:
> My other thou
Yeah, I second Mark's suggestion on reducing the stack size. The default on
modern 64-bit boxes is usually 1024KB, which adds up to a lot when you're
running 5000 cores (5000 cores * 2 threads * 1MB is roughly 10GB). I think
the zk register threads can be pooled together but the search threads can't
be because we'd run into
d
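If it helps, the stack size can be lowered via SOLR_OPTS in solr.in.sh. The 256k below is an assumed illustration, not a recommendation; too small a stack surfaces as StackOverflowError under load, so test first:

```
# solr.in.sh sketch: shrink the per-thread stack from the ~1MB default.
# -Xss256k is an illustrative value; verify under real query load.
SOLR_OPTS="$SOLR_OPTS -Xss256k"
```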
Trying to shoehorn business name resolution or correction purely into
Solr tokenization and spell checking is not, in my opinion, a viable
approach. It seems to me that you need a query parser that does
something very different from pure tokenization, and you might also
need a more complex approach
>
> so you might still end up with these out of threads issue again.
You can also generally drop the stack size (Xss) quite a bit to handle
more threads.
Beyond that, there are some thread pools you can configure. However, until
we fix the distrib deadlock issue, you don't want to drop the co
Can you write your own spell check class and use something like edit distance
to get the desired result?
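A sketch of that idea: edit distance (Levenshtein) counts the single-character insertions, deletions, and substitutions between a candidate and the query, so a custom spell-check class can rank candidates by it. This is plain Python to show the measure itself, not Solr's spell-check API, and the business names below are made up:

```python
# Levenshtein edit distance via classic dynamic programming.
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))          # distances for empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to empty b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return prev[len(b)]

# Hypothetical business-name candidates, ranked against a misspelled query.
candidates = ["Acme Holdings", "Acme Holding", "Acme Hldgs"]
query = "Acme Holdins"
best = min(candidates, key=lambda c: edit_distance(query, c))
```

If you'd rather not roll your own, I believe Lucene already ships a StringDistance implementation (org.apache.lucene.search.spell.LevenshteinDistance) that a custom spell checker could reuse.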
Sent from my iPhone
> On Aug 27, 2014, at 9:55 AM, Corey Gerhardt
> wrote:
>
> Sorry to keep beating this to death. I could be looking for perfection which
> isn't possible.
>
> I'm tr
We should also consider "lightly-sharded" collections. IOW, even if a
cluster has dozens or a hundred nodes or more, the goal may not be to shard
all collections across all nodes (which is fine for the really large
collections), but to also support collections which may only need to be
sharded
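As a concrete sketch, the Collections API already lets you create such a lightly-sharded collection by asking for a single shard with a couple of replicas (host, port, and collection name below are placeholders):

```
# Sketch: a one-shard, two-replica collection on a large cluster.
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=small_collection&numShards=1&replicationFactor=2'
```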
You close with two great questions for the community!
We have a similar issue over in Apache Cassandra database land (thousands of
tables).
There is no immediate, easy, great answer. Other than the kinds of
"workarounds" being suggested.
-- Jack Krupansky
-Original Message-
From:
What is your access pattern? By that I mean do all the cores need to be
searched at the same time or is it reasonable for them to be loaded on
demand? The latter would impose the penalty that the first time a collection
was accessed there would be a delay while the core loaded. I suppose I'm
asking
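In standalone (non-cloud) Solr, this load-on-demand pattern exists as transient cores: mark a core transient and not loaded on startup, and cap how many stay in memory with transientCacheSize in solr.xml. A sketch of the per-core config (the core name is a placeholder, and note that transient cores are not supported in SolrCloud mode):

```
# core.properties sketch for a load-on-demand core
name=customer42
transient=true
loadOnStartup=false
```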
On 31 Aug 2014 13:24, "Mark Miller" wrote:
>
>
> > On Aug 31, 2014, at 4:04 AM, Christoph Schmidt <
christoph.schm...@moresophy.de> wrote:
> >
> > we see at least two problems when scaling to a large number of
collections. I would like to ask the community if they are known and maybe
already addres
One collection has 2 replicas, no sharding, the collections are not that big.
No, they are unfortunately not independent. There are collections with customer
documents (some thousand customers) and product collections. One customer has
at least one customer collection and 1 to some hundred produc
On 8/31/2014 8:58 AM, Joseph Obernberger wrote:
> Could you add another field(s) to your application and use that instead of
> creating collections/cores? When you execute a search, instead of picking
> a core, just search a single large core but add in a field which contains
> some core ID.
This
Could you add another field(s) to your application and use that instead of
creating collections/cores? When you execute a search, instead of picking
a core, just search a single large core but add in a field which contains
some core ID.
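A sketch of that pattern at query time (collection, field name, and value below are all hypothetical): the discriminator goes into a filter query, which Solr caches independently of the main query:

```
# One big collection plus a discriminator field instead of many cores.
# 'core_id' is a made-up field name.
curl 'http://localhost:8983/solr/bigcollection/select?q=horse&fq=core_id:customer42'
```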
-Joe
http://www.lovehorsepower.com
On Sun, Aug 31, 2014 at
On 8/30/2014 11:43 PM, Shawn Heisey wrote:
> The release is likely to be finalized tomorrow. Once it's finalized and
> uploaded, the Apache mirror system will begin replicating. It usually
> takes a couple more days before a release is actually announced. The
> announcement will be made when it
> On Aug 31, 2014, at 4:04 AM, Christoph Schmidt
> wrote:
>
> we see at least two problems when scaling to a large number of collections. I
> would like to ask the community if they are known and maybe already
> addressed in development:
> We have a SolrCloud running with the following numbers
How are the 5 servers arranged in terms of shards and replicas? 5 shards
with 1 replica each, 1 shard with 5 replicas, 2 shards with 2 and 3
replicas, or... what?
How big is each collection? The key strength of SolrCloud is scaling large
collections via shards, NOT scaling large numbers of col
we see at least two problems when scaling to a large number of collections. I
would like to ask the community if they are known and maybe already addressed
in development:
We have a SolrCloud running with the following numbers:
- 5 Servers (each 24 CPUs, 128 GB RAM)
- 13.000 Collec