There are some edge cases in the response, depending on timing. In case it's
useful:
Here's the bit from solrcloud-haft: (java)
't a query so it isn't parsed. So I have no way to
dereference the "$row.[shard]".
On 3/27/18, 4:00 PM, "Jeff Wartes" <jwar...@whitepages.com> wrote:
I have a large 7.2 index with nested documents and many shards.
For each result (parent doc) in a query,
ere is a shared filesystem requirement. It would be nice if this
> Solr feature could be enhanced to have more options like backing up
> directly to another SolrCloud using replication/fetchIndex like your cool
> solrcloud_manager thing.
>
> On Wed, Mar 28, 2018 at
for the duration
of the restore
But the former isn't tenable if you're sharding due to space constraints, and
the latter can't be easily predicted.
On 3/28/18, 11:30 AM, "Shawn Heisey" <apa...@elyograg.org> wrote:
On 3/28/2018 10:34 AM, Jeff Wartes wrote:
> The backup/res
The backup/restore still requires setting up a shared filesystem on all your
nodes though right?
I've been using the fetchindex trick in my solrcloud_manager tool for ages now:
https://github.com/whitepages/solrcloud_manager#cluster-commands
Some of the original features in that tool have been
I have a large 7.2 index with nested documents and many shards.
For each result (parent doc) in a query, I want to gather a relevance-ranked
subset of the child documents. It seemed like the subquery transformer would be
ideal:
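A hedged sketch of what that subquery transformer setup might look like as query parameters. The collection name, field names (type_s, parent_id), and the "kids" label are all assumptions for illustration, not the poster's actual schema:

```python
from urllib.parse import urlencode

# Hypothetical schema: parent docs carry type_s:parent, and each child
# doc stores its parent's id in a parent_id field.
params = {
    "q": "type_s:parent AND title:boots",
    # For each parent row in the results, attach a child-doc subset
    "fl": "id,title,kids:[subquery]",
    # $row.id dereferences the current parent row's id field
    "kids.q": "{!terms f=parent_id v=$row.id} description:leather",
    "kids.sort": "score desc",   # relevance-ranked children
    "kids.rows": "3",            # top 3 children per parent
}
query_string = urlencode(params)
url = "http://localhost:8983/solr/mycoll/select?" + query_string
print(url)
```

The per-row dereference ($row.id) is exactly the part that breaks down in the "$row.[shard]" complaint above: the subquery parameters are real queries and get parsed, but values that aren't fields on the row can't be dereferenced.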
{"replica": "<7", "node":"#ANY"} , means don't put more than 7
replicas of the collection (irrespective of the shards) in a given
node
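Taken literally, that rule is one entry in the cluster policy list. A sketch of the payload, assuming the 7.x autoscaling API shape (POSTed as JSON to /api/cluster/autoscaling on any node):

```python
import json

# The rule above: no single node may end up holding 7 or more replicas
# of a collection, counted across all of that collection's shards.
policy = {"set-cluster-policy": [{"replica": "<7", "node": "#ANY"}]}

payload = json.dumps(policy)
print(payload)
```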
what do you mean by distinct 'RF' ? I think we are screwing up the
terminologies a bit here
On Wed, Feb 7, 2018
I’ve been messing around with the Solr 7.2 autoscaling framework this week.
Some things seem trivial, but I’m also running into questions and issues. If
anyone else has experience with this stuff, I’d be glad to hear it.
Specifically:
Context:
-One collection, consisting of 42 shards, where
It’s presumably not a small degradation - this guy very recently suggested it’s
77% slower:
https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/
The other reason that blog post is interesting to me is that his benchmark
utility showed the work of entering the kernel
Yes, that’s the Xenial I tried. Ubuntu 16.04.2 LTS.
On 5/1/17, 7:22 PM, "Will Martin" <wmartin...@outlook.com> wrote:
Ubuntu 16.04 LTS - Xenial (HVM)
Is this your Xenial version?
On 5/1/2017 6:37 PM, Jeff Wartes wrote:
> I tri
I started with the same three-node 15-shard configuration I’d been used to, in
an RF1 cluster. (the index is almost 700G so this takes three r4.8xlarge’s if I
want to be entirely memory-resident) I eventually dropped down to a 1/3rd size
index on a single node (so 5 shards, 100M docs each) so I
We settled on the R4.2XL... The R series is labeled "High-Memory"
Which instance type did you end up using?
On Mon, May 1, 2017 at 8:22 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 4/28/2017 10:09 AM, Jeff Wartes wrote:
> > tldr: Recen
with you having such different
performance between local and EC2
But thanks for telling us about this! It's totally baffling
Erick
On Fri, Apr 28, 2017 at 9:09 AM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
> tldr: Recently, I tried moving an existing
tldr: Recently, I tried moving an existing solrcloud configuration from a local
datacenter to EC2. Performance was roughly 1/10th what I’d expected, until I
applied a bunch of linux tweaks.
This should’ve been a straight port: one datacenter server -> one EC2 node.
Solr 5.4, Solrcloud, Ubuntu
Sounds similar to a thread last year:
http://lucene.472066.n3.nabble.com/Node-not-recovering-leader-elections-not-occuring-tp4287819p4287866.html
On 2/1/17, 7:49 AM, "tedsolr" wrote:
I have version 5.2.1. Short of an upgrade, are there any remedies?
Adding my anecdotes:
I’m using heavily tuned ParNew/CMS. This is a SolrCloud collection, but
per-node I’ve got a 28G heap and a 200G index. The large heap turned out to be
necessary because certain operations in Lucene allocate memory based on things
other than result size, (index size
Hah, interesting.
The fact that the CMS collector fails back to a *single-threaded* collection on
concurrent-mode-failure had me seriously considering trying the Parallel
collector a year or two ago. I figured out (and stopped) the queries that were
doing the sudden massive allocations that
I’d prefer it if the alias was required to be removed, or pointed elsewhere,
before the collection could be deleted.
As a best practice, I encourage all SolrCloud users to configure an alias to
each collection, and use only the alias in their clients. This allows atomic
switching between
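The delete-protection being wished for above isn't built in, but a client-side guard is easy to sketch. The alias map here is a stand-in for what you'd conceptually get back from the Collections API's alias listing; names are hypothetical:

```python
# Stand-in for the cluster's alias map: alias name -> target collection.
aliases = {"search": "collection1"}

def safe_to_delete(collection, aliases):
    # Refuse to delete a collection while any alias still points at it;
    # the alias must be removed or re-pointed first.
    return collection not in aliases.values()

print(safe_to_delete("collection1", aliases))  # alias still points here
print(safe_to_delete("collection2", aliases))  # nothing references this one
```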
Here’s an earlier post where I mentioned some GC investigation tools:
https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201604.mbox/%3c8f8fa32d-ec0e-4352-86f7-4b2d8a906...@whitepages.com%3E
In my experience, there are many aspects of the Solr/Lucene memory allocation
model that scale
I found this, which intends to explore the usage of RoaringDocIdSet for solr:
https://issues.apache.org/jira/browse/SOLR-9008
This suggests Lucene’s filter cache already uses it, or did at one point:
https://issues.apache.org/jira/browse/LUCENE-6077
I was playing with id set implementations
Expanding on my comment on the ticket, I’m really quite happy with using
codahale/dropwizard metrics with Solr. I don’t know if I’m comfortable just
sharing a screenshot of the resulting grafana dashboard, but I’ve got, per-host:
- Percentile latencies and rates for GET vs POST (which in
https://issues.apache.org/jira/browse/SOLR-5894 had some pretty interesting
looking work on heuristic counts for facets, among other things.
Unfortunately, it didn’t get picked up, but if you don’t mind using Solr 4.10,
there’s a jar.
On 11/4/16, 12:02 PM, "John Davis"
I’ll also mention the choice to improve processing speed by allocating more
memory, which increases the importance of GC tuning. This bit me when I tried
using it on a larger index.
https://issues.apache.org/jira/browse/SOLR-9125
I don’t know if the result grouping feature shares the same
h routing:
https://sematext.com/blog/2015/09/29/solrcloud-large-tenants-and-routing/
Regards,
Emir
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
On 11.08.2016 19:39, Je
This isn’t really a question, although some validation would be nice. It’s more
of a warning.
Tldr is that the insert order of documents in my collection appears to have had
a huge effect on my query speed.
I have a very large (sharded) SolrCloud 5.4 index. One aspect of this index is
a
It sounds like the node-local version of the ZK clusterstate has diverged from
the ZK cluster state. You should check the contents of zookeeper and verify the
state there looks sane. I’ve had issues (v5.4) on a few occasions where leader
election got screwed up to the point where I had to
data?
>
>Thanks!
>Kent
>
>2016-07-12 23:02 GMT+08:00 Jeff Wartes <jwar...@whitepages.com>:
>
>> Well, two thoughts:
>>
>>
>> 1. If you’re not using solrcloud, presumably you don’t have any replicas.
>> If you are, presumably you do. This makes fo
Well, two thoughts:
1. If you’re not using solrcloud, presumably you don’t have any replicas. If
you are, presumably you do. This makes for a biased comparison, because
SolrCloud won’t acknowledge a write until it’s been safely written to all
replicas. In short, solrcloud write time is
A variation on #1 here - Use the same cluster, create a new collection, but use
the createNodeSet option to logically partition your cluster so no node has
both the old and new collection.
If your clients all reference a collection alias, instead of a collection name,
then all you need to do
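A sketch of the createNodeSet variant described above. Node and collection names are made up; the point is that the CREATE call pins the new collection's replicas to a disjoint set of nodes:

```python
from urllib.parse import urlencode

# Restrict the new collection to nodes that don't host the old one,
# so no node carries both collections at once.
params = {
    "action": "CREATE",
    "name": "collection_new",
    "numShards": "3",
    "replicationFactor": "1",
    "createNodeSet": "host3:8983_solr,host4:8983_solr,host5:8983_solr",
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```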
This might come a little late to be helpful, but I had a similar situation with
Solr 5.4 once.
We ended up finding a ZK snapshot we could restore, but we did also get the
cluster back up for most of the interim by taking the now-empty ZK cluster,
re-uploading the configs that the collections
There’s no official way of doing #1, but there are some less official ways:
1. The Backup/Restore API provides some hooks into loading pre-existing data
dirs into an existing collection. Lots of caveats.
2. If you don’t have many shards, there’s always rsync/reload.
3. There are some third-party
to promotion failures. I suspect there's a lot of garbage building up.
>We're going to run tests with field collapsing disabled and see if that
>makes a difference.
>
>Cas
>
>
>On Thu, Jun 16, 2016 at 1:08 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
>> Check y
Check your gc log for CMS “concurrent mode failure” messages.
If a concurrent CMS collection fails, it does a stop-the-world pause while it
cleans up using a *single thread*. This means the stop-the-world CMS collection
in the failure case is typically several times slower than a concurrent
Any distributed query falls into the two-phase process. Actually, I think some
components may require a third phase. (faceting?)
However, there are also cases where only a single pass is required. A
fl=id,score will only be a single pass, for example, since it doesn’t need to
get the field
For what it’s worth, I’d suggest you go into a conversation with Azul with a
more explicit “I’m looking to buy” approach. I reached out to them with a more
“I’m exploring my options” attitude, and never even got a trial. I get the
impression their business model involves a fairly expensive (to
r on the linux command line I get:
>
>/opt/solr-5.4.0/server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
>
>But the log file is still carrying class not found exceptions when I
>restart...
>
>Are you in "Cloud" mode? What version of Solr are you using?
> >> https://github.com/LucidWorks/auto-phrase-tokenfilter
>> > > > >> >
>> > > > >> > Is there anything else out there that you would recommend I look
>> > at?
>> > > > >> >
>> > > > >>
Oh, interesting. I’ve certainly encountered issues with multi-word synonyms,
but I hadn’t come across this. If you end up using it with a recent Solr
version, I’d be glad to hear your experience.
I haven’t used it, but I am aware of one other project in this vein that you
might be interested
SolrCloud never creates replicas automatically, unless perhaps you’re using the
HDFS-only autoAddReplicas option. Start the new node using the same ZK, and
then use the Collections API
(https://cwiki.apache.org/confluence/display/solr/Collections+API) to
ADDREPLICA.
The replicationFactor you
My first thought is that you haven’t indexed such that all values of the field
you’re grouping on are found in the same cores.
See the end of the article here: (Distributed Result Grouping Caveats)
https://cwiki.apache.org/confluence/display/solr/Result+Grouping
And the “Document Routing”
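With the default compositeId router, co-locating a group's documents comes down to a shared id prefix. A sketch, where "acme" stands in for whatever value you group on:

```python
# Everything before "!" in a compositeId is hashed to pick the shard,
# so documents sharing a prefix land in the same cores, which is what
# distributed result grouping needs.
def routed_id(group_value, doc_id):
    return f"{group_value}!{doc_id}"

ids = [routed_id("acme", n) for n in (1, 2, 3)]
print(ids)  # all three share the "acme" prefix, hence the same shard
```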
https://github.com/whitepages/solrcloud_manager was designed to provide some
easier operations for common kinds of cluster operation.
It hasn’t been tested with 6.0 though, so if you try it, please let me know
your experience.
On 5/23/16, 6:28 AM, "Tom Evans"
The PingRequestHandler contains support for a file check, which allows you to
control whether the ping request succeeds based on the presence/absence of a
file on disk on the node.
http://lucene.apache.org/solr/6_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html
I suppose you could
That case related to consistency after a ZK outage or network connectivity
issue. Your case is standard operation, so I’m not sure that’s really the same
thing. I’m aware of a few issues that can happen if ZK connectivity goes wonky,
that I hope are fixed in SOLR-8697.
This one might be a
have replicas B and C.
>
>What the "something" is that sends requests I'm not quite sure, but
>that's a place
>to start.
>
>Best,
>Erick
>
>On Mon, May 16, 2016 at 11:08 AM, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> I have a solr 5.4 clus
I have a solr 5.4 cluster with three collections, A, B, C.
Nodes either host replicas for collection A, or B and C. Collections B and C
are not currently used - no inserts or queries. Collection A is getting
significant query traffic, but no insert traffic, and queries are only directed
to
An ID lookup is a very simple and fast query, for one ID. Or’ing a lookup for
80k ids though is basically 80k searches as far as Solr is concerned, so it’s
not altogether surprising that it takes a while. Your complaint seems to be
that the query planner doesn’t know in advance that should be
Shawn Heisey’s page is the usual reference guide for GC settings:
https://wiki.apache.org/solr/ShawnHeisey
Most of the learnings from that are in the Solr 5.x startup scripts already,
but your heap is bigger, so your mileage may vary.
Some tools I’ve used while doing GC tuning:
* VisualVM -
some retry logic in the code that distributes the updates from
>the leader as well.
>
>Best,
>Erick
>
>On Tue, Apr 26, 2016 at 12:51 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> At the risk of thread hijacking, this is an area where I don’t know I full
At the risk of thread hijacking, this is an area where I don’t know I fully
understand, so I want to make sure.
I understand the case where a node is marked “down” in the clusterstate, but
what if it’s down for less than the ZK heartbeat? That’s not unreasonable, I’ve
seen some
I have no numbers to back this up, but I’d expect Atomic Updates to be slightly
slower than a full update, since the atomic approach has to retrieve the fields
you didn't specify before it can write the new (updated) document.
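For reference, a sketch of the two update shapes being compared. Field names are hypothetical; the point is that the atomic form names only the changed field, so Solr has to fetch the stored values of everything else before rewriting the document:

```python
import json

# Atomic update: only "price" is sent; the other fields must be
# retrieved server-side before the new document can be written.
atomic = [{"id": "doc1", "price": {"set": 9.99}}]

# Full update: the client resends every field itself.
full = [{"id": "doc1", "price": 9.99, "title": "widget", "in_stock": True}]

body = json.dumps(atomic)
print(body)
```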
On 4/19/16, 11:54 AM, "Tim Robertson"
I’m all for finding another way to make something work, but I feel like this is
the wrong advice.
There are two options:
1) You are doing something wrong. In which case, you should probably invest in
figuring out what.
2) Solr is doing something wrong. In which case, you should probably invest
If you’re already using java, just use the CloudSolrClient.
If you’re using the default router, (CompositeId) it’ll figure out the leaders
and send documents to the right place for you.
If you’re not using java, then I’d still look there for hints on how to
duplicate the functionality.
On
There is some automation around this process in the backup commands here:
https://github.com/whitepages/solrcloud_manager
It’s been tested with 5.4, and will restore arbitrary replication factors.
Ever assuming the shared filesystem for backups, of course.
On 4/5/16, 3:18 AM, "Reth RM"
I recall I had some luck fixing a leader-less shard (after a ZK quorum failure)
by forcibly removing the records for the down-state replicas from the leader
election list, and then forcing an election.
The ZK path looks like collections/<collection>/leader_elect/shardX/election.
Usually you’ll find the
It’s a bit backwards feeling, but I’ve had luck setting the install dir and
solr home, instead of the data dir.
Something like:
-Dsolr.solr.home=/data/solr
-Dsolr.install.dir=/opt/solr
So all of the Solr files are in /opt/solr and all of the index/core-related
files end up in /data/solr.
I've experimented with that a bit, and Shawn added my comments in IRC to his
Solr/GC page here: https://wiki.apache.org/solr/ShawnHeisey
The relevant bit:
"With values of 4096 and 32768, the IRC user was able to achieve 15% and 19%
reductions in average pause time, respectively, with the
n zookeeper?
>
>
>
>Your tool is very interesting, I just thought about writing such a tool
>myself.
>From the sources I understand that you represent each node as a path in the
>git repository.
>So, I guess that for restore purposes I will have to do
>the opposite direction a
I’ve been running SolrCloud clusters in various versions for a few years here,
and I can only think of two or three cases that the ZK-stored cluster state was
broken in a way that I had to manually intervene by hand-editing the contents
of ZK. I think I’ve seen Solr fixes go by for those
I believe the shard state is a reflection of whether that shard is still in use
by the collection, and has nothing to do with the state of the replicas. I
think doing a split-shard operation would create two new shards, and mark the
old one as inactive, for example.
On 2/26/16, 8:50 AM,
>> > of
>> > SOLR as the field which is the basis of the sort is not included in the
>> > schema for example the price. The customer wants the list in descending
>> > order of the price.
>> >
>> > So I have to get all the 1000 docids from solr an
My suggestion would be to split your problem domain. Use Solr exclusively for
search - index the id and only those fields you need to search on. Then use
some other data store for retrieval. Get the id’s from the solr results, and
look them up in the data store to get the rest of your fields.
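A minimal sketch of that split, with a plain dict standing in for whatever retrieval store you'd actually use (KV store, RDBMS, etc.); the field names are made up:

```python
# Solr returns only ids (plus score); the heavy fields live elsewhere.
solr_results = [{"id": "a1", "score": 3.2}, {"id": "b7", "score": 1.9}]

# Stand-in for the external retrieval store, keyed by document id.
store = {
    "a1": {"title": "first doc", "price": 10.0},
    "b7": {"title": "second doc", "price": 4.5},
}

# Hydrate the search results, preserving Solr's ranked order.
hydrated = [{**store[r["id"]], "id": r["id"]} for r in solr_results]
print(hydrated)
```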
Solrcloud does not come with any autoscaling functionality. If you want such a
thing, you’ll need to write it yourself.
https://github.com/whitepages/solrcloud_manager might be a useful head start
though, particularly the “fill” and “cleancollection” commands. I don’t do
*auto* scaling, but I
You could write your own snitch:
https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
Or, it would be more annoying, but you can always add/remove replicas manually
and juggle things yourself after you create the initial collection.
On 2/1/16, 8:42 AM, "Tom Evans"
Aliases work when indexing too.
Create collection: collection1
Create alias: this_week -> collection1
Index to: this_week
Next week...
Create collection: collection2
Create (Move) alias: this_week -> collection2
Index to: this_week
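The rotation above can be scripted as a single Collections API call per week, since CREATEALIAS also re-points an existing alias. A sketch (host and names are assumptions):

```python
from urllib.parse import urlencode

def realias(alias, target):
    # CREATEALIAS atomically points `alias` at `target`, whether or not
    # the alias already exists, so indexers never need to change.
    return ("http://localhost:8983/solr/admin/collections?"
            + urlencode({"action": "CREATEALIAS",
                         "name": alias,
                         "collections": target}))

week1 = realias("this_week", "collection1")
week2 = realias("this_week", "collection2")
print(week2)
```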
On 2/1/16, 2:14 AM, "vidya" wrote:
I enjoy using collection aliases in all client references, because that allows
me to change the collection all clients use without updating the clients. I
just move the alias.
This is particularly useful if I’m doing a full index rebuild and want an
atomic, zero-downtime switchover.
On
On 1/27/16, 8:28 AM, "Shawn Heisey" wrote:
>
>I don't think any documentation states this, but it seems like a good
>idea to me use an alias from day one, so that you always have the option
>of swapping the "real" collection that you are using without needing to
>change
If you can identify the problem documents, you can just re-index those after
forcing a sync. Might save a full rebuild and downtime.
You might describe your cluster setup, including ZK. it sounds like you’ve done
your research, but improper ZK node distribution could certainly invalidate
some
My understanding is that the "version" represents the timestamp the searcher
was opened, so it doesn’t really offer any assurances about your data.
Although you could probably bounce a node and get your document counts back in
sync (by provoking a check), it’s interesting that you’re in this
>
>>>
>>> You might watch the achieved replication factor of your updates and see if
>>> it ever changes
>>>
>
>This is a good tip. I’m not sure I like the implication that any failure to
>write all 3 of our replicas must be retried at the app layer. Is t
be...
>
>=xxx
>
>btw, for your app, isn't "slice" old notation?
>
>
>
>
>On 08/01/16 22:05, Jeff Wartes wrote:
>>
>> I’m pretty sure you could change the name when you ADDREPLICA using a
>> core.name property. I don’t know if you can when you
I’m pretty sure you could change the name when you ADDREPLICA using a core.name
property. I don’t know if you can when you initially create the collection
though.
The CLUSTERSTATUS command will tell you the core names:
Looks like it’ll set partialResults=true on your results if you hit the
timeout.
https://issues.apache.org/jira/browse/SOLR-502
https://issues.apache.org/jira/browse/SOLR-5986
On 12/22/15, 5:43 PM, "Vincenzo D'Amore" wrote:
>Well... I can write everything, but
Don’t set solr.data.dir. Instead, set the install dir. Something like:
-Dsolr.solr.home=/data/solr
-Dsolr.install.dir=/opt/solr
I have many solrcloud collections, and separate data/install dirs, and
I’ve never had to do anything with manual per-collection or per-replica
data dirs.
That said,
It’s a pretty common misperception that since solr scales, you can just
spin up new nodes and be done. Amazon ElasticSearch and older solrcloud
getting-started docs encourage this misperception, as does the HDFS-only
autoAddReplicas flag.
I agree that auto-scaling should be approached carefully,
If you want two different collections to have two different schemas, those
collections need to reference two different configsets.
So you need another copy of your config available using a different name,
and to reference that other name when you create the second collection.
On 12/4/15, 6:26
I’ve never used the managed schema, so I’m probably biased, but I’ve never
seen much of a point to the Schema API.
I need to make changes sometimes to solrconfig.xml, in addition to
schema.xml and other config files, and there’s no API for those, so my
process has been like:
1. Put the entire
Looks like LIST was added in 4.8, so I guess you’re stuck looking at ZK,
or finding some tool that looks in ZK for you.
The zkCli.sh that ships with zookeeper would probably suffice for a
one-off manual inspection:
https://zookeeper.apache.org/doc/trunk/zookeeperStarted.html#sc_ConnectingT
dentally and the DIH cannot be run
>because the database is unavailable.
>
>Our collection is simple: 2 nodes - 1 collection - 2 shards with 2
>replicas
>each
>
>So a simple copy (cp command) for both the nodes/shards might work for us?
>How do I restore the data back?
he
>limit on each server but it isn't clear to me how high it should be or if
>raising the limit will cause new problems.
>
>Any advice you could provide in this situation would be awesome!
>
>Cheers,
>Brian
>
>
>
>> On Oct 27, 2015, at 20:50, Jeff Wartes <jwar
https://github.com/whitepages/solrcloud_manager supports 5.x, and I added
some backup/restore functionality similar to SOLR-5750 in the last
release.
Like SOLR-5750, this backup strategy requires a shared filesystem, but
note that unlike SOLR-5750, I haven’t yet added any backup functionality
FWIW, since it seemed like there was at least one bug here (and possibly
more), I filed
https://issues.apache.org/jira/browse/SOLR-8171
On 10/6/15, 3:58 PM, "Jeff Wartes" <jwar...@whitepages.com> wrote:
>
>I dug far enough yesterday to find the GET_DOCSET, but not f
On the face of it, your scenario seems plausible. I can offer two pieces
of info that may or may not help you:
1. A write request to Solr will not be acknowledged until an attempt has
been made to write to all relevant replicas. So, B won’t ever be missing
updates that were applied to A, unless
The “copy” command in this tool automatically does what Upayavira
describes, including bringing the replicas up to date. (if any)
https://github.com/whitepages/solrcloud_manager
I’ve been using it as a mechanism for copying a collection into a new
cluster (different ZK), but it should work
If you’re using AWS, there’s this:
https://github.com/LucidWorks/solr-scale-tk
If you’re using chef, there’s this:
https://github.com/vkhatri/chef-solrcloud
(There are several other chef cookbooks for Solr out there, but this is
the only one I’m aware of that supports Solr 5.3.)
For ZK, I’m
I’m aware of two public administration tools:
This was announced to the list just recently:
https://github.com/bloomreach/solrcloud-haft
And I’ve been working in this:
https://github.com/whitepages/solrcloud_manager
Both of these hook the Solrcloud client’s ZK access to inspect the cluster
state
I dug far enough yesterday to find the GET_DOCSET, but not far enough to
find why. Thanks, a little context is really helpful sometimes.
So, starting with an empty filterCache...
http://localhost:8983/solr/techproducts/select?q=name:foo=1=true
=popularity
New values: lookups: 0,
ert, but
not a lookup, so the cache hit ratio is always exactly 1.
On 10/2/15, 4:18 AM, "Toke Eskildsen" <t...@statsbiblioteket.dk> wrote:
>On Thu, 2015-10-01 at 22:31 +, Jeff Wartes wrote:
>> It still inserts if I address the core directly and use distrib=f
I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud
index on fields like this:
; wrote:
>what if you set f.city.facet.limit=-1 ?
>
>On Thu, Oct 1, 2015 at 7:43 PM, Jeff Wartes <jwar...@whitepages.com>
>wrote:
>
>>
>> I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud
>> index on fields like this:
>>
>> > docValue
stributed requests, it's explained here
>https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-Over-Re
>questParameters
>eg does it happen if you run with distrib=false?
>
>On Fri, Oct 2, 2015 at 12:27 AM, Jeff Wartes <jwar...@whitepages.com>
>wrote:
>
ibute it. We’ve been running it in production for a year,
>but the config is pretty manual.
>
>wunder
>Walter Underwood
>wun...@wunderwood.org
>http://observer.wunderwood.org/ (my blog)
>
>
>> On Sep 28, 2015, at 4:41 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
One would hope that https://issues.apache.org/jira/browse/SOLR-4735 will
be done by then.
On 9/28/15, 11:39 AM, "Walter Underwood" wrote:
>We did the same thing, but reporting performance metrics to Graphite.
>
>But we won’t be able to add servlet filters in 6.x,
I’ve been relying on this:
https://code.google.com/archive/p/linux-ftools/
fincore will tell you what percentage of a given file is in cache, and
fadvise can suggest to the OS that a file be cached.
All of the solr start scripts at my company first call fadvise
(FADV_WILLNEED) on all the
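The same FADV_WILLNEED step can be done from Python's stdlib, as a sketch of what those start scripts do. Linux-specific, and the advice is only a hint; whether the kernel actually reads the file in is up to it:

```python
import os

def prewarm(path):
    # Advise the kernel to pull the whole file into page cache.
    fd = os.open(path, os.O_RDONLY)
    try:
        # length 0 means "from offset to end of file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)

# e.g. prewarm every file in a core's index directory:
# for name in os.listdir(index_dir):
#     prewarm(os.path.join(index_dir, name))
```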
If I configure my filterCache like this:
and I have <= 10 distinct filter queries I ever use, does that mean I’ve
effectively disabled cache invalidation? So my cached filter query results
will never change? (short of JVM restart)
I’m unclear on whether autowarm simply copies the value into
of whether it was populated via autowarm.
On 9/24/15, 11:28 AM, "Jeff Wartes" <jwar...@whitepages.com> wrote:
>
>If I configure my filterCache like this:
>autowarmCount="10"/>
>
>and I have <= 10 distinct filter queries I ever use, does that mean I’ve
On 9/4/15, 7:06 AM, "Yonik Seeley" wrote:
>
>Lucene seems to always be changing it's execution model, so it can be
>difficult to keep up. What version of Solr are you using?
>Lucene also changed how filters work, so now, a filter is
>incorporated with the query like so:
>
Tokenizers, Filters, URPs and even a newsletter:
>http://www.solr-start.com/
>
>
>On 3 September 2015 at 16:45, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> I have a query like:
>>
>> q==enabled:true
>>
>> For purposes of this conversation
I have a query like:
q=<your query>&fq=enabled:true
For purposes of this conversation, "fq=enabled:true" is set for every query, I
never open a new searcher, and this is the only fq I ever use, so the filter
cache size is 1, and the hit ratio is 1.
The fq=enabled:true clause matches about 15% of my
I had a similar need. The resulting tool is in scala, but it still might
be useful to look at. I had to work through some of those same issues:
https://github.com/whitepages/solrcloud_manager
From a clusterstate perspective, I mostly cared about active vs
non-active, so here’s a sample output
You need to specify a replication factor of 2 if you want two copies of
each shard. Solr doesn’t “auto fill” available capacity, contrary to the
misleading examples on the http://wiki.apache.org/solr/SolrCloud page.
Those examples only have that behavior because they ask you to copy the
examples