Hello list,
I'm using mapreduce from contrib and I get this stack trace:
https://gist.github.com/ralph-tice/b1e84bdeb64532c7ecab
whenever I specify <luceneMatchVersion>4.10</luceneMatchVersion> in my
solrconfig.xml. 4.9 works fine. I'm using 4.10.4 artifacts for both map
reduce runs. I tried
It looks like there's a patch available:
https://issues.apache.org/jira/browse/SOLR-5132
Currently the only way without that patch is to hand-edit
clusterstate.json, which is very ill-advised. If you absolutely must,
it's best to stop all your Solr nodes, backup the current clusterstate
in ZK,
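The "backup the current clusterstate in ZK" step can be sketched with Solr's bundled zkcli.sh. This is a sketch only: the ZooKeeper host and the script path are assumptions (the path varies by Solr version), and the command itself is shown rather than run.

```shell
# Sketch: back up clusterstate.json before any hand-edit.
# ZK host and zkcli.sh path are placeholders; adjust for your install.
ZKHOST="localhost:2181"
BACKUP="clusterstate-$(date -u +%Y%m%d).json"

# Build the command; in real use, run it after stopping all Solr nodes:
CMD="server/scripts/cloud-scripts/zkcli.sh -zkhost $ZKHOST -cmd get /clusterstate.json"
echo "$CMD > $BACKUP"
```

Restoring is the same idea with `-cmd putfile` pointed at the backup, once the edited state has been verified.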
We index lots of relatively small documents, minimum of around
6k/second, but up to 20k/second. At the same time we are deleting
400-900 documents a second. We have our shards organized by time, so
the bulk of our indexing happens in one 'hot' shard, but deletes can
go back in time to our epoch.
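A toy illustration of the time-organized sharding described above. The monthly granularity and the shard naming are made up; the thread doesn't give the actual scheme.

```shell
# Route a document to a shard by its timestamp (GNU date).
# Naming and granularity are assumptions, not the poster's actual layout.
doc_ts=1423440000                               # example epoch seconds (Feb 2015)
shard="docs_$(date -u -d "@$doc_ts" +%Y_%m)"    # -> docs_2015_02
echo "$shard"
```

Recent documents all land in one "hot" shard, while deletes can target any shard back to the first one.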
You might want to also look at Trulia's Thoth project
https://github.com/trulia/thoth-ml/ -- it doesn't supply this feature
out of the box, but it gives you a nice framework for implementing it.
On Sun, Feb 8, 2015 at 5:07 PM, Jorge Luis Betancourt González
jlbetanco...@uci.cu wrote:
For a
Like all things, it really depends on your use case. We have 160B
documents in our largest SolrCloud and doing a *:* to get that count takes
~13-14 seconds. Doing a text:happy query takes only ~3.5-3.6 seconds cold;
subsequent queries for the same terms take ~500ms. We have a little over
3TB of
I have a writeup of how to perform safe backups here:
https://gist.github.com/ralph-tice/887414a7f8082a0cb828
There are some tickets around this work to further the ease of
backups, especially https://issues.apache.org/jira/browse/SOLR-5750
On Mon, Nov 24, 2014 at 9:45 AM, Vivek Pathak vpat
bq. We ran into one of the failure modes that only AWS can dream up
recently, where for an extended amount of time, two nodes in the same
placement group couldn't talk to one another, but they could both see
Zookeeper, so nothing was marked as down.
I had something similar happen with one of my
I think the ADD/DELETE replica APIs are best within a single SolrCloud;
however, if you need to move data across SolrClouds you will have to
resort to older APIs, for which I found little documentation but
many references. So I wrote up the instructions to do so here:
https://gist.github.com/ralph
I've had a bad enough experience with the default shard placement that I
create a collection with one shard, add the shards where I want them, then
use add/delete replica to move the first one to the right machine/port.
Typically this is in a SolrCloud of dozens or hundreds of shards. Our
shards
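The manual-placement recipe above (create with one shard, then place replicas explicitly) can be sketched as a sequence of Collections API calls. Collection, shard, node, and replica names are all placeholders, and only the URLs are built here, not executed.

```shell
# Sketch of manual replica placement; all names are placeholders.
BASE="http://localhost:8983/solr/admin/collections"
# 1. Create the collection with a single shard:
CREATE="$BASE?action=CREATE&name=mycoll&numShards=1&replicationFactor=1"
# 2. Add a replica on the machine/port you actually want:
ADD="$BASE?action=ADDREPLICA&collection=mycoll&shard=shard1&node=solr02.mycorp.com:8983_solr"
# 3. Delete the original, badly-placed replica:
DEL="$BASE?action=DELETEREPLICA&collection=mycoll&shard=shard1&replica=core_node1"
for u in "$CREATE" "$ADD" "$DEL"; do echo "$u"; done
# In real use: curl each URL in order, checking clusterstate between steps.
```

The delete must come last, after the new replica has fully caught up, or the shard briefly has no live copy.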
FWIW, I do a lot of moving Lucene indexes around and as long as the core is
unloaded it's never been an issue for Solr to be running at the same time.
If you move a core into the correct hierarchy for a replica, you can call
the Collections API's CREATESHARD action with the appropriate params
not reload the
collection or core. I have not tried restarting the SolrCloud. Can someone
point out the best way to achieve the goal? I prefer not to restart the
SolrCloud.
Shushuai
From: ralph tice ralph.t...@gmail.com
To: solr-user@lucene.apache.org
Sent
Hi all,
Two issues, first, when I issue an ADDREPLICA call like so:
http://localhost:8983/solr/admin/collections?action=ADDREPLICA&shard=myshard&collection=mycollection&createNodeSet=solr18.mycorp.com:8983_solr
It does not seem to respect the 8983_solr designation in the createNodeSet
parameter
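One possible explanation (an assumption, not confirmed in the thread): ADDREPLICA takes a `node` parameter for targeting a specific node, while `createNodeSet` is primarily a CREATE-time parameter, which could be why the port designation is ignored. A sketch with placeholder names:

```shell
# Sketch: target ADDREPLICA at a specific node via the 'node' parameter.
# 'node' expects the live_nodes form host:port_context; names are placeholders.
BASE="http://localhost:8983/solr/admin/collections"
URL="$BASE?action=ADDREPLICA&collection=mycollection&shard=myshard&node=solr18.mycorp.com:8983_solr"
echo "$URL"
```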
/markrmiller
On August 24, 2014 at 12:35:13 PM, ralph tice (ralph.t...@gmail.com)
wrote:
Hi all,
Two issues, first, when I issue an ADDREPLICA call like so:
http://localhost:8983/solr/admin/collections?action=ADDREPLICA&shard=myshard&collection=mycollection&createNodeSet=solr18.mycorp.com
What are the dependencies here in terms of solr config? Looks like it's
dependent on highlighting at a minimum?
I tried the example URL and got a 500 with this stack trace once I
inspected the response of the generated URI:
java.lang.NullPointerException at
, as well as clusterstate for the shard in
question, which describe what I see via the UI also -- the newly created
replica shard erroneously thinks it has fully replicated.
https://gist.github.com/ralph-tice/18796de6393f48fb0192
The logs are after issuing a REQUESTRECOVERY call.
The only message
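The REQUESTRECOVERY call referenced above is a Core Admin API action; a sketch with placeholder host and core name:

```shell
# Sketch: ask a specific core to re-sync from its leader.
# Host and core name are placeholders.
URL="http://localhost:8983/solr/admin/cores?action=REQUESTRECOVERY&core=mycollection_shard1_replica2"
echo "$URL"
# In real use: curl "$URL", then watch the replica's state in clusterstate.json.
```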
I think your response time is including the average response for an add
operation, which generally returns very quickly and, due to sheer volume,
averages out the response time of your queries. New Relic should break
out requests based on which handler they're hitting, but it doesn't seem to.
We have a cluster running SolrCloud 4.7 built 2/25. 10 shards with 2
replicas each (20 cores total) at ~20GB/shard.
We index around 1k-1.5k documents/second into this cluster constantly. To
manage growth we have a scheduled job that runs every 3 hours to prune
documents based on business
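A scheduled prune like the one described is typically a delete-by-query against the update handler. The field name and retention window below are made up, since the snippet cuts off before giving the actual criteria.

```shell
# Sketch: prune old documents via delete-by-query.
# Collection, field name, and 90-day window are all placeholders.
UPDATE="http://localhost:8983/solr/mycollection/update"
QUERY='<delete><query>timestamp_dt:[* TO NOW-90DAYS]</query></delete>'
echo "curl $UPDATE -H 'Content-Type: text/xml' -d '$QUERY'"
```

At the indexing rates described, a delete like this runs alongside heavy adds, which is part of why the deletes reach back across many older shards.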