We have a customer that needs to update a few billion documents in SolrCloud. I
know the suggested client is CloudSolrClient, for its load-balancing
feature.
As per the docs for CloudSolrClient:
SolrJ client class to communicate with SolrCloud. Instances of this class
communicate with ZooKeeper
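As a minimal sketch of that recommended path (the ZooKeeper hosts and collection name are placeholders, and this assumes the solr-solrj 5.x dependency on the classpath, so it is not a drop-in snippet):

```java
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Hypothetical ZK ensemble address and collection name.
CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181");
client.setDefaultCollection("mycollection");

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-1");
doc.addField("title_s", "example");
client.add(doc);   // routed to the correct shard leader via cluster state
client.commit();   // or rely on autoCommit in solrconfig.xml for bulk loads
client.close();
```

For billions of documents you would of course batch the adds and avoid per-document commits.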
On 1/12/2016 7:42 PM, Shivaji Dutta wrote:
> Now, since with ConcurrentUpdateSolrClient I am able to use a queue and a pool
> of threads, it is more attractive to use than CloudSolrClient, which
> will use an HttpSolrClient once it gets a set of nodes to do the updates.
>
> What is the
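For comparison, a ConcurrentUpdateSolrClient sketch with its internal queue and thread pool (URL, queue size, and thread count are illustrative; note that in 5.x this client targets a single node, so you give up the leader-aware routing CloudSolrClient provides):

```java
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Hypothetical node URL; a queue of 10000 docs drained by 4 background threads.
ConcurrentUpdateSolrClient client =
    new ConcurrentUpdateSolrClient("http://solr1:8983/solr/mycollection", 10000, 4);

for (int i = 0; i < 1_000_000; i++) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-" + i);
    client.add(doc);   // buffered; background threads send batched updates
}
client.blockUntilFinished();   // wait for the queue to drain
client.commit();
client.close();
```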
Hi
I have created a collection on one datanode where a Solr server is deployed,
say DN1. I have another datanode with a Solr server deployed, which also
runs the Resource Manager service, say DN2. When I created a
collection using the solrctl command on DN1, it got reflected in DN2
Perfect, I'll remove the block and check whether the warning is gone.
Thanks.
--
Gian Maria Ricci
Cell: +39 320 0136949
-----Original Message-----
From: Alessandro Benedetti [mailto:abenede...@apache.org]
Sent: Tuesday, January 12, 2016 10:43
To: solr-user@lucene.apache.org
Subject: Re:
Yes, I am successfully indexing other files with DIH; now when I try to index these
files with ExtractingRequestHandler I get this ERROR:
null:org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Error creating OOXML
extractor
at
This is the replication handler configured in solrconfig.xml; there is nothing
else regarding replication. This configuration is used for a single-core test
installation; then, to experiment with SolrCloud, I simply uploaded the very same
configuration to SolrCloud. Do you think it could be the
I would like to make special mention of the update request processor chain
mechanism in SolrCloud. [1]
Quoting the documentation:
In a distributed SolrCloud setup, all processors in the chain
> *before* the DistributedUpdateProcessor are run on the first node that
> receives an update
To be honest, that block is not necessary anymore.
As Erick and Shawn were saying, that is now implicit and defined by default.
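For anyone following along, the explicit chain such a block spells out looks roughly like this (the chain name is illustrative); since the DistributedUpdateProcessorFactory is now inserted implicitly, declaring it is only needed when you must order custom processors around it:

```xml
<updateRequestProcessorChain name="mychain">
  <!-- runs only on the node that first receives the update -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- inserted automatically if omitted -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <!-- runs on every replica, after distribution -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```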
Cheers
On 12 January 2016 at 08:22, Gian Maria Ricci - aka Alkampfer <
alkamp...@nablasoft.com> wrote:
> This is the replication handler configured in solrconfig.xml,
Hi,
can you confirm that the realtime get requirements are just:

<requestHandler name="/get" class="solr.RealTimeGetHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
    <str name="wt">json</str>
    <str name="indent">true</str>
  </lst>
</requestHandler>

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
</updateLog>
Understood, thanks. I thought that the leader sent data to the other replicas after
indexing and autocommit took place, but I see now that this is not how it works.
By sending all documents to all replicas of a shard, Solr can guarantee consistency
of the data.
Now everything is clearer. Thanks for the
Hi all,
looking at the parsedquery_toString debug output I have many fields, but there is one
that has this configuration:
((attr_search:8 attr_search:gb)~2^5.0)
I hope I'm right, but I expect to find a boost when both of the values
match.
Now, I don't understand why, even if both terms match, I
I need to debug my custom processor (UpdateRequestProcessor) in my Eclipse
IDE. With older Solr versions this was possible, but with Solr run as a service
under Jetty I don't know if there is a way to do
--
Regards.
Rodrigo Testillano Tordesillas.
Yep.
I did this just a few hours ago.
Let's download Solr source:
wget http://it.apache.contactlab.it/lucene/solr/5.4.0/solr-5.4.0-src.tgz
untar the file.
I'm not sure they are all needed, but I have already installed the latest
versions of Ant, Ivy and Maven.
Then in the solr-5.4.0 directory I did
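(Assuming the usual build targets for a 5.x source tree; this is a guess at the steps, not a quote of what was actually run:)

```shell
ant ivy-bootstrap        # one-time Ivy setup for the build
ant eclipse              # generate Eclipse .project/.classpath files
cd solr && ant server    # build the Solr server itself
```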
Mmmm... I'm not sure it's worth the trouble. Anyway, I'm just curious; when
you find a way, let me know.
On Tue, Jan 12, 2016 at 1:01 PM, Rodrigo Testillano <
rodrite.testill...@gmail.com> wrote:
> Yes, remote debug is working, but I want to start Jetty with Solr inside
> Eclipse like I did with
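For reference, the remote-debug route being referred to can be set up by passing JDWP agent args to the bin/solr script (the port is arbitrary; -a passes extra JVM arguments in 5.x):

```shell
# Start Solr in the foreground with the debug agent listening on port 5005
bin/solr start -f -a "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
```

Then attach a "Remote Java Application" debug configuration in Eclipse to localhost:5005.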
On 1/12/2016 6:05 AM, Tom Evans wrote:
> Hi all, trying to move our Solr 4 setup to SolrCloud (5.4). We're having
> some problems with a DIH config that attempts to load an XML file and
> iterate through the nodes in that file; it tries to load the file from
> disk instead of from ZooKeeper.
>
>
On 1/12/2016 2:50 AM, Matteo Grolla wrote:
> and that it works with any directory factory? (Not just
> NRTCachingDirectoryFactory)
Realtime Get relies on the updateLog to return uncommitted documents,
and standard Lucene mechanisms to return documents that have already
been committed. It should
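A realtime get request (collection name and document id here are placeholders) looks like:

```shell
# Returns the latest version of the document, committed or not
curl "http://localhost:8983/solr/mycollection/get?id=doc-1&wt=json"
```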
right, suggester had some bad behavior where it rebuilt on startup despite
setting the flag to _not_ do that. See:
Some details here:
https://lucidworks.com/blog/2015/03/04/solr-suggester/
Best,
Erick
On Tue, Jan 12, 2016 at 8:12 AM, Matteo Grolla wrote:
> ok,
>
bq: it is too hard to understand, what do you mean by "lots"?
I mean that if you have one or two duplicate docs it's
worth looking at things like leading or trailing spaces in
the ID leading to IDs that look identical but aren't.
If it's hundreds or thousands of docs, then it's probably
indicative of
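The leading/trailing-space case is easy to check client-side; a tiny illustration of why two IDs can print identically in a log yet be distinct unique keys:

```java
public class IdCheck {
    public static void main(String[] args) {
        String a = "doc-42";
        String b = "doc-42 ";   // trailing space, invisible in most logs
        System.out.println(a.equals(b));          // false: distinct keys to Solr
        System.out.println(a.equals(b.trim()));   // true once normalized
    }
}
```

Normalizing (trimming) IDs before indexing avoids this entire class of "duplicate" documents.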
And a neater way to debug stuff rather than attaching to
Solr is to step through the Junit tests that exercise the code
you need to work on rather than attach to a remote Solr.
This is often much faster than compile/start Solr/attach.
Of course some problems don't fit that process, but I
Then you probably have a corrupt file or have
discovered a Tika bug.
Next I'd try running the file through stand-alone Tika,
perhaps trying different versions of Tika. If this latter
is the case, you can always use a more recent version
of Tika with Solr and/or process the file on a SolrJ client
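To run stand-alone Tika over the problem file (jar name and version are examples; the tika-app jar bundles a command-line interface):

```shell
# Extract plain text; swap in other versions of the jar to compare behavior
java -jar tika-app-1.7.jar --text problem-file.docx
```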
As an update, I went ahead and used the Collection API and deleted the
existing one, and then recreated it (specifying the compositeId router),
and when I tried out MRIT, I didn't have any problems whatsoever with the
number of reducers (and was able to cut the indexing time by over half!!).
I'm
Great to know. Thank you very much for your assistance!
On Tue, Jan 12, 2016 at 10:34 AM, Erick Erickson
wrote:
> bq: Do you know, is using the API the
> recommended way of handling collections? As opposed to putting collection
> folders containing "core.properties"
Well, Solr _can_ put all the languages in one field... it's just that
the user experience is sub-optimal.
Stopwords, stemming rules, even tokenization vary between
languages and using, say, the English stopwords for Catalan
is not the best.
And the CJK languages (Chinese, Japanese and Korean)
You won't necessarily find both if those values
are NOT in the particular document. If you have
a document you know contains both but doesn't
appear in your results list, consider using
explainOther to see how the doc of interest is
actually scored.
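For example (collection, field, and document id are placeholders; --data-urlencode handles the spaces in the query):

```shell
# debugQuery explains the hits; explainOther explains the named doc even if it didn't rank
curl "http://localhost:8983/solr/mycollection/select" \
  --data-urlencode "q=attr_search:(8 gb)" \
  --data-urlencode "debugQuery=true" \
  --data-urlencode "explainOther=id:doc-42"
```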
Best,
Erick
On Tue, Jan 12, 2016 at 1:54 AM,
bq: Do you know, is using the API the
recommended way of handling collections? As opposed to putting collection
folders containing "core.properties" file and "conf" folders (containing
"schema.xml" and "solrconfig.xml", etc) all in the Solr home location?
Absolutely and certainly DO use the
Yeah, that's essentially the nature of open source, someone
gets frustrated enough with current behavior and fixes it ;)...
There's never any harm in opening a JIRA; all you need to do
is register. It's not a bad idea to open one as you _start_ writing
the code, even providing very early versions
: ((attr_search:8 attr_search:gb)~2^5.0)
:
: I hope to be right, but I expect to find a boost in both the values
: matches.
1) "boost" information should show up as a detail of the "queryWeight",
which is itself a detail of the "weight" of term clauses -- in the output
you've included below,
Thank you so much! I'm going to try it right now and tell you my results!
2016-01-12 12:47 GMT+01:00 Vincenzo D'Amore :
> Yep.
>
> I have done this just few hours ago.
> Let's download Solr source:
>
> wget http://it.apache.contactlab.it/lucene/solr/5.4.0/solr-5.4.0-src.tgz
>
So, you have deployed the Solr server on three nodes, namely
192.168.100.210, 211 and 212.
Am I correct?
--
View this message in context:
http://lucene.472066.n3.nabble.com/solrcloud-How-to-delete-a-doc-at-a-specific-shard-tp4249354p4250117.html
Sent from the Solr - User mailing list archive at
Yes, remote debug is working, but I want to start Jetty with Solr inside
Eclipse like I did with Tomcat in older versions. Thank you very much for
your help! I am going to try another way to do it, but maybe it won't be
possible
2016-01-12 12:51 GMT+01:00 Rodrigo Testillano
Hi all, trying to move our Solr 4 setup to SolrCloud (5.4). We're having
some problems with a DIH config that attempts to load an XML file and
iterate through the nodes in that file; it tries to load the file from
disk instead of from ZooKeeper.
The file exists in zookeeper, adjacent to the
On 1/12/2016 7:45 AM, Tom Evans wrote:
> That makes no sense whatsoever. DIH loads the data_import.conf from ZK
> just fine, or is that provided to DIH from another module that does
> know about ZK?
This is accomplished indirectly through a resource loader in the
SolrCore object that is
On Tue, Jan 12, 2016 at 3:00 PM, Shawn Heisey wrote:
> On 1/12/2016 7:45 AM, Tom Evans wrote:
>> That makes no sense whatsoever. DIH loads the data_import.conf from ZK
>> just fine, or is that provided to DIH from another module that does
>> know about ZK?
>
> This is
On Tue, Jan 12, 2016 at 2:32 PM, Shawn Heisey wrote:
> On 1/12/2016 6:05 AM, Tom Evans wrote:
>> Hi all, trying to move our Solr 4 setup to SolrCloud (5.4). Having
>> some problems with a DIH config that attempts to load an XML file and
>> iterate through the nodes in that
Thanks Shawn,
On a production Solr instance some cores take a long time to load,
while others of similar size take much less. One of the differences between
these cores is the directoryFactory.
2016-01-12 15:34 GMT+01:00 Shawn Heisey :
> On 1/12/2016 2:50 AM, Matteo
OK,
the suggester was responsible for the long load time.
Thanks
2016-01-12 15:47 GMT+01:00 Matteo Grolla :
> Thanks Shawn,
> On a production solr instance some cores take a long time to load
> while others of similar size take much less. One of the differences
I'm actually not specifying any router, and assumed the "implicit" one was
the default. The only resource I can find for setting the document router
is when creating a new collection via the Collections API, which I am not
using. What I do is define several options in the "solrconfig.xml" file,
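If it helps, the document router can only be chosen at collection creation time via the Collections API (names and counts here are placeholders; compositeId is the default when numShards is given, shown explicitly for clarity):

```shell
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=4&replicationFactor=2&router.name=compositeId&collection.configName=myconf"
```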