I think you are running into race conditions in the API which have been
fixed. See SOLR-8804 and SOLR-10720. The first is available in 5.5.1 but
the latter fix will be released in the upcoming 7.3 release. The best
workaround for your version is to just retry a few times until the API
succeeds.
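That retry workaround can be wrapped in a small helper. A minimal sketch in Python; the helper name, attempt count, and backoff are illustrative, not from the thread:

```python
import time

def retry(fn, attempts=5, delay=1.0):
    """Call fn() until it succeeds, retrying up to `attempts` times
    with a linear backoff between failures."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: re-raise the last error
            time.sleep(delay * (attempt + 1))

# Hypothetical usage, wrapping the flaky API call:
#   retry(lambda: call_security_api(url, payload))
```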
On 3/12/2018 8:39 PM, Terry Steichen wrote:
I'm increasingly of the view that Solr's authentication/authorization
mechanism doesn't work correctly in _standalone_ mode. It has been present
in cloud mode for quite a few versions, but as of 6.0.0 (or so)
it was supposed to be available in st
On 3/12/2018 6:18 PM, S G wrote:
> We have use-cases where some queries will return about 100k to 500k records.
> As per https://lucene.apache.org/solr/guide/7_2/pagination-of-results.html,
> it seems that using start=x, rows=y is a bad combination performance wise.
>
> 1) However, it is not clear
bq: Looks like the "qf=all phonetic" would take the place of my
existing "df=all" parameter.
In fact, it may call into question whether you even want an "all" field
or just list all the fields you _would_ have copied into "all" in the
"qf" parameter. Having a single field to search is certainly mo
On 3/12/2018 4:07 PM, Terry Steichen wrote:
> I'm using 6.6.0 with security.json active, having the content shown
> below. I am running standalone mode, have two solr cores defined:
> email1, and email2. Since the 'blockUnknown' is set to false, everyone
> should have access to any unprotected re
<1> consider start=100&rows=10. In the absence of cursorMark, Solr has
to sort the top 110 documents in order to throw away the first 100
since the last document scored could be in the top 110 and there's no
way to know that ahead of time. For 110 that's not very expensive, but
when the list is in
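The cursorMark alternative to start/rows can be sketched as a generator. A sketch only: the `fetch` callable, the uniqueKey field name `id`, and the page size are assumptions for illustration:

```python
def cursor_walk(fetch, query, rows=100):
    """Page through all results with cursorMark instead of start/rows.

    `fetch(params)` must perform a Solr /select request with the given
    params dict and return the decoded JSON response. The sort must
    include the uniqueKey (assumed here to be 'id') for cursorMark
    to work.
    """
    cursor = "*"  # "*" starts a fresh cursor
    while True:
        data = fetch({"q": query, "rows": rows, "sort": "id asc",
                      "cursorMark": cursor})
        yield from data["response"]["docs"]
        next_cursor = data["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor means we're done
            return
        cursor = next_cursor
```

Unlike start/rows, each request only sorts one page's worth of documents past the cursor, so walking 100k-500k records stays cheap.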
Hi,
We have use-cases where some queries will return about 100k to 500k records.
As per https://lucene.apache.org/solr/guide/7_2/pagination-of-results.html,
it seems that using start=x, rows=y is a bad combination performance wise.
1) However, it is not clear to me why the alternative: "cursor-qu
On 3/12/2018 4:15 PM, PeterKerk wrote:
> I trimmed stemdict_nl.txt for testing to just this:
>
> aachenaach
> aachener aachener
According to the example here:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/test-files/solr/collection1/c
Hello Peter,
StemmerOverride wants \t separated fields, that is probably the cause of the
AIOOBE (ArrayIndexOutOfBoundsException) you get. Regarding schema definitions,
each factory JavaDoc [1] has a proper example listed. I recommend putting a
decompounder before a stemmer, and having an accent (or ICU) folding filter as
one of the las
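The tab-separated dictionary format can be generated safely with a few lines of Python; the mappings below are made up for illustration and are not the real stemdict_nl.txt content:

```python
# StemmerOverrideFilter dictionary lines must be "<input><TAB><stem>".
# Space-separated lines trigger the AIOOBE described above.
overrides = {
    "aachener": "aachen",  # illustrative mapping only
    "molens": "molen",
}

def write_stemdict(path, mappings):
    """Write a StemmerOverrideFilter dictionary with tab separators."""
    with open(path, "w", encoding="utf-8") as f:
        for word, stem in sorted(mappings.items()):
            f.write(f"{word}\t{stem}\n")  # tab, not space, between the pair
```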
Erick,
(Sorry... hit sent inadvertently before completion...)
On 3/12/18 2:50 PM, Erick Erickson wrote:
> Something like:
>
> solr/collection/query?q=chris shultz&defType=edismax&qf=all^10
> phonetic
Interesting. Looks like the "qf=all phonet
Erick,
On 3/12/18 2:50 PM, Erick Erickson wrote:
> Something like:
>
> solr/collection/query?q=chris shultz&defType=edismax&qf=all^10
> phonetic
Interesting. Looks like the "qf=all phonetic" would take the place of
my existing "df=all" paramet
@Erick: thank you for clarifying!
@Markus:
I feel like I'm not (or at least should not be :-)) the first person to run
into these challenges.
"You can solve this by adding manual rules to StemmerOverrideFilter, but due
to the compound nature of words, you would need to add it for all the mills"
I'm resending the information below because the original message got the
security.json stuff garbled.
I'm using 6.6.0 with security.json active, having the content shown
below. I am running standalone mode, have two
I'm using 6.6.0 with security.json active, having the content shown
below. I am running standalone mode, have two solr cores defined:
email1, and email2. Since the 'blockUnknown' is set to false, everyone
should have access to any unprotected resource. As you can see, I have
three users defined:
Hi,
are your collections using stateFormat 1 or 2? In version 1 all state
was stored in one file while in version 2 each collection has its own
state.json. I assume that in the old version it could happen that the
common file still contains state for a collection that was deleted. So I
would
I'm also having issues with replicas in the target data center. They will go
from recovering to down. And when one of my replicas goes down in the target
data center, CDCR will no longer send updates from the source to the target.
> On Mar 12, 2018, at 9:24 AM, Tom Peters wrote:
>
> Anyone have
Something like:
solr/collection/query?q=chris shultz&defType=edismax&qf=all^10 phonetic
The point of edismax is to take whatever the input is and distribute
it among one or more fields defined by the "qf" parameter.
In this case, it'll look for "chris" and "shultz" in both the "all"
and "phon
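The same edismax request, built programmatically; the host and collection name are placeholders:

```python
from urllib.parse import urlencode

# The edismax query from the example above, as encoded URL parameters.
params = urlencode({
    "q": "chris shultz",
    "defType": "edismax",
    "qf": "all^10 phonetic",  # search both fields, boosting "all" by 10
})
url = "http://localhost:8983/solr/collection/select?" + params
```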
I don't have any production logs and this all sounds too complicated.
So, I'll just throw the system together in the way that makes the most sense
for now, collect some logs, and then do some testing further down the road.
For now I'll just get the sucker up and running.
Thanks all
On Mon, Mar 12, 2018
Erick,
On 3/12/18 1:36 PM, Erick Erickson wrote:
> Did you try edismax?
Err no, and I must admit that it's a lot to take in. Did you have
a particular suggestion for how to use it?
Thanks,
- -chris
> On Mon, Mar 12, 2018 at 10:20 AM, Christop
I am not sure if I understand your question
*"How do I test this?"*
You have to run tests (benchmark tests) of the transactions (queries) which are
most representative of your system (requirements).
You can use a performance testing tool like JMeter (along with PerfMon
configured for utilisation metrics
Greetings list,
I had a question regarding the spellcheck.reload parameter. I am using the
IndexBasedSpellChecker, which creates its dictionary based on content from a
field. I built the spell check (in error) with a field that has stemming and
other filters associated with it.
Regarding the spell
Did you try edismax?
On Mon, Mar 12, 2018 at 10:20 AM, Christopher Schultz
wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> All,
>
> I have a Solr index containing application user information (username,
> first/last, etc.). I have created an "all" field for the purpose of
> using i
Chris:
LGTM, except maybe ;).
You'll want to look closely at your admin UI/Analysis page for the
field (or fieldType) once it's defined. Uncheck the "verbose" box when
you look the first time, it'll be less confusing. That'll show you
_exactly_ what the results are and whether they match your
So I'm thinking of the following scenarios:
A single instance with drives in RAID 0, RAID 10, and RAID 5.
And then having 3 VMs and 4 Solr instances, each with its own HD.
How do I test this?
Greetz
On Mar 12, 2018 1:16 PM, "BlackIce" wrote:
> OK, so we've gotten nowhere, since I've already lost lots of
Benchmark with production logs. Replay them at a constant request rate. Measure
the response time and look at the median and 90th or 95th percentile. Do not
use the average response time, because that will be thrown off by outliers.
It is best to run a few thousand warming queries before startin
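The median/95th-percentile summary described above can be computed in a few lines of Python. This is a sketch using the nearest-rank percentile, not a full benchmarking harness:

```python
import statistics

def latency_summary(times_ms):
    """Median and 95th percentile of response times. The mean is
    deliberately avoided because it is skewed by slow outliers."""
    ordered = sorted(times_ms)
    median = statistics.median(ordered)
    # nearest-rank 95th percentile
    p95 = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    return median, p95
```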
All,
I have a Solr index containing application user information (username,
first/last, etc.). I have created an "all" field for the purpose of
using it as a default. It contains most but not all fields.
I recently added phonetic searching for the
OK, so we've gotten nowhere, since I've already lost lots of time... A few
days more or less won't make a difference. I'd be willing to benchmark
if someone tells me how to.
Greetz
On Mar 12, 2018 7:17 AM, "Deepak Goel" wrote:
> Now you are mixing your original question about performance with
Erick,
On 3/12/18 1:00 PM, Erick Erickson wrote:
> bq: which you aren't supposed to edit directly.
>
> Well, kind of. Here's why it's "discouraged":
> https://lucene.apache.org/solr/guide/6_6/schema-api.html.
>
> But as long as you don't mix-and-
People can discourage that, but we only use hand-edited schema and solrconfig
files. Those are checked into version control. I wrote some Python to load them
into Zookeeper and reload the cluster.
This allows us to use the same configs in dev, test, and prod. We can actually
test things before
bq: which you aren't supposed to edit directly.
Well, kind of. Here's why it's "discouraged":
https://lucene.apache.org/solr/guide/6_6/schema-api.html.
But as long as you don't mix-and-match hand-editing with using the
schema API you can hand edit it freely. You're then in charge of
pushing it to
All,
I'd like to add a new synthesized field that uses a phonetic analyzer
such as Beider-Morse. I'm using Solr 7.2.
When I request the current schema via the schema API, I get a list of
existing fields, dynamic fields, and analyzers, none of which
Peter:
bq: I don't have a requestHandler named "/select".
Right, that was just an example of a request handler, your
"/scoresearch" handler _does_ have edismax as your default "defType"
so assuming you're using that one it makes no difference at all
whether you specify &defType=edismax on the URL
On 3/12/2018 3:22 AM, Deepak Goel wrote:
A single OS and JVM does not scale linearly for higher loads. If you have
separate OS and Java instances, the load is distributed across multiple
instances (with each instance only required to support a smaller load, and
hence it would scale nicely).
I had found this f
What would be the best way to patch this to Solr 6.6 without having to do a
full upgrade?
Thanks,
Roopa
On Fri, Mar 9, 2018 at 4:55 PM, Erick Erickson
wrote:
> Spoonerk:
>
> Please follow the instructions here:
> http://lucene.apache.org/solr/community.html#mailing-lists-irc
>
> . You must use t
Anyone have any thoughts on the questions I raised?
I have another question related to CDCR:
Sometimes we have to reindex a large chunk of our index (1M+ documents). What's
the best way to handle this if the normal CDCR process won't be able to keep
up? Manually trigger a bootstrap again? Or is
I have tried emailing to unsubscribe. I have tried disrupting threads
hoping to anger the admin into getting me out of the spam list. All I get
is arrogant emails about headers.
On Mar 12, 2018 1:15 AM, "苗海泉" wrote:
> Thanks Erick and Shawn , Thank you for your patience. I said that the
> abo
Now you are mixing your original question about performance with reliability
On 12 Mar 2018 02:29, "BlackIce" wrote:
> Second to this wouldn't 4 Solr instances each with its own HD be fault
> tolerant? vs. one solr instance with 4 HD's in RAID 0? Plus to his comes
> the storage capacity, I need
"You probably want to call solr.FlattenGraphFilterFactory after the call
to WordDelimiterGraphFilterFactory. I put it at the end."
That solved my issue
Thank you
Alright, thanks.
Yeah, the SuggestStopFilterFactory gets closer but isn't what I'm looking
for in this case!
Ryan
On Sat, 10 Mar 2018 at 06:12 Rick Leir wrote:
> Tav, Ryan
> Now you have me wondering, should it be returning *:* or some general
> landing page.
>
> Suppose you had typeahead or
We need benchmarks or data to support the claim.
A single OS and JVM does not scale linearly for higher loads. If you have
separate OS and Java instances, the load is distributed across multiple
instances (with each instance only required to support a smaller load, and
hence it would scale nicely).
I had found
Hi,
Glad to hear you removed the gramming, but Kraaij-Pohlmann isn't going to solve
all problems either, for example molens => molen, but molen => mool, and many
more like that. You can solve this by adding manual rules to
StemmerOverrideFilter, but due to the compound nature of words, you woul
Thanks Erick and Shawn, thank you for your patience. I said that the
above phenomenon was caused by IO, CPU, memory, and network IO. Swap
was turned off and the machine's memory was sufficient. When indexing
speed declines, QTime is found to take 3 to 4 seconds to
relo
On 3/11/2018 7:39 PM, Deepak Goel wrote:
I doubt this. It would be great if someone could substantiate this with hard
facts.
This seems to be in response to my claim that virtualization always has
overhead. I don't see how this statement can be at all controversial.
Virtualization isn't free,