Hi,
On 7.2 I regularly see this popping up:
2018-01-23 16:16:37.056 ERROR (qtp329611835-117592) [c:logs s:shard1
r:core_node1 x:logs_shard1_replica1] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 12/12 ms
at
Hi,
Glad to hear you removed the n-gramming, but Kraaij-Pohlmann isn't going to solve
all problems either; for example molens => molen, but molen => mool, and many
more like that. You can solve this by adding manual rules to
StemmerOverrideFilter, but due to the compound nature of words, you
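A minimal sketch of such an override (file name and entries are hypothetical); the dictionary is a UTF-8 file of tab-separated input/output pairs, and the filter should sit before the stemmer in the chain:

```xml
<!-- stemdict.txt holds tab-separated input/output pairs, e.g.:
     molens	molen
     molen	molen -->
<filter class="solr.StemmerOverrideFilterFactory" dictionary="stemdict.txt"/>
```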
Hello,
Our payload handling has been broken since Lucene/Solr 7.2; we sometimes get 0.0
= AveragePayloadFunction.docScore() for some but not all query clauses. We only
have payloads on some terms, to signal that the similarity needs to 'punish' the
term, e.g. for being an article or adjective.
I
Hi - In that case you need the KeywordRepeat and RemoveDuplicates filters as
well; I'd suggest reading their Javadocs. With the docs and the analysis GUI,
you can probably figure out their respective places in the tokenizer chain
yourself.
Trusting IDF is not really a finely controlled
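A sketch of the placement being described (the tokenizer and the Dutch stemmer are just example choices): KeywordRepeat emits each token twice, one copy marked as a keyword so the stemmer leaves it alone, and RemoveDuplicates afterwards drops the copy whenever stemming changed nothing.

```xml
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KeywordRepeatFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="Dutch"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```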
Hello Peter,
StemmerOverride wants \t-separated fields; that is probably the cause of the
AIOOBE you get. Regarding schema definitions, each factory's Javadoc [1] has a
proper example listed. I recommend putting a decompounder before a stemmer, and
having an accent (or ICU) folding filter as one of the
-Original message-
> From:PeterKerk
> Sent: Tuesday 13th March 2018 14:24
> To: solr-user@lucene.apache.org
> Subject: RE: Solr search engine configuration
>
> Markus,
>
> Thanks again. Ok, 1 by 1:
>
> StemmerOverride wants \t separated fields, that is
has:
> If this
> option is changed, the system property must be set on all servers and
> clients otherwise problems will arise
>
> Other than Zookeeper java property what are the other places this should be
> set?
>
> Thank you
> Roopa
>
> Sent from my iPhone
>
Hi - For now, the only option is to allow larger blobs via jute.maxbuffer
(whatever jute means). Despite ZK being designed for KB-sized blobs, Solr
demands that we abuse it. I think there was a ticket for compression support, but
that only stretches the limit.
We are running ZK with 16 MB for
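As the quoted warning says, the property must match on servers and clients. A sketch of how that could look for 16 MB (paths and variable names follow the usual ZK/Solr startup scripts; adjust to your installation):

```shell
# zookeeper-env.sh (every ZK server); value is in bytes
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=16777216"

# solr.in.sh (every Solr node, i.e. every ZK client); must match the servers
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=16777216"
```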
Inline, cheers.
-Original message-
> From:PeterKerk
> Sent: Tuesday 13th March 2018 18:53
> To: solr-user@lucene.apache.org
> Subject: RE: Solr search engine configuration
>
> You must stay in the Javadoc section, there the examples are good, or the
> reference
eriority to the FieldType here:
> https://issues.apache.org/jira/browse/SOLR-4619?focusedCommentId=13611191=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13611191
> Sadly, the FieldType is the one that is documented in the ref guide, but
> not the URP :-(
>
>
Hello Quynh,
Solr has support for external file fields [1]. They are a simple key=float
based text file where key is ID, and the float can be used for boosting/scoring
documents. This is a much simpler approach than using a separate collection.
These files can be reloaded every commit and are
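A sketch of such a setup (field and file names are hypothetical): the schema declares the ExternalFileField, solrconfig registers the reloader listeners so the file is re-read on commit, and the data directory holds a plain key=float file:

```xml
<!-- schema -->
<fieldType name="extFloat" class="solr.ExternalFileField"/>
<field name="popularity" type="extFloat" indexed="false" stored="false"/>

<!-- solrconfig: reload the external file when a new searcher opens -->
<listener event="newSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
<listener event="firstSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>

<!-- file external_popularity in the index data dir, key=float lines:
     doc-1=0.50
     doc-2=1.25 -->
```

The field can then be used in boost functions, e.g. boost=field(popularity).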
rse it isn't for everybody --
> only when the analysis chain is sufficiently complex.
>
> On Mon, Apr 9, 2018 at 9:45 AM Markus Jelsma <markus.jel...@openindex.io>
> wrote:
>
> > Hello David,
> >
> > The remote client has everything on the class path but jus
opy PreAnalyzedParser into
> your codebase so that you don't have to reinvent any wheels, even though
> that's awkward. Perhaps that ought to be in Solrj? But no we don't want
> SolrJ depending on Lucene-core, though it'd make a fine "optional"
> dependency.
>
> On Wed,
Inline.
-Original message-
> From:Shawn Heisey <apa...@elyograg.org>
> Sent: Tuesday 24th April 2018 21:18
> To: solr-user@lucene.apache.org
> Subject: Re: IndexFetcher cannot download index file
>
> On 4/24/2018 12:36 PM, Markus Jelsma wrote:
> > I
-Original message-
> From:Shawn Heisey <apa...@elyograg.org>
> Sent: Tuesday 24th April 2018 19:12
> To: solr-user@lucene.apache.org
> Subject: Re: IndexFetcher cannot download index file
>
> On 4/24/2018 9:46 AM, Markus Jelsma wrote:
> > Disk space was W
19:12
> > To: solr-user@lucene.apache.org
> > Subject: Re: IndexFetcher cannot download index file
> >
> > On 4/24/2018 9:46 AM, Markus Jelsma wrote:
> > > Disk space was WARN level. It seems only stack traces of ERROR level
> > > messages are visible via
Hello Nicolas,
Yes you can! Check out ComplexPhraseQParser
https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-ComplexPhraseQueryParser
Regards,
Markus
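A quick sketch of what such a query could look like (field and terms are made up); the parser allows wildcards, fuzzy terms and more inside a phrase:

```
q={!complexphrase inOrder=true}title:"solr cl*ud f?nd"
```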
-Original message-
> From:Nicolas Paris
> Sent: Sunday 22nd April 2018 20:04
> To:
Hello,
We have a DocumentTransformer that gets a Field from the SolrDocument and casts
it to StoredField (although apparently we don't need to cast). This works well
in tests and fine in production, except for some curious, unknown and
unreproducible cases, throwing the ClassCastException.
I
Hello,
After a failed log replay (it got a ClassCastException) with 7.2.1 it seems
Solr tries to haul over a 50 GB index from another replica. While doing so, it
throws a good number of checksum warnings.
Why don't the checksums match? Can I safely ignore them? Do I need to do
something about
Forget about it, recovery got a java.io.IOException: No space left on device
but it wasn't clear until I inspected the real logs.
The logs in the web admin didn't show the disk space exception, even when I
expand the log line. Maybe that could be changed.
Thanks,
Markus
-Original
Hello,
Slightly different question/problem: what is going on here on 7.2.1? During
the recovery, none of this node's fellow replicas' indexes were changed, but we
still got this error.
When we got that error, the recovery was restarted, but shortly after, the
replicas' indexes got updated and
l 2018 17:39
> To: solr-user@lucene.apache.org
> Subject: Re: IndexFetcher cannot download index file
>
> On 4/24/2018 6:52 AM, Markus Jelsma wrote:
> > Forget about it, recovery got a java.io.IOException: No space left on
> > device but it wasn't clear until i inspected the r
Hello,
We want to move to PreAnalyzed FieldType to offload our very heavy analysis
chain away from the search cluster, so we have to configure our fields to
accept pre-analyzed tokens in production.
But we use the same schema in development environments too, and that is where
we use JSON
Hello,
We intend to move to PreAnalyzed URP for analysis offloading. Browsing the
Javadocs I came across the SchemaRequest API looking for a way to get a Field
object remotely, which I seem to need for
JsonPreAnalyzedParser.toFormattedString(Field f). But all I can get from
SchemaRequest API
Hello,
QueryElevator.prepare() runs five times for a single query in distributed
search. This is probably not how it should be, but in what phase of distributed
search is it supposed to actually run?
Many thanks,
Markus
Anything on this one to share?
Thanks,
Markus
-Original message-
> From:Markus Jelsma
> Sent: Friday 16th March 2018 18:13
> To: Solr-user
> Subject: QueryElevator prepare() in distributed search
>
> Hello,
>
>
thing. It's
> just a tool for me so I didn't want to go too deep into it, but sometimes a
> must is a must. :) default schema.xml? I just get this managed_schema file
> when installing. Do you mean that one?
>
>
> On 27 February 2018 at 11:12 AM, Markus Jelsma wrote
Hello,
Mixing language specific filters in the same analyzer is not going to give
predictable or desirable results. Instead, create separate text_en and text_de
fieldTypes and fields. See Solr's default schema.xml, it has many examples of
various languages.
Depending on what query parser you
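A sketch of what separate fieldTypes could look like (the exact filters are just examples; the default schema has more complete chains):

```xml
<fieldType name="text_en" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_de" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.GermanLightStemFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title_en" type="text_en" indexed="true" stored="true"/>
<field name="title_de" type="text_de" indexed="true" stored="true"/>
```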
Hi,
I would not use ID (uniqueKey) as the signature field; query elevation would never
work properly with such a setup: change a document's content and it'll get a
new ID.
If I remember correctly this factory still deletes duplicates if signatureField
is not uniqueKey.
Regarding SOLR-3473,
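A sketch of the factory in an update chain (field choices are hypothetical); note that overwriteDupes controls whether older documents with the same signature get deleted:

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <!-- true would delete earlier documents carrying the same signature -->
    <bool name="overwriteDupes">false</bool>
    <str name="fields">title,content</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```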
related to updateLog replay.
>
> On Tue, Apr 24, 2018 at 7:13 AM Markus Jelsma <markus.jel...@openindex.io>
> wrote:
>
> > Hello,
> >
> > We have a DocumentTransformer that gets a Field from the SolrDocument and
> > casts it to StoredField (although apa
ke? Any custom
> plugins or things we should be aware of? Simple indexing artificial docs,
> querying and committing doesn't seem to reproduce the issue for me.
>
> On Thu, Apr 26, 2018 at 10:13 PM, Markus Jelsma
> wrote:
>
> > Hello,
> >
> > We just finished upgrad
-Original message-
> From:Shawn Heisey
> Sent: Wednesday 27th June 2018 17:40
> To: solr-user@lucene.apache.org
> Subject: Re: 7.4.0 changes in DocTransformer behaviour
>
> On 6/27/2018 8:29 AM, Markus Jelsma wrote:
> > I am attempting an upgrade to 7.4.0,
Hello Yonik,
If leaking a whole SolrIndexSearcher would cause this problem, then the only
custom component, our copy/paste-and-enhance version of the elevator
component, is the root of all problems. It is a direct copy of the 7.2 source
where only things like getAnalyzedQuery, the
y 2nd May 2018 17:21
> > To: solr-user
> > Subject: Re: Collection reload leaves dangling SolrCore instances
> >
> > Markus:
> >
> > You may well be hitting SOLR-11882.
> >
> > On Wed, May 2, 2018 at 8:18 AM, Shawn Heisey wrote:
> > > On 5
Hello,
We observed this problem too with older Solr versions. Whenever none of a
shard's replicas would come up, we would just shut them all down again and
restart just one replica and wait. In some cases it won't come up (still true
for Solr 7.4), but start a second shard a while later and
Hello Martin,
We also use a URP for this in some cases. We index documents to some
collection, the URP reads a field from that document which is an ID in another
collection. So we fetch that remote Solr document on-the-fly, and use those
fields to enrich the incoming document.
It is very
Hello Webster,
It smells like KeywordRepeat. In general it is not a problem if all terms are
scored twice. But you also have RemoveDuplicates, and this causes that in some
cases a term in one field is scored twice but only once in the other field, and
then you have a problem.
Due to lack of
Hello, apologies for this long winded e-mail.
Our fields have KeywordRepeat and language specific filters such as a stemmer,
the final filter at query-time is SynonymGraph. We do not use
RemoveDuplicatesFilter for those of you wondering why when you see the parsed
queries below, this is due to
Hello Pratik,
We would use ShingleFilter for this indeed. If you only want bigrams/shingles,
don't forget to disable outputUnigrams and set both shingle size limits to 2.
Regards,
Markus
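A sketch of that filter configuration, restricted to bigrams only:

```xml
<filter class="solr.ShingleFilterFactory"
        minShingleSize="2" maxShingleSize="2"
        outputUnigrams="false"/>
```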
-Original message-
> From:Pratik Patel
> Sent: Thursday 15th November 2018 17:00
> To:
lePositionIncrements="false" for stop word filter but
> that parameter only works for lucene version 4.3 or earlier. Looks like
> it's an open issue in lucene
> https://issues.apache.org/jira/browse/LUCENE-4065
>
> For now, I am trying to find a workaround using PatternReplaceFilterFactory.
There are a few bugs that require you to merge the index; see SOLR-8807
and related bugs.
https://issues.apache.org/jira/browse/SOLR-8807
-Original message-
> From:Erick Erickson
> Sent: Wednesday 3rd October 2018 4:50
> To: solr-user
> Subject: Re: Opinions on index
e you able to figure out anything?
> Currently thinking about rollbacking to 7.2.1.
>
>
>
> > On 3. Sep 2018, at 21:54, Markus Jelsma wrote:
> >
> > Hello,
> >
> > Getting an OOM plus the fact you are having a lot of IndexSearcher
> > instances
Hello,
Getting an OOM plus the fact you are having a lot of IndexSearcher instances
rings a familiar bell. One of our collections has the same issue [1] when we
attempted an upgrade 7.2.1 > 7.3.0. I managed to rule out all our custom Solr
code but had to keep our Lucene filters in the schema,
Hello Aishwarya,
KStem does a really bad job with the examples you have given; it won't remove
the -s and -ing suffixes in some strange cases. Porter/Snowball work just fine
for this example.
What won't work, of course, are irregular verbs and nouns (plural forms). They
always need to be
Indeed, but JDK-8038348 has been fixed very recently for Java 9 or higher.
-Original message-
> From:Jeff Courtade
> Sent: Wednesday 26th September 2018 17:36
> To: solr-user@lucene.apache.org
> Subject: Re: Java version 11 for solr 7.5?
>
> My concern with using g1 is solely based on
Hello,
Apologies for bothering you all again, but I really need some help in this
matter. How can we resolve this issue? Are we dealing with a bug here (then
I'll open a ticket), or am I doing something wrong?
Is here anyone who had the same issue or understand the problem?
Many thanks,
Markus
Hello,
There is an extremely undocumented parameter to get the cache's contents
displayed. Set showItems="100" on the filter cache.
Regards,
Markus
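A sketch of where that attribute goes in solrconfig.xml (the size numbers are just placeholders):

```xml
<filterCache class="solr.FastLRUCache"
             size="512" initialSize="512" autowarmCount="128"
             showItems="100"/>
```

With showItems set, the cache's entries appear in the plugins/stats view of the admin UI.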
-Original message-
> From:Erick Erickson
> Sent: Wednesday 16th January 2019 17:40
> To: solr-user
> Subject: Re: Re:
Hello,
Sorry for trying this once more. Is there anyone around who can help me, and
perhaps others, with this subject and the linked Jira ticket and failing test?
I could really use some help from someone who is really familiar with the edismax
code and the underlying QueryBuilder parts that are
Hello,
I have opened SOLR-13009 describing the problem. The attached patch contains
a unit test proving the problem, i.e. the test fails. Any help would be greatly
appreciated.
Many thanks,
Markus
https://issues.apache.org/jira/browse/SOLR-13009
-Original message-
> From:Markus
Hello,
A background batch process compiles a data set; when finished, it sends a
delete-all to its target collection, then everything gets sent by SolrJ,
followed by a regular commit. When inspecting the core I notice it has one
segment with 9578 documents, of which exactly half are deleted.
and
> https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/
> (Solr 7.5+).
>
> Best,
> Erick
> On Tue, Nov 27, 2018 at 4:29 AM Markus Jelsma
> wrote:
> >
> > Hello,
> >
> > A background batch process compiles a data set, when fi
Hello,
We just witnessed this too with 7.7. No obvious messages in the logs; the
replica status would not come out of 'down'.
Meanwhile we got another weird exception from a neighbouring collection sharing
the same nodes:
2019-02-18 13:47:20.622 ERROR
Hello,
We are moving some replicas to TLOG; one collection runs 7.5, the others 7.7.
When indexing, we see UPDATE.updateHandler.errors increment for each document
being indexed, but there is nothing in the logs.
Is this a known issue?
Thanks,
Markus
produce this
> should be
> a JIRA IMO.
>
> Best,
> Erick
>
> > On Feb 21, 2019, at 2:33 AM, Markus Jelsma
> > wrote:
> >
> > Hello,
> >
> > We are moving some replica's to TLOG, one collection runs 7.5, the others
> > 7.
enumerated approach for phrase queries where slop>0, so setting ps=0 would
> probably also help.
> Michael
>
> On Fri, Feb 8, 2019 at 5:57 AM Markus Jelsma
> wrote:
>
> > Hello (apologies for cross-posting),
> >
> > While working on SOLR-12743, using 7.
Hello,
Solr's error responses respect the configured response writer settings, so you
could probably remove the element and its contents using XSLT. It is not too
fancy, but it should work.
Regards,
Markus
-Original message-
> From:Branham, Jeremy (Experis)
> Sent: Friday
Hello,
Due to reading 'This filter must be included on index-time analyzer..' in the
documentation, I never considered adding it to a query-time analyzer.
However, we had problems with a set of three two-word synonyms never yielding
the same number of results with SynonymGraph. When switching
Hello (apologies for cross-posting),
While working on SOLR-12743, using 7.6 on two nodes and 7.2.1 on the remaining
four, we stumbled upon a situation where the 7.6 nodes quickly succumb when a
'Query-of-Death' is issued, 7.2.1 up to 7.5 are all unaffected (tested and
confirmed).
Following
I stumbled upon this too yesterday and created SOLR-13249. In local unit tests
we get String but in distributed unit tests we get a ByteArrayUtf8CharSequence
instead.
https://issues.apache.org/jira/browse/SOLR-13249
-Original message-
> From:Andreas Hubold
> Sent: Friday 15th
Hello,
Thanks to SOLR-12743 - one of our collections can't use FastLRUCache - we are
considering LFUCache instead. But there is SOLR-3393 as well, claiming the
current implementation is inefficient.
But ConcurrentLRUCache and ConcurrentLFUCache both use ConcurrentHashMap under
the hood, the
Hello,
I made a ConditionalTokenFilter filter and factory. Its Lucene-based unit tests
work really well, and I can see it is doing something; queries are analyzed
differently based on some condition.
But when debugging through the GUI I get the following:
2019-04-15 12:37:42.219 ERROR
Hello,
We use VisualVM for making observations. But use Eclipse MAT for in-depth
analysis, usually only when there is a suspected memory leak.
Regards,
Markus
-Original message-
> From:John Davis
> Sent: Friday 7th June 2019 20:30
> To: solr-user@lucene.apache.org
> Subject: Re:
Hello,
When upgrading to 7.7 I got SOLR-13249, where a SolrInputField's value suddenly
became ByteArrayUtf8CharSequence instead of a String. That has been addressed.
I am now upgrading to 8.1.1 and have a SearchComponent that uses
SolrClient to fetch documents from elsewhere
Hello,
Slight correction, SolrCLI does become visible in the local applications view.
I just missed it before.
Thanks,
Markus
-Original message-
> From:Markus Jelsma
> Sent: Thursday 30th May 2019 14:47
> To: solr-user
> Subject: Solr 8.1.1, JMX and VisualVM
>
> Hello,
>
> While
Hello,
While upgrading from 7.7 to 8.1.1, I noticed start.jar and SolrCLI no longer
pop up in the local applications view of VisualVM! I CTRL-F'ed my way through
the changelog for Solr 8.0.0 to 8.1.1 but could not find anything related. I am
clueless!
Using OpenJDK 11.0.3 2019-04-16 and Solr
Hello,
What is missing in that article is that you must never use NOW without rounding it
down in a filter query. If you have it, round it down to an hour, day or minute
to prevent flooding the filter cache.
Regards,
Markus
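A sketch of the difference (field name is hypothetical); the rounded variant produces the same query string all day, so the filter cache entry is reused instead of creating a new one per request:

```
# uncached: NOW differs on every request, so every fq is a new cache entry
fq=timestamp:[NOW-7DAYS TO NOW]

# cacheable: rounded down to the day, identical for a whole day
fq=timestamp:[NOW/DAY-7DAYS TO NOW/DAY]
```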
-Original message-
> From:Atita Arora
> Sent: Wednesday 29th May
22, 2019 at 11:00 AM Gregg Donovan wrote:
>
> > FWIW: we have also seen serious Query of Death issues after our upgrade to
> > Solr 7.6. Are there any open issues we can watch? Is Markus' findings
> > around `pf` our best guess? We've seen these issues even with ps=0. We also
&
ct: Re: Solr 8.1.1, JMX and VisualVM
>
> Hi,
>
> This has to do with the new JVM flags that optimise performance, they were
> added roughly at the same time when Solr switched to G1GC.
>
> In ‘bin/solr’ please comment out this flag: '-XX:+PerfDisableSharedMem'.
>
> &g
Hello,
We are upgrading to Solr 8. One of our reindexed collections takes a GB more
disk space than production, which is on 7.7.1. Production also has deleted
documents. This means Solr 8 somehow uses more disk space. I have checked both
Solr's and Lucene's CHANGES but no ticket was immediately
> an "optimize" change anything? Is this DocValues strings?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 12. jun. 2019 kl. 23:49 skrev Markus Jelsma :
> >
> > Hello again,
> >
> > We found
sey
> Sent: Thursday 13th June 2019 13:42
> To: solr-user@lucene.apache.org
> Subject: Re: Increased disk space usage 8.1.1 vs 7.7.1
>
> On 6/13/2019 4:19 AM, Markus Jelsma wrote:
> > We are upgrading to Solr 8. One of our reindexed collections takes a GB
> > more than the pro
Hello again,
We found another oddity when upgrading to Solr 8. For a *:* query, the facet
counts for a simple string field do not match at all between these versions.
Solr 7.7.1 gives lower or zero counts whereas for 8 we see the correct counts.
So something seems fixed for a bug that I was
Hello,
One of our collections hates CursorMark, it really does. When under very heavy
load the nodes can occasionally consume GBs of additional heap for no clear reason
immediately after downloading the entire corpus.
Although the additional heap consumption is a separate problem that I hope
Hello,
With gridLevel set to 3 I have a map of 256 x 128. However, I would really like
a higher resolution, preferably twice as high. But with any gridLevel higher
than 3, or distErrPct 0.1 or lower, I get the IllegalArgumentException, saying
it does not want to give me a 1024x1024 sized map.
Hello,
If you get a Connection Refused, then normally the server is just offline. But
something weird is hiding in your stack trace; you should check it out further:
> Caused by: java.net.ConnectException: Cannot assign requested address
> (connect failed)
I have not seen this before.
Opened SOLR-13591.
https://issues.apache.org/jira/browse/SOLR-13591
-Original message-
> From:Markus Jelsma
> Sent: Thursday 27th June 2019 13:20
> To: solr-user@lucene.apache.org; solr-user
> Subject: RE: Solr 8 getZkStateReader throwing AlreadyClosedException
>
> This was 8.1.1
Hello,
There is no definitive rule for this; it depends on your situation, such as the
size of documents, resource constraints and a possibly heavy analysis chain. And
in case of (re)indexing a large amount, your autocommit time/limit is probably
more important.
In our case, some collections are
Hello,
We had two different SolrClients failing on different collections and machines
just around the same time. After restarting everything was just fine again. The
following exception was thrown:
2019-06-27 11:04:28.117 ERROR (qtp203849460-13532) [c:_shard1_replica_t15]
This was 8.1.1 to be precise. Sorry!
-Original message-
> From:Markus Jelsma
> Sent: Thursday 27th June 2019 13:19
> To: solr-user
> Subject: Solr 8 getZkStateReader throwing AlreadyClosedException
>
> Hello,
>
> We had two different SolrClients failing on different collections
Hello,
There is a newly created 8.2.0 all NRT type cluster for which i replaced each
NRT replica with a TLOG type replica. Now, the replicas no longer replicate
when the leader receives data. The situation is odd, because some shard
replicas kept replicating up until eight hours ago, another
asn't caused any issues.
>
> I'll make a note to check state.json next time we encounter the
> situation to see if I can see what you reported.
>
> Regards,
> Ere
>
> Markus Jelsma wrote on 22.8.2019 at 16:36:
> > Hello,
> >
> > There is a newly created
Hello,
Looking this up I found SOLR-5692, but that was solved a lifetime ago, so I am
just checking whether this is a familiar error and one I am missing in Jira:
A client's Solr 8.2.0 cluster brought us the following StackOverflowError while
running 8.2.0 on Java 8:
Exception in thread
Hello Arnold,
Yes, we do this too for several cases.
You can create the SolrClient in the Factory's inform() method, and pass it to
the URP when it is created. You must implement SolrCoreAware and close the
client when the core closes as well. Use a CloseHook for this.
If you do not close the
Is there any way to get the information about the current Solr endpoint
> from within the custom URP?
>
> On Wed, Sep 4, 2019 at 3:10 PM Markus Jelsma
> wrote:
>
> > Hello Arnold,
> >
> > Yes, we do this too for several cases.
> >
> > You can create the So
Hello Rahul,
I don't know why you don't see your log lines, but if I remember correctly,
you must put all custom processors above Log, Distributed and Run; at least I
remember reading that somewhere a long time ago.
We put all our custom processors on top of the three default processors and
that approach works for the other use case of searching from end of
> documents ?
> For example if I need to perform some term search from the end, e.g. "book"
> in the last 30 or 100 words.
>
> Is there SpanLastQuery ?
>
> Thanks,
> Adi
>
> -Original Me
Hello Adi,
Try SpanFirstQuery. It limits the search to within the Nth term in the field.
Regards,
Markus
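If I recall the Lucene XML query parser syntax correctly, SpanFirst can be reached from Solr without custom code via the xmlparser query parser; a sketch (field name and position limit are made up):

```
q={!xmlparser}<SpanFirst end="30"><SpanTerm fieldName="text">book</SpanTerm></SpanFirst>
```

This matches "book" only when it occurs within the first 30 positions of the field.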
-Original message-
> From:Kaminski, Adi
> Sent: Tuesday 15th October 2019 8:25
> To: solr-user@lucene.apache.org
> Subject: Position search
>
> Hi,
> What's the recommended way
Hello,
We are moving our text analysis outside of Solr and using PreAnalyzedField to
speed up indexing. We also use MLT, but these two don't work together; there is
no way for MLT to properly analyze a document using the PreAnalyzedField's
analyzer, and it does not pass the code in the MLT
Hello Phil,
Solr never returns "The website encountered an unexpected error. Please try
again later." as an error. To get to the root of the problem, you should at
least post error logs that Solr actually throws, if it does at all.
You either have an application error, or an actual Solr
Hello Kyle,
This is actually what the manual [1] clearly warns about. Snippet copied from the
manual:
"When setting the maximum heap size, be careful not to let the JVM consume all
available physical memory. If the JVM process space grows too large, the
operating system will start swapping it, which
Hello,
I have multiple collections, one 7.5.0 and the rest is on 8.3.1. They all share
the same ZK ensemble and have the same ZK connection string. The first ZK
address in the connection string is one that is not reachable, it seems
firewalled, the rest is accessible.
The 7.5.0 nodes do not
I found the bastard; it was a freaky document that screwed Solr over: indexing
kept failing, passing documents between replicas times out, documents get
reindexed, and so the document (and others) end up in the transaction log (many
times) and are eligible for reindexing. Reindexing and
Hello,
Our main Solr text search collection broke down last night (search was still
working fine), every indexing action timed out with the Solr master spending
most of its time in Java regex. One shard has only one replica left for queries
and it stays like that. I have copied both shard's
Hello,
Although it is not mentioned in Solr's language analysis page in the manual,
Lucene has had support for Korean for quite a while now.
https://lucene.apache.org/core/8_5_0/analyzers-nori/index.html
Regards,
Markus
-Original message-
> From:Audrey Lorberfeld -
> (Or should we be using this extended ExtendedDisMaxQParser class server
> side in Solr?)
>
> Kind regards,
>
> Edd
>
> ----
> Edward Turner
>
>
> On Mon, 17 Aug 2020 at 15:06, Markus Jelsma
> wrote:
>
> > Hello Edward,
> >
> &
Hello,
Normally, if a single document is bad, the whole indexing batch is dropped. I
think I remember there was a URP(?) that discards bad documents from the
batch, but I cannot find it in the manual [1].
Is it possible or am I starting to imagine things?
Thanks,
Markus
[1]
Hello,
You can use TrimFieldUpdateProcessorFactory [1] in your URP chain to remove
leading or trailing whitespace when indexing.
Regards,
Markus
[1]
https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/TrimFieldUpdateProcessorFactory.html
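A sketch of the processor in a chain (chain name is hypothetical); it takes no arguments in its simplest form and should sit before the default processors:

```xml
<updateRequestProcessorChain name="trim">
  <processor class="solr.TrimFieldUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```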
-Original
Subject: Re: Drop bad document in update batch
>
> I think you’re looking for TolerantUpdateProcessor(Factory), added in
> SOLR-445. It hung around for a LOGGG time and didn’t actually get
> added until 6.1.
>
> > On Aug 18, 2020, at 12:51 PM, Markus J
Hello Edward,
Yes you can by extending ExtendedDismaxQParser [1] and override its parse()
method. You get the main Query object through super.parse().
If you need even more fine grained control on how Query objects are created you
can extend ExtendedSolrQueryParser's [2] (inner class)
Well, when not splitting on whitespace you can use the CharFilter for regex
replacements [1] to clear the entire search string if a banned word is found
anywhere in the string:
.*(cigarette|tobacco).*
[1]
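A sketch of that pattern in a query-time analyzer (the tokenizer is just an example); the char filter runs before tokenization, so a match empties the whole input:

```xml
<analyzer type="query">
  <charFilter class="solr.PatternReplaceCharFilterFactory"
              pattern=".*(cigarette|tobacco).*" replacement=""/>
  <tokenizer class="solr.StandardTokenizerFactory"/>
</analyzer>
```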