RE: Programmatic Basic Auth on CloudSolrClient

2021-03-05 Thread Subhajit Das

Hi Tomas,

Tried your suggestion. But the last suggestion (directly passing the HttpClient)
results in NonRepeatableRequestException. And using the full set of steps, the
auth was also not recognized.

Anything I should look for?

Thanks,
Subhajit

From: Tomás Fernández Löbbe
Sent: 05 March 2021 04:23 AM
To: solr-user@lucene.apache.org
Subject: Re: Programmatic Basic Auth on CloudSolrClient

Ah, right, now I remember that something like this was possible with the
"http1" version of the clients, which is why I created the Jira issues for
the http2 ones. Maybe you can even skip the "LBHttpSolrClient" step; I
believe you can just pass the HttpClient to the CloudSolrClient. You will
have to make sure to close all the clients that are created externally
once you are done, since the Solr client won't close them in this case.
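For reference, a minimal sketch of that direct approach, assuming the http1
SolrJ clients and Apache HttpClient 4.x (untested; the ZooKeeper address,
credentials, and collection name are placeholders):

import java.util.Collections;
import java.util.Optional;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class DirectHttpClientAuth {
  public static void main(String[] args) throws Exception {
    // Fixed credentials for every request made through this HttpClient
    BasicCredentialsProvider provider = new BasicCredentialsProvider();
    provider.setCredentials(AuthScope.ANY,
        new UsernamePasswordCredentials("user", "pass"));
    CloseableHttpClient httpClient = HttpClientBuilder.create()
        .setDefaultCredentialsProvider(provider)
        .build();
    CloudSolrClient client = new CloudSolrClient.Builder(
            Collections.singletonList("zkhost:2181"), Optional.empty())
        .withHttpClient(httpClient)
        .build();
    try {
      client.query("collection1", new SolrQuery("*:*"));
    } finally {
      // Close both: the Solr client does not close an externally created HttpClient
      client.close();
      httpClient.close();
    }
  }
}

Note that challenge-response (non-preemptive) basic auth makes HttpClient
replay the request after a 401, which can surface as
NonRepeatableRequestException for POSTs with streamed bodies; configuring
preemptive authentication on the HttpClient side is the usual way around that.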

On Thu, Mar 4, 2021 at 1:22 PM Mark H. Wood  wrote:

> On Wed, Mar 03, 2021 at 10:34:50AM -0800, Tomás Fernández Löbbe wrote:
> > As far as I know the current OOTB options are system properties or
> > per-request (which would allow you to use different credentials per collection, but
> > probably not ideal if you do different types of requests from different
> > parts of your code). A workaround (which I've used in the past) is to
> have
> > a custom client that overrides and sets the credentials in the "request"
> > method (you can put whatever logic there to identify which credentials to
> > use). I recently created
> https://issues.apache.org/jira/browse/SOLR-15154
> > and https://issues.apache.org/jira/browse/SOLR-15155 to try to address
> this
> > issue in future releases.
>
> I have not tried it, but could you not:
>
> 1. set up an HttpClient with an appropriate CredentialsProvider;
> 2. pass it to HttpSolrClient.Builder.withHttpClient();
> 3. pass that Builder to
> LBHttpSolrClient.Builder.withHttpSolrClientBuilder();
> 4. pass *that* Builder to
> CloudSolrClient.Builder.withLBHttpSolrClientBuilder();
>
> Now you have control of the CredentialsProvider and can have it return
> whatever credentials you wish, so long as you still have a reference
> to it.
>
> > On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das 
> wrote:
> >
> > >
> > > Hi There,
> > >
> > > Is there any way to programmatically set basic authentication
> > > credentials on CloudSolrClient?
> > >
> > > The only documentation available is to use a system property. This is not
> > > useful if two collections require two separate sets of credentials and
> > > they are accessed in parallel.
> > > Thanks in advance.
> > >
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>



Re: Investigating Seeming Deadlock

2021-03-05 Thread Mike Drob
Were you having any OOM errors beforehand? If so, that could have caused
some GC of objects that other threads still expect to be reachable, leading
to these null monitors.

On Fri, Mar 5, 2021 at 12:55 PM Stephen Lewis Bianamara <
stephen.bianam...@gmail.com> wrote:

> Hi SOLR Community,
>
> I'm investigating a node on solr 8.3.1 running in cloud mode which appears
> to have deadlocked, and I'm trying to figure out if this is a known issue
> or not, and looking for some guidance in understanding both (a) whether
> this is a resolved issue in future releases or needs a bug, and (b) how to
> lower the risk of recurrence until it is fixed.
>
> Here is what I've observed:
>
>- strace shows the main process waiting. A spot check on child processes
>shows the same, though I did not deep dive all of the threads yet (there
>are over 100).
>- the server was not doing anything and was not busy, except for the jvm
>sitting at constant memory usage. No resource (memory, swap, cpu, etc.) was
>limited or showing active usage.
>- jcmd Thread.print shows some interesting info which suggests a
>deadlock or another type of locking issue
>   - For example, I found this log entry, which suggests something unusual
>   because it looks like it's trying to lock a null object
>  - "Finalizer" #3 daemon prio=8 os_prio=0 cpu=11.11ms
>  elapsed=11.11s tid=0x0100 nid=0x in
> Object.wait()
>   [0x1000]
> java.lang.Thread.State: WAITING (on object monitor)
>  at java.lang.Object.wait(java.base@11.0.7/Native Method)
>  - waiting on 
>  at java.lang.ref.ReferenceQueue.remove(java.base@11.0.7
>  /ReferenceQueue.java:155)
>  - waiting to re-lock in wait() <0x00020020> (a
>  java.lang.ref.ReferenceQueue$Lock)
>  at java.lang.ref.ReferenceQueue.remove(java.base@11.0.7
>  /ReferenceQueue.java:176)
>  at
>  java.lang.ref.Finalizer$FinalizerThread.run(java.base@11.0.7
>  /Finalizer.java:170)
>  - I also see a lot of this. Some addresses occur multiple times,
>   but one in particular occurs 31 times. Maybe related?
>  - "h2sc-1-thread-11" #110 prio=5 os_prio=0 cpu=54.29ms
>  elapsed=11.11s tid=0x10010100 nid=0x waiting
> on condition
>   [0x10011000]
> java.lang.Thread.State: WAITING (parking)
>  at jdk.internal.misc.Unsafe.park(java.base@11.0.7/Native
>  Method)
>  - parking to wait for  <0x00030033>
>
> Can anyone help answer whether this is known or what I could look at next?
>
> Thanks!
> Stephen
>


Investigating Seeming Deadlock

2021-03-05 Thread Stephen Lewis Bianamara
Hi SOLR Community,

I'm investigating a node on solr 8.3.1 running in cloud mode which appears
to have deadlocked, and I'm trying to figure out if this is a known issue
or not, and looking for some guidance in understanding both (a) whether
this is a resolved issue in future releases or needs a bug, and (b) how to
lower the risk of recurrence until it is fixed.

Here is what I've observed:

   - strace shows the main process waiting. A spot check on child processes
   shows the same, though I did not deep dive all of the threads yet (there
   are over 100).
   - the server was not doing anything and was not busy, except for the jvm
   sitting at constant memory usage. No resource (memory, swap, cpu, etc.) was
   limited or showing active usage.
   - jcmd Thread.print shows some interesting info which suggests a
   deadlock or another type of locking issue
  - For example, I found this log entry, which suggests something unusual
  because it looks like it's trying to lock a null object
 - "Finalizer" #3 daemon prio=8 os_prio=0 cpu=11.11ms
 elapsed=11.11s tid=0x0100 nid=0x in Object.wait()
  [0x1000]
java.lang.Thread.State: WAITING (on object monitor)
 at java.lang.Object.wait(java.base@11.0.7/Native Method)
 - waiting on 
 at java.lang.ref.ReferenceQueue.remove(java.base@11.0.7
 /ReferenceQueue.java:155)
 - waiting to re-lock in wait() <0x00020020> (a
 java.lang.ref.ReferenceQueue$Lock)
 at java.lang.ref.ReferenceQueue.remove(java.base@11.0.7
 /ReferenceQueue.java:176)
 at
 java.lang.ref.Finalizer$FinalizerThread.run(java.base@11.0.7
 /Finalizer.java:170)
 - I also see a lot of this. Some addresses occur multiple times,
  but one in particular occurs 31 times. Maybe related?
 - "h2sc-1-thread-11" #110 prio=5 os_prio=0 cpu=54.29ms
 elapsed=11.11s tid=0x10010100 nid=0x waiting
on condition
  [0x10011000]
java.lang.Thread.State: WAITING (parking)
 at jdk.internal.misc.Unsafe.park(java.base@11.0.7/Native
 Method)
 - parking to wait for  <0x00030033>

Can anyone help answer whether this is known or what I could look at next?

Thanks!
Stephen
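For reference, one quick cross-check for a true monitor deadlock is to ask
the JVM itself through the standard java.lang.management API; a minimal
sketch, to be run in-process or adapted to a remote JMX connection against
the Solr node:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {
  public static void main(String[] args) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // Threads deadlocked on object monitors or ownable synchronizers
    long[] ids = mx.findDeadlockedThreads();
    if (ids == null) {
      System.out.println("No monitor/synchronizer deadlock detected");
      return;
    }
    // Dump the stacks and held locks of the deadlocked threads
    for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
      System.out.println(info);
    }
  }
}

Parked pool threads such as "h2sc-1-thread-11" above are an ordinary WAITING
state; findDeadlockedThreads() only reports threads that are actually part of
a lock cycle.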


Re: What controls field cache size and eviction rates?

2021-03-05 Thread Stephen Lewis Bianamara
I should say -- can anyone confirm whether it's *still* right, since the
article is 10 years old :)

On Fri, Mar 5, 2021 at 10:36 AM Stephen Lewis Bianamara <
stephen.bianam...@gmail.com> wrote:

> Hi SOLR Community,
>
> Just following up here with an update. I found this article, which goes
> into depth on the field cache though it stops short of discussing how it
> handles eviction. Can anyone confirm if this info is right?
>
> https://lucidworks.com/post/scaling-lucene-and-solr/
>
>
> Also, can anyone speak to how the field cache handles evictions?
>
> Best,
> Stephen
>
> On Wed, Feb 24, 2021 at 4:43 PM Stephen Lewis Bianamara <
> stephen.bianam...@gmail.com> wrote:
>
>> Hi SOLR Community,
>>
>> I've been trying to understand how the field cache in SOLR manages
>> its evictions, and neither the code nor the documentation makes it easy to
>> answer the simple question of when and how something gets evicted from the
>> field cache. This cache also doesn't show hit ratio, total hits, eviction
>> ratio, total evictions, etc. in the web UI.
>>
>> For example: I've observed that if I write one document and trigger a
>> query with a sort on the field, it will generate two entries in the field
>> cache. Then if I repush the document, the entries get removed, but will
>> otherwise stay there seemingly forever. If my query matches 2 docs, same
>> thing but with 4 entries (2 each). Then, if I rewrite one of the docs,
>> those two entries go away but not the two from the first one. This
>> obviously implies that this cache affects write-throughput
>> performance, so the fact that it is not configurable by
>> the user and doesn't have very clear documentation is a bit worrisome.
>>
>> Can someone here help out and explain how the field cache handles
>> evictions, or perhaps send me the documentation if I missed it?
>>
>>
>> Thanks!
>> Stephen
>>
>


Re: What controls field cache size and eviction rates?

2021-03-05 Thread Stephen Lewis Bianamara
Hi SOLR Community,

Just following up here with an update. I found this article, which goes into
depth on the field cache though it stops short of discussing how it handles
eviction. Can anyone confirm if this info is right?

https://lucidworks.com/post/scaling-lucene-and-solr/


Also, can anyone speak to how the field cache handles evictions?

Best,
Stephen

On Wed, Feb 24, 2021 at 4:43 PM Stephen Lewis Bianamara <
stephen.bianam...@gmail.com> wrote:

> Hi SOLR Community,
>
> I've been trying to understand how the field cache in SOLR manages
> its evictions, and neither the code nor the documentation makes it easy to
> answer the simple question of when and how something gets evicted from the
> field cache. This cache also doesn't show hit ratio, total hits, eviction
> ratio, total evictions, etc. in the web UI.
>
> For example: I've observed that if I write one document and trigger a
> query with a sort on the field, it will generate two entries in the field
> cache. Then if I repush the document, the entries get removed, but will
> otherwise stay there seemingly forever. If my query matches 2 docs, same
> thing but with 4 entries (2 each). Then, if I rewrite one of the docs,
> those two entries go away but not the two from the first one. This
> obviously implies that this cache affects write-throughput
> performance, so the fact that it is not configurable by
> the user and doesn't have very clear documentation is a bit worrisome.
>
> Can someone here help out and explain how the field cache handles
> evictions, or perhaps send me the documentation if I missed it?
>
>
> Thanks!
> Stephen
>
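For reference, the raw cache stats can also be pulled from the Metrics API
with SolrJ; a minimal sketch (the host is a placeholder, and exactly which
field cache keys appear under the CACHE prefix varies by version, so treat
the key names as assumptions):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class CacheMetricsDump {
  public static void main(String[] args) throws Exception {
    try (SolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("group", "core");   // per-core metrics only
      params.set("prefix", "CACHE"); // restrict to cache-related keys
      NamedList<Object> rsp = client.request(
          new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics", params));
      System.out.println(rsp);
    }
  }
}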


Re: Caffeine Cache Metrics Broken?

2021-03-05 Thread Stephen Lewis Bianamara
Thanks Shawn. Something seems different between the two, because Caffeine
Cache is showing much higher volume per hour than our previous
implementation did. So I guess it is more likely that this is something
actually expected, due to a change in what is getting kept/warmed. I'll
look into this more and get back to you if it doesn't end up making sense
based on what I observe.

Thanks again,
Stephen

On Tue, Mar 2, 2021 at 6:35 PM Shawn Heisey  wrote:

> On 3/2/2021 3:47 PM, Stephen Lewis Bianamara wrote:
> > I'm investigating a weird behavior I've observed in the admin page for
> > caffeine cache metrics. It looks to me like on the older caches, warm-up
> > queries were not counted toward hit/miss ratios, which of course makes
> > sense, but on Caffeine cache it looks like they are. I'm using solr 8.3.
> >
> > Obviously this makes measuring its true impact a little tough. Is this by
> > any chance a known issue and already fixed in later versions?
>
> The earlier cache implementations are entirely native to Solr -- all the
> source code is included in the Solr codebase.
>
> Caffeine is a third-party cache implementation that has been integrated
> into Solr.  Some of the metrics might come directly from Caffeine, not
> Solr code.
>
> I would expect warming queries to be counted on any of the cache
> implementations.  One of the reasons that the warming capability exists
> is to pre-populate the caches before actual queries begin.  If warming
> queries are somehow excluded, then the cache metrics would not be correct.
>
> I looked into the code and did not find anything that would keep warming
> queries from affecting stats.  But it is always possible that I just
> didn't know what to look for.
>
> In the master branch (Solr 9.0), CaffeineCache is currently the only
> implementation available.
>
> Thanks,
> Shawn
>


Re: org.apache.solr.common.SolrException: this IndexWriter is closed

2021-03-05 Thread Dominique Bejean
Hi,
Are you using RAMDirectoryFactory without enough RAM?
Regards,
Dominique

Le ven. 5 mars 2021 à 16:18, 李世明  a écrit :

> Hello:
>
> Has anyone encountered the following exception? It causes the index to
> stop being written, but queries still work.
> Version: 8.7.0
>
> org.apache.solr.common.SolrException: this IndexWriter is closed
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:234)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)
> at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:795)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:568)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
> at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
> at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
> at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
> at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
> at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
> at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
> at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
> at
> org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
> at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
> at org.eclipse.jetty.server.Server.handle(Server.java:500)
> at
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
> at
> org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
> at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
> at org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:335)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
> at
> org.eclipse.jetty.http2.HTTP2Connection.produce(HTTP2Connection.java:170)
> at
> org.eclipse.jetty.http2.HTTP2Connection.onFillable(HTTP2Connection.java:125)
> at
> org.eclipse.jetty.http2.HTTP2Connection$FillableCallback.succeeded(HTTP2Connection.java:348)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
> at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
> at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
> at
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
> at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
> at
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
> at java.base/java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed

org.apache.solr.common.SolrException: this IndexWriter is closed

2021-03-05 Thread 李世明
Hello:

Has anyone encountered the following exception? It causes the index to stop
being written, but queries still work.
Version: 8.7.0

org.apache.solr.common.SolrException: this IndexWriter is closed
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:234)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:795)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:568)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
at 
org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:500)
at 
org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:335)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
at 
org.eclipse.jetty.http2.HTTP2Connection.produce(HTTP2Connection.java:170)
at 
org.eclipse.jetty.http2.HTTP2Connection.onFillable(HTTP2Connection.java:125)
at 
org.eclipse.jetty.http2.HTTP2Connection$FillableCallback.succeeded(HTTP2Connection.java:348)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is 
closed
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:877)
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:891)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:2080)

Re: new tlog files are not created per commit but adding into latest existing tlog file after replica reload

2021-03-04 Thread Michael Hu
Hi experts:

After I sent out previous email, I issued commit on that replica core and 
observed the same "ClosedChannelException", please refer to below under 
"issuing core commit" section

Then I issued a core reload, and I see the timestamp of the latest tlog file 
changed, please refer to "files under tlog directory " section below. Not sure 
those information is useful or not.

Thank you!

--Michael Hu

--- beginning for issuing core commit ---

$ curl 'http://localhost:8983/solr/myconection_myshard_replica_t7/update?commit=true'

{
  "responseHeader":{
    "status":500,
    "QTime":71},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","java.nio.channels.ClosedChannelException"],
    "msg":"java.nio.channels.ClosedChannelException",
    "trace":"org.apache.solr.common.SolrException:

--- end for issuing core commit ---

--- beginning for files under tlog directory ---
before core reload:

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991713 Mar  4 20:29 tlog.883


after core reload:

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991717 Mar  4 22:23 tlog.883


--- end for files under tlog directory ---



From: Michael Hu 
Sent: Thursday, March 4, 2021 1:58 PM
To: solr-user@lucene.apache.org 
Subject: new tlog files are not created per commit but adding into latest 
existing tlog file after replica reload

Hi experts:

Need some help and suggestion about an issue I am facing

Solr info:
 - Solr 8.7
 - Solr cloud with tlog replica; replica size is 3 for my Solr collection

Issue:
 - before issuing the collection reload, I observed that a new tlog file is created
after every commit, and those tlog files are deleted after a while (maybe
after the index is merged?)
 - then I issued a collection reload using the collections API on my collection at
20:15
 - after the leader replica was reloaded, no new tlog files are created; instead the
latest tlog file keeps growing, and no tlog file is deleted after the reload. The
"files under tlog directory" section below is a snapshot of the tlog files
under the tlog directory of the leader replica. Again, I issued the collection
reload at 20:15, and after that tlog.883 keeps growing
 - I looked into the log file and found the error entries shown below under the
"log entries" section; the entry repeats continuously for every auto commit
after the reload. I hope this log entry can provide some information about the
issue.

Please advise what I may be doing incorrectly. Or, if this is a known issue,
is there a way I can fix or work around it?

Thank you so much!

--Michael Hu

--- beginning for files under tlog directory ---

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991713 Mar  4 20:29 tlog.883

--- end for files under tlog directory ---

--- beginning for log entries ---

2021-03-04 20:15:38.251 ERROR (commitScheduler-4327-thread-1) [c:mycollection
s:myshard r:core_node10 x:mycolletion_myshard_replica_t7] o.a.s.u.CommitTracker
auto commit error...:
org.apache.solr.common.SolrException: java.nio.channels.ClosedChannelException
        at org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:503)
        at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:835)
        at org.apache.solr.update.UpdateLog.preCommit(UpdateLog.java:819)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:673)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:273)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)

Re: Programmatic Basic Auth on CloudSolrClient

2021-03-04 Thread Tomás Fernández Löbbe
Ah, right, now I remember that something like this was possible with the
"http1" version of the clients, which is why I created the Jira issues for
the http2 ones. Maybe you can even skip the "LBHttpSolrClient" step; I
believe you can just pass the HttpClient to the CloudSolrClient. You will
have to make sure to close all the clients that are created externally
once you are done, since the Solr client won't close them in this case.

On Thu, Mar 4, 2021 at 1:22 PM Mark H. Wood  wrote:

> On Wed, Mar 03, 2021 at 10:34:50AM -0800, Tomás Fernández Löbbe wrote:
> > As far as I know the current OOTB options are system properties or
> > per-request (which would allow you to use different credentials per collection, but
> > probably not ideal if you do different types of requests from different
> > parts of your code). A workaround (which I've used in the past) is to
> have
> > a custom client that overrides and sets the credentials in the "request"
> > method (you can put whatever logic there to identify which credentials to
> > use). I recently created
> https://issues.apache.org/jira/browse/SOLR-15154
> > and https://issues.apache.org/jira/browse/SOLR-15155 to try to address
> this
> > issue in future releases.
>
> I have not tried it, but could you not:
>
> 1. set up an HttpClient with an appropriate CredentialsProvider;
> 2. pass it to HttpSolrClient.Builder.withHttpClient();
> 3. pass that Builder to
> LBHttpSolrClient.Builder.withHttpSolrClientBuilder();
> 4. pass *that* Builder to
> CloudSolrClient.Builder.withLBHttpSolrClientBuilder();
>
> Now you have control of the CredentialsProvider and can have it return
> whatever credentials you wish, so long as you still have a reference
> to it.
>
> > On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das 
> wrote:
> >
> > >
> > > Hi There,
> > >
> > > Is there any way to programmatically set basic authentication
> > > credentials on CloudSolrClient?
> > >
> > > The only documentation available is to use a system property. This is not
> > > useful if two collections require two separate sets of credentials and
> > > they are accessed in parallel.
> > > Thanks in advance.
> > >
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>


new tlog files are not created per commit but adding into latest existing tlog file after replica reload

2021-03-04 Thread Michael Hu
Hi experts:

Need some help and suggestion about an issue I am facing

Solr info:
 - Solr 8.7
 - Solr cloud with tlog replica; replica size is 3 for my Solr collection

Issue:
 - before issuing the collection reload, I observed that a new tlog file is created
after every commit, and those tlog files are deleted after a while (maybe
after the index is merged?)
 - then I issued a collection reload using the collections API on my collection at
20:15
 - after the leader replica was reloaded, no new tlog files are created; instead the
latest tlog file keeps growing, and no tlog file is deleted after the reload. The
"files under tlog directory" section below is a snapshot of the tlog files
under the tlog directory of the leader replica. Again, I issued the collection
reload at 20:15, and after that tlog.883 keeps growing
 - I looked into the log file and found the error entries shown below under the
"log entries" section; the entry repeats continuously for every auto commit
after the reload. I hope this log entry can provide some information about the
issue.

Please advise what I may be doing incorrectly. Or, if this is a known issue,
is there a way I can fix or work around it?

Thank you so much!

--Michael Hu

--- beginning for files under tlog directory ---

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991713 Mar  4 20:29 tlog.883

--- end for files under tlog directory ---

--- beginning for log entries ---

2021-03-04 20:15:38.251 ERROR (commitScheduler-4327-thread-1) [c:mycollection
s:myshard r:core_node10 x:mycolletion_myshard_replica_t7] o.a.s.u.CommitTracker
auto commit error...:
org.apache.solr.common.SolrException: java.nio.channels.ClosedChannelException
        at org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:503)
        at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:835)
        at org.apache.solr.update.UpdateLog.preCommit(UpdateLog.java:819)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:673)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:273)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.nio.channels.ClosedChannelException
        at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
        at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:266)
        at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
        at java.base/java.nio.channels.Channels.writeFully(Channels.java:97)
        at java.base/java.nio.channels.Channels$1.write(Channels.java:172)
        at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:216)
        at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:209)
        at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:193)
        at org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:498)
        ... 10 more

--- end for log entries ---



Re: Programmatic Basic Auth on CloudSolrClient

2021-03-04 Thread Mark H. Wood
On Wed, Mar 03, 2021 at 10:34:50AM -0800, Tomás Fernández Löbbe wrote:
> As far as I know the current OOTB options are system properties or
> per-request (which would allow you to use different per collection, but
> probably not ideal if you do different types of requests from different
> parts of your code). A workaround (which I've used in the past) is to have
> a custom client that overrides and sets the credentials in the "request"
> method (you can put whatever logic there to identify which credentials to
> use). I recently created https://issues.apache.org/jira/browse/SOLR-15154
> and https://issues.apache.org/jira/browse/SOLR-15155 to try to address this
> issue in future releases.

I have not tried it, but could you not:

1. set up an HttpClient with an appropriate CredentialsProvider;
2. pass it to HttpSolrClient.Builder.withHttpClient();
3. pass that Builder to LBHttpSolrClient.Builder.withHttpSolrClientBuilder();
4. pass *that* Builder to CloudSolrClient.Builder.withLBHttpSolrClientBuilder();

Now you have control of the CredentialsProvider and can have it return
whatever credentials you wish, so long as you still have a reference
to it.
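
For reference, a minimal, untested sketch of that builder chain with the
http1 clients and Apache HttpClient 4.x (ZooKeeper address and credentials
are placeholders):

import java.util.Collections;
import java.util.Optional;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.impl.LBHttpSolrClient;

public class BuilderChainAuth {
  public static void main(String[] args) throws Exception {
    // 1. HttpClient backed by a CredentialsProvider you keep a reference to
    CredentialsProvider provider = new BasicCredentialsProvider();
    provider.setCredentials(AuthScope.ANY,
        new UsernamePasswordCredentials("user", "pass"));

    // 2. HttpSolrClient.Builder wrapping that HttpClient
    HttpSolrClient.Builder httpBuilder = new HttpSolrClient.Builder()
        .withHttpClient(HttpClientBuilder.create()
            .setDefaultCredentialsProvider(provider)
            .build());

    // 3. LBHttpSolrClient.Builder wrapping the HttpSolrClient.Builder
    LBHttpSolrClient.Builder lbBuilder = new LBHttpSolrClient.Builder()
        .withHttpSolrClientBuilder(httpBuilder);

    // 4. CloudSolrClient.Builder wrapping the LBHttpSolrClient.Builder
    try (CloudSolrClient client = new CloudSolrClient.Builder(
            Collections.singletonList("zkhost:2181"), Optional.empty())
        .withLBHttpSolrClientBuilder(lbBuilder)
        .build()) {
      // The provider reference above can swap credentials at runtime
      client.connect();
    }
  }
}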

> On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das  wrote:
> 
> >
> > Hi There,
> >
> > Is there any way to programmatically set basic authentication credentials
> > on CloudSolrClient?
> >
> > The only documentation available is to use a system property. This is not
> > useful if two collections require two separate sets of credentials and they
> > are accessed in parallel.
> > Thanks in advance.
> >

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




graph traversal filter which uses document value in the query

2021-03-04 Thread Lee Carroll
Hi All,
I'm using the graph query parser to traverse a set of edge documents. An
edge looks like

"id":"edge1", "recordType":"journey", "Date":"2021-03-04T00:00:00Z", "Origin
":"AAC", "OriginLocalDateTime":"2021-03-04T05:00:00Z", "Destination":"AAB",
"DestinationLocalDateTime":"2021-03-04T07:00:00Z"

I'd like to collect the journeys needed to travel from an origin city to a
destination city in a single hop (a-b-c) where all journeys are made on the
same day. I'm using a traversal filter to achieve the same-day
criterion, but the function field parameter, which I'm expecting to return the
document's date value, is being ignored.
For example a query to get all journeys from AAA to AAB is:

q={!graph
   maxDepth=1
   from=Origin
   to=Destination
traversalFilter='Date:{!func}Date'
} Origin:AAA  & fq= DestinationAirportCode:AAB || originAirportCode:AAA

What is the correct approach for this problem?

Cheers Lee C


Re: Get first value in a multivalued field

2021-03-04 Thread Walter Underwood
You can copy the field to another field, then use the 
FirstFieldValueUpdateProcessorFactory to limit that field to the first value. 
At least, that seems to be what that URP does. I have not used it.

https://solr.apache.org/guide/8_8/update-request-processors.html

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 4, 2021, at 11:42 AM, ufuk yılmaz  wrote:
> 
> Hi,
> 
> Is it possible in any way to get the first value in a multivalued field? 
> Using function queries, streaming expressions or any other way without 
> reindexing? (Stream decorators have array(), but no way to get a value at a 
> specific index?)
> 
> Another one, is it possible to match a regex to a text field and extract only 
> the matching part?
> 
> I tried very hard for this too but couldn’t find a way.
> 
> --ufuk
> 
> Sent from Mail for Windows 10
> 



Get first value in a multivalued field

2021-03-04 Thread ufuk yılmaz
Hi,

Is it possible in any way to get the first value in a multivalued field? Using 
function queries, streaming expressions or any other way without reindexing? 
(Stream decorators have array(), but no way to get a value at a specific index?)

Another one, is it possible to match a regex to a text field and extract only 
the matching part?

I tried very hard for this too but couldn’t find a way.

--ufuk

Sent from Mail for Windows 10



Re: wordpress anyone?

2021-03-04 Thread dmitri maziuk

On 2021-03-03 10:24 PM, Gora Mohanty wrote:

... there does seem to be another plugin that is
open-source, and hosted on Github: https://wordpress.org/plugins/solr-power/


I saw it, they lost me at

"you'll need access to a functioning Solr 3.6 instance for the plugin to 
work as expected. This plugin does not support other versions of Solr."


Dima



Re: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

2021-03-04 Thread Ere Maijala

Hi,

Solr uses JIRA for issue tickets. You can find it here: 
https://issues.apache.org/jira/browse/SOLR


I'd suggest filing a new bug issue in the SOLR project (note that 
several other projects also use this JIRA installation). Here's an 
example of an existing highlighter issue for reference: 
https://issues.apache.org/jira/browse/SOLR-14019.


See also some brief documentation:

https://cwiki.apache.org/confluence/display/solr/HowToContribute#HowToContribute-JIRAtips(ourissue/bugtracker)

Regards,
Ere

Flowerday, Matthew J kirjoitti 1.3.2021 klo 14.58:

Hi Ere

Pleased to be of service!

No, I have not filed a JIRA ticket. I am new to interacting with the Solr
Community and only beginning to 'find my legs'. I am not too sure what JIRA
is, I am afraid!

Regards

Matthew

Matthew Flowerday | Consultant | ULEAF
Unisys | 01908 774830| matthew.flower...@unisys.com
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX



THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.



-Original Message-
From: Ere Maijala 
Sent: 01 March 2021 12:53
To: solr-user@lucene.apache.org
Subject: Re: Potential Slow searching for unified highlighting on Solr
8.8.0/8.8.1

EXTERNAL EMAIL - Be cautious of all links and attachments.

Hi,

Whoa, thanks for the heads-up! You may just have saved me from a whole lot
of trouble. Did you file a JIRA ticket already?

Thanks,
Ere

Flowerday, Matthew J kirjoitti 1.3.2021 klo 14.00:

Hi There

I just came across a situation where a unified highlighting search
under solr 8.8.0/8.8.1 can take over 20 mins to run and eventually times

out.

I resolved it by a config change – but it can catch you out. Hence
this email.

With solr 8.8.0 a new unified highlighting parameter, hl.fragAlignRatio,
was implemented, which if not set defaults to 0.5.
This attempts to improve the highlighting so that highlighted text
does not appear right at the left. This works well, but if you have a
search result with numerous occurrences of the word in question within
the record, performance goes right down!

2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf]
o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select
params={hl.snippets=2=test=on=100=id,d
escription,specification,score=20=*=10&_=161440511913
4}
hits=57008 status=0 QTime=1414320

2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf]
o.a.s.s.HttpSolrCall Unable to write response, client closed
connection or we are shutting down =>
org.eclipse.jetty.io.EofException

at
org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)

org.eclipse.jetty.io.EofException: null

at
org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

at
org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

at
org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378)
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

when I set hl.fragAlignRatio=0.25, results came back much quicker

2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes]
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
params={hl.weightMatches=false=on=id,description,specification,s
core=1=0.25=100=2=test
axAnalyzedChars=100=*=unified=9&_=
1614430061690}
hits=136939 status=0 QTime=87024

And hl.fragAlignRatio=0.1

2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes]
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
params={hl.weightMatches=false=on=id,description,specification,s
core=1=0.1=100=2=test
xAnalyzedChars=100=*=unified=9&_=1
614430061690}
hits=136939 status=0 QTime=69033

And hl.fragAlignRatio=0.0

2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes]
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
params={hl.weightMatches=false=on=id,description,specification,s
core=1=0.0=100=2=test
xAnalyzedChars=100=*=unified=9&_=1
614430061690}
hits=136939 status=0 QTime=2841

I left our setting at 0.0 – this is presumably how it was in 7.7.1 (fully
left aligned). I am not too sure as to how many times a word has to
occur in a record for performance to go right down – but if there are too many
it can have a BIG impact.

I also noticed that setting timeAllowed=9 did not break out of
the query until it finished. Perhaps that is because the query itself finished
quickly and what took the time was the highlighting. It might be an
idea to get timeAllowed to also cover any highlighting, so that the
query does not run until the jetty timeout is hit. The machine ran one core at
100% for about 20 mins!

Hope this helps.

Regards

Matthew

*Matthew Flowerday*| Consultant | ULEAF

Unisys | 01908 774830| matthew.flower...@unisys.com


Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes |

Graph query from A to X[n] when number of hops is not known

2021-03-03 Thread Sravani Kambhampati
Hi,

How do I graph query from A to X when the number of hops is not known, but the
graph query for each hop remains the same?


For example:
If my graph looks like this,
id:A -> pk:A1 -> tgt:A2
id:B -> pk:B1 -> tgt:B2
...
id:X

To get from A to B,

  1.  We query A to A2 using (id->pk) + (pk -> tgt) {!graph from=tgt 
to=pk}{!graph from=pk to=id}id:A
  2.  Then from A2 to B using (tgt -> id) {!graph from=id to=tgt}


To get from A to C, steps 1 and 2 will be repeated:
{!graph from=id to=tgt}{!graph from=tgt to=pk}{!graph from=pk to=id}{!graph 
from=id to=tgt}{!graph from=tgt to=pk}{!graph from=pk to=id}id:A

Likewise, given a start node A, is it possible to query for X when the number of
hops is unknown but the query is the same for every hop?

Thanks,
Sravani


Re: wordpress anyone?

2021-03-03 Thread Gora Mohanty
On Thu, 4 Mar 2021 at 01:50, dmitri maziuk  wrote:

> Hi all,
>
> does anyone use Solr with WP? It seems there is one for-pay-only
> offering and a few defunct projects from a decade ago... a great web
> search engine is particularly useful if it can actually be used in a
> client.
>
> So has anyone heard about any active WP integration projects other than
> wpsolr.com?
>

Haven't had occasion to use Wordpress, and Solr with it, for a while. Since
nobody else has replied, there does seem to be another plugin that is
open-source, and hosted on Github: https://wordpress.org/plugins/solr-power/
. Cannot comment as to how well it works. Alternatively, one could use a
PHP client library like Solarium.

Regards,
Gora


wordpress anyone?

2021-03-03 Thread dmitri maziuk

Hi all,

does anyone use Solr with WP? It seems there is one for-pay-only 
offering and a few defunct projects from a decade ago... a great web 
search engine is particularly useful if it can actually be used in a client.


So has anyone heard about any active WP integration projects other than 
wpsolr.com?


Dima


Solr NRT Replicas Out of Sync

2021-03-03 Thread Anshuman Singh
Hi,

In our Solr 7.4 cluster, we have noticed that some replicas of some of our
Collections are out of sync: the slave replica has a higher number of records
than the leader.
This results in a different number of records on subsequent queries against
the same Collection. Commit is also not helping in this case.

I'm able to replicate the issue using the steps given below:

   1. Create a collection with 1 shard and 2 rf
   2. Ingest 10k records in the collection
   3. Turn down node with replica 2
   4. Ingest 10k records in the collection
   5. Turn down replica 1
   6. Turn up replica 2, wait till it become leader
   7. Ingest 20k records on replica 2
   8. Turn down replica 2
   9. Turn up replica 1, wait till it become leader or use FORCELEADER
   action of Collections API
   10. Turn up replica 2
   11. Now replica 2 has 30k records and replica 1 has 20k records and they
   never sync

I tried the same steps with TLOG replicas and in that case both replicas
had 20k records in the end and were in sync but 10k records were lost.

Is there any way to sync the replicas? I am looking for a lightweight
solution that doesn't require re-creating the index.

Regards,
Anshuman


Parallel SQL Interface and 'qt'

2021-03-03 Thread Jostein Elvaker Haande
Hi,

I've just started to look into the Parallel SQL interface available in
SOLR. I've done some tests across a few collections, and it works fairly
well.

However, I've run into an issue with a few collections where the SQL
interface does not return any data. According to the documentation, it
seems to rely on the default /select handler to look up and
return data. The collections that do not return data in the SQL
interface are all using /select handlers that have some logic that
requires additional parameters to return data.

I know that if you set 'handleSelect' to true in the /select handler,
you can pass the 'qt' parameter to define which handler to use on the fly.
The handler in question is configured to use 'handleSelect'; however, when I
pass the 'qt' parameter to the /sql handler, it does not seem to work.

I've gone through the documentation, however I can't find any information
in this regard. Is it possible to define which handler to use when using
the /sql handler?
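
As an aside, the Parallel SQL documentation describes LIMIT queries as going
through the /select handler and unlimited queries through /export, which
would match the dependence on the default /select behavior described above.
For reference, a minimal sketch of issuing SQL through SolrJ's JDBC driver
(the ZooKeeper address and collection name are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SolrSqlExample {
  public static void main(String[] args) throws Exception {
    // SolrJ's JDBC driver routes SQL to the collection's /sql handler
    Class.forName("org.apache.solr.client.solrj.io.sql.DriverImpl");
    String url = "jdbc:solr://localhost:9983?collection=mycollection";
    try (Connection con = DriverManager.getConnection(url);
         Statement stmt = con.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT id FROM mycollection LIMIT 10")) {
      while (rs.next()) {
        System.out.println(rs.getString("id"));
      }
    }
  }
}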

-- 
Yours sincerely Jostein Elvaker Haande
"A free society is a society where it is safe to be unpopular"
- Adlai Stevenson

https://tolecnal.net  -- tolecnal at tolecnal dot net


RE: Programmatic Basic Auth on CloudSolrClient

2021-03-03 Thread Subhajit Das
Thanks. This would be very helpful.

From: Tomás Fernández Löbbe
Sent: 04 March 2021 12:32 AM
To: solr-user@lucene.apache.org
Subject: Re: Programmatic Basic Auth on CloudSolrClient

Maybe something like this (I omitted a lot of things you'll have to do,
like passing zk or the list of hosts):

static class CustomCloudSolrClient extends CloudSolrClient {

  protected CustomCloudSolrClient(CustomCloudSolrClientBuilder builder) {
super(builder);
  }

  @Override
  public NamedList<Object> request(SolrRequest request, String collection)
      throws SolrServerException, IOException {
// your logic here to figure out which credentials to use...
String user = "user";
String pass = "pass";
request.setBasicAuthCredentials(user, pass);
return super.request(request, collection);
  }
}

static class CustomCloudSolrClientBuilder extends CloudSolrClient.Builder {

  @Override
  public CloudSolrClient build() {
return new CustomCloudSolrClient(this);
  }
}

public static void main(String[] args) {
  CloudSolrClient c = new CustomCloudSolrClientBuilder().build();
  ...
}

Do consider that the "request" method is called per request; make sure whatever
logic you have there is not super expensive.

On Wed, Mar 3, 2021 at 10:48 AM Subhajit Das 
wrote:

> Hi Tomás,
>
> Thanks. Can you please also share a sample of code to configure the client
> with your workaround?
>
> From: Tomás Fernández Löbbe
> Sent: 04 March 2021 12:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Programmatic Basic Auth on CloudSolrClient
>
> As far as I know the current OOTB options are system properties or
> per-request (which would allow you to use different per collection, but
> probably not ideal if you do different types of requests from different
> parts of your code). A workaround (which I've used in the past) is to have
> a custom client that overrides and sets the credentials in the "request"
> method (you can put whatever logic there to identify which credentials to
> use). I recently created https://issues.apache.org/jira/browse/SOLR-15154
> and https://issues.apache.org/jira/browse/SOLR-15155 to try to address
> this
> issue in future releases.
>
> On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das 
> wrote:
>
> >
> > Hi There,
> >
> > Is there any way to programmatically set basic authentication credentials
> > on CloudSolrClient?
> >
> > The only documentation available is to use a system property. This is not
> > useful if two collections require two separate sets of credentials and
> > they are accessed in parallel.
> > Thanks in advance.
> >
>
>



Re: Programmatic Basic Auth on CloudSolrClient

2021-03-03 Thread Tomás Fernández Löbbe
Maybe something like this (I omitted a lot of things you'll have to do,
like passing zk or the list of hosts):

static class CustomCloudSolrClient extends CloudSolrClient {

  protected CustomCloudSolrClient(CustomCloudSolrClientBuilder builder) {
super(builder);
  }

  @Override
  public NamedList<Object> request(SolrRequest request, String collection)
      throws SolrServerException, IOException {
// your logic here to figure out which credentials to use...
String user = "user";
String pass = "pass";
request.setBasicAuthCredentials(user, pass);
return super.request(request, collection);
  }
}

static class CustomCloudSolrClientBuilder extends CloudSolrClient.Builder {

  @Override
  public CloudSolrClient build() {
return new CustomCloudSolrClient(this);
  }
}

public static void main(String[] args) {
  CloudSolrClient c = new CustomCloudSolrClientBuilder().build();
  ...
}

Do consider that the "request" method is called per request; make sure whatever
logic you have there is not super expensive.

On Wed, Mar 3, 2021 at 10:48 AM Subhajit Das 
wrote:

> Hi Tomás,
>
> Thanks. Can you please also share a sample of code to configure the client
> with your workaround?
>
> From: Tomás Fernández Löbbe
> Sent: 04 March 2021 12:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Programmatic Basic Auth on CloudSolrClient
>
> As far as I know the current OOTB options are system properties or
> per-request (which would allow you to use different credentials per collection, but
> probably not ideal if you do different types of requests from different
> parts of your code). A workaround (which I've used in the past) is to have
> a custom client that overrides and sets the credentials in the "request"
> method (you can put whatever logic there to identify which credentials to
> use). I recently created https://issues.apache.org/jira/browse/SOLR-15154
> and https://issues.apache.org/jira/browse/SOLR-15155 to try to address
> this
> issue in future releases.
>
> On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das 
> wrote:
>
> >
> > Hi There,
> >
> > Is there any way to programmatically set basic authentication credentials
> > on CloudSolrClient?
> >
> > The only documentation available is to use a system property. This is not
> > useful if two collections require two separate sets of credentials and
> > they are accessed in parallel.
> > Thanks in advance.
> >
>
>


RE: Programmatic Basic Auth on CloudSolrClient

2021-03-03 Thread Subhajit Das
Hi Tomás,

Thanks. Can you please also share a sample of code to configure the client with 
your workaround?

From: Tomás Fernández Löbbe
Sent: 04 March 2021 12:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Programmatic Basic Auth on CloudSolrClient

As far as I know the current OOTB options are system properties or
per-request (which would allow you to use different credentials per collection, but
probably not ideal if you do different types of requests from different
parts of your code). A workaround (which I've used in the past) is to have
a custom client that overrides and sets the credentials in the "request"
method (you can put whatever logic there to identify which credentials to
use). I recently created https://issues.apache.org/jira/browse/SOLR-15154
and https://issues.apache.org/jira/browse/SOLR-15155 to try to address this
issue in future releases.

On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das  wrote:

>
> Hi There,
>
> Is there any way to programmatically set basic authentication credentials
> on CloudSolrClient?
>
> The only documentation available is to use a system property. This is not
> useful if two collections require two separate sets of credentials and they
> are accessed in parallel.
> Thanks in advance.
>



Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-03 Thread Tomás Fernández Löbbe
Patch looks good to me. Since it's a bugfix it can be committed to 8_8
branch and released on the next bugfix release, though I don't think it
should trigger one. In the meantime, if you can patch your environment and
confirm that it fixes your problem, that's a good comment to leave in
SOLR-14758. 

On Mon, Mar 1, 2021 at 3:12 PM Phill Campbell 
wrote:

> Anyone?
>
> > On Feb 24, 2021, at 7:47 AM, Phill Campbell
>  wrote:
> >
> > Last week I switched to Solr 8.7 from a “special” build of Solr 6.6
> >
> > The system has a timeout set for querying. I am now seeing this bug.
> >
> > https://issues.apache.org/jira/browse/SOLR-14758 <
> https://issues.apache.org/jira/browse/SOLR-14758>
> >
> > Max Query Time goes from 1.6 seconds to 20 seconds and affects the
> entire system for about 2 minutes as reported in New Relic.
> >
> > null:java.lang.NullPointerException
> >   at
> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:935)
> >   at
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)
> >   at
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)
> >   at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:486)
> >   at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
> >   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)
> >
> >
> > Can this be fixed in a patch for Solr 8.8? I do not want to have to go
> > back to Solr 6 and reindex the system; that takes 2 days using 180 EMR
> > instances.
> >
> > Please advise. Thank you.
>
>


Amazon Sponsor Product(Ads) Search team is looking for talents with expertise in Solr/Lucene at all levels

2021-03-03 Thread pxiong

Hi folks,

Are you interested in Amazon’s Advertising (which employs state-of-the-art 
solutions involving Search, Digital Advertising, AWS technologies, etc.) 
but wondering which team can best leverage your expertise in 
Solr/Lucene? Please continue to read:


My team is Sponsored Search Delivery (SSD) where we directly use Solr to 
efficiently render relevant Ads for shoppers. We deliver billions of ad 
impressions and millions of clicks daily and are breaking fresh ground 
to create world-class products. Powered with Solr, our systems scan 
billions of records in milliseconds to return relevant ads. The 
architectural excellence is evident from the fact that despite our running 
such a high-throughput, low-latency Tier-1 system, our team enjoys an 
exceptionally low operations burden.
‘information retrieval(IR)’ algorithms and we try to leverage latest 
research in IR to ensure superior shopper experience. It’s a great mix 
of ‘engineering’ and ‘applied-science’ to render relevant ads within 
milliseconds.


My team is making a tremendous investment in Solr research, development 
and application this year and in the future. For example, we plan to apply 
vector search in Solr (SOLR-12890) to extend power of current 
keyword-based search to improve ads recall. We are experimenting 
approximate nearest vector search (LUCENE-9004) to achieve accuracy at a 
reasonable cost. We also plan to leverage rich features of Solr to 
return ads based on 'themes' (viz. brand-similar, economical, 4-star 
above etc.). We also plan to share the experience and lessons to 
Solr/Lucene community and find ways to contribute back to the community.


With a broad mandate to experiment and innovate, we are growing at an 
unprecedented rate together with the Amazon Ads business. There is a 
seemingly endless range of new opportunities ahead of us. We’re looking 
for amazing minds for various positions on the team in our New York 
location (the primary goal is to expand the team in NYC; however, Boulder, 
Seattle, and Toronto can also be considered on a case-by-case basis). 
Prior experience in Solr/Lucene would be a great value add.
SDM: www.amazon.jobs/jobs/1427445?no_int_redir=1 

SDE2: www.amazon.jobs/jobs/1451662?no_int_redir=1 

SDE3:www.amazon.jobs/jobs/1357591?no_int_redir=1 

ASII: www.amazon.jobs/jobs/1427080?no_int_redir=1 

ASIII: www.amazon.jobs/jobs/1379670?no_int_redir=1 

TPM: www.amazon.jobs/jobs/1353298?no_int_redir=1 



If you are interested, please reply to this email directly or reach out to 
the hiring manager, Vikas Dhaka (dhavi...@amazon.com). Thanks!


Best

Pengcheng Xiong

(Apache Hive PMC member and committer working at Amazon)




Re: Programmatic Basic Auth on CloudSolrClient

2021-03-03 Thread Tomás Fernández Löbbe
As far as I know the current OOTB options are system properties or
per-request (which would allow you to use different per collection, but
probably not ideal if you do different types of requests from different
parts of your code). A workaround (which I've used in the past) is to have
a custom client that overrides and sets the credentials in the "request"
method (you can put whatever logic there to identify which credentials to
use). I recently created https://issues.apache.org/jira/browse/SOLR-15154
and https://issues.apache.org/jira/browse/SOLR-15155 to try to address this
issue in future releases.
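
A rough sketch of that workaround, in case it helps (untested here; the class
name and credentials map are made up, and the Builder wiring is simplified):

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.Optional;

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.util.NamedList;

// A CloudSolrClient that injects per-collection Basic Auth credentials
// into every request it sends.
public class PerCollectionAuthCloudSolrClient extends CloudSolrClient {

    private final Map<String, String[]> credsByCollection; // name -> {user, pass}

    public PerCollectionAuthCloudSolrClient(List<String> zkHosts,
                                            Map<String, String[]> credsByCollection) {
        super(new Builder(zkHosts, Optional.empty()));
        this.credsByCollection = credsByCollection;
    }

    @SuppressWarnings("rawtypes")
    @Override
    public NamedList<Object> request(SolrRequest req, String collection)
            throws SolrServerException, IOException {
        String[] userPass = credsByCollection.get(collection);
        if (userPass != null) {
            // SolrRequest carries per-request Basic Auth credentials
            req.setBasicAuthCredentials(userPass[0], userPass[1]);
        }
        return super.request(req, collection);
    }
}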

On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das  wrote:

>
> Hi There,
>
> Is there any way to programmatically set basic authentication credentials
> on CloudSolrClient?
>
> The only documentation available is to use system properties. This is not
> useful if two collections require two separate sets of credentials and they
> are accessed in parallel.
> Thanks in advance.
>


Programmatic Basic Auth on CloudSolrClient

2021-03-03 Thread Subhajit Das

Hi There,

Is there any way to programmatically set basic authentication credentials on 
CloudSolrClient?

The only documentation available is to use system properties. This is not useful 
if two collections require two separate sets of credentials and they are 
accessed in parallel.
Thanks in advance.


Solr backup no longer writes to a UNC path

2021-03-03 Thread Gell-Holleron, Daniel
Hi there,

We've upgraded from Solr 7.7.1 to Solr 8.8.1 (running on Windows Operating 
System) and I've noticed that when running a Solr backup, it will no longer 
allow me to write to a UNC path. Is this something that has been purposely 
changed?

I've noticed a new system property called SOLR_OPTS="%SOLR_OPTS% 
-Dsolr.allowPath=S:\" which I've enabled. Is there a way to point this to a 
remote server, rather than a different drive on the local server?
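
For example, would something along these lines be expected to work for a
remote share (hypothetical server and share name; multiple paths are
comma-separated)?

REM in solr.in.cmd
set SOLR_OPTS=%SOLR_OPTS% -Dsolr.allowPaths=\\backupserver\solr-backups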

Thanks,

Daniel



Increase in response time in case of collapse queries.

2021-03-03 Thread Parshant Kumar
Hi all,

We have implemented collapse queries in place of grouped queries on our
production Solr. As mentioned in the Solr documentation, collapse queries are
recommended over grouped queries in terms of performance. But after switching
from grouped queries to collapse queries, the response time of our queries has
increased. This is unexpected behaviour; the response time should have
improved, but the result is the opposite.
Could someone help explain why response time has increased for collapse queries?
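
For reference, the two styles we are comparing look roughly like this
(hypothetical field name):

# result grouping
q=*:*&group=true&group.field=group_s
# collapsing query parser
q=*:*&fq={!collapse field=group_s}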

Thanks
Parshant Kumar

-- 



How can I get dynamicField in Solr 6.1.0.

2021-03-03 Thread vishal patel
I am using Solr 6.1.0. We have 2 shards and each has one replica.

My schema field is below in one collection



My data like

FORM1065678510875540
2021-03-03T23:59:59Z
false
false

£


No





fdsfsfsfsf


Yes


2021-03-03


yes





2021-03-01T06:55:29


12:00 PM


no


12:00 PM

-1
-1


I want to get only the fields 
/myFields/FORM_CUSTOM_FIELDS/Bid_Opportunity/Start_Date, 
/myFields/FORM_CUSTOM_FIELDS/Bid_Opportunity/TenderEndDatePassed, and 
/myFields/FORM_CUSTOM_FIELDS/Bid_Opportunity/Tender_Review_Time. How can I get 
only the above fields? What do I need to pass in fl?
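
I am guessing fl accepts comma-separated field names and glob patterns, so
maybe something like the following (untested)?

fl=id,*Start_Date,*TenderEndDatePassed,*Tender_Review_Time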

Regards,
Vishal


FW: Graph traversal when nodes are indirectly connected with references

2021-03-03 Thread Sravani Kambhampati
I have a graph with disjoint sets of nodes connected indirectly with a 
reference, as shown below. Given an id, is it possible to get the leaf node 
when the depth is unknown?

[
{ id: A, child: { ref: B } },
{ id: B, child: { ref: C } },
{ id: C, child: { ref: D } },
.
.
{ id: Y, child: { ref: Z } }
]
Thanks,
Sravani


RE: Filter by sibling ?

2021-03-02 Thread Manoj Mokashi
I tried passing a parent parser connected to the {!child} parser using query 
params, and it seems to work!

q=type:C1 AND {!child of='type:PR' v=$statusqry}
statusqry={!parent which='type:PR' }type:C2

Note that my real query is not exactly this, so I haven't tried the exact 
expression above

-Original Message-
From: Manoj Mokashi 
Sent: Wednesday, March 3, 2021 9:56 AM
To: solr-user@lucene.apache.org
Subject: RE: Filter by sibling ?

Ok. Will check. thanks !

-Original Message-
From: Joel Bernstein 
Sent: Tuesday, March 2, 2021 8:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Filter by sibling ?

Solr's graph expressions can do this type of thing. It allows you to walk the 
relationships in a graph with filters:

https://lucene.apache.org/solr/guide/8_6/graph-traversal.html



Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Mar 2, 2021 at 9:00 AM Manoj Mokashi 
wrote:

> Hi,
>
> If I have a nested document structure, with say parent type:PR, child
> 1
> type:C1 and child2 type:C2,
> would it possible to fetch documents of type C1  that are children of
> parents that have child2 docs with a certain condition ?
> e.g. for
> { type:PR,
>   Title: "XXX",
>   Children1 : [ { type:C1, city:ABC} ],
>   Children2 : [ { type:C2, status:Done}] }
>
> Can I fetch type:C1 documents which are children of parent docs that
> have child C2 docs with status:Done ?
>
> Regards,
> manoj
>
> Confidentiality Notice
> 
> This email message, including any attachments, is for the sole use of
> the intended recipient and may contain confidential and privileged 
> information.
> Any unauthorized view, use, disclosure or distribution is prohibited.
> If you are not the intended recipient, please contact the sender by
> reply email and destroy all copies of the original message. Anju Software, 
> Inc.
> 4500 S. Lakeshore Drive, Suite 620, Tempe, AZ USA 85282.
>


Query response time long for dynamicField in Solr 6.1.0

2021-03-02 Thread vishal patel
I am using Solr 6.1.0. We have 2 shards and each has one replica.

My schema field is below in one collection


When I execute the query below, it takes more than 180 milliseconds every time.
http://10.38.33.24:8983/solr/forms/select?q=project_id:(2117627+2102977+2109667+2102912+2113720+2102976+2102478+2114939+2101443+2123237+2078189+2086596+2079707+2079706+2079705+2079658+2088340+2088338+2113641+2117131+2117672+2120870+2079708+2113718+2096308+2125462+2117837+2115406+2123865+2081232+2080746+2081239+2082706+2098700+2103039+2098699+2082878+2082877+2079994+2113719+2107255+2103251+2100558+2112735+2100036+2100037+2115359+2099330+2112101+2115360+2112070+2125140+2103656+2090184+2090183+2088269+2088270+2115358+2113036+2096855+2098258+2097226+2097225+2113127+2102847+2081187+2082817+2085678+2085677+2100937+2116632+2117133+2121028+2102479+2080006+2117509+2091443+2094716+2109780+2109779+2102735+2102736+2102685+2101923+2103648+2102608+2102480+2103664+2079205+2075380+2079206+2091442+2088614+2088613+2079876+2079875+2082886+2088615+2079429+2079428+2117185+2082859+2082860+2125270+2081301+2117623+2112740+2086757+2086756+2101344+2086597+2086847+2102648+2113362+2109010+2100223+2079877+2082704+2109669+2103649+2100744+2101490+2117526+2117134+2124020+2124021+2123524+2127200+2125039+2103663)=updated+desc,id+desc=0=30==id,form_id,project_id,doctype,dc,form_type_id,status_id,originator_user_id,controller_user_id,form_num,originator_proxy_user_id,originator_user_type_id,controller_user_type_id,msg_id,msg_originator_id,msg_status_id,parent_msg_id,msg_type_id,msg_code,form_code,appType,instance_group_id,bim_model_id,is_draft,InvoiceColourCode,InvoiceCountAgainstOrder,msg_content,msg_content1,msg_content3,user_ref,form_type_name,form_group_name,observationId,locationId,pf_loc_folderId,hasFormAssociation,hasCommentAssociation,hasDocAssociation,hasBimViewAssociation,hasBimListAssociation,originator_org_id,form_closeby_date,form_creation_date,status_change_userId,status_update_date,lastmodified,is_public,title,*Start_Date,*Tender_End_Date,*Tender_End_Time,*Tender_Review_Date,*Tender_Review_Time,*TenderEndDatePassed,*Package_Description,*Budget,*Currency_Sign,*allowExternalVendor,*Enable_form_public_link,*Is_Tender_Public=off=true=http://10.38.33.24:8983/solr/forms,http://10.38.33.227:8983/solr/forms=true=form_id=msg_creation_date+desc=true

When I execute the query below, it takes less than 80 milliseconds every time.

RE: Filter by sibling ?

2021-03-02 Thread Manoj Mokashi
Ok. Will check. thanks !

-Original Message-
From: Joel Bernstein 
Sent: Tuesday, March 2, 2021 8:48 PM
To: solr-user@lucene.apache.org
Subject: Re: Filter by sibling ?

Solr's graph expressions can do this type of thing. It allows you to walk the 
relationships in a graph with filters:

https://lucene.apache.org/solr/guide/8_6/graph-traversal.html



Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Mar 2, 2021 at 9:00 AM Manoj Mokashi 
wrote:

> Hi,
>
> If I have a nested document structure, with say parent type:PR, child
> 1
> type:C1 and child2 type:C2,
> would it possible to fetch documents of type C1  that are children of
> parents that have child2 docs with a certain condition ?
> e.g. for
> { type:PR,
>   Title: "XXX",
>   Children1 : [ { type:C1, city:ABC} ],
>   Children2 : [ { type:C2, status:Done}] }
>
> Can I fetch type:C1 documents which are children of parent docs that
> have child C2 docs with status:Done ?
>
> Regards,
> manoj
>
> Confidentiality Notice
> 
> This email message, including any attachments, is for the sole use of
> the intended recipient and may contain confidential and privileged 
> information.
> Any unauthorized view, use, disclosure or distribution is prohibited.
> If you are not the intended recipient, please contact the sender by
> reply email and destroy all copies of the original message. Anju Software, 
> Inc.
> 4500 S. Lakeshore Drive, Suite 620, Tempe, AZ USA 85282.
>


Re: Caffeine Cache Metrics Broken?

2021-03-02 Thread Shawn Heisey

On 3/2/2021 3:47 PM, Stephen Lewis Bianamara wrote:

I'm investigating a weird behavior I've observed in the admin page for
caffeine cache metrics. It looks to me like on the older caches, warm-up
queries were not counted toward hit/miss ratios, which of course makes
sense, but on Caffeine cache it looks like they are. I'm using solr 8.3.

Obviously this makes measuring its true impact a little tough. Is this by
any chance a known issue and already fixed in later versions?


The earlier cache implementations are entirely native to Solr -- all the 
source code is included in the Solr codebase.


Caffeine is a third-party cache implementation that has been integrated 
into Solr.  Some of the metrics might come directly from Caffeine, not 
Solr code.


I would expect warming queries to be counted on any of the cache 
implementations.  One of the reasons that the warming capability exists 
is to pre-populate the caches before actual queries begin.  If warming 
queries are somehow excluded, then the cache metrics would not be correct.


I looked into the code and did not find anything that would keep warming 
queries from affecting stats.  But it is always possible that I just 
didn't know what to look for.


In the master branch (Solr 9.0), CaffeineCache is currently the only 
implementation available.


Thanks,
Shawn


RE: Idle timeout expired and Early Client Disconnect errors

2021-03-02 Thread ufuk yılmaz
I divided the query into 1000 pieces and removed the parallel stream clause; it 
seems to be working without timeouts so far. If it times out again, I can just 
divide it into even smaller pieces, I guess.

I tried to send all 1000 pieces in a “list” expression to be executed linearly. 
It didn’t work, but I was just curious whether it could handle such a large query.

Now I’m just generating expression strings from Java code and sending them one 
by one. I tried to use SolrJ for this, but encountered a weird problem where 
even the simplest expression (echo) stops working after a few iterations in a 
loop. I’m guessing the underlying HttpClient is not closing connections in a 
timely manner, hitting the OS per-host connection limit. I asked a separate 
question about this. I was following the example on Lucidworks: 
https://lucidworks.com/post/streaming-expressions-in-solrj/

I just modified my code to use regular REST calls using okhttp3. It’s a shame 
that I couldn’t use SolrJ, since it truly streams every result one by one, 
continuously; REST just returns a single large response at the very end of the 
stream.
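
The okhttp3 version is roughly this (a trimmed sketch of what I described;
host, collection, and expression are placeholders):

import okhttp3.HttpUrl;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class StreamViaRest {
    public static void main(String[] args) throws Exception {
        OkHttpClient client = new OkHttpClient();
        HttpUrl url = HttpUrl.parse("http://mySolrHost:8983/solr/WorkerCollection/stream")
                .newBuilder()
                .addQueryParameter("expr", "echo(1)")
                .build();
        Request request = new Request.Builder().url(url).build();
        // the whole JSON body arrives at once, at the end of the stream
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}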

Thanks again for your help.

Sent from Mail for Windows 10

From: Joel Bernstein
Sent: 02 March 2021 00:19
To: solr-user@lucene.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

Also the parallel function builds hash partitioning filters that could lead
to timeouts if they take too long to build. Try the query without the
parallel function if you're still getting timeouts when making the query
smaller.



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein  wrote:

> The settings in your version are 30 seconds and 15 seconds for socket and
> connection timeouts.
>
> Typically timeouts occur because one or more shards in the query are idle
> beyond the timeout threshold. This happens because lots of data is being
> read from other shards.
>
> Breaking the query into small parts would be a good strategy.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz 
> wrote:
>
>> Hello Mr. Bernstein,
>>
>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>> timeouts and they are bound to happen in such a large stream. Should I just
>> reduce the output of my search expressions?
>>
>> Maybe I can split my search results into ~100 parts and run the same
>> query 100 times in series. Each part would emit ~3M documents so they
>> should finish before timeout?
>>
>> Is this a reasonable solution?
>>
>> Btw how long is the default hard-coded timeout value? Because yesterday I
>> ran another query which took more than 1 hour without any timeouts and
>> finished successfully.
>>
>> Sent from Mail for Windows 10
>>
>> From: Joel Bernstein
>> Sent: 01 March 2021 23:03
>> To: solr-user@lucene.apache.org
>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>
>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>
>> https://issues.apache.org/jira/browse/SOLR-14672
>>
>> This unfortunately missed the 8.8 release and will be 8.9.
>>
>>
>>
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein  wrote:
>>
>> > What version are you using?
>> >
>> > Solr 8.7 has changes that caused these errors to hit the logs. These
>> used
>> > to be suppressed. This has been fixed in Solr 9.0 but it has not been
>> back
>> > ported to Solr 8.x.
>> >
>> > The errors are actually normal operational occurrences when doing joins
>> so
>> > should be suppressed in the logs and were before the specific release.
>> >
>> > It might make sense to do a release that specifically suppresses these
>> > errors without backporting the full Solr 9.0 changes which impact the
>> > memory footprint of export.
>> >
>> >
>> >
>> >
>> > Joel Bernstein
>> > http://joelsolr.blogspot.com/
>> >
>> >
>> > On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz > >
>> > wrote:
>> >
>> >> Hello all,
>> >>
>> >> I’m running a large streaming expression and feeding the result to
>> update
>> >> expression.
>> >>
>> >>  update(targetCollection, ...long running stream here...,
>> >>
>> >> I tried sending the exact same query multiple times, it sometimes works
>> >> and indexes some results, then gives exception, other times fails with
>> an
>> >> exception after 2 minutes.
>> >>
>> >> Response is like:
>> >> "EXCEPTION":"java.util.concurrent.ExecutionException:
>> >> java.io.IOException: params distrib=false=4 and my long
>> >> stream expression
>> >>
>> >> Server log (short):
>> >> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired:
>> 12/12
>> >> ms
>> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired:
>> 12/12
>> >> ms
>> >>
>> >> I tried to increase the 

RE: Default conjunction behaving differently after field type change

2021-03-02 Thread ufuk yılmaz
I changed the tokenizer class from KeywordTokenizerFactory to 
WhitespaceTokenizerFactory for the query analyzer using the Schema API; it 
seems to have solved the problem.
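
Roughly, such a change would look like this (adapting the add-field-type
example elsewhere on this list; a sketch, not verified):

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "replace-field-type":{
     "name":"string_ci",
     "class":"solr.TextField",
     "indexAnalyzer":{
        "tokenizer":{ "class":"solr.KeywordTokenizerFactory" },
        "filters":[ { "class":"solr.LowerCaseFilterFactory" } ]},
     "queryAnalyzer":{
        "tokenizer":{ "class":"solr.WhitespaceTokenizerFactory" },
        "filters":[ { "class":"solr.LowerCaseFilterFactory" } ]}}
}' http://localhost:8983/solr/myCollection/schema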

Sent from Mail for Windows 10

From: ufuk yılmaz
Sent: 02 March 2021 20:47
To: solr-user@lucene.apache.org
Subject: Default conjunction behaving differently after field type change

Hello all,

From the Solr 8.4 (my version) documentation:

“The OR operator is the default conjunction operator. This means that if there 
is no Boolean operator between two terms, the OR operator is used. To search 
for documents that contain either "jakarta apache" or just "jakarta," use the 
query:

"jakarta apache" jakarta

or

"jakarta apache" OR jakarta”


I had a field type=”string” in my old schema:





I could use queries like:
username: (user1 user2 user3)

So it would find the documents of all 3 users (conjunction is OR)
-
Recently I changed the field definition in a new schema:


  
  
  
  




When I search with the same query:

username: (user1 user2 user3)

I get no results unless I change it to either:
username: (user1 OR user2 OR user3) //
username: (“user1” “user2” “user3”)


At first I was thinking the default conjunction operator had changed to AND, 
but now it seems the standard query parser treats user1 user2 user3 as a 
single string containing spaces, I guess?

I couldn’t find how the default “string” field queries are analyzed; what 
difference could cause this behavior?

--ufuk yilmaz



Sent from Mail for Windows 10




RE: Schema API specifying different analysers for query and index

2021-03-02 Thread ufuk yılmaz
It worked! Thanks, Mr. Rafalovitch. I just removed the “type”: “query” keys from 
the JSON, and used indexAnalyzer and queryAnalyzer in place of the analyzer 
node.

Sent from Mail for Windows 10

From: Alexandre Rafalovitch
Sent: 03 March 2021 01:19
To: solr-user
Subject: Re: Schema API specifying different analysers for query and index

RefGuide gives this for Adding, I would hope the Replace would be similar:

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field-type":{
 "name":"myNewTextField",
 "class":"solr.TextField",
 "indexAnalyzer":{
"tokenizer":{
   "class":"solr.PathHierarchyTokenizerFactory",
   "delimiter":"/" }},
 "queryAnalyzer":{
"tokenizer":{
   "class":"solr.KeywordTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema

So, indexAnalyzer/queryAnalyzer, rather than array:
https://lucene.apache.org/solr/guide/8_8/schema-api.html#add-a-new-field-type

Hope this works,
Alex.
P.s. Also check whether you are using matching API and V1/V2 end point.

On Tue, 2 Mar 2021 at 15:25, ufuk yılmaz  wrote:
>
> Hello,
>
> I’m trying to change a field’s query analysers. The following works but it 
> replaces both index and query type analysers:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzer": {
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }
> }
> }
>
> I tried to change analyzer field to analyzers, to specify different analysers 
> for query and index, but it gave error:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzers": [{
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> },{
> "type": "index",
> "tokenizer": {
> "class": "solr.KeywordTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }]
> }
> }
>
> "errorMessages":["Plugin init failure for [schema.xml]
> "msg":"error processing commands",...
>
> How can I specify different analyzers for query and index type when using 
> schema api?
>
> Sent from Mail for Windows 10
>



Running Simple Streaming expressions in a loop through SolrJ stops with read timeout after a few iterations

2021-03-02 Thread ufuk yılmaz
I’m using the following example on Lucidworks to use streaming expressions from 
SolrJ:

https://lucidworks.com/post/streaming-expressions-in-solrj/

Problem is, when I run it inside a for loop, even the simplest expression 
(echo) stops executing after about 5 iterations. I thought the underlying 
HttpClient was not closing the tcp connection to the solr host, and after 4-5 
iterations it reaches the max connections per host limit of the OS (mine is 
windows 10) and stops working.

But then I tried to manually supply a SolrClientCache with a custom configured 
HttpClient, debugged and saw my custom HttpClient is being utilized by the 
stream, but whatever I tried it didn’t change the outcome.

Do you have any idea about this problem? Am I on the right track about 
HttpClient not closing-reusing a connection after an expression is finished? Or 
is there another issue?

I also tried this with different expressions but result didn’t change.

I created a gist to share my code here: https://git.io/Jqevp
but I’m pasting a shortened version here to read without going there: 

-
String workerUrl = "http://mySolrHost:8983/solr/WorkerCollection;;

String expr = "echo(x)";

for (int i = 0; i < 20; i++) {

TupleStream tplStream = null;

ModifiableSolrParams modifiableSolrParams =
new ModifiableSolrParams()
.set("expr", expr.replaceAll("x", Integer.toString(i)))
.set("preferLocalShards", true)
.set("qt", "/stream");

TupleStream tplStream = new SolrStream(workerUrl, modifiableSolrParams);

tplStream.setStreamContext(new StreamContext());

tplStream.open();

Tuple tuple;
tuple = tplStream.read();
System.out.println(tuple.fields);

tplStream.close();
}
-

Sent from Mail for Windows 10



Caffeine Cache Metrics Broken?

2021-03-02 Thread Stephen Lewis Bianamara
Hi SOLR Community,

I'm investigating a weird behavior I've observed in the admin page for
caffeine cache metrics. It looks to me like on the older caches, warm-up
queries were not counted toward hit/miss ratios, which of course makes
sense, but on Caffeine cache it looks like they are. I'm using solr 8.3.

Obviously this makes measuring its true impact a little tough. Is this by
any chance a known issue and already fixed in later versions?

Thanks!
Stephen


Re: Schema API specifying different analysers for query and index

2021-03-02 Thread Alexandre Rafalovitch
RefGuide gives this for Adding, I would hope the Replace would be similar:

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field-type":{
 "name":"myNewTextField",
 "class":"solr.TextField",
 "indexAnalyzer":{
"tokenizer":{
   "class":"solr.PathHierarchyTokenizerFactory",
   "delimiter":"/" }},
 "queryAnalyzer":{
"tokenizer":{
   "class":"solr.KeywordTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema

So, indexAnalyzer/queryAnalyzer, rather than array:
https://lucene.apache.org/solr/guide/8_8/schema-api.html#add-a-new-field-type

Hope this works,
Alex.
P.s. Also check whether you are using matching API and V1/V2 end point.

On Tue, 2 Mar 2021 at 15:25, ufuk yılmaz  wrote:
>
> Hello,
>
> I’m trying to change a field’s query analysers. The following works but it 
> replaces both index and query type analysers:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzer": {
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }
> }
> }
>
> I tried to change analyzer field to analyzers, to specify different analysers 
> for query and index, but it gave error:
>
> {
> "replace-field-type": {
> "name": "string_ci",
> "class": "solr.TextField",
> "sortMissingLast": true,
> "omitNorms": true,
> "stored": true,
> "docValues": false,
> "analyzers": [{
> "type": "query",
> "tokenizer": {
> "class": "solr.StandardTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> },{
> "type": "index",
> "tokenizer": {
> "class": "solr.KeywordTokenizerFactory"
> },
> "filters": [
> {
> "class": "solr.LowerCaseFilterFactory"
> }
> ]
> }]
> }
> }
>
> "errorMessages":["Plugin init failure for [schema.xml]
> "msg":"error processing commands",...
>
> How can I specify different analyzers for query and index type when using 
> schema api?
>
> Sent from Mail for Windows 10
>


Re: Location of Solr 9 Branch

2021-03-02 Thread Houston Putman
Solr 9 is an unreleased major version, so it lives in *master*. Once the
release process starts for Solr 9, it will live at *branch_9x*, and *master*
will host Solr 10.

On Tue, Mar 2, 2021 at 3:49 PM Phill Campbell 
wrote:

> I have just begun investigating Solr source code. Where is the branch for
> Solr 9?
>
>
>


Location of Solr 9 Branch

2021-03-02 Thread Phill Campbell
I have just begun investigating Solr source code. Where is the branch for Solr 
9?




Schema API specifying different analysers for query and index

2021-03-02 Thread ufuk yılmaz
Hello,

I’m trying to change a field’s query analysers. The following works but it 
replaces both index and query type analysers:

{
"replace-field-type": {
"name": "string_ci",
"class": "solr.TextField",
"sortMissingLast": true,
"omitNorms": true,
"stored": true,
"docValues": false,
"analyzer": {
"type": "query",
"tokenizer": {
"class": "solr.StandardTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
}
]
}
}
}

I tried to change analyzer field to analyzers, to specify different analysers 
for query and index, but it gave error:

{
"replace-field-type": {
"name": "string_ci",
"class": "solr.TextField",
"sortMissingLast": true,
"omitNorms": true,
"stored": true,
"docValues": false,
"analyzers": [{
"type": "query",
"tokenizer": {
"class": "solr.StandardTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
}
]
},{
"type": "index",
"tokenizer": {
"class": "solr.KeywordTokenizerFactory"
},
"filters": [
{
"class": "solr.LowerCaseFilterFactory"
}
]
}]
}
}

"errorMessages":["Plugin init failure for [schema.xml]
"msg":"error processing commands",...

How can I specify different analyzers for query and index type when using 
schema api?

Sent from Mail for Windows 10



Default conjunction behaving differently after field type change

2021-03-02 Thread ufuk yılmaz
Hello all,

From the Solr 8.4 (my version) documentation:

“The OR operator is the default conjunction operator. This means that if there 
is no Boolean operator between two terms, the OR operator is used. To search 
for documents that contain either "jakarta apache" or just "jakarta," use the 
query:

"jakarta apache" jakarta

or

"jakarta apache" OR jakarta”


I had a field type=”string” in my old schema:





I could use queries like:
username: (user1 user2 user3)

So it would find the documents of all 3 users (conjunction is OR)
-
Recently I changed the field definition in a new schema:


  
  
  
  




When I search with the same query:

username: (user1 user2 user3)

I get no results unless I change it to either:
username: (user1 OR user2 OR user3) //
username: (“user1” “user2” “user3”)


At first I was thinking the default conjunction operator had changed to AND, 
but now it seems the standard query parser treats user1 user2 user3 as a 
single string containing spaces, I guess?

I couldn’t find how the default “string” field queries are analyzed; what 
difference could cause this behavior?

--ufuk yilmaz



Sent from Mail for Windows 10



Possible bug with AnalyzingInfixLookupFactory, FileDictionaryFactory and Context Filtering

2021-03-02 Thread Joaquim de Souza
Hi all,

I asked a question on StackOverflow about a problem I was having with the
suggester module, but since then I have looked into the source code of Solr,
and I think it is a bug.

Essentially, context filtering is being applied to a suggester that is
backed by a FileDictionaryFactory. According to the docs, this should not
happen, and context filters should be ignored.

This is my config:


  
location
AnalyzingInfixLookupFactory
FileDictionaryFactory
tdwg.txt
text_general
false
  

  
common-name
AnalyzingInfixLookupFactory
DocumentDictionaryFactory
region.vernacular_names_t
common_name_suggest
searchable.context_ss
text_general
false
  


I have tested this on the latest version of Solr (8.8.1).

The relevant bit of source code is here.
I would expect suggestions to be null, as the combination of
AnalyzingInfixLookupFactory and FileDictionaryFactory doesn't support
context filtering.

Is there anything I can do to fix this problem?

Thanks,
Joaquim


Re: Filter by sibling ?

2021-03-02 Thread Joel Bernstein
Solr's graph expressions can do this type of thing. It allows you to walk
the relationships in a graph with filters:

https://lucene.apache.org/solr/guide/8_6/graph-traversal.html
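
A rough sketch of the syntax, assuming the parent/child links are modeled as
explicit reference fields (hypothetical parent_id_s field, untested -- see the
guide above for the real parameters):

nodes(myCollection,
      search(myCollection, q="type:C2 AND status:Done", fl="parent_id_s", sort="parent_id_s asc"),
      walk="parent_id_s->parent_id_s",
      gather="id",
      fq="type:C1",
      scatter="leaves")

This would emit the ids of type:C1 docs that share a parent_id_s with a Done
type:C2 doc; nodes calls can be nested to walk more levels.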



Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Mar 2, 2021 at 9:00 AM Manoj Mokashi 
wrote:

> Hi,
>
> If I have a nested document structure, with say parent type:PR, child 1
> type:C1 and child2 type:C2,
> would it possible to fetch documents of type C1  that are children of
> parents that have child2 docs with a certain condition ?
> e.g. for
> { type:PR,
>   Title: "XXX",
>   Children1 : [ { type:C1, city:ABC} ],
>   Children2 : [ { type:C2, status:Done}]
> }
>
> Can I fetch type:C1 documents which are children of parent docs that have
> child C2 docs with status:Done ?
>
> Regards,
> manoj
>
> Confidentiality Notice
> 
> This email message, including any attachments, is for the sole use of the
> intended recipient and may contain confidential and privileged information.
> Any unauthorized view, use, disclosure or distribution is prohibited. If
> you are not the intended recipient, please contact the sender by reply
> email and destroy all copies of the original message. Anju Software, Inc.
> 4500 S. Lakeshore Drive, Suite 620, Tempe, AZ USA 85282.
>


Re: Partial update bug on solr 8.8.0

2021-03-02 Thread Mike Drob
This looks like a bug that is already fixed, with the fix slated for release in 8.9:

https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-13034

On Tue, Mar 2, 2021 at 6:27 AM Mohsen Saboorian  wrote:

> Any idea about this post?
> https://stackoverflow.com/q/66335803/141438
>
> Regards.
>


Re: Multiword synonyms and term wildcards/substring matching

2021-03-02 Thread Martin Graney
Hi Alex

Thanks for the reply.
We are not using the 'copyField bucket' approach as it is inflexible. Our
textual fields are all multivalued dynamic fields, which allows us to craft
a list of `pf` (phrase fields) with associated weighting boosts that are
meant to be used in the search on a *per-collection* basis. This allows us
to have all of the textual fields indexed independently and then simply
change the query when we want to include/exclude a field from the search
without the need to reindex the entire collection. e/dismax makes this more
flexible approach possible.

I'll take a look at the ComplexQueryParser and see if it is a good fit.
We use a lot of the e/dismax params though, such as `bf` (boost functions),
`bq` (boost queries), and 'pf' (phrase fields), to influence the relevance
score.

FYI: We are using Solr 8.3.

On Tue, 2 Mar 2021 at 13:38, Alexandre Rafalovitch 
wrote:

> I admit to not fully understanding the examples, but ComplexQueryParser
> looks like something worth at least reviewing:
>
>
> https://lucene.apache.org/solr/guide/8_8/other-parsers.html#complex-phrase-query-parser
>
> Also I did not see any references to trying to copyField and process same
> content in different ways. If copyField is not stored, the overhead is not
> as large.
>
> Regards,
> Alex
>
>
>
> On Tue., Mar. 2, 2021, 7:08 a.m. Martin Graney, 
> wrote:
>
> > Hi All
> >
> > I have been trying to implement multi word synonyms using `sow=false`
> into
> > a pre-existing system that applied pre-processing to the phrase to apply
> > wildcards around the terms, i.e. `bread stick` => `*bread* *stick*`.
> >
> > I got the synonyms expansion working perfectly, after discovering the
> > `preserveOriginal` filter param, but then I needed to re-implement the
> > existing wildcard behaviour.
> > I tried using the edge-ngram filter, but found that when searching for
> the
> > phrase `bread stick` on a field containing the word `breadstick` and
> > `q.op=AND` it returns no results, as the content `breadstick` does not
> > _start with_ `stick`. The previous wildcard behaviour would return all
> > documents that contain the substrings `bread` AND `stick`, which is the
> > desired behaviour.
> > I tried using the ngram filter, but this does not support the
> > `preserveOriginal`, and so loses a lot of relevance for exact matches,
> but
> > it also results in matches that are far too broad, creating 21 tokens
> from
> > `breadstick` for `minGramSize=3` and `maxGramSize=5` that in practice
> > essentially matches all of the documents. Which means that boosts applied
> > to other fields, such as 'in stock', push irrelevant documents to the
> top.
> >
> > Finally, I tried to strip out ngrams entirely and use subquery/LocalParam
> > syntax and local params, a solr feature that is not very well documented.
> > I created something like `q={!edismax sow=true v=$wildcards} OR {!edismax
> > sow=false v=$plain}` to effectively create a union of results, one with
> > multi word synonyms support and one with wildcard support.
> > But then I had to implement the other edismax params and immediately
> > stumbled.
> > Each query in production normally has a slew of `bf` and `bq` params,
> and I
> > cannot see a way to pass these into the nested query using local
> variables.
> > If I have 3 different `bf` params how can I pass them into the local
> param
> > subqueries?
> >
> > Also, as the search in production is across multiple fields I found
> passing
> > `qf` to both subqueries using dereferencing failed, as the parser saw it
> as
> > a single field and threw a 'number format exception'.
> > i.e.
> > q={!edismax sow=true v=$tw qf=$tqf} OR {!edismax sow=false v=$tp qf=$tqf}
> > $tw=*bread* *stick*
> > $tp=bread stick
> > $tqf=title^2 description^0.5
> >
> > As you can guess, I have spent quite some time going down this rabbit
> hole
> > in my attempt to reproduce the existing desired functionality alongside
> > multiterm synonyms.
> > Is there a way to get multiterm synonyms working with substring matching
> > effectively?
> > I am sure there is a much simpler way that I am missing than all of my
> > attempts so far.
> >
> > Solr: 8.3
> >
> > Thanks
> > Martin Graney
> >
> > --
> >  
> >
>


-- 
Martin Graney
Lead Developer

http://sooqr.com 
http://twitter.com/sooqrcom

Office: +31 (0) 88 766 7700
Mobile: +31 (0) 64 660 8543

-- 
 


Filter by sibling ?

2021-03-02 Thread Manoj Mokashi
Hi,

If I have a nested document structure, with say parent type:PR, child 1 type:C1 
and child2 type:C2,
would it possible to fetch documents of type C1  that are children of parents 
that have child2 docs with a certain condition ?
e.g. for
{ type:PR,
  Title: "XXX",
  Children1 : [ { type:C1, city:ABC} ],
  Children2 : [ { type:C2, status:Done}]
}

Can I fetch type:C1 documents which are children of parent docs that have child 
C2 docs with status:Done ?

Regards,
manoj



Re: Multiword synonyms and term wildcards/substring matching

2021-03-02 Thread Alexandre Rafalovitch
I admit to not fully understanding the examples, but ComplexQueryParser
looks like something worth at least reviewing:

https://lucene.apache.org/solr/guide/8_8/other-parsers.html#complex-phrase-query-parser

Also I did not see any references to trying to copyField and process same
content in different ways. If copyField is not stored, the overhead is not
as large.

Regards,
Alex



On Tue., Mar. 2, 2021, 7:08 a.m. Martin Graney, 
wrote:

> Hi All
>
> I have been trying to implement multi word synonyms using `sow=false` into
> a pre-existing system that applied pre-processing to the phrase to apply
> wildcards around the terms, i.e. `bread stick` => `*bread* *stick*`.
>
> I got the synonyms expansion working perfectly, after discovering the
> `preserveOriginal` filter param, but then I needed to re-implement the
> existing wildcard behaviour.
> I tried using the edge-ngram filter, but found that when searching for the
> phrase `bread stick` on a field containing the word `breadstick` and
> `q.op=AND` it returns no results, as the content `breadstick` does not
> _start with_ `stick`. The previous wildcard behaviour would return all
> documents that contain the substrings `bread` AND `stick`, which is the
> desired behaviour.
> I tried using the ngram filter, but this does not support the
> `preserveOriginal`, and so loses a lot of relevance for exact matches, but
> it also results in matches that are far too broad, creating 21 tokens from
> `breadstick` for `minGramSize=3` and `maxGramSize=5` that in practice
> essentially matches all of the documents. Which means that boosts applied
> to other fields, such as 'in stock', push irrelevant documents to the top.
>
> Finally, I tried to strip out ngrams entirely and use subquery/LocalParam
> syntax and local params, a solr feature that is not very well documented.
> I created something like `q={!edismax sow=true v=$wildcards} OR {!edismax
> sow=false v=$plain}` to effectively create a union of results, one with
> multi word synonyms support and one with wildcard support.
> But then I had to implement the other edismax params and immediately
> stumbled.
> Each query in production normally has a slew of `bf` and `bq` params, and I
> cannot see a way to pass these into the nested query using local variables.
> If I have 3 different `bf` params how can I pass them into the local param
> subqueries?
>
> Also, as the search in production is across multiple fields I found passing
> `qf` to both subqueries using dereferencing failed, as the parser saw it as
> a single field and threw a 'number format exception'.
> i.e.
> q={!edismax sow=true v=$tw qf=$tqf} OR {!edismax sow=false v=$tp qf=$tqf}
> $tw=*bread* *stick*
> $tp=bread stick
> $tqf=title^2 description^0.5
>
> As you can guess, I have spent quite some time going down this rabbit hole
> in my attempt to reproduce the existing desired functionality alongside
> multiterm synonyms.
> Is there a way to get multiterm synonyms working with substring matching
> effectively?
> I am sure there is a much simpler way that I am missing than all of my
> attempts so far.
>
> Solr: 8.3
>
> Thanks
> Martin Graney
>
> --
>  
>


Partial update bug on solr 8.8.0

2021-03-02 Thread Mohsen Saboorian
Any idea about this post?
https://stackoverflow.com/q/66335803/141438

Regards.


Multiword synonyms and term wildcards/substring matching

2021-03-02 Thread Martin Graney
Hi All

I have been trying to implement multi word synonyms using `sow=false` into
a pre-existing system that applied pre-processing to the phrase to apply
wildcards around the terms, i.e. `bread stick` => `*bread* *stick*`.

I got the synonyms expansion working perfectly, after discovering the
`preserveOriginal` filter param, but then I needed to re-implement the
existing wildcard behaviour.
I tried using the edge-ngram filter, but found that when searching for the
phrase `bread stick` on a field containing the word `breadstick` and
`q.op=AND` it returns no results, as the content `breadstick` does not
_start with_ `stick`. The previous wildcard behaviour would return all
documents that contain the substrings `bread` AND `stick`, which is the
desired behaviour.
I tried using the ngram filter, but this does not support the
`preserveOriginal`, and so loses a lot of relevance for exact matches, but
it also results in matches that are far too broad, creating 21 tokens from
`breadstick` for `minGramSize=3` and `maxGramSize=5` that in practice
essentially matches all of the documents. Which means that boosts applied
to other fields, such as 'in stock', push irrelevant documents to the top.

Finally, I tried to strip out ngrams entirely and use subquery/LocalParam
syntax and local params, a solr feature that is not very well documented.
I created something like `q={!edismax sow=true v=$wildcards} OR {!edismax
sow=false v=$plain}` to effectively create a union of results, one with
multi word synonyms support and one with wildcard support.
But then I had to implement the other edismax params and immediately
stumbled.
Each query in production normally has a slew of `bf` and `bq` params, and I
cannot see a way to pass these into the nested query using local variables.
If I have 3 different `bf` params how can I pass them into the local param
subqueries?

Also, as the search in production is across multiple fields I found passing
`qf` to both subqueries using dereferencing failed, as the parser saw it as
a single field and threw a 'number format exception'.
i.e.
q={!edismax sow=true v=$tw qf=$tqf} OR {!edismax sow=false v=$tp qf=$tqf}
$tw=*bread* *stick*
$tp=bread stick
$tqf=title^2 description^0.5

As you can guess, I have spent quite some time going down this rabbit hole
in my attempt to reproduce the existing desired functionality alongside
multiterm synonyms.
Is there a way to get multiterm synonyms working with substring matching
effectively?
I am sure there is a much simpler way that I am missing than all of my
attempts so far.

Solr: 8.3

Thanks
Martin Graney

-- 
 


Re: Solr wiki page update

2021-03-02 Thread Jan Høydahl
Vincent,

I added you as editor, please try editing that page again.

Jan

> 11. feb. 2021 kl. 17:43 skrev Vincent Brehin :
> 
> Hi community members,
> I work for Adelean  https://www.adelean.com/ , we are offering services
> around everything Search related, and especially Solr consulting and
> support. We are based in Paris and operate mainly in France.
> Is it possible to list our company on the support page (Support - SOLR - 
> Apache Software Foundation)?
> Or give me the permission to edit it on confluence (my user:
> vincent.brehin) ?
> Thanks !
> Best Regards,
> 
> Vincent



Re: Zookeeper 3.4.5 with Solr 8.8.0

2021-03-01 Thread Shawn Heisey

On 3/1/2021 9:45 PM, Subhajit Das wrote:

That is not possible at this time.

Will it be OK if I remove the ZooKeeper dependencies (jars) from Solr and 
replace them with 3.5.5 jars?
Thanks in advance.


Maybe.  But I cannot say for sure.

I know that when we upgraded to ZK 3.5, some fairly significant code 
changes in Solr were required.  I did not see whether more changes were 
needed when we upgraded again.


It would not surprise me to learn that a jar swap won't work.  Upgrades 
are far more likely to work than downgrades.


Thanks,
Shawn


RE: Zookeeper 3.4.5 with Solr 8.8.0

2021-03-01 Thread Subhajit Das
Hi Shawn,

That is not possible at this time.

Will it be OK if I remove the ZooKeeper dependencies (jars) from Solr and 
replace them with 3.5.5 jars?
Thanks in advance.


From: Shawn Heisey 
Sent: Monday, March 1, 2021 11:17:24 PM
To: solr-user@lucene.apache.org ; 
u...@zookeeper.apache.org 
Subject: Re: Zookeeper 3.4.5 with Solr 8.8.0

On 3/1/2021 6:51 AM, Subhajit Das wrote:
> I noticed that Solr 8.8.0 uses the ZooKeeper 3.6.2 client, while Solr 6.3.0 uses 
> the ZooKeeper 3.4.6 client. Is this a client bug or a version-mismatch issue?
> If so, how to fix this?

The ZK project guarantees that each minor version (X.Y.Z, where Y is the
same) will work with the previous minor version or the next minor version.

3.4 and 3.6 are two minor versions apart, and thus compatibility cannot
be guaranteed.

See the "backward compatibility" matrix here:

https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement

I think you'll need to upgrade your ZK server ensemble to fix it.

Thanks,
Shawn


Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-01 Thread Phill Campbell
Anyone?

> On Feb 24, 2021, at 7:47 AM, Phill Campbell  
> wrote:
> 
> Last week I switched to Solr 8.7 from a “special” build of Solr 6.6
> 
> The system has a timeout set for querying. I am now seeing this bug.
> 
> https://issues.apache.org/jira/browse/SOLR-14758 
> 
> 
> Max Query Time goes from 1.6 seconds to 20 seconds and affects the entire 
> system for about 2 minutes as reported in New Relic.
> 
> null:java.lang.NullPointerException
>   at 
> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:935)
>   at 
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)
>   at 
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)
>   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:486)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)
> 
> 
> Can this be fixed in a patch for Solr 8.8? I do not want to have to go back 
> to Solr 6 and reindex the system; that takes 2 days using 180 EMR instances.
> 
> Please advise. Thank you.



Re: Idle timeout expired and Early Client Disconnect errors

2021-03-01 Thread Joel Bernstein
Also the parallel function builds hash partitioning filters that could lead
to timeouts if they take too long to build. Try the query without the
parallel function if you're still getting timeouts when making the query
smaller.



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein  wrote:

> The settings in your version are 30 seconds and 15 seconds for socket and
> connection timeouts.
>
> Typically timeouts occur because one or more shards in the query are idle
> beyond the timeout threshold. This happens because lots of data is being
> read from other shards.
>
> Breaking the query into small parts would be a good strategy.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz 
> wrote:
>
>> Hello Mr. Bernstein,
>>
>> I’m using version 8.4. So, if I understand correctly, I can’t increase
>> timeouts and they are bound to happen in such a large stream. Should I just
>> reduce the output of my search expressions?
>>
>> Maybe I can split my search results into ~100 parts and run the same
>> query 100 times in series. Each part would emit ~3M documents so they
>> should finish before timeout?
>>
>> Is this a reasonable solution?
>>
>> Btw how long is the default hard-coded timeout value? Because yesterday I
>> ran another query which took more than 1 hour without any timeouts and
>> finished successfully.
>>
>> Sent from Mail for Windows 10
>>
>> From: Joel Bernstein
>> Sent: 01 March 2021 23:03
>> To: solr-user@lucene.apache.org
>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>
>> Oh wait, I misread your email. The idle timeout issue is configurable in:
>>
>> https://issues.apache.org/jira/browse/SOLR-14672
>>
>> This unfortunately missed the 8.8 release and will be 8.9.
>>
>>
>>
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein  wrote:
>>
>> > What version are you using?
>> >
>> > Solr 8.7 has changes that caused these errors to hit the logs. These
>> used
>> > to be suppressed. This has been fixed in Solr 9.0 but it has not been
>> back
>> > ported to Solr 8.x.
>> >
>> > The errors are actually normal operational occurrences when doing joins
>> so
>> > should be suppressed in the logs and were before the specific release.
>> >
>> > It might make sense to do a release that specifically suppresses these
>> > errors without backporting the full Solr 9.0 changes which impact the
>> > memory footprint of export.
>> >
>> >
>> >
>> >
>> > Joel Bernstein
>> > http://joelsolr.blogspot.com/
>> >
>> >
>> > On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz > >
>> > wrote:
>> >
>> >> Hello all,
>> >>
>> >> I’m running a large streaming expression and feeding the result to
>> update
>> >> expression.
>> >>
>> >>  update(targetCollection, ...long running stream here...,
>> >>
>> >> I tried sending the exact same query multiple times, it sometimes works
>> >> and indexes some results, then gives exception, other times fails with
>> an
>> >> exception after 2 minutes.
>> >>
>> >> Response is like:
>> >> "EXCEPTION":"java.util.concurrent.ExecutionException:
>> >> java.io.IOException: params distrib=false=4 and my long
>> >> stream expression
>> >>
>> >> Server log (short):
>> >> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired:
>> 12/12
>> >> ms
>> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired:
>> 12/12
>> >> ms
>> >>
>> >> I tried to increase the jetty idle timeout value on the node which
>> hosts
>> >> my target collection to something like an hour. It didn’t affect.
>> >>
>> >>
>> >> Server logs (long)
>> >> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>> >> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>> >> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>> >> 2/12 ms
>> >> solr-01|at
>> >>
>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>> >> solr-01|at
>> >> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>> >> solr-01|at
>> >> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>> >> solr-01|at
>> >>
>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>> >> solr-01|at
>> >> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>> >> solr-01|at
>> >> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>> >> solr-01|at
>> >> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>> >> solr-01|at
>> >> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>> >> solr-01|at 

Re: Idle timeout expired and Early Client Disconnect errors

2021-03-01 Thread Joel Bernstein
The settings in your version are 30 seconds and 15 seconds for socket and
connection timeouts.

Typically timeouts occur because one or more shards in the query are idle
beyond the timeout threshold. This happens because lots of data is being
read from other shards.

Breaking the query into small parts would be a good strategy.
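
For example, something along these lines per slice, advancing the id range on
each run (collection and field names hypothetical):

update(targetCollection, batchSize=500,
       search(sourceCollection,
              q="id_i:[0 TO 999999]",
              fl="id,field_a",
              sort="id_i asc",
              qt="/export"))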




Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz 
wrote:

> Hello Mr. Bernstein,
>
> I’m using version 8.4. So, if I understand correctly, I can’t increase
> timeouts and they are bound to happen in such a large stream. Should I just
> reduce the output of my search expressions?
>
> Maybe I can split my search results into ~100 parts and run the same query
> 100 times in series. Each part would emit ~3M documents so they should
> finish before timeout?
>
> Is this a reasonable solution?
>
> Btw how long is the default hard-coded timeout value? Because yesterday I
> ran another query which took more than 1 hour without any timeouts and
> finished successfully.
>
> Sent from Mail for Windows 10
>
> From: Joel Bernstein
> Sent: 01 March 2021 23:03
> To: solr-user@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>
> Oh wait, I misread your email. The idle timeout issue is configurable in:
>
> https://issues.apache.org/jira/browse/SOLR-14672
>
> This unfortunately missed the 8.8 release and will be 8.9.
>
>
>
> This i
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein  wrote:
>
> > What version are you using?
> >
> > Solr 8.7 has changes that caused these errors to hit the logs. These used
> > to be suppressed. This has been fixed in Solr 9.0 but it has not been
> back
> > ported to Solr 8.x.
> >
> > The errors are actually normal operational occurrences when doing joins
> so
> > should be suppressed in the logs and were before the specific release.
> >
> > It might make sense to do a release that specifically suppresses these
> > errors without backporting the full Solr 9.0 changes which impact the
> > memory footprint of export.
> >
> >
> >
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> >
> > On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz  >
> > wrote:
> >
> >> Hello all,
> >>
> >> I’m running a large streaming expression and feeding the result to
> update
> >> expression.
> >>
> >>  update(targetCollection, ...long running stream here...,
> >>
> >> I tried sending the exact same query multiple times, it sometimes works
> >> and indexes some results, then gives exception, other times fails with
> an
> >> exception after 2 minutes.
> >>
> >> Response is like:
> >> "EXCEPTION":"java.util.concurrent.ExecutionException:
> >> java.io.IOException: params distrib=false=4 and my long
> >> stream expression
> >>
> >> Server log (short):
> >> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
> >> java.util.concurrent.TimeoutException: Idle timeout expired:
> 12/12
> >> ms
> >> o.a.s.s.HttpSolrCall null:java.io.IOException:
> >> java.util.concurrent.TimeoutException: Idle timeout expired:
> 12/12
> >> ms
> >>
> >> I tried to increase the jetty idle timeout value on the node which hosts
> >> my target collection to something like an hour. It didn’t affect.
> >>
> >>
> >> Server logs (long)
> >> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
> >> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
> >> java.util.concurrent.TimeoutException: Idle timeout expired: 1
> >> 2/12 ms
> >> solr-01|at
> >>
> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
> >> solr-01|at
> >> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
> >> solr-01|at
> >> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
> >> solr-01|at
> >>
> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
> >> solr-01|at
> >> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
> >> solr-01|at
> >> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
> >> solr-01|at
> >> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
> >> solr-01|at
> >> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> >> solr-01|at java.base/java.io
> >> .OutputStreamWriter.write(OutputStreamWriter.java:211)
> >> solr-01|at
> >> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
> >> solr-01|at
> >> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
> >> solr-01|at
> >> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
> >> solr-01|at
> >>
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
> >> solr-01|at
> >> 

RE: Idle timeout expired and Early Client Disconnect errors

2021-03-01 Thread ufuk yılmaz
Hello Mr. Bernstein,

I’m using version 8.4. So, if I understand correctly, I can’t increase timeouts 
and they are bound to happen in such a large stream. Should I just reduce the 
output of my search expressions?

Maybe I can split my search results into ~100 parts and run the same query 100 
times in series. Each part would emit ~3M documents so they should finish 
before timeout?

Is this a reasonable solution?

Btw how long is the default hard-coded timeout value? Because yesterday I ran 
another query which took more than 1 hour without any timeouts and finished 
successfully.

Sent from Mail for Windows 10

From: Joel Bernstein
Sent: 01 March 2021 23:03
To: solr-user@lucene.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

Oh wait, I misread your email. The idle timeout issue is configurable in:

https://issues.apache.org/jira/browse/SOLR-14672

This unfortunately missed the 8.8 release and will be 8.9.



This i



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein  wrote:

> What version are you using?
>
> Solr 8.7 has changes that caused these errors to hit the logs. These used
> to be suppressed. This has been fixed in Solr 9.0 but it has not been back
> ported to Solr 8.x.
>
> The errors are actually normal operational occurrences when doing joins so
> should be suppressed in the logs and were before the specific release.
>
> It might make sense to do a release that specifically suppresses these
> errors without backporting the full Solr 9.0 changes which impact the
> memory footprint of export.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz 
> wrote:
>
>> Hello all,
>>
>> I’m running a large streaming expression and feeding the result to update
>> expression.
>>
>>  update(targetCollection, ...long running stream here...,
>>
>> I tried sending the exact same query multiple times, it sometimes works
>> and indexes some results, then gives exception, other times fails with an
>> exception after 2 minutes.
>>
>> Response is like:
>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>> java.io.IOException: params distrib=false=4 and my long
>> stream expression
>>
>> Server log (short):
>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
>> ms
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
>> ms
>>
>> I tried to increase the jetty idle timeout value on the node which hosts
>> my target collection to something like an hour. It didn’t affect.
>>
>>
>> Server logs (long)
>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>> 2/12 ms
>> solr-01|at
>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>> solr-01|at
>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>> solr-01|at
>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>> solr-01|at
>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>> solr-01|at java.base/java.io
>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>> solr-01|at
>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>> solr-01|at
>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>> solr-01|at
>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>> solr-01|at
>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>> solr-01|at
>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>> solr-01|at
>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>> solr-01|at
>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>> solr-01|at
>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>> solr-01|at
>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>> solr-01|at
>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>> solr-01|at
>> 

Re: Idle timeout expired and Early Client Disconnect errors

2021-03-01 Thread Joel Bernstein
Oh wait, I misread your email. The idle timeout issue is configurable in:

https://issues.apache.org/jira/browse/SOLR-14672

This unfortunately missed the 8.8 release and will be 8.9.



This i



Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein  wrote:

> What version are you using?
>
> Solr 8.7 has changes that caused these errors to hit the logs. These used
> to be suppressed. This has been fixed in Solr 9.0 but it has not been back
> ported to Solr 8.x.
>
> The errors are actually normal operational occurrences when doing joins so
> should be suppressed in the logs and were before the specific release.
>
> It might make sense to do a release that specifically suppresses these
> errors without backporting the full Solr 9.0 changes which impact the
> memory footprint of export.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz 
> wrote:
>
>> Hello all,
>>
>> I’m running a large streaming expression and feeding the result to update
>> expression.
>>
>>  update(targetCollection, ...long running stream here...,
>>
>> I tried sending the exact same query multiple times, it sometimes works
>> and indexes some results, then gives exception, other times fails with an
>> exception after 2 minutes.
>>
>> Response is like:
>> "EXCEPTION":"java.util.concurrent.ExecutionException:
>> java.io.IOException: params distrib=false=4 and my long
>> stream expression
>>
>> Server log (short):
>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
>> ms
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
>> ms
>>
>> I tried to increase the jetty idle timeout value on the node which hosts
>> my target collection to something like an hour. It didn’t affect.
>>
>>
>> Server logs (long)
>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
>> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 1
>> 2/12 ms
>> solr-01|at
>> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>> solr-01|at
>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>> solr-01|at
>> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>> solr-01|at
>> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>> solr-01|at
>> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>> solr-01|at java.base/java.io
>> .OutputStreamWriter.write(OutputStreamWriter.java:211)
>> solr-01|at
>> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>> solr-01|at
>> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>> solr-01|at
>> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>> solr-01|at
>> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>> solr-01|at
>> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>> solr-01|at
>> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>> solr-01|at
>> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>> solr-01|at
>> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>> solr-01|at
>> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>> solr-01|at
>> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>> solr-01|at
>> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>> --
>> solr-01|at org.eclipse.jetty.io
>> .FillInterest.fillable(FillInterest.java:103)
>> solr-01|at org.eclipse.jetty.io
>> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>> solr-01|at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>> solr-01|at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>> solr-01|at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>> solr-01|at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>> solr-01|at
>> 

Re: Idle timeout expired and Early Client Disconnect errors

2021-03-01 Thread Joel Bernstein
What version are you using?

Solr 8.7 has changes that caused these errors to hit the logs. These used
to be suppressed. This has been fixed in Solr 9.0 but it has not been back
ported to Solr 8.x.

The errors are actually normal operational occurrences when doing joins so
should be suppressed in the logs and were before the specific release.

It might make sense to do a release that specifically suppresses these
errors without backporting the full Solr 9.0 changes which impact the
memory footprint of export.




Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz 
wrote:

> Hello all,
>
> I’m running a large streaming expression and feeding the result to update
> expression.
>
>  update(targetCollection, ...long running stream here...,
>
> I tried sending the exact same query multiple times, it sometimes works
> and indexes some results, then gives exception, other times fails with an
> exception after 2 minutes.
>
> Response is like:
> "EXCEPTION":"java.util.concurrent.ExecutionException: java.io.IOException:
> params distrib=false=4 and my long stream expression
>
> Server log (short):
> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 12/12 ms
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException: Idle timeout expired: 12/12
> ms
>
> I tried to increase the jetty idle timeout value on the node which hosts
> my target collection to something like an hour. It didn’t affect.
>
>
> Server logs (long)
> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2
> x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException: Idle timeout expired: 1
> 2/12 ms
> solr-01|at
> org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
> solr-01|at
> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
> solr-01|at
> org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
> solr-01|at
> org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
> solr-01|at
> java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
> solr-01|at
> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
> solr-01|at
> java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
> solr-01|at
> java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> solr-01|at java.base/java.io
> .OutputStreamWriter.write(OutputStreamWriter.java:211)
> solr-01|at
> org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
> solr-01|at
> org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
> solr-01|at
> org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
> solr-01|at
> org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
> solr-01|at
> org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
> solr-01|at
> org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
> solr-01|at
> org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
> solr-01|at
> org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
> solr-01|at
> org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
> solr-01|at
> org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
> solr-01|at
> org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
> --
> solr-01|at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:103)
> solr-01|at org.eclipse.jetty.io
> .ChannelEndPoint$2.run(ChannelEndPoint.java:117)
> solr-01|at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
> solr-01|at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
> solr-01|at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
> solr-01|at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
> solr-01|at
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
> solr-01|at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
> solr-01|at
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
> solr-01|at java.base/java.lang.Thread.run(Thread.java:834)
> solr-01| Caused by: java.util.concurrent.TimeoutException: Idle
> timeout expired: 12/12 ms
> solr-01|at 

Re: Zookeeper 3.4.5 with Solr 8.8.0

2021-03-01 Thread Shawn Heisey

On 3/1/2021 6:51 AM, Subhajit Das wrote:

I noticed that Solr 8.8.0 uses the Zookeeper 3.6.2 client, while Solr 6.3.0 uses 
the Zookeeper 3.4.6 client. Is this a client bug or a mismatch issue?
If so, how do I fix this?


The ZK project guarantees that each minor version (X.Y.Z, where Y is the 
same) will work with the previous minor version or the next minor version.


3.4 and 3.6 are two minor versions apart, and thus compatibility cannot 
be guaranteed.


See the "backward compatibility" matrix here:

https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement

I think you'll need to upgrade your ZK server ensemble to fix it.

Thanks,
Shawn


Idle timeout expired and Early Client Disconnect errors

2021-03-01 Thread ufuk yılmaz
Hello all,

I’m running a large streaming expression and feeding the result to update 
expression.

 update(targetCollection, ...long running stream here..., 

I tried sending the exact same query multiple times, it sometimes works and 
indexes some results, then gives exception, other times fails with an exception 
after 2 minutes.

Response is like:
"EXCEPTION":"java.util.concurrent.ExecutionException: java.io.IOException: 
params distrib=false=4 and my long stream expression

Server log (short):
[c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall 
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout 
expired: 12/12 ms
o.a.s.s.HttpSolrCall null:java.io.IOException: 
java.util.concurrent.TimeoutException: Idle timeout expired: 12/12 ms

I tried to increase the jetty idle timeout value on the node which hosts my 
target collection to something like an hour. It didn’t affect.
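
(For reference, the knob I changed, in case I targeted the wrong one:
SOLR_OPTS="$SOLR_OPTS -Dsolr.jetty.http.idleTimeout=3600000" in solr.in.sh,
which should be the property Solr's jetty-http.xml reads for Jetty's
idleTimeout; it apparently does not apply to these streaming connections.)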


Server logs (long)
ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] 
o.a.s.s.HttpSolrCall null:java.io.IOException: 
java.util.concurrent.TimeoutException: Idle timeout expired: 1  
2/12 ms
solr-01|at 
org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
solr-01|at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
solr-01|at 
org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
solr-01|at 
org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
solr-01|at 
java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
solr-01|at 
java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
solr-01|at 
java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
solr-01|at 
java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
solr-01|at 
java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:211)
solr-01|at 
org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
solr-01|at 
org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
solr-01|at 
org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
solr-01|at 
org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
solr-01|at 
org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
solr-01|at 
org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
solr-01|at 
org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
solr-01|at 
org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
solr-01|at 
org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
solr-01|at 
org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
solr-01|at 
org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
--
solr-01|at 
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
solr-01|at 
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
solr-01|at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
solr-01|at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
solr-01|at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
solr-01|at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
solr-01|at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
solr-01|at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
solr-01|at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
solr-01|at java.base/java.lang.Thread.run(Thread.java:834)
solr-01| Caused by: java.util.concurrent.TimeoutException: Idle timeout 
expired: 12/12 ms
solr-01|at 
org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
solr-01|at 
org.eclipse.jetty.io.IdleTimeout.idleCheck(IdleTimeout.java:113)
solr-01|at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
solr-01|at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
solr-01|at 
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
solr-01|at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
solr-01|at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
solr-01|... 1 more


My expression, in case 

Zookeeper 3.4.5 with Solr 8.8.0

2021-03-01 Thread Subhajit Das

Hi There,

I am setting up Solr 8.8.0 with Zookeeper 3.4.5.

There seems to be an issue. An EndOfStream error is coming up, saying the client 
must have closed the connection.

If the same is tried with Solr 6.3.0, the issue doesn't occur. It comes with newer 
Solr only.

I noticed that Solr 8.8.0 uses the Zookeeper 3.6.2 client, while Solr 6.3.0 uses 
the Zookeeper 3.4.6 client. Is this a client bug or a mismatch issue?
If so, how do I fix this?

Thanks in advance.


RE: How to read tlog

2021-03-01 Thread Subhajit Das
Thanks for the reply.
Will try.

From: Gael Jourdan-Weil
Sent: 01 March 2021 05:48 PM
To: solr-user@lucene.apache.org
Subject: RE: How to read tlog

Hello,

You can just use "cat" or "tail", even though the tlog is not a text file, its 
content can mostly be read using these commands.
You will have one document per line and should be able to see the fields' 
content.

I don't know if there is a Solr command which would give a better display, though.

Gaël

De : Subhajit Das 
Envoyé : samedi 27 février 2021 16:00
À : solr-user@lucene.apache.org 
Objet : How to read tlog


Hi There,

I faced an issue on a core in a multicore standalone instance.
Is there any way to read tlog contents as text files? This might help to 
resolve the issue.
Thanks in advance.



RE: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

2021-03-01 Thread Flowerday, Matthew J
Hi Ere

Pleased to be of service!

No, I have not filed a JIRA ticket. I am new to interacting with the Solr
Community and only beginning to 'find my legs'. I am not too sure what JIRA
is, I am afraid!

Regards

Matthew

Matthew Flowerday | Consultant | ULEAF
Unisys | 01908 774830| matthew.flower...@unisys.com 
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX



THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.
   

-Original Message-
From: Ere Maijala  
Sent: 01 March 2021 12:53
To: solr-user@lucene.apache.org
Subject: Re: Potential Slow searching for unified highlighting on Solr
8.8.0/8.8.1

EXTERNAL EMAIL - Be cautious of all links and attachments.

Hi,

Whoa, thanks for the heads-up! You may just have saved me from a whole lot
of trouble. Did you file a JIRA ticket already?

Thanks,
Ere

Flowerday, Matthew J kirjoitti 1.3.2021 klo 14.00:
> Hi There
>
> I just came across a situation where a unified highlighting search 
> under solr 8.8.0/8.8.1 can take over 20 mins to run and eventually times
out.
> I resolved it by a config change – but it can catch you out. Hence 
> this email.
>
> With solr 8.8.0 a new unified highlighting parameter 
> hl.fragAlignRatio was implemented which if not set defaults to 0.5. 
> This attempts to improve the highlighting so that highlighted text 
> does not appear right at the left. This works well but if you have a 
> search result with numerous occurrences of the word in question within 
> the record, performance goes right down!
>
> 2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf] 
> o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select 
> params={hl.snippets=2=test=on=100=id,d
> escription,specification,score=20=*=10&_=161440511913
> 4}
> hits=57008 status=0 QTime=1414320
>
> 2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf] 
> o.a.s.s.HttpSolrCall Unable to write response, client closed 
> connection or we are shutting down => 
> org.eclipse.jetty.io.EofException
>
>at
> org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
>
> org.eclipse.jetty.io.EofException: null
>
>at
> org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>
>at
> org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>
>at
> org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378)
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>
> when I set hl.fragAlignRatio=0.25 results came back much quicker
>
> 2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params={hl.weightMatches=false=on=id,description,specification,s
> core=1=0.25=100=2=test
> axAnalyzedChars=100=*=unified=9&_=
> 1614430061690}
> hits=136939 status=0 QTime=87024
>
> And hl.fragAlignRatio=0.1
>
> 2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params={hl.weightMatches=false=on=id,description,specification,s
> core=1=0.1=100=2=test
> xAnalyzedChars=100=*=unified=9&_=1
> 614430061690}
> hits=136939 status=0 QTime=69033
>
> And hl.fragAlignRatio=0.0
>
> 2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params={hl.weightMatches=false=on=id,description,specification,s
> core=1=0.0=100=2=test
> xAnalyzedChars=100=*=unified=9&_=1
> 614430061690}
> hits=136939 status=0 QTime=2841
>
> I left our setting at 0.0 – this is presumably how it was in 7.7.1 (fully 
> left aligned). I am not too sure as to how many times a word has to 
> occur in a record for performance to go right down – but if too many 
> it can have a BIG impact.
>
> I also noticed that setting timeAllowed=9 did not break out of 
> the query until it finished. Perhaps because the query finished 
> quickly and what took the time was the highlighting. It might be an 
> idea to get timeAllowed to also cover any highlighting so that the 
> query does not run until the jetty timeout is hit. The machine maxed 
> out one core at 100% for about
> 20 mins!
>
> Hope this helps.
>
> Regards
>
> Matthew
>
> *Matthew Flowerday*| Consultant | ULEAF
>
> Unisys | 01908 774830| matthew.flower...@unisys.com 
> 
>
> Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes |
> MK17 8LX
>
>
> THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE 
> PROPRIETARY MATERIAL and is for use only by the intended recipient. If 
> you received this in error, please contact the sender and delete the 
> e-mail and its attachments from all devices.
>

Re: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

2021-03-01 Thread Ere Maijala

Hi,

Whoa, thanks for the heads-up! You may just have saved me from a whole 
lot of trouble. Did you file a JIRA ticket already?


Thanks,
Ere

Flowerday, Matthew J kirjoitti 1.3.2021 klo 14.00:

Hi There

I just came across a situation where a unified highlighting search under 
solr 8.8.0/8.8.1 can take over 20 mins to run and eventually times out. 
I resolved it by a config change – but it can catch you out. Hence this 
email.


With solr 8.8.0 a new unified highlighting parameter hl.fragAlignRatio 
was implemented which if not set defaults to 0.5. This attempts to 
improve the highlighting so that highlighted text does not appear right 
at the left. This works well but if you have a search result with 
numerous occurrences of the word in question within the record, 
performance goes right down!


2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf] 
o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select 
params={hl.snippets=2=test=on=100=id,description,specification,score=20=*=10&_=1614405119134} 
hits=57008 status=0 QTime=1414320


2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf] 
o.a.s.s.HttpSolrCall Unable to write response, client closed connection 
or we are shutting down => org.eclipse.jetty.io.EofException


   at 
org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)


org.eclipse.jetty.io.EofException: null

   at 
org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279) 
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]


   at 
org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) 
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]


   at 
org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378) 
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]


when I set hl.fragAlignRatio=0.25 results came back much quicker

2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes] 
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
params={hl.weightMatches=false=on=id,description,specification,score=1=0.25=100=2=test=100=*=unified=9&_=1614430061690} 
hits=136939 status=0 QTime=87024


And hl.fragAlignRatio=0.1

2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes] 
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
params={hl.weightMatches=false=on=id,description,specification,score=1=0.1=100=2=test=100=*=unified=9&_=1614430061690} 
hits=136939 status=0 QTime=69033


And hl.fragAlignRatio=0.0

2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes] 
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
params={hl.weightMatches=false=on=id,description,specification,score=1=0.0=100=2=test=100=*=unified=9&_=1614430061690} 
hits=136939 status=0 QTime=2841


I left our setting at 0.0 – this is presumably how it was in 7.7.1 (fully 
left aligned). I am not too sure as to how many times a word has to 
occur in a record for performance to go right down – but if too many it 
can have a BIG impact.


I also noticed that setting timeAllowed=9 did not break out of the 
query until it finished. Perhaps because the query finished quickly and 
what took the time was the highlighting. It might be an idea to get 
timeAllowed to also cover any highlighting so that the query does not 
run until the jetty timeout is hit. The machine maxed out one core at 
100% for about 20 mins!


Hope this helps.

Regards

Matthew

*Matthew Flowerday*| Consultant | ULEAF

Unisys | 01908 774830| matthew.flower...@unisys.com 



Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | 
MK17 8LX



THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY 
MATERIAL and is for use only by the intended recipient. If you received 
this in error, please contact the sender and delete the e-mail and its 
attachments from all devices.






--
Ere Maijala
Kansalliskirjasto / The National Library of Finland


RE: How to read tlog

2021-03-01 Thread Gael Jourdan-Weil
Hello,

You can just use "cat" or "tail", even though the tlog is not a text file, its 
content can mostly be read using these commands.
You will have one document per line and should be able to see the fields' 
content.
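
For example, something like this (the tlog file name is made up):

  cat tlog.0000000000000000042 | strings | less

If the binary framing gets in the way, piping through "strings" as above
drops most of it and leaves the field values readable.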

I don't know if there is a Solr command which would give a better display, though.

Gaël

De : Subhajit Das 
Envoyé : samedi 27 février 2021 16:00
À : solr-user@lucene.apache.org 
Objet : How to read tlog 
 

Hi There,

I faced an issue on a core in a multicore standalone instance.
Is there any way to read tlog contents as text files? This might help to 
resolve the issue.
Thanks in advance.

Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

2021-03-01 Thread Flowerday, Matthew J
Hi There

 

I just came across a situation where a unified highlighting search under
solr 8.8.0/8.8.1 can take over 20 mins to run and eventually times out. I
resolved it by a config change - but it can catch you out. Hence this email.

 

With solr 8.8.0 a new unified highlighting parameter hl.fragAlignRatio was
implemented which if not set defaults to 0.5. This attempts to improve the
highlighting so that highlighted text does not appear right at the left.
This works well but if you have a search result with numerous occurrences of
the word in question within the record, performance goes right down!

 

2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf]
o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select
params={hl.snippets=2=test=on=100=id,descrip
tion,specification,score=20=*=10&_=1614405119134}
hits=57008 status=0 QTime=1414320

2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf]
o.a.s.s.HttpSolrCall Unable to write response, client closed connection or
we are shutting down => org.eclipse.jetty.io.EofException

  at
org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)

org.eclipse.jetty.io.EofException: null

  at
org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

  at
org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

  at
org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378)
~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

 

when I set hl.fragAlignRatio=0.25 results came back much quicker

 

2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes]
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
params={hl.weightMatches=false=on=id,description,specification,score
tart=1=0.25=100=2=test
ars=100=*=unified=9&_=1614430061690}
hits=136939 status=0 QTime=87024

 

And hl.fragAlignRatio=0.1

 

2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes]
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
params={hl.weightMatches=false=on=id,description,specification,score
tart=1=0.1=100=2=test
rs=100=*=unified=9&_=1614430061690}
hits=136939 status=0 QTime=69033

 

And hl.fragAlignRatio=0.0

 

2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes]
o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
params={hl.weightMatches=false=on=id,description,specification,score
tart=1=0.0=100=2=test
rs=100=*=unified=9&_=1614430061690}
hits=136939 status=0 QTime=2841

 

I left our setting at 0.0 - this is presumably how it was in 7.7.1 (fully left
aligned). I am not too sure as to how many times a word has to occur in a
record for performance to go right down - but if too many it can have a BIG
impact.
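
In case it is useful, the workaround can also be baked into solrconfig.xml
instead of being sent on every request (a sketch; handler definition as in a
stock config):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="hl.fragAlignRatio">0.0</str>
  </lst>
</requestHandler>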

 

I also noticed that setting timeAllowed=9 did not break out of the
query until it finished. Perhaps because the query finished quickly and what
took the time was the highlighting. It might be an idea to get timeAllowed
to also cover any highlighting so that the query does not run until the
jetty timeout is hit. The machine maxed out one core at 100% for about 20 mins!

 

Hope this helps.

 

Regards

 

Matthew

 

Matthew Flowerday | Consultant | ULEAF

Unisys | 01908 774830|  
matthew.flower...@unisys.com 

Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX

 

  

 

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.

 

  
 

 





Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-27 Thread Joel Bernstein
Congratulations Jan!

Joel Bernstein
http://joelsolr.blogspot.com/


On Mon, Feb 22, 2021 at 2:41 AM Danilo Tomasoni  wrote:

> Congratulations Jan!
>
> Danilo Tomasoni
>
> Fondazione The Microsoft Research - University of Trento Centre for
> Computational and Systems Biology (COSBI)
> Piazza Manifattura 1,  38068 Rovereto (TN), Italy
> tomas...@cosbi.eu
> http://www.cosbi.eu
>
> As for the European General Data Protection Regulation 2016/679 on the
> protection of natural persons with regard to the processing of personal
> data, we inform you that all the data we possess are object of treatment in
> the respect of the normative provided for by the cited GDPR.
> It is your right to be informed on which of your data are used and how;
> you may ask for their correction, cancellation or you may oppose to their
> use by written request sent by recorded delivery to The Microsoft Research
> – University of Trento Centre for Computational and Systems Biology Scarl,
> Piazza Manifattura 1, 38068 Rovereto (TN), Italy.
> Please don't print this e-mail unless you really need to
> 
> Da: Yonik Seeley 
> Inviato: domenica 21 febbraio 2021 05:51
> A: solr-user@lucene.apache.org 
> Cc: Lucene Dev 
> Oggetto: Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
>
> [CAUTION: EXTERNAL SENDER]
> [Please check correspondence between Sender Display Name and Sender Email
> Address before clicking on any link or opening attachments]
>
>
> Congrats Jan! Go Solr!
> -Yonik
>
>
> On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta 
> wrote:
>
> > Hi everyone,
> >
> > I’d like to inform everyone that the newly formed Apache Solr PMC
> nominated
> > and elected Jan Høydahl for the position of the Solr PMC Chair and Vice
> > President. This decision was approved by the board in its February 2021
> > meeting.
> >
> > Congratulations Jan!
> >
> > --
> > Anshum Gupta
> >
>


Re: [ANNOUNCE] Apache Solr 8.8.1 released

2021-02-27 Thread Timothy Potter
Awesome! Thank you David and Tobias ;-)

On Sat, Feb 27, 2021 at 12:21 PM David Smiley  wrote:
>
> The corresponding docker image has been released as well:
> https://hub.docker.com/_/solr
> (credit to Tobias Kässmann for helping)
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Feb 23, 2021 at 10:39 AM Timothy Potter 
> wrote:
>
> > The Lucene PMC is pleased to announce the release of Apache Solr 8.8.1.
> >
> >
> > Solr is the popular, blazing fast, open source NoSQL search platform from
> > the Apache Lucene project. Its major features include powerful full-text
> > search, hit highlighting, faceted search, dynamic clustering, database
> > integration, rich document handling, and geospatial search. Solr is highly
> > scalable, providing fault tolerant distributed search and indexing, and
> > powers the search and navigation features of many of the world's largest
> > internet sites.
> >
> >
> > Solr 8.8.1 is available for immediate download at:
> >
> >
> >   
> >
> >
> > ### Solr 8.8.1 Release Highlights:
> >
> >
> > Fix for a SolrJ backwards compatibility issue when upgrading the server to
> > 8.8.0 without upgrading SolrJ to 8.8.0.
> >
> >
> > Please refer to the Upgrade Notes in the Solr Ref Guide for information on
> > upgrading from previous Solr versions:
> >
> >
> >   
> >
> >
> > Please read CHANGES.txt for a full list of bugfixes:
> >
> >
> >   
> >
> >
> > Solr 8.8.1 also includes bugfixes in the corresponding Apache Lucene
> > release:
> >
> >
> >   
> >
> >
> >
> > Note: The Apache Software Foundation uses an extensive mirroring network
> > for
> >
> > distributing releases. It is possible that the mirror you are using may not
> > have
> >
> > replicated the release yet. If that is the case, please try another mirror.
> >
> > This also applies to Maven access.
> >
> > 
> >


Re: [ANNOUNCE] Apache Solr 8.8.1 released

2021-02-27 Thread David Smiley
The corresponding docker image has been released as well:
https://hub.docker.com/_/solr
(credit to Tobias Kässmann for helping)

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Feb 23, 2021 at 10:39 AM Timothy Potter 
wrote:

> The Lucene PMC is pleased to announce the release of Apache Solr 8.8.1.
>
>
> Solr is the popular, blazing fast, open source NoSQL search platform from
> the Apache Lucene project. Its major features include powerful full-text
> search, hit highlighting, faceted search, dynamic clustering, database
> integration, rich document handling, and geospatial search. Solr is highly
> scalable, providing fault tolerant distributed search and indexing, and
> powers the search and navigation features of many of the world's largest
> internet sites.
>
>
> Solr 8.8.1 is available for immediate download at:
>
>
>   
>
>
> ### Solr 8.8.1 Release Highlights:
>
>
> Fix for a SolrJ backwards compatibility issue when upgrading the server to
> 8.8.0 without upgrading SolrJ to 8.8.0.
>
>
> Please refer to the Upgrade Notes in the Solr Ref Guide for information on
> upgrading from previous Solr versions:
>
>
>   
>
>
> Please read CHANGES.txt for a full list of bugfixes:
>
>
>   
>
>
> Solr 8.8.1 also includes bugfixes in the corresponding Apache Lucene
> release:
>
>
>   
>
>
>
> Note: The Apache Software Foundation uses an extensive mirroring network
> for
>
> distributing releases. It is possible that the mirror you are using may not
> have
>
> replicated the release yet. If that is the case, please try another mirror.
>
> This also applies to Maven access.
>
> 
>


What guarantees does solr have for keeping commit deadlines?

2021-02-27 Thread Nándor Mátravölgyi
Hi!

I'm working on building a NRT solr instance. The schema is designed so
documents can be partially updated. Some documents will need to
receive or lose filter tags in a multi-valued field.

I have to be able to query already existing documents to add tags to
them or remove tags from them. Obviously if I (soft-)commit after each
document is added or removed the serializable consistency would
guarantee that I can see all documents that I might want to change.
However this is not desirable in terms of performance.

I've come up with a potential solution: If I can track document
updates that I make and only call commit before I would query
documents that have been changed recently, the performance is not
sacrificed and I will get the strict consistency where I need it. For
this to work reliably the soft-auto-commit interval and the times
specified through commit-within must strictly comply with the config
and the requests.

I have auto-commit interval of 60 seconds with open-searcher false and
auto-soft-commit interval of 15 seconds. Documents will be submitted
through the REST API where some of them will also have commit-within
2-3 seconds specified.
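
For reference, the solrconfig.xml matching the intervals above would look
something like this (a sketch of my settings, not the full updateHandler
section):

<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>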

My questions:
 - After a document indexing request has returned with success, what
level of guarantee do I have that the document will be available after
the configured soft-commit-interval?
 - After a document indexing request with commit-within has returned
with success, what level of guarantee do I have that the document will
be available after the requested commit timeout?
 - Alternatively, if I could query solr for when the last soft-commit was
done, I could make sure to call a soft-commit myself. Is there an API to
see when, or how long ago, the last (soft-)commit happened?

I'm primarily interested in answers regards to solr in standalone mode.

Thanks,
Nandor


How to read tlog

2021-02-27 Thread Subhajit Das

Hi There,

I faced an issue on a core in a multicore standalone instance.
Is there any way to read tlog contents as text files? This might help to 
resolve the issue.
Thanks in advance.


RE: Select streaming expression, add a field to every tuple, replace or raw not working

2021-02-26 Thread ufuk yılmaz
I tried to debug this to the best of my ability, and it seems the correct name 
for the “raw” evaluator is “val”.

Copied from StreamContext: val=class 
org.apache.solr.client.solrj.io.eval.RawValueEvaluator

I think there’s a small error in the stream evaluator documentation for 8.4

https://lucene.apache.org/solr/guide/8_4/stream-evaluator-reference.html

When I used “val” instead of “raw”, I got the expected response:

select(
search(
myCollection,
q="*:*",
qt="/export",
sort="id_str asc",
fl="id_str"
),
id_str,
val(abc) as text
)

{
  "result-set": {
"docs": [
  {
"id_str": "deneme123",
"text": "abc"
  },
  {
"EOF": true,
"RESPONSE_TIME": 70
  }
]
  }
}

--ufuk yilmaz


Sent from Mail for Windows 10

From: ufuk yılmaz
Sent: 26 February 2021 16:38
To: solr-user@lucene.apache.org
Subject: Select streaming expression, add a field to every tuple, replaceor raw 
not working

Hello all,

Solr version 8.4

I have a very simple select expression here. What I’m trying to do is to add a 
constant value to incoming tuples.

My collection has only 1 document. Id_str is of type String. Other fields are 
Solr generated.

{
"_version_":1692761378187640832,
"id_str":"experiment123",
"id":"18d658b13b6b072f"}]
  }

My streaming expression:

select(
search(
myCollection,
q="*:*",
qt="/export",
sort="id_str asc",
fl="id_str"
),
id_str,
raw(ttt) as text // Docs state that select works with any 
evaluator. “raw” here is a stream evaluator.
)

I also tried:

select(
search(
myCollection,
q="*:*",
qt="/export",
sort="id_str asc",
fl="id_str"
),
id_str,
replace(text, null, withValue=raw(ttt)) as text //replace is 
described in select expression documentation. I also tried withValue=ttt 
directly
)

No matter what I do, response only includes id_str field, without any error:

{
  "result-set":{
"docs":[{
"id_str":" experiment123"}
  ,{
"EOF":true,
"RESPONSE_TIME":45}]}}

I also tried wrapping the text value with quotes; that didn’t work either.

What am I doing wrong?

--ufuk yilmaz

Sent from Mail for Windows 10




Select streaming expression, add a field to every tuple, replace or raw not working

2021-02-26 Thread ufuk yılmaz
Hello all,

Solr version 8.4

I have a very simple select expression here. What I’m trying to do is to add a 
constant value to incoming tuples.

My collection has only 1 document. Id_str is of type String. Other fields are 
Solr generated.

{
"_version_":1692761378187640832,
"id_str":"experiment123",
"id":"18d658b13b6b072f"}]
  }

My streaming expression:

select(
search(
myCollection,
q="*:*",
qt="/export",
sort="id_str asc",
fl="id_str"
),
id_str,
raw(ttt) as text // Docs state that select works with any 
evaluator. “raw” here is a stream evaluator.
)

I also tried:

select(
search(
myCollection,
q="*:*",
qt="/export",
sort="id_str asc",
fl="id_str"
),
id_str,
replace(text, null, withValue=raw(ttt)) as text //replace is 
described in select expression documentation. I also tried withValue=ttt 
directly
)

No matter what I do, response only includes id_str field, without any error:

{
  "result-set":{
"docs":[{
"id_str":" experiment123"}
  ,{
"EOF":true,
"RESPONSE_TIME":45}]}}

I also tried wrapping the text value with quotes; that didn’t work either.

What am I doing wrong?

--ufuk yilmaz

Sent from Mail for Windows 10



Jetty JNDI connection pooling

2021-02-26 Thread Srinivas Kashyap
Hi,

Our datasource is an Oracle DB and we are pulling data into Solr through JDBC (DIH). 
I have the below entry in jetty.xml:



<New id="tss" class="org.eclipse.jetty.plus.jndi.Resource">
  <Arg></Arg>
  <Arg>jdbc/tss</Arg>
  <Arg>
    <New class="oracle.jdbc.pool.OracleDataSource">
      <Set name="DriverType">thin</Set>
      <Set name="URL">jdbc:oracle:thin:@...:1521:ORCL</Set> <!-- host elided in the archive -->
      <Set name="User">XXX</Set>
      <Set name="Password">XXX</Set>
    </New>
  </Arg>
</New>
  

  

And we have added below entry in server/solr-webapp/webapp/WEB-INF/web.xml


jdbc/tss
javax.sql.DataSource
Container
  


What is the default connection pool limit for this datasource? Also, how do we set 
the max connections that can be made from Jetty?

Thanks,
Srinivas

DISCLAIMER:
E-mails and attachments from Bamboo Rose, LLC are confidential.
If you are not the intended recipient, please notify the sender immediately by 
replying to the e-mail, and then delete it without making copies or using it in 
any way.
No representation is made that this email or any attachments are free of 
viruses. Virus scanning is recommended and is the responsibility of the 
recipient.

Disclaimer

The information contained in this communication from the sender is 
confidential. It is intended solely for use by the recipient and others 
authorized to receive it. If you are not the recipient, you are hereby notified 
that any disclosure, copying, distribution or taking action in relation of the 
contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been 
automatically archived by Mimecast Ltd, an innovator in Software as a Service 
(SaaS) for business. Providing a safer and more useful place for your human 
generated data. Specializing in; Security, archiving and compliance. To find 
out more visit the Mimecast website.


Re: Add plugins to Solr docker container

2021-02-25 Thread Prabhatika Vij
Hey Anil,

If you want to execute anything before Solr starts, you can do the
following:

mkdir initdb; echo "echo hi" > initdb/hi.sh
docker run -v $PWD/initdb:/docker-entrypoint-initdb.d solr
Using the above, you can have any script executed before Solr starts.

Source:
https://github.com/docker-solr/docker-solr/blob/master/8.8/scripts/run-initdb
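
For the authentication/authorization plugins specifically, one option (an
untested sketch; standalone Solr reads security.json from SOLR_HOME, which is
/var/solr/data in the official image) is to mount a prepared security.json:

  docker run -v $PWD/security.json:/var/solr/data/security.json solr

For SolrCloud, the same file would instead be uploaded to ZooKeeper, e.g.
with "bin/solr zk cp".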

Hope this helps. Please feel free to ask any further questions.

I am replying to this mailing list for the first time. If I am not
following any convention, please let me know.

Thank you,
Prabhatika

On Fri, Feb 26, 2021 at 11:32 AM anilkumar panditi <
anilkumar.pand...@gmail.com> wrote:

> Hi,
> I am a first-time user of Apache Solr. I have brought up Solr as a
> Docker container, and I am unable to install/enable some plugins
> (authentication, authorization, etc.).
> Could you please help with how to add these plugins to Solr when it is
> running as a Docker container.
>
> Thanks
> Anil
>


Add plugins to Solr docker container

2021-02-25 Thread anilkumar panditi
Hi,
I am a first-time user of Apache Solr, and I have brought up Solr as a
Docker container, and I am unable to install/enable some plugins
(authentication, authorization, etc.).
Could you please help with how to add these plugins to Solr when it is running
as a Docker container.

Thanks
Anil


Rule Based Authorization

2021-02-25 Thread Subhajit Das
Hi There,

I am trying to create a rule based authorization for two types of user.

Role : Access
-
power-user : Everything except data changes in collections/cores
ui-user : UI access to view all data, but no edit access except data changes in 
collections/cores

How to implement this?

There are no predefined permissions for this. I tried adding all the read 
permissions to the ui-user, but it doesn't work. One of the failing APIs is 
“/admin/info/system”. I can't match this URL with a custom permission either.
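
For reference, the shape of the security.json I have been attempting is
roughly this (a sketch; user names are made up, and I have read, but not
verified, that node-level paths such as /admin/info/system need
"collection": null on the permission):

{
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {"name": "read", "role": ["power-user", "ui-user"]},
      {"name": "system-info", "path": "/admin/info/system",
       "collection": null, "role": ["power-user", "ui-user"]}
    ],
    "user-role": {"power1": "power-user", "ui1": "ui-user"}
  }
}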

Please help.

Thanks in advance.



How can I enable scoring on a DocList rather than a DocSet

2021-02-25 Thread krishan goyal
Hi,

I want to match and score on a sorted DocList.

The use case is something like this

   - Cache sorted results (with scores) of certain queries in the
   queryCache (This is a DocList)
   - New queries are superset of these cached queries and have dynamic
   scoring clauses
   - At runtime, I want to lookup results in query Cache and run matching
   and scoring on this list
  - This gives me a more dynamic, per-query way of using a pre-sorted
  data set where I can early-terminate sooner and more effectively
  and reduce latencies even further
  - In some cases, I can use the cached score along with new score too
  and avoid recomputation here.

The problem is currently the Scorer interface requires a DocIdSetIterator
and can't work on top of a DocList.

So does that mean, enabling this kind of optimisation requires using a
different Scorer & Weight interfaces or is there something I can do using
the current interfaces itself ?


RE: Handling Locales in Solr

2021-02-25 Thread Krönert Florian
Hi Markus,

thank you a lot for your response, that helps us very much.

We will try out the approach of separating the cores by topic only.

Kind Regards,
Florian 

-Original Message-
From: Markus Jelsma  
Sent: Mittwoch, 24. Februar 2021 12:27
To: solr-user@lucene.apache.org
Subject: Re: Handling Locales in Solr

Hello,

We put all our customers in the same core/collection because of this; it is not 
practical to manage hundreds of cores, each with its own small overhead,
although separate cores can be advantageous when it comes to relevance tuning: 
no skewed statistics because of other customers.

In your case, an unused core is probably slow because it is not in cached 
memory anymore, and/or it has to load from a slow drive.

With regards to the locales, i would probably separate the cores by topic only, 
and have different languages share the same collection/core.
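
As an illustration (field and type names are only examples; text_de/text_en
style field types ship with the default configset):

<field name="locale" type="string" indexed="true" stored="true"/>
<field name="title_de" type="text_de" indexed="true" stored="true"/>
<field name="title_en" type="text_en" indexed="true" stored="true"/>

Each request then filters with fq=locale:de-de (or searches the
locale-specific field) instead of picking a per-locale core, so no core
sits cold between searches.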

Regards,
Markus



Op wo 24 feb. 2021 om 12:09 schreef Krönert Florian <
florian.kroen...@orbis.de>:

> Hi everyone,
>
>
>
> First up thanks for this group, I appreciate it very much for 
> exchanging opinions on how to use Solr.
>
>
>
> We built a Solr instance for one of our customers which is used for 
> searching data on his website.
>
> We need to search different data (kb articles, products and external
> links) in different locales.
>
>
>
> For our logic it seemed best to separate solr Cores by topic and 
> locale, so we have cores like this:
>
> kbarticle_de-de
>
> kbarticle_en-us
>
> …
>
> products_de-de
>
> products_en-us
>
> …
>
> links_de-de
>
> links_en-us
>
>
>
> First we had only 3 locales, but it grew pretty fast to 16 locales, so 
> that we’re having 48 solr cores by now already.
>
> There would have been different approaches for realizing this of 
> course, so we’re wondering whether we are using Solr not in the optimal way?
>
>
>
> We found out that when a search on a locale that was not used for some 
> time is started, it takes >10 seconds in many cases to execute the search.
>
>
>
> We then find logs like this, where it seems as if Solr needs to start 
> a searcher first, which takes time:
>
> 2021-02-20 04:33:42.634 INFO  (Thread-20674) [   ]
> o.a.s.s.SolrIndexSearcher Opening [Searcher@775f8595[kbarticles_en-gb]
> main]
>
> 2021-02-20 04:33:42.643 INFO  (searcherExecutor-26-thread-1) [   ]
> o.a.s.c.QuerySenderListener QuerySenderListener sending requests to 
> Searcher@775f8595[kbarticles_en-gb]
>
> …
>
>
>
> Is that an issue? It would be good to know whether our localization 
> approach causes issues with Solr and whether we should restructure our 
> core design.
>
> Any help would be very much appreciated.
>
>
>
> Kind Regards,
>
>
>
> *Florian Krönert*
> Senior Software Developer
>
> 
>
> *ORBIS AG | *Planckstraße 10 | D-88677 Markdorf
>
> Phone: +49 7544 50398 21 | Mobile: +49 162 3065972 | E-Mail:
> florian.kroen...@orbis.de
> www.orbis.de
>
>
> 
>
> Registered Seat: Saarbrücken
> Commercial Register Court: Amtsgericht Saarbrücken, HRB 12022 Board of 
> Management: Thomas Gard (Chairman), Michael Jung, Stefan Mailänder, 
> Frank Schmelzer Chairman of the Supervisory Board: Ulrich Holzer
>
> 
> 
> 
> 
> 
>
>
>
>
>


Re: Solr Cloud Autoscaling Basics

2021-02-24 Thread yasoobhaider
Any pointers here would be appreciated :)



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Solr 7.6.0 - OOM Caused Down Replica. Cannot recover. Please advise

2021-02-24 Thread Ashwin Ramesh
Hi everyone,

We had an OOM event earlier this morning. This has caused one of our shards
to lose all its replicas, and its leader is still in a down state. We have
restarted the Java process (solr) and it's still in a down state. Logs
below:

```
Feb 25, 2021 @ 11:46:43.000 2021-02-25 00:46:43.268 WARN
 (updateExecutor-3-thread-1-processing-n:10.0.10.43:8983_solr
x:search-collection-2018-10-30_shard2_5_replica_n1480
c:search-collection-2018-10-30 s:shard2_5 r:core_node1481)
[c:search-collection-2018-10-30 s:shard2_5 r:core_node1481
x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:40.000 2021-02-25 00:46:40.759 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:35.000 2021-02-25 00:46:35.761 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:33.000 2021-02-25 00:46:33.270 WARN
 (updateExecutor-3-thread-2-processing-n:10.0.10.43:8983_solr
x:search-collection-2018-10-30_shard2_5_replica_n1480
c:search-collection-2018-10-30 s:shard2_5 r:core_node1481)
[c:search-collection-2018-10-30 s:shard2_5 r:core_node1481
x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:30.000 2021-02-25 00:46:30.759 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:25.000 2021-02-25 00:46:25.761 WARN
 (zkCallback-7-thread-2) [c:search-collection-2018-10-30 s:shard2_5
r:core_node1481 x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
Feb 25, 2021 @ 11:46:23.000 2021-02-25 00:46:23.279 WARN
 (updateExecutor-3-thread-1-processing-n:10.0.10.43:8983_solr
x:search-collection-2018-10-30_shard2_5_replica_n1480
c:search-collection-2018-10-30 s:shard2_5 r:core_node1481)
[c:search-collection-2018-10-30 s:shard2_5 r:core_node1481
x:search-collection-2018-10-30_shard2_5_replica_n1480]
o.a.s.c.RecoveryStrategy Stopping recovery for
core=[search-collection-2018-10-30_shard2_5_replica_n1480]
coreNodeName=[core_node1481] ∎
```

Questions:
1. Is there anything we can do to force this core to go live?
2. If the core is unrecoverable, is there a way to clear the core up such
that we can reindex only that shard?

Any other advice would be great too :)

Ash
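
If at least one healthy copy of the shard still exists, a common recovery path is to delete the dead replica via the Collections API and add a fresh one so it re-syncs from the leader. A minimal SolrJ sketch, with the zk host taken as a placeholder and the collection/shard/replica names copied from the logs above purely for illustration; note that if the down core is the shard's only copy, deleting it discards the data and the shard would indeed need reindexing (question 2):

```
import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class ReplaceDeadReplica {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
      // Drop the replica that is stuck in the "down" state.
      CollectionAdminRequest
          .deleteReplica("search-collection-2018-10-30", "shard2_5", "core_node1481")
          .process(client);
      // Create a fresh replica; it will sync its index from the shard leader.
      CollectionAdminRequest
          .addReplicaToShard("search-collection-2018-10-30", "shard2_5")
          .process(client);
    }
  }
}
```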


What controls field cache size and eviction rates?

2021-02-24 Thread Stephen Lewis Bianamara
Hi SOLR Community,

I've been trying to understand how the field cache in SOLR manages
its evictions, but neither the code nor the documentation gives an easy answer
to the simple question of when and how something gets evicted from the field
cache. This cache also doesn't show hit ratio, total hits, eviction ratio,
total evictions, etc. in the web UI.

For example: I've observed that if I write one document and trigger a query
with a sort on the field, it will generate two entries in the field cache.
Then if I repush the document, the entries get removed, but will otherwise
stay there seemingly forever. If my query matches 2 docs, same thing but
with 4 entries (2 each). Then, if I rewrite one of the docs, those two
entries go away but not the two from the first one. This clearly has
implications for write-throughput performance, so the fact that this cache is
not configurable by the user and doesn't have very clear documentation is a
bit worrisome.

Can someone here help out and explain how the field cache handles
evictions, or perhaps send me the documentation if I missed it?


Thanks!
Stephen
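
The field cache is a Lucene-level structure with no solrconfig.xml section, which is part of why it is invisible in the admin cache UI. Its per-searcher entry counts are exposed through the metrics API, though; a rough way to watch entries appear and vanish while re-running the repro above (host, port and the CACHE prefix are assumptions based on the standard metrics endpoint, so adjust for your setup):

```
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DumpCacheMetrics {
  public static void main(String[] args) throws Exception {
    // Per-core cache metrics, including fieldCache entry counts, live under
    // group=core with the CACHE prefix; dump them and diff between writes.
    URL url = new URL("http://localhost:8983/solr/admin/metrics?group=core&prefix=CACHE");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
```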


Re: Dynamic starting or stopping of zookeepers in a cluster

2021-02-24 Thread Shawn Heisey

On 2/24/2021 9:04 AM, DAVID MARTIN NIETO wrote:

If I'm not mistaken the number of zookeepers must be odd. Having 3 zoos on 3 
different machines, if we temporarily lost one of the three machines, we would 
have only two running and it would be an even number. Would it be advisable in 
this case to start a third ZooKeeper on one of the 2 active machines, or with 
only two zookeepers would there be no blockages in their internal votes?


It does not HAVE to be an odd number.  But increasing the total by one 
doesn't add any additional fault tolerance, and exposes an additional 
point of failure.


If you have 3 servers, 2 of them have to be running to maintain quorum. 
 If you have 4 servers, 3 of them have to be running for the cluster to 
be fully operational.


So a 3-server cluster and a 4-server cluster can survive the failure of 
one machine.  This holds true for larger numbers as well -- with 5 
servers or with 6 servers, you can lose two and stay fully operational. 
 Having that extra server that makes the total even is just wasteful.


Thanks,
Shawn
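
A tiny sketch of the majority-quorum arithmetic behind this (just an illustration of the math, not Solr code):

```
public class QuorumMath {
  public static void main(String[] args) {
    for (int n = 3; n <= 6; n++) {
      int quorum = n / 2 + 1;     // majority needed to stay operational
      int tolerated = n - quorum; // servers that may fail
      System.out.printf("%d servers: quorum=%d, survives %d failure(s)%n",
          n, quorum, tolerated);
    }
  }
}
```

Running it shows that 3 and 4 servers both survive one failure, and 5 and 6 both survive two; the even sizes buy nothing extra.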


RE: Dynamic starting or stopping of zookeepers in a cluster

2021-02-24 Thread DAVID MARTIN NIETO
One doubt about it:

In order to have a highly available zookeeper, you must have at least
three separate physical servers for ZK.  Running multiple zookeepers on
one physical machine gains you nothing ... because if the whole machine
fails, you lose all of those zookeepers.  If you have three physical
servers, one can fail with no problems.  If you have five separate
physical servers running ZK, then two of the machines can fail without
taking the cluster down.

If I'm not mistaken the number of zookeepers must be odd. Having 3 zoos on 3 
different machines, if we temporarily lost one of the three machines, we would 
have only two running and it would be an even number. Would it be advisable in 
this case to start a third ZooKeeper on one of the 2 active machines, or with 
only two zookeepers would there be no blockages in their internal votes?

About the dynamic reconfiguration, many thanks. We have Solr 8.2 but the zoos 
are on version 3.4.2; we're going to test version 3.5 and the dynamic 
reconfiguration of zookeepers to avoid this problem.

Many thanks.
Kind regards.
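
For reference, ZooKeeper 3.5's dynamic reconfiguration can be driven from the Java client as well as the CLI. A hedged sketch, assuming reconfigEnabled=true on the ensemble and placeholder host names; recent 3.5.x releases also gate reconfig behind superuser auth, which this sketch omits:

```
import org.apache.zookeeper.admin.ZooKeeperAdmin;

public class AddZkServer {
  public static void main(String[] args) throws Exception {
    // Connect to any live ensemble member; the no-op lambda is the session watcher.
    ZooKeeperAdmin admin = new ZooKeeperAdmin("zk1:2181", 30000, event -> { });
    admin.reconfigure(
        "server.4=zk4:2888:3888;2181", // joining server (client port after the semicolon)
        null,                          // no servers leaving
        null,                          // newMembers unused in incremental mode
        -1,                            // -1: don't pin to a specific config version
        null);                         // Stat out-param, not needed here
    admin.close();
  }
}
```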



From: Joe Lerner
Sent: Friday, 19 February 2021 18:56
To: solr-user@lucene.apache.org
Subject: Re: Dynamic starting or stopping of zookeepers in a cluster

This is solid information. *How about the application, which uses
SOLR/Zookeeper?*

Do we have to follow this guidance, to make the application ZK config aware:

https://zookeeper.apache.org/doc/r3.5.5/zookeeperReconfig.html#ch_reconfig_rebalancing


Or, could we leave it as is, and as long as the ZK Ensemble has the same
IPs?

Thanks!

Joe




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-02-24 Thread Phill Campbell
Last week I switched to Solr 8.7 from a “special” build of Solr 6.6.

The system has a timeout set for querying. I am now seeing this bug.

https://issues.apache.org/jira/browse/SOLR-14758 


Max Query Time goes from 1.6 seconds to 20 seconds and affects the entire 
system for about 2 minutes as reported in New Relic.

null:java.lang.NullPointerException
at 
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:935)
at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:626)
at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:486)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627)


Can this be fixed in a patch for Solr 8.8? I do not want to have to go back to 
Solr 6 and reindex the system, that takes 2 days using 180 EMR instances.

Please advise. Thank you.
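
SOLR-14758 describes an NPE in mergeIds on distributed queries that combine timeAllowed with a sort, when a shard comes back with partial results. For anyone reproducing it, a minimal SolrJ sketch of the triggering request shape (zk host, collection and sort field are placeholders):

```
import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class TimeAllowedSortQuery {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
      SolrQuery q = new SolrQuery("*:*");
      q.setTimeAllowed(1600);                       // ms budget; exceeding it yields partial results
      q.setSort("timestamp", SolrQuery.ORDER.desc); // the sort is part of what triggers SOLR-14758
      System.out.println(client.query("mycollection", q).getResults().getNumFound());
    }
  }
}
```

Until a fix ships, possible mitigations are raising timeAllowed, omitting it on sorted queries, or dropping the sort where partial results are acceptable.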


Re: Overriding Sort and boosting some docs to the top

2021-02-24 Thread Mark Robinson
Thanks Marcus for your response.

Best,
Mark

On Wed, Feb 24, 2021 at 4:50 PM Markus Jelsma 
wrote:

> I would stick to the query elevation component, it is pretty fast and
> easier to handle/configure elevation IDs, instead of using function queries
> for it. We have customers that set a dozen of documents for a given query
> and it works just fine.
>
> I also do not expect the function query variant to be more performant, but
> i am not sure. If it were, would it be measurable?
>
> Regards,
> Markus
>
> On Wed, Feb 24, 2021 at 12:15 Mark Robinson wrote:
>
> > Thanks for the reply Markus!
> >
> > I did try it.
> > My question specifically was (repasting here):-
> >
> > Which is more recommended/ performant?
> >
> > Note:- Assume that I have hundreds of ids to boost like this.
> > Is there a difference to the answer if docs to be boosted after the sort
> is
> > less?
> >
> > Thanks!
> > Mark
> >
> > On Wed, Feb 24, 2021 at 4:41 PM Markus Jelsma <
> markus.jel...@openindex.io>
> > wrote:
> >
> > > Hello,
> > >
> > > You are probably looking for the elevator component, check it out:
> > >
> >
> https://lucene.apache.org/solr/guide/8_8/the-query-elevation-component.html
> > >
> > > Regards,
> > > Markus
> > >
> > > On Wed, Feb 24, 2021 at 11:59 Mark Robinson <mark123lea...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I wanted to sort and then boost some docs to the top and these docs
> > > should
> > > > be my first set in the results and the following ones appearing
> > according
> > > > to my sort criteria.
> > > >
> > > > I understand that sort overrides bq hence bq may not be used in this
> > case
> > > >
> > > > - I brought my boost into sort using "query()" and achieved my goal.
> > > > - I tried sort and then elevate with forceElevation and that also
> > worked.
> > > >
> > > > My question is which is more recommended/ performant?
> > > >
> > > > Note:- Assume that I have hundreds of ids to boost like this.
> > > > Is there a difference to the answer if docs to be boosted after the
> > sort
> > > is
> > > > less?
> > > >
> > > > Could someone please share your thoughts/experience?
> > > >
> > > > Thanks!
> > > > Mark.
> > > >
> > >
> >
>


Re: Handling Locales in Solr

2021-02-24 Thread Markus Jelsma
Hello,

We put all our customers in the same core/collection because of this; it is
not practical to manage hundreds of cores, each of which adds a small overhead.
Separate cores can be advantageous when it comes to relevance tuning, though,
since statistics are not skewed by other customers' data.

In your case, an unused core is probably slow because it is not in cached
memory anymore, and/or it has to load from a slow drive.

With regards to the locales, i would probably separate the cores by topic
only, and have different languages share the same collection/core.

Regards,
Markus
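
If the cores must stay separate, one stopgap (an editor's suggestion, not something from the thread) is to keep the rarely used searchers warm with a cheap periodic ping per core, so the first real user query does not pay the >10 second searcher-opening cost. A sketch with SolrJ, where the host, core names and interval are all assumptions:

```
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class CoreWarmer {
  public static void main(String[] args) {
    List<String> cores = List.of("kbarticle_de-de", "kbarticle_en-us", "products_de-de");
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleAtFixedRate(() -> {
      for (String core : cores) {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/" + core).build()) {
          client.query(new SolrQuery("*:*").setRows(0)); // cheap query keeps the searcher hot
        } catch (Exception e) {
          System.err.println("warm ping failed for " + core + ": " + e);
        }
      }
    }, 0, 5, TimeUnit.MINUTES);
  }
}
```

Building a client per ping is wasteful but keeps the sketch short; a real version would cache one client per core. Solr's own firstSearcher/newSearcher warming listeners in solrconfig.xml address the same problem after commits.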



On Wed, Feb 24, 2021 at 12:09 Krönert Florian <florian.kroen...@orbis.de> wrote:

> Hi everyone,
>
>
>
> First up thanks for this group, I appreciate it very much for exchanging
> opinions on how to use Solr.
>
>
>
> We built a Solr instance for one of our customers which is used for
> searching data on his website.
>
> We need to search different data (kb articles, products and external
> links) in different locales.
>
>
>
> For our logic it seemed best to separate solr Cores by topic and locale,
> so we have cores like this:
>
> kbarticle_de-de
>
> kbarticle_en-us
>
> …
>
> products_de-de
>
> products_en-us
>
> …
>
> links_de-de
>
> links_en-us
>
>
>
> First we had only 3 locales, but it grew pretty fast to 16 locales, so
> that we now have 48 solr cores already.
>
> There would have been different approaches for realizing this of course,
> so we’re wondering whether we are using Solr in a suboptimal way.
>
>
>
> We found out that when a search on a locale that was not used for some
> time is started, it takes >10 seconds in many cases to execute the search.
>
>
>
> We then find logs like this, where it seems as if Solr needs to start a
> searcher first, which takes time:
>
> 2021-02-20 04:33:42.634 INFO  (Thread-20674) [   ]
> o.a.s.s.SolrIndexSearcher Opening [Searcher@775f8595[kbarticles_en-gb]
> main]
>
> 2021-02-20 04:33:42.643 INFO  (searcherExecutor-26-thread-1) [   ]
> o.a.s.c.QuerySenderListener QuerySenderListener sending requests to
> Searcher@775f8595[kbarticles_en-gb]
>
> …
>
>
>
> Is that an issue? It would be good to know whether our localization
> approach causes issues with Solr and whether we should restructure our core
> design.
>
> Any help would be very much appreciated.
>
>
>
> Kind Regards,
>
>
>
> *Florian Krönert*
> Senior Software Developer
>
> 
>
> *ORBIS AG | *Planckstraße 10 | D-88677 Markdorf
>
> Phone: +49 7544 50398 21 | Mobile: +49 162 3065972 | E-Mail:
> florian.kroen...@orbis.de
> www.orbis.de
>
>
> 
>
> Registered Seat: Saarbrücken
> Commercial Register Court: Amtsgericht Saarbrücken, HRB 12022
> Board of Management: Thomas Gard (Chairman), Michael Jung, Stefan
> Mailänder, Frank Schmelzer
> Chairman of the Supervisory Board: Ulrich Holzer
>


Re: Overriding Sort and boosting some docs to the top

2021-02-24 Thread Markus Jelsma
I would stick to the query elevation component, it is pretty fast and
easier to handle/configure elevation IDs, instead of using function queries
for it. We have customers that set a dozen of documents for a given query
and it works just fine.

I also do not expect the function query variant to be more performant, but
i am not sure. If it were, would it be measurable?

Regards,
Markus

On Wed, Feb 24, 2021 at 12:15 Mark Robinson wrote:

> Thanks for the reply Markus!
>
> I did try it.
> My question specifically was (repasting here):-
>
> Which is more recommended/ performant?
>
> Note:- Assume that I have hundreds of ids to boost like this.
> Is there a difference to the answer if docs to be boosted after the sort is
> less?
>
> Thanks!
> Mark
>
> On Wed, Feb 24, 2021 at 4:41 PM Markus Jelsma 
> wrote:
>
> > Hello,
> >
> > You are probably looking for the elevator component, check it out:
> >
> https://lucene.apache.org/solr/guide/8_8/the-query-elevation-component.html
> >
> > Regards,
> > Markus
> >
> > On Wed, Feb 24, 2021 at 11:59 Mark Robinson <mark123lea...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I wanted to sort and then boost some docs to the top and these docs
> > should
> > > be my first set in the results and the following ones appearing
> according
> > > to my sort criteria.
> > >
> > > I understand that sort overrides bq hence bq may not be used in this
> case
> > >
> > > - I brought my boost into sort using "query()" and achieved my goal.
> > > - I tried sort and then elevate with forceElevation and that also
> worked.
> > >
> > > My question is which is more recommended/ performant?
> > >
> > > Note:- Assume that I have hundreds of ids to boost like this.
> > > Is there a difference to the answer if docs to be boosted after the
> sort
> > is
> > > less?
> > >
> > > Could someone please share your thoughts/experience?
> > >
> > > Thanks!
> > > Mark.
> > >
> >
>


Re: Overriding Sort and boosting some docs to the top

2021-02-24 Thread Mark Robinson
Thanks for the reply Markus!

I did try it.
My question specifically was (repasting here):-

Which is more recommended/ performant?

Note:- Assume that I have hundreds of ids to boost like this.
Is there a difference to the answer if docs to be boosted after the sort is
less?

Thanks!
Mark

On Wed, Feb 24, 2021 at 4:41 PM Markus Jelsma 
wrote:

> Hello,
>
> You are probably looking for the elevator component, check it out:
> https://lucene.apache.org/solr/guide/8_8/the-query-elevation-component.html
>
> Regards,
> Markus
>
> On Wed, Feb 24, 2021 at 11:59 Mark Robinson wrote:
>
> > Hi,
> >
> > I wanted to sort and then boost some docs to the top and these docs
> should
> > be my first set in the results and the following ones appearing according
> > to my sort criteria.
> >
> > I understand that sort overrides bq hence bq may not be used in this case
> >
> > - I brought my boost into sort using "query()" and achieved my goal.
> > - I tried sort and then elevate with forceElevation and that also worked.
> >
> > My question is which is more recommended/ performant?
> >
> > Note:- Assume that I have hundreds of ids to boost like this.
> > Is there a difference to the answer if docs to be boosted after the sort
> is
> > less?
> >
> > Could someone please share your thoughts/experience?
> >
> > Thanks!
> > Mark.
> >
>


Re: Overriding Sort and boosting some docs to the top

2021-02-24 Thread Markus Jelsma
Hello,

You are probably looking for the elevator component, check it out:
https://lucene.apache.org/solr/guide/8_8/the-query-elevation-component.html

Regards,
Markus

On Wed, Feb 24, 2021 at 11:59 Mark Robinson wrote:

> Hi,
>
> I wanted to sort and then boost some docs to the top and these docs should
> be my first set in the results and the following ones appearing according
> to my sort criteria.
>
> I understand that sort overrides bq hence bq may not be used in this case
>
> - I brought my boost into sort using "query()" and achieved my goal.
> - I tried sort and then elevate with forceElevation and that also worked.
>
> My question is which is more recommended/ performant?
>
> Note:- Assume that I have hundreds of ids to boost like this.
> Is there a difference to the answer if docs to be boosted after the sort is
> less?
>
> Could someone please share your thoughts/experience?
>
> Thanks!
> Mark.
>


Handling Locales in Solr

2021-02-24 Thread Krönert Florian
Hi everyone,

First up thanks for this group, I appreciate it very much for exchanging 
opinions on how to use Solr.

We built a Solr instance for one of our customers which is used for searching 
data on his website.
We need to search different data (kb articles, products and external links) in 
different locales.

For our logic it seemed best to separate solr Cores by topic and locale, so we 
have cores like this:
kbarticle_de-de
kbarticle_en-us
...
products_de-de
products_en-us
...
links_de-de
links_en-us

First we had only 3 locales, but it grew pretty fast to 16 locales, so that 
we now have 48 solr cores already.
There would have been different approaches for realizing this of course, so 
we're wondering whether we are using Solr in a suboptimal way.

We found out that when a search on a locale that was not used for some time is 
started, it takes >10 seconds in many cases to execute the search.

We then find logs like this, where it seems as if Solr needs to start a 
searcher first, which takes time:
2021-02-20 04:33:42.634 INFO  (Thread-20674) [   ] o.a.s.s.SolrIndexSearcher 
Opening [Searcher@775f8595[kbarticles_en-gb] main]
2021-02-20 04:33:42.643 INFO  (searcherExecutor-26-thread-1) [   ] 
o.a.s.c.QuerySenderListener QuerySenderListener sending requests to 
Searcher@775f8595[kbarticles_en-gb]
...

Is that an issue? It would be good to know whether our localization approach 
causes issues with Solr and whether we should restructure our core design.
Any help would be very much appreciated.

Kind Regards,

Florian Krönert
Senior Software Developer

ORBIS AG | Planckstraße 10 | D-88677 Markdorf
Phone: +49 7544 50398 21 | Mobile: +49 162 3065972 | E-Mail: 
florian.kroen...@orbis.de
www.orbis.de


Registered Seat: Saarbrücken
Commercial Register Court: Amtsgericht Saarbrücken, HRB 12022
Board of Management: Thomas Gard (Chairman), Michael Jung, Stefan Mailänder, 
Frank Schmelzer
Chairman of the Supervisory Board: Ulrich Holzer


Overriding Sort and boosting some docs to the top

2021-02-24 Thread Mark Robinson
Hi,

I wanted to sort and then boost some docs to the top and these docs should
be my first set in the results and the following ones appearing according
to my sort criteria.

I understand that sort overrides bq hence bq may not be used in this case

- I brought my boost into sort using "query()" and achieved my goal.
- I tried sort and then elevate with forceElevation and that also worked.

My question is which is more recommended/ performant?

Note:- Assume that I have hundreds of ids to boost like this.
Is there a difference to the answer if docs to be boosted after the sort is
less?

Could someone please share your thoughts/experience?

Thanks!
Mark.
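
Both variants Mark describes can be expressed as plain request parameters; here is a hedged side-by-side sketch (ids and field names are invented, and the elevation variant assumes the QueryElevationComponent is enabled in solrconfig.xml):

```
import org.apache.solr.client.solrj.SolrQuery;

public class BoostThenSort {
  public static void main(String[] args) {
    // Variant 1: fold the boost into the sort via query().
    // Docs matching $boostq score above the 0 default and sort first;
    // everything else falls through to the price sort.
    SolrQuery bySort = new SolrQuery("laptop");
    bySort.set("boostq", "id:(doc1 doc2 doc3)");
    bySort.setSort("query($boostq,0)", SolrQuery.ORDER.desc);
    bySort.addSort("price", SolrQuery.ORDER.asc);

    // Variant 2: elevation component, forcing elevated docs above the sort.
    SolrQuery byElevate = new SolrQuery("laptop");
    byElevate.setSort("price", SolrQuery.ORDER.asc);
    byElevate.set("enableElevation", "true");
    byElevate.set("forceElevation", "true");       // keep elevated docs on top despite the sort
    byElevate.set("elevateIds", "doc1,doc2,doc3"); // per-request ids, no elevate.xml edit

    System.out.println(bySort + "\n" + byElevate);
  }
}
```

With hundreds of ids both approaches send a large parameter on every request, but the elevation variant avoids parsing a correspondingly large function query per request, which is consistent with Markus's recommendation to prefer it.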


Cross join on multivalued field

2021-02-23 Thread Luke Oak
Hi,

I am wondering whether there is planning to implement cross collections join 
query on multivalued field 

Thanks 

Sent from my iPhone
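
For context, the cross-collection join added in Solr 8.6 looks like this (collection and field names invented for illustration); the question above is about extending it to multivalued join fields, so treat multivalued keys as unsupported until the docs or a Jira say otherwise:

```
import org.apache.solr.client.solrj.SolrQuery;

public class CrossCollectionJoin {
  public static void main(String[] args) {
    // Products whose id appears as product_id in matching docs of the
    // "reviews" collection; method=crossCollection streams the join keys
    // from the remote collection.
    SolrQuery q = new SolrQuery(
        "{!join method=crossCollection fromIndex=reviews from=product_id to=id "
            + "zkHost=zk1:2181}stars:[4 TO *]");
    System.out.println(q);
  }
}
```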
