Re: new tlog files are not created per commit but adding into latest existing tlog file after replica reload

2021-03-04 Thread Michael Hu
Hi experts:

After I sent the previous email, I issued a commit on that replica core and 
observed the same "ClosedChannelException"; please refer to the "issuing 
core commit" section below.

Then I issued a core reload, and I saw the timestamp of the latest tlog file 
change; please refer to the "files under tlog directory" section below. I am 
not sure whether this information is useful.

Thank you!

--Michael Hu

--- beginning for issuing core commit ---

$ curl 'http://localhost:8983/solr/myconection_myshard_replica_t7/update?commit=true'
{
  "responseHeader":{
    "status":500,
    "QTime":71},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","java.nio.channels.ClosedChannelException"],
    "msg":"java.nio.channels.ClosedChannelException",
    "trace":"org.apache.solr.common.SolrException:

--- end for issuing core commit ---

--- beginning for files under tlog directory ---
before core reload:

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991713 Mar  4 20:29 tlog.883


after core reload:

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991717 Mar  4 22:23 tlog.883


--- end for files under tlog directory ---



From: Michael Hu 
Sent: Thursday, March 4, 2021 1:58 PM
To: solr-user@lucene.apache.org 
Subject: new tlog files are not created per commit but adding into latest 
existing tlog file after replica reload

Hi experts:

I need some help and suggestions about an issue I am facing.

Solr info:
 - Solr 8.7
 - Solr cloud with tlog replica; replica size is 3 for my Solr collection

Issue:
 - before issuing the collection reload, I observed that a new tlog file was 
created after every commit, and those tlog files were deleted after a while 
(maybe after the index segments were merged?)
 - then I issued a collection reload on my collection using the Collections 
API at 20:15
 - after the leader replica was reloaded, no new tlog files were created; 
instead the latest tlog file keeps growing, and no tlog files have been 
deleted since the reload. The "files under tlog directory" section below is a 
snapshot of the files under the tlog directory of the leader replica. Again, 
I issued the collection reload at 20:15, and since then tlog.883 has been 
growing
 - I looked into the log file and found the error entries shown in the "log 
entries" section below; the entry repeats for every auto commit after the 
reload. I hope it provides some information about the issue.

Please suggest what I may be doing incorrectly. Or, if this is a known issue, 
is there a way I can fix or work around it?

Thank you so much!

--Michael Hu

--- beginning for files under tlog directory ---

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991713 Mar  4 20:29 tlog.883

--- end for files under tlog directory ---

--- beginning for log entries ---

2021-03-04 20:15:38.251 ERROR (commitScheduler-4327-thread-1) [c:mycollection s:myshard r:core_node10 x:mycolletion_myshard_replica_t7] o.a.s.u.CommitTracker auto commit error...:
org.apache.solr.common.SolrException: java.nio.channels.ClosedChannelException
	at org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:503)
	at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:835)
	at org.apache.solr.update.UpdateLog.preCommit(UpdateLog.java:819)
	at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:673)
	at org.apache.solr.update.CommitTracker.run(CommitTracker.java:273)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)

Re: Programmatic Basic Auth on CloudSolrClient

2021-03-04 Thread Tomás Fernández Löbbe
Ah, right, now I remember that something like this was possible with the
"http1" version of the clients, which is why I created the Jira issues for
the http2 ones. Maybe you can even skip the "LBHttpSolrClient" step; I
believe you can pass the HttpClient directly to the CloudSolrClient. You
will have to make sure to close all the clients that are created externally
once you are done, since the Solr client won't close them in this case.
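Untested, but something like this sketch is what I mean (SolrJ 8.x with the
Apache HttpClient based clients; the zk host and credentials are
placeholders, and I'm assuming CloudSolrClient.Builder inherits
withHttpClient() from SolrClientBuilder):

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class BasicAuthCloudClient {
  public static void main(String[] args) throws Exception {
    // HttpClient that answers Solr's 401 challenge with these credentials
    BasicCredentialsProvider creds = new BasicCredentialsProvider();
    creds.setCredentials(AuthScope.ANY,
        new UsernamePasswordCredentials("solr-user", "s3cret"));
    CloseableHttpClient http =
        HttpClients.custom().setDefaultCredentialsProvider(creds).build();

    // Hand the pre-configured HttpClient straight to CloudSolrClient
    try (CloudSolrClient solr = new CloudSolrClient.Builder(
            Collections.singletonList("zkhost:2181"), Optional.empty())
        .withHttpClient(http)
        .build()) {
      // ... run requests here ...
    } finally {
      http.close(); // we created it externally, so we must close it
    }
  }
}
```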

On Thu, Mar 4, 2021 at 1:22 PM Mark H. Wood  wrote:

> On Wed, Mar 03, 2021 at 10:34:50AM -0800, Tomás Fernández Löbbe wrote:
> > As far as I know the current OOTB options are system properties or
> > per-request (which would allow you to use different per collection, but
> > probably not ideal if you do different types of requests from different
> > parts of your code). A workaround (which I've used in the past) is to
> have
> > a custom client that overrides and sets the credentials in the "request"
> > method (you can put whatever logic there to identify which credentials to
> > use). I recently created
> https://issues.apache.org/jira/browse/SOLR-15154
> > and https://issues.apache.org/jira/browse/SOLR-15155 to try to address
> this
> > issue in future releases.
>
> I have not tried it, but could you not:
>
> 1. set up an HttpClient with an appropriate CredentialsProvider;
> 2. pass it to HttpSolrClient.Builder.withHttpClient();
> 3. pass that Builder to
> LBHttpSolrClient.Builder.withHttpSolrClientBuilder();
> 4. pass *that* Builder to
> CloudSolrClient.Builder.withLBHttpSolrClientBuilder();
>
> Now you have control of the CredentialsProvider and can have it return
> whatever credentials you wish, so long as you still have a reference
> to it.
>
> > On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das 
> wrote:
> >
> > >
> > > Hi There,
> > >
> > > Is there any way to programmatically set basic authentication
> > > credentials on CloudSolrClient?
> > >
> > > The only documentation available is to use a system property. This is
> > > not useful if two collections require two separate sets of credentials
> > > and they are accessed in parallel.
> > > Thanks in advance.
> > >
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>


new tlog files are not created per commit but adding into latest existing tlog file after replica reload

2021-03-04 Thread Michael Hu
Hi experts:

I need some help and suggestions about an issue I am facing.

Solr info:
 - Solr 8.7
 - Solr cloud with tlog replica; replica size is 3 for my Solr collection

Issue:
 - before issuing the collection reload, I observed that a new tlog file was 
created after every commit, and those tlog files were deleted after a while 
(maybe after the index segments were merged?)
 - then I issued a collection reload on my collection using the Collections 
API at 20:15
 - after the leader replica was reloaded, no new tlog files were created; 
instead the latest tlog file keeps growing, and no tlog files have been 
deleted since the reload. The "files under tlog directory" section below is a 
snapshot of the files under the tlog directory of the leader replica. Again, 
I issued the collection reload at 20:15, and since then tlog.883 has been 
growing
 - I looked into the log file and found the error entries shown in the "log 
entries" section below; the entry repeats for every auto commit after the 
reload. I hope it provides some information about the issue.

Please suggest what I may be doing incorrectly. Or, if this is a known issue, 
is there a way I can fix or work around it?

Thank you so much!

--Michael Hu

--- beginning for files under tlog directory ---

-rw-r--r-- 1 solr solr   47527321 Mar  4 20:14 tlog.877
-rw-r--r-- 1 solr solr   42614907 Mar  4 20:14 tlog.878
-rw-r--r-- 1 solr solr   37524663 Mar  4 20:14 tlog.879
-rw-r--r-- 1 solr solr   44067997 Mar  4 20:14 tlog.880
-rw-r--r-- 1 solr solr   33209784 Mar  4 20:15 tlog.881
-rw-r--r-- 1 solr solr   55435186 Mar  4 20:15 tlog.882
-rw-r--r-- 1 solr solr 2179991713 Mar  4 20:29 tlog.883

--- end for files under tlog directory ---

--- beginning for log entries ---

2021-03-04 20:15:38.251 ERROR (commitScheduler-4327-thread-1) [c:mycollection s:myshard r:core_node10 x:mycolletion_myshard_replica_t7] o.a.s.u.CommitTracker auto commit error...:
org.apache.solr.common.SolrException: java.nio.channels.ClosedChannelException
	at org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:503)
	at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:835)
	at org.apache.solr.update.UpdateLog.preCommit(UpdateLog.java:819)
	at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:673)
	at org.apache.solr.update.CommitTracker.run(CommitTracker.java:273)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.nio.channels.ClosedChannelException
	at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
	at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:266)
	at java.base/java.nio.channels.Channels.writeFullyImpl(Channels.java:74)
	at java.base/java.nio.channels.Channels.writeFully(Channels.java:97)
	at java.base/java.nio.channels.Channels$1.write(Channels.java:172)
	at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:216)
	at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:209)
	at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:193)
	at org.apache.solr.update.TransactionLog.writeCommit(TransactionLog.java:498)
	... 10 more

--- end for log entries ---



Re: Programmatic Basic Auth on CloudSolrClient

2021-03-04 Thread Mark H. Wood
On Wed, Mar 03, 2021 at 10:34:50AM -0800, Tomás Fernández Löbbe wrote:
> As far as I know the current OOTB options are system properties or
> per-request (which would allow you to use different per collection, but
> probably not ideal if you do different types of requests from different
> parts of your code). A workaround (which I've used in the past) is to have
> a custom client that overrides and sets the credentials in the "request"
> method (you can put whatever logic there to identify which credentials to
> use). I recently created https://issues.apache.org/jira/browse/SOLR-15154
> and https://issues.apache.org/jira/browse/SOLR-15155 to try to address this
> issue in future releases.

I have not tried it, but could you not:

1. set up an HttpClient with an appropriate CredentialsProvider;
2. pass it to HttpSolrClient.Builder.withHttpClient();
3. pass that Builder to LBHttpSolrClient.Builder.withHttpSolrClientBuilder();
4. pass *that* Builder to CloudSolrClient.Builder.withLBHttpSolrClientBuilder();

Now you have control of the CredentialsProvider and can have it return
whatever credentials you wish, so long as you still have a reference
to it.
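I have not tried compiling this either, but the chain above might look
roughly like the following sketch (SolrJ 8.x; the zk host, credentials, and
class name are placeholders, and withLBHttpSolrClientBuilder() may be
deprecated in later releases):

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.impl.LBHttpSolrClient;

public class BuilderChainSketch {
  public static void main(String[] args) throws Exception {
    // 1. HttpClient with a CredentialsProvider we keep a reference to
    BasicCredentialsProvider creds = new BasicCredentialsProvider();
    creds.setCredentials(AuthScope.ANY,
        new UsernamePasswordCredentials("solr-user", "s3cret"));
    CloseableHttpClient http =
        HttpClients.custom().setDefaultCredentialsProvider(creds).build();

    // 2. HttpSolrClient.Builder that uses that HttpClient
    HttpSolrClient.Builder httpBuilder =
        new HttpSolrClient.Builder().withHttpClient(http);

    // 3. LBHttpSolrClient.Builder wrapping the HttpSolrClient.Builder
    LBHttpSolrClient.Builder lbBuilder =
        new LBHttpSolrClient.Builder().withHttpSolrClientBuilder(httpBuilder);

    // 4. CloudSolrClient.Builder wrapping the LB builder
    try (CloudSolrClient cloud = new CloudSolrClient.Builder(
            Collections.singletonList("zkhost:2181"), Optional.empty())
        .withLBHttpSolrClientBuilder(lbBuilder)
        .build()) {
      // mutate creds here to return different credentials while the
      // client is alive, since we still hold a reference to it
    } finally {
      http.close(); // externally created clients must be closed by the caller
    }
  }
}
```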

> On Wed, Mar 3, 2021 at 5:42 AM Subhajit Das  wrote:
> 
> >
> > Hi There,
> >
> > Is there any way to programmatically set basic authentication credentials
> > on CloudSolrClient?
> >
> > The only documentation available is to use a system property. This is not
> > useful if two collections require two separate sets of credentials and
> > they are accessed in parallel.
> > Thanks in advance.
> >

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu


signature.asc
Description: PGP signature


graph traversal filter which uses document value in the query

2021-03-04 Thread Lee Carroll
Hi All,
I'm using the graph query parser to traverse a set of edge documents. An
edge looks like

"id":"edge1", "recordType":"journey", "Date":"2021-03-04T00:00:00Z",
"Origin":"AAC", "OriginLocalDateTime":"2021-03-04T05:00:00Z",
"Destination":"AAB", "DestinationLocalDateTime":"2021-03-04T07:00:00Z"

I'd like to collect the journeys needed to travel from an origin city to a
destination city in a single hop (a-b-c), where all journeys are made on the
same day. I'm using a traversal filter to enforce the same-day criterion,
but the function field parameter, which I expect to return the document's
date value, is being ignored.
For example, a query to get all journeys from AAA to AAB is:

q={!graph
   maxDepth=1
   from=Origin
   to=Destination
   traversalFilter='Date:{!func}Date'
} Origin:AAA & fq=DestinationAirportCode:AAB || originAirportCode:AAA

What is the correct approach for this problem?

Cheers Lee C


Re: Get first value in a multivalued field

2021-03-04 Thread Walter Underwood
You can copy the field to another field, then use
FirstFieldValueUpdateProcessorFactory to limit that field to its first
value. At least, that seems to be what that URP does; I have not used it
myself.

https://solr.apache.org/guide/8_8/update-request-processors.html
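Untested, but from the ref guide I'd expect the chain in solrconfig.xml to
look something like this (the chain name and the field names "tags" and
"tag_first" are made up for illustration):

```xml
<!-- Hypothetical chain: clone multivalued "tags" into "tag_first",
     then keep only the first value of "tag_first". -->
<updateRequestProcessorChain name="first-value-example">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">tags</str>
    <str name="dest">tag_first</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">tag_first</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Note this runs at index time, so it would only apply to documents indexed
through that chain.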

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 4, 2021, at 11:42 AM, ufuk yılmaz  wrote:
> 
> Hi,
> 
> Is it possible in any way to get the first value in a multivalued field? 
> Using function queries, streaming expressions or any other way without 
> reindexing? (Stream decorators have array(), but no way to get a value at a 
> specific index?)
> 
> Another one, is it possible to match a regex to a text field and extract only 
> the matching part?
> 
> I tried very hard for this too but couldn’t find a way.
> 
> --ufuk
> 
> Sent from Mail for Windows 10
> 



Get first value in a multivalued field

2021-03-04 Thread ufuk yılmaz
Hi,

Is it possible in any way to get the first value in a multivalued field? Using 
function queries, streaming expressions or any other way without reindexing? 
(Stream decorators have array(), but no way to get a value at a specific index?)

Another one, is it possible to match a regex to a text field and extract only 
the matching part?

I tried very hard for this too but couldn’t find a way.

--ufuk

Sent from Mail for Windows 10



Re: wordpress anyone?

2021-03-04 Thread dmitri maziuk

On 2021-03-03 10:24 PM, Gora Mohanty wrote:

> ... there does seem to be another plugin that is open-source, and hosted
> on GitHub: https://wordpress.org/plugins/solr-power/


I saw it; they lost me at

"you'll need access to a functioning Solr 3.6 instance for the plugin to 
work as expected. This plugin does not support other versions of Solr."


Dima



Re: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

2021-03-04 Thread Ere Maijala

Hi,

Solr uses JIRA for issue tickets. You can find it here: 
https://issues.apache.org/jira/browse/SOLR


I'd suggest filing a new bug issue in the SOLR project (note that 
several other projects also use this JIRA installation). Here's an 
example of an existing highlighter issue for reference: 
https://issues.apache.org/jira/browse/SOLR-14019.


See also some brief documentation:

https://cwiki.apache.org/confluence/display/solr/HowToContribute#HowToContribute-JIRAtips(ourissue/bugtracker)

Regards,
Ere

Flowerday, Matthew J wrote on 1.3.2021 at 14.58:

Hi Ere

Pleased to be of service!

No, I have not filed a JIRA ticket. I am new to interacting with the Solr
community and am only beginning to 'find my legs'. I am not too sure what
JIRA is, I am afraid!

Regards

Matthew

Matthew Flowerday | Consultant | ULEAF
Unisys | 01908 774830| matthew.flower...@unisys.com
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX



THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.



-Original Message-
From: Ere Maijala 
Sent: 01 March 2021 12:53
To: solr-user@lucene.apache.org
Subject: Re: Potential Slow searching for unified highlighting on Solr
8.8.0/8.8.1

EXTERNAL EMAIL - Be cautious of all links and attachments.

Hi,

Whoa, thanks for the heads-up! You may just have saved me from a whole lot
of trouble. Did you file a JIRA ticket already?

Thanks,
Ere

Flowerday, Matthew J wrote on 1.3.2021 at 14.00:

Hi There

I just came across a situation where a unified highlighting search under
Solr 8.8.0/8.8.1 can take over 20 minutes to run and eventually times out.

I resolved it with a config change, but it can catch you out; hence this
email.

With Solr 8.8.0 a new unified highlighting parameter, &hl.fragAlignRatio,
was introduced; if not set, it defaults to 0.5. It attempts to improve the
highlighting so that the highlighted text does not appear hard against the
left. This works well, but if a search result contains numerous occurrences
of the word in question within the record, performance goes right down!

2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf] o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select params={hl.snippets=2&q=test&hl=on&hl.maxAnalyzedChars=100&fl=id,description,specification,score&start=20&hl.fl=*&rows=10&_=1614405119134} hits=57008 status=0 QTime=1414320

2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf] o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we are shutting down => org.eclipse.jetty.io.EofException
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
org.eclipse.jetty.io.EofException: null
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
	at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
	at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]

When I set &hl.fragAlignRatio=0.25, results came back much more quickly:

2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes] o.a.s.c.S.Request [holmes]  webapp=/solr path=/select params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.25&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=100&hl.fl=*&hl.method=unified&timeAllowed=9&_=1614430061690} hits=136939 status=0 QTime=87024

And  &hl.fragAlignRatio=0.1

2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes] o.a.s.c.S.Request [holmes]  webapp=/solr path=/select params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.1&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=100&hl.fl=*&hl.method=unified&timeAllowed=9&_=1614430061690} hits=136939 status=0 QTime=69033

And &hl.fragAlignRatio=0.0

2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes] o.a.s.c.S.Request [holmes]  webapp=/solr path=/select params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.0&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=100&hl.fl=*&hl.method=unified&timeAllowed=9&_=1614430061690} hits=136939 status=0 QTime=2841

I left our setting at 0.0; this is presumably how it was in 7.7.1 (fully
left aligned). I am not too sure how many times a word has to occur in a
record before performance goes right down, but if it occurs too many times
it can have a BIG impact.

I also noticed that setting &timeAllowed=9 did not break out of the query
until it finished; perhaps because the query itself finished quickly and
what took the time was the highlighting. It might be an idea to get &ti