Re: Apache Solr Reference Guide isn't accessible

2021-02-01 Thread Bernd Fehling

Yeah, but guide 8.8 is still buggy.

As I reported a month ago, "ICU Normalizer 2 Filter" states:
- NFC: ... Normalization Form C, canonical decomposition
- NFD: ... Normalization Form D, canonical decomposition, followed by canonical 
composition
- NFKC: ... Normalization Form KC, compatibility decomposition
- NFKD: ... Normalization Form KD, compatibility decomposition, followed by 
canonical composition

But the link to "Unicode Standard Annex #15" right above says:
- NFC: ... Normalization Form C, Canonical Decomposition, followed by Canonical 
Composition
- NFD: ... Normalization Form D, Canonical Decomposition
- NFKC: ... Normalization Form KC, Compatibility Decomposition, followed by 
Canonical Composition
- NFKD: ... Normalization Form KD, Compatibility Decomposition
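
The four forms as Annex #15 defines them can be checked on any JVM. The following is a minimal sketch using the JDK's `java.text.Normalizer`, not Solr's ICU filter (which is assumed here to follow the same Unicode definitions). The class name `NormalizationForms` and the sample strings are illustrative only.

```java
import java.text.Normalizer;

final class NormalizationForms {
    /** Renders each code point of s as U+XXXX, separated by spaces. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    /** Normalizes s with the given form and returns the resulting code points. */
    static String normalize(String s, Normalizer.Form form) {
        return codePoints(Normalizer.normalize(s, form));
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9";      // precomposed e-acute
        String fiLigature = "\uFB01";  // fi ligature, which has only a compatibility decomposition
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form
                + ": e-acute -> " + normalize(eAcute, form)
                + ", fi-ligature -> " + normalize(fiLigature, form));
        }
    }
}
```

NFD splits U+00E9 into U+0065 U+0301 and NFC composes it back, while only the two K forms expand the ligature U+FB01 into U+0066 U+0069.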

But, well who cares.

Have a nice day.


On 01.02.21 at 23:04, Cassandra Targett wrote:

> The problem causing this has been fixed and the docs should be available again.
> On Feb 1, 2021, 2:15 PM -0600, Alexandre Rafalovitch wrote:
> > And if you need something more recent while this is being fixed, you
> > can look right at the source in GitHub, though a navigation, etc is
> > missing:
> > https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc
> >
> > Open Source :-)
> >
> > Regards,
> > Alex.
> >
> > On Mon, 1 Feb 2021 at 15:04, Mike Drob wrote:
> > >
> > > Hi Dorion,
> > >
> > > We are currently working with our infra team to get these restored. In the
> > > meantime, the 8.4 guide is still available at
> > > https://lucene.apache.org/solr/guide/8_4/ and are hopeful that the 8.8
> > > guide will be back up soon. Thank you for your patience.
> > >
> > > Mike
> > >
> > > On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline wrote:
> > >
> > > > Hi,
> > > >
> > > > I can't access to Apache Solr Reference Guide since few days.
> > > > Example:
> > > > URL
> > > >
> > > > * https://lucene.apache.org/solr/guide/8_8/
> > > > * https://lucene.apache.org/solr/guide/8_7/
> > > > Result:
> > > > Not Found
> > > > The requested URL was not found on this server.
> > > >
> > > > Do you know what going on?
> > > >
> > > > Thanks
> > > > Caroline Dorion




Re: SolrCloud keeps crashing

2021-02-01 Thread Satish Silveri
I am facing the same issue. Did you find a solution for this?




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: How to get case-sensitive Terms?

2021-02-01 Thread elivis
Alexandre Rafalovitch wrote
> Admin UI also allows you to run text string against a field definition to
> see what each stage of analyzer chain does.

Thank you. Could you please give me some pointers on how to achieve this (seeing
what each stage of the analyzer chain does in the Admin UI)?






RE: Query over migrating a solr database from 7.7.1 to 8.7.0

2021-02-01 Thread Flowerday, Matthew J
Hi There 

Just as an update to this thread, I have resolved the issue. The new
schema.xml had these entries:







Once I commented out the lines containing _root_ and _nest_path_ (as we
don't have nested documents) and restarted Solr, no further duplication
on update occurred.

Regards

Matthew

Matthew Flowerday | Consultant | ULEAF
Unisys | 01908 774830| matthew.flower...@unisys.com 
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX



   

-Original Message-
From: Flowerday, Matthew J  
Sent: 15 January 2021 11:18
To: solr-user@lucene.apache.org
Subject: RE: Query over migrating a solr database from 7.7.1 to 8.7.0



Re: How to get case-sensitive Terms?

2021-02-01 Thread elivis
Alexandre Rafalovitch wrote
> Admin UI also allows you to run text string against a field definition to
> see what each stage of analyzer chain does.

Thank you. Could you please let me know how to do this (see what each stage of
the analyzer chain does)?






Re: NRT - Indexing

2021-02-01 Thread Shawn Heisey

On 2/1/2021 12:08 AM, haris.k...@vnc.biz wrote:
> Hope you're doing good. I am trying to configure NRT - Indexing in my
> project. For this reason, I have configured *autoSoftCommit* to execute
> every second and *autoCommit* to execute every 5 minutes. Everything
> works as expected on the dev and test server. But on the production
> server, there are more than 6 million documents indexed in Solr, so
> whenever a new document is indexed it takes 2-3 minutes before appearing
> in the search despite the setting I have described above. Since the
> target is to develop a real-time system, this delay of 2-3 minutes is
> not acceptable. How can I reduce this time window?


Setting autoSoftCommit with a max time of 1000 (one second) does not 
mean you will see changes within one second.  It means that one second 
after indexing begins, Solr will start a soft commit operation.  That 
commit operation must fully complete and the new searcher must come 
online before changes are visible.  Those steps may take much longer 
than one second, which seems to be happening on your system.
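
As a rough illustration of the point above, the time until a document becomes searchable can be approximated as the soft-commit trigger delay plus the time the commit and the new searcher take. This is a simplified model for the arithmetic only, not Solr code: the class name, the method, and all example timings are hypothetical.

```java
import java.time.Duration;

final class SoftCommitLatency {
    /**
     * Approximate delay before a newly indexed document is visible:
     * soft-commit trigger delay + commit duration + time to bring the new searcher online.
     */
    static Duration visibleAfter(Duration softCommitMaxTime,
                                 Duration commitDuration,
                                 Duration searcherWarmup) {
        return softCommitMaxTime.plus(commitDuration).plus(searcherWarmup);
    }

    public static void main(String[] args) {
        Duration maxTime = Duration.ofMillis(1000); // autoSoftCommit maxTime of one second

        // Hypothetical timings for a small index: commit and searcher are quick.
        Duration small = visibleAfter(maxTime, Duration.ofMillis(200), Duration.ofMillis(300));

        // Hypothetical timings for a large, memory-starved index: commit and warm-up dominate.
        Duration large = visibleAfter(maxTime, Duration.ofSeconds(30), Duration.ofSeconds(90));

        System.out.println("small index: visible after about " + small.toMillis() + " ms");
        System.out.println("large index: visible after about " + large.getSeconds() + " s");
    }
}
```

In this model, lowering the autoSoftCommit interval further cannot help once the commit and searcher times are the dominant terms.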


With the information available, I cannot tell you why your commits are 
taking so long.  One of the most common reasons for poor Solr 
performance is a lack of free memory on the system for caching purposes.


Thanks,
Shawn


Re: Apache Solr Reference Guide isn't accessible

2021-02-01 Thread Cassandra Targett
The problem causing this has been fixed and the docs should be available again.
On Feb 1, 2021, 2:15 PM -0600, Alexandre Rafalovitch wrote:
> And if you need something more recent while this is being fixed, you
> can look right at the source in GitHub, though a navigation, etc is
> missing:
> https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc
>
> Open Source :-)
>
> Regards,
> Alex.
>
> On Mon, 1 Feb 2021 at 15:04, Mike Drob  wrote:
> >
> > Hi Dorion,
> >
> > We are currently working with our infra team to get these restored. In the
> > meantime, the 8.4 guide is still available at
> > https://lucene.apache.org/solr/guide/8_4/ and are hopeful that the 8.8
> > guide will be back up soon. Thank you for your patience.
> >
> > Mike
> >
> > On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline 
> > 
> > wrote:
> >
> > > Hi,
> > >
> > > I can't access to Apache Solr Reference Guide since few days.
> > > Example:
> > > URL
> > >
> > > * https://lucene.apache.org/solr/guide/8_8/
> > > * https://lucene.apache.org/solr/guide/8_7/
> > > Result:
> > > Not Found
> > > The requested URL was not found on this server.
> > >
> > > Do you know what going on?
> > >
> > > Thanks
> > > Caroline Dorion
> > >


Re: Is the lucene.apache.org link dead?

2021-02-01 Thread Cassandra Targett
This problem has been fixed and docs should be available again. Please let us 
know if you still have problems accessing anything.
On Feb 1, 2021, 8:32 AM -0600, Cassandra Targett wrote:
> There were some issues while publishing the various bits for 8.8 and Lucene 
> and Solr Javadocs and Ref Guides for 8.5-8.7 are currently missing. The 
> project is working on getting those versions back as soon as possible.
>
> We apologize for this situation, hopefully it won’t be too long today before 
> we have it fixed.
> On Feb 1, 2021, 3:33 AM -0600, Atita Arora wrote:
> > True the link is down since last week, I checked as we are currently in the
> > state of migration to 8.7 too.
> >
> >
> > On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki 
> > wrote:
> >
> > > Hi,
> > > I tried to open the Solr News page to check the contents of the solr
> > > release, but it seems to get Not Found.
> > > I think it's either the wrong link or the link is messed up.
> > > If there is a problem, do you think you can fix it?
> > >
> > > Sorry if this has already been discussed somewhere.
> > >
> > > Solr News Page: https://lucene.apache.org/solr/news.html
> > > Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
> > >
> > > Thank you.
> > > Taisuke.
> > >


Re: Change uniqueKey using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi,

SolrJ doesn't have any purpose-made request class to change the
uniqueKey, afaict.  However doing so is still possible (though less
convenient) using the "GenericSolrRequest" class, which can be used to
hit arbitrary Solr APIs.

If you'd like to see better support for this in SolrJ, open a JIRA
ticket with the details of what you're trying to do (or a PR directly)
and I'd be happy to take a look.

Best,

Jason

On Fri, Jan 22, 2021 at 9:29 AM Timo Grün  wrote:
>
> Hi All,
>
> I’m currently trying to change the uniqueKey of my Solr Cloud schema using 
> Solrj.
> While creating new Fields and FieldDefinitions is pretty straight forward, I 
> struggle to find any solution to change the Unique Key field with Solrj.
>
> Any advice here?
>
> Best Regards,
>
> Timo Gruen
>


Re: Ghost Documents or Shards out of Sync

2021-02-01 Thread Mike Drob
To expand on what Jason suggested, if the issue is the non-deterministic
ordering due to staggered commits per replica, you may have more
consistency with TLOG replicas rather than the NRT replicas. In this case,
the underlying segment files should be identical and lead to more
predictable results.

On Mon, Feb 1, 2021 at 2:50 PM Jason Gerlowski 
wrote:

> Hi Ronen,
>
> The first thing I'd figure out in your situation is whether the
> results are actually different each time, or whether the ordering is
> what differs (which might push a particular result off the page you're
> looking at, giving the appearance that it didn't match).
>
> In the case of the former, this can happen briefly if queries come in
> when some but not all replicas have seen a commit.  But usually this
> is a transient concern - either waiting for the next autocommit or
> triggering an explicit commit resolves the discrepancy in this case.
> Since you only see identical results after a restart, this _doesn't_
> sound like what you're seeing.
>
> In the case of the latter (same results, differently ordered) this is
> expected sometimes.  Solr sorts on relevance by default with the
> internal Lucene document ID being a tiebreaker.  Both the relevance
> statistics and Lucene's document IDs can differ across SolrCloud
> replicas (due to non-deterministic conditions such as the segment
> merging and deleted-doc removal that Lucene does under the hood), and
> this can produce differently-ordered result sets for users that issue
> the same query repeatedly.
>
> Good luck narrowing things down!
>
> Jason
>
> On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum  wrote:
> >
> > Hi All,
> >
> > I'm using Solr Cloud (version 8.3.0) with shards and replicas
> (replication
> > factor of 2).
> > Recently, I've encountered several times that running the same query
> > repeatedly yields different results. Restarting the nodes fixes the
> problem
> > (until next time).
> > I assume that some shards are not synchronized and I have several
> questions:
> > 1. What can cause this - many atomic updates? issues with commits?
> > 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an
> API
> > call or some other method?
> >
> > Thanks in advance,
> > Ronen.
>


Re: Ghost Documents or Shards out of Sync

2021-02-01 Thread Jason Gerlowski
Forgot to answer your second question:

> Can I trigger the "fixing" mechanism that Solr runs at restart by an API call 
> or some other method?

It depends on what the cause is.  But for at least some possible
causes there is an API call that can resolve this.  Though that API
itself (Solr's misnamed "optimize" feature) comes with a lot of
warnings and has been discouraged by the community in the past.  (I
won't get into those specifics though until you figure out the cause.)

Before you consider calling "optimize" or taking any other means to
fix this though, it might be worth revisiting whether this is really
an issue?  While this quirk of Solr's can bedevil automated tests or
other things that rely on repeatability, it's unusual in many
applications for end-users to submit identical queries multiple times.
Every case is different of course, but something to consider.

Best,

Jason

On Mon, Feb 1, 2021 at 3:49 PM Jason Gerlowski  wrote:
>
> Hi Ronen,
>
> The first thing I'd figure out in your situation is whether the
> results are actually different each time, or whether the ordering is
> what differs (which might push a particular result off the page you're
> looking at, giving the appearance that it didn't match).
>
> In the case of the former, this can happen briefly if queries come in
> when some but not all replicas have seen a commit.  But usually this
> is a transient concern - either waiting for the next autocommit or
> triggering an explicit commit resolves the discrepancy in this case.
> Since you only see identical results after a restart, this _doesn't_
> sound like what you're seeing.
>
> In the case of the latter (same results, differently ordered) this is
> expected sometimes.  Solr sorts on relevance by default with the
> internal Lucene document ID being a tiebreaker.  Both the relevance
> statistics and Lucene's document IDs can differ across SolrCloud
> replicas (due to non-deterministic conditions such as the segment
> merging and deleted-doc removal that Lucene does under the hood), and
> this can produce differently-ordered result sets for users that issue
> the same query repeatedly.
>
> Good luck narrowing things down!
>
> Jason
>
> On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum  wrote:
> >
> > Hi All,
> >
> > I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication
> > factor of 2).
> > Recently, I've encountered several times that running the same query
> > repeatedly yields different results. Restarting the nodes fixes the problem
> > (until next time).
> > I assume that some shards are not synchronized and I have several questions:
> > 1. What can cause this - many atomic updates? issues with commits?
> > 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API
> > call or some other method?
> >
> > Thanks in advance,
> > Ronen.


Re: Ghost Documents or Shards out of Sync

2021-02-01 Thread Jason Gerlowski
Hi Ronen,

The first thing I'd figure out in your situation is whether the
results are actually different each time, or whether the ordering is
what differs (which might push a particular result off the page you're
looking at, giving the appearance that it didn't match).

In the case of the former, this can happen briefly if queries come in
when some but not all replicas have seen a commit.  But usually this
is a transient concern - either waiting for the next autocommit or
triggering an explicit commit resolves the discrepancy in this case.
Since you only see identical results after a restart, this _doesn't_
sound like what you're seeing.

In the case of the latter (same results, differently ordered) this is
expected sometimes.  Solr sorts on relevance by default with the
internal Lucene document ID being a tiebreaker.  Both the relevance
statistics and Lucene's document IDs can differ across SolrCloud
replicas (due to non-deterministic conditions such as the segment
merging and deleted-doc removal that Lucene does under the hood), and
this can produce differently-ordered result sets for users that issue
the same query repeatedly.
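
The ordering effect described above can be reproduced without Solr. The sketch below is a plain-Java simulation, not Solr or Lucene code: two hypothetical replicas hold the same documents with the same scores but different internal document IDs. Sorting on score with the internal ID as tiebreaker orders the tied documents differently per replica, while an explicit tiebreaker on the unique key (in Solr terms, something like `sort=score desc, id asc`, assuming `id` is the uniqueKey field) gives the same order on both. All class and field names are made up for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class TiebreakDemo {
    /** A hit as one replica sees it: unique key, relevance score, replica-local internal doc ID. */
    static final class Hit {
        final String id;
        final double score;
        final int internalDocId;

        Hit(String id, double score, int internalDocId) {
            this.id = id;
            this.score = score;
            this.internalDocId = internalDocId;
        }
    }

    // Default-style ordering: score descending, then the replica-local internal doc ID.
    static final Comparator<Hit> BY_SCORE_THEN_INTERNAL_ID =
        Comparator.<Hit>comparingDouble(h -> h.score).reversed()
                  .thenComparingInt(h -> h.internalDocId);

    // Deterministic ordering: score descending, then the unique key.
    static final Comparator<Hit> BY_SCORE_THEN_UNIQUE_KEY =
        Comparator.<Hit>comparingDouble(h -> h.score).reversed()
                  .thenComparing(h -> h.id);

    // Same documents and scores on both replicas; only the internal IDs of "a" and "b" differ.
    static final List<Hit> REPLICA_A = Arrays.asList(
        new Hit("a", 1.0, 0), new Hit("b", 1.0, 1), new Hit("c", 2.0, 2));
    static final List<Hit> REPLICA_B = Arrays.asList(
        new Hit("b", 1.0, 0), new Hit("a", 1.0, 1), new Hit("c", 2.0, 2));

    /** Returns the unique keys of the hits in the order given by cmp. */
    static List<String> order(List<Hit> hits, Comparator<Hit> cmp) {
        return hits.stream().sorted(cmp).map(h -> h.id).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("replica A, internal-ID tiebreak: " + order(REPLICA_A, BY_SCORE_THEN_INTERNAL_ID));
        System.out.println("replica B, internal-ID tiebreak: " + order(REPLICA_B, BY_SCORE_THEN_INTERNAL_ID));
        System.out.println("replica A, unique-key tiebreak:  " + order(REPLICA_A, BY_SCORE_THEN_UNIQUE_KEY));
        System.out.println("replica B, unique-key tiebreak:  " + order(REPLICA_B, BY_SCORE_THEN_UNIQUE_KEY));
    }
}
```

The simulation covers only ties on score. If the relevance statistics themselves differ between replicas, the scores differ too and a tiebreaker alone would not make the order identical.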

Good luck narrowing things down!

Jason

On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum  wrote:
>
> Hi All,
>
> I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication
> factor of 2).
> Recently, I've encountered several times that running the same query
> repeatedly yields different results. Restarting the nodes fixes the problem
> (until next time).
> I assume that some shards are not synchronized and I have several questions:
> 1. What can cause this - many atomic updates? issues with commits?
> 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API
> call or some other method?
>
> Thanks in advance,
> Ronen.


Re: Getting Solr's statistic using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi Steven,

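AFAIK, SolrJ doesn't have built-in request objects for the metrics
API.  But you can still use the "GenericSolrRequest" class to hit any
Solr API:

e.g.

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.client.solrj.response.SimpleSolrResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

// "solrClient" is an existing SolrClient; SolrParams itself has no set() method.
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "list");
GenericSolrRequest request = new GenericSolrRequest(
    SolrRequest.METHOD.GET, "/admin/metrics/history", params);
final SimpleSolrResponse response = request.process(solrClient);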

Hope that helps,

Jason

On Fri, Jan 22, 2021 at 11:21 AM Gael Jourdan-Weil
 wrote:
>
> Hello Steven,
>
> I believe what you are looking for cannot be accessed using SolrJ (I didn't 
> really check though).
>
> But you can easily access it either via the Collections APIs and/or the 
> Metrics API depending on what you need exactly.
> See https://lucene.apache.org/solr/guide/8_4/cluster-node-management.html and 
> https://lucene.apache.org/solr/guide/8_4/metrics-reporting.html
>
> Gaël
>
>
> From: Steven White 
> Sent: Friday, 22 January 2021 16:46
> To: solr-user@lucene.apache.org 
> Subject: Getting Solr's statistic using SolrJ
>
> Hi everyone,
>
> Is there a SolrJ API that I can use to collect statistics data about Solr
> (everything that I see on the dashboard if possible)?
>
> I am in need to collect data about Solr instances, those same data that I
> see on the dashboard such as swap-memory, jvm-memory, list of cores, info
> about each core, etc. etc. using SolrJ API.
>
> Thanks
>
> Steven


Re: Apache Solr Reference Guide isn't accessible

2021-02-01 Thread Alexandre Rafalovitch
And if you need something more recent while this is being fixed, you
can look right at the source in GitHub, though navigation, etc. is
missing:
https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc

Open Source :-)

Regards,
   Alex.

On Mon, 1 Feb 2021 at 15:04, Mike Drob  wrote:
>
> Hi Dorion,
>
> We are currently working with our infra team to get these restored. In the
> meantime, the 8.4 guide is still available at
> https://lucene.apache.org/solr/guide/8_4/ and are hopeful that the 8.8
> guide will be back up soon. Thank you for your patience.
>
> Mike
>
> On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline 
> wrote:
>
> > Hi,
> >
> > I can't access to Apache Solr Reference Guide since few days.
> > Example:
> > URL
> >
> >   *   https://lucene.apache.org/solr/guide/8_8/
> >   *   https://lucene.apache.org/solr/guide/8_7/
> > Result:
> > Not Found
> > The requested URL was not found on this server.
> >
> > Do you know what going on?
> >
> > Thanks
> > Caroline Dorion
> >


Re: Apache Solr Reference Guide isn't accessible

2021-02-01 Thread Mike Drob
Hi Dorion,

We are currently working with our infra team to get these restored. In the
meantime, the 8.4 guide is still available at
https://lucene.apache.org/solr/guide/8_4/ and we are hopeful that the 8.8
guide will be back up soon. Thank you for your patience.

Mike

On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline 
wrote:

> Hi,
>
> I can't access to Apache Solr Reference Guide since few days.
> Example:
> URL
>
>   *   https://lucene.apache.org/solr/guide/8_8/
>   *   https://lucene.apache.org/solr/guide/8_7/
> Result:
> Not Found
> The requested URL was not found on this server.
>
> Do you know what going on?
>
> Thanks
> Caroline Dorion
>


Apache Solr Reference Guide isn't accessible

2021-02-01 Thread Dorion Caroline
Hi,

I haven't been able to access the Apache Solr Reference Guide for a few days.
Example:
URL

  *   https://lucene.apache.org/solr/guide/8_8/
  *   https://lucene.apache.org/solr/guide/8_7/
Result:
Not Found
The requested URL was not found on this server.

Do you know what's going on?

Thanks
Caroline Dorion


Re: Is the lucene.apache.org link dead?

2021-02-01 Thread Cassandra Targett
There were some issues while publishing the various bits for 8.8, and the Lucene
and Solr Javadocs and Ref Guides for 8.5-8.7 are currently missing. The project is
working on getting those versions back as soon as possible.

We apologize for this situation; hopefully it won’t be too long today before we
have it fixed.
On Feb 1, 2021, 3:33 AM -0600, Atita Arora wrote:
> True the link is down since last week, I checked as we are currently in the
> state of migration to 8.7 too.
>
>
> On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki 
> wrote:
>
> > Hi,
> > I tried to open the Solr News page to check the contents of the solr
> > release, but it seems to get Not Found.
> > I think it's either the wrong link or the link is messed up.
> > If there is a problem, do you think you can fix it?
> >
> > Sorry if this has already been discussed somewhere.
> >
> > Solr News Page: https://lucene.apache.org/solr/news.html
> > Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
> >
> > Thank you.
> > Taisuke.
> >


Re: NRT - Indexing

2021-02-01 Thread Dominique Bejean
Hi,

It is not the cause of your issue, but your Solr version is 8.6.0, and
solrconfig.xml includes
7.5.0

You wrote: "I am using a service that fetches data from the Postgres database and
indexes it to solr. The service runs with a delay of 5 seconds." Do you mean
you are using DIH and launching a delta-import every 5 seconds?

Solr logs may help.

Dominique



On Mon, 1 Feb 2021 at 13:00,  wrote:

> Hello,
>
> I am attaching the solrconfig.xml along with this email, also I am
> attaching a text document that has JSON object regarding the system
> information. I am using a service that fetches data from the Postgres
> database and indexes it to solr. The service runs with a delay of 5 seconds.
>
> Regards
>
> Mit freundlichen Grüssen / Kind regards
>
> Muhammad Haris Khan
>
> *VNC - Virtual Network Consult*
>
> *-- Solr Ingenieur --*
>
> - On 1 February, 2021 3:50 PM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
>
> Hi,
>
> What is your Solr version ?
> Can you share your solrconfig.xml ?
> How is your sharding ?
> Did you grep your solr logs on with the "commit' pattern in order to see
> hard and soft commit occurrences ?
> How are you pushing new docs or updates in the collection ?
>
> Regards.
>
> Dominique
>
> On Mon, 1 Feb 2021 at 08:08,  wrote:
>
> > Hello,
> >
> > Hope you're doing good. I am trying to configure NRT - Indexing in my
> > project. For this reason, I have configured *autoSoftCommit* to execute
> > every second and *autoCommit* to execute every 5 minutes. Everything
> > works as expected on the dev and test server. But on the production server,
> > there are more than 6 million documents indexed in Solr, so whenever a new
> > document is indexed it takes 2-3 minutes before appearing in the search
> > despite the setting I have described above. Since the target is to develop
> > a real-time system, this delay of 2-3 minutes is not acceptable. How can I
> > reduce this time window?
> >
> > Plus any advice on better scaling the Solr considering more than 6 million
> > records would be very helpful. Thank you in advance.
> >
> > Mit freundlichen Grüssen / Kind regards
> >
> > Muhammad Haris Khan
> >
> > *VNC - Virtual Network Consult*
> >
> > *-- Solr Ingenieur --*
>
>


[ANNOUNCE] Apache Solr 8.8.0 released

2021-02-01 Thread Noble Paul

Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive REST APIs as well as parallel SQL.
Solr is enterprise grade, secure and highly scalable, providing fault
tolerant distributed search and indexing, and powers the search and
navigation features of many of the world's largest internet sites.

The release is available for immediate download at:

 https://lucene.apache.org/solr/downloads.html

Please read CHANGES.txt for a detailed list of changes:

 https://lucene.apache.org/solr/8_8_0/changes/Changes.html

Solr 8.8.0 Release Highlights:

Reducing overseer bottlenecks using per-replica states. More stability and
lower load on large clusters that use this feature.

Better restart and collection creation performance.

Interleaving support in Learning To Rank

A summary of important changes is published in the Solr Reference Guide at:

 https://lucene.apache.org/solr/guide/8_8/solr-upgrade-notes.html.

For the most exhaustive list, see the full release notes at
https://lucene.apache.org/solr/8_8_0/changes/Changes.html

or

by viewing the CHANGES.txt file accompanying the distribution. Solr's
release notes usually don't include Lucene layer changes. Lucene's release
notes are at

https://lucene.apache.org/core/8_8_0/changes/Changes.html

- -
Noble Paul


Re: NRT - Indexing

2021-02-01 Thread haris . khan
Hello,

I am attaching the solrconfig.xml along with this email, also I am
attaching a text document that has JSON object regarding the system information.
I am using a service that fetches data from the Postgres database and indexes
it to solr. The service runs with a delay of 5 seconds.

Regards

Mit freundlichen Grüssen / Kind regards

Muhammad Haris Khan
VNC - Virtual Network Consult
-- Solr Ingenieur --

- On 1 February, 2021 3:50 PM, Dominique Bejean  wrote:

> Hi,
>
> What is your Solr version ?
> Can you share your solrconfig.xml ?
> How is your sharding ?
> Did you grep your solr logs on with the "commit' pattern in order to see
> hard and soft commit occurrences ?
> How are you pushing new docs or updates in the collection ?
>
> Regards.
>
> Dominique
>
> On Mon, 1 Feb 2021 at 08:08,  wrote:
>
> > Hello,
> >
> > Hope you're doing good. I am trying to configure NRT - Indexing in my
> > project. For this reason, I have configured *autoSoftCommit* to execute
> > every second and *autoCommit* to execute every 5 minutes. Everything
> > works as expected on the dev and test server. But on the production server,
> > there are more than 6 million documents indexed in Solr, so whenever a new
> > document is indexed it takes 2-3 minutes before appearing in the search
> > despite the setting I have described above. Since the target is to develop
> > a real-time system, this delay of 2-3 minutes is not acceptable. How can I
> > reduce this time window?
> >
> > Plus any advice on better scaling the Solr considering more than 6 million
> > records would be very helpful. Thank you in advance.
> >
> > Mit freundlichen Grüssen / Kind regards
> >
> > Muhammad Haris Khan
> >
> > *VNC - Virtual Network Consult*
> >
> > *-- Solr Ingenieur --*

solrconfig.xml
Description: XML document


Re: Tweaking Shards and Replicas for high volume queries and updates

2021-02-01 Thread Dominique Bejean
Hi,

Some suggestions.

* 64GB JVM Heap
Are you sure you really need this heap size? Did you check your GC logs
(with gceasy.io)?
A best practice is to keep the heap as small as possible, and never more
than 31 GB.

* OS Caching
Did you set swappiness to 1?

* Put two instances of Solr on each node
You need to check resource usage in order to evaluate whether it would be
worthwhile (CPU usage, CPU load average, CPU iowait, heap usage, disk I/O
read and write, MMAP caching, ...).
A high load average with low CPU load suggests that disk I/O may be the
bottleneck. I would consider increasing the number of physical servers, with
less CPU, RAM and disk space on each (but overall the same quantity
of CPU, RAM and disk space). This will increase the disk I/O capacity.

* Collection 4 is the trouble collection
Try to have smaller cores (more shards if you increase the number of Solr
instances).
Investigate time routed or category routed aliases, if they can match
your update strategy and/or your query profiles.
Work again on the schema:
- For docValues=true fields, check if you really need indexed=true and
stored=true (there are a lot of considerations to take into account), ...
- Over-indexing with copyField?
Work on queries: facets, group, collapse, fl=, rows=, ...
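
On the heap-size suggestion above: the 31 GB guideline is commonly tied to HotSpot's compressed object pointers, which (with default object alignment) are only available for heaps below roughly 32 GB. The sketch below is an illustrative check and not part of Solr. The class name and the threshold constant are assumptions chosen for the example, and it reports on the JVM it runs in, so it would need the same `-Xmx` as the Solr process to be meaningful.

```java
final class HeapSizeCheck {
    // Assumed threshold: compressed object pointers need a heap below about 32 GiB
    // when the JVM uses its default 8-byte object alignment.
    static final long COMPRESSED_OOPS_LIMIT_BYTES = 32L * 1024 * 1024 * 1024;

    /** True if a heap of the given size stays under the assumed compressed-pointer limit. */
    static boolean belowCompressedOopsLimit(long maxHeapBytes) {
        return maxHeapBytes < COMPRESSED_OOPS_LIMIT_BYTES;
    }

    public static void main(String[] args) {
        // Maximum heap the current JVM will attempt to use (roughly the -Xmx value).
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.printf("max heap: %.1f GiB, below compressed-pointer limit: %b%n",
            maxHeap / (1024.0 * 1024 * 1024), belowCompressedOopsLimit(maxHeap));
    }
}
```

Under this assumption a 31 GB heap passes the check and the 64 GB heap from the original question does not, which is one reason a smaller heap can perform better than a larger one.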

Regards

Dominique


On Wed, 27 Jan 2021 at 14:53, Hollowell,Skip  wrote:

> 30 Dedicated physical Nodes in the Solr Cloud Cluster, all of identical
> configuration
> Server01   RHEL 7.x
> 256GB RAM
> 10 2TB Spinning Disk in a RAID 10 Configuration (Leaving us 9.8TB usable
> per node)
> 64GB JVM Heap. We tried as high as 100GB, but it appeared that 64GB was
> faster.  If we set a higher heap, do we starve the OS for caching?
> Huge Pages is off on the system, and thus UseLargePages is off on Solr
> Startup
> G1GC, Java 11  (ZGC with Java 15 and HugePages turned on was a disaster.
> We suspect it was due to the Huge Pages configuration)
> At one time we discussed putting two instances of Solr on each node,
> giving us a cloud of 60 instances instead of 30.  Load Average is high on
> these nodes during certain types of queries or updates, but CPU Load is
> relatively low and should be able to accommodate a second instance, but all
> the data would still be on the same RAID10 group of disks.
> Collection 4 is the trouble collection.  It has nearly a billion
> documents, and there are between 200 and 400 million updates every day.
> How do we get that kind of update performance, and still serve 10 million
> queries a day?  Schemas have been reviewed and re-reviewed to ensure we are
> only indexing and storing what is absolutely necessary.  What are we
> missing?  Do we need to revisit our replica policy?  Number of replicas or
> types of replicas (to ensure some are only used for reading, etc?)
> [Grabbed from the Admin UI]
> 755.6Gb Index Size according to Solr Cloud UI
> Total #docs: 371.8mn
> Avg size/doc: 2.1Kb
> 90 Shards, 2 NRT Replicas per Shard, 1,750,612,476 documents, avg
> size/doc: 1.7Kb, uses nested documents
> collection-1_s69r317    31.1Gb
> collection-1_s49r96     30.7Gb
> collection-1_s78r154    30.2Gb
> collection-1_s40r259    30.1Gb
> collection-1_s9r197     29.1Gb
> collection-1_s18r34     28.9Gb
> 120 Shards, 2 TLOG Replicas per Shard, 2,230,207,046 documents, avg
> size/doc: 1.3Kb
> collection-2_s78r154    22.8Gb
> collection-2_s49r96     22.8Gb
> collection-2_s46r331    22.8Gb
> collection-2_s18r34     22.7Gb
> collection-2_s109r216   22.7Gb
> collection-2_s104r447   22.7Gb
> collection-2_s15r269    22.7Gb
> collection-2_s73r385    22.7Gb
> 120 Shards, 2 TLOG Replicas per Shard, 733,588,503 documents, avg
> size/doc: 1.9Kb
> collection-3_s19r277    10.6Gb
> collection-3_s108r214   10.6Gb
> collection-3_s48r94     10.6Gb
> collection-3_s109r457   10.6Gb
> collection-3_s47r333    10.5Gb
> collection-3_s78r154    10.5Gb
> collection-3_s18r34     10.5Gb
> collection-3_s77r393    10.5Gb
>
> 120 Shards, 2 TLOG Replicas per Shard, 864,372,654 documents, avg
> size/doc: 5.6Kb
> collection-4_s109r216   38.7Gb
> collection-4_s100r439   38.7Gb
> collection-4_s49r96     38.7Gb
> collection-4_s35r309    38.6Gb
> collection-4_s18r34     38.6Gb
> collection-4_s78r154    38.6Gb
> collection-4_s7r253     38.6Gb
> collection-4_s69r377    38.6Gb
>


Re: NRT - Indexing

2021-02-01 Thread Dominique Bejean
Hi,

What is your Solr version?
Can you share your solrconfig.xml?
How is the collection sharded?
Did you grep your Solr logs for the "commit" pattern to see when hard and
soft commits actually occur?
How are you pushing new docs or updates to the collection?
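
If a plain grep is too noisy, a small script can split the commit entries
into hard and soft ones. This is only a rough sketch: it assumes that commit
log lines contain "start commit{...}" with a "softCommit=true|false" flag,
so check that pattern against your own log first. The sample lines are made
up for illustration and are not real Solr output.

```python
import re
from collections import Counter

# Assumed format: a commit log line contains "start commit{...}" with a
# "softCommit=true|false" flag. Verify this against your own solr.log.
COMMIT_FLAG = re.compile(r"start commit\{.*?softCommit=(true|false)")


def count_commits(lines):
    """Count hard and soft commits in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = COMMIT_FLAG.search(line)
        if match:
            kind = "soft" if match.group(1) == "true" else "hard"
            counts[kind] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    "12:00:01 INFO start commit{openSearcher=true,softCommit=true}",
    "12:00:02 INFO start commit{openSearcher=true,softCommit=true}",
    "12:05:00 INFO start commit{openSearcher=false,softCommit=false}",
    "12:05:01 INFO some unrelated message",
]

counts = count_commits(sample)
print("soft:", counts["soft"], "hard:", counts["hard"])
```

To run it on a real log, pass an open file instead of the sample list, for
example count_commits(open("solr.log")), where the path is whatever your
installation uses. If the soft commit count is far lower than one per second
under load, the autoSoftCommit setting is probably not being applied.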

Regards.

Dominique




On Mon, 1 Feb 2021 at 08:08,  wrote:

> Hello,
>
> Hope you're doing good. I am trying to configure NRT - Indexing in my
> project. For this reason, I have configured *autoSoftCommit* to execute
> every second and *autoCommit* to execute every 5 minutes. Everything
> works as expected on the dev and test server. But on the production server,
> there are more than 6 million documents indexed in Solr, so whenever a new
> document is indexed it takes 2-3 minutes before appearing in the search
> despite the setting I have described above. Since the target is to develop
> a real-time system, this delay of 2-3 minutes is not acceptable. How can I
> reduce this time window?
>
> Plus any advice on better scaling the Solr considering more than 6 million
> records would be very helpful. Thank you in advance.
>
>
>
> Kind regards
>
> Muhammad Haris Khan
>
> *VNC - Virtual Network Consult*
>
> *-- Solr Engineer --*
>


Re: NRT - Indexing

2021-02-01 Thread Mr Havercamp
I'm running into the same issue. I've set autoSoftCommit and autoCommit, but
the delay before docs become searchable is inconsistent with those settings.
I have lowered autoCommit to one minute, but it still takes a few minutes
for docs to show up after indexing. The soft commit settings also seem to
have no effect: from what I understand of the docs, a soft commit makes
items visible in search, but I'm not seeing them until well after the
autoCommit period has passed.
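
For anyone comparing notes, here is a rough sketch of how I would check what
the updateHandler section of a solrconfig.xml really says. The XML fragment
is a hypothetical example (one minute autoCommit, one second autoSoftCommit),
not my actual config, and the commit_settings helper is my own name. It also
assumes literal values; a real file often uses property placeholders such as
${solr.autoCommit.maxTime:15000}, which this sketch would return unresolved.

```python
import xml.etree.ElementTree as ET

# Hypothetical solrconfig.xml fragment. The values are examples only,
# not a recommendation.
SOLRCONFIG = """
<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>
  </updateHandler>
</config>
"""


def commit_settings(xml_text):
    """Return the autoCommit and autoSoftCommit child settings as dicts.

    A section that is missing from the config maps to None.
    """
    root = ET.fromstring(xml_text)
    handler = root.find("updateHandler")
    settings = {}
    for name in ("autoCommit", "autoSoftCommit"):
        node = handler.find(name) if handler is not None else None
        if node is None:
            settings[name] = None
        else:
            settings[name] = {
                child.tag: (child.text or "").strip() for child in node
            }
    return settings


for section, values in commit_settings(SOLRCONFIG).items():
    print(section, values)
```

With openSearcher set to false, the hard commit flushes to disk but does not
open a new searcher, so visibility depends entirely on the soft commit. If
autoSoftCommit comes back as None or with an unexpected maxTime, that would
explain docs only showing up late.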



Re: Is the lucene.apache.org link dead?

2021-02-01 Thread Atita Arora
True, the link has been down since last week. I checked because we are
currently migrating to 8.7 too.


On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki 
wrote:

> Hi,
> I tried to open the Solr News page to check the contents of the solr
> release, but it seems to get Not Found.
> I think it's either the wrong link or the link is messed up.
> If there is a problem, do you think you can fix it?
>
> Sorry if this has already been discussed somewhere.
>
> Solr News Page: https://lucene.apache.org/solr/news.html
> Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
>
> Thank you.
> Taisuke.
>


Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.

2021-02-01 Thread Kerwin
Hi David,

Thanks for filing this issue. The classic non-weightMatcher mode works well
for us right now. Yes, we are using the POSTINGS mode for most of the
fields although explicitly mentioning it gives an error since not all
fields are indexed with offsets. So I guess the highlighter is picking the
right choice for each field. Here is the test with hl.offsetSource=ANALYSIS
and hl.weightMatches=false that you requested.

hl.offsetSource=ANALYSIS&hl.weightMatches=false (340 ms)
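
In case it helps to reproduce the test, this is roughly how the request can
be built. It is a minimal sketch: the base URL, collection name and query are
placeholders, the highlight_query helper is my own name, and hl.method=unified
is passed explicitly here as an assumption about the setup.

```python
from urllib.parse import urlencode


def highlight_query(base_url, collection, query):
    """Build a select URL with the highlighter parameters from this test."""
    params = {
        "q": query,
        "hl": "true",
        "hl.method": "unified",
        "hl.offsetSource": "ANALYSIS",
        "hl.weightMatches": "false",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Placeholder host, collection and query, not the real setup.
print(highlight_query("http://localhost:8983/solr", "mycollection", "title:solr"))
```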

The above is thus better than the original highlighter. I'll also try and
create that PR soon.