Re: Apache Solr Reference Guide isn't accessible
Yeah, but guide 8.8 is still buggy. As I reported a month ago, the "ICU Normalizer 2 Filter" section states:

- NFC: ... Normalization Form C, canonical decomposition
- NFD: ... Normalization Form D, canonical decomposition, followed by canonical composition
- NFKC: ... Normalization Form KC, compatibility decomposition
- NFKD: ... Normalization Form KD, compatibility decomposition, followed by canonical composition

But the "Unicode Standard Annex #15" linked right above it says:

- NFC: ... Normalization Form C, Canonical Decomposition, followed by Canonical Composition
- NFD: ... Normalization Form D, Canonical Decomposition
- NFKC: ... Normalization Form KC, Compatibility Decomposition, followed by Canonical Composition
- NFKD: ... Normalization Form KD, Compatibility Decomposition

But, well, who cares. Have a nice day.

On 01.02.21 at 23:04, Cassandra Targett wrote:
> The problem causing this has been fixed and the docs should be available again.
>
> On Feb 1, 2021, 2:15 PM -0600, Alexandre Rafalovitch wrote:
> > And if you need something more recent while this is being fixed, you can look right at the source in GitHub, though a navigation, etc is missing:
> > https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc
> >
> > Open Source :-)
> >
> > Regards,
> > Alex.
> >
> > On Mon, 1 Feb 2021 at 15:04, Mike Drob wrote:
> > > Hi Dorion,
> > >
> > > We are currently working with our infra team to get these restored. In the meantime, the 8.4 guide is still available at https://lucene.apache.org/solr/guide/8_4/ and are hopeful that the 8.8 guide will be back up soon. Thank you for your patience.
> > >
> > > Mike
> > >
> > > On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline wrote:
> > > > Hi,
> > > >
> > > > I can't access to Apache Solr Reference Guide since few days.
> > > > Example:
> > > > URL
> > > > * https://lucene.apache.org/solr/guide/8_8/
> > > > * https://lucene.apache.org/solr/guide/8_7/
> > > > Result:
> > > > Not Found
> > > > The requested URL was not found on this server.
> > > >
> > > > Do you know what going on?
> > > >
> > > > Thanks
> > > > Caroline Dorion
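For anyone who wants to check the four forms from the message above for themselves, here is a minimal sketch using only the JDK's java.text.Normalizer. It does not touch Solr or the ICU filter; the class name, the helper show() and the sample string are made up for illustration.

```java
import java.text.Normalizer;

class NormalizationForms {
    // Normalizes input with the named form (NFC, NFD, NFKC or NFKD) and
    // returns the resulting code points as space-separated "U+XXXX" tokens.
    static String show(String input, String formName) {
        String normalized = Normalizer.normalize(input, Normalizer.Form.valueOf(formName));
        StringBuilder sb = new StringBuilder();
        normalized.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample input: the "fi" ligature (U+FB01, a compatibility character)
        // followed by a precomposed e-acute (U+00E9).
        String input = "\uFB01\u00E9";
        for (String form : new String[] {"NFC", "NFD", "NFKC", "NFKD"}) {
            System.out.println(form + ": " + show(input, form));
        }
        // NFC:  U+FB01 U+00E9               (canonical decomposition, then composition)
        // NFD:  U+FB01 U+0065 U+0301        (canonical decomposition only)
        // NFKC: U+0066 U+0069 U+00E9        (compatibility decomposition, then composition)
        // NFKD: U+0066 U+0069 U+0065 U+0301 (compatibility decomposition only)
    }
}
```

The output matches the Annex #15 wording quoted above: the "C" forms recompose after decomposing, the "D" forms do not.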
Re: SolrCloud keeps crashing
I am facing the same issue. Did you find a solution for this?

-- Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: How to get case-sensitive Terms?
Alexandre Rafalovitch wrote
> Admin UI also allows you to run text string against a field definition to
> see what each stage of analyzer chain does.

Thank you. Could you please give me some pointers on how to achieve this (seeing what each stage of the analyzer chain does in the Admin UI)?
RE: Query over migrating a solr database from 7.7.1 to 8.7.0
Hi There

Just as an update to this thread: I have resolved the issue. The new schema.xml had entries for _root_ and _nest_path_. Once I commented out the lines containing them (as we don't have nested documents) and restarted Solr, no further duplication on update occurred.

Regards
Matthew

Matthew Flowerday | Consultant | ULEAF
Unisys | 01908 774830 | matthew.flower...@unisys.com
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17 8LX

-Original Message-
From: Flowerday, Matthew J
Sent: 15 January 2021 11:18
To: solr-user@lucene.apache.org
Subject: RE: Query over migrating a solr database from 7.7.1 to 8.7.0
Re: How to get case-sensitive Terms?
Alexandre Rafalovitch wrote
> Admin UI also allows you to run text string against a field definition to
> see what each stage of analyzer chain does.

Thank you. Could you please let me know how to do this (see what each stage of the analyzer chain does)?
Re: NRT - Indexing
On 2/1/2021 12:08 AM, haris.k...@vnc.biz wrote:
> Hope you're doing good. I am trying to configure NRT - Indexing in my project. For this reason, I have configured *autoSoftCommit* to execute every second and *autoCommit* to execute every 5 minutes. Everything works as expected on the dev and test server. But on the production server, there are more than 6 million documents indexed in Solr, so whenever a new document is indexed it takes 2-3 minutes before appearing in the search despite the setting I have described above. Since the target is to develop a real-time system, this delay of 2-3 minutes is not acceptable. How can I reduce this time window?

Setting autoSoftCommit with a max time of 1000 (one second) does not mean you will see changes within one second. It means that one second after indexing begins, Solr will start a soft commit operation. That commit operation must fully complete and the new searcher must come online before changes are visible. Those steps may take much longer than one second, which seems to be happening on your system.

With the information available, I cannot tell you why your commits are taking so long. One of the most common reasons for poor Solr performance is a lack of free memory on the system for caching purposes.

Thanks,
Shawn
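As a rough back-of-the-envelope illustration of Shawn's point, the delay before a document becomes searchable can be thought of as the autoSoftCommit wait plus the time the commit takes plus the time the new searcher needs to come online. The sketch below is just that arithmetic; the class, the method and the 30 s / 120 s figures are hypothetical and were not measured on any system in this thread.

```java
class SoftCommitLatency {
    // Rough upper bound, in milliseconds, on how long a newly indexed document
    // may stay invisible: the soft commit only *starts* after maxTime, and the
    // document becomes searchable once the commit has finished and the new
    // searcher is online.
    static long worstCaseVisibilityMs(long softCommitMaxTimeMs,
                                      long commitDurationMs,
                                      long searcherWarmupMs) {
        return softCommitMaxTimeMs + commitDurationMs + searcherWarmupMs;
    }

    public static void main(String[] args) {
        long maxTime = 1_000;          // autoSoftCommit maxTime from the question
        long commitDuration = 30_000;  // hypothetical: time for the commit to complete
        long warmup = 120_000;         // hypothetical: time until the new searcher is online
        long total = worstCaseVisibilityMs(maxTime, commitDuration, warmup);
        System.out.println("Worst-case visibility delay: " + (total / 1000) + " s");
    }
}
```

With numbers like these the one-second setting contributes almost nothing to the total, which is why lowering maxTime further would not help; the commit and searcher start-up times are the parts worth measuring in the Solr logs.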
Re: Apache Solr Reference Guide isn't accessible
The problem causing this has been fixed and the docs should be available again.

On Feb 1, 2021, 2:15 PM -0600, Alexandre Rafalovitch wrote:
> And if you need something more recent while this is being fixed, you can look right at the source in GitHub, though a navigation, etc is missing:
> https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc
>
> Open Source :-)
>
> Regards,
> Alex.
>
> On Mon, 1 Feb 2021 at 15:04, Mike Drob wrote:
> > Hi Dorion,
> >
> > We are currently working with our infra team to get these restored. In the meantime, the 8.4 guide is still available at https://lucene.apache.org/solr/guide/8_4/ and are hopeful that the 8.8 guide will be back up soon. Thank you for your patience.
> >
> > Mike
> >
> > On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline wrote:
> > > Hi,
> > >
> > > I can't access to Apache Solr Reference Guide since few days.
> > > Example:
> > > URL
> > > * https://lucene.apache.org/solr/guide/8_8/
> > > * https://lucene.apache.org/solr/guide/8_7/
> > > Result:
> > > Not Found
> > > The requested URL was not found on this server.
> > >
> > > Do you know what going on?
> > >
> > > Thanks
> > > Caroline Dorion
Re: Is the lucene.apache.org link dead?
This problem has been fixed and docs should be available again. Please let us know if you still have problems accessing anything.

On Feb 1, 2021, 8:32 AM -0600, Cassandra Targett wrote:
> There were some issues while publishing the various bits for 8.8 and Lucene and Solr Javadocs and Ref Guides for 8.5-8.7 are currently missing. The project is working on getting those versions back as soon as possible.
>
> We apologize for this situation, hopefully it won’t be too long today before we have it fixed.
>
> On Feb 1, 2021, 3:33 AM -0600, Atita Arora wrote:
> > True the link is down since last week, I checked as we are currently in the state of migration to 8.7 too.
> >
> > On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki wrote:
> > > Hi,
> > > I tried to open the Solr News page to check the contents of the solr release, but it seems to get Not Found.
> > > I think it's either the wrong link or the link is messed up.
> > > If there is a problem, do you think you can fix it?
> > >
> > > Sorry if this has already been discussed somewhere.
> > >
> > > Solr News Page: https://lucene.apache.org/solr/news.html
> > > Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
> > >
> > > Thank you.
> > > Taisuke.
Re: Change uniqueKey using SolrJ
Hi,

SolrJ doesn't have any purpose-made request class to change the uniqueKey, as far as I can tell. However, doing so is still possible (though less convenient) using the "GenericSolrRequest" class, which can be used to hit arbitrary Solr APIs.

If you'd like to see better support for this in SolrJ, open a JIRA ticket with the details of what you're trying to do (or a PR directly) and I'd be happy to take a look.

Best,
Jason

On Fri, Jan 22, 2021 at 9:29 AM Timo Grün wrote:
> Hi All,
>
> I’m currently trying to change the uniqueKey of my Solr Cloud schema using Solrj.
> While creating new Fields and FieldDefinitions is pretty straight forward, I struggle to find any solution to change the Unique Key field with Solrj.
>
> Any advice here?
>
> Best Regards,
>
> Timo Gruen
Re: Ghost Documents or Shards out of Sync
To expand on what Jason suggested, if the issue is the non-deterministic ordering due to staggered commits per replica, you may have more consistency with TLOG replicas rather than the NRT replicas. In this case, the underlying segment files should be identical and lead to more predictable results.

On Mon, Feb 1, 2021 at 2:50 PM Jason Gerlowski wrote:
> Hi Ronen,
>
> The first thing I'd figure out in your situation is whether the results are actually different each time, or whether the ordering is what differs (which might push a particular result off the page you're looking at, giving the appearance that it didn't match).
>
> In the case of the former, this can happen briefly if queries come in when some but not all replicas have seen a commit. But usually this is a transient concern - either waiting for the next autocommit or triggering an explicit commit resolves the discrepancy in this case. Since you only see identical results after a restart, this _doesn't_ sound like what you're seeing.
>
> In the case of the latter (same results, differently ordered) this is expected sometimes. Solr sorts on relevance by default with the internal Lucene document ID being a tiebreaker. Both the relevance statistics and Lucene's document IDs can differ across SolrCloud replicas (due to non-deterministic conditions such as the segment merging and deleted-doc removal that Lucene does under the hood), and this can produce differently-ordered result sets for users that issue the same query repeatedly.
>
> Good luck narrowing things down!
>
> Jason
>
> On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum wrote:
> > Hi All,
> >
> > I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication factor of 2).
> > Recently, I've encountered several times that running the same query repeatedly yields different results. Restarting the nodes fixes the problem (until next time).
> > I assume that some shards are not synchronized and I have several questions:
> > 1. What can cause this - many atomic updates? issues with commits?
> > 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API call or some other method?
> >
> > Thanks in advance,
> > Ronen.
Re: Ghost Documents or Shards out of Sync
Forgot to answer your second question:

> Can I trigger the "fixing" mechanism that Solr runs at restart by an API call or some other method?

It depends on what the cause is. But for at least some possible causes there is an API call that can resolve this. Though that API itself (Solr's misnamed "optimize" feature) comes with a lot of warnings and has been discouraged by the community in the past. (I won't get into those specifics though until you figure out the cause.)

Before you consider calling "optimize" or taking any other means to fix this though, it might be worth revisiting whether this is really an issue? While this quirk of Solr's can bedevil automated tests or other things that rely on repeatability, it's unusual in many applications for end-users to submit identical queries multiple times. Every case is different of course, but something to consider.

Best,
Jason

On Mon, Feb 1, 2021 at 3:49 PM Jason Gerlowski wrote:
> Hi Ronen,
>
> The first thing I'd figure out in your situation is whether the results are actually different each time, or whether the ordering is what differs (which might push a particular result off the page you're looking at, giving the appearance that it didn't match).
>
> In the case of the former, this can happen briefly if queries come in when some but not all replicas have seen a commit. But usually this is a transient concern - either waiting for the next autocommit or triggering an explicit commit resolves the discrepancy in this case. Since you only see identical results after a restart, this _doesn't_ sound like what you're seeing.
>
> In the case of the latter (same results, differently ordered) this is expected sometimes. Solr sorts on relevance by default with the internal Lucene document ID being a tiebreaker. Both the relevance statistics and Lucene's document IDs can differ across SolrCloud replicas (due to non-deterministic conditions such as the segment merging and deleted-doc removal that Lucene does under the hood), and this can produce differently-ordered result sets for users that issue the same query repeatedly.
>
> Good luck narrowing things down!
>
> Jason
>
> On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum wrote:
> > Hi All,
> >
> > I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication factor of 2).
> > Recently, I've encountered several times that running the same query repeatedly yields different results. Restarting the nodes fixes the problem (until next time).
> > I assume that some shards are not synchronized and I have several questions:
> > 1. What can cause this - many atomic updates? issues with commits?
> > 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API call or some other method?
> >
> > Thanks in advance,
> > Ronen.
Re: Ghost Documents or Shards out of Sync
Hi Ronen,

The first thing I'd figure out in your situation is whether the results are actually different each time, or whether the ordering is what differs (which might push a particular result off the page you're looking at, giving the appearance that it didn't match).

In the case of the former, this can happen briefly if queries come in when some but not all replicas have seen a commit. But usually this is a transient concern - either waiting for the next autocommit or triggering an explicit commit resolves the discrepancy in this case. Since you only see identical results after a restart, this _doesn't_ sound like what you're seeing.

In the case of the latter (same results, differently ordered) this is expected sometimes. Solr sorts on relevance by default with the internal Lucene document ID being a tiebreaker. Both the relevance statistics and Lucene's document IDs can differ across SolrCloud replicas (due to non-deterministic conditions such as the segment merging and deleted-doc removal that Lucene does under the hood), and this can produce differently-ordered result sets for users that issue the same query repeatedly.

Good luck narrowing things down!

Jason

On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum wrote:
> Hi All,
>
> I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication factor of 2).
> Recently, I've encountered several times that running the same query repeatedly yields different results. Restarting the nodes fixes the problem (until next time).
> I assume that some shards are not synchronized and I have several questions:
> 1. What can cause this - many atomic updates? issues with commits?
> 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API call or some other method?
>
> Thanks in advance,
> Ronen.
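The tie-breaking effect described in the reply above can be simulated without Solr at all. The sketch below is plain Java, not Lucene code: the Hit class, the replica contents, the scores and the internal document IDs are all invented for illustration, and the scores are kept identical across replicas so that only the tiebreaker differs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TieBreakDemo {
    // One matching document as a single replica sees it. internalDocId stands in
    // for the Lucene document ID, which may differ from replica to replica.
    static final class Hit {
        final String id;
        final double score;
        final int internalDocId;

        Hit(String id, double score, int internalDocId) {
            this.id = id;
            this.score = score;
            this.internalDocId = internalDocId;
        }
    }

    // Sorts by score descending, then by internal doc ID ascending as tiebreaker,
    // and returns the external IDs in result order.
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score)
                .reversed()
                .thenComparingInt(h -> h.internalDocId));
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    // Hypothetical replica A: doc-1 and doc-2 tie on score; doc-1 has the lower internal ID.
    static List<Hit> replicaA() {
        return Arrays.asList(new Hit("doc-1", 1.0, 3), new Hit("doc-2", 1.0, 7), new Hit("doc-3", 0.5, 1));
    }

    // Hypothetical replica B: same documents and scores, but different internal IDs
    // (for example after different segment merges).
    static List<Hit> replicaB() {
        return Arrays.asList(new Hit("doc-1", 1.0, 9), new Hit("doc-2", 1.0, 2), new Hit("doc-3", 0.5, 5));
    }

    public static void main(String[] args) {
        System.out.println("replica A: " + rank(replicaA())); // [doc-1, doc-2, doc-3]
        System.out.println("replica B: " + rank(replicaB())); // [doc-2, doc-1, doc-3]
    }
}
```

Both replicas return the same three documents, but the two tied documents swap places depending on which replica answers. Adding an explicit secondary sort on a stable field such as the uniqueKey is the usual way to make such orderings repeatable; that is a general suggestion and was not discussed in this thread.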
Re: Getting Solr's statistic using SolrJ
Hi Steven,

AFAIK, SolrJ doesn't have built-in request objects for the metrics API. But you can still use the "GenericSolrRequest" class to hit any Solr API, e.g.:

    // Declared as ModifiableSolrParams (not the read-only SolrParams) so that set() compiles.
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("action", "list");
    GenericSolrRequest request = new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics/history", params);
    // solrClient is an existing SolrClient instance.
    final SimpleSolrResponse response = request.process(solrClient);

Hope that helps,
Jason

On Fri, Jan 22, 2021 at 11:21 AM Gael Jourdan-Weil wrote:
> Hello Steven,
>
> I believe what you are looking for cannot be accessed using SolrJ (I didn't really check though).
>
> But you can easily access it either via the Collections APIs and/or the Metrics API depending on what you need exactly.
> See https://lucene.apache.org/solr/guide/8_4/cluster-node-management.html and https://lucene.apache.org/solr/guide/8_4/metrics-reporting.html
>
> Gaël
>
> From: Steven White
> Sent: Friday, 22 January 2021 16:46
> To: solr-user@lucene.apache.org
> Subject: Getting Solr's statistic using SolrJ
>
> Hi everyone,
>
> Is there a SolrJ API that I can use to collect statistics data about Solr (everything that I see on the dashboard if possible)?
>
> I am in need to collect data about Solr instances, those same data that I see on the dashboard such as swap-memory, jvm-memory, list of cores, info about each core, etc. etc. using SolrJ API.
>
> Thanks
>
> Steven
Re: Apache Solr Reference Guide isn't accessible
And if you need something more recent while this is being fixed, you can look right at the source in GitHub, though navigation, etc. is missing:
https://github.com/apache/lucene-solr/blob/master/solr/solr-ref-guide/src/analyzers.adoc

Open Source :-)

Regards,
Alex.

On Mon, 1 Feb 2021 at 15:04, Mike Drob wrote:
> Hi Dorion,
>
> We are currently working with our infra team to get these restored. In the meantime, the 8.4 guide is still available at https://lucene.apache.org/solr/guide/8_4/ and are hopeful that the 8.8 guide will be back up soon. Thank you for your patience.
>
> Mike
>
> On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline wrote:
> > Hi,
> >
> > I can't access to Apache Solr Reference Guide since few days.
> > Example:
> > URL
> > * https://lucene.apache.org/solr/guide/8_8/
> > * https://lucene.apache.org/solr/guide/8_7/
> > Result:
> > Not Found
> > The requested URL was not found on this server.
> >
> > Do you know what going on?
> >
> > Thanks
> > Caroline Dorion
Re: Apache Solr Reference Guide isn't accessible
Hi Dorion,

We are currently working with our infra team to get these restored. In the meantime, the 8.4 guide is still available at https://lucene.apache.org/solr/guide/8_4/ and we are hopeful that the 8.8 guide will be back up soon. Thank you for your patience.

Mike

On Mon, Feb 1, 2021 at 1:58 PM Dorion Caroline wrote:
> Hi,
>
> I can't access to Apache Solr Reference Guide since few days.
> Example:
> URL
> * https://lucene.apache.org/solr/guide/8_8/
> * https://lucene.apache.org/solr/guide/8_7/
> Result:
> Not Found
> The requested URL was not found on this server.
>
> Do you know what going on?
>
> Thanks
> Caroline Dorion
Apache Solr Reference Guide isn't accessible
Hi,

I haven't been able to access the Apache Solr Reference Guide for a few days.

Example URLs:
* https://lucene.apache.org/solr/guide/8_8/
* https://lucene.apache.org/solr/guide/8_7/

Result:
Not Found
The requested URL was not found on this server.

Do you know what's going on?

Thanks
Caroline Dorion
Re: Is the lucene.apache.org link dead?
There were some issues while publishing the various bits for 8.8, and the Lucene and Solr Javadocs and Ref Guides for 8.5-8.7 are currently missing. The project is working on getting those versions back as soon as possible.

We apologize for this situation; hopefully it won’t be too long today before we have it fixed.

On Feb 1, 2021, 3:33 AM -0600, Atita Arora wrote:
> True the link is down since last week, I checked as we are currently in the state of migration to 8.7 too.
>
> On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki wrote:
> > Hi,
> > I tried to open the Solr News page to check the contents of the solr release, but it seems to get Not Found.
> > I think it's either the wrong link or the link is messed up.
> > If there is a problem, do you think you can fix it?
> >
> > Sorry if this has already been discussed somewhere.
> >
> > Solr News Page: https://lucene.apache.org/solr/news.html
> > Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
> >
> > Thank you.
> > Taisuke.
Re: NRT - Indexing
Hi,

It is not the cause of your issue, but your Solr version is 8.6.0 while solrconfig.xml mentions 7.5.0.

You wrote: "I am using a service that fetches data from the Postgres database and indexes it to solr. The service runs with a delay of 5 seconds." Do you mean you are using DIH and launching a delta-import every 5 seconds?

Solr logs may help.

Dominique

On Mon, 1 Feb 2021 at 13:00, wrote:
> Hello,
>
> I am attaching the solrconfig.xml along with this email, also I am attaching a text document that has JSON object regarding the system information I am using a service that fetches data from the Postgres database and indexes it to solr. The service runs with a delay of 5 seconds.
>
> Regards
>
> Mit freundlichen Grüssen / Kind regards
>
> Muhammad Haris Khan
>
> *VNC - Virtual Network Consult*
>
> *-- Solr Ingenieur --*
>
> On 1 February, 2021 3:50 PM, Dominique Bejean <dominique.bej...@eolya.fr> wrote:
> > Hi,
> >
> > What is your Solr version ?
> > Can you share your solrconfig.xml ?
> > How is your sharding ?
> > Did you grep your solr logs on with the "commit' pattern in order to see hard and soft commit occurrences ?
> > How are you pushing new docs or updates in the collection ?
> >
> > Regards.
> >
> > Dominique
> >
> > On Mon, 1 Feb 2021 at 08:08, wrote:
> > > Hello,
> > >
> > > Hope you're doing good. I am trying to configure NRT - Indexing in my project. For this reason, I have configured *autoSoftCommit* to execute every second and *autoCommit* to execute every 5 minutes. Everything works as expected on the dev and test server. But on the production server, there are more than 6 million documents indexed in Solr, so whenever a new document is indexed it takes 2-3 minutes before appearing in the search despite the setting I have described above. Since the target is to develop a real-time system, this delay of 2-3 minutes is not acceptable. How can I reduce this time window?
> > >
> > > Plus any advice on better scaling the Solr considering more than 6 million records would be very helpful. Thank you in advance.
> > >
> > > Mit freundlichen Grüssen / Kind regards
> > >
> > > Muhammad Haris Khan
> > >
> > > *VNC - Virtual Network Consult*
> > >
> > > *-- Solr Ingenieur --*
[ANNOUNCE] Apache Solr 8.8.0 released
Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search and analytics, rich document parsing, geospatial search, extensive REST APIs as well as parallel SQL. Solr is enterprise grade, secure and highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

The release is available for immediate download at:
https://lucene.apache.org/solr/downloads.html

Please read CHANGES.txt for a detailed list of changes:
https://lucene.apache.org/solr/8_8_0/changes/Changes.html

Solr 8.8.0 Release Highlights:

Reducing overseer bottlenecks using per-replica states. More stability and less load on large clusters that use this feature. Better restart and collection creation performance. Interleaving support in Learning To Rank.

A summary of important changes is published in the Solr Reference Guide at:
https://lucene.apache.org/solr/guide/8_8/solr-upgrade-notes.html

For the most exhaustive list, see the full release notes at https://lucene.apache.org/solr/8_8_0/changes/Changes.html or view the CHANGES.txt file accompanying the distribution. Solr's release notes usually don't include Lucene layer changes. Lucene's release notes are at https://lucene.apache.org/core/8_8_0/changes/Changes.html

Noble Paul
Re: NRT - Indexing
Hello,

I am attaching the solrconfig.xml along with this email. I am also attaching a text document that contains a JSON object with the system information. I am using a service that fetches data from the Postgres database and indexes it to Solr. The service runs with a delay of 5 seconds.

Regards

Mit freundlichen Grüssen / Kind regards

Muhammad Haris Khan
VNC - Virtual Network Consult
-- Solr Ingenieur --

On 1 February, 2021 3:50 PM, Dominique Bejean wrote:
> Hi,
>
> What is your Solr version ?
> Can you share your solrconfig.xml ?
> How is your sharding ?
> Did you grep your solr logs on with the "commit' pattern in order to see hard and soft commit occurrences ?
> How are you pushing new docs or updates in the collection ?
>
> Regards.
>
> Dominique
>
> On Mon, 1 Feb 2021 at 08:08, wrote:
> > Hello,
> >
> > Hope you're doing good. I am trying to configure NRT - Indexing in my project. For this reason, I have configured *autoSoftCommit* to execute every second and *autoCommit* to execute every 5 minutes. Everything works as expected on the dev and test server. But on the production server, there are more than 6 million documents indexed in Solr, so whenever a new document is indexed it takes 2-3 minutes before appearing in the search despite the setting I have described above. Since the target is to develop a real-time system, this delay of 2-3 minutes is not acceptable. How can I reduce this time window?
> >
> > Plus any advice on better scaling the Solr considering more than 6 million records would be very helpful. Thank you in advance.
> >
> > Mit freundlichen Grüssen / Kind regards
> >
> > Muhammad Haris Khan
> >
> > *VNC - Virtual Network Consult*
> >
> > *-- Solr Ingenieur --*
Re: Tweaking Shards and Replicas for high volume queries and updates
Hi,

Some suggestions.

* 64GB JVM Heap
Are you sure you really need this heap size? Did you check your GC logs (with gceasy.io)? A best practice is to keep the heap as small as possible and never larger than 31 GB.

* OS Caching
Did you set swappiness to 1?

* Put two instances of Solr on each node
You need to check resource usage in order to evaluate whether it would be worthwhile (CPU usage, CPU load average, CPU iowait, heap usage, disk I/O read and write, MMAP caching, ...).
A high load average with low CPU load suggests that disk I/O may be the bottleneck. I would consider increasing the number of physical servers, with less CPU, RAM and disk space on each (but globally the same quantity of CPU, RAM and disk space). This will increase the disk I/O capacity.

* Collection 4 is the trouble collection
Try to have smaller cores (more shards if you increase the number of Solr instances).
Investigate time routed or category routed aliases, if they match your update strategy and/or your query profiles.

Work again on the schema:
- For docValues=true fields, check if you really need indexed=true and stored=true (there are a lot of considerations to take into account), ...
- Over-indexing with copyField?

Work on queries: facets, group, collapse, fl=, rows=, ...

Regards

Dominique

On Wed, 27 Jan 2021 at 14:53, Hollowell,Skip wrote:
> 30 Dedicated physical Nodes in the Solr Cloud Cluster, all of identical configuration
> Server01 RHEL 7.x
> 256GB RAM
> 10 2TB Spinning Disk in a RAID 10 Configuration (Leaving us 9.8TB usable per node)
> 64GB JVM Heap, Tried as high as 100GB, but it appeared that 64GB was faster. If we set a higher heap, do we starve the OS for caching?
> Huge Pages is off on the system, and thus UseLargePages is off on Solr Startup
> G1GC, Java 11 (ZGC with Java 15 and HugePages turned on was a disaster. We suspect it was due to the Huge Pages configuration)
>
> At one time we discussed putting two instances of Solr on each node, giving us a cloud of 60 instances instead of 30. Load Average is high on these nodes during certain types of queries or updates, but CPU Load is relatively low and should be able to accommodate a second instance, but all the data would still be on the same RAID10 group of disks.
>
> Collection 4 is the trouble collection. It has nearly a billion documents, and there are between 200 and 400 million updates every day. How do we get that kind of update performance, and still serve 10 million queries a day? Schemas have been reviewed and re-reviewed to ensure we are only indexing and storing what is absolutely necessary. What are we missing? Do we need to revisit our replica policy? Number of replicas or types of replicas (to ensure some are only used for reading, etc?)
>
> [Grabbed from the Admin UI]
> 755.6Gb Index Size according to Solr Cloud UI
> Total #docs: 371.8mn
> Avg size/doc: 2.1Kb
>
> 90 Shards, 2 NRT Replicas per Shard, 1,750,612,476 documents, avg size/doc: 1.7Kb, uses nested documents
> collection-1_s69r317 31.1Gb
> collection-1_s49r96 30.7Gb
> collection-1_s78r154 30.2Gb
> collection-1_s40r259 30.1Gb
> collection-1_s9r197 29.1Gb
> collection-1_s18r34 28.9Gb
>
> 120 Shards, 2 TLOG Replicas per Shard, 2,230,207,046 documents, avg size/doc: 1.3Kb
> collection-2_s78r154 22.8Gb
> collection-2_s49r96 22.8Gb
> collection-2_s46r331 22.8Gb
> collection-2_s18r34 22.7Gb
> collection-2_s109r216 22.7Gb
> collection-2_s104r447 22.7Gb
> collection-2_s15r269 22.7Gb
> collection-2_s73r385 22.7Gb
>
> 120 Shards, 2 TLOG Replicas per Shard, 733,588,503 documents, avg size/doc: 1.9Kb
> collection-3_s19r277 10.6Gb
> collection-3_s108r214 10.6Gb
> collection-3_s48r94 10.6Gb
> collection-3_s109r457 10.6Gb
> collection-3_s47r333 10.5Gb
> collection-3_s78r154 10.5Gb
> collection-3_s18r34 10.5Gb
> collection-3_s77r393 10.5Gb
>
> 120 Shards, 2 TLOG Replicas per Shard, 864,372,654 documents, avg size/doc: 5.6Kb
> collection-4_s109r216 38.7Gb
> collection-4_s100r439 38.7Gb
> collection-4_s49r96 38.7Gb
> collection-4_s35r309 38.6Gb
> collection-4_s18r34 38.6Gb
> collection-4_s78r154 38.6Gb
> collection-4_s7r253 38.6Gb
> collection-4_s69r377 38.6Gb
Re: NRT - Indexing
Hi,

What is your Solr version?
Can you share your solrconfig.xml?
How is your sharding set up?
Did you grep your Solr logs for the "commit" pattern in order to see hard and soft commit occurrences?
How are you pushing new docs or updates to the collection?

Regards.

Dominique

On Mon, 1 Feb 2021 at 08:08, wrote:
> Hello,
>
> Hope you're doing good. I am trying to configure NRT - Indexing in my project. For this reason, I have configured *autoSoftCommit* to execute every second and *autoCommit* to execute every 5 minutes. Everything works as expected on the dev and test server. But on the production server, there are more than 6 million documents indexed in Solr, so whenever a new document is indexed it takes 2-3 minutes before appearing in the search despite the setting I have described above. Since the target is to develop a real-time system, this delay of 2-3 minutes is not acceptable. How can I reduce this time window?
>
> Plus any advice on better scaling the Solr considering more than 6 million records would be very helpful. Thank you in advance.
>
> Mit freundlichen Grüssen / Kind regards
>
> Muhammad Haris Khan
>
> *VNC - Virtual Network Consult*
>
> *-- Solr Ingenieur --*
Re: NRT - Indexing
I'm running into the same issue. I've set autoSoftCommit and autoCommit, but the speed at which docs become searchable seems to be inconsistent with the settings. I have lowered autoCommit to a minute, but it still takes a few minutes for docs to show up after indexing. The soft commit settings also seem to have no effect (from what I understand of the docs, a soft commit makes items viewable), but I'm not seeing them until well after the autoCommit period has passed.

On Mon, 1 Feb 2021 at 15:08, wrote:
> Hello,
>
> Hope you're doing good. I am trying to configure NRT - Indexing in my project. For this reason, I have configured *autoSoftCommit* to execute every second and *autoCommit* to execute every 5 minutes. Everything works as expected on the dev and test server. But on the production server, there are more than 6 million documents indexed in Solr, so whenever a new document is indexed it takes 2-3 minutes before appearing in the search despite the setting I have described above. Since the target is to develop a real-time system, this delay of 2-3 minutes is not acceptable. How can I reduce this time window?
>
> Plus any advice on better scaling the Solr considering more than 6 million records would be very helpful. Thank you in advance.
>
> Mit freundlichen Grüssen / Kind regards
>
> Muhammad Haris Khan
>
> *VNC - Virtual Network Consult*
>
> *-- Solr Ingenieur --*
Re: Is the lucene.apache.org link dead?
True, the link has been down since last week. I checked because we are currently migrating to 8.7 too.

On Mon, Feb 1, 2021 at 6:57 AM Taisuke Miyazaki wrote:
> Hi,
> I tried to open the Solr News page to check the contents of the solr release, but it seems to get Not Found.
> I think it's either the wrong link or the link is messed up.
> If there is a problem, do you think you can fix it?
>
> Sorry if this has already been discussed somewhere.
>
> Solr News Page: https://lucene.apache.org/solr/news.html
> Dead Link: https://lucene.apache.org/solr/8_7_0/changes/Changes.html
>
> Thank you.
> Taisuke.
Re: Performance issue with Solr 8.6.1 Unified Highlighter does not occur on Solr 6.
Hi David,

Thanks for filing this issue. The classic non-weightMatcher mode works well for us right now. Yes, we are using the POSTINGS mode for most of the fields although explicitly mentioning it gives an error since not all fields are indexed with offsets. So I guess the highlighter is picking the right choice for each field.

Here is the test with hl.offsetSource=ANALYSIS and hl.weightMatches=false that you requested:

hl.offsetSource=ANALYSIS&hl.weightMatches=false (340 ms)

The above is thus better than the original highlighter. I'll also try and create that PR soon.