Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
Congratulations Jan! ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote: > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC nominated > and elected Jan Høydahl for the position of the Solr PMC Chair and Vice > President. This decision was approved by the board in its February 2021 > meeting. > > Congratulations Jan! > > -- > Anshum Gupta >
Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
Congratulations and thank you, Jan! It is so exciting that Solr is now a TLP! Mike McCandless http://blog.mikemccandless.com On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote: > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC > nominated and elected Jan Høydahl for the position of the Solr PMC Chair > and Vice President. This decision was approved by the board in its February > 2021 meeting. > > Congratulations Jan! > > -- > Anshum Gupta >
Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
Congratulations Jan! -- Steve > On Feb 18, 2021, at 1:55 PM, Anshum Gupta wrote: > > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC nominated > and elected Jan Høydahl for the position of the Solr PMC Chair and Vice > President. This decision was approved by the board in its February 2021 > meeting. > > Congratulations Jan! > > -- > Anshum Gupta
Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
Yes, Congratulations and a big thank you Jan! On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote: > > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC nominated > and elected Jan Høydahl for the position of the Solr PMC Chair and Vice > President. This decision was approved by the board in its February 2021 > meeting. > > Congratulations Jan! > > -- > Anshum Gupta
Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!
Hi everyone, I’d like to inform everyone that the newly formed Apache Solr PMC nominated and elected Jan Høydahl for the position of the Solr PMC Chair and Vice President. This decision was approved by the board in its February 2021 meeting. Congratulations Jan! -- Anshum Gupta
[ANNOUNCE] Apache Solr TLP Created
Hi everyone, On behalf of the Apache Lucene PMC, and the newly formed Apache Solr PMC, I’d like to inform folks that the ASF board has approved the resolution to create the Solr TLP (Top Level Project). We are currently working on the next steps but would like to assure the community that they can continue to expect critical bug fixes for releases previously made under the Apache Lucene project. We will send another update as the mailing lists and website are set up for the Solr project. -Anshum On behalf of the Apache Lucene and Solr PMC
Re: Dynamic starting or stopping of zookeepers in a cluster
On 2/18/2021 8:20 AM, DAVID MARTIN NIETO wrote: We have a Solr cluster with 4 Solr servers and 5 ZooKeepers in HA mode. We tested whether our cluster can maintain the service with only half of the cluster, in case of disaster or similar, and we have a problem with the ZooKeeper config and its static configuration. The start script of the 4 Solr servers lists the 5 ip:port addresses of the cluster's 5 ZooKeepers, so when we "lose" half of the machines (we have 2 ZooKeepers on one machine and 3 on another), in the worst case we lose 3 of these 5 ZooKeepers. We can start a sixth ZooKeeper (to have 3 with half the cluster stopped), but to add it to the Solr servers we need to stop and restart them with a new ip:port list, and that is not automatic or dynamic. In order to have a highly available zookeeper, you must have at least three separate physical servers for ZK. Running multiple zookeepers on one physical machine gains you nothing, because if the whole machine fails, you lose all of those zookeepers. If you have three physical servers, one can fail with no problems. If you have five separate physical servers running ZK, then two of the machines can fail without taking the cluster down. Does somebody know another configuration or workaround to have a dynamic list of ZooKeepers and start or stop some of them without changing the config and restarting the Solr servers? The Zookeeper client was upgraded to 3.5 in Solr 8.2.0. https://issues.apache.org/jira/browse/SOLR-8346 If you're running at least Solr 8.2.0, and your ZK servers are at least version 3.5, then ZK should support dynamic cluster reconfiguration. The ZK status page in the admin UI may have some problems after ZK undergoes a dynamic reconfiguration, but SolrCloud's core functionality should work fine. Thanks, Shawn
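The dynamic reconfiguration Shawn refers to is ZooKeeper 3.5's reconfig API. A minimal sketch follows; it assumes ZooKeeper >= 3.5 on every node, and the hostnames, ports, and server ids are placeholders, not values from this thread:

```shell
# zoo.cfg on each ZooKeeper node: allow runtime membership changes (ZK >= 3.5)
reconfigEnabled=true

# Add a sixth participant to the ensemble without restarting Solr;
# run from any machine with the ZK CLI (hosts/ports are illustrative).
zkCli.sh -server zk1.example.com:2181 \
  reconfig -add "server.6=zk6.example.com:2888:3888;2181"

# Drop a failed server (here id 3) from the ensemble:
zkCli.sh -server zk1.example.com:2181 reconfig -remove 3
```

Note that from ZK 3.5.3 onward the reconfig command is restricted to authenticated super users unless skipACL is enabled, so an authenticated session may be required.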
RE: Autoscaling working Examples
Hi Florian, We have a Solr Cloud installation with 2 nodes, each in a separate Availability Zone. We run one shard for each collection, with replicationFactor set to 2 and autoAddReplicas set to True. These are the only settings we make other than using our config set. Typical settings for our collections: Shard Count: 1, configName: ourConfig, replicationFactor: 2, maxShardsPerNode: 1, router: compositeId, autoAddReplicas: True. Hope this helps, Shane -Original Message- From: Florian Gleixner Sent: Thursday, February 18, 2021 9:03 AM To: solr-user@lucene.apache.org Subject: Autoscaling working Examples Hi, I'm trying to get autoscaling running, but I always get more and more replicas. Some shards have two replicas, some have up to 7 replicas while the replicationFactor of the collection is 2. Is there a working example that keeps shard replicas distributed over 2 availability zones with exactly one replica in every zone? Florian
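A zone-aware placement policy of the sort Florian asks about might look like the sketch below, under the Solr 8.x autoscaling framework. This is heavily assumption-laden: the sysprop name availability_zone must match a -Davailability_zone=... system property set on each node at startup, and the "<2" rule caps each zone at one replica per shard:

```json
{
  "set-cluster-policy": [
    {"replica": "<2", "shard": "#EACH", "sysprop.availability_zone": "#EACH"}
  ]
}
```

POSTed to /solr/admin/autoscaling; combined with replicationFactor=2, the intent is one replica of every shard in each of the two zones.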
Dynamic starting or stopping of zookeepers in a cluster
Hi all, We have a Solr cluster with 4 Solr servers and 5 ZooKeepers in HA mode. We tested whether our cluster can maintain the service with only half of the cluster, in case of disaster or similar, and we have a problem with the ZooKeeper config and its static configuration. The start script of the 4 Solr servers lists the 5 ip:port addresses of the cluster's 5 ZooKeepers, so when we "lose" half of the machines (we have 2 ZooKeepers on one machine and 3 on another), in the worst case we lose 3 of these 5 ZooKeepers. We can start a sixth ZooKeeper (to have 3 with half the cluster stopped), but to add it to the Solr servers we need to stop and restart them with a new ip:port list, and that is not automatic or dynamic. Does somebody know another configuration or workaround to have a dynamic list of ZooKeepers and start or stop some of them without changing the config and restarting the Solr servers? Kind regards and thanks a lot. --- This message is addressed only to the person (people) and/or entities listed above. It may contain confidential or legally protected information. If you are not the recipient indicated, please delete the message and its copies immediately from the system. We also inform that any copy, disclosure, distribution or use of the contents is forbidden. ---
Autoscaling working Examples
Hi, I'm trying to get autoscaling running, but I always get more and more replicas. Some shards have two replicas, some have up to 7 replicas while the replicationFactor of the collection is 2. Is there a working example that keeps shard replicas distributed over 2 availability zones with exactly one replica in every zone? Florian
Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index
IMO enableLazyFieldLoading is a small optimization for most apps. It saves memory in the document cache at the expense of increased latency if your usage pattern wants a field later that wasn't requested earlier. You'd probably need detailed metrics/benchmarks to observe a difference, and you might reach a conclusion that enableLazyFieldLoading is best at "false" for you irrespective of the bug. I suspect it may have been developed for particularly large document use-cases where you don't normally need some large text fields for retrieval/highlighting. For example imagine if you stored the entire input data as JSON in a _json_ field or some-such. Nowadays, I'd set large="true" on such a field, which is a much newer option. I was able to tweak my test to have only alphabetic IDs, and the test still failed. I don't see how the ID's contents/format could cause any effect. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Feb 18, 2021 at 5:04 AM Nussbaum, Ronen wrote: > You're right, I was able to reproduce it too without highlighting. > Regarding the existing bug, I think there might be an additional issue > here because it happens only when id field contains an underscore (didn't > check for other special characters). > Currently I have no other choice but to use enableLazyFieldLoading=false. > I hope it wouldn't have a significant performance impact. > > -Original Message- > From: David Smiley > Sent: Thursday, 18 February 2021 01:03 > To: solr-user > Subject: Re: Atomic Update (nested), Unified Highlighter and Lazy Field > Loading => Invalid Index > > I think the issue is this existing bug, but needs to refer to > toSolrInputDocument instead of toSolrDoc: > https://issues.apache.org/jira/browse/SOLR-13034 > Highlighting isn't involved; you just need to somehow get a document > cached with lazy fields. In a test I was able to do this simply by doing a > query that only returns the "id" field. No highlighting.
> > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley > > > On Wed, Feb 17, 2021 at 10:28 AM David Smiley wrote: > > > Thanks for more details. I was able to reproduce this locally! I > > hacked a test to look similar to what you are doing. BTW it's okay to > > fill out a JIRA imperfectly; they can always be edited :-). Once I > > better understand the nature of the bug today, I'll file an issue and respond with it here. > > > > ~ David Smiley > > Apache Lucene/Solr Search Developer > > http://www.linkedin.com/in/davidwsmiley > > > > > > On Wed, Feb 17, 2021 at 6:36 AM Nussbaum, Ronen > > > > wrote: > > > >> Hello David, > >> > >> Thank you for your reply. > >> It was very hard but finally I discovered how to reproduce it. I > >> thought of filing an issue but wasn't sure about the components and priority. > >> I used the "tech products" configset, with the following changes: > >> 1. Added <fieldType name="_nest_path_" class="solr.NestPathField" /> 2. Added <field name="text_en" type="text_en" indexed="true" > >> stored="true" termVectors="true" termOffsets="true" termPositions="true" > >> required="false" multiValued="true" /> Then I inserted one document > >> with a nested child e.g. > >> {id:"abc_1", utterances:{id:"abc_1-1", text_en:"Solr is great"}} > >> > >> To reproduce: > >> Do a search with surround and unified highlighter: > >> > >> hl.fl=text_en&hl.method=unified&hl=on&q=%7B!surround%7Dtext_en%3A4W("solr"%2C"great") > >> > >> Now, try to update the parent e.g. {id:"abc_1", categories_i:{add:1}} > >> > >> Important: it happens only when "id" contains underscore characters! > >> If you use "abc-1" it would work. > >> > >> Thanks in advance, > >> Ronen.
> >> > >> -Original Message- > >> From: David Smiley > >> Sent: Sunday, 14 February 2021 19:17 > >> To: solr-user > >> Subject: Re: Atomic Update (nested), Unified Highlighter and Lazy > >> Field Loading => Invalid Index > >> > >> Hello Ronen, > >> > >> Can you please file a JIRA issue? Some quick searches did not turn > >> anything up. It would be super helpful to me if you could list a > >> series of steps with Solr out-of-the-box in 8.8 including what data > >> to index and query. Solr already includes the "tech products" sample > >> data; maybe that can illustrate the problem? It's not clear if > >> nested schema or nested docs are actually required in your example. > >> If you share the JIRA issue with me, I'll chase this one down. > >> > >> ~ David Smiley > >> Apache Lucene/Solr Search Developer > >> http://www.linkedin.com/in/davidwsmiley > >> > >> > >> On Sun, Feb 14, 2021 at 11:16 AM Ronen Nussbaum > >> wrote: > >> > >> > Hi All, > >> > > >> > I discovered a strange behaviour with this combination. > >> > Not only the atomic update fails, the child documents are not > >> > properly indexed, and you can't use highlights on their text > >> > fields. Currently there is no workaround other than reindex. > >> > > >> > Checked
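The two knobs David contrasts live in different files; a hedged sketch under stated assumptions (the _json_ field name and string type are hypothetical, following his example):

```xml
<!-- solrconfig.xml (inside <query>): the document-cache lazy-loading
     toggle discussed in the thread; set to false to sidestep the bug. -->
<enableLazyFieldLoading>true</enableLazyFieldLoading>

<!-- managed-schema: a large stored blob kept out of the document cache
     via large="true" (available since Solr 7.5; requires stored="true"
     and a single-valued field). -->
<field name="_json_" type="string" indexed="false" stored="true" large="true"/>
```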
Re: Solr 8.0 query length limit
Thanks Alex and Shawn. Regards, Anuj On Thu, 18 Feb 2021 at 18:57, Shawn Heisey wrote: > On 2/18/2021 3:38 AM, Anuj Bhargava wrote: > > Solr 8.0 query length limit > > > > We are having an issue where queries are too big, we get no result. And > if > > we remove a few keywords we get the result. > > The best option is to convert the request to POST, as Thomas suggested. > With that, the query parameters could be up to 2 megabytes in size > with no config changes. > > The limit for this is enforced by Jetty -- the servlet container that > Solr ships with. If you cannot switch your requests to POST, then you > can find the following line in server/etc/jetty.xml, adjust it, and > restart Solr: > > <Set name="requestHeaderSize"><Property name="solr.jetty.request.header.size" default="8192" /></Set> > > A header limit of 8KB is found in nearly all web servers and related > software, like load balancers. > > Thanks, > Shawn >
Re: Cannot find Solr 7.4.1 release
On 2/18/2021 1:05 AM, Olivier Tavard wrote: I wanted to download Solr 7.4.1, but I cannot find the 7.4.1 release in http://archive.apache.org/dist/lucene/solr/ : there is Solr 7.4 and then directly 7.5. Of course I can build from source code, but this is frustrating because I can see that in branch_7_4 there is a fix that I need (SOLR-12594) with the status fixed in the 7.4.1 and 7.5 versions. Everything seems to have been prepared to release 7.4.1, but I cannot find it. Does this release exist? That release does not exist. There was never any discussion about it on the dev list. 7.4.1 was added to Jira for tracking purposes, and the code change for that issue was saved to branch_7_4 just in case somebody felt a 7.4.1 release was required. That issue deals with a problem in metrics, which is outside of basic Solr functionality -- not critical enough to warrant a point release. The release process for 7.5.0 was underway about a month after that issue was committed. If 7.5.0 (or one of the many later releases) will not work for your needs, then you will need to compile branch_7_4 yourself. I have used custom-compiled versions before in production because we needed a bugfix that was not deemed severe enough for a new point release. You can create binary packages similar to what is available for download by running "ant package" in the solr directory of your code checkout. I think that build target only works on *NIX systems -- Windows is missing some of the required pieces. Thanks, Shawn
Re: Solr 8.0 query length limit
On 2/18/2021 3:38 AM, Anuj Bhargava wrote: Solr 8.0 query length limit We are having an issue where queries are too big, we get no result. And if we remove a few keywords we get the result. The best option is to convert the request to POST, as Thomas suggested. With that, the query parameters could be up to 2 megabytes in size with no config changes. The limit for this is enforced by Jetty -- the servlet container that Solr ships with. If you cannot switch your requests to POST, then you can find the following line in server/etc/jetty.xml, adjust it, and restart Solr: <Set name="requestHeaderSize"><Property name="solr.jetty.request.header.size" default="8192" /></Set> A header limit of 8KB is found in nearly all web servers and related software, like load balancers. Thanks, Shawn
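Shawn's suggestion can be automated on the client side: send the parameters as a form-encoded POST body whenever the encoded URI would overflow the usual 8 KB limit. A standard-library-only sketch; the URL, core name, and exact threshold are assumptions for illustration:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Illustrative cutoff: Jetty's default request-header limit is 8192 bytes.
MAX_URI_BYTES = 8192

def build_select_request(base_url, params):
    """Build a Solr /select request, falling back to a form-encoded POST
    when the query string would overflow a typical URI limit (sketch only)."""
    qs = urlencode(params)
    if len(base_url) + 1 + len(qs) <= MAX_URI_BYTES:
        return Request(base_url + "?" + qs, method="GET")
    # Same parameters, carried in the body; Solr accepts form-encoded POSTs.
    return Request(
        base_url,
        data=qs.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

small = build_select_request("http://localhost:8983/solr/mycore/select", {"q": "*:*"})
big = build_select_request(
    "http://localhost:8983/solr/mycore/select",
    {"q": " OR ".join("keyword%d" % i for i in range(2000))},
)
```

Here small stays a GET, while big (thousands of OR'ed keywords) switches to POST and carries the identical parameters in its body, so no jetty.xml change is needed.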
Re: Solr 8.0 query length limit
Also, investigate if you have repeating conditions and push those into defaults in custom request handler endpoints (in solrconfig.xml). Also, Solr supports parameter substitutions, if you have repeated subconditions. Regards, Alex On Thu., Feb. 18, 2021, 7:08 a.m. Thomas Corthals, wrote: > You can send big queries as a POST request instead of a GET request. > > On Thu, 18 Feb 2021 at 11:38, Anuj Bhargava wrote: > > > Solr 8.0 query length limit > > > > We are having an issue where queries are too big, we get no result. And > if > > we remove a few keywords we get the result. > > > > Error we get - error 414 (Request-URI Too Long) > > > > Have made the following changes in jetty.xml, still the same error > > > > <Property name="solr.jetty.output.buffer.size" default="32768" /> > > <Property name="solr.jetty.output.aggregation.size" default="32768" /> > > <Property name="solr.jetty.request.header.size" default="65536" /> > > <Property name="solr.jetty.response.header.size" default="32768" /> > > <Property name="solr.jetty.send.server.version" default="false" /> > > <Property name="solr.jetty.send.date.header" default="false" /> > > <Property name="solr.jetty.header.cache.size" default="1024" /> > > <Property name="solr.jetty.delayDispatchUntilContent" default="false"/> > >
Re: How to get case-sensitive Terms?
The Terms component does not run the analysis chain; it expects tokenized values, because it matches what is returned by faceting. So I would check whether that field is string or text, and the difference in processing. Enabling debug will also show the difference in the final expanded form. Regards, Alex P.S. It is better to start new question threads for new questions. More people will pay attention. On Thu., Feb. 18, 2021, 1:31 a.m. elivis, wrote: > Alexandre Rafalovitch wrote > > What about copyField with the target being index only (docValue only?) > and > > no lowercase on the target field type? > > > > Solr is not a database, you are optimising for search. So duplicate, > > multi-process, denormalise, create custom field types, etc. > > > > Regards, > > Alex > > Thank you! > > One more question - when we index data, we have some other fields that we > are populating. Our data comes from different inputs, so one of those fields > is a data source ID that the text came from. When we do a search, we are able > to get search results specific to only that data source by adding a filter > query (e.g. fq=image_id:1). However, that doesn't seem to work when doing a > terms query - I always get the terms from the entire index. Is there a way > to filter the terms? > > Thank you again. > > -- > Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html >
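Since the Terms component ignores fq entirely, the per-source term listing elivis wants can be approximated with field faceting, which does respect filters. A hedged sketch of such a request; the field name is hypothetical, while fq=image_id:1 comes from the thread's example:

```text
/solr/<collection>/select?q=*:*&fq=image_id:1&rows=0
    &facet=true&facet.field=my_text_field&facet.limit=100&facet.mincount=1
```

One caveat: facet counts are document frequencies (documents containing the term), not total term frequencies, so the numbers will differ from Terms component output even where the term lists agree.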
Re: Solr 8.0 query length limit
You can send big queries as a POST request instead of a GET request. On Thu, 18 Feb 2021 at 11:38, Anuj Bhargava wrote: > Solr 8.0 query length limit > > We are having an issue where queries are too big, we get no result. And if > we remove a few keywords we get the result. > > Error we get - error 414 (Request-URI Too Long) > > Have made the following changes in jetty.xml, still the same error > > <Property name="solr.jetty.output.buffer.size" default="32768" /> > <Property name="solr.jetty.output.aggregation.size" default="32768" /> > <Property name="solr.jetty.request.header.size" default="65536" /> > <Property name="solr.jetty.response.header.size" default="32768" /> > <Property name="solr.jetty.send.server.version" default="false" /> > <Property name="solr.jetty.send.date.header" default="false" /> > <Property name="solr.jetty.header.cache.size" default="1024" /> > <Property name="solr.jetty.delayDispatchUntilContent" default="false"/> >
Solr 8.0 query length limit
Solr 8.0 query length limit We are having an issue where queries are too big, we get no result. And if we remove a few keywords we get the result. Error we get - error 414 (Request-URI Too Long) Have made the following changes in jetty.xml, still the same error <Property name="solr.jetty.output.buffer.size" default="32768" /> <Property name="solr.jetty.output.aggregation.size" default="32768" /> <Property name="solr.jetty.request.header.size" default="65536" /> <Property name="solr.jetty.response.header.size" default="32768" /> <Property name="solr.jetty.send.server.version" default="false" /> <Property name="solr.jetty.send.date.header" default="false" /> <Property name="solr.jetty.header.cache.size" default="1024" /> <Property name="solr.jetty.delayDispatchUntilContent" default="false"/>
RE: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index
You're right, I was able to reproduce it too without highlighting. Regarding the existing bug, I think there might be an additional issue here because it happens only when the id field contains an underscore (didn't check for other special characters). Currently I have no other choice but to use enableLazyFieldLoading=false. I hope it wouldn't have a significant performance impact. -Original Message- From: David Smiley Sent: Thursday, 18 February 2021 01:03 To: solr-user Subject: Re: Atomic Update (nested), Unified Highlighter and Lazy Field Loading => Invalid Index I think the issue is this existing bug, but needs to refer to toSolrInputDocument instead of toSolrDoc: https://issues.apache.org/jira/browse/SOLR-13034 Highlighting isn't involved; you just need to somehow get a document cached with lazy fields. In a test I was able to do this simply by doing a query that only returns the "id" field. No highlighting. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Wed, Feb 17, 2021 at 10:28 AM David Smiley wrote: > Thanks for more details. I was able to reproduce this locally! I > hacked a test to look similar to what you are doing. BTW it's okay to > fill out a JIRA imperfectly; they can always be edited :-). Once I > better understand the nature of the bug today, I'll file an issue and respond > with it here. > > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley > > On Wed, Feb 17, 2021 at 6:36 AM Nussbaum, Ronen > > wrote: > >> Hello David, >> >> Thank you for your reply. >> It was very hard but finally I discovered how to reproduce it. I >> thought of filing an issue but wasn't sure about the components and >> priority. >> I used the "tech products" configset, with the following changes: >> 1. Added <fieldType name="_nest_path_" class="solr.NestPathField" /> 2. Added <field name="text_en" type="text_en" indexed="true" >> stored="true" termVectors="true" termOffsets="true" termPositions="true" >> required="false" multiValued="true" /> Then I inserted one document >> with a nested child e.g. >> {id:"abc_1", utterances:{id:"abc_1-1", text_en:"Solr is great"}} >> >> To reproduce: >> Do a search with surround and unified highlighter: >> >> hl.fl=text_en&hl.method=unified&hl=on&q=%7B!surround%7Dtext_en%3A4W("solr"%2C"great") >> >> Now, try to update the parent e.g. {id:"abc_1", categories_i:{add:1}} >> >> Important: it happens only when "id" contains underscore characters! >> If you use "abc-1" it would work. >> >> Thanks in advance, >> Ronen. >> >> -Original Message- >> From: David Smiley >> Sent: Sunday, 14 February 2021 19:17 >> To: solr-user >> Subject: Re: Atomic Update (nested), Unified Highlighter and Lazy >> Field Loading => Invalid Index >> >> Hello Ronen, >> >> Can you please file a JIRA issue? Some quick searches did not turn >> anything up. It would be super helpful to me if you could list a >> series of steps with Solr out-of-the-box in 8.8 including what data >> to index and query. Solr already includes the "tech products" sample >> data; maybe that can illustrate the problem? It's not clear if >> nested schema or nested docs are actually required in your example. >> If you share the JIRA issue with me, I'll chase this one down. >> >> ~ David Smiley >> Apache Lucene/Solr Search Developer >> http://www.linkedin.com/in/davidwsmiley >> >> On Sun, Feb 14, 2021 at 11:16 AM Ronen Nussbaum >> wrote: >> >> > Hi All, >> > >> > I discovered a strange behaviour with this combination. >> > Not only the atomic update fails, the child documents are not >> > properly indexed, and you can't use highlights on their text >> > fields. Currently there is no workaround other than reindex. >> > >> > Checked on 8.3.0, 8.6.1 and 8.8.0. >> > 1. Configure nested schema. >> > 2. enableLazyFieldLoading is true (default). >> > 3. Run a search with hl.method=unified and hl.fl=<text fields> >> > 4. Trying to do an atomic update on some of the *parent* fields of >> > the returned documents from #3. >> > >> > You get an error: "TransactionLog doesn't know how to serialize >> > class org.apache.lucene.document.LazyDocument$LazyField". >> > >> > Now trying to run #3 again yields an error message that the text >> > field is indexed without positions. >> > >> > If enableLazyFieldLoading is false or if using the default >> > highlighter this doesn't happen. >> > >> > Ronen. >> >> This electronic message may contain proprietary and confidential >> information of Verint Systems Inc., its affiliates and/or >> subsidiaries. The information is intended to be for the use of the >> individual(s) or >> entity(ies) named above. If you are not the intended recipient (or >> authorized to receive this e-mail for the intended recipient), you >> may not use, copy, disclose or distribute to anyone this message or >> any information contained in this message. If you have received this >> electronic message in error, please notify us by
Cannot find Solr 7.4.1 release
Hi, I wanted to download Solr 7.4.1, but I cannot find the 7.4.1 release in http://archive.apache.org/dist/lucene/solr/ : there is Solr 7.4 and then directly 7.5. Of course I can build from source code, but this is frustrating because I can see that in branch_7_4 there is a fix that I need (SOLR-12594) with the status fixed in the 7.4.1 and 7.5 versions. Everything seems to have been prepared to release 7.4.1, but I cannot find it. Does this release exist? Thank you, Olivier