Re: Solr Geospatial Polygon Indexing/Querying Issue
On Tue, Jul 30, 2019 at 4:41 PM Sanders, Marshall (CAI - Atlanta) <marshall.sande...@coxautoinc.com> wrote:

> I'll explain the context around the use case we're trying to solve and then attempt to respond as best I can to each of your points. What we have is a list of documents where the location is sometimes a point and sometimes a circle. These basically represent (in our case) inventory at a physical location (point) or inventory that can be delivered to you within X km (configurable per document), which represents the circle use case. We want to be able to allow a user to say "I want all documents within X distance of my location," but also all documents that can be delivered to that point, where the delivery distance is defined on the inventory (creating the circle).

That background info helps me understand things!

> This is why we were actually trying to combine both point-based data and poly/circle data into a single geospatial field, since I don't believe you could do something like fq=geofilt(latlng, x, y, d) OR geofilt(latlngCircle, x, y, 1) -- but perhaps we're just not getting quite the right syntax, etc.

Oh quite possible :-). It would look something like this:

fq= {!geofilt sfield=latLng d=queryDistance} OR {!geofilt sfield=latLngCircle d=0}&pt=myLocation

Notice the space after the fq=, which is critical so that the first local-params (i.e. the first geofilt) does not "own" the entire filter query string end to end. Due to the space, the whole thing is parsed by the default lucene/standard query parser, and then we have the two clauses clearly there. The second geofilt has distance 0; it'd be nice if it internally optimized to a point, but nonetheless it's fine. Alternatively there's another syntax to embed WKT where you can specify a point explicitly... something like this: ...
{!field f=latLngCircle v="Intersects(POINT(x y))"}

That said, it's also just fine to do as you were planning -- have one RPT-based field for the shape representation (mixture of points and circles), and one LLPSF field purely for the center point that is used for sorting. That LLPSF field would be indexed=false docValues=true since you wouldn't be filtering on it.

> * Generally RptWithGeometrySpatialField should be used over SpatialRecursivePrefixTreeFieldType unless you want heatmaps or are willing to make trade-offs in higher index size and lossy precision in order to get faster search. It's up to you; if you benchmark both I'd love to hear how it went.
>
> - We may explore both, but typically we're more interested in speed than accuracy; benchmarking may be a very interesting exercise, however. For sorting, for instance, we're actually using sqedist instead of geodist because we're not overly concerned about sorting accuracy.

Okay... though geodist on an LLPSF field is remarkably optimized.

> * I see you are using Geo3D, which is not the default. Geo3D is strict about the coordinate order -- counter-clockwise. Your triangle is clockwise and thus it has an inverted interpretation -- thus it's a shape that covers nearly the whole globe. I recently documented this https://issues.apache.org/jira/browse/SOLR-13467 but it's not published yet since it's so new.
>
> - Thanks for this clarification as well. I had read this in the WKT docs too; again, something we tried but really weren't sure what the right answer was, and we had been going back and forth on it. The documentation seems to specify that you need to specify either JTS or Geo3D, but doesn't provide much info/guidance about which to use when, and since JTS requires adding another jar manually and therefore complicates our build process significantly (at least vs. using Geo3D), we tried Geo3D.
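[Editor's note: for reference, the two-field setup suggested above (one RPT-based field for the point/circle shapes, one LLPSF field for the sort point) might look like the following schema sketch. Field names and gram attributes are illustrative, not from the thread; only the two class names come from the discussion.]

```xml
<!-- Shape field: mixture of points and circles (buffered points), used for filtering -->
<fieldType name="shape_rpt" class="solr.RptWithGeometrySpatialField"
           distanceUnits="kilometers"/>
<field name="latLngCircle" type="shape_rpt" indexed="true" stored="true"/>

<!-- Center point only, used for sorting with geodist()/sqedist();
     not filtered on, hence indexed=false docValues=true as suggested above -->
<fieldType name="location" class="solr.LatLonPointSpatialField" docValues="true"/>
<field name="latLng" type="location" indexed="false" docValues="true" stored="true"/>
```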
> I’d love to hear more about the tradeoffs and other considerations between > the two, but sounds like we should switch to JTS (the default, correct?) > The default spatialContextFactory is something internal; not JTS or Geo3D. Based on your requirements, you needn't specify either JTS or Geo3D, mostly because you don't actually need polygons. I wouldn't bother specifying it unless you want to experiment with some benchmarking. JTS would give you nothing here but Geo3D + prefixTree=S2 (in Solr 8.2) might be faster. > * You can absolutely index a circle in Solr -- this is something cool and > somewhat unique. And you don't need format=legacy. The documentation needs > to call t out better, though it at least refers to circles as a "buffered > point" which is the currently supported way of representing it, and it does > have one example. Search for "BUFFER" and you'll see a WKT-like syntax to > do it. BUFFER is not standard WKT; it was added on to do this. The first > arg is a X Y center, and 2nd arg is a distance in decimal degrees (not > km). BTW Geo3D is a good choice here but not essential either. > > - This sounds very promising and we’ll
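[Editor's note: since BUFFER's second argument is in decimal degrees rather than km, the per-document delivery radius has to be converted first. A small sketch of that conversion (my own helper, not from the thread; it assumes a spherical earth with mean radius 6371.0088 km, i.e. one degree of arc ~ 111.195 km):]

```python
import math

# Mean earth radius in km; one degree of arc = R * pi / 180 km.
# Assumption: spherical earth -- adequate for a rough delivery radius.
EARTH_RADIUS_KM = 6371.0088
KM_PER_DEGREE = EARTH_RADIUS_KM * math.pi / 180  # ~111.195 km

def km_to_degrees(km: float) -> float:
    """Convert a great-circle distance in km to decimal degrees of arc."""
    return km / KM_PER_DEGREE

def buffered_point_wkt(lon: float, lat: float, radius_km: float) -> str:
    """Build the WKT-like BUFFER syntax described above:
    X Y order (longitude first), radius in decimal degrees."""
    return f"BUFFER(POINT({lon} {lat}), {km_to_degrees(radius_km):.6f})"

# e.g. a 25 km delivery radius around a point in Atlanta:
print(buffered_point_wkt(-84.39, 33.75, 25))
```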
Solr 7.7.2 vs Solr 8.2.0
Hi, We are trying to decide whether we should upgrade to Solr 7.7.2 or to Solr 8.2.0. We are currently on Solr 6.3.0. On one hand, 8.2.0 feels like a good choice because it is the latest version. But then experience tells us that initial versions usually have a lot of bugs compared to the later maintenance releases. Also, there is one more issue: the major JIRA bug https://issues.apache.org/jira/browse/SOLR-13336 most likely won't get fixed in any 7.x version, but is fixed in Solr 8.1. I checked, and our Solr configuration is vulnerable to it. Do you have any recommendation as to which Solr version one should move to, given these facts?
SOLR 8.1.1 EdgeNGramFilterFactory parsing query
I have a SOLR 4.10.2 core, and I am upgrading to 8.1.1. I created an 8.1.1 core manually using the _default configset, and then brought the settings over into the 8.1.1 schema. I have adjusted the schema.xml and solrconfig.xml, and I have the core queryable in 8.1.1. I have a field named Company:

In 4.10.2, when I run the query IDX_Company:blue with debugQuery on, I see the query parsed into pieces (correctly):

"debug": { "rawquerystring": "IDX_Company:blue", "querystring": "IDX_Company:blue", "parsedquery": "(IDX_Company:b IDX_Company:bl IDX_Company:blu IDX_Company:blue)/no_coord", ...

When I run this against 8.1.1, with debugQuery on, I get the following:

"debug":{ "rawquerystring":"IDX_Company:blue", "querystring":"IDX_Company:blue", "parsedquery":"IDX_Company:blue", ...

It seems not to be applying the EdgeNGramFilterFactory -- the only change I made to the EdgeNGramFilterFactory configuration was to remove the "side" attribute, per the documentation. Also per the documentation, I replaced the SynonymFilterFactory with SynonymGraphFilterFactory and added the FlattenGraphFilterFactory. I have tried removing the FlattenGraphFilterFactory, I have cleared and repopulated the core (reindexed), and I have stopped and started SOLR 8.1.1 -- no difference. Here is the definition of text_general I am using in schema.xml
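[Editor's note: the poster's actual text_general definition was stripped from the archive. For comparison, a common pattern is to apply EdgeNGramFilterFactory only in the index-time analyzer, so the query analyzer emits a single token ("blue") that matches the indexed grams; the 4.10 debug output above suggests the old config ran the ngram filter at query time as well. A sketch of such a field type (names and gram sizes are illustrative, not the poster's config):]

```xml
<fieldType name="text_edge" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- grams produced only at index time: b, bl, blu, blue -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```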
Re: SolrCloud recommended I/O RAID level
On 7/30/2019 12:12 PM, Kaminski, Adi wrote:
> Indeed RAID10 with both mirroring and striping should satisfy the need, but per some benchmarks found online there is still an impact on write performance compared to RAID0, which is considered much better (attaching a table that summarizes different RAID levels and their pros/cons and capacity ratio).

RAID10 offers the best combination of performance and reliability. RAID0 might beat it *slightly* on performance, but if ANY drive fails on RAID0, the entire volume is lost.

> If we have ~200-320 shards spread across our 7 Solr node servers (part of a SolrCloud cluster) on a single core/collection configured with replication factor 2, shouldn't it supply application-level redundancy of the indexed data?

Yes, you could rely on Solr alone for data redundancy. But if there's a drive failure, do you REALLY want to be single-stranded for the time it takes to rebuild the entire server and copy data? That's what you would end up doing if you chose RAID0. It is true that RAID1 or RAID10 means you have to buy double your usable capacity. I would argue that drives are cheap and will cost less than either downtime or sysadmin effort.

Thanks,
Shawn
Re: SolrCloud recommended I/O RAID level
Hi Furkan, Thanks for your response ! Indeed RAID10 with both mirroring and striping should satisfy the need, but per some benchmarks in the network there is still an impact on write performance on it compared to RAID0 which is considered as much better (attaching a table that summarizes different RAID levels and their pros/cons and capacity ratio). If we have ~200-320 shards spread by our 7 Solr node servers (part of SolrCloud cluster) on single core/collection configured with replication factor 2, shouldn't it supply applicative level redundancy of indexed data ? Solr will hold each shard in two places with its documents, and will ensure that every shard placed in 2 different servers, no ? If that's the case, why not choosing RAID0 that is the best from both read and write performance ? Thanks, Adi Sent from Workspace ONE Boxer On Jul 30, 2019 20:51, Furkan KAMACI wrote: Hi Adi, RAID10 is good for satisfying both indexing and query, striping across mirror sets. However, you lose half of your raw disk space, just like with RAID1. Here is a mail thread of mine which discusses RAID levels for Solr specific: https://lists.apache.org/thread.html/462d7467b2f2d064223eb46763a6a6e606ac670fe7f7b40858d97c0d@1366325333@%3Csolr-user.lucene.apache.org%3E Kind Regards, Furkan KAMACI On Mon, Jul 29, 2019 at 10:25 PM Kaminski, Adi wrote: > Hi, > We are about to size large environment with 7 nodes/servers with > replication factor 2 of SolrCloud cluster (using Solr 7.6). > > The system contains parent-child (nested documents) schema, and about to > have 40M parent docs with 50-80 child docs each (in total 2-3.2B Solr docs). > > We have a use case that will require to update parent document fields > triggered by an application flow (with re-indexing or atomic/partial update > approach, that will probably require to upgrade to Solr 8.1.1 that supports > this feature and contains some fixes in nested docs handling area). 
> > Since these updates might be quite heavy from IOPS perspective, we would like to make sure that the IO hardware and RAID configuration are optimized (r/w ratio of 50% read and 50% write, to allow balanced search and update flows).
> >
> > Can someone share similar scale/use-case/deployment RAID level configuration ?
> > (I assume that RAID5&6 are not an option due to parity/dual parity heavy impact on write operations, so it leaves RAID 0, 1 or 10).
> >
> > Thanks in advance,
> > Adi
> >
> > Sent from Workspace ONE Boxer
> >
> > This electronic message may contain proprietary and confidential information of Verint Systems Inc., its affiliates and/or subsidiaries. The information is intended to be for the use of the individual(s) or entity(ies) named above. If you are not the intended recipient (or authorized to receive this e-mail for the intended recipient), you may not use, copy, disclose or distribute to anyone this message or any information contained in this message. If you have received this electronic message in error, please notify us by replying to this e-mail.
Re: Problem with solr suggester in case of non-ASCII characters
Hi Furkan,

Thanks for the suggestion; I always forget the most effective debugging tool, the analysis page. It turned out that "Jó" was a stop word and it was eliminated during text analysis. What I will do is create a new field type without stop word removal and use it like this: short_text_hu_without_stop_removal

Thanks again
Roland

Furkan KAMACI wrote (on 2019. júl. 30. at 16:17):

> Hi Roland,
>
> Could you check the Analysis tab (https://lucene.apache.org/solr/guide/8_1/analysis-screen.html) and tell how the term is analyzed for both query and index?
>
> Kind Regards,
> Furkan KAMACI
>
> On Tue, Jul 30, 2019 at 4:50 PM Szűcs Roland wrote:
>
> > Hi All,
> >
> > I have an author suggester (searchcomponent and the related request handler) defined in solrconfig: > author AnalyzingInfixLookupFactory DocumentDictionaryFactory BOOK_productAuthor short_text_hu suggester_infix_author false false 2 > startup="lazy" > true 10 author > suggest
> >
> > Author field has just minimal text processing at query and index time, based on the following definition: > positionIncrementGap="100" multiValued="true"> ignoreCase="true"/> ignoreCase="true"/> docValues="true"/> docValues="true" multiValued="true"/> positionIncrementGap="100"> words="lang/stopwords_ar.txt" ignoreCase="true"/>
> >
> > When I use queries with only ASCII characters, the results are correct: "Al":{ "term":"Alexandre Dumas", "weight":0, "payload":""}
> >
> > When I try it with a Hungarian author name with a special character: "Jó":"author":{ "Jó":{ "numFound":0, "suggestions":[]}}
> >
> > When I try it with three letters, it works again: "Józ":"author":{ "Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", "weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { "term":"József Attila", "weight":0, "payload":""}..
> >
> > Any idea how it can happen that a longer string has more matches than a shorter one? It is inconsistent. What can I do to fix it, as it would result in poor customer experience. They would feel that sometimes they need 2 and sometimes 3 characters to get suggestions.
> >
> > Thanks in advance,
> > Roland
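[Editor's note: the stop-word-free field type Roland mentions (short_text_hu_without_stop_removal) might look like the following sketch. The original short_text_hu analysis chain was stripped from the archive, so the tokenizer and filters shown are illustrative; the point is simply that the StopFilterFactory is omitted.]

```xml
<fieldType name="short_text_hu_without_stop_removal" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- No StopFilterFactory here, so terms like "Jó" survive analysis
         and the suggester can match them after two characters -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```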
Re: SolrCloud recommended I/O RAID level
Hi Adi, RAID10 is good for satisfying both indexing and query, striping across mirror sets. However, you lose half of your raw disk space, just like with RAID1. Here is a mail thread of mine which discusses RAID levels for Solr specific: https://lists.apache.org/thread.html/462d7467b2f2d064223eb46763a6a6e606ac670fe7f7b40858d97c0d@1366325333@%3Csolr-user.lucene.apache.org%3E Kind Regards, Furkan KAMACI On Mon, Jul 29, 2019 at 10:25 PM Kaminski, Adi wrote: > Hi, > We are about to size large environment with 7 nodes/servers with > replication factor 2 of SolrCloud cluster (using Solr 7.6). > > The system contains parent-child (nested documents) schema, and about to > have 40M parent docs with 50-80 child docs each (in total 2-3.2B Solr docs). > > We have a use case that will require to update parent document fields > triggered by an application flow (with re-indexing or atomic/partial update > approach, that will probably require to upgrade to Solr 8.1.1 that supports > this feature and contains some fixes in nested docs handling area). > > Since these updates might be quite heavy from IOPS perspective, we would > like to make sure that the IO hardware and RAID configuration are optimized > (r/w ratio of 50% read and 50% write, to allow balanced search and update > flows). > > Can someone share similar scale/use- case/deployment RAID level > configuration ? > (I assume that RAID5&6 are not an option due to parity/dual parity heavy > impact on write operations, so it leaves RAID 0, 1 or 10). > > Thanks in advance, > Adi > > > > > Sent from Workspace ONE Boxer > > > This electronic message may contain proprietary and confidential > information of Verint Systems Inc., its affiliates and/or subsidiaries. The > information is intended to be for the use of the individual(s) or > entity(ies) named above. 
If you are not the intended recipient (or > authorized to receive this e-mail for the intended recipient), you may not > use, copy, disclose or distribute to anyone this message or any information > contained in this message. If you have received this electronic message in > error, please notify us by replying to this e-mail. >
Re: Solr Geospatial Polygon Indexing/Querying Issue
David,

Firstly, thanks for putting together such a thorough email; it helps a lot to understand some of the things we were just guessing at because (as you mentioned a few times) the documentation around all of this is rather sparse. I'll explain the context around the use case we're trying to solve and then attempt to respond as best I can to each of your points. What we have is a list of documents where the location is sometimes a point and sometimes a circle. These basically represent (in our case) inventory at a physical location (point) or inventory that can be delivered to you within X km (configurable per document), which represents the circle use case. We want to be able to allow a user to say "I want all documents within X distance of my location," but also all documents that can be delivered to that point, where the delivery distance is defined on the inventory (creating the circle). This is why we were actually trying to combine both point-based data and poly/circle data into a single geospatial field, since I don't believe you could do something like fq=geofilt(latlng, x, y, d) OR geofilt(latlngCircle, x, y, 1) -- but perhaps we're just not getting quite the right syntax, etc.

* Personally, I find it highly confusing to have a field named "latlng" and have it be anything other than a simple point -- it's all you have if given a single latitude longitude pair. If you intend for the data to be a circle (either exactly or approximated) then perhaps call it latLngCircle

- This is happening because we're trying to combine two different use cases into a single field, since I don't think we have that option from the query side. The name is really just us re-using our current field for this exploration, but it would probably end up being named something different.

* geodist() and for that matter any other attempt to get the distance to a non-point shape is not going to work -- either error or confusing results; I forget. This is hard to do and the logic isn't there for it, and probably wouldn't perform to users' expectations if it did. This ought to be documented but seems not to be.

- Good to know, so no matter what we'll have to have a point value stored somewhere for each document and calculate geodist on that.

* Generally RptWithGeometrySpatialField should be used over SpatialRecursivePrefixTreeFieldType unless you want heatmaps or are willing to make trade-offs in higher index size and lossy precision in order to get faster search. It's up to you; if you benchmark both I'd love to hear how it went.

- We may explore both, but typically we're more interested in speed than accuracy; benchmarking may be a very interesting exercise, however. For sorting, for instance, we're actually using sqedist instead of geodist because we're not overly concerned about sorting accuracy.

* In WKT format, the ordinate order is "X Y" (thus longitude then latitude). Looking at your triangle, it is extremely close to Antarctica, and I'm skeptical you intended that. This is not directly documented AFAICT but it's such a common mistake that it ought to be called out in the docs.

- Definitely did not intend it to be close to Antarctica. I think we tried both but probably went back to lat,long, which was definitely more common in our (failed) testing.

* I see you are using Geo3D, which is not the default. Geo3D is strict about the coordinate order -- counter-clockwise. Your triangle is clockwise and thus it has an inverted interpretation -- thus it's a shape that covers nearly the whole globe. I recently documented this https://issues.apache.org/jira/browse/SOLR-13467 but it's not published yet since it's so new.

- Thanks for this clarification as well. I had read this in the WKT docs too; again, something we tried but really weren't sure what the right answer was, and we had been going back and forth on it. The documentation seems to specify that you need to specify either JTS or Geo3D, but doesn't provide much info/guidance about which to use when, and since JTS requires adding another jar manually and therefore complicates our build process significantly (at least vs. using Geo3D), we tried Geo3D. I'd love to hear more about the tradeoffs and other considerations between the two, but it sounds like we should switch to JTS (the default, correct?)

* You can absolutely index a circle in Solr -- this is something cool and somewhat unique. And you don't need format=legacy. The documentation needs to call this out better, though it at least refers to circles as a "buffered point", which is the currently supported way of representing it, and it does have one example. Search for "BUFFER" and you'll see a WKT-like syntax to do it. BUFFER is not standard WKT; it was added on to do this. The first arg is an X Y center, and the 2nd arg is a distance in decimal degrees (not km). BTW Geo3D is a good choice here but
Re: Solr Backup
The FS backup feature requires a shared drive as you say, and this is clearly documented. No way around it. Cloud Filestore would likely fix it. Or you could write a new backup repo plugin for backup directly to Google Cloud Storage? Jan Høydahl > 30. jul. 2019 kl. 13:41 skrev Jayadevan Maymala : > > Hello all, > > We have a 3-node Solr cluster running on google cloud platform. I would > like to schedule a backup and have been trying the backup API and getting > java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException error. > I suspect it is because a shared drive is necessary. Google VM instances > don't have this feature, unless I go for Cloud Filestore. > Is there a work-around? Some way in which I can have the cluster back up > taken on the node on which I am executing the backup command? Solr Version > is 7.3 > > Regards, > Jayadevan
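[Editor's note: once a shared mount (e.g. Cloud Filestore mounted at the same path on every node) is in place, the backup is a single Collections API request against the 'location' on that mount. A sketch; the collection name and paths are illustrative, not from the thread:]

```
# 'location' must exist and be writable at the same path on every node,
# e.g. an NFS/Cloud Filestore mount shared by the whole cluster.
http://localhost:8983/solr/admin/collections?action=BACKUP&name=mybackup&collection=mycollection&location=/mnt/shared/backups
```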
Re: Basic Query Not Working - Please Help
Hi Vipul, You are welcome! Kind Regards, Furkan KAMACI On Fri, Jul 26, 2019 at 11:07 AM Vipul Bahuguna < newthings4learn...@gmail.com> wrote: > Hi Furkan - > > I realized that I was searching incorrectly. > I later realized that if I need to search by specific field, I need to do > as you suggested - > q=appname:App1 . > > OR if need to simply search by App1, then I need to use to > index my field appname at the time of insertion so that it can be later > search without specifying the fieldname. > > thanks for your response. > > On Tue, Jul 23, 2019 at 6:07 AM Furkan KAMACI > wrote: > > > Hi Vipul, > > > > Which query do you submit? Is that one: > > > > q=appname:App1 > > > > Kind Regards, > > Furkan KAMACI > > > > On Mon, Jul 22, 2019 at 10:52 AM Vipul Bahuguna < > > newthings4learn...@gmail.com> wrote: > > > > > Hi, > > > > > > I have installed SOLR 8.1.1. > > > I am new and trying the very basics. > > > > > > I installed solr8.1.1 on Windows and I am using SOLR in standalone > mode. > > > > > > Steps I followed - > > > > > > 1. created a core as follows: > > > solr create_core -c dox > > > > > > 2. updated the managed_schema.xml file to add few specific fields > > specific > > > to my schema as belows: > > > > > > stored="true"/> > > > stored="true"/> > > > > stored="true"/> > > > > > stored="true"/> > > > > > > 3. then i restarted SOLR > > > > > > 4. 
then I went to the Documents tab to enter my sample data for indexing, which looks like below:
> > >
> > > { "id" : "1", "prjname" : "Project1", "apps" : [ { "appname" : "App1", "topics" : [ { "topicname" : "topic1", "links" : [ "http://www.google.com", "http://www.t6.com" ] }, { "topicname" : "topic2", "links" : [ "http://www.java.com", "http://www.rediff.com" ] } ] }, { "appname" : "App2", "topics" : [ { "topicname" : "topic3", "links" : [ "http://www.t3.com", "http://www.t4.com" ] }, { "topicname" : "topic4", "links" : [ "http://www.rules.com", "http://www.amazon.com" ] } ] } ] }
> > >
> > > 5. Now when I go to the Query tab and click Execute Search with *:*, it shows my recently added document as follows:
> > >
> > > { "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "_":"1563780352100"}}, "response":{"numFound":1,"start":0,"docs":[ { "id":"1", "prjname":["Project1"], "apps":["{appname=App1, topics=[{topicname=topic1, links=[http://www.google.com, http://www.t6.com]}, {topicname=topic2, links=[http://www.java.com, http://www.rediff.com]}]}", "{appname=App2, topics=[{topicname=topic3, links=[http://www.t3.com, http://www.t4.com]}, {topicname=topic4, links=[http://www.rules.com, http://www.amazon.com]}]}"], "_version_":1639742305772503040}] }}
> > >
> > > 6. But now when I am trying to search based on the field topicname or prjname, it does not return any document. Even if I put anything in q, like App1, zero results are returned.
> > >
> > > Can someone help me understand what I might have done incorrectly? Maybe I defined my schema incorrectly.
> > > > > > Thanks in advance > > > > > >
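[Editor's note: the "search without specifying the field name" approach Vipul alludes to is typically done with a catch-all copyField that the default search field (df) points at. A sketch; field names assume the _default configset's conventions and are illustrative, not Vipul's actual schema:]

```xml
<!-- Catch-all field: searched but not stored -->
<field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="appname" dest="_text_"/>
<copyField source="prjname" dest="_text_"/>
<copyField source="topicname" dest="_text_"/>

<!-- In solrconfig.xml, point the default search field at it so q=App1 works unfielded: -->
<!-- <str name="df">_text_</str> -->
```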
Re: Problem with solr suggester in case of non-ASCII characters
Hi Roland, Could you check Analysis tab ( https://lucene.apache.org/solr/guide/8_1/analysis-screen.html) and tell how the term is analyzed for both query and index? Kind Regards, Furkan KAMACI On Tue, Jul 30, 2019 at 4:50 PM Szűcs Roland wrote: > Hi All, > > I have an author suggester (searchcomponent and the related request > handler) defined in solrconfig: > > > > > author > AnalyzingInfixLookupFactory > DocumentDictionaryFactory > BOOK_productAuthor > short_text_hu > suggester_infix_author > false > false > 2 > > > > startup="lazy" > > > true > 10 > author > > > suggest > > > > Author field has just a minimal text processing in query and index time > based on the following definition: > positionIncrementGap="100" multiValued="true"> > > > >ignoreCase="true"/> > > > > >ignoreCase="true"/> > > > >docValues="true"/> >docValues="true" multiValued="true"/> >positionIncrementGap="100"> > > > >ignoreCase="true"/> > > > > > > When I use qeries with only ASCII characters, the results are correct: > "Al":{ > "term":"Alexandre Dumas", "weight":0, "payload":""} > > When I try it with Hungarian authorname with special character: > "Jó":"author":{ > "Jó":{ "numFound":0, "suggestions":[]}} > > When I try it with three letters, it works again: > "Józ":"author":{ > "Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", " > weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, " > payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { > "term":"Eötvös József", "weight":0, "payload":""}, { > "term":"József > Attila", "weight":0, "payload":""}.. > > Any idea how can it happen that a longer string has more matches than a > shorter one. It is inconsistent. What can I do to fix it as it would > results poor customer experience. > They would feel that sometimes they need 2 sometimes 3 characters to get > suggestions. > > Thanks in advance, > Roland >
Re: Solr Backup
On 7/30/2019 7:11 AM, Jayadevan Maymala wrote:
>> We will need the *FULL* error message. It is probably dozens of lines long and MIGHT contain multiple "Caused by" sections.
>
> { "responseHeader":{ "status":500, "QTime":22}, "Operation backup caused exception:":"java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException: /backups/coupon",

That is the response, and it doesn't have the information I was hoping to see. It doesn't show what happened ... basically it says there was an error, but doesn't have any real detail about what the error was. Analyzing what stacktrace data is available doesn't reveal anything useful. I was hoping for actual logfile data. The solr.log file from the server side may contain a more complete error. If you can use a file-sharing website or paste website to share the whole logfile, we might be able to find more information.

Thanks,
Shawn
Problem with solr suggester in case of non-ASCII characters
Hi All,

I have an author suggester (searchcomponent and the related request handler) defined in solrconfig: > author AnalyzingInfixLookupFactory DocumentDictionaryFactory BOOK_productAuthor short_text_hu suggester_infix_author false false 2 true 10 author suggest

Author field has just minimal text processing at query and index time, based on the following definition:

When I use queries with only ASCII characters, the results are correct: "Al":{ "term":"Alexandre Dumas", "weight":0, "payload":""}

When I try it with a Hungarian author name with a special character: "Jó":"author":{ "Jó":{ "numFound":0, "suggestions":[]}}

When I try it with three letters, it works again: "Józ":"author":{ "Józ":{ "numFound":10, "suggestions":[{ "term":"Bajza József", "weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { "term":"Eötvös József", "weight":0, "payload":""}, { "term":"József Attila", "weight":0, "payload":""}..

Any idea how it can happen that a longer string has more matches than a shorter one? It is inconsistent. What can I do to fix it, as it would result in poor customer experience. They would feel that sometimes they need 2 and sometimes 3 characters to get suggestions.

Thanks in advance,
Roland
Re: Solr Backup
On Tue, Jul 30, 2019 at 5:56 PM Shawn Heisey wrote: > On 7/30/2019 5:41 AM, Jayadevan Maymala wrote: > > We have a 3-node Solr cluster running on google cloud platform. I would > > like to schedule a backup and have been trying the backup API and getting > > java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException > error. > > I suspect it is because a shared drive is necessary. Google VM instances > > don't have this feature, unless I go for Cloud Filestore. > > Is there a work-around? Some way in which I can have the cluster back up > > taken on the node on which I am executing the backup command? Solr > Version > > is 7.3 > > We will need the *FULL* error message. It is probably dozens of lines > long and MIGHT contain multiple "Caused by" sections. { "responseHeader":{ "status":500, "QTime":22}, "Operation backup caused exception:":"java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException: /backups/coupon", "exception":{ "msg":"/backups/coupon", "rspCode":-1}, "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "root-error-class","org.apache.solr.common.SolrException"], "msg":"/backups/coupon", "trace":"org.apache.solr.common.SolrException: /backups/coupon\n\tat org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:258)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:230)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:736)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:717)\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:498)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)\n\tat 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:530)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)\n\tat 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)\n\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)\n\tat java.lang.Thread.run(Thread.java:748)\n", "code":500}} > We will also need > the complete Solr version to be able to interpret the error -- 7.3 is > not
Re: [ZOOKEEPER] - Error - HEAP MEMORY
Hi, here is the process listing and some more data about my Solr + ZooKeeper setup. root 48425 1 26 Jul29 ?03:00:39 java -server -Xms28g -Xmx32g -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:-OmitStackTraceInFastThrow -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/solr/server/logs/solr_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M -DzkClientTimeout=15000 -DzkHost=177.55.55.152:2181,177.55.55.153:2181,177.55.55.154:2181, 177.55.55.155:2181,177.55.55.156:2181 -Dsolr.log.dir=/solr/server/logs -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC -Djetty.home=/solr/server -Dsolr.solr.home=/solr/server/solr -Dsolr.data.home= -Dsolr.install.dir=/solr -Dsolr.default.confdir=/solr/server/solr/configsets/_default/conf -Xss256k -Dsolr.jetty.https.port=8983 -Dsolr.log.muteconsole -XX:OnOutOfMemoryError=/solr/bin/oom_solr.sh 8983 /solr/server/logs -jar start.jar --module=http root 48163 1 0 Jul29 ?00:01:33 java -Dzookeeper.log.dir=/zoop/bin/../logs -Dzookeeper.log.file=zookeeper-root-server-eddison0001.log -Dzookeeper.root.logger=INFO,CONSOLE -XX:+HeapDumpOnOutOfMemoryError -XX:OnOutOfMemoryError=kill -9 %p -cp 
/zoop/bin/../zookeeper-server/target/classes:/zoop/bin/../build/classes:/zoop/bin/../zookeeper-server/target/lib/*.jar:/zoop/bin/../build/lib/*.jar:/zoop/bin/../lib/zookeeper-jute-3.5.5.jar:/zoop/bin/../lib/zookeeper-3.5.5.jar:/zoop/bin/../lib/slf4j-log4j12-1.7.25.jar:/zoop/bin/../lib/slf4j-api-1.7.25.jar:/zoop/bin/../lib/netty-all-4.1.29.Final.jar:/zoop/bin/../lib/log4j-1.2.17.jar:/zoop/bin/../lib/json-simple-1.1.1.jar:/zoop/bin/../lib/jline-2.11.jar:/zoop/bin/../lib/jetty-util-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-servlet-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-server-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-security-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-io-9.4.17.v20190418.jar:/zoop/bin/../lib/jetty-http-9.4.17.v20190418.jar:/zoop/bin/../lib/javax.servlet-api-3.1.0.jar:/zoop/bin/../lib/jackson-databind-2.9.8.jar:/zoop/bin/../lib/jackson-core-2.9.8.jar:/zoop/bin/../lib/jackson-annotations-2.9.0.jar:/zoop/bin/../lib/commons-cli-1.2.jar:/zoop/bin/../lib/audience-annotations-0.5.0.jar:/zoop/bin/../zookeeper-*.jar:/zoop/bin/../zookeeper-server/src/main/resources/lib/*.jar:/zoop/bin/../conf: -Xmx1000m -Xmx4096m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /zoop/conf/zoo.cfg root 91749 91497 0 09:43 pts/000:00:00 grep --color=auto -i zook [09:43:55] root@eddison0001:~$ To answer the question: look at the ZooKeeper process above, which has two Xmx flags (-Xmx1000m and -Xmx4096m). Since HotSpot honors the last occurrence of a duplicated flag, this process effectively runs with a 4096 MB heap. By the way, I changed the Solr settings (Xms 28 GB, Xmx 32 GB) because about 5 days ago I received a heap memory warning in Solr. Since then I have not seen any more heap memory messages in Solr. 
Regards, Em ter, 30 de jul de 2019 às 08:41, Jörn Franke escreveu: > 2 xmx does not make sense, > > Your heap seems unusual large usually your heap should be much smaller > than available memory so solr can use it for index caching which is off-heap > > > Am 30.07.2019 um 13:25 schrieb Rodrigo Oliveira < > adamantina.rodr...@gmail.com>: > > > > Hi, > > > > My environment have 5 servers with solr + zookeeper in the same hosts. > > > > However, I've had 48 RAM each servers (solr - xms 28 gb and xmx - 32 gb). > > > > Looking for my servers and process, in zookeepers has xmx 1000 mb and xmx > > 4096 mb (last item, was change for me). > > > > Why 2 values for xmx? > > > > Regards, > > > > Em ter, 30 de jul de 2019 04:44, Dominique Bejean < > dominique.bej...@eolya.fr> > > escreveu: > > > >> Hi, > >> > >> I don’t find any documentation about the parameter > >> zookeeper_server_java_heaps > >> in zoo.cfg. > >> The way to control java heap size is either the java.env file of the > >> zookeeper-env.sh file. In zookeeper-env.sh > >> SERVER_JVMFLAGS="-Xmx=512m" > >> > >> How many RAM on your server ? > >> > >> Regards > >> > >> Dominique > >> > >> > >> > >> > >> Le lun. 29 juil. 2019 à 20:35, Rodrigo Oliveira < > >> adamantina.rodr...@gmail.com> a écrit : > >> > >>> Hi, > >>> > >>> After 3 days running, my zookeeper showing this error. > >>> > >>> 2019-07-29 15:10:41,906 [myid:1] - WARN > >>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - > >>> Connection broken for id 4332550065071534382, my id = 1, error = > >>> java.io.IOException: Received packet with invalid packet: 824196618 > >>> at > >>> > >>> > >> >
Re: Solr Backup
On 7/30/2019 5:41 AM, Jayadevan Maymala wrote: We have a 3-node Solr cluster running on google cloud platform. I would like to schedule a backup and have been trying the backup API and getting java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException error. I suspect it is because a shared drive is necessary. Google VM instances don't have this feature, unless I go for Cloud Filestore. Is there a work-around? Some way in which I can have the cluster back up taken on the node on which I am executing the backup command? Solr Version is 7.3 We will need the *FULL* error message. It is probably dozens of lines long and MIGHT contain multiple "Caused by" sections. We will also need the complete Solr version to be able to interpret the error -- 7.3 is not specific enough. Thanks, Shawn
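For reference, the call under discussion is the Collections API BACKUP action, which takes roughly the shape below (an editor's sketch; the host and backup name are hypothetical, while the collection name "coupon" and the location "/backups" come from the error message in this thread). The location must be a directory that already exists and is readable and writable from every Solr node, e.g. a shared mount, which is why a missing or node-local path produces a NoSuchFileException:

```python
from urllib.parse import urlencode

# Collections API BACKUP parameters; "location" must point at a directory
# that exists and is shared across all Solr nodes (e.g. an NFS mount).
params = {
    "action": "BACKUP",
    "name": "coupon-backup",   # hypothetical backup name
    "collection": "coupon",
    "location": "/backups",
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```

On a single-node workaround: any local directory works if the backup command is routed to the node that owns all the shards, but for a multi-node cluster the shared-filesystem requirement stands.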
Re: [ZOOKEEPER] - Error - HEAP MEMORY
Two Xmx values do not make sense. Your heap also seems unusually large; usually the heap should be much smaller than the available memory, so Solr can use the rest for index caching, which happens off-heap (in the OS page cache). > Am 30.07.2019 um 13:25 schrieb Rodrigo Oliveira > : > > Hi, > > My environment have 5 servers with solr + zookeeper in the same hosts. > > However, I've had 48 RAM each servers (solr - xms 28 gb and xmx - 32 gb). > > Looking for my servers and process, in zookeepers has xmx 1000 mb and xmx > 4096 mb (last item, was change for me). > > Why 2 values for xmx? > > Regards, > > Em ter, 30 de jul de 2019 04:44, Dominique Bejean > escreveu: > >> Hi, >> >> I don’t find any documentation about the parameter >> zookeeper_server_java_heaps >> in zoo.cfg. >> The way to control java heap size is either the java.env file of the >> zookeeper-env.sh file. In zookeeper-env.sh >> SERVER_JVMFLAGS="-Xmx=512m" >> >> How many RAM on your server ? >> >> Regards >> >> Dominique >> >> >> >> >> Le lun. 29 juil. 2019 à 20:35, Rodrigo Oliveira < >> adamantina.rodr...@gmail.com> a écrit : >> >>> Hi, >>> >>> After 3 days running, my zookeeper showing this error. 
>>> >>> 2019-07-29 15:10:41,906 [myid:1] - WARN >>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - >>> Connection broken for id 4332550065071534382, my id = 1, error = >>> java.io.IOException: Received packet with invalid packet: 824196618 >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) >>> 2019-07-29 15:10:41,906 [myid:1] - WARN >>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] - >>> Interrupting SendWorker >>> 2019-07-29 15:10:41,907 [myid:1] - WARN >>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] - >>> Interrupted while waiting for message on queue >>> java.lang.InterruptedException >>> at >>> >>> >> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) >>> at >>> >>> >> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) >>> at >>> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78) >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080) >>> 2019-07-29 15:10:41,907 [myid:1] - WARN >>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - >> Send >>> worker leaving thread id 4332550065071534382 my id = 1 >>> 2019-07-29 15:10:41,917 [myid:1] - INFO [/177.55.55.152:3888 >>> :QuorumCnxManager$Listener@888] - Received connection request / >>> 177.55.55.63:53972 >>> 2019-07-29 15:10:41,920 [myid:1] - WARN >>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - >>> Connection broken for id 4332550065071534382, my id = 1, error = >>> java.io.IOException: Received packet with 
invalid packet: 840973834 >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) >>> 2019-07-29 15:10:41,921 [myid:1] - WARN >>> [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] - >>> Interrupting SendWorker >>> 2019-07-29 15:10:41,922 [myid:1] - WARN >>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] - >>> Interrupted while waiting for message on queue >>> java.lang.InterruptedException >>> at >>> >>> >> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) >>> at >>> >>> >> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) >>> at >>> java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78) >>> at >>> >>> >> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080) >>> 2019-07-29 15:10:41,922 [myid:1] - WARN >>> [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - >> Send >>> worker leaving thread id 4332550065071534382 my id = 1 >>> 2019-07-29 15:10:41,932 [myid:1] - INFO [/177.55.55.152:3888 >>> :QuorumCnxManager$Listener@888] - Received connection request / >>> 177.55.55.63:38633 >>> 2019-07-29 15:10:41,933 [myid:1] - WARN >>> [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1176] - >>> Connection broken for id 4332550065071534638, my id = 1, error = >>> java.io.IOException: Received packet with invalid packet: 807419402 >>> at >>> >>> >>
Solr Backup
Hello all, We have a 3-node Solr cluster running on Google Cloud Platform. I would like to schedule a backup and have been trying the backup API, but I am getting a java.nio.file.NoSuchFileException:java.nio.file.NoSuchFileException error. I suspect it is because a shared drive is necessary, and Google VM instances don't have this feature unless I go for Cloud Filestore. Is there a work-around? Some way in which I can have the cluster backup taken on the node on which I am executing the backup command? The Solr version is 7.3. Regards, Jayadevan
Re: [ZOOKEEPER] - Error - HEAP MEMORY
Hi, My environment has 5 servers with Solr + ZooKeeper on the same hosts. Each server has 48 GB of RAM (Solr: Xms 28 GB, Xmx 32 GB). Looking at the servers and processes, the ZooKeeper JVM has both Xmx 1000 MB and Xmx 4096 MB (the latter was changed by me). Why two values for Xmx? Regards, Em ter, 30 de jul de 2019 04:44, Dominique Bejean escreveu: > Hi, > > I don’t find any documentation about the parameter > zookeeper_server_java_heaps > in zoo.cfg. > The way to control java heap size is either the java.env file of the > zookeeper-env.sh file. In zookeeper-env.sh > SERVER_JVMFLAGS="-Xmx=512m" > > How many RAM on your server ? > > Regards > > Dominique > > > > > Le lun. 29 juil. 2019 à 20:35, Rodrigo Oliveira < > adamantina.rodr...@gmail.com> a écrit : > > > Hi, > > > > After 3 days running, my zookeeper showing this error. > > > > 2019-07-29 15:10:41,906 [myid:1] - WARN > > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - > > Connection broken for id 4332550065071534382, my id = 1, error = > > java.io.IOException: Received packet with invalid packet: 824196618 > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) > > 2019-07-29 15:10:41,906 [myid:1] - WARN > > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] - > > Interrupting SendWorker > > 2019-07-29 15:10:41,907 [myid:1] - WARN > > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] - > > Interrupted while waiting for message on queue > > java.lang.InterruptedException > > at > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) > > at > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) > > at > > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) > > at > > > > > 
org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78) > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080) > > 2019-07-29 15:10:41,907 [myid:1] - WARN > > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - > Send > > worker leaving thread id 4332550065071534382 my id = 1 > > 2019-07-29 15:10:41,917 [myid:1] - INFO [/177.55.55.152:3888 > > :QuorumCnxManager$Listener@888] - Received connection request / > > 177.55.55.63:53972 > > 2019-07-29 15:10:41,920 [myid:1] - WARN > > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - > > Connection broken for id 4332550065071534382, my id = 1, error = > > java.io.IOException: Received packet with invalid packet: 840973834 > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) > > 2019-07-29 15:10:41,921 [myid:1] - WARN > > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] - > > Interrupting SendWorker > > 2019-07-29 15:10:41,922 [myid:1] - WARN > > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] - > > Interrupted while waiting for message on queue > > java.lang.InterruptedException > > at > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) > > at > > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) > > at > > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78) > > at > > > > > 
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080) > > 2019-07-29 15:10:41,922 [myid:1] - WARN > > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - > Send > > worker leaving thread id 4332550065071534382 my id = 1 > > 2019-07-29 15:10:41,932 [myid:1] - INFO [/177.55.55.152:3888 > > :QuorumCnxManager$Listener@888] - Received connection request / > > 177.55.55.63:38633 > > 2019-07-29 15:10:41,933 [myid:1] - WARN > > [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1176] - > > Connection broken for id 4332550065071534638, my id = 1, error = > > java.io.IOException: Received packet with invalid packet: 807419402 > > at > > > > > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) > > 2019-07-29 15:10:41,933 [myid:1] - WARN > > [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1179] - > > Interrupting SendWorker > > 2019-07-29 15:10:41,934 [myid:1] - WARN > >
Re: Problem with uploading Large synonym files in cloud mode
You have to increase -Djute.maxbuffer for large configs. In Solr's bin/solr.in.sh use e.g. SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=10000000" This increases maxbuffer for the ZooKeeper client on the Solr side to 10 MB (the value is given in bytes). In ZooKeeper's zookeeper/conf/zookeeper-env.sh: SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djute.maxbuffer=10000000" I have a >10 MB thesaurus and use 30 MB for jute.maxbuffer; it works perfectly. Regards Am 30.07.19 um 13:09 schrieb Salmaan Rashid Syed: Hi Solr Users, I have a very big synonym file (>5MB). I am unable to start Solr in cloud mode as it throws an error message stating that the synonmys file is too large. I figured out that the zookeeper doesn't take a file greater than 1MB size. I tried to break down my synonyms file to smaller chunks less than 1MB each. But, I am not sure about how to include all the filenames into the Solr schema. Should it be seperated by commas like synonyms = "__1_synonyms.txt, __2_synonyms.txt, __3synonyms.txt" Or is there a better way of doing that? Will the bigger file when broken down to smaller chunks will be uploaded to zookeeper as well. Please help or please guide me to relevant documentation regarding this. Thank you. Regards. Salmaan.
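A pre-flight check can save a restart cycle: before uploading a configset, scan it for files larger than ZooKeeper's default znode limit of 1 MB, which is the jute.maxbuffer default (an editor's sketch; paths and the demo file are synthetic):

```python
import os
import tempfile

JUTE_MAXBUFFER_DEFAULT = 1024 * 1024  # ZooKeeper's default znode size limit, in bytes

def oversized_config_files(confdir, limit=JUTE_MAXBUFFER_DEFAULT):
    """Return paths of config files that exceed ZooKeeper's znode size limit."""
    too_big = []
    for root, _dirs, files in os.walk(confdir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) > limit:
                too_big.append(path)
    return too_big

# Demo with a synthetic 5 MB synonyms file in a throwaway configset directory
with tempfile.TemporaryDirectory() as conf:
    with open(os.path.join(conf, "synonyms.txt"), "wb") as f:
        f.write(b"x" * (5 * 1024 * 1024))
    flagged = oversized_config_files(conf)
    print(flagged)  # the 5 MB synonyms.txt is flagged
```

Any file this flags either needs to be split or requires raising jute.maxbuffer on both the ZooKeeper servers and the Solr clients, as described above.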
Re: Problem with uploading Large synonym files in cloud mode
Aside from the fact that a 5 MB synonym file is rather unusual (what is the use case for such a large synonym file?), and that it will have an impact on index size and/or query time: you can configure the ZooKeeper server and the Solr client to allow larger files using the jute.maxbuffer option. > Am 30.07.2019 um 13:09 schrieb Salmaan Rashid Syed > : > > Hi Solr Users, > > I have a very big synonym file (>5MB). I am unable to start Solr in cloud > mode as it throws an error message stating that the synonmys file is > too large. I figured out that the zookeeper doesn't take a file greater > than 1MB size. > > I tried to break down my synonyms file to smaller chunks less than 1MB > each. But, I am not sure about how to include all the filenames into the > Solr schema. > > Should it be seperated by commas like synonyms = "__1_synonyms.txt, > __2_synonyms.txt, __3synonyms.txt" > > Or is there a better way of doing that? Will the bigger file when broken > down to smaller chunks will be uploaded to zookeeper as well. > > Please help or please guide me to relevant documentation regarding this. > > Thank you. > > Regards. > Salmaan.
Problem with uploading Large synonym files in cloud mode
Hi Solr Users, I have a very big synonym file (>5 MB). I am unable to start Solr in cloud mode, as it throws an error message stating that the synonyms file is too large. I figured out that ZooKeeper doesn't accept a file greater than 1 MB in size. I tried to break my synonyms file down into smaller chunks of less than 1 MB each, but I am not sure how to include all the filenames in the Solr schema. Should they be separated by commas, like synonyms = "__1_synonyms.txt, __2_synonyms.txt, __3synonyms.txt"? Or is there a better way of doing that? Will the bigger file, when broken down into smaller chunks, be uploaded to ZooKeeper as well? Please help, or please guide me to relevant documentation regarding this. Thank you. Regards. Salmaan.
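The splitting step in the question above can be done mechanically, cutting only on line boundaries so that no synonym rule is broken in half (an editor's sketch; as far as I know, Lucene's analysis factories accept a comma-separated list of files in the synonyms attribute, so the resulting chunk files can be listed that way):

```python
def split_synonyms(path, max_bytes=1024 * 1024):
    """Split a synonyms file into chunks under max_bytes, on line boundaries."""
    chunks, current, size = [], [], 0
    with open(path, "rb") as f:
        for line in f:
            # Flush the current chunk before it would exceed the limit.
            if size + len(line) > max_bytes and current:
                chunks.append(b"".join(current))
                current, size = [], 0
            current.append(line)
            size += len(line)
    if current:
        chunks.append(b"".join(current))
    return chunks

# Demo: a synthetic ~2.6 MB file splits into three chunks, each under 1 MB
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as f:
    f.write(b"foo,bar,baz\n" * 220000)  # 2,640,000 bytes of dummy synonym rules
    path = f.name
parts = split_synonyms(path)
os.remove(path)
print([len(p) for p in parts])
```

Each chunk would then be written out as its own file (e.g. the __1_synonyms.txt, __2_synonyms.txt naming from the question) and uploaded with the rest of the configset.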
Re: [ZOOKEEPER] - Error - HEAP MEMORY
Hi, I don't find any documentation about the parameter zookeeper_server_java_heaps in zoo.cfg. The way to control the Java heap size is either the java.env file or the zookeeper-env.sh file. In zookeeper-env.sh: SERVER_JVMFLAGS="-Xmx512m" How much RAM do your servers have? Regards Dominique Le lun. 29 juil. 2019 à 20:35, Rodrigo Oliveira < adamantina.rodr...@gmail.com> a écrit : > Hi, > > After 3 days running, my zookeeper showing this error. > > 2019-07-29 15:10:41,906 [myid:1] - WARN > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - > Connection broken for id 4332550065071534382, my id = 1, error = > java.io.IOException: Received packet with invalid packet: 824196618 > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) > 2019-07-29 15:10:41,906 [myid:1] - WARN > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] - > Interrupting SendWorker > 2019-07-29 15:10:41,907 [myid:1] - WARN > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] - > Interrupted while waiting for message on queue > java.lang.InterruptedException > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) > at > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080) > 2019-07-29 15:10:41,907 [myid:1] - WARN > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - Send > worker leaving thread id 4332550065071534382 my id = 1 > 2019-07-29 15:10:41,917 [myid:1] - INFO 
[/177.55.55.152:3888 > :QuorumCnxManager$Listener@888] - Received connection request / > 177.55.55.63:53972 > 2019-07-29 15:10:41,920 [myid:1] - WARN > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1176] - > Connection broken for id 4332550065071534382, my id = 1, error = > java.io.IOException: Received packet with invalid packet: 840973834 > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) > 2019-07-29 15:10:41,921 [myid:1] - WARN > [RecvWorker:4332550065071534382:QuorumCnxManager$RecvWorker@1179] - > Interrupting SendWorker > 2019-07-29 15:10:41,922 [myid:1] - WARN > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1092] - > Interrupted while waiting for message on queue > java.lang.InterruptedException > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) > at > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080) > 2019-07-29 15:10:41,922 [myid:1] - WARN > [SendWorker:4332550065071534382:QuorumCnxManager$SendWorker@1102] - Send > worker leaving thread id 4332550065071534382 my id = 1 > 2019-07-29 15:10:41,932 [myid:1] - INFO [/177.55.55.152:3888 > :QuorumCnxManager$Listener@888] - Received connection request / > 177.55.55.63:38633 > 2019-07-29 15:10:41,933 [myid:1] - WARN > [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1176] - > Connection broken for id 4332550065071534638, my id = 1, error = > java.io.IOException: Received packet with 
invalid packet: 807419402 > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1163) > 2019-07-29 15:10:41,933 [myid:1] - WARN > [RecvWorker:4332550065071534638:QuorumCnxManager$RecvWorker@1179] - > Interrupting SendWorker > 2019-07-29 15:10:41,934 [myid:1] - WARN > [SendWorker:4332550065071534638:QuorumCnxManager$SendWorker@1092] - > Interrupted while waiting for message on queue > java.lang.InterruptedException > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014) > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088) > at > java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418) > at > > org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243) > at > >