Re: ManagedIndexSchema Bad version when trying to persist schema
Bah, I should have said when you create a collection. You get the following if you create your collection using the default schema:

WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use. To turn off: bin/solr config -c eoe -p 8982 -action set-user-property -property update.autoCreateFields -value false

But as Shawn says, if you have add-unknown-fields-to-the-schema configured in your update processor chain in solrconfig.xml, you're using schemaless mode.

Best,
Erick

On Thu, Oct 11, 2018 at 9:58 AM Shawn Heisey wrote:
>
> On 10/11/2018 10:07 AM, Mikhail Ibraheem wrote:
> > Hi Erick,Thanks for your reply.No, we aren't using schemaless mode.
> > is not explicitly declared in our solrconfig.xml
>
> Schemaless mode is not turned on by the schemaFactory config element.
>
> The default configurations that Solr ships with have schemaless mode enabled. It is handled with an update processor chain named add-unknown-fields-to-the-schema. As Erick mentioned, we *STRONGLY* recommend that you disable this in production. In version 7.5, you only need to change whether that update chain is set to the default chain. This can be handled with a property (update.autoCreateFields) but you can just also edit solrconfig.xml to change the "default" attribute to false.
>
> The exception you're getting is org.apache.zookeeper.KeeperException$BadVersionException which comes from zookeeper. Can you share the *entire* error? It will likely be a few dozen lines in length and may contain multiple "Caused by" sections, each with their own stacktrace. With the entire error, we can narrow down what part of Solr code is executing when the error happens.
>
> Thanks,
> Shawn
Which is the best option to secure the solr specific data in zookeeper?
Hi, When Solr authentication is enabled, which is better to use: ZK ACLs, or enabling authentication for the whole ZooKeeper ensemble itself? Or is there any other better option? Thanks, Yamuna J
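For readers of the archive: beyond securing the ensemble itself, the Solr reference guide ("ZooKeeper Access Control") describes protecting Solr's own znodes with ZooKeeper digest ACLs set through system properties. A hedged sketch of the relevant solr.in.sh settings follows — verify the provider class names against your Solr version's documentation, and treat all credentials below as placeholders:

```shell
# Sketch: enable ZK digest ACLs for Solr's znodes via solr.in.sh.
# Class names per the Solr ref guide's "ZooKeeper Access Control" page;
# usernames/passwords are placeholders to replace.
SOLR_ZK_CREDS_AND_ACLS="-DzkACLProvider=org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider \
 -DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
 -DzkDigestUsername=admin-user -DzkDigestPassword=CHANGE-ME \
 -DzkDigestReadonlyUsername=readonly-user -DzkDigestReadonlyPassword=CHANGE-ME-TOO"
SOLR_OPTS="$SOLR_OPTS $SOLR_ZK_CREDS_AND_ACLS"
```

Note that the same properties must also be supplied to any tooling (e.g. the zkcli script) that touches those znodes, or it will be locked out along with everyone else.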
Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)
On 10/11/2018 4:51 AM, yasoobhaider wrote:
> Hi Shawn, thanks for the inputs. I have uploaded the gc logs of one of the slaves here: https://ufile.io/ecvag (should work till 18th Oct '18) I uploaded the logs to gceasy as well and it says that the problem is consecutive full GCs. According to the solution they have mentioned, increasing the heap size is a solution. But I am already running on a pretty big heap, so don't think increasing the heap size is going to be a long term solution.

Surprisingly, the GC performance in that logfile is actually pretty good. I was more interested in how much heap was actually being used than the performance. The "Heap after GC" button on the gceasy report page (which controls which graph is shown) shows that you really are using most of that 80GB.

If the info you shared about your index sizes is accurate, the only way I can imagine this much heap being necessary is configuration. It sounds like each system should have two configurations -- solrconfig.xml and the schema are the primary things in those configs. Can you share the unique configuration directories for each of your indexes, which I think means there will be two of them? For each of the configurations, indicate the number of documents and size on disk. You'll need to use a file sharing site. It would be best to archive each directory into its own zipfile or .tar.gz file. If your systems are running in cloud mode, the active configuration will be stored in zookeeper.

Thanks,
Shawn
Re: ManagedIndexSchema Bad version when trying to persist schema
On 10/11/2018 10:07 AM, Mikhail Ibraheem wrote:
> Hi Erick, Thanks for your reply. No, we aren't using schemaless mode.
> is not explicitly declared in our solrconfig.xml

Schemaless mode is not turned on by the schemaFactory config element.

The default configurations that Solr ships with have schemaless mode enabled. It is handled with an update processor chain named add-unknown-fields-to-the-schema. As Erick mentioned, we *STRONGLY* recommend that you disable this in production. In version 7.5, you only need to change whether that update chain is set to the default chain. This can be handled with a property (update.autoCreateFields) but you can also just edit solrconfig.xml to change the "default" attribute to false.

The exception you're getting is org.apache.zookeeper.KeeperException$BadVersionException which comes from zookeeper. Can you share the *entire* error? It will likely be a few dozen lines in length and may contain multiple "Caused by" sections, each with their own stacktrace. With the entire error, we can narrow down what part of Solr code is executing when the error happens.

Thanks,
Shawn
Re: ManagedIndexSchema Bad version when trying to persist schema
Erick, I don't get any such message when I start solr - could you share what that curl command should be? You suggest modifying solrconfig.xml - could you be more explicit on what changes to make? Terry On 10/11/2018 11:52 AM, Erick Erickson wrote: > bq: Also why solr updates and persists the managed-schema while ingesting > data? > > I'd guess you are using "schemaless mode", which is expressly > recommended _against_ for production systems. See "Schemaless Mode" in > the reference guide. > > I'd disable schemaless mode (when you start Solr there should be a > message telling you how to disable it via curl, but I'd modify my > solrconfig.xml file to remove it permanently) > > Best, > Erick > On Thu, Oct 11, 2018 at 8:02 AM Mikhail Ibraheem > wrote: >> Hi,We upgraded to Solr 7.5, we try to ingest to solr using solrJ in >> concurrent updates (Many threads).We are getting this >> exception:o.a.s.s.ManagedIndexSchema Bad version when trying to persist >> schema using 1 due to: >> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = >> BadVersion for /configs/my-core/managed-schemao.a.s.s.ManagedIndexSchema >> Failed to persist managed schema at /configs/my-core/managed-schema - >> version mismatch >> >> Also why solr updates and persists the managed-schema while ingesting data? >> I see managed-schema shouldn't be affected by data updates. >> Thanks
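For the curl command Terry asks about: Solr's Config API exposes a set-user-property command that sets update.autoCreateFields, equivalent to the bin/solr invocation Erick quoted earlier in the thread. A sketch, assuming a collection named mycollection on the default port (both are placeholders to adjust):

```shell
# Disable schemaless field guessing by turning off the user property that
# makes the add-unknown-fields-to-the-schema chain the default update chain.
curl -X POST -H 'Content-type:application/json' \
  http://localhost:8983/solr/mycollection/config \
  -d '{"set-user-property": {"update.autoCreateFields": "false"}}'
```

The permanent alternative, as discussed elsewhere in this thread, is editing solrconfig.xml to change the update chain's "default" attribute to false.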
Re: ManagedIndexSchema Bad version when trying to persist schema
Hi Erick, Thanks for your reply. No, we aren't using schemaless mode. schemaFactory is not explicitly declared in our solrconfig.xml. Also we have only one replica and one shard. Any help? Thanks, Mikhail On Thursday, 11 October 2018, 17:53:01 EET, Erick Erickson wrote: bq: Also why solr updates and persists the managed-schema while ingesting data? I'd guess you are using "schemaless mode", which is expressly recommended _against_ for production systems. See "Schemaless Mode" in the reference guide. I'd disable schemaless mode (when you start Solr there should be a message telling you how to disable it via curl, but I'd modify my solrconfig.xml file to remove it permanently) Best, Erick On Thu, Oct 11, 2018 at 8:02 AM Mikhail Ibraheem wrote: > > Hi,We upgraded to Solr 7.5, we try to ingest to solr using solrJ in > concurrent updates (Many threads).We are getting this > exception:o.a.s.s.ManagedIndexSchema Bad version when trying to persist > schema using 1 due to: > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = > BadVersion for /configs/my-core/managed-schemao.a.s.s.ManagedIndexSchema > Failed to persist managed schema at /configs/my-core/managed-schema - version > mismatch > > Also why solr updates and persists the managed-schema while ingesting data? I > see managed-schema shouldn't be affected by data updates. > Thanks
Re: ManagedIndexSchema Bad version when trying to persist schema
bq: Also why solr updates and persists the managed-schema while ingesting data? I'd guess you are using "schemaless mode", which is expressly recommended _against_ for production systems. See "Schemaless Mode" in the reference guide. I'd disable schemaless mode (when you start Solr there should be a message telling you how to disable it via curl, but I'd modify my solrconfig.xml file to remove it permanently) Best, Erick On Thu, Oct 11, 2018 at 8:02 AM Mikhail Ibraheem wrote: > > Hi,We upgraded to Solr 7.5, we try to ingest to solr using solrJ in > concurrent updates (Many threads).We are getting this > exception:o.a.s.s.ManagedIndexSchema Bad version when trying to persist > schema using 1 due to: > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = > BadVersion for /configs/my-core/managed-schemao.a.s.s.ManagedIndexSchema > Failed to persist managed schema at /configs/my-core/managed-schema - version > mismatch > > Also why solr updates and persists the managed-schema while ingesting data? I > see managed-schema shouldn't be affected by data updates. > Thanks
Re: Tika and Solr : rejected document due to mime type restrictions
On 10/11/2018 9:06 AM, Bisonti Mario wrote:
> I start up the Tika server from the command line: java -jar /opt/tika/tika-server-1.19.1.jar
> I configured, with ManifoldCF, a connector to Solr. When I start the ingest of pdf and .xls documents, I see [output] in the tika server, so it seems that the Tika server processes the documents, but the Solr server doesn't ingest. I obtain the error:
> Solr connector rejected document due to mime type restrictions: (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
> Solr connector rejected document due to mime type restrictions: (application/pdf)

Those errors are not coming from Solr. Do you see any errors in solr.log? If you do, then we can help you with those. Since ManifoldCF calls its components connectors, I am betting the errors are being generated by ManifoldCF, and that for those documents, nothing has actually been sent to Solr, so you won't see errors in the solr.log for those files.

ManifoldCF is a separate project within Apache, which has its own support infrastructure.

https://manifoldcf.apache.org/en_US/mail.html

Thanks,
Shawn
Tika and Solr : rejected document due to mime type restrictions
Hello. I start up the Tika server from the command line: java -jar /opt/tika/tika-server-1.19.1.jar

I configured, with ManifoldCF, a connector to Solr. When I start the ingest of pdf and .xls documents, I see in the tika server:

INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @1053ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.z-SNAPSHOT; built: 2018-06-05T18:24:03.829Z; git: d5fc0523cfa96bfebfbda19606cad384d772f04c; jvm 10.0.2+13-Ubuntu-1ubuntu0.18.04.2
INFO Started ServerConnector@f74e835{HTTP/1.1,[http/1.1]}{localhost:9998}
INFO Started @1134ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@68d6972f{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO meta (application/pdf)
INFO meta (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
WARN Using fallback font 'LiberationSans' for 'TimesNewRomanPS-BoldMT'
WARN Using fallback font 'LiberationSans' for 'Arial-Black'
WARN Using fallback font 'LiberationSans' for 'TimesNewRomanPSMT'
WARN Using fallback font 'LiberationSans' for 'Arial-BoldMT'
WARN Using fallback font 'LiberationSans' for 'ArialMT'
WARN Using fallback font 'LiberationSans' for 'CourierNewPSMT'
WARN Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
INFO tika (application/pdf)
WARN Using fallback font 'LiberationSans' for 'TimesNewRomanPS-BoldMT'
WARN Using fallback font 'LiberationSans' for 'Arial-Black'
WARN Using fallback font 'LiberationSans' for 'TimesNewRomanPSMT'
WARN Using fallback font 'LiberationSans' for 'Arial-BoldMT'
WARN Using fallback font 'LiberationSans' for 'ArialMT'
WARN Using fallback font 'LiberationSans' for 'CourierNewPSMT'
WARN Using fallback font 'LiberationSans' for 'TimesNewRomanPS-ItalicMT'
INFO tika (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)

so it seems that the Tika server processes the documents, but the Solr server doesn't ingest.

I obtain the error:

Solr connector rejected document due to mime type restrictions: (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
Solr connector rejected document due to mime type restrictions: (application/pdf)

I understood that Tika converts all documents to text so they can be indexed to Solr, or are there any restrictions about Tika Server mime types? Thanks a lot, Mario
ManagedIndexSchema Bad version when trying to persist schema
Hi, We upgraded to Solr 7.5. We try to ingest into Solr using SolrJ with concurrent updates (many threads). We are getting this exception:

o.a.s.s.ManagedIndexSchema Bad version when trying to persist schema using 1 due to: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /configs/my-core/managed-schema
o.a.s.s.ManagedIndexSchema Failed to persist managed schema at /configs/my-core/managed-schema - version mismatch

Also, why does Solr update and persist the managed-schema while ingesting data? I see managed-schema shouldn't be affected by data updates. Thanks
Re: Solr JVM Memory settings
Shawn,

On 10/11/18 12:54 AM, Shawn Heisey wrote:
> On 10/10/2018 10:08 PM, Sourav Moitra wrote:
>> We have a Solr server with 8gb of memory. We are using solr in cloud mode, solr version is 7.5, Java version is Oracle Java 9 and settings for Xmx and Xms value is 2g but we are observing that the RAM getting used to 98% when doing indexing.
>>
>> How can I ensure that SolrCloud doesn't use more than N GB of memory ?
>
> Where precisely are you seeing the 98% usage? It is completely normal for a modern operating system to report that almost all the system memory is in use, at least after the system has been shuffling a lot of data. All modern operating systems will use memory that has not been specifically allocated to programs for disk caching purposes, and system information tools will generally indicate that this memory is in use, even though it can be instantly claimed by any program that requests it.
>
> https://en.wikipedia.org/wiki/Page_cache
>
> If you tell a Java program that it is limited to a 2GB heap, then that program will never use more than 2GB, plus a little extra for the java runtime itself. I cannot give you an exact figure for that little bit extra. But every bit of data on disk that Solr accesses will end up (at least temporarily) in the operating system's disk cache -- using that unallocated memory.
>
> https://wiki.apache.org/solr/SolrPerformanceProblems#RAM

To be fair, the JVM can use *much more* memory than you have specified for your Java heap. It's just that the Java heap itself won't exceed those values. The JVM uses quite a bit of native memory which isn't counted in the Java heap. There is only one way I know of to control that, and it's to set a process limit at the OS level on the amount of memory allowed. I'm not sure how sensitive to those limits the JVM actually is, so attempting to artificially constrain the JVM might end up with a native OOM crash.
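One way to impose the OS-level process limit Chris mentions, if Solr runs under systemd, is a cgroup memory cap. A sketch only — the unit name and the 4G limit are assumptions for illustration, and as Chris warns, a cap that is too tight can surface as a native OOM crash rather than graceful degradation:

```shell
# Hypothetical systemd drop-in capping total (heap + native) memory
# for a service named solr.service; requires root.
mkdir -p /etc/systemd/system/solr.service.d
cat > /etc/systemd/system/solr.service.d/memory.conf <<'EOF'
[Service]
MemoryMax=4G
EOF
systemctl daemon-reload
systemctl restart solr
```

On systemd versions before 231 the directive is MemoryLimit rather than MemoryMax.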
-chris
Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)
I have to echo what others have said. An 80G heap is waaay out of the norm, especially when you consider the size of your indexes and the number of docs. Understanding why you think you need that much heap should be your top priority.

As has already been suggested, ensuring docValues are set for all fields that are used for sorting, faceting and grouping is a must. Deep paging can hurt too. In addition I'd check the cache settings, do you have a huge filterCache? What about the other caches? One common mistake is to have very high cache settings; in your setup I'd stick with 512 to start.

Without _data_ it's hard to say, so if those settings don't help, the next thing I'd do is a heap dump or put a profiler on the JVM and see where the heap is actually allocated. It's quite possible that you arrived at 80G with some mistaken assumptions and once those are cleared up you can reduce your heap a lot. You say "through a lot of trial and error", what exactly happens when you use, say, a 32G heap? OOMs? Slowdowns? This is also starving your OS cache where most of the Lucene index data is stored, see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Best,
Erick

On Thu, Oct 11, 2018 at 4:42 AM yasoobhaider wrote:
>
> Hi Shawn, thanks for the inputs.
>
> I have uploaded the gc logs of one of the slaves here: https://ufile.io/ecvag (should work till 18th Oct '18)
>
> I uploaded the logs to gceasy as well and it says that the problem is consecutive full GCs. According to the solution they have mentioned, increasing the heap size is a solution. But I am already running on a pretty big heap, so don't think increasing the heap size is going to be a long term solution.
>
> From what I understood from a bit more looking around, this is Concurrent Mode Failure for CMS. I found an old blog mentioning the use of XX:CMSFullGCsBeforeCompaction=1 to make sure that compaction is done prior to next collection trigger. So if it is a fragmentation problem, this will solve it I hope.
>
> I will also try out using docValues as suggested by Ere on a couple of fields on which we make a lot of faceting queries to reduce memory usage on the slaves.
>
> Please share any ideas that you may have from the gc logs analysis
>
> Thanks
> Yasoob
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
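Acting on the docValues suggestion does not require hand-editing managed-schema: the Schema API can redefine a field. A sketch with placeholder collection and field names (note that changing docValues on an existing field requires fully reindexing that field's data):

```shell
# Redefine an existing field to enable docValues via the Schema API.
# "mycollection" and "category" are placeholders for illustration.
curl -X POST -H 'Content-type:application/json' \
  http://localhost:8983/solr/mycollection/schema \
  -d '{"replace-field": {"name": "category", "type": "string", "stored": true, "docValues": true}}'
```

With docValues in place, faceting and sorting read column-oriented on-disk structures instead of building field caches on the heap, which is exactly the memory pressure discussed in this thread.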
Re: Solr JVM Memory settings
Don't know if this directly affects what you're trying to do. But I have an 8GB server and when I run "solr status" I can see what % of the automatic memory allocation is being used. As it turned out, solr would occasionally exceed that (and crashed). I then began starting solr with the additional parameter: "-m 1g" Now the solr consumption is almost always 50% or less, and have had no further problems. On 10/11/2018 12:08 AM, Sourav Moitra wrote: > Hello, > > We have a Solr server with 8gb of memory. We are using solr in cloud > mode, solr version is 7.5, Java version is Oracle Java 9 and settings > for Xmx and Xms value is 2g but we are observing that the RAM getting > used to 98% when doing indexing. > > How can I ensure that SolrCloud doesn't use more than N GB of memory ? > > Sourav Moitra > https://souravmoitra.com >
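For reference, the "-m 1g" parameter mentioned above sets both the minimum (-Xms) and maximum (-Xmx) heap; a minimal sketch, run from the Solr install directory:

```shell
# Start Solr with a fixed 1 GB heap (-m sets both -Xms and -Xmx),
# then inspect the reported memory usage.
bin/solr start -m 1g
bin/solr status
```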
Re: Realtime get not always returning existing data
We are relatively far behind with this one. The collections that we experience the problem on are currently running on 6.3.0. If it's easy enough for you to upgrade, it might be worth a try, but I didn't see any changes to the RealTimeGet in either of the 7.4/5 change logs after a cursory glance. Due to the volume and number of different processes involved, this cluster requires more coordination to reindex and upgrade. So it's currently the last one on our plan to get upgraded to 7.X (or 8.X if timing allows). On Thu, Oct 11, 2018 at 8:22 AM sgaron cse wrote: > Hey Chris, > > Which version of SOLR are you running? I was thinking of maybe trying > another version to see if it fixes the issue. > > On Thu, Oct 11, 2018 at 8:11 AM Chris Ulicny wrote: > > > We've also run into that issue of not being able to reproduce it outside > of > > running production loads. > > > > However, we haven't been encountering the problem in live production > quite > > as much as we used to, and I think that might be from the /get requests > > being spread out a little more evenly over the running interval which is > > due to other process changes. > > > > If I get any new information, I'll update as well. > > > > Thanks for your help. > > > > On Wed, Oct 10, 2018 at 10:53 AM sgaron cse > wrote: > > > > > I haven't found a way to reproduce the problem other that running our > > > entire set of code. I've also been trying different things to make sure > > to > > > problem is not from my end and so far I haven't managed to fix it by > > > changing my code. It has to be a race condition somewhere but I just > > can't > > > put my finger on it. > > > > > > I'll message back if I find a way to reproduce. > > > > > > On Wed, Oct 10, 2018 at 10:48 AM Erick Erickson < erickerick...@gmail.com > > > > > > wrote: > > > > > > > Well assigning a bogus version that generates a 409 error then > > > > immediately doing an RTG on the doc doesn't fail for me either 18 > > > > million tries later. 
So I'm afraid I haven't a clue where to go from > > > > here. Unless we can somehow find a way to generate this failure I'm > > > > going to drop it for the foreseeable future. > > > > > > > > Erick > > > > On Tue, Oct 9, 2018 at 7:39 AM Erick Erickson < > erickerick...@gmail.com > > > > > > > wrote: > > > > > > > > > > H. I wonder if a version conflict or perhaps other failure can > > > > > somehow cause this. It shouldn't be very hard to add that to my > test > > > > > setup, just randomly add n _version_ field value. > > > > > > > > > > Erick > > > > > On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson < > > erickerick...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > Thanks. I'll be away for the rest of the week, so won't be able > to > > > try > > > > > > anything more > > > > > > On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny > > > wrote: > > > > > > > > > > > > > > In our case, we are heavily indexing in the collection while > the > > > /get > > > > > > > requests are happening which is what we assumed was causing > this > > > > very rare > > > > > > > behavior. However, we have experienced the problem for a > > collection > > > > where > > > > > > > the following happens in sequence with minutes in between them. > > > > > > > > > > > > > > 1. Document id=1 is indexed > > > > > > > 2. Document successfully retrieved with /get?id=1 > > > > > > > 3. Document failed to be retrieved with /get?id=1 > > > > > > > 4. Document successfully retrieved with /get?id=1 > > > > > > > > > > > > > > We've haven't looked at the issue in a while, so I don't have > the > > > > exact > > > > > > > timing of that sequence on hand right now. I'll try to find an > > > actual > > > > > > > example, although I'm relatively certain it was multiple > minutes > > in > > > > between > > > > > > > each of those requests. However our autocommit (and soft > commit) > > > > times are > > > > > > > 60s for both collections. 
> > > > > > > > > > > > > > I think the following two are probably the biggest differences > > for > > > > our > > > > > > > setup, besides the version difference (v6.3.0): > > > > > > > > > > > > > > > index to this collection, perhaps not at a high rate > > > > > > > > separate the machines running solr from the one doing any > > > querying > > > > or > > > > > > > indexing > > > > > > > > > > > > > > The clients are on 3 hosts separate from the solr instances. > The > > > > total > > > > > > > number of threads that are making updates and making /get > > requests > > > is > > > > > > > around 120-150. About 40-50 per host. Each of our two > collections > > > > gets an > > > > > > > average of 500 requests per second constantly for ~5 minutes, > and > > > > then the > > > > > > > number slowly tapers off to essentially 0 after ~15 minutes. > > > > > > > > > > > > > > Every thread attempts to make the same series of requests. > > > > > > > > > > > > > > -- Update with "_version_=-1". If successful, no other requests > > are > > > > made. > > >
Re: Realtime get not always returning existing data
Hey Chris, Which version of SOLR are you running? I was thinking of maybe trying another version to see if it fixes the issue. On Thu, Oct 11, 2018 at 8:11 AM Chris Ulicny wrote: > We've also run into that issue of not being able to reproduce it outside of > running production loads. > > However, we haven't been encountering the problem in live production quite > as much as we used to, and I think that might be from the /get requests > being spread out a little more evenly over the running interval which is > due to other process changes. > > If I get any new information, I'll update as well. > > Thanks for your help. > > On Wed, Oct 10, 2018 at 10:53 AM sgaron cse wrote: > > > I haven't found a way to reproduce the problem other that running our > > entire set of code. I've also been trying different things to make sure > to > > problem is not from my end and so far I haven't managed to fix it by > > changing my code. It has to be a race condition somewhere but I just > can't > > put my finger on it. > > > > I'll message back if I find a way to reproduce. > > > > On Wed, Oct 10, 2018 at 10:48 AM Erick Erickson > > > wrote: > > > > > Well assigning a bogus version that generates a 409 error then > > > immediately doing an RTG on the doc doesn't fail for me either 18 > > > million tries later. So I'm afraid I haven't a clue where to go from > > > here. Unless we can somehow find a way to generate this failure I'm > > > going to drop it for the foreseeable future. > > > > > > Erick > > > On Tue, Oct 9, 2018 at 7:39 AM Erick Erickson > > > > wrote: > > > > > > > > H. I wonder if a version conflict or perhaps other failure can > > > > somehow cause this. It shouldn't be very hard to add that to my test > > > > setup, just randomly add n _version_ field value. > > > > > > > > Erick > > > > On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson < > erickerick...@gmail.com > > > > > > wrote: > > > > > > > > > > Thanks. 
I'll be away for the rest of the week, so won't be able to > > try > > > > > anything more > > > > > On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny > > wrote: > > > > > > > > > > > > In our case, we are heavily indexing in the collection while the > > /get > > > > > > requests are happening which is what we assumed was causing this > > > very rare > > > > > > behavior. However, we have experienced the problem for a > collection > > > where > > > > > > the following happens in sequence with minutes in between them. > > > > > > > > > > > > 1. Document id=1 is indexed > > > > > > 2. Document successfully retrieved with /get?id=1 > > > > > > 3. Document failed to be retrieved with /get?id=1 > > > > > > 4. Document successfully retrieved with /get?id=1 > > > > > > > > > > > > We've haven't looked at the issue in a while, so I don't have the > > > exact > > > > > > timing of that sequence on hand right now. I'll try to find an > > actual > > > > > > example, although I'm relatively certain it was multiple minutes > in > > > between > > > > > > each of those requests. However our autocommit (and soft commit) > > > times are > > > > > > 60s for both collections. > > > > > > > > > > > > I think the following two are probably the biggest differences > for > > > our > > > > > > setup, besides the version difference (v6.3.0): > > > > > > > > > > > > > index to this collection, perhaps not at a high rate > > > > > > > separate the machines running solr from the one doing any > > querying > > > or > > > > > > indexing > > > > > > > > > > > > The clients are on 3 hosts separate from the solr instances. The > > > total > > > > > > number of threads that are making updates and making /get > requests > > is > > > > > > around 120-150. About 40-50 per host. Each of our two collections > > > gets an > > > > > > average of 500 requests per second constantly for ~5 minutes, and > > > then the > > > > > > number slowly tapers off to essentially 0 after ~15 minutes. 
> > > > > > > > > > > > Every thread attempts to make the same series of requests. > > > > > > > > > > > > -- Update with "_version_=-1". If successful, no other requests > are > > > made. > > > > > > -- On 409 Conflict failure, it makes a /get request for the id > > > > > > -- On doc:null failure, the client handles the error and moves on > > > > > > > > > > > > Combining this with the previous series of /get requests, we end > up > > > with > > > > > > situations where an update fails as expected, but the subsequent > > /get > > > > > > request fails to retrieve the existing document: > > > > > > > > > > > > 1. Thread 1 updates id=1 successfully > > > > > > 2. Thread 2 tries to update id=1, fails (409) > > > > > > 3. Thread 2 tries to get id=1 succeeds. > > > > > > > > > > > > ...Minutes later... > > > > > > > > > > > > 4. Thread 3 tries to update id=1, fails (409) > > > > > > 5. Thread 3 tries to get id=1, fails (doc:null) > > > > > > > > > > > > ...Minutes later... > > > > > > > > > > > > 6. Thread 4 tries to update id=1, fails (409) > > > > > > 7.
Re: Realtime get not always returning existing data
We've also run into that issue of not being able to reproduce it outside of running production loads. However, we haven't been encountering the problem in live production quite as much as we used to, and I think that might be from the /get requests being spread out a little more evenly over the running interval which is due to other process changes. If I get any new information, I'll update as well. Thanks for your help. On Wed, Oct 10, 2018 at 10:53 AM sgaron cse wrote: > I haven't found a way to reproduce the problem other that running our > entire set of code. I've also been trying different things to make sure to > problem is not from my end and so far I haven't managed to fix it by > changing my code. It has to be a race condition somewhere but I just can't > put my finger on it. > > I'll message back if I find a way to reproduce. > > On Wed, Oct 10, 2018 at 10:48 AM Erick Erickson > wrote: > > > Well assigning a bogus version that generates a 409 error then > > immediately doing an RTG on the doc doesn't fail for me either 18 > > million tries later. So I'm afraid I haven't a clue where to go from > > here. Unless we can somehow find a way to generate this failure I'm > > going to drop it for the foreseeable future. > > > > Erick > > On Tue, Oct 9, 2018 at 7:39 AM Erick Erickson > > wrote: > > > > > > H. I wonder if a version conflict or perhaps other failure can > > > somehow cause this. It shouldn't be very hard to add that to my test > > > setup, just randomly add n _version_ field value. > > > > > > Erick > > > On Mon, Oct 1, 2018 at 8:20 AM Erick Erickson > > > wrote: > > > > > > > > Thanks. I'll be away for the rest of the week, so won't be able to > try > > > > anything more > > > > On Mon, Oct 1, 2018 at 5:10 AM Chris Ulicny > wrote: > > > > > > > > > > In our case, we are heavily indexing in the collection while the > /get > > > > > requests are happening which is what we assumed was causing this > > very rare > > > > > behavior. 
However, we have experienced the problem for a collection > > where > > > > > the following happens in sequence with minutes in between them. > > > > > > > > > > 1. Document id=1 is indexed > > > > > 2. Document successfully retrieved with /get?id=1 > > > > > 3. Document failed to be retrieved with /get?id=1 > > > > > 4. Document successfully retrieved with /get?id=1 > > > > > > > > > > We've haven't looked at the issue in a while, so I don't have the > > exact > > > > > timing of that sequence on hand right now. I'll try to find an > actual > > > > > example, although I'm relatively certain it was multiple minutes in > > between > > > > > each of those requests. However our autocommit (and soft commit) > > times are > > > > > 60s for both collections. > > > > > > > > > > I think the following two are probably the biggest differences for > > our > > > > > setup, besides the version difference (v6.3.0): > > > > > > > > > > > index to this collection, perhaps not at a high rate > > > > > > separate the machines running solr from the one doing any > querying > > or > > > > > indexing > > > > > > > > > > The clients are on 3 hosts separate from the solr instances. The > > total > > > > > number of threads that are making updates and making /get requests > is > > > > > around 120-150. About 40-50 per host. Each of our two collections > > gets an > > > > > average of 500 requests per second constantly for ~5 minutes, and > > then the > > > > > number slowly tapers off to essentially 0 after ~15 minutes. > > > > > > > > > > Every thread attempts to make the same series of requests. > > > > > > > > > > -- Update with "_version_=-1". If successful, no other requests are > > made. 
> > > > > -- On 409 Conflict failure, it makes a /get request for the id.
> > > > > -- On doc:null failure, the client handles the error and moves on.
> > > > >
> > > > > Combining this with the previous series of /get requests, we end
> > > > > up with situations where an update fails as expected, but the
> > > > > subsequent /get request fails to retrieve the existing document:
> > > > >
> > > > > 1. Thread 1 updates id=1 successfully
> > > > > 2. Thread 2 tries to update id=1, fails (409)
> > > > > 3. Thread 2 tries to get id=1, succeeds
> > > > >
> > > > > ...Minutes later...
> > > > >
> > > > > 4. Thread 3 tries to update id=1, fails (409)
> > > > > 5. Thread 3 tries to get id=1, fails (doc:null)
> > > > >
> > > > > ...Minutes later...
> > > > >
> > > > > 6. Thread 4 tries to update id=1, fails (409)
> > > > > 7. Thread 4 tries to get id=1, succeeds
> > > > >
> > > > > As Steven mentioned, it happens very, very rarely. We tried to
> > > > > recreate it in a more controlled environment, but ran into the
> > > > > same issue that you did, Erick. Every simplified situation we ran
> > > > > produced no problems. Since it's not a large issue for us and
> > > > > happens very rarely, we stopped trying to recreate it.
> > > > >
> > > > > On Sun, Sep
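[Editor's note: the per-thread request pattern described in this thread can be sketched as follows. This is a minimal illustration, not the poster's actual client: `do_update` and `do_get` are hypothetical stand-ins for HTTP calls to Solr's /update and /get handlers, injected as callables so the decision logic is visible and testable without a live server.]

```python
def create_if_absent(doc_id, doc, do_update, do_get):
    """Optimistic insert; on version conflict, fall back to real-time /get.

    do_update(doc) -> HTTP status code (200 on success, 409 on conflict)
    do_get(doc_id) -> the stored document, or None (the rare doc:null case)
    """
    # _version_=-1 tells Solr the document must NOT already exist.
    status = do_update({**doc, "id": doc_id, "_version_": -1})
    if status == 200:
        return doc  # insert succeeded; no further requests needed
    if status == 409:
        existing = do_get(doc_id)
        if existing is None:
            # The surprising case from the thread: the update handler says
            # the doc exists, but real-time /get cannot find it.
            raise RuntimeError(f"doc:null for id={doc_id} after 409")
        return existing
    raise RuntimeError(f"unexpected status {status}")
```

With fakes standing in for Solr, the three outcomes from the thread (insert, conflict-then-get, conflict-then-doc:null) are the only paths through this function.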
Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)
Hi Shawn, thanks for the inputs. I have uploaded the gc logs of one of the slaves here: https://ufile.io/ecvag (should work till 18th Oct '18). I uploaded the logs to gceasy as well, and it says that the problem is consecutive full GCs. According to the solution they have mentioned, increasing the heap size would help. But I am already running on a pretty big heap, so I don't think increasing the heap size is going to be a long-term solution.

From what I understood after a bit more looking around, this is a Concurrent Mode Failure for CMS. I found an old blog mentioning the use of -XX:CMSFullGCsBeforeCompaction=1 to make sure that compaction is done prior to the next collection trigger. So if it is a fragmentation problem, this will hopefully solve it. I will also try out using docValues, as suggested by Ere, on a couple of fields on which we make a lot of faceting queries, to reduce memory usage on the slaves. Please share any ideas that you may have from the gc logs analysis.

Thanks
Yasoob

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
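[Editor's note: a sketch of where the compaction flag mentioned above would go, via GC_TUNE in Solr's bin/solr.in.sh. The surrounding CMS flags and values are illustrative, not a recommendation; tune them for the actual workload.]

```shell
# In bin/solr.in.sh (example values only):
GC_TUNE="-XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:CMSFullGCsBeforeCompaction=1"
```

-XX:CMSFullGCsBeforeCompaction=1 asks HotSpot to compact the old generation on every full collection, trading longer pauses for reduced fragmentation, which is the failure mode suspected here.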
Re: Solr JVM Memory settings
Besides the heap, the JVM has other memory areas, like the metaspace: https://docs.oracle.com/javase/9/tools/java.htm -> MaxMetaspaceSize. Search for "size" in that document and you'll find tons of further settings. I have not tried out Oracle Java 9 yet.

regards,
Hendrik

On 11.10.2018 06:08, Sourav Moitra wrote:
Hello,

We have a Solr server with 8gb of memory. We are using Solr in cloud mode, Solr version is 7.5, Java version is Oracle Java 9, and the Xmx and Xms settings are both 2g, but we are observing RAM usage climb to 98% when indexing. How can I ensure that SolrCloud doesn't use more than N GB of memory?

Sourav Moitra
https://souravmoitra.com
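[Editor's note: a sketch of capping the non-heap areas Hendrik mentions, again via bin/solr.in.sh; the sizes here are hypothetical examples, not recommendations.]

```shell
# In bin/solr.in.sh (example values only):
SOLR_HEAP="2g"   # sets both -Xms and -Xmx
# Cap class metadata and direct (off-heap) buffers as well:
GC_TUNE="$GC_TUNE -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=512m"
```

Even with such caps, most of the "used" RAM on a Solr box is typically the OS page cache holding memory-mapped index files; that memory is reclaimable by the OS and is what keeps searches fast, so 98% apparent usage is not by itself a sign that the JVM is exceeding its limits.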