Re: Plugin Performance Issues
Interesting... I guess I had assumed that having type=index meant the analyzer wasn't used at query time, but I see why that's not the case.

Here's the thing though: we had one field defined using this fieldtype, and we started seeing the issue when we deployed the new schema to Solr. However, we had not yet released our code that uses the new field (we have to make the change on the Solr end before the code, so the two deployments are offset by a few days). So the field of that fieldtype wasn't even being queried against.

The problem would be pretty easy for us to reproduce, but I don't think our sys admins would appreciate us experimenting with our production Solr servers. We can pretty much only reproduce it in our live environment because that's the only environment that's really getting regular (100 qps) traffic, so I guess you could say that it is traffic related.

Some other notes: we have a distributed index across 3 shards. We also pick up snapshots from the master server about once per hour, so whatever commits happen during snapinstalling may affect it, but the timeline of the memory growth doesn't really line up with those commits.

Anyway, I know it all seems like a mystery and I apologize if it seems like I'm being vague, but the issue really is that simple. Hopefully if someone else ever experiences it they can come up with a better explanation. Until then, we decided to deploy our custom classes the old way, by exploding the war and placing the jars in there. It's not nearly as convenient, but we haven't experienced any problems doing it this way (same code and config btw, so since the only difference is using the lib directory vs. not, that's most likely the problem).
Thanks for your help.

-- View this message in context: http://old.nabble.com/Plugin-Performance-Issues-tp24295010p26201123.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Plugin Performance Issues
: the thing though: We had one field defined using this fieldtype and we
: deployed the new schema to solr when we started seeing the issue. However,
: we had not yet released our code that was using the new field (obviously we
: have to make the change on the solr end before the code, so we
: asynchronously do this offset by a few days). So the field that was of that
: fieldtype wasn't even being queried against.

But even then: having the factory declared as part of a fieldtype means the factory is going to get instantiated, so without any information about what the factory does in its init/inform methods, there's really no way to guess what might be causing the behavior you are seeing.

: The problem for us would be pretty easy to reproduce, but I don't think our
: sys admins would appreciate experimenting with our production solr servers.

I completely understand that, but at this point I can't reproduce what you're seeing, and I haven't seen anyone else say that they can reproduce it either. The simplest explanation is that it's probably not a bug in Solr, but it might be a bug in your code.

I don't know how else to say this, but: if you don't show us some code that other people can use to try to reproduce it, we can't really help you.

-Hoss
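To make the init/inform point concrete, here is a minimal sketch of how a factory could hold on to memory even if the field is never queried. It is purely hypothetical: it uses no Solr classes, and the names (FactoryLeakDemo, HypotheticalFilterFactory, RETAINED) are invented for illustration, not taken from the poster's code. The assumption is simply that the factory registers something in static state each time it is instantiated and initialized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FactoryLeakDemo {

    // Hypothetical stand-in for a custom analysis factory. It does NOT extend
    // any Solr class; it only mimics an "instantiate, then init" lifecycle.
    static class HypotheticalFilterFactory {
        // Static state outlives every factory instance and is never cleared.
        private static final List<List<String>> RETAINED = new ArrayList<>();

        private List<String> stopwords;

        // Runs once per instantiation, whether or not the field is ever queried.
        void init(List<String> wordsFromFile) {
            stopwords = new ArrayList<>(wordsFromFile);
            RETAINED.add(stopwords); // the leak: one more retained copy per init
        }

        static int retainedCopies() {
            return RETAINED.size();
        }
    }

    // Simulates the schema being loaded `loads` times and returns how many
    // additional word-list copies are still reachable afterwards.
    static int simulateSchemaLoads(int loads) {
        int before = HypotheticalFilterFactory.retainedCopies();
        for (int i = 0; i < loads; i++) {
            new HypotheticalFilterFactory().init(Arrays.asList("a", "an", "the"));
        }
        return HypotheticalFilterFactory.retainedCopies() - before;
    }

    public static void main(String[] args) {
        System.out.println("copies retained after 3 loads: " + simulateSchemaLoads(3));
    }
}
```

Nothing here says the real plugin does this; it only shows why "the field isn't queried" does not by itself rule the factory out.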
Re: Plugin Performance Issues
: <fieldtype name="text_lc" class="solr.TextField" tokenized="false">
:   <analyzer type="index">
:     <tokenizer class="my.custom.TokenizerFactory"/>
:     <filter class="my.custom.FilterFactory" words="stopwords.txt"/>
:     <filter class="solr.LowerCaseFilterFactory"/>
:     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
:   </analyzer>
: </fieldtype>
	...
: only do indexing on the master server. However, with this schema in place
: on the slaves, as well as our custom.jar in the solrHome/lib directory, we
: run into these issues where the memory usage grows and grows without
: explanation.

...even if you only do indexing on the master, having a single analyzer defined for a field means it's used at both index and query time (even though you say 'type=index'), so a memory leak in either of your custom factories could cause a problem on a query box.

This however concerns me...

: fact, in a previous try, we had simply dropped one of our custom plugin jars
: into the lib directory but forgot to deploy the new solrconfig or schema
: files that referenced the classes in there, and the issue still occurred.

...this I can't think of a rational explanation for. Can you elaborate on what you can do to create this problem, i.e.: does the memory usage grow even when Solr doesn't get any requests? Or does it happen when searches are executed? Or when commits happen? etc...

If the problem is as easy to reproduce as you describe, can you please generate some heap dumps against a server that isn't processing any queries: one from when the server first starts up, and one from when the server crashes from an OOM (there's a JVM option for generating heap dumps on OOM that I can't think of off the top of my head).

-Hoss
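The JVM option being reached for is most likely HotSpot's -XX:+HeapDumpOnOutOfMemoryError (optionally combined with -XX:HeapDumpPath=...), which covers the "when it crashes" dump. For the "just started up" dump, the sketch below shows one way to write an .hprof file on demand through the HotSpotDiagnostic MXBean. It assumes a HotSpot JVM; the class and method names (HeapDumper, dumpToTempDir) are made up for the example, and for a Solr instance inside Tomcat the same MBean operation would more realistically be invoked over JMX or replaced by jmap against the running process.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;

class HeapDumper {

    // Writes a heap dump of the current JVM to `target` and returns it.
    // liveOnly=true dumps only objects that are still reachable.
    // The target file must not exist yet and should end in ".hprof".
    static File dump(File target, boolean liveOnly) {
        try {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.getAbsolutePath(), liveOnly);
            return target;
        } catch (IOException e) {
            throw new IllegalStateException("heap dump failed", e);
        }
    }

    // Convenience for trying the sketch out: dumps into a fresh temp directory.
    static File dumpToTempDir(String label) {
        try {
            File dir = Files.createTempDirectory("heapdumps").toFile();
            return dump(new File(dir, label + ".hprof"), true);
        } catch (IOException e) {
            throw new IllegalStateException("could not create temp directory", e);
        }
    }

    public static void main(String[] args) {
        File out = dumpToTempDir("startup");
        System.out.println("wrote " + out + " (" + out.length() + " bytes)");
    }
}
```

Comparing a start-up dump with the OOM dump in a heap analyzer is what would show which classes account for the growth.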
Re: Plugin Performance Issues
I would guess that your code is being used. I'm not sure what you mean by it was only referenced in the schema. That implies usage to me. Is it a new field type? What is your plugin doing? Have you tried setting breakpoints at method entry points in your plugin and starting up Solr with a debugger attached?

-Grant

On Oct 28, 2009, at 4:54 PM, entdeveloper wrote:

: The code in our jar wasn't being used at all...the class was only
: referenced in the schema.

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
Re: Plugin Performance Issues
This is an issue we experienced a while back. We once again tried to load a custom class as a plugin jar from the lib directory and began experiencing severe memory problems again. The code in our jar wasn't being used at all...the class was only referenced in the schema.

I find it strange that no one else has experienced this, but we're not doing anything particularly complex, which is still leading me to believe that there is something strange going on with Solr's class loading for this lib directory. Perhaps it is something specific with our environment (specs below)?

java version 1.6.0_05
Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)
Tomcat 6.0.16
Linux 2.6.9-35.ELsmp #1 SMP Thu Jun 1 14:31:29 PDT 2006 x86_64 x86_64 x86_64 GNU/Linux

Max heap set to 1GB. With the jars in the plugin directory, RAM usage increases by 1.5 - 2GB, increasing at about 200MB/hr.
Re: Plugin Performance Issues
: I'm not entirely convinced that it's related to our code, but it could be.
: Just trying to get a sense if other plugins have had similar problems, just
: by the nature of using Solr's resource loading from the /lib directory.

Plugins aren't something that every Solr user uses, but enough people use them that if there was a fundamental memory leak just from loading plugin jars, I'm guessing more people would be complaining.

I use plugins in several Solr instances, and I've never noticed any problems like you describe, but I don't personally use Tomcat.

Otis is right on the money: you need to use profiling tools to really look at the heap and see what's taking up all that RAM.

Alternately: a quick way to rule out the special plugin class loader would be to embed your custom handler directly into the solr.war ("The Old Way" on the SolrPlugins wiki) ... if you still have problems, then the cause isn't the plugin classloader.

-Hoss
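When comparing the lib directory against the exploded war, it can help to confirm which class loader actually defined the plugin class in each setup. The sketch below uses only standard JDK calls; ClassOrigin and describe are hypothetical names, and in practice the argument would be the custom plugin class, with the result written to the Solr log.

```java
import java.net.URL;
import java.security.CodeSource;

class ClassOrigin {

    // Describes which class loader defined `type` and where it was loaded from.
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader(); // null means the bootstrap loader
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return type.getName()
                + " loader=" + (loader == null ? "bootstrap" : loader.getClass().getName())
                + " location=" + (location == null ? "unknown" : location.toString());
    }

    public static void main(String[] args) {
        // A JDK class, defined by the bootstrap loader.
        System.out.println(describe(String.class));
        // This class itself, defined by whatever loader started the program.
        System.out.println(describe(ClassOrigin.class));
    }
}
```

If the reported loader or jar location is not the one expected for a given deployment, the test is not isolating the plugin class loader the way it was meant to.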
Plugin Performance Issues
We recently created a custom class for our spellchecking implementation in Solr. We decided to include the class in a custom jar and deployed it to the /lib directory in solr_home to use it as a plugin. After a while (about 12 hours), the heap usage for Solr slowly starts to rise, and we eventually run into swap issues, which end up killing our performance.

We've tried several different things to solve the problem, originally thinking it was our code, but on one of our servers, the new code in the plugin wasn't even being used.

Has anyone else experienced this? I'm wondering if this is perhaps a side effect of using plugins in general, perhaps something going on with Solr's custom class loading. We're using Tomcat 6 and Solr 1.3, by the way.
Re: Plugin Performance Issues
Our max heap was configured to use 5GB. It had been running fine until we tried to deploy a new queryConverter for our SpellcheckComponent. After that, we upped our heap to 8GB and still had issues. Solr is the only webapp running on Tomcat. We are using sorting and faceting, but again, hadn't had problems until deploying this plugin.

Also, seeing as how it's only spellchecking related (and we have a separate RequestHandler that only handles spellchecking, while leaving the SpellcheckComponent out of our standard RequestHandler), I'm not entirely convinced that it's related to our code, but it could be. Just trying to get a sense if other plugins have had similar problems, just by the nature of using Solr's resource loading from the /lib directory.

Otis Gospodnetic wrote:

: Hi,
:
: Could it simply be the case that you really do need all that memory that the
: JVM starts consuming with time? How large of a heap are you using, is Solr
: the only webapp in your Tomcat, and are you using sorting or faceting?
:
: Otis
: --
: Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
:
: ----- Original Message -----
: From: CameronL cameron.develo...@gmail.com
: To: solr-user@lucene.apache.org
: Sent: Wednesday, July 1, 2009 2:37:40 PM
: Subject: Plugin Performance Issues
:
: : After a while (about 12 hours), the heap usage for Solr slowly starts to
: : rise, and we eventually run into swap issues which ends up killing our
: : performance.
Re: Plugin Performance Issues
Hi,

A 5GB heap sounds quite big, let alone an 8GB heap. I would try simple stuff like jmap to see what's eating the memory, and if that doesn't work I'd try using a profiler. Turn off norms if you don't need them, and either use trie-based fields for dates if you have them and sort by them, or round those dates up.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: CameronL cameron.develo...@gmail.com
To: solr-user@lucene.apache.org
Sent: Wednesday, July 1, 2009 4:43:11 PM
Subject: Re: Plugin Performance Issues

: Our max heap was configured to use 5GB. It has been running fine until we
: tried to deploy a new queryConverter for our SpellcheckComponent. After
: which, we upped our heap to 8GB and still had issues.
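Alongside jmap (for example, jmap -histo against the Tomcat process id), a cheap first check is whether the growth is inside the Java heap at all. The sketch below samples heap and non-heap usage through the standard MemoryMXBean; the class name MemorySampler, the output format, and the sampling interval are assumptions for illustration. It reports on the JVM it runs in, so for Solr the same MXBean would be read over JMX or from code running inside the webapp.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

class MemorySampler {

    // Returns { heapUsedBytes, nonHeapUsedBytes } for the current JVM.
    static long[] sampleBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return new long[] {
            memory.getHeapMemoryUsage().getUsed(),
            memory.getNonHeapMemoryUsage().getUsed()
        };
    }

    // Formats a sample in whole megabytes (rounded down).
    static String format(long[] sample) {
        return String.format("heap=%dMB nonHeap=%dMB", sample[0] >> 20, sample[1] >> 20);
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart; a real run would use a much
        // longer interval and log the values for hours.
        for (int i = 0; i < 3; i++) {
            System.out.println(format(sampleBytes()));
            Thread.sleep(1000L);
        }
    }
}
```

If heap usage stays flat while the process size keeps climbing, the growth is outside the heap and a heap histogram alone will not explain it.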