Re: Plugin Performance Issues

2009-11-04 Thread entdeveloper

Interesting...I guess I had logically assumed that having type=index meant
it wasn't used for query time, but I see why that's not possible.  Here's
the thing though: We had one field defined using this fieldtype and we
deployed the new schema to solr when we started seeing the issue.  However,
we had not yet released our code that uses the new field (we have to make
the change on the solr end before the code, so the two deployments are
staggered by a few days).  So the field of that fieldtype wasn't even being
queried against.

The problem for us would be pretty easy to reproduce, but I don't think our
sys admins would appreciate experimenting with our production solr servers. 
We can pretty much only reproduce on our live environment because that's the
only environment that's really getting regular (100 qps) traffic, so I guess
you could say that it is traffic related.  

Just some other notes, we have a distributed index across 3 shards.  We also
regularly pick up snapshots from the master server about once per hour, so
whatever commits happen during snapinstalling may affect it, but the
timeline of the memory growing doesn't really line up with those commits.

Anyway, I know it all seems like mystery and I apologize if it seems like
I'm being vague, but the issue really is that simple.  Hopefully if someone
else ever experiences it they can come up with a better explanation why. 
Until then, we decided to just deploy our custom classes the old way by
exploding the war and placing the jars in there - not nearly as convenient,
but we haven't experienced any problems doing it this way (same code and
config btw, so since the only difference is using the lib directory vs. not,
that's most likely the problem).

Thanks for your help



-- 
View this message in context: 
http://old.nabble.com/Plugin-Performance-Issues-tp24295010p26201123.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Plugin Performance Issues

2009-11-04 Thread Chris Hostetter

: the thing though: We had one field defined using this fieldtype and we
: deployed the new schema to solr when we started seeing the issue.  However,
: we had not yet released our code that was using the new field (obviously we
: have to make the change on the solr end before the code, so we
: asynchronously do this offset by a few days).  So the field that was of that
: fieldtype wasn't even being queried against.

but even then: having the Factory declared as part of a fieldtype means 
the factory is going to get instantiated -- so without any information 
about what the factory does in its init/inform methods, there's really no 
way to guess what might be causing the behavior you are seeing.
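
To illustrate the point about instantiation, here is a minimal sketch of how a factory could retain memory without ever analyzing a token. `LeakyFactory` and its `init` method are hypothetical stand-ins, not Solr classes and not the poster's code; the only assumption is that some hook runs each time the factory is instantiated and that it touches static state.

```java
import java.util.ArrayList;
import java.util.List;

public class LeakyFactoryDemo {

    /** Hypothetical stand-in for a custom factory; NOT a Solr class. */
    static class LeakyFactory {
        // Static, so the list outlives every individual factory instance.
        static final List<byte[]> CACHE = new ArrayList<>();

        // Stand-in for an init/inform hook that runs once per instantiation.
        void init() {
            CACHE.add(new byte[1024 * 1024]); // 1 MB retained per instantiation
        }
    }

    /** Instantiates the factory n times and returns the number of retained buffers. */
    static int instantiate(int n) {
        for (int i = 0; i < n; i++) {
            new LeakyFactory().init();
        }
        return LeakyFactory.CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("retained buffers: " + instantiate(5));
    }
}
```

Nothing here is ever "used" for analysis, yet each instantiation leaves another buffer reachable from the static list, which is the kind of pattern a heap dump would expose.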

: The problem for us would be pretty easy to reproduce, but I don't think our
: sys admins would appreciate experimenting with our production solr servers. 

I completely understand that, but at this point i can't reproduce what 
you're seeing, and i haven't seen anyone else say that they can reproduce 
it either -- the simplest explanation being that it's probably not a bug 
in Solr, but it might be a bug in your code.

I don't know how else to say this but: if you don't show us some code 
that other people can use to try and reproduce, we can't really help you.



-Hoss



Re: Plugin Performance Issues

2009-11-03 Thread Chris Hostetter

: <fieldtype name="text_lc" class="solr.TextField" tokenized="false">
:   <analyzer type="index">
:     <tokenizer class="my.custom.TokenizerFactory"/>
:     <filter class="my.custom.FilterFactory" words="stopwords.txt"/>
:     <filter class="solr.LowerCaseFilterFactory"/>
:     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
:   </analyzer>
: </fieldtype>
...
: only do indexing on the master server.  However, with this schema in place
: on the slaves, as well as our custom.jar in the solrHome/lib directory, we
: run into these issues where the memory usage grows and grows without
: explanation.

...even if you only do indexing on the master, having a single analyzer 
defined for a field means it's used at both index and query time (even 
though you say 'type=index') so a memory leak in either of your custom 
factories could cause a problem on a query box.

This however concerns me...

: fact, in a previous try, we had simply dropped one of our custom plugin jars
: into the lib directory but forgot to deploy the new solrconfig or schema
: files that referenced the classes in there, and the issue still occurred.

...this i can't think of a rational explanation for.  Can you elaborate on 
what you can do to create this problem, i.e.: does the memory usage grow 
even when solr doesn't get any requests? or does it happen when searches are 
executed? or when commits happen? etc...

If the problem is as easy to reproduce as you describe, can you please 
generate some heap dumps against a server that isn't processing any 
queries -- one from when the server first starts up, and one from when the 
server crashes from an OOM (there's a JVM option for generating heap dumps 
on OOM that i can't think of off the top of my head)
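
The HotSpot option being reached for here is most likely `-XX:+HeapDumpOnOutOfMemoryError`, optionally combined with `-XX:HeapDumpPath=<file or dir>`. For the "first starts up" dump, one option is to trigger a dump from inside the JVM. The sketch below assumes a HotSpot JVM and Java 7 or later (the Java 6 JVM described in this thread would need a different way to obtain the bean); `HeapDumper` and the output file name are made up for illustration. It dumps the heap of the JVM it runs in, so it would have to be invoked from within the Solr webapp; for an external process `jmap -dump:format=b,file=heap.hprof <pid>` is the usual route.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {

    /**
     * Writes a heap dump of the current JVM to the given path.
     * The target file must not already exist; recent JDKs also
     * require the name to end in ".hprof".
     */
    static void dump(String path, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name, used only when no argument is given.
        dump(args.length > 0 ? args[0] : "solr-heap.hprof", true);
    }
}
```

Comparing a startup dump with one taken after several hours of growth in a heap analyzer should show which classes account for the difference.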



-Hoss



Re: Plugin Performance Issues

2009-10-29 Thread Grant Ingersoll
I would guess that your code is being used.  I'm not sure what you  
mean by it was only referenced in the schema.  That implies usage to  
me.  Is it a new field type?  What is your plugin doing?


Have you tried setting breakpoints at method entry points in your  
plugin and starting up Solr w/ a debugger attached?


-Grant


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Re: Plugin Performance Issues

2009-10-28 Thread entdeveloper

This is an issue we experienced a while back.  We once again tried to load a
custom class as a plugin jar from the lib directory and began experiencing
severe memory problems again.  The code in our jar wasn't being used at
all...the class was only referenced in the schema.  I find it strange that
no one else has experienced this, but we're not doing anything particularly
complex, which is still leading me to believe that there is something
strange going on with Solr's class loading for this lib directory.  Perhaps
it is something specific with our environment (specs below)?

java version 1.6.0_05
Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)

Tomcat 6.0.16

Linux 2.6.9-35.ELsmp #1 SMP Thu Jun 1 14:31:29 PDT 2006 x86_64 x86_64 x86_64
GNU/Linux

Max heap set to 1GB.

With the jars in the plugin directory, RAM usage increases by 1.5 - 2GB,
increasing at about 200MB/hr.
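
Since the growth reported above (1.5 - 2GB) exceeds the 1GB max heap, it may be worth checking whether the memory is going somewhere other than the Java heap, for example the permanent generation or other JVM-managed non-heap pools. That is an assumption, not something established in this thread. A minimal sketch using the standard `java.lang.management` API follows; `MemoryBreakdown` is a made-up name, and the code reports on the JVM it runs in, so it would need to run inside the Solr JVM. It does not see native allocations made outside the JVM's own pools.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryBreakdown {

    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    /** Returns a one-line summary of heap vs. non-heap usage for this JVM. */
    static String describe() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        return String.format(
                "heap used=%d MB committed=%d MB | non-heap used=%d MB committed=%d MB",
                toMb(heap.getUsed()), toMb(heap.getCommitted()),
                toMb(nonHeap.getUsed()), toMb(nonHeap.getCommitted()));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If heap usage stays flat while the process size keeps climbing, a heap profiler alone is unlikely to explain the growth.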






Re: Plugin Performance Issues

2009-07-02 Thread Chris Hostetter

: I'm not entirely convinced that it's related to our code, but it could be. 
: Just trying to get a sense if other plugins have had similar problems, just
: by the nature of using Solr's resource loading from the /lib directory.

Plugins aren't something that every Solr user uses -- but enough people use 
them that if there was a fundamental memory leak just from loading plugin 
jars i'm guessing more people would be complaining.

I use plugins in several solr instances, and i've never noticed any 
problems like you describe -- but i don't personally use tomcat.

Otis is right on the money: you need to use profiling tools to really look 
at the heap and see what's taking up all that ram.

Alternately: a quick way to rule out the special plugin class loader would 
be to embed your custom handler directly into the solr.war (The Old Way 
on the SolrPlugins wiki) ... if you still have problems, then the cause 
isn't the plugin classloader.





-Hoss



Plugin Performance Issues

2009-07-01 Thread CameronL

We recently created a custom class for our spellchecking implementation in
Solr.  We decided to include the class in a custom jar and deployed it to
the /lib directory in solr_home to use it as a plugin.

After a while (about 12 hours), the heap usage for Solr slowly starts to
rise, and we eventually run into swap issues which end up killing our
performance.  We've tried several different things to try to solve the
problem, originally thinking it was our code, but on one of our servers, the
new code in the plugin wasn't even being used.

Has anyone else experienced this?  I'm wondering if this is perhaps a
side-effect of using plugins in general, perhaps something going on with the
custom class loading of Solr.

We're using Tomcat 6 and Solr 1.3 by the way.



Re: Plugin Performance Issues

2009-07-01 Thread CameronL

Our max heap was configured to use 5GB.  It had been running fine until we
tried to deploy a new queryConverter for our SpellcheckComponent.  After
that, we upped our heap to 8GB and still had issues.

Solr is the only webapp running on Tomcat.

We are using sorting and faceting, but again, hadn't had problems until
deploying this plugin.  Also, seeing as how it's only spellchecking related
(and we have a separate RequestHandler that only handles spellchecking,
while leaving the SpellcheckComponent out of our standard RequestHandler),
I'm not entirely convinced that it's related to our code, but it could be. 
Just trying to get a sense if other plugins have had similar problems, just
by the nature of using Solr's resource loading from the /lib directory.





Re: Plugin Performance Issues

2009-07-01 Thread Otis Gospodnetic

Hi,

5GB heap sounds quite big, let alone the 8 GB heap.  I would try simple stuff 
like jmap to see what's eating the memory, and if that doesn't work I'd try 
using a profiler.
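
Before reaching for a profiler, a coarse timeline of heap usage can show whether the growth is steady or tied to particular events. The sketch below is only an illustration: `HeapGrowth`, the one-minute interval and the 60-sample limit are arbitrary choices, and it measures the JVM it runs in, so it would need to run inside the Solr JVM or be adapted to read the same values over JMX.

```java
import java.lang.management.ManagementFactory;

public class HeapGrowth {

    /** Average growth in MB per hour between two used-heap readings. */
    static double growthMbPerHour(long startBytes, long endBytes, long elapsedMillis) {
        double mb = (endBytes - startBytes) / (1024.0 * 1024.0);
        double hours = elapsedMillis / 3_600_000.0;
        return mb / hours;
    }

    /** Used heap of the current JVM, in bytes. */
    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMillis = 60_000; // arbitrary sampling interval
        long start = usedHeapBytes();
        long t0 = System.currentTimeMillis();
        for (int i = 0; i < 60; i++) {
            Thread.sleep(intervalMillis);
            long now = usedHeapBytes();
            long elapsed = System.currentTimeMillis() - t0;
            System.out.printf("used=%d MB, trend=%.1f MB/hr%n",
                    now / (1024 * 1024), growthMbPerHour(start, now, elapsed));
        }
    }
}
```

Individual readings fluctuate with garbage collection, so only the trend over many samples is meaningful.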

Turn off norms if you don't need them, and if you have date fields and sort 
by them, either use trie-based fields for them or round those dates.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


