Re: Truncated XML responses from CoreAdminHandler

2009-07-31 Thread James Brady
Hi Mark,
You're right - a custom request handler sounds like the right option.

I've created a handler as you suggested, but I'm having problems on Solr
startup (my class is LiveCoresHandler):
Jul 31, 2009 5:20:39 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.ClassCastException: LiveCoresHandler
    at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:152)
    at org.apache.solr.core.RequestHandlers$1.create(RequestHandlers.java:161)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
    at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:169)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:444)

I've tried a few variations on the class definition, including extending
RequestHandlerBase (as suggested here:
http://wiki.apache.org/solr/SolrRequestHandler#head-1de7365d7ecf2eac079c5f8b92ee9af712ed75c2)
and implementing SolrRequestHandler directly.

I'm sure that the Solr libraries I built against and those I'm running on
are the same version, as I unzipped the Solr war file and copied the
relevant jars out of there to build against.

Any ideas on what could be causing the ClassCastException? I've attached a
debugger to the running Solr process but it didn't shed any light on the
issue...

Thanks!
James

2009/7/20 Mark Miller markrmil...@gmail.com

 Hi James,

 That is very odd behavior! I'm not sure what's causing it at the moment, but
 that is not a great way to get all of the core names anyway. It also gathers
 a *lot* of information for each core that you don't need, including index
 statistics from Luke. It's very heavyweight for what you want. So while I
 hope we get to the bottom of this, here is what I would recommend:

 Create your own plugin RequestHandler. This is very simple - often they just
 extend RequestHandlerBase, but for this you don't even need to. You can
 leave most of the RequestHandler methods unimplemented if you'd like - you
 just want to override/add to:

 public void handleRequest(SolrQueryRequest req, SolrQueryResponse rsp)

 and in that method you can have a very simple impl:

    Collection<String> names =
        req.getCore().getCoreDescriptor().getCoreContainer().getCoreNames();
    rsp.add("cores", names);
    // if the cores are dynamic, you prob don't want to cache
    rsp.setHttpCaching(false);

 Then just plug your simple RequestHandler into {solr.home}/lib and add it
 to solrconfig.xml.
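 For completeness, the registration in solrconfig.xml might look something
 like the following - the handler name and class here are illustrative
 assumptions (reusing James's LiveCoresHandler class name), not config from
 the thread:

```xml
<!-- hypothetical registration: adjust the name and class to your own handler -->
<requestHandler name="/admin/livecores" class="LiveCoresHandler" />
```

 The handler is then reachable at that path, e.g. /solr/admin/livecores.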

 You might also add a JIRA issue requesting the feature for future versions -
 but that's prob the best solution for 1.3 - I'm not seeing the functionality
 there.

 --
 - Mark

 http://www.lucidimagination.com

 On Sat, Jul 18, 2009 at 9:02 PM, James Brady james.colin.br...@gmail.com
 wrote:

  The Solr application I'm working on has many concurrently active cores - of
  the order of 1000s at a time.

  The management application depends on being able to query Solr for the
  current set of live cores, a requirement I've been satisfying using the
  STATUS core admin handler method.

  However, once the number of active cores reaches a particular threshold
  (which I haven't determined exactly), the response to the STATUS method is
  truncated, resulting in malformed XML.
 
  My debugging so far has revealed:
 
- when doing STATUS queries from the local machine, they succeed,
untruncated, 90% of the time
- when local STATUS queries do fail, they are always truncated to the
same length: 73685 bytes in my case
- when doing STATUS queries from a remote machine, they fail due to
truncation every time
- remote STATUS queries are always truncated to the same length: 24704
bytes in my case
- the failing STATUS queries take visibly longer to complete on the
client - a few seconds for a truncated result versus 1 second for an
untruncated result
- all STATUS queries return a successful 200 HTTP code
    - all STATUS queries are logged as returning in ~700ms in Solr's info log
    - during failing (truncated) responses, Solr's CPU usage spikes to
    saturation
    - behaviour seems the same whatever client I use: wget, curl, Python, ...
 
  Using Solr 1.3.0 694707, Jetty 6.1.3.
 
  At the moment, the main puzzle for me is that the local and remote
  behaviour is so different. It leads me to think that it is something to do
  with the network transmission speed. But the response really isn't that big
  (untruncated it's ~1MB), and the CPU spike seems to suggest that something
  in the process of serialising the core information is taking too long and
  causing a timeout?
 
  Any suggestions on settings to tweak, ways to get extra debug information,
  or other ways to ascertain the active core list would be much appreciated!
 
  James
 




-- 
http://twitter.com/goodgravy
512 300 4210
http://webmynd.com/
Sent from Bury, United Kingdom


Truncated XML responses from CoreAdminHandler

2009-07-18 Thread James Brady
The Solr application I'm working on has many concurrently active cores - of
the order of 1000s at a time.

The management application depends on being able to query Solr for the
current set of live cores, a requirement I've been satisfying using the
STATUS core admin handler method.

However, once the number of active cores reaches a particular threshold
(which I haven't determined exactly), the response to the STATUS method is
truncated, resulting in malformed XML.

My debugging so far has revealed:

   - when doing STATUS queries from the local machine, they succeed,
   untruncated, 90% of the time
   - when local STATUS queries do fail, they are always truncated to the
   same length: 73685 bytes in my case
   - when doing STATUS queries from a remote machine, they fail due to
   truncation every time
   - remote STATUS queries are always truncated to the same length: 24704
   bytes in my case
   - the failing STATUS queries take visibly longer to complete on the
   client - a few seconds for a truncated result versus 1 second for an
   untruncated result
   - all STATUS queries return a successful 200 HTTP code
   - all STATUS queries are logged as returning in ~700ms in Solr's info log
   - during failing (truncated) responses, Solr's CPU usage spikes to
   saturation
   - behaviour seems the same whatever client I use: wget, curl, Python, ...
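
A quick way to automate the malformed-XML check described in the list above
is to try parsing each response body and treat a parse failure as truncation.
This is an illustrative, stdlib-only sketch (the sample strings are made up,
not actual STATUS output):

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XmlTruncationCheck {

    // Returns true if the response body parses as well-formed XML,
    // false if parsing fails (e.g. because the body was cut off).
    static boolean isWellFormed(String body) {
        try {
            DocumentBuilder db =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.parse(new InputSource(new StringReader(body)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Illustrative strings only, not real CoreAdminHandler output.
        String complete = "<response><lst name=\"status\"/></response>";
        String truncated = complete.substring(0, 20); // simulates a cut-off body
        System.out.println(isWellFormed(complete));   // prints true
        System.out.println(isWellFormed(truncated));  // prints false
    }
}
```

Running this against each saved response would separate the truncated replies
from the complete ones without eyeballing the XML.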

Using Solr 1.3.0 694707, Jetty 6.1.3.

At the moment, the main puzzle for me is that the local and remote
behaviour is so different. It leads me to think that it is something to do
with the network transmission speed. But the response really isn't that big
(untruncated it's ~1MB), and the CPU spike seems to suggest that something
in the process of serialising the core information is taking too long and
causing a timeout?

Any suggestions on settings to tweak, ways to get extra debug information,
or other ways to ascertain the active core list would be much appreciated!

James


Re: Truncated XML responses from CoreAdminHandler

2009-07-18 Thread Otis Gospodnetic

James,

Not enough memory and Garbage Collection?  Connecting to Solr via JConsole 
should show it.
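
For reference, JConsole can only attach to a remote Solr JVM if remote JMX is
enabled at startup. A commonly used (illustrative) set of flags, assuming a
trusted network with no authentication, would be something like:

```
java -Dcom.sun.management.jmxremote \
     -Dcom.sun.management.jmxremote.port=9999 \
     -Dcom.sun.management.jmxremote.authenticate=false \
     -Dcom.sun.management.jmxremote.ssl=false \
     -jar start.jar
```

With those flags set, JConsole's memory and GC charts should show whether the
heap is under pressure during the failing STATUS calls.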


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: James Brady james.colin.br...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Saturday, July 18, 2009 5:02:42 PM
 Subject: Truncated XML responses from CoreAdminHandler
 