Re: Server inconsistent state Core Reload issue

2013-05-31 Thread Chris Hostetter

: If you look at my email, the container that is running SOLR got the request
: params (http access logs provided in first email), but when it goes through
: the SOLR app/code on the container (probably through request filters or
: dispatchers... I don't know exactly) it's getting lost, which is what I am
: trying to understand. I want to understand under what situations this may
: happen.

i can't think of any situation where anything in solr would do this ... i
do however note that in the stacktraces you provided from your thread dump,
there is a *lot* more forwarding and dispatching going on than i have ever
seen in a typical jetty or tomcat type setup -- apparently because of how
you have glassfish running

you've got the usual stuff prior to the SolrDispatchFilter 
(org.apache.catalina.*) but you've also got things i have never seen 
before...

   com.sun.enterprise.web.WebPipeline
   com.sun.enterprise.web.PESessionLockingStandardPipeline
   com.sun.enterprise.web.connector.grizzly.*

...i would bet money that whatever is causing problems with your request
parameters is happening somewhere in there.

It would be trivial to test: just add some logging of the full request
params in SolrDispatchFilter and see what it's getting from your servlet
container when the doFilter method is called.
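
for example -- rather than patching SolrDispatchFilter itself -- a bare-bones
logging filter along these lines (the class name and log wording are purely
illustrative, nothing that ships with solr) could be mapped ahead of
SolrDispatchFilter in the solr.war web.xml:

   import java.io.IOException;
   import java.util.Arrays;
   import java.util.Map;
   import java.util.logging.Logger;
   import javax.servlet.*;

   // illustrative filter: dump whatever parameters the container hands to
   // the filter chain before SolrDispatchFilter ever sees the request
   public class ParamLoggingFilter implements Filter {
     private static final Logger log =
         Logger.getLogger(ParamLoggingFilter.class.getName());

     public void init(FilterConfig config) {}
     public void destroy() {}

     public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
         throws IOException, ServletException {
       StringBuilder sb = new StringBuilder();
       for (Object o : req.getParameterMap().entrySet()) {
         Map.Entry<?, ?> e = (Map.Entry<?, ?>) o;
         sb.append(e.getKey()).append('=')
           .append(Arrays.toString((String[]) e.getValue())).append(' ');
       }
       log.info("params as handed to the filter chain: {" + sb.toString().trim() + "}");
       chain.doFilter(req, res);   // continue on to SolrDispatchFilter
     }
   }

whatever that logs should line up with the params={...} values solr reports
in its own log -- if the parameter map is already empty at this point, the
problem is upstream of solr.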

FWIW, some sporadic googling has led to a handful of indications that
you are not alone in encountering problems with glassfish sporadically
swallowing request params (or POST bodies) so that applications running
in glassfish don't get them...

 http://jira.icesoft.org/browse/ICE-5298
 http://www.coderanch.com/t/601685/glassfish/Body-Request-occasionally-lost


-Hoss


Server inconsistent state Core Reload issue

2013-05-01 Thread Ravi Solr
We are using Solr 3.6.2 with a single-core setup on a glassfish server;
every 4-5 hours the server gradually gets into some kind of an inconsistent
state and stops accepting any queries, giving back cached results instead.
Even a core reload fails, giving the following response. Has anybody
experienced such behavior? Can anybody help me understand why this might
happen?

http://searchserver:80/solr/admin/cores?action=RELOAD&core=core1

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">9</int>
  </lst>
  <lst name="status">
    <lst name="core1">
      <str name="name">core1</str>
      <str name="instanceDir">/data/solr/core1-home/</str>
      <str name="dataDir">/data/solr/core/core1-data/</str>
      <date name="startTime">2013-05-01T19:16:31.32Z</date>
      <long name="uptime">137850</long>
      <lst name="index">
        <int name="numDocs">21479</int>
        <int name="maxDoc">25170</int>
        <long name="version">1367184551418</long>
        <int name="segmentCount">4</int>
        <bool name="current">true</bool>
        <bool name="hasDeletions">true</bool>
        <str name="directory">org.apache.lucene.store.MMapDirectory:org.apache.lucene.store.MMapDirectory@/data/solr/core/core1-data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@71d9673</str>
        <date name="lastModified">2013-05-01T19:15:04Z</date>
      </lst>
    </lst>
  </lst>
</response>


During the inconsistent state, any queries issued to the server lose their
query parameters. We can see the proper queries in the container's http
access logs, but somehow solr doesn't get the query params at all. Also
note that the content length in the container's access logs is always 68935,
which implies it's always returning the same docs irrespective of the query.

If we restart the server, everything is back to normal and the same queries
run properly.

SOLR Log
--
[#|2013-05-01T15:20:02.031-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=20;_ThreadName=httpSSLWorkerThread-9001-1;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=17 |#]

[#|2013-05-01T15:20:02.034-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=24;_ThreadName=httpSSLWorkerThread-9001-4;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=13 |#]

[#|2013-05-01T15:20:02.055-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=23;_ThreadName=httpSSLWorkerThread-9001-3;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=13 |#]

[#|2013-05-01T15:20:02.081-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=25;_ThreadName=httpSSLWorkerThread-9001-5;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=14 |#]

[#|2013-05-01T15:20:02.106-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=19;_ThreadName=httpSSLWorkerThread-9001-0;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=14 |#]

[#|2013-05-01T15:20:02.136-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=22;_ThreadName=httpSSLWorkerThread-9001-2;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=16 |#]

[#|2013-05-01T15:20:02.161-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=20;_ThreadName=httpSSLWorkerThread-9001-1;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=15 |#]

[#|2013-05-01T15:20:02.185-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=24;_ThreadName=httpSSLWorkerThread-9001-4;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=14 |#]

[#|2013-05-01T15:20:02.209-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=23;_ThreadName=httpSSLWorkerThread-9001-3;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=14 |#]

[#|2013-05-01T15:20:02.241-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=25;_ThreadName=httpSSLWorkerThread-9001-5;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=16 |#]

[#|2013-05-01T15:20:02.266-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=19;_ThreadName=httpSSLWorkerThread-9001-0;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=15 |#]

[#|2013-05-01T15:20:02.288-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=22;_ThreadName=httpSSLWorkerThread-9001-2;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=14 |#]

[#|2013-05-01T15:20:02.291-0400|INFO|sun-appserver2.1.1|org.apache.solr.core.SolrCore|_ThreadID=20;_ThreadName=httpSSLWorkerThread-9001-1;|[core1]
webapp=/solr path=/select params={} hits=21479 status=0 QTime=15 |#]



Container Access Logs
-
xx.xxx.xx.xx  01/May/2013:15:20:02 -0500 GET

Re: Server inconsistent state Core Reload issue

2013-05-01 Thread Ravi Solr
Shawn,
  I don't believe it's the container, because we use the same container
in another setup that has 6 cores and serves almost 1.8 million
requests a day without a hitch.

If you look at my email, the container that is running SOLR got the request
params (http access logs provided in first email), but when it goes through
the SOLR app/code on the container (probably through request filters or
dispatchers... I don't know exactly) it's getting lost, which is what I am
trying to understand. I want to understand under what situations this may
happen.

Having said that, the application that uses this problematic SOLR instance
retrieves a large number of facet results for each of 26 facets on every
query, and every query is a group query. Would that cause any issues with
the SOLR caches that could lead to the problems I am facing?
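
Just to give a sense of the shape of those queries, the parameters are
roughly along these lines (the field names here are only illustrative
placeholders, not our actual schema):

 http://searchserver:80/solr/select?q=*:*&group=true&group.field=someField&facet=true&facet.field=facetField1&facet.field=facetField2&facet.limit=-1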

With regard to the port number, our paranoid security folks wanted me not
to reveal our ports, so I put it as 80 without thinking :-). I assure you
that it's not 80.

Thanks,

Ravi


On Wed, May 1, 2013 at 6:03 PM, Shawn Heisey s...@elyograg.org wrote:

 On 5/1/2013 3:14 PM, Ravi Solr wrote:

 We are using Solr 3.6.2 with a single-core setup on a glassfish server;
 every 4-5 hours the server gradually gets into some kind of an inconsistent
 state and stops accepting any queries, giving back cached results instead.
 Even a core reload fails, giving the following response. Has anybody
 experienced such behavior? Can anybody help me understand why this might
 happen?

 http://searchserver:80/solr/admin/cores?action=RELOAD&core=core1

 <response>
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">9</int>
   </lst>
   <lst name="status">


 It is dropping the parameters from the /admin/cores request too, so it
 returns status instead of acting on the RELOAD.

 This is acting like a servlet container issue more than a Solr issue. It's
 always possible that it actually is Solr.

 It's a little unusual to see Solr running on port 80.  It's not
 impossible, just not the normal setup, because exposing Solr directly to
 the outside world is a very bad idea, so it's a lot safer to have it listen
 on another port.

 Is glassfish actually listening on port 80?  If it's not, then you
 probably have something acting as a proxy in front of Solr.  If your
 platform is a UNIX variant or Linux and has a fully functional 'lsof'
 command, the following will tell you which process is bound to port 80:

 lsof -nPi | grep :80

 Can you try running Solr under the jetty that's included with the Solr
 download?  For Solr 3.6.2, this is a slightly modified Jetty 6.  You can't
 use the Jetty 8 that's included with a newer version of Solr.  If port 80
 is a requirement, that should be possible as long as it's running as root.
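
 For reference, starting the example Jetty from the 3.6.2 download would
 look roughly like this (the port and solr home path are placeholders for
 whatever you actually use):

    cd example
    java -Djetty.port=80 -Dsolr.solr.home=/path/to/core1-home -jar start.jar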

 Thanks,
 Shawn