Hi Karl,

I'm just a novice Solr user, but I've seen this error some time ago, it usually happend when I was pushing my machine to hard (the machine that was gathering the documents to be send to solr). I was using to many threads on the machine, and now that I've reduced those to a more reasonable amount I haven't seen it. It was as if the machine just didn't have enough CPU power to actually send everything over the wire.

I read that you had about 50 threads running. Depending an what they are doing it seems like a lot, but that all depends totally on the hardware you're running on I guess.

Thijs

On 6/7/10 5:30 PM, [email protected] wrote:
Hi Simon,

The same data was successfully indexed with one thread over the weekend.  So it 
is unlikely to be a data issue.

It turns out there was one of the content-missing exceptions far back in the 
screen buffer, but only one:

Jun 7, 2010 9:57:48 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: missing content stream
         at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(Co
ntentStreamHandlerBase.java:49)
         at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandl
erBase.java:131)
         at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handle
Request(RequestHandlers.java:233)
         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321)
         at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter
.java:341)
         at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilte
r.java:244)
         at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(Servlet
Handler.java:1089)
         at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:3
65)
         at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.jav
a:216)
         at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:1
81)
         at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:7
12)
         at 
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
         at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHand
lerCollection.java:211)
         at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.
java:114)
         at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:1
39)
         at org.mortbay.jetty.Server.handle(Server.java:285)
         at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:50
2)
         at 
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnectio
n.java:835)
         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
         at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.
java:226)
         at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool
.java:442)

So it seems like Solr is behaving consistently with it being convinced of 
missing content.

I am actually wondering if the problem is occurring because so much is 
happening at the same time we're seeing a socket timeout for some of the 
requests.  If the socket times out and closes, then I could well imagine that 
Solr would toss this error.  Thoughts?

Karl


-----Original Message-----
From: ext Simon Willnauer [mailto:[email protected]]
Sent: Monday, June 07, 2010 11:20 AM
To: [email protected]
Subject: Re: Solr spewage and dropped documents, while indexing

Karl, the HTTP error lines are produced by your code right?! Can you
provide what has been returned by Solr?
If that would be related to any server side problem described above
like no sockets or so you would not see a 400! I could also imagine
that the documents you are sending are empty - is that something which
could have happened?

simon

On Mon, Jun 7, 2010 at 5:05 PM,<[email protected]>  wrote:
Perhaps - although missing_content_stream seems to imply that it had at least 
partly read 4 requests which later failed.  Also, wouldn't there be something 
in the output log which would give us a clue as to what happened?

Is there any post-hiccup spelunking I can reasonably do?  Or should I try to 
reproduce the problem with more diagnostics on?

Karl


-----Original Message-----
From: ext Bernd Fondermann [mailto:[email protected]]
Sent: Monday, June 07, 2010 10:54 AM
To: [email protected]
Subject: Re: Solr spewage and dropped documents, while indexing

Looks like a server-side problem to me.
Maybe the server ran out of sockets or other resources and just replied
with a 400 error?

  Bernd

[email protected] wrote:
Hi folks,

This morning I was experimenting with using multiple threads while indexing 
some 20,000,000 records worth of content.  In fact, my test spun up some 50 
threads, and happily chugged away for a couple of hours before I saw the 
following output from my test code:

Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index 
record 6469124
Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index 
record 6469551
Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index 
record 6470592
Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to index 
record 6472454
java.net.SocketException: Connection reset
         at java.net.SocketInputStream.read(SocketInputStream.java:168)
         at HttpPoster.getResponse(HttpPoster.java:280)
         at HttpPoster.indexPost(HttpPoster.java:191)
         at ParseAndLoad$PostThread.run(ParseAndLoad.java:638)
<<<<<<

Looking at the solr-side output, I see nothing interesting at all:

Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract 
params={literal.nokia_longitude=9.78518981933594&literal.nokia_phone=%2B497971910474&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Münster&literal.nokia_placerating=0&literal.id=6472724&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=1&literal.nokia_ppid=276u0wyw-c8cb7f4d6cd84a639a4e7d3570bf8814&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9985514322917&literal.nokia_postalcode=74405&literal.nokia_street=WeinhaldenstraÃe&literal.nokia_title=Dorfgemeinschaft+Münster+e.V.&literal.nokia_category=261}
 status=0 QTime=1
Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract 
params={literal.nokia_longitude=9.76717020670573&literal.nokia_phone=%2B497971950725&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472737&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=13&literal.nokia_ppid=276u0wyw-d3bed6449fcb41b0adc50ae08e041f8d&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9974405924479&literal.nokia_fax=%2B497971950712&literal.nokia_postalcode=74405&literal.nokia_street=KochstraÃe&literal.nokia_title=BayWa+AG+Bau-+%26+Gartenmarkt&literal.nokia_category=194}
 status=0 QTime=0
Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract 
params={literal.nokia_longitude=9.77591044108073&literal.nokia_phone=%2B49797124009&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Unterrot&literal.nokia_placerating=0&literal.id=6472739&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=28&literal.nokia_ppid=276u0wyw-d534d7a9235a4edf878d5e32a34bad8b&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9791788736979&literal.nokia_fax=%2B49797123431&literal.nokia_postalcode=74405&literal.nokia_street=HauptstraÃe&literal.nokia_title=Gastel+R.&literal.nokia_category=5}
 status=0 QTime=1
Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract 
params={literal.nokia_longitude=9.76935&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=5&literal.id=6472698&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=15&literal.nokia_ppid=276u0wyw-9544100e68d74162aff54783b9376134&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9981&literal.nokia_postalcode=74405&literal.nokia_street=KanzleistraÃe&literal.nokia_tag=Steuerberater&literal.nokia_tag=Business+%26+Service&literal.nokia_title=Consultis+GmbH&literal.nokia_category=215}
 status=0 QTime=92
Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/update/extract 
params={literal.nokia_longitude=9.77173970540364&literal.nokia_phone=%2B4979713238&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472699&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=37&literal.nokia_ppid=276u0wyw-9600016fd0d248c9b442111838350f64&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9987182617188&literal.nokia_fax=%2B497971911639&literal.nokia_postalcode=74405&literal.nokia_street=KarlstraÃe&literal.nokia_title=Videothek,+5th+avenue+Peltekis+Apostolos&literal.nokia_category=5}
 status=0 QTime=93
<<<<<<

It is unlikely (but, of course, not out of the question) that this hiccup is 
due to some reentrancy problem in my test code.  It is much more likely to be 
some kind of a Solr multi-threaded race condition - especially since it looks 
like a number of requests all failed at precisely the same time.  This is a 
Solr 1.5 build from mid-late March, FWIW.  Does anyone know of an 
extractingUpdateRequestHandler re-entrancy bug of this kind?

Thanks,
Karl





---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to