Yup, which is certainly not a possibility on the sending side. I ran it a few more times today and it failed on different records each time.
I also cut the number of client threads down to 10 - didn't help. Next I will try just 2... karl --- original message --- From: "ext Mark Miller" <[email protected]> Subject: Re: Solr spewage and dropped documents, while indexing Date: June 7, 2010 Time: 5:59:11 PM Could be - you'll get that error if no content is actually found in the http post. On 6/7/10 2:45 PM, Thijs Vonk wrote: > Hi Karl, > > I'm just a novice Solr user, but I've seen this error some time ago, it > usually happend when I was pushing my machine to hard (the machine that > was gathering the documents to be send to solr). I was using to many > threads on the machine, and now that I've reduced those to a more > reasonable amount I haven't seen it. It was as if the machine just > didn't have enough CPU power to actually send everything over the wire. > > I read that you had about 50 threads running. Depending an what they are > doing it seems like a lot, but that all depends totally on the hardware > you're running on I guess. > > Thijs > > On 6/7/10 5:30 PM, [email protected] wrote: >> Hi Simon, >> >> The same data was successfully indexed with one thread over the >> weekend. So it is unlikely to be a data issue. >> >> It turns out there was one of the content-missing exceptions far back >> in the screen buffer, but only one: >> >> Jun 7, 2010 9:57:48 AM org.apache.solr.common.SolrException log >> SEVERE: org.apache.solr.common.SolrException: missing content stream >> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(Co >> ntentStreamHandlerBase.java:49) >> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandl >> erBase.java:131) >> at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handle >> Request(RequestHandlers.java:233) >> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1321) >> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter >> .java:341) >> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilte >> r.java:244) >> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(Servlet >> Handler.java:1089) >> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:3 >> 65) >> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.jav >> a:216) >> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:1 >> 81) >> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:7 >> 12) >> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) >> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHand >> lerCollection.java:211) >> at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection. >> java:114) >> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:1 >> 39) >> at org.mortbay.jetty.Server.handle(Server.java:285) >> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:50 >> 2) >> at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnectio >> n.java:835) >> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641) >> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208) >> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) >> at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector. >> java:226) >> at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool >> .java:442) >> >> So it seems like Solr is behaving consistently with it being convinced >> of missing content. >> >> I am actually wondering if the problem is occurring because so much is >> happening at the same time we're seeing a socket timeout for some of >> the requests. If the socket times out and closes, then I could well >> imagine that Solr would toss this error. Thoughts? >> >> Karl >> >> >> -----Original Message----- >> From: ext Simon Willnauer [mailto:[email protected]] >> Sent: Monday, June 07, 2010 11:20 AM >> To: [email protected] >> Subject: Re: Solr spewage and dropped documents, while indexing >> >> Karl, the HTTP error lines are produced by your code right?! Can you >> provide what has been returned by Solr? >> If that would be related to any server side problem described above >> like no sockets or so you would not see a 400! I could also imagine >> that the documents you are sending are empty - is that something which >> could have happened? >> >> simon >> >> On Mon, Jun 7, 2010 at 5:05 PM,<[email protected]> wrote: >>> Perhaps - although missing_content_stream seems to imply that it had >>> at least partly read 4 requests which later failed. Also, wouldn't >>> there be something in the output log which would give us a clue as to >>> what happened? >>> >>> Is there any post-hiccup spelunking I can reasonably do? Or should I >>> try to reproduce the problem with more diagnostics on? >>> >>> Karl >>> >>> >>> -----Original Message----- >>> From: ext Bernd Fondermann [mailto:[email protected]] >>> Sent: Monday, June 07, 2010 10:54 AM >>> To: [email protected] >>> Subject: Re: Solr spewage and dropped documents, while indexing >>> >>> Looks like a server-side problem to me. >>> Maybe the server ran out of sockets or other resources and just replied >>> with a 400 error? >>> >>> Bernd >>> >>> [email protected] wrote: >>>> Hi folks, >>>> >>>> This morning I was experimenting with using multiple threads while >>>> indexing some 20,000,000 records worth of content. In fact, my test >>>> spun up some 50 threads, and happily chugged away for a couple of >>>> hours before I saw the following output from my test code: >>>> >>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while >>>> trying to index record 6469124 >>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while >>>> trying to index record 6469551 >>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while >>>> trying to index record 6470592 >>>> Http protocol error: HTTP/1.1 400 missing_content_stream, while >>>> trying to index record 6472454 >>>> java.net.SocketException: Connection reset >>>> at java.net.SocketInputStream.read(SocketInputStream.java:168) >>>> at HttpPoster.getResponse(HttpPoster.java:280) >>>> at HttpPoster.indexPost(HttpPoster.java:191) >>>> at ParseAndLoad$PostThread.run(ParseAndLoad.java:638) >>>> <<<<<< >>>> >>>> Looking at the solr-side output, I see nothing interesting at all: >>>> >>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute >>>> INFO: [] webapp=/solr path=/update/extract >>>> params={literal.nokia_longitude=9.78518981933594&literal.nokia_phone=%2B497971910474&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Münster&literal.nokia_placerating=0&literal.id=6472724&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=1&literal.nokia_ppid=276u0wyw-c8cb7f4d6cd84a639a4e7d3570bf8814&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9985514322917&literal.nokia_postalcode=74405&literal.nokia_street=WeinhaldenstraÃe&literal.nokia_title=Dorfgemeinschaft+Münster+e.V.&literal.nokia_category=261} >>>> status=0 QTime=1 >>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute >>>> INFO: [] webapp=/solr path=/update/extract >>>> params={literal.nokia_longitude=9.76717020670573&literal.nokia_phone=%2B497971950725&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472737&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=13&literal.nokia_ppid=276u0wyw-d3bed6449fcb41b0adc50ae08e041f8d&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9974405924479&literal.nokia_fax=%2B497971950712&literal.nokia_postalcode=74405&literal.nokia_street=KochstraÃe&literal.nokia_title=BayWa+AG+Bau-+%26+Gartenmarkt&literal.nokia_category=194} >>>> status=0 QTime=0 >>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute >>>> INFO: [] webapp=/solr path=/update/extract >>>> params={literal.nokia_longitude=9.77591044108073&literal.nokia_phone=%2B49797124009&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Unterrot&literal.nokia_placerating=0&literal.id=6472739&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=28&literal.nokia_ppid=276u0wyw-d534d7a9235a4edf878d5e32a34bad8b&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9791788736979&literal.nokia_fax=%2B49797123431&literal.nokia_postalcode=74405&literal.nokia_street=HauptstraÃe&literal.nokia_title=Gastel+R.&literal.nokia_category=5} >>>> status=0 QTime=1 >>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute >>>> INFO: [] webapp=/solr path=/update/extract >>>> params={literal.nokia_longitude=9.76935&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=5&literal.id=6472698&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=15&literal.nokia_ppid=276u0wyw-9544100e68d74162aff54783b9376134&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9981&literal.nokia_postalcode=74405&literal.nokia_street=KanzleistraÃe&literal.nokia_tag=Steuerberater&literal.nokia_tag=Business+%26+Service&literal.nokia_title=Consultis+GmbH&literal.nokia_category=215} >>>> status=0 QTime=92 >>>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute >>>> INFO: [] webapp=/solr path=/update/extract >>>> params={literal.nokia_longitude=9.77173970540364&literal.nokia_phone=%2B4979713238&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472699&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=37&literal.nokia_ppid=276u0wyw-9600016fd0d248c9b442111838350f64&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9987182617188&literal.nokia_fax=%2B497971911639&literal.nokia_postalcode=74405&literal.nokia_street=KarlstraÃe&literal.nokia_title=Videothek,+5th+avenue+Peltekis+Apostolos&literal.nokia_category=5} >>>> status=0 QTime=93 >>>> <<<<<< >>>> >>>> It is unlikely (but, of course, not out of the question) that this >>>> hiccup is due to some reentrancy problem in my test code. It is much >>>> more likely to be some kind of a Solr multi-threaded race condition >>>> - especially since it looks like a number of requests all failed at >>>> precisely the same time. This is a Solr 1.5 build from mid-late >>>> March, FWIW. Does anyone know of an extractingUpdateRequestHandler >>>> re-entrancy bug of this kind? >>>> >>>> Thanks, >>>> Karl >>>> >>>> >>>> >>>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [email protected] >>> For additional commands, e-mail: [email protected] >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [email protected] >>> For additional commands, e-mail: [email protected] >>> >>> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [email protected] >> For additional commands, e-mail: [email protected] >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] > -- - Mark http://www.lucidimagination.com --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
