Re: SOLRJ replace document
So I found out the issue here... It was related to what you guys said regarding the Map object in my document. The problem is that I had data being serialized from DB -> .NET -> JSON, and some of the fields in .NET were == System.DBNull.Value instead of null. This caused the JSON serializer to write out an object (i.e. a Map), so when these fields got deserialized into the SolrInputDocument it had the Map objects as you indicated.

Thanks for the help! Much appreciated!

On Sat, Oct 19, 2013 at 12:58 AM, Jack Krupansky j...@basetechnology.com wrote:

By all means please do file a support request with DataStax, either as an official support ticket or as a question on StackOverflow. But I do think the previous answer of avoiding the use of a Map object in your document is likely to be the solution.

-- Jack Krupansky

-----Original Message-----
From: Brent Ryan
Sent: Friday, October 18, 2013 10:21 PM
To: solr-user@lucene.apache.org
Subject: Re: SOLRJ replace document

So I think the issue might be related to the tech stack we're using, which is Solr within DataStax Enterprise, which doesn't support atomic updates. But I think it must have some sort of bug around this, because it doesn't appear to work correctly for this use case when using SolrJ... Anyway, I've contacted support, so let's see what they say.

On Fri, Oct 18, 2013 at 5:51 PM, Shawn Heisey s...@elyograg.org wrote:

On 10/18/2013 3:36 PM, Brent Ryan wrote:
My schema is pretty simple and has a string field called solr_id as my unique key. Once I get back to my computer I'll send some more details.

If you are trying to use a Map object as the value of a field, that is probably why it is interpreting your add request as an atomic update. If this is the case, and you're doing it because you have a multivalued field, you can use a List object rather than a Map. If this doesn't sound like what's going on, can you share your code, or a simplification of the SolrJ parts of it?

Thanks,
Shawn
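The failure mode described above can be sketched with two JSON payloads (the field names here are hypothetical, not taken from the original messages). Solr's JSON update handler treats an object-valued field as a map of atomic-update modifiers, so a DBNull serialized as an object turns what was meant to be a plain add into an attempted atomic update:

```json
[
  { "solr_id": "doc-1", "notes": "a plain value: handled as a normal add/replace" },
  { "solr_id": "doc-2", "notes": {} }
]
```

The second document's empty-object value is what the DBNull serialization produced; the fix on the producing side is to emit null (or omit the field) instead of an object.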
SOLRJ replace document
How do I replace a document in Solr using the SolrJ library? I keep getting this error back:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Atomic document updates are not supported unless <updateLog/> is configured

I don't want to do partial updates, I just want to replace it...

Thanks,
Brent
Re: SOLRJ replace document
I wish that was the case, but calling addDoc() is what's triggering that exception.

On Friday, October 18, 2013, Jack Krupansky wrote:

To replace a Solr document, simply add it again using the same technique used to insert the original document. The "set" option for atomic update is only used when you wish to selectively update only some of the fields for a document, and that does require that the update log be enabled using <updateLog/>.

-- Jack Krupansky

-----Original Message-----
From: Brent Ryan
Sent: Friday, October 18, 2013 4:59 PM
To: solr-user@lucene.apache.org
Subject: SOLRJ replace document

How do I replace a document in Solr using the SolrJ library? I keep getting this error back:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Atomic document updates are not supported unless <updateLog/> is configured

I don't want to do partial updates, I just want to replace it...

Thanks,
Brent
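The distinction Jack describes looks like this in Solr's XML update format (field names here are hypothetical). A full replace is just a plain add under the same uniqueKey; only the update="set" attribute triggers atomic-update handling, which requires <updateLog/>:

```xml
<!-- full replace: re-send the whole document with the same uniqueKey -->
<add>
  <doc>
    <field name="solr_id">doc-1</field>
    <field name="title">replacement title</field>
  </doc>
</add>

<!-- atomic update: only the named field changes; needs updateLog enabled -->
<add>
  <doc>
    <field name="solr_id">doc-1</field>
    <field name="title" update="set">new title only</field>
  </doc>
</add>
```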
Re: SOLRJ replace document
My schema is pretty simple and has a string field called solr_id as my unique key. Once I get back to my computer I'll send some more details.

Brent

On Friday, October 18, 2013, Shawn Heisey wrote:

On 10/18/2013 2:59 PM, Brent Ryan wrote:
How do I replace a document in Solr using the SolrJ library? I keep getting this error back:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Atomic document updates are not supported unless <updateLog/> is configured

I don't want to do partial updates, I just want to replace it...

Replacing a document is done by simply adding the document, in the same way as if you were adding a new one. If you have properly configured Solr, the old one will be deleted before the new one is inserted. Properly configuring Solr means that you have a uniqueKey field in your schema, and that it is a simple type like string, int, long, etc., and is not multivalued. A TextField type that is tokenized cannot be used as the uniqueKey field.

Thanks,
Shawn
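Shawn's "properly configured" requirement corresponds to a schema.xml along these lines (a minimal sketch; the field and type names assume the stock string type and the poster's solr_id key):

```xml
<field name="solr_id" type="string" indexed="true" stored="true" multiValued="false"/>
<uniqueKey>solr_id</uniqueKey>
```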
Re: SOLRJ replace document
So I think the issue might be related to the tech stack we're using, which is Solr within DataStax Enterprise, which doesn't support atomic updates. But I think it must have some sort of bug around this, because it doesn't appear to work correctly for this use case when using SolrJ... Anyway, I've contacted support, so let's see what they say.

On Fri, Oct 18, 2013 at 5:51 PM, Shawn Heisey s...@elyograg.org wrote:

On 10/18/2013 3:36 PM, Brent Ryan wrote:
My schema is pretty simple and has a string field called solr_id as my unique key. Once I get back to my computer I'll send some more details.

If you are trying to use a Map object as the value of a field, that is probably why it is interpreting your add request as an atomic update. If this is the case, and you're doing it because you have a multivalued field, you can use a List object rather than a Map. If this doesn't sound like what's going on, can you share your code, or a simplification of the SolrJ parts of it?

Thanks,
Shawn
Re: SOLR grouped query sorting on numFound
Yeah, that's the problem... you can't sort by numFound, and it's not feasible to do the sort on the client because the grouped result set is too large.

Brent

On Wed, Sep 25, 2013 at 6:09 AM, Erick Erickson erickerick...@gmail.com wrote:

Hmmm, just specifying sort= is _almost_ what you want, except it sorts by the value of fields in the doc, not numFound. This shouldn't be hard to do on the client, though, but you'd have to return all the groups...

FWIW,
Erick

On Tue, Sep 24, 2013 at 1:11 PM, Brent Ryan brent.r...@gmail.com wrote:

We ran into one snag during development with Solr, and I thought I'd run it by everyone to see if they had any slick ways to solve this issue. Basically, we're performing a Solr query with grouping and want to be able to sort by the number of documents found within each group. Our query response from Solr looks something like this:

{
  "responseHeader": {
    "status": 0,
    "QTime": 17,
    "params": {
      "indent": "true",
      "q": "*:*",
      "group.limit": "0",
      "group.field": "rfp_stub",
      "group": "true",
      "wt": "json",
      "rows": "1000"
    }
  },
  "grouped": {
    "rfp_stub": {
      "matches": 18470,
      "groups": [
        { "groupValue": "java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
          "doclist": { "numFound": 3, "start": 0, "docs": [] } },
        { "groupValue": "java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
          "doclist": { "numFound": 5, "start": 0, "docs": [] } },
        { "groupValue": "java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
          "doclist": { "numFound": 6, "start": 0, "docs": [] } },
        ...

The numFound shows the number of documents within that group. Is there any way to perform a sort on numFound in Solr? I don't believe this is supported, but wondered if anyone there has come across this, and if there were any suggested workarounds, given that the dataset is really too large to hold in memory on our app servers?
SOLR grouped query sorting on numFound
We ran into one snag during development with Solr, and I thought I'd run it by everyone to see if they had any slick ways to solve this issue. Basically, we're performing a Solr query with grouping and want to be able to sort by the number of documents found within each group. Our query response from Solr looks something like this:

{
  "responseHeader": {
    "status": 0,
    "QTime": 17,
    "params": {
      "indent": "true",
      "q": "*:*",
      "group.limit": "0",
      "group.field": "rfp_stub",
      "group": "true",
      "wt": "json",
      "rows": "1000"
    }
  },
  "grouped": {
    "rfp_stub": {
      "matches": 18470,
      "groups": [
        { "groupValue": "java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
          "doclist": { "numFound": 3, "start": 0, "docs": [] } },
        { "groupValue": "java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
          "doclist": { "numFound": 5, "start": 0, "docs": [] } },
        { "groupValue": "java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
          "doclist": { "numFound": 6, "start": 0, "docs": [] } },
        ...

The numFound shows the number of documents within that group. Is there any way to perform a sort on numFound in Solr? I don't believe this is supported, but wondered if anyone there has come across this, and if there were any suggested workarounds, given that the dataset is really too large to hold in memory on our app servers?
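For grouped result sets small enough to fit on the client (which the poster says is not the case here), sorting groups by numFound client-side is straightforward. This sketch uses hypothetical group values rather than the real UUIDs from the response above:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class GroupSort {
    // One entry per group in the "grouped" response: the groupValue plus its doclist numFound.
    record Group(String groupValue, long numFound) {}

    // Returns the groups ordered by descending numFound (largest group first).
    static List<Group> sortByNumFound(List<Group> groups) {
        List<Group> sorted = new ArrayList<>(groups);
        sorted.sort(Comparator.comparingLong(Group::numFound).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Group> groups = List.of(
                new Group("rfp-a", 3),   // hypothetical group values
                new Group("rfp-b", 5),
                new Group("rfp-c", 6));
        for (Group g : sortByNumFound(groups)) {
            System.out.println(g.groupValue() + " -> " + g.numFound());
        }
    }
}
```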
apache bench solr keep alive not working?
Does anyone know why Solr is not respecting keep-alive requests when using apache bench?

ab -v 4 -H "Connection: Keep-Alive" -H "Keep-Alive: 3000" -k -c 10 -n 100 "http://host1:8983/solr/test.solr/select?q=*%3A*&wt=xml&indent=true"

If you look at the debug output, the response contains an HTTP header of "Connection: Close", and ab reports:

Keep-Alive requests: 0

Any ideas? I'm seeing the same behavior when using HttpClient.

Thanks,
Brent
Re: apache bench solr keep alive not working?
Thanks guys. I saw this other post with curl and verified it working. I've also used apache bench for a bunch of stuff, and keep-alive works fine with things like Java netty.io servers... strange that Tomcat isn't respecting the HTTP protocol or headers. There must be a bug in this version of Tomcat being used.

On Tue, Sep 10, 2013 at 6:23 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: Does anyone know why solr is not respecting keep-alive requests when using
: apache bench?

I've seen this before from people trying to test with ab, but never fully understood it. There is some combination of using ab (which uses HTTP/1.0 and the non-RFC-compliant HTTP/1.0 version of optional Keep-Alive) and Jetty (which also attempts to support the non-RFC-compliant HTTP/1.0 version of optional Keep-Alive) and chunked responses, which come from Jetty when you request dynamic resources (like Solr query URLs) as opposed to static resources with a known file size. I suspect that Jetty's best attempt at doing the hokey HTTP/1.0 Keep-Alive thing doesn't work with chunked responses -- or maybe ab doesn't actually keep the connection alive unless it gets a Content-Length. Either way, see this previous thread for how you can demonstrate that keep-alive works properly using curl:

https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201209.mbox/%3c5048e856.9080...@ea.com%3E

-Hoss
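As a sanity check that persistent connections themselves work with a real HTTP/1.1 client (independent of ab's HTTP/1.0 behavior discussed above), here is a self-contained sketch using the JDK's built-in HttpServer and a raw socket. It issues two requests over one connection and reports failure if the server closes it in between. The /select path is made up for illustration; nothing here talks to an actual Solr instance:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class KeepAliveDemo {

    // Reads one HTTP response (status line + headers + Content-Length body) from the stream.
    static String readResponse(InputStream in) throws Exception {
        StringBuilder head = new StringBuilder();
        int b, match = 0;
        while (match < 4 && (b = in.read()) != -1) {   // scan until the \r\n\r\n header terminator
            head.append((char) b);
            match = (b == "\r\n\r\n".charAt(match)) ? match + 1 : (b == '\r' ? 1 : 0);
        }
        int len = 0;
        for (String line : head.toString().split("\r\n")) {
            if (line.toLowerCase().startsWith("content-length:")) {
                len = Integer.parseInt(line.substring("content-length:".length()).trim());
            }
        }
        byte[] body = new byte[len];
        for (int off = 0, n; off < len && (n = in.read(body, off, len - off)) != -1; off += n) { }
        return head + new String(body, StandardCharsets.UTF_8);
    }

    // Runs two GETs over one socket; returns true only if both succeed on the same connection.
    public static boolean twoRequestsOneConnection() throws Exception {
        byte[] payload = "ok".getBytes(StandardCharsets.UTF_8);
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/select", ex -> {           // hypothetical path, not a real Solr endpoint
            ex.sendResponseHeaders(200, payload.length);  // fixed Content-Length, no chunking
            try (OutputStream os = ex.getResponseBody()) { os.write(payload); }
        });
        server.start();
        try (Socket s = new Socket("127.0.0.1", server.getAddress().getPort())) {
            OutputStream out = s.getOutputStream();
            InputStream in = s.getInputStream();
            for (int i = 0; i < 2; i++) {
                out.write("GET /select HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n"
                        .getBytes(StandardCharsets.UTF_8));
                out.flush();
                if (!readResponse(in).startsWith("HTTP/1.1 200")) return false;
            }
            return true;  // second request on the same socket worked: keep-alive honored
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("keep-alive reuse: " + twoRequestsOneConnection());
    }
}
```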
CRLF Invalid Exception ?
Has anyone ever hit this when adding documents to Solr? What does it mean?

ERROR [http-8983-6] 2013-09-06 10:09:32,700 SolrException.java (line 108) org.apache.solr.common.SolrException: Invalid CRLF
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:175)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:663)
    at com.datastax.bdp.cassandra.index.solr.CassandraDispatchFilter.execute(CassandraDispatchFilter.java:176)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
    at com.datastax.bdp.cassandra.index.solr.CassandraDispatchFilter.doFilter(CassandraDispatchFilter.java:139)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at com.datastax.bdp.cassandra.audit.SolrHttpAuditLogFilter.doFilter(SolrHttpAuditLogFilter.java:194)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at com.datastax.bdp.cassandra.index.solr.auth.CassandraAuthorizationFilter.doFilter(CassandraAuthorizationFilter.java:95)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at com.datastax.bdp.cassandra.index.solr.auth.DseAuthenticationFilter.doFilter(DseAuthenticationFilter.java:102)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:722)
Caused by: com.ctc.wstx.exc.WstxIOException: Invalid CRLF
    at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
    at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:387)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:245)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    ... 30 more
Caused by: java.io.IOException: Invalid CRLF
    at org.apache.coyote.http11.filters.ChunkedInputFilter.parseCRLF(ChunkedInputFilter.java:352)
    at org.apache.coyote.http11.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:151)
    at org.apache.coyote.http11.InternalInputBuffer.doRead(InternalInputBuffer.java:710)
    at org.apache.coyote.Request.doRead(Request.java:428)
    at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:304)
    at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:403)
    at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:327)
    at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:162)
    at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
    at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
    at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
    at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
    at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
    at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
    at com.ctc.wstx.sr.StreamScanner.parseLocalName2(StreamScanner.java:1796)
    at com.ctc.wstx.sr.StreamScanner.parseLocalName(StreamScanner.java:1756)
    at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2914)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
    ... 33 more
Re: CRLF Invalid Exception ?
Thanks. I realized there's an error in the ChunkedInputFilter... I'm not sure if this means there's a bug in the client library I'm using (SolrJ 4.3) or a bug in the server (Solr 4.3)? Or is there something in my data that's causing the issue?

On Fri, Sep 6, 2013 at 1:02 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: Has anyone ever hit this when adding documents to SOLR? What does it mean?

Always check for the root cause...

: Caused by: java.io.IOException: Invalid CRLF
: at org.apache.coyote.http11.filters.ChunkedInputFilter.parseCRLF(ChunkedInputFilter.java:352)

...so while Solr is trying to read XML off the InputStream from the client, an error is encountered by the ChunkedInputFilter. I suspect the client library you are using for the HTTP connection is claiming it's using chunking but isn't, or is doing something wrong with the chunking, or there is a bug in the ChunkedInputFilter.

-Hoss
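For context, this is the chunked framing the ChunkedInputFilter is parsing (per the HTTP/1.1 spec's chunked transfer coding); the sketch below is generic, not a capture from this failure. Tomcat's parseCRLF() raises "Invalid CRLF" when the CRLF that must follow a chunk's data is missing or corrupted:

```
<chunk-size in hex>CRLF
<chunk-data, exactly that many bytes>CRLF   <- the CRLF parseCRLF() checks
...repeated for each chunk...
0CRLF
CRLF                                        <- terminating zero-length chunk
```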
Re: CRLF Invalid Exception ?
For what it's worth... I just updated to SolrJ 4.4 (even though my server is Solr 4.3) and it seems to have fixed the issue. Thanks for the help!

On Fri, Sep 6, 2013 at 1:41 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: I'm not sure if this means there's a bug in the client library I'm using
: (solrj 4.3) or is a bug in the server SOLR 4.3? Or is there something in
: my data that's causing the issue?

It's unlikely that an error in the data you pass to SolrJ methods would be causing this problem -- I'm pretty sure it's not even a problem with the raw XML data being streamed; it appears to be a problem with how that data is getting chunked across the wire. My best guess is that the most likely causes are either:

* a bug in the HttpClient version you are using on the client side
* a bug in the ChunkedInputFilter you are using on the server side
* a misconfiguration of the HttpClient object you are using with SolrJ (i.e. claiming it's sending chunked when it's not?)

-Hoss