Re: SOLRJ replace document

2013-10-19 Thread Brent Ryan
So I found out the issue here...  It was related to what you guys said
regarding the Map object in my document.  The problem is that I had data
being serialized from DB -> .NET -> JSON, and some of the fields in .NET
were System.DBNull.Value instead of null.  This caused the JSON serializer
to write out an object (i.e., a Map), so when these fields got deserialized
into the SolrInputDocument it had the Map objects as you indicated.
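
For anyone hitting the same thing later, a minimal sketch of the kind of
guard that would have caught this before the add (the helper and field
handling here are hypothetical, not from our real code):

    import java.util.Map;
    import org.apache.solr.common.SolrInputDocument;

    public class DocSanitizer {
        // Copy only clean values into the document. A Map value is read by
        // Solr as an atomic-update instruction, so treat anything that
        // serialized as an object (e.g. DBNull) the way it should have been
        // treated all along: as null, i.e. skip the field.
        public static void addFieldSafely(SolrInputDocument doc,
                                          String name, Object value) {
            if (value == null || value instanceof Map) {
                return;
            }
            doc.addField(name, value);
        }
    }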

Thanks for the help! Much appreciated!


On Sat, Oct 19, 2013 at 12:58 AM, Jack Krupansky j...@basetechnology.com wrote:

 By all means, please do file a support request with DataStax, either as an
 official support ticket or as a question on StackOverflow.

 But I do think the previous answer of avoiding the use of a Map object in
 your document is likely to be the solution.


 -- Jack Krupansky

 -Original Message- From: Brent Ryan
 Sent: Friday, October 18, 2013 10:21 PM
 To: solr-user@lucene.apache.org
 Subject: Re: SOLRJ replace document


 So I think the issue might be related to the tech stack we're using, which
 is Solr within DataStax Enterprise, which doesn't support atomic updates.
 But I think it must have some sort of bug around this, because it doesn't
 appear to work correctly for this use case when using SolrJ...  Anyway,
 I've contacted support, so let's see what they say.


 On Fri, Oct 18, 2013 at 5:51 PM, Shawn Heisey s...@elyograg.org wrote:

  On 10/18/2013 3:36 PM, Brent Ryan wrote:

  My schema is pretty simple and has a string field called solr_id as my
 unique key.  Once I get back to my computer I'll send some more details.


 If you are trying to use a Map object as the value of a field, that is
 probably why it is interpreting your add request as an atomic update.  If
 this is the case, and you're doing it because you have a multivalued field,
 you can use a List object rather than a Map.

 If this doesn't sound like what's going on, can you share your code, or a
 simplification of the SolrJ parts of it?

 Thanks,
 Shawn
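
A minimal SolrJ sketch of the distinction Shawn describes (the field names
are hypothetical): a List is just a multivalued field, while a Map is read
as an atomic-update command.

    import java.util.Arrays;
    import java.util.Collections;
    import org.apache.solr.common.SolrInputDocument;

    public class MapVsList {
        public static void main(String[] args) {
            // Multivalued field: use a List. This is a plain add/replace.
            SolrInputDocument plainAdd = new SolrInputDocument();
            plainAdd.addField("solr_id", "doc-1");
            plainAdd.addField("tags", Arrays.asList("a", "b"));

            // A Map value is interpreted as an atomic update ({"set": ...}),
            // which fails unless <updateLog/> is configured.
            SolrInputDocument atomicUpdate = new SolrInputDocument();
            atomicUpdate.addField("solr_id", "doc-1");
            atomicUpdate.addField("tags", Collections.singletonMap("set", "a"));
        }
    }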






SOLRJ replace document

2013-10-18 Thread Brent Ryan
How do I replace a document in Solr using the SolrJ library?  I keep
getting this error back:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
Atomic document updates are not supported unless <updateLog/> is configured

I don't want to do partial updates, I just want to replace it...


Thanks,
Brent


Re: SOLRJ replace document

2013-10-18 Thread Brent Ryan
I wish that were the case, but calling addDoc() is what's triggering that
exception.

On Friday, October 18, 2013, Jack Krupansky wrote:

 To replace a Solr document, simply add it again using the same
 technique used to insert the original document. The "set" option for atomic
 update is only used when you wish to selectively update only some of the
 fields for a document, and that does require that the update log be enabled
 using <updateLog/>.
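
 In SolrJ that looks something like the sketch below (the core URL and
 field names are assumed, not from this thread):

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class ReplaceDoc {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server =
                new HttpSolrServer("http://localhost:8983/solr/collection1");

            // Re-adding a document with the same uniqueKey value replaces
            // the old one; no atomic-update syntax, no <updateLog/> needed.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("solr_id", "doc-1");
            doc.addField("title", "replacement title");
            server.add(doc);
            server.commit();
        }
    }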

 -- Jack Krupansky

 -Original Message- From: Brent Ryan
 Sent: Friday, October 18, 2013 4:59 PM
 To: solr-user@lucene.apache.org
 Subject: SOLRJ replace document

 How do I replace a document in Solr using the SolrJ library?  I keep
 getting this error back:

 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 Atomic document updates are not supported unless <updateLog/> is configured

 I don't want to do partial updates, I just want to replace it...


 Thanks,
 Brent



Re: SOLRJ replace document

2013-10-18 Thread Brent Ryan
My schema is pretty simple and has a string field called solr_id as my
unique key.  Once I get back to my computer I'll send some more details.

Brent

On Friday, October 18, 2013, Shawn Heisey wrote:

 On 10/18/2013 2:59 PM, Brent Ryan wrote:

 How do I replace a document in Solr using the SolrJ library?  I keep
 getting this error back:

 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 Atomic document updates are not supported unless <updateLog/> is
 configured

 I don't want to do partial updates, I just want to replace it...


 Replacing a document is done by simply adding the document, in the same
 way as if you were adding a new one.  If you have properly configured Solr,
 the old one will be deleted before the new one is inserted.  Properly
 configuring Solr means that you have a uniqueKey field in your schema, and
 that it is a simple type like string, int, long, etc., and is not
 multivalued.  A TextField type that is tokenized cannot be used as the
 uniqueKey field.

 Thanks,
 Shawn




Re: SOLRJ replace document

2013-10-18 Thread Brent Ryan
So I think the issue might be related to the tech stack we're using, which
is Solr within DataStax Enterprise, which doesn't support atomic updates.
But I think it must have some sort of bug around this, because it doesn't
appear to work correctly for this use case when using SolrJ...  Anyway,
I've contacted support, so let's see what they say.


On Fri, Oct 18, 2013 at 5:51 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/18/2013 3:36 PM, Brent Ryan wrote:

 My schema is pretty simple and has a string field called solr_id as my
 unique key.  Once I get back to my computer I'll send some more details.


 If you are trying to use a Map object as the value of a field, that is
 probably why it is interpreting your add request as an atomic update.  If
 this is the case, and you're doing it because you have a multivalued field,
 you can use a List object rather than a Map.

 If this doesn't sound like what's going on, can you share your code, or a
 simplification of the SolrJ parts of it?

 Thanks,
 Shawn




Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Brent Ryan
Yeah, that's the problem... you can't sort by numFound, and it's not
feasible to do the sort on the client because the grouped result set is too
large.

Brent


On Wed, Sep 25, 2013 at 6:09 AM, Erick Erickson erickerick...@gmail.com wrote:

 Hmmm, just specifying sort= is _almost_ what you want,
 except it sorts by the value of fields in the doc, not numFound.

 This shouldn't be hard to do on the client, though you'd
 have to return all the groups...

 FWIW,
 Erick

 On Tue, Sep 24, 2013 at 1:11 PM, Brent Ryan brent.r...@gmail.com wrote:
  We ran into one snag during development with SOLR and I thought I'd run it
  by anyone to see if they had any slick ways to solve this issue.

  Basically, we're performing a SOLR query with grouping and want to be able
  to sort by the number of documents found within each group.

  Our query response from SOLR looks something like this:

  {
    "responseHeader":{
      "status":0,
      "QTime":17,
      "params":{
        "indent":"true",
        "q":"*:*",
        "group.limit":"0",
        "group.field":"rfp_stub",
        "group":"true",
        "wt":"json",
        "rows":"1000"}},
    "grouped":{
      "rfp_stub":{
        "matches":18470,
        "groups":[{
            "groupValue":"java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
            "doclist":{"numFound":3,"start":0,"docs":[]}},
          {
            "groupValue":"java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
            "doclist":{"numFound":5,"start":0,"docs":[]}},
          {
            "groupValue":"java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
            "doclist":{"numFound":6,"start":0,"docs":[]}},
          …

  The numFound value shows the number of documents within that group.  Is
  there any way to perform a sort on numFound in SOLR?  I don't believe this
  is supported, but wondered if anyone there has come across this and if
  there were any suggested workarounds, given that the dataset is really too
  large to hold in memory on our app servers?
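
For result sets that do fit in memory, the client-side sort Erick suggests
could look something like this SolrJ sketch (the core URL is assumed; the
field name is from the thread):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.Group;
    import org.apache.solr.client.solrj.response.GroupCommand;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class SortGroupsByNumFound {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server =
                new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            q.set("group", true);
            q.set("group.field", "rfp_stub");
            q.set("group.limit", 0);
            q.setRows(1000);

            QueryResponse rsp = server.query(q);
            GroupCommand cmd = rsp.getGroupResponse().getValues().get(0);
            List<Group> groups = new ArrayList<Group>(cmd.getValues());

            // The part Solr won't do for us: order groups by numFound, descending.
            Collections.sort(groups, new Comparator<Group>() {
                public int compare(Group a, Group b) {
                    long x = a.getResult().getNumFound();
                    long y = b.getResult().getNumFound();
                    return x < y ? 1 : (x > y ? -1 : 0);
                }
            });

            for (Group g : groups) {
                System.out.println(g.getGroupValue() + " -> "
                    + g.getResult().getNumFound());
            }
        }
    }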



SOLR grouped query sorting on numFound

2013-09-24 Thread Brent Ryan
We ran into one snag during development with SOLR and I thought I'd run it
by anyone to see if they had any slick ways to solve this issue.

Basically, we're performing a SOLR query with grouping and want to be able
to sort by the number of documents found within each group.

Our query response from SOLR looks something like this:

{
  "responseHeader":{
    "status":0,
    "QTime":17,
    "params":{
      "indent":"true",
      "q":"*:*",
      "group.limit":"0",
      "group.field":"rfp_stub",
      "group":"true",
      "wt":"json",
      "rows":"1000"}},
  "grouped":{
    "rfp_stub":{
      "matches":18470,
      "groups":[{
          "groupValue":"java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
          "doclist":{"numFound":3,"start":0,"docs":[]}},
        {
          "groupValue":"java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
          "doclist":{"numFound":5,"start":0,"docs":[]}},
        {
          "groupValue":"java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
          "doclist":{"numFound":6,"start":0,"docs":[]}},
        …

The numFound value shows the number of documents within that group.  Is there
any way to perform a sort on numFound in SOLR?  I don't believe this is
supported, but wondered if anyone there has come across this and if there
were any suggested workarounds, given that the dataset is really too large to
hold in memory on our app servers?


apache bench solr keep alive not working?

2013-09-10 Thread Brent Ryan
Does anyone know why solr is not respecting keep-alive requests when using
apache bench?

ab -v 4 -H "Connection: Keep-Alive" -H "Keep-Alive: 3000" -k -c 10 -n 100 \
  "http://host1:8983/solr/test.solr/select?q=*%3A*&wt=xml&indent=true"


The response contains this, and if you look at the debug output you see an
HTTP header of "Connection: Close":

Keep-Alive requests:0


Any ideas?  I'm seeing the same behavior when using http client.


Thanks,
Brent


Re: apache bench solr keep alive not working?

2013-09-10 Thread Brent Ryan
Thanks guys.  I saw the other post with curl and verified it works.

I've also used apache bench for a bunch of stuff, and keep-alive works fine
with things like Java netty.io servers...  Strange that Tomcat isn't
respecting the HTTP protocol or headers.

There must be a bug in this version of Tomcat being used.


On Tue, Sep 10, 2013 at 6:23 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:


 : Does anyone know why solr is not respecting keep-alive requests when
 using
 : apache bench?

 I've seen this before from people trying to test with ab, but never
 fully understood it.

 There is some combination of using ab (which uses HTTP/1.0 and the non-RFC
 compliant HTTP/1.0 version of optional Keep-Alive) and jetty (which also
 attempts to support the non-RFC compliant HTTP/1.0 version of optional
 Keep-Alive) and chunked responses that come from jetty when you request
 dynamic resources (like solr query URLs) as opposed to static resources
 with a known file size.

 I suspect that jetty's best attempt at doing the hokey HTTP/1.0 Keep-Alive
 thing doesn't work with chunked responses -- or maybe ab doesn't actually
 keep the connection alive unless it gets a content-length.

 either way, see this previous thread for how you can demonstrate that
 keep-alive works properly using curl...


 https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201209.mbox/%3c5048e856.9080...@ea.com%3E


 -Hoss
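
For completeness, connection reuse is easy to observe from Java with
HttpClient's pooling connection manager -- a sketch against HttpClient
4.2.x (the host and core names are taken from the ab command above):

    import org.apache.http.HttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.impl.conn.PoolingClientConnectionManager;
    import org.apache.http.util.EntityUtils;

    public class KeepAliveCheck {
        public static void main(String[] args) throws Exception {
            PoolingClientConnectionManager cm = new PoolingClientConnectionManager();
            DefaultHttpClient client = new DefaultHttpClient(cm);
            String url = "http://host1:8983/solr/test.solr/select?q=*%3A*&wt=xml";
            for (int i = 0; i < 3; i++) {
                HttpResponse rsp = client.execute(new HttpGet(url));
                EntityUtils.consume(rsp.getEntity()); // release to the pool
                // With working keep-alive the same pooled connection is
                // reused, so "available" stays at 1 across iterations.
                System.out.println(cm.getTotalStats());
            }
            client.getConnectionManager().shutdown();
        }
    }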



CRLF Invalid Exception ?

2013-09-06 Thread Brent Ryan
Has anyone ever hit this when adding documents to SOLR?  What does it mean?


ERROR [http-8983-6] 2013-09-06 10:09:32,700 SolrException.java (line 108)
org.apache.solr.common.SolrException: Invalid CRLF
  at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:175)
  at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
  at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:663)
  at com.datastax.bdp.cassandra.index.solr.CassandraDispatchFilter.execute(CassandraDispatchFilter.java:176)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
  at com.datastax.bdp.cassandra.index.solr.CassandraDispatchFilter.doFilter(CassandraDispatchFilter.java:139)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at com.datastax.bdp.cassandra.audit.SolrHttpAuditLogFilter.doFilter(SolrHttpAuditLogFilter.java:194)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at com.datastax.bdp.cassandra.index.solr.auth.CassandraAuthorizationFilter.doFilter(CassandraAuthorizationFilter.java:95)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at com.datastax.bdp.cassandra.index.solr.auth.DseAuthenticationFilter.doFilter(DseAuthenticationFilter.java:102)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
  at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
  at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
  at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
  at java.lang.Thread.run(Thread.java:722)
Caused by: com.ctc.wstx.exc.WstxIOException: Invalid CRLF
  at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
  at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
  at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:387)
  at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:245)
  at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
  ... 30 more
Caused by: java.io.IOException: Invalid CRLF
  at org.apache.coyote.http11.filters.ChunkedInputFilter.parseCRLF(ChunkedInputFilter.java:352)
  at org.apache.coyote.http11.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:151)
  at org.apache.coyote.http11.InternalInputBuffer.doRead(InternalInputBuffer.java:710)
  at org.apache.coyote.Request.doRead(Request.java:428)
  at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:304)
  at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:403)
  at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:327)
  at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:162)
  at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
  at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
  at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
  at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
  at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
  at com.ctc.wstx.sr.StreamScanner.loadMoreFromCurrent(StreamScanner.java:1046)
  at com.ctc.wstx.sr.StreamScanner.parseLocalName2(StreamScanner.java:1796)
  at com.ctc.wstx.sr.StreamScanner.parseLocalName(StreamScanner.java:1756)
  at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2914)
  at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2848)
  at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
  ... 33 more


Re: CRLF Invalid Exception ?

2013-09-06 Thread Brent Ryan
Thanks.  I realized there's an error in the ChunkedInputFilter...

I'm not sure if this means there's a bug in the client library I'm using
(SolrJ 4.3) or a bug in the server (Solr 4.3).  Or is there something in
my data that's causing the issue?


On Fri, Sep 6, 2013 at 1:02 PM, Chris Hostetter hossman_luc...@fucit.org wrote:


 : Has anyone ever hit this when adding documents to SOLR?  What does it
 mean?

 Always check for the root cause...

 : Caused by: java.io.IOException: Invalid CRLF
 :
 : at
 :
 org.apache.coyote.http11.filters.ChunkedInputFilter.parseCRLF(ChunkedInputFilter.java:352)

 ...so while Solr is trying to read XML off the InputStream from the
 client, an error is encountered by the ChunkedInputFilter.

 I suspect the client library you are using for the HTTP connection is
 claiming it's using chunking but isn't, or is doing something wrong with
 the chunking, or there is a bug in the ChunkedInputFilter.


 -Hoss



Re: CRLF Invalid Exception ?

2013-09-06 Thread Brent Ryan
For what it's worth... I just updated to SolrJ 4.4 (even though my server
is Solr 4.3) and it seems to have fixed the issue.

Thanks for the help!


On Fri, Sep 6, 2013 at 1:41 PM, Chris Hostetter hossman_luc...@fucit.org wrote:


 : I'm not sure if this means there's a bug in the client library I'm using
 : (SolrJ 4.3) or a bug in the server (Solr 4.3).  Or is there something in
 : my data that's causing the issue?

 It's unlikely that an error in the data you pass to SolrJ methods would be
 causing this problem -- I'm pretty sure it's not even a problem with the
 raw xml data being streamed; it appears to be a problem with how that data
 is getting chunked across the wire.

 My best guess is that the most likely causes are either...
  * a bug in the HttpClient version you are using on the client side
  * a bug in the ChunkedInputFilter you are using on the server side
  * a misconfiguration of the HttpClient object you are using with SolrJ
    (i.e., claiming it's sending chunked when it's not?)


 -Hoss
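
On that last point: SolrJ will accept an explicitly constructed HttpClient,
which makes it easier to pin the client version and configuration while
debugging. A minimal sketch (the core URL is assumed):

    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class PinnedClient {
        public static void main(String[] args) {
            // Supplying your own HttpClient controls exactly which
            // version/configuration handles the chunked request body.
            DefaultHttpClient http = new DefaultHttpClient();
            HttpSolrServer server =
                new HttpSolrServer("http://localhost:8983/solr/collection1", http);
            System.out.println(server.getBaseURL());
        }
    }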