Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-27 Thread Shawn Heisey

On 3/26/2012 10:25 PM, Shawn Heisey wrote:
The problem is that I currently have no way (that I know of so far) to 
detect that a problem happened.  As far as my code is concerned, 
everything worked, so it updates my position tracking and those 
documents will never be inserted.  I have not yet delved into the 
response object to see whether it can tell me anything.  My code 
currently assumes that if no exception was thrown, it was successful.  
This works with CHSS.  I will write some test code that tries out 
various error situations and see what the response contains.


I've written some test code.  When doing an add with SUSS against a 
server that's down, no exception is thrown.  It does throw one for query 
and deleteByQuery.  When doing the add test with CHSS, an exception is 
thrown.  I guess I'll just have to use CHSS until this gets fixed, 
assuming it ever does.  Would it be at all helpful to file an issue in 
jira, or has one already been filed?  With a quick search, I could not 
find one.


Thanks,
Shawn



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-27 Thread Mark Miller
Like I said, you have to extend the class and override the error method. 

Sent from my iPhone



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-27 Thread Shawn Heisey

On 3/26/2012 6:43 PM, Mark Miller wrote:

It doesn't get thrown because that logic needs to continue - you don't 
necessarily want one bad document to stop all the following documents from 
being added. So the exception is sent to that method with the idea that you can 
override and do what you would like. I've written sample code around stopping 
and throwing an exception, but I guess it's not totally trivial. Other ideas for 
reporting errors have been thrown around in the past, but no work on it has 
gotten any traction.


It looks like StreamingUpdateSolrServer is not meant for situations 
where strict error checking is required.  I think the documentation 
should reflect that.  Would you be opposed to a javadoc update at the 
class level (plus a wiki addition) like the following?

    Because document inserts are handled as background tasks, exceptions 
    and errors that occur during those operations will not be available 
    to the calling program, but they will be logged.  For example, if the 
    Solr server is down, your program must determine this on its own.  If 
    you need strict error handling, use CommonsHttpSolrServer.

If my wording is bad, feel free to make suggestions.


If I'm wrong and you do have an example of an error handling override 
that would do what I need, I would love to see it.  From what I can 
tell, add requests are pushed down and handled by Runner threads, 
completely disconnected from the request.  The response to add calls 
always seems to be a NOTE element saying the request is processed in a 
background stream, even if successful.


Thanks,
Shawn



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-27 Thread Mark Miller

On Mar 27, 2012, at 10:51 AM, Shawn Heisey wrote:

 On 3/26/2012 6:43 PM, Mark Miller wrote:
 It doesn't get thrown because that logic needs to continue - you don't 
 necessarily want one bad document to stop all the following documents from 
 being added. So the exception is sent to that method with the idea that you 
 can override and do what you would like. I've written sample code around 
 stopping and throwing an exception, but I guess it's not totally trivial. 
 Other ideas for reporting errors have been thrown around in the past, but no 
 work on it has gotten any traction.
 
 It looks like StreamingUpdateSolrServer is not meant for situations where 
 strict error checking is required.  I think the documentation should reflect 
 that.  Would you be opposed to a javadoc update at the class level (plus a 
 wiki addition) like the following? Because document inserts are handled as 
 background tasks, exceptions and errors that occur during those operations 
 will not be available to the calling program, but they will be logged.  For 
 example, if the Solr server is down, your program must determine this on its 
 own.  If you need strict error handling, use CommonsHttpSolrServer.  If my 
 wording is bad, feel free to make suggestions.
 
 If I'm wrong and you do have an example of an error handling override that 
 would do what I need, I would love to see it.  From what I can tell, add 
 requests are pushed down and handled by Runner threads, completely 
 disconnected from the request.  The response to add calls always seems to be 
 a NOTE element saying the request is processed in a background stream, even 
 if successful.
 
 Thanks,
 Shawn
 


I'm not saying what it's meant for, I'm just saying what it is. Currently, the 
only thing you can do to check for errors is override that method. I understand 
it's still somewhat limiting - how well it works depends on your use case. For 
example, I've known people who just want to stop the update process if a doc 
fails and throw an exception. You can write code to do that by extending the 
class and overriding handleError. You can also collect the exceptions, count 
the failures, read and parse any error messages, etc. It doesn't help you with 
an ID or anything though - unless you get unlucky/lucky and can parse it out of 
the error messages (if it's even in them). It might be more useful if you could 
set the name of an id field for it to look for and perhaps also dump to that 
method.
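
A minimal sketch of that approach, not actual SolrJ code: BackgroundSender 
below is a hypothetical stand-in for a client that sends documents on a 
background thread and reports failures only through an overridable handleError 
hook, and CollectingSender overrides the hook to collect the exceptions so the 
caller can count them or fail later. Every name other than handleError is an 
assumption made for illustration. With SolrJ the override would go on a 
subclass of StreamingUpdateSolrServer, and the hook's exact signature should be 
checked against the version in use.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in (not SolrJ): sends documents on a background thread
// and reports failures only through the overridable handleError hook.
class BackgroundSender {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Queue a document and return at once; send failures never reach the caller.
    public void add(String doc) {
        worker.execute(() -> {
            try {
                send(doc);
            } catch (Exception e) {
                handleError(e);
            }
        });
    }

    // Pretend transport: any document starting with "bad" is rejected.
    protected void send(String doc) throws IOException {
        if (doc.startsWith("bad")) {
            throw new IOException("rejected: " + doc);
        }
    }

    // Default hook: log and carry on.
    public void handleError(Throwable ex) {
        System.err.println("error: " + ex);
    }

    // Stop accepting work and wait for queued sends; false on timeout or interrupt.
    public boolean finish(long timeoutSeconds) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

// Subclass that collects errors instead of only logging them.
class CollectingSender extends BackgroundSender {
    private final Queue<Throwable> errors = new ConcurrentLinkedQueue<>();

    @Override
    public void handleError(Throwable ex) {
        errors.add(ex);
    }

    // Snapshot of everything collected so far.
    public List<Throwable> errors() {
        return new ArrayList<>(errors);
    }

    // Throw if any send has failed; the first failure becomes the cause.
    public void throwIfFailed() {
        Throwable first = errors.peek();
        if (first != null) {
            throw new IllegalStateException(errors.size() + " update(s) failed", first);
        }
    }
}
```

The caller would add its documents, call finish, and then call throwIfFailed 
before updating any position tracking. As noted above, the collected 
exceptions still do not identify which document failed unless the message 
happens to say so.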

There have been previous conversations about improving error reporting for this 
SolrServer, but no work has ever really gotten off the ground. There may be 
existing JIRA issues around this topic - certainly there are previous email 
threads.

All in all though, please, make all the suggestions and JIRA issues you want. 
Javadoc improvements can be submitted as patches through JIRA as well. Also, 
the Wiki is open to anyone to update. 

- Mark Miller
lucidimagination.com



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-27 Thread Erick Erickson
https://issues.apache.org/jira/browse/SOLR-445

This JIRA reflects the slightly different case of wanting better
reporting of *which* document failed in a multi-document packet, it
doesn't specifically address SUSS. But it might serve to give you some
ideas if you tackle this.



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-27 Thread Mike Sokolov

On 3/27/2012 11:14 AM, Mark Miller wrote:

On Mar 27, 2012, at 10:51 AM, Shawn Heisey wrote:


On 3/26/2012 6:43 PM, Mark Miller wrote:

It doesn't get thrown because that logic needs to continue - you don't 
necessarily want one bad document to stop all the following documents from 
being added. So the exception is sent to that method with the idea that you can 
override and do what you would like. I've written sample code around stopping 
and throwing an exception, but I guess it's not totally trivial. Other ideas for 
reporting errors have been thrown around in the past, but no work on it has 
gotten any traction.

It looks like StreamingUpdateSolrServer is not meant for situations where strict error 
checking is required.  I think the documentation should reflect that.  Would you be 
opposed to a javadoc update at the class level (plus a wiki addition) like the following? 
Because document inserts are handled as background tasks, exceptions and errors 
that occur during those operations will not be available to the calling program, but they 
will be logged.  For example, if the Solr server is down, your program must determine 
this on its own.  If you need strict error handling, use CommonsHttpSolrServer.  If 
my wording is bad, feel free to make suggestions.

It might make sense to accumulate the errors in a fixed-size queue and 
report them either when the queue fills up or when the client commits 
(assuming the commit will wait for all outstanding inserts to complete 
or fail).  This is what we do client-side when performing multi-threaded 
inserts.  Sounds great in theory, I think, but then I haven't delved 
into SUSS at all ... just a suggestion, take it or leave it.  Actually, I 
wonder whether SUSS is necessary if you do the threading client-side?  
You might get a similar perf gain; I know we see a substantial speedup 
that way, because then your updates spawn multiple threads in the 
server anyway, don't they?
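
The queue-and-report idea could be sketched roughly as follows, assuming a 
single producer thread. QueuedErrorIndexer and its sender callback are 
hypothetical names standing in for whatever performs the real insert; nothing 
here is SolrJ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical client-side indexer: inserts run on a thread pool, errors are
// kept in a bounded queue and surfaced when it fills up or on commit().
// Intended to be driven from a single producer thread.
class QueuedErrorIndexer {
    private final ExecutorService pool;
    private final BlockingQueue<Throwable> errors;
    private final List<Future<?>> pending = new ArrayList<>();
    private final Consumer<String> sender; // stand-in for the real insert call

    QueuedErrorIndexer(int threads, int maxErrors, Consumer<String> sender) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.errors = new ArrayBlockingQueue<>(maxErrors);
        this.sender = sender;
    }

    // Submit one insert. Once the error queue is full, the collected errors
    // are thrown instead and this document is not submitted.
    void add(String doc) {
        if (errors.remainingCapacity() == 0) {
            throw drainToException();
        }
        pending.add(pool.submit(() -> {
            try {
                sender.accept(doc);
            } catch (RuntimeException e) {
                errors.offer(e); // dropped silently if the queue is already full
            }
        }));
    }

    // Wait for every outstanding insert, then report any collected errors.
    void commit() {
        for (Future<?> f : pending) {
            try {
                f.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException e) {
                errors.offer(e.getCause()); // e.g. an Error thrown by the sender
            }
        }
        pending.clear();
        if (!errors.isEmpty()) {
            throw drainToException();
        }
    }

    // Release the worker threads once indexing is finished.
    void close() {
        pool.shutdown();
    }

    // Empty the error queue into one exception: first error as the cause,
    // the rest attached as suppressed exceptions.
    private IllegalStateException drainToException() {
        List<Throwable> drained = new ArrayList<>();
        errors.drainTo(drained);
        IllegalStateException ex =
            new IllegalStateException(drained.size() + " insert(s) failed", drained.get(0));
        for (Throwable t : drained.subList(1, drained.size())) {
            ex.addSuppressed(t);
        }
        return ex;
    }
}
```

The bounded queue keeps memory use fixed if the server stays down for a long 
run, at the cost of dropping errors beyond the limit; the caller learns that 
something failed, not necessarily every failure.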


- Mike


StreamingUpdateSolrServer - exceptions not propagated

2012-03-26 Thread Shawn Heisey
I've been building a new version of my app that keeps our Solr indexes 
up to date.  I had hoped to use StreamingUpdateSolrServer instead of 
CommonsHttpSolrServer for performance reasons, but I have run into a 
showstopper problem that has made me revert to CHSS.


I have been relying on exception handling to detect when there is any 
kind of problem with any request sent to Solr.  Looking at the code for 
SUSS, it seems that any exceptions thrown by lower level code are simply 
logged, then forgotten as if they had never happened.


So far I have not been able to decipher how things actually work, so I 
can't tell if it would be possible to propagate the exception back up 
into my code.


Questions for the experts: Would such propagation be possible without 
compromising performance?  Is this a bug?  Can I somehow detect the 
failure and throw an exception of my own?


For reference, here is the exception that gets logged, but not actually 
thrown:


java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
    at java.net.Socket.connect(Socket.java:579)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:154)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

Thanks,
Shawn



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-26 Thread Mark Miller
It doesn't get thrown because that logic needs to continue - you don't 
necessarily want one bad document to stop all the following documents from 
being added. So the exception is sent to that method with the idea that you can 
override and do what you would like. I've written sample code around stopping 
and throwing an exception, but I guess it's not totally trivial. Other ideas for 
reporting errors have been thrown around in the past, but no work on it has 
gotten any traction.


- Mark Miller
lucidimagination.com



Re: StreamingUpdateSolrServer - exceptions not propagated

2012-03-26 Thread Shawn Heisey

On 3/26/2012 6:43 PM, Mark Miller wrote:

It doesn't get thrown because that logic needs to continue - you don't 
necessarily want one bad document to stop all the following documents from 
being added. So the exception is sent to that method with the idea that you can 
override and do what you would like. I've written sample code around stopping 
and throwing an exception, but I guess it's not totally trivial. Other ideas for 
reporting errors have been thrown around in the past, but no work on it has 
gotten any traction.


- Mark Miller
lucidimagination.com

On Mar 26, 2012, at 7:33 PM, Shawn Heisey wrote:


I've been building a new version of my app that keeps our Solr indexes up to 
date.  I had hoped to use StreamingUpdateSolrServer instead of 
CommonsHttpSolrServer for performance reasons, but I have run into a 
showstopper problem that has made me revert to CHSS.

I have been relying on exception handling to detect when there is any kind of 
problem with any request sent to Solr.  Looking at the code for SUSS, it seems 
that any exceptions thrown by lower level code are simply logged, then 
forgotten as if they had never happened.


The problem is that I currently have no way (that I know of so far) to 
detect that a problem happened.  As far as my code is concerned, 
everything worked, so it updates my position tracking and those 
documents will never be inserted.  I have not yet delved into the 
response object to see whether it can tell me anything.  My code 
currently assumes that if no exception was thrown, it was successful.  
This works with CHSS.  I will write some test code that tries out 
various error situations and see what the response contains.


Thanks,
Shawn