I hope you can help me with what's probably a basic question, but one I haven't 
found an answer to elsewhere.  

I haven't used Derby extensively, but I do have experience with multithreaded 
applications with MySQL.  I've noticed some interesting but surprising behavior 
with Derby I was hoping I could get around.  

Typically in my app I have a large number of inserts happening from multiple 
threads as the system functions.  I have another thread that runs occasionally 
updating the status of these records based on queries to an external source; 
this update may be a record update or a delete of the record.  

Initially I had it set up so this "updating" thread was iterating over a scroll 
insensitive updatable resultset, deleting/updating as needed. But this resulted 
in a lot more blocking than I can afford: the inserts, which happen frequently 
- up to say 5-8 a second at times - but nothing that seems it should terribly 
overburden Derby, would block out the resultset from even stepping to the first 
record.  So instead I had it iterating over a read-only resultset then 
separately deleting/updating records - with inserts happening from various 
other threads all the while.  (Each thread also has its own pooled connection 
from DBCP)

What I'm seeing is that under heavy load, I occasionally get "lock could not be 
obtained.." exceptions.  I could ramp up the lock timeout, but I feel like this 
would be just kicking the problem down the road, and that there shouldn't be 
any problem with what I'm doing.

To this point, all the inserts are under autocommitting connections, and 
there's really no way to batch those; they need to be inserted real-time.  I 
can try batching the updates/deletes from iterating over the resultset, but I 
really would prefer (for requirements of my app) that these be performed 
real-time.  The only solution I've come up with so far is to select a "page" of 
data to update at a time (like 100 records), then close that 
resultset/statement and reselect the next "page".  It's not critical that the 
resultset I iterate over include all the data that's been inserted since the 
initial query was ran, though it would be preferable I suppose.  

Any thoughts on what I might be doing wrong or what I can do to make this work 
a little more reliably?  Performance is important, but stability and lower 
overhead is more important.

-Ron

Reply via email to