OK, I think I'm beginning to get my head around the background thread design. It seems to be used in a lightweight way by a number of components, but is most heavily used by the transaction manager to handle post-commit work.

As you mentioned, the current design spreads the work across the user threads if the background thread gets backed up, and I agree that overall throughput may not be much improved by having multiple background threads do the work. If the work is coming in too fast, then sooner or later you're going to have to throttle back the background work and have it handled by the user threads so that it gets done in a reasonable amount of time. However, having multiple background threads could help if you have "bursts" of transactions that require post-commit work that the daemon threads can "drain out" in the background without impacting response time; it seems to me this might be a common scenario. I guess it depends on the application and what you're trying to optimize for; if you have something that is constantly inserting and deleting records, then it's possible that what exists now is fine. If you have an application that generally doesn't do deletes, but say once or twice a day "cleans house", then having more background threads could be very useful.
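
To make the "multiple background threads plus fallback to the user thread" idea concrete, here is a rough sketch. It is not Derby code: the class name BackgroundPool, the thread names, and the sizes are all made up for illustration. It simply uses the standard java.util.concurrent ThreadPoolExecutor with CallerRunsPolicy, which runs a rejected task on the submitting thread once the workers are busy and the bounded queue is full.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch (not Derby code): N background threads with overflow run by the caller. */
class BackgroundPool {
    private final ThreadPoolExecutor pool;
    private final AtomicInteger threadIds = new AtomicInteger();

    BackgroundPool(int threads, int queueCapacity) {
        pool = new ThreadPoolExecutor(
            threads, threads,                         // fixed number of background threads
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),  // bounded backlog of pending work
            r -> {
                Thread t = new Thread(r, "bg-" + threadIds.incrementAndGet());
                t.setDaemon(true);                    // do not keep the JVM alive
                return t;
            },
            // When all threads are busy and the backlog is full,
            // the submitting (user) thread runs the work itself.
            new ThreadPoolExecutor.CallerRunsPolicy());
    }

    void submit(Runnable work) {
        pool.execute(work);
    }

    void shutdown() {
        pool.shutdown();
    }

    boolean awaitDrained(long millis) throws InterruptedException {
        return pool.awaitTermination(millis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BackgroundPool bg = new BackgroundPool(2, 4);
        Map<String, Integer> ranOn = new ConcurrentHashMap<>();
        // A "burst" of 20 work items; count how many ran on each thread.
        for (int i = 0; i < 20; i++) {
            bg.submit(() -> ranOn.merge(Thread.currentThread().getName(), 1, Integer::sum));
        }
        bg.shutdown();
        bg.awaitDrained(1000);
        System.out.println(ranOn); // the split between bg-* and main varies from run to run
    }
}
```

With a small burst the bg-* threads absorb everything. Only when the backlog fills does work spill over to the submitting thread, which is the throttling behavior described above.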

But, as you said, no design should be done until these theories are verified by some performance and scalability testing. I'm thinking of having one test that does sort of steady state inserts and deletes, and another one that does mostly inserts and updates and then "cleans house" when the number of records reaches some maximum. What do you think?

While reading the code, I did encounter one thing that had me concerned. What I am calling the transaction manager (impl.store.Xact.java) seems to me to be overstepping its responsibilities: it takes care of running work items when the work item requires immediate attention or if the daemon thread is backed up. This logic is not made available to any of the other components that make use of the daemon service. One can imagine this logic being refactored out and made generally available to all clients of the daemon service. Is there a specific reason for having this code in Xact.java?

Thanks,

David

Mike Matrigali wrote:

I have changed the subject, as I completely missed the original post
which had something to do with adding JUnit tests.

I am not sure what is the right solution here, but getting a discussion
going would be good.

Currently a number of store actions are queued in "post commit" mode,
which means they should not be executed until after the transaction which
queued them commits.  Currently there is one background thread which
processes these; if its queue gets too full then the work is done by the
actual thread which queued the work.   Most of the post commit work involves
reclaiming space from deleted rows after their transaction commits.
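
As a rough illustration of that behavior (a sketch only, not the actual code in the daemon package; the class name PostCommitQueue, the method names, and the counters are invented for the example), a single daemon thread draining a bounded queue, with the queuing thread doing the work itself when the queue is full, might look like this:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch (not Derby's daemon service): one background thread, bounded queue. */
class PostCommitQueue {
    private final BlockingQueue<Runnable> queue;
    private final Thread daemon;
    // Counters exist only so the example can show where work ended up running.
    public final AtomicInteger ranInBackground = new AtomicInteger();
    public final AtomicInteger ranInCaller = new AtomicInteger();

    PostCommitQueue(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
        daemon = new Thread(() -> {
            try {
                while (true) {
                    Runnable work = queue.take(); // blocks until work is queued
                    work.run();
                    ranInBackground.incrementAndGet();
                }
            } catch (InterruptedException e) {
                // interrupted by shutdown(): leave the loop and let the thread end
            }
        }, "post-commit-daemon");
        daemon.setDaemon(true);
        daemon.start();
    }

    /** Intended to be called only after the queuing transaction has committed. */
    void enqueueAfterCommit(Runnable work) {
        if (!queue.offer(work)) {
            // Queue is full: the thread that queued the work does it itself.
            work.run();
            ranInCaller.incrementAndGet();
        }
    }

    void shutdown() {
        daemon.interrupt();
    }

    public static void main(String[] args) throws InterruptedException {
        PostCommitQueue q = new PostCommitQueue(2);
        for (int i = 0; i < 10; i++) {
            q.enqueueAfterCommit(() -> { /* stand-in for reclaiming space from a deleted row */ });
        }
        Thread.sleep(100); // give the daemon a moment to drain the queue
        System.out.println("background=" + q.ranInBackground + " caller=" + q.ranInCaller);
        q.shutdown();
    }
}
```

How the work splits between the daemon and the callers depends on timing, so the printed counts will differ from run to run.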

Going forward there is going to be a need for more background work.  I
soon will be posting the first phase of work to allow for returning
space back to the operating system.  Eventually it would be best if this
work was also done in background, somehow automatically queued by the
system.

I would also recommend coming up with a usage scenario which shows a
problem before coding up a solution.  I believe a test with lots of
users doing insert and delete should eventually show the background task
being bogged down -- but I am not sure if moving work to additional
threads is much better than just spreading the work out across the
existing user threads.

The code for the current background thread can be found in:
opensource/java/engine/org/apache/derby/impl/services/daemon

An example of one of the units of work put on the queue is in:
opensource/java/engine/org/apache/derby/impl/store/access/heap/heappostcommit.java

Dan is probably the person who most recently worked on this code, and
should have some comments in this area.  He should be back active on the
list early next week.

Note another interesting area of research/coding would be to see how
Derby scales on machines with a larger number of processors.  Not much work has
been done at all on machines with more than 2 processors.  The system
has been designed from bottom up to be multi-threaded, but not much
testing/monitoring has been done on 4 or more processor machines.   The
following single threading points exist in derby:
   o each user query is executed by a single thread.
   o the locking system is protected by a single java synchronization point.
   o copying log records into the log is a single sync point
   o finding a buffer in the buffer cache is a single sync point

All of these seemed to be reasonable designs for 1, 2 and 4 way machines.

/mikem


David Van Couvering wrote:



I noticed on the todo list there is a need to have more than one
background thread to enable better scalability with lots of client
connections.  I'm trying to find a way to gently work my way into doing
some work on Derby, and this seemed like a project of small enough scope
to get my feet wet.  Is there any background on this, or should I just
jump right in?  I didn't see any discussion of this on the list...

Thanks,

David


