Patricia Shanahan wrote:
Peter Firmstone wrote:
Peter Firmstone wrote:
Patricia Shanahan wrote:
On 7/5/2010 8:39 PM, Peter Firmstone wrote:
Patricia Shanahan wrote:
Peter Firmstone wrote:
...
This is where you have to be careful with distributed computing. I
think you'd have to branch out in both directions, checking older and
younger tasks, as the order in which tasks arrive may be a combination
of processing remote and local calls. In fact, a task might arrive so
far out of sequence that it could be at the opposite end of the queue.
...
So what happens if Task x arrives so late that we have already done
the runAfter tests for Task y, but y should run after x?
Patricia
Hmm, yes, I was just thinking that. I need to look at the
implementations again...
I've checked both the comments and the code. The TaskManager class
Javadoc comment talks about "not required to run after any of the
tasks that precede it in the queue", and I believe that is the way
it is implemented. For example, takeTask sets the size to i in the
runAfter call for a candidate at index i in the list.
It seems to be the caller's responsibility to make sure that a task
is not added to a TaskManager until after any task it needs to run
after.
Patricia
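The behaviour described above can be illustrated with a minimal sketch. This is not the River code: the Task interface, NamedTask and firstRunnable below are hypothetical stand-ins that only mimic the "check against the tasks that precede it" rule, where the candidate at index i is asked about the first i entries of the list.

```java
import java.util.Arrays;
import java.util.List;

public class RunAfterSketch {

    /** Hypothetical task type; an assumption, not the real TaskManager.Task. */
    public interface Task {
        String name();
        /** True if this task must wait for one of the first {@code size} tasks. */
        boolean runAfter(List<Task> tasks, int size);
    }

    /** A task that depends on at most one other task, identified by name. */
    public static final class NamedTask implements Task {
        private final String name;
        private final String dependsOn; // null means "no dependency"

        public NamedTask(String name, String dependsOn) {
            this.name = name;
            this.dependsOn = dependsOn;
        }

        @Override
        public String name() { return name; }

        @Override
        public boolean runAfter(List<Task> tasks, int size) {
            // Only the entries before index `size` are inspected.
            for (int i = 0; i < size; i++) {
                if (tasks.get(i).name().equals(dependsOn)) return true;
            }
            return false;
        }
    }

    /** Index of the first task not blocked by any task that precedes it, or -1. */
    public static int firstRunnable(List<Task> queue) {
        for (int i = 0; i < queue.size(); i++) {
            // The candidate at index i is checked against indices 0..i-1 only.
            if (!queue.get(i).runAfter(queue, i)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        Task x = new NamedTask("x", null);
        Task y = new NamedTask("y", "x"); // y should run after x

        List<Task> xFirst = Arrays.asList(x, y);
        List<Task> yFirst = Arrays.asList(y, x); // x arrived late

        System.out.println(y.runAfter(xFirst, 1));                     // true
        System.out.println(xFirst.get(firstRunnable(xFirst)).name()); // x
        System.out.println(yFirst.get(firstRunnable(yFirst)).name()); // y
    }
}
```

In the second queue, y is at index 0, so it is checked against nothing and is treated as runnable even though x has not run. That is the late-arrival case raised earlier in the thread.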
Hmm, yes, you have a point; that is the current behaviour.
It'll be interesting to see if we can simulate RemoteEvents
arriving out of order with a time delay between them. I suspect
that the state would just get muddled without any complaint. We'd
need to test that with the current implementation, then decide which
party should be responsible.
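As a rough illustration of how state could get muddled without any complaint, here is a small self-contained sketch. Everything in it is an assumption for illustration: the Event class is a hypothetical stand-in that carries a sequence number and a state value, not River's RemoteEvent, and dropping stale events is just one possible policy, not a decision about which party should be responsible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OutOfOrderSketch {

    /** Hypothetical event: a sequence number plus the state it announces. */
    public static final class Event {
        final long seq;
        final String state;

        public Event(long seq, String state) {
            this.seq = seq;
            this.state = state;
        }
    }

    /** Naive listener: whichever event arrives last wins, silently. */
    public static String applyNaively(List<Event> arrivals) {
        String state = null;
        for (Event e : arrivals) {
            state = e.state;
        }
        return state;
    }

    /**
     * Checking listener: applies an event only if it is newer than the
     * newest one applied so far, and records the sequence numbers it skipped.
     */
    public static String applyChecked(List<Event> arrivals, List<Long> stale) {
        String state = null;
        long newest = Long.MIN_VALUE;
        for (Event e : arrivals) {
            if (e.seq > newest) {
                newest = e.seq;
                state = e.state;
            } else {
                stale.add(e.seq); // arrived after a newer event was applied
            }
        }
        return state;
    }

    public static void main(String[] args) {
        // Event 2 is delayed and arrives after event 3.
        List<Event> arrivals = Arrays.asList(
                new Event(1, "A"), new Event(3, "C"), new Event(2, "B"));

        System.out.println(applyNaively(arrivals));        // B
        List<Long> stale = new ArrayList<>();
        System.out.println(applyChecked(arrivals, stale)); // C
        System.out.println(stale);                         // [2]
    }
}
```

The naive listener ends in state B, which is older than C, and nothing reports the problem. The checking variant at least makes the reordering visible.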
I'm interested in getting River to run on the Internet. Currently
River / Jini is at home on local intranets, where there is probably
a very low likelihood that RemoteEvents will be received out of
order. I suspect this behaviour was overlooked or missed by the
designer. The caller cannot always know; if it did, I wouldn't need
a TaskManager that manages dependencies, just a FIFO queue. But I
could be wrong. It's a pity the original author isn't around to
comment. I find my understanding improves as I implement things. We
don't have to know the right answer up front, so experiment away.
I'm confident you'll work out a good solution, based on your
comments to date.
My assumption (and that's all it is) is that this has not been an
issue because, on a low-latency network, tasks take long enough, the
queue is long enough, and the current locking scales poorly enough
for all RemoteEvents, and thus tasks, to arrive on the queue in time.
I suspect that you'll fix it so well that the queue will be empty,
waiting on the network, and therefore the dependencies won't get
checked at all.
Thanks for your confidence.
My last job (I was a graduate student from 2002 to late last year) was
as a large SPARC server platform architect. To improve prototype
system testing, I wrote an extremely silly but extremely useful
program called "parstore". It just block stores the floating point
registers on a specified number of processors to memory, repeatedly,
as fast as it can. The effect is to fill queues, and generally disturb
and stress the interconnect. It never detected any errors, but
prototypes were more likely to crash and operating system stress tests
were more likely to fail while it was running.
If one of the River developers has an intranet test environment, it
may be possible to simulate the effect of running over the Internet
with a similar trick: create some workload that keeps the network
very busy, and run it in parallel with a quality assurance test.
In some cases it may not matter which of two transactions is done
first, but it is important to make sure there is a consistent order
between them.
Patricia
Cool & Wow!
I still do most of my development on SPARC, but will migrate to Linux
x86 shortly. I don't think Oracle or Fujitsu intend to support
developers with SPARC workstations. Mine's still going, but it's
getting old.
Cheers,
Peter.