On Jun 14, 2012, at 12:48 PM, Peter Cock wrote:

> In a separate example with 33 sub-tasks, there were two of these
> inversions, while in yet another example with 33 sub-tasks there was
> a trio submitted out of order. This non-deterministic behavior is a
> little surprising, but in itself not an immediate problem.

You're correct that submission order shouldn't matter at all, but I'll take a look and see if I can come up with an explanation for why it varies.

> In what appears to be a separate (and more concerning) loss of order,
> after merging the output file order appears randomized. I would expect
> the output from task_0, then task_1, ..., finally task_16. I haven't yet
> worked out what order I am getting, but it isn't this, and neither is it
> the order from the SGE job numbers (e.g. correct bar one pair
> switched round).

This would be happening in the merge. It looks like changeset c959d32f2405 might be the culprit -- it doesn't explicitly reorder by task number in the merge method, which would lead to (I'm guessing) an alphanumeric sort. I'll test and fix this.

> [*] P.S. I would like to see an upper bound on the sleep_time in method
> run_job, say half an hour? Otherwise with a group of long running jobs
> it seems Galaxy may end up waiting a very long time between checks
> for their completion since it just doubles the wait at each point. I had
> sometimes noticed a delay between the sub-jobs finishing according
> to the cluster and Galaxy doing anything about merging it - this is
> probably why.

This sleep time should currently cap at 8 seconds.
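
To make the two points above concrete, here is a rough Python sketch. It is not the actual Galaxy code: task_number and next_sleep_time are hypothetical helper names, the task_N directory naming is taken from the example in the quoted message, and the starting sleep value of 1 second is an assumption. It shows why a plain string sort would put task_10 ahead of task_2, how sorting on the numeric suffix restores the expected order, and what a doubling wait with an 8 second ceiling looks like.

```python
import re


def task_number(dirname):
    """Return the integer suffix of a name like 'task_12' (hypothetical helper)."""
    match = re.search(r"(\d+)$", dirname)
    if match is None:
        raise ValueError("no task number in %r" % dirname)
    return int(match.group(1))


def next_sleep_time(current, cap=8):
    """Double the polling interval, but never exceed cap seconds (hypothetical helper)."""
    return min(current * 2, cap)


# 17 sub-tasks, named as in the quoted example: task_0 ... task_16.
task_dirs = ["task_%d" % i for i in range(17)]

# Plain string sort: 'task_10' compares lower than 'task_2'.
alphanumeric = sorted(task_dirs)

# Sort on the numeric suffix to get task_0, task_1, ..., task_16.
numeric = sorted(task_dirs, key=task_number)

print(alphanumeric[:4])  # ['task_0', 'task_1', 'task_10', 'task_11']
print(numeric[:4])       # ['task_0', 'task_1', 'task_2', 'task_3']

# Assumed starting interval of 1 second; the wait doubles until it hits the cap.
sleep_time = 1
for _ in range(5):
    print(sleep_time)
    sleep_time = next_sleep_time(sleep_time)
```

With a cap in place the wait between completion checks stops growing once it reaches the ceiling, so long-running jobs would not push the polling interval out indefinitely.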