Re: [gt-user] excessive latency

Ian Foster Tue, 22 Jul 2008 04:57:39 -0700

Hi Alexander:

A few comments on this topic:

a) We have put a fair bit of time into streamlining GRAM4 job submission. We'll keep working on it, and it will surely improve somewhat further in the future, but it will always be somewhat expensive because we are using SOAP, performing authentication/authorization, etc. So the use case that we support is not "lots of few second jobs."

b) That said, I note that the *throughput* that GRAM4 achieves is better than the *latency*. See http://www.globus.org/alliance/publications/papers/TG07-GRAM-comparison-final.pdf for a somewhat dated discussion of performance issues; note the results in Table 2 and Table 3. These are old data (GRAM4 has improved since then), but observe that back in 2007, we saw that when streaming multiple jobs, per-job cost was a few seconds.

c) If the use case in question is "lots of few second jobs" then the approach that we recommend is to use multi-level scheduling, e.g., via Falkon, MyCluster, etc.

d) If the use case is "lots of many-second jobs" and the concern is the rate at which your submitting client can send jobs to the many CPUs of D-Grid, then we should consult with the GRAM team to see whether there is some alternative way of implementing your submitting client to increase throughput.
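
To make the latency/throughput distinction in (b) and (d) concrete, here is a minimal back-of-the-envelope sketch. It is only a toy model, not a description of how GRAM4 or globusrun-ws behaves: the function names, the 10-second latency, and the number of overlapping submissions are all hypothetical, and it ignores any server-side limits on concurrency.

```python
import math


def total_submit_time(n_jobs, latency_s, in_flight=1):
    """Wall-clock seconds to submit n_jobs in a toy model.

    Assumption (hypothetical, not measured): every submission takes
    latency_s seconds, and up to in_flight submissions overlap fully.
    """
    if n_jobs < 0 or latency_s < 0 or in_flight < 1:
        raise ValueError("invalid arguments")
    # Jobs go out in "waves" of at most in_flight submissions each.
    waves = math.ceil(n_jobs / in_flight)
    return waves * latency_s


def per_job_cost(n_jobs, latency_s, in_flight=1):
    """Effective submission cost per job (seconds) in the same toy model."""
    if n_jobs < 1:
        raise ValueError("need at least one job")
    return total_submit_time(n_jobs, latency_s, in_flight) / n_jobs


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    print(per_job_cost(100, 10.0, in_flight=1))
    print(per_job_cost(100, 10.0, in_flight=10))
```

In this model the latency of each individual submission never changes; only the effective per-job cost drops as more submissions overlap, which is why a client that streams jobs can see much better throughput than the single-job latency suggests.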


Regards -- Ian.



On Jul 22, 2008, at 4:24 AM, Alexander Beck-Ratzka wrote:

On Sunday, 20 July 2008 18:19:57 Ioan Raicu wrote:
Hi,
You are forgetting that in real Grid deployments, the majority of the
wait time will be in queue wait times in batch schedulers. For example,
in some logs I looked at from 2005 from SDSC, I recall seeing queue wait
times of 6 hours on average over a 1 year period. So, having some extra
latency on the order of 1~60 seconds is not a big deal when your average
job lengths are hours, or more.
This might be right for your use case. However, there are also other use cases
around in the grid world. We are running [EMAIL PROTECTED] as a task farming
application on the resources of D-Grid, and we consume about 100000
CPU hours per day. So it is really a production application. Because we are
submitting hundreds of jobs, the latency cannot be neglected, and it would be
really helpful to reduce it to below 1 second. If you look at
the network traffic caused by globusrun-ws -submit, you can see that there are a lot of
communication cycles (I think there are 9) between the submitting and the
execution host. Is this really necessary? SOAP only requires one...

So please note: there is no single "real Grid deployment" of the kind you've
described. I think this problem will become even more bothersome if a
scheduler such as Gridway comes into play.

Cheers

Alexander
