Leif and Brian,

Thanks a lot!

I tried the patched version.  It applies cleanly to 3.0.2.  All of the
delays that had me racking my brains appear to be GONE.  That was quite a
relief to see; my stress level is now considerably lower.

We have been working with 3.0.1 for some time now.  I could make a case
for moving to 3.0.2, but it would probably be rejected.  Also, Boss man
said something to the effect of "Do not fork the code, do not branch the
code, do not patch the code.  Sorry, no."  And, well, that's a pretty
sensible approach to take.

We have 3.0.1 running.  The performance of 3.0.2 without the patch is
worse: delays with 3.0.2 were 40ms across the board, whereas with 3.0.1
the delays were distributed between 10ms and 40ms (mostly 40ms), with
occasional larger outliers... Some 3.0.1 runs had a lot of delays over
40ms, some had none.

I tried recompiling with different constants....  Changing the 10ms delay
in either of the two places suggested below made no difference to the
latency patterns.  P_UnixEThread.h has one too... changing that didn't
help either.  There are more of them to try; any suggestions?  There are
many places where things get enqueued with 10ms delays.
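
For concreteness, the kind of edit I made looked like this (2ms here is
just an illustrative value I picked, not a recommendation):

    /* iocore/net/P_UnixNet.h -- dropped from 10ms to 2ms as an experiment */
    #define NET_RETRY_DELAY HRTIME_MSECONDS(2)

    /* iocore/cluster/P_ClusterCacheInternal.h -- same experiment */
    #define CACHE_RETRY_PERIOD HRTIME_MSECONDS(2)

If either of those were the delay I'm seeing, I'd expect the latency
clusters to shift off the 10ms multiples; they didn't budge at all.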

Leif, you suggested below that it might be a lock.  On one level that
seems like it might be right, but I am baffled as to why a VM would hit
lock contention when physical hardware would not.  Switching to the local
queue, as that patch does, got rid of the problem... so it is 99% likely
related to something happening during queueing/dequeuing.  It would be
nice to have a way to prove that's what it is, since handwaving doesn't
always cut it.
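
To convince myself the pattern at least matches lock-retry behavior, I put
together a little standalone sketch (plain C++, not ATS code; the 10ms
constant is just standing in for the retry delay) that mimics a
continuation try-locking a contended mutex and rescheduling itself 10ms
later on failure:

    // lock_retry_probe.cc -- standalone sketch, not ATS code.
    // Shows how try-lock + a fixed 10ms reschedule turns sub-millisecond
    // contention into waits quantized at ~10ms multiples.
    #include <chrono>
    #include <cstdio>
    #include <mutex>
    #include <thread>

    int main() {
      std::mutex m;
      using clk = std::chrono::steady_clock;

      // "Holder" thread: keeps grabbing the lock for ~1ms at a time.
      std::thread holder([&] {
        for (int i = 0; i < 100; ++i) {
          std::lock_guard<std::mutex> g(m);
          std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
      });

      // "Event" thread: like a continuation, it try-locks and, on
      // failure, reschedules itself 10ms later (the retry delay).
      for (int i = 0; i < 20; ++i) {
        auto start = clk::now();
        while (!m.try_lock())
          std::this_thread::sleep_for(std::chrono::milliseconds(10));
        m.unlock();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                      clk::now() - start).count();
        std::printf("waited %.1f ms\n", us / 1000.0);  // ~0, ~10, ~20, ...
      }
      holder.join();
      return 0;
    }

The waits come out as near multiples of 10ms even though the lock is only
ever held for about a millisecond, which is exactly the shape of the
delays I'm measuring.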

It seems to me that 10ms is a standard kernel clock-tick granularity (a
100Hz timer), so the timing of that might not come from anything coded
into ATS.
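
A quick way to check that from inside the guest (a throwaway probe,
nothing ATS-specific): ask for a 1ms sleep and see what you actually get.

    // tick_probe.cc -- measure the effective sleep/timer granularity.
    #include <chrono>
    #include <cstdio>
    #include <thread>

    int main() {
      using clk = std::chrono::steady_clock;
      for (int i = 0; i < 10; ++i) {
        auto t0 = clk::now();
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                      clk::now() - t0).count();
        std::printf("asked for 1ms, got %.2f ms\n", us / 1000.0);
      }
      return 0;
    }

If those numbers come back near 10ms rather than 1ms, the granularity is
coming from the guest kernel or the hypervisor, not from ATS.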

There aren't any bare-metal boxes available to try this on here in VM
land, at least not today, so I can't compare against real hardware yet.  I
hear that we are getting hardware soon... The claim you made below about
VM throughput got people's attention.

Thanks for your help,

James Whitaker


On 1/25/12 7:40 PM, "Leif Hedstrom" <[email protected]> wrote:

>On 1/25/12 7:34 PM, Whitaker, James X. -ND wrote:
>> works) and have a 512MB ramcache configured.... with 5Gig of RAM
>> allocated to the VM.  I shut off all logging, all peer caching...
>> anything that can hit the disks or introduce any kind of dependency,
>> and tell it to serve up a 100 byte chunk of cached data....
>
>Are you fetching a Range: request? Is the object large? There are known
>performance problems with fetching ranges of large objects.
>
>Or are you talking about chunked-encoded responses, with no
>Content-Length? Once in cache, we generally don't serve chunked encoding
>(responses get transformed to have a Content-Length: header once they're
>in the cache, since we now know the length).
>
>
>> rs between X2.0ms and X4.0ms for each request, where X is a random
>> tens place digit.  This suggests a 10ms granularity in a timer or a
>> reference clock.  Maybe VMware or CentOS only supports 10ms timer
>> granularity when something doesn't get immediately pulled off the
>> queue?
>
>10ms sounds like a retry / rescheduling of e.g. a lock. Perhaps try
>modifying any of
>
>iocore/net/P_UnixNet.h:
>    #define NET_RETRY_DELAY       HRTIME_MSECONDS(10)
>iocore/cluster/P_ClusterCacheInternal.h:
>    #define CACHE_RETRY_PERIOD    HRTIME_MSECONDS(10)
>
>
>There are a few other places where we have 10ms retry times too, but try
>changing one of the above and see if you see a difference. Assuming you
>can reproduce this easily, it ought to help a bit in isolating where
>things are going wrong.
>
>Also, you can try turning on the slow log in records.config; set it to
>some reasonably long time, and you can hopefully get some idea of where
>it's taking a long time to process the requests.
>>
>>
>> I'm rather stumped.  Does anyone there have ideas?  Is the answer "ATS
>> was not designed to run on VMs"?
>
>When you run on a bare metal box, do you not see the same problem? I
>can't say that we've spent any time trying to test (or optimize) for VMs.
>I've personally seen pretty abysmal performance out of e.g. VirtualBox,
>but it's the same for both ATS and Varnish, for example (typically, I see
>1/5th - 1/10th the throughput on a VM vs. bare metal).
>
>Cheers,
>
>-- leif
>
