On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
> Dave Angel wrote:
>
>> (Although I believe Seymour Cray was quoted as saying that virtual
>> memory is a crock, because "you can't fake what you ain't got.")
>
> If I recall correctly, disk access is about 10000 times slower than RAM, so
> virtual memory is *at least* that much slower than real memory.


It's so much more complicated than that, that I hardly know where to start. I'll describe a generic processor/OS/memory/disk architecture; there will be huge differences between processor models even from a single manufacturer.

First, as soon as you add swapping logic to your processor/memory-system, you theoretically slow it down. And in the days of that quote, Cray's memory was maybe 50 times as fast as the memory used by us mortals. So adding swapping logic would have slowed it down quite substantially, even when it was not swapping. But that logic is inside the CPU chip these days, and presumably thoroughly optimized.

Next, statistically, a program uses a small subset of its total program & data space in its working set, and the working set should reside in real memory. But when the program greatly increases that working set and it approaches the amount of physical memory, swapping becomes more frenzied, and we say the program is thrashing. Simple example: try sorting an array that's about the size of available physical memory.
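
Here's a rough sketch of that experiment, if you really want to try it. TARGET_BYTES is a placeholder you'd have to set yourself, at or a bit above your machine's physical RAM; be warned that actually running it will drive the whole machine into swapping.

    # Rough sketch of the "sort an array about the size of physical RAM" experiment.
    # WARNING: with TARGET_BYTES near or above your physical memory, this will
    # make the machine thrash.  TARGET_BYTES is a placeholder; set it yourself.
    import random
    import time

    TARGET_BYTES = 8 * 1024**3       # placeholder: roughly your RAM size
    BYTES_PER_ITEM = 32              # very rough cost of a list slot + float object in CPython
    N = TARGET_BYTES // BYTES_PER_ITEM

    data = [random.random() for _ in range(N)]   # working set ~ physical memory

    start = time.time()
    data.sort()                      # the sort ranges over the whole list, over and over
    print("sorted %d items in %.1f seconds" % (N, time.time() - start))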

Next, even physical memory is divided into a few levels of caching, some on-chip and some off. And the caching is done in what I call strips (usually called cache lines), where accessing just one byte causes the whole strip to be loaded from non-cached memory. I forget the current size for that, but it's maybe 64 to 256 bytes or so; 64 bytes is typical on current x86 processors.
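
As a toy illustration of those strips (assuming the common 64-byte line size), here's how byte addresses map onto them; touching any one byte drags the whole line in.

    # Toy illustration: which "strip" (cache line) a byte address falls in,
    # assuming a 64-byte line.  Accessing any one of these addresses causes
    # the whole line to be loaded from the next level down.
    LINE_SIZE = 64

    def cache_line(addr):
        """Return (line_number, first_byte, last_byte) for a byte address."""
        line = addr // LINE_SIZE
        first = line * LINE_SIZE
        return line, first, first + LINE_SIZE - 1

    for addr in (0, 1, 63, 64, 1000):
        print(addr, cache_line(addr))
    # 0, 1 and 63 all share line 0; 64 starts line 1; byte 1000 lives in line 15 (960..1023).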

If there are multiple processors (not multicore, but actual separate processors), then each one has such internal caches, and a write on one processor may have to trigger invalidations (or flushes) in the caches of all the other processors that happen to have the same strip loaded.
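
Here's a deliberately simplified model of that bookkeeping; nothing like real coherence hardware (MESI and friends are far more involved), just enough to show that a write by one processor costs the others their copy.

    # Very simplified model of cache-line invalidation between processors.
    LINE_SIZE = 64

    class ToyCPU:
        def __init__(self, name, all_cpus):
            self.name = name
            self.lines = set()          # line numbers currently held in this cache
            self.all_cpus = all_cpus

        def read(self, addr):
            self.lines.add(addr // LINE_SIZE)

        def write(self, addr):
            line = addr // LINE_SIZE
            self.lines.add(line)
            # every other CPU holding this line must drop (or flush) its copy
            for other in self.all_cpus:
                if other is not self and line in other.lines:
                    other.lines.discard(line)
                    print("%s's copy of line %d invalidated by %s"
                          % (other.name, line, self.name))

    cpus = []
    a, b = ToyCPU("cpu0", cpus), ToyCPU("cpu1", cpus)
    cpus.extend([a, b])
    a.read(100); b.read(100)    # both cache the line holding byte 100
    a.write(101)                # a write anywhere in that line invalidates cpu1's copy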

The processor not only prefetches the next few instructions, but decodes and tentatively executes them, subject to the results being discarded if a conditional branch doesn't go the way the processor predicted. So some instructions execute in zero time, some of the time.
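
The standard way to see that prediction at work is to run the same branchy loop over sorted and unsorted data. In compiled code the difference is dramatic; in CPython the interpreter's own overhead hides most of it, so treat this as a sketch of the experiment rather than a promise about the numbers.

    # Classic branch-prediction experiment: same data, same branch, but sorted
    # input makes the branch predictable.  CPython overhead may swamp the effect;
    # the numbers are illustrative only.
    import random
    import timeit

    data = [random.randrange(256) for _ in range(10**6)]

    def count_big(values):
        total = 0
        for v in values:
            if v >= 128:        # this branch is what the predictor bets on
                total += 1
        return total

    unsorted_t = timeit.timeit(lambda: count_big(data), number=5)
    sorted_data = sorted(data)
    sorted_t = timeit.timeit(lambda: count_big(sorted_data), number=5)
    print("unsorted: %.2fs   sorted: %.2fs" % (unsorted_t, sorted_t))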

Every address of instruction fetch, or of data fetch or store, goes through a couple of layers of translation. Segment register plus offset gives the linear address. That gets looked up in page tables to get the physical address, and if the table entry happens not to be in an on-chip cache, it has to be fetched (or even swapped in). If the physical address isn't valid, a processor exception (a page fault) causes the OS to potentially swap something out, and something else in.
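
The arithmetic of the paging part is simple enough to show directly. This assumes the common 4 KiB page size; real x86-64 hardware splits the page number further into several table indexes, but the page-number / offset split is the essential idea.

    # How a linear (virtual) address is split for paging, assuming 4 KiB pages.
    PAGE_SIZE = 4096          # 2**12

    def split_address(linear_addr):
        page_number = linear_addr // PAGE_SIZE   # looked up in the page tables
        offset = linear_addr % PAGE_SIZE         # byte within the 4 KiB page
        return page_number, offset

    addr = 0x7f3a12345678
    page, off = split_address(addr)
    print("address %#x -> page %#x, offset %#x" % (addr, page, off))
    # If that page's table entry is marked "not present", the CPU raises a
    # page fault and the OS has to page something in (and maybe something out).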

Once we're paging from the swapfile, the size of the read is perhaps 4k, and that read happens regardless of whether we're going to use only one byte of it or all of it.

The ratio between an access which was in the L1 cache and one which required a page to be swapped in from disk? Much bigger than your 10,000 figure. But hopefully it doesn't happen a big percentage of the time.
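
Rough, commonly quoted ballpark figures make the point; these are order-of-magnitude numbers, not measurements of any particular machine.

    # Order-of-magnitude latencies, in nanoseconds (ballpark figures only).
    L1_HIT     = 1            # ~1 ns
    RAM_ACCESS = 100          # ~100 ns
    SSD_READ   = 100000       # ~100 microseconds
    DISK_SEEK  = 10000000     # ~10 ms for a spinning disk

    print("RAM  vs L1 :", RAM_ACCESS // L1_HIT)      # ~100x
    print("disk vs RAM:", DISK_SEEK // RAM_ACCESS)   # ~100,000x
    print("disk vs L1 :", DISK_SEEK // L1_HIT)       # ~10,000,000x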

There are many, many other variables, like the fact that RAM chips are not directly addressed by bytes, but are instead organized in rows and columns. So if you access many bytes in the same row, it can be much quicker than random access. So simple access-time specifications don't mean as much as they would seem; the controller has to balance the RAM spec with the various cache requirements.
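
A crude way to feel that locality from Python: sum the same elements in sequential order and in random order. Whatever gap you measure mixes CPU cache, TLB and DRAM row effects together, and CPython overhead shrinks it, so it's only suggestive.

    # Crude locality demonstration: same elements, sequential vs random order.
    import array
    import random
    import time

    N = 4 * 1000 * 1000                  # increase N if the gap is too small to see
    data = array.array('d', (float(i) for i in range(N)))

    seq_index = list(range(N))
    rand_index = seq_index[:]
    random.shuffle(rand_index)

    def timed_sum(indexes):
        start = time.time()
        total = 0.0
        for i in indexes:
            total += data[i]
        return time.time() - start

    print("sequential: %.2fs" % timed_sum(seq_index))
    print("random    : %.2fs" % timed_sum(rand_index))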
--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list
