Re: [ql-users] efficient buffer size
Robert Newson wrote: [...] When testing out the various versions of grep (grep, egrep, fgrep) on Unix, I ran the timings at least twice - first to ensure that the command and data were in the memory cache (and ignored that time), and then a second (or more) time(s) for the actual timing. Including the first run would skew the comparison, as the data for the second program would already be cached after testing the first program; besides, I preferred to do several timings and take the mean (the first timing would be larger because the program was not yet cached). [If you're interested, I found egrep to be the fastest for the grepping I do.]

Perhaps that is because the egrep you tried uses a DFA RE engine. DFA engines are generally slower to compile the REs, but faster in operation, than the NFA engines used in many grep implementations (the slowdown is especially noticeable with POSIX NFA engines because they must try all possible matches). Interestingly, I've often found that fgrep is sometimes slower than running egrep with the same parameters, which is odd because it is named f(ast)grep and was clearly intended for pure string searches. In theory I suppose pure, i.e. non-RE, string searches ought to be faster than RE searches, but clearly not by much in this case:

$ time egrep susp .bash_profile
susp() ## print the POSIX man page for $1 to $1.posix
export -f l path substr sus wd awd susp

real    0m1.487s
user    0m0.000s
sys     0m0.000s

$ time fgrep susp .bash_profile
susp() ## print the POSIX man page for $1 to $1.posix
export -f l path substr sus wd awd susp

real    0m1.269s
user    0m0.000s
sys     0m0.000s
$

As you say, a lot depends on the type of searching that you are doing and the system that you are using.
--
Peter S Tillier
Who needs perl when you can write dc, sokoban, arkanoid and an unlambda interpreter in sed?
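The discard-the-first-run, average-the-rest methodology described above can be sketched in Python (a hypothetical illustration, not from the original post; the `work` callable stands in for whatever command is being timed):

```python
import time
import statistics

def time_runs(work, runs=5):
    """Time `work` several times; discard the first (cold-cache) run
    and return the mean of the remaining (warm-cache) timings."""
    timings = []
    for _ in range(runs + 1):  # one extra run to warm the cache
        start = time.perf_counter()
        work()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings[1:])  # ignore the first, cold run

# Example: timing a plain substring scan over some in-memory data.
data = "susp() ## print the POSIX man page\n" * 10000
mean = time_runs(lambda: data.count("susp"))
print(f"mean warm-run time: {mean:.6f}s")
```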
Re: [ql-users] efficient buffer size
P Witte wrote: Robert Newson writes: [If you're interested, I found egrep to be the fastest for the grepping I do.] I am interested. Was it you I corresponded with about a multi-file version of grep? The current versions for the QL will only allow one file at a time, which makes it rather inefficient for scanning a whole directory or disk. Per

Really? I'm surprised. I posted a port of gsed 2.05 that runs under Adrian Ives' shell on the QL and from the Basic command line. AFAICR it accepts wildcard characters from either, using the wildcard expansion code that Dave Walker's C68 provides where necessary. I ported this version because the one supplied with C68 did not do all that I wanted. I also ported Brian Kernighan's awk because there was no port available for the QL.

Peter S Tillier
Who needs perl when you can write dc, sokoban, arkanoid and an unlambda interpreter in sed?
RE: [ql-users] efficient buffer size
History indeed! I used to use the 8K space between 8192 and 16384 as storage space. It was a ROM shadow area, but it never gave me any problems. I even managed to run machine code programs from there as well.

Cheers, Norman.
-
Norman Dunbar Database/Unix administrator Lynx Financial Systems Ltd. mailto:[EMAIL PROTECTED] Tel: 0113 289 6265 Fax: 0113 289 3146 URL: http://www.Lynx-FS.com
-
-----Original Message-----
From: Marcel Kilgus [mailto:[EMAIL PROTECTED]] Sent: Monday, June 23, 2003 12:50 PM To: ql-users Subject: Re: [ql-users] efficient buffer size

Norman Dunbar wrote: In my day it was a ZX-81 with 1KB of memory - every byte counted then !! Despite my age this is where I started, too. But that's just history now. Marcel

This email is intended only for the use of the addressees named above and may be confidential or legally privileged. If you are not an addressee you must not read it and must not use any information contained in it, nor copy it, nor inform any person other than Lynx Financial Systems or the addressees of its existence or contents. If you have received this email and are not a named addressee, please delete it and notify the Lynx Financial Systems IT Department on 0113 2892990.
Re: [ql-users] efficient buffer size
Lau wrote: Sounds like fun. I guess it's a little warmer than home... We currently have 38°C (100°F, 311 K) in southern Germany, it can't possibly be much hotter than here ;-)

Q. What's the longest monosyllabic word? (Clue: ryrira yrggref)

Has it something to do with a certain small wood creature (note that wood is not capitalised here ;-))? That's the only one I found that fits your clue. Marcel
RE: [ql-users] efficient buffer size
I think yesterday we got 41°C in France ...

-----Original Message-----
From: Marcel Kilgus [mailto:[EMAIL PROTECTED]] Sent: Monday, 23 June 2003 15:23 To: ql-users Subject: Re: [ql-users] efficient buffer size

Lau wrote: Sounds like fun. I guess it's a little warmer than home... We currently have 38°C (100°F, 311 K) in southern Germany, it can't possibly be much hotter than here ;-)
Re: [ql-users] efficient buffer size
False: every bit was precious :) CU

In my day it was a ZX-81 with 1KB of memory - every byte counted then !! :o)
-
Norman Dunbar Database/Unix administrator Lynx Financial Systems Ltd. mailto:[EMAIL PROTECTED] Tel: 0113 289 6265 Fax: 0113 289 3146 URL: http://www.Lynx-FS.com
Re: [ql-users] efficient buffer size
Marcel Kilgus wrote: Norman Dunbar wrote: In my day it was a ZX-81 with 1KB of memory - every byte counted then !! Despite my age this is where I started, too. But that's just history now. Marcel

My very first working true computer setup at home was strictly experimental: a Motorola 6800 and 256 bytes. Yup, 256 bytes - the standard setup came with 128 bytes cache and 128 bytes RAM; I added the second RAM chip to bring it up to 256 bytes. Strictly a machine language programming effort.
--
Paul Holmgren Hoosier Corps #33, L-6 2 57 300-C's in Indy
Re: [ql-users] efficient buffer size
How about a puzzle thread on here?
Q. What's the longest monosyllabic word? (Clue: ryrira yrggref)

We don't all speak Welsh you know :o) You'd have a hope. As W and Y are vowels in Welsh, you can't even use them to make your monosyllables longer! Now where did I put Geoff's Solvit Plus...
--
Dilwyn Jones
Re: [ql-users] efficient buffer size
ZN wrote: On 6/21/2003 at 11:35 PM Lau wrote: Back to my earlier mention of caching... hard drives and their controllers do caching as well. I'm not certain if they do read-ahead caching. In short, yes.

Ta. I'll add a little proviso: the hardware can't know what the next logical sector (allocation unit) for a file is, hence the need for a file system to avoid fragmentation. Unix (et alia) actually also does (did?) software read-ahead. It assumed that when you read from a file you would want to read more of the same, so it initiated the read of the next sector (or whatever) as soon as it gave you the current one.
--
Lau http://www.bergbland.info Get a domain from http://oneandone.co.uk/xml/init?k_id=5165217 and I'll get the commission!
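The software read-ahead described above can be sketched as a toy cache (purely illustrative, not how any real kernel is written; the 512-byte block size is an arbitrary assumption):

```python
import io

BLOCK = 512  # assumed block size for illustration

class ReadAhead:
    """Toy sequential read-ahead: whenever block n is requested,
    block n+1 is fetched speculatively, so it is already cached by
    the time a sequential reader asks for it."""
    def __init__(self, f):
        self.f = f
        self.cache = {}

    def _fetch(self, n):
        if n not in self.cache:
            self.f.seek(n * BLOCK)
            self.cache[n] = self.f.read(BLOCK)
        return self.cache[n]

    def read_block(self, n):
        data = self._fetch(n)
        self._fetch(n + 1)  # speculative read of the *next* block
        return data

f = io.BytesIO(bytes(range(256)) * 8)  # 2048 bytes = 4 blocks
ra = ReadAhead(f)
first = ra.read_block(0)
print(1 in ra.cache)  # block 1 was prefetched before being asked for
```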
Re: [ql-users] efficient buffer size
P Witte wrote: ... That was the significance of 2^n size

no.  s           remarks
---  ----------  -------
x:   xxx xxx xx  Primer run ;)

A first-time run to ensure that any caching triggered by the first run had already happened, so that all subsequent runs started from a similar cached state?

When testing out the various versions of grep (grep, egrep, fgrep) on Unix, I ran the timings at least twice - first to ensure that the command and data were in the memory cache (and ignored that time), and then a second (or more) time(s) for the actual timing. Including the first run would skew the comparison, as the data for the second program would already be cached after testing the first program; besides, I preferred to do several timings and take the mean (the first timing would be larger because the program was not yet cached). [If you're interested, I found egrep to be the fastest for the grepping I do.]
Re: [ql-users] efficient buffer size
P Witte wrote: Your explanation reminded me that a considerable amount of buffering is already going on (the hard disk, Windoze, and Smsq). iof.load is possibly not much more efficient under those circumstances than iob.fmul.

Yes, it's not that much of a difference anymore. In the past iof.load was much faster on machines with a lot of RAM because it didn't invoke the slaving mechanism. Now, with slaving limited to only a small RAM area, the data is effectively just copied around one more time in the case of iob.fmul.

But I thought that PIO mode hard disks, the current norm, actually pushed the data into memory with barely any intervention from the CPU.

PIO mode is the slow stuff where the CPU has to fetch the data. Ultra DMA does its writes directly at the location the data is needed.

Hehe, you're probably right, though I think I'll rely on my test results in this particular case. I suppose my real question was whether there is some sweet buffer size pertaining to Qdos/Smsq that minimises fiddly edge conditions and the like.

It mostly depends on what you're doing. But I'd use a nice 16k or 32k buffer for most purposes. Marcel
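The 16k-or-32k suggestion translates into a loop like the following (a generic Python sketch rather than QDOS machine code; `scan_file` and `process` are hypothetical names):

```python
BUFSIZE = 32 * 1024  # 32k, per the suggestion above

def scan_file(path, process):
    """Read `path` in chunks of at most BUFSIZE bytes and hand
    each chunk to `process`; return the total bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(BUFSIZE)  # at most one buffer's worth
            if not chunk:            # empty read means end of file
                break
            total += len(chunk)
            process(chunk)
    return total

# Usage: count newlines in a file without loading it whole.
# counts = []
# scan_file("some_file", lambda c: counts.append(c.count(b"\n")))
```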
Re: [ql-users] efficient buffer size
On Mon, 23 Jun 2003 at 00:43:09, Lau wrote: (ref: [EMAIL PROTECTED]) My suggestion of one byte buffers was a little facetious (one of the two words was the five vowels in order - there's one with them reversed). 'was' - 'with'

Ah, that is worth remembering as it helps spell the damn word (8-) I have to be beardless as I can't find hash on this Malaysian machine. I am in Kuala Lumpur setting up hardware for worldnews.com. There is no web server on the Windows machine with my email program. I am using VNC to get at the Windows machine via the worldnews VPN using 192.168 addressing. Pretty good. Last night I was also listening to the Archers while working on my home machine. Almost as good as being home (8-) Still, food is cheaper here - 55p for a large lunch of chicken, broth, green veg etc (what we might call soup here), a second bowl of noodles and other things on the side. They have a 1 ringgit note (15p).

subcontinental uncomplimentary duoliteral abstemious arterious annelidous

Do I win a prize? Lau - remember the BBC puzzle panel where I came up with more answers than they expected as well (8-) and you won the following week.

How about the geometric mean of our two responses - that would be 128 bytes - but even better would be 124 bytes, which will have long alignment going for it and will certainly save some code (you can do it with MOVEQ).

Great to see you still thinking this way Lau. You are the master of code saving. I will never forget your use of JMP instead of JSR in bp.init. How many bytes was your Forth (and a good implementation so you said) - 4k?
--
QBBS (QL fido BBS 2:252/67) +44(0)1442-828255 tony@surname.co.uk http://www.firshman.co.uk Voice: +44(0)1442-828254 Fax: +44(0)1442-828255 TF Services, 29 Longfield Road, TRING, Herts, HP23 4DG
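The geometric-mean remark works out as follows, assuming (my reading of the thread) that the two responses were Lau's one-byte buffer and a 16384-byte buffer:

```python
import math

low, high = 1, 16384        # one-byte buffer vs a 16k buffer
gm = math.sqrt(low * high)  # geometric mean
print(gm)                   # 128.0

# Why 124 rather than 128 is attractive: it is still long-word
# aligned (124 % 4 == 0) and small enough to fit in a 68000 MOVEQ
# immediate, whose range is -128..127.
print(124 % 4 == 0, -128 <= 124 <= 127)
```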
Re: [ql-users] efficient buffer size
On 21 Jun 2003, at 2:41, P Witte wrote: (...) Yes, that is understood. It is situations where the whole file cannot be read at once that I'm thinking about. (Besides, on a multitasking machine it is probably not very polite to grab huge buffers ;) (...)

Oh well, if you start worrying about being polite to other programs... :-) I'd still simply grab as much memory as I can use. If speed is of the essence, as you said in your requirements, then the user will probably also know to let the machine alone (tell him!) and not have too many other progs trying to get memory at the same time. If not, then speed is not that essential after all. So I'd still go for as much memory as I can get and read in the entire file.

If that can't be done (not enough space): ultimately, it will then be the read operations that slow everything down. Now, considering that iob.fmul/io.fstrg use D2 to indicate how many bytes they should get, and since D2 can only be word-sized, you can, at most, read $7fff bytes in one go. If nothing else, I'd use that as my buffer size.

Wolfgang - www.wlenerz.com
Re: [ql-users] efficient buffer size
P Witte wrote: snip As far as I know, nothing my program does should be affected by the size of the buffer, apart from filling it in the first place. So my findings would seem to indicate that a buffer size of between 256 bytes! and 1k is optimal for this kind of thing. This is strange enough, considering that iob.fmul is called more frequently the smaller the buffer. What surprises me is why we're not seeing the benefits of iof.load in this (or at least I don't). Anyone got a theory?

I was watching this thread with interest. Maybe I should have commented earlier... the results you have obtained are exactly what I was expecting. If you had run your test on a microdrive... you would have found that scatter load for reading in files in their entirety may have had some effect (on a 68008, where the cost of copying the data might be noticeable). I'm not sure that scatter load is implemented in any floppy driver, although it could be done. It could easily cut the time to load a file by a factor of three on a standard sort of floppy. That's because of the interleave. (The logical to physical sector mapping does every third sector... the idea being that you get two sectors' worth of time between writing one sector and the next, to get your data sorted for writing. A consideration which is pretty much wasted when reading a file!)

You mention doing unspeakable things to the contents of the files. If that means doing an *enormous* amount of processing, then the buffer size and disk transfer times will always be irrelevant. Indeed, even if you only did a byte search through the data for a single fixed-value byte, using Basic, that would swamp the buffer copy by a factor of fifty at least, at a guess.

Then there's caching... if you had enough memory, your test would produce some curious results... whichever buffer size you ran first would seem rather slow compared to the rest.
As you ran your tests on your hard disk, the caching effect doesn't come into it, but I thought I'd mention it. Finally, it all depends on how fast your processor and hard disk are. These days, processors tend to have a lot of cycles available per byte transferred from (or to) disk. That's the reason why compressed hard disk partitions are actually *faster* to handle than uncompressed ones: the processor has lots of spare time on its hands to perform the (de)compression. Overall time to process a file becomes tied mainly to the raw disk access time needed to get all the data on/off the disk. A compressed file occupies about half the physical space, so it can be handled in about half the overall time.

Back to my earlier mention of caching... hard drives and their controllers do caching as well. I'm not certain if they do read-ahead caching.

In summary, I believe your results are just showing the sheer time it takes the drive to spin on its axis (tracks * cylinders * interleave revolutions), and have very little to do with any CPU processing.

Answer to your original question: How big a buffer should I use?... one byte?
--
Lau http://www.bergbland.info Get a domain from http://oneandone.co.uk/xml/init?k_id=5165217 and I'll get the commission!
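The interleave point can be put into rough numbers (a crude model with assumed figures - 300 rpm and the interleave values from the discussion - not measurements of any real drive):

```python
RPM = 300                 # assumed floppy rotation speed
rev_time = 60.0 / RPM     # seconds per revolution (0.2 s at 300 rpm)

def track_read_time(interleave):
    """Approximate time to read one full track sequentially: with
    interleave n, consecutive logical sectors sit n physical sectors
    apart, so reading the whole track takes roughly n revolutions."""
    return interleave * rev_time

print(track_read_time(1))  # no interleave: one revolution per track
print(track_read_time(3))  # interleave 3: three revolutions per track
```

This is why a scatter load, which takes sectors in whatever order they pass the head, could approach the factor-of-three saving mentioned above.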
Re: [ql-users] efficient buffer size
On 6/21/2003 at 11:35 PM Lau wrote: Back to my earlier mention of caching... hard drives and their controllers do caching as well. I'm not certain if they do read-ahead caching.

In short, yes. Even older IDE drives with sufficient buffer memory at least attempt to always read in the whole track if given time (no requests for sectors from another track within the time needed for a full revolution). However, the definition of a 'track' can vary - logical tracks (as addressed by the CPU) have very little to do with the physical actuality, as most drives since the era of 40Mb drives have constant linear bit density - i.e. outer tracks, being of a larger circumference, have more sectors. The drive does the translation into a uniform sectors-per-track topology. Most drives do read-ahead in terms of physical tracks, but in some cases (such as small buffers or odd translation schemes) will work in terms of logical tracks. In any case, the actual mechanism is hardly important and the implementation is left to the drive manufacturer.
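The logical-to-physical translation described above is the classic CHS-to-LBA mapping, sketched here with arbitrary assumed geometry figures (the drive then maps the linear address onto its real zoned layout internally):

```python
HEADS = 16               # assumed logical geometry
SECTORS_PER_TRACK = 63   # CHS sectors are numbered from 1, not 0

def chs_to_lba(cylinder, head, sector):
    """Map a logical cylinder/head/sector address to a linear
    block address (LBA)."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

print(chs_to_lba(0, 0, 1))  # first sector on the disk
print(chs_to_lba(0, 1, 1))  # first sector under the next head
print(chs_to_lba(1, 0, 1))  # first sector of the next cylinder
```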
Re: [ql-users] efficient buffer size
Wolfgang writes: A question: A program uses io.fstrg/iob.fmul to load files in smaller chunks for scanning. The files could be of any size on any media (first of all hard disks). What, theoretically, is the smallest efficient buffer size to use? (I'm thinking *speed* here.) Eg 512 bytes, as a whole sector can be loaded in at once? Or allocation unit size? Or any arbitrary size that best suits my program?

If you're thinking speed, then the larger the buffer, the better - reading the data in small chunks will always cost more time. If at all possible, use a buffer for the entire file and (scatter) read it in.

Yes, that is understood. It is situations where the whole file cannot be read at once that I'm thinking about. (Besides, on a multitasking machine it is probably not very polite to grab huge buffers ;)

Sector (512 byte) sized buffers don't make that much sense IMHO, since the file data doesn't occupy the whole of the first sector (there's the file header), so reading the first 512 bytes from a file will read from 2 separate sectors.

This brings us to the heart of the question: What would be a sensible size? First one block of 512-64 bytes and then subsequent blocks of 512 bytes (or multiples thereof)? Per