Re: RE the large file challenge

2002-11-16 Thread Richard Gaskin
Sadhunathan Nadesan wrote: > > For an interesting read on security and high level languages, this is fun: > > http://m.bacarella.com/papers/secsoft/html Great article -- thanks for posting that! -- Richard Gaskin Fourth World Media Corporation Developer of WebMerge 2.0: Publish any databa

RE the large file challenge

2002-11-16 Thread Sadhunathan Nadesan
| Maybe we need a new name for what Transcript does. | | Transcript pre-processes scripts into pointer-based bytecode, which | generally outperforms purely interpreted xTalk by anywhere from several | times to a few orders of magnitude. | Maybe? This is an excellent clarification. If MC is s

Transcript name (was the large file challenge)

2002-11-16 Thread FlexibleLearning
Richard: Maybe we need a new name for what Transcript does. Transcript pre-processes scripts into pointer-based bytecode, which generally outperforms purely interpreted xTalk by anywhere from several times to a few orders of magnitude. What about 'pre-processed'? 'Tokenised' sounds like 'Notion

Re: the large file challenge

2002-11-15 Thread Richard Gaskin
Sadhunathan Nadesan wrote: > I think it has come to light that MC holds its own with compiled > languages. That was where this whole thing began, I was explaining > to Swami that MC is not a compiled language, then Scott kinda said, so > what, there is not that much difference between compiled a

RE: the large file challenge

2002-11-15 Thread Sadhunathan Nadesan
| | Message: 1 | Date: Thu, 14 Nov 2002 10:39:01 -0700 | Subject: RE: the large file challenge | From: John Vokey <[EMAIL PROTECTED]> | To: [EMAIL PROTECTED] | Reply-To: [EMAIL PROTECTED] | | To be fair: most of metacard is coded in metatalk; it is a | boot-strapped language, much like m

RE: the large file challenge

2002-11-14 Thread John Vokey
To be fair: most of metacard is coded in metatalk; it is a boot-strapped language, much like many of the TILs (threaded interpreted languages) of yesteryears (e.g., forth, apl). On Thursday, November 14, 2002, at 10:01 AM, [EMAIL PROTECTED] wrote: | MC, as well, is also coded in C, so in many

Re: the large file challenge

2002-11-14 Thread Pierre Sahores
Sadhunathan Nadesan wrote: > > Ok, here are the results so far, > > bash > Sun Nov 10 13:01:59 PST 2002 > 17333 > Sun Nov 10 13:03:43 PST 2002 > > pascal > Sun Nov 10 13:03:43 PST 2002 > 17333 > Sun Nov 10 13:05:47 PST 2002 > > andu's metacard > Sun Nov 10 13:05:47 PST 2002 > 29623 > Sun Nov

RE: the large file challenge

2002-11-13 Thread Sadhunathan Nadesan
| Actually, this says more about your specific implementation of the algorithm | and/or the quality of your compiler than it does about the relative speed | merits of any given language. As in your bash example, the bash shell | actually calls functions from libraries of well written highly optimiz

RE: the large file challenge

2002-11-12 Thread Yates, Glen
> Here's the latest round of times > > > bash 1:44 > pascal 2:04 > C 2:28 > MC 2:10 > > goodness, C is slowest of all?!? Actually, this says more about your specific implementation of the algorithm and/or the quality of your compiler than it does about the relative speed merits of any given lan

the large file challenge

2002-11-12 Thread Sadhunathan Nadesan
| >>> So! MC is about as fast as Pascal! Isn't it great? And, thanks again | >>> to Scott, for that too! | >> | >> It's enough to make a Java programmer cry. ;) | > | > Java? Help me to remember... Are you speaking, Richard, about this | > dead marketed toy that crashes any time he search

Re: the large file challenge

2002-11-10 Thread Yennie
All right... I tweaked a little more outside of email. For accuracy in the case where "mystic_mouse" occurs multiple times on one line, uncomment the line: "add offset(return, thisChunk, theOffset) to theOffset" This just skips to the next line whenever a match is found. This should run faster t
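The "skip to the next line whenever a match is found" idea can be illustrated outside of Transcript. The following is only a sketch in Python, not the script from this message (which is cut off in the archive); the function name `count_matching_lines` is made up for illustration. It counts each line at most once, however many times the search string appears on it.

```python
def count_matching_lines(text: str, needle: str) -> int:
    """Count the lines that contain needle, counting each line at most once."""
    count = 0
    pos = 0
    while True:
        hit = text.find(needle, pos)
        if hit == -1:
            break  # no further matches anywhere in the text
        count += 1
        # Jump past the end of the current line so that a second match
        # on the same line is not counted again.
        newline = text.find("\n", hit)
        if newline == -1:
            break  # the match was on the last line
        pos = newline + 1
    return count
```

Dropping the jump to the next newline and resuming the search right after each match would count every occurrence instead of every matching line.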

Re: the large file challenge

2002-11-10 Thread Scott Raney
On Sun, 10 Nov 2002 Richard Gaskin <[EMAIL PROTECTED]> wrote: > My hunch is that reading for lines is slower than reading a > specified number of chars, since with lines it needs to evaluate > each incoming character to determine if it's a return -- Scott, am I > right or should they be about the

Re: the large file challenge

2002-11-10 Thread Yennie
For mine... if you are not concerned so much about the exact number then change: "read from file the_file for numLines lines" to "read from file the_file for numLines" for a big speedup. and up the chunkSize to something closer to your available memory, for example (8*1024*1024) = 8MB. As for

Re: the large file challenge

2002-11-10 Thread andu
--On Sunday, November 10, 2002 13:21:04 -0800 Sadhunathan Nadesan <[EMAIL PROTECTED]> wrote: Here's another try for whatever it's worth. I tested it on a file with 7000 lines of about 800k and it takes less than a sec: on startup put 0 into tCount put "mystic_mouse" into tWord put empty in

Re: the large file challenge

2002-11-10 Thread Pierre Sahores
Sadhunathan Nadesan wrote: > > I got this suggestion from Jeanne A. E. DeVoto ~ [EMAIL PROTECTED] > > repeat > read from stdin until "mystic_mouse" > if the result is not empty then add 1 to the_counter -- found it > else exit repeat -- encountered end of file, no more occurrenc

Re: the large file challenge

2002-11-10 Thread Pierre Sahores
Sadhunathan Nadesan wrote: > > Ok, here are the results so far, > > bash > Sun Nov 10 13:01:59 PST 2002 > 17333 > Sun Nov 10 13:03:43 PST 2002 > > pascal > Sun Nov 10 13:03:43 PST 2002 > 17333 > Sun Nov 10 13:05:47 PST 2002 > > andu's metacard > Sun Nov 10 13:05:47 PST 2002 > 29623 > Sun Nov

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
Ok, here are the results so far, bash Sun Nov 10 13:01:59 PST 2002 17333 Sun Nov 10 13:03:43 PST 2002 pascal Sun Nov 10 13:03:43 PST 2002 17333 Sun Nov 10 13:05:47 PST 2002 andu's metacard Sun Nov 10 13:05:47 PST 2002 29623 Sun Nov 10 13:08:10 PST 2002 pierre's metacard Sun Nov 10 13:08:10 PST
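The run times follow from the `date` stamps printed before and after each run. As a small worked example (a Python sketch; the helper name `elapsed` is hypothetical, and it assumes both stamps fall on the same day), the three complete pairs of stamps above give:

```python
from datetime import datetime

def elapsed(start: str, end: str) -> str:
    """Return the difference between two HH:MM:SS stamps as M:SS."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes}:{seconds:02d}"

# Time-of-day portions of the stamps quoted in the message above.
print("bash  ", elapsed("13:01:59", "13:03:43"))  # 1:44
print("pascal", elapsed("13:03:43", "13:05:47"))  # 2:04
print("andu  ", elapsed("13:05:47", "13:08:10"))  # 2:23
```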

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
I got this suggestion from Jeanne A. E. DeVoto ~ [EMAIL PROTECTED] repeat read from stdin until "mystic_mouse" if the result is not empty then add 1 to the_counter -- found it else exit repeat -- encountered end of file, no more occurrences end repeat put the_counter But I was no

Re: the large file challenge

2002-11-10 Thread Richard Gaskin
Sadhunathan Nadesan wrote: > By golly, that would be I think the "conventional wisdom" alright! > > Another myth goes by the wayside? :-) > > Of course, now the C programmers will probably come out of > the closet. Not if Tom Pittman is around. I've never seen objective data on the subject,

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
| > So that is 1:53 for bash, 2:04 for pascal, and 2:19 for MC. darn good! | | But golly, I thought an "interpreted" language like MetaTalk was supposed to | be slow, certainly much slower than compiled Pascal. | | :) | By golly, that would be I think the "conventional wisdom" alright! Anothe

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
| ># repeat for each line this_line in the_text | ># if (not eof) then | ># if (this_line contains "mystic_mouse") then | ># put the_counter + 1 into the_counter | ># end if | ># end if | ># end repeat | | > close file the_file | Allo Sadhu, | |

Re: the large file challenge

2002-11-10 Thread Richard Gaskin
Sadhunathan Nadesan wrote: > | > | One last note: > | > | Be careful of using > | > | If you do not read for "lines", you run the risk of cutting a line in half > on > | the spot where your magic string occurs. > | > | So always use > | > | HTH. > | Brian > | > > Good point. For this pa

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
| | One last note: | | Be careful of using | | If you do not read for "lines", you run the risk of cutting a line in half on | the spot where your magic string occurs. | | So always use | | HTH. | Brian | Good point. For this particular use of the program a close count is

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
| | I'm pretty sure the problem with speed here is from reading in the entire | file. | Unless of course you have enough free RAM- but that's hard to imagine when | the files are 300MB+. | | How about this, which you can adjust to read any given number of lines at a | time. | Try it with 10, 1

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
| If we're allowed to read the whole thing into RAM and the goal is to count | the occurrences of the string "mystic_mouse", then to optimize speed we can | just remove the redundant read commands and use offset to search for us: | | #!/usr/local/bin/mc | on startup | put "/gig/tmp/log/xaa" into

Re: the large file challenge

2002-11-10 Thread Sadhunathan Nadesan
| I'm confused: if the point is to avoid reading the entire file into memory, | isn't that what line 8 does? And if it's already in memory, why is it read | again inside the loop? | | I think I missed something from the original post Hi Sorry, yes you missed something but not from the or

Re: the large file challenge

2002-11-10 Thread Richard Gaskin
Pierre Sahores wrote: > Richard Gaskin wrote: >> >> Pierre Sahores wrote: >> >>> So! MC is about as fast as Pascal! Isn't it great? And, thanks again >>> to Scott, for that too! >> >> It's enough to make a Java programmer cry. ;) > > Java? Help me to remember... Are you speaking, Richard,

Re: the large file challenge

2002-11-09 Thread Pierre Sahores
Richard Gaskin wrote: > > Pierre Sahores wrote: > > > So! MC is about as fast as Pascal! Isn't it great? And, thanks again > > to Scott, for that too! > > It's enough to make a Java programmer cry. ;) Java? Help me to remember... Are you speaking, Richard, about this dead marketed toy t

Re: the large file challenge

2002-11-09 Thread Richard Gaskin
Pierre Sahores wrote: > So! MC is about as fast as Pascal! Isn't it great? And, thanks again > to Scott, for that too! It's enough to make a Java programmer cry. ;) -- Richard Gaskin Fourth World Media Corporation Developer of WebMerge 2.0: Publish any database on any site

Re: the large file challenge

2002-11-09 Thread Pierre Sahores
Sannyasin Sivakatirswami wrote: > > Om Sadhunathan: > > Excellent! i had been thinking that we should probably save access logs from our >servers in honolulu, but then parsing those was a blind spot. This will help >immensely. > > Now, do i read this to say that there were 17,338 attempts to l

Re: the large file challenge

2002-11-09 Thread Sannyasin Sivakatirswami
Om Sadhunathan: Excellent! i had been thinking that we should probably save access logs from our servers in honolulu, but then parsing those was a blind spot. This will help immensely. Now, do i read this to say that there were 17,338 attempts to look at Mystic Mouse PDF's ? and if so, over what

Re: the large file challenge

2002-11-09 Thread Richard Gaskin
Sadhunathan Nadesan wrote: > So that is 1:53 for bash, 2:04 for pascal, and 2:19 for MC. darn good! But golly, I thought an "interpreted" language like MetaTalk was supposed to be slow, certainly much slower than compiled Pascal. :) -- Richard Gaskin Fourth World Media Corporation Develope

Re: the large file challenge

2002-11-09 Thread Sadhunathan Nadesan
Wow, Just logged on to work and saw all the great responses. Thanks all, what fun. Anyway I will respond to each later and try your code too. I have to run right now, appointment. I did however have some code from Andu via Swami that I modified somewhat and got enormous speed improvement. Her

Re: the large file challenge

2002-11-09 Thread Pierre Sahores
Sadhunathan Nadesan wrote: > > | Try something alike : > | > | > on mouseup > | > put "1" into startread > | > open file thefile for read > | > read from file thefile until eof > | > put the num of lines of it in endtoread > | > close file thefile > | > repeat while startread < endtoread > | >

Re: the large file challenge

2002-11-08 Thread Yennie
One last note: Be careful of using If you do not read for "lines", you run the risk of cutting a line in half on the spot where your magic string occurs. So always use HTH. Brian
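The boundary problem described here (a fixed-size read splitting the search string across two chunks) can also be handled without reading whole lines. This is a different technique from the one Brian recommends: a Python sketch that carries the last few characters of each chunk over to the next one. The name `count_in_chunks` and the default chunk size are assumptions for illustration, not taken from the thread's scripts.

```python
import io

def count_in_chunks(stream, needle: str, chunk_size: int = 8 * 1024 * 1024) -> int:
    """Count occurrences of needle in a text stream read in fixed-size chunks."""
    if not needle:
        raise ValueError("needle must not be empty")
    count = 0
    tail = ""                # end of the previous window, carried forward
    keep = len(needle) - 1   # longest partial match a chunk can end with
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break            # end of file
        window = tail + chunk
        count += window.count(needle)
        # Keep only the characters that could begin a match completed by
        # the next chunk; they are too short to hold a full match, so
        # nothing is counted twice.
        tail = window[-keep:] if keep > 0 else ""
    return count
```

For example, `count_in_chunks(io.StringIO("xx mystic_mouse yy"), "mystic_mouse", 5)` finds the match even though no single 5-character chunk contains it. The sketch counts non-overlapping occurrences, which is adequate for a string like "mystic_mouse" that cannot overlap itself.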

Re: the large file challenge

2002-11-08 Thread Yennie
I'm pretty sure the problem with speed here is from reading in the entire file. Unless of course you have enough free RAM- but that's hard to imagine when the files are 300MB+. How about this, which you can adjust to read any given number of lines at a time. Try it with 10, 1000, 1, etc and se

Re: the large file challenge

2002-11-08 Thread andu
--On Friday, November 08, 2002 19:15:59 -0800 Richard Gaskin <[EMAIL PROTECTED]> wrote: # !/usr/local/bin/mc on startup put "/gig/tmp/log/xaa" into the_file put url ("file:"&the_file) into the_text put 0 into the_counter put 1 into tPointer -- repeat for each line this_line in the_

Re: the large file challenge

2002-11-08 Thread Richard Gaskin
andu wrote: >> I think I missed something from the original post > > No, you got it right. Thanks, Andu. I thought I was losin' it. If we're allowed to read the whole thing into RAM and the goal is to count the occurrences of the string "mystic_mouse", then to optimize speed we can just re
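The approach described here, searching the in-memory text by offset rather than looping over lines, can be sketched in Python as follows. This is an illustration only, not Richard's Transcript script (which is cut off in the archive); `count_occurrences` is a hypothetical name, and Python's `str.find` stands in for Transcript's `offset` function.

```python
def count_occurrences(text: str, needle: str) -> int:
    """Count non-overlapping occurrences of needle by repeated offset search."""
    if not needle:
        raise ValueError("needle must not be empty")
    count = 0
    pos = text.find(needle)
    while pos != -1:
        count += 1
        # Resume the search just past the match that was found.
        pos = text.find(needle, pos + len(needle))
    return count
```

The loop touches the text only through the search call, so the per-line overhead of splitting and testing each line disappears. The trade-off is that the whole file has to fit in memory.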

Re: the large file challenge

2002-11-08 Thread andu
--On Friday, November 08, 2002 18:24:56 -0800 Richard Gaskin <[EMAIL PROTECTED]> wrote: Sadhunathan Nadesan wrote: # !/usr/local/bin/mc on startup put "/gig/tmp/log/xaa" into the_file put 1 into start_read put 0 into the_counter put 1 into the_offset open file the_file for read read from file

Re: the large file challenge

2002-11-08 Thread Richard Gaskin
Sadhunathan Nadesan wrote: > #!/usr/local/bin/mc > on startup > put "/gig/tmp/log/xaa" into the_file > put 1 into start_read > put 0 into the_counter > put 1 into the_offset > open file the_file for read > read from file the_file until eof > put the num of lines of it into end_read > close file th

the large file challenge

2002-11-08 Thread Sadhunathan Nadesan
| Try something alike : | | > on mouseup | > put "1" into startread | > open file thefile for read | > read from file thefile until eof | > put the num of lines of it in endtoread | > close file thefile | > repeat while startread < endtoread | > open file thefile for read | > read from file thefil