[REBOL] Essay on Associative Data Stores Re:(3)
Hi Joel, 19-Sep-2000 you wrote: [...] >I'll wait to respond with specifics until I've had a chance to look >them over adequately. I'm curious about the motivations (and >the performance) of your compromise solution -- part object, part >function library. >Would I be correct in guessing that you chose to make associate-ads >and lookup-ads standalone functions instead of object methods to >avoid the duplication which REBOL (currently -- hint, hint, hint) >imposes on objects? You can get rid of the duplication by doing a super-object containing the functions, then cloning this object for all your "real objects". So no, that wasn't the case. The reason was laziness ;-) 1: I needed to encapsulate the two different lists into something, and I chose an object. Could've used a list with the two lists as the only elements, but I just didn't. 2: When I tested my code, I just wanted to be able to 'do the source file and then directly use the functions on my existing object. If I had put the functions inside the object, then i would have to create a new object in order to use the new functions. This is the laziness part. So, there wasn't any clever reason. And anyway, I just wrote it to show how to do the thing relatively elegant by separating the keys and the values into two separate lists, instead of the approach everybody else uses by cramming everything into a single list. Anyway, it can be done a lot nicer than I have done. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Essay on Associative Data Stores Re:(4)
Hi Larry, 19-Sep-2000 you wrote: >What Joel writes about tradeoffs is true, but the most important thing to >know is that the REBOL hash! datatype only hashes strings (a little missing >documentation). For other datatypes, you just get a more expensive block >with no speed benefits. Jim has said that he will extend it to hash numbers >as well, but no schedule on that. OK, thanks for the info. That scheme sounds reasonable to me, given that the only thing I've used hash tables for in the past has been strings. I don't immediately see any reason for having fast lookup on more advanced types. (But probably others immediately see the reason, no doubt ;-) ) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] deleting email Re:
Hi Michael, 18-Sep-2000 you wrote: [...] >The only piece I don't yet know is how to delete an email item from the >host POP server. Anyone know? Just use 'remove on the opened port, like you would with any series. (Absolutely not tested) code for deleting the 4th message in your mailbox: p: open pop://username:[EMAIL PROTECTED] remove at p 4 close p Hope this is what you asked for (since I have a great track record of misunderstanding what people actually want) :-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Essay on Associative Data Stores Re:
Hi Joel, 16-Sep-2000 you wrote: >1.1. Keys and values are not distinguised. >1.2. Storage management/economy require extra code to > handle replacement of values for existing keys. >1.3. The block! type doesn't appear to be supported as a key > in a manner consistent with other data types. I've attached some functions for manipulating with an associative data structure. I personally think they suffice for my normal uses, but I'm looking forward to your comments (though, please keep it short then, I don't have the time to read through all that ASCII ;-) ). BTW, using a hash! for the "keys" array would probably be a better idea than just using a block!, as I currently do. _If_ find/only works faster on hash! lists, that is (it should be, but I don't know). Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc. REBOL [] create-ads: func [ "Creates a new associative data structure" ][ make object! [ keys: copy [] values: copy [] ] ] lookup-ads: func [ "Returns the value associated with the key" ads [object!] "The associative data structure" key [any-type!] "Key" /local t-list ][ if found? t-list: find/only ads/keys key [ pick ads/values index? t-list ] ] associate-ads: func [ "Appends the new key/value pair in the data structure" ads [object!] "The associative data structure" key [any-type!] "Key" value [any-type!] "Value" /local t-list ][ either found? t-list: find/only ads/keys key [ change/only at ads/values index? t-list value ][ append/only ads/keys key append/only ads/values value ] ]
[REBOL] REBOL Official Guide in the Czech Republic? Re:(2)
Hi Ralph, 6-Sep-2000 you wrote: [...] >I believe Petr recently got his, that's the only other Czech order I know >of Mail to all of Europe this summer seems incredibly SLW. The post >office just waffles when we complain, saying it's out of their hands. Whose >hands is it in? >We hope your book arrives soon. Just a small notice: I got my copy of the official guide this Friday, I think (but I'm not quite sure, since I've been so busy lately...). It really looks great - I'm looking forward to getting some time to read it (and, of course, writing a review about it for this Danish Amiga magazine :-) ). BTW: Yes, I ordered my copy as soon as I could, so it's taken quite some time reaching me. But then again, I recall I chose the cheapest delivery method, so this is probably fair. Actually, I found the book in the bookstore at my university before I got it myself. Anyway, I'm content with life, the universe, and everything. And Ralph, too :-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] mailing list since 1998 & full names in sender Re:(2)
Hi , 24-Aug-2000 you wrote: ^ Name is missing, yup! :-) [...] >> As far as I remember I once had a look at Selma and the full name gets >> stripped somewhere in import-email/parse-header. At least for me the >> full name of the sender of a mail is usually rather interesting (and >> on the mailing list it would make things a little easier) so why is it >> cut anyway??? >There's a few different ways that e-mail programs format the From: >header, and parsing all of the permutations was a bit of a trick. My >sample code from late '99 (message 13311) was fixed up by Thomas >Jensen in message 13315 and is a great starting point for a final >solution to this. >Maybe it's time to dust this issue off and take another look at it. >Other messages in the thread/related to this issue are 13119, 13307, >13295, 15460, 36519, 47439. Just in case everything else fails, I've got some C code (for my self-written mailer) that parses mail addresses. Perhaps there exist mailers that output addresses in a format that my C code doesn't handle, but I haven't ever seen such a format. And I've used my own mailer in the last couple of years, so I should've noticed if I'd got such a mail. Just send me a mail if this code snippet is of interest. >Best regards, >Kev Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] curiosity killed the code-generator Re:(5)
Hi , 7-Aug-2000 you wrote: >> Hi Hen, 6-Aug-2000 you wrote: >> >> >Are there any plans to create a Rebol implementation for the Java machine? >> >Answer was finished, then i read again: Oops. a Rebol >implementation for the Java machine" or " a Java implementation >for the Rebol machine" ? i answer to the first below >(bytecode-rebol), but you seem to answer the second? Yeah, it seems like I read "of" instead of "for" in the original question ;-) But replying to the real question, then: This would force REBOL Tech. to rewrite the REBOL interpreter in JAVA, and I sincerely doubt that they have resources (employees, time, etc.) for that. Besides, do any platforms exist for which there is a JVM but no REBOL interpreter? I cannot think of any (perhaps the forthcoming MAJC-based systems, but I don't know the timeline for those), and if there aren't any such platforms, nobody would gain anything from a JAVA implementation of REBOL. Just my thoughts... Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] REBOL Internet Chess? Re:(3)
Hi Cal, 3-Aug-2000 you wrote: [...] >>Now that you've got my >>wheels spinning, would it be >>possible to develop a >>real-time chat application in >>REBOL? I'm not talking about [...] > Sounds simple enough, I'll try to put one together when i'm back on a real computer instead of the PalmPilot... > A good place to look for ideas on how to write this would be %littlebell.r (the telnet program) it should be helpful. littlebell only sends data when the user hits return, if I recall correctly. I had some problems sending/receiving backspaces, but I do not remember which problems. Some time ago I wrote a _very_ simple chat client and server that send the input character by character. (I also made a similar system which waits for the entire line.) They are attached. However, i've no idea whether or not this is the intended behaviour? BTW, the variables and the names of the scripts are in Danish - I didn't dare just changing the text here in my mail program without testing it, and I'm too tired to test it for real in REBOL now. Hope you can figure out what it does. ("Afsender": sender, "modtager": receiver, "portnummer": port number, "servernavn": server name.) Start both programs without modification in separate consoles on the same machine to test it locally. Change the "portnummer" and "servernavn" variables to change the server name and port number.) Anyway, who am I kidding? This is probably not what anybody had in mind, but anyway... Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc. REBOL [ Title: "Snakkesystem 2 - afsender" ] portnummer: 30 servernavn: 127.0.0.1 a: open/binary join tcp:// [servernavn ":" portnummer] konsol: open/binary [scheme: 'console] bogstav: #" " while [bogstav <> #"*"] [ wait konsol bogstav: to-char pick konsol 1 either bogstav = #"^(back)" [ prin "^(back) ^(back)" ][ either (bogstav = #"^M") or (bogstav = #"^/") [ print "" ][ prin to-string bogstav ] ] append a bogstav ] close a REBOL [ Title: "Snakkesystem 2 - modtager" ] portnummer: 30 a: open/binary join tcp://: portnummer b: first a while [(ind: pick b 1) <> none] [ bogstav: to-char ind either bogstav = #"^(back)" [ prin "^(back) ^(back)" ][ either (bogstav = #"^M") or (bogstav = #"^/") [ print "" ][ prin bogstav ] ] ] close b close a
[REBOL] curiosity killed the code-generator Re:(3)
Hi Hen, 6-Aug-2000 you wrote: >Are there any plans to create a Rebol implementation for the Java machine? Would be interesting, but the virtual machine would run _extremely_ slow. However, the part of the JVM instruction set that I have seen is very simple (it's a stack machine), but AWT etc. would probably be a "no-no" because of speed, complexity, etc. But a JVM that interprets a subset of JAVA bytecode could be interesting. Holger from REBOL Tech. once worked on a JVM implementation for the Amiga, so perhaps he can tell us if such a project would be feasible. Holger? Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Bug in 'use? Re:(4)
Hi Ladislav, 19-Jul-2000 you wrote: [...] >> Do you know if this problem has already been reported to >feedback? [...] >You should probably report it to feedback. BTW, did you succeed to >sort the permutations correctly? Yup, the compression and decompression work like a charm now. Although the compression process is extremely slow on normal-sized text documents - it would probably give quite a speed-up if I could use sort/compare (which runs the sorting algorithm itself natively) instead of a custom-written sorting algorithm. Anyway, I'll report the 'use bug to feedback, then! Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Bug in 'use? Re:(4)
Hi Brett, 19-Jul-2000 you wrote: >> Thanks! In the meantime, I got it working by simply having the temporary >> variables as parameters to the function. Mean, but it works. >You don't need temporary variables. You can use local function variables. [...] I did briefly try this, but it seemed to give the same errors. Though now I've written a bug report to feedback (about 'use), and I just tried writing a small program which used local function variables, and it seemed to work. So I probably had another problem in the program... Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Bug in 'use? Re:(2)
Hi Ladislav, 15-Jul-2000 you wrote: >Hi, >don't know, what is the problem, but here is a merge-sort working >for me looong time: [...] Thanks! In the meantime, I got it working by simply having the temporary variables as parameters to the function. Mean, but it works. As a curiosity, I also got the version of merge-sort working which uses less memory. I'll attach it here, just in case anybody finds it interesting. It's less artistic, but I guess that's just the price you have to pay for using less memory. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc. ; Merge sorting, uses only memory proportional to the length of list, ; instead of memory proportional to (length of list)*log(length of list) sort-list: function [list cfunc] [sort-list2 low2] [ sort-list2: func [list list2 cfunc low high middle] [ if (high - low) > 1 [ ; Sort recursively middle: to-integer (low + high) / 2 sort-list2 list2 list :cfunc low middle 0 sort-list2 list2 list :cfunc middle high 0 ; Merge list2[low..middle) and list2[middle..high) into list[low..high] list: skip list (low - 1) low2: middle while [(low < middle) and (low2 < high)] [ either cfunc (pick list2 low) (pick list2 low2) [ change list pick list2 low low: low + 1 ] [ change list pick list2 low2 low2: low2 + 1 ] list: next list ] ; Append the missing items for low low (middle - 1) 1 [ change list pick list2 low list: next list ] for low2 low2 (high - 1) 1 [ change list pick list2 low2 list: next list ] ] ] sort-list2 list (copy list) :cfunc 1 (length? list) + 1 0 ]
[REBOL] Bug in 'use? Re:(2)
Hi Ladislav, 15-Jul-2000 you wrote: >you are right, the problem is caused by a context manipulation - >Use unsets your Middle every time it gets executed. My suggestion >is to not use Use in recursive functions, while this problem >doesn't get corrected. Judging from the nature of recursiveness, that's a little hard, isn't it? ;-) Do you know if this problem has already been reported to feedback? Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Bug in 'use?
Hi again Boy, all those problems... Try running the attached script. It gives the following error: ** Script Error: middle has no value. ** Where: copy skip list middle OK... well, the line with "copy skip list middle" is just _below_ this line: list1: sort-list (copy/part list middle) :cfunc which doesn't produce an error, and since I don't touch 'middle between those two lines, 'middle _has_ to have a value! I suspect that 'use is broken. For recursion I need 'use, otherwise the recursive calls will mess with the variables. But it seems like 'use causes recursive calls to unset the used variables. Any ideas what else could be wrong, before I send this to [EMAIL PROTECTED]? I know that it might just be me having made a stupid bug in the attached code. BTW, the attached function is a merge-sort, which I had to write since bubble-sort is too slow, and since I still haven't got sort/compare working (will be sending this bug report to [EMAIL PROTECTED] soon too, unless somebody can see if there's an error in my code). I know that merge-sort can be implemented in a less memory-hungry way, but I'd like to get this thing working first. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc. REBOL [] sort-list: func [list cfunc] [ use [list1 list2 middle] [ if (length? list) > 1 [ ; Recursively sort the two halves of list middle: to-integer ((length? list) / 2) list1: sort-list (copy/part list middle) :cfunc list2: sort-list (copy skip list middle) :cfunc ; Merge list1 and list2 into list clear list while [(not tail? list1) and (not tail? list2)] [ either (cfunc (first list1) (first list2)) [ append list first list1 list1: next list1 ] [ append list first list2 list2: next list2 ] list: next list ] append list list1 append list list2 ] ] list ] test-list: [6 5 3 8 9] sort-list test-list :greater?
[REBOL] Find cant find reference to block. Re:
Hi Brett, 14-Jul-2000 you wrote: >Is this right? Just wondering. >list: [] >append list reduce ['a copy ["a"]] >append list reduce ['b copy ["b"]] >append select list 'a "a-test" ; OK >probe list >find list select list 'a ; no good find can't find the reference to the >block Try with find/only instead. I'm not sure why find shouldn't work on your example, and why /only is a necessary refinement here, though. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Testing for words in blocks? Re:
Hi Douglas, 13-Jul-2000 you wrote: >Any other ideas? A better way to do this with a native word? How about found? find [word1 "Hi"] 'howdy But, this is still two native words used, I just don't think it can be done with a single native function. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Bug in sort/compare ? Re:(2)
Hi Eric, 13-Jul-2000 you wrote: >The problem is in your sort-func all the return values are true. I think >this'll give better results: [...] In fact, this _doesn't_ work. Just checked. Have a look at the attached function (yes, 'input is redefined inside the function, but since it isn't used there, it doesn't matter ;-) ). Just before the first "print permutations" I do a sort/compare. The output here is not what I expect. Then I do a simple bubble-sort on the list, using the comparison function exactly as I would expect sort/compare uses it (that is, I don't suspect that sort uses bubble-sort, of course). Then the result is as expected. In fact, according to the REBOL User's Guide, the comparison function should return false when the two input values are equal. I've tried changing this too, but it doesn't change the validity of the output of sort/compare. Very strange. BTW, if anybody has ideas on how to make the attached function smaller (but still readable) or faster, I'd be very interested to know. BTW2, JFYI: The function takes a normal text string, applies a transformation on that text string, and returns the transformed string and a number. Using the transformed string and the number, another function can recover the original string. The transformed string will very likely consist of more homogenous substrings than the original, thus making it a lot easier to compress using more traditional approaches. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc. ; Burrows-Wheeler transform encode-bw: func [input] [ use [input-length sort-func permutations transformed-string transformed-info] [ input-length: length? input ; Orders the n1'th and the n2'th permutations of the input sort-func: func [n1 n2] [ loop input-length [ if n1 > input-length [n1: 1] if n2 > input-length [n2: 1] c1: pick input n1 c2: pick input n2 if c1 < c2 [return true] if c1 > c2 [return false] n1: n1 + 1 n2: n2 + 1 ] return true ] ; Build a representation of the permutations permutations: make block! input-length for i 1 input-length 1 [append permutations i] ; Sort the permutations in lexicographic order sort/compare permutations :sort-func print permutations ; A little bubble-sort :-( sorted: no while [sorted = no] [ sorted: yes for i 2 input-length 1 [ if (sort-func pick permutations i - 1 pick permutations i) = false [ temp1: pick permutations i - 1 temp2: pick permutations i poke permutations i - 1 temp2 poke permutations i temp1 sorted: no ] ] ] print permutations ; Now, let's create the new string transformed-string: make string! input-length transformed-info: (index? find permutations 1) foreach permutation permutations [ i: either permutation = 1 [input-length] [permutation - 1] append transformed-string pick input i ] return reduce [transformed-string transformed-info] ] ]
[REBOL] Bug in sort/compare ? Re:(2)
Hi Eric, 13-Jul-2000 you wrote: >The problem is in your sort-func all the return values are true. I think >this'll give better results: [...] Ah, so the compare function should return true or false (or 1 or 0). REBOL Tech., it might be a very good idea to include this information when the user types "help sort". >It seems a little odd, though, that there's no way for the comparison >function to say that the two values were equal. If (sort-func a b) and (sort-func b a) both return true, a and b are equal. >By the way, your script redefines 'input, which might not be a good idea. In the final program I won't do this, but thanks for the reminder anyway. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc.
[REBOL] Bug in sort/compare ?
Hi there! I strongly suspect that 'sort is severely broken. Attached is some code I wrote to do a Burrows-Wheeler transform. If you don't know what that is, don't worry - it's a transformation used for compressing text, but that's irrelevant for this mail. The important part is that in order to do this transformation, I need to sort a list in a special way. Try running the program. Four times the line which prints "Not sorted" is executed. The critical part is this: ; Sort the permutations in lexicographic order sort/compare permutations :sort-func ; Check if the sorting went well for i 2 input-length 1 [ if (sort-func pick permutations (i - 1) pick permutations i) = 1 [ print ["Not sorted:" pick permutations (i - 1) pick permutations i] ] ] First, I sort the list "permutations", using my own sorting function. This line _should_ sort "permutations" so that for two neighbouring entries in "permutations", say entry i and entry i+1, the following expression: sort-func (pick permutations i) (pick permutations (i + 1)) should in no way give 1. That is, the sorting function taken on any entry and its rightmost neighbour should give either 0 or -1 (that is, the entry should be less than or equal to its right neighbour). Right? Or have I just missed a point in the attached script? Kind regards, -- Ole Friis <[EMAIL PROTECTED]> Amiga is a trademark of Amiga Inc. REBOL [] input: "this is the" ; 8 10 input-length: length? input ; Burrows-Wheeler transform ; Orders the n1'th and the n2'th permutations of the input sort-func: func [n1 n2] [ loop input-length [ n1: n1 + 1 n2: n2 + 1 if n1 > input-length [n1: 1] if n2 > input-length [n2: 1] c1: pick input n1 c2: pick input n2 if c1 > c2 [return 1] if c2 > c1 [return -1] ] return 0 ] ; Prints the n'th permutation of the input print-perm: func [n] [ for i 1 input-length 1 [ n: n + 1 if n > input-length [n: 1] prin pick input n ] print "" ] ; Build a representation of the permutations permutations: make block! input-length for i 0 (input-length - 1) 1 [append permutations i] ; Sort the permutations in lexicographic order sort/compare permutations :sort-func ; Check if the sorting went well for i 2 input-length 1 [ if (sort-func pick permutations (i - 1) pick permutations i) = 1 [ print ["Not sorted:" pick permutations (i - 1) pick permutations i] ] ] ; Some feedback I like print permutations foreach permutation permutations [print-perm permutation]
[REBOL] OT: Beta (was: New? programming construct ) Re:(6)
Hi Elan, 19-Feb-2000 you wrote: >>Do you mean the Beta programming language, >Yes. Cool! I've got to tell my study mates about that ;-) [...] >>And before I get your reply, I still think I'm >>right on that one ;-) ) >Yikes. I didn't know Beta was *that* unknown. I actually printed out and Perhaps it isn't. But as it's mainly developed here in Aarhus, we (the poor students) always felt like the only reason we were forced to writing Beta using the very, very poor development tools ("poor" regarding stability and speed, mainly...) was that the developers wanted some testers. >studied the accompanying manuals. I especially like the idea of fragments. Yup. Fragments and the way that methods can interact with "overriding methods" is very smart. And the pattern abstraction mechanism, though it sometimes seems a bit cumbersome. >It provides some interesting added flexibility. I also found the arrow >notation very intuitive and appealing. The only thing that stopped from Hmmm, I still prefer REBOL's ":" or C's "=", but that's probably a habit. >actually using Beta beyond just entering all the examples from the manuals >was the fact that the MS Windows GUI support was rather weak (at the time >it relied on Tcl/Tk, if I remember correctly.) The kind of software I was I think they have done major things regarding Windows GUI in the meantime. [...] >At some point I remember someone announced that they were a implementing a >constraint sub system for Beta. That wouldn't surprise me. Beta acts as a kind of crash test dummy for a lot of (sometimes strange) ideas the students and professors get. But what I hate most about Beta is the compiler. The executables are far too big (3-4 MB's for medium-sized projects is not a peculiar sight when using Beta), though apparently somebody is working on that one. But still, there are many other programming languages I would prefer for practical use instead of Beta. Anyway, I just wanted to tell you that I found it interesting that you knew about Beta, too ;-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] New? programming construct Re:(4)
Hi Elan, 17-Feb-2000 you wrote: >Would that mean that GnuC++ would not be able to incorporate rules? Beta >has had rule constraints for at least two or three years. Do you mean the Beta programming language, or just the beta versions of GnuC++? (Didn't think anybody outside of the University of Aarhus, here in Denmark, knew about Beta. And before I get your reply, I still think I'm right on that one ;-) ) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Setting timeout for POP3 connection
Hi! Some time ago somebody (at REBOL Tech., I suppose) told us how to set the timeout value for socket connections. Does anybody remember how this is done? Thanks in advance! And, to REBOL Tech.: Please add this to the manual somewhere, as it can be really useful. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Simple textfile handling... Again.. Re:
Hi Stefan, 11-Jan-2000 you wrote: >me again with my puny textediting script. :) >The textfile has a couple o' lines: >Foo123 >Bar123 >Doo123 > now I wanna search the textfile for all occurances of Doo and remove >those lines.. Assuming your file always ends in a newline: removeme: "doo" parse/all str [ some [ start: removeme thru newline end: (remove/part start end) :start | thru newline ] ] Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] About DFAs Re:(2)
Hi Eric, 12-Jan-2000 you wrote: >Thanks a lot for the information on DFA's. It's very interesting, but I >don't think I'm ready to try to implement a DFA engine in REBOL. Still, >it's given me some new ideas on how to improve the NFA matching algorithm. >Thanks! Well, great! :-) Perhaps I should try making some DFA thingies for REBOL some time, but I've got some exams and a JavaScript interpreter do do first ;-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(8)
Hi Gabriele, 10-Jan-2000 you wrote: > o> BTW, Gabriele, wasn't it you who once said that you would do a > o> JavaScript interpreter for REBOL? Well, I've started writing >It was Andrew, IIRC. :-) I'm not an expert of Javascript... ;-) Thanks. I'll start nagging him, then :-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(7)
Hi Petr, 9-Jan-2000 you wrote: [...] >> I have the answer for you: change b1 as: >> >> b1: [skip b1 | "ing" end] >b1: [skip b1 | "ing" to end] seems to do the job better :-) No, since this will match any string containing "ing", and that wasn't the intention of b1. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(2)
Hi Eric, 10-Jan-2000 you wrote: >The hardest thing for me was implementing quantifiers. What I'm doing >with search-text.r is "compiling" an expression like > [bound "t" 3 7 alpha "ing" bound] >to a form like: > [bound "t" [4 6 3 7] make bitset! #{...} [3] "ing" bound] >in which the blocks are nodes which tell: >one or two elements to jump to >the minimum number of times the first jump must be made >the maximum number of times the first jump can be made >So we have to go through the cycle of elements 3/4/5 > --- >| | >V | >"t"(2) -> node(3) -> bitset(4) -> node(5) - >| > -> "ing"(6) >three times before we even think about jumping to the "ing" element. >What I'm doing is saving the number of times node(3) has been passed >in another block in a position tied to the "ing" element. When there >is a failure to match somewhere in the cycle, or when the cycle has >been done the maximum number of times, it's important to erase the >information on how many times node(3) has been passed, in case one loop >is nested inside another loop and there's a chance of coming back to >it. >Now with the NFA engine, saving a state means you only have to push >onto a stack the index of a jump you could have taken, plus the current >index of the text string. But with a DFA engine, you wouldn't need the >text string index (because once you deal with a character in the text >stream it's done once and for all) - is that right? On the other hand, That's right. >every time you come to a place where two jumps are possible you have >to start another thread so that both possible pattern indexes can be >compared against the next character in the text stream. You mean, with DFA's? In that case, no. Though, it's something like that you need to do when you construct the DFA from the NFA. >What seems especially complicated here is that you have to save the >information on which node has been passed how many times with each >thread. Complicated yes, but once a thread dies, it should be easier >to delete all the information associated with it. Once you have a DFA, you don't need to store such information. What you have is one big system with states (nodes) and transitions (arrows between the states, each associated with a character). You only need to remember which state you are in. Let's take a simple example: A DFA recognizing all binary numbers that are a multiplum of 3 (11 binary ;-) ). State 1 is an accepting state: 0 |--| V | -->State 1 (accepting) | ^ 1| |1 V | State 2 | ^ 0| |0 V | State 3 | ^ |-| 1 I hope it's clear what it says: >From state 1: "0" moves to state 1, "1" moves to state 2. >From state 2: "0" moves to state 3, "1" moves to state 1. >From state 3: "0" moves to state 2, "1" moves to state 3. We start in state 1 (that's why there's the arrow coming from the left), and when we have a string, for example "10101" (which is 21 in decimal), you just eat the characters from left to right, following the transitions. Since it is a DFA (the D stands for deterministic, you know ;-) ), we will never have more than one possible move (when we have no move for a given input character, we reject the whole input string immediately). When we are done with the whole string, we see whether or not we are in an accepting state. If so, our DFA has recognized the input string. If not, our DFA has rejected the input string. Simple as that. As you can see, once we have built a DFA for a corresponding regular expression, matching strings is _very_ fast and requires no recursion, no stack, or whatever. In fact, the only differences between a DFA are these: 1: An NFA may have more than one possible transition for certain input characters in each state. 2: An NFA may have "the empty string" associated with a transition. But the beauty is that given any NFA, you can construct a corresponding DFA. >Well, that's the idea anyway. If you would take a look at the code and >tell me any ideas you have, or if you could suggest a reference on >DFA machines, I'd really appreciate it. I'll try to find some electronic material on DFA's tomorrow (when I have free bandwith from University ;-) ). Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(6)
Hi Gabriele, 9-Jan-2000 you wrote: > o> To make things even more spooky, look at this: [...] >I have the answer for you: change b1 as: >b1: [skip b1 | "ing" end] >Do you see what I mean? Wow, YES I do! Dang, I must've looked like a jerk ;-) Thanks a lot! BTW, Gabriele, wasn't it you who once said that you would do a JavaScript interpreter for REBOL? Well, I've started writing one for myself (with flex and bison, not anything directly REBOL-related), and I get great headaches trying to grasp how to make automatic semicolon insertion. Did you come this far in your evaluation? If so, do you have any idea how to implement this? (If it wasn't you who talked about a JavaScript interpreter, I'm sorry ;-) ) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(8)
Hi Petr, 9-Jan-2000 you wrote: [...] >any. "to end" would solve it. The result is, even if "ing" got matched and >applying of rule was a success, it doesn't mean whole parse is succesfull. It's >like expecting parse "ringing bells" [thru "ing"] is going to return 'true, but it >isn't as "to end" is missing ... I understand this now. Sorry, I just didn't think this far. [...] >huh :-) and we are back from all recursion calls. Being it one way or other, I >would suggest forget your technique, as it's very inefficient ... As I said in another mail, this wasn't what I had in my mind when I wrote the mails - I just wanted a way to write the regular expressions in the Parse dialect, not thinking of how Parse is actually implemented. [...] >b2: g i.n.g. 17 >b2: ng i.n.g. 16 >b2: ing i.n.g. 15 >== false >->> >oops - do you see? "ing" was applied, but we are NOT returning to upper levels of >recursion. The result is - 'false, but why? As the rule doesn't contain "to end" ...and therefore the whole Parse operation fails, since we don't reach the end of the input string. Yup, I get it now! [...] >hope-this-helps Sure! Thanks for being so stubborn on me, otherwise I wouldn't have understood it ;-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(6)
Hi Petr, 9-Jan-2000 you wrote: [...] >> >This will not work imho :-) >> >> You're right. As I said, it was completely untested ;-) >> >> I used "to" instead of "thru", and we need the /all refinement. But apparently >> that's still not enough, though I don't know why... >> >As I said - it doesn't matter if you used 'to or 'thru, because once you are back >from 'b's maddening :-), you are standing just behind the last occurance of "ing" >in the parsed string ... Erm, we should be standing just in front of it, otherwise there's now point? And I wanted to see if we had a separator just behind the "ing", which means I should use 'thru. >> To make things even more spooky, look at this: >> >> sep: charset " ,.!?" >> b1: [skip b1 | "ing"] >> b2: [skip b2 | "ing" sep to end] >> a: [b1 | b2] >> >> The idea here is that b1 will match any string ending in "ing", and b2 will >> match any string which contains "ing" followed by a separator. Now, watch >> this: >> >> >> parse/all "ringing bells" b2 >> == true >it's just the same as 'b1, it just contains the code to skip to the end of the >string Yes. As I said above: The idea here is that b1 will match any string ending in "ing", and b2 will match any string which contains "ing" followed by a separator. Notice the use of "ending in" and "contains". [...] >6) so b1 is true for the first time and we are standing just behind the "ing" >string, so at beginning of "bells", or " bells", once parse/all is used You _do_ mean that we're in front of the "ing" string, right? We cannot see what is just to the left of our position, so talking about what happens when we're at the right of "ing" makes no sense. Besides, b1 will never ever return true for "ringing bells", as this string does not end in "ing". >and now back to parse/all a >a is subrule! Huh??? >b2 part of this subrule is NEVER applied. Know why? Just because at some point, we >returned true from b1, so there is no need to perform OR option. The 'false it >there, because all we need to do is to skip to the end of the string ... But if we returned true from b1, the result of the whole parsing should be true too. But of course b1 will not return true, as b1 will only match strings _ending_ with "ing", and "ringing bells" is not ending with "ing". >> >as for "ringing bells" >> >> >Again, you are going to reach end of the string, then going back recursively >> >until first "ing" (applied from end of the string) is not matched. : >> >> >>> b: [skip markb: (print ["markb: " markb index? markb]) b | back-b: (print >> >["back-b: " back-b in >> >dex? back-b]) "ing"] >> >== [skip markb: (print ["markb: " markb index? markb]) b | back-b: (print >> >["back-b: " back-b index >> >? back-b]) "ing"] >> >>> parse str a >> [...] >> >back-b: g bells 7 >> >back-b: ng bells 6 >> >back-b: ing bells 5 >> >== false >> >> >Your code will fail with something like "ringing sounding bell" ... it will >> match >> >sounding, so generally said - the last occurance of "ing" contained in the >> string >> >... >> >> It should then try to match the part to the right of the "|" in the top rule >> (rule a), which should match. But that doesn't seem to happen. >No, why? | means OR, not continuation of the rule. There is no need for OR, as >your first part of rule got 'true. Parser then tries to continue after right ] of >rule. Remember [blabla | bleble] is just subrule, once blabla is true, it then >skips behind the right ] ... Either I do not understand you, or you have missed the point that the rule to the left of the "|" is returning _false_, not true. >> >change 'a to [b to end] and once succesfully back from 'b, it will continue >> "to >> >end" and return "true" ... >> >> But that's not what it should do. Then the rule would match all strings with >> "ing" in it. >Yes, backtracked from the tail of the string ... I repeat: That's not what it should do. >> We only want to match strings that either end in "ing" or have a >> word inside it (words are separated by spaces, commas, etc.) that end with >> "ing". >> >I don't understand it as for above parse/all "ringing bells" [thru "ing"] should >be sufficient ... Nope, see below: >well if you want to return false for str: "ringdong bells", that's another story > And that's exactly what I want to do! >parse/all str [thru "ing " to end] But a word can end in ",", ".", "!", "?", etc. I wanted to cover all these instances with my "sep" variable, which is a bitset. You cannot do that in the above way, since 'thru doesn't accept the block ["ing" sep] instead of your "ing " string. >and if it doesn't cover "ringdong bellsing", then just add space to the parsed >string before parse is applied ... what a hack :-) That's why I had my b1 block... and your solution only accepts " " as a separator. And I wanted it all to be inside the parse rule, not using other REBOL commands, which is also why I didn't just start by splitting up the string with strings: parse str " ,.!?"
[REBOL] Regular Expressions Re:(6)
Hi Petr, 7-Dec-1999 you wrote: >> [...] >> >> How about this: >> >> >> >> >> a: [skip a | "ing"] >> >> == [skip a | "ing"] >> >> >> parse "ringin" a >> >> == false >> >> >> parse "ringing" a >> >> == true >> >> >> >> It should do the job. So actually backtracking in Parse can be achieved by >> use >> >> of the "|" operator in Parse blocks. >> >> >What? | is simple OR or am I wrong? You will end up in recursion imho. >> >> That was the very point. >Aha, I just thought that you thought "ing" is matched or skip is performed, not >knowing of recursion. Well, I should look better at your email adress, as I would >recognize, you are experienced reboller :-) >Anyway ... your example will fail with "ing" being present somewhere in middle of >the text ... I would even sincerely dare to say - it has no sense. You are nesting >to recursion while end of string is not matched (let's imagine some 1000 chars = >1000 subrules) and then backtracking by one char, while "ing" or we are not back >at the beginning of string ... Or am I wrong? It is true that we get a ridiculously deep recursion. In effect, we first go to the end of the string (by recursion), then we step backwards and each time look if the string from the position we're at is "ing". So: 1: If we have a string with n chars, ending in "ing": First make n recursive calls. On the third character on the way back, we find out that our string ends with "ing", and we therefore return "success" all the way up back through the recursion. 2: If we have a string with n chars, not ending in "ing": First make n recursive calls, then at each step on our way back, see if the string starting from our position is "ing". So, yes, the solution is very, very stack-consuming and cumbersome. I simply didn't pay attention at all to efficiency, just the principle that it could be done that way. >> I just don't see what the problem is. The examples above do exactly as I >> wanted. I used "|" as "a backtracker", since it appears to do this: >> >> First, call recursively on the left side of the "|". If the match succeeds, >> return success. >No, no chance, You have no chance to match "ing". It will always reach the end of >the string. Once returning to upper levels from recursion, backtracking starts In my terminology, backtracking starts when the left side of a "|" is tested, since we have to store information on our position in the string. But I guess this is not the terminology everybody else uses, so sorry about that :-) Why haven't I got a chance of matching "ing"? When we're at the point where we have reached the end of the string, the part left to "|" will fail immediately, since 'skip fails. Then we try to match "ing", which will fail. Only when we go three steps back (i.e. we're at the position in the string where we can see the last 3 characters in the string), the match on "img" might succeed. If so, the "succeed" result is propagated all the way through our recursive calls, since "skip a" has then succeeded for all the calls above. >> If the first call didn't return success, >It can't because of the reasons mentioned above ... >> call recursively on the right side of >> the "|" (starting from the same position in the input stream as for the call >> on the left side of the "|"). If the match succeeds, return success, >> otherwise return failure. >> >> This is effectively the same as backtracking. >> >pretty ineffective seems to me ... >... no offense ... No offence taken :-) By "effectively the same as backtracking", I meant "another way of achieving backtracking". I didn't think about speed and stack usage. >> >PS: I am somehow tired today to study and comment another examples, sorry :-) >> >> Me too, I still have to read quite a few pages (of _really_ boring statistics) >> before I can go to bed ;-) >I think still better than really bad toothache :-) Ouch! Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(4)
Hi Petr, 8-Jan-2000 you wrote: [...] >> Ah, then I just got you wrong. The easy way to do the above is first to split >> the text up with >> >> parse str none >> >> and then match the individual words. However, this should work too: >> >> sep: charset " ,.!?" ; and whatever else you want to split up words >> b: [skip b | "ing"] >> a: [b | to "ing" [sep to end | a]] >> parse str a >> >> though it's completely untested, and I agree that it's ugly :-) >This will not work imho :-) You're right. As I said, it was completely untested ;-) I used "to" instead of "thru", and we need the /all refinement. But apparently that's still not enough, though I don't know why... To make things even more spooky, look at this: sep: charset " ,.!?" b1: [skip b1 | "ing"] b2: [skip b2 | "ing" sep to end] a: [b1 | b2] The idea here is that b1 will match any string ending in "ing", and b2 will match any string which contains "ing" followed by a separator. Now, watch this: >> parse/all "ringing bells" b2 == true >> parse/all "ringing bells" a == false Since the parse with b2 returned true, the parse with 'a should indeed be true, too, since 'a is true if either b1 or b2 gives true. Now, if instead I define 'a as [b2 | b1], we get true with 'a instead. I just cannot see why this should be so? >as for "ringing bells" >Again, you are going to reach end of the string, then going back recursively >until first "ing" (applied from end of the string) is not matched. : >>> b: [skip markb: (print ["markb: " markb index? markb]) b | back-b: (print >["back-b: " back-b in >dex? back-b]) "ing"] >== [skip markb: (print ["markb: " markb index? markb]) b | back-b: (print >["back-b: " back-b index >? back-b]) "ing"] >>> parse str a [...] >back-b: g bells 7 >back-b: ng bells 6 >back-b: ing bells 5 >== false >Your code will fail with something like "ringing sounding bell" ... it will match >sounding, so generally said - the last occurance of "ing" contained in the string >... It should then try to match the part to the right of the "|" in the top rule (rule a), which should match. But that doesn't seem to happen. >And because of that, second part of 'a - {to "ing" [sep to end | a]}will be NEVER >applied, as your pointer is just behind last occurance of "ing" contained in your >string ... that's why you got false result. So the second part of 'a will pick up where the first part left? If that's the case, then Parse _really_ has a bug, since the second part of 'a should pick up at the very same spot as where the first part started, which is the start of the entire string. >change 'a to [b to end] and once succesfully back from 'b, it will continue "to >end" and return "true" ... But that's not what it should do. Then the rule would match all strings with "ing" in it. We only want to match strings that either end in "ing" or have a word inside it (words are separated by spaces, commas, etc.) that end with "ing". Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(2)
Hi Eric, 7-Jan-2000 you wrote: >Ole, you suggested this to find a word ending in "ing": >>> a: [skip a | "ing"] >== [skip a | "ing"] >>> parse "ringin" a >== false >>> parse "ringing" a >== true >That's fine, except it won't detect a word ending in "ing" in the >middle of the string: >>> parse "ringing bells" a >== false Ah, then I just got you wrong. The easy way to do the above is first to split the text up with parse str none and then match the individual words. However, this should work too: sep: charset " ,.!?" ; and whatever else you want to split up words b: [skip b | "ing"] a: [b | to "ing" [sep to end | a]] parse str a though it's completely untested, and I agree that it's ugly :-) >Actually, I think your suggestion is equivalent to the rule: >b: [to "ing"] No, as this rule wouldn't return true on the string "ringing". >Also, recursion is not so good for very long strings: >>> str: "ringing" >== "ringing" >>> parse str a >== true ; OK >>> insert str "- - - - -" >== "ringing" >>> parse str a >== true ; still OK >>> insert/dup str "- - - - -" 10 >== "- - - - -ringing" >>> parse str a >== true ; still OK >>> insert/dup str "- - - - -" 100 >== {- - - - -- - - - -- - - - ... >>> parse str a >== false; stops working >I expected a stack overflow error, but it just stopped working. Bug? Strange. Don't know what happened, but you're right, recursion is really a problem with REBOL as it offers only limited depth. [...] >>> parse "This is the straw that broke the camel's back" rule >straw that broke the camel's back >== true >>> parse "I need some straw, and a camel" rule >== false >This is also very complex with PARSE, but trivial with regular >expressions. True. If Parse supported backtracking, it would just be a case of parse str [any skip "straw" any non-punct "camel" to end] >Ole wrote: >>Converting "(string)*" and "(string)+", for example, into this: >> >>some-string: ["string" some-string | "string"] >>any-string: ["string" any-string | none] >> >>should work, shouldn't it? But yes, I must admit, too, that 'any and 'some >>ought to do some backtracking - in fact, I believed they did, so please >>ignore the mail I sent a couple of days ago on how to easily build Parse >>blocks from regular expressions ;-) >> >>Rebol Tech., isn't the lack of backtracking in 'any and 'some a bug? >I'm not sure what the advantage of your suggestion would be over >converting "(string)*" to [any "string"] and "(string)+" to [some I actually had an idea with it, but I guess I need to think a little more about it, as it doesn't seem quite so clear to me anymore ;-) [...] >The rule QUOTED-STR needs to be adjusted of course, depending on what >you want to do, and there are always problems with how to deal with >illegal strings (should you allow newlines in the middle or not etc). >Still, parse rules make this kind of work easy, and backtracking can >only be a complicating factor. You're right on this one, i.e. that without backtracking, it might be easier to understand exactly what's happening. >>BTW, by building a DFA for a regular expression, you can get rid of >>the backtracking. However, I don't know how fast a program can build >>a DFA, it might not be that fast... but if you try to match one >>regular expression to a lot of strings, it might be worth a go. Keep >>in mind, though, that then you'll only be able to get a "matched" or >>"didn't match" answer from your expression matching, it will probably >>be very difficult to implement copying of the text inside >>subexpressions etc. >I've never read anything on how DFA's actually work, but it sounds real >complicated. Search-text.r implements an NFA, of course, but even then I >haven't figured out to copy text matching portions of the expression. If >anyone would give me some hints, I'd be grateful. Unfortunately I haven't ever tried to actually write a program which turns an NFA into a DFA, but there are relatively simple algorithms to do so. I might try to give it a go when I get the time, but unfortunately that's not now (due to exams). If you have an NFA with states s1, s2, ..., sn, you can build a DFA in which the states are all possible subsets of {s1, s2, ..., sn}. The intuition is this: When you feed some input stream to your NFA, it can be in a number of states at once. Just select the state in your DFA that corresponds to the set of states your NFA is in. To decide which transitions you need between the states in your DFA given an input symbol: For each element s in the set corresponding to the DFA state, see which states you can go into from the NFA in state s. Take the union of all the states you get, and you get another state in your DFA, and you've got a transition between the two DFA states. The starting state in your DFA is the state containing exactly the states you can reach in your NFA given the empty string
[REBOL] Regular Expressions Re:(4)
Hi Petr, 6-Jan-2000 you wrote: [...] >> How about this: >> >> >> a: [skip a | "ing"] >> == [skip a | "ing"] >> >> parse "ringin" a >> == false >> >> parse "ringing" a >> == true >> >> It should do the job. So actually backtracking in Parse can be achieved by use >> of the "|" operator in Parse blocks. >What? | is simple OR or am I wrong? You will end up in recursion imho. That was the very point. >Try this: >a: [skip mark: (print mark) a | "ing"] >you will get following result: >ingin >ngin >gin >in >n >== false ...which was exactly what I wanted. >"ing" is never going to be considered imho, unless skip fails, and that's when? >When we reach end. Then it fails, as at the end, "ing" is not applicable. Because I wanted to match a string _ending_ with "ing". I guess we have just misunderstood each other? >How's that your second example works? Let's extend the rule: > a: [skip mark1: (print ["mark1: " mark1 index? mark1]) a | mark2: (print ["mark2: >" mark2 index? mark2]) "ing"] >mark1: inging 2 >mark1: nging 3 >mark1: ging 4 >mark1: ing 5 >mark1: ng 6 >mark1: g 7 >mark1: 8 >mark2: 8 >mark2: g 7 >mark2: ng 6 >mark2: ing 5 >== true >Hmm, strange, at mark1: 8 skip fails as we are at the end of the string, parse >tries to aply "ing" - not a succes, it then returns from recursion one level upon. >How's that 'skip fails here (mark2: 7)? In real - it doesn't, as the rule is: >[skip a | "ing"] >and we are just behind the skip, back from the nested 'a, which failed, so it >tries to aply "ing" >Whee ya, backtracking as a wine? :-) >I am just starting to learn some simple parse rules, so take my comments easy, if >wrong :-) I just don't see what the problem is. The examples above do exactly as I wanted. I used "|" as "a backtracker", since it appears to do this: First, call recursively on the left side of the "|". If the match succeeds, return success. If the first call didn't return success, call recursively on the right side of the "|" (starting from the same position in the input stream as for the call on the left side of the "|"). If the match succeeds, return success, otherwise return failure. This is effectively the same as backtracking. >See ya, >PS: I am somehow tired today to study and comment another examples, sorry :-) Me too, I still have to read quite a few pages (of _really_ boring statistics) before I can go to bed ;-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] What's next for REBOL... Re:(5)
Hi Carl, 6-Jan-2000 you wrote: >Yes! As long as they have the same subject, I'll find them all, and we will send you REBOL/View. Just as soon as we've got it. >-Carl Well, then: Me too! Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:(2)
Hi Eric, 5-Jan-2000 you wrote: [...] >I think backtracking is an essential part of regular expressions. To quote >from Mastering Regular Expressions by Jeffrey Friedl (O'Reilly, p. 102), >The essence of an NFA engine [the regex matching engine of the kind >that Perl uses] is this: it considers each subexpression or component >in turn, and whenever it needs to decide between two equally viable >options, it selects one and remembers the other to return to later if >need be. >Suppose you want to find a word ending in "ing". In Perl you could write: > /\b[a-z_]+ing\b/i >Backtracking is crucial here, because "[a-z_]+" first matches to the end of >the word, and if you didn't backtrack, "ing" would never match. With a parse >rule, you have to do something like this: >pp: copy [] >alpha: make bitset! [#"a" - #"z" #"A" - #"Z" #"_"] >non-alpha: complement alpha >parse/all string [ some [ >p: some [ "ing" [non-alpha | end ] (append pp p) | alpha ] | >some non-alpha >]] >This will leave you with the block PP holding pointers to the beginning of >all the words ending in "ing". It isn't quite equivalent to the Perl regex, >but it's close enough. Here backtracking is accomplished in the following >way: Suppose you were scanning through the word "ringing". The string "ing" >would match "ringing" first, and then attempt to match a non-alphabet > ~~~ >character or the string end. That wouldn't work, but the following | says >that we should try to jump back to the second letter of "ringing" and just >match any alphabet character, and keep going through the word. This is the >kind of thing that regexes do for you automatically. How about this: >> a: [skip a | "ing"] == [skip a | "ing"] >> parse "ringin" a == false >> parse "ringing" a == true It should do the job. So actually backtracking in Parse can be achieved by use of the "|" operator in Parse blocks. >But if you want to search for a pattern in text, such as {a word begining >with "t" and ending with "gh"}, >> a: ["t" b] == ["t" b] >> b: [skip b | "gh"] == [skip b | "gh"] >> parse "tough" a == true >> parse "tougher" a == false Or have I misunderstood you? >or {the words "straw" and "camel" not >separated by any punctuation}, and especially if you want to do that Hmmm, what's the problem with this one? >interactively without stopping to think how you're going to program it, the >parse rules that would be needed are much too complex. This is the problem >I've been trying to tackle with search-text.r. [...] >The big problem is speed. When processing the expression given, an optimized >function is created that tries to quickly identify potential matches. >Unfortunately for an expression that begins with ANY there's no simple way to >do this, so the main matching function is called for every character in the >string. Needless to say that is useless for most purposes, but in interactive >searching I don't think most people would use such an expression. Replacing >text is another matter, but then that won't be much use until I get text >capture of portions of the expression implemented. Converting "(string)*" and "(string)+", for example, into this: some-string: ["string" some-string | "string"] any-string: ["string" any-string | none] should work, shouldn't it? But yes, I must admit, too, that 'any and 'some ought to do some backtracking - in fact, I believed they did, so please ignore the mail I sent a couple of days ago on how to easily build Parse blocks from regular expressions ;-) Rebol Tech., isn't the lack of backtracking in 'any and 'some a bug? BTW, by building a DFA for a regular expression, you can get rid of the backtracking. However, I don't know how fast a program can build a DFA, it might not be that fast... but if you try to match one regular expression to a lot of strings, it might be worth a go. Keep in mind, though, that then you'll only be able to get a "matched" or "didn't match" answer from your expression matching, it will probably be very difficult to implement copying of the text inside subexpressions etc. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Regular Expressions Re:
Hi Keith, 4-Jan-2000 you wrote: >1. Is there any support for regular expressions planned (or is Rebol only >going to support 'parse for everything regular expressions might be used for >that isn't already taken care of by words like 'find.)? Parse does even more than you want, so there's no point in this. >2. I've heard that 'parse actually supports a superset of regular >expressions. Is this true? That's right. Parse can be used for context-free languages, which is a superset of regular languages. This is necessary, for example, to match strings with balanced parentheses. This is easy with Parse: >> b: ["(" b ")" | ""] == ["(" b ")" | ""] >> parse "((()))" b == true >> parse "()))" b == false But it's simply not possible with regular expressions. (Yes, I've heard that they've put a hack into Perl which "fixes" this, but it really is just a hack.) >3. If there are no plans to build regular expressions into the language, >would anyone be interested if someone were to develop a regular expression >capability? What's the point? It's so dead easy converting regular expressions into the BNF-like notation that Parse uses: "abc" becomes ["abc"] [abc] becomes ["a" | "b" | "c"], or you could make a bitset for this: abc-set: charset "abc" and then the Parse block becomes [abc-set]. The Charset function can, as far as I know, cope with all the features you can normally use in a character class definition, such as: abc-set: charset [#"a" - #"c"] and with the Complement function, you can create negated character classes as well, here to create the charset corresponding to [^abc]: complement charset "abc" Let a, b be regular expressions and r, s be the corresponding Parse blocks: a* becomes [any r] a+ becomes [some r] (a|b) becomes [r | s] ab becomes [r s] At least, that's the way it should be, I guess. Though, I'm not that familiar with the way they write regular expressions in e.g. Python and Perl. I've only used regular expressions in (f)lex and for theoretical usage. >4. If there is interest, what are your thoughts on it? How would it work, >look, etc? >maybe there could be a 'regex word, or possibly have it as a refinement to >'parse, as in parse/regex. (Is it even possible to add a refinement to a >native word?) >a skeleton of 'regex might look like this: >regex: make function! [ >string [string!] "The string to run the regex on" >pattern [string! block!] "The regular expression pattern" >] >[ >;do matching >] >'regex could return a block of values for patterns that have parentheses in >them, so with >string: "Hello, World. This is your captain speaking." >regex string "(H.+),.+(c.+) " >regex returns ["Hello" "captain"] Shouldn't it also be possible to see if the regular expression does not match the whole string, like in the above (where "speaking" is not matched)? >Anyway, these are just some quick sketches. Does anyone think it would be >worth it were someone to develop it? Would it be so slow that it would be >useless? Does anyone have any other thoughts they want to share? Please don't get me wrong, but I don't think it's worth it. Better try to learn Parse instead: It's very easy to use, and Parse blocks are more easy to read than spooky regular expressions. Besides the added power that you'll soon find yourself using. >Also, quick question about 'parse... does 'parse ever return a true value >for anything other than simple splitting? If you do more than simple splitting, it returns true or false, depending on whether or not your Parse block has matched the input string. (See the balanced parentheses example above.) If you want results, you can copy from the input string from inside the parse block. Have a look at the REBOL manual (on the web site) on how to do this and a lot more. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] emacs mode Re:(3)
Hi Sterling, 27-Dec-1999 you wrote: >It does nothing? Even if you load up a script and then do >'Meta-X rebol-mode'? Odd. Yup. And I've got witnesses ;-) I used GNU EMacs, but I don't remember the exact version. Somebody once said that the REBOL mode didn't do as much in GNU EMacs as in XEMacs (or whatever it was...), but that it did nothing at all (at least as far as I and a friend saw it) was a bit of a surprise. However, I cannot rule out the possibility that there _was_ something changed in EMacs, I just didn't notice it. And please don't rule out the possibility that I'm a jerk concerning Unix-related stuff in general. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] emacs mode Re:
Hi , 24-Dec-1999 you wrote: >hey everybody, >Hope you had a nice Christmas (I did!) >can someone refresh my memory about the availability of a REBOL mode for >emacs? would be nice if there was one, and I seem to recall some >mention... Something like http://www.rebol.org/rebol.el However, it does nothing (zero, zip) to the installation of EMacs at my university, so I wish you better luck. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] REBOL Trivia :) Re:(3)
Hi , 7-Dec-1999 you wrote: >Sorry, couldn't resist, but there was at least one more name before >LAVA, and it starts with an M... >But I am not sure I can divulge it... LAVA was just one phase in Carl's big plan. IIRC, Carl wanted to end up with a new kind of operating system, built with LAVA as the foundation - as he said, being able to communicate is a basic need, and only when this is possible can the other layers be added. I don't remember which part of the big plan was called Magma, but one of them was. I guess you can all see the theme of these names, and I kinda like it - seeing the current computer situation as a grey, boring, and unproductive thing: A rock. Along comes Carl's plan, which partially melts the rock in order to create something new, something much more dynamic. (Oh, do behave, this is probably enough dumb philosophy from my side for the next couple of months!...) Of course, REBOL has had other predecessors too, one of them being a multimedia language that Carl (and some others?) made at Viscorp. Though, I do not know what this one was called. Anyway, it would be really interesting to know if Carl still has those plans about a new OS kinda thing, or whether REBOL will be "just" a platform-independent programming system. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] [ato] [Daytona-Ann] Changes for the year 2000 (fwd) (fwd)
Hi! Well, I just had to forward this forwarded message... and to start of with: I definitely do not agree with David Rey that this is bad news! So, REBOL Technologies has got a new employee. For those who do not know who Holger Kruse is, I can tell you that I believe he is one of the coolest programmers on this planet. Have a look at www.nordicglobal.com and have a glance. Hurrah! Hello, *** Begin of forwarded message *** Date: 02-Dec-99 22:48:14 From: David Rey <[EMAIL PROTECTED]> Subject: [ato] [Daytona-Ann] Changes for the year 2000 (fwd) --- Forwarded message follows --- Hi, Unless today is Fools Day in FL, prepare yourself to read some bad news... *** Message forwarded *** *From*:Holger Kruse <[EMAIL PROTECTED]> *To*: [EMAIL PROTECTED] <[EMAIL PROTECTED]> *Date*:thursday 02-dec-99, 19:13:17 (GMT+0100) *Subject*: [Daytona-Ann] Changes for the year 2000 *** Begin of forwarded message *** There will be some changes at Nordic Global for the year 2000. In a nutshell: - Starting January 1st, 2000, I will work full time for REBOL Technologies, and will therefore not be able to spend as much time on Amiga software development as I used to. - I will move from Florida to California, which may cause some service outages of my Internet presence and delays during the months of December '99 and January 2000. - All of my Amiga software is still supported, except for Daytona, which unfortunately has to be cancelled, because of lack of time and a conflict of interest regarding Java and REBOL. I will try to find a way to release the current snapshot or have development continue by others in some way though. For the full announcements please see http://www.nordicglobal.com/ng2000.txt and http://www.nordicglobal.com/ngoutages.txt Holger Kruse [EMAIL PROTECTED] -- To unsubscribe send "unsubscribe daytona-announce-ml" to "[EMAIL PROTECTED]". For help on list commands send "help" to "[EMAIL PROTECTED]". *** End of forwarded message *** -- David Rey Team *AMIGA* Worldwide - C.U.A.E. - S.A.U.G. - A.T.O. - CasaAMIGA PGP key available on request ** ATO Mailing List ** Questions, Suggestions or Problems? ** Our homepage is http://ato.vapor.com/ato/ ** or ask your local admin or [EMAIL PROTECTED] *** End of forwarded message *** Kind regards
[REBOL] Re: Tail recursion in REBOL 2.0
Hello John On 27-Nov-99, you wrote: >>> If I've comprehended that correctly, a good language should >>> eliminate tail >>> recursion and replace it by iteration. This is a feature of most >>> CommonLISP systems and Scheme. > >> Well, that would probably require some static analysis of the >> REBOL program, something which is probably not part of the >> philosophy of the REBOL interpreter. (Am I mistaken?) > > You don't need to do static analysis, you just have to notice > that the current function is calling another function, and that > the result of that call is only going to be returned, i.e., you > are making a function call followed immediately by a return. > If you replace the CALL/RET pair with a JMP, and adjust the stack, > you should be all set. That's definitely what I would consider static analysis. Additionally, remember that in REBOL you can define 'return to do so much else than simply return a value (in fact, 'return is just a value like everything else, and thus can be redefined), so you will never know what the value of 'return will be just after returning from a function call. Though you should be able to remove tail recursion in the specific example of return some-function arg1 arg2 ... argn But using 'return isn't the only means of returning a value. Consider this (untested, though): fac1: func [n] [ either (n > 1) [return n * fac1 n - 1] [return 1] ] fac2: func [n] [ either (n > 1) [n * fac2 n - 1] [1] ] fac1 and fac2 are semantically identical. But how would you dynamically see that you should eliminate tail recursion in fac2? Sure, you could do it by looking forward some bytes (seeing that there isn't anything more to do in the body of the function), but if you should do this properly, program execution will probably be slowed down. Regards Ole
[REBOL] Re: Encouraging functional programming Re:(2)
Hello [EMAIL PROTECTED] On 26-Nov-99, you wrote: >> Was it Ingo who mentioned that tail recursion had been removed from the >> current REBOL implementation and asked when it would be returned? That >> should take care of the stack overflow problem. The problem is a result >> of an incomplete implementation of REBOL (missing tail recursion) and I >> don't think it makes a good argument regarding the degree to which REBOL >> supports functional programming. >> >> Elan > > If I've comprehended that correctly, a good language should eliminate tail > recursion and replace it by iteration. This is a feature of most > CommonLISP systems and Scheme. Well, that would probably require some static analysis of the REBOL program, something which is probably not part of the philosophy of the REBOL interpreter. (Am I mistaken?) > After all, recursion is more expensive than iteration and should be > avoided whenever possible, and its always possible if the function calls > are proper tail-recursive. Except when it makes the code more readable, thus more maintainable. Though, taking a step back and comparing procedural and functional programming, I think the main point is that procedural programming is nice seen from the user's perspective (as it will almost certainly result in faster and less memory-hungry programs), while functional programming is nice seen from a programmer's perspective (as it makes aestethically beautiful code). Though, users will of course benefit from a functional approach if it means less buggy code and more maintainable code. Which it often will. But I digress. Sorry. The bottom line: I think we can all agree that tail recursion would be a nice thing to get back. Regards Ole
[REBOL] Re: Sorting and language specific chars ... Re:
Hello [EMAIL PROTECTED] On 15-Nov-99, you wrote: > Petr, > > I've forwarded your comment to Dan Ramsey, head of technical documentation > here. Now that Core documentation is getting close to re-release, we are > planning on implementing a localization project in various phases. You and > a number of others who have expressed interest for their native languages > will be contacted. > > If there are others who are interested in contributing to this project, > please send an email to Dan Ramsey ([EMAIL PROTECTED]) or to me > ([EMAIL PROTECTED]) directly. Thanks in advance, I've given up counting the number of times I've told the REBOL team about the existence of ATO: http://ato.vapor.com/ Who knows, perhaps we're willing to help? ;-)
[REBOL] "Everything is relative"... hmmm... Re:(2)
Hi , 28-Oct-1999 you wrote: >> Sorry about having that opinion - I'm not a fanatic physicist, but do know >> that the point of Einstein's relativity theory is that the speed of light is >> constant (i.e., /not/ relative), and I'm generally tired of cases of science >You are right, but that implies (in general relativity) that even >space and time are relative to the observer. There is no absolute >space or time, and even if any observer would explain the universe >with the same law, each observer would see a "different" universe. Before this discussion gets out of control (but thanks for enlightening me, anyway ;-) ), I'd like to point out that the meaning of my original mail was that I really cannot see any equivalence between Einstein's relativity theory and REBOL, other than the word "relative". (Besides, as far as I remember, "Relative Expression-Based Object Language" was made up _after_ the name "REBOL" was decided upon...) Sure, we all have our visions, but trying to sell a product from this kind of claim is not fair (nor honourable), IMHO. Anyway, I did mean the sentence "sorry about having that opinion", as I know it's probably of the deepest irrelevance to the great people at REBOL Tech. Anyway, now I've had room for my frustration ;-) BTW, if REBOL Tech. _do_ want to cite Einstein, better make it "Everything should be as simple as possible, but not simpler". That fits REBOL a lot more, I guess. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] "Everything is relative"... hmmm...
Hi! Just an advice to the REBOL homepage maintainers: Better remove this part of the page "http://www.rebol.com/whyrebol.html": "It flows from the same theory that Einstein proved: everything is relative." Why? Because it shows a great deal of ignorance. Sorry about having that opinion - I'm not a fanatic physicist, but do know that the point of Einstein's relativity theory is that the speed of light is constant (i.e., /not/ relative), and I'm generally tired of cases of science reduced to some nice-sounding crap. Please do not reduce REBOL to something that only smart people who avoid any insight will choose. Do anybody else agree with me? :-) Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] An article on rebol ... Re:
Hi , 20-Oct-1999 you wrote: [...] >Is it just based on one of older articles, to which author refers to, or >just new info? I guess this is the article that Carl denied recently (that is, the point that he's planning on selling out REBOL). Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Saving Post Data Re:(2)
Hi Sterling, 12-Oct-1999 you wrote: >Well, POST is interesting since when you start reading you don't know >how much data you're going to be getting. The post data is laid out, [...] It might very well turn out that I'm an ignorant bugger, but isn't that what you get from the "CONTENT_LENGTH" environment variable? Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] Quick and mergesort Re:
Hi Gisle, 5-Oct-1999 you wrote: >Hi, I remember a discussion about quicksort and mergesort >implementations quite a while >ago (in REBOL of course). >Does anyone still have these optimized functions lying around? What a coincidence - I've just written a bubblesort, mergesort, and quicksort routine, which I did for a tutorial I'm writing in a Danish Amiga magazine. I'll send 'em to you in private. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)
[REBOL] "Everything is relative"... hmmm... Re:(4)
Hi Elliott, 2-Nov-1999 you wrote: >> relativity theory and REBOL, other than the word "relative". (Besides, as >> far as I remember, "Relative Expression-Based Object Language" was made up >> _after_ the name "REBOL" was decided upon...) >I don't know about this, but I do remember reading about Carl and Lava >before Rebol. Was this a simple name change or a change in the whole >concept? AFAIR, Carl said something about his lawyers urging him to change the name, due to the potential name confusion with Java. I doubt that the name change had any conceptual changes for the language, but I guess only Carl can answer that question. Kind regards, -- Ole Friis <[EMAIL PROTECTED]> "Ignorance is bliss" (Cypher, The Matrix)