Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Keisuke Nishida
At Thu, 22 Mar 2001 13:37:29 -0800, John Harper wrote: > > |> I've looked, a little, (and months ago at that) at the LibREP (ala > |> "sawfish") virtual machine. It's a pretty good indirect threaded VM > |> that uses techniques pioneered by Forth engines. It utilizes the GCC > |> ability to

Re: Unicode handling

2001-03-22 Thread Nicholas Clark
On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote: > 1) All Unicode data perl does regular expressions against will be in > Normalization Form C, except for... > 2) Regexes tagged to run against a decomposed form will instead be run > against data in Normalization Form D. (What the ta

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Hong Zhang
> >> What if, at the C level, you had a signal handler that sets or > >> increments a flag or counter, stuffs a struct with information about > >> the signal's context, then pushes (by "push", I mean "(cons v ls)", > >> not "(append! ls v)" 'whatever ;-) that struct on a stack... >

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Karl M. Hegbloom
> "Hong" == Hong Zhang <[EMAIL PROTECTED]> writes: >> What if, at the C level, you had a signal handler that sets or >> increments a flag or counter, stuffs a struct with information about >> the signal's context, then pushes (by "push", I mean "(cons v ls)", >> not "(append!

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread John Harper
Hong Zhang writes: |> I've looked, a little, (and months ago at that) at the LibREP (ala |> "sawfish") virtual machine. It's a pretty good indirect threaded VM |> that uses techniques pioneered by Forth engines. It utilizes the GCC |> ability to take the address of a label to build a jump ta

Re: Unicode handling

2001-03-22 Thread Hong Zhang
> 6) There will be a glyph boundary/non-glyph boundary pair of regex > characters to match the word/non-word boundary ones we already have. (While > I'd personally like \g and \G, that won't work as \G is already taken) > > I also realize that the decomposition flag on regexes would mean that > s/

Unicode handling

2001-03-22 Thread Dan Sugalski
At the moment, I'm not particularly inclined to argue unicode. Short of Larry handing down an edict and invoking Rule #1, the following rules will be in effect: 1) All Unicode data perl does regular expressions against will be in Normalization Form C, except for... 2) Regexes tagged to run aga

Re: Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Hong Zhang
Here is some of my experience with HotSpot for Linux port. > I've read, in the glibc info manuals, that a similar situation > exists in C programming -- you don't want to do a lot inside the > signal handler; just set a flag and return, then check that flag from > your main loop, and run a "
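
The pattern quoted above (do almost nothing in the handler, then act on the signal from the main loop) could be sketched roughly as follows. This is only an illustration in Python, not code from the thread: the names `pending`, `record_signal` and `run` are hypothetical, and the "opcodes" are plain callables standing in for a real byte code loop.

```python
import signal

pending = []  # stack of signal numbers recorded by the handler (hypothetical name)


def record_signal(signum, frame):
    # The handler does the minimum: push the signal number and return.
    pending.append(signum)


def run(ops):
    """Run each op, then drain the pending stack at a safe point."""
    handled = []
    for op in ops:
        op()
        while pending:  # safe point between two "opcodes"
            handled.append(pending.pop())
    return handled


if __name__ == "__main__":
    previous = signal.signal(signal.SIGINT, record_signal)
    try:
        # The first op raises SIGINT in this process; the second is a no-op.
        ops = [lambda: signal.raise_signal(signal.SIGINT), lambda: None]
        print(run(ops))
    finally:
        signal.signal(signal.SIGINT, previous)
```

Because the stack is drained only between ops, whatever runs in response to a signal never interrupts an op halfway through, which seems to be the property the thread is after.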

Re: PDD 4: Internal data types

2001-03-22 Thread Simon Cozens
On Thu, Mar 22, 2001 at 11:14:53AM -0800, Hong Zhang wrote: > Please not fight on wording. For most encodings I know of, the concept of > normalization does not even exist. *boggle*. I don't think we're talking about the same Unicode. > What is your definition of normalization? Well, either ca

Re: PDD 4: Internal data types

2001-03-22 Thread Buddha Buck
At 11:14 AM 03-22-2001 -0800, Hong Zhang wrote: >Please not fight on wording. For most encodings I know of, the concept of >normalization does not even exist. What is your definition of normalization? To me, the usual definition of "normalization" is conversion of something into a standard form

Idea for safe signal handling by a byte code interpreter

2001-03-22 Thread Karl M. Hegbloom
I've not researched this at all... perhaps it's a "known" way of doing things and there is research writing out there already, etc... I've not even looked at this point. I have about 30 minutes to outline this and bounce it off of you all this morning. 8-) I was reading Lincoln D. Stein's

Re: PDD 4: Internal data types

2001-03-22 Thread Hong Zhang
> > The normalization has something to do with encoding. If you compare two > > strings with the same encoding, of course you don't have to care about it. > > Of course you do. Think about it. I said "you don't have to". You can use "==" for codepoint comparison, and something like "Normalizer.co
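
The difference between a raw codepoint comparison and a normalized one, which this exchange turns on, might be illustrated like this. The sketch uses Python's standard `unicodedata` module as a stand-in for the normalizer mentioned in the message; `same_text` is a hypothetical helper name, and the sample character is an assumption chosen for illustration, not taken from the thread.

```python
import unicodedata

composed = "\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a, b, form="NFC"):
    """Compare two strings after bringing both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
```

Here `composed == decomposed` is False even though both strings are in the same encoding, while `same_text(composed, decomposed)` is True under either NFC or NFD.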

Re: PDD 4: Internal data types

2001-03-22 Thread Simon Cozens
On Tue, Mar 06, 2001 at 01:21:20PM -0800, Hong Zhang wrote: > The normalization has something to do with encoding. If you compare two > strings with the same encoding, of course you don't have to care about it. Of course you do. Think about it. If I'm comparing "(Greek letter lower case alpha wi