question regarding rules and bytes vs characters

Ph. Marek Mon, 31 May 2004 22:56:51 -0700

Hello everybody,

I'm about to teach myself Perl 6 (after using Perl 5 for some time).


One of my first questions deals with regexes.


I'd like to parse data of the form
        Len: 15\n
        (15 bytes data)\n
        Len: 5\n
        (5 bytes data)\n
        \n
        OtherTag: some value here\n
and so on, where the data can (and will) be binary.
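
To make the format concrete, here is a minimal sketch in Python (not Perl 6) of the length-prefixed parsing described above. The function name `parse_records` and the choice to stop at the first line that does not start with "Len: " are assumptions made for illustration only; the point is that the payload is taken by byte count, so it may itself contain newlines.

```python
def parse_records(buf: bytes):
    """Yield each payload from a buffer of 'Len: N\\n<N bytes>\\n' records."""
    prefix = b"Len: "
    pos = 0
    # Stop at the first line that is not a "Len: " tag (e.g. the blank line).
    while buf.startswith(prefix, pos):
        eol = buf.index(b"\n", pos)               # end of the "Len: N" line
        length = int(buf[pos + len(prefix):eol])  # N, parsed from ASCII digits
        start = eol + 1
        end = start + length
        # The payload is counted in bytes, never scanned for a delimiter.
        if buf[end:end + 1] != b"\n":
            raise ValueError("payload is not followed by a newline")
        yield buf[start:end]
        pos = end + 1
```

Called on a buffer holding the two records from the example, it yields one 15-byte and one 5-byte payload and leaves the "OtherTag" part untouched.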

I'd try for something like
        my $data_tag = rule {
                Len\: $len:=(\d+) \n               # one or more digits
                $data:=([:u0 .]<$len>) \n          # these are bytes
        };

Is that correct?

Furthermore, Perl 6 is said to be Unicode-ready.
So I put the :u0 modifier in the data regex; will that DWIM if I try to match
a Unicode string with that rule?


Is anything known about the internals of pattern matching, in particular
whether the hypothetical variables will hold a second copy of the matched data
and so consume double the space?
I'm asking because I imagine getting a tag like "Len: 200000000" and then
running into problems with 256MB of RAM. Matching itself shouldn't be a problem
according to Apocalypse 5 (see the section "RFC 093: Regex: Support for
incremental pattern matching"), but I might have trouble using the matched data.
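
As an illustration of the concern only, here is a sketch in Python of the difference between copying a matched span and merely referring to it. It says nothing about how Perl 6 will actually implement hypothetical variables; the helper name `payload_view` is hypothetical.

```python
def payload_view(buf, start, length):
    """Return a view of buf[start:start+length] without copying the bytes."""
    # memoryview slicing shares the underlying buffer; bytes slicing would copy.
    return memoryview(buf)[start:start + length]
```

With a view like this, a 200000000-byte payload would occupy memory once; a plain slice of the buffer would allocate a second copy of the same size.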


Thank you for any answers!


Regards,

Phil
