https://github.com/MoarVM/MoarVM/pull/528/files?diff=split
was merged so this ticket can be closed. Thank you.
-- Sinan
I think my pull request has reached the point where it should work on
others' machines, too ;-)
Please try it out:
https://github.com/MoarVM/MoarVM/pull/528/files?diff=split
See also https://github.com/MoarVM/MoarVM/issues/527
@Parrot Raiser, please see
https://github.com/perl6/nqp/issues/346#issuecomment-278090170
https://github.com/perl6/nqp/issues/346#issuecomment-278102220
https://github.com/perl6/nqp/issues/346#issuecomment-278104580
-- Sinan
I created this report by mistake when I was hastily trying to follow-up on
https://rt.perl.org/Public/Bug/Display.html?id=127925
The reply belongs there. I would appreciate it if you could merge this
ticket with the correct one.
Apologies and thank you.
# New Ticket Created by A. Sinan Unur
# Please include the string: [perl #130736]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/Ticket/Display.html?id=130736 >
The problem is caused by the fact that moar.exe uses main rather than
wmain, so it has n
A quick look at Stack Overflow suggests that Windows isn't being
terribly helpful.
On 2/7/17, Zoffix Znet via RT wrote:
> Another report in NQP repo: https://github.com/perl6/nqp/issues/346
>
> ->8--
>
> It is entirely possible that I am missing something obvious, b
Another report in NQP repo: https://github.com/perl6/nqp/issues/346
->8--
It is entirely possible that I am missing something obvious, but while trying
to figure out what happens between typing
C:\> perl6 -e "say 'yağmur'"
and getting the output
yagmur
at https://stackoverflow.com/q/36648940/1529709
Unicode handling on the Windows command line fails:
C:\Windows\System32>perl6 -e "'Я'.say"
?
Interestingly, this works:
C:\Windows\System32>perl6 -e "Buf.new(0xD0, 0xAF).decode('UTF-8').say"
Я
Seen on
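A minimal Python sketch of the two behaviours reported above, not the MoarVM code path itself. It assumes the command-line text is squeezed through a legacy 8-bit code page somewhere on the way in; cp1252 is only an illustrative choice, since the report does not name the code page.

```python
# The two bytes from the report, decoded explicitly as UTF-8.
raw = bytes([0xD0, 0xAF])
decoded = raw.decode("utf-8")
print(f"U+{ord(decoded):04X}")  # U+042F, CYRILLIC CAPITAL LETTER YA

# Assumption: the argument passes through a legacy code page that has no
# slot for this character. cp1252 is a stand-in, not taken from the report.
lossy = decoded.encode("cp1252", errors="replace")
print(lossy)  # b'?'
```

The explicit decode works because the bytes never pass through the lossy conversion; the literal on the command line does.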
Dan Sugalski writes:
: I'm not sure that raw's the right word, given that the data is really
: Unicode. It's not raw in the sense that a JPEG image or executable is raw data.
I'm suggesting it might be raw in that very sense, and simultaneously
be perfectly valid "internal" Unicode. Otherwise y
At 11:06 AM 3/27/2001 -0800, Larry Wall wrote:
>Dan Sugalski writes:
>: At 07:21 AM 3/27/2001 -0800, Larry Wall wrote:
>: >Dan Sugalski writes:
>: >Assume that in practice most of the normalization will be done by the
>: >input disciplines. Then we might have a pragma that says to try to
>: >enfo
At 01:09 PM 3/27/2001 -0800, Hong Zhang wrote:
> > The only problem with that is it means we'll be potentially altering the
> > data as it comes in, which leads back to the problem of input and output
> > files not matching for simple filter programs. (Plus it means we spend CPU
> > cycles alterin
> The only problem with that is it means we'll be potentially altering the
> data as it comes in, which leads back to the problem of input and output
> files not matching for simple filter programs. (Plus it means we spend CPU
> cycles altering data that we might not actually need to)
>
I don't t
Dan Sugalski writes:
: At 07:21 AM 3/27/2001 -0800, Larry Wall wrote:
: >Dan Sugalski writes:
: >Assume that in practice most of the normalization will be done by the
: >input disciplines. Then we might have a pragma that says to try to
: >enforce level 1, level 2, level 3 if your data doesn't ma
On Tue, Mar 27, 2001 at 12:38:23PM -0500, Dan Sugalski wrote:
> I'm afraid this isn't what I'd normally think of--ord to me returns the
> integer value of the first code point in the string. That does mean that A
> is different for ASCII and EBCDIC, but that's just One Of Those Things.
My perso
At 08:37 PM 3/26/2001 +, [EMAIL PROTECTED] wrote:
>Damien Neil <[EMAIL PROTECTED]> writes:
> >> >So $c = chr(ord($c)) could change $c? That seems odd.
> >>
> >> It changes its _representation_ (e.g. from 0x45,ASCII to 0xC1,EBCDIC)
> >> but not its "fundamental" 'LATIN CAPITAL LETTER A'-ness.
At 07:21 AM 3/27/2001 -0800, Larry Wall wrote:
>Dan Sugalski writes:
>: Fair enough. I think there are some cases where there's a base/combining
>: pair of codepoints that don't map to a single combined-character code
>: point. Not matching on a glyph boundary could make things really odd, but
>:
Garrett Goebel writes:
: Someone please clue me in. A pointer to an RFC which defines the use of
: colons in Perl6 among other things would help.
Heh. If you read the RFCs, you'll discover one of the basic rules of
language redesign: everybody wants the colon. And it never seems to
occur to peo
Dan Sugalski writes:
: Fair enough. I think there are some cases where there's a base/combining
: pair of codepoints that don't map to a single combined-character code
: point. Not matching on a glyph boundary could make things really odd, but
: I'd hate to have the checking code on by default,
From: Damien Neil [mailto:[EMAIL PROTECTED]]
> On Mon, Mar 26, 2001 at 08:37:05PM +, [EMAIL PROTECTED] wrote:
> > >
> > > If ord is dependent on the encoding of the string it gets, as Dan
> > > was saying, then ord($e) is 0x81,
> >
> > It could still be 0x81 (from ebcdic) with the encodin
On Mon, Mar 26, 2001 at 08:37:05PM +, [EMAIL PROTECTED] wrote:
> >If ord is dependent on the encoding of the string it gets, as Dan
> >was saying, then ord($e) is 0x81,
>
> It could still be 0x81 (from ebcdic) with the encoding carried
> along with the _number_ if we thought that worth t
Damien Neil <[EMAIL PROTECTED]> writes:
>> >So $c = chr(ord($c)) could change $c? That seems odd.
>>
>> It changes its _representation_ (e.g. from 0x45,ASCII to 0xC1,EBCDIC)
>> but not its "fundamental" 'LATIN CAPITAL LETTER A'-ness.
>> Then of course someone will want it to be the number 0x45 a
On Mon, Mar 26, 2001 at 06:16:00PM +, [EMAIL PROTECTED] wrote:
> Damien Neil <[EMAIL PROTECTED]> writes:
> >On Mon, Mar 26, 2001 at 11:32:46AM -0500, Dan Sugalski wrote:
> >> At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
> >> >So the results of ord are dependent on a global setting for "curr
Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>For length, I'd as soon it returned the number of code points, but glyphs
>and bytes are also valid return values.
And that may be where it belongs - at the language level
chars($s) == 120
bytes($s) == 480
glyphs($s) == 360
length($
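The three counts proposed above can be sketched in Python. The function names are hypothetical, and the glyph count is only a rough approximation (code points that are not combining marks), not a full grapheme-cluster algorithm.

```python
import unicodedata

def count_chars(s):
    """Number of code points."""
    return len(s)

def count_bytes(s, encoding="utf-8"):
    """Number of bytes in the given encoding (UTF-8 is an assumption)."""
    return len(s.encode(encoding))

def count_glyphs(s):
    """Approximate glyph count: skip combining marks."""
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

# 'e' + COMBINING ACUTE ACCENT + 'a'
sample = "e\u0301a"
print(count_chars(sample), count_bytes(sample), count_glyphs(sample))  # 3 4 2
```

All three are defensible answers to "how long is this string", which is the argument for exposing them separately at the language level.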
Damien Neil <[EMAIL PROTECTED]> writes:
>On Mon, Mar 26, 2001 at 11:32:46AM -0500, Dan Sugalski wrote:
>> At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
>> >So the results of ord are dependent on a global setting for "current
>> >character set" or some such, not on the encoding of the string that
At 02:52 AM 3/25/2001 -0500, Philip Newton wrote:
>On Fri, 23 Mar 2001, Dan Sugalski wrote:
>
> > At 02:31 PM 3/23/2001 -0500, Bryan C. Warnock wrote:
> > >On Friday 23 March 2001 14:18, Dan Sugalski wrote:
> > > > At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > > > >We need the character equiv
Dan Sugalski <[EMAIL PROTECTED]> writes:
>>This is the main pain with 5.7.*'s EBCDIC scheme - making
>>
>>ord('A') == 193
>>
>>true :-/
>
>That would be true if EBCDIC was the default encoding, otherwise false.
But what about
our $var;
{
use encoding 'US-ascii';
$var = 'A';
}
{use Encoding 'i
At 04:34 PM 3/24/2001 -0800, Dave Storrs wrote:
> I'll just toss my 0.01 cents in...my thought here is that this
>thread has now tied up a lot of cycles from a lot of very smart, very
>experienced people without resulting in an answer that is clearly The
>Right Thing. Whatever we do, ther
On Mon, Mar 26, 2001 at 11:32:46AM -0500, Dan Sugalski wrote:
> At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
> >So the results of ord are dependent on a global setting for "current
> >character set" or some such, not on the encoding of the string that
> >is passed to it?
>
> Nope, ord is depen
At 05:45 PM 3/26/2001 +, [EMAIL PROTECTED] wrote:
>Dan Sugalski <[EMAIL PROTECTED]> writes:
> >At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
> >>So the results of ord are dependent on a global setting for "current
> >>character set" or some such, not on the encoding of the string that
> >>is
At 11:42 AM 3/26/2001 -0600, Garrett Goebel wrote:
>From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> > At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
> > > So the results of ord are dependent on a global setting for
> > > "current character set" or some such, not on the encoding
> > > of the strin
Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
>>So the results of ord are dependent on a global setting for "current
>>character set" or some such, not on the encoding of the string that
>>is passed to it?
>
>Nope, ord is dependent on the string it gets,
From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
> > So the results of ord are dependent on a global setting for
> > "current character set" or some such, not on the encoding
> > of the string that is passed to it?
>
> Nope, ord is dependent on the s
At 05:09 PM 3/23/2001 -0800, Damien Neil wrote:
>So the results of ord are dependent on a global setting for "current
>character set" or some such, not on the encoding of the string that
>is passed to it?
Nope, ord is dependent on the string it gets, as those strings know what
their encoding is.
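A small Python sketch of an encoding-dependent ord, to make the positions in this thread concrete. The helper ord_in is hypothetical, and cp037 is used as one representative EBCDIC code page; the thread does not say which EBCDIC variant is meant.

```python
def ord_in(char, encoding):
    """Numeric value of a single character as stored in `encoding`."""
    encoded = char.encode(encoding)
    if len(encoded) != 1:
        raise ValueError("character is not a single byte in " + encoding)
    return encoded[0]

print(hex(ord_in("A", "ascii")))  # 0x41
print(hex(ord_in("A", "cp037")))  # 0xc1 (193 decimal)

# A code-point based ord ignores the storage encoding entirely.
print(hex(ord("A")))  # 0x41

# Round trip: the byte value only means 'A' together with its encoding.
print(bytes([0xC1]).decode("cp037"))  # A
```

With a per-string encoding, chr(ord($c)) can only round-trip if the number carries its encoding along or a default is agreed on, which is the concern raised elsewhere in the thread.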
At 09:09 AM 3/26/2001 -0600, Garrett Goebel wrote:
>Someone please clue me in. A pointer to an RFC which defines the use of
>colons in Perl6 among other things would help.
>
>Why not have subsequent uses of : on the same variable name perform a cast?
>Or, perhaps better, return the cast value?
From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> At 11:09 PM 3/23/2001 +, Simon Cozens wrote:
> >
> > For instance, chr() will produce Unicode codepoints. But
> > you can pretend that they're ASCII codepoints, it's only
> > the EBCDIC folk that'll get hurt. I hope and suspect
> > there'll be an
Simon Cozens wrote:
[...]
> I'm just not sure it's fair on Old World hackers. Will there be a way to stop
> Perl upgrading stuff to Unicode on the way in?
and I'm probably not the only Old World hacker that would
prefer a build option to simply eliminate Unicode support altogether...
On Fri, 23 Mar 2001, Dan Sugalski wrote:
> At 02:31 PM 3/23/2001 -0500, Bryan C. Warnock wrote:
> >On Friday 23 March 2001 14:18, Dan Sugalski wrote:
> > > At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > > >We need the character equivalence construct, such as [[=a=]], which
> > > >matches "a",
On Fri, 23 Mar 2001, Dan Sugalski wrote:
> At 11:41 PM 3/22/2001 +, Nicholas Clark wrote:
> >On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
>
> hadn't thought of. If we do, then something as simple as this:
>
>while () {
> $count++ if /bar/;
> print OU
Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>On Fri, Mar 23, 2001 at 02:50:05PM -0500, Dan Sugalski wrote:
>> At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote:
>> > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
>> >
>> > DS> U doesn't really signal "glyph" to me, but we are sort of limi
On Fri, Mar 23, 2001 at 06:31:13PM -0500, Dan Sugalski wrote:
> >Err, perhaps I'm being dumb here - but surely $foo and $bar aren't
> >typed strings, they're just numbers (or strings which match /^\d+$/) ???
>
> D'oh! Too much blood in my caffeine stream. Yeah, I was thinking of ord.
>
> chr will
On Fri, Mar 23, 2001 at 06:16:58PM -0500, Dan Sugalski wrote:
> At 11:09 PM 3/23/2001 +, Simon Cozens wrote:
> >For instance, chr() will produce Unicode codepoints. But you can pretend that
> >they're ASCII codepoints, it's only the EBCDIC folk that'll get hurt. I hope
> >and suspect there'll
At 11:26 PM 3/23/2001 +, Dave Mitchell wrote:
>Dan Sugalski <[EMAIL PROTECTED]> doodled:
> > At 11:09 PM 3/23/2001 +, Simon Cozens wrote:
> > >For instance, chr() will produce Unicode codepoints. But you can pretend that
> > >they're ASCII codepoints, it's only the EBCDIC folk that'll g
Dan Sugalski <[EMAIL PROTECTED]> doodled:
> At 11:09 PM 3/23/2001 +, Simon Cozens wrote:
> >For instance, chr() will produce Unicode codepoints. But you can pretend that
> >they're ASCII codepoints, it's only the EBCDIC folk that'll get hurt. I hope
> >and suspect there'll be an equivalent of
At 11:09 PM 3/23/2001 +, Simon Cozens wrote:
>For instance, chr() will produce Unicode codepoints. But you can pretend that
>they're ASCII codepoints, it's only the EBCDIC folk that'll get hurt. I hope
>and suspect there'll be an equivalent of "use bytes" which makes chr(256)
>either blow up o
On Fri, Mar 23, 2001 at 03:15:41PM -0800, Brad Hughes wrote:
> Simon Cozens wrote:
> [...]
> > I'm just not sure it's fair on Old World hackers. Will there be a way to stop
> > Perl upgrading stuff to Unicode on the way in?
>
> and I'm probably not the only Old World hacker that would
> prefe
On Fri, Mar 23, 2001 at 05:56:19PM -0500, Dan Sugalski wrote:
> Nah, they only apply to data that perl's tagged as Unicode, either because
> its input stream is marked that way or because the program explicitly
> converted the data.
Oh, colour me dull. I read
4) Data converted to Unicode (
At 10:48 PM 3/23/2001 +, Simon Cozens wrote:
>On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
> > Yes, I realize that point 5 may result in someone getting a meaningless
> > Unicode string. Too bad--it is *not* the place of a programming
> language to
> > enforce validity on dat
On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
> Yes, I realize that point 5 may result in someone getting a meaningless
> Unicode string. Too bad--it is *not* the place of a programming language to
> enforce validity on data. That's the programmer's job.
But points 4 and 5 do en
At 01:07 PM 3/23/2001 -0800, Larry Wall wrote:
>Jarkko Hietaniemi writes:
>: *cough* \C *is* taken.
>:
>: > >also \U has a meaning in double quotish strings.
>:
>: "\Uindeed."
>
>Bear in mind we are redesigning the language. If there's a botch we
>can think about fixing it.
>
>Though maybe not on
Jarkko Hietaniemi writes:
: *cough* \C *is* taken.
:
: > >also \U has a meaning in double quotish strings.
:
: "\Uindeed."
Bear in mind we are redesigning the language. If there's a botch we
can think about fixing it.
Though maybe not on -internals... :-)
Larry
At 08:14 PM 3/23/2001 +, Nicholas Clark wrote:
>On Fri, Mar 23, 2001 at 03:08:35PM -0500, Dan Sugalski wrote:
> > I'm half tempted, since this is a Unicode-only feature, to use a non-ASCII
> > character.
> >
> > \SMILEY FACE, perhaps?
>
>that makes it kind of hard to edit perl scripts that use
On Fri, Mar 23, 2001 at 03:08:35PM -0500, Dan Sugalski wrote:
> I'm half tempted, since this is a Unicode-only feature, to use a non-ASCII
> character.
>
> \SMILEY FACE, perhaps?
that makes it kind of hard to edit perl scripts that use this feature on
any good old fashioned 8 bit xterm.
Let alo
On Friday 23 March 2001 14:48, you wrote
> In Unicode, there's theoretically no locale. Theoretically...
Well, yes, but Unicode makes no pretenses about encoding the world's
languages - just the various symbols used by the world's languages.
If you want to orient Perl so that it remains(?) data-
At 02:06 PM 3/23/2001 -0600, Jarkko Hietaniemi wrote:
>On Fri, Mar 23, 2001 at 02:50:05PM -0500, Dan Sugalski wrote:
> > At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote:
> > > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
> > >
> > > DS> U doesn't really signal "glyph" to me, but we are
At 11:52 AM 3/23/2001 -0800, Hong Zhang wrote:
> > >I recommend to use 'u' flag, which indicates all operations are performed
> > >against unicode grapheme/glyph. By default re is performed on codepoint.
> >
> > U doesn't really signal "glyph" to me, but we are sort of limited in what
> > we have
On Fri, Mar 23, 2001 at 02:50:05PM -0500, Dan Sugalski wrote:
> At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote:
> > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
> >
> > DS> U doesn't really signal "glyph" to me, but we are sort of limited
> > DS> in what we have left. We still need a
> > >We need the character equivalence construct, such as [[=a=]], which
> > >matches "a", "A ACUTE".
> >
> > Yeah, we really need a big list of these. PDD anyone?
> >
>
> But surely this is a locale issue, and not an encoding one? Not every
> language recognizes the same character equivalences
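One locale-independent reading of the [[=a=]] idea can be sketched in Python: two characters are treated as equivalent when their canonical decompositions share the same case-folded base letter. The helper names are hypothetical, and this deliberately ignores the locale objection raised above.

```python
import unicodedata

def base_letter(ch):
    """First code point of the canonical decomposition, case-folded."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed[0].casefold()

def equivalent(a, b):
    """True if both characters reduce to the same base letter."""
    return base_letter(a) == base_letter(b)

print(equivalent("a", "\u00c1"))  # True: 'a' vs LATIN CAPITAL LETTER A WITH ACUTE
print(equivalent("a", "b"))       # False
```

A locale-aware version would need a per-language table instead of a single rule, which is why the question of whether this is an encoding or a locale issue matters.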
> >I recommend to use 'u' flag, which indicates all operations are performed
> >against unicode grapheme/glyph. By default re is performed on codepoint.
>
> U doesn't really signal "glyph" to me, but we are sort of limited in what
> we have left. We still need a zero-width assertion for glyph boun
At 02:31 PM 3/23/2001 -0500, Bryan C. Warnock wrote:
>On Friday 23 March 2001 14:18, Dan Sugalski wrote:
> > At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > > > 6) There will be a glyph boundary/non-glyph boundary pair of regex
> > > > characters to match the word/non-word boundary ones we alre
At 02:27 PM 3/23/2001 -0500, Uri Guttman wrote:
> > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
>
> DS> U doesn't really signal "glyph" to me, but we are sort of limited
> DS> in what we have left. We still need a zero-width assertion for
> DS> glyph boundary within regexes themselv
On Friday 23 March 2001 14:18, Dan Sugalski wrote:
> At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > > 6) There will be a glyph boundary/non-glyph boundary pair of regex
> > > characters to match the word/non-word boundary ones we already have. (While
> > > I'd personally like \g and
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
DS> U doesn't really signal "glyph" to me, but we are sort of limited
DS> in what we have left. We still need a zero-width assertion for
DS> glyph boundary within regexes themselves.
how about \C? it doesn't seem to be taken and would
At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > 6) There will be a glyph boundary/non-glyph boundary pair of regex
> > characters to match the word/non-word boundary ones we already have. (While
> > I'd personally like \g and \G, that won't work as \G is already taken)
> >
> > I also realize t
At 01:26 PM 3/23/2001 -0500, NeonEdge wrote:
>Dan Sugalski wrote:
> >If we do, then something as simple as this:
> >
> > while () {
> > $count++ if /bar/;
> > print OUT $_;
> > }
> >
> >would potentially result in the output file being rather different from the
> >input file. E
At 10:56 AM 3/23/2001 -0800, Damien Neil wrote:
>On Fri, Mar 23, 2001 at 12:38:04PM -0500, Dan Sugalski wrote:
> >while () {
> > $count++ if /bar/;
> > print OUT $_;
> >}
>
>I would find it surprising for this to have different output
>than input. Other people's mileage m
At 11:05 AM 3/23/2001 -0600, Garrett Goebel wrote:
>From: Nicholas Clark [mailto:[EMAIL PROTECTED]]
> >
> > On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
> > > 1) All Unicode data perl does regular expressions against
> > >will be in Normalization Form C, except for...
> > > 2
On Fri, Mar 23, 2001 at 12:38:04PM -0500, Dan Sugalski wrote:
>while () {
> $count++ if /bar/;
> print OUT $_;
>}
I would find it surprising for this to have different output
than input. Other people's mileage may vary.
In general, however, I think I would prefer to be
Dan Sugalski wrote:
>If we do, then something as simple as this:
>
> while () {
> $count++ if /bar/;
> print OUT $_;
> }
>
>would potentially result in the output file being rather different from the
>input file. Equivalent, yes, but different. Whether that's bad or not is an
>
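The filter concern can be reproduced with a short Python sketch. It assumes normalization to NFC happens on input, which is only one of the options discussed; filter_lines is a hypothetical stand-in for the while loop quoted above.

```python
import unicodedata

def filter_lines(lines, normalize_input):
    """Count lines containing 'bar' and pass every line through."""
    count = 0
    out = []
    for line in lines:
        if normalize_input:
            # Assumption: the input discipline normalizes to NFC.
            line = unicodedata.normalize("NFC", line)
        if "bar" in line:
            count += 1
        out.append(line)
    return count, out

# First line uses 'e' + COMBINING ACUTE ACCENT (decomposed form).
data = ["cafe\u0301 bar\n", "plain\n"]

count, out = filter_lines(data, normalize_input=True)
print(count)                      # 1
print(out == data)                # False: output differs from input
print([unicodedata.normalize("NFC", s) for s in data] == out)  # True: still equivalent

_, untouched = filter_lines(data, normalize_input=False)
print(untouched == data)          # True
```

The output is canonically equivalent to the input but not byte-identical, which is the "equivalent, yes, but different" case.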
At 11:41 PM 3/22/2001 +, Nicholas Clark wrote:
>On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
> > 1) All Unicode data perl does regular expressions against will be in
> > Normalization Form C, except for...
> > 2) Regexes tagged to run against a decomposed form will instead be
From: Nicholas Clark [mailto:[EMAIL PROTECTED]]
>
> On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
> > 1) All Unicode data perl does regular expressions against
> >will be in Normalization Form C, except for...
> > 2) Regexes tagged to run against a decomposed form will
> >
On Thu, Mar 22, 2001 at 04:10:28PM -0500, Dan Sugalski wrote:
> 1) All Unicode data perl does regular expressions against will be in
> Normalization Form C, except for...
> 2) Regexes tagged to run against a decomposed form will instead be run
> against data in Normalization Form D. (What the ta
> 6) There will be a glyph boundary/non-glyph boundary pair of regex
> characters to match the word/non-word boundary ones we already have. (While
> I'd personally like \g and \G, that won't work as \G is already taken)
>
> I also realize that the decomposition flag on regexes would mean that
> s/
At the moment, I'm not particularly inclined to argue unicode. Short of
Larry handing down an edict and invoking Rule #1, the following rules will
be in effect:
1) All Unicode data perl does regular expressions against will be in
Normalization Form C, except for...
2) Regexes tagged to run aga
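Why rules 1 and 2 matter for matching can be sketched in Python, using its re and unicodedata modules as stand-ins for whatever the perl regex engine ends up doing. The sample strings are illustrative only.

```python
import re
import unicodedata

decomposed = "re\u0301sume\u0301"   # 'e' + COMBINING ACUTE ACCENT, twice
precomposed_e = "\u00e9"            # LATIN SMALL LETTER E WITH ACUTE

# Without normalization, a precomposed pattern misses decomposed data.
print(re.search(precomposed_e, decomposed))  # None

# Rule 1: run the match against data in Normalization Form C.
nfc = unicodedata.normalize("NFC", decomposed)
print(len(re.findall(precomposed_e, nfc)))  # 2

# Rule 2: a regex written against the decomposed form runs on Form D.
nfd = unicodedata.normalize("NFD", nfc)
print(len(re.findall("e\u0301", nfd)))  # 2
```

Either form works as long as pattern and data agree on it, which is what tagging the regex is meant to guarantee.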