Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Smylers
Jon Lang writes:

> Approaching this with the notion firmly in mind that infix:<..> is
> supposed to be used for matching ranges while infix:<...> should be
> used to generate series:
> 
> With series, we want C< $LHS ... $RHS > to generate a list of items
> starting with $LHS and ending with $RHS.  If $RHS > $LHS, we want it
> to increment one step at a time; if $RHS < $LHS, we want it to
> decrement one step at a time.

Do we? I'm used to generating lists and iterating over them (in Perl 5)
with things like:

  for (1 .. $max)

where the intention is that if $max is zero, the loop doesn't execute at
all. Having the equivalent Perl 6 list generation operator, C<...>,
start counting backwards could be confusing.

Especially if Perl 6 also has a range operator, C<..>, which would Do
The Right Thing for me in this situation, and where the Perl 6 operator
that Does The Right Thing is spelt the same as the Perl 5 operator that
I'm used to; that muddles the distinction you make above about matching
ranges versus generating lists.
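
To make the concern concrete, here is a minimal sketch in Python rather
than Perl; perl5_range and series are made-up names that only model the
two behaviours being compared, not any real implementation:

```python
def perl5_range(lo, hi):
    """Ascending-only list: empty when hi < lo, like Perl 5's lo .. hi."""
    return list(range(lo, hi + 1))


def series(lo, hi):
    """Series as proposed: step toward hi in whichever direction is needed."""
    step = 1 if hi >= lo else -1
    return list(range(lo, hi + step, step))


for max_value in (3, 0):
    print(max_value, perl5_range(1, max_value), series(1, max_value))
```

With max_value set to 0 the first function yields nothing, so a loop over
it never runs, while the second yields 1 and then 0.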

Smylers
-- 
http://twitter.com/Smylers2


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Darren Duncan

Darren Duncan wrote:
> specific, the generic "eqv" operator, or "before" etc would have to be


Correction, I meant to say "cmp", not "eqv", here. -- Darren Duncan


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Darren Duncan

Aaron Sherman wrote:

> 2) The spec doesn't put this information anywhere near the definition of the
> range operator. Perhaps we can make a note? This was a source of confusion
> for me.


My impression is that a "Range" primarily defines an "interval" in terms of 2 
endpoint values such that it defines a possibly infinite set of values between 
those endpoints.


For example, 'aa'..'bb' is an infinite sized set that includes every possible 
character string that starts with the letter 'a', plus every one that starts 
with the string 'ba'.  And so, asking $anysuchstring ~~ 'aa'..'bb' is TRUE.


(Note that for ".." to work, its 2 arguments would need to be of the same type, 
so that we know which set of rules to follow.  Or to be specific, the generic 
"eqv" operator, or "before" etc would have to be defined that takes both of the 
".." arguments as its arguments.  Although this might be fuzzed a bit if the 
spec defines somewhere about automatic casting.  For example, if someone said 
'foo'..42 then I would expect that to fail.)
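
As a rough illustration of a range as an interval, here is a Python
sketch. Interval is a hypothetical class, and Python compares strings by
code point, which is only an assumption standing in for whatever
comparison rules Perl 6 would apply:

```python
class Interval:
    """Closed interval defined only by its two endpoints (hypothetical name)."""

    def __init__(self, low, high):
        # Mixed endpoint types are rejected, as suggested for 'foo'..42.
        if type(low) is not type(high):
            raise TypeError("endpoints must be of the same type")
        self.low, self.high = low, high

    def __contains__(self, value):
        # Membership is a pair of comparisons; nothing is enumerated.
        return self.low <= value <= self.high


strings = Interval("aa", "bb")
print("az" in strings)       # True
print("bazooka" in strings)  # True
print("bc" in strings)       # False
```

Because membership never enumerates anything, the set of matching strings
can be unbounded without costing any memory.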


A "Range" can also be used in a limited fashion to generate a finite list of 
values, but that is not its primary purpose, and the "..." operator does that 
job much better.



> 3) It seems that there are two competing multi-character approaches and both
> seem somewhat valid. Should we use a pragma to toggle behavior between A and
> B:
>
>  A: "aa" .. "bb" contains "az"
>  B: "aa" .. "bb" contains ONLY "aa", "ab", "ba" and "bb"


I would find A to be the only reasonable answer.

If you want B's semantics then use "..." instead; ".." should not be overloaded 
for that.
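
The two interpretations can be put side by side in a small Python
sketch. positional_list is a made-up helper that models B by varying
each character position independently; it is one guess at what B means,
not anything taken from the spec:

```python
from itertools import product


def positional_list(first, last):
    """Interpretation B: vary each character position independently."""
    if len(first) != len(last):
        raise ValueError("endpoints must have the same length")
    columns = [
        [chr(cp) for cp in range(ord(a), ord(b) + 1)]
        for a, b in zip(first, last)
    ]
    return ["".join(chars) for chars in product(*columns)]


print(positional_list("aa", "bb"))            # ['aa', 'ab', 'ba', 'bb']
print("aa" <= "az" <= "bb")                   # True: A admits "az"
print("az" in positional_list("aa", "bb"))    # False: B does not
```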


If there were to be any similar pragma, then it should control matters like 
"collation", or what nationality/etc-specific subtype of Str the 'aa' and 'bb' 
are blessed into on definition, so that their collation/sorting/etc rules can be 
applied when figuring out if a particular $foo~~$bar..$baz is TRUE or not.


-- Darren Duncan


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Aaron Sherman
OK, there's a lot here and my head is swimming, so let me re-consolidate and
re-state (BTW: thanks Jon, you've really helped me understand, here).

1) The spec is somewhat vague, but the proposal that I made for single
characters is not an unreasonable interpretation of what's there. Thus, we
could adopt the script/major cat/minor cat triplet as the core tool that
.succ will use for single, non-combining, non-modifying, valid characters?

2) The spec doesn't put this information anywhere near the definition of the
range operator. Perhaps we can make a note? This was a source of confusion
for me.

3) It seems that there are two competing multi-character approaches and both
seem somewhat valid. Should we use a pragma to toggle behavior between A and
B:

 A: "aa" .. "bb" contains "az"
 B: "aa" .. "bb" contains ONLY "aa", "ab", "ba" and "bb"

4) About the ranges I gave as examples, you asked:

"Which codepoint is invalid, and why?"

There's just an undefined codepoint smack in the middle of the Greek
uppercase letters (U+03A2). I'm sure the Unicode specs have a rationale for
that somewhere, but my guess is that there's some thousand-year-old debate
about the Greek alphabet behind it.
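
The gap is easy to see with Python's standard unicodedata module, which
reports general category Cn (unassigned) for that codepoint. This is only
a quick check, not part of the proposal:

```python
import unicodedata

# Walk the codepoints around the gap in the Greek capital letters.
for cp in range(0x03A0, 0x03A5):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.category(ch),
          unicodedata.name(ch, "<unassigned>"))
```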

"In both of these cases, what do you think it should produce?"

I actually gave that answer a bit later on. I think that "Ā" .. "Ē" should
produce ĀĂĄĆĈĊČĎĐĒ and オ .. ヺ should produce
オカガキギクグケゲコゴサザシジスズセゼソゾタダチヂツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモヤユヨラリルレロワヰヱヲンヴヷヸヹヺ
which are all of the Katakana syllabic characters.

"I also have to wonder how or if "0" ... "z" ought to be resolved.  If
you're thinking in terms of the alphabet or digits, this is
nonsensical"

Well, since you agreed with my statement about the properties checking, it
would be 0 through 9 and then a through z because 0 through 9 are Latin
numbers, matching the LHS's properties and a through z are lowercase Latin
letters, matching the RHS's properties.

For reference, this is the relevant section of the spec:

Character positions are incremented within their natural range for any
Unicode range that is deemed to represent the digits 0..9 or that is deemed
to be a complete cyclical alphabet for (one case of) a (Unicode) script.
Only scripts that represent their alphabet in codepoints that form a cycle
independent of other alphabets may be so used. (This specification defers to
the users of such a script for determining the proper cycle of letters.) We
arbitrarily define the ASCII alphabet not to intersect with other scripts
that make use of characters in that range, but alphabets that intersperse
ASCII letters are not allowed.


I'm not sure that all of that tracks with the Unicode standard's use of some
of the terms, but based on what we've discussed, perhaps we could get more
specific there:

Character positions are incremented within their Unicode Script, but only in
keeping with their General Category property. Thus C<"A"++> yields C<"B">
which is the next codepoint, but C<"Ă"++> yields C<"Ą"> even though "ă"
falls between the two, when incrementing codepoints. Should this prove
problematic for any specific Unicode Script which requires special handling
(e.g. because a "letter" really isn't used as a letter at all), such special
handling may be applied, but the above is the general rule.


and then in the section on ranges:

As discussed previously, incrementing a character (which is to say, invoking
C<.succ>) seeks the next codepoint with the same Unicode Script and General
Category properties (major and minor category to be specific). For ranges,
succession is the same if .min and .max have the same properties, but if
they do not, then all codepoints are considered which are greater than
C<.min> and smaller than C<.max> and which agree with either the properties
of C<.min> or the properties of C<.max>.
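
A rough Python sketch of the proposed wording follows. The function names
are invented for illustration, and one assumption is important: the
Python standard library does not expose the Script property, so the first
word of the character name ("LATIN", "DIGIT", "KATAKANA") is used as a
crude stand-in. The endpoints themselves are included in the range.

```python
import unicodedata


def props(ch):
    """(script stand-in, general category) for one character.

    ASSUMPTION: the first word of the character name stands in for the
    Script property, which the standard library does not provide.
    """
    name = unicodedata.name(ch, "")
    return (name.split(" ")[0], unicodedata.category(ch))


def succ(ch):
    """Next codepoint whose properties match those of ch, or None."""
    wanted = props(ch)
    for cp in range(ord(ch) + 1, 0x110000):
        if props(chr(cp)) == wanted:
            return chr(cp)
    return None


def char_range(low, high):
    """Codepoints from low to high matching the properties of either end."""
    wanted = {props(low), props(high)}
    return "".join(
        chr(cp)
        for cp in range(ord(low), ord(high) + 1)
        if props(chr(cp)) in wanted
    )


print(succ("A"), succ("\u0102"))
print(char_range("\u0100", "\u0112"))
print(char_range("0", "z"))
```

Under these assumptions the last call gives the digits followed by the
lowercase letters, skipping the punctuation and uppercase letters that
sit between them in codepoint order.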


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Mark J. Reed
On Wed, Jul 21, 2010 at 12:04 AM, Jon Lang wrote:
> Mark J. Reed wrote:
>> Perhaps the syllabic kana could be the "integer" analogs, and what you
>> get when you iterate over the range using ..., while the modifier kana
>> would not be generated by the series  ア ... ヴ but would be considered
>> in the range  ア .. ヴ?  I wouldn't object to such script-specific
>> behavior, though perhaps it doesn't belong in core.
>
> As I understand it, it wouldn't need to be script-specific behavior;
> just behavior that's aware of Unicode properties.

That wouldn't help in this case.  For example, U+30A1 KATAKANA LETTER
SMALL A - the small "modifier" variety of letter under discussion -
is not a modifier in the Unicode sense.  It has exactly the same
properties as U+30A2 KATAKANA LETTER A, an actual syllable:

30A1;KATAKANA LETTER SMALL A;Lo;0;L;;;;;N;;;;;
30A2;KATAKANA LETTER A;Lo;0;L;;;;;N;;;;;

So without script-specific special-case code, there's no way to
distinguish them.  As Aaron said, they're treated like lowercase, but
that's not an accurate representation of how they're used in actual
text, or of the common idea of what constitutes the set of kana.
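
The same comparison can be reproduced with Python's unicodedata module.
This is only a quick check of the properties quoted above:

```python
import unicodedata

# Compare the small "modifier" kana with the full-size syllable.
for ch in ("\u30A1", "\u30A2"):
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
        unicodedata.bidirectional(ch),
    )
```

In this output the only difference is the word SMALL in the character
name, and relying on that would itself be a script-specific special case.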

-- 
Mark J. Reed 


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Jon Lang
Mark J. Reed wrote:
> Perhaps the syllabic kana could be the "integer" analogs, and what you
> get when you iterate over the range using ..., while the modifier kana
> would not be generated by the series  ア ... ヴ but would be considered
> in the range  ア .. ヴ?  I wouldn't object to such script-specific
> behavior, though perhaps it doesn't belong in core.

As I understand it, it wouldn't need to be script-specific behavior;
just behavior that's aware of Unicode properties.  That particular
issue doesn't come up with the English alphabet because there aren't
any modifier codepoints embedded in the middle of the standard
alphabet.  And if there were, I'd hope that they'd be filtered out
from the series generation by default.

And I'd hope that there would be a way to turn the default filtering
off when I don't want it.

-- 
Jonathan "Dataweaver" Lang


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Mark J. Reed
On Tue, Jul 20, 2010 at 11:28 PM, Aaron Sherman wrote:
> So, what's the intention of the range operator, then?

... is a generator that lazily enumerates a series.  .. is a
constructor for a Range object.  They're two different things, with
different behaviors.  In particular, consider that pi ~~ 0..4 is true,
 because pi is within the range; but pi ~~ 0...4 is false, because pi
is not one of the generated elements.
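
A loose Python analogy, not Perl 6: testing against a range is a pair of
comparisons, while testing against a series asks whether the value is one
of the generated elements.

```python
import math

low, high = 0, 4

in_range = low <= math.pi <= high             # like pi ~~ 0..4
in_series = math.pi in range(low, high + 1)   # like pi ~~ 0...4

print(in_range, in_series)   # True False
```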

> I guess you could write:
>
>  ア, イ, ウ, エ, オ, カ ... ヂ,ツ ...モ,ヤ, ユ, ヨ ... ロ, ワ ... ヴ (add quotes to taste)
>
> But that seems quite a bit more painful than:

Perhaps the syllabic kana could be the "integer" analogs, and what you
get when you iterate over the range using ..., while the modifier kana
would not be generated by the series  ア ... ヴ but would be considered
in the range  ア .. ヴ?  I wouldn't object to such script-specific
behavior, though perhaps it doesn't belong in core.

-- 
Mark J. Reed 


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Jon Lang
Aaron Sherman wrote:
> So, what's the intention of the range operator, then? Is it just there to
> offer backward compatibility with Perl 5? Is it a vestige that should be
> removed so that we can Huffman ... down to ..?
>
> I'm not trying to be difficult, here, I just never knew that ... could
> operate on a single item as LHS, and if it can, then .. seems to be obsolete
> and holding some prime operator real estate.

On the contrary: it is not a vestige, it is not obsolete, and it's
making good use of the prime operator real estate that it's holding.
It's just not doing what it did in Perl 5.

I strongly recommend that you reread S03 to find out exactly what each
of these operators does these days.

>> The questions definitely look different that way: for example,
>> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
>> clearly expressed as
>>
>>    'A' ... 'Z', 'a' ... 'z'     # don't think this works in Rakudo yet  :(
>>
>
> I still contend that this is so frequently desirable that it should have a
> simpler form, but it's still going to have problems.
>
> One example: for expressing "Katakana letters" (I use "letters" in the
> Unicode sense, here) it's still dicey. There are things interspersed in the
> Unicode sequence for Katakana that aren't the same thing at all. Unicode
> calls them lowercase, but that's not quite right. They're smaller versions
> of Katakana characters which are used more as punctuation or accents than as
> syllabic glyphs the way the rest of Katakana is.
>
> I guess you could write:
>
>  ア, イ, ウ, エ, オ, カ ... ヂ,ツ ...モ,ヤ, ユ, ヨ ... ロ, ワ ... ヴ (add quotes to taste)
>
> But that seems quite a bit more painful than:
>
>  ア .. ヴ (or ... if you prefer)
>
> Similar problems exist for many scripts (including some of Latin, we're just
> used to the parts that are odd), though I think it's possible that Katakana
> may be the worst because of the mis-use of Ll to indicate a letter when the
> truth of the matter is far more complicated.

Some of this might be addressed by filtering the list as you go -
though I don't remember the method for doing so.  Something like
.grep, I think, with a regex in it that only accepts letters:

('ア' ... 'ヴ').grep(/<:L>/)

...or something to that effect.

Still, it's possible that we might need something that's more flexible
than that.

-- 
Jonathan "Dataweaver" Lang


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Jon Lang
Approaching this with the notion firmly in mind that infix:<..> is
supposed to be used for matching ranges while infix:<...> should be
used to generate series:

Aaron Sherman wrote:
> Walk with me a bit, and let's explore the concept of intuitive character
> ranges? This was my suggestion, which seems pretty basic to me:
>
> "x .. y", for all strings x and y, which are composed of a single, valid
> codepoint which is neither combining nor modifying, yields the range of all
> valid, non-combining/modifying codepoints between x and y, inclusive which
> share the Unicode script, general category major property and general
> category minor property of either x or y (lack of a minor property is a
> valid value).

This is indeed true for both range-matching and series-generation as
the spec is currently written.

> In general we have four problems with current specification and
> implementation on the Perl 6 and Perl 5 sides:
>
> 1) Perl 5 and Rakudo have a fundamental difference of opinion about what
> some ranges produce ("A" .. "z", "X" .. "T", etc) and yet we've never really
> articulated why we want that.
>
> 2) We deny that a range whose LHS is "larger" than its RHS makes sense, but
> we also don't provide an easy way to construct such ranges lazily otherwise.
> This would be annoying only, but then we have declared that ranges are the
> right way to construct basic loops (e.g. for (1..1e10).reverse -> $i {...}
> which is not lazy (blows up your machine) and feels awfully clunky next to
> for 1e10..1 -> $i {...} which would not blow up your machine, or even make
> it break a sweat, if it worked)

With ranges, we want C< when $LHS .. $RHS > to always mean C<< if
$LHS <= $_ <= $RHS >>.  If $RHS < $LHS, then the range being specified
is not valid.  In this context, it makes perfect sense to me why it
doesn't generate anything.

With series, we want C< $LHS ... $RHS > to generate a list of items
starting with $LHS and ending with $RHS.  If $RHS > $LHS, we want it
to increment one step at a time; if $RHS < $LHS, we want it to
decrement one step at a time.

So: 1) we want different behavior from the Range operator in Perl 6
vs. Perl 5 because we have completely re-envisioned the range
operator.  What we have replaced it with is fundamentally more
flexible, though not necessarily perfect.

> 3) We've never had a clear-cut goal in allowing string ranges (as opposed to
> character ranges, which Perl 5 and 6 both muddy a bit), so "intuitive"
> becomes sketchy at best past the first grapheme, and even muddier when only
> considering codepoints (thus that wing of my proposal and current behavior
> are on much shakier ground, except in so far as it asserts that we might
> want to think about it more).

I think that one notion that we're dealing with here is the idea that
C<< $X < $X.succ >> for all strings.  This seems to be a rather
intuitive assumption to make; but it is apparently not an assumption
that Stringy.succ makes.  As I understand it, "Z".succ eqv "AA".  What
benefit do we gain from this behavior?  Is it the idea that eventually
this will iterate over every possible combination of capital letters?
If so, why is that a desirable goal?


My own gut instinct would be to define the string iterator such that
it increments the final letter in the string until it gets to "Z";
then it resets that character to "A" and increments the next character
by one:

"ABE", "ABF", "ABG" ... "ABZ", "ACA", "ACB" ... "ZZZ"

This pattern ensures that each string in the series will be less than
its successor.  It does not ensure that every
possible string between "ABE" and "ZZZ" will be represented; far from
it.  But then, 1...9 doesn't produce every number between 1 and 9; it
only produces integers.  Taken to an extreme: pi falls between 1 and
9; but no one in his right mind expects us to come up with a general
sequencing of numbers that increments from 1 to 9 with a guarantee
that it will hit pi before reaching 9.
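
The two successor rules can be compared in a short Python sketch.
carry_increment is a made-up name, the code handles capital letters A-Z
only, and Python's code point comparison stands in for string ordering:

```python
def carry_increment(s, grow):
    """Increment the last letter of s (A-Z only), carrying leftwards.

    grow=True models the existing rule where "Z" becomes "AA";
    grow=False models the fixed-width idea and returns None at the end.
    """
    letters = list(s)
    for i in reversed(range(len(letters))):
        if letters[i] != "Z":
            letters[i] = chr(ord(letters[i]) + 1)
            return "".join(letters)
        letters[i] = "A"  # roll over and carry into the next position
    return "A" + "".join(letters) if grow else None


print(carry_increment("Z", grow=True))         # AA
print("Z" < carry_increment("Z", grow=True))   # False
print(carry_increment("ABZ", grow=False))      # ACA
print(carry_increment("ZZZ", grow=False))      # None
```

The second line of output shows the ordering assumption breaking under
the growing rule, since "AA" sorts before "Z".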

Mind you, I know that the above is full of holes.  In particular, it
works well when you limit yourself to strings composed of capital
letters; do anything fancier than that, and it falls on its face.

> 4) Many ranges involving single characters on LHS and RHS result in null
> or infinite output, which is deeply non-intuitive to me, and I expect many
> others.

Again, the distinction between range-matching and series-generation
comes to the rescue.

> Solve those (and I tried in my suggestion) and I think you will be able to
> apply intuition to character ranges, but only in so far as a human being is
> likely to be able to intuit anything related to Unicode.

Of the points that you raise, #1, 2, and 4 are neatly solved already.
I'm unsure as to #3; so I'd recommend focusing some scrutiny on it.

> The current behaviour of the range operator is (if I recall correctly):
>> 1) if both sides are single characters, make a range by incrementing
>> codepoints
>>
>
> Sadly, you can't do that reasonabl

Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Aaron Sherman
Side note: you could get around some of the problems, below, but in order to
do so, you would have to exhaustively express all of Unicode using the Str
builtin module's RANGES constant. In fact, as it is now, it defines ASCII
lowercase, but doesn't define Latin lowercase. Presumably because doing so
would be a massive pain. Again, I'll point out that using script and
properties is much easier.

On Tue, Jul 20, 2010 at 10:35 PM, Solomon Foster wrote:

>
> Sorry, didn't mean to imply the series operator was perfect.  (Though
> it is surprisingly awesome in  general, IMO.)  Just that the right
> questions would be about the series operator rather than Ranges.
>

So, what's the intention of the range operator, then? Is it just there to
offer backward compatibility with Perl 5? Is it a vestige that should be
removed so that we can Huffman ... down to ..?

I'm not trying to be difficult, here, I just never knew that ... could
operate on a single item as LHS, and if it can, then .. seems to be obsolete
and holding some prime operator real estate.


>
> The questions definitely look different that way: for example,
> ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
> clearly expressed as
>
>    'A' ... 'Z', 'a' ... 'z'     # don't think this works in Rakudo yet  :(
>

I still contend that this is so frequently desirable that it should have a
simpler form, but it's still going to have problems.

One example: for expressing "Katakana letters" (I use "letters" in the
Unicode sense, here) it's still dicey. There are things interspersed in the
Unicode sequence for Katakana that aren't the same thing at all. Unicode
calls them lowercase, but that's not quite right. They're smaller versions
of Katakana characters which are used more as punctuation or accents than as
syllabic glyphs the way the rest of Katakana is.

I guess you could write:

  ア, イ, ウ, エ, オ, カ ... ヂ,ツ ...モ,ヤ, ユ, ヨ ... ロ, ワ ... ヴ (add quotes to taste)

But that seems quite a bit more painful than:

 ア .. ヴ (or ... if you prefer)

Similar problems exist for many scripts (including some of Latin, we're just
used to the parts that are odd), though I think it's possible that Katakana
may be the worst because of the mis-use of Ll to indicate a letter when the
truth of the matter is far more complicated.



> That suggests to me that the current behavior of 'A' ... 'z' is pretty
> reasonable.
>

You still have to decide to make at least some allowances for invalid
codepoints and I think you should avoid ever generating a combining or
modifying codepoint in such a sequence (e.g. "Ѻ" ... "Ҋ" in Cyrillic which
contains several combining characters for currency and counting as well as
one undefined codepoint).

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs


Re: r31777 -[S32/Temporal] Reverted DateTime back to being mutable. I think we ought to make a big change like this only after reaching some kind of consensus to do so, not least because I just impl

2010-07-20 Thread Mark J. Reed
Well, then, let's start building that consensus.  I firmly believe
DateTimes should definitely be value types, immutable.  Otherwise you
can't use them for hash keys, for one thing.

Note that timestamps in Perl have always been values; it's just that
they used to be specifically integers, whose value-ness comes
automatically. You can't change the value of 1279682340; you can
change a variable to hold a different value instead of 1279682340, but
the integer itself hasn't changed.  This may sound like a
philosophical point, but it's fundamental to the operation of
mathematics.

When you get into reference types you have to work harder to make an
object act like a value, but I think the effort is worth it in the
case of datelike objects.  Even in languages that have mutable dates,
like Java, there are "best practices" guidelines that suggest using an
immutable representation instead (e.g.
http://www.javapractices.com/topic/TopicAction.do?Id=81).  With Perl6
I'd rather get it right in the first place.
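
The hash-key point can be illustrated in Python, where the same tension
exists. FrozenStamp and MutableStamp are hypothetical classes, and the
behaviour shown is Python's dataclass behaviour, not anything specified
for Perl 6:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenStamp:
    """Immutable value type: equal fields give equal, hashable objects."""
    year: int
    month: int
    day: int


@dataclass
class MutableStamp:
    """Mutable counterpart: dataclass leaves it unhashable by default."""
    year: int
    month: int
    day: int


events = {FrozenStamp(2010, 7, 20): "thread started"}
print(events[FrozenStamp(2010, 7, 20)])   # thread started

try:
    {MutableStamp(2010, 7, 20): "thread started"}
except TypeError as err:
    print("cannot be used as a key:", err)
```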


On Tue, Jul 20, 2010 at 8:12 PM,   wrote:
> Author: Kodi
> Date: 2010-07-21 02:12:24 +0200 (Wed, 21 Jul 2010)
> New Revision: 31777
>
> Modified:
>   docs/Perl6/Spec/S32-setting-library/Temporal.pod
> Log:
> [S32/Temporal] Reverted DateTime back to being mutable. I think we ought to 
> make a big change like this only after reaching some kind of consensus to do 
> so, not least because I just implemented a lot of mutating methods!
>
> Note that += and friends need only the *container* on the LHS to be mutable,
> not the value -- '$x += 1' should be allowed whether $x holds an
> Int, a Date, or a DateTime.
>
>
> Modified: docs/Perl6/Spec/S32-setting-library/Temporal.pod
> ===
> --- docs/Perl6/Spec/S32-setting-library/Temporal.pod    2010-07-20 21:51:26 
> UTC (rev 31776)
> +++ docs/Perl6/Spec/S32-setting-library/Temporal.pod    2010-07-21 00:12:24 
> UTC (rev 31777)
> @@ -16,7 +16,7 @@
>     Created: 19 Mar 2009
>
>     Last Modified: 20 Jul 2010
> -    Version: 15
> +    Version: 16
>
>  The document is a draft.
>
> @@ -67,11 +67,7 @@
>  =head1 C
>
>  A C object describes the time as it would appear on someone's
> -calendar and someone's clock.
> -
> -C objects are immutables.
> -
> -You can create a C object from an
> +calendar and someone's clock. You can create a C object from an
>  C or from an C; in the latter case, the argument is
>  interpreted as POSIX time.
>
> @@ -98,11 +94,7 @@
>  Another multi exists with C instead of C<:year>, C<:month> and
>  C<:day> (and the same defaults as listed above).
>
> -A C can also be created by modifying an existing object:
> -
> -    my $moonlanding_anniv = DateTime.new($moonlanding, :year(2010));
> -
> -All five of the aforementioned forms of C accept two additional named
> +All four of the aforementioned forms of C accept two additional named
>  arguments. C<:formatter> is a callable object that takes a C and
>  returns a string. The default formatter creates an ISO 8601 timestamp (see
>  below). C<:timezone> is a callable object that takes a C to
> @@ -186,11 +178,38 @@
>  C<$dt.timezone($dt, True)>; otherwise, C<$dt.offset> returns
>  C<$dt.timezone> as is.
>
> -The C method returns a new object where a number of time values
> -have been cleared below a given resolution:
> +=head2 "Set" methods
>
> -    $dt2 = $dt.truncate( :to ); # clears minutes and seconds
> +To set the the day of a C object to something, just assign to
> +its public accessor:
>
> +    $dt.day = 15;
> +
> +The same methods exists for all the values you can set in the
> +constructor: C, C, C, C, C, C,
> +C and C.  Also, there's a C method, which
> +accepts all of these as named arguments, allowing several values to be
> +set at once:
> +
> +    $dt.set( :year(2014), :month(12), :day(25) );
> +
> +Just as with the C method, validation is performed on the resulting
> +values, and an exception is thrown if the result isn't a sensible date
> +and time.
> +
> +If you use the C public accessor to adjust the time zone, the
> +local time zone is adjusted accordingly:
> +
> +    my $dt = DateTime.new('2005-02-01T15:00:00+0900');
> +    say $dt.hour;                # 15
> +    $dt.timezone = 6 * 60 * 60;  # 6 hours ahead of UTC
> +    say $dt.hour;                # 12
> +
> +The C method allows you to "clear" a number of time values
> +below a given resolution:
> +
> +    $dt.truncate( :to ); # clears minutes and seconds
> +
>  The time units are "cleared" in the sense that they are set to their
>  inherent defaults: 1 for months and days, 0 for the time components.
>
> @@ -198,6 +217,9 @@
>  Monday of the week in which it occurs, and the time components are all
>  set to 0.
>
> +For the convenience of method chaining, C and C return the
> +calling object.
> +
>  =head1 Date
>
>  C objects are immutable and represent a day without a time component.
> @@ -246,8 +268,6 @@
>     $d + 3                      # Date.new('2010-12-27')
>

Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Solomon Foster
On Tue, Jul 20, 2010 at 10:00 PM, Jon Lang wrote:
> Solomon Foster wrote:
>> Ranges haven't been intended to be the "right way" to construct basic
>> loops for some time now.  That's what the "..." series operator is
>> for.
>>
>>    for 1e10 ... 1 -> $i {
>>         # whatever
>>    }
>>
>> is lazy by the spec, and in fact is lazy and fully functional in
>> Rakudo.  (Errr... okay, actually it just seg faulted after hitting
>> 968746 in the countdown.  But that's a Rakudo bug unrelated to
>> this, I'm pretty sure.)
>
> You took the words out of my mouth.
>
>> All the magic that one wants for handling loop indices -- going
>> backwards, skipping numbers, geometric series, and more -- is present
>> in the series operator.  Range is not supposed to do any of that stuff
>> other than the most basic forward sequence.
>
> Here, though, I'm not so sure: I'd like to see how many of Aaron's
> issues remain unresolved once he reframes them in terms of the series
> operator.

Sorry, didn't mean to imply the series operator was perfect.  (Though
it is surprisingly awesome in  general, IMO.)  Just that the right
questions would be about the series operator rather than Ranges.

The questions definitely look different that way: for example,
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz is easily and
clearly expressed as

'A' ... 'Z', 'a' ... 'z' # don't think this works in Rakudo yet  :(

That suggests to me that the current behavior of 'A' ... 'z' is pretty
reasonable.

-- 
Solomon Foster: colo...@gmail.com
HarmonyWare, Inc: http://www.harmonyware.com


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Jon Lang
Solomon Foster wrote:
> Ranges haven't been intended to be the "right way" to construct basic
> loops for some time now.  That's what the "..." series operator is
> for.
>
>    for 1e10 ... 1 -> $i {
>         # whatever
>    }
>
> is lazy by the spec, and in fact is lazy and fully functional in
> Rakudo.  (Errr... okay, actually it just seg faulted after hitting
> 968746 in the countdown.  But that's a Rakudo bug unrelated to
> this, I'm pretty sure.)

You took the words out of my mouth.

> All the magic that one wants for handling loop indices -- going
> backwards, skipping numbers, geometric series, and more -- is present
> in the series operator.  Range is not supposed to do any of that stuff
> other than the most basic forward sequence.

Here, though, I'm not so sure: I'd like to see how many of Aaron's
issues remain unresolved once he reframes them in terms of the series
operator.

-- 
Jonathan "Dataweaver" Lang


Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Solomon Foster
On Tue, Jul 20, 2010 at 7:31 PM, Aaron Sherman wrote:
> 2) We deny that a range whose LHS is "larger" than its RHS makes sense, but
> we also don't provide an easy way to construct such ranges lazily otherwise.
> This would be annoying only, but then we have declared that ranges are the
> right way to construct basic loops (e.g. for (1..1e10).reverse -> $i {...}
> which is not lazy (blows up your machine) and feels awfully clunky next to
> for 1e10..1 -> $i {...} which would not blow up your machine, or even make
> it break a sweat, if it worked)

Ranges haven't been intended to be the "right way" to construct basic
loops for some time now.  That's what the "..." series operator is
for.

for 1e10 ... 1 -> $i {
 # whatever
}

is lazy by the spec, and in fact is lazy and fully functional in
Rakudo.  (Errr... okay, actually it just seg faulted after hitting
968746 in the countdown.  But that's a Rakudo bug unrelated to
this, I'm pretty sure.)

All the magic that one wants for handling loop indices -- going
backwards, skipping numbers, geometric series, and more -- is present
in the series operator.  Range is not supposed to do any of that stuff
other than the most basic forward sequence.
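
For readers who want the laziness spelled out, here is a Python sketch
using generators. arithmetic and geometric are invented names that only
imitate a few of the behaviours listed above:

```python
from itertools import islice


def arithmetic(start, stop, step):
    """Lazily yield start, start+step, ... without passing stop."""
    value = start
    while (step > 0 and value <= stop) or (step < 0 and value >= stop):
        yield value
        value += step


def geometric(start, ratio, stop):
    """Lazily yield start, start*ratio, ... while not above stop (ratio > 1)."""
    value = start
    while value <= stop:
        yield value
        value *= ratio


countdown = arithmetic(10**10, 1, -1)   # nothing is materialised here
print(list(islice(countdown, 3)))       # [10000000000, 9999999999, 9999999998]
print(list(arithmetic(1, 10, 3)))       # [1, 4, 7, 10]
print(list(geometric(1, 2, 100)))       # [1, 2, 4, 8, 16, 32, 64]
```

Only the three requested elements of the countdown are ever computed,
which is the property that matters for a loop from 1e10 down to 1.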

-- 
Solomon Foster: colo...@gmail.com
HarmonyWare, Inc: http://www.harmonyware.com


Re: r31777 -[S32/Temporal] Reverted DateTime back to being mutable. I think we ought to make a big change like this only after reaching some kind of consensus to do so, not least because I just implem

2010-07-20 Thread Darren Duncan

pugs-comm...@feather.perl6.nl wrote:

> Modified:
>    docs/Perl6/Spec/S32-setting-library/Temporal.pod
> Log:
> [S32/Temporal] Reverted DateTime back to being mutable. I think we ought to
> make a big change like this only after reaching some kind of consensus to do
> so, not least because I just implemented a lot of mutating methods!
>
> Note that += and friends need only the *container* on the LHS to be mutable,
> not the value -- '$x += 1' should be allowed whether $x holds an
> Int, a Date, or a DateTime.


Types representing temporal artifacts should *not* be mutable; they should be 
"value" types.


If you want to derive a DateTime from another, say, then just have the 
pseudo-mutator method return a new object with the differences.
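
A minimal Python sketch of that idea follows. Stamp, with_ and
truncate_to_hour are hypothetical names chosen for illustration and are
not taken from the Temporal spec:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stamp:
    """Hypothetical immutable date-time value."""
    year: int
    month: int
    day: int
    hour: int = 0
    minute: int = 0

    def with_(self, **changes):
        """Pseudo-mutator: return a new Stamp with some fields replaced."""
        return replace(self, **changes)

    def truncate_to_hour(self):
        """Return a copy with everything below the hour cleared."""
        return replace(self, minute=0)


moonlanding = Stamp(1969, 7, 20, 20, 17)
anniversary = moonlanding.with_(year=2010)
print(moonlanding.year, anniversary.year)   # 1969 2010

x = moonlanding
x = x.with_(day=x.day + 1)   # the variable is rebound; the old value is untouched
print(moonlanding.day, x.day)               # 20 21
```

The last two statements show why an operator like += needs only a mutable
container: the variable ends up holding a new value while the original
object stays as it was.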


-- Darren Duncan


r31777 -[S32/Temporal] Reverted DateTime back to being mutable. I think we ought to make a big change like this only after reaching some kind of consensus to do so, not least because I just implemente

2010-07-20 Thread pugs-commits
Author: Kodi
Date: 2010-07-21 02:12:24 +0200 (Wed, 21 Jul 2010)
New Revision: 31777

Modified:
   docs/Perl6/Spec/S32-setting-library/Temporal.pod
Log:
[S32/Temporal] Reverted DateTime back to being mutable. I think we ought to 
make a big change like this only after reaching some kind of consensus to do 
so, not least because I just implemented a lot of mutating methods!

Note that += and friends need only the *container* on the LHS to be mutable,
not the value -- '$x += 1' should be allowed whether $x holds an
Int, a Date, or a DateTime.


Modified: docs/Perl6/Spec/S32-setting-library/Temporal.pod
===
--- docs/Perl6/Spec/S32-setting-library/Temporal.pod    2010-07-20 21:51:26 UTC (rev 31776)
+++ docs/Perl6/Spec/S32-setting-library/Temporal.pod    2010-07-21 00:12:24 UTC (rev 31777)
@@ -16,7 +16,7 @@
 Created: 19 Mar 2009
 
 Last Modified: 20 Jul 2010
-Version: 15
+Version: 16
 
 The document is a draft.
 
@@ -67,11 +67,7 @@
 =head1 C
 
 A C object describes the time as it would appear on someone's
-calendar and someone's clock.
-
-C objects are immutables.
-
-You can create a C object from an
+calendar and someone's clock. You can create a C object from an
 C or from an C; in the latter case, the argument is
 interpreted as POSIX time.
 
@@ -98,11 +94,7 @@
 Another multi exists with C instead of C<:year>, C<:month> and
 C<:day> (and the same defaults as listed above).
 
-A C can also be created by modifying an existing object:
-
-my $moonlanding_anniv = DateTime.new($moonlanding, :year(2010));
-
-All five of the aforementioned forms of C accept two additional named
+All four of the aforementioned forms of C accept two additional named
 arguments. C<:formatter> is a callable object that takes a C and
 returns a string. The default formatter creates an ISO 8601 timestamp (see
 below). C<:timezone> is a callable object that takes a C to
@@ -186,11 +178,38 @@
 C<$dt.timezone($dt, True)>; otherwise, C<$dt.offset> returns
 C<$dt.timezone> as is.
 
-The C method returns a new object where a number of time values
-have been cleared below a given resolution:
+=head2 "Set" methods
 
-$dt2 = $dt.truncate( :to ); # clears minutes and seconds
+To set the the day of a C object to something, just assign to
+its public accessor:
 
+$dt.day = 15;
+
+The same methods exists for all the values you can set in the
+constructor: C, C, C, C, C, C,
+C and C.  Also, there's a C method, which
+accepts all of these as named arguments, allowing several values to be
+set at once:
+
+$dt.set( :year(2014), :month(12), :day(25) );
+
+Just as with the C method, validation is performed on the resulting
+values, and an exception is thrown if the result isn't a sensible date
+and time.
+
+If you use the C public accessor to adjust the time zone, the
+local time zone is adjusted accordingly:
+
+my $dt = DateTime.new('2005-02-01T15:00:00+0900');
+say $dt.hour;# 15
+$dt.timezone = 6 * 60 * 60;  # 6 hours ahead of UTC
+say $dt.hour;# 12
+
+The C method allows you to "clear" a number of time values
+below a given resolution:
+
+$dt.truncate( :to ); # clears minutes and seconds
+
 The time units are "cleared" in the sense that they are set to their
 inherent defaults: 1 for months and days, 0 for the time components.
 
@@ -198,6 +217,9 @@
 Monday of the week in which it occurs, and the time components are all
 set to 0.
 
+For the convenience of method chaining, C and C return the
+calling object.
+
 =head1 Date
 
 C objects are immutable and represent a day without a time component.
@@ -246,8 +268,6 @@
 $d + 3  # Date.new('2010-12-27')
 3  + $d # Date.new('2010-12-27')
 
-As temporal objects are immutable, += -= ... do not work.
-
 =head1 Additions
 
 Please post errors and feedback to C. If you are making



Re: Suggested magic for "a" .. "b"

2010-07-20 Thread Aaron Sherman
This is a long reply, but I read it over a few times, and I don't see any
fat to trim. This isn't really a simple issue for which intuition is going
to be a sufficient guide, though I agree fully that it needs to be high on
or at the top of the list.

On Sun, Jul 18, 2010 at 6:26 AM, Moritz Lenz wrote:

> In general, stuffing more complex behaviour into something that feels
> unintuitive is rarely (if ever) a good solution.


Walk with me a bit, and let's explore the concept of intuitive character
ranges. This was my suggestion, which seems pretty basic to me:

"x .. y", for all strings x and y, which are composed of a single, valid
codepoint which is neither combining nor modifying, yields the range of all
valid, non-combining/modifying codepoints between x and y, inclusive, which
share the Unicode script, general category major property and general
category minor property of either x or y (lack of a minor property is a
valid value).


In general we have four problems with the current specification and
implementation on the Perl 6 and Perl 5 sides:

1) Perl 5 and Rakudo have a fundamental difference of opinion about what
some ranges produce ("A" .. "z", "X" .. "T", etc) and yet we've never really
articulated why we want that.

2) We deny that a range whose LHS is "larger" than its RHS makes sense, but
we also don't provide an easy way to construct such ranges lazily otherwise.
This would be merely annoying, except that we have declared that ranges are
the right way to construct basic loops. For example,
for (1..1e10).reverse -> $i {...} is not lazy (it blows up your machine) and
feels awfully clunky next to for 1e10..1 -> $i {...}, which would not blow up
your machine, or even make it break a sweat, if it worked.

3) We've never had a clear-cut goal in allowing string ranges (as opposed to
character ranges, which Perl 5 and 6 both muddy a bit), so "intuitive"
becomes sketchy at best past the first grapheme, and even muddier when only
considering codepoints (thus that wing of my proposal and current behavior
are on much shakier ground, except in so far as it asserts that we might
want to think about it more).

4) Many ranges involving single characters on LHS and RHS result in null
or infinite output, which is deeply non-intuitive to me and, I expect, to
many others.

Solve those (and I tried in my suggestion) and I think you will be able to
apply intuition to character ranges, but only in so far as a human being is
likely to be able to intuit anything related to Unicode.
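
To make problem 2 concrete, here is a small sketch of what a lazy descending range looks like in a language that already has one. It uses Python's built-in range, not anything from the Perl 6 spec, and the variable names are only for illustration.

```python
import itertools

# An ascending range whose start is above its end is simply empty.
empty = list(range(5, 1))

# A descending range stores only start, stop and step, so building it
# is cheap even for ten billion elements; items are produced on demand.
countdown = range(10**10, 0, -1)
first_three = list(itertools.islice(countdown, 3))

# Reversing an ascending range is lazy too: no list is materialised.
largest = next(reversed(range(1, 10**10 + 1)))

print(empty, first_three, largest)
```

Neither loop form allocates the full list, which is the behaviour I would like both "1e10..1" and "(1..1e10).reverse" to have.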


> The current behaviour of the range operator is (if I recall correctly):
>
> 1) if both sides are single characters, make a range by incrementing
> codepoints
>

Sadly, you can't do that reasonably. Here are some examples of why, using
only Latin and Greek as examples (not the most convoluted Unicode sections
to be sure):


   - "Α" (capital Greek alpha, not Latin A) .. "Ω" would result in a range
   that contains an invalid codepoint (rakudo: drops the invalid codepoint,
   which you may have meant to imply, but I'm being pedantic because I want to
   come to a specification, not just a sense of the right solution)
   - "Ā" .. "Ē" would be "ĀāĂăĄąĆćĈĉĊċČčĎďĐđĒ" which is really not what
   you're likely to expect! (rakudo: Ā, infinitely repeating, which is an even
   larger problem for Katakana, where "オ" .. "ヺ" seems a very intuitive way to
   say "all Katakana non-cased letters" but fails because the range contains
   both cased and uncased; Perl 5 just prints "オ", and I think it also sneers
   at you)
   - "A" .. "z" comes out really odd because it contains punctuation (mind
   you, your suggestion is saner than Rakudo's current behavior on "A" .. "z"
   which is an infinite progression of capital-letter-only sequences of 1 or
   more characters! Intuitive, it's not.)
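
The first and third bullets can be checked with a few lines of code. This is a Python sketch of plain codepoint incrementing, not Rakudo's or Perl 5's actual implementation; naive_range is a name invented for the example.

```python
import unicodedata

def naive_range(x, y):
    """Every codepoint from x to y inclusive, with no filtering."""
    return [chr(cp) for cp in range(ord(x), ord(y) + 1)]

# "A" .. "z": the six characters between "Z" and "a" are not letters.
latin = "".join(naive_range("A", "z"))
between = latin[26:32]

# Capital alpha .. capital omega: U+03A2 falls inside the range
# and is an unassigned codepoint (general category "Cn").
greek = naive_range("\u0391", "\u03a9")
unassigned = [c for c in greek if unicodedata.category(c) == "Cn"]

print(between, [hex(ord(c)) for c in unassigned])
```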


My point was that, if you want simple and intuitive out of Unicode, you're
kind of screwed. The closest you can get is to build your range using
properties and script. The way I suggested doing that was the simplest I
could think of. Speak up if you have a simpler one.

For most simple ranges, our results will be identical (e.g. "A" .. "Q").

For the above examples, I would end up producing:

1: Alpha through Omega Greek capital letters
2: ĀĂĄĆĈĊČĎĐĒ
(and オカガキギクグケゲコゴサザシジスズセゼソゾタダチヂツヅテデトドナニヌネノハバパヒビピフブプヘベペホボポマミムメモヤユヨラリルレロワヰヱヲンヴヷヸヹヺ
for the Katakana)
3: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

That seems pretty darned intuitive to me. Mind you, "A" .. "ž" is still ugly
as sin in terms of ordering once you listify, and I can't reasonably fix
that without re-defining Unicode or having a really, really convoluted and
special-case rule, but without getting convoluted, even that ugly example
does something useful and, I dare say, intuitive for testing membership.
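
A partial sketch of the filtering rule, in Python for ease of testing: it keeps only codepoints whose two-letter general category (major plus minor) matches that of either endpoint. One loud assumption: the Unicode script comparison from my suggestion is left out, because Python's standard unicodedata module does not expose script properties, so this is an approximation and not the full rule. The function name category_range is invented for the example.

```python
import unicodedata

def category_range(x, y):
    """Codepoints from x to y whose general category matches x's or y's.

    Assumption: the script check from the proposal is omitted here.
    """
    wanted = {unicodedata.category(x), unicodedata.category(y)}
    return "".join(
        chr(cp)
        for cp in range(ord(x), ord(y) + 1)
        if unicodedata.category(chr(cp)) in wanted
    )

latin = category_range("A", "z")            # upper- and lowercase letters
macrons = category_range("\u0100", "\u0112")  # "Ā" .. "Ē"
greek = category_range("\u0391", "\u03a9")    # capital alpha .. capital omega

print(latin, macrons, greek, sep="\n")
```

For these three inputs the category filter alone reproduces the outputs listed above; ranges that cross script boundaries would need the script test as well.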

Here's the pseudo-code for my suggestion:

  class SingleCharAlphaRange {
 has $.start;
 has $.end;
 # Verify that this is a single character string which is valid
 # and non-combining/non-modifying and repre

Re: r31776 -[S32/Temporal] Make DateTime immutable.

2010-07-20 Thread Mark J. Reed
On Tue, Jul 20, 2010 at 5:51 PM,   wrote:
> +C objects are immutables.
> +

I think just the adjective works better here ("are immutable")... but
more to the point:

> +A C can also be created by modifying an existing object:

It's mildly confusing to say they're immutable and then turn around
and talk about modifying them.  How about just saying that "A new
C<DateTime> can also be based on an existing C<DateTime> object:"?


-- 
Mark J. Reed 


r31776 -[S32/Temporal] Make DateTime immutable.

2010-07-20 Thread pugs-commits
Author: dolmen
Date: 2010-07-20 23:51:26 +0200 (Tue, 20 Jul 2010)
New Revision: 31776

Modified:
   docs/Perl6/Spec/S32-setting-library/Temporal.pod
Log:
[S32/Temporal] Make DateTime immutable.


Modified: docs/Perl6/Spec/S32-setting-library/Temporal.pod
===
--- docs/Perl6/Spec/S32-setting-library/Temporal.pod   2010-07-20 21:11:32 UTC (rev 31775)
+++ docs/Perl6/Spec/S32-setting-library/Temporal.pod   2010-07-20 21:51:26 UTC (rev 31776)
@@ -15,8 +15,8 @@
 
 Created: 19 Mar 2009
 
-Last Modified: 15 Jul 2010
-Version: 14
+Last Modified: 20 Jul 2010
+Version: 15
 
 The document is a draft.
 
@@ -67,7 +67,11 @@
 =head1 C
 
 A C object describes the time as it would appear on someone's
-calendar and someone's clock. You can create a C object from an
+calendar and someone's clock.
+
+C objects are immutables.
+
+You can create a C object from an
 C or from an C; in the latter case, the argument is
 interpreted as POSIX time.
 
@@ -94,7 +98,11 @@
 Another multi exists with C instead of C<:year>, C<:month> and
 C<:day> (and the same defaults as listed above).
 
-All four of the aforementioned forms of C accept two additional named
+A C can also be created by modifying an existing object:
+
+my $moonlanding_anniv = DateTime.new($moonlanding, :year(2010));
+
+All five of the aforementioned forms of C accept two additional named
 arguments. C<:formatter> is a callable object that takes a C and
 returns a string. The default formatter creates an ISO 8601 timestamp (see
 below). C<:timezone> is a callable object that takes a C to
@@ -178,38 +186,11 @@
 C<$dt.timezone($dt, True)>; otherwise, C<$dt.offset> returns
 C<$dt.timezone> as is.
 
-=head2 "Set" methods
+The C method returns a new object where a number of time values
+have been cleared below a given resolution:
 
-To set the the day of a C object to something, just assign to
-its public accessor:
+$dt2 = $dt.truncate( :to ); # clears minutes and seconds
 
-$dt.day = 15;
-
-The same methods exists for all the values you can set in the
-constructor: C, C, C, C, C, C,
-C and C.  Also, there's a C method, which
-accepts all of these as named arguments, allowing several values to be
-set at once:
-
-$dt.set( :year(2014), :month(12), :day(25) );
-
-Just as with the C method, validation is performed on the resulting
-values, and an exception is thrown if the result isn't a sensible date
-and time.
-
-If you use the C public accessor to adjust the time zone, the
-local time zone is adjusted accordingly:
-
-my $dt = DateTime.new('2005-02-01T15:00:00+0900');
-say $dt.hour;# 15
-$dt.timezone = 6 * 60 * 60;  # 6 hours ahead of UTC
-say $dt.hour;# 12
-
-The C method allows you to "clear" a number of time values
-below a given resolution:
-
-$dt.truncate( :to ); # clears minutes and seconds
-
 The time units are "cleared" in the sense that they are set to their
 inherent defaults: 1 for months and days, 0 for the time components.
 
@@ -217,9 +198,6 @@
 Monday of the week in which it occurs, and the time components are all
 set to 0.
 
-For the convenience of method chaining, C and C return the
-calling object.
-
 =head1 Date
 
 C objects are immutable and represent a day without a time component.
@@ -268,6 +246,8 @@
 $d + 3  # Date.new('2010-12-27')
 3  + $d # Date.new('2010-12-27')
 
+As temporal objects are immutable, += -= ... do not work.
+
 =head1 Additions
 
 Please post errors and feedback to C. If you are making
@@ -290,3 +270,4 @@
 Daniel Ruoso 
 Dave Rolsky 
 Matthew (lue) 
+Olivier Mengué 



r31773 -Fix minor typo

2010-07-20 Thread pugs-commits
Author: tene
Date: 2010-07-20 20:35:35 +0200 (Tue, 20 Jul 2010)
New Revision: 31773

Modified:
   docs/Perl6/Spec/S03-operators.pod
Log:
Fix minor typo

Modified: docs/Perl6/Spec/S03-operators.pod
===
--- docs/Perl6/Spec/S03-operators.pod   2010-07-20 03:45:37 UTC (rev 31772)
+++ docs/Perl6/Spec/S03-operators.pod   2010-07-20 18:35:35 UTC (rev 31773)
@@ -1741,7 +1741,7 @@
 @a minmax @b
 
 Returns a C from the minimum element of C<@a> and C<@b> to the maximum
-element. C elements in the input9 are treated as if their
+element. C elements in the input are treated as if their
 minimum and maximum values were passed individually, except that if the
 corresponding C flag is set in Range, the excludes flag is also set
 in the returned C.