RFC 231 (v1) Data: Multi-dimensional arrays/hashes and slices

Perl6 RFC Librarian Fri, 15 Sep 2000 12:18:40 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Data: Multi-dimensional arrays/hashes and slices

=head1 VERSION

  Maintainer: Ilya Zakharevich <[EMAIL PROTECTED]>
  Date: 15 September 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 231
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC 

=over

=item *

proposes a convenient syntax to access multi-dimensional arrays' elements;

=item *

proposes a convenient syntax to slice multi-dimensional arrays;

=item *

proposes a convenient syntax to create subobjects which behave as slices;

=item *

does it with as little overhead as possible.

=back

=head1 DESCRIPTION

Observation1: it is enough to define the interface which tie()d arrays
can use.  Then *if judged useful*, this semantic can be optionally
supported by the builtin arrays/hashes.

Observation2: the current support for multi-dimensional objects and the
current support for slicing clash with each other (at least visually):

   @slice = @hash{$ind1,$ind2};
   @slice = @arry[$ind1,$ind2];
   $elmnt = $hash{$ind1,$ind2};

moreover, the analogue for arrays

   $elmnt = $arry[$ind1,$ind2];

is not supported (syntax error).
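
The clash can be reproduced in Perl 5 as it stands.  The following is a
small illustration only (the variable names and data are made up for the
example): the same index list yields one joined key after C<$>, but a
two-element slice after C<@>.

```perl
use strict;
use warnings;

my %hash;
my ($ind1, $ind2) = ('row', 'col');

# Scalar sigil: the comma-separated indices are joined with $; into ONE key.
$hash{$ind1, $ind2} = 'cell';
my $joined_key = join($;, $ind1, $ind2);

# Array sigil: the very same index list selects TWO separate keys (a slice).
@hash{$ind1, $ind2} = ('first', 'second');

print "keys stored: ", scalar(keys %hash), "\n";
print "emulated 2-d element: ", $hash{$joined_key}, "\n";
print "slice: ", join(' ', @hash{$ind1, $ind2}), "\n";
```

The hash ends up with three keys: the joined one, plus one for each
element of the slice.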

Observation3: the comma operator has 3 different meanings: one in list
context, one in scalar context, and one inside an index for a hash/array.

The following proposals move the special meaning from C<,> to C<;>,
introduce a new meaning for C<...> inside indices, 4 new tie()d methods,
and 4 new utility functions in a module C<tie::multi>.  An optional
OO access to slices, an optional iterator, and optional support
for builtin arrays complement the proposal.

=over

=item 1.  Multi-dim hashes

change multi-dimensional hash syntax to become

   $elmnt = $hash{$ind1;$ind2};

(so that C<;> is used as a separator instead of C<,>).  Note that this
is more compatible with the name of $;.

The difference with RFC 205 "New operator ';' for creating array slices"
is that C<;> is special inside a hash/array index only.

=item 2.  Tied multi-dim hashes

Allow multi-dimensional support for tied objects, so that

   $elmnt = $hash{$ind1;$ind2};
   $elmnt = $arry[$ind1;$ind2];

would call C<< tied(%hash)->FETCH_MULTY($ind1,$ind2) >> (same for @arry and
STORE_MULTY) in a scalar context.  If no FETCH_MULTY is available, do
as now: join() the arguments using $;.

=item 3.  Slices of tied array/hashes

Allow slices for tied objects, so that

   @slice = @hash{$ind1,$ind2};
   @slice = @arry[$ind1,$ind2];

would call C<< tied(%hash)->FETCH_SLICE($ind1,$ind2) >> (same for @arry and
STORE_SLICE) in a list context.  If no FETCH_SLICE is available, do
as now: get/set the elements one-by-one using FETCH/STORE.

STORE_SLICE should have a prototype

    sub STORE_SLICE ($me, @keys, @values);

to allow the proposals below, and to avoid the (giant) overhead of
an anonymous array.

=item 4.  Multidimensional slices

Extend "3" to allow slices for multi-dimensional tied objects,
so that

   @slice = @hash{$ind11,$ind12,$ind13 ; $ind21,$ind22};
   @slice = @arry[$ind11,$ind12,$ind13 ; $ind21,$ind22];

would call

   tied(%hash)->FETCH_SLICE( $ind11,$ind12,$ind13, tie::multi::separator(),
                             $ind21,$ind22 )

(same for @arry and STORE_SLICE) with a list context.  If no FETCH_SLICE
is available, this is an error.

A special token is used as the separator to avoid the (giant) overhead of
creating anonymous arrays, and to allow the proposal which follows.

=item 5. Ranges in slices

Extend "3", "4" to allow ranges in slices:

   @slice = @hash{$ind1, $ind2..$ind3, $ind4..$ind5, $ind6};
   @slice = @arry[$ind1, $ind2..$ind3, $ind4..$ind5, $ind6];

would call

   tied(%hash)->FETCH_SLICE( $ind1,
                             tie::multi::range(), $ind2, $ind3,
                             tie::multi::range(), $ind4, $ind5,
                             $ind6 )

(same for @arry and STORE_SLICE) in a list context.  Similarly for
combinations of C<..> and C<;> for multi-dimensional access.  If no
FETCH_SLICE is available, this is an error in the multi-dimensional case;
in the one-dimensional case the C<..> ranges are inlined (as now), and
FETCH/STORE are called one-by-one.

=item 6.  Bidirectional ranges

Extend this to allow bidirectional ranges for arrays:

   @slice = @arry[3, 4...-3, -6...7, 9];
   @slice = @arry[3, 4.. 7,   4.. 7, 9];

would behave the same if no FETCH_SLICE is defined (for C<@arry==10>).
If FETCH_SLICE is defined, the first slice will use tie::multi::bi_range()
as the token preceding each of the pairs (4,-3) and (-6,7).

=item 7.  Convenience index parser

Usage of tokens to separate arguments makes it possible to
parse arguments in XSUBs with a very low overhead.  To facilitate an
easier (but slower) parsing by Perl subroutines, provide a utility
which would translate keys which may include separators and ranges:

   @slice = @arry[3, 4...-3, 4..7 ; 4, 3...-5];
   sub FETCH_SLICE ($me, @keys) {
       my $dimensions = [$me->dimensions];
       @keys = tie::multi::inline($dimensions, @keys); ...
 
would make @keys into

     ( [3, 4..(10-3), 4..7], [4, 3..(20-5)] )

if $dimensions is C<[10,20]>.  Optionally, a call to
tie::multi::inline_ranges() would convert the above list to

     ( [3, [4,(10-3)], [4,7]], [4, [3,(20-5)]] )

=item 8. (Optional)  OO slicing

Allow slicing of objects.  The FETCH_SLICE
interfaces introduced above return lists.  Sometimes it is reasonable
to make them return objects: suppose that $matrix is a C<7x8> matrix,
which is a reference to a tie()d array (with appropriate overloading
as well).  Then

  @subvectors = @$matrix[2..5; 2..4]

would most probably return a list of 4 3-dimensional vectors.  However,
from time to time it is desirable to be able to extract the C<4x3> submatrix
with the indices specified above as I<one object>.  The proposal is to
allow the leading C<$> (instead of C<@>) as the syntax for such a request:

  $submatrix  = $$matrix[2..5; 2..4];

For example,

  $matrix1->[2..5; 2..4] * $matrix2->[1,3,5; 11..64];

would denote: create two new objects for the specified submatrices, apply
(overloaded) multiplication to these objects.  Such a request is illegal
for untie()d arrays; for tie()d arrays it is converted to a call to
FETCH_SLICE in a I<scalar> context.  (Alternative: introduce two new
tie()d methods: FETCH_SUBOBJECT, STORE_SUBOBJECT.)

Such a call evaluates indices to the matrix in a I<list> context.
Disambiguation of extraction of a 1-dimensional vector and of an element:

  $element  = $array[7];
  $vector1d = $array[(7)];

Similarly,

  $element  = $array[func()];   # func() in a scalar context
  $vector1d = $array[(func())]; # func() in a list context

(This is the only ambiguity, since C<,> and C<;> were prohibited
inside an array index in a scalar access to an array.)

=item 9. (Optional) Multi-dimensional builtin arrays. 

Translate

  $arry[$a;$b;$c]        to     $arry[$a][$b][$c]
  @arry[$a,$b...$c]      to     @arry[$a,$b..($c >=0 ? $c : @arry+$c)]
                                # Same translation for $b, in fact
  @arry[$a,$b;$c,$d,$e]  to     map [$arry[$_][$c,$d,$e]], $a, $b;
  $arry[$a,$b;$c,$d,$e]  to     [map [$arry[$_][$c,$d,$e]], $a, $b];

etc.

=item 10. (Optional)  ==>

Syntactic sugar C<< EXPR==>ACCESSOR >> for mapping:
This operator evaluates the EXPR in list context, and does the same as
C<< map $_->ACCESSOR, EXPR >>.  Same effects:

  $arry[$a,$b]==>[$c,$d,$e];
  map $arry[$_]->[$c,$d,$e], $a, $b;

(flat list, while C<$arry[$a,$b;$c,$d,$e]> is multi-dimensional).

=back
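
Proposal 2 can be approximated in today's Perl 5, which may help when
prototyping a class against the interface.  The sketch below is an
emulation, not the proposed implementation: C<SparseGrid> and
C<indices_of> are hypothetical names, the nested-hash storage is an
arbitrary choice, and FETCH/STORE merely split the key that Perl 5 has
already joined with C<$;> before delegating to FETCH_MULTY/STORE_MULTY.

```perl
use strict;
use warnings;

{
    package SparseGrid;    # hypothetical class, for illustration only

    sub TIEHASH { my ($class) = @_; return bless { cells => {} }, $class }

    # Proposed method: receives the indices as a list, not as one joined string.
    sub FETCH_MULTY {
        my ($self, @indices) = @_;
        my $node = $self->{cells};
        for my $index (@indices) {
            return undef unless ref($node) eq 'HASH' && exists $node->{$index};
            $node = $node->{$index};
        }
        return $node;
    }

    # Proposed method: the indices come first, the value to store comes last.
    sub STORE_MULTY {
        my ($self, @args) = @_;
        my $value = pop @args;
        my $last  = pop @args;
        my $node  = $self->{cells};
        for my $index (@args) {
            $node->{$index} = {} unless ref($node->{$index}) eq 'HASH';
            $node = $node->{$index};
        }
        $node->{$last} = $value;
        return;
    }

    # Helper (hypothetical): undo the join($;, ...) that Perl 5 applies.
    sub indices_of {
        my ($key) = @_;
        my $subsep = quotemeta($;);
        return split /$subsep/, $key, -1;
    }

    # Perl 5 entry points: delegate to the multi-index methods.
    sub FETCH { my ($self, $key) = @_; return $self->FETCH_MULTY(indices_of($key)) }
    sub STORE { my ($self, $key, $value) = @_; return $self->STORE_MULTY(indices_of($key), $value) }
}

tie my %grid, 'SparseGrid';
$grid{2, 3} = 'x';    # Perl 5 spelling; under this RFC it would be $grid{2;3}

print "direct:  ", tied(%grid)->FETCH_MULTY(2, 3), "\n";
print "via tie: ", $grid{2, 3}, "\n";
```

Under the proposal the split step would disappear, since the indices would
reach FETCH_MULTY without ever being joined.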
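
The token scheme of proposals 4 to 7 can likewise be sketched in plain
Perl 5.  Everything here is an assumption made for illustration: the RFC
does not say how the tokens are represented, so unique references stand in
for tie::multi::separator(), range() and bi_range(), and C<inline_keys> is
a hypothetical stand-in for tie::multi::inline().

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the tie::multi tokens: each one is a unique
# reference, so it cannot collide with a real index value.
my $SEPARATOR = ['separator'];
my $RANGE     = ['range'];
my $BI_RANGE  = ['bi_range'];

# Assumed behaviour of tie::multi::inline(): expand the flat key list into
# one array reference per dimension, with every range spelled out.
sub inline_keys {
    my ($dimensions, @keys) = @_;
    my @per_dim = ([]);
    while (@keys) {
        my $key = shift @keys;
        if (ref($key) && $key == $SEPARATOR) {
            push @per_dim, [];                      # start the next dimension
        }
        elsif (ref($key) && ($key == $RANGE || $key == $BI_RANGE)) {
            my ($from, $to) = splice @keys, 0, 2;   # the two range endpoints
            if ($key == $BI_RANGE) {
                my $size = $dimensions->[$#per_dim];
                for my $end ($from, $to) {
                    $end += $size if $end < 0;      # negative counts from the end
                }
            }
            push @{ $per_dim[-1] }, $from .. $to;
        }
        else {
            push @{ $per_dim[-1] }, $key;           # plain index
        }
    }
    return @per_dim;
}

# The flat list that @arry[3, 4...-3, 4..7 ; 4, 3...-5] would pass along.
my @parsed = inline_keys(
    [10, 20],
    3, $BI_RANGE, 4, -3, $RANGE, 4, 7,
    $SEPARATOR,
    4, $BI_RANGE, 3, -5,
);
print join(' ; ', map { "@$_" } @parsed), "\n";
```

With dimensions C<[10,20]> this reproduces the expansion given in proposal 7:
the first dimension becomes C<3, 4..7, 4..7> and the second C<4, 3..15>.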
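
For proposal 9, the intended results can be written out in valid Perl 5
over a nested array.  This is one reading of the translation table, with
made-up data and variable names; it assumes that a negative end of a
C<...> range counts from the end of the array (size plus the negative
index, as in the C<4..(10-3)> expansion of proposal 7).

```perl
use strict;
use warnings;

# A 4x5 nested array; the element at [r][c] holds the string "r,c".
my @arry = map { my $r = $_; [ map { "$r,$_" } 0 .. 4 ] } 0 .. 3;

my ($i, $j) = (1, 2);

# $arry[$i;$j] would mean a plain nested element access:
my $element = $arry[$i][$j];

# @arry[$i,$j ; 0,2,4] would mean one array reference per selected row,
# each holding the selected columns of that row:
my @rows = map { [ @{ $arry[$_] }[0, 2, 4] ] } $i, $j;

# @arry[1...-2] on this 4-element array would mean @arry[1 .. (4-2)]:
my $last = -2;
my @tail = @arry[1 .. ($last >= 0 ? $last : @arry + $last)];

print "element: $element\n";
print "row: @$_\n" for @rows;
print "tail rows: ", scalar(@tail), "\n";
```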

=head1 MIGRATION ISSUES

=over

=item a)

C<@arry[3...-5]> calls C<...> in a list context.  This operator was
not defined, but some people could have used it nevertheless.  Translate
to C<@arry[3..-5]>.  Probably a non-issue.

=item b)

C<$arry[3..5]> calls C<..> in a list context, while Perl5 called it
in a scalar context.  Translate to C<$arry[scalar(3..5)]>.
Probably a non-issue.

=item c)

C<$arry[(func())]> calls func() in a list context, while Perl5 called
it in a scalar context.  Translate to C<$arry[func()]>.  Probably a non-issue.

=back

Without the optional proposal 8, only "a" applies.


=head1 IMPLEMENTATION

With the exception of proposals 9 and 10, the implementation should be
straightforward.  Proposals 9 and 10 require some dexterity with building
optrees on the fly.

=head1 REFERENCES

RFC 205: New operator ';' for creating array slices.