Matt,

I tackle the problem of what I call "breaks" with a closure. A break arises because the "Test" values change faster (as you go down the rows) than the "Visit" values, which in turn change faster than the "Patient" values. From the POD:

break_col_iterator($data, $break_cols)

Returns an iterator which, when called, will return a set of rows which
are grouped at column breaks for (zero-indexed) column numbers given in
C<@$break_cols>

For example, data which is declared as:

  $data = [
    [ qw( a b c 1 2 ) ],
    [ qw( a b c 3 4 ) ],
    [ qw( a b d 5 6 ) ],
    [ qw( a b d 7 8 ) ],
    [ qw( a e d 9 0 ) ],
    [ qw( f e d 1 2 ) ],
          ];

In the following code:

  $break_cols = [ 0..2 ];
  $iter = break_col_iterator($data, $break_cols);

  while ( my $subset = $iter->() ) {
    ...
  }

for each iteration of the C<while> loop, C<@$subset> would contain, in
order, lines 1&2, 3&4, 5, 6 (line numbers starting from 1)

For C<$break_cols = [ 0, 2 ]>, C<@$subset> would contain, in order, lines
1&2, 3..5, 6.

The body of break_col_iterator() (with error and array-bound checking statements removed) is quite compact:

sub break_col_iterator {
  my ($data, $break_cols) = @_;

  # make $data a local copy (of the row pointers)
  $data = [ @$data ];

  my $previous_line = shift @$data;

  return sub {
    my @result;
    my $line;

    return unless $previous_line; # end of data

    push @result, $previous_line;

    while ($line = shift @$data) {

      last
        unless vec_eq([ @{ $previous_line }[@$break_cols] ],
                      [ @{          $line }[@$break_cols] ]);

      push @result, $line;
    }
    # line was different from previous, or end-of-data
    $previous_line = $line;

    return \@result;
  }
}

vec_eq($v1, $v2) returns true if vectors (array references) $v1 and $v2 contain the same data. The comparison operator used is 'eq' rather than '==', and two undef values compare equal. For example, with @a = (1, 2, undef, 4, 5), @b = (1, 2, 3, 4, 5) and @c = (1, 2, undef, 4, 5), vec_eq(\@a, \@b) would return FALSE, but vec_eq(\@a, \@c) would return TRUE.
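For readers who want to try the iterator, here is a minimal sketch of what a vec_eq() along those lines might look like. It is not the module's actual code, only an illustration of the behaviour described above. Treating vectors of different lengths as unequal is an assumption of this sketch; it never arises in break_col_iterator(), where both slices are taken with the same column list.

```perl
use strict;
use warnings;

# Hypothetical sketch of vec_eq(); not the module's actual implementation.
sub vec_eq {
    my ($v1, $v2) = @_;

    # Assumption: vectors of different length are never equal.
    return 0 unless @$v1 == @$v2;

    for my $i (0 .. $#$v1) {
        my ($x, $y) = ($v1->[$i], $v2->[$i]);

        # Two undefs match each other; undef never matches a defined value.
        next if !defined $x && !defined $y;
        return 0 unless defined $x && defined $y;

        # String comparison ('eq'), so '1' and '1.0' count as different.
        return 0 unless $x eq $y;
    }
    return 1;
}

my @a = (1, 2, undef, 4, 5);
my @b = (1, 2, 3,     4, 5);
my @c = (1, 2, undef, 4, 5);

print vec_eq(\@a, \@b) ? "equal\n" : "different\n";
print vec_eq(\@a, \@c) ? "equal\n" : "different\n";
```

Checking definedness before using 'eq' keeps the comparison free of "uninitialized value" warnings under "use warnings".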

I hope this allays your fears.

Sam
On 3/30/2011 12:49 PM, Matt S Trout wrote:
On 28/03/11 16:49, Sam Brain wrote:
I would like to ask the group for advice on module naming, as I have seen
some missteps in the past.

I have written a small module which takes the output of
DBI::fetchall_arrayref() or its ilk, and generates Moose objects from it.

Showing great imagination, I have called the module
DBIx::BuildMooseObjects. I picked "DBIx::" as the module's functionality
seems to fit the DBIx namespace.

There are two main exported routines, rather clumsily named
mk_AoMobj_from_2d_array() and mk_complex_Mobj_from_2d_array(). The first
returns (a ref to) an array of (already-declared) Moose objects, the
second (a ref to) an array of complex, nested Moose objects (sub-objects
declared as "isa => 'ArrayRef[...]' " )

You'll want to make sure the unrolling is configurable.

The crawling horror that is _collapse_result in DBIx::Class::ResultSet may prove enlightening as to the contortions one ends up going through to make this stuff work.


--
Sam Brain
Department of Radiation Oncology
Stanford Medical Center
Stanford, CA 94305
