Hi again,

The script below extracts 10 columns from a 2D raw array
of 7688 columns (and 137050 rows), and then it accesses
those 10 columns in some irrelevant way. Physical RAM
consumption then climbs to about 1 GB. This seems wrong;
working on a tiny part of a dataset shouldn't trigger all
of it to be loaded. I hope I am missing something, or else
this will be a show-stopper for me, because the 2D arrays
can be as large as the file system allows. PDL works
underneath this,

http://rnp.uthct.edu:8000/UTHCT

and I would like to "scale up" to the largest datasets.
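For what it's worth, the 1 GB figure may simply be page-granularity
behaviour of the memory map rather than PDL reading the file eagerly:
each row is 7688 bytes, so touching even one byte per row faults in at
least one OS page per row, and 10 columns scattered across the row will
touch most pages. A rough sketch of the arithmetic (in Python for
brevity; the 4096-byte page size is an assumption, and kernel readahead
can pull in even more):

```python
# Back-of-the-envelope: why slicing a few columns out of a row-major
# memory map can fault in nearly the whole file.

PAGE = 4096                   # typical OS page size (assumption)
cols, rows = 7688, 137050     # dimensions from the script below

bytes_per_row = cols          # datatype is 'byte', so 1 byte per cell
total_bytes = bytes_per_row * rows
pages_per_row = -(-bytes_per_row // PAGE)   # ceiling division -> 2

# Reading a single column still touches (at least) one page per row:
min_bytes_resident = rows * PAGE

print(total_bytes)         # -> 1053640400 (~1.05 GB: the whole file)
print(min_bytes_resident)  # -> 561356800 (~561 MB for ONE column)
```

With 10 random columns spread across the two pages of each row, the
resident set approaches the full ~1 GB, which matches what you observe.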

Niels L
Danish Genome Institute



#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Common::Messages;
use Common::Util;

{
    local $SIG{__DIE__};

    require PDL::Lite;
    require PDL::Char;
    require PDL::IO::FastRaw;
}

my ( $header, $big_pdl, $pdl, $cols, $rows, $i, @cols );

$cols = 7688;
$rows = 137050;

$big_pdl = PDL->mapfraw( "current_prokMSA_aligned.pdl", {
    'NDims' => 2,
    'Datatype' => 'byte',
    'Dims' => [ $cols, $rows ],
    'ReadOnly' => 1,
});

# Create 10 random sorted column indices and get corresponding pdl,

@cols = sort { $a <=> $b } map { int $cols * rand(1) } ( 0 .. 9 );

$pdl = $big_pdl->dice( \@cols )->sever;

# Do some operation on those 10 columns,

foreach $i ( 0 .. 9 )
{
    my $col = $pdl->slice( "($i),:" )->hist;
}

# Give time to observe physical RAM consumption at 1 GB,

sleep 1 while 1;    # idle instead of busy-looping the CPU
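
If page-granularity faulting is indeed what you are seeing, one
workaround for scattered-column access is to skip the map entirely and
read just the wanted bytes with explicit seeks; the pages still pass
through the kernel's page cache, but they are reclaimable and do not
stay in your process's resident set. A minimal sketch of the idea, in
Python rather than PDL, using a hypothetical `read_columns` helper and
a small throwaway file:

```python
def read_columns(path, cols, rows, wanted):
    """Read selected byte-columns from a row-major 2D byte file
    by seeking to each wanted cell (assumes 1-byte cells)."""
    out = [bytearray(rows) for _ in wanted]
    with open(path, "rb") as f:
        for r in range(rows):
            base = r * cols
            for j, c in enumerate(wanted):
                f.seek(base + c)
                out[j][r] = f.read(1)[0]
    return out

# usage: build a tiny test file where cell (r, c) == (r + c) % 256
rows, cols = 50, 40
data = bytes((r + c) % 256 for r in range(rows) for c in range(cols))
with open("/tmp/demo.raw", "wb") as f:
    f.write(data)

columns = read_columns("/tmp/demo.raw", cols, rows, [3, 17, 39])
assert columns[0][10] == 13     # cell (10, 3)
```

One seek per cell is slow for many columns; batching into one
row-sized read per row is a reasonable middle ground if the wanted
columns are dense.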


_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
