On 7/7/2011 2:55 AM, dpath2o wrote:
Chris, et al.

I'm through the header and now into the data portion of this file. I have
used your suggestions to get through the header and they have worked really
nicely.

I have successfully read the data into a piddle with the following command:
my $data = readflex($fh, [ { Type=>'float', NDims=>3,
Dims=>[$Dpplr_n,10,$rng_n] } ] );
Note: the second dimension is 10 (not 9) because of an extra set of quality
control values for each range cell.

However, that method makes getting out the data a little more tricky and the
following method is more intuitive:
for (my $i=0; $i<$rng_n; $i++) {
   my @data = readflex($fh, [
     { Type=>'float', NDims=>1, Dims=>[$Dpplr_n] },
     { Type=>'float', NDims=>1, Dims=>[$Dpplr_n] },
     { Type=>'float', NDims=>1, Dims=>[$Dpplr_n] },
     { Type=>'float', NDims=>2, Dims=>[2,$Dpplr_n] },
     { Type=>'float', NDims=>2, Dims=>[2,$Dpplr_n] },
     { Type=>'float', NDims=>2, Dims=>[2,$Dpplr_n] },
     { Type=>'float', NDims=>1, Dims=>[$Dpplr_n] }
    ]);
}

The problem I'm stuck on right now is verifying that the data I read is
actually the data!

If the reading is correct, I would expect the values to
be identical.  The general approach I use is to read a single
chunk of data by hand in the pdl2 shell doing one readflex()
at a time and verify the data step-by-step.

Some problems you might be having (the kind I usually
see myself):

(1) perl is 0 based and matlab is 1 based array indexing
    so you could confirm that the offsets are not off-by-1

(2) data big/little-endian issues (does the data look
    better if you swap bytes or reverse them?).  Try using
    unpack on the data chunk using the get_dataref trick
    to investigate.

(3) dimension order for matlab and PDL multidimensional
    data is different---probably not the issue here with
    all the 1-D reads, but it could be.
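
Point (2) can be tried out without PDL at all. The following is a minimal
sketch in core Perl (5.10 or later for the `<` and `>` modifiers). The sample
values and variable names are made up for illustration and are not taken from
the radar file. It packs a few floats big-endian and then decodes the same
bytes both ways, which is the same comparison you would make on the string
behind get_dataref:

```perl
use strict;
use warnings;

# Made-up sample values, chosen because they are exact in single precision.
my @expected = (1.5, -2.25, 1024);

# Pretend this is a chunk of the file, written as big-endian 32-bit floats.
my $raw = pack 'f>*', @expected;

# Decode the same bytes with both byte orders and compare by eye.
my @as_le = unpack 'f<*', $raw;
my @as_be = unpack 'f>*', $raw;

# Reversing each 4-byte group and decoding little-endian is equivalent
# to decoding the original bytes as big-endian.
my $swapped    = join '', map { scalar reverse $_ } unpack '(a4)*', $raw;
my @swapped_le = unpack 'f<*', $swapped;

printf "little-endian: %s\n", join ' ', map { sprintf '%g', $_ } @as_le;
printf "big-endian:    %s\n", join ' ', map { sprintf '%g', $_ } @as_be;
printf "swapped + LE:  %s\n", join ' ', map { sprintf '%g', $_ } @swapped_le;
```

If one decoding gives sensible magnitudes and the other gives wild exponents,
the byte order is the likely culprit. For piddles, the PDL::IO::Misc
documentation describes bswap4 for swapping 4-byte elements; check that
against your installed version.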

Hope this helps,
Chris


If you get a moment to have a look at the two attached plots, you'll notice
(in the file with "matlab" in its name) what the data from $data[0] should
look like for almost any $i (iteration through $rng_n). However, the other
figure (pgplot.pl) does not look correct. So I thought maybe there's
something weird going on with PGPLOT, right? Unfortunately the values
themselves do not match up, so I know it's not a problem with
PDL::Graphics::PGPLOT::Window.

Here are the first 10 values extracted from matlab from $data[0] at $i=1:
2.7007e-12   1.4496e-12   8.9045e-13   1.0742e-12   1.9727e-12   3.0348e-12
   2.4803e-12   3.3136e-12   3.0401e-12   1.5643e-12

And here are the first 10 values extracted with PDL::IO::FlexRaw from
$data[0] at $i=1:
-2.88438e+09   8.52691e+29   8.71627e-27   -4.9354e+19   -685.496    43.237
   2.70207   -8.28144e+26   3.50417e+33   2.56412e-06

I'm hoping you might be able to shed some more wisdom on this subject.

Kind Regards.

On 28 June 2011 00:01, Chris Marshall<[email protected]>  wrote:

On Mon, Jun 27, 2011 at 9:01 AM, dpath2o<[email protected]>  wrote:
Chris, David, and Ingo

... I can do this in Matlab, however, the IO in Matlab is woefully
slow and I need to go through thousands of these files, hence my
switch to Perl and PDL. Since I knew a bit of Perl for file organisation
I thought that converting my code from Matlab to Perl (with PDL)
would be straightforward ... it's not, at least, for me!

If you have matlab code to read these files, it should
be possible to transcribe them to perl/PDL by mapping
to corresponding IO functions.  Please keep track of
where you get stuck or where something is not clearly
documented/discoverable.

@Chris, starting off, I'm confused about $tmpl and what it should look like.
I have read through perlpacktut and feel even stupider now. Actually, I
understood the words but do not know how to apply them to this particular
type of file ... I have tried just unpacking the first few bytes, and even
that was poking around in the dark. I do realise $tmpl should be something
like 'n+xn+xn ... ', but I'm lacking something fundamental here because
the documentation and what I implement don't produce the desired result.

I suggest working out the unpack arguments by going
interactively in the pdl2 (or perldl) shell.  For example,
when I need to build a template string, I start by reading
in some example data, and then repeating unpack's
with different template strings and printing the output
until I get what I want.  E.g.,

  pdl>  $fh = IO::File->new('datafile')

  pdl>  { local $/; $file =<$fh>; }

  pdl>  p $file
  This is from the
  datafile.  Your
  data would need
  to be unpacked..

  pdl>  $tmpl = 'S S C C S'  # i.e., ushort, ushort, uchar, uchar, ushort

  pdl>  @hdr = unpack $tmpl, $file

  pdl>  print "@hdr"

If your datatypes match existing PDL ones, you
can use readflex to read the header directly as the
example from David shows.
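
The same trial-and-error can be done as a small script. This is only a
sketch: the field layout, the field values, and the little-endian assumption
are all hypothetical, and `$file` here is fabricated with pack rather than
read from a real data file. It also shows one way to ask a template how many
bytes it covers, which is handy for checking a header size:

```perl
use strict;
use warnings;

# Hypothetical header layout: ushort, ushort, uchar, uchar, ushort.
# The '<' forces little-endian; drop or flip it to match your file.
my $tmpl = 'S< S< C C S<';

# Stand-in for the first bytes of a real file (made-up field values).
my $file = pack $tmpl, 512, 80, 1, 0, 2011;

# Number of bytes the template consumes: 2 + 2 + 1 + 1 + 2.
my $hdrsize = length pack $tmpl, (0) x 5;

# Unpack only the header-sized prefix and print the fields.
my @hdr = unpack $tmpl, substr($file, 0, $hdrsize);
print "template covers $hdrsize bytes: @hdr\n";
```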

Good luck,
Chris


--------------------------------------------
Daniel Atwater
Australian Coastal Ocean Radar Network
James Cook University
P: +61(0)7 4781 4184
M: +61(0)4 2991 4545
E: [email protected]


On 27 June 2011 21:03, chm<[email protected]>  wrote:

On 6/26/2011 8:01 PM, Chris Marshall wrote:

I can't help you with specifics (the description is
a bit sketchy) but I can suggest a few things.

(1) PDL::IO::FlexRaw is well suited for reading
binary data files of multidimensional arrays
but for mixed-type headers (like a struct in
C) pack/unpack is your friend.

(2) Read the header part into a byte pdl of the
appropriate size (assuming you know how big
it is). E.g.,

$hdr = readflex(FH, [{Type=>'byte',NDims=>1,Dims=>[$hdrsize]}]);

Here FH should be either \*FH (a reference to a file
handle/typeglob) or a handle from IO::File->new().

Now you have $hdr as a $hdrsize piddle of bytes. You
can access the bytes in the piddle using the get_dataref
method which returns a perl ref to the pdl data as a string
which you can use with unpack to extract any needed
fields:

@fields = unpack $tmpl, ${ $hdr->get_dataref };

where $tmpl is the pack/unpack template for the
header data you have.
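
As a cross-check that does not involve PDL, the header bytes can also be
pulled in with Perl's built-in read() and handed to unpack. This is a
swapped-in technique, not the readflex/get_dataref route described above. The
header layout, the `HDR1` tag, and the in-memory "file" below are made up for
illustration:

```perl
use strict;
use warnings;

# Hypothetical header: 4-char tag followed by two little-endian ushorts.
my $tmpl    = 'a4 S< S<';
my $hdrsize = 8;

# Fabricated file image: header followed by 12 little-endian floats.
my $image = pack($tmpl, 'HDR1', 3, 4) . pack('f<*', 1 .. 12);

# Open the string as a file so the code looks like a real file read.
open my $fh, '<', \$image or die "cannot open in-memory file: $!";
binmode $fh;

# Read exactly the header and refuse to continue on a short read.
my $got = read $fh, my $hdr, $hdrsize;
die "short header read" unless defined $got && $got == $hdrsize;

my ($tag, $n_range, $n_doppler) = unpack $tmpl, $hdr;
my $data_start = tell $fh;   # where the array data begins
print "tag=$tag nRange=$n_range nDoppler=$n_doppler data at byte $data_start\n";
```

If the fields decoded this way disagree with what you get from the byte
piddle, the template or the header size is the first thing to recheck.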

(3) Now you can read the piddle data using the info
in the header. Since you appear to have an
array of structures, you need to loop over the
nRangeCells:

for (my $i=0; $i<$nRangeCells; $i++) {
  @data = readflex($fh, [
    { Type=>'float', NDims=>1, Dims=>[$nDopplerCells] },
    { Type=>'float', NDims=>1, Dims=>[$nDopplerCells] },
    { Type=>'float', NDims=>1, Dims=>[$nDopplerCells] },
    { Type=>'float', NDims=>2, Dims=>[2,$nDopplerCells] },
    { Type=>'float', NDims=>2, Dims=>[2,$nDopplerCells] },
    { Type=>'float', NDims=>2, Dims=>[2,$nDopplerCells] } ]);
  # do something with @data here
  # you'll have to handle any special cases as well
}

Also, note the use of Type=>'float' with [2,nDopplerCells]
rather than Type=>'complex' with [nDopplerCells] since PDL
doesn't have a native C complex data type.
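
To check that the record layout is what you think it is, the same loop can be
mimicked in core Perl on a fabricated buffer. This is a sketch under stated
assumptions: tiny made-up sizes, little-endian floats, and hypothetical field
names (`real1`..`real3`, `cplx1`..`cplx3`). The data are consecutive integers
so it is easy to see which file position ends up in which field:

```perl
use strict;
use warnings;

my ($n_range, $n_doppler) = (2, 3);       # tiny made-up sizes
my $floats_per_cell = 9 * $n_doppler;     # 1+1+1+2+2+2 floats per Doppler cell

# Fabricated data section: float k holds the value k.
my $image = pack 'f<*', 0 .. $n_range * $floats_per_cell - 1;
open my $fh, '<', \$image or die "cannot open in-memory file: $!";
binmode $fh;

my @cells;
for my $i (0 .. $n_range - 1) {
    # One range cell is one contiguous record of 4-byte floats.
    my $got = read $fh, my $buf, 4 * $floats_per_cell;
    die "short read at range cell $i"
        unless defined $got && $got == 4 * $floats_per_cell;

    my @v = unpack 'f<*', $buf;
    my %rec;
    # Three real fields, then three complex fields stored as re,im,re,im,...
    $rec{"real$_"} = [ splice @v, 0, $n_doppler ]     for 1 .. 3;
    $rec{"cplx$_"} = [ splice @v, 0, 2 * $n_doppler ] for 1 .. 3;
    push @cells, \%rec;
}

print "cell 1, first complex field: @{ $cells[1]{cplx1} }\n";
```

Running the same reads on the first record of a real file and printing a few
values per field gives something to compare against the matlab output.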

For more performance (and perhaps clarity), you
could take advantage of the fact that all of the
data appears to be in chunks of 'float' assuming
the complex data is single precision.  You could
replace the entire loop with a single read and
use PDL slicing operations to rearrange the data.
E.g., the above example could become:

  $data = readflex($fh,
   [ { Type=>'float',
       NDims=>3,
       Dims=>[$nDopplerCells,9,$nRangeCells] } ] );

where 9 is 1+1+1+2+2+2 and you could slice out
the first complex chunk for $i==7 as

  $data(:,3:4,(7))->clump(2)->splitdim(0,2)

where in the (untested) code above the clump
and splitdim are used to make the slice dims
match the actual data dims.  It might be simpler
to read a 1-dim piddle of 'float' and slice from
that instead....
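
The index arithmetic behind that slice can be sketched in core Perl, assuming
the first dimension varies fastest in memory and using made-up sizes. The
helper name `flat_index` and the integer stand-in data are hypothetical. It
locates the first complex chunk for range cell 7 in a flat list and pairs up
the (re,im) values, which is what the clump/splitdim step is meant to
produce:

```perl
use strict;
use warnings;

my ($n_doppler, $n_range) = (3, 8);                 # made-up sizes
my @flat = (0 .. $n_doppler * 9 * $n_range - 1);    # stand-in for the floats

# Flat position of element (d, k, r) in a [nDoppler, 9, nRange] array
# whose first dimension varies fastest.
sub flat_index {
    my ($d, $k, $r) = @_;
    return $d + $n_doppler * ($k + 9 * $r);
}

# Rows 3 and 4 of range cell 7 are 2*nDoppler contiguous floats.
my $start = flat_index(0, 3, 7);
my @chunk = @flat[ $start .. $start + 2 * $n_doppler - 1 ];

# Regroup the interleaved values as [re, im] pairs, one per Doppler cell.
my @pairs = map { [ @chunk[ 2 * $_, 2 * $_ + 1 ] ] } 0 .. $n_doppler - 1;
print join(' ', map { "($_->[0],$_->[1])" } @pairs), "\n";
```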

(4) Whatever you have for NDims should be the same as
the number of elements of your Dims=>[] array ref
so your use of an 80 dimensional array is not
consistent with the single dimension you specify.
NOTE: I've never seen data with more than several
dimensions. If you have 80, then something may
be suspect...
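
A mismatch like the one in (4) is easy to catch before calling readflex. The
following is a minimal sketch of a sanity check on a header spec. The
function name `check_flex_header` is hypothetical, and the lists of accepted
type names and keys are assumptions for this example, so compare them with
the PDL::IO::FlexRaw documentation for your version:

```perl
use strict;
use warnings;

# Assumed subsets; extend to match your PDL::IO::FlexRaw version.
my %known_type = map { $_ => 1 } qw(byte short ushort long longlong float double);
my %known_key  = map { $_ => 1 } qw(Type NDims Dims);

# Return a list of human-readable problems found in a header spec.
sub check_flex_header {
    my ($hdr) = @_;
    my @problems;
    for my $idx (0 .. $#$hdr) {
        my $e = $hdr->[$idx];
        push @problems, "entry $idx: unknown key '$_'"
            for grep { !$known_key{$_} } sort keys %$e;
        push @problems, "entry $idx: unknown Type '" . ($e->{Type} // 'undef') . "'"
            unless defined $e->{Type} && $known_type{ $e->{Type} };
        push @problems, "entry $idx: NDims does not match number of Dims"
            unless defined $e->{NDims} && ref($e->{Dims}) eq 'ARRAY'
                && $e->{NDims} == @{ $e->{Dims} };
    }
    return @problems;
}

my $n_doppler = 4;   # made-up size
my @problems = check_flex_header([
    { Type => 'float', NDims => 1, Dims => [$n_doppler] },      # fine
    { Type => 'float', NDims => 1, Dims => [2, $n_doppler] },   # NDims should be 2
    { Type => 'flota', NDims => 2, Dims => [2, $n_doppler] },   # misspelled type
]);
print "$_\n" for @problems;
```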

Hope this helps,
Chris

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
