Re: Checking header is csv file for accuracy

Jenda Krynicky Thu, 19 Dec 2002 06:07:09 -0800

From: "Perl" <[EMAIL PROTECTED]>
> This should work (but beware - it is untested :)
> 
> Rob
> 
>     my @required = qw(head1 head6 head8);
>     my $line;
> 
>     for (<DATA>)


It's much better to use 
        while (<DATA>)

This "for (<DATA>)" evaluates <DATA> in list context, so Perl reads 
the whole file into a list in memory before the loop body runs even 
once. "while (<DATA>)" reads one line per iteration. Perl does not 
rewrite the one into the other for you, so do not count on the 
optimizer here.
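
If you want to see the difference for yourself, here is a minimal sketch. 
It assumes an in-memory filehandle with made-up contents, and the helper 
name at_eof_on_first_line is mine, not anything standard. It checks 
whether the handle is already at end-of-file while the loop body handles 
the very first line.

```perl
use strict;
use warnings;

# Made-up sample data: one header line and two data lines.
my $text = "head1,head6,head8\nrow1\nrow2\n";

# Hypothetical helper: returns 1 if the handle is already at EOF
# while the loop body runs for the first line, 0 otherwise.
sub at_eof_on_first_line {
    my ($style) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!\n";
    my $at_eof;
    if ($style eq 'for') {
        for my $line (<$fh>) {        # list context: every line is read first
            $at_eof = eof($fh) ? 1 : 0;
            last;
        }
    }
    else {
        while (my $line = <$fh>) {    # scalar context: one line per iteration
            $at_eof = eof($fh) ? 1 : 0;
            last;
        }
    }
    close $fh;
    return $at_eof;
}

print "for:   ", at_eof_on_first_line('for'),   "\n";   # prints "for:   1"
print "while: ", at_eof_on_first_line('while'), "\n";   # prints "while: 0"
```

With a three-line file the cost is nothing, but with a few hundred 
megabytes of CSV the "for" version holds all of it in memory at once.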

>     {
>         chomp;
>         my @field = split /,/;

While this may be fine for column headers, I would not recommend doing 
this with actual data. What if some of the fields are quoted, and what 
if they contain commas? Use Text::CSV_XS instead.
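
A small sketch of the problem, assuming a made-up record with a comma 
inside a quoted field. It uses only the parse() and fields() methods of 
Text::CSV_XS, and the sample line is invented for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Made-up sample record: the first field contains a comma.
my $line = '"Smith, John",42,London';

# Naive approach: split on every comma, quoted or not.
my @naive = split /,/, $line;

# Text::CSV_XS honours the quotes and strips them from the result.
my $csv = Text::CSV_XS->new({ binary => 1 });
$csv->parse($line) or die "Can't parse line: $line\n";
my @fields = $csv->fields;

print scalar(@naive),  " fields from split\n";          # 4
print scalar(@fields), " fields from Text::CSV_XS\n";   # 3
print "first field: $fields[0]\n";                      # Smith, John
```

The split version cuts the name in two and leaves the quote characters 
in the data, which is exactly the kind of bug that only shows up once 
real data arrives.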

"News.Support.Veritas.Com" <[EMAIL PROTECTED]> wrote 
in message news:[EMAIL PROTECTED]...
> Before working with lines in a csv file I would like to check the headers
> from the file for accuracy i.e. do the headers I expect exist, and are they
> in the right order.

Well then just read the first line into an array and compare it with 
whatever you need.

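        use strict;
        use warnings;
        use IO::Handle;
        use Text::CSV_XS;

        my $csv = Text::CSV_XS->new();
        # three-argument open, so odd characters in $filename are harmless
        open my $DATA, '<', $filename
                or die "Can't open the file $filename : $!\n";

        # getline() reads and parses one record and returns an array ref
        my $headers = $csv->getline($DATA)
                or die "It ain't a CSV!\n";

        print "The headers found in the file are: \n\t",
                join( "\n\t", @$headers),"\n";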
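
The "compare it with whatever you need" part could look something like 
the sketch below. It assumes you want a strict match: the same names, in 
the same order, and no extra columns. The header names are borrowed from 
Rob's example, and the sub name header_problems is hypothetical. Relax 
the checks if your files are allowed extra columns.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problems, empty if the
# headers in @$got match @$expected exactly and in order.
sub header_problems {
    my ($got, $expected) = @_;
    my @problems;
    for my $i (0 .. $#$expected) {
        my $have = defined $got->[$i] ? $got->[$i] : '(missing)';
        push @problems,
            "column $i: expected '$expected->[$i]', found '$have'"
            if $have ne $expected->[$i];
    }
    # Assumption: extra trailing columns count as an error.
    push @problems, "unexpected extra columns" if @$got > @$expected;
    return @problems;
}

my @expected = qw(head1 head6 head8);

my @ok  = header_problems([qw(head1 head6 head8)], \@expected);
my @bad = header_problems([qw(head1 head8 head6)], \@expected);

print "good file: ", scalar(@ok),  " problem(s)\n";   # 0
print "bad file:  ", scalar(@bad), " problem(s)\n";   # 2
print "  $_\n" for @bad;
```

In the real script you would pass $headers (the array ref returned by 
getline) as the first argument and die if the returned list is not empty.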

> Alternatively, and even better, I would like to check all headers I
require exist, then for each parsed line extract only the fields that match the
> respective header values I want. Can anyone guide me in the best direction
> to achieve either option.

I guess something like this could do what you want:

        ...
        my @wanted = qw(Foo Bar Baz); # the columns you want
        my @idx; # indexes of the columns in the CSV in the order 
                # you want them

        foreach my $wanted (@wanted) {
                my $found = 0;
                for (my $i = 0; $i < @$headers; $i++) {
                        if ($wanted eq $headers->[$i]) {
                                push @idx, $i;
                                $found = 1;
                                last;
                        }
                }
                die "Can't find required column '$wanted' !\n"
                        unless $found;
        }
        # E.g. if $headers = [ 'Some', 'Bar', 'Other', 'Foo', 'Baz', 'XXX']
        # then @idx will be set to ( 3, 1, 4),
        # that is, 'Foo' is in column 3 (counting from zero!),
        # 'Bar' in column 1 and 'Baz' in column 4

        my @data;
        while (my $columns = $csv->getline($DATA)) {
                @data = @$columns[@idx];
                # @data now contains the columns you are interested in,
                # in the order you asked for them
                ...
        }
        close $DATA;
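
The nested loops can also be replaced by a hash lookup. This is just an 
alternative sketch, not a correction. It uses the same example headers 
as the comment in the code, plus a made-up data row, and the name %pos 
is mine. One difference to be aware of: if a header name occurs twice in 
the file, the hash remembers the last occurrence, while the loop version 
picks the first.

```perl
use strict;
use warnings;

my $headers = [ 'Some', 'Bar', 'Other', 'Foo', 'Baz', 'XXX' ];
my @wanted  = qw(Foo Bar Baz);

# Map each header name to its column index with a hash slice.
my %pos;
@pos{@$headers} = (0 .. $#$headers);

# Report every missing column at once instead of dying on the first.
my @missing = grep { !exists $pos{$_} } @wanted;
die "Can't find required column(s): @missing\n" if @missing;

# Indexes of the wanted columns, in the order they were asked for.
my @idx = @pos{@wanted};
print "@idx\n";     # 3 1 4

# Made-up data row standing in for what getline() would return.
my $columns = [ 's', 'b', 'o', 'f', 'z', 'x' ];
my @data = @$columns[@idx];
print "@data\n";    # f b z
```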


HTH, Jenda

P.S.: Rob, could you change the comment part of your mail address?
It's kinda strange to see a post coming from "Perl". :-)
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed 
to get drunk and croon as much as they like.
        -- Terry Pratchett in Sourcery

