RFC: New interface for Text::CSV_XS

Jeff Zucker Thu, 16 Dec 2004 12:06:43 -0800

While not strictly DBI, this request for comments may be of interest to many on this list, so, with Tim's permission, I am posting here.

I propose to add a new interface to Text::CSV_XS and would value comments and suggestions. This interface will be *in addition to* the existing interface so scripts using the old interface will not require any changes.

To use the new interface, users will get a Text::CSV_XS object by calling open_file() or open_string() methods instead of calling new(). To use the old interface, users will continue to get the object by calling new() and should ignore the open_file() and all the other new interface methods. The new interface is simply a wrapper around the old interface so actual data access and modification will be the same.

The new methods should be strangely familiar to those on this mailing list :-).

=head1 SYNOPSIS

 use Text::CSV_XS;

 $c=Text::CSV_XS->open_file($filename,\%attr)   # open a CSV file
 $c=Text::CSV_XS->open_file(*filehandle,\%attr) # use an existing handle
 $c=Text::CSV_XS->open_string($string,\%attr)   # open a CSV string

 @row  =$c->fetchrow_array         # fetch one row into an array
 $row  =$c->fetchrow_arrayref      # fetch one row into an array ref
 $row  =$c->fetchrow_hashref       # fetch one row into a hashref
 $table=$c->fetchall_arrayref      # fetch all rows into an array ref
                                   # of array refs
 $table=$c->fetchall_hashref($key) # fetch all rows into a hashref

 $c->write_row( @array )    # insert one row from an array of values
 $c->write_table($arrayref) # insert multiple rows from an arrayref
 $c->write_table($hashref)  # insert multiple rows from a hashref

                     # loop through a file fetching hashrefs
                     # note: undef is returned at EOF so the loop
                     # aborts as it should; at conclusion it
                     # seeks to start of data, so may be reused;
                     # the same is true for loops with the other
                     # fetchrow methods

 my $c = Text::CSV_XS->open_file( $filename );
 while(my $row = $c->fetchrow_hashref){
    if($row->{$column_name} eq $value){
        # do something
    }
 }

 # note1: The $filename in open_file() must be preceded by '>'
 # if you intend to write data to a new file, by '>>' if you
 # intend to append data to an existing file, and by nothing
 # if you intend to read from the file.  If you intend to write
 # and then read, you need to reopen the file in read mode.

 # note2: All methods in the new interface, including open_file()
 # and open_string(), will die on error. You do *not* need to
 # check for open or parse errors.  You *do* need to use eval
 # if you want to trap errors.

 # note3: open_file() and open_string() take an optional second
 # parameter which should be a hashref containing settings for
 # separators, delimiters, escape characters, etc, for example:

 $c=Text::CSV_XS->open_file( ">$filename" ,
     { quote_char => q{'}
     , sep_char   => q{;}
     , columns    => q{partid,partname}
     });
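
To make note1 and note2 concrete, here is a minimal sketch of how the mode prefix might be split from the filename and how a caller would trap the resulting die with eval. The helpers split_mode() and open_or_die() are hypothetical names invented for this illustration; they are not part of Text::CSV_XS or of the proposed interface.

```perl
use strict;
use warnings;
use IO::File;

# Hypothetical helper: separate the optional '>' or '>>' prefix (note1)
# from the filename. No prefix means read mode.
sub split_mode {
    my ($spec) = @_;
    my ($prefix, $name) = $spec =~ /^\s*(>>|>)?\s*(.*)$/s;
    my $mode = defined $prefix ? $prefix : '<';
    return ($mode, $name);
}

# Hypothetical helper: open the file or die, as note2 describes.
sub open_or_die {
    my ($spec) = @_;
    my ($mode, $name) = split_mode($spec);
    my $fh = IO::File->new($name, $mode)
        or die "Cannot open '$name' in mode '$mode': $!\n";
    return $fh;
}

# The caller does not check return values; eval is only needed
# when the error should be trapped rather than be fatal.
my $fh = eval { open_or_die('/no/such/dir/parts.csv') };
print "trapped: $@" unless $fh;
```

Without the eval, the failed open would simply end the script with the same message.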

=head1 Differences in defaults between the two interfaces

The open_file() and open_string() methods take the same attribute flags as the new() method but the defaults differ in three key respects.

1. The binary attribute will default to true in the new interface, i.e. embedded newlines and binary data will be allowed by default. This can be overridden by setting binary=>0 in the open_file() or open_string() calls.

2. The escape_char attribute will default to being the same as the quote_char attribute unless the user specifically defines the escape_char attribute. That means that if the user specifies quote_char as ' (single-quote) rather than " (double-quote), the escape_char also becomes ' (single-quote). Likewise, if the user defines the quote_char as undef, the escape_char will also become undef unless the escape_char is specifically defined.

3. An additional attribute "columns" is supported. If it is set to "none", no column names will be defined and the hash-related methods will not work; all access must be by arrays. If columns is set to a comma-separated string of column names, those will become the column names for hash methods. If no columns attribute is defined, the columns will be taken from the first row of data in the file or string.
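
The three rules above could be applied roughly as follows. This is only a sketch: resolve_attr() is a hypothetical helper, and it assumes that "specifically defined" means "the key is present in the hashref", which the proposal does not spell out. The quote_char and sep_char starting values are the existing defaults of new().

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::CSV_XS): merge user attributes
# with the defaults proposed for open_file() and open_string().
sub resolve_attr {
    my ($user) = @_;
    my %attr = ( quote_char => '"', sep_char => ',', %{ $user || {} } );

    # 1. binary is on unless the caller set it.
    $attr{binary} = 1 unless exists $attr{binary};

    # 2. escape_char follows quote_char (even when that is undef)
    #    unless the caller supplied an escape_char key. (Assumption.)
    $attr{escape_char} = $attr{quote_char} unless exists $attr{escape_char};

    # 3. columns: absent means "read names from the first row" (undef here),
    #    'none' means no names (empty list), otherwise split on commas.
    my $columns = delete $attr{columns};
    my $names = !defined $columns  ? undef
              : $columns eq 'none' ? []
              :                      [ split /\s*,\s*/, $columns ];

    return (\%attr, $names);
}

my ($attr, $names) = resolve_attr(
    { quote_char => q{'}, sep_char => q{;}, columns => q{partid,partname} }
);
print "escape_char: $attr->{escape_char}\n";
print "columns: @$names\n";
```

The remaining %attr hash is what a wrapper would hand on to new(), since columns is not an attribute the old interface knows about.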

=head1 PREREQUISITES

There are no prerequisites for the old interface. The new interface requires IO::File for file operations (unless you pass a filehandle created with a different module) and IO::Scalar for string operations. Both are loaded with require rather than use, so if you don't call the methods that need them, they are not prerequisites.
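
The require-rather-than-use approach might look like the sketch below, shown for IO::File only. handle_for() is a hypothetical helper, not a method of the proposed interface; the point is that the module is loaded at run time and only on the code path that needs it.

```perl
use strict;
use warnings;

# Hypothetical helper: return a readable handle for a filename,
# or pass an existing handle straight through.
sub handle_for {
    my ($source) = @_;

    # A reference (glob ref, IO object) or a bare glob is already a handle.
    return $source if ref($source) or ref(\$source) eq 'GLOB';

    require IO::File;    # loaded here, at run time, only when needed
    my $fh = IO::File->new($source, '<')
        or die "Cannot open '$source': $!\n";
    return $fh;
}

# Passing an existing handle never touches IO::File.
my $in = handle_for(\*STDIN);
print "IO::File loaded: ", (exists $INC{'IO/File.pm'} ? "yes" : "no"), "\n";
```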
