Re: DBD::CSV and skip_first_line

2012-11-26 Thread H.Merijn Brand
On Mon, 26 Nov 2012 11:49:49 -0500, Scott R. Godin scot...@mhg2.com
wrote:

 
 On 11/25/2012 04:16 AM, Jens Rehsack wrote:
  On 25.11.12 10:00, H.Merijn Brand wrote:
  On Fri, 23 Nov 2012 17:43:50 -0500, Scott R. Godin scot...@mhg2.com
  wrote:
 
  I've run into an issue where I need both col_names set and
  skip_first_line still set to TRUE, because of malformed colnames in the
  original dumpfiles that conflict with SQL Reserved Words (such as
  'key')
  that I am unable to find any other acceptable workaround short of
 
  Why not automate the hacking using Text::CSV_XS and rewrite the header
  before using DBD::CSV?
 
  Or simply quote the column names in your SQL statement?
 
 I tried various quoting mechanisms up to and including attempting to use
 backticks, but all result in errors of one kind or another

Can you attach the first 4 lines of your csv datafile?

 $dbh-prepare(q{Select 'key', PHM_ID, DAW_CD, GENBRND_CD from clms limit
 10})
 results in every record having the literal value key for the column `key`
 same if I try select 'key' as PKEY
 
 if I switch to double-quotes rather than single quotes around key in
 the above, I get the following error:
 Execution ERROR: No such column 'key' called from clms_test.pl at 23.
 
 I'll look into playing with Text::CSV_XS, and see what I can come up with.
 
 I still think it would be easier if skip_first_line were not presumed
 (forced to) false if col_names is set, but rather presumed false only if
 not set explicitly true.

We agree, investigating what is actually required (and should be
documented)

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.17   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/


Re: DBD::CSV and skip_first_line

2012-11-26 Thread H.Merijn Brand
On Mon, 26 Nov 2012 12:29:25 -0500, Scott R. Godin scot...@mhg2.com
wrote:

 On 11/26/2012 11:56 AM, H.Merijn Brand wrote:
  On Mon, 26 Nov 2012 11:49:49 -0500, Scott R. Godin scot...@mhg2.com
  wrote:
 
  On 11/25/2012 04:16 AM, Jens Rehsack wrote:
  On 25.11.12 10:00, H.Merijn Brand wrote:
  On Fri, 23 Nov 2012 17:43:50 -0500, Scott R. Godin scot...@mhg2.com
  wrote:
  I've run into an issue where I need both col_names set and
  skip_first_line still set to TRUE, because of malformed colnames in the
  original dumpfiles that conflict with SQL Reserved Words (such as
  'key')
  that I am unable to find any other acceptable workaround short of
  Why not automate the hacking using Text::CSV_XS and rewrite the header
  before using DBD::CSV?
  Or simply quote the column names in your SQL statement?
  I tried various quoting mechanisms up to and including attempting to use
  backticks, but all result in errors of one kind or another
  Can you attach the first 4 lines of your csv datafile?
 Unfortunately, no, as I am under HIPAA restrictions.
 
 key consists of seemingly random alphanumeric [A-Z0-9] sequences that
 may or may not contain one dash (at about position 11), of 16-char length
 PHM_ID consists of P\d{7} and may repeat across records
  of the other two fields DAW_CD is numeric(1) and GENBRND_CD is boolean
 all records are pipe-delimited

Well, in that case, I'd use Text::CSV_XS to rewrite the data before
using DBD::CSV to use it. I do sorta the same on a regular basis, as my
customers mak typos in the headers and change them all the time.

In any case, key quoted/unquoted has no magical value at all. The
key-name is what you pass in in column_names or in the first line. That
the first line should be skippable when specifying your own column
names is already agreed on.

 The actual csv contains 44 columns; in the interest of brevity I limited
 the sample to the below four. :)
  $dbh-prepare(q{Select 'key', PHM_ID, DAW_CD, GENBRND_CD from clms limit
  10})
  results in every record having the literal value key for the column `key`
  same if I try select 'key' as PKEY

--8--- example of CSV rewrites written without checking from top of my head
my $csv_o = Text::CSV_XS-new ({
binary= 1,
auto_diag = 1,
eol   = \n,
});
my $csv_i = Text::CSV-new ({
binary= 1,
auto_diag = 1,
sep_char  = |, # | often requires allow_white_space = 1
});

my %hdr;
while (DATA) {
m/^#/ and next;
my ($fld, $text) = m/^(\S+)\s+(.*)/ or next;
$hdr{$text} = $fld;
}

my @hdr;
for (@{$csv_i-getline ($fh)}) {
if (exists $hdr{$_}) {
push @hdr, $hdr{$_};
next;
}
# special hardcoded fields
if (m/^[A-Za-z]\w{0,7}-?\w{1,12}$/) {
push @hdr, key; # This is a key?
next;
}
die I do not know how to translate '$_' to something useful;
}
my %rec;
$csv_i-bind_columns (\@rec{@hdr});
$csv_o-print ($out, \@hdr);
while ($csv_i-getline ($fh)) {
$csv_o-print (\@rec{@hdr});
}

__END__
# Wanted   Seen in their shit
c_foo  code foo
c_foo  kode foo
c_foo  kode-foo
c_foo  kode_foo
foodescription of foo
fooomschrijving foo
fooomschr. foo
--8---

-- 
H.Merijn Brand  http://tux.nl   Perl Monger  http://amsterdam.pm.org/
using perl5.00307 .. 5.17   porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/http://www.test-smoke.org/
http://qa.perl.org   http://www.goldmark.org/jeff/stupid-disclaimers/


Re: DBD::CSV and skip_first_line

2012-11-26 Thread Scott R. Godin
On 11/26/2012 11:56 AM, H.Merijn Brand wrote:
 On Mon, 26 Nov 2012 11:49:49 -0500, Scott R. Godin scot...@mhg2.com
 wrote:

 On 11/25/2012 04:16 AM, Jens Rehsack wrote:
 On 25.11.12 10:00, H.Merijn Brand wrote:
 On Fri, 23 Nov 2012 17:43:50 -0500, Scott R. Godin scot...@mhg2.com
 wrote:
 I've run into an issue where I need both col_names set and
 skip_first_line still set to TRUE, because of malformed colnames in the
 original dumpfiles that conflict with SQL Reserved Words (such as
 'key')
 that I am unable to find any other acceptable workaround short of
 Why not automate the hacking using Text::CSV_XS and rewrite the header
 before using DBD::CSV?
 Or simply quote the column names in your SQL statement?
 I tried various quoting mechanisms up to and including attempting to use
 backticks, but all result in errors of one kind or another
 Can you attach the first 4 lines of your csv datafile?
Unfortunately, no, as I am under HIPAA restrictions.

key consists of seemingly random alphanumeric [A-Z0-9] sequences that
may or may not contain one dash (at about position 11), of 16-char length
PHM_ID consists of P\d{7} and may repeat across records
 of the other two fields DAW_CD is numeric(1) and GENBRND_CD is boolean
all records are pipe-delimited

The actual csv contains 44 columns; in the interest of brevity I limited
the sample to the below four. :)
 $dbh-prepare(q{Select 'key', PHM_ID, DAW_CD, GENBRND_CD from clms limit
 10})
 results in every record having the literal value key for the column `key`
 same if I try select 'key' as PKEY

 if I switch to double-quotes rather than single quotes around key in
 the above, I get the following error:
 Execution ERROR: No such column 'key' called from clms_test.pl at 23.

 I'll look into playing with Text::CSV_XS, and see what I can come up with.

 I still think it would be easier if skip_first_line were not presumed
 (forced to) false if col_names is set, but rather presumed false only if
 not set explicitly true.
 We agree, investigating what is actually required (and should be
 documented)


-- 
Scott R. Godin, Senior Programmer
MAD House Graphics 
 302.468.6230 - main office
 302.722.5623 - home office



Re: DBD::CSV and skip_first_line

2012-11-26 Thread Scott R. Godin

On 11/26/2012 11:56 AM, H.Merijn Brand wrote:
 Can you attach the first 4 lines of your csv datafile?
Here is some randomized data that closely resembles the data in the csv
if this is any help in working with variations on

$dbh-prepare(q{Select key, PHM_ID, DAW_CD, GENBRND_CD from clms limit 10});

(bearing in mind the csv contains 44, not 4 columns and this is just a sample)


key|PHM_ID|DAW_CD|GENBRND_CD
667291120KNM4728|P1951532|2|0
858525298EEA3248|P8697017|5|0
286424010HTG2644|P8607393|3|1
344987842DYH2950|P8662248|3|0
225509049XEU3393|P1222508|1|0
061473729SFZ1183|P2785408|6|0
370501125YPF2594|P1534462|2|0
620354050CRF3119|P4438944|3|1
901228431AUF5822|P5315769|1|0
969358370QPO9757|P1523687|8|0
543692286WTA5861|P5993819|1|0
591327753QVR5452|P1013462|4|0
159204117LXL0308|P5358769|8|1
352853355KYT5615|P2810873|3|1
195099617GNE7056|P1306424|6|0


-- 
Scott R. Godin, Senior Programmer
MAD House Graphics 
 302.468.6230 - main office
 302.722.5623 - home office