>Writing 100000 records out to a file and then reading them back in
>is likely to take a lot of time
Depends on what you mean by a lot of time. It will take more time than
inserting 10 records, but definitelly shouldn't take 1 hour and 20 minutes.
I think there are bigger problems non DBI related which you correctly
pointed out.
Ilya Sterin
-----Original Message-----
From: Mitch Helle-Morrissey
To: 'Greg Wardawy'; [EMAIL PROTECTED]
Sent: 06/15/2001 11:10 AM
Subject: RE: How can I speed it up?
Here are a few ways you can speed it up:
1. Read the placeholder info from the other posts.
2. Writing 100000 records out to a file and then reading them back in
is
likely to take a lot of time. Is there a reason why the first two while
loops can't be just one while() loop? And do you really need to write
to
that file at all?
3. Look at bind_columns for your first while loop.
4. Optimize your regexes. For example, if all your $upc_num are in one
of
the two formats, and you just want to remove the spaces, do something
like
$upc_num =~ s/\s+//g;
or
$upc_num =~ tr[ ][]d;
Instead of:
($upc_num = $upc_num) =~
s/^(\d)\s(\d{5})\s(\d{5})\s(\d)|^(\d)(\d{5})(\d{5})(\d)\s{3}/$1$2$3$4/;
Always try to use the anchors ^ and $ if you can. Alternation with |
might
take time.
5. Use exists() on your $upc_itemlist{$upc} lookup instead of just
calling
$upc_itemlist{$upc}. The way you are doing it right now, it has to make
a
new element in the hash if it doesn't exist there.
6. Cut the fat. When you are looping 100k times, little things add up.
For example, you take time to increment $count, but you never use it
outside
the for loop. You do a "my @fields = split /\</;" both outside and
inside
the if statements. Get rid of the "no warnings" statement or at least
put
it outside the while loop.
7. This code doesn't make sense:
} elsif (/^IEA/) {
next unless (/^GS<PD<2017781300/);
You are saying "if it starts with 'IEA' then next unless it starts with
'GS...'". It has to start with only one of those....
Hope this helps and good luck.
Mitch
> -----Original Message-----
> From: Greg Wardawy [mailto:[EMAIL PROTECTED]]
> Sent: Friday, June 15, 2001 11:34 AM
> To: [EMAIL PROTECTED]
> Subject: How can I speed it up?
>
>
> Hello all,
>
> I'm just new to DBI, it's my first working DBI script so
> please don't laugh too hard at me. The script below works
> fine (AFAIK) but is slow - inserting 100000 records into the
> table takes ~1 hour 20 minutes. I think that's because I
> placed $sth = $dbh->prepare in the while loop inside the for
> block. I tried to place it outside the for block but some
> data was missing and some wasn't correct. How can I improve
> my script? How can I speed it up? Any advice, any comment and
> any information are greatly appreciated.
> TIA
> Greg
>
> Here goes my script:
>
> #########################
> #!perl
>
> use strict;
> #use Date::Format;
> use DBI;
> use diagnostics;
>
> my $dir = "G:\\apps\\quickedi\\ib";
> my $recv_file = "padrecv.1";
> my $elite_file = "item_upc.txt";
>
> print "Poczekaj chwile..............\n";
> open(ITEMUPC, "> $dir/$elite_file ") || die "can't
> create $dir/$elite_file: $!";
>
> my %upc_itemlist;
> my %item_upclist;
> my ($upc_num,$item_num);
>
> my $DSN = "Elite";
>
> my $dbh1 = DBI->connect ("dbi:ODBC:$DSN", "rpt_user", "",
> {RaiseError => 1, AutoCommit => 0});
>
> my $SQL = $dbh1->prepare("select upc, item_num from item
> where org_code = '01' and item_num = report_item_num and
> memo_1 <> 'TITLE' and upc is not null and upc <> 'NONE' order
> by upc");
> $SQL->execute;
> while ( my @row = $SQL->fetchrow_array) {
> my($upc_num,$item,$price) = @row;
> ($upc_num = $upc_num) =~
> s/^(\d)\s(\d{5})\s(\d{5})\s(\d)|^(\d)(\d{5})(\d{5})(\d)\s{3}/$
> 1$2$3$4/;
> ($item =$item) =~ s/\s+$//;
> print ITEMUPC "$upc_num,$item\n";
> }
>
> $dbh1->disconnect;
> close ITEMUPC;
>
> open(CHECK, "< $dir/$elite_file");
>
> while (<CHECK>) {
> chomp;
>
> ($upc_num, $item_num) = split /\s*,\s*/;
> $upc_itemlist{$upc_num} = $item_num;
>
> }
>
> close CHECK;
>
> my $dbh = DBI->connect( "dbi:ODBC:EDI Data", "", "" )or die
> "Can't connect to ODBC database: $DBI::errstr\n";
> open(RECV, "< $dir/$recv_file") || die "can't
> open $recv_file: $_!";
>
> my @names =
> qq(customer,upc_num,item_num,store_num,activity,quantity,begin
> ning_date,ending_date);
> my ($count,$customer,$pad_num,$upc,$xq02,$xq03,$za01);
>
> while (<RECV>) {
> no warnings 'uninitialized';
> s/QA/Current Inventory/;
> s/QS/Net Quantity Sold/;
> s/.$//;
>
> chomp;
> my @fields = split /\</;
>
> if (/^ISA<00< <00< <12<2017781300/) {
> $customer = "LNT";
> } elsif (/^GS<PD<2017781300/) {
> #ignore
> } elsif (/^ST<852/) {
> $pad_num = $fields[2];
> } elsif (/^XQ/) {
> my @fields = split /\</;
> ($xq02 = $fields[2]) =~
> s/(\d{4})(\d{2})(\d{2})/$2\/$3\/$1/; #Begining Date Range
> ($xq03 = $fields[3]) =~
> s/(\d{4})(\d{2})(\d{2})/$2\/$3\/$1/; #Ending Date Range
> } elsif (/^N9/) {
> #ignore
> } elsif (/^LIN/) {
> $upc = $fields[3]; #UPC number
> } elsif (/^ZA/) {
> $za01 = $fields[1]; #QA = Current Inventory
> Qty, QS = Net Qty Sold
> } elsif (/^SDQ/) {
> $upc_itemlist{$upc} =
> "NO_ELITE_ITEM_NUM_FOR_LNT_UPC" if length($upc_itemlist{$upc}) == 0;
> my @sdq = split/\</;
> for (my $i=3;$i<=21;$i+=2) {
> if (defined $pad_num) { # because other
> transactions also have a SDQ lines
> if (defined $sdq[$i]) { # some SDQ
> lines have less then 22 fields
> $count++;
> my $sth = $dbh->prepare("INSERT INTO
> edi_pad (@names) VALUES
> ('$customer','$upc','$upc_itemlist{$upc}','$sdq[$i]','$za01','
> $sdq[$i+1]','$xq02','$xq03')");
> print "Processing
> $count:\t$pad_num,$upc,$sdq[$i],$sdq[$i+1]\n";
> $sth->execute or die "can't
> execute SQL statement: $DBI::errstr\n";
>
> } else {
> #ignore
> }
> } else {
> #ignore
> }
> }
>
> } elsif (/^CTT/) {
> #ignore
> } elsif (/^SE/) {
> #ignore
> } elsif (/^GE/) {
> #ignore
> } elsif (/^IEA/) {
> next unless (/^GS<PD<2017781300/);
> } else {
> print "Unrecognized: ", substr ($_, 0, 10), "\n";
> }
> }
> close RECV;
> $dbh->disconnect;
>
> #######################
>
> and the part of the input file:
>
> ISA<00< <00< <12<2017781300
> <12<7084061572 <010611<0537<U<00401<000000588<0<P<:
> GS<PD<2017781300<7084061572<20010611<0537<588<X<004010VICS
> ST<852<5880001
> XQ<H<20010603<20010609
> N9<IA<1982
> LIN<<UP<009281020035
> ZA<QA
> SDQ<EA<92<129<5<183<25<187<9<191<9<194<7<206<6<288<16<302<9<30
> 5<4<320<30
> SDQ<EA<92<329<5<338<6<342<6<346<7<347<8<348<7<350<5<351<5<354<
> 17<355<6
> SDQ<EA<92<358<8<359<7<360<8<362<14<363<8<364<7<365<12<367<11<3
> 68<3<369<5
> SDQ<EA<92<370<7<371<6<372<6<373<11<374<9<375<17<376<10<377<7<3
> 78<-2<379<6
> ZA<QS
> SDQ<EA<92<129<2<187<1<206<1<371<1<387<1<391<1<398<1<400<1<403<1<425<1
> SDQ<EA<92<429<1<440<2<441<2<442<1<444<1<446<1<454<1<463<2<470<1<476<1
> SDQ<EA<92<478<2<479<1<480<1<483<1<487<1<494<1<498<1<505<1<512<2<514<1
> SDQ<EA<92<525<1<528<1<534<1<541<2<549<1<552<1<554<1<562<1<575<1<585<1
> SDQ<EA<92<593<1<598<1<616<1<620<2<625<1<627<1<633<2
> LIN<<UP<009281021315
> ZA<QA
> SDQ<EA<92<305<31<320<-3<338<2<346<31<348<95<354<31<355<-2<360<
> 37<373<102<374<9
> SDQ<EA<92<377<77<379<-6<381<49<382<2<386<54<390<89<392<84<399<
> 7<402<-5<405<6
> SDQ<EA<92<409<74<421<90<425<-1<431<2<437<-3<441<59<462<210<468
> <18<480<9<494<1
> SDQ<EA<92<495<17<502<-1<515<82<523<1<526<-1<527<101<536<9<552<
> 92<555<2<564<-1
> SDQ<EA<92<574<58<581<109<586<12<601<16<609<2<616<30<632<1<646<
> 5<654<-2<673<4
> ZA<QS
> SDQ<EA<92<346<1<360<1<381<1<399<1<436<1<437<1<495<1<515<2<564<1<574<1
> CTT<698
> SE<8943<5880001
> GE<1<588
> IEA<1<000000588
>
>