Here are a few ways you can speed it up:

1.  Read the placeholder info from the other posts.

2.  Writing 100000 records out to a file and then reading them back in is
likely to take a lot of time.  Is there a reason why the first two while
loops can't be just one while() loop?  And do you really need to write to
that file at all?

3.  Look at bind_columns for your first while loop.

4.  Optimize your regexes.  For example, if all your $upc_num are in one of
the two formats, and you just want to remove the spaces, do something like

        $upc_num =~ s/\s+//g;

or

        $upc_num =~ tr[ ][]d;

Instead of:

        ($upc_num = $upc_num) =~
s/^(\d)\s(\d{5})\s(\d{5})\s(\d)|^(\d)(\d{5})(\d{5})(\d)\s{3}/$1$2$3$4/;

Always try to use the anchors ^ and $ if you can.  Alternation with | might
take time.


5.  Use exists() on your $upc_itemlist{$upc} lookup instead of just calling
$upc_itemlist{$upc}.  The way you are doing it right now, it has to make a
new element in the hash if it doesn't exist there.

6.  Cut the fat.  When you are looping 100k times, little things add up.
For example, you take time to increment $count, but you never use it outside
the for loop.  You do a "my @fields = split /\</;" both outside and inside
the if statements.  Get rid of the "no warnings" statement or at least put
it outside the while loop.

7.  This code doesn't make sense:

} elsif (/^IEA/) {
                next unless (/^GS<PD<2017781300/);

You are saying "if it starts with 'IEA' then next unless it starts with
'GS...'".  It has to start with only one of those....

Hope this helps and good luck.

Mitch

> -----Original Message-----
> From: Greg Wardawy [mailto:[EMAIL PROTECTED]]
> Sent: Friday, June 15, 2001 11:34 AM
> To: [EMAIL PROTECTED]
> Subject: How can I speed it up?
>
>
> Hello all,
>
> I'm just new to DBI, it's my first working DBI script so
> please don't laugh too hard at me. The script below works
> fine (AFAIK) but is slow - inserting 100000 records into the
> table takes ~1 hour 20 minutes. I think that's because I
> placed $sth = $dbh->prepare in the while loop inside the for
> block. I tried to place it outside the for block but some
> data was missing and some wasn't correct.  How can I improve
> my script? How can I speed it up? Any advice, any comment and
> any information are greatly appreciated.
> TIA
> Greg
>
> Here goes my script:
>
> #########################
> #!perl
>
> use strict;
> #use Date::Format;
> use DBI;
> use diagnostics;
>
> my $dir = "G:\\apps\\quickedi\\ib";
> my $recv_file = "padrecv.1";
> my $elite_file = "item_upc.txt";
>
> print "Poczekaj chwile..............\n";
> open(ITEMUPC, "> $dir/$elite_file ")          || die "can't
> create $dir/$elite_file: $!";
>
> my %upc_itemlist;
> my %item_upclist;
> my ($upc_num,$item_num);
>
> my $DSN = "Elite";
>
> my $dbh1 = DBI->connect ("dbi:ODBC:$DSN", "rpt_user", "",
> {RaiseError => 1, AutoCommit => 0});
>
> my $SQL = $dbh1->prepare("select upc, item_num from item
> where org_code = '01' and item_num = report_item_num and
> memo_1 <> 'TITLE' and upc is not null and upc <> 'NONE' order
> by upc");
> $SQL->execute;
> while ( my @row = $SQL->fetchrow_array) {
>       my($upc_num,$item,$price) = @row;
>       ($upc_num = $upc_num) =~
> s/^(\d)\s(\d{5})\s(\d{5})\s(\d)|^(\d)(\d{5})(\d{5})(\d)\s{3}/$
> 1$2$3$4/;
>       ($item =$item) =~ s/\s+$//;
>       print ITEMUPC "$upc_num,$item\n";
> }
>
> $dbh1->disconnect;
> close ITEMUPC;
>
> open(CHECK, "< $dir/$elite_file");
>
> while (<CHECK>) {
> chomp;
>
>       ($upc_num, $item_num) = split /\s*,\s*/;
>       $upc_itemlist{$upc_num} = $item_num;
>
> }
>
> close CHECK;
>
> my $dbh = DBI->connect( "dbi:ODBC:EDI Data", "", "" )or die
> "Can't connect to ODBC database: $DBI::errstr\n";
> open(RECV, "< $dir/$recv_file")               || die "can't
> open $recv_file: $_!";
>
> my @names =
> qq(customer,upc_num,item_num,store_num,activity,quantity,begin
> ning_date,ending_date);
> my ($count,$customer,$pad_num,$upc,$xq02,$xq03,$za01);
>
> while (<RECV>) {
> no warnings 'uninitialized';
>               s/QA/Current Inventory/;
>               s/QS/Net Quantity Sold/;
>               s/.$//;
>
> chomp;
>  my @fields = split /\</;
>
>       if (/^ISA<00<          <00<          <12<2017781300/) {
>               $customer = "LNT";
>       } elsif (/^GS<PD<2017781300/) {
>               #ignore
>       } elsif (/^ST<852/) {
>               $pad_num = $fields[2];
>       } elsif (/^XQ/) {
>               my @fields = split /\</;
>               ($xq02 = $fields[2]) =~
> s/(\d{4})(\d{2})(\d{2})/$2\/$3\/$1/; #Begining Date Range
>               ($xq03 = $fields[3]) =~
> s/(\d{4})(\d{2})(\d{2})/$2\/$3\/$1/; #Ending Date Range
>       } elsif (/^N9/) {
>               #ignore
>       } elsif (/^LIN/) {
>               $upc = $fields[3];      #UPC number
>       } elsif (/^ZA/) {
>               $za01 = $fields[1]; #QA = Current Inventory
> Qty, QS = Net Qty Sold
>       } elsif (/^SDQ/) {
>               $upc_itemlist{$upc} =
> "NO_ELITE_ITEM_NUM_FOR_LNT_UPC" if length($upc_itemlist{$upc}) == 0;
>               my @sdq = split/\</;
>               for (my $i=3;$i<=21;$i+=2) {
>                       if (defined $pad_num) { # because other
> transactions also have a SDQ lines
>                       if (defined $sdq[$i]) { # some SDQ
> lines have less then 22 fields
>                       $count++;
>                       my $sth = $dbh->prepare("INSERT INTO
> edi_pad (@names) VALUES
> ('$customer','$upc','$upc_itemlist{$upc}','$sdq[$i]','$za01','
> $sdq[$i+1]','$xq02','$xq03')");
>                               print  "Processing
> $count:\t$pad_num,$upc,$sdq[$i],$sdq[$i+1]\n";
>                               $sth->execute or die "can't
> execute SQL statement: $DBI::errstr\n";
>
>                               } else {
>                                       #ignore
>                               }
>                               } else {
>                                       #ignore
>                               }
>       }
>
>       } elsif (/^CTT/) {
>               #ignore
>       } elsif (/^SE/) {
>               #ignore
>       } elsif (/^GE/) {
>               #ignore
>       } elsif (/^IEA/) {
>               next unless (/^GS<PD<2017781300/);
>       } else {
>               print "Unrecognized: ", substr ($_, 0, 10), "\n";
>       }
> }
> close RECV;
> $dbh->disconnect;
>
> #######################
>
> and the part of the input file:
>
> ISA<00<          <00<          <12<2017781300
> <12<7084061572     <010611<0537<U<00401<000000588<0<P<:
> GS<PD<2017781300<7084061572<20010611<0537<588<X<004010VICS
> ST<852<5880001
> XQ<H<20010603<20010609
> N9<IA<1982
> LIN<<UP<009281020035
> ZA<QA
> SDQ<EA<92<129<5<183<25<187<9<191<9<194<7<206<6<288<16<302<9<30
> 5<4<320<30
> SDQ<EA<92<329<5<338<6<342<6<346<7<347<8<348<7<350<5<351<5<354<
> 17<355<6
> SDQ<EA<92<358<8<359<7<360<8<362<14<363<8<364<7<365<12<367<11<3
> 68<3<369<5
> SDQ<EA<92<370<7<371<6<372<6<373<11<374<9<375<17<376<10<377<7<3
> 78<-2<379<6
> ZA<QS
> SDQ<EA<92<129<2<187<1<206<1<371<1<387<1<391<1<398<1<400<1<403<1<425<1
> SDQ<EA<92<429<1<440<2<441<2<442<1<444<1<446<1<454<1<463<2<470<1<476<1
> SDQ<EA<92<478<2<479<1<480<1<483<1<487<1<494<1<498<1<505<1<512<2<514<1
> SDQ<EA<92<525<1<528<1<534<1<541<2<549<1<552<1<554<1<562<1<575<1<585<1
> SDQ<EA<92<593<1<598<1<616<1<620<2<625<1<627<1<633<2
> LIN<<UP<009281021315
> ZA<QA
> SDQ<EA<92<305<31<320<-3<338<2<346<31<348<95<354<31<355<-2<360<
> 37<373<102<374<9
> SDQ<EA<92<377<77<379<-6<381<49<382<2<386<54<390<89<392<84<399<
> 7<402<-5<405<6
> SDQ<EA<92<409<74<421<90<425<-1<431<2<437<-3<441<59<462<210<468
> <18<480<9<494<1
> SDQ<EA<92<495<17<502<-1<515<82<523<1<526<-1<527<101<536<9<552<
> 92<555<2<564<-1
> SDQ<EA<92<574<58<581<109<586<12<601<16<609<2<616<30<632<1<646<
> 5<654<-2<673<4
> ZA<QS
> SDQ<EA<92<346<1<360<1<381<1<399<1<436<1<437<1<495<1<515<2<564<1<574<1
> CTT<698
> SE<8943<5880001
> GE<1<588
> IEA<1<000000588
>
>

Reply via email to