::massive sql query using dbi - please help::
Hello to all! This is my first time posting to this group, but I'm in desperate need of any help! (BTW, thanks for all the emails from contributors to this list... I learn a lot from you guys every day!)

I have two tables in a MySQL db, named users_old and users_new, both with UserId and Email columns, no primary keys and no auto-increment columns. The users_old table has numeric values for UserId, while users_new has NULL values. The users_old table has 120,000 rows and users_new has 910,000 rows. I'm trying to find a simple, painless way of querying these two tables so I can store the UserId and Email from the users_old table if the Email exists in both tables.

Everything I try just continues to run without ever producing any results - it just hangs at the command line when running the standalone query. Perhaps there are just too many rows to compare. I tried writing a simple script using the Perl DBI to just log the results of this massive query in a simple tab-delimited flat file, so I can load the data into the live database after it finishes... but no luck. Anybody have any suggestions on a better approach?

My simple code looks like this:

    #!/usr/local/bin/perl
    use DBI;
    use strict;

    my($dbh);
    my($sth);
    my($exclusive_lock);

    eval {
        $dbh = DBI->connect("DBI:mysql:dbname;host=localhost", "dbuser", "dbpassword", {'RaiseError' => 1});
    };
    if($@) {
        my($error) = "Error opening Database: $@\n";
        print "$error\n";
    }

    my $sth = $dbh->prepare("SELECT users_old.UserId, users_old.Email FROM users_new, users_old WHERE users_old.Email = users_new.Email");
    $sth->execute or die "Unable to execute query: $dbh->errstr\n";

    my($row);
    while($row = $sth->fetchrow_arrayref) {
        my($data_log) = "/home/chastie/sony_showbiz.txt";
        open (DATATEMP, $data_log);
        flock (DATATEMP, $exclusive_lock);
        print LOG "$row->[0]\t$row->[1]\n";
        close (LOG);
    }
    $sth->finish;
    $dbh->disconnect;
    exit;

christa hastie
programmer
sonypicturesdigitalentertainment
www.sonypictures.com
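A quick way to see why a query like the one above hangs is MySQL's EXPLAIN, run in the mysql client before touching Perl at all. A sketch, assuming only the table and column names given above:

```sql
-- EXPLAIN shows the join plan without executing the full query.
EXPLAIN
SELECT users_old.UserId, users_old.Email
FROM users_new, users_old
WHERE users_old.Email = users_new.Email;

-- If the "type" column reads ALL for both tables and "key" is NULL,
-- MySQL has no index to use and is comparing every row of users_old
-- against every row of users_new - on the order of
-- 120,000 x 910,000 comparisons - which is why nothing comes back.
```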
RE: ::massive sql query using dbi - please help::
Thanks Marcelo for all three of those points. I tried performing just the query from mysql, but it just hangs there also, even if I leave it for an hour! I've modified the code to stop opening and closing the file in the loop, and added the lock values, but it's still just running and running and still produces no results:

    #!/usr/local/bin/perl
    use DBI;
    use strict;

    my($dbh);
    my($sth);
    my($exclusive_lock) = 2;
    my($unlock_lock) = 8;

    eval {
        $dbh = DBI->connect("DBI:mysql:dbname;host=localhost", "dbuser", "dbpassword", {'RaiseError' => 1});
    };
    if($@) {
        print "Content-type: text/html\n\n";
        my($error) = "Error opening Database: $@\n";
        print "<html><body>$error</body></html>";
    }

    my $sth = $dbh->prepare("SELECT users_old.UserId, users_old.Email FROM users_new, users_old WHERE users_old.Email = users_new.Email");
    $sth->execute or die "Unable to execute query: $dbh->errstr\n";

    my($row);
    my($data_log) = "/home/chastie/sony_showbiz.txt";
    open (DATATEMP, $data_log);
    flock (DATATEMP, $exclusive_lock);
    while($row = $sth->fetchrow_arrayref) {
        print LOG "$row->[0]\t$row->[1]\n" or die "Error: $!\n";
    }
    flock (DATATEMP, $unlock_lock);
    close (LOG);
    $sth->finish;
    $dbh->disconnect;
    exit;

-----Original Message-----
From: Marcelo Guelfi [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 02, 2001 11:34 AM
To: [EMAIL PROTECTED]
Subject: Re: ::massive sql query using dbi - please help::

1) Did you try to perform that query from mysql instead of a perl program, to measure the time?

2) Is this code correct?

    open (DATATEMP, $data_log);
    flock (DATATEMP, $exclusive_lock);
    print LOG "$row->[0]\t$row->[1]\n";
    close (LOG);

You never opened the LOG descriptor, so you can't write to it. Add a warn or die call to capture the error:

    print LOG "$row->[0]\t$row->[1]\n" or die "Error: $!\n";

3) Why are you opening, locking and closing the file inside the loop?

Saludos,
Marcelo.
Re: ::massive sql query using dbi - please help::
If the query just hangs there in mysql, putting it into DBI is not going to fix that. Are the columns indexed in the tables? If not, you've probably got it doing nested full table scans. Try adding indexes to the tables on the join columns.

Get the query to run in mysql before you ever try it in perl again; otherwise, if there is anything wrong with your perl code, you'll never know it.

jeff
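The indexing fix can be sketched directly in SQL. The index names below are made up for illustration; this also assumes Email is a plain string column such as VARCHAR (a TEXT or BLOB column would need a prefix length):

```sql
-- Index the join columns so each Email lookup becomes an index probe
-- instead of a scan of the whole other table.
CREATE INDEX idx_users_old_email ON users_old (Email);
CREATE INDEX idx_users_new_email ON users_new (Email);

-- If Email were TEXT/BLOB, MySQL would require a prefix length, e.g.:
-- CREATE INDEX idx_users_new_email ON users_new (Email(64));
```

Building the indexes on tables this size takes a while up front, but afterwards the join touches only matching rows instead of every pair.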
RE: ::massive sql query using dbi - please help::
> The users_old table has 120,000 rows and the users_new has 910,000 rows.

If you have no indexes, I'm not at all surprised it takes that long, or even longer, to get results from a join in MySQL on two tables with this many rows. The join must be completed before results are returned, and that is a long, processor-intensive process without indexes.

Can e-mail addresses be a primary key on either table? If they can, they should be. If not, the Email column should at least be indexed.

The only suggestion I have beyond the indexes is: don't use legacy join syntax. That won't speed up the execution, but it's just a good habit to have for when you want to do more complex queries. Use this type of join syntax:

    SELECT users_old.UserId, users_old.Email
    FROM users_new
    INNER JOIN users_old ON users_old.Email = users_new.Email

Aliasing will then save you a bit of typing, but that's secondary - indexing your tables is what you need to improve the performance in this case.

Steve H.
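Once the join columns are indexed, the tab-delimited flat file the original post asks for can also be produced by the server itself, with no Perl filehandles at all. A sketch, with a hypothetical output path; this assumes the MySQL server process may write to that directory and that the file does not already exist (INTO OUTFILE refuses to overwrite):

```sql
-- Write matching (UserId, Email) pairs straight to a tab-delimited
-- file on the server host. Tab fields and newline lines are the
-- defaults; they are spelled out here for clarity.
SELECT users_old.UserId, users_old.Email
FROM users_new
INNER JOIN users_old ON users_old.Email = users_new.Email
INTO OUTFILE '/tmp/matched_users.txt'
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
```

The resulting file can then be bulk-loaded into the live database with LOAD DATA INFILE, which is far faster than row-at-a-time inserts from a script.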