My name is Konrad Lorincz, a graduate student in computer science at Harvard University. I am doing a class project in which I am comparing the performance of queries between MySQL, grep, and FlatSQL (an SQL like language that we implemented using awk and perl, that performs queries on flat files). I am writing because I discovered a strange behavior in MySQL when performing joins involving three columns over non-indexed data.
When the number of tuples is lager than a certain size, it take a very long time (at first I thought it entered an infinite loop) for operation to complete. Our implementation FlatSQL on the other time completes in a reasonable time. I wonder if MySQL does something funny like try to build an index over the non-indexed data that it's causing it to be so slow. Please let me know if you have any ideas of what's causing this behavior and if there is a way around it. Below, are some additional details Here is a sample of the date. It's a collection of NFS traces from the file system. The second column (time) is indexed in MySQL. Each line is of the form: Op | Time | XID | ClientID | FH | Offset | Size write 1003809600.416227 29caeba9 8cf7216b.03ce 1 229376 8192 write 1003809600.417535 2bcaeba9 8cf7216b.03ce 1 237568 1142 read 1003809601.072451 22897eaa 8cf7216c.03e8 2 0 8192 and the query is SELECT a.XID, a.ClientID, b.Offset, c.Size FROM 2_10__3_4 AS a, 2_10__3_6 AS b, 2_10__3_7 AS c WHERE a.XID=b.XID AND a.XID=c.XID; where 2_10__3_4 is a table with 2^10=1024 tuples (lines) of data having columns 3 and 4 *** MySQL timing *** NbrLinesInTable | Time to complete Joins (sec) 2_06 lines 0:00.08 sec 2_07 lines 0:00.12 sec 2_08 lines 0:00.36 sec 2_09 lines 2:26.41 sec 2_10 lines 20:16.59 sec *** FlatSQL *** 2_06 lines 0:01.18 sec 2_07 lines 0:02.20 sec 2_08 lines 0:04.28 sec 2_09 lines 0:09.15 sec 2_10 lines 0:21.94 sec As we can see, something strange happens when the table is larger than 2^8 lines (tuples) the time to complete jumped from 0.36sec to 2min 26 sec and then to 20min 16sec. For FlatSQL, we have a steady increase. I created the tables in MySQL using the following my $TABLE_TO_DESC="CREATE TABLE ${TABLE_TO} ( \ Op VARCHAR(10) NOT NULL default '', \ Time DECIMAL(15, 5), \ XID VARCHAR(10) default NULL, \ ClientID VARCHAR(15) default NULL, \ FH INT UNSIGNED, \ Offset INT UNSIGNED, \ Size INT UNSIGNED, \ KEY Time (Time) \ ) TYPE=MyISAM"; MySQL version mysql Ver 11.18 Distrib 3.23.52, for pc-linux-gnu (i686) I appreciate your help, -Konrad ************************************************************ ** Konrad Lorincz ** graduate student at ** Harvard University ** ** Tel: (617) 493-4956 ** E-Mail: [EMAIL PROTECTED] ** URL: http://www.eecs.harvard.edu/~konrad ** ** ** "For here we are not afraid to follow truth ** wherever it may lead, nor to tolerate any error ** so long as reason is left free to combat it." ** - Thomas Jefferson ** ** ... to infinity and beyond ... ************************************************************ --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php