Hmmm, verrry interesting on big summation

2005-02-11 Thread Brad Eacker
Harrison,
 Taking your suggestion and building a combined key of member_id
and pts_awarded the query took 17 mins

create table pts_sumC_snap
select member_id, count(1) count, sum(pts_awarded) points
from pts_awarded_snap 
group by member_id;
Query OK, 12488780 rows affected (16 min 50.21 sec)
Records: 12488780  Duplicates: 0  Warnings: 0

Building the combined index took 1 hr 12 mins for the total creation
time of approximately 1.5 hours.

Without any kind of index on the pts_awarded_snap table the
query took 7 hours to build a similar summation table.

When I built the index on member_id, the query took 31 hours to complete
utilizing the index that took more time to build.

This data set holds 776723372 rows.

Bottom line, there appears to quite a difference between how fast indicies
can be read and processed than how long it takes to process the index
and data combination.
 Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Any means to get the optimizer out of the way?

2005-02-10 Thread Brad Eacker
Folks,
 I have a 677M row table with index

desc pts_awarded_snap;
+-+-+--+-+-+---+
| Field   | Type| Null | Key | Default | Extra |
+-+-+--+-+-+---+
| member_id   | int(11) |  | MUL | 0   |   |
| PTS_AWARDED | int(11) |  | | 0   |   |
+-+-+--+-+-+---+
2 rows in set (0.00 sec)

select count(*) from pts_awarded_snap;
+---+
| count(*)  |
+---+
| 776723372 |
+---+
1 row in set (0.00 sec)

Which you can see has an index on member_id.

The problem is when I try to create a temporary table from a
portion of this based upon the member_id, the optimizer decides
that a table scan is necessary on this 7GB table.

When I try:
  create temporary table T_awards_snap
select member_id, pts_awarded
from pts_awarded_snap FORCE INDEX (member_id)
where member_id between 71 and 80;

the optimizer decides to do a full table scan to choose the
required rows.

Yet a counting query  of the same space:

  select count(*) from pts_awarded_snap
where member_id between 71 and 80;
+--+
| count(*) |
+--+
|  3957677 |
+--+
1 row in set (5.84 sec)

works fine.

  explain select member_id, pts_awarded
from pts_awarded_snap FORCE INDEX (member_id)
where member_id between 71 and 80;
+--+---+---+---+-+--+--
---+-+
| table| type  | possible_keys | key   | key_len | ref  | rows 
   | Extra   |
+--+---+---+---+-+--+--
---+-+
| pts_awarded_snap | range | member_id | member_id |   4 | NULL | 
3885524 | Using where |
+--+---+---+---+-+--+--
---+-+
1 row in set (0.00 sec)

seems to indicate the member_id index can be used.  But the length of time to
do this query, and the amount blocks being read as indicated by vmstat, tell
me that a full table scan is being performed.

 I've looked over the optimization elements of the MySQL manual, trying

set max_seeks_for_key=100;

even 'FORCE INDEX (member_id)' has been tried as shown in the query above
to no avail, a table scan is still done.

 Bottom line question out of all this:  Is there a way to convince the
optimizer that it is fine to use a range portion of the index to fulfill
this query?

  Please let me know if there is something I've missed,
  Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: simple but frustrating query

2004-10-14 Thread Brad Eacker
Jeff Mathis writes:
now, leaving off name from the group by clause
(select max(close_date), symbol, name from TD
where symbol in ('aa','bb','cc') group by symbol;)

gives

+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | drugs   |
| 2004-10-05  | cc | infosvcs|
+-++-+

Jeff,
 What version of MySQL are you using?  I ran your query on
4.0.18 and got a different answer...

mysql select max(close_date), symbol, name from TD
- where symbol in ('aa','bb','cc') group by symbol;
+-++-+
| max(close_date) | symbol | name|
+-++-+
| 2004-10-05  | aa | cmptrhw |
| 2004-10-05  | bb | biotech |
| 2004-10-05  | cc | biotech |
+-++-+
3 rows in set (0.01 sec)

Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: InnoDB TableSpace Question

2004-08-04 Thread Brad Eacker
David Griffiths writes:
Oracle cannot shrink datafiles (same idea as InnoDB datafiles) when data 
is deleted either.

 Actually, Oracle has been able to resize data files since 7.2.  It
is usually done with an 'alter tablespace ... coalesce' followed by an
'alter tablespace datafile ... resize nM' command.  But the resize will
choke on a datafile with active extents in that datafile.
Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: sql bench problems

2004-04-13 Thread Brad Eacker
Yonah Russ writes:
here is a sample output line:

Time for alter_table_drop (91): 19 wallclock secs ( 0.02 usr  0.00 sys +  
0.00 cusr  0.00 csys =  0.02 CPU)


here is the regexp:

/^(estimated |)time (to|for) ([^\s:]*)\s*\((\d*)(:\d*)*\)[^:]*:\s*([\d.]+) 
.*secs \(\s*([^\s]*) usr\s*\+*\s*([^\s]*) sys.*=\s+([\d.]*)\s+cpu/i

Yonah,
 It appears that the source has been modified from the original regexp.
Primary changes are:
time - Time
cpu - CPU

You may want to run the output thru
tr '[A-Z]' '[a-z]'
to change upper case to lower case to allow the regexp to work properly.

Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Best practice on table design

2004-04-13 Thread Brad Eacker
Ciprian Trofin writes:
Basically I have some tables with only 2 fields (ID and name), and a
central table, joined by a one-to-many relation. The key point here are the
2-field tables. If I keep them separate, I can extend them (add new fields)
without problem when need arise. But if there is no need for an extension
(my case), all I get is a greater number of tables that I have to take care
of. Wouldn't be better (maybe more efficient ?) to put all the 2-field
tables in only table, with a separate ENUM field to separate the records on
categories ?

Ciprian,
 There are two main purposes for normalization in this case.  The first
is to provide consistency of data.  Going back to your example, placing the
city name in each record allows the possibility of multiple spellings for
the city name, since each record has its own copy of the data.  The second
is space savings, since storing an int is usually 4 bytes at worst while a
city name is definitely more than 4 bytes.  Yes it does generate a second
table that only has the mappings from cityID to cityName, but you will likely
find it well worth the effort to use the mapping.
   Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: newbie: increment an existing record

2004-03-19 Thread Brad Eacker
Jim [EMAIL PROTECTED] writes:
 It's likely he would prefer to update a particular record via

 update t1 set f = f + 1 where id = 1234;

 Which will update only the record matching the id of 1234.

That is indeed what I ended up doing, and it worked. Thanks.

Jim,
 It's good to hear that you were able to accomplish your
desired results.  Now if I could just figure out how to get a
job based upon the knowledge that I've built up over the last
decade of working with various RDBMS systems, coupled with over
9 years of web/back-end integration, and 13 years of Unix kernel
internal development experience.

 You may want to take a look at Teach Yourself SQL in 21 Days
by Stephens and Plew, Sams Publishing ISBN 0-672-31674-9 for the
copy that I have in front of me.  It does a good job of explaining
these kinds of questions.
 Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Newbie question

2004-02-23 Thread Brad Eacker
Donny Simonton [EMAIL PROTECTED] writes:
But as a manager, the first thing I do is ask for code examples, as well as
table structures.  I don't even need to see a resume once I look at
somebody's code and table structures.

 Unfortunately I will have to disagree with you on this one.  Having
been searching for solid employment for the last year, I'm finding that the
managers seek only to place someone who has done the exact job they are
trying to fill.  And this is with 22 years of relevant experience on my
part.  8 years of web-back integration, 13 years inside the unix kernel,
coupled with 9 years of RDBMS experience don't seem to hold much weight
anymore.

 You mention examples, I have a pair I use having fully developed both
sets

http://www.misc.com/costumes/Dickens_0312/index.php
http://66.134.203.2/~beacker/genbank/gbf_extract.cgi?LOCUS=AF158101

The first being a photographic PHP based mechanism that I use in at least
a dozen pages without change.  The second being a 30 million record data
set I utilize for selecting GenBank info.  I'm finding it very frustrating
that my knowledge and capabilities are not found to be compelling.  I have
utilized C for over 21 years, and perl for 20 years.  Yet these don't
appear to be enough to get a position in Silicon Valley.

   Brad Eacker ([EMAIL PROTECTED])



-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]