Re: float comparison?

2005-10-31 Thread Scott Gifford
Michael Stassen [EMAIL PROTECTED] writes:

 Scott Gifford wrote:

[...]

Try something like this:

   select * from price where amount > 3.44 and amount <= 3.46

 I hope the <= was a typo?

Yes.

Scott.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: float comparison?

2005-10-29 Thread Scott Gifford
Jerry Swanson [EMAIL PROTECTED] writes:

 Why, when I do a select from the database and try to compare a float
 field with a float number, do I get no results?

 amount --- type float

 select * from price where amount = 3.45 // doesn't work
 select * from price where amount = '3.45' //doesn't work

Floating-point numbers are very rarely exactly equal, because of the
small rounding errors that happen all the time.  The actual number in
the database is probably something like '3.44001'.

Try something like this:

   select * from price where amount > 3.44 and amount <= 3.46
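For what it's worth, the effect is easy to reproduce outside MySQL.
Here's a small Python sketch (stdlib sqlite3 as a stand-in; MySQL's
FLOAT is a 4-byte type, so the rounding is simulated with struct; the
table and values are invented):

```python
import sqlite3
import struct

def to_float32(x):
    # Round-trip through a 4-byte IEEE float, like a MySQL FLOAT column.
    return struct.unpack('f', struct.pack('f', x))[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price (item TEXT, amount REAL)")
# What actually gets stored is 3.4500000476837158..., not 3.45.
conn.execute("INSERT INTO price VALUES ('widget', ?)", (to_float32(3.45),))

# Exact comparison fails: the literal 3.45 is a different (double) value.
exact = conn.execute("SELECT item FROM price WHERE amount = 3.45").fetchall()
# A range comparison finds the row.
ranged = conn.execute(
    "SELECT item FROM price WHERE amount > 3.44 AND amount < 3.46").fetchall()
print(exact)   # []
print(ranged)  # [('widget',)]
```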

Scott.




Re: Downgrade privileges on connect

2005-10-12 Thread Scott Gifford
[EMAIL PROTECTED] writes:

 Is it possible for a MySQL connection to request a downgrade in
 privileges?  What I'd like to be able to do is create one database
 user account for a database application with read and write
 privileges to the tables, but if an application user logs in who
 only has read access, then to downgrade the permission's on the SQL
 server for the database user to select only for that connection.

I don't know of a way to do exactly what you describe, but you can get
the same effects by creating two accounts, one with only SELECT
privileges and one with whatever write privileges you need.  Then
your application can simply decide which account to authenticate with.
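A rough sketch of what that looks like at the application layer
(Python; the account names and privilege split here are invented for
illustration, not from the original thread):

```python
# Hypothetical credentials for the two database accounts described above:
# one SELECT-only account and one with write privileges.
ACCOUNTS = {
    "read":  {"user": "app_ro", "password": "..."},  # SELECT only
    "write": {"user": "app_rw", "password": "..."},  # SELECT + writes
}

def credentials_for(user_can_write):
    """Pick which database account to authenticate with, based on the
    privileges of the application user who logged in."""
    return ACCOUNTS["write"] if user_can_write else ACCOUNTS["read"]

print(credentials_for(False)["user"])  # app_ro
print(credentials_for(True)["user"])   # app_rw
```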

 I want to be able to give an added level of assurance that the user
 will not be able to write to the database even if the application
 controls are circumvented.

That sounds like an excellent design decision.

Scott.




Re: Downgrade privileges on connect

2005-10-12 Thread Scott Gifford
[EMAIL PROTECTED] writes:

[...]

 Your application will still need access to the data it gets from
 MySQL so changing your MySQL permissions doesn't make any sense,
 does it. It's your application that needs to say no to the
 user. You don't want MySQL saying no to your application. Do you?

Having multiple layers of security is generally a good design (often
called "security in depth" or "defense in depth").  That way, if
there's a flaw in your application, the damage is limited.  If you
think that's unlikely to happen, then you're not paying attention:

http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=sql+injection
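One concrete extra layer is parameterized queries.  A minimal sketch
(Python with stdlib sqlite3 as a stand-in; the table and the attacker
input are invented) of why string-built SQL is the problem:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

malicious = "alice' OR '1'='1"

# String interpolation: the input becomes part of the SQL itself,
# so the OR clause matches every row in the table.
rows_bad = conn.execute(
    "SELECT * FROM users WHERE name = '%s'" % malicious).fetchall()

# Placeholder binding: the input stays data, and matches nothing.
rows_ok = conn.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)).fetchall()

print(len(rows_bad), len(rows_ok))  # 1 0
```

Even with parameterized queries everywhere, the restricted database
account limits the damage if one call site is missed.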

ScottG.




Re: Possible Bug? Left Join With OR Clause Takes Minutes

2005-10-08 Thread Scott Gifford
Scott Klarenbach [EMAIL PROTECTED] writes:

 I'm using Mysql 5.0.7 and I've noticed the following very strange
 functionality, perhaps someone can shed some light on it for me.

Try using the EXPLAIN statement to get some insight into what MySQL
is thinking.

Scott.




Re: Result row count *without* buffering all results?

2005-09-18 Thread Scott Gifford
Ken Gieselman [EMAIL PROTECTED] writes:

[...]

 Yeah, that's my fall-back option -- though the thought of executing
 the query twice is a bit daunting.  Some of the tables run into
 billions of rows per year (the merge tables anyhow, the data tables
 are broken down by month to keep them from becoming totally
 unmanageable), and a multi-year search can take a while to grind out.

There are some optimizations an SQL server can do if it knows it's
only retrieving a count, and not the rows, so it's often not as
computationally expensive as executing the full query twice.

 Seems to me that by the time the first query execution is done, the
 server should *know* exactly how many rows are in the result set --
 just need to find the magic trick that allows the client to query that
 number, rather than counting for itself as it buffers them.

I think in many cases it actually doesn't know; it keeps spitting back
rows until it finds it doesn't have any more.  Especially if the query
is complicated enough to require complete records to be inspected, the
server isn't going to keep inspecting records once it's found enough
data to satisfy the current query.

I also think that the COUNT(*) will end up pulling many things into
cache, so the actual results query will be somewhat faster than it
would otherwise be.

Perhaps somebody who knows MySQL internals better than I do can say
for sure whether these are true.

ScottG.




Re: Result row count *without* buffering all results?

2005-09-17 Thread Scott Gifford
Ken Gieselman [EMAIL PROTECTED] writes:

[...]

 So, here's my question: How can I find out how many rows are being
 returned by a query, without using mysql_stmt_store_result() to
 buffer all the rows back to the client?

One straightforward way is to replace the SELECT clause with just
SELECT COUNT(*) and execute that query first.
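A minimal sketch of that two-query pattern (Python with stdlib sqlite3
standing in for the MySQL C API; the table and WHERE clause are
invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, val INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?)",
                 [(i, i % 10) for i in range(1000)])

where = "val = 3"
# First query: just the count -- no result rows are buffered client-side.
(n,) = conn.execute("SELECT COUNT(*) FROM data WHERE " + where).fetchone()
# Second query: stream the actual rows one at a time.
cur = conn.execute("SELECT id, val FROM data WHERE " + where)
first = cur.fetchone()
print(n, first)  # 100 (3, 3)
```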

-ScottG.




ORDER BY distance from a point

2005-08-24 Thread Scott Gifford
Hello,

I'd like to sort my query results based on their distance from a given
point.  The actual data I have will be in (longitude,latitude) format,
but I can convert to something else if that will work better.

For example, I may have data like this

Item           Latitude  Longitude
-------------  --------  ---------
Scott's House  37.4419   -122.1419
Tom's House37.4519   -122.2419
Mary's House   37.4619   -122.3419
Sally's House  37.4719   -122.4419

and I'd like to see these rows sorted by distance from (38,-121).

My actual data has many more columns (about 30) and rows (about
25,000), and joins in a few other tables.  Most queries will have a
LIMIT clause with 10 results, possibly starting a few hundred rows in
(LIMIT 240,10).  Currently all searches take much less than 1 second,
and I'd like to keep it that way.

Is there a way to have MySQL do this query efficiently?  I know how to
do the calculations, but MySQL has to calculate the Great Circle
distance to this point for each row in the table, which is slow for
25,000 rows.

I tried using the GIS functions, but in the version of MySQL I have
(4.1.7 on Debian Linux) the Distance() function isn't implemented, so
that doesn't help much.  I'm willing to look at upgrading MySQL if
that will help.

My database friends tell me I want to use an RTREE index, but I
haven't yet found a version of MySQL that implements those except
with the GIS functions.
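For reference, one common workaround is to filter with an
index-friendly bounding box first, then sort only the survivors by
great-circle distance.  A sketch in Python (stdlib sqlite3 as a
stand-in for MySQL; the haversine formula and the 3959-mile earth
radius are standard, and the box bounds are invented):

```python
import math
import sqlite3

def great_circle_miles(lat1, lon1, lat2, lon2):
    # Haversine distance in miles; nothing MySQL-specific here.
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 3959 * 2 * math.asin(math.sqrt(a))

conn = sqlite3.connect(":memory:")
conn.create_function("gc_miles", 4, great_circle_miles)
conn.execute("CREATE TABLE homes (item TEXT, lat REAL, lon REAL)")
conn.executemany("INSERT INTO homes VALUES (?, ?, ?)", [
    ("Scott's House", 37.4419, -122.1419),
    ("Tom's House",   37.4519, -122.2419),
    ("Mary's House",  37.4619, -122.3419),
    ("Sally's House", 37.4719, -122.4419),
])

# Cheap bounding box first (this part can use indexes on lat/lon),
# then the expensive great-circle sort only on the survivors.
rows = conn.execute("""
    SELECT item FROM homes
     WHERE lat BETWEEN 37 AND 39 AND lon BETWEEN -123 AND -121
     ORDER BY gc_miles(lat, lon, 38.0, -121.0)
     LIMIT 10""").fetchall()
print([r[0] for r in rows])
```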

Thanks for any advice, help, or hints!

ScottG.




Re: Native XML Support

2005-08-15 Thread Scott Gifford
Scott Klarenbach [EMAIL PROTECTED] writes:

 Does MySQL 5 provide native XML support?  ie, can I have a stored
 procedure return an XML string instead of a recordset?  Can I pass in
 an XML string/doc and have the DB update relational tables based on
 it?

MyXML is supposed to help with this sort of thing:

http://tangent.org/index.pl?lastnode_id=478&node_id=388

though I haven't used it.

As far as I know, MySQL doesn't support XML as a native datatype, and
doesn't support indexing XML documents directly; you have to treat
them as BLOBs or TEXT fields, and extract and index the various fields
yourself.

Probably what you want to do is implement this at the application
layer: convert an XML document into one or more SQL queries, and
convert SQL rows into XML documents.  The other option is using a
database other than MySQL which has native XML support.
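A sketch of that application-layer mapping (Python stdlib only; the
element and column names are invented for illustration):

```python
import sqlite3
import xml.etree.ElementTree as ET

# XML in -> INSERT statements.
doc = ET.fromstring(
    "<items><item><name>widget</name><qty>3</qty></item></items>")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")
for item in doc.findall("item"):
    conn.execute("INSERT INTO items VALUES (?, ?)",
                 (item.findtext("name"), int(item.findtext("qty"))))

# SQL rows out -> XML document.
out = ET.Element("items")
for name, qty in conn.execute("SELECT name, qty FROM items"):
    item = ET.SubElement(out, "item")
    ET.SubElement(item, "name").text = name
    ET.SubElement(item, "qty").text = str(qty)
xml_out = ET.tostring(out, encoding="unicode")
print(xml_out)
```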

ScottG.





Re: unix timestamp

2005-08-15 Thread Scott Gifford
Sebastian [EMAIL PROTECTED] writes:

 i have this query:

 SELECT COUNT(*) AS score FROM downloads WHERE date_add(dateline,
 interval 1 hour) >= now() GROUP BY filename ORDER BY score DESC

 unfortunately for other reasons i had to change `dateline` to unix
 timestamp so this query is no longer able to run as intended. can
 anyone help with a work around?

Sure, 1 hour is just 3600 seconds, which is how Unix timestamps are
measured:

SELECT COUNT(*) AS score FROM downloads 
WHERE dateline + 3600 >= UNIX_TIMESTAMP() 
GROUP BY filename ORDER BY score DESC
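A quick sanity check of that rewrite (Python with stdlib sqlite3
standing in for MySQL; the filenames and timestamps are invented, and a
fixed "now" is used so the result is reproducible):

```python
import sqlite3

now = 1_000_000_000  # a fixed "current" unix timestamp for the example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downloads (filename TEXT, dateline INTEGER)")
conn.executemany("INSERT INTO downloads VALUES (?, ?)", [
    ("a.zip", now - 1800),  # 30 minutes ago -> counted
    ("a.zip", now - 300),   #  5 minutes ago -> counted
    ("a.zip", now - 7200),  #  2 hours ago   -> not counted
    ("b.zip", now - 600),   # 10 minutes ago -> counted
])

rows = conn.execute("""
    SELECT filename, COUNT(*) AS score FROM downloads
     WHERE dateline + 3600 >= ?
     GROUP BY filename ORDER BY score DESC""", (now,)).fetchall()
print(rows)  # [('a.zip', 2), ('b.zip', 1)]
```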

ScottG.




Re: unix timestamp

2005-08-15 Thread Scott Gifford
Keith Ivey [EMAIL PROTECTED] writes:

 Scott Gifford wrote:

 SELECT COUNT(*) AS score FROM downloads WHERE dateline +
 3600 >= UNIX_TIMESTAMP() GROUP BY filename ORDER BY score DESC

 It would be better with

   WHERE dateline >= UNIX_TIMESTAMP() - 3600

 so that it can use an index on dateline.

You're right; I have too much faith in the optimizer.  :-)

Scott.




Re: True randominess

2005-08-04 Thread Scott Gifford
Pat Adams [EMAIL PROTECTED] writes:

 On Thu, 2005-08-04 at 14:44 -0500, 2wsxdr5 wrote:
 There are also several places that you can get a reasonably random 
 number for the seed from your machine.  The amount of free disk space, 
 unless that doesn't change much on your machine.  The amount of free 
 RAM, (up time mod cpu usage).  Any number of things could be used that 
 are not very predictable, if at all.

 But again, those aren't truly random. They're random enough for the
 average web application. The original poster, if memory serves, asked
 if it was possible to get true random numbers from MySQL. True random
 numbers can't be predicted even if I know everything about your system.
 Because computers are predictable beasts, the random number generators
 they use are constrained by the hardware limits.

/dev/random is a source of some genuine entropy on many Unix-like
operating systems.  It uses variations in system timings that are
believed to be truly random.  It's not good for a large volume of
output, but it's a good seed.  You could probably incorporate access
to it or its friend /dev/urandom as a UDF:

http://sunsite.mff.cuni.cz/MIRRORS/ftp.mysql.com/doc/en/Adding_UDF.html

EGD (Entropy Gathering Daemon) is an option for other Unix-like
systems:

http://egd.sourceforge.net/

or you can use a Lava Lamp:

http://www.lavarnd.org/index.html

I'm sure Windows has some way to do this, too.

Many systems also have an onboard random number generator which you
should be able to access through an OS driver.
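A minimal sketch of seeding from the kernel entropy pool (Python;
os.urandom reads from the OS entropy source, /dev/urandom on Linux, so
this is the "good seed, not bulk output" idea from above):

```python
import os
import random

# A small amount of kernel entropy as a seed...
seed_bytes = os.urandom(8)
seed = int.from_bytes(seed_bytes, "big")

# ...feeding a fast pseudo-random generator for the bulk output.
rng = random.Random(seed)
print(len(seed_bytes), 0 <= rng.random() < 1.0)  # 8 True
```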

ScottG.




Re: Calculate LONG/LAT from ZIP+4

2005-06-27 Thread Scott Gifford
Brian Dunning [EMAIL PROTECTED] writes:

 Dude, that's more than 5 years old.

Yup, but it's worked fine for every application I've used it for.  ZIP
codes don't tend to get moved around very often, and when they do it's
not very far...

ScottG.

[...]

 As I said below, you can download a free database from the US Census
 Bureau.  Specifically, the data here:

 http://www.census.gov/tiger/tms/gazetteer/zcta5.txt

 contains ZIP codes and lat/lon information.  I've not used that data,
 but I've used this file from 1999 with good results:

 http://www.census.gov/geo/www/tiger/zip1999.html

[...]





Re: Calculate LONG/LAT from ZIP+4

2005-06-26 Thread Scott Gifford
Jack Lauman [EMAIL PROTECTED] writes:

 A couple of months ago these was a discussion about ZIP code plotting
 on this list.  Does anyone know how to calculate LONG/LAT coordinates
 from ZIP+4?

In the U.S., you can use the census data:

http://www.census.gov/geo/www/gazetteer/places2k.html

ScottG.




Re: Calculate LONG/LAT from ZIP+4

2005-06-26 Thread Scott Gifford
Brian Dunning [EMAIL PROTECTED] writes:

 You have to purchase a database. ZIP codes are not geographic, you
 can't calculate lat/lon from them. Here is one source:
 http://www.zipwise.com

As I said below, you can download a free database from the US Census
Bureau.  Specifically, the data here:

http://www.census.gov/tiger/tms/gazetteer/zcta5.txt

contains ZIP codes and lat/lon information.  I've not used that data,
but I've used this file from 1999 with good results:

http://www.census.gov/geo/www/tiger/zip1999.html

Apparently that will no longer be updated, though, so the ZCTA data is
probably better.

ScottG.

[...]

 On Jun 26, 2005, at 6:43 PM, Scott Gifford wrote:

[...]

 In the U.S., you can use the census data:

 http://www.census.gov/geo/www/gazetteer/places2k.html




Re: Slow query: optimizer ignores index, using filesort

2005-06-14 Thread Scott Gifford
Jigal van Hemert [EMAIL PROTECTED] writes:

 From: Scott Gifford

[...]

 Right, ALL would be a great plan if it weren't for the LIMIT 1.

 The LIMIT 1 will be performed *after* the recordset is sorted :-(

Ah, I think that is the piece I was missing.

[...]

 I'm a little surprised MySQL can't figure out on its own that it can
 get the same effect by using mls_num from either table, since the
 tables are joined on it, so the values will always be identical.

 You have two LEFT JOINs, so the values of mls_num might be something (the
 identical value you refer to) or NULL. So, you expect MySQL to evaluate the
 JOIN conditions, figure out that the ORDER BY column actually is the same as
 columns from other tables and see if one of these columns is actually the
 first non-const table in the execution path?

Of course with the LEFT JOINs it can be difficult to figure out
(though in this case the JOINs are on mls_num, so it will never be
NULL), but when two tables are joined with a regular join on the column
in question, it does seem like a straightforward optimization.  Maybe
it's not common enough in practice to justify implementing it (or
maybe I'm missing something).

[...]

 which is the same as what I get when I force a STRAIGHT_JOIN.
 So with two plans that will use the same number of records, I'm
 surprised MySQL doesn't choose the one that will allow it to use an
 index.
 There is no limiting condition whatsoever in the WHERE clause (except the
 JOIN condition), so it will try to estimate based on the cardinality and the
 number of records in the table which execution path will require it to
 evaluate the smallest number of records. It will try to estimate whether the
 use of an index is appropriate or not. But I can imagine that it will favour
 a full table scan of a small table that will result in a small recordset,
 especially when it knows that the other, bigger tables are JOINed with only
 1 or a few records per join.

In this case, the two tables have identical cardinality, so it
shouldn't be favoring one over the other.

[...]

 This is part of a routine that generates many different queries based on a
 wide variety of search forms, predefined selections, etc. Most of the time
 MySQL does a great job with finding the 'right' execution path, but
 sometimes it goes horribly wrong.

That's been my experience too; we've been running this application for
about 5 years, and this is only the second time something like this
has come up.

[...]

 I tried that using mirealsource_homes.mls_num in the WHERE clause and
 it didn't make a difference.

 I tried to say that you could/should try to limit the number of records that
 are selected in the first place. 
[...]
 So, can you think of a limitation that would preselect a part of the
 mirealsource_homes.mls_num table?

Unfortunately, not in the general case.

 Are there any other global things I can try to tell MySQL to avoid
 table scans?  The queries almost always use at most LIMIT 10, so a
 table scan is generally not the right idea.

 A full table scan is the right idea as long as you select more than 30% of
 the table, sort it and after that apply the limit 10 (which is the only
 solution for such a query).

Right, but our queries always use a LIMIT 10 or LIMIT 1, so a full
table scan is never a good deal if an index is available.  I was
hoping for some way to communicate that to MySQL.

I think I'm going to take a look at the MySQL source and see if
there's anything I can tweak to get the effect I want.  I'll report
back my results.

Thanks for all your help!

ScottG.




Re: discuss: user management

2005-06-14 Thread Scott Gifford
Danny Stolle [EMAIL PROTECTED] writes:

[...]

 3. Create multiple user IDs for each role played by each user
 (dannys_arch as an architect, dannys_dev as a developer).

An interesting argument in favor of the much-maligned option 3 is
that it allows users to select which of their privileges they want for
a particular task.  For example, even if you have developer privileges
which allow you to delete tables, you may want to log in as
dannys_search if you're testing a program or using a GUI frontend,
to avoid accidentally damaging anything.

I often use a technique like this when doing Web development; I'll
create a cust_select for use in scripts that should only be reading
the data, and a cust_update for scripts that should be updating.
That way bugs in one script don't allow it to do too much damage.

This is in general good security practice, much like only logging in
as root when you need to.

If you choose this route, I think you could manage it by maintaining
your own tables with usernames and roles, then generating the actual
MySQL user table.  Your frontends could handle automatically appending
the appropriate role.  Even if you decide on a more traditional RBAC
system, this still might be a useful technique.
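A sketch of generating the per-role accounts from your own role table
(Python; the role names, privilege map, and database name here are
invented for illustration, not from the thread):

```python
# Map each role to the privileges its generated account should have.
ROLE_PRIVS = {
    "search": "SELECT",
    "dev":    "SELECT, INSERT, UPDATE, DELETE",
    "arch":   "ALL PRIVILEGES",
}

def grants_for(username, roles, db="appdb"):
    """Generate one GRANT statement per (user, role) pair, producing
    accounts like 'dannys_search' and 'dannys_dev'."""
    return ["GRANT %s ON %s.* TO '%s_%s'@'localhost'"
            % (ROLE_PRIVS[r], db, username, r) for r in roles]

for stmt in grants_for("dannys", ["search", "dev"]):
    print(stmt)
```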

Good luck,

ScottG.





Re: Slow query: optimizer ignores index, using filesort

2005-06-14 Thread Scott Gifford
Scott Gifford [EMAIL PROTECTED] writes:

[...]

 I think I'm going to take a look at the MySQL source and see if
 there's anything I can tweak to get the effect I want.  I'll report
 back my results.

The MySQL source looked a bit too complex for casual hacking, but
here's what I ended up doing.  When I'm JOINing a table to the primary
table, if there are no columns in the table which are used in the
WHERE or ORDER BY clause, then I do a LEFT JOIN instead of a regular
JOIN.  That seems to stop MySQL from choosing bad plans in the cases
that are common for my application, since in most cases WHERE and
ORDER BY use only columns from the primary table, so the table with
the ORDER BY will end up first, so the index can be used.
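The heuristic described above can be sketched as a tiny query-builder
helper (Python; the table and column names are invented):

```python
def join_clause(table, join_col, used_columns, primary="homes"):
    """Use a plain JOIN only when WHERE/ORDER BY actually reference the
    secondary table's columns; otherwise use LEFT JOIN, which keeps the
    primary table first in the execution plan."""
    uses_table = any(c.startswith(table + ".") for c in used_columns)
    style = "JOIN" if uses_table else "LEFT JOIN"
    return "%s %s ON %s.%s = %s.%s" % (
        style, table, primary, join_col, table, join_col)

# ORDER BY/WHERE touch only the primary table -> LEFT JOIN
print(join_clause("images", "mls_num", ["homes.price"]))
# WHERE filters on images.num_images -> regular JOIN
print(join_clause("images", "mls_num", ["images.num_images"]))
```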

---ScottG.




Slow query: optimizer ignores index, using filesort

2005-06-13 Thread Scott Gifford
I'm having a hard time figuring out why a query in my application is
slow.  It seems that MySQL is using a filesort where it's not
necessary, and as a result a query that should be taking a fraction of
a second is taking up to 10 seconds.

Essentially, the query is doing a join of 4 tables, two of which may
not have a matching row (and so must use an explicit LEFT JOIN).  All
tables have a common mls_num primary key, which is the column used
for the join.  The results are sorted by mls_num, and only the first
row is requested.

I've whittled the query down to just a few lines that reproduce the
problem:

mysql> explain 
  SELECT mirealsource_homes.mls_num, 
 mirealsource_homes_supplemental.listdate,
 mirealsource_images.image1,
 mirealsource_homes_stats.detail_views
FROM mirealsource_homes, 
 mirealsource_homes_supplemental 
 LEFT JOIN mirealsource_images 
   ON mirealsource_homes.mls_num =
  mirealsource_images.mls_num
 LEFT JOIN mirealsource_homes_stats
   ON mirealsource_homes.mls_num =
  mirealsource_homes_stats.mls_num
   WHERE 
 mirealsource_homes.mls_num =
  mirealsource_homes_supplemental.mls_num
   ORDER BY mirealsource_homes.mls_num
   LIMIT 1;

+----+-------------+---------------------------------+--------+---------------+---------+---------+-----------------------------------------------+------+---------------------------------+
| id | select_type | table                           | type   | possible_keys | key     | key_len | ref                                           | rows | Extra                           |
+----+-------------+---------------------------------+--------+---------------+---------+---------+-----------------------------------------------+------+---------------------------------+
|  1 | SIMPLE      | mirealsource_homes_supplemental | ALL    | PRIMARY       | NULL    |    NULL | NULL                                          |  100 | Using temporary; Using filesort |
|  1 | SIMPLE      | mirealsource_homes              | eq_ref | PRIMARY       | PRIMARY |       8 | devel.mirealsource_homes_supplemental.mls_num |    1 |                                 |
|  1 | SIMPLE      | mirealsource_images             | eq_ref | PRIMARY       | PRIMARY |       8 | devel.mirealsource_homes.mls_num              |    1 |                                 |
|  1 | SIMPLE      | mirealsource_homes_stats        | eq_ref | PRIMARY       | PRIMARY |       8 | devel.mirealsource_homes.mls_num              |    1 |                                 |
+----+-------------+---------------------------------+--------+---------------+---------+---------+-----------------------------------------------+------+---------------------------------+

All tables are indexed by primary key (as can be seen from the
execution plan).  The column is declared as CHAR(8), though it is
VARCHAR(8) in some tables that have other VARCHAR columns.  Tables are
MyISAM, and the server version is 4.1.7-standard.

Apparently MySQL's optimizer sees that it can use the primary key for
mirealsource_homes_supplemental to do the query, but for some reason
decides not to.

I had a similar problem a few months ago, and it was solved when I
upgraded from a very old 3.x version of MySQL to this one.

I can fix the problem by requiring a STRAIGHT_JOIN, or by sorting by
mirealsource_homes_supplemental.mls_num instead.  However, this
query is part of a larger framework that handles a wide variety of
queries, so I need to understand why this is happening instead of
tweaking individual cases.

Any idea why this might be happening, and how to fix it?

Thanks!

ScottG.




Re: Slow query: optimizer ignores index, using filesort

2005-06-13 Thread Scott Gifford
Thanks for your response, Jigal.  More below...

Jigal van Hemert [EMAIL PROTECTED] writes:

 From: Scott Gifford

[...]

 Apparently MySQL's optimizer sees that it can use the primary key for
 mirealsource_homes_supplemental to do the query, but for some reason
 decides not to.

 This is often the case when the query will probably return more than 30% of
 the records in that table. In such cases it is more efficient to do a full
 table scan (which is indicated here by the 'ALL' type in the explain
 output).

Right, ALL would be a great plan if it weren't for the LIMIT 1.  

 I can fix the problem by requiring a STRAIGHT_JOIN, or by sorting by
 mirealsource_homes_supplemental.mls_num instead.  However, this
 query is part of a larger framework that handles a wide variety of
 queries, so I need to understand why this is happening instead of
 tweaking individual cases.

 Furthermore MySQL can only use an index for sorting if all columns in the
 ORDER BY clause are from the first table in the explain output that doesn't
 have a 'const' join type. This is why setting the ORDER BY to
 mirealsource_homes_supplemental.mls_num will remove the 'Using filesort' and
 result in faster sorting.

I'm a little surprised MySQL can't figure out on its own that it can
get the same effect by using mls_num from either table, since the
tables are joined on it, so the values will always be identical.

 MySQL tries to optimize queries by (among others) guestimating which path
 will result in the smallest number of records. It appears that this path
 with 100 * 8 * 8 * 8 (51200) records is the minimum size it can achieve.

It looks to me like it's actually 100 * 1 * 1 * 1 = 100 (8 is the key
length), which is the same as what I get when I force a STRAIGHT_JOIN.
So with two plans that will use the same number of records, I'm
surprised MySQL doesn't choose the one that will allow it to use an
index.

 You can use USE INDEX to (strongly) suggest the use of an index to MySQL and
 see if it speeds up the query. Using  FORCE INDEX will tell MySQL that a
 full table scan is very expensive, so this will make it extremely unlikely
 that it will not use the index.

I know there are a couple of tricks like that to fix this one query.
What makes this hard is that that query is one of many that can be
generated by a CGI-to-SQL search gateway.  If I put in a FORCE INDEX
(mls_num) and the user searches by price, for example, query
performance will be very bad.  In order to know the right indexes to
force, as far as I can tell I'd have to implement my own optimizer,
which seems somewhat excessive.

[...]

 In this query you want the data where mls_num is as small as possible. Is
 there a way you can limit the number of records by using an extra where
 condition? This way you may change the order of the tables and make the
 query faster.

I tried that using mirealsource_homes.mls_num in the WHERE clause and
it didn't make a difference.

 I would also move the current WHERE condition to an ON condition in the FROM
 part as it is not meant to limit the selected records, but as a definition
 on how to join the two tables.

I tried that too, and it didn't make a difference (though I agree it
is clearer).

I also tried installing a copy of 4.1.12-standard to see if the
situation was better, but the results were identical.

And, I tried setting max_seeks_for_key to a low number (I tried 100,
10, 1, and 0), which is supposed to discourage tables scans, and that
didn't help.

Are there any other global things I can try to tell MySQL to avoid
table scans?  The queries almost always use at most LIMIT 10, so a
table scan is generally not the right idea.

Scott.





Re: LEFT JOIN changes order of results

2005-06-03 Thread Scott Gifford
[EMAIL PROTECTED] writes:

 You have a sort because you did an order by.
 If you had an index with the desired order by, it may be used.
 Try as you usage of covering indexes.

Hi Mathias,

Along with your suggestion, a little more thinking about the problem
and some experimenting seems to have led to a good solution.  There
are only a few columns that are commonly sorted by, so I'll create a
multi-column index for each of those columns, with mls_num as the
second column.  Then I'll make the sort on mls_num be in the same
direction (ASC or DESC) as the primary sort.  According to some quick
experiments with EXPLAIN, query plans using this scheme seem to be
comparable to plans without the additional mls_num sort.
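A small illustration of that tie-breaker scheme (Python with stdlib
sqlite3 standing in for MySQL; the data is invented, and SQLite's
planner differs from MySQL's, so this only demonstrates the
deterministic ordering, not the MySQL execution plan):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homes (mls_num TEXT PRIMARY KEY, price INTEGER)")
conn.executemany("INSERT INTO homes VALUES (?, ?)",
                 [("A1", 100), ("B2", 100), ("C3", 200)])
# Multi-column index: the sort column first, the primary key as tie-breaker.
conn.execute("CREATE INDEX idx_price_mls ON homes (price, mls_num)")

# Sorting mls_num in the same direction as the primary sort lets a
# single index scan satisfy the whole ORDER BY.
rows = conn.execute(
    "SELECT mls_num FROM homes ORDER BY price DESC, mls_num DESC").fetchall()
print([r[0] for r in rows])  # ['C3', 'B2', 'A1']
```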

Thanks!

ScottG.




Re: LEFT JOIN changes order of results

2005-06-02 Thread Scott Gifford
Johan Höök [EMAIL PROTECTED] writes:

 Hi Scott,
 I think you've been lucky so far...
 As you're only ordering on listdate, which is the same
 for both homes in your example you might definitely get different
 results once you put in the limit.
 A basic thing about rdb's is that you must never make assumptions
 that it returns resultsets in the same order unless you specify what
 to order by, of course quite often you'll get it back in the same order
 but you must never bank on it.

Hi Johan,

I guess I have been lucky.

[...]

 I guess you somehow have to include the mls_num in your second query
 to ensure that you get same resultset.

I'm looking into adding mls_num into all queries to fix this problem,
but it looks likely to make performance much worse.  Here's MySQL's
plan for a typical query:

mysql> EXPLAIN SELECT * 
         FROM faar_homes 
        WHERE zip = 48503 
        ORDER BY price DESC 
        LIMIT 10 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: faar_homes
         type: index
possible_keys: zip
          key: price
      key_len: 4
          ref: NULL
         rows: 5194
        Extra: Using where
1 row in set (0.00 sec)

When I add in mls_num, it uses a filesort:

mysql> EXPLAIN SELECT * 
         FROM faar_homes 
        WHERE zip = 48503 
        ORDER BY price DESC, mls_num
        LIMIT 10 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: faar_homes
         type: ALL
possible_keys: zip
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5194
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

It seems that this fix will cause nearly all of my queries to use
filesort.

Any ideas for avoiding this?

Thanks!

--ScottG.




Re: LEFT JOIN changes order of results

2005-06-02 Thread Scott Gifford
[EMAIL PROTECTED] writes:

 hi,
 mls_num is not in a key, have you tried index creation on (zip,price
 desc,mls_num) ?

Hi mathias,

mls_num is the primary key, so it does have its own index.

I could create a multi-column index covering (zip,price,mls_num), but
that was really just one example of many searches; there are about 10
fields that are commonly used for searches, and about 4 that are
commonly sorted by, so creating all of those indexes would require 40
indexes, and that's if the searches only use one field.

ScottG.




LEFT JOIN changes order of results

2005-05-31 Thread Scott Gifford
Hello,

I have an application that does searches against a database of homes.
A summary of initial search results is displayed by showing a few
columns of information about each home.  When the user clicks on one
of the listings, it retrieves additional information (some from other
tables) and displays more detailed information about the house.

The summary listings are generated using a normal MySQL query.  The
detailed views are implemented by specifying which result row to
display using a LIMIT clause.  For example, if the user clicks on the
3rd listing on a page, the query will use this LIMIT clause:

LIMIT 2,1

We do this instead of specifying a value for the primary key so we can
have a Next and Previous button that will move between detailed
listings.  These result rows may pull information in from other tables
for display.  Sometimes the homes are sorted according to a particular
column, and sometimes they aren't.

Obviously this whole scheme depends on the homes staying in the same
order between the summary queries and the detail queries, even if the
ordering is ambiguous.

We've had this running for several years, and it's always worked
fine.  We're now seeing some problems with it, possibly because of a
move from a server running MySQL 3.x to one running 4.x.

The problem we're seeing is that when additional tables are pulled in
for the detailed view, the order is different from the summary view,
so the wrong homes are displayed.  Here's a simplified example.  A
summary query might ask:

mysql> SELECT lutar_homes.mls_num, lutar_images.num_images,
  lutar_homes_supplemental.address,
  lutar_homes.listdate
   FROM lutar_homes, lutar_homes_supplemental
   LEFT JOIN lutar_images ON lutar_homes.mls_num = lutar_images.mls_num
   WHERE ((lutar_homes.listdate >= (NOW() - INTERVAL '14 00:00:00' DAY_SECOND)))
 AND lutar_homes.mls_num = lutar_homes_supplemental.mls_num
   ORDER BY lutar_homes.listdate DESC
   LIMIT 1;

+---------+------------+-------------------------+------------+
| mls_num | num_images | address                 | listdate   |
+---------+------------+-------------------------+------------+
| 051768  |          1 | 7540 Country Pride Lane | 2005-05-31 |
+---------+------------+-------------------------+------------+
1 row in set (0.00 sec)

When I add one more LEFT JOIN clause (the second one below) to get
additional fields for the detailed view, I get a different first home,
even though none of the search parameters have changed, and the table
hasn't changed:

mysql> SELECT lutar_homes.mls_num, lutar_images.num_images,
  lutar_homes_supplemental.address,
  lutar_homes.listdate
   FROM lutar_homes, lutar_homes_supplemental
   LEFT JOIN lutar_images ON lutar_homes.mls_num = lutar_images.mls_num
   LEFT JOIN lutar_homes_stats ON lutar_homes.mls_num = lutar_homes_stats.mls_num
   WHERE ((lutar_homes.listdate >= (NOW() - INTERVAL '14 00:00:00' DAY_SECOND)))
 AND lutar_homes.mls_num = lutar_homes_supplemental.mls_num
   ORDER BY lutar_homes.listdate DESC
   LIMIT 1;

+---------+------------+------------------+------------+
| mls_num | num_images | address          | listdate   |
+---------+------------+------------------+------------+
| 051770  |          9 | 9149 Frankenmuth | 2005-05-31 |
+---------+------------+------------------+------------+
1 row in set (0.02 sec)

This change in ordering screws up my system, since if the user clicked
on the first result in the summary view, the detailed view may display
a completely different home.

So my question is: Is this normal behavior for MySQL, or is it a bug?
Any suggestions for dealing with it?
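One workaround I'm experimenting with is to make the ORDER BY fully deterministic by appending the primary key as a tiebreaker, so rows with equal listdate values always come back in the same order no matter which tables get joined in. A quick sketch of the idea (SQLite is used here only because it runs inline; the sample mls_num and listdate values are made up to mirror the tables above):

```python
import sqlite3

# A tiny stand-in for the homes table, with duplicate listdate values
# so that "ORDER BY listdate" alone is ambiguous.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homes (mls_num TEXT PRIMARY KEY, listdate TEXT)")
conn.executemany(
    "INSERT INTO homes VALUES (?, ?)",
    [("051768", "2005-05-31"), ("051770", "2005-05-31"), ("051701", "2005-05-30")],
)

# Appending the primary key as a tiebreaker pins down the order of the
# equal-listdate rows, so summary and detail queries always agree.
rows = conn.execute(
    "SELECT mls_num FROM homes ORDER BY listdate DESC, mls_num ASC"
).fetchall()
```

With the tiebreaker in place, the LIMIT-based detail query can no longer drift away from the summary query's ordering.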

Thanks!  Please let me know if any of this is confusing, and I'll
clarify.

---ScottG.




Re: Get a Random Row on a HUGE db

2005-04-27 Thread Scott Gifford
[EMAIL PROTECTED] writes:

[...]

 So what I am trying is this.

 $last_row =SELECT from firebase_content LAST_INSERT_ID();
 $last_row_query = $dbi-query($last_row);
 $last_row_result = $row-id;

LAST_INSERT_ID() only works if you just inserted an element; it's
maintained per-connection.  See:

http://dev.mysql.com/doc/mysql/en/getting-unique-id.html

If your items are numbered sequentially, try just using:

   SELECT COUNT(*) FROM firebase_content;

to get the count.  That's very fast; it comes from the table summary
information, IIRC.  I use a similar solution for a similar problem,
and have had great luck with it.

To deal better with deleted items, you could periodically renumber
your articles to avoid gaps in numbering.
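A rough sketch of the two-step fetch, with SQLite standing in for MySQL and the table layout guessed from your snippet (column names are assumptions):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE firebase_content (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO firebase_content (body) VALUES (?)",
                 [("article %d" % i,) for i in range(100)])

def random_row(conn):
    # Step 1: cheap row count (on MyISAM this comes from table metadata).
    (n,) = conn.execute("SELECT COUNT(*) FROM firebase_content").fetchone()
    # Step 2: pick a uniform offset and fetch exactly one row.
    offset = random.randrange(n)
    return conn.execute(
        "SELECT id, body FROM firebase_content LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()

row = random_row(conn)
```

Note the OFFSET trick scans up to the offset, so it slows down a bit for very large offsets, but it avoids ORDER BY RAND() over the whole table.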

It would be great if MySQL had an optimization for this case.

ScottG.




Re: question about chunking sql uploads

2005-04-25 Thread Scott Gifford
[EMAIL PROTECTED] writes:

 Frank Bax [EMAIL PROTECTED] wrote on 04/25/2005 11:47:12 AM:

 At 10:44 AM 4/25/05, Art.M (Wikki) wrote:
 I have a large .sql file to upload which is about 9 mb and I was
 wondering if anyone knew of a program that could break it up into
 chunks of 2 mb or under? So I can upload it to a shared web server.
 
 
 You can't upload a 9M file to webserver?  But you can upload 9x1M files? 

 Fix your webserver.  You didn't google for this did you?  You also
 didn't mention OS, so we'll assume unix based.

This is probably more common than you would expect.  I've run into it
when using shared hosting providers; they have timeouts and resource
limits on programs like phpMyAdmin, and so can't handle big chunks of
data.  It's not very easy to call up a large shared hosting provider
and tell them "fix your server", especially if you are a small
customer.

That said, I don't have a great solution.  I often re-chunk these
files by hand, which isn't that bad if you only need to break it into
5-10 chunks.  It would be straightforward to write a Perl script to do
this.

 You could try compressing file with zip, gzip, etc.

This often helps, too, if the program you're importing data with
supports it.

 Or simply use split, default is 1000 lines, but can be changed via 
 command line.

That's a start, but the files each need to be a valid SQL statement,
so that's not enough; a little fixup needs to happen at the beginning
and end of each file.
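If it helps, here is roughly what that fixup looks like as a script (Python here rather than Perl; it naively treats any line ending in ';' as a statement boundary, which is usually fine for dump files but would break on semicolons inside quoted strings):

```python
def chunk_sql(lines, max_bytes):
    """Group complete SQL statements into chunks of at most max_bytes.

    A statement is assumed to end on a line whose stripped text ends in
    ';' -- a simplification that ignores semicolons in string literals.
    A single statement larger than max_bytes is kept whole.
    """
    chunks, current, size = [], [], 0
    statement = []
    for line in lines:
        statement.append(line)
        if line.rstrip().endswith(";"):
            stmt_size = sum(len(l) for l in statement)
            if current and size + stmt_size > max_bytes:
                chunks.append("".join(current))
                current, size = [], 0
            current.extend(statement)
            size += stmt_size
            statement = []
    if statement:          # trailing partial statement: keep it whole
        current.extend(statement)
    if current:
        chunks.append("".join(current))
    return chunks

sql = ["INSERT INTO t VALUES (1);\n", "INSERT INTO t VALUES (2);\n",
       "INSERT INTO t\n", "  VALUES (3);\n"]
chunks = chunk_sql(sql, max_bytes=30)
```

Each chunk then imports cleanly on its own, since no statement is ever split across two files.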

---ScottG.




Slow query: mysql not using correct indexes?

2005-04-21 Thread Scott Gifford
Hello,

I'm having a problem with query running very slowly.  I run similar
queries on other tables all the time that perform as expected, and
this query used to run fine until I removed an explicit LEFT JOIN and
let the optimizer decide in what order to join two of the tables.
That fixed some other performance problems I was having, but seems to
have introduced this new one.

I'm using Ver 11.16 Distrib 3.23.49, for pc-linux-gnu (i686) on an
older copy of RedHat Linux 7.3 (kernel 2.4.25, libc6 2.2.5).

I've simplified the problem as much as I can and still reproduce it;
the actual tables I'm interested in are much larger, with many more
columns and rows.

The basic problem seems to be when I do a query sorting by price and
joining these three tables together, mysql resorts to "Using
temporary; Using filesort":

mysql> EXPLAIN SELECT test_homes.price, 
  test_homes.mls_num,
  test_homes_supplemental.bathrooms,
  test_homes_stats.detail_views
 FROM test_homes, 
  test_homes_supplemental 
LEFT JOIN test_homes_stats 
  ON test_homes.mls_num = test_homes_stats.mls_num 
WHERE test_homes.mls_num = test_homes_supplemental.mls_num 
 ORDER BY test_homes.price
LIMIT 10;


+-------------------------+--------+---------------+---------+---------+---------------------------------+------+---------------------------------+
| table                   | type   | possible_keys | key     | key_len | ref                             | rows | Extra                           |
+-------------------------+--------+---------------+---------+---------+---------------------------------+------+---------------------------------+
| test_homes_supplemental | ALL    | PRIMARY       | NULL    |    NULL | NULL                            |  100 | Using temporary; Using filesort |
| test_homes              | eq_ref | PRIMARY       | PRIMARY |       9 | test_homes_supplemental.mls_num |    1 |                                 |
| test_homes_stats        | eq_ref | PRIMARY       | PRIMARY |       9 | test_homes.mls_num              |    1 |                                 |
+-------------------------+--------+---------------+---------+---------+---------------------------------+------+---------------------------------+

That's very slow for 22,000 rows.  I don't know why it's doing this,
since the column I'm sorting by is indexed; it seems like it should
get the data from the price index on test_homes, then use eq_ref to
join in test_homes_supplemental and test_homes_stats.  Here are the
table definitions:

CREATE TABLE `test_homes` (
  `mls_num` char(9) NOT NULL default '',
  `price` mediumint(8) unsigned default NULL,
PRIMARY KEY  (`mls_num`),
  KEY `price` (`price`));

CREATE TABLE `test_homes_supplemental` (
  `mls_num` char(9) NOT NULL default '',
  `bathrooms` tinyint(3) unsigned default NULL,
PRIMARY KEY  (`mls_num`));

 CREATE TABLE `test_homes_stats` (
  `mls_num` char(9) NOT NULL default '',
  `detail_views` int(11) NOT NULL default '0',
PRIMARY KEY  (`mls_num`));

Both test_homes and test_homes_supplemental contain one row for each
item; test_homes_stats contain zero or one rows for each item.

If I do explicit LEFT JOINs to tell MySQL what order to join in, I get
the results I expect:

mysql> EXPLAIN SELECT test_homes.price,
  test_homes.mls_num,
  test_homes_supplemental.bathrooms,
  test_homes_stats.detail_views
 FROM test_homes
LEFT JOIN test_homes_supplemental 
  ON test_homes.mls_num = 
test_homes_supplemental.mls_num
LEFT JOIN test_homes_stats 
  ON test_homes.mls_num = test_homes_stats.mls_num 
 ORDER BY test_homes.price
LIMIT 10;

+-------------------------+--------+---------------+---------+---------+--------------------+------+-------+
| table                   | type   | possible_keys | key     | key_len | ref                | rows | Extra |
+-------------------------+--------+---------------+---------+---------+--------------------+------+-------+
| test_homes              | index  | NULL          | price   |       4 | NULL               |  100 |       |
| test_homes_supplemental | eq_ref | PRIMARY       | PRIMARY |       9 | test_homes.mls_num |    1 |       |
| test_homes_stats        | eq_ref | PRIMARY       | PRIMARY |       9 | test_homes.mls_num |    1 |       |
+-------------------------+--------+---------------+---------+---------+--------------------+------+-------+

But this query is part of a larger query-generating framework, and I'd
rather not fill the framework with special

Re: zip code search within x miles

2005-04-16 Thread Scott Gifford
Jeff Kolber [EMAIL PROTECTED] writes:

 How are sites doing the search by zip and coming up with results within x
 miles?  Is there some OSS zip code download that has been created for this?

The ones I'm familiar with use this:

http://www.census.gov/geo/www/tiger/zip1999.html

[...]

 I would be interested, anyone have insights into doing  with Canadian
 Zip Codes or otherwise optimizing a database of Canadian Zip Codes?

I looked into this a few months back (for a very similar project,
actually) and found many places offering Canadian postal code
databases, but none for free.  If you already have the database, I
don't see why optimizing it would be any different than for American
postal codes.

ScottG.




Re: zip code search within x miles

2005-04-15 Thread Scott Gifford
Scott Haneda [EMAIL PROTECTED] writes:

 How are sites doing the search by zip and coming up with results within x
 miles?  Is there some OSS zip code download that has been created for this?

Zipdy does most of what you want; it needs to be modified to support
MySQL instead of PostgreSQL, but that shouldn't be too hard.  It also
has the great circle function you need to calculate the distances
correctly.  You can get it from:

http://www.cryptnet.net/fsp/zipdy/
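For what it's worth, the great-circle calculation itself is only a few lines of standard haversine math (coordinates in decimal degrees; the miles constant is an approximate mean Earth radius, and the Detroit/Chicago coordinates below are rough values for illustration):

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean radius; plenty accurate for zip-radius search

def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance between two (lat, lon) points, in miles."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

# Detroit (42.33, -83.05) to Chicago (41.88, -87.63): roughly 240 miles.
d = great_circle_miles(42.33, -83.05, 41.88, -87.63)
```

This is the same formula zipdy applies to the latitude/longitude pairs in the census zip data, so porting it to MySQL is mostly a matter of storing those two columns.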

If you're using Perl, Geo::PostalCode works very well, though it
doesn't use an SQL database at all:

http://search.cpan.org/~tjmather/Geo-PostalCode-0.06/lib/Geo/PostalCode.pm

---ScottG.




Re: Processing Time

2002-07-11 Thread Scott Gifford

Asma Aripin [EMAIL PROTECTED] writes:

 --
 I want to select 1 record from 200,000 rows of records. I use this SQL
 statement:
 
 SELECT password FROM TableUser WHERE username='jack00'
 
 It takes me about 20 seconds just to view the password.
 
 Is there any other way to decrease the time? (In about less than 1 second)

Build an index on the username field.  Either re-create the table with
an index on 'username', or else use the SQL command CREATE INDEX to
create an index on an already-created table.  Syntax is in the
documentation.
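To illustrate (SQLite shown here because it runs inline, but the CREATE INDEX statement is the same idea in MySQL; table and column names are taken from your query, the index name is my own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableUser (username TEXT, password TEXT)")
conn.executemany("INSERT INTO TableUser VALUES (?, ?)",
                 [("user%d" % i, "pw%d" % i) for i in range(1000)])

# Without an index this lookup is a full table scan; with one it's
# a direct index search.
conn.execute("CREATE INDEX idx_username ON TableUser (username)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT password FROM TableUser WHERE username = 'user500'"
).fetchall()
row = conn.execute(
    "SELECT password FROM TableUser WHERE username = 'user500'"
).fetchone()
```

The query plan confirms the index is used, which is what turns a 20-second scan of 200,000 rows into a sub-second lookup.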

Good luck,

ScottG.

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: Affecting weighting of fulltext indices

2002-07-03 Thread Scott Gifford

I'm working on a search interface in MySQL for a series of items that
have titles and descriptions.  I'd like to weight the results so that
matches in the title count for more than matches in the description.
Is there a recommended way of doing this?

So far, I've come up with three ways, two of which seem to work.

The first and most straightforward is to create a keywords table,
and then list the word multiple times depending on where it was listed
to give it extra weight.  For example, if the title is Children and
Dogs and the description is Boys and girls take their pets for
walks, I might put in:

  children children dogs dogs boys girls take their pets walks

to give the words that appeared in the title, children and dogs,
twice the weight of the other words.

The second is to list a column multiple times when creating the
index.  For example:

create fulltext index test 
  on TxKeywords
(Title,Title,Description);

seems to have roughly the desired effect, but I don't exactly
understand what it's doing, and I don't know if this is supported
behavior, or if might go away in a future version.

The third is to create multiple fulltext indices, and multiply the
MATCH...AGAINST number against a multiplier.  For example:

select *, 
match(Title) against('children') * 2 +
match(Description) against ('children')
  as score 
from
  TxKeywords
 order by 
  (score) 
 limit 10

This is extremely slow---on a few thousand rows, it takes several
minutes at least, which just won't work for an interactive
application.

Has anybody done anything like this, and would they be willing to
share their experiences?  Am I missing something obvious?  Any way to
speed up the third method (which seems to be the most flexible)?  Can
I count on the second method being supported in future releases?
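In case it helps clarify what I mean by the third method, here is a toy version of the multiplier scoring, with LIKE standing in for MATCH...AGAINST (SQLite used only to make the arithmetic concrete; my real tables use MySQL fulltext indexes, and the sample rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TxKeywords (Title TEXT, Description TEXT)")
conn.executemany("INSERT INTO TxKeywords VALUES (?, ?)", [
    ("Children and Dogs", "Boys and girls take their pets for walks"),
    ("Pet Supplies", "Toys for children and dogs"),
    ("Gardening", "Flowers and vegetables"),
])

# A hit in Title counts double, mirroring the "* 2" multiplier above.
rows = conn.execute("""
    SELECT Title,
           (Title LIKE '%children%') * 2 + (Description LIKE '%children%') AS score
      FROM TxKeywords
     ORDER BY score DESC
     LIMIT 10
""").fetchall()
```

The title match sorts first, which is the weighting behavior I'm after; the question is how to get MySQL to do the equivalent quickly with real fulltext indexes.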

Thanks for any tips,

-ScottG.





Re: MySQL counter for HIGH traffic site

2002-04-14 Thread Scott Gifford

Craig Westerman [EMAIL PROTECTED] writes:

 I'm needing counter for site that receives 80 to 120 hits a minute
 at peak load. Many I have tried cause excessive server load and need
 to be deactivated or they lose data and return to zero without
 warning. All tried so far have been written in Perl writing to a
 file.

I did a similar project recently, and found that an extremely fast way
to do this was keeping the stats in a shared memory segment.  I found
I could do about 40,000 writes/second in Perl (see shmget), and
250,000 in C (see shmget(2) and shmop(2)).  This information isn't
kept across a reboot, but you can poll the segment periodically and
record the value in a file, or else recreate the stats information
from the logs on reboot.
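For anyone doing this today in Python instead of Perl, the same idea looks roughly like the sketch below (single-process only; a real multi-writer counter would also need a lock or an atomic increment, which this deliberately omits):

```python
import struct
from multiprocessing import shared_memory

# An 8-byte shared segment holding one unsigned 64-bit hit counter.
shm = shared_memory.SharedMemory(create=True, size=8)
shm.buf[:8] = struct.pack("Q", 0)

def hit(buf):
    """Increment the counter in place (no locking -- sketch only)."""
    (n,) = struct.unpack("Q", buf[:8])
    buf[:8] = struct.pack("Q", n + 1)

for _ in range(1000):
    hit(shm.buf)

(count,) = struct.unpack("Q", shm.buf[:8])
shm.close()
shm.unlink()
```

As with the Perl version, a periodic task would read the segment and checkpoint the value to disk so the count survives a reboot.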

Don't know how mySQL's performance would be for something like this,
but thought I would throw out what I did know.  :-)

Good luck,

ScottG.





Re: Column order

2002-04-09 Thread Scott Gifford

Sorry, this came out ugly.  I misinterpreted the list filter's
comments, and thought it would post my original message if I replied
to it, but instead it posted my reply including its warning.

Here's my original question, which is indeed about MySQL, although it
does not contain the words "SQL" or "query":

Is there any reason to prefer a particular order of columns in a
table?

For example, will SELECTs be faster if the columns are requested in
the same order they exist in the table?  Is there a tendency for
columns earlier in the table to be preferred in queries with complex
WHERE clauses, so I should put columns which would be good candidates
for using their index earlier in the table?  Is there any inherent
speed difference between earlier and later columns, for example
earlier columns are always at the beginning of the record, and so
won't require as much of the database file to be read to get their
value for each row?

Thanks for any tuning tips along these lines; I'll do some benchmarks
tonight, but I was hoping somebody already knew this stuff and could
point me in the right direction.

ScottG.





Performance of ORDER BY on an indexed column

2002-04-05 Thread Scott Gifford

I'm a little bit confused about what EXPLAIN is saying about a query
I'm doing with an ORDER BY clause.  I've read and re-read the parts of
the manual about how ORDER BY is optimized, but it won't sink in to my
thick skull for some reason.

I've got an index on a column called price, which is an INT:

mysql> show columns from homes like 'price' \G
*************************** 1. row ***************************
  Field: price
   Type: int(11)
   Null: YES
    Key: MUL
Default: NULL
  Extra: 
1 row in set (0.00 sec)

which I frequently sort by.  EXPLAIN says that a filesort is necessary
to do this:


mysql> explain select * from homes order by price \G
*************************** 1. row ***************************
        table: homes
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3628
        Extra: Using filesort
1 row in set (0.00 sec)

even when I do an ALTER TABLE homes ORDER BY price.

Why is a filesort required to sort these rows?  Is this something I
should worry about as this system goes from 1500 queries/day to 15000
queries/day to 150000 queries/day?  Is there something I can do that
is faster?

Sorry if this is a stupid question; this is the first database I've
designed where I've had to put much thought into performance.

Thanks for any info or pointers to documentation,

-ScottG.





Re: Abandoned Cart Query

2002-04-05 Thread Scott Gifford

rory oconnor [EMAIL PROTECTED] writes:

[...]

 each item in a cart is a record, so simply querying on records won't
 work.  What I need to count is how many distinct customerId's there
 are.  a sample from the table looks like:
 
 cartItemIdcustomerId  productId
 1944  244 2139
 1945  244 2234
 1946  244 8291
 1947  198 4435
 1948  198 3342
 1949  221 3324
 1950  221 4432
 
 so in this example, the number of abandoned carts would be 3, because
 there are only three unique customerId's.
 
 Any help on what that query would look like is appreciated!  

Something like SELECT DISTINCT(customerID) FROM cart should give you
what you want.  See the mysql documentation for SELECT and DISTINCT
for more details on exactly how these work.
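And to get the count directly rather than the list, COUNT(DISTINCT ...) does it in one query; here's a runnable sketch using your sample rows (SQLite standing in for MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cart (cartItemId INTEGER, customerId INTEGER, productId INTEGER)"
)
# The sample data from your message: 7 cart items, 3 distinct customers.
conn.executemany("INSERT INTO cart VALUES (?, ?, ?)", [
    (1944, 244, 2139), (1945, 244, 2234), (1946, 244, 8291),
    (1947, 198, 4435), (1948, 198, 3342),
    (1949, 221, 3324), (1950, 221, 4432),
])

customers = conn.execute("SELECT DISTINCT customerId FROM cart").fetchall()
(abandoned,) = conn.execute(
    "SELECT COUNT(DISTINCT customerId) FROM cart"
).fetchone()
```

With your example rows, abandoned comes out to 3, matching the count you expected.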

-ScottG.
