Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Dan Nelson
In the last episode (Nov 19), Benjamin Pflugmann said:
> On Mon 2002-11-18 at 19:01:57 -0500, [EMAIL PROTECTED] wrote:
> [...]
> > A BITMAP index will work efficiently in this case, no matter what
> > the distribution of the keys is.  The only requirement is that the
> > column be low-cardinality.
> > 
> > It basically works like this:
> > 
> > true   01101
> > false  10010
> > 
> > One bitmap is stored for each possible value of the column.  In
> > this case the false bitmap shows that the first and fourth records
> > should be return for false = '1'.
> 
> How/where would the row pointer be stored in this scheme? MySQL does
> not know how to find the 'fourth record' without a pointer to the row
> (and this row pointer is what takes most space in this case).

Bitmap indices are neat.  The bitmap values are stored in the same
physical order as the records in the database, so for a fixed-length
table format, you don't even need row pointers.  For variable-record
formats, I'd expect bitmaps to be stored in 64k-record groups, with a
pointer to the first row at the start of the block, and byte offsets to
the next row (think linked list).  This list would then be compressed,
since the resulting values would be nicely clustered around the average
rowlength.  The bitmap values themselves are also compressed.
Theoretically, a bitmap index on a column of 1 million "yes" and 50
"no"s could be one disk block in size.  This is an extreme example, of
course, and Oracle states that a bitmap index will usually be around
25% the size of the equivalent btree index.  I have seen them as small
as 10% on 100-million-record tables, indexing fields with up to 100
unique values.
 
> > A table scan would never be necessary, no matter what the
> > distribution of values.
> 
> A table scan is usually faster than jumping wildly around in the
> table file according to an index, as soon as the number of rows to
> read get over some percentage (for MySQL it's about 1/3rd, normally),
> as disk seeks are rather expensive.
>
> So I do not see what advantage a bitmap index could bring for the
> discussed case, except for some minor size advantage.

Bitmaps are perfect for data-warehouse type tables.  They are great for
doing counts, and are one of the few index types where multiple indices
can be used on a single table because they are so small.  Since they
can be combined, one would usually build three bitmaps on three
individual fields instead of having to build btree indices on a+b+c,
b+c, and c.

However, for most OLTP-type tables they suck, because it takes a
relatively massive time to update a bitmap index when a single record
changes (unpack the bitmap block covering to that record, update the
bit, repack).  On Oracle at least, expect an update query to take 10 to
100 times longer to run if you have bitmaps on changed fields.

For MySQL, I'd lean toward the "don't index these values" btree
optimization, since it's not really geared toward data warehousing.

-- 
Dan Nelson
[EMAIL PROTECTED]

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Benjamin Pflugmann
Hi.

On Mon 2002-11-18 at 19:01:57 -0500, [EMAIL PROTECTED] wrote:
[...]
> A BITMAP index will work efficiently in this case, no matter what the
> distribution of the keys is.  The only requirement is that the column be
> low-cardinality.
> 
> It basically works like this:
> 
> true   01101
> false  10010
> 
> One bitmap is stored for each possible value of the column.  In this
> case the false bitmap shows that the first and fourth records should be
> return for false = '1'.

How/where would the row pointer be stored in this scheme? MySQL does
not know how to find the 'fourth record' without a pointer to the row
(and this row pointer is what takes most space in this case).

> A table scan would never be necessary, no matter what the distribution
> of values.

A table scan is usually faster than jumping wildly around in the table
file according to an index, as soon as the number of rows to read get
over some percentage (for MySQL it's about 1/3rd, normally), as disk
seeks are rather expensive.

So I do not see what advantage a bitmap index could bring for the
discussed case, except for some minor size advantage.


The advantage the suggested condition-based index could bring is that
only the rows that match the condition (a minority as by the
presumption), say 1%, would be indexed and therefore the index would
be only 1% of the space (and only be used to access this 1% of rows).

Bye,

Benjamin.

-- 
[EMAIL PROTECTED]

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Daniel Koch
On Mon, 2002-11-18 at 16:22, Michael T. Babcock wrote:
> Neulinger, Nathan wrote:
> 
> >It's actually relatively speedy WITH the bad index, but maintaining that
> >index (given it's lopsided nature) is very expensive. 
> >
> >Yes, the point is to ONLY index the row if it matches the restriction.
> >  
> >
> 
> To clarify, if MySQL is indexing a binary value with lopsided 
> distribution, simply only keep an index of record locations for the 
> less-likely value.  The more-likely value will end up being table-scanned.
> 
> 'true', 'true', 'false', 'true', 'true', 'true'
> ... 'false' -> 3rd row.
> ... 'true' -> read the whole thing.



A BITMAP index will work efficiently in this case, no matter what the
distribution of the keys is.  The only requirement is that the column be
low-cardinality.

It basically works like this:

true   01101
false  10010

One bitmap is stored for each possible value of the column.  In this
case the false bitmap shows that the first and fourth records should be
return for false = '1'.

A table scan would never be necessary, no matter what the distribution
of values.


-- 
Daniel Koch <[EMAIL PROTECTED]>


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Michael T. Babcock
Neulinger, Nathan wrote:


It's actually relatively speedy WITH the bad index, but maintaining that
index (given it's lopsided nature) is very expensive. 

Yes, the point is to ONLY index the row if it matches the restriction.
 


To clarify, if MySQL is indexing a binary value with lopsided 
distribution, simply only keep an index of record locations for the 
less-likely value.  The more-likely value will end up being table-scanned.

'true', 'true', 'false', 'true', 'true', 'true'
... 'false' -> 3rd row.
... 'true' -> read the whole thing.

--
Michael T. Babcock
C.T.O., FibreSpeed Ltd.
http://www.fibrespeed.net/~mbabcock



-
Before posting, please check:
  http://www.mysql.com/manual.php   (the manual)
  http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Jeremy Zawodny
On Tue, Nov 19, 2002 at 07:12:27AM +1100, Dean Harding wrote:
> Actually, it's a slightly different problem - a very uneven distribution
> of values on a column, not a small number of possible values like a
> bitmap index is for.

Exactly.

> In my opinion, this is a pretty useless feature, I mean the whole
> *point* of the optimizer is to see things like that and do a full table
> scan when it's going to be faster.

Well...

> I guess I can see the point if the row is only *added* to the index if
> it matches the WHERE clause.  That'd speed up the index management as
> well.

Bingo.  That's it.  Speed up, save space, save CPU, etc.

Jeremy
-- 
Jeremy D. Zawodny |  Perl, Web, MySQL, Linux Magazine, Yahoo!
<[EMAIL PROTECTED]>  |  http://jeremy.zawodny.com/

MySQL 3.23.51: up 104 days, processed 2,262,358,069 queries (250/sec. avg)

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Neulinger, Nathan
The problem is - a whole table scan in this case is not faster. It's a
LOT slower. Table only has about 180,000 rows right now, but it's going
to get up to about half a million. 

It's actually relatively speedy WITH the bad index, but maintaining that
index (given it's lopsided nature) is very expensive. 

Yes, the point is to ONLY index the row if it matches the restriction.

-- Nathan


Nathan Neulinger   EMail:  [EMAIL PROTECTED]
University of Missouri - Rolla Phone: (573) 341-4841
Computing Services   Fax: (573) 341-4216


> -Original Message-
> From: Dean Harding [mailto:[EMAIL PROTECTED]] 
> Sent: Monday, November 18, 2002 2:12 PM
> To: 'Daniel Koch'; [EMAIL PROTECTED]
> Cc: 'Egor Egorov'; Neulinger, Nathan; 'Jeremy Zawodny'
> Subject: RE: feature suggestion - indexes with "where" clause 
> or similar
> 
> 
> Actually, it's a slightly different problem - a very uneven 
> distribution
> of values on a column, not a small number of possible values like a
> bitmap index is for.
> 
> In my opinion, this is a pretty useless feature, I mean the whole
> *point* of the optimizer is to see things like that and do a 
> full table
> scan when it's going to be faster.
> 
> I guess I can see the point if the row is only *added* to the index if
> it matches the WHERE clause.  That'd speed up the index management as
> well.
> 
> Dean Harding.
> 
> > -Original Message-
> > From: Daniel Koch [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, 19 November 2002 5:58 am
> > To: [EMAIL PROTECTED]
> > Cc: Egor Egorov; Neulinger, Nathan; Jeremy Zawodny
> > Subject: Re: feature suggestion - indexes with "where" clause or
> similar
> > 
> > On Mon, 2002-11-18 at 12:29, Jeremy Zawodny wrote:
> > 
> > > > If I've got you right status can have values 0 or 1. In 
> this case
> > > > you can just use " SELECT ... WHERE status=1 .." (index wil be
> used)
> > > > or "SELECT .. WHERE status=0 .." (index will not be 
> used, because
> > > > scan the whole table will be faster to retrieve 99,9% of rows)
> > > > depends on what you want to get.
> > >
> > > I'd like to second Nathan's request.
> > >
> > > Just because MySQL is smart enough to not use an index when 99% of
> the
> > > rows would match doesn't mean that this is an unnecessary request.
> > > It'd be a great optimization it MySQL could "know" not to bother
> > > indexing those records.  It'd save a lot of space and CPU time on
> > > larger data sets.
> > >
> > > Jeremy
> > 
> > 
> > 
> > I think this problem could be solved by implementing a BITMAP index,
> > like Oracle.  They're perfect for indexing boolean 
> true/false columns
> or
> > any column that has a small number of possible values.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > --
> > 
> > Daniel Koch <[EMAIL PROTECTED]>
> > 
> > 
> > 
> -
> > Before posting, please check:
> >http://www.mysql.com/manual.php   (the manual)
> >http://lists.mysql.com/   (the list archive)
> > 
> > To request this thread, e-mail <[EMAIL PROTECTED]>
> > To unsubscribe, e-mail  > [EMAIL PROTECTED]>
> > Trouble unsubscribing? Try: 
> http://lists.mysql.com/php/unsubscribe.php
> 
> 
> 

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Dean Harding
Actually, it's a slightly different problem - a very uneven distribution
of values on a column, not a small number of possible values like a
bitmap index is for.

In my opinion, this is a pretty useless feature, I mean the whole
*point* of the optimizer is to see things like that and do a full table
scan when it's going to be faster.

I guess I can see the point if the row is only *added* to the index if
it matches the WHERE clause.  That'd speed up the index management as
well.

Dean Harding.

> -Original Message-
> From: Daniel Koch [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, 19 November 2002 5:58 am
> To: [EMAIL PROTECTED]
> Cc: Egor Egorov; Neulinger, Nathan; Jeremy Zawodny
> Subject: Re: feature suggestion - indexes with "where" clause or
similar
> 
> On Mon, 2002-11-18 at 12:29, Jeremy Zawodny wrote:
> 
> > > If I've got you right status can have values 0 or 1. In this case
> > > you can just use " SELECT ... WHERE status=1 .." (index wil be
used)
> > > or "SELECT .. WHERE status=0 .." (index will not be used, because
> > > scan the whole table will be faster to retrieve 99,9% of rows)
> > > depends on what you want to get.
> >
> > I'd like to second Nathan's request.
> >
> > Just because MySQL is smart enough to not use an index when 99% of
the
> > rows would match doesn't mean that this is an unnecessary request.
> > It'd be a great optimization it MySQL could "know" not to bother
> > indexing those records.  It'd save a lot of space and CPU time on
> > larger data sets.
> >
> > Jeremy
> 
> 
> 
> I think this problem could be solved by implementing a BITMAP index,
> like Oracle.  They're perfect for indexing boolean true/false columns
or
> any column that has a small number of possible values.
> 
> 
> 
> 
> 
> 
> 
> --
> 
> Daniel Koch <[EMAIL PROTECTED]>
> 
> 
> -
> Before posting, please check:
>http://www.mysql.com/manual.php   (the manual)
>http://lists.mysql.com/   (the list archive)
> 
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail  [EMAIL PROTECTED]>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php



-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Daniel Koch
On Mon, 2002-11-18 at 12:29, Jeremy Zawodny wrote:

> > If I've got you right status can have values 0 or 1. In this case
> > you can just use " SELECT ... WHERE status=1 .." (index wil be used)
> > or "SELECT .. WHERE status=0 .." (index will not be used, because
> > scan the whole table will be faster to retrieve 99,9% of rows)
> > depends on what you want to get.
> 
> I'd like to second Nathan's request.
> 
> Just because MySQL is smart enough to not use an index when 99% of the
> rows would match doesn't mean that this is an unnecessary request.
> It'd be a great optimization it MySQL could "know" not to bother
> indexing those records.  It'd save a lot of space and CPU time on
> larger data sets.
> 
> Jeremy



I think this problem could be solved by implementing a BITMAP index,
like Oracle.  They're perfect for indexing boolean true/false columns or
any column that has a small number of possible values.







-- 

Daniel Koch <[EMAIL PROTECTED]>


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Jeremy Zawodny
On Mon, Nov 18, 2002 at 05:38:00PM +0200, Egor Egorov wrote:
> Neulinger,
> Friday, November 15, 2002, 7:25:27 PM, you wrote:
> 
> NN> Assume I have a mysql table (myisam most likely) with a few hundred
> NN> thousand rows in it. One of the columns indicates success or failure.
> NN> 99.9% of the rows will have "0" in that column. But a small number will
> NN> have 1. I need to be able to fetch those rows quickly, without slowing
> NN> everything else down, but ideally without doing a full table scan. 
> 
> NN> I can create an index on that column, but I am under the impression that
> NN> this a really bad/slow type of index to create/maintain, since one of
> NN> the values will cover most of the table.
> 
> NN> I'd like to be able to say something like:
> 
> NN> create index failures on dumps(status) where status!=0;
> 
> NN> If the sql query being run isn't compatible with the restriction on the
> NN> index, then it cannot be used. For example, if I query for status=2, it
> NN> would be ok, but status=0 would not be able to use the index. Simpler
> NN> may be to only allow the index to be used if the query contains exactly
> NN> the same restriction. i.e. the "where status !=0" index could only be
> NN> used if I had "status != 0" in my select query. 
> 
> NN> Or alternatively, if you can suggest some other means for accomplishing
> NN> this efficiently... 
> 
> NN> (Yes, I know I can make a temporary or results table updated
> NN> periodically, which I will likely do in the meantime, but would be nice
> NN> to have an efficient way of accomplishing this with live data.)
> 
> If I've got you right status can have values 0 or 1. In this case
> you can just use " SELECT ... WHERE status=1 .." (index wil be used)
> or "SELECT .. WHERE status=0 .." (index will not be used, because
> scan the whole table will be faster to retrieve 99,9% of rows)
> depends on what you want to get.

I'd like to second Nathan's request.

Just because MySQL is smart enough to not use an index when 99% of the
rows would match doesn't mean that this is an unnecessary request.
It'd be a great optimization it MySQL could "know" not to bother
indexing those records.  It'd save a lot of space and CPU time on
larger data sets.

Jeremy
-- 
Jeremy D. Zawodny |  Perl, Web, MySQL, Linux Magazine, Yahoo!
<[EMAIL PROTECTED]>  |  http://jeremy.zawodny.com/

MySQL 3.23.51: up 104 days, processed 2,259,137,221 queries (250/sec. avg)

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




re: feature suggestion - indexes with "where" clause or similar

2002-11-18 Thread Egor Egorov
Neulinger,
Friday, November 15, 2002, 7:25:27 PM, you wrote:

NN> Assume I have a mysql table (myisam most likely) with a few hundred
NN> thousand rows in it. One of the columns indicates success or failure.
NN> 99.9% of the rows will have "0" in that column. But a small number will
NN> have 1. I need to be able to fetch those rows quickly, without slowing
NN> everything else down, but ideally without doing a full table scan. 

NN> I can create an index on that column, but I am under the impression that
NN> this a really bad/slow type of index to create/maintain, since one of
NN> the values will cover most of the table.

NN> I'd like to be able to say something like:

NN> create index failures on dumps(status) where status!=0;

NN> If the sql query being run isn't compatible with the restriction on the
NN> index, then it cannot be used. For example, if I query for status=2, it
NN> would be ok, but status=0 would not be able to use the index. Simpler
NN> may be to only allow the index to be used if the query contains exactly
NN> the same restriction. i.e. the "where status !=0" index could only be
NN> used if I had "status != 0" in my select query. 

NN> Or alternatively, if you can suggest some other means for accomplishing
NN> this efficiently... 

NN> (Yes, I know I can make a temporary or results table updated
NN> periodically, which I will likely do in the meantime, but would be nice
NN> to have an efficient way of accomplishing this with live data.)

If I've got you right status can have values 0 or 1. In this case you
can just use " SELECT ... WHERE status=1 .." (index wil be used) or "SELECT .. WHERE
status=0 .." (index will not be used, because scan the whole table
will be faster to retrieve 99,9% of rows) depends on what you want to get.




-- 
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.net http://www.ensita.net/
   __  ___ ___   __
  /  |/  /_ __/ __/ __ \/ /Egor Egorov
 / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
/_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.net
   <___/   www.mysql.com




-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




RE: Feature suggestion

2001-08-28 Thread Simon Green

See V 4

-Original Message-
From: Alexander Barkov [mailto:[EMAIL PROTECTED]]
Sent: 28 August 2001 14:13
To: [EMAIL PROTECTED]
Subject: Feature suggestion


  Hello!

Currently replication is implemented between
two mysqld. I would like to suggest new feature.
Make it possible for two mysqld to exchange data
between each other. For example, MERGE tables
distributed between two machines. 

Question to authors. How do you think, will
it ever be implemented. If so, how soon?

Thanks...

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail
<[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php