Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 3:54 PM, Thomas Kellerer <[EMAIL PROTECTED]> wrote:
> Hmm. I was not talking about an index _fast full_ scan, I was talking about
> index scans in general. Personally I have never seen Oracle using a table
> scan (whatever kind) if all columns in the select are present in the index.
>
> And the manual actually suggests the same:
>
> "If the statement accesses only columns of the index, then Oracle reads the
> indexed column values directly from the index, rather than from the table"
> http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i52300

The manual is wrong.

>> Those are essentially clustered indexes, and they're not quite stored
>> exactly the same..
>>
> Hmm, my understanding of a clustered index, that it "orders" the table data
> according to the index, but there is still "table data" and "index data",
> right?
>
> That is a bit different to an index-organized table were only a B-Tree index
> exists. This is not mandatory, but for my example (a link table with two PK
> columns) only a B-Tree index is created.

Well, clustered indexes mean different things to different vendors.
Oracle's implementation stores the data with the index as does SQL
Server, but in a little different fashion.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Thomas Kellerer

Jonah H. Harris wrote on 18.11.2008 20:58:

On Tue, Nov 18, 2008 at 2:33 PM, Thomas Kellerer <[EMAIL PROTECTED]> wrote:

If all the columns from the select list are available in the index, then
Oracle will always prefer the index scan over a table scan (at least I have
never seen something else). Even for a SELECT that returns all rows of the
table.


No, it doesn't always prefer index fast full scan.


Hmm. I was not talking about an index _fast full_ scan, I was talking about 
index scans in general. Personally I have never seen Oracle using a table scan 
(whatever kind) if all columns in the select are present in the index.


And the manual actually suggests the same:

"If the statement accesses only columns of the index, then Oracle reads the 
indexed column values directly from the index, rather than from the table"

http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#i52300


They are taking this concept even further with index organized tables, where
no real "table data" exists, everything is stored in the index (quited nice
for e.g. link tables that only consist of two or three integer columns)


Those are essentially clustered indexes, and they're not quite stored
exactly the same..

Hmm, my understanding of a clustered index, that it "orders" the table data 
according to the index, but there is still "table data" and "index data", right?


That is a bit different to an index-organized table were only a B-Tree index 
exists. This is not mandatory, but for my example (a link table with two PK 
columns) only a B-Tree index is created.


(I have to admit I don't really know the concept of clustered indexes)

Thomas




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Daniel Verite

Scott Marlowe wrote:


>> They aren't borrowing anything, Oracle has had this functionality
>> since at least Oracle 8i (1999).
>
> Whoa, calm down Francis.  I'm not suggesting they stole it or
> something. Just that they're using the same basic concepts.

Oh, and citation needed.  I don't remember seeing anything about
oracle using indexes as sole storage units back in 8i


They call that IOT, for Index Organized Tables, and they were 
introduced with Oracle8 (1997). Check out 
http://www.orafaq.com/wiki/Oracle_8


Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: 
http://www.manitou-mail.org


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 3:45 PM, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> Off topic much?

Hey, all I did was make a joke; other people wanted to get all
*correct* about it :)

Anyway, as this has been discussed at least twenty times before, this
is a waste of a thread.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Joshua D. Drake
On Tue, 2008-11-18 at 15:28 -0500, Jonah H. Harris wrote:
> On Tue, Nov 18, 2008 at 3:09 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> > Oh, and citation needed.  I don't remember seeing anything about
> > oracle using indexes as sole storage units back in 8i
> 
> Your memory-foo is weak.  See ORGANIZATION INDEX:
> 
> http://download-west.oracle.com/docs/cd/A87860_01/doc/server.817/a85397/statem3e.htm#2061671
> 

Off topic much?

> -- 
> Jonah H. Harris, Senior DBA
> myYearbook.com
> 
-- 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 3:09 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> Oh, and citation needed.  I don't remember seeing anything about
> oracle using indexes as sole storage units back in 8i

Your memory-foo is weak.  See ORGANIZATION INDEX:

http://download-west.oracle.com/docs/cd/A87860_01/doc/server.817/a85397/statem3e.htm#2061671

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 3:07 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
>> They aren't borrowing anything, Oracle has had this functionality
>> since at least Oracle 8i (1999).
>
> Whoa, calm down Francis.

My name's not Francis :)

> I'm not suggesting they stole it or something.  Just that they're using
> the same basic concepts.

Hmm...

--- snip
> Sounds like they're borrowing the code from innodb that does much the same 
> thing

You can't borrow something you started developing prior to InnoDB's release.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Scott Marlowe
On Tue, Nov 18, 2008 at 1:07 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Tue, Nov 18, 2008 at 1:03 PM, Jonah H. Harris <[EMAIL PROTECTED]> wrote:
>> On Tue, Nov 18, 2008 at 2:57 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
>>> Sounds like they're borrowing the code from innodb that does much the
>>> same thing.  In Innodb, if a field is indexed, it lives only as an
>>> index, not in the table and an index at the same time.
>>
>> They aren't borrowing anything, Oracle has had this functionality
>> since at least Oracle 8i (1999).
>
> Whoa, calm down Francis.  I'm not suggesting they stole it or
> something.  Just that they're using the same basic concepts.

Oh, and citation needed.  I don't remember seeing anything about
oracle using indexes as sole storage units back in 8i


-- 
When fascism comes to America, it will be draped in a flag and
carrying a cross - Sinclair Lewis

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 2:57 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> Sounds like they're borrowing the code from innodb that does much the
> same thing.  In Innodb, if a field is indexed, it lives only as an
> index, not in the table and an index at the same time.

They aren't borrowing anything, Oracle has had this functionality
since at least Oracle 8i (1999).

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Scott Marlowe
On Tue, Nov 18, 2008 at 1:03 PM, Jonah H. Harris <[EMAIL PROTECTED]> wrote:
> On Tue, Nov 18, 2008 at 2:57 PM, Scott Marlowe <[EMAIL PROTECTED]> wrote:
>> Sounds like they're borrowing the code from innodb that does much the
>> same thing.  In Innodb, if a field is indexed, it lives only as an
>> index, not in the table and an index at the same time.
>
> They aren't borrowing anything, Oracle has had this functionality
> since at least Oracle 8i (1999).

Whoa, calm down Francis.  I'm not suggesting they stole it or
something.  Just that they're using the same basic concepts.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 2:33 PM, Thomas Kellerer <[EMAIL PROTECTED]> wrote:
> If all the columns from the select list are available in the index, then
> Oracle will always prefer the index scan over a table scan (at least I have
> never seen something else). Even for a SELECT that returns all rows of the
> table.

No, it doesn't always prefer index fast full scan.

> They are taking this concept even further with index organized tables, where
> no real "table data" exists, everything is stored in the index (quited nice
> for e.g. link tables that only consist of two or three integer columns)

Those are essentially clustered indexes, and they're not quite stored
exactly the same..

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Scott Marlowe
On Tue, Nov 18, 2008 at 12:33 PM, Thomas Kellerer <[EMAIL PROTECTED]> wrote:
> If all the columns from the select list are available in the index, then
> Oracle will always prefer the index scan over a table scan (at least I have
> never seen something else). Even for a SELECT that returns all rows of the
> table.
>
> They are taking this concept even further with index organized tables, where
> no real "table data" exists, everything is stored in the index (quited nice
> for e.g. link tables that only consist of two or three integer columns)

Sounds like they're borrowing the code from innodb that does much the
same thing.  In Innodb, if a field is indexed, it lives only as an
index, not in the table and an index at the same time.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Thomas Kellerer

Jonah H. Harris wrote on 18.11.2008 20:15:

On Tue, Nov 18, 2008 at 2:02 PM, Scara Maccai <[EMAIL PROTECTED]> wrote:

SELECT A FROM myTAB where A <1

only uses the index (if there's an index defined for A) in Oracle.


Well, not exactly.  That's called a "covered" index because the query
could be satisfied directly from the index (the attribute is covered
by the index).  Oracle sometimes satisfies it with an index fast full
scan, but not always; it depends on the cost of other access methods
and/or what Oracle believes is currently in cache.

If all the columns from the select list are available in the index, then Oracle 
will always prefer the index scan over a table scan (at least I have never seen 
something else). Even for a SELECT that returns all rows of the table.


They are taking this concept even further with index organized tables, where no 
real "table data" exists, everything is stored in the index (quited nice for 
e.g. link tables that only consist of two or three integer columns)



Thomas


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 2:02 PM, Scara Maccai <[EMAIL PROTECTED]> wrote:
> SELECT A FROM myTAB where A <1
>
> only uses the index (if there's an index defined for A) in Oracle.

Well, not exactly.  That's called a "covered" index because the query
could be satisfied directly from the index (the attribute is covered
by the index).  Oracle sometimes satisfies it with an index fast full
scan, but not always; it depends on the cost of other access methods
and/or what Oracle believes is currently in cache.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Jonah H. Harris
On Tue, Nov 18, 2008 at 12:48 PM, Sam Mason <[EMAIL PROTECTED]> wrote:
>> It makes sense to me,
>> but I don't understand is how other databases (such as Oracle) do it.
>
> There are tradeoffs in both directions; [...] but Oracle's way is more 
> optimized


For the most part, that's all you needed to say :)


-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Scara Maccai
> FWIW, I believe that count(*) is pretty slow in Oracle too.

Well COUNT was only an example. I think (but I'm not sure AT ALL) that 

SELECT A FROM myTAB where A <1

only uses the index (if there's an index defined for A) in Oracle.

But mine was just curiosity... which I think you and Sam answered.

Thank you.







  Unisciti alla community di Io fotografo e video, il nuovo corso di 
fotografia di Gazzetta dello sport:
http://www.flickr.com/groups/iofotografoevideo

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Tom Lane
Sam Mason <[EMAIL PROTECTED]> writes:
> On Tue, Nov 18, 2008 at 04:49:35PM +, Scara Maccai wrote:
>> It makes sense to me,
>> but I don't understand is how other databases (such as Oracle) do it.

> I believe Oracle maintains a separate log (not sure how it's structured)
> that contains this information and all the data in both the main table
> and index can be considered committed.

FWIW, I believe that count(*) is pretty slow in Oracle too.  The DBs
that can do it fast are the ones that maintain a centralized counter
of the number of rows in each table.  Which makes count(*) nice and
fast, at the cost of horrendous concurrency impacts for updates; plus
there's no chance of real MVCC operation.  (In an MVCC world the correct
answer for count(*) can vary depending on who's asking --- there's no
hope of doing that with a single counter.)

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MVCC and index-only read

2008-11-18 Thread Sam Mason
On Tue, Nov 18, 2008 at 04:49:35PM +, Scara Maccai wrote:
> if I got it right the reason some aggregates (such as COUNT) using
> only index columns are "slow" on postgresql is that it uses MVCC, so
> it has to read the data as well as the index.

Every aggregate (of which COUNT is just one example) has to read data
from both the index and the table.  The reason is that each row in a
table has two important identifiers; the transaction that created it and
the transaction that killed it.  Every time a query scans the table it
looks to see that both the transaction that created it COMMITed and that
transaction that killed it (if any) didn't COMMIT.  The index doesn't
contain these two identifiers so when scanning the index the code needs
to go and check what these are.

There are various optimizations in PG so that it doesn't need to
actually check the transaction numbers the whole time, thus speeding
things up a bit, but the semantics/behavior is the same.

> It makes sense to me,
> but I don't understand is how other databases (such as Oracle) do it.

I believe Oracle maintains a separate log (not sure how it's structured)
that contains this information and all the data in both the main table
and index can be considered committed.

There are tradeoffs in both directions; PG's implementation allows
greater concurrency, but Oracle's way is more optimized for read access.
Which implementation is better depends a lot on your work load.

There has been talk of adding the transaction identifiers into the
indexes in PG, which would mean that index scans wouldn't need to go
to the table.  The problem is that the indexes would be larger and
modifying data would incur larger overheads as both the data and index
would have to be updated.

I hope someone will point out any mistakes I've made!


  Sam

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general