Re: [HACKERS] Including Snapshot Info with Indexes

Gokulakannan Somasundaram Fri, 19 Oct 2007 20:58:10 -0700

Hi,
  I think I have an initial implementation. It has some bugs and I am working
on fixing them. But to show the advantages, I want to display the number of
logical I/Os. To do that, I tried enabling the log_statement option in
postgresql.conf, but it shows only the physical reads. What I wanted was a
logical read count (the number of ReadBuffer calls, which is stored in the
ReadBufferCount variable). So I have added this statistic to bufmgr.c (the
function is ShowBufferUsage, I suppose) so that it shows both logical reads
and physical reads. Is this an acceptable change?
  I thought a logical read count would be helpful even for SQL tuning: if
someone tunes SQL on a test system, the data might already be cached, and
they would not know how much I/O the query is potentially capable of. Maybe
we can also add a statistic to show how many of those ReadBuffer calls hit
pinned buffers.
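
To make the distinction concrete, here is a minimal sketch of the counting
idea. It is not the actual bufmgr.c code: the struct and function names
(BufferReadStats, count_buffer_read, buffer_hit_rate, format_buffer_usage)
are hypothetical and only illustrate that every buffer request counts as a
logical read, while only a cache miss counts as a physical read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counters for illustration; not the real bufmgr.c variables. */
typedef struct BufferReadStats
{
    long logical_reads;   /* every buffer request, cached or not */
    long physical_reads;  /* requests that had to go to disk */
} BufferReadStats;

/* Record one buffer request; found_in_cache is true on a cache hit. */
static void
count_buffer_read(BufferReadStats *stats, bool found_in_cache)
{
    stats->logical_reads++;
    if (!found_in_cache)
        stats->physical_reads++;
}

/* Cache hit rate as a percentage; 0 when nothing has been read yet. */
static double
buffer_hit_rate(const BufferReadStats *stats)
{
    if (stats->logical_reads == 0)
        return 0.0;
    return 100.0 * (double) (stats->logical_reads - stats->physical_reads)
        / (double) stats->logical_reads;
}

/* Format both counts into buf; returns what snprintf returns. */
static int
format_buffer_usage(const BufferReadStats *stats, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "logical reads: %ld, physical reads: %ld, hit rate: %.2f%%",
                    stats->logical_reads, stats->physical_reads,
                    buffer_hit_rate(stats));
}
```

On a warm test system the physical count can be close to zero while the
logical count still shows how many pages the query touches, which is the
number I would like to see reported.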


Expecting your comments.

Thanks,
Gokul.

On 10/14/07, Gokulakannan Somasundaram <[EMAIL PROTECTED]> wrote:
>
>
>
> On 10/14/07, Trevor Talbot <[EMAIL PROTECTED]> wrote:
> >
> > On 10/14/07, Gokulakannan Somasundaram <[EMAIL PROTECTED]> wrote:
> >
> > > http://www.databasecolumn.com/2007/09/one-size-fits-all.html
> >
> > > > > > The Vertica database (Monet is an open source version with the
> > > > > > same principle) makes use of the very same principle: use more
> > > > > > disk space, since it is less costly, and optimize the data
> > > > > > warehousing.
> >
> > > What I meant there was that it has duplicated storage of certain
> > > columns of the table. A table with more than one projection always
> > > needs more space than a table with just one projection. By doing this
> > > they are reducing the number of disk operations. If they are
> > > duplicating columns of data to avoid reading unnecessary information,
> > > we are duplicating the snapshot information to avoid going to the
> > > table.
> >
> > Was this about Vertica or MonetDB?  I saw that article a while ago,
> > and I didn't see anything that suggested Vertica duplicated data, just
> > that it organized it differently on disk.  What are you seeing as
> > being duplicated?
>
>
> Hi Trevor,
> This is a good paper to read about the basics of column-oriented
> databases:
> http://db.lcs.mit.edu/projects/cstore/vldb.pdf
> If you go to Section 2 (Data Model), it shows the data model with a
> sample EMP table.
>
> The example shows that the EMP table contains four columns: Name, Age,
> Dept, and Salary.
> From this table, projections are formed (the paper shows the creation of
> four projections for Example 1):
> EMP1 (name, age)
> EMP2 (dept, age, DEPT.floor)
> EMP3 (name, salary)
> DEPT1(dname, floor)
>
> As you can see, the same column information gets duplicated in different
> projections.
> The advantage is that if a query involves only name and age, it need not
> scan the other columns. But the storage requirements go up, since there is
> redundancy. As you may know, increasing data redundancy helps selects at
> the cost of inserts, updates and deletes.
>
> This is what I was trying to say.
>
> Thanks,
> Gokul.
>
>
>
