Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Dimitri Fontaine
Erik Rijkers e...@xs4all.nl writes:
 I might be mistaken but it looks like a doc/src/sgml/ref/alter_extension.sgml 
 is missing?

Mmm, it seems that git was agreeing with you, so here it is:

  git ls-files doc/src/sgml/ref/alter_extension.sgml 
  
http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=9371a9763651df2636cb6c20dced7cd67398c477

It was already online for readers of the HTML version of the docs:

  http://pgsql.tapoueh.org/extensions/doc/html/sql-alterextension.html

And it will appear in next revision of the patch. Thanks!
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@commandprompt.com writes:
  <function><link
 linkend="functions-extension">pg_extension_flag_dump</link></function>
[...]
 So presumably this shouldn't be documented because it cannot be called
 anyway?

It can be called but only from an extension's script.

 To be honest I don't understand the purpose of this part of the patch.

So the problem we're offering a solution for, here, is the extension
with user data problem: the extension infrastructure is only there so
that pg_dump knows to filter OUT sql objects from its dump, preferring a
single CREATE EXTENSION command.  Some extensions allow users to control
the data in some of their objects: now you want to have those in the
backup again.

From the docs:

  
http://pgsql.tapoueh.org/extensions/doc/html/functions-admin.html#FUNCTIONS-EXTENSION

  pg_extension_with_user_data allows an extension's author to prepare
  installation scripts that will work well both for initial installation
  and when restoring from a pg_dump backup, which issues CREATE EXTENSION
  foo WITH NO USER DATA;. See CREATE EXTENSION for details. One way to
  use it is as follows:

  DO $$
  BEGIN
    IF pg_extension_with_user_data() THEN
      create schema foo;
      create table foo.bar(id serial primary key);
      perform pg_extension_flag_dump('foo.bar_id_seq'::regclass);
      perform pg_extension_flag_dump('foo.bar'::regclass);
    END IF;
  END;
  $$;
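
  On restore, per the CREATE EXTENSION docs quoted above, the dump then
  replays something like:

    CREATE EXTENSION foo WITH NO USER DATA;

  so pg_extension_with_user_data() returns false inside the script, the IF
  branch is skipped, and foo.bar plus its sequence are restored from the
  dump's data section instead of being re-created empty.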

I don't really know how to improve the docs; you seem to have been
surprised by reading the CREATE EXTENSION docs, but you didn't follow the
link to the function's docs with the above details, did you?

I'm open to improving things here, but I'm not seeing how yet.

 I attach some minor fixes while reading it over.  I compiled but didn't
 run it :-)

Thanks a lot, that's applied in my git repo, and I did run it
successfully! It will be part of the next patch revision.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Dec18, 2010, at 17:59 , Tomas Vondra wrote:
 It seems to me you're missing one very important thing - this was not
 meant as a new default way to do estimates. It was meant as an option
 when the user (DBA, developer, ...) realizes the current solution gives
 really bad estimates (due to correlation). In that case he could create
 'cross-column' statistics on those columns, and the optimizer would use
 that info to do the estimates.

 I do understand that. I just have the nagging feeling that there is a
 way to judge from dist(A), dist(B) and dist(A,B) whether it makes sense
 to apply the uniform bayesian approach or to assume the columns are
 unrelated.

I doubt there is a way to make this decision with just dist(A), dist(B) and
dist(A,B) values. Well, we could go with a rule

  if [dist(A) == dist(A,B)] then [A => B]

but that's very fragile. Think about estimates (we're not going to work
with exact values of dist(?)), and then about data errors (e.g. a city
matched to an incorrect ZIP code or something like that).

So for a real-world dataset, the condition [dist(A)==dist(A,B)] will
almost never hold. And about the same holds for the uniform correlation
assumption which is the basis for the formula I posted.

So actually we're looking for a formula that gives reasonable estimates and
is robust even in cases where the correlation is not uniform or the
estimates are somewhat imprecise.

 This motivates the definition

 F(A,B) = [ dist(A)*dist(B) - dist(A,B) ] / [dist(A,B) * ( dist(B) - 1)]

 (You can probably drop the -1, it doesn't make much of a difference
 for larger values of dist(B).)

 F(A,B) specifies where dist(A) lies relative to dist(A,B)/dist(B) and
 dist(A,B) - a value of 0 indicates dist(A) = dist(A,B)/dist(B) while
 a value of 1 indicates that dist(A) == dist(A,B).

 So F(A,B) is a suitable measure of "Implicativeness" - it's higher
 if the table (A,B) looks more like a function A -> B.

 You might use that to decide if either A->B or B->A looks function-like
 enough to use the uniform bayesian approach. Or you might even go further,
 and decide *which* bayesian formula to use - the paper you cited always
 averages

   P(A=x|B=y)*P(B=y) and
   P(B=y|A=x)*P(A=x)

 but they offer no convincing reason for that other than "We don't know
 which to pick".

Well, the reason why they chose the sum/2 approach is that they were unable
to infer which of the values is 'better', and the sum/2 limits the errors.

I haven't studied this thoroughly, but my impression is that you are going
in the same direction as they did, i.e. while they've done

   P(A,B) = (P(A|B)*P(B) + P(B|A)*P(A)) / 2

with P(A|B) = dist(B) / dist(A,B), you've chosen P(A|B) ~ F(B,A) or
something like that.

You'll probably object that you could compute F(A,B) and F(B,A) and then
use only the part corresponding to the larger value, but what if F(A,B)
and F(B,A) are about the same?

This is the reason why they chose to always combine the values (with
varying weights).

 I'd like to find a statistical explanation for that definition of
 F(A,B), but so far I couldn't come up with any. I created a Maple 14
 worksheet while playing around with this - if you happen to have a
 copy of Maple available I'd be happy to send it to you..

No, I don't have Maple. Have you tried Maxima
(http://maxima.sourceforge.net) or Sage (http://www.sagemath.org/)? Sage
even has an online notebook - that seems like a very comfortable way to
exchange this kind of data.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2010-12-21 Thread Shigeru HANADA
On Mon, 20 Dec 2010 20:42:38 +0900
Itagaki Takahiro itagaki.takah...@gmail.com wrote:
 On Sun, Dec 19, 2010 at 12:45, Robert Haas robertmh...@gmail.com wrote:
  I'm not questioning any of that.  But I'd like the resulting code to
  be as maintainable as we can make it.
 
 I added comments and moved some setup codes for COPY TO to BeginCopyTo()
 for maintainability. CopyTo() still contains parts of initialization,
 but I've not touched it yet because we don't need the arrangement for now.

Attached is the revised version of file_fdw patch.  This patch is
based on Itagaki-san's copy_export-20101220.diff patch.

Changes from previous version are:

* file_fdw uses CopyErrorCallback() as the error-context callback routine
in fileIterate() to report error context.  The CONTEXT line in the
example below is added by the callback.

postgres=# select * From csv_tellers_bad;
ERROR:  missing data for column "bid"
CONTEXT:  COPY csv_tellers_bad, line 10: "10"
postgres=#

* Only superusers can change table-level file_fdw options.  A normal
user can't change the options even if the user is the owner of the
table.  This is for security reasons.

Regards,
--
Shigeru Hanada


file_fdw-20101221.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 9:29 PM, Florian Pflug f...@phlo.org wrote:
 I tried to pick up Robert's idea of quantifying "Implicativeness" -
 i.e., finding a number between 0 and 1 that describes how close the
 (A,B) are to representing a function A -> B.

Actually Heikki's idea...

 Observe that dist(A),dist(B) <= dist(A,B) <= dist(A)*dist(B) if the
 estimates of dist(?) are consistent. From that you easily get

  dist(A,B)/dist(B) <= dist(A) <= dist(A,B) and
  dist(A,B)/dist(A) <= dist(B) <= dist(A,B)

 If dist(A) == dist(A,B), then there is a functional dependency
 A -> B, and conversely if dist(B) == dist(A,B) there is a functional
 dependency B -> A. Note that you can have both at the same time!

 On the other hand, if dist(B) = dist(A,B)/dist(A), then B has the
 smallest number of distinct values possible for a given combination
 of dist(A,B) and dist(A). This is the anti-function case.

 This motivates the definition

  F(A,B) = [ dist(A)*dist(B) - dist(A,B) ] / [ dist(A,B) * ( dist(B) - 1) ]

 (You can probably drop the -1, it doesn't make much of a difference
 for larger values of dist(B).)

 F(A,B) specifies where dist(A) lies relative to dist(A,B)/dist(B) and
 dist(A,B) - a value of 0 indicates dist(A) = dist(A,B)/dist(B) while
 a value of 1 indicates that dist(A) == dist(A,B).

 So F(A,B) is a suitable measure of "Implicativeness" - it's higher
 if the table (A,B) looks more like a function A -> B.

 You might use that to decide if either A->B or B->A looks function-like
 enough to use the uniform bayesian approach. Or you might even go further,
 and decide *which* bayesian formula to use - the paper you cited always
 averages

  P(A=x|B=y)*P(B=y) and
  P(B=y|A=x)*P(A=x)

 but they offer no convincing reason for that other than "We don't know
 which to pick".

Ideally you want to somehow make this a continuous transition between
the available formulas rather than a discrete transition, I think.  If
F(A,B) = 1 then the selectivity of A = x AND B = y is just P(A=x), and
if it's 0, then it's P(A=x)*P(B=y).  But suppose F(A,B)=0.5.  Then
what?  A naive approach would be to estimate P(A=x && B=y) = P(A=x) *
(1 - (1 - F(A,B))*(1 - P(B = y))), so that if, say, P(A=x) = 0.1 and
P(B=y) = 0.1, then when F(A,B) = 0 we estimate 0.01, when F(A,B) = 1
we estimate 0.1, and when F(A,B) = 0.5 we estimate (0.1)(1 - 0.5*0.9)
= 0.055.  Of course I'm just hand-waving here, and this is without any
mathematical basis, being just the simplest formula I could think of
that gets the endpoints right and plots some sort of smooth curve
between them in the middle.  A similar formula with a believable
argument to back it up seems like it would be a big step forward for
this method.
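
To make the endpoints and the curve concrete, here is a toy SQL sketch of
that naive blend (the function name is invented; the sample numbers are
just the ones from the paragraph above, not a proposed implementation):

  CREATE FUNCTION blend_sel(p_a float8, p_b float8, f float8) RETURNS float8
  AS $$ SELECT $1 * (1 - (1 - $3) * (1 - $2)) $$ LANGUAGE sql IMMUTABLE;

  SELECT blend_sel(0.1, 0.1, 0.0),  -- 0.010 = P(A=x)*P(B=y), independence
         blend_sel(0.1, 0.1, 0.5),  -- 0.055, the hand-waved midpoint
         blend_sel(0.1, 0.1, 1.0);  -- 0.100 = P(A=x), functional dependency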

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 3:14 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 20, 2010 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 20, 2010 at 2:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I like that better actually ... one less thing for developers to get wrong.

 The attached patch appears to work correctly on MacOS X.  I did check,
 BTW: getppid() in the attached process returns gdb's pid.  Poor!

 Looks good to me.

 For my own purposes, I would be just as happy to apply this only to
 master.  But I wonder if anyone wants to argue for back-patching, to
 help debug existing installations?

 Given the lack of non-developer complaints, I see no need to backpatch.

 Well, non-developers don't tend to attach gdb very often.  Alvaro
 mentioned a problem installation upthread, thus the question.

Hearing no cries of please-oh-please-backpatch-this, I've committed
it just to master.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Robert Haas
2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there a possibility of having an internal base-converting function in PgSQL?
 There are already functions for converting between decimal and hexadecimal
 notations; I think pgsql should be able to convert between numbers with radixes
 from 1 to 36 (actually, fast base36 encoding/decoding is what I need)...

It should be pretty easy to write such a function in C, perhaps using
strtol() or strtoul().  Because PostgreSQL uses an extensible
architecture, you could load such a function into your copy of
PostgreSQL and use it in your environment even if it weren't part of
the core distribution.  There are a number of existing contrib
modules that you can look at for examples of how to do this.

Whether or not we'd accept a patch to add such a function to core or
contrib, I'm not sure.  Nobody's written one yet...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Robert Haas
On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasby j...@nasby.net wrote:
 On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:
 Does postgres make an effort to create a file with physically continuous 
 blocks?

 AFAIK all files are expanded as needed. I don't think there are any flags you
 can pass to the filesystem to tell it "this file will eventually be 1GB in
 size". So, we're basically at the mercy of the FS to try and keep things
 contiguous.

There have been some reports that we would do better on some
filesystems if we extended the file more than a block at a time, instead
of one block at a time as we do today.  However, AFAIK, no one is
pursuing this ATM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Stehule
On 21 December 2010 12:48, Robert Haas robertmh...@gmail.com wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there a possibility of having an internal base-converting function in PgSQL?
 There are already functions for converting between decimal and hexadecimal
 notations; I think pgsql should be able to convert between numbers with radixes
 from 1 to 36 (actually, fast base36 encoding/decoding is what I need)...

 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().  Because PostgreSQL uses an extensible
 architecture, you could load such a function into your copy of
 PostgreSQL and use it in your environment even if it weren't part of
 the core distribution.  There are a number of existing contrib
 modules that you can look at for examples of how to do this.

 Whether or not we'd accept a patch to add such a function to core or
 contrib, I'm not sure.  Nobody's written one yet...

The most-used transformations are available in core now - they just need a
wrapper function.

This function isn't clear-cut, though - should it be based on int, long int,
or bytea?

Pavel



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Mon, Dec 20, 2010 at 9:29 PM, Florian Pflug f...@phlo.org wrote:
 You might use that to decide if either A->B or B->A looks function-like
 enough to use the uniform bayesian approach. Or you might even go further,
 and decide *which* bayesian formula to use - the paper you cited always
 averages

  P(A=x|B=y)*P(B=y) and
  P(B=y|A=x)*P(A=x)

 but they offer no convincing reason for that other than "We don't know
 which to pick".

 Ideally you want to somehow make this a continuous transition between
 the available formulas rather than a discrete transition, I think.  If
 F(A,B) = 1 then the selectivity of A = x AND B = y is just P(A=x), and
 if it's 0, then it's P(A=x)*P(B=y).  But suppose F(A,B)=0.5.  Then
 what?  A naive approach would be to estimate P(A=x && B=y) = P(A=x) *
 (1 - (1 - F(A,B))*(1 - P(B = y))), so that if, say, P(A=x) = 0.1 and
 P(B=y) = 0.1, then when F(A,B) = 0 we estimate 0.01, when F(A,B) = 1
 we estimate 0.1, and when F(A,B) = 0.5 we estimate (0.1)(1 - 0.5*0.9)
 = 0.055.  Of course I'm just hand-waving here, and this is without any
 mathematical basis, being just the simplest formula I could think of
 that gets the endpoints right and plots some sort of smooth curve
 between them in the middle.  A similar formula with a believable
 argument to back it up seems like it would be a big step forward for
 this method.

This somehow reminds me of how the various t-norms in fuzzy logic evolved.
I'm not saying we should use fuzzy logic here, but the requirements are
very similar, so it might be an interesting inspiration. See for example
http://plato.stanford.edu/entries/logic-fuzzy (chapter 4).

And there's one additional - IMHO very important - requirement. The whole
thing should easily extend to more than two columns. This "IF (F(A,B) >
F(B,A)) THEN ..." approach is probably not a good solution in that regard.

For example, given 3 columns A,B,C, would you do that comparison for each
pair of columns, or would you do it for A vs (B,C)? Or maybe a
completely different approach? That would require collecting a lot
more data (the number of distinct values in each combination) etc.

Say, for example, there is a table with (C=A+B)

  A | B | C
 ===========
  1 | 1 | 2
  1 | 2 | 3
  1 | 3 | 4
  2 | 1 | 3
  2 | 2 | 4
  2 | 3 | 5
  3 | 1 | 4
  3 | 2 | 5
  3 | 3 | 6

So that dist(A)=dist(B)=3, dist(C)=5 and dist(A,B,C)=dist(A,B)=9. Given
the paper, you get something like

 P(A,B,C) = [dist(A)*P(A) + dist(B)*P(B) + dist(C)*P(C)] / [3*dist(A,B,C)]
          = [3*P(A) + 3*P(B) + 5*P(C)] / 27

so for example

 P(A=3,B=2,C=5) = [3*(1/3) + 3*(1/3) + 5*(2/9)]/27 = (28/9)/27 = 28/243

which is almost correct (the exact value is 1/9 = 27/243, off by 1/243).

Don't get me wrong - I'm not a fanatic who thinks this particular formula
is the best formula possible. I'm just saying we could end up with a
formula that works beautifully in 2D, but once we get to 3 columns it
fails miserably.

Hmmm, maybe we could give this possibility (to identify two separate
groups of columns) to the user. So instead of 'build stats for (A,B,C)' the
user would say 'build stats for (A,B) and (C)' - this actually represents
apriori knowledge of dependencies supplied by the user.

In that case we could search for 'implicativeness' between those two
groups (and not within the groups), and we could restrict ourselves to 2D
(and thus use a more sophisticated formula).

But we should be able to do some basic analysis even when the user
supplies a list of columns without such apriori knowledge.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2010-12-21 Thread Itagaki Takahiro
On Tue, Dec 21, 2010 at 20:14, Shigeru HANADA han...@metrosystems.co.jp wrote:
 Attached is the revised version of file_fdw patch.  This patch is
 based on Itagaki-san's copy_export-20101220.diff patch.

#1. Don't you have a per-tuple memory leak? I added GetCopyExecutorState()
because the caller needs to reset the per-tuple context periodically.

Or, if you eventually make a HeapTuple from the values and nulls arrays,
you could modify NextCopyFrom() to return a HeapTuple instead of values,
nulls, and tupleOid. The reason I didn't use HeapTuple is that I've
seen arrays used in the proposed FDW APIs. But we don't have to
use such arrays if you use HeapTuple-based APIs.

IMHO, I prefer HeapTuple because we can simplify NextCopyFrom and
keep EState private in copy.c.

#2. Can you avoid making the EXPLAIN text in fplan->explainInfo in
non-EXPLAIN cases? It's a waste of CPU cycles in normal execution.
I doubt whether the FdwPlan.explainInfo field is the best design.
How do we use the EXPLAIN text for XML or JSON explain formats?
Instead, we could have an additional routine for EXPLAIN.

#3. Why do you re-open a foreign table in estimate_costs() ?
Since the caller seems to have the options for them, you can
pass them directly, no?

In addition, passing a half-initialized fplan to estimate_costs()
is a bad idea. If you think it is an OUT parameter, the OUT params
should be *startup_cost and *total_cost.

#4. It's a minor cosmetic point, but our coding convention is that
we don't need (void *) when we cast a pointer to void *, but we do need
(Type *) when we cast a void pointer to another type.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 6:42 AM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Sun, Dec 19, 2010 at 12:45, Robert Haas robertmh...@gmail.com wrote:
 I'm not questioning any of that.  But I'd like the resulting code to
 be as maintainable as we can make it.

 I added comments and moved some setup codes for COPY TO to BeginCopyTo()
 for maintainability. CopyTo() still contains parts of initialization,
 but I've not touched it yet because we don't need the arrangement for now.

I haven't analyzed this enough to know whether I agree with it, but as
a trivial matter you should certainly revert this hunk:

/* field raw data pointers found by COPY FROM */
-
-   int max_fields;
-   char ** raw_fields;
+   int max_fields;
+   char  **raw_fields;


-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bug in ts_rank_cd

2010-12-21 Thread Sushant Sinha
There is a bug in ts_rank_cd. It does not correctly give rank when the
query lexeme is the first one in the tsvector.

Example:

select ts_rank_cd(to_tsvector('english', 'abc sdd'),
plainto_tsquery('english', 'abc'));   
 ts_rank_cd 
------------
          0

select ts_rank_cd(to_tsvector('english', 'bcg abc sdd'),
plainto_tsquery('english', 'abc'));
 ts_rank_cd 
------------
        0.1

The problem is that the cover-finding algorithm ignores the lexeme at
the 0th position. I have attached a patch which fixes it. After the
patch the result is fine.

select ts_rank_cd(to_tsvector('english', 'abc sdd'), plainto_tsquery(
'english', 'abc'));
 ts_rank_cd 
------------
        0.1

--- postgresql-9.0.0/src/backend/utils/adt/tsrank.c	2010-01-02 22:27:55.0 +0530
+++ postgres-9.0.0-tsrankbugfix/src/backend/utils/adt/tsrank.c	2010-12-21 18:39:57.0 +0530
@@ -551,7 +551,7 @@
 	memset(qr->operandexist, 0, sizeof(bool) * qr->query->size);
 
 	ext->p = 0x7fffffff;
-	ext->q = 0;
+	ext->q = -1;
 	ptr = doc + ext->pos;
 
 	/* find upper bound of cover from current position, move up */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bug in ts_rank_cd

2010-12-21 Thread Sushant Sinha
MY PREV EMAIL HAD A PROBLEM. Please reply to this one
======================================================

There is a bug in ts_rank_cd. It does not correctly give rank when the
query lexeme is the first one in the tsvector.

Example:

select ts_rank_cd(to_tsvector('english', 'abc sdd'),
plainto_tsquery('english', 'abc'));   
 ts_rank_cd 
------------
          0

select ts_rank_cd(to_tsvector('english', 'bcg abc sdd'),
plainto_tsquery('english', 'abc'));
 ts_rank_cd 
------------
        0.1

The problem is that the cover-finding algorithm ignores the lexeme at
the 0th position. I have attached a patch which fixes it. After the
patch the result is fine.

select ts_rank_cd(to_tsvector('english', 'abc sdd'), plainto_tsquery(
'english', 'abc'));
 ts_rank_cd 
------------
        0.1

--- postgresql-9.0.0/src/backend/utils/adt/tsrank.c	2010-01-02 22:27:55.0 +0530
+++ postgres-9.0.0-tsrankbugfix/src/backend/utils/adt/tsrank.c	2010-12-21 18:39:57.0 +0530
@@ -551,7 +551,7 @@
 	memset(qr->operandexist, 0, sizeof(bool) * qr->query->size);
 
 	ext->p = 0x7fffffff;
-	ext->q = 0;
+	ext->q = -1;
 	ptr = doc + ext->pos;
 
 	/* find upper bound of cover from current position, move up */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there a possibility of having an internal base-converting function in PgSQL?
 There are already functions for converting between decimal and hexadecimal
 notations; I think pgsql should be able to convert between numbers with radixes
 from 1 to 36 (actually, fast base36 encoding/decoding is what I need)...
 
 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().

If you're not comfortable doing this in C, you might also want to consider
one of the procedural languages pl/pgsql, pl/perl or pl/python. pl/pgsql is
probably only viable if you just need this for ints and bigints, unless you
don't care about performance.
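
For the bigint case, a minimal pl/pgsql sketch (function name invented,
non-negative input only) could look like this:

  CREATE FUNCTION to_base36(n bigint) RETURNS text AS $$
  DECLARE
    digits text := '0123456789abcdefghijklmnopqrstuvwxyz';
    result text := '';
    v bigint := n;
  BEGIN
    IF v = 0 THEN RETURN '0'; END IF;
    WHILE v > 0 LOOP
      -- prepend the digit corresponding to v mod 36
      result := substr(digits, (v % 36)::int + 1, 1) || result;
      v := v / 36;
    END LOOP;
    RETURN result;
  END;
  $$ LANGUAGE plpgsql IMMUTABLE;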

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 11:37 , t...@fuzzy.cz wrote:
 I doubt there is a way to make this decision with just dist(A), dist(B) and
 dist(A,B) values. Well, we could go with a rule
 
  if [dist(A) == dist(A,B)] then [A => B]
 
 but that's very fragile. Think about estimates (we're not going to work
 with exact values of dist(?)), and then about data errors (e.g. a city
 matched to an incorrect ZIP code or something like that).

Huh? The whole point of the F(A,B)-exercise is to avoid precisely this
kind of fragility without penalizing the non-correlated case...

 This is the reason why they choose to always combine the values (with
 varying weights).

There are no varying weights involved there. What they do is to express
P(A=x,B=y) once as

  P(A=x,B=y) = P(B=y|A=x)*P(A=x) and then as
  P(A=x,B=y) = P(A=x|B=y)*P(B=y).

Then they assume

  P(B=y|A=x) ~= dist(A)/dist(A,B) and
  P(A=x|B=y) ~= dist(B)/dist(A,B),

and go on to average the two different ways of computing P(A=x,B=y), which
finally gives

   P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
              = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
              = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))

That averaging step adds *no* further data-dependent weights.

 I'd like to find a statistical explanation for that definition of
 F(A,B), but so far I couldn't come up with any. I created a Maple 14
 worksheet while playing around with this - if you happen to have a
 copy of Maple available I'd be happy to send it to you..
 
 No, I don't have Maple. Have you tried Maxima
 (http://maxima.sourceforge.net) or Sage (http://www.sagemath.org/). Sage
 even has an online notebook - that seems like a very comfortable way to
 exchange this kind of data.

I haven't tried them, but I will. That Java-based GUI of Maple is driving
me nuts anyway... Thanks for the pointers!

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Dec21, 2010, at 11:37 , t...@fuzzy.cz wrote:
 I doubt there is a way to make this decision with just dist(A), dist(B) and
 dist(A,B) values. Well, we could go with a rule

  if [dist(A) == dist(A,B)] then [A => B]

 but that's very fragile. Think about estimates (we're not going to work
 with exact values of dist(?)), and then about data errors (e.g. a city
 matched to an incorrect ZIP code or something like that).

 Huh? The whole point of the F(A,B)-exercise is to avoid precisely this
 kind of fragility without penalizing the non-correlated case...

Yes, I understand the intention, but I'm not sure how exactly you want
to use the F(?,?) function to compute P(A,B) - which is the value
we're looking for.

If I understand it correctly, you proposed something like this

  IF (F(A,B) > F(B,A)) THEN
    P(A,B) := c*P(A);
  ELSE
    P(A,B) := d*P(B);
  END IF;

or something like that (I guess c=dist(A)/dist(A,B) and
d=dist(B)/dist(A,B)). But what if F(A,B)=0.6 and F(B,A)=0.59? This may
easily happen due to data errors / imprecise estimates.

And this actually matters, because P(A) and P(B) may actually be
significantly different. So this would be really vulnerable to slight
changes in the estimates etc.

 This is the reason why they choose to always combine the values (with
 varying weights).

 There are no varying weights involved there. What they do is to express
 P(A=x,B=y) once as

 ...

   P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
              = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
              = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))

 That averaging step adds *no* further data-dependent weights.

Sorry, by 'varying weights' I didn't mean that the weights are different
for each value of A or B. What I meant is that they combine the values
with different weights (just as you explained).

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 15:51 , t...@fuzzy.cz wrote:
 This is the reason why they choose to always combine the values (with
 varying weights).
 
 There are no varying weights involved there. What they do is to express
 P(A=x,B=y) once as
 
 ...
 
  P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
             = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
             = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))
 
 That averaging step adds *no* further data-dependent weights.
 
 Sorry, by 'varying weights' I didn't mean that the weights are different
 for each value of A or B. What I meant is that they combine the values
 with different weights (just as you explained).

I'm still not sure we're on the same page here. The resulting formula
is *not* a weighted average of P(A=x) and P(B=y), since in general
dist(A) + dist(B) = 2*dist(A,B) does *not* hold. It may look like one
syntactically, but that's about it.

The resulting formula instead is an *unweighted* (weights 1) average of
the two estimates P(B=y|A=x)*P(A=x) and P(A=x|B=y)*P(B=y). You might just
as well estimate P(A=x,B=y) with

  P(B=y|A=x)*P(A=x) = dist(A)*P(A=x)/dist(A,B)

and it'd *still* be the very same uniform bayesian approach, just no
longer symmetric in A and B. Which may easily be preferable if you
have reasons to believe that this estimate is more correct than the
one obtained by swapping A and B. The original paper doesn't deal with
that case simply because they don't mention how P(A=x) and P(B=y)
are obtained at all. The postgres estimator, on the other hand,
knows quite well how it derived P(A=x) and P(B=y) and may have much
higher confidence in one value than in the other.

Assume for example that you're preparing the statement

  SELECT * FROM T WHERE A = ? AND B = 1

We'll then estimate P(A=?) as 1/dist(A), since we cannot do any better
without an actual value for the parameter ?. The estimate for P(B=1),
on the other hand, can use the histogram, and will thus very likely be
much more precise. The two estimates for P(A=?,B=1) in this case are

  P(A=?|B=1)*P(B=1) = dist(B)*P(B=1)/dist(A,B), and
  P(B=1|A=?)*P(A=?) = dist(A)*P(A=?)/dist(A,B).

There's a good chance that the former beats the latter, and thus also
the average, then.
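
In PostgreSQL syntax, the situation above (with a hypothetical table t
having int columns a and b) is simply:

  PREPARE q(int) AS SELECT * FROM t WHERE a = $1 AND b = 1;

At plan time P(a=$1) has to fall back to 1/dist(a), while P(b=1) can
consult the histogram/MCV list and will usually be far more precise.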

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Erik Rijkers
On Tue, December 21, 2010 09:57, Dimitri Fontaine wrote:
 Erik Rijkers e...@xs4all.nl writes:
 I might be mistaken but it looks like a 
 doc/src/sgml/ref/alter_extension.sgml is missing?

 Mmm, it seems that git was agreeing with you, so here it is:

   git ls-files doc/src/sgml/ref/alter_extension.sgml
   
 http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=9371a9763651df2636cb6c20dced7cd67398c477

 It was already online for readers of the HTML version of the docs:

   http://pgsql.tapoueh.org/extensions/doc/html/sql-alterextension.html

 And it will appear in next revision of the patch. Thanks!
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


Two changes to sql-alterextension.sgml:

  ALTER EXTENSION name SET EXTENSION new_schema

should be:

  ALTER EXTENSION name SET SCHEMA new_schema




And in the 'Description' there are (I think) old copy/paste remnants:

  ALTER EXTENSION changes the definition of an existing type. There are only 
one subforms:
  SET SCHEMA

it should be (something like):

  ALTER EXTENSION changes an existing extension. There is only one form:
  ALTER EXTENSION set schema new_schema




Erik Rijkers





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 13:25 , t...@fuzzy.cz wrote:
 And there's one additional - IMHO very important - requirement. The whole
 thing should easily extend to more than two columns. This "IF (F(A,B) >
 F(B,A)) THEN ..." approach is probably not a good solution in that regard.
 
 For example, given 3 columns A,B,C, would you do that comparison for each
 pair of columns, or would you do it for A vs (B,C)? Or maybe a
 completely different approach? That would require collecting a lot
 more data (the number of distinct values in each combination) etc.

That's certainly a valid concern. The uniform bayesian approach avoids that
quite beautifully...

 Hmmm, maybe we could give this possibility (to identify two separate
 groups of columns) to the user. So instead of 'build stats for (A,B,C)' the
 user would say 'build stats for (A,B) and (C)' - this actually represents
 apriori knowledge of dependencies supplied by the user.
 
 In that case we could search for 'implicativeness' between those two
 groups (and not within the groups), and we could restrict ourselves to 2D
 (and thus use a more sophisticated formula).

Hm, I hated this idea at first, but I'm starting to like it more and more.
It *does* seem rather unrealistic that a user would know that a bunch of
columns are correlated, but have no idea in what way... 

Any examples where this'd be the case would be very much appreciated - maybe
we should ask around on -general about this?

 But we should be able to do some basic analysis even when the user
 supplies a list of columns without such apriori knowledge.

That, I think, overcomplicates things, at least for a first cut.

To summarize, I think you've shown quite nicely that the uniform bayesian
approach is a very sensible first step towards better estimates in the case
of correlated columns. It's statistically sound, and the dist(A,B) estimates
it requires are probably a necessary ingredient of any solution to the
problem. If we can make it degrade more gracefully when the columns are
uncorrelated we should do that, but if we can't, that's still no reason to
drop the whole idea.

So I guess we should turn our attention to how we'd obtain reasonably good
estimates of dist(A,B), and return to the current discussion once the other
pieces are in place.

I think that maybe it'd be acceptable to scan a large portion of the
table to estimate dist(A,B), *if* we only do so once in a while. But even
with a full scan, how would we arrive at an estimate for dist(A,B)?
Keeping all values in memory, and spilling them into, say, an on-disk
hash table, adds even more overhead to the already expensive full scan.
Maybe using a Bloom filter instead of a hash table could avoid the
spilling to disk, in exchange for a slightly less precise result...
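
(For reference, the textbook definition: a Bloom filter hashes each value
with k hash functions into an m-bit array; lookups never miss a value that
was inserted, but report false positives with probability roughly
(1 - e^(-k*n/m))^k after n insertions. A distinct-count built on top of one
would therefore be a slight underestimate.)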

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest wrap-up

2010-12-21 Thread Robert Haas
On Wed, Dec 15, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 - Writeable CTEs - I think we need Tom to pick this one up.
 - Fix snapshot taking inconsistencies - Ready for committer. Can any
 committer pick this up?

 Will take a look at these two also.

Tom, what is your time frame on this?  I think we should wrap up the
CF without these and bundle 9.1alpha3 unless you plan to get to this
in the next day or two.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Owner inheritance

2010-12-21 Thread gsdfg gdfg
It would be great if the owner could be inherited from the parent object
(table owner == schema owner == database owner).
The CREATE statement could add OWNER TO PARENT to cover this feature.
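
Today that has to be spelled out per object, e.g. (schema and role names
invented):

  CREATE TABLE s.t (id int);
  ALTER TABLE s.t OWNER TO schema_owner_role;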

Michel


Re: [HACKERS] CommitFest wrap-up

2010-12-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 15, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 - Writeable CTEs - I think we need Tom to pick this one up.
 - Fix snapshot taking inconsistencies - Ready for committer. Can any
 committer pick this up?

 Will take a look at these two also.

 Tom, what is your time frame on this?  I think we should wrap up the
 CF without these and bundle 9.1alpha3 unless you plan to get to this
 in the next day or two.

We probably shouldn't hold up the alpha for these, if there are no
other items outstanding.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Dimitri Fontaine
Erik Rijkers e...@xs4all.nl writes:
   http://pgsql.tapoueh.org/extensions/doc/html/sql-alterextension.html
[...]
 Two changes to sql-alterextension.sgml:

Fixed and uploaded at the URL above; it will be in the next patch revision,
thanks!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Owner inheritance

2010-12-21 Thread Tom Lane
gsdfg gdfg mx.mo...@gmail.com writes:
 Would be great if owner can be inherited from parent object (owner table ==
 schema owner == database owner).
 CREATE statement could add OWNER TO PARENT to cover this feature.

What it would be is a great security hole --- exactly analogous to
allowing Unix chown to non-superusers.  Read up on the security
pitfalls of being able to give away ownership of an object.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Owner inheritance

2010-12-21 Thread Andrew Dunstan



On 12/21/2010 07:04 AM, gsdfg gdfg wrote:
It would be great if the owner could be inherited from the parent object
(table owner == schema owner == database owner).

The CREATE statement could add OWNER TO PARENT to cover this feature.



That syntax would violate POLA in the case of inherited tables (OWNER TO 
CONTAINER, or just OWNER TO SCHEMA etc might be clearer). And I think 
we'd have to restrict it to superusers anyway, which would seriously 
limit its usefulness.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Dec21, 2010, at 15:51 , t...@fuzzy.cz wrote:
 This is the reason why they choose to always combine the values (with
 varying weights).

 There are no varying weights involved there. What they do is to express
 P(A=x,B=y) once as

 ...

  P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
             = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
             = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))

 That averaging step adds *no* further data-dependent weights.

 Sorry, by 'varying weights' I didn't mean that the weights are different
 for each value of A or B. What I meant is that they combine the values
 with different weights (just as you explained).

 I'm still not sure we're on the same page here. The resulting formula
 is *not* a weighted average of P(A=x) and P(B=y), since in general
 dist(A) + dist(B) = 2*dist(A,B) does *not* hold. It may look like one
 syntactically, but that's about it.

OK, another crazy usage of 'weights' on my side :-(

What I meant is that in the end you have two equations of P(A,B):

  P(A=x|B=y)*P(B=y) = dist(B)*P(B=y)/dist(A,B)
  P(B=y|A=x)*P(A=x) = dist(A)*P(A=x)/dist(A,B)

and you need to combine those two estimates. They did that by averaging,
as they don't know which of the estimates is better.

Generally I think that is a good solution, unless you know one of the
estimates is much more reliable (although I'm not sure we should
completely omit the other estimate).

 The resulting formula instead is an *unweighted* (weights 1) average of
 the two estimates P(B=y|A=x)*P(A=x) and P(A=x|B=y)*P(B=y). You might just
 as well estimate P(A=x,B=y) with

   P(B=y|A=x)*P(A=x) = dist(A)*P(A=x)/dist(A,B)

 and it'd *still* be the very same uniform bayesian approach, just no
 longer symmetric in A and B. Which may easily be preferable if you
 have reasons to believe that this estimate is more correct than the
 one obtained by swapping A and B. The original paper doesn't deal with
 that case simply because they don't mention how P(A=x) and P(B=y)
 are obtained at all. The postgres estimator, on the other hand,
 knows quite well how it derived P(A=x) and P(B=y) and may have much
 higher confidence in one value than in the other.

OK, good point. I hadn't realized that one of the estimates may be much
more reliable.

But let's assume both estimates are about the same (regarding reliability)
and let's see the following example

 A | B
 ======
 1 | 1
 1 | 1
 1 | 1
 1 | 2
 2 | 1
 2 | 2
 2 | 2
 2 | 2

Thus dist(A)=dist(B)=2, dist(A,B)=4 and

  P(A=1)=P(A=2)=P(B=1)=P(B=2)=1/2
  P(A=1,B=1)=P(A=2,B=2)=3/8
  P(A=1,B=2)=P(A=2,B=1)=1/8

According to the formula presented in the paper, the partial estimates for
P(A=1,B=2) are

  P(A=1|B=2)*P(B=2) = dist(B)/dist(A,B) * P(B=2) = 2/4 * 1/2 = 1/4
  P(B=2|A=1)*P(A=1) = dist(A)/dist(A,B) * P(A=1) = 2/4 * 1/2 = 1/4

Thus P(A=1,B=2) = (1/4 + 1/4)/2 = 1/4, so it's overestimated (2x)

 A | B
 ======
 1 | 1
 1 | 2
 1 | 2
 1 | 2
 2 | 1
 2 | 1
 2 | 1
 2 | 2

This obviously has exactly the same features (regarding number of distinct
values), and the produced estimate is exactly the same. But in this case

  P(A=1,B=2)=P(A=2,B=1)=3/8
  P(A=1,B=1)=P(A=2,B=2)=1/8

and thus the 1/4 is an underestimate (compared to 3/8).

The problem is that F(A,B) does not change at all. It's very simple to
construct examples (just use more rows) where F(A,B) returns exactly the
same value, but the estimates are off. The averaging somehow smooths
this out ...
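
(Concretely: with dist(A)=dist(B)=2 and dist(A,B)=4, the definition gives
F(A,B) = [2*2 - 4] / [4*(2-1)] = 0 for both tables.)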

But I think I'm missing something about how to use the F(?,?) to derive
the final estimate. So maybe the resulting estimate would be better.

Say there are two tables

   A | B | number of such rows
  ==========================
   1 | 1 | 1000
   1 | 2 | 1000
   2 | 1 | 1000
   2 | 2 | 1000

   A | B | number of such rows
  ==========================
   1 | 1 | 1
   1 | 2 | 1999
   2 | 1 | 1999
   2 | 2 | 1

How would you estimate the P(A=1,B=1) in those cases? Assume that both
estimates are equally reliable - i.e. deduced from a histogram or MCV.


 Assume for example that you're preparing the statement

   SELECT * FROM T WHERE A = ? AND B = 1

 We'll then estimate P(A=?) as 1/dist(A), since we cannot do any better
 without an actual value for the parameter ?. The estimate for P(B=1),
 on the other hand, can use the histogram, and will thus very likely be
 much more precise. The two estimates for P(A=?,B=1) in this case are

   P(A=?|B=1)*P(B=1) = dist(B)*P(B=1)/dist(A,B), and
   P(B=1|A=?)*P(A=?) = dist(A)*P(A=?)/dist(A,B).

 There's a good chance that the former beats the latter, and thus also
 the average, then.

OK, good point. I was not thinking about prepared statements. In this case
it makes sense to use only one of the estimates ...

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] optimization histograms

2010-12-21 Thread amit sehas
HI,

For the histograms used in cost-based optimization, is there a rule of thumb
on how often to rebuild them? They are obviously not being continuously
updated... What is the state of the art in this area? Do all the other
databases also end up with stale statistics every now and then and have to
keep rebuilding the stats?

thanks
-Amit


  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] optimization histograms

2010-12-21 Thread Kevin Grittner
amit sehas cu...@yahoo.com wrote:
 
 For the histograms used in cost-based optimization, is there a rule of
 thumb on how often to rebuild them?
 
In recent major versions, autovacuum should normally keep you in
good shape.  The exception is when you make major changes to the
contents of a table (such as in a bulk data load) and then
immediately try to use the table before autovacuum has had time to
notice the activity and generate fresh statistics; for these cases
you probably want to do a manual run.
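
For example (table and file names are hypothetical):

  COPY my_table FROM '/path/to/data.csv' CSV;
  ANALYZE my_table;  -- refresh statistics now, instead of waiting for autovacuum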
 
For more information, see:
 
http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#VACUUM-FOR-STATISTICS
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Tomas Vondra
On 21.12.2010 16:54, Florian Pflug wrote:
 Hmmm, maybe we could give this possibility (to identify two separate
 groups of columns) to the user. So instead of 'build stats for (A,B,C)' the
 user would say 'build stats for (A,B) and (C)' - this actually represents
 apriori knowledge of dependencies supplied by the user.

 In that case we could search for 'implicativeness' between those two
 groups (and not within the groups), and we could restrict ourselves to 2D
 (and thus use a more sophisticated formula).
 
 Hm, I hated this idea at first, but I'm starting to like it more and more.
 It *does* seem rather unrealistic that a user would know that a bunch of
 columns are correlated, but have no idea in what way... 

Yes, that's true. Although the dependency may sometimes be very
complicated, let's restrict ourselves to 2D for now, build something that
solves this simplified case, and then discuss higher dimensions.

 Any examples where this'd be the case would be very much appreciated - maybe
 we should ask around on -general about this?

Well, I think the ZIP code example is a typical case of this - the users
know about the dependency between ZIP codes and cities. A natural
workaround would be to omit the dependent column from the query, but
that's not always possible (e.g. when an ORM is involved, building the
queries automatically).

 But we should be able to do some basic analysis even when the user
 supplies a list of columns without such apriori knowledge.
 
 That, I think, overcomplicates things, at least for a first cut.
 
 To summarize, I think you've shown quite nicely that the uniform bayesian
 approach is a very sensible first step towards better estimates in the case
 of correlated columns. It's statistically sound, and the dist(A,B) estimates
 it requires are probably a necessary ingredient of any solution to the
 problem. If we can make it degrade more gracefully when the columns are
 uncorrelated we should do that, but if we can't, that's still no reason to
 drop the whole idea.

Agreed. IMHO the uncorrelated case is not a big concern, as the users
usually know something's wrong with the columns. We should introduce some
kind of 'autodetect' eventually, but let's leave that for the future.

 So I guess we should turn our attention to how we'd obtain reasonably good
 estimates of dist(A,B), and return to the current discussion once the other
 pieces are in place.
 
 I think that maybe it'd be acceptable to scan a large portion of the
 table to estimate dist(A,B), *if* we only do so once in a while. But even
 with a full scan, how would we arrive at an estimate for dist(A,B)?
 Keeping all values in memory, and spilling them into, say, an on-disk
 hash table, adds even more overhead to the already expensive full scan.
 Maybe using a Bloom filter instead of a hash table could avoid the
 spilling to disk, in exchange for a slightly less precise result...

I have no idea what a Bloom filter is (shame on me). I was not thinking
about collecting the stats; I was interested primarily in what data we
actually need. And my knowledge of the algorithms currently used is
very limited :-(

But I agree we should at least discuss the possible solutions. Until now
I've done something like this

   SELECT COUNT(DISTINCT a) AS dist_a,
          COUNT(DISTINCT b) AS dist_b,
          COUNT(DISTINCT a || ':' || b) AS dist_ab
     FROM my_table;

but that's not very efficient.
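
One sketch that at least avoids the string concatenation (which can also
produce false matches when the values themselves contain ':') is to count
grouped pairs - still a full scan, though:

   SELECT count(*) AS dist_ab
     FROM (SELECT 1 FROM my_table GROUP BY a, b) AS pairs;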

My plan for the near future (a few weeks) is to build a simple 'module'
with the ability to estimate the number of rows for a given condition.
This could actually be quite useful as a stand-alone contrib module, as
users often ask how to get the number of rows fast (usually for paging).

That may be quite slow when the query returns too many rows, even when
there is an index. It may even be much slower than the actual query (as
the actual query usually contains a small LIMIT).

An estimate is often sufficient, but 'pg_class.reltuples' does not
really work with conditions. So this might be an improvement ...
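
For the unconditional case there is already a cheap catalog-based
estimate, e.g.:

   SELECT reltuples::bigint AS estimated_rows
     FROM pg_class WHERE relname = 'my_table';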

regards
Tomas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Eric Ridge
On Mon, Dec 20, 2010 at 3:36 PM, Martijn van Oosterhout
klep...@svana.org wrote:
 On Mon, Dec 20, 2010 at 03:08:02PM -0500, Robert Haas wrote:
 The attached patch appears to work correctly on MacOS X.  I did check,
 BTW: getppid() in the attached process returns gdb's pid.  Poor!

 This appears to be a BSDism at least. On Linux and BSD derivatives the
 man pages specifically mention the reparenting (needed for catching
 signals) but on Linux getppid() is specifically documented to return
 the correct value anyway.

I'm just a random lurker here, and happened to catch the last bit of
this thread.  Could one of you who understand this issue straighten
something out for me?

Every now and again we've been known to attach gdb to a production
Postgres backend to troubleshoot problems.  Ya know, just trying to
get an idea of what Postgres is actually doing via a backtrace.  This
is always on Linux, BTW.

Does this thread mean that the above no longer works with v9?  Or is
this only on non-Linux systems, or does the patch Robert Haas committed
fix it?  We're still using 8.1 (slowly moving to 8.4) in
production, but plan on picking up 9.x later in '11.  Just
wondering if we need to actually be a bit more careful in the future?

Thanks!

eric

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Jesper Krogh

Hi Hackers.

I have a feeling that GIN is cheating on the visibility checks:

test=# set enable_seqscan = off;
SET
Time: 0.129 ms
test=# select count(id) from fts_test where fts @@ to_tsquery('core');
 count
--------
 158827
(1 row)

Time: 95.530 ms
test=# explain select count(id) from fts_test where fts @@ 
to_tsquery('core');

                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=211571.52..211571.53 rows=1 width=4)
   ->  Bitmap Heap Scan on fts_test  (cost=134925.95..211174.01 rows=159004 width=4)
         Recheck Cond: (fts @@ to_tsquery('core'::text))
         ->  Bitmap Index Scan on fts_idx  (cost=0.00..134886.20 rows=159004 width=0)
               Index Cond: (fts @@ to_tsquery('core'::text))
(5 rows)

Time: 0.609 ms

test=# select count(id) from fts_test;
 count
--------
 168556
(1 row)

Time: 164.655 ms

test=# explain select count(id) from fts_test;
                            QUERY PLAN
------------------------------------------------------------------
 Aggregate  (cost=1075969.95..1075969.96 rows=1 width=4)
   ->  Seq Scan on fts_test  (cost=100.00..1075548.56 rows=168556 width=4)
(2 rows)

Time: 0.338 ms

This is run multiple times for both queries, and the seqscan of the table
is consistently about 1.8 times more expensive than the fts-scan.
This is all on a fully memory-cached dataset.

The first query should have the cost of the GIN search + visibility test
of 158K tuples; the latter should have the cost of visibility-testing
168K tuples. If we set the cost of actually searching GIN to 0, then the
gin-search visibility test costs 95/158000 = 0.0006 ms/tuple, where the
seq-scan case costs close to 0.001 ms/tuple (nearly twice as much).


So I have a strong feeling that GIN is cheating on the visibility tests;
otherwise I have trouble imagining how it can ever be faster to execute
than the seq-scan of the table.

Or is a Bitmap Heap Scan simply that much faster than a Seq Scan for
visibility-testing?


What have I missed in the logic?

Thanks.

--
Jesper


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - core functionality

2010-12-21 Thread Simon Riggs
On Wed, 2010-12-15 at 22:25 +0900, Shigeru HANADA wrote:

 Attached are revised versions of the SQL/MED core functionality patches.

Looks very interesting new feature, well done.

Can I ask some questions about how this will work?
No particular order, just numbered for reference.

1. The docs don't actually say what a foreign table is. Is it a local
representation of foreign data? Or a local copy of foreign data? Or is
it a table created on a remote node?

2. Will CREATE FOREIGN TABLE require a transactionid? It seems a good
replacement for temp tables on Hot Standby to be able to run a CREATE
FOREIGN TABLE using the file_fdw, then reuse the file again later.

3. Do we support CREATE TEMP FOREIGN TABLE? It seems desirable to be
able to move data around temporarily, as we do with normal tables.

4. In Hot Standby, we are creating many copies of the data tables on
different servers. That seems to break the concept that data is in only
one place, when we assume that a foreign table is on only one foreign
server. How will we represent the concept that data is potentially
available identically from more than one place? Any other comments about
how this will work with Hot Standby?

5. In PL/Proxy, we have the concept that a table is sharded across
multiple nodes. Is that possible here? Again, we seem to have the
concept that a table is only ever in a single place.

6. Can we do CREATE FOREIGN TABLE  AS SELECT ...
I guess the answer depends on (1)

7. Why does ANALYZE skip foreign tables? Surely its really important we
know things about a foreign table, otherwise we are going to optimize
things very badly.

8. Is the WHERE clause passed down into a ForeignScan?

9. The docs for CHECK constraints imply that the CHECK is executed
against any rows returned from FDW. Are we really going to execute that
as an additional filter on each row retrieved?

10. Can a foreign table be referenced by a FK?

11. Can you create a DO INSTEAD trigger on a FOREIGN TABLE?

12. I think it would be useful for both review and afterwards to write
the documentation section now, so we can begin to understand this. Will
there be a documentation section on writing a FDW also? There are enough
open questions here that I think we need docs and a review guide,
otherwise we'll end up with some weird missing feature, which would be a
great shame.

13. How does this relate to dblink? Is that going to be replaced by this
feature?

14. How do we do scrollable cursors with foreign tables? Do we
materialize them always? Or...

15. In terms of planning queries, do we have a concept of additional
cost per row on a foreign server? How does the planner decide how costly
retrieving data is from the FDW?

16. If we cancel a query, is there an API call to send query cancel to
the FDW and so on to the foreign server? Does that still work if we hit
other kinds of ERROR, or FATAL?

17. Can we request different types of transaction isolation on the
foreign server, or do certain states get passed through from our
session? e.g. if we are running a serializable transaction, does that
state get passed through to the FDW, so it knows to request that on the
foreign server? That seems essential if we are going to make pg_dump
work correctly.

18. Does pg_dump dump the data in the FDW or just of the definition of
the data? Can we have an option for either?

19. If we PREPARE a statement, are there API calls to pass thru the
PREPARE to the FDW? Or are calls always dynamic?

20. If default privileges include INSERT, UPDATE or DELETE, does this
cause error, or does it silently get ignored for foreign tables? I think
I would want the latter.

21. Can we LOCK a foreign table? I guess so. Presumably no LOCK is
passed through to the FDW?

22. Can we build an local index on a foreign table?

Not too sure what the right answers are to these questions, but I think
we need to know the answers to understand what we are getting.

Thanks

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 1:45 PM, Eric Ridge eeb...@gmail.com wrote:
 On Mon, Dec 20, 2010 at 3:36 PM, Martijn van Oosterhout
 klep...@svana.org wrote:
 On Mon, Dec 20, 2010 at 03:08:02PM -0500, Robert Haas wrote:
 The attached patch appears to work correctly on MacOS X.  I did check,
 BTW: getppid() in the attached process returns gdb's pid.  Poor!

 This appears to be a BSDism at least. On Linux and BSD derivatives the
 man pages specifically mention the reparenting (needed for catching
 signals) but on Linux getppid() is specifically documented to return
 the correct value anyway.

 I'm just a random lurker here, and happened to catch the last bit of
 this thread.  Could one of you that understand this issue straighten
 something out for me?

 Every now and again we've been known to attach gdb to a production
 Postgres backend to troubleshoot problems.  Ya know, just trying to
 get an idea of what Postgres is actually doing via a backtrace.  This
 is always on Linux, BTW.

 Does this thread mean that the above no longer works with v9?  Or is
 this only on non-Linux systems, or did the patch Robert Haas committed
 fix this?  We're still using 8.1 (slowly moving to 8.4) in
 production, but have plans of picking up 9.x later in '11.  Just
 wondering if we need to actually be a bit more careful in the future?

The point of the patch was to improve cases where attaching gdb
*didn't* work well.  Any cases where it was already working for you
aren't going to be made worse by this.
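If you want to check the behaviour on your own platform, a trivial test
program is enough (nothing Postgres-specific about it):

#include <stdio.h>
#include <unistd.h>

/*
 * Print our pid and parent pid, then wait.  Attach a debugger and call
 * getppid() from it to see whether the tracer shows up as the parent
 * (BSD/OS X) or the original parent is still reported (Linux).
 */
int
main(void)
{
    printf("pid=%d ppid=%d\n", (int) getpid(), (int) getppid());
    pause();                    /* sit here so a debugger can attach */
    return 0;
}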

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Eric Ridge
On Tue, Dec 21, 2010 at 2:33 PM, Robert Haas robertmh...@gmail.com wrote:
 The point of the patch was to improve cases where attaching gdb
 *didn't* work well.  Any cases where it was already working for you
 aren't going to be made worse by this.

Okay, great.  Thanks for the clarification.

eric

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Heikki Linnakangas

On 21.12.2010 21:25, Jesper Krogh wrote:

The first query should have the cost of the GIN search +
visibility-testing of 158K tuples;
the latter should have the cost of visibility-testing 168K tuples. If
we set the cost
of actually searching GIN to 0, then the visibility part of the GIN query
costs 95/158000 ≈ 0.000373 ms/tuple,
where the seq-scan case costs close to 0.001 ms/tuple (close to 3 times
as much).

So I have a strong feeling that GIN is cheating on the visibility tests;
otherwise I have trouble imagining how it can ever be faster to
execute
than the seq scan of the table.

Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for
visibility-testing?


It certainly shouldn't be.


What have I missed in the logic?


Perhaps you have a lot of empty space or dead tuples that don't match 
the query in the table, which the sequential scan has to grovel through, 
but the bitmap scan skips? What does EXPLAIN ANALYZE of both queries say?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar dic 21 08:40:49 -0300 2010:

  Well, non-developers don't tend to attach gdb very often.  Alvaro
  mentioned a problem installation upthread, thus the question.
 
 Hearing no cries of please-oh-please-backpatch-this, I've committed
 it just to master.

Please-oh-please backpatch this ... at least to 8.4.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Tomáš Mudruňka

Thanks for your answers :-)
Well... I know that I can write my own plugin and I am familiar with C, so
that is not the problem, but I think such a feature should be
implemented directly in PgSQL: there are already functions for
converting to/from base 16, so why not make this more flexible and
generalize it to any other radix? It's quite simple to do, and I don't see
any reason why 16 should be there and 8, 32 or 36 shouldn't :-)
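
Just to show how little code it is -- a rough sketch in C, leaning on
strtol() (which already accepts radixes 2..36); the function name and
error handling are made up, not a proposal for the real SQL function:

#include <stdlib.h>

/*
 * Sketch only: decode a string in any radix from 2 to 36 into a long.
 * strtol() already understands digits 0-9 plus letters a-z/A-Z, so the
 * generalization to other radixes comes for free.
 */
static long
decode_radix(const char *str, int radix, int *ok)
{
    char   *end;
    long    val = strtol(str, &end, radix);

    /* a real SQL-callable version would report errors properly */
    *ok = (end != str && *end == '\0');
    return val;
}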

peace

On Tue, 21 Dec 2010 15:04:03 +0100, Florian Pflug f...@phlo.org wrote:
 On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there possibility of having internal base converting function in
 PgSQL?
 There are already functions for converting between decimal and
 hexadecimal
 notations i think pgsql can be able to convert between number with
 radixes
 from 1 to 36 (actually fast (de)encoding base36 is what i need)...
 
 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().
 
 If you're not comfortable doing this in C, you might also want to
consider
 one of procedural languages pl/pgsql, pl/perl, pl/python. pl/pgsql is
 probably
 only viable if you just need this for ints and bigints, unless you don't
 care about performance.
 
 best regards,
 Florian Pflug

-- 
S pozdravem
Best regards
   Tomáš Mudruňka - Spoje.net / Arachne Labs

XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Andres Freund
On Tuesday 21 December 2010 20:25:16 Jesper Krogh wrote:
 What have I missed in the logic?
A reproducible testcase ;-)

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Stehule
Hello

Dne 21. prosince 2010 21:11 Tomáš Mudruňka to...@mudrunka.cz napsal(a):

 Thanks for your answers :-)
 Well... i know that i can write my own plugin and i am familiar with C so
 this is not the problem, but i think that such feature should be
 implemented directly in PgSQL because there are already functions for
 converting to/from base 16 so why don't make this more flexible and
 generalize it to any other radix? It's quite simple to do and i don't see
 any reason why 16 should be there and 8, 32 or 36 shouldn't :-)


* It isn't a typical or frequent request,
* There are no hard barriers to a custom implementation,
* You can use plperl- or plpython-based solutions,
* It's not part of ANSI SQL

Regards

Pavel Stehule

 peace

 On Tue, 21 Dec 2010 15:04:03 +0100, Florian Pflug f...@phlo.org wrote:
 On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there possibility of having internal base converting function in
 PgSQL?
 There are already functions for converting between decimal and
 hexadecimal
 notations i think pgsql can be able to convert between number with
 radixes
 from 1 to 36 (actually fast (de)encoding base36 is what i need)...

 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().

 If you're not comfortable doing this in C, you might also want to
 consider
 one of procedural languages pl/pgsql, pl/perl, pl/python. pl/pgsql is
 probably
 only viable if you just need this for ints and bigints, unless you don't
 care about performance.

 best regards,
 Florian Pflug

 --
 S pozdravem
 Best regards
   Tomáš Mudruňka - Spoje.net / Arachne Labs

 XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE behaviour

2010-12-21 Thread Peter Eisentraut
On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
 On 2010-11-12 8:25 PM +0200, I wrote:
  I'm going to take some time off this weekend to get a patch with this
  behaviour to the next commitfest.
 
 .. and a wild patch appears.
 
 This is almost exactly the patch from 2010-02 without 
 CommandCounterIncrement()s.  It's still a bit rough around the edges and 
 needs some more comments, but I'm posting it here anyway.

To pick up an earlier thread again, has any serious thought been given
to adapting the SQL2001/DB2 syntax instead of our own?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Comment typo in nodeWindowAgg.c

2010-12-21 Thread Andreas Karlsson
Hi,

Found a couple of small typos in the comments of nodeWindowAgg.c where
they refer to functions in nodeAgg.c. The comments drop the plural from
the function names (initialize_aggregates and advance_aggregates). The
reference to finalize_aggregate is correct though.


diff --git a/src/backend/executor/nodeWindowAgg.c
b/src/backend/executor/nodeWindowAgg.c
index c3efe12..51f98c1 100644
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
*** static bool window_gettupleslot(WindowOb
*** 181,187 
  
  /*
   * initialize_windowaggregate
!  * parallel to initialize_aggregate in nodeAgg.c
   */
  static void
  initialize_windowaggregate(WindowAggState *winstate,
--- 181,187 
  
  /*
   * initialize_windowaggregate
!  * parallel to initialize_aggregates in nodeAgg.c
   */
  static void
  initialize_windowaggregate(WindowAggState *winstate,
*** initialize_windowaggregate(WindowAggStat
*** 207,213 
  
  /*
   * advance_windowaggregate
!  * parallel to advance_aggregate in nodeAgg.c
   */
  static void
  advance_windowaggregate(WindowAggState *winstate,
--- 207,213 
  
  /*
   * advance_windowaggregate
!  * parallel to advance_aggregates in nodeAgg.c
   */
  static void
  advance_windowaggregate(WindowAggState *winstate,

Regards,
Andreas Karlsson



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE behaviour

2010-12-21 Thread David Fetter
On Tue, Dec 21, 2010 at 11:14:31PM +0200, Peter Eisentraut wrote:
 On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
  On 2010-11-12 8:25 PM +0200, I wrote:
   I'm going to take some time off this weekend to get a patch with this
   behaviour to the next commitfest.
  
  .. and a wild patch appears.
  
  This is almost exactly the patch from 2010-02 without 
  CommandCounterIncrement()s.  It's still a bit rough around the edges and 
  needs some more comments, but I'm posting it here anyway.
 
 To pick up an earlier thread again, has any serious thought been given
 to adapting the SQL2001/DB2 syntax instead of our own?

Yes, and it's a good deal more limited and less intuitive than ours.

This is one place where we got it right and the standard just got
pushed into doing whatever IBM did.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Golub
Hello, Pavel.

You wrote:

PS Hello

PS Dne 21. prosince 2010 21:11 Tomáš Mudruňka to...@mudrunka.cz napsal(a):

 Thanks for your answers :-)
 Well... i know that i can write my own plugin and i am familiar with C so
 this is not the problem, but i think that such feature should be
 implemented directly in PgSQL because there are already functions for
 converting to/from base 16 so why don't make this more flexible and
 generalize it to any other radix? It's quite simple to do and i don't see
 any reason why 16 should be there and 8, 32 or 36 shouldn't :-)


PS * It isn't a typical and often request,
PS * There are not hard breaks for custom implementation,
PS * You can use plperl or plpython based solutions,
PS * It's not part of ANSI SQL

But MySQL has such a function. What's wrong with us? ;)

PS Regards

PS Pavel Stehule

 peace

 On Tue, 21 Dec 2010 15:04:03 +0100, Florian Pflug f...@phlo.org wrote:
 On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there possibility of having internal base converting function in
 PgSQL?
 There are already functions for converting between decimal and
 hexadecimal
 notations i think pgsql can be able to convert between number with
 radixes
 from 1 to 36 (actually fast (de)encoding base36 is what i need)...

 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().

 If you're not comfortable doing this in C, you might also want to
 consider
 one of procedural languages pl/pgsql, pl/perl, pl/python. pl/pgsql is
 probably
 only viable if you just need this for ints and bigints, unless you don't
 care about performance.

 best regards,
 Florian Pflug

 --
 S pozdravem
 Best regards
   Tomáš Mudruňka - Spoje.net / Arachne Labs

 XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers





-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Kenneth Marshall
On Tue, Dec 21, 2010 at 11:28:17PM +0200, Pavel Golub wrote:
 Hello, Pavel.
 
 You wrote:
 
 PS Hello
 
 PS Dne 21. prosince 2010 21:11 Tomáš Mudruňka to...@mudrunka.cz 
 napsal(a):
 
  Thanks for your answers :-)
  Well... i know that i can write my own plugin and i am familiar with C so
  this is not the problem, but i think that such feature should be
  implemented directly in PgSQL because there are already functions for
  converting to/from base 16 so why don't make this more flexible and
  generalize it to any other radix? It's quite simple to do and i don't see
  any reason why 16 should be there and 8, 32 or 36 shouldn't :-)
 
 
 PS * It isn't a typical and often request,
 PS * There are not hard breaks for custom implementation,
 PS * You can use plperl or plpython based solutions,
 PS * It's not part of ANSI SQL
 
 But MySQL has such a function. What's wrong with us? ;)
 

You are not really helping to make a good case... :)

Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Andrew Dunstan



On 12/21/2010 04:28 PM, Pavel Golub wrote:


PS  * It isn't a typical and often request,
PS  * There are not hard breaks for custom implementation,
PS  * You can use plperl or plpython based solutions,
PS  * It's not part of ANSI SQL

But MySQL has such a function. What's wrong with us? ;)



Our aim is not to duplicate everything in MySQL.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Golub
Hello.

Guys, guys! It was only a joke! :)

Please accept my apologies.

Anyway, I find such a function useful, even though I haven't yet run into
a situation where it might be needed.
You wrote:



AD On 12/21/2010 04:28 PM, Pavel Golub wrote:

 PS  * It isn't a typical and often request,
 PS  * There are not hard breaks for custom implementation,
 PS  * You can use plperl or plpython based solutions,
 PS  * It's not part of ANSI SQL

 But MySQL has such a function. What's wrong with us? ;)


AD Our aim is not to duplicate everything in MySQL.

AD cheers

AD andrew



-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 4:57 PM, Pavel Golub pa...@microolap.com wrote:
 Anyway, I find such a function useful, even though I haven't yet run into
 a situation where it might be needed.

Yeah, I agree.  I'm not sure we should add it to core, but it's
certainly just as useful as many things we have in contrib.  I'll bet
it would get at least as much use as the six argument form of
levenshtein_less_equal().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory

2010-12-21 Thread Quan Zongliang
On Mon, 29 Nov 2010 10:29:17 -0300
Alvaro Herrera alvhe...@commandprompt.com wrote:

 Excerpts from Quan Zongliang's message of sáb nov 27 06:03:12 -0300 2010:
  Hi, all
  
  I created a pg_ctl patch to fix:
  * BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory 
  Allow pg_ctl to work properly with configuration files located outside the 
  PGDATA directory
 
 I think the way this should work is that you call postmaster with a new
 switch and it prints out its configuration, after reading the
 appropriate config file(s).  That way it handles all the little details
 such as figuring out the correct config file, handle include files, etc.
 This output would be presumably easier to parse and more trustworthy.
 
 Right now we have --describe-config, which is missing the values for
 each config option.
 

Sorry for my late reply.

I will check the source of postmaster.


-- 
Quan Zongliang quanzongli...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment typo in nodeWindowAgg.c

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 4:17 PM, Andreas Karlsson andr...@proxel.se wrote:
 Found a couple of small typos in the comments of nodeWindowAgg.c when
 they refer to functions in nodeAgg.c. The pluralities of the function
 names (initialize_aggregates and advance_aggregates) are wrong. The
 reference to finalize_aggregate is correct though.

Committed, thanks.  But please attach patches rather than including them inline.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
 When the caller knows the smaller string length, memcmp and strncmp are
 functionally equivalent.  Since memcmp need not watch each byte for a NULL
 terminator, it often compares a CPU word at a time for better performance.  
 The
 attached patch changes use of strncmp to memcmp where we have the length of 
 the
 shorter string.  I was most interested in the varlena.c instances, but I tried
 to find all applicable call sites.  To benchmark it, I used the attached
 bench-texteq.sql.  This patch improved my 5-run average timing of the SELECT
 from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where the
 change should be pessimal.

This is a good idea.  I will check this over and commit it.
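
For anyone following along, the shape of the rewrite is simply this
(illustrative names, not code copied from the patch):

#include <stdbool.h>
#include <string.h>

/*
 * Equality test on two length-counted strings (e.g. text datums).
 * When len is the length of the shorter string, strncmp() and memcmp()
 * give the same answer, but memcmp() need not test every byte for a
 * NUL terminator and can compare a word at a time.
 */
static bool
bytes_equal(const char *a, const char *b, size_t len)
{
    /* before: return strncmp(a, b, len) == 0; */
    return memcmp(a, b, len) == 0;
}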

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21.12.2010 21:25, Jesper Krogh wrote:
 Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for
 visibility-testing?

 It certainly shouldn't be.

 What have I missed in the logic?

 Perhaps you have a lot of empty space or dead tuples that don't match 
 the query in the table, which the sequential scan has to grovel through, 
 but the bitmap scan skips? What does EXPLAIN ANALYZE of both queries say?

Another possibility is that the seqscan is slowed by trying to operate
in a limited number of buffers (the buffer strategy stuff).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 
 *) what's a good way to stress the clog severely? I'd like to pick
 a degenerate case to get a better idea of the way things stand
 without them.
 
The worst I can think of is a large database with a 90/10 mix of
reads to writes -- all short transactions.  Maybe someone else can
do better.  In particular, I'm not sure how savepoints might play
into a degenerate case.
 
Since we're always talking about how to do better with hint bits
during an unlogged bulk load, it would be interesting to benchmark
one of those followed by a `select count(*) from newtable;` with and
without the patch, on a data set too big to fit in RAM.
 
 *) is there community interest in a full patch that fills in the
 missing details not implemented here?
 
I'm certainly curious to see real numbers.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Mark Kirkwood

On 22/12/10 11:42, Merlin Moncure wrote:

Attached is an incomplete patch disabling hint bits based on compile
switch.  It's not complete, for example it's not reconciling some
assumptions in heapam.c that hint bits have been set in various
routines.  However, it mostly passes regression and I deemed it good
enough to run some preliminary benchmarks and fool around.  Obviously,
hint bits are an annoying impediment to a couple of other cool pending
features, and it certainly would be nice to operate without them.
Also, for particular workloads, the extra I/O from hint bits can cause a
fair amount of pain.


Looks like a great idea to test; however, I don't seem to be able to
compile with it applied (I set #define DISABLE_HINT_BITS 1 at the end of
src/include/pg_config_manual.h):


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing 
-fwrapv -g -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o heapam.c

heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in 
this function)

heapam.c:3867: error: (Each undeclared identifier is reported only once
heapam.c:3867: error: for each function it appears in.)
heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in this 
function)

make[4]: *** [heapam.o] Error 1


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Mark Kirkwood

On 22/12/10 13:05, Mark Kirkwood wrote:

On 22/12/10 11:42, Merlin Moncure wrote:

Attached is an incomplete patch disabling hint bits based on compile
switch.  It's not complete, for example it's not reconciling some
assumptions in heapam.c that hint bits have been set in various
routines.  However, it mostly passes regression and I deemed it good
enough to run some preliminary benchmarks and fool around.  Obviously,
hint bits are an annoying impediment to a couple of other cool pending
features, and it certainly would be nice to operate without them.
Also, for particular workloads, the extra I/O from hint bits can cause a
fair amount of pain.


Looks like a great idea to test, however I don't seem to be able to 
compile with it applied: (set #define DISABLE_HINT_BITS 1 at the end of 
src/include/pg_config_manual.h)


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing 
-fwrapv -g -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o 
heapam.c

heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in 
this function)

heapam.c:3867: error: (Each undeclared identifier is reported only once
heapam.c:3867: error: for each function it appears in.)
heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in 
this function)

make[4]: *** [heapam.o] Error 1



Arrg, sorry - against git head on Ubuntu 10.03 (gcc 4.4.3)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 7:06 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:
 On 22/12/10 13:05, Mark Kirkwood wrote:

 On 22/12/10 11:42, Merlin Moncure wrote:

 Attached is an incomplete patch disabling hint bits based on compile
 switch.  It's not complete, for example it's not reconciling some
 assumptions in heapam.c that hint bits have been set in various
 routines.  However, it mostly passes regression and I deemed it good
 enough to run some preliminary benchmarks and fool around.  Obviously,
 hint bits are an annoying impediment to a couple of other cool pending
 features, and it certainly would be nice to operate without them.
 Also, for particular workloads, the extra I/O from hint bits can cause a
 fair amount of pain.

 Looks like a great idea to test, however I don't seem to be able to
 compile with it applied: (set #define DISABLE_HINT_BITS 1 at the end of
 src/include/pg_config_manual.h)

 gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g
 -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o heapam.c
 heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
 heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in this
 function)
 heapam.c:3867: error: (Each undeclared identifier is reported only once
 heapam.c:3867: error: for each function it appears in.)
 heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in this
 function)
 make[4]: *** [heapam.o] Error 1


 Arrg, sorry - against git head on Ubuntu 10.03 (gcc 4.4.3)

Did you check to see if the patch applied cleanly? BTW, I was working
against postgresql-9.0.1...

it looks like you are missing at least some of the changes to htup.h:

../postgresql-9.0.1_hb2/src/include/access/htup.h

#ifndef DISABLE_HINT_BITS
#define HEAP_XMIN_COMMITTED 0x0100  /* t_xmin committed */
#define HEAP_XMIN_INVALID   0x0200  /* t_xmin invalid/aborted */
#define HEAP_XMAX_COMMITTED 0x0400  /* t_xmax committed */
#define HEAP_XMAX_INVALID   0x0800  /* t_xmax invalid/aborted */
#endif

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 7:20 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 7:06 PM, Mark Kirkwood
 mark.kirkw...@catalyst.net.nz wrote:
 On 22/12/10 13:05, Mark Kirkwood wrote:

 On 22/12/10 11:42, Merlin Moncure wrote:

 Attached is an incomplete patch disabling hint bits based on compile
 switch.  It's not complete, for example it's not reconciling some
 assumptions in heapam.c that hint bits have been set in various
 routines.  However, it mostly passes regression and I deemed it good
 enough to run some preliminary benchmarks and fool around.  Obviously,
 hint bits are an annoying impediment to a couple of other cool pending
 features, and it certainly would be nice to operate without them.
 Also, for particular workloads, the extra i/o hint bits can cause a
 fair amount of pain.

 Looks like a great idea to test, however I don't seem to be able to
 compile with it applied: (set #define DISABLE_HINT_BITS 1 at the end of
 src/include/pg_config_manual.h)

 gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g
 -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o heapam.c
 heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
 heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in this
 function)
 heapam.c:3867: error: (Each undeclared identifier is reported only once
 heapam.c:3867: error: for each function it appears in.)
 heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in this
 function)
 make[4]: *** [heapam.o] Error 1


 Arrg, sorry - against git head on Ubuntu 10.03 (gcc 4.4.3)

 did you check to see if the patch applied clean? btw I was working
 against postgresql-9.0.1...

Ah, this is the problem (9.0.1 vs. head).  To work against head it probably
needs a few more tweaks.  You can also try removing them yourself --
most of the changes follow a similar pattern.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 Attached is an incomplete patch disabling hint bits based on compile
 switch. ...
 So far, at least doing pgbench runs and another test designed to
 exercise clog lookups, the performance loss of always doing full
 lookup hasn't materialized.

The standard pgbench test would be just about 100% useless for stressing
this, because its net database activity is only about one row
touched/updated per query.  You need a test case that hits lots of rows
per query, else you're just measuring parse+plan+network overhead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 7:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Attached is an incomplete patch disabling hint bits based on compile
 switch. ...
 So far, at least doing pgbench runs and another test designed to
 exercise clog lookups, the performance loss of always doing full
 lookup hasn't materialized.

 The standard pgbench test would be just about 100% useless for stressing
 this, because its net database activity is only about one row
 touched/updated per query.  You need a test case that hits lots of rows
 per query, else you're just measuring parse+plan+network overhead.

right -- see the attached clog_stress.sql above.  It creates a script
that inserts records in blocks of 1, deletes half of them, and
vacuums.  Neither the execution of the script nor a seq scan following
its execution showed an interesting performance difference (which I am
arbitrarily calling 5% in either direction).  Like I said though, I
don't trust the patch or the results yet.

@Mark: apparently the cvs server is behind git and there are some
recent changes to heapam.c that need more attention.  I need to get
git going on my box, but try changing this:

if ((tuple->t_infomask & HEAP_XMIN_COMMITTED) ||
    (!(tuple->t_infomask & HEAP_XMIN_COMMITTED) &&
     !(tuple->t_infomask & HEAP_XMIN_INVALID) &&
     TransactionIdDidCommit(xmin)))

to this:

if (TransactionIdDidCommit(xmin))

Also, isn't the extra check against HEAP_XMIN_COMMITTED redundant? And if
you do have to look up clog, why not set the hint bit?
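
Roughly this shape, I mean (a sketch with made-up names, not the real
tqual.c code, which also has to worry about when it is safe to dirty
the page):

/*
 * Sketch: once a clog lookup was unavoidable anyway, cache its outcome
 * in the tuple header so the next reader can skip the lookup.
 */
static void
remember_xmin_status(HeapTupleHeader tuple, TransactionId xmin)
{
    if (TransactionIdDidCommit(xmin))
        tuple->t_infomask |= HEAP_XMIN_COMMITTED;
    else if (TransactionIdDidAbort(xmin))
        tuple->t_infomask |= HEAP_XMIN_INVALID;
}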

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Fujii Masao
On Sat, Dec 18, 2010 at 1:00 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Dec 17, 2010 at 10:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think the attached might be a little tidier.  Thoughts?

 I'm not really thrilled at the idea of calling
 IsPostmasterChildWalSender for every child whether or not it will have
 any impact on the decision.  That involves touching shared memory which
 can be rather expensive (see previous discussions about shared cache
 lines and so forth).

 The existing code already does that, unless I'm missing something.  We
 could improve on my proposed patch a bit by doing the is_autovacuum
 test first and the walsender test second.  I'm not sure how to improve
 on it beyond that.

How about doing the target != ALL test at the head, to handle the most
common case (target == ALL) cheaply? I added that test to your patch and
changed it so that the is_autovacuum test is done first.
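
In other words, roughly this shape for the loop body (assuming the
BACKEND_TYPE_* flags postmaster.c already uses):

/*
 * Cheap test on target first: the common target == BACKEND_TYPE_ALL
 * case never inspects shared memory at all.
 */
if (target != BACKEND_TYPE_ALL)
{
    int     child;

    if (bp->is_autovacuum)
        child = BACKEND_TYPE_AUTOVAC;       /* local flag, cheap */
    else if (IsPostmasterChildWalSender(bp->child_slot))
        child = BACKEND_TYPE_WALSND;        /* touches shared memory */
    else
        child = BACKEND_TYPE_NORMAL;

    if (!(target & child))
        continue;
}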

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


signal-some-children-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory

2010-12-21 Thread Alvaro Herrera
Excerpts from Quan Zongliang's message of mar dic 21 18:36:11 -0300 2010:
 On Mon, 29 Nov 2010 10:29:17 -0300
 Alvaro Herrera alvhe...@commandprompt.com wrote:
 

  I think the way this should work is that you call postmaster with a new
  switch and it prints out its configuration, after reading the
  appropriate config file(s).  That way it handles all the little details
  such as figuring out the correct config file, handle include files, etc.
  This output would be presumably easier to parse and more trustworthy.
 
 Sorry for my late reply.
 
 I will check the source of postmaster.

Actually Bruce Momjian is now working on a different fix:
unix_socket_directory would be added to postmaster.pid, allowing pg_ctl
to find it.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Gurjeet Singh
On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
  When the caller knows the smaller string length, memcmp and strncmp are
  functionally equivalent.  Since memcmp need not watch each byte for a
 NULL
  terminator, it often compares a CPU word at a time for better
 performance.  The
  attached patch changes use of strncmp to memcmp where we have the length
 of the
  shorter string.  I was most interested in the varlena.c instances, but I
 tried
  to find all applicable call sites.  To benchmark it, I used the attached
  bench-texteq.sql.  This patch improved my 5-run average timing of the
 SELECT
  from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where
 the
  change should be pessimal.

 This is a good idea.  I will check this over and commit it.


Doesn't this risk accessing bytes beyond the shorter string? Look at the
warning above the StrNCpy(), for example.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Rob Wultsch
On Tue, Dec 21, 2010 at 4:49 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasby j...@nasby.net wrote:
 On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:
 Does postgres make an effort to create a file with physically continuous 
 blocks?

 AFAIK all files are expanded as needed. I don't think there's any flags you 
 can pass to the filesystem to tell it this file will eventually be 1GB in 
 size. So, we're basically at the mercy of the FS to try and keep things 
 contiguous.

 There have been some reports that we would do better on some
 filesystems if we extended the file by more than one block at a time,
 rather than the single block we add today.  However, AFAIK, no one is
 pursuing this ATM.



This has been found to be the case in the MySQL world, particularly
when ext3 is in use:
http://forge.mysql.com/worklog/task.php?id=4925
http://www.facebook.com/note.php?note_id=194501560932


Also, InnoDB has an option for how much data should be allocated at
the end of a tablespace when it needs to grow:
http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 8:29 PM, Gurjeet Singh singh.gurj...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
  When the caller knows the smaller string length, memcmp and strncmp are
  functionally equivalent.  Since memcmp need not watch each byte for a
  NULL
  terminator, it often compares a CPU word at a time for better
  performance.  The
  attached patch changes use of strncmp to memcmp where we have the length
  of the
  shorter string.  I was most interested in the varlena.c instances, but I
  tried
  to find all applicable call sites.  To benchmark it, I used the attached
  bench-texteq.sql.  This patch improved my 5-run average timing of the
  SELECT
  from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where
  the
  change should be pessimal.

 This is a good idea.  I will check this over and commit it.

 Doesn't this risk accessing bytes beyond the shorter string?

If it's done properly, I don't see how this would be a risk.

 Look at the
 warning above the StrNCpy(), for example.

If you're talking about this comment:

 *  BTW: when you need to copy a non-null-terminated string (like a text
 *  datum) and add a null, do not do it with StrNCpy(..., len+1).  That
 *  might seem to work, but it fetches one byte more than there is in the
 *  text object.

...then that's not applicable here.  It's perfectly safe to compare two
strings of length n using an n-byte memcmp().  The bytes being
compared are 0 through n - 1; the terminating null is in byte n, or
else it isn't, but memcmp() certainly isn't going to look at it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Gurjeet Singh
On Tue, Dec 21, 2010 at 9:01 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Dec 21, 2010 at 8:29 PM, Gurjeet Singh singh.gurj...@gmail.com
 wrote:
  On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com
 wrote:
 
  On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
   When the caller knows the smaller string length, memcmp and strncmp
 are
   functionally equivalent.  Since memcmp need not watch each byte for a
   NULL
   terminator, it often compares a CPU word at a time for better
   performance.  The
   attached patch changes use of strncmp to memcmp where we have the
 length
   of the
   shorter string.  I was most interested in the varlena.c instances, but
 I
   tried
   to find all applicable call sites.  To benchmark it, I used the
 attached
   bench-texteq.sql.  This patch improved my 5-run average timing of
 the
   SELECT
   from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where
   the
   change should be pessimal.
 
  This is a good idea.  I will check this over and commit it.
 
  Doesn't this risk accessing bytes beyond the shorter string?

 If it's done properly, I don't see how this would be a risk.

  Look at the
  warning above the StrNCpy(), for example.

 If you're talking about this comment:

  *  BTW: when you need to copy a non-null-terminated string (like a
 text
  *  datum) and add a null, do not do it with StrNCpy(..., len+1).  That
  *  might seem to work, but it fetches one byte more than there is in
 the
  *  text object.

 ...then that's not applicable here.  It's perfectly safe to compare to
 strings of length n using an n-byte memcmp().  The bytes being
 compared are 0 through n - 1; the terminating null is in byte n, or
 else it isn't, but memcmp() certainly isn't going to look at it.


I missed the part where Noah said ... where we have the length of the
*shorter* string. I agree we are safe here.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Mark Kirkwood

On 22/12/10 13:56, Merlin Moncure wrote:

On Tue, Dec 21, 2010 at 7:45 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

@Mark: apparently the cvs server is behind git and there are some
recent changes to heapam.c that need more attention.  I need to get
git going on my box, but try changing this:

if ((tuple->t_infomask & HEAP_XMIN_COMMITTED) ||
    (!(tuple->t_infomask & HEAP_XMIN_COMMITTED) &&
     !(tuple->t_infomask & HEAP_XMIN_INVALID) &&
     TransactionIdDidCommit(xmin)))

to this:

if (TransactionIdDidCommit(xmin))

Also, isn't the extra check against HEAP_XMIN_COMMITTED redundant? And if
you do have to look up clog, why not set the hint bit?



That gets it compiling.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 How about doing target != ALL test at the head for the most common case
 (target == ALL)?

That's an idea, but the test you propose implements it incorrectly.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
 When the caller knows the smaller string length, memcmp and strncmp are
 functionally equivalent.  Since memcmp need not watch each byte for a NULL
 terminator, it often compares a CPU word at a time for better performance.  
 The
 attached patch changes use of strncmp to memcmp where we have the length of 
 the
 shorter string.  I was most interested in the varlena.c instances, but I 
 tried
 to find all applicable call sites.  To benchmark it, I used the attached
 bench-texteq.sql.  This patch improved my 5-run average timing of the 
 SELECT
 from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where the
 change should be pessimal.

 This is a good idea.  I will check this over and commit it.

A little benchmarking reveals that on my system (MacOS X 10.6.5) it
appears that strncmp() is faster for a 4 character string, but
memcmp() is faster for a 5+ character string.  So I think most of
these are pretty clear wins, but I have reverted the changes to
src/backend/tsearch because I'm not entirely confident that lexemes
and affixes will be long enough on average for this to be a win there.
 Please feel free to resubmit that part with performance results
showing that it works out to a win.  Some of the ltree changes
produced compiler warnings, so I omitted those also.  Committed the
rest.
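
For anyone who wants to repeat the measurement, a toy harness along
these lines is enough (arbitrary parameters, and not exactly what I
ran):

#include <stdio.h>
#include <string.h>
#include <time.h>

int
main(void)
{
    char            a[] = "abcde", b[] = "abcdx";
    volatile size_t n = 5;      /* volatile defeats constant folding */
    long            i, sink = 0;
    clock_t         t0;

    t0 = clock();
    for (i = 0; i < 100000000L; i++)
        sink += strncmp(a, b, n);
    printf("strncmp: %.2fs\n", (double) (clock() - t0) / CLOCKS_PER_SEC);

    t0 = clock();
    for (i = 0; i < 100000000L; i++)
        sink += memcmp(a, b, n);
    printf("memcmp:  %.2fs\n", (double) (clock() - t0) / CLOCKS_PER_SEC);

    return (int) (sink & 1);    /* keep the loops from being elided */
}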

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If it's done properly, I don't see how this would be a risk.

I'm fairly uncomfortable about the broad swath and low return of this
patch.  Noah is assuming that none of these places are relying on
strncmp to stop short upon finding a null, and I don't believe that
that's a safe assumption in every single place.  Nor do I believe that
it's worth the effort of trying to prove it safe in most of those
places.

I think this might be a good idea in the varchar.c and varlena.c calls,
but I'd be inclined to leave the rest of the calls alone.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 10:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 If it's done properly, I don't see how this would be a risk.

 I'm fairly uncomfortable about the broad swath and low return of this
 patch.  Noah is assuming that none of these places are relying on
 strncmp to stop short upon finding a null, and I don't believe that
 that's a safe assumption in every single place.  Nor do I believe that
 it's worth the effort of trying to prove it safe in most of those
 places.

 I think this might be a good idea in the varchar.c and varlena.c calls,
 but I'd be inclined to leave the rest of the calls alone.

Eh, I already committed somewhat more than that.  I did think about
the concern which you raise.  It seems pretty clear that's not a
danger in readfuncs.c.  In the hstore and ltree cases, at least at
first blush, it appears to me that it would be downright broken for
someone to be counting on a null to terminate the comparison.  The
intent of these bits of code appears to be to do an equality comparison on a
string stored as a byte count + a byte string, rather than a
null-terminated cstring, so unless I'm misunderstanding something it's
more likely that the use of strncmp() would lead to a bug; the prior
coding doesn't look like it would be correct if NUL bytes were
possible.  The tsearch cases also appear to be safe in this regard,
but since I decided against committing those on other grounds I
haven't looked at them as carefully.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest wrap-up

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 15, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 - Writeable CTEs - I think we need Tom to pick this one up.
 - Fix snapshot taking inconsistencies - Ready for committer. Can any
 committer pick this up?

 Will take a look at these two also.

 Tom, what is your time frame on this?  I think we should wrap up the
 CF without these and bundle 9.1alpha3 unless you plan to get to this
 in the next day or two.

 We probably shouldn't hold up the alpha for these, if there are no
 other items outstanding.

OK.  I've moved them to the next CommitFest and marked this one closed.

*bangs gavel*

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in ts_rank_cd

2010-12-21 Thread Tom Lane
Sushant Sinha sushant...@gmail.com writes:
 There is a bug in ts_rank_cd. It does not correctly give rank when the
 query lexeme is the first one in the tsvector.

Hmm ... I cannot reproduce the behavior you're complaining of.
You say

 select ts_rank_cd(to_tsvector('english', 'abc sdd'),
 plainto_tsquery('english', 'abc'));   
  ts_rank_cd
 ------------
           0

but I get

regression=# select ts_rank_cd(to_tsvector('english', 'abc sdd'),
regression(# plainto_tsquery('english', 'abc'));   
 ts_rank_cd
------------
        0.1
(1 row)

 The problem is that the Cover finding algorithm ignores the lexeme at
 the 0th position,

As far as I can tell, there is no 0th position --- tsvector counts
positions from one.  The only way to see pos == 0 in the input to
Cover() is if the tsvector has been stripped of position information.
ts_rank_cd is documented to return 0 in that situation.  Your patch
would have the effect of causing it to return some nonzero, but quite
bogus, ranking.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Dec 21, 2010 at 10:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm fairly uncomfortable about the broad swath and low return of this
 patch.  Noah is assuming that none of these places are relying on
 strncmp to stop short upon finding a null, and I don't believe that
 that's a safe assumption in every single place.  Nor do I believe that
 it's worth the effort of trying to prove it safe in most of those
 places.

 Eh, I already committed somewhat more than that.  I did think about
 the concern which you raise.

Okay ... I was arguing for not bothering to expend that effort, but
since you already did, it's a moot point.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperlu problem with utf8

2010-12-21 Thread Alex Hunsaker
On Mon, Dec 20, 2010 at 00:39, Alex Hunsaker bada...@gmail.com wrote:

 In further review over caffeine this morning I noticed there are a few
 places I missed: plperl_build_tuple_result(), plperl_modify_tuple()
 and Util.XS.

And here is v3, fixes the above and also makes sure to properly
encode/decode SPI arguments.  Tested on a latin1 database with latin1
columns and utf8 with utf8 columns.  Also passes make installcheck (of
course) and changes one or two things to make plperl.c warning free.


plperl_enc_v3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Jesper Krogh

On 2010-12-21 21:28, Andres Freund wrote:
 On Tuesday 21 December 2010 20:25:16 Jesper Krogh wrote:
  What have I missed in the logic?
 A reproducible testcase ;-)

Yes, I did a complete dump/restore of the dataset and the numbers
looked as expected. So table bloat seems to be the problem/challenge.

I must have hit a strange situation where my table bloat was proportionally
significantly higher than my gin-index bloat.

Jesper

--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Fujii Masao
On Wed, Dec 22, 2010 at 12:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 How about doing target != ALL test at the head for the most common case
 (target == ALL)?

 That's an idea, but the test you propose implements it incorrectly.

Thanks! I revised the patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


signal-some-children-v3.patch
Description: Binary data



Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Heikki Linnakangas

On 22.12.2010 03:45, Rob Wultsch wrote:

On Tue, Dec 21, 2010 at 4:49 AM, Robert Haasrobertmh...@gmail.com  wrote:

On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasbyj...@nasby.net  wrote:

On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:

Does postgres make an effort to create a file with physically continuous blocks?


AFAIK all files are expanded as needed. I don't think there are any flags you can pass to
the filesystem to tell it this file will eventually be 1GB in size. So, we're
basically at the mercy of the FS to try and keep things contiguous.


There have been some reports that we would do better on some
filesystems if we extended the file by more than a block at a time,
rather than block by block as we do today.  However, AFAIK, no one is
pursuing this ATM.


This has been found to be the case in the MySQL world, particularly
when ext3 is in use:
http://forge.mysql.com/worklog/task.php?id=4925
http://www.facebook.com/note.php?note_id=194501560932


These seem to be about extending the transaction log, and we already 
pre-allocate the WAL. The WAL is repeatedly fsync'd, so I can understand 
that extending that in small chunks would hurt performance a lot, as the 
filesystem needs to flush the metadata changes to disk at every commit. 
However, that's not an issue with extending data files, which are only
fsync'd at checkpoints.


It might well be advantageous to extend data files in larger chunks too, 
but it's probably nowhere near as important as with the WAL.



Also, InnoDB has an option for how much data should be allocated at
the end of a tablespace when it needs to grow:
http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path


Hmm, innodb_autoextend_increment seems more like what we're discussing 
here 
(http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment). 
If I'm reading that correctly, InnoDB defaults to extending files in 8MB 
chunks.
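
To make the idea concrete, here is a sketch of chunked extension (my
illustration, not PostgreSQL code; assumes posix_fallocate is available):
round the requested size up to the next chunk boundary, so the filesystem
can hand out larger contiguous extents and flush less metadata.

  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  #define BLCKSZ     8192
  #define CHUNK_BLKS 1024         /* 8MB chunks, like InnoDB's default */

  /* grow the file to the next multiple of CHUNK_BLKS * BLCKSZ */
  static int
  extend_in_chunks(int fd, off_t needed)
  {
      off_t current = lseek(fd, 0, SEEK_END);

      if (current < needed)
      {
          off_t chunk  = (off_t) CHUNK_BLKS * BLCKSZ;
          off_t target = ((needed + chunk - 1) / chunk) * chunk;

          /* posix_fallocate reserves real blocks, not a sparse hole */
          return posix_fallocate(fd, current, target - current);
      }
      return 0;
  }

  int
  main(void)
  {
      int fd = open("demo.dat", O_RDWR | O_CREAT, 0600);

      if (fd < 0)
          perror("open");
      else
      {
          if (extend_in_chunks(fd, 5 * BLCKSZ) != 0)
              fprintf(stderr, "extend_in_chunks failed\n");
          close(fd);
      }
      return 0;
  }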


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Heikki Linnakangas

On 22.12.2010 02:56, Merlin Moncure wrote:

On Tue, Dec 21, 2010 at 7:45 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

Merlin Moncuremmonc...@gmail.com  writes:

Attached is an incomplete patch disabling hint bits based on a compile
switch. ...
So far, at least doing pgbench runs and another test designed to
exercise clog lookups, the performance loss of always doing full
lookup hasn't materialized.


The standard pgbench test would be just about 100% useless for stressing
this, because its net database activity is only about one row
touched/updated per query.  You need a test case that hits lots of rows
per query, else you're just measuring parse+plan+network overhead.


right -- see the attached clog_stress.sql above.  It creates a script
that inserts records in blocks of 1, deletes half of them, and
vacuums.  Neither the execution of the script nor a seq scan following
its execution showed an interesting performance difference (which I am
arbitrarily calling 5% in either direction).  Like I said though, I
don't trust the patch or the results yet.


Make sure you have a good mix of different xids in the table, 
TransactionLogFetch has a one-item cache so repeatedly checking the same 
xid is much faster than the general case.
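
The pattern is simple enough to show standalone (sketch only; the real
cache lives in TransactionLogFetch in transam.c, and slow_clog_lookup
below is just a stand-in for the actual clog read):

  #include <stdio.h>

  typedef unsigned int TransactionId;

  static TransactionId cachedXid = 0;   /* 0 plays InvalidTransactionId */
  static int           cachedStatus;

  /* stand-in for the real SLRU/clog lookup */
  static int
  slow_clog_lookup(TransactionId xid)
  {
      return (int) (xid & 3);
  }

  static int
  transaction_log_fetch(TransactionId xid)
  {
      if (xid == cachedXid)             /* one-item cache hit */
          return cachedStatus;

      cachedStatus = slow_clog_lookup(xid);
      cachedXid = xid;
      return cachedStatus;
  }

  int
  main(void)
  {
      transaction_log_fetch(100);       /* miss: does the "clog" read */
      transaction_log_fetch(100);       /* hit: short-circuits */
      printf("%d\n", transaction_log_fetch(101)); /* miss again */
      return 0;
  }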


Perhaps run pgbench for a while, and then do SELECT COUNT(*) on the 
resulting tables.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Rob Wultsch
On Wed, Dec 22, 2010 at 12:15 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 22.12.2010 03:45, Rob Wultsch wrote:

 On Tue, Dec 21, 2010 at 4:49 AM, Robert Haasrobertmh...@gmail.com
  wrote:

 On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasbyj...@nasby.net  wrote:

 On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:

 Does postgres make an effort to create a file with physically
 continuous blocks?

 AFAIK all files are expanded as needed. I don't think there's any flags
 you can pass to the filesystem to tell it this file will eventually be 1GB
 in size. So, we're basically at the mercy of the FS to try and keep things
 contiguous.

 There have been some reports that we would do better on some
 filesystems if we extended the file more than a block at a time, as we
 do today.  However, AFAIK, no one is pursuing this ATM.

 This has been found to be the case in the MySQL world, particularly
 when ext3 is in use:
 http://forge.mysql.com/worklog/task.php?id=4925
 http://www.facebook.com/note.php?note_id=194501560932

 These seem to be about extending the transaction log, and we already
 pre-allocate the WAL. The WAL is repeatedly fsync'd, so I can understand
 that extending that in small chunks would hurt performance a lot, as the
 filesystem needs to flush the metadata changes to disk at every commit.
 However, that's not an issue with extending data files, they are only
 fsync'd at checkpoints.

 It might well be advantageous to extend data files in larger chunks too, but
 it's probably nowhere near as important as with the WAL.

Agree.

 Also, InnoDB has an option for how much data should be allocated at
 the end of a tablespace when it needs to grow:

 http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path

 Hmm, innodb_autoextend_increment seems more like what we're discussing here
 (http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment).
 If I'm reading that correctly, InnoDB defaults to extending files in 8MB
 chunks.

This is not pure apples to apples, as InnoDB does direct I/O. However,
doesn't the checkpoint completion target code call fsync repeatedly in
order to achieve the checkpoint completion target? And for that
matter, haven't there been recent discussions on -hackers about calling
fsync more often?

Sorry for the loopy email. I have not been getting anywhere near
enough sleep recently :(
-- 
Rob Wultsch
wult...@gmail.com



Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Heikki Linnakangas

On 22.12.2010 09:25, Rob Wultsch wrote:

On Wed, Dec 22, 2010 at 12:15 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Hmm, innodb_autoextend_increment seems more like what we're discussing here
(http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment).
If I'm reading that correctly, InnoDB defaults to extending files in 8MB
chunks.


This is not pure apples to apples as InnoDB does direct io, however
doesn't the checkpoint completion target code call fsync repeatedly in
order to achieve the check point completion target?


It only fsync's each file once. If there are a lot of files, it needs to
issue a lot of fsync's, but each on a different file.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
