Re: [GENERAL] Monitoring Object access

2010-09-14 Thread adi hirschtein
I'd like to look at it from the object level and see how much I/O is being
done on a specific table or index, and then check which sessions are
responsible for that.

Also, which catalog table would you recommend if I want to
see I/O activity on an object regardless of the session?

On Mon, Sep 13, 2010 at 6:57 PM, Greg Smith g...@2ndquadrant.com wrote:

 adi hirschtein wrote:

  Using the catalog tables, is there any way to correlate a session id/user id
  to which objects (i.e. tables, indexes, etc.) it accesses, and how much disk
  read or I/O wait has been done against those objects?
  In general, I'd like to see which objects are being accessed by which user
  and the time/amount of I/O wait/reads.


 On recent Linux systems, the iotop utility is handy to figure out which
 individual users are doing lots of I/O.  There are some cases where the user
 doing the I/O and the one who caused the I/O are different, which includes
 things from synchronized scans to background writer writes.  But for the
 most part that utility gives a useful view into per-user I/O.

 Mark Wong has done some good work toward integrating that same data source
 on Linux into something you can query and match against database activity in
 his pg_proctab project:
 http://www.slideshare.net/markwkm/pgproctab-accessing-system-stats-in-postgresql-3573304

 And if you're on Solaris you can extract a lot of this data with custom
 DTrace scripting.

 I have a rough plan for directly instrumenting more of this information
 from within the database, more like what Oracle does here.  But that's going
 to take months of development time, and I'm not sure the PostgreSQL core
 will even accept the overhead it would add in all cases.  If we could get
 one Oracle user who's on the fence over a PostgreSQL conversion to throw a
 small portion of the money they'd save toward that project, I'm sure I could
 get it developed.  It's just that nobody has been interested enough in such
 a thing to sponsor it so far.

 --
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [GENERAL] Monitoring Object access

2010-09-14 Thread tv
 I'd like to look at it from the object level and see how much I/O is being
 done on a specific table or index, and then check which sessions are
 responsible for that.

 Also, which catalog table would you recommend if I want to
 see I/O activity on an object regardless of the session?

There are a bunch of interesting system views - see the pg_stat_* and pg_statio_* views.

Don't forget the values grow all the time, so you'll have to take snapshots
and subtract them. You could do that by hand, or use a tool for it.
There is pgstatspack (it should work fine, although it hasn't been developed
since 8.1, AFAIK), and I'm working on an alternative tool with a web GUI
(http://sourceforge.net/projects/pgmonitor/). It's still under development and
I know about several issues, but it generally works (thanks in advance for
any feedback).
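The snapshot-and-subtract idea above can be sketched in a few lines. The column names below mirror pg_statio_user_tables, but the snapshots are hand-made stand-ins rather than output fetched from a real server:

```python
# Sketch of the "snapshot and subtract" approach for cumulative
# pg_statio_user_tables counters. The snapshots here are plain dicts
# (relname -> counters) as any driver might return them; the delta
# logic is the point, not the fetching.

def statio_delta(before, after):
    """Per-relation difference of two cumulative I/O counter snapshots."""
    delta = {}
    for rel, counters in after.items():
        base = before.get(rel, {})
        delta[rel] = {k: v - base.get(k, 0) for k, v in counters.items()}
    return delta

snap1 = {"orders": {"heap_blks_read": 100, "heap_blks_hit": 5000}}
snap2 = {"orders": {"heap_blks_read": 160, "heap_blks_hit": 9000}}

d = statio_delta(snap1, snap2)
print(d["orders"])  # I/O done between the two snapshots
```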

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Monitoring Object access

2010-09-14 Thread Satoshi Nagayasu

On 2010/09/12 23:02, adi hirschtein wrote:

I'm coming from the Oracle side of the house, and in Oracle, for instance, you 
use a shared buffer as well, but you are still able to see which session is 
waiting for which blocks; if one session is doing the real I/O, the other one 
waits on the 'wait for other session' event, so you are able to know who did 
the actual I/O.
The reason behind it is that you want to check which objects are being heavily 
hit by which business processes or users, and then tier your storage 
accordingly.
I agree with your point about the OS buffer cache; I need to monitor it as well.
Is there any place other than pg_stat_activity that you think I should take a 
look at?


I think you should also look at pg_locks to see
which sessions are running (or waiting on locks).

http://www.postgresql.org/docs/8.4/interactive/view-pg-locks.html

The pg_locks view contains detailed lock information,
including lock dependencies, which you may need
in order to monitor session activities.
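As a rough illustration of how pg_locks rows can reveal who is blocking whom, here is a small sketch. The row shape (pid, locked object, granted flag) is a simplification of the real view, and the matching rule ignores lock modes:

```python
# Illustrative sketch: pairing waiting sessions with the sessions that
# block them, given rows shaped like simplified pg_locks output
# (pid, locked object, granted flag). The rule is: a waiter on an
# object is blocked by whoever holds a granted lock on that object.

def find_blockers(lock_rows):
    granted = {}
    for pid, obj, is_granted in lock_rows:
        if is_granted:
            granted.setdefault(obj, []).append(pid)
    blocked = {}
    for pid, obj, is_granted in lock_rows:
        if not is_granted:
            blocked[pid] = [p for p in granted.get(obj, []) if p != pid]
    return blocked

rows = [
    (101, "orders", True),   # session 101 holds a lock on orders
    (202, "orders", False),  # session 202 is waiting for it
]
print(find_blockers(rows))  # {202: [101]}
```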

--
NAGAYASU Satoshi satoshi.nagay...@gmail.com



Re: [GENERAL] Monitoring Object access

2010-09-13 Thread adi hirschtein
Thanks!
I'll look into those system tools and probably come back with some more
questions...

Best,
Adi

On Mon, Sep 13, 2010 at 4:58 AM, Craig Ringer
cr...@postnewspapers.com.au wrote:

 On 09/12/2010 10:02 PM, adi hirschtein wrote:

 Hi Craig,

 Thanks a lot for the quick response!
  I'm coming from the Oracle side of the house, and in Oracle, for instance,
  you use a shared buffer as well, but you are still able to see which
  session is waiting for which blocks; if one session is doing the
  real I/O, the other one waits on the 'wait for other session'
  event, so you are able to know who did the actual I/O.

 There's nothing like that in PostgreSQL. There's some lock monitoring
 support for seeing what transactions hold locks and which other transactions
 are waiting on those locks, but AFAIK nothing like that for I/O. PostgreSQL
 does have DTrace hooks, so if you're on Solaris or some BSDs you might be
 able to use those to get the data you want.

 It'd be a pretty significant job to add a decent I/O monitoring system to
 PostgreSQL. Personally, if I needed something like that, I'd want to base it
 on an existing system-level tracing toolkit like Solaris's DTrace or Linux's
 perf. I'd want to add some additional instrumentation hooks - some of
 which already exist in Pg for DTrace - to permit the tools to be aware of
 transactions, statements, the current database, which tables are which,
 which indexes are associated with which tables, etc. Then I'd use the data
 collected by the performance monitoring tools to report on load associated
 with particular users, indexes, tables, queries, etc. That way I'd be able
 to handle things like whether a request was satisfied with OS buffer cache
 or had to go to real disk, report on disk queue depth, etc as part of the
 whole system. It'd be a big job even with the use of existing trace tools to
 help.

 Currently there are some DTrace hooks, but I don't think there's any kind
 of integrated toolset like I've described to use the monitoring hooks plus
 the existing system hooks to do detailed reporting of load/user,
 load/tablespace, etc.


  the reason behind it is that you want to check which objects are being
 heavily hit by which business processes or users, and then tier your
 storage accordingly.


 At the moment, all you can really do is turn up the logging levels to log
 queries, logins, etc. Then watch pg_stat_activity and use system-level tools
 like iostat, vmstat, top, perf/dtrace, etc. If you see backends that're
 hogging resources you can look their pid up in pg_stat_activity or the logs,
 see what they were doing, and run controlled tests to see what can be
 improved.

 It's somewhat clumsy, but seems to work pretty well most of the time.

 Nobody has stepped up to build a comprehensive tracing and performance
 framework - and even if they did, they'd have to make it lightweight enough
 that it didn't slow PostgreSQL down when it wasn't in use, show that it
 wouldn't add an excessive maintenance burden for the developers, show that
 it wouldn't break or produce incorrect results the first time something
 changed, etc. The Linux kernel demonstrates just how hard getting this right
 can be. So does the amount of effort Sun put in to DTrace. Sure, PostgreSQL
 isn't an OS kernel, but it's far from simple.

 I guess that's why Oracle charges the big bucks - because of all the extras
 they include that round the database out into the kitchen-sink monster that
 it is.


  Is there any place other than pg_stat_activity that you think I should
 take a look at?


 System-level tools and the postgresql logs, especially after proper
 configuration. There are some tools on pgfoundry that help a little with log
 analysis.

 --
 Craig Ringer



Re: [GENERAL] Monitoring Object access

2010-09-13 Thread Greg Smith

adi hirschtein wrote:
Using the catalog tables, is there any way to correlate a session 
id/user id to which objects (i.e. tables, indexes, etc.) it accesses, 
and how much disk read or I/O wait has been done against those objects?
In general, I'd like to see which objects are being accessed by which 
user and the time/amount of I/O wait/reads.


On recent Linux systems, the iotop utility is handy to figure out which 
individual users are doing lots of I/O.  There are some cases where the 
user doing the I/O and the one who caused the I/O are different, which 
includes things from synchronized scans to background writer writes.  
But for the most part that utility gives a useful view into per-user I/O.


Mark Wong has done some good work toward integrating that same data 
source on Linux into something you can query and match against database 
activity in his pg_proctab project:  
http://www.slideshare.net/markwkm/pgproctab-accessing-system-stats-in-postgresql-3573304
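For the curious, the kind of raw data pg_proctab draws on is what Linux exposes in /proc/&lt;pid&gt;/io: cumulative per-process byte counts that can be joined against backend pids from pg_stat_activity. A sketch of parsing that format (from a hand-made sample string, not a live process) might look like:

```python
# Parse the key/value lines of Linux's /proc/<pid>/io into a dict.
# The sample text below mirrors the /proc/<pid>/io file format; a real
# monitor would read the file for each backend pid instead.

def parse_proc_io(text):
    """Return {counter_name: value} from /proc/<pid>/io-style text."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            stats[key.strip()] = int(value)
    return stats

sample = "rchar: 4096\nwchar: 1024\nread_bytes: 8192\nwrite_bytes: 0"
io = parse_proc_io(sample)
print(io["read_bytes"])  # bytes actually fetched from storage
```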


And if you're on Solaris you can extract a lot of this data with 
custom DTrace scripting.


I have a rough plan for directly instrumenting more of this information 
from within the database, more like what Oracle does here.  But that's 
going to take months of development time, and I'm not sure the 
PostgreSQL core will even accept the overhead it would add in all 
cases.  If we could get one Oracle user who's on the fence over a 
PostgreSQL conversion to throw a small portion of the money they'd save 
toward that project, I'm sure I could get it developed.  It's just that 
nobody has been interested enough in such a thing to sponsor it so far.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [GENERAL] Monitoring Object access

2010-09-12 Thread Craig Ringer

On 09/12/2010 06:52 PM, adi hirschtein wrote:

Hi,
Using the catalog tables, is there any way to correlate a session id/user
id to which objects (i.e. tables, indexes, etc.) it accesses, and how much
disk read or I/O wait has been done against those objects?
In general, I'd like to see which objects are being accessed by which
user and the time/amount of I/O wait/reads.


There isn't really anything like that, no. You have pg_stat_activity, 
but it's a pretty coarse tool.


The shared buffer cache and the use of things like synchronized 
sequential scans means that it wouldn't even be possible to truly track 
who's causing load reliably. As I understand it, if Joe and Fred both do 
a SELECT * FROM sometable, it's quite likely that only one of the 
backends will actually appear to read anything from disk. Which one does 
is pretty much down to luck.


Adding something like this would require instrumenting not only the 
backends' disk I/O code, but also their shared buffer access code. It'd 
potentially add a lot of overhead, and I'm not sure the results would 
mean very much, because the caching effects would make "Fred's backend 
did X I/O operations" less meaningful. Fred's might just have been the 
first that ran after Joe's giant seqscan cross join of two big tables 
that forced everything else out of shared_buffers.


Even if you did have that instrumentation, you'd need OS-level 
instrumentation (dtrace, prof, etc) to track the OS's buffer cache, 
which PostgreSQL relies on heavily. Without that you can't tell the 
difference between a query that caused I/O calls from postgresql but 
could be satisfied by OS buffer cache and one that required expensive 
physical disk I/O to satisfy.


Really, I don't know if it's realistic to do what you're suggesting 
unless every user's data set is isolated from every other user's, in 
which case you can maybe use OS-level tools like prof or DTrace to 
achieve it if you set postgresql up to log which user starts which backend pid.


If there's any significant overlap in the data sets used by the users 
(if they use the same databases or schema) I wouldn't think you'd be 
able to get any meaningful results.


--
Craig Ringer



Re: [GENERAL] Monitoring Object access

2010-09-12 Thread adi hirschtein
Hi Craig,

Thanks a lot for the quick response!
I'm coming from the Oracle side of the house, and in Oracle, for instance, you
use a shared buffer as well, but you are still able to see which session is
waiting for which blocks; if one session is doing the real I/O, the other one
waits on the 'wait for other session' event, so you are able to know who did
the actual I/O.
The reason behind it is that you want to check which objects are being
heavily hit by which business processes or users, and then tier your storage
accordingly.
I agree with your point about the OS buffer cache; I need to monitor it as
well.
Is there any place other than pg_stat_activity that you think I should take
a look at?

Best Regard,
Adi

On Sun, Sep 12, 2010 at 4:04 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:

 On 09/12/2010 06:52 PM, adi hirschtein wrote:

 Hi,
 Using the catalog tables, is there any way to correlate a session id/user
 id to which objects (i.e. tables, indexes, etc.) it accesses, and how much
 disk read or I/O wait has been done against those objects?
 In general, I'd like to see which objects are being accessed by which
 user and the time/amount of I/O wait/reads.


 There isn't really anything like that, no. You have pg_stat_activity, but
 it's a pretty coarse tool.

 The shared buffer cache and the use of things like synchronized sequential
 scans means that it wouldn't even be possible to truly track who's causing
 load reliably. As I understand it, if Joe and Fred both do a SELECT * FROM
 sometable, it's quite likely that only one of the backends will actually
 appear to read anything from disk. Which one does is pretty much down to luck.

 Adding something like this would require instrumenting not only the
 backends' disk I/O code, but also their shared buffer access code. It'd
 potentially add a lot of overhead, and I'm not sure the results would mean
 very much, because the caching effects would make "Fred's backend did X I/O
 operations" less meaningful. Fred's might just have been the first that ran
 after Joe's giant seqscan cross join of two big tables that forced
 everything else out of shared_buffers.

 Even if you did have that instrumentation, you'd need OS-level
 instrumentation (dtrace, prof, etc) to track the OS's buffer cache, which
 PostgreSQL relies on heavily. Without that you can't tell the difference
 between a query that caused I/O calls from postgresql but could be satisfied
 by OS buffer cache and one that required expensive physical disk I/O to
 satisfy.

 Really, I don't know if it's realistic to do what you're suggesting unless
 every user's data set is isolated from every other user's, in which case you
 can maybe use OS-level tools like prof or DTrace to achieve it if you set
 postgresql up to log which user starts which backend pid.

 If there's any significant overlap in the data sets used by the users (if
 they use the same databases or schema) I wouldn't think you'd be able to get
 any meaningful results.

 --
 Craig Ringer



Re: [GENERAL] Monitoring Object access

2010-09-12 Thread Craig Ringer

On 09/12/2010 10:02 PM, adi hirschtein wrote:

Hi Craig,

Thanks a lot for the quick response!
I'm coming from the Oracle side of the house, and in Oracle, for instance,
you use a shared buffer as well, but you are still able to see which
session is waiting for which blocks; if one session is doing the
real I/O, the other one waits on the 'wait for other session'
event, so you are able to know who did the actual I/O.

There's nothing like that in PostgreSQL. There's some lock monitoring 
support for seeing what transactions hold locks and which other 
transactions are waiting on those locks, but AFAIK nothing like that for 
I/O. PostgreSQL does have DTrace hooks, so if you're on Solaris or some 
BSDs you might be able to use those to get the data you want.


It'd be a pretty significant job to add a decent I/O monitoring system 
to PostgreSQL. Personally, if I needed something like that, I'd want to 
base it on an existing system-level tracing toolkit like Solaris's 
DTrace or Linux's perf. I'd want to add some additional 
instrumentation hooks - some of which already exist in Pg for DTrace - 
to permit the tools to be aware of transactions, statements, the current 
database, which tables are which, which indexes are associated with 
which tables, etc. Then I'd use the data collected by the performance 
monitoring tools to report on load associated with particular users, 
indexes, tables, queries, etc. That way I'd be able to handle things 
like whether a request was satisfied with OS buffer cache or had to go 
to real disk, report on disk queue depth, etc as part of the whole 
system. It'd be a big job even with the use of existing trace tools to help.


Currently there are some DTrace hooks, but I don't think there's any 
kind of integrated toolset like I've described to use the monitoring 
hooks plus the existing system hooks to do detailed reporting of 
load/user, load/tablespace, etc.



the reason behind it is that you want to check which objects are being
heavily hit by which business processes or users, and then tier your
storage accordingly.


At the moment, all you can really do is turn up the logging levels to 
log queries, logins, etc. Then watch pg_stat_activity and use 
system-level tools like iostat, vmstat, top, perf/dtrace, etc. If you 
see backends that're hogging resources you can look their pid up in 
pg_stat_activity or the logs, see what they were doing, and run 
controlled tests to see what can be improved.
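The look-the-pid-up workflow described here can be sketched as a small join between OS-level per-pid I/O figures and pg_stat_activity-style rows. Both inputs below are hypothetical stand-ins, not real iotop or catalog output:

```python
# Sketch of the manual workflow: take per-pid I/O figures from an OS
# tool (iotop, /proc) and look the heavy hitters up in
# pg_stat_activity-style rows (pid -> (user, current query)).

def top_io_backends(os_io, activity, limit=3):
    """Rank pids by I/O volume and attach user/query info where known."""
    rows = []
    for pid, bytes_done in sorted(os_io.items(),
                                  key=lambda kv: kv[1], reverse=True):
        user, query = activity.get(pid, ("<not a backend>", ""))
        rows.append((pid, bytes_done, user, query))
    return rows[:limit]

os_io = {4242: 9_000_000, 4243: 1_000}                    # from the OS tool
activity = {4242: ("fred", "SELECT * FROM big_table")}    # from pg_stat_activity

for pid, nbytes, user, query in top_io_backends(os_io, activity):
    print(pid, nbytes, user, query)
```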


It's somewhat clumsy, but seems to work pretty well most of the time.

Nobody has stepped up to build a comprehensive tracing and performance 
framework - and even if they did, they'd have to make it lightweight 
enough that it didn't slow PostgreSQL down when it wasn't in use, show 
that it wouldn't add an excessive maintenance burden for the developers, 
show that it wouldn't break or produce incorrect results the first time 
something changed, etc. The Linux kernel demonstrates just how hard 
getting this right can be. So does the amount of effort Sun put in to 
DTrace. Sure, PostgreSQL isn't an OS kernel, but it's far from simple.


I guess that's why Oracle charges the big bucks - because of all the 
extras they include that round the database out into the kitchen-sink 
monster that it is.



Is there any place other than pg_stat_activity that you think I should
take a look at?


System-level tools and the postgresql logs, especially after proper 
configuration. There are some tools on pgfoundry that help a little with 
log analysis.


--
Craig Ringer
