(2014/01/29 16:58), Peter Geoghegan wrote:
On Tue, Jan 28, 2014 at 10:51 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
KONDO Mitsumasa <kondo.mitsum...@lab.ntt.co.jp> writes:
By the way, latest pg_stat_statement might affect performance in Windows system.
Because it uses fflush() system call every creating new entry in
pg_stat_statements, and it calls many fread() to warm file cache.

This statement doesn't seem to have much to do with the patch as
committed.

You could make a strong case for the patch improving performance in
practice for many users, by allowing us to increase the
pg_stat_statements.max default value to 5,000, 5 times the previous
value. The real expense of creating a new entry is the exclusive
locking of the hash table, which blocks *everything* in
pg_stat_statements. That isn't expanded at all, since queries are only
written with a shared lock, which only blocks the creation of new
entries which was already relatively expensive. In particular, it does
not block the maintenance of costs for all already observed entries in
the hash table. It's obvious that simply reducing the pressure on the
cache will improve matters a lot, which for many users the external
texts patch does.

Since Mitsumasa has compared that patch and at least one of his
proposed pg_stat_statements patches on several occasions, I will
re-iterate the difference: any increased overhead from the external
texts patch is paid only when a query is first entered into the hash
table. Then, there is obviously and self-evidently no additional
overhead from contention for any future execution of the same query,
no matter what, indefinitely (the exclusive locking section of
creating a new entry does not do I/O, except in fantastically unlikely
circumstances). So for many of the busy production systems I work
with, that's the difference between a cost paid perhaps tens of
thousands of times a second, and a cost paid every few days or weeks.
Even if he is right about a write() taking an unreasonably large
amount of time on Windows, the consequences felt as pg_stat_statements
LWLock contention would be very limited.

I am not opposed in principle to adding new things to the counters
struct in pg_stat_statements. I just think that the fact that the
overhead of installing the module on a busy production system is
currently so low is of *major* value, and therefore any person that
proposes to expand that struct should be required to very conclusively
demonstrate that there is no appreciably increase in overhead. Having
a standard deviation column would be nice, but it's still not that
important. Maybe when we have portable atomic addition we can afford
to be much more inclusive of that kind of thing.

I'd like to know the truth and the fact in your patch, rather than your 
argument:-)

So I create detail pg_stat_statements benchmark tool using pgbench.
This tool can create 10000 pattern unique sqls in a file, and it is only
for measuring pg_stat_statements performance. Because it only updates
pg_stat_statements data and doesn't write to disk at all. It's fair benchmark.

[usage]
perl create_sql.pl >file.sql
psql -h -h xxx.xxx.xxx.xxx mitsu-ko -c 'SELECT pg_stat_statements_reset()'
pgbench -h xxx.xxx.xxx.xxx mitsu-ko -c64 -j32 -n -T 180 -f file.sql

[part of sample sqls file, they are executed in pgbench]
SELECT 1-1/9+5*8*6+5+9-6-3-4/9%7-2%7/5-9/7+2+9/7-1%5%9/3-4/4-9/1+5+5/6/5%2+1*2*3-8/8+5-3-8-4/8+5%2*2-2-5%6+4 SELECT 1%9%7-8/5%6-1%2*2-7+9*6-2*6-9+1-2*9+6+7*8-4-2*1%3/7-1%4%9+4+7/5+4/2-3-5%8/3/7*6-1%8*6*1/7/2%5*6/2-3-9 SELECT 1%3*2/8%6%5-3%1+1/6*6*5/9-2*5+6/6*5-1/2-6%4%8/7%2*7%5%9%5/9%1%1-3-9/2*1+1*6%8/2*4/1+6*7*1+5%6+9*1-9*6
...

I test some methods that include old pgss, old pgss with my patch, and new pgss.
Test server and postgresql.conf are same in I tested last day in this ML-thread.
And test methods and test results are here,

[methods]
method 1: with old pgss, pg_stat_statements.max=1000(default)
method 2: with old pgss with my patch, pg_stat_statements.max=1000(default)
peter 1 : with new pgss(Peter's patch), pg_stat_statements.max=5000(default)
peter 2 : with new pgss(Peter's patch), pg_stat_statements.max=1000

[for reference]
method 5:with old pgss, pg_stat_statements.max=5000
method 6:with old pgss with my patch, pg_stat_statements.max=5000

[results]
 method   |  try1  |  try2  |  try3  | degrade performance ratio
-----------------------------------------------------------------
 method 1 | 6.546  | 6.558  | 6.638  | 0% (reference score)
 method 2 | 6.527  | 6.556  | 6.574  | 1%
 peter 1  | 5.204  | 5.203  | 5.216  |20%
 peter 2  | 4.241  | 4.236  | 4.262  |35%

 method 5 | 5.931  | 5.848  | 5.872  |11%
 method 6 | 5.794  | 5.792  | 5.776  |12%


New pgss is extremely degrade performance than old pgss, and I cannot see performance scaling. I understand that his argument was "My patch reduces time of LWLock in pg_stat_statements,
it is good for performance. However, Kondo's patch will be degrade performance 
in
pg_stat_statements. so I give it -1.". But unless seeing this test results, his claim seems
to contradict it even in Linux OS.

I think that his patch's strong point is only storing long query, at the 
sacrifice of
significant performance. Does it hope in community? At least, I don't hope it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
for($x=0;$x<10000;$x++){
        $query="1";
        for($i=0;$i<50;$i++){
                $rand=int(rand(5));
                $rand_num=int(rand(9))+1;
                if($rand==0){
                       $query.="+$rand_num";
                }elsif($rand==1){
                        $query.="-$rand_num";
                }elsif($rand==2){
                        $query.="*$rand_num";
                }elsif($rand==3){
                        $query.="%$rand_num";
                }else{
                        $query.="/$rand_num";
                }
        }
        print "SELECT $query\n";
}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to