Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Magnus Hagander
On Sat, Jul 7, 2012 at 1:46 AM, Bruce Momjian  wrote:
> On Fri, Jul 06, 2012 at 03:41:41PM -0700, Daniel Farina wrote:
>> On Fri, Jul 6, 2012 at 12:21 PM, Bruce Momjian  wrote:
>> > I think our big gap is in integrating these sections.  There is no easy
>> > way for a bug reporter to find out what happens to his report unless the
>> > patch is applied in the same email thread as the report.  It is hard for
>> > users to see _all_ the changes made in a release because the release
>> > notes are filtered.
>> >
>> > Our current system is designed to have very limited friction of action,
>> > and this give us a high-quality user experience and release quality, but
>> > it does have limits in how well we deal with complex cases.
>>
>> I do basically agree with this.  I was reflecting on the bug tracker
>> issue (or lack thereof) for unrelated reasons earlier today and I
>> think there are some very nice things to recommend the current
>> email-based system, which are the reasons you identify above.  Perhaps
>> the area where it falls down is structured searches (such as for
>> "closed" or "wontfix") and tracking progress of related, complex, or
>> multi-part issues that span multiple root email messages.
>
> I normally assume "friction" is just something that slows you down from
> attaining a goal, but open source development is only _possible_ because
> of the low friction communication available via the Internet.  It isn't
> that open source development would be slower --- it would probably not
> exist in its current form (think shareware diskettes for an
> alternative).

I've never thought of it in terms of "friction", but I think that term
makes a lot of sense. And it sums up pretty much one of the things
that I find the most annoying with the commitfest app - in the end it
boils down to the same thing. I find it annoying that whenever someone
posts a new review or new comments, one has to *also* go into the CF
app and add them there. Which leads to a lot of friction, which in
turn leads to people not actually putting their comments in there,
which decreases the value of the tool.

Don't get me wrong - the cf app is a huge step up from no app at all.
But it's just not done yet.


>> Maybe just using the message-ids to cross reference things (or at
>> least morally: perhaps a point of indirection as to collapse multiple
>> bug reports about the same thing, or to provide a place to add more
>> annotation would be good, not unlike the CommitFest web application in
>> relation to emails) is enough.  Basically, perhaps an overlay
>> on-top-of email might be a more supple way to figure out what process
>> improvements work well without committing to a whole new tool chain
>> and workflow all at once.
>
> I know there is work to allow cross-month email archive threading, and
> that might help.

Yup, it's being worked on.

FWIW, somewhere there's a piece of bugtracker code in a dusty corner
that I once wrote that was intended as a prototype for something
sitting as this "thin overlay on top of email". It worked reasonably
well, and then crashed and burned when it came to the cross-month
thread splitting.

Once that is fixed - and unlike most times before (yes, there are
exceptions), it is actually being worked on right now - that prototype
could be dusted off again, or something else could be built on top of
that capability.


> I feel we have to be honest in what our current development process does
> poorly.

+1.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Martijn van Oosterhout
On Sat, Jul 07, 2012 at 11:36:41AM +0200, Magnus Hagander wrote:
> I've never thought of it in terms of "friction", but I think that term
> makes a lot of sense. And it sums up pretty much one of the things
> that I find the most annoying with the commitfest app - in the end it
> boils down to the same thing. I find it annoying that whenever someone
> posts a new review or new comments, one has to *also* go into the CF
> app and add them there. Which leads to a lot of friction, which in
> turn leads to people not actually putting their comments in there,
> which decreases the value of the tool.
> 
> Don't get me wrong - the cf app is a huge step up from no app at all.
> But it's just not done yet.

Well, if that's the issue then there are well-known solutions to it.
Give each commitfest entry a tag, say CF#8475, and add a bot to the
mail server that examines each email for this tag and, if it's found,
adds the email to the CF app.

This could then easily track commit messages, emails on the mailing
list and even bug reports.  (For example, someone replies to a bug
report with "See CF#8484" and then a reference to the bug thread gets
pushed into the CF app.)

It's also a searchable identifier, which is useful for Google as well.

We do this at my work; it's a very low-barrier method of linking
different systems.
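The bot Martijn describes could be sketched roughly like this. This is a hypothetical illustration, not an existing tool: the `CF#NNNN` tag format comes from his example, while the function name and the sample email body are invented.

```python
import re

# Hypothetical tag format from the proposal: "CF#" followed by the
# numeric commitfest entry id, e.g. CF#8475.
CF_TAG = re.compile(r"\bCF#(\d+)\b")

def extract_cf_ids(email_body):
    """Return the distinct commitfest entry ids mentioned in an email.

    A real bot would run this over each incoming list message and, for
    every id found, attach the message to that entry in the CF app.
    """
    return sorted({int(m) for m in CF_TAG.findall(email_body)})

body = "Fixed in the v3 patch, see CF#8475 and the related CF#8484."
print(extract_cf_ids(body))  # [8475, 8484]
```

The point of scanning every message (rather than requiring a special submission form) is that the linking step costs the poster nothing beyond remembering the tag - which is exactly the low-friction property under discussion.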

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Magnus Hagander
On Sat, Jul 7, 2012 at 12:48 PM, Martijn van Oosterhout
 wrote:
> On Sat, Jul 07, 2012 at 11:36:41AM +0200, Magnus Hagander wrote:
>> I've never thought of it in terms of "friction", but I think that term
>> makes a lot of sense. And it sums up pretty much one of the things
>> that I find the most annoying with the commitfest app - in the end it
>> boils down to the same thing. I find it annoying that whenever someone
>> posts a new review or new comments, one has to *also* go into the CF
>> app and add them there. Which leads to a lot of friction, which in
>> turn leads to people not actually putting their comments in there,
>> which decreases the value of the tool.
>>
>> Don't get me wrong - the cf app is a huge step up from no app at all.
>> But it's just not done yet.
>
> Well, if that's the issue then there are well known solutions to that.
> Give each commitfest entry a tag, say, CF#8475 and add a bot to the
> mail server that examines each email for this tag and if found adds it
> to the CF app.

So now you've moved the friction over to the initial submitter instead,
because he/she now has to get this tag before posting the patch in the
first place. And this would need to happen before it's even known
whether it needs to go on a CF at all, or will just be
approved/rejected directly without ever getting there.

That's not to say it's a horrible idea, it's just to say that things
are never as easy as they first look.

BTW, the *bigger* issue with this path is that now we suddenly have
different kinds of identifiers for different things. So you have a bug
id and a cf id, and they are different things, so you're going to end
up tagging things with multiple ids, etc. That part goes to show that
we cannot just "solve" this in one part of the workflow. We can put in
useful workarounds that go pretty far - like the cf app - but we
cannot get the "complete solution". OTOH, maybe we can never get the
complete solution, but we should work hard at not locking ourselves
into something else.


> This could then easily track commit messages, emails on mailing list
> and even bug reports.  (For example, someone replys to a bug report
> with "See CF#8484" and then a reference to the bug thread gets pushed
> into the CF app.)

All of these things are already "emails on mailing lists", after all...


> It's also a searchable identifier, which is also useful for google.

Yes, I agree that having a searchable and *referenceable* identifier is
a good thing. In fact, it's one of the main reasons to do something
like this. Right now we just tell people "look in the archives", and
that doesn't work. Heck, *I* search them quite often and I still don't
understand how some people can find things so quickly :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] patch: inline code with params

2012-07-07 Thread Pavel Stehule
Hello

Updated patch - parameters can be subqueries now. This needed enhancing
SPI a little bit.

postgres=# do (a int, b int, text) $$begin raise notice '% % %', $1,
$2, $3; end; $$ language plpgsql using 10+100,(select a from x),
:'USER';
NOTICE:  110 10 pavel
DO

Regards

Pavel


2012/7/6 Pavel Stehule :
> Hello
>
> I updated my two years old patch
> http://archives.postgresql.org/pgsql-hackers/2010-07/txtIakTCBA15Z.txt
>
> Syntax is based on Florian Pflog's proposal
> http://archives.postgresql.org/pgsql-hackers/2010-07/msg00110.php
>
> postgres=# do (a int, b int, text) $$begin raise notice '% % %', $1,
> $2, $3; end; $$ language plpgsql using 10+100,20, :'USER';
> NOTICE:  110 20 pavel
> DO
>
> This patch is not final - missing documentation, regress tests, and
> doesn't support subselects as expr.


inline_code_with_params2.patch
Description: Binary data



[HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Satoshi Nagayasu
Hi all,

I've created a new patch to get/reset statistics of WAL buffer
writes (flushes) caused by the WAL buffer becoming full.

This patch provides two new functions, pg_stat_get_xlog_dirty_write()
and pg_stat_reset_xlog_dirty_write(), which are designed to help
determine an appropriate value for the WAL buffer size.

If this counter is increasing in a production environment, it means
that the WAL buffer size is too small to hold the xlog records
generated by the transactions, so you can increase the WAL buffer size
to keep more xlog records buffered and thereby reduce WAL writes.

I think this patch does not affect WAL write performance, but I'm
still paying attention to that.
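A DBA acting on this counter would presumably watch its rate of change rather than its absolute value (a cumulative counter only grows until reset). A minimal sampling sketch, with made-up sample values - the counter readings below are invented for illustration, not taken from the benchmark output:

```python
# Interpret the proposed cumulative counter by sampling it twice and
# looking at the delta over the interval, not the absolute value.

def dirty_writes_per_minute(sample_old, sample_new, interval_minutes):
    """Rate of WAL-buffer-full dirty writes between two counter samples."""
    return (sample_new - sample_old) / interval_minutes

# e.g. the counter read 19229 now vs. 13229 ten minutes ago:
rate = dirty_writes_per_minute(13229, 19229, 10)
print(rate)  # 600.0 dirty writes/minute -> consider raising wal_buffers
```

A sustained nonzero rate under normal load would be the signal that wal_buffers is too small; an occasional blip after a bulk operation probably isn't.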

Any comments or suggestions?

Regards,

---
[snaga@devvm03 src]$ psql -p 15432 postgres
psql (9.3devel)
Type "help" for help.

postgres=# SELECT pg_stat_get_xlog_dirty_write();
 pg_stat_get_xlog_dirty_write
------------------------------
                            0
(1 row)

postgres=# \q
[snaga@devvm03 src]$ pgbench -p 15432 -s 10 -c 32 -t 1000 postgres
Scale option ignored, using pgbench_branches table count = 10
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: simple
number of clients: 32
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 32000/32000
tps = 141.937738 (including connections establishing)
tps = 142.123457 (excluding connections establishing)
[snaga@devvm03 src]$ psql -p 15432 postgres
psql (9.3devel)
Type "help" for help.

postgres=# SELECT pg_stat_get_xlog_dirty_write();
 pg_stat_get_xlog_dirty_write
------------------------------
                            0
(1 row)

postgres=# begin;
BEGIN
postgres=# DELETE FROM pgbench_accounts;
DELETE 100
postgres=# commit;
COMMIT
postgres=# SELECT pg_stat_get_xlog_dirty_write();
 pg_stat_get_xlog_dirty_write
------------------------------
                        19229
(1 row)

postgres=# SELECT pg_stat_reset_xlog_dirty_write();
 pg_stat_reset_xlog_dirty_write
--------------------------------
 
(1 row)

postgres=# SELECT pg_stat_get_xlog_dirty_write();
 pg_stat_get_xlog_dirty_write
------------------------------
                            0
(1 row)

postgres=# \q
[snaga@devvm03 src]$
---


-- 
Satoshi Nagayasu 
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 642c129..df1e6d4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -280,6 +280,11 @@ static XLogRecPtr RedoRecPtr;
  */
 static XLogRecPtr RedoStartLSN = InvalidXLogRecPtr;
 
+/*
+ * Counter for WAL dirty buffer writes.
+ */
+static uint64  WalBufferWriteDirtyCount = 0;
+
 /*--
  * Shared-memory data structures for XLOG control
  *
@@ -1513,6 +1518,7 @@ AdvanceXLInsertBuffer(bool new_segment)
WriteRqst.Flush = 0;
XLogWrite(WriteRqst, false, false);
LWLockRelease(WALWriteLock);
+   WalBufferWriteDirtyCount++;
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
}
}
@@ -10492,3 +10498,15 @@ SetWalWriterSleeping(bool sleeping)
xlogctl->WalWriterSleeping = sleeping;
SpinLockRelease(&xlogctl->info_lck);
 }
+
+uint64
+xlog_dirty_write_counter_get()
+{
+   return WalBufferWriteDirtyCount;
+}
+
+void
+xlog_dirty_write_counter_reset()
+{
+   WalBufferWriteDirtyCount = 0;
+}
diff --git a/src/backend/utils/adt/pgstatfuncs.c 
b/src/backend/utils/adt/pgstatfuncs.c
index 7c0705a..d544a5b 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -117,6 +117,9 @@ extern Datum pg_stat_reset_shared(PG_FUNCTION_ARGS);
 extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS);
 extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS);
 
+extern Datum pg_stat_get_xlog_dirty_write(PG_FUNCTION_ARGS);
+extern Datum pg_stat_reset_xlog_dirty_write(PG_FUNCTION_ARGS);
+
 /* Global bgwriter statistics, from bgwriter.c */
 extern PgStat_MsgBgWriter bgwriterStats;
 
@@ -1700,3 +1703,16 @@ pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS)
 
PG_RETURN_VOID();
 }
+
+Datum
+pg_stat_get_xlog_dirty_write(PG_FUNCTION_ARGS)
+{
+   PG_RETURN_INT64(xlog_dirty_write_counter_get());
+}
+
+Datum
+pg_stat_reset_xlog_dirty_write(PG_FUNCTION_ARGS)
+{
+   xlog_dirty_write_counter_reset();
+   PG_RETURN_VOID();
+}
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index ec79870..01343b9 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -325,6 +325,9 @@ extern XLogRecPtr do_pg_start_backup(const char 
*backupidstr, bool fast, char **
 extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive);
 exter

Re: [HACKERS] enhanced error fields

2012-07-07 Thread Pavel Stehule
2012/7/7 Peter Geoghegan :
> Attached is a revision of this patch, with a few clean-ups, mostly to
> the wording of certain things.
>
> On 5 July 2012 17:41, Pavel Stehule  wrote:
>> * renamed auxiliary functions and moved it elog.c - header is new file
>> "relerror.h"
>
> In my revision, I've just added a pre-declaration and removed the
> dedicated header, which didn't make too much sense to me:
>
> + /* Pre-declare Relation, in order to avoid a build dependency on rel.h. */
> + typedef struct RelationData *Relation;
>
> This works fine, and is precedented. See
> src/include/storage/buffile.h, for example. If you think it's
> unreasonable that none of the functions now added to elog.h are
> callable without also including rel.h, consider that all call sites
> are naturally already doing that, for the errmsg() string itself.
> Besides, this is a perfectly valid way of reducing build dependencies,
> or at least a more valid way than creating a new header that does not
> really represent a natural new division for these new functions, IMHO.
> Opaque pointers are ordinarily used to encapsulate things in C, rather
> than to prevent build dependencies, but I believe that's only because
> in general that's something that C programmers are more likely to
> want.
>

That is a question for Alvaro or Tom. I have no strong opinion on it.

>> * new fields "constraint_table" and "trigger_table" - constraints and
>> triggers are related to relation in pg, not just to schema
>
> I've added some remarks to that effect in the docs of my revision for
> your consideration.
>
>> * better coverage of enhancing errors in source code
>
> Good. I think it's important that we nail down just where these are
> expected to be available. It would be nice if there was a quick and
> easy answer to the question "Just what ErrorResponse fields should
> this new "sub-category of class 23" ereport() site have?".  We clearly
> have yet to work those details out.
>
> I have another concern with this patch. log_error_verbosity appears to
> be intended as an exact analogue of client verbosity (as set by
> PQsetErrorVerbosity()). While this generally holds for this patch,
> there is one notable exception: You always log all of these new fields
> within write_csvlog(), even if (Log_error_verbosity <
> PGERROR_VERBOSE). Why? There will be a bunch of commas in most CSV
> logs once this happens, so that the schema of the log is consistent.
> That is kind of ugly, but I don't see a way around it. We already do
> this for location and application_name (though that's separately
> controlled by the application_name guc). I haven't touched that in the
> attached revision, as I'd like to hear what you have to say.

It is a bug - these new fields should be used only when verbosity is >= VERBOSE.

>
> Another problem that will need to be fixed is that of the follow values:
>
> +#define PG_DIAG_COLUMN_NAME'c'
> +#define PG_DIAG_TABLE_NAME 't'
> +#define PG_DIAG_SCHEMA_NAME's'
> +#define PG_DIAG_CONSTRAINT_NAME'n'
> +#define PG_DIAG_CONSTRAINT_TABLE   'o'
> +#define PG_DIAG_CONSTRAINT_SCHEMA  'm'
> +#define PG_DIAG_ROUTINE_NAME   'r'
> +#define PG_DIAG_ROUTINE_SCHEMA 'u'
> +#define PG_DIAG_TRIGGER_NAME   'g'
> +#define PG_DIAG_TRIGGER_TABLE  'i'
> +#define PG_DIAG_TRIGGER_SCHEMA 'h'
>
> Not all appear to have a way of setting the value within the ereport
> interface. For example, there is nothing like "errrelation_column()"
> (or "errrelcol()", as I call it) to set PG_DIAG_ROUTINE_NAME. This is
> something I haven't touched.

When I sent this patch the first time, one issue was the new functions
for these fields. Tom's proposal was to use a generic function for
these new fields. The fields hold separate values, but in almost all
cases certain combinations are used together - "ROUTINE_NAME,
ROUTINE_SCHEMA", "TABLE_NAME, TABLE_SCHEMA" - so the fields are not
independent. This is a difference from the original ErrorData fields,
so the auxiliary functions don't map one-to-one onto the fields,
because that would not be practical.

>
>>> src/backend/utils/adt/domains.c
>>> 162:
>>> (errcode(ERRCODE_CHECK_VIOLATION),
>>>
>>
>> these exceptions are related to domains - we has not adequate fields
>> now - and these fields are not in standards
>>
>> it needs some like DOMAIN_NAME and DOMAIN_SCHEMA ???
>
> Hmm. I'm not sure that it's worth it.
>
> I took a look at recent JDBC documentation, because I'd expect it to
> offer the most complete support for exposing this in exception
> classes. Turns out that it does not expose things at as fine a
> granularity as you have here at all, which is disappointing. It seems
> to suppose that just a vendor code and "cause" will be sufficient. If
> you're using this one proprietary database, there is a command that'll
> let you get diagnostics. The wisdom for users of another proprietary
> database seems to be that you just use stored proced

Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Euler Taveira
On 07-07-2012 09:00, Satoshi Nagayasu wrote:
> I've created new patch to get/reset statistics of WAL buffer
> writes (flushes) caused by WAL buffer full.
> 
This new statistic doesn't solve your problem (tuning wal_buffers). It
doesn't give you the wal_buffers value. It only says "hey, I needed
more buffers so I wrote those dirty ones". It doesn't say how many. I
would like to have something that says "hey, you have 1000 buffers
available and you are using 100 buffers (10%)". This new statistic is
only useful for decreasing the WALWriteLock contention.


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Satoshi Nagayasu
2012/07/07 22:07, Euler Taveira wrote:
> On 07-07-2012 09:00, Satoshi Nagayasu wrote:
>> I've created new patch to get/reset statistics of WAL buffer
>> writes (flushes) caused by WAL buffer full.
>>
> This new statistic doesn't solve your problem (tune wal_buffers). It doesn't
> give you the wal_buffers value. It only says "hey, I needed more buffers so I
> write those dirty ones". It doesn't say how many. I would like to have
> something that says "hey, you have 1000 buffers available and  you are using
> 100 buffers (10%)". This new statistic is only useful for decreasing the
> WALWriteLock contention.

I agree that it would not tell us the exact number for wal_buffers,
but it would help a DBA understand what's actually happening around
the WAL buffers.

Also, decreasing the WALWriteLock contention is obviously important
for a DBA in terms of improving database performance.

Actually, that's the reason why I'm working on other statistics as well. :)
http://archives.postgresql.org/pgsql-hackers/2012-06/msg01489.php
http://archives.postgresql.org/pgsql-hackers/2012-06/msg01489.php

Regards,
-- 
Satoshi Nagayasu 
Uptime Technologies, LLC. http://www.uptime.jp



Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Robert Haas
On Jul 7, 2012, at 9:07 AM, Euler Taveira  wrote:
> On 07-07-2012 09:00, Satoshi Nagayasu wrote:
>> I've created new patch to get/reset statistics of WAL buffer
>> writes (flushes) caused by WAL buffer full.
>> 
> This new statistic doesn't solve your problem (tune wal_buffers). It doesn't
> give you the wal_buffers value. It only says "hey, I needed more buffers so I
> write those dirty ones". It doesn't say how many. I would like to have
> something that says "hey, you have 1000 buffers available and  you are using
> 100 buffers (10%)". This new statistic is only useful for decreasing the
> WALWriteLock contention.

The number of WAL buffers that you are using is going to change so quickly as 
to be utterly meaningless.  I don't really see that there's any statistic we 
could gather that would tell us how many WAL buffers are needed.  This patch 
seems like it's on the right track, at least telling you how often you're 
running out.

I'm interested to run some benchmarks with this; I think it could be quite 
informative.

...Robert


Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Magnus Hagander
On Sat, Jul 7, 2012 at 3:52 PM, Robert Haas  wrote:
> On Jul 7, 2012, at 9:07 AM, Euler Taveira  wrote:
>> On 07-07-2012 09:00, Satoshi Nagayasu wrote:
>>> I've created new patch to get/reset statistics of WAL buffer
>>> writes (flushes) caused by WAL buffer full.
>>>
>> This new statistic doesn't solve your problem (tune wal_buffers). It doesn't
>> give you the wal_buffers value. It only says "hey, I needed more buffers so I
>> write those dirty ones". It doesn't say how many. I would like to have
>> something that says "hey, you have 1000 buffers available and  you are using
>> 100 buffers (10%)". This new statistic is only useful for decreasing the
>> WALWriteLock contention.
>
> The number of WAL buffers that you are using is going to change so quickly as 
> to be utterly meaningless.  I don't really see that there's any statistic we 
> could gather that would tell us how many WAL buffers are needed.  This patch 
> seems like it's on the right track, at least telling you how often you're 
> running out.

We could keep a high watermark of "what's the largest percentage we've
used", perhaps?
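In miniature, the high-watermark idea could look roughly like this. This is a toy model for discussion, not backend code: the class and method names are invented, and `record_usage()` is assumed to be called from wherever buffer occupancy is already known.

```python
class WalBufferStats:
    """Toy model of tracking a buffer-usage high watermark.

    total_buffers stands in for the configured number of WAL buffers;
    the percentage answers Euler's "how close did we get?" question
    without trying to report the (rapidly changing) instantaneous use.
    """
    def __init__(self, total_buffers):
        self.total = total_buffers
        self.high_watermark = 0

    def record_usage(self, buffers_in_use):
        # Only the maximum ever seen is retained.
        if buffers_in_use > self.high_watermark:
            self.high_watermark = buffers_in_use

    def high_watermark_pct(self):
        return 100.0 * self.high_watermark / self.total

stats = WalBufferStats(total_buffers=1000)
for used in (120, 870, 430):
    stats.record_usage(used)
print(stats.high_watermark_pct())  # 87.0
```

Unlike an instantaneous usage figure, a watermark survives the fast fluctuations Robert mentions, at the cost of not distinguishing one brief spike from sustained pressure.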

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Bruce Momjian
On Sat, Jul 07, 2012 at 11:36:41AM +0200, Magnus Hagander wrote:
> >> I do basically agree with this.  I was reflecting on the bug tracker
> >> issue (or lack thereof) for unrelated reasons earlier today and I
> >> think there are some very nice things to recommend the current
> >> email-based system, which are the reasons you identify above.  Perhaps
> >> the area where it falls down is structured searches (such as for
> >> "closed" or "wontfix") and tracking progress of related, complex, or
> >> multi-part issues that span multiple root email messages.
> >
> > I normally assume "friction" is just something that slows you down from
> > attaining a goal, but open source development is only _possible_ because
> > of the low friction communication available via the Internet.  It isn't
> > that open source development would be slower --- it would probably not
> > exist in its current form (think shareware diskettes for an
> > alternative).
> 
> I've never thought of it in terms of "friction", but I think that term
> makes a lot of sense. And it sums up pretty much one of the things
> that I find the most annoying with the commitfest app - in the end it
> boils down to the same thing. I find it annoying that whenever someone
> posts a new review or new comments, one has to *also* go into the CF
> app and add them there. Which leads to a lot of friction, which in
> turn leads to people not actually putting their comments in there,
> which decreases the value of the tool.

For me, we can't just say that our process is the way it is because we
have always done it that way, or because we are too lazy to change it.
I think most of us have a gut feeling that our process is good, but we
have to have some logic behind why we are doing things differently
from most other projects, and I think "friction" does explain a lot of it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Bruce Momjian
On Sat, Jul 07, 2012 at 12:59:02PM +0200, Magnus Hagander wrote:
> On Sat, Jul 7, 2012 at 12:48 PM, Martijn van Oosterhout
>  wrote:
> > On Sat, Jul 07, 2012 at 11:36:41AM +0200, Magnus Hagander wrote:
> >> I've never thought of it in terms of "friction", but I think that term
> >> makes a lot of sense. And it sums up pretty much one of the things
> >> that I find the most annoying with the commitfest app - in the end it
> >> boils down to the same thing. I find it annoying that whenever someone
> >> posts a new review or new comments, one has to *also* go into the CF
> >> app and add them there. Which leads to a lot of friction, which in
> >> turn leads to people not actually putting their comments in there,
> >> which decreases the value of the tool.
> >>
> >> Don't get me wrong - the cf app is a huge step up from no app at all.
> >> But it's just not done yet.
> >
> > Well, if that's the issue then there are well known solutions to that.
> > Give each commitfest entry a tag, say, CF#8475 and add a bot to the
> > mail server that examines each email for this tag and if found adds it
> > to the CF app.
> 
> So now you moved the friction over to the initial submitter instead,
> because he/she how has to get this tag before posting the patch in the
> first place. And this would need to happen before it's even known if
> this needs to go on a CF or will just be approved/rejected directly
> without even getting there.
> 
> That's not to say it's a horrible idea, it's just to say that things
> are never as easy as they first look.
> 
> BTW, the *bigger* issue with this path is that now we suddenly have
> different kinds of identifiers for different things. So you have a bug
> id, and a cf id, and they are different things. So you're going to end
> up tagging things with multiple id's. etc. That part goes to show that
> we cannot just "solve" this in one part of the workflow. We can put in
> useful workarounds that go pretty far - like the cf app - but we
> cannot get the "complete solution". OTOH, maybe we never can get the
> complete solution, but we should work hard at not locking into
> something else.

Yes, we would almost need to number each incoming email, and use that
reference all the way through to the release note item text, or at
least have a searchable git log of the release note commits.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Magnus Hagander
On Sat, Jul 7, 2012 at 4:42 PM, Bruce Momjian  wrote:
> On Sat, Jul 07, 2012 at 12:59:02PM +0200, Magnus Hagander wrote:
>> On Sat, Jul 7, 2012 at 12:48 PM, Martijn van Oosterhout
>>  wrote:
>> > On Sat, Jul 07, 2012 at 11:36:41AM +0200, Magnus Hagander wrote:
>> >> I've never thought of it in terms of "friction", but I think that term
>> >> makes a lot of sense. And it sums up pretty much one of the things
>> >> that I find the most annoying with the commitfest app - in the end it
>> >> boils down to the same thing. I find it annoying that whenever someone
>> >> posts a new review or new comments, one has to *also* go into the CF
>> >> app and add them there. Which leads to a lot of friction, which in
>> >> turn leads to people not actually putting their comments in there,
>> >> which decreases the value of the tool.
>> >>
>> >> Don't get me wrong - the cf app is a huge step up from no app at all.
>> >> But it's just not done yet.
>> >
>> > Well, if that's the issue then there are well known solutions to that.
>> > Give each commitfest entry a tag, say, CF#8475 and add a bot to the
>> > mail server that examines each email for this tag and if found adds it
>> > to the CF app.
>>
>> So now you moved the friction over to the initial submitter instead,
>> because he/she how has to get this tag before posting the patch in the
>> first place. And this would need to happen before it's even known if
>> this needs to go on a CF or will just be approved/rejected directly
>> without even getting there.
>>
>> That's not to say it's a horrible idea, it's just to say that things
>> are never as easy as they first look.
>>
>> BTW, the *bigger* issue with this path is that now we suddenly have
>> different kinds of identifiers for different things. So you have a bug
>> id, and a cf id, and they are different things. So you're going to end
>> up tagging things with multiple id's. etc. That part goes to show that
>> we cannot just "solve" this in one part of the workflow. We can put in
>> useful workarounds that go pretty far - like the cf app - but we
>> cannot get the "complete solution". OTOH, maybe we never can get the
>> complete solution, but we should work hard at not locking into
>> something else.
>
> Yes, we would almost need to number each incoming email, and use that
> reference all the way through to the release note item text. or at least
> a searchable git log of the release note commits.

Which, in a lot of ways, we already have - the message-id. We need a
nicer interface for people to use than that, but it serves pretty
well as an underlying identifier.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Bruce Momjian
On Sat, Jul 07, 2012 at 04:45:42PM +0200, Magnus Hagander wrote:
> >> That's not to say it's a horrible idea, it's just to say that things
> >> are never as easy as they first look.
> >>
> >> BTW, the *bigger* issue with this path is that now we suddenly have
> >> different kinds of identifiers for different things. So you have a bug
> >> id, and a cf id, and they are different things. So you're going to end
> >> up tagging things with multiple id's. etc. That part goes to show that
> >> we cannot just "solve" this in one part of the workflow. We can put in
> >> useful workarounds that go pretty far - like the cf app - but we
> >> cannot get the "complete solution". OTOH, maybe we never can get the
> >> complete solution, but we should work hard at not locking into
> >> something else.
> >
> > Yes, we would almost need to number each incoming email, and use that
> > reference all the way through to the release note item text. or at least
> > a searchable git log of the release note commits.
> 
> Which we in a lot of ways have - we have the messageid. But we need a
> nicer interface for people to use than that... But that serves pretty
> well as an underlying identifier.

Imagine a case where a single message-id could show all emails before
and after that refer to it, and all commits and release note text
associated with it.  I think there are going to be cases where multiple
email threads all represent the same item, of course, and might not be
linked via email references.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Bug tracker tool we need

2012-07-07 Thread Magnus Hagander
On Sat, Jul 7, 2012 at 4:48 PM, Bruce Momjian  wrote:
> On Sat, Jul 07, 2012 at 04:45:42PM +0200, Magnus Hagander wrote:
>> >> That's not to say it's a horrible idea, it's just to say that things
>> >> are never as easy as they first look.
>> >>
>> >> BTW, the *bigger* issue with this path is that now we suddenly have
>> >> different kinds of identifiers for different things. So you have a bug
>> >> id, and a cf id, and they are different things. So you're going to end
>> >> up tagging things with multiple id's. etc. That part goes to show that
>> >> we cannot just "solve" this in one part of the workflow. We can put in
>> >> useful workarounds that go pretty far - like the cf app - but we
>> >> cannot get the "complete solution". OTOH, maybe we never can get the
>> >> complete solution, but we should work hard at not locking into
>> >> something else.
>> >
>> > Yes, we would almost need to number each incoming email, and use that
>> > reference all the way through to the release note item text. or at least
>> > a searchable git log of the release note commits.
>>
>> Which we in a lot of ways have - we have the messageid. But we need a
>> nicer interface for people to use than that... But that serves pretty
>> well as an underlying identifier.
>
> Imagine a case where a single message-id could show all emails before
> and after that refer to it, and all commits and release note text
> associated with it.  I think there are going to be cases where multiple
> email threads all represent the same item, of course, and might not be
> linked via email references.

Yes, that's the part that has to be taken care of by that "thin
overlay over email". Some way of grouping multiple threads together,
and somehow keeping the end-status of them
(rejected/committed/wontfix/whatever-you-want-to-call-the-status). But
not actually keeping a copy of the information anywhere else.
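A minimal sketch of such a thin overlay, under the assumption that it stores only message-ids plus a status and never copies mail content (all names and statuses here are illustrative, not a proposal):

```python
from dataclasses import dataclass, field

# Hypothetical status names; the thread deliberately leaves naming open.
STATUSES = {"open", "rejected", "committed", "wontfix"}

@dataclass
class TopicGroup:
    """A thin overlay record: it holds only message-ids and a status.
    The mail archive itself stays the authoritative copy of content."""
    status: str = "open"
    message_ids: set = field(default_factory=set)

    def add_thread(self, message_id: str) -> None:
        # Grouping: several independent threads can map to one topic.
        self.message_ids.add(message_id)

    def close(self, status: str) -> None:
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.status = status

group = TopicGroup()
group.add_thread("first-report@example")
group.add_thread("followup-patch@example")   # second thread, same topic
group.close("committed")
print(group.status, len(group.message_ids))  # committed 2
```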


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Schema version management

2012-07-07 Thread Aidan Van Dyk
On Fri, Jul 6, 2012 at 4:50 PM, Peter Eisentraut  wrote:

> I have code in the wild that defines new operators and casts and has no
> C code and is not in an extension and has no business being in an
> extension.

Nobody is claiming that pg_dump shouldn't dump it.

But, since you're using operators, what would you think is an
appropriate name for the file the operator is dumped into?

a.


-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.



Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Robert Haas
On Jul 7, 2012, at 8:54 AM, Magnus Hagander  wrote:
> On Sat, Jul 7, 2012 at 3:52 PM, Robert Haas  wrote:
>> On Jul 7, 2012, at 9:07 AM, Euler Taveira  wrote:
>>> On 07-07-2012 09:00, Satoshi Nagayasu wrote:
 I've created new patch to get/reset statistics of WAL buffer
 writes (flushes) caused by WAL buffer full.
 
>>> This new statistic doesn't solve your problem (tune wal_buffers). It doesn't
>>> give you the wal_buffers value. It only says "hey, I needed more buffers so 
>>> I
>>> write those dirty ones". It doesn't say how many. I would like to have
>>> something that says "hey, you have 1000 buffers available and  you are using
>>> 100 buffers (10%)". This new statistic is only useful for decreasing the
>>> WALWriteLock contention.
>> 
>> The number of WAL buffers that you are using is going to change so quickly 
>> as to be utterly meaningless.  I don't really see that there's any statistic 
>> we could gather that would tell us how many WAL buffers are needed.  This 
>> patch seems like it's on the right track, at least telling you how often 
>> you're running out.
> 
> We could keep a high watermark of "what's the largest percentage we've
> used", perhaps?

Sure, but I doubt that would be as informative as this.  It's no big deal if 
you hit 100% every once in a while; what you really want to know is whether 
it's happening once per second or once per week.

...Robert


Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Magnus Hagander
On Sat, Jul 7, 2012 at 7:06 PM, Robert Haas  wrote:
> On Jul 7, 2012, at 8:54 AM, Magnus Hagander  wrote:
>> On Sat, Jul 7, 2012 at 3:52 PM, Robert Haas  wrote:
>>> On Jul 7, 2012, at 9:07 AM, Euler Taveira  wrote:
 On 07-07-2012 09:00, Satoshi Nagayasu wrote:
> I've created new patch to get/reset statistics of WAL buffer
> writes (flushes) caused by WAL buffer full.
>
 This new statistic doesn't solve your problem (tune wal_buffers). It 
 doesn't
 give you the wal_buffers value. It only says "hey, I needed more buffers 
 so I
 write those dirty ones". It doesn't say how many. I would like to have
 something that says "hey, you have 1000 buffers available and  you are 
 using
 100 buffers (10%)". This new statistic is only useful for decreasing the
 WALWriteLock contention.
>>>
>>> The number of WAL buffers that you are using is going to change so quickly 
>>> as to be utterly meaningless.  I don't really see that there's any 
>>> statistic we could gather that would tell us how many WAL buffers are 
>>> needed.  This patch seems like it's on the right track, at least telling 
>>> you how often you're running out.
>>
>> We could keep a high watermark of "what's the largest percentage we've
>> used", perhaps?
>
> Sure, but I doubt that would be as informative as this.  It's no big deal if 
> you hit 100% every once in a while; what you really want to know is whether 
> it's happening once per second or once per week.

I'm not suggesting one or the other, I'm suggesting that both values
might be interesting. Though in reality, you'd want that high
watermark to only count if it was the state for more than some minimum
amount of time, which is a lot more difficult to get. So yeah, maybe
that's overkill to even try.
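For illustration, the two statistics under discussion could be sketched like this (a toy model, not PostgreSQL code; the dwell-time refinement is deliberately omitted since, as noted above, it is hard to get right):

```python
class WalBufferStats:
    """Toy sketch of the two proposed statistics: a cumulative counter
    of WAL-buffer-full events (what the patch adds) and a simple usage
    high watermark (the suggestion above). Promoting the watermark only
    when a level is *sustained* would need per-level timing and is
    omitted here."""
    def __init__(self):
        self.dirty_writes = 0      # times we wrote because buffers were full
        self.high_watermark = 0.0  # largest fraction of wal_buffers seen in use

    def on_buffer_full(self):
        self.dirty_writes += 1

    def on_usage_sample(self, fraction: float):
        self.high_watermark = max(self.high_watermark, fraction)

stats = WalBufferStats()
for frac in (0.3, 0.95, 0.6):   # momentary spike to 95% usage
    stats.on_usage_sample(frac)
stats.on_buffer_full()
print(stats.dirty_writes, stats.high_watermark)  # 1 0.95
```

As Robert notes, the counter tells you the frequency of running out, while the watermark alone cannot distinguish a one-off spike from chronic pressure.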


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] regex_fixed_prefix() is still a few bricks shy of a load

2012-07-07 Thread Tom Lane
In
http://archives.postgresql.org/pgsql-general/2012-07/msg00107.php
it's pointed out that regex_fixed_prefix() gets the wrong answer
when presented with a regex like '^(xyz...)?...'.  It thinks this
pattern has a fixed prefix of "xyz", that is, it can only match strings
beginning with "xyz".  But because of the quantifier attached to the
parenthesized subexpression, that ain't so.  This leads to generating
incorrect indexable conditions from a regex WHERE clause.
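The failure mode is easy to demonstrate in any regex engine with the same quantifier semantics, e.g. Python's re module (used here purely for illustration):

```python
import re

# A '?' on the parenthesized group makes it optional, so the pattern
# can match strings that do NOT begin with "xyz" -- "xyz" is therefore
# not a fixed prefix, contrary to what regex_fixed_prefix() concludes.
pattern = re.compile(r'^(xyz)?abc')

assert pattern.match("xyzabc")   # matches with the optional group present
assert pattern.match("abc")      # also matches WITHOUT the "xyz" prefix

# Hence an index range condition derived from the assumed prefix "xyz"
# (roughly col >= 'xyz' AND col < 'xy<next char>') would wrongly
# exclude rows like 'abc...'.
print(bool(pattern.match("abc")))  # True
```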

I can see a few ways we might attack this:

1. Give up on index-optimizing patterns that begin with "^(".  This is
fairly undesirable because psql's "\d object-name-pattern" commands
all generate regexes that start that way; so for example "\df myfunc"
would no longer be able to use the index on pg_proc.proname.

2. Teach regex_fixed_prefix() to recognize whether the parenthesized
subexpression has a quantifier.  This would require a quantum jump in
regex_fixed_prefix's intelligence, though, because now it would have
to be capable of finding the matching right paren.  We could perhaps
restrict the set of things it knows how to skip over to find the
matching paren to things that are likely to appear in psql \d usage,
but it still seems messy and fragile.

3. Try another approach entirely.  The idea that I've got in mind here
is to compile the regex using the regex library and then look at the
compiled NFA representation to see if there must be a fixed prefix.
I would not have risked this before this year, but after last winter's
expeditions into the darkest corners of the regex library I feel
competent to do it, and some quick study suggests that it might not take
very much code to produce something that is significantly brighter than
what we have now.  However, there are a couple of arguments against
pursuing this path:

* We would now require the regex library to expose an API that seems
rather Postgres-specific, namely "give me the fixed prefix if any for
this regex_t".  Given our ambition of eventually splitting off Spencer's
library as a standalone project, it's somewhat annoying to saddle it
with such a requirement.  On the other hand, it's a rather small tax to
pay for Postgres effectively sponsoring further development of the regex
library.

* Character values in the compiled regex are in pg_wchar representation,
so we'd need a wchar-to-server-encoding transformation.  Such code
exists in HEAD, as of a couple days ago, but it's practically untested;
so back-porting it into stable branches and relying on it to be correct
seems a tad optimistic.  Possibly in the back branches we could restrict
regex_fixed_prefix to only cope with 7-bit ASCII prefix strings?  I bet
there would be squawks from non-English-speakers, though.

I think that solution #3 is probably the best way to go forward, but
it's not at all clear what to do for the back branches.  Thoughts?

regards, tom lane



Re: [HACKERS] pg_prewarm

2012-07-07 Thread Jeff Janes
This is a review for pg_prewarm V2.

It applies (with some fuzz, but it is handled correctly) and builds cleanly.

It includes docs, but does not include regression tests, which it
probably should (just to verify that it continues to compile and
execute without throwing errors, I wouldn't expect an automated test
to verify actual performance improvement).

I think we want this.  There is some discussion about how much overlap
it has with pgfincore, but I don't think there is an active proposal
to put that into contrib, so I don't see that as blocking this.

It works as advertised.  Using pgbench -i -s100 (about 1.5Gig), with
shared_buffers of current default (128 MB), it takes 10 minutes for
pgbench -S to revive the cache from a cold start and reach its full
TPS.  If I use pg_prewarm on both pgbench_accounts and
pgbench_accounts_pkey from a cold start, it takes 22 seconds, and then
pgbench -S runs at full speed right from the start.

It does not matter if I use 'read' or 'buffer'.  While all the data
doesn't fit in shared_buffers, trying to read it into the buffers acts
to populate the file cache anyway, and doesn't take significantly more
time.

On my test system (openSUSE 12.1) 'prefetch' took just as long as 'read'
or 'buffer', and sometimes it seemed to fail to load everything (it
would take pgbench -S up to 60 seconds to reach full speed).  I expect
this to be too system-dependent to care much about figuring out what is
going on, though.


For the implementation:

1)
I think that for most users, making them know or care about forks and
block numbers is too much.  It would be nice if there were a
single-argument form:  pg_prewarm(relation) which loaded all of either
main, or all of all forks, using 'buffer'.  This seems like a good
default.  Also, the last two arguments are NULL in all the given
examples.  Do we expect those to be used only for experimental
purposes by hackers, or are those of general interest?

2)
The error message:
ERROR:  prewarm type cannot be null

Should offer the same hint as:

ERROR:  invalid prewarm type
HINT:  Valid prewarm types are "prefetch", "read", and "buffer".

3)
In the docs, the precedence seems to be that fork names ('main', here)
when in LITERAL classes are shown with single quotes around them,
rather than bare.

4) Not sure that the copyright should start in  2010 in pg_prewarm.c:
Copyright (c) 2010-2012


I have not tested on a system which does not support posix_fadvise.

Cheers,

Jeff



Re: [HACKERS] Schema version management

2012-07-07 Thread Alvaro Herrera

Excerpts from Aidan Van Dyk's message of sáb jul 07 11:32:33 -0400 2012:
> 
> On Fri, Jul 6, 2012 at 4:50 PM, Peter Eisentraut  wrote:
> 
> > I have code in the wild that defines new operators and casts and has no
> > C code and is not in an extension and has no business being in an
> > extension.
> 
> Nobody is claiming that pgdump shouldn't dump it.
> 
> But, since you're using operators, what would you think is an
> appropriate name for the file the operator is dumped into?

I was thinking that it might make sense to group operators according to
the type(s) they operate on, somehow.  Using funny chars for names is
guaranteed to cause problems somewhere.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] regex_fixed_prefix() is still a few bricks shy of a load

2012-07-07 Thread Robert Haas
On Jul 7, 2012, at 1:46 PM, Tom Lane  wrote:
> 3. Try another approach entirely.  The idea that I've got in mind here
> is to compile the regex using the regex library and then look at the
> compiled NFA representation to see if there must be a fixed prefix.
> I would not have risked this before this year, but after last winter's
> expeditions into the darkest corners of the regex library I feel
> competent to do it, and some quick study suggests that it might not take
> very much code to produce something that is significantly brighter than
> what we have now.  However, there are a couple of arguments against
> pursuing this path:

I think this is clearly the best way forward, probably even in the back 
branches.  It's true that the wchar to mb conversion is largely untested, but 
it's also pretty simple code.  Sure, it could have bugs, but so could whatever 
work-around you cobble together to avoid back-patching it.  And it's not like 
we'll break anything else, either: the code will only be used in the case that 
is buggy right now anyway.

...Robert


Re: [HACKERS] Schema version management

2012-07-07 Thread Tom Lane
Alvaro Herrera  writes:
> Excerpts from Aidan Van Dyk's message of sáb jul 07 11:32:33 -0400 2012:
>> But, since you're using operators, what would you think is an
>> appropriate name for the file the operator is dumped into?

> I was thinking that it might make sense to group operators according to
> the type(s) they operate on, somehow.  Using funny chars for names is
> guaranteed to cause problems somewhere.

Sure.  You need not look further than "/" to find an operator name that
absolutely *will* cause trouble if it's dumped into a filename
literally.

I'm not especially thrilled by the idea of using url-encoding or
something like that for operator names, though.  Seems like it loses on
readability.

If we think that operators outside of extensions will be an infrequent
special case, what about just dumping all of them into a single file
named "operators"?  And similarly for casts?

regards, tom lane



Re: [HACKERS] autocomplete - SELECT fx

2012-07-07 Thread Noah Misch
On Thu, Jul 05, 2012 at 06:43:09PM -0700, Josh Kupershmidt wrote:
> On Mon, Jul 2, 2012 at 1:13 AM, Pavel Stehule  wrote:
> 
> > I tested Peter's patch and it works well.
> 
> I liked it as well. But I'm not sure what should happen with the patch
> now. It seems like it'd be commit-ready with just a tweak or two to
> the query as I noted in my last mail, but Tom did seem opposed[1] to
> the idea in the first thread, and no one else has spoken up recently
> in favor of the idea, so maybe it should just be marked Rejected or
> Returned with Feedback?

I like the patch, as far as it goes.  It's the natural addition to the
completions we already offer; compare the simplistic completion after WHERE.
Like Pavel and Robert, I think a delightful implementation of tab completion
for SELECT statements would require radical change.

Thanks,
nm



Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-07-07 Thread Tatsuo Ishii
>> Tatsuo Ishii  writes:
 So far as I can see, the only LCPRVn marker code that is actually in
 use right now is 0x9d --- there are no instances of 9a, 9b, or 9c
 that I can find.
 
 I also read in the xemacs internals doc, at
 http://www.xemacs.org/Documentation/21.5/html/internals_26.html#SEC145
 that XEmacs thinks the marker code for private single-byte charsets
 is 0x9e (only) and that for private multi-byte charsets is 0x9f (only);
 moreover they think 0x9a-0x9d are potential future official multibyte
 charset codes.  I don't know how we got to the current state of using
 0x9a-0x9d as private charset markers, but it seems pretty inconsistent
 with XEmacs.
>> 
>>> At the time when mule internal code was introduced to PostgreSQL,
>>> xemacs did not have multi-encoding capability and mule (a patch to
>>> emacs) was the only implementation allowed to use multi encoding. So I
>>> used the specification of mule documented in the URL I wrote.
>> 
>> I see.  Given that upstream has decided that a simpler definition is
>> more appropriate, is there any reason not to follow their lead, to the
>> extent that we can do so without breaking existing on-disk data?
> 
> Please let me spend the weekend to understand their latest spec.

This is an intermediate report on the internal multi-byte charset
implementation of emacen. I have read the link Tom showed. Also I made
a quick scan on xemacs-21.4.0 source code, especially
xemacs-21.4.0/src/mule-charset.h. It seems the web document is
essentially a copy of the comments in the file. Also I looked into
other places in the xemacs code, and I think I can conclude that xemacs
21.4's multi-byte implementation is based on the doc on the web.

Next I looked into emacs 24.1 source code because I could not find any
doc regarding emacs's (not xemacs's) implementation of internal
multi-byte charsets. I found the following in src/charset.h:

/* Leading-code followed by extended leading-code.  DIMENSION/COLUMN */
#define EMACS_MULE_LEADING_CODE_PRIVATE_11  0x9A /* 1/1 */
#define EMACS_MULE_LEADING_CODE_PRIVATE_12  0x9B /* 1/2 */
#define EMACS_MULE_LEADING_CODE_PRIVATE_21  0x9C /* 2/1 */
#define EMACS_MULE_LEADING_CODE_PRIVATE_22  0x9D /* 2/2 */

And these are used like this:

/* Read one non-ASCII character from INSTREAM.  The character is
   encoded in `emacs-mule' and the first byte is already read in
   C.  */

static int
read_emacs_mule_char (int c, int (*readbyte) (int, Lisp_Object), Lisp_Object 
readcharfun)
{
:
:
  else if (len == 3)
{
  if (buf[0] == EMACS_MULE_LEADING_CODE_PRIVATE_11
  || buf[0] == EMACS_MULE_LEADING_CODE_PRIVATE_12)
{
  charset = CHARSET_FROM_ID (emacs_mule_charset[buf[1]]);
  code = buf[2] & 0x7F;
}

As far as I can tell, this is exactly the same way PostgreSQL
handles private single-byte character sets: they consist of 3 bytes, and
the leading byte is either 0x9a or 0x9b. Other examples of single-byte
and multi-byte private charsets can be seen in coding.c.
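That decoding scheme can be re-expressed outside C as well. The following is a toy sketch of the quoted read_emacs_mule_char logic for the private single-byte case, not PostgreSQL or emacs code:

```python
# Leading codes for private single-byte charsets in the emacs-mule
# encoding, copied from the emacs 24.1 src/charset.h excerpt above.
LEADING_CODE_PRIVATE_11 = 0x9A
LEADING_CODE_PRIVATE_12 = 0x9B

def decode_private_single_byte(buf: bytes):
    """Decode one 3-byte private-charset character: leading byte 0x9a
    or 0x9b, then a charset id byte, then the code with the high bit
    masked off -- mirroring the emacs logic quoted above."""
    if len(buf) != 3:
        raise ValueError("private single-byte charset chars are 3 bytes")
    if buf[0] not in (LEADING_CODE_PRIVATE_11, LEADING_CODE_PRIVATE_12):
        raise ValueError("not a private single-byte leading code")
    charset_id = buf[1]
    code = buf[2] & 0x7F   # same masking as `buf[2] & 0x7F` in emacs
    return charset_id, code

print(decode_private_single_byte(bytes([0x9A, 0x41, 0xC1])))  # (65, 65)
```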

As far as I can tell, it seems emacs and xemacs employ different
implementations of multi-byte charsets regarding "private"
charsets. Emacs's is the same as PostgreSQL's, while xemacs's is not.
I am contacting the original Mule author to see if he knows anything
about this.

BTW, while looking into emacs's source code, I found their charset
definitions are in lisp/international/mule-conf.el. According to that
file, several new charsets have been added. Included is a patch to
follow their changes. This makes no change to current behavior, since
the patch just touches some comments and unsupported charsets.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index 1bcdfbc..e44749e 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -108,7 +108,7 @@ typedef unsigned int pg_wchar;
 #define LC_KOI8_R			0x8b	/* Cyrillic KOI8-R */
 #define LC_ISO8859_5		0x8c	/* ISO8859 Cyrillic */
 #define LC_ISO8859_9		0x8d	/* ISO8859 Latin 5 (not supported yet) */
-/* #define FREE0x8e	free (unused) */
+#define LC_ISO8859_15		0x8e	/* ISO8859 Latin 15 (not supported yet) */
 /* #define CONTROL_1		0x8f	control characters (unused) */
 
 /* Is a leading byte for "official" single byte encodings? */
@@ -119,14 +119,13 @@ typedef unsigned int pg_wchar;
  * 0x9a-0x9d are free. 0x9e and 0x9f are reserved.
  */
 #define LC_JISX0208_1978	0x90	/* Japanese Kanji, old JIS (not supported) */
-/* #define FREE0x90	free (unused) */
 #define LC_GB2312_80		0x91	/* Chinese */
 #define LC_JISX0208			0x92	/* Japanese Kanji (JIS X 0208) */
 #define LC_KS5601			0x93	/* Korean */
 #define LC_JISX0212			0x94	/* Japanese Kanji (JIS X 0212) */
 #define LC_CNS11643_1		0x95	/* CNS 11643-1992 Plane 1 */
 #define LC_CNS11643_2		0x96	/* CNS 11643-1992 Plane 2 */
-/* #define FREE0x97	free (unused) */

Re: [HACKERS] New statistics for WAL buffer dirty writes

2012-07-07 Thread Satoshi Nagayasu
Hi,

Jeff Janes has pointed out that my previous patch could count the
dirty writes only in a single local backend, not across the whole
cluster, because the counter was allocated in local process memory.

That's true, and I have fixed it by moving the counter into shared
memory, as a member of XLogCtlWrite, to keep the total dirty writes
for the cluster.

Regards,

2012/07/07 21:00, Satoshi Nagayasu wrote:
> Hi all,
> 
> I've created new patch to get/reset statistics of WAL buffer
> writes (flushes) caused by WAL buffer full.
> 
> This patch provides two new functions, pg_stat_get_xlog_dirty_write()
> and pg_stat_reset_xlog_dirty_write(), which have been designed to
> determine an appropriate value for WAL buffer size.
> 
> If this counter is increasing in the production environment,
> it would mean that the WAL buffer size is too small to hold
> the xlog records generated by the transactions. So, you can increase
> your WAL buffer size to keep xlog records and to reduce WAL writes.
> 
> I think this patch would not affect WAL write performance,
> but I am still paying attention to it.
> 
> Any comments or suggestions?
> 
> Regards,
> 
> ---
> [snaga@devvm03 src]$ psql -p 15432 postgres
> psql (9.3devel)
> Type "help" for help.
> 
> postgres=# SELECT pg_stat_get_xlog_dirty_write();
>   pg_stat_get_xlog_dirty_write
> --
>  0
> (1 row)
> 
> postgres=# \q
> [snaga@devvm03 src]$ pgbench -p 15432 -s 10 -c 32 -t 1000 postgres
> Scale option ignored, using pgbench_branches table count = 10
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 10
> query mode: simple
> number of clients: 32
> number of threads: 1
> number of transactions per client: 1000
> number of transactions actually processed: 32000/32000
> tps = 141.937738 (including connections establishing)
> tps = 142.123457 (excluding connections establishing)
> [snaga@devvm03 src]$ psql -p 15432 postgres
> psql (9.3devel)
> Type "help" for help.
> 
> postgres=# SELECT pg_stat_get_xlog_dirty_write();
>   pg_stat_get_xlog_dirty_write
> --
>  0
> (1 row)
> 
> postgres=# begin;
> BEGIN
> postgres=# DELETE FROM pgbench_accounts;
> DELETE 100
> postgres=# commit;
> COMMIT
> postgres=# SELECT pg_stat_get_xlog_dirty_write();
>   pg_stat_get_xlog_dirty_write
> --
>  19229
> (1 row)
> 
> postgres=# SELECT pg_stat_reset_xlog_dirty_write();
>   pg_stat_reset_xlog_dirty_write
> 
> 
> (1 row)
> 
> postgres=# SELECT pg_stat_get_xlog_dirty_write();
>   pg_stat_get_xlog_dirty_write
> --
>  0
> (1 row)
> 
> postgres=# \q
> [snaga@devvm03 src]$
> ---
> 
> 


-- 
Satoshi Nagayasu 
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 642c129..893acf8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -370,6 +370,11 @@ typedef struct XLogCtlWrite
 {
int curridx;/* cache index of next 
block to write */
pg_time_t   lastSegSwitchTime;  /* time of last xlog 
segment switch */
+
+   /*
+* Counter for WAL dirty buffer writes.
+*/
+   uint64  WalBufferWriteDirtyCount;
 } XLogCtlWrite;
 
 /*
@@ -1504,6 +1509,8 @@ AdvanceXLInsertBuffer(bool new_segment)
}
else
{
+   XLogCtlWrite *Write = &XLogCtl->Write;
+
/*
 * Have to write buffers while holding insert 
lock. This is
 * not good, so only write as much as we 
absolutely must.
@@ -1512,6 +1519,10 @@ AdvanceXLInsertBuffer(bool new_segment)
WriteRqst.Write = OldPageRqstPtr;
WriteRqst.Flush = 0;
XLogWrite(WriteRqst, false, false);
+   /*
+    * XLogCtlWrite must be protected with WALWriteLock.
+    */
+   Write->WalBufferWriteDirtyCount++;
LWLockRelease(WALWriteLock);
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
}
@@ -10492,3 +10503,26 @@ SetWalWriterSleeping(bool sleeping)
xlogctl->WalWriterSleeping = sleeping;
SpinLockRelease(&xlogctl->info_lck);
 }
+
+uint64
+xlog_dirty_write_counter_get()
+{
+   XLogCtlWrite *Write = &XLogCtl->Write;
+   uint64 count;
+
+   LWLockAcquire(WALWriteLock, LW_SHARE

Re: [HACKERS] pg_prewarm(some more observations in patch)

2012-07-07 Thread Amit kapila


From: pgsql-hackers-ow...@postgresql.org [pgsql-hackers-ow...@postgresql.org] 
on behalf of Jeff Janes [jeff.ja...@gmail.com]
>For the implementation:

>1)
>I think that for most users, making them know or care about forks and
> block numbers is too much.  It would be nice if there were a
> single-argument form:  pg_prewarm(relation) which loaded all of either
> main, or all of all forks, using 'buffer'.  This seems like a good
> default.  Also, the last two arguments are NULL in all the given
> examples.  Do we expect those to be used only for experimental
> purposes by hackers, or are those of general interest?
I agree with you.
Two forms of the function could exist: one with only one argument, and
another with the arguments as specified in your interface. This can
serve the needs of all kinds of users.
In the first form there should be defaults for all the other values.

1. For the prewarm type (prefetch, read, buffer), it would be better if it
   were either an enum or a case-insensitive string. That would be more
   convenient from the user's perspective.

2. 
+ if (PG_ARGISNULL(4))
+ last_block = nblocks - 1;
+ else
+ {
+ last_block = PG_GETARG_INT64(4);
+ if (last_block > nblocks)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("ending block number " INT64_FORMAT " exceeds number of blocks in relation " INT64_FORMAT, last_block, nblocks)));
+ }

It could add an additional check: if the last block number is less than
the first block number, report a meaningful error.
   As it is, in such a case it will not load any buffers and returns
successfully.
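The suggested extra check might look like this in sketch form (illustrative names and error wording, not pg_prewarm's actual code):

```python
def validate_block_range(first_block, last_block, nblocks):
    """Validate a prewarm block range against the relation size.

    Sketch of the argument handling discussed above: default the end of
    the range to the last block, reject out-of-range endings, and --
    the suggested addition -- reject an ending block that precedes the
    starting block instead of silently loading nothing.
    """
    if last_block is None:
        last_block = nblocks - 1
    if last_block >= nblocks:
        raise ValueError(
            f"ending block {last_block} exceeds number of blocks in relation {nblocks}")
    if last_block < first_block:
        raise ValueError(
            f"ending block {last_block} precedes starting block {first_block}")
    return first_block, last_block

print(validate_block_range(0, None, 100))  # (0, 99)
```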

3. + else if (ptype == PREWARM_READ)
+ {
+  /*
+   * In read mode, we actually read the blocks, but not into shared
+   * buffers.  This is more portable than prefetch mode (it works
+   * everywhere) and is synchronous.
+   */
+  RelationOpenSmgr(rel);

   Is it required to call RelationOpenSmgr(rel) here, since it is already
done at the beginning?

With Regards,
Amit Kapila.