Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-15 Thread Fujii Masao
On Fri, Mar 12, 2010 at 8:13 PM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Mar 10, 2010 at 10:09, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php
 On win32, the blocking libpq functions like PQconnectdb() and
 PQexec() are uninterruptible since they use the vanilla select()
 instead of our signal emulation layer compatible select().
 Nevertheless, currently walreceiver uses them to establish a
 connection, send a handshake message and wait for the reply.
 So walreceiver also becomes uninterruptible for a while. This
 is a must-fix problem for 9.0.

 I replaced the blocking libpq functions currently used with
 asynchronous ones, and used the emulated version of select()
 to wait, to make walreceiver interruptible. Here is the patch.

 These are issues that affect other things running libpq in the backend
 as well, right? Such as dblink?

Yes. So Heikki wrote the patch for dblink.
http://archives.postgresql.org/pgsql-hackers/2010-01/msg02072.php

 Perhaps we can factor out most of this
 into functions in backend/port/win32 so that we can re-use it from
 there?

Sorry. I couldn't get your point. Could you explain it in detail?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting to beta1

2010-03-15 Thread Fujii Masao
On Sat, Mar 13, 2010 at 12:28 PM, Bruce Momjian br...@momjian.us wrote:
 Where are we in getting to beta1?  I know people are looking to me for
 9.0 release notes and I will have them done in about a week, but what
 about open issues?  I don't see many on the main 9.0 open items page:

        http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items#Bugs

 The list has been reduced greatly in the past week.  What about HS/SR
 open items?

I think that at least the following item should be addressed before beta1
since it's a serious problem.

* Walreceiver is not interruptible on win32
  http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php

I've already submitted the patch, and am waiting for the review.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-15 Thread Magnus Hagander
On Mon, Mar 15, 2010 at 10:14, Fujii Masao masao.fu...@gmail.com wrote:
 On Fri, Mar 12, 2010 at 8:13 PM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Mar 10, 2010 at 10:09, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 http://archives.postgresql.org/pgsql-hackers/2010-01/msg01672.php
 On win32, the blocking libpq functions like PQconnectdb() and
 PQexec() are uninterruptible since they use the vanilla select()
 instead of our signal emulation layer compatible select().
 Nevertheless, currently walreceiver uses them to establish a
 connection, send a handshake message and wait for the reply.
 So walreceiver also becomes uninterruptible for a while. This
 is a must-fix problem for 9.0.

 I replaced the blocking libpq functions currently used with
 asynchronous ones, and used the emulated version of select()
 to wait, to make walreceiver interruptible. Here is the patch.

 These are issues that affect other things running libpq in the backend
 as well, right? Such as dblink?

 Yes. So Heikki wrote the patch for dblink.
 http://archives.postgresql.org/pgsql-hackers/2010-01/msg02072.php

IIRC that was never applied.


 Perhaps we can factor out most of this
 into functions in backend/port/win32 so that we can re-use it from
 there?

 Sorry. I couldn't get your point. Could you explain it in detail?

What I'm referring to is the part that Heikki writes as "The
implementation should be shared between the two, but I'm not sure
how". I think we should try to factor out things that can be shared
into separate functions and stick those in port/win32 (assuming
they're win32-specific, otherwise, in another suitable location), and
then call them from both. There seems to be a lot of things that
should be doable that way.

I notice for example that the dblink patch doesn't have the code for
timeout handling - shouldn't it?

I think we need to look at this as a single problem needing to be
solved, and then have the same solution applied to dblink and
walreceiver.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Dimitri Fontaine
David E. Wheeler da...@kineticode.com writes:
 On Mar 14, 2010, at 3:38 PM, Josh Berkus wrote:

 I'm planning on writing a Guide to HS & SR for the beta.  Originally I
 planned to put this in the main docs, but I couldn't figure out how to
 fit it in there structurally.  Plus, it needs more examples, output
 samples, and a tutorial feel.

 Perhaps a tutorial could go under Server Administration? Or perhaps
 under Tutorial even? It would be section I.4.

+1 for having a Tutorial chapter about setting up archiving, PITR and
replication. While at it, let's read the release notes and list other
points that the tutorial could cover too.

Do we need some more sections in Chapter 3. Advanced Features, dealing
with Exclusion Constraint, the new DO command, how to manage privileges
in realistic examples (a superuser role shared between 2 DBAs, the
database owner, and the application which is not allowed DDL, for
example)?

A psql chapter/section would maybe fit too, with tricks such as \o then
query the catalog then \i, ON_ERROR_{STOP,ROLLBACK}, -1, -v, PGOPTIONS,
etc, like Peter did in his last blog entry.

Maybe some more admin level tutorial would be great to have too, such as
how to find what's locking, how to monitor table and index usage to
determine which indexes to drop, which to create, how to monitor
things (slaves lag, hitratio, transactions, I/U/D activity, you name
it).

A lot of things are described in the manual and provided in munin or
nagios plugins already, but still the Tutorial looks like a good place
to give the recipes, ready-to-go queries etc.

Regards,
-- 
dim



Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-15 Thread Fujii Masao
On Mon, Mar 15, 2010 at 6:42 PM, Magnus Hagander mag...@hagander.net wrote:
 Perhaps we can factor out most of this
 into functions in backend/port/win32 so that we can re-use it from
 there?

 Sorry. I couldn't get your point. Could you explain it in detail?

 What I'm referring to is the part that Heikki writes as "The
 implementation should be shared between the two, but I'm not sure
 how". I think we should try to factor out things that can be shared
 into separate functions and stick those in port/win32 (assuming
 they're win32-specific, otherwise, in another suitable location), and
 then call them from both. There seems to be a lot of things that
 should be doable that way.

 I notice for example that the dblink patch doesn't have the code for
 timeout handling - shouldn't it?

 I think we need to look at this as a single problem needing to be
 solved, and then have the same solution applied to dblink and
 walreceiver.

Thanks for the explanation. I agree that the code should be shared,
but I'm not sure how, either.

Something like libpq_select() which waits for the socket to become
ready would be required for walreceiver and dblink. But it's necessary
for walreceiver not only on win32 but also on other platforms, so some
functions might need to be placed somewhere other than port/win32.
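
A minimal sketch of the kind of wait loop a libpq_select()-style helper
might use. This is not the submitted patch: the real code would be C in
the backend, waiting on the descriptor returned by PQsocket() with the
signal-emulation-aware select(). Python is used here only for brevity,
and wait_readable(), its return strings and the slice length are
hypothetical. The idea shown is to sleep in short slices and check an
interrupt flag between slices, while still honouring an overall timeout.

```python
import select
import socket
import threading
import time


def wait_readable(sock, timeout, interrupted, slice_s=0.1):
    """Wait until sock is readable, the timeout expires, or interrupted is set.

    Hypothetical helper: returns "ready", "timeout" or "interrupted".
    """
    deadline = time.monotonic() + timeout
    while True:
        # Check for a pending interrupt before every short sleep.
        if interrupted.is_set():
            return "interrupted"
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "timeout"
        # Sleep in short slices so an interrupt is noticed promptly.
        readable, _, _ = select.select([sock], [], [], min(slice_s, remaining))
        if readable:
            return "ready"


if __name__ == "__main__":
    a, b = socket.socketpair()
    stop = threading.Event()
    print(wait_readable(a, 0.3, stop))  # nothing has been sent yet
    b.send(b"x")
    print(wait_readable(a, 1.0, stop))  # data is pending on the socket
    stop.set()
    print(wait_readable(a, 1.0, stop))  # the interrupt flag is checked first
    a.close()
    b.close()
```

Because the same loop serves both the connection handshake and query
replies, a helper of this shape could in principle be shared by
walreceiver and dblink, which is the point under discussion.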

I'll think about this issue for a while.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Greg Smith

Dimitri Fontaine wrote:

Maybe some more admin level tutorial would be great to have too, such as
how to find what's locking, how to monitor table and index usage to
determine which indexes to drop, which to create, how to monitor
things (slaves lag, hitratio, transactions, I/U/D activity, you name
it).
  


Wow, that's at least one order of magnitude more ambitious than the 
actual scope of work on the docs that should be getting focused on for 
beta right now, perhaps two.  Regardless, I already have stubs for the 
first couple of these sitting on the wiki at 
http://wiki.postgresql.org/wiki/Category:Administration (locks, 
monitoring).  I know I'd rather see work done on those, where we can 
continue to improve without doc commits and easily make things available 
for all versions, until that content is good.  Maybe then we can talk 
about merging some of that back into the main docs.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] Getting to beta1

2010-03-15 Thread Dimitri Fontaine
Greg Smith g...@2ndquadrant.com writes:

 Dimitri Fontaine wrote:
 Maybe some more admin level tutorial would be great to have too, such as
 how to find what's locking, how to monitor table and index usage to
 determine which indexes to drop, which to create, how to monitor
 things (slaves lag, hitratio, transactions, I/U/D activity, you name
 it).
   

 Wow, that's at least one order of magnitude more ambitious than the actual
 scope of work on the docs that should be getting focused on for beta right
 now, perhaps two.

Yes, I took the message as an opportunity to talk about how much stuff
we'd like to add in the tutorial, then I'll see about spending time on
it if core agrees with the need. There's no reason I'd want this to
happen pre-beta unless it's about hard to grasp things we want lots of
people to test. So we're in agreement here...

Maybe it's time to start another thread if people want to follow-up on
expanding our tutorial.

  Regardless, I already have stubs for the first couple of
 these sitting on the wiki at
 http://wiki.postgresql.org/wiki/Category:Administration (locks, monitoring).
 I know I'd rather see work done on those, where we can continue to improve
 without doc commits and easily make things available for all versions, until
 that content is good.  Maybe then we can talk about merging some of that
 back into the main docs.

Some kind of canonical reference on how to use the catalogs and system
view to realise common tasks does not seem out of place in the tutorial
for me.

As far as using the wiki to prepare the content, +1.
-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Merlin Moncure
On Sat, Mar 13, 2010 at 1:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder if it could work to treat the result of a record->fieldname
 operator as being of UNKNOWN type initially, and resolve its actual
 type in the parser in the same way we do for undecorated literals
 and parameters, to wit
        * you can explicitly cast it, viz
                (record->fieldname)::bigint
        * you can let it be inferred from context, such as the type
          of whatever it's compared to
        * throw error if type is not inferrable
 Then at runtime, if the actual type of the field turns out to not be
 what the parser inferred, either throw error or attempt a run-time
 type coercion.  Throwing error seems safer, because it would avoid
 surprises of both semantic (unexpected behavior) and performance
 (expensive conversion you weren't expecting to happen) varieties.
 But possibly an automatic coercion would be useful enough to justify
 those risks.

the casting rules are completely reasonable.  Throwing an error seems
like a better choice.  Better to be strict now and relax the rules
later.  record->fieldname takes a string (possibly a variable)?  If
so, this would nail the problem.  This would work with run time typed
records (new, etc)?

merlin



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Andrew Dunstan



Merlin Moncure wrote:
record->fieldname takes a string (possibly a variable)?


If it doesn't we have a communication problem. :-)

If so, this would nail the problem.


Not quite, but close. We also need a nice way of querying for field 
names (at least) at run time. I've seen that requested several times.



This would work with run time typed
records (new, etc)?

  


Again, if it doesn't we have a communication problem.

cheers

andrew



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 A lot of things are described in the manual and provided in munin or
 nagios plugins already, but still the Tutorial looks like a good place
 to give the recipes, ready-to-go queries etc.

This sounds like a pretty horrid idea.  The tutorial is meant to be read
first, so it cannot depend on having already read any of the main
documentation.  If we try to fill it with hints and tricks then either
it will be completely unintelligible to newbies, or there will be a
staggering amount of material duplicated from the main docs to support
the hints.  The latter approach would be no fun to write and even less
fun to maintain in the future.

There might well be a use for a separate hints-and-tricks document.
I don't agree with stuffing it into the existing tutorial though.

regards, tom lane



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 This sounds like a pretty horrid idea.  The tutorial is meant to be read
 first, so it cannot depend on having already read any of the main
 documentation.  If we try to fill it with hints and tricks then either
 it will be completely unintelligible to newbies, or there will be a
 staggering amount of material duplicated from the main docs to support
 the hints.  The latter approach would be no fun to write and even less
 fun to maintain in the future.

I'm not sure how much your comment applies to what I picture, so here's
what I had in mind:

  There's a system view called pg_stat_activity which is maintained
  up-to-date by PostgreSQL, and you can query it to gather information
  about running queries at any time. Another such view is pg_locks,
  which reports about current lock requests and whether they're granted
  or not.

  You can use both those system views to get important information on
  your running system. To show the text of queries currently waiting
  for a lock, here's how you join those views:

insert recipe from the link
http://wiki.postgresql.org/wiki/Lock_Monitoring

  To produce a locking situation, let's open two concurrent connections
  to the database, either with psql in two different terminals, or using
  pgadmin. (Now, instructions for 2 concurrent UPDATEs on the same
  row.) Use the previous query to see the locked query, commit the first
  session to unlock it.

As the concepts of SELECT, view and JOINs are already addressed in the
tutorial, I'd think it could be ok. Now, maybe the tutorial isn't the
right place to be confronted with MVCC, locks, and system monitoring, but
some kind of soft introduction would be good here, methinks.

 There might well be a use for a separate hints-and-tricks document.
 I don't agree with stuffing it into the existing tutorial though.

Yeah, maybe expanding current chapter 27. Monitoring Database Activity
would be a better idea?

  http://developer.postgresql.org/pgdocs/postgres/monitoring.html

In fact, a merge of chapters 27 and 28 Monitoring Disk Usage into a
larger one about Monitoring PostgreSQL could be a better fit?

Regards,
-- 
dim



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Merlin Moncure
On Mon, Mar 15, 2010 at 10:02 AM, Andrew Dunstan and...@dunslane.net wrote:
 Not quite, but close. We also need a nice way of querying for field names
 (at least) at run time. I've seen that requested several times.

ok. just making sure we were on the same page. wasn't there a
technical objection to querying the fields at runtime?  If not, maybe
you could get by with something like:

Integer variant of operator pulls fields by index
sometype v := recvar->3;

integer n := nfields(recordtype);

text[] fields := fieldnames(recordtype);

text fieldname := fieldname(recordtype, 3);
int fieldpos := fieldpos(recordtype, 'a_field');

OK, from archives (Tom wrote) quoting:
So, inventing syntax at will, what you're imagining is something like

   modified := false;
   for name in names(NEW) loop
   -- ignore modified_timestamp
   continue if name = 'modified_timestamp';
   -- check all other columns
   if NEW.{name} is distinct from OLD.{name} then
   modified := true;
   exit;
   end if;
   end loop;
   if modified then ...

While this is perhaps doable, the performance would take your breath
away ... and I don't mean that in a positive sense.  The only way we
could implement that in plpgsql as it stands would be that every
single execution of the IF would involve a parse/plan cycle for the
$1 IS DISTINCT FROM $2 expression.  At best we would avoid a replan
when successive executions had the same datatypes for the tested
columns (ie, adjacent columns in the table have the same types).
Which would happen some of the time, but the cost of the replans would
still be enough to sink you.
/end quote

does the parse/plan objection still hold?

merlin



Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-15 Thread Heikki Linnakangas
Fujii Masao wrote:
 On Mon, Mar 15, 2010 at 6:42 PM, Magnus Hagander mag...@hagander.net wrote:
 I think we need to look at this as a single problem needing to be
 solved, and then have the same solution applied to dblink and
 walreceiver.

Agreed.

 Something like libpq_select() which waits for the socket to become
 ready would be required for walreceiver and dblink. But it's necessary
 for walreceiver on not only win32 but also the other, ...

Really, why? I thought this was a purely Windows-specific problem.

Just replacing PQexec() with PQsendQuery() is pretty straightforward, we
could put that replacement in a file in port/win32. Replacing
PQconnectdb() is more complicated because you need to handle connection
timeout. I suggest that we only add the replacement for PQexec(), and
live with the situation for PQconnectdb(), that covers 99% of the
scenarios anyway.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Mar 15, 2010 at 10:02 AM, Andrew Dunstan and...@dunslane.net wrote:
 Not quite, but close. We also need a nice way of querying for field names
 (at least) at run time. I've seen that requested several times.

 does the parse/plan objection still hold?

Yeah.  Providing the field names isn't the dubious part --- the dubious
part is what are you going to *do* with them.  It's difficult to see
applications in which you can make the simplifying assumption that the
actual field datatypes are known/fixed.  Using field numbers instead of
names doesn't get you out from under that.  (Though I like the idea
insofar as it simplifies the looping mechanism.)

If we make the implementation be such that (rec->field)::foo forces
a runtime cast to foo (rather than throwing an error if it's not type
foo already), then it's possible to suppose that this sort of application
could be catered to by forcing all the fields to text, or some other
generic datatype.  This at least puts the text dependency out where the
user can see it, though it still seems rather inelegant.  It also takes
away possible error detection in other circumstances where a forced cast
isn't really wanted.

The cost of looking up the ever-changing cast function could still be
unpleasant, although I think we could hide it in the executor expression
node instead of forcing a whole new parse/plan cycle each time.

regards, tom lane



[HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Greg Smith
Just noticed a problem popping up sometimes with the new multi-threaded 
pgbench.  This is on a Linux RPM build (the alpha4 set) compiled with 
'--disable-thread-safety'.  Still trying to nail down whether that's a 
requirement for this problem to appear or not.  I did most of my review 
of this feature with it turned on, and haven't been seeing this problem 
on other systems that are thread safe.  Not sure yet if that's cause and 
effect or coincidence.


Here's a sample invocation that produces ragged output for me on my one 
system:


pgbench -S -T 5 -c 4 -j 4 -l pgbench

The log file produced by this (pgbench_log.pid) is supposed to consist 
of a series of lines in the following format:


client,trans,latency,filenum,sec,usec

It looks like the switch between clients running on separate workers can 
lead to a mix of their respective lines showing up though.  Here's a 
couple of typical samples, with the bad line in the middle of each set:


1 138 178 0 1268665788 607559
1 139 182 0 1268665788 607751
1 1402 0 2491 0 1268665788 586135
2 1 264 0 1268665788 586463
2 2 192 0 1268665788 586665

1 274 160 0 1268665788 632966
1 275 178 0 1268665788 633154
1 276 184 0 126866578 178 0 1268665788 614015
2 141 190 0 1268665788 614252
2 142 169 0 1268665788 614430

2 274 178 0 1268665788 639218
2 275 175 0 1268665788 639402
2 276 169 0 126866578 171 0 1268665788 626933
0 141 185 0 1268665788 627165
0 142 202 0 1268665788 627377

Looks like sometimes a client is only getting part of its line written 
out before getting stomped on by the next one.  I think one of the 
assumptions being made about how to safely write to this log file may be 
broken by the multi-process implementation, which is what you get when 
thread-safety is not available.


Since there should only be 6 fields here, I think you can find whether a 
given log file has this problem or not like this:


cat pgbench_log.x | cut -d " " -f 7 | sort | uniq

If anything comes out of that, the latency log file has at least one bad 
line in it.
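
The same check can be sketched in Python, under the assumption stated
above that every well-formed line has exactly 6 whitespace-separated
fields. The function name bad_lines() is hypothetical; the sample lines
are taken from the first excerpt above. Note that a ragged line could by
coincidence still have 6 fields, so this finds only some of the damage.

```python
def bad_lines(lines, expected_fields=6):
    """Return (line_number, line) pairs whose field count is not expected_fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Skip blank lines; flag anything with too few or too many fields.
        if fields and len(fields) != expected_fields:
            bad.append((number, line.rstrip("\n")))
    return bad


sample = [
    "1 138 178 0 1268665788 607559",
    "1 139 182 0 1268665788 607751",
    "1 1402 0 2491 0 1268665788 586135",
    "2 1 264 0 1268665788 586463",
]
print(bad_lines(sample))  # prints [(3, '1 1402 0 2491 0 1268665788 586135')]
```

To run it over a real log, pass an open file object instead of the
sample list, since iterating a file yields its lines.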


Similarly, this:

cat pgbench_log.x | cut -d " " -f 1 | sort | uniq

Should only show the client numbers; here there's some first columns 
with much bigger numbers too.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-15 Thread Joe Conway
On 03/15/2010 02:42 AM, Magnus Hagander wrote:
 
 I think we need to look at this as a single problem needing to be
 solved, and then have the same solution applied to dblink and
 walreceiver.
 

+1

Joe





Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Andrew Dunstan



Greg Smith wrote:
Just noticed a problem popping up sometimes with the new 
multi-threaded pgbench.  This is on a Linux RPM build (the alpha4 set) 
compiled with '--disable-thread-safety'.  Still trying to nail down 
whether that's a requirement for this problem to appear or not.  I did 
most of my review of this feature with it turned on, and haven't been 
seeing this problem on other systems that are thread safe.  Not sure 
yet if that's cause and effect or coincidence.





We had to turn handsprings to prevent this sort of effect with the 
logging collector, which was a requirement of being able to implement 
CSV logging sanely. So I'm not surprised by this report.


cheers

andrew



Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 Looks like sometimes a client is only getting part of its line written 
 out before getting stomped on by the next one.  I think one of the 
 assumptions being made about how to safely write to this log file may be 
 broken by the multi-process implementation, which is what you get when 
 thread-safety is not available.

pgbench doesn't make any effort at all to avoid interleaved writes on
that file.  I don't think there is anything much that can be done about
it when you are using the forked-processes implementation.  It's
probably possible for it to show up on the multi-threads version too,
depending on how hard libc tries to interlock stdio calls.
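
A toy model of one plausible mechanism, offered as an assumption rather
than a confirmed diagnosis: each process buffers its own output and
flushes when the buffer fills, not at line boundaries, while all
processes append to the same file. The function names and the 40-byte
buffer are hypothetical (a real stdio buffer is much larger), and the two
log lines are taken from Greg's sample.

```python
def flush_chunks(data, buffer_size):
    """Split one writer's output the way a full-buffer flush would."""
    return [data[i:i + buffer_size] for i in range(0, len(data), buffer_size)]


def interleave(writer_a, writer_b):
    """Append the writers' flushes alternately, as if sharing one file offset."""
    merged = []
    for i in range(max(len(writer_a), len(writer_b))):
        merged.extend(w[i] for w in (writer_a, writer_b) if i < len(w))
    return "".join(merged)


line_a = "1 138 178 0 1268665788 607559\n"
line_b = "2 1 264 0 1268665788 586463\n"

# Each writer logs two lines; a flush happens every 40 bytes in this model.
merged = interleave(flush_chunks(line_a * 2, 40), flush_chunks(line_b * 2, 40))

# Any line without exactly 6 fields was torn by a mid-line flush.
ragged = [line for line in merged.splitlines() if len(line.split()) != 6]
print(ragged)
```

In this model no bytes are lost; lines are only cut and spliced at flush
boundaries, which resembles the ragged lines in the report.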

regards, tom lane



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Merlin Moncure
On Mon, Mar 15, 2010 at 11:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If we make the implementation be such that (rec->field)::foo forces
a runtime cast to foo (rather than throwing an error if it's not type
foo already)

yeah...explicit cast should always do 'best effort'

 The cost of looking up the ever-changing cast function could still be
 unpleasant, although I think we could hide it in the executor expression
 node instead of forcing a whole new parse/plan cycle each time.

right. if you do that, it's still going to be faster than the
dyna-sql/information schema/perl hacks people are doing right now
(assuming they didn't give up and code it in the app).  This is rtti
for plpgsql, and functions that use it are going to have to be understood
as being slower and to be avoided if possible, like exception
handlers.  IMNSHO, this is a small price to pay.

merlin



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Mar 15, 2010 at 11:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If we make the implementation be such that (rec->field)::foo forces
 a runtime cast to foo (rather than throwing an error if it's not type
 foo already)

 yeah...explicit cast should always do 'best effort'

Probably so.  But is it worth inventing some other notation that says
"expect this field to be of type foo", with an error rather than runtime
cast if it's not?  If we go with treating the result of -> like UNKNOWN,
then you wouldn't need that in cases where the parser guesses the right
type.  But there are going to be cases where you need to override the
guess without necessarily wanting to buy into a forced conversion.

regards, tom lane



Re: [HACKERS] Dyamic updates of NEW with pl/pgsql

2010-03-15 Thread Merlin Moncure
On Mon, Mar 15, 2010 at 12:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Mar 15, 2010 at 11:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If we make the implementation be such that (rec->field)::foo forces
 a runtime cast to foo (rather than throwing an error if it's not type
 foo already)

 yeah...explicit cast should always do 'best effort'

 Probably so.  But is it worth inventing some other notation that says
 "expect this field to be of type foo", with an error rather than runtime
 cast if it's not?  If we go with treating the result of -> like UNKNOWN,
 then you wouldn't need that in cases where the parser guesses the right
 type.  But there are going to be cases where you need to override the
 guess without necessarily wanting to buy into a forced conversion.

Maybe. That behaves like oid vector to PQexecParams, right?  Suggests
a type but does not perform a cast.  I see your point but I think it's
going to go over the heads of most people...type association vs type
coercion.  Maybe instead you could just supply a typeof function in
order to provide very rigorous checking when wanted and presumably
allow things like pointing the assignment at a special field.

merlin



[HACKERS] WIP: simple allocator

2010-03-15 Thread Pavel Stehule
Hello,

this patch significantly reduces memory usage of ispell dictionaries.

without patch (Czech dictionary, 64bit linux):

cspell: 48816784 total in 5930 blocks; 89496 free (1587 chunks); 48727288 used
  Ispell dictionary init context: 19226672 total in 12 blocks; 1742624
free (34 chunks); 17484048 used

final:
cspell: 48816784 total in 5930 blocks; 89496 free (1587 chunks); 48727288 used

46.5 MB (long mem) + 18.3 MB (short mem)

with patch:

cspell: 893584 total in 80 blocks; 11760 free (10 chunks); 881824 used
  Ispell dictionary simple context: 24647364 total in 188 blocks;
122464 free; 24524900 used
  Ispell dictionary init context: 2482224 total in 3 blocks; 24512
free (34 chunks); 2457712 used
Ispell dictionary simple init context: 8259489 total in 63 blocks;
59570 free; 8199919 used
final
cspell: 893584 total in 80 blocks; 11760 free (10 chunks); 881824 used
  Ispell dictionary simple context: 24647364 total in 188 blocks;
122464 free; 24524900 used

0.85 + 23.5 MB (long mem) + 2.4 + 7.9 MB (short mem)
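
The summary figures can be re-derived from the byte counts reported
above. This is a small arithmetic check, assuming "MB" here means 2**20
bytes (which matches the numbers given); the dictionary labels are
shorthand for the context names in the memory dumps.

```python
MIB = 1024 * 1024  # assumption: "MB" in the summary means 2**20 bytes


def mib(nbytes):
    """Convert a byte count to MiB."""
    return nbytes / MIB


without_patch = {
    "cspell (long)": 48816784,
    "init context (short)": 19226672,
}
with_patch = {
    "cspell (long)": 893584,
    "simple context (long)": 24647364,
    "init context (short)": 2482224,
    "simple init context (short)": 8259489,
}

for label, sizes in (("without patch", without_patch), ("with patch", with_patch)):
    for name, nbytes in sizes.items():
        print(f"{label:14} {name:28} {mib(nbytes):6.2f} MiB")

# Long-lived memory is what remains after dictionary initialization.
long_before = mib(without_patch["cspell (long)"])
long_after = mib(with_patch["cspell (long)"] + with_patch["simple context (long)"])
print(f"long-lived memory: {long_before:.2f} -> {long_after:.2f} MiB")
```

On these numbers the long-lived allocation drops to roughly half of its
previous size, and the short-lived initialization memory shrinks as well.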

Regards
Pavel Stehule


simple_alloc.diff
Description: Binary data



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Josh Berkus
On 3/15/10 5:47 AM, Dimitri Fontaine wrote:
 Maybe it's time to start another thread if people want to follow-up on
 expanding our tutorial.

Yes, and on pgsql-docs rather than on this mailing list.

Or ... J.F.D.I (Just F Do It).  That is, if someone contributed a
whole buncha new text to the tutorial on pgsql-docs, I can't imagine it
being rejected out of hand.

For my part, I plan to just write the tutorial in whatever tool makes it
easiest to write (likely Lyx, but maybe OOo).  Then people can discuss
what portions belong in the docs, or not.

--Josh Berkus



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Bruce Momjian
Josh Berkus wrote:
 Devs,
 
 Also, I would like to have a Beta or at least a new alpha release before
 April 3 for the test-fest, so that our volunteers aren't testing bugs
 which are already patched.

We can easily create another alpha by April 3.  I think the big question
is whether we can put out beta1 while we still have open HS/SR issues. 
My guess is no.  My other guess is that we will still have open HS/SR
issues on April 3.  So, putting those two guesses together, we will
create a new alpha by April 3 for you.  :-|

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do



Re: [HACKERS] how to use advanced gist options

2010-03-15 Thread Jeff Davis
On Sun, 2010-03-14 at 06:50 -0700, Sergej Galkin wrote:
 1) For example - can I delete entry in my picksplit procedure ?

No, entries are automatically removed by postgres; and only when the
underlying tuples in the table are removed (or they no longer match the
predicate of a partial index).

 2) Or to add logical conditions - when picksplit node ?  For example
 change default "when number of entries of node is more than XX, split
 node" - to "when number of entries whose element state is on is
 more than XX, split node" ?

No, GiST doesn't allow that kind of fine-grained control. It's meant to
be a level above those details.

You can actually write your own index access method and plug it into
PostgreSQL at runtime. This is substantially more difficult than using
GiST, of course. The other disadvantage is that there are (currently) no
hooks for WAL recovery, so a crash may require an index rebuild (btree,
gist, and gin are safe against this by using the WAL).

Regards,
Jeff Davis




Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Josh Berkus
On 3/15/10 8:41 AM, Greg Smith wrote:
 Just noticed a problem popping up sometimes with the new multi-threaded
 pgbench.  This is on a Linux RPM build (the alpha4 set) compiled with
 '--disable-thread-safety'.  Still trying to nail down whether that's a
 requirement for this problem to appear or not.  I did most of my review
 of this feature with it turned on, and haven't been seeing this problem
 on other systems that are thread safe.  Not sure yet if that's cause and
 effect or coincidence.

For my part, telling people that multi-thread pgbench doesn't work
correctly on systems which are not thread-safe seems perfectly OK.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



[HACKERS] Should we throw error when converting a nonexistent/ambiguous timestamp?

2010-03-15 Thread Tom Lane
It's DST transition season again, and that means that we're getting the
usual quota of questions from people who don't quite understand how
DST-related timestamp arithmetic works, and whose incorrect code seems
to work until exercised during a transition interval.  We've got this
one from a guy who got bit by converting a nonexistent local time:
http://archives.postgresql.org/pgsql-general/2010-03/msg00590.php
and last week we had one from an Aussie who was getting bit by the
behavior for ambiguous local times at the other end of the cycle:
http://archives.postgresql.org/pgsql-general/2010-03/msg00459.php

I'm starting to think that maybe we should throw an error in these cases
instead of silently doing something that's got a 50-50 chance of being
wrong.  I'm not sure if the "assume standard time" rule is standardized,
but I think it might be better if we dropped it.  Thoughts?
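
To make the two failure cases concrete, here is a minimal Python sketch (not PostgreSQL code) that classifies a local wall-clock time as nonexistent or ambiguous and raises an error instead of guessing. It assumes Python 3.9 or later with the standard zoneinfo module and an installed tz database. classify and to_utc_strict are hypothetical helper names, and America/New_York is only an example zone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def classify(local_naive, tz):
    """Return "ok", "nonexistent" or "ambiguous" for a naive local time."""
    # fold=0 and fold=1 select the two candidate UTC offsets around a
    # transition; away from transitions both give the same offset.
    first = local_naive.replace(tzinfo=tz, fold=0)
    second = local_naive.replace(tzinfo=tz, fold=1)
    if first.utcoffset() == second.utcoffset():
        return "ok"
    # A real (ambiguous) local time survives a round trip through UTC;
    # a time inside the spring-forward gap comes back shifted.
    round_trip = first.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) == local_naive:
        return "ambiguous"
    return "nonexistent"


def to_utc_strict(local_naive, tz):
    """Convert to UTC, raising instead of silently picking an offset."""
    kind = classify(local_naive, tz)
    if kind != "ok":
        raise ValueError(f"{local_naive} is {kind} in {tz.key}")
    return local_naive.replace(tzinfo=tz).astimezone(timezone.utc)


if __name__ == "__main__":
    ny = ZoneInfo("America/New_York")
    for ts in (
        datetime(2010, 3, 14, 2, 30),   # inside the spring-forward gap
        datetime(2010, 11, 7, 1, 30),   # occurs twice at fall-back
        datetime(2010, 3, 15, 12, 0),   # ordinary time
    ):
        print(ts, classify(ts, ny))
```

The strict variant trades a silent guess for an error the caller has to handle, which is the behavior change being asked about here.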

regards, tom lane



Re: [HACKERS] Getting to beta1

2010-03-15 Thread Robert Haas
On Mon, Mar 15, 2010 at 4:24 PM, Bruce Momjian br...@momjian.us wrote:
 We can easily create another alpha by April 3.  I think the big question
 is whether we can put out beta1 while we still have open HS/SR issues.
 My guess is no.  My other guess is that we will still have open HS/SR
 issues on April 3.  So, putting those two guesses together, we will
 create a new alpha by April 3 for you.  :-|

I think we need to do a better job defining exactly what we think the
"must fix" HS/SR issues are.  Otherwise I can see this process of
trying to get to beta dragging out almost indefinitely.

...Robert



Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Takahiro Itagaki

Greg Smith g...@2ndquadrant.com wrote:

 It looks like the switch between clients running on separate workers can 
 lead to a mix of their respective lines showing up though.

Oops. There might be two solutions for the issue:
  1. Use explicit locks. The lock primitive will be pthread_mutex for
 multi-threaded implementations or semaphore for multi-process ones.
  2. Use per-thread log files.
 File names would be pgbench_log.main-process-id.thread-id.

Which is better, or another idea?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





Re: [HACKERS] Should we throw error when converting a nonexistent/ambiguous timestamp?

2010-03-15 Thread Robert Haas
On Mon, Mar 15, 2010 at 9:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Mar 15, 2010 at 7:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm starting to think that maybe we should throw error in these cases
 instead of silently doing something that's got a 50-50 chance of being
 wrong.  I'm not sure if the "assume standard time" rule is standardized,
 but I think it might be better if we dropped it.  Thoughts?

 That seems overly picky and fairly pointless to me.  Generally I'm a
 big fan of the idea that obvious breakage is better than silent
 breakage, but in this case it seems highly likely that you'll still
 have silent breakage until such time as a time change rolls around.

 Yes, that's true, the failure will only be apparent when a DST
 transition is sufficiently close by.  However, the problem with the
 current behavior is that the failure isn't obvious even then ---
 you might not notice the data inconsistency until much later when
 it's not possible to sort things out.

I disagree.  Even if the application DOES record a wrong time-stamp,
it's likely to be wrong by exactly an hour, or say two hours, and it
may well be possible to reconstruct what should have happened later;
an application crash may result in no data being recorded at all,
and may therefore be harder to recover from.  Alternatively the
timestamp may be used for something non-critical, like logging a
last-changed time, and now you've turned a minor inaccuracy into an
application crash.  The scenario you describe is possible too, but
it's not clear-cut.  If we were starting from scratch I think either
behavior would be defensible, but changing it now doesn't seem good to
me.

 The current code behavior seems to me to be on par with, for example,
 trying to intuit MM-DD versus DD-MM field orders.  We used to try to
 do that, too, and gave it up as a bad idea.

I suppose it's topologically equivalent, but to me that is an order of
magnitude crazier than this case.

Of course I may be in the minority...  but you did ask...

...Robert



Re: [HACKERS] walreceiver is uninterruptible on win32

2010-03-15 Thread Fujii Masao
On Tue, Mar 16, 2010 at 12:32 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Something like libpq_select() which waits for the socket to become
 ready would be required for walreceiver and dblink. But it's necessary
 for walreceiver on not only win32 but also the other, ...

 Really, why? I thought this is a purely Windows specific problem.

Because, on all the platforms, libpq_receive() needs to call libpq_select().
Or do you mean that we should leave the existing libpq_select() as it is, and
create a new win32-specific function which waits for the socket to become ready,
for this issue?

 Just replacing PQexec() with PQsendQuery() is pretty straightforward, we
 could put that replacement in a file in port/win32. Replacing
 PQconnectdb() is more complicated because you need to handle connection
 timeout. I suggest that we only add the replacement for PQexec(), and
 live with the situation for PQconnectdb(), that covers 99% of the
 scenarios anyway.

I'll try to replace PQexec() first, and PQconnectdb() second if I have
enough time.
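
As a language-neutral illustration of the "send asynchronously, then wait on the socket in an interruptible way" idea, here is a rough Python sketch. It is only an analogy and is not the patch: real code would use libpq's asynchronous calls (PQsendQuery(), PQconsumeInput(), PQisBusy(), PQgetResult()) on the descriptor returned by PQsocket(). The function wait_readable, the stop_event flag and the poll interval are all assumptions made up for this example.

```python
import select
import socket
import threading


def wait_readable(sock, stop_event, poll_interval=0.1):
    """Wait until sock is readable.

    Returns True when data is ready, or False if stop_event was set first.
    Waiting in short select() slices keeps the loop responsive to an
    interrupt request instead of blocking indefinitely.
    """
    while not stop_event.is_set():
        readable, _, _ = select.select([sock], [], [], poll_interval)
        if readable:
            return True
    return False


if __name__ == "__main__":
    left, right = socket.socketpair()
    stop = threading.Event()
    right.send(b"reply")
    print("ready:", wait_readable(left, stop))
    left.recv(16)
    stop.set()
    print("ready after stop request:", wait_readable(left, stop))
    left.close()
    right.close()
```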

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Should we throw error when converting a nonexistent/ambiguous timestamp?

2010-03-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Mar 15, 2010 at 9:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The current code behavior seems to me to be on par with, for example,
 trying to intuit MM-DD versus DD-MM field orders.  We used to try to
 do that, too, and gave it up as a bad idea.

 I suppose it's topologically equivalent, but to me that is an order of
 magnitude crazier than this case.

 Of course I may be in the minority...  but you did ask...

Well, the purpose of asking is to see whether there's a consensus for
doing something.  If not, fine, it's one less thing to worry about...

regards, tom lane



Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Greg Smith

Takahiro Itagaki wrote:

  1. Use explicit locks. The lock primitive will be pthread_mutex for
 multi-threaded implementations or semaphore for multi-process ones.
  2. Use per-thread log files.
 File names would be pgbench_log.main-process-id.thread-id.
  


I'm concerned that the locking itself will turn into a new pgbench 
bottleneck, just as we're clearing the point where it's not for the 
first time in a while.  And that sounds like it has its own potential 
risks/complexity involved.


I could live with per-thread log files.  I think my pgbench-tools is the 
main consumer of these latency logs floating around right now (I just 
pushed a 9.0 update to handle the multiple workers option today that 
discovered this).  It doesn't make any difference to what I'm doing how 
many files I have to process.  Just a few lines of extra shell code for 
me to pull the rest into the import.  That seems like the simplest 
solution that's guaranteed to work, just push the problem onto the 
client side instead where it's easier to deal with.
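
To give an idea of how little client-side work this is, here is a sketch in Python rather than shell. It assumes the naming proposed above (pgbench_log.main-process-id, optionally followed by a dot and a numeric suffix) and simply gathers every worker's lines in suffix order. The function names are made up for the example, and this is not the pgbench-tools code.

```python
import os
import re
import tempfile


def worker_log_files(directory, pid):
    """Return the log files for one pgbench run, ordered by worker suffix."""
    # Matches "pgbench_log.<pid>" and "pgbench_log.<pid>.<n>", but not a
    # different run whose pid merely starts with the same digits.
    pattern = re.compile(rf"^pgbench_log\.{pid}(?:\.(\d+))?$")
    found = []
    for name in os.listdir(directory):
        m = pattern.match(name)
        if m:
            found.append((int(m.group(1) or 0), name))
    return [os.path.join(directory, name) for _, name in sorted(found)]


def read_all_lines(directory, pid):
    """Concatenate the lines of every worker log for the given run."""
    lines = []
    for path in worker_log_files(directory, pid):
        with open(path) as f:
            lines.extend(line.rstrip("\n") for line in f)
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, text in (("pgbench_log.2196", "a\n"),
                           ("pgbench_log.2196.1", "b\n")):
            with open(os.path.join(d, name), "w") as f:
                f.write(text)
        print(read_all_lines(d, 2196))
```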


Unless someone feels strongly that these have to be interleaved into one 
file, based on Andrew's suggestion that this is a hard problem to get 
right and Tom's suggestion that this might even extend into the proper 
threaded version too, I think a log file per worker is the easiest way 
out of this.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Tom Lane
Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp writes:
 Greg Smith g...@2ndquadrant.com wrote:
 It looks like the switch between clients running on separate workers can 
 lead to a mix of their respective lines showing up though.

 Oops. There might be two solutions for the issue:
   1. Use explicit locks. The lock primitive will be pthread_mutex for
  multi-threaded implementations or semaphore for multi-process ones.
   2. Use per-thread log files.
  File names would be pgbench_log.main-process-id.thread-id.

I think #1 is out of the question, as the synchronization overhead will
do serious damage to the whole point of having a multithreaded pgbench.
#2 might be a reasonable idea.

regards, tom lane



Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Takahiro Itagaki

Tom Lane t...@sss.pgh.pa.us wrote:

 Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp writes:
  Oops. There might be two solutions for the issue:
1. Use explicit locks. The lock primitive will be pthread_mutex for
   multi-threaded implementations or semaphore for multi-process ones.
2. Use per-thread log files.
   File names would be pgbench_log.main-process-id.thread-id.
 
 I think #1 is out of the question, as the synchronization overhead will
 do serious damage to the whole point of having a multithreaded pgbench.
 #2 might be a reasonable idea.

Ok, I'll go for #2.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





Re: [HACKERS] Ragged latency log data in multi-threaded pgbench

2010-03-15 Thread Takahiro Itagaki

Takahiro Itagaki itagaki.takah...@oss.ntt.co.jp wrote:

 2. Use per-thread log files.
File names would be pgbench_log.main-process-id.thread-id.

Here is a patch to implement per-thread log files for pgbench -l.

The log filenames are pgbench_log.main-process-id.thread-serial-number
for each thread, but the first thread (including the single-threaded case)
still uses pgbench_log.main-process-id, for compatibility.

Example:
  $ pgbench -c16 -j4 -l
  $ ls
  pgbench_log.2196  pgbench_log.2196.1  pgbench_log.2196.2  pgbench_log.2196.3
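
For readers who don't want to open the patch, the naming rule can be sketched in a few lines of Python. This is only an illustration of the scheme described above, not the C code in the patch. log_path, worker and run are hypothetical names, and each worker writes only to its own file so no locking is needed.

```python
import os
import tempfile
import threading


def log_path(directory, pid, thread_no):
    """First thread keeps the old name; later threads get a numeric suffix."""
    base = f"pgbench_log.{pid}"
    name = base if thread_no == 0 else f"{base}.{thread_no}"
    return os.path.join(directory, name)


def worker(directory, pid, thread_no, n_lines):
    """Write this worker's lines to its private log file."""
    with open(log_path(directory, pid, thread_no), "w") as f:
        for i in range(n_lines):
            f.write(f"{thread_no} {i}\n")


def run(directory, n_threads, n_lines, pid=None):
    """Start the workers, wait for them, and return the log file names."""
    pid = os.getpid() if pid is None else pid
    threads = [
        threading.Thread(target=worker, args=(directory, pid, t, n_lines))
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run(d, n_threads=4, n_lines=100, pid=2196))
```

Because lines from different workers never share a file, they cannot be interleaved mid-line, which is the problem that started this thread.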

Comments and suggestions welcome.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



pgbench_log_20100316.patch
Description: Binary data



Re: [HACKERS] Should we throw error when converting a nonexistent/ambiguous timestamp?

2010-03-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Mar 15, 2010 at 7:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm starting to think that maybe we should throw error in these cases
 instead of silently doing something that's got a 50-50 chance of being
 wrong.  I'm not sure if the "assume standard time" rule is standardized,
 but I think it might be better if we dropped it.  Thoughts?

 That seems overly picky and fairly pointless to me.  Generally I'm a
 big fan of the idea that obvious breakage is better than silent
 breakage, but in this case it seems highly likely that you'll still
 have silent breakage until such time as a time change rolls around.

Yes, that's true, the failure will only be apparent when a DST
transition is sufficiently close by.  However, the problem with the
current behavior is that the failure isn't obvious even then ---
you might not notice the data inconsistency until much later when
it's not possible to sort things out.

The current code behavior seems to me to be on par with, for example,
trying to intuit MM-DD versus DD-MM field orders.  We used to try to
do that, too, and gave it up as a bad idea.

regards, tom lane
