Re: [HACKERS] SSL renegotiation

2013-07-10 Thread Stuart Bishop
On Thu, Jul 11, 2013 at 4:20 AM, Alvaro Herrera
 wrote:

> I'm having a look at the SSL support code, because one of our customers
> reported it behaves unstably when the network is unreliable.  I have yet
> to reproduce the exact problem they're having, but while reading the
> code I notice this in be-secure.c:secure_write() :

The recap of my experiences you requested...

I first saw SSL renegotiation failures on Ubuntu 10.04 LTS (Lucid)
with openssl 0.9.8 (something). I think this was because SSL
renegotiation had been disabled due to CVE-2009-3555 (affecting
all versions before 0.9.8l). I think the version now in Lucid is
0.9.8k with fixes for SSL renegotiation, but I haven't tested this.

The failures I saw with no-renegotiation-SSL for streaming replication
looked like this:

On the master:

2012-06-25 16:16:26 PDT LOG: SSL renegotiation failure
2012-06-25 16:16:26 PDT LOG: SSL error: unexpected record
2012-06-25 16:16:26 PDT LOG: could not send data to client: Connection
reset by peer

On the hot standby:

2012-06-25 11:12:11 PDT FATAL: could not receive data from WAL stream:
SSL error: sslv3 alert unexpected message
2012-06-25 11:12:11 PDT LOG: record with zero length at 1C5/95D2FE00


Now I'm running Ubuntu 12.04 LTS (Precise) with openssl 1.0.1, and I
think all the known renegotiation issues have been dealt with. I still
get failures, but they are less informative:

2013-03-15 03:55:12 UTC LOG: SSL renegotiation failure


-- 
Stuart Bishop 
http://www.stuartbishop.net/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #6532: pg_upgrade fails on Python stored procedures

2012-03-16 Thread Stuart Bishop
On Sat, Mar 17, 2012 at 12:54 AM, Bruce Momjian  wrote:

> Well, it will because, by creating the symlink, you allowed this
> function to be restored into the new database, and it isn't properly
> hooked to the plpython language.  I wonder if you should just delete it
> because I believe you already have the right plpython2 helper functions
> in place.  Can you run this query for me in one of the problem databases
> in the new and/or old cluster and send me the output:
>
>        SELECT proname,probin FROM pg_proc WHERE probin LIKE '%python%';

# SELECT nspname,proname,probin FROM pg_proc,pg_namespace WHERE probin
LIKE '%python%' and pg_proc.pronamespace=pg_namespace.oid;
  nspname   |        proname        |      probin
------------+-----------------------+------------------
 pg_catalog | plpython_call_handler | $libdir/plpython
 public     | plpython_call_handler | $libdir/plpython
(2 rows)

I have no idea how I managed to grow the duplicate in the public
schema, but this does seem to be the source of the confusion. I might
be able to dig out when I grew it from revision control, but I don't
think that would help.

> What we need is for pg_dumpall to _not_ output those handlers.

Or pick it up in the check stage and make the user resolve the
problem. If I shot myself in the foot in some particularly obtuse way,
it might not be sane to bend over backwards making pg_upgrade repair
things.



-- 
Stuart Bishop 
http://www.stuartbishop.net/



Re: [HACKERS] storing TZ along timestamps

2011-07-08 Thread Stuart Bishop
On Mon, Jun 6, 2011 at 7:50 AM, Jim Nasby  wrote:
> On Jun 4, 2011, at 3:56 AM, Greg Stark wrote:
>> On Thu, Jun 2, 2011 at 8:58 PM, Jim Nasby  wrote:
>>>
>>> I'm torn between whether the type should store the original time or the 
>>> original time converted to GMT.
>>
>> This is the wrong way to think about it. We *never* store time
>> "converted to GMT".  When we want to represent a point in time we
>> represent it as seconds since the epoch.
> Right. Sorry, my bad.
>
>> The question here is how to represent more complex concepts than
>> simply points in time. I think the two concepts under discussion are
>> a) a composite type representing a point in time and a timezone it
>> should be interpreted in for operations and display and b) the
>> original input provided which is a text string with the constraint
>> that it's a valid input which can be interpreted as a point in time.
>
> My fear with A is that something could change that would make it impossible 
> to actually get back to the time that was originally entered. For example, a 
> new version of the timezone database could change something. Though, that 
> problem also exists for timestamptz today, so presumably if it was much of an 
> issue we'd have gotten complaints by now.

The common problem is daylight savings time being declared or
cancelled. This happens numerous times throughout the year, often with
short notice.

If you want to store '6pm July 3rd 2014 Pacific/Fiji', and want that
to keep meaning 6pm Fiji time no matter what decisions the Fijian
government makes over the next two years, you need to store the
wallclock (local) time and the timezone. The wallclock time remains
fixed, but the conversion to UTC may float.

If you are storing a point in time that must remain stable regardless
of future political decisions, you store the UTC time and an offset.
The conversion to wallclock time may float, and your 6pm Fiji time
meeting might change to 5pm or 7pm depending on the political edicts.

If you are only storing past events, it's not normally an issue, but
timezone information does occasionally get changed retroactively if
errors are discovered.
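A minimal sketch of the two strategies in Python, with hypothetical
fixed offsets standing in for the tz database (a rules change is
simulated by swapping the lookup table; the offsets chosen for Fiji
here are illustrative, not authoritative):

```python
from datetime import datetime, timedelta

# Simulated tz database: zone name -> UTC offset. In reality this comes
# from the tz database and can change when governments alter DST rules.
rules_before = {"Pacific/Fiji": timedelta(hours=13)}  # hypothetical old rules
rules_after = {"Pacific/Fiji": timedelta(hours=12)}   # hypothetical new rules

wallclock = datetime(2014, 7, 3, 18, 0)  # "6pm July 3rd 2014", local time
zone = "Pacific/Fiji"

# Strategy A: store wallclock + zone. The local time stays 6pm forever;
# the UTC instant floats when the rules change.
utc_old = wallclock - rules_before[zone]
utc_new = wallclock - rules_after[zone]
assert utc_old != utc_new  # the UTC instant moved

# Strategy B: store UTC + offset. The instant is fixed; the displayed
# local time floats when the rules change.
utc_fixed = wallclock - rules_before[zone]
local_old = utc_fixed + rules_before[zone]
local_new = utc_fixed + rules_after[zone]
assert local_old != local_new  # the 6pm meeting is now shown as 5pm
```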


-- 
Stuart Bishop 
http://www.stuartbishop.net/



Re: [HACKERS] Indent authentication overloading

2010-11-17 Thread Stuart Bishop
On Wed, Nov 17, 2010 at 10:35 PM, Magnus Hagander  wrote:
> Currently, we overload "indent" meaning both "unix socket
> authentication" and "ident over tcp", depending on what type of
> connection it is. This is quite unfortunate - one of them being one of
> the most secure options we have, the other one being one of the most
> *insecure* ones (really? ident over tcp? does *anybody* use that
> intentionally today?)

We use it. Do you have an alternative that doesn't lower security
besides Kerberos? Anti-ident arguments are straw man arguments - "If
you set up identd badly or don't trust remote root or your network,
ident sucks as an authentication mechanism".

Ident is great as you don't have to lower security by dealing with
keys on the client system (more management headaches == lower
security), or worry about those keys being reused by accounts that
shouldn't be reusing them. Please don't deprecate it unless there is
an alternative. And if you are a pg_pool or pgbouncer maintainer,
please consider adding support :)


-- 
Stuart Bishop 
http://www.stuartbishop.net/



Re: [HACKERS] Time-based Releases WAS: 8.5 release timetable, again

2009-09-08 Thread Stuart Bishop
On Tue, Sep 8, 2009 at 7:54 PM, Andrew Dunstan wrote:

> The release cycle is quite independent of the release lifetime.

If you have dates on releases, it is easier to set dates on release
lifetime. If you know the releases come out once a year at about the
same time, and you want to have a set number of versions in play, you
can state at release time when the community will stop support. This
gives everyone a clear picture of which versions they should be
targeting and when upgrades will be required.

> In any case, I don't accept this analogy. The mechanics of a Linux
> distribution are very different from the mechanics of a project like
> PostgreSQL. The prominent OSS project that seems to me most like ours is the
> Apache HTTP project. But they don't do timed releases AFAIK, and theirs is
> arguably the most successful OSS project ever.

We find it works for stuff other than Ubuntu too. IIRC the original
concern was that you could do it for a small open source project, but
that it would be impossible when juggling as many moving parts as a
Linux distribution. You might find that the document I cited is for a
project with similar issues to PostgreSQL and may address your
concerns. It seems to work for other large projects too, such as
Gnome, as well as smaller ones. People are discussing switching for
reasons Joshua cited (maintaining momentum, planning, enterprise
adoption etc.), because people find it a good idea on other projects
they work with, or maybe because they read too many articles on agile
and lean development practices. It seems to be working fine for me
personally (I work on launchpad.net, which is an Open Source
mostly-web  application using generally Lean/Agile development
methodologies, a one month release cycle and a team of about 30 spread
over all timezones).

> I'm especially resistant to suggestions that we should in some way
> coordinate our releases with other projects' timings. Getting our own
> developers organized is sufficiently like herding cats that I have no
> confidence that anyone will successfully organize those of a plethora of
> projects.

I tend to think it will evolve naturally as more people switch to time
based releases. It's natural to sync with the OS releases your
developers care about because it makes their lives easier, and it's
natural for the distributions to get in sync too because it makes
their developers' lives easier. But only hindsight will tell of course
:-) With a yearly schedule, it probably doesn't matter much except for
distributions with a 2 or 3 year cycle - you would still end up with
latest PostgreSQL a maximum of I think 8 months after the official
release.

> I am not saying timed releases are necessarily bad. But many of the
> arguments that have been put forward to support them don't seem to me to
> withstand critical analysis.
>
> I would argue that it would be an major setback for us if we made another
> release without having Hot Standby or whatever we are calling it now. I
> would much rather slip one month or three than ship without it.

This is why you want your cycle as small as possible - if you have a 6
month cycle for instance, the feature would be available a maximum of
6 months after it is ready. With the feature based release cycle, what
if it still isn't ready for prime time after three months of slippage?
Having one feature slip hurts, but having all features slip hurts
more. Josh cited several examples where he felt similar situations had
hurt PostgreSQL development. Of course, if you think it is critical
enough you can let it slip and if it is critical enough people will
understand - we let one of the 10 Ubuntu releases slip once and people
generally understood (you want to get a LTS release right since you
have to live with your mistakes for 5 years). There was some flak but
we are still here.

I personally suspect PostgreSQL would want a 1 year cycle for major
releases while a full dump/reload is required for upgrades. When this
changes, 6 or even 4 months might actually be a good fit.

-- 
Stuart Bishop 
http://www.stuartbishop.net/



Re: [HACKERS] Time-based Releases WAS: 8.5 release timetable, again

2009-09-08 Thread Stuart Bishop
On Sat, Aug 29, 2009 at 12:19 AM, Josh Berkus wrote:

> I'd think the advantages for our commercial adopters (who pay the
> salaries for many of the people on this list) would be obvious; if they
> know with a small margin of error when the next version of PostgreSQL is
> coming out, they can plan testing and deployment of their new products.
>  See Kevin's post; many companies need to schedule serious testing
> hardware months in advance, and every ISV needs to plan new product
> deployments up to a year in advance.  We bitch a lot in the community
> about the super-old versions of PG which commercial software is using,
> but our variable release cycle is partly to blame.

It also works on the other end - with time based releases you can also
schedule obsolescence. It is just as critical knowing when the
community will stop bug fixes and  security fixes when you are trying
to schedule major rollouts and planning product development.

Canonical (my employer) certainly believe in time based releases, and
that is one of the major reasons for the growth of Ubuntu and the
Ubuntu Community. We now use time based releases for almost all our
sponsored projects (some 6 monthly, some monthly), and are lobbying
various projects and other OS distributions to get into some sort of
cadence with releases so everyone benefits. It makes us happier
(especially when we are choosing what we can commit to providing
security updates for in the 5 year releases), makes our users happier,
and I think makes you happier, with fewer support issues.

(In fact the one project I'm personally aware of that doesn't have
time based releases also has the worst reputation for bug fixes and
updates and caused us trauma because of it, so I'll be pushing to get
that fixed too :-P)

> Certainly our project experiences with "waiting for feature X" have all
> been negative.  The windows port never got into 7.4 despite holding it
> up 4 months.  HOT held up 8.3 for three to five months, depending on how
> you count it, in what I think everyone feels was our most painful beta
> period ever.  Most recently, we let HS/SR hold up 8.4 for 2 months ...
> and they still weren't ready.
>
> I would like to see us go to an annual release timeline in which we
> release in the same month every year.  Any time we say "variable release
> date" what it really means is "later release date".  We've never yet
> released something *early*.

Yes please.

You may even want to seriously consider shorter release cycles.
Tighter cycles can actually reduce stress, as people are less
concerned with slippage. With our projects on one month cycles, it
doesn't matter that much if a feature isn't good enough for a release
- it just goes out with the next months release or the one after if
you really underestimated the work. With longer cycles, the penalty
for missing a deadline is much greater, which can lead to cutting
corners if people are not disciplined.

Of course, PG already has its own historical cadence to start from,
whereas we had the luxury of adopting time based releases at the
start or relatively early in development. For PostgreSQL, with the
regular commit fests you might end up with a similar process to GNU
Bazaar's, except with yearly major releases and 2 month development
releases, documented at
http://doc.bazaar-vcs.org/latest/developers/cycle.html. This is a
smaller project, but it had to address a number of similar concerns,
so it may be a good basis for discussion.

-- 
Stuart Bishop 
http://www.stuartbishop.net/



Re: [HACKERS] WIP: plpython3

2009-07-24 Thread Stuart Bishop



On Fri, Jul 24, 2009 at 5:23 AM, James Pye wrote:


  That also means that maintaining a separate, parallel code base
  for a Python 3 variant can only be acceptable if it gives major
advantages.


I'm not particularly interested in Python 3.x support yet (we are still back on
2.4, soon to hop to 2.5 or 2.6; for us, 3.1 is probably two years away at the
earliest). I am interested in improved plpython, though.


 * Reworked function structure (Python modules, not function fragments)


I think it would be an improvement to move away from function fragments. One
thing I would like to be able to do is have my Python test suite import my
plpython and run tests on it. This would be much easier if, instead of
'import Postgres' to pull in the API, an object were passed into the entry
point that provides the interface to PostgreSQL. This way I can pass in a mock
object. This is also useful outside of the test suite - the same module can be
used as a stored procedure or by your Python application; your web application
can use the same validators as your check constraints, for instance.
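A hypothetical sketch of what that could look like - validate_email,
MockPG and the pg parameter are all invented names for illustration,
not any real plpython API:

```python
# A stored-procedure module that receives its database interface as a
# parameter, so the same code runs under plpython or under a test suite.

def validate_email(pg, address):
    """Usable both as a stored procedure body and from application code."""
    if "@" not in address:
        pg.error("invalid address: %s" % address)
        return False
    return True

# In a test suite, a mock object stands in for the PostgreSQL interface:
class MockPG:
    def __init__(self):
        self.errors = []

    def error(self, msg):
        self.errors.append(msg)

mock = MockPG()
assert validate_email(mock, "foo@example.com") is True
assert validate_email(mock, "not-an-address") is False
assert mock.errors == ["invalid address: not-an-address"]
```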



The second feature, function structure, is actually new to the PL.
Originally PL/Py took a pl/python-like approach to triggers and functions.
*Currently*, I want to change procedures to be Python modules with specific
entry points used to handle an event. Mere invocation: "main". Or, a trigger
event: "before_insert", "after_insert", "before_update", etc.



So, a regular function might look like:

CREATE OR REPLACE FUNCTION foo(int) RETURNS int LANGUAGE plpython3u AS
$python$
import Postgres

def main(i):
   return i
$python$;

Despite the signature repetition, this is an improvement for the user and
the developer. The user now has an explicit initialization section that is
common to Python (it's a module). The PL developer no longer needs to munge
the source, and can work with common Python APIs to manage and introspect
the procedure's module (...thinking: procedure settings...).


I'd like a way to avoid initialization on module import if possible. Calling an 
initialization function after module import, if it exists, would do this.

CREATE FUNCTION foo(int) RETURNS int LANGUAGE plpythonu AS
$python$
[initialization on module import]
def pg_init(pg):
   [initialization after module import]
def pg_main(pg, i):
   return i
$python$;


Thoughts? [...it still has a *long* ways to go =]


I tend to dislike magic function names, but perhaps it is the most usable 
solution.

--
Stuart Bishop 
http://www.stuartbishop.net/





Re: [HACKERS] Re: [GENERAL] pgstattuple triggered checkpoint failure and database outage?

2009-03-31 Thread Stuart Bishop
On Tue, Mar 31, 2009 at 2:20 PM, Heikki Linnakangas
 wrote:

>> This is exactly what happened, and temporary tables belonging to other
>> sessions were fed to pgstattuple.
>
> +1 for throwing an error. That's what we do for views, composite types, and
> GIN indexes as well. If you want to write a query to call pgstattuple for
> all tables in pg_class, you'll need to exclude all those cases anyway. To
> exclude temp tables of other sessions, you'll need to add "AND
> pg_is_other_temp_schema(relnamespace)".

I would have expected an exception to be raised personally.

> I'm ok with returning NULLs as well, but returning zeroes doesn't feel
> right.
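For reference, a sweep query along those lines might look like this
(a sketch only; it assumes the pgstattuple output columns and the
pg_is_other_temp_schema() function of that era):

```sql
-- Apply pgstattuple to ordinary tables only, skipping other sessions'
-- temp tables (whose pages this backend cannot read).
SELECT relname, (pgstattuple(pg_class.oid)).dead_tuple_percent
FROM pg_class
WHERE relkind = 'r'
  AND NOT pg_is_other_temp_schema(relnamespace);
```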


-- 
Stuart Bishop 
http://www.stuartbishop.net/



[HACKERS] Re: [GENERAL] pgstattuple triggered checkpoint failure and database outage?

2009-03-30 Thread Stuart Bishop
On Tue, Mar 31, 2009 at 11:20 AM, Tom Lane  wrote:

> A quick look at contrib/pgstattuple shows that it makes no effort
> whatsoever to avoid reading temp tables belonging to other sessions.
> So even if that wasn't Stuart's problem (and I'll bet it was), this
> is quite broken.
>
> There is no way that pgstattuple can compute valid stats for temp
> tables of other sessions; it doesn't have access to pages in the other
> sessions' temp buffers.  It seems that the alternatives we have are
> to make it throw error, or to silently return zeroes (or perhaps
> nulls?).  Neither one is tremendously appetizing.  The former would
> be especially unhelpful if someone tried to write a query to apply
> pgstattuple across all pg_class entries, which I kinda suspect is
> what Stuart did.

This is exactly what happened, and temporary tables belonging to other
sessions were fed to pgstattuple.


-- 
Stuart Bishop 
http://www.stuartbishop.net/



Re: [HACKERS] elog(FATAL) vs shared memory

2007-04-12 Thread Stuart Bishop
Jim Nasby wrote:
> On Apr 11, 2007, at 6:23 PM, Jim Nasby wrote:
>> FWIW, you might want to put some safeguards in there so that you don't
>> try to inadvertently kill the backend that's running that function...
>> unfortunately I don't think there's a built-in function to tell you
>> the PID of the backend you're connected to; if you're connecting via
>> TCP you could use inet_client_addr() and inet_client_port(), but that
>> won't work if you're using the socket to connect.
> 
> *wipes egg off face*
> 
> There is a pg_backend_pid() function, even if it's not documented with
> the other functions (it's in the stats function stuff for some reason).

eh. No worries - my safeguard is just a comment saying 'don't connect to the
same database you are killing the connections of' :-)
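For the record, folding pg_backend_pid() into the harness query would
make the safeguard explicit - a sketch against the pg_stat_activity of
that era (procpid rather than the later pid column):

```sql
-- Kill candidates: every backend on the target database except our own,
-- so the function never SIGTERMs the backend that is running it.
SELECT procpid FROM pg_stat_activity
WHERE datname = $1
  AND procpid <> pg_backend_pid();
```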


-- 
Stuart Bishop <[EMAIL PROTECTED]>   http://www.canonical.com/
Canonical Ltd.                      http://www.ubuntu.com/





Re: [HACKERS] elog(FATAL) vs shared memory

2007-04-09 Thread Stuart Bishop
Tom Lane wrote:
> Stuart Bishop <[EMAIL PROTECTED]> writes:
>> After a test is run, the test harness kills any outstanding connections so
>> we can drop the test database. Without this, a failing test could leave open
>> connections dangling causing the drop database to block.
> 
> Just to make it perfectly clear: we don't consider SIGTERMing individual
> backends to be a supported operation (maybe someday, but not today).
> That's why you had to resort to plpythonu to do this.  I hope you don't
> have anything analogous in your production databases ...

No - just the test suite. It seems to be the only way to terminate any open
connections, which is a requirement for hooking PostgreSQL up to a test
suite or any other situation where you need to drop a database *now* rather
than when your clients decide to disconnect (well... unless we refactor to
start a dedicated postgres instance for each test, but our overheads are
already pretty huge).

-- 
Stuart Bishop <[EMAIL PROTECTED]>   http://www.canonical.com/
Canonical Ltd.                      http://www.ubuntu.com/





Re: [HACKERS] elog(FATAL) vs shared memory

2007-04-06 Thread Stuart Bishop
Mark Shuttleworth wrote:
> Tom Lane wrote:
>> (1) something (still not sure what --- Martin and Mark, I'd really like
>> to know) was issuing random SIGTERMs to various postgres processes
>> including autovacuum.
>>   
> 
> This may be a misfeature in our test harness - I'll ask Stuart Bishop to
> comment.

After a test is run, the test harness kills any outstanding connections so
we can drop the test database. Without this, a failing test could leave open
connections dangling causing the drop database to block.

CREATE OR REPLACE FUNCTION _killall_backends(text)
RETURNS Boolean AS $$
import os
from signal import SIGTERM

plan = plpy.prepare(
"SELECT procpid FROM pg_stat_activity WHERE datname=$1", ['text']
)
success = True
# 'args' is the implicit list of function arguments in old-style PL/Python.
for row in plpy.execute(plan, args):
try:
plpy.info("Killing %d" % row['procpid'])
os.kill(row['procpid'], SIGTERM)
except OSError:
success = False

return success
$$ LANGUAGE plpythonu;

-- 
Stuart Bishop <[EMAIL PROTECTED]>   http://www.canonical.com/
Canonical Ltd.                      http://www.ubuntu.com/


