Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Amit Kapila
 From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] 
 On Behalf Of Jose Ildefonso Camargo Tolosa

 Please, stop arguing on all of this: I don't think that adding an
 option will hurt anybody (specially because the work was already done
 by someone), we are not asking to change how the things work, we just
 want an option to decide whether we want it to freeze on standby
 disconnection, or if we want it to continue automatically... is that
 asking so much?

I think this kind of decision should be made by an outside utility or script.
It would be better if it could be detected from outside that the standby is
down during sync replication, and a command then sent to the master to change
its mode or settings appropriately without stopping the master.
Putting more and more of this kind of logic into the replication code will
make it more cumbersome.
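
A minimal sketch of such an external check (assuming a 9.1+ master, where
pg_stat_replication exposes sync_state; the grace period and the config edit
are the script's responsibility, not the server's):

-- run periodically from a monitoring script against the master
SELECT count(*) FROM pg_stat_replication
 WHERE sync_state IN ('sync', 'potential');
-- if this stays at 0 past the grace period, the script can clear
-- synchronous_standby_names in postgresql.conf and then run:
SELECT pg_reload_conf();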

With Regards,
Amit Kapila.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-07-12 Thread Heikki Linnakangas

On 12.07.2012 02:11, Alexander Korotkov wrote:

On Thu, Jul 12, 2012 at 3:03 AM, Alexander Korotkov <aekorot...@gmail.com> wrote:


On Tue, Jul 3, 2012 at 10:51 AM, Jeff Davis <pg...@j-davis.com> wrote:


Also, it would be helpful to add a couple tests to rangetypes.sql.



New version of patch is attached.



Oops, forgot to include one comment fix into patch.


Thanks. Can you do something about TrickFunctionCall2, please? 
(http://archives.postgresql.org/message-id/4fe2c968.2010...@enterprisedb.com) 
A separate patch to refactor that in the existing gist opclass would 
probably be best.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Joel Jacobson
On Wed, Jul 11, 2012 at 8:01 PM, Merlin Moncure mmonc...@gmail.com wrote:

 After extensive off-list discussion with Joel it became clear that
 per-object dumping ability really belongs in pg_restore.


The only benefit I could see in putting it in pg_restore is you would then
be able to do a --split on already existing historical dumps.

On the other hand, it would require you to use both pg_dump and pg_restore,
instead of only pg_dump, which makes it a bit less user-friendly.

I haven't looked at how it could be implemented in pg_restore; if it's even
just a little more complex, it's probably better to let pg_dump handle the
task.


Re: [HACKERS] Schema version management

2012-07-12 Thread Joel Jacobson
On Wed, Jul 11, 2012 at 11:20 PM, Alvaro Herrera alvhe...@commandprompt.com
 wrote:

 operator_!___numeric.sql (postfix, name does not need escape)
 operator_%7C%2F_integer__.sql (prefix)
 operator_%3C_bit_varying__bit_varying.sql (type name with spaces,
 changed to _)


I think the directory structure [schema]/[type]/[name] should be the same
for all object types. I don't like operator being part of the filename,
it should be the directory name.


[HACKERS] Logging both start and end of temporary file usage

2012-07-12 Thread Christophe Pettus
In working with a client to analyze their temp file usage, it became useful 
to know when a temporary file was created as well as when it was closed.  That 
way, we could process the logs to determine a high water mark of overall temp 
file usage, to know how high it was safe (in that workload) to set work_mem.  
So, I wrote a quick patch that logged both the open and close of the temp file.

Since the final size of the file isn't known at the time that the file is 
created, the patch just logs the filename.  The particular file can be 
correlated with the size by the name when the close message is logged.  Of 
course, there's no information about the pattern of the file size over time, 
but it's a bit more information than was there before.

As we don't know the size of the file until close time, the open is only logged 
if log_temp_files is 0 (the idea being that's maximum logging).
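
For context, the setting involved (this part already exists; only the extra
log line at file creation is new in the proposal):

log_temp_files = 0     # log every temporary file, with its size, at close time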

If this sounds like something worthwhile in general, I can package it up as a 
proper patch.
--
-- Christophe Pettus
   x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for XLogRecPtr in expand_fmt_string?

2012-07-12 Thread Heikki Linnakangas

On 07.07.2012 01:03, Peter Eisentraut wrote:

On tis, 2012-07-03 at 14:52 -0400, Tom Lane wrote:

Peter Eisentraut <pete...@gmx.net> writes:

On tis, 2012-07-03 at 19:35 +0200, Andres Freund wrote:

I wonder if we just should add a format code like %R or something similar as a
replacement for the %X/%X notion.



Maybe just print it as a single 64-bit value from now on.


That'd be problematic also, because of the lack of standardization of
the format code for uint64.  We could write things like
"message..." UINT64_FORMAT "...more message"
but I wonder how well the translation tools would work with that;
and anyway it would at least double the translation effort for
messages containing such things.


The existing uses of INT64_FORMAT and UINT64_FORMAT show how this is
done:  You print the value in a temporary buffer and use %s in the final
string.  It's not terribly pretty, but it's been done this way forever,
including in xlog code, so there shouldn't be a reason to hesitate about
the use for this particular case.
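
A minimal sketch of that temporary-buffer pattern; the variable name recptr
and the message text are illustrative, not taken from any patch:

char        buf[32];

snprintf(buf, sizeof(buf), UINT64_FORMAT, (uint64) recptr);
ereport(LOG,
        (errmsg("current WAL location is %s", buf)));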


That's hardly any simpler than what we have now.

On 03.07.2012 21:09, Tom Lane wrote:
 Andres Freund <and...@2ndquadrant.com> writes:
 I wonder if we just should add a format code like %R or something similar
 as a replacement for the %X/%X notion.

 Only if you can explain how to teach gcc what it means for elog argument
 match checking.  %m is a special case in that it matches up with a
 longstanding glibc-ism that gcc knows about.  Adding format codes of our
 own invention would be problematic.

One idea would be to use a macro, like this:

#define XLOGRECPTR_FMT_ARGS(recptr) (uint32) ((recptr) >> 32), (uint32) (recptr)


elog(LOG, "current WAL location is %X/%X", XLOGRECPTR_FMT_ARGS(RecPtr));

One downside is that at first glance, that elog() looks broken, because 
the number of arguments doesn't appear to match the format string.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.

2012-07-12 Thread Heikki Linnakangas

On 07.07.2012 00:12, Jan Urbański wrote:

On 06/07/12 22:47, Peter Eisentraut wrote:

On fre, 2012-07-06 at 18:53 +0300, Heikki Linnakangas wrote:

What shall we do about those? Ignore them? Document that if you're using
one of these encodings then PL/Python with Python 2 will be crippled
and
with Python 3 just won't work?


We could convert to UTF-8, and use the PostgreSQL functions to convert
from UTF-8 to the server encoding. Double conversion might be slow, but
I think it would be better than failing.


Actually, we already do the other direction that way
(PLyUnicode_FromStringAndSize), so maybe it would be more consistent to
always use this.

I would hesitate to use this as a kind of fallback, because then we
would sometimes be using PostgreSQL's recoding tables and sometimes
Python's recoding tables, which could become confusing.


So you're in favour of doing unicode -> bytes by encoding with UTF-8 and
then using the server's encoding functions?


Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 
should be quite fast, and it would be good to be consistent in the way 
we do conversions in both directions.
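
A rough sketch of that direction, purely illustrative and not the actual
PL/Python code; it assumes CPython's bytes API (Python 3 spelling) for the
UTF-8 step and the backend's pg_any_to_server() for the recoding step, and
the helper name is invented:

static char *
PLyUnicode_ToServer(PyObject *unicode)
{
    PyObject   *bytes;
    char       *utf8;
    Py_ssize_t  len;
    char       *converted;

    /* Python side: unicode -> UTF-8 bytes */
    bytes = PyUnicode_AsUTF8String(unicode);
    if (bytes == NULL)
        elog(ERROR, "could not convert Python unicode object to UTF-8");
    PyBytes_AsStringAndSize(bytes, &utf8, &len);

    /* backend side: UTF-8 -> server encoding, using PostgreSQL's tables */
    converted = pstrdup(pg_any_to_server(utf8, (int) len, PG_UTF8));

    Py_DECREF(bytes);
    return converted;
}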


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Scheduled back-branch releases?

2012-07-12 Thread Bernd Helmle
Do we have a schedule for when the next back-branch releases are packaged (I hope I
didn't miss any announcement...)?

-- 
Thanks

Bernd

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scheduled back-branch releases?

2012-07-12 Thread Magnus Hagander
On Thu, Jul 12, 2012 at 11:41 AM, Bernd Helmle maili...@oopsware.de wrote:
 Do we have a schedule for when the next back-branch releases are packaged (I hope
 I didn't miss any announcement...)?

No, there is no such schedule (yet).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Etsuro Fujita
 2012/6/26 Kohei KaiGai kai...@kaigai.gr.jp:
  Harada-san,
 
  I checked your patch, and had the impression that it includes many
  improvements over the previous revision that I looked at during the last
  commit fest.
 
  However, I noticed several points to be revised, or investigated.
 
  * It seems to me the expected results of the regression test are not
attached, even though test cases were included. Please add them.

KaiGai-san, did you find the file?  It is in the contrib/pgsql_fdw/expected
directory, named pgsql_fdw.out.  If necessary, I will send the file to you.
BTW, I found some bugs in the expected results in that file.  Attached is a patch
fixing the bugs.

Thanks,

Best regards,
Etsuro Fujita



pgsql_fdw_regress.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Shigeru HANADA
(2012/07/12 6:04), Peter Eisentraut wrote:
 On tor, 2012-06-14 at 21:29 +0900, Shigeru HANADA wrote:
 I'd like to propose pgsql_fdw, FDW for PostgreSQL, as a contrib module
 in core, again.
 
 Do you have any new proposals regarding naming, and how to deal with
 postgresql_fdw_validator, and dblink?

Yes, I've proposed to rename existing postgresql_fdw_validator to
dblink_fdw_validator and move it into contrib/dblink so that pgsql_fdw
can use the name postgresql_fdw and postgresql_fdw_validator.
Though this post has had no response...

http://archives.postgresql.org/message-id/4f854f43.4030...@gmail.com

Regards,
-- 
Shigeru HANADA



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Magnus Hagander
On Tue, Jul 10, 2012 at 6:45 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jul 10, 2012 at 6:27 AM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jul 9, 2012 at 8:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 I found several problems in pg_receivexlog, e.g., memory leaks,
 file-descriptor leaks, etc. The attached patch fixes these problems.

 While I don't doubt that what you've found are real problems, would
 you mind explaining exactly what they are, so we don't have to
 reverse-engineer the patch to figure that out?

 Yep.

 When an error happens after replication connection has been established,
 pg_receivexlog doesn't close an open file descriptor and release an allocated
 memory area. This was harmless before 16282ae688de2b320cf176e9be8a89e4dfc60698
 because pg_receivexlog exits immediately when an error happens. But
 currently in an error case, pg_receivexlog tries reconnecting to the server
 infinitely, so file descriptors and memory would leak. I think this is a problem
 and should be fixed. The patch which I submitted yesterday changes
 pg_receivexlog so that it closes the open file and frees the memory area
 before reconnecting to the server.

Thanks. I get it now, and this explains why I didn't see it before - I
didn't check properly after we added the loop mode. Patch applied with
minor changes (e.g. there's no point in doing PQfinish(tmpconn) right
after you've verified tmpconn is NULL)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Magnus Hagander
On Tue, Jul 10, 2012 at 7:03 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jul 10, 2012 at 3:23 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 I found several problems in pg_receivexlog, e.g., memory leaks,
 file-descriptor leaks, etc. The attached patch fixes these problems.

 ISTM there are still some other problems in pg_receivexlog, so I'll
 read it deeply later.

 While pg_basebackup background process is streaming WAL records,
 if its replication connection is terminated (e.g., walsender in the server
 is accidentally terminated by SIGTERM signal), pg_basebackup ends
 up failing to include all required WAL files in the backup. The problem
 is that, in this case, pg_basebackup doesn't emit any error message at all.
 So a user might misunderstand that a base backup has been successfully
 taken even though it doesn't include all required WAL files.

Ouch. That is definitely a bug if it behaves that way.


 To fix this problem, I think that, when the replication connection is
 terminated, ReceiveXlogStream() should check whether we've already
 reached the stop point by calling stream_stop() before returning TRUE.
 If we've not yet (this means that we've not received all required WAL
 files yet), ReceiveXlogStream() should return FALSE and
 pg_basebackup should emit an error message.  Comments?

Doesn't it already return false because it detects the error of the
connection? What's the codepath where we end up returning true even
though we had a connection failure? Shouldn't that end up under the
"could not read COPY data" branch, which already returns false?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Kohei KaiGai
2012/7/12 Shigeru HANADA shigeru.han...@gmail.com:
 (2012/07/12 6:04), Peter Eisentraut wrote:
 On tor, 2012-06-14 at 21:29 +0900, Shigeru HANADA wrote:
 I'd like to propose pgsql_fdw, FDW for PostgreSQL, as a contrib module
 in core, again.

 Do you have any new proposals regarding naming, and how to deal with
 postgresql_fdw_validator, and dblink?

 Yes, I've proposed to rename existing postgresql_fdw_validator to
 dblink_fdw_validator and move it into contrib/dblink so that pgsql_fdw
 can use the name postgresql_fdw and postgresql_fdw_validator.
 Though this post has had no response...

It seems to me that what postgresql_fdw_validator() does makes it look like
a function that should be named libpq_fdw_validator().

What is your opinion? It would help with this namespace conflict.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Dimitri Fontaine
Hi,

Jose Ildefonso Camargo Tolosa ildefonso.cama...@gmail.com writes:
 environments.  And no, it doesn't makes synchronous replication
 meaningless, because it will work synchronous if it have someone to
 sync to, and work async (or standalone) if it doesn't: that's perfect
 for HA environment.

You seem to want Service Availability when we are providing Data
Availability. I'm not saying you shouldn't ask what you're asking, just
that it is a different need.

If you troll the archives, you will see that this debate has received
much consideration already. The conclusion is that if you care about
Service Availability you should have 2 standby servers and set them both
as candidates to be the synchronous one.

That way, when you lose one standby the service is unaffected, the
second standby is now the synchronous one, and it's possible to
re-attach the failed standby live, with or without archiving (with is
preferred so that the master isn't involved in the catch-up phase).
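
A minimal configuration sketch of that setup; the application_name values
'standby1' and 'standby2' are invented:

# postgresql.conf on the master
synchronous_standby_names = 'standby1, standby2'
# The first listed standby that is connected and streaming is the synchronous
# one; if it drops out, the other is promoted to synchronous automatically and
# commits keep being acknowledged.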

 As synchronous standby currently is, it just doesn't fit the HA usage,

It does actually allow both data high availability and service high
availability, provided that you feed at least two standbys.

What you seem to be asking is both data and service high availability
with only two nodes. You're right that we can not provide that with
current releases of PostgreSQL. I'm not sure anyone has a solid plan to
make that happen.

 and if you really want to keep it that way, it doesn't belong to the
 HA chapter on the pgsql documentation, and should be moved.  And NO
 async replication will *not* work for HA, because the master can have
 more transactions than standby, and if the master crashes, the standby
 will have no way to recover these transactions, with synchronous
 replication we have *exactly* what we need: the data in the standby,
 after all, it will apply it once we promote it.

Exactly. We want data availability first. Service availability is
important too, and for that you need another standby.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PG9.2 and FDW query planning.

2012-07-12 Thread Ronan Dunklau
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 05/07/2012 18:30, Tom Lane wrote:
 Ronan Dunklau rdunk...@gmail.com writes:
 Let's say I have an IMAP foreign data wrapper, and I write a
 query joining the table on itself using the In-Reply-To and
 Message-ID headers, is there anything I can do to avoid fetching
 all the mails from the remote server ?
 
 If I could somehow inform the planner that it can look up rows
 by message-id, thus avoiding the need to fetch everything from
 the remote server. Perhaps persuading the planner to use a
 nested-loop ?
 
 OK, so what you're saying is that the imap server can effectively 
 provide an index on message_id.  What you'd do is create a
 parameterized path that uses the tbl.message_id =
 other_tbl.in_reply_to join clause. If that's enough cheaper than a
 full scan, the planner would select it.

Thank you, I was able to build such paths from your indication.

The python FDW implementor can optionally give a list of tuples
consisting of (path key, expected_number_of_row). So, in the imap
example that could be [('Message-ID', 1), ('From', 1000)] for example.

From this information, if there is an equivalence class whose
restrictinfo uses one of those keys, we build a parameterized path,
with an associated cost of base_width * expected_number_of_row, in
addition to the generic, unparameterized path.

The planner can then select this path, and build plans looking like this:

postgres=# explain select m1."From",
   m1."To",
   m2."From",
   m2."To"
from mails m1 inner join mails m2 on m2."Message-ID" = m1."In-Reply-To"
where m1."From" = '%t...@example.com%';

                               QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=10.00..60001000.00 rows=5 width=128)
   ->  Foreign Scan on mails m1  (cost=0.00..3000.00 rows=10 width=300)
         Filter: (("From")::text = '%t...@example.com%'::text)
   ->  Foreign Scan on mails m2  (cost=10.00..300.00 rows=1 width=300)
         Filter: (("Message-ID")::text = (m1."In-Reply-To")::text)


If I understand it correctly, after returning a ForeignScan (from
GetForeignPlan), the planner decides to use a nestloop, and in the
process of creating the nestloop plan, replaces Var nodes coming from
the outerel (here, m1.In-Reply-To) by params nodes.

My current implementation already looks for (var = param) expressions
that it may handle  during the plan phase and stores the association
between the var and the param_id.
At execution time, the needed parameters values are fetched (from the
ParamExecData array found in es_param_exec_vals) and passed to the
python foreign data wrapper.

The problem I have: how can I retrieve the generated params and keep
the association between the var and the param ?

Should I replace the (var = outervar) clauses by (var = param) myself
and store them in the fdw_exprs field of the foreign scan ?

 FWIW, I'm not sure that it's sane to try to expose this stuff to
 python yet.  It's complicated and still something of a moving
 target.  Once we've got a few more C-coded FDWs that can do this
 type of optimization, things might be stable enough that it'd be
 useful to try to provide a high-level API.

The current API (as mentioned above) would be more declarative than
anything, only offering a way to (maybe) build parameterized paths
without guaranteeing anything. Even if the internals change, I fail to
see how it can hurt to offer such a feature.


Regards,

- -- 
Ronan Dunklau


-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (GNU/Linux)

iQEcBAEBAgAGBQJP/s2dAAoJECTYLCgFy3239KkIAIiKJo/F1r4Yp49wLpmThjQI
ICo910ZajqlUKVsl9ye8m2l6p+lyGEmZMWUAWP6ae2pqFR+aC0zThypjF1faZ9tN
HfqMbEKx/trkDf05U28tJlvOeu21tiEOEs4n02fmfdHu9SvemuLdyhU3dOLxoBVK
ZZ8ra9q/+zHCPpc3zt0Mow80Q1X1M3DtirsHPoeIdOK69wD4nD2ZfhQule5HaoV1
dG3FlrKGAGzRpohLBCuWzyGPcWCS584lXGWfhsz/waLaSDIjcjvaaMke54eaa8Ci
7KxXkMM12CKFQyheSR5VVwFJrobnME2HDiJCoAOkRc0dW+Y2aASJnKG/FwL8C7s=
=4RjB
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas

On 07/12/2012 12:31 AM, Daniel Farina wrote:


But RAID-1 as nominally seen is a fundamentally different problem,
with much tinier differences in latency, bandwidth, and connectivity.
Perhaps useful for study, but to suggest the problem is *that* similar
I think is wrong.


Well, yes and no. One of the reasons I brought up DRBD was because it's 
basically RAID-1 over a network interface. It's not without overhead, 
but a few basic pgbench tests show it's still 10-15% faster than a 
synchronous PG setup for two servers in the same rack. Greg Smith's 
tests show that beyond a certain point, a synchronous PG setup 
effectively becomes untenable simply due to network latency in the 
protocol implementation. In reality, it probably wouldn't be usable 
beyond two servers in different datacenters in the same city.


RAID-1 was the model for DRBD, but I brought it up only because it's 
pretty much the definition of a synchronous commit that degrades 
gracefully. I'd even suggest it's more important in a network context 
than for RAID-1, because you're far more likely to get sync 
interruptions due to network issues than you are for a disk to fail.



But, putting that aside, why not write a piece of middleware that
does precisely this, or whatever you want? It can live on the same
machine as Postgres and ack synchronous commit when nobody is home,
and notify (e.g. page) you in the most precise way you want if nobody
is home for a while.


You're right that there are lots of ways to kinda get this ability, 
they're just not mature enough or capable enough to really matter. 
Tailing the log to watch for secondary disconnect is too slow. Monit or 
Nagios style checks are too slow and unreliable. A custom-built 
middle-layer (a master-slave plugin for Pacemaker, for example) is too 
slow. All of these would rely on some kind of check interval. Set that 
too high, and we get 10,000xn missed transactions for n seconds. Too 
low, and we'd increase the likelihood of false positives and unnecessary 
detachments.


If it's possible through a PG 9.x extension, that'd probably be the way 
to *safely* handle it as a bolt-on solution. If the original author of 
the patch can convert it to such a beast, we'd install it approximately 
five seconds after it finished compiling.


So far as transaction durability is concerned... we have a continuous 
background rsync over dark fiber for archived transaction logs, DRBD for 
block-level sync, filesystem snapshots for our backups, a redundant 
async DR cluster, an offsite backup location, and a tape archival 
service stretching back for seven years. And none of that will cause the 
master to stop processing transactions unless the master itself dies and 
triggers a failover.


Using PG sync in its current incarnation would introduce an extra 
failure scenario that wasn't there before. I'm pretty sure we're not the 
only ones avoiding it for exactly that reason. Our queue discards 
messages it can't fulfil within ten seconds and then throws an error for 
each one. We need to decouple the secondary as quickly as possible if it 
becomes unresponsive, and there's really no way to do that without 
something in the database, one way or another.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com



__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Tom Lane
Joel Jacobson j...@trustly.com writes:
 I think the directory structure [schema]/[type]/[name] should be the same
 for all object types. I don't like operator being part of the filename,
 it should be the directory name.

What are you going to do with objects that don't have schemas?
(Including, but not restricted to, the schemas themselves.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts

2012-07-12 Thread Heikki Linnakangas

On 09.07.2012 11:35, Shigeru HANADA wrote:

Once the issues above are fixed, IMO this patch can be marked as Ready
for committer.


Thanks. The docs on async connections says:

The connect_timeout connection parameter is ignored when using 
PQconnectPoll; it is the application's responsibility to decide whether 
an excessive amount of time has elapsed.


I think we should recommend using PQconnectTimeout(), similar to what 
you did in psql, in all client applications that use the non-blocking 
connection API. Perhaps something like:


The connect_timeout connection parameter is ignored when using 
PQconnectPoll; it is the application's responsibility to decide whether 
an excessive amount of time has elapsed. It is recommended to use the 
PQconnectTimeout() to get value of the parameter, and use that as the 
timeout in the application. That way the user gets the same timeout 
behavior, regardless of whether the application uses PQconnectPoll or 
the nonblocking connection API.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Joel Jacobson
On Thursday, July 12, 2012, Tom Lane wrote:

 What are you going to do with objects that don't have schemas?
 (Including, but not restricted to, the schemas themselves.)


Good question. Maybe something like this?

For objects without schema:
/global/[type]/[name].sql

For objects with schema:
/schema/[schema]/[type]/[name].sql


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas stho...@optionshouse.com wrote:

 So far as transaction durability is concerned... we have a continuous
 background rsync over dark fiber for archived transaction logs, DRBD for
 block-level sync, filesystem snapshots for our backups, a redundant async DR
 cluster, an offsite backup location, and a tape archival service stretching
 back for seven years. And none of that will cause the master to stop
 processing transactions unless the master itself dies and triggers a
 failover.

Right, so say the dark fiber between New Orleans and Seattle (pick two
places for your datacenters) happens to be the first thing failing in
your NO data center.  Disconnect the sync-ness, and continue.  Not a
problem, unless it happens to be Aug 29, 2005.

You have lost data.  Maybe only a bit.  Maybe it wasn't even
important.  But that's not for PostgreSQL to decide.

But because your PG on DRBD continued when it couldn't replicate to
Seattle, it told its clients the data was durable, just minutes
before the whole DC was under water.

OK, so a wise admin team would have removed the NO DC from its
primary role days before that hit.

Change the NO to NYC and the date Sept 11, 2001.

OK, so maybe we can concede that these types of major catastrophes are
more devastating to us than losing some data.

Now your primary server was in AWS US East last week.  Its sync slave
was in the affected AZ, but your PG primary continues on, until, since
it was an EC2 instance, it disappears.  Now where is your data?

Or the fire marshal orders the data center (or whole building) EPO,
and the connection to your backup goes down minutes before your
servers or other network peers.

 Using PG sync in its current incarnation would introduce an extra failure
 scenario that wasn't there before. I'm pretty sure we're not the only ones
 avoiding it for exactly that reason. Our queue discards messages it can't
 fulfil within ten seconds and then throws an error for each one. We need to
 decouple the secondary as quickly as possible if it becomes unresponsive,
 and there's really no way to do that without something in the database, one
 way or another.

It introduces an extra failure, because it has introduced an extra
data durability guarantee.

Sure, many people don't *really* want that data durability guarantee,
even though they would like the maybe guaranteed version of it.

But that fine line is actually a difficult (impossible?) one to define
if you don't know, at the moment of decision, what the next few
moments will/could become.

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Tom Lane
Joel Jacobson j...@trustly.com writes:
 On Thursday, July 12, 2012, Tom Lane wrote:
 What are you going to do with objects that don't have schemas?
 (Including, but not restricted to, the schemas themselves.)

 Good question. Maybe something like this?

 For objects without schema:
 /global/[type]/[name].sql

 For objects with schema:
 /schema/[schema]/[type]/[name].sql

FWIW, I think you could save a level of naming if you were willing to
put the type first, since the type would imply whether the object
lives in a schema or not:

[type]/[name].sql
[type]/[schema]/[name].sql

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Schema version management

2012-07-12 Thread Joel Jacobson
On Thu, Jul 12, 2012 at 4:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 FWIW, I think you could save a level of naming if you were willing to
 put the type first, since the type would imply whether the object
 lives in a schema or not:

 [type]/[name].sql
 [type]/[schema]/[name].sql


Could work. But I think it's more relevant and useful to keep all objects
in a schema in its own directory.

That way it's easier to get an overview of what's in a schema,
simply by looking at the file structure of the schema directory.

I think it's more common to want to show all objects within schema X
than all objects of type X across schemas.
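
For illustration, the same hypothetical objects under the two layouts (schema
and object names invented):

schema-first:  schema/accounting/function/charge_invoice.sql
               schema/accounting/table/invoice.sql
               global/role/report_reader.sql
type-first:    function/accounting/charge_invoice.sql
               table/accounting/invoice.sql
               role/report_reader.sql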

PS.

I was thinking -- the guys back in the 70s must have spent a lot of time
thinking about the UNIX directory structure -- before they decided upon it.

I did some googling and found this explanation, which was quite
amusing to say the least :-)

http://lists.busybox.net/pipermail/busybox/2010-December/074114.html


Re: [HACKERS] Schema version management

2012-07-12 Thread Andrew Dunstan


On 07/12/2012 10:01 AM, Tom Lane wrote:


FWIW, I think you could save a level of naming if you were willing to
put the type first, since the type would imply whether the object
lives in a schema or not:

[type]/[name].sql
[type]/[schema]/[name].sql






That will destroy the property of having everything for a given schema 
collected together.


Arguably we should make a special case for the create statement of a 
schema, but I'm not even sure about that.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Fujii Masao
On Thu, Jul 12, 2012 at 8:39 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jul 10, 2012 at 7:03 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jul 10, 2012 at 3:23 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 I found several problems in pg_receivexlog, e.g., memory leaks,
 file-descriptor leaks, etc. The attached patch fixes these problems.

 ISTM there are still some other problems in pg_receivexlog, so I'll
 read it deeply later.

 While pg_basebackup background process is streaming WAL records,
 if its replication connection is terminated (e.g., walsender in the server
 is accidentally terminated by SIGTERM signal), pg_basebackup ends
 up failing to include all required WAL files in the backup. The problem
 is that, in this case, pg_basebackup doesn't emit any error message at all.
 So a user might misunderstand that a base backup has been successfully
 taken even though it doesn't include all required WAL files.

 Ouch. That is definitely a bug if it behaves that way.


 To fix this problem, I think that, when the replication connection is
 terminated, ReceiveXlogStream() should check whether we've already
 reached the stop point by calling stream_stop() before returning TRUE.
 If we've not yet (this means that we've not received all required WAL
 files yet), ReceiveXlogStream() should return FALSE and
 pg_basebackup should emit an error message.  Comments?

 Doesn't it already return false because it detects the error of the
 connection? What's the codepath where we end up returning true even
 though we had a connection failure? Shouldn't that end up under the
 "could not read COPY data" branch, which already returns false?

You're right. If the error is detected, that function always returns false
and the error message is emitted (but I think that current error message
"pg_basebackup: child process exited with error 1" is confusing),
so it's OK. But if walsender in the server is terminated by SIGTERM,
no error is detected and pg_basebackup background process gets out
of the loop in ReceiveXlogStream() and returns true.
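
For illustration, a rough sketch of where such a check could sit in
ReceiveXlogStream(); it assumes the existing stream_stop callback signature
(position, timeline, segment_finished) and is only the shape of the idea, not
the committed fix:

r = PQgetCopyData(conn, &copybuf, 0);
if (r == -1)
{
    /*
     * The server ended the COPY stream gracefully (e.g. walsender was
     * terminated by SIGTERM).  Only report success if the requested
     * stop point has actually been reached.
     */
    if (stream_stop != NULL && !stream_stop(blockpos, timeline, false))
    {
        fprintf(stderr,
                _("%s: replication stream ended before stop point\n"),
                progname);
        return false;
    }
    return true;
}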

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] several problems in pg_receivexlog

2012-07-12 Thread Magnus Hagander
On Thu, Jul 12, 2012 at 6:07 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Thu, Jul 12, 2012 at 8:39 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jul 10, 2012 at 7:03 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jul 10, 2012 at 3:23 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 I found several problems in pg_receivexlog, e.g., memory leaks,
 file-descriptor leaks, etc. The attached patch fixes these problems.

 ISTM there are still some other problems in pg_receivexlog, so I'll
 read it deeply later.

 While pg_basebackup background process is streaming WAL records,
 if its replication connection is terminated (e.g., walsender in the server
 is accidentally terminated by SIGTERM signal), pg_basebackup ends
 up failing to include all required WAL files in the backup. The problem
 is that, in this case, pg_basebackup doesn't emit any error message at all.
 So a user might misunderstand that a base backup has been successfully
 taken even though it doesn't include all required WAL files.

 Ouch. That is definitely a bug if it behaves that way.


 To fix this problem, I think that, when the replication connection is
 terminated, ReceiveXlogStream() should check whether we've already
 reached the stop point by calling stream_stop() before returning TRUE.
 If we've not yet (this means that we've not received all required WAL
 files yet), ReceiveXlogStream() should return FALSE and
 pg_basebackup should emit an error message.  Comments?

 Doesn't it already return false because it detects the error of the
 connection? What's the codepath where we end up returning true even
 though we had a connection failure? Shouldn't that end up under the
 "could not read COPY data" branch, which already returns false?

 You're right. If the error is detected, that function always returns false
 and the error message is emitted (but I think that current error message
 "pg_basebackup: child process exited with error 1" is confusing),
 so it's OK. But if walsender in the server is terminated by SIGTERM,
 no error is detected and pg_basebackup background process gets out
 of the loop in ReceiveXlogStream() and returns true.

Oh. Because the server does a graceful shutdown. D'uh, of course.

Then yes, your suggested fix seems like a good one.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote:
  From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org] 
  On Behalf Of Jose Ildefonso Camargo Tolosa
 
  Please, stop arguing on all of this: I don't think that adding an
  option will hurt anybody (specially because the work was already done
  by someone), we are not asking to change how the things work, we just
  want an option to decide whether we want it to freeze on standby
  disconnection, or if we want it to continue automatically... is that
  asking so much?
 
 I think this kind of decision should be made by an outside utility or script.
 It would be better if it could be detected from outside that the standby is
 down during sync replication, and a command then sent to the master to change
 its mode or settings appropriately without stopping the master.
 Putting more and more of this kind of logic into the replication code will
 make it more cumbersome.

We certainly would need something external to inform administrators that
the system is no longer synchronous.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Bruce Momjian
On Thu, Jul 12, 2012 at 08:21:08AM -0500, Shaun Thomas wrote:
 But, putting that aside, why not write a piece of middleware that
 does precisely this, or whatever you want? It can live on the same
 machine as Postgres and ack synchronous commit when nobody is home,
 and notify (e.g. page) you in the most precise way you want if nobody
 is home for a while.
 
 You're right that there are lots of ways to kinda get this ability,
 they're just not mature enough or capable enough to really matter.
 Tailing the log to watch for secondary disconnect is too slow. Monit
 or Nagios style checks are too slow and unreliable. A custom-built
 middle-layer (a master-slave plugin for Pacemaker, for example) is
 too slow. All of these would rely on some kind of check interval.
 Set that too high, and we get 10,000xn missed transactions for n
 seconds. Too low, and we'd increase the likelihood of false
 positives and unnecessary detachments.

Well, the problem also exists if we add it as an internal database feature
--- how long do we wait to consider the standby dead, how do we inform
administrators, etc.

I don't think anyone says the feature is useless, but it isn't going to
be a simple boolean either.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-07-12 Thread Tom Lane
Here is a revised version of the timeout-infrastructure patch.
I whacked it around quite a bit, notably:

* I decided that the most convenient way to handle the initialization
issue was to combine establishment of the signal handler with resetting
of the per-process variables.  So handle_sig_alarm is no longer global,
and InitializeTimeouts is called at the places where we used to do
"pqsignal(SIGALRM, handle_sig_alarm);".  I believe this is correct
because any subprocess that was intending to use SIGALRM must have
called that before establishing any timeouts.

* BTW, doing that exposed the fact that walsender processes were failing
to establish a SIGALRM signal handler, which is a pre-existing bug,
because they run a normal authentication transaction during InitPostgres
and hence need to be able to cope with deadlock and statement timeouts.
I will do something about back-patching a fix for that.

* I ended up putting the RegisterTimeout calls for DEADLOCK_TIMEOUT
and STATEMENT_TIMEOUT into InitPostgres, ensuring that they'd get
done in walsender and autovacuum processes.  I'm not totally satisfied
with that, but for sure it didn't work to only establish them in
regular backends.

* I didn't like the TimeoutName nomenclature, because to me name
suggests that the value is a string, not just an enum.  So I renamed
that to TimeoutId.

* I whacked around the logic in timeout.c a fair amount, because it
had race conditions if SIGALRM happened while enabling or disabling
a timeout.  I believe the attached coding is safe, but I'm not totally
happy with it from a performance standpoint, because it will do two
setitimer calls (a disable and then a re-enable) in many cases where
the old code did only one.

I think what we ought to do is go ahead and apply this, so that we
can have the API nailed down, and then we can revisit the internals
of timeout.c to see if we can't get the performance back up.
It's clearly a much cleaner design than the old spaghetti logic in
proc.c, so I think we ought to go ahead with this independently of
whether the second patch gets accepted.

I haven't really looked at the second patch yet, but at minimum that
will need some rebasing to match the API tweaks here.
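
For readers following along, a rough sketch of what a caller of the framework
looks like; InitializeTimeouts and RegisterTimeout are named above, while the
enable/disable calls here are assumed for illustration and details may still
shift:

/* once per process, e.g. from InitPostgres(): install handler, reset state */
InitializeTimeouts();
RegisterTimeout(DEADLOCK_TIMEOUT, CheckDeadLock);

/* around a lock wait */
enable_timeout_after(DEADLOCK_TIMEOUT, DeadlockTimeout);
/* ... sleep waiting for the lock ... */
disable_timeout(DEADLOCK_TIMEOUT, false);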

regards, tom lane



binwIymnjnW5K.bin
Description: 1-timeout-framework-v16.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP pgindent replacement

2012-07-12 Thread Bruce Momjian
On Wed, Jun 22, 2011 at 10:16:28AM -0400, Bruce Momjian wrote:
 Andrew Dunstan wrote:
   Further research shows that C89 explicitly dropped support for the old
   K&R =- operator, so we probably *should* remove this in case it
   introduces an unintended bug.
   Well, the point is if someone does use that, it isn't going to generate
   a pgindent error, but rather produce incorrect C code because =- is
   going to be changed.  FYI, my gcc 2.95.3 allows =- and does work as
   intended.
  
  
  As intended by whom? If the effect of x=4; x =- 1; is to subtract 1 
  from x then that's simply wrong by C89. It should assign -1 to x. The 
  =- must be parsed as two operators in C89, assignment and unary minus. 
  pgindent should not under any circumstances change the semantics of the 
  program being indented, and that's what this transformation does for 
  compilers conforming to the standard we explicitly follow.
  
  What happens when your ancient gcc is told to apply the ansi standard?
 
 I see now that my test wasn't complete.  You are right it assigns -1 so
 we can remove this from pgindent.

Per the report from last year, removed from pgindent.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file foreign tables

2012-07-12 Thread Tom Lane
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
 Besides the revision, I modified check_selective_binary_conversion() to run
 heap_close() in the whole-row-reference case.  Attached is an updated version 
 of
 the patch.

Applied with minor, mostly-cosmetic revisions.  I did fix
check_selective_binary_conversion to not continue touching the
relation's tupledesc after heap_close.  Also I thought
convert_selectively was a better name for the hidden COPY option.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Shaun Thomas

On 07/12/2012 12:02 PM, Bruce Momjian wrote:


Well, the problem also exists if we add it as an internal database
feature --- how long do we wait to consider the standby dead, how do
we inform administrators, etc.


True. Though if there is no secondary connected, either because it's not 
there yet, or because it disconnected, that's an easy check. It's the 
network lag/stall detection that's tricky.



I don't think anyone says the feature is useless, but it isn't going
to be a simple boolean either.


Oh $Deity no. I'd never suggest that. I just tend to be overly verbose, 
and sometimes my intent gets lost in the rambling as I try to explain my 
perspective. I apologize if it somehow came across that anyone could 
just flip a switch and have it work.


My C is way too rusty, or I'd be writing an extension right now to do 
this, or be looking over that patch I linked to originally to make 
suitable adaptations. I know I talk about how relatively handy DRBD is, 
but it's also a gigantic PITA since it has to exist underneath the 
actual filesystem. :)


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com



__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] sign mismatch in walreceiver.c

2012-07-12 Thread Peter Eisentraut
This looks suspicious

static TimeLineID   recvFileTLI = -1;

because TimeLineID is uint32.  The Solaris compiler complains about the
sign mismatch.

Maybe 0 would be a better initial value?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE FOREGIN TABLE LACUNA

2012-07-12 Thread Peter Eisentraut
On lör, 2012-06-23 at 23:08 +0100, Dean Rasheed wrote:
 I spotted a couple of other issues during testing:
 
 * You're still allowing INCLUDING DEFAULTS and INCLUDING STORAGE, even
 though these options are not supported on foreign tables.
 
 * If I do INCLUDING ALL, I get an error because of the unsupported
 options. I think that ALL in this context needs to be made to mean
 all the options that foreign tables support (just COMMENTS at the
 moment).

Note that when I added CREATE TABLE LIKE to support composite types, it
was decided to ignore non-applicable options (like copying indexes from
types or views etc.).  The same should be done here, unless we have
reasons to revise the earlier decision.
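
Under that earlier decision, a hedged illustration using this patch's proposed
LIKE syntax for foreign tables (table and server names invented):

-- INCLUDING ALL would silently skip options a foreign table cannot support
-- (defaults, indexes, storage) and copy only what applies, e.g. comments:
CREATE FOREIGN TABLE measurements_ft (LIKE measurements INCLUDING ALL)
    SERVER some_server;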



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] compiler warnings on the buildfarm

2012-07-12 Thread Peter Eisentraut
On sön, 2012-07-01 at 19:04 +0200, Stefan Kaltenbrunner wrote:
 seeing some of the latest commits about fixing compiler warnings I
 took a look at the buildfarm to see if there are any interesting ones
 there (in total we have thousands of warnings on the buildfarm but
 most of those are from very noisy compilers).
 
 so in case anybody is interested those are a selection of the ones
 that at least look somewhat interesting(duplicates mostly removed,
 windows ignored):

Many of these come from ancient compilers, and from minor versions that
are not even the latest for that ancient major release.  They're mostly
not worth worrying about, because evidently the compiler developers
later improved the compilers to not warn about these cases anymore.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Peter Eisentraut
On tor, 2012-07-12 at 01:40 -0400, Tom Lane wrote:
 So it seems arguably reasonable to me for has_language_privilege()
 to take superuserness and lanpltrusted into account, without thereby
 concluding that other privilege() functions must do more than they
 do today.  If we don't want it to do that, then I think we ought to
 offer some other function that *does* consider those things ... but
 I'm not seeing the value of separating it out.

As long as we're spending time on this, I'd propose getting rid of
lanpltrusted, at least for access checking.  Instead, just don't
install USAGE privileges by default for those languages.

The reason is that there is value in having a role that can deploy
schemas, possibly containing functions in untrusted languages, without
having to be a full superuser.  Just like you can have a user that can
create roles without being a superuser.
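
A sketch of what deployment could look like under that proposed model (the
role name is invented; today the server refuses USAGE grants on untrusted
languages):

REVOKE USAGE ON LANGUAGE plpythonu FROM PUBLIC;        -- no default access
GRANT USAGE ON LANGUAGE plpythonu TO schema_deployer;  -- non-superuser deployer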

 The sepgsql point is worth discussing too.  I have not been paying
 close attention to the sepgsql patches, but I have the distinct
 impression that they create a non-examinable privilege barrier,
 ie there's no way to inquire whether you have the privilege to do
 X except by actually trying it.  Is that really the way we want
 things to go?

Well, that's how SELinux works too.  You can inspect the labels and all
that, but nobody really knows what's going to happen until you try it.

Which is ultimately the recommended way anyway.  has_*_privilege is a
bit like the access() function, which has caveats associated with it.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 As long as we're spending time on this, I'd propose getting rid of
 lanpltrusted, at least for access checking.  Instead, just don't
 install USAGE privileges by default for those languages.

There's definitely something to that idea --- certainly lanpltrusted
dates from before we had a robust object-permissions system, and looks
like a bit of a wart now that we do have one.

I guess we could redefine the default privileges for languages as none
and then have the TRUSTED keyword mean to install public usage
privilege.  Or maybe it would be safer for upgrade purposes if we kept
the default interpretation as-is and did an automatic REVOKE when
TRUSTED wasn't specified.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: FW: [HACKERS] Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Gurjeet Singh
On Tue, Jul 10, 2012 at 2:40 AM, Amit Kapila amit.kap...@huawei.com wrote:

   Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT
 ... USING INDEX we added back in the day is not so useful in the field.

  Having to drop foreign key constraints before this command, and
 recreate them afterwards, makes this command useless to most database
 setups. I feel sorry that no one brought this up when we were implementing
 the feature; maybe we could've done something about it right then.

 Will it impact users such that it will block their operations or something
 similar, or is it a usability issue?


Yes, it will have to take an exclusive lock on the index, and possibly the
table too, but the operation should be too quick to even be noticeable under
low load conditions.

However, if the x-lock is waiting for some other long-running query to
finish, then the lock-queuing logic in Postgres will make new queries wait
for this x-lock to be taken and released before they can begin
processing. This is my recollection of the logic from an old conversation;
others can weigh in to confirm.



  All we need to do is allow swapping of pg_class.relfilenode of two
 indexes. This will let the dependency entries stand as they are and allow
 us to drop the bloated primary key index structure without having to
 rebuild the foreign key constraints.

 I have noticed that currently the OID and pg_class.relfilenode are the same
 for user-created tables and indexes. But after your implementation that will
 not remain so, and I am not sure whether it can impact any other code path.


They start off the same, but some operations, like REINDEX, change the
relfilenode; that's the purpose of relfilenode: to map the OID to a
filename on disk.



 As for the syntactical sugar, this can be added to either ALTER TABLE or
 to ALTER INDEX. Although under no normal circumstances one would need to
 use ALTER INDEX to swap two indexes' relfilenode (because one can easily
 create a duplicate index and drop/rename-in-place the old one), I think it
 would make more sense here since it is just an operation on two indexes
 and has nothing to do with the constraints, apart from the fact that we
 want to use this feature to meddle with the constraints.

  Syntax options:

  ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE}
 USING INDEX new_index;

  ALTER INDEX ind REPLACE WITH new_index;

 After this new syntax there will be two ways for users to replace an index;
 won’t it confuse users as to which syntax to use?


Yes, I forgot to mention this in the original post. This feature will be a
superset of the feature we introduced in ALTER TABLE. I don't see a way
around that, except for slowly deprecating the older feature.

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Gurjeet Singh
On Tue, Jul 10, 2012 at 10:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Greg Stark st...@mit.edu writes:
  On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh singh.gurj...@gmail.com
 wrote:
  All we need to do is allow swapping of pg_class.relfilenode of two
 indexes.

  Fwiw I don't like swapping relfilenodes on indexes the user created.
  REINDEX currently does this but it's a bit of a hack and only works
  because reindex carefully builds the new index with exactly the same
  definition as the old one.

 Yes.  The swap-relfilenodes operation would have to carefully check that
 the index definitions were exactly equivalent, and there would be a
 constant risk for bugs of omission if that code weren't taught about
 any new index properties we invent.


IMHO there must be many other places in this code-base where we run that
risk.

The way I am planning to do it is to compare all relevant fields of
FormData_pg_index. And I am assuming anybody changing the struct members
will take care of the relevant changes needed for this code too.

We can add a runtime/compile-time assert to make sure that
Natts_pg_index==17. That way, if a new column gets added, we will get
alerted promptly.


 All of these things seem like ugly, hard-to-use kluges anyway (the
 make-sure-the-indexes-match business is just as much of a PITA for the
 DBA as it is for the system).  What we really want is REINDEX
 CONCURRENTLY.


+1, but I can't take on that task.

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: Allow replacement of bloated primary key indexes without foreign key rebuilds

2012-07-12 Thread Gurjeet Singh
On Tue, Jul 10, 2012 at 11:11 AM, Greg Stark st...@mit.edu wrote:

 On Tue, Jul 10, 2012 at 3:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  The problem you describe is one of constraints and dependencies and
  not one of indexes. It seems what you really want is a way to alter
  foreign key dependencies to depend on a new index. Either an explicit
  command that lets you set the new dependency or what seems even better
  would be to have DROP INDEX check any dependent objects to see if
  there's another index that can satisfy them and change their
  dependency.
 
  Either of these have exactly the same issue, namely their correctness
  depends on determining if two indexes have identical properties.

 This doesn't sound right to me. In these cases all it would have to
 know about is the same set of properties that CREATE CONSTRAINT looks
 for to find a satisfactory index to depend on.


I like the DROP INDEX idea, but the silent side-effect may not make people
happy. Can you give me a pointer to the relevant code?

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:35 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Hi,

 Jose Ildefonso Camargo Tolosa ildefonso.cama...@gmail.com writes:
 environments.  And no, it doesn't makes synchronous replication
 meaningless, because it will work synchronous if it have someone to
 sync to, and work async (or standalone) if it doesn't: that's perfect
 for HA environment.

 You seem to want Service Availibility when we are providing Data
 Availibility. I'm not saying you shouldn't ask what you're asking, just
 that it is a different need.

Yes, and no: I don't see why we can't have an option to choose which
one we want.  I can see the point of data availability: it is better to
freeze the service than to risk losing transactions... however, try to
explain that to some managers: "well, you know, the DB server froze
the whole bank system because, well, the standby server died, and we
didn't want to risk transaction loss, so we just froze the master, you
know, in case the master were to die too before we had a reliable
standby."  I don't think a manager would really understand why you
would block the whole company's system just because *the standby*
server died (and why don't you block it when the master dies?!).
Now, maybe that's a bad example; I know a bank should have at least 3
or 4 servers, with some of them in different geographical areas, but
just think of the typical boss.

With Service Availability, you have Data Availability most of the
time, until one of the servers fails (if you have just 2 nodes); what
if you have more than two: well, good for you!  But you can keep
going with a single server, understanding that you are at high risk
and that it has to be fixed really soon (an emergency).


 If you troll the archives, you will see that this debate has received
 much consideration already. The conclusion is that if you care about
 Service Availibility you should have 2 standby servers and set them both
 as candidates to being the synchronous one.

That's more cost, and for most applications it isn't worth the extra cost.

Really, I see your point, and I have *never* asked to remove the data
guarantees, only to have an option to relax them if the particular
situation requires it: enough safety for a given cost.


 That way, when you lose one standby the service is unaffected, the
 second standby is now the synchronous one, and it's possible to
 re-attach the failed standby live, with or without archiving (with is
 preferred so that the master isn't involved in the catch-up phase).

 As synchronous standby currently is, it just doesn't fit the HA usage,

 It does actually allow both data high availability and service high
 availability, provided that you feed at least two standbys.

Still, it doesn't fit.  You need to spend more on hardware, and more on
power (and money there), and more carbon footprint... you get the point.
Also, having 3 servers for your DB can be necessary (and possible) for
some companies, but for others: no.


 What you seem to be asking is both data and service high availability
 with only two nodes. You're right that we can not provide that with
 current releases of PostgreSQL. I'm not sure anyone has a solid plan to
 make that happen.

 and if you really want to keep it that way, it doesn't belong to the
 HA chapter on the pgsql documentation, and should be moved.  And NO
 async replication will *not* work for HA, because the master can have
 more transactions than standby, and if the master crashes, the standby
 will have no way to recover these transactions, with synchronous
 replication we have *exactly* what we need: the data in the standby,
 after all, it will apply it once we promote it.

 Exactly. We want data availability first. Service availability is
 important too, and for that you need another standby.

Yeah, you need that with PostgreSQL, but not with DRBD, for example
(sorry, but DRBD is one of the flagships of HA in the Linux world).
Also, I'm not convinced about the 2nd standby thing... I mean, just
read this in the docs, which is a little alarming:

If primary restarts while commits are waiting for acknowledgement,
those waiting transactions will be marked fully committed once the
primary database recovers. There is no way to be certain that all
standbys have received all outstanding WAL data at time of the crash
of the primary. Some transactions may not show as committed on the
standby, even though they show as committed on the primary. The
guarantee we offer is that the application will not receive explicit
acknowledgement of the successful commit of a transaction until the
WAL data is known to be safely received by the standby.

So... there is no *real* guarantee here either... I don't know how I
skipped that paragraph before today... I mean, this implies that it
is possible that a transaction could be marked as committed on the
master, but the app was not informed of that (and thus could try to
send it again), and the transaction was NOT applied on the standby...
how can this happen? I mean, when the master comes back, shouldn't the
standby get the missing WAL pieces from the master and then apply the
transaction? The standby part is the one that I don't really get; on
the application side... well, there are several ways in which you can
miss the commit confirmation: connection issues at the worst moment,
and such, so I guess it is not *so* serious, and the app should have
a way of checking its last transaction if it lost connectivity to the
server before getting the transaction committed.

Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk ai...@highrise.ca wrote:
 On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas stho...@optionshouse.com 
 wrote:

 So far as transaction durability is concerned... we have a continuous
 background rsync over dark fiber for archived transaction logs, DRBD for
 block-level sync, filesystem snapshots for our backups, a redundant async DR
 cluster, an offsite backup location, and a tape archival service stretching
 back for seven years. And none of that will cause the master to stop
 processing transactions unless the master itself dies and triggers a
 failover.

 Right, so if the dark fiber between New Orleans and Seattle (pick two
 places for your datacenter) happens to be the first thing failing in
 your NO data center.  Disconenct the sync-ness, and continue.  Not a
 problem, unless it happens to be Aug 29, 2005.

 You have lost data.  Maybe only a bit.  Maybe it wasn't even
 important.  But that's not for PostgreSQL to decide.

I never asked for it... but, you (the one who is configuring the
system) can decide, and should be able to decide... right now: we
can't decide.


 But because your PG on DRDB continued when it couldn't replicate to
 Seattle, it told it's clients the data was durable, just minutes
 before the whole DC was under water.

Yeah, well, what is the probability of all of that?... really tiny.  I
bet it is more likely that you win the lottery than that all of these
events happen within that time frame.  But risking monetary losses
because, for example, the online store stopped accepting orders while
the standby server was down: that's not acceptable for some companies
(and some companies just can't buy 3 x DB servers, or more!).


 OK, so a wise admin team would have removed the NO DC from it's
 primary role days before that hit.

 Change the NO to NYC and the date Sept 11, 2001.

 OK, so maybe we can concede that these types of major catasrophies are
 more devestating to us than loosing some data.

 Now your primary server was in AWS US East last week.  It's sync slave
 was in the affected AZ, but your PG primary continues on, until, since
 it was a EC2 instance, it disappears.  Now where is your data?

Who would *really* trust their PostgreSQL DB to EC2?... I mean, the I/O
is not very good, and the price is not exactly low enough to justify
taking that risk.

All in all: you are still stringing together coincidences that have a
*very low* probability.


 Or the fire marshall orders the data center (or whole building) EPO,
 and the connection to your backup goes down minutes before your
 servers or other network peers.

 Using PG sync in its current incarnation would introduce an extra failure
 scenario that wasn't there before. I'm pretty sure we're not the only ones
 avoiding it for exactly that reason. Our queue discards messages it can't
 fulfil within ten seconds and then throws an error for each one. We need to
 decouple the secondary as quickly as possible if it becomes unresponsive,
 and there's really no way to do that without something in the database, one
 way or another.

 It introduces an extra failure, because it has introduce an extra
 data durability guarantee.

 Sure, many people don't *really* want that data durability guarantee,
 even though they would like the maybe guaranteed version of it.

 But that fine line is actually a difficult (impossible?) one to define
 if you don't know, at the moment of decision, what the next few
 moments will/could become.

You *never* know.  And the truth is that you have to make the decision
with what you have.  If you can pay for 10 servers nationwide: good for
you; not all of us can afford that (man, I could barely pay for two,
and that's because I *know* I don't want to risk losing the data or the
service because a single server died).

As it currently is, freezing the master because the standby died is not
good for all cases (and I dare say: not for most cases), and having to
wait for Pacemaker or other monitoring to notice that, change the
master's config and reload... will cause a service disruption! (for
several seconds, usually ~30 seconds).



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 12:17 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Jul 12, 2012 at 11:33:26AM +0530, Amit Kapila wrote:
  From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org]
  On Behalf Of Jose Ildefonso Camargo Tolosa

  Please, stop arguing on all of this: I don't think that adding an
  option will hurt anybody (specially because the work was already done
  by someone), we are not asking to change how the things work, we just
  want an option to decided whether we want it to freeze on standby
  disconnection, or if we want it to continue automatically... is that
  asking so much?

 I think this kind of decision should be done from outside utility or
 scripts.
 It would be better if from outside it can be detected that stand-by is down
 during sync replication, and send command to master to change its mode or
 change settings appropriately without stopping master.
 Putting this kind of more and more logic into replication code will make it
 more cumbersome.

 We certainly would need something external to inform administrators that
 the system is no longer synchronous.

That is *mandatory*, just as you monitor DRBD or disk arrays: if a disk
fails, an alert has to be issued so it gets fixed as soon as possible.

But such alerts can wait 30 seconds to be sent out, so any monitoring
system would be able to handle that; we just need to get the current
system status from the monitoring system and create the corresponding
rules: a simple matter, actually.
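
For example, a rough sketch of such a rule (illustrative only; the exact
threshold and the reaction belong in the monitoring tool, and the reload
only helps after synchronous_standby_names has been edited in
postgresql.conf):

  -- the monitor polls the master for attached synchronous standbys
  SELECT application_name, state, sync_state
    FROM pg_stat_replication
   WHERE sync_state = 'sync';
  -- if this returns no rows for longer than the allowed window, the
  -- monitor clears synchronous_standby_names in postgresql.conf and runs:
  SELECT pg_reload_conf();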



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Aidan Van Dyk
On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa

 Yeah, you need that with PostgreSQL, but no with DRBD, for example
 (sorry, but DRBD is one of the flagships of HA things in the Linux
 world).  Also, I'm not convinced about the 2nd standby thing... I
 mean, just read this on the docs, which is a little alarming:

 If primary restarts while commits are waiting for acknowledgement,
 those waiting transactions will be marked fully committed once the
 primary database recovers. There is no way to be certain that all
 standbys have received all outstanding WAL data at time of the crash
 of the primary. Some transactions may not show as committed on the
 standby, even though they show as committed on the primary. The
 guarantee we offer is that the application will not receive explicit
 acknowledgement of the successful commit of a transaction until the
 WAL data is known to be safely received by the standby.

 So... there is no *real* warranty here either... I don't know how I
 skipped that paragraph before today I mean, this implies that it
 is possible that a transaction could be marked as commited on the
 master, but the app was not informed on that (and thus, could try to
 send it again), and the transaction was NOT applied on the standby
 how can this happen? I mean, when the master comes back, shouldn't the
 standby get the missing WAL pieces from the master and then apply the
 transaction? The standby part is the one that I don't really get, on
 the application side... well, there are several ways in which you can
 miss the commit confirmation: connection issues in the worst moment,
 and the such, so, I guess it is not *so* serious, and the app should
 have a way of checking its last transaction if it lost connectivity to
 server before getting the transaction commited.

But you already have that in a single-server situation as well.  There
is a window between when the commit is durable (and visible to others,
and will be committed after recovery from a crash) and when the client
doesn't yet know it's committed (and might never get the commit message
due to a server crash, network disconnect, client middle-tier crash,
etc.).

So people are already susceptible to that, and defending against it, no? ;-)

And they are susceptible to that if they are on PostgreSQL, Oracle, MS
SQL, DB2, etc.

a.


-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.



Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Joe Conway
On 07/12/2012 02:53 PM, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
 As long as we're spending time on this, I'd propose getting rid of
 lanplistrusted, at least for access checking.  Instead, just don't
 install USAGE privileges by default for those languages.
 
 There's definitely something to that idea --- certainly lanpltrusted
 dates from before we had a robust object-permissions system, and looks
 like a bit of a wart now that we do have one.
 
 I guess we could redefine the default privileges for languages as none
 and then have the TRUSTED keyword mean to install public usage
 privilege.  Or maybe it would be safer for upgrade purposes if we kept
 the default interpretation as-is and did an automatic REVOKE when
 TRUSTED wasn't specified.

+1

I'll take a look at the latter option sometime in the next few weeks and
submit for the next commitfest.
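
Roughly, that latter option would amount to something like this (a sketch
only, with a made-up language name):

  CREATE LANGUAGE plfoou HANDLER plfoou_call_handler;  -- no TRUSTED keyword
  -- would then behave as if this had been issued immediately afterwards:
  REVOKE USAGE ON LANGUAGE plfoou FROM PUBLIC;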

Is it still worth backpatching a change to has_language_privilege as a
bug fix?

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support





Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Shigeru HANADA
(2012/07/12 20:48), Kohei KaiGai wrote:
 It seems to me what postgresql_fdw_validator() is doing looks like
 a function to be named as libpq_fdw_validator().
 
 How about your opinion? It will help this namespace conflicts.

I'd prefer dblink_fdw_validator.

The name libpq_fdw_validator gives the impression that a concrete FDW named
libpq_fdw exists somewhere and that it retrieves external data *from* libpq.
Indeed postgresql_fdw_validator allows only some of the libpq options at the
moment, but, for backward compatibility, we won't be able to rename it even
if it comes to need non-libpq options in the future.

IMO each FDW validator should basically be owned by a particular FDW,
because in most cases a validator needs deep knowledge of the FDW's
internals.  In addition, it may want to add new options for new features.

Besides naming, as mentioned upthread, removing the hard-coded libpq options
list from dblink and leaving it to the libpq client library would make
dblink more robust against libpq option changes in the future.

Regards,
-- 
Shigeru HANADA





Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 8:29 PM, Aidan Van Dyk ai...@highrise.ca wrote:
 On Thu, Jul 12, 2012 at 8:27 PM, Jose Ildefonso Camargo Tolosa

 Yeah, you need that with PostgreSQL, but no with DRBD, for example
 (sorry, but DRBD is one of the flagships of HA things in the Linux
 world).  Also, I'm not convinced about the 2nd standby thing... I
 mean, just read this on the docs, which is a little alarming:

 If primary restarts while commits are waiting for acknowledgement,
 those waiting transactions will be marked fully committed once the
 primary database recovers. There is no way to be certain that all
 standbys have received all outstanding WAL data at time of the crash
 of the primary. Some transactions may not show as committed on the
 standby, even though they show as committed on the primary. The
 guarantee we offer is that the application will not receive explicit
 acknowledgement of the successful commit of a transaction until the
 WAL data is known to be safely received by the standby.

 So... there is no *real* warranty here either... I don't know how I
 skipped that paragraph before today I mean, this implies that it
 is possible that a transaction could be marked as commited on the
 master, but the app was not informed on that (and thus, could try to
 send it again), and the transaction was NOT applied on the standby
 how can this happen? I mean, when the master comes back, shouldn't the
 standby get the missing WAL pieces from the master and then apply the
 transaction? The standby part is the one that I don't really get, on
 the application side... well, there are several ways in which you can
 miss the commit confirmation: connection issues in the worst moment,
 and the such, so, I guess it is not *so* serious, and the app should
 have a way of checking its last transaction if it lost connectivity to
 server before getting the transaction commited.

 But you already have that in a single server situation as well.  There
 is a window between when the commit is durable (and visible to
 others, and will be committed after recovery of a crash), when the
 client doesn't yet know it's committed (and might never get the commit
 message due to server crash, network disconnect, client middle-tier
 crash, etc).

 So people are already susceptible to that, and defending against it, no? ;-)

Right.  What I'm referring to is that particular part of the docs:

If primary restarts while commits are waiting for acknowledgement,
those waiting transactions will be marked fully committed once the
primary database recovers. (...) Some transactions may not show as
committed on the standby, even though they show as committed on the
primary. (...)

See? It sounds like, after the primary database recovers, the standby
will still not have the transaction committed, and as far as I thought
I knew, the standby should get that over the WAL stream from the master
once it reconnects to it.


 And they are susceptible to that if they are on PostgreSQL, Oracle, MS
 SQL, DB2, etc.

Certainly.  That's why I said:

(...)The standby part is the one that I don't really get, on
the application side... well, there are several ways in which you can
miss the commit confirmation: connection issues in the worst moment,
and the such, so, I guess it is not *so* serious, and the app should
have a way of checking its last transaction if it lost connectivity to
server before getting the transaction commited.



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 4:10 PM, Shaun Thomas stho...@optionshouse.com wrote:
 On 07/12/2012 12:02 PM, Bruce Momjian wrote:

 Well, the problem also exists if add it as an internal database
 feature --- how long do we wait to consider the standby dead, how do
 we inform administrators, etc.


 True. Though if there is no secondary connected, either because it's not
 there yet, or because it disconnected, that's an easy check. It's the
 network lag/stall detection that's tricky.

Well, yes... but how does PostgreSQL currently notice that its main
synchronous standby went away and that it has to use another standby as
the synchronous one?  How long does it take to notice that?



 I don't think anyone says the feature is useless, but is isn't going
 to be a simple boolean either.


 Oh $Deity no. I'd never suggest that. I just tend to be overly verbose, and
 sometimes my intent gets lost in the rambling as I try to explain my
 perspective. I apologize if it somehow came across that anyone could just
 flip a switch and have it work.

 My C is way too rusty, or I'd be writing an extension right now to do this,
 or be looking over that patch I linked to originally to make suitable
 adaptations. I know I talk about how relatively handy DRBD is, but it's also
 a gigantic PITA since it has to exist underneath the actual filesystem. :)


 --
 Shaun Thomas
 OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
 312-444-8534
 stho...@optionshouse.com





Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file foreign tables

2012-07-12 Thread Etsuro Fujita
Thanks!

Best regards,
Etsuro Fujita

 -Original Message-
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]
 Sent: Friday, July 13, 2012 5:30 AM
 To: Etsuro Fujita
 Cc: 'Kohei KaiGai'; 'Robert Haas'; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file
 foreign tables
 
 Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
  Besides the revision, I modified check_selective_binary_conversion() to run
  heap_close() in the whole-row-reference case.  Attached is an updated
version
 of
  the patch.
 
 Applied with minor, mostly-cosmetic revisions.  I did fix
 check_selective_binary_conversion to not continue touching the
 relation's tupledesc after heap_close.  Also I thought
 convert_selectively was a better name for the hidden COPY option.
 
   regards, tom lane





Re: [HACKERS] has_language_privilege returns incorrect answer for non-superuser

2012-07-12 Thread Tom Lane
Joe Conway m...@joeconway.com writes:
 Is it still worth backpatching a change to has_language_privilege as a
 bug fix?

I think back-patching a behavioral change in this area is probably a
bad idea.  We can fix it (in one way or another) going forward, but
changing this sort of thing in a minor release seems likely to have
more bad consequences than good ones.

regards, tom lane



Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Kohei KaiGai
2012/7/13 Shigeru HANADA shigeru.han...@gmail.com:
 (2012/07/12 20:48), Kohei KaiGai wrote:
 It seems to me what postgresql_fdw_validator() is doing looks like
 a function to be named as libpq_fdw_validator().

 How about your opinion? It will help this namespace conflicts.

 I'd prefer dblink_fdw_validator.

 The name libpq_fdw_validator impresses me that a concrete FDW named
 libpq_fdw is somewhere and it retrieves external data *from* libpq.
 Indeed postgresql_fdw_validator allows only some of libpq options at the
 moment, but we won't be able to rename it for backward compatibility
 even if it wants to have non-libpq options in the future.

 IMO basically each FDW validator should be owned by a particular FDW,
 because in most cases validator should know FDW's internal deeply.  In
 addition, it would want to have new options for new features.

 Besides naming, as mentioned upthread, removing hard-coded libpq options
 list from dblink and leaving it to libpq client library would make
 dblink more robust about libpq option changes in future.

OK, that seems fair enough to me.

Does anyone have a different opinion?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp



Re: [HACKERS] pgsql_fdw in contrib

2012-07-12 Thread Etsuro Fujita
Hi Hanada-san,

 It would be possible to add some more features, such as ORDER BY
 push-down with index information support, without changing existing
 APIs, but at first add relatively simple pgsql_fdw and enhance it seems
 better.  In addition, once pgsql_fdw has been merged, it would help
 implementing proof-of-concept of SQL/MED-related features.

I agree on this point.  However, I think it is preferable that pgsql_fdw
should support, from the start, push-down of PARAM_EXEC params, and thus the
parameter-change-driven remote-rescanning functionality that goes with it.  I
think such functionality is necessary for pgsql_fdw to efficiently process
SubLinks on remote tables, and to realize parameterized scans in the future,
which I think will serve as proof-of-concept code demonstrating to developers
how to enhance FDWs, as discussed in the earlier thread on the IMAP FDW...

Thanks,

Best regards,
Etsuro Fujita





Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation

2012-07-12 Thread Jeff Janes
I've moved this thread from performance to hackers.

The topic was poor performance when truncating lots of small tables
repeatedly on test environments with fsync=off.

On Thu, Jul 12, 2012 at 6:00 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 I think the problem is in the Fsync Absorption queue.  Every truncate
 adds a FORGET_RELATION_FSYNC to the queue, and processing each one of
 those leads to sequential scanning the checkpointer's pending ops hash
 table, which is quite large.  It is almost entirely full of other
 requests which have already been canceled, but it still has to dig
 through them all.   So this is essentially an N^2 operation.

My attached Proof of Concept patch reduces the run time of the
benchmark at the end of this message from 650sec to 84sec,
demonstrating that this is in fact the problem.  Which doesn't mean
that my patch is the right answer to it, of course.

(The delete option is still faster than truncate, coming in at around 55sec)


 I'm not sure why we don't just delete the entry instead of marking it
 as cancelled.  It looks like the only problem is that you can't delete
 an entry other than the one just returned by hash_seq_search.  Which
 would be fine, as that is the entry that we would want to delete;
 except that mdsync might have a different hash_seq_search open, and so
 it wouldn't be safe to delete.

 If the segno was taken out of the hash key and handled some other way,
 then the forgetting could be done with a simple hash look up rather
 than a full scan.

The above two ideas might be the better solution, as they would work
even when fsync=on.  Since BBUs are becoming so popular, I think the
fsync queue could be a problem even with fsync on, if the fsyncs are
fast enough.  But I don't immediately know how to implement them.

 Maybe we could just turn off the pending ops table altogether when
 fsync=off, but since fsync is PGC_SIGHUP it is not clear how you could
 safely turn it back on.

Now that I think about it, I don't see how turning fsync from off to
on can ever be known to be safe, until a system wide sync has
intervened.  After all a segment that was dirtied and added to the
pending ops table while fsync=off might also be removed from the
pending ops table the microsecond before fsync is turned on, so how is
that different from never adding it in the first place?

The attached Proof Of Concept patch implements this in two ways, one
of which is commented out.  The commented out way omits the overhead
of sending the request to the checkpointer in the first place, but
breaks modularity a bit.

The benchmark used on 9.3devel head is:

fsync=off, all other defaults.

## one time initialization
perl -le 'print "create schema foo$_; create table foo$_.foo$_ (k
integer, v integer);" foreach $ARGV[0]..$ARGV[0]+$ARGV[1]-1' 0 10 |psql

## actual benchmark.
perl -le 'print "set client_min_messages=warning;";
foreach (1..1) {
print "BEGIN;\n";
print "insert into foo$_.foo$_ select * from
generate_series(1,10); " foreach $ARGV[0]..$ARGV[0]+$ARGV[1]-1;
print "COMMIT;\nBEGIN;\n";
print "truncate table foo$_.foo$_; " foreach
$ARGV[0]..$ARGV[0]+$ARGV[1]-1;
#print "delete from foo$_.foo$_; " foreach
$ARGV[0]..$ARGV[0]+$ARGV[1]-1;
print "COMMIT;\n"
   }  ' 0 10 | time psql > /dev/null

Cheers,

Jeff


fsync_queue_POC.patch
Description: Binary data



Re: [HACKERS] Synchronous Standalone Master Redoux

2012-07-12 Thread Amit Kapila

 From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] 
 On Behalf Of Jose Ildefonso Camargo Tolosa
On Thu, Jul 12, 2012 at 9:28 AM, Aidan Van Dyk ai...@highrise.ca wrote:
 On Thu, Jul 12, 2012 at 9:21 AM, Shaun Thomas stho...@optionshouse.com
wrote:


 As currently is, the point of: freezing the master because standby
 dies is not good for all cases (and I dare say: for most cases), and
 having to wait for pacemaker or other monitoring to note that, change
 master config and reload... it will cause a service disruption! (for
 several seconds, usually, ~30 seconds).

Yes, it is true that it can cause a service disruption, but the same will be
true even if the master detects that internally via a timeout.
By keeping this external, the current behavior of PostgreSQL can be
maintained: if there is no standby in sync mode, it will wait, and the
purpose is still served because the mode-change command can be sent to the
master from outside.

