Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread David E. Wheeler
On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:

 I'd like to support UTF-8 text or csv files that has BOM (byte order mark)
 in COPY FROM command. BOM will be automatically detected and ignored
 if the file encoding is UTF-8. WIP patch attached.

By my reading of http://unicode.org/faq/utf_bom.html#bom5, I'd say +1

So I think what you propose makes sense.

 I'm thinking about only COPY FROM for reads, but if someone wants to add
 BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

I think it would have to be optional, since some recipients of UTF-8 encoded 
data do not expect a BOM.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remastering using streaming only replication?

2011-09-26 Thread Fujii Masao
On Thu, Sep 22, 2011 at 1:52 AM, Josh Berkus j...@agliodbs.com wrote:
 Fujii,

 I haven't really been following your latest patches about taking backups
 from the standby and cascading replication, but I wanted to see if it
 fulfills another TODO: the ability to remaster (that is, designate the
 lead standby as the new master) without needing to copy WAL files.

Sorry, I could not follow you. I believe that we can remaster even in 9.1.
When the master crashes, we can choose the lead standby by comparing
each standby replay location, and can promote it by pg_ctl promote.

What remaster feature are you expecting we should develop in 9.2?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLE tab completion

2011-09-26 Thread Magnus Hagander
On Sun, Sep 25, 2011 at 15:06, Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 24 September 2011 11:59, Magnus Hagander mag...@hagander.net wrote:
 TABLE tab completion in psql only completes to tables, not views. but
 the TABLE command works fine for both tables and views (and also
 sequences).

 Seems we should just complete it to relations and not tables - or can
 anyone see a particular reason why we shouldn't?


 Doesn't that mean that DROP TABLE tab would offer up views as well
 as tables, which would be incorrect?

Meh - you are correct, of course. I guess that's why we have code review :-)

So - not a oneliner, but how about something like this?

(Happy to have someone point out a neater way of doing it, not
entirely fluent in how we do the tab completion..)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f7df36..0e2f524 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2648,6 +2648,12 @@ psql_completion(char *text, int start, int end)
 	else if (pg_strcasecmp(prev_wd, START) == 0)
 		COMPLETE_WITH_CONST(TRANSACTION);
 
+/* TABLE */
+	else if (pg_strcasecmp(prev_wd, TABLE) == 0 
+			 pg_strcasecmp(prev2_wd, TABLE) == 0)
+		/* must not match DROP TABLE etc */
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_relations, NULL);
+
 /* TRUNCATE */
 	else if (pg_strcasecmp(prev_wd, TRUNCATE) == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] new createuser option for replication role

2011-09-26 Thread Fujii Masao
On Fri, Sep 23, 2011 at 10:47 PM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2011/9/23 Robert Haas robertmh...@gmail.com:
 On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Agreed. Attached is the updated version of the patch. It adds two options
 --replication and --no-replication. If neither specified, neither 
 REPLICATION
 nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case,
 replication privilege is granted to only superuser.

 Committed.  Do we need to make any changes in interactive mode, when
 we prompt?  In theory this could be either wanted or not wanted for
 either superusers or non-superusers, but I'm not really sure it's
 worth it, and I certainly don't want the command to go into
 interactive mode just because neither --replication nor
 --no-replication was specified.

 Thoughts?

 I believe the intereactive mode is useless.

Agreed. I think that a majority of createuser users are not interested
in REPLICATION privilege yet.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-26 Thread Jun Ishiduka
 Attached is the updated version of the patch. I refactored the code, fixed
 some bugs, added lots of source code comments, improved the document,
 but didn't change the basic design. Please check this patch, and let's use
 this patch as the base if you agree with that.

Thanks for update patch.
Yes. I agree.


 In the current patch, there is no safeguard for preventing users from
 taking backup during recovery when FPW is disabled. This is unsafe.
 Are you planning to implement such a safeguard?

Yes.
I want to reference the following Fujii's comments.

-
 Right. Let me explain again what I'm thinking.
 
 When FPW is changed, the master always writes the WAL record
 which contains the current value of FPW. This means that the standby
 can track all changes of FPW by reading WAL records.
 
 The standby has two flags: One indicates whether FPW has always
 been TRUE since last restartpoint. Another indicates whether FPW
 has always been TRUE since last pg_start_backup(). The standby
 can maintain those flags by reading WAL records streamed from
 the master.
 
 If the former flag indicates FALSE (i.e., the WAL records which
 the standby has replayed since last restartpoint might not contain
 required FPW), pg_start_backup() fails. If the latter flag indicates
 FALSE (i.e., the WAL records which the standby has replayed
 during the backup might not contain required FPW),
 pg_stop_backup() fails.
 
 If I'm not missing something, this approach can address the problem
 which you're concerned about.
-

Regards.



Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka@po.ntts.co.jp




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Kohei KaiGai
2011/9/26 Tom Lane t...@sss.pgh.pa.us:
 As a stopgap, what about removing sepgsql from the list of contrib
 modules tested by make -C contrib check?  (I haven't looked at
 exactly how ugly it might be to do that, nor whether we'd have to also
 disable installcheck from recursing to sepgsql.)

Is it unavailable to provide a hint to inform the main Makefile this contrib
module does not support either make 'check' or 'installcheck'?

For example, How about an idea to add REGRESS_AVAILABLE that
takes empty, 'only-check' or 'only-checkinstall'; and skip regression test
when 'only-checkinstall' is set on this?
If so, we shall handle similar cases in the future; without trade-off.

I also think it is not a reasonable option to remove --with-selinux to restain
regression test, instead of libselinux version checks.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bug of recovery?

2011-09-26 Thread Fujii Masao
Hi,

Currently, if a reference to an invalid page is found during recovery,
its information
is saved in hash table invalid_page_tab. Then, if such a reference
is resolved,
its information is removed from the hash table. If there is unresolved
reference to
an invalid page in the hash table at the end of recovery, PANIC error occurs.

What I'm worried about is that the hash table is volatile. If a user restarts
the server before reaching end of recovery, any information in the
hash table is lost,
and we wrongly miss the PANIC error case because we cannot find any unresolved
reference. That is, even if database is corrupted at the end of recovery,
a user might not be able to notice that. This looks like a serious problem. No?

To prevent the above problem, we should write the contents of the hash table to
the disk for every restartpoints, I think. Then, when the server
starts recovery,
it should reload the hash table from the disk. Thought? Am I missing something?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Robert Haas robertmh...@gmail.com:
 On Sun, Sep 25, 2011 at 3:25 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I'm a bit nervous about storing security_barrier in the RTE.  What
 happens to stored rules if the security_barrier option gets change
 later?

 The rte-security_barrier is evaluated when a query referencing security
 views get expanded. So, rte-security_barrier is not stored to catalog.

 I think it is.  If you create a view that involves an RTE, the node
 tree is going to get stored in pg_rewrite.ev_action.  And it's going
 to include the security_barrier attribute, because you added outfuncs
 support for it...

 No?

IIUC, nested views are also expanded when user's query gets rewritten.
Thus, rte-security_barrier shall be set based on the latest configuration
of the view.
I injected an elog(NOTICE, ...) to confirm the behavior, when security_barrier
flag was set on rte-security_barrier at ApplyRetrieveRule().

postgres=# CREATE VIEW v1 WITH (security_barrier=true) AS SELECT *
FROM t1 WHERE a % 2 = 0;
CREATE VIEW
postgres=# CREATE VIEW v2 WITH (security_barrier=true) AS SELECT a + a
AS aa, b FROM v1;
CREATE VIEW

postgres=# SELECT * FROM v2;
NOTICE:  security barrier set on v1
NOTICE:  security barrier set on v2
 aa |  b
+-
  4 | bbb
(1 row)

postgres=# ALTER TABLE v1 SET (security_barrier=false);
ALTER TABLE
postgres=# SELECT * FROM v2;
NOTICE:  security barrier set on v2
 aa |  b
+-
  4 | bbb
(1 row)

postgres=# ALTER TABLE v1 SET (security_barrier=true);
ALTER TABLE
postgres=# SELECT * FROM v2;
NOTICE:  security barrier set on v1
NOTICE:  security barrier set on v2
 aa |  b
+-
  4 | bbb
(1 row)

It seems to me the rte-security_barrier flag is correctly set, even
if underlying view's option was changed later.

Thanks,
--
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Peter Eisentraut
On sön, 2011-09-25 at 23:49 -0400, Robert Haas wrote:
 In fact, I've been wondering if we ought to go a step further and not
 recurse into the sepgsql directory for *any* of the targets.  Then we
 could get rid of the associated configure option, which no longer
 serves any other purpose, and just say that if you want to build (or
 regression-test) sepgsql, well, you gotta go to that directory and do
 it by hand.

I'm not in favor of that.  It's nice to have a uniform interface that
decides what to build.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Robert Haas robertmh...@gmail.com:
 On Sun, Sep 25, 2011 at 10:38 PM, Noah Misch n...@leadboat.com wrote:
 On Sun, Sep 25, 2011 at 11:22:03AM -0500, Kevin Grittner wrote:
 Robert Haas  09/25/11 10:58 AM 

  I'm not sure we've been 100% consistent about that, since we
  previously made CREATE OR REPLACE LANGUAGE not replace the owner
  with the current user.

 I think we've been consistent in *not* changing security on an
 object when it is replaced.

 [CREATE OR REPLACE FUNCTION does not change proowner or proacl]

 Good point.  C-O-R VIEW also preserves column default values.  I believe we 
 are
 consistent to the extent that everything possible to specify in each C-O-R
 statement gets replaced outright.  The preserved characteristics *require*
 commands like GRANT, COMMENT and ALTER VIEW to set in the first place.

 The analogue I had in mind is SECURITY DEFINER, which C-O-R FUNCTION reverts 
 to
 SECURITY INVOKER if it's not specified each time.  That default is safe, 
 though,
 while the proposed default of security_barrier=false is unsafe.

 Even though I normally take the opposite position, I still like the
 idea of dedicated syntax for this feature.  Not knowing what view
 options we might end up with in the future, I hate having to decide on
 what the general behavior ought to be.  But it would be easy to decide
 that CREATE SECURITY VIEW followed by CREATE OR REPLACE VIEW leaves
 the security flag set; it would be consistent with what we're doing
 with owner and acl information and wouldn't back us into any
 unpleasant decisions down the road.

Does the CREATE SECURITY VIEW statement mean a synonym of
CREATE VIEW ... WITH (security_barrier=true) ?

If so, it seems to me reasonable to keep the configuration when user
provides no explicit option.

1) an explicit WITH(security_barrier=true) / CREATE SECURITY VIEW
 - It always turns on a security_barrier option.

2) an explicit WITH(security_barrier=false)
 - It always turns off security_barrier option.

3) no explicit option / CREATE VIEW
 - Keep existing configuration, although inconsist with SECURITY DEFINER

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-26 Thread Peter Eisentraut
On lör, 2011-09-24 at 13:04 -0400, Tom Lane wrote:
  What if we modified pg_ctl to allow passing configuration parameters
  through to postmaster,
 
 You mean like pg_ctl -o?

Note that pg_ctl -o saves the parameters it uses and reapplies them
after a restart.  So this is not really the way to apply parameter
settings only once for recovery.  (Or at least it has the potential to
be a very confusing way.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-26 Thread Peter Eisentraut
On lör, 2011-09-24 at 14:02 +0100, Simon Riggs wrote:
 The semantics are clear: recovery.conf is read first, then
 postgresql.conf. It's easy to implement (1 line of code) and easy to
 understand.

What's clear about that?  My intuition would have been that
recovery.conf is read second.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-26 Thread Peter Eisentraut
On sön, 2011-09-25 at 12:58 -0400, Tom Lane wrote:
 And it's not like we don't break configuration file
 contents in most releases anyway, so I really fail to see why this one
 has suddenly become sacrosanct. 

Well, there is a slight difference.  Changes in postgresql.conf
parameter names and settings are adjusted automatically for me by my
package upgrade script.  If we, say, changed the names of recovery.conf
parameters, I'd have to get a new version of my $SuperReplicationTool.
That tool could presumably look at PG_VERSION and put a recovery.conf
with the right spellings in the right place.

But if we completely change the way the replication configuration
interacts, it's not clear that a smooth upgrade is possible without
significant effort.  That said, I don't see why it wouldn't be possible,
but let's design with upgradability in mind, instead of claiming that we
have never supported upgrades of this kind anyway.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Is there any plan to add unsigned integer types?

2011-09-26 Thread crocket
MySQL already has unsigned INT type, and it has double the range of
signed INT type.
It's not just the bigger range that UINT type brings.
If unsigned INT type exists, I wouldn't have to execute create domain
UINT in every database.

If INT unsigned and SERIAL unsigned exist, PostgreSQL would bring
more convenience to users.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-26 Thread Peter Eisentraut
On fre, 2011-09-23 at 20:35 +0300, Marcin Mańk wrote:
 One idea:
 col like 'foo%' could be translated to col = 'foo' and col = foo || 'zzz' , 
 where 'z' is the largest possible character. This should be good enough  for 
 calculating stats.
 
  How to find such a character, i do not know.

That's what makes this so difficult.

If we knew the largest character, we could probably also find the
largest-1, largest-2, etc. characters and determine the total order of
everything.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there any plan to add unsigned integer types?

2011-09-26 Thread Peter Eisentraut
On mån, 2011-09-26 at 19:41 +0900, crocket wrote:
 MySQL already has unsigned INT type, and it has double the range of
 signed INT type.
 It's not just the bigger range that UINT type brings.
 If unsigned INT type exists, I wouldn't have to execute create domain
 UINT in every database.
 
 If INT unsigned and SERIAL unsigned exist, PostgreSQL would bring
 more convenience to users.
 

I believe there have been many discussions about this in the past,
outlining the various issues that would come with this project.  A first
step would be to start implementing this in user space and see how much
breaks.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Magnus Hagander
On Mon, Sep 26, 2011 at 06:58, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 Hi,

 I'd like to support UTF-8 text or csv files that has BOM (byte order mark)
 in COPY FROM command. BOM will be automatically detected and ignored
 if the file encoding is UTF-8. WIP patch attached.

 I'm thinking about only COPY FROM for reads, but if someone wants to add
 BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

 Comments welcome.

I like it in general. But if we're looking at the BOM, shouldn't we
also look and *reject* the file if it's a BOM for a non-UTF8 file? Say
if the BOM claims it's UTF16?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Itagaki Takahiro
On Mon, Sep 26, 2011 at 20:12, Magnus Hagander mag...@hagander.net wrote:
 I like it in general. But if we're looking at the BOM, shouldn't we
 also look and *reject* the file if it's a BOM for a non-UTF8 file? Say
 if the BOM claims it's UTF16?

-1 because we're depending on manual configuration for now.
It would be reasonable if we had used automatic detection of
character encoding, but we don't. In addition, some crazy
encoding might use BOM codes as a valid character.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Magnus Hagander
On Mon, Sep 26, 2011 at 13:36, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Mon, Sep 26, 2011 at 20:12, Magnus Hagander mag...@hagander.net wrote:
 I like it in general. But if we're looking at the BOM, shouldn't we
 also look and *reject* the file if it's a BOM for a non-UTF8 file? Say
 if the BOM claims it's UTF16?

 -1 because we're depending on manual configuration for now.
 It would be reasonable if we had used automatic detection of
 character encoding, but we don't. In addition, some crazy
 encoding might use BOM codes as a valid character.

Does such an encoding really exist? And the code only executes when
the user thinks he's in UTF8, right? So it would still only happen if
the incorrect encoding was specified..


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Andrew Dunstan



On 09/26/2011 07:12 AM, Magnus Hagander wrote:

On Mon, Sep 26, 2011 at 06:58, Itagaki Takahiro
itagaki.takah...@gmail.com  wrote:

Hi,

I'd like to support UTF-8 text or csv files that has BOM (byte order mark)
in COPY FROM command. BOM will be automatically detected and ignored
if the file encoding is UTF-8. WIP patch attached.

I'm thinking about only COPY FROM for reads, but if someone wants to add
BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

Comments welcome.

I like it in general. But if we're looking at the BOM, shouldn't we
also look and *reject* the file if it's a BOM for a non-UTF8 file? Say
if the BOM claims it's UTF16?




It should be rejected as invalidly encoded anyway, as a non-utf8 BOM is 
not valid utf-8. We shouldn't check in non-unicode cases where the 
sequence might be valid in those encodings (e.g. ISO-8859-1).


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-26 Thread Fujii Masao
On Fri, Sep 23, 2011 at 12:44 AM, Magnus Hagander mag...@hagander.net wrote:
 Would it make sense for pg_start_backup() to have the ability to wait
 for the next restartpoint in a case like this, if we know that FPW has
 been set? Instead of failing? Or maybe that's just overcomplicating
 things when trying to be user-friendly.

I don't think that it's worth adding code for such a feature. Because I believe
there are not many users who enable FPW on-the-fly for standby-only backup
and use such a feature.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLE tab completion

2011-09-26 Thread David Fetter
On Mon, Sep 26, 2011 at 10:37:50AM +0200, Magnus Hagander wrote:
 On Sun, Sep 25, 2011 at 15:06, Dean Rasheed dean.a.rash...@gmail.com wrote:
  On 24 September 2011 11:59, Magnus Hagander mag...@hagander.net wrote:
  TABLE tab completion in psql only completes to tables, not views. but
  the TABLE command works fine for both tables and views (and also
  sequences).
 
  Seems we should just complete it to relations and not tables - or can
  anyone see a particular reason why we shouldn't?
 
 
  Doesn't that mean that DROP TABLE tab would offer up views as well
  as tables, which would be incorrect?
 
 Meh - you are correct, of course. I guess that's why we have code review :-)
 
 So - not a oneliner, but how about something like this?
 
 (Happy to have someone point out a neater way of doing it, not
 entirely fluent in how we do the tab completion..)

That's pretty much it.  Should it also (eventually) do SRFs?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] DROP statement reworks

2011-09-26 Thread Dimitri Fontaine
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 The patches are not in git am format nor in patch format, so I could
 only read them, I didn't install them nor compiled the code, didn't run
 the regression tests.

I've marked the patch as Waiting on Author and referred to my previous
message.  Thanks,

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Displaying accumulated autovacuum cost

2011-09-26 Thread Shigeru Hanada
Hi Greg,

(2011/08/28 18:54), Greg Smith wrote:
 Updated patch cleans up two diff mistakes made when backing out the 
 progress report feature. The tip-off I screwed up should have been the 
 absurdly high write rate shown. The usleep was accidentally deleted, so 
 it was running without cost limits even applying. Here's a good one 
 instead:
 
 LOG: automatic vacuum of table pgbench.public.pgbench_accounts: index 
 scans: 1
 pages: 0 removed, 163935 remain
 tuples: 200 removed, 2928356 remain
 buffer usage: 117393 hits, 123351 misses, 102684 dirtied, 2.168 MiB/s 
 write rate
 system usage: CPU 2.54s/6.27u sec elapsed 369.99 sec

I took a look at your patch, and it seems fine about fundamental
functionality, though the patch needed to be rebased against current
HEAD.  Please see attached patch which I used for review.

IIUC, this patch provides:
* Three counters, which are used to keep # of buffers which were (1)
Hits: found in shared buffers, (2) Missed: not found in shared buffers,
and (3) Dirtied: marked as dirty, in an autovacuum of a relation.
These counters are used only when cost-based autovacuum is enabled by
setting vacuum_cost_delay to non-zero.
* Capability to report # of buffers above, and buffer write rate
(MiB/sec) in the existing autovacuum logging message, only when actual
duration  autovacuum_min_duration, and cost-based autovacuum is enabled.

I think one concern is the way showing statistics.  If showing summary
of statistics (at the end of an autovacuum) in server log is enough,
current implementation is fine.  Also showing progress report in server
log would be easy to achieve.  In contrast, reporting progress via
another backend would require shared memory or statistics collector,
rather than per-process global variables.  Anyway, this patch can be the
base of such enhancement.

There are some trivial comments:
* Local variables added by the patch (secs, usecs, write_rate and
endtime) can be moved into narrower scope.
* Initializing starttime to zero seems unnecessary.

In addition, documents about how to use the statistics would be
necessary, maybe in 23.1.5. The Autovacuum Daemon.

I'll set the status of this patch to waiting-on-author. Would you rebase
the patch and post it again?

Regards,
-- 
Shigeru Hanada
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7fe787e..768c658 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*** vacuum(VacuumStmt *vacstmt, Oid relid, b
*** 214,219 
--- 214,222 
  
VacuumCostActive = (VacuumCostDelay  0);
VacuumCostBalance = 0;
+   VacuumPageHit = 0;
+   VacuumPageMiss = 0;
+   VacuumPageDirty = 0;
  
/*
 * Loop to process each selected relation.
diff --git a/src/backend/commands/vacuumlazy.c 
b/src/backend/commands/vacuumlazy.c
index cf8337b..d9bb272 100644
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*** lazy_vacuum_rel(Relation onerel, VacuumS
*** 153,170 
int nindexes;
BlockNumber possibly_freeable;
PGRUsageru0;
!   TimestampTz starttime = 0;
boolscan_all;
TransactionId freezeTableLimit;
BlockNumber new_rel_pages;
double  new_rel_tuples;
TransactionId new_frozen_xid;
  
/* measure elapsed time iff autovacuum logging requires it */
if (IsAutoVacuumWorkerProcess()  Log_autovacuum_min_duration = 0)
{
pg_rusage_init(ru0);
!   if (Log_autovacuum_min_duration  0)
starttime = GetCurrentTimestamp();
}
  
--- 153,173 
int nindexes;
BlockNumber possibly_freeable;
PGRUsageru0;
!   TimestampTz starttime = 0, endtime;
boolscan_all;
TransactionId freezeTableLimit;
BlockNumber new_rel_pages;
double  new_rel_tuples;
TransactionId new_frozen_xid;
+   longsecs;
+   int usecs;
+   double  write_rate;
  
/* measure elapsed time iff autovacuum logging requires it */
if (IsAutoVacuumWorkerProcess()  Log_autovacuum_min_duration = 0)
{
pg_rusage_init(ru0);
!   if (Log_autovacuum_min_duration  0 || VacuumCostActive)
starttime = GetCurrentTimestamp();
}
  
*** lazy_vacuum_rel(Relation onerel, VacuumS
*** 250,272 
/* and log the action if appropriate */
if (IsAutoVacuumWorkerProcess()  Log_autovacuum_min_duration = 0)
{
if (Log_autovacuum_min_duration == 0 ||
!   TimestampDifferenceExceeds(starttime, 
GetCurrentTimestamp(),
   

Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 6:28 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2011/9/26 Robert Haas robertmh...@gmail.com:
 On Sun, Sep 25, 2011 at 10:38 PM, Noah Misch n...@leadboat.com wrote:
 On Sun, Sep 25, 2011 at 11:22:03AM -0500, Kevin Grittner wrote:
 Robert Haas  09/25/11 10:58 AM 

  I'm not sure we've been 100% consistent about that, since we
  previously made CREATE OR REPLACE LANGUAGE not replace the owner
  with the current user.

 I think we've been consistent in *not* changing security on an
 object when it is replaced.

 [CREATE OR REPLACE FUNCTION does not change proowner or proacl]

 Good point.  C-O-R VIEW also preserves column default values.  I believe we 
 are
 consistent to the extent that everything possible to specify in each C-O-R
 statement gets replaced outright.  The preserved characteristics *require*
 commands like GRANT, COMMENT and ALTER VIEW to set in the first place.

 The analogue I had in mind is SECURITY DEFINER, which C-O-R FUNCTION 
 reverts to
 SECURITY INVOKER if it's not specified each time.  That default is safe, 
 though,
 while the proposed default of security_barrier=false is unsafe.

 Even though I normally take the opposite position, I still like the
 idea of dedicated syntax for this feature.  Not knowing what view
 options we might end up with in the future, I hate having to decide on
 what the general behavior ought to be.  But it would be easy to decide
 that CREATE SECURITY VIEW followed by CREATE OR REPLACE VIEW leaves
 the security flag set; it would be consistent with what we're doing
 with owner and acl information and wouldn't back us into any
 unpleasant decisions down the road.

 Does the CREATE SECURITY VIEW statement mean a synonym of
 CREATE VIEW ... WITH (security_barrier=true) ?

 If so, it seems to me reasonable to keep the configuration when user
 provides no explicit option.

 1) an explicit WITH(security_barrier=true) / CREATE SECURITY VIEW
  - It always turns on a security_barrier option.

 2) an explicit WITH(security_barrier=false)
  - It always turns off security_barrier option.

 3) no explicit option / CREATE VIEW
  - Keep existing configuration, although inconsist with SECURITY DEFINER

No, you're missing my point completely.  If we use a flexible options
syntax here, then we have to decide on what behavior CREATE OR REPLACE
should have for all future options, without knowing what they are yet,
or what behavior will be appropriate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Upgrading Extenions from 8.4

2011-09-26 Thread Dimitri Fontaine
Hi,

I'm in the middle of catching-up with pgsql-general and I see several
confused users about how to upgrade directly from 8.4.  As Tom said, we
could easily provide upgrade scripts to handle the move, we just
didn't, so there's some more manual work to do.

I'm wondering how much work that really is — I had a SQL query that was
good at preparing the extension--unpackaged--1.0.sql script, so I could
use that to produce a set of extension--unpackaged-in-8.4--1.0.sql ones.

Now, should I get to do that, would that be accepted into PostgreSQL
itself in, say, 9.1.2?  If not, I have to think about how to distribute
that, and depending on how to, if there's any value doing that for our
users.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Robert Haas robertmh...@gmail.com:
 On Mon, Sep 26, 2011 at 6:28 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2011/9/26 Robert Haas robertmh...@gmail.com:
 On Sun, Sep 25, 2011 at 10:38 PM, Noah Misch n...@leadboat.com wrote:
 On Sun, Sep 25, 2011 at 11:22:03AM -0500, Kevin Grittner wrote:
 Robert Haas  09/25/11 10:58 AM 

  I'm not sure we've been 100% consistent about that, since we
  previously made CREATE OR REPLACE LANGUAGE not replace the owner
  with the current user.

 I think we've been consistent in *not* changing security on an
 object when it is replaced.

 [CREATE OR REPLACE FUNCTION does not change proowner or proacl]

 Good point.  C-O-R VIEW also preserves column default values.  I believe 
 we are
 consistent to the extent that everything possible to specify in each C-O-R
 statement gets replaced outright.  The preserved characteristics *require*
 commands like GRANT, COMMENT and ALTER VIEW to set in the first place.

 The analogue I had in mind is SECURITY DEFINER, which C-O-R FUNCTION 
 reverts to
 SECURITY INVOKER if it's not specified each time.  That default is safe, 
 though,
 while the proposed default of security_barrier=false is unsafe.

 Even though I normally take the opposite position, I still like the
 idea of dedicated syntax for this feature.  Not knowing what view
 options we might end up with in the future, I hate having to decide on
 what the general behavior ought to be.  But it would be easy to decide
 that CREATE SECURITY VIEW followed by CREATE OR REPLACE VIEW leaves
 the security flag set; it would be consistent with what we're doing
 with owner and acl information and wouldn't back us into any
 unpleasant decisions down the road.

 Does the CREATE SECURITY VIEW statement mean a synonym of
 CREATE VIEW ... WITH (security_barrier=true) ?

 If so, it seems to me reasonable to keep the configuration when user
 provides no explicit option.

 1) an explicit WITH(security_barrier=true) / CREATE SECURITY VIEW
  - It always turns on a security_barrier option.

 2) an explicit WITH(security_barrier=false)
  - It always turns off security_barrier option.

 3) no explicit option / CREATE VIEW
  - Keep existing configuration, although inconsist with SECURITY DEFINER

 No, you're missing my point completely.  If we use a flexible options
 syntax here, then we have to decide on what behavior CREATE OR REPLACE
 should have for all future options, without knowing what they are yet,
 or what behavior will be appropriate.

Hmm. Indeed, it seems to me fair enough reason.

In this syntax case, the only way to clear the security_barrier flag
is to drop view
once, then create a view, isn't it?
And, is the security_barrier flag still stored within reloptions field?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-26 Thread Peter Geoghegan
On 26 September 2011 04:46, Robert Haas robertmh...@gmail.com wrote:
 I don't want to take time to review this in detail right now, because
 I don't think it would be fair to put this ahead of things that were
 submitted for the current CommitFest, but I'm impressed.

Thank you.

Now that I think about it, the if statement that determines if the
int4 specialisation will be used may be okay - it sort of documents
the conditions under which the int4 specialisation should be used with
reference to how the else generic case actually works, which is
perhaps no bad thing. I now realise that I should be using constants
from fmgroids.h though.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there any plan to add unsigned integer types?

2011-09-26 Thread Merlin Moncure
On Mon, Sep 26, 2011 at 5:41 AM, crocket crockabisc...@gmail.com wrote:
 MySQL already has unsigned INT type, and it has double the range of
 signed INT type.
 It's not just the bigger range that UINT type brings.
 If unsigned INT type exists, I wouldn't have to execute create domain
 UINT in every database.

 If INT unsigned and SERIAL unsigned exist, PostgreSQL would bring
 more convenience to users.

This comes up now and then.  The problem is the benefit gained is not
really worth the pain.  In today's 64 bit world, choosing a 64 bit int
nails the cases where you need the extra range and you have the
ability to use constraints (if necessary, through a domain) to enforce
correctness.

On the downside, you have to add a lot of casts, overloads, etc.
Figuring out the casting rules is non trivial and could lead to
surprising behaviors...inferring the type of 'unknown' strings is bad
enough as it is.

TBH, what I'd greatly prefer to see is to have domains be finished up
so that you don't have to carefully consider their use (for example,
you can't make arrays out of them).  Then an unsigned int could simply
be:

create domain uint as bigint check (value = 0);

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On sön, 2011-09-25 at 23:49 -0400, Robert Haas wrote:
 In fact, I've been wondering if we ought to go a step further and not
 recurse into the sepgsql directory for *any* of the targets.  Then we
 could get rid of the associated configure option, which no longer
 serves any other purpose, and just say that if you want to build (or
 regression-test) sepgsql, well, you gotta go to that directory and do
 it by hand.

 I'm not in favor of that.  It's nice to have a uniform interface that
 decides what to build.

Well, the right fix is to fix the sepgsql regression tests so that they
adhere to the uniform model of being something you can run without
destructive modifications to the host system.  What's being discussed at
the moment is the least messy stopgap we can have until the tests are
fixed.  I think that just taking sepgsql out of the subdirectory list
(except for make clean) might well be the most attractive workaround.

Another possibility is to remove the Makefile's knowledge of how to run
the tests, and change chkselinuxenv into something that both verifies
the environment and then launches the tests.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-26 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On fre, 2011-09-23 at 20:35 +0300, Marcin Mańk wrote:
 One idea:
 col like 'foo%' could be translated to col = 'foo' and col = foo || 'zzz' 
 , where 'z' is the largest possible character. This should be good enough  
 for calculating stats.
 How to find such a character, i do not know.

 That's what makes this so difficult.

 If we knew the largest character, we could probably also find the
 largest-1, largest-2, etc. characters and determine the total order of
 everything.

No, it's a hundred times worse than that, because in collations other
than C there typically *is* no total order.  The collation behavior of
many characters is context-sensitive, thanks to the multi-pass behavior
of typical dictionary algorithms.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On sön, 2011-09-25 at 23:49 -0400, Robert Haas wrote:
 In fact, I've been wondering if we ought to go a step further and not
 recurse into the sepgsql directory for *any* of the targets.  Then we
 could get rid of the associated configure option, which no longer
 serves any other purpose, and just say that if you want to build (or
 regression-test) sepgsql, well, you gotta go to that directory and do
 it by hand.

 I'm not in favor of that.  It's nice to have a uniform interface that
 decides what to build.

 Well, the right fix is to fix the sepgsql regression tests so that they
 adhere to the uniform model of being something you can run without
 destructive modifications to the host system.  What's being discussed at
 the moment is the least messy stopgap we can have until the tests are
 fixed.  I think that just taking sepgsql out of the subdirectory list
 (except for make clean) might well be the most attractive workaround.

 Another possibility is to remove the Makefile's knowledge of how to run
 the tests, and change chkselinuxenv into something that both verifies
 the environment and then launches the tests.

That's not a bad fix, either.

I have my doubts about the theory that we'll ever be able to make
these regression tests work without some kind of support within the
system security policy.  The whole point of MAC, for better or for
worse, is to make every decision to allow access made anywhere in the
system subject to veto by the system security policy.  I'd certainly
be happy to find out that there's a way to make it work the way you're
hoping, but I'm not expecting it.  Now maybe you'll say that we should
then remove the regression tests altogether, but I don't think that
having no regression tests is better than having regression tests that
are a pain-in-the-tail to run and most people won't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there any plan to add unsigned integer types?

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 10:02 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Sep 26, 2011 at 5:41 AM, crocket crockabisc...@gmail.com wrote:
 MySQL already has unsigned INT type, and it has double the range of
 signed INT type.
 It's not just the bigger range that UINT type brings.
 If unsigned INT type exists, I wouldn't have to execute create domain
 UINT in every database.

 If INT unsigned and SERIAL unsigned exist, PostgreSQL would bring
 more convenience to users.

 This comes up now and then.  The problem is the benefit gained is not
 really worth the pain.  In today's 64 bit world, choosing a 64 bit int
 nails the cases where you need the extra range and you have the
 ability to use constraints (if necessary, through a domain) to enforce
 correctness.

 On the downside, you have to add a lot of casts, overloads, etc.
 Figuring out the casting rules is non trivial and could lead to
 surprising behaviors...inferring the type of 'unknown' strings is bad
 enough as it is.

 TBH, what I'd greatly prefer to see is to have domains be finished up
 so that you don't have to carefully consider their use (for example,
 you can't make arrays out of them).  Then an unsigned int could simply
 be:

 create domain uint as bigint check (value = 0);

Even if we did that, there might still be cases where people would
want unsigned integers as a means of reducing storage.  4 extra bytes
may not seem like that much, but if you have billions of rows, it adds
up - not just in terms of actual storage space, but also in terms of
disk and memory bandwidth requirements when you want to do anything
with that data.  I have seen some recent data (which is not entirely
conclusive) that suggests that memory bandwidth can be a huge problem
for PostgreSQL performance on large boxes; and I think Greg Smith has
made similar comments in the past (correct me if I'm wrong, Greg).

I think, though, that if we choose to attack that problem in the first
instance by adding support for unsigned integers, we're probably going
to be only nibbling around the edges of the problem.  Reducing
alignment padding and adding block-level compression would benefit a
much larger number of workloads.  Those are not easy projects, but
unfortunately, due to the constraints of our type system, neither is
this.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unaccent contrib

2011-09-26 Thread Oleg Bartunov

On Wed, 21 Sep 2011, Tom Lane wrote:


Euler Taveira de Oliveira eu...@timbira.com writes:

On 21-09-2011 13:28, Daniel VАzquez wrote:

unaccent is compatible with postgresql 8.4 (but not is in their contrib
version distribution)



No, it is not. AFAICS it is necessary to add some backend code that is not in 
8.4.


[ pokes at it ]  Yeah, you are right.  The version of unaccent that is
in our source tree is a filtering dictionary, and therefore cannot
possibly work with backends older than 9.0 (when the filtering
dictionary feature was added).

So I'm wondering where the OP read that it was compatible with 8.4.
Our own documentation about it certainly does not say that.  It's
possible that Oleg and Teodor had some prototype version, different
from what got committed to our tree, that would work in 8.4.


AFAIR, the last version without filtering feature is 
http://www.sigaev.ru/misc/unaccent-0.2.tar.gz


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Upgrading Extenions from 8.4

2011-09-26 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 I'm in the middle of catching-up with pgsql-general and I see several
 confused users about how to upgrade directly from 8.4.  As Tom said, we
 could easily provide upgrade scripts to handle the move, we just
 didn't, so there's some more manual work to do.

 I'm wondering how much work that really is — I had a SQL query that was
 good at preparing the extension--unpackaged--1.0.sql script, so I could
 use that to produce a set of extension--unpackaged-in-8.4--1.0.sql ones.

 Now, should I get to do that, would that be accepted into PostgreSQL
 itself in, say, 9.1.2?  If not, I have to think about how to distribute
 that, and depending on how to, if there's any value doing that for our
 users.

I think it would be worth looking at just how messy they'd be.  We
failed to think about supporting the direct-from-8.4 case at all, but in
hindsight we really should have.

I guess one question that needs to be answered is why stop at 8.4?
I don't think we want to contract to support direct upgrades from every
previous contrib version ... but ...

An important point here is that a lot of existing installations may be
on $version and yet have a set of SQL objects for a particular contrib
module that represent $previous_version.  Given the lack of easy upgrade
methods in the pre-extensions world, it's not too unlikely that such
situations are the majority :-(.  So we need to factor that problem into
both our evaluations of how useful additional upgrade scripts are, and
how we're going to document which one to use.

We could avoid the documentation problem if the update-from-unpackaged
scripts could be designed so that they Just Work with multiple previous
versions of the contrib module.  In at least some cases, such as a new
function that wasn't there before, I think this would be quite easy
to do --- just use CREATE OR REPLACE FUNCTION.  Many of the contrib
modules haven't really changed much recently, so we shouldn't dismiss
this as impractical without taking a look.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-26 Thread Peter Eisentraut
On mån, 2011-09-26 at 10:08 -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  On fre, 2011-09-23 at 20:35 +0300, Marcin Mańk wrote:
  One idea:
  col like 'foo%' could be translated to col = 'foo' and col = foo || 
  'zzz' , where 'z' is the largest possible character. This should be good 
  enough  for calculating stats.
  How to find such a character, i do not know.
 
  That's what makes this so difficult.
 
  If we knew the largest character, we could probably also find the
  largest-1, largest-2, etc. characters and determine the total order of
  everything.
 
 No, it's a hundred times worse than that, because in collations other
 than C there typically *is* no total order.  The collation behavior of
 many characters is context-sensitive, thanks to the multi-pass behavior
 of typical dictionary algorithms.

Well, there is a total order of all strings, but it's not consistent
under string concatenation.

But there is a largest character.  If the collation implementation
uses four weights (the typical case), the largest character is the one
that maps to    .  If you appended that
character to a string, you would get a larger string.  (Unless there are
French backwards levels or other funny things in place, perhaps.)  But
we don't know which character that is, and likely there isn't one, so
we'd need to largest character that maps to an actually assigned weight,
and that's not possible without exhaustive search of all collating
elements.

We could possibly try to make this whole thing work differently by
storing the strxfrm results in the histograms.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 9:51 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 No, you're missing my point completely.  If we use a flexible options
 syntax here, then we have to decide on what behavior CREATE OR REPLACE
 should have for all future options, without knowing what they are yet,
 or what behavior will be appropriate.

 Hmm. Indeed, it seems to me fair enough reason.

 In this syntax case, the only way to clear the security_barrier flag
 is to drop view
 once, then create a view, isn't it?

I was imagining we'd have ALTER VIEW .. [NO] SECURITY or something like that.

 And, is the security_barrier flag still stored within reloptions field?

No.  That would be missing the point.

But keep in mind no one else has endorsed my reasoning on this one as yet...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tom Lane
David E. Wheeler da...@kineticode.com 
cajw2+qdyg1+xlahdqnjs3ackmcsvcdkv_lcapwutwmxl9dz...@mail.gmail.com writes:
 On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
 I'm thinking about only COPY FROM for reads, but if someone wants to add
 BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

 I think it would have to be optional, since some recipients of UTF-8 encoded 
 data do not expect a BOM.

Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
that Microsloth does it does not make it standards-conformant.

I think that accepting it on input can be sensible, on the principle of
be liberal in what you accept, but the other side of that is be
conservative in what you send.  No BOMs in output, please.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tatsuo Ishii
 I'd like to support UTF-8 text or csv files that has BOM (byte order mark)
 in COPY FROM command. BOM will be automatically detected and ignored
 if the file encoding is UTF-8. WIP patch attached.

From RFC3629(http://tools.ietf.org/html/rfc3629#section-6):

 o A protocol SHOULD forbid use of U+FEFF as a signature for those
   textual protocol elements that the protocol mandates to be always
   UTF-8, the signature function being totally useless in those cases.

COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
I think we should not regard U+FEFF as BOM in COPY, rather we should
regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there any plan to add unsigned integer types?

2011-09-26 Thread Merlin Moncure
On Mon, Sep 26, 2011 at 9:21 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Sep 26, 2011 at 10:02 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Sep 26, 2011 at 5:41 AM, crocket crockabisc...@gmail.com wrote:
 MySQL already has unsigned INT type, and it has double the range of
 signed INT type.
 It's not just the bigger range that UINT type brings.
 If unsigned INT type exists, I wouldn't have to execute create domain
 UINT in every database.

 If INT unsigned and SERIAL unsigned exist, PostgreSQL would bring
 more convenience to users.

 This comes up now and then.  The problem is the benefit gained is not
 really worth the pain.  In today's 64 bit world, choosing a 64 bit int
 nails the cases where you need the extra range and you have the
 ability to use constraints (if necessary, through a domain) to enforce
 correctness.

 On the downside, you have to add a lot of casts, overloads, etc.
 Figuring out the casting rules is non trivial and could lead to
 surprising behaviors...inferring the type of 'unknown' strings is bad
 enough as it is.

 TBH, what I'd greatly prefer to see is to have domains be finished up
 so that you don't have to carefully consider their use (for example,
 you can't make arrays out of them).  Then an unsigned int could simply
 be:

 create domain uint as bigint check (value = 0);

 Even if we did that, there might still be cases where people would
 want unsigned integers as a means of reducing storage.  4 extra bytes
 may not seem like that much, but if you have billions of rows, it adds
 up - not just in terms of actual storage space, but also in terms of
 disk and memory bandwidth requirements when you want to do anything
 with that data.  I have seen some recent data (which is not entirely
 conclusive) that suggests that memory bandwidth can be a huge problem
 for PostgreSQL performance on large boxes; and I think Greg Smith has
 made similar comments in the past (correct me if I'm wrong, Greg).

 I think, though, that if we choose to attack that problem in the first
 instance by adding support for unsigned integers, we're probably going
 to be only nibbling around the edges of the problem.  Reducing
 alignment padding and adding block-level compression would benefit a
 much larger number of workloads.  Those are not easy projects, but
 unfortunately, due to the constraints of our type system, neither is
 this.  :-(

right -- exactly.  most 'savings' in this vein are nothing but due to
padding and other factors such that (at least today) there is no
disadvantage to going to 64 bit in range constrained cases.  also, I'd
submit history has been unkind to hardware dependent optimization
strategies in userland -- the engine should be dealing with this
problem.  better to define your data the proper way and let the
assembly instruction counting gurus in -hackers worry about it :-).

compression is an interesting topic: the guys over at tokudb are
making some wild claims...i'm curious if they are real, and what the
real tradeoffs are.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another possibility is to remove the Makefile's knowledge of how to run
 the tests, and change chkselinuxenv into something that both verifies
 the environment and then launches the tests.

 That's not a bad fix, either.

 I have my doubts about the theory that we'll ever be able to make
 these regression tests work without some kind of support within the
 system security policy.  The whole point of MAC, for better or for
 worse, is to make every decision to allow access made anywhere in the
 system subject to veto by the system security policy.  I'd certainly
 be happy to find out that there's a way to make it work the way you're
 hoping, but I'm not expecting it.  Now maybe you'll say that we should
 then remove the regression tests altogether, but I don't think that
 having no regression tests is better than having regression tests that
 are a pain-in-the-tail to run and most people won't.

The main point I'm on about here is that make check must not require
root privileges.  That is absolutely not negotiable (even if it were
sane from a security standpoint, which is ridiculous anyway).  I don't
think make installcheck should require root either, although there
might possibly be a little more wiggle room there.  If it's infeasible
to test sepgsql usefully without root involvement, then it can't be
tested within the existing regression test framework.  So maybe just
pushing the issue out to a separate shell script that you can choose
to invoke by hand is a reasonable compromise.

I think it should be possible to still use all the existing testing
infrastructure if the check/test script does something like
make REGRESS=label dml misc check

BTW, I think this line of argument also casts serious doubt on whether
REGRESS_PREP is a useful concept at all.  I'm more than half tempted to
revert the patches that added that to the regression test
infrastructure.  Do we still need the --launcher option, either?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tatsuo Ishii
 David E. Wheeler da...@kineticode.com 
 cajw2+qdyg1+xlahdqnjs3ackmcsvcdkv_lcapwutwmxl9dz...@mail.gmail.com writes:
 On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
 I'm thinking about only COPY FROM for reads, but if someone wants to add
 BOM in COPY TO, we might also support COPY TO WITH BOM for writes.
 
 I think it would have to be optional, since some recipients of UTF-8 
 encoded data do not expect a BOM.
 
 Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
 that Microsloth does it does not make it standards-conformant.
 
 I think that accepting it on input can be sensible, on the principle of
 be liberal in what you accept, but the other side of that is be
 conservative in what you send.  No BOMs in output, please.

Suppose a user uses brain-dead editor, which does not accept UTF-8
without BOM.  He decides to save his editor data into PostgreSQL using
COPY FROM. He extracts the data using COPY TO. Now he finds that his
stupid editor does not accept his data any more.

So I think if we decide to accept UTF-8 with BOM, we should keep BOM
when importing the data and output the data with BOM. If we don't want
to output UTF-8 with BOM, we should not accept UTF-8 with BOM. It
seems we don't have much choice...
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Kohei KaiGai
2011/9/26 Tom Lane t...@sss.pgh.pa.us:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another possibility is to remove the Makefile's knowledge of how to run
 the tests, and change chkselinuxenv into something that both verifies
 the environment and then launches the tests.

 That's not a bad fix, either.

 I have my doubts about the theory that we'll ever be able to make
 these regression tests work without some kind of support within the
 system security policy.  The whole point of MAC, for better or for
 worse, is to make every decision to allow access made anywhere in the
 system subject to veto by the system security policy.  I'd certainly
 be happy to find out that there's a way to make it work the way you're
 hoping, but I'm not expecting it.  Now maybe you'll say that we should
 then remove the regression tests altogether, but I don't think that
 having no regression tests is better than having regression tests that
 are a pain-in-the-tail to run and most people won't.

 The main point I'm on about here is that make check must not require
 root privileges.  That is absolutely not negotiable (even if it were
 sane from a security standpoint, which is ridiculous anyway).  I don't
 think make installcheck should require root either, although there
 might possibly be a little more wiggle room there.  If it's infeasible
 to test sepgsql usefully without root involvement, then it can't be
 tested within the existing regression test framework.  So maybe just
 pushing the issue out to a separate shell script that you can choose
 to invoke by hand is a reasonable compromise.

If so, is it an option that contrib/sepgsql/Makefile implement its own
regression test scheme? Even if it requires root/unconfined privilege
to set up test server automatically, it is harmless as long as these
are not launched with regular make check/installchek.

 BTW, I think this line of argument also casts serious doubt on whether
 REGRESS_PREP is a useful concept at all.  I'm more than half tempted to
 revert the patches that added that to the regression test
 infrastructure.  Do we still need the --launcher option, either?

If contrib/sepgsql/Makefile implements its own tests including environment
checks, I think REGRESS_PREP is fungible. But --launcher option is necessary
to implement test schemes on the pg_regress.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is there any plan to add unsigned integer types?

2011-09-26 Thread Leonardo Francalanci
 

 compression is an interesting topic: the guys over at tokudb are
 making some wild claims...i'm curious if they are real, and what the
 real tradeoffs are.


I don't know how much of the performance they claim comes from
compression and how much from the different indexing technique they
use (see the my post here, where nobody answered...

http://archives.postgresql.org/pgsql-general/2011-09/msg00615.php


)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-26 Thread Tom Lane
Kerem Kat kerem...@gmail.com writes:
 In the parser while analyzing SetOperationStmt, larg and rarg needs to be
 transformed as subqueries. SetOperationStmt can have two fields representing
 larg and rarg with projected columns according to corresponding:
 larg_corresponding,
 rarg_corresponding.

Why?  CORRESPONDING at a given set-operation level doesn't affect either
sub-query, so I don't see why you'd need a different representation for
the sub-queries.

 Obviously, that logic doesn't work at all for CORRESPONDING, so you'll
 need to have a separate code path to deduce the output column list in
 that case.

 If the output column list to be determined at that stage it needs to
 be filtered and ordered.
 In that case aren't we breaking the non-modification of user query argument?

No.  All that you're doing is correctly computing the lists of the
set-operation's output column types (and probably names too).  These are
internal details that needn't be examined when printing the query, so
they won't affect ruleutils.c.

 note: I am new to this list, am I asking too much detail?

Well, I am beginning to wonder if you should choose a smaller project
for your first venture into patching Postgres.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use new oom_score_adj without a new compile-time constant

2011-09-26 Thread Dan McGee
On Fri, Sep 23, 2011 at 2:49 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Sep 19, 2011 at 4:36 PM, Dan McGee d...@archlinux.org wrote:
 [ patch ]

 I suppose it's Tom who really needs to comment on this, but I'm not
 too enthusiastic about this approach.  Duplicating the Linux kernel
 calculation into our code means that we could drift if the formula
 changes again.
Why would the formula ever change? This seems like a different excuse
way of really saying you don't appreciate the hacky approach, which I
can understand completely. However, it simply doesn't make sense for
them to change this formula, as it scales the -17 to 16 old range to
the new -1000 to 1000 range. Those endpoints won't be changing unless
a third method is introduced, in which case this whole thing is mute
and we'd need to fix it yet again.

 I like Tom's previous suggestion (I think) of allowing both constants
 to be defined - if they are, then we try oom_score_adj first and fall
 back to oom_adj if that fails.  For bonus points, we could have
 postmaster stat() its own oom_score_adj file before forking and set a
 global variable to indicate the results.  That way we'd only ever need
 to test once per postmaster startup (at least until someone figures
 out a way to swap out the running kernel without stopping the
 server...!).
This would be fine, it just seems unreasonably complicated, not to
mention unnecessary. I might as well point this out [1]. I don't think
a soul out there has built without defining this to 0 (if they define
it at all), and not even that many people are using it. Is it all that
bad of an idea to just force it to 0 for both knobs and be done with
it?

-Dan McGee

[1] http://www.google.com/codesearch#search/q=LINUX_OOM_ADJ=type=cs
- Slackware and Fedora are the only hits not in the PG codebase, and
both define it to 0.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] random isolation test failures

2011-09-26 Thread Andrew Dunstan


We are seeing numerous occasional buildfarm failures of the fk-deadlock2 
isolation test, that look like this:


   ***
   *** 32,39 
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;waiting ...
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
   - step s1u2:... completed
  ERROR:  deadlock detected
  step s2c: COMMIT;
  step s1c: COMMIT;

   --- 32,39 
  step s2u1: UPDATE B SET Col2 = 1 WHERE BID = 2;
  step s1u2: UPDATE B SET Col2 = 1 WHERE BID = 2;waiting ...
  step s2u2: UPDATE B SET Col2 = 1 WHERE BID = 2;
  ERROR:  deadlock detected
   + step s1u2:... completed
  step s2c: COMMIT;
  step s1c: COMMIT;


If this is harmless, we could provide an alternative results file as a 
simple fix. If it's not harmless, it should be fixed.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-26 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On mån, 2011-09-26 at 10:08 -0400, Tom Lane wrote:
 No, it's a hundred times worse than that, because in collations other
 than C there typically *is* no total order.  The collation behavior of
 many characters is context-sensitive, thanks to the multi-pass behavior
 of typical dictionary algorithms.

 Well, there is a total order of all strings, but it's not consistent
 under string concatenation.

 But there is a largest character.  If the collation implementation
 uses four weights (the typical case), the largest character is the one
 that maps to    .  If you appended that
 character to a string, you would get a larger string.  (Unless there are
 French backwards levels or other funny things in place, perhaps.)

But the problem is not make a string greater than this given one.
It is make a string greater than any string that begins with this
given one.  As an example, suppose we are given xyz where z is
the last letter in the collation.  We can probably find characters
such as ~ that are greater than z, but no string x-y-nonletter
is going to be considered greater than x-y-z-z by a dictionary
sorting method.  This is why make_greater_string has to be prepared
to give up and go increment some character before the last one:
the only way to succeed for this input is to construct xz.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] random isolation test failures

2011-09-26 Thread Kevin Grittner
Andrew Dunstan and...@dunslane.net wrote:
 
 We are seeing numerous occasional buildfarm failures of the
 fk-deadlock2 isolation test
 
 If this is harmless, we could provide an alternative results file
 as a simple fix. If it's not harmless, it should be fixed.
 
I agree, but don't look at me.  I'm not the one who added the tests,
nor are they related to serializable snapshot isolation.  Tom
recently raised the same issue on this thread:
 
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00991.php
 
Alvaro?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii is...@postgresql.org wrote:
 David E. Wheeler da...@kineticode.com 
 cajw2+qdyg1+xlahdqnjs3ackmcsvcdkv_lcapwutwmxl9dz...@mail.gmail.com writes:
 On Sep 25, 2011, at 9:58 PM, Itagaki Takahiro wrote:
 I'm thinking about only COPY FROM for reads, but if someone wants to add
 BOM in COPY TO, we might also support COPY TO WITH BOM for writes.

 I think it would have to be optional, since some recipients of UTF-8 
 encoded data do not expect a BOM.

 Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
 that Microsloth does it does not make it standards-conformant.

 I think that accepting it on input can be sensible, on the principle of
 be liberal in what you accept, but the other side of that is be
 conservative in what you send.  No BOMs in output, please.

 Suppose a user uses brain-dead editor, which does not accept UTF-8
 without BOM.  He decides to save his editor data into PostgreSQL using
 COPY FROM. He extracts the data using COPY TO. Now he finds that his
 stupid editor does not accept his data any more.

 So I think if we decide to accept UTF-8 with BOM, we should keep BOM
 when importing the data and output the data with BOM. If we don't want
 to output UTF-8 with BOM, we should not accept UTF-8 with BOM. It
 seems we don't have much choice...

Maybe this needs to be an optional behavior, controlled by some COPY option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 11:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another possibility is to remove the Makefile's knowledge of how to run
 the tests, and change chkselinuxenv into something that both verifies
 the environment and then launches the tests.

 That's not a bad fix, either.

 I have my doubts about the theory that we'll ever be able to make
 these regression tests work without some kind of support within the
 system security policy.  The whole point of MAC, for better or for
 worse, is to make every decision to allow access made anywhere in the
 system subject to veto by the system security policy.  I'd certainly
 be happy to find out that there's a way to make it work the way you're
 hoping, but I'm not expecting it.  Now maybe you'll say that we should
 then remove the regression tests altogether, but I don't think that
 having no regression tests is better than having regression tests that
 are a pain-in-the-tail to run and most people won't.

 The main point I'm on about here is that make check must not require
 root privileges.  That is absolutely not negotiable (even if it were
 sane from a security standpoint, which is ridiculous anyway).  I don't
 think make installcheck should require root either, although there
 might possibly be a little more wiggle room there.  If it's infeasible
 to test sepgsql usefully without root involvement, then it can't be
 tested within the existing regression test framework.  So maybe just
 pushing the issue out to a separate shell script that you can choose
 to invoke by hand is a reasonable compromise.

 I think it should be possible to still use all the existing testing
 infrastructure if the check/test script does something like
        make REGRESS=label dml misc check

 BTW, I think this line of argument also casts serious doubt on whether
 REGRESS_PREP is a useful concept at all.  I'm more than half tempted to
 revert the patches that added that to the regression test
 infrastructure.  Do we still need the --launcher option, either?

I want to be able to run these tests, but I'm not fussy about the
exact mechanism.  If you want to whack it around so that I type
./funky_sepgsql_regression_crap instead of make installcheck,
that's fine with me.  And if that means you can rip out some
supporting infrastructure, that's fine too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Multixact truncation for FK locks patch

2011-09-26 Thread Alvaro Herrera

I've been continuing work on modifying the system to let foreign keys
coexist concurrently with updates that do not touch the key columns.
I've made a lot of progress and things seem to be working rather well.
However, I just struck an obstacle that seemed problematic: handling the
truncation of the MultiXactId space when they are no longer needed.

I hadn't stopped to think much about this, regarding it as a trivial
problem to change multixact.c to be just like clog.c.  However, when it
came to actually doing it, I immediately realized that this cannot work,
because they don't share a common numeric space -- the mxact counter can
be anywhere, unrelated to the Xid counter.  There's no way to figure out
the mutixact truncation point from purely Xid data.

In search of solutions to this problem, two things came to mind:

1. Track MultiXact offset generation just as we track Xid generation.
This means that after vacuum we immediately know where to truncate.

2. Make them share a common numeric space.

Both those two solutions come at a very high cost.  #1 means that we
need some sort of frozenmxact column in pg_class and pg_database.  So
we'd know easily and precisely where to truncate mxact; but we would
bloat those catalogs for something that's not as important.  (We eat the
cost of maintaining relfrozenxid and datfrozenxid because it's necessary
for the system to work at all; but in the mxact case, we're talking
about something that's merely a concurrency optimization).

In the case of #2, we avoid having to add those columns, by having
multixact offsets be assigned by GetNewTransactionId; thus, we can
easily know where to truncate just by truncating at the same spot that
we truncate pg_clog.  The problem with this idea is that there would be
huge areas of pg_multixact/offset that are completely unused, because
they would correspond to the values assigned to Xids themselves.  This
would lead to bloat of multixact.  It would also lead to shortening the
useful Xid space, thus leading to shorter times to freeze vacuum.  This
is, of course, completely unacceptable.

So I had to look for something else -- and I think I have it: have
multixact itself track its truncation position relative to Xid.  Each
pg_multixact/offset segment would store ReadNewTransactionId at the time
it is created.  Whenever vacuum attempts to run pg_clog truncation, it
would also pass that Xid to multixact truncation; this would scan
existing segments and delete those that come before the one marked with
the maximum Xid previous to the pg_clog truncation point.

Doing this will require hacking the SLRU truncation logic a bit, so that
it's possible to have it scan segments with a callback in some fashion.

Thoughts?

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] random isolation test failures

2011-09-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 We are seeing numerous occasional buildfarm failures of the fk-deadlock2 
 isolation test,

Yeah, I complained about this already, but Kevin disclaims all
responsibility for the fk isolation tests.  It looks like Alvaro
and Noah Misch are the people to be harassing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii is...@postgresql.org wrote:
 Suppose a user uses brain-dead editor, which does not accept UTF-8
 without BOM.

 Maybe this needs to be an optional behavior, controlled by some COPY option.

 I'm not excited about emitting non-standards-conformant output on the
 strength of a hypothetical argument about users and editors that may or
 may not exist.  I believe that there's a use-case for reading BOMs, but
 I have seen no field complaints demonstrating that we need to write
 them.  Even if we had a couple, use a less brain dead editor might be
 the best response.  We cannot promise to be compatible with arbitrarily
 broken software.

The thing that makes me doubt that is this comment from Tatsuo Ishii:

TI COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
TI I think we should not regard U+FEFF as BOM in COPY, rather we should
TI regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

If a BOM is confusable with valid data, then I think recognizing it
and discarding it unconditionally is no good - you could end up where
COPY OUT, TRUNCATE, COPY IN changes the table contents.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of recovery?

2011-09-26 Thread Florian Pflug
On Sep26, 2011, at 11:50 , Fujii Masao wrote:
 Currently, if a reference to an invalid page is found during recovery,
 its information
 is saved in hash table invalid_page_tab. Then, if such a reference
 is resolved,
 its information is removed from the hash table. If there is unresolved
 reference to
 an invalid page in the hash table at the end of recovery, PANIC error occurs.
 
 What I'm worried about is that the hash table is volatile. If a user restarts
 the server before reaching end of recovery, any information in the
 hash table is lost,
 and we wrongly miss the PANIC error case because we cannot find any unresolved
 reference. That is, even if database is corrupted at the end of recovery,
 a user might not be able to notice that. This looks like a serious problem. 
 No?
 
 To prevent the above problem, we should write the contents of the hash table 
 to
 the disk for every restartpoints, I think. Then, when the server
 starts recovery,
 it should reload the hash table from the disk. Thought? Am I missing 
 something?

Shouldn't references to invalid pages only occur before we reach a consistent
state? If so, the right fix would be to check whether all invalid page 
references
have been resolved after we've reached a consistent state, and to skip creating
restart points while there're unresolved page references.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 11:09 AM, Tatsuo Ishii is...@postgresql.org wrote:
 Suppose a user uses brain-dead editor, which does not accept UTF-8
 without BOM.

 Maybe this needs to be an optional behavior, controlled by some COPY option.

I'm not excited about emitting non-standards-conformant output on the
strength of a hypothetical argument about users and editors that may or
may not exist.  I believe that there's a use-case for reading BOMs, but
I have seen no field complaints demonstrating that we need to write
them.  Even if we had a couple, use a less brain dead editor might be
the best response.  We cannot promise to be compatible with arbitrarily
broken software.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 The thing that makes me doubt that is this comment from Tatsuo Ishii:
 TI COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
 TI I think we should not regard U+FEFF as BOM in COPY, rather we should
 TI regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

Yeah, that's a reasonable argument for rejecting the patch altogether.
I'm not qualified to decide whether it outweighs the we need to be able
to read Notepad output argument.  I do observe that
http://en.wikipedia.org/wiki/Byte_order_mark
says Unicode 3.2 has deprecated the no-break-space interpretation,
but on the other hand you're right that we can't really assume that
the character is not present in people's data.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The thing that makes me doubt that is this comment from Tatsuo Ishii:
 TI COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
 TI I think we should not regard U+FEFF as BOM in COPY, rather we should
 TI regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

 Yeah, that's a reasonable argument for rejecting the patch altogether.

Yeah, or for making the behavior optional.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The thing that makes me doubt that is this comment from Tatsuo Ishii:
 TI COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
 TI I think we should not regard U+FEFF as BOM in COPY, rather we should
 TI regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

 Yeah, that's a reasonable argument for rejecting the patch altogether.

 Yeah, or for making the behavior optional.

Sorry, I should have been clearer: it's an argument for rejecting *this*
patch.  A patch that introduced a BOM option for COPY (which logically
could apply just as well to input or output) would be a different patch.

BTW, another issue with the patch-as-proposed is that it assumes,
without even checking, that fseek() will work (for that matter, it would
also fail pretty miserably on a file shorter than 3 bytes).  We could
dodge that problem with an option since it would be reasonable to define
the option as meaning that there MUST be a BOM there.  I would envision
it as acting much like the CSV HEADER option.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Peter Eisentraut
On tis, 2011-09-27 at 00:09 +0900, Tatsuo Ishii wrote:
 Suppose a user uses brain-dead editor, which does not accept UTF-8
 without BOM.

I would first like to see evidence that such an editor exists.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 5:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I think it is.  If you create a view that involves an RTE, the node
 tree is going to get stored in pg_rewrite.ev_action.  And it's going
 to include the security_barrier attribute, because you added outfuncs
 support for it...

 No?

 IIUC, nested views are also expanded when user's query gets rewritten.
 Thus, rte-security_barrier shall be set based on the latest configuration
 of the view.
 I injected an elog(NOTICE, ...) to confirm the behavior, when security_barrier
 flag was set on rte-security_barrier at ApplyRetrieveRule().

Hmm, OK.  I am still not convinced that this is the right approach.
Normally, we don't cache anything in the RangeTblEntry that might
change between plan time and execution time.  Those things are
normally stored in the RelOptInfo - why not do the same here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Peter Eisentraut
On mån, 2011-09-26 at 13:19 -0400, Robert Haas wrote:
 The thing that makes me doubt that is this comment from Tatsuo Ishii:
 
 TI COPY explicitly specifies the encoding (to be UTF-8 in this case).
 So
 TI I think we should not regard U+FEFF as BOM in COPY, rather we
 should
 TI regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.
 
 If a BOM is confusable with valid data, then I think recognizing it
 and discarding it unconditionally is no good - you could end up where
 COPY OUT, TRUNCATE, COPY IN changes the table contents.

We did recently accept a patch for psql -f to skip over a UTF-8
byte-order mark.  We had a lot of this same discussion there.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Robert Haas
On Mon, Sep 26, 2011 at 2:38 PM, Peter Eisentraut pete...@gmx.net wrote:
 On mån, 2011-09-26 at 13:19 -0400, Robert Haas wrote:
 The thing that makes me doubt that is this comment from Tatsuo Ishii:

 TI COPY explicitly specifies the encoding (to be UTF-8 in this case).
 So
 TI I think we should not regard U+FEFF as BOM in COPY, rather we
 should
 TI regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

 If a BOM is confusable with valid data, then I think recognizing it
 and discarding it unconditionally is no good - you could end up where
 COPY OUT, TRUNCATE, COPY IN changes the table contents.

 We did recently accept a patch for psql -f to skip over a UTF-8
 byte-order mark.  We had a lot of this same discussion there.

But that case is different, because zero-width, non-breaking space has
no particular meaning in an SQL script - it's either going to be
ignored as a BOM, ignored as whitespace, or an error.  But inside a
file being subjected to COPY it might be confusable with data that the
user wanted to end up in some table.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Andrew Dunstan



On 09/26/2011 02:38 PM, Peter Eisentraut wrote:

On mån, 2011-09-26 at 13:19 -0400, Robert Haas wrote:

The thing that makes me doubt that is this comment from Tatsuo Ishii:

TI  COPY explicitly specifies the encoding (to be UTF-8 in this case).
So
TI  I think we should not regard U+FEFF as BOM in COPY, rather we
should
TI  regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

If a BOM is confusable with valid data, then I think recognizing it
and discarding it unconditionally is no good - you could end up where
COPY OUT, TRUNCATE, COPY IN changes the table contents.

We did recently accept a patch for psql -f to skip over a UTF-8
byte-order mark.  We had a lot of this same discussion there.




Yes, but wasn't part of the rationale that this was safe because a 
leading BOM could not possibly be mistaken for anything else legitimate 
in an SQL source file? That's quite different from a data file. ISTM.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Peter Eisentraut
On mån, 2011-09-26 at 14:44 -0400, Robert Haas wrote:
  We did recently accept a patch for psql -f to skip over a UTF-8
  byte-order mark.  We had a lot of this same discussion there.
 
 But that case is different, because zero-width, non-breaking space has
 no particular meaning in an SQL script - it's either going to be
 ignored as a BOM, ignored as whitespace, or an error.  But inside a
 file being subjected to COPY it might be confusable with data that the
 user wanted to end up in some table.

Yes, my point was more directed toward the discussion about whether BOM
in UTF-8 are valid at all.  But your point pretty much kills this
altogether.  If I store a BOM in row 1, column 1 of my table, because,
well, maybe it's an XML document or something, then it needs to be able
to survive a copy out and in.  The only way we could proceed with this
would be if we prohibited BOMs in all user-data.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Brar Piening

Tom Lane wrote:

Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
that Microsloth does it does not make it standards-conformant.


Could you share a pointer to the spec?
All I've ever heard is that a BOM is optional for UTF-8 but not forbidden.

The Unicode FAQ (http://unicode.org/faq/utf_bom.html#BOM) states that 
some recipients of UTF-8 encoded data do not expect a BOM.

Postgres obviously belongs to those recipients.
That's why all my psql-scripts transferring data from MSSQL to Postgres 
need a '\! perl -CD -pi.orig -e tr/\x{feff}//d C:/datafile.txt' 
before feeding data into COPY TO.


Reading it tolerantly and writing it on user request is probably the way 
that would help most users.


Regards,

Brar


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Brar Piening

Tom Lane wrote:

Yeah, that's a reasonable argument for rejecting the patch altogether.
I'm not qualified to decide whether it outweighs the we need to be able
to read Notepad output argument.


Actually it's not only notepad.

I quite often find myself doing something like the following when moving 
data from MSSQL to PostgreSQL.


\echo Fetching data for table patient
\! sqlcmd -S DBSERVER -d DATABASE -E -f 65001 -o C:/datafile.txt -h -1 
-W -s | -Q SET NOCOUNT ON; SELECT * FROM my_table;

\! perl -CD -pi.orig -e tr/\x{feff}//d C:/datafile.txt

\echo Importing data into table patient
\copy my_table FROM 'C:/datafile.txt' WITH DELIMITER '|' NULL 'NULL'

Regards,

Brar

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql setenv command

2011-09-26 Thread Josh Kupershmidt
On Thu, Sep 15, 2011 at 7:02 PM, Andrew Dunstan and...@dunslane.net wrote:
 On Thu, September 15, 2011 6:10 pm, Josh Kupershmidt wrote:
 [need way to show current values]
 \! echo $foo

 (which is how I tested the patch, of course)

Ah, right. IMO it'd be helpful to mention that echo example in your
changes to psql-ref.sgml, either as part of your example inside the
programlisting, or as a suggestion with the rest of the text.

BTW, have you tested this on Windows? I don't have a Windows machine
handy to fool with, but I do see what might be a mess/confusion on
that platform. The MSDN docs claim[1] that putenv() is deprecated in
favor of _putenv(). The code in pg_upgrade uses
SetEnvironmentVariableA on WIN32, while win32env.c has a wrapper
function pgwin32_putenv() around _putenv().

On Sat, Sep 24, 2011 at 5:18 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 A description of the \setenv command should show up in the output of \?.

Yeah, Andrew agreed upthread that help.c should be amended as well,
which would fix \?.

 Should there be a regression test for this?  I'm not sure how it would
 work, as I don't see a cross-platform way to see what the variable is
 set to.

Similar recent psql changes haven't had regression tests included, and
I don't see much of a need here either.

--
[1] http://msdn.microsoft.com/en-US/library/ms235321%28v=VS.80%29.aspx

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Brar Piening

Robert Haas wrote:

The thing that makes me doubt that is this comment from Tatsuo Ishii:

TI  COPY explicitly specifies the encoding (to be UTF-8 in this case).  So
TI  I think we should not regard U+FEFF as BOM in COPY, rather we should
TI  regard U+FEFF as ZERO WIDTH NO-BREAK SPACE.

If a BOM is confusable with valid data, then I think recognizing it
and discarding it unconditionally is no good - you could end up where
COPY OUT, TRUNCATE, COPY IN changes the table contents.


Citing from the Unicode FAQ again:

Q: Where is a BOM useful?
A: A BOM is useful at the beginning of files that are typed as text, but 
for which it is not known whether they are in big or little endian 
format—it can also serve as a hint indicating that the file is in 
Unicode, as opposed to in a legacy encoding and furthermore, it act as a 
signature for the specific encoding form used.


I think that the major hint in the answer is beginning of files.

To correctly handle a BOM you need to be sure to be in the context of 
files that have defined bounds (especially a *beginning*) you can't 
properly handle a BOM in arbitrary streams.


Regards,

Brar

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of recovery?

2011-09-26 Thread Simon Riggs
On Mon, Sep 26, 2011 at 10:50 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Currently, if a reference to an invalid page is found during recovery,
 its information
 is saved in hash table invalid_page_tab. Then, if such a reference
 is resolved,
 its information is removed from the hash table. If there is unresolved
 reference to
 an invalid page in the hash table at the end of recovery, PANIC error occurs.

 What I'm worried about is that the hash table is volatile. If a user restarts
 the server before reaching end of recovery, any information in the
 hash table is lost,
 and we wrongly miss the PANIC error case because we cannot find any unresolved
 reference. That is, even if database is corrupted at the end of recovery,
 a user might not be able to notice that. This looks like a serious problem. 
 No?

 To prevent the above problem, we should write the contents of the hash table 
 to
 the disk for every restartpoints, I think. Then, when the server
 starts recovery,
 it should reload the hash table from the disk. Thought? Am I missing 
 something?

That doesn't happen because the when we stop the server it will
restart from a valid restartpoint - one where there is no in-progress
multi-phase operation.

So when it replays it will always replay both parts of the operation.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Robert Haas robertmh...@gmail.com:
 On Mon, Sep 26, 2011 at 5:58 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I think it is.  If you create a view that involves an RTE, the node
 tree is going to get stored in pg_rewrite.ev_action.  And it's going
 to include the security_barrier attribute, because you added outfuncs
 support for it...

 No?

 IIUC, nested views are also expanded when user's query gets rewritten.
 Thus, rte-security_barrier shall be set based on the latest configuration
 of the view.
 I injected an elog(NOTICE, ...) to confirm the behavior, when 
 security_barrier
 flag was set on rte-security_barrier at ApplyRetrieveRule().

 Hmm, OK.  I am still not convinced that this is the right approach.
 Normally, we don't cache anything in the RangeTblEntry that might
 change between plan time and execution time.  Those things are
 normally stored in the RelOptInfo - why not do the same here?

The point is that a sub-query come from a particular view does not
keep the information what view originally stored the sub-query when
it was passed to the executor stage.
PostgreSQL handles a view as just a sub-query after the rewriter stage.

One possible idea not to store the flag in RangeTblEntry is to utilize
rte-relid to show the relation-id of the source view, when rtekind is
RTE_SUBQUERY; that enables to pull the security_barrier flag in
executor stage.
However, the interface to reference reloptions are designed to pull
this information with Relation pointer, rather than lsyscache, so
I implemented this revision with a new rte-security_barrier member.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Kohei KaiGai
How about this fix on regression test of sepgsql?

It disables to launch regression test together with other modules,
and adds its own build target sepgsql-installcheck that launches
chkselinuxenv script then pg_regress command as currently we
are doing.

It allows users to launch regression test by hand, in same time,
it also enables to build all the contrib modules on the rpm build
environment without selinux ready.

2011/9/26 Robert Haas robertmh...@gmail.com:
 On Mon, Sep 26, 2011 at 11:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Sep 26, 2011 at 10:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another possibility is to remove the Makefile's knowledge of how to run
 the tests, and change chkselinuxenv into something that both verifies
 the environment and then launches the tests.

 That's not a bad fix, either.

 I have my doubts about the theory that we'll ever be able to make
 these regression tests work without some kind of support within the
 system security policy.  The whole point of MAC, for better or for
 worse, is to make every decision to allow access made anywhere in the
 system subject to veto by the system security policy.  I'd certainly
 be happy to find out that there's a way to make it work the way you're
 hoping, but I'm not expecting it.  Now maybe you'll say that we should
 then remove the regression tests altogether, but I don't think that
 having no regression tests is better than having regression tests that
 are a pain-in-the-tail to run and most people won't.

 The main point I'm on about here is that make check must not require
 root privileges.  That is absolutely not negotiable (even if it were
 sane from a security standpoint, which is ridiculous anyway).  I don't
 think make installcheck should require root either, although there
 might possibly be a little more wiggle room there.  If it's infeasible
 to test sepgsql usefully without root involvement, then it can't be
 tested within the existing regression test framework.  So maybe just
 pushing the issue out to a separate shell script that you can choose
 to invoke by hand is a reasonable compromise.

 I think it should be possible to still use all the existing testing
 infrastructure if the check/test script does something like
        make REGRESS=label dml misc check

 BTW, I think this line of argument also casts serious doubt on whether
 REGRESS_PREP is a useful concept at all.  I'm more than half tempted to
 revert the patches that added that to the regression test
 infrastructure.  Do we still need the --launcher option, either?

 I want to be able to run these tests, but I'm not fussy about the
 exact mechanism.  If you want to whack it around so that I type
 ./funky_sepgsql_regression_crap instead of make installcheck,
 that's fine with me.  And if that means you can rip out some
 supporting infrastructure, that's fine too.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
KaiGai Kohei kai...@kaigai.gr.jp


sepgsql-fix-regtest.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated

2011-09-26 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/9/23 Robert Haas robertmh...@gmail.com:
 On Fri, Sep 23, 2011 at 12:26 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I fixed crash that described Tom. Do you know about other?

 No, I just don't see a new version of the patch.

 sorry - my mistake - I sent it only to Tom

Applied with corrections --- mostly, that you didn't think through the
domain-over-array case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM

2011-09-26 Thread Tom Lane
Brar Piening b...@gmx.de writes:
 Citing from the Unicode FAQ again:

 Q: Where is a BOM useful?
 A: A BOM is useful at the beginning of files that are typed as text, but 
 for which it is not known whether they are in big or little endian 
 format—it can also serve as a hint indicating that the file is in 
 Unicode, as opposed to in a legacy encoding and furthermore, it act as a 
 signature for the specific encoding form used.

Note that the reference to byte order betrays the implicit context
assumption: that we're talking about UTF16 or UTF32 representation.
A BOM in UTF8 data is useless for its intended purpose of disambiguating
byte order.  It could possibly be useful for telling UTF8 data apart
from non-UTF8 data, except for the inconvenient fact that that byte
sequence is not invalid data in non-UTF8 encodings.

BOM is useless in UTF8, no matter what Microsoft thinks.  Any tool that
relies on it to detect UTF8 data has to have a workaround for overriding
that detection, or it's broken to the point of uselessness.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 However, the interface to reference reloptions are designed to pull
 this information with Relation pointer, rather than lsyscache, so
 I implemented this revision with a new rte-security_barrier member.

This approach will guarantee that we can never implement an ALTER VIEW
(or CREATE OR REPLACE VIEW) option that changes the state of the flag.
I don't think that's a good idea.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/sepgsql regression tests are a no-go

2011-09-26 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 How about this fix on regression test of sepgsql?

IMO, the fundamental problem with the sepgsql regression tests is that
they think they don't need to play by the rules that apply to every
other PG regression test.  I don't think this patch is fixing that
problem; it's just coercing pgxs.mk to assist in not playing by the
rules, and adding still more undocumented complexity to the PGXS
mechanisms in order to do so.

If we have to have a nonstandard command for running the sepgsql
regression tests, as it seems that we do, we might as well just make
that an entirely local affair within contrib/sepgsql.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 Sorry, are you saying the current (in other words, rte-security_barrier
 stores the state of reloption) approach is not a good idea?

Yes.  I think the same as Robert: the way to handle this is to store it
in RelOptInfo for the duration of planning, and pull it from the
catalogs during planner startup (cf plancat.c).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 However, the interface to reference reloptions are designed to pull
 this information with Relation pointer, rather than lsyscache, so
 I implemented this revision with a new rte-security_barrier member.

 This approach will guarantee that we can never implement an ALTER VIEW
 (or CREATE OR REPLACE VIEW) option that changes the state of the flag.
 I don't think that's a good idea.

Sorry, are you saying the current (in other words, rte-security_barrier
stores the state of reloption) approach is not a good idea?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 Sorry, are you saying the current (in other words, rte-security_barrier
 stores the state of reloption) approach is not a good idea?

 Yes.  I think the same as Robert: the way to handle this is to store it
 in RelOptInfo for the duration of planning, and pull it from the
 catalogs during planner startup (cf plancat.c).

Hmm. If so, it seems to me worthwhile to investigate an alternative
approach that stores relation-id of the view on rte-relid if rtekind is
RTE_SUBQUERY and pull the security_barrier flag from the catalog
during planner stage.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Robert Haas
On Mon, Sep 12, 2011 at 3:31 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 The Part-2 tries to tackles a leaky-view scenarios by functions with
 very tiny cost
 estimation value. It was same one we had discussed in the commitfest-1st.
 It prevents to launch functions earlier than ones come from inside of views 
 with
 security_barrier option.

 The Part-3 tries to tackles a leaky-view scenarios by functions that 
 references
 one side of join loop. It prevents to distribute qualifiers including
 functions without
 leakproof attribute into relations across security-barrier.

I took a little more of a look at this today.  It has major problems.

First, I get compiler warnings (which you might want to trap in the
future by creating src/Makefile.custom with COPT=-Werror when
compiling).

Second, the regression tests fail on the select_views test.

Third, it appears that the part2 patch works by adding an additional
traversal of the entire query tree to standard_planner().  I don't
think we want to add overhead to the common case where no security
views are in use, or at least it had better be very small - so this
doesn't seem acceptable to me.

Here are some simple benchmarking with pgbench -S (scale factor 10,
shared_buffers=400MB, MacBook Pro laptop) with and without this stack
of patches.  These aren't clear-cut enough to make me absolutely sure
that this patch causes a noticeable performance regression, but I
think it does, and I'm not at all sure that this is the worst case:

results.kaigai.1:tps = 9359.908769 (including connections establishing)
results.kaigai.1:tps = 9366.317857 (including connections establishing)
results.kaigai.1:tps = 9413.593349 (including connections establishing)
results.master.1:tps = 9444.494510 (including connections establishing)
results.master.1:tps = 9400.486860 (including connections establishing)
results.master.1:tps = 9472.220529 (including connections establishing)

In the light of these problems, it doesn't seem worthwhile for me to
spend any more time on this right now: it looks to me like this needs
a lot more work before it can be considered for commit.  I will mark
it Waiting on Author for now, but I think Returned with Feedback might
be more appropriate.  This needs more than light cleanup; it needs
much more rigorous testing, both as to correctness and performance,
and at least a partial redesign.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of recovery?

2011-09-26 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Mon, Sep 26, 2011 at 10:50 AM, Fujii Masao masao.fu...@gmail.com wrote:
 To prevent the above problem, we should write the contents of the hash table 
 to
 the disk for every restartpoints, I think. Then, when the server
 starts recovery,
 it should reload the hash table from the disk. Thought? Am I missing 
 something?

 That doesn't happen because the when we stop the server it will
 restart from a valid restartpoint - one where there is no in-progress
 multi-phase operation.

Not clear that that's true.  The larger point though is that the
invalid-page table is only interesting during crash recovery --- once
you've reached a consistent state, it should be empty and remain so.
So I see no particular value in Fujii's proposal of logging the table to
disk during standby mode.

It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
we (think we) have reached consistency, rather than leaving it to be
done only when we exit recovery mode.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Tom Lane
Kohei KaiGai kai...@kaigai.gr.jp writes:
 One possible idea not to store the flag in RangeTblEntry is to utilize
 rte-relid to show the relation-id of the source view, when rtekind is
 RTE_SUBQUERY; that enables to pull the security_barrier flag in
 executor stage.

Maybe I'm confused here, but what does the executor need the information
for?  I thought this was a planner problem.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.2] Fix Leaky View Problem

2011-09-26 Thread Kohei KaiGai
2011/9/26 Tom Lane t...@sss.pgh.pa.us:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 One possible idea not to store the flag in RangeTblEntry is to utilize
 rte-relid to show the relation-id of the source view, when rtekind is
 RTE_SUBQUERY; that enables to pull the security_barrier flag in
 executor stage.

 Maybe I'm confused here, but what does the executor need the information
 for?  I thought this was a planner problem.

Sorry, planner was what I wanted to say.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql setenv command

2011-09-26 Thread Jeff Janes
On Mon, Sep 26, 2011 at 12:07 PM, Josh Kupershmidt schmi...@gmail.com wrote:
 On Thu, Sep 15, 2011 at 7:02 PM, Andrew Dunstan and...@dunslane.net wrote:
 On Thu, September 15, 2011 6:10 pm, Josh Kupershmidt wrote:
 [need way to show current values]
 \! echo $foo

 (which is how I tested the patch, of course)

 Ah, right. IMO it'd be helpful to mention that echo example in your
 changes to psql-ref.sgml, either as part of your example inside the
 programlisting, or as a suggestion with the rest of the text.

 BTW, have you tested this on Windows? I don't have a Windows machine
 handy to fool with, but I do see what might be a mess/confusion on
 that platform. The MSDN docs claim[1] that putenv() is deprecated in
 favor of _putenv(). The code in pg_upgrade uses
 SetEnvironmentVariableA on WIN32, while win32env.c has a wrapper
 function pgwin32_putenv() around _putenv().

I also wondered this and also don't have a Windows build system.

The win32.h uses a macro to turn putenv() into pgwin32_putenv() , so
as long as that macro is in effect I think it should be doing the
right thing.  In any event, there are several other places in the
existing code-base that call putenv() in a similar fashion to the new
code, so it if doesn't work I would expect things to already be
falling over.


 On Sat, Sep 24, 2011 at 5:18 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 A description of the \setenv command should show up in the output of \?.

 Yeah, Andrew agreed upthread that help.c should be amended as well,
 which would fix \?.

Yes, sorry for the accidental nag.  I didn't realize that help.c is
what implements \? until after I posted.


 Should there be a regression test for this?  I'm not sure how it would
 work, as I don't see a cross-platform way to see what the variable is
 set to.

 Similar recent psql changes haven't had regression tests included, and
 I don't see much of a need here either.

I wasn't sure of the convention on that.  I intentionally broke a few
\ commands (because that was easier than reading through all of
regress) and it did start failing, but that looks like that is because
those commands are used incidentally as part of other tests, rather
than having tests in their own right.

But in any case, considering that we are both wondering if it works on
Windows, I think that argues that an automatic regression test would
be very handy.

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] EXECUTE tab completion

2011-09-26 Thread Andreas Karlsson

Hi,

Magnus's patch for adding tab completion of views to the TABLE statement 
reminded me of a minor annoyance of mine -- that EXECUTE always 
completes with PROCEDURE as if it would have been part of CREATE 
TRIGGER ... EXECUTE even when it is the first word of the line.


Attached is a simple patch which adds completion of prepared statement 
names to the EXECUTE statement.


What could perhaps be added is that if you press tab again after 
completing the prepared statement name you might want to see a single 
( appear. Did not add that though since EXECUTE foo(); is not valid 
syntax (while EXECUTE foo(1); is) and I did not feel the extra lines 
of code to add a query to check if the number of expected parameters is 
greater than 0 were worth it.


--
Andreas Karlsson
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
new file mode 100644
index 4f7df36..15bb8c1
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** static const SchemaQuery Query_for_list_
*** 588,593 
--- 588,598 
 FROM pg_catalog.pg_available_extensions \
WHERE substring(pg_catalog.quote_ident(name),1,%d)='%s' AND installed_version IS NULL
  
+ #define Query_for_list_of_prepared_statements \
+  SELECT pg_catalog.quote_ident(name) \
+FROM pg_catalog.pg_prepared_statements \
+   WHERE substring(pg_catalog.quote_ident(name),1,%d)='%s'
+ 
  /*
   * This is a list of all things in Pgsql, which can show up after CREATE or
   * DROP; and there is also a query to get a list of them.
*** psql_completion(char *text, int start, i
*** 1642,1647 
--- 1647,1658 
  		COMPLETE_WITH_LIST(list_CSV);
  	}
  
+ /* EXECUTE */
+ 	else if (pg_strcasecmp(prev_wd, EXECUTE) == 0 
+ 			 pg_strcasecmp(prev2_wd, EXECUTE) == 0)
+ 		/* must not match CREATE TRIGGER ... EXECUTE PROCEDURE */
+ 	COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements);
+ 
  	/* CREATE DATABASE */
  	else if (pg_strcasecmp(prev3_wd, CREATE) == 0 
  			 pg_strcasecmp(prev2_wd, DATABASE) == 0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql setenv command

2011-09-26 Thread Andrew Dunstan



On 09/26/2011 05:07 PM, Jeff Janes wrote:


But in any case, considering that we are both wondering if it works on
Windows, I think that argues that an automatic regression test would
be very handy.




I think an automated test should be possible. Something like:

   \setenv PGFOO blurfl
   \! echo $PGFOO %PGFOO%


and then have a couple of alternative results. When I get time to get 
back to this I'll experiment.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of recovery?

2011-09-26 Thread Florian Pflug
On Sep26, 2011, at 22:39 , Tom Lane wrote:
 It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
 we (think we) have reached consistency, rather than leaving it to be
 done only when we exit recovery mode.

I believe we also need to prevent the creation of restart points before
we've reached consistency. If we're starting from an online backup,
and a checkpoint occurred between pg_start_backup() and pg_stop_backup(),
we currently create a restart point upon replaying that checkpoint's
xlog record. At that point, however, unresolved page references are
not an error, since a truncation that happened after the checkpoint
(but before pg_stop_backup()) might or might not be reflected in the
online backup.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix for pg_upgrade

2011-09-26 Thread Bruce Momjian
panam wrote:
 Hi Bruce,
 
 on the old DB I've got 465783 as oid whereas on the new one it is 16505.
 
 is not in the dump file (old db), even 16385 (i guess this is a typo here)
 or 16505 are not.
 The only line in which 465783 could be found is

I need to see the lines after this.

 Is that enough information or should I send the whole dump? That's a bit of
 work as I have to expunge some sensitive schema data, or is there a
 meaningful way to just do the dump for a single db?

You can do:

pg_dump --binary-upgrade --schema-only dbname


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #6218: TRAP: FailedAssertion( !(owner-nsnapshots == 0), File: resowner.c, Line: 365)

2011-09-26 Thread Tom Lane
I wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 Shall I work on a fix?  I expect you are plenty busy with commitfest
 stuff, but please let me know otherwise.

 I have what-I-think-is-the-fix pretty clear in my own mind, so let me
 give it a try.  If it doesn't work I'll bounce it back to you.

Well, I soon ran into the issue that delaying the snapshot release makes
TopTransactionResourceOwner spit up.  After some reflection I decided
that the real problem is a circular dependency: snapshot management must
be considered lower-level than ResourceOwners because ResourceOwners
tell snapshot management what to do, but here we have
GetTransactionSnapshot trying to use TopTransactionResourceOwner to
manage its internal reference to the transaction snapshot.

Accordingly, the attached proposed patch gets rid of the circularity
by removing snapmgr.c's dependency on TopTransactionResourceOwner,
in favor of having it track the refcount manually.  This was messier
than I'd hoped because the bogus design had propagated into the SSI
manager meanwhile, but removing the TopTransactionResourceOwner
dependency from that too seems like a good idea.

This passes the regular and isolation regression tests, and it's also
okay with Yamamoto-san's prepared-ROLLBACK test case even without the
band-aid fix in plancache.c.  I can't immediately think of any other
test cases to throw at it.

Comments?

regards, tom lane

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index de3f965b37a4be93913907b1683c07ea24b4e633..3dab45c2da60a1010d566fccf658a4a55ee3ece3 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*** CommitTransaction(void)
*** 1906,1914 
  	/* Clean up the relation cache */
  	AtEOXact_RelationCache(true);
  
- 	/* Clean up the snapshot manager */
- 	AtEarlyCommit_Snapshot();
- 
  	/*
  	 * Make catalog changes visible to all backends.  This has to happen after
  	 * relcache references are dropped (see comments for
--- 1906,1911 
*** PrepareTransaction(void)
*** 2158,2166 
  	/* Clean up the relation cache */
  	AtEOXact_RelationCache(true);
  
- 	/* Clean up the snapshot manager */
- 	AtEarlyCommit_Snapshot();
- 
  	/* notify doesn't need a postprepare call */
  
  	PostPrepare_PgStat();
--- 2155,2160 
*** AbortTransaction(void)
*** 2339,2345 
  		AtEOXact_ComboCid();
  		AtEOXact_HashTables(false);
  		AtEOXact_PgStat(false);
- 		AtEOXact_Snapshot(false);
  		pgstat_report_xact_timestamp(0);
  	}
  
--- 2333,2338 
*** CleanupTransaction(void)
*** 2368,2373 
--- 2361,2367 
  	 * do abort cleanup processing
  	 */
  	AtCleanup_Portals();		/* now safe to release portal memory */
+ 	AtEOXact_Snapshot(false);	/* and release the transaction's snapshots */
  
  	CurrentResourceOwner = NULL;	/* and resource owner */
  	if (TopTransactionResourceOwner)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 1754180453daaa54e732bcd82cae5f3b70631524..d39f8975f8816d9bba02c0c398323ca470f1340b 100644
*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***
*** 146,152 
   *		PageIsPredicateLocked(Relation relation, BlockNumber blkno)
   *
   * predicate lock maintenance
!  *		RegisterSerializableTransaction(Snapshot snapshot)
   *		RegisterPredicateLockingXid(void)
   *		PredicateLockRelation(Relation relation, Snapshot snapshot)
   *		PredicateLockPage(Relation relation, BlockNumber blkno,
--- 146,152 
   *		PageIsPredicateLocked(Relation relation, BlockNumber blkno)
   *
   * predicate lock maintenance
!  *		GetSerializableTransactionSnapshot(Snapshot snapshot)
   *		RegisterPredicateLockingXid(void)
   *		PredicateLockRelation(Relation relation, Snapshot snapshot)
   *		PredicateLockPage(Relation relation, BlockNumber blkno,
*** static void OldSerXidSetActiveSerXmin(Tr
*** 417,423 
  static uint32 predicatelock_hash(const void *key, Size keysize);
  static void SummarizeOldestCommittedSxact(void);
  static Snapshot GetSafeSnapshot(Snapshot snapshot);
! static Snapshot RegisterSerializableTransactionInt(Snapshot snapshot);
  static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag);
  static bool GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag,
  		  PREDICATELOCKTARGETTAG *parent);
--- 417,423 
  static uint32 predicatelock_hash(const void *key, Size keysize);
  static void SummarizeOldestCommittedSxact(void);
  static Snapshot GetSafeSnapshot(Snapshot snapshot);
! static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot);
  static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag);
  static bool GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag,
  		  PREDICATELOCKTARGETTAG *parent);
*** SummarizeOldestCommittedSxact(void)
*** 1485,1490 
--- 

Re: [HACKERS] [BUGS] BUG #6218: TRAP: FailedAssertion( !(owner-nsnapshots == 0), File: resowner.c, Line: 365)

2011-09-26 Thread Alvaro Herrera

Excerpts from Tom Lane's message of lun sep 26 20:59:45 -0300 2011:

 Well, I soon ran into the issue that delaying the snapshot release makes
 TopTransactionResourceOwner spit up.  After some reflection I decided
 that the real problem is a circular dependency: snapshot management must
 be considered lower-level than ResourceOwners because ResourceOwners
 tell snapshot management what to do, but here we have
 GetTransactionSnapshot trying to use TopTransactionResourceOwner to
 manage its internal reference to the transaction snapshot.

Great.  I noticed the circularity but didn't reflect that it was bogus
in itself.

 Accordingly, the attached proposed patch gets rid of the circularity
 by removing snapmgr.c's dependency on TopTransactionResourceOwner,
 in favor of having it track the refcount manually.  This was messier
 than I'd hoped because the bogus design had propagated into the SSI
 manager meanwhile, but removing the TopTransactionResourceOwner
 dependency from that too seems like a good idea.

To be honest, I panicked for a second when I saw the new
SnapshotResetXmin call, before I realized that it wasn't necessary
before.  The serializable case makes more sense the patched way, I
think.

 This passes the regular and isolation regression tests, and it's also
 okay with Yamamoto-san's prepared-ROLLBACK test case even without the
 band-aid fix in plancache.c.  I can't immediately think of any other
 test cases to throw at it.

I added tests for all the problematic cases discovered after snapmgr was
introduced, so at least most known weird spots are covered.

Thanks

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #6218: TRAP: FailedAssertion( !(owner-nsnapshots == 0), File: resowner.c, Line: 365)

2011-09-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 To be honest, I panicked for a second when I saw the new
 SnapshotResetXmin call, before I realized that it wasn't necessary
 before.  The serializable case makes more sense the patched way, I
 think.

Yeah, in the old coding, SnapshotResetXmin would have happened
implicitly at the last UnregisterSnapshot call.  In this arrangement,
the last UnregisterSnapshot is going to be the manual one in
AtEOXact_Snapshot, so we need a manual SnapshotResetXmin there too.
I chose to put it at the bottom of the routine so that it's guaranteed
to fire even if the RegisteredSnapshots count was corrupted, but that's
just paranoia.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of recovery?

2011-09-26 Thread Fujii Masao
On Tue, Sep 27, 2011 at 7:28 AM, Florian Pflug f...@phlo.org wrote:
 On Sep26, 2011, at 22:39 , Tom Lane wrote:
 It might be worthwhile to invoke XLogCheckInvalidPages() as soon as
 we (think we) have reached consistency, rather than leaving it to be
 done only when we exit recovery mode.

 I believe we also need to prevent the creation of restart points before
 we've reached consistency. If we're starting from an online backup,
 and a checkpoint occurred between pg_start_backup() and pg_stop_backup(),
 we currently create a restart point upon replaying that checkpoint's
 xlog record. At that point, however, unresolved page references are
 not an error, since a truncation that happened after the checkpoint
 (but before pg_stop_backup()) might or might not be reflected in the
 online backup.

Preventing the creation of restartpoints before reaching consistent point
sounds fragile to the case where the backup takes very long time. It might
also take very long time to reach consistent point when replaying from that
backup. Which prevents also the removal of WAL files (e.g., streamed from
the master server) for a long time, and then might cause disk full failure.

ISTM that writing an invalid-page table to the disk for every restartpoints is
better approach. If an invalid-page table is never updated after we've
reached consistency point, we probably should make restartpoints write
that table only after that point. And, if a reference to an invalid
page is found
after the consistent point, we should emit error and cancel a recovery.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-26 Thread Andrew Dunstan



On 09/25/2011 02:39 PM, Joshua Berkus wrote:

There might be a use case for a separate directive include_if_exists,
or some such name.  But I think the user should have to tell us very
clearly that it's okay for the file to not be found.

Better to go back to include_directory, then.





I rather like Tom's suggestion of include_if_exists.

There might be a case for include_directory, but I think that needs to 
be made separately.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] random isolation test failures

2011-09-26 Thread Noah Misch
On Mon, Sep 26, 2011 at 01:10:27PM -0400, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  We are seeing numerous occasional buildfarm failures of the fk-deadlock2 
  isolation test,
 
 Yeah, I complained about this already, but Kevin disclaims all
 responsibility for the fk isolation tests.  It looks like Alvaro
 and Noah Misch are the people to be harassing.

Yep; I took advantage of Kevin's test harness for some unrelated tests.

These sporadic failures happen whenever the test case takes longer than
deadlock_timeout (currently 100ms for these tests) to setup the deadlock.  I
outlined some mitigating strategies here:
http://archives.postgresql.org/message-id/20110727171438.ge18...@tornado.leadboat.com

I'd vote for #1: let's double the deadlock_timeout until the failures stop.
Other opinions?

Thanks,
nm
*** a/src/test/isolation/specs/fk-deadlock.spec
--- b/src/test/isolation/specs/fk-deadlock.spec
***
*** 19,25  teardown
  }
  
  session s1
! setup { BEGIN; SET deadlock_timeout = '100ms'; }
  step s1i{ INSERT INTO child VALUES (1, 1); }
  step s1u{ UPDATE parent SET aux = 'bar'; }
  step s1c{ COMMIT; }
--- 19,25 
  }
  
  session s1
! setup { BEGIN; SET deadlock_timeout = '200ms'; }
  step s1i{ INSERT INTO child VALUES (1, 1); }
  step s1u{ UPDATE parent SET aux = 'bar'; }
  step s1c{ COMMIT; }
*** a/src/test/isolation/specs/fk-deadlock2.spec
--- b/src/test/isolation/specs/fk-deadlock2.spec
***
*** 24,30  teardown
  }
  
  session s1
! setup { BEGIN; SET deadlock_timeout = '100ms'; }
  step s1u1   { UPDATE A SET Col1 = 1 WHERE AID = 1; }
  step s1u2   { UPDATE B SET Col2 = 1 WHERE BID = 2; }
  step s1c{ COMMIT; }
--- 24,30 
  }
  
  session s1
! setup { BEGIN; SET deadlock_timeout = '200ms'; }
  step s1u1   { UPDATE A SET Col1 = 1 WHERE AID = 1; }
  step s1u2   { UPDATE B SET Col2 = 1 WHERE BID = 2; }
  step s1c{ COMMIT; }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix for pg_upgrade

2011-09-26 Thread panam
Hi Bruce,

on the old DB I've got 465783 as oid whereas on the new one it is 16505.

is not in the dump file (old db), even 16385 (i guess this is a typo here)
or 16505 are not.
The only line in which 465783 could be found is

Is that enough information or should I send the whole dump? That's a bit of
work as I have to expunge some sensitive schema data, or is there a
meaningful way to just do the dump for a single db?

Thanks  regards,
panam

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/fix-for-pg-upgrade-tp3411128p4843289.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade automatic testing

2011-09-26 Thread Bruce Momjian
Peter Eisentraut wrote:
 8.4 - master upgrade fails like this:
 
 Restoring user relation files
 Mismatch of relation names in database regression: old name 
 pg_toast.pg_toast_27437, new name pg_toast.pg_toast_27309
 Failure, exiting
 
 This has been 100% reproducible for me.

I can now reproduce this failure and will research the cause, probably
not before next week though.  :-( What is interesting is that loading
the regression tests from an SQL dump does not show the failure, but
running the regression tests and then upgrading does.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-26 Thread Fujii Masao
On Mon, Sep 26, 2011 at 11:39 AM, Steve Singer ssinger...@sympatico.ca wrote:
 I have looked at both Jun's patch from Sept 13 and Fujii's updates to the
 patch.  I agree that Fujii's updated version should be used as the basis for
 changes going forward.   My comments below refer to that version (unless
 otherwise noted).

Thanks for the tests and comments!

 In backup.sgml  the new section titled Making a Base Backup during
 Recovery  I would prefer to see some mention in the title that this
 procedure is for standby servers ie Making a Base Backup from a Standby
 Database.  Users who have setup a hot-standby database should be familiar
 with the 'standby' terminology. I agree that the during recovery
 description is technically correct but I'm not sure someone who is looking
 through the manual for instructions on making a base backup from here
 standby will realize this is the section they should read.

I used the term recovery rather than standby because we can take
a backup even from the server in normal archive recovery mode but not
standby mode. But there is not many users who take a backup during
normal archive recovery, so I agree that the term standby is better to
be used in the document. Will change.

 Around line 969 where you give an example of copying the control file I
 would be a bit clearer that this is an example command.  Ie (Copy the
 pg_control file from the cluster directory to the global sub-directory of
 the backup.  For example cp $PGDATA/global/pg_control
 /mnt/server/backupdir/global)

Looks better. Will change.

 Testing Notes
 -

 I created a standby server from a base backup of another standby server. On
 this new standby server I then

 1. Ran pg_start_backup('3'); and left the psql connection open
 2. touch /tmp/3 -- my trigger_file

 ssinger@ssinger-laptop:/usr/local/pgsql92git/bin$ LOG:  trigger file found:
 /tmp/3
 FATAL:  terminating walreceiver process due to administrator command
 LOG:  restored log file 00010006 from archive
 LOG:  record with zero length at 0/60002F0
 LOG:  restored log file 00010006 from archive
 LOG:  redo done at 0/6000298
 LOG:  restored log file 00010006 from archive
 PANIC:  record with zero length at 0/6000298
 LOG:  startup process (PID 19011) was terminated by signal 6: Aborted
 LOG:  terminating any other active server processes
 WARNING:  terminating connection because of crash of another server process
 DETAIL:  The postmaster has commanded this server process to roll back the
 current transaction and exit, because another server process exited
 abnormally and possibly corrupted shared memory.
 HINT:  In a moment you should be able to reconnect to the database and
 repeat your command.

 The new postmaster (the one trying to be promoted) dies.  This is somewhat
 repeatable.

Looks weired. Though the WAL record starting from 0/6000298 was read
successfully, then re-fetch of the same record fails at the end of recovery.
One possible cause is the corruption of archived WAL file. What
restore_command on the standby and archive_command on the master
are you using? Could you confirm that there is no chance to overwrite
archive WAL files in your environment?

I tried to reproduce this problem several times, but I could not. Could
you provide the test case which reproduces the problem?

 If a base backup is in progress on a recovery database and that recovery
 database is promoted to master, following the promotion (if you don't
 restart the postmaster).  I see
 select pg_stop_backup();
 ERROR:  database system status mismatches between pg_start_backup() and
 pg_stop_backup()

 If you restart the postmaster this goes away.  When the postmaster leaves
 recovery mode I think it should abort an existing base backup so
 pg_stop_backup() will say no backup in progress,

I don't think that it's good idea to cancel the backup when promoting
the standby.
Because if we do so, we need to handle correctly the case where cancel of backup
and pg_start_backup/pg_stop_backup are performed at the same time. We can
simply do that by protecting those whole operations including pg_start_backup's
checkpoint by the lwlock. But I don't think that it's worth
introducing new lwlock
only for that. And it's not good to take a lwlock through
time-consuming checkpoint
operation. Of course we can avoid such a lwlock, but which would require more
complicated code.

 or give an error message on
 pg_stop_backup() saying that the base backup won't be usable.  The above
 error doesn't really tell the user why there is a mismatch.

What about the following error message?

ERROR:  pg_stop_backup() was executed during normal processing though
pg_start_backup() was executed during recovery
HINT:  The database backup will not be usable.

Or, you have better idea?

 In my testing a few times I got into a situation where a standby server
 coming from a recovery target took a while to finish recovery (this is on 

Re: [HACKERS] random isolation test failures

2011-09-26 Thread Alvaro Herrera
Excerpts from Noah Misch's message of lun sep 26 21:57:40 -0300 2011:

 These sporadic failures happen whenever the test case takes longer than
 deadlock_timeout (currently 100ms for these tests) to setup the deadlock.  I
 outlined some mitigating strategies here:
 http://archives.postgresql.org/message-id/20110727171438.ge18...@tornado.leadboat.com
 
 I'd vote for #1: let's double the deadlock_timeout until the failures stop.
 Other opinions?

I just tweaked isolationtester so that it collects the error messages
and displays them all together at the end of the test.  After seeing it
run, I didn't like it -- I think I prefer something more local, so that
in the only case where we call try_complete_step twice in the loop, we
report any errors in either.  AFAICS this would make both expected cases
behave identically in test output.  The only thing left to figure out is
where to store the error message between calls ... clearly Step is not
the right place for it.  I'm on it now, anyway.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


isolation-fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug of recovery?

2011-09-26 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 ISTM that writing an invalid-page table to the disk for every restartpoints is
 better approach.

I still say that's uncalled-for overkill.  The invalid-page table is not
necessary for recovery, it's only a debugging cross-check.  You're more
likely to introduce bugs than fix any by adding a mechanism like that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] random isolation test failures

2011-09-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 I just tweaked isolationtester so that it collects the error messages
 and displays them all together at the end of the test.  After seeing it
 run, I didn't like it -- I think I prefer something more local, so that
 in the only case where we call try_complete_step twice in the loop, we
 report any errors in either.  AFAICS this would make both expected cases
 behave identically in test output.

Hmm, is that really an appropriate fix?  I'm worried that it might mask
event-ordering differences that actually are significant.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >