Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Dennis Bjorklund
On Sat, 7 Aug 2004, Tom Lane wrote:

> question at hand is whether we can support 32-bit characters or not ---
> and if not, what's the next bug to fix?

True, and that's hard to just give an answer to. One could do some simple 
testing, make sure regexps work and then treat anything else that might 
not work, as bugs to be fixed later on when found.

The alternative is to inspect all code paths that involve strings, not fun 
at all :-)

My previous mail talked about utf-8 translation. Not all characters
possible to form using utf-8 are assigned by the unicode org. However,
the part that interprets the unicode strings is in the os, so different
os'es can give different results. So I think pg should just accept even 6 
byte utf-8 sequences even if some characters are not currently assigned.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
This should do it.

Regards,

John Hansen 

wchar.c.patch
Description: wchar.c.patch



Re: [HACKERS] pg_dump: could not parse ACL list

2004-08-07 Thread Christopher Kings-Lynne
>> $ pg_dump -p 5433 test
>> pg_dump: could not parse ACL list ([0:1]={postgres=UC/postgres,=UC/postgres}) for object "public" (SCHEMA)
>
> Ugh.  This is an unforeseen side effect of Joe's recent changes to make
> array_out emit dimension info.
>
> I think the most reasonable answer is to tweak the ACL code so that it
> creates ACL arrays with lower bound 1 instead of lower bound 0.  The
> only possible downside is that this would confuse any client code that
> is manually manipulating ACL arrays and knows about the lower-bound-0
> behavior ... but any such code is likely broken anyway by the other ACL
> changes that have gone on lately ...

Yes, phpPgAdmin is currently broken either way methinks.

Chris


Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Tatsuo Ishii
> Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> > ... This also means that the start byte can never start with 7 or 8
> > ones, that is illegal and should be tested for and rejected. So the
> > longest utf-8 sequence is 6 bytes (and the longest character needs 4
> > bytes (or 31 bits)).
> 
> Tatsuo would know more about this than me, but it looks from here like
> our coding was originally designed to support only 16-bit-wide internal
> characters (ie, 16-bit pg_wchar datatype width).  I believe that the
> regex library limitation here is gone, and that as far as that library
> is concerned we could assume a 32-bit internal character width.  The
> question at hand is whether we can support 32-bit characters or not ---
> and if not, what's the next bug to fix?

pg_wchar is already a 32-bit datatype.  However I doubt there's
actually a need for 32-bit width character sets. Even Unicode only
uses up to 0x0010FFFF, so 24-bit should be enough...
--
Tatsuo Ishii



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
Yes, but the specification allows for 6-byte sequences, or 32-bit
characters.
As Dennis pointed out, just because they're not used doesn't mean we
should not allow them to be stored, since there might be someone using
the high ranges for a private character set, which could very well be
included in the specification some day.

Regards,

John Hansen



Re: [HACKERS] CVS comment

2004-08-07 Thread Gaetano Mendola
Alvaro Herrera wrote:
> On Sat, Aug 07, 2004 at 01:34:20AM +0200, Gaetano Mendola wrote:
>> Alvaro Herrera wrote:
>>> Yeah.  I included your tab-complete patch in the patch I sent to
>>> pgsql-patches, which later Tom reworked and applied.  His CVS comment
>>> didn't mention the tab completion change.  This isn't surprising at all,
>>> as minor changes go uncommented sometimes when they are surrounded by
>>> bigger changes (like the large object work).
>>
>> Understood. Why not comment each file separately? Too much work with CVS?
>
> People just doesn't feel it's important ... other projects have strict
> guidelines regarding CVS commit message formatting, but what I have seen
> is in most cases useless noise.  Anyone can see the real diffs when
> there's need.
>
>> I do not have experience with CVS (at work I use Clearcase) and for my
>> personal purpose I use subversion (any plans to migrate the CVS repository
>> to subversion or even bitkeeper?).
>
> Subversion and arch have been mentioned, but so far there is no
> compelling reason to change.  It'd take convincing at least a couple of
> core hackers to get the ball rolling ...

Well, having seen what's happening with the 8.0 release, I think that
committers are too overloaded and someone else has to be "promoted" to
committer, and I believe that having better tools can improve the process
too.

Regards
Gaetano Mendola




Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Tatsuo Ishii
> Yes, but the specification allows for 6byte sequences, or 32bit
> characters.

UTF-8 is just an encoding specification, not character set
specification. Unicode only has 17 256x256 planes in its
specification.

> As dennis pointed out, just because they're not used, doesn't mean we
> should not allow them to be stored, since there might me someone using
> the high ranges for a private character set, which could very well be
> included in the specification some day.

We should expand it to 64-bit since some day the specification might
be changed then:-)

More seriously, Unicode is filled with tons of confusion and
inconsistency IMO. Remember that Unicode advocates once said that the
merit of Unicode was that it only requires 16-bit width. Now they say they
need surrogate pairs and 32-bit width chars...

Anyway my point is: if the current specification of Unicode only allows
a 24-bit range, why do we need to allow usage against the specification?
--
Tatsuo Ishii



Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Christopher Kings-Lynne
> Now it's entirely possible that the underlying support is a few bricks
> shy of a load --- for instance I see that pg_utf_mblen thinks there are
> no UTF8 codes longer than 3 bytes whereas your code goes to 4.  I'm not
> an expert on this stuff, so I don't know what the UTF8 spec actually
> says.  But I do think you are fixing the code at the wrong level.

Surely there are UTF-8 codes that are at least 3 bytes.  I have a 
_vague_ recollection that you have to keep escaping and escaping to get 
up to like 4 bytes for some Asian code points?

Chris


Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
4 actually,
0x10FFFF needs four bytes:

11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
0x10FFFF = 1 0000 1111 1111 1111 1111

Fill in the blanks, starting from the bottom, you get:
11110100 10001111 10111111 10111111
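
The fill-in-the-blanks step can be sketched in C roughly as follows. This is only an illustration under assumptions: sketch_utf8_encode is a hypothetical helper, not a function from wchar.c or the attached patch, and it does not filter surrogate code points.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from wchar.c): encode one code point
 * as UTF-8.  Returns the number of bytes written to out (1..4), or 0
 * if cp lies above 0x10FFFF.  Surrogates are not filtered here.
 */
size_t
sketch_utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp <= 0x7F)
    {
        out[0] = (unsigned char) cp;                          /* 0xxxxxxx */
        return 1;
    }
    if (cp <= 0x7FF)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));          /* 110xxxxx */
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));        /* 10xxxxxx */
        return 2;
    }
    if (cp <= 0xFFFF)
    {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));         /* 1110xxxx */
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));         /* 11110xxx */
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;   /* beyond the Unicode range */
}
```

Calling it with 0x10FFFF should yield the four bytes F4 8F BF BF, which is the bit pattern worked out by hand above.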

Regards,

John Hansen 



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Dennis Bjorklund
On Sat, 7 Aug 2004, John Hansen wrote:

> should not allow them to be stored, since there might me someone using
> the high ranges for a private character set, which could very well be
> included in the specification some day.

There are areas reserved for private character sets.

-- 
/Dennis Björklund




Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
Well, maybe we'd be better off, compiling a list of (in?)valid ranges
from the full unicode database 
(http://www.unicode.org/Public/UNIDATA/UnicodeData.txt and
http://www.unicode.org/Public/UNIDATA/Unihan.txt)
and with every release of pg, update the detection logic so only valid
characters are allowed?

Regards,

John Hansen



Re: [HACKERS] Vacuum Cost Documentation?

2004-08-07 Thread Gaetano Mendola
Bruce Momjian wrote:
> Jan Wieck wrote:
>> On 8/6/2004 9:04 PM, Bruce Momjian wrote:
>>> Updated.  Thanks.
>>
>> I thought we want to have the feature activated ... I reversed your 
>> change and brought guc.c in sync instead.
>
> Uh, if the guy is doing a vacuum at night, does he want the delay? 
> Seems someone should have to enable the delay by default, or does your
> setup recognize when it is being run on a lightly loaded system?

TODO:  make vacuum_cost_naptime aware of system load
:-)

Regards
Gaetano Mendola



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Dennis Bjorklund
On Sat, 7 Aug 2004, Tatsuo Ishii wrote:

> More seriously, Unicode is filled with tons of confusion and
> inconsistency IMO. Remember that once Unicode adovocates said that the
> merit of Unicode was it only requires 16-bit width. Now they say they
> need surrogate pairs and 32-bit width chars...
> 
> Anyway my point is if current specification of Unicode only allows
> 24-bit range, why we need to allow usage against the specification?

Whatever problems they have had in the past, the ISO 10646 defines
formally a 31-bit character set. Are you saying that applications should
reject strings that contain characters that it does not recognize?

Is there a specific reason you want to restrict it to 24 bits? In practice 
it does not matter much since it's not used today, I just don't know why 
you want it.

-- 
/Dennis Björklund




Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
Yea, I know

10 - 10 : 2 separate planes iirc

... John 



[HACKERS] pg_dump and sequences (bug ?)

2004-08-07 Thread strk
Using pg_dump from postgresql 7.3.4 I've obtained
a dump file containing a SEQUENCE SET with no
corresponding SEQUENCE. I've seen that this is usually
due to the presence of a table with a 'serial' field,
but since in this case there is no such table I wonder
if this is a bug in pg_dump.

The only reason I can imagine for this is pg_dump taking
any sequence whose name  ends in _seq as being associated
to a table, no matter if that table exists and has a 'serial'
field. Is this possible ? Shouldn't this kind of dependency
be coded somehow ?

TIA

--strk;



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Dennis Bjorklund
On Sat, 7 Aug 2004, Takehiko Abe wrote:

It looks like you sent the last mail only to me and not the list. I 
assume it was a mistake, so I am sending the reply to both.

> > Is there a specific reason you want to restrict it to 24 bits?
> 
> ISO 10646 is said to have removed its private use codepoints outside of
> the Unicode 0 - 10FFFF range to ensure the compatibility with Unicode.
> 
> see Section C.2 and C.3 of Unicode 4.0 Appendix C "Relationship to ISO
> 10646": .

The one and only reason for allowing 31 bit is that it's defined by iso
10646. In practice there is probably no one that uses the upper part of
10646 so not supporting it will most likely not hurt anyone.

I'm happy either way so I will put my voice on letting PG use unicode (not
ISO 10646) and restrict it to 24 bits. By the time someone wants (if ever)
iso 10646 we probably have support for different charsets and can easily
handle both at the same time.

-- 
/Dennis Björklund




Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
Point taken, since we're supporting UTF8, and not ISO 10646.

Now, is it really 24 bits tho? 
Afaict, it's really 21 (0 - 10FFFF or 0 - xxx10000 11111111 11111111)

This would require that we support 4 byte sequences
(11110100 10001111 10111111 10111111 = 10FFFF)


Regards,

John Hansen



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Dennis Bjorklund
On Sat, 7 Aug 2004, John Hansen wrote:

> Now, is it really 24 bits tho? 
> Afaict, it's really 21 (0 - 10FFFF or 0 - xxx10000 11111111 11111111)

Yes, up to 0x10FFFF should be enough.

The 24 is not really important, this is all about what utf-8 strings to 
accept as input. The strings are stored as utf-8 strings and when 
processed inside pg it uses wchar_t that is 32 bit (on some systems at 
least). By restricting the utf-8 input to unicode we can in the future 
store each character as 3 bytes if we want.
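
A minimal sketch of what "restricting the utf-8 input to unicode" could look like, assuming a hypothetical function sketch_utf8_check (this is not pg_verifymbstr and not the attached patch). It accepts only well-formed sequences of 1 to 4 bytes that decode to at most 0x10FFFF, so every accepted value fits in 21 bits and therefore in 3 bytes. It rejects overlong forms but, as a simplification, does not reject surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical validator: returns the length (1..4) of the UTF-8
 * sequence at s if it is well formed and decodes to a code point
 * <= 0x10FFFF, otherwise 0.  len is the number of bytes available.
 * The decoded value is stored in *cp_out when cp_out is not NULL.
 */
size_t
sketch_utf8_check(const unsigned char *s, size_t len, uint32_t *cp_out)
{
    size_t      need;
    size_t      i;
    uint32_t    cp;
    uint32_t    min;    /* smallest value that needs this many bytes */

    if (len == 0)
        return 0;

    if (s[0] < 0x80)
    {
        need = 1; cp = s[0]; min = 0;
    }
    else if ((s[0] & 0xE0) == 0xC0)
    {
        need = 2; cp = s[0] & 0x1F; min = 0x80;
    }
    else if ((s[0] & 0xF0) == 0xE0)
    {
        need = 3; cp = s[0] & 0x0F; min = 0x800;
    }
    else if ((s[0] & 0xF8) == 0xF0)
    {
        need = 4; cp = s[0] & 0x07; min = 0x10000;
    }
    else
        return 0;       /* stray continuation byte, 5/6-byte lead, 0xFE, 0xFF */

    if (len < need)
        return 0;       /* truncated sequence */

    for (i = 1; i < need; i++)
    {
        if ((s[i] & 0xC0) != 0x80)
            return 0;   /* not a 10xxxxxx continuation byte */
        cp = (cp << 6) | (uint32_t) (s[i] & 0x3F);
    }

    if (cp < min || cp > 0x10FFFF)
        return 0;       /* overlong form, or beyond the Unicode range */

    if (cp_out != NULL)
        *cp_out = cp;
    return need;
}
```

With this rule F4 8F BF BF (0x10FFFF) is accepted, while F4 90 80 80 (which would decode to 0x110000) and any 5 or 6 byte sequence are refused.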

--
/Dennis Björklund




Re: [HACKERS] Vacuum Cost Documentation?

2004-08-07 Thread Jan Wieck
On 8/6/2004 11:34 PM, Bruce Momjian wrote:
> Jan Wieck wrote:
>> On 8/6/2004 9:04 PM, Bruce Momjian wrote:
>>> Updated.  Thanks.
>>
>> I thought we want to have the feature activated ... I reversed your 
>> change and brought guc.c in sync instead.
>
> Uh, if the guy is doing a vacuum at night, does he want the delay? 
> Seems someone should have to enable the delay by default, or does your
> setup recognize when it is being run on a lightly loaded system?

Good that autovacuum didn't make it then, those people would have had a 
big surprise :-)

Jan
--
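#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#=================================================== [EMAIL PROTECTED] #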


Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
Which brings us back to something like the attached...

Regards,

John Hansen


wchar.c.patch
Description: wchar.c.patch



[HACKERS] Backend crashes with notification rule

2004-08-07 Thread Bernd Helmle
I have this on 8.0dev (checked out last friday):
yomama=# SELECT version();
 version
---
PostgreSQL 8.0devel on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.0
(1 Zeile)
yomama=# CREATE TABLE test (test text);
CREATE TABLE
yomama=# CREATE RULE rule_insert_test AS ON INSERT TO test DO NOTIFY 
test_notification;
CREATE RULE

yomama=# INSERT INTO test VALUES ('test');
Server beendete die Verbindung unerwartet
   Das heißt wahrscheinlich, daß der Server abnormal beendete
   bevor oder während die Anweisung bearbeitet wurde.
Die Verbindung zum Server wurde verloren.  Versuche Reset: Fehlgeschlagen.
!>
With assert checking enabled, I got this in the logfile:
TRAP: FailedAssertion("!(n < list->length)", File: "list.c", Line: 392)
LOG:  server process (PID 18637) was terminated by signal 6
LOG:  terminating any other active server processes
Sorry for the German locale.
--
 Bernd


Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Tom Lane
Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> On Sat, 7 Aug 2004, Tatsuo Ishii wrote:
>> Anyway my point is if current specification of Unicode only allows
>> 24-bit range, why we need to allow usage against the specification?

> Is there a specific reason you want to restrict it to 24 bits?

I see several places that have to allocate space on the basis of the
maximum encoded character length possible in the current encoding
(look for uses of pg_database_encoding_max_length).  Probably the only
one that's really significant for performance is text_substr(), but
that's enough to be an argument against setting maxmblen higher than
we have to.

It looks to me like supporting 4-byte UTF-8 characters would be enough
to handle the existing range of Unicode codepoints, and that is probably
as much as we want to do.

If I understood what I was reading, this would take several things:
* Remove the "special UTF-8 check" in pg_verifymbstr;
* Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
* Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
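
As a rough sketch of the second and third items, assuming nothing about the real code beyond what is said above: sketch_utf_mblen and SKETCH_UTF8_MAXMBLEN are hypothetical names, not the actual pg_utf_mblen or the pg_wchar_table[] entry. The second function shows why a larger maximum length matters to callers that size buffers from it.

```c
#include <stddef.h>

/* Assumed maximum encoded length once the 4-byte case is supported. */
#define SKETCH_UTF8_MAXMBLEN 4

/*
 * Hypothetical length-by-lead-byte routine extended to the 4-byte case.
 * Returning 1 for an unrecognized lead byte is a choice made for this
 * sketch only; validation is assumed to happen elsewhere.
 */
int
sketch_utf_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00)
        return 1;               /* 0xxxxxxx */
    if ((*s & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((*s & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((*s & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;
}

/*
 * Worst-case number of bytes needed to hold nchars characters, the kind
 * of estimate a substring routine has to make before it knows the data.
 */
size_t
sketch_max_bytes(size_t nchars)
{
    return nchars * SKETCH_UTF8_MAXMBLEN;
}
```

Going from 3 to 4 raises every such worst-case estimate by a third, and going to 6 would double it, which is the cost argument for not setting the maximum higher than needed.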

Are there any other places that would have to change?  Would this break
anything?  The testing aspect is what's bothering me at the moment.

regards, tom lane



Re: [HACKERS] Vacuum Cost Documentation?

2004-08-07 Thread Tom Lane
Jan Wieck <[EMAIL PROTECTED]> writes:
> Good that autovacuum didn't make it then, those people would have had a 
> big surprise :-)

If autovacuum had made it, would you expect someone to have enabled it
by default?  Without advance discussion?

regards, tom lane



Re: [HACKERS] PITR - recovery to a particular transaction

2004-08-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > When we do a PITR recovery based on xid, does it stop recovery based on
> > the start of the xid or the commit of the xid?
> 
> You can stop either "before" or "after" that commit.  See
> recovery.conf.sample (I don't think it's documented anywhere else
> yet :-(),

Yea, my question is if you choose "after", do you get everything that
happens until the "after" transaction commits, or just when it begins. 
If I stop after xid 125, and xid 126 starts and stops before 125
commits, does 126 get restored?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Vacuum Cost Documentation?

2004-08-07 Thread Bruce Momjian
Jan Wieck wrote:
> On 8/6/2004 11:34 PM, Bruce Momjian wrote:
> 
> > Jan Wieck wrote:
> >> On 8/6/2004 9:04 PM, Bruce Momjian wrote:
> >> 
> >> > Updated.  Thanks.
> >> 
> >> I thought we want to have the feature activated ... I reversed your 
> >> change and brought guc.c in sync instead.
> > 
> > Uh, if the guy is doing a vacuum at night, does he want the delay? 
> > Seems someone should have to enable the delay by default, or does your
> > setup recoginize when it is being run on a lightly loaded system?
> > 
> > 
> 
> Those people will instantly realize what is going on and either change 
> the delay setting or start running vacuum at daytime too.
> 
> What this buys us is that over time we will see less researches and 
> articles telling people that you have to bring down a PostgreSQL DB 
> frequently for vacuum maintenance because those "testers" run their fair 
> comparisions with out of the box configuration settings.

I am not in favor of adding a delay in VACUUM unless people ask for it. 
Imagine either a nightly vacuum or a vacuum you run in your application
after you delete and just before you load a table.  Neither wants a
delay.  I think people who want a delay will think to ask for it, while
people who want a quick vacuum will just think that vacuum is slow.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Backend crashes with notification rule

2004-08-07 Thread Tom Lane
Bernd Helmle <[EMAIL PROTECTED]> writes:
> I have this on 8.0dev (checked out last friday):
> yomama=# CREATE RULE rule_insert_test AS ON INSERT TO test DO NOTIFY 
> test_notification;
> [ causes crash ]

Fixed.  Thanks for the report!

regards, tom lane



Re: [HACKERS] PITR - recovery to a particular transaction

2004-08-07 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Yea, my question is if you choose "after", do you get everything that
> happens until the "after" transaction commits, or just when it begins. 
> If I stop after xid 125, and xid 126 starts and stops before 125
> commits, does 126 get restored?

Yes.  You don't get to be selective about what to keep: it's everything
up to a certain time instant, and nothing after that.  Stopping by XID
is just a different way of identifying what that time instant is.

BTW, stopping "before" an XID actually means stopping just before its
commit or abort record, so transactions that ended before it did will
be included in the recovery.
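
The rule can be modelled with a toy sketch. This is not the recovery code: the log is reduced to the XIDs of commit/abort records in the order they appear, and sketch_replay_count is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model: end_xids[] holds the XIDs of commit/abort records in log
 * order.  Returns how many of those records get replayed when recovery
 * stops "before" (stop_after = false) or "after" (stop_after = true)
 * the end record of target_xid.  If the target never appears, the
 * whole log is replayed.
 */
size_t
sketch_replay_count(const unsigned *end_xids, size_t n,
                    unsigned target_xid, bool stop_after)
{
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (end_xids[i] == target_xid)
            return stop_after ? i + 1 : i;
    }
    return n;
}
```

For a log whose end records arrive in the order 124, 126, 125, 127, stopping after 125 replays three records (124, 126, 125) and stopping before 125 replays two (124, 126). Either way 126 is restored, because it ended earlier in the log than 125 did.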

regards, tom lane



Re: [HACKERS] PITR - recovery to a particular transaction

2004-08-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Yea, my question is if you choose "after", do you get everything that
> > happens until the "after" transaction commits, or just when it begins. 
> > If I stop after xid 125, and xid 126 starts and stops before 125
> > commits, does 126 get restored?
> 
> Yes.  You don't get to be selective about what to keep: it's everything
> up to a certain time instant, and nothing after that.  Stopping by XID
> is just a different way of identifying what that time instant is.
> 
> BTW, stopping "before" an XID actually means stopping just before its
> commit or abort record, so transactions that ended before it did will
> be included in the recovery.

OK, I added a mention of this in the docs.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Updateable Views?

2004-08-07 Thread Jan Wieck
On 8/3/2004 11:38 PM, Greg Stark wrote:
> "Scott Marlowe" <[EMAIL PROTECTED]> writes:
>> On Tue, 2004-08-03 at 13:05, CSN wrote:
>>> Just wondering, is updateable views slated for a
>>> future version of Postgresql? In addition to using
>>> rules that is.
>>
>> I would think that a basic fleshing out of the logic with some kind of
>> stored proc to make the views and triggers would likely get someone
>> started on the backend work.  You know, a proof of concept thingy.
>
> I have some fears here. It seems everyone's first thought when they think
> about updateable views is to think about constructing rules on the views.
>
> How would that approach help with inline views? Things like:
>
>  UPDATE (SELECT a+b AS x, c AS y FROM foo) SET c=1 WHERE x = 10

There is no such thing as an "inline view". What you show here is a 
subselect, and I have not heard of "updatable subselects" yet. Could you 
point me to the section in the ANSI SQL specifications that describes 
this feature please?

Jan

> It seems like starting with these types of views in the backend would be more
> productive than implementing something in rules. Once postgres can handle
> inline views it should be trivial to handle persistent views just like they're
> handled on selects.

--
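#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#=================================================== [EMAIL PROTECTED] #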


Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Oliver Elphick
On Sat, 2004-08-07 at 07:10, Tom Lane wrote:
> Oliver Elphick <[EMAIL PROTECTED]> writes:
> > glibc provides various routines (mb...) for handling Unicode.  How many
> > of our supported platforms don't have these?
> 
> Every one that doesn't use glibc.  Don't bother proposing a glibc-only
> solution (and that's from someone who works for a glibc-only company;
> you don't even want to think about the push-back you'll get from other
> quarters).

No, that's not what I was proposing.  My suggestion was to use these
routines if they are sufficiently widely implemented, and our own
routines where standard ones are not available.

The man page for mblen says
"CONFORMING TO
   ISO/ANSI C, UNIX98"

Is glibc really the only C library to conform?

If using the mb... routines isn't feasible, IBM's ICU library
(http://oss.software.ibm.com/icu/) is available under the X licence,
which is compatible with BSD as far as I can see.  Besides character
conversion, ICU can also do collation in various locales and encodings. 
My point is, we shouldn't be writing a new set of routines to do half a
job if there are already libraries available to do all of it.

-- 
Oliver Elphick  [EMAIL PROTECTED]
Isle of Wight  http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E  1EC0 5664 7A2F A543 10EA
 
 "Be still before the LORD and wait patiently for him;
  do not fret when men succeed in their ways, when they
  carry out their wicked schemes." 
Psalms 37:7 




Postgres development model (was Re: [HACKERS] CVS comment)

2004-08-07 Thread Alvaro Herrera
On Sat, Aug 07, 2004 at 12:38:10PM +0200, Gaetano Mendola wrote:
> Alvaro Herrera wrote:

> >Subversion and arch have been mentioned, but so far there is no
> >compelling reason to change.  It'd take convincing at least a couple
> >of core hackers to get the ball rolling ...
> 
> Well, having seen what's happening with the 8.0 release, I think
> that committers are too overloaded and someone else has to be
> "promoted" to committer, and I believe that having better tools
> can improve the process too.

I don't think it was a problem of committers.  To me it was a problem of
reviewers.  Those are very scarce (for the bigger items it's mostly only
Tom).  Maybe a better SCM could help with this, but I doubt it.  As an
example, I did read the autovacuum patch, but I had no useful comment to
make on it.  Why didn't Jan or Bruce say something about it?  What about
Neil or Joe Conway?  If any of them could have had useful feedback, they
didn't have the time to do it.

The Linux model, heavily changed by the BitKeeper phenomenon (and I
think Andrew Morton plays a big role there too), does not really apply
here because the difference in manpower is huge.  They currently have a
very successful model, very different from what it was two years ago, but
they do have several very capable maintainers/reviewers.

Maybe our development process does need a revision.  Neil Conway seemed
to agree some time ago, but the rest of the people don't seem to have
an opinion on the subject, as far as I remember.

But it's not a problem of committers.  Oleg Bartunov is a committer too,
as are Dennis Bjorklund, Peter Eisentraut, Marc Fournier and others.  But
apparently these are issues that they are not able to help with.

-- 
Alvaro Herrera ()
"Some men are heterosexual, and some are bisexual, and some
men don't think about sex at all... they become lawyers" (Woody Allen)




Re: [HACKERS] Proposal for Disable Triggers

2004-08-07 Thread Alvaro Herrera
On Fri, Aug 06, 2004 at 03:14:13PM +1000, fastpgs wrote:

> And finally about the scope of the change of status of a trigger.
> Should this be local to the session or should be reflected globally?
> My humble opinion is it should be reflected globally(again, as in
> oracle ?)

If the change is global, what should happen on other sessions that have
a deferred event from that trigger concurrently with the one that
modifies it?  Should the answer be different depending on the isolation
mode of the transaction?

Also, should the change be permanent, or should it be undone when the
modifying backend exits (or the transaction ends)?

I don't think it makes a lot of sense to be changing triggers globally.
Usually you want to change it only to do a certain operation, without
worrying about concurrent transactions.  Following that rationale, the
command should not be ALTER, because that's used for permanent changes.
Also, make sure that when a backend crashes, the final state should be
the same as when the backend exits normally.

I'm not sure the Oracle behavior is the one we want to imitate here ...

-- 
Alvaro Herrera ()
Jude: I wish humans laid eggs
Ringlord: Why would you want humans to lay eggs?
Jude: So I can eat them




Re: [HACKERS] Regarding redo/undo files.

2004-08-07 Thread Alvaro Herrera
On Fri, Jul 30, 2004 at 05:53:02AM +0100, Mamta Singh wrote:

> I went thru the site
> http://www.postgresql.org/docs/7.4/static/wal-benefits-later.html

I see that most of this document is obsolete.  It mentions:

1. with UNDO it will be possible to remove pg_clog.
   (already possible, another mechanism)
2. with UNDO it will be possible to implement savepoints.
   (already possible, another mechanism)
3. with WAL, PITR is possible.
   (already done)
4. we need a compressed WAL format.

Why not rip that section out completely?

> Undo operation is not implemented. and information of
> the status regarding the transaction is stored in the
> permanent file pg_clog. But I have not been able to
> see the format of the file, as this file is in binary
> format. Same for the redo file.

pg_clog is documented in slru.c, clog.c and the README file in current
CVS tip src/backend/access/transam.  It's an array of "doublebits."
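
As a rough illustration of what an array of two-bit entries looks like, here
is a minimal C sketch. It assumes a flat in-memory byte array and made-up
status codes; the names (xact_get_status, xact_set_status, the XACT_*
constants) are hypothetical, and the real files are managed page by page
through slru.c, so treat this only as a picture of the bit packing.

```c
#include <stdint.h>

/* Illustrative constants: two bits per transaction, four per byte. */
#define BITS_PER_XACT   2
#define XACTS_PER_BYTE  4
#define XACT_MASK       0x03

/* Hypothetical status codes, chosen for this sketch only. */
#define XACT_IN_PROGRESS 0
#define XACT_COMMITTED   1
#define XACT_ABORTED     2

/* Read the two-bit status of transaction xid from a flat byte array. */
unsigned
xact_get_status(const unsigned char *clog, uint32_t xid)
{
    unsigned shift = (xid % XACTS_PER_BYTE) * BITS_PER_XACT;

    return (clog[xid / XACTS_PER_BYTE] >> shift) & XACT_MASK;
}

/* Overwrite the two-bit status of transaction xid, leaving neighbours alone. */
void
xact_set_status(unsigned char *clog, uint32_t xid, unsigned status)
{
    unsigned shift = (xid % XACTS_PER_BYTE) * BITS_PER_XACT;
    unsigned char *byte = &clog[xid / XACTS_PER_BYTE];

    *byte = (unsigned char) ((*byte & ~(XACT_MASK << shift)) |
                             ((status & XACT_MASK) << shift));
}
```

With four transactions per byte, transaction 5 lives in byte 1 at bit offset 2,
which is why a hex dump of the file is hard to read without the source.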

REDO files are the WAL archives.  The format is also binary.  See xlog.c
for a start.

> Could you please send me the format of the redo and
> pg_clog file. Also, give me some idea as to what do we
> need to change to implement undo-file. 

I'm pretty sure there's people that don't want UNDO to be implemented.
Don't waste your time ...

-- 
Alvaro Herrera ()
"A wizard is never late, Frodo Baggins, nor is he early.
He arrives precisely when he means to."  (Gandalf, en LoTR FoTR)




Re: [HACKERS] parameter hints to the optimizer

2004-08-07 Thread Oliver Jowett
Bruce Momjian wrote:
> Oliver Jowett wrote:
>> Merlin Moncure wrote:
>>
>>> Another way to deal with the problem is to defer plan generation until
>>> the first plan execution and use the parameters from that execution.
>>
>> When talking the V3 protocol, 7.5 defers plan generation for the unnamed
>> statement until parameters are received in the Bind message (which is
>> essentially the same as what you describe). There was some discussion at
>> the time about making it more flexible so you could apply it to arbitrary
>> statements, but that needed a protocol change so it didn't happen.
>
> What do you mean about arbitrary statements?  Non-prepared ones, or
> non-unnamed ones?

Non-unnamed ones. Adding a flag on the Parse message that says when to 
plan the statement (or maybe on each Bind message even).

-O


Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, August 08, 2004 2:43 AM
> To: Dennis Bjorklund
> Cc: Tatsuo Ishii; John Hansen; [EMAIL PROTECTED]; 
> [EMAIL PROTECTED]
> Subject: Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000
> 
> Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> > On Sat, 7 Aug 2004, Tatsuo Ishii wrote:
> >> Anyway my point is if current specification of Unicode only allows 
> >> 24-bit range, why we need to allow usage against the specification?
> 
> > Is there a specific reason you want to restrict it to 24 bits?
> 
> I see several places that have to allocate space on the basis 
> of the maximum encoded character length possible in the 
> current encoding (look for uses of 
> pg_database_encoding_max_length).  Probably the only one 
> that's really significant for performance is text_substr(), 
> but that's enough to be an argument against setting maxmblen 
> higher than we have to.
> 
> It looks to me like supporting 4-byte UTF-8 characters would 
> be enough to handle the existing range of Unicode codepoints, 
> and that is probably as much as we want to do.
> 
> If I understood what I was reading, this would take several things:
> * Remove the "special UTF-8 check" in pg_verifymbstr;

I strongly disagree. This would mean one could store any sequence of
bytes in the db, as long as each byte has the high bit set (0x80 or
above). That would not be valid UTF-8 and would leave the data in an
inconsistent state. Setting the client encoding to UNICODE implies that
UTF-8 is what we're going to feed the database, and should guarantee
that what comes out of a select is valid UTF-8. We can make sure of that
by doing the check before the data is inserted.

> * Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;

pg_utf_mblen should handle every case in the specification.
Currently, it will return 3 even for 4-, 5-, and 6-byte sequences. In the
places where pg_utf_mblen is called, we should check that the length is
between 1 and 4 inclusive, and that the sequence is valid.
This is what I made the patch for.
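
The length rule described here can be sketched in C as follows. This is not
the patch and not the real pg_utf_mblen; utf8_seq_len and
utf8_len_is_acceptable are hypothetical names used only to show how the lead
byte determines the sequence length, and how a caller could then restrict the
accepted range to 1 through 4.

```c
/*
 * Hypothetical helper: return the sequence length implied by a UTF-8 lead
 * byte, or 0 if the byte cannot start a sequence (a continuation byte, or
 * 0xFE/0xFF, which never appear in UTF-8).
 */
int
utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    if ((lead & 0xFC) == 0xF8)
        return 5;               /* 111110xx: legacy 5-byte form */
    if ((lead & 0xFE) == 0xFC)
        return 6;               /* 1111110x: legacy 6-byte form */
    return 0;
}

/* Caller-side policy from the message above: accept only lengths 1..4. */
int
utf8_len_is_acceptable(unsigned char lead)
{
    int len = utf8_seq_len(lead);

    return len >= 1 && len <= 4;
}
```

Reporting the true length for the 5- and 6-byte forms, and rejecting them in
the caller, keeps the scan position correct even when the input is refused.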

> * Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.

That I have no problem with.

> Are there any other places that would have to change?  Would 
> this break anything?  The testing aspect is what's bothering 
> me at the moment.
> 
>   regards, tom lane
> 
> 

Just my $0.02 worth,

Kind Regards,

John Hansen



Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
> -Original Message-
> From: Oliver Elphick [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, August 08, 2004 7:43 AM
> To: Tom Lane
> Cc: John Hansen; Hackers; Patches
> Subject: Re: [HACKERS] UNICODE characters above 0x10000
> 
> On Sat, 2004-08-07 at 07:10, Tom Lane wrote:
> > Oliver Elphick <[EMAIL PROTECTED]> writes:
> > > glibc provides various routines (mb...) for handling Unicode.  How
> > > many of our supported platforms don't have these?
> > 
> > Every one that doesn't use glibc.  Don't bother proposing a glibc-only
> > solution (and that's from someone who works for a glibc-only company;
> > you don't even want to think about the push-back you'll get from other
> > quarters).
> 
> No, that's not what I was proposing.  My suggestion was to 
> use these routines if they are sufficiently widely 
> implemented, and our own routines where standard ones are not 
> available.
> 
> The man page for mblen says
> "CONFORMING TO
>ISO/ANSI C, UNIX98"
> 
> Is glibc really the only C library to conform?
> 
> If using the mb... routines isn't feasible, IBM's ICU library
> (http://oss.software.ibm.com/icu/) is available under the X 
> licence, which is compatible with BSD as far as I can see.  
> Besides character conversion, ICU can also do collation in 
> various locales and encodings. 
> My point is, we shouldn't be writing a new set of routines to 
> do half a job if there are already libraries available to do 
> all of it.
> 

This sounds like a brilliant move, if anything.

> -- 
> Oliver Elphick  
> [EMAIL PROTECTED]
> Isle of Wight  
> http://www.lfix.co.uk/oliver
> GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E  1EC0 5664 7A2F 
> A543 10EA
>  
>  "Be still before the LORD and wait patiently for him;
>   do not fret when men succeed in their ways, when they
>   carry out their wicked schemes." 
> Psalms 37:7 
> 
> 
> 

Kind Regards,

John Hansen




Re: [HACKERS] parameter hints to the optimizer

2004-08-07 Thread Bruce Momjian
Oliver Jowett wrote:
> Bruce Momjian wrote:
> > Oliver Jowett wrote:
> > 
> >>Merlin Moncure wrote:
> >>
> >>
> >>>Another way to deal with the problem is to defer plan generation until
> >>>the first plan execution and use the parameters from that execution.
> >>
> >>When talking the V3 protocol, 7.5 defers plan generation for the unnamed 
> >>statement until parameters are received in the Bind message (which is 
> >>essentially the same as what you describe). There was some discussion at 
> >>the time about making it more flexible so you could apply it to arbitrary 
> >>statements, but that needed a protocol change so it didn't happen.
> > 
> > 
> > What do you mean about arbitrary statements?  Non-prepared ones, or
> > non-unnamed ones?
> 
> Non-unnamed ones. Adding a flag on the Parse message that says when to 
> plan the statement (or maybe on each Bind message even).

OK, what are unnamed prepared statements?  When are they used currently?
Only via the wire protocol?  Who uses them now?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Oliver Jowett
Tom Lane wrote:
> If I understood what I was reading, this would take several things:
> * Remove the "special UTF-8 check" in pg_verifymbstr;
> * Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
> * Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
>
> Are there any other places that would have to change?  Would this break
> anything?  The testing aspect is what's bothering me at the moment.

Does this change what client_encoding = UNICODE might produce? The JDBC 
driver will need some tweaking to handle this -- Java uses UTF-16 
internally and I think some supplementary character (?) scheme for 
values above 0x as of JDK 1.5.

-O


Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Tom Lane
Oliver Jowett <[EMAIL PROTECTED]> writes:
> Does this change what client_encoding = UNICODE might produce? The JDBC 
> driver will need some tweaking to handle this -- Java uses UTF-16 
> internally and I think some supplementary character (?) scheme for 
> values above 0x as of JDK 1.5.

You're not likely to get out anything you didn't put in, so I'm not sure
it matters.

regards, tom lane



Re: [HACKERS] parameter hints to the optimizer

2004-08-07 Thread Oliver Jowett
Bruce Momjian wrote:
> Oliver Jowett wrote:
>> Bruce Momjian wrote:
>>> Oliver Jowett wrote:
>>>> Merlin Moncure wrote:
>>>>
>>>>> Another way to deal with the problem is to defer plan generation until
>>>>> the first plan execution and use the parameters from that execution.
>>>>
>>>> When talking the V3 protocol, 7.5 defers plan generation for the unnamed
>>>> statement until parameters are received in the Bind message (which is
>>>> essentially the same as what you describe). There was some discussion at
>>>> the time about making it more flexible so you could apply it to arbitrary
>>>> statements, but that needed a protocol change so it didn't happen.
>>>
>>> What do you mean about arbitrary statements?  Non-prepared ones, or
>>> non-unnamed ones?
>>
>> Non-unnamed ones. Adding a flag on the Parse message that says when to
>> plan the statement (or maybe on each Bind message even).
>
> OK, what are unnamed prepared statements?  When are they used currently?
> Only via the wire protocol?  Who uses them now?

The unnamed prepared statement is like any other prepared statement 
except it doesn't have a name :)  It can be accessed via:

1) V3 protocol Parse/Bind with an empty statement name uses the unnamed 
statement.
2) V2 or V3 "simple query" implicitly closes the unnamed statement.

CVS HEAD defers planning in case (1) until the Bind is received so it 
can do planning cost estimation using concrete parameter values and 
produce a better plan. It only does this for the unnamed statement, not 
for named statements. If you Parse into a named statement, planning 
happens immediately when the Parse is done.

This behaviour gives the client some flexibility without changing the 
protocol. It means that using Parse/Bind on the unnamed statement with 
parameters is essentially equivalent planning-wise to substituting the 
parameter values into the actual query and submitting that instead.

What we talked about briefly was providing some way to control when 
planning was done on a per-statement basis -- so you could say "don't 
defer planning for this unnamed query because I'm going to reuse the 
unnamed statement multiple times and the first set of parameters might 
not generate an efficient plan" or "do defer planning of this named 
query because I know I will be executing it with many similar parameter 
values and estimating using the first set of parameters gives a good plan".

Or an alternative is to have a way to control query replanning on each 
Bind individually -- so a client can get the benefit of skipping the 
parse step on subsequent executions and is able to pass parameters via 
Bind, but the query is replanned for the concrete parameter values on 
each execution. The JDBC driver wants this -- currently the use of named 
statements has to be explicitly turned on as with the current behaviour 
you may take a performance hit due to less-than-ideal plans as soon as 
you start using named statements.

So maybe the TODO should be something like "allow finer-grained client 
control of query estimation and (re-)planning when using Parse/Bind".

-O


Re: Postgres development model (was Re: [HACKERS] CVS comment)

2004-08-07 Thread Joe Conway
Alvaro Herrera wrote:
> I don't think it was a problem of committers.  To me it was a problem of
> reviewers.  Those are very scarce (for the bigger items it's mostly only
> Tom).  Maybe a better SCM could help with this, but I doubt it.  As an
> example, I did read the autovacuum patch, but I had no useful comment to
> make on it.  Why didn't Jan or Bruce say something about it?  What about
> Neil or Joe Conway?  If any of them could have had useful feedback, they
> didn't have the time to do it.

Unfortunately due to other commitments, personal and professional, I 
haven't had time to do much this development cycle :-(. I'm just now 
able to start getting a bit more active again.

But in any case I think your second sentence above gets to the heart of 
the issue. Postgres is a complex piece of code, and even though I have 
commit access, I don't understand many parts of it well enough to do a 
credible job reviewing others' code (at least not without days of effort 
just trying to understand what it is that I'm reviewing). Where I have 
both the time and some knowledge I do try to help, e.g. with plperl.

Joe


Re: [HACKERS] Regarding redo/undo files.

2004-08-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> On Fri, Jul 30, 2004 at 05:53:02AM +0100, Mamta Singh wrote:
>> I went thru the site
>> http://www.postgresql.org/docs/7.4/static/wal-benefits-later.html

> I see that most of this document is obsolete.

Yeah, I was planning to remove that section or at least edit it
severely...

> I'm pretty sure there's people that don't want UNDO to be implemented.

I think the general consensus is that it doesn't offer us anything we
need, and the downsides are significant.  (Oracle DBAs are well aware
of the hazards of having to rely on the transaction log for UNDO ---
it does not play well at all with long-running transactions.)

regards, tom lane



[HACKERS] beta time

2004-08-07 Thread Bruce Momjian
I have two things left before beta. I want to make sure the release
notes are current against CVS and I want to make sure the win32
tablespace symlink changes I just made work.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Tatsuo Ishii
> Tom Lane wrote:
> 
> > If I understood what I was reading, this would take several things:
> > * Remove the "special UTF-8 check" in pg_verifymbstr;
> > * Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
> > * Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
> > 
> > Are there any other places that would have to change?  Would this break
> > anything?  The testing aspect is what's bothering me at the moment.
> 
> Does this change what client_encoding = UNICODE might produce? The JDBC 
> driver will need some tweaking to handle this -- Java uses UTF-16 
> internally and I think some supplementary character (?) scheme for 
> values above 0x as of JDK 1.5.

Java doesn't handle UCS above 0x? I didn't know that. As long as
you put data in and read it out via JDBC, it shouldn't be a problem.
However, if other APIs put in such data, you will get into trouble...
--
Tatsuo Ishii



Re: [HACKERS] beta time

2004-08-07 Thread Bruce Momjian
Bruce Momjian wrote:
> I have two things left before beta. I want to make sure the release
> notes are current against CVS and I want to make sure the win32
> tablespace symlink changes I just made work.
> 

Tom, when you updated the release notes, did you do a CVS log and
already get all the new stuff as of Aug 6?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



[HACKERS] log file rotate

2004-08-07 Thread Bruce Momjian
Tom, you didn't like Andreas' idea of allowing the user to rotate the
log files on demand.  Isn't that standard functionality for any logging
program in case you want to manually start a new log file?  Is there no
way to do this simply?  Is this a TODO?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Tom Lane
"John Hansen" <[EMAIL PROTECTED]> writes:
> Ahh, but that's not the case. You cannot just delete the check, since
> not all combinations of bytes are valid UTF8. UTF bytes FE & FF never
> appear in a byte sequence for instance.

Well, this is still working at the wrong level.  The code that's in
pg_verifymbstr is mainly intended to enforce the *system wide*
assumption that multibyte characters must have the high bit set in
every byte.  (We do not support encodings without this property in
the backend, because it breaks code that looks for ASCII characters
... such as the main parser/lexer ...)  It's not really intended to
check that the multibyte character is actually legal in its encoding.

The "special UTF-8 check" was never more than a very quick-n-dirty hack
that was in the wrong place to start with.  We ought to be getting rid
of it not institutionalizing it.  If you want an exact encoding-specific
check on the legitimacy of a multibyte sequence, I think the right way
to do it is to add another function pointer to pg_wchar_table entries to
let each encoding have its own check routine.  Perhaps this could be
defined so as to avoid a separate call to pg_mblen inside the loop, and
thereby not add any new overhead.  I'm thinking about an API something
like

int validate_mbchar(const unsigned char *str, int len)

with result +N if a valid character N bytes long is present at
*str, and -N if an invalid character is present at *str and
it would be appropriate to display N bytes in the complaint.
(N must be <= len in either case.)  This would reduce the main
loop of pg_verifymbstr to a call of this function and an
error-case-handling block.
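
A minimal C sketch of that contract for UTF-8 might look like the following.
The names validate_mbchar_utf8 and verify_utf8_string are hypothetical, and
the check is deliberately simplified: it looks only at the lead byte and the
continuation bytes, and does not reject overlong forms, surrogates, or values
beyond the assigned range.

```c
#include <assert.h>

/*
 * Hypothetical UTF-8 check following the proposed contract: return +N if a
 * well-formed N-byte sequence starts at str, or -N if the sequence is bad
 * and N bytes should be shown in the complaint (N <= len in both cases).
 */
int
validate_mbchar_utf8(const unsigned char *str, int len)
{
    int n;
    int i;

    assert(len > 0);            /* caller must pass at least one byte */

    if (str[0] < 0x80)
        return 1;               /* ASCII */
    else if ((str[0] & 0xE0) == 0xC0)
        n = 2;
    else if ((str[0] & 0xF0) == 0xE0)
        n = 3;
    else if ((str[0] & 0xF8) == 0xF0)
        n = 4;
    else
        return -1;              /* bad lead byte: report just that byte */

    if (n > len)
        return -len;            /* truncated: report what is there */

    for (i = 1; i < n; i++)
    {
        if ((str[i] & 0xC0) != 0x80)
            return -n;          /* continuation byte missing its 10xxxxxx tag */
    }
    return n;
}

/* The main loop then reduces to one call plus an error branch. */
int
verify_utf8_string(const unsigned char *str, int len)
{
    while (len > 0)
    {
        int n = validate_mbchar_utf8(str, len);

        if (n < 0)
            return 0;           /* a real caller would report -n bytes here */
        str += n;
        len -= n;
    }
    return 1;
}
```

Because the validator returns the length itself, the loop needs no separate
length lookup, which is the overhead saving mentioned above.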

regards, tom lane



Re: [PATCHES] [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread Oliver Jowett
Tatsuo Ishii wrote:
>> Tom Lane wrote:
>>
>>> If I understood what I was reading, this would take several things:
>>> * Remove the "special UTF-8 check" in pg_verifymbstr;
>>> * Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
>>> * Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
>>>
>>> Are there any other places that would have to change?  Would this break
>>> anything?  The testing aspect is what's bothering me at the moment.
>>
>> Does this change what client_encoding = UNICODE might produce? The JDBC
>> driver will need some tweaking to handle this -- Java uses UTF-16
>> internally and I think some supplementary character (?) scheme for
>> values above 0x as of JDK 1.5.
>
> Java doesn't handle UCS above 0x? I didn't know that. As long as
> you put data in and read it out via JDBC, it shouldn't be a problem.
> However, if other APIs put in such data, you will get into trouble...

Internally, Java strings are arrays of UTF-16 values. Before JDK 1.5, 
all the string-manipulation library routines assumed that one code point 
== one UTF-16 value, so you can't represent values above 0x. The 1.5 
libraries understand supplementary characters, which use multiple 
UTF-16 values per code point. See 
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/

However, the JDBC driver needs to be taught about how to translate 
between UTF-8 representations of code points above 0x and pairs of 
UTF-16 values. Previously it didn't need to do anything since the server 
didn't use those high values. It's a minor thing..
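
For reference, the translation amounts to the arithmetic below. This is a C
sketch of the encoding rules only, not driver code; utf8_decode4 and
utf16_encode are hypothetical names, and utf8_decode4 assumes its input has
already been checked to be a well-formed 4-byte sequence.

```c
#include <stdint.h>

/* Decode an already validated 4-byte UTF-8 sequence into a code point. */
uint32_t
utf8_decode4(const unsigned char *s)
{
    return ((uint32_t) (s[0] & 0x07) << 18) |
           ((uint32_t) (s[1] & 0x3F) << 12) |
           ((uint32_t) (s[2] & 0x3F) << 6) |
            (uint32_t) (s[3] & 0x3F);
}

/*
 * Encode a code point as UTF-16. Returns the number of 16-bit units written
 * (1 or 2), or 0 if the value is a lone surrogate or is above U+10FFFF.
 */
int
utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogate range is not a valid code point */
    if (cp < 0x10000)
    {
        out[0] = (uint16_t) cp; /* fits in a single UTF-16 unit */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;               /* outside the range UTF-16 can express */

    cp -= 0x10000;              /* 20 bits remain, split 10/10 */
    out[0] = (uint16_t) (0xD800 | (cp >> 10));      /* high surrogate */
    out[1] = (uint16_t) (0xDC00 | (cp & 0x3FF));    /* low surrogate */
    return 2;
}
```

For example, the bytes F0 9F 98 80 decode to U+1F600, which becomes the pair
D83D DE00.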

-O


Re: [HACKERS] beta time

2004-08-07 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
>> I have two things left before beta. I want to make sure the release
>> notes are current against CVS and I want to make sure the win32
>> tablespace symlink changes I just made work.

> Tom, when you updated the release notes, did you do a CVS log and
> already get all the new stuff as of Aug 6?

Yes I did.  I think the release notes are good to go for beta,
with the possible exception of mentioning any array-input-parsing
hacking that Joe might be about to commit.

I think though that we might have some other must-fix Win32 issues :-(.
What are we going to do about this libpgport-depends-on-the-backend
business?

regards, tom lane



Re: [HACKERS] beta time

2004-08-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> >> I have two things left before beta. I want to make sure the release
> >> notes are current against CVS and I want to make sure the win32
> >> tablespace symlink changes I just made work.
> 
> > Tom, when you updated the release notes, did you do a CVS log and
> > already get all the new stuff as of Aug 6?
> 
> Yes I did.  I think the release notes are good to go for beta,
> with the possible exception of mentioning any array-input-parsing
> hacking that Joe might be about to commit.

OK.  Thanks for doing that.

> I think though that we might have some other must-fix Win32 issues :-(.
> What are we going to do about this libpgport-depends-on-the-backend
> business?

I have Claudio on IM right now and am getting the details.  I wasn't
aware it was such a problem but I think it was introduced by rmtree()
and the malloc call.  

What I would like to do is to move elog/fprintf out of /port and make a
generic pglog call and have a backend function that calls elog and a
libpq version that calls fprintf.  This would remove a lot of FRONTEND
and Makefile compiles and will probably avoid some bugs.  The only
tricky part is passing a variable number of arguments. Same behavior for
malloc/palloc.
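
The variable-argument part could be handled with va_list, roughly as in the C
sketch below. Everything here is hypothetical (pglog, pglog_set_sink, the
fixed 1024-byte buffer): the idea is only that the shared code formats the
message once and hands it to a sink, which a frontend would point at fprintf
and the backend would point at a function that calls elog.

```c
#include <stdarg.h>
#include <stdio.h>

/* A sink receives the fully formatted message. */
typedef void (*pglog_sink_fn) (const char *message);

/* Default sink for frontend programs: write to stderr. */
static void
pglog_stderr_sink(const char *message)
{
    fprintf(stderr, "%s\n", message);
}

static pglog_sink_fn pglog_sink = pglog_stderr_sink;

/* The backend would install a sink that forwards to its own error reporting. */
void
pglog_set_sink(pglog_sink_fn sink)
{
    pglog_sink = sink;
}

/*
 * Format the message and pass it to the current sink. Returns the length the
 * full message would have had, or a negative value on a formatting error.
 * Messages longer than the buffer are truncated in this sketch.
 */
int
pglog(const char *fmt, ...)
{
    char    buf[1024];
    va_list ap;
    int     n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);

    if (n < 0)
        return n;
    pglog_sink(buf);
    return n;
}
```

Formatting before dispatch means the sink never sees a va_list, so the
port code itself needs no FRONTEND conditional.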

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] beta time

2004-08-07 Thread Joe Conway
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
>> Tom, when you updated the release notes, did you do a CVS log and
>> already get all the new stuff as of Aug 6?
>
> Yes I did.  I think the release notes are good to go for beta,
> with the possible exception of mentioning any array-input-parsing
> hacking that Joe might be about to commit.

I was waiting on feedback on two issues before committing:

1. '{{"1 2" x},{3}}'
2. '{{},{}}'

My patch would generate an ERROR for either. Tom, you questioned my 
disallowing of both of these, but didn't seem to have a very strong 
opinion. No one else has chimed in at all (including no replies to my 
post on GENERAL earlier). Can I go with what I have, at the risk of 
ripping it out if others complain?

If so, I'll commit what I posted last night, and amend the docs. I will 
also update the release notes.

BTW, when is the planned cutoff for commits to get into the beta?

Thanks,

Joe


Re: [HACKERS] beta time

2004-08-07 Thread Bruce Momjian
Joe Conway wrote:
> Tom Lane wrote:
> > Bruce Momjian <[EMAIL PROTECTED]> writes:
> >>Tom, when you updated the release notes, did you do a CVS log and
> >>already get all the new stuff as of Aug 6?
> > 
> > Yes I did.  I think the release notes are good to go for beta,
> > with the possible exception of mentioning any array-input-parsing
> > hacking that Joe might be about to commit.
> 
> I was waiting on feedback on two issues before committing:
> 
> 1. '{{"1 2" x},{3}}'
> 2. '{{},{}}'
> 
> My patch would generate an ERROR for either. Tom, you questioned my 
> disallowing of both of these, but didn't seem to have a very strong 
> opinion. No one else has chimed in at all (including no replies to my 
> post on GENERAL earlier). Can I go with what I have, at the risk of 
> ripping it out if others complain?
> 
> If so, I'll commit what I posted last night, and amend the docs. I will 
> also update the release notes.

OK, it has to go in the incompatibilities section, right?

> BTW, when is the planned cutoff for commits to get into the beta?

When everyone is done sometime tomorrow.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] UNICODE characters above 0x10000

2004-08-07 Thread John Hansen
> Well, this is still working at the wrong level.  The code 
> that's in pg_verifymbstr is mainly intended to enforce the 
> *system wide* assumption that multibyte characters must have 
> the high bit set in every byte.  (We do not support encodings 
> without this property in the backend, because it breaks code 
> that looks for ASCII characters ... such as the main 
> parser/lexer ...)  It's not really intended to check that the 
> multibyte character is actually legal in its encoding.
> 

Ok, point taken.

> The "special UTF-8 check" was never more than a very 
> quick-n-dirty hack that was in the wrong place to start with. 
>  We ought to be getting rid of it not institutionalizing it.  
> If you want an exact encoding-specific check on the 
> legitimacy of a multibyte sequence, I think the right way to 
> do it is to add another function pointer to pg_wchar_table 
> entries to let each encoding have its own check routine.  
> Perhaps this could be defined so as to avoid a separate call 
> to pg_mblen inside the loop, and thereby not add any new 
> overhead.  I'm thinking about an API something like
> 
>   int validate_mbchar(const unsigned char *str, int len)
> 
> with result +N if a valid character N bytes long is present 
> at *str, and -N if an invalid character is present at *str 
> and it would be appropriate to display N bytes in the complaint.
> (N must be <= len in either case.)  This would reduce the 
> main loop of pg_verifymbstr to a call of this function and an 
> error-case-handling block.
> 

Sounds like a plan...
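
For illustration only (not part of the original message): a minimal sketch
of the contract described in the quoted proposal, assuming a UTF-8 checker.
The names utf8_validate_mbchar and verify_mbstr are hypothetical, and the
check is deliberately structural: it accepts sequences of up to four bytes
and does not reject overlong forms or surrogates.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical UTF-8 instance of the proposed per-encoding check.
 * Returns +N if a structurally valid N-byte character starts at str,
 * or -N if the character is invalid and N bytes should be reported.
 * N never exceeds len.  The caller must pass len >= 1.
 */
int
utf8_validate_mbchar(const unsigned char *str, int len)
{
    int need;
    int i;

    assert(len >= 1);

    if (str[0] < 0x80)
        return 1;               /* plain ASCII */
    else if ((str[0] & 0xE0) == 0xC0)
        need = 2;
    else if ((str[0] & 0xF0) == 0xE0)
        need = 3;
    else if ((str[0] & 0xF8) == 0xF0)
        need = 4;               /* code points above 0xFFFF */
    else
        return -1;              /* stray continuation or bad lead byte */

    if (need > len)
        return -len;            /* truncated sequence: report what is there */

    /* every continuation byte must look like 10xxxxxx */
    for (i = 1; i < need; i++)
    {
        if ((str[i] & 0xC0) != 0x80)
            return -need;
    }
    return need;
}

/*
 * The main loop reduced to one call per character plus an error branch.
 * Returns 1 if the whole buffer passes, 0 otherwise.
 */
int
verify_mbstr(const unsigned char *str, int len)
{
    while (len > 0)
    {
        int n = utf8_validate_mbchar(str, len);

        if (n < 0)
        {
            fprintf(stderr, "invalid byte sequence (%d byte(s))\n", -n);
            return 0;
        }
        str += n;
        len -= n;
    }
    return 1;
}
```

With this shape the caller needs no separate length lookup: the sign of the
result carries validity and its magnitude carries the byte count.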

>   regards, tom lane
> 
> 

Regards,

John Hansen



Re: [HACKERS] beta time

2004-08-07 Thread Tom Lane
Joe Conway <[EMAIL PROTECTED]> writes:
> I was waiting on feedback on two issues before committing:

> 1. '{{"1 2" x},{3}}'
> 2. '{{},{}}'

> My patch would generate an ERROR for either. Tom, you questioned my 
> disallowing of both of these, but didn't seem to have a very strong 
> opinion.

I don't have any great love for the first item --- I think it was an
unintended consequence of the way the code was first written, rather
than something the author meant to support.

I'm more concerned about what the second should mean, but I do have to
concede that we'd likely want to appropriate this syntax to mean NULL
array entries as soon as we support NULL array entries.  So rejecting it
at the moment may be a forward-looking thing to do.

> BTW, when is the planned cutoff for commits to get into the beta?

The plan was to wrap beta1 sometime tomorrow ... I'd guess that
"sometime" will end up being in the afternoon east coast time, but
this largely depends on the libpgport breakage ...

regards, tom lane



Re: [HACKERS] log file rotate

2004-08-07 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom, you didn't like Andreas' idea of allowing the user to rotate the
> log files on demand.

Give me a use case that requires that, and is sufficiently interesting
to justify even a marginal decrease in the reliability of the log
process.

Frankly, I do not believe that database users should have anything to do
with the log rotation process.  Do we have a TODO for allowing users to
force switching to a new WAL file segment?

> Is this a TODO?

IMHO, no.

regards, tom lane



Re: [HACKERS] log file rotate

2004-08-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom, you didn't like Andreas' idea of allowing the user to rotate the
> > log files on demand.
> 
> Give me a use case that requires that, and is sufficiently interesting
> to justify even a marginal decrease in the reliability of the log
> process.
> 
> Frankly, I do not believe that database users should have anything to do
> with the log rotation process.  Do we have a TODO for allowing users to
> force switching to a new WAL file segment?

I thought rotatelogs supported it so we should in cases where someone
wanted to make a new log file to delete an unusually large one, like a 1
gig log file caused by some runaway process.  However, I see rotatelogs
doesn't have that capability so I guess we don't need it either.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] beta time

2004-08-07 Thread Marc G. Fournier
On Sat, 7 Aug 2004, Tom Lane wrote:
> The plan was to wrap beta1 sometime tomorrow ... I'd guess that
> "sometime" will end up being in the afternoon east coast time, but this
> largely depends on the libpgport breakage ...

That's what I was figuring (re: libpgport) ... hopefully I'm following the
right one, but am following the thread, and will hold off pending
resolution ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664


Re: [HACKERS] pg_dump and sequences (bug ?)

2004-08-07 Thread Christopher Kings-Lynne
> Using pg_dump from postgresql 7.3.4 I've obtained
> a dump file containing a SEQUENCE SET with no
> corresponding SEQUENCE. I've seen that this is usually
> due to the presence of a table with a 'serial' field,
> but since in this case there is no such table I wonder
> if this is a bug in pg_dump.

Perhaps.  Is there any way you can send me the compressed pg_dump -s
output of your database?  Is it sensitive info?  How certain are you
that there is no serial column in your database?

> The only reason I can imagine for this is pg_dump taking
> any sequence whose name  ends in _seq as being associated
> to a table, no matter if that table exists and has a 'serial'
> field. Is this possible ? Shouldn't this kind of dependency
> be coded somehow ?

It is coded somehow, and pg_dump in no way treats things that end in _seq
as being on tables.

My first suspicion is that you must be mistaken, but I would really like
to see the full pg_dump -s output of your database.

Chris


Re: [HACKERS] log file rotate

2004-08-07 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> I thought rotatelogs supported it so we should in cases where someone
> wanted to make a new log file to delete an unusually large one, like a 1
> gig log file caused by some runaway process.

Hm?  We have a rotate-on-size parameter, so that's not going to happen.

regards, tom lane



Re: [HACKERS] listing triggers

2004-08-07 Thread Christopher Kings-Lynne
> furthermore:
> \dt lists tables
> \ds lists sequences
> \d tablename lists that table.
> etc. etc.
> But how can I get a listing of all used triggers on a certain table?

\d 
Chris


Re: [HACKERS] pg_dump and sequences (bug ?)

2004-08-07 Thread Christopher Kings-Lynne
Also, given this and your previous operator commutator problem, I 
strongly suspect that someone has taken an axe to the system catalogs on 
your installation and they are very screwy.

Chris
strk wrote:
> Using pg_dump from postgresql 7.3.4 I've obtained
> a dump file containing a SEQUENCE SET with no
> corresponding SEQUENCE. I've seen that this is usually
> due to the presence of a table with a 'serial' field,
> but since in this case there is no such table I wonder
> if this is a bug in pg_dump.
>
> The only reason I can imagine for this is pg_dump taking
> any sequence whose name  ends in _seq as being associated
> to a table, no matter if that table exists and has a 'serial'
> field. Is this possible ? Shouldn't this kind of dependency
> be coded somehow ?
>
> TIA
> --strk;


Re: [HACKERS] log file rotate

2004-08-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > I thought rotatelogs supported it so we should in cases where someone
> > wanted to make a new log file to delete an unusually large one, like a 1
> > gig log file caused by some runaway process.
> 
> Hm?  We have a rotate-on-size parameter, so that's not going to happen.

Ah, OK.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] beta time

2004-08-07 Thread Joe Conway
Tom Lane wrote:
> Joe Conway <[EMAIL PROTECTED]> writes:
>> 1. '{{"1 2" x},{3}}'
>> 2. '{{},{}}'
>
>> My patch would generate an ERROR for either. Tom, you questioned my
>> disallowing of both of these, but didn't seem to have a very strong
>> opinion.
>
> I don't have any great love for the first item --- I think it was an
> unintended consequence of the way the code was first written, rather
> than something the author meant to support.
>
> I'm more concerned about what the second should mean, but I do have to
> concede that we'd likely want to appropriate this syntax to mean NULL
> array entries as soon as we support NULL array entries.  So rejecting it
> at the moment may be a forward-looking thing to do.

I committed the attached.

Regarding the release notes, I take it that we should modify 
doc/src/sgml/release.sgml? Does the following sound reasonable, or 
should I provide specific examples?

 
  
   Syntax checking of array input processing has been tightened up
   considerably. Junk that was previously allowed in odd places with
   odd results now causes an ERROR. Also changed behavior with respect
   to whitespace; trailing whitespace is now ignored as well as leading
   whitespace (which has always been ignored).
  
 
Joe
Index: doc/src/sgml/array.sgml
===
RCS file: /cvsroot/pgsql-server/doc/src/sgml/array.sgml,v
retrieving revision 1.36
diff -c -r1.36 array.sgml
*** doc/src/sgml/array.sgml	5 Aug 2004 03:29:11 -	1.36
--- doc/src/sgml/array.sgml	8 Aug 2004 04:49:48 -
***
*** 95,104 
  
 where delim is the delimiter character
 for the type, as recorded in its pg_type entry.
!(For all built-in types, this is the comma character
!,.)  Each
!val is either a constant of the array
!element type, or a subarray.  An example of an array constant is
  
  '{{1,2,3},{4,5,6},{7,8,9}}'
  
--- 95,106 
  
 where delim is the delimiter character
 for the type, as recorded in its pg_type entry.
!Among the standard data types provided in the
!PostgreSQL distribution, type
!box uses a semicolon (;) but all the others
!use comma (,). Each val is
!either a constant of the array element type, or a subarray. An example
!of an array constant is
  
  '{{1,2,3},{4,5,6},{7,8,9}}'
  
***
*** 161,167 
   
  
   
!   The ARRAY expression syntax may also be used:
  
  INSERT INTO sal_emp
  VALUES ('Bill',
--- 163,169 
   
  
   
!   The ARRAY constructor syntax may also be used:
  
  INSERT INTO sal_emp
  VALUES ('Bill',
***
*** 176,183 
Notice that the array elements are ordinary SQL constants or
expressions; for instance, string literals are single quoted, instead of
double quoted as they would be in an array literal.  The ARRAY
!   expression syntax is discussed in more detail in .
   
   
  
--- 178,185 
Notice that the array elements are ordinary SQL constants or
expressions; for instance, string literals are single quoted, instead of
double quoted as they would be in an array literal.  The ARRAY
!   constructor syntax is discussed in more detail in
!   .
   
   
  
***
*** 524,533 
 use comma.)  In a multidimensional array, each dimension (row, plane,
 cube, etc.) gets its own level of curly braces, and delimiters
 must be written between adjacent curly-braced entities of the same level.
!You may write whitespace before a left brace, after a right
!brace, or before any individual item string.  Whitespace after an item
!is not ignored, however: after skipping leading whitespace, everything
!up to the next right brace or delimiter is taken as the item value.

  

--- 526,542 
 use comma.)  In a multidimensional array, each dimension (row, plane,
 cube, etc.) gets its own level of curly braces, and delimiters
 must be written between adjacent curly-braced entities of the same level.
!   
! 
!   
!The array output routine will put double quotes around element values
!if they are empty strings or contain curly braces, delimiter characters,
!double quotes, backslashes, or white space.  Double quotes and backslashes
!embedded in element values will be backslash-escaped.  For numeric
!data types it is safe to assume that double quotes will never appear, but
!for textual data types one should be prepared to cope with either presence
!or absence of quotes.  (This is a change in behavior from pre-7.2
!PostgreSQL releases.)

  

***
*** 573,598 
  

 As shown previously, when writing an array value you may write double
!quotes around any individual array
!element.  You must do so if the element value would otherwise
!confuse the array-value parser.  For example, elements containing curly
!braces, commas (or whatever the delimiter character is), double quotes,
!backslashes, or leading white space must be 

Re: Postgres development model (was Re: [HACKERS] CVS comment)

2004-08-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I don't think it was a problem of committers.  To me it was a problem of
> reviewers.  Those are very scarce (for the bigger items it's mostly only
> Tom).

Yah.  We have plenty of people authorized to commit, and we add more
on a pretty regular basis.  (FWIW, Alvaro, you are high on the list
of people to appoint as new committers.)  The problem is finding
adequate review talent.  You don't have to be a committer to help review
patches --- feel free to look at anything that goes by, and if you see
a problem say so!  But the difficulty is that PG is a pretty large and
complex system, and it takes a good deal of familiarity with it to spot
some of the more esoteric problems.

Right at the moment we are a bit short of uber-hackers.  Vadim Mikheev
knew a lot about the code, but he's dropped out of sight and not been
replaced.  Tom Lockhart is sorely missed as well.  You, Manfred, and
Neil are up-and-coming but you each probably need another couple years
fooling with the code before you really have the full wizard's rating.

I don't have any magic solution to this.  I do say that people who know
the code get there by doing things with it --- at least that's how I got
there --- so I certainly encourage anyone with the time and interest
to pursue it.  When you see a bizarre bug report, find the cause and
fix it.  Or pick a project that you almost know how to do, but not
quite, and learn until you can do it.  Repeat as needed.

> Maybe a better SCM could help with this, but I doubt it.

I haven't seen any particular reason why we should adopt another SCM.
Perhaps BitKeeper or SubVersion would be better for our purposes than
CVS, but are they enough better to justify the switchover costs?
I doubt it.

regards, tom lane



Re: [HACKERS] beta time

2004-08-07 Thread Bruce Momjian
Joe Conway wrote:
> Tom Lane wrote:
> > Joe Conway <[EMAIL PROTECTED]> writes:
> >>1. '{{"1 2" x},{3}}'
> >>2. '{{},{}}'
> > 
> >>My patch would generate an ERROR for either. Tom, you questioned my 
> >>disallowing of both of these, but didn't seem to have a very strong 
> >>opinion.
> > 
> > I don't have any great love for the first item --- I think it was an
> > unintended consequence of the way the code was first written, rather
> > than something the author meant to support.
> > 
> > I'm more concerned about what the second should mean, but I do have to
> > concede that we'd likely want to appropriate this syntax to mean NULL
> > array entries as soon as we support NULL array entries.  So rejecting it
> > at the moment may be a forward-looking thing to do.
> 
> I committed the attached.
> 
> Regarding the release notes, I take it that we should modify 
> doc/src/sgml/release.sgml? Does the following sound reasonable, or 

Right.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] beta time

2004-08-07 Thread Tom Lane
Joe Conway <[EMAIL PROTECTED]> writes:
> I committed the attached.

Minor gripe: this bit of documentation seems out of date now.

!For example, elements containing curly braces, commas (or whatever the
!delimiter character is), double quotes, backslashes, or leading white
!space must be double-quoted.  To put a double quote or backslash in a

Should say "leading or trailing whitespace", no?
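
For illustration only (not part of the original message): a minimal sketch
of the quoting rule with the correction suggested above applied, so that
both leading and trailing whitespace force quoting. element_needs_quotes is
a hypothetical helper, not PostgreSQL's parser. Treating the empty string as
needing quotes is an assumption borrowed from the output-routine description
in the attached documentation patch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: reports whether an element value must be
 * double-quoted when written inside an array literal, given the
 * delimiter character of its type (comma for most types).
 */
bool
element_needs_quotes(const char *elem, char delim)
{
    size_t len = strlen(elem);
    size_t i;

    /* assumption: empty strings are quoted, as the output routine does */
    if (len == 0)
        return true;

    /* leading or trailing whitespace would otherwise be skipped on input */
    if (isspace((unsigned char) elem[0]) ||
        isspace((unsigned char) elem[len - 1]))
        return true;

    /* characters that would confuse the array-value parser */
    for (i = 0; i < len; i++)
    {
        char c = elem[i];

        if (c == '{' || c == '}' || c == '"' || c == '\\' || c == delim)
            return true;
    }
    return false;
}
```

Under this reading of the rule, interior whitespace alone (as in 1 2) does
not force quoting; only whitespace at either end of the element does.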

> Regarding the release notes, I take it that we should modify 
> doc/src/sgml/release.sgml?

Yup.

> Does the following sound reasonable, or 
> should I provide specific examples?

Seems roughly the right amount of detail to me.  Note you should make
entries under both "observe the following incompatibilities" and the
general list of datatype/function changes.

regards, tom lane



Re: [HACKERS] beta time

2004-08-07 Thread Alvaro Herrera
On Sat, Aug 07, 2004 at 10:09:22PM -0700, Joe Conway wrote:

>  
>   
>Syntax checking of array input processing has been tightened up
>considerably. Junk that was previously allowed in odd places with
>odd results now causes an ERROR. Also changed behavior with respect
>to whitespace; trailing whitespace is now ignored as well as leading
>whitespace (which has always been ignored).
>   
>  

Whitespace where?

-- 
Alvaro Herrera ()
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)




Re: [HACKERS] beta time

2004-08-07 Thread Joe Conway
Tom Lane wrote:
> Minor gripe: this bit of documentation seems out of date now.
>
> !For example, elements containing curly braces, commas (or whatever the
> !delimiter character is), double quotes, backslashes, or leading white
> !space must be double-quoted.  To put a double quote or backslash in a
>
> Should say "leading or trailing whitespace", no?

Good catch. Fixed.

> Seems roughly the right amount of detail to me.  Note you should make
> entries under both "observe the following incompatibilities" and the
> general list of datatype/function changes.

Yup, figured that out after I sent the last post. Done.

Thanks,

Joe


Re: [HACKERS] beta time

2004-08-07 Thread Joe Conway
Alvaro Herrera wrote:
> Whitespace where?

OK, clarified:
 
  Syntax checking of array input processing has been tightened up
  considerably. Junk that was previously allowed in odd places with
  odd results now causes an ERROR. Also changed behavior with respect
  to whitespace surrounding array elements; trailing whitespace is now
  ignored as well as leading whitespace (which has always been ignored).
 
Thanks,
Joe


[HACKERS] SRF/dropped column bug

2004-08-07 Thread Joe Conway
I see this behavior with CVS tip:

CREATE TABLE wibble (a integer, b integer);
INSERT INTO wibble VALUES (1,1);
ALTER TABLE wibble ADD COLUMN c BIGINT;
UPDATE wibble SET c = b;
ALTER TABLE wibble DROP COLUMN b;
ALTER TABLE wibble RENAME c TO b;
CREATE FUNCTION foobar() RETURNS SETOF wibble AS
'SELECT * FROM wibble' LANGUAGE SQL;

regression=# SELECT * FROM wibble;
 a | b
---+---
 1 | 1
(1 row)

regression=# select * from foobar();
 a | b
---+---
 1 |
(1 row)

The example comes from a complaint in January 2004, at which time it
would instead throw an ERROR:

ERROR: query-specified return row and actual function return row do not match

I'll start digging into this, but any hints on where to look would be 
greatly appreciated.

Thanks,
Joe


Re: [HACKERS] SRF/dropped column bug

2004-08-07 Thread Tom Lane
Joe Conway <[EMAIL PROTECTED]> writes:
> I'll start digging into this, but any hints on where to look would be 
> greatly appreciated.

I have a couple of similar issues on the radar.  The problem probably is
some bit of code that is not accounting for dropped columns in a row
datatype --- ie, the critical part of your example is the DROP COLUMN.

regards, tom lane
