Possibly, since I got it wrong once more
About to give up, but attached, Updated patch.
Regards,
John Hansen
-Original Message-
From: Oliver Elphick [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 07, 2004 3:56 PM
To: Tom Lane
Cc: John Hansen; Hackers; Patches
Subject: Re:
On Sat, 7 Aug 2004, Tom Lane wrote:
shy of a load --- for instance I see that pg_utf_mblen thinks there are
no UTF8 codes longer than 3 bytes whereas your code goes to 4. I'm not
an expert on this stuff, so I don't know what the UTF8 spec actually
says. But I do think you are fixing the
Ahh, but that's not the case. You cannot just delete the check, since
not all combinations of bytes are valid UTF8. UTF bytes FE FF never
appear in a byte sequence for instance.
UTF8 is more than two bytes btw, up to 6 bytes are used to represent an
UTF8 character.
The 5 and 6 byte characters are
On Sat, 7 Aug 2004, Tom Lane wrote:
question at hand is whether we can support 32-bit characters or not ---
and if not, what's the next bug to fix?
True, and that's hard to just give an answer to. One could do some simple
testing, make sure regexps work and then treat anything else that might
This should do it.
Regards,
John Hansen
-Original Message-
From: Dennis Bjorklund [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 07, 2004 5:02 PM
To: Tom Lane
Cc: John Hansen; Hackers; Patches
Subject: Re: [HACKERS] UNICODE characters above 0x1
On Sat, 7 Aug 2004, Tom Lane
$ pg_dump -p 5433 test
pg_dump: could not parse ACL list ([0:1]={postgres=UC/postgres,=UC/postgres}) for object
public (SCHEMA)
Ugh. This is an unforeseen side effect of Joe's recent changes to make
array_out emit dimension info.
I think the most reasonable answer is to tweak the ACL code so
Dennis Bjorklund [EMAIL PROTECTED] writes:
... This also means that the start byte can never start with 7 or 8
ones, that is illegal and should be tested for and rejected. So the
longest utf-8 sequence is 6 bytes (and the longest character needs 4
bytes (or 31 bits)).
Tatsuo would know
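
The lead-byte rule described in this message can be sketched roughly as follows. This is only an illustration of the original 1-to-6-byte UTF-8 scheme under discussion, not PostgreSQL's pg_utf_mblen; the function name utf8_sequence_length is a hypothetical helper invented for the example.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte.

    Hypothetical helper: follows the original 1-6 byte scheme and
    rejects lead bytes that begin with seven or eight one bits.
    """
    if not 0 <= lead <= 0xFF:
        raise ValueError("not a byte value")
    if lead < 0x80:      # 0xxxxxxx: plain ASCII, one byte
        return 1
    if lead < 0xC0:      # 10xxxxxx: continuation byte, cannot start a sequence
        raise ValueError("continuation byte used as lead byte")
    if lead < 0xE0:      # 110xxxxx
        return 2
    if lead < 0xF0:      # 1110xxxx
        return 3
    if lead < 0xF8:      # 11110xxx
        return 4
    if lead < 0xFC:      # 111110xx
        return 5
    if lead < 0xFE:      # 1111110x
        return 6
    # 0xFE and 0xFF start with seven or eight ones: never valid
    raise ValueError("0xFE/0xFF never appear in UTF-8")
```

For instance, utf8_sequence_length(0xF0) gives 4, while 0xFE and 0xFF raise ValueError.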
Yes, but the specification allows for 6byte sequences, or 32bit
characters.
As dennis pointed out, just because they're not used, doesn't mean we
should not allow them to be stored, since there might be someone using
the high ranges for a private character set, which could very well be
included in
Alvaro Herrera wrote:
On Sat, Aug 07, 2004 at 01:34:20AM +0200, Gaetano Mendola wrote:
Alvaro Herrera wrote:
Yeah. I included your tab-complete patch in the patch I sent to
pgsql-patches, which later Tom reworked and applied. His CVS comment
didn't mention the tab completion change. This isn't
Yes, but the specification allows for 6byte sequences, or 32bit
characters.
UTF-8 is just an encoding specification, not character set
specification. Unicode only has 17 256x256 planes in its
specification.
As dennis pointed out, just because they're not used, doesn't mean we
should not
Now it's entirely possible that the underlying support is a few bricks
shy of a load --- for instance I see that pg_utf_mblen thinks there are
no UTF8 codes longer than 3 bytes whereas your code goes to 4. I'm not
an expert on this stuff, so I don't know what the UTF8 spec actually
says. But I
4 actually,
10 needs four bytes:
0xxx 10xx 10xx 10xx
10 = 1010
Fill in the blanks, starting from the bottom, you get:
1010 1011 1011
Regards,
John Hansen
-Original Message-
From: Christopher Kings-Lynne [mailto:[EMAIL
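
The "fill in the blanks, starting from the bottom" step above might look like this in code. It is a minimal sketch that assumes the standard four-byte layout (one lead byte carrying 3 payload bits, then three continuation bytes carrying 6 bits each); encode_utf8_4byte is a made-up name, not something from the patch.

```python
def encode_utf8_4byte(cp: int) -> bytes:
    """Encode a code point that needs the four-byte UTF-8 form.

    Hypothetical helper: the payload bits are filled in from the
    lowest bits upward, six bits per continuation byte.
    """
    if not 0x10000 <= cp <= 0x1FFFFF:
        raise ValueError("code point does not use the four-byte form")
    b4 = 0x80 | (cp & 0x3F)           # lowest 6 bits
    b3 = 0x80 | ((cp >> 6) & 0x3F)    # next 6 bits
    b2 = 0x80 | ((cp >> 12) & 0x3F)   # next 6 bits
    b1 = 0xF0 | ((cp >> 18) & 0x07)   # top 3 bits go in the lead byte
    return bytes([b1, b2, b3, b4])
```

For code points Python itself accepts, the result should match chr(cp).encode("utf-8").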
On Sat, 7 Aug 2004, John Hansen wrote:
should not allow them to be stored, since there might be someone using
the high ranges for a private character set, which could very well be
included in the specification some day.
There are areas reserved for private character sets.
--
/Dennis
Well, maybe we'd be better off compiling a list of (in?)valid ranges
from the full unicode database
(http://www.unicode.org/Public/UNIDATA/UnicodeData.txt and
http://www.unicode.org/Public/UNIDATA/Unihan.txt)
and with every release of pg, update the detection logic so only valid
characters are
Bruce Momjian wrote:
Jan Wieck wrote:
On 8/6/2004 9:04 PM, Bruce Momjian wrote:
Updated. Thanks.
I thought we want to have the feature activated ... I reversed your
change and brought guc.c in sync instead.
Uh, if the guy is doing a vacuum at night, does he want the delay?
Seems someone
On Sat, 7 Aug 2004, Tatsuo Ishii wrote:
More seriously, Unicode is filled with tons of confusion and
inconsistency IMO. Remember that once Unicode advocates said that the
merit of Unicode was it only requires 16-bit width. Now they say they
need surrogate pairs and 32-bit width chars...
Yea, I know
10 - 10 : 2 separate planes iirc
... John
-Original Message-
From: Dennis Bjorklund [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 07, 2004 9:06 PM
To: John Hansen
Cc: Tatsuo Ishii; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE:
Using pg_dump from postgresql 7.3.4 I've obtained
a dump file containing a SEQUENCE SET with no
corresponding SEQUENCE. I've seen that this is usually
due to the presence of a table with a 'serial' field,
but since in this case there is no such table I wonder
if this is a bug in pg_dump.
The only
On Sat, 7 Aug 2004, Takehiko Abe wrote:
It looked like you sent the last mail only to me and not the list. I
assume it was a mistake and I send the reply to both.
Is there a specific reason you want to restrict it to 24 bits?
ISO 10646 is said to have removed its private use codepoints
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Dennis Bjorklund
Sent: Saturday, August 07, 2004 10:48 PM
To: Takehiko Abe
Cc: [EMAIL PROTECTED]
Subject: Re: [PATCHES] [HACKERS] UNICODE characters above 0x1
On Sat, 7 Aug 2004, Takehiko Abe
On Sat, 7 Aug 2004, John Hansen wrote:
Now, is it really 24 bits tho?
Afaict, it's really 21 (0 - 10 or 0 - xxx1 )
Yes, up to 0x10 should be enough.
The 24 is not really important, this is all about what utf-8 strings to
accept as input. The strings are stored
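
The 21-versus-24-bit question can be checked with a couple of lines. This sketch assumes the Unicode code space tops out at U+10FFFF; MAX_CODE_POINT and is_in_unicode_range are illustrative names only, and the check says nothing about what the patch actually accepts.

```python
# Assumption for this sketch: the Unicode code space ends at U+10FFFF.
MAX_CODE_POINT = 0x10FFFF


def is_in_unicode_range(cp: int) -> bool:
    """Hypothetical input check: accept only code points inside the code space."""
    return 0 <= cp <= MAX_CODE_POINT


# The highest code point fits in 21 bits, so 24 bits is more than enough.
BITS_NEEDED = MAX_CODE_POINT.bit_length()
```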
On 8/6/2004 11:34 PM, Bruce Momjian wrote:
Jan Wieck wrote:
On 8/6/2004 9:04 PM, Bruce Momjian wrote:
Updated. Thanks.
I thought we want to have the feature activated ... I reversed your
change and brought guc.c in sync instead.
Uh, if the guy is doing a vacuum at night, does he want the delay?
-Original Message-
From: Dennis Bjorklund [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 07, 2004 11:23 PM
To: John Hansen
Cc: Takehiko Abe; [EMAIL PROTECTED]
Subject: RE: [PATCHES] [HACKERS] UNICODE characters above 0x1
On Sat, 7 Aug 2004, John Hansen wrote:
Now, is it
I have this on 8.0dev (checked out last Friday):
yomama=# SELECT version();
version
---
PostgreSQL 8.0devel on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.0
(1 row)
yomama=# CREATE TABLE
Dennis Bjorklund [EMAIL PROTECTED] writes:
On Sat, 7 Aug 2004, Tatsuo Ishii wrote:
Anyway my point is if current specification of Unicode only allows
24-bit range, why we need to allow usage against the specification?
Is there a specific reason you want to restrict it to 24 bits?
I see
Jan Wieck [EMAIL PROTECTED] writes:
Good that autovacuum didn't make it then, those people would have had a
big surprise :-)
If autovacuum had made it, would you expect someone to have enabled it
by default? Without advance discussion?
regards, tom lane
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
When we do a PITR recovery based on xid, does it stop recovery based on
the start of the xid or the commit of the xid?
You can stop either before or after that commit. See
recovery.conf.sample (I don't think it's documented anywhere
Jan Wieck wrote:
On 8/6/2004 11:34 PM, Bruce Momjian wrote:
Jan Wieck wrote:
On 8/6/2004 9:04 PM, Bruce Momjian wrote:
Updated. Thanks.
I thought we want to have the feature activated ... I reversed your
change and brought guc.c in sync instead.
Uh, if the guy is doing a
Bernd Helmle [EMAIL PROTECTED] writes:
I have this on 8.0dev (checked out last friday):
yomama=# CREATE RULE rule_insert_test AS ON INSERT TO test DO NOTIFY
test_notification;
[ causes crash ]
Fixed. Thanks for the report!
regards, tom lane
Bruce Momjian [EMAIL PROTECTED] writes:
Yea, my question is if you choose after, do you get everything that
happens until the after transaction commits, or just when it begins.
If I stop after xid 125, and xid 126 starts and stops before 125
commits, does 126 get restored?
Yes. You don't
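
The xid 125/126 case above can be modelled with a toy replay loop. This is only a sketch of the idea that the stop point is decided by the order of commit records, not by xid number; replay_until_xid, commit_log and inclusive are hypothetical names, and none of this is PostgreSQL's recovery code.

```python
from typing import Iterable, List


def replay_until_xid(commit_log: Iterable[int], target_xid: int,
                     inclusive: bool = True) -> List[int]:
    """Toy model: replay commit records in log order up to a target xid.

    commit_log lists xids in the order their commits were written.
    With inclusive=True recovery stops just after the target commit,
    otherwise just before it.
    """
    restored = []
    for xid in commit_log:
        if xid == target_xid:
            if inclusive:
                restored.append(xid)
            break
        restored.append(xid)
    return restored
```

With commit_log = [124, 126, 125, 127] and target 125, xid 126 is restored in both modes because it committed before 125 did, and 127 is not.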
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Yea, my question is if you choose after, do you get everything that
happens until the after transaction commits, or just when it begins.
If I stop after xid 125, and xid 126 starts and stops before 125
commits, does 126 get
On 8/3/2004 11:38 PM, Greg Stark wrote:
Scott Marlowe [EMAIL PROTECTED] writes:
On Tue, 2004-08-03 at 13:05, CSN wrote:
Just wondering, are updateable views slated for a
future version of Postgresql? In addition to using
rules that is.
I would think that a basic fleshing out of the logic with
On Sat, Aug 07, 2004 at 12:38:10PM +0200, Gaetano Mendola wrote:
Alvaro Herrera wrote:
Subversion and arch have been mentioned, but so far there is no
compelling reason to change. It'd take convincing at least a couple
of core hackers to get the ball rolling ...
Well, I think having seen
On Fri, Aug 06, 2004 at 03:14:13PM +1000, fastpgs wrote:
And finally about the scope of the change of status of a trigger.
Should this be local to the session or should be reflected globally?
My humble opinion is it should be reflected globally(again, as in
oracle ?)
If the change is
On Fri, Jul 30, 2004 at 05:53:02AM +0100, Mamta Singh wrote:
I went thru the site
http://www.postgresql.org/docs/7.4/static/wal-benefits-later.html
I see that most of this document is obsolete. It mentions:
1. with UNDO it will be possible to remove pg_clog.
(already possible, another
Bruce Momjian wrote:
Oliver Jowett wrote:
Merlin Moncure wrote:
Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan generation for the unnamed
statement until
-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Sunday, August 08, 2004 2:43 AM
To: Dennis Bjorklund
Cc: Tatsuo Ishii; John Hansen; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [PATCHES] [HACKERS] UNICODE characters above 0x1
Dennis Bjorklund [EMAIL
-Original Message-
From: Oliver Elphick [mailto:[EMAIL PROTECTED]
Sent: Sunday, August 08, 2004 7:43 AM
To: Tom Lane
Cc: John Hansen; Hackers; Patches
Subject: Re: [HACKERS] UNICODE characters above 0x1
On Sat, 2004-08-07 at 07:10, Tom Lane wrote:
Oliver Elphick [EMAIL
Oliver Jowett wrote:
Bruce Momjian wrote:
Oliver Jowett wrote:
Merlin Moncure wrote:
Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan generation
Tom Lane wrote:
If I understood what I was reading, this would take several things:
* Remove the special UTF-8 check in pg_verifymbstr;
* Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
* Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
Are there any other places
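
To make the second item in this list concrete, here is a rough sketch of decoding one character of up to four bytes. It is not the pg_utf2wchar_with_len code; decode_utf8_char is an invented name, and the sketch deliberately leaves out checks such as rejecting overlong encodings.

```python
from typing import Tuple


def decode_utf8_char(data: bytes) -> Tuple[int, int]:
    """Decode the first UTF-8 character in data.

    Hypothetical helper: returns (code_point, byte_length) and
    handles the 1- to 4-byte forms only.
    """
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    if lead < 0x80:
        return lead, 1
    if 0xC0 <= lead < 0xE0:
        length, cp = 2, lead & 0x1F
    elif 0xE0 <= lead < 0xF0:
        length, cp = 3, lead & 0x0F
    elif 0xF0 <= lead < 0xF8:
        length, cp = 4, lead & 0x07   # the 4-byte case
    else:
        raise ValueError("invalid or unsupported lead byte")
    if len(data) < length:
        raise ValueError("truncated sequence")
    for b in data[1:length]:
        if b & 0xC0 != 0x80:          # continuation bytes look like 10xxxxxx
            raise ValueError("bad continuation byte")
        cp = (cp << 6) | (b & 0x3F)
    return cp, length
```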
Oliver Jowett [EMAIL PROTECTED] writes:
Does this change what client_encoding = UNICODE might produce? The JDBC
driver will need some tweaking to handle this -- Java uses UTF-16
internally and I think some supplementary character (?) scheme for
values above 0x as of JDK 1.5.
You're not
Bruce Momjian wrote:
Oliver Jowett wrote:
Bruce Momjian wrote:
Oliver Jowett wrote:
Merlin Moncure wrote:
Another way to deal with the problem is to defer plan generation until
the first plan execution and use the parameters from that execution.
When talking the V3 protocol, 7.5 defers plan
Alvaro Herrera wrote:
I don't think it was a problem of committers. To me it was a problem of
reviewers. Those are very scarce (for the bigger items it's mostly only
Tom). Maybe a better SCM could help with this, but I doubt it. As an
example, I did read the autovacuum patch, but I had no
Alvaro Herrera [EMAIL PROTECTED] writes:
On Fri, Jul 30, 2004 at 05:53:02AM +0100, Mamta Singh wrote:
I went thru the site
http://www.postgresql.org/docs/7.4/static/wal-benefits-later.html
I see that most of this document is obsolete.
Yeah, I was planning to remove that section or at least
I have two things left before beta. I want to make sure the release
notes are current against CVS and I want to make sure the win32
tablespace symlink changes I just made work.
--
Bruce Momjian| http://candle.pha.pa.us
[EMAIL PROTECTED] | (610)
Tom Lane wrote:
If I understood what I was reading, this would take several things:
* Remove the special UTF-8 check in pg_verifymbstr;
* Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
* Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
Are there
Bruce Momjian wrote:
I have two things left before beta. I want to make sure the release
notes are current against CVS and I want to make sure the win32
tablespace symlink changes I just made work.
Tom, when you updated the release notes, did you do a CVS log and
already get all the new
Tom, you didn't like Andreas' idea of allowing the user to rotate the
log files on demand. Isn't that standard functionality for any logging
program in case you want to manually start a new log file? Is there no
way to do this simply? Is this a TODO?
--
Bruce Momjian
John Hansen [EMAIL PROTECTED] writes:
Ahh, but that's not the case. You cannot just delete the check, since
not all combinations of bytes are valid UTF8. UTF bytes FE FF never
appear in a byte sequence for instance.
Well, this is still working at the wrong level. The code that's in
Tatsuo Ishii wrote:
Tom Lane wrote:
If I understood what I was reading, this would take several things:
* Remove the special UTF-8 check in pg_verifymbstr;
* Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
* Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
Are
Bruce Momjian [EMAIL PROTECTED] writes:
I have two things left before beta. I want to make sure the release
notes are current against CVS and I want to make sure the win32
tablespace symlink changes I just made work.
Tom, when you updated the release notes, did you do a CVS log and
already
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
I have two things left before beta. I want to make sure the release
notes are current against CVS and I want to make sure the win32
tablespace symlink changes I just made work.
Tom, when you updated the release notes, did you do a
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Tom, when you updated the release notes, did you do a CVS log and
already get all the new stuff as of Aug 6?
Yes I did. I think the release notes are good to go for beta,
with the possible exception of mentioning any array-input-parsing
Joe Conway wrote:
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Tom, when you updated the release notes, did you do a CVS log and
already get all the new stuff as of Aug 6?
Yes I did. I think the release notes are good to go for beta,
with the possible exception of
Well, this is still working at the wrong level. The code
that's in pg_verifymbstr is mainly intended to enforce the
*system wide* assumption that multibyte characters must have
the high bit set in every byte. (We do not support encodings
without this property in the backend, because it
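
The system-wide assumption mentioned here could be expressed as a check like the one below. It is a sketch of the stated property only, not the logic of pg_verifymbstr; multibyte_has_high_bits is a hypothetical name.

```python
def multibyte_has_high_bits(char_bytes: bytes) -> bool:
    """Hypothetical check: every byte of a multibyte character has bit 7 set.

    Single-byte characters are not constrained by this rule.
    """
    if len(char_bytes) <= 1:
        return True
    return all(b & 0x80 for b in char_bytes)
```

A UTF-8 sequence such as b"\xc3\xa9" passes, while a two-byte character whose second byte is 0x40 does not.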
Joe Conway [EMAIL PROTECTED] writes:
I was waiting on feedback on two issues before committing:
1. '{{1 2 x},{3}}'
2. '{{},{}}'
My patch would generate an ERROR for either. Tom, you questioned my
disallowing of both of these, but didn't seem to have a very strong
opinion.
I don't have
Bruce Momjian [EMAIL PROTECTED] writes:
Tom, you didn't like Andreas' idea of allowing the user to rotate the
log files on demand.
Give me a use case that requires that, and is sufficiently interesting
to justify even a marginal decrease in the reliability of the log
process.
Frankly, I do not
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Tom, you didn't like Andreas' idea of allowing the user to rotate the
log files on demand.
Give me a use case that requires that, and is sufficiently interesting
to justify even a marginal decrease in the reliability of the log
On Sat, 7 Aug 2004, Tom Lane wrote:
The plan was to wrap beta1 sometime tomorrow ... I'd guess that
sometime will end up being in the afternoon east coast time, but this
largely depends on the libpgport breakage ...
That's what I was figuring (re: libpgport) ... hopefully I'm following the
Using pg_dump from postgresql 7.3.4 I've obtained
a dump file containing a SEQUENCE SET with no
corresponding SEQUENCE. I've seen that this is usually
due to the presence of a table with a 'serial' field,
but since in this case there is no such table I wonder
if this is a bug in pg_dump.
Perhaps.
Bruce Momjian [EMAIL PROTECTED] writes:
I thought rotatelogs supported it so we should in cases where someone
wanted to make a new log file to delete an unusually large one, like a 1
gig log file caused by some runaway process.
Hm? We have a rotate-on-size parameter, so that's not going to
Also, given this and your previous operator commutator problem, I
strongly suspect that someone has taken an axe to the system catalogs on
your installation and they are very screwy.
Chris
strk wrote:
Using pg_dump from postgresql 7.3.4 I've obtained
a dump file containing a SEQUENCE SET with
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
I thought rotatelogs supported it so we should in cases where someone
wanted to make a new log file to delete an unusually large one, like a 1
gig log file caused by some runaway process.
Hm? We have a rotate-on-size parameter, so
Tom Lane wrote:
Joe Conway [EMAIL PROTECTED] writes:
1. '{{1 2 x},{3}}'
2. '{{},{}}'
My patch would generate an ERROR for either. Tom, you questioned my
disallowing of both of these, but didn't seem to have a very strong
opinion.
I don't have any great love for the first item --- I think it was
Joe Conway wrote:
Tom Lane wrote:
Joe Conway [EMAIL PROTECTED] writes:
1. '{{1 2 x},{3}}'
2. '{{},{}}'
My patch would generate an ERROR for either. Tom, you questioned my
disallowing of both of these, but didn't seem to have a very strong
opinion.
I don't have any great love for
Joe Conway [EMAIL PROTECTED] writes:
I committed the attached.
Minor gripe: this bit of documentation seems out of date now.
!For example, elements containing curly braces, commas (or whatever the
!delimiter character is), double quotes, backslashes, or leading white
!space must be
66 matches