Re: [HACKERS] Re: charin(), text_char() should return something else for empty input
Peter Eisentraut <[EMAIL PROTECTED]> writes: > In general, I don't see a point in accepting a zero byte in character > strings. If you want to store binary data there are binary data types (or > effort could be invested in them). If we were starting in a green field then I'd think it worthwhile to maintain null-byte-cleanness for the textual datatypes. At this point, though, the amount of pain involved seems to vastly outweigh the value. The major problem is that I/O conventions not based on null-terminated strings would break all existing user-defined datatypes. (Changing our own code is one thing, breaking users' code is something else.) There are minor-by-comparison problems like not being able to use strcoll() for locale-sensitive comparisons anymore... I agree with Peter that spending some extra effort on bytea and/or similar types is probably a more profitable answer. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
RE: [HACKERS] Plans for solving the VACUUM problem
> > > OTOH it is possible to do without rolling back at all as > > > MySQL folks have shown us ;) > > > > Not with SDB tables which support transactions. > > My point was that MySQL was used quite a long time without it > and still quite many useful applications were produced. And my point was that needless to talk about rollbacks in non-transaction system and in transaction system one has to implement rollback somehow. > BTW, do you know what strategy is used by BSDDB/SDB for > rollback/undo ? AFAIR, they use O-smgr => UNDO is required. Vadim ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Re: charin(), text_char() should return something else for empty input
On Mon, May 28, 2001 at 02:37:32PM -0400, Tom Lane wrote: > I wrote: > > I propose that both of these operations should return a space character > > for an empty input string. This is by analogy to space-padding as you'd > > get with char(1). Any objections? > > An alternative approach is to make charin and text_char map empty > strings to the null character (\0), and conversely make charout and > char_text map the null character to empty strings. charout already > acts that way, in effect, since it has to produce a null-terminated > C string. This way would have the advantage that there would still > be a reversible dump and reload representation for a "char" field > containing '\0', whereas space-padding would cause such a field to > become ' ' after reload. But it's a little strange if you think that > "char" ought to behave the same as char(1). Does the standard require any particular behavior in with NUL characters? I'd like to see PG move toward treating them as ordinary control characters. I realize that at best it will take a long time to get there. C is irretrievably mired in the "NUL is a terminator" swamp, but SQL isn't C. Nathan Myers [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
RE: [HACKERS] Plans for solving the VACUUM problem
> > > Seems overwrite smgr has mainly advantages in terms of > > > speed for operations other than rollback. > > > > ... And rollback is required for < 5% transactions ... > > This obviously depends on application. Small number of aborted transactions was used to show useless of UNDO in terms of space cleanup - that's why I use same argument to show usefulness of O-smgr -:) > I know people who rollback most of their transactions > (actually they use it to emulate temp tables when reporting). Shouldn't they use TEMP tables? -:) > OTOH it is possible to do without rolling back at all as > MySQL folks have shown us ;) Not with SDB tables which support transactions. Vadim ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] pg_log ??
hello all I don't know what to do... the pg_log file is too big.. anyone can help me? thanks ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
RE: AW: [HACKERS] Plans for solving the VACUUM problem
> > > So are whole pages stored in rollback segments or just > > > the modified data? > > > > This is implementation dependent. Storing whole pages is > > much easy to do, but obviously it's better to store just > > modified data. > > I am not sure it is necessarily better. Seems to be a tradeoff here. > pros of whole pages: > a possible merge with physical log (for first > modification of a page after checkpoint > there would be no overhead compared to current > since it is already written now) Using WAL as RS data storage is questionable. > in a clever implementation a page already in the > "rollback segment" might satisfy the > modification of another row on that page, and > thus would not need any additional io. This would be possible only if there was no commit (same SCN) between two modifications. But, aren't we too deep on overwriting smgr (O-smgr) implementation? It's doable. It has advantages in terms of IO active transactions must do to follow MVCC. It has drawback in terms of required disk space (and, oh yeh, it's not easy to implement -:)). So, any other opinions about value of O-smgr? Vadim ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] /contrib/unixdate: Broke in cvs tip.
Larry Rosenman <[EMAIL PROTECTED]> writes: > I tried to use the unixdate contrib, and got the following: I think unixdate is suffering from bit-rot. Most or all of what it does is now part of the mainframe anyway. > Any ideas? (I need SOMETHING that takes a unix timestamp and turns it > to timestamp. ) There are a number of ways. I tend to rely on the binary equivalence between int4 and abstime: regression=# select now()::abstime::int4; ?column? --- 991145365 (1 row) regression=# select 991145365::int4::abstime::timestamp; ?column? 2001-05-29 10:09:25-04 (1 row) but the more officially supported way to do the former is date_part('epoch', timestamp), and I think there is also an Approved Way to do the latter. (Thomas?) regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] appendum: Re: *really* simple select doesn't useindices ...
On Tue, 29 May 2001, Tom Lane wrote: > Gavin Sherry <[EMAIL PROTECTED]> writes: > > The column 'zip' is of type text. As such, indices will not be used except > > in the case when the where clause is WHERE zip ~ '^' for btree > > indices. > > Uh ... nonsense. Oh good, I was worried there for a sec ... :) > > On Tue, 29 May 2001, Marc G. Fournier wrote: > >> globalmatch=# vacuum verbose analyze locations; > >> NOTICE: --Relation locations-- > >> NOTICE: Pages 1395: Changed 0, reaped 0, Empty 0, New 0; Tup 123571: Vac 0, >Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 76, MaxLen 124; Re-using: Free/Avail. Space >0/0; EndEmpty/Avail. Pages 0/0. CPU 0.11s/0.00u sec. > > >> globalmatch=# explain SELECT location from locations WHERE zip = '80012'; > >> NOTICE: QUERY PLAN: > >> > >> Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) > > Okay, so it thinks that "zip = '80012'" will match 4217 out of 123571 > rows, which is more than enough to drive it to a sequential scan > (with an average of more than three matched rows on every page of the > relation, there'd be no I/O savings at all from consulting the index). > > Since the real number of matches is only 1, this estimate is obviously > way off. In 7.1 the estimate is being driven by the frequency of the > most common value in the column --- what is the most common value? If > you're lucky, the most common value is a dummy (empty string, maybe) > that you could replace by NULL with a few simple changes in > application logic. 7.1 is smart enough to distinguish NULL from real > data values in its estimates. If you're not lucky, there really are a > few values that are far more common than average, in which case you're > stuck unless you want to run development sources. Current sources > should do a lot better on that kind of data distribution. Hit it right on the mark: zip | cnt ---+--- | 81403 00210 | 1 00211 | 1 Will look at the code and see what I can do abuot that NULL issue ... thanks :) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] appendum: Re: *really* simple select doesn't use indices ...
Gavin Sherry <[EMAIL PROTECTED]> writes: > The column 'zip' is of type text. As such, indices will not be used except > in the case when the where clause is WHERE zip ~ '^' for btree > indices. Uh ... nonsense. > On Tue, 29 May 2001, Marc G. Fournier wrote: >> globalmatch=# vacuum verbose analyze locations; >> NOTICE: --Relation locations-- >> NOTICE: Pages 1395: Changed 0, reaped 0, Empty 0, New 0; Tup 123571: Vac 0, >Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 76, MaxLen 124; Re-using: Free/Avail. Space >0/0; EndEmpty/Avail. Pages 0/0. CPU 0.11s/0.00u sec. >> globalmatch=# explain SELECT location from locations WHERE zip = '80012'; >> NOTICE: QUERY PLAN: >> >> Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) Okay, so it thinks that "zip = '80012'" will match 4217 out of 123571 rows, which is more than enough to drive it to a sequential scan (with an average of more than three matched rows on every page of the relation, there'd be no I/O savings at all from consulting the index). Since the real number of matches is only 1, this estimate is obviously way off. In 7.1 the estimate is being driven by the frequency of the most common value in the column --- what is the most common value? If you're lucky, the most common value is a dummy (empty string, maybe) that you could replace by NULL with a few simple changes in application logic. 7.1 is smart enough to distinguish NULL from real data values in its estimates. If you're not lucky, there really are a few values that are far more common than average, in which case you're stuck unless you want to run development sources. Current sources should do a lot better on that kind of data distribution. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] appendum: Re: *really* simple select doesn't useindices ...
Marc, The column 'zip' is of type text. As such, indices will not be used except in the case when the where clause is WHERE zip ~ '^' for btree indices. Gavin On Tue, 29 May 2001, Marc G. Fournier wrote: > > Okay, just bit the bullet, upgraded to v7.1.2, and the problem still > persists: > > globalmatch=# vacuum verbose analyze locations; > NOTICE: --Relation locations-- > NOTICE: Pages 1395: Changed 0, reaped 0, Empty 0, New 0; Tup 123571: Vac 0, >Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 76, MaxLen 124; Re-using: Free/Avail. Space >0/0; EndEmpty/Avail. Pages 0/0. CPU 0.11s/0.00u sec. > NOTICE: Index locationstmp_gid_key: Pages 272; Tuples 123571. CPU 0.01s/0.15u sec. > NOTICE: Index locations_zip: Pages 320; Tuples 123571. CPU 0.02s/0.14u sec. > NOTICE: Index locations_country: Pages 342; Tuples 123571. CPU 0.03s/0.13u sec. > NOTICE: --Relation pg_toast_9373225-- > NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, >Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; >EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec. > NOTICE: Index pg_toast_9373225_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec. > NOTICE: Analyzing... > VACUUM > globalmatch=# explain SELECT location from locations WHERE zip = '80012'; > NOTICE: QUERY PLAN: > > Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) > > EXPLAIN > globalmatch=# select version(); >version > - > PostgreSQL 7.1.2 on i386-unknown-freebsd4.3, compiled by GCC 2.95.3 > (1 row) > > > On Mon, 28 May 2001, Marc G. Fournier wrote: > > > > > First, this is still a v7.1 system ... its totally possible that this is > > long fixed, and I'm way overdue to get it to v7.1.2, which I'll gladly > > accept as a response ... > > > > That said ... seems like a very painful way to arrive at 1 row ... :) > > > > table structure: > > > > globalmatch=# \d locations > > Table "locations" > > Attribute | Type |Modifier > > ---+-+ > > gid | integer | not null default nextval('locationstmp_gid_seq'::text) > > city | text| > > state | text| > > country | text| > > zip | text| > > location | point | > > Indices: locations_zip, > > locationstmp_gid_key > > > > globalmatch=# \d locations_zip > > Index "locations_zip" > > Attribute | Type > > ---+-- > > zip | text > > btree > > > > globalmatch=# EXPLAIN SELECT count(location) from locations WHERE zip = '80012'; > > NOTICE: QUERY PLAN: > > > > Aggregate (cost=2950.18..2950.18 rows=1 width=16) > > -> Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) > > > > EXPLAIN > > > > globalmatch=# SELECT count(location) from locations WHERE zip = '80012'; > > count > > --- > > 1 > > (1 row) > > > > globalmatch=# SELECT count(location) from locations; > > count > > > > 123571 > > (1 row) > > > > > > > > > > Marc G. Fournier [EMAIL PROTECTED] > Systems Administrator @ hub.org > scrappy@{postgresql|isc}.org ICQ#7615664 > > > ---(end of broadcast)--- > TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
AW: AW: [HACKERS] User functions and AIX
> > > IBM is trying to find the answer to this but I thought I would throw ... > > > Tell me your link line, OS and compiler version. > > And have you forgotten to include -bI:postgres.imp ? > > Bingo! I can't believe that IBM has been wrestling with this for a week. > Part of the reason we are thinking of going with IBM is for the support. Shared libs are obviously not their strong side :-) Basically we are very happy with their RS6000's and AIX though. > Here is my Makefile now. I'm not sure about that -lc there > as I get duplicate symbol warnings but it appears to work fine. they don't matter > CFLAGS =-g -O0 -pipe -ansi -Wall -Wshadow -Wpointer-arith gcc and not xlc :-) actually xlc produces faster code, but I don't think that makes a noticeable difference. > .o.so: > ld -G -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \ > -bexpall -bnoentry -lc Always use the compiler for linking instead of ld: gcc -Wl,-H512 -Wl,-bM:SRE -o $@ $< -L ${PGLIBDIR} -bI:/usr/local/pgsql/lib/postgres.imp \ -bexpall -bnoentry You are not allowed to leave anything unresolved, thus do not use -G, or you won't notice unresolved externals (-G includes -berok which you don't want at all). Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] appendum: Re: *really* simple select doesn't use indices ...
Okay, just bit the bullet, upgraded to v7.1.2, and the problem still persists: globalmatch=# vacuum verbose analyze locations; NOTICE: --Relation locations-- NOTICE: Pages 1395: Changed 0, reaped 0, Empty 0, New 0; Tup 123571: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 76, MaxLen 124; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.11s/0.00u sec. NOTICE: Index locationstmp_gid_key: Pages 272; Tuples 123571. CPU 0.01s/0.15u sec. NOTICE: Index locations_zip: Pages 320; Tuples 123571. CPU 0.02s/0.14u sec. NOTICE: Index locations_country: Pages 342; Tuples 123571. CPU 0.03s/0.13u sec. NOTICE: --Relation pg_toast_9373225-- NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec. NOTICE: Index pg_toast_9373225_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec. NOTICE: Analyzing... VACUUM globalmatch=# explain SELECT location from locations WHERE zip = '80012'; NOTICE: QUERY PLAN: Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) EXPLAIN globalmatch=# select version(); version - PostgreSQL 7.1.2 on i386-unknown-freebsd4.3, compiled by GCC 2.95.3 (1 row) On Mon, 28 May 2001, Marc G. Fournier wrote: > > First, this is still a v7.1 system ... its totally possible that this is > long fixed, and I'm way overdue to get it to v7.1.2, which I'll gladly > accept as a response ... > > That said ... seems like a very painful way to arrive at 1 row ... :) > > table structure: > > globalmatch=# \d locations > Table "locations" > Attribute | Type |Modifier > ---+-+ > gid | integer | not null default nextval('locationstmp_gid_seq'::text) > city | text| > state | text| > country | text| > zip | text| > location | point | > Indices: locations_zip, > locationstmp_gid_key > > globalmatch=# \d locations_zip > Index "locations_zip" > Attribute | Type > ---+-- > zip | text > btree > > globalmatch=# EXPLAIN SELECT count(location) from locations WHERE zip = '80012'; > NOTICE: QUERY PLAN: > > Aggregate (cost=2950.18..2950.18 rows=1 width=16) > -> Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) > > EXPLAIN > > globalmatch=# SELECT count(location) from locations WHERE zip = '80012'; > count > --- > 1 > (1 row) > > globalmatch=# SELECT count(location) from locations; > count > > 123571 > (1 row) > > > > Marc G. Fournier [EMAIL PROTECTED] Systems Administrator @ hub.org scrappy@{postgresql|isc}.org ICQ#7615664 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
AW: [HACKERS] Re: charin(), text_char() should return something else for empty input
> An alternative approach is to make charin and text_char map empty > strings to the null character (\0), and conversely make charout and > char_text map the null character to empty strings. charout already > acts that way, in effect, since it has to produce a null-terminated > This way would have the advantage that there would still > be a reversible dump and reload representation for a "char" field > containing '\0' I more like this. IIRC some implementations allow storing a \0 in char(n) also. Then it is 8bit clean and can be used for a 1 byte number. Such values can usually only be inserted and selected with host variables. Andreas ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
AW: [HACKERS] User functions and AIX
> IBM is trying to find the answer to this but I thought I would throw > this out here to see if anyone can help me. I am compiling a user > defined type on AIX and it fails when I try to use it. The type is > chkpass and it is in the contrib directory. It fails with a core dump > at line 88 in chkpass.c. The line reads as follows. > > result = (chkpass *) palloc(sizeof(chkpass)); > > The top of the backtrace looks like this. > > #0 0x0 in ?? () from (unknown load module) > #1 0xd1087a60 in chkpass_in (fcinfo=0x0) at chkpass.c:88 > #2 0x10045cf4 in or_clause (clause=0x0) at clauses.c:211 > #3 0x10075d68 in int82ge (fcinfo=0x1015cfc8) at int8.c:343 > #4 0x1005909c in _readArrayRef () at readfuncs.c:924 > #5 0x10059b68 in _readSeqScan () at readfuncs.c:600 > > It looks like the dynamically loaded object (chkpass.so) > can't determine > the address of palloc() from the parent. I assume I need a > flag for the > compile either on the main build to export the addresses or > on the build > of chkpass to tell it where to look up the addresses. Anyone been > through this that might be able to shed some light? Tell me your link line, OS and compiler version. And have you forgotten to include -bI:postgres.imp ? In general it is imho a good idea to copy the appropriate compile and link flags from the regression test, that compiles shared libs in .../contrib. Andreas ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
AW: AW: [HACKERS] Plans for solving the VACUUM problem
> > > > > You mean it is restored in session that is running the transaction ? > > > > > > Depends on what you mean with restored. It first reads the heap page, > > > sees that it needs an older version and thus reads it from the "rollback >segment". > > > > So are whole pages stored in rollback segments or just the modified data? > > This is implementation dependent. Storing whole pages is much easy to do, > but obviously it's better to store just modified data. I am not sure it is necessarily better. Seems to be a tradeoff here. pros of whole pages: a possible merge with physical log (for first modification of a page after checkpoint there would be no overhead compared to current since it is already written now) in a clever implementation a page already in the "rollback segment" might satisfy the modification of another row on that page, and thus would not need any additional io. Andreas ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl