Re: [HACKERS] Unicode combining characters
Maybe something like this: declare a plpgsql function that takes two text parameters and has a body like for (i = 0 to a million) boolvar := $1 like $2; Then call it with strings of different lengths and see how the runtime varies. You need to apply the LIKE to function parameters, else the system will probably collapse the LIKE operation to a constant... Good idea. I did tests for both LIKE and REGEX using PL/pgsql functions(see source code below). Here are the result. What I did was calling the functions with changing taret strings from 32byte to 8192. Times are all in msec. (1) LIKE bytes Without MBWith MB 328121.94 8094.73 648167.98 8105.24 128 8151.30 8108.61 256 8090.12 8098.20 512 8111.05 8101.07 1024 8110.49 8099.61 2048 8095.32 8106.00 4096 8094.88 8091.19 8192 8123.02 8121.63 (2) REGEX bytes Without MB With MB 32 117.93 119.47 64 126.41 127.61 128 143.97 146.55 256 180.49 183.69 512 255.53 256.16 1024410.59 409.22 20485176.38 5181.99 40966000.82 5627.84 81926529.15 6547.10 - shell script --- for i in 32 64 128 256 512 1024 2048 4096 8192 do psql -c explain analyze select liketest(a,'aaa') from (select substring('very_long_text' from 0 for $i) as a) as a test done - shell script --- - functions - drop function liketest(text,text); create function liketest(text,text) returns bool as ' declare i int; rtn boolean; begin i := 100; while i 0 loop rtn := $1 like $2; i := i - 1; end loop; return rtn; end; ' language 'plpgsql'; drop function regextest(text,text); create function regextest(text,text) returns bool as ' declare i int; rtn boolean; begin i := 1; while i 0 loop rtn := $1 ~ $2; i := i - 1; end loop; return rtn; end; ' language 'plpgsql'; - functions - ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] RFD: access to remore databases: altername suggestion
You are attacking here two things: a) schemas, which should be done in 7.3, Is imho something different alltogether. (I know we have two opposed views here) thus multiple databases on same host would be unnecessary. I disagree :-) b) connections to remote host' databases, which is partially implemented already (in a ugly way, but...) see contrib/dblink What you described is a syntactic sugar to implement b) which isn't a bad idea, but just consider, it is already done. sorta. Not in the least. True remote access needs 2 phase commit, which is nowhere near the horizon. Remote read only access would be somewhat easier to implement, and would imho be a very useful first step. Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Unicode combining characters
- shell script --- for i in 32 64 128 256 512 1024 2048 4096 8192 do psql -c explain analyze select liketest(a,'aaa') from (select substring('very_long_text' from 0 for $i) as a) as a test done - shell script --- I don't think your search string is sufficient for a test. With 'aaa' it actually knows that it only needs to look at the first three characters of a. Imho you need to try something like liketest(a,'%aaa%'). Andreas ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] btree_gist regression test busted?
You are right. Please, apply attached patch or copy result/btree_gist.out expected/btree_gist.out Tom Lane wrote: In current CVS I see a failure in the btree_gist regression test. It kinda looks like the test data was changed without updating the expected results, but would you verify this? regards, tom lane *** ./expected/btree_gist.out Wed Aug 22 14:27:54 2001 --- ./results/btree_gist.out Tue Oct 2 18:48:34 2001 *** *** 17,23 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 7 (1 row) -- create idx --- 17,23 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 66 (1 row) -- create idx *** *** 34,39 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 7 (1 row) --- 34,39 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 66 (1 row) -- Teodor Sigaev [EMAIL PROTECTED] patch_btreesql.gz ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] My last ECPG commit
On Wed, Oct 03, 2001 at 02:27:20AM +, Thomas Lockhart wrote: It may be that the static tarballs for RedHat will work for you (they work for me on Mandrake). Maybe. But then I could compile the sources myself. I use CVSup to keep a local copy of the cvs repository on my laptop, so I have a *full* development environment when I'm traveling or otherwise off line. I'd have a very hard time working without it... Yes, that was my original thinking too. But CVS could do the same. Michael -- Michael Meskes [EMAIL PROTECTED] Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL! ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Unicode combining characters
I don't think your search string is sufficient for a test. With 'aaa' it actually knows that it only needs to look at the first three characters of a. Imho you need to try something like liketest(a,'%aaa%'). Ok. I ran the modified test (now the iteration is reduced to 10 in liketest()). As you can see, there's huge difference. MB seems up to ~8 times slower:- There seems some problems existing in the implementation. Considering REGEX is not so slow, maybe we should employ the same design as REGEX. i.e. using wide charcters, not multibyte streams... MB+LIKE Total runtime: 1321.58 msec Total runtime: 1718.03 msec Total runtime: 2519.97 msec Total runtime: 4187.05 msec Total runtime: 7629.24 msec Total runtime: 14456.45 msec Total runtime: 17320.14 msec Total runtime: 17323.65 msec Total runtime: 17321.51 msec noMB+LIKE Total runtime: 964.90 msec Total runtime: 993.09 msec Total runtime: 1057.40 msec Total runtime: 1192.68 msec Total runtime: 1494.59 msec Total runtime: 2078.75 msec Total runtime: 2328.77 msec Total runtime: 2326.38 msec Total runtime: 2330.53 msec -- Tatsuo Ishii ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Bulkloading using COPY - ignore duplicates?
Peter Eisentraut writes: However, it seems to me that COPY ignoring duplicates can easily be done by preprocessing the input file. Or by post-processing, like (error checking cut): void import_shots(char *impfile, int lineshoot_id) { char tab_name[128]; char tab_temp[128]; frig_file(impfile); /* add the postgres header */ sprintf(tab_name, shot_%d, lineshoot_id); sprintf(tab_temp, shot_%d_tmp, lineshoot_id); sprintf(cmd, CREATE TEMPORARY TABLE %s AS SELECT * FROM shot, tab_temp); EXEC SQL EXECUTE IMMEDIATE :cmd; EXEC SQL COMMIT WORK; /* will not work without comit here! */ sprintf(cmd, COPY BINARY %s FROM '%s', tab_temp, impfile); append_page_alloc(cmd, tab_name, impfile, 1); EXEC SQL EXECUTE IMMEDIATE :cmd; sprintf(cmd, INSERT INTO %s SELECT DISTINCT ON(shot_time) * FROM %s, tab_name, tab_temp); EXEC SQL EXECUTE IMMEDIATE :cmd; sprintf(cmd, DROP TABLE %s, tab_temp); EXEC SQL EXECUTE IMMEDIATE :cmd; EXEC SQL COMMIT WORK ; remove(impfile); } However this is adding significant time to the import operation. Likewise I could loop round the input file first and hunt for duplicates, again with a performance hit. My main point is that Postgres can easily and quickly check for duplicates during the COPY (as it does currently) and it adds zero execution time to simply ignore these duplicate rows. Obviously this is a useful feature otherwise Oracle, Ingres and other commercial relational databases wouldn't feature similiar functionality. Yes, in an ideal world the input to COPY should be clean and consistent with defined indexes. However this is only really the case when COPY is used for database/table backup and restore. It misses the point that a major use of COPY is in speed optimisation on bulk inserts... Lee. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] CEST timezone
Is it a bug or CEST timezone is not supported anymore ? I can't import my 7.1.2 database to current development version of postgresql lis=# create table test (ts timestamp); CREATE lis=# insert into test values ('23.05.2000 09:06:59.00 CEST'); ERROR: Bad timestamp external representation '23.05.2000 09:06:59.00 CEST' lis=# insert into test values ('23.05.2000 09:06:59.00 CET'); INSERT 125614 1 lis=# select * from test; ts -- 23.05.2000 10:06:59 CEST (1 row) lis=# select version(); version PostgreSQL 7.2devel on i686-pc-linux-gnu, compiled by GCC 2.96 (1 row) lis=# Jan Varga ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Dumping variables..A sort of serialize
Hi, I am currently struggling to write a serialize-like function that would dump a row of a table into a string-like object in a way that would allow me to reconstruct the original row (or its individual elements) from this object. The tentative plan I have is something like this: 1) Write a C-function declared something like so: Datum serialise(PG_FUNCTION_ARGS) 2) Inside the function, get the pointer to the row using: TupleTableSlot *row = PG_GETARGPOINTER(0); 3) Use GetAttributeByName to get the Datum value corresponding to each of the attributes. This is where I am stuck. What I want to do now is to use this datum value returned by GetAttributeByName to get at the glob of memory occupied by the attribute and memmove it into an area declared as text. I could then store this text as a row in a table. Is this at all possible or am I talking through my hat ;) I am sorry I dont understand the backend variable storage and types too well and would be grateful for some help. Regards and Thanks, Gurunandan ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] BUG: text(varchar) truncates at 31 bytes
... Perhaps it'd be a better idea for the option of a freebie conversion to be checked earlier, say immediately after we discover there is no exact match for the function name and input type. Thomas, what do you think? We *really* need that catalog lookup first. Otherwise, we will never be able to override the hardcoded compatibility assumptions in that matching routine. Once we push that routine into a system catalog, we'll have more flexibility to tune things after the fact. Without the explicit function call, things would work just fine for the example at hand, right? I could put in a dummy passthrough routine. But that seems a bit ugly. - Thomas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Timestamp, fractional seconds problem
Problem: the external representation of time and timestamp are less precise than the internal representation. We are using postgresql 7.1.3 The timestamp and time types support resolving microseconds (6 places beyond the decimal), however the output routines round the value to only 2 decimal places. This causes data degradation, if a table with timestamps is copied out and then copied back in, as the timestamps lose precision. We feel this is a data integrity issue. Copy out (ascii) does not maintain the consistency of the data it copies. In our application, we depend on millisecond resolution timestamps and often need to copy out/copy back tables. The current timestamp formating in postgresql 7.1.x breaks this badly. A work around for display might be to use to_char(). But for copy the only workaround we have found is to use binary copy. Alas, binary copy does not work for server to client copies. Unfortunately, we need to copy to the client machine. The client copy does not support binary copies so we lose precision. Our suggested fix to this problem is to change the encoding of the fractional seconds part of the datetime and time types in datetime.c (called by timestamp_out, time_out) to represent least 6 digits beyond the decimal (ie %0.6f). A configurable format would also work. If there is another way to force the encoding to be precise we'd love to hear about it. Otherwise this appears to be a silent data integrity bug with unacceptable workarounds. Thanks! Laurette Cisneros ([EMAIL PROTECTED]) Elein Mustain NextBus Information Systems ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Unicode combining characters
Ok. I ran the modified test (now the iteration is reduced to 10 in liketest()). As you can see, there's huge difference. MB seems up to ~8 times slower:- There seems some problems existing in the implementation. Considering REGEX is not so slow, maybe we should employ the same design as REGEX. i.e. using wide charcters, not multibyte streams... MB+LIKE Total runtime: 1321.58 msec Total runtime: 1718.03 msec Total runtime: 2519.97 msec Total runtime: 4187.05 msec Total runtime: 7629.24 msec Total runtime: 14456.45 msec Total runtime: 17320.14 msec Total runtime: 17323.65 msec Total runtime: 17321.51 msec noMB+LIKE Total runtime: 964.90 msec Total runtime: 993.09 msec Total runtime: 1057.40 msec Total runtime: 1192.68 msec Total runtime: 1494.59 msec Total runtime: 2078.75 msec Total runtime: 2328.77 msec Total runtime: 2326.38 msec Total runtime: 2330.53 msec I did some trials with wide characters implementation and saw virtually no improvement. My guess is the logic employed in LIKE is too simple to hide the overhead of the multibyte and wide character conversion. The reason why REGEX with MB is not so slow would be the complexity of its logic, I think. As you can see in my previous postings, $1 ~ $2 operation (this is logically same as a LIKE '%a%') is, for example, almost 80 times slower than LIKE (remember that likest() loops over 10 times more than regextest()). So I decided to use a completely different approach. Now like has two matching engines, one for single byte encodings (MatchText()), the other is for multibyte ones (MBMatchText()). MatchText() is identical to the non MB version of it, and virtually no performance penalty for single byte encodings. MBMatchText() is for multibyte encodings and is identical the one used in 7.1. Here is the MB case result with SQL_ASCII encoding. Total runtime: 901.69 msec Total runtime: 939.08 msec Total runtime: 993.60 msec Total runtime: 1148.18 msec Total runtime: 1434.92 msec Total runtime: 2024.59 msec Total runtime: 2288.50 msec Total runtime: 2290.53 msec Total runtime: 2316.00 msec To accomplish this, I moved MatchText etc. to a separate file and now like.c includes it *twice* (similar technique used in regexec()). This makes like.o a little bit larger, but I believe this is worth for the optimization. -- Tatsuo Ishii ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] BUG: text(varchar) truncates at 31 bytes
Thomas Lockhart [EMAIL PROTECTED] writes: Perhaps it'd be a better idea for the option of a freebie conversion to be checked earlier, say immediately after we discover there is no exact match for the function name and input type. Thomas, what do you think? We *really* need that catalog lookup first. Otherwise, we will never be able to override the hardcoded compatibility assumptions in that matching routine. Sure, I said *after* we fail to find an exact match. But the freebie match is for a function name that matches a type name and is binary-compatible with the source type. That's not a weak constraint. ISTM that interpretation should take priority over interpretations that involve more than one level of transformation. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Unicode combining characters
Tatsuo Ishii [EMAIL PROTECTED] writes: To accomplish this, I moved MatchText etc. to a separate file and now like.c includes it *twice* (similar technique used in regexec()). This makes like.o a little bit larger, but I believe this is worth for the optimization. That sounds great. What's your feeling now about the original question: whether to enable multibyte by default now, or not? I'm still thinking that Peter's counsel is the wisest: plan to do it in 7.3, not today. But this fix seems to eliminate the only hard reason we have not to do it today ... regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Feature suggestion: Postgresql binding to one IP?
Hi people, Is it possible for Postgresql to bind to one IP address? I'm trying to run multiple postgresql installations on one server. The unix socket could be named accordingly: Postgresql config bound to a particular port and all IPs. .s.PGSQL.portnumber Postgresql config bound to a particular port and IP. .s.PGSQL.portnumber.ipaddress Any other suggestions/comments on running multiple instances of postgresql are welcomed. An less desirable alternative is to keep binding to all IP, use different ports and name the ports, but specifying the port by name in -p doesn't work. Cheerio, Link. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Feature suggestion: Postgresql binding to one IP?
Lincoln Yeoh [EMAIL PROTECTED] writes: Is it possible for Postgresql to bind to one IP address? See 'virtual_host' GUC parameter. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Unicode combining characters
What's your feeling now about the original question: whether to enable multibyte by default now, or not? I'm still thinking that Peter's counsel is the wisest: plan to do it in 7.3, not today. But this fix seems to eliminate the only hard reason we have not to do it today ... If SQL99's I18N staffs would be added in 7.3, it means both the multibyte support and the locale support might disappear, then 2might be merged into it (not sure about NLS. is that compatible with, for example, per column charset?) So, for none multibyte users, the multibyte support would be one-release-only-functionality: suddenly appears in 7.2 and disappears in 7.3. I'm afraid this would cause more troubles rather than usefullness. What do you think? -- Tatsuo Ishii ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Unicode combining characters
Tatsuo Ishii [EMAIL PROTECTED] writes: What do you think? I think that we were supposed to go beta a month ago, and so this is no time to start adding new features to this release. Let's plan to make this happen (one way or the other) in 7.3, instead. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Problem on AIX with current
Tatsuo Ishii [EMAIL PROTECTED] writes: When I recompiled with LOCK_DEBUG and trace_lwlocks = true, it *works* (and saw lots of lock debugging messages, of course). However if I turn trace_lwlocks to off, the backend stucks again. Ugh ... ye classic Heisenbug ... Is there anything I can do? Apparently the problem is timing-sensitive, which is hardly surprising for a lock issue. You might find that it occurs some of the time if you repeat the test over and over. Note the machine has 4 processors. Is that related to? Hard to tell at this point, but considering that no one else has reported a problem so far, it does seem like multiple CPUs at least help to make the failure more probable. But it could just be a portability problem. Do you have another machine with identical OS and fewer processors to try for comparison? Andreas, have you tried CVS tip lately on AIX? What's your results? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] timestamp resolution?
Is this an expected behavior? I could not see why t1 and t2 are showing different time resolutions... test=# create table t3(t1 timestamp(2), t2 timestamp(2) default current_timestamp); CREATE test=# insert into t3 values(current_timestamp); INSERT 16566 1 test=# select * from t3; t1 |t2 +--- 2001-10-04 13:48:34+09 | 2001-10-04 13:48:34.34+09 (1 row) -- Tatsuo Ishii ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[HACKERS] PROBLEM SOLVED: LOCK TABLE oddness in PLpgSQL function called via JDBC
At 06:36 PM 10/2/2001, Hiroshi Inoue wrote: The cause is that the stored function uses a common snapshot throughout the function execution. As I've complained many times, the current implementaion is far from intuition and this case seems to show that it isn't proper at all either. Bravo! That indeed seems to have been the problem. To solve it, I simply moved the LOCK TABLE out of the PLpgSQL function and into the JDBC code. While this isn't *ideal* as it leaves the table locked across two JDBC calls (the function and the following commit), it achieves the desired result (synchronous access to the idfactory table across all clients), and as I said, the function won't be called very often. It's far more important that it work as expected rather than it work in sub-millisecond time. To illustrate then what seems to have been occurring: TimeThread A Thread B 1 snapshot 2 lock 3 read 1, 1 4 write 11, 2 5snapshot 6 return 1 7 commit 8lock 9read 1, 1 10write 11, 2 11FAIL As long as thread B takes its snapshot any time before the commit at (7), its write at (10) will not affect any rows because ... The *update* statement find the row matching the where clause using the common snapshot but will find the row was already updated and the updated row doesn't satisfy the condition any longer. Ouch. So querying for select, update, delete, whatever goes against the snapshot to *locate* rows, but then applies the where clause to the *new values* not seen in the snapshot? If that's the case, that's extremely confusing. Anyway, many thanks to everyone for keeping me from going totally insane. Luckily the other stored procedures we need to write won't require such strict access to table data. :) Peace, Dave ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] btree_gist regression test busted?
Your patch has been added to the PostgreSQL unapplied patches list at: http://candle.pha.pa.us/cgi-bin/pgpatches I will try to apply it within the next 48 hours. You are right. Please, apply attached patch or copy result/btree_gist.out expected/btree_gist.out Tom Lane wrote: In current CVS I see a failure in the btree_gist regression test. It kinda looks like the test data was changed without updating the expected results, but would you verify this? regards, tom lane *** ./expected/btree_gist.out Wed Aug 22 14:27:54 2001 --- ./results/btree_gist.outTue Oct 2 18:48:34 2001 *** *** 17,23 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 7 (1 row) -- create idx --- 17,23 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 66 (1 row) -- create idx *** *** 34,39 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 7 (1 row) --- 34,39 select count(*) from tstmp where t '2001-05-29 08:33:09+04'; count --- ! 66 (1 row) -- Teodor Sigaev [EMAIL PROTECTED] [ Attachment, skipping... ] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Unicode combining characters
Tatsuo Ishii [EMAIL PROTECTED] writes: ... There seems some problems existing in the implementation. Considering REGEX is not so slow, maybe we should employ the same design as REGEX. i.e. using wide charcters, not multibyte streams... Seems like a good thing to put on the to-do list. In the meantime, we still have the question of whether to enable multibyte in the default configuration. I'd still vote YES, as these results seem to me to demonstrate that there is no wide-ranging performance penalty. A problem confined to LIKE on long strings isn't a showstopper IMHO. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Unicode combining characters
Tatsuo Ishii [EMAIL PROTECTED] writes: ... There seems some problems existing in the implementation. Considering REGEX is not so slow, maybe we should employ the same design as REGEX. i.e. using wide charcters, not multibyte streams... Seems like a good thing to put on the to-do list. In the meantime, we still have the question of whether to enable multibyte in the default configuration. I'd still vote YES, as these results seem to me to demonstrate that there is no wide-ranging performance penalty. A problem confined to LIKE on long strings isn't a showstopper IMHO. As I said, with a valid not anchored like expression the performance difference was substantial, even for shorter strings it was 37%. The test with like 'aaa' was not a good test case, and we should not deduce anything from that. Andreas ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Unicode combining characters
Tatsuo Ishii [EMAIL PROTECTED] writes: ... There seems some problems existing in the implementation. Considering REGEX is not so slow, maybe we should employ the same design as REGEX. i.e. using wide charcters, not multibyte streams... Seems like a good thing to put on the to-do list. In the meantime, we still have the question of whether to enable multibyte in the default configuration. I'd still vote YES, as these results seem to me to demonstrate that there is no wide-ranging performance penalty. A problem confined to LIKE on long strings isn't a showstopper IMHO. Added to TODO: * Use wide characters to evaluate regular expressions, for performance (Tatsuo) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Rounding issue with current_time
... So the real issue appears to be that subsecond resolution isn't propagating into time and timetz at all. Ah. Of course it isn't, because I (probably) didn't change DecodeTimeOnly() to use the microsecond resolution version of the transaction time. Will look at it. - Thomas ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Rounding issue with current_time
BTW, would you object to my removing the macros IS_BUILTIN_TYPE(), IS_HIGHER_TYPE(), IS_HIGHEST_TYPE() from parse_coerce.h? They are used nowhere and are not being maintained --- eg, they don't seem to know about TIMESTAMPTZ. OK. I had already stripped out some #if NOT_USED code but must have missed those. - Thomas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] HISTORY for 7.2
Bruce, I notice HISTORY in CVS doesn't mentioned any development we did with GiST. Should we write some info ? Major things we did: 1. Null-safe interface to GiST 2. Support of multi-key GiST indices TODO Adding concurrency for GiST Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] HISTORY for 7.2
Bruce, I notice HISTORY in CVS doesn't mentioned any development we did with GiST. Should we write some info ? Major things we did: 1. Null-safe interface to GiST 2. Support of multi-key GiST indices I had generic GIST improvements. Updated to: Allow GIST to handle NULLs and multi-key indexes (Oleg Bartunov, Teodor Sigaev, Tom) TODO Adding concurrency for GiST Added. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] BUG: text(varchar) truncates at 31 bytes
#create table t (v varchar); #insert into t values ('0123456789a0123456789b0123456789c0123456789d'); #select v from t; v -- 0123456789a0123456789b0123456789c0123456789d (1 row) So far, so good. #select text(v) from t; text - 0123456789a0123456789b012345678 (1 row) Truncation occurs. Work around: # select v::text from t; ?column? -- 0123456789a0123456789b0123456789c0123456789d (1 row) I couldnt figure out what happens during a text(varchar) call. I looked around in pg_proc, but couldnt find the function. There's probably an automagic type conversion going on or something. Could someone explain what all the internal varchar-like types are (ie. varchar,varchar(n),text,char,_char,bpchar) and when they're used? I find it all really confusing - I'm sure others do too. Is there anyway to determine what postgresql is doing in its automagic function calls? I guess I'm asking for an EXPLAIN that describes function calls. For example, EXPLAIN select text(v) from t; -- {Description of conversion from varchar to whatever the text() function actually works on} Thanks, dave ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] BUG: text(varchar) truncates at 31 bytes
I can confirm this problem exists in current sources. Quite strange. #create table t (v varchar); #insert into t values ('0123456789a0123456789b0123456789c0123456789d'); #select v from t; v -- 0123456789a0123456789b0123456789c0123456789d (1 row) So far, so good. #select text(v) from t; text - 0123456789a0123456789b012345678 (1 row) Truncation occurs. Work around: # select v::text from t; ?column? -- 0123456789a0123456789b0123456789c0123456789d (1 row) I couldnt figure out what happens during a text(varchar) call. I looked around in pg_proc, but couldnt find the function. There's probably an automagic type conversion going on or something. Could someone explain what all the internal varchar-like types are (ie. varchar,varchar(n),text,char,_char,bpchar) and when they're used? I find it all really confusing - I'm sure others do too. Is there anyway to determine what postgresql is doing in its automagic function calls? I guess I'm asking for an EXPLAIN that describes function calls. For example, EXPLAIN select text(v) from t; -- {Description of conversion from varchar to whatever the text() function actually works on} Thanks, dave ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] BUG: text(varchar) truncates at 31 bytes
#select text(v) from t; text - 0123456789a0123456789b012345678 (1 row) Truncation occurs. Looking at the explain verbose output, it looks like it may be doing a conversion to name because it looks like there isn't a text(varchar), but there's a text(name) and a name(varchar). My guess is there's no text(varchar) because they're considered binary compatible. Work around: # select v::text from t; ?column? -- 0123456789a0123456789b0123456789c0123456789d (1 row) These types are probably marked as binary compatible, so nothing major has to happen in the type conversion. Same thing happens in CAST(v AS text). Is there anyway to determine what postgresql is doing in its automagic function calls? I guess I'm asking for an EXPLAIN that describes function calls. For example, EXPLAIN select text(v) from t; You can use EXPLAIN VERBOSE if you're willing to wade through the output. :) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Missing inserts
On Mar 02 Oct 2001 21:59, you wrote: In 7.1.X and earlier the INSERT rules are executed _before_ the INSERT. This is changed to _after_ in 7.2. This would mean...??? I haven´t had much trouble until now, so I can´t understand why one of the 4 inserts of the rule didn´t get through. Is there some logic? TIA! -- Porqué usar una base de datos relacional cualquiera, si podés usar PostgreSQL? - Martín Marqués |[EMAIL PROTECTED] Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral - ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Unicode combining characters
Tom Lane writes: In the meantime, we still have the question of whether to enable multibyte in the default configuration. This would make more sense if all of multibyte, locale, and NLS became defaults in one release. I haven't quite sold people in the second item yet, although I have a design how to do that (see below). And the third, well who knows... Perhaps we could make it a release goal for 7.3 to * Optimize i18n stuff to have a minimal performance penalty when it's not used. (locale=C etc.) * Make i18n stuff sufficiently well-behaved to make it the default. (Especially, add initdb options and GUC parameters to set the locale. Don't rely on environment variables -- too complicated.) Meanwhile, quadratic performance penalties (or so it seems) for LIKE expressions aren't exactly a minor problem. -- Peter Eisentraut [EMAIL PROTECTED] http://funkturm.homeip.net/~peter ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Unicode combining characters
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane writes: In the meantime, we still have the question of whether to enable multibyte in the default configuration. Perhaps we could make it a release goal for 7.3 Yeah, that's probably the best way to proceed... it's awfully late in the 7.2 cycle to be deciding to do this now... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Rounding issue with current_time
With current CVS, I did regression=# create table foo (f1 date default current_date, regression(# f2 time default current_time, regression(# f3 timestamp default current_timestamp); CREATE regression=# \d foo Table foo Column | Type |Modifiers +--+-- f1 | date | default date('now'::text) f2 | time | default time('now'::text) f3 | timestamp with time zone | default timestamp('now'::text) regression=# insert into foo default values; INSERT 139633 1 regression=# insert into foo default values; INSERT 139634 1 regression=# select * from foo; f1 |f2| f3 +--+ 2001-10-03 | 13:15:37 | 2001-10-03 13:15:37-04 2001-10-03 | 13:15:49 | 2001-10-03 13:15:50-04 (2 rows) It's fairly disconcerting that f2 and f3 don't agree, wouldn't you say? Further experimentation shows that it happens about half the time, with the timestamp always one second ahead of the time when they differ. I infer that the new sub-second-resolution transaction timestamp is being correctly rounded when stored as a timestamp, but is truncated not rounded when stored as a time. Type timetz shows the same misbehavior. Not sure where to look for this ... regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] COMMENT ON
Peter Eisentraut [EMAIL PROTECTED] writes: COMMENT ON AGGREGATE my_aggregate double precision IS 'Computes sample variance'; works but looks strange syntax. Should we fix the program or docs? I vote for fixing the program. If we fix this, we should also change DROP AGGREGATE, which also uses the paren-less syntax. (I think the COMMENT ON syntax was modeled on DROP.) I'd be in favor of changing, but we do need to maintain consistency. Another issue is that pg_dump knows about using both of these commands... we'll have a compatibility problem if we don't continue to accept the old syntax for awhile. All fixed ... regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] CEST timezone
Thanks for your reply. Yes, CEST equals to CETDST please add CEST as a synonym to existing timezone code (if it is possible) Done in my sources; will be committed soon. - Thomas ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] BUG: text(varchar) truncates at 31 bytes
Stephan Szabo [EMAIL PROTECTED] writes: Looking at the explain verbose output, it looks like it may be doing a conversion to name because it looks like there isn't a text(varchar), but there's a text(name) and a name(varchar). My guess is there's no text(varchar) because they're considered binary compatible. Since the truncation is to 31 characters, it seems clear that a conversion to name happened. I think the reason for this behavior is that the possibility of a freebie binary-compatible conversion is not considered until all else fails (see parse_func.c: it's only considered after func_get_detail fails). Unfortunately func_get_detail is willing to consider all sorts of implicit conversions, so these secondary possibilities end up being the chosen alternative. Perhaps it'd be a better idea for the option of a freebie conversion to be checked earlier, say immediately after we discover there is no exact match for the function name and input type. Thomas, what do you think? regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Rounding issue with current_time
Further experimentation: regression=# create table foo3 (f1 date default current_date, regression(# f2 time(3) default current_time, regression(# f3 timestamp(3) default current_timestamp); CREATE regression=# insert into foo3 default values; (multiple times) regression=# select * from foo3; f1 |f2| f3 +--+- 2001-10-03 | 13:32:07 | 2001-10-03 13:32:07-04 2001-10-03 | 13:32:08 | 2001-10-03 13:32:08.3020-04 2001-10-03 | 13:32:09 | 2001-10-03 13:32:09.4280-04 2001-10-03 | 13:32:10 | 2001-10-03 13:32:10.2530-04 2001-10-03 | 13:32:10 | 2001-10-03 13:32:10.8850-04 2001-10-03 | 13:32:11 | 2001-10-03 13:32:11.2930-04 2001-10-03 | 13:32:11 | 2001-10-03 13:32:11.6650-04 2001-10-03 | 13:32:12 | 2001-10-03 13:32:12.04-04 2001-10-03 | 13:32:13 | 2001-10-03 13:32:13.3730-04 (9 rows) So the real issue appears to be that subsecond resolution isn't propagating into time and timetz at all. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Unicode combining characters
Bruce Momjian [EMAIL PROTECTED] writes: Added to TODO: * Use wide characters to evaluate regular expressions, for performance (Tatsuo) Regexes are fine; it's LIKE that's slow. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] My last ECPG commit
... I use CVSup to keep a local copy of the cvs repository on my laptop, so I have a *full* development environment when I'm traveling or otherwise off line. I'd have a very hard time working without it... Yes, that was my original thinking too. But CVS could do the same. ? A local copy of the *repository*, not a checked out version of the tree. CVSup is too cool for words ;) Building CVSup from scratch is not trivial, since it requires the installation of Modula3. There are packages available for RH/Mandrake Linux. - Thomas ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] CEST timezone
Is it a bug or CEST timezone is not supported anymore ? I can't import my 7.1.2 database to current development version of postgresql afaik CEST was never supported by PostgreSQL. Can you please confirm that this is the same as CETDST (Central European Time, Daylight Savings Time) or, perhaps, CET (Central European Standard Time)? We can add CEST as a synonym once I understand what it is supposed to be ;) My guess is that your timezone database on your OS has changed; I've looked back to versions from the beginning of 2000 and didn't see any mention of CEST in the PostgreSQL code. - Thomas ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Rounding issue with current_time
... It's fairly disconcerting that f2 and f3 don't agree, wouldn't you say? :) I'll look at it. - Thomas ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Rounding issue with current_time
BTW, would you object to my removing the macros IS_BUILTIN_TYPE(), IS_HIGHER_TYPE(), IS_HIGHEST_TYPE() from parse_coerce.h? They are used nowhere and are not being maintained --- eg, they don't seem to know about TIMESTAMPTZ. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Unicode combining characters
I think that we were supposed to go beta a month ago, and so this is no time to start adding new features to this release. Let's plan to make this happen (one way or the other) in 7.3, instead. Agreed. -- Tatsuo Ishii ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] timestamp resolution?
Is this an expected behavior? I could not see why t1 and t2 are showing different time resolutions... Even stranger, this only happens on the first call to CURRENT_TIMESTAMP after starting a backend (example below), and stays that way if I just do select current_timestamp. Something must not be initialized quite right, but I don't know what. Any guesses? - Thomas (backend already connected and have just dropped t1) thomas=# create table t1 (d1 timestamp(2), d2 timestamp(2) default current_timestamp); CREATE thomas=# insert into t1 values (current_timestamp); INSERT 16572 1 thomas=# select * from t1; d1 |d2 ---+--- 2001-10-04 05:37:12.09+00 | 2001-10-04 05:37:12.09+00 (1 row) thomas=# \q myst$ psql ... thomas=# insert into t1 values (current_timestamp); INSERT 16573 1 thomas=# select * from t1; d1 |d2 ---+--- 2001-10-04 05:37:12.09+00 | 2001-10-04 05:37:12.09+00 2001-10-04 05:37:40+00| 2001-10-04 05:37:39.72+00 (2 rows) thomas=# insert into t1 values (current_timestamp); INSERT 16574 1 thomas=# select * from t1; d1 |d2 ---+--- 2001-10-04 05:37:12.09+00 | 2001-10-04 05:37:12.09+00 2001-10-04 05:37:40+00| 2001-10-04 05:37:39.72+00 2001-10-04 05:38:08.33+00 | 2001-10-04 05:38:08.33+00 (3 rows) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Unicode combining characters
Ok. I ran the modified test (now the iteration is reduced to 10 in liketest()). As you can see, there's huge difference. MB seems up to ~8 times slower:- There seems some problems existing in the implementation. Considering REGEX is not so slow, maybe we should employ the same design as REGEX. i.e. using wide charcters, not multibyte streams... Let me add I think our regex code is very slow. It is the standard BSD regex library by Henry Spencer. He rewrote it a few years ago for TCL 8.X and said he was working on a standalone library version. I have asked him several times via email over the years but he still has not released a standalone version of the new optimized regex code. It is on our TODO list. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Timestamp, fractional seconds problem
Problem: the external representation of time and timestamp are less precise than the internal representation. Fixed (as of yesterday) in the upcoming release. - Thomas ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] BUG: text(varchar) truncates at 31 bytes
... Sure, I said *after* we fail to find an exact match. But the freebie match is for a function name that matches a type name and is binary-compatible with the source type. That's not a weak constraint. ISTM that interpretation should take priority over interpretations that involve more than one level of transformation. Ah, OK I think. If there is a counterexample, it is probably no less obscure than this one. - Thomas ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]