Re: [HACKERS] Status report: regex replacement
I have just committed the latest version of Henry Spencer's regex package (lifted from Tcl 8.4.1) into CVS HEAD. This code is natively able to handle wide characters efficiently, and so it avoids the multibyte performance problems recently exhibited by Wade Klaver. I have not done extensive performance testing, but the new code seems at least as fast as the old, and much faster in some cases. I have tested the new regex with src/test/mb and it all passed. So the new code looks safe at least for EUC_CN, EUC_JP, EUC_KR, EUC_TW, MULE_INTERNAL, UNICODE, though the test does not include all possible regex patterns. -- Tatsuo Ishii ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] lock.h and proc.h
Sumaira Ali [EMAIL PROTECTED] writes: Hi, I have questions about struct pgproc (in file proc.h) and proclock (in file lock.h) of the PostgreSQL source code. Does anyone know the exact difference between the pgproc and proclock structs? There's one PGPROC per process. There's one PROCLOCK for each process and each lock that that process has any interest in (i.e., either currently holds or is waiting for). The comments for these structs seem to be a bit of a mess at the moment :-( Bruce renamed the struct types recently, but appears not to have done a good job of adjusting the comments to match. It may help to know that a proclock object was formerly called a holder. regards, tom lane
[HACKERS] disk pages, buffers and blocks
Hi, we'd like to know how disk pages map to disk blocks. In particular, looking at the code it seems that one page can be built on several disk blocks, while the first lines of bufpage.h say that a Postgres disk page is an abstraction layered on top of *a* Postgres disk block. As a matter of fact, it looks quite reasonable to have more than one block per page. We've also found that a Postgres buffer contains exactly one disk block, but we'd like to understand how pages, blocks and buffers relate to each other. Thank you very much for your help! regards, alice and lorena
Re: [HACKERS] disk pages, buffers and blocks
=?iso-8859-1?q?Alice=20Lottini?= [EMAIL PROTECTED] writes: we'd like to know how disk pages map to disk blocks. There is no real distinction between the concepts in Postgres --- page and block are interchangeable terms, and a buffer always holds exactly one of either. The number of filesystem blocks or physical disk sectors needed to hold a Postgres page is a different question, of course. Postgres does not actually care about that, at least not directly. (But for efficiency reasons you want a Postgres page to be a multiple of the disk sector size and filesystem block size, and probably not a very large multiple.) Not sure if that's relevant to your confusion or not. first lines of bufpage.h it is said that a postgres disk page is an abstraction layered on top of *a* postgres disk block. I think that was written about fifteen years back by a Comp Sci grad student overinfatuated with the notion of abstraction ;-). It is true that the storage manager pushes blocks around without caring much what is in them, but I see no real value in drawing a distinction between a block and a page. If you were to make such a distinction you might define: block = unit of I/O (between Postgres and the kernel, that is); page = unit within which space allocation is done for tuples. But it doesn't make any sense to use a page size that is different from the unit of I/O. Certainly there's no point in making it smaller (that would just restrict the size of tuples, to no purpose) and if you make it bigger then you have to worry about tuples that have only partially been written out. Also, the present design for WAL *requires* block == page in this sense, because the LSN timestamp in each page header is meant to indicate whether the page is up-to-date on disk, and so the unit of I/O has to be a page. regards, tom lane
[HACKERS] SMP + PostgreSQL in FreeBSD
Hi all, FreeBSD 5.0 was released recently. Some phrases from the release notes: "... SMP support has been largely reworked, incorporating code from BSD/OS 5.0. One of the main features of SMPng (``SMP Next Generation'') is to allow more processes to run in kernel, without the need for spin locks that can dramatically reduce the efficiency of multiple processors ...". Reading these release notes, this is the only great improvement over version 4.7 that I see which could help an SQL server. On the other hand, FreeBSD 5.0 has a totally redesigned kernel, so I'm afraid to put a production DB on it. We have bought a PC with 2xPIII which will be a dedicated PostgreSQL server. Old releases (4.7, for example) also support SMP, but worse than version 5, as described in the release notes quoted above. Has anybody tested SMP in FreeBSD for PostgreSQL - will Postgres on v5.0 really dramatically increase SQL server performance? Thanks a lot for any advice on this question. -- best regards, Ruslan A Dautkhanov
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Wed, 5 Feb 2003, Tom Lane wrote: [TL: Could be. By heritage I meant BSD-without-any-adjective. It is [TL: perfectly clear from Leffler, McKusick et al. (_The Design and [TL: Implementation of the 4.3BSD UNIX Operating System_) that back then, [TL: 8K was the standard filesystem block size. FS block size != Disk Buffer Size. Though 8k might have been the standard FS block size, it was possible -- and occasionally practiced -- to do 4k/512 filesystems, or 16k/2k filesystems, or M/N filesystems where { 4k <= M <= 16k (maybe 32k), log2(M) == int(log2(M)), log2(N) == int(log2(N)) and M/N = 8 }. --*greywolf; -- NetBSD: making all computer hardware a commodity.
Re: [HACKERS] POSIX regex performance bug in 7.3 Vs. 7.2
Tom Lane kirjutas K, 05.02.2003 kell 08:12: Hannu Krosing [EMAIL PROTECTED] writes: Another idea is to make special regex type and store the regexes pre-parsed (i.e. in some fast-load form)? Seems unlikely that going out to disk could beat just recompiling the regexp. We have to get _something_ from disk anyway. Currently we fetch regex source code, but if there were some format that is faster to load then that could be an option. -- Hannu Krosing [EMAIL PROTECTED]
Re: [HACKERS] [OpenFTS-general] relor and relkov
Hi! Me too. bye Uros On 31.01.2003 at 10:48:44, Caffeinate The World [EMAIL PROTECTED] wrote: But, we need help to create good documentation for tsearch! This is the main stopper for releasing tsearch. I am currently using tsearch. I'd be happy to help with documentation. -- Binary, adj.: Possessing the ability to have friends of both sexes.
Re: [HACKERS] COUNT and Performance ...
But pgstattuple does do a sequential scan of the table. You avoid a lot of the executor's tuple-pushing and plan-node-traversing machinery that way, but the I/O requirement is going to be exactly the same. I have tried it more often so that I can be sure that everything is in the cache. I thought it did some sort of stat on tables. Too bad :(. If people want to count ALL rows of a table, the contrib stuff is pretty useful. It seems to be transaction safe. Not entirely. pgstattuple uses HeapTupleSatisfiesNow(), which means you get a count of tuples that are committed good in terms of the effects of transactions committed up to the instant each tuple is examined. This is in general different from what count(*) would tell you, because it ignores snapshotting. It'd be quite unrepeatable too, in the face of active concurrent changes --- it's very possible for pgstattuple to count a single row twice or not at all, if it's being concurrently updated and the other transaction commits between the times pgstattuple sees the old and new versions of the row. Interesting. I have tried it with concurrent sessions and transactions - the results seemed to be right (I could not see the records inserted by open transactions). Too bad :(. It would have been a nice workaround. The performance boost is great (PostgreSQL 7.3, RedHat, 166 MHz). I think your test case is small enough that the whole table is resident in memory, so this measurement only accounts for CPU time per tuple and not any I/O. Given the small size of pgstattuple's per-tuple loop, the speed differential is not too surprising --- but it won't scale up to larger tables. Sometime it would be interesting to profile count(*) on large tables and see exactly where the CPU time goes. It might be possible to shave off some of the executor overhead ... regards, tom lane I have tried it with the largest table on my testing system. Reducing the overhead is great :).
Thanks a lot, Hans -- *Cybertec Geschwinde u Schoenig* Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/1/913 68 09; +43/664/233 90 75 www.postgresql.at http://www.postgresql.at, cluster.postgresql.at http://cluster.postgresql.at, www.cybertec.at http://www.cybertec.at, kernel.cybertec.at http://kernel.cybertec.at
Re: [HACKERS] plpython fails its regression test
I hate following up on my own email, especially to say I was wrong. In a previous message I said plpython passed the regression test here. It failed; I'll check it out over the weekend. However, Python version 2.2 and later will fail further tests because of the deprecation of rexec. Andrew
Re: [HACKERS] COUNT and Performance ...
For a more accurate view of the time used, use the \timing switch in psql. That leaves out the overhead of forking and loading psql, connecting to the database and such things. I think that it would be even nicer if PostgreSQL automatically chose to replace count(*)-with-no-where with something similar. Regards, Arjen

Hans-Jürgen Schönig wrote: This patch adds a note to the documentation describing why the performance of min() and max() is slow when applied to the entire table, and suggesting the simple workaround most experienced Pg users eventually learn about (SELECT xyz ... ORDER BY xyz LIMIT 1). Any suggestions on improving the wording of this section would be welcome. Cheers, -- ORDER BY and LIMIT work pretty fast (no seq scan). In special cases there can be another way to avoid seq scans:

action=# select tuple_count from pgstattuple('t_text');
 tuple_count
-------------
       14203
(1 row)

action=# BEGIN;
BEGIN
action=# insert into t_text (suchid) VALUES ('10');
INSERT 578606 1
action=# select tuple_count from pgstattuple('t_text');
 tuple_count
-------------
       14204
(1 row)

action=# ROLLBACK;
ROLLBACK
action=# select tuple_count from pgstattuple('t_text');
 tuple_count
-------------
       14203
(1 row)

If people want to count ALL rows of a table, the contrib stuff is pretty useful. It seems to be transaction safe. The performance boost is great (PostgreSQL 7.3, RedHat, 166 MHz):

root@actionscouts:~# time psql action -c "select tuple_count from pgstattuple('t_text');"
 tuple_count
-------------
       14203
(1 row)

real    0m0.266s
user    0m0.030s
sys     0m0.020s

root@actionscouts:~# time psql action -c "select count(*) from t_text"
 count
-------
 14203
(1 row)

real    0m0.701s
user    0m0.040s
sys     0m0.010s

I think that this could be a good workaround for huge counts (maybe millions of records) with no where clause and no joins. Hans http://kernel.cybertec.at
[HACKERS] 7.2 result sets and plpgsql
I've had a good look, to no avail. Can someone please answer me this: Can plpgsql functions be used to return multiple result sets in version 7.2 at all, or is this only a feature enabled in 7.3? If it is possible in 7.2, can you please give me an example that would return multiple rows.
[HACKERS] lo_in: error in parsing
I am using PostgreSQL 7.1, with the jdbc7.1-1.3.jar file. I am trying to send a Large Object to the database but get an error saying 'lo_in: error in parsing dump of binary data is following'. The offending statement is 'p.setBinaryStream(1, bis, size);' where bis is an instance of DataInputStream and p is a PreparedStatement. The exact same code runs beautifully under Oracle, but throws this exception under PostgreSQL. I have followed the documentation to the letter so I don't see why it throws the exception. The field in the table is of type 'lo', which the documentation uses. Any tips? Regards. -- Luca Saccarola Key ID: 0x4A7A51F7 (c/o keyservers)
Re: [HACKERS] PGP signing releases
On Tue, 2003-02-04 at 18:27, Curt Sampson wrote: On Tue, 2003-02-04 at 16:13, Kurt Roeckx wrote: On Tue, Feb 04, 2003 at 02:04:01PM -0600, Greg Copeland wrote: Even improperly used, digital signatures should never be worse than simple checksums. Having said that, anyone that is trusting checksums as a form of authenticity validation is begging for trouble. Should I point out that a fingerprint is nothing more than a hash? Since someone already mentioned MD5 checksums of tar files versus PGP key fingerprints, perhaps things will become a bit clearer here if I point out that the important point is not that these are both hashes of some data, but that the time and means of acquisition of that hash are entirely different between the two. And that it creates a verifiable chain of entities with direct associations to people and hopefully, email addresses. Meaning, it opens the door for rapid authentication and validation of each entity and associated person involved. Again, something a simple MD5 hash does not do or even allow for. Perhaps even more importantly, it opens the door for rapid detection of corruption in the system thanks to revocation certificates/keys. In turn, allows for rapid repair in the event that the worst is realized. Again, something a simple MD5 does not assist with in the least. Thanks Curt. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting
Re: [HACKERS] Irix 6.2, Postgres 7.3.1, some brokenness
Disregard previous. Using /bin/ld (with LDREL = -r) works fine as a linker. Call it force of habit. Is it worth warning the user that you cannot use gcc as ld on Irix? I used it because I figured I would need gnu ld (which I of course didn't have). Anyhow, 7.3.1 is successfully built. Alex -- alex avriette $^X is my programming language of choice. [EMAIL PROTECTED]
Re: [HACKERS] Status report: regex replacement
On Fri, 7 Feb 2003 00:49, Hannu Krosing wrote: Tatsuo Ishii kirjutas N, 06.02.2003 kell 17:05: Perhaps we should not call the encoding UNICODE but UTF8 (which it really is). UNICODE is a character set which has half a dozen official encodings and calling one of them UNICODE does not make things very clear. Right. Also we perhaps should call LATIN1 or ISO-8859-1 more precisely way since ISO-8859-1 can be encoded in either 7 bit or 8 bit(we use this). I don't know what it is called though. I don't think that calling 8-bit ISO-8859-1 ISO-8859-1 can confuse anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used. UTF-8 seems to be the most popular, but even XML standard requires all compliant implementations to deal with at least both UTF-8 and UTF-16. Strong agreement from me, for whatever value you wish to place on my opinion. UTF-8 is a preferable name to UNICODE. The case for distinguishing 7-bit from 8-bit latin1 seems much weaker. Tim -- --- Tim Allen [EMAIL PROTECTED] Proximity Pty Ltd http://www.proximity.com.au/ http://www4.tpg.com.au/users/rita_tim/
Re: [HACKERS] 7.2 result sets and plpgsql
It's a 7.3 feature only. Chris

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of mail.luckydigital.com
Sent: Sunday, 2 February 2003 2:19 PM
To: [EMAIL PROTECTED]
Subject: [HACKERS] 7.2 result sets and plpgsql

I've had a good look, to no avail. Can someone please answer me this: Can plpgsql functions be used to return multiple result sets in version 7.2 at all, or is this only a feature enabled in 7.3? If it is possible in 7.2, can you please give me an example that would return multiple rows.
[HACKERS] Wrong charset mappings
Hi all, One Japanese character has been causing my head to swim lately. I've finally tracked down the problem to both Java 1.3 and PostgreSQL. The problem character is: utf-16: 0x301C, utf-8: 0xE3809C, SJIS: 0x8160, EUC_JP: 0xA1C1. Otherwise known as the WAVE DASH character. The confusion stems from a very similar character, 0xFF5E (utf-16) or 0xEFBD9E (utf-8), the FULLWIDTH TILDE. Java has just lately (1.4.1) finally fixed their mappings so that 0x301C maps correctly to both the correct SJIS and EUC-JP character. Previously (at least in 1.3.1) they mapped SJIS to 0xFF5E and EUC to 0x301C, causing all sorts of trouble. PostgreSQL at least picked one of the two characters, namely 0xFF5E, so conversions in and out of the database to/from sjis/euc seemed to be working. The problem is when you try to view utf-8 from the database, or if you read the data into Java (utf-16) and try converting to euc or sjis from there. Anyway, I think PostgreSQL needs to be fixed for this character. In my opinion what needs to be done is to change the mappings:

euc-jp -> utf-8   -> euc-jp: 0xA1C1 -> 0xE3809C -> 0xA1C1
sjis   -> utf-8   -> sjis:   0x8160 -> 0xE3809C -> 0x8160

As to what to do with the current mapping of 0xEFBD9E (utf-8)? It probably should be removed. Maybe you could keep the mapping back to the sjis/euc characters to help backward compatibility though. I'm not sure what is the correct approach there. If anyone can tell me how to edit the mappings under src/backend/utils/mb/Unicode/ and rebuild Postgres to use them, then I can test this out locally. Looking forward to your replies. Tom.
[HACKERS] Planning a change of representation in the planner
Currently, the planner spends a good deal of time pushing around lists of small integers, because it uses such lists to identify join relations. For example, given SELECT ... FROM a, b, c WHERE ... the list (1,2) (or equivalently (2,1)) would represent the join of a and b. This representation is pretty clearly a hangover from the days when the Postgres planner was written in Lisp :-(. It's inefficient --- common operations like union, intersection, and is-subset tests require O(N^2) steps. And it's error-prone: I just had my nose rubbed once again in the nasty things that happen if you accidentally get some duplicate entries in a relation ID list. (It's nasty because some but not all of the low-level list-as-set operations depend on the assumption that the elements of a given list are distinct.) I'm thinking of replacing this representation by a variable-length-bitmap representation. Basically it would be like

struct bitmapset
{
    int nwords;     /* number of words in array */
    int array[1];   /* really [nwords] */
};

Each array element would hold 32 bits; the integer i is a member of the set iff ((array[i/32] >> (i % 32)) & 1) == 1. For sets containing no elements over 31 (which would account for the vast majority of queries) only a single word would be needed in the array. Operations like set union, intersection, and subset test could process 32 bits at a time --- they'd reduce to trivial C operations like |=, &=, ~, applied once per word. There would be a few things that would be slower (like iterating through the actual integer elements of a set) but AFAICT this representation is much better suited to the planner's needs than the list method. I've been thinking of doing this for a while just on efficiency grounds, but kept putting it off because I don't expect much of any performance gain on simple queries. (You need a dozen or so tables in a query before the inefficiencies of the list representation really start to hurt.)
But tonight I'm thinking I'll do it anyway, because it'll also be impervious to duplicate-element bugs. Comments? regards, tom lane
Re: [HACKERS] [OpenFTS-general] relor and relkov
Nice ! We'll send you archive with new tsearch and short info, so you could test it and write documentation. I have a live DB, is it possible to install the new alpha tsearch module w/o conflicting with the existing production one? Can you install it to a different schema? Chris
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Wed, Feb 05, 2003 at 12:18:29PM -0500, Tom Lane wrote: D'Arcy J.M. Cain [EMAIL PROTECTED] writes: On Wednesday 05 February 2003 11:49, Tom Lane wrote: I wonder if it is possible that, every so often, you are losing just the last few bytes of an NFS transfer? Yah, that's kind of what it looked like when I tried this before Christmas too, although the actual errors differed. Wild thought here: can you reduce the MTU on the LAN linking the NFS server to the NetBSD box? If so, does it help? How about adjusting the read and write size used by the NetBSD machine? I think the default is 32k for both read and write on i386 machines now. Perhaps try setting them back to 8k (it's the -r and -w flags to mount_nfs, IIRC). Ian.
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Wed, 5 Feb 2003, D'Arcy J.M. Cain wrote: [DJC: This feels rather fragile. I doubt that it is hardware related because I had [DJC: tried it on the other ethernet interface in the machine which was on a [DJC: completely different network than the one I am on now. All I can offer up is that at one point I had to reduce to 16k NFSIO when I replaced a switch (you didn't replace a switch, did you?) between my i386 and my sparc (my le0 and the switch didn't play nicely together; once I got the hme0 in, everything was happy as a clam). [DJC: What is the implication of smaller read and write size? Will I [DJC: necessarily take a performance hit? I didn't start noticing observable degradation across 100TX until I dropped NFSIO to 4k (which I did purely for benchmarking statistics). The differences between 8k, 16k and 32k have not been noticeable to me. 32k IO would hang my system at one point; since that time, something appears to have been fixed. [DJC: -- [DJC: D'Arcy J.M. Cain darcy@{druid|vex}.net | Democracy is three wolves [DJC: http://www.druid.net/darcy/| and a sheep voting on [DJC: +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. [DJC: --*greywolf; -- NetBSD: Servers' choice!
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Wed, Feb 05, 2003 at 03:09:09PM -0500, Tom Lane wrote: D'Arcy J.M. Cain [EMAIL PROTECTED] writes: On Wednesday 05 February 2003 13:04, Ian Fry wrote: How about adjusting the read and write-size used by the NetBSD machine? I think the default is 32k for both read and write on i386 machines now. Perhaps try setting them back to 8k (it's the -r and -w flags to mount_nfs, IIRC) Hey! That did it. Hot diggety! So, why does this fix it? Who knows. One thing that I'd be interested to know is whether Darcy is using NFSv2 or NFSv3 -- 32k requests are not, strictly speaking, within the bounds of the v2 specification. If he is using UDP rather than TCP as the transport layer, another potential issue is that 32K requests will end up as IP packets with a very large number of fragments, potentially exposing some kind of network stack bug in which the last fragment is dropped or corrupted (I would suspect that the likelihood of such a bug in the NetApp stack is quite low, however). If feasible, it is probably better to use TCP as the transport and let it handle segmentation whether the request size is 8K or 32K. I think now you file a bug report with the NetBSD kernel folk. My thoughts are running in the direction of a bug having to do with scattering a 32K read into multiple kernel disk-cache buffers or gathering together multiple cache buffer contents to form a 32K write. That doesn't make much sense to me. Pages on i386 are 4K, so whether he does 8K writes or 32K writes, it will always come from multiple pages in the pagecache. Unless NetBSD has changed from its heritage, the kernel disk cache buffers are 8K, and so an 8K NFS read or write would never cross a cache buffer boundary. But 32K would. I don't know what heritage you're referring to, but it has never been the case that NetBSD's buffer cache has used fixed-size 8K disk buffers, and I don't believe that it was ever the case for any Net2 or 4.4-derived system. Or it could be a similar bug on the NFS server's side? 
That's conceivable. Of course, a client bug is quite possible as well, but I don't think the mechanism you suggest is likely. -- Thor Lancelot Simon [EMAIL PROTECTED] But as he knew no bad language, he had called him all the names of common objects that he could think of, and had screamed: You lamp! You towel! You plate! and so on. --Sigmund Freud
Re: [HACKERS] Status report: regex replacement
On Thu, 2003-02-06 at 13:25, Tatsuo Ishii wrote: I have just committed the latest version of Henry Spencer's regex package (lifted from Tcl 8.4.1) into CVS HEAD. This code is natively able to handle wide characters efficiently, and so it avoids the multibyte performance problems recently exhibited by Wade Klaver. I have not done extensive performance testing, but the new code seems at least as fast as the old, and much faster in some cases. I have tested the new regex with src/test/mb and it all passed. So the new code looks safe at least for EUC_CN, EUC_JP, EUC_KR, EUC_TW, MULE_INTERNAL, UNICODE, though the test does not include all possible regex patterns. Perhaps we should not call the encoding UNICODE but UTF8 (which it really is). UNICODE is a character set which has half a dozen official encodings and calling one of them UNICODE does not make things very clear. -- Hannu Krosing [EMAIL PROTECTED]
Re: [HACKERS] Status report: regex replacement
Perhaps we should not call the encoding UNICODE but UTF8 (which it really is). UNICODE is a character set which has half a dozen official encodings and calling one of them UNICODE does not make things very clear. Right. Also we perhaps should call LATIN1 or ISO-8859-1 more precisely way since ISO-8859-1 can be encoded in either 7 bit or 8 bit(we use this). I don't know what it is called though. -- Tatsuo Ishii
Re: [HACKERS] Status report: regex replacement
Tatsuo Ishii kirjutas N, 06.02.2003 kell 17:05: Perhaps we should not call the encoding UNICODE but UTF8 (which it really is). UNICODE is a character set which has half a dozen official encodings and calling one of them UNICODE does not make things very clear. Right. Also we perhaps should call LATIN1 or ISO-8859-1 more precisely way since ISO-8859-1 can be encoded in either 7 bit or 8 bit(we use this). I don't know what it is called though. I don't think that calling 8-bit ISO-8859-1 ISO-8859-1 can confuse anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used. UTF-8 seems to be the most popular, but even XML standard requires all compliant implementations to deal with at least both UTF-8 and UTF-16. -- Hannu Krosing [EMAIL PROTECTED]
Re: [HACKERS] [GENERAL] databases limit
On Thu, Feb 06, 2003 at 12:30:03AM -0500, Tom Lane wrote: I have a feeling that what the questioner really means is how can I limit the resources consumed by any one database user? In which case (I'm moving this to -hackers 'cause I think it likely belongs there.) I note that this question has come up before, and several people have been sceptical of its utility. In particular, in this thread http://groups.google.ca/groups?hl=enlr=ie=UTF-8threadm=Pine.LNX.4.21.0212221510560.15719-10%40linuxworld.com.aurnum=1prev=/groups%3Fq%3Dlimit%2Bresources%2B%2Bgroup:comp.databases.postgresql.*%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3DPine.LNX.4.21.0212221510560.15719-10%2540linuxworld.com.au%26rnum%3D1 (sorry about the long line: I just get errors searching at the official archives) Tom Lane notes that you could just run another back end to make things more secure. That much is true; but I'm wondering whether it might be worth it to limit how much a _database_ can use. For instance, suppose I have a number of databases which are likely to see sporadic heavy loads. There are limitations on how slow the response can be. So I have to do some work to guarantee that, for instance, certain tables from each database don't get flushed from the buffers. I can do this now by setting up separate postmasters. That way, each gets its own shared memory segment. Those certain tables will be ones that are frequently accessed, and so they'll always remain in the buffer, even if the other database is busy (because the two databases don't share a buffer). (I'm imagining the case -- not totally imaginary -- where one of the databases tends to be accessed heavily during one part of a 24 hour day, and another database gets hit more on another part of the same day.) The problem with this scenario is that it makes administration somewhat awkward as soon as you have to do this 5 or 6 times. 
I was thinking that it might be nice to be able to limit how much of the total resources a given database can consume. If one database were really busy, that would not mean that other databases would automatically be more sluggish, because they would still have some guaranteed minimum percentage of the total resources. So, anyone care to speculate? -- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada [EMAIL PROTECTED] M2P 2A8 +1 416 646 3304 x110
Re: [HACKERS] PostgreSQL v7.3.2 Released -- Permission denied from pretty much everywhere
Joshua D. Drake [EMAIL PROTECTED] writes: Been trying to test the latest source but the following places give permission denied when trying to download: ftp.postgresql.org ftp.us.postgresql.org ftp2.us.postgresql.org mirror.ac.uk I just started a download from ftp.us.postgresql.org, and it seems to be working fine. We've not heard other complaints, either. Sure the problem's not on your end? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Why is lc_messages restricted?
Is there a reason why lc_messages is PGC_SUSET, and not PGC_USERSET? I can't see any security rationale for restricting it. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] PostgreSQL v7.3.2 Released -- Permission denied from pretty much everywhere
Hello folks, Been trying to test the latest source but the following places give permission denied when trying to download: ftp.postgresql.org ftp.us.postgresql.org ftp2.us.postgresql.org mirror.ac.uk Anybody got one that works? J Oliver Elphick wrote: On Wed, 2003-02-05 at 20:41, Laurette Cisneros wrote: I was trying from the postgresql.org download web page and following the mirror links there...and none of them that I was able to get to (some of them didn't work) showed 7.3.2. I got it from mirror.ac.uk yesterday
Re: [HACKERS] PostgreSQL v7.3.2 Released -- Permission denied from pretty much
Hello, Pardon me while I pull my boot out of various dark places. It has been a very long week. I got it. Thanks. Sincerely, Joshua Drake Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Been trying to test the latest source but the following places give permission denied when trying to download: ftp.postgresql.org ftp.us.postgresql.org ftp2.us.postgresql.org mirror.ac.uk I just started a download from ftp.us.postgresql.org, and it seems to be working fine. We've not heard other complaints, either. Sure the problem's not on your end? regards, tom lane
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Wed, Feb 05, 2003 at 09:24:48PM +, David Laight wrote: If he is using UDP rather than TCP as the transport layer, another potential issue is that 32K requests will end up as IP packets with a very large number of fragments, potentially exposing some kind of network stack bug in which the last fragment is dropped or corrupted. Actually it is worse than that, and IMHO 32k UDP requests are asking for trouble. A 32k UDP datagram is about 22 ethernet packets. If ANY of them is lost on the network, then the entire datagram is lost. NFS must regenerate the request on a timeout. The receiving system won't report that it is missing a fragment. As he stated several times, he has tested with TCP mounts and observed the same issue. So the above issue shouldn't be related. There are also a lot of ethernet cards out there which don't have enough buffer space for 32k of receive data. Not to mention the fact that NFS can easily (at least on some systems) generate concurrent requests for different parts of the same file. I would suggest reducing the size back to 8k; even that causes trouble with some cards. If NetBSD as an NFS client is this fragile we have problems. The default read/write size shouldn't be 32kB if that is not going to work reliably. It should also be realised that transmitting 22 full sized, back to back frames on the ethernet doesn't do anything for sharing the bandwidth between different users. The MAC layer has to be very aggressive in order to get a packet in edgeways (so to speak). So what? If it is a switched network, which I assume it is since he was talking to the NetApp gigabit port earlier, then this is irrelevant. Even the $40 Fry's switches are more or less non-blocking. Even if he is saturating the local *hub*, it shouldn't cause NetBSD to fail, it would just be rude. :-) There could be some packet mangling on the network, checking the amount of retransmissions on either end of the TCP connection should give you an idea about that. 
-Andrew
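The fragmentation arithmetic in the message above works out as follows (a back-of-the-envelope sketch; the MTU figures are standard Ethernet assumptions and the frame-loss rate is hypothetical, neither taken from the thread):

```python
# Why a 32 kB UDP NFS request is fragile: IP reassembly needs every
# fragment, so losing any single Ethernet frame loses the whole
# datagram and forces NFS to retransmit the entire request.

import math

MTU = 1500
IP_HEADER = 20
FRAG_PAYLOAD = MTU - IP_HEADER   # 1480 datagram bytes carried per frame

def fragments(datagram_bytes):
    return math.ceil(datagram_bytes / FRAG_PAYLOAD)

def p_datagram_lost(datagram_bytes, p_frame_loss):
    n = fragments(datagram_bytes)
    return 1 - (1 - p_frame_loss) ** n

n = fragments(32 * 1024)                  # roughly the "22 packets" above
risk = p_datagram_lost(32 * 1024, 0.001)  # hypothetical 0.1% frame loss
```

So even a modest 0.1% per-frame loss rate multiplies into a per-request loss of a couple of percent, each one costing a full timeout and a 32 kB retransmission, which matches the "asking for trouble" assessment.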
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On February 06, 2003 at 03:50, Justin Clift wrote: Tom Lane wrote: snip Hoo boy. I was already suspecting data corruption in the index, and this looks like more of the same. My thoughts are definitely straying in the direction of the NFS server is dropping bits, somehow. Both this and the (admittedly unproven) bt_moveright loop suggest corrupted values in the cross-page links that exist at the very end of each btree index page. I wonder if it is possible that, every so often, you are losing just the last few bytes of an NFS transfer? Hmmm... does anyone remember the name of that NFS testing tool the FreeBSD guys were using? Think it came from Apple. They used it to find and isolate bugs in the FreeBSD code a while ago. Sounds like it might be useful here. :-) fsx. See also http://www.connectathon.org hth, Byron
Re: [HACKERS] PostgreSQL, NetBSD and NFS
[ On Friday, January 31, 2003 at 11:54:27 (-0500), D'Arcy J.M. Cain wrote: ] Subject: Re: PostgreSQL, NetBSD and NFS On Thursday 30 January 2003 18:32, Simon J. Gerraty wrote: Is postgreSQL trying to lock a file perhaps? Would seem a sensible thing for it to be doing... Is that a problem? FWIW I am running statd and lockd on the NetBSD box. NetBSD's NFS implementation only supports locking as a _server_, not a client. http://www.unixcircle.com/features/nfs.php Optional for file locking (lockd+statd): lockd: Rpc.lockd is a daemon which provides file and record-locking services in an NFS environment. FreeBSD, NetBSD and OpenBSD file locking is only supported on server side. NFS server support for locking was introduced in NetBSD-1.5: http://www.netbsd.org/Releases/formal-1.5/NetBSD-1.5.html * Server part of NFS locking (implemented by rpc.lockd(8)) now works. and as you can also see from rpc.lockd/lockd.c: revision 1.5 date: 2000/06/07 14:34:40; author: bouyer; state: Exp; lines: +67 -25 Implement file locking in lockd. All the stuff is done in userland, using fhopen() and flock(). This means that if you kill lockd, all locks will be released (but you're supposed to kill statd at the same time, so remote hosts will know it and re-establish the lock). Tested against solaris 2.7 and linux 2.2.14 clients. Shared locks are not handled efficiently; they're serialised in lockd when they could be granted. Terry Lambert has some proposed fixes to add NFS client level locking to the FreeBSD kernel: http://www.freebsd.org/~terry/DIFF.LOCKS.txt http://www.freebsd.org/~terry/DIFF.LOCKS.MAN http://www.freebsd.org/~terry/DIFF.LOCKS -- Greg A. Woods +1 416 218-0098;[EMAIL PROTECTED]; [EMAIL PROTECTED] Planix, Inc. [EMAIL PROTECTED]; VE3TCP; Secrets of the Weird [EMAIL PROTECTED]
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Thu, Jan 30, 2003 at 01:27:59PM -0600, Greg Copeland wrote: That was going to be my question too. I thought NFS didn't have some of the requisite file system behaviors (locking, flushing, etc. IIRC) for PostgreSQL to function correctly or reliably. I don't know what locking scheme PostgreSQL uses, but in theory it should be possible to use it over NFS: - a fflush()/msync() should work the same way on a NFS filesystem as on a local filesystem, provided the client and server implement the NFS protocol properly - locking via temp files works over NFS, again provided the client and server implement the NFS protocol properly (this is why you can safely read your mailbox over NFS, for example). If PostgreSQL uses flock or fcntl, it's a problem. -- Manuel Bouyer [EMAIL PROTECTED] NetBSD: 24 ans d'experience feront toujours la difference --
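The "locking via temp files" idiom mentioned above is the classic mailbox-locking trick. A sketch of it (file naming is illustrative; the key points are that link(2) is atomic on the NFS server, and that the link count is re-checked because an NFS reply can be lost and the retransmitted request can make link() report a spurious failure):

```python
# NFS-safe lock-file acquisition via a uniquely named temp file plus
# an atomic hard link, instead of O_EXCL or flock()/fcntl(), which
# were not reliable over NFS.

import os
import socket

def acquire_nfs_lock(lockname):
    # Unique per host and process so two clients never collide here.
    tmp = f"{lockname}.{socket.gethostname()}.{os.getpid()}"
    fd = os.open(tmp, os.O_CREAT | os.O_WRONLY, 0o644)
    os.close(fd)
    try:
        try:
            os.link(tmp, lockname)   # atomic on the NFS server
        except OSError:
            pass                     # may be a lost-reply artifact
        # The lock is ours iff the temp file now has two names.
        return os.stat(tmp).st_nlink == 2
    finally:
        os.unlink(tmp)

def release_nfs_lock(lockname):
    os.unlink(lockname)
```

A second caller finds link() failing and the link count still at one, so it backs off; the winner removes the lock name to release it.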
Re: [HACKERS] [OpenFTS-general] relor and relkov
--- Oleg Bartunov [EMAIL PROTECTED] wrote: On Fri, 31 Jan 2003, Caffeinate The World wrote: But, we need help to create good documentation for tsearch ! This is main stopper for releasing of tsearch. I am currently using tsearch. I'd be happy to help with documentation. Nice ! We'll send you archive with new tsearch and short info, so you could test it and write documentation. I have a live DB, is it possible to install the new alpha tsearch module w/o conflicting with the existing production one?
Re: [HACKERS] Status report: regex replacement
Right. Also, perhaps we should refer to LATIN1 or ISO-8859-1 in a more precise way, since ISO-8859-1 can be encoded in either 7 bit or 8 bit (we use the latter). I don't know what it is called though. I don't think that calling 8-bit ISO-8859-1 "ISO-8859-1" can confuse anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used. I just pointed out that ISO-8859-1 is *not* an encoding, but a character set. UTF-8 seems to be the most popular, but even the XML standard requires all compliant implementations to deal with at least both UTF-8 and UTF-16. I don't think PostgreSQL is going to natively support UTF-16. -- Tatsuo Ishii
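The character-set-versus-encoding distinction in the exchange above can be made concrete: the same ISO-8859-1 character occupies one byte in the Latin-1 encoding, two bytes in UTF-8, and one two-byte code unit in big-endian UTF-16 (the example character is arbitrary):

```python
# One character from the ISO-8859-1 repertoire, three wire encodings.

ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, in ISO-8859-1

latin1 = ch.encode("iso-8859-1")   # single 8-bit byte
utf8 = ch.encode("utf-8")          # two bytes
utf16 = ch.encode("utf-16-be")     # one 16-bit (UCS-2) code unit
```

All three byte sequences denote the same abstract character, which is exactly why "ISO-8859-1" names a character set while UTF-8 and UTF-16 name encodings.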