Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tatsuo Ishii
> I have just committed the latest version of Henry Spencer's regex
> package (lifted from Tcl 8.4.1) into CVS HEAD.  This code is natively
> able to handle wide characters efficiently, and so it avoids the
> multibyte performance problems recently exhibited by Wade Klaver.
> I have not done extensive performance testing, but the new code seems
> at least as fast as the old, and much faster in some cases.

I have tested the new regex with src/test/mb and it all passed. So the
new code looks safe at least for EUC_CN, EUC_JP, EUC_KR, EUC_TW,
MULE_INTERNAL, UNICODE, though the test does not include all possible
regex patterns.
--
Tatsuo Ishii




Re: [HACKERS] lock.h and proc.h

2003-02-06 Thread Tom Lane
Sumaira Ali [EMAIL PROTECTED] writes:
> hi.. i have questions about struct pgproc (in file proc.h) and proclock (in
> file lock.h) of the postgresql source code, does anyone know the exact
> difference between pgproc and proclock structs??

There's one PGPROC per process.  There's one PROCLOCK for each process
and each lock that that process has any interest in (ie, either
currently holds or is waiting for).

The comments for these structs seem to be a bit of a mess at the moment :-(
Bruce renamed the struct types recently, but appears not to have done a
good job of adjusting the comments to match.  It may help to know that
a proclock object was formerly called a holder.
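
For illustration, a rough sketch of that relationship in C; the field names
here are made up for clarity, not the actual contents of proc.h and lock.h:

    typedef struct LOCK LOCK;   /* some lockable object */

    typedef struct PGPROC
    {
        int     pid;            /* exactly one PGPROC per backend process */
        /* ... semaphore, wait-queue links, etc. ... */
    } PGPROC;

    typedef struct PROCLOCK     /* formerly called a "holder" */
    {
        PGPROC *proc;           /* the interested process ... */
        LOCK   *lock;           /* ... and one lock it holds or awaits */
        int     holdmask;       /* which lock modes are currently held */
    } PROCLOCK;

So a backend that touches N distinct locks owns one PGPROC and N PROCLOCKs.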

regards, tom lane




[HACKERS] disk pages, buffers and blocks

2003-02-06 Thread Alice Lottini
Hi,
we'd like to know how disk pages map to disk blocks.
In particular, looking at the code it seems that one
page can be built on several disk blocks while in the
first lines of bufpage.h it is said that a postgres
disk page is an abstraction layered on top of *a*
postgres disk block.
As a matter of fact, it looks quite reasonable to have
more than a block per page.
We've also found out that a postgres buffer contains
exactly one disk block, but we'd like to understand
how pages, blocks and buffers relate to each other.

Thank you very much for your help!
regards, alice and lorena 






Re: [HACKERS] disk pages, buffers and blocks

2003-02-06 Thread Tom Lane
Alice Lottini [EMAIL PROTECTED] writes:
> we'd like to know how disk pages map to disk blocks.

There is no real distinction between the concepts in Postgres ---
page and block are interchangeable terms, and a buffer always
holds exactly one of either.

The number of filesystem blocks or physical disk sectors needed to hold
a Postgres page is a different question, of course.  Postgres does not
actually care about that, at least not directly.  (But for efficiency
reasons you want a Postgres page to be a multiple of the disk sector
size and filesystem block size, and probably not a very large multiple.)
Not sure if that's relevant to your confusion or not.

> first lines of bufpage.h it is said that a postgres
> disk page is an abstraction layered on top of *a*
> postgres disk block.

I think that was written about fifteen years back by a Comp Sci grad
student overinfatuated with the notion of abstraction ;-).  It is true
that the storage manager pushes blocks around without caring much what
is in them, but I see no real value in drawing a distinction between
a block and a page.  If you were to make such a distinction you might
define
block = unit of I/O (between Postgres and the kernel, that is)
page = unit within which space allocation is done for tuples

But it doesn't make any sense to use a page size that is different from
the unit of I/O.  Certainly there's no point in making it smaller
(that would just restrict the size of tuples, to no purpose) and if
you make it bigger then you have to worry about tuples that have only
partially been written out.

Also, the present design for WAL *requires* block == page in this sense,
because the LSN timestamp in each page header is meant to indicate
whether the page is up-to-date on disk, and so the unit of I/O has to be
a page.
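
For illustration, a minimal sketch of "the unit of I/O is a page", with
BLCKSZ standing in for the compile-time page size and a hypothetical
function name:

    #include <stdio.h>

    #define BLCKSZ 8192     /* one page == one block == one unit of I/O */

    static int
    read_block(FILE *rel, long blocknum, char *page)
    {
        if (fseek(rel, blocknum * (long) BLCKSZ, SEEK_SET) != 0)
            return -1;
        /* exactly one page is transferred at a time, so the LSN in the
         * page header describes the whole unit that hit the disk */
        return fread(page, 1, BLCKSZ, rel) == BLCKSZ ? 0 : -1;
    }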

regards, tom lane




[HACKERS] SMP + PostgreSQL in FreeBSD

2003-02-06 Thread Ruslan A Dautkhanov
Hi all,

   FreeBSD 5.0 was released recently. Some phrases from the release notes:
 . . .   SMP support has been largely reworked, incorporating code from 
   BSD/OS 5.0. One of the main features of SMPng (``SMP Next Generation'')
   is to allow more processes to run in kernel, without the need for 
   spin locks that can dramatically reduce the efficiency of multiple
   processors   . . ..

   Reading these release notes, I see only this one great improvement over
 version 4.7 that could help an SQL server. On the other hand, FreeBSD 5.0
 has a totally redesigned kernel, so I'm afraid to put a production DB on it.
   We have bought a PC with 2xPIII, which will be a dedicated PostgreSQL server.
 Older releases (4.7, for example) also support SMP, but less well than
 version 5, as described in the release notes quoted above. Please say
 if anybody has tested SMP in FreeBSD with PostgreSQL: will Postgres on v5.0
 really give dramatically better SQL server performance?


   Thanks a lot for any advice on this question.



 --
 best regards,
 Ruslan A Dautkhanov




Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Greywolf
On Wed, 5 Feb 2003, Tom Lane wrote:

[TL: Could be.  By heritage I meant BSD-without-any-adjective.  It is
[TL: perfectly clear from Leffler, McKusick et al. (_The Design and
[TL: Implementation of the 4.3BSD UNIX Operating System_) that back then,
[TL: 8K was the standard filesystem block size.

FS block size != Disk Buffer Size.  Though 8k might have been the
standard FS block size, it was possible -- and occasionally practiced
-- to do 4k/512 filesystems, or 16k/2k filesystems, or M/N filesystems
where { 4k <= M <= 16k (maybe 32k), log2(M) == int(log2(M)),
log2(N) == int(log2(N)) and M/N <= 8 }.
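
A quick sketch of those conditions in C (my reading of the constraint, not
anything lifted from the BSD sources): block size M and fragment size N must
each be a power of two, with at most eight fragments per block.

    #include <stdbool.h>

    static bool
    valid_block_frag_pair(unsigned long m, unsigned long n)
    {
        bool m_pow2 = m != 0 && (m & (m - 1)) == 0;  /* log2(M) == int(log2(M)) */
        bool n_pow2 = n != 0 && (n & (n - 1)) == 0;  /* log2(N) == int(log2(N)) */

        return m_pow2 && n_pow2 && n <= m && m / n <= 8;
    }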


--*greywolf;
--
NetBSD: making all computer hardware a commodity.





Re: [HACKERS] POSIX regex performance bug in 7.3 Vs. 7.2

2003-02-06 Thread Hannu Krosing
Tom Lane wrote on Wed, 2003-02-05 at 08:12:
> Hannu Krosing [EMAIL PROTECTED] writes:
>> Another idea is to make special regex type and store the regexes
>> pre-parsed (i.e. in some fast-load form) ?
>
> Seems unlikely that going out to disk could beat just recompiling the
> regexp.

We have to get _something_ from disk anyway. Currently we fetch regex
source code, but if there were some format that is faster to load then
that could be an option.

-- 
Hannu Krosing [EMAIL PROTECTED]




Re: [HACKERS] [OpenFTS-general] relor and relkov

2003-02-06 Thread Uroš Gruber
Hi!

Me too.

bye Uros

On 31.01.2003 at 10:48:44, Caffeinate The World
[EMAIL PROTECTED] wrote:

 
>> But, we need help to create good documentation for tsearch!
>> This is main stopper for releasing of tsearch.
>
> I am currently using tsearch. I'd be happy to help with
> documentation.
 


--
Binary, adj.:
Possessing the ability to have friends of both sexes.





Re: [HACKERS] COUNT and Performance ...

2003-02-06 Thread Hans-Jürgen Schönig


> But pgstattuple does do a sequential scan of the table.  You avoid a lot
> of the executor's tuple-pushing and plan-node-traversing machinery that
> way, but the I/O requirement is going to be exactly the same.

I have tried it more often so that I can be sure that everything is in
the cache.
I thought it did some sort of stat on tables. Too bad :(.

>> If people want to count ALL rows of a table, the contrib stuff is pretty
>> useful. It seems to be transaction safe.
>
> Not entirely.  pgstattuple uses HeapTupleSatisfiesNow(), which means you
> get a count of tuples that are committed good in terms of the effects of
> transactions committed up to the instant each tuple is examined.  This
> is in general different from what count(*) would tell you, because it
> ignores snapshotting.  It'd be quite unrepeatable too, in the face of
> active concurrent changes --- it's very possible for pgstattuple to
> count a single row twice or not at all, if it's being concurrently
> updated and the other transaction commits between the times pgstattuple
> sees the old and new versions of the row.

Interesting. I have tried it with concurrent sessions and transactions -
the results seemed to be right (I could not see the records inserted by
open transactions). Too bad :(. It would have been a nice work around.

>> The performance boost is great (PostgreSQL 7.3, RedHat, 166Mhz
>
> I think your test case is small enough that the whole table is resident
> in memory, so this measurement only accounts for CPU time per tuple and
> not any I/O.  Given the small size of pgstattuple's per-tuple loop, the
> speed differential is not too surprising --- but it won't scale up to
> larger tables.
>
> Sometime it would be interesting to profile count(*) on large tables
> and see exactly where the CPU time goes.  It might be possible to shave
> off some of the executor overhead ...
>
> regards, tom lane


I have tried it with the largest table on my testing system.
Reducing the overhead is great :).

   Thanks a lot,

   Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at http://www.postgresql.at, cluster.postgresql.at 
http://cluster.postgresql.at, www.cybertec.at 
http://www.cybertec.at, kernel.cybertec.at http://kernel.cybertec.at





Re: [HACKERS] plpython fails its regression test

2003-02-06 Thread Andrew Bosma

I hate following up my on my own email, especially to say I was wrong.
In a previous message I said plpython passed the regression test here.
It failed, I'll check it out over the weekend.

However, python version 2.2 and later will fail further tests because
of the deprecation of rexec.  

Andrew




Re: [HACKERS] COUNT and Performance ...

2003-02-06 Thread Arjen van der Meijden
For a more accurate view of the time used, use the \timing switch in psql.
That leaves out the overhead for forking and loading psql, connecting to 
the database and such things.

I think that it would be even nicer if postgresql automatically chose
to replace the count(*)-with-no-where with something similar.

Regards,

Arjen

Hans-Jürgen Schönig wrote:
>> This patch adds a note to the documentation describing why the
>> performance of min() and max() is slow when applied to the entire table,
>> and suggesting the simple workaround most experienced Pg users
>> eventually learn about (SELECT xyz ... ORDER BY xyz LIMIT 1).
>>
>> Any suggestions on improving the wording of this section would be
>> welcome.
>>
>> Cheers,
>>
>> --
>
> ORDER and LIMIT work pretty fast (no seq scan).
> In special cases there can be another way to avoid seq scans:
>
> action=# select tuple_count from pgstattuple('t_text');
>  tuple_count
> -------------
>        14203
> (1 row)
>
> action=# BEGIN;
> BEGIN
> action=# insert into t_text (suchid) VALUES ('10');
> INSERT 578606 1
> action=# select tuple_count from pgstattuple('t_text');
>  tuple_count
> -------------
>        14204
> (1 row)
>
> action=# ROLLBACK;
> ROLLBACK
> action=# select tuple_count from pgstattuple('t_text');
>  tuple_count
> -------------
>        14203
> (1 row)
>
> If people want to count ALL rows of a table, the contrib stuff is pretty
> useful. It seems to be transaction safe.
>
> The performance boost is great (PostgreSQL 7.3, RedHat, 166Mhz):
>
> root@actionscouts:~# time psql action -c "select tuple_count from
> pgstattuple('t_text');"
>  tuple_count
> -------------
>        14203
> (1 row)
>
> real    0m0.266s
> user    0m0.030s
> sys     0m0.020s
> root@actionscouts:~# time psql action -c "select count(*) from t_text"
>  count
> -------
>  14203
> (1 row)
>
> real    0m0.701s
> user    0m0.040s
> sys     0m0.010s
>
> I think that this could be a good workaround for huge counts (maybe
> millions of records) with no where clause and no joins.
>
> Hans
>
> http://kernel.cybertec.at




[HACKERS] 7.2 result sets and plpgsql

2003-02-06 Thread mail.luckydigital.com



I've had a good look and to no avail. Can someone please answer me this:

- Can plpgsql functions be used to return multiple result sets in ver 7.2
at all, or is this only a feature enabled in 7.3?

If it is possible in 7.2 can you please give me an example that would
return multiple rows.




[HACKERS] lo_in: error in parsing

2003-02-06 Thread Luca Saccarola
I am using PostgreSQL 7.1, with the jdbc7.1-1.3.jar file. I am trying to
send a Large Object to the database but get an error saying 'lo_in: error
in parsing dump of binary data is following'. The offending statement is
'p.setBinaryStream(1, bis, size);' where bis is an instance of
DataInputStream and p is a PreparedStatement. The exact same code runs
beautifully under Oracle, but throws this exception under PostgreSQL. I
have followed the documentation to the letter so I don't see why it throws
the exception. The field in the table is of type 'lo', which the
documentation uses.

Any tips ?

Regards.
-- 
Luca Saccarola
Key ID: 0x4A7A51F7 (c/o keyservers)




Re: [HACKERS] PGP signing releases

2003-02-06 Thread Greg Copeland
On Tue, 2003-02-04 at 18:27, Curt Sampson wrote:
> On Tue, 2003-02-04 at 16:13, Kurt Roeckx wrote:
>> On Tue, Feb 04, 2003 at 02:04:01PM -0600, Greg Copeland wrote:
>>
>>> Even improperly used, digital signatures should never be worse than
>>> simple checksums.  Having said that, anyone that is trusting checksums
>>> as a form of authenticity validation is begging for trouble.
>>
>> Should I point out that a fingerprint is nothing more than a
>> hash?
>
> Since someone already mentioned MD5 checksums of tar files versus PGP
> key fingerprints, perhaps things will become a bit clearer here if I
> point out that the important point is not that these are both hashes of
> some data, but that the time and means of acquisition of that hash are
> entirely different between the two.


And that it creates a verifiable chain of entities with direct
associations to people and hopefully, email addresses.  Meaning, it
opens the door for rapid authentication and validation of each entity
and associated person involved.  Again, something a simple MD5 hash does
not do or even allow for.  Perhaps even more importantly, it opens the
door for rapid detection of corruption in the system thanks to
revocation certificates/keys.  In turn, this allows for rapid repair in the
event that the worst is realized.  Again, something a simple MD5 does
not assist with in the least.


Thanks Curt.


-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting





Re: [HACKERS] Irix 6.2, Postgres 7.3.1, some brokenness

2003-02-06 Thread alex avriette
Disregard previous. Using /bin/ld (with LDREL = -r) works fine as a 
linker. Call it force of habit.

Is it worth warning the user that you cannot use gcc as ld on Irix? I 
used it because I figured I would need gnu ld (which I of course didn't 
have).

Anyhow, 7.3.1 is successfully built.


Alex


--
alex avriette
$^X is my programming language of choice.
[EMAIL PROTECTED]




Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tim Allen
On Fri, 7 Feb 2003 00:49, Hannu Krosing wrote:
> Tatsuo Ishii wrote on Thu, 2003-02-06 at 17:05:
>>> Perhaps we should not call the encoding UNICODE but UTF8 (which it
>>> really is). UNICODE is a character set which has half a dozen official
>>> encodings and calling one of them UNICODE does not make things very
>>> clear.
>>
>> Right. Also we should perhaps call LATIN1 or ISO-8859-1 in a more precise
>> way, since ISO-8859-1 can be encoded in either 7 bit or 8 bit (we use
>> the latter). I don't know what it is called though.
>
> I don't think that calling 8-bit ISO-8859-1 "ISO-8859-1" can confuse
> anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used.
>
> UTF-8 seems to be the most popular, but even the XML standard requires all
> compliant implementations to deal with at least both UTF-8 and UTF-16.

Strong agreement from me, for whatever value you wish to place on my opinion. 
UTF-8 is a preferable name to UNICODE. The case for distinguishing 7-bit from 
8-bit latin1 seems much weaker.

Tim

-- 
---
Tim Allen  [EMAIL PROTECTED]
Proximity Pty Ltd  http://www.proximity.com.au/
  http://www4.tpg.com.au/users/rita_tim/





Re: [HACKERS] 7.2 result sets and plpgsql

2003-02-06 Thread Christopher Kings-Lynne



It's a 7.3 feature only.

Chris

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
> mail.luckydigital.com
> Sent: Sunday, 2 February 2003 2:19 PM
> To: [EMAIL PROTECTED]
> Subject: [HACKERS] 7.2 result sets and plpgsql
>
> I've had a good look and to no avail. Can someone please answer me this:
>
> - Can plpgsql functions be used to return multiple result sets in ver 7.2
> at all, or is this only a feature enabled in 7.3?
>
> If it is possible in 7.2 can you please give me an example that would
> return multiple rows.


[HACKERS] Wrong charset mappings

2003-02-06 Thread Thomas O'Dowd
Hi all,

One Japanese character has been causing my head to swim lately. I've
finally tracked down the problem to both Java 1.3 and Postgresql.

The problem character is namely:
utf-16: 0x301C
utf-8: 0xE3809C
SJIS: 0x8160
EUC_JP: 0xA1C1
Otherwise known as the WAVE DASH character.

The confusion stems from a very similar character, 0xFF5E (utf-16) or
0xEFBD9E (utf-8), the FULLWIDTH TILDE.

Java has just lately (1.4.1) finally fixed their mappings so that 0x301C
maps correctly to both the correct SJIS and EUC-JP character. Previously
(at least in 1.3.1) they mapped SJIS to 0xFF5E and EUC to 0x301C,
causing all sorts of trouble.

Postgresql at least picked one of the two characters namely 0xFF5E, so
conversions in and out of the database to/from sjis/euc seemed to be
working. Problem is when you try to view utf-8 from the database or if
you read the data into java (utf-16) and try converting to euc or sjis
from there.

Anyway, I think postgresql needs to be fixed for this character. In my
opinion what needs to be done is to change the mappings...

euc-jp -> utf-8    -> euc-jp
======    ========    ======
0xA1C1 -> 0xE3809C -> 0xA1C1

sjis   -> utf-8    -> sjis
======    ========    ======
0x8160 -> 0xE3809C -> 0x8160

As to what to do with the current mapping of 0xEFBD9E (utf-8)? It
probably should be removed. Maybe you could keep the mapping back to the
sjis/euc characters to help backward compatibility though. I'm not sure
what is the correct approach there.
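
For anyone who wants to experiment before the map files change, the proposed
round trip is easy to express as a toy lookup table in C (purely
illustrative; this is not the format of the real map files):

    #include <stddef.h>
    #include <stdio.h>

    struct pair { unsigned int utf8; unsigned int euc_jp; };

    /* Proposed: WAVE DASH round-trips via 0xE3809C; the old FULLWIDTH
     * TILDE sequence kept only as a utf-8 -> euc-jp compatibility entry. */
    static const struct pair utf8_to_euc_jp[] = {
        { 0xE3809C, 0xA1C1 },   /* WAVE DASH */
        { 0xEFBD9E, 0xA1C1 },   /* FULLWIDTH TILDE (backward compatibility?) */
    };

    static unsigned int
    to_euc_jp(unsigned int u8)
    {
        size_t i;

        for (i = 0; i < sizeof(utf8_to_euc_jp) / sizeof(utf8_to_euc_jp[0]); i++)
            if (utf8_to_euc_jp[i].utf8 == u8)
                return utf8_to_euc_jp[i].euc_jp;
        return 0;   /* unmapped */
    }

    int
    main(void)
    {
        printf("0xE3809C -> 0x%04X\n", to_euc_jp(0xE3809C));  /* 0xA1C1 */
        printf("0xEFBD9E -> 0x%04X\n", to_euc_jp(0xEFBD9E));  /* 0xA1C1 */
        return 0;
    }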

If anyone can tell me how to edit the mappings under:
src/backend/utils/mb/Unicode/

and rebuild postgres to use them, then I can test this out locally.

Looking forward to your replies.

Tom.






[HACKERS] Planning a change of representation in the planner

2003-02-06 Thread Tom Lane
Currently, the planner spends a good deal of time pushing around lists
of small integers, because it uses such lists to identify join
relations.  For example, given SELECT ... FROM a, b, c WHERE ...
the list (1,2) (or equivalently (2,1)) would represent the join of
a and b.

This representation is pretty clearly a hangover from the days when the
Postgres planner was written in Lisp :-(.  It's inefficient --- common
operations like union, intersection, and is-subset tests require O(N^2)
steps.  And it's error-prone: I just had my nose rubbed once again in
the nasty things that happen if you accidentally get some duplicate
entries in a relation ID list.  (It's nasty because some but not all of
the low-level list-as-set operations depend on the assumption that the
elements of a given list are distinct.)

I'm thinking of replacing this representation by a
variable-length-bitmap representation.  Basically it would be like

struct bitmapset {
    int nwords;     /* number of words in array */
    int array[1];   /* really [nwords] */
};

Each array element would hold 32 bits; the integer i is a member of
the set iff (array[i/32] >> (i%32)) & 1 == 1.  For sets containing no
elements over 31 (which would account for the vast majority of queries)
only a single word would be needed in the array.  Operations like set
union, intersection, and subset test could process 32 bits at a time ---
they'd reduce to trivial C operations like |=, &=, & ~, applied once per
word.  There would be a few things that would be slower (like iterating
through the actual integer elements of a set) but AFAICT this
representation is much better suited to the planner's needs than the
list method.
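
For concreteness, a compilable sketch of those operations (the helper names
are invented here, not a committed API):

    #include <stdlib.h>

    typedef struct bitmapset
    {
        int          nwords;        /* number of words in array */
        unsigned int array[1];      /* really [nwords] */
    } bitmapset;

    static bitmapset *
    bms_alloc(int nwords)
    {
        bitmapset *s = calloc(1, sizeof(bitmapset) +
                                 (nwords - 1) * sizeof(unsigned int));

        s->nwords = nwords;
        return s;
    }

    static void
    bms_add_member(bitmapset *s, int i)     /* caller ensures i < nwords*32 */
    {
        s->array[i / 32] |= 1u << (i % 32);
    }

    static int
    bms_is_member(const bitmapset *s, int i)    /* the membership test above */
    {
        return i / 32 < s->nwords && ((s->array[i / 32] >> (i % 32)) & 1);
    }

    static void
    bms_union_into(bitmapset *dst, const bitmapset *src)
    {
        int w;

        for (w = 0; w < dst->nwords && w < src->nwords; w++)
            dst->array[w] |= src->array[w];     /* 32 members per |= */
    }

Intersection is the same loop with &=, and is-subset reduces to checking
that (src->array[w] & ~dst->array[w]) == 0 for every word.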

I've been thinking of doing this for a while just on efficiency grounds,
but kept putting it off because I don't expect much of any performance
gain on simple queries.  (You need a dozen or so tables in a query
before the inefficiencies of the list representation really start to
hurt.)  But tonight I'm thinking I'll do it anyway, because it'll also
be impervious to duplicate-element bugs.

Comments?

regards, tom lane




Re: [HACKERS] [OpenFTS-general] relor and relkov

2003-02-06 Thread Christopher Kings-Lynne
>> Nice ! We'll send you archive with new tsearch and short
>> info, so you could test it and write documentation.
>
> I have a live DB, is it possible to install the new alpha tsearch
> module w/o conflicting with the existing production one?

Can you install it to a different schema?

Chris





Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Ian Fry
On Wed, Feb 05, 2003 at 12:18:29PM -0500, Tom Lane wrote:
> D'Arcy J.M. Cain [EMAIL PROTECTED] writes:
>> On Wednesday 05 February 2003 11:49, Tom Lane wrote:
>>> I wonder if it is possible that, every so often,
>>> you are losing just the last few bytes of an NFS transfer?
>> Yah, that's kind of what it looked like when I tried this before
>> Christmas too although the actual errors differed.
> Wild thought here: can you reduce the MTU on the LAN linking the NFS
> server to the NetBSD box?  If so, does it help?

How about adjusting the read and write-size used by the NetBSD machine? I think
the default is 32k for both read and write on i386 machines now. Perhaps try
setting them back to 8k (it's the -r and -w flags to mount_nfs, IIRC).

Ian.





Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Greywolf
On Wed, 5 Feb 2003, D'Arcy J.M. Cain wrote:

[DJC: This feels rather fragile.  I doubt that it is hardware related because I had
[DJC: tried it on the other ethernet interface in the machine which was on a
[DJC: completely different network than the one I am on now.

All I can offer up is that at one point I had to reduce to 16k NFSIO
when I replaced a switch (you didn't replace a switch, did you?) between
my i386 and my sparc (my le0 and the switch didn't play nicely together;
once I got the hme0 in, everything was happy as a clam).

[DJC: What is the implication of smaller read and write size?  Will I
[DJC: necessarily take a performance hit?

I didn't start noticing observable degradation across 100TX until I
dropped NFSIO to 4k (which I did purely for benchmarking statistics).

The differences between 8k, 16k and 32k have not been noticeable
to me.  32k IO would hang my system at one point; since that time,
something appears to have been fixed.

[DJC: --
[DJC: D'Arcy J.M. Cain darcy@{druid|vex}.net   |  Democracy is three wolves
[DJC: http://www.druid.net/darcy/|  and a sheep voting on
[DJC: +1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.
[DJC:


--*greywolf;
--
NetBSD: Servers' choice!





Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Thor Lancelot Simon
On Wed, Feb 05, 2003 at 03:09:09PM -0500, Tom Lane wrote:
> D'Arcy J.M. Cain [EMAIL PROTECTED] writes:
>> On Wednesday 05 February 2003 13:04, Ian Fry wrote:
>>> How about adjusting the read and write-size used by the NetBSD machine? I
>>> think the default is 32k for both read and write on i386 machines now.
>>> Perhaps try setting them back to 8k (it's the -r and -w flags to mount_nfs,
>>> IIRC)
>>
>> Hey!  That did it.
>
> Hot diggety!
>
>> So, why does this fix it?

Who knows.  One thing that I'd be interested to know is whether Darcy is
using NFSv2 or NFSv3 -- 32k requests are not, strictly speaking, within
the bounds of the v2 specification.  If he is using UDP rather than TCP
as the transport layer, another potential issue is that 32K requests will
end up as IP packets with a very large number of fragments, potentially
exposing some kind of network stack bug in which the last fragment is
dropped or corrupted (I would suspect that the likelihood of such a bug
in the NetApp stack is quite low, however).  If feasible, it is probably
better to use TCP as the transport and let it handle segmentation whether
the request size is 8K or 32K.

> I think now you file a bug report with the NetBSD kernel folk.  My
> thoughts are running in the direction of a bug having to do with
> scattering a 32K read into multiple kernel disk-cache buffers or
> gathering together multiple cache buffer contents to form a 32K write.

That doesn't make much sense to me.  Pages on i386 are 4K, so whether he
does 8K writes or 32K writes, it will always come from multiple pages in
the pagecache.

> Unless NetBSD has changed from its heritage, the kernel disk cache
> buffers are 8K, and so an 8K NFS read or write would never cross a
> cache buffer boundary.  But 32K would.

I don't know what heritage you're referring to, but it has never been
the case that NetBSD's buffer cache has used fixed-size 8K disk buffers,
and I don't believe that it was ever the case for any Net2 or 4.4-derived
system.

> Or it could be a similar bug on the NFS server's side?

That's conceivable.  Of course, a client bug is quite possible, as well,
but I don't think the mechanism you suggest is likely.

-- 
 Thor Lancelot Simon  [EMAIL PROTECTED]
   But as he knew no bad language, he had called him all the names of common
 objects that he could think of, and had screamed: You lamp!  You towel!  You
 plate! and so on.  --Sigmund Freud




Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Hannu Krosing
On Thu, 2003-02-06 at 13:25, Tatsuo Ishii wrote:
>> I have just committed the latest version of Henry Spencer's regex
>> package (lifted from Tcl 8.4.1) into CVS HEAD.  This code is natively
>> able to handle wide characters efficiently, and so it avoids the
>> multibyte performance problems recently exhibited by Wade Klaver.
>> I have not done extensive performance testing, but the new code seems
>> at least as fast as the old, and much faster in some cases.
>
> I have tested the new regex with src/test/mb and it all passed. So the
> new code looks safe at least for EUC_CN, EUC_JP, EUC_KR, EUC_TW,
> MULE_INTERNAL, UNICODE, though the test does not include all possible
> regex patterns.

Perhaps we should not call the encoding UNICODE but UTF8 (which it
really is). UNICODE is a character set which has half a dozen official
encodings and calling one of them UNICODE does not make things very
clear.

-- 
Hannu Krosing [EMAIL PROTECTED]




Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tatsuo Ishii
> Perhaps we should not call the encoding UNICODE but UTF8 (which it
> really is). UNICODE is a character set which has half a dozen official
> encodings and calling one of them UNICODE does not make things very
> clear.

Right. Also we should perhaps call LATIN1 or ISO-8859-1 in a more precise
way, since ISO-8859-1 can be encoded in either 7 bit or 8 bit (we use
the latter). I don't know what it is called though.
--
Tatsuo Ishii




Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Hannu Krosing
Tatsuo Ishii wrote on Thu, 2003-02-06 at 17:05:
>> Perhaps we should not call the encoding UNICODE but UTF8 (which it
>> really is). UNICODE is a character set which has half a dozen official
>> encodings and calling one of them UNICODE does not make things very
>> clear.
>
> Right. Also we should perhaps call LATIN1 or ISO-8859-1 in a more precise
> way, since ISO-8859-1 can be encoded in either 7 bit or 8 bit (we use
> the latter). I don't know what it is called though.

I don't think that calling 8-bit ISO-8859-1 "ISO-8859-1" can confuse
anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used.

UTF-8 seems to be the most popular, but even the XML standard requires all
compliant implementations to deal with at least both UTF-8 and UTF-16.

-- 
Hannu Krosing [EMAIL PROTECTED]




Re: [HACKERS] [GENERAL] databases limit

2003-02-06 Thread Andrew Sullivan
On Thu, Feb 06, 2003 at 12:30:03AM -0500, Tom Lane wrote:

> I have a feeling that what the questioner really means is "how can I
> limit the resources consumed by any one database user?"  In which case

(I'm moving this to -hackers 'cause I think it likely belongs there.)

I note that this question has come up before, and several people have
been sceptical of its utility.  In particular, in this thread

http://groups.google.ca/groups?hl=enlr=ie=UTF-8threadm=Pine.LNX.4.21.0212221510560.15719-10%40linuxworld.com.aurnum=1prev=/groups%3Fq%3Dlimit%2Bresources%2B%2Bgroup:comp.databases.postgresql.*%26hl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3DPine.LNX.4.21.0212221510560.15719-10%2540linuxworld.com.au%26rnum%3D1

(sorry about the long line: I just get errors searching at the official
archives) Tom Lane notes that you could just run another back end to
make things more secure.

That much is true; but I'm wondering whether it might be worth it to
limit how much a _database_ can use.  For instance, suppose I have a
number of databases which are likely to see sporadic heavy loads. 
There are limitations on how slow the response can be.  So I have to
do some work to guarantee that, for instance, certain tables from
each database don't get flushed from the buffers.

I can do this now by setting up separate postmasters.  That way, each
gets its own shared memory segment.  Those certain tables will be
ones that are frequently accessed, and so they'll always remain in
the buffer, even if the other database is busy (because the two
databases don't share a buffer).  (I'm imagining the case -- not
totally imaginary -- where one of the databases tends to be accessed
heavily during one part of a 24 hour day, and another database gets
hit more on another part of the same day.)

The problem with this scenario is that it makes administration
somewhat awkward as soon as you have to do this 5 or 6 times.  I was
thinking that it might be nice to be able to limit how much of the
total resources a given database can consume.  If one database were
really busy, that would not mean that other databases would
automatically be more sluggish, because they would still have some
guaranteed minimum percentage of the total resources.

So, anyone care to speculate?

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110





Re: [HACKERS] PostgreSQL v7.3.2 Released -- Permission denied from pretty much everywhere

2003-02-06 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
> Been trying to test the latest source but the following places give
> permission denied when trying to download:
>
> ftp.postgresql.org
> ftp.us.postgresql.org
> ftp2.us.postgresql.org
> mirror.ac.uk

I just started a download from ftp.us.postgresql.org, and it seems to be
working fine.  We've not heard other complaints, either.  Sure the
problem's not on your end?

regards, tom lane




[HACKERS] Why is lc_messages restricted?

2003-02-06 Thread Tom Lane
Is there a reason why lc_messages is PGC_SUSET, and not PGC_USERSET?
I can't see any security rationale for restricting it.

regards, tom lane




[HACKERS] PostgreSQL v7.3.2 Released -- Permission denied from pretty much everywhere

2003-02-06 Thread Joshua D. Drake
Hello folks,

  Been trying to test the latest source but the following places give 
permission denied when trying to download:

  ftp.postgresql.org
  ftp.us.postgresql.org
  ftp2.us.postgresql.org
  mirror.ac.uk

  Anybody got one that works?

J


Oliver Elphick wrote:
> On Wed, 2003-02-05 at 20:41, Laurette Cisneros wrote:
>> I was trying from the postgresql.org download web page and following the
>> mirror links there...and none of them that I was able to get to (some of
>> them didn't work) showed 7.3.2.
>
> I got it from mirror.ac.uk yesterday







Re: [HACKERS] PostgreSQL v7.3.2 Released -- Permission denied from pretty much everywhere

2003-02-06 Thread Joshua D. Drake
Hello,


  Pardon me while I pull my book out of various dark places. It has 
been a very long week. I got it. Thanks.

Sincerely,

Joshua Drake


Tom Lane wrote:
> Joshua D. Drake [EMAIL PROTECTED] writes:
>> Been trying to test the latest source but the following places give
>> permission denied when trying to download:
>>
>> ftp.postgresql.org
>> ftp.us.postgresql.org
>> ftp2.us.postgresql.org
>> mirror.ac.uk
>
> I just started a download from ftp.us.postgresql.org, and it seems to be
> working fine.  We've not heard other complaints, either.  Sure the
> problem's not on your end?
>
> regards, tom lane






Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Andrew Gillham
On Wed, Feb 05, 2003 at 09:24:48PM +0000, David Laight wrote:
>> If he is using UDP rather than TCP
>> as the transport layer, another potential issue is that 32K requests will
>> end up as IP packets with a very large number of fragments, potentially
>> exposing some kind of network stack bug in which the last fragment is
>> dropped or corrupted.
>
> Actually it is worse than that, and IMHO 32k UDP requests are asking for
> trouble.
>
> A 32k UDP datagram is about 22 ethernet packets.  If ANY of them is
> lost on the network, then the entire datagram is lost.  NFS must
> regenerate the request on a timeout.  The receiving system won't
> report that it is missing a fragment.
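
(Checking the arithmetic: with a 1500-byte Ethernet MTU, each IP fragment
carries roughly 1480 bytes of payload, so a 32768-byte UDP datagram becomes
ceil(32768 / 1480) = 23 wire packets, in line with the "about 22" above.)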

As he stated several times, he has tested with TCP mounts and observed
the same issue.  So the above issue shouldn't be related.

> There are also a lot of ethernet cards out there which don't have
> enough buffer space for 32k of receive data.   Not to mention the
> fact that NFS can easily (at least on some systems) generate
> concurrent requests for different parts of the same file.
>
> I would suggest reducing the size back to 8k, even that causes
> trouble with some cards.

If NetBSD as an NFS client is this fragile we have problems.  The default
read/write size shouldn't be 32kB if that is not going to work reliably.

> It should also be realised that transmitting 22 full sized, back
> to back frames on the ethernet doesn't do anything for sharing
> the bandwidth between different users.  The MAC layer has to be very
> aggressive in order to get a packet in edgeways (so to speak).

So what?  If it is a switched network, which I assume it is since he was
talking to the NetApp gigabit port earlier, then this is irrelevant.  Even
the $40 Fry's switches are more or less non-blocking. 

Even if he is saturating the local *hub*, it shouldn't cause NetBSD to fail,
it would just be rude. :-)

There could be some packet mangling on the network; checking the amount
of retransmissions on either end of the TCP connection should give you an
idea about that.

-Andrew




Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Byron Servies
On February 06, 2003 at 03:50, Justin Clift wrote:
> Tom Lane wrote:
> <snip>
>> Hoo boy.  I was already suspecting data corruption in the index, and
>> this looks like more of the same.  My thoughts are definitely straying
>> in the direction of "the NFS server is dropping bits, somehow".
>>
>> Both this and the (admittedly unproven) bt_moveright loop suggest
>> corrupted values in the cross-page links that exist at the very end of
>> each btree index page.  I wonder if it is possible that, every so often,
>> you are losing just the last few bytes of an NFS transfer?
>
> Hmmm... does anyone remember the name of that NFS testing tool the
> FreeBSD guys were using?  Think it came from Apple.  They used it to
> find and isolate bugs in the FreeBSD code a while ago.
>
> Sounds like it might be useful here.
>
> :-)

fsx.  See also http://www.connectathon.org

hth,

Byron




Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Greg A. Woods
[ On Friday, January 31, 2003 at 11:54:27 (-0500), D'Arcy J.M. Cain wrote: ]
 Subject: Re: PostgreSQL, NetBSD and NFS

> On Thursday 30 January 2003 18:32, Simon J. Gerraty wrote:
>> Is postgreSQL trying to lock a file perhaps?  Would seem a sensible thing
>> for it to be doing...
>
> Is that a problem?  FWIW I am running statd and lockd on the NetBSD box.

NetBSD's NFS implementation only supports locking as a _server_, not a
client.

http://www.unixcircle.com/features/nfs.php

   Optional for file locking (lockd+statd):

   lockd:

   Rpc.lockd is a daemon which provides file and record-locking services
   in an NFS environment.

   FreeBSD, NetBSD and OpenBSD file locking is only supported on server
   side.

NFS server support for locking was introduced in NetBSD-1.5:

http://www.netbsd.org/Releases/formal-1.5/NetBSD-1.5.html

 * Server part of NFS locking (implemented by rpc.lockd(8)) now works.  

and as you can also see from rpc.lockd/lockd.c:


revision 1.5
date: 2000/06/07 14:34:40;  author: bouyer;  state: Exp;  lines: +67 -25
Implement file locking in lockd. All the stuff is done in userland, using
fhopen() and flock(). This means that if you kill lockd, all locks will
be relased (but you're supposed to kill statd at the same time, so
remote hosts will know it and re-establish the lock).
Tested against solaris 2.7 and linux 2.2.14 clients.
Shared lock are not handled efficiently, they're serialised in lockd when they
could be granted.



Terry Lambert has some proposed fixes to add NFS client level locking to
the FreeBSD kernel:

http://www.freebsd.org/~terry/DIFF.LOCKS.txt
http://www.freebsd.org/~terry/DIFF.LOCKS.MAN
http://www.freebsd.org/~terry/DIFF.LOCKS

-- 
Greg A. Woods

+1 416 218-0098;[EMAIL PROTECTED];   [EMAIL PROTECTED]
Planix, Inc. [EMAIL PROTECTED]; VE3TCP; Secrets of the Weird [EMAIL PROTECTED]




Re: [HACKERS] PostgreSQL, NetBSD and NFS

2003-02-06 Thread Manuel Bouyer
On Thu, Jan 30, 2003 at 01:27:59PM -0600, Greg Copeland wrote:
> That was going to be my question too.
>
> I thought NFS didn't have some of the requisite file system behaviors
> (locking, flushing, etc. IIRC) for PostgreSQL to function correctly or
> reliably.

I don't know what locking scheme PostgreSQL uses, but in theory it should
be possible to use it over NFS:
- a fflush()/msync() should work the same way on a NFS filesystem as on a
  local filesystem, provided the client and server implement the NFS
  protocol properly
- locking via temp files works over NFS, again provided the client and server
  implement the NFS protocol properly (this is why you can safely read your
  mailbox over NFS, for example; see the sketch below). If PostgreSQL uses
  flock or fcntl, it's a problem.
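
A minimal sketch of that temp-file idiom (mailbox-style dot-locking), using
only POSIX calls; the function name and path handling are illustrative:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Create a uniquely named file, then link() it to the agreed lock name.
     * link() succeeds for exactly one contender and is atomic on the NFS
     * server, which is why this works where flock()/fcntl() may not. */
    static int
    acquire_lockfile(const char *lockname)
    {
        char tmpname[256];
        int  fd;

        snprintf(tmpname, sizeof(tmpname), "%s.%ld", lockname, (long) getpid());
        fd = open(tmpname, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd < 0)
            return -1;
        close(fd);

        if (link(tmpname, lockname) != 0)
        {
            unlink(tmpname);
            return -1;      /* someone else holds the lock */
        }
        unlink(tmpname);    /* drop the temp name; lockname now holds the lock */
        return 0;           /* release later by unlink()ing lockname */
    }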

-- 
Manuel Bouyer [EMAIL PROTECTED]
 NetBSD: 24 years of experience will always make the difference
--




Re: [HACKERS] [OpenFTS-general] relor and relkov

2003-02-06 Thread Caffeinate The World

--- Oleg Bartunov [EMAIL PROTECTED] wrote:
> On Fri, 31 Jan 2003, Caffeinate The World wrote:
>
>>> But, we need help to create good documentation for tsearch !
>>> This is main stopper for releasing of tsearch.
>>
>> I am currently using tsearch. I'd be happy to help with
>> documentation.
>
> Nice ! We'll send you archive with new tsearch and short
> info, so you could test it and write documentation.

I have a live DB, is it possible to install the new alpha tsearch
module w/o conflicting with the existing production one?





Re: [HACKERS] Status report: regex replacement

2003-02-06 Thread Tatsuo Ishii
>> Right. Also we should perhaps call LATIN1 or ISO-8859-1 in a more precise
>> way, since ISO-8859-1 can be encoded in either 7 bit or 8 bit (we use
>> the latter). I don't know what it is called though.
>
> I don't think that calling 8-bit ISO-8859-1 "ISO-8859-1" can confuse
> anybody, but UCS-2 (ISO-10646-1), UTF-8 and UTF-16 are all widely used.

I just pointed out that ISO-8859-1 is *not* an encoding, but a
character set.

> UTF-8 seems to be the most popular, but even the XML standard requires all
> compliant implementations to deal with at least both UTF-8 and UTF-16.

I don't think PostgreSQL is going to natively support UTF-16.
--
Tatsuo Ishii
