Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Albe Laurenz
Marlon Petry wrote:
 I am trying to develop an API to perform backup and 
 restore through JDBC.
 I think that the best approach is to use JNI. 
 Any suggestions? 

Do you mean 'backup' or 'export/dump'?
If you mean 'export', do you need anything besides SQL?
If you mean 'backup', how do you want to restore from
a client machine when there is no server running?
And if you're on the server, why would you want to use Java?

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] JAVA Support

2006-09-29 Thread Magnus Hagander
  I would if we could get some -hackers buy-in on the idea. Adding
  more and more auth methods is something they're not excited about
  unless there's a good reason (which I think this is).
 
 Actually, I've been trying to get some of the Sun engineers to
 contribute patches for Solaris authentication methods, of which
 GSSAPI is one.  So in theory someone from Sun should be looking at
 coding this.

Well, if they are, I hope they would be speaking up now, so work isn't
duplicated... So if you're out there, please speak up! ;-)

//Magnus




Re: [HACKERS] JAVA Support

2006-09-29 Thread Magnus Hagander

This being SASL:

  I know I tried to make
  it work on win32 once and failed miserably. (Then again, I've failed
  on Linux as well, but not quite as bad. And it's not included in all
  Linux distributions, at least it wasn't when I checked a while back)
 
 Well, I know Redhat has RPM's that look reasonable.  I'm not a big
 Linux user myself.  (More a BSD bigot, to be honest.)

Well, Redhat != Linux, really :P

Over to GSSAPI:

 In theory, you get to plug in other mechanisms than Kerberos.  In
 practice I think this only worked on Solaris, until very recently.

FWIW, Microsoft have supported NTLM over GSSAPI since... eh, back in
1999, I guess, with the first pre-releases of Windows 2000.

 Wire compatibility with a native Windows API (the SSPI), if it's
 used correctly.  (Google for posts by Jeffrey Altman for references
 to example code.)

This, IMHO, is a big win if we can pull it off. It would significantly
lower the barrier for getting Kerberos working properly in pg on Win32.

//Magnus




Re: [HACKERS] JAVA Support

2006-09-29 Thread Magnus Hagander
  However, that doesn't change that some people would like us to
  support GSSAPI, and there may be some benefit (additional
  applications, better network authentication, etc.) for doing so.
  If we can get additional programmers to code the support (i.e. Sun,
  JPL) I don't see any reason not to support the *additional*
  authentication methods.
 
 Well, as I said already, a lot depends on the size of the patch.
 As a reductio ad absurdum, if they drop 100K lines of code on us,
 it *will* get rejected, no matter how cool it is.

Oh, absolutely.


 The current Kerberos support seems to require about 50 lines in
 configure.in and circa 200 lines of C code in each of the backend
 and libpq.  Plus a dependency on an outside library that happens to
 be readily available and compatibly licensed.

I would expect, without looking at the details of the API, GSSAPI to be
about the same amount of code, if not less.


 What amount of code are we talking about adding here, and what
 dependencies exactly?  What portability and license hazards will be
 added?

The Kerberos5 libraries that we rely on today provide GSSAPI. So it
would work with the same external library. Now, it could *also* work
with other libraries in some cases (for example, the Win32 SSPI
libraries), but with the same libraries it should work fine.

//Magnus




Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Heikki Linnakangas

ITAGAKI Takahiro wrote:
However, I think our next goal in shrinking the headers is 16 bytes.
The headers become 23 bytes using phantom cids, and we are limited by
alignment, so we will have no further advantage unless we delete an
extra 7 bytes from the headers.
...and that seems to be very difficult.


Yeah, I thought about that too earlier.

If we get rid of VACUUM FULL, or replace it with something that doesn't 
need xvac, and keep cmin and cmax in backend-private storage, we could 
get rid of the overlaid t_field4, which is 4 bytes. Then we're down to 
19 bytes.


We could get rid of t_hoff, because we should always be able to 
calculate the header size. Then we're down to 18 bytes.


There are currently 15 bits in use in the infomask. After we remove the 
HEAP_MOVED_* bits that we don't need without VACUUM FULL, that's down 
to 13 bits. t_natts only needs 11 bits, because MaxHeapAttributeNumber 
is 1600. We could move 5 of the bits in infomask to the high 5 bits of 
t_natts, and save one byte.


We're now down to 17 bytes. That's as far as I got.

So it seems we could shave off some bytes, but we still can't get down 
to 16. And the changes needed in total would be quite massive.
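The byte accounting above can be checked with a quick back-of-the-envelope sketch (illustrative Python, not PostgreSQL code; the field sizes are those quoted in this message, not read from htup.h):

```python
# Back-of-the-envelope check of the header-shaving steps discussed above.
MAX_HEAP_ATTRIBUTE_NUMBER = 1600   # per the discussion; fits in 11 bits

header = 23                 # bytes after the phantom-cid change
header -= 4                 # drop the overlaid t_field4 (cmin/cmax/xvac)
assert header == 19
header -= 1                 # drop t_hoff (header size recomputed on the fly)
assert header == 18
# 15 infomask bits minus 2 HEAP_MOVED_* bits = 13 bits still needed;
# t_natts uses only 11 of its 16 bits, so 5 infomask bits can move into
# its high bits, freeing one whole byte.
assert MAX_HEAP_ATTRIBUTE_NUMBER < 2 ** 11
header -= 1
assert header == 17         # still one byte short of the 16-byte goal
print(header)
```

Which lands on 17 bytes, matching the conclusion that 16 stays out of reach.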


--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com



Re: [HACKERS] Faster StrNCpy

2006-09-29 Thread Markus Schaber
Hi, Tom,

Tom Lane wrote:
 Strong, David [EMAIL PROTECTED] writes:
 Just wondering - are any of these cases where a memcpy() would work
 just as well? Or are you not sure that the source string is at least
 64 bytes in length?
 
 In most cases, we're pretty sure that it's *not* --- it'll just be a
 palloc'd C string.
 
 I'm disinclined to fool with the restriction that namestrcpy zero-pad
 Name values, because they might end up on disk, and allowing random
 memory contents to get written out is ungood from a security point of
 view.

There's another disadvantage of always copying 64 bytes:

It may be that the 64-byte range crosses a page boundary. Now guess what
happens when that next page is not mapped: segfault.

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org





Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
If we want to keep the property that VACUUM doesn't re-evaluate index 
entries, any proposal that doesn't keep track of every heap tuple 
isn't going to work. I'll try to modify the design I had in mind so 
that it does keep track of every heap tuple in some form.

After some thought:

Imagine a normal B-tree just like what we have now. But when there is 
more than one tuple on the same heap page with consecutive index keys, 
we represent all of them in a single index tuple that contains the key 
of the first one of them, and a (run-length encoded) bitmap of the 
OffsetNumbers. We should get most of the space and I/O savings as with 
the original proposal, but we can vacuum it without re-evaluating index 
expressions.


It does change the format of an index tuple, unlike the original 
proposal. That adds some complexity, but it's doable.
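A minimal sketch of the run-length encoding such an index tuple could carry (hypothetical names and layout, not actual PostgreSQL code): a sorted set of OffsetNumbers on one heap page collapses into (start, length) runs.

```python
# Hypothetical sketch of the proposed index-tuple payload: one key plus a
# run-length-encoded set of heap-page OffsetNumbers.

def rle_encode(offsets):
    """Turn a sorted list of OffsetNumbers into (start, length) runs."""
    runs = []
    for off in offsets:
        if runs and off == runs[-1][0] + runs[-1][1]:
            # extends the current run of consecutive offsets
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((off, 1))
    return runs

def rle_decode(runs):
    """Expand (start, length) runs back into the offset list."""
    return [start + i for start, length in runs for i in range(length)]

offsets = [1, 2, 3, 4, 7, 8, 20]
runs = rle_encode(offsets)
assert runs == [(1, 4), (7, 2), (20, 1)]
assert rle_decode(runs) == offsets
```

With clustered data the offsets are mostly consecutive, so the runs stay short — which is where the claimed space saving comes from.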


--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com



Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Martijn van Oosterhout
On Fri, Sep 29, 2006 at 09:35:31AM +0100, Heikki Linnakangas wrote:
 We could get rid of t_hoff, because we should always be able to 
 calculate the header size. Then we're down to 18 bytes.

Without t_hoff, how do you know the size of the null bitmap? You could
probably do it only if you assume the null bitmap, if present, always
covers all the columns...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Heikki Linnakangas

Martijn van Oosterhout wrote:

On Fri, Sep 29, 2006 at 09:35:31AM +0100, Heikki Linnakangas wrote:

We could get rid of t_hoff, because we should always be able to
calculate the header size. Then we're down to 18 bytes.


Without t_hoff, how do you know the size of the null bitmap? You could
probably do it only if you assume the null bitmap, if present, always
covers all the columns...


I think we assume that already. heap_form_tuple reserves space for the 
bitmap like this:


   if (hasnull)
   len += BITMAPLEN(numberOfAttributes);
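The sizing rule quoted above can be illustrated as follows (a sketch mirroring `BITMAPLEN(numberOfAttributes)`, which rounds one bit per column up to whole bytes; the Python helper name is mine, not PostgreSQL's):

```python
# One null-bitmap bit per column, rounded up to whole bytes --
# the bitmap always covers all attributes of the tuple.

def bitmaplen(natts):
    return (natts + 7) // 8

assert bitmaplen(1) == 1       # 1..8 columns fit in one byte
assert bitmaplen(8) == 1
assert bitmaplen(9) == 2       # the ninth column starts a second byte
assert bitmaplen(1600) == 200  # MaxHeapAttributeNumber
```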

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com



Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Martijn van Oosterhout
On Fri, Sep 29, 2006 at 10:59:13AM +0100, Heikki Linnakangas wrote:
 Martijn van Oosterhout wrote:
 On Fri, Sep 29, 2006 at 09:35:31AM +0100, Heikki Linnakangas wrote:
 We could get rid of t_hoff, because we should always be able to
 calculate the header size. Then we're down to 18 bytes.
 
 Without t_hoff, how do you know the size of the null bitmap? You could
 probably do it only if you assume the null bitmap, if present, always
 covers all the columns...
 
 I think we assume that already. heap_form_tuple reserves space for the 
 bitmap like this:
 
if (hasnull)
len += BITMAPLEN(numberOfAttributes);

Ok, now we do an ALTER TABLE blah ADD COLUMN ..., and we have to expand
the bitmaps for the entire table?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Heikki Linnakangas

Martijn van Oosterhout wrote:

On Fri, Sep 29, 2006 at 10:59:13AM +0100, Heikki Linnakangas wrote:

Martijn van Oosterhout wrote:

On Fri, Sep 29, 2006 at 09:35:31AM +0100, Heikki Linnakangas wrote:

We could get rid of t_hoff, because we should always be able to
calculate the header size. Then we're down to 18 bytes.

Without t_hoff, how do you know the size of the null bitmap? You could
probably do it only if you assume the null bitmap, if present, always
covers all the columns...

I think we assume that already. heap_form_tuple reserves space for the
bitmap like this:

if (hasnull)
len += BITMAPLEN(numberOfAttributes);


Ok, now we do an ALTER TABLE blah ADD COLUMN ..., and we have to expand
the bitmaps for the entire table?

No, you'd still have the number of attributes (t_natts) in the header.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com



Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Martijn van Oosterhout
On Fri, Sep 29, 2006 at 10:51:32AM +0100, Heikki Linnakangas wrote:
 After some thought:
 
 Imagine a normal B-tree just like what we have now. But when there is 
 more than one tuple on the same heap page with consecutive index keys, 
 we represent all of them in a single index tuple that contains the key 
 of the first one of them, and a (run-length encoded) bitmap of the 
 OffsetNumbers. We should get most of the space and I/O savings as with 
 the original proposal, but we can vacuum it without re-evaluating index 
 expressions.

I think something like this has been discussed before. Where an index
tuple currently has the key values followed by a ctid, simply change
that so that it can be a list of ctid's, in order.

This relies on tuples having the same key, but doesn't care whether they
are on the same page. It used to be impossible because of the difficulty
of handling forward and backward scanning within an index entry during
updates. I think with the new page-at-a-time index scanning, this got a
lot easier.

One issue is that a single index page could hold more than 1000 index
entries, which might cause problems for callers.

 It does change the format of an index tuple, unlike the original 
 proposal. That adds some complexity, but it's doable.

This way doesn't change the current index format much.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] New version of money type

2006-09-29 Thread Martijn van Oosterhout
On Thu, Sep 28, 2006 at 06:32:11PM -0500, Jim C. Nasby wrote:
 What would be ideal is a money type that stored what currency was used
 and let you change precision (within reason).

The taggedtypes version of currency does half of that, by storing the
currency and allowing the output format to depend on that. It doesn't
allow you to easily change the precision though, that would require
user-defined typmod which is still under discussion.

It would be possible to create a taggedtypes version of currency based
on int64. With the currency tag it would be 12 bytes total. And the
number of decimal places could be defined per currency...

Interesting thought, probably wouldn't take more than an hour to whip
up.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas

Martijn van Oosterhout wrote:

On Fri, Sep 29, 2006 at 10:51:32AM +0100, Heikki Linnakangas wrote:

After some thought:

Imagine a normal B-tree just like what we have now. But when there is
more than one tuple on the same heap page with consecutive index keys,
we represent all of them in a single index tuple that contains the key
of the first one of them, and a (run-length encoded) bitmap of the
OffsetNumbers. We should get most of the space and I/O savings as with
the original proposal, but we can vacuum it without re-evaluating index
expressions.


I think something like this has been discussed before. Where an index
tuple currently has the key values followed by a ctid, simply change
that so that it can be a list of ctid's, in order.


Actually it's t_tid followed by t_info (which is size + flags) followed 
by key values.



This relies on tuples having the same key, but doesn't care whether they
are on the same page. It used to be impossible because of the difficulty
of handling forward and backward scanning within an index entry during
updates. I think with the new page-at-a-time index scanning, this got a
lot easier.


I'm not very interested in the case where you have a lot of equal keys, 
I think the bitmap index am is more suitable for that. The Block B-tree 
could be used whenever you have a clustered table (including unique 
indexes). Some DBMSs have index-organized-tables for the same use case.


When I tested a prototype of the original idea with TPC-C (DBT-2) data, 
a block index on the order_line table was under 2% of the size of a 
normal B-tree. That's very close to a best-case scenario: narrow rows 
and a completely clustered table. I'm aiming at that order of magnitude 
reduction in storage size.



One issue is that a single index page could hold more than 1000 index
entries, which might cause problems for callers.


I don't think that's a problem.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com



Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Marlon Petry
The idea is to make one pg_dump of the server and to keep it on the
client machine, and to restore this pg_dump from the client machine
when necessary. Perhaps I will have to use some stored procedure on
the server, I do not know.

regards,
Marlon

2006/9/29, Albe Laurenz [EMAIL PROTECTED]:
 Marlon Petry wrote:
  I am trying to develop an API to perform backup and restore
  through JDBC.
  I think that the best approach is to use JNI.
  Any suggestions?

 Do you mean 'backup' or 'export/dump'?
 If you mean 'export', do you need anything besides SQL?
 If you mean 'backup', how do you want to restore from
 a client machine when there is no server running?
 And if you're on the server, why would you want to use Java?

 Yours,
 Laurenz Albe

-- 
Marlon
The endless pursuit of happiness keeps us from living it each day... (Elba Lucas)




Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Andrew Dunstan

Marlon Petry wrote:


The idea is to make one pg_dump of the server and to keep it on the 
client machine.
And to restore this pg_dump from the client machine when necessary.
Perhaps I will have to use some stored procedure on the server, I do 
not know.





pg_dump and pg_restore do not need to run on the server machine. Why not 
just run them where you want the dump stored?
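Concretely, the client-side approach amounts to invoking pg_dump locally with `-h` pointed at the server. A small sketch (the helper function is mine; the `-h`/`-p`/`-F`/`-f` flags are real pg_dump options, but running the command still requires pg_dump and libpq on the client):

```python
# Build (but don't run) a client-side pg_dump invocation against a
# remote server. Host, database, and file names are examples.

def build_dump_command(host, dbname, outfile, port=5432, fmt="c"):
    # -F c = custom-format archive, restorable later with pg_restore
    return ["pg_dump", "-h", host, "-p", str(port),
            "-F", fmt, "-f", outfile, dbname]

cmd = build_dump_command("db.example.com", "mydb", "mydb.dump")
assert cmd == ["pg_dump", "-h", "db.example.com", "-p", "5432",
               "-F", "c", "-f", "mydb.dump", "mydb"]
# To actually execute it:  subprocess.run(cmd, check=True)
```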


cheers

andrew



Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Marlon Petry
  The idea is to make one pg_dump of the server and to keep it on the
  client machine. And to restore this pg_dump from the client machine
  when necessary. Perhaps I will have to use some stored procedure on
  the server, I do not know.

 pg_dump and pg_restore do not need to run on the server machine. Why
 not just run them where you want the dump stored?

 cheers

 andrew

But I would need to have pg_dump and pg_restore installed on the client
machine? Without pg_dump and pg_restore installed, how could I do it?

regards,
Marlon


Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Andrew Dunstan

Marlon Petry wrote:



pg_dump and pg_restore do not need to run on the server machine.
Why not
just run them where you want the dump stored?




But I would need to have pg_dump and pg_restore installed on the client 
machine?

Without pg_dump and pg_restore installed, how could I do it?





You can't. pg_dump in particular embodies an enormous amount of 
knowledge that simply does not exist elsewhere. There is no dump/restore 
API, and there is nothing you can hook up to using JNI, AFAIK.


cheers

andrew




Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Simon Riggs
On Fri, 2006-09-29 at 10:51 +0100, Heikki Linnakangas wrote:
 Heikki Linnakangas wrote:
  If we want to keep the property that VACUUM doesn't re-evaluate index 
  entries, any proposal that doesn't keep track of every heap tuple 
  isn't going to work. I'll try to modify the design I had in mind so 
  that it does keep track of every heap tuple in some form.

The ideal situation is that we have one index pointer per block, so we
should look for that and optimize accordingly. We mark the heap block to
indicate how many block index pointers there are to it. If we have only
a single pointer, then VACUUM will only have to touch the index pointer
when the whole heap block is removed. In that case we have no
re-evaluation of the index AFAICS.

 After some thought:
 
 Imagine a normal B-tree just like what we have now. But when there is 
 more than one tuple on the same heap page with consecutive index keys, 
 we represent all of them in a single index tuple that contains the key 
 of the first one of them, and a (run-length encoded) bitmap of the 
 OffsetNumbers. We should get most of the space and I/O savings as with 
 the original proposal, but we can vacuum it without re-evaluating index 
 expressions.

The benefit we're seeking with a block index is that most INSERTs don't
write to the index. With that scheme we'd need to continually update the
index tuple so that it exactly represented the heap after each inserted
tuple, which is going to cause a hot block problem.

Much of that can go away if we have a bulk_heap_insert() which puts the
index entries in at the end of the block, though that needs some heavy
thought also.

Can we have this scheme enabled *only* for functional block indexes?
Normal case is likely to be monotonically ascending keys for real world
objects like Orders, Calls, Transactions etc.. It sounds like the
original proposal is still valid for those cases.

The bitmap would allow us to access heap rows faster in some
circumstances, I suppose.

Multi-block bitmaps do make this too much like bitmap indexes, and the
use-case is very different. [Is there some kind of hybrid solution of
block & bitmap indexes?]

 It does change the format of an index tuple, unlike the original 
 proposal. That adds some complexity, but it's doable.

Can we use an info bit to have two index tuple formats?
- single tuple (as now)
- multiple tuple block bitmap (as you suggest)

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Andreas Pflug
Andrew Dunstan wrote:
 Marlon Petry wrote:


 pg_dump and pg_restore do not need to run on the server machine.
 Why not
 just run them where you want the dump stored?




 But I would need to have pg_dump and pg_restore installed on the
 client machine?
 Without pg_dump and pg_restore installed, how could I do it?




 You can't. pg_dump in particular embodies an enormous amount of
 knowledge that simply does not exist elsewhere. There is no
 dump/restore API, and there is nothing you can hook up to using JNI,
 AFAIK.
Recently, there was the proposal to extract that knowledge into a library
(making pg_dump itself just a wrapper). This sounds more and more
valuable; is anybody working on this for 8.3?

Regards,
Andreas




Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Markus Schaber
Hi, Marlon,

Marlon Petry wrote:

 But I would need to have pg_dump and pg_restore installed on the
 client machine?
 Without pg_dump and pg_restore installed, how could I do it?

pg_dump and pg_restore should be runnable (possibly with a small shell /
bash wrapper script) without any installation, simply by having them and
all needed libs lying in the current directory. They don't need any
registry keys, weird environment settings or such, just some libs which
should be present on most platforms, except libpq.

Using a Java application for dump/restore will burden you with
installing a JVM, the PostgreSQL JDBC driver, and your application,
which seems like at least equal effort and more resources.


Btw, another idea is to run pg_dump on the server, but pipe its output
to the client, e.g. by running it through telnet or SSH (yes, there are
SSH servers for Windows), and then piping it to a file on the client
(e.g. using plink.exe from the PuTTY package). Or use netcat or so.

On a Unix box, if you're really crazy and want to ignore all security
restrictions, you could even serve pg_dump via inetd, so that everyone
connecting via TCP on the appropriate port gets a dump of the
database. :-)

HTH,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org






Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Marlon Petry
 Marlon Petry wrote:
   pg_dump and pg_restore do not need to run on the server machine.
   Why not just run them where you want the dump stored?
 
  But I would need to have pg_dump and pg_restore installed on the
  client machine? Without pg_dump and pg_restore installed, how could
  I do it?
 
  You can't. pg_dump in particular embodies an enormous amount of
  knowledge that simply does not exist elsewhere. There is no
  dump/restore API, and there is nothing you can hook up to using JNI,
  AFAIK.
 
 Recently, there was the proposal to extract that knowledge into a
 library (making pg_dump itself just a wrapper). This sounds more and
 more valuable; is anybody working on this for 8.3?
 
 Regards,
 Andreas

I have interest in working on this; how could I start?

regards,
marlon


Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Gregory Stark
Csaba Nagy [EMAIL PROTECTED] writes:

   I think you build a whole new index named something like
   .temp-reindex and then as the last step of the second transaction
   delete the old index and rename the new index.
  
  That would require getting an exclusive lock on the table.
 
 Just out of curiosity, creating a new index concurrently (or online,
 whatever you call it) doesn't require taking an exclusive lock on the
 table? I thought it would, at least briefly at the end of the
 operation; after all, it's modifying the table...

Nope.

As I understand it the step that fundamentally requires a table lock is
actually dropping the old index. You have to be sure nobody is actually using
it before you do anything that causes people to stop maintaining it.

We could do something like how the online index build creates the index but in
reverse. We mark the index invalid and then wait out any transactions that
could be using it. Then we can drop it safely.

But I think even that has some practical problems. Transactions that have that
index in their relcache structure for the table will try to maintain it and
get confused if it's gone.

It seems to me that taking a brief lock on the table at the end of the reindex
isn't actually much of a problem. It only needs to be held briefly and it can
be done in a separate transaction so there isn't a deadlock risk.

-- 
greg




Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Andrew Dunstan

Marlon Petry wrote:


 You can't. pg_dump in particular embodies an enormous amount of
 knowledge that simply does not exist elsewhere. There is no
 dump/restore API, and there is nothing you can hook up to using JNI,
 AFAIK.
Recently, there was the proposal to extract that knowledge to a
library
(making pg_dump itself just a wrapper). This sounds valuable more and
more, is anybody working on this for 8.3?


I have interest in working on this; how could I start?




Start by reading the code in src/bin/pg_dump

Then after you recover from your head exploding you start devising some 
sort of sane API ...


cheers

andrew



Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Marlon Petry
  You can't. pg_dump in particular embodies an enormous amount of
  knowledge that simply does not exist elsewhere. There is no
  dump/restore API, and there is nothing you can hook up to using JNI,
  AFAIK.
 
  Recently, there was the proposal to extract that knowledge into a
  library (making pg_dump itself just a wrapper). This sounds more and
  more valuable; is anybody working on this for 8.3?
 
  I have interest in working on this; how could I start?
 
 Start by reading the code in src/bin/pg_dump.
 
 Then after you recover from your head exploding you start devising
 some sort of sane API ...
 
 cheers
 
 andrew

OK, I'm trying to start.


[HACKERS] send()/receive() and on-disk storage

2006-09-29 Thread Markus Schaber
Hi,

Currently, there's a discussion on the pljava list, and we're confused
about a small detail:

Does PostgreSQL call the datatype's defined send() function before
storing the tuple data in the table on disk, and call receive() when
reading it again?

My position is that send()/receive() are only used for client
communication in the V3 protocol's binary mode, and for COPY in binary
mode, and that the backend stores the data in the tables as-is (modulo
compression/TOASTing).

Is my assumption correct?

Btw, I'm going to triple-check this in the source and via tests, too.

Thanks,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org





Re: [HACKERS] send()/receive() and on-disk storage

2006-09-29 Thread Martijn van Oosterhout
On Fri, Sep 29, 2006 at 03:44:07PM +0200, Markus Schaber wrote:
 Does PostgreSQL call the datatypes' defined send() function before
 storing the tuple data in the table, on disk, and call receive() when
 reading it again?

No.

 My position was that send()/receive() are only used for client
 communication in the V3 protocol binary mode, and COPY in binary mode,
 but the backend stores them in the tables as_is (modulo compression/
 TOASTing).

Yes.

It would be terribly inefficient to call those functions for each
read/write. The disk has the internal format; send/receive deal with a
portable, non-host-dependent representation of the data.
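The distinction can be illustrated with a 4-byte integer (a sketch using Python's struct module; in PostgreSQL's binary wire format, integers travel in network byte order):

```python
import struct

# The wire (send/receive) form is host-independent: a 4-byte integer
# travels in network (big-endian) byte order, regardless of what the
# server's CPU keeps in memory and on disk.
wire = struct.pack("!i", 1)        # network byte order, as on the wire
assert wire == b"\x00\x00\x00\x01"

little = struct.pack("<i", 1)      # a little-endian host's in-memory form
assert little == b"\x01\x00\x00\x00"

# The receive() side decodes the portable form back to a host value.
assert struct.unpack("!i", wire)[0] == 1
```

Converting on every heap read/write would mean a byte-swap (and worse, for complex types) per access, which is why the on-disk form stays in the host-native internal format.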

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas

Simon Riggs wrote:

On Fri, 2006-09-29 at 10:51 +0100, Heikki Linnakangas wrote:

Heikki Linnakangas wrote:

If we want to keep the property that VACUUM doesn't re-evaluate index
entries, any proposal that doesn't keep track of every heap tuple
isn't going to work. I'll try to modify the design I had in mind so
that it does keep track of every heap tuple in some form.


The ideal situation is that we have one index pointer per block, so we
should look for that and optimize accordingly. We mark the heap block to
indicate how many block index pointers there are to it. If we have only
a single pointer, then VACUUM will only have to touch the index pointer
when the whole heap block is removed. In that case we have no
re-evaluation of the index AFAICS.


I don't see how that would work. It sounds similar to the reference 
counting option I proposed, which had the same re-evaluation problem.


And in addition, it requires adding index-specific information to the 
heap page format, which troubles me from a modularization viewpoint.



After some thought:

Imagine a normal B-tree just like what we have now. But when there is
more than one tuple on the same heap page with consecutive index keys,
we represent all of them in a single index tuple that contains the key
of the first one of them, and a (run-length encoded) bitmap of the
OffsetNumbers. We should get most of the space and I/O savings as with
the original proposal, but we can vacuum it without re-evaluating index
expressions.


The benefit we're seeking with a block index is that most INSERTs don't
write to the index. With that scheme we'd need to continually update the
index tuple so that it exactly represented the heap after each inserted
tuple, which is going to cause a hot block problem.


That's just one of the benefits. I think the main benefit is dramatic 
reduction in index size which means that more of the index is cached.


An INSERT will have to find the corresponding leaf page anyway. Having 
to dirty it isn't a big deal assuming that the hot blocks stay in cache.


The hot block problem worries me a bit too. Any indexing scheme that 
packs more items on a block is going to suffer from that. Testing will 
show if that becomes a problem.



Can we have this scheme enabled *only* for functional block indexes?


No. As Tom pointed out, data type specific functions have potentially 
the same problem.


And having both versions seems like a lot of code and complexity.


The bitmap would allow us to access heap rows faster in some
circumstances, I suppose.


Yes, you wouldn't have to re-evaluate index quals on every tuple, when 
the whole range represented by the index tuple falls within the range 
you're searching for. And when there are only a few tuples with consecutive 
keys on a heap page (which is not a good use case for block B-trees), 
you don't need to scan the whole page to find those matches.



Multi-block bitmaps do make this too much like bitmap indexes and the
use-case is very different. [Is there some kind of hybrid solution of
block & bitmap indexes?]


Not that I know of, though there are different kinds of bitmap indexes. 
The one that didn't make it to 8.2 uses equality encoding, where you 
have a bitmap for every distinct value. You can also have 
range-encoding, where you have a bitmap for ranges of values, for 
example one bitmap for 1-10, another for 10-15 etc. If you choose the 
ranges dynamically so that you have one range for each heap page (when 
it's clustered), you get something similar to the proposed Block B-tree.


The current bitmap encoding scheme is optimized for large bitmaps, 
though, so the performance wouldn't be as good.


It does change the format of an index tuple, unlike the original 
proposal. That adds some complexity, but it's doable.


Can we use an info bit to have two index tuple formats?
- single tuple (as now)
- multiple tuple block bitmap (as you suggest)


Yes. There's one bit free in the index tuple header.
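
For illustration, the key-plus-offset-bitmap idea above can be sketched in
Python. The run-length encoding here is a toy format invented for this sketch;
the thread never fixed an actual on-disk encoding:

```python
def rle_encode(offsets, max_offset):
    """Run-length encode a set of heap OffsetNumbers as (bit, run_length) pairs.

    Offsets are 1-based, as in PostgreSQL heap pages.
    """
    bits = [1 if i in offsets else 0 for i in range(1, max_offset + 1)]
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([b, 1])       # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Recover the set of OffsetNumbers from the run-length encoding."""
    offsets, pos = set(), 1
    for bit, length in runs:
        if bit:
            offsets.update(range(pos, pos + length))
        pos += length
    return offsets
```

A clustered page where offsets 1-3 and 7 match encodes as four short runs,
which is where the space saving relative to one index tuple per heap tuple
comes from.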

--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Tom Dunstan

Markus Schaber wrote:

Marlon Petry wrote:

But I would need to have installed pg_dump and pg_restore in machine
client?
Without having installed pg_dump and pg_restore,how I could make


pg_dump and pg_restore should be runnable (possibly with a small shell /
bash wrapper script) without any installation, simply having them and
all needed libs lying in the current directory.


There's probably a case for having static builds of pg_dump and 
pg_restore around for various architectures, if only to help people out 
when they don't have access to a build environment etc. Either a set of 
static binaries on the website, or an easy way to build them from the 
source tree (they could then be copied to the target system).


It strikes me that Marlon hasn't really explained why he wants to use 
JDBC. I assume that your application is Java based, but trust me, 
invoking pg_dump through Runtime.exec() or whatever is going to be much, 
much easier than any of the other things you've suggested, such as 
making a pg_dump API and using JNI to call it. That's just pain city, in 
a bunch of ways.


Do you need to process the dump inside your program in some way? Or do 
you just need to store a dump and restore it later? Why the fascination 
with using an API?
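
For comparison, the invoke-the-tool approach Tom recommends (Runtime.exec() in
Java) looks like this in Python. The -h/-p/-Fc/-f flags are ordinary pg_dump
options; the database name and output path are made up for the example:

```python
import subprocess

def dump_database(dbname, out_path, host="localhost", port=5432):
    """Build the pg_dump command line for a custom-format dump.

    Returns the argument list so callers (and tests) can inspect it.
    """
    cmd = ["pg_dump", "-h", host, "-p", str(port), "-Fc", "-f", out_path, dbname]
    # In a real application you would actually run it and check the result:
    # subprocess.run(cmd, check=True)
    return cmd
```

The point of the thread stands: assembling and running this command line is far
less work than wrapping pg_dump's internals behind a C API called via JNI.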



On a unix box, when you're really crazy, and want to ignore all security
restrictions, you could even install pg_dump via inetd, and then
everyone connecting via TCP on the appropriate port gets a dump of the
database. :-)


Oh, man, my head just exploded reading that. That's taking evil and 
being *creative* with it. :)


Cheers

Tom




Re: [HACKERS] [COMMITTERS] pgsql: Fix IS NULL and IS NOT NULL tests on row-valued

2006-09-29 Thread Teodor Sigaev

9.17.5. Row-wise Comparison
These constructs test a row value for null or not null. A row value is 
considered not null if it has at least one field that is not null.


I suppose it should be changed too.

Tom Lane wrote:

the SQL spec, viz IS NULL is true if all the row's fields are null, IS NOT
NULL is true if all the row's fields are not null.  The former coding got

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] send()/receive() and on-disk storage

2006-09-29 Thread Markus Schaber
Hi, Martijn,

Martijn van Oosterhout wrote:

 It would be terribly inefficient to call those functions for each
 read/write. The disk has the internal format, send/receive deal with a
 portable not-host-dependant representation of the data.

Thanks.

Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org





Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Jim C. Nasby
On Fri, Sep 29, 2006 at 01:15:06PM +0900, ITAGAKI Takahiro wrote:
 
 Jim C. Nasby [EMAIL PROTECTED] wrote:
 
  The reason I thought of this is because once the transaction commits, we
  have no use for the cid info. So we could do something like have
  bgwriter look for tuples that belong to committed transactions before it
  writes a page, and strip the cid out of them.
 
 Your concept is just like the experimental method that I suggested before
 in http://archives.postgresql.org/pgsql-hackers/2005-08/msg01193.php
 We can remove cmin and cmax from commited tuples and xmin from frozen tuples
 and we might save some bytes in tuple headers.
 
 However, I think our next goal to shrink the headers is 16 bytes. The headers
 become 23 bytes using phantom cids and we are limited by alignments, so we will
 have no more advantages unless we delete an extra 7 bytes from the headers.
 ...and it seems to be very difficult.

Dumb question... wouldn't getting down to 20 bytes buy us something?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Imagine a normal B-tree just like what we have now. But when there is 
 more than one tuple on the same heap page with consecutive index keys, 
 we represent all of them in a single index tuple that contains the key 
 of the first one of them, and a (run-length encoded) bitmap of the 
 OffsetNumbers.

At first I thought that was a typo, and instead of "consecutive" you
meant to write "equal".  I gather from the later statement

 I'm not very interested in the case where you have a lot of equal keys, 
 I think the bitmap index am is more suitable for that.

that indeed you meant to write "consecutive", and I've got a problem
with that: define "consecutive".  In a datatype-independent fashion,
please.  I also wonder how you are going to implement splitting and
merging of runs, which will certainly be necessary if this isn't to be
a constantly-requires-REINDEX thing.

regards, tom lane



Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
  
I'm not very interested in the case where you have a lot of equal keys, 
I think the bitmap index am is more suitable for that.



that indeed you meant to write consecutive, and I've got a problem
with that: define consecutive.  In a datatype independent fashion,
please.  I also wonder how you are going to implement splitting and
merging of runs, which will certainly be necessary if this isn't to be
a constantly-requires-REINDEX thing.
  


I don't mean consecutive as in "1, 2, 3, 4, ..." without gaps, but as in 
"A and B are consecutive in the index if there's no value X in the 
index so that A < X < B". Maybe there's a better word for that.


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] Faster StrNCpy

2006-09-29 Thread Tom Lane
Markus Schaber [EMAIL PROTECTED] writes:
 There's another disadvantage of always copying 64 bytes:
 It may be that the 64-byte range crosses a page boundary. Now guess what
 happens when this next page is not mapped - segfault.

Irrelevant, because in all interesting cases the Name field is part of a
larger record that would stretch into that other page anyway.

regards, tom lane



Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Jan de Visser
On Friday 29 September 2006 10:55, Heikki Linnakangas wrote:
 Tom Lane wrote:
  Heikki Linnakangas [EMAIL PROTECTED] writes:
  I'm not very interested in the case where you have a lot of equal keys,
  I think the bitmap index am is more suitable for that.
 
  that indeed you meant to write consecutive, and I've got a problem
  with that: define consecutive.  In a datatype independent fashion,
  please.  I also wonder how you are going to implement splitting and
  merging of runs, which will certainly be necessary if this isn't to be
  a constantly-requires-REINDEX thing.

 I don't mean consecutive as in "1, 2, 3, 4, ..." without gaps, but as in
 "A and B are consecutive in the index if there's no value X in the
 index so that A < X < B". Maybe there's a better word for that.

http://en.wikipedia.org/wiki/Monotonic

jan

-- 
--
Jan de Visser                     [EMAIL PROTECTED]

                Baruk Khazad! Khazad ai-menu!
--



Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Then after you recover from your head exploding you start devising some 
 sort of sane API ...

That's the hard part.  There is no percentage in having a library if
it doesn't do anything significantly different from what you could
accomplish via
system(pg_dump ...switches);

What is it you hope to accomplish by having a library, exactly?
(And don't say more control over the dump process.  pg_dump is already
on the hairy edge of maintainability; we do *not* need to try to deal
with making it still function correctly after an application programmer
makes some random intervention in the process.)

regards, tom lane



Re: [HACKERS] Another idea for dealing with cmin/cmax

2006-09-29 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 Dumb question... wouldn't getting down to 20 bytes buy us something?

Only on 32-bit machines, which are getting less interesting as database
servers every day.  (Just last night I was reading somebody opining that
the transition to 64-bit hardware would be effectively complete by 2008
... and he was talking about desktop PCs, not serious iron.)

BTW, the apparently useless byte after the 27- or 23-byte header
actually has some good use: in a table of up to 8 columns, you can
fit a null bitmap there for free.  In a scheme that took us down
to 20 rather than 19 bytes, even a narrow table would pay the full
maxalign price for having a null.

I'm in favor of combining cmin/cmax/xvac to get us down to 23 bytes,
but I think anything beyond that is going to face a serious problem
of greatly increased cost for diminishing returns.
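
The alignment arithmetic behind the "free byte" observation can be sketched.
MAXALIGN rounds a size up to the platform's alignment boundary; 4-byte and
8-byte alignment are shown here:

```python
def maxalign(n, align=8):
    """Round n up to the next multiple of align, like PostgreSQL's MAXALIGN."""
    return (n + align - 1) // align * align
```

With 4-byte alignment, maxalign(23, 4) == 24, so a 23-byte header leaves one
padding byte that can hold a null bitmap for up to 8 columns at no extra cost;
maxalign(20, 4) == 20 leaves no spare byte, so even a narrow table would pay a
full alignment step for having a null, which is the point above.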

regards, tom lane



Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 define consecutive.

 I don't mean consecutive as in "1, 2, 3, 4, ..." without gaps, but as in 
 "A and B are consecutive in the index if there's no value X in the 
 index so that A < X < B". Maybe there's a better word for that.

Um, but how are you going to make that work without storing the keys for
both A and B?  You won't be able to tell whether an incoming key C
that's just a bit bigger than A should go before or after B.

regards, tom lane



[HACKERS] Array assignment behavior (was Re: [ADMIN] Stored procedure array limits)

2006-09-29 Thread Tom Lane
[ expanding this thread, as it now needs wider discussion ]

Paul B. Anderson [EMAIL PROTECTED] writes:
 Actually, I was not filling all of the arrays in sequential order.  I 
 added code to initialize them in order and the function seems to be 
 working now.  Is that a known problem? 

Well, it's a documented behavior: section 8.10.4 saith

A stored array value can be enlarged by assigning to an element
adjacent to those already present, or by assigning to a slice
that is adjacent to or overlaps the data already present.

Up to 8.2 we didn't have a lot of choice about this, because without any
ability to have nulls embedded in arrays, there wasn't any sane thing to
do with the intermediate positions if you assigned to an element not
adjacent to the existing range.  As of 8.2 we could allow assignment to
arbitrary positions by filling the intermediate positions with nulls.
The code hasn't actually been changed to allow that, but it's something
we could consider doing now.

Comments?

regards, tom lane



Re: [HACKERS] Block B-Tree concept

2006-09-29 Thread Heikki Linnakangas

Tom Lane wrote:

Heikki Linnakangas [EMAIL PROTECTED] writes:
  
I don't mean consecutive as in "1, 2, 3, 4, ..." without gaps, but as in 
"A and B are consecutive in the index if there's no value X in the 
index so that A < X < B". Maybe there's a better word for that.



Um, but how are you going to make that work without storing the keys for
both A and B?  You won't be able to tell whether an incoming key C
that's just a bit bigger than A should go before or after B.
  


Let me describe the insertion algorithm:

1. To insert a tuple with key X, we find the position in the index where 
the new tuple would go, just like with a normal B-tree. Let's call the 
index tuple right before the position A and the next tuple B. So 
according to normal B-tree rules, X should go between A and B.


2. If A points to the same heap page as X, we set the bit representing 
the offset of the new tuple in the index tuple A (this might require 
enlarging the index tuple and even splitting the page), and we're done. 
If it points to a different page, we need to split the range A-B to 
A-X-B; proceed to step 3.


3. To split the range A-B, scan the heap page to see which of the tuples 
pointed to by A are >= X and which are < X. If there are no tuples >= X, 
insert a new index tuple for X, and we're done. Otherwise, let Y be the 
smallest tuple >= X. Insert a new index tuple for Y, containing all the 
offsets with keys >= X, and remove those offsets from A. We have now 
split A-B to A-Y-B so that A < X < Y < B.


4. Insert the new index tuple for X.

(I'm not sure I got the above description correct for cases with equal 
keys.)


Note that we don't keep track of the ordering of tuples that are clumped 
into a single index tuple. That's not important, I/O wise, because if 
you're going to fetch a heap page into memory, you might as well scan 
all the tuples on it and sort them if necessary. That's where the space 
and I/O savings comes from.
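
The insertion steps above can be sketched against a toy in-memory index.
This is illustrative only: each index tuple is a [first_key, heap_page,
offset_set] entry kept sorted by key, `heap_keys` stands in for scanning the
heap page, and equal-key handling is glossed over, as the post itself notes:

```python
import bisect

def block_btree_insert(index, key, heap_page, offset, heap_keys):
    """index: sorted list of [first_key, heap_page, set_of_offsets] entries.
    heap_keys: maps (heap_page, offset) -> key, standing in for a page scan."""
    # Step 1: find A, the index tuple right before the insert position.
    pos = bisect.bisect_right([t[0] for t in index], key)
    a = index[pos - 1] if pos > 0 else None

    # Step 2: if A points to the same heap page, just set the offset bit.
    if a is not None and a[1] == heap_page:
        a[2].add(offset)
        return

    if a is not None:
        # Step 3: split A's range. Move A's offsets whose keys are >= key
        # into a new index tuple Y keyed by the smallest such key.
        moved = {o for o in a[2] if heap_keys[(a[1], o)] >= key}
        if moved:
            y_key = min(heap_keys[(a[1], o)] for o in moved)
            a[2] -= moved
            bisect.insort(index, [y_key, a[1], moved])

    # Step 4: insert the new index tuple for the incoming key.
    bisect.insort(index, [key, heap_page, {offset}])
```

Inserting keys 10 and 20 for page 1 collapses into one entry (step 2), while a
later key 15 on page 2 forces the split of steps 3 and 4.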


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com




[HACKERS] a little doubt about domains and pl/python

2006-09-29 Thread Walter Cruz

Hi all. I'm playing with pl/python and I have a question.

I want to create a domain for Roman numerals.

I have made:

CREATE OR REPLACE FUNCTION valid_roman(text)
RETURNS BOOLEAN
IMMUTABLE
LANGUAGE plpythonu AS
$$
from roman9 import fromRoman

roman = args[0]
try:
  r = fromRoman(roman)
  return 'true'
except:
   return 'false'
$$;

CREATE DOMAIN romano AS text
 CHECK (valid_roman(VALUE))
 ;


CREATE OR REPLACE FUNCTION to_roman(integer)
RETURNS romano
IMMUTABLE
LANGUAGE plpythonu AS
$$
from roman9 import toRoman

number = args[0]
try:
  r = toRoman(number)
  return r
except:
 return 'false'
$$;

CREATE CAST (integer AS romano) WITH FUNCTION to_roman(integer);

And that works:

SELECT to_roman(50)
SELECT to_roman(100)
SELECT valid_roman('XX')
SELECT valid_roman('XXX')
SELECT valid_roman('')

But my cast doesn't work:

SELECT 5::romano

ERROR:  value for domain romano violates check constraint "romano_check"

What's wrong?

[]'s
- Walter
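
As an aside, the validation itself doesn't need an external module; a
standalone sketch using a regular expression for standard-form Roman numerals.
This replaces the `roman9` module the post assumes, which is not shown:

```python
import re

# Standard-form Roman numeral: thousands, hundreds, tens, units (I..MMMCMXCIX).
_ROMAN_RE = re.compile(
    r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)

def valid_roman(text):
    """Return True if text is a non-empty, well-formed Roman numeral."""
    return bool(text) and _ROMAN_RE.match(text) is not None
```

Note that a pl/python function should return an actual Python boolean; string
results like 'true'/'false' are easy to get wrong, since any non-empty string
is truthy in Python.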



Re: [HACKERS] [COMMITTERS] pgsql: Fix IS NULL and IS NOT NULL tests on row-valued expressions to

2006-09-29 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 9.17.5. Row-wise Comparison
 These constructs test a row value for null or not null. A row value is 
 considered not null if it has at least one field that is not null.

Wups, I missed that part of the docs, will fix.  Thanks.

regards, tom lane



Re: [HACKERS] [GENERAL] Array assignment behavior (was Re: [ADMIN] Stored procedure array limits)

2006-09-29 Thread Casey Duncan

On Sep 29, 2006, at 9:14 AM, Tom Lane wrote:


[ expanding this thread, as it now needs wider discussion ]

Paul B. Anderson [EMAIL PROTECTED] writes:

Actually, I was not filling all of the arrays in sequential order.  I
added code to initialize them in order and the function seems to be
working now.  Is that a known problem?


Well, it's a documented behavior: section 8.10.4 saith

A stored array value can be enlarged by assigning to an element
adjacent to those already present, or by assigning to a slice
that is adjacent to or overlaps the data already present.

Up to 8.2 we didn't have a lot of choice about this, because without any
ability to have nulls embedded in arrays, there wasn't any sane thing to
do with the intermediate positions if you assigned to an element not
adjacent to the existing range.  As of 8.2 we could allow assignment to
arbitrary positions by filling the intermediate positions with nulls.
The code hasn't actually been changed to allow that, but it's something
we could consider doing now.

Comments?


At first blush, this strikes me as a bit too magical/implicit. Are
there other languages where sequences behave similarly? The best
analogy that comes to mind is sparse files, but in that case there is
an implicit contract that the intervening empty regions do not
actually occupy physical space; that doesn't sound like it's true here.


I think the result of this change would be more difficult debugging  
of off-by-one errors and their ilk, rather than actually being a real  
benefit.


OTOH, perhaps there is a real use-case I am missing here. I don't see  
the rest of this thread on GENERAL and I couldn't find it searching  
the archives, where did it come from?


-Casey




Re: [HACKERS] [GENERAL] Array assignment behavior (was Re: [ADMIN] Stored procedure array limits)

2006-09-29 Thread John D. Burger

As of 8.2 we could allow assignment to
arbitrary positions by filling the intermediate positions with nulls.
The code hasn't actually been changed to allow that, but it's something
we could consider doing now.


At first blush, this strikes me as a bit too magical/implicit. Are  
there other languages where sequences behave similarly?


 perl -e '@A = (1, 2, 3); print "@A\n"; $A[10] = 10; print "@A\n";'
1 2 3
1 2 3        10
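
Python, by contrast, raises IndexError on out-of-range list assignment, so the
Perl-style auto-extension has to be spelled out explicitly. An illustrative
helper, not anything PostgreSQL provides:

```python
def assign_extend(lst, idx, value, fill=None):
    """Assign lst[idx] = value, padding intermediate slots with fill."""
    if idx >= len(lst):
        lst.extend([fill] * (idx + 1 - len(lst)))
    lst[idx] = value
    return lst
```

With fill=None standing in for SQL NULL, this mirrors the proposed array
behavior: assigning to position 10 of a 3-element array fills positions 4
through 10 with nulls.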

- John D. Burger
  MITRE




Re: [HACKERS] Backup and restore through JDBC

2006-09-29 Thread Andreas Pflug
Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
   
 Then after you recover from your head exploding you start devising some 
 sort of sane API ...
 

 That's the hard part.  There is no percentage in having a library if
 it doesn't do anything significantly different from what you could
 accomplish via
   system(pg_dump ...switches);

 What is it you hope to accomplish by having a library, exactly?
 (And don't say "more control over the dump process." 
Some more progress feedback would be really nice.
  pg_dump is already
 on the hairy edge of maintainability; we do *not* need to try to deal
 with making it still function correctly after an application programmer
 makes some random intervention in the process.)
   
Agreed. The only sane approach seems to have a single dump function call
(that takes a set of parameters as prepared by command line scanning)
and a set of callbacks that enable api users to do sensible stuff at
different stages of the backup process.

Regards,
Andreas







[HACKERS] Nulls, arrays, records, IS NULL, IS DISTINCT FROM

2006-09-29 Thread Tom Lane
Following up yesterday's discussion, I've been studying the SQL spec for
null predicate and distinct predicate, and it seems a bit
inconsistent.

The rules for distinct predicate make it clear that you are supposed
to drill down into row and array values to determine distinctness.
SQL99 has

a) If the declared type of X or Y is an array type, then X IS
  DISTINCT FROM Y is effectively computed as follows:

  i) Let NX be the number of elements in X; let NY be the number
 of elements in Y.

 ii) Let EX(i) be the i-th element of X; let EY(i) be the i-th
 element of Y.

iii) Case:

 1) If NX is not equal to NY, then X IS DISTINCT FROM Y is
   true.

 2) If NX equals zero and NY equals zero, then X IS
   DISTINCT FROM Y is false.

 3) If EX(i) IS DISTINCT FROM EY(i) is false for all i
   between 1 (one) and NX, then X IS DISTINCT FROM Y is
   false.

 4) Otherwise, X IS DISTINCT FROM Y is true.

SQL2003 has completely rewritten the text but the meaning seems the
same.  I suppose we want to generalize the NX/NY business to say
if the array bounds are not identical then the arrays are distinct.
We are clearly getting this wrong since the introduction of nulls in
arrays, but I'll go fix that.
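
The SQL99 array rule quoted above amounts to a recursive check. A sketch,
treating Python None as SQL NULL and lists as arrays (Python equality standing
in for the type's equality operator):

```python
def is_distinct(x, y):
    """SQL IS DISTINCT FROM, drilling into lists per the SQL99 array rule."""
    if x is None and y is None:
        return False              # two NULLs are not distinct
    if x is None or y is None:
        return True               # NULL vs. non-NULL: distinct
    if isinstance(x, list) and isinstance(y, list):
        if len(x) != len(y):
            return True           # differing cardinality: distinct
        return any(is_distinct(a, b) for a, b in zip(x, y))
    return x != y
```

Unlike plain equality, this yields a definite answer even when the arrays
contain nulls: [1, NULL] IS DISTINCT FROM [1, NULL] is false, not unknown.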

Similarly, given two row expressions, distinctness is determined
field-wise: X and Y are distinct if any two corresponding fields
are distinct.  We are currently getting this correct only for
the case of parse-time ROW expressions, ie
ROW(x,y,z) IS [NOT] DISTINCT FROM ROW(xx,yy,zz)
This is pretty much analogous to the case Teodor noted yesterday
for IS NULL: it's not being done in gram.y but it's still being
done much too early.  We need to be able to do it in the executor
to handle situations where a row value is coming from a function
or some other source that's not disassemblable at parse time.

What's bothering me is that for foo IS [NOT] NULL, the spec clearly
prescribes drilling down into a rowtype value to examine the individual
fields, but I can't find any language that prescribes the same for
arrays.  Is this intentional, or an oversight?  In particular, the
spec says
ROW(1,2,NULL) IS NOT NULL
is false, because the row fields must be *all* not null to make it true.
But it's very unclear whether
ARRAY[1,2,NULL] IS NOT NULL
should be false on the same reasoning.  Right now, we respond true on
the grounds that the array object as-a-whole isn't null, without
examining its contents.
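
The row-value asymmetry can be made concrete: under the spec's rules, IS NULL
and IS NOT NULL are not complements for rows (None again standing in for NULL):

```python
def row_is_null(fields):
    """SQL: ROW(...) IS NULL is true only if every field is null."""
    return all(f is None for f in fields)

def row_is_not_null(fields):
    """SQL: ROW(...) IS NOT NULL is true only if every field is not null."""
    return all(f is not None for f in fields)
```

For ROW(1, 2, NULL) both predicates are false, which is exactly the case the
paragraph above describes; the open question is whether ARRAY[1, 2, NULL]
should behave the same way.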

Comments?  Does anyone see any guidance in the spec?  If there is none,
which behavior do we think is most useful/consistent?

regards, tom lane



Re: [HACKERS] Win32 hard crash problem

2006-09-29 Thread Joshua D. Drake
Joshua D. Drake wrote:
 Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 Yes, unfortunately there isn't much more to be had for another 2
 weeks ;)

 I trust they've got the reboot time and they will know exactly how long
 from reboot to problem?  I'm not all that sold on the GetTickCount
 overflow theory, but certainly we ought not be missing a chance to test
 or disprove it.
 
 Yes I documented all conversations and disclaimers :)

O.k., further on this: the crashing is happening quickly now, but not
predictably (as in sometimes a week, sometimes 2 days). I just now got
them to send some further logs... Interestingly:


2006-09-28 16:38:37.406  LOG:  could not send data to client: An
operation on a socket could not be performed because the system lacked
sufficient buffer space or because a queue was full.

That log entry is the last (of consequence) entry before the machine says:

2006-09-28 16:40:36.921  LOG:  received fast shutdown request
2006-09-28 16:40:36.921  LOG:  aborting any active transactions
2006-09-28 16:40:36.921  FATAL:  terminating connection due to
administrator command

On the ERROR side of things I have a bunch of standard, unique key
violations etc... AND:

postgresql-2006-09-27_00.log:2006-09-27 23:49:57.671  FATAL:  could
not read from statistics collector pipe: No error

I have requested a clean run with entire log at DEBUG2. Hopefully that
will give us more info.
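
For reference on the overflow theory mentioned earlier: GetTickCount returns
milliseconds since boot as an unsigned 32-bit value, so it wraps after roughly
49.7 days (simple arithmetic, not a diagnosis of this particular crash):

```python
# GetTickCount wraps when the unsigned 32-bit millisecond counter overflows.
WRAP_MS = 2 ** 32
wrap_days = WRAP_MS / 1000 / 60 / 60 / 24
print(round(wrap_days, 1))  # time between wraparounds, in days
```

That matches the "sometimes a week, sometimes 2 days" pattern only if uptime
resets or other triggers are involved, which is why recording the reboot time
matters for testing the theory.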


Sincerely,

Joshua D. Drake

-- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





Re: [HACKERS] JAVA Support

2006-09-29 Thread Henry B. Hotz


On Sep 28, 2006, at 9:35 PM, Tom Lane wrote:


Joshua D. Drake [EMAIL PROTECTED] writes:
Is there any reason why we haven't built a generic authentication  
API?

Something like PAM, except cross platform?


We're database geeks, not security/crypto/authentication geeks.  What
makes you think we have any particular competence to do the above?

Actually, the part of this proposal that raised my hackles the most was
the claim that GSSAPI provides a generic auth API, because that was
exactly the bill of goods we were sold in connection with PAM.  (So why
is this our problem at all --- can't you make a PAM plugin for it??)
It didn't help any that that was shortly followed by the lame admission
that no one has ever implemented anything except Kerberos underneath it.

Word to the wise, guys: go *real* soft on vaporware claims for auth
stuff, because we've seen enough of those before.


Well, that's why I was pushing SASL instead of GSSAPI.  There are  
multiple mechanisms that are actually in use.


PAM turned out not to be sufficiently specified for cross-platform  
behavioral compatibility, and it only does password checking anyway.   
Calling it a security solution is a big overstatement IMO.  I guess a  
lot of people use PAM with SSL and don't worry about the gap between  
the two (which SASL or GSSAPI close).


In defense of GSSAPI, non-Kerberos mechanisms do exist.  They just
cost money and they aren't very cross-platform.  AFAIK GSSAPI has no
simple password mechanisms.


There's a Microsoft-compatible SPNEGO mechanism for GSSAPI that's  
being implemented fairly widely now, but it's just a sub-negotiation  
mech that lets you choose between a Kerberos 5 (that's practically  
identical to the direct one), and NTLM.  If you allow NTLM you'd  
better limit it to NTLMv2!


 


The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
[EMAIL PROTECTED], or [EMAIL PROTECTED]





Re: [HACKERS] JAVA Support

2006-09-29 Thread Henry B. Hotz


On Sep 29, 2006, at 12:31 AM, Magnus Hagander wrote:


However, that doesn't change that some people would like us to support
GSSAPI, and there may be some benefit (additional applications, better
network authentication, etc.) for doing so.  If we can get additional
programmers to code the support (i.e. Sun, JPL) I don't see any reason
not to support the *additional* authentication methods.


Well, as I said already, a lot depends on the size of the patch.
As a reductio ad absurdum, if they drop 100K lines of code on us,
it *will* get rejected, no matter how cool it is.


Oh, absolutely.



The current Kerberos support seems to require about 50 lines in
configure.in and circa 200 lines of C code in each of the backend
and libpq.  Plus a dependency on an outside library that happens to
be readily available and compatibly licensed.


I would expect, without looking at the details of the API, GSSAPI to be
about the same amount of code if not less.


Probably save some Kerberos bookkeeping.  Probably lose it with
GSSAPI bookkeeping, including name translation (which is far less
obvious).  Net, I would expect to lose, but not by very much.



What amount of code are we talking about adding here, and what
dependencies exactly?  What portability and license hazards will be
added?


The Kerberos5 libraries that we rely on today provide GSSAPI. So it
would work with the same external library. Now, it could *also* work
with other libraries in some cases (for example, the Win32 SSPI
libraries), but with the same libraries it should work fine.

//Magnus


If I had a lot of time to spend on this I would write a SASL-like  
wrapper so it could be used on platforms with GSSAPI, but not SASL  
support in the OS.  As you may have noticed, I believe SASL is the  
way to go.  I'm not up for it though.


There's probably room in the world for a SASL-lite library though.   
Cyrus is great, but if your OS doesn't supply it for you, it's  
supposed to be really hard to build.



 


The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
[EMAIL PROTECTED], or [EMAIL PROTECTED]





[HACKERS] Testing strlcpy ()

2006-09-29 Thread Strong, David
We've completed some tests comparing Postgres 8.2beta1 (beta1) and Postgres 
8.2beta1 with strlcpy () (strlcpy).
 
First and foremost, the patch seems to be stable - we have not run into any 
issues with it.
 
After a database reload, strlcpy (107 TPS) performs ~11% better than beta1 
(96 TPS). This seems to be consistent over a number of runs. Originally, we 
had only seen a 3% difference here.
 
After 8 (or so) runs, the difference between the two versions disappears and 
both versions produce ~250 TPS. Both versions, over time, spend more and more 
time in LWLockAcquire. 
 
David

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Nulls, arrays, records, IS NULL, IS DISTINCT FROM

2006-09-29 Thread David Fetter
On Fri, Sep 29, 2006 at 12:53:19PM -0400, Tom Lane wrote:
 Following up yesterday's discussion, I've been studying the SQL spec for
 null predicate and distinct predicate, and it seems a bit
 inconsistent.

Do you have the official spec in hand, or just the draft from wiscorp?

 SQL2003 has completely rewritten the text but the meaning seems the
 same.  I suppose we want to generalize the NX/NY business to say if
 the array bounds are not identical then the arrays are distinct.

That sounds like a reasonable generalization.

 We are clearly getting this wrong since the introduction of nulls in
 arrays, but I'll go fix that.

 Similarly, given two row expressions, distinctness is determined
 field-wise: X and Y are distinct if any two corresponding fields
 are distinct.  We are currently getting this correct only for
 the case of parse-time ROW expressions, ie
   ROW(x,y,z) IS [NOT] DISTINCT FROM ROW(xx,yy,zz)
 This is pretty much analogous to the case Teodor noted yesterday
 for IS NULL: it's not being done in gram.y but it's still being
 done much too early.  We need to be able to do it in the executor
 to handle situations where a row value is coming from a function
 or some other source that's not disassemblable at parse time.
 
 What's bothering me is that for foo IS [NOT] NULL, the spec clearly
 prescribes drilling down into a rowtype value to examine the individual
 fields, but I can't find any language that prescribes the same for
 arrays.  Is this intentional, or an oversight?  In particular, the
 spec says
   ROW(1,2,NULL) IS NOT NULL
 is false, because the row fields must be *all* not null to make it true.

That's odd because as I understand the above,

ROW(1,2,NULL) IS NULL

is also false.  Have I misunderstood?

 But it's very unclear whether
   ARRAY[1,2,NULL] IS NOT NULL
 should be false on the same reasoning.  Right now, we respond true on
 the grounds that the array object as-a-whole isn't null, without
 examining its contents.
 
 Comments?  Does anyone see any guidance in the spec?

Sadly, I don't have the official one in hand.  Is there one available?

 If there is none, which behavior do we think is most
 useful/consistent?

My reading of the (provisional) spec is that a complex type (ROW,
ARRAY, MULTISET) is NULL iff all of its elements are NULL or the whole
of it is NULL.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Nulls, arrays, records, IS NULL, IS DISTINCT FROM

2006-09-29 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 On Fri, Sep 29, 2006 at 12:53:19PM -0400, Tom Lane wrote:
 In particular, the spec says
  ROW(1,2,NULL) IS NOT NULL
 is false, because the row fields must be *all* not null to make it true.

 That's odd because as I understand the above,
 ROW(1,2,NULL) IS NULL
 is also false.  Have I misunderstood?

Yup, they are both false.  The spec goes to some trouble to make it
clear that IS NULL/IS NOT NULL are not inverses for row values:

NOTE 135 - For all R, R IS NOT NULL has the same result as
NOT R IS NULL if and only if R is of degree 1. Table 16,
null predicate semantics, specifies this behavior.

What I find odd is the lack of comparable language about arrays.
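[Editorial note: the spec's truth table for row values can be modeled in a few lines of C. This is an illustrative sketch only, not PostgreSQL code; the per-field null flags stand in for a real row descriptor.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch of the spec's <null predicate> semantics for a row value R:
 *   R IS NULL      is true iff every field is null;
 *   R IS NOT NULL  is true iff every field is non-null.
 * For degree > 1 the two predicates are not inverses: a row mixing
 * null and non-null fields fails both tests, exactly as NOTE 135 says. */
static bool row_is_null(const bool *field_is_null, size_t degree)
{
    for (size_t i = 0; i < degree; i++)
        if (!field_is_null[i])
            return false;
    return true;
}

static bool row_is_not_null(const bool *field_is_null, size_t degree)
{
    for (size_t i = 0; i < degree; i++)
        if (field_is_null[i])
            return false;
    return true;
}
```

For ROW(1,2,NULL) the flags are {false, false, true}, and both functions return false.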

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] a little doubr about domains and pl/python

2006-09-29 Thread Tom Lane
Walter Cruz [EMAIL PROTECTED] writes:
 CREATE CAST (integer AS romano) WITH FUNCTION to_roman(integer);

 What's wrong?

Domains aren't supported as cast source/targets at the moment; I don't
think the system is finding your cast specification at all, but is
instead doing this as 5::text::romano which of course fails.  See past
discussions about the subtleties of trying to fix this.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Array assignment behavior (was Re: [ADMIN] Stored procedure array

2006-09-29 Thread Paul B. Anderson




It seems that the suggestion to fill intermediate positions with
NULLs would be preferable to the current behavior. 

I know of no requirement to populate arrays in sequence in any other
language so I think other programmers would be surprised too by the
current behavior.

Paul


Tom Lane wrote:

  [ expanding this thread, as it now needs wider discussion ]

"Paul B. Anderson" [EMAIL PROTECTED] writes:
  
  
Actually, I was not filling all of the arrays in sequential order.  I 
added code to initialize them in order and the function seems to be 
working now.  Is that a known problem? 

  
  
Well, it's a documented behavior: section 8.10.4 saith

	A stored array value can be enlarged by assigning to an element
	adjacent to those already present, or by assigning to a slice
	that is adjacent to or overlaps the data already present.

Up to 8.2 we didn't have a lot of choice about this, because without any
ability to have nulls embedded in arrays, there wasn't any sane thing to
do with the intermediate positions if you assigned to an element not
adjacent to the existing range.  As of 8.2 we could allow assignment to
arbitrary positions by filling the intermediate positions with nulls.
The code hasn't actually been changed to allow that, but it's something
we could consider doing now.

Comments?

			regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match






Re: [HACKERS] [GENERAL] Array assignment behavior (was Re: [ADMIN] Stored procedure

2006-09-29 Thread Erik Jones

Yep, that definitely threw me the first time I encountered it.

Paul B. Anderson wrote:
It seems that the suggestion to fill intermediate positions with NULLs 
would be preferable to the current behavior. 

I know of no requirement to populate arrays in sequence in any other 
language so I think other programmers would be surprised too by the 
current behavior.


Paul


Tom Lane wrote:

[ expanding this thread, as it now needs wider discussion ]

Paul B. Anderson [EMAIL PROTECTED] writes:
  
Actually, I was not filling all of the arrays in sequential order.  I 
added code to initialize them in order and the function seems to be 
working now.  Is that a known problem? 



Well, it's a documented behavior: section 8.10.4 saith

A stored array value can be enlarged by assigning to an element
adjacent to those already present, or by assigning to a slice
that is adjacent to or overlaps the data already present.

Up to 8.2 we didn't have a lot of choice about this, because without any
ability to have nulls embedded in arrays, there wasn't any sane thing to
do with the intermediate positions if you assigned to an element not
adjacent to the existing range.  As of 8.2 we could allow assignment to
arbitrary positions by filling the intermediate positions with nulls.
The code hasn't actually been changed to allow that, but it's something
we could consider doing now.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match




--
erik jones [EMAIL PROTECTED]
software development
emma(r)


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] Array assignment behavior (was Re: [ADMIN] Stored procedure array limits)

2006-09-29 Thread Tom Lane
John D. Burger [EMAIL PROTECTED] writes:
 As of 8.2 we could allow assignment to
 arbitrary positions by filling the intermediate positions with nulls.
 The code hasn't actually been changed to allow that, but it's  
 something we could consider doing now.
 
 At first blush, this strikes me as a bit too magical/implicit. Are  
 there other languages where sequences behave similarly?

 perl -e '@A = (1, 2, 3); print "@A\n"; $A[10] = 10; print "@A\n";'
 1 2 3
 1 2 3        10

Actually, now that I look closely, I think the SQL spec demands exactly
this.  Recall that SQL99 only allows one-dimensional, lower-bound-one
arrays.  The specification for UPDATE ... SET C[I] = SV ... reads

  Case:

  i) If the value of C is null, then an exception condition is
 raised: data exception - null value in array target.

 ii) Otherwise:

 1) Let N be the maximum cardinality of C.

 2) Let M be the cardinality of the value of C.

 3) Let I be the value of the simple value specification
   immediately contained in update target.

 4) Let EDT be the element type of C.

 5) Case:

   A) If I is greater than zero and less than or equal to
  M, then the value of C is replaced by an array A
  with element type EDT and cardinality M derived as
  follows:

  I) For j varying from 1 (one) to I-1 and from I+1 to
M, the j-th element in A is the value of the j-th
element in C.

 II) The I-th element of A is set to the specified
update value, denoted by SV, by applying the
General Rules of Subclause 9.2, Store assignment,
to the I-th element of A and SV as TARGET and
VALUE, respectively.

   B) If I is greater than M and less than or equal to
  N, then the value of C is replaced by an array A
  with element type EDT and cardinality I derived as
  follows:

  I) For j varying from 1 (one) to M, the j-th element
in A is the value of the j-th element in C.

 II) For j varying from M+1 to I-1, the j-th element in
A is the null value.

III) The I-th element of A is set to the specified
update value, denoted by SV, by applying the
General Rules of Subclause 9.2, Store assignment,
to the I-th element of A and SV as TARGET and
VALUE, respectively.

   C) Otherwise, an exception condition is raised: data
  exception - array element error.

We currently violate case i by allowing the null array value to be
replaced by a single-element array.  I'm disinclined to change that,
as I think our behavior is more useful than the spec's.  But case ii.5.B
pretty clearly describes null-fill, so I think we'd better do that, now
that we can.
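[Editorial note: as an illustration only (not PostgreSQL internals), the null-fill rule of case ii.5.B can be sketched in C with a parallel null bitmap; the spec's maximum-cardinality check (case ii.5.C for I > N) and allocation error checking are omitted.]

```c
#include <stdbool.h>
#include <stdlib.h>

/* Model of a 1-D, lower-bound-one SQL array: values plus null flags. */
typedef struct {
    int  *elems;
    bool *isnull;
    int   card;      /* current cardinality M */
} IntArray;

/* UPDATE ... SET C[i] = value, per case ii.5:
 * returns false for "array element error" (i < 1); when i > M the
 * array is extended and positions M+1 .. i-1 become null (ii.5.B.II). */
static bool array_set(IntArray *a, int i, int value)
{
    if (i < 1)
        return false;
    if (i > a->card) {
        a->elems  = realloc(a->elems,  i * sizeof(int));
        a->isnull = realloc(a->isnull, i * sizeof(bool));
        for (int j = a->card; j < i - 1; j++) {
            a->elems[j]  = 0;
            a->isnull[j] = true;   /* spec: the j-th element is the null value */
        }
        a->card = i;
    }
    a->elems[i - 1]  = value;
    a->isnull[i - 1] = false;
    return true;
}
```

Assigning to C[10] of a three-element array leaves positions 4..9 null and cardinality 10.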

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] Array assignment behavior (was Re: [ADMIN] Stored procedure

2006-09-29 Thread Erik Jones
Ok, just so I can be sure I understand what I just read: case i. says that
you cannot assign into an array whose value is null (an exception is
raised).  Case ii. states that the index of an insertion into an array need
not be limited by the current range of index values of the array, and
requires any gaps in the index range to be filled with nulls.  I don't
really have anything to add to the discussion other than that I agree with
Tom's assessment; I just want to clarify what might be a slightly hazy
interpretation of the spec quoted below.


Tom Lane wrote:

Actually, now that I look closely, I think the SQL spec demands exactly
this.  Recall that SQL99 only allows one-dimensional, lower-bound-one
arrays.  The specification for UPDATE ... SET C[I] = SV ... reads

  Case:

  i) If the value of C is null, then an exception condition is
 raised: data exception - null value in array target.

 ii) Otherwise:

 1) Let N be the maximum cardinality of C.

 2) Let M be the cardinality of the value of C.

 3) Let I be the value of the simple value specification
   immediately contained in update target.

 4) Let EDT be the element type of C.

 5) Case:

   A) If I is greater than zero and less than or equal to
  M, then the value of C is replaced by an array A
  with element type EDT and cardinality M derived as
  follows:

  I) For j varying from 1 (one) to I-1 and from I+1 to
M, the j-th element in A is the value of the j-th
element in C.

 II) The I-th element of A is set to the specified
update value, denoted by SV, by applying the
General Rules of Subclause 9.2, Store assignment,
to the I-th element of A and SV as TARGET and
VALUE, respectively.

   B) If I is greater than M and less than or equal to
  N, then the value of C is replaced by an array A
  with element type EDT and cardinality I derived as
  follows:

  I) For j varying from 1 (one) to M, the j-th element
in A is the value of the j-th element in C.

 II) For j varying from M+1 to I-1, the j-th element in
A is the null value.

III) The I-th element of A is set to the specified
update value, denoted by SV, by applying the
General Rules of Subclause 9.2, Store assignment,
to the I-th element of A and SV as TARGET and
VALUE, respectively.

   C) Otherwise, an exception condition is raised: data
  exception - array element error.

We currently violate case i by allowing the null array value to be
replaced by a single-element array.  I'm disinclined to change that,
as I think our behavior is more useful than the spec's.  But case ii.5.B
pretty clearly describes null-fill, so I think we'd better do that, now
that we can.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
  



--
erik jones [EMAIL PROTECTED]
software development
emma(r)


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Per-database search_path

2006-09-29 Thread David Fetter
Folks,

I'd like to propose a feature for 8.3, which would let login roles
have a default search_path both globally, as it is now, and
per-database.  This is because in general no two databases have
any schemas in common, and a login role should be able to do something
reasonable just by connecting.

What would be involved with making this so?  So far, I can see:

* Storing this information in pg_catalog somehow
* Setting the search_path when a login role connects

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Per-database search_path

2006-09-29 Thread Josh Berkus
DF,

 I'd like to propose a feature for 8.3, which would let login roles
 have a default search_path both globally, as it is now, and
 per-database.  This is because in general no two databases have
 any schemas in common, and a login role should be able to do something
 reasonable just by connecting.

Yes.  I've been bitten by this more than once ...

However, it almost seems like this would become a piece of the other 
per-database-user stuff we'd like to do, like local superuser.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Per-database search_path

2006-09-29 Thread David Fetter
On Fri, Sep 29, 2006 at 01:06:09PM -0700, Josh Berkus wrote:
 DF,
 
  I'd like to propose a feature for 8.3, which would let login roles
  have a default search_path both globally, as it is now, and
  per-database.  This is because in general no two databases have
  any schemas in common, and a login role should be able to do
  something reasonable just by connecting.
 
 Yes.  I've been bitten by this more than once ...

Come to think of it, this ties in with the ON (DIS)CONNECT TRIGGER
idea that others have brought up in the past.  The trigger idea may be
too big a hammer, and might even be ill-advised from a design point of
view, but I thought I'd bring it up anyhow.

 However, it almost seems like this would become a piece of the other
 per-database-user stuff we'd like to do, like local superuser.

I'm not sure that's the same.  The thing about superuser as it exists
now is the ability to write to the filesystem, which means that
there's no boundary really possible.  Maybe some kind of tiered system
of users with more than 2 tiers...but that sounds pretty byzantine to
me.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Faster StrNCpy

2006-09-29 Thread mark
On Fri, Sep 29, 2006 at 11:21:21AM +0200, Markus Schaber wrote:
 Tom Lane wrote:
  Just wondering - are any of these cases where a memcpy() would work
  just as well? Or are you not sure that the source string is at least
  64 bytes in length?
  
  In most cases, we're pretty sure that it's *not* --- it'll just be a
  palloc'd C string.
  
  I'm disinclined to fool with the restriction that namestrcpy zero-pad
  Name values, because they might end up on disk, and allowing random
  memory contents to get written out is ungood from a security point of
  view.
 
 There's another disadvantage of always copying 64 bytes:
 
 It may be that the 64-byte range crosses a page boundary. Now guess what
 happens when this next page is not mapped - segfault.

With strncpy(), this possibility already exists. If it is a real problem,
that stand-alone 64-byte allocations are crossing page boundaries, the
fault is with the memory allocator, not with the user of the memory.

For strlcpy(), my suggestion that Tom quotes was that modern processors
do best when instructions can be fully parallelized. It is a lot
easier to parallelize a 64-byte copy than a tight loop looking for
'\0' or n == 64. 64 bytes easily fits into cache memory, and modern
processors write cache memory in blocks of 16, 32, or 64 bytes anyway,
meaning that any savings in terms of not writing are minimal.

But it's only safe if you know that the source string allocation is
>= 64 bytes. Often you don't, therefore it isn't safe, and the suggestion
is unworkable.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Faster StrNCpy

2006-09-29 Thread mark
If anybody is curious, here are my numbers for an AMD X2 3800+:

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to 
be slow."' -o x x.c y.c strlcpy.c ; ./x
NONE:620268 us
MEMCPY:  683135 us
STRNCPY:7952930 us
STRLCPY:   10042364 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -o x x.c y.c strlcpy.c ; ./x
NONE:554694 us
MEMCPY:  691390 us
STRNCPY:7759933 us
STRLCPY:3710627 us

$ gcc -O3 -std=c99 -DSTRING='""' -o x x.c y.c strlcpy.c ; ./x
NONE:631266 us
MEMCPY:  775340 us
STRNCPY:7789267 us
STRLCPY: 550430 us

Each invocation represents 100 million calls to each of the functions.
Each function accepts a 'dst' and 'src' argument, and assumes that it
is copying 64 bytes from 'src' to 'dst'. The none function does
nothing. The memcpy calls memcpy(), the strncpy calls strncpy(), and
the strlcpy calls the strlcpy() that was posted from the BSD sources.
(GLIBC doesn't have strlcpy() on my machine).

This makes it clear what the overhead of the additional logic involves.
memcpy() is approximately equal to nothing at all. strncpy() is always
expensive. strlcpy() is often more expensive than memcpy(), except in
the empty string case.

These tests do not properly model the effects of real memory, however,
they do model the effects of cache memory. I would suggest that the
results are exaggerated, but not invalid.

For anybody doubting the none vs memcpy, I've included the generated
assembly code. I chalk it entirely up to fully utilizing the
parallelization capability of the CPU. Although 16 movq instructions
are executed, they can be executed fully in parallel.

It almost makes it clear to me that all of these instructions are
pretty fast. Are we sure this is a real bottleneck? Even the slowest
operation above, strlcpy() on a very long string, appears to execute
10 per microsecond? Perhaps my tests are too easy for my CPU and I
need to make it access many different 64-byte blocks? :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/

.file   x.c
.text
.p2align 4,,15
.globl x_none
.type   x_none, @function
x_none:
.LFB14:
rep ; ret
.LFE14:
.size   x_none, .-x_none
.p2align 4,,15
.globl x_strlcpy
.type   x_strlcpy, @function
x_strlcpy:
.LFB17:
movl$64, %edx
jmp strlcpy
.LFE17:
.size   x_strlcpy, .-x_strlcpy
.p2align 4,,15
.globl x_strncpy
.type   x_strncpy, @function
x_strncpy:
.LFB16:
movl$64, %edx
jmp strncpy
.LFE16:
.size   x_strncpy, .-x_strncpy
.p2align 4,,15
.globl x_memcpy
.type   x_memcpy, @function
x_memcpy:
.LFB15:
movq(%rsi), %rax
movq%rax, (%rdi)
movq8(%rsi), %rax
movq%rax, 8(%rdi)
movq16(%rsi), %rax
movq%rax, 16(%rdi)
movq24(%rsi), %rax
movq%rax, 24(%rdi)
movq32(%rsi), %rax
movq%rax, 32(%rdi)
movq40(%rsi), %rax
movq%rax, 40(%rdi)
movq48(%rsi), %rax
movq%rax, 48(%rdi)
movq56(%rsi), %rax
movq%rax, 56(%rdi)
ret
.LFE15:
.size   x_memcpy, .-x_memcpy
.section .eh_frame,"a",@progbits
.Lframe1:
.long   .LECIE1-.LSCIE1
.LSCIE1:
.long   0x0
.byte   0x1
.string "zR"
.uleb128 0x1
.sleb128 -8
.byte   0x10
.uleb128 0x1
.byte   0x3
.byte   0xc
.uleb128 0x7
.uleb128 0x8
.byte   0x90
.uleb128 0x1
.align 8
.LECIE1:
.LSFDE1:
.long   .LEFDE1-.LASFDE1
.LASFDE1:
.long   .LASFDE1-.Lframe1
.long   .LFB14
.long   .LFE14-.LFB14
.uleb128 0x0
.align 8
.LEFDE1:
.LSFDE3:
.long   .LEFDE3-.LASFDE3
.LASFDE3:
.long   .LASFDE3-.Lframe1
.long   .LFB17
.long   .LFE17-.LFB17
.uleb128 0x0
.align 8
.LEFDE3:
.LSFDE5:
.long   .LEFDE5-.LASFDE5
.LASFDE5:
.long   .LASFDE5-.Lframe1
.long   .LFB16
.long   .LFE16-.LFB16
.uleb128 0x0
.align 8
.LEFDE5:
.LSFDE7:
.long   .LEFDE7-.LASFDE7
.LASFDE7:
.long   .LASFDE7-.Lframe1
.long   .LFB15
.long   .LFE15-.LFB15
.uleb128 0x0
.align 8
.LEFDE7:
.ident  GCC: (GNU) 4.1.1 20060525 (Red Hat 4.1.1-1)
.section .note.GNU-stack,"",@progbits


Re: [HACKERS] Faster StrNCpy

2006-09-29 Thread Tom Lane
[EMAIL PROTECTED] writes:
 If anybody is curious, here are my numbers for an AMD X2 3800+:

You did not show your C code, so no one else can reproduce the test on
other hardware.  However, it looks like your compiler has unrolled the
memcpy into straight-line 8-byte moves, which makes it pretty hard for
anything operating byte-wise to compete, and is a bit dubious for the
general case anyway (since it requires assuming that the size and
alignment are known at compile time).

This does make me wonder about whether we shouldn't try the
strlen+memcpy implementation I proposed earlier ...
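[Editorial note: for reference, a strlen+memcpy copy of the kind being discussed might look like the sketch below, assuming a 64-byte, zero-padded Name-style target such as namestrcpy fills. This is an illustration of the idea, not the actual proposal or patch.]

```c
#include <string.h>

#define NAMEDATALEN 64

/* Copy at most NAMEDATALEN-1 bytes of src into a fixed 64-byte buffer
 * and zero-fill the remainder, so no uninitialized memory can end up
 * on disk.  memset + memcpy let the compiler/libc use wide, parallel
 * moves instead of a byte-at-a-time terminator-scanning loop. */
static void name_strcpy(char *dst, const char *src)
{
    size_t len = strlen(src);

    if (len >= NAMEDATALEN)
        len = NAMEDATALEN - 1;      /* leave room for the NUL terminator */
    memset(dst, 0, NAMEDATALEN);    /* zero-pad the whole Name */
    memcpy(dst, src, len);
}
```

Unlike the memcpy-only variant, this reads only strlen(src)+1 bytes of the source, so it is safe for short palloc'd strings.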

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Win32 hard crash problem

2006-09-29 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 O.k. further on this.. the crashing is happening quickly now but not
 predictably. (as in sometimes a week sometimes 2 days).

OK, that seems to eliminate the GetTickCount-overflow theory anyway.

 That log entry is the last (of consequence) entry before the machine says:
 2006-09-28 16:40:36.921  LOG:  received fast shutdown request

Oh?  That's pretty interesting on a Windows machine, because AFAIK there
wouldn't be any standard mechanism that might tie into our homegrown
signal facility.  Anyone have a theory on what might trigger a SIGINT
to the postmaster, other than intentional pg_ctl invocation?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Faster StrNCpy

2006-09-29 Thread mark
On Fri, Sep 29, 2006 at 05:34:30PM -0400, Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  If anybody is curious, here are my numbers for an AMD X2 3800+:
 You did not show your C code, so no one else can reproduce the test on
 other hardware.  However, it looks like your compiler has unrolled the
 memcpy into straight-line 8-byte moves, which makes it pretty hard for
 anything operating byte-wise to compete, and is a bit dubious for the
 general case anyway (since it requires assuming that the size and
 alignment are known at compile time).

I did show the .s code. I call into x_memcpy(a, b), meaning that the
compiler can't assume anything. It may happen to be aligned.

Here are results over 64 Mbytes of memory, to ensure that every call is
a cache miss:

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to 
be very slow."' -DN='(1024*1024)' -o x x.c y.c strlcpy.c ; ./x
NONE:767243 us
MEMCPY: 6044137 us
STRNCPY:   10741759 us
STRLCPY:   12061630 us
LENCPY: 9459099 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -DN='(1024*1024)' -o x x.c y.c 
strlcpy.c ; ./x
NONE:712193 us
MEMCPY: 6072312 us
STRNCPY:9982983 us
STRLCPY:6605052 us
LENCPY: 7128258 us

$ gcc -O3 -std=c99 -DSTRING='""' -DN='(1024*1024)' -o x x.c y.c strlcpy.c ; ./x 
NONE:708164 us
MEMCPY: 6042817 us
STRNCPY:8885791 us
STRLCPY:5592477 us
LENCPY: 6135550 us

At least on my machine, memcpy() still comes out on top. Yes, assuming that
it is aligned correctly for the machine. Here is unaligned (all arrays are
stored +1 offset in memory):

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to 
be very slow."' -DN='(1024*1024)' -DALIGN=1 -o x x.c y.c strlcpy.c ; ./x
NONE:790932 us
MEMCPY: 6591559 us
STRNCPY:   10622291 us
STRLCPY:   12070007 us
LENCPY:10322541 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -DN='(1024*1024)' -DALIGN=1 -o 
x x.c y.c strlcpy.c ; ./x
NONE:764577 us
MEMCPY: 6631731 us
STRNCPY:9513540 us
STRLCPY:6615345 us
LENCPY: 7263392 us

$ gcc -O3 -std=c99 -DSTRING='""' -DN='(1024*1024)' -DALIGN=1 -o x x.c y.c 
strlcpy.c ; ./x
NONE:825689 us
MEMCPY: 660 us
STRNCPY:8976487 us
STRLCPY:5878088 us
LENCPY: 6180358 us

Alignment looks like it does impact the results for memcpy(). memcpy()
changes from around 6.0 seconds to 6.6 seconds. Overall, though, it is
still the winner in all cases except for strlcpy(), which beats it on
very short strings.

Here is the cache hit case including your strlen+memcpy as 'LENCPY':

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to 
be very slow."' -DN=1 -o x x.c y.c strlcpy.c ; ./x
NONE:696157 us
MEMCPY:  825118 us
STRNCPY:7983159 us
STRLCPY:   10787462 us
LENCPY: 6048339 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -DN=1 -o x x.c y.c strlcpy.c ; 
./x
NONE:700201 us
MEMCPY:  593701 us
STRNCPY:7577380 us
STRLCPY:3727801 us
LENCPY: 3169783 us

$ gcc -O3 -std=c99 -DSTRING='""' -DN=1 -o x x.c y.c strlcpy.c ; ./x
NONE:706283 us
MEMCPY:  792719 us
STRNCPY:7870425 us
STRLCPY: 681334 us
LENCPY: 2062983 us


In the first set of runs, every call was a cache hit. With this one, every
call is a cache miss, and the 64-byte blocks are spread equally over 64
Mbytes of memory. I've attached the code for your consideration. x.c is the routines
I used to perform the tests. y.c is the main program. strlcpy.c is copied
from the online reference as is without change. The compilation steps
are described above. STRING is the string to try out. N is the number
of 64-byte blocks to allocate. ALIGN is the number of bytes to offset
the array by when storing / reading / writing. ALIGN should be >= 0.

At N=1, it's all in cache. At N=1024*1024 it is taking up 64 Mbytes of
RAM.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/

#include <string.h>
#include <sys/types.h>

size_t
strlcpy(char *dst, const char *src, size_t siz);

void x_none(char * restrict a, const char * restrict b) {
// Do nothing.
}

void x_memcpy(char * restrict a, const char * restrict b) {
memcpy(a, b, 64);
}

void x_strncpy(char * restrict a, const char * restrict b) {
strncpy(a, b, 64);
}

void x_strlcpy(char * restrict a, const char * restrict b) {
strlcpy(a, b, 64);
}

void x_strlenmemcpy(char * restrict a, const char * restrict b) {
size_t len = strlen(b) + 1;
memcpy(a, b, len < 64 ? len : 64);
}

#include <stdio.h>
#include <string.h>
#include 

Re: [HACKERS] Per-database search_path

2006-09-29 Thread David Fetter
On Fri, Sep 29, 2006 at 05:41:35PM -0400, Tom Lane wrote:
 David Fetter [EMAIL PROTECTED] writes:
  On Fri, Sep 29, 2006 at 01:06:09PM -0700, Josh Berkus wrote:
  However, it almost seems like this would become a piece of the
  other per-database-user stuff we'd like to do, like local
  superuser.
 
  I'm not sure that's the same.  The thing about superuser as it
  exists now is the ability to write to the filesystem, which means
  that there's no boundary really possible.
 
 Yeah.  ISTM the correct generalization is per-user per-database
 default GUC settings, which has nothing to do with superuserness.

This sounds like a TODO for 8.3.  What wrinkles might this involve?
Offhand, I'm thinking that it would touch the inheritance stuff that
roles have.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Per-database search_path

2006-09-29 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 On Fri, Sep 29, 2006 at 05:41:35PM -0400, Tom Lane wrote:
 Yeah.  ISTM the correct generalization is per-user per-database
 default GUC settings, which has nothing to do with superuserness.

 This sounds like a TODO for 8.3.  What wrinkles might this involve?

Probably rethink the rolconfig/datconfig representation.  Maybe it's
time for a separate catalog for the settings.

 Offhand, I'm thinking that it would touch the inheritance stuff that
 roles have.

No, it wouldn't, because defaults only apply at the instant of
connection, so there's no inheritance or SET ROLE to worry about.
Whatever role you log in as is what you get.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] The enormous s-childXids problem

2006-09-29 Thread Robert Treat
On Saturday 16 September 2006 20:34, Tom Lane wrote:
 Gregory Stark [EMAIL PROTECTED] writes:
  Tom Lane [EMAIL PROTECTED] writes:
  The real question is why does the subtransaction actually assign itself
  an XID --- a simple RETURN NEXT operation ought not do that, AFAICS.
 
  I suspect the answer to that is the same as the answer to what's actually
  creating the subtransaction. plperl_return_next doesn't. I think
  something must be doing an actual SPI query, not just a return next.

 The other question on the table is why it didn't respond to QueryCancel
 in a reasonable amount of time.  I'd really like to see a complete test
 case for this problem ...


I think the plperl was a red herring.  Once dbi-link grabs a recordset, the 
rows are looped over, processed, and then inserted (based on some 
conditionals) into another table. Those inserts are wrapped in a 
BEGIN ... EXCEPTION block, which, since it is in a loop, I suspect is creating 
the large number of childXids in cases where there are a large number of 
inserts.   I haven't tested that theory, but it seems logical, and it should be 
easy enough to reproduce with a simple LOOP ... END LOOP in plpgsql.  
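
A minimal plpgsql sketch of that theory (the table name and error condition
are hypothetical; the point is that each BEGIN ... EXCEPTION block inside the
loop should start one subtransaction):

```sql
-- Hypothetical reproduction: every iteration's exception block opens a
-- subtransaction, so a long loop accumulates a large childXids list.
CREATE FUNCTION many_subxacts(n integer) RETURNS void AS $$
BEGIN
    FOR i IN 1 .. n LOOP
        BEGIN
            INSERT INTO some_target VALUES (i);  -- some_target is assumed
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- swallow the error, as dbi-link's conditional inserts do
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```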

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Win32 hard crash problem

2006-09-29 Thread Joshua D. Drake
Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
 O.k. further on this.. the crashing is happening quickly now but not
 predictably. (as in sometimes a week sometimes 2 days).
 
 OK, that seems to eliminate the GetTickCount-overflow theory anyway.
 
 That log entry is the last (of consequence) entry before the machine says:
 2006-09-28 16:40:36.921  LOG:  received fast shutdown request
 
 Oh?  That's pretty interesting on a Windows machine, because AFAIK there
 wouldn't be any standard mechanism that might tie into our homegrown
 signal facility.  Anyone have a theory on what might trigger a SIGINT
 to the postmaster, other than intentional pg_ctl invocation?

Well, the other option would be a Windows restart. On Windows, would that
send a SIGINT to the backend?

Joshua D. Drake


 
   regards, tom lane
 


-- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] JAVA Support

2006-09-29 Thread Bruce Momjian
Henry B. Hotz wrote:
 Well, that's why I was pushing SASL instead of GSSAPI.  There are  
 multiple mechanisms that are actually in use.
 
 PAM turned out not to be sufficiently specified for cross-platform  
 behavioral compatibility, and it only does password checking anyway.   
 Calling it a security solution is a big overstatement IMO.  I guess a  
 lot of people use PAM with SSL and don't worry about the gap between  
 the two (which SASL or GSSAPI close).
 
 In defense of GSSAPI, non-Kerberos mechanisms do exist.  They just  
 cost money and they aren't very cross-platform.  AFAIK GSSAPI has no  
 simple password mechanisms.
 
 There's a Microsoft-compatible SPNEGO mechanism for GSSAPI that's  
 being implemented fairly widely now, but it's just a sub-negotiation  
 mech that lets you choose between a Kerberos 5 (that's practically  
 identical to the direct one), and NTLM.  If you allow NTLM you'd  
 better limit it to NTLMv2!

As already mentioned, the limitations of PAM weren't clear until after
we implemented it, so I expect the same to happen here.  The number of
acronyms flying around in this discussion is a bad sign, too.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Per-database search_path

2006-09-29 Thread Bruce Momjian

Added to TODO:

* Allow more complex user/database default GUC settings

  Currently, ALTER USER and ALTER DATABASE support per-user and
  per-database defaults.  Consider adding per-user-and-database
  defaults so things like search_path can be defaulted for a
  specific user connecting to a specific database.
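
One possible shape for such a setting, shown purely as a hypothetical syntax
sketch (it combines today's ALTER USER ... SET and ALTER DATABASE ... SET;
the role, database, and schema names are invented):

```sql
-- Hypothetical syntax, not implemented as of this discussion:
ALTER USER alice IN DATABASE sales SET search_path = sales_schema, public;
```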


---

Tom Lane wrote:
 David Fetter [EMAIL PROTECTED] writes:
  On Fri, Sep 29, 2006 at 05:41:35PM -0400, Tom Lane wrote:
  Yeah.  ISTM the correct generalization is per-user per-database
  default GUC settings, which has nothing to do with superuserness.
 
  This sounds like a TODO for 8.3.  What wrinkles might this involve?
 
 Probably rethink the rolconfig/datconfig representation.  Maybe it's
 time for a separate catalog for the settings.
 
  Offhand, I'm thinking that it would touch the inheritance stuff that
  roles have.
 
 No, it wouldn't, because defaults only apply at the instant of
 connection, so there's no inheritance or SET ROLE to worry about.
 Whatever role you log in as is what you get.
 
   regards, tom lane
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] New version of money type

2006-09-29 Thread tomas
On Fri, Sep 29, 2006 at 12:19:07PM +0200, Martijn van Oosterhout wrote:
 On Thu, Sep 28, 2006 at 06:32:11PM -0500, Jim C. Nasby wrote:
  What would be ideal is a money type that stored what currency was used
  and let you change precision (within reason).
 
 The taggedtypes version of currency does half of that, by storing the
 currency and allowing the output format to depend on that. It doesn't
 allow you to easily change the precision though, that would require
 user-defined typmod which is still under discussion.
 
 It would be possible to create a taggedtypes version of currency based
 on int64. With the currency tag it would be 12 bytes total. And the
 number of decimal places could be defined per currency...
 
 Interesting thought, probably wouldn't take more than an hour to whip
 up.

While you are at it, it's worth noting that the currency tag consists of
three capital ASCII letters. That would be fifteen bits, give or take.
That leaves forty-eightish bits for the number, or about 10^14.

This is only half-serious, since there are other problems with
currencies (their value is dependent on time, all that Pandora's box).

Regards
- -- tomás




Re: [HACKERS] New version of money type

2006-09-29 Thread David Fetter
On Sat, Sep 30, 2006 at 04:42:13AM +, [EMAIL PROTECTED] wrote:
 
 On Fri, Sep 29, 2006 at 12:19:07PM +0200, Martijn van Oosterhout wrote:
  On Thu, Sep 28, 2006 at 06:32:11PM -0500, Jim C. Nasby wrote:
 If you are at that, it's worth noting that the currency tag consists
 of three capital ASCII letters. That would be fifteen bits, take or
 give.

I feel silly for even mentioning this, but there are fewer than 256
countries in the UN, and as far as I know, each has at most one
currency, so you could use 8 bits instead of 15.

 That leaves forty-eightish bits for the number or about 10^14.

By the above calculation, 56 bits or about 7.2 * 10^16.
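
The bit arithmetic above can be sanity-checked directly, assuming the value
and tag are packed into a single int64:

```sql
-- ceil(log2(26^3)) = 15 bits cover all three-letter tags AAA..ZZZ;
-- whatever is left of the 64 bits holds the numeric value.
SELECT ceil(log(2, 26.0 ^ 3)) AS tag_bits,        -- 15
       2.0 ^ (64 - 15)        AS range_15bit_tag, -- ~5.6 * 10^14
       2.0 ^ (64 - 8)         AS range_8bit_tag;  -- ~7.2 * 10^16
```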

 This is only half-serious, since there are other problems with
 currencies (their value is dependent on time, all that Pandora's
 box).

It's not just dependent on time.  Exchange rates vary in such a way
that the relationships are not always transitive :P

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
  Skype: davidfetter

Remember to vote!
