date:20050508

Re: [HACKERS] Can we get patents?

2005-05-08 Thread Marc G. Fournier

On Sun, 8 May 2005, Alvaro Herrera wrote:
Hackers,
I was reading LWN.net and noticed an article about Eben Moglen's keynote
at linux.conf.au.  Apparently he advises free software projects to get
patents on their best ideas.
Eben encouraged free software developers to record their novel
inventions and to obtain patents on the best of them. Free legal
help can be made available to obtain patents on the best ideas.
Until the rules of the game can be changed, we must play the
game, and having the right patents available may make all the
difference in defending against an attack.
http://lwn.net/Articles/133421/
Eben Moglen is the FSF's attorney.
I'm wondering, could the PostgreSQL Foundation (or some other entity)
get patents on some parts of Postgres?  Maybe ResourceOwners for
example; or the newer parts of the optimizer.
The patents would be freely licensed to everyone (including commercial
redistributors and developers/users of competing products), except to
patent litigators, or something like that.
Individual developers could get their work patent'd, I would imagine ...

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen

Tatsuo Ishii
 Sent: Sunday, May 08, 2005 3:41 PM
 To: John Hansen
 Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us; 
 [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Patch for collation using ICU

  Alvaro Herrera wrote:
   Sent: Sunday, May 08, 2005 2:49 PM
   To: John Hansen
   Cc: Tatsuo Ishii; pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
   pgsql-hackers@postgresql.org
   Subject: Re: [HACKERS] Patch for collation using ICU

   On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
Tatsuo Ishii wrote:

 So Japanese(including ASCII)/UNICODE behavior is
   perfectly correct
 at this moment.

Right, so you _never_ use accented ascii characters in 
 Japanese? 
(like è for example, whose uppercase is È)

   That isn't ASCII.  It's latin1 or some other ASCII extension.

  Point taken...
  But...

  If you want EUC_JP (Japanese + ASCII) then use that as your 
 backend encoding, not UTF-8 (unicode).
  UTF-8 encoded databases are very useful for representing multiple 
  languages in the same database, but this usefulness 
 vanishes if functions like upper/lower doesn't work correctly.

 I'm just curious if Germany/French/Spanish mixed text can be 
 sorted correctly. I think these languages need their own 
 locales even with UNICODE/ICU.

No, they will not sort correctly, for that you still need the locale.

  So optimizing for 3 languages breaks more than a hundred, 
 that's doesn't seem fair!

That is a compromise I'd be willing to agree on. :)

 Why don't you add a GUC variable or some such to control the 
 upper/lower behavior?
 --
 Tatsuo Ishii

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

2005-05-08 Thread John Hansen

Tatsuo Ishii wrote:
 Sent: Sunday, May 08, 2005 3:31 PM
 To: John Hansen
 Cc: [EMAIL PROTECTED]; pgsql-general@postgresql.org; 
 pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

  Tatsuo Ishii wrote:
   Sent: Sunday, May 08, 2005 12:01 PM
   To: [EMAIL PROTECTED]
   Cc: pgsql-general@postgresql.org; pgsql-hackers@postgresql.org
   Subject: Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

   We have developed patches which relaxes the character 
 validation so 
   that PostgreSQL accepts invalid characters. It works like this:

  That is just plain 100% wrong!!

  Under no circumstances should there be invalid data in a database.
  And if you're trying to make a database of invalid data, 
 then at least 
  encode it using a valid encoding.

  In fact, I've proposed strengthening the validation 
 routines for UTF-8.

 Actually I myself thought as you are before. Later I found 
 that it was not so good idea. People already have invalid 
 encoded data in their precious database and have very hard 
 time to migrate to newer version of PostgreSQL because of 
 encoding validation.

 Think about this kind of situation:

 There is a table t1(member_id integer primary key, 
 member_name text, address text, phone text, email text). I 
 have to reach each member by either adress, phone or email. 
 Unfortunately some of address field have wrong encoded data. 
 In this case I will use phone or email to reach them. 

 Now I need to upgrade to newer PostgreSQL within 1 day. I 
 know I have to fix wrong encoded field but it will take more 
 than 1 day. So I would like to import the data first then fix 
 wrong encoded field on running database since I can reach 
 members by phone or email even with wrong encoded address field...

Actually would be very simple, create function isvalidutf8(text) in your
preferred language.
C source is available from unicode.org.
Create function converttoutf8(text) using whatever code is required to
transform the _wrong_ encoding (SQL_ASCII - UTF8 for instance) to
utf-8.
Update table set field=converttoutf8(field) where !isvalidutf8(field);

Now sit back and relax while your invalid data is converted to utf-8.

When done, pg_dump the database, upgrade, and reload.

This should take less than a day.

 I saw this kind of situation in the real world and that's why 
 we developed the patches.
 --
 Tatsuo Ishii

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

[HACKERS] Will new release require an initdb?

2005-05-08 Thread Devrim GUNDUZ

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Hi,
The question is in the subject line: Will the new dot releases require an 
initdb, since they will play with system catalogs?

Regards,
- --
Devrim GUNDUZ 
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.tdmsoft.com.tr  http://www.gunduz.org
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFCfeHutl86P3SPfQ4RAr+2AKDuu/WYJOvtDOVUKnOBjCOGhA4PlQCfd3k3
AE8E7mckCUDT0Wb/qPDFmSk=
=vpx2
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] pl/pgsql enabled by default

2005-05-08 Thread Rod Taylor

 2. Issue a query like:
 
 SELECT *
 FROM view_of_salaries_based_on_current_user
 WHERE my_side_effect_function_that_inserts_into_a_temp_table(salary, 
 employee);

An SRF will guarantee an execution order and work for security purposes,
but getting your function with side effects to run early would be rather
challenging. Somehow you would need to make the planner think the
function is fairly selective without it actually being that way and the
only way that I'm aware of to get the planner to consider a functions
selectivity is to index it. But how do you index the function without
knowing all of the values?

-- 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Magnus Hagander

 Is this patch ready for application?

  
http://people.freebsd.org/~girgen/postgresql-icu/pg-802-icu-200
5-05-06.d
 iff.gz

 The web site is:

  http://people.freebsd.org/~girgen/postgresql-icu/readme.html

I don't think so, not quite. I have not had any positive 
reports from linux 
users, this is only tested in a FreeBSD environment. I'd say 
it needs some more testing.

I've just finished some simple testing on win32, and it does seem to work fine 
there as well, with a few modifications to the build step. As I don't have a  
working autoconf, I applied the stuff that would come from your configure.in 
changes directly to the files. Meaning putting USE_ICU in pg_config.h, and the 
following changes to Makefile.global:
1) Add the directory for the ICU include files to CPPFLAGS
2) Add -licuuc -licuin to LIBS. I notice these are different names from those 
used on Unix, so a different configure test will be needed there.
3) Add the icu lib directory to LDFLAGS
4) Remove encodings PG_WIN_1258/PG_TCVN and IBM866/PG_ALT. Didn't 
investigate further why this was needed, but this probably has something to do 
with my tests being off CVS tip vs the patch being for 8.0-stable.


After doing this, I can properly get upper and lowercase for the swedish 
characters åäö/ÅÄÖ, as well as the ß-SS conversion. That's all I really tested 
at this point, but it did *not* work before in a unicode/UTF8 database.

I've been working off the ICU 3.2 binaries available from the ICU page, the 
version compiled with MSVC 6 (because it uses the same runtime DLLs as stuff 
compiled with mingw. It should work with the MSVC7 version as well, but that 
would introduce additional DLL dependencies).

//Magnus

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

[HACKERS] why two WRITE_NODE_FIELD(whereCluase)?

2005-05-08 Thread Tatsuo Ishii

I see WRITE_NODE_FIELD(whereClause) twice in _outSelectStmt(). Maybe a
bug?
--
Tatsuo Ishii

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Magnus Hagander

 The 3.2 vs 2.8 business is disturbing also; specifically, I 
 don't think we get to require 3.2 on a platform where 2.8 is 
 installed.

There seems to be nothing in the ICU licence that would prevent us from
bundling it.
This would solve both the 3.2 vs 2.8 problems, and would remove the
'dependency'.

 People just aren't going to hold still for that, even 
 assuming that ICU supports installing both versions at once, 
 which isn't clear to me at the moment ...

There's no problems with having both installed.

... unless you're on win32, it seems.  For some reason, they name their
libs with the version on unix (libicu18n.so.32), but not on win32 where
they all have the same name. And they don't stuff versioning information
in the DLL files.
That can be lived with as long as libpq doesn't depend on it, though -
you can just stick the DLL in the same directory as the EXE, which is
also what the ICU people recommend in their docs. Unnecessarily ugly,
but it works.


I did that on debian to get the patch going.
Tho, bundling it seems cleaner to me.

The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2.
That means the size of the distribution would almost *double* if we
bundled ICU.

It's probably fine bundling it in the binary distributions (at least
we'd probably do it on win32, since not many ppl will have it already
there), but bundling the source seems a bit excessive to me.

//Magnus

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen

 The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2.
 That means the size of the distribution would almost *double* 
 if we bundled ICU.

Ermm,. Don't forget to remove the current charset conversions and locale
support before making your size estimation.

 
 It's probably fine bundling it in the binary distributions 
 (at least we'd probably do it on win32, since not many ppl 
 will have it already there), but bundling the source seems a 
 bit excessive to me.
 
 //Magnus
 
 

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-08 Thread Heikki Linnakangas

On Sat, 7 May 2005, Tom Lane wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes:
Maybe we should take a different approach to the problem:
1. Create new file with an extension to mark that it's not
yet committed (eg. 1234.notcommitted)
This is pushing the problem into the wrong place, viz the lowest-level
file access routines, which will now all have to know about
.notcommitted status.  It also creates race conditions --- think about
backend A trying to commit file 1234 at about the same time that
backend B is trying to flush some dirty buffers belonging to that file.
True. With the rename variant, it might indeed get messy.
Consider the variant with extra marker files. In that case, backend B 
doesn't have to know about the .notcommitted status to flush the buffers.

But most importantly, it doesn't handle the file-deletion case.
File-deletions are easy to handle. Just write the list of pending 
deletions to WAL on commit.

To recap, we have 2 slightly different scenarios:
a) Delete a file, write commit record, crash
b) Create a file, crash
Just WAL logging the deletions on commit would take care of A. The 
.notcommitted mechanism would take care of B.

- Heikki
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Tatsuo Ishii

  I don't buy it. If current conversion tables does the right 
  thing, why we need to replace. Or if conversion tables are 
  not correct, why don't you fix it? I think the rule of 
  character conversion will not change frequently, especially 
  for LATIN languages. Thus maintaining cost is not too high.
 
 I never said we need to, but if we're going to implement ICU,
 then we might as well go all the way.

So you admit there's no benefit using ICU for replacing existing
conversions?

Besides ICU does not support all existing conversions, I think ICU has
serious flaw for using conversion. If I understand correctly, ICU uses
UNICODE internally to do the conversion. For example, to implement
SJIS-EUC_JP conversion, ICU first converts SJIS to UNICODE then
converts UNICODE to EUC_JP. Problem is these conversion is not roud
trip(conversion between SJIS/EUC_JP and UNICODE will lose some
information). Thus SJIS-EUC_JP-SJIS conversion using ICU does not
preserve original text.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Palle Girgensohn


--On söndag, maj 08, 2005 22.19.25 +0900 Tatsuo Ishii [EMAIL PROTECTED] 
wrote:

   On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
Tatsuo Ishii wrote:
  
 So Japanese(including ASCII)/UNICODE behavior is
   perfectly correct
 at this moment.
   
Right, so you _never_ use accented ascii characters in
 Japanese?
(like è for example, whose uppercase is È)
  
   That isn't ASCII.  It's latin1 or some other ASCII extension.
 
  Point taken...
  But...
 
  If you want EUC_JP (Japanese + ASCII) then use that as your
 backend encoding, not UTF-8 (unicode).
  UTF-8 encoded databases are very useful for representing multiple
  languages in the same database, but this usefulness
 vanishes if functions like upper/lower doesn't work correctly.

 I'm just curious if Germany/French/Spanish mixed text can be
 sorted correctly. I think these languages need their own
 locales even with UNICODE/ICU.
No, they will not sort correctly, for that you still need the locale.
I'm confused. I thought the ICU patches is intended for using on
broken locale platforms?
It will sort correctly in *one* locale, using ICU. You still cannot mix 
different locales in the same database cluster, the collation locale is 
still fixed at initdb time, unfortunately.

/Palle

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Tatsuo Ishii

On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
 Tatsuo Ishii wrote:

  So Japanese(including ASCII)/UNICODE behavior is
perfectly correct
  at this moment.
 
 Right, so you _never_ use accented ascii characters in 
  Japanese? 
 (like è for example, whose uppercase is È)

That isn't ASCII.  It's latin1 or some other ASCII extension.
   
   Point taken...
   But...
   
   If you want EUC_JP (Japanese + ASCII) then use that as your 
  backend encoding, not UTF-8 (unicode).
   UTF-8 encoded databases are very useful for representing multiple 
   languages in the same database, but this usefulness 
  vanishes if functions like upper/lower doesn't work correctly.
  
  I'm just curious if Germany/French/Spanish mixed text can be 
  sorted correctly. I think these languages need their own 
  locales even with UNICODE/ICU.
 
 No, they will not sort correctly, for that you still need the locale.

I'm confused. I thought the ICU patches is intended for using on
broken locale platforms?
--
Tatsuo Ishii

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] why two WRITE_NODE_FIELD(whereCluase)?

2005-05-08 Thread Tom Lane

Tatsuo Ishii [EMAIL PROTECTED] writes:
 I see WRITE_NODE_FIELD(whereClause) twice in _outSelectStmt(). Maybe a
 bug?

Yeah, it is, though a pretty minor one since we never read SelectStmt
nodes back in again (they are not used in stored rules, which are
already Querys).  Feel free to fix it in HEAD.  Might want to check
the other fields while at it ...

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Views, views, views! (long)

2005-05-08 Thread Jim C. Nasby

On Fri, May 06, 2005 at 05:44:43PM -0400, Robert Treat wrote:
 Sorry, but I'm still in the admin tools wont use these camp since I don't 
 believe these views can solve an admin tools need to support multiple 
 versioning within its code.   I also don't think it is any harder to learn to 
 query the system tables  than it would be to learn to query these new views 
 (with a few caevets that I will come back to) and it might actually be 
 better.  If I'm building an admin tool, I have to know that tablespaces 

I find it hard to believe that it's easier to write a 30 line query
instead of just selecting out of a single view. But, even if an admin
tool does want to 'go direct to the source' and query the system tables,
ISTM that having a reference implementation (the system views) would be
very valuable.

 aren't supported on some older versions, and I think it is easier to figure 
 this out if my query breaks on tablespace information rather than if my query 
 just silently sends me some special data (NULL?) that I have to interpret to 
 mean not supported.

Well, these views don't prevent you from using version() to know what is
and isn't supported, but if you can think of other means to indicate
what features are and aren't available I'm all ears.

 That said, some admin tools already have a requirment that you install some 
 little piece of schema into your database to support them, they could  
 include this package along with thier software if they felt strongly about 
 it.
 
 The cavet I am thinking about from above is things like the relacl bits of 
 pg_class, which are a total poop to work with.  Adding a couple of new system 
 views to help make that information more transparent would be a good thing. 
 Actually I am thinkinga couple of parts of this stuff could be used as an 
 enhancement to the current system views if people weren't interested in a 
 wholesale replacement. 

It's certainly not decided that these views would replace anything. I'm
in favor of always keeping these views in their own schema so that it's
up to the user to decide what exactly they want to query. If they want
stuff out of the current catalog, then use pg_catalog. If they want
these new views, then use pg_sysviews.

There is the possibility of eventually replacing some of the old system
views, but that would be several versions away, if it were to ever
happen. And of course these would not replace the system tables.

  But yes, the intention is to continue to support backwards compatability
  as much as possible. Currently I believe that compatability stops at
  versions that don't support schemas, though that could change.
 
 I'm curious, are the queries between various versions actually all that 
 different? I can't imagine that you can present a stable interface going back 
 3 versions that is relevant to all three versions that also requires serious 
 query changes between each version.   

I suggest taking a gander at the '_compat' files at
http://lnk.nu/cvs.pgfoundry.org/251/. Basically, features that are new
in 8.0 (ie: tablespaces) have an abstraction layer. The code under that
layer is version specific, but the code above it is generic. So
_compat74.sql creates a bogus pg_tablespace (though now that I'm
thinking about it, we should probably use a different name for that,
such as _pg_tablespace).

Of course, we could certainly go the route of having completely
different view definitions for different versions, but I'm not sure
that's an improvement.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-08 Thread Tom Lane

Heikki Linnakangas [EMAIL PROTECTED] writes:
 Consider the variant with extra marker files. In that case, backend B 
 doesn't have to know about the .notcommitted status to flush the buffers.

[ shrug ]  It's still broken, and the reason is that there's no
equivalent of fsync for directory operations.  Consider

A creates 1234 and 1234.notcommitted

A commits

B performs a checkpoint

crash

all before A manages to delete 1234.notcommitted, or at least before
that deletion has made its way to disk.  Upon restart, only WAL
events after the checkpoint will be replayed, so 1234.notcommitted
doesn't go away, and then you've got a problem.

To fix this there would need to be a way (1) for B to be aware of the
pending file deletion and (2) for B to delay committing the checkpoint
until the directory update is surely down on disk.  Your proposal
doesn't provide for (1), and even if we fixed that, I know of no
portable kernel API for (2).  fsync isn't applicable.

While your original patch is buggy, it's at least fixable and has
localized, limited impact.  I don't think these schemes are safe
at all --- they put a great deal more weight on the semantics of
the filesystem than I care to do.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Will new release require an initdb?

2005-05-08 Thread Tom Lane

Devrim GUNDUZ [EMAIL PROTECTED] writes:
 The question is in the subject line: Will the new dot releases require an 
 initdb, since they will play with system catalogs?

No.  See the release notes --- it's the DBA who has to do the playing.
http://developer.postgresql.org/docs/postgres/release.html

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Tom Lane

Palle Girgensohn [EMAIL PROTECTED] writes:
 I'm confused. I thought the ICU patches is intended for using on
 broken locale platforms?

 It will sort correctly in *one* locale, using ICU. You still cannot mix 
 different locales in the same database cluster, the collation locale is 
 still fixed at initdb time, unfortunately.

I thought the point of using ICU was to be able to dig out from under
that restriction?  It's a bit of a large pill to swallow if we will
still have to throw it away someday to become SQL spec compliant.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Race conditions, race conditions!

2005-05-08 Thread Jim C. Nasby

On Sat, May 07, 2005 at 07:20:48PM -0400, Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  I wonder if there's an argument for building assertion-enabled builds with
  code that randomly yields the processor some percentage of time before and
  after taking a lock. It wouldn't catch every case but it might help.
 
 Seems like that would mainly help you find cases where you'd put a lock
 acquire or release a bit too late or too soon in a sequence of events;
 not cases where you'd failed to acquire a needed lock at all.  It'd be
 more useful I think to have a facility that randomly stops backends for
 awhile regardless of exactly where they are in the code.
 
 A high-load test case actually does this to some extent, but the problem
 is you have little reproducibility and no assurance that execution
 stopped for long enough to let critical events happen elsewhere.  The
 ideal facility I think would slow one backend much more than others,
 whereas high load still leaves them all making progress at about the
 same rate ...

Would setting different priorities/niceness on different backends during
the stress test help? It might not be perfect but it should be trivial
to accomplish...
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Views, views, views! (long)

2005-05-08 Thread Andrew - Supernews

On 2005-05-08, Jim C. Nasby [EMAIL PROTECTED] wrote:
 I suggest taking a gander at the '_compat' files at
 http://lnk.nu/cvs.pgfoundry.org/251/. Basically, features that are new
 in 8.0 (ie: tablespaces) have an abstraction layer. The code under that
 layer is version specific, but the code above it is generic. So
 _compat74.sql creates a bogus pg_tablespace (though now that I'm
 thinking about it, we should probably use a different name for that,
 such as _pg_tablespace).

Remember that this is still an alpha version. In the longer term I think
we should look at splitting it into two schemas, one with the views
themselves and another with the support functions and other implementation
details. (There are other ways to handle pg_tablespace too, that just
happened to be the convenient one for proof-of-concept testing.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Can we get patents?

2005-05-08 Thread Josh Berkus

Alvaro,

 I'm wondering, could the PostgreSQL Foundation (or some other entity)
 get patents on some parts of Postgres?  Maybe ResourceOwners for
 example; or the newer parts of the optimizer.

That depends; is the SFLC offering to pay for the patent applications?  Last I 
checked, it was somewhere around $6000 per patent.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] pl/pgsql enabled by default

2005-05-08 Thread Josh Berkus

Mike,

 I think most people coming from any other enterprise-class RDBMS
 environment will be surprised that they cannot use VIEWs to provide
 user-specific views on data. I could be wrong, but I'd put money on it...

Well, I'd say that giving regular users the create permission on your 
database/schema is unwise, period.   I don't, even when the only user is 
phpuser.  SQL injections attacks are no fun.

Also, as Andrew points out, this can't be used to circumvent view-based 
security if you've set it up correctly; if the user can't select * from 
table, then he can't write a function to select * from table.  

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Andrew Dunstan


Magnus Hagander wrote:
The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2.
That means the size of the distribution would almost *double* if we
bundled ICU.
It's probably fine bundling it in the binary distributions (at least
we'd probably do it on win32, since not many ppl will have it already
there), but bundling the source seems a bit excessive to me.
 

I'm also mildly curious to know what effect using ICU will have on 
memory consumption. Has anyone looked? My suspicion was aroused by this 
library that it installed on my FC3 box:

-rwxr-xr-x  1 root root 9777876 Jan  1 20:18 /usr/lib/libicudata.so.32.0

cheers
andrew
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] pl/pgsql enabled by default

2005-05-08 Thread Andrew Dunstan


Josh Berkus wrote:
Mike,
 

I think most people coming from any other enterprise-class RDBMS
environment will be surprised that they cannot use VIEWs to provide
user-specific views on data. I could be wrong, but I'd put money on it...
   

Well, I'd say that giving regular users the create permission on your 
database/schema is unwise, period.   I don't, even when the only user is 
phpuser.  SQL injections attacks are no fun.

Also, as Andrew points out, this can't be used to circumvent view-based 
security if you've set it up correctly; if the user can't select * from 
table, then he can't write a function to select * from table.  

 

Seems it's a bit more complicated. Kris Jurka has explained to me how 
one might be able to, at least theoretically. Perhaps Mike needs to do 
something like:

 revoke usage on language sql from public;
on his db. Then users could continue to use functions he has defined, 
but not subvert things via their own functions. That will certainly go 
into my list of db hardening tips and tricks.

cheers
andrew

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread Palle Girgensohn

 Palle Girgensohn [EMAIL PROTECTED] writes:
 I'm confused. I thought the ICU patches is intended for using on
 broken locale platforms?

 It will sort correctly in *one* locale, using ICU. You still cannot mix
 different locales in the same database cluster, the collation locale is
 still fixed at initdb time, unfortunately.

 I thought the point of using ICU was to be able to dig out from under
 that restriction?

I think it might be quite possible to mix several locales, using ICU. It's
just that this is not what the patch does at moment. It just finds out the
locale set at initdb and uses it for collation with ICU.

Handling mixed locales for collation has a few hard problems, AFAIK.
First, isn't the main obstacle for mixing collations that indices require
a single well defined locale? I assume that locale dependant comparison
(collation) is used when indexing tuples, right? As long as a specific
locales collation is used for indexing text fields, I believe we cannot
easily mix different locales, right? Second, how do we tell the backend
which locale to use? Is there some SQL spec for this?

 It's a bit of a large pill to swallow if we will still
 have to throw it away someday to become SQL spec compliant.

What do we need to be SQL spec compliant in this respect?

/Palle



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Views, views, views! (long)

2005-05-08 Thread Andrew - Supernews

On 2005-05-05, Peter Eisentraut [EMAIL PROTECTED] wrote:
 I would suggest that you align your terminology with the information 
 schema as much as possible, so it would be type_schema and not 
 type_schema_name, and ordinal_position instead of 
 column_position.  Otherwise we'll have a lot of confusion ahead if we 
 instroduced a third parallel set of terminology.

Personally I'm open to suggestions on this; we didn't entirely agree on
the naming conventions when writing the stuff so far.

 c) In most places, system objects are segregated from
 user objects,  e.g. pg_user_indexes

 I think that is a bad idea as it goes against the fundamental design of 
 PostgreSQL.

In what way? Please elaborate.

 g) All views are as normalized as possible, using child views
 rather than arrays, and providing keys and consistent join columns.

 You still seem to have a bunch of arrays in there.  Anything with an 
 array is never normalized.

There are 6 array columns in there at the moment. One looks pointless and
might get removed (database_config). The others are all intentional and
cover cases where the denormalized view is (a) already easily available
within the query and (b) substantially useful. The normalized versions
are available too in all cases.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen

Tatsuo Ishii wrote:
 Sent: Sunday, May 08, 2005 11:19 PM
 To: John Hansen
 Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us; 
 [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Patch for collation using ICU

 On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote:
  Tatsuo Ishii wrote:

   So Japanese(including ASCII)/UNICODE behavior is
 perfectly correct
   at this moment.

  Right, so you _never_ use accented ascii characters in
   Japanese? 
  (like è for example, whose uppercase is È)

 That isn't ASCII.  It's latin1 or some other ASCII extension.

Point taken...
But...

If you want EUC_JP (Japanese + ASCII) then use that as your
   backend encoding, not UTF-8 (unicode).
UTF-8 encoded databases are very useful for 
 representing multiple 
languages in the same database, but this usefulness
   vanishes if functions like upper/lower doesn't work correctly.

   I'm just curious if Germany/French/Spanish mixed text can 
 be sorted 
   correctly. I think these languages need their own locales 
 even with 
   UNICODE/ICU.

  No, they will not sort correctly, for that you still need 
 the locale.

 I'm confused. I thought the ICU patches is intended for using 
 on broken locale platforms?

Initially yes, but why duplicate code?
What I meant was, that they will not sort correctly using the C locale.
Locale _name_ needs to be known to ICU for it to sort correctly.

 --
 Tatsuo Ishii

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen

Tom Lane wrote:
 Sent: Monday, May 09, 2005 2:47 AM
 To: Palle Girgensohn
 Cc: Tatsuo Ishii; John Hansen; [EMAIL PROTECTED]; 
 pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Patch for collation using ICU 

 Palle Girgensohn [EMAIL PROTECTED] writes:
  I'm confused. I thought the ICU patches is intended for using on 
  broken locale platforms?

  It will sort correctly in *one* locale, using ICU. You still cannot 
  mix different locales in the same database cluster, the collation 
  locale is still fixed at initdb time, unfortunately.

 I thought the point of using ICU was to be able to dig out 
 from under that restriction?  It's a bit of a large pill to 
 swallow if we will still have to throw it away someday to 
 become SQL spec compliant.

That is not a limitation of ICU but of postgresql.
I don't know what the specs say, but imagine something like:
SELECT foo FROM bar ORDER BY foo WITH LOCALE 'en_US', foobar WITH LOCALE
'jp_JP';

Which would be less difficult to implement using ICU.

   regards, tom lane

... John

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen

Tatsuo Ishii wrote:
 Sent: Sunday, May 08, 2005 11:08 PM
 To: John Hansen
 Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
 pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Patch for collation using ICU

   I don't buy it. If current conversion tables does the 
 right thing, 
   why we need to replace. Or if conversion tables are not 
 correct, why 
   don't you fix it? I think the rule of character 
 conversion will not 
   change frequently, especially for LATIN languages. Thus 
 maintaining 
   cost is not too high.

  I never said we need to, but if we're going to implement 
 ICU, then we 
  might as well go all the way.

 So you admit there's no benefit using ICU for replacing 
 existing conversions?

 Besides ICU does not support all existing conversions, I 
 think ICU has serious flaw for using conversion. If I 
 understand correctly, ICU uses UNICODE internally to do the 
 conversion. For example, to implement
 SJIS-EUC_JP conversion, ICU first converts SJIS to UNICODE then
 converts UNICODE to EUC_JP. Problem is these conversion is 
 not roud trip(conversion between SJIS/EUC_JP and UNICODE will 
 lose some information). Thus SJIS-EUC_JP-SJIS conversion 
 using ICU does not preserve original text.

Could you please send me a sample text as an attachment encoded in SJIS
where this would happen?

 --
 Tatsuo Ishii

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Patch for collation using ICU

2005-05-08 Thread John Hansen

 -Original Message-
 From: Tatsuo Ishii [mailto:[EMAIL PROTECTED] 
 Sent: Sunday, May 08, 2005 11:08 PM
 To: John Hansen
 Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; 
 pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Patch for collation using ICU

   I don't buy it. If current conversion tables does the 
 right thing, 
   why we need to replace. Or if conversion tables are not 
 correct, why 
   don't you fix it? I think the rule of character 
 conversion will not 
   change frequently, especially for LATIN languages. Thus 
 maintaining 
   cost is not too high.

  I never said we need to, but if we're going to implement 
 ICU, then we 
  might as well go all the way.

 So you admit there's no benefit using ICU for replacing 
 existing conversions?

 Besides ICU does not support all existing conversions, I 
 think ICU has serious flaw for using conversion. If I 
 understand correctly, ICU uses UNICODE internally to do the 
 conversion. For example, to implement
 SJIS-EUC_JP conversion, ICU first converts SJIS to UNICODE then
 converts UNICODE to EUC_JP. Problem is these conversion is 
 not roud trip(conversion between SJIS/EUC_JP and UNICODE will 
 lose some information). Thus SJIS-EUC_JP-SJIS conversion 
 using ICU does not preserve original text.

Just for the record, I fetched a web page encoded in sjis, and converted
it to euc-jp and back using uconv from ICU 3.2, and the result is the
original is identical to the transformed file.

 uconv -f Shift_JIS -t EUC-JP -o index.html.euc index.html
 uconv -f EUC-JP -t Shift_JIS -o index.html.sjis index.html.euc
 diff index.html index.html.sjis

... John

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

[HACKERS] Oracle Style packages on postgres

2005-05-08 Thread rmm

Oracle Style packages on postgres

OVERVIEW:

To emulate oracle server side development in postgres I required server
side packages.  The following text demonstrates how to do this using
plpython on postgres 8 and suggests a language extension.


WHAT ARE ORACLE PACKAGES?

Looking back over the postgres discussion forums (particulary a discussion
in 2001 following a proposal by Bill Studenmund) there appears to be some
confusion over what oracle packages are.  Here's a concise definition :
   A black box processing engine with one or more public access functions
that retains state across calls
An oracle package is created when first referenced. Its initialization
code is run once (ie costly queries to populate session wide package
params) and the package dies at the end of the session
An analogy with OOP is that it's like having a single class instance
available for the duration of a session.


SOME POWERFUL USES OF PACKAGES:

1. Pipes - oracle dbms_pipe built-in allows asynchronous communication
between any number of producer/consumer database sessions on any number of
pipes

2. Logging - leave all logging/debug statements in code, decision on
logging output can be made when the logging package is initialised (eg by
querying lookup tables for user, on/off, level, and destination).  Combine
logging with pipes and the output can be stored in tables seperate from
the current transaction.  Include timing info down to milliseconds and
live problems/bottlenecks can more easily be identified.

3. Batch reporting - more suited to autonomous transactions than logging
but useful to have the report package store start time, duration,
error/warning count running totals etc. and summarize automatically at
report end.

See the example below on how to implement a version of the oracle
dbms_output package in plpython


EXTENSIONS TO POSTGRES:

Oracle style package creation syntax is split into header and body so that
the body(code) can be re-compiled without invalidating dependent objects.
Postgres syntax for the dbms_output example (in any postgres server side
language) would be along the lines of:
CREATE OR REPLACE PACKAGE HEADER dbms_output AS
   FUNCTION dbms_output_put_line(text) RETURNS text,
   FUNCTION dbms_output_get_lines() RETURNS text;
CREATE OR REPLACE PACKAGE BODY dbms_output AS $$ 'package code' $$
language;

Adding pg_package with a link from pg_proc are the only changes required
to the data dictionary.
It would be nice to have similar dotted syntax as oracle
(user.package.function) but would this mess up postgres namespaces?

The language in which the package was created would process the 'package
code', for example in python:
o  create public functions linking header declaration to package body code
(see dbms_output example)
o  process embedded sql, eg l_curs=select * from dual -
l_curs=self.execute('select * from dual')
o  the extracted sql can be 'prepared' by postgres and syntax exceptions
reported as compilation errors


SUMMARY:
Packages are an important addition to postgres.  Some of the server side
languages have the potential to create them now.  It would be useful to
add a common high level syntax before the various language implementations
start developing their own solutions.


I'm currently testing dbms_pipe on postgres, let me know if anyone is
interested.  I replaced xml-rpc (5 messages/second) by sockets (600x
faster!), and may test corba


Ronnie Mackay


-
-


EXAMPLE :CONVERT ORACLE BUILT-IN PACKAGE DBMS_OUTPUT:

[Oracle syntax is :exec dbms_output.put_line('line1');]

Postgresselect dbms_output_put_line('line 1');
Postgresselect test_call_dbms_output_from_within_plpgsql('line 2
(plpgsql)');
Postgresselect test_call_dbms_output_from_within_plpython('line 3
(plpython)');
Postgresselect dbms_output_put_line('line 4');

Postgresselect dbms_output_get_lines();
--- DBMS_OUTPUT DEMO ---
line 1
line 2 (plpgsql)
line 3 (plpython)
line 4
--- DBMS_OUTPUT DEMO ---

So using current postgres syntax the only difference with oracle is that
dbms_output.put_line('line 1'); becomes
dbms_output_put_line('line 1');
The source code to implement the package body is returned by postgres
function dbms_output()


POSTGRES CREATE STATEMENTS FOR EXAMPLE:
-

CREATE or replace FUNCTION dbms_output_put_line(text) RETURNS text AS $$
from plpython import getPackage
return getPackage(GD, plpy, 'dbms_output').putLine(args[0])
$$ LANGUAGE plpythonu;

CREATE or replace FUNCTION dbms_output_get_lines() RETURNS text AS $$
from plpython import getPackage
return getPackage(GD, plpy, 'dbms_output').getLines()
$$ LANGUAGE plpythonu;

-- package body
CREATE OR REPLACE FUNCTION dbms_output() RETURNS text AS $$
return 
from plpython

Re: [HACKERS] test bed

2005-05-08 Thread Neil Conway

Oleg Bartunov wrote:
I just talked with AMD Russia and they could provide almost
any hw for testing, so I'm looking for test-suite for PostgreSQL.
Do we have sort of official tests ?
I guess the regression tests would be the closest to an official set of 
tests, but they obviously aren't perfect. For performance testing, there 
are various tools -- pgbench, OSDL's DBT tests, OSDB, etc.

-Neil
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

2005-05-08 Thread Tino Wildenhain

Am Sonntag, den 08.05.2005, 14:30 +0900 schrieb Tatsuo Ishii:
...
 Actually I myself thought as you are before. Later I found that it was
 not so good idea. People already have invalid encoded data in their
 precious database and have very hard time to migrate to newer version
 of PostgreSQL because of encoding validation.
...
 Now I need to upgrade to newer PostgreSQL within 1 day. I know I have
 to fix wrong encoded field but it will take more than 1 day. So I
 would like to import the data first then fix wrong encoded field on
 running database since I can reach members by phone or email even with
 wrong encoded address field...

Well, if you are so in a hurry you better not migrate. Postgres is
proud of validating the input and to have no invalid data. So if
you have invalid data, better fix it.

 I saw this kind of situation in the real world and that's why we
 developed the patches.

Why not developing a helper for contrib to help reencoding the
database instead?

Regards
Tino


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

2005-05-08 Thread Madison Kelly

John Hansen wrote:
Tatsuo Ishii wrote:
We have developed patches which relaxes the character 
validation so that PostgreSQL accepts invalid characters. It 
works like this:

That is just plain 100% wrong!!
Under no circumstances should there be invalid data in a database.
And if you're trying to make a database of invalid data, then at 
least encode it using a valid encoding.

In fact, I've proposed strengthening the validation routines for UTF-8.
... John
  Under most circumstances I would agree with you completely. In my 
case though I have to decide between risking a loss of a user's data or 
attempt to store the file name in some manner that would return the same 
name used by the file system.

  The user (or one of his/her users in the case of an admin) may be 
completely unaware of the file name being an invalid unicode name. The 
file itself though may still be quite valid and contain information 
worthy of backing up. I could notify the user/admin that the name is not 
valid but there is no way I could rely on the name being changed. Given 
the choices, I would prefer to attempt to store/use the file name with 
the invalid unicode character than simply ignore the file.

  Is there a way to store the name in raw binary? If so, would this not 
be safe because to postgresql it should no longer matter what data is or 
represents, right? Maybe there is a third option I am not yet concidering?

Madison
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU, The Linux Experience; Back Up
http://tle-bu.thelinuxexperience.com
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

2005-05-08 Thread Madison Kelly

Thank you, I would!
(B
(B  What versions have you tested the patch against? I am sorry but I am
(Bnot too familiar with applying patches against the main program, is
(Bthere documentation on how to apply the patch? Is there a way to roll
(Bthe patch back/remove it? Would I be able to script the installation of
(Bthe patch (I would expect so).
(B
(B  The reason for the last question is that I expect (hope) many people
(Bwill use it and I want to make it as easy as possible for a user to
(Bsimply select or unselect the patch if it works well. If I can script
(Bthe install and removal of this patch then I can do just this and that
(Bwould be wonderful.
(B
(B  Thank you again!
(B
(B  $B$I$&$b(B $B$"$j$,$H$&(B $B$4$6$$$^$9(B! (I hope that is right, my 
(BJapanese is
(Bstill elementary. :) )
(B
(B  Madison
(B
(B
(BTatsuo Ishii wrote:
(B We have developed patches which relaxes the character validation so
(B that PostgreSQL accepts invalid characters. It works like this:
(B 
(B 1) new postgresql.conf item "mbstr_check" added.
(B 2) if mbstr_check = 0 then invalid characters are not accepted
(B(same as current PostgreSQL behavior). This is the default.
(B 3) if mbstr_check = 1 then invalid characters are accepted with
(BWARNING
(B 4) if mbstr_check = 2 then invalid characters are accepted without any
(Bwarnings
(B 5) We have checked PostgreSQL source code if accepting invalid
(Bcharacters makes some troubles. We have found that we need to fix a
(Bplace and the fix is included in the patches.
(B 
(B Madison,
(B If you are interested in the patches, I could send it to you.
(B 
(B Hackers,
(B Do you think the functionality something like above is worth to add to
(B PostgreSQL?
(B --
(B Tatsuo Ishii
(B 
(B 
(BHi all,
(B
(B   I've been chasing down a bug and from what I have learned it may be 
(Bbecause of how postgreSQL (8.0.2 on Fedora Core 4 test 2) handles 
(Binvalid unicode. I've been given some ideas on how to try to catch 
(Binvalid unicode but it seems expensive so I am hoping there is a 
(Bpostgresql way to deal with this problem.
(B
(B   I've run into a problem where a bulk postgres "COPY..." statement is 
(Bdieing because one of the lines contains a file name with an invalid 
(Bunicode character. In nautilus this file has '(invalid encoding)' and 
(Bthe postgres error is 'CONTEXT:  COPY file_info_3, line 228287, column 
(Bfile_name: "Femme Fatal\u.url"'.
(B
(B   To actually look at the file from the shell (bash) shows what appears 
(Bto be a whitespace but when I copy/paste the file name I get the 
(B'\u' you see above.
(B
(B   I could, with the help of the TLUG people, use regex to match for an 
(Binvalid character and skip the file but that is not ideal. The reason is 
(Bthat this is for my backup program and invalid unicode or not, the 
(Bcontents of the file may still be important and I would prefer to have 
(Bit in the database so that it is later copied. I can copy and move the 
(Bfile in the shell so the file isn't apparently in an of itself corrupt.
(B
(B   So then, is there a way I can tell postresql to accept the invalid 
(Bunicode name? Here is a copy of my schema:
(B
(Btle-bu= \d file_info_2
(B   Table "public.file_info_2"
(B Column| Type |Modifiers
(B--+--+-
(B  file_group_name  | text |
(B  file_group_uid   | bigint   | not null
(B  file_mod_time| bigint   | not null
(B  file_name| text | not null
(B  file_parent_dir  | text | not null
(B  file_perm| text | not null
(B  file_size| bigint   | not null
(B  file_type| character varying(2) | not null default 
(B'f'::character varying
(B  file_user_name   | text |
(B  file_user_uid| bigint   | not null
(B  file_backup  | boolean  | not null default true
(B  file_display | boolean  | not null default false
(B  file_restore_display | boolean  | not null default false
(B  file_restore | boolean  | not null default false
(BIndexes:
(B "file_info_2_display_idx" btree (file_type, file_parent_dir, file_name)
(B
(B   'file_name' and 'file_parent_dir' are the columns that could have 
(Bentries with the invalid unicode characters. Maybe I could/should use 
(Bsomething other than 'text'? These columns could contain anything that a 
(Bfile or directory name could be.
(B
(B   Thanks!
(B
(BMadison
(B
(B-- 
(B-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
(BMadison Kelly (Digimer)
(BTLE-BU, The Linux Experience; Back Up
(Bhttp://tle-bu.thelinuxexperience.com

[HACKERS]

2005-05-08 Thread sac

unsubscribe

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem

2005-05-08 Thread Oliver Jowett

Madison Kelly wrote:
  Is there a way to store the name in raw binary? 
Yes: bytea.
-O
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-08 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Heikki Linnakangas [EMAIL PROTECTED] writes:
  Consider the variant with extra marker files. In that case, backend B 
  doesn't have to know about the .notcommitted status to flush the buffers.
 
 [ shrug ]  It's still broken, and the reason is that there's no
 equivalent of fsync for directory operations.  Consider

Traditionally that's because directory operations were always synchronous, and
hence didn't need to be fsynced. I think this is still true, other systems
like qmail's maildir still depend on this, for example.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Oracle Style packages on postgres

2005-05-08 Thread Bob

One simple benefit to packages is just organization of related code.On 5/7/05, [EMAIL PROTECTED] 
[EMAIL PROTECTED] wrote:Oracle Style packages on postgresOVERVIEW:
To emulate oracle server side development in postgres I required serverside packages.The following text demonstrates how to do this usingplpython on postgres 8 and suggests a language extension.WHAT ARE ORACLE PACKAGES?
Looking back over the postgres discussion forums (particulary a discussionin 2001 following a proposal by Bill Studenmund) there appears to be someconfusion over what oracle packages are.Here's a concise definition :
 A black box processing engine with one or more public access functionsthat retains state across callsAn oracle package is created when first referenced. Its initializationcode is run once (ie costly queries to populate session wide package
params) and the package dies at the end of the sessionAn analogy with OOP is that it's like having a single class instanceavailable for the duration of a session.SOME POWERFUL USES OF PACKAGES:
1. Pipes - oracle dbms_pipe built-in allows asynchronous communicationbetween any number of producer/consumer database sessions on any number ofpipes2. Logging - leave all logging/debug statements in code, decision on
logging output can be made when the logging package is initialised (eg byquerying lookup tables for user, on/off, level, and destination).Combinelogging with pipes and the output can be stored in tables seperate from
the current transaction.Include timing info down to milliseconds andlive problems/bottlenecks can more easily be identified.3. Batch reporting - more suited to autonomous transactions than loggingbut useful to have the report package store start time, duration,
error/warning count running totals etc. and summarize automatically atreport end.See the example below on how to implement a version of the oracledbms_output package in plpythonEXTENSIONS TO POSTGRES:
Oracle style package creation syntax is split into header and body so thatthe body(code) can be re-compiled without invalidating dependent objects.Postgres syntax for the dbms_output example (in any postgres server side
language) would be along the lines of:CREATE OR REPLACE PACKAGE HEADER dbms_output AS FUNCTION dbms_output_put_line(text) RETURNS text, FUNCTION dbms_output_get_lines() RETURNS text;CREATE OR REPLACE PACKAGE BODY dbms_output AS $$ 'package code' $$
language;Adding pg_package with a link from pg_proc are the only changes requiredto the data dictionary.It would be nice to have similar dotted syntax as oracle(user.package.function) but would this mess up postgres namespaces?
The language in which the package was created would process the 'packagecode', for example in python:ocreate public functions linking header declaration to package body code(see dbms_output example)
oprocess embedded sql, eg l_curs=select * from dual -l_curs=self.execute('select * from dual')othe extracted sql can be 'prepared' by postgres and syntax exceptionsreported as compilation errors
SUMMARY:Packages are an important addition to postgres.Some of the server sidelanguages have the potential to create them now.It would be useful toadd a common high level syntax before the various language implementations
start developing their own solutions.I'm currently testing dbms_pipe on postgres, let me know if anyone isinterested.I replaced xml-rpc (5 messages/second) by sockets (600xfaster!), and may test corba
Ronnie Mackay--EXAMPLE :CONVERT ORACLE BUILT-IN PACKAGE DBMS_OUTPUT:
[Oracle syntax is :exec dbms_output.put_line('line1');]Postgresselect dbms_output_put_line('line 1');Postgresselect test_call_dbms_output_from_within_plpgsql('line 2(plpgsql)');Postgresselect test_call_dbms_output_from_within_plpython('line 3
(plpython)');Postgresselect dbms_output_put_line('line 4');Postgresselect dbms_output_get_lines();--- DBMS_OUTPUT DEMO ---line 1line 2 (plpgsql)line 3 (plpython)
line 4--- DBMS_OUTPUT DEMO ---So using current postgres syntax the only difference with oracle is thatdbms_output.put_line('line 1'); becomesdbms_output_put_line('line 1');
The source code to implement the package body is returned by postgresfunction dbms_output()POSTGRES CREATE STATEMENTS FOR EXAMPLE:-
CREATE or replace FUNCTION dbms_output_put_line(text) RETURNS text AS $$from plpython import getPackagereturn getPackage(GD, plpy, 'dbms_output').putLine(args[0])$$ LANGUAGE plpythonu;CREATE or replace FUNCTION dbms_output_get_lines() RETURNS text AS $$
from plpython import getPackagereturn getPackage(GD, plpy, 'dbms_output').getLines()$$ LANGUAGE plpythonu;-- package bodyCREATE OR REPLACE FUNCTION dbms_output() RETURNS text AS $$return 
from plpython import PlPythonPackageclass Package(PlPythonPackage): def __init__(self,

Re: [HACKERS] pl/pgsql enabled by default

2005-05-08 Thread Greg Stark

Mike Mascari mascarm@mascari.com writes:

 2. Issue a query like:
 
 SELECT *
 FROM view_of_salaries_based_on_current_user
 WHERE my_side_effect_function_that_inserts_into_a_temp_table(salary, 
 employee);

That's just exactly equivalent to


SELECT * 
  FROM (select * 
  from all_salaries 
 where user = CURRENT_USER
   ) 
 WHERE malicious_function(salary,employee)


Hm. If you incorrectly mark your function as IMMUTABLE even though it has side
effects then the planner may indeed collapse this. Does the planner know it
can't collapse views if the underlying tables aren't accessible to the user?

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files

2005-05-08 Thread Tom Lane

Greg Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 [ shrug ]  It's still broken, and the reason is that there's no
 equivalent of fsync for directory operations.  Consider

 Traditionally that's because directory operations were always
 synchronous, and hence didn't need to be fsynced.

That might be true with respect to the process requesting the directory
operation ... but I think you missed the point entirely.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] pl/pgsql enabled by default

2005-05-08 Thread Tom Lane

Greg Stark [EMAIL PROTECTED] writes:
 Hm. If you incorrectly mark your function as IMMUTABLE even though it
 has side effects then the planner may indeed collapse this. Does the
 planner know it can't collapse views if the underlying tables aren't
 accessible to the user?

There are no cases where function or view collapsing elides permissions
checks (if you have a counterexample please provide it!!).  They could
change the time at which permissions checks are applied, though; which
has the potential for a REVOKE to not disallow execution of already-
planned queries that ideally it should prevent.  I believe that this
risk will be fixed by the planned forcing of replanning after schema
changes.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])

Re: [HACKERS] Oracle Style packages on postgres

2005-05-08 Thread Satoshi Nagayasu

Bob wrote:
(B One simple benefit to packages is just organization of related code.
(B
(BAnd the package-scoped variables or constant values, similar to
(Bthe global variables.
(B
(BIt will be very useful for application programmers
(Bif one variable can be shared from several functions.
(B
(BI needed some tricks when I tried to port such PL/SQL to PL/pgSQL.
(B
(BBob wrote:
(B One simple benefit to packages is just organization of related code.
(B 
(B On 5/7/05, [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]*  [EMAIL PROTECTED] 
(B mailto:[EMAIL PROTECTED] wrote:
(B 
(B Oracle Style packages on postgres
(B 
(B OVERVIEW:
(B 
(B To emulate oracle server side development in postgres I required server
(B side packages.  The following text demonstrates how to do this using
(B plpython on postgres 8 and suggests a language extension.
(B 
(B WHAT ARE ORACLE PACKAGES?
(B 
(B Looking back over the postgres discussion forums (particulary a
(B discussion
(B in 2001 following a proposal by Bill Studenmund) there appears to be
(B some
(B confusion over what oracle packages are.  Here's a concise definition :
(B"A black box processing engine with one or more public access
(B functions
(B that retains state across calls"
(B An oracle package is created when first referenced. Its initialization
(B code is run once (ie costly queries to populate session wide package
(B params) and the package dies at the end of the session
(B An analogy with OOP is that it's like having a single class instance
(B available for the duration of a session.
(B 
(B SOME POWERFUL USES OF PACKAGES:
(B 
(B 1. Pipes - oracle dbms_pipe built-in allows asynchronous communication
(B between any number of producer/consumer database sessions on any
(B number of
(B pipes
(B 
(B 2. Logging - leave all logging/debug statements in code, decision on
(B logging output can be made when the logging package is initialised
(B (eg by
(B querying lookup tables for user, on/off, level, and
(B destination).  Combine
(B logging with pipes and the output can be stored in tables seperate from
(B the current transaction.  Include timing info down to milliseconds and
(B live problems/bottlenecks can more easily be identified.
(B 
(B 3. Batch reporting - more suited to autonomous transactions than logging
(B but useful to have the report package store start time, duration,
(B error/warning count running totals etc. and summarize automatically at
(B report end.
(B 
(B See the example below on how to implement a version of the oracle
(B dbms_output package in plpython
(B 
(B EXTENSIONS TO POSTGRES:
(B 
(B Oracle style package creation syntax is split into header and body
(B so that
(B the body(code) can be re-compiled without invalidating dependent
(B objects.
(B Postgres syntax for the dbms_output example (in any postgres server
(B side
(B language) would be along the lines of:
(B CREATE OR REPLACE PACKAGE HEADER dbms_output AS
(BFUNCTION dbms_output_put_line(text) RETURNS text,
(BFUNCTION dbms_output_get_lines() RETURNS text;
(B CREATE OR REPLACE PACKAGE BODY dbms_output AS $$ 'package code' $$
(B language;
(B 
(B Adding pg_package with a link from pg_proc are the only changes required
(B to the data dictionary.
(B It would be nice to have similar dotted syntax as oracle
(B (user.package.function) but would this mess up postgres namespaces?
(B 
(B The language in which the package was created would process the 'package
(B code', for example in python:
(B o  create public functions linking header declaration to package
(B body code
(B (see dbms_output example)
(B o  process embedded sql, eg l_curs=select * from dual -
(B l_curs=self.execute('select * from dual')
(B o  the extracted sql can be 'prepared' by postgres and syntax exceptions
(B reported as compilation errors
(B 
(B SUMMARY:
(B Packages are an important addition to postgres.  Some of the server side
(B languages have the potential to create them now.  It would be useful to
(B add a common high level syntax before the various language
(B implementations
(B start developing their own solutions.
(B 
(B I'm currently testing dbms_pipe on postgres, let me know if anyone is
(B interested.  I replaced xml-rpc (5 messages/second) by sockets (600x
(B faster!), and may test corba
(B 
(B Ronnie Mackay
(B 
(B 
(B -
(B 
(B -
(B 
(B EXAMPLE :CONVERT ORACLE BUILT-IN PACKAGE DBMS_OUTPUT:
(B 
(B [Oracle syntax is :exec dbms_output.put_line('line1');]
(B

Re: [HACKERS] pl/pgsql enabled by default

2005-05-08 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  Hm. If you incorrectly mark your function as IMMUTABLE even though it
  has side effects then the planner may indeed collapse this. Does the
  planner know it can't collapse views if the underlying tables aren't
  accessible to the user?
 
 There are no cases where function or view collapsing elides permissions
 checks (if you have a counterexample please provide it!!).  

I'm talking about something like this. In guess there isn't a problem after
all but it depends critically on the AND short-circuiting (and the order of
evaluation of the expression not being changed).

db= create view vtest as select * from test where a  1 ;
db= create or replace function f(integer) returns integer as 'begin raise 
notice ''foo %'', $1; return $1; end' language plpgsql;

db= explain select * from vtest where f(a)0;
  QUERY PLAN   
---
 Seq Scan on test  (cost=0.00..27.50 rows=112 width=4)
   Filter: ((a  1) AND (f(a)  0))
(2 rows)


I can't come up with any circumstances where the function will get called
before the a1 clause. If it were indexed then it would be evaluated first
but it would no longer be relevant since the function wouldn't be getting
called.

But it's something to watch out for. If ever it seems like a wise idea to have
the optimizer fiddle with the order of evaluation, say based on the
selectivity or computational expense of the conditions then it could create a
problem.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Oracle Style packages on postgres

2005-05-08 Thread Thomas Hallgren

Satoshi Nagayasu wrote:
(B
(BAn oracle package is created when first referenced. Its initialization
(Bcode is run once (ie costly queries to populate session wide package
(Bparams) and the package dies at the end of the session
(BAn analogy with OOP is that it's like having a single class instance
(Bavailable for the duration of a session.
(B
(BPL/Java has an object called "Session" that does exactly this. It is not
(Bavailable from other languages at present. Are Packages supposed to be
(Bcross-language?
(B
(BRegards,
(BThomas Hallgren
(B
(B
(B---(end of broadcast)---
(BTIP 9: the planner will ignore your desire to choose an index scan if your
(B  joining column's datatypes do not match

45 matches

Mail list logo