Re: [HACKERS] Can we get patents?
On Sun, 8 May 2005, Alvaro Herrera wrote: Hackers, I was reading LWN.net and noticed an article about Eben Moglen's keynote at linux.conf.au. Apparently he advises free software projects to get patents on their best ideas. Eben encouraged free software developers to record their novel inventions and to obtain patents on the best of them. Free legal help can be made available to obtain patents on the best ideas. Until the rules of the game can be changed, we must play the game, and having the right patents available may make all the difference in defending against an attack. http://lwn.net/Articles/133421/ Eben Moglen is the FSF's attorney. I'm wondering, could the PostgreSQL Foundation (or some other entity) get patents on some parts of Postgres? Maybe ResourceOwners for example; or the newer parts of the optimizer. The patents would be freely licensed to everyone (including commercial redistributors and developers/users of competing products), except to patent litigators, or something like that. Individual developers could get their work patent'd, I would imagine ... Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Patch for collation using ICU
Tatsuo Ishii Sent: Sunday, May 08, 2005 3:41 PM To: John Hansen Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU Alvaro Herrera wrote: Sent: Sunday, May 08, 2005 2:49 PM To: John Hansen Cc: Tatsuo Ishii; pgman@candle.pha.pa.us; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: Tatsuo Ishii wrote: So Japanese(including ASCII)/UNICODE behavior is perfectly correct at this moment. Right, so you _never_ use accented ascii characters in Japanese? (like è for example, whose uppercase is È) That isn't ASCII. It's latin1 or some other ASCII extension. Point taken... But... If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding, not UTF-8 (unicode). UTF-8 encoded databases are very useful for representing multiple languages in the same database, but this usefulness vanishes if functions like upper/lower doesn't work correctly. I'm just curious if Germany/French/Spanish mixed text can be sorted correctly. I think these languages need their own locales even with UNICODE/ICU. No, they will not sort correctly, for that you still need the locale. So optimizing for 3 languages breaks more than a hundred, that's doesn't seem fair! That is a compromise I'd be willing to agree on. :) Why don't you add a GUC variable or some such to control the upper/lower behavior? -- Tatsuo Ishii ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Tatsuo Ishii wrote: Sent: Sunday, May 08, 2005 3:31 PM To: John Hansen Cc: [EMAIL PROTECTED]; pgsql-general@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem Tatsuo Ishii wrote: Sent: Sunday, May 08, 2005 12:01 PM To: [EMAIL PROTECTED] Cc: pgsql-general@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem We have developed patches which relaxes the character validation so that PostgreSQL accepts invalid characters. It works like this: That is just plain 100% wrong!! Under no circumstances should there be invalid data in a database. And if you're trying to make a database of invalid data, then at least encode it using a valid encoding. In fact, I've proposed strengthening the validation routines for UTF-8. Actually I myself thought as you are before. Later I found that it was not so good idea. People already have invalid encoded data in their precious database and have very hard time to migrate to newer version of PostgreSQL because of encoding validation. Think about this kind of situation: There is a table t1(member_id integer primary key, member_name text, address text, phone text, email text). I have to reach each member by either adress, phone or email. Unfortunately some of address field have wrong encoded data. In this case I will use phone or email to reach them. Now I need to upgrade to newer PostgreSQL within 1 day. I know I have to fix wrong encoded field but it will take more than 1 day. So I would like to import the data first then fix wrong encoded field on running database since I can reach members by phone or email even with wrong encoded address field... Actually would be very simple, create function isvalidutf8(text) in your preferred language. C source is available from unicode.org. Create function converttoutf8(text) using whatever code is required to transform the _wrong_ encoding (SQL_ASCII - UTF8 for instance) to utf-8. Update table set field=converttoutf8(field) where !isvalidutf8(field); Now sit back and relax while your invalid data is converted to utf-8. When done, pg_dump the database, upgrade, and reload. This should take less than a day. I saw this kind of situation in the real world and that's why we developed the patches. -- Tatsuo Ishii ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Will new release require an initdb?
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, The question is in the subject line: Will the new dot releases require an initdb, since they will play with system catalogs? Regards, - -- Devrim GUNDUZ devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.tdmsoft.com.tr http://www.gunduz.org -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFCfeHutl86P3SPfQ4RAr+2AKDuu/WYJOvtDOVUKnOBjCOGhA4PlQCfd3k3 AE8E7mckCUDT0Wb/qPDFmSk= =vpx2 -END PGP SIGNATURE- ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] pl/pgsql enabled by default
2. Issue a query like: SELECT * FROM view_of_salaries_based_on_current_user WHERE my_side_effect_function_that_inserts_into_a_temp_table(salary, employee); An SRF will guarantee an execution order and work for security purposes, but getting your function with side effects to run early would be rather challenging. Somehow you would need to make the planner think the function is fairly selective without it actually being that way and the only way that I'm aware of to get the planner to consider a functions selectivity is to index it. But how do you index the function without knowing all of the values? -- ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Patch for collation using ICU
Is this patch ready for application? http://people.freebsd.org/~girgen/postgresql-icu/pg-802-icu-200 5-05-06.d iff.gz The web site is: http://people.freebsd.org/~girgen/postgresql-icu/readme.html I don't think so, not quite. I have not had any positive reports from linux users, this is only tested in a FreeBSD environment. I'd say it needs some more testing. I've just finished some simple testing on win32, and it does seem to work fine there as well, with a few modifications to the build step. As I don't have a working autoconf, I applied the stuff that would come from your configure.in changes directly to the files. Meaning putting USE_ICU in pg_config.h, and the following changes to Makefile.global: 1) Add the directory for the ICU include files to CPPFLAGS 2) Add -licuuc -licuin to LIBS. I notice these are different names from those used on Unix, so a different configure test will be needed there. 3) Add the icu lib directory to LDFLAGS 4) Remove encodings PG_WIN_1258/PG_TCVN and IBM866/PG_ALT. Didn't investigate further why this was needed, but this probably has something to do with my tests being off CVS tip vs the patch being for 8.0-stable. After doing this, I can properly get upper and lowercase for the swedish characters åäö/ÅÄÖ, as well as the ß-SS conversion. That's all I really tested at this point, but it did *not* work before in a unicode/UTF8 database. I've been working off the ICU 3.2 binaries available from the ICU page, the version compiled with MSVC 6 (because it uses the same runtime DLLs as stuff compiled with mingw. It should work with the MSVC7 version as well, but that would introduce additional DLL dependencies). //Magnus ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] why two WRITE_NODE_FIELD(whereCluase)?
I see WRITE_NODE_FIELD(whereClause) twice in _outSelectStmt(). Maybe a bug? -- Tatsuo Ishii ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Patch for collation using ICU
The 3.2 vs 2.8 business is disturbing also; specifically, I don't think we get to require 3.2 on a platform where 2.8 is installed. There seems to be nothing in the ICU licence that would prevent us from bundling it. This would solve both the 3.2 vs 2.8 problems, and would remove the 'dependency'. People just aren't going to hold still for that, even assuming that ICU supports installing both versions at once, which isn't clear to me at the moment ... There's no problems with having both installed. ... unless you're on win32, it seems. For some reason, they name their libs with the version on unix (libicu18n.so.32), but not on win32 where they all have the same name. And they don't stuff versioning information in the DLL files. That can be lived with as long as libpq doesn't depend on it, though - you can just stick the DLL in the same directory as the EXE, which is also what the ICU people recommend in their docs. Unnecessarily ugly, but it works. I did that on debian to get the patch going. Tho, bundling it seems cleaner to me. The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2. That means the size of the distribution would almost *double* if we bundled ICU. It's probably fine bundling it in the binary distributions (at least we'd probably do it on win32, since not many ppl will have it already there), but bundling the source seems a bit excessive to me. //Magnus ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Patch for collation using ICU
The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2. That means the size of the distribution would almost *double* if we bundled ICU. Ermm,. Don't forget to remove the current charset conversions and locale support before making your size estimation. It's probably fine bundling it in the binary distributions (at least we'd probably do it on win32, since not many ppl will have it already there), but bundling the source seems a bit excessive to me. //Magnus ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files
On Sat, 7 May 2005, Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: Maybe we should take a different approach to the problem: 1. Create new file with an extension to mark that it's not yet committed (eg. 1234.notcommitted) This is pushing the problem into the wrong place, viz the lowest-level file access routines, which will now all have to know about .notcommitted status. It also creates race conditions --- think about backend A trying to commit file 1234 at about the same time that backend B is trying to flush some dirty buffers belonging to that file. True. With the rename variant, it might indeed get messy. Consider the variant with extra marker files. In that case, backend B doesn't have to know about the .notcommitted status to flush the buffers. But most importantly, it doesn't handle the file-deletion case. File-deletions are easy to handle. Just write the list of pending deletions to WAL on commit. To recap, we have 2 slightly different scenarios: a) Delete a file, write commit record, crash b) Create a file, crash Just WAL logging the deletions on commit would take care of A. The .notcommitted mechanism would take care of B. - Heikki ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Patch for collation using ICU
I don't buy it. If current conversion tables does the right thing, why we need to replace. Or if conversion tables are not correct, why don't you fix it? I think the rule of character conversion will not change frequently, especially for LATIN languages. Thus maintaining cost is not too high. I never said we need to, but if we're going to implement ICU, then we might as well go all the way. So you admit there's no benefit using ICU for replacing existing conversions? Besides ICU does not support all existing conversions, I think ICU has serious flaw for using conversion. If I understand correctly, ICU uses UNICODE internally to do the conversion. For example, to implement SJIS-EUC_JP conversion, ICU first converts SJIS to UNICODE then converts UNICODE to EUC_JP. Problem is these conversion is not roud trip(conversion between SJIS/EUC_JP and UNICODE will lose some information). Thus SJIS-EUC_JP-SJIS conversion using ICU does not preserve original text. -- Tatsuo Ishii ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Patch for collation using ICU
--On söndag, maj 08, 2005 22.19.25 +0900 Tatsuo Ishii [EMAIL PROTECTED] wrote: On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: Tatsuo Ishii wrote: So Japanese(including ASCII)/UNICODE behavior is perfectly correct at this moment. Right, so you _never_ use accented ascii characters in Japanese? (like è for example, whose uppercase is È) That isn't ASCII. It's latin1 or some other ASCII extension. Point taken... But... If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding, not UTF-8 (unicode). UTF-8 encoded databases are very useful for representing multiple languages in the same database, but this usefulness vanishes if functions like upper/lower doesn't work correctly. I'm just curious if Germany/French/Spanish mixed text can be sorted correctly. I think these languages need their own locales even with UNICODE/ICU. No, they will not sort correctly, for that you still need the locale. I'm confused. I thought the ICU patches is intended for using on broken locale platforms? It will sort correctly in *one* locale, using ICU. You still cannot mix different locales in the same database cluster, the collation locale is still fixed at initdb time, unfortunately. /Palle ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Patch for collation using ICU
On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: Tatsuo Ishii wrote: So Japanese(including ASCII)/UNICODE behavior is perfectly correct at this moment. Right, so you _never_ use accented ascii characters in Japanese? (like è for example, whose uppercase is È) That isn't ASCII. It's latin1 or some other ASCII extension. Point taken... But... If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding, not UTF-8 (unicode). UTF-8 encoded databases are very useful for representing multiple languages in the same database, but this usefulness vanishes if functions like upper/lower doesn't work correctly. I'm just curious if Germany/French/Spanish mixed text can be sorted correctly. I think these languages need their own locales even with UNICODE/ICU. No, they will not sort correctly, for that you still need the locale. I'm confused. I thought the ICU patches is intended for using on broken locale platforms? -- Tatsuo Ishii ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] why two WRITE_NODE_FIELD(whereCluase)?
Tatsuo Ishii [EMAIL PROTECTED] writes: I see WRITE_NODE_FIELD(whereClause) twice in _outSelectStmt(). Maybe a bug? Yeah, it is, though a pretty minor one since we never read SelectStmt nodes back in again (they are not used in stored rules, which are already Querys). Feel free to fix it in HEAD. Might want to check the other fields while at it ... regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Views, views, views! (long)
On Fri, May 06, 2005 at 05:44:43PM -0400, Robert Treat wrote: Sorry, but I'm still in the admin tools wont use these camp since I don't believe these views can solve an admin tools need to support multiple versioning within its code. I also don't think it is any harder to learn to query the system tables than it would be to learn to query these new views (with a few caevets that I will come back to) and it might actually be better. If I'm building an admin tool, I have to know that tablespaces I find it hard to believe that it's easier to write a 30 line query instead of just selecting out of a single view. But, even if an admin tool does want to 'go direct to the source' and query the system tables, ISTM that having a reference implementation (the system views) would be very valuable. aren't supported on some older versions, and I think it is easier to figure this out if my query breaks on tablespace information rather than if my query just silently sends me some special data (NULL?) that I have to interpret to mean not supported. Well, these views don't prevent you from using version() to know what is and isn't supported, but if you can think of other means to indicate what features are and aren't available I'm all ears. That said, some admin tools already have a requirment that you install some little piece of schema into your database to support them, they could include this package along with thier software if they felt strongly about it. The cavet I am thinking about from above is things like the relacl bits of pg_class, which are a total poop to work with. Adding a couple of new system views to help make that information more transparent would be a good thing. Actually I am thinkinga couple of parts of this stuff could be used as an enhancement to the current system views if people weren't interested in a wholesale replacement. It's certainly not decided that these views would replace anything. I'm in favor of always keeping these views in their own schema so that it's up to the user to decide what exactly they want to query. If they want stuff out of the current catalog, then use pg_catalog. If they want these new views, then use pg_sysviews. There is the possibility of eventually replacing some of the old system views, but that would be several versions away, if it were to ever happen. And of course these would not replace the system tables. But yes, the intention is to continue to support backwards compatability as much as possible. Currently I believe that compatability stops at versions that don't support schemas, though that could change. I'm curious, are the queries between various versions actually all that different? I can't imagine that you can present a stable interface going back 3 versions that is relevant to all three versions that also requires serious query changes between each version. I suggest taking a gander at the '_compat' files at http://lnk.nu/cvs.pgfoundry.org/251/. Basically, features that are new in 8.0 (ie: tablespaces) have an abstraction layer. The code under that layer is version specific, but the code above it is generic. So _compat74.sql creates a bogus pg_tablespace (though now that I'm thinking about it, we should probably use a different name for that, such as _pg_tablespace). Of course, we could certainly go the route of having completely different view definitions for different versions, but I'm not sure that's an improvement. -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: Where do you want to go today? Linux: Where do you want to go tomorrow? FreeBSD: Are you guys coming, or what? ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files
Heikki Linnakangas [EMAIL PROTECTED] writes: Consider the variant with extra marker files. In that case, backend B doesn't have to know about the .notcommitted status to flush the buffers. [ shrug ] It's still broken, and the reason is that there's no equivalent of fsync for directory operations. Consider A creates 1234 and 1234.notcommitted A commits B performs a checkpoint crash all before A manages to delete 1234.notcommitted, or at least before that deletion has made its way to disk. Upon restart, only WAL events after the checkpoint will be replayed, so 1234.notcommitted doesn't go away, and then you've got a problem. To fix this there would need to be a way (1) for B to be aware of the pending file deletion and (2) for B to delay committing the checkpoint until the directory update is surely down on disk. Your proposal doesn't provide for (1), and even if we fixed that, I know of no portable kernel API for (2). fsync isn't applicable. While your original patch is buggy, it's at least fixable and has localized, limited impact. I don't think these schemes are safe at all --- they put a great deal more weight on the semantics of the filesystem than I care to do. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Will new release require an initdb?
Devrim GUNDUZ [EMAIL PROTECTED] writes: The question is in the subject line: Will the new dot releases require an initdb, since they will play with system catalogs? No. See the release notes --- it's the DBA who has to do the playing. http://developer.postgresql.org/docs/postgres/release.html regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Patch for collation using ICU
Palle Girgensohn [EMAIL PROTECTED] writes: I'm confused. I thought the ICU patches is intended for using on broken locale platforms? It will sort correctly in *one* locale, using ICU. You still cannot mix different locales in the same database cluster, the collation locale is still fixed at initdb time, unfortunately. I thought the point of using ICU was to be able to dig out from under that restriction? It's a bit of a large pill to swallow if we will still have to throw it away someday to become SQL spec compliant. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Race conditions, race conditions!
On Sat, May 07, 2005 at 07:20:48PM -0400, Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: I wonder if there's an argument for building assertion-enabled builds with code that randomly yields the processor some percentage of time before and after taking a lock. It wouldn't catch every case but it might help. Seems like that would mainly help you find cases where you'd put a lock acquire or release a bit too late or too soon in a sequence of events; not cases where you'd failed to acquire a needed lock at all. It'd be more useful I think to have a facility that randomly stops backends for awhile regardless of exactly where they are in the code. A high-load test case actually does this to some extent, but the problem is you have little reproducibility and no assurance that execution stopped for long enough to let critical events happen elsewhere. The ideal facility I think would slow one backend much more than others, whereas high load still leaves them all making progress at about the same rate ... Would setting different priorities/niceness on different backends during the stress test help? It might not be perfect but it should be trivial to accomplish... -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: Where do you want to go today? Linux: Where do you want to go tomorrow? FreeBSD: Are you guys coming, or what? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Views, views, views! (long)
On 2005-05-08, Jim C. Nasby [EMAIL PROTECTED] wrote: I suggest taking a gander at the '_compat' files at http://lnk.nu/cvs.pgfoundry.org/251/. Basically, features that are new in 8.0 (ie: tablespaces) have an abstraction layer. The code under that layer is version specific, but the code above it is generic. So _compat74.sql creates a bogus pg_tablespace (though now that I'm thinking about it, we should probably use a different name for that, such as _pg_tablespace). Remember that this is still an alpha version. In the longer term I think we should look at splitting it into two schemas, one with the views themselves and another with the support functions and other implementation details. (There are other ways to handle pg_tablespace too, that just happened to be the convenient one for proof-of-concept testing.) -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Can we get patents?
Alvaro, I'm wondering, could the PostgreSQL Foundation (or some other entity) get patents on some parts of Postgres? Maybe ResourceOwners for example; or the newer parts of the optimizer. That depends; is the SFLC offering to pay for the patent applications? Last I checked, it was somewhere around $6000 per patent. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] pl/pgsql enabled by default
Mike, I think most people coming from any other enterprise-class RDBMS environment will be surprised that they cannot use VIEWs to provide user-specific views on data. I could be wrong, but I'd put money on it... Well, I'd say that giving regular users the create permission on your database/schema is unwise, period. I don't, even when the only user is phpuser. SQL injections attacks are no fun. Also, as Andrew points out, this can't be used to circumvent view-based security if you've set it up correctly; if the user can't select * from table, then he can't write a function to select * from table. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Patch for collation using ICU
Magnus Hagander wrote: The source for ICU 3.2 is 9.8Mb in .tar.gz. PostgreSQL 8.0.2 is 13.2. That means the size of the distribution would almost *double* if we bundled ICU. It's probably fine bundling it in the binary distributions (at least we'd probably do it on win32, since not many ppl will have it already there), but bundling the source seems a bit excessive to me. I'm also mildly curious to know what effect using ICU will have on memory consumption. Has anyone looked? My suspicion was aroused by this library that it installed on my FC3 box: -rwxr-xr-x 1 root root 9777876 Jan 1 20:18 /usr/lib/libicudata.so.32.0 cheers andrew ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pl/pgsql enabled by default
Josh Berkus wrote: Mike, I think most people coming from any other enterprise-class RDBMS environment will be surprised that they cannot use VIEWs to provide user-specific views on data. I could be wrong, but I'd put money on it... Well, I'd say that giving regular users the create permission on your database/schema is unwise, period. I don't, even when the only user is phpuser. SQL injections attacks are no fun. Also, as Andrew points out, this can't be used to circumvent view-based security if you've set it up correctly; if the user can't select * from table, then he can't write a function to select * from table. Seems it's a bit more complicated. Kris Jurka has explained to me how one might be able to, at least theoretically. Perhaps Mike needs to do something like: revoke usage on language sql from public; on his db. Then users could continue to use functions he has defined, but not subvert things via their own functions. That will certainly go into my list of db hardening tips and tricks. cheers andrew ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Patch for collation using ICU
Palle Girgensohn [EMAIL PROTECTED] writes: I'm confused. I thought the ICU patches is intended for using on broken locale platforms? It will sort correctly in *one* locale, using ICU. You still cannot mix different locales in the same database cluster, the collation locale is still fixed at initdb time, unfortunately. I thought the point of using ICU was to be able to dig out from under that restriction? I think it might be quite possible to mix several locales, using ICU. It's just that this is not what the patch does at moment. It just finds out the locale set at initdb and uses it for collation with ICU. Handling mixed locales for collation has a few hard problems, AFAIK. First, isn't the main obstacle for mixing collations that indices require a single well defined locale? I assume that locale dependant comparison (collation) is used when indexing tuples, right? As long as a specific locales collation is used for indexing text fields, I believe we cannot easily mix different locales, right? Second, how do we tell the backend which locale to use? Is there some SQL spec for this? It's a bit of a large pill to swallow if we will still have to throw it away someday to become SQL spec compliant. What do we need to be SQL spec compliant in this respect? /Palle ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Views, views, views! (long)
On 2005-05-05, Peter Eisentraut [EMAIL PROTECTED] wrote: I would suggest that you align your terminology with the information schema as much as possible, so it would be type_schema and not type_schema_name, and ordinal_position instead of column_position. Otherwise we'll have a lot of confusion ahead if we instroduced a third parallel set of terminology. Personally I'm open to suggestions on this; we didn't entirely agree on the naming conventions when writing the stuff so far. c) In most places, system objects are segregated from user objects, e.g. pg_user_indexes I think that is a bad idea as it goes against the fundamental design of PostgreSQL. In what way? Please elaborate. g) All views are as normalized as possible, using child views rather than arrays, and providing keys and consistent join columns. You still seem to have a bunch of arrays in there. Anything with an array is never normalized. There are 6 array columns in there at the moment. One looks pointless and might get removed (database_config). The others are all intentional and cover cases where the denormalized view is (a) already easily available within the query and (b) substantially useful. The normalized versions are available too in all cases. -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Patch for collation using ICU
Tatsuo Ishii wrote: Sent: Sunday, May 08, 2005 11:19 PM To: John Hansen Cc: [EMAIL PROTECTED]; pgman@candle.pha.pa.us; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU On Sun, May 08, 2005 at 02:07:29PM +1000, John Hansen wrote: Tatsuo Ishii wrote: So Japanese(including ASCII)/UNICODE behavior is perfectly correct at this moment. Right, so you _never_ use accented ascii characters in Japanese? (like è for example, whose uppercase is È) That isn't ASCII. It's latin1 or some other ASCII extension. Point taken... But... If you want EUC_JP (Japanese + ASCII) then use that as your backend encoding, not UTF-8 (unicode). UTF-8 encoded databases are very useful for representing multiple languages in the same database, but this usefulness vanishes if functions like upper/lower doesn't work correctly. I'm just curious if Germany/French/Spanish mixed text can be sorted correctly. I think these languages need their own locales even with UNICODE/ICU. No, they will not sort correctly, for that you still need the locale. I'm confused. I thought the ICU patches is intended for using on broken locale platforms? Initially yes, but why duplicate code? What I meant was, that they will not sort correctly using the C locale. Locale _name_ needs to be known to ICU for it to sort correctly. -- Tatsuo Ishii ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Patch for collation using ICU
Tom Lane wrote: Sent: Monday, May 09, 2005 2:47 AM To: Palle Girgensohn Cc: Tatsuo Ishii; John Hansen; [EMAIL PROTECTED]; pgman@candle.pha.pa.us; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU Palle Girgensohn [EMAIL PROTECTED] writes: I'm confused. I thought the ICU patches is intended for using on broken locale platforms? It will sort correctly in *one* locale, using ICU. You still cannot mix different locales in the same database cluster, the collation locale is still fixed at initdb time, unfortunately. I thought the point of using ICU was to be able to dig out from under that restriction? It's a bit of a large pill to swallow if we will still have to throw it away someday to become SQL spec compliant. That is not a limitation of ICU but of postgresql. I don't know what the specs say, but imagine something like: SELECT foo FROM bar ORDER BY foo WITH LOCALE 'en_US', foobar WITH LOCALE 'jp_JP'; Which would be less difficult to implement using ICU. regards, tom lane ... John ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Patch for collation using ICU
Tatsuo Ishii wrote: Sent: Sunday, May 08, 2005 11:08 PM To: John Hansen Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU I don't buy it. If current conversion tables does the right thing, why we need to replace. Or if conversion tables are not correct, why don't you fix it? I think the rule of character conversion will not change frequently, especially for LATIN languages. Thus maintaining cost is not too high. I never said we need to, but if we're going to implement ICU, then we might as well go all the way. So you admit there's no benefit using ICU for replacing existing conversions? Besides ICU does not support all existing conversions, I think ICU has serious flaw for using conversion. If I understand correctly, ICU uses UNICODE internally to do the conversion. For example, to implement SJIS-EUC_JP conversion, ICU first converts SJIS to UNICODE then converts UNICODE to EUC_JP. Problem is these conversion is not roud trip(conversion between SJIS/EUC_JP and UNICODE will lose some information). Thus SJIS-EUC_JP-SJIS conversion using ICU does not preserve original text. Could you please send me a sample text as an attachment encoded in SJIS where this would happen? -- Tatsuo Ishii ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Patch for collation using ICU
-Original Message- From: Tatsuo Ishii [mailto:[EMAIL PROTECTED] Sent: Sunday, May 08, 2005 11:08 PM To: John Hansen Cc: pgman@candle.pha.pa.us; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Patch for collation using ICU I don't buy it. If current conversion tables does the right thing, why we need to replace. Or if conversion tables are not correct, why don't you fix it? I think the rule of character conversion will not change frequently, especially for LATIN languages. Thus maintaining cost is not too high. I never said we need to, but if we're going to implement ICU, then we might as well go all the way. So you admit there's no benefit using ICU for replacing existing conversions? Besides ICU does not support all existing conversions, I think ICU has serious flaw for using conversion. If I understand correctly, ICU uses UNICODE internally to do the conversion. For example, to implement SJIS-EUC_JP conversion, ICU first converts SJIS to UNICODE then converts UNICODE to EUC_JP. Problem is these conversion is not roud trip(conversion between SJIS/EUC_JP and UNICODE will lose some information). Thus SJIS-EUC_JP-SJIS conversion using ICU does not preserve original text. Just for the record, I fetched a web page encoded in sjis, and converted it to euc-jp and back using uconv from ICU 3.2, and the result is the original is identical to the transformed file. uconv -f Shift_JIS -t EUC-JP -o index.html.euc index.html uconv -f EUC-JP -t Shift_JIS -o index.html.sjis index.html.euc diff index.html index.html.sjis ... John ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Oracle Style packages on postgres
Oracle Style packages on postgres OVERVIEW: To emulate oracle server side development in postgres I required server side packages. The following text demonstrates how to do this using plpython on postgres 8 and suggests a language extension. WHAT ARE ORACLE PACKAGES? Looking back over the postgres discussion forums (particulary a discussion in 2001 following a proposal by Bill Studenmund) there appears to be some confusion over what oracle packages are. Here's a concise definition : A black box processing engine with one or more public access functions that retains state across calls An oracle package is created when first referenced. Its initialization code is run once (ie costly queries to populate session wide package params) and the package dies at the end of the session An analogy with OOP is that it's like having a single class instance available for the duration of a session. SOME POWERFUL USES OF PACKAGES: 1. Pipes - oracle dbms_pipe built-in allows asynchronous communication between any number of producer/consumer database sessions on any number of pipes 2. Logging - leave all logging/debug statements in code, decision on logging output can be made when the logging package is initialised (eg by querying lookup tables for user, on/off, level, and destination). Combine logging with pipes and the output can be stored in tables seperate from the current transaction. Include timing info down to milliseconds and live problems/bottlenecks can more easily be identified. 3. Batch reporting - more suited to autonomous transactions than logging but useful to have the report package store start time, duration, error/warning count running totals etc. and summarize automatically at report end. See the example below on how to implement a version of the oracle dbms_output package in plpython EXTENSIONS TO POSTGRES: Oracle style package creation syntax is split into header and body so that the body(code) can be re-compiled without invalidating dependent objects. Postgres syntax for the dbms_output example (in any postgres server side language) would be along the lines of: CREATE OR REPLACE PACKAGE HEADER dbms_output AS FUNCTION dbms_output_put_line(text) RETURNS text, FUNCTION dbms_output_get_lines() RETURNS text; CREATE OR REPLACE PACKAGE BODY dbms_output AS $$ 'package code' $$ language; Adding pg_package with a link from pg_proc are the only changes required to the data dictionary. It would be nice to have similar dotted syntax as oracle (user.package.function) but would this mess up postgres namespaces? The language in which the package was created would process the 'package code', for example in python: o create public functions linking header declaration to package body code (see dbms_output example) o process embedded sql, eg l_curs=select * from dual - l_curs=self.execute('select * from dual') o the extracted sql can be 'prepared' by postgres and syntax exceptions reported as compilation errors SUMMARY: Packages are an important addition to postgres. Some of the server side languages have the potential to create them now. It would be useful to add a common high level syntax before the various language implementations start developing their own solutions. I'm currently testing dbms_pipe on postgres, let me know if anyone is interested. I replaced xml-rpc (5 messages/second) by sockets (600x faster!), and may test corba Ronnie Mackay - - EXAMPLE :CONVERT ORACLE BUILT-IN PACKAGE DBMS_OUTPUT: [Oracle syntax is :exec dbms_output.put_line('line1');] Postgresselect dbms_output_put_line('line 1'); Postgresselect test_call_dbms_output_from_within_plpgsql('line 2 (plpgsql)'); Postgresselect test_call_dbms_output_from_within_plpython('line 3 (plpython)'); Postgresselect dbms_output_put_line('line 4'); Postgresselect dbms_output_get_lines(); --- DBMS_OUTPUT DEMO --- line 1 line 2 (plpgsql) line 3 (plpython) line 4 --- DBMS_OUTPUT DEMO --- So using current postgres syntax the only difference with oracle is that dbms_output.put_line('line 1'); becomes dbms_output_put_line('line 1'); The source code to implement the package body is returned by postgres function dbms_output() POSTGRES CREATE STATEMENTS FOR EXAMPLE: - CREATE or replace FUNCTION dbms_output_put_line(text) RETURNS text AS $$ from plpython import getPackage return getPackage(GD, plpy, 'dbms_output').putLine(args[0]) $$ LANGUAGE plpythonu; CREATE or replace FUNCTION dbms_output_get_lines() RETURNS text AS $$ from plpython import getPackage return getPackage(GD, plpy, 'dbms_output').getLines() $$ LANGUAGE plpythonu; -- package body CREATE OR REPLACE FUNCTION dbms_output() RETURNS text AS $$ return from plpython
Re: [HACKERS] test bed
Oleg Bartunov wrote: I just talked with AMD Russia and they could provide almost any hw for testing, so I'm looking for test-suite for PostgreSQL. Do we have sort of official tests ? I guess the regression tests would be the closest to an official set of tests, but they obviously aren't perfect. For performance testing, there are various tools -- pgbench, OSDL's DBT tests, OSDB, etc. -Neil ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Am Sonntag, den 08.05.2005, 14:30 +0900 schrieb Tatsuo Ishii: ... Actually I myself thought as you are before. Later I found that it was not so good idea. People already have invalid encoded data in their precious database and have very hard time to migrate to newer version of PostgreSQL because of encoding validation. ... Now I need to upgrade to newer PostgreSQL within 1 day. I know I have to fix wrong encoded field but it will take more than 1 day. So I would like to import the data first then fix wrong encoded field on running database since I can reach members by phone or email even with wrong encoded address field... Well, if you are so in a hurry you better not migrate. Postgres is proud of validating the input and to have no invalid data. So if you have invalid data, better fix it. I saw this kind of situation in the real world and that's why we developed the patches. Why not developing a helper for contrib to help reencoding the database instead? Regards Tino ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
John Hansen wrote: Tatsuo Ishii wrote: We have developed patches which relaxes the character validation so that PostgreSQL accepts invalid characters. It works like this: That is just plain 100% wrong!! Under no circumstances should there be invalid data in a database. And if you're trying to make a database of invalid data, then at least encode it using a valid encoding. In fact, I've proposed strengthening the validation routines for UTF-8. ... John Under most circumstances I would agree with you completely. In my case though I have to decide between risking a loss of a user's data or attempt to store the file name in some manner that would return the same name used by the file system. The user (or one of his/her users in the case of an admin) may be completely unaware of the file name being an invalid unicode name. The file itself though may still be quite valid and contain information worthy of backing up. I could notify the user/admin that the name is not valid but there is no way I could rely on the name being changed. Given the choices, I would prefer to attempt to store/use the file name with the invalid unicode character than simply ignore the file. Is there a way to store the name in raw binary? If so, would this not be safe because to postgresql it should no longer matter what data is or represents, right? Maybe there is a third option I am not yet concidering? Madison -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Madison Kelly (Digimer) TLE-BU, The Linux Experience; Back Up http://tle-bu.thelinuxexperience.com -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Thank you, I would! (B (B What versions have you tested the patch against? I am sorry but I am (Bnot too familiar with applying patches against the main program, is (Bthere documentation on how to apply the patch? Is there a way to roll (Bthe patch back/remove it? Would I be able to script the installation of (Bthe patch (I would expect so). (B (B The reason for the last question is that I expect (hope) many people (Bwill use it and I want to make it as easy as possible for a user to (Bsimply select or unselect the patch if it works well. If I can script (Bthe install and removal of this patch then I can do just this and that (Bwould be wonderful. (B (B Thank you again! (B (B $B$I$&$b(B $B$"$j$,$H$&(B $B$4$6$$$^$9(B! (I hope that is right, my (BJapanese is (Bstill elementary. :) ) (B (B Madison (B (B (BTatsuo Ishii wrote: (B We have developed patches which relaxes the character validation so (B that PostgreSQL accepts invalid characters. It works like this: (B (B 1) new postgresql.conf item "mbstr_check" added. (B 2) if mbstr_check = 0 then invalid characters are not accepted (B(same as current PostgreSQL behavior). This is the default. (B 3) if mbstr_check = 1 then invalid characters are accepted with (BWARNING (B 4) if mbstr_check = 2 then invalid characters are accepted without any (Bwarnings (B 5) We have checked PostgreSQL source code if accepting invalid (Bcharacters makes some troubles. We have found that we need to fix a (Bplace and the fix is included in the patches. (B (B Madison, (B If you are interested in the patches, I could send it to you. (B (B Hackers, (B Do you think the functionality something like above is worth to add to (B PostgreSQL? (B -- (B Tatsuo Ishii (B (B (BHi all, (B (B I've been chasing down a bug and from what I have learned it may be (Bbecause of how postgreSQL (8.0.2 on Fedora Core 4 test 2) handles (Binvalid unicode. I've been given some ideas on how to try to catch (Binvalid unicode but it seems expensive so I am hoping there is a (Bpostgresql way to deal with this problem. (B (B I've run into a problem where a bulk postgres "COPY..." statement is (Bdieing because one of the lines contains a file name with an invalid (Bunicode character. In nautilus this file has '(invalid encoding)' and (Bthe postgres error is 'CONTEXT: COPY file_info_3, line 228287, column (Bfile_name: "Femme Fatal\u.url"'. (B (B To actually look at the file from the shell (bash) shows what appears (Bto be a whitespace but when I copy/paste the file name I get the (B'\u' you see above. (B (B I could, with the help of the TLUG people, use regex to match for an (Binvalid character and skip the file but that is not ideal. The reason is (Bthat this is for my backup program and invalid unicode or not, the (Bcontents of the file may still be important and I would prefer to have (Bit in the database so that it is later copied. I can copy and move the (Bfile in the shell so the file isn't apparently in an of itself corrupt. (B (B So then, is there a way I can tell postresql to accept the invalid (Bunicode name? Here is a copy of my schema: (B (Btle-bu= \d file_info_2 (B Table "public.file_info_2" (B Column| Type |Modifiers (B--+--+- (B file_group_name | text | (B file_group_uid | bigint | not null (B file_mod_time| bigint | not null (B file_name| text | not null (B file_parent_dir | text | not null (B file_perm| text | not null (B file_size| bigint | not null (B file_type| character varying(2) | not null default (B'f'::character varying (B file_user_name | text | (B file_user_uid| bigint | not null (B file_backup | boolean | not null default true (B file_display | boolean | not null default false (B file_restore_display | boolean | not null default false (B file_restore | boolean | not null default false (BIndexes: (B "file_info_2_display_idx" btree (file_type, file_parent_dir, file_name) (B (B 'file_name' and 'file_parent_dir' are the columns that could have (Bentries with the invalid unicode characters. Maybe I could/should use (Bsomething other than 'text'? These columns could contain anything that a (Bfile or directory name could be. (B (B Thanks! (B (BMadison (B (B-- (B-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- (BMadison Kelly (Digimer) (BTLE-BU, The Linux Experience; Back Up (Bhttp://tle-bu.thelinuxexperience.com
[HACKERS]
unsubscribe ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Madison Kelly wrote: Is there a way to store the name in raw binary? Yes: bytea. -O ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files
Tom Lane [EMAIL PROTECTED] writes: Heikki Linnakangas [EMAIL PROTECTED] writes: Consider the variant with extra marker files. In that case, backend B doesn't have to know about the .notcommitted status to flush the buffers. [ shrug ] It's still broken, and the reason is that there's no equivalent of fsync for directory operations. Consider Traditionally that's because directory operations were always synchronous, and hence didn't need to be fsynced. I think this is still true, other systems like qmail's maildir still depend on this, for example. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Oracle Style packages on postgres
One simple benefit to packages is just organization of related code.On 5/7/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:Oracle Style packages on postgresOVERVIEW: To emulate oracle server side development in postgres I required serverside packages.The following text demonstrates how to do this usingplpython on postgres 8 and suggests a language extension.WHAT ARE ORACLE PACKAGES? Looking back over the postgres discussion forums (particulary a discussionin 2001 following a proposal by Bill Studenmund) there appears to be someconfusion over what oracle packages are.Here's a concise definition : A black box processing engine with one or more public access functionsthat retains state across callsAn oracle package is created when first referenced. Its initializationcode is run once (ie costly queries to populate session wide package params) and the package dies at the end of the sessionAn analogy with OOP is that it's like having a single class instanceavailable for the duration of a session.SOME POWERFUL USES OF PACKAGES: 1. Pipes - oracle dbms_pipe built-in allows asynchronous communicationbetween any number of producer/consumer database sessions on any number ofpipes2. Logging - leave all logging/debug statements in code, decision on logging output can be made when the logging package is initialised (eg byquerying lookup tables for user, on/off, level, and destination).Combinelogging with pipes and the output can be stored in tables seperate from the current transaction.Include timing info down to milliseconds andlive problems/bottlenecks can more easily be identified.3. Batch reporting - more suited to autonomous transactions than loggingbut useful to have the report package store start time, duration, error/warning count running totals etc. and summarize automatically atreport end.See the example below on how to implement a version of the oracledbms_output package in plpythonEXTENSIONS TO POSTGRES: Oracle style package creation syntax is split into header and body so thatthe body(code) can be re-compiled without invalidating dependent objects.Postgres syntax for the dbms_output example (in any postgres server side language) would be along the lines of:CREATE OR REPLACE PACKAGE HEADER dbms_output AS FUNCTION dbms_output_put_line(text) RETURNS text, FUNCTION dbms_output_get_lines() RETURNS text;CREATE OR REPLACE PACKAGE BODY dbms_output AS $$ 'package code' $$ language;Adding pg_package with a link from pg_proc are the only changes requiredto the data dictionary.It would be nice to have similar dotted syntax as oracle(user.package.function) but would this mess up postgres namespaces? The language in which the package was created would process the 'packagecode', for example in python:ocreate public functions linking header declaration to package body code(see dbms_output example) oprocess embedded sql, eg l_curs=select * from dual -l_curs=self.execute('select * from dual')othe extracted sql can be 'prepared' by postgres and syntax exceptionsreported as compilation errors SUMMARY:Packages are an important addition to postgres.Some of the server sidelanguages have the potential to create them now.It would be useful toadd a common high level syntax before the various language implementations start developing their own solutions.I'm currently testing dbms_pipe on postgres, let me know if anyone isinterested.I replaced xml-rpc (5 messages/second) by sockets (600xfaster!), and may test corba Ronnie Mackay--EXAMPLE :CONVERT ORACLE BUILT-IN PACKAGE DBMS_OUTPUT: [Oracle syntax is :exec dbms_output.put_line('line1');]Postgresselect dbms_output_put_line('line 1');Postgresselect test_call_dbms_output_from_within_plpgsql('line 2(plpgsql)');Postgresselect test_call_dbms_output_from_within_plpython('line 3 (plpython)');Postgresselect dbms_output_put_line('line 4');Postgresselect dbms_output_get_lines();--- DBMS_OUTPUT DEMO ---line 1line 2 (plpgsql)line 3 (plpython) line 4--- DBMS_OUTPUT DEMO ---So using current postgres syntax the only difference with oracle is thatdbms_output.put_line('line 1'); becomesdbms_output_put_line('line 1'); The source code to implement the package body is returned by postgresfunction dbms_output()POSTGRES CREATE STATEMENTS FOR EXAMPLE:- CREATE or replace FUNCTION dbms_output_put_line(text) RETURNS text AS $$from plpython import getPackagereturn getPackage(GD, plpy, 'dbms_output').putLine(args[0])$$ LANGUAGE plpythonu;CREATE or replace FUNCTION dbms_output_get_lines() RETURNS text AS $$ from plpython import getPackagereturn getPackage(GD, plpy, 'dbms_output').getLines()$$ LANGUAGE plpythonu;-- package bodyCREATE OR REPLACE FUNCTION dbms_output() RETURNS text AS $$return from plpython import PlPythonPackageclass Package(PlPythonPackage): def __init__(self,
Re: [HACKERS] pl/pgsql enabled by default
Mike Mascari mascarm@mascari.com writes: 2. Issue a query like: SELECT * FROM view_of_salaries_based_on_current_user WHERE my_side_effect_function_that_inserts_into_a_temp_table(salary, employee); That's just exactly equivalent to SELECT * FROM (select * from all_salaries where user = CURRENT_USER ) WHERE malicious_function(salary,employee) Hm. If you incorrectly mark your function as IMMUTABLE even though it has side effects then the planner may indeed collapse this. Does the planner know it can't collapse views if the underlying tables aren't accessible to the user? -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Cleaning up unreferenced table files
Greg Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: [ shrug ] It's still broken, and the reason is that there's no equivalent of fsync for directory operations. Consider Traditionally that's because directory operations were always synchronous, and hence didn't need to be fsynced. That might be true with respect to the process requesting the directory operation ... but I think you missed the point entirely. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] pl/pgsql enabled by default
Greg Stark [EMAIL PROTECTED] writes: Hm. If you incorrectly mark your function as IMMUTABLE even though it has side effects then the planner may indeed collapse this. Does the planner know it can't collapse views if the underlying tables aren't accessible to the user? There are no cases where function or view collapsing elides permissions checks (if you have a counterexample please provide it!!). They could change the time at which permissions checks are applied, though; which has the potential for a REVOKE to not disallow execution of already- planned queries that ideally it should prevent. I believe that this risk will be fixed by the planned forcing of replanning after schema changes. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Oracle Style packages on postgres
Bob wrote: (B One simple benefit to packages is just organization of related code. (B (BAnd the package-scoped variables or constant values, similar to (Bthe global variables. (B (BIt will be very useful for application programmers (Bif one variable can be shared from several functions. (B (BI needed some tricks when I tried to port such PL/SQL to PL/pgSQL. (B (BBob wrote: (B One simple benefit to packages is just organization of related code. (B (B On 5/7/05, [EMAIL PROTECTED] mailto:[EMAIL PROTECTED]* [EMAIL PROTECTED] (B mailto:[EMAIL PROTECTED] wrote: (B (B Oracle Style packages on postgres (B (B OVERVIEW: (B (B To emulate oracle server side development in postgres I required server (B side packages. The following text demonstrates how to do this using (B plpython on postgres 8 and suggests a language extension. (B (B WHAT ARE ORACLE PACKAGES? (B (B Looking back over the postgres discussion forums (particulary a (B discussion (B in 2001 following a proposal by Bill Studenmund) there appears to be (B some (B confusion over what oracle packages are. Here's a concise definition : (B"A black box processing engine with one or more public access (B functions (B that retains state across calls" (B An oracle package is created when first referenced. Its initialization (B code is run once (ie costly queries to populate session wide package (B params) and the package dies at the end of the session (B An analogy with OOP is that it's like having a single class instance (B available for the duration of a session. (B (B SOME POWERFUL USES OF PACKAGES: (B (B 1. Pipes - oracle dbms_pipe built-in allows asynchronous communication (B between any number of producer/consumer database sessions on any (B number of (B pipes (B (B 2. Logging - leave all logging/debug statements in code, decision on (B logging output can be made when the logging package is initialised (B (eg by (B querying lookup tables for user, on/off, level, and (B destination). Combine (B logging with pipes and the output can be stored in tables seperate from (B the current transaction. Include timing info down to milliseconds and (B live problems/bottlenecks can more easily be identified. (B (B 3. Batch reporting - more suited to autonomous transactions than logging (B but useful to have the report package store start time, duration, (B error/warning count running totals etc. and summarize automatically at (B report end. (B (B See the example below on how to implement a version of the oracle (B dbms_output package in plpython (B (B EXTENSIONS TO POSTGRES: (B (B Oracle style package creation syntax is split into header and body (B so that (B the body(code) can be re-compiled without invalidating dependent (B objects. (B Postgres syntax for the dbms_output example (in any postgres server (B side (B language) would be along the lines of: (B CREATE OR REPLACE PACKAGE HEADER dbms_output AS (BFUNCTION dbms_output_put_line(text) RETURNS text, (BFUNCTION dbms_output_get_lines() RETURNS text; (B CREATE OR REPLACE PACKAGE BODY dbms_output AS $$ 'package code' $$ (B language; (B (B Adding pg_package with a link from pg_proc are the only changes required (B to the data dictionary. (B It would be nice to have similar dotted syntax as oracle (B (user.package.function) but would this mess up postgres namespaces? (B (B The language in which the package was created would process the 'package (B code', for example in python: (B o create public functions linking header declaration to package (B body code (B (see dbms_output example) (B o process embedded sql, eg l_curs=select * from dual - (B l_curs=self.execute('select * from dual') (B o the extracted sql can be 'prepared' by postgres and syntax exceptions (B reported as compilation errors (B (B SUMMARY: (B Packages are an important addition to postgres. Some of the server side (B languages have the potential to create them now. It would be useful to (B add a common high level syntax before the various language (B implementations (B start developing their own solutions. (B (B I'm currently testing dbms_pipe on postgres, let me know if anyone is (B interested. I replaced xml-rpc (5 messages/second) by sockets (600x (B faster!), and may test corba (B (B Ronnie Mackay (B (B (B - (B (B - (B (B EXAMPLE :CONVERT ORACLE BUILT-IN PACKAGE DBMS_OUTPUT: (B (B [Oracle syntax is :exec dbms_output.put_line('line1');] (B
Re: [HACKERS] pl/pgsql enabled by default
Tom Lane [EMAIL PROTECTED] writes: Greg Stark [EMAIL PROTECTED] writes: Hm. If you incorrectly mark your function as IMMUTABLE even though it has side effects then the planner may indeed collapse this. Does the planner know it can't collapse views if the underlying tables aren't accessible to the user? There are no cases where function or view collapsing elides permissions checks (if you have a counterexample please provide it!!). I'm talking about something like this. In guess there isn't a problem after all but it depends critically on the AND short-circuiting (and the order of evaluation of the expression not being changed). db= create view vtest as select * from test where a 1 ; db= create or replace function f(integer) returns integer as 'begin raise notice ''foo %'', $1; return $1; end' language plpgsql; db= explain select * from vtest where f(a)0; QUERY PLAN --- Seq Scan on test (cost=0.00..27.50 rows=112 width=4) Filter: ((a 1) AND (f(a) 0)) (2 rows) I can't come up with any circumstances where the function will get called before the a1 clause. If it were indexed then it would be evaluated first but it would no longer be relevant since the function wouldn't be getting called. But it's something to watch out for. If ever it seems like a wise idea to have the optimizer fiddle with the order of evaluation, say based on the selectivity or computational expense of the conditions then it could create a problem. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Oracle Style packages on postgres
Satoshi Nagayasu wrote: (B (BAn oracle package is created when first referenced. Its initialization (Bcode is run once (ie costly queries to populate session wide package (Bparams) and the package dies at the end of the session (BAn analogy with OOP is that it's like having a single class instance (Bavailable for the duration of a session. (B (BPL/Java has an object called "Session" that does exactly this. It is not (Bavailable from other languages at present. Are Packages supposed to be (Bcross-language? (B (BRegards, (BThomas Hallgren (B (B (B---(end of broadcast)--- (BTIP 9: the planner will ignore your desire to choose an index scan if your (B joining column's datatypes do not match