Re: [HACKERS] COPY (query) TO file
In article [EMAIL PROTECTED], Andrew Dunstan [EMAIL PROTECTED] writes:
> Mark Woodward wrote:
>> Tom had posted a question about file compression with copy. I thought
>> about it, and I want to throw this out and see if anyone thinks it is
>> a good idea. Currently, the COPY command only copies a table; what if
>> it could operate with a query, as:
>>   COPY (select * from mytable where foo='bar') AS bar TO stdout
> Isn't this already being worked on? The TODO list says:
>   Allow COPY to output from views
> IIRC Karel Zak posted a patch for that. Another idea would be to allow
> actual SELECT statements in a COPY. Personally I strongly favor the
> second option as being more flexible than the first.

How so? I see that

  psql -h somehost somedb -c "copy 'somequery' to stdout" > localfile

would be more terse than

  psql -h somehost somedb -c "create temp view tmp as somequery; copy tmp to stdout" > localfile

but what's more flexible there?

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [HACKERS] Porting MSSQL to PGSQL (Was: [OT] MySQL is bad, but THIS bad?)
In article [EMAIL PROTECTED], Martijn van Oosterhout kleptog@svana.org writes:
> On Mon, May 22, 2006 at 10:00:22AM -0500, Jim C. Nasby wrote:
>> T-SQL has statement-level triggers, and they get used a lot (some big
>> apps ONLY put code in triggers). Statement-level triggers are very
>> efficient for maintaining aggregates; the closest PG has are rewrite
>> rules. Yeah, I wish PostgreSQL had them. I've got clients that could
>> certainly make use of them.
> What are you referring to that is not supported currently?

Probably he means that a complete implementation of statement-level
triggers should include a way to access the rows affected by the
trigger call.
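For illustration, the SQL standard's "transition tables" are exactly such a mechanism for exposing the affected rows to a statement-level trigger. A sketch in standard-ish syntax (not supported by PostgreSQL at the time of this thread; `orders`, `new_rows`, and `add_to_totals` are made-up names):

```sql
-- Maintain an aggregate from a statement-level trigger by reading the
-- whole set of inserted rows via a transition table:
CREATE TRIGGER track_total
AFTER INSERT ON orders
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT
EXECUTE PROCEDURE add_to_totals();  -- hypothetical function that reads new_rows
```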
Re: [HACKERS] Question about Postgresql time fields(possible bug)
In article [EMAIL PROTECTED], Bruce Momjian pgman@candle.pha.pa.us writes:
> Tony Caduto wrote:
>> Hi, I just noticed today that PostgreSQL accepts a value of 24:00:00;
>> this is for sure not correct, as there is no such thing as 24:00:00.
>> PG Admin III will display this value just fine, which is also
>> incorrect; PG Lightning Admin catches it as an invalid time. But
>> shouldn't there be some validation of times and dates at the server
>> level? There are people who are using PG Admin III and they don't
>> even know they have bogus dates and times in their databases.
> A leap second will show as 24:00:00.  It is a valid time.

Shouldn't such a leap second be represented as '... 23:59:60'?
Re: [HACKERS] Question about Postgresql time fields(possible bug)
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> Harald Fuchs [EMAIL PROTECTED] writes:
>> Bruce Momjian pgman@candle.pha.pa.us writes:
>>> A leap second will show as 24:00:00.  It is a valid time.
>> Shouldn't such a leap second be represented as '... 23:59:60'?
> People who didn't like 24:00:00 would complain about that, too ;-)

Well, Richard T. Snodgrass says in "Developing Time-Oriented Database
Applications in SQL" (pg. 81) the following:

  Most days have 24 hours. The day in April that daylight saving time
  kicks in has only 23 hours; the day in October that daylight saving
  time ends contains 25 hours. Similarly, minutes can have 62 seconds
  (though up to 1999 only one leap second has ever been added to any
  particular minute), as mentioned in this standard [44, p. 25].

where ref [44] is "ISO, Database Language SQL. ISO/IEC 9075:1992.
ANSI X3.135-1992".

To me this sounds like 23:59:60, doesn't it?
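Common datetime libraries split the same way, incidentally. In Python, for instance, the C-style parsing layer tolerates a 60th second while the `datetime` type rejects it outright (an illustration of the two camps, not of PostgreSQL behaviour):

```python
import time
import datetime

# time.strptime follows the C struct tm convention, which allows
# tm_sec up to 61 precisely to accommodate leap seconds:
t = time.strptime("23:59:60", "%H:%M:%S")
print(t.tm_sec)                      # 60

# datetime.time, by contrast, only allows seconds 0..59:
try:
    datetime.time(23, 59, 60)
except ValueError as exc:
    print("rejected:", exc)
```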
Re: [HACKERS] Oddity with extract microseconds?
In article [EMAIL PROTECTED], Christopher Kings-Lynne [EMAIL PROTECTED] writes:
> mysql> SELECT EXTRACT(MICROSECOND FROM '2003-01-02 10:30:01.00123');
> +-------------------------------------------------------+
> | EXTRACT(MICROSECOND FROM '2003-01-02 10:30:01.00123') |
> +-------------------------------------------------------+
> |                                                  1230 |
> +-------------------------------------------------------+
> 1 row in set (0.00 sec)
>> Does contrary behavior from MySQL count as evidence that PostgreSQL's
>> behavior is correct? :-)
> No... I happen to think that their way is more consistent, though.
> Pity it's not in the spec.

I'd say the comparison with MySQL is useless because MySQL is unable
to store microseconds in a DATETIME or TIMESTAMP column, although you
can extract microseconds from a date/time literal.
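For what it's worth, Python's `datetime` reads the literal the same way MySQL does: the fractional part ".00123" is 0.00123 s, i.e. 1230 microseconds (shown purely to illustrate the two interpretations):

```python
from datetime import datetime

ts = datetime.strptime("2003-01-02 10:30:01.00123", "%Y-%m-%d %H:%M:%S.%f")

# ".00123" seconds = 1230 microseconds, matching MySQL's
# EXTRACT(MICROSECOND ...) result above.
print(ts.microsecond)                # 1230
```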
Re: [HACKERS] Using multi-row technique with COPY
In article [EMAIL PROTECTED], Jim C. Nasby [EMAIL PROTECTED] writes:
> On Sun, Nov 27, 2005 at 07:44:55PM +, Simon Riggs wrote:
>> ... not have any unique indexes or row triggers. It should be
>> possible to take advantage of this automatically when those
>> requirements are met, without any new options. Just as it was with
>> Seq Scans, this is worth about a 10% reduction in CPU for a COPY FROM.
>> [snip]
>> FSM access would need to change slightly to allow for
>> whole-block-only requests to be made for heaps, without damaging the
>> average row length calculation. It might be simpler to ignore the FSM
>> entirely?
> Does that mean that this fast copy would end up not re-using space on
> pages that have space available? ISTM that's something users would
> want to be able to override. In fact, it seems like it shouldn't be a
> default behavior...

Why not? If you later do INSERT/UPDATE/DELETE, space will get reused
anyway, and if you don't (i.e. a one-time bulk load into a constant
table), you should afterwards do a VACUUM FULL ANALYZE anyway.
[HACKERS] 8.1 substring bug?
Consider the following:

  CREATE TEMP TABLE tbl (
      id SERIAL NOT NULL,
      PRIMARY KEY (id)
  );

  COPY tbl (id) FROM stdin;
  1
  2
  3
  4
  \.

  SELECT substring ('1234567890' FOR (SELECT count (*) FROM tbl)::int);

This returns '1234', as expected. But

  SELECT substring ('1234567890' FOR (SELECT count (*) FROM tbl));

returns NULL. I think the problem is that SELECT count(*) returns a
BIGINT whereas substring expects an INT. Shouldn't there be a warning?
Re: [HACKERS] 8.1 substring bug?
In article [EMAIL PROTECTED], Martijn van Oosterhout kleptog@svana.org writes:
> It's even sillier than that:
>
> test=# SELECT substring ('1234567890' FOR 4::bigint);
>  substring
> -----------
>
> (1 row)
>
> test=# SELECT substring ('1234567890' FOR 4::int);
>  substring
> -----------
>  1234
> (1 row)
>
> Looking at the explain verbose output makes it look like it's using
> the wrong version of substring. It's using the oid 2074 one:
>
> test=# select oid, oid::regprocedure from pg_proc where proname = 'substring';
>   oid  |                oid
> -------+-----------------------------------
>    936 | substring(text,integer,integer)
>    937 | substring(text,integer)
>   1680 | substring(bit,integer,integer)
>   1699 | substring(bit,integer)
>   2012 | substring(bytea,integer,integer)
>   2013 | substring(bytea,integer)
>   2073 | substring(text,text)
>   2074 | substring(text,text,text)
>  16579 | substring(citext,integer,integer)
>  16580 | substring(citext,integer)
> (10 rows)
>
> That substring is for regular expressions. Nasty, not sure how to deal
> with that one...

Ah, so it's using substring (STRING from PATTERN for ESCAPE)? Yes,
that explains the NULL. Looks like we're in the INT/BIGINT confusion
again...

> Have a nice day,

It's a nice day since I have a nice workaround for this misfeature :-)
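The workaround alluded to is presumably the explicit cast already shown in the bug report, which steers overload resolution to the non-regex variant:

```sql
-- Casting the BIGINT down to INT selects substring(text,integer,...)
-- instead of the regex variant substring(text,text,text):
SELECT substring ('1234567890' FOR (SELECT count (*) FROM tbl)::int);
```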
Re: [HACKERS] Supporting NULL elements in arrays
In article [EMAIL PROTECTED], Andrew Dunstan [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> Pavel Stehule [EMAIL PROTECTED] writes:
>>> it's great news. My personal opinion about formatting NULL values:
>>> '{a, NULL, b}' -- longer, clean; NULL is NULL
>> Unfortunately, that already has a meaning, and it's not that.
> What a pity. I don't see any alternative to the empty element proposal
> - it's worth the possible breakage.

How about '{a, \N, b}'?
Re: [HACKERS] Reducing the overhead of NUMERIC data
In article [EMAIL PROTECTED], Gregory Maxwell [EMAIL PROTECTED] writes:
> On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote:
>> Yeah, and while one way of removing that dependence is to use ICU,
>> that library wants everything in UTF-16. So we replace copying to add
>> NULL to a string with converting UTF-8 to UTF-16 on each call. Ugh!
>> The argument for UTF-16 is that if you're using a language that
>> doesn't use ASCII at all, UTF-8 gets inefficient pretty quickly.
> Is this really the case? Only Unicode values 000800-00FFFF are smaller
> in UTF-16 than in UTF-8, and in their case it's three bytes vs two.
> Cyrillic, Arabic, Greek, Latin, etc. are all two bytes in both.

IMHO the best encoding for Cyrillic, Arabic, Greek, Latin, etc. is
ISO-8859-* - just one byte. You need UTF-* only when you want to have
more than one of them in the same column.
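The byte counts are easy to verify. For example, in Python, three Greek letters come out single-byte in ISO-8859-7 (the single-byte Greek encoding) and two bytes each in both UTF-8 and UTF-16:

```python
# Three Greek letters; UTF-16 measured as utf-16-le to ignore the
# optional byte-order mark.
s = "αβγ"

print(len(s.encode("iso-8859-7")))   # 3
print(len(s.encode("utf-8")))        # 6
print(len(s.encode("utf-16-le")))    # 6
```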
Re: [HACKERS] Call for 7.5 feature completion
In article [EMAIL PROTECTED], Christopher Kings-Lynne [EMAIL PROTECTED] writes:
>> * optional interface which sends a row typeoid along with each row in
>> a result set
> Oh, and 'select rowid, * from table' which returns a special rowid
> column that just incrementally numbers each row.

Why? It's a thing best handled on the client side, and we already have
a way to do it server-side (temporary sequences).
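The temporary-sequence way of numbering rows server-side looks roughly like this (a sketch; table and sequence names are made up):

```sql
-- Number the rows of "mytable" on the server side:
CREATE TEMP SEQUENCE rownum_seq;

SELECT nextval('rownum_seq') AS rowid, t.*
FROM mytable t;
```

Note that without an ORDER BY the numbering follows whatever row order the server happens to produce.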
Re: [HACKERS] invalidating cached plans
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> One possible approach is to do the invalidation on a sufficiently
> coarse grain that we don't care. For example, I would be inclined to
> make any change in a table's schema invalidate all plans that use that
> table at all; that would then subsume the constraint problem for
> instance. This doesn't solve the inlined function problem however.

How about making this even more coarse-grained? Blindly throw all
cached plans away when something in the database DDL changes.
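That coarsest scheme can be caricatured in a few lines (a toy sketch, nothing to do with the actual backend code): keep a generation counter, bump it on any DDL, and treat any plan cached under an older generation as stale.

```python
# Toy plan cache with generation-based blanket invalidation.
class PlanCache:
    def __init__(self):
        self.generation = 0
        self.plans = {}              # query text -> (generation, plan)

    def ddl_changed(self):
        """Called for any DDL; implicitly invalidates every cached plan."""
        self.generation += 1

    def get(self, query, planner):
        cached = self.plans.get(query)
        if cached and cached[0] == self.generation:
            return cached[1]         # still valid, no replanning
        plan = planner(query)        # replan under the current generation
        self.plans[query] = (self.generation, plan)
        return plan
```

Assuming CREATE OR REPLACE FUNCTION counts as "DDL changed", this also subsumes the inlined-function problem, since replanning re-inlines; the price is replanning everything after every DDL statement.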
Re: [HACKERS] invalidating cached plans
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> One possible approach is to do the invalidation on a sufficiently
> coarse grain that we don't care. For example, I would be inclined to
> make any change in a table's schema invalidate all plans that use that
> table at all; that would then subsume the constraint problem for
> instance. This doesn't solve the inlined function problem however.

How about using an even coarser grain? Whenever something in the
database in question changes, blindly throw away all cached plans for
this DB.
Re: [HACKERS] libpq and prepared statements progress for 8.0
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> David Wheeler [EMAIL PROTECTED] writes:
>> On Sep 20, 2004, at 12:34 AM, Bruce Momjian wrote:
>>> I think we should favor libpq usage wherever possible and only
>>> re-implement it in the native language when required, like for
>>> jdbc/java.
>> I don't normally post "me too" posts, but I think that what Bruce
>> says here is extremely important.
> Allow me to state a contrary position ;-) The first problem with this
> approach is that it requires libpq to be all things to all people.
> We've already had some discussion in this thread about the tension
> between supporting application programs written in C, which want one
> set of features, and drivers, which need some other ones. After awhile
> you end up with a bloated, probably buggy library. We're already some
> way down that path, and I don't care to go much further.

I don't think that's what David meant, although he said so :-) What we
should have is a C API especially for use by driver authors; probably
this API is so far away from the rest of libpq that it should not be
part of it. This API could make life easier for driver authors,
resulting in more and better drivers for more languages.

> The second problem is the one someone already pointed out, that you
> *need* multiple implementations in order to keep the protocol
> definition honest.

Nobody forces a driver author to use that API, and there are driver
authors who *cannot* use it, e.g. Java. This means there will be more
than one implementation anyway.
Re: [HACKERS] Calling PL functions with named parameters
In article [EMAIL PROTECTED], Oliver Jowett [EMAIL PROTECTED] writes:
> I think you just made my point for me. C++ allows default parameters
> and resolves the ambiguity by disallowing ambiguous calls when they
> happen. I'm not sure why C++ doesn't disallow it at declaration time
> off the top of my head -- perhaps because you'd get inconsistent
> behaviour if the candidates were split across compilation units.

IIRC this was due to multiple inheritance. You could inherit methods
with the same name and parameter list from two different base classes.
Disallowing that at declaration time would mean disallowing
inheritance (even indirectly) from these two base classes, even though
the derived class didn't use the ambiguous methods.
Re: [HACKERS] Add Missing From?
In article [EMAIL PROTECTED], Stephan Szabo [EMAIL PROTECTED] writes:
> Right, the reason it's important is that there are some things now
> that are potentially tied together. If you have table A with rows
> A1,...,An and table B with rows B1,...,Bm and the delete join
> condition gives the two outputs (A1,B1) and (A2,B1), does a before
> trigger returning NULL for B1 prevent A1 and A2 from being deleted?
> Do row triggers for B1 get run twice?

Methinks that at the time of deletion the join between A and B should
not matter any more. The DELETE code would see A1 and A2, and it would
see B1. Thus a B1 trigger would get called only once. By the way, this
would be an issue also for single-table DELETEs.

> Also, if we do allow multiple-table-at-a-time deletes, should we be
> considering multiple-table-at-a-time updates, and if so, what works
> for those?

IMHO multi-table UPDATEs would be much harder because in this case the
join conditions would matter at update time: which row in A would get
updated with values from which row in B?

>>> I'm also not sure how rules apply to these multi-table delete
>>> statements.
>> See above. If we break the join relationships before deletion we can
>> issue DELETE requests logically sequentially, and these requests
>> would do the right thing: either fire a trigger or get changed by a
>> rule. Well, my hypothetical multi-table-DELETE would then call the
>> DELETE rule. Where do you see a problem?
> It's not a problem, but I'd think that:
>     delete from foo, bar where foo.val=3;
> should probably act similarly to:
>     create view v as select * from foo,bar where foo.val=3;
>     delete from v;
> or probably even:
>     create view v as select * from foo,bar;
>     delete from v where val=3;
> If that's true, we're not merely defining behavior for the first, but
> defining what we expect to make the behavior for the latter two as
> well, so we should think about that as well.

That's right, but I think it would be well-defined if we say: build
the result set, and then for each table separately delete all distinct
rows covered by the result set. In either case we would end up with
deleting some rows in foo and all rows in bar.

>> The general problem seems to be that a table can occur at many places
>> within one query, and at each place different rows are matched. On
>> the top level this should not be a problem: just specify the correct
>> table alias between DELETE and FROM.
> That's not quite the issue I was thinking of. I was thinking of cases
> where the output rows are not directly/easily connected to base table
> rows, like
>     delete foo from (select sum(a) from tab group by b
>                      having sum(a) > 10) foo;
>> Perhaps we could disallow deleting from tables/aliases in deeper
>> subselect levels?
> That's probably a good thing for any first implementation. It'd
> probably still be good to think about those cases to not box out
> possible future enhancements.

As you correctly pointed out above, this is a similar problem to
updatable views. Does anyone know what the SQL standard says about
those beasts?
Re: [HACKERS] Add Missing From?
In article [EMAIL PROTECTED], Josh Berkus [EMAIL PROTECTED] writes:
> Harald,
>> You're talking about *the* deletion target table. Sorry to mention
>> the M word again, but MySQL allows deleting from more than one table
>> at the same time. Should we support that?
> Nope. In fact, I'd argue pretty strongly against any move to do so.
> MySQL supports multi-table delete for 2 reasons:
> 1) MySQL does not support CASCADE Foreign Keys, and as a result this
> is the only way they can clean out all related records.

Not quite - MySQL implemented ON DELETE CASCADE for foreign keys
_before_ multi-table DELETEs.

> 2) To be blunt: MySQL doesn't care about your data.

I know - why do you think I'm lurking here? ;-)

> Since we do support CASCADE FKs, there is no reason for us to support
> this syntax; just set your FKs up correctly and it's taken care of.
> Were we to consider implementing this, the implementors would have to
> answer the following questions:
> A) In what order are the rows deleted, from which table first?

In exactly the same order as for single-table DELETEs:
implementation-defined.

> B) If no join criteria are supplied, is it OK to delete all rows from
> one of the tables?

Yes - people creating Cartesian products deserve punishment :-)

> C) If one of the tables had FKs or triggers that affect one of the
> other tables, when do these get evaluated/fired?

Implementation-defined.

> Overall, I consider it a very, very bad idea.

My main concern was not multi-table DELETEs per se, but a way to do
deletions based on results of arbitrary queries. Multi-table DELETEs
would just be a logical extension to that. I thought about something
like this:

  DELETE [tbl [,tbl]...] FROM fromexp

fromexp could be anything which is legal after a SELECT ... FROM,
including outer joins and LIMIT clauses. tbl could be names or aliases
of tables used in fromexp. If none are supplied, this would default to
all tables used there, thus making "DELETE FROM t1 WHERE whatever"
just a degenerate case.

The semantics of that would be:

1. Do a SELECT * FROM fromexp
2. For every tbl, delete everything covered by the result set of the
   SELECT, in some arbitrary order (unless restricted by an ORDER BY)
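The proposed syntax and its two-step semantics might look like this (hypothetical, not valid in any PostgreSQL release; all table and column names are made up):

```sql
-- Delete the matching rows from t1 and t2, using t3 only to join:
DELETE t1, t2
  FROM t1
  JOIN t2 ON t2.t1_id = t1.id
  JOIN t3 ON t3.t2_id = t2.id
 WHERE t3.state = 'obsolete';

-- Step 1 (conceptually): run SELECT * over the same FROM clause.
-- Step 2: for each listed table, delete every distinct row of that
--         table covered by the result set.
```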
Re: [HACKERS] Add Missing From?
In article [EMAIL PROTECTED], Stephan Szabo [EMAIL PROTECTED] writes:
>> In exactly the same order as for single-table DELETEs:
>> implementation-defined.
> I think you probably meant "in an unspecified order".
> Implementation-defined really doesn't mean anything when you're trying
> to define what it means for a particular implementation. ;)

You're right - what I meant was something like "the same order in
which SELECT without an ORDER BY returns its rows" - something you
can't rely upon.

>>> B) If no join criteria are supplied, is it OK to delete all rows
>>> from one of the tables?
>> Yes - people creating Cartesian products deserve punishment :-)
> What we do here should be related to what we would want to happen on a
> view with a join if we were to make automatically updatable views.

I think automatically updatable views are something like views with
automatically generated INSERT/UPDATE/DELETE rules, aren't they? Well,
my hypothetical multi-table-DELETE would then call the DELETE rule.
Where do you see a problem?

>> I thought about something like this:
>>     DELETE [tbl [,tbl]...] FROM fromexp
>> fromexp could be anything which is legal after a SELECT ... FROM,
>> including outer joins and LIMIT clauses.
> Including subselects in FROM? That's a big barrel of fish. How does
> this interact with subselects with union or distinct or group by?

Ouch, didn't think about that. The general problem seems to be that a
table can occur at many places within one query, and at each place
different rows are matched. On the top level this should not be a
problem: just specify the correct table alias between DELETE and FROM.
Perhaps we could disallow deleting from tables/aliases in deeper
subselect levels?
Re: [HACKERS] Add Missing From?
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> Jan Wieck [EMAIL PROTECTED] writes:
>> What about DELETE FROM staff JOIN users ... then?
> I don't much care for that, mainly because in my mind "x JOIN y"
> should always be semantically equivalent to "y JOIN x". I think we
> want a real clear syntactical separation between the deletion target
> table and the other tables.

Just one minor nit: you're talking about *the* deletion target table.
Sorry to mention the M word again, but MySQL allows deleting from more
than one table at the same time. Should we support that?
Re: [HACKERS] Add Missing From?
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> Robert Treat [EMAIL PROTECTED] writes:
>> Well, as y'all have pointed out, the feature is not SQL spec (for
>> some reason I thought it had been put in), so since the update syntax
>> seems quite similar to Oracle's, perhaps they can provide a pointer
>> on delete syntax as well? I can't seem to find my Oracle syntax book,
>> anyone have one handy?
> Didn't get any Oracle hits in a quick google, but I did find out that
> MySQL spells it USING:
>     DELETE FROM target_tbl USING other_tbls WHERE ...
> This isn't a particularly compelling precedent seeing that (a) MySQL
> doesn't use our flavor of UPDATE syntax and (b) they only adopted the
> above in 4.0.2.

Actually, MySQL supports two different syntaxes for multi-table
DELETEs:

1. DELETE t1,t2 FROM t1,t2,t3 WHERE t1.id=t2.id AND t2.id=t3.id;
   (introduced in MySQL 4.0.0)

2. DELETE FROM t1,t2 USING t1,t2,t3 WHERE t1.id=t2.id AND t2.id=t3.id;
   (introduced in MySQL 4.0.2)
Re: [HACKERS] pg_dump bug fixing
In article [EMAIL PROTECTED], Rod Taylor [EMAIL PROTECTED] writes:
> On Mon, 2004-07-19 at 12:36, Josh Berkus wrote:
>> Rod,
>>> I think what we want is a clean template without all of the extras
>>> that template1 has.
>> We have this, it's called template0.
> Doesn't work for me. I remove a number of things that are included by
> default in template0, but yes, it's close.

I think pg_dump should do a kind of diff between template1 and the
database in question and include the necessary statements in the dump
to allow pg_restore to replay the diff after it has created the
database from template1.
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes:
> Shridhar Daithankar [EMAIL PROTECTED] writes:
>> I was toying around with the idea of converting all the
>> memory-related parameters in postgresql.conf to kilobytes, for
>> simplicity and uniformity.
> Why is that a good idea?

Two reasons:

1. Some values are in KB, some in 8 KB blocks
2. I find it easier to calculate in KB

I'd like to see the following:

* If the value is purely numeric, treat it as before (to ensure
  compatibility with older versions)
* If the value is numeric with a suffix of [KMG], interpret it as KB,
  MB, or GB, respectively
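The suffix rule is simple enough to pin down in a few lines (a hypothetical parser sketching the proposal, not the actual GUC code):

```python
def parse_mem(value: str) -> int:
    """Parse a postgresql.conf-style memory setting into kilobytes.

    A bare number is passed through unchanged (backward compatible);
    a K/M/G suffix means kilobytes, megabytes, or gigabytes.
    """
    factors = {"K": 1, "M": 1024, "G": 1024 * 1024}
    value = value.strip().upper()
    if value and value[-1] in factors:
        return int(value[:-1]) * factors[value[-1]]
    return int(value)           # purely numeric: old interpretation
```

With this, `parse_mem("16M")` yields 16384 (KB), while `parse_mem("1000")` keeps its old meaning.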
Re: [HACKERS] pg_dump --comment?
In article [EMAIL PROTECTED], Chris Campbell [EMAIL PROTECTED] writes:
> Harald Fuchs wrote:
>> Why don't you just do
>>     ( echo "-- This is my comment"; pg_dump whatever ) > dumpfile
>> ?
> How could I dump using the custom format, and then use dumpfile with
> pg_restore to restore the dump? If I just prepend the comment to the
> file, then pg_restore will choke, since the file won't be in the
> proper custom format. I would have to remove the comment before
> sending the file to pg_restore. Is there an easy way to do that? That
> can be easily automated, and not take a huge amount of time given a
> 4 gig dump file that must be modified?

Since pg_restore is able to read from standard input, that should not
be a problem: instead of

  pg_restore options dumpfile

just do

  sed 1d dumpfile | pg_restore options