Re: [GENERAL] Solved? Re: 8.2.4 signal 11 with large transaction
On 2007-07-20, Bill Moran <[EMAIL PROTECTED]> wrote:
> It just occurred to me that there's another wildcard in this one.
> The 8.1 system I tested was on FreeBSD 5.5, while both 8.2 systems
> were running on FreeBSD 6.2. I wonder if FreeBSD has changed
> which signal gets sent on memory exhaustion?

Neither 5.x nor 6.x sends any sort of signal simply because maxdsize was
reached; the only effect of reaching maxdsize is that further memory
allocations fail. I'd guess that the most likely cause of a sig11 under
those circumstances is a stack overflow resulting from recursive errors...

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [GENERAL] Processing a work queue
On 2007-04-30, "John D. Burger" <[EMAIL PROTECTED]> wrote:
> Andrew - Supernews wrote:
>
>>> Anyone have any ideas on how to handle a work queue?
>>
>> Advisory locks (userlocks in pre-8.2).
>
> Can someone explain why these are a better fit than whatever locks
> SELECT FOR UPDATE acquires?

They can be acquired without blocking, and they are non-transactional
(and can therefore be held for long periods of time while you work on the
item - this allows you either to detect cases where a queue runner died
before completing an item, or (where appropriate) to automatically release
such items back into the queue).

The nonblocking bit avoids the need for a backoff-and-retry in the case
where two queue runners both try to fetch from the queue at the same
time - using advisory locks they both get a (different) item, rather than
one getting a serialization failure.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
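[Editor's sketch of the pattern described above, using the built-in
advisory-lock functions available from 8.2 onwards; the `queue` table and
its `id` column are illustrative names, not from the thread:]

```sql
-- Each runner looks for an item whose advisory lock it can take without
-- blocking: pg_try_advisory_lock() returns false immediately instead of
-- waiting, so two concurrent runners end up holding different items.
-- (Simplification: the lock function may be evaluated for a few rows
-- beyond the one returned, leaving extra locks held; a careful runner
-- releases any lock for a row it doesn't actually process.)
SELECT id
  FROM queue
 WHERE pg_try_advisory_lock(id)
 LIMIT 1;

-- When the item is finished, remove it and release the session-level lock:
DELETE FROM queue WHERE id = 42;
SELECT pg_advisory_unlock(42);
```

Because the lock is session-level rather than transactional, a runner can
hold it across many transactions while working on the item; if the runner's
backend dies, the lock vanishes and the item becomes claimable again.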
Re: [GENERAL] Processing a work queue
On 2007-04-26, Steve Crawford <[EMAIL PROTECTED]> wrote:
> Anyone have any ideas on how to handle a work queue?

Advisory locks (userlocks in pre-8.2).

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] hashtext () and collisions
On 2007-04-11, "Leon Mergen" <[EMAIL PROTECTED]> wrote:
> Now, my question is: how big is the chance that a collision happens
> between hashes ? I noticed that the function only returns a 32 bit
> number, so I figure it must be at least once in the 4 billion values.

Assuming it's a uniform random hash, 32 bits long, then if you have 65536
values, you have a ~40% chance of at least one collision. Any defects in
the hash function only increase that probability.

This is a result of what's known as the "birthday paradox" (so-called
because in a group of 23 people, there is a better than even chance that
two of them share a birthday). The number of rows needed to have an
approximately even chance of at least one collision grows as the
_square root_ of the number of hash buckets; or to put it another way,
you always need _more than twice as many bits_ in your hash value as you
think you do (e.g. using md5(), which is a 128-bit hash).

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
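[Editor's note: the ~40% figure is easy to check. The probability of no
collision among n uniform draws from N buckets is the product of
(1 - i/N) for i = 1 .. n-1, which can be computed in SQL via a log-sum:]

```sql
-- Exact birthday-paradox probability for n = 65536 values in 2^32 buckets
SELECT 1 - exp(sum(ln(1 - i / 4294967296.0))) AS p_at_least_one_collision
FROM generate_series(1, 65535) AS s(i);
-- result is approximately 0.39
```

The exponent sums to roughly -(65535 * 65536 / 2) / 2^32 ≈ -0.5, giving
1 - e^-0.5 ≈ 0.39, which matches the ~40% quoted above.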
Re: [GENERAL] More grist for the PostgreSQL vs MySQL mill
On 2007-01-21, Ron Johnson <[EMAIL PROTECTED]> wrote:
>> And no BDB (at least last I checked is not GPL)
>
> It's BSD (for obvious reasons), no?

No, Sleepycat's licence is _NOT_ BSD.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Aggregate in Correlated SubQuery
On 2006-10-15, Niederland <[EMAIL PROTECTED]> wrote:
> Before postgresql 8.1.5, I could do the following to find the first
> lead that created a prospect in my application.
>
> SELECT
>     Lead.LeadID,
>     Prospect.ProspectID
> FROM
>     Prospect INNER JOIN Lead USING (ProspectID)
> WHERE
>     Lead.CreationDate = (SELECT MIN(Lead.CreationDate) FROM Lead AS LL
>                          WHERE LL.ProspectID = Lead.ProspectID)

I think that was always wrong, and that what you wanted was
MIN(LL.CreationDate). The aggregate in the subquery must be an aggregate
over the subquery's rows, not over the outer query. (Though there are
ways to do this query without the subquery at all.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
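[Editor's sketch, against the same schema as the post: the corrected
subquery, plus one subquery-free alternative using PostgreSQL's
DISTINCT ON:]

```sql
-- Corrected: the aggregate now ranges over the subquery's own alias LL
SELECT Lead.LeadID, Prospect.ProspectID
FROM Prospect INNER JOIN Lead USING (ProspectID)
WHERE Lead.CreationDate = (SELECT MIN(LL.CreationDate) FROM Lead AS LL
                           WHERE LL.ProspectID = Lead.ProspectID);

-- Without a subquery: keep the earliest lead per prospect
SELECT DISTINCT ON (ProspectID) LeadID, ProspectID
FROM Lead
ORDER BY ProspectID, CreationDate;
```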
Re: [GENERAL] more anti-postgresql FUD
On 2006-10-13, Alexander Staubo <[EMAIL PROTECTED]> wrote:
> Makes sense. However, in this case I was batching updates in
> transactions and committing each txn at 1 second intervals, all on a
> single connection. In other words, the bottleneck illustrated by this
> test should not be related to fsyncs, and this does not seem to
> explain the huge discrepancy between update (1,000/sec) and insert
> (9,000 inserts/sec, also in 1-sec txns) performance.

Update has to locate the one live row version amongst all the dead ones;
insert doesn't need to bother.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] more anti-postgresql FUD
On 2006-10-13, Alexander Staubo <[EMAIL PROTECTED]> wrote:
> On Oct 13, 2006, at 17:13 , Andrew - Supernews wrote:
>> Your disk probably has write caching enabled. A 10krpm disk should be
>> limiting you to under 170 transactions/sec with a single connection
>> and fsync enabled.
>
> What formula did you use to get to that number?

It's just the number of disk revolutions per second. Without caching,
each WAL flush tends to require a whole revolution unless the on-disk
layout of the filesystem is _very_ strange. You can get multiple commits
per WAL flush if you have many concurrent connections, but with a single
connection that doesn't apply.

> Is there a generic
> way on Linux to turn off (controller-based?) write caching?

I don't use Linux, sorry. Modern SCSI disks seem to ship with WCE=1 on
mode page 8 on the disk, thus enabling evil write caching by default.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
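[Editor's note: the arithmetic behind the ~170/sec ceiling, spelled out:]

```
10,000 rev/min / 60 s/min ~= 166.7 revolutions/sec
one honest WAL flush per revolution => ~167 commits/sec max
                                        on a single connection
```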
Re: [GENERAL] more anti-postgresql FUD
On 2006-10-13, Alexander Staubo <[EMAIL PROTECTED]> wrote:
> On my box (Dell PowerEdge 1850, dual Xeon 2.8GHz, 4GB RAM, 10kRPM
> SCSI, Linux 2.6.15, Ubuntu) I get 1,100 updates/sec, compared to
> 10,000 updates/sec with MySQL/InnoDB, using a stock installation of
> both. Insert performance is only around 10% worse than MySQL at
> around 9,000 rows/sec. Curiously enough, changing shared_buffers,
> wal_buffers, effective_cache_size and even fsync seems to have no
> effect on update performance, while fsync has a decent effect on
> insert performance.

Your disk probably has write caching enabled. A 10krpm disk should be
limiting you to under 170 transactions/sec with a single connection and
fsync enabled.

I also did some tests on this, and even though the machine I was testing
on had some competing database activity, autovacuum was effective at
keeping the table size stable (at 70-odd pages) when running several
hundred thousand updates on a 1-row table.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] looping through query to update column
On 2006-10-13, "Albe Laurenz" <[EMAIL PROTECTED]> wrote:
> You lock the table (with LOCK) or the row you're working on
> (with SELECT FOR UPDATE) so that nobody else can change it while
> you are working on it.
>
> You need something like ctid if your table has the fundamental flaw
> of lacking a primary key.

Looping over rows unnecessarily is a mistake.

You can add a SERIAL column to a table using ALTER TABLE, which will
automatically number the existing rows; this is a better way to fix a
lack of a primary key than messing around with ctids.

For a one-off update, use a temporary sequence ("your_table" standing in
for the real table name):

create temporary sequence foo;
update your_table set recordid = nextval('foo');

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Need help with a function from hell..
On 2006-10-03, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I have a small coding problem where my function is becoming, well, too
> ugly for comfort. I haven't finished it but you will get the picture
> below.
>
> First a small description of the purpose. I have an aggregate function
> that takes a string and simply concatenates that string to the previous
> (internal state) value of the aggregate, example:
>
> "Hello:World" || ", " || "World:Hello" --> "Hello:World, World:Hello"
>
> My problem is that I sometimes get the same value before the colon
> sign and in those cases I should not add the whole string to the
> previous value of the aggregate but extract the value that is behind
> the colon and add it to the already existing part which matched the
> value before the colon, but with a slash as a delimiter, example:
>
> Internal state: "Hello:World, World:Hello"
> New value: "Hello:Dolly"
> After function is run: "Hello:World/Dolly, World:Hello"
>
> So what I am doing is a lot of strpos() and substr() functions (I have
> previously asked for the speed of the substr() function) but it is
> beginning to look really awful.

You might have better luck with a different approach. For example, start
by accumulating the values into an array rather than a string, and rather
than try to do the magic bits in the transition function, do them in one
pass at the end, making use of the full power of SQL queries rather than
trying to do your own procedural logic.

If you think about your problem in SQL terms, what you're really trying
to do is essentially a "group by" on your first field. If you can avoid
the need to pass that in as a colon-delimited value, then your life will
be much simpler; but even if you can't avoid that, the SQLish solution
will be easier.
As a sample of what you can do, here is a function that does part of the
job (requires the array_accum aggregate as given as an example in the
manual):

create function foo(text[]) returns text[] language sql immutable as $f$
  select ARRAY(
    select k || ':' || array_to_string(v,'/')
      from (select split_part($1[i],':',1) as k,
                   array_accum(substring($1[i] from ':(.*)')) as v
              from generate_series(array_lower($1,1),
                                   array_upper($1,1)) s(i)
             group by k) s1)
$f$;

=> select array_to_string(foo(ARRAY['foo:bar', 'baz:quux', 'foo:baz']),',');
   array_to_string
----------------------
 baz:quux,foo:bar/baz
(1 row)

To understand the function, look at the subqueries from the inside out;
the inner one splits the foo:bar elements into two columns, groups by the
first and collects the corresponding values into an array; the outer one
converts the format back to the one you require.

As a bonus, if you want to eliminate duplicate values, you can just add
the "distinct" keyword inside the array_accum aggregate.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Initializing Datums for use with SPI_execute_plan
On 2006-09-18, "Jack Orenstein" <[EMAIL PROTECTED]> wrote:
> On 9/14/06, Andrew - Supernews <[EMAIL PROTECTED]> wrote:
>> One way:
>
> Thank you, I hope you can help with some clarifications.
>
>> bytea *my_bytea = (bytea *) palloc(byte_array_len + VARHDRSZ);
>> memcpy(VARDATA(my_bytea), byte_array, byte_array_len);
>> VARATT_SIZEP(my_bytea) = byte_array_len + VARHDRSZ;
>>
>> values[0] = DirectFunctionCall1(textin,CStringGetDatum(yourstring));
>
> What is textin -- the name of a function?

textin is the name of the input function for the "text" type; it takes a
C string as input and returns a Datum. Most (but not all) type input
functions can be called this way. (The format of the C string that you
pass to a type input function is exactly the same as you'd send from a
client, without quoting/escaping.)

(The ones that can't be called by DirectFunctionCall1 are those that need
access to type info, such as array or composite value input, or
user-defined types. Those are callable using other fmgr interfaces.)

>> values[1] = PointerGetDatum(my_bytea);
>
> One other thing is still unclear: What about signed/unsigned long? If
> I have a C long, then how do I get a Datum that I can assign to
> values[i]?

What matters isn't so much the C type that you have, as the SQL type that
you are supposedly providing. Pg doesn't have an "unsigned" type unless
you created your own, so the question is, are you passing an int4 or an
int8 to SQL?

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] ECPG: non-integer constant in group by
On 2006-09-15, Poul Jensen <[EMAIL PROTECTED]> wrote:
> 2) The code below was meant to find the # of distinct elements for many
> columns, but fails with the message
>     sql error 'non-integer constant in GROUP BY'
> thrown at the EXEC SQL EXECUTE statement. What is the problem?
>
> char dbnm[50], *stmt = "SELECT ? FROM beamdata GROUP BY ?;";

That statement is not legal - parameters can only be used to substitute
_data_, not identifiers or syntax. You would have to interpolate the
column name directly into the query (remembering to quote it with "" if
needed) rather than use a parameter for it.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Initializing Datums for use with SPI_execute_plan
On 2006-09-14, "Jack Orenstein" <[EMAIL PROTECTED]> wrote:
> I don't think I explained myself clearly. I have a C string (char*,
> terminating zero) and a byte array (char*, possibly containing zeros,
> and I know the length). I want to obtain Datums wrapping these values
> that will be used to bind varchar and bytea columns, and I'm trying to
> find out how to generate these Datums.

One way:

bytea *my_bytea = (bytea *) palloc(byte_array_len + VARHDRSZ);
memcpy(VARDATA(my_bytea), byte_array, byte_array_len);
VARATT_SIZEP(my_bytea) = byte_array_len + VARHDRSZ;

values[0] = DirectFunctionCall1(textin, CStringGetDatum(yourstring));
values[1] = PointerGetDatum(my_bytea);

SPI_execute_plan...

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Template1 oops
On 2006-09-13, Alban Hertroys <[EMAIL PROTECTED]> wrote:
> I'm quite certain that approach can be made more bullet-proof by
> wrapping it inside a transaction. I saw no mention of that on his page.
>
> It's quite amazing what PostgreSQL can handle inside transactions :)

Some of the few things that pg can _not_ do inside a transaction include:

CREATE DATABASE
DROP DATABASE

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Having problems with a 25 million row table on 8.1.3
On 2006-04-25, Tony Caduto <[EMAIL PROTECTED]> wrote:
> Hi,
> I have a client who has a 25 million row table that is used to keep
> track of financial security info.
> So far it has worked great, but today someone wanted to get all the tax
> codes (there are lots of dupes) from the table.
> So we tried this:
>
> select DISTINCT tax_code from warehouse.sec_trans
>
> We let this run for 1/2 hour or so and canceled it.
>
> Then I tried
>
> select DISTINCT ON (tax_code) tax_code from warehouse.sec_trans
>
> same deal, had to cancel it.

select tax_code from warehouse.sec_trans group by tax_code;

Both of the DISTINCT variants rely on sorting. GROUP BY, on the other
hand, can use a hash aggregate, which will be much more efficient where
the number of distinct values is low. Of course it will still need to
scan the whole table...

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] How to avoid empty temp schemas
On 2006-04-24, Andrus <[EMAIL PROTECTED]> wrote:
> After some time of working, Postgres 8.1.3 on W2K creates a large
> number of empty schemas in my database.
>
> Those schemas are named pg_temp_1 .. pg_temp_13 (some names are missing
> in sequence).

This is normal. The number of temp schemas created will never exceed
max_connections. They are recycled, not removed, as needed.

> Some of them do not contain any objects. Some contain temp tables
> created by my application.

The ones that are in use may contain objects. Those objects are cleared
out when the connection ends. In the event of a crash, they are cleared
out when the temp schema is next recycled.

> Is it safe to remove those schemas ?

It is not necessary. I do not know if it is safe.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] IDT timezone
On 2006-04-21, "Brandon Metcalf" <[EMAIL PROTECTED]> wrote:
> What is the best way to handle timestamps with a timezone of IDT? I
> see that I could modify src/backend/utils/adt/datetime.c to support
> IDT, but what is the best solution?
>
> Basically, I have an application where I'm grabbing the timezone from
> the output of date(1) and appending that to a timestamp before I do an
> INSERT. In the situations where the timezone is IDT, the INSERT
> fails.

On reasonably up-to-date systems, why not use the %z format specifier for
date(1) to get a numeric zone offset?

Better yet, omit the offset entirely and make sure that the session
timezone is correctly set (to, presumably, 'Asia/Jerusalem') and let
postgres figure out whether DST is in effect (which it can do just as
well as date(1) can, provided you're keeping reasonably up to date - pg
8.0 onwards carries its own copy of the standard zoneinfo database).

Zone names like 'IST' are in any event entirely ambiguous and should
never be used - you could regard it as a pure fluke that pg happens to
resolve 'IST' as +0200 rather than +0530...

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
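[Editor's sketch of the session-timezone approach; the `events` table and
its column are invented for illustration:]

```sql
-- Set the session zone by name and send a bare local timestamp;
-- the server's zoneinfo data resolves whether DST (IDT) applies.
SET timezone = 'Asia/Jerusalem';
INSERT INTO events (happened_at) VALUES ('2006-04-21 12:00:00');

-- Stored as an absolute instant; later SELECTs render it in whatever
-- the reading session's timezone happens to be.
SELECT happened_at FROM events;
```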
Re: [GENERAL] Unexplained lock creating table
On 2006-04-22, Wes <[EMAIL PROTECTED]> wrote:
> I have a C application (libpq) that uses partitioning. I create parent
> tables 'header' and 'detail'. The application opens multiple
> connections, creates the child tables, and uses COPY to import the
> data:
>
> open connection 1
> begin
> create table header_1

If this is the first child table of "header", which it presumably is,
then the pg_class row for "header" has to be updated to set
relhassubclass. Since you do not commit the create before continuing,
this backend continues to hold a row lock on the updated, uncommitted
pg_class row while you do...

> COPY into header_1
>
> open connection 2
> begin
> create table header_2

At this point connection 2, which does not yet see the existence of
header_1 and the updated pg_class row for "header" (since they are not
yet committed and are thus invisible even in SnapshotNow), _also_
believes it needs to update the pg_class row for "header" for the same
reason. However, the update attempt immediately runs into the
locked/uncommitted row belonging to connection 1, and must therefore wait
on the lock before proceeding...

(If in fact you allow connection 1 to complete and commit, the create
table in connections 2+ may then bail out with a "tuple concurrently
updated" error, since catalog updates don't generally have the
recheck-after-lock logic used for user queries in read-committed mode, so
can't cope with the fact that another connection updated the tuple. If
connection 1 aborts the transaction instead, then connection 2 can
proceed.)

[...]

> However, if I force table header_1 to be created outside the COPY
> transaction (using psql, manually committing the transaction from
> within gdb, etc.), then run the application, it works regardless of the
> number of open connections/transactions.
>
> I then drop all the child tables, leaving the parent table, and rerun
> the application. It again works for all connections.
relhassubclass isn't reset to false when all child tables of a parent
table are removed. So the next time through, there is no need to update
the pg_class row for the parent table.

So the simplest workaround is probably to ensure that you create at least
one partition in each table at the outset, before trying to actually load
any data. You've already discovered that this works, but at least you now
know why :-)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
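[Editor's sketch of that workaround, using the table names from the post:
commit one child per parent before the parallel loads begin.]

```sql
BEGIN;
CREATE TABLE header_1 () INHERITS (header);
CREATE TABLE detail_1 () INHERITS (detail);
COMMIT;  -- relhassubclass on both parents is now set and committed

-- The remaining connections can now create header_2, header_3, ... and
-- COPY concurrently, since none of them needs to touch the parents'
-- pg_class rows any more.
```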
Re: [GENERAL] Replacing MD5 hash in pg_auth...
On 2006-04-15, "Peter van der Maas" <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Is it correct to assume that if a user has write permission to
> \data\global\pg_auth on a Win32 machine, the superuser's MD5 hash can
> be replaced with one of a known origin in order to own the DB?

It's worse than that. If you can _read_ pg_auth, then you can log in as
any user who has an MD5 password, provided that pg_hba.conf allows md5
auth - the values stored in pg_auth (and pg_shadow) are password
equivalents for the purposes of md5 auth.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] pg_restore 7.4.7 locks itself out
On 2006-04-10, Alban Hertroys <[EMAIL PROTECTED]> wrote:
> Tom Lane wrote:
>> Alban Hertroys <[EMAIL PROTECTED]> writes:
>>
>>> postgres 15092 0.0 0.3 43692 12924 ? D 14:11 0:00 postgres:
>>> postgres vh3_live [local] INSERT
>>
>> This process is not blocked on a lock: it's waiting for disk I/O.
>>
>> Thoughts that come to mind include (1) it's going fine and you're not
>> patient enough; (2) something wrong with your disk drive; (3) DB is
>> mounted across NFS and you're having network problems.
>
> Really? I've been waiting for it to finish ever since, amounting to
> almost 4 hours now. It doesn't seem to have progressed one bit since it
> started. Well, I'll let it run overnight and see what has happened by
> tomorrow morning.

Notice the "INSERT" there. For a restore, you'd expect it to be "COPY",
_unless_ you used the -d option to pg_dump (this is a common mistake to
make, given that all the other utilities use -d to specify the database
name).

Restoring an inserts dump is _SLOW_ to the point of being glacial,
because (unless it's a text dump and you edited in BEGIN/COMMIT
statements yourself) it'll be committing each row individually, which
limits the restore speed to a couple of hundred rows/sec unless you have
evil (or battery-backed) write caching or you've turned fsync off. COPY
(which pg_dump will generate if you didn't use -d) runs two or three
orders of magnitude faster.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] Non-Overlaping date interval index
On 2006-02-18, "A. Kretschmer" <[EMAIL PROTECTED]> wrote:
> am 18.02.2006, um 10:03:11 +0100 mailte Pailloncy Jean-Gerard folgendes:
>> I wonder how to have a sort of "uniq" index on date intervals, such
>> that there is no date interval overlapping in the table.
>
> You can write a trigger to check the count of records which overlap
> (start, end):

Doesn't work reliably due to the race condition, unless you lock the
table against conflicting inserts. The problem of course is that the
trigger doesn't see uncommitted rows from concurrent transactions.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
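[Editor's sketch of the lock-the-table variant; the `bookings` table,
column names, and trigger name are invented for illustration, and the
whole thing serializes writers, so it only suits modest insert rates:]

```sql
CREATE OR REPLACE FUNCTION check_no_overlap() RETURNS trigger AS $$
BEGIN
  -- Serialize overlap checks: a second writer's trigger blocks here
  -- until the first writer commits, and then sees its committed row.
  LOCK TABLE bookings IN SHARE ROW EXCLUSIVE MODE;
  IF EXISTS (SELECT 1 FROM bookings b
             WHERE b.start_date < NEW.end_date
               AND b.end_date   > NEW.start_date) THEN
    RAISE EXCEPTION 'interval overlaps an existing row';
  END IF;
  RETURN NEW;
END $$ LANGUAGE plpgsql;

CREATE TRIGGER no_overlap BEFORE INSERT OR UPDATE ON bookings
  FOR EACH ROW EXECUTE PROCEDURE check_no_overlap();
```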
Re: [GENERAL] Multiple-index optimization not working for = ANY operator
On 2006-02-15, "Jimmy Choi" <[EMAIL PROTECTED]> wrote:
> I find that it doesn't work (i.e. the index is not used and a
> sequential scan is needed) if I have the following instead:
>
> SELECT * FROM foo WHERE id = ANY (ARRAY[1,2]);
>
> Is this expected? The reason I would like the last case to work is that
> my plpgsql function takes as input an array of IDs, and so I cannot
> write my query using the first two forms above.
>
> Any idea on how I can get around this is greatly appreciated.

The workaround (for versions 7.4 - 8.1) is:

select * from foo
 where id in (select myarray[i]
                from generate_series(array_lower(myarray,1),
                                     array_upper(myarray,1)) as s(i));

This typically plans out as a nestloop join with an aggregate over the
function scan as the outer path, and an index lookup on foo.id as the
inner path. Execution times for 8.1 are usually very slightly worse than
the equivalent IN (a,b,c,...) with literal values, but the plan time is
shorter.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Re: [GENERAL] update with subquery - unexpected results
On 2006-02-05, Niederland <[EMAIL PROTECTED]> wrote:
> I am using postgres 8.1.2 on a windows computer.
> My postgres.config has:
> add_missing_from = off
>
> In the following example, the first update sets the value of v1 in
> table t1 to 4 for all rows.
> Just for my sanity, I checked the null condition in the second update,
> and this does not update any of the rows. Am I missing something?
>
> create table t1 (v1 integer);
> create table t2 (v2 integer);
>
> insert into t1 (v1) values (1);
> insert into t1 (v1) values (2);
>
> insert into t2 (v2) values (3);
>
> update t1 set v1=4 where v1 IN (select v1 from t2);

What's happening here is that since t2 has no column "v1", the reference
to v1 in the subquery is bound to v1 in the nearest outer query level,
i.e. the update itself. So the subquery becomes correlated, and returns
true for all non-null values of v1 as long as t2 has at least one row.

> update t1 set v1=4 where v1 IN (select NULLIF(1,1) from t2);

v1 IN (NULL) is null for all v1, and WHERE treats null conditions as
false.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
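[Editor's sketch of the usual defenses against this trap: select the
column that actually exists in t2, and table-qualify subquery columns so
a typo fails loudly instead of silently correlating:]

```sql
-- What was presumably intended:
UPDATE t1 SET v1 = 4 WHERE v1 IN (SELECT t2.v2 FROM t2);

-- Qualifying the original exposes the bug immediately:
UPDATE t1 SET v1 = 4 WHERE v1 IN (SELECT t2.v1 FROM t2);
-- ERROR:  column t2.v1 does not exist
```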
Re: [GENERAL] 8.0.3 regexp_replace()...
On 2006-02-01, rlee0001 <[EMAIL PROTECTED]> wrote:
> Stephan,
>
> How do IN and NOT IN treat NULLs? Don't these functions search an array
> for a specified value returning true or false? I guess the intuitive
> thing for IN and NOT IN to do would be to return NULL if NULL appears
> anywhere in the array since those elements' values are "unknown".

foo IN (x1,x2,x3) is exactly equivalent to

  (foo = x1) OR (foo = x2) OR (foo = x3)

foo NOT IN (x1,x2,x3) is likewise equivalent to

  (foo <> x1) AND (foo <> x2) AND (foo <> x3)

In the first case, if one of the x? is null, then the result of the
expression is true if any of the clauses is true, or null otherwise;
TRUE OR NULL is true, while FALSE OR NULL is null. In the second case,
the result is likewise determined by the logic of three-valued AND.
Since TRUE AND NULL is null, and FALSE AND NULL is false, the expression
can never return true if any of the x? is null.

> Oh yeah, my favorite is this: NULL::VARCHAR? Nope, you can't do it.

=> select null::varchar;
 varchar
---------

(1 row)

works for me. (Note: that's not an empty string; use \pset null in psql
to see the difference.)

> Not without creating your own CAST.

Casting from what? NULL isn't a type...

> Seems to me that an obvious value would be 'NULL'. Or maybe ''
> (empty string).

If NULL ever got converted to 'NULL' or '', how would you distinguish it
from the literal 'NULL' or ''?

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
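[Editor's demonstration of the NOT IN behavior described above, using a
throwaway table:]

```sql
CREATE TEMP TABLE vals (x int);
INSERT INTO vals VALUES (1);
INSERT INTO vals VALUES (2);

SELECT * FROM vals WHERE x NOT IN (3, NULL);
-- returns no rows: for every x, (x <> 3) AND (x <> NULL) evaluates to
-- null, and WHERE treats null as false
```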
Re: [GENERAL] NOT HAVING clause?
On 2006-01-24, Will Glynn <[EMAIL PROTECTED]> wrote:
> You might try:
>
> SELECT some_column
> FROM some_table
> GROUP BY some_column
> HAVING SUM(CASE WHEN sort_order=1 THEN 1 ELSE 0 END) = 0;

SELECT some_column
FROM some_table
GROUP BY some_column
HAVING every(sort_order <> 1);

every() is in 8.1 at least (can't recall when it was introduced); it's
the same as bool_and(), i.e. an aggregate that returns true only if all
inputs are true. Why isn't there a corresponding any(), I wonder?
(bool_or does exist.)

It should be possible to define bool_and() or every() in the usual way as
a custom aggregate even on versions that don't have them builtin.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
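[Editor's sketch of that custom aggregate for older versions; the
function and aggregate names `my_booland`/`my_every` are invented to avoid
clashing with the builtins where they do exist:]

```sql
-- Two-argument boolean AND to serve as the state transition function
CREATE FUNCTION my_booland(boolean, boolean) RETURNS boolean
  AS 'SELECT $1 AND $2' LANGUAGE sql IMMUTABLE;

CREATE AGGREGATE my_every (
  BASETYPE = boolean,
  SFUNC    = my_booland,
  STYPE    = boolean,
  INITCOND = 'true'
);

-- usage, as in the query above:
--   HAVING my_every(sort_order <> 1)
```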
Re: [GENERAL] SELECT Rules or stored procedure
On 2006-01-19, "Jim C. Nasby" <[EMAIL PROTECTED]> wrote:
> On Tue, Jan 17, 2006 at 09:55:42PM -0700, Assad Jarrahian wrote:
>> q2) How does one get access to the rows just selected in the CREATE
>> RULE computation?
>
> Via NEW and OLD. Read Chapter 34 of the documentation.

He's talking about a SELECT rule. A SELECT rule is no more and no less
than a view - you can do nothing with select rules that you can't do with
a normal view.

(You can only have one SELECT rule, it must be named _RETURN, and it must
be a DO INSTEAD SELECT rule. If you put such a rule on a plain table, you
will find that the table changes into a view.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
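[Editor's demonstration of the table-becomes-view behavior on the
versions discussed here; the table name and body are invented:]

```sql
CREATE TABLE t (a int);
CREATE RULE "_RETURN" AS ON SELECT TO t
  DO INSTEAD SELECT 1 AS a;
-- \d t in psql now shows t as a view, not a table
```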
Re: [GENERAL] sort character data in arbitrary order?
On 2006-01-14, Robert Paulsen <[EMAIL PROTECTED]> wrote:
> Here is my query so far:
>
> SELECT foo, bar, baz FROM my_table WHERE state ~ '[abc]'
> ORDER BY state ASC LIMIT 1;
>
> This works as expected. My problem is that I am relying on the
> collating sequence of the letters a-z, and the desirability of states
> may not always be in this order.
>
> Is there a better way to do the "ORDER BY", or some other way to
> accomplish this? I know I could do three queries and then compare the
> results, but I was hoping to do this all within a single query.

If there's only a small number of possible "state" values then:

ORDER BY state = 'a' DESC, state = 'b' DESC, state = 'c' DESC

If there's more than a small number, then have a separate state_priority
table mapping states to integer values, and join against that and sort by
the priority value.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
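[Editor's sketch of the lookup-table form; state_priority follows the
text, and the priority values are illustrative:]

```sql
CREATE TABLE state_priority (
  state    char(1) PRIMARY KEY,
  priority int NOT NULL
);
INSERT INTO state_priority VALUES ('b', 1);  -- most desirable
INSERT INTO state_priority VALUES ('a', 2);
INSERT INTO state_priority VALUES ('c', 3);

SELECT t.foo, t.bar, t.baz
  FROM my_table t
  JOIN state_priority p ON p.state = t.state
 ORDER BY p.priority ASC
 LIMIT 1;
```

Changing the desirability order is now a data change in state_priority
rather than a query rewrite.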
Re: [GENERAL] Error when inserting millions of records
On 2006-01-13, Rudolph <[EMAIL PROTECTED]> wrote:
> Me and a friend are testing some operations for a search application in
> PostgreSQL. In version 8.1 my friend tried to copy the result of a join
> into another table, about 45.000.000 (small) records. This is the
> result:

Do you have any constraints, especially deferred constraints, or triggers
on the table? Are there any functional indexes?

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] Adding another primary key to a populated table
On 2006-01-06, Daniel Kunkel <[EMAIL PROTECTED]> wrote:
> Hi
>
> I'm trying to add another primary key to a table populated with data and
> a number of foreign key constraints.

You can only have one primary key on a table. You can add additional unique
constraints to get the same effect.

(A primary key constraint is just a unique constraint that is also not
null, and is the default target for REFERENCES constraints referring to the
table - this last factor is why there can be only one...)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
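Concretely, the "second key" might be added like this (table and column
names are hypothetical):

```sql
-- Match the NOT NULL part of a primary key...
ALTER TABLE my_table ALTER COLUMN other_col SET NOT NULL;

-- ...and the uniqueness part.
ALTER TABLE my_table
    ADD CONSTRAINT my_table_other_col_key UNIQUE (other_col);

-- Foreign keys can still target it, by naming the column explicitly
-- instead of relying on the primary-key default:
-- ALTER TABLE child
--     ADD FOREIGN KEY (other_ref) REFERENCES my_table (other_col);
```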
Re: [GENERAL] When it is better to use "timestamp without time zone"?
On 2006-01-04, Emi Lu <[EMAIL PROTECTED]> wrote:
> OK. When the column is setup as "timestamp with time zone default
> now()", the default values will be set based on the Operating System,
> right?

You have to understand that in the current implementation, pg does not
actually store the time zone.

> An example case:
> PostgreSQL server is on machine1, with timezone setup as "-5". A table
> named test1(col timestamp with time zone default now() );
>
> . insert into test1 from client machine2 with timezone "+2"; the value
> inserted into machine1 should be "2006-01-04 10:01:01-05" but not
> "2006-01-04 10:01:01+02" ?

If the client gave the value as '2006-01-04 10:01:01', then the value is
taken to be in whatever the session's timezone setting is. If the client
didn't set that (either on connect, or via a SET command, or as a per-user
or per-database default) then the server's timezone is the default.

It's important in this context to note that "-5" or "+2" don't
sufficiently specify time _zones_ as opposed to _timezone offsets_. When
you're talking about a specific time, you can say '2006-01-04
10:01:01-0500', but to say that "a machine is in timezone -5" is generally
nonsense. In the real world, you have to take into account DST rules both
current and historical, which the timezone libraries know about.

> . select * from test1 from client machine2, we will get "2006-01-04
> 10:01:01-05" since the absolute value is saved, which is never caculated
> again?

The result will be whatever the stored time is _in the session's timezone_.

> . What is the problem here when the column type is setup as "timestamp
> without time zone"?
> The value "2006-01-04 10:01:01" is saved and read from both machine1
> and machine2.

But '2006-01-04 10:01:01' doesn't mean the same thing in two different
timezones.

If what matters is that the result say "10:01:01" regardless of what
timezone the client is in, then you want timestamp without time zone.

If what matters is that the result be the _same time_ regardless of what
timezone, then you want timestamp _with_ time zone. The second case is
vastly more common.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 6: explain analyze is your friend
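The behaviour described above can be sketched in a single session (zone
names are examples standing in for the "-5" and "+2" machines; the stored
value is one instant, and only its display changes):

```sql
CREATE TABLE test1 (col timestamp with time zone DEFAULT now());

SET timezone TO 'America/New_York';   -- a "-5" zone, with real DST rules
INSERT INTO test1 VALUES ('2006-01-04 10:01:01');
SELECT col FROM test1;                -- shown as 2006-01-04 10:01:01-05

SET timezone TO 'Europe/Athens';      -- a "+2" zone
SELECT col FROM test1;                -- same instant, now shown as
                                      -- 2006-01-04 17:01:01+02
```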
Re: [GENERAL] When it is better to use "timestamp without time zone"?
On 2006-01-04, Emi Lu <[EMAIL PROTECTED]> wrote:
> In postgreSQL, the default value for timestamp is "without time zone".

That's true only because the SQL standard says so, not because it's
actually a good idea to use timestamps without time zone.

> Can I say when data is used among diff time zones, timestamp with time
> zone is a MUST; otherwise, timestamp without time zone is used?

No. You should, IMO, use timestamp with time zone in essentially all cases.
(In particular, you should _always_ use it for recording the time at which
an event happened, which covers most uses of timestamps.) Only use
timestamp without time zone for data storage if you have a specific reason
to do so.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] POSTGRES DB 3 800 000 rows table, speed up?
On 2005-12-27, Eugene <[EMAIL PROTECTED]> wrote:
> I've indexed first two columns they are IPfrom, IPto also table is btree
> version of postgre is 7.4.8, on hosting
> I ask db like this SELECT * FROM ipdb2 WHERE '3229285376' BETWEEN ipfrom
> AND ipto;
>
> and get answer after 3-10 seconds, is there a way to speed it up somehow?
> any tweaks and tuneups possible with it?

http://pgfoundry.org/projects/ip4r

Designed specifically for storing IP ranges as in your example, it will (if
correctly used) reduce that query time down to a few milliseconds.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
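The ip4r approach replaces the (ipfrom, ipto) pair with a single indexable
range column; a rough sketch (column names and sample address are
illustrative, and the exact operator set is ip4r's, not core PostgreSQL's):

```sql
-- One range column instead of two endpoint columns.
CREATE TABLE ipdb2 (iprange ip4r, country text);

-- GiST index so containment tests can use an index scan.
CREATE INDEX ipdb2_range_idx ON ipdb2 USING gist (iprange);

-- "range contains address" is a single indexable operator:
SELECT * FROM ipdb2 WHERE iprange >>= '192.123.21.5';
```

The BETWEEN form in the original query can't use a btree index effectively,
because neither endpoint column alone narrows the search; the range type
fixes exactly that.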
Re: [GENERAL] INSERT OR UPDATE
On 2006-01-03, Tony Wasson <[EMAIL PROTECTED]> wrote:
> On 1/2/06, Julian Scarfe <[EMAIL PROTECTED]> wrote:
>> I'm keeping config information for an application in a series of related
>> tables. I'd like a command that INSERTs data if it's new, or UPDATEs it
>> if the key is duplicated.
>
> A MERGE trigger will do exactly what you are asking for.
>
> http://archives.postgresql.org/pgsql-sql/2004-05/msg00135.php
>
> Send all your data as INSERTS.
> The trigger will run a function to see if the row exists.
> If the row exists -> rewrite it as an UPDATE
> If the row does not exist -> leave the INSERT alone

There are some rather serious race conditions in that (it's not taking
anything like a strong enough lock to avoid that - it would need to use
SHARE ROW EXCLUSIVE, not ROW EXCLUSIVE).

In 8.0 on, use the method described here:

http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not match
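The documented error-trapping method is, in outline, a loop that tries an
UPDATE first and falls back to INSERT, retrying if another session wins the
race. This is essentially the merge_db example from that documentation page
(table and column names as in the example):

```sql
CREATE TABLE db (a integer PRIMARY KEY, b text);

CREATE FUNCTION merge_db(key integer, data text) RETURNS void AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- Not there: try to insert. Another session may insert the same
        -- key between our UPDATE and INSERT.
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Lost the race: loop back and try the UPDATE again.
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;
```

Unlike the trigger approach, this needs no table-level lock: the unique
constraint itself arbitrates the race.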
Re: [GENERAL] Best practices: Handling Daylight-saving time
On 2005-03-11, Mike Benoit <[EMAIL PROTECTED]> wrote:
> Here is a scenario I ran in to with collecting bandwidth usage and
> displaying it back in graph form to customers.
>
> You can store the timestamps in GMT, but the customer wants to see when
> spikes happen in his localtime, which most likely has DST. So twice a
> year, you are either compressing two hours of bandwidth usage into one,
> or the opposite, stretching one hour in to two, which of course produces
> somewhat odd looking graphs during that time.

That seems an odd way to handle it. If you graph the data by days
according to the customer's time, then on one day in the year your graph
is one hour smaller, and on another day it is one hour larger.

The point to notice is that the customer's local time should affect only
the _labels_ on the graph, and possibly your choice of start and end
times, and not the _data_ being plotted.

For example, suppose I have a table:

create table tztst (ts timestamptz primary key, value float8 not null);

and I want to plot individual days from it in the customer's timezone:

test=> set timezone to 'America/Denver';  -- or wherever he is
SET
test=> select ts::time,value from tztst
         where ts between '2005-04-02 00:00:00' and '2005-04-02 23:59:59'
         order by ts;
    ts    |      value
----------+------------------
 00:00:00 | 286.764410064167
 01:00:00 | 291.294525072763
 02:00:00 | 294.912455364789
 03:00:00 | 297.582051776698
 04:00:00 | 299.276640583591
 05:00:00 | 299.979290014267
 06:00:00 | 299.68297942788
 07:00:00 | 298.390669461862
 08:00:00 | 296.115272450212
 09:00:00 | 292.879523407724
 10:00:00 | 288.715752869235
 11:00:00 | 283.665563853606
 12:00:00 | 277.779416180109
 13:00:00 | 271.116122290598
 14:00:00 | 263.742259615024
 15:00:00 | 255.731505351766
 16:00:00 | 247.16390030942
 17:00:00 | 238.125049165494
 18:00:00 | 228.705265132773
 19:00:00 | 218.998667579544
 20:00:00 | 209.102241619985
 21:00:00 | 199.11486907096
 22:00:00 | 189.136340457592
 23:00:00 | 179.266357939324
(24 rows)

test=> select ts::time,value from tztst
         where ts between '2005-04-03 00:00:00' and '2005-04-03 23:59:59'
         order by ts;
    ts    |      value
----------+------------------
 00:00:00 | 169.603539118895
 01:00:00 | 160.244431687857
 03:00:00 | 151.282548753949
 04:00:00 | 142.807434489044
 05:00:00 | 134.903769433375
 06:00:00 | 127.650524395576
 07:00:00 | 121.120171402458
 08:00:00 | 115.377959582483
 09:00:00 | 110.481263218032
 10:00:00 | 106.479008480546
 11:00:00 | 103.411184576393
 12:00:00 | 101.308444187935
 13:00:00 | 100.19179720206
 14:00:00 | 100.072400786337
 15:00:00 | 100.951447910284
 16:00:00 | 102.820155425614
 17:00:00 | 105.659851824544
 18:00:00 | 109.442163799338
 19:00:00 | 114.129299739007
 20:00:00 | 119.674427330605
 21:00:00 | 126.022141492211
 22:00:00 | 133.109017962198
 23:00:00 | 140.864247013488
(23 rows)

test=> select ts::time,value from tztst
         where ts between '2005-10-30 00:00:00' and '2005-10-30 23:59:59'
         order by ts;
    ts    |      value
----------+------------------
 00:00:00 | 110.349122831853
 01:00:00 | 114.741289638094
 01:00:00 | 119.837588745288
 02:00:00 | 125.595930978012
 03:00:00 | 131.968759497219
 04:00:00 | 138.903442561358
 05:00:00 | 146.342708199957
 06:00:00 | 154.225117209803
 07:00:00 | 162.485570567354
 08:00:00 | 171.055847066766
 09:00:00 | 179.865166743321
 10:00:00 | 188.840775429059
 11:00:00 | 197.908545612907
 12:00:00 | 206.99358864294
 13:00:00 | 216.020873214721
 14:00:00 | 224.915845037786
 15:00:00 | 233.605042562575
 16:00:00 | 242.016703682664
 17:00:00 | 250.081358401684
 18:00:00 | 257.732402570221
 19:00:00 | 264.906647954345
 20:00:00 | 271.544844092858
 21:00:00 | 277.592167633387
 22:00:00 | 282.998675105977
 23:00:00 | 287.71971539486
(25 rows)

All of these can be converted to meaningful (and un-distorted) graphs.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Index scan vs. Seq scan on timestamps
On 2004-12-07, Stephen Frost <[EMAIL PROTECTED]> wrote:
> Is there a way to say "just take the value of this function at the start
> of the transaction and then have it be constant" in a query?

Why not use CURRENT_TIMESTAMP, etc., which do exactly that?

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 6: Have you searched our list archives?
       http://archives.postgresql.org
Re: [GENERAL] Index scan vs. Seq scan on timestamps
On 2004-12-06, Per Jensen <[EMAIL PROTECTED]> wrote:
> Why does PG not use the index on the time column in the second select,
> timeofday() has been cast to a timestamp after all.

"timestamp" is "timestamp without time zone" (not the most useful type in
the world). Your column is of type "timestamp with time zone" (correct).
The relationship between the two is not trivial, so the lack of an index
scan is expected. Try casting to "timestamp with time zone" instead.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
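The original queries aren't quoted here, so as a purely hypothetical sketch
of the fix (table and column names invented; timeofday() returns text, so
it needs an explicit cast either way):

```sql
-- Comparison against plain timestamp: no index scan on the
-- timestamptz column.
SELECT * FROM log
WHERE ts > timeofday()::timestamp - interval '1 hour';

-- Cast to the column's own type instead:
SELECT * FROM log
WHERE ts > timeofday()::timestamp with time zone - interval '1 hour';
```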
Re: [GENERAL] select single entry and its neighbours using direct-acess to index?
On 2004-12-06, Pierre-Frédéric Caillaud <[EMAIL PROTECTED]> wrote:
> SELECT * FROM test WHERE (name='b' and name2>'a') OR (name>'b')
> ORDER BY name,name2 ASC LIMIT 1;

Write that WHERE clause instead as:

WHERE name>='b' AND (name>'b' OR (name='b' AND name2>'a'))

This is logically equivalent, but it gives the planner a better handle on
how to use an index scan to satisfy the query.

> SELECT * FROM test WHERE (name='b' and name2<'a') OR (name<'b')
> ORDER BY name,name2 DESC LIMIT 1;

That needs to be ORDER BY name DESC, name2 DESC (the direction indicator
applies per-column and not to the output ordering). Same goes for the WHERE
clause in this query as the previous one, too.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 8: explain analyze is your friend
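Putting both pieces of advice together, the two "neighbour" queries might
read as follows (assuming an index on (name, name2)):

```sql
-- Successor row: the redundant name >= 'b' bound lets an index scan on
-- (name, name2) satisfy both the qualification and the ordering.
SELECT * FROM test
WHERE name >= 'b' AND (name > 'b' OR (name = 'b' AND name2 > 'a'))
ORDER BY name ASC, name2 ASC
LIMIT 1;

-- Predecessor row: mirror the comparisons, and note DESC on each column.
SELECT * FROM test
WHERE name <= 'b' AND (name < 'b' OR (name = 'b' AND name2 < 'a'))
ORDER BY name DESC, name2 DESC
LIMIT 1;
```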
Re: [GENERAL] Constaints
On 2004-12-05, Jan Wieck <[EMAIL PROTECTED]> wrote:
> PostgreSQL does not have userland session variables. You would have to
> write some custom set/get functions in a procedural language that is
> capable of holding global data across function calls (like PL/Tcl).

You can fake session variables by using constant-returning functions
defined in the per-session temporary namespace.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 8: explain analyze is your friend
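A sketch of that temporary-namespace trick (function name and value are
illustrative; because the function lives in the session's temp schema, it
vanishes when the session ends and each session can hold its own value):

```sql
-- "Set" the variable: (re)create a constant function in pg_temp.
CREATE OR REPLACE FUNCTION pg_temp.my_var() RETURNS integer
    AS 'SELECT 42' LANGUAGE sql;

-- "Get" it anywhere in the same session; the schema qualification
-- is required when calling temp-schema functions.
SELECT * FROM orders WHERE customer_id = pg_temp.my_var();
```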
Re: [GENERAL] Upcoming Changes to News Server ...
On 2004-11-24, "Marc G. Fournier" <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] (Andrew - Supernews) writes:
>> Any chance of there being regular (or even only occasional) signed
>> checkgroups messages for the new hierarchy?
>
> Sure, but I've never done it before, so if you can help ... ?

Sure. You can reach me by email or on the irc chan.

You should already have a copy of the signcontrol script itself (it's in
the INN distribution).

The format of a checkgroups message is simple: one line in the message body
per newsgroup in this format:

group.name(tabs)Description of group

The description must end with " (Moderated)" (without the quotes) if it is
a moderated group in the Usenet sense, and not otherwise. Conventionally
the separator is enough tabs so that the description starts in column 24,
but the only real requirement is that there be one or more tabs (and not
any other sort of whitespace, and no tabs in the description). This is the
same format as the newsgroups file in INN.

For the headers, you want "Control: checkgroups", an Approved header, and a
Newsgroups: header with an appropriate announcement group in it
(pgsql.announce should do; the message won't show up to normal readers).

Checkgroups should be posted preferably after any change to the group list,
and once per month or two even if there are no changes.

Obviously you need an appropriate PGP or GPG key (RSA seems to be best as
far as compatibility goes), which has to be published somewhere (but
doesn't need to be on keyservers).

Let me know if you have any questions or if you want me to verify any
messages.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 6: Have you searched our list archives?
       http://archives.postgresql.org
Re: [GENERAL] Upcoming Changes to News Server ...
On 2004-11-23, "Marc G. Fournier" <[EMAIL PROTECTED]> wrote:
> On Wed, 24 Nov 2004, Peter Eisentraut wrote:
>> Marc G. Fournier wrote:
>>> Due to recent action by Google concerning the
>>> comp.databases.postgresql.* hierarchy, we are going to make some
>>> changes that should satisfy just about everyone ... over the next
>>> 24hrs or so, traffic *to* comp.databases.postgresql.* from the mailing
>>> lists will cease and be re-routed to pgsql.* instead ... on our server
>>> (and we encourage others to do the same), the comp.* groups will be
>>> aliased to the new pgsql.* hierarchy, so that posts to the old groups
>>> will still get through ...
>>
>> What exactly is this meant to achieve?
>
> To clean up the comp.* hierarchy ... even if the 4/5 that are being RFDd
> right now pass, ppl are going to continue screaming that the other 15-16
> should be removed as well ... this way, those using news.postgresql.org
> can still get access to the whole hierarchy, while the comp.* would only
> carry those that are deemed "official"

Any chance of there being regular (or even only occasional) signed
checkgroups messages for the new hierarchy?

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly
Re: [GENERAL] RFD: comp.databases.postgresql.general
On 2004-11-08, Woodchuck Bill <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] ("Marc G. Fournier") wrote in
> news:[EMAIL PROTECTED]:
>
>> Unless its spam, it goes through ... I don't (nor have I ever) refused a
>> post based on content other then spam ... even if its anti-PostgreSQL
>> *shrug*
>
> The problem with the system is that the spam *all* gets posted to Usenet,
> but not the mailing lists. The mailing lists may be moderated, but the
> newsgroups are not. That needs to be changed.
>
> Marc, please stop removing news.groups from your replies.

He's posting to the mailing list; he probably can't avoid dropping the
crosspost.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]