[HACKERS] A big thanks to SuSE
Yesterday I was a bit worried... I switched to SuSE just 2 weeks ago, and my newly installed database server was waiting. I thought that I would have to wait a long time for RPMs for SuSE, and today I see v7.4 compiled for many flavors of SuSE, even for x86-64. Wow :) Thanks :)

--
Daniele Orlandi

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] 7.4 Wishlist
Christopher Kings-Lynne wrote:
> Hi guys, just out of interest, if someone was going to pay you to hack
> on Postgres for 6 months, what would you like to code for 7.4?

Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication. Replication.

Jokes apart, I think this is one of the features I need most. Currently I'm using strange voodoo to replicate some tables onto other machines in order to spread load and improve resiliency. Compared to what I'm doing now, a good master-to-slave replication would be heaven.

I understand that good replication is painful, but in my experience, if you start by integrating some rough, experimental implementation into mainstream PostgreSQL, the rest will come by itself. For example, RI was something I wouldn't have considered production level in 7.2, but it was a start; in 7.3 it is much, much better, probably complete in the most important parts.

Other wishes (not as important as the replication issue) are:

- Better granularity of security and access control, like in MySQL.
- The ability to reset the state of an open backend, including aborting any open transaction, to allow for better connection pooling and reuse, maybe giving the client the ability to switch between users...

Bye!

--
Daniele Orlandi
Planet Srl
Re: [HACKERS] Optimizer boolean syntax
Tom Lane wrote:
> Only two of them are logically equivalent. Consider NULL.

Ohhh... IS NOT TRUE and IS NOT FALSE also match NULL; I never knew this :)

> Even for the first two, assuming equivalence requires hard-wiring an
> assumption about the behavior of the bool = bool operator, which is a
> user-redefinable operator. I'm not totally comfortable with the idea.

Ok, I see your point and the problems that may arise, but I hope you will agree with me that, from the point of view of the user, both clauses have the same meaning, and index usage should be consistent with it.

Unfortunately I don't know PostgreSQL's internals very well, so I may be talking nonsense, but wouldn't it be possible to treat any evaluation of a bool expression as being in the form bool = bool, with true as the second 'bool'[1]? At least as a TODO item?

Thanks! Bye!

[1] Possibly including var IS TRUE and var IS FALSE (not var IS NOT ...), which are already special syntax cases, if I'm not wrong.

--
Daniele Orlandi
Planet Srl
[HACKERS] Optimizer boolean syntax
Are these two syntaxes equivalent?

select * from users where monitored;
select * from users where monitored=true;

If the answer is yes, the optimizer probably doesn't agree with you :)

Tested on RC1:

template1=# create table a (a boolean, b text);
CREATE TABLE

(inserted ~18000 rows with just one true, just to make an index scan meaningful)

template1=# vacuum analyze a;
VACUUM
template1=# explain select * from a where a;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on a  (cost=0.00..802.64 rows=1 width=11)
   Filter: a
(2 rows)

template1=# explain select * from a where a=true;
                          QUERY PLAN
--------------------------------------------------------------
 Index Scan using a_a on a  (cost=0.00..2.01 rows=1 width=11)
   Index Cond: (a = true)
(2 rows)

Bye!

--
Daniele Orlandi
Planet Srl
Re: [HACKERS] Optimizer boolean syntax
Stephan Szabo wrote:
> On Thu, 21 Nov 2002, Daniele Orlandi wrote:
>> Are those two syntaxes equivalent?
>> select * from users where monitored;
>> select * from users where monitored=true;
>> If the answer is yes, the optimizer probably doesn't agree with you :)
> That depends on the definition of equivalent.

By equivalent I mean that it means the same thing and so behaves in the same way. I consider the former syntax cleaner, and I would tend to use it most of the time.

As far as partial indexes are concerned, I agree that they are a better approach for this kind of indexing, and I did some tests:

ctonet=# create index users_monitored on users (monitored) where monitored;
CREATE
ctonet=# explain select * from users where monitored;
NOTICE:  QUERY PLAN:

Index Scan using users_monitored on users  (cost=0.00..9.44 rows=6 width=186)

EXPLAIN

Nice, it appears to use the index, but:

ctonet=# explain select * from users where monitored=true;
NOTICE:  QUERY PLAN:

Seq Scan on users  (cost=0.00..8298.84 rows=59 width=186)

EXPLAIN

The problem is now the opposite... so, effectively, it seems that the optimizer considers monitored and monitored=true to be two different expressions...
The reverse case is analogous, and we can also see that the syntax "monitored is true" is treated differently from the other two:

ctonet=# drop index users_monitored;
DROP
ctonet=# create index users_monitored on users (monitored) where monitored=true;
CREATE
ctonet=# explain select * from users where monitored=true;
NOTICE:  QUERY PLAN:

Index Scan using users_monitored on users  (cost=0.00..9.45 rows=6 width=186)

EXPLAIN
ctonet=# explain select * from users where monitored;
NOTICE:  QUERY PLAN:

Seq Scan on users  (cost=0.00..8077.07 rows=59 width=186)

EXPLAIN
ctonet=# explain select * from users where monitored is true;
NOTICE:  QUERY PLAN:

Seq Scan on users  (cost=0.00..8077.07 rows=59 width=186)

EXPLAIN

What I propose is that all these syntaxes be made equivalent (by, for example, rewriting boolean comparisons to a common form) in order to have more consistent index usage.

Bye!

--
Daniele Orlandi
Planet Srl
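As a toy illustration of the "common form" idea (a string-level sketch of my own, nothing like how a real planner rewrites expressions; it also assumes a WHERE-clause context, where all three spellings select the same rows):

```python
import re

def normalize_bool_predicate(expr):
    # Fold the common spellings of a boolean test into one canonical
    # form, so that "col", "col = true" and "col IS TRUE" all come out
    # as "col", and "col = false" / "col IS FALSE" as "NOT col".
    e = expr.strip()
    m = re.fullmatch(r"(\w+)\s*=\s*true|(\w+)\s+is\s+true", e, re.IGNORECASE)
    if m:
        return m.group(1) or m.group(2)
    m = re.fullmatch(r"(\w+)\s*=\s*false|(\w+)\s+is\s+false", e, re.IGNORECASE)
    if m:
        return "NOT " + (m.group(1) or m.group(2))
    return e

canonical = normalize_bool_predicate("monitored = true")  # -> "monitored"
```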
[HACKERS] Client-side merge string sorting
Hello,

I sometimes need to perform client-side merges, sometimes between two tables on the same database, sometimes between two different databases. When the merge key is numeric all goes well, but when the merge key is a string a problem arises: string comparison operators often behave differently between the database(s) and the client's language. Sometimes it is due to the locale settings, sometimes to the particular implementation of the operator; as a matter of fact, I cannot trust the string comparison operators.

So, the question is how a client-side merge should be done...

- Perform the sorting locally: only one operator involved, but maybe suboptimal sorting, etc.
- Compare the strings hex-encoded: overhead apart, I found myself unable to use the encode(...) function in PostgreSQL, since it accepts only BYTEA data and text isn't castable to bytea.
- Invent a new operator whose behaviour is always consistent and locale-independent (like C's plain strcmp).

Which do you think is the correct approach?

Thanks in advance! Best regards!

--
Daniele Orlandi
Planet Srl
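A minimal sketch of the third option, in Python (the helper names are mine; both inputs must already be sorted by the same byte-wise collation, e.g. server-side with a C-locale ORDER BY):

```python
import heapq

def merge_by_bytes(rows_a, rows_b, key):
    # Compare keys as raw UTF-8 bytes, like C's strcmp on the encoded
    # string, so the result does not depend on the client's locale.
    bkey = lambda row: key(row).encode("utf-8")
    return list(heapq.merge(rows_a, rows_b, key=bkey))

# Both inputs are sorted byte-wise ("Zoe" < "abel" because 'Z' < 'a').
a = [("Abel",), ("Zoe",)]
b = [("abel",), ("zebra",)]
merged = merge_by_bytes(a, b, key=lambda r: r[0])
# -> [("Abel",), ("Zoe",), ("abel",), ("zebra",)]
```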
Re: [HACKERS] CRCs
Nathan Myers wrote:
> It wouldn't help you recover, but you would be able to report that you
> cannot recover.

While this could help detect hardware problems, you still won't be able to detect some (many) memory errors, because the CRC will be calculated on the already corrupted data. Of course, there are other situations where the CRC will not match and, appropriately logged, that is a reliable heads-up warning.

Bye!

--
Daniele
Re: [HACKERS] PHP and PostgreSQL
Bruce Momjian wrote:
> I have been asked by the major PHP developer Rasmus Lerdorf to see if
> the PostgreSQL/PHP interface needs any improvements. Is the current
> PostgreSQL interface module in PHP adequate? Does it support all the
> current libpq features? If not, would someone submit some patches to
> the PHP folks. They want us to work well with PHP. They are basically
> encouraging us to improve it in any way we can.

If I may put in my 0.02 euros, I think there are two issues that deserve to be fixed, both related to persistent connections.

The first is that the transaction state is (obviously) kept between connections. This is not a problem in itself, but there are situations where the transaction block is not closed with COMMIT or ABORT, with the obvious consequences when the connection is reused (and with locks not released). As a workaround, I registered a shutdown function that sends an ABORT command every time a script finishes its processing, but this results in a "Not in transaction block" error most of the time (harmless but pretty annoying). IMHO there should be some sort of lightweight reset that puts the connection back in its pristine state without the need to fork a backend again and reopen the TCP connection.

The second is that if PostgreSQL is restarted, PHP doesn't detect it, and an error message is seen for every persistent connection. IMHO PHP should detect a closed connection and clear the connection pool.

Bye!

--
Daniele

-------
Daniele Orlandi - Utility Line Italia - http://www.orlandi.com
Via Mezzera 29/A - 20030 - Seveso (MI) - Italy
---
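To illustrate the kind of reset I have in mind, here is a toy model in Python (the class and its attributes are illustrative, not PHP's or libpq's real API): roll back only when a transaction is actually open, so the spurious warning never appears, and discard connections the server has closed instead of handing them back out.

```python
class PooledConnection:
    # Toy stand-in for a persistent connection; it only tracks the
    # state the pool needs: liveness and whether a transaction is open.
    def __init__(self):
        self.in_transaction = False
        self.alive = True
        self.log = []

    def execute(self, sql):
        if not self.alive:
            raise ConnectionError("server closed the connection")
        self.log.append(sql)
        if sql.upper().startswith("BEGIN"):
            self.in_transaction = True
        elif sql.upper() in ("COMMIT", "ABORT", "ROLLBACK"):
            self.in_transaction = False

def release_to_pool(conn, pool):
    if not conn.alive:
        return                    # discard: the pool will reconnect later
    if conn.in_transaction:
        conn.execute("ABORT")     # reset only when actually needed, so no
    pool.append(conn)             # "Not in transaction block" warning

pool = []
conn = PooledConnection()
conn.execute("BEGIN")
release_to_pool(conn, pool)       # issues ABORT, then pools the connection
```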
Re: [HACKERS] libpq enhancement for multi-process application
Sébastien Bonnet wrote:
> Hi all, and mainly PostgreSQL developers. I've been reading old posts
> about the libpq interface related to multi-process applications. The
> main problem being that after a fork, each process has a DB connection,
> actually the same one. If one closes it, the other remains in an
> unknown or unstable state.

Uhm... I always thought that sharing the same socket between processes is wrong. My multi-process daemon works like Apache, with a pool of processes, each with its own connection to the DB. The connection is only opened AFTER the fork and remains open as long as the process lives, just to avoid a new connection for each accept.

Bye!
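A minimal sketch of that pattern (the connection is simulated here; in real code the connect call inside the child would be pg_connect()/PQconnectdb(), and this uses POSIX fork):

```python
import os

def open_connection():
    # Stand-in for the real connect call; the point is that it runs
    # *after* the fork, so every child gets its own private socket
    # instead of sharing the parent's.
    return {"owner_pid": os.getpid()}

children = []
for _ in range(3):
    pid = os.fork()
    if pid == 0:
        conn = open_connection()        # child: open our own connection
        ok = conn["owner_pid"] == os.getpid()
        os._exit(0 if ok else 1)        # exit code reports the check

    children.append(pid)

# Parent: reap the children; status 0 means the check passed.
statuses = [os.waitpid(pid, 0)[1] for pid in children]
```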
Re: [HACKERS] Why vacuum?
"Ross J. Reedstrom" wrote:
> Not to mention the recent thread here about people recovering data that
> was accidentally deleted, or from damaged db files: the old tuples
> serve as redundant backup, in a way. Not a real compelling reason to
> keep a non-overwriting smgr, but still a surprise bonus for those who
> need it.

The optimal solution would be configurable behaviour. I wouldn't enable it on a users table, nor on a log-type table (the former is a slowly changing table, the latter a table with few updates/deletes), but a fast-changing table like an active-sessions table would benefit a lot. Currently, my active-sessions table grows by 100K every 20 seconds, and I have to vacuum it constantly to keep things reasonable. Other tables would benefit a lot too, pg_listener for example.

Bye!
Re: [HACKERS] Why vacuum?
Alfred Perlstein wrote:
> If you're talking about vacuum, you really don't want to do this,

No, I'm not talking about vacuum as it is intended now; it's only a process that scans tables to find available blocks/tuples. It is virtually optional: if it doesn't run, the database will behave just like it does now.

> what's going to happen is that since you have an exclusive lock on the
> file during your vacuum and no way to do priority lending you can
> deadlock.

No exclusive lock; it's just a reader.

> When your table grows to be very large you'll see what we're talking
> about.

I see this as an optimization issue. If the scanner isn't smart and wastes time scanning areas of the table that have not been emptied, you fall back to the current behaviour.

Bye!
Re: [HACKERS] Why vacuum?
bpalmer wrote:
> I noticed the other day that one of my pg databases was slow, so I ran
> vacuum on it, which brought a question to mind: why the need? I looked
> at my oracle server and we aren't doing anything of the sort (that I
> can find), so why does pg need it? Any info?

Hi,

I'm one of the people being slightly bitten by the current vacuum behaviour :), so I'll take the chance to add my suggestions to this question.

FWIW, my thought is of a vacuumer process that, in the background, scans each table for available blocks (by available I mean a block full of deleted tuples whose deleting transactions have committed) and fills a cache of those blocks, available to the backends. Whenever a backend needs to allocate a new block, it looks for a free block in the cache; if it finds one, it can use it, otherwise it proceeds as usual, appending the block at the tail.

The vacuumer would run with a very low priority, so that it doesn't suck precious CPU and I/O when the load on the machine is high. A small flag on each table would keep the vacuumer from scanning a table where no empty block has been found and no tuple has been deleted.

Ok, now tell me where this is badly broken :))

Just my .02 euro :)

Bye!

--
Daniele Orlandi
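The allocation path I have in mind could be modeled like this (a toy in-memory model; the names and structure are mine, not PostgreSQL's):

```python
from collections import deque

class FreeBlockCache:
    # Toy model of the proposal: a background scanner fills a per-table
    # cache of reusable blocks; backends pop from it before extending
    # the file, so the table stops growing while free space exists.
    def __init__(self):
        self.free = deque()
        self.table_size = 0        # blocks appended so far

    def scanner_found_free(self, block_no):
        self.free.append(block_no)

    def allocate_block(self):
        if self.free:
            return self.free.popleft()   # reuse an emptied block
        self.table_size += 1             # else append at the tail, as today
        return self.table_size - 1
```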
[HACKERS] European Datestyle
Hello,

Why is this happening?

ctonet=# show datestyle;
NOTICE:  DateStyle is ISO with European conventions
SHOW VARIABLE
ctonet=# select creation_date from users limit 1;
     creation_date
------------------------
 2000-12-07 04:40:23+01
       ^^

DateStyle has been set both with -e and with "set datestyle", with no change.

Context: PostgreSQL 7.0.3 on RedHat Linux 7.0 - Kernel 2.4.0-test10 - Glibc 2.1.94 and 2.2

Thanks! Bye!

--
Daniele Orlandi
Re: [HACKERS] European Datestyle
Trond Eivind Glomsrød wrote:
>> 2000-12-07 04:40:23+01
>>       ^^
> That is the ISO style, isn't it?

Yes, it is; but according to the documentation (and how it used to be on other machines running PG 6.x), it should be ordered in European format. I don't know if I'm missing something obvious or what...

> There are two ways of making dates make sense, none of them American
> (but hey, they're still using Fahrenheit, feet, lb, fl.oz., acres and
> other nonsensical units...)

I don't mean to criticize British units; after all, I would have preferred base-16 units instead of base-10 :)

Bye!

--
Daniele

---
Daniele Orlandi - Utility Line Italia - http://www.orlandi.com
Via Mezzera 29/A - 20030 - Seveso (MI) - Italy
---
Re: AW: [HACKERS] beta testing version
Bruce Guenter wrote:
> CRCs are designed to catch N-bit errors (ie N bits in a row with their
> values flipped). N is (IIRC) the number of bits in the CRC minus one.
> So, a 32-bit CRC can catch all 31-bit errors. That's the only guarantee
> a CRC gives. Everything else has a 1 in 2^32-1 chance of producing the
> same CRC as the original data. That's pretty good odds, but not a
> guarantee.

Nothing is a guarantee. Everywhere you have a non-null probability of failure. Memory of any kind doesn't give you a *guarantee* that the data you read is exactly the data you wrote. CPUs and transmission lines are subject to errors too. The most you can be guaranteed is that the overall probability of failure of your system is under a specified level. When that level is low enough, you usually treat the absence of errors as guaranteed.

With CRC-32 you considerably reduce p, and given how rarely the CRC would need to reveal an error, I would consider it enough.

Bye!

--
Daniele

---
Daniele Orlandi - Utility Line Italia - http://www.orlandi.com
Via Mezzera 29/A - 20030 - Seveso (MI) - Italy
---
Re: AW: [HACKERS] beta testing version
Tom Lane wrote:
> Zeugswetter Andreas SB [EMAIL PROTECTED] writes:
>> Yes, but there would need to be a way to verify the last page or
>> record from txlog when running on crap hardware.
> How exactly *do* we determine where the end of the valid log data is,
> anyway?

Couldn't you use a CRC? Anyway... may I suggest adding CRCs to the data? I just discovered that I had a faulty HD controller, and I fear that something could have been written erroneously (this could also help to detect faulty memory, though only in certain cases).

Bye!

--
Daniele Orlandi
Planet Srl
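As a sketch of what such a check buys you (an illustration only; the record layout here is hypothetical, not PostgreSQL's actual WAL format): a CRC-32 stored alongside each log record lets recovery reject a record that a flaky controller mangled, including the torn record at the end of the log.

```python
import zlib

# Hypothetical log record; the real WAL layout is different.
record = b"INSERT INTO users VALUES (42, 'daniele')"
stored_crc = zlib.crc32(record) & 0xFFFFFFFF  # written next to the record

def record_is_valid(data, crc):
    # Recompute the CRC over the bytes read back and compare it with
    # the one written at log time; a mismatch means the record is bad.
    return (zlib.crc32(data) & 0xFFFFFFFF) == crc

# Simulate a single-bit error such as a faulty HD controller might cause.
corrupted = bytes([record[0] ^ 0x01]) + record[1:]
```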
Re: AW: [HACKERS] beta testing version
Bruce Guenter wrote:
> - Assume that a CRC is a guarantee. A CRC would be a good addition to
> help ensure the data wasn't broken by flakey drive firmware, but
> doesn't guarantee consistency.

Even a CRC per transaction (it could be a nice END record)?

Bye!

--
Daniele

---
Daniele Orlandi - Utility Line Italia - http://www.orlandi.com
Via Mezzera 29/A - 20030 - Seveso (MI) - Italy
---