Re: [HACKERS] patch (for 9.1) string functions
2010/7/21 Itagaki Takahiro <itagaki.takah...@gmail.com>:
> I reviewed the core changes of the patch. I don't think we need
> mb_string_info() at all. Instead, we can just call the pg_mbxxx()
> functions. I rewrote the patch to use pg_mbstrlen_with_len() and
> pg_mbcharcliplen(). What do you think about the changes? It requires
> re-counting the lengths of multi-byte strings in some cases, but the
> code will be much simpler and can avoid allocating length buffers.

It is a good idea. I see a problem only for the right() function, where
for the most common use case mblen will be called twice. I cannot say
now whether this is a performance issue or not. Most probably not; it
matters only for very large strings.

postgres=# create or replace function randomstr(int) returns text as
$$select string_agg(substring('abcdefghijklmnop' from trunc(random()*13)::int+1 for 1),'')
from generate_series(1,$1)$$ language sql;
CREATE FUNCTION
Time: 27,452 ms

postgres=# select count(*) from (select right(randomstr(1000),3)
           from generate_series(1,1)) x;
 count
-------
     1
(1 row)

Time: 5615,061 ms

(The same query, repeated four more times, took 5606,937 ms,
5630,771 ms, 5753,063 ms and 5755,776 ms.)

It is about 2% slower for UTF8 encoding, so it isn't significant to me.
I agree with your changes. Thank you very much.

Regards
Pavel Stehule

> I'd like to apply contrib/stringinfo separately from the core changes,
> because there still seem to be some ideas for improving sprintf().
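For readers unfamiliar with the multi-byte handling under discussion, here is a rough Python sketch of why right() pays for two mblen walks over a UTF-8 string: one pass to count the characters (as pg_mbstrlen_with_len does) and a second to find the byte offset of the clip point (as pg_mbcharcliplen does). The helper names are illustrative, not PostgreSQL's C API.

```python
def mblen(b: bytes, i: int) -> int:
    """Byte length of the UTF-8 character starting at offset i
    (a stand-in for PostgreSQL's pg_mblen)."""
    c = b[i]
    if c < 0x80:
        return 1
    if c < 0xE0:
        return 2
    if c < 0xF0:
        return 3
    return 4

def right_utf8(b: bytes, n: int) -> bytes:
    """Return the last n characters of a UTF-8 byte string.

    Pass 1: count the characters (pg_mbstrlen_with_len).
    Pass 2: skip total - n characters to find the clip point
    (pg_mbcharcliplen).  This is the double mblen walk noted above."""
    total, i = 0, 0
    while i < len(b):
        i += mblen(b, i)
        total += 1
    skip = max(total - n, 0)
    i = 0
    for _ in range(skip):
        i += mblen(b, i)
    return b[i:]

print(right_utf8("žluťoučký".encode(), 3).decode())  # -> čký
```

For single-byte encodings both passes degenerate to plain byte arithmetic, which is why the overhead shows up only with multi-byte data.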
-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication
On Fri, Jul 16, 2010 at 7:43 PM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> On 16/07/10 10:40, Fujii Masao wrote:
>> So we should always prevent the standby from applying any WAL in
>> pg_xlog unless walreceiver is in progress. That is, if there is no
>> WAL available in the archive, the standby ignores pg_xlog and starts
>> the walreceiver process to request WAL streaming.
>
> That completely defeats the purpose of storing streamed WAL in pg_xlog
> in the first place. The reason it's written and fsync'd to pg_xlog is
> that if the standby subsequently crashes, you can use the WAL from
> pg_xlog to reapply the WAL up to minRecoveryPoint. Otherwise you can't
> start up the standby anymore.

But the standby can start up by reading the missing WAL files from the
master, no?

On second thought, minRecoveryPoint can be guaranteed to be older than
the fsync location on the master if we prevent the standby from
applying WAL files beyond the fsync location. So we can safely apply
the WAL files in pg_xlog up to minRecoveryPoint. Consequently, we
should always prevent the standby from applying any WAL in pg_xlog
newer than minRecoveryPoint unless walreceiver is in progress.

Thoughts?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar
Hi Peter,

Thanks for your feedback.

On 20/07/10 19:54, Peter Eisentraut wrote:
>> Attached is a patch with the revised XMLEXISTS function, complete
>> with grammar support and regression tests. The implemented grammar
>> is:
>>
>>     XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )
>>
>> Though the full grammar makes everything after the xpath_expression
>> optional, I've left it as mandatory simply to avoid lots of rework of
>> the function (it would need new null checks, and the memory handling
>> would need reworking).
>
> Some thoughts, mostly nitpicks:
>
> The snippet of documentation could be clearer. It says "if the xml
> satisfies the xpath". Not sure what that means exactly. An XPath
> expression, by definition, returns a value. How is that value used to
> determine the result?

I'll rephrase it: the function xmlexists returns true if the xpath
returns any nodes and false otherwise.

> Naming of parser symbols: xmlexists_list isn't actually a list of
> xmlexists's. That particular rule can probably be done away with
> anyway and the code put directly into the XMLEXISTS rule.
>
> Why is the first argument AexprConst instead of a_expr? The SQL
> standard says it's a character string literal, but I think we can very
> well allow arbitrary expressions.

Yes, it was AexprConst because of the specification. I also found that
using it solved my shift/reduce problems, but I can change it to a_expr
and see if I can work them out in a different way.

> xmlexists_query_argument_list should be optional.

OK, I'll change it.

> The rules xml_default_passing_mechanism and xml_passing_mechanism are
> pretty useless to have as separate rules. Just mention the tokens
> where they are used.

Again, I'll change that too.

> Why c_expr?

As with AexprConst, its choice was partially influenced by the fact
that it solved the shift/reduce errors I was getting. I'm guessing then
that I should really use a_expr and resolve the shift/reduce problem
differently?

> Call the C-level function xmlexists for consistency.

Sure.
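The proposed rephrasing ("returns true if the xpath returns any nodes") can be illustrated with a small Python sketch. ElementTree supports only a subset of XPath, and this is of course an illustration, not the server's libxml2-based implementation:

```python
import xml.etree.ElementTree as ET

def xml_exists(xpath: str, xml_value: str) -> bool:
    """True when the XPath expression selects at least one node,
    false otherwise -- the documented XMLEXISTS semantics."""
    root = ET.fromstring(xml_value)
    return len(root.findall(xpath)) > 0

doc = "<towns><town>Toronto</town><town>Ottawa</town></towns>"
print(xml_exists("town", doc))  # -> True  (nodes selected)
print(xml_exists("city", doc))  # -> False (no match)
```

The point is that the XPath result value itself is discarded; only its non-emptiness determines the boolean.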
I'll look to get a patch addressing these concerns out in the next day
or two, work/family/sleep permitting! :)

Regards,

-- 
Mike Fowler
Registered Linux user: 379787
Re: [HACKERS] Synchronous replication
On Sat, Jul 17, 2010 at 3:25 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> On 14/07/10 09:50, Fujii Masao wrote:
>> TODO
>>
>> The patch has no features for performance improvement of synchronous
>> replication. I admit that currently the performance overhead in the
>> master is terrible. We need to address the following TODO items in
>> the subsequent CF.
>>
>> * Change the poll loop in the walsender
>> * Change the poll loop in the backend
>> * Change the poll loop in the startup process
>> * Change the poll loop in the walreceiver
>
> I was actually hoping to see a patch for these things first, before
> any of the synchronous replication stuff. Eliminating the polling
> loops is important, latency will be laughable otherwise, and it will
> help the synchronous case too.

First, note that the poll loops in the backend and walreceiver don't
exist without the synchronous replication stuff.

Yeah, I'll start with the change to the poll loop in the walsender. I'm
thinking that we should make the backend signal the walsender to send
the outstanding WAL immediately, as the synchronous replication patch I
submitted last year did. I use a signal here because the walsender
needs to wait for the request from the backend and the ack message from
the standby *concurrently* in synchronous replication. If we used a
semaphore instead of a signal, the walsender would not be able to
respond to the ack immediately, which would also degrade performance.

The problem with this idea is that a signal can be sent for every
transaction commit. I'm not sure whether this frequent signaling really
harms the performance of replication. BTW, when I benchmarked the
previous synchronous replication patch based on this idea, AFAIR the
result showed no impact from the signaling. But...

Thoughts? Do you have another, better idea?

>> * Perform the WAL write and replication concurrently
>> * Send WAL from not only disk but also WAL buffers
>
> IMHO these are premature optimizations that we should not spend any
> effort on now. Maybe later, if ever.

Yep!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
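The difference between a poll loop and the proposed signaling scheme can be sketched with an in-process stand-in: a threading.Event plays the role of the backend-to-walsender signal, so the sender wakes as soon as a commit happens instead of sleeping out a poll interval. All names here are illustrative; this is not the patch's code.

```python
import threading

class WalSenderStandIn:
    """Sketch of the wake-up scheme discussed above.  Instead of a
    poll loop (sleep, check, sleep, ...), the sender blocks until a
    committing backend signals it; threading.Event stands in for the
    SIGUSR1-style signal."""

    def __init__(self):
        self.wakeup = threading.Event()
        self.pending = []
        self.lock = threading.Lock()

    def backend_commit(self, wal_record):
        """Called by a committing backend: queue WAL, signal sender."""
        with self.lock:
            self.pending.append(wal_record)
        self.wakeup.set()  # the per-commit signal noted in the thread

    def send_outstanding(self, timeout=5.0):
        """Sender side: block until signaled, then drain the queue.
        Returns immediately after a commit instead of waiting out a
        poll interval -- the latency win the thread is after."""
        if not self.wakeup.wait(timeout):
            return []
        with self.lock:
            batch, self.pending = self.pending, []
            self.wakeup.clear()
        return batch

sender = WalSenderStandIn()
sender.backend_commit("commit-record")
print(sender.send_outstanding())  # -> ['commit-record']
```

The per-commit signal cost that Fujii worries about corresponds here to the `set()` call: cheap, but issued once per commit.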
Re: [HACKERS] Synchronous replication
On Sun, Jul 18, 2010 at 3:14 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> On 14/07/10 09:50, Fujii Masao wrote:
>> Quorum commit
>> -------------
>> In previous discussions about synchronous replication, some people
>> wanted a quorum commit feature. This feature is also included in
>> Zoltan's synchronous replication patch, so I decided to implement it.
>>
>> The patch provides a quorum parameter in postgresql.conf, which
>> specifies how many standby servers transaction commit will wait for
>> WAL records to be replicated to before the command returns a success
>> indication to the client. The default value is zero, which means
>> transaction commit never waits for replication, regardless of
>> replication_mode. Transaction commit also never waits for replication
>> to an asynchronous standby (i.e., one whose replication_mode is set
>> to async), regardless of this parameter. If quorum is more than the
>> number of synchronous standbys, transaction commit returns success
>> when the ACK has arrived from all synchronous standbys.
>
> There should be a way to specify "wait for *all* connected standby
> servers to acknowledge".

Agreed. I'll allow -1 as a valid value of the quorum parameter, which
means that transaction commit waits for all connected standbys.

>> Protocol
>> --------
>> I extended the handshake message START_REPLICATION so that it
>> includes the replication_mode read from recovery.conf. If 'async' is
>> passed, the master knows that it doesn't need to wait for the ACK
>> from the standby.
>
> Please use self-explanatory names for the modes in the
> START_REPLICATION command, instead of just an integer.

Agreed. What about changing the START_REPLICATION message to:

    START_REPLICATION XXX/XXX SYNC_LEVEL { async | recv | fsync | replay }

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
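The quorum rules described above (0 = never wait, -1 = wait for all connected synchronous standbys, and a quorum exceeding the standby count degrading to "all of them") can be condensed into a tiny hypothetical helper. This is an illustration of the stated semantics, not code from the patch:

```python
def acks_to_wait_for(quorum: int, n_sync_standbys: int) -> int:
    """Number of standby ACKs a committing transaction waits for.

    quorum == 0  : never wait (the default)
    quorum == -1 : wait for all connected synchronous standbys
    otherwise    : wait for min(quorum, n_sync_standbys), since a
                   quorum larger than the number of synchronous
                   standbys means waiting for ACKs from all of them.
    """
    if quorum == 0:
        return 0
    if quorum == -1:
        return n_sync_standbys
    return min(quorum, n_sync_standbys)

print(acks_to_wait_for(5, 3))  # quorum exceeds standby count -> 3
```

Asynchronous standbys never enter `n_sync_standbys` at all, which captures the "regardless of this parameter" clause.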
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Tue, Jul 20, 2010 at 8:12 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> My preference would be to stick to a style where we identify the
> committer using the author tag and note the patch author, reviewers,
> whether the committer made changes, etc. in the commit message. A
> single author field doesn't feel like enough for our workflow, and
> having a mix of authors and committers in the author field seems like
> a mess.

Well, I had looked forward to actually putting the real author into the
author field. I hadn't realised that was possible until Guillaume did
so on his first commit to the new pgAdmin git repo. It seems to work
nicely:

http://git.postgresql.org/gitweb?p=pgadmin3.git;a=commit;h=08e2826d90129bd4e4b3b7462bab682dd6a703e4

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] dynamically allocating chunks from shared memory
On 07/21/2010 01:52 AM, Robert Haas wrote:
> On Tue, Jul 20, 2010 at 5:46 PM, Alvaro Herrera
> <alvhe...@commandprompt.com> wrote:
>> I guess what Robert is saying is that you don't need shmem to pass
>> messages around. The new LISTEN implementation was just an example.
>> imessages aren't supposed to use it directly. Rather, the idea is to
>> store the messages in a new SLRU area. Thus you don't need to mess
>> with dynamically allocating shmem at all.

Okay, so I just need to grok the SLRU stuff. Thanks for clarifying.

Note that I sort of /want/ to mess with shared memory. It's what I know
how to deal with. It's how threaded programs work as well. Ya know,
locks, condition variables, mutexes, all those nice things that allow
you to shoot your foot so terribly nicely... Oh, well...

> I think it should be rather straightforward. There would be a unique
> append-point;

A unique append-point? That sounds like what I had before. It would be
a step backwards compared to the per-backend queues and an allocator
that hopefully scales well with the number of CPU cores.

> each process desiring to send a new message to another backend would
> add a new message at that point. There would be one read pointer per
> backend, and it would be advanced as messages are consumed. Old
> segments could be trimmed as backends advance their read pointers,
> similar to how the sinval queue is handled.

That leads to pretty nasty fragmentation. A dynamic allocator should do
much better in that regard. (Wamalloc certainly does.)

> If the messages are mostly unicast, it might be nice to contrive a
> method whereby backends don't need to explicitly advance over messages
> destined only for other backends. Like maybe allocate a small, fixed
> amount of shared memory sufficient for two pointers into the SLRU area
> per backend, and then use the SLRU to store each message with a header
> indicating where the next message is to be found.

That's pretty much how imessages currently work: a single list of
messages queued per backend.

> For each backend, you store one pointer to the first queued message
> and one pointer to the last queued message. New messages can be added
> by making the current last message point to a newly added message and
> updating the last-message pointer for that backend. You'd need to
> think about the locking and reference counting carefully to make sure
> you eventually free up unused pages, but it seems like it might be
> doable.

I've just read through slru.c, but I still don't have a clue how it
could replace a dynamic allocator. At the moment, the creator of an
imessage allocates memory, copies the payload there, and then activates
the message by appending it to the recipient's queue. Upon getting
signaled, the recipient consumes the message by removing it from the
queue and is obliged to release the memory the message occupies after
having processed it. Simple and straightforward, IMO.

The queue addition and removal are clear. But how would I do the
alloc/free part with SLRU? Its blocks are of fixed size (BLCKSZ), and
the API, with ReadPage and WritePage, is rather unlike a pair of
alloc() and free().

> One big advantage of attacking the problem with an SLRU is that
> there's no fixed upper limit on the amount of data that can be
> enqueued at any given time. You can spill to disk or whatever as
> needed (although hopefully you won't normally do so, for performance
> reasons).

Yes, but imessages shouldn't ever be spilled to disk. There naturally
must be an upper limit for them (be it total available memory, as for
threaded things, or a given, size-constrained pool, as is the case for
dynshmem).

To me it rather sounds like SLRU is a candidate for using dynamically
allocated shared memory underneath, instead of allocating a fixed
number of slots in advance. That would allow more efficient use of
shared memory. (Given SLRU's ability to spill to disk, it could even be
used to balance out anomalies to some extent.)
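The per-backend queue described above (a head and a tail pointer per recipient, append at the tail, consume from the head) can be sketched in Python. In real shared memory this would additionally need locking and explicit freeing of consumed payloads; all names here are illustrative, not the imessages code.

```python
class Message:
    """One queued message; `next` links to the next message for the
    same recipient backend."""
    def __init__(self, payload):
        self.payload = payload
        self.next = None

class BackendQueue:
    """Per-backend message queue: one pointer to the first queued
    message and one to the last.  Appending links the new message
    after the current tail; the recipient consumes from the head."""
    def __init__(self):
        self.first = None
        self.last = None

    def append(self, payload):
        msg = Message(payload)
        if self.last is None:
            self.first = msg          # queue was empty
        else:
            self.last.next = msg      # link after current tail
        self.last = msg               # update last-message pointer

    def consume(self):
        msg = self.first
        if msg is None:
            return None               # nothing queued
        self.first = msg.next
        if self.first is None:
            self.last = None          # queue drained
        return msg.payload            # caller must "free" the payload

q = BackendQueue()
q.append("ping")
q.append("pong")
print(q.consume(), q.consume(), q.consume())  # -> ping pong None
```

Because each recipient has its own head/tail pair, senders to different backends never contend, which is the scalability point Markus makes against a single append-point.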
Regards

Markus Wanner
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Wed, Jul 21, 2010 at 02:28, Andrew Dunstan <and...@dunslane.net> wrote:
> Robert Haas wrote:
>> On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut <pete...@gmx.net> wrote:
>>> Well, I had looked forward to actually putting the real author into
>>> the author field.
>>
>> What if there's more than one? What if you make changes yourself? How
>> will you credit the reviewer? I think our current practice is fine.
>> Put it in the commit log.
>
> If nothing else, I think this definitely falls under the "minimum
> changes first" policy.

Let's start by doing things exactly as we're doing now. We can then
consider changing this in the future, but let's not change everything
at once.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Query results differ depending on operating system (using GIN)
On Tue, 20 Jul 2010, Robert Haas wrote:
> On Tue, Jul 20, 2010 at 5:41 AM, Artur Dabrowski <a...@astec.com.pl> wrote:
>> I have been redirected here from pgsql-general. I tested full text
>> search using a GIN index, and it turned out that the results depend
>> on the operating system. Not all the rows are found when executing
>> some of the queries on a pg server installed on Win XP SP3 or CentOS
>> 5.4, while everything seems to be fine on Ubuntu 4.4.1. More details
>> and the tested queries are described here:
>>
>> http://old.nabble.com/Incorrect-FTS-results-with-GIN-index-ts29172750.html
>>
>> I hope you can help with this weird problem.
>
> This seems like it's definitely a bug, but I don't know much about the
> GIN code. Copying Oleg and Teodor...

On my machine I didn't reproduce the problem with Artur's dump. I think
the problem could be with the packaging, since I use only a
self-compiled version.

Regards,
Oleg
_____
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] leaky views, yet again
2010/7/21 KaiGai Kohei <kai...@ak.jp.nec.com>:
> (2010/07/20 2:13), Heikki Linnakangas wrote:
>> On 09/07/10 06:47, KaiGai Kohei wrote:
>>> When leaky and non-leaky functions are chained within a WHERE
>>> clause, they will be ordered by the cost of the functions. So we
>>> have the possibility that leaky functions are executed earlier than
>>> non-leaky functions.
>>
>> No, that needs to be forbidden as part of the fix. Leaky functions
>> must not be executed before all the quals from the view are
>> evaluated.
>
> IIUC, a view is expanded to a subquery in the rewriter phase, then it
> can be pulled up into the join clause at pull_up_subqueries(). In this
> case, the WHERE clause may have quals that come from different
> origins, mightn't it? E.g.:
>
>   SELECT * FROM v1 WHERE f_malicious(v1.a);
>
> At the rewriter:
>
>   SELECT v1.* FROM (SELECT * FROM t1 WHERE f_policy(t1.b)) v1
>    WHERE f_malicious(v1.a);
>
> After pull_up_subqueries():
>
>   SELECT * FROM t1 WHERE f_policy(t1.b) AND f_malicious(t1.a);
>                          (cost = 100)       (cost = 0.0001)
>
> Apart from the idea of a secure/leaky function mark, isn't some
> mechanism necessary to enforce that f_policy() is executed earlier
> than f_malicious()?

I think you guys are in fact agreeing with each other.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
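The ordering hazard KaiGai describes can be shown with a toy model of cost-based qual ordering: once the view's policy qual and the user's cheap function end up in the same qual list, sorting by cost runs the (potentially leaky) cheap function first. Function names and costs below are illustrative, not the planner's actual data structures.

```python
def order_quals_by_cost(quals):
    """Toy model of the planner behaviour at issue: WHERE-clause
    functions are evaluated cheapest-first, so a cheap user-supplied
    function runs before the view's expensive policy qual and can
    observe rows the policy would have filtered out."""
    return sorted(quals, key=lambda q: q[1])

# After pull_up_subqueries() both quals sit in one list:
quals = [("f_policy", 100.0), ("f_malicious", 0.0001)]
print([name for name, _ in order_quals_by_cost(quals)])
# -> ['f_malicious', 'f_policy']
```

Any fix along the lines Heikki suggests must therefore constrain the sort, not just mark functions, so that view-supplied quals keep priority regardless of cost.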
Re: [HACKERS] leaky views, yet again
2010/7/21 KaiGai Kohei <kai...@ak.jp.nec.com>:
>> On the other hand, if it's enough from a performance point of view to
>> review and mark only a few built-in functions like index operators,
>> maybe it's OK.
>
> I also think it is a worthwhile idea to try as a proof of concept.

Yeah. So, should we mark this patch as Returned with Feedback, and you
can submit a proof-of-concept patch for the next CF?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] psql \conninfo command (was: Patch: psql \whoami option)
On Wed, Jul 21, 2010 at 1:07 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Tue, Jul 20, 2010 at 11:14 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> OK, committed.
>
> When I specify the path of the directory for the Unix-domain socket as
> the host, \conninfo doesn't mention that this connection is based on
> the Unix-domain socket. Is this intentional?
>
>   $ psql -h /tmp -c '\conninfo'
>   You are connected to database postgres on host /tmp at port 5432 as
>   user postgres.
>
> I expected something like: You are connected to database postgres via
> local socket on /tmp at port 5432 as user postgres. :-(

No, I didn't realize the host field could be used that way. It's true
that you get a fairly similar message from \c, but that's not exactly
intuitive either.

rhaas=# \c - - /tmp -
You are now connected to database rhaas on host /tmp.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] managing git disk space usage
On Wed, Jul 21, 2010 at 6:17 AM, Abhijit Menon-Sen <a...@toroid.org> wrote:
> At 2010-07-20 13:04:12 -0400, robertmh...@gmail.com wrote:
>> 1. Clone the origin. Then, clone the clone n times locally. This uses
>> hard links, so it saves disk space. But, every time you want to pull,
>> you first have to pull to the main clone, and then to each of the
>> slave clones. And the same thing when you want to push.
>
> If your extra clones are for occasionally-touched back branches, then:
>
> (a) In my experience, it is almost always much easier to work with
> many branches and move patches between them rather than use multiple
> clones; but
>
> (b) You don't need to do the double pull and push. Clone your local
> repository as many times as needed, but create new git-remote(1)s in
> each extra clone and pull/push only the branch you care about directly
> from or to the remote. That way, you'll start off with the bulk of the
> storage shared with your main local repository, and waste a few KB
> when you make (presumably infrequent) new changes.

Ah, that is clever. Perhaps we need to write up directions on how to do
that.

> But that brings me to another point: In my experience (doing exactly
> this kind of old-branch maintenance with Archiveopteryx), git doesn't
> help you much if you want to backport (i.e. cherry-pick) changes from
> a development branch to old release branches. It is much more helpful
> when you make changes to the *oldest* applicable branch and bring it
> *forward* to your development branch (by merging the old branch into
> your master). Cherry-picking can be done, but it becomes painful after
> a while.

Well, per previous discussion, we're not going to change that at this
point, or maybe ever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] managing git disk space usage
On Wed, Jul 21, 2010 at 12:39, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jul 21, 2010 at 6:17 AM, Abhijit Menon-Sen <a...@toroid.org> wrote:
>> (b) You don't need to do the double pull and push. Clone your local
>> repository as many times as needed, but create new git-remote(1)s in
>> each extra clone and pull/push only the branch you care about
>> directly from or to the remote. That way, you'll start off with the
>> bulk of the storage shared with your main local repository, and waste
>> a few KB when you make (presumably infrequent) new changes.
>
> Ah, that is clever. Perhaps we need to write up directions on how to
> do that.

Yeah, that's the way I work with some projects, at least.

>> But that brings me to another point: in my experience, git doesn't
>> help you much if you want to backport (i.e. cherry-pick) changes from
>> a development branch to old release branches. [...] Cherry-picking
>> can be done, but it becomes painful after a while.
>
> Well, per previous discussion, we're not going to change that at this
> point, or maybe ever.

Nope, the deal was definitely that we stick to the current workflow.
Yes, this means we can't use git cherry-pick or similar git-specific
tools to make life easier. But it shouldn't make life harder than it is
*now*, with CVS.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] managing git disk space usage
At 2010-07-20 13:04:12 -0400, robertmh...@gmail.com wrote:
> 1. Clone the origin. Then, clone the clone n times locally. This uses
> hard links, so it saves disk space. But, every time you want to pull,
> you first have to pull to the main clone, and then to each of the
> slave clones. And the same thing when you want to push.

If your extra clones are for occasionally-touched back branches, then:

(a) In my experience, it is almost always much easier to work with many
branches and move patches between them rather than use multiple clones;
but

(b) You don't need to do the double pull and push. Clone your local
repository as many times as needed, but create new git-remote(1)s in
each extra clone and pull/push only the branch you care about directly
from or to the remote. That way, you'll start off with the bulk of the
storage shared with your main local repository, and waste a few KB when
you make (presumably infrequent) new changes.

But that brings me to another point: in my experience (doing exactly
this kind of old-branch maintenance with Archiveopteryx), git doesn't
help you much if you want to backport (i.e. cherry-pick) changes from a
development branch to old release branches. It is much more helpful
when you make changes to the *oldest* applicable branch and bring it
*forward* to your development branch (by merging the old branch into
your master). Cherry-picking can be done, but it becomes painful after
a while. See http://toroid.org/ams/etc/git-merge-vs-p4-integrate for
more.

-- ams
Re: [HACKERS] antisocial things you can do in git (but not CVS)
At 2010-07-20 14:34:20 -0400, robertmh...@gmail.com wrote:
> I think there is also a committer field, but that doesn't always
> appear and I'm not clear on how it works.

There is always a committer field, and it is set sensibly as long as
the committer has user.name and user.email set correctly with
git-config. It is not displayed by git-log by default unless it is
different from the author. (As PeterE showed, it's easy to get the list
of committers.)

> My preference would be to stick to a style where we identify the
> committer using the author tag and note the patch author, reviewers,
> whether the committer made changes, etc. in the commit message.

An aside: as a patch author (and, elsewhere, as a committer), it's nice
when the log shows the author rather than the committer. Will we really
have so many patches with multiple authors or other complications that
we can't set the author by default and fall back to explanations in the
commit message (e.g. "applied with changes") for the more complicated
cases?

> I want to make sure that I don't accidentally push the last three of
> those to the authoritative server...

By default (at least with a recent git), git push will push branches
that are tracking remote branches, but new local branches have to be
pushed explicitly to create them on the remote. So don't worry about
that.

> 3. Merge commits. I believe that we have consensus that commits should
> always be done as a squash, so that the history of all of our branches
> is linear.

I admit I haven't been paying as much attention as I should, but I did
not know there was such a consensus. If anyone could explain the
rationale, I would be grateful.

> But it seems to me that someone could accidentally push a merge commit
> [...] Can we forbid this?

Yes, I suppose it's possible, but personally I think it would be a
waste of time to try to ban merge commits.

> 4. History rewriting. Under what circumstances, if any, are we OK with
> rebasing the master?

Please, let's never do that. The cure for pulling a rebased branch into
an existing clone may seem simple, but it's a huge pain in practice,
and it's never really worth it.

-- ams
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Wed, Jul 21, 2010 at 6:46 AM, Abhijit Menon-Sen <a...@toroid.org> wrote:
>> My preference would be to stick to a style where we identify the
>> committer using the author tag and note the patch author, reviewers,
>> whether the committer made changes, etc. in the commit message.
>
> An aside: as a patch author (and, elsewhere, as a committer), it's
> nice when the log shows the author rather than the committer. Will we
> really have so many patches with multiple authors or other
> complications that we can't set the author by default and fall back to
> explanations in the commit message (e.g. "applied with changes") for
> the more complicated cases?

Tom Lane rewrites part of nearly every commit, and even I change maybe
30% of them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Wed, Jul 21, 2010 at 12:46, Abhijit Menon-Sen <a...@toroid.org> wrote:
> At 2010-07-20 14:34:20 -0400, robertmh...@gmail.com wrote:
>> I want to make sure that I don't accidentally push the last three of
>> those to the authoritative server...
>
> By default (at least with a recent git), git push will push branches
> that are tracking remote branches, but new local branches have to be
> pushed explicitly to create them on the remote.

Yeah, I agree this is probably not a big problem. Plus, if we
accidentally push a branch that shouldn't have been pushed, it can
easily be removed (as long as it's noticed before anybody relies on
it). To the suitable embarrassment of the committer who made the
incorrect push, which has a tendency to teach them not to do it next
time :-)

>> 3. Merge commits. I believe that we have consensus that commits
>> should always be done as a squash, so that the history of all of our
>> branches is linear.
>
> I admit I haven't been paying as much attention as I should, but I did
> not know there was such a consensus. If anyone could explain the
> rationale, I would be grateful.

We are not changing the workflow, just the tool. We may consider
changing the workflow sometime in the future (but don't bet on it), but
we're definitely not changing both at the same time. This has been
discussed many times before, both here on the list and in person at at
least two instances of the PGCon developer meeting. This is not the
time to re-open that discussion.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] managing git disk space usage
At 2010-07-21 06:39:28 -0400, robertmh...@gmail.com wrote: Perhaps we need to write up directions on how to do that. I'll write them if you tell me where to put them. It's trivial. Well, per previous discussion, we're not going to change that at this point, or maybe ever. Sure. I just wanted to mention it, because it's something I learned the hard way. It's also true that back-porting changes is a bigger deal for Postgres than it was for me (in the sense that it's an exception rather than a routine activity), and individual changes are usually backported as soon as, or very soon after, they are committed; so it should be less painful on the whole. Another point, in response to Magnus's followup: At 2010-07-21 12:42:03 +0200, mag...@hagander.net wrote: Yes, this means we can't use git cherry-pick or similar git-specific tools to make life easier. No, that's not right. You *can* use cherry-pick; in fact, it's the sane way to backport the occasional change. What you can't do is efficiently manage a queue of changes to be backported to multiple branches. But as I said above, that's not exactly what we want to do for Postgres, so it should not matter too much. -- ams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
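To make the cherry-pick point concrete, here is a sketch of backporting a single commit to a back branch; the branch name REL9_0_STABLE, file names, and contents are made up for illustration:

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.name "Test"
git config user.email "t@example.com"
echo base > f.txt
git add f.txt
git commit -qm "base"
git branch REL9_0_STABLE              # the back branch forks here
echo fix >> f.txt
git commit -qam "fix bug"             # the fix lands on the development branch
FIX=$(git rev-parse HEAD)
git checkout -q REL9_0_STABLE
git cherry-pick "$FIX"                # backport that single commit
grep -c fix f.txt
# prints: 1
```

This is the "occasional backport" case Abhijit describes; managing a standing queue of such picks across several branches is where it gets painful.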
Re: [HACKERS] pg_config problem on Solaris 10u7 X64
On Tue, Jul 20, 2010 at 10:52 PM, Amber guxiaobo1...@gmail.com wrote: I am trying to build RPostgreSQL on Solaris 10u7 X64, but have problems with pg_config; the configure script of RPostgreSQL checks for pg_config and got "checking for pg_config... /usr/bin/pg_config". In Solaris 10u7 X64, three versions of PostgreSQL are installed; they are in /usr/postgres/8.2 (8.2.9) and /usr/postgres/8.3 (8.3.3), the corresponding bin files are in /usr/postgres/version/bin and /usr/postgres/version/bin/amd64, the version in /usr/bin is 8.1.11 and it seems to be 32-bit, and I can't find the 64-bit bins for 8.1.11. My question is how to let the RPostgreSQL configure script find the 64-bit pg_config. My first guess would be to try changing your PATH before running configure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
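A sketch of that suggestion: prepend the directory containing the 64-bit pg_config to PATH before running configure. The stub pg_config below only stands in for the real binary under /usr/postgres/8.3/bin/amd64, so the PATH mechanics can be shown without a Solaris box:

```shell
set -e
# Stand-in for the 64-bit install (directory layout mimics the report)
fake="$(mktemp -d)/postgres/8.3/bin/amd64"
mkdir -p "$fake"
printf '#!/bin/sh\necho 8.3-amd64\n' > "$fake/pg_config"
chmod +x "$fake/pg_config"
export PATH="$fake:$PATH"     # prepend before running configure
command -v pg_config          # now resolves inside $fake, not /usr/bin
pg_config
# prints: 8.3-amd64
```

On the real machine this would presumably just be `PATH=/usr/postgres/8.3/bin/amd64:$PATH ./configure`.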
Re: [HACKERS] managing git disk space usage
On Wed, Jul 21, 2010 at 6:56 AM, Abhijit Menon-Sen a...@toroid.org wrote: At 2010-07-21 06:39:28 -0400, robertmh...@gmail.com wrote: Perhaps we need to write up directions on how to do that. I'll write them if you tell me where to put them. It's trivial. Post 'em here or drop them on the wiki and post a link. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] antisocial things you can do in git (but not CVS)
At 2010-07-21 12:55:55 +0200, mag...@hagander.net wrote: We are not changing the workflow, just the tool. OK, but I don't see why accidental merge commits need to be considered antisocial, and banned or rebased away. Who cares if they exist? They don't change anything you need to do to pull, create, view, or push changes. This is not the time to re-open that discussion. Sure. I apologise for bringing it up. -- ams
Re: [HACKERS] leaky views, yet again
(2010/07/21 19:26), Robert Haas wrote: 2010/7/21 KaiGai Kohei kai...@ak.jp.nec.com: On the other hand, if it's enough from a performance point of view to review and mark only a few built-in functions like index operators, maybe it's ok. I also think it is a worthwhile idea to try as a proof-of-concept. Yeah. So, should we mark this patch as Returned with Feedback, and you can submit a proof-of-concept patch for the next CF? Yes, it's fair enough. -- KaiGai Kohei kai...@kaigai.gr.jp
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Wed, Jul 21, 2010 at 13:05, Abhijit Menon-Sen a...@toroid.org wrote: At 2010-07-21 12:55:55 +0200, mag...@hagander.net wrote: We are not changing the workflow, just the tool. OK, but I don't see why accidental merge commits need to be considered antisocial, and banned or rebased away. Who cares if they exist? They don't change anything you need to do to pull, create, view, or push changes. They make it harder to track how the project has moved along for people who don't really know about the concept. I'm not sure, but I bet they may cause issues for those tracking the project through git-cvs, or any other tool that doesn't deal with nonlinear history. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: 2010/7/20 Pavel Stehule pavel.steh...@gmail.com: here is a new version - new these functions are not a strict and function to_string is marked as stable. We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is good idea to have different names for them? IMHO, we'd better use 3 arguments version of array_to_string() instead of the new to_string() ? The worst part is that the new names are not very mnemonic. I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry within the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as string_to_array(), and then replaces empty strings with NULL or some other value. Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
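For illustration, the semantics being proposed might look like the following. Note that array_replace() does not exist at this point - these calls are a sketch of the proposed behaviour, not working SQL:

```sql
-- Proposed: replace every element IS NOT DISTINCT FROM the 2nd
-- argument with the 3rd, returning a new array
SELECT array_replace(ARRAY['a', NULL, 'c'], NULL, '*');
--  would yield {a,*,c}

-- to_string(arr, ',', '*') would then be spelled as a composition:
SELECT array_to_string(array_replace(ARRAY['a', NULL, 'c'], NULL, '*'), ',');
--  would yield a,*,c
```

The appeal is that one generic primitive covers the NULL-substitution half of both to_string() and to_array(); the objection raised below is the cost of materializing the intermediate array.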
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: 2010/7/20 Pavel Stehule pavel.steh...@gmail.com: here is a new version - new these functions are not a strict and function to_string is marked as stable. We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is good idea to have different names for them? IMHO, we'd better use 3 arguments version of array_to_string() instead of the new to_string() ? The worst part is that the new names are not very mnemonic. I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry with the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as to_array(), and then replaces empty strings with NULL or some other value. Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too. I don't agree. Building or updating any array is a little bit expensive. There can be the same performance issue as with the combination of array_agg and array_to_string versus string_agg. I am not against possible name changes. But I am strongly of the opinion that the current string_to_array and array_to_string are buggy and have to be deprecated. Regards Pavel p.s. can we use the names text_to_array and array_to_text? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] managing git disk space usage
At 2010-07-21 06:57:53 -0400, robertmh...@gmail.com wrote: Post 'em here or drop them on the wiki and post a link. 1. Clone the remote repository as usual: git clone git://git.postgresql.org/git/postgresql.git 2. Create as many local clones as you want: git clone postgresql foobar 3. In each clone (supposing you care about branch xyzzy): 3.1. git remote set-url origin ssh://whatever/postgresql.git 3.2. git remote update; git remote prune origin 3.3. git checkout -t origin/xyzzy 3.4. git branch -d master 3.5. Edit .git/config and set origin.fetch thus: [remote "origin"] fetch = +refs/heads/xyzzy:refs/remotes/origin/xyzzy (You can git config remote.origin.fetch '+refs/...' if you're squeamish about editing the config file.) 3.6. That's it. git pull and git push will work correctly. (This will replace the origin remote that pointed at your local postgresql.git clone with one that points to the real remote; but you could also add a remote definition named something other than origin, in which case you'd need to git push thatname etc.) -- ams
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: 2010/7/20 Pavel Stehule pavel.steh...@gmail.com: here is a new version - new these functions are not a strict and function to_string is marked as stable. We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is good idea to have different names for them? IMHO, we'd better use 3 arguments version of array_to_string() instead of the new to_string() ? The worst part is that the new names are not very mnemonic. I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry with the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as to_array(), and then replaces empty strings with NULL or some other value. Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too. I don't agree. Building or updating any array is little bit expensive. There can be same performance issue like combination array_agg and array_to_string versus string_agg. But is it really bad enough to introduce custom versions of every function that might want to do this sort of thing? I am not against to possible name changes. But I am strong in opinion so current string_to_array and array_to_string are buggy and have to be deprecated. But I don't think anyone else agrees with you. The current behavior isn't the only one anyone might want, but it's one reasonable behavior. 
p.s. can we use a names - text_to_array, array_to_text ? That's not going to reduce confusion one bit... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: 2010/7/20 Pavel Stehule pavel.steh...@gmail.com: here is a new version - new these functions are not a strict and function to_string is marked as stable. We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is good idea to have different names for them? IMHO, we'd better use 3 arguments version of array_to_string() instead of the new to_string() ? The worst part is that the new names are not very mnemonic. I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry with the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as to_array(), and then replaces empty strings with NULL or some other value. Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too. I don't agree. Building or updating any array is little bit expensive. There can be same performance issue like combination array_agg and array_to_string versus string_agg. But is it really bad enough to introduce custom versions of every function that might want to do this sort of thing? I am not against to possible name changes. But I am strong in opinion so current string_to_array and array_to_string are buggy and have to be deprecated. But I don't think anyone else agrees with you. 
The current behavior isn't the only one anyone might want, but it's one reasonable behavior. See the discussion of these functions - this is Merlin Moncure's proposal http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151503.html These functions were designed in reaction to reported bugs and problems with serialisation and deserialisation of arrays with null fields. You can't parse a string to an array with null values now: postgres=# select string_to_array('1,2,3,null,5',',')::int[]; ERROR: invalid input syntax for integer: null postgres=# Regards Pavel Stehule p.s. can we use a names - text_to_array, array_to_text ? That's not going to reduce confusion one bit... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
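For what it's worth, the NULL round-trip can be approximated on 8.4 and later without the new functions, at the cost of verbosity - split first, then map the literal string 'null' back to a SQL NULL per element:

```sql
SELECT ARRAY(
    SELECT nullif(x, 'null')::int
    FROM unnest(string_to_array('1,2,3,null,5', ',')) AS u(x)
);
--  {1,2,3,NULL,5}
```

This is only a workaround sketch: it relies on unnest()/ARRAY() preserving element order (true in practice for this simple case, though not formally guaranteed) and on 'null' never being a legitimate data value, which is exactly the ambiguity the proposed functions are meant to remove.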
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Pavel Stehule pavel.steh...@gmail.com: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: 2010/7/20 Pavel Stehule pavel.steh...@gmail.com: here is a new version - new these functions are not a strict and function to_string is marked as stable. We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is good idea to have different names for them? IMHO, we'd better use 3 arguments version of array_to_string() instead of the new to_string() ? The worst part is that the new names are not very mnemonic. I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry with the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as to_array(), and then replaces empty strings with NULL or some other value. Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too. I don't agree. Building or updating any array is little bit expensive. There can be same performance issue like combination array_agg and array_to_string versus string_agg. But is it really bad enough to introduce custom versions of every function that might want to do this sort of thing? 
Please look at http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151475.html - I am not alone in the opinion that the current string-to-array functions do not have a good design. Regards Pavel I am not against to possible name changes. But I am strong in opinion so current string_to_array and array_to_string are buggy and have to be deprecated. But I don't think anyone else agrees with you. The current behavior isn't the only one anyone might want, but it's one reasonable behavior. see on discus to these function - this is Marlin Moncure proposal http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151503.html these functions was designed in reaction to reporting bugs and problems with serialisation and deserialisation of arrays with null fields. you can't to parse string to array with null values now postgres=# select string_to_array('1,2,3,null,5',',')::int[]; ERROR: invalid input syntax for integer: null postgres=# Regards Pavel Stehule p.s. can we use a names - text_to_array, array_to_text ? That's not going to reduce confusion one bit... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] review: psql: edit function, show function commands patch
Hello, I am sending an updated patch. I understand your criticism about line numbering, and I have to agree that with line numbering the patch is longer. But I have one significant reason for it: there is no correspondence between the line numbers of the CREATE FUNCTION statement and the line numbers of the function's body. Raised exceptions and syntax errors use the function body's line numbers, but users don't see the function body alone - they see a CREATE FUNCTION statement. What's more - and this depends on programmer style - sometimes it is necessary to correct the line number by -1. By now I have enough knowledge of plpgsql to spot the problematic row, but it is a somewhat harder task for beginners. You can see: CREATE OR REPLACE FUNCTION public.foo() RETURNS integer LANGUAGE plpgsql AS $function$ 1 begin 2 return 10/0; 3 end; $function$ postgres=# select foo(); ERROR: division by zero CONTEXT: SQL statement SELECT 10/0 PL/pgSQL function foo line 2 at RETURN postgres=# CREATE OR REPLACE FUNCTION public.foo() RETURNS integer LANGUAGE plpgsql 1 AS $function$ begin 2 return 10/0; 3 end; $function$ postgres=# select foo(); ERROR: division by zero CONTEXT: SQL statement SELECT 10/0 PL/pgSQL function foo line 2 at RETURN This is a very trivial example - for more complex functions, correct line numbering is even more useful. 2010/7/16 Jan Urbański wulc...@wulczer.org: Hi, here's a review of the \sf and \ef [num] patch from http://archives.postgresql.org/message-id/162867791003290927y3ca44051p80e697bc6b19d...@mail.gmail.com == Formatting == The patch has some small tabs/spaces and whitespace issues and it applies with some offsets, I ran pgindent and rebased against HEAD, attaching the resulting patch for your convenience.
== Functionality == The patch adds the following features: * \e file.txt num - starts an editor for the current query buffer and puts the cursor on the [num] line * \ef func num - starts an editor for a function and puts the cursor on the [num] line * \sf func - shows a full CREATE FUNCTION statement for the function * \sf+ func - the same, but with line numbers * \sf[+] func num - the same, but only from line num onward It only touches psql, so no performance or backend stability worries. In my humble opinion, only the \sf[+] is interesting, because it gives you a copy/pasteable version of the function definition without opening up an editor, and I can find that useful (OTOH: you can set PSQL_EDITOR to cat and get the same effect with \ef... ok, just joking). Line numbers are an extra touch, personally it does not thrill me too much, but I've nothing against it. The number variants of \e and \ef work by simply executing $EDITOR +num file. I tried with some editors that came to my mind, and not all of them support it (most do, though): * emacs and emacsclient work * vi works * nano works * pico works * mcedit works * kwrite does not work * kedit does not work not sure what other people (or for instance Windows people) use. Apart from no universal support from editors, it does not save that many keystrokes - at most a couple. In the end you can usually easily jump to the line you want once you are inside your dream editor. I found that there are a few editors for MS Windows with support for direct line navigation. There isn't any standard. Next I tested kwrite and KDE. There is usually a parameter --line. So you could use a system variable PSQL_NAVIGATION_COMMAND - for example, for KDE, PSQL_NAVIGATION_COMMAND=--line; the default is +n. My recommendation would be to only integrate the \sf[+] part of the patch, which will have the additional benefit of making it much smaller and cleaner (will avoid the grotty splitting of the number from the function name, for instance).
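To illustrate the +num convention being discussed, here is a sketch with a stub editor that just echoes its arguments (the fakeedit script and the file path are made up); an editor that follows the convention receives the line number as its first argument:

```shell
set -e
# Stub "editor" that records how it was invoked, standing in for vi/nano/etc.
dir=$(mktemp -d)
printf '#!/bin/sh\necho "args: $@"\n' > "$dir/fakeedit"
chmod +x "$dir/fakeedit"
PSQL_EDITOR="$dir/fakeedit"
# Roughly what psql would execute for \ef myfunc 42 under this proposal:
"$PSQL_EDITOR" +42 /tmp/psql.edit.sql
# prints: args: +42 /tmp/psql.edit.sql
```

Editors like kwrite instead expect something like --line 42, which is what the PSQL_NAVIGATION_COMMAND idea above would accommodate.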
But I'm just another user out there, maybe others will find uses for the other cases. I disagree. You cannot use a text editor command, because SQL line numbers are not equal to body line numbers. I would personally not add the leading and trailing newlines to \sf output, but that's a question of taste. Docs could use some small grammar fixes, but other than that they're fine. == Code == In the \sf code there is just a strncmp, so this works: \sfblablabla funcname fixed The error for an empty \sf is not great, it should probably look more like \sf: missing required argument following the examples of \pset, \copy or \prompt. Why is lnptr always being passed as a pointer? Looks like an unnecessary complication and one more variable to care about. Can't we just pass lineno? fixed I removed redundant code and added more comments. Regards Pavel Stehule == End == Cheers, Jan editfce.diff Description: Binary data
Re: [HACKERS] Synchronous replication
* Fujii Masao masao.fu...@gmail.com [100721 03:49]: The patch provides a quorum parameter in postgresql.conf, which specifies how many standby servers transaction commit will wait for WAL records to be replicated to, before the command returns a success indication to the client. The default value is zero, which means transaction commit never waits for replication, regardless of replication_mode. Also, transaction commit never waits for replication to an asynchronous standby (i.e., one whose replication_mode is set to async), regardless of this parameter. If quorum is more than the number of synchronous standbys, transaction commit returns success when the ACK has arrived from all synchronous standbys. There should be a way to specify wait for *all* connected standby servers to acknowledge Agreed. I'll allow -1 as a valid value of the quorum parameter, which means that transaction commit waits for all connected standbys. Hm... so if my 1 synchronous standby is operating normally, and quorum is set to 1, I'll get what I want (commit waits until it's safely on both servers). But what happens if my standby goes bad? Suddenly the quorum setting is ignored (because it's the number of connected standby servers?) Is there a way for me to not allow any commits if the quorum number of standbys is *not* available? Yes, I want my db to halt in that situation, and yes, alarm bells will be ringing... In reality, I'm likely to run 2 synchronous slaves, with a quorum of 1. So 1 slave can fail and I can still have 2 going. But if that 2nd slave ever failed while the other was down, I definitely don't want the master to forge on ahead! Of course, this won't be for everyone, just as the current just-connected-standbys behaviour isn't for everyone either... a. -- Aidan Van Dyk ai...@highrise.ca http://www.highrise.ca/ Create like a god, command like a king, work like a slave.
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 8:14 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Pavel Stehule pavel.steh...@gmail.com: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 7:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:39 AM, Itagaki Takahiro itagaki.takah...@gmail.com wrote: 2010/7/20 Pavel Stehule pavel.steh...@gmail.com: here is a new version - new these functions are not a strict and function to_string is marked as stable. We have array_to_string(anyarray, text) and string_to_array(text, text), and you'll introduce to_string(anyarray, text, text) and to_array(text, text, text). Do we think it is good idea to have different names for them? IMHO, we'd better use 3 arguments version of array_to_string() instead of the new to_string() ? The worst part is that the new names are not very mnemonic. I think maybe what we really need here is array equivalents of COALESCE() and NULLIF(). It looks like the proposed to_string() function is basically equivalent to replacing each NULL entry with the array with a given value, and then doing array_to_string() as usual. And it looks like the proposed to_array function basically does the same thing as to_array(), and then replaces empty strings with NULL or some other value. Maybe we just need a function array_replace(anyarray, anyelement, anyelement) that replaces any element in the array that IS NOT DISTINCT FROM $2 with $3 and returns the new array. That could be useful for other things besides this particular case, too. I don't agree. Building or updating any array is little bit expensive. There can be same performance issue like combination array_agg and array_to_string versus string_agg. But is it really bad enough to introduce custom versions of every function that might want to do this sort of thing? 
please look on http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg151475.html I am not alone in opinion so current string to array functions has not good design OK, I stand corrected, although I'm not totally convinced. I still think to_array() and to_string() are not a good choice of names. I am not sure if we should reuse the existing names (adding a third parameter) or pick something else, like array_concat() and split_to_array(). Also, should we consider putting these in contrib/stringfunc rather than core? Or is there enough support for core that we should stick with doing it that way? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: to_string, to_array functions
OK, I stand corrected, although I'm not totally convinced. I still think to_array() and to_string() are not a good choice of names. I am not sure if we should reuse the existing names (adding a third parameter) or pick something else, like array_concat() and split_to_array(). It was discussed before. I would like to see some symmetry in the names. The bad thing is that the great names string_to_array and array_to_string are already taken, and the second bad thing was done three years ago, when nobody was thinking about NULL values. I don't think we are able to repair the older functions - simply, the default behaviour isn't optimal. I think we have to make a decision about string_to_array and array_to_string deprecation first. If these functions will be deprecated, then we can use similar names (and probably we should use similar names) - so text_to_array or array_to_text could be acceptable. If not, then this discussion is needless - then to_string and to_array should at most be in contrib - stringfunc is a good idea - and maybe we don't need to think about new names. Also, should we consider putting these in contrib/stringfunc rather than core? Or is there enough support for core that we should stick with doing it that way? So that is one variant. I am not against moving these functions to contrib/stringfunc. I think we have to resolve the question of marking the string_to_array and array_to_string functions as deprecated first. Then we can move forward. My opinion is known - I am for removing these functions in the future and replacing them with modernized functions. Other opinions? Can we move forward? Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Query optimization problem
On Tue, Jul 20, 2010 at 09:57:06AM +0400, Zotov wrote: SELECT d1.ID, d2.ID FROM DocPrimary d1 JOIN DocPrimary d2 ON d2.BasedOn=d1.ID WHERE (d1.ID=234409763) or (d2.ID=234409763) You could try rewriting it to: SELECT d1.ID, d2.ID FROM DocPrimary d1 JOIN DocPrimary d2 ON d2.BasedOn=d1.ID WHERE d1.ID=234409763 UNION SELECT d1.ID, d2.ID FROM DocPrimary d1 JOIN DocPrimary d2 ON d2.BasedOn=d1.ID WHERE d2.ID=234409763 This should have the same semantics as the original query. I don't believe PG knows how to do a rewrite like this at the moment. -- Sam http://samason.me.uk/
[HACKERS] Preliminary review of Synchronous Replication patches
Hello Zoltán, Fujii and list, Kevin asked me to do a preliminary review on both synchronous replication patches. Relevant posts on -hackers are: (A) http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php (B) http://archives.postgresql.org/message-id/aanlktilgyl3y1jkdvhx02433coq7jlmqicsqmosbu...@mail.gmail.com (1) http://archives.postgresql.org/pgsql-hackers/2010-05/msg00746.php (2) http://archives.postgresql.org/pgsql-hackers/2010-05/msg01047.php (3) http://wiki.postgresql.org/wiki/Streaming_Replication#Synchronization_capability The first patch (A) was posted by Zoltán Böszörményi three months ago, with comments on -hackers in thread (1). The second patch by Fujii Masao a few days ago (B). Since both patches overlap in functionality, applying one in core means not applying the other. Initially I set out to do a complete review of both patches and let the difficult choice of preferring one over the other to fellow reviewers. However, for the following reasons I believe that patch (A) should probably be withdrawn and the review effort continued on (B). * patch (A) was designed and programmed without prior community involvement. This in itself doesn't make it a bad patch nor a bad way of contributing source code, however thread (1) shows that some issues were raised and more ideas existed. * one of the leafs of thread (A) was (4) where Zoltán Böszörményi hints there might be a new version of the patch (replacing XIDs with LSNs). However to date no new version was posted. Also this in itself is not ground for rejection, but together with the existence of patch (B) gives rise to the idea that work on (A) might have halted. * the work on patch (B) started actually with the post (1) where Fujii Masao indicates he is going to write a patch too, and proposes to work together with Zoltán Böszörményi on the design. 
* patch (B) encompasses the functionality of (A) and more; it also addresses some, if not all, of the design ideas that were raised in the comments on patch (A). Adding this up, I have the impression that patch (A) will not get a newer version, given that a newer patch (B) exists which has more functionality and is partly based on community feedback on patch (A), while patch (A) itself is not. Therefore I think that the focus and review time during this commitfest should be on patch (B), unless Zoltán Böszörményi disagrees and supplies a new version of his patch. Depending on Zoltán Böszörményi's reaction, I think patch (A) should be set to either Returned with Feedback, if a new version is in the making, or Rejected if not. regards, Yeb Havinga
Re: I: [HACKERS] About Our CLUSTER implementation is pessimal patch
I think writetup_rawheap() and readtup_rawheap() are a little complex, but should work as long as there is no padding between t_len and t_self in the HeapTupleData struct. - It might be cleaner if you write the total item length and the tuple data separately. - (char *) tuple + sizeof(tuplen) might be more robust than tuple->t_self. - I used your functions - changed the docs for CLUSTER (I don't know if they make sense/are enough) - added a minor comment. 2 questions: 1) about the copy-paste from the FormIndexDatum comment: how can I improve it? The idea is that we could have a faster call, but it would mean copying and pasting a lot of code from FormIndexDatum. 2) what other areas can I comment more? sorted_cluster-20100721.patch Description: Binary data
Re: [HACKERS] Explicit psqlrc
On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote: It's tempting to propose making .psqlrc apply only in interactive mode, period. But that would be an incompatibility with previous releases, and I'm not sure it's the behavior we want, either. What is a use case for having .psqlrc be read in noninteractive use?
Re: [HACKERS] Explicit psqlrc
On Wed, Jul 21, 2010 at 10:24 AM, Peter Eisentraut pete...@gmx.net wrote: On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote: It's tempting to propose making .psqlrc apply only in interactive mode, period. But that would be an incompatibility with previous releases, and I'm not sure it's the behavior we want, either. What is a use case for having .psqlrc be read in noninteractive use? Well, for example, if I hate the new ASCII format with a fiery passion that can never be quenched (and, by the way, I do), then I'd like this to apply: \pset linestyle old-ascii Even when I do this: psql -c '...whatever...' -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] multibyte charater set in levenshtein function
On Wed, Jul 21, 2010 at 5:54 AM, Robert Haas robertmh...@gmail.com wrote: This patch still needs some work. It includes a bunch of stylistic changes that aren't relevant to the purpose of the patch. There's no reason that I can see to change the existing levenshtein_internal function to take text arguments instead of char *, or to change !m to m == 0 in existing code, or to change the whitespace in the comments of that function. All of those need to be reverted before we can consider committing this. I changed the arguments of the function from char * to text * in order to avoid a text_to_cstring call. The same benefit can be achieved by replacing char * with char * plus a length. I changed !m to m == 0 because Itagaki asked me to make it conform to the coding style. Do you think there is no reason to fix coding style in existing code? There is a huge amount of duplicated code here. I think it makes sense to have the multibyte version of the function be separate, but perhaps we can merge the less-than-or-equal-to bits into the main code, so that we only have two copies instead of four. Perhaps we can just add a max_d (maximum distance) argument to levenshtein_internal; if this value is >= 0 then it represents the max allowable distance, but if it is < 0 then there is no limit. Sure, that might slow down the existing code a bit, but it might not be significant. I'd at least like to see some numbers showing that it is significant before we go to this much trouble. In that case we would have to add many checks of max_d to the levenshtein_internal function, which would make the code more complex. Actually, we could merge all four functions into one function. But such a function would have many checks for multibyte encoding and max_d. So, I see four cases here: 1) one function with checks for multibyte encoding and max_d 2) two functions with checks for multibyte encoding 3) two functions with checks for max_d 4) four separate functions If you prefer case number 3 you should argue your position a little more. 
The code doesn't follow the project coding style. Braces must be uncuddled. Comment paragraphs will be reflowed unless they begin and end with --. Function definitions should have the type declaration on one line and the function name at the start of the next. Freeing memory with pfree is likely a waste of time; is there any reason not to just rely on the memory context reset, as the original coding did? Ok, I'll fix these things. I think we might need to remove the acknowledgments section from this code. If everyone who touches this code adds their name, we're quickly going to have a mess. If we're not going to remove the acknowledgments section, then please add my name, too, because I've already patched this code once... In that case I think we can leave the original acknowledgments section. With best regards, Alexander Korotkov.
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Tue, 20 Jul 2010 14:34:20 -0400 Robert Haas robertmh...@gmail.com wrote: I have some concerns related to the upcoming conversion to git and how we're going to avoid having things get messy as people start using the new repository. Here are a few responses from the point of view of somebody who has been working with git in the kernel community for some years now. Hopefully it's helpful... 1. Inability to cleanly and easily (and programmatically) identify who committed what. No, git tracks committer information separately, and it's easily accessible. Dig into the grungy details of git-log and you'll see that you can get out just about anything you need, in any format. IMHO, vandalizing the author field would be a mistake; it's your best way of tracking where the patch came from and for ensuring credit in your changelogs. Why throw away information? 2. Branch and tag management. In CVS, there are branches and tags in only one place: on the server. In git, you can have local branches and tags and remote branches and tags, and you can pull and push tags between servers. If I'm working on a git repository that has branches master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, numeric_2b, and temprelnames, I want to make sure that I don't accidentally push the last three of those to the authoritative server... but I do want to push all the others. Similarly I want to push only the correct subset of tags (though that should be less of an issue, at least for me, as I don't usually create local tags). I'm not sure how to set this up, though. Branch push policy can be tweaked in your local config. I'm less sure about tags. It's worth noting that the kernel community does very little with push in general - things are much more often pulled. That may not be a workflow that's suitable for postgresql, though. 3. Merge commits. I believe that we have consensus that commits should always be done as a squash, so that the history of all of our branches is linear. 
But it seems to me that someone could accidentally push a merge commit, either because they forgot to squash locally, or because of a conflict between their local git repo's master branch and origin/master. Can we forbid this? That seems like a terrible idea to me - why would you destroy history? Obviously I've missed a discussion here. But, the first time somebody wants to use bisect to pinpoint a regression-causing patch, you'll wish you had that information there. 4. History rewriting. Under what circumstances, if any, are we OK with rebasing the master? For example, if we decide not to have merge commits, and somebody does a merge commit anyway, are we going to rebase to get rid of it? A good general rule of thumb is to treat publicly-exposed history as immutable. As soon as you start rebasing trees you create misery for anybody working with those trees. If you're really set on avoiding things like merges, there are ways to set up scripts on the server to enforce policies. jon
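The per-branch push setup asked about in point 2 can be done with explicit push refspecs in the remote's configuration. A sketch (the URL and branch names are taken from the example above and are illustrative; note that git never pushes tags implicitly - they go only with git push --tags or when named explicitly):

```ini
# .git/config (sketch): with push.default = nothing, a bare "git push"
# sends only branches named by an explicit refspec, so local-only
# branches such as inner_join_removal never leave the machine by accident.
[push]
	default = nothing
[remote "origin"]
	url = ssh://git.postgresql.org/postgresql.git
	fetch = +refs/heads/*:refs/remotes/origin/*
	push = refs/heads/master:refs/heads/master
	push = refs/heads/REL9_0_STABLE:refs/heads/REL9_0_STABLE
```

With this in place, pushing an experimental branch requires naming it explicitly on the command line, which makes the accidental-push scenario much harder.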
Re: [HACKERS] Explicit psqlrc
On Jul 21, 2010, at 9:42 AM, Robert Haas wrote: On Wed, Jul 21, 2010 at 10:24 AM, Peter Eisentraut pete...@gmx.net wrote: On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote: It's tempting to propose making .psqlrc apply only in interactive mode, period. But that would be an incompatibility with previous releases, and I'm not sure it's the behavior we want, either. What is a use case for having .psqlrc be read in noninteractive use? Well, for example, if I hate the new ASCII format with a fiery passion that can never be quenched (and, by the way, I do), then I'd like this to apply: \pset linestyle old-ascii Even when I do this: psql -c '...whatever...' Well, tossing out two possible solutions: 1) .psqlrc + .psql_profile (kinda like how bash separates out the interactive/non-interactive parts). Kinda yucky, but it's a working solution. 2) have a flag which explicitly includes the psqlrc file in non-interactive use (perhaps if -x is available, use it for the analogue to -X). Regards, David -- David Christensen End Point Corporation da...@endpoint.com
Re: [HACKERS] Preliminary review of Synchronous Replication patches
Yeb Havinga yebhavi...@gmail.com wrote: Kevin asked me to do a preliminary review on both synchronous replication patches. Thanks for doing so. BTW, Yeb has emailed me off-list that he has more specific notes on both patches, but has run into high priority items on his day job which will prevent him from posting those for another day or two. Since both patches overlap in functionality, applying one in core means not applying the other. Initially I set out to do a complete review of both patches and let the difficult choice of preferring one over the other to fellow reviewers. However, for the following reasons I believe that patch (A) should probably be withdrawn and the review effort continued on (B). Unless there are objections, I will mark the patch by Zoltán Böszörményi as Returned with Feedback in a couple days, and ask that everyone interested in this feature focus on advancing the patch by Fujii Masao. Given the scope and importance of this area, I think we could benefit from another person or two signing on officially as Reviewers. -Kevin
Re: [HACKERS] sql/med review - problems with patching
Hello, I am playing with foreign tables now. I found a few small issues:

* foreign tables are not dumped via pg_dump
* autocomplete for CREATE FOREIGN DATA WRAPPER doesn't offer the HANDLER keyword (probably it isn't your problem)
* an "ERROR: unrecognized objkind: 18" issue

create table omega(a int, b int, c int);
insert into omega select i, i+1, i+2 from generate_series(1,1000,3) g(i);

postgres=# SELECT * from pg_foreign_server;
 srvname | srvowner | srvfdw | srvtype | srvversion | srvacl | srvoptions
---------+----------+--------+---------+------------+--------+------------
 fake    |    16384 |  16385 |         |            |        |
(1 row)

postgres=# SELECT * from pg_foreign_data_wrapper;
 fdwname | fdwowner | fdwvalidator | fdwhandler | fdwacl | fdwoptions
---------+----------+--------------+------------+--------+------------
 xx      |    16384 |         3120 |       3121 |        |
(1 row)

COPY omega to '/tmp/omega';
CREATE FOREIGN TABLE omega3(a int, b int, c int) SERVER fake OPTIONS (filename '/tmp/omega');
create role tom;
grant select on omega2 to tom;

The behaviour was unstable - the first call of "select * from omega" finished with "ERROR: unrecognized objkind: 18" (I couldn't reproduce it later :( ); the second finished with the correct exception:

ERROR: must be superuser to COPY to or from a file
HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

Do these security limits still have to apply here? I understand this limit for the COPY statement, but I don't see the sense of it for a foreign table. I agree that only a superuser should be able to CREATE FOREIGN TABLE based on the file fdw handler. But why does access via MED have to be limited? I am very happy with the implementation of file_fdw_handler. It is proof that LIMIT isn't a problem, and I don't understand why it would have to be a problem for the dblink handler.

postgres=# select count(*) from omega2;
  count
---------
 3335004
(1 row)
Time: 1915,281 ms

postgres=# select count(*) from omega2;
  count
---------
 3335004
(1 row)
Time: 1921,744 ms

postgres=# select count(*) from (select * from omega2 limit 1000) x;
 count
-------
  1000
(1 row)
Time: 1,597 ms

From a practical view I would like to see the options used for any table. 
I also miss more detailed info in the \d command output. Regards Pavel 2010/7/20 Itagaki Takahiro itagaki.takah...@gmail.com: 2010/7/14 Pavel Stehule pavel.steh...@gmail.com: please, can you refresh the patch? Updated patch attached. The latest version is always in the git repo. http://repo.or.cz/w/pgsql-fdw.git (branch: fdw) I'm developing the patch on postgres' git repo, so the regression test for dblink might fail because of an out-of-sync issue between cvs and git. When I looked at the documentation I missed a tutorial for foreign tables. There is only a reference. I miss a paragraph where it is clearly and simply specified what is possible now and what isn't possible. The enhancement of dblink isn't documented. Sure. I'll start to write documentation when we agree on the design of FDW. In the function pgIterate(ForeignScanState *scanstate) you iterate over the PGresult. I think that using a cursor and fetching multiple rows would be preferable. Sure, but I'm thinking that it will be improved after libpq supports protocol-level cursors. The libpq improvement will benefit many more applications, including postgresql_fdw. -- Itagaki Takahiro
Re: [HACKERS] Explicit psqlrc
On Wed, Jul 21, 2010 at 11:31 AM, David Christensen da...@endpoint.com wrote: On Jul 21, 2010, at 9:42 AM, Robert Haas wrote: On Wed, Jul 21, 2010 at 10:24 AM, Peter Eisentraut pete...@gmx.net wrote: On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote: It's tempting to propose making .psqlrc apply only in interactive mode, period. But that would be an incompatibility with previous releases, and I'm not sure it's the behavior we want, either. What is a use case for having .psqlrc be read in noninteractive use? Well, for example, if I hate the new ASCII format with a fiery passion that can never be quenched (and, by the way, I do), then I'd like this to apply: \pset linestyle old-ascii Even when I do this: psql -c '...whatever...' Well, tossing out two possible solutions: 1) .psqlrc + .psql_profile (kinda like how bash separates out the interactive/non-interactive parts). Kinda yucky, but it's a working solution. 2) have a flag which explicitly includes the psqlrc file in non-interactive use (perhaps if -x is available, use it for the analogue to -X). Hmm. Well, that still doesn't solve the problem that -c and -f do different things with respect to psqlrc, does it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: It was discussed before. I would like to see some symmetry in the names. That's reasonable. The bad thing is that great names like string_to_array and array_to_string are already used, Yeah, those names are not too good. and the second bad thing was done three years ago when nobody was thinking about NULL values. I don't think we are able to repair the older functions - simply, the default behaviour isn't optimal. This is a matter of opinion, but certainly it's not right for everyone. I am thinking we have to make a decision about string_to_array and array_to_string deprecation first. If these functions are deprecated, then we can use similar names (and probably we should use similar names) - so text_to_array or array_to_text could be acceptable. If not, then this discussion is needless - then to_string and to_array should at most live in contrib - stringfunc is a good idea - and maybe we don't need to think about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] patch: to_string, to_array functions
On 22 July 2010 01:55, Robert Haas robertmh...@gmail.com wrote: On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking we have to make a decision about string_to_array and array_to_string deprecation first. Well, -1 from me for deprecating string_to_array and array_to_string. For what it's worth, I agree with Pavel about the current behaviour in core. It's broken whenever NULLs come into play. We need to improve on this one way or another, and I think it would be a shame to deal with a problem in core by adding something to contrib. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. What about implode() and explode()? It's got symmetry and it's possibly more descriptive. Cheers, BJ
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: It was discussed before. I would like to see some symmetry in the names. That's reasonable. The bad thing is that great names like string_to_array and array_to_string are already used, Yeah, those names are not too good. and the second bad thing was done three years ago when nobody was thinking about NULL values. I don't think we are able to repair the older functions - simply, the default behaviour isn't optimal. This is a matter of opinion, but certainly it's not right for everyone. I am thinking we have to make a decision about string_to_array and array_to_string deprecation first. If these functions are deprecated, then we can use similar names (and probably we should use similar names) - so text_to_array or array_to_text could be acceptable. If not, then this discussion is needless - then to_string and to_array should at most live in contrib - stringfunc is a good idea - and maybe we don't need to think about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not an English native speaker, so I have a different feeling. These functions do array serialisation and deserialisation, but those names are too long. I have no idea about better names - text->array, array->text is descriptive enough (for me), and these names show very cleanly the symmetry between the functions. I have to repeat - it is very clear for a non-native speaker. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Wed, Jul 21, 2010 at 10:49 AM, Jonathan Corbet cor...@lwn.net wrote: 1. Inability to cleanly and easily (and programatically) identify who committed what. No, git tracks committer information separately, and it's easily accessible. Dig into the grungy details of git-log and you'll see that you can get out just about anything you need, in any format. IMHO, vandalizing the author field would be a mistake; it's your best way of tracking where the patch came from and for ensuring credit in your changelogs. Why throw away information? If git had a place to store all the information we care about, that would be fine, but it doesn't. Here's a recent attribution line I used: David Christensen. Reviewed by Steve Singer. Some further changes by me. There's no reviewer header, and there's no concept that a patch might have come from the author (or perhaps multiple authors), but then have been adjusted by one or more reviewers and then frobnicated some more by the committer. I'm not sure it's possible to create a system that can effectively store all the ways we give credit and attribution, but a single author line is definitely not it. How much do I have to change a patch before I would use my own name on the author line rather than the patch author's? A single byte? A non-whitespace change? More than 15% of the patch? And, oh by the way, it's the committer who writes the commit message, not the patch author, so at an *absolute minimum* that part of the commit object isn't coming from the original author. 2. Branch and tag management. In CVS, there are branches and tags in only one place: on the server. In git, you can have local branches and tags and remote branches and tags, and you can pull and push tags between servers. If I'm working on a git repository that has branches master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal, numeric_2b, and temprelnames, I want to make sure that I don't accidentally push the last three of those to the authoritative server... 
but I do want to push all the others. Similarly I want to push only the corrects subset of tags (though that should be less of an issue, at least for me, as I don't usually create local tags). I'm not sure how to set this up, though. Branch push policy can be tweaked in your local config. I'm less sure about tags. It's worth noting that the kernel community does very little with push in general - things are much more often pulled. That may not be a workflow that's suitable for postgresql, though. Seems like we've got this one worked out, per discussion upthread. 3. Merge commits. I believe that we have consensus that commits should always be done as a squash, so that the history of all of our branches is linear. But it seems to me that someone could accidentally push a merge commit, either because they forgot to squash locally, or because of a conflict between their local git repo's master branch and origin/master. Can we forbid this? That seems like a terrible idea to me - why would you destroy history? Obviously I've missed a discussion here. But, the first time somebody wants to use bisect to pinpoint a regression-causing patch, you'll wish you had that information there. In any commit pattern, if I use bisect to pinpoint a regression causing patch, I will find the commit that broke it. Whoever made that particular commit is to blame. Full stop. I don't really care where in the development of the patch that was eventually committed the breakage happened, and I do not want to wade through 50 revs of somebody's private development to find the particular place where they made a thinko. I only care that their patch *as committed* is broken. I don't think that non-linear history is an advantage in any situation. It may be an unavoidable necessity if you have lots of cross-merging between different repositories, but we don't, so for us it's just clutter. 4. History rewriting. Under what circumstances, if any, are we OK with rebasing the master? 
For example, if we decide not to have merge commits, and somebody does a merge commit anyway, are we going to rebase to get rid of it? A good general rule of thumb is to treat publicly-exposed history as immutable. As soon as you start rebasing trees you create misery for anybody working with those trees. If you're really set on avoiding things like merges, there are ways to set up scripts on the server to enforce policies. Yep, Magnus coded it up today. It works great. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
[HACKERS] documentation for committing with git
At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Note that while anyone is welcome to comment, I mostly care about whether the document is adequate for our existing committers, rather than whether someone who is not a committer thinks we should manage the project differently... that might be an interesting discussion, but we're theoretically making this switch in about a month, and getting agreement on changing our current workflow will take about a decade, so there is not time now to do the latter before we do the former. So I would ask everyone to consider postponing those discussions until after we've made the switch and ironed out the kinks. On the other hand, if you have technical corrections, or if you have suggestions on how to do the same things better (rather than suggestions on what to do differently), that would be greatly appreciated. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 12:08 PM, Brendan Jurd dire...@gmail.com wrote: On 22 July 2010 01:55, Robert Haas robertmh...@gmail.com wrote: On Wed, Jul 21, 2010 at 9:39 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking we have to make a decision about string_to_array and array_to_string deprecation first. Well, -1 from me for deprecating string_to_array and array_to_string. For what it's worth, I agree with Pavel about the current behaviour in core. It's broken whenever NULLs come into play. We need to improve on this one way or another, and I think it would be a shame to deal with a problem in core by adding something to contrib. Fair enough. I'm OK with putting it in core if we can come up with suitable names. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. What about implode() and explode()? It's got symmetry and it's possibly more descriptive. Hmm, it's a thought. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking we have to make a decision about string_to_array and array_to_string deprecation first. If these functions are deprecated, then we can use similar names (and probably we should use similar names) - so text_to_array or array_to_text could be acceptable. If not, then this discussion is needless - then to_string and to_array should at most live in contrib - stringfunc is a good idea - and maybe we don't need to think about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not an English native speaker, so I have a different feeling. These functions do array serialisation and deserialisation, but those names are too long. I have no idea about better names - text->array, array->text is descriptive enough (for me), and these names show very cleanly the symmetry between the functions. I have to repeat - it is very clear for a non-native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. array_split() and array_join(), following Perl? array_implode() and array_explode(), along the lines suggested by Brendan? 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] patch: to_string, to_array functions
On Jul 21, 2010, at 12:30 , Robert Haas wrote: array_split() and array_join(), following Perl? +1. Seems common in other languages such as Ruby, Python, and Java as well. Michael Glaesemann grzm seespotcode net
Re: [HACKERS] Explicit psqlrc
On Wed, 2010-07-21 at 17:24 +0300, Peter Eisentraut wrote: On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote: It's tempting to propose making .psqlrc apply only in interactive mode, period. But that would be an incompatibility with previous releases, and I'm not sure it's the behavior we want, either. What is a use case for having .psqlrc be read in noninteractive use? Changing the historical defaults, such as error/exit behaviour, ensuring timing is on etc.. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] dynamically allocating chunks from shared memory
On Wed, Jul 21, 2010 at 4:33 AM, Markus Wanner mar...@bluegap.ch wrote: Okay, so I just need to grok the SLRU stuff. Thanks for clarifying. Note that I sort of /want/ to mess with shared memory. It's what I know how to deal with. It's how threaded programs work as well. Ya know, locks, conditional variables, mutexes, all those nice thing that allow you to shoot your foot so terribly nicely... Oh, well... For what it's worth, I feel your pain. I think the SLRU method is *probably* better, but I feel your pain anyway. For each backend, you store one pointer to the first queued message and one pointer to the last queued message. New messages can be added by making the current last message point to a newly added message and updating the last message pointer for that backend. You'd need to think about the locking and reference counting carefully to make sure you eventually freed up unused pages, but it seems like it might be doable. I've just read through slru.c, but still don't have a clue how it could replace a dynamic allocator. At the moment, the creator of an imessage allocs memory, copies the payload there and then activates the message by appending it to the recipient's queue. Upon getting signaled, the recipient consumes the message by removing it from the queue and is obliged to release the memory the messages occupies after having processed it. Simple and straight forward, IMO. The queue addition and removal is clear. But how would I do the alloc/free part with SLRU? Its blocks are fixed size (BLCKSZ) and the API with ReadPage and WritePage is rather unlike a pair of alloc() and free(). Given what you're trying to do, it does sound like you're going to need some kind of an algorithm for space management; but you'll be managing space within the SLRU rather than within shared_buffers. For example, you might end up putting a header on each SLRU page or segment and using that to track the available freespace within that segment for messages to be read and written. 
It'll probably be a bit more complex than the one for listen (see asyncQueueAddEntries). One big advantage of attacking the problem with an SLRU is that there's no fixed upper limit on the amount of data that can be enqueued at any given time. You can spill to disk or whatever as needed (although hopefully you won't normally do so, for performance reasons). Yes, imessages shouldn't ever be spilled to disk. There naturally must be an upper limit for them. (Be it total available memory, as for threaded things or a given and size-constrained pool, as is the case for dynshmem). I guess experience has taught me to be wary of things that are wired in memory. Under extreme memory pressure, something's got to give, or the whole system will croak. Consider also the contrary situation, where the imessages stuff is not in use (even for a short period of time, like a few minutes). Then we'd really rather not still have memory carved out for it. To me it rather sounds like SLRU is a candidate for using dynamically allocated shared memory underneath, instead of allocating a fixed amount of slots in advance. That would allow more efficient use of shared memory. (Given SLRU's ability to spill to disk, it could even be used to 'balance' out anomalies to some extent). I think what would be even better is to merge the SLRU pools with the shared_buffer pool, so that the two can duke it out for who is in most need of the limited amount of memory available. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
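The per-backend queue Robert describes can be sketched as follows (Python for brevity; the real thing would be C structs in shared memory with locking around the pointer updates, and all names here are hypothetical):

```python
class Message:
    """One queued imessage: a payload plus a link to the next message."""
    def __init__(self, payload):
        self.payload = payload
        self.next = None

class BackendQueue:
    """Per-backend queue: one pointer to the first queued message and
    one to the last, as described upthread."""
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, msg):
        # Append by making the current last message point to the new
        # one and updating the tail pointer for this backend.
        if self.tail is None:
            self.head = self.tail = msg
        else:
            self.tail.next = msg
            self.tail = msg

    def consume(self):
        # The recipient removes the message from the queue; in the real
        # system it would then free the message's shared memory.
        msg = self.head
        if msg is not None:
            self.head = msg.next
            if self.head is None:
                self.tail = None
        return msg
```

The reference counting and page reclamation Robert mentions would sit around `consume`, once a message's storage is no longer reachable from any queue.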
Re: [HACKERS] Explicit psqlrc
Excerpts from Peter Eisentraut's message of mié jul 21 10:24:26 -0400 2010: On tis, 2010-07-20 at 11:48 -0400, Robert Haas wrote: It's tempting to propose making .psqlrc apply only in interactive mode, period. But that would be an incompatibility with previous releases, and I'm not sure it's the behavior we want, either. What is a use case for having .psqlrc be read in noninteractive use? Even if there weren't one, why does it get applied to -f but not -c? They're both noninteractive. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking so we have to do decision about string_to_array and array_to_string deprecation first. If these function will be deprecated, then we can use a similar names (and probably we should to use a similar names) - so text_to_array or array_to_string can be acceptable. If not, then this discus is needless - then to_string and to_array have to be maximally in contrib - stringfunc is good idea - and maybe we don't need thinking about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not a English native speaker, so I have a different feeling. These functions do array_serialisation and array_deseralisation, but this names are too long. I have not idea about better names - it is descriptive well (for me) text-array, array-text - and these names shows very cleanly symmetry between functions. I have to repeat - it is very clean for not native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. array_split() and array_join(), following Perl? array_implode() and array_explode(), along the lines suggested by Brendan? I have a problem with array_split - because there string is split. I looked on net - and languages usually uses a split or join. 
split is a method of the string class in Java too. So, following Perl, I would feel best with just split and join, but join is a keyword :( - a step back, then; maybe string_split and array_join?

select string_split('1,2,3,4',',');
select array_join(array[1,2,3,4],',');

So my preferences:
1. split, join - I checked, we are able to create a join function
2. split, array_join - if only join is a problem
3. string_split, array_join - the symmetry is not clean, but it respects the widely used semantics of string.split and array.join
4. explode, implode
5. array_explode, array_implode

I cannot like array_split - it is a contradiction for me. Pavel p.s. This is a typical use case for packages - with them, we could have the functions string.split() and array.join()
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking so we have to do decision about string_to_array and array_to_string deprecation first. If these function will be deprecated, then we can use a similar names (and probably we should to use a similar names) - so text_to_array or array_to_string can be acceptable. If not, then this discus is needless - then to_string and to_array have to be maximally in contrib - stringfunc is good idea - and maybe we don't need thinking about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not a English native speaker, so I have a different feeling. These functions do array_serialisation and array_deseralisation, but this names are too long. I have not idea about better names - it is descriptive well (for me) text-array, array-text - and these names shows very cleanly symmetry between functions. I have to repeat - it is very clean for not native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. array_split() and array_join(), following Perl? array_implode() and array_explode(), along the lines suggested by Brendan? I have a problem with array_split - because there string is split. 
I looked on net - and languages usually uses a split or join. split is method of str class in Java. So when I am following Perl, I feel better with just only split and join, but join is keyword :( - step back, maybe string_split X array_join ? select string_split('1,2,3,4',','); select array_join(array[1,2,3,4],','); so my preferences: 1. split, join - I checked - we are able to create join function 2. split, array_join - when only join can be a problem 3. string_split, array_join - there are not clean symmetry, but it respect wide used a semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot to like array_split - it is contradiction for me. Well, I guess I prefer my suggestion to any of those (I know... what a surprise), but I think I could live with #3, #4, or #5. It's hard for me to imagine that we really want to create a function called just join(), given the other meanings that JOIN already has in SQL. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking so we have to do decision about string_to_array and array_to_string deprecation first. If these function will be deprecated, then we can use a similar names (and probably we should to use a similar names) - so text_to_array or array_to_string can be acceptable. If not, then this discus is needless - then to_string and to_array have to be maximally in contrib - stringfunc is good idea - and maybe we don't need thinking about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not a English native speaker, so I have a different feeling. These functions do array_serialisation and array_deseralisation, but this names are too long. I have not idea about better names - it is descriptive well (for me) text-array, array-text - and these names shows very cleanly symmetry between functions. I have to repeat - it is very clean for not native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. array_split() and array_join(), following Perl? array_implode() and array_explode(), along the lines suggested by Brendan? 
I have a problem with array_split - because there string is split. I looked on net - and languages usually uses a split or join. split is method of str class in Java. So when I am following Perl, I feel better with just only split and join, but join is keyword :( - step back, maybe string_split X array_join ? select string_split('1,2,3,4',','); select array_join(array[1,2,3,4],','); so my preferences: 1. split, join - I checked - we are able to create join function 2. split, array_join - when only join can be a problem 3. string_split, array_join - there are not clean symmetry, but it respect wide used a semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot to like array_split - it is contradiction for me. Well, I guess I prefer my suggestion to any of those (I know... what a surprise), but I think I could live with #3, #4, or #5. It's hard for me to imagine that we really want to create a function called just join(), given the other meanings that JOIN already has in SQL. it hasn't any relation to SQL language - but I don't expect so some like this can be accepted by Tom :). So for this moment we are in agreement on #3, #4, #5. I think, we can wait one or two days for opinions of others - and than I'll fix patch. ok? Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multibyte charater set in levenshtein function
On Wed, Jul 21, 2010 at 7:40 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Wed, Jul 21, 2010 at 5:54 AM, Robert Haas robertmh...@gmail.com wrote: This patch still needs some work. It includes a bunch of stylistic changes that aren't relevant to the purpose of the patch. There's no reason that I can see to change the existing levenshtein_internal function to take text arguments instead of char *, or to change !m to m == 0 in existing code, or to change the whitespace in the comments of that function. All of those need to be reverted before we can consider committing this. I changed the arguments of the function from char * to text * in order to avoid a text_to_cstring call. *scratches head* Aren't you just moving the same call to a different place? The same benefit can be achieved by replacing char * with char * and a length. I changed !m to m == 0 because Itagaki asked me to make it conform to the coding style. Do you think there is no reason to fix coding style in existing code? Yeah, we usually try to avoid changing that sort of thing in existing code, unless there's a very good reason. There is a huge amount of duplicated code here. I think it makes sense to have the multibyte version of the function be separate, but perhaps we can merge the less-than-or-equal-to bits into the main code, so that we only have two copies instead of four. Perhaps we can just add a max_d argument (max_distance) to levenshtein_internal; if this value is >= 0 then it represents the max allowable distance, but if it is < 0 then there is no limit. Sure, that might slow down the existing code a bit, but it might not be significant. I'd at least like to see some numbers showing that it is significant before we go to this much trouble. In that case we would need to add many checks of max_d in the levenshtein_internal function, which makes the code more complex. When you say many checks, how many? Actually, we can merge all four functions into one function. 
But such a function would have many checks for multibyte encoding and max_d. So, I see four cases here:
1) one function with checks for multibyte encoding and max_d
2) two functions with checks for multibyte encoding
3) two functions with checks for max_d
4) four separate functions
If you prefer case number 3 you should argue your position a little more. I'm somewhat convinced that separating the multibyte case out has a performance benefit, both by intuition and because you posted some numbers, but I haven't seen any argument for separating out the other case, so I'm asking if you've checked whether there is an effect and whether it's significant. The default is always to try to avoid maintaining multiple copies of substantially identical code, due to the danger that a future patch might fail to update all of them and thus introduce a bug. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
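To make the max_d discussion concrete, here is a sketch (Python, single-byte view only — not the patch's C code) of a Levenshtein computation with the proposed cap: max_d >= 0 bounds the distance and allows an early exit, max_d < 0 means no limit. The cap is checked once per row of the DP table:

```python
def levenshtein(s, t, max_d=-1):
    """Classic dynamic-programming Levenshtein distance.
    If max_d >= 0, return max_d + 1 as soon as every entry in the
    current row exceeds max_d (the true distance can only grow).
    If max_d < 0, there is no limit."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            cur.append(min(cur[j - 1] + 1,        # insertion
                           prev[j] + 1,           # deletion
                           prev[j - 1] + cost))   # substitution
        if 0 <= max_d < min(cur):
            return max_d + 1  # distance provably exceeds the cap
        prev = cur
    return prev[-1]
```

This illustrates the trade-off debated above: the capped variant adds one extra check per row, which is the kind of overhead that would need benchmarking before merging the four C variants.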
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 2:25 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking so we have to do decision about string_to_array and array_to_string deprecation first. If these function will be deprecated, then we can use a similar names (and probably we should to use a similar names) - so text_to_array or array_to_string can be acceptable. If not, then this discus is needless - then to_string and to_array have to be maximally in contrib - stringfunc is good idea - and maybe we don't need thinking about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not a English native speaker, so I have a different feeling. These functions do array_serialisation and array_deseralisation, but this names are too long. I have not idea about better names - it is descriptive well (for me) text-array, array-text - and these names shows very cleanly symmetry between functions. I have to repeat - it is very clean for not native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. array_split() and array_join(), following Perl? 
array_implode() and array_explode(), along the lines suggested by Brendan? I have a problem with array_split - because there string is split. I looked on net - and languages usually uses a split or join. split is method of str class in Java. So when I am following Perl, I feel better with just only split and join, but join is keyword :( - step back, maybe string_split X array_join ? select string_split('1,2,3,4',','); select array_join(array[1,2,3,4],','); so my preferences: 1. split, join - I checked - we are able to create join function 2. split, array_join - when only join can be a problem 3. string_split, array_join - there are not clean symmetry, but it respect wide used a semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot to like array_split - it is contradiction for me. Well, I guess I prefer my suggestion to any of those (I know... what a surprise), but I think I could live with #3, #4, or #5. It's hard for me to imagine that we really want to create a function called just join(), given the other meanings that JOIN already has in SQL. it hasn't any relation to SQL language - but I don't expect so some like this can be accepted by Tom :). So for this moment we are in agreement on #3, #4, #5. I think, we can wait one or two days for opinions of others - and than I'll fix patch. ok? Yeah, I'd like some more votes, too. Aside from what I suggested (array_join/array_split), I think my favorite is your #5. We might also want to put some work into documenting the differences between the old and new functions clearly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Add column if not exists (CINE)
--On 1. Mai 2010 23:09:23 -0400 Robert Haas robertmh...@gmail.com wrote: On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: CREATE OR REPLACE is indeed much more complicated. In fact, for tables, I maintain that you'll need to link with -ldwim to make it work properly. This may in fact be an appropriate way to handle the case for tables, given the complexity of their definitions. Patch attached. I had an initial look at Robert's patch. Patch applies cleanly, documentation and regression tests included, everything works as expected. When looking at the functionality there's one thing that strikes me a little: be...@localhost:bernd #*= CREATE TABLE IF NOT EXISTS foo(id int); ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index" DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. This is what you get from concurrent CINE commands. The typname thingie may be confusing to inexperienced users, but I think it's hard to do anything about it? -- Thanks Bernd
Re: [HACKERS] dynamically allocating chunks from shared memory
Hi, first of all, thanks for your feedback, I enjoy the discussion. On 07/21/2010 07:25 PM, Robert Haas wrote: Given what you're trying to do, it does sound like you're going to need some kind of an algorithm for space management; but you'll be managing space within the SLRU rather than within shared_buffers. For example, you might end up putting a header on each SLRU page or segment and using that to track the available freespace within that segment for messages to be read and written. It'll probably be a bit more complex than the one for listen (see asyncQueueAddEntries). But what would that buy us? Also consider that pretty much all available dynamic allocators use shared memory (either from the OS directly, or via an mmap()'d area). Yes, imessages shouldn't ever be spilled to disk. There naturally must be an upper limit for them. (Be it total available memory, as for threaded things, or a given, size-constrained pool, as is the case for dynshmem.) I guess experience has taught me to be wary of things that are wired in memory. Under extreme memory pressure, something's got to give, or the whole system will croak. I absolutely agree with that last sentence. However, experience has taught /me/ to be wary of things that needlessly swap to disk for hours before reporting any kind of error (AKA swap hell). I prefer systems that adjust to the OOM condition, instead of just ignoring it and falling back to disk (which doesn't provide infinite space either, so that's just pushing the limits). The solution for imessages certainly isn't spilling to disk, which would consume even more resources. Instead the process(es) for which there are pending imessages should be allowed to consume them. That's why upon OOM, IMessageCreate currently simply blocks the process that wants to create an imessage. And yes, that's not quite perfect (that process should still consume messages for itself), and it might not play well with other potential users of dynamically allocated memory. 
But it certainly works better than spilling to disk (and yes, I tested that behavior within Postgres-R). Consider also the contrary situation, where the imessages stuff is not in use (even for a short period of time, like a few minutes). Then we'd really rather not still have memory carved out for it. Huh? That's exactly what dynamic allocation could give you: not having memory carved out for stuff you currently don't need, but instead being able to dynamically use memory where most needed. SLRU has memory (not disk space) carved out for pretty much every sub-system separately, if I'm reading that code correctly. I think what would be even better is to merge the SLRU pools with the shared_buffer pool, so that the two can duke it out for who is in most need of the limited amount of memory available. ..well, just add the shared_buffer pool to the list of candidates that could use dynamically allocated shared memory. It would need some thinking about boundaries (i.e. when to spill to disk, for those modules that /want/ to spill to disk) and dealing with OOM situations, but that's about it. Regards Markus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
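As a concrete illustration of the space-management bookkeeping both sides are describing — whether done by a dynamic allocator or by a per-segment free-space header — here is a toy first-fit allocator over a fixed pool (Python, purely illustrative; none of these names exist in the patches):

```python
class FixedPool:
    """Toy first-fit allocator over a fixed-size pool, tracking free
    extents as (offset, size) pairs. A shared-memory allocator (or a
    free-space header per SLRU segment) needs equivalent bookkeeping."""
    def __init__(self, size):
        self.free = [(0, size)]  # one big free extent initially

    def alloc(self, size):
        for i, (off, sz) in enumerate(self.free):
            if sz >= size:
                # carve the request out of the first fitting extent
                if sz == size:
                    del self.free[i]
                else:
                    self.free[i] = (off + size, sz - size)
                return off
        return None  # out of memory: caller must block or consume

    def dealloc(self, off, size):
        # naive free: re-insert the extent, then coalesce neighbors
        self.free.append((off, size))
        self.free.sort()
        merged = []
        for o, s in self.free:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + s)
            else:
                merged.append((o, s))
        self.free = merged
```

The `None` return corresponds to the OOM case discussed above, where IMessageCreate currently blocks the would-be sender instead of spilling to disk.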
Re: [HACKERS] managing git disk space usage
Aidan Van Dyk ai...@highrise.ca writes: * Robert Haas robertmh...@gmail.com [100720 13:04]: 3. Clone the origin once. Apply patches to multiple branches by switching branches. Playing around with it, this is probably a tolerable way to work when you're only going back one or two branches but it's certainly a big nuisance when you're going back 5-7 branches. This is what I do when I'm working on a project that has completely proper dependencies, and you don't need to always re-run configure between different branches. I use ccache heavily, so configure takes longer than a complete build with a couple-dozen actually-not-previously-seen changes... But *all* dependencies need to be proper in the build system, or you end up needing a git-clean-type cleanup between branch switches, forcing a new configure run too, which takes too much time... Maybe this will cause make dependencies to be refined in PG ;-) Well, there's also the VPATH possibility, where all your build objects are stored out of the way of the repo. So you could check out the branch you're interested in, change to the associated build directory and build there. And automate that of course. Regards, -- dim
Re: [HACKERS] documentation for committing with git
On ons, 2010-07-21 at 12:22 -0400, Robert Haas wrote: At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Looks good. Please consolidate this with the Committers page when the day comes. Comments: 3. ... your name and email address must match those configured on the server == How do we know what those are? Who controls that? 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. == This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushed. Some people might find that useful.
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 21:07, Peter Eisentraut pete...@gmx.net wrote: On ons, 2010-07-21 at 12:22 -0400, Robert Haas wrote: At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Looks good. Please consolidate this with the Committers page when the day comes. Comments: 3. ... your name and email address must match those configured on the server == How do we know what those are? Who controls that? sysadmins team. It's set up when committers are added, just like today's authormap on the git mirror. Before we set up the system, we'll double check all of them with each committer, of course. 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. == This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushes. Some people might find that useful. Indeed. Why don't I do that more often... +1 on making that a general recommendation, and have people only not do that if they really know what they're doing :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
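For anyone wanting to adopt the recommendation, the setting lives in ~/.gitconfig; note that later Git releases spell the equivalent value "upstream" ("tracking" being the older name):

```ini
# ~/.gitconfig -- "git push" pushes only the current branch,
# to the branch it tracks
[push]
	default = tracking   ; newer Git spells this "upstream"
```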
Re: [HACKERS] antisocial things you can do in git (but not CVS)
Jonathan Corbet wrote: 3. Merge commits. I believe that we have consensus that commits should always be done as a squash, so that the history of all of our branches is linear. But it seems to me that someone could accidentally push a merge commit, either because they forgot to squash locally, or because of a conflict between their local git repo's master branch and origin/master. Can we forbid this? That seems like a terrible idea to me - why would you destroy history? Obviously I've missed a discussion here. But, the first time somebody wants to use bisect to pinpoint a regression-causing patch, you'll wish you had that information there. We have a clear idea of what should be part of the public history contained in the authoritative repo and what should be history that is private to the developer/tester/committer. We don't want to pollute the former with the latter. The level of granularity of our current CVS commits seems to us to be about right. So when a committer pushes a patch it should add one fast-forward commit to the tree. We want to be able to bisect between these commit objects, but not between all the work product commits that led up to them. Of course, developers, committers and testers can keep what they like privately - we're only talking about what should go in the authoritative repo. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] need more ALTER TABLE guards for typed tables
After some investigation I figured that I need to add two more checks into the ALTER TABLE code to prevent certain types of direct changes to typed tables (see attached patch). But it's not clear to me whether such checks should go into the Prep or the Exec phases. Prep seems more plausible to me, but some commands such as DropColumn don't have a Prep handler. A clarification would be helpful.

Index: src/backend/commands/tablecmds.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/tablecmds.c,v
retrieving revision 1.332
diff -u -3 -p -r1.332 tablecmds.c
--- src/backend/commands/tablecmds.c	6 Jul 2010 19:18:56 -0000	1.332
+++ src/backend/commands/tablecmds.c	21 Jul 2010 14:34:41 -0000
@@ -5788,6 +5788,11 @@ ATPrepAlterColumnType(List **wqueue,
 	NewColumnValue *newval;
 	ParseState *pstate = make_parsestate(NULL);
 
+	if (rel->rd_rel->reloftype)
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("cannot alter column type of typed table")));
+
 	/* lookup the attribute so we can check inheritance status */
 	tuple = SearchSysCacheAttName(RelationGetRelid(rel), colName);
 	if (!HeapTupleIsValid(tuple))
@@ -7126,6 +7131,11 @@ ATExecAddInherit(Relation child_rel, Ran
 	int32		inhseqno;
 	List	   *children;
 
+	if (child_rel->rd_rel->reloftype)
+		ereport(ERROR,
+				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+				 errmsg("cannot change inheritance of typed table")));
+
 	/*
 	 * AccessShareLock on the parent is what's obtained during normal CREATE
 	 * TABLE ... INHERITS ..., so should be enough here.
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander mag...@hagander.net wrote: 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. == This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushed. Some people might find that useful. Indeed. Why don't I do that more often... +1 on making that a general recommendation, and have people only not do that if they really know what they're doing :-) Hmm, I didn't know about that option. What makes us think that's the behavior people will most often want? Because it doesn't seem like what I want, just for one example... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 21:20, Robert Haas robertmh...@gmail.com wrote: On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander mag...@hagander.net wrote: 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. == This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushes. Some people might find that useful. Indeed. Why don't I do that more often... +1 on making that a general recommendation, and have people only not do that if they really know what they're doing :-) Hmm, I didn't know about that option. What makes us think that's the behavior people will most often want? Because it doesn't seem like what I want, just for one example... It'd be what I want for everything *except* when doing backpatching. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
On Jul 21, 2010, at 2:20 PM, Robert Haas wrote: On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander mag...@hagander.net wrote: 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. == This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushes. Some people might find that useful. Indeed. Why don't I do that more often... +1 on making that a general recommendation, and have people only not do that if they really know what they're doing :-) Hmm, I didn't know about that option. What makes us think that's the behavior people will most often want? Because it doesn't seem like what I want, just for one example... So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. Regards, David -- David Christensen End Point Corporation da...@endpoint.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 3:23 PM, David Christensen da...@endpoint.com wrote: On Jul 21, 2010, at 2:20 PM, Robert Haas wrote: On Wed, Jul 21, 2010 at 3:11 PM, Magnus Hagander mag...@hagander.net wrote: 6. Finally, you must push your changes back to the server. git push This will push changes in all branches you've updated, but only branches that also exist on the remote side will be pushed; thus, you can have local working branches that won't be pushed. == This is true, but I have found it saner to configure push.default = tracking, so that only the current branch is pushes. Some people might find that useful. Indeed. Why don't I do that more often... +1 on making that a general recommendation, and have people only not do that if they really know what they're doing :-) Hmm, I didn't know about that option. What makes us think that's the behavior people will most often want? Because it doesn't seem like what I want, just for one example... So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. Sure, oops, but I would never do that. I'd stash it or put it on a topic branch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
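For anyone who wants to try the setting under discussion, it is a one-line configuration change, sketched below. (Note for later readers: newer Git releases renamed the `tracking` value to `upstream`, and Git 2.0 changed the built-in default to `simple`.)

```
$ git config --global push.default tracking
$ git push     # now pushes only the branch you are currently on,
               # to the branch it tracks
```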
Re: [HACKERS] managing git disk space usage
Excerpts from Dimitri Fontaine's message of mié jul 21 15:00:48 -0400 2010: Well, there's also the VPATH possibility, where all your build objects are stored out of the way of the repo. So you could check out the branch you're interested in, change to the associated build directory, and build there. And automate that, of course. This does not work as cleanly as you suppose, because some build objects are stored in the source tree, configure being one of them. So if you switch branches, configure is rerun even in a VPATH build, which is undesirable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
Excerpts from Robert Haas's message of mié jul 21 15:26:47 -0400 2010: So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. Sure, oops, but I would never do that. I'd stash it or put it on a topic branch. Somebody else will. Please remember you're writing docs that are not for yourself. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] antisocial things you can do in git (but not CVS)
Excerpts from Andrew Dunstan's message of mié jul 21 15:11:41 -0400 2010: Jonathan Corbet wrote: That seems like a terrible idea to me - why would you destroy history? Obviously I've missed a discussion here. But, the first time somebody wants to use bisect to pinpoint a regression-causing patch, you'll wish you had that information there. So when a committer pushes a patch it should add one fast-forward commit to the tree. We want to be able to bisect between these commit objects, but not between all the work product commits that led up to them. Of course, developers, committers and testers can keep what they like privately - we're only talking about what should go in the authoritative repo. I don't disagree that we're going to squash commits, but I don't believe that developers will be able to keep what they like privately. The commit objects for the final patch are going to differ, if only because they have different parents than the ones on the main branch. Of course, they will be able to have a local branch with their local patch, but to Git there will be no relationship between this branch and the final, squashed patch in the authoritative repo. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
Robert Haas wrote: At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Note that while anyone is welcome to comment, I mostly care about whether the document is adequate for our existing committers, rather than whether someone who is not a committer thinks we should manage the project differently... that might be an interesting discussion, but we're theoretically making this switch in about a month, and getting agreement on changing our current workflow will take about a decade, so there is not time now to do the latter before we do the former. So I would ask everyone to consider postponing those discussions until after we've made the switch and ironed out the kinks. On the other hand, if you have technical corrections, or if you have suggestions on how to do the same things better (rather than suggestions on what to do differently), that would be greatly appreciated. Well, either we have a terminology problem or a statement of policy that I'm not sure I agree with, in point 2. IMNSHO, what we need to forbid is commits that are not fast-forward commits, i.e. that do not have the current branch head as an ancestor, ideally as the immediate ancestor. Personally, I have a strong opinion that for everything but totally trivial patches, the committer should create a short-lived work branch where all the work is done, and then do a squash merge back to the main branch, which is then pushed. This pattern is not mentioned at all. In my experience, it is essential, especially if you're working on more than one thing at a time, as many people often are. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 21:37, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Note that while anyone is welcome to comment, I mostly care about whether the document is adequate for our existing committers, rather than whether someone who is not a committer thinks we should manage the project differently... that might be an interesting discussion, but we're theoretically making this switch in about a month, and getting agreement on changing our current workflow will take about a decade, so there is not time now to do the latter before we do the former. So I would ask everyone to consider postponing those discussions until after we've made the switch and ironed out the kinks. On the other hand, if you have technical corrections, or if you have suggestions on how to do the same things better (rather than suggestions on what to do differently), that would be greatly appreciated. Well, either we have a terminology problem or a statement of policy that I'm not sure I agree with, in point 2. IMNSHO, what we need to forbid is commits that are not fast-forward commits, i.e. that do not have the current branch head as an ancestor, ideally as the immediate ancestor. Personally, I have a strong opinion that for everything but totally trivial patches, the committer should create a short-lived work branch where all the work is done, and then do a squash merge back to the main branch, which is then pushed. This pattern is not mentioned at all. In my experience, it is essential, especially if you're working on more than one thing at a time, as many people often are. Uh, that's going to create an actual merge commit, no? Or you mean squash-merge-but-only-fast-forward? 
I *think* the docs are based off the pattern of the committer having two repositories - one for his own work, one for committing, much like I assume all of us have today in cvs. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
On Jul 21, 2010, at 2:39 PM, Magnus Hagander wrote: On Wed, Jul 21, 2010 at 21:37, Andrew Dunstan and...@dunslane.net wrote: Robert Haas wrote: At the developer meeting, I promised to do the work of documenting how committers should use git. So here's a first version. http://wiki.postgresql.org/wiki/Committing_with_Git Note that while anyone is welcome to comment, I mostly care about whether the document is adequate for our existing committers, rather than whether someone who is not a committer thinks we should manage the project differently... that might be an interesting discussion, but we're theoretically making this switch in about a month, and getting agreement on changing our current workflow will take about a decade, so there is not time now to do the latter before we do the former. So I would ask everyone to consider postponing those discussions until after we've made the switch and ironed out the kinks. On the other hand, if you have technical corrections, or if you have suggestions on how to do the same things better (rather than suggestions on what to do differently), that would be greatly appreciated. Well, either we have a terminology problem or a statement of policy that I'm not sure I agree with, in point 2. IMNSHO, what we need to forbid is commits that are not fast-forward commits, i.e. that do not have the current branch head as an ancestor, ideally as the immediate ancestor. Personally, I have a strong opinion that for everything but totally trivial patches, the committer should create a short-lived work branch where all the work is done, and then do a squash merge back to the main branch, which is then pushed. This pattern is not mentioned at all. In my experience, it is essential, especially if you're working on more than one thing at a time, as many people often are. Uh, that's going to create an actual merge commit, no? Or you mean squash-merge-but-only-fast-forward? 
I *think* the docs are based off the pattern of the committer having two repositories - one for his own work, one for committing, much like I assume all of us have today in cvs. You can also do a rebase after the merge to remove the local merge commit before pushing. I tend to do this anytime I merge a local branch, just to rebase on top of the most recent origin/master. Regards, David -- David Christensen End Point Corporation da...@endpoint.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] need more ALTER TABLE guards for typed tables
Excerpts from Peter Eisentraut's message of mié jul 21 15:18:58 -0400 2010: After some investigation I figured that I need to add two more checks into the ALTER TABLE code to prevent certain types of direct changes to typed tables (see attached patch). But it's not clear to me whether such checks should go into the Prep or the Exec phases. Prep seems more plausible to me, but some commands such as DropColumn don't have a Prep handler. A clarification would be helpful. I think if there's no Prep phase, you should add it. I don't think it makes sense to have this kind of check in Exec. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] documentation for committing with git
Magnus Hagander wrote: Personally, I have a strong opinion that for everything but totally trivial patches, the committer should create a short-lived work branch where all the work is done, and then do a squash merge back to the main branch, which is then pushed. This pattern is not mentioned at all. In my experience, it is essential, especially if you're working on more than one thing at a time, as many people often are. Uh, that's going to create an actual merge commit, no? Or you mean squash-merge-but-only-fast-forward? Yes, exactly that. Something like: git checkout -b myworkbranch ... work, test, commit, lather, rinse, repeat ... git checkout RELn_m_STABLE git pull git merge --squash myworkbranch git commit git push (note the git commit: a squash merge only stages the changes, it does not commit them) I *think* the docs are based off the pattern of the committer having two repositories - one for his own work, one for committing, much like I assume all of us have today in cvs. So then what? After you've done your work you'll still need to pull the stuff somehow into your commit tree. I don't think this will buy you a lot. I usually clone the whole CVS tree for non-trivial work, but I'm not sure that's an ideal work pattern. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] antisocial things you can do in git (but not CVS)
On Wed, 21 Jul 2010 15:11:41 -0400 Andrew Dunstan and...@dunslane.net wrote: We have a clear idea of what should be part of the public history contained in the authoritative repo and what should be history that is private to the developer/tester/committer. We don't want to pollute the former with the latter. The thought makes me shudder...you lose the history, the reasons for specific changes, the authorship of changes, and the ability of your testers to pinpoint problematic changes. But...your project, your decision...we'll keep using PostgreSQL regardless...:) Thanks, jon -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Git conversion progress report and call for testing assistance
Here's a status update on the git conversion, as well as a call for some help, mainly in testing. After testing a bunch of tools, I've found that using cvs2git is by far the best option when keeping keywords. It's the one that gives only the issues that I posted about a couple of days ago. So I've proceeded based off this one to something that would be how we create the real repository once we go. This means I've scripted the removal of the $PostgreSQL$ tags from the tip of the active branches as one big commit after the migration. I've also set up the git server and the scripts around it, that we can eventually use. This includes commit email sending, commit policy enforcement (no merge commits, correct author/committer tag etc) and proper access control (a modified version of the one on git.postgresql.org - since we definitely don't want any external dependencies for the main repository). This is all available for testing now. Marc has set up a mailinglist at pgsql-committers-t...@postgresql.org where commit messages from the new system are sent. If you care about what they look like, subscribe there and wait for one to show up :-) Subscription is done the usual way. Anonymous users can view the repository at git.postgresql.org using gitweb or the git:// protocol, under the name postgresql-migration. DISCLAIMER: DO NOT BASE ANY WORK OFF THIS REPOSITORY. IT *WILL* BE RECREATED SEVERAL TIMES AND MAY CHANGE COMPLETELY! Existing committers have been set up to access the new repository at ssh://g...@gitmaster.postgresql.org/postgresql.git. Robert Haas has written some instructions for how to use this - please read and review those. And in general, a call to committers: please test this! Now is the time, not after we've migrated ;) Just throw in some random commits, and some non-random ones, both to get yourself familiar with the workflow and to iron out the bugs in the scripts (I'm sure they're there). 
For those interested in what's done, the scripts running this are all up on github at http://github.com/mhagander/pg_githooks. The root contains the scripts for commit messages, policy enforcement and access control. There's also a temporary directory called migration that contains the scripts and configuration files that are in use for the version of the repository that is up there now. There's some minor plumbing around these that isn't up there yet, but in general it is all that's used. And note that if you want to play with it, the script uses around 8Gb of temp disk space when running, so make sure you have enough space if you do it in a VM... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
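The merge-commit policy enforcement lives in the pg_githooks scripts linked above; the snippet below is only a hypothetical sketch of the git plumbing such a server-side check rests on, not the actual hook code. It builds a throwaway repository containing one merge commit and shows the detection firing: `git rev-list --merges` lists only commits with more than one parent, so a non-empty result for a pushed range means the push contains a merge.

```shell
#!/bin/sh
# Hypothetical sketch (not the actual pg_githooks code).
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email you@example.com
git config user.name "You"
echo a > f; git add f; git commit -qm base
git checkout -qb topic
echo b >> f; git commit -qam topic-work
git checkout -q -                   # back to the default branch
echo c > g; git add g; git commit -qm other-work
git merge -q --no-edit topic        # creates a merge commit

# The check a pre-receive hook could run on the pushed range:
if [ -n "$(git rev-list --merges HEAD)" ]; then
    echo "push would be rejected: history contains a merge commit"
fi
```

In a real pre-receive hook the range to inspect arrives on stdin as "oldrev newrev refname" lines rather than being HEAD.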
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 3:31 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Robert Haas's message of mié jul 21 15:26:47 -0400 2010: So you're working on some back branch, and make a WIP commit so you can switch to master to make a quick commit. Create a push on master. Bare git push. WIP commit gets pushed upstream. Oops. Sure, oops, but I would never do that. I'd stash it or put it on a topic branch. Somebody else will. Please remember you're writing docs that are not for yourself. I don't have any problem suggesting it for those who may want it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] git config user.email
We need to decide what email addresses committers will use on the new git repository when they commit. Although I think we have more votes (at least from committers) for always having author == committer, rather than possibly setting the author tag to some other value, the issue exists independently of that. I believe we want to try to set things up so that committers will not need to change the email address they are using to commit even if their employment situation changes. Because if that happens, then it becomes more difficult to keep track of who is who. My initial suggestion was to say that everyone should just be usern...@postgresql.org; but I think that met with some resistance. Magnus, for example, tells me that he is a committer for multiple projects, and is mag...@hagander.net at all of them. Since that's a domain name he owns personally, it seems safe enough. But I'm inclined to think we should avoid things like rh...@commandprompt.com, just on the off chance JD decides to fire me. Of course, I expect there might be some dissenting voices on this point, so... thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
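Whatever address policy is settled on, the mechanics are a one-time configuration step per committer (the addresses below are placeholders):

```
$ git config --global user.name "Your Name"
$ git config --global user.email username@postgresql.org

# or, to use a different address in just one clone:
$ cd postgresql
$ git config user.email username@postgresql.org
```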
[HACKERS] Disk caching
Hi to all. I am trying to see how PostgreSQL performance changes on the basis of work_mem. So, I am going to execute the 22 queries of TPCH (http://www.tpc.org/tpch/) again and again, each time for a different value of work_mem. Since I am interested just in work_mem variations, I should prevent each query from taking advantage of previous executions of the 22 queries themselves. For example, taking cache advantages. So, taking into account that the 22 queries are those http://pastebin.com/7Dg50YRZ and are executed on tables of hundreds of MB: 1) Is it sufficient to change the value of work_mem through psql and run the queries again without restarting postgres? 2) Or, should I restart postgres? 3) Or, should I restart the machine each time I execute the 22 queries? Thanks for your time. Regards. Manolo. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
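For reference on question 1: work_mem is a per-session setting, so no restart is needed for the setting itself to take effect; restarting the server (and dropping the OS page cache) only matters if cold-cache timings are wanted. A sketch:

```sql
-- In psql: takes effect immediately for this session.
SET work_mem = '64MB';
SHOW work_mem;

-- For cold-cache runs between query batches (Linux, as root),
-- something along these lines from the shell:
--   pg_ctl restart -D $PGDATA
--   sync; echo 3 > /proc/sys/vm/drop_caches
```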
Re: [HACKERS] patch: to_string, to_array functions
2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 2:25 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 1:48 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2010/7/21 Robert Haas robertmh...@gmail.com: On Wed, Jul 21, 2010 at 12:08 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am thinking so we have to do decision about string_to_array and array_to_string deprecation first. If these function will be deprecated, then we can use a similar names (and probably we should to use a similar names) - so text_to_array or array_to_string can be acceptable. If not, then this discus is needless - then to_string and to_array have to be maximally in contrib - stringfunc is good idea - and maybe we don't need thinking about new names. Well, -1 from me for deprecating string_to_array and array_to_string. I am not in favor of the names to_string and to_array even if we put them in contrib, though. The problem with string_to_array and array_to_string is that they aren't descriptive enough, and to_string/to_array is even less so. I am not a English native speaker, so I have a different feeling. These functions do array_serialisation and array_deseralisation, but this names are too long. I have not idea about better names - it is descriptive well (for me) text-array, array-text - and these names shows very cleanly symmetry between functions. I have to repeat - it is very clean for not native speaker. Well, the problem is that array_to_string(), for example, tells you that an array is being converted to a string, but not how. And to_string() tells you that you're getting a string, but it doesn't tell you either what you're getting it from or how you're getting it. We already have a function to_char() which can be used to format a whole bunch of different types as strings; I can't see adding a new function with almost the same name that does something completely different. 
array_split() and array_join(), following Perl? array_implode() and array_explode(), along the lines suggested by Brendan? I have a problem with array_split - because there string is split. I looked on net - and languages usually uses a split or join. split is method of str class in Java. So when I am following Perl, I feel better with just only split and join, but join is keyword :( - step back, maybe string_split X array_join ? select string_split('1,2,3,4',','); select array_join(array[1,2,3,4],','); so my preferences: 1. split, join - I checked - we are able to create join function 2. split, array_join - when only join can be a problem 3. string_split, array_join - there are not clean symmetry, but it respect wide used a semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot to like array_split - it is contradiction for me. Well, I guess I prefer my suggestion to any of those (I know... what a surprise), but I think I could live with #3, #4, or #5. It's hard for me to imagine that we really want to create a function called just join(), given the other meanings that JOIN already has in SQL. it hasn't any relation to SQL language - but I don't expect so some like this can be accepted by Tom :). So for this moment we are in agreement on #3, #4, #5. I think, we can wait one or two days for opinions of others - and than I'll fix patch. ok? Yeah, I'd like some more votes, too. Aside from what I suggested (array_join/array_split), I think my favorite is your #5. ok #5 - it is absolutely out of me - explode, implode are used in Czech only with relation to bombs. In this moment I have a problem to decide what is related to string_to_array and array_to_string - it is nothing against to your opinion, just it means, so it hasn't any meaning for me - and probably for lot of foreign developers. But I found on net, that people use this names. 
We might also want to put some work into documenting the differences between the old and new functions clearly. sure Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
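For readers following the naming debate, the existing functions being discussed behave like this; the candidate names (string_split, array_join, explode/implode, etc.) are proposals only at this point:

```sql
SELECT string_to_array('1,2,3,4', ',');       -- {1,2,3,4}
SELECT array_to_string(ARRAY[1,2,3,4], ',');  -- 1,2,3,4
```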
Re: [HACKERS] patch: to_string, to_array functions
On Wed, Jul 21, 2010 at 2:28 PM, Robert Haas robertmh...@gmail.com wrote: Yeah, I'd like some more votes, too. Aside from what I suggested (array_join/array_split), I think my favorite is your #5. -1 for me for any name that is of the form of: type_operation(); we don't have bytea_encode, array_unnest(), date_to_char(), etc. the non-internal ones that we do have (mostly array funcs), are improperly named imo. this is sql, not c. suppose we want to extend string serialization to row types? why not serialize/unserialize? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multibyte charater set in levenshtein function
On Wed, Jul 21, 2010 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote: *scratches head* Aren't you just moving the same call to a different place? So, where you can find this different place? :) In this patch null-terminated strings are not used at all. Yeah, we usually try to avoid changing that sort of thing in existing code, unless there's a very good reason. Ok. In these case we should add many checks of max_d in levenshtein_internal function which make code more complex. When you say many checks, how many? Actually, we can merge all four functions into one function. But such function will have many checks about multibyte encoding and max_d. So, I see four cases here: 1) one function with checks for multibyte encoding and max_d 2) two functions with checks for multibyte encoding 3) two functions with checks for max_d 4) four separate functions If you prefer case number 3 you should argue your position little more. I'm somewhat convinced that separating the multibyte case out has a performance benefit both by intuition and because you posted some numbers, but I haven't seen any argument for separating out the other case, so I'm asking if you've checked and whether there is an effect and whether it's significant. The default is always to try to avoid maintaining multiple copies of substantially identical code, due to the danger that a future patch might fail to update all of them and thus introduce a bug. I've tested it with big value of max_d and I thought that it's evident that checking for negative value of max_d will not produce significant benefit. Anyway, I tried to add checking for negative max_d into levenshtein_less_equal_mb function. 
static int levenshtein_less_equal_internal_mb(char *s, char *t, int s_len, int t_len, int ins_c, int del_c, int sub_c, int max_d) { int m, n; int *prev; int *curr; int i, j; const char *x; const char *y; CharLengthAndOffset *lengths_and_offsets; int y_char_len; int curr_left, curr_right, prev_left, prev_right, d; int delta, min_d; /* * We should calculate number of characters for multibyte encodings */ m = pg_mbstrlen_with_len(s, s_len); n = pg_mbstrlen_with_len(t, t_len); /* * We can transform an empty s into t with n insertions, or a non-empty t * into an empty s with m deletions. */ if (m == 0) return n * ins_c; if (n == 0) return m * del_c; /* * We can find the minimal distance by the difference of lengths */ delta = m - n; if (delta > 0) min_d = delta * del_c; else if (delta < 0) min_d = - delta * ins_c; else min_d = 0; if (max_d >= 0 && min_d > max_d) return max_d + 1; /* * For security concerns, restrict excessive CPU+RAM usage. (This * implementation uses O(m) memory and has O(mn) complexity.) */ if (m > MAX_LEVENSHTEIN_STRLEN || n > MAX_LEVENSHTEIN_STRLEN) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("argument exceeds the maximum length of %d bytes", MAX_LEVENSHTEIN_STRLEN))); /* One more cell for initialization column and row. */ ++m; ++n; /* * Instead of building an (m+1)x(n+1) array, we'll use two different * arrays of size m+1 for storing accumulated values. At each step one * represents the previous row and one is the current row of the * notional large array. * For multibyte encoding we'll also store array of lengths of * characters and array with character offsets in first string * in order to avoid great number of * pg_mblen calls. 
*/ prev = (int *) palloc((2 * sizeof(int) + sizeof(CharLengthAndOffset)) * m ); curr = prev + m; lengths_and_offsets = (CharLengthAndOffset *)(prev + 2 * m); lengths_and_offsets[0].offset = 0; for (i = 0, x = s; i < m - 1; i++) { lengths_and_offsets[i].length = pg_mblen(x); lengths_and_offsets[i + 1].offset = lengths_and_offsets[i].offset + lengths_and_offsets[i].length; x += lengths_and_offsets[i].length; } lengths_and_offsets[i].length = 0; /* Initialize the previous row to 0..cols */ curr_left = 1; d = min_d; for (i = 0; i < delta; i++) { prev[i] = d; } curr_right = m; for (; i < m; i++) { prev[i] = d; d += (ins_c + del_c); if (max_d >= 0 && d > max_d) { curr_right = i; break; } } /* * There are following optimizations: * 1) Actually the minimal possible value of final distance (in the case of * all possible matches) is stored in the cells of the matrix. In the case * of movement towards diagonal, which contain last cell, value
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 3:37 PM, Andrew Dunstan and...@dunslane.net wrote:

Well, either we have a terminology problem or a statement of policy that I'm not sure I agree with, in point 2. IMNSHO, what we need to forbid is commits that are not fast-forward commits, i.e. that do not have the current branch head as an ancestor, ideally as the immediate ancestor.

There are two separate questions here. One is whether an update to a ref is fast-forward or history rewriting, and the other is whether it is a merge commit or not. I don't believe that we want either history-rewriting commits or merge commits to get pushed, but this paragraph is about merge commits.

Personally, I have a strong opinion that for everything but totally trivial patches, the committer should create a short-lived work branch where all the work is done, and then do a squash merge back to the main branch, which is then pushed. This pattern is not mentioned at all. In my experience, it is essential, especially if you're working on more than one thing at a time, as many people often are.

git merge --squash doesn't create a merge commit. Indeed, the whole point is to create a commit which essentially encapsulates the same diff as a merge commit but actually isn't one. From the man page:

    Produce the working tree and index state as if a real merge happened
    (except for the merge information), but do not actually make a commit or
    move the HEAD, nor record $GIT_DIR/MERGE_HEAD to cause the next git
    commit command to create a merge commit.

As for whether to discuss the use of git merge --squash, I could go either way on that. Personally, my preferred workflow is to do 'git rebase -i master' on a topic branch, squash all the commits, and then switch to the master branch and do 'git merge otherbranch', resulting in a fast-forward merge with no merge commit. But there are many other ways to do it, including 'git merge --squash' and the already-mentioned 'git commit -a'.
I think there's a risk of this turning into a complete tutorial on git, which might detract from its primary purpose of explaining to committers how to get a basic, working setup in place. But we can certainly add whatever you think is important, or maybe some language indicating that 'git commit -a' is just an EXAMPLE of how to create a commit... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
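The two no-merge-commit workflows described above (a squash merge from a short-lived work branch, or an interactive rebase followed by a fast-forward merge) can be sketched in a throwaway repository. This is only an illustration: the repository path, branch name `topic`, file names, and committer identity below are all made up.

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.name "Example Committer"
git config user.email "committer@example.org"
main=$(git symbolic-ref --short HEAD)   # default branch name varies by git version

echo base > file.txt && git add file.txt && git commit -qm "base commit"

# Do the work on a short-lived branch, in as many small commits as needed.
git checkout -qb topic
echo one >> file.txt && git commit -qam "wip: step 1"
echo two >> file.txt && git commit -qam "wip: step 2"

# Squash the whole branch back as one ordinary (non-merge) commit.
git checkout -q "$main"
git merge --squash -q topic    # stages the combined diff; does NOT commit
git commit -qm "feature: applied as a single clean commit"

git rev-list --count HEAD            # total commits on the main branch
git rev-list --merges --count HEAD   # merge commits: none were created
```

The end state is exactly what the thread argues for: the main branch advances by one ordinary commit carrying the combined diff, and `git rev-list --merges` confirms no merge commit exists.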
Re: [HACKERS] review: psql: edit function, show function commands patch
Pavel Stehule pavel.steh...@gmail.com writes:

  CREATE OR REPLACE FUNCTION public.foo()
   RETURNS integer
   LANGUAGE plpgsql
  1 AS $function$ begin
  2 return 10/0;
  3 end; $function$

This is a very trivial example - for more complex functions, correct line numbering is more useful.

I completely agree with this; in-function line numbering is a must-have. I'd like psql to handle that better. That said, I usually edit functions in Emacs on my workstation. I did implement a linum-mode extension to show PL/pgSQL line numbers in addition to the buffer line numbers in Emacs, but it failed to work with the "AS $function$ begin on the same line" example above. It's fixed in the attached, should there be any users of it.

Regards,
--
dim

dim-pgsql.el
Description: pgsql setup for emacs
Re: [HACKERS] managing git disk space usage
Alvaro Herrera alvhe...@commandprompt.com writes:

This does not work as cleanly as you suppose, because some build objects are stored in the source tree, configure being one of them. So if you switch branches, configure is rerun even in a VPATH build, which is undesirable.

Ouch. Reading -hackers had led me to think this had received a cleaning effort in the Makefiles, so that any generated file would appear in the build directory. Sorry to learn that's not (yet?) the case.

Regards,
--
dim
Re: [HACKERS] Add column if not exists (CINE)
On Wed, Jul 21, 2010 at 2:53 PM, Bernd Helmle maili...@oopsware.de wrote:

--On 1. Mai 2010 23:09:23 -0400 Robert Haas robertmh...@gmail.com wrote:

On Wed, Apr 28, 2010 at 9:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: CREATE OR REPLACE is indeed much more complicated. In fact, for tables, I maintain that you'll need to link with -ldwim to make it work properly.

This may in fact be an appropriate way to handle the case for tables, given the complexity of their definitions. Patch attached.

I had an initial look at Robert's patch. The patch applies cleanly, documentation and regression tests are included, and everything works as expected. When looking at the functionality there's one thing that strikes me a little:

be...@localhost:bernd #*= CREATE TABLE IF NOT EXISTS foo(id int);
ERROR:  duplicate key value violates unique constraint "pg_type_typname_nsp_index"
DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.

This is what you get from concurrent CINE commands. The typname thingie might be confusing to inexperienced users, but I think it's hard to do anything about it?

I get the same error message from concurrent CREATE TABLE commands even without CINE...

S1: rhaas=# begin;
    BEGIN
    rhaas=# create table foo (id int);
    CREATE TABLE
S2: rhaas=# begin;
    BEGIN
    rhaas=# create table foo (id int);
    [blocks]
S1: rhaas=# commit;
    COMMIT
S2: ERROR:  duplicate key value violates unique constraint "pg_type_typname_nsp_index"
    DETAIL:  Key (typname, typnamespace)=(foo, 2200) already exists.

I agree it would be nice to fix this. I'm not sure how hard it is. I don't think it's the job of this patch. :-)

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] git config user.email
Robert Haas wrote:

We need to decide what email addresses committers will use on the new git repository when they commit. Although I think we have more votes (at least from committers) for always having author == committer, rather than possibly setting the author tag to some other value, the issue exists independently of that. I believe we want to try to set things up so that committers will not need to change the email address they are using to commit even if their employment situation changes, because if that happens, it becomes more difficult to keep track of who is who. My initial suggestion was to say that everyone should just be usern...@postgresql.org; but I think that met with some resistance. Magnus, for example, tells me that he is a committer for multiple projects, and is mag...@hagander.net at all of them. Since that's a domain name he owns personally, it seems safe enough. But I'm inclined to think we should avoid things like rh...@commandprompt.com, just on the off chance JD decides to fire me. Of course, I expect there might be some dissenting voices on this point, so... thoughts?

Do we care that much? I agree it should probably be something permanent, which could rule out employment-based addresses, but it doesn't strike me as a big deal.

cheers

andrew
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 5:03 PM, Robert Haas robertmh...@gmail.com wrote:

working setup in place. But we can certainly add whatever you think is important, or maybe some language indicating that 'git commit -a' is just an EXAMPLE of how to create a commit...

I took a crack at this, as well as incorporating some of the other suggestions that have been made. I'm sure it's not perfect, but maybe it's an improvement...

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] multibyte charater set in levenshtein function
Excerpts from Robert Haas's message of mié jul 21 14:25:47 -0400 2010:

On Wed, Jul 21, 2010 at 7:40 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Wed, Jul 21, 2010 at 5:54 AM, Robert Haas robertmh...@gmail.com wrote: Same benefit can be achieved by replacing char * with char * and length. I changed !m to m == 0 because Itagaki asked me to make it conform to the coding style. Do you think there is no reason to fix coding style in existing code?

Yeah, we usually try to avoid changing that sort of thing in existing code, unless there's a very good reason.

I think fixing a stylistic issue in code that's being edited for other purposes is fine, and a good idea going forward. We wouldn't commit a patch that would *only* fix those, because that would cause a problem for backpatches for no benefit, but if the patch touches something else, then a backpatch of another patch is going to need manual intervention anyway.
Re: [HACKERS] git config user.email
Excerpts from Robert Haas's message of mié jul 21 12:54:36 -0400 2010:

My initial suggestion was to say that everyone should just be usern...@postgresql.org; but I think that met with some resistance. Magnus, for example, tells me that he is a committer for multiple projects, and is mag...@hagander.net at all of them. Since that's a domain name he owns personally, it seems safe enough. But I'm inclined to think we should avoid things like rh...@commandprompt.com, just on the off chance JD decides to fire me.

I have a mild preference for alvhe...@alvh.no-ip.org over @postgresql.org. If other committers are going to use personal addresses, I'll use mine as well.
[HACKERS] accentuated letters in text-search
Hi. I was googling for how to create a text-search-config with the following properties:
- Map unicode accentuated letters to an un-accentuated equivalent
- No stop-words
- Lowercase all words

And came across this from -general: http://www.techienuggets.com/Comments?tx=106813

Then after some more googling I found this: http://www.sai.msu.su/~megera/wiki/unaccent

Any reason the unaccent dict. and function did not make it into 9.0?

--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO, OfficeNet AS
Rosenholmveien 25, 1414 Trollåsen, NORWAY
Tlf: +47 24 15 38 90 | Fax: +47 24 15 38 91 | Mobile: +47 909 56 963
"The most difficult thing in the world is to know how to do a thing and to watch somebody else doing it wrong, without comment."
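For what it's worth, the three properties above can be obtained with contrib/unaccent roughly as follows. This is only a sketch, assuming the unaccent module is installed and visible in the search path; the configuration name no_accents is made up. The unaccent dictionary is a filtering dictionary that strips accents and passes the token on, while the simple dictionary lowercases and, by default, keeps every word (i.e. no stop words):

```sql
-- Sketch only: assumes contrib/unaccent is installed.
CREATE TEXT SEARCH CONFIGURATION no_accents ( COPY = simple );

ALTER TEXT SEARCH CONFIGURATION no_accents
    ALTER MAPPING FOR hword, hword_part, word
    WITH unaccent, simple;

SELECT to_tsvector('no_accents', 'Hôtels de la Mer');
SELECT ts_lexize('unaccent', 'Hôtel');  -- {Hotel}: accents stripped; lowercasing is done by "simple"
```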
Re: [HACKERS] multibyte charater set in levenshtein function
On Wed, Jul 21, 2010 at 2:47 PM, Alexander Korotkov aekorot...@gmail.com wrote:

On Wed, Jul 21, 2010 at 10:25 PM, Robert Haas robertmh...@gmail.com wrote: *scratches head* Aren't you just moving the same call to a different place?

So, where can you find this different place? :) In this patch null-terminated strings are not used at all.

I can't. You win. :-)

Actually, I wonder if there's enough performance improvement there that we might think about extracting that part of the patch and applying it separately. Then we could continue trying to figure out what to do with the rest. Sometimes it's simpler to deal with one change at a time.

I tested it with the american-english dictionary, with 98569 words.

test=# select sum(levenshtein(word, 'qwerqwerqwer')) from words;
   sum
---------
 1074376
(1 row)
Time: 131,435 ms

test=# select sum(levenshtein_less_equal(word, 'qwerqwerqwer', 100)) from words;
   sum
---------
 1074376
(1 row)
Time: 221,078 ms

test=# select sum(levenshtein_less_equal(word, 'qwerqwerqwer', -1)) from words;
   sum
---------
 1074376
(1 row)
Time: 254,819 ms

The function with a negative value of max_d didn't become faster than with just a big value of max_d.

Ah, I see. That's pretty compelling, I guess. Although it still seems like a lot of code...

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
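The early-exit behavior being benchmarked above can be illustrated with a stand-alone, single-byte sketch. This is a deliberate simplification of the idea in the patch, not the patch itself: unit costs, malloc instead of palloc, and no multibyte handling. The key invariant is that the final distance can never be smaller than the minimum of the current row, so once every cell of a row exceeds max_d the function can return max_d + 1 immediately.

```c
#include <stdlib.h>
#include <string.h>

static int
min3(int a, int b, int c)
{
    int m = a < b ? a : b;

    return m < c ? m : c;
}

/*
 * Returns the Levenshtein distance between s and t, or max_d + 1 as soon
 * as it can prove the distance exceeds max_d.  Passing max_d < 0 disables
 * the cutoff, giving the plain O(mn) computation.
 */
static int
levenshtein_less_equal(const char *s, const char *t, int max_d)
{
    int     m = (int) strlen(s);
    int     n = (int) strlen(t);
    int    *prev = malloc((m + 1) * sizeof(int));
    int    *curr = malloc((m + 1) * sizeof(int));
    int     i, j, result;

    /* Row 0: transforming the empty prefix of t costs i deletions. */
    for (i = 0; i <= m; i++)
        prev[i] = i;

    for (j = 1; j <= n; j++)
    {
        int row_min;

        curr[0] = j;
        row_min = curr[0];
        for (i = 1; i <= m; i++)
        {
            int sub = prev[i - 1] + (s[i - 1] != t[j - 1]);

            curr[i] = min3(prev[i] + 1, curr[i - 1] + 1, sub);
            if (curr[i] < row_min)
                row_min = curr[i];
        }

        /*
         * Every cell of later rows is derived from this row with
         * non-negative added cost, so if the whole row is already above
         * max_d, the final distance must be too: bail out early.
         */
        if (max_d >= 0 && row_min > max_d)
        {
            free(prev);
            free(curr);
            return max_d + 1;
        }
        memcpy(prev, curr, (m + 1) * sizeof(int));
    }
    result = prev[m];
    free(prev);
    free(curr);
    return result;
}
```

With the cutoff disabled (max_d = -1) this behaves exactly like plain levenshtein, which matches the benchmark observation that a negative max_d cannot be faster than a large one: the cutoff test still runs once per row either way.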
Re: [HACKERS] dynamically allocating chunks from shared memory
On Wed, Jul 21, 2010 at 2:53 PM, Markus Wanner mar...@bluegap.ch wrote:

Consider also the contrary situation, where the imessages stuff is not in use (even for a short period of time, like a few minutes). Then we'd really rather not still have memory carved out for it.

Huh? That's exactly what dynamic allocation could give you: not having memory carved out for stuff you currently don't need, but instead being able to dynamically use memory where it is most needed.

SLRU has memory (not disk space) carved out for pretty much every sub-system separately, if I'm reading that code correctly.

Yeah, I think you are right. :-(

I think what would be even better is to merge the SLRU pools with the shared_buffer pool, so that the two can duke it out for who is in most need of the limited amount of memory available.

..well, just add the shared_buffer pool to the list of candidates that could use dynamically allocated shared memory. It would need some thinking about boundaries (i.e. when to spill to disk, for those modules that /want/ to spill to disk) and dealing with OOM situations, but that's about it.

I'm not sure why merging the SLRU pools with shared_buffers would benefit from dynamically allocated shared memory. I might be at (or possibly beyond) the limit of my ability to comment intelligently on this without looking more at what you want to use these imessages for, but I'm still pretty skeptical about the idea of storing them directly in shared memory. It's possible, though, that I am all wet.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] git config user.email
My initial suggestion was to say that everyone should just be usern...@postgresql.org; but I think that met with some resistance. Magnus, for example, tells me that he is a committer for multiple projects, and is mag...@hagander.net at all of them. Since that's a domain name he owns personally, it seems safe enough. But I'm inclined to think we should avoid things like rh...@commandprompt.com, just on the off chance JD decides to fire me. Of course, I expect there might be some dissenting voices on this point, so... thoughts?

I'd prefer usern...@postgresql.org since:
- It's permanent, as already pointed out
- It makes clear that the username is working as one of the PostgreSQL project members

Personal email addresses such as mag...@hagander.net would be ok as long as he is sure that he will continue to pay the charges for his domain :-)

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
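Whatever address policy is chosen, the mechanics are the same: user.email can be set per repository, where it overrides any --global value, so a committer can use one address for the PostgreSQL repository and different addresses for other projects. A minimal sketch (the identity below is a placeholder, not a recommendation):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q .

# Repository-local identity: stored in .git/config and takes precedence
# over --global (~/.gitconfig) settings for commits made in this repo.
git config user.name  "Example Committer"
git config user.email "committer@example.org"

git config user.email   # prints the effective address for this repository
```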