Re: [HACKERS] patch (for 9.1) string functions
I'm reviewing contrib part of the string functions patch. I found an issue in sprintf() to print integer values. In this case, 'l' (for long type) is used on *all* platforms. For example, SELECT sprintf('%d', 10); internally uses appendStringInfo('%ld', (int64) 10) But there are some platform that requires to use %lld for int64 format, probably on Windows. That's why we have INT64_FORMAT macro. sprintf() needs to be adjusted to use INT64_FORMAT or similar portable codes. Other portion of the patch seems to be OK for me, unless you have still some idea to extend the feature. 2010/7/17 Pavel Stehule pavel.steh...@gmail.com: I have a one idea nonstandard enhancing of sprintf - relatie often job is a quoting in PostgreSQL. So sprintf should have a special formats for quoted values. What do you think about %lq ... literal quoted %iq ... ident quoted They save some keyboard types to write quote_literal() and quote_ident(), right? They seem to be useful and reasonable for me. One comment is that you might want to print NULL values as NULL instead of NULL in such cases. -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SQL/MED security
2010/7/23 Itagaki Takahiro itagaki.takah...@gmail.com: 2010/7/22 Pavel Stehule pavel.steh...@gmail.com: I see a SQL/MED security very unclean - it have to be very vell documented :( ERROR: password is required DETAIL: Non-superuser cannot connect if the server does not request a password. The security model of current FDW heavily depends on implementation of existing COPY FROM and dblink. It should be improved in the future. Postgresql_fdw is based on dblink, that always requires password for non-superuser connections. File_fdw is based on COPY FROM, that only allows superusers to read local files. So, present FDWs also require password or superuser privileges to read foreign tables. I think the ideal design is creating foreign tables are restricted only for superusers, or requiring password on the creation time, but don't require such privileges on SELECT time. But I didn't develop such feature at the moment. I agree on 50%. Only superusers can create these tables, but he can grant SELECT to some non superusers. And this behave have to be done from start not later. Ok, It is nice, so supersuser can have a foreign tables, but is very unpractical so only this user can use these tables. I understand where is core of this limit - but it have to be solved before commiting. Usually people don't like to use a superuser account. There are some security issues here - ALTERing generic options. Superusers can define file_fdw on a local file, and can grant the ownership to non-superusers. But the non-superusers should not modify 'filename' option with ALTER OPTION. If allowed, they can read another local file on the server. I am thinking so for fdw_file nonsuper user can't to change any options. I don't see any use case. There is another security issue: password can be retrieved by all users to query system catalogs. The password is stored in generic options, that are visible for all users and not encrypted. We can allow users to read other normal options, but should not password and local filesystem path. However, we don't have such equipments for now. sure, it must by solved first. Probably we have to have to store password to separate file inside a data directory :( - I am not a security engineer - probably have to use a same mechanism, that use a postgres for storing a passwords. Regards Pavel Stehule Such security issues are ones of the most complex problems in FDW. Comments and ideas welcome. -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
Hello 2010/7/23 Itagaki Takahiro itagaki.takah...@gmail.com: I'm reviewing contrib part of the string functions patch. I found an issue in sprintf() to print integer values. In this case, 'l' (for long type) is used on *all* platforms. For example, SELECT sprintf('%d', 10); internally uses appendStringInfo('%ld', (int64) 10) But there are some platform that requires to use %lld for int64 format, probably on Windows. That's why we have INT64_FORMAT macro. sprintf() needs to be adjusted to use INT64_FORMAT or similar portable codes. ok, I'll look on it Other portion of the patch seems to be OK for me, unless you have still some idea to extend the feature. 2010/7/17 Pavel Stehule pavel.steh...@gmail.com: I have a one idea nonstandard enhancing of sprintf - relatie often job is a quoting in PostgreSQL. So sprintf should have a special formats for quoted values. What do you think about %lq ... literal quoted %iq ... ident quoted They save some keyboard types to write quote_literal() and quote_ident(), right? They seem to be useful and reasonable for me. One comment is that you might want to print NULL values as NULL instead of NULL in such cases. yes, it is good note Thank You very much Regards Pavel Stehule -- Itagaki Takahiro -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lock_timeout GUC patch - Review
Hi, first, thanks for the review. Hi, I've been reviewing this patch for the last few days. Here it is : * Submission review * Is the patch in context diff format? Yes * Does it apply cleanly to the current CVS HEAD? Yes * Does it include reasonable tests, necessary doc patches, etc? Doc patches are there. There are no regression tests. Should there be ? IIRC, there was a discussion/patch about parallel psql that can hold more than one connections open. With that feature, a regression test can be added. Reading the 9.0beta3 docs, it's not there and this patch is not on the current commitfest either. Is there anyone who knows the status of this feature? * Usability review * Does the patch actually implement that? Yes * Do we want that? I think so. At least I'd like to have this feature :) :-) * Do we already have it? No * Does it follow SQL spec, or the community-agreed behavior? I didn't see a clear conclusion from the -hackers thread on this (GUC vs SQL statement extension) * Does it include pg_dump support (if applicable)? Not applicable. Or should pg_dump and/or pg_restore put lock_timeout to 0 ? * Are there dangers? As it is a guc, it could be set globally, is that a danger ? It could be set globally, but it will exhibit a new global behaviour. Which is not unexpected. The problem may come from [auto]vacuum processes can get stopped. Maybe others, too. A previous version contained a checking function that refused to set it from postgresql.conf for this reason, but it was frowned upon by Tom. :-) The proper fix would be that every such processes should set this GUC to zero for them locally. * Have all the bases been covered? I honestly don't know. It touches alarm signal handling. * Apply the patch, compile it and test: * Does the feature work as advertised? I only tested it with Linux. The code is very OS-dependent. It works as advertised with Linux. I found only one corner case (see below) The setitimer() function is implemented in backend/port/win32/timer.c, so it's abstracted away. With that in mind, I think there's no OS-dependent in this patch. * Are there corner cases the author has failed to consider? The feature almost works as advertised : it fails when lock_timeout = deadlock_timeout. Then the lock_timeout isn't detected. I think this case isn't considered in handle_sig_alarm(). I will look into this, thanks for spotting it. * Are there any assertion failures or crashes? No * Performance review * Does the patch slow down simple tests? No * If it claims to improve performance, does it? Not applicable * Does it slow down other things? No. Maybe alarm signal handling and enabling will be slower, as there is more work done there (for instance, a GetCurrentTimestamp, that was only done when log_lock_waits was activated until now. But I couldn't measure a slowdown. * Read the changes to the code in detail and consider: * Does it follow the project coding guidelines? I think so * Are there portability issues? It seems to have already been adressed, from the previous discussion in the thread. * Will it work on Windows/BSD etc? It should. I couldn't test it though. Infrastructure is there. * Are the comments sufficient and accurate? Yes * Does it do what it says, correctly? Yes * Does it produce compiler warnings? No * Can you make it crash? No * Consider the changes to the code in the context of the project as a whole: * Is everything done in a way that fits together coherently with other features/modules? I have a feeling that enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a very specific problem, and it gets complicated because there is no infrastructure in the code to handle several timeouts at the same time with sigalarm, so each timeout has its own dedicated and intertwined code. But I'm still discovering this part of the code. There is a problem with setitimer(): only one timer can be alive at one time with the same timer id (ITIMER_REAL). * Are there interdependencies that can cause problems? I don't think so. Thanks for the review, I will post a new patch sometime next week. Best regards, Zoltán Böszörményi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add column if not exists (CINE)
--On 21. Juli 2010 17:16:13 -0400 Robert Haas robertmh...@gmail.com wrote: I get the same error message from concurrent CREATE TABLE commands even without CINE... S1: rhaas=# begin; BEGIN rhaas=# create table foo (id int); CREATE TABLE S2: rhaas=# begin; BEGIN rhaas=# create table foo (id int); blocks S1: rhaas=# commit; COMMIT S2: ERROR: duplicate key value violates unique constraint pg_type_typname_nsp_index DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. Funny, never realized that before, but you're right. I agree it would be nice to fix this. I'm not sure how hard it is. I don't think it's the job of this patch. :-) Yes, i agree. I would like to mark this patch Ready for Committer, if that's okay for you (since you are a committer you might want to commit it yourself). Given that there's still some discussion in progress, i'm not sure about it, however. The patch itself looks fine to me and I'm traveling this weekend, so i don't want to hold it off as long as necessary. -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER TABLE...ALTER COLUMN vs inheritance
--On 16. Juni 2010 14:31:00 +0200 Bernd Helmle maili...@oopsware.de wrote: There are some spelling and grammar errors in comments that I can help fix if you want the help. You're welcome. I have pushed my branch to my git repos as well, if you like to start from there: http://git.postgresql.org/gitweb?p=bernd_pg.git;a=shortlog;h=refs/heads/ notnull_constraint I'm going to delay this patch until the next commitfest. I'm able to work on it only sporadically during the next two weeks, and some remaining issues need my attention on this patch: - ALTER TABLE SET NOT NULL works not properly - ALTER TABLE child NO INHERIT parent leaves inconistent state in pg_constraint - Special case in pg_get_constraintdef_worker() needs more thinking -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] antisocial things you can do in git (but not CVS)
* Robert Haas: 1. Inability to cleanly and easily (and programatically) identify who committed what. With CVS, the author of a revision is the person who committed it, period. With git, the author string can be set to anything the person typing 'git commit' feels like. It's even more difficult than that. Git does not record at all who updated a particular branch to include a specific commit. This operation does not leave any metadata behind. It is possible to write a log file for audit purposes, but it's an out-of-band solution. My preference would be to stick to a style where we identify the committer using the author tag and note the patch author, reviewers, whether the committer made changes, etc. in the commit message. A single author field doesn't feel like enough for our workflow, and having a mix of authors and committers in the author field seems like a mess. It would be possible to enforce that on the server side, but it would interfere with merges. 3. Merge commits. I believe that we have consensus that commits should always be done as a squash, so that the history of all of our branches is linear. But it seems to me that someone could accidentally push a merge commit, either because they forgot to squash locally, or because of a conflict between their local git repo's master branch and origin/master. Can we forbid this? It's possible to do this with some scripting on the server side. -- Florian Weimerfwei...@bfk.de BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add column if not exists (CINE)
On Fri, Jul 23, 2010 at 2:46 AM, Bernd Helmle maili...@oopsware.de wrote: --On 21. Juli 2010 17:16:13 -0400 Robert Haas robertmh...@gmail.com wrote: I get the same error message from concurrent CREATE TABLE commands even without CINE... S1: rhaas=# begin; BEGIN rhaas=# create table foo (id int); CREATE TABLE S2: rhaas=# begin; BEGIN rhaas=# create table foo (id int); blocks S1: rhaas=# commit; COMMIT S2: ERROR: duplicate key value violates unique constraint pg_type_typname_nsp_index DETAIL: Key (typname, typnamespace)=(foo, 2200) already exists. Funny, never realized that before, but you're right. I agree it would be nice to fix this. I'm not sure how hard it is. I don't think it's the job of this patch. :-) Yes, i agree. I would like to mark this patch Ready for Committer, if that's okay for you (since you are a committer you might want to commit it yourself). Given that there's still some discussion in progress, i'm not sure about it, however. The patch itself looks fine to me and I'm traveling this weekend, so i don't want to hold it off as long as necessary. As far as I can see, the other emails were regarding adding columns, whereas this patch is about creating tables. So I think it's OK... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for 9.1: initdb -C option
2010/7/23 KaiGai Kohei kai...@ak.jp.nec.com: Sorry for the confusion. What I wanted to say is the patch itself is fine but we need to make consensus before the detailed code reviewing. I guess we probably need some more people to express an opinion, then. Do you have one? I'm not sure I do, yet. I'd like to hear the patch author's response to Itagaki Takahiro's question upthread: Why don't you use just echo 'options' $PGDATA/postgresql.conf ? Could you explain where the -C options is better than initdb + echo? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security label support, part.2
2010/7/23 KaiGai Kohei kai...@ak.jp.nec.com: Hmm. How about if there's just one provider loaded, you can omit it, but if you fail to specify it and there's1 loaded, we just throw an error saying you didn't specify whose label it is. Perhaps, we need to return the caller a state whether one provider checked the given label at least, or not. Return to the caller? This is an SQL command. You either get an error, or you don't. If it was omitted, all the providers try to check the given label, but it has mutually different format, so one of providers will raise an error at least. Yeah, but it won't be a very clear error, and what if you have, say, a provider that allows arbitrary strings as labels? Since this is a security feature, I think it's a pretty bad idea to allow the user to do anything that might be ambiguous. It means we have to specify the provider when two or more providers are loaded, but not necessary when just one provider. But that should be fine. Loading multiple providers should, as you say, be fairly rare. I think it is 100% clear that row-level security will require completely separate infrastructure, and therefore I'm not even a tiny bit worried about this. :-) Hmm. Are you saying we may degrade the feature when we switch to the completely separate infrastructure? Is it preferable?? Uh... no, not really. I'm saying that I don't think we're backing ourselves into a corner. What makes you think we are? Sorry, meaning of the last question was unclear for me Is it a idiom? I don't understand why we wouldn't be able to support multiple providers for row-level security. Why do you think that's a problem? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security label support, part.2
* Robert Haas (robertmh...@gmail.com) wrote: I don't understand why we wouldn't be able to support multiple providers for row-level security. Why do you think that's a problem? My guess would be that he's concerned about only having space in the tuple header for 1 label. I see two answers- only allow 1 provider for a given relation (doesn't strike me as a horrible limitation), or handle labels as extra columns where you could have more than one. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] security label support, part.2
On Fri, Jul 23, 2010 at 8:32 AM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: I don't understand why we wouldn't be able to support multiple providers for row-level security. Why do you think that's a problem? My guess would be that he's concerned about only having space in the tuple header for 1 label. I see two answers- only allow 1 provider for a given relation (doesn't strike me as a horrible limitation), or handle labels as extra columns where you could have more than one. I think we've been pretty clear in previous discussions that any row-level security implementation should be a general one, and SE-Linux or whatever can integrate with that to do what it needs to do. So I'm pretty sure we'll be using regular columns rather than cramming anything into the tuple header. There are pretty substantial performance benefits to such an implementation, as well. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security label support, part.2
(2010/07/23 20:44), Robert Haas wrote: 2010/7/23 KaiGai Koheikai...@ak.jp.nec.com: Hmm. How about if there's just one provider loaded, you can omit it, but if you fail to specify it and there's1 loaded, we just throw an error saying you didn't specify whose label it is. Perhaps, we need to return the caller a state whether one provider checked the given label at least, or not. Return to the caller? This is an SQL command. You either get an error, or you don't. Ahh, I was talked about relationship between the core PG code and ESP module. It means the security hook returns a state which informs the core PG code whether one provider checked the given label, then the core PG code can decide whether it raise an actual error to users, or not. In other words, I'd like to suggest the security hook which returns a tag of ESP module, as follows: const char * check_object_relabel_hook(const ObjectAddress *object, const char *provider, const char *seclabel); The second argument reflects FOR provider clause. It informs ESP modules what provider is specified. If omitted, it will be NULL. Then, ESP module which checked the given security label must return its tag. Maybe, selinux, if SE-PostgreSQL. Or, NULL will be returned if nobody checked it. If NULL or incorrect tag is returned, the core PG code can know the given seclabel is not checked/validated, then it will raise an error to users. Elsewhere, the validated label will be stored with the returned tag. It enables to recognize what label is validated by SELinux, and what label is not. If it was omitted, all the providers try to check the given label, but it has mutually different format, so one of providers will raise an error at least. Yeah, but it won't be a very clear error, and what if you have, say, a provider that allows arbitrary strings as labels? Since this is a security feature, I think it's a pretty bad idea to allow the user to do anything that might be ambiguous. It is provider's job to validate the given security label. So, if we install such a security module which accept arbitrary strings as label, the core PG code also need to believe the ESP module. But the arbitrary label will be tagged with something other than selinux, so it does not confuse other module, according to the above idea. It means we have to specify the provider when two or more providers are loaded, but not necessary when just one provider. But that should be fine. Loading multiple providers should, as you say, be fairly rare. I think it is 100% clear that row-level security will require completely separate infrastructure, and therefore I'm not even a tiny bit worried about this. :-) Hmm. Are you saying we may degrade the feature when we switch to the completely separate infrastructure? Is it preferable?? Uh... no, not really. I'm saying that I don't think we're backing ourselves into a corner. What makes you think we are? Sorry, meaning of the last question was unclear for me Is it a idiom? I don't understand why we wouldn't be able to support multiple providers for row-level security. Why do you think that's a problem? I don't have any clear reason why we wouldn't be able to support multiple labels on user tuples, but it is intangible anxiety, because I have not implemented it as a working example yet. (So, I never think it is impossible.) Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for 9.1: initdb -C option
On Jul 23, 2010, at 6:36 AM, Robert Haas wrote: 2010/7/23 KaiGai Kohei kai...@ak.jp.nec.com: Sorry for the confusion. What I wanted to say is the patch itself is fine but we need to make consensus before the detailed code reviewing. I guess we probably need some more people to express an opinion, then. Do you have one? I'm not sure I do, yet. I'd like to hear the patch author's response to Itagaki Takahiro's question upthread: Why don't you use just echo 'options' $PGDATA/postgresql.conf ? Could you explain where the -C options is better than initdb + echo? At this point, I have no real preference for this patch; it is just as easy to echo line datadir/postgresql.conf, so perhaps that makes this patch somewhat pointless. I suppose there's a shaky argument to be made for Windows compatibility, but I'm sure there's also an equivalent functionality to be found in the windows shell. Reception to this idea has seemed pretty lukewarm, although I think Peter expressed some interest. Some of the previous linked correspondence in the review referred to some of the proposed split configuration file mechanisms. My particular implementation is fairly limited to the idea of a single configuration file, so compared to some of the other proposed approaches including split .conf files, it may not cover the same ground. Like I said in the original submission, I found it helpful for the programmatic configuration of a number of simultaneous node, but if it's not generally useful to the community at large, I'll understand if it's punted. Regards, David -- David Christensen End Point Corporation da...@endpoint.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Thu, 22 Jul 2010, Robert Haas wrote: On Thu, Jul 22, 2010 at 5:34 PM, Kris Jurka bo...@ejurka.com wrote: Attached is a patch to make the server continue to consume protocol data until instructed to stop by the client in the same way as copying text data to the server currently works. I guess the obvious question is whether we shouldn't instead change the docs to match the behavior. I suspect there's almost no chance we'd consider back-patching a change of this type, since it is a clear behavior change. And even if we did, there would still be people running servers with the old behavior with which JDBC and other drivers would have to cope. Having two different behaviors might be worse than the status quo. It is a clear behavior change, but that's what bug fixes are. I would advocate back-patching this because I doubt many people would be affected by this change and I think it would be awkward trying to document how things work differently in binary mode when sending a file end marker than in text mode or without a file end marker. If this was fixed server side and backpatched, I would not modify the JDBC driver to work with older server versions. The copy documentation is clear that you must call PQputCopyEnd or equivalent to end the copy sequence, so this would only affect people who are not doing that and using binary copy mode. I doubt many people are using binary copy at all because of the additional difficulty in generating binary format data and the potential for portability problems. Kris Jurka -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
Kris Jurka wrote: I guess the obvious question is whether we shouldn't instead change the docs to match the behavior. I suspect there's almost no chance we'd consider back-patching a change of this type, since it is a clear behavior change. And even if we did, there would still be people running servers with the old behavior with which JDBC and other drivers would have to cope. Having two different behaviors might be worse than the status quo. It is a clear behavior change, but that's what bug fixes are. That was my first reaction. I don't think we're in the business if redefining bugs out of existence. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security label support, part.2
On Fri, Jul 23, 2010 at 8:59 AM, KaiGai Kohei kai...@kaigai.gr.jp wrote: (2010/07/23 20:44), Robert Haas wrote: 2010/7/23 KaiGai Koheikai...@ak.jp.nec.com: Hmm. How about if there's just one provider loaded, you can omit it, but if you fail to specify it and there's1 loaded, we just throw an error saying you didn't specify whose label it is. Perhaps, we need to return the caller a state whether one provider checked the given label at least, or not. Return to the caller? This is an SQL command. You either get an error, or you don't. Ahh, I was talked about relationship between the core PG code and ESP module. It means the security hook returns a state which informs the core PG code whether one provider checked the given label, then the core PG code can decide whether it raise an actual error to users, or not. In other words, I'd like to suggest the security hook which returns a tag of ESP module, as follows: const char * check_object_relabel_hook(const ObjectAddress *object, const char *provider, const char *seclabel); I don't think that's a very good design. What I had in mind was a simple API for security providers to register themselves (including their names), and then the core code will only call the relevant security provider. I did try to explain this in point #3 of my original review. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
Kris Jurka bo...@ejurka.com writes: Attached is a patch to make the server continue to consume protocol data until instructed to stop by the client in the same way as copying text data to the server currently works. I believe this is a misunderstanding of the protocol spec. The spec is (intended to say that) we'll continue to accept data after reporting an error, not that we will silently swallow an incorrect data stream and not complain about it. Which is what this patch will do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for 9.1: initdb -C option
David Christensen escreveu: Like I said in the original submission, I found it helpful for the programmatic configuration of a number of simultaneous node, but if it's not generally useful to the community at large, I'll understand if it's punted. I'm afraid it is the only use case for this new option. If it is, it doesn't deserve a new option. We can live with echo + initdb for those cases. -- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Fri, Jul 23, 2010 at 9:32 AM, Andrew Dunstan and...@dunslane.net wrote: Kris Jurka wrote: I guess the obvious question is whether we shouldn't instead change the docs to match the behavior. I suspect there's almost no chance we'd consider back-patching a change of this type, since it is a clear behavior change. And even if we did, there would still be people running servers with the old behavior with which JDBC and other drivers would have to cope. Having two different behaviors might be worse than the status quo. It is a clear behavior change, but that's what bug fixes are. That was my first reaction. I don't think we're in the business if redefining bugs out of existence. I certainly understand that reaction - I just worry that there might be people depending on the current behavior. We really don't want to get a reputation for breaking things in minor releases. But this is not an area of the code I'm very familiar with, and I'm not in a good position to judge the likelihood of breakage, so I'll defer to those who are... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
On 7/23/2010 6:40 AM, Tom Lane wrote: Kris Jurkabo...@ejurka.com writes: Attached is a patch to make the server continue to consume protocol data until instructed to stop by the client in the same way as copying text data to the server currently works. I believe this is a misunderstanding of the protocol spec. The spec is (intended to say that) we'll continue to accept data after reporting an error, not that we will silently swallow an incorrect data stream and not complain about it. Which is what this patch will do. All this does is make binary mode match text mode. Are you planning on changing text mode to match binary mode instead? Currently text mode parsing ends at the data end marker (\.) and throws away any further data which may or may not be ill formatted. For example there's no complaint about copying the following data file into a single column integer table even though there is bogus data after the file end marker 3 4 \. asdf aff 5 qq Kris Jurka -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch for 9.1: initdb -C option
David Christensen da...@endpoint.com wrote: At this point, I have no real preference for this patch; it is just as easy to echo line datadir/postgresql.conf, so perhaps that makes this patch somewhat pointless. On reflection, I'm inclined to agree. I suppose there's a shaky argument to be made for Windows compatibility, but I'm sure there's also an equivalent functionality to be found in the windows shell. The Windows command shell supports the echo command and to append. Like I said in the original submission, I found it helpful for the programmatic configuration of a number of simultaneous node, but if it's not generally useful to the community at large, I'll understand if it's punted. Unless someone can show a significant benefit beyond appends, I'll do that in a couple days. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] could not reattach to shared memory on Windows
On Tue, Jul 20, 2010 at 17:31, Etienne Dube etd...@gmail.com wrote: On 09/02/2010 4:09 PM, Etienne Dube wrote: Magnus Hagander wrote: IIRC, we've had zero reports on whether the patch worked at all on 8.2 in an environment where the problem actually existed. So yes, some testing and feedback would be much apprecaited. //Magnus Thanks for your quick reply. We upgraded to Service Pack 2 and it solved the problem. Nevertheless, I'll try to reproduce the issue under a Win2008 SP1 VM to see whether the patch makes a difference. 8.2.x win32 binaries are built using MinGW right? Etienne The could not reattach to shared memory bug came back to bite us, this time on a production machine running Windows Server 2008 R2 x64. I manually applied the patch against the 8.2.17 sources and installed the build on a test server. It has been running for two days without any issue. We'll see after a while if the patch actually fixes the problem (it seems to happen only after the postgres service has been up and running for some time) but in case you want to include this fix in a future 8.2.18 release, I've attached the new patch to apply against the REL8_2_STABLE branch. Yes, I think it's time to backpatch this to 8.2 - it has worked very well on 8.3 and 8.4, and we've had a couple of good reports on 8.2 by now. So I've done that, so it should be in the next 8.2 version. In fact, there was a small bug in the patch that broke all non-win32 platforms, so I fixed that while at it :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
Kris Jurka bo...@ejurka.com writes: On 7/23/2010 6:40 AM, Tom Lane wrote: I believe this is a misunderstanding of the protocol spec. The spec is (intended to say that) we'll continue to accept data after reporting an error, not that we will silently swallow an incorrect data stream and not complain about it. Which is what this patch will do. All this does is make binary mode match text mode. The fact that text mode eats data after \. is a backwards-compatibility kluge to match the behavior of pre-7.4 COPY. It could very legitimately be argued to be a bug in itself. I don't think that we should make binary mode match it. The main concrete reason why not is that binary mode has almost no redundancy. It would be really easy for the code change you suggest to result in data being silently discarded with no hint of what went wrong. After some reflection, I think the real issue here is that the JDBC driver is depending on a behavior not stated in the protocol, which is the relative timing of FE-to-BE and BE-to-FE messages. Once you've sent the EOF marker, the only correct follow-on for a spec-compliant frontend is a CopyEnd message. So the backend is just sending its response a bit sooner. There's nothing in the protocol spec forbidding that. I would be willing to accept a patch that avoided sending CopyEnd immediately after receiving EOF, so long as it still threw an error for extra data; but this is not that patch. The larger issue though is whether it wouldn't be better to change the driver behavior instead. I can't help thinking that the JDBC driver must be being overly cute if this breaks it ... and you're never going to get everybody to upgrade their server version, even if we were willing to back-patch the change. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql \conninfo command (was: Patch: psql \whoami option)
On Wed, Jul 21, 2010 at 11:13 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Jul 22, 2010 at 11:57 AM, Robert Haas robertmh...@gmail.com wrote: Should we be using is_absolute_path() here instead, as libpq does? Yes. The attached patch does that. On Wed, Jul 21, 2010 at 10:09 PM, David Christensen da...@endpoint.com wrote: If we print the local socket when it's been explicitly set via the host= param, why not display the actual socket path in the general local socket case? Patch? Done. DEFAULT_PGSOCKET_DIR (i.e., /tmp) should be displayed in that case, I think. $ psql -c\conninfo You are connected to database postgres via local socket on /tmp at port 5432 as user postgres. Sorry for the slow response, I've been slightly buried. I've committed this with some kibitzing, the most relevant bit of which is that I changed via local socket on to via local socket in, which I think reads better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] including backend ID in relpath of temp rels - updated patch
On Thu, Jun 10, 2010 at 3:39 PM, Robert Haas robertmh...@gmail.com wrote: I believe that this patch will clear away one major obstacle to implementing global temporary and unlogged tables: it enables us to be sure of cleaning up properly after a crash without relying on catalog entries or XLOG. Based on previous discussions, however, I believe there is support for making this change independently of what becomes of that project, just for the benefit of having a more robust cleanup mechanism. Hi, i was looking at v3 of this patch... it looks good, works as advertised, pass regression tests, and all tests i could think of... haven't looked the code at much detail but seems ok, to me at least... -- Jaime Casanova www.2ndQuadrant.com Soporte y capacitación de PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RRR] CommitFest 2010-07 week one progress report
Hi, On 07/22/2010 08:51 PM, Robert Haas wrote: It seems to me that the discussion is Alvaro and I are having with Markus is tilted toward having Markus rewrite the imessages interface to use an SLRU, in which case neither of them will go in this CF. I'm hopeful that Heikki or Tom will comment on this also when they get back from their vacations. Just for the record: I don't currently think a rewrite to use SLRU makes any sense for imessages. But to answer Kevin's question: I don't expect to have the prerequisite patches committed this CF, as I don't think it currently makes any sense for Postgres to apply them. Nor did I feel there's general consensus that having we want to have a dynamic memory allocator (for shared memory). It would be nice to be able to keep track of these kind of patches, which are available to Postgres and get maintained, but aren't currently needed or wanted. But do we want to use the CF application for that? How do you prefer to proceed with these patches? It's also worth noting that Simon requested more and better documentation. But I simply can't promise to write anything soon. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Fri, 23 Jul 2010, Tom Lane wrote: Kris Jurka bo...@ejurka.com writes: On 7/23/2010 6:40 AM, Tom Lane wrote: I believe this is a misunderstanding of the protocol spec. The spec is (intended to say that) we'll continue to accept data after reporting an error, not that we will silently swallow an incorrect data stream and not complain about it. Which is what this patch will do. All this does is make binary mode match text mode. The fact that text mode eats data after \. is a backwards-compatibility kluge to match the behavior of pre-7.4 COPY. It could very legitimately be argued to be a bug in itself. I don't think that we should make binary mode match it. The main concrete reason why not is that binary mode has almost no redundancy. It would be really easy for the code change you suggest to result in data being silently discarded with no hint of what went wrong. Binary copy mode already does this (eat data silently after -1 field count). The patch I sent just made it follow the fe/be protocol while it does so. jurka=# create table copytest (a int); CREATE TABLE jurka=# insert into copytest values (1); INSERT 0 1 jurka=# \copy copytest to copydata with binary jurka=# \! echo garbage copydata jurka=# \copy copytest from copydata with binary jurka=# select * from copytest; a --- 1 1 (2 rows) After some reflection, I think the real issue here is that the JDBC driver is depending on a behavior not stated in the protocol, which is the relative timing of FE-to-BE and BE-to-FE messages. Once you've sent the EOF marker, the only correct follow-on for a spec-compliant frontend is a CopyEnd message. So the backend is just sending its response a bit sooner. There's nothing in the protocol spec forbidding that. What about CopyFail? The protocol docs say nothing about the message contents only about the messages themselves. Kris Jurka -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CommitFest 2010-07 week one progress report
Markus Wanner mar...@bluegap.ch wrote: On 07/22/2010 08:51 PM, Robert Haas wrote: It seems to me that the discussion is Alvaro and I are having with Markus is tilted toward having Markus rewrite the imessages interface to use an SLRU, in which case neither of them will go in this CF. I'm hopeful that Heikki or Tom will comment on this also when they get back from their vacations. Just for the record: I don't currently think a rewrite to use SLRU makes any sense for imessages. From the sidelines -- I don't know much about our SLRU implementation, but on from what I have heard, it's hard to see that as a good fit for what you're trying to do. At a minimum, those suggesting it should probably sketch out how that would work just a little bit farther. But to answer Kevin's question: I don't expect to have the prerequisite patches committed this CF, as I don't think it currently makes any sense for Postgres to apply them. OK, so all eight of these patches should be considered WIP submissions. That's good to know from a process management PoV. Nor did I feel there's general consensus that having we want to have a dynamic memory allocator (for shared memory). On the other hand, I seem to remember hearing mentions of a desire for that in the past, particularly regarding the ability to have pluggable modules. I suspect that any such feature would need to allocate blocks from which space can be allocated and released in a more lightweight manner than dealing with the OS -- probably with an interface similar to current memory contexts. How does the current patch deal with that? It would be nice to be able to keep track of these kind of patches, which are available to Postgres and get maintained, but aren't currently needed or wanted. pgfoundry? But do we want to use the CF application for that? How do you prefer to proceed with these patches? For WIP patches, I'm inclined to leave them open until the end of the CF or until discussion seems to have hit an end. I wonder if we should have a flag in the application for these, so they show up in the counts in a different way. It's also worth noting that Simon requested more and better documentation. But I simply can't promise to write anything soon. For a WIP patch, I suspect that putting on your personal TODO list, to cover before any submission for acceptance, is the main thing. Of course, lack of comments or documentation may limit the feedback you get for your WIP submission, as reverse-engineering intent from code can be time-consuming and tedious. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Functional dependencies and GROUP BY
On Sun, Jul 18, 2010 at 02:40, Peter Eisentraut pete...@gmx.net wrote: On lör, 2010-07-17 at 11:13 -0600, Alex Hunsaker wrote: So I would expect the more indexes you had or group by items to slow it down. Not so much the number of columns. Right? At the outer level (which is not visible in this patch) it loops over all columns in the select list, and then it looks up the indexes each time. So the concern was that if the select list was * with a wide table, looking up the indexes each time would be slow. Thanks for the explanation. Anyhow it sounds like I should try it on top of the other patch and see if it works. I assume it might still need some fixups to work with that other patch? Or do you expect it to just work? There is some work necessary to integrate the two. I just read that patch is getting pushed till at least the next commit fest: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01219.php Should we push this patch back to? Alternatively we could make it work with just primary keys until the other patch gets in. I think that makes sense, find that attached. Thoughts? Note I axed the index not null attribute checking, I have not thought to deeply about it other than if its a primary key it cant have non null attributes Right? I may have missed something subtle hence the heads up. *** a/doc/src/sgml/queries.sgml --- b/doc/src/sgml/queries.sgml *** *** 886,895 SELECT product_id, p.name, (sum(s.units) * p.price) AS sales In this example, the columns literalproduct_id/literal, literalp.name/literal, and literalp.price/literal must be in the literalGROUP BY/ clause since they are referenced in ! the query select list. (Depending on how the products ! table is set up, name and price might be fully dependent on the ! product ID, so the additional groupings could theoretically be ! unnecessary, though this is not implemented.) The column literals.units/ does not have to be in the literalGROUP BY/ list since it is only used in an aggregate expression (literalsum(...)/literal), which represents the sales --- 886,892 In this example, the columns literalproduct_id/literal, literalp.name/literal, and literalp.price/literal must be in the literalGROUP BY/ clause since they are referenced in ! the query select list (but see below). The column literals.units/ does not have to be in the literalGROUP BY/ list since it is only used in an aggregate expression (literalsum(...)/literal), which represents the sales *** *** 898,903 SELECT product_id, p.name, (sum(s.units) * p.price) AS sales --- 895,912 /para para + If the products table is set up so that, + say, literalproduct_id/literal is the primary key or a + not-null unique constraint, then it would be enough to group + by literalproduct_id/literal in the above example, since name + and price would be firsttermfunctionally + dependent/firsttermindextermprimaryfunctional + dependency/primary/indexterm on the product ID, and so there + would be no ambiguity about which name and price value to return + for each product ID group. +/para + +para In strict SQL, literalGROUP BY/ can only group by columns of the source table but productnamePostgreSQL/productname extends this to also allow literalGROUP BY/ to group by columns in the *** a/doc/src/sgml/ref/select.sgml --- b/doc/src/sgml/ref/select.sgml *** *** 520,527 GROUP BY replaceable class=parameterexpression/replaceable [, ...] produces a single value computed across all the selected rows). When literalGROUP BY/literal is present, it is not valid for the commandSELECT/command list expressions to refer to ! ungrouped columns except within aggregate functions, since there ! would be more than one possible value to return for an ungrouped column. /para /refsect2 --- 520,531 produces a single value computed across all the selected rows). When literalGROUP BY/literal is present, it is not valid for the commandSELECT/command list expressions to refer to ! ungrouped columns except within aggregate functions or if the ! ungrouped column is functionally dependent on the grouped columns, ! since there would otherwise be more than one possible value to ! return for an ungrouped column. A functional dependency exists if ! the grouped columns (or a subset thereof) are the primary key or a ! not-null unique constraint of the table containing the ungrouped column. /para /refsect2 *** a/src/backend/commands/view.c --- b/src/backend/commands/view.c *** *** 16,21 --- 16,22 #include access/heapam.h #include access/xact.h + #include catalog/dependency.h #include catalog/namespace.h #include commands/defrem.h #include
Re: [HACKERS] Functional dependencies and GROUP BY
On Fri, Jul 23, 2010 at 11:00, Alex Hunsaker bada...@gmail.com wrote: Alternatively we could make it work with just primary keys until the other patch gets in. I think that makes sense, find that attached. Thoughts? *sigh*, find attached a version that removes talk of unique not null constraints from the docs func_deps_v4.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multibyte charater set in levenshtein function
Excerpts from Alexander Korotkov's message of jue jul 22 03:21:57 -0400 2010: On Thu, Jul 22, 2010 at 1:59 AM, Robert Haas robertmh...@gmail.com wrote: Ah, I see. That's pretty compelling, I guess. Although it still seems like a lot of code... I think there is a way to merge single-byte and multi-byte versions of functions without loss in performance using macros and includes (like in 'backend/utils/adt/like.c'). Do you think it is acceptable in this case? Using the trick of a single source file similar to like_match.c that implements all the different behaviors, and include that file more than once with different macro definitions, is probably a good idea. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Thu, Jul 22, 2010 at 08:43:35PM +0300, Marko Tiikkaja wrote: Hi, From the code I understood that when executing a query normally, in READ COMMITTED mode, we take a new snapshot for every query that comes out of rewrite. But in an EXPLAIN ANALYZE, we only update the CID of the snapshot taken when the EXPLAIN started. Did I misunderstand the code? And if I didn't, why do we do this differently? You mentioned in IRC that this was in aid of getting wCTEs going. How are these things connected? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/23/2010 8:52 PM, David Fetter wrote: On Thu, Jul 22, 2010 at 08:43:35PM +0300, Marko Tiikkaja wrote: Did I misunderstand the code? And if I didn't, why do we do this differently? You mentioned in IRC that this was in aid of getting wCTEs going. How are these things connected? Currently, I'm trying to make wCTEs behave a bit like RULEs do. But if every rewrite product takes a new snapshot, wCTEs will behave very unpredictably. But because EXPLAIN ANALYZE does *not* take a new snapshot for every rewrite product, I'm starting to think that maybe this isn't the behaviour we wanted to begin with? Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: to_string, to_array functions
Robert Haas robertmh...@gmail.com writes: so my preferences: 1. split, join - I checked - we are able to create join function 2. split, array_join - when only join can be a problem 3. string_split, array_join - there are not clean symmetry, but it respect wide used a semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot to like array_split - it is contradiction for me. Yeah, I'd like some more votes, too. I still don't see a compelling reason not to extend existing functions with a third argument. Or are we talking about deprecating them in the future (like remove their mention in the docs) and have the new names to replace them, with the new behavior as the default and the extended call convention as the old behavior? I'm not sure about that, so I think extending existing function is ok. Or we would have to have the new functions work well with other types too, so that it's compelling to move from the old ones. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
On Fri, Jul 23, 2010 at 2:18 AM, Joseph Adams joeyadams3.14...@gmail.com wrote: This is a work-in-progress patch of my GSoC project: Add JSON datatype to PostgreSQL. It provides the following: * JSON datatype: A TEXT-like datatype for holding JSON-formatted text. Although the JSON RFC decrees that a JSON text be an object or array (meaning 'hello' is considered invalid JSON text), this datatype lets you store any JSON value (meaning 'hello'::JSON is allowed). * Validation: Content is validated when a JSON datum is constructed, but JSON validation can also be done programmatically with the json_validate() function. * Conversion to/from JSON for basic types. Conversion functions are needed because casting will not unwrap JSON-encoded values. For instance, json('string')::text is 'string', while from_json('string') is 'string'. Also, to_json can convert PostgreSQL arrays to JSON arrays, providing a nice option for dealing with arrays client-side. from_json currently can't handle JSON arrays/objects yet (how they should act is rather unclear to me, except when array dimensions and element type are consistent). * Retrieving/setting values in a JSON node (via selectors very similar to, but not 100% like, JSONPath as described at http://goessner.net/articles/JsonPath/ ). * Miscellaneous functions json_condense and json_type. This is a patch against CVS HEAD. This module compiles, installs, and passes all 8 tests successfully on my Ubuntu 9.10 system. It is covered pretty decently with regression tests. It also has SGML documentation (the generated HTML is attached for convenience). Although I am aware of many problems in this patch, I'd like to put it out sooner rather than later so it can get plenty of peer review. Problems I'm aware of include: * Probably won't work properly when the encoding (client or server?) is not UTF-8. When encoding (e.g. with json_condense), it should (but doesn't) use \u escapes for characters the target encoding doesn't support. * json.c is rather autarkic. It has its own string buffer system (rather than using StringInfo) and UTF-8 validator (rather than using pg_verify_mbstr_len(?) ). * Some functions/structures are named suggestively, as if they belong to (and would be nice to have in) PostgreSQL's utility libraries. They are: - TypeInfo, initTypeInfo, and getTypeInfo: A less cumbersome wrapper around get_type_io_data. - FN_EXTRA and FN_EXTRA_SZ: Macros to make working with fcinfo-flinfo-fn_extra easier. - enumLabelToOid: Look up the Oid of an enum label; needed to return an enum that isn't built-in. - utf8_substring: Extract a range of UTF-8 characters out of a UTF-8 string. * Capitalization and function arrangement are rather inconsistent. Braces are KR-style. * json_cleanup and company aren't even used. * The sql/json.sql test case should be broken into more files. P.S. The patch is gzipped because it expands to 2.6 megabytes. Some technical points about the submission: - If you want your coded to be included in PostgreSQL, you need to put the same license and attribution on it that we use elsewhere. Generally that looks sorta like this: /*- * * tablecmds.c *Commands for creating and altering table structures and settings * * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * * * IDENTIFICATION *$PostgreSQL$ * *- */ You should have this header on each file, both .c and .h. - The reason why this patch is 2.6MB is because it has 2.4MB of tests. I think you should probably pick out the most useful 10% or so, and drop the rest. - I was under the impression that we wanted EXPLAIN (FORMAT JSON) to return type json, but that's obviously not going to be possible if all of this is contrib. We could (a) move it all into core, (b) move the type itself and its input and output functions into core and leave the rest in contrib [or something along those lines], or (c) give up using it as the return type for EXPLAIN (FORMAT JSON). - You need to comply with the project coding standards. Thus: static void foo() { exit(1); } Not: static void foo() { exit(1); } You should have at least one blank line between every pair of functions. You should use uncuddled curly braces. Your commenting style doesn't match the project standard. We prefer explicit NULL tests over if (!foo). Basically, you should run pgindent on your code, and also read this: http://www.postgresql.org/docs/8.4/static/source.html I don't think this is going to fly either: /* Declare and initialize a String with the given name. */ #define String(name) String name = NewString() #define NewString() {{NULL, 0, 0}} That's just too much magic. We want people to
Re: [HACKERS] patch: Add JSON datatype to PostgreSQL (GSoC, WIP)
On Fri, Jul 23, 2010 at 2:34 PM, Robert Haas robertmh...@gmail.com wrote: - elog() must be used except for can't happen situations. Compare the way numeric_in() throws an error message versus json_in(). Er... that should have said elog() must NOT be used except for can't happen situations. Also, I was just looking at json_delete(). While the existing coding there is kind of fun, I think this can be written more straightforwardly by doing something like this (not tested): while (1) { while (is_internal(node) node-v.children.head) node = node-v.children.head; if (node-next) next = node-next; else if (node-parent) next = node-parent; else break; free_node(node); node = next; } That gets rid of all of the gotos and one of the local variables and, at least IMO, is easier to understand... though it would be even better still if you also added a comment saying something like We do a depth-first, left-to-right traversal of the tree, freeing nodes as we go. We need not bother clearing any of the pointers, because the traversal order is such that we're never in danger of revisiting a node we've already freed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] permission inconsistency with functions
Hello, I am writing a blog on backups with postgresql, which I plan at some point (if someone doesn't beat me to it) on turning into a patch for the docs but I found this inconsistency: The docs state that: In particular, it must have read access to all tables that you want to back up, so in practice you almost always have to run it as a database superuser. Ignoring the fact that databases have a lot more objects than tables, there is no READ/SELECT permission for functions. Thus in order to backup a function, I must have EXECUTE permissions on the function. Further if I don't have EXECUTE permissions I can still see the function in pg_proc. This seems like an inconsistency worth looking into, especially now that we have per column perms. Sincerely, Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] permission inconsistency with functions
On fre, 2010-07-23 at 11:48 -0700, Joshua D. Drake wrote: In particular, it must have read access to all tables that you want to back up, so in practice you almost always have to run it as a database superuser. Ignoring the fact that databases have a lot more objects than tables, there is no READ/SELECT permission for functions. Thus in order to backup a function, I must have EXECUTE permissions on the function. Further if I don't have EXECUTE permissions I can still see the function in pg_proc. In order to back up a table's contents you must read it, but you don't need to execute a function in order to back it up. It's not inconsistent, it's just different. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: psql: edit function, show function commands patch
Hello 2010/7/23 Jan Urbański wulc...@wulczer.org: On 21/07/10 14:43, Pavel Stehule wrote: Hello I am sending a actualised patch. Hi, thanks! I understand to your criticism about line numbering. I have to agree. With line numbering the patch is longer. I have a one significant reason for it. CREATE OR REPLACE FUNCTION public.foo() RETURNS integer LANGUAGE plpgsql AS $function$ 1 begin 2 return 10/0; 3 end; $function$ postgres=# select foo(); ERROR: division by zero CONTEXT: SQL statement SELECT 10/0 PL/pgSQL function foo line 2 at RETURN postgres=# OK, that convinced me, and also others seem to like the feature. So I'll just make code remarks this time. In the \e handling code, if the file name was given and there is no line number, an uninitialised variable will be passed to do_edit. I see that you want to avoid passing a magic number to do_edit, but I think you should just treat anything 0 as that magic value, initialise lineno with -1 and simplify all these nested ifs. fixed uninitialised variable. Second is little bit different - there is three states, not only two - lineno is undefined, lineno is wrong and lineno is correct. I would not to ignore a wrong lineno. Typo in a comment: when wirst row of function - when first row of function fixed I think the #define for DEFAULT_BODY_SEPARATOR should either be at the beginning of the file (and then also used in \ef handling) or just hardcoded in both places. this means some like only local constant - see PARAMS_ARRAY_SIZE in same file. It is used only inside a first_row_is_empty function. It's not used outside this function. Any reason why DEFAULT_NAVIGATION_COMMAND has a space in front of it? no it is useless - fixed Some lines have 80 characters. there are lot of longer lines - but I can't to modify other lines only for formating. My code has max line about 90 characters (when it doesn't respect more common form for some parts). If you agree that a -1 parameter do do_edit will mean no lineno, then I think you can change get_lineno_for_navigation to not take a backslashResult argument and signal errors by returning -1. there are a too much magic constants, so I prefer a cleaner form with backslashResult. The code isn't longer and it reacts better on wrong entered value - negative value is nonsense for this purpose. In get_lineno_for_navigation you will have to protect the large comment block with /*-- otherwise pgindent will reflow it. done PSQL_NAVIGATION_COMMAND needs to be documented in the SGML docs. I changed PSQL_NAVIGATION_COMMAND to PSQL_EDITOR_NAVIGATION_OPTION and append a few lines - as I can - some one who can speak English has to correct it. Uff, that's all from me, sorry for bringing up all these small issues, I hope this will go in soon! It is your job :) I'm going to change it to waiting on author again, mainly because of the uninitialised variable in \d handling, the rest are just stylistic nitpicks. Cheers, Jan attached updated patch Thank you very much Pavel Stehule *** ./doc/src/sgml/ref/psql-ref.sgml.orig 2010-07-20 05:54:19.0 +0200 --- ./doc/src/sgml/ref/psql-ref.sgml 2010-07-23 20:46:49.746690828 +0200 *** *** 1339,1345 varlistentry ! termliteral\edit/literal (or literal\e/literal) literaloptional replaceable class=parameterfilename/replaceable /optional/literal/term listitem para --- 1339,1345 varlistentry ! termliteral\edit/literal (or literal\e/literal) literaloptional replaceable class=parameterfilename/replaceable /optional optional linenumber /optional/literal/term listitem para *** *** 1369,1380 systems, filenamenotepad.exe/filename on Windows systems. /para /tip /listitem /varlistentry varlistentry ! termliteral\ef optional replaceable class=parameterfunction_description/replaceable /optional/literal/term listitem para --- 1369,1386 systems, filenamenotepad.exe/filename on Windows systems. /para /tip + + para + If replaceable class=parameterlinenumber/replaceable is + specified, then cursor is moved on this line after start of + editor. + /para /listitem /varlistentry varlistentry ! termliteral\ef optional replaceable class=parameterfunction_description/replaceable /optional optional linenumber /optional /literal/term listitem para *** *** 1397,1402 --- 1403,1415 If no function is specified, a blank commandCREATE FUNCTION/ template is presented for editing. /para + + para + If replaceable class=parameterlinenumber/replaceable is + specified, then cursor is moved
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Fri, Jul 23, 2010 at 2:13 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: On 7/23/2010 8:52 PM, David Fetter wrote: On Thu, Jul 22, 2010 at 08:43:35PM +0300, Marko Tiikkaja wrote: Did I misunderstand the code? And if I didn't, why do we do this differently? You mentioned in IRC that this was in aid of getting wCTEs going. How are these things connected? Currently, I'm trying to make wCTEs behave a bit like RULEs do. But if every rewrite product takes a new snapshot, wCTEs will behave very unpredictably. But because EXPLAIN ANALYZE does *not* take a new snapshot for every rewrite product, I'm starting to think that maybe this isn't the behaviour we wanted to begin with? Where should I be looking in the code for this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] permission inconsistency with functions
On Fri, 2010-07-23 at 21:55 +0300, Peter Eisentraut wrote: On fre, 2010-07-23 at 11:48 -0700, Joshua D. Drake wrote: In particular, it must have read access to all tables that you want to back up, so in practice you almost always have to run it as a database superuser. Ignoring the fact that databases have a lot more objects than tables, there is no READ/SELECT permission for functions. Thus in order to backup a function, I must have EXECUTE permissions on the function. Further if I don't have EXECUTE permissions I can still see the function in pg_proc. In order to back up a table's contents you must read it, but you don't need to execute a function in order to back it up. It's not inconsistent, it's just different. Sorry you are correct, I made a mistake in my ERROR message reading. JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/23/2010 10:00 PM, Robert Haas wrote: On Fri, Jul 23, 2010 at 2:13 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: Currently, I'm trying to make wCTEs behave a bit like RULEs do. But if every rewrite product takes a new snapshot, wCTEs will behave very unpredictably. But because EXPLAIN ANALYZE does *not* take a new snapshot for every rewrite product, I'm starting to think that maybe this isn't the behaviour we wanted to begin with? Where should I be looking in the code for this? ProcessQuery() and ExplainOnePlan(). ProcessQuery calls PushActiveSnapshot(GetTransactionSnapshot()) for every statement while ExplainOnePlan calls PushUpdatedSnapshot(GetActiveSnapshot()). Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: to_string, to_array functions
2010/7/23 Dimitri Fontaine dfonta...@hi-media.com: Robert Haas robertmh...@gmail.com writes: so my preferences: 1. split, join - I checked - we are able to create join function 2. split, array_join - when only join can be a problem 3. string_split, array_join - there are not clean symmetry, but it respect wide used a semantics - string.split, array.join 4. explode, implode 5. array_explode, array_implode -- I cannot to like array_split - it is contradiction for me. Yeah, I'd like some more votes, too. I still don't see a compelling reason not to extend existing functions with a third argument. Or are we talking about deprecating them in the future (like remove their mention in the docs) and have the new names to replace them, with the new behavior as the default and the extended call convention as the old behavior? just incomplete default behave :(. We can enhance old functions, but we cannot to change default behave - it is mean, so we will to ignore a NULLs in arrays forever - but it isn't true a three years. It is a feature now. Please look to archive. There was a discus about it. I'm not sure about that, so I think extending existing function is ok. Or we would have to have the new functions work well with other types too, so that it's compelling to move from the old ones. I would not to replace or enhance a to_char function. I plan to use a implode, explode names Regards Pavel Stehule Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] reminder... beta4 is coming
I've just added a couple of recently-reported bugs here: http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items These aren't all 9.0-specific, but we still need to fix them. Also, the steady trickle of 9.0 bug reports that we were getting before seems to have dried up. Does that mean there are no more bugs, or that there's no more testing? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/23/2010 10:06 PM, I wrote: ProcessQuery() and ExplainOnePlan(). ProcessQuery calls PushActiveSnapshot(GetTransactionSnapshot()) for every statement while ExplainOnePlan calls PushUpdatedSnapshot(GetActiveSnapshot()). Here's a test case demonstrating the difference: =# create table foo(a int); CREATE TABLE =# create table bar(a int); CREATE TABLE =# create table baz(a int); CREATE TABLE =# create rule foorule as on update to foo do instead (select pg_sleep(5); insert into bar select * from baz); CREATE RULE =# insert into foo values(0); no EXPLAIN: =# truncate bar, baz; TRUNCATE TABLE T1=# begin; update foo set a=a; BEGIN -- sleeps.. T2=# insert into baz values(0); INSERT 0 1 -- T1 returns pg_sleep -- (1 row) UPDATE 0 T1=# select * from bar; a --- 0 (1 row) EXPLAIN: =# truncate bar, baz; TRUNCATE TABLE T1=# begin; explain analyze update foo set a=a; BEGIN -- sleeps.. T2=# insert into baz values(0); INSERT 0 1 -- T1 returns (QUERY PLAN) T1=# select * from bar; a --- (0 rows) This may be a bit hard to follow, but essentially what happens is that in EXPLAIN ANALYZE, the INSERT in the rule does not see the changes made by T2 to baz while in the regular execution scenario it does. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Fri, Jul 23, 2010 at 3:19 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: This may be a bit hard to follow, but essentially what happens is that in EXPLAIN ANALYZE, the INSERT in the rule does not see the changes made by T2 to baz while in the regular execution scenario it does. Well that's gotta be a bug, but in what I'm not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Fri, Jul 23, 2010 at 03:30:03PM -0400, Robert Haas wrote: On Fri, Jul 23, 2010 at 3:19 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: This may be a bit hard to follow, but essentially what happens is that in EXPLAIN ANALYZE, the INSERT in the rule does not see the changes made by T2 to baz while in the regular execution scenario it does. Well that's gotta be a bug, but in what I'm not sure. I've said it before, and I'll say it again. User-accessible RULEs are themselves a bug :P Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Fri, Jul 23, 2010 at 3:31 PM, David Fetter da...@fetter.org wrote: On Fri, Jul 23, 2010 at 03:30:03PM -0400, Robert Haas wrote: On Fri, Jul 23, 2010 at 3:19 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: This may be a bit hard to follow, but essentially what happens is that in EXPLAIN ANALYZE, the INSERT in the rule does not see the changes made by T2 to baz while in the regular execution scenario it does. Well that's gotta be a bug, but in what I'm not sure. I've said it before, and I'll say it again. User-accessible RULEs are themselves a bug :P Maybe so, but perhaps it would be more productive to concentrate on solving the immediate problem first. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bg worker: overview
Markus Wanner mar...@bluegap.ch writes: Daemon code? That sounds like it could be an addition to the coordinator, which I'm somewhat hesitant to extend, as it's a pretty critical process (especially for Postgres-R). [...] However, note that the coordinator is designed to be just a message passing or routing process, which should not do any kind of time consuming processing. It must *coordinate* things (well, jobs) and react promptly. Nothing else. Yeah, I guess user daemons would have to be workers, not plugins you want to load into the coordinator. On the other side, the background workers have a connection to exactly one database. They are supposed to do work on that database. Is that because of the way backends are started, and to avoid having to fork new ones too often? The background workers can easily load external libraries - just as a normal backend can with LOAD. That would also provide better encapsulation (i.e. an error would only tear down that backend, not the coordinator). You'd certainly have to communicate between the coordinator and the background worker. I'm not sure how match that fits your use case. Pretty well I think. The thread on -performance is talking quite a bit about connection pooling. The only way I can imagine some sort of connection pooling to be implemented on top of bgworkers would be to let the coordinator listen on an additional port and pass on all requests to the bgworkers as jobs (using imessages). And of course send back the responses to the client. I'm not sure how that overhead compares to using pgpool or pgbouncer. Those are also separate processes through which all of your data must flow. They use plain system sockets, imessages use signals and shared memory. Yeah. The connection pool is better outside of code. Let's think PGQ and a internal task scheduler first, if we think at any generalisation. Regards, -- dim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Fri, Jul 23, 2010 at 03:30:03PM -0400, Robert Haas wrote: On Fri, Jul 23, 2010 at 3:19 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: This may be a bit hard to follow, but essentially what happens is that in EXPLAIN ANALYZE, the INSERT in the rule does not see the changes made by T2 to baz while in the regular execution scenario it does. Well that's gotta be a bug, but in what I'm not sure. One could argue that its less of a semantic change changing explain's behaviour than the normal executors way of working... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Patch for phypot - Pygmy Hippotause
Tom Lane t...@sss.pgh.pa.us wrote: I think the patch is good in principle Since everyone seems to agree this is a good patch which needed minor tweaks, and we haven't heard from the author, I just went ahead and made the changes. Andrew, could you take another look and see if you agree I've covered it all before it gets marked ready for a committer? -Kevin *** a/src/backend/utils/adt/geo_ops.c --- b/src/backend/utils/adt/geo_ops.c *** *** 5410,5412 plist_same(int npts, Point *p1, Point *p2) --- 5410,5470 return FALSE; } + + + /*- + * Determine the hypotenuse. + * + * If required, x and y are swapped to make x the larger number. The + * traditional formulae of x^2+y^2 is rearranged to factor x outside the + * sqrt. This allows computation of the hypotenuse for significantly + * larger values, and with a higher precision than otherwise normally + * possible. + * + * Only arguments of 1.27e308 are at risk of causing overflow. Whereas + * the naive approach limits arguments to 9.5e153. + * + * sqrt( x^2 + y^2 ) = sqrt( x^2( 1 + y^2/x^2) ) + * = x * sqrt( 1 + y^2/x^2 ) + * = x * sqrt( 1 + y/x * y/x ) + * + * It is expected that this routine will eventually be replaced with the + * C99 hypot() function. + * + * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the + * case of hypot(inf,nan) results in INF, and not NAN. + *---*/ + double + phypot(double x, double y) + { + double yx; + + if (isinf(x) || isinf(y)) + return get_float8_infinity(); + + if (isnan(x) || isnan(y)) + return get_float8_nan(); + + x = fabs(x); + y = fabs(y); + + /* If required, swap x and y */ + if (x y) + { + double temp = x; + + x = y; + y = temp; + } + + /* +* If y is zero, the hypotenuse is x, whether or not x is also zero. This +* test protects against divide-by-zero errors. +*/ + if (y == 0.0) + return x; + + /* Determine the hypotenuse */ + yx = y / x; + return x * sqrt(1.0 + (yx * yx)); + } *** a/src/include/utils/geo_decls.h --- b/src/include/utils/geo_decls.h *** *** 50,56 #define FPge(A,B) ((A) = (B)) #endif ! #define HYPOT(A, B) sqrt((A) * (A) + (B) * (B)) /*- * Point - (x,y) --- 50,56 #define FPge(A,B) ((A) = (B)) #endif ! #define HYPOT(A, B) phypot((A), (B)) /*- * Point - (x,y) *** *** 211,216 extern Datum point_div(PG_FUNCTION_ARGS); --- 211,217 /* private routines */ extern double point_dt(Point *pt1, Point *pt2); extern double point_sl(Point *pt1, Point *pt2); + extern double phypot(double x, double y); /* public lseg routines */ extern Datum lseg_in(PG_FUNCTION_ARGS); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] non-overlapping, consecutive partitions
hello everybody, i have just come across some issue which has been bugging me for a while. consider: SELECT * FROM foo ORDER BY bar; if we have an index on bar, we can nicely optimize away the sort step by consulting the index - a btree will return sorted output. under normal circumstances it will be seq-sort but doing some config settings we can turn this into an index scan nicely to avoid to the sort (disk space is my issue here). this is not so easy anymore: create table foo ( x date ); create table foo_2010 () INHERITS (foo) create table foo_2009 () INHERITS (foo) create table foo_2008 () INHERITS (foo) now we add constraints to make sure that data is only in 2008, 2009 and 2010. we assume that everything is indexed: SELECT * FROM foo ORDER BY bar will now demand an ugly sort for this data. this is not an option if you need more than a handful of rows ... if constraints are non overlapping and if they are based on a sortable data type, we might be able to scan one index after the other and get a sorted list. why is this an issue? imagine a case where you want to do billing, eg. some phone calls. the job now is: the last 10 calls of a customer are free and you want to sum up those which are not free. to do that you basically need a sorted list per customer. if you have data here which is partitioned over time you are screwed up because you want to return a sorted list taken from X partitions to some higher level operation (windowing or whatever). resorting vast amounts of data is a killer here. in the particular case i am talking about my problem is roughly 2 TB scaled out to some PL/proxy farm. does anybody see a solution to this problem? what are the main showstoppers to make something like this work? many thanks, hans -- Cybertec Schönig Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/23/2010 10:46 PM, Andres Freund wrote: On Fri, Jul 23, 2010 at 03:30:03PM -0400, Robert Haas wrote: On Fri, Jul 23, 2010 at 3:19 PM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: This may be a bit hard to follow, but essentially what happens is that in EXPLAIN ANALYZE, the INSERT in the rule does not see the changes made by T2 to baz while in the regular execution scenario it does. Well that's gotta be a bug, but in what I'm not sure. One could argue that its less of a semantic change changing explain's behaviour than the normal executors way of working... One could also argue that people usually test their code with EXPLAIN ANALYZE and could've made the opposite conclusion based on its output. ;-) But I really have no idea what we should do about this. It seems to me that EXPLAIN ANALYZE's behaviour is less surprising (that's actually what I, before yesterday, always thought happens), but I'm not too sure about that either. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] non-overlapping, consecutive partitions
On 7/23/2010 11:04 PM, Hans-Jürgen Schönig wrote: does anybody see a solution to this problem? what are the main showstoppers to make something like this work? I think we should absolutely make this work when we have a good partitioning implementation. That said, I don't think it's wise to put a lot of effort into making this work with our current partitioning method when the partitioning patches are just around the corner. The developer time should be directed at those patches instead. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: to_string, to_array functions
Hello I am sending a actualised patch. There is only one significant change to last patch. Function to_string was renamed to implode and to_array was renamed explode. Regards Pavel Stehule *** ./doc/src/sgml/func.sgml.orig 2010-07-23 21:18:04.698690857 +0200 --- ./doc/src/sgml/func.sgml 2010-07-23 21:51:08.860689007 +0200 *** *** 4652,4658 /para para ! If case-independent matching is specified, the effect is much as if all case distinctions had vanished from the alphabet. When an alphabetic that exists in multiple cases appears as an --- 4652,4658 /para para ! If case-independent matching is specified, the effect is much as if all case distinctions had vanished from the alphabet. When an alphabetic that exists in multiple cases appears as an *** *** 9541,9546 --- 9541,9552 primaryarray_upper/primary /indexterm indexterm + primaryexplode/primary + /indexterm + indexterm + primaryimplode/primary + /indexterm + indexterm primarystring_to_array/primary /indexterm indexterm *** *** 9675,9680 --- 9681,9708 row entry literal + functionexplode/function(typetext/type, typetext/type optional, typetext/type/optional) + /literal + /entry + entrytypetext[]/type/entry + entrysplits string into array elements using supplied delimiter and null string/entry + entryliteralexolode('1,2,3,,5', ',')/literal/entry + entryliteral{1,2,3,4,NULL,5}/literal/entry +/row +row + entry + literal + functionimplode/function(typeanyarray/type, typetext/type optional, typetext/type/optional) + /literal + /entry + entrytypetext/type/entry + entryconcatenates array elements using supplied delimiter and null string/entry + entryliteralimplode(ARRAY[1, 2, 3, NULL, 5], ',', '*')/literal/entry + entryliteral1,2,3,*,5/literal/entry +/row +row + entry + literal functionstring_to_array/function(typetext/type, typetext/type) /literal /entry *** ./src/backend/catalog/system_views.sql.orig 2010-07-23 21:18:04.806852641 +0200 --- ./src/backend/catalog/system_views.sql 2010-07-23 22:03:56.329687977 +0200 *** *** 487,489 --- 487,497 CREATE OR REPLACE FUNCTION pg_start_backup(label text, fast boolean DEFAULT false) RETURNS text STRICT VOLATILE LANGUAGE internal AS 'pg_start_backup'; + + CREATE OR REPLACE FUNCTION + implode(v anyarray, fldsep text, null_string text DEFAULT '') + RETURNS text STABLE LANGUAGE internal AS 'implode'; + + CREATE OR REPLACE FUNCTION + explode(inputstr text, fldsep text, null_string text DEFAULT '') + RETURNS text[] IMMUTABLE LANGUAGE internal AS 'explode'; *** ./src/backend/utils/adt/array_userfuncs.c.orig 2010-07-23 21:18:04.880689496 +0200 --- ./src/backend/utils/adt/array_userfuncs.c 2010-07-23 21:18:36.467693435 +0200 *** *** 407,415 --- 407,417 create_singleton_array(FunctionCallInfo fcinfo, Oid element_type, Datum element, + bool isNull, int ndims) { Datum dvalues[1]; + bool nulls[1]; int16 typlen; bool typbyval; char typalign; *** *** 429,434 --- 431,437 ndims, MAXDIM))); dvalues[0] = element; + nulls[0] = isNull; for (i = 0; i ndims; i++) { *** *** 462,468 typbyval = my_extra-typbyval; typalign = my_extra-typalign; ! return construct_md_array(dvalues, NULL, ndims, dims, lbs, element_type, typlen, typbyval, typalign); } --- 465,471 typbyval = my_extra-typbyval; typalign = my_extra-typalign; ! return construct_md_array(dvalues, nulls, ndims, dims, lbs, element_type, typlen, typbyval, typalign); } *** ./src/backend/utils/adt/varlena.c.orig 2010-07-23 21:18:04.911693316 +0200 --- ./src/backend/utils/adt/varlena.c 2010-07-23 22:00:47.83269 +0200 *** *** 75,80 --- 75,83 static bytea *bytea_overlay(bytea *t1, bytea *t2, int sp, int sl); static StringInfo makeStringAggState(FunctionCallInfo fcinfo); + static Datum _explode(PG_FUNCTION_ARGS); + static Datum _implode(PG_FUNCTION_ARGS); + /* * CONVERSION ROUTINES EXPORTED FOR USE BY C CODE * *** *** 2965,2970 --- 2968,2984 } /* + * Returns true when two text params are same. + */ + static + bool text_isequal(text *txt1, text *txt2) + { + return DatumGetBool(DirectFunctionCall2(texteq, + PointerGetDatum(txt1), + PointerGetDatum(txt2))); + } + + /* * text_to_array * parse input string * return text array of elements *** ***
Re: [HACKERS] patch: to_string, to_array functions
Hello Dimitry I still don't see a compelling reason not to extend existing functions with a third argument. Or are we talking about deprecating them in the future (like remove their mention in the docs) and have the new names to replace them, with the new behavior as the default and the extended call convention as the old behavior? just incomplete default behave :(. We can enhance old functions, but we cannot to change default behave - it is mean, so we will to ignore a NULLs in arrays forever - but it isn't true a three years. It is a feature now. Please look to archive. There was a discus about it. The reason, why I am strong in change of default behave is only one - I know only one use-case for curent behave - when array_to_string ignore a NULL, - when you would to remove NULLs from array, you can do string_to_array(array_to_string(x,'###'), '###') - I don't know other use-case. When I have a NULL in array, then it have a some sense there. And I can remove NULLs from array via more secure and faster way SELECT array(SELECT v FROM unnest(x) g(x) WHERE v IS NOT NULL) using string_to_array and array_to_string is slower and for some domains can be wrong (for text). Regards Pavel p.s. I expect so anybody who has NULLs in an array not only for a joy. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
Excerpts from Marko Tiikkaja's message of vie jul 23 14:13:18 -0400 2010: On 7/23/2010 8:52 PM, David Fetter wrote: On Thu, Jul 22, 2010 at 08:43:35PM +0300, Marko Tiikkaja wrote: Did I misunderstand the code? And if I didn't, why do we do this differently? You mentioned in IRC that this was in aid of getting wCTEs going. How are these things connected? Currently, I'm trying to make wCTEs behave a bit like RULEs do. But if every rewrite product takes a new snapshot, wCTEs will behave very unpredictably. I don't think it's fair game to change the behavior of multiple-output rules at this point. However, I also think that it's unwise to base wCTEs on the behavior of rules -- rules are widely considered broken and unusable for nontrivial cases. Also, I think that having a moving snapshot for the different parts of a wCTE is going to mean they're unpredictable. For predictable usage you'll be forcing the user to always wrap them in SERIALIZABLE transactions. In short I think a wCTE should only advance the CID, not get a whole new snapshot. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
Alvaro Herrera alvhe...@commandprompt.com wrote: In short I think a wCTE should only advance the CID, not get a whole new snapshot. Even after unblocking from a write conflict at the READ COMMITTED isolation level? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/24/10 12:37 AM +0300, Alvaro Herrera wrote: Excerpts from Marko Tiikkaja's message of vie jul 23 14:13:18 -0400 2010: On 7/23/2010 8:52 PM, David Fetter wrote: On Thu, Jul 22, 2010 at 08:43:35PM +0300, Marko Tiikkaja wrote: Did I misunderstand the code? And if I didn't, why do we do this differently? You mentioned in IRC that this was in aid of getting wCTEs going. How are these things connected? Currently, I'm trying to make wCTEs behave a bit like RULEs do. But if every rewrite product takes a new snapshot, wCTEs will behave very unpredictably. I don't think it's fair game to change the behavior of multiple-output rules at this point. However, I also think that it's unwise to base wCTEs on the behavior of rules -- rules are widely considered broken and unusable for nontrivial cases. I don't want to change the behaviour either, but we have two different behaviours right now. We need to change at least the other. wCTEs are not going to be based on any of the broken behaviour of rules, that's for sure. What I meant is expanding a single query into multiple queries and running the executor separately for all of them. Also, I think that having a moving snapshot for the different parts of a wCTE is going to mean they're unpredictable. For predictable usage you'll be forcing the user to always wrap them in SERIALIZABLE transactions. That is *not* what has been discussed for wCTEs. A wCTE will only modify the CID of the snapshot in any isolation mode. In short I think a wCTE should only advance the CID, not get a whole new snapshot. Agreed. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/24/10 12:42 AM +0300, Kevin Grittner wrote: Alvaro Herreraalvhe...@commandprompt.com wrote: In short I think a wCTE should only advance the CID, not get a whole new snapshot. Even after unblocking from a write conflict at the READ COMMITTED isolation level? I'm not sure what you mean by this; UPDATE and DELETE can take a look at the new tuple but that's completely separate from the snapshot. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
Excerpts from Marko Tiikkaja's message of vie jul 23 17:44:21 -0400 2010: On 7/24/10 12:37 AM +0300, Alvaro Herrera wrote: Excerpts from Marko Tiikkaja's message of vie jul 23 14:13:18 -0400 2010: I don't think it's fair game to change the behavior of multiple-output rules at this point. However, I also think that it's unwise to base wCTEs on the behavior of rules -- rules are widely considered broken and unusable for nontrivial cases. I don't want to change the behaviour either, but we have two different behaviours right now. We need to change at least the other. It seems like it's EXPLAIN ANALYZE that needs fixing. wCTEs are not going to be based on any of the broken behaviour of rules, that's for sure. What I meant is expanding a single query into multiple queries and running the executor separately for all of them. Is a wCTE going to be expanded into multiple queries? If not, it sounds like we're all agreed. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/24/10 1:20 AM +0300, Alvaro Herrera wrote: Excerpts from Marko Tiikkaja's message of vie jul 23 17:44:21 -0400 2010: wCTEs are not going to be based on any of the broken behaviour of rules, that's for sure. What I meant is expanding a single query into multiple queries and running the executor separately for all of them. Is a wCTE going to be expanded into multiple queries? If not, it sounds like we're all agreed. Yes, it will have to be. I tried to make it work for 9.0 by not expanding, and it didn't work out too well. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On 7/24/10 1:20 AM +0300, Alvaro Herrera wrote: It seems like it's EXPLAIN ANALYZE that needs fixing. Yeah, looks like it. I see SQL functions also take a new snapshot for every query. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
On Fri, Jul 23, 2010 at 6:20 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Marko Tiikkaja's message of vie jul 23 17:44:21 -0400 2010: On 7/24/10 12:37 AM +0300, Alvaro Herrera wrote: Excerpts from Marko Tiikkaja's message of vie jul 23 14:13:18 -0400 2010: I don't think it's fair game to change the behavior of multiple-output rules at this point. However, I also think that it's unwise to base wCTEs on the behavior of rules -- rules are widely considered broken and unusable for nontrivial cases. I don't want to change the behaviour either, but we have two different behaviours right now. We need to change at least the other. It seems like it's EXPLAIN ANALYZE that needs fixing. I would suggest that if we're going to change this, we back-patch it to 9.0 before beta4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rewrite, normal execution vs. EXPLAIN ANALYZE
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: I'm not sure what you mean by this; UPDATE and DELETE can take a look at the new tuple but that's completely separate from the snapshot. Never mind -- I remembered that those could operate against tuples not in the original snapshot, but forgot that they did it without generating an actual fresh snapshot. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers