Re: [HACKERS] Syntax for partitioning
On Wed, 2009-11-18 at 13:24 +0900, Itagaki Takahiro wrote: Simon Riggs si...@2ndquadrant.com wrote: Why not just wait until we have a whole patch and then apply? A whole patch can be written by many contributors instead of only one person, no? I think we need to split the work for partitioning into several parts to encourage developing it. I just did one of the parts, syntax. Another patch, "Partitioning option for COPY", will do a good job in the field of INSERT.

If we can agree on the parts that are required, I would at least be confident that we have understood this enough to allow one part to proceed ahead of the others. For partitioning the parts are these:

1. Syntax for explicit partitioning
2. Internal data representations
3. Optimizations, many and various
4. Data routing
   a) Data routing on INSERT/COPY
   b) UPDATE handling when the UPDATE causes partition migration

If this patch puts forward a solution for (2) also, then it is potentially worthwhile. That is the real blocking point here. Once we have that, other people will quickly fill in the later parts. I foresee a data structure that is a sorted list of boundary values, cached on the parent relation. This should be accessible to allow a bsearch of particular values during both planning and execution. The same rules apply as for btree operator classes. For multi-level hierarchies the parent level should have the union of all sub-hierarchies. I think we need an index on pg_inherits also. So please do (1) and (2), not just (1) in isolation. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
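[Editor's illustration] Simon's sketch for (2) — a sorted list of boundary values that can be bsearched during planning and execution — can be illustrated outside the backend with a few lines of Python. The function name and the upper-bound convention here are assumptions for illustration, not the proposed in-core representation:

```python
import bisect

def route_to_partition(boundaries, value):
    """Binary-search a sorted list of range-partition upper bounds.

    Partition i holds values v with boundaries[i-1] <= v < boundaries[i];
    index len(boundaries) is the overflow/default partition.
    """
    return bisect.bisect_right(boundaries, value)

# Hypothetical bounds giving four partitions:
# (-inf, 10), [10, 20), [20, 30), [30, +inf)
boundaries = [10, 20, 30]
print(route_to_partition(boundaries, 25))  # -> 2
```

The same O(log n) lookup would serve both data routing on INSERT/COPY and partition pruning in the planner, which is why caching the list on the parent relation is attractive.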
Re: [HACKERS] Very bad FTS performance with the Polish config
Wojciech, your polish_english and polish configurations use the ispell dictionary and are slow, while the english configuration doesn't contain ispell. So, what is your complaint? Try adding the ispell dictionary to the english configuration and see the timings. Oleg

On Wed, 18 Nov 2009, Wojciech Knapik wrote: Hello This has been discussed in #postgresql and posted to -performance a couple of days ago, but no solution has been found. The discussion can be found here: http://archives.postgresql.org/pgsql-performance/2009-11/msg00162.php I just finished implementing a search engine for my site and found ts_headline extremely slow when used with a Polish tsearch configuration, while fast with English. All of it boils down to a simple testcase, but first some background. I tested on 8.3.1 on G5/OSX 10.5.8 and Xeon/Gentoo AMD64-2008.0 (2.6.21), then switched both installations to 8.3.8 (both packages compiled from source, but provided by the distro - port/emerge). The Polish dictionaries and config were created according to this article (it's in Polish, but the code is self-explanatory): http://www.depesz.com/index.php/2008/04/22/polish-tsearch-in-83-polski-tsearch-w-postgresie-83/ Now for the testcase: text = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.'
# explain analyze select ts_headline('polish', text, plainto_tsquery('polish', 'foobar'));
                                   QUERY PLAN
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=6.407..6.470 rows=1 loops=1)
 Total runtime: 6.524 ms
(2 rows)

# explain analyze select ts_headline('english', text, plainto_tsquery('english', 'foobar'));
                                   QUERY PLAN
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.861..0.895 rows=1 loops=1)
 Total runtime: 0.935 ms
(2 rows)

# explain analyze select ts_headline('simple', text, plainto_tsquery('simple', 'foobar'));
                                   QUERY PLAN
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.627..0.660 rows=1 loops=1)
 Total runtime: 0.697 ms
(2 rows)

#

As you can see, the results differ by an order of magnitude between Polish and English. While in this simple testcase it's a non-issue, in the real world this translates into enormous overhead. One of the queries I ran testing my site's search function took 1870ms. When I took that query and changed all ts_headline(foo) calls to just foo, the time dropped below 100ms. That's the difference between something completely unacceptable and something quite useful. I can post various details about the hardware, software and specific queries, but the testcases speak for themselves. I'm sure you can easily reproduce my results. I'm putting my code into production tomorrow, since I can't wait anymore. Hints would be very much appreciated! cheers, Wojciech Knapik

PS. This issue is not related to the loading time of dictionaries, or calls to ts_headline for results that won't be displayed. A few other details can be found here http://pastie.textmate.org/private/hqnqfnsfsknjyjlffzmog along with snippets of my conversations in #postgresql that lead to this testcase.
Big thanks to RhodiumToad for helping me with fts for the last couple days ;] Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UTF8 with BOM support in psql
On ons, 2009-11-18 at 12:52 +0900, Itagaki Takahiro wrote: Peter Eisentraut pete...@gmx.net wrote: Together, that should cover a lot of cases. Not perfect, but far from useless. For Japanese users on Windows, the client encoding is always set to SJIS because of the restriction of cmd.exe. But the script file can be written in UTF8 with BOM. I don't think we should depend on the client encoding. Set by whom, how, and because of what restriction?
Re: [HACKERS] UTF8 with BOM support in psql
On tis, 2009-11-17 at 23:22 -0500, Andrew Dunstan wrote: Itagaki Takahiro wrote: I don't want user to check the encoding of scripts before executing -- it is far from fail-safe. That's what we require in all other cases. Why should UTF8 be special? But now we're back to the original problem. Certain editors insert BOMs at the beginning of the file. And that is by any definition before the embedded client encoding declaration. I think the only ways to solve this are: 1) Ignore the BOM if a client encoding declaration of UTF8 appears in a narrowly defined location near the beginning of the file (XML and PEP-0263 style). For *example*, we could ignore the BOM if the file starts with exactly BOM\encoding UTF8\n. Would probably not work well in practice. 2) Parse two alternative versions of the file, one with the BOM ignored and one with the BOM not ignored, until you need to make a decision. Hilariously complicated, but would perhaps solve the problem. 3) Give up, do nothing. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
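[Editor's illustration] Option 1 above — ignore the BOM only when an encoding declaration immediately follows it — can be sketched in a few lines of Python. The exact required form of the declaration (a literal `\encoding UTF8` on the first line after the BOM) is an assumption for illustration, not psql's actual behavior:

```python
import codecs

def strip_bom_if_declared_utf8(raw: bytes) -> bytes:
    """Sketch of option 1: drop a leading UTF-8 BOM, but only when a
    UTF8 client-encoding declaration immediately follows it; otherwise
    leave the input untouched."""
    if raw.startswith(codecs.BOM_UTF8):
        rest = raw[len(codecs.BOM_UTF8):]
        if rest.startswith(b"\\encoding UTF8\n"):
            return rest
    return raw
```

As Peter notes, this would probably not work well in practice, because editors that insert BOMs make no guarantee about what the first real line of the file is.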
Re: [HACKERS] RFC for adding typmods to functions
On tis, 2009-11-17 at 17:09 -0500, Tom Lane wrote: I can see the following definitional issues: Should we be able to find the answers to those, or at least a basis of discussion about those, in the SQL standard? Has anyone checked? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unpredictable shark slowdown after migrating to 8.4
Thank you for the hints.

Why only those modes? I'd search for locks with granted=false, then see all the other locks held by the process that's holding the conflicting lock with granted=true (i.e. the one you're waiting on).

Something like this?

SELECT granted, pid, virtualxid, transactionid, virtualtransaction,
       count(1) AS locks, current_query
FROM pg_locks AS l
LEFT JOIN pg_stat_activity AS a ON pid = procpid
GROUP BY 1, 2, 3, 4, 5, 7
ORDER BY 1, 6 DESC;

And two more queries to do extended analysis of its results after restarting PG:

SELECT pg_stat_activity.datname, pg_class.relname, pg_locks.transactionid,
       pg_locks.mode, pg_locks.granted, pg_stat_activity.usename,
       pg_stat_activity.current_query, pg_stat_activity.query_start,
       age(now(), pg_stat_activity.query_start) AS age,
       pg_stat_activity.procpid
FROM pg_stat_activity, pg_locks
LEFT OUTER JOIN pg_class ON pg_locks.relation = pg_class.oid
WHERE pg_locks.pid = pg_stat_activity.procpid
ORDER BY query_start;

SELECT * FROM pg_locks;

Are there other things I should do when the problem arises again? -- Regards, Sergey Konoplev
Re: [HACKERS] Very bad FTS performance with the Polish config
Tom Lane wrote: I tested on 8.3.1 on G5/OSX 10.5.8 and Xeon/Gentoo AMD64-2008.0 (2.6.21), then switched both installations to 8.3.8 (both packages compiled from source, but provided by the distro - port/emerge). The Polish dictionaries and config were created according to this article (it's in Polish, but the code is self-explanatory): http://www.depesz.com/index.php/2008/04/22/polish-tsearch-in-83-polski-tsearch-w-postgresie-83/ I tried to duplicate this test, but got no further than here: u8=# CREATE TEXT SEARCH DICTIONARY polish_ispell ( TEMPLATE = ispell, DictFile = polish, AffFile = polish, StopWords = polish ); ERROR: syntax error CONTEXT: line 174 of configuration file /home/tgl/testversion/share/postgresql/tsearch_data/polish.affix: L E C -C,GĹEM #zalec (15a) u8=# Seems there's something about the current version of the dictionary that we don't like. I used sjp-ispell-pl-20091117-src.tar.bz2 ... Here are the files I used (polish.affix, polish.dict already generated): http://wolniartysci.pl/pl.tar.gz These should work fine. I'd be grateful if you could test and see if you get similar results. cheers, Wojciech Knapik PS. Weird, I get the emails without a reply-to set for the list.. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very bad FTS performance with the Polish config
your polish_english and polish configurations use the ispell dictionary and are slow, while the english configuration doesn't contain ispell. So, what is your complaint? Try adding the ispell dictionary to the english configuration and see the timings.

Oh, so this is not anomalous? These are the expected speeds for an ispell dictionary? I didn't realize that. Sorry for the bother then. It just seemed way too slow to be practical. cheers, Wojciech Knapik
Re: [HACKERS] Rejecting weak passwords
Itagaki Takahiro wrote: Looks good. I change the status of the patch to Ready for Committer. Thanks for the help! BTW, though it might not be work for this patch, we also need to reject a too-long VALID UNTIL setting. If the password is complex, we should not use the same password for a long time. There are some cases, e.g. application servers logging into the database, where you cannot just let the password expire, so I think this would at best have to be a rule with exceptions. Another thing that makes VALID UNTIL inconvenient to use is that after expiration, logins simply fail, and the user is never prompted to change the password. But of course you are right: requiring a limited password lifetime is closely related to requiring a good password. Yours, Laurenz Albe
Re: [HACKERS] plperl and inline functions -- first draft
On Nov 18, 2009, at 5:46 AM, Andrew Dunstan wrote: Joshua Tolley wrote:

+    plperl_call_data *save_call_data = current_call_data;
+    bool        oldcontext = trusted_context;
+
+    if (SPI_connect() != SPI_OK_CONNECT)
+        elog(ERROR, "could not connect to SPI manager");
...
+    current_call_data = (plperl_call_data *) palloc0(sizeof(plperl_call_data));
+    current_call_data->fcinfo = fake_fcinfo;
+    current_call_data->prodesc = desc;

I don't think this is done in the right order. If it is then this comment in plperl_func_handler is wrong (as well as containing a typo):

/*
 * Create the call_data beforing connecting to SPI, so that it is not
 * allocated in the SPI memory context
 */

Yes, current_call_data can't be allocated in the SPI memory context, since it's used to extract the result after SPI_finish is called, although it doesn't lead to problems here since no result is returned. Anyway, I'd move SPI_connect after the current_call_data initialization. I also noticed that no error context is set in the inline handler; I'm not sure whether it is really useful except for the sake of consistency, but in case it is - here is the patch: inline_callback.diff Description: Binary data -- Alexey Klyukin http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc
Re: [HACKERS] RFC for adding typmods to functions
2009/11/18 Peter Eisentraut pete...@gmx.net: On tis, 2009-11-17 at 17:09 -0500, Tom Lane wrote: I can see the following definitional issues: Should we be able to find the answers to those, or at least a basis of discussion about those, in the SQL standard? Has anyone checked? I am not sure the SQL standard is good inspiration in this case. Does the SQL standard have typmod-less varchar or numeric? Does it have polymorphic types? Maybe only one of these is in the standard. The question to answer is whether functions foo(varchar(3)) and foo(varchar(10)) should be able to exist at the same time. Regards Pavel
[HACKERS] byteain for new hex escaped data
Looking at how byteain detects whether the input it is passed is the new hex format escape or the old octal escape, it uses:

char *inputText = PG_GETARG_CSTRING(0);

if (inputText[0] == '\\' && inputText[1] == 'x')

Doesn't this read off the end of inputText in the case of SELECT ''::bytea, or is there some padding happening somewhere that makes this legal? Kris Jurka
Re: [HACKERS] Summary and Plan for Hot Standby
Tatsuo Ishii wrote: Please correct me if I'm wrong. Parse will result in obtaining RowExclusiveLock on the target table if it is parsing INSERT/UPDATE/DELETE. If so, is this ok in the standby? Any attempt to take RowExclusiveLock will fail. Any attempt to execute INSERT/UPDATE/DELETE will fail. This behaviour should be identical to read only transaction mode. If it is not documented as an exception, please report it as a bug. Is it? It seems read only transaction mode is perfectly happy with RowExclusiveLock: Hmm, that's a good point. I can't immediately see that that would cause any trouble, but it gives me an uncomfortable feeling about the locking. Which locks exactly need to be replayed in the standby, and why? Which locks can read-only transactions acquire? The doc says:

+ In recovery, transactions will not be permitted to take any table lock
+ higher than AccessShareLock. In addition, transactions may never assign
+ a TransactionId and may never write WAL.
+ Any LOCK TABLE command that runs on the standby and requests a specific
+ lock type other than AccessShareLock will be rejected.

which seems wrong, given Tatsuo-san's example. Is that paragraph only referring to LOCK TABLE, and not other means of acquiring locks? Either way, it needs to be clarified or fixed. access/transam/README says:

+Further details on locking mechanics in recovery are given in comments
+with the Lock rmgr code.

but there's no explanation there either of *why* the locking works as it does. In LockAcquire(), we forbid taking locks higher than AccessShareLock during recovery mode, but only for LOCKTAG_OBJECT locks. Why? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] byteain for new hex escaped data
On ons, 2009-11-18 at 06:46 -0500, Kris Jurka wrote: Looking at how byteain detects whether the input it is passed is the new hex format escape or the old octal escape, it uses:

char *inputText = PG_GETARG_CSTRING(0);

if (inputText[0] == '\\' && inputText[1] == 'x')

Doesn't this read off the end of inputText in the case of SELECT ''::bytea, or is there some padding happening somewhere that makes this legal? In the case of ''::bytea, inputText is an empty NUL-terminated C string, so inputText[0] == '\0', the first comparison fails, and the second test is never evaluated because && short-circuits.
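[Editor's illustration] A literal restatement of the test in Python makes the short-circuit argument concrete; the function name is illustrative, and the explicit length guard stands in for C's NUL terminator (Python strings have no trailing '\0' to compare against):

```python
def looks_like_hex_bytea(input_text: str) -> bool:
    """Mirror byteain's prefix test for the hex input format '\\x...'.
    The short-circuit 'and' keeps input_text[1] from being evaluated
    on an empty string, just as C's && does after the NUL check."""
    return (len(input_text) > 1
            and input_text[0] == "\\"
            and input_text[1] == "x")
```

In the C original the first index is always readable (at worst it reads the terminating NUL), so no out-of-bounds access occurs even for the empty-string input.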
Re: [HACKERS] RFC for adding typmods to functions
On ons, 2009-11-18 at 11:46 +0100, Pavel Stehule wrote: I am not sure if SQL standard is good inspiration in this case. I'm not sure either, but I think it's premature to make a conclusion about that without having checked at all. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC for adding typmods to functions
2009/11/18 Peter Eisentraut pete...@gmx.net: On ons, 2009-11-18 at 11:46 +0100, Pavel Stehule wrote: I am not sure the SQL standard is good inspiration in this case. I'm not sure either, but I think it's premature to make a conclusion about that without having checked at all. ok, I'll recheck the SQL/PSM part again :) Pavel
Re: [HACKERS] UTF8 with BOM support in psql
Peter Eisentraut wrote: But now we're back to the original problem. Certain editors insert BOMs at the beginning of the file. And that is by any definition before the embedded client encoding declaration. I think the only ways to solve this are: 1) Ignore the BOM if a client encoding declaration of UTF8 appears in a narrowly defined location near the beginning of the file (XML and PEP-0263 style). For *example*, we could ignore the BOM if the file starts with exactly BOM\encoding UTF8\n. Would probably not work well in practice. 2) Parse two alternative versions of the file, one with the BOM ignored and one with the BOM not ignored, until you need to make a decision. Hilariously complicated, but would perhaps solve the problem. 3) Give up, do nothing. 4) set the client encoding before the file is read in any of the ways that have already been discussed and then allow psql to eat the BOM. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
I'm in Tokyo right now, so please excuse my abbreviated reply. On Tue, 2009-11-17 at 23:13 -0500, Robert Haas wrote: Forgive me if this was discussed before, but why does this store the strategy numbers of the relevant operators instead of the operators themselves? At constraint definition time, I need to make sure that the strategy numbers can be identified anyway, so it wouldn't save any work in ATAddOperatorExclusionConstraint. At the time it seemed slightly more direct to use strategy numbers in index_check_constraint, but it's probably about the same. It seems like this could lead to surprising behavior if the user modifies the definition of the operator class. Right now, operator classes can't be modified in any meaningful way. Am I missing something? I'm wondering if we can't use the existing BuildIndexValueDescription() rather than the new function tuple_as_string(). I realize there are two tuples, but maybe it makes sense to just call it twice? Are you suggesting I change the error output, or reorganize the code to try to reuse BuildIndexValueDescription, or both? Regards, Jeff Davis
Re: [HACKERS] UTF8 with BOM support in psql
On ons, 2009-11-18 at 08:52 -0500, Andrew Dunstan wrote: 4) set the client encoding before the file is read in any of the ways that have already been discussed and then allow psql to eat the BOM. This is certainly a workaround, just like piping the file through a suitable sed expression would be, but conceptually, the client encoding is a property of the file and should therefore be marked in the file. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] operator exclusion constraints
Robert Haas robertmh...@gmail.com writes: Forgive me if this was discussed before, but why does this store the strategy numbers of the relevant operators instead of the operators themselves? It seems like this could lead to surprising behavior if the user modifies the definition of the operator class. Wild guess: http://www.postgresql.org/docs/8.4/static/xindex.html 34.14.2. Index Method Strategies The operators associated with an operator class are identified by strategy numbers, which serve to identify the semantics of each operator within the context of its operator class. For example, B-trees impose a strict ordering on keys, lesser to greater, and so operators like "less than" and "greater than or equal to" are interesting with respect to a B-tree. Because PostgreSQL allows the user to define operators, PostgreSQL cannot look at the name of an operator (e.g., < or >=) and tell what kind of comparison it is. Instead, the index method defines a set of strategies, which can be thought of as generalized operators. Each operator class specifies which actual operator corresponds to each strategy for a particular data type and interpretation of the index semantics. Regards, -- dim
Re: [HACKERS] operator exclusion constraints
All, FWIW, I'm doing a redesign of a client's production web application right now. I was able, by combining OEC and the Period type from pgfoundry, to make a set of constraints for declaratively asserting in a sports database: That the same player couldn't belong to two different teams at the same time; That the same player couldn't belong to the same team in two different positions with overlapping time periods. This worked as spec'd, and would be extremely useful for this real-world app if it was ready to use in production now. However, I do have an issue with the SQLSTATE returned from the OEC violation. Currently it returns constraint violation, which, from the perspective of an application developer, is not useful. OECs are, in application terms, materially identical to UNIQUE constraints and serve the same purpose. As such, I'd far rather see OECs return unique key violation instead, as any existing application error-trapping code would handle the violation more intelligently if it did. --Josh Berkus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
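[Editor's illustration] The non-overlap semantics Josh describes — the same player cannot be on two different teams at the same time — are what the operator exclusion constraint enforces pairwise between rows. As a plain-Python sketch of that rule (illustrative names, half-open periods; not the OEC syntax itself):

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open periods [start, end) intersect."""
    return a_start < b_end and b_start < a_end

def violates_team_constraint(existing, new):
    """existing/new: (player, team, start, end) tuples. A new row is
    rejected when the same player already has an overlapping period
    on a different team -- the first constraint from the post."""
    player, team, start, end = new
    return any(
        p == player and t != team and overlaps(s, e, start, end)
        for (p, t, s, e) in existing
    )
```

The constraint form does this declaratively and with index support, which is exactly why Josh finds it preferable to application-side checks like this one.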
Re: [HACKERS] Very bad FTS performance with the Polish config
On Wed, 18 Nov 2009, Wojciech Knapik wrote: your polish_english and polish configurations use the ispell dictionary and are slow, while the english configuration doesn't contain ispell. So, what is your complaint? Try adding the ispell dictionary to the english configuration and see the timings.

Oh, so this is not anomalous? These are the expected speeds for an ispell dictionary? I didn't realize that. Sorry for the bother then. It just seemed way too slow to be practical.

You can see the real timings using the ts_lexize() function for different dictionaries (try several times to avoid the cold-start problem) instead of ts_headline(), which involves other factors. On my test machine I see no real difference between the very simple dictionary and the french ispell and snowball dictionaries:

dev-oleg=# select ts_lexize('simple','voila');
 ts_lexize
-----------
 {voila}
(1 row)
Time: 0.282 ms

dev-oleg=# select ts_lexize('simple','voila');
 ts_lexize
-----------
 {voila}
(1 row)
Time: 0.269 ms

dev-oleg=# select ts_lexize('french_stem','voila');
 ts_lexize
-----------
 {voil}
(1 row)
Time: 0.187 ms

I see no big difference in ts_headline as well:

dev-oleg=# select ts_headline('english','I can do voila', 'voila'::tsquery);
      ts_headline
-----------------------
 I can do <b>voila</b>
(1 row)
Time: 0.265 ms

dev-oleg=# select ts_headline('nomaofr','I can do voila', 'voila'::tsquery);
      ts_headline
-----------------------
 I can do <b>voila</b>
(1 row)
Time: 0.299 ms

This is the 8.4.1 version of PostgreSQL. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
[HACKERS] Patch - Reference Function Parameters by Name
Attached is a patch to perform parameter reference lookups by name in the body of functions. I'm hesitant to put it in for the commitfest as is, without a couple of questions posed to the group: 1. palloc needs no free? I suppose this is a general knowledge question, but it seemed to be the case after trying to look for deallocation 2. I inserted myself more than I had expected along the road from SQL to columnref_hook, and I'm not sure all of those lookups of parameter names and function name are required. 3. Since it was mentioned in an earlier email that the function name.parameter name syntax was desired, I went ahead and added that, but it required another passthrough as indicated in 2 4. I made a judgement call in terms of resolution: if the columnref_hook for param-by-name resolution is called with a non-null node (meaning a column was already found), we avoid being an ambiguous reference, and prefer the column already found. Passes all tests in make check, and I'll add some tests for this after I get feedback for the above items. Regards, -George param_names.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UTF8 with BOM support in psql
Peter Eisentraut pete...@gmx.net writes: This is certainly a workaround, just like piping the file through a suitable sed expression would be, but conceptually, the client encoding is a property of the file and should therefore be marked in the file. In a perfect world things would be like that, but the world is imperfect. When only one of the available encodings even pretends to have a marking convention, and even that one convention is broken, imagining that you can fix it is just a recipe for making things worse. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very bad FTS performance with the Polish config
Oleg Bartunov wrote: your polish_english and polish configurations use the ispell dictionary and are slow, while the english configuration doesn't contain ispell. So, what is your complaint? Try adding the ispell dictionary to the english configuration and see the timings.

Oh, so this is not anomalous? These are the expected speeds for an ispell dictionary? I didn't realize that. Sorry for the bother then. It just seemed way too slow to be practical.

You can see the real timings using the ts_lexize() function for different dictionaries (try several times to avoid the cold-start problem) instead of ts_headline(), which involves other factors. On my test machine I see no real difference between the very simple dictionary and the french ispell and snowball dictionaries:

ts_lexize seems to be just as fast for simple, polish_ispell and english_stem with the 'voila' argument. polish_ispell is in fact *faster* for the lorem ipsum text repeated a couple of times (10?). Which suggests that the issue is with ts_headline itself.

I see no big difference in ts_headline as well:

dev-oleg=# select ts_headline('english','I can do voila', 'voila'::tsquery);
      ts_headline
-----------------------
 I can do <b>voila</b>
(1 row)
Time: 0.265 ms

Yes, for 4-word texts the results are similar. Try that with a longer text and the difference becomes more and more significant. For the lorem ipsum text, 'polish' is about 4 times slower than 'english'. For 5 repetitions of the text, it's 6 times; for 10 repetitions - 7.5 times...

This is the 8.4.1 version of PostgreSQL.

And that was 8.3.8/OSX. cheers, Wojciech Knapik
Re: [HACKERS] Python 3.1 support
On sön, 2009-11-15 at 18:39 -0700, James Pye wrote: I can see how function modules might look like a half-step backwards from function fragments at first, but the benefits of a *natural* initialization section (the module body) was enough to convince me. The added value on the PL developer's side was also compelling. Tracebacks were trivial to implement, and there is no need to munge the function's source. It seemed like a win all around... The question is whether it helps the user, not the implementer. As far as I can tell, it just creates more typing for no benefit whatsoever. Also, it's inconsistent with normal Python script files and with other PLs. AFA native typing is concerned, I think the flexibility and potential it offers is useful, no? Already, plpython3 provides properties on PG's datetime types to access the date_part()'s of the object. OTOH, for folk who primarily use the PL to access functionality in Python modules(bindings), native typing may be of no direct utility as they will likely need to convert anyways. (If that's your common use-case, then the absence of interest in native typing is quite understandable.) Right, if I use PL/Python, I do it because I want to use Python. I don't need another PostgreSQL implementation on top of Python. The maintenance effort required to keep those two consistent aside. Again, I'm only one user. But so far I haven't seen anyone else speak up here, and clearly accepting this for inclusion will need nontrivial convincing. the pain of dealing with a second implementation. What pain are you anticipating? Maintenance? Right. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Very bad FTS performance with the Polish config
2009/11/18 Oleg Bartunov o...@sai.msu.su: On Wed, 18 Nov 2009, Wojciech Knapik wrote: your polish_english and polish configurations use the ispell dictionary and are slow, while the english configuration doesn't contain ispell. So, what is your complaint? Try adding the ispell dictionary to the english configuration and see the timings.

Oh, so this is not anomalous? These are the expected speeds for an ispell dictionary? I didn't realize that. Sorry for the bother then. It just seemed way too slow to be practical.

You can see the real timings using the ts_lexize() function for different dictionaries (try several times to avoid the cold-start problem) instead of ts_headline(), which involves other factors. On my test machine I see no real difference between the very simple dictionary and the french ispell and snowball dictionaries:

It depends on the language (and dictionary sizes). For Czech:

postgres=# select ts_lexize('simple','vody');
 ts_lexize
-----------
 {vody}
(1 row)
Time: 0.785 ms

postgres=# select ts_lexize('cspell','vody');
 ts_lexize
-----------
 {voda}
(1 row)
Time: 1.041 ms

I'm afraid Czech and Polish are very hard languages (with long affix files). Regards Pavel

dev-oleg=# select ts_lexize('simple','voila');
 ts_lexize
-----------
 {voila}
(1 row)
Time: 0.282 ms

dev-oleg=# select ts_lexize('simple','voila');
 ts_lexize
-----------
 {voila}
(1 row)
Time: 0.269 ms

dev-oleg=# select ts_lexize('french_stem','voila');
 ts_lexize
-----------
 {voil}
(1 row)
Time: 0.187 ms

I see no big difference in ts_headline as well:

dev-oleg=# select ts_headline('english','I can do voila', 'voila'::tsquery);
      ts_headline
-----------------------
 I can do <b>voila</b>
(1 row)
Time: 0.265 ms

dev-oleg=# select ts_headline('nomaofr','I can do voila', 'voila'::tsquery);
      ts_headline
-----------------------
 I can do <b>voila</b>
(1 row)
Time: 0.299 ms

This is the 8.4.1 version of PostgreSQL.
Regards, Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Very bad FTS performance with the Polish config
On Wed, 18 Nov 2009, Wojciech Knapik wrote: Yes, for 4-word texts the results are similar. Try that with a longer text and the difference becomes more and more significant. For the lorem ipsum text, 'polish' is about 4 times slower than 'english'. For 5 repetitions of the text, it's 6 times; for 10 repetitions, 7.5 times...

Again, I see nothing unclear here, since the dictionaries (as specified in the configuration) apply to ALL words in the document. The more words in the document, the more overhead. You can pass ts_headline just part of the document, rather than all of it, to get predictable performance. This is useful in any case.

Regards, Oleg
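Oleg's suggestion of headlining only part of the document can be sketched like this (the `docs` table, its columns, and the query term are hypothetical; the point is that only a fixed-size prefix of the body is fed to ts_headline, so its cost no longer grows with document length):

```sql
-- Hypothetical example: limit ts_headline() work to the first 2000
-- characters of each matching document instead of the full body.
SELECT id,
       ts_headline('polish',
                   substring(body for 2000),   -- headline only a prefix
                   to_tsquery('polish', 'woda'))
FROM docs
WHERE body_tsv @@ to_tsquery('polish', 'woda');
```

The full-text match itself still runs against the whole indexed document; only the headline generation is bounded.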
Re: [HACKERS] Python 3.1 support
Again, I'm only one user. But so far I haven't seen anyone else speak up here, and clearly accepting this for inclusion will need nontrivial convincing. Well, FWIW, I am excited about better type integration. Also, I am a little skeptical about this patch. I am sorry if this has already been discussed, but would this mean that I need to choose whether pl/python is built against Python 2.* or Python 3.*? -Nathan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
Nathan Boley npbo...@gmail.com writes: Also, I am a little skeptical about this patch. I am sorry if this has already been discussed, but would this mean that I need to choose whether pl/python is built against Python 2.* or Python 3.*? Yes. That's exactly what I was complaining about upthread. I'm not a Python user, but from what I can gather of the 2-to-3 changes, having to choose one at package build time is going to be a disaster. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RFC for adding typmods to functions
Tom Lane t...@sss.pgh.pa.us wrote: there is a constituency that cares --- mainly people who use client-side code that tends to fall over if it doesn't see a suitable maxlength attached to query result columns. I suspect it will primarily be software which is dealing with large enough result sets that reading through it all to find the maximum width for a column in a particular request is not viable. It's also likely to be more of an issue with software which needs to work with multiple database products. Finally, it simplifies life for application developers who want a form or report to have columns which are wide enough to handle the data which might show up, when they don't want the column widths to change from run to run -- as when daily reports will be added to a binder. The first example I came across in the archives was http://archives.postgresql.org/pgsql-sql/2002-06/msg00235.php It's not surprising that we got a post about Crystal Reports having an issue; all of the above applies to it. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
On Wed, 2009-11-18 at 12:06 -0500, Tom Lane wrote: Nathan Boley npbo...@gmail.com writes: Also, I am a little skeptical about this patch. I am sorry if this has already been discussed, but would this mean that I need to choose whether pl/python is built against Python 2.* or Python 3.*? Yes. That's exactly what I was complaining about upthread. I'm not a Python user, but from what I can gather of the 2-to-3 changes, having to choose one at package build time is going to be a disaster.

Agreed. We really need to have both a plpython and a plpython3. Heck, this would play nicely too, because we would keep backward compatibility while also supporting the newer version.

Joshua D. Drake

regards, tom lane

-- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering If the world pushes look it in the eye and GRR. Then push back harder. - Salamander
Re: [HACKERS] Patch - Reference Function Parameters by Name
George Gensure wer...@gmail.com writes: Attached is a patch to perform parameter reference lookups by name in the body of functions. I'm hesitant to put it in for the commitfest as is, without a couple of questions posed to the group: I looked through this very quickly. I'm not in favor of the approach you have chosen of hacking all the upper layers in order to pass parameter names through them; and putting the function name into those APIs too is right out. What I did in plpgsql avoided that by establishing a callback protocol and keeping all the knowledge of names within the callback function. SQL functions have a different call path to the parser, so we might need to adjust things in that path; but you definitely should not be needing to mess with plancache.c any further. 1. palloc needs no free? I suppose this is a general knowledge question, but it seemed to be the case after trying to look for deallocation Depends. If you're creating something that is meant to live about as long as the current statement anyway, you can just leave it to be garbage-collected when the current memory context is destroyed. There are cases where you need to be more aggressive about pfree'ing things to avoid large cumulative memory usage, but probably anything that is invoking parsing doesn't really need to worry (the parse process is going to create a whole lot more trash than you will anyway). 4. I made a judgement call in terms of resolution: if the columnref_hook for param-by-name resolution is called with a non-null node (meaning a column was already found), we avoid being an ambiguous reference, and prefer the column already found. The consensus seems to be that we should throw error for ambiguity. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
Joshua D. Drake j...@commandprompt.com writes: On Wed, 2009-11-18 at 12:06 -0500, Tom Lane wrote: Yes. That's exactly what I was complaining about upthread. I'm not a Python user, but from what I can gather of the 2-to-3 changes, having to choose one at package build time is going to be a disaster. Agreed. We really need to have a plpython and plpython3. Peter was concerned about duplicative maintenance effort, but what I think this patch shows is that (at least for the near future) both could be built from a single source file. What we need is configure and makefile support to do that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
On Wed, 2009-11-18 at 12:28 -0500, Tom Lane wrote: Joshua D. Drake j...@commandprompt.com writes: On Wed, 2009-11-18 at 12:06 -0500, Tom Lane wrote: Yes. That's exactly what I was complaining about upthread. I'm not a Python user, but from what I can gather of the 2-to-3 changes, having to choose one at package build time is going to be a disaster. Agreed. We really need to have a plpython and plpython3. Peter was concerned about duplicative maintenance effort, but what I think this patch shows is that (at least for the near future) both could be built from a single source file. What we need is configure and makefile support to do that.

Ahh, so we would have:

--enable-plpython2=/usr/bin/python2
--enable-plpython3=/usr/bin/python3

? That seems reasonable if we can run both. Although I wonder if, longer term (2.x is going to be supported for a long time), we will end up with frustration within the single source file trying to keep things straight.

Joshua D. Drake

regards, tom lane
Re: [HACKERS] Python 3.1 support
Joshua D. Drake j...@commandprompt.com writes: On Wed, 2009-11-18 at 12:28 -0500, Tom Lane wrote: Peter was concerned about duplicative maintenance effort, but what I think this patch shows is that (at least for the near future) both could be built from a single source file. That seems reasonable if we can run both. Although I wonder if longer term (2.x is going to be support a long time) we will end up with frustration within the single source file trying to keep things straight. Once it gets to the point where it's more trouble to keep them together than not, we can split the source. But judging from this patch, a single source file is the ticket for the moment. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Timezones (in 8.5?)
Andrew Gierth and...@tao11.riddles.org.uk wrote: If he meant (A), then you store the event as: (ts,tz) = (timestamp '2010-07-27 10:30:00', 'Chile/Santiago') If he meant (B), then you store the event as (tsz,tz) = (timestamp '2010-07-27 10:30:00' at time zone 'Chile/Santiago', 'Chile/Santiago') You seem to be agreeing that these problems can't be solved without storing a time zone string in addition to the timestamp. As I read it, Hernán was wishing for types which include this, rather than having to do the above dance with multiple values. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
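The two-value dance Andrew describes can be sketched concretely (table and column names are hypothetical): store the wall-clock timestamp together with the zone name, and reconstruct the absolute instant on the way out.

```sql
-- Hypothetical schema: keep the local time and the zone it belongs to.
CREATE TABLE events (
    ev_local timestamp NOT NULL,   -- e.g. '2010-07-27 10:30:00'
    ev_tz    text      NOT NULL    -- e.g. 'Chile/Santiago'
);

-- Case (A): the event is pinned to local wall-clock time, so compute
-- the absolute instant at query time; a later change to the zone's
-- DST rules is then picked up automatically.
SELECT ev_local AT TIME ZONE ev_tz AS ev_instant
FROM events;
```

This is essentially the pair-of-columns workaround; the wish in the thread is for a type that carries the zone name along with the timestamp so this bookkeeping isn't the application's problem.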
Re: [HACKERS] Very bad FTS performance with the Polish config
Oleg Bartunov wrote: Yes, for 4-word texts the results are similar. Try that with a longer text and the difference becomes more and more significant. For the lorem ipsum text, 'polish' is about 4 times slower, than 'english'. For 5 repetitions of the text, it's 6 times, for 10 repetitions - 7.5 times... Again, I see nothing unclear here, since dictionaries (as specified in configuration) apply to ALL words in document. The more words in document, the more overhead. You're missing the point. I'm not surprised that the function takes more time for larger input texts - that's obvious. The thing is, the computation times rise more steeply when the Polish config is used. Steeply enough, that the difference between the Polish and English configs becomes enormous in practical cases. Now this may be expected behaviour, but since I don't know if it is, I posted to the mailing lists to find out. If you're saying this is ok and there's nothing to fix here, then there's nothing more to discuss and we may consider the thread closed. If not, ts_headline deserves a closer look. cheers, Wojciech Knapik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
Here's the patch to support Python >= 3.1 with PL/Python. The compatibility code is mostly in line with the usual 2-to-3 C porting practice and is documented inline.

I took a cursory look at this patch and, while the logic seems sound and roughly in line with the suggested Python porting procedure, I'm not quite certain what this implies for potential future patches. For instance, if I wanted to write a type converter for bytea - the Python 3 bytes type - would the expectation be that I ensure it works in Python 2? Or is an ifdef that ignores it in the case of Python 2 OK, so we can just put a note in the docs? Also, how far back do we want to maintain 2.x compatibility? 2.0? If I wanted to submit a patch that makes use of the list sort method, do I need to ensure that it can use either the cmp argument or a key argument? What if I wanted to implement a set-returning function that made use of an iterator's next() method? Would I just put ifdefs around the code, or a preprocessor definition that defines NEXT as next() for Python 2.x and __next__() for 3.x? I guess my first impression is that Python broke compatibility for a reason, and that either plpython can't evolve, or it will quickly become impossible to maintain. That being said, I mostly buy the maintenance arguments from the previous discussion, but if we want to have plpython and plpython3, a bunch of defines and ifdefs does not seem like the best way to do this. Would a better approach be to maintain a compatibility layer? i.e. plpython_compat.h/c, plpython2.c, plpython3.c. Then patches that apply to Python 3 can be applied to plpython3.c, and any changed function can be ripped out of plpython_compat and moved into plpython2. I'm sorry to snipe from the sidelines like this.
If we didn't expect plpython to evolve then this patch seems like the correct approach, but there is clearly some desire to expand plpython and following this path seems like it will end in a much more painful split in the future or a necessary rewrite. If there is some consensus that this is the best approach, then I will do a more comprehensive review. -Nathan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
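Nathan's next()/__next__() case is exactly the kind of difference a compatibility layer would paper over. A minimal sketch of such a shim at the Python level (purely illustrative - PL/Python's actual compatibility code is C preprocessor macros, and `iter_next` is a made-up name):

```python
import sys

# Dispatch on the major version once, at import time, the way a
# hypothetical plpython_compat module might.
if sys.version_info[0] >= 3:
    def iter_next(it):
        # Python 3 renamed the iterator method to __next__()
        return it.__next__()
else:
    def iter_next(it):
        # Python 2 spelling
        return it.next()

def first_two(iterable):
    """Consume the first two items of any iterable, version-agnostically."""
    it = iter(iterable)
    return iter_next(it), iter_next(it)
```

Callers write against `iter_next` and never branch on the interpreter version themselves; when 2.x support is eventually dropped, only the shim is deleted.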
Re: [HACKERS] Rejecting weak passwords
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes: BTW, it might not be a work for this patch, we also need to reject too long VALID UNTIL setting. If the password is complex, we should not use the same password for a long time. This is a good point --- people who have password strength policies tend to want a limited usage period as well. It's even conceivable that you could have different allowed lifespans depending on how strong the password is. I suggest we alter the hook signature to pass it the valuntil time along with the other parameters it's already getting, and let the one hook enforce policies for both. I'm reviewing the patch now, and barring objections will make this change before committing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
On Nov 18, 2009, at 8:37 AM, Peter Eisentraut wrote:

The question is whether it helps the user, not the implementer.

Sure, but do you have a patch waiting to implement tracebacks? I'd argue the reason it's never been done is due to the way procedures are currently managed in PL/Python. And *without some significant refactoring*, any patch fully implementing tracebacks is going to be a seriously ugly hack. What helped the implementer here would help the user.

As far as I can tell, it just creates more typing for no benefit whatsoever.

def main(*args): is annoying, but not entirely lamentable... It's explicit, as well (no need to document munging that occurs behind the scenes). Also, compare the cases where you need to cache some initialized data:

if 'key' not in SD:
    ...
    SD['key'] = my_newly_initialized_data
...

With function modules, SD is not needed, as you have your module globals to keep your locally cached data in:

...
data = my_newly_initialized_data

def main(*args):
    ...

Also, it's inconsistent with normal Python script files

Hinges on whether normal is actually normal. I often use the __name__ convention in script files myself:

if __name__ == '__main__':
    main(...)

That is, using that convention, the script can be import'd and used without executing the script functionality. (It has proven to be very handy a few times now.)

and with other PLs.

I don't understand why that's a significant enough interest to note.

I don't need another PostgreSQL implementation on top of Python.

Indeed, and I do understand that. That is, I have removed some features with that very thought in mind. (OTOH, I consider the date_part properties on datetime types to be special: too likely useful.) [tho, PostgreSQL implementation? I think I understand what you were getting at, but..]

The maintenance effort required to keep those two consistent aside.

I don't think there are many consistency issues here. What did you have in mind?

Again, I'm only one user.
But so far I haven't seen anyone else speak up here, and clearly accepting this for inclusion will need nontrivial convincing. Agreed. It would seem quite doomed. At this point, I'm not going to try getting it into PG. (apparent futility and such) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rejecting weak passwords
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes: Albe Laurenz laurenz.a...@wien.gv.at wrote: I agree on the second point, and I changed the patch accordingly. Here's the latest version. Looks good. I change status of the patch to Ready for Committer. Applied with some minor modifications. Aside from the added valuntil parameter, I changed the isencrypted parameter to an int with some #define'd values. It seems easily foreseeable that we'll replace the MD5 encryption scheme someday, and it'd be good to ensure that this API is extendable when that happens. Also, I got rid of the bool return value and made the hook responsible for throwing its own errors. I don't know about you guys, but I would cheerfully kill anybody who tried to make me use a password checker that didn't tell me anything about why it thinks my password is too weak. (The CrackLib API we are using is lamentably badly designed on this score --- does it have another call that provides a more useful error report?) Even if you think weak password is adequate for that class of complaints, the single error message would certainly not do for complaints about the valuntil date being too far away. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Oversight in CREATE FUNCTION + EXPLAIN?
Hey,

So I ran across this today:

CREATE OR REPLACE FUNCTION RETURN_LOTS(INT)
RETURNS SETOF INT AS $$
    SELECT generate_series(1,$1);
$$ COST 0.5 ROWS 50 SET work_mem TO '5MB' LANGUAGE 'SQL';

postgres=# explain analyze select return_lots(1000);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.059..21411.187 rows=1000 loops=1)
 Total runtime: 25951.904 ms

Shouldn't the estimated rows be 50?

Joshua D. Drake
[HACKERS] Amazing performance failure with SQL function
I was just writing a syntactical example and wanted to make sure it worked. I found this:

CREATE OR REPLACE FUNCTION RETURN_LOTS(INT)
RETURNS SETOF INT AS $$
    SELECT generate_series(1,$1);
$$ COST 0.5 ROWS 1000 SET work_mem TO '5MB' LANGUAGE 'SQL';

postgres=# explain analyze select return_lots(1000);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.057..21255.309 rows=1000 loops=1)
 Total runtime: 25784.077 ms
(2 rows)

O.k. slow, but no big deal right? Well:

postgres=# SET cpu_operator_cost to 0.5;
SET
postgres=# set work_mem to '5MB';
SET
postgres=# explain analyze SELECT generate_series(1,1000);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Result  (cost=0.00..0.51 rows=1 width=0) (actual time=0.004..6796.389 rows=1000 loops=1)
 Total runtime: 11301.681 ms
(2 rows)

This is repeatable. I expect a little regression because we have to compile the SQL, but 14 seconds?

postgres=# select version();
                                              version
----------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.1-3ubuntu3) 4.4.1
(1 row)

Joshua D. Drake
Re: [HACKERS] Oversight in CREATE FUNCTION + EXPLAIN?
Joshua D. Drake j...@commandprompt.com writes: Shouldn't the estimated rows be 50? It is if you do select * from return_lots(1000). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
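The distinction Tom points to is between calling the SRF in the select list and calling it in FROM; only the latter form uses the ROWS estimate declared in CREATE FUNCTION (sketch, reusing Joshua's function from the earlier message):

```sql
-- SRF in the select list: the planner estimates 1 row for the Result node.
EXPLAIN SELECT return_lots(1000);

-- SRF in FROM: the ROWS 50 declared on the function is used
-- as the row estimate for the function scan.
EXPLAIN SELECT * FROM return_lots(1000);
```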
Re: [HACKERS] Amazing performance failure with SQL function
Joshua D. Drake j...@commandprompt.com writes: This is repeatable. I expect a little regression because we have to compile the SQL but 14 seconds? generate_series is a quite efficient C function. I think it's pretty damn good that the overhead of a SQL function on top of that is only 2X. Or were you expecting the SRF to be inlined? If so, you need to (a) be using 8.4, (b) mark it STABLE or IMMUTABLE. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
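The inlining conditions Tom lists can be sketched as follows: on 8.4, a single-statement SQL set-returning function marked STABLE or IMMUTABLE and called in FROM is a candidate for inlining, which removes the per-call function machinery entirely.

```sql
-- Marked IMMUTABLE so that 8.4 can inline the function body
-- into the calling query.
CREATE OR REPLACE FUNCTION return_lots(int) RETURNS SETOF int AS $$
    SELECT generate_series(1, $1);
$$ LANGUAGE sql IMMUTABLE;

-- Called in FROM, this should plan as a bare generate_series() call
-- rather than a function-scan over the SQL function.
EXPLAIN SELECT * FROM return_lots(10000000);
```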
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Joachim Wieland j...@mcknight.de writes: 4) Allow readers to read uncommitted notifications as well. The question that strikes me here is one of timing --- apparently, readers will now have to check the queue *without* having received a signal? That could amount to an unpleasant amount of extra overhead when the notify system isn't even in use. (Users who don't care about notify will define unpleasant amount as not zero.) I haven't read the patch, so maybe you have some cute solution to that, but if so please explain what. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
On Mon, Nov 16, 2009 at 2:35 PM, Andrew Chernow a...@esilo.com wrote: 1) drop new notifications if the queue is full (silently or with rollback) I like this one best, but not with silence of course. While it's not the most polite thing to do, this is for a super extreme edge case. I'd rather just throw an exception if the queue is full rather than start messing with the +1

So if you guys are going to insist that the notification mechanism isn't a queueing mechanism, I think it at least behooves you to have it degrade gracefully into a notification mechanism and not become entirely useless by dropping notification messages. That is, if the queue overflows, what you should do is drop the payloads and condense all the messages for a given class into a single notification for that class with an unknown payload. That way, if a cache which wants to invalidate specific objects gets a queue-overflow condition, at least it knows it should rescan the original data and rebuild the cache, and not just serve invalid data.

I still think you're on the wrong path entirely and will end up with a mechanism which serves neither use case very well, instead of two separate mechanisms that are properly designed for the two use cases.

-- greg
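Greg's degrade-gracefully idea, sketched in Python (illustrative only; the class and method names are made up, not anything in the patch): when the queue is full, collapse all pending notifications into one payload-less entry per channel rather than dropping them outright.

```python
from collections import deque

class NotifyQueue:
    """Toy model of a notification queue that condenses on overflow."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()          # (channel, payload) pairs

    def notify(self, channel, payload):
        if len(self.queue) < self.capacity:
            self.queue.append((channel, payload))
            return
        # Overflow: drop the payloads and condense each channel into a
        # single payload-less entry, so listeners still learn that
        # *something* on the channel changed and can rescan from scratch.
        channels = []
        for ch, _ in self.queue:
            if ch not in channels:
                channels.append(ch)
        if channel not in channels:
            channels.append(channel)
        self.queue = deque((ch, None) for ch in channels)
```

A listener that receives a `None` payload knows the specific invalidation messages were lost and must fall back to rebuilding its cache, which is exactly the graceful degradation Greg argues for.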
Re: [HACKERS] Listen / Notify - what to do when the queue is full
On Thu, Nov 19, 2009 at 1:48 AM, Tom Lane t...@sss.pgh.pa.us wrote: Joachim Wieland j...@mcknight.de writes: 4) Allow readers to read uncommitted notifications as well. The question that strikes me here is one of timing --- apparently, readers will now have to check the queue *without* having received a signal? That could amount to an unpleasant amount of extra overhead when the notify system isn't even in use. (Users who don't care about notify will define unpleasant amount as not zero.) The sequence in CommitTransaction() is like that: 1) add notifications to queue 2) commit to clog 3) signal backends Only those backends are signalled that listen to at least one channel, if the notify system isn't in use, then nobody will ever be signalled anyway. If a backend is reading a transaction id that has not yet committed, it will not deliver the notification. It knows that eventually it will receive a signal from that transaction and then it first checks its list of uncommitted notifications it has already read and then checks the queue for more pending notifications. Joachim -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TRIGGER with WHEN clause
Itagaki-san, I don't have any more comments on this patch, so I hope it will be reviewed by committers and then upstreamed. Thanks for your good work.

Itagaki Takahiro wrote: KaiGai Kohei kai...@ak.jp.nec.com wrote: In addition, I found a few matters:

* TOAST may be necessary for pg_trigger?

I added a toast relation to pg_trigger:

DECLARE_TOAST(pg_trigger, 2336, 2337);

I think having a toast relation for pg_trigger is reasonable, because pg_trigger already has a variable-length field, tgargs, even without the new tgqual field from the patch. I'm not sure why we don't already have a toast relation for pg_trigger, since a user might pass very long trigger arguments.

* ROW INSERT TRIGGER on COPY FROM statement

Thanks. Good catch! Fixed, and a regression test added.

* Using system columns in the WHEN clause

2) Describe a notice in the user documentation not to use system columns in the WHEN clause, because these are assigned after the trigger invocation.

I'd like to only add documentation, because I don't have a whole solution: System columns are not available in the <literal>WHEN</literal> clause because those values are initialized after triggers are called. They might return wrong values if used in expressions of the clause.

Regards, --- ITAGAKI Takahiro NTT Open Source Software Center -- OSS Platform Development Division, NEC KaiGai Kohei kai...@ak.jp.nec.com
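For reference, the WHEN clause under discussion attaches a row-level condition to a trigger so the trigger function only fires when the condition holds (a sketch of the proposed syntax; the table and function names are hypothetical):

```sql
-- Fire the trigger function only when the salary column actually changes.
CREATE TRIGGER check_salary_update
    BEFORE UPDATE ON accounts
    FOR EACH ROW
    WHEN (OLD.salary IS DISTINCT FROM NEW.salary)
    EXECUTE PROCEDURE log_salary_change();
```

Per the documentation note above, a condition like WHEN (NEW.ctid = ...) would be unreliable, since system columns are initialized only after the trigger runs.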
[HACKERS] Not safe to send CSV data message
So I went to investigate bug #5196: turned on log_destination = csvlog etc, and restarted the postmaster. I got this on stderr: 2009-11-18 20:08:52.104 EST : : LOG: loaded library passwordcheck Not safe to send CSV data The first line is a consequence of having still got shared_preload_libraries set from testing the new contrib/passwordcheck module. However, what I'm on the warpath about right now is the second line. That message is useless, not to mention not conformant to our style guidelines. And the fact that it comes out at all suggests that the csvlog startup logic is rather broken. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Python 3.1 support
On Nov 18, 2009, at 1:36 PM, James Pye wrote: At this point, I'm not going to try getting it into PG. (apparent futility and such) ugh, on second thought, I think I've written a bit too much code to stop now. I'm going to get plpython3 as far as I can and submit it to the next commitfest. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
That is, if the queue overflows what you should do is drop the payloads and condense all the messages for a given class into a single notification for that class with unknown payload. That way if a cache which wants to invalidate specific objects gets a queue overflow condition then at least it knows it should rescan the original data and rebuild the cache and not just serve invalid data. That's far more complicated than throwing an error and it discards user payload information. Let the error indicate a rescan is needed. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
On 11/16/09 3:19 AM, Joachim Wieland wrote:

1) drop new notifications if the queue is full (silently or with rollback)
2) block until readers catch up (what if the backend that tries to write the notifications actually is the lazy reader that everybody is waiting for to proceed?)
3) invent a new signal reason and send SIGUSR1 to the lazy readers; they need to interrupt whatever they are doing and copy the notifications into their own address space (without delivering the notifications, since they are in a transaction at that moment).

(4) drop *old* notifications if the queue is full.

Since everyone has made the point that LISTEN is not meant to be a full queueing system, I have no problem dropping notifications LRU-style. If we've run out of room, the oldest notifications should go first; we probably don't care about them anyway. We should probably also log the fact that we ran out of room, so that the DBA knows that they have a design issue. For volume reasons, I don't think we want to log every dropped message.

Alternately, it would be great to have a configuration option which would allow the DBA to choose any of three behaviors via GUC:

drop-oldest (as above)
drop-largest (if we run out of room, drop the largest payloads first to save space)
error (if we run out of room, error and rollback)

--Josh Berkus
Re: [HACKERS] Listen / Notify - what to do when the queue is full
We should probably also log the fact that we ran out of room, so that the DBA knows that they ahve a design issue. Can't they just bump allowed memory and avoid a redesign? Alternately, it would be great to have a configuration option which would allow the DBA to choose any of 3 behaviors via GUC: drop-oldest (as above) drop-largest (if we run out of room, drop the largest payloads first to save space) error (if we run out of room, error and rollback) I mentioned this up thread. I completely agree that overflow behavior should be tunable. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Review: psql-wrap-formatting/Unicode UTF-8 table formatting for psql text output
Overview: Patch to make data output that includes newlines or wrapped lines consistent with the headers for that data. Link: https://commitfest.postgresql.org/action/patch_view?id=220

Submission review:
* is in context diff
* applies cleanly to current HEAD
* includes its own test .sql file to verify new output
* includes appropriate doc patches

Usability review:
* the patch does indeed do what it says
* maintains backward compatibility, should the end-user desire it
* included tests perform as advertised

Coding review:
* looks good to me
* fits in with the surrounding code
* no compiler warnings, no crashes
* complete, understandable docs

Personally, I think this rocks. It really improves the readability of query output. gabrielle -- on behalf of the PDXPUG Patch Review Team (Dan, Mark, John, Brad, me) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Joachim Wieland j...@mcknight.de writes: The sequence in CommitTransaction() is like this: 1) add notifications to queue 2) commit to clog 3) signal backends. Only those backends that listen to at least one channel are signalled; if the notify system isn't in use, then nobody will ever be signalled anyway. If a backend is reading a transaction id that has not yet committed, it will not deliver the notification. But you were saying that this patch would enable sending more data than would fit in the queue. How will that happen if the other backends don't look at the queue until you signal them? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Not safe to send CSV data message
Andrew Dunstan and...@dunslane.net writes: So the logger there has been doing CSV logging for quite a while without memory ballooning. I was able to generate a noticeable leak by cranking log_rotation_size way down ... it's about 1K per size rotation event. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Josh Berkus j...@agliodbs.com writes: (4) drop *old* notifications if the queue is full. Since everyone has made the point that LISTEN is not meant to be a full queueing system, I have no problem dropping notifications LRU-style. NO, NO, NO, a thousand times no! That turns NOTIFY into an unreliable signaling system, and if I haven't made this perfectly clear yet, any such change will be committed over my dead body. If we are unable to insert a new message into the queue, the correct recourse is to fail the transaction that is trying to insert the *new* message. Not to drop messages from already-committed transactions. Failing the current transaction still leaves things in a consistent state, ie, you don't get messages from aborted transactions but that's okay because they didn't change the database state. I think Greg has a legitimate concern about whether this redesign reduces the usefulness of NOTIFY for existing use-cases, though. Formerly, since pg_listener would effectively coalesce notifies across multiple sending transactions instead of only one, it was impossible to overflow the queue, unless maybe you managed to bloat pg_listener to the point of being out of disk space, and even that was pretty hard. There will now be a nonzero chance of transactions failing at commit because of queue full. If the chance is large this will be an issue. (Is it sane to wait for the queue to be drained?) BTW, did we discuss the issue of 2PC transactions versus notify? The current behavior of 2PC with notify is pretty cheesy and will become more so if we make this change --- you aren't really guaranteed that the notify will happen, even though the prepared transaction did commit. I think it might be better to disallow NOTIFY inside a prepared xact. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Andrew Chernow a...@esilo.com writes: I mentioned this up thread. I completely agree that overflow behavior should be tunable. There is only one correct overflow behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ProcessUtility_hook
Itagaki Takahiro wrote: Here is a patch to add ProcessUtility_hook to handle all DDL commands in user modules. (ProcessUtility_hook_20091021.patch) It adds a global variable 'ProcessUtility_hook' and exports the original function as 'standard_ProcessUtility'.

I've reviewed your patch. It applies cleanly and compiles without warnings. It lacks documentation. Could you update it? The functionality is divided into two parts. The first part is a hook in the utility module. The idea is to capture the commands that don't pass through the executor. I'm afraid that hook will be used only for capturing non-DML queries. If so, why don't we hack tcop/postgres.c and grab those queries from the same point we log statements? This feature is similar to the executor one. But in the executor case, we use the plan for other things. Other uses are (i) using that hook to filter out some non-DML commands and (ii) developing some non-DML replication mechanism for trigger-based replication solutions. The second part is to use that hook to capture non-DML commands for the pg_stat_statements module. Do we need to have rows = 0 for non-DML commands? Maybe NULL would be an appropriate value. The PREPARE command stopped counting the number of rows. Should we count the rows in the EXECUTE command or in the PREPARE command? The other command that doesn't count properly is COPY. Could you fix that? I'm concerned about storing some commands that expose passwords, like CREATE ROLE foo PASSWORD 'secret'; I know that the queries are only shown to superusers, but maybe we should add this information to the documentation or provide a mechanism to exclude those commands. I don't know if it is worth the trouble to add some code to capture VACUUM and ANALYZE commands called inside autovacuum. -- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
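The hook being reviewed follows PostgreSQL's usual function-pointer convention: a plugin saves the previous hook value, installs its own function, and delegates to the standard implementation when the chain ends. A minimal stand-alone sketch of that pattern (the string statement type and counters are simplified stand-ins, not the real API):

```c
#include <assert.h>
#include <stddef.h>

/* Generic hook-chaining pattern, as used by hooks such as
 * ProcessUtility_hook.  Everything here is an illustrative stand-in. */
typedef void (*utility_hook_type)(const char *stmt);

static utility_hook_type ProcessUtility_hook = NULL;
static int standard_calls = 0;
static int plugin_calls = 0;

/* The core implementation, always available to plugins. */
static void standard_ProcessUtility(const char *stmt)
{
    (void) stmt;
    standard_calls++;
}

/* Dispatcher: call the hook if one is installed, else the standard code. */
static void ProcessUtility(const char *stmt)
{
    if (ProcessUtility_hook)
        ProcessUtility_hook(stmt);
    else
        standard_ProcessUtility(stmt);
}

/* A plugin intercepts the statement (e.g. to record stats or filter
 * DDL), then delegates down the chain. */
static utility_hook_type prev_hook = NULL;

static void my_plugin_hook(const char *stmt)
{
    plugin_calls++;
    if (prev_hook)
        prev_hook(stmt);
    else
        standard_ProcessUtility(stmt);
}
```

Because each plugin saves and re-invokes the previous hook, several modules (say, a statement logger and pg_stat_statements) can stack on the same hook point without knowing about each other.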
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Tom Lane wrote: Andrew Chernow a...@esilo.com writes: I mentioned this up thread. I completely agree that overflow behavior should be tunable. There is only one correct overflow behavior. I count three. 1. wait 2. error 3. skip #1 and #2 are very similar to a file system. If FS buffers are full on write, it makes you wait. In non-blocking mode, it throws an EAGAIN error. IMHO those two behaviors are totally acceptable for handling notify overflow. #3 is pretty weak but I *think* there are uses for it. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Not safe to send CSV data message
Andrew Dunstan and...@dunslane.net writes: Tom Lane wrote: And the fact that it comes out at all suggests that the csvlog startup logic is rather broken. Comments? Not sure why you say that. This can only happen very early in the startup process before the postmaster has had a chance to finish setting up the syslogger process and dup the pipes. As soon as that happens redirection_done is set to true and this message is no longer possible. Well, in that case the code is operating as designed and the bleating is simply inappropriate. What I was wondering was whether we should try to launch the syslogger before we do process_shared_preload_libraries(). But now that I think about it, I think that ordering was intentional on the grounds that some types of monitoring plugins might want to be live in all postmaster children including the syslogger. In any case there will certainly always be *some* postmaster messages that could be emitted after setting the log_destination GUC and before launching the syslogger child. If the designed behavior is that we dump to stderr during that interval, we should just do it, without the useless and confusing bleating. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Andrew Chernow a...@esilo.com writes: Tom Lane wrote: There is only one correct overflow behavior. I count three. Waiting till you can insert is reasonable (especially if we have some behavior that nudges other backends to empty the queue). If by skip you mean losing the notify but still committing, that's incorrect. There is no room for debate about that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Not safe to send CSV data message
Tom Lane wrote: In any case there will certainly always be *some* postmaster messages that could be emitted after setting the log_destination GUC and before launching the syslogger child. If the designed behavior is that we dump to stderr during that interval, we should just do it, without the useless and confusing bleating. I'm fine with that. I don't remember whether I put that in or whether it came from the original patch author(s). Either way, I assume the reason was to explain why the message appeared on stderr rather than the CSVlog. Now that we have a couple of years of experience with CSVlog, I agree it's not needed (if it were, we'd probably have had more complaints like yours anyway). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Tom Lane wrote: Andrew Chernow a...@esilo.com writes: Tom Lane wrote: There is only one correct overflow behavior. I count three. Waiting till you can insert is reasonable (especially if we have some behavior that nudges other backends to empty the queue). If by skip you mean losing the notify but still committing, that's incorrect. There is no room for debate about that. Yeah, like I said, skip felt weak. In regards to waiting, what would happen if other backends couldn't help empty the queue because they too are clogged? ISTM that any attempt to flush to other non-disk queues is doomed to possible overflows as well. Then what? Personally, I would just wait until room became available or the transaction was canceled. We could get fancy and tack a timeout value onto the wait. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Listen / Notify - what to do when the queue is full
Andrew Chernow a...@esilo.com writes: Personally, I would just wait until room became available or the transaction was canceled. Works for me, as long as there's a CHECK_FOR_INTERRUPTS in there to allow a cancel to happen. The current patch seems to have a lot of pointless logging and no CHECK_FOR_INTERRUPTS ;-) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
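The behavior Tom asks for — block until there is room, but stay cancellable — boils down to a loop of the following shape. This is a toy single-threaded sketch with stand-in names (queue_has_room, check_for_interrupts); the "simulate a reader draining a slot" line stands in for the signalling that real listening backends would do:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of "wait until there is room, but stay cancellable".
 * All of these globals are stand-ins for the real machinery. */
static int queue_free_slots = 0;
static bool cancel_requested = false;
static int interrupt_checks = 0;

static bool queue_has_room(void)
{
    return queue_free_slots > 0;
}

/* Stand-in for CHECK_FOR_INTERRUPTS: reports whether a cancel
 * arrived while we were waiting. */
static bool check_for_interrupts(void)
{
    interrupt_checks++;
    return cancel_requested;
}

/* Returns 0 once a slot is available, -1 if the wait was cancelled. */
static int wait_for_queue_space(void)
{
    while (!queue_has_room()) {
        if (check_for_interrupts())
            return -1;              /* transaction cancel wins */
        queue_free_slots++;         /* simulate a reader draining a slot;
                                     * the real code would sleep here */
    }
    return 0;
}
```

The essential point is that the interrupt check sits inside the wait loop, so a stuck queue can never make the sending backend uncancellable.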
Re: [HACKERS] Timezones (in 8.5?)
Kevin == Kevin Grittner kevin.gritt...@wicourts.gov writes: If he meant (A), then you store the event as: (ts,tz) = (timestamp '2010-07-27 10:30:00', 'Chile/Santiago') If he meant (B), then you store the event as (tsz,tz) = (timestamp '2010-07-27 10:30:00' at time zone 'Chile/Santiago', 'Chile/Santiago') Kevin You seem to be agreeing that these problems can't be solved Kevin without storing a time zone string in addition to the Kevin timestamp. As I read it, Hernán was wishing for types which Kevin include this, rather than having to do the above dance with Kevin multiple values. Right, but including more data in a single type is the wrong approach, since it complicates the semantics and interferes with normalization. For example, if you have a type T which incorporates a timestamp and a timezone, what semantics does the T = T operator have? What semantics apply if the definitions of timezones change? What if you're storing times of events at specific places; in that case you want to associate the timezone with the _place_ not the event (so that if the timezone rules change, moving the place from one timezone to another, you only have to change the place, not all the events that refer to it). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
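Andrew's normalization argument can be made concrete with two toy records: the timezone lives on the place, and events reference the place, so a zone-rule change touches one row instead of every event. All names and values below are illustrative only:

```c
#include <assert.h>
#include <string.h>

/* Toy normalized schema: the timezone is a property of the place,
 * not of each event.  Illustrative stand-ins, not a real schema. */
typedef struct { const char *name; const char *tz; } Place;
typedef struct { const char *local_time; int place_id; } Event;

static Place places[] = {
    { "Santiago office", "Chile/Santiago" },
};

static Event events[] = {
    { "2010-07-27 10:30:00", 0 },
    { "2010-08-03 09:00:00", 0 },
};

/* Resolve an event's timezone through its place. */
static const char *event_tz(const Event *e)
{
    return places[e->place_id].tz;
}
```

If the timezone rules move the place into a different zone, updating the single Place row re-points every event at once, which is exactly the property a fused (timestamp, zone) type would lose.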
Re: [HACKERS] Very bad FTS performance with the Polish config
ts_headline calls the ts_lexize equivalent to break up the text. Of course there is an algorithm to process the tokens and generate the headline. I would be really surprised if the algorithm to generate the headline is somehow dependent on language (as it only processes the tokens). So Oleg is right when he says ts_lexize is something to be checked. I will try to replicate what you are trying to do, but in the meantime can you run the same ts_headline under psql multiple times and paste the results. -Sushant.

2009/11/19 Wojciech Knapik webmas...@wolniartysci.pl Oleg Bartunov wrote: Yes, for 4-word texts the results are similar. Try that with a longer text and the difference becomes more and more significant. For the lorem ipsum text, 'polish' is about 4 times slower than 'english'. For 5 repetitions of the text, it's 6 times, for 10 repetitions - 7.5 times... Again, I see nothing unclear here, since dictionaries (as specified in the configuration) apply to ALL words in the document. The more words in the document, the more overhead. You're missing the point. I'm not surprised that the function takes more time for larger input texts - that's obvious. The thing is, the computation times rise more steeply when the Polish config is used. Steeply enough that the difference between the Polish and English configs becomes enormous in practical cases. Now this may be expected behaviour, but since I don't know if it is, I posted to the mailing lists to find out. If you're saying this is ok and there's nothing to fix here, then there's nothing more to discuss and we may consider the thread closed. If not, ts_headline deserves a closer look. cheers, Wojciech Knapik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Not safe to send CSV data message
Tom Lane wrote: So I went to investigate bug #5196: turned on log_destination = csvlog etc, and restarted the postmaster. I got this on stderr: 2009-11-18 20:08:52.104 EST : : LOG: loaded library passwordcheck Not safe to send CSV data The first line is a consequence of having still got shared_preload_libraries set from testing the new contrib/passwordcheck module. However, what I'm on the warpath about right now is the second line. That message is useless, not to mention not conformant to our style guidelines. (donning armor) Possibly. And the fact that it comes out at all suggests that the csvlog startup logic is rather broken. Comments?

Not sure why you say that. This can only happen very early in the startup process before the postmaster has had a chance to finish setting up the syslogger process and dup the pipes. As soon as that happens redirection_done is set to true and this message is no longer possible. My memory is fading a bit about this now, but ISTR we debated the logic some when I was working to get this committed around 2 1/2 years ago. It's quite possible we (or I) got it wrong, but I'm not sure what we should be doing if the logger isn't ready yet.

BTW, re that bug, I recently turned on CSV logging under 8.3 for a client running Suse 10.2 64bit, with fairly heavy usage. Here is the current state of the non-client processes:

postgres  7080  0.0   0.4 12913224   267216 ?  S   Oct28   2:58 /usr/bin/postmaster -D /pg_database/pg_data1/pg
postgres  7087  0.1   0.0    99824    60696 ?  Ss  Oct28  30:32 postgres: logger process
postgres  7092  0.0  16.3 12919564 10791472 ?  Ss  Oct28   7:41 postgres: writer process
postgres  7093  0.0   0.0 12919256    10432 ?  Ss  Oct28   0:29 postgres: wal writer process
postgres  7094  0.0   0.0 12920892     3544 ?  Ss  Oct28   0:00 postgres: autovacuum launcher process
postgres  7095  0.0   0.0    34396     1904 ?  Ss  Oct28   0:05 postgres: archiver process   last was 0001020500A3
postgres  7096  0.0   0.0    36156     2916 ?  Ss  Oct28  10:26 postgres: stats collector process

So the logger there has been doing CSV logging for quite a while without memory ballooning. (The writer process, on the other hand, has eaten up a boatload of memory, by the look of it) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xpath_table equivalent
Andrew Dunstan wrote: I've been reading over the documentation to find an alternative to the deprecated xpath_table functionality. I think it may be a possibility but I'm not seeing a clear alternative. Thanks, Chris Graner The standard is XMLTABLE and is implemented by both DB2 and Oracle but is on our list of unimplemented features. I would love to see this implemented in Postgres. I recall it coming up here before. But I don't think it went beyond discussing which xquery library we could use. Yes, Chris spoke to me about this last night and emailed me an example of what he needs today, and I've spent a couple of hours thinking about it. Not having a nice way of getting a recordset out of a piece of XML is actually quite a gap in our API. The trouble is that XMLTABLE is a horrible grammatical mess, ISTM, and I don't much like the way xpath_table() works either. Passing a table name as text into a function is rather ugly. I think we could do with a much simpler, albeit non-standard, API. Something like: xpathtable(source xml, rootnodes text, leaves variadic text[]) returns setof record But unless I'm mistaken we'd need the proposed LATERAL extension to make it iterate nicely over a table. Then we could possibly do something like: select x.bar, x.blurfl from foo f, lateral xpathtable(f.xmlfield,'//foo','bar','baz[1]/blurfl','@is-some-property') as x(bar int, blurfl text, xmlprop bool) where f.otherfield or x.xmlprop; cheers andrew I agree that the syntax of XMLTABLE is odd. But not demonstrably worse than xpathtable. If we are going to exert effort on it, why not do it in a standards-compliant way? Otherwise I'd suggest a stopgap of just adding some support functions to make it easier to extract a scalar value from a node. Something like what I did here. http://scottrbailey.wordpress.com/2009/06/19/xml-parsing-postgres/ The nice thing about XMLTABLE is that it adds xquery support. I think the majority of xquery engines seem to be written in Java. XQilla is C++.
I'm not sure if our licensing is compatible, but I would love the irony of using Berkeley DB XML (formerly Sleepycat) now that it's owned by Oracle. Scott -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Summary and Plan for Hot Standby
On Wed, 2009-11-18 at 14:51 +0200, Heikki Linnakangas wrote: Tatsuo Ishii wrote: Please correct me if I'm wrong. Parse will result in obtaining RowExclusiveLock on the target table if it is parsing INSERT/UPDATE/DELETE. If so, is this ok in the standby? Any attempt to take RowExclusiveLock will fail. Any attempt to execute INSERT/UPDATE/DELETE will fail. This behaviour should be identical to read only transaction mode. If it is not documented as an exception, please report as a bug. Is it? It seems read only transaction mode is perfectly happy with RowExclusiveLock: Hmm, that's a good point. I can't immediately see that that would cause any trouble, but it gives me an uncomfortable feeling about the locking. Which locks exactly need to be replayed in standby, and why? Which locks can read-only transactions acquire? The doc says:

+ In recovery, transactions will not be permitted to take any table lock
+ higher than AccessShareLock. In addition, transactions may never assign
+ a TransactionId and may never write WAL.
+ Any LOCK TABLE command that runs on the standby and requests a specific
+ lock type other than AccessShareLock will be rejected.

which seems wrong, given Tatsuo-san's example. Is that paragraph only referring to LOCK TABLE, and not other means of acquiring locks? Either way, it needs to be clarified or fixed. access/transam/README says:

+Further details on locking mechanics in recovery are given in comments
+with the Lock rmgr code.

but there's no explanation there either of *why* the locking works as it does. In LockAcquire(), we forbid taking locks higher than AccessShareLock during recovery mode, but only for LOCKTAG_OBJECT locks. Why? Recovery does *not* take the same locks as the original statements on the master took. For example, the WAL record for an INSERT just makes its changes without acquiring locks. This is OK as long as we only allow read-only users to acquire AccessShareLocks.
If we allowed higher locks we might need to do deadlock detection, which would add more complexity. The above restrictions are limited to LOCKTAG_OBJECT so that advisory locks work as advertised. So advisory locks can take both shared and exclusive locks. This never conflicts with recovery because advisory locks are not WAL logged. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
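The reason replaying an INSERT without locks is safe for standby readers can be seen from a heavily simplified slice of the table-lock conflict matrix: AccessShareLock (a plain SELECT) conflicts only with AccessExclusiveLock, so read-only queries never conflict with ordinary DML replay, only with replayed DDL. A toy check over just three of the real lock modes:

```c
#include <assert.h>
#include <stdbool.h>

/* Heavily simplified slice of PostgreSQL's table-lock conflict
 * matrix: three of the eight modes.  AccessShareLock (plain SELECT)
 * conflicts only with AccessExclusiveLock (e.g. DROP/ALTER). */
typedef enum {
    ACCESS_SHARE,
    ROW_EXCLUSIVE,      /* taken by INSERT/UPDATE/DELETE */
    ACCESS_EXCLUSIVE,   /* taken by DROP TABLE, some ALTERs */
    N_MODES
} LockMode;

static const bool lock_conflicts[N_MODES][N_MODES] = {
    /*                     AS     RE     AE   */
    [ACCESS_SHARE]     = { false, false, true },
    [ROW_EXCLUSIVE]    = { false, false, true },
    [ACCESS_EXCLUSIVE] = { true,  true,  true },
};

static bool conflicts(LockMode a, LockMode b)
{
    return lock_conflicts[a][b];
}
```

This is why the standby can restrict read-only sessions to AccessShareLock and still replay DML without lock acquisition: the only replayed locks that matter are the AccessExclusiveLocks for DDL, which do conflict and must be handled explicitly.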
Re: [HACKERS] Rejecting weak passwords
Tom Lane wrote: Applied with some minor modifications. Aside from the added valuntil parameter, I changed the isencrypted parameter to an int with some #define'd values. It seems easily foreseeable that we'll replace the MD5 encryption scheme someday, and it'd be good to ensure that this API is extendable when that happens. Also, I got rid of the bool return value and made the hook responsible for throwing its own errors. I don't know about you guys, but I would cheerfully kill anybody who tried to make me use a password checker that didn't tell me anything about why it thinks my password is too weak. (The CrackLib API we are using is lamentably badly designed on this score --- does it have another call that provides a more useful error report?) Even if you think weak password is adequate for that class of complaints, the single error message would certainly not do for complaints about the valuntil date being too far away. Thank you. I agree on all points. I did not know that contrib modules get translated too, else I would have thrown the error messages there. Yours, Laurenz Albe -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers