[HACKERS] array_to_json re-encodes ARRAY of json type
If we pass an ARRAY of json values to the array_to_json() function, the function seems to re-encode the JSON text. But shouldn't the following examples produce the same result? I'm not sure why we don't have a special case for the json type in datum_to_json() -- do we need to pass json values through unchanged?

=# \x
=# SELECT '[A]'::json, array_to_json(ARRAY['A']), array_to_json(ARRAY['A'::json]);
-[ RECORD 1 ]-+--
json          | [A]
array_to_json | [A]
array_to_json | [\A\]

--
Itagaki Takahiro

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
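The double-encoding effect described above can be reproduced outside the server. A rough Python analogue of what happens when already-encoded JSON is treated as a plain string (illustrative only, not PostgreSQL's code):

```python
import json

raw = '["A"]'  # text that is already valid JSON

# Serializing it again as if it were an ordinary string re-escapes the
# quotes -- the analogue of array_to_json() re-encoding a json datum:
reencoded = json.dumps([raw])      # '["[\\"A\\"]"]'

# A pass-through special case would instead splice the text unchanged:
passthrough = "[" + raw + "]"      # '[["A"]]'
```

The two results differ only in whether the inner value is treated as opaque text or as JSON that is already encoded.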
Re: [HACKERS] dblink: enable parameters
On Tue, Nov 22, 2011 at 20:09, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> It would be nice to add support for SQL/MED passthrough mode... That
> would allow you to do any queries/updates to foreign servers. It
> doesn't sound very difficult at first glance, though I'm not sure what
> it would mean for our parser, for example.

I think passthrough mode is nearly impossible, or at best of very limited use, because we cannot pass along query text that our parser doesn't accept; we cannot support foreign-table-specific queries.

On 22.11.2011 06:55, Pavel Stehule wrote:
> I know that dblink is a deprecated interface - but it still has
> necessary functionality - it supports writable statements.

So, dblink over an FDW connection seems to be a possible solution. We pass the query as plain text.

--
Itagaki Takahiro
Re: [HACKERS] Dash in Extension Names
On Fri, Nov 11, 2011 at 14:40, David E. Wheeler da...@kineticode.com wrote:
> one might use - in the name itself, but probably not -- -- it seems
> that the dash is not actually allowed:
>
>     create extension pgtap-core;
>     ERROR:  syntax error at or near "-"
>     LINE 1: create extension pgtap-core;
>
> Parser error?

You need double-quotes around the name:

=# CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION

--
Itagaki Takahiro
Re: [HACKERS] Support UTF-8 files with BOM in COPY FROM
On Mon, Sep 26, 2011 at 20:12, Magnus Hagander mag...@hagander.net wrote:
> I like it in general. But if we're looking at the BOM, shouldn't we
> also look at and *reject* the file if it has a BOM for a non-UTF8
> encoding? Say, if the BOM claims it's UTF16?

-1, because we depend on manual configuration for now. Rejecting would be reasonable if we used automatic detection of character encoding, but we don't. In addition, some crazy encoding might use the BOM bytes as a valid character.

--
Itagaki Takahiro
[HACKERS] Support UTF-8 files with BOM in COPY FROM
Hi, I'd like to support UTF-8 text or CSV files that have a BOM (byte order mark) in the COPY FROM command. The BOM will be automatically detected and ignored if the file encoding is UTF-8. A WIP patch is attached. I'm thinking about only COPY FROM for reads, but if someone wants to add a BOM in COPY TO, we might also support COPY TO WITH BOM for writes. Comments welcome.

--
Itagaki Takahiro

copy_from_bom.patch
Description: Binary data
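For reference, the intended behavior -- detect and skip a UTF-8 BOM at the start of the input -- can be sketched as follows (the function name is mine, not from the patch):

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF), if present, from raw input."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Only the three-byte UTF-8 mark is handled here; rejecting UTF-16/32 BOMs would need additional prefix checks.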
Re: [HACKERS] Adding Japanese README
On Thu, Jun 30, 2011 at 09:42, Tatsuo Ishii is...@postgresql.org wrote:
> BTW, I will talk to some Japanese-speaking developers about my idea,
> if the community agrees to add a Japanese README to the source tree,
> so that I am not the only one contributing to this project.
>
> Now, if someone wanted to set up a web site or Wiki page with
> translations, that would be up to them.

IMHO, the Wiki approach seems more reasonable than a README file. It would be suitable for adding non-Japanese translations, and non-core developers could join in translating or fixing the docs. If we promote those README files to documentation, we could move them into the internals section of the SGML doc tree. If so, the translated READMEs would become part of the doc translation project.

--
Itagaki Takahiro
Re: [HACKERS] pgbench--new transaction type
On Mon, Jun 20, 2011 at 07:30, Greg Smith g...@2ndquadrant.com wrote:
> I applied Jeff's patch but changed this to address concerns about the
> program getting stuck running for too long in the function:
> #define plpgsql_loops 512

Is it OK to define the value as a constant? Also, it executes many more queries if the -t option (transactions) is specified; of course it runs the specified number of transactions, but it actually runs plpgsql_loops times as many queries as the other modes.

> I think this is a really nice new workload to demonstrate. One of the
> things we tell people is that code works much faster when moved
> server-side,

What is the most important part of the changes? The proposal includes three improvements, which might blur the most valuable tuning point:
#1 Execute multiple queries in one transaction.
#2 Run multiple queries in the server with a stored procedure.
#3 Return only one value instead of 512.

Anyway, I'm not sure we need to include this query mode in pgbench's code. Instead, how about providing a sample script as a separate SQL file? pgbench can execute any script file with the -f option.

--
Itagaki Takahiro
Re: [HACKERS] possible connection leak in dblink?
On Wed, Jun 15, 2011 at 11:41, Fujii Masao masao.fu...@gmail.com wrote:
> ISTM that the root problem is that dblink_send_query calls
> DBLINK_GET_CONN though it doesn't accept the connection string as an
> argument.

+1 for the fix; I came to the same conclusion at first glance. Since the first argument of dblink_send_query must be the connection name, dblink_send_query should call DBLINK_GET_NAMED_CONN instead. The variable 'freeconn' is used only when DBLINK_GET_CONN is called, so if dblink_send_query uses DBLINK_GET_NAMED_CONN, 'freeconn' is no longer necessary. A similar problem exists in dblink_get_result and dblink_record_internal. The attached patch fixes those problems.

--
Itagaki Takahiro
Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)
On Tue, Jun 14, 2011 at 09:27, Jeff Janes jeff.ja...@gmail.com wrote:
> pgbench sends each query (per connection) and waits for the reply
> before sending another.

We can use the -j option to run pgbench in multiple threads to avoid request starvation. What setting did you use, Stefan?

> for those curious - the profile for pgbench looks like:
>
> samples  %        symbol name
> 29378    41.9087  doCustom
> 17502    24.9672  threadRun
> 7629     10.8830  pg_strcasecmp

If pgbench itself is the bottleneck, it would be better to reduce the pg_strcasecmp calls by resolving meta-command names to integer sub-command codes once, instead of doing a string comparison on every loop iteration.

--
Itagaki Takahiro
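The suggested optimization -- resolve each meta-command name to an integer code once when the script is parsed, so the hot loop compares integers instead of calling pg_strcasecmp -- might look like this in outline (names are illustrative, not pgbench's actual code):

```python
from enum import Enum, auto

class MetaCommand(Enum):
    SET = auto()
    SLEEP = auto()
    SHELL = auto()

_BY_NAME = {m.name.lower(): m for m in MetaCommand}

def parse_meta_command(word: str) -> MetaCommand:
    # Done once, at script-parse time; case-insensitive like pg_strcasecmp.
    return _BY_NAME[word.lower()]

def run_command(cmd: MetaCommand) -> str:
    # The per-transaction loop dispatches on the pre-resolved code,
    # with no string comparison at all.
    if cmd is MetaCommand.SLEEP:
        return "sleeping"
    return "other"
```

The lookup cost is paid once per script line instead of once per executed transaction.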
Re: pgbench cpu overhead (was Re: [HACKERS] lazy vxid locks, v1)
On Tue, Jun 14, 2011 at 13:09, Alvaro Herrera alvhe...@commandprompt.com wrote:
> I noticed that pgbench's doCustom (the function highest in the profile
> posted) returns doing nothing if the connection is supposed to be
> sleeping; seems an open door for busy waiting.

pgbench uses select() with/without a timeout in those cases, no?

--
Itagaki Takahiro
Re: [HACKERS] heap vacuum cleanup locks
On Sun, Jun 5, 2011 at 12:03, Robert Haas robertmh...@gmail.com wrote:
> If other buffer pins do exist, then we can't defragment the page, but
> that doesn't mean no useful work can be done: we can still mark used
> line pointers dead, or dead line pointers unused.

We cannot defragment, but that can be done either by the next VACUUM or by a HOT cleanup. This is just an idea -- would a copy-on-write technique be possible? VACUUM would allocate a duplicate of the pinned page and copy the valid tuples into the new page. Buffer readers arriving after the VACUUM would see the cloned page instead of the old pinned one. Of course, copy-on-write is more complex than skipping pinned pages, but I worry that with the skipping method we cannot vacuum at all in some edge cases.

--
Itagaki Takahiro
Re: [HACKERS] Lock problem with autovacuum truncating heap
On Sun, Mar 27, 2011 at 01:12, Simon Riggs si...@2ndquadrant.com wrote:
> At the same time I would change count_nondeletable_pages() so that it
> uses a forward scan direction (if that leads to a speedup).

+1.

> Do we need that? Linux readahead works in both directions, doesn't it?
> Guess it wouldn't hurt too much.

Yes, probably. AFAIK, RHEL 5 cannot do readahead for backward scans. It might be improved in the latest kernels, but it would be safe not to rely on kernels for anything except simple forward scans.

--
Itagaki Takahiro
[HACKERS] KEEPONLYALNUM for pg_trgm is not documented
contrib/pg_trgm in 9.1 becomes a more attractive feature thanks to index support for LIKE operators, but only alphabetic and numeric characters are indexed by default. We can modify KEEPONLYALNUM in the source code to keep all characters in the n-gram words. However, neither this limitation nor KEEPONLYALNUM is documented on the page:
http://developer.postgresql.org/pgdocs/postgres/pgtrgm.html

Would additional documentation patches be acceptable? The issue is likely to be a FAQ for non-English users. I've heard that pg_trgm will be one of the *killer features* of 9.1 in Japan, where N-gram based text search is preferred.

--
Itagaki Takahiro
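To illustrate what KEEPONLYALNUM changes, here is a rough sketch of trigram extraction. It approximates the C-locale alnum test with an ASCII check and ignores pg_trgm's exact padding rules, so treat it as an illustration only:

```python
def trigrams(text: str, keep_only_alnum: bool = True) -> set:
    """Extract 3-grams per word, blank-padded in the pg_trgm style."""
    grams = set()
    for word in text.split():
        if keep_only_alnum:
            # Default behavior: strip everything but ASCII letters/digits,
            # so punctuation and other characters never reach the index.
            word = "".join(c for c in word if c.isascii() and c.isalnum())
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams
```

With the default setting, a trigram like "b-c" from "ab-cd" is never produced; with the flag disabled, every character participates in the n-grams.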
[HACKERS] Header comments in the recently added files
I found trivial mistakes in the recently added files. Will they be fixed by some automated batch job, or manually?

- "Copyright (c) xxx-*2010*, PostgreSQL Global Development Group" in pg_collation.h, pg_foreign_table.h, basebackup.h, syncrep.h, pg_backup_directory.c, and auth_delay.c.
- "IDENTIFICATION $PostgreSQL$" in pg_collation.h, syncrep.h, and syncrep.c. Other files have their actual paths in the same place.

--
Itagaki Takahiro
Re: [HACKERS] Header comments in the recently added files
On Thu, Mar 10, 2011 at 12:55, Robert Haas robertmh...@gmail.com wrote:
> On Wed, Mar 9, 2011 at 8:33 PM, Itagaki Takahiro itagaki.takah...@gmail.com wrote:
>> I found trivial mistakes in the recently added files. Will they be
>> fixed by some automated batch job, or manually?
>
> I think these should be fixed manually. The first might eventually get
> fixed automatically, but perhaps not until 2012, and I'm not sure the
> second would ever get fixed automatically.

OK, I'll do that.

--
Itagaki Takahiro
[HACKERS] aggregate version of first_value function?
We have the window function version of first_value(), but an aggregate version looks useful for writing queries like:

=# CREATE TABLE obj (id integer, pos point);
=# SELECT X.id, first_value(Y.id ORDER BY X.pos <-> Y.pos) AS neighbor
   FROM obj X, obj Y GROUP BY X.id;

Is this reasonable? Or do we already have alternative ways to achieve the same thing?

--
Itagaki Takahiro
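The intent of the query -- for each row, pick the first value of another column under some ordering, here the nearest neighbor by distance -- can be mimicked procedurally. A sketch with made-up data (self-pairs are excluded here, which the SQL example would need a WHERE X.id <> Y.id clause to do):

```python
import math

# Hypothetical rows of the obj table: (id, (x, y)).
objs = [(1, (0.0, 0.0)), (2, (1.0, 0.0)), (3, (5.0, 5.0))]

def nearest_neighbors(rows):
    """For each id, the id whose position sorts first by distance --
    what a first_value(... ORDER BY distance) aggregate would pick
    per group."""
    out = {}
    for xid, xpos in rows:
        best_id, _ = min(
            ((yid, ypos) for yid, ypos in rows if yid != xid),
            key=lambda r: math.dist(xpos, r[1]),
        )
        out[xid] = best_id
    return out
```

Each group (one per X.id) is reduced to a single value chosen by the ordering, which is exactly what the proposed aggregate would express in SQL.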
Re: [HACKERS] UNLOGGED tables in psql \d
On Tue, Feb 22, 2011 at 18:11, Cédric Villemain cedric.villemain.deb...@gmail.com wrote:
> 2011/2/22 Itagaki Takahiro itagaki.takah...@gmail.com:
>> psql \d(+) doesn't show any information about the UNLOGGED and TEMP
>> attributes of a table, so we cannot tell whether a table is unlogged
>> unless we select directly from pg_class.relpersistence. Is this a
>> TODO item?
>
> I believe it is in the title of the table presented by \d (Table,
> Unlogged table, Temp table)

Ah, I see. Thank you. They are shown as "Unlogged Table" and "Unlogged Index".
- We don't have "Temp" for temp tables. Should we?
- The second word should start with a lowercase letter; we describe other objects as "Composite type", "Foreign table", and so on.

--
Itagaki Takahiro
[HACKERS] psql tab-completion for CREATE UNLOGGED TABLE
Here is a patch to support CREATE UNLOGGED TABLE in psql tab-completion. It also fixes a bug where DROP was unexpectedly completed with TEMP and UNIQUE, and cleans up the code for DROP OWNED BY in drop_command_generator().

--
Itagaki Takahiro

psql_tab_completion_for_unlogged-20110223.patch
Description: Binary data
Re: [HACKERS] validating foreign tables
On Tue, Feb 22, 2011 at 10:12, Andrew Dunstan and...@dunslane.net wrote:
>>> The API for FDW validators doesn't appear to have any way that the
>>> validator function can check that the defined foreign table shape
>>> matches the FDW options sanely.
>>
>> Huh? The options ought to be orthogonal to the table column info. If
>> they're not, maybe you need to rethink your option definitions.
>
> Well, let's take a couple of cases. 1. My old favorite, file as a text
> array. 2. A hypothetical RSS feed, where the options specify which RSS
> fields we want.

I think we need to overhaul the validators in 9.2, listening to FDW developers' opinions, anyway. The text array is one example, but there should be many other requirements. Personally, I'd like to have a method to list the available options from SQL. We should also consider column-level options for foreign tables then.

--
Itagaki Takahiro
[HACKERS] UNLOGGED tables in psql \d
psql \d(+) doesn't show any information about the UNLOGGED and TEMP attributes of a table, so we cannot tell whether a table is unlogged unless we select directly from pg_class.relpersistence. Is this a TODO item? The same issue exists for TEMP tables, but we can determine those by their schema; they are always created in a pg_temp_N schema.

--
Itagaki Takahiro
[HACKERS] contrib/unaccent regression test failed
I found that "make installcheck" for contrib/unaccent in master fails on my machine; the regression.diffs is attached. I'm not sure of the reason, because the BuildFarm machines are all OK, but we might need some checks in addition to the database encoding.

* OS: Linux 2.6.35.10-74.fc14.x86_64 #1 SMP Thu Dec 23 16:04:50 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
* PG DB:
         Name        |  Owner   | Encoding | Collate | Ctype | Access privileges
 --------------------+----------+----------+---------+-------+-------------------
  contrib_regression | postgres | UTF8     | C       | C     |

--
Itagaki Takahiro

regression.diffs
Description: Binary data
Re: [HACKERS] COPY ENCODING revisited
On Fri, Feb 18, 2011 at 20:12, Itagaki Takahiro itagaki.takah...@gmail.com wrote:
> + extern char *pg_any_to_server(const char *s, int len, int encoding);
> + extern char *pg_server_to_any(const char *s, int len, int encoding);

I applied the version with additional code for file_fdw.

--
Itagaki Takahiro
Re: [HACKERS] COPY ENCODING revisited
On Fri, Feb 18, 2011 at 03:57, Robert Haas robertmh...@gmail.com wrote:
> On Wed, Feb 16, 2011 at 10:45 PM, Itagaki Takahiro
> I am not qualified to fully review this patch because I'm not all that
> familiar with the encoding stuff, but it looks reasonably sensible on
> a quick read-through. I am supportive of making a change in this area
> even at this late date, because it seems to me that if we're not going
> to change this then we're pretty much giving up on having a usable
> file_fdw in 9.1. And since postgresql_fdw isn't in very good shape
> either, that would mean we may as well give up on SQL/MED. We might
> have to do that anyway, but I don't think we should do it just because
> of this issue, if there's a reasonable fix.

One design issue is the new function names:

  extern char *pg_client_to_server(const char *s, int len);
  extern char *pg_server_to_client(const char *s, int len);
+ extern char *pg_any_to_server(const char *s, int len, int encoding);
+ extern char *pg_server_to_any(const char *s, int len, int encoding);

They don't contain any words related to encoding or conversion. Ishii-san, do you have comments? I believe you designed this area. Please let me know if there are better alternatives.

--
Itagaki Takahiro
[HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE
UNLOGGED is defined in OptTemp in gram.y, so the parser accepts CREATE UNLOGGED VIEW and CREATE UNLOGGED SEQUENCE, but they are actually not supported.

=# CREATE UNLOGGED VIEW uv AS SELECT 1;
TRAP: FailedAssertion(!(relkind == 'r' || relkind == 't'), File: heap.c, Line: 1246)
=# CREATE UNLOGGED SEQUENCE us;
TRAP: FailedAssertion(!(relkind == 'r' || relkind == 't'), File: heap.c, Line: 1246)

The easiest fix would be to reject them at the parser level.

--
Itagaki Takahiro
Re: [HACKERS] COPY ENCODING revisited
On Fri, Feb 18, 2011 at 04:04, Hitoshi Harada umi.tan...@gmail.com wrote:
> FWIW, I finally found a good example of caching miscellaneous data at
> file scope, namely regexp.c. It allocates compiled regular expressions
> up to 32 by using malloc().

I'm not exactly sure about the cache usage in mbutils.c, because it has no routine for cache invalidation at all. Conversion procs are rarely replaced, but we probably ought not to keep cached functions indefinitely. The regexp cache is OK because the algorithm cannot be modified at runtime. We need only 4 or so entries for an encoding conversion cache, at which size a cache search doesn't seem like much overhead. We need to research what we should cache for conversion procs: we would need 4 bytes per conversion pair if we cache only OIDs, but sizeof(FmgrInfo) bytes if we use the same approach as the ToXXXConvProc cache.

--
Itagaki Takahiro
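The kind of small, bounded cache being discussed -- conversion routines keyed by encoding pair, with only a handful of live entries -- can be sketched in Python as a stand-in for the C proc cache (not the actual mbutils.c design):

```python
from functools import lru_cache
import codecs

@lru_cache(maxsize=4)          # only a few encoding pairs are hot at once
def get_converter(src: str, dst: str):
    """Resolve the conversion routine once per (src, dst) pair;
    later calls return the cached closure instead of redoing lookup."""
    decode = codecs.getdecoder(src)
    encode = codecs.getencoder(dst)

    def convert(data: bytes) -> bytes:
        text, _ = decode(data)
        out, _ = encode(text)
        return out

    return convert
```

The LRU bound plays the role of not keeping cached functions forever; invalidation when the underlying conversion definition changes is the part this sketch deliberately omits.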
Re: [HACKERS] Transaction-scope advisory locks
On Thu, Feb 17, 2011 at 17:05, Itagaki Takahiro itagaki.takah...@gmail.com wrote:
> I did a few cosmetic fixes, mainly in lmgr/README, and made a
> subroutine ReleaseLockForOwner() for LockReleaseSession and
> LockReleaseCurrentOwner.

Committed with a few typo fixes. Thanks, Marko!

--
Itagaki Takahiro
Re: [HACKERS] CommitFest 2011-01 as of 2011-02-04
On Fri, Feb 18, 2011 at 13:15, Shigeru HANADA han...@metrosystems.co.jp wrote:
> When I used COPY TO for testing file_fdw, I got a wrong result. It
> seems to be because DR_copy's "processed" field is not initialized in
> CreateCopyDestReceiver(). Please see the attached patch.

Oops, thanks. Applied.

--
Itagaki Takahiro
[HACKERS] COPY ENCODING revisited
The COPY ENCODING patch was returned with feedback,
https://commitfest.postgresql.org/action/patch_view?id=501
but we still need it for file_fdw. Using client_encoding at runtime is reasonable for a one-time COPY command, but logically nonsense for persistent file_fdw tables.

Based on the latest patch,
http://archives.postgresql.org/pgsql-hackers/2011-01/msg02903.php
I added pg_any_to_server() and pg_server_to_any() functions instead of exposing FmgrInfo in pg_wchar.h. They are the same as pg_client_to_server() and pg_server_to_client(), but accept any encoding. They use the cached conversion procs only if the specified encoding matches the client encoding. According to Harada-san's research,
http://archives.postgresql.org/pgsql-hackers/2011-01/msg02397.php
non-cached conversions are slower than cached ones. This version provides the same performance as before when the file and client encodings are the same, but would be a bit slower in other cases. We could improve the performance in future versions, for example by caching each used conversion proc in pg_do_encoding_conversion().

file_fdw will support an ENCODING option. Also, if it is not specified, file_fdw might have to store the client_encoding in effect at CREATE FOREIGN TABLE time: even if we use a different client_encoding at SELECT, the encoding at definition time is used.

The ENCODING 'quoted name' issue is also fixed; it now always requires quoted names. I think we should accept only quoted text here: unquoted text is normally treated as an identifier (as if double-quoted), but encoding names are not identifiers.

--
Itagaki Takahiro

copy_encoding-20110217.patch
Description: Binary data
Re: [HACKERS] CommitFest 2011-01 as of 2011-02-04
On Tue, Feb 15, 2011 at 01:27, Robert Haas robertmh...@gmail.com wrote:
> However, file_fdw is in pretty serious trouble because (1) the copy
> API patch that it depends on still isn't committed and (2) it's going
> to be utterly broken if we don't do something about the
> client_encoding vs. file_encoding problem; there was a patch to do
> that in this CF, but we gave up on it.

Will we include the copy API patch in 9.1 even if we won't have file_fdw? Personally, I want to include the APIs, because someone could then develop file_fdw as a third-party extension for 9.1 using that infrastructure. The extension would lack file-encoding support, but would still be useful in many cases.

--
Itagaki Takahiro
Re: [HACKERS] CommitFest 2011-01 as of 2011-02-04
On Wed, Feb 16, 2011 at 02:14, Andrew Dunstan and...@dunslane.net wrote:
> I've been kind of wondering why you haven't already committed it. If
> you're confident that the code is in good shape, I don't particularly
> see any benefit to holding off.
>
> +10. The sooner the better.

Thanks for the comments. I've applied the COPY API patch.

--
Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Tue, Feb 8, 2011 at 00:30, Shigeru HANADA han...@metrosystems.co.jp wrote:
> Fixed version is attached.

I reviewed your latest git version, which is a bit newer than the attached patch:
http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=commit;h=0e1a1e1b0e168cb3d8ff4d637747d0ba8f7b8d55

The code still works with small adjustments, but it needs to be rebased onto the latest master, especially for the extension support and the COPY API changes. Here is a list of comments and suggestions:

* You might have missed some option combinations, or unspecified options, in file_fdw_validator() -- for example, the format == NULL case, or header with a non-csv format. I've not tested all cases, but please recheck the validations used in BeginCopy().

* estimate_costs() needs its own row estimation rather than using baserel. What value does baserel->tuples have? Foreign tables are never analyzed for now; is the number correct? No -- baserel->tuples is always 0, and baserel->pages is 0, too; in addition, width and rows are set to the defaults of 100 and 1. That means baserel is not reliable at all, right? If so, we need an alternative solution in estimate_costs(). We adjust statistics with the runtime relation size in estimate_rel_size(), and use get_rel_data_width() for tables that have not been analyzed. We could use a similar technique in file_fdw, too.

* Should use int64 for the file byte size (or uint32 in blocks); unsigned long might be 32-bit, and ulong is not portable.

* An Oid list (list_make1_oid) might be handier than Const for holding the relid in FdwPlan.fdw_private.

* I prefer FileFdwExecutionState to FileFdwPrivate, because there are two different 'fdw_private' variables, in FdwPlan and FdwExecutionState.

* The comment in fileIterate seems wrong. It should be /* Store the next tuple as a virtual tuple. */, right?

* #include <sys/stat.h> is needed.

--
Itagaki Takahiro
Re: [HACKERS] Add support for logging the current role
On Mon, Feb 14, 2011 at 11:51, Stephen Frost sfr...@snowman.net wrote:
>> It would be awfully nice if we could inject something into the csvlog
>> output format that would let client programs know which fields are
>> present. This would be especially useful for people writing tools
>> that are intended to work on ANY PG installation, rather than just,
>> say, their own. I'm not sure if there's a clean way to do that,
>> though.
>
> Added csvlog_header GUC to allow the user to ask for the header to be
> printed at the top of each log file. If and when an option is added to
> PG's COPY to respect the header, this should resolve that issue.

We need to design csvlog_header more carefully. csvlog_header won't work if log_filename is low-resolution, e.g. one log per day. It is still useful when a DBA reads the file manually, but that limitation should be documented. Or should we skip writing the header when the log file being opened is not empty? That works unless csvlog_fields is modified before a restart, and also covers the logger's crash-and-restart case, though that is rare.

A few comments and trivial issues:

* It might be my misunderstanding, but wasn't there a short description of %U for log_line_prefix in postgresql.conf? Did you remove it in the latest version?

* The long csvlog_fields default is a problem in postgresql.conf too, but we have no solution for that issue... The current code is the best we can do for now.

* In assign_csvlog_fields(), we need to clean up memory and the memory context before returning on error:

+ /* check if no option matched, and if so, return error */
+ if (curr_option == MAX_CSVLOG_OPTS)
+     return NULL;

* A needlessly added "return" should be removed:

*** 2139,2144 write_csvlog(ErrorData *edata)
--- 2379,2386
      write_pipe_chunks(buf.data, buf.len, LOG_DESTINATION_CSVLOG);
      pfree(buf.data);
+
+     return;
  }

--
Itagaki Takahiro
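The "skip the header when the file is not empty" idea can be sketched as follows (a Python illustration of the policy, not the logger's actual code):

```python
import csv
import os

def append_csvlog_row(path, fields, row):
    """Write the header only when the target file is empty, so a freshly
    rotated (or newly created) file gets exactly one header, while an
    existing file never gets a header line in the middle."""
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if need_header:
            writer.writerow(fields)
        writer.writerow(row)
```

As noted above, this breaks down only if the field list changes between opens of the same file, since the existing header would then no longer describe later rows.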
Re: [HACKERS] SQL/MED - file_fdw
On Mon, Feb 14, 2011 at 22:06, Noah Misch n...@leadboat.com wrote:
> On Sun, Feb 13, 2011 at 12:41:11PM -0600, Kevin Grittner wrote:
>> Unpatched:
>> real 17m24.171s
>> real 16m52.892s
>> real 16m40.624s
>> real 16m41.700s
>> Patched:
>> real 15m56.249s
>> real 15m47.001s
>> real 15m3.018s
>> real 17m16.157s
>>
>> Since you said that a cursory test, or no test at all, should be good
>> enough given the low risk of performance regression, I didn't book a
>> machine and script a large test run, but if anyone feels that's
>> justified, I can arrange something.
>
> Based on this, I've taken the liberty of marking the patch Ready for
> Committer.

Thank you very much for the performance testing and reviewing! The result is interesting, because I didn't intend any performance optimization; at the least, no performance regression is good enough for this purpose.

--
Itagaki Takahiro
Re: [HACKERS] Add support for logging the current role
On Mon, Feb 14, 2011 at 23:30, Stephen Frost sfr...@snowman.net wrote:
>> * In assign_csvlog_fields(), we need to clean up memory and the
>> memory context before returning on error.
>
> Fixed this and a couple of similar issues.

Not yet fixed: the switched memory context is not restored on error.

> Updated patch attached, git log below.

Now I mark the patch Ready for Committer, because I don't have any more suggestions. For reference, I note my previous questions. Some of them might be TODO items, and some might not; we can add the basic feature in 9.1 and improve it in 9.2 or later versions.

* csvlog_fields and csvlog_header won't work with a non-default log_filename that doesn't contain seconds in its format, because they expect to always be able to open empty log files.

* The long default value for csvlog_fields leads to a long text line in postgresql.conf, SHOW ALL, and the pg_settings view, but there were no better alternative solutions in the past discussion.

* csvlog_fields is marked as PGC_POSTMASTER. That protects against mixed formats in one CSV file with the default log_filename, but other similar GUC variables are usually marked as PGC_SIGHUP.

--
Itagaki Takahiro
Re: [HACKERS] multiset patch review
On Sat, Feb 12, 2011 at 00:50, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> Right, but making the parser slower has a cost, too.
>> ScanKeywordLookup() is already a hotspot in some workloads, and
>> there's overhead buried in the bison parser, too.
>
> Yeah. Keep in mind that a bison parser fundamentally runs off a
> two-dimensional array: one axis is parser state and the other is token
> number. They have some tricks to compress the array a bit, but adding
> keywords contributes directly to a bigger array, which means slower
> parsing (more L1 cache misses). The parser's inner loop frequently
> shows up as a hotspot in profiles I do, and I think that has to be
> more about the amount of data it's touching than the cost of the loop
> per se.

Did you measure the actual cost in the real world? If the parser is that sensitive, it should be a problem even without the patch. Adding unnecessary keywords is something to be avoided, but is our conclusion that we will never support the MULTISET syntax from the SQL standard? If we think those keywords are unnecessary, we cannot support it.

I will remove the parser changes from the patch; it will add only a few array functions. Then, please let me know which functions you don't want to include in core, if any. I'll remove them at the same time.

--
Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Sat, Feb 12, 2011 at 01:12, Robert Haas robertmh...@gmail.com wrote: On Wed, Feb 9, 2011 at 2:03 AM, Noah Misch n...@leadboat.com wrote: From a functional and code structure perspective, I find this ready to commit. (I assume you'll drop the XXX: indent only comments on commit.) Kevin, did you want to do that performance testing you spoke of?

OK, so is this Ready for Committer, or are we still working on it? Basically, we have no more tasks until the FDW core API is applied; the COPY API and file_fdw patches are waiting for it. If we want to extend them a little more, I'd raise two items:

* Should we print foreign table names in error messages? http://archives.postgresql.org/pgsql-hackers/2011-02/msg00427.php
* The COPY encoding patch was rejected, but using client_encoding is logically wrong for file_fdw. We might need a subset of that patch for file_fdw.

-- Itagaki Takahiro
Re: [HACKERS] multiset patch review
On Sat, Feb 12, 2011 at 05:01, Stephen Frost sfr...@snowman.net wrote: Input arrays are always flattened into one-dimensional arrays. That just strikes me as completely broken when it comes to PG Arrays.

The contains operators (@>, <@, &&) ignore multi-dimensions, and the array slice operator ([lo:hi]) always resets the index offsets.

What does the spec say about this, if anything? Is that required by spec, or is the spec not relevant to this because MULTISETs are only one dimensional..?

Yes. The SQL standard has only one-dimensional ARRAYs and MULTISETs, though it supports collections of collections, which we don't have.

I would think that we would have a way to define what dimension or piece of the array that would be sorted or returned as a set, etc.

It would be consistent if we returned (ARRAY[][])[] = ARRAY[], but we actually throw errors.

In my view, we should be throwing an error if we get called on a multi-dim array and we can't perform the operation on that in an obviously correct way, not forcing the input to match something we can make work.

Agreed, I'll do so. We could also keep the lower bounds of arrays, at least on sort.

I'm not thrilled with calling ArrayGetNItems array_cardinality(). Why not just provide a function with a name like array_getnitems() instead?

We must use that name, because it is in the SQL standard: FUNCTION cardinality(collection) = number. We would have two overloaded versions, for ARRAYs and MULTISETs.

trim_array() suffers from two problems: lack of comments, and no spell checking done on those that are there.

What comments do you want?

Should get_type_cache() really live in array_userfuncs.c?

Do you see code doing the same operation in other files?

There's more, primarily lack of comments and what I consider poor function naming (sort_or_unique()? really?),

Could you suggest better names?
-- Itagaki Takahiro
Re: [HACKERS] Transaction-scope advisory locks
On Thu, Feb 10, 2011 at 08:36, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: One issue might be in pg_locks Robert suggested not doing this for 9.1, and I don't have anything against that.

Agreed.

Updated patch attached.

Looks good to commit. I note a few minor issues for the committer:

* The functions listed in Table 9-62. Advisory Lock Functions might need to be sorted in alphabetical order.
* We could extend LockReleaseAll() with a third mode instead of adding LockReleaseSession(). The existing behavior is:
| LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
| allLocks == true: release all locks including session locks.
| allLocks == false: release all non-session locks.
* Or, we might have one subroutine shared by LockReleaseSession() and LockReleaseCurrentOwner(); they have similar code.

-- Itagaki Takahiro
Re: [HACKERS] Add support for logging the current role
On Mon, Feb 7, 2011 at 04:10, Stephen Frost sfr...@snowman.net wrote: Yeah, doesn't seem to work for me (missing '/bin/collateindex.pl', apparently..).

You might need yum install openjade stylesheets or similar packages, and then re-configure.

Ok, I've cleaned up that part of the documentation to be a table instead of the listings that were there, seems like a better approach anyway.

Yeah, that's a nice job!

I agree that it's logically good design, but we could not accept it as long as it breaks tools in the real world... If it does, I think it's pretty clear that those tools are themselves broken..

The word "break" was a poor choice on my part, but your new parameter still requires very wide monitors to display SHOW ALL and pg_settings. I'd like to solve that issue even though the feature itself is useful. One quick solution might be to set the default value to "default", meaning the compatible set of columns. Any better ideas?

For the implementation, write_csvlog() has many lines like the following:
if (curr_field != num_fields) appendStringInfoChar(buf, ',');
It will be cleaner if we add a first_col flag and move this out of the switch statement.

The other questions I raised before might be matters of preference. I'd like to hear about them from a third party:
* name: log_csv_fields vs. csvlog_fields
* when to assign: PGC_POSTMASTER vs. PGC_SIGHUP

-- Itagaki Takahiro
Re: [HACKERS] psql patch: tab-complete :variables also at buffer start
On Thu, Feb 10, 2011 at 19:37, Christoph Berg c...@df7cb.de wrote: Currently, tab-completing :variable names in psql does not work at the beginning of the line. Fix this by moving the code block before the empty buffer case. Seems reasonable to me. -- Itagaki Takahiro
Re: [HACKERS] Transaction-scope advisory locks
On Thu, Feb 3, 2011 at 00:24, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote: .. and here's the patch. I'm not too confident with the code I added to storage/lmgr/lock.c, but it seems to be working.

Sorry for the delayed review. The patch needs its OIDs adjusted for recent commits, but it still works well. See the attached small fix. The patch looks almost ready to commit, unless we want to fix the pg_locks issue below.

=== Features ===
Now the unlock functions only release session-level locks, and that behavior is documented, so no confusion here. We don't have an upgrade method for advisory locks at the moment -- session and xact locks block each other, but they are acquired and released independently.

One issue might be in pg_locks, as you pointed out in your previous mail: if a session holds both a transaction-level and a session-level lock on the same resource, only one of them will appear in pg_locks. Also, we cannot distinguish transaction-level locks from session-level locks in pg_locks. This was not an issue before because session locks were only used in the internal implementation; they looked like part of a transaction to users. However, this feature exposes the state publicly. We might need to add some bits to the shared lock state to show which locks are session-level.

=== Implementation ===
* pg_advisory_unlock_all() calls LockReleaseSession(), and that releases not only advisory locks but all session-level locks. We use session-level locks in some places, but there is no chance for a user to send SQL commands while such a lock is held. The behavior is safe as of now, but it might break something in the future, so I'd recommend keeping the locktype checks in it.
* user_lockmethod.transactional was changed to 'true', so the only difference between it and default_lockmethod is now trace_flag. LockMethodData is now almost useless, but we could keep it for compatibility.

Earlier there was some discussion about adding regression tests for advisory locks.
However, I don't see where they would fit in our current .sql files and adding a new one just for a few tests didn't seem right. Anyone have an idea where they should go or should I just add a new one? I think you can add advisory_lock.sql for the test. -- Itagaki Takahiro advisory4fix.patch Description: Binary data
Re: [HACKERS] another mvcc.sgml typo
On Thu, Feb 10, 2011 at 09:30, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Trivial patch attached. Applied. Thanks! -- Itagaki Takahiro
Re: [HACKERS] exposing COPY API
On Tue, Feb 8, 2011 at 09:38, Andrew Dunstan and...@dunslane.net wrote: Here is a patch against the latest revision of file_fdw to exercise this API. It includes some regression tests, and I think apart from one or two small details plus a requirement for documentation, is complete.

The patch contains a few typo fixes for the original patch. Hanada-san, could you take them into the core file_fdw patch?

CREATE FOREIGN TABLE jagged_text ( t text[] ) SERVER file_server OPTIONS (format 'csv', filename '/home/andrew/pgl/pg_head/contrib/file_fdw/data/jagged.csv', header 'true', textarray 'true');

There might be another approach -- we could have a jagged_file_fdw alongside file_fdw, because in textarray mode we cannot support some features, like the force_not_null option and multiple non-text[] columns.

I'll include NextCopyFromRawFields() in the COPY API patch to complete raw_fields support in CopyState. (Or we should revert the changes related to raw_fields.) However, we'd better postpone jagged csv support to 9.2; the design is still under discussion.

-- Itagaki Takahiro
Re: [HACKERS] [GENERAL] Issues with generate_series using integer boundaries
On Mon, Feb 7, 2011 at 20:38, Thom Brown t...@linux.com wrote: Yes, of course, int8 functions are separate. I attach an updated patch, although I still think there's a better way of doing this.

Thanks. Please add the patch to the *current* commitfest because it's a bugfix. https://commitfest.postgresql.org/action/commitfest_view?id=9

I've not tested the patch yet, but if we could drop the following line in the patch, the code could be much cleaner:
/* ensure first value in series should exist */

I'm not sure how this should be handled. Should there just be a check for either kind of infinity and return an error if that's the case?

Maybe so. It would also work if we returned infinity on timestamp overflow, but I've not tested that yet. Anyway, we need a similar fix for the timestamp versions.

-- Itagaki Takahiro
Re: [HACKERS] [GENERAL] Issues with generate_series using integer boundaries
On Wed, Feb 9, 2011 at 10:17, Andrew Dunstan and...@dunslane.net wrote: Isn't this all really a bug fix that should be backpatched, rather than a commitfest item? Sure, but we don't have any bug trackers... -- Itagaki Takahiro
Re: [HACKERS] pg_dump directory archive format / parallel pg_dump
On Tue, Feb 8, 2011 at 13:34, Robert Haas robertmh...@gmail.com wrote: So how close are we to having a committable version of this? Should we push this out to 9.2?

I think so. The feature is pretty attractive, but more work is required:
* Re-base on the synchronized snapshots patch.
* Consider using pipes on Windows, too.
* Research the libpq + fork() issue. We have a warning in the docs: http://developer.postgresql.org/pgdocs/postgres/libpq-connect.html
| On Unix, forking a process with open libpq connections can lead to unpredictable results

-- Itagaki Takahiro
Re: [HACKERS] Error attribution in foreign scans
On Mon, Feb 7, 2011 at 22:47, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On Mon, Feb 7, 2011 at 21:17, Noah Misch n...@leadboat.com wrote: The message does not show which foreign table yielded the error. We could evade the problem in this case by adding a file name to the error message in the COPY code, Yeah, an error context callback like that makes sense. In the case of the file FDW, though, just including the filename in the error message seems even better. Especially if the error is directly related to failure in reading the file.

What do you think about filenames in terms of security? We will allow non-superusers to use existing file_fdw foreign tables. For reference, we hide some path settings in GUC variables.

We should also reconsider the privileges on fdwoptions, umoptions, etc. They could contain passwords or server-side paths, but all users can retrieve the values. It's an existing issue, but it will become more serious in 9.1.

-- Itagaki Takahiro
Re: [HACKERS] Range Type constructors
On Wed, Feb 9, 2011 at 14:50, Jeff Davis pg...@j-davis.com wrote: 1. The obvious constructor would be: range(1, 10) But is that [1, 10), (1, 10], (1, 10), or [1, 10]? We need to support all 4, and it's not obvious how to do that easily.

There is the same issue in table partitioning; also, we might use the syntax for our partitioning in the future. Just for reference, DB2 uses EXCLUSIVE and INCLUSIVE keywords to specify boundaries:

CREATE TABLE ... PARTITION BY RANGE (...) (STARTING 0 EXCLUSIVE ENDING 100 INCLUSIVE)
http://publib.boulder.ibm.com/infocenter/db2luw/v9r8/index.jsp?topic=/com.ibm.db2.luw.sql.ref.doc/doc/r927.html

I'm not sure it is the best syntax, but at least it's easy for beginners to read, and it works with parenthesis completion in text editors.

-- Itagaki Takahiro
Re: [HACKERS] [GENERAL] Issues with generate_series using integer boundaries
On Fri, Feb 4, 2011 at 21:32, Thom Brown t...@linux.com wrote: The issue is that generate_series will not return if the series hits either the upper or lower boundary during increment, or goes beyond it. The attached patch fixes this behaviour, but should probably be done a better way.

There is the same bug in the int8 and timestamp[tz] versions; we need a fix for them as well. The first 3 examples below will not return:

=# SELECT x FROM generate_series(9223372036854775807::int8, 9223372036854775807::int8) AS a(x);
=# SELECT x FROM generate_series('infinity'::timestamp, 'infinity', '1 sec') AS a(x);
=# SELECT x FROM generate_series('infinity'::timestamptz, 'infinity', '1 sec') AS a(x);

postgres=# SELECT x FROM generate_series(1, 9, -1) AS a(x);
postgres=# SELECT x FROM generate_series(1, 9, 3) AS a(x);

These last two work as expected in 9.1dev.

-- Itagaki Takahiro
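To see why the series never returns at the boundary: once `current` reaches the type's maximum, `current += step` wraps around and the loop condition stays true forever. A standalone sketch of the fix being discussed (hypothetical helper, not the actual patch): check whether the increment would overflow *before* performing it, and treat the just-emitted value as the last one.

```c
#include <stdint.h>

/* Sketch only (not the PostgreSQL patch): count the values a
 * generate_series(start, stop, step) call should emit, terminating
 * correctly even when stop sits at the edge of the int64 range. */
static int64_t
series_count(int64_t start, int64_t stop, int64_t step)
{
    int64_t count = 0;
    int64_t current = start;

    if (step == 0)
        return -1;              /* step 0 is an error in PostgreSQL */

    while (step > 0 ? current <= stop : current >= stop)
    {
        count++;                /* "emit" current */
        /* stop before the increment can wrap around the int64 range */
        if (step > 0 ? current > INT64_MAX - step
                     : current < INT64_MIN - step)
            break;
        current += step;
    }
    return count;
}
```

The buggy version increments unconditionally, so `generate_series(INT64_MAX, INT64_MAX)` wraps to `INT64_MIN` and loops forever; with the pre-check it emits exactly one row.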
Re: [HACKERS] SQL/MED - file_fdw
On Mon, Feb 7, 2011 at 16:01, Shigeru HANADA han...@metrosystems.co.jp wrote: This patch is based on latest FDW API patches which are posted in another thread SQL/MED FDW API, and copy_export-20110104.patch which was posted by Itagaki-san.

I have questions about estimate_costs():
* What value does baserel->tuples have? Foreign tables are never analyzed for now. Is the number correct?
* Your previous measurement showed a much higher startup_cost. When you removed ReScan, it took a long time but the planner didn't choose materialized plans. That might come from the startup cost being estimated too low.
* Why do you use lstat() in it? Even if the file is a symlink, we will read the linked file in the subsequent copy. So I think it should be stat() rather than lstat().

+estimate_costs(const char *filename, RelOptInfo *baserel,
+               double *startup_cost, double *total_cost)
+{
...
+    /* get size of the file */
+    if (lstat(filename, &stat) == -1)
+    {
+        ereport(ERROR,
+                (errcode_for_file_access(),
+                 errmsg("could not stat file \"%s\": %m", filename)));
+    }
+
+    /*
+     * The way to estimate costs is almost same as cost_seqscan(), but there
+     * are some differences:
+     * - DISK costs are estimated from file size.
+     * - CPU costs are 10x of seq scan, for overhead of parsing records.
+     */
+    pages = stat.st_size / BLCKSZ + (stat.st_size % BLCKSZ > 0 ? 1 : 0);
+    run_cost += seq_page_cost * pages;
+
+    *startup_cost += baserel->baserestrictcost.startup;
+    cpu_per_tuple = cpu_tuple_cost + baserel->baserestrictcost.per_tuple;
+    run_cost += cpu_per_tuple * 10 * baserel->tuples;
+    *total_cost = *startup_cost + run_cost;
+
+    return stat.st_size;
+}

-- Itagaki Takahiro
Re: [HACKERS] Add support for logging the current role
On Sun, Feb 6, 2011 at 23:31, Stephen Frost sfr...@snowman.net wrote: * Itagaki Takahiro (itagaki.takah...@gmail.com) wrote: I think we need to improve postgresql.conf.sample a bit more, especially the long line for #log_csv_fields = '...'. 330 characters in it! #1. Leave the long line because it is needed. It's needed to match what the current default is.. I agree that it's logically good design, but we could not accept it as long as it breaks tools in the real world...

Will it break SHOW ALL and SELECT * FROM pg_settings output? I'm worried that they are not designed to display such a long value.

I think it depends on the default log filename, which contains a %S (seconds) suffix. We can remove %S from log_filename; if we use one log per day, those logs might contain different columns even after a restart. If we cannot avoid jagged csv fields completely, SIGHUP seems reasonable for it.

This problem is bigger than SIGHUP, but at least with a restart required, the default will work properly. The default configuration wouldn't work w/ a change to the log line and a SIGHUP done.

Working only with the default settings is just wrong design. If we cannot provide a perfect solution, we should allow users to control everything as they like. I still think PGC_SIGHUP is the best mode for the parameter, with a note of caution in the docs.

-- Itagaki Takahiro
Re: [HACKERS] exposing COPY API
Here is a demonstration of supporting jagged input files. It's a patch on top of the latest patch. The newly added API is:

bool NextLineCopyFrom(
    [IN] CopyState cstate,
    [OUT] char ***fields, [OUT] int *nfields, [OUT] Oid *tupleOid)

It just returns the separated fields of the next line. Fortunately, I needed no extra code for it because it is simply extracted from NextCopyFrom(). I'm willing to include the change in the copy APIs, but we still have a few issues. See below.

On Fri, Feb 4, 2011 at 16:53, Andrew Dunstan and...@dunslane.net wrote: The problem with COPY FROM is that nobody's come up with a good syntax for allowing it as a FROM target. Doing what I want via FDW neatly gets us around that problem. But I'm quite OK with doing the hard work inside the COPY code - that's what my working prototype does in fact.

I think it is not only a syntax issue. I found an issue: it is hard to support the FORCE_NOT_NULL option for extra fields. See FIXME in the patch. This is a fundamental problem in supporting jagged fields.

One thing I'd like is to to have file_fdw do something we can't do another way. currently it doesn't, so it's nice but uninteresting.

BTW, how do you determine which field is shifted in your broken CSV file? For example, the case where you find AB,CD,EF for a 2-column table. I could provide a raw CSV reader for jagged files, but you would still have to cook the returned fields into a proper tuple...

-- Itagaki Takahiro jagged_csv_api-20110204.patch Description: Binary data
Re: [HACKERS] multiset patch review
On Sat, Feb 5, 2011 at 02:29, Robert Haas robertmh...@gmail.com wrote: I am still not in favor of adding this syntax.

I also don't like the syntax, but unfortunately, almost all of it is in the SQL standard :-(. In addition, Oracle already provides the same feature with the special syntax, though it supports true multiset data types.

[It sure is awkward that trim_array has the word array second and all of our other array functions have it first - is this required by the spec?]

Yes, again, but we could have array_trim() as an alias.

array_to_set() and array_is_set() seem possibly useful, but I probably would have called them array_remove_dups() and array_has_dups(). I might be in the minority on that one, though.

array_to_set is the only function we can rename freely. Another candidate might be array_unique (contrib/intarray uses uniq). array_is_set() is the internal implementation of the IS A SET operator, so its name is not so important (and not documented).

I think array_subset(), array_union(), array_intersect(), and array_except() are useful, but I think they should just be regular functions, without any special syntax.

All of the special syntax is in the spec, except that the argument types should be multisets rather than arrays.

fusion() and intersection() also seem useful, but maybe it would be more consistent to all them array_fusion() and array_intersection().

Both names are from the standard... We could have array_fusion() for array types and fusion() for multiset types, but I prefer an overloaded fusion() covering both.

-- Itagaki Takahiro
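For intuition about the semantics behind names like array_intersect(): in a multiset, each value carries a multiplicity, and intersection keeps each value min(count-in-A, count-in-B) times. A minimal sketch over sorted int arrays (illustrative only, not the patch's implementation):

```c
#include <stddef.h>

/* Multiset intersection of two sorted int arrays: each value appears
 * min(count_in_a, count_in_b) times in the result. Returns the number
 * of elements written to out (out must have room for min(na, nb)). */
static size_t
multiset_intersect(const int *a, size_t na,
                   const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na && j < nb)
    {
        if (a[i] < b[j])
            i++;                    /* a[i] missing from b: skip it */
        else if (a[i] > b[j])
            j++;                    /* b[j] missing from a: skip it */
        else
        {
            out[n++] = a[i];        /* matched one occurrence from each side */
            i++;
            j++;
        }
    }
    return n;
}
```

Sorting first is also what a name like sort_or_unique() hints at: once both inputs are sorted, intersection, union, and deduplication are all single merge passes.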
Re: [HACKERS] multiset patch review
On Sat, Feb 5, 2011 at 03:04, Robert Haas robertmh...@gmail.com wrote: I am still not in favor of adding this syntax. I also don't like the syntax, but unfortunately, almost all of them are in the SQL standard :-(. The standard specifies this syntax for arrays, or for multisets? Multisets. But I chose the same function name and syntax because arrays *are* multisets by definition. -- Itagaki Takahiro
Re: [HACKERS] multiset patch review
On Sat, Feb 5, 2011 at 04:24, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: In math class, maybe. But in programming, no. Multiset is a datatype. Array is a different datatype. There is no reason why we need to clutter our parser with extra keywords to support a non-standard feature extension. My understanding is that we will have to have those functions defined and user visible, and that we benefit from function overloading which is not in the standard. So there's no reason not to provide those functions for arrays already, then extend to full multiset support. Given PostgreSQL overloading, yes, arrays are multisets as far as defining those standard compliant APIs is concerned. AFAIUI. Yes, I'd like to use overloading. Choosing arbitrary names increases learning costs for users. -- Itagaki Takahiro
Re: [HACKERS] exposing COPY API
On Fri, Feb 4, 2011 at 09:48, Andrew Dunstan and...@dunslane.net wrote: I'd like to be able to add a callback function to construct the values for the tuple. So it would become something like: typedef void (*copy_make_values) (CopyState cstate, int NumFieldsRead);

You probably can't do anything interesting in the callback, because the details of CopyState are not exported yet. Also, we should pass a user context through for this kind of callback; the prototype would need a void *userdata parameter.

Of course, I want this so I could construct a text array from the read in data, but I could also imagine a foreign data wrapper wanting to mangle the data before handing it to postgres, say by filling in a field or hashing it.

Could you explain the actual use cases, with examples? I think we need SQL-level extensibility if we provide such flexibility; I guess typical users don't want to write a C function for each kind of input file. Note that pg_bulkload has a similar feature:

CREATE FUNCTION my_function(...) RETURNS record AS ...;
COPY tbl FROM 'file' WITH (make_record_from_line = my_function)

-- Itagaki Takahiro
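The userdata point can be illustrated with a minimal callback shape (all names here are hypothetical, not the proposed API): a caller-supplied `void *` lets the callback carry its own state without globals, which is the standard C idiom for this.

```c
/* Hypothetical callback shape: the caller-supplied userdata pointer
 * carries per-invocation state into the callback. */
typedef void (*copy_make_values_fn)(void *userdata, char **fields, int nfields);

/* Example callback: accumulate the total number of fields seen. */
static void
count_fields(void *userdata, char **fields, int nfields)
{
    (void) fields;              /* this callback only counts */
    *(int *) userdata += nfields;
}

/* Stand-in for the COPY loop: feed two "rows" to the callback. */
static void
run_copy(copy_make_values_fn cb, void *userdata)
{
    char *row1[] = {"a", "b"};
    char *row2[] = {"c", "d", "e"};

    cb(userdata, row1, 2);
    cb(userdata, row2, 3);
}
```

A real make-values callback would build datums from the fields instead of counting them, but the state-passing shape is the same.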
Re: [HACKERS] exposing COPY API
On Fri, Feb 4, 2011 at 11:32, Andrew Dunstan and...@dunslane.net wrote: Umm, where? I can't find this in the documentation http://pgbulkload.projects.postgresql.org/pg_bulkload.html

Here: http://pgbulkload.projects.postgresql.org/pg_bulkload.html#filter

The object, as I have explained previously, is to have a FDW that returns a text array from a (possibly irregularly shaped) file.

I remember the text array proposal, but if the extension is written in C, it can only handle one kind of input file. If another file is broken in a different way, you need to rewrite the C code, no?

-- Itagaki Takahiro
Re: [HACKERS] exposing COPY API
On Fri, Feb 4, 2011 at 12:17, Andrew Dunstan and...@dunslane.net wrote: http://pgbulkload.projects.postgresql.org/pg_bulkload.html#filter AFAICT, this doesn't support ragged tables with too many columns, which is my prime use case. If it supported variadic arguments in filter functions it might come closer.

That would be a good improvement for pg_bulkload, but it's off-topic ;-)

Also, a FDW allows the COPY to be used as a FROM target, giving it great flexibility. AFAICT this does not.

BTW, which do you want to improve, FDW or COPY FROM? If FDW, the better API would be a raw version of NextCopyFrom(). For example:

bool NextRawFields(CopyState cstate, char **fields, int *nfields)

The calling FDW is responsible for forming tuples from the raw fields. If you need to customize how tuples are formed from the various fields, the FDW needs to provide that extensibility as well. If COPY FROM, we should implement all the features in copy.c rather than in exported APIs.

-- Itagaki Takahiro
Re: [HACKERS] pg_dump directory archive format / parallel pg_dump
On Wed, Feb 2, 2011 at 13:32, Joachim Wieland j...@mcknight.de wrote: Here is a rebased version with some minor changes as well.

I read the patch as working as follows; am I understanding correctly?
1. Open all connections in a parent process.
2. Start transactions for each connection in the parent.
3. Spawn child processes with fork().
4. Each child process uses one of the inherited connections.

I think we have 2 important technical issues here:
* The consistency is not perfect. Each transaction is started with small delays in step 1, but we cannot guarantee that no other transaction runs between them.
* Can we inherit connections to child processes with fork()? Moreover, we also need to pass running transactions to the children. I wonder whether libpq is designed for such usage.

To solve both issues, we might want a way to control visibility in the database server instead of in client programs. Don't we need server-side support like [1] before developing parallel dump?
[1] http://wiki.postgresql.org/wiki/ClusterFeatures#Export_snapshots_to_other_sessions

I haven't tested it on Windows now but will do so as soon as the Unix part has been reviewed.

It might be better to remove the Windows-specific code from the first try. I doubt the Windows message queue is the best API for such a console-based application. I hope we can use the same implementation for inter-process/thread communication on all platforms.

-- Itagaki Takahiro
Re: [HACKERS] Add ENCODING option to COPY
On Tue, Feb 1, 2011 at 13:08, Hitoshi Harada umi.tan...@gmail.com wrote: The third patch is attached, modifying mb routines so that they can receive conversion procedures as FmgrInfo * and save the function pointer in CopyState. I tested it with encoding option and could not see performance slowdown. Hmm, sorry, the patch was wrong. Correct version is attached.

Here is a brief review of the patch.

* Syntax: ENCODING encoding vs. ENCODING 'encoding'
We don't have to quote encoding names with the patch, but we always need quotes for CREATE DATABASE WITH ENCODING. I think we should modify CREATE DATABASE to accept unquoted encoding names, separately from this patch.

The changes in pg_wchar.h are the most debatable part of the patch. The patch adds pg_cached_encoding_conversion(). We normally use pg_do_encoding_conversion(), but that is slower than the proposed function because it looks up the conversion proc in the catalog on every call.

* Can we use #ifndef FRONTEND in the header? Use of fmgr.h members will break client applications without the #ifdef, but I guess client apps don't always define FRONTEND. If we don't want to change pg_wchar.h, pg_conversion_fn.h might be a better header for the new API because FindDefaultConversion() is in it.

The changes in copy.c look good except for a few trivial cosmetic issues:
* encoding_option could be a local variable instead of a cstate member.
* cstate->client_encoding is renamed to target_encoding, but I'd prefer file_encoding or remote_encoding.
* CopyState could hold the conv_proc struct as a member instead of a pointer.
* The need_transcoding checks could be replaced with conv_proc == NULL checks.

-- Itagaki Takahiro
Re: [HACKERS] ALTER EXTENSION UPGRADE, v3
On Wed, Feb 2, 2011 at 03:21, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: PFA version 3 of the ALTER EXTENSION PATCH, cleaned and merged against recent HEAD and extension's branch from which I just produced the v30 patch.

Excuse me for asking, but could you explain the purpose? Which is intended: upgrading to 9.1 from past versions, or upgrading from 9.1 to future versions? Also, how much advantage will we have compared with uninstall_MODULE.sql + CREATE EXTENSION?

In my understanding, the patch does two things:
1. Add ALTER object SET EXTENSION
2. Add a MODULE.upgrade.sql script for each contrib module

#1 seems reasonable as a supplement to the CREATE EXTENSION patch, but we might need not only attach but also detach. I guess #2 is much more difficult than expected, because we might upgrade databases from older versions. Will we need an upgrade script for each supported version? If so, it would be a nightmare...

-- Itagaki Takahiro
Re: [HACKERS] log_checkpoints and restartpoint
On Mon, Jan 31, 2011 at 05:17, Robert Haas robertmh...@gmail.com wrote:

The attached patch changes LogCheckpointEnd so that it logs the number of WAL files created/deleted/recycled even in a restartpoint. This patch looks good to me. For now, I'm marking it Ready for Committer. Absent objections, I will also commit it.

I don't have any objections to the patch, but we might also need to add a description of restartpoints to the log_checkpoints documentation: http://developer.postgresql.org/pgdocs/postgres/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT

-- Itagaki Takahiro
Re: [HACKERS] Add support for logging the current role
Updated patch attached.

I think we need to improve postgresql.conf.sample a bit more, especially the long line for #log_csv_fields = '...' -- 330 characters in it!
#1. Leave the long line because it is needed.
#2. Hide the variable from the default conf.
#3. Use short %x mnemonics in both log_line_prefix and log_csv_fields. (It might require many additional mnemonics.)
Which is better, or is there another idea?

On Sat, Jan 29, 2011 at 13:06, Stephen Frost sfr...@snowman.net wrote:

* log_csv_fields's GUC context is PGC_POSTMASTER. Is it by design?

Doing SIGHUP would require addressing how to get all of the backends to close the old log file and open the new one, because we don't want to have a given log file which has two different CSV formats in it (we wouldn't be able to load it into the database...). This was specifically addressed in the thread leading up to this patch...

I think it depends on the default log filename, which contains a %S (seconds) suffix. We can remove %S from log_filename; if we use one log per day, those logs might contain different columns even after a restart. If we cannot avoid jagged CSV fields completely, SIGHUP seems reasonable for it.

* What objects do you want to allocate in TopMemoryContext in assign_log_csv_fields()?

I just moved the switch to Top to be after those are allocated.

How about changing the type of csv_log_fields from List * to a fixed array of LogCSVFields? If so, we could use an array initializer instead of build_default_csvlog_list(), and the code would be simplified. Fixed length won't be a problem because it would be rare for the same field to be specified many times.

* Docs need some tags for itemized elements or pre-formatted code. They look itemized in the sgml files, but will be flattened in the compiled HTML files.

Not sure what you're referring to here...? Can you elaborate? I'm not great with the docs. :/

Could you try running "make html" in the doc directory? Your new documentation after "These columns may be included in the CSV output:" will be unaligned plain text without some tags.

-- Itagaki Takahiro
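The fixed-array suggestion could look roughly like this (an illustrative sketch, not the patch itself; field names follow the thread, emitters are hypothetical placeholders): a statically initialized table replaces build_default_csvlog_list(), and the writer joins fields so no separator trails the last column.

```python
# Statically initialized field table: (field name, emitter) pairs.
CSV_LOG_FIELDS = (
    ("log_time",         lambda rec: rec["time"]),
    ("user_name",        lambda rec: rec["user"]),
    ("role_name",        lambda rec: rec.get("role", "")),
    ("application_name", lambda rec: rec.get("app", "")),
)

def write_csv_line(record, fields=CSV_LOG_FIELDS):
    # join() puts separators only between fields, so the last column gets
    # no trailing comma regardless of which field comes last
    return ",".join(str(emit(record)) for _, emit in fields)
```

A shortened field list is just a slice of the table, with no list building at assign time.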
Re: [HACKERS] Transaction-scope advisory locks
On Fri, Jan 28, 2011 at 17:12, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote:

I still didn't address the issue with pg_advisory_unlock_all() releasing transaction-scoped locks.

I guess you don't want independent locks, right? If a user object is locked by session locks, it also blocks backends trying to lock it with transaction locks. If so, I think an ideal behavior is as below:
- The transaction-or-session property is overwritten by the last lock function call. We can promote session locks from/to transaction locks.
- Shared and exclusive locks are managed independently. We could have a shared session lock and an exclusive transaction lock on the same resource in a transaction.
- Unlock functions release both transaction and session locks.
- unlock_all() releases both kinds of locks.
Those might be odd from a DBMS perspective, but would be natural from a programming-language perspective. I guess advisory locks are often used in ordinary programming-style flows.

Another issue I found while testing the behaviour here: http://archives.postgresql.org/pgsql-hackers/2011-01/msg01939.php is that if a session holds both a transaction-level and a session-level lock on the same resource, only one of them will appear in pg_locks. Is that going to be a problem from the user's perspective? Could it be an indication of a well-hidden bug? Based on my tests it seems to work, but I'm not at all confident with the code.

In the above proposal, we won't have both a session and a transaction lock on the same resource at the same time, though we still need to show exclusive and shared locks on different lines.

-- Itagaki Takahiro
Re: [HACKERS] Spread checkpoint sync
On Mon, Jan 31, 2011 at 13:41, Robert Haas robertmh...@gmail.com wrote:

1. Absorb fsync requests a lot more often during the sync phase. 2. Still try to run the cleaning scan during the sync phase. 3. Pause for 3 seconds after every fsync. So if we want the checkpoint to finish in, say, 20 minutes, we can't know whether the write phase needs to be finished by minute 10 or 15 or 16 or 19 or only by 19:59.

We probably need deadline-based scheduling, as is already used in the write() phase. If we want to sync 100 files in 20 minutes, each file should be synced within 12 seconds, assuming each fsync takes the same time. If we had a better estimation algorithm (file size? dirty ratio?), each fsync could have some weight factor. But deadline-based scheduling would still be needed then.

BTW, we should not sleep during a full-speed checkpoint. The CHECKPOINT command, shutdown, pg_start_backup(), and some of the checkpoints during recovery might not want to sleep.

-- Itagaki Takahiro
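A minimal sketch of the deadline-based scheduling suggested above (illustrative only, not backend code): given a sync-phase time budget, each file's fsync gets a deadline proportional to its weight -- here, file size -- so 100 equal files in 20 minutes yield one fsync every 12 seconds.

```python
def sync_deadlines(file_sizes, budget_seconds):
    """Return each file's fsync deadline (seconds from sync-phase start),
    giving every file a share of the budget proportional to its size."""
    total = float(sum(file_sizes))
    deadlines = []
    elapsed = 0.0
    for size in file_sizes:
        elapsed += budget_seconds * (size / total)  # this file's slice
        deadlines.append(elapsed)
    return deadlines
```

Unequal sizes simply skew the slices; a fixed 3-second pause, by contrast, makes total sync time depend on the file count rather than the budget.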
Re: [HACKERS] Add reference to client_encoding parameter
On Tue, Feb 1, 2011 at 00:37, Thom Brown t...@linux.com wrote:

I've attached a small patch for the docs which adds a reference to the client_encoding parameter description. This is in response to someone attempting to submit a comment which explains where available encodings can be found.

Thanks. It's a reasonable reference, but I reworded it as below, matching wording we use in a few other places: "The character sets supported by the PostgreSQL server are described in ..."

-- Itagaki Takahiro
Re: [HACKERS] setlocale and gettext in Postgres
2011/1/27 Hiroshi Inoue in...@tpf.co.jp:

I see now the following lines in libintl.h of version 0.18.1.1 which didn't exist in the 0.17 version. The macro may cause trouble, especially on Windows. Attached is a patch to disable the macro on Windows. Can anyone test the fix?

I added the patch to the current commitfest as a reminder: https://commitfest.postgresql.org/action/patch_view?id=528

-- Itagaki Takahiro
Re: [HACKERS] off-by-one mistake in array code error reporting
On Mon, Jan 31, 2011 at 17:19, Alexey Klyukin al...@commandprompt.com wrote:

While working on the PL/Perl patch for arrays as input arguments I've noticed that PostgreSQL reports one less dimension in the 'number of array dimensions (%d) exceeds the maximum allowed (%d)' message, i.e.

select '{{{1,2},{3,4}},{{5,6},{7,8}}},{{{9,10},{11,12}},{{13,14},{15,16, 17,18},{19,20}},{{21,22},{23,24}}},{{{25,26},{27,28}},{{29,30},{31,32}, {1,2},{3,4}},{{5,6},{7,8}}},{{{9,10},{11,12}},{{13,14},{15,16, 17,18},{19,20}},{{21,22},{23,24}}},{{{25,26},{27,28}},{{29,30},{31,32}}, {{1,2},{3,4}},{{5,6},{7,8}}},{{{9,10},{11,12}},{{13,14},{15,16, 17,18},{19,20}},{{21,22},{23,24}}},{{{25,26},{27,28}},{{29,30},{31,32}, {1,2},{3,4}},{{5,6},{7,8}}},{{{9,10},{11,12}},{{13,14},{15,16, 17,18},{19,20}},{{21,22},{23,24}}},{{{25,26},{27,28}},{{29,30},{31,32}}}' ::int[];
ERROR: number of array dimensions (6) exceeds the maximum allowed (6)

Attached is the simple fix for that.

Thanks. I found one more instance of the same bug in PL/pgSQL:

s=# DO $$ DECLARE a int[]; BEGIN a = (ARRAY[1])[1][1][1][1][1][1][1]; END; $$;
ERROR: number of array dimensions (6) exceeds the maximum allowed (6)

-- Itagaki Takahiro
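The class of bug being fixed can be shown in a few lines (a hypothetical sketch, not the actual array code): the check fires when one more dimension would exceed the maximum, but the buggy message prints the pre-increment count, producing the contradictory "(6) exceeds the maximum allowed (6)".

```python
MAXDIM = 6  # PostgreSQL's maximum number of array dimensions

def add_dimension_buggy(ndim):
    # off by one: reports the count *before* the dimension being added
    if ndim + 1 > MAXDIM:
        raise ValueError("number of array dimensions (%d) exceeds "
                         "the maximum allowed (%d)" % (ndim, MAXDIM))
    return ndim + 1

def add_dimension_fixed(ndim):
    # reports the count that actually exceeds the limit
    if ndim + 1 > MAXDIM:
        raise ValueError("number of array dimensions (%d) exceeds "
                         "the maximum allowed (%d)" % (ndim + 1, MAXDIM))
    return ndim + 1
```

With the fix, exceeding six dimensions reports "(7) exceeds the maximum allowed (6)", which is no longer self-contradictory.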
Re: [HACKERS] multiset patch review
On Mon, Jan 31, 2011 at 02:34, Robert Haas robertmh...@gmail.com wrote:

I'm in favor of rejecting this patch in its entirety. The functionality looks useful, but once you remove the syntax support, it could just as easily be distributed as a contrib module rather than in core. +1 ... if we're going to provide nonstandard behavior, it should be with a different syntax. Also, with a contrib module we could keep on providing the nonstandard behavior for people who still need it, even after implementing the standard properly.

Good point. I agree about the collect() function; it is the only function for which we cannot preserve compatibility once we have MULTISET. But the others are still reasonable because they don't provide nonstandard behavior. The SQL standard seems to have an abstract COLLECTION data type as a superclass of ARRAY and MULTISET, so it is reasonable that functions and operators that accept MULTISETs also accept ARRAYs. For example, we will have cardinality(ARRAY) even if we have cardinality(MULTISET). Also, trim_array() is in the SQL standard. I can remove some parts of the patch, especially the parser changes, but the others should still be in core.

-- Itagaki Takahiro
Re: [HACKERS] multiset patch review
On Mon, Jan 31, 2011 at 04:12, Robert Haas robertmh...@gmail.com wrote:

Well, do you want to revise this and submit a stripped-down version? I'm not averse to adding things that are required by the standard and won't cause backward compatibility problems later.

Sure. I'll remove the collect() function. I can also remove the syntax support, but that requires reworking the documentation (e.g., the IS A SET operator to an is_a_set function), so I'd like to wait a bit more for a final consensus.

The documentation for trim_array() in the current patch version is pretty terrible. The documentation describes it as having the prototype trim_array(anyarray), but it's called in the example as trim(integer[], integer) - two arguments vs. one.

Oops, that's just my mistake. trim(anyarray, integer) is correct.

Also the docs don't say how it decides how many elements to remove, or what happens to a multi-dimensional array.

I wrote the description below in the patch:
+ In functionarray_sort/, functionset/, and functiontrim_array/
+ functions, input arrays are always flattened into one-dimensional arrays.
+ In addition, the lower bounds of the arrays are adjusted to 1.
I'm not sure what the consistent behavior for multi-dimensional arrays is. For example, array_concat() is very strict, but the @ and operators don't care about the dimensions. I interpreted the second argument of trim_array() as a number of elements, but of course we can redefine it as a number of rows for multi-dimensional arrays.

-- Itagaki Takahiro
Re: pg_ctl failover Re: [HACKERS] Latches, signals, and waiting
On Mon, Jan 31, 2011 at 11:52, Fujii Masao masao.fu...@gmail.com wrote:

On Sat, Jan 29, 2011 at 1:11 AM, Tatsuo Ishii is...@postgresql.org wrote: Ok. I will write a C user function and add it to the pgpool source tree. I think it will be fairly easy to create a trigger file in the function. If the pg_ctl promote patch has been committed, I recommend that the C function send the signal to the startup process rather than creating the trigger file.

The C function needs to create a trigger file in $PGDATA/promote before sending signals, no? system("pg_ctl promote") seems the easiest way if you use an external module.

-- Itagaki Takahiro
Re: [HACKERS] Extensions support for pg_dump, patch v27
On Fri, Jan 28, 2011 at 18:03, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

After review, I included all your proposed changes, thanks a lot! Please find attached a new version of the patch, v29.

Additional questions and discussions:

* relocatable and schema seem to be duplicate options. We could treat an extension as relocatable when schema is not specified, instead of having a relocatable option. Or, if we keep the schema option, it could be used as the default schema to install into when WITH SCHEMA is not specified.

* The version field in pg_available_extensions might mislead when a newer version of an extension is available but an older version is installed. How about returning the installed version in the installed field instead of a boolean? The field would be NULL if not installed.

* I want to remove the O(n^2) behavior in pg_extensions(). It scans the pg_extension catalog to determine whether each extension is installed, but it would be better to change the function to return just the whole set of extensions and JOIN it with pg_extension in pg_available_extensions. (This is the same technique used in pg_stat_replication.)

-- Itagaki Takahiro
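The O(n^2) point can be sketched like this (hypothetical data shapes, not the catalog layout): the current code rescans the installed list once per available extension, while a JOIN-style approach builds one lookup and probes it. It also shows the "installed version instead of a boolean" idea, with None standing in for NULL.

```python
def available_quadratic(available, installed):
    # rescans `installed` for every available extension: O(n * m)
    return [(name, next((v for iname, v in installed if iname == name), None))
            for name in available]

def available_join(available, installed):
    # build the lookup once, then O(1) probes: the JOIN approach
    installed_map = dict(installed)
    return [(name, installed_map.get(name)) for name in available]
```

Both return the same rows; only the scan count differs.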
Re: [HACKERS] Extensions support for pg_dump, patch v27
On Thu, Jan 27, 2011 at 20:01, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

Ok, done. Of course, it solves the whole problem Itagaki had with adminpack because we stop relying on dependencies to get it right now.

I found that pg_restore with the -c option fails when an extension is created in pg_catalog. Since pg_catalog is only an implicit search target, we might need to add the schema to search_path explicitly. Note that I can restore adminpack with no errors because it has explicit pg_catalog. prefixes in the installer script.

postgres=# CREATE EXTENSION cube WITH SCHEMA pg_catalog;
$ pg_dump -Fc postgres > postgres.dump
$ pg_restore -d postgres -c postgres.dump
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1; 3996 16678 EXTENSION cube
pg_restore: [archiver (db)] could not execute query: ERROR: no schema has been selected to create in

BTW, I have a minor comment on the code:
extern bool extension_relocatable_p(Oid ext_oid);
What is _p? Also, we could make the function static because it is used only in extension.c.

-- Itagaki Takahiro
Re: [HACKERS] Extensions support for pg_dump, patch v27
On Thu, Jan 27, 2011 at 22:48, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

Itagaki Takahiro itagaki.takah...@gmail.com writes: I found pg_restore with -c option fails when an extension is created in pg_catalog. Nice catch, thank you very much (again) for finding those :) Seems good. extern bool extension_relocatable_p(Oid ext_oid); predicate. Maybe I've done too much Emacs Lisp coding at the time I added that function, but it looked natural (enough) to me :)

Hmmm, I like extension_is_relocatable() or so...

Including the above, I wrote a patch on top of your patch for minor cleanup. Please merge the reasonable parts of it.

* access() is not portable. The pre-check with access() doesn't seem needed because the same error will be raised in parse_extension_control_file().
* There is some dead code in the patch. For example, you exported ObjectAddresses to public, but it is not actually used in extension.c. I reverted some of the changes.
* Should we support absolute control file paths? Absolute paths are supported by get_extension_absolute_path(), but I'm not sure about actual use cases.
* Each ereport(ERROR) should have a reasonable errcode unless it is an internal logic error, and each error message should follow our guidelines (starting with a lower-case character, etc.).
* Changed create_extension_with_user_data to a static variable. CreateExtensionAddress and create_extension are still exported. We could have better names for them -- CurrentExtensionAddress and in_create_extension? Or, in_create_extension might be replaced with a CreateExtensionAddress.objectId != InvalidOid check.
* Added psql tab completion for CREATE/DROP/ALTER EXTENSION.
* Use palloc0() instead of palloc() and memset(0).
* Several code cleanups.

-- Itagaki Takahiro

extension-diff-on.v28a.patch Description: Binary data
Re: [HACKERS] REVIEW: WIP: plpgsql - foreach in
On Mon, Jan 24, 2011 at 13:05, Stephen Frost sfr...@snowman.net wrote:

FOR var in ARRAY array_expression ... I like that a lot more than inventing a new top-level keyword.

AFAIR, the syntax does not work well with an array literal:
FOR var IN ARRAY ARRAY[1,2,5] LOOP ...
And that was the only drawback compared with FOREACH var IN expr.

-- Itagaki Takahiro
Re: [HACKERS] Extensions support for pg_dump, patch v27
On Mon, Jan 24, 2011 at 18:22, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

Following the recent commit of the hstore Makefile cleaning, which I included into my extension patch too, I have a nice occasion to push version 27 of the patch. Please find it enclosed.

Hi, I read the patch and tested it to some degree. It works as expected generally, but still needs a bit more development in edge cases.

* commands/extension.h needs cleanup.
- Missing extern declarations for the create_extension and create_extension_with_user_data variables.
- ExtensionControlFile and ExtensionList can be hidden from the header.
- extension_objects_fctx is not used at all.

* There are many recordDependencyOn(myself, CreateExtensionAddress, ...) calls in some places, but I'm not sure whether we are forgetting something -- TRIGGERs?

* Will we still need uninstaller scripts? DROP EXTENSION relies on object dependencies to drop objects, but we have a few configurations that cannot be described by dependencies -- for example, rows inserted into pg_am for user AMs, or reverting default settings modified by the extension.

* Extensions installed in pg_catalog: if we install an extension into pg_catalog with CREATE EXTENSION xxx WITH SCHEMA pg_catalog, pg_dump dumps nothing about the extension. We might need special care for modules that install functions only in pg_catalog.

* I can install all extensions successfully, but there is a warning in hstore. The warning was not reported to the client, but the whole contents of hstore.sql.in were recorded in the server log.
WARNING: = is deprecated as an operator name
We set client_min_messages for this issue in hstore.sql.in, but I think we also need an additional SET log_min_messages TO ERROR around it.

* Does it support nested CREATE EXTENSION calls? It's a rare case, but we should have some error checks for it.

* It passes all regression tests for both the backend and contrib modules, but there are a couple of warnings during build and installation.

Three compiler warnings found. Please check:
pg_proc.c: In function 'ProcedureCreate':
pg_proc.c:110:8: warning: 'extensionOid' may be used uninitialized in this function
pg_shdepend.c: In function 'shdepReassignOwned':
pg_shdepend.c:1335:6: warning: implicit declaration of function 'AlterOperatorOwner_oid'
operatorcmds.c:369:1: warning: no previous prototype for 'AlterOperatorOwner_oid'

* Five warnings also found during make install in the contrib directory:
../../config/install-sh: ./uninstall_btree_gist.sql does not exist.
../../config/install-sh: ./uninstall_pageinspect.sql does not exist.
../../config/install-sh: ./uninstall_pgcrypto.sql does not exist.
../../config/install-sh: ./uninstall_pgrowlocks.sql does not exist.
../../config/install-sh: ./uninstall_pgstattuple.sql does not exist.

* You use const == expr in some places, e.g. if (InvalidOid != ...), but we seem to prefer expr == const.

* [off topic] I found some installer scripts are inconsistent. They use CREATE FUNCTION for some of the functions, but CREATE OR REPLACE FUNCTION for others. We'd better write documentation about how to write installer scripts, because CREATE EXTENSION makes some implicit assumptions about them -- for example, don't use transactions, etc.

-- Itagaki Takahiro
Re: [HACKERS] Extensions support for pg_dump, patch v27
On Wed, Jan 26, 2011 at 02:57, David E. Wheeler da...@kineticode.com wrote:

Other than adminpack, I think it makes sense to say that extensions are not going into pg_catalog… Given this, we should maybe see about either making adminpack part of PostgreSQL's core distribution (probably a good idea) or moving it out of pg_catalog so we don't have an exception to the rule. +1

I doubt that can solve the real problem. It is my understanding that we install them in pg_catalog because they are designed to be installed in template1 for pgAdmin, right? We have a problem in pg_dump when we install extension modules in template1. If we create a database, installed functions are copied from template1; however, they are also dumped by pg_dump unless they are in pg_catalog. So, we encounter "function already exists" errors during pg_restore. Since pg_dump won't dump user objects in pg_catalog, adminpack can avoid the above errors by installing its functions in pg_catalog.

CREATE EXTENSION might have the same issue -- can EXTENSION work without errors when we install extensions in template databases? To avoid errors, pg_dump might need to dump extensions as CREATE OR REPLACE EXTENSION or CREATE EXTENSION IF NOT EXISTS rather than CREATE EXTENSION.

-- Itagaki Takahiro
Re: [HACKERS] REVIEW: WIP: plpgsql - foreach in
On Mon, Jan 24, 2011 at 20:10, Pavel Stehule pavel.steh...@gmail.com wrote:

we have to iterate over array's items because it allows sequential access to array's data. I need a global index for the function array_get_isnull. I can't use built-in functions like array_slice_size or array_get_slice, because there is high overhead in the array_seek function. I redesigned the initial part of the main cycle; the code is a little bit longer :(, but I hope it is more readable.

The attached patch only includes changes in plpgsql. I tested it combined with the previous one, which includes other directories.

* src/pl/plpgsql/src/gram.y | for_variable K_SLICE ICONST
The parameter for SLICE must be an integer literal. Is that by design? I got a syntax error when I wrote a function like the one below. However, FOREACH returns different types for SLICE = 0 and SLICE >= 1 (the element type for 0, or the array type for >= 1), so we cannot write a truly generic function that accepts all SLICE values even if the syntax supported an expression for the parameter.

CREATE FUNCTION foreach_test(int[], int) RETURNS SETOF int[] AS $$ DECLARE a integer[]; BEGIN FOREACH a SLICE $2 IN $1 LOOP -- syntax error RETURN NEXT a; END LOOP; END; $$ LANGUAGE plpgsql;

* doc/src/sgml/plpgsql.sgml
+ FOREACH replaceabletarget/replaceable optional SCALE
s/SCALE/SLICE/ ?

* Why do you use foreach_a for some identifiers? We use FOREACH only for arrays, so the _a suffix doesn't seem needed.

* accumArrayResult() for SLICE has a bit of overhead. If we wanted to optimize it, we could use memcpy() because slices are placed in contiguous memory. But I'm not sure it's worth it; I guess FOREACH will be used with SLICE = 0 in most cases.

-- Itagaki Takahiro
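To make the SLICE typing issue concrete, here is an illustrative model (nested Python lists standing in for PostgreSQL arrays, not plpgsql itself): SLICE 0 yields scalar elements, while SLICE n yields n-dimensional sub-arrays, which is why one loop variable cannot accommodate an arbitrary runtime SLICE value.

```python
def foreach_slice(arr, slice_dims):
    """Model what FOREACH ... SLICE assigns to the loop variable."""
    def depth(a):
        d = 0
        while isinstance(a, list):
            d += 1
            a = a[0]
        return d

    def flatten(a):
        for x in a:
            if isinstance(x, list):
                yield from flatten(x)
            else:
                yield x

    if slice_dims == 0:
        return list(flatten(arr))        # scalar elements
    items = [arr]
    while depth(items[0]) > slice_dims:  # peel outer dimensions
        items = [sub for item in items for sub in item]
    return items                         # sub-arrays of slice_dims dims
```

For a 2-D array, SLICE 1 yields the rows and SLICE 2 yields the whole array once -- a different result type per SLICE value.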
Re: [HACKERS] review: FDW API
On Sat, Jan 22, 2011 at 07:20, Tom Lane t...@sss.pgh.pa.us wrote:

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: * I wonder if CREATE FOREIGN DATA WRAPPER should automatically create the handler function, if it doesn't exist yet. Doing that would require the equivalent of pg_pltemplate for FDWs, no? I think we're a long way from wanting to do that. Also, it seems to me that add-on FDWs are likely to end up getting packaged as extensions.

The proposed file_fdw.sql actually creates a default FDW on installation, so I think the installation scripts can work as a template even if we don't have FDW template catalogs.

+ /* contrib/file_fdw/file_fdw.sql.in */
+ -- create wrapper with validator and handler
+ CREATE OR REPLACE FUNCTION file_fdw_validator (text[], oid)
+ CREATE OR REPLACE FUNCTION file_fdw_handler ()
+ CREATE FOREIGN DATA WRAPPER file_fdw
+ VALIDATOR file_fdw_validator HANDLER file_fdw_handler;

-- Itagaki Takahiro
Re: [HACKERS] Add support for logging the current role
On Wed, Jan 19, 2011 at 14:36, Stephen Frost sfr...@snowman.net wrote:

Alright, here's the latest on this patch. I've added a log_csv_fields GUC along with the associated magic to make it work (at least for me). Also added 'role_name' and '%U' options. Requires postmaster restart to change, didn't include any 'short-cut' field options, though I don't think it'd be hard to do if we can decide on it. Default remains the same as what was in 9.0.

Hi, I reviewed log_csv_options.patch. It roughly looks good, but I'd like to discuss the design on a few points.

Features
The patch adds a log_csv_fields GUC variable that allows customizing the columns in csvlog. The default setting matches what we wrote in 9.0 and earlier versions. It also adds role_name for log_csv_fields and %U for log_line_prefix to record role names -- the name set by SET ROLE. OTOH, user_name and %u show authorization user names.

Discussions
* How about csvlog_fields rather than log_csv_fields? Since we have variables with a syslog_ prefix, a csvlog_ prefix seems better.
* We use %what syntax to specify fields in log_line_prefix, but long field names in log_csv_fields. Do you have any better idea for sharing the names between both options? At least I want to share the short descriptions for them in postgresql.conf.
* log_csv_fields's GUC context is PGC_POSTMASTER. Is it by design? PGC_SIGHUP would be more consistent with log_line_prefix. However, the CSV format would then not be valid, because the column definitions could change in the middle of a file.
* "none" is not so useful for the initial role_name field. Should we use the user_name instead of an empty role_name?

Comments on the code
Some of the items are trivial, though.
* What objects do you want to allocate in TopMemoryContext in assign_log_csv_fields()? AFAICS, we don't have to keep rawstring and column_list in a long-term context. Or, if you need TopMemoryContext, those variables should be pfree'd at the end of the function.
* The appendStringInfoChar() calls in write_csvlog() seem to be wrong when the last field is not application_name.
* Docs need more cross-reference hyperlinks for "see also" items.
* Docs need some tags for itemized elements or pre-formatted code. They look itemized in the sgml files, but will be flattened in the compiled HTML files.
* The declaration of assign_log_csv_fields() at the top of elog.c needs extern.
* There is a duplicated declaration of build_default_csvlog_list().
* list_free() is NIL-safe. You don't have to check whether the list is NIL before calling the function.

-- Itagaki Takahiro
Re: [HACKERS] multiset patch review
On Mon, Jan 24, 2011 at 20:49, Robert Haas robertmh...@gmail.com wrote:

I notice that this is adding keywords and syntax support for what is basically a PostgreSQL extension (since we certainly can't possibly be following the SQL standards given that we're not implementing a new datatype). Is that really a good idea?

As I wrote here: http://archives.postgresql.org/pgsql-hackers/2011-01/msg00829.php I think we can follow the SQL standard incrementally because we have function overloading. One exception is the result type of the collect() aggregate function. It returns an array for now, but will return a multiset when we support a true multiset data type.

-- Itagaki Takahiro
Re: [HACKERS] Per-column collation, the finale
On Sat, Jan 15, 2011 at 06:41, Peter Eisentraut pete...@gmx.net wrote:

I've been going over this patch with a fine-tooth comb for the last two weeks, fixed some small bugs, added comments, made initdb a little friendlier, but no substantial changes.

What can I do to test the patch? No regression tests are included in it, and I have an almost empty pg_collation catalog with it:

=# SELECT * FROM pg_collation;
 collname | collnamespace | collencoding | collcollate | collctype
----------+---------------+--------------+-------------+-----------
 default  |            11 |            0 |             |
(1 row)

-- Itagaki Takahiro
Re: [HACKERS] Add ENCODING option to COPY
On Sat, Jan 15, 2011 at 02:25, Hitoshi Harada umi.tan...@gmail.com wrote:

The patch overrides client_encoding with the added ENCODING option, and restores it as soon as the copy is done.

We cannot do that, because error messages should be encoded in the original encoding even during COPY commands with an encoding option. Error messages could contain non-ASCII characters if lc_messages is set.

I see some complaints asking to use pg_do_encoding_conversion() instead of pg_client_to_server/server_to_client(), but the former will surely add slight overhead per line read. If we want to reduce the overhead, we should cache the conversion procedure in CopyState. How about adding something like "FmgrInfo file_to_server_conv" into it?

-- Itagaki Takahiro
[HACKERS] add __attribute__((noreturn)) to suppress a warning
I found the following warning with Fedora 14 / gcc 4.5.1:

pg_backup_archiver.c: In function '_discoverArchiveFormat':
pg_backup_archiver.c:1736:11: warning: 'fh' may be used uninitialized in this function

To suppress it, I'm thinking of adding noreturn to die_horribly(). Any objections? Another solution might be adding a dummy assignment after calls to die_horribly().

-- Itagaki Takahiro

noreturn.diff Description: Binary data
Re: [HACKERS] How to know killed by pg_terminate_backend
On Fri, Jan 21, 2011 at 13:56, Tatsuo Ishii is...@postgresql.org wrote:

Here is the patch to implement the feature. 1) pg_terminate_backend() sends the SIGUSR1 signal rather than SIGTERM to the target backend. 2) The infrastructure used for message passing is storage/ipc/procsignal.c. The new message type for ProcSignalReason is PROCSIG_TERMNINATE_BACKEND_INTERRUPT. 3) I assign a new error code 57P04 which is returned from a backend killed by pg_terminate_backend(). #define ERRCODE_TERMINATE_BACKEND MAKE_SQLSTATE('5','7', 'P','0','4') Does anyone have a better idea? Tom dislikes my patch but I don't know how to deal with it.

There was another design in the past discussion: the postmaster sets a flag in the shared memory area indicating it received SIGTERM, before forwarding the signal to backends. Is that enough for your purpose, and do we think it is a more robust way?

-- Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Tue, Jan 18, 2011 at 09:33, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> Review for CF:

Thank you for the review!

> Since it doesn't appear to be intended to change any user-visible behavior,
> I don't see any need for docs or changes to the regression tests.

There might be some user-visible changes in error messages, because I rearranged some of the error-checking code. But we can see the difference only if there are two or more errors in a COPY command. These should not be serious issues.

> So far everything I've done has been with asserts enabled, so I haven't
> tried to get serious benchmarks, but it seems fast. I will rebuild without
> asserts and do performance tests before I change the status on the CF page.
> I'm wondering if it would make more sense to do the benchmarking with just
> this patch or the full fdw patch set?

Both? I tested the performance on my desktop PC, but I cannot see any differences. It would be welcome if any of you could test on high-performance servers.

A comparison with file_fdw would be more interesting. If they have similar performance, we could replace COPY FROM with CREATE TABLE AS SELECT FROM foreign_table, which is more flexible.

--
Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Fri, Jan 21, 2011 at 22:12, Shigeru HANADA han...@metrosystems.co.jp wrote:
>> My concern is that the explainInfo interface is not ideal for the purpose
>> and therefore it will be an unstable interface. If we support nested plans
>> in FDWs, each FDW should receive a tree writer used internally in
>> explain.c; explainInfo, which is a plain-text form of the plan, is not
>> enough for complex FdwPlans. However, since we don't have any better
>> solution for now, we could have the variable for 9.1. It's much better
>> than nothing.
> When I was writing file_fdw, I hoped to use static functions in explain.c
> such as ExplainProperty() to handle complex information. Even for a single
> plan node, I think that filename and size (currently they are printed
> together as plain text) should be separated in the output of EXPLAIN,
> especially when the format is XML or JSON.

Just an idea -- we could return complex node trees with explainInfo if we use XML or JSON for the format. For example, pgsql_fdw could return the result from EXPLAIN (FORMAT json) without modification. It might be one of the reasons we should support JSON in the core :)

--
Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Wed, Jan 19, 2011 at 00:34, Shigeru HANADA han...@metrosystems.co.jp wrote:
> Attached patch requires FDW API patches and copy_export-20110114.patch.

Some minor comments:

* Can you pass slot->tts_values and slot->tts_isnull directly to NextCopyFrom()? It won't allocate the arrays; it just fills the array buffers.
* You can pass NULL for the 4th argument of NextCopyFrom():
  | Oid tupleoid; /* just for required parameter */
* file_fdw_validator still has code duplicated from BeginCopy, but I have no idea how to share the validation code in a clean way...
* Try strVal() instead of DefElem->val.str.
* FdwEPrivate seems too abbreviated to me. How about FileFdwPrivate?
* "private" is a bad identifier name because it's a C++ keyword. We should rename FdwExecutionState->private.

> In that message, you also pointed out that an FDW must generate explainInfo
> in every PlanRelScan call even if the planning is not for EXPLAIN. I'll try
> to defer generating explainInfo until EXPLAIN VERBOSE really uses it. It
> might need a new hook point in explain.c, though.

I complained about the overhead, but it won't be a problem for file_fdw and pgsql_fdw; file_fdw can easily generate the text, and pgsql_fdw needs to generate a SQL query anyway.

My concern is that the explainInfo interface is not ideal for the purpose and therefore it will be an unstable interface. If we support nested plans in FDWs, each FDW should receive a tree writer used internally in explain.c; explainInfo, which is a plain-text form of the plan, is not enough for complex FdwPlans. However, since we don't have any better solution for now, we could have the variable for 9.1. It's much better than nothing.

--
Itagaki Takahiro
Re: [HACKERS] JSON data type status?
On Fri, Jan 21, 2011 at 09:11, Bruce Momjian br...@momjian.us wrote:
> What happened to our work to add a JSON data type for PG 9.1?

Nothing will happen in 9.1. I assume we are in "competition" status:
http://archives.postgresql.org/pgsql-hackers/2010-11/msg00481.php

Also, if PGXN works well, we might not have to include JSON in the core; we can download any of the JSON implementations from the site after installing the core server. Of course, if we want to use JSON types in the core (EXPLAIN JSON output?), we have to include one of them.

--
Itagaki Takahiro
Re: [HACKERS] REVIEW: EXPLAIN and nfiltered
On Thu, Jan 20, 2011 at 12:16, Stephen Frost sfr...@snowman.net wrote:
> This patch looked good, in general, to me. I added a few documentation
> updates and a comment, but it's a very straight-forward patch as far as I
> can tell. Passes all regressions and my additional testing.

Looks good and useful to me, too.

We need to adjust a bit more documentation. The patch changes all EXPLAIN ANALYZE outputs. When I grep'ed the docs for "loops=", EXPLAIN ANALYZE is also used in perform.sgml and auto-explain.sgml in addition to explain.sgml.

It would be good to have documentation about the nfiltered parameter. The best place would be around the description of "loops" in the "Using EXPLAIN" page:
http://developer.postgresql.org/pgdocs/postgres/using-explain.html

--
Itagaki Takahiro
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
On Tue, Jan 18, 2011 at 05:39, Tom Lane t...@sss.pgh.pa.us wrote:
> I haven't looked at this patch, but it seems to me that it would be
> reasonable to conclude A != B if the va_extsize values in the toast
> pointers don't agree.

It's a very lightweight alternative to memcmp on the byte data, but there is still the same issue -- we might get different compressed results if we use a different algorithm for TOASTing.

So, it would be better to apply the present patch as-is. We can improve the comparison logic on top of the patch in another development cycle if possible.

--
Itagaki Takahiro
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
2011/1/17 KaiGai Kohei kai...@ak.jp.nec.com:
> Are you talking about an idea to apply the toast id as an alternative key?

Probably not. I'm just talking about whether "diff -q A.txt B.txt" and "diff -q A.gz B.gz" always return the same result or not. ... I found that it depends on the version of gzip. So, if we used such logic, we could not improve the toast compression logic later, because the data is migrated by pg_upgrade.

> I did a similar thing to represent security labels on user tables for
> row-level security in the v8.4/9.0-based implementation. Because a small
> number of security labels are shared by massive numbers of tuples, it is a
> waste of space to have all the text data toasted individually, not to
> mention the performance loss in toast/detoast.

That looks like the same issue as large objects rather than the discussion here. We have vacuumlo in contrib to solve that issue. So, we could have vacuumlo-like special sweeping logic for the security label table.

--
Itagaki Takahiro
Re: [HACKERS] pg_stat_replication security
On Mon, Jan 17, 2011 at 19:51, Magnus Hagander mag...@hagander.net wrote:
> Here's a patch that limits it to superuser only. We can't easily match it
> to the user of the session given the way the walsender data is returned -
> it doesn't contain the user information. But limiting it to superuser only
> seems perfectly reasonable and in line with the encouragement not to use
> the replication user for login. Objections?

It hides all fields in pg_stat_wal_senders(). Instead, can we just revoke usage of the function and view? Or do we have plans to add fields which normal users can see?

--
Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Sat, Jan 15, 2011 at 08:35, Shigeru HANADA han...@metrosystems.co.jp wrote:
> Interface of NextCopyFrom() is fixed to return HeapTuple, to support
> tableoid without any change to TupleTableSlot.
> 3) 20110114-copy_export_HeapTupe.patch
>    This patch fixes the interface of NextCopyFrom() to return results as
>    HeapTuple.

I think file_fdw can return tuples in virtual tuple form, and ForeignNext() can call ExecMaterializeSlot() to store the tableoid.

--
Itagaki Takahiro
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
On Mon, Jan 17, 2011 at 04:05, Andy Colson a...@squeakycode.net wrote:
> This is a review of:
> https://commitfest.postgresql.org/action/patch_view?id=468
> Purpose: Equal and not-equal _may_ be quickly determined if their lengths
> are different. This _may_ be a huge speed up if we don't have to detoast.

With the patch we can skip detoasting when comparing the lengths of two text/bytea values, but we still need to detoast to compare the contents of the values. If we always generated the same toasted byte sequences from the same raw values, we would not need to detoast at all to compare the contents. Is that possible or not?

--
Itagaki Takahiro
Re: [HACKERS] Transaction-scope advisory locks
On Sun, Jan 16, 2011 at 06:20, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote:
> Here's the latest version of the patch. It now uses the API proposed by
> Simon, but still lacks documentation changes, which I'm going to send
> tomorrow.

Here is a short review for "Transaction scoped advisory locks":
https://commitfest.postgresql.org/action/patch_view?id=518

== Features ==
The patch adds pg_[try_]advisory_xact_lock[_shared] functions. The function names follow the past discussion -- it's better than a bool isXact argument or changing the existing behavior.

== Coding ==
The patch itself is well-formed and applies cleanly. I expect documentation will come soon. There is no regression test, but we have no regression tests for advisory locks even now. Tests for lock conflicts might be difficult, but we could have a single-threaded test for lock/unlock and the pg_locks view.

== Questions ==
I have a question about unlocking transaction-scope advisory locks. We cannot unlock them with pg_advisory_unlock(), but we can unlock them with pg_advisory_unlock_all(). That is inconsistent behavior. Furthermore, I wonder whether we should allow unlocking transaction-scope locks at all -- we have LOCK TABLE but don't have UNLOCK TABLE.

postgres=# BEGIN;
BEGIN
postgres=# SELECT pg_advisory_xact_lock(1);
 pg_advisory_xact_lock
-----------------------

(1 row)

postgres=# SELECT pg_advisory_unlock(1);
WARNING:  you don't own a lock of type ExclusiveLock
 pg_advisory_unlock
--------------------
 f
(1 row)

postgres=# SELECT pg_advisory_unlock_all();
 pg_advisory_unlock_all
------------------------

(1 row)

postgres=# ROLLBACK;
ROLLBACK

--
Itagaki Takahiro
Re: [HACKERS] texteq/byteaeq: avoid detoast [REVIEW]
On Mon, Jan 17, 2011 at 16:13, Pavel Stehule pavel.steh...@gmail.com wrote:
>>> If we always generate the same toasted byte sequences from the same raw
>>> values, we don't need to detoast at all to compare the contents. Is it
>>> possible or not?
>> For bytea, it seems it would be possible. For text, I think locales may
>> make that impossible. Aren't there locale rules where two different
>> characters can behave the same when comparing them? I know in Swedish at
>> least w and v behave the same when sorting (but not when comparing) in
>> some variants of the locale.
> Some string comparison operations are binary now too. But it is a question
> what will change with collation support.

Right. We are using memcmp() in texteq and textne now; we consider collations only in the ordering operators and the compare support functions. So, I think there is no regression here as long as raw values and toasted byte sequences have a one-to-one correspondence.

--
Itagaki Takahiro
Re: [HACKERS] textarray option for file FDW
On Sun, Jan 16, 2011 at 02:29, Andrew Dunstan and...@dunslane.net wrote:
> textarray. This option would require that the foreign table have exactly
> one field, of type text[], and would compose all the field strings read
> from the file for each record into the array (however many there are).
>
> CREATE FOREIGN TABLE arr_text (
>     t text[]
> ) SERVER file_server
> OPTIONS (format 'csv', filename '/path/to/ragged.csv', textarray 'true');

Foreign tables have stable definitions, so there are some issues that don't exist in the COPY command:
- when the type of the only column is not text[]
- when the column list is changed by ALTER FOREIGN TABLE ADD/DROP COLUMN

BTW, those options could be specified as column options rather than table options. But we don't have a column option syntax in the current proposal.

CREATE FOREIGN TABLE arr_text (
    i integer OPTION (column 1),  -- column position in the csv file
    t text[]  OPTION (column 'all the rest'),
    d date    OPTION (column 2)
) SERVER file_server
OPTIONS (format 'csv', filename '/path/to/ragged.csv');

--
Itagaki Takahiro
Re: [HACKERS] SQL/MED - file_fdw
On Fri, Jan 14, 2011 at 14:20, Shigeru HANADA han...@metrosystems.co.jp wrote:
> After copying the statistics of the pgbench_xxx tables into the csv_xxx
> tables, the planner generates the same plans as for local tables, but the
> costs of ForeignScan nodes are a little lower than those of SeqScan nodes.
> Forced Nested Loop uses a Materialize node as expected.

Interesting. It means we need per-column statistics for foreign tables in addition to cost values.

> ISTM that a new interface called from ANALYZE would help to update
> statistics of foreign tables. If we could leave the sampling algorithm to
> FDWs, acquire_sample_rows() might fit that purpose.

We will discuss how to collect statistics from foreign tables in the next development cycle. I think we have two choices here:

#1. Retrieve sample rows from remote foreign tables and store stats in the local pg_statistic.
#2. Use remote statistics for each foreign table directly.

acquire_sample_rows() would be a method for #1. An approach for #2 is to use the remote statistics directly: we already provide hooks to generate virtual statistics with get_relation_stats_hook() and friends, and we could treat statistics for foreign tables in a similar way.

file_fdw likes #1 because there is no external storage to keep statistics for CSV files, but pgsql_fdw might prefer #2 because the remote server already has stats for the underlying table.

--
Itagaki Takahiro
Re: [HACKERS] multiset patch review
On Wed, Jan 12, 2011 at 15:21, Peter Eisentraut pete...@gmx.net wrote:
> You may want to read this thread about the cardinality function you are
> trying to add:
> http://archives.postgresql.org/pgsql-hackers/2009-02/msg01388.php

Since our archive is split per month, this might be more readable:
http://postgresql.1045698.n5.nabble.com/cardinality-td2003172.html

We've discussed what number cardinality() should return:
#1. The total number of elements. (This is what is currently implemented.)
#2. The length of the first dimension, i.e. the same as array_length(array, 1).

I prefer #1 because we have no easy way to retrieve that number otherwise; array_dims() returns similar numbers, but calculating the total number from them is a bit complex. If we ever support arrays of arrays (jagged arrays), cardinality() can return the number of elements in the outermost array. It's a similar definition to multi-dimensional arrays in C#, which has both arrays of arrays and multi-dimensional arrays:
http://msdn.microsoft.com/library/system.array.length(v=VS.100).aspx

We can compare those SQL functions and C# array methods:
* cardinality(array) -- array.Length
* array_length(array, dim) -- array.GetLength(dim)

> Also, what happened to the idea of a separate MULTISET type?

I don't have any plans to implement a dedicated MULTISET type for now, because almost all of the functions and operators can also work on arrays. If we had a true MULTISET data type, we could overload them with MULTISET arguments. One exception might be the collect() aggregate function, because we might need to change the result type from array to multiset:
* collect(anyelement) = anyarray for now
Note that fusion() won't be an issue because we can overload it:
* fusion(anyarray) = anyarray and fusion(anymultiset) = anymultiset

--
Itagaki Takahiro