Re: [HACKERS] Encoding issues in console and eventlog on win32
On Tue, 2009-09-15 at 12:49 +0900, Itagaki Takahiro wrote: > Here is an updated version of the patch. This is a review of the Eventlog encoding on Windows patch: http://archives.postgresql.org/message-id/20090915123243.9c59.52131...@oss.ntt.co.jp Purpose & Format This patch is designed to coerce log messages to a specific encoding. It's currently only targeted at the win32 port, where the logs are written in UTF-16. The patch applies cleanly. It doesn't include any documentation updates or additional regression tests. A comment in the documentation that logs on Windows will go through an encoding conversion if appropriate might be nice, though. Initial Run === To (hopefully) properly test I initdb'd a couple directories under different locales. I then ran a few statements designed to generate event log messages showing characters in a different encoding: SELECT E'\xF0'::int; The unpatched backend generated event log message showing only the byte value interpreted as the same character each time in the system default encoding. With the patch in place the event log message showed the character correctly for each of the different encodings. I haven't tried any performance testing against it. Concurrent Development Issues = On a hunch, tried applying the "syslogger infrastructure changes" at the same time. They conflict on elog.c. Not sure if we're supposed to check for that, but thought I'd point it out. :) Editorial = The problem seems to stem from PG and Windows each having a few encodings the other won't understand, or at least don't immediately support. So log messages back to the system from its perspective contain incorrect or broken characters. I'm not sure this is as much of a problem on other platforms, though, where the database encoding typically doesn't have any trouble matching the system's; would it be worth pursuing beyond the win32 port? I'm not too familiar with alternate character sets... I would assume if there's a code page supported on win32 it'll naturally support conversion to UTF-16 on the platform, but is there any time this could fail? What about the few encodings that it doesn't directly support, which need a conversion to UTF-8 first? Maybe someone with more familiarity with encoding conversion issues could comment on that? Otherwise I think this is ready to be bumped up for committer review. - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Easier to translate psql help Instead of requiring translators
On Fri, 2009-09-18 at 09:46 -0400, Tom Lane wrote: > pet...@postgresql.org (Peter Eisentraut) writes: > > Log Message: > > --- > > Easier to translate psql help > > Looks like this broke the msvc build ... Yep, one of create_help.pl's arguments changed to not include the file extension, but msvc was still trying to give it "sql_help.h". Quickpatch attached. Wondered why my attempts to test and review another patch were suddenly confounded... > regards, tom lane - Josh Williams Index: src/tools/msvc/Solution.pm === RCS file: /projects/cvsroot/pgsql/src/tools/msvc/Solution.pm,v retrieving revision 1.47 diff -c -r1.47 Solution.pm *** src/tools/msvc/Solution.pm 6 Jan 2009 18:37:50 - 1.47 --- src/tools/msvc/Solution.pm 19 Sep 2009 03:41:40 - *** *** 235,241 { print "Generating sql_help.h...\n"; chdir('src\bin\psql'); ! system("perl create_help.pl ../../../doc/src/sgml/ref sql_help.h"); chdir('..\..\..'); } --- 235,241 { print "Generating sql_help.h...\n"; chdir('src\bin\psql'); ! system("perl create_help.pl ../../../doc/src/sgml/ref sql_help"); chdir('..\..\..'); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Elementary dependency look-up
On Sun, 2009-09-13 at 21:20 -0400, Robert Haas wrote: > I'm not sure there's any point in reviewing this patch in its present > form. Barring objections (or a new version), I think we should mark > this Returned with Feedback. > > ...Robert Yeah, sounds reasonable. The new version probably won't look at all like the current one, so no need to waste reviewer cycles on it. I'll work on a revised version; feel free to mark it as such in the mean time. Thanks, - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Elementary dependency look-up
On Wed, 2009-09-09 at 11:30 -0500, decibel wrote: > On Sep 9, 2009, at 8:05 AM, Peter Eisentraut wrote: > > How is this better than just reading the information directly from > > pg_depend? > > pg_depend is very difficult to use. You have to really, really know > the catalogs to be able to figure it out. Part of the problem is > (afaik) there's nothing that documents every kind of record/ > dependency you might find in there. Exactly - these functions were designed around making that easier for the end user. The less poking around in system catalogs a user has to do the better. Yeah, the documentation about what can be found in pg_depend is scattered at best, though then again there doesn't seem to be a whole lot in there that's of much interest to end users... Actually, apart from pg_get_serial_sequence() do we have anything else that utilizes dependency data to show the user information? > What might be more useful is a view that takes the guesswork out of > using pg_depend. Namely, convert (ref)classid into a catalog table > name (or better yet, what type of object it is), (ref)objid into an > actual object name, and (ref)objsubid into a real name. Makes sense, would be much more future-proof. It shouldn't be difficult to put in some intelligence to figure out the type of object, such as looking at relkind if (ref)classid = pg_class. It might be a little difficult to maintain, depending on what else finds its way into the system catalogs later (but then, probably not much more so than INFORMATION SCHEMA is.) Would that be preferable, over a couple additional functions? - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Elementary dependency look-up
Attached is a patch to add a couple basic dependency look-up capability functions. They're based off the pg_get_serial_sequence function, and are kind of the inverse of that function in some respects. The patch adds two new functions to the backend, pg_get_owner_object and pg_get_owner_column. These look up the requested object in the pg_depend table, looking for an 'a' type dependency to another relation, and resolve either the relation or column names to text. postgres=# SELECT pg_get_owner_object('tbl_id_seq') AS obj, postgres-# pg_get_owner_column('tbl_id_seq') AS col, postgres-# pg_get_serial_sequence( postgres(# pg_get_owner_object('tbl_id_seq'), postgres(# pg_get_owner_column('tbl_id_seq') postgres(# ) AS full_circle; obj | col |full_circle +-+--- public.tbl | id | public.tbl_id_seq (1 row) I tried not to be too myopic in the design, but apart from sequence ownership I can't really think of any other uses for this. 'p'in and 'i'nternal relationships wouldn't make much sense, and 'n'ormal ones are generally exposed in other ways. Anyone have any input there on how this could be expanded? Anyway, as an immediate practical example the patch modifies psql's describe-verbose on sequences to show the ownership information... postgres=# \d+ tbl_id_seq (...) Owner: public.tbl.id - Josh Williams Index: doc/src/sgml/func.sgml === RCS file: /projects/cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.487 diff -c -r1.487 func.sgml *** doc/src/sgml/func.sgml 16 Aug 2009 19:55:21 - 1.487 --- doc/src/sgml/func.sgml 2 Sep 2009 23:11:15 - *** *** 12264,12269 --- 12264,12277 + pg_get_owner_object + + + + pg_get_owner_column + + + pg_tablespace_databases *** *** 12365,12370 --- 12373,12388 uses +pg_get_owner_object(relation_oid) +text +get name of the relation that owns the specified object, such as a sequence + + +pg_get_owner_column(relation_oid) +text +get column name associated with the specified object in its owning relation + + pg_get_triggerdef(trigger_oid) text get CREATE [ CONSTRAINT ] TRIGGER command for trigger *** *** 12478,12483 --- 12496,12513 +pg_get_owner_object returns the name of the relation +that owns the specified relation object, or NULL if the object isn't owned +by a relation. The input parameter can be passed as an OID or possibly a +double-quoted identifier. This can be treated in some respects as the +inverse of pg_get_serial_sequence, where the association +can be modified or removed with ALTER SEQUENCE OWNED BY. +pg_get_owner_column returns the name of the column +associated with an owned object, such as the name of a sequence's +original serial column. + + + pg_get_userbyid extracts a role's name given its OID. Index: src/backend/utils/adt/ruleutils.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/ruleutils.c,v retrieving revision 1.306 diff -c -r1.306 ruleutils.c *** src/backend/utils/adt/ruleutils.c 1 Aug 2009 19:59:41 - 1.306 --- src/backend/utils/adt/ruleutils.c 2 Sep 2009 23:11:19 - *** *** 1446,1451 --- 1446,1601 /* + * pg_get_owner_object + * Returns the name of the object that owns the specified object + * by looking up an "auto" dependency relationship. + * Useful for finding a sequence's parent table. + * See pg_get_owner_column for the originating serial column. + */ + Datum + pg_get_owner_object(PG_FUNCTION_ARGS) + { + Oid relId = PG_GETARG_OID(0); + Oid ownerId = InvalidOid; + RelationdepRel; + ScanKeyData key[3]; + SysScanDesc depScan; + HeapTuple tup; + + /* Find the requested object in the dependency table... */ + depRel = heap_open(DependRelationId, AccessShareLock); + + ScanKeyInit(&key[0], + Anum_pg_depend_classid, + BTEqualStrategyNumber, F_OIDEQ, + ObjectIdGetDatum(RelationRelationId)); + ScanKeyInit(&key[1], + Anum_pg_depend_objid, + BTEqualStrategyNumber, F_OIDEQ, + ObjectIdGetDatum(relId)); + ScanKeyInit(&key[2], + Anum_pg_depend_objsubid, + BTEqualStrategyNumber, F_INT4EQ, + Int32GetDatum(0)); + + depScan = systable_beginscan(depRel, DependDependerIndexId, true, SnapshotNow, 3, key); + + while (HeapTupleIsValid(tup = systable_getnext(depScan))) + { + Form_
Re: [HACKERS] Review: Patch for contains/overlap of polygons
On Sun, 2009-08-09 at 13:27 -0600, Joshua Tolley wrote: > On Sun, Aug 09, 2009 at 02:29:44PM -0400, Robert Haas wrote: > > On Sun, Aug 9, 2009 at 2:20 PM, Bruce Momjian wrote: > > > This is a nice section layout for a patch review report --- should we > > > provide an email template like this one for reviewers to use? > > > > We could, but it might be over-engineering. Those particular section > > headers might not be applicable to someone else's review. > > I've just added a link to this email to the "Reviewing a Patch" wiki page > (http://wiki.postgresql.org/wiki/Reviewing_a_Patch). Do with it as you see fit > :) Sweet. :) Actually that was mainly for keeping organized and sane when conducting my first review, and it seemed to translate well into the email when it came time to write it up. The appropriate sections* most certainly would change patch-to-patch -- reviewer-to-reviewer, even -- so a set template wouldn't be appropriate. But as a style recommendation it could make sense. I'd made a mental note to try and refine the formatting next time around, but I haven't been back to request another yet. On that note, and now that I'm back online and clean of Pennsic dust, anything else in this CommitFest in need of a last minute Windows run-through? - Josh Williams * I could envision having the ability to write reviews directly into the commitfest web app, where one could define and tag sections. Then anyone curious about a patch's performance implications, for example, could pull down and read just the performance results of potentially multiple reviewers. How's that for over-engineering? ;) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-threaded pgbench
On Tue, 2009-07-28 at 23:38 -0400, Josh Williams wrote: > Huh, running the patched version on a single thread with 128 clients > just got it to crash. Actually consistently, three times now. Will try > the same thing on the development box tomorrow morning to get some > better debugging information. So yeah, buffer overrun. In pgbench.c FD_SETSIZE is redefined to get around the Windows default of 64. But this is done after bringing in winsock2.h (a couple levels in as a result of first including postgres_fe.h). So any fd_set is built with an array of 64 descriptors, while pgbench thinks it has 1024 available to work with. This was introduced a while back; the multi-threaded patch just makes it visible by giving it an important pointer to write over. Previously it would just run over into the loop counter (and probably a couple other things) and thus it'd continue on happily with the [sub]set it has. In either case this seems to be a simple fix, to move that #define earlier (see pgbench_win32.patch.) - Josh Williams diff -c -r1.87 pgbench.c *** contrib/pgbench/pgbench.c 11 Jun 2009 14:48:51 - 1.87 --- contrib/pgbench/pgbench.c 29 Jul 2009 21:18:18 - *** *** 26,31 --- 26,36 * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. * */ + + #ifdef WIN32 + #define FD_SETSIZE 1024 /* set before winsock2.h is included */ + #endif /* ! WIN32 */ + #include "postgres_fe.h" #include "libpq-fe.h" *** *** 34,41 #include #ifdef WIN32 - #undef FD_SETSIZE - #define FD_SETSIZE 1024 #include #else #include --- 39,44 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-threaded pgbench
On Tue, 2009-07-28 at 12:10 -0400, Greg Smith wrote: > If your test system > is still setup, it might be interesting to try the 64 and 128 client cases > with Task Manager open, to see what percentage of the CPU the pgbench > driver program is using. If the pgbench client isn't already pegged at a > full CPU, I wouldn't necessarily threading it to help--it would just add > overhead that doesn't buy you anything, which seems to be what you're > measuring. That's a really good point, I do recall seeing pgbench taking only a fraction of the CPU... Running it again, it hovers around 6 or 7 percent in both cases, so it's only using up around half a core. Huh, running the patched version on a single thread with 128 clients just got it to crash. Actually consistently, three times now. Will try the same thing on the development box tomorrow morning to get some better debugging information. > All the Linux tests suggest that limit tends up show up at over 20,000 TPS > nowawadys, so maybe your Window system is bottlenecking somewhere > completely different before it reaches saturation on the client. I figured it was just indicating a limitation of the environment, where Windows has some kind of inefficiency either in the PG port or just something inherent in how the OS works. It does make me wonder where exactly all that CPU time is going, though. OProfile, how I miss thee. But that's a different discussion entirely. - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multi-threaded pgbench
On Wed, 2009-07-22 at 22:23 -0400, Greg Smith wrote: > Onto performance. My test system has a 16 cores of Xeon X5550 @ > 2.67GHz. > I created a little pgbench database (-s 10) and used the default > postgresql.conf parameters for everything but max_connections for a > rough > initial test. To test on Windows, I set up a similar database on an 8-core 2.0GHz E5335 (closest match I have.) It's compiled against a fresh CVS pull from this morning, patched with the "20090724" updated version. I tried to mirror the tests as much as possible, including the concurrent thread counts despite having half the number of available cores. Doing that didn't have much impact on the results, but more on that later. Comparing the unpatched version to the new version running a single client thread, there's no significant performance difference: C:\pgsql85>bin\pgbenchorig.exe -S -c 8 -t 10 pgbench ... tps = 19061.234215 (including connections establishing) C:\pgsql85>bin\pgbench.exe -S -c 8 -t 10 pgbench tps = 18852.928562 (including connections establishing) As a basis of comparison the original pgbench was run with increasing client counts, which shows the same drop off in throughput past the 16-client sweet spot: con tps 8 18871 16 19161 24 18804 32 18670 64 17598 128 16664 However I was surprised to see these results for the patched version, running 16 worker threads (apart from the 8 client run of course.) C:\pgsql85>bin\pgbench.exe -S -j 16 -c 128 -t 10 pgbench ... con tps 8 18435 (-j 8) 16 18866 24 - 32 17937 64 17016 128 15930 In all cases the patched version resulted in a lower performing output than the unpatched version. It's clearly working, at least in that it's launching the requested number of worker threads when looking at the process. Adjusting the worker thread count down to match the number of cores yielded identical results in the couple of test cases I ran. Maybe pgbench itself is less of a bottleneck in this environment, relatively speaking? - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Review: Patch for contains/overlap of polygons
Teodor, et al, This is a review of the Polygons patch: http://archives.postgresql.org/message-id/4a5761a2.8070...@sigaev.ru There hasn't been any discussion, at least that I've been able to find. Contents & Purpose == This small patch primarily fixes a couple polygon functions, poly_overlap (the && operator) and poly_contain (@>). Previously the functions only performed simple bounding box calculations or checks based on sets of points. That works only for the simplest of cases; this patch accounts for more complex shapes. Included in the patch are new regression test cases, but no changes to documentation. The patch only corrects the behavior of existing functions, though, so perhaps no changes to the documentation are expected. Initial Run === The patch applies cleanly to HEAD. The regression tests all pass successfully against the new patch, but fail against pre-patched HEAD, so the test cases are sane and do cover the new behavior. As far as I can see the math behind the new calculations seems sound. Performance === Despite the functions in question containing an order of magnitude more code the operators performed faster than the previous versions in my test run. Though I have a feeling that may have more to do with this laptop's processor speed and/or the rather trivial test cases being thrown at it. In any case having the operators work correctly should far outweigh the negligible performance impact. Nitpicking & Conclusion === The patch splits out and adds a couple helper functions next to the existing ones in geo_ops.c, but would those be better defined down in the Private routines section? There's a #define in the middle of the touched_lseg_inside_poly() function. The macro is only used in that function and a couple of times at that, but it still feels somewhat out of place. Perhaps that'd be better placed with other #define's near the top? I could certainly be wrong in both cases. :) There's also two "int i"s declared in poly_contain(). Otherwise it seems to do exactly what it promises. I could see the correct behavior of these operators being important for GIS applications. +1 for committer review. - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Quick patch: Display sequence owner
On Tue, 2008-12-09 at 09:32 -0500, Tom Lane wrote: > I think the place that such information could most naturally be squeezed > into psql's \d commands would be to add another type of footer > information to \dt, eg > > Table "foo.bar" > ... > Indexes: > "bari" ... > Owned sequences: > "baz" owned by col1 That makes more sense, though isn't that a little repetitive when "default nextval(...)" is visible immediately above it? Doesn't guarantee the sequence is owned by the table of course, but I'd imagine to most people it'd just be noise. Could see it being shown in the verbose version, \d+ foo.bar. I certainly like that better than "making up" an nonexistent column. :) > If you really want to attach the information to the \d output for the > sequence instead of the table, consider a similar footer-style display > instead of making it look like something it's not. For the sequences themselves, it'd be nice to show somewhere, at least for tracking down stray sequences and identifying relationships. Perhaps a function to do the reverse of pg_get_serial_sequence()? Or better yet if no one else is already working on it, a more generic way to get readable information out of pg_depend? > > regards, tom lane > - Josh Williams -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Quick patch: Display sequence owner
Hi folks, Was recently poked and reminded that this patch may be of interest to the community. It was mostly done as an academic exercise, just to see how it works, and so it has a rather hackish feel. The patch adds the sequence owner, if available, to psql's \d output, as suggested in a recent thread: http://archives.postgresql.org/pgsql-general/2008-11/msg01300.php The patch adds a query against pg_depend, then fakes an extra column "owned_by" in the output: # \d tablename_columnname_seq Sequence "public.tablename_columnname_seq" Column | Type | Value ---+--+-- sequence_name | name | tablename_columnname_seq last_value| bigint | 1 start_value | bigint | 1 increment_by | bigint | 1 max_value | bigint | 9223372036854775807 min_value | bigint | 1 cache_value | bigint | 1 log_cnt | bigint | 1 is_cycled | boolean | f is_called | boolean | f owned_by | regclass | tablename Now for the snags and additional thoughts: The query against pg_depend looks for relations for which the sequence is auto-dependent. It wouldn't make any sense, but is it at all possible for a sequence to auto-depend on something else? An earlier version of the patch pulled the owning table and schema names directly, rather than casting to regclass, so the schema name was always shown. Would this be preferable, in case there's some ambiguity in similarly named tables between schemas? I'd pondered briefly whether there should be a real attribute to represent the sequence owner, just for display purposes. But I'm assuming that would present a big concurrency issue, as other transactions would see the change on the sequence immediately while pg_depend wouldn't be seen to change until committed. That, and ROLLBACK wouldn't work at all... The column info query is getting messy. Could probably clean that up a bit if anyone thinks it'd be worth it? - Josh Williams Index: src/bin/psql/describe.c === RCS file: /projects/cvsroot/pgsql/src/bin/psql/describe.c,v retrieving revision 1.188 diff -r1.188 describe.c 917c917 < seq_values = pg_malloc_zero((SEQ_NUM_COLS+1) * sizeof(*seq_values)); --- > seq_values = pg_malloc_zero((SEQ_NUM_COLS+2) * sizeof(*seq_values)); 922a923,939 > > printfPQExpBuffer(&buf, > "SELECT d.refobjid::regclass\n" > "FROM pg_catalog.pg_depend d\n" > "WHERE d.deptype = 'a' AND d.objid = '%s'", > oid); > > result = PSQLexec(buf.data, false); > if (!result) > goto error_return; > > if (PQntuples(result)) > seq_values[10] = pg_strdup(PQgetvalue(result, 0, 0)); > else > seq_values[10] = ""; > > PQclear(result); 940c957,966 < appendPQExpBuffer(&buf, "\nORDER BY a.attnum"); --- > /* For sequences we'll 'fake' an additional column to show the owning relation */ > if (tableinfo.relkind == 'S') > { > appendPQExpBuffer(&buf, "\nUNION SELECT 'owned_by', 'regclass', NULL, true, 11"); > if (verbose) > appendPQExpBuffer(&buf, ", 'p', 'Owning relation'"); > appendPQExpBuffer(&buf, "\nORDER BY attnum"); > } > else > appendPQExpBuffer(&buf, "\nORDER BY a.attnum"); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers