Re: [HACKERS] Autonomous transaction
It would be useful to have a relation such that all dirtied buffers got written out even for failed transactions (barring a crash), and such that read-any-undeleted were easy to do, despite the non-ACIDity. The overhead of a side transaction seems overkill for such things as logs or advisory relations, and non-DB files would be harder to tie efficiently to DB activity. A side transaction would still have to be committed in order to be useful: either you're committing frequently (ouch!), or you risk failing to commit just as you would the main transaction.

David Hudson

-----Original Message-----
From: Loïc Vaumerel [mailto:she...@gmail.com]
Sent: Sunday, April 4, 2010 10:26 AM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Autonomous transaction

Hi,

I have an application project based on a database, and I am really interested in using PostgreSQL. I have only one issue: I want to use autonomous transactions to implement debug/logging functionality. To do so, I insert messages into a debug table. The problem is that if the main transaction or process rolls back, my debug-message insert will be rolled back too. This is not the behavior I want; I need functionality with the same behavior as Oracle's PRAGMA AUTONOMOUS_TRANSACTION.

I have searched for it in the documentation and on the net, unfortunately without success (maybe I missed something). I did find some posts on the subject:

http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php
https://labs.omniti.com/trac/pgtreats/browser/trunk/autonomous_logging_tool

... and some others. All the solutions I found work the same way: they use dblink. I consider these solutions more of a workaround than a clean answer, and I am a little concerned about side effects, since dblink was not initially designed for this.

So my questions: Is there a way to use real, clean autonomous transactions in PostgreSQL yet? If not, is it planned, and when?

Thanks in advance.

Best regards,
Shefla
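The dblink workarounds mentioned above all boil down to one idea: write the log rows over a second, independent connection, so the main transaction's rollback cannot touch them. A minimal sketch of that pattern, using SQLite only because it is self-contained (table and message names are illustrative; note that SQLite allows only one writer at a time, so the log commit is ordered before the main session's write, an ordering real PostgreSQL sessions would not need):

```python
import os
import sqlite3
import tempfile

# Two independent connections stand in for the two PostgreSQL sessions
# that a dblink-style autonomous-logging setup uses.
db = os.path.join(tempfile.mkdtemp(), "demo.db")
main = sqlite3.connect(db)   # the "main transaction" session
log = sqlite3.connect(db)    # the independent "autonomous" logging session

main.execute("CREATE TABLE data (v TEXT)")
main.execute("CREATE TABLE debug_log (msg TEXT)")
main.commit()

# The log commit is independent of the main transaction's fate.
log.execute("INSERT INTO debug_log VALUES ('reached step 1')")
log.commit()

main.execute("INSERT INTO data VALUES ('work in progress')")
main.rollback()   # the main work is discarded...

# ...but the log entry persists.
assert main.execute("SELECT count(*) FROM data").fetchone()[0] == 0
assert main.execute("SELECT count(*) FROM debug_log").fetchone()[0] == 1
```

The cost, as noted above, is a commit per log write on the side connection, which is exactly the "committing frequently (ouch!)" trade-off.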
Re: [HACKERS] Bloom filters
Then your union operation is just a bitwise OR of the two bloom filters. Keep in mind that when performing this sort of union between two comparably-sized sets, your false-positive rate will increase by about an order of magnitude. You need to size your bloom filters accordingly, or perform the union differently. Intersections, however, behave well. There is a similar problem, among others, with expanding smaller filters to match larger ones.

David Hudson
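A minimal sketch of the union-by-OR operation described above (filter size, hash count, and function names are all illustrative; a filter is just an integer bitmask here):

```python
import hashlib

M = 256   # bits in each filter
K = 3     # hash functions per item

def positions(item: str) -> list[int]:
    # Derive K bit positions from one digest (illustrative, not tuned).
    d = hashlib.sha256(item.encode()).digest()
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % M for i in range(K)]

def add(bits: int, item: str) -> int:
    for p in positions(item):
        bits |= 1 << p
    return bits

def might_contain(bits: int, item: str) -> bool:
    return all(bits >> p & 1 for p in positions(item))

a = b = 0
for x in ["red", "green"]:
    a = add(a, x)
b = add(b, "blue")

union = a | b   # the union is just the bitwise OR of the two filters
assert all(might_contain(union, x) for x in ["red", "green", "blue"])
```

The caveat above follows directly: the OR'd filter has roughly the combined bit density of both inputs, so it behaves like a filter sized for one set but loaded with two, hence the worse false-positive rate.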
Re: [HACKERS] Syntax for partitioning
PARTITION BY RANGE ( a_expr ) ... PARTITION BY HASH ( a_expr ) PARTITIONS num_partitions;

Unless someone comes up with a maintenance plan for stable hash functions, we should probably not dare look into this yet.

What would cover the common use case of per-day quals and drops over an extended history period, say six or nine months? You don't get quite the same locality of reference, generally, with an unpartitioned table, due to slop in the arrival of rows. Ideally, you don't want to depend on an administrator, or even an administrative script, to continually intervene in the structure of a table, as would be the case with partitioning by range, and you don't want to coalesce multiple dates, as an arbitrary hash might do. What the administrator would want is to decide which rows are too old to keep, then process (e.g. archive, summarize, filter) and delete them.

Suppose that the number of partitions were taken as a hint rather than as a naming modulus, and that any quasi-hash function had to be specified explicitly (although storage assignment could be based on a hash of the quasi-hash output). If a_expr were allowed to include a to-date conversion of a timestamp, day-by-day partitioning would fall out naturally. If, in addition, single-parameter (?) functions could be characterized as range-preserving and order-preserving, plan generation could be improved for time ranges on quasi-hash-partitioned tables, without a formal indexing requirement.

There are cases where additional partition dimensions would be useful, for eventual parallelized operation on large databases, and randomizing quasi-hash functions would help. IMHO stability is not needed, except to the extent that hash functions have properties that lend themselves to plan generation and/or table maintenance.

It is not clear to me what purpose there would be in dropping a partition. This would be tantamount to deleting all of the rows in a partition, if it were analogous to dropping a table, and it would require some sort of compensatory aggregation of existing partitions (in effect, a second partitioning dimension) if it were merely structural. Perhaps I'm missing something here.

David Hudson
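The quasi-hash scheme proposed above can be sketched briefly (Python stand-in; the function names, partition count, and digest choice are all illustrative assumptions, not proposed syntax):

```python
import hashlib
from datetime import datetime

NUM_PARTITIONS = 8   # taken as a hint, not a naming modulus

def quasi_hash(ts: datetime) -> str:
    # The explicitly specified, order- and range-preserving "quasi-hash":
    # collapse a timestamp to its day.
    return ts.date().isoformat()

def storage_partition(ts: datetime) -> int:
    # Storage assignment is based on a hash of the quasi-hash output;
    # a stable digest is used so assignments survive restarts.
    digest = hashlib.sha256(quasi_hash(ts).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Every row from a given day lands in the same partition, so per-day
# processing and deletion touch one day's worth of storage.
morning = datetime(2010, 4, 4, 9, 30)
night = datetime(2010, 4, 4, 23, 59)
assert storage_partition(morning) == storage_partition(night)
```

Because quasi_hash preserves order, a planner could in principle translate a range qual on the timestamp into a range of quasi-hash outputs, which is the plan-generation improvement suggested above.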
Re: [HACKERS] Unicode Normalization
In a context using normalization, wouldn't you typically want to store a normalized-text type that could perhaps (depending on locale) take advantage of simpler, more-efficient comparison functions? Whether you're doing INSERT/UPDATE or importing a flat text file, if you canonicalize characters and substrings of identical meaning when trivial distinctions of encoding are irrelevant, you're better off later. User-invocable normalization functions by themselves don't make much sense. (If Postgres now supports binary- or mixed-binary-and-text flat files, perhaps for restore purposes, the same thing applies.)

David Hudson
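The canonicalize-on-the-way-in idea can be illustrated with Python's stdlib normalizer (the `ingest` name is mine; NFC is one choice of canonical form):

```python
import unicodedata

def ingest(s: str) -> str:
    # Canonicalize once at INSERT/import time, so later comparisons
    # can be plain code-point (or byte) comparisons.
    return unicodedata.normalize("NFC", s)

composed = "caf\u00e9"        # 'é' as a single code point
decomposed = "cafe\u0301"     # 'e' followed by a combining acute accent

assert composed != decomposed                   # raw strings differ...
assert ingest(composed) == ingest(decomposed)   # ...but compare equal once stored
```

With the data guaranteed normalized at rest, equality and collation never have to re-normalize, which is the efficiency argument above.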
Re: [HACKERS] [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex
Normally I'd try a small lookup table (1-byte index to 1-byte value) in this case. But if the bit-scan instruction were even close in performance, it would be preferable, due to its more reliable caching behavior; it should be possible to capture this at code-configuration time (aligned so as to produce an optimal result for each test case; see below). The specific code for the large-versus-small testing would be useful; did I overlook it?

Note that instruction alignment with respect to words is not the only potential instruction-alignment issue. In the past, when optimizing code to an extreme, I've run into cache-line issues where a small change that should have produced a small improvement resulted in a largish performance loss, without further work. Lookup tables can have an analogous issue; in a simplistic test, this could explain an anomalous large-better-than-small result, if part of the large lookup table remains cached. (Do any modern CPUs attempt to address this?) This is difficult to tune in a multiplatform code base, so the numbers in a particular benchmark do not tell the whole tale; you'd need to make a judgment call, and perhaps allow a code-configuration override.

David Hudson
Re: [HACKERS] Improving the ngettext() patch
(Grrr, declension, not declination.)

Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n

Thanks. The above (ignoring backslash-EOL) is the form recommended for Russian (inter alia(s)) in the Texinfo manual for gettext (info gettext). FWIW, this might be an alternative:

Plural-Forms: nplurals=3; plural=((n - 1) % 10) >= (5 - 1) || (((n - 1) % 100) <= (14 - 1) && ((n - 1) % 100) >= (11 - 1)) ? 2 : ((n - 1) % 10) == (1 - 1) ? 0 : 1;\n

David Hudson
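One way to check that the two headers select the same forms is to transliterate both C expressions into a sketch and compare them over a range (function names are mine; the range starts at 1 to sidestep C-versus-Python semantics of (n - 1) % 10 at n = 0):

```python
def plural_std(n: int) -> int:
    # The form recommended for Russian in the gettext manual.
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2

def plural_alt(n: int) -> int:
    # The alternative form proposed above, phrased in terms of n - 1.
    m = n - 1
    if m % 10 >= 4 or 10 <= m % 100 <= 13:
        return 2
    return 0 if m % 10 == 0 else 1

# 1, 21, 101 -> form 0; 2, 22, 104 -> form 1; 5, 11, 14, 100, 111 -> form 2
assert all(plural_std(n) == plural_alt(n) for n in range(1, 1000))
```

The spot checks in the comment match the usual Russian pattern: last digit 1 (except 11) takes the singular form, last digits 2-4 (except 12-14) the paucal form, everything else the plural.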
Re: [HACKERS] Improving the ngettext() patch
Russian plural forms for 100, 101, 102, etc. are different, as they are for 0, 1, 2.

True. The rule, IIRC, is that except for 11-14 and for collective numerals, declension follows the last digit. It would be possible to generalize declension via a language-specific message-selector function, especially if the number of numerical complements were limited to 1. How awkward would it be to reword the style of messages to avoid declension? For example, the Russian equivalent of "X rows" could be something like "#rows -- X".

David Hudson
Re: realloc overhead (was [HACKERS] Multiple sorts in a query)
So at least transiently we use 3x the size of the actual array.

I was conjecturing, prior to investigation. Are you saying you know this / have seen this already?

Well, I'm just saying that if you realloc an x-kilobyte block into a 2x block, and the allocator can't expand it and has to copy, then it seems inevitable.

FYI, the malloc()/realloc()/free() on FC4 causes memory fragmentation, and thus long-term growth in process memory, under some circumstances. This, together with the power-of-two allocations in aset.c not accounting for malloc() overhead (not that they could), implies that memory contexts can cause fragmentation, more slowly, too. Reallocations of smallish blocks from memory contexts tend to use memory already withheld from the OS; a transient increase in memory usage is possible, but unlikely to matter. Perhaps something should be done about larger blocks.

David Hudson
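The 3x figure above is just the copy arithmetic: when a realloc cannot grow in place, the old x-sized block and the new 2x block are both live while the copy runs. A toy sketch of that peak under an assumed doubling-growth policy (names are illustrative):

```python
def grow_by_doubling(final_size: int, start: int = 1):
    """Track transient memory while growing a block by repeated doubling.

    Models a realloc that cannot expand in place: during each copy the
    old block and the new (2x) block coexist.
    """
    size, peak = start, start
    while size < final_size:
        new = size * 2
        peak = max(peak, size + new)   # old + new both live during the copy
        size = new
    return size, peak

final, peak = grow_by_doubling(1 << 20)
# The worst transient is 3x the pre-doubling block, i.e. 1.5x the final size.
assert peak == final + final // 2
```

Whether the freed old block is then reusable, or instead strands a hole that the allocator never returns to the OS, is exactly the fragmentation question raised for FC4's malloc above.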