Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Dimitri Fontaine
Erik Rijkers e...@xs4all.nl writes:
 I might be mistaken but it looks like a doc/src/sgml/ref/alter_extension.sgml 
 is missing?

Mmm, it seems that git was agreeing with you, so here it is:

  git ls-files doc/src/sgml/ref/alter_extension.sgml 
  
http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=9371a9763651df2636cb6c20dced7cd67398c477

It was already online for readers of the HTML version of the docs:

  http://pgsql.tapoueh.org/extensions/doc/html/sql-alterextension.html

And it will appear in next revision of the patch. Thanks!
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@commandprompt.com writes:
  <function><link
 linkend="functions-extension">pg_extension_flag_dump</link></function>
[...]
 So presumably this shouldn't be documented because it cannot be called
 anyway?

It can be called but only from an extension's script.

 To be honest I don't understand the purpose of this part of the patch.

So the problem we're offering a solution for, here, is the extension
with user data problem: the extension infrastructure is only there so
that pg_dump knows to filter OUT sql objects from its dump, preferring a
single CREATE EXTENSION command.  Some extensions allow users to control
the data in some of their objects: now you want to have those in the
backup again.

From the docs:

  
http://pgsql.tapoueh.org/extensions/doc/html/functions-admin.html#FUNCTIONS-EXTENSION

  pg_extension_with_user_data allows an extension's author to prepare
  installation scripts that will work well both for initial installation
  and when restoring from a pg_dump backup, which issues CREATE EXTENSION
  foo WITH NO USER DATA;. See CREATE EXTENSION for details. One way to
  use it is as follows:

  DO $$
  BEGIN
    IF pg_extension_with_user_data() THEN
      create schema foo;
      create table foo.bar(id serial primary key);
      perform pg_extension_flag_dump('foo.bar_id_seq'::regclass);
      perform pg_extension_flag_dump('foo.bar'::regclass);
    END IF;
  END;
  $$;
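
  On restore, per the CREATE EXTENSION docs quoted above, the dump then
  replays something like:

    CREATE EXTENSION foo WITH NO USER DATA;

  so pg_extension_with_user_data() returns false inside the script, the IF
  branch is skipped, and foo.bar plus its sequence are restored from the
  dump's data section instead of being re-created empty.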

I don't really know how to improve the docs; you seem to have been
surprised by reading the CREATE EXTENSION docs, but you didn't follow the
link to the function's docs with the above details, did you?

I'm open to improving things here, but I'm not seeing how yet.

 I attach some minor fixes while reading it over.  I compiled but didn't
 run it :-)

Thanks a lot, that's applied in my git repo, and I did run it
successfully! It will be part of the next patch revision.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Dec18, 2010, at 17:59 , Tomas Vondra wrote:
 It seems to me you're missing one very important thing - this was not
 meant as a new default way to do estimates. It was meant as an option
 when the user (DBA, developer, ...) realizes the current solution gives
 really bad estimates (due to correlation). In that case he could create
 'cross-column' statistics on those columns, and the optimizer would use
 that info to do the estimates.

 I do understand that. I just have the nagging feeling that there is a
 way to judge from dist(A), dist(B) and dist(A,B) whether it makes sense
 to apply the uniform bayesian approach or to assume the columns are
 unrelated.

I doubt there is a way to make this decision with just dist(A), dist(B) and
dist(A,B) values. Well, we could go with a rule

  if [dist(A) == dist(A,B)] then [A => B]

but that's very fragile. Think about estimates (we're not going to work
with exact values of dist(?)), and then about data errors (e.g. a city
matched to an incorrect ZIP code or something like that).

So for a real-world dataset, the condition [dist(A)==dist(A,B)] will
almost never hold. And about the same holds for the uniform correlation
assumption which is the basis for the formula I posted.

So actually we're looking for a formula that gives reasonable estimates and
is robust even in cases where the correlation is not uniform or the
estimates are somewhat imprecise.

 This motivates the definition

 F(A,B) = [ dist(A)*dist(B) - dist(A,B) ] / [dist(A,B) * ( dist(B) - 1)]

 (You can probably drop the -1, it doesn't make much of a difference
 for larger values of dist(B).)

 F(A,B) specifies where dist(A) lies relative to dist(A,B)/dist(B) and
 dist(A,B) - a value of 0 indicates dist(A) = dist(A,B)/dist(B) while
 a value of 1 indicates that dist(A) == dist(A,B).

 So F(A,B) is a suitable measure of "Implicativeness" - it's higher
 if the table (A,B) looks more like a function A -> B.

 You might use that to decide if either A->B or B->A looks function-like
 enough to use the uniform bayesian approach. Or you might even go further,
 and decide *which* bayesian formula to use - the paper you cited always
 averages

   P(A=x|B=y)*P(B=y) and
   P(B=y|A=x)*P(A=x)

 but they offer no convincing reason for that other than "We don't know
 which to pick".

Well, the reason why they chose the sum/2 approach is that they were unable
to infer which of the values is 'better', and the sum/2 limits the errors.

I haven't studied this thoroughly, but my impression is that you are going
in the same direction as they did, i.e. while they've done

   P(A,B) = (P(A|B)*P(B) + P(B|A)*P(A)) / 2

with P(A|B) = dist(B) / dist(A,B), you've chosen P(A|B) ~ F(B,A) or
something like that.

You'll probably object that you could compute F(A,B) and F(B,A) and then
use only the part corresponding to the larger value, but what if F(A,B)
and F(B,A) are about the same?

This is the reason why they chose to always combine the values (with
varying weights).

 I'd like to find a statistical explanation for that definition of
 F(A,B), but so far I couldn't come up with any. I created a Maple 14
 worksheet while playing around with this - if you happen to have a
 copy of Maple available I'd be happy to send it to you..

No, I don't have Maple. Have you tried Maxima
(http://maxima.sourceforge.net) or Sage (http://www.sagemath.org/)? Sage
even has an online notebook - that seems like a very comfortable way to
exchange this kind of data.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2010-12-21 Thread Shigeru HANADA
On Mon, 20 Dec 2010 20:42:38 +0900
Itagaki Takahiro itagaki.takah...@gmail.com wrote:
 On Sun, Dec 19, 2010 at 12:45, Robert Haas robertmh...@gmail.com wrote:
  I'm not questioning any of that.  But I'd like the resulting code to
  be as maintainable as we can make it.
 
 I added comments and moved some setup codes for COPY TO to BeginCopyTo()
 for maintainability. CopyTo() still contains parts of initialization,
 but I've not touched it yet because we don't need the arrangement for now.

Attached is the revised version of file_fdw patch.  This patch is
based on Itagaki-san's copy_export-20101220.diff patch.

Changes from previous version are:

* file_fdw uses CopyErrorCallback() as the error-context callback routine
in fileIterate() to report error context.  The CONTEXT line in the
example below is added by the callback.

postgres=# select * From csv_tellers_bad;
ERROR:  missing data for column "bid"
CONTEXT:  COPY csv_tellers_bad, line 10: "10"
postgres=#

* Only superusers can change table-level file_fdw options.  A normal
user can't change the options even if the user is the owner of the
table.  This is for security reasons.

Regards,
--
Shigeru Hanada


file_fdw-20101221.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 9:29 PM, Florian Pflug f...@phlo.org wrote:
 I tried to pick up Robert's idea of quantifying "Implicativeness" -
 i.e., finding a number between 0 and 1 that describes how close the
 (A,B) are to representing a function A -> B.

Actually Heikki's idea...

 Observe that dist(A),dist(B) <= dist(A,B) <= dist(A)*dist(B) if the
 estimates of dist(?) are consistent. From that you easily get

  dist(A,B)/dist(B) <= dist(A) <= dist(A,B) and
  dist(A,B)/dist(A) <= dist(B) <= dist(A,B)

 If dist(A) == dist(A,B), then there is a functional dependency
 A -> B, and conversely if dist(B) == dist(A,B) there is a functional
 dependency B -> A. Note that you can have both at the same time!

 On the other hand, if dist(B) = dist(A,B)/dist(A), then B has the
 smallest number of distinct values possible for a given combination
 of dist(A,B) and dist(A). This is the anti-function case.

 This motivates the definition

  F(A,B) = [ dist(A)*dist(B) - dist(A,B) ] / [ dist(A,B) * ( dist(B) - 1) ]

 (You can probably drop the -1, it doesn't make much of a difference
 for larger values of dist(B).)

 F(A,B) specifies where dist(A) lies relative to dist(A,B)/dist(B) and
 dist(A,B) - a value of 0 indicates dist(A) = dist(A,B)/dist(B) while
 a value of 1 indicates that dist(A) == dist(A,B).

 So F(A,B) is a suitable measure of "Implicativeness" - it's higher
 if the table (A,B) looks more like a function A -> B.

 You might use that to decide if either A->B or B->A looks function-like
 enough to use the uniform bayesian approach. Or you might even go further,
 and decide *which* bayesian formula to use - the paper you cited always
 averages

  P(A=x|B=y)*P(B=y) and
  P(B=y|A=x)*P(A=x)

 but they offer no convincing reason for that other than "We don't know
 which to pick".

Ideally you want to somehow make this a continuous transition between
the available formulas rather than a discrete transition, I think.  If
F(A,B) = 1 then the selectivity of A = x AND B = y is just P(A=x), and
if it's 0, then it's P(A=x)*P(B=y).  But suppose F(A,B)=0.5.  Then
what?  A naive approach would be to estimate P(A=x && B=y) = P(A=x) *
(1 - (1 - F(A,B))*(1 - P(B = y))), so that if, say, P(A=x) = 0.1 and
P(B=y) = 0.1, then when F(A,B) = 0 we estimate 0.01, when F(A,B) = 1
we estimate 0.1, and when F(A,B) = 0.5 we estimate (0.1)(1 - 0.5*0.9)
= 0.055.  Of course I'm just hand-waving here, and this is without any
mathematical basis, being just the simplest formula I could think of
that gets the endpoints right and plots some sort of smooth curve
between them in the middle.  A similar formula with a believable
argument to back it up seems like it would be a big step forward for
this method.
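
To make the endpoints and the curve concrete, here is a toy SQL sketch of
that naive blend (the function name is invented; the sample numbers are
just the ones from the paragraph above, not a proposed implementation):

  CREATE FUNCTION blend_sel(p_a float8, p_b float8, f float8) RETURNS float8
  AS $$ SELECT $1 * (1 - (1 - $3) * (1 - $2)) $$ LANGUAGE sql IMMUTABLE;

  SELECT blend_sel(0.1, 0.1, 0.0),  -- 0.010 = P(A=x)*P(B=y), independence
         blend_sel(0.1, 0.1, 0.5),  -- 0.055, the hand-waved midpoint
         blend_sel(0.1, 0.1, 1.0);  -- 0.100 = P(A=x), functional dependency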

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 3:14 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 20, 2010 at 3:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 20, 2010 at 2:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I like that better actually ... one less thing for developers to get wrong.

 The attached patch appears to work correctly on MacOS X.  I did check,
 BTW: getppid() in the attached process returns gdb's pid.  Poor!

 Looks good to me.

 For my own purposes, I would be just as happy to apply this only to
 master.  But I wonder if anyone wants to argue for back-patching, to
 help debug existing installations?

 Given the lack of non-developer complaints, I see no need to backpatch.

 Well, non-developers don't tend to attach gdb very often.  Alvaro
 mentioned a problem installation upthread, thus the question.

Hearing no cries of please-oh-please-backpatch-this, I've committed
it just to master.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Robert Haas
2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there a possibility of having an internal base-converting function in PgSQL?
 There are already functions for converting between decimal and hexadecimal
 notations; I think pgsql should be able to convert between numbers with radixes
 from 1 to 36 (actually, fast base36 encoding/decoding is what I need)...

It should be pretty easy to write such a function in C, perhaps using
strtol() or strtoul().  Because PostgreSQL uses an extensible
architecture, you could load such a function into your copy of
PostgreSQL and use it in your environment even if it weren't part of
the core distribution.  There are a number of existing contrib
modules that you can look at for examples of how to do this.

Whether or not we'd accept a patch to add such a function to core or
contrib, I'm not sure.  Nobody's written one yet...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Robert Haas
On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasby j...@nasby.net wrote:
 On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:
 Does postgres make an effort to create a file with physically continuous 
 blocks?

 AFAIK all files are expanded as needed. I don't think there are any flags you
 can pass to the filesystem to tell it "this file will eventually be 1GB in
 size". So, we're basically at the mercy of the FS to try and keep things
 contiguous.

There have been some reports that we would do better on some
filesystems if we extended the file more than a block at a time, instead
of one block at a time as we do today.  However, AFAIK, no one is
pursuing this ATM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Stehule
On 21 December 2010 12:48, Robert Haas robertmh...@gmail.com wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there a possibility of having an internal base-converting function in PgSQL?
 There are already functions for converting between decimal and hexadecimal
 notations; I think pgsql should be able to convert between numbers with radixes
 from 1 to 36 (actually, fast base36 encoding/decoding is what I need)...

 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().  Because PostgreSQL uses an extensible
 architecture, you could load such a function into your copy of
 PostgreSQL and use it in your environment even if it weren't part of
 the core distribution.  There are a number of existing contrib
 modules that you can look at for examples of how to do this.

 Whether or not we'd accept a patch to add such a function to core or
 contrib, I'm not sure.  Nobody's written one yet...

The most-used transformations are available in core now - they just need a
wrapper function.

This function isn't clear-cut, though - should it be based on int, long int,
or bytea?

Pavel



 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Mon, Dec 20, 2010 at 9:29 PM, Florian Pflug f...@phlo.org wrote:
 You might use that to decide if either A->B or B->A looks function-like
 enough to use the uniform bayesian approach. Or you might even go further,
 and decide *which* bayesian formula to use - the paper you cited always
 averages

  P(A=x|B=y)*P(B=y) and
  P(B=y|A=x)*P(A=x)

 but they offer no convincing reason for that other than "We don't know
 which to pick".

 Ideally you want to somehow make this a continuous transition between
 the available formulas rather than a discrete transition, I think.  If
 F(A,B) = 1 then the selectivity of A = x AND B = y is just P(A=x), and
 if it's 0, then it's P(A=x)*P(B=y).  But suppose F(A,B)=0.5.  Then
 what?  A naive approach would be to estimate P(A=x && B=y) = P(A=x) *
 (1 - (1 - F(A,B))*(1 - P(B = y))), so that if, say, P(A=x) = 0.1 and
 P(B=y) = 0.1, then when F(A,B) = 0 we estimate 0.01, when F(A,B) = 1
 we estimate 0.1, and when F(A,B) = 0.5 we estimate (0.1)(1 - 0.5*0.9)
 = 0.055.  Of course I'm just hand-waving here, and this is without any
 mathematical basis, being just the simplest formula I could think of
 that gets the endpoints right and plots some sort of smooth curve
 between them in the middle.  A similar formula with a believable
 argument to back it up seems like it would be a big step forward for
 this method.

This somehow reminds me of how the various t-norms in fuzzy logic evolved.
I'm not saying we should use fuzzy logic here, but the requirements are
very similar, so it might be an interesting inspiration. See for example
http://plato.stanford.edu/entries/logic-fuzzy (chapter 4).

And there's one additional - IMHO very important - requirement. The whole
thing should easily extend to more than two columns. This "IF (F(A,B) >
F(B,A)) THEN ..." approach is probably not a good solution in that regard.

For example, given 3 columns A,B,C, would you do that comparison for each
pair of columns, or would you do it for A vs (B,C)? Or maybe a
completely different approach? That would require collecting a lot
more data (the number of distinct values in each combination) etc.

Say, for example, there is a table with (C=A+B)

  A | B | C
 ===========
  1 | 1 | 2
  1 | 2 | 3
  1 | 3 | 4
  2 | 1 | 3
  2 | 2 | 4
  2 | 3 | 5
  3 | 1 | 4
  3 | 2 | 5
  3 | 3 | 6

So that dist(A)=dist(B)=3, dist(C)=5 and dist(A,B,C)=dist(A,B)=9. Given
the paper, you get something like

 P(A,B,C) = [dist(A)*P(A) + dist(B)*P(B) + dist(C)*P(C)] / [3*dist(A,B,C)]
          = [3*P(A) + 3*P(B) + 5*P(C)] / 27

so for example

 P(A=3,B=2,C=5) = [3*(1/3) + 3*(1/3) + 5*(2/9)]/27 = (28/9)/27 = 28/243

which is almost correct (the exact value is 1/9 = 27/243, off by 1/243).

Don't get me wrong - I'm not a fanatic who thinks this particular formula
is the best formula possible. I'm just saying we could end up with a
formula that works beautifully in 2D, but once we get to 3 columns it
fails miserably.

Hmmm, maybe we could give this possibility (to identify two separate
groups of columns) to the user. So instead of 'build stats for (A,B,C)' the
user would say 'build stats for (A,B) and (C)' - this actually represents
apriori knowledge of dependencies supplied by the user.

In that case we could search for 'implicativeness' between those two
groups (and not within the groups), and we could restrict ourselves to 2D
(and thus use a more sophisticated formula).

But we should be able to do some basic analysis even when the user
supplies a list of columns without such apriori knowledge.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2010-12-21 Thread Itagaki Takahiro
On Tue, Dec 21, 2010 at 20:14, Shigeru HANADA han...@metrosystems.co.jp wrote:
 Attached is the revised version of file_fdw patch.  This patch is
 based on Itagaki-san's copy_export-20101220.diff patch.

#1. Don't you have a per-tuple memory leak? I added GetCopyExecutorState()
because the caller needs to reset the per-tuple context periodically.

Or, if you eventually make a HeapTuple from the values and nulls arrays,
you could modify NextCopyFrom() to return a HeapTuple instead of values,
nulls, and tupleOid. The reason I didn't use HeapTuple is that I've
seen arrays used in the proposed FDW APIs. But we don't have to
use such arrays if you use HeapTuple-based APIs.

IMHO, I prefer HeapTuple because we can simplify NextCopyFrom and
keep EState private in copy.c.

#2. Can you avoid making the EXPLAIN text in fplan->explainInfo in
non-EXPLAIN cases? It's a waste of CPU cycles in normal execution.
I doubt whether the FdwPlan.explainInfo field is the best design.
How do we use the EXPLAIN text for XML or JSON explain formats?
Instead, we could have an additional routine for EXPLAIN.

#3. Why do you re-open a foreign table in estimate_costs() ?
Since the caller seems to have the options for them, you can
pass them directly, no?

In addition, passing a half-initialized fplan to estimate_costs()
is a bad idea. If you think it is an OUT parameter, the OUT params
should be *startup_cost and *total_cost.

#4. It's a minor cosmetic point, but our coding convention is that
we don't need (void *) when we cast a pointer to void *, but we do need
(Type *) when we cast a void pointer to another type.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - file_fdw

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 6:42 AM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Sun, Dec 19, 2010 at 12:45, Robert Haas robertmh...@gmail.com wrote:
 I'm not questioning any of that.  But I'd like the resulting code to
 be as maintainable as we can make it.

 I added comments and moved some setup codes for COPY TO to BeginCopyTo()
 for maintainability. CopyTo() still contains parts of initialization,
 but I've not touched it yet because we don't need the arrangement for now.

I haven't analyzed this enough to know whether I agree with it, but as
a trivial matter you should certainly revert this hunk:

/* field raw data pointers found by COPY FROM */
-
-   int max_fields;
-   char ** raw_fields;
+   int max_fields;
+   char  **raw_fields;


-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bug in ts_rank_cd

2010-12-21 Thread Sushant Sinha
There is a bug in ts_rank_cd. It does not correctly give rank when the
query lexeme is the first one in the tsvector.

Example:

select ts_rank_cd(to_tsvector('english', 'abc sdd'),
plainto_tsquery('english', 'abc'));   
 ts_rank_cd 
------------
          0

select ts_rank_cd(to_tsvector('english', 'bcg abc sdd'),
plainto_tsquery('english', 'abc'));
 ts_rank_cd 
------------
        0.1

The problem is that the cover-finding algorithm ignores the lexeme at
the 0th position. I have attached a patch which fixes it. After the
patch the result is fine.

select ts_rank_cd(to_tsvector('english', 'abc sdd'), plainto_tsquery(
'english', 'abc'));
 ts_rank_cd 
------------
        0.1

--- postgresql-9.0.0/src/backend/utils/adt/tsrank.c	2010-01-02 22:27:55.0 +0530
+++ postgres-9.0.0-tsrankbugfix/src/backend/utils/adt/tsrank.c	2010-12-21 18:39:57.0 +0530
@@ -551,7 +551,7 @@
 	memset(qr->operandexist, 0, sizeof(bool) * qr->query->size);
 
 	ext->p = 0x7fffffff;
-	ext->q = 0;
+	ext->q = -1;
 	ptr = doc + ext->pos;
 
 	/* find upper bound of cover from current position, move up */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] bug in ts_rank_cd

2010-12-21 Thread Sushant Sinha
MY PREV EMAIL HAD A PROBLEM. Please reply to this one
======================================================

There is a bug in ts_rank_cd. It does not correctly give rank when the
query lexeme is the first one in the tsvector.

Example:

select ts_rank_cd(to_tsvector('english', 'abc sdd'),
plainto_tsquery('english', 'abc'));   
 ts_rank_cd 
------------
          0

select ts_rank_cd(to_tsvector('english', 'bcg abc sdd'),
plainto_tsquery('english', 'abc'));
 ts_rank_cd 
------------
        0.1

The problem is that the cover-finding algorithm ignores the lexeme at
the 0th position. I have attached a patch which fixes it. After the
patch the result is fine.

select ts_rank_cd(to_tsvector('english', 'abc sdd'), plainto_tsquery(
'english', 'abc'));
 ts_rank_cd 
------------
        0.1

--- postgresql-9.0.0/src/backend/utils/adt/tsrank.c	2010-01-02 22:27:55.0 +0530
+++ postgres-9.0.0-tsrankbugfix/src/backend/utils/adt/tsrank.c	2010-12-21 18:39:57.0 +0530
@@ -551,7 +551,7 @@
 	memset(qr->operandexist, 0, sizeof(bool) * qr->query->size);
 
 	ext->p = 0x7fffffff;
-	ext->q = 0;
+	ext->q = -1;
 	ptr = doc + ext->pos;
 
 	/* find upper bound of cover from current position, move up */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there a possibility of having an internal base-converting function in PgSQL?
 There are already functions for converting between decimal and hexadecimal
 notations; I think pgsql should be able to convert between numbers with radixes
 from 1 to 36 (actually, fast base36 encoding/decoding is what I need)...
 
 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().

If you're not comfortable doing this in C, you might also want to consider
one of the procedural languages pl/pgsql, pl/perl or pl/python. pl/pgsql is
probably only viable if you just need this for ints and bigints, unless you
don't care about performance.
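
For the bigint case, a minimal pl/pgsql sketch (function name invented,
non-negative input only) could look like this:

  CREATE FUNCTION to_base36(n bigint) RETURNS text AS $$
  DECLARE
    digits text := '0123456789abcdefghijklmnopqrstuvwxyz';
    result text := '';
    v bigint := n;
  BEGIN
    IF v = 0 THEN RETURN '0'; END IF;
    WHILE v > 0 LOOP
      -- prepend the digit corresponding to v mod 36
      result := substr(digits, (v % 36)::int + 1, 1) || result;
      v := v / 36;
    END LOOP;
    RETURN result;
  END;
  $$ LANGUAGE plpgsql IMMUTABLE;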

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 11:37 , t...@fuzzy.cz wrote:
 I doubt there is a way to make this decision with just dist(A), dist(B) and
 dist(A,B) values. Well, we could go with a rule
 
  if [dist(A) == dist(A,B)] then [A => B]
 
 but that's very fragile. Think about estimates (we're not going to work
 with exact values of dist(?)), and then about data errors (e.g. a city
 matched to an incorrect ZIP code or something like that).

Huh? The whole point of the F(A,B)-exercise is to avoid precisely this
kind of fragility without penalizing the non-correlated case...

 This is the reason why they choose to always combine the values (with
 varying weights).

There are no varying weights involved there. What they do is to express
P(A=x,B=y) once as

  P(A=x,B=y) = P(B=y|A=x)*P(A=x) and then as
  P(A=x,B=y) = P(A=x|B=y)*P(B=y).

Then they assume

  P(B=y|A=x) ~= dist(A)/dist(A,B) and
  P(A=x|B=y) ~= dist(B)/dist(A,B),

and go on to average the two different ways of computing P(A=x,B=y), which
finally gives

   P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
              = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
              = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))

That averaging step adds *no* further data-dependent weights.

 I'd like to find a statistical explanation for that definition of
 F(A,B), but so far I couldn't come up with any. I created a Maple 14
 worksheet while playing around with this - if you happen to have a
 copy of Maple available I'd be happy to send it to you..
 
 No, I don't have Maple. Have you tried Maxima
 (http://maxima.sourceforge.net) or Sage (http://www.sagemath.org/). Sage
 even has an online notebook - that seems like a very comfortable way to
 exchange this kind of data.

I haven't tried them, but I will. That Java-based GUI of Maple is driving
me nuts anyway... Thanks for the pointers!

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Dec21, 2010, at 11:37 , t...@fuzzy.cz wrote:
 I doubt there is a way to make this decision with just dist(A), dist(B) and
 dist(A,B) values. Well, we could go with a rule

  if [dist(A) == dist(A,B)] then [A => B]

 but that's very fragile. Think about estimates (we're not going to work
 with exact values of dist(?)), and then about data errors (e.g. a city
 matched to an incorrect ZIP code or something like that).

 Huh? The whole point of the F(A,B)-exercise is to avoid precisely this
 kind of fragility without penalizing the non-correlated case...

Yes, I understand the intention, but I'm not sure how exactly you want
to use the F(?,?) function to compute P(A,B) - which is the value
we're looking for.

If I understand it correctly, you proposed something like this

  IF (F(A,B) > F(B,A)) THEN
    P(A,B) := c*P(A);
  ELSE
    P(A,B) := d*P(B);
  END IF;

or something like that (I guess c=dist(A)/dist(A,B) and
d=dist(B)/dist(A,B)). But what if F(A,B)=0.6 and F(B,A)=0.59? This may
easily happen due to data errors / imprecise estimates.

And this actually matters, because P(A) and P(B) may actually be
significantly different. So this would be really vulnerable to slight
changes in the estimates etc.

 This is the reason why they choose to always combine the values (with
 varying weights).

 There are no varying weights involved there. What they do is to express
 P(A=x,B=y) once as

 ...

   P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
              = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
              = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))

 That averaging step adds *no* further data-dependent weights.

Sorry, by 'varying weights' I didn't mean that the weights are different
for each value of A or B. What I meant is that they combine the values
with different weights (just as you explained).

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 15:51 , t...@fuzzy.cz wrote:
 This is the reason why they choose to always combine the values (with
 varying weights).
 
 There are no varying weights involved there. What they do is to express
 P(A=x,B=y) once as
 
 ...
 
  P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
             = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
             = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))
 
 That averaging step adds *no* further data-dependent weights.
 
 Sorry, by 'varying weights' I didn't mean that the weights are different
 for each value of A or B. What I meant is that they combine the values
 with different weights (just as you explained).

I'm still not sure we're on the same page here. The resulting formula
is *not* a weighted average of P(A=x) and P(B=y), since in general
dist(A) + dist(B) = 2*dist(A,B) does *not* hold. It may look like one
syntactically, but that's about it.

The resulting formula instead is an *unweighted* (weights 1) average of
the two estimates P(B=y|A=x)*P(A=x) and P(A=x|B=y)*P(B=y). You might just
as well estimate P(A=x,B=y) with

  P(B=y|A=x)*P(A=x) = dist(A)*P(A=x)/dist(A,B)

and it'd *still* be the very same uniform bayesian approach, just no
longer symmetric in A and B. Which may easily be preferable if you
have reasons to believe that this estimate is more correct than the
one obtained by swapping A and B. The original paper doesn't deal with
that case simply because they don't mention how P(A=x) and P(B=y)
are obtained at all. The postgres estimator, on the other hand,
knows quite well how it derived P(A=x) and P(B=y) and may have much
higher confidence in one value than in the other.

Assume for example that you're preparing the statement

  SELECT * FROM T WHERE A = ? AND B = 1

We'll then estimate P(A=?) as 1/dist(A), since we cannot do any better
without an actual value for the parameter ?. The estimate for P(B=1),
on the other hand, can use the histogram, and will thus very likely be
much more precise. The two estimates for P(A=?,B=1) in this case are

  P(A=?|B=1)*P(B=1) = dist(B)*P(B=1)/dist(A,B), and
  P(B=1|A=?)*P(A=?) = dist(A)*P(A=?)/dist(A,B).

There's a good chance that the former beats the latter, and thus also
the average, then.
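
In PostgreSQL syntax, the situation above (with a hypothetical table t
having int columns a and b) is simply:

  PREPARE q(int) AS SELECT * FROM t WHERE a = $1 AND b = 1;

At plan time P(a=$1) has to fall back to 1/dist(a), while P(b=1) can
consult the histogram/MCV list and will usually be far more precise.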

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Erik Rijkers
On Tue, December 21, 2010 09:57, Dimitri Fontaine wrote:
 Erik Rijkers e...@xs4all.nl writes:
 I might be mistaken but it looks like a 
 doc/src/sgml/ref/alter_extension.sgml is missing?

 Mmm, it seems that git was agreeing with you, so here it is:

   git ls-files doc/src/sgml/ref/alter_extension.sgml
   
 http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=9371a9763651df2636cb6c20dced7cd67398c477

 It was already online for readers of the HTML version of the docs:

   http://pgsql.tapoueh.org/extensions/doc/html/sql-alterextension.html

 And it will appear in next revision of the patch. Thanks!
 --
 Dimitri Fontaine
 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


Two changes to sql-alterextension.sgml:

  ALTER EXTENSION name SET EXTENSION new_schema

should be:

  ALTER EXTENSION name SET SCHEMA new_schema




And in the 'Description' there are (I think) old copy/paste remnants:

  ALTER EXTENSION changes the definition of an existing type. There are only 
one subforms:
  SET SCHEMA

it should be (something like):

  ALTER EXTENSION changes an existing extension. There is only one form:
  ALTER EXTENSION set schema new_schema




Erik Rijkers





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Florian Pflug
On Dec21, 2010, at 13:25 , t...@fuzzy.cz wrote:
 And there's one additional - IMHO very important - requirement. The whole
 thing should easily extend to more than two columns. This "IF (F(A,B) >
 F(B,A)) THEN ..." approach is probably not a good solution in that regard.
 
 For example, given 3 columns A,B,C, would you do that comparison for each
 pair of columns, or would you do it for A vs (B,C)? Or maybe a
 completely different approach? That would require collecting a lot
 more data (the number of distinct values in each combination) etc.

That's certainly a valid concern. The uniform bayesian approach avoids that
quite beautifully...

 Hmmm, maybe we could give this possibility (to identify two separate
 groups of columns) to the user. So instead of 'build stats for (A,B,C)' the
 user would say 'build stats for (A,B) and (C)' - this actually represents
 apriori knowledge of dependencies supplied by the user.
 
 In that case we could search for 'implicativeness' between those two
 groups (and not within the groups), and we could restrict ourselves to 2D
 (and thus use a more sophisticated formula).

Hm, I hated this idea at first, but I'm starting to like it more and more.
It *does* seem rather unrealistic that a user would know that a bunch of
columns are correlated, but have no idea in what way... 

Any examples where this'd be the case would be very much appreciated - maybe
we should ask around on -general about this?

 But we should be able to do some basic analysis even when the user
 supplies a list of columns without such apriori knowledge.

That, I think, overcomplicates things, at least for a first cut.

To summarize, I think you've shown quite nicely that the uniform bayesian
approach is a very sensible first step towards better estimates in the case
of correlated columns. It's statistically sound, and the dist(A,B) estimates
it requires are probably a necessary ingredient of any solution to the
problem. If we can make it degrade more gracefully when the columns are
uncorrelated we should do that, but if we can't, that's still no reason to
drop the whole idea.

So I guess we should turn our attention to how we'd obtain reasonably good
estimates of dist(A,B), and return to the current discussion once the other
pieces are in place.

I think that maybe it'd be acceptable to scan a large portion of the
table to estimate dist(A,B), *if* we only do so once in a while. But even
with a full scan, how would we arrive at an estimate for dist(A,B)?
Keeping all values in memory, and spilling them into, say, an on-disk
hash table, adds even more overhead to the already expensive full scan.
Maybe using a Bloom filter instead of a hash table could avoid the
spilling to disk, in exchange for a slightly less precise result...
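
(For reference, the textbook definition: a Bloom filter hashes each value
with k hash functions into an m-bit array; lookups never miss a value that
was inserted, but report false positives with probability roughly
(1 - e^(-k*n/m))^k after n insertions. A distinct-count built on top of one
would therefore be a slight underestimate.)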

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest wrap-up

2010-12-21 Thread Robert Haas
On Wed, Dec 15, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 - Writeable CTEs - I think we need Tom to pick this one up.
 - Fix snapshot taking inconsistencies - Ready for committer. Can any
 committer pick this up?

 Will take a look at these two also.

Tom, what is your time frame on this?  I think we should wrap up the
CF without these and bundle 9.1alpha3 unless you plan to get to this
in the next day or two.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Owner inheritance

2010-12-21 Thread gsdfg gdfg
It would be great if the owner could be inherited from the parent object
(table owner == schema owner == database owner).
The CREATE statement could add OWNER TO PARENT to cover this feature.
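
Today that has to be spelled out per object, e.g. (schema and role names
invented):

  CREATE TABLE s.t (id int);
  ALTER TABLE s.t OWNER TO schema_owner_role;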

Michel


Re: [HACKERS] CommitFest wrap-up

2010-12-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 15, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 - Writeable CTEs - I think we need Tom to pick this one up.
 - Fix snapshot taking inconsistencies - Ready for committer. Can any
 committer pick this up?

 Will take a look at these two also.

 Tom, what is your time frame on this?  I think we should wrap up the
 CF without these and bundle 9.1alpha3 unless you plan to get to this
 in the next day or two.

We probably shouldn't hold up the alpha for these, if there are no
other items outstanding.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, patch 22 (cleanup, review, cleanup)

2010-12-21 Thread Dimitri Fontaine
Erik Rijkers e...@xs4all.nl writes:
   http://pgsql.tapoueh.org/extensions/doc/html/sql-alterextension.html
[...]
 Two changes to sql-alterextension.sgml:

Fixed and uploaded at the URL above; it will be in the next patch revision,
thanks!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Owner inheritance

2010-12-21 Thread Tom Lane
gsdfg gdfg mx.mo...@gmail.com writes:
 Would be great if owner can be inherited from parent object (owner table ==
 schema owner == database owner).
 CREATE statement could add OWNER TO PARENT to cover this feature.

What it would be is a great security hole --- exactly analogous to
allowing Unix chown to non-superusers.  Read up on the security
pitfalls of being able to give away ownership of an object.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Owner inheritance

2010-12-21 Thread Andrew Dunstan



On 12/21/2010 07:04 AM, gsdfg gdfg wrote:
It would be great if the owner could be inherited from the parent object
(table owner == schema owner == database owner).

The CREATE statement could add OWNER TO PARENT to cover this feature.



That syntax would violate POLA in the case of inherited tables (OWNER TO 
CONTAINER, or just OWNER TO SCHEMA etc might be clearer). And I think 
we'd have to restrict it to superusers anyway, which would seriously 
limit its usefulness.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread tv
 On Dec21, 2010, at 15:51 , t...@fuzzy.cz wrote:
 This is the reason why they choose to always combine the values (with
 varying weights).

 There are no varying weights involved there. What they do is to express
 P(A=x,B=y) once as

 ...

  P(A=x,B=y) ~= P(B=y|A=x)*P(A=x)/2 + P(A=x|B=y)*P(B=y)/2
             = dist(A)*P(A=x)/(2*dist(A,B)) + dist(B)*P(B=y)/(2*dist(A,B))
             = (dist(A)*P(A=x) + dist(B)*P(B=y)) / (2*dist(A,B))

 That averaging step adds *no* further data-dependent weights.

 Sorry, by 'varying weights' I didn't mean that the weights are different
 for each value of A or B. What I meant is that they combine the values
 with different weights (just as you explained).

 I'm still not sure we're on the same page here. The resulting formula
 is *not* a weighted average of P(A=x) and P(B=y), since in general
 dist(A) + dist(B) = 2*dist(A,B) does *not* hold. It may look like one
 syntactically, but that's about it.

OK, another crazy usage of 'weights' on my side :-(

What I meant is that in the end you have two equations of P(A,B):

  P(A=x|B=y)*P(B=y) = dist(B)*P(B=y)/dist(A,B)
  P(B=y|A=x)*P(A=x) = dist(A)*P(A=x)/dist(A,B)

and you need to combine those two estimates. They did that by averaging,
as they don't know which of the estimates is better.

Generally I think that is a good solution, unless you know one of the
estimates is much more reliable (although I'm not sure we should
completely omit the other estimate).

 The resulting formula instead is an *unweighted* (weights 1) average of
 the two estimates P(B=y|A=x)*P(A=x) and P(A=x|B=y)*P(B=y). You might just
 as well estimate P(A=x,B=y) with

   P(B=y|A=x)*P(A=x) = dist(A)*P(A=x)/dist(A,B)

 and it'd *still* be the very same uniform bayesian approach, just no
 longer symmetric in A and B. Which may easily be preferable if you
 have reasons to believe that this estimate is more correct than the
 one obtained by swapping A and B. The original paper doesn't deal with
 that case simply because they don't mention how P(A=x) and P(B=y)
 are obtained at all. The postgres estimator, on the other hand,
 knows quite well how it derived P(A=x) and P(B=y) and may have much
 higher confidence in one value than in the other.

OK, good point. I hadn't realized that one of the estimates may be much
more reliable.

But let's assume both estimates are about the same (regarding reliability)
and let's see the following example

 A | B
 ======
 1 | 1
 1 | 1
 1 | 1
 1 | 2
 2 | 1
 2 | 2
 2 | 2
 2 | 2

Thus dist(A)=dist(B)=2, dist(A,B)=4 and

  P(A=1)=P(A=2)=P(B=1)=P(B=2)=1/2
  P(A=1,B=1)=P(A=2,B=2)=3/8
  P(A=1,B=2)=P(A=2,B=1)=1/8

According to the formula presented in the paper, the partial estimates for
P(A=1,B=2) are

  P(A=1|B=2)*P(B=2) = dist(B)/dist(A,B) * P(B=2) = 2/4 * 1/2 = 1/4
  P(B=2|A=1)*P(A=1) = dist(A)/dist(A,B) * P(A=1) = 2/4 * 1/2 = 1/4

Thus P(A=1,B=2) = (1/4 + 1/4)/2 = 1/4, so it's overestimated (2x)

 A | B
 ======
 1 | 1
 1 | 2
 1 | 2
 1 | 2
 2 | 1
 2 | 1
 2 | 1
 2 | 2

This obviously has exactly the same features (regarding number of distinct
values), and the produced estimate is exactly the same. But in this case

  P(A=1,B=2)=P(A=2,B=1)=3/8
  P(A=1,B=1)=P(A=2,B=2)=1/8

and thus the 1/4 is an underestimate (compared to 3/8).

The problem is that F(A,B) does not change at all. It's very simple to
construct examples (just use more rows) where F(A,B) returns exactly the
same value, but the estimates are off. The averaging somehow smooths
this out ...
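
(Concretely: with dist(A)=dist(B)=2 and dist(A,B)=4, the definition gives
F(A,B) = [2*2 - 4] / [4*(2-1)] = 0 for both tables.)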

But I think I'm missing something about how to use the F(?,?) to derive
the final estimate. So maybe the resulting estimate would be better.

Say there are two tables

   A | B | number of such rows
  ==========================
   1 | 1 | 1000
   1 | 2 | 1000
   2 | 1 | 1000
   2 | 2 | 1000

   A | B | number of such rows
  ==========================
   1 | 1 | 1
   1 | 2 | 1999
   2 | 1 | 1999
   2 | 2 | 1

How would you estimate the P(A=1,B=1) in those cases? Assume that both
estimates are equally reliable - i.e. deduced from a histogram or MCV.


 Assume for example that you're preparing the statement

   SELECT * FROM T WHERE A = ? AND B = 1

 We'll then estimate P(A=?) as 1/dist(A), since we cannot do any better
 without an actual value for the parameter ?. The estimate for P(B=1),
 on the other hand, can use the histogram, and will thus very likely be
 much more precise. The two estimates for P(A=?,B=1) in this case are

   P(A=?|B=1)*P(B=1) = dist(B)*P(B=1)/dist(A,B), and
   P(B=1|A=?)*P(A=?) = dist(A)*P(A=?)/dist(A,B).

 There's a good chance that the former beats the latter, and thus also
 the average, then.

OK, good point. I was not thinking about prepared statements. In this case
it makes sense to use only one of the estimates ...

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] optimization histograms

2010-12-21 Thread amit sehas
HI,

For the histograms used in cost-based optimization, is there a rule of thumb
on how often to rebuild them? They are obviously not being continuously
updated... What is the state of the art in this area? Do all the other
databases also end up with stale statistics every now and then and have to
keep rebuilding the stats?

thanks
-Amit


  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] optimization histograms

2010-12-21 Thread Kevin Grittner
amit sehas cu...@yahoo.com wrote:
 
 For the histograms used in cost-based optimization, is there a rule of
 thumb on how often to rebuild them?
 
In recent major versions, autovacuum should normally keep you in
good shape.  The exception is when you make major changes to the
contents of a table (such as in a bulk data load) and then
immediately try to use the table before autovacuum has had time to
notice the activity and generate fresh statistics; for these cases
you probably want to do a manual run.
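
For example (table and file names are hypothetical):

  COPY my_table FROM '/path/to/data.csv' CSV;
  ANALYZE my_table;  -- refresh statistics now, instead of waiting for autovacuum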
 
For more information, see:
 
http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#VACUUM-FOR-STATISTICS
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal : cross-column stats

2010-12-21 Thread Tomas Vondra
On 21.12.2010 16:54, Florian Pflug wrote:
 Hmmm, maybe we could give this possibility (to identify two separate
 groups of columns) to the user. So instead of 'build stats for (A,B,C)' the
 user would say 'build stats for (A,B) and (C)' - this actually represents
 apriori knowledge of dependencies supplied by the user.

 In that case we could search for 'implicativeness' between those two
 groups (and not within the groups), and we could restrict ourselves to 2D
 (and thus use a more sophisticated formula).
 
 Hm, I hated this idea at first, but I'm starting to like it more and more.
 It *does* seem rather unrealistic that a user would know that a bunch of
 columns are correlated, but have no idea in what way... 

Yes, that's true. Although the dependency may sometimes be very
complicated, let's restrict ourselves to 2D for now, build something that
solves this simplified case, and then discuss higher dimensions.

 Any examples where this'd be the case would be very much appreciated - maybe
 we should ask around on -general about this?

Well, I think the ZIP code example is a typical case of this - the users
know about the dependency between ZIP codes and cities. A natural
workaround would be to omit the dependent column from the query, but
that's not always possible (e.g. when an ORM is involved, building the
queries automatically).

 But we should be able to do some basic analysis even when the user
 supplies a list of columns without such apriori knowledge.
 
 That, I think, overcomplicates things, at least for a first cut.
 
 To summarize, I think you've shown quite nicely that the uniform bayesian
 approach is a very sensible first step towards better estimates in the case
 of correlated columns. It's statistically sound, and the dist(A,B) estimates
 it requires are probably a necessary ingredient of any solution to the
 problem. If we can make it degrade more gracefully when the columns are
 uncorrelated we should do that, but if we can't, that's still no reason to
 drop the whole idea.

Agreed. IMHO the uncorrelated case is not a big concern, as the users
usually know something's wrong with the columns. We should introduce some
kind of 'autodetect' eventually, but let's leave that for the future.

 So I guess we should turn our attention to how we'd obtain reasonably good
 estimates of dist(A,B), and return to the current discussion once the other
 pieces are in place.
 
 I think that maybe it'd be acceptable to scan a large portion of the
 table to estimate dist(A,B), *if* we only do so once in a while. But even
 with a full scan, how would we arrive at an estimate for dist(A,B)?
 Keeping all values in memory, and spilling them into, say, an on-disk
 hash table, adds even more overhead to the already expensive full scan.
 Maybe using a Bloom filter instead of a hash table could avoid the
 spilling to disk, in exchange for a slightly less precise result...

I have no idea what a Bloom filter is (shame on me). I was not thinking
about collecting the stats; I was interested primarily in what data we
actually need. And my knowledge of the algorithms currently used is
very limited :-(

But I agree we should at least discuss the possible solutions. Until now
I've done something like this

   SELECT COUNT(DISTINCT a) AS dist_a,
          COUNT(DISTINCT b) AS dist_b,
          COUNT(DISTINCT a || ':' || b) AS dist_ab
     FROM my_table;

but that's not very efficient.
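
One sketch that at least avoids the string concatenation (which can also
produce false matches when the values themselves contain ':') is to count
grouped pairs - still a full scan, though:

   SELECT count(*) AS dist_ab
     FROM (SELECT 1 FROM my_table GROUP BY a, b) AS pairs;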

My plan for the near future (a few weeks) is to build a simple 'module'
with the ability to estimate the number of rows for a given condition.
This could actually be quite useful as a stand-alone contrib module, as
users often ask how to get the number of rows fast (usually for paging).

That may be quite slow when the query returns too many rows, even when
there is an index. It may even be much slower than the actual query (as
the actual query usually contains a small LIMIT).

An estimate is often sufficient, but 'pg_class.reltuples' does not
really work with conditions. So this might be an improvement ...
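
For the unconditional case there is already a cheap catalog-based
estimate, e.g.:

   SELECT reltuples::bigint AS estimated_rows
     FROM pg_class WHERE relname = 'my_table';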

regards
Tomas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Eric Ridge
On Mon, Dec 20, 2010 at 3:36 PM, Martijn van Oosterhout
klep...@svana.org wrote:
 On Mon, Dec 20, 2010 at 03:08:02PM -0500, Robert Haas wrote:
 The attached patch appears to work correctly on MacOS X.  I did check,
 BTW: getppid() in the attached process returns gdb's pid.  Poor!

 This appears to be a BSDism at least. On Linux and BSD derivatives the
 man pages specifically mention the reparenting (needed for catching
 signals) but on Linux getppid() is specifically documented to return
 the correct value anyway.

I'm just a random lurker here, and happened to catch the last bit of
this thread.  Could one of you who understand this issue straighten
something out for me?

Every now and again we've been known to attach gdb to a production
Postgres backend to troubleshoot problems.  Ya know, just trying to
get an idea of what Postgres is actually doing via a backtrace.  This
is always on Linux, BTW.

Does this thread mean that the above no longer works with v9?  Or is
this only on non-Linux systems, or does the patch Robert Haas committed
fix it?  We're still using 8.1 (slowly moving to 8.4) in
production, but plan on picking up 9.x later in '11.  Just
wondering if we need to actually be a bit more careful in the future?

Thanks!

eric

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Jesper Krogh

Hi Hackers.

I have a feeling that GIN is cheating on the visibility checks:

test=# set enable_seqscan = off;
SET
Time: 0.129 ms
test=# select count(id) from fts_test where fts @@ to_tsquery('core');
 count
--------
 158827
(1 row)

Time: 95.530 ms
test=# explain select count(id) from fts_test where fts @@ 
to_tsquery('core');

                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=211571.52..211571.53 rows=1 width=4)
   ->  Bitmap Heap Scan on fts_test  (cost=134925.95..211174.01 rows=159004 width=4)
         Recheck Cond: (fts @@ to_tsquery('core'::text))
         ->  Bitmap Index Scan on fts_idx  (cost=0.00..134886.20 rows=159004 width=0)
               Index Cond: (fts @@ to_tsquery('core'::text))
(5 rows)

Time: 0.609 ms

test=# select count(id) from fts_test;
 count
--------
 168556
(1 row)

Time: 164.655 ms

test=# explain select count(id) from fts_test;
                            QUERY PLAN
------------------------------------------------------------------
 Aggregate  (cost=1075969.95..1075969.96 rows=1 width=4)
   ->  Seq Scan on fts_test  (cost=100.00..1075548.56 rows=168556 width=4)
(2 rows)

Time: 0.338 ms

This is run multiple times for both queries, and the seqscan of the table
is consistently about 1.8 times more expensive than the fts-scan.
This is all on a fully memory-cached dataset.

The first query should have the cost of the GIN search + visibility test
of 158K tuples; the latter should have the cost of visibility-testing
168K tuples. If we set the cost of actually searching GIN to 0, then the
gin-search visibility test costs 95/158000 = 0.0006 ms/tuple, where the
seq-scan case costs close to 0.001 ms/tuple (nearly twice as much).


So I have a strong feeling that GIN is cheating on the visibility tests;
otherwise I have trouble imagining how it can ever be faster to execute
than the seq-scan of the table.

Or is a Bitmap Heap Scan simply that much faster than a Seq Scan for
visibility-testing?


What have I missed in the logic?

Thanks.

--
Jesper


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/MED - core functionality

2010-12-21 Thread Simon Riggs
On Wed, 2010-12-15 at 22:25 +0900, Shigeru HANADA wrote:

 Attached are revised versions of the SQL/MED core functionality patches.

Looks very interesting new feature, well done.

Can I ask some questions about how this will work?
No particular order, just numbered for reference.

1. The docs don't actually say what a foreign table is. Is it a local
representation of foreign data? Or a local copy of foreign data? Or is
it a table created on a remote node?

2. Will CREATE FOREIGN TABLE require a transactionid? It seems a good
replacement for temp tables on Hot Standby to be able to run a CREATE
FOREIGN TABLE using the file_fdw, then reuse the file again later.

3. Do we support CREATE TEMP FOREIGN TABLE? It seems desirable to be
able to move data around temporarily, as we do with normal tables.

4. In Hot Standby, we are creating many copies of the data tables on
different servers. That seems to break the concept that data is in only
one place, when we assume that a foreign table is on only one foreign
server. How will we represent the concept that data is potentially
available identically from more than one place? Any other comments about
how this will work with Hot Standby?

5. In PL/Proxy, we have the concept that a table is sharded across
multiple nodes. Is that possible here? Again, we seem to have the
concept that a table is only ever in a single place.

6. Can we do CREATE FOREIGN TABLE  AS SELECT ...
I guess the answer depends on (1)

7. Why does ANALYZE skip foreign tables? Surely its really important we
know things about a foreign table, otherwise we are going to optimize
things very badly.

8. Is the WHERE clause passed down into a ForeignScan?

9. The docs for CHECK constraints imply that the CHECK is executed
against any rows returned from FDW. Are we really going to execute that
as an additional filter on each row retrieved?

10. Can a foreign table be referenced by a FK?

11. Can you create a DO INSTEAD trigger on a FOREIGN TABLE?

12. I think it would be useful for both review and afterwards to write
the documentation section now, so we can begin to understand this. Will
there be a documentation section on writing a FDW also? There are enough
open questions here that I think we need docs and a review guide,
otherwise we'll end up with some weird missing feature, which would be a
great shame.

13. How does this relate to dblink? Is that going to be replaced by this
feature?

14. How do we do scrollable cursors with foreign tables? Do we
materialize them always? Or...

15. In terms of planning queries, do we have a concept of additional
cost per row on a foreign server? How does the planner decide how costly
retrieving data is from the FDW?

16. If we cancel a query, is there an API call to send query cancel to
the FDW and so on to the foreign server? Does that still work if we hit
other kinds of ERROR, or FATAL?

17. Can we request different types of transaction isolation on the
foreign server, or do certain states get passed through from our
session? e.g. if we are running a serializable transaction, does that
state get passed through to the FDW, so it knows to request that on the
foreign server? That seems essential if we are going to make pg_dump
work correctly.

18. Does pg_dump dump the data in the FDW or just of the definition of
the data? Can we have an option for either?

19. If we PREPARE a statement, are there API calls to pass thru the
PREPARE to the FDW? Or are calls always dynamic?

20. If default privileges include INSERT, UPDATE or DELETE, does this
cause error, or does it silently get ignored for foreign tables? I think
I would want the latter.

21. Can we LOCK a foreign table? I guess so. Presumably no LOCK is
passed through to the FDW?

22. Can we build an local index on a foreign table?

Not too sure what the right answers are to these questions, but I think
we need to know the answers to understand what we are getting.

Thanks

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 1:45 PM, Eric Ridge eeb...@gmail.com wrote:
 On Mon, Dec 20, 2010 at 3:36 PM, Martijn van Oosterhout
 klep...@svana.org wrote:
 On Mon, Dec 20, 2010 at 03:08:02PM -0500, Robert Haas wrote:
 The attached patch appears to work correctly on MacOS X.  I did check,
 BTW: getppid() in the attached process returns gdb's pid.  Poor!

 This appears to be a BSDism at least. On Linux and BSD derivatives the
 man pages specifically mention the reparenting (needed for catching
 signals) but on Linux getppid() is specifically documented to return
 the correct value anyway.

 I'm just a random lurker here, and happened to catch the last bit of
 this thread.  Could one of you that understand this issue straighten
 something out for me?

 Every now and again we've been known to attach gdb to a production
 Postgres backend to troubleshoot problems.  Ya know, just trying to
 get an idea of what Postgres is actually doing via a backtrace.  This
 is always on Linux, BTW.

 Does this thread mean that the above no longer works with v9?  Or is
 this only on non-Linux systems, or did the patch Robert Haas committed
 fix this?  We're still using 8.1 (slowly moving to 8.4) in
 production, but have plans of picking up 9.x later in '11.  Just
 wondering if we need to actually be a bit more careful in the future?

The point of the patch was to improve cases where attaching gdb
*didn't* work well.  Any cases where it was already working for you
aren't going to be made worse by this.
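If you want to check the behaviour on your own platform, a trivial test
program is enough (nothing Postgres-specific about it):

#include <stdio.h>
#include <unistd.h>

/*
 * Print our pid and parent pid, then wait.  Attach a debugger and call
 * getppid() from it to see whether the tracer shows up as the parent
 * (BSD/OS X) or the original parent is still reported (Linux).
 */
int
main(void)
{
    printf("pid=%d ppid=%d\n", (int) getpid(), (int) getppid());
    pause();                    /* sit here so a debugger can attach */
    return 0;
}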

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Eric Ridge
On Tue, Dec 21, 2010 at 2:33 PM, Robert Haas robertmh...@gmail.com wrote:
 The point of the patch was to improve cases where attaching gdb
 *didn't* work well.  Any cases where it was already working for you
 aren't going to be made worse by this.

Okay, great.  Thanks for the clarification.

eric

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Heikki Linnakangas

On 21.12.2010 21:25, Jesper Krogh wrote:

The first query should have the cost of the GIN search +
visibility-testing of 158K tuples;
the latter should have the cost of visibility-testing 168K tuples. If
we set the cost
of actually searching GIN to 0, then the visibility part of the GIN query
costs 95/158000 ≈ 0.000373 ms/tuple,
where the seq-scan case costs close to 0.001 ms/tuple (close to 3 times
as much).

So I have a strong feeling that GIN is cheating on the visibility tests;
otherwise I have trouble imagining how it can ever be faster to
execute
than the seq scan of the table.

Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for
visibility-testing?


It certainly shouldn't be.


What have I missed in the logic?


Perhaps you have a lot of empty space or dead tuples that don't match 
the query in the table, which the sequential scan has to grovel through, 
but the bitmap scan skips? What does EXPLAIN ANALYZE of both queries say?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar dic 21 08:40:49 -0300 2010:

  Well, non-developers don't tend to attach gdb very often.  Alvaro
  mentioned a problem installation upthread, thus the question.
 
 Hearing no cries of please-oh-please-backpatch-this, I've committed
 it just to master.

Please-oh-please backpatch this ... at least to 8.4.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Tomáš Mudruňka

Thanks for your answers :-)
Well... I know that I can write my own plugin and I am familiar with C, so
that is not the problem, but I think such a feature should be
implemented directly in PgSQL: there are already functions for
converting to/from base 16, so why not make this more flexible and
generalize it to any other radix? It's quite simple to do, and I don't see
any reason why 16 should be there and 8, 32 or 36 shouldn't :-)
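
Just to show how little code it is -- a rough sketch in C, leaning on
strtol() (which already accepts radixes 2..36); the function name and
error handling are made up, not a proposal for the real SQL function:

#include <stdlib.h>

/*
 * Sketch only: decode a string in any radix from 2 to 36 into a long.
 * strtol() already understands digits 0-9 plus letters a-z/A-Z, so the
 * generalization to other radixes comes for free.
 */
static long
decode_radix(const char *str, int radix, int *ok)
{
    char   *end;
    long    val = strtol(str, &end, radix);

    /* a real SQL-callable version would report errors properly */
    *ok = (end != str && *end == '\0');
    return val;
}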

peace

On Tue, 21 Dec 2010 15:04:03 +0100, Florian Pflug f...@phlo.org wrote:
 On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there possibility of having internal base converting function in
 PgSQL?
 There are already functions for converting between decimal and
 hexadecimal
 notations i think pgsql can be able to convert between number with
 radixes
 from 1 to 36 (actually fast (de)encoding base36 is what i need)...
 
 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().
 
 If you're not comfortable doing this in C, you might also want to
consider
 one of procedural languages pl/pgsql, pl/perl, pl/python. pl/pgsql is
 probably
 only viable if you just need this for ints and bigints, unless you don't
 care about performance.
 
 best regards,
 Florian Pflug

-- 
S pozdravem
Best regards
   Tomáš Mudruňka - Spoje.net / Arachne Labs

XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Andres Freund
On Tuesday 21 December 2010 20:25:16 Jesper Krogh wrote:
 What have I missed in the logic?
A reproducible testcase ;-)

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Stehule
Hello

Dne 21. prosince 2010 21:11 Tomáš Mudruňka to...@mudrunka.cz napsal(a):

 Thanks for your answers :-)
 Well... i know that i can write my own plugin and i am familiar with C so
 this is not the problem, but i think that such feature should be
 implemented directly in PgSQL because there are already functions for
 converting to/from base 16 so why don't make this more flexible and
 generalize it to any other radix? It's quite simple to do and i don't see
 any reason why 16 should be there and 8, 32 or 36 shouldn't :-)


* It isn't a typical or frequent request,
* There are no hard barriers to a custom implementation,
* You can use plperl- or plpython-based solutions,
* It's not part of ANSI SQL

Regards

Pavel Stehule

 peace

 On Tue, 21 Dec 2010 15:04:03 +0100, Florian Pflug f...@phlo.org wrote:
 On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there possibility of having internal base converting function in
 PgSQL?
 There are already functions for converting between decimal and
 hexadecimal
 notations i think pgsql can be able to convert between number with
 radixes
 from 1 to 36 (actually fast (de)encoding base36 is what i need)...

 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().

 If you're not comfortable doing this in C, you might also want to
 consider
 one of procedural languages pl/pgsql, pl/perl, pl/python. pl/pgsql is
 probably
 only viable if you just need this for ints and bigints, unless you don't
 care about performance.

 best regards,
 Florian Pflug

 --
 S pozdravem
 Best regards
   Tomáš Mudruňka - Spoje.net / Arachne Labs

 XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE behaviour

2010-12-21 Thread Peter Eisentraut
On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
 On 2010-11-12 8:25 PM +0200, I wrote:
  I'm going to take some time off this weekend to get a patch with this
  behaviour to the next commitfest.
 
 .. and a wild patch appears.
 
 This is almost exactly the patch from 2010-02 without 
 CommandCounterIncrement()s.  It's still a bit rough around the edges and 
 needs some more comments, but I'm posting it here anyway.

To pick up an earlier thread again, has any serious thought been given
to adapting the SQL2001/DB2 syntax instead of our own?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Comment typo in nodeWindowAgg.c

2010-12-21 Thread Andreas Karlsson
Hi,

Found a couple of small typos in the comments of nodeWindowAgg.c where
they refer to functions in nodeAgg.c. The comments drop the plural from
the function names (initialize_aggregates and advance_aggregates). The
reference to finalize_aggregate is correct though.


diff --git a/src/backend/executor/nodeWindowAgg.c
b/src/backend/executor/nodeWindowAgg.c
index c3efe12..51f98c1 100644
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
*** static bool window_gettupleslot(WindowOb
*** 181,187 
  
  /*
   * initialize_windowaggregate
!  * parallel to initialize_aggregate in nodeAgg.c
   */
  static void
  initialize_windowaggregate(WindowAggState *winstate,
--- 181,187 
  
  /*
   * initialize_windowaggregate
!  * parallel to initialize_aggregates in nodeAgg.c
   */
  static void
  initialize_windowaggregate(WindowAggState *winstate,
*** initialize_windowaggregate(WindowAggStat
*** 207,213 
  
  /*
   * advance_windowaggregate
!  * parallel to advance_aggregate in nodeAgg.c
   */
  static void
  advance_windowaggregate(WindowAggState *winstate,
--- 207,213 
  
  /*
   * advance_windowaggregate
!  * parallel to advance_aggregates in nodeAgg.c
   */
  static void
  advance_windowaggregate(WindowAggState *winstate,

Regards,
Andreas Karlsson



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wCTE behaviour

2010-12-21 Thread David Fetter
On Tue, Dec 21, 2010 at 11:14:31PM +0200, Peter Eisentraut wrote:
 On sön, 2010-11-14 at 04:45 +0200, Marko Tiikkaja wrote:
  On 2010-11-12 8:25 PM +0200, I wrote:
   I'm going to take some time off this weekend to get a patch with this
   behaviour to the next commitfest.
  
  .. and a wild patch appears.
  
  This is almost exactly the patch from 2010-02 without 
  CommandCounterIncrement()s.  It's still a bit rough around the edges and 
  needs some more comments, but I'm posting it here anyway.
 
 To pick up an earlier thread again, has any serious thought been given
 to adapting the SQL2001/DB2 syntax instead of our own?

Yes, and it's a good deal more limited and less intuitive than ours.

This is one place where we got it right and the standard just got
pushed into doing whatever IBM did.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Golub
Hello, Pavel.

You wrote:

PS Hello

PS Dne 21. prosince 2010 21:11 Tomáš Mudruňka to...@mudrunka.cz napsal(a):

 Thanks for your answers :-)
 Well... i know that i can write my own plugin and i am familiar with C so
 this is not the problem, but i think that such feature should be
 implemented directly in PgSQL because there are already functions for
 converting to/from base 16 so why don't make this more flexible and
 generalize it to any other radix? It's quite simple to do and i don't see
 any reason why 16 should be there and 8, 32 or 36 shouldn't :-)


PS * It isn't a typical and often request,
PS * There are not hard breaks for custom implementation,
PS * You can use plperl or plpython based solutions,
PS * It's not part of ANSI SQL

But MySQL has such a function. What's wrong with us? ;)

PS Regards

PS Pavel Stehule

 peace

 On Tue, 21 Dec 2010 15:04:03 +0100, Florian Pflug f...@phlo.org wrote:
 On Dec21, 2010, at 12:48 , Robert Haas wrote:
 2010/12/21 Tomáš Mudruňka to...@mudrunka.cz:
 Is there possibility of having internal base converting function in
 PgSQL?
 There are already functions for converting between decimal and
 hexadecimal
 notations i think pgsql can be able to convert between number with
 radixes
 from 1 to 36 (actually fast (de)encoding base36 is what i need)...

 It should be pretty easy to write such a function in C, perhaps using
 strtol() or strtoul().

 If you're not comfortable doing this in C, you might also want to
 consider
 one of procedural languages pl/pgsql, pl/perl, pl/python. pl/pgsql is
 probably
 only viable if you just need this for ints and bigints, unless you don't
 care about performance.

 best regards,
 Florian Pflug

 --
 S pozdravem
 Best regards
   Tomáš Mudruňka - Spoje.net / Arachne Labs

 XMPP/Jabber: har...@jabbim.cz, ICQ: 283782978

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers





-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Kenneth Marshall
On Tue, Dec 21, 2010 at 11:28:17PM +0200, Pavel Golub wrote:
 Hello, Pavel.
 
 You wrote:
 
 PS Hello
 
 PS Dne 21. prosince 2010 21:11 Tomáš Mudruňka to...@mudrunka.cz 
 napsal(a):
 
  Thanks for your answers :-)
  Well... i know that i can write my own plugin and i am familiar with C so
  this is not the problem, but i think that such feature should be
  implemented directly in PgSQL because there are already functions for
  converting to/from base 16 so why don't make this more flexible and
  generalize it to any other radix? It's quite simple to do and i don't see
  any reason why 16 should be there and 8, 32 or 36 shouldn't :-)
 
 
 PS * It isn't a typical and often request,
 PS * There are not hard breaks for custom implementation,
 PS * You can use plperl or plpython based solutions,
 PS * It's not part of ANSI SQL
 
 But MySQL has such a function. What's wrong with us? ;)
 

You are not really helping to make a good case... :)

Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Andrew Dunstan



On 12/21/2010 04:28 PM, Pavel Golub wrote:


PS  * It isn't a typical and often request,
PS  * There are not hard breaks for custom implementation,
PS  * You can use plperl or plpython based solutions,
PS  * It's not part of ANSI SQL

But MySQL has such a function. What's wrong with us? ;)



Our aim is not to duplicate everything in MySQL.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Pavel Golub
Hello.

Guys, guys! It was only a joke! :)

Please accept my apologies.

Anyway, I find such a function useful, even though I haven't yet run into
a situation where it might be needed.
You wrote:



AD On 12/21/2010 04:28 PM, Pavel Golub wrote:

 PS  * It isn't a typical and often request,
 PS  * There are not hard breaks for custom implementation,
 PS  * You can use plperl or plpython based solutions,
 PS  * It's not part of ANSI SQL

 But MySQL has such a function. What's wrong with us? ;)


AD Our aim is not to duplicate everything in MySQL.

AD cheers

AD andrew



-- 
With best wishes,
 Pavel  mailto:pa...@gf.microolap.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FeatureRequest] Base Convert Function

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 4:57 PM, Pavel Golub pa...@microolap.com wrote:
 Anyway, I find such a function useful, even though I haven't yet run into
 a situation where it might be needed.

Yeah, I agree.  I'm not sure we should add it to core, but it's
certainly just as useful as many things we have in contrib.  I'll bet
it would get at least as much use as the six argument form of
levenshtein_less_equal().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory

2010-12-21 Thread Quan Zongliang
On Mon, 29 Nov 2010 10:29:17 -0300
Alvaro Herrera alvhe...@commandprompt.com wrote:

 Excerpts from Quan Zongliang's message of sáb nov 27 06:03:12 -0300 2010:
  Hi, all
  
  I created a pg_ctl patch to fix:
  * BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory 
  Allow pg_ctl to work properly with configuration files located outside the 
  PGDATA directory
 
 I think the way this should work is that you call postmaster with a new
 switch and it prints out its configuration, after reading the
 appropriate config file(s).  That way it handles all the little details
 such as figuring out the correct config file, handle include files, etc.
 This output would be presumably easier to parse and more trustworthy.
 
 Right now we have --describe-config, which is missing the values for
 each config option.
 

Sorry for my late reply.

I will check the source of postmaster.


-- 
Quan Zongliang quanzongli...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Comment typo in nodeWindowAgg.c

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 4:17 PM, Andreas Karlsson andr...@proxel.se wrote:
 Found a couple of small typos in the comments of nodeWindowAgg.c when
 they refer to functions in nodeAgg.c. The pluralities of the function
 names (initialize_aggregates and advance_aggregates) are wrong. The
 reference to finalize_aggregate is correct though.

Committed, thanks.  But please attach patches rather than including them inline.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
 When the caller knows the smaller string length, memcmp and strncmp are
 functionally equivalent.  Since memcmp need not watch each byte for a NULL
 terminator, it often compares a CPU word at a time for better performance.  
 The
 attached patch changes use of strncmp to memcmp where we have the length of 
 the
 shorter string.  I was most interested in the varlena.c instances, but I tried
 to find all applicable call sites.  To benchmark it, I used the attached
 bench-texteq.sql.  This patch improved my 5-run average timing of the SELECT
 from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where the
 change should be pessimal.

This is a good idea.  I will check this over and commit it.
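
For anyone following along, the shape of the rewrite is simply this
(illustrative names, not code copied from the patch):

#include <stdbool.h>
#include <string.h>

/*
 * Equality test on two length-counted strings (e.g. text datums).
 * When len is the length of the shorter string, strncmp() and memcmp()
 * give the same answer, but memcmp() need not test every byte for a
 * NUL terminator and can compare a word at a time.
 */
static bool
bytes_equal(const char *a, const char *b, size_t len)
{
    /* before: return strncmp(a, b, len) == 0; */
    return memcmp(a, b, len) == 0;
}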

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21.12.2010 21:25, Jesper Krogh wrote:
 Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for
 visibility-testing?

 It certainly shouldn't be.

 What have I missed in the logic?

 Perhaps you have a lot of empty space or dead tuples that don't match 
 the query in the table, which the sequential scan has to grovel through, 
 but the bitmap scan skips? What does EXPLAIN ANALYZE of both queries say?

Another possibility is that the seqscan is slowed by trying to operate
in a limited number of buffers (the buffer strategy stuff).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 
 *) what's a good way to stress the clog severely? I'd like to pick
 a degenerate case to get a better idea of the way things stand
 without them.
 
The worst I can think of is a large database with a 90/10 mix of
reads to writes -- all short transactions.  Maybe someone else can
do better.  In particular, I'm not sure how savepoints might play
into a degenerate case.
 
Since we're always talking about how to do better with hint bits
during an unlogged bulk load, it would be interesting to benchmark
one of those followed by a `select count(*) from newtable;` with and
without the patch, on a data set too big to fit in RAM.
 
 *) is there community interest in a full patch that fills in the
 missing details not implemented here?
 
I'm certainly curious to see real numbers.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Mark Kirkwood

On 22/12/10 11:42, Merlin Moncure wrote:

Attached is an incomplete patch disabling hint bits based on compile
switch.  It's not complete, for example it's not reconciling some
assumptions in heapam.c that hint bits have been set in various
routines.  However, it mostly passes regression and I deemed it good
enough to run some preliminary benchmarks and fool around.  Obviously,
hint bits are an annoying impediment to a couple of other cool pending
features, and it certainly would be nice to operate without them.
Also, for particular workloads, the extra I/O from hint bits can cause a
fair amount of pain.


Looks like a great idea to test; however, I don't seem to be able to
compile with it applied (I set #define DISABLE_HINT_BITS 1 at the end of
src/include/pg_config_manual.h):


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing 
-fwrapv -g -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o heapam.c

heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in 
this function)

heapam.c:3867: error: (Each undeclared identifier is reported only once
heapam.c:3867: error: for each function it appears in.)
heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in this 
function)

make[4]: *** [heapam.o] Error 1


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Mark Kirkwood

On 22/12/10 13:05, Mark Kirkwood wrote:

On 22/12/10 11:42, Merlin Moncure wrote:

Attached is an incomplete patch disabling hint bits based on compile
switch.  It's not complete, for example it's not reconciling some
assumptions in heapam.c that hint bits have been set in various
routines.  However, it mostly passes regression and I deemed it good
enough to run some preliminary benchmarks and fool around.  Obviously,
hint bits are an annoying impediment to a couple of other cool pending
features, and it certainly would be nice to operate without them.
Also, for particular workloads, the extra I/O from hint bits can cause a
fair amount of pain.


Looks like a great idea to test, however I don't seem to be able to 
compile with it applied: (set #define DISABLE_HINT_BITS 1 at the end of 
src/include/pg_config_manual.h)


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing 
-fwrapv -g -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o 
heapam.c

heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in 
this function)

heapam.c:3867: error: (Each undeclared identifier is reported only once
heapam.c:3867: error: for each function it appears in.)
heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in 
this function)

make[4]: *** [heapam.o] Error 1



Arrg, sorry - against git head on Ubuntu 10.03 (gcc 4.4.3)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 7:06 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:
 On 22/12/10 13:05, Mark Kirkwood wrote:

 On 22/12/10 11:42, Merlin Moncure wrote:

 Attached is an incomplete patch disabling hint bits based on compile
 switch.  It's not complete, for example it's not reconciling some
 assumptions in heapam.c that hint bits have been set in various
 routines.  However, it mostly passes regression and I deemed it good
 enough to run some preliminary benchmarks and fool around.  Obviously,
 hint bits are an annoying impediment to a couple of other cool pending
 features, and it certainly would be nice to operate without them.
 Also, for particular workloads, the extra I/O from hint bits can cause a
 fair amount of pain.

 Looks like a great idea to test, however I don't seem to be able to
 compile with it applied: (set #define DISABLE_HINT_BITS 1 at the end of
 src/include/pg_config_manual.h)

 gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g
 -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o heapam.c
 heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
 heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in this
 function)
 heapam.c:3867: error: (Each undeclared identifier is reported only once
 heapam.c:3867: error: for each function it appears in.)
 heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in this
 function)
 make[4]: *** [heapam.o] Error 1


 Arrg, sorry - against git head on Ubuntu 10.03 (gcc 4.4.3)

Did you check to see if the patch applied cleanly? BTW, I was working
against postgresql-9.0.1...

it looks like you are missing at least some of the changes to htup.h:

../postgresql-9.0.1_hb2/src/include/access/htup.h

#ifndef DISABLE_HINT_BITS
#define HEAP_XMIN_COMMITTED 0x0100  /* t_xmin committed */
#define HEAP_XMIN_INVALID   0x0200  /* t_xmin invalid/aborted */
#define HEAP_XMAX_COMMITTED 0x0400  /* t_xmax committed */
#define HEAP_XMAX_INVALID   0x0800  /* t_xmax invalid/aborted */
#endif

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 7:20 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 7:06 PM, Mark Kirkwood
 mark.kirkw...@catalyst.net.nz wrote:
 On 22/12/10 13:05, Mark Kirkwood wrote:

 On 22/12/10 11:42, Merlin Moncure wrote:

 Attached is an incomplete patch disabling hint bits based on compile
 switch.  It's not complete, for example it's not reconciling some
 assumptions in heapam.c that hint bits have been set in various
 routines.  However, it mostly passes regression and I deemed it good
 enough to run some preliminary benchmarks and fool around.  Obviously,
 hint bits are an annoying impediment to a couple of other cool pending
 features, and it certainly would be nice to operate without them.
 Also, for particular workloads, the extra i/o hint bits can cause a
 fair amount of pain.

 Looks like a great idea to test, however I don't seem to be able to
 compile with it applied: (set #define DISABLE_HINT_BITS 1 at the end of
 src/include/pg_config_manual.h)

 gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
 -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g
 -I../../../../src/include -D_GNU_SOURCE -c -o heapam.o heapam.c
 heapam.c: In function ‘HeapTupleHeaderAdvanceLatestRemovedXid’:
 heapam.c:3867: error: ‘HEAP_XMIN_COMMITTED’ undeclared (first use in this
 function)
 heapam.c:3867: error: (Each undeclared identifier is reported only once
 heapam.c:3867: error: for each function it appears in.)
 heapam.c:3869: error: ‘HEAP_XMIN_INVALID’ undeclared (first use in this
 function)
 make[4]: *** [heapam.o] Error 1


 Arrg, sorry - against git head on Ubuntu 10.03 (gcc 4.4.3)

 did you check to see if the patch applied clean? btw I was working
 against postgresql-9.0.1...

Ah, this is the problem (9.0.1 vs. head).  To work against head it probably
needs a few more tweaks.  You can also try removing them yourself --
most of the changes follow a similar pattern.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 Attached is an incomplete patch disabling hint bits based on compile
 switch. ...
 So far, at least doing pgbench runs and another test designed to
 exercise clog lookups, the performance loss of always doing full
 lookup hasn't materialized.

The standard pgbench test would be just about 100% useless for stressing
this, because its net database activity is only about one row
touched/updated per query.  You need a test case that hits lots of rows
per query, else you're just measuring parse+plan+network overhead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Merlin Moncure
On Tue, Dec 21, 2010 at 7:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Attached is an incomplete patch disabling hint bits based on compile
 switch. ...
 So far, at least doing pgbench runs and another test designed to
 exercise clog lookups, the performance loss of always doing full
 lookup hasn't materialized.

 The standard pgbench test would be just about 100% useless for stressing
 this, because its net database activity is only about one row
 touched/updated per query.  You need a test case that hits lots of rows
 per query, else you're just measuring parse+plan+network overhead.

right -- see the attached clog_stress.sql above.  It creates a script
that inserts records in blocks of 1, deletes half of them, and
vacuums.  Neither the execution of the script nor a seq scan following
its execution showed an interesting performance difference (which I am
arbitrarily calling 5% in either direction).  Like I said though, I
don't trust the patch or the results yet.

@Mark: apparently the cvs server is behind git and there are some
recent changes to heapam.c that need more attention.  I need to get
git going on my box, but try changing this:

if ((tuple->t_infomask & HEAP_XMIN_COMMITTED) ||
    (!(tuple->t_infomask & HEAP_XMIN_COMMITTED) &&
     !(tuple->t_infomask & HEAP_XMIN_INVALID) &&
     TransactionIdDidCommit(xmin)))

to this:

if (TransactionIdDidCommit(xmin))

Also, isn't the extra check against HEAP_XMIN_COMMITTED redundant? And if
you do have to look up clog, why not set the hint bit?
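
Roughly this shape, I mean (a sketch with made-up names, not the real
tqual.c code, which also has to worry about when it is safe to dirty
the page):

/*
 * Sketch: once a clog lookup was unavoidable anyway, cache its outcome
 * in the tuple header so the next reader can skip the lookup.
 */
static void
remember_xmin_status(HeapTupleHeader tuple, TransactionId xmin)
{
    if (TransactionIdDidCommit(xmin))
        tuple->t_infomask |= HEAP_XMIN_COMMITTED;
    else if (TransactionIdDidAbort(xmin))
        tuple->t_infomask |= HEAP_XMIN_INVALID;
}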

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Fujii Masao
On Sat, Dec 18, 2010 at 1:00 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Dec 17, 2010 at 10:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think the attached might be a little tidier.  Thoughts?

 I'm not really thrilled at the idea of calling
 IsPostmasterChildWalSender for every child whether or not it will have
 any impact on the decision.  That involves touching shared memory which
 can be rather expensive (see previous discussions about shared cache
 lines and so forth).

 The existing code already does that, unless I'm missing something.  We
 could improve on my proposed patch a bit by doing the is_autovacuum
 test first and the walsender test second.  I'm not sure how to improve
 on it beyond that.

How about doing the target != ALL test at the head, to handle the most
common case (target == ALL) cheaply? I added that test to your patch and
changed it so that the is_autovacuum test is done first.
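
In other words, roughly this shape for the loop body (assuming the
BACKEND_TYPE_* flags postmaster.c already uses):

/*
 * Cheap test on target first: the common target == BACKEND_TYPE_ALL
 * case never inspects shared memory at all.
 */
if (target != BACKEND_TYPE_ALL)
{
    int     child;

    if (bp->is_autovacuum)
        child = BACKEND_TYPE_AUTOVAC;       /* local flag, cheap */
    else if (IsPostmasterChildWalSender(bp->child_slot))
        child = BACKEND_TYPE_WALSND;        /* touches shared memory */
    else
        child = BACKEND_TYPE_NORMAL;

    if (!(target & child))
        continue;
}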

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


signal-some-children-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch BUG #5103: pg_ctl -w (re)start fails with custom unix_socket_directory

2010-12-21 Thread Alvaro Herrera
Excerpts from Quan Zongliang's message of mar dic 21 18:36:11 -0300 2010:
 On Mon, 29 Nov 2010 10:29:17 -0300
 Alvaro Herrera alvhe...@commandprompt.com wrote:
 

  I think the way this should work is that you call postmaster with a new
  switch and it prints out its configuration, after reading the
  appropriate config file(s).  That way it handles all the little details
  such as figuring out the correct config file, handle include files, etc.
  This output would be presumably easier to parse and more trustworthy.
 
 Sorry for my late reply.
 
 I will check the source of postmaster.

Actually Bruce Momjian is now working on a different fix:
unix_socket_directory would be added to postmaster.pid, allowing pg_ctl
to find it.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Gurjeet Singh
On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
  When the caller knows the smaller string length, memcmp and strncmp are
  functionally equivalent.  Since memcmp need not watch each byte for a
 NULL
  terminator, it often compares a CPU word at a time for better
 performance.  The
  attached patch changes use of strncmp to memcmp where we have the length
 of the
  shorter string.  I was most interested in the varlena.c instances, but I
 tried
  to find all applicable call sites.  To benchmark it, I used the attached
  bench-texteq.sql.  This patch improved my 5-run average timing of the
 SELECT
  from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where
 the
  change should be pessimal.

 This is a good idea.  I will check this over and commit it.


Doesn't this risk accessing bytes beyond the shorter string? Look at the
warning above the StrNCpy(), for example.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Rob Wultsch
On Tue, Dec 21, 2010 at 4:49 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasby j...@nasby.net wrote:
 On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:
 Does postgres make an effort to create a file with physically continuous 
 blocks?

 AFAIK all files are expanded as needed. I don't think there's any flags you 
 can pass to the filesystem to tell it this file will eventually be 1GB in 
 size. So, we're basically at the mercy of the FS to try and keep things 
 contiguous.

 There have been some reports that we would do better on some
 filesystems if we extended the file by more than one block at a time,
 rather than the single block we add today.  However, AFAIK, no one is
 pursuing this ATM.



This has been found to be the case in the MySQL world, particularly
when ext3 is in use:
http://forge.mysql.com/worklog/task.php?id=4925
http://www.facebook.com/note.php?note_id=194501560932


Also, InnoDB has an option for how much data should be allocated at
the end of a tablespace when it needs to grow:
http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 8:29 PM, Gurjeet Singh singh.gurj...@gmail.com wrote:
 On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
  When the caller knows the smaller string length, memcmp and strncmp are
  functionally equivalent.  Since memcmp need not watch each byte for a
  NULL
  terminator, it often compares a CPU word at a time for better
  performance.  The
  attached patch changes use of strncmp to memcmp where we have the length
  of the
  shorter string.  I was most interested in the varlena.c instances, but I
  tried
  to find all applicable call sites.  To benchmark it, I used the attached
  bench-texteq.sql.  This patch improved my 5-run average timing of the
  SELECT
  from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where
  the
  change should be pessimal.

 This is a good idea.  I will check this over and commit it.

 Doesn't this risk accessing bytes beyond the shorter string?

If it's done properly, I don't see how this would be a risk.

 Look at the
 warning above the StrNCpy(), for example.

If you're talking about this comment:

 *  BTW: when you need to copy a non-null-terminated string (like a text
 *  datum) and add a null, do not do it with StrNCpy(..., len+1).  That
 *  might seem to work, but it fetches one byte more than there is in the
 *  text object.

...then that's not applicable here.  It's perfectly safe to compare two
strings of length n using an n-byte memcmp().  The bytes being
compared are 0 through n - 1; the terminating null is in byte n, or
else it isn't, but memcmp() certainly isn't going to look at it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Gurjeet Singh
On Tue, Dec 21, 2010 at 9:01 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Dec 21, 2010 at 8:29 PM, Gurjeet Singh singh.gurj...@gmail.com
 wrote:
  On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com
 wrote:
 
  On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
   When the caller knows the smaller string length, memcmp and strncmp
 are
   functionally equivalent.  Since memcmp need not watch each byte for a
   NULL
   terminator, it often compares a CPU word at a time for better
   performance.  The
   attached patch changes use of strncmp to memcmp where we have the
 length
   of the
   shorter string.  I was most interested in the varlena.c instances, but
 I
   tried
   to find all applicable call sites.  To benchmark it, I used the
 attached
   bench-texteq.sql.  This patch improved my 5-run average timing of
 the
   SELECT
   from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where
   the
   change should be pessimal.
 
  This is a good idea.  I will check this over and commit it.
 
  Doesn't this risk accessing bytes beyond the shorter string?

 If it's done properly, I don't see how this would be a risk.

  Look at the
  warning above the StrNCpy(), for example.

 If you're talking about this comment:

  *  BTW: when you need to copy a non-null-terminated string (like a
 text
  *  datum) and add a null, do not do it with StrNCpy(..., len+1).  That
  *  might seem to work, but it fetches one byte more than there is in
 the
  *  text object.

 ...then that's not applicable here.  It's perfectly safe to compare to
 strings of length n using an n-byte memcmp().  The bytes being
 compared are 0 through n - 1; the terminating null is in byte n, or
 else it isn't, but memcmp() certainly isn't going to look at it.


I missed the part where Noah said ... where we have the length of the
*shorter* string. I agree we are safe here.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Mark Kirkwood

On 22/12/10 13:56, Merlin Moncure wrote:

On Tue, Dec 21, 2010 at 7:45 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

@Mark: apparently the cvs server is behind git and there are some
recent changes to heapam.c that need more attention.  I need to get
git going on my box, but try changing this:

if ((tuple->t_infomask & HEAP_XMIN_COMMITTED) ||
    (!(tuple->t_infomask & HEAP_XMIN_COMMITTED) &&
     !(tuple->t_infomask & HEAP_XMIN_INVALID) &&
     TransactionIdDidCommit(xmin)))

to this:

if (TransactionIdDidCommit(xmin))

Also, isn't the extra check against HEAP_XMIN_COMMITTED redundant? And if
you do have to look up clog, why not set the hint bit?



That gets it compiling.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 How about doing target != ALL test at the head for the most common case
 (target == ALL)?

That's an idea, but the test you propose implements it incorrectly.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 6:24 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 20, 2010 at 1:10 PM, Noah Misch n...@leadboat.com wrote:
 When the caller knows the smaller string length, memcmp and strncmp are
 functionally equivalent.  Since memcmp need not watch each byte for a NULL
 terminator, it often compares a CPU word at a time for better performance.  
 The
 attached patch changes use of strncmp to memcmp where we have the length of 
 the
 shorter string.  I was most interested in the varlena.c instances, but I 
 tried
 to find all applicable call sites.  To benchmark it, I used the attached
 bench-texteq.sql.  This patch improved my 5-run average timing of the 
 SELECT
 from 65.8s to 56.9s, a 13% improvement.  I can't think of a case where the
 change should be pessimal.

 This is a good idea.  I will check this over and commit it.

A little benchmarking reveals that on my system (MacOS X 10.6.5) it
appears that strncmp() is faster for a 4 character string, but
memcmp() is faster for a 5+ character string.  So I think most of
these are pretty clear wins, but I have reverted the changes to
src/backend/tsearch because I'm not entirely confident that lexemes
and affixes will be long enough on average for this to be a win there.
 Please feel free to resubmit that part with performance results
showing that it works out to a win.  Some of the ltree changes
produced compiler warnings, so I omitted those also.  Committed the
rest.
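
For anyone who wants to repeat the measurement, a toy harness along
these lines is enough (arbitrary parameters, and not exactly what I
ran):

#include <stdio.h>
#include <string.h>
#include <time.h>

int
main(void)
{
    char            a[] = "abcde", b[] = "abcdx";
    volatile size_t n = 5;      /* volatile defeats constant folding */
    long            i, sink = 0;
    clock_t         t0;

    t0 = clock();
    for (i = 0; i < 100000000L; i++)
        sink += strncmp(a, b, n);
    printf("strncmp: %.2fs\n", (double) (clock() - t0) / CLOCKS_PER_SEC);

    t0 = clock();
    for (i = 0; i < 100000000L; i++)
        sink += memcmp(a, b, n);
    printf("memcmp:  %.2fs\n", (double) (clock() - t0) / CLOCKS_PER_SEC);

    return (int) (sink & 1);    /* keep the loops from being elided */
}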

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If it's done properly, I don't see how this would be a risk.

I'm fairly uncomfortable about the broad swath and low return of this
patch.  Noah is assuming that none of these places are relying on
strncmp to stop short upon finding a null, and I don't believe that
that's a safe assumption in every single place.  Nor do I believe that
it's worth the effort of trying to prove it safe in most of those
places.

I think this might be a good idea in the varchar.c and varlena.c calls,
but I'd be inclined to leave the rest of the calls alone.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 10:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 If it's done properly, I don't see how this would be a risk.

 I'm fairly uncomfortable about the broad swath and low return of this
 patch.  Noah is assuming that none of these places are relying on
 strncmp to stop short upon finding a null, and I don't believe that
 that's a safe assumption in every single place.  Nor do I believe that
 it's worth the effort of trying to prove it safe in most of those
 places.

 I think this might be a good idea in the varchar.c and varlena.c calls,
 but I'd be inclined to leave the rest of the calls alone.

Eh, I already committed somewhat more than that.  I did think about
the concern which you raise.  It seems pretty clear that's not a
danger in readfuncs.c.  In the hstore and ltree cases, at least at
first blush, it appears to me that it would be downright broken for
someone to be counting on a null to terminate the comparison.  The
intent of these bits of code appears to be to do an equality comparison on a
string stored as a byte count + a byte string, rather than a
null-terminated cstring, so unless I'm misunderstanding something it's
more likely that the use of strncmp() would lead to a bug; the prior
coding doesn't look like it would be correct if NUL bytes were
possible.  The tsearch cases also appear to be safe in this regard,
but since I decided against committing those on other grounds I
haven't looked at them as carefully.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest wrap-up

2010-12-21 Thread Robert Haas
On Tue, Dec 21, 2010 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 15, 2010 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 - Writeable CTEs - I think we need Tom to pick this one up.
 - Fix snapshot taking inconsistencies - Ready for committer. Can any
 committer pick this up?

 Will take a look at these two also.

 Tom, what is your time frame on this?  I think we should wrap up the
 CF without these and bundle 9.1alpha3 unless you plan to get to this
 in the next day or two.

 We probably shouldn't hold up the alpha for these, if there are no
 other items outstanding.

OK.  I've moved them to the next CommitFest and marked this one closed.

*bangs gavel*

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in ts_rank_cd

2010-12-21 Thread Tom Lane
Sushant Sinha sushant...@gmail.com writes:
 There is a bug in ts_rank_cd. It does not correctly give rank when the
 query lexeme is the first one in the tsvector.

Hmm ... I cannot reproduce the behavior you're complaining of.
You say

 select ts_rank_cd(to_tsvector('english', 'abc sdd'),
 plainto_tsquery('english', 'abc'));   
  ts_rank_cd
 ------------
           0

but I get

regression=# select ts_rank_cd(to_tsvector('english', 'abc sdd'),
regression(# plainto_tsquery('english', 'abc'));   
 ts_rank_cd
------------
        0.1
(1 row)

 The problem is that the Cover finding algorithm ignores the lexeme at
 the 0th position,

As far as I can tell, there is no 0th position --- tsvector counts
positions from one.  The only way to see pos == 0 in the input to
Cover() is if the tsvector has been stripped of position information.
ts_rank_cd is documented to return 0 in that situation.  Your patch
would have the effect of causing it to return some nonzero, but quite
bogus, ranking.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncmp-memcmp when we know the shorter length

2010-12-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Dec 21, 2010 at 10:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm fairly uncomfortable about the broad swath and low return of this
 patch.  Noah is assuming that none of these places are relying on
 strncmp to stop short upon finding a null, and I don't believe that
 that's a safe assumption in every single place.  Nor do I believe that
 it's worth the effort of trying to prove it safe in most of those
 places.

 Eh, I already committed somewhat more than that.  I did think about
 the concern which you raise.

Okay ... I was arguing for not bothering to expend that effort, but
since you already did, it's a moot point.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperlu problem with utf8

2010-12-21 Thread Alex Hunsaker
On Mon, Dec 20, 2010 at 00:39, Alex Hunsaker bada...@gmail.com wrote:

 In further review over caffeine this morning I noticed there are a few
 places I missed: plperl_build_tuple_result(), plperl_modify_tuple()
 and Util.XS.

And here is v3, fixes the above and also makes sure to properly
encode/decode SPI arguments.  Tested on a latin1 database with latin1
columns and utf8 with utf8 columns.  Also passes make installcheck (of
course) and changes one or two things to make plperl.c warning free.


plperl_enc_v3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The cost of visibility testing? (gin-search)

2010-12-21 Thread Jesper Krogh

On 2010-12-21 21:28, Andres Freund wrote:
 On Tuesday 21 December 2010 20:25:16 Jesper Krogh wrote:
  What have I missed in the logic?
 A reproducible testcase ;-)

Yes, I did a complete dump/restore of the dataset and the numbers
looked as expected. So table bloat seems to be the problem/challenge.

I must have hit a strange situation where my table bloat was proportionally
significantly higher than my gin-index bloat.

Jesper

--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in SignalSomeChildren

2010-12-21 Thread Fujii Masao
On Wed, Dec 22, 2010 at 12:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 How about doing target != ALL test at the head for the most common case
 (target == ALL)?

 That's an idea, but the test you propose implements it incorrectly.

Thanks! I revised the patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


signal-some-children-v3.patch
Description: Binary data



Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Heikki Linnakangas

On 22.12.2010 03:45, Rob Wultsch wrote:

On Tue, Dec 21, 2010 at 4:49 AM, Robert Haasrobertmh...@gmail.com  wrote:

On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasbyj...@nasby.net  wrote:

On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:

Does postgres make an effort to create a file with physically continuous blocks?


AFAIK all files are expanded as needed. I don't think there are any flags you can pass to
the filesystem to tell it this file will eventually be 1GB in size. So, we're
basically at the mercy of the FS to try and keep things contiguous.


There have been some reports that we would do better on some
filesystems if we extended the file by more than a block at a time,
rather than block by block as we do today.  However, AFAIK, no one is
pursuing this ATM.


This has been found to be the case in the MySQL world, particularly
when ext3 is in use:
http://forge.mysql.com/worklog/task.php?id=4925
http://www.facebook.com/note.php?note_id=194501560932


These seem to be about extending the transaction log, and we already 
pre-allocate the WAL. The WAL is repeatedly fsync'd, so I can understand 
that extending that in small chunks would hurt performance a lot, as the 
filesystem needs to flush the metadata changes to disk at every commit. 
However, that's not an issue with extending data files, which are only
fsync'd at checkpoints.


It might well be advantageous to extend data files in larger chunks too, 
but it's probably nowhere near as important as with the WAL.



Also, InnoDB has an option for how much data should be allocated at
the end of a tablespace when it needs to grow:
http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path


Hmm, innodb_autoextend_increment seems more like what we're discussing 
here 
(http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment). 
If I'm reading that correctly, InnoDB defaults to extending files in 8MB 
chunks.
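
To make the idea concrete, here is a sketch of chunked extension (my
illustration, not PostgreSQL code; assumes posix_fallocate is available):
round the requested size up to the next chunk boundary, so the filesystem
can hand out larger contiguous extents and flush less metadata.

  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  #define BLCKSZ     8192
  #define CHUNK_BLKS 1024         /* 8MB chunks, like InnoDB's default */

  /* grow the file to the next multiple of CHUNK_BLKS * BLCKSZ */
  static int
  extend_in_chunks(int fd, off_t needed)
  {
      off_t current = lseek(fd, 0, SEEK_END);

      if (current < needed)
      {
          off_t chunk  = (off_t) CHUNK_BLKS * BLCKSZ;
          off_t target = ((needed + chunk - 1) / chunk) * chunk;

          /* posix_fallocate reserves real blocks, not a sparse hole */
          return posix_fallocate(fd, current, target - current);
      }
      return 0;
  }

  int
  main(void)
  {
      int fd = open("demo.dat", O_RDWR | O_CREAT, 0600);

      if (fd < 0)
          perror("open");
      else
      {
          if (extend_in_chunks(fd, 5 * BLCKSZ) != 0)
              fprintf(stderr, "extend_in_chunks failed\n");
          close(fd);
      }
      return 0;
  }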


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] How much do the hint bits help?

2010-12-21 Thread Heikki Linnakangas

On 22.12.2010 02:56, Merlin Moncure wrote:

On Tue, Dec 21, 2010 at 7:45 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

Merlin Moncuremmonc...@gmail.com  writes:

Attached is an incomplete patch disabling hint bits based on a compile
switch. ...
So far, at least doing pgbench runs and another test designed to
exercise clog lookups, the performance loss of always doing full
lookup hasn't materialized.


The standard pgbench test would be just about 100% useless for stressing
this, because its net database activity is only about one row
touched/updated per query.  You need a test case that hits lots of rows
per query, else you're just measuring parse+plan+network overhead.


right -- see the attached clog_stress.sql above.  It creates a script
that inserts records in blocks of 1, deletes half of them, and
vacuums.  Neither the execution of the script nor a seq scan following
its execution showed an interesting performance difference (which I am
arbitrarily calling 5% in either direction).  Like I said though, I
don't trust the patch or the results yet.


Make sure you have a good mix of different xids in the table, 
TransactionLogFetch has a one-item cache so repeatedly checking the same 
xid is much faster than the general case.
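
The pattern is simple enough to show standalone (sketch only; the real
cache lives in TransactionLogFetch in transam.c, and slow_clog_lookup
below is just a stand-in for the actual clog read):

  #include <stdio.h>

  typedef unsigned int TransactionId;

  static TransactionId cachedXid = 0;   /* 0 plays InvalidTransactionId */
  static int           cachedStatus;

  /* stand-in for the real SLRU/clog lookup */
  static int
  slow_clog_lookup(TransactionId xid)
  {
      return (int) (xid & 3);
  }

  static int
  transaction_log_fetch(TransactionId xid)
  {
      if (xid == cachedXid)             /* one-item cache hit */
          return cachedStatus;

      cachedStatus = slow_clog_lookup(xid);
      cachedXid = xid;
      return cachedStatus;
  }

  int
  main(void)
  {
      transaction_log_fetch(100);       /* miss: does the "clog" read */
      transaction_log_fetch(100);       /* hit: short-circuits */
      printf("%d\n", transaction_log_fetch(101)); /* miss again */
      return 0;
  }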


Perhaps run pgbench for a while, and then do SELECT COUNT(*) on the 
resulting tables.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Rob Wultsch
On Wed, Dec 22, 2010 at 12:15 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 22.12.2010 03:45, Rob Wultsch wrote:

 On Tue, Dec 21, 2010 at 4:49 AM, Robert Haasrobertmh...@gmail.com
  wrote:

 On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasbyj...@nasby.net  wrote:

 On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:

 Does postgres make an effort to create a file with physically
 continuous blocks?

 AFAIK all files are expanded as needed. I don't think there's any flags
 you can pass to the filesystem to tell it this file will eventually be 1GB
 in size. So, we're basically at the mercy of the FS to try and keep things
 contiguous.

 There have been some reports that we would do better on some
 filesystems if we extended the file more than a block at a time, as we
 do today.  However, AFAIK, no one is pursuing this ATM.

 This has been found to be the case in the MySQL world, particularly
 when ext3 is in use:
 http://forge.mysql.com/worklog/task.php?id=4925
 http://www.facebook.com/note.php?note_id=194501560932

 These seem to be about extending the transaction log, and we already
 pre-allocate the WAL. The WAL is repeatedly fsync'd, so I can understand
 that extending that in small chunks would hurt performance a lot, as the
 filesystem needs to flush the metadata changes to disk at every commit.
 However, that's not an issue with extending data files, they are only
 fsync'd at checkpoints.

 It might well be advantageous to extend data files in larger chunks too, but
 it's probably nowhere near as important as with the WAL.

Agree.

 Also, InnoDB has an option for how much data should be allocated at
 the end of a tablespace when it needs to grow:

 http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path

 Hmm, innodb_autoextend_increment seems more like what we're discussing here
 (http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment).
 If I'm reading that correctly, InnoDB defaults to extending files in 8MB
 chunks.

This is not pure apples to apples, as InnoDB does direct I/O. However,
doesn't the checkpoint completion target code call fsync repeatedly in
order to achieve the checkpoint completion target? And for that
matter, haven't there been recent discussions on -hackers about calling
fsync more often?

Sorry for the loopy email. I have not been getting anywhere near
enough sleep recently :(
-- 
Rob Wultsch
wult...@gmail.com



Re: [HACKERS] Can postgres create a file with physically continuous blocks.

2010-12-21 Thread Heikki Linnakangas

On 22.12.2010 09:25, Rob Wultsch wrote:

On Wed, Dec 22, 2010 at 12:15 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Hmm, innodb_autoextend_increment seems more like what we're discussing here
(http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment).
If I'm reading that correctly, InnoDB defaults to extending files in 8MB
chunks.


This is not pure apples to apples as InnoDB does direct io, however
doesn't the checkpoint completion target code call fsync repeatedly in
order to achieve the check point completion target?


It only fsync's each file once. If there are a lot of files, it needs to
issue a lot of fsync's, but each on a different file.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
