Re: [HACKERS] Many "loaded library" logs by preload libraries

2009-01-04 Thread Tom Lane
ITAGAKI Takahiro  writes:
> If we set shared_preload_libraries or local_preload_libraries to
> load some modules, "loaded library" messages are logged in the server
> log for every new connection and autovacuum worker.
> LOG:  loaded library "pg_stat_statements"

Yeah, I was noticing that myself while testing pg_stat_statements.
I agree that we should fix it to reduce the message level for reloads
occurring in child processes.  I'd suggest using DEBUG2 if
(IsUnderPostmaster && process_shared_preload_libraries_in_progress).
I'm not so enthused about eliminating messaging for 
local_preload_libraries, though.
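
Something like this, just as a sketch of the idea (not the actual dfmgr.c
code; "filename" stands in for whatever name gets reported):

#include "postgres.h"
#include "miscadmin.h"

/*
 * Sketch only: report a library load, demoting the message in child
 * processes that are merely re-loading a shared_preload_libraries entry
 * the postmaster already announced at startup.
 */
static void
report_library_loaded(const char *filename)
{
    int         elevel = LOG;

    if (IsUnderPostmaster && process_shared_preload_libraries_in_progress)
        elevel = DEBUG2;

    ereport(elevel,
            (errmsg("loaded library \"%s\"", filename)));
}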

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SPI nesting in plperl

2009-01-04 Thread Tom Lane
I tried fixing this
http://archives.postgresql.org/pgsql-general/2009-01/msg00030.php
by inserting SPI_push/SPI_pop calls around plperl's use of
InputFunctionCall and OutputFunctionCall.  Unfortunately it soon
turned into a mess, because there are various control paths through
that code and some arrive at the I/O calls inside a SPI context
while others don't.  We could probably fix it with a kluge or two
but it would be awfully fragile.  The reason for the inconsistency
is that the call handlers exit the SPI context before preparing
their results:

    /*
     * Disconnect from SPI manager and then create the return
     * values datum (if the input function does a palloc for it
     * this must not be allocated in the SPI memory context
     * because SPI_finish would free it).
     */
    if (SPI_finish() != SPI_OK_FINISH)
        elog(ERROR, "SPI_finish() failed");

It seems like a "clean" fix would involve moving the SPI_finish down
to the end and dealing with the problem mentioned in the comment by
paying an extra datumCopy cycle for pass-by-reference function results.
Which is annoying, though in the big scheme of things it's probably
not much compared to the overall overhead of a PL function.

I also thought about attacking the problem by having InputFunctionCall
and OutputFunctionCall automatically do SPI_push/SPI_pop if they are
called within an active SPI context.  I don't like this approach too
much because it seems likely to mask bugs as often as fix them.  (In
particular I'd be afraid to back-patch such a change.)  It might be the
cleanest solution overall, though, particularly when you consider that
we've probably got similar issues in pltcl, plpython, and add-on PLs.
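
For concreteness, the guarded-call idea looks roughly like this (a sketch
only, not code from plperl; how the in_spi_context flag gets determined --
here just an explicit parameter -- is exactly the hard part, whether the
call sites track it or InputFunctionCall/OutputFunctionCall peek at SPI's
connection state):

#include "postgres.h"
#include "executor/spi.h"
#include "fmgr.h"

/*
 * Sketch only: protect a datum-input conversion with SPI_push()/SPI_pop()
 * so the I/O function can itself use SPI without clobbering the PL's
 * connection.
 */
static Datum
convert_input_guarded(FmgrInfo *flinfo, char *str, Oid typioparam,
                      int32 typmod, bool in_spi_context)
{
    Datum       result;

    if (in_spi_context)
        SPI_push();

    result = InputFunctionCall(flinfo, str, typioparam, typmod);

    if (in_spi_context)
        SPI_pop();

    return result;
}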

Anyone have comments or better ideas?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1226

2009-01-04 Thread ITAGAKI Takahiro
Thank you for all the work.

Tom Lane  wrote:

> 1. The proposed change to track system/user CPU time presents an
> enormous cost, and no argument has been made to show that there is any
> comparable benefit.

> 2. I'm unconvinced by the proposed changes to accumulate backend-local
> I/O counters, too.  The fact of the matter is that those counters are
> left over from Berkeley days, a time when PG hackers tended to do their
> performance measurements in standalone backends (!).

Ok, I need to reconsider the performance and design of the I/O counters.
I still think that information is useful because we can determine
not only which query is bad, but also why it is bad,
*without repeating the query on production servers*.
But the reworking would have to be done in 8.5...


> * I changed the default track setting to "top".  I don't see the
> likelihood that someone would load this module into their server
> and not want it turned on.

Looks good. Setting shared_preload_libraries is enough for normal use.

> * I'm not entirely seeing the point of a server-wide tracking facility
> that only counts SELECT/INSERT/UPDATE/DELETE.  ISTM this should be
> modified to count utility commands too; which probably means adding some
> hooks around ProcessUtility (and what about autovacuum?).  I left that
> work for someone else to do, though.

Sure, but we should also make sure that utility commands do not kick out
DML queries, because maintenance commands typically take a long time.

> * As already mentioned I find the entry_dealloc logic pretty darn
> dubious; but I didn't touch that either in this go-round.  If we do
> keep it in its current form, ISTM that usage ought to be proportional
> to total execution time not total call count.

I don't think the current implementation is ideal either.
I also don't like using fixed-size shared memory. I'm thinking
that, in the future, shared memory used by extension modules could
be allocated from the shared buffer pool. If we could do that,
memory management would be more flexible and we could load
libraries after the server starts.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread KaiGai Kohei
Alvaro Herrera wrote:
> KaiGai Kohei wrote:
> 
>> (1) Who/Where should allocate a string area?
>>
>> + /* Note that this assumes that the variable is already allocated! */
>> + #define HANDLE_STRING_RELOPTION(optname, var, option) \
>> +   if (HAVE_RELOPTION(optname, option))\
>> +   {   \
>> +   strcpy(var, \
>> +  option.isset ? option.values.string_val :\
>> +  ((relopt_string *) option.gen)->default_val);\
>> +   continue;   \
>> +   }
>>
>> I think HANDLE_STRING_RELOPTION() should allocate the string area via
>> pstrdup(). If we have to put an individual pstrdup() after each reloption,
>> it will make the code noisy.
>>
>> What do you think of this:
>>
>> #define HANDLE_STRING_RELOPTION(optname, var, option) \
>> if (HAVE_RELOPTION(optname, option))  \
>> { \
>> char *tmp = (option.isset ? option.values.string_val :\
>> ((relopt_string *) option.gen)->default_val); \
>> var = pstrdup(tmp);   \
>> continue; \
>> }
> 
> Well, that's precisely the problem with string options.  If we want
> memory to be freed properly, we can only allocate a single chunk which
> is what's going to be stored under the rd_options bytea pointer.
> Allocating separately doesn't work because we need to rebuild the
> relcache entry (freeing it and allocating a new one) when it is
> invalidated for whatever reason.  Since the relcache code cannot follow
> a pointer stored in the bytea area, this would result in a permanent
> memory leak.
> 
> So the rule I came up with is that the caller is responsible for
> allocating it -- but it must be inside the bytea area to be returned.
> Below is a sample amoptions routine to show how it works.  Note that
> this is exactly why I said that only a single string option can be
> supported.

If the caller allocates a surplus area to store strings at the tail of
StdRdOptions (or similar structs), a string option can be represented as an
offset value and accessed via macros, like:

struct StdRdOptions
{
    int32   vl_len_;
    int     fillfactor;
    int     test_option_a;  /* offset of the string from the head of this
                             * structure, or 0 for NULL */
    int     test_option_b;
    /* the surplus string area is actually allocated here */
};

#define GetOptionString(ptr, ofs) \
    ((ptr)->ofs == 0 ? NULL : ((char *) (ptr) + (ptr)->ofs))

We can access string options as follows:

  elog(NOTICE, "test_option_a is %s", GetOptionString(RdOpts, test_option_a));
  elog(NOTICE, "test_option_b is %s", GetOptionString(RdOpts, test_option_b));

This makes it possible to store multiple string options and to represent a
NULL string.
If possible, HANDLE_STRING_RELOPTION() should be able to manage the head of
the unused surplus area and store the given value at that offset
automatically.

I think requiring a macro to access string options is a more reasonable
restriction than making string options mutually exclusive.
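
As an illustration of the automatic offset handling mentioned above (a
sketch only, not code from the patch; "base" is the StdRdOptions-style
struct, "member" the int field receiving the offset, "next_offset" the head
of the unused surplus area):

#define HANDLE_STRING_RELOPTION_OFS(optname, base, member, next_offset, option) \
    do { \
        if (HAVE_RELOPTION(optname, option)) \
        { \
            const char *src_ = (option).isset ? \
                (option).values.string_val : \
                ((relopt_string *) (option).gen)->default_val; \
            if (src_ == NULL) \
                (base)->member = 0;         /* 0 represents NULL */ \
            else \
            { \
                (base)->member = (next_offset); \
                strcpy((char *) (base) + (next_offset), src_); \
                (next_offset) += strlen(src_) + 1; \
            } \
        } \
    } while (0)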

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Export IsUnderPostmaster for pg_stat_statements on win32

2009-01-04 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
> Hi,
> 
> I compiled pg_stat_statements on mingw, and got the following error.
> PGDLLIMPORT for IsUnderPostmaster seems to be needed on win32.

Applied, thanks.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for basic window frame support

2009-01-04 Thread Tom Lane
"Hitoshi Harada"  writes:
> 2008/12/31 Tom Lane :
>> RANGE UNBOUNDED PRECEDING   -- same as AND CURRENT ROW
>> ROWS UNBOUNDED PRECEDING-- same as AND CURRENT ROW

> Is this true?

7.11 syntax rule 9 says so.  AFAICS general rule 5b doesn't discuss the
case where  isn't present, and doesn't need to
because of syntax rule 9.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PROPOSAL: geqo improvement

2009-01-04 Thread marcin mank
> It sounds like you're proposing to compare the time spent planning to
> the estimated execution time.  AFAICS, those things are unrelated, so
> I'm not sure what you hope to figure out by comparing them.

The idea is: If we are to spend a LOT of resources executing the
query, we might as well burn some cycles in hope of finding a better
plan.


> It sounds like you may have some concrete queries that suffer from
> this problem.  It might be helpful to post the queries and the good
> and bad plans.  It may be that the problem can be fixed with some
> tuning of the existing parameters.

Actually, no. This is my random thought based on observing some
threads where people get bad plans due to GEQO.


> The deadline for the final CommitFest was November 1st, so I think it
> is too late for 8.4.

ugh.. too bad. I'm still interested anyway :)

Marcin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PROPOSAL: geqo improvement

2009-01-04 Thread Robert Haas
2009/1/4 marcin mank :
> GEQO would decide that the plan is bad when the calculated cost of the
> plan exceeds the time spent planning so far by a fixed factor
> (100? a configurable parameter?).
> I think a function inferring cost from time spent could be derived
> from cpu_operator_cost - or is there a better way?

It sounds like you're proposing to compare the time spent planning to
the estimated execution time.  AFAICS, those things are unrelated, so
I'm not sure what you hope to figure out by comparing them.

> An alternative to restarting the search might be just extending it -
> running the main loop of the geqo() function longer.  I plan to restart
> because I'm afraid the real reason for getting bad plans could be that
> the algorithm is getting into some local minimum and can't get out. I
> will explore that more.

It sounds like you may have some concrete queries that suffer from
this problem.  It might be helpful to post the queries and the good
and bad plans.  It may be that the problem can be fixed with some
tuning of the existing parameters.

> If there is agreement to do this, it looks simple enough that I
> volunteer to implement it. Please tell me what the deadline is for
> this to make it into 8.4.

The deadline for the final CommitFest was November 1st, so I think it
is too late for 8.4.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Many "loaded library" logs by preload libraries

2009-01-04 Thread ITAGAKI Takahiro
Hi,

If we set shared_preload_libraries or local_preload_libraries to
load some modules, "loaded library" messages are logged in the server
log for every new connection and autovacuum worker.

LOG:  loaded library "pg_stat_statements"

Messages from shared_preload_libraries are logged only once
on non-EXEC_BACKEND platforms, but many times on Windows.
On the other hand, local_preload_libraries produces the message
many times on all platforms.

Is it too noisy? How about reducing the log level to DEBUG
and/or logging them only in the postmaster?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PROPOSAL: geqo improvement

2009-01-04 Thread marcin mank
Hello, List.

There are cases where GEQO returns a very bad plan in some rare
executions of a query. To decrease the likelihood of this happening, I
propose the following:

When GEQO detects that what it found is in fact a miserable plan, it
restarts the search. Simple math shows that if the probability of
finding a bad plan in one 'go' is p, the overall probability of ending
up with a bad plan after N attempts is p^N.
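
Spelled out (assuming the N attempts are independent and each one produces
a bad plan with probability p, and that a bad plan is always detected):

    P(\text{final plan is bad}) = P(\text{all } N \text{ attempts are bad}) = p^{N}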

GEQO would decide that the plan is bad when the calculated cost of the
plan exceeds the time spent planning so far by a fixed factor
(100? a configurable parameter?).
I think a function inferring cost from time spent could be derived
from cpu_operator_cost - or is there a better way?

As a safety measure I would limit the number of replannings to a fixed
value (10? 20? a configurable parameter?)
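
In rough code, the loop I have in mind would look something like this (only
a sketch; planning_time_as_cost() and the two constants are placeholders for
whatever we derive from cpu_operator_cost and the configuration):

#include "postgres.h"
#include "nodes/relation.h"
#include "optimizer/geqo.h"

#define GEQO_MAX_REPLANNINGS    10      /* safety limit, could be a GUC */
#define GEQO_BAD_PLAN_FACTOR    100.0   /* "miserable" threshold, could be a GUC */

/* placeholder: elapsed planning time converted into cost units */
extern double planning_time_as_cost(void);

/*
 * Sketch only: rerun GEQO while the best plan found so far still costs
 * disproportionately more than the time we have spent planning.
 */
static RelOptInfo *
geqo_with_restarts(PlannerInfo *root, int number_of_rels, List *initial_rels)
{
    RelOptInfo *best = NULL;
    int         attempt;

    for (attempt = 0; attempt < GEQO_MAX_REPLANNINGS; attempt++)
    {
        RelOptInfo *candidate = geqo(root, number_of_rels, initial_rels);

        if (best == NULL ||
            candidate->cheapest_total_path->total_cost <
            best->cheapest_total_path->total_cost)
            best = candidate;

        /* stop as soon as the plan no longer looks miserable */
        if (best->cheapest_total_path->total_cost <
            GEQO_BAD_PLAN_FACTOR * planning_time_as_cost())
            break;
    }

    return best;
}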

If I introduce some configuration variables, I plan to infer the
defaults from geqo_effort (no concrete plan for this now).

An alternative to restarting the search might be just extending it -
running the main loop of the geqo() function longer.  I plan to restart
because I'm afraid the real reason for getting bad plans could be that
the algorithm is getting into some local minimum and can't get out. I
will explore that more.

If there is agreement to do this, it looks simple enough that I
volunteer to implement it. Please tell me what the deadline is for
this to make it into 8.4.

What I lack are good test cases to verify the solution.

Greetings
Marcin Mańk

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)

2009-01-04 Thread Alvaro Herrera
> On Sat, Jan 3, 2009 at 17:56, Lasse Reinhold  wrote:

> > That sounds really exciting, I'd love to see QuickLZ included into
> > PostgreSQL. I'd be glad to offer support and add custom optimizations,
> > features or hacks or whatever should turn up.
> >
> > My only concern is to avoid undermining the commercial license, but this
> > can, as you suggest, be solved by exceptionally allowing QuickLZ to be
> > linked with PostgreSQL. Since I have exclusive copyright of QuickLZ any
> > construction is possible.

Hmm ... keep in mind that PostgreSQL is used as a base for a certain
number of commercial, non-BSD products (Greenplum, Netezza,
EnterpriseDB, Truviso, are the ones that come to mind).  Would this
exception allow for linking QuickLZ with them too?  It doesn't sound to
me like you're open to relicensing it under BSD, which puts us in an
uncomfortable position.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Export IsUnderPostmaster for pg_stat_statements on win32

2009-01-04 Thread ITAGAKI Takahiro
Hi,

I compiled pg_stat_statements on mingw, and got the following error.
PGDLLIMPORT for IsUnderPostmaster seems to be needed on win32.


Info: resolving _IsUnderPostmaster by linking to __imp__IsUnderPostmaster 
(auto-import)
fu01.o:(.idata$2+0xc): undefined reference to `libpostgres_a_iname'
nmth00.o:(.idata$4+0x0): undefined reference to `_nm__IsUnderPostmaster'


Patch attached.

Index: src/include/miscadmin.h
===
--- src/include/miscadmin.h (HEAD)
+++ src/include/miscadmin.h (fixed)
@@ -123,7 +123,7 @@
  */
 extern pid_t PostmasterPid;
 extern bool IsPostmasterEnvironment;
-extern bool IsUnderPostmaster;
+extern PGDLLIMPORT bool IsUnderPostmaster;
 
 extern bool ExitOnAnyError;
 

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Alvaro Herrera
Alex Hunsaker wrote:
> On Sun, Jan 4, 2009 at 15:01, Alvaro Herrera wrote:
> > Alvaro Herrera wrote:
> >
> >> Okay, it was basically fine except for the attached minor correction.
> >> Warning: I intend to commit this patch fairly soon!
> >
> > This is the patch in its final form.  I have included a few macros to
> > simplify the writing of amoptions routines.
> 
> Looks good to me, I just used it to whip up a patch to control some
> toast compression knobs..

Excellent, thanks for testing :-)

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Alvaro Herrera
KaiGai Kohei wrote:

> (1) Who/Where should allocate a string area?
> 
> + /* Note that this assumes that the variable is already allocated! */
> + #define HANDLE_STRING_RELOPTION(optname, var, option) \
> +   if (HAVE_RELOPTION(optname, option))\
> +   {   \
> +   strcpy(var, \
> +  option.isset ? option.values.string_val :\
> +  ((relopt_string *) option.gen)->default_val);\
> +   continue;   \
> +   }
> 
> I think HANDLE_STRING_RELOPTION() should allocate the string area via
> pstrdup(). If we have to put an individual pstrdup() after each reloption,
> it will make the code noisy.
> 
> What do you think of this:
> 
> #define HANDLE_STRING_RELOPTION(optname, var, option) \
> if (HAVE_RELOPTION(optname, option))  \
> { \
> char *tmp = (option.isset ? option.values.string_val :\
> ((relopt_string *) option.gen)->default_val); \
> var = pstrdup(tmp);   \
> continue; \
> }

Well, that's precisely the problem with string options.  If we want
memory to be freed properly, we can only allocate a single chunk which
is what's going to be stored under the rd_options bytea pointer.
Allocating separately doesn't work because we need to rebuild the
relcache entry (freeing it and allocating a new one) when it is
invalidated for whatever reason.  Since the relcache code cannot follow
a pointer stored in the bytea area, this would result in a permanent
memory leak.

So the rule I came up with is that the caller is responsible for
allocating it -- but it must be inside the bytea area to be returned.
Below is a sample amoptions routine to show how it works.  Note that
this is exactly why I said that only a single string option can be
supported.

If you have a better idea, I'm all ears.

> (2) How does it represent NULL in string_option?
> 
> It seems to me we cannot represent a NULL string in the default.
> Is it possible to add a mark to indicate NULL, like "bool default_null"
> within struct relopt_string?

Ah, good point.  I'll have a look at this.

> (3) heap_reloptions() from RelationParseRelOptions() makes a trouble.

This is the same as (1) actually.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
/* must follow StdRdOptions conventions */
typedef struct BtOptions
{
int32   vl_len_;
int fillfactor;
charteststring[1];
} BtOptions;


Datum
btoptions(PG_FUNCTION_ARGS)
{
    Datum       reloptions = PG_GETARG_DATUM(0);
    bool        validate = PG_GETARG_BOOL(1);
    bytea      *result;
    relopt_value *options;
    int         numoptions;
    int         i;
    static bool initialized = false;

    if (!initialized)
    {
        add_string_reloption(RELOPT_KIND_BTREE, "teststring", NULL,
                             "helluva string here and there!");
        initialized = true;
    }

    options = parseRelOptions(reloptions, validate, RELOPT_KIND_BTREE,
                              &numoptions);

    /* if none set, we're done */
    if (numoptions == 0)
        result = NULL;
    else
    {
        BtOptions  *rdopts;
        int         tstrlen;

        for (i = 0; i < numoptions; i++)
        {
            if (HAVE_RELOPTION("teststring", options[i]))
            {
                tstrlen = options[i].isset ?
                    strlen(options[i].values.string_val) :
                    ((relopt_string *) options[i].gen)->default_len;
                break;
            }
        }

        rdopts = palloc0(sizeof(BtOptions) + tstrlen + 1);

        for (i = 0; i < numoptions; i++)
        {
            HANDLE_INT_RELOPTION("fillfactor", rdopts->fillfactor, options[i]);
            HANDLE_STRING_RELOPTION("teststring", rdopts->teststring, options[i]);
        }

        pfree(options);
        SET_VARSIZE(rdopts, sizeof(BtOptions) + tstrlen + 1);
        result = (bytea *) rdopts;
    }

    if (result)
        PG_RETURN_BYTEA_P(result);
    PG_RETURN_NULL();
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Significantly larger toast tables on 8.4?

2009-01-04 Thread Stephen R. van den Berg
James Mansion wrote:
>Peter Eisentraut wrote:
>>>c. Are there any well-known pitfalls/objections which would prevent me 
>>>from
>>>   changing the algorithm to something more efficient (read: IO-bound)?

>>copyright licenses and patents

>Would it be possible to have a plugin facility?

>I guess the most likely candidate is the LZJB mechanism in ZFS which is 
>CDDL licensed.

The most likely candidate for a speedy algorithm seems to be QuickLZ; the
author is willing to accommodate on the licensing.
-- 
Sincerely,
   Stephen R. van den Berg.

Climate is what you expect.  Weather is what you get.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)

2009-01-04 Thread Stephen R. van den Berg
I asked the author of the QuickLZ algorithm about licensing...
Sounds like he is willing to cooperate.  This is what I got from him:

On Sat, Jan 3, 2009 at 17:56, Lasse Reinhold  wrote:
> Hi Stephen,
>
> That sounds really exciting, I'd love to see QuickLZ included into
> PostgreSQL. I'd be glad to offer support and add custom optimizations,
> features or hacks or whatever should turn up.
>
> My only concern is to avoid undermining the commercial license, but this
> can, as you suggest, be solved by exceptionally allowing QuickLZ to be
> linked with PostgreSQL. Since I have exclusive copyright of QuickLZ any
> construction is possible.
>
> Greetings,
>
> Lasse Reinhold
> Developer
> http://www.quicklz.com/
> l...@quicklz.com
>
> On Sat Jan 3 15:46 , 'Stephen R. van den Berg' sent:
>
> PostgreSQL is the most advanced Open Source database at this moment, it is
> being distributed under a Berkeley license though.
>
> What if we'd like to use your QuickLZ algorithm in the PostgreSQL core
> to compress rows in the internal archive format (it's not going to be a
> compression algorithm which is exposed to the SQL level)?
> Is it conceivable that you'd allow us to use the algorithm free of charge
> and allow it to be distributed under the Berkeley license, as long as it
> is part of the PostgreSQL backend?
> --
> Sincerely,
> Stephen R. van den Berg.
>
> Expect the unexpected!
> )
>
>



-- 
Sincerely,
Stephen R. van den Berg.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread KaiGai Kohei
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> 
>> Okay, it was basically fine except for the attached minor correction.
>> Warning: I intend to commit this patch fairly soon!
> 
> This is the patch in its final form.  I have included a few macros to
> simplify the writing of amoptions routines.

Thanks for your efforts!

However, I found a few issues with string reloptions.

(1) Who/Where should allocate a string area?

+ /* Note that this assumes that the variable is already allocated! */
+ #define HANDLE_STRING_RELOPTION(optname, var, option) \
+   if (HAVE_RELOPTION(optname, option))\
+   {   \
+   strcpy(var, \
+  option.isset ? option.values.string_val :\
+  ((relopt_string *) option.gen)->default_val);\
+   continue;   \
+   }

I think HANDLE_STRING_RELOPTION() should allocate the string area via
pstrdup(). If we have to put an individual pstrdup() after each reloption,
it will make the code noisy.

What do you think of this:

#define HANDLE_STRING_RELOPTION(optname, var, option) \
if (HAVE_RELOPTION(optname, option))  \
{ \
char *tmp = (option.isset ? option.values.string_val :\
((relopt_string *) option.gen)->default_val); \
var = pstrdup(tmp);   \
continue; \
}

(2) How does it represent NULL in string_option?

It seems to me we cannot represent a NULL string in the default.
Is it possible to add a mark to indicate NULL, like "bool default_null"
within struct relopt_string?

(3) heap_reloptions() from RelationParseRelOptions() makes a trouble.

It invokes heap_reloptions() under CurrentMemoryContext, and its result
is later copied into CacheMemoryContext if it is not NULL.
But this causes a problem when StdRdOptions contains a pointer.
If a string is allocated under CurrentMemoryContext, the target of the
copied pointer is no longer valid at the next query execution, even though
StdRdOptions itself remains valid because it was copied to the other
memory context.

I think RelationParseRelOptions() should be patched as follows:

 	/* Parse into appropriate format; don't error out here */
+	oldctx = MemoryContextSwitchTo(CacheMemoryContext);
 	switch (relation->rd_rel->relkind)
 	{
 		case RELKIND_RELATION:
 		case RELKIND_TOASTVALUE:
 		case RELKIND_UNCATALOGED:
 			options = heap_reloptions(relation->rd_rel->relkind, datum,
 									  false);
 			break;
 		case RELKIND_INDEX:
 			options = index_reloptions(relation->rd_am->amoptions, datum,
 									   false);
 			break;
 		default:
 			Assert(false);		/* can't get here */
 			options = NULL;		/* keep compiler quiet */
 			break;
 	}
+	MemoryContextSwitchTo(oldctx);

-	/* Copy parsed data into CacheMemoryContext */
-	if (options)
-	{
-		relation->rd_options = MemoryContextAlloc(CacheMemoryContext,
-												  VARSIZE(options));
-		memcpy(relation->rd_options, options, VARSIZE(options));
-	}

Thanks,
-- 
OSS Platform Development Division, NEC
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for basic window frame support

2009-01-04 Thread Hitoshi Harada
2009/1/5 Tom Lane :
> "Hitoshi Harada"  writes:
>> 2008/12/31 Tom Lane :
>>> RANGE UNBOUNDED PRECEDING   -- same as AND CURRENT ROW
>>> ROWS UNBOUNDED PRECEDING-- same as AND CURRENT ROW
>
>> Is this true?
>
> 7.11 syntax rule 9 says so.  AFAICS general rule 5b doesn't discuss the
> case where  isn't present, and doesn't need to
> because of syntax rule 9.
>

OK, I've just found it and confirmed Oracle does so.



-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for basic window frame support

2009-01-04 Thread Hitoshi Harada
2008/12/31 Tom Lane :
> No docs or regression tests yet, but it seems to work ... please check
> against Oracle and DB2 behavior.  Supported cases are
>
>RANGE UNBOUNDED PRECEDING   -- same as AND CURRENT ROW
>ROWS UNBOUNDED PRECEDING-- same as AND CURRENT ROW

Is this true?
I guess that rule 5.b of 7.11 in the spec says that, as long as the bound
is not specified in the window frame clause, all rows of the partition
are contained in the frame. The rule then removes rows from the
initial frame as the frame bound indicates.

So as the result,
RANGE UNBOUNDED PRECEDING
ROWS UNBOUNDED PRECEDING
both mean
(RANGE / ROWS) BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
if I don't miss something.

Regards,


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Tom Lane
Alvaro Herrera  writes:
> This is the patch in its final form.  I have included a few macros to
> simplify the writing of amoptions routines.

Minor gripes:

* Does initialize_reloptions() need to be exported?  It seems to be
only called within parseRelOptions().  It's far from clear who else
should be expected to call it.

* The HANDLE_ macros are dangerous as-is (dangling if/else).  Need to
use the usual do/while trick.
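
For reference, the trick on a simplified stand-in macro (not the actual
reloptions macro; the names here are just for illustration):

#include <stdbool.h>

/*
 * Without the do/while(0) wrapper, a multi-statement macro used in an
 * un-braced if/else pairs the "else" with the wrong "if".  Wrapping the
 * body makes the macro behave as a single statement.
 */
#define HANDLE_INT_OPTION(found, var, value) \
    do { \
        if (found) \
            (var) = (value); \
    } while (0)

static void
example(bool validate, bool found, int parsed_value)
{
    int     fillfactor = 0;

    /* safe even without braces, because the macro expands to one statement */
    if (validate)
        HANDLE_INT_OPTION(found, fillfactor, parsed_value);
    else
        fillfactor = 100;

    (void) fillfactor;
}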

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] ALTER TABLE SET (compress_max_size... = )

2009-01-04 Thread Alex Hunsaker
This patch lets you control 3 pg_lzcompress knobs on a per table basis
(note requires reloptions.patch)

compress_max_size:  Controls the maximum size to be considered for
TOAST compression.
compress_min_rate: Minimum compression rate (0-100%) required for
TOAST compression to be used.
compress_success_by: If no compressible data has been found by this byte
offset, abort compression.

Note this adds some documentation, but I was having a hard time coming
up with a good way to describe these.  I'm also not very happy with
the names.  I originally tried something like toast.max_input_size.
But decided later if we allow you to set toast attributes that might
be confusing.  So help with verbiage and names is appreciated.

Also I only did those 3 because they seemed the 3 most useful things
someone would want to tune.  Later, if we need to, we can export them
all and make them per-column settings (and maybe you can pick a
compression algo or what not...)  But I figured let's start small.

I thought about doing another cleanup patch to get rid of
PGLZ_Strategy_default and PGLZ_Strategy_always.  Nothing uses the
latter, and if we expose all the knobs nothing will use the former.
Comments?


compress_opts.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Alex Hunsaker
On Sun, Jan 4, 2009 at 15:01, Alvaro Herrera  wrote:
> Alvaro Herrera wrote:
>
>> Okay, it was basically fine except for the attached minor correction.
>> Warning: I intend to commit this patch fairly soon!
>
> This is the patch in its final form.  I have included a few macros to
> simplify the writing of amoptions routines.

Looks good to me, I just used it to whip up a patch to control some
toast compression knobs..

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1226

2009-01-04 Thread Tom Lane
ITAGAKI Takahiro  writes:
> Here is an updated version of contrib/pg_stat_statements patch.

I've committed this with significant revisions.  Other than the points
already mentioned in previous messages:

* I removed the proposed changes to the behavior of the core EXPLAIN
code.  I think that that should be submitted and discussed as a separate
patch, not slide in under the misleading title of being a "contrib
module".  I'm personally against those changes anyway, on two grounds:

1. The proposed change to track system/user CPU time presents an
enormous cost, and no argument has been made to show that there is any
comparable benefit.  The change causes each EXPLAIN ANALYZE tracking
call to invoke getrusage() as well as gettimeofday().  I did a little
bit of testing and found that this is over seven times slower on Fedora
9 on x86_64 (Xeon hardware) and over twenty-seven times slower on Darwin
(on Core 2 Duo hardware).  Considering that EXPLAIN ANALYZE overhead is
already higher than anyone would like, you would need a pretty darn
convincing argument to persuade us to accept that kind of slowdown.
At the very least the code would need to be modified so that it doesn't
execute getrusage() unless the user is actually going to look at the
results.
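
For illustration, the kind of gating I have in mind looks roughly like this
(a sketch only; the struct and the need_cpu_usage flag are stand-ins, not
the real Instrumentation code):

#include <stdbool.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Sketch only: per-sample instrumentation data. */
typedef struct InstrSample
{
    struct timeval  wall;       /* always collected, as today */
    struct rusage   cpu;        /* only valid when cpu_valid is true */
    bool            cpu_valid;
} InstrSample;

/*
 * Take a sample; only pay for getrusage() when the caller actually asked
 * for CPU-time tracking.
 */
static void
instr_take_sample(InstrSample *s, bool need_cpu_usage)
{
    gettimeofday(&s->wall, NULL);

    s->cpu_valid = need_cpu_usage;
    if (need_cpu_usage)
        (void) getrusage(RUSAGE_SELF, &s->cpu);
}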

2. I'm unconvinced by the proposed changes to accumulate backend-local
I/O counters, too.  The fact of the matter is that those counters are
left over from Berkeley days, a time when PG hackers tended to do their
performance measurements in standalone backends (!).  They're obviously
not the full story now on write measurements, and I don't have any
confidence in them as read measurements either, particularly seeing that
the wave of the future is likely to be asynchronous read operations
(with the posix_fadvise patch and foreseeable follow-on work).  I think
those counters should more likely be done away with than
institutionalized in EXPLAIN ANALYZE output.  You can get more reliable
information about what's happening from the existing pgstats system-wide
I/O counts.

* I changed the default track setting to "top".  I don't see the
likelihood that someone would load this module into their server
and not want it turned on.

* I'm not entirely seeing the point of a server-wide tracking facility
that only counts SELECT/INSERT/UPDATE/DELETE.  ISTM this should be
modified to count utility commands too; which probably means adding some
hooks around ProcessUtility (and what about autovacuum?).  I left that
work for someone else to do, though.

* As already mentioned I find the entry_dealloc logic pretty darn
dubious; but I didn't touch that either in this go-round.  If we do
keep it in its current form, ISTM that usage ought to be proportional
to total execution time not total call count.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Significantly larger toast tables on 8.4?

2009-01-04 Thread James Mansion

Peter Eisentraut wrote:
>> c. Are there any well-known pitfalls/objections which would prevent me from
>>    changing the algorithm to something more efficient (read: IO-bound)?
>
> copyright licenses and patents

Would it be possible to have a plugin facility?

I guess the most likely candidate is the LZJB mechanism in ZFS which is 
CDDL licensed.


Would that be compatible in contrib, if not in the main source?

James


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Okay, it was basically fine except for the attached minor correction.
> Warning: I intend to commit this patch fairly soon!

This is the patch in its final form.  I have included a few macros to
simplify the writing of amoptions routines.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Index: src/backend/access/common/reloptions.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/common/reloptions.c,v
retrieving revision 1.12
diff -c -p -r1.12 reloptions.c
*** src/backend/access/common/reloptions.c	1 Jan 2009 17:23:34 -	1.12
--- src/backend/access/common/reloptions.c	4 Jan 2009 21:59:41 -
***
*** 15,20 
--- 15,23 
  
  #include "postgres.h"
  
+ #include "access/gist_private.h"
+ #include "access/hash.h"
+ #include "access/nbtree.h"
  #include "access/reloptions.h"
  #include "catalog/pg_type.h"
  #include "commands/defrem.h"
***
*** 22,29 
--- 25,362 
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/guc.h"
+ #include "utils/memutils.h"
  #include "utils/rel.h"
  
+ /*
+  * Contents of pg_class.reloptions
+  *
+  * To add an option:
+  *
+  * (i) decide on a class (integer, real, bool, string), name, default value,
+  * upper and lower bounds (if applicable).
+  * (ii) add a record below.
+  * (iii) add it to StdRdOptions if appropriate
+  * (iv) add a block to the appropriate handling routine (probably
+  * default_reloptions)
+  * (v) don't forget to document the option
+  *
+  * Note that we don't handle "oids" in relOpts because it is handled by
+  * interpretOidsOption().
+  */
+ 
+ static relopt_bool boolRelOpts[] =
+ {
+ 	/* list terminator */
+ 	{ { NULL } }
+ };
+ 
+ static relopt_int intRelOpts[] =
+ {
+ 	{
+ 		{
+ 			"fillfactor",
+ 			"Packs table pages only to this percentage",
+ 			RELOPT_KIND_HEAP
+ 		},
+ 		HEAP_DEFAULT_FILLFACTOR, HEAP_MIN_FILLFACTOR, 100
+ 	},
+ 	{
+ 		{
+ 			"fillfactor",
+ 			"Packs btree index pages only to this percentage",
+ 			RELOPT_KIND_BTREE
+ 		},
+ 		BTREE_DEFAULT_FILLFACTOR, BTREE_MIN_FILLFACTOR, 100
+ 	},
+ 	{
+ 		{
+ 			"fillfactor",
+ 			"Packs hash index pages only to this percentage",
+ 			RELOPT_KIND_HASH
+ 		},
+ 		HASH_DEFAULT_FILLFACTOR, HASH_MIN_FILLFACTOR, 100
+ 	},
+ 	{
+ 		{
+ 			"fillfactor",
+ 			"Packs gist index pages only to this percentage",
+ 			RELOPT_KIND_GIST
+ 		},
+ 		GIST_DEFAULT_FILLFACTOR, GIST_MIN_FILLFACTOR, 100
+ 	},
+ 	/* list terminator */
+ 	{ { NULL } }
+ };
+ 
+ static relopt_real realRelOpts[] =
+ {
+ 	/* list terminator */
+ 	{ { NULL } }
+ };
+ 
+ static relopt_string stringRelOpts[] = 
+ {
+ 	/* list terminator */
+ 	{ { NULL } }
+ };
+ 
+ static relopt_gen **relOpts = NULL;
+ static int last_assigned_kind = RELOPT_KIND_LAST_DEFAULT + 1;
+ 
+ static int		num_custom_options = 0;
+ static relopt_gen **custom_options = NULL;
+ static bool		need_initialization = true;
+ 
+ static void parse_one_reloption(relopt_value *option, char *text_str,
+ 	int text_len, bool validate);
+ 
+ /*
+  * initialize_reloptions
+  * 		initialization routine, must be called before parsing
+  *
+  * Initialize the relOpts array and fill each variable's type and name length.
+  */
+ void
+ initialize_reloptions(void)
+ {
+ 	int		i;
+ 	int		j = 0;
+ 
+ 	for (i = 0; boolRelOpts[i].gen.name; i++)
+ 		j++;
+ 	for (i = 0; intRelOpts[i].gen.name; i++)
+ 		j++;
+ 	for (i = 0; realRelOpts[i].gen.name; i++)
+ 		j++;
+ 	for (i = 0; stringRelOpts[i].gen.name; i++)
+ 		j++;
+ 	j += num_custom_options;
+ 
+ 	if (relOpts)
+ 		pfree(relOpts);
+ 	relOpts = MemoryContextAlloc(TopMemoryContext,
+  (j + 1) * sizeof(relopt_gen *));
+ 
+ 	j = 0;
+ 	for (i = 0; boolRelOpts[i].gen.name; i++)
+ 	{
+ 		relOpts[j] = &boolRelOpts[i].gen;
+ 		relOpts[j]->type = RELOPT_TYPE_BOOL;
+ 		relOpts[j]->namelen = strlen(relOpts[j]->name);
+ 		j++;
+ 	}
+ 
+ 	for (i = 0; intRelOpts[i].gen.name; i++)
+ 	{
+ 		relOpts[j] = &intRelOpts[i].gen;
+ 		relOpts[j]->type = RELOPT_TYPE_INT;
+ 		relOpts[j]->namelen = strlen(relOpts[j]->name);
+ 		j++;
+ 	}
+ 
+ 	for (i = 0; realRelOpts[i].gen.name; i++)
+ 	{
+ 		relOpts[j] = &realRelOpts[i].gen;
+ 		relOpts[j]->type = RELOPT_TYPE_REAL;
+ 		relOpts[j]->namelen = strlen(relOpts[j]->name);
+ 		j++;
+ 	}
+ 
+ 	for (i = 0; stringRelOpts[i].gen.name; i++)
+ 	{
+ 		relOpts[j] = &stringRelOpts[i].gen;
+ 		relOpts[j]->type = RELOPT_TYPE_STRING;
+ 		relOpts[j]->namelen = strlen(relOpts[j]->name);
+ 		j++;
+ 	}
+ 
+ 	for (i = 0; i < num_custom_options; i++)
+ 	{
+ 		relOpts[j] = custom_options[i];
+ 		j++;
+ 	}
+ 
+ 	/* add a list terminator */
+ 	relOpts[j] = NULL;
+ }
+ 
+ /*
+  * add_reloption_kind
+  * 		Create a new relopt_kind value, to be used in custom reloptions by
+  * 		user-defined AMs.
+  */
+ int
+ add_reloption_kind(void)
+ {
+ 	return last_assigned_kind++;
+ }
+

Re: [HACKERS] contrib/pg_stat_statements 1226

2009-01-04 Thread Tom Lane
I wrote:
> I'm not enamored of "saved_file" either, it seems like the wrong
> part of speech somehow.  Maybe "save_in_file"?

Actually ... what is the point of letting users control the filename at
all?  It seems like the only useful nondefault value is the empty string
(to suppress storing the stats).  So why not simplify this to a boolean?
"pg_stat_statements.save" = on or off.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Alvaro Herrera
Simon Riggs wrote:

> Custom variable classes are often useful, but they are system wide. It
> would be good to be able to use table-level options and have them work
> very similarly to something we already have. Table-level options are
> just an obvious "normalisation" of how we handle parameters.
> 
> If you really can't see a use for this, OK, then: Please can you put in
> a plugin API for user defined reloptions as well as what you are
> proposing. We discussed this before in late July/early Aug on thread
> "Uncopied parameters..."

I've been giving this a thought and I don't see any easy way to handle
it.  Since I've been threatened that this whole thing may be punted for
8.5 if I'm not quick about it, I've left this alone for now, and will
concentrate on getting the namespace thing done which will allow
specifying reloptions for the toast table by an ALTER TABLE on the main
table.  It's not that I don't see a use for it; it's just that I don't
have the time for it right now.

(Note: I think there are ways to do this; they'll involve storing the
unknown options as a text array.  It won't be pretty or performant, but
it seems the only way around the fact that heap_reloptions comes
hardcoded with the system.)

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] generic reloptions improvement

2009-01-04 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Some notes about this patch:
> 
> - the string type handling (basically all the new code) is untested.
> I'll have a look tomorrow at the btree test code I sent the other day to
> add a string option and see how it goes.

Okay, it was basically fine except for the attached minor correction.
Warning: I intend to commit this patch fairly soon!

As far as I can see, the new code can work with the options you've
defined in the SEPgsql code just fine.  Handling string options in
itself is fine; the complexity (as I already said) is in allocating
memory for the string if you want to store it unchanged in the bytea
stuff in relcache.  Since you're not storing the string itself but
convert it to an Oid, there's no problem.

Actually, storing the string itself works fine as long as you have a
single one, because you can define the option struct like this:

/* must follow StdRdOptions conventions */
typedef struct BtOptions
{
int32   vl_len_;
int fillfactor;
charteststring[1];
} BtOptions;

and there are no pointers involved.  This doesn't work:

typedef struct BtOptions
{
int32   vl_len_;
int fillfactor;
char*teststring;
} BtOptions;

because then there's a pointer, and it fails as soon as the bytea * is
copied by the relcache code.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
--- src/backend/access/common/reloptions.c	4 Jan 2009 03:07:38 -
+++ src/backend/access/common/reloptions.c	4 Jan 2009 20:59:57 -
@@ -704,7 +704,7 @@
 		case RELOPT_TYPE_STRING:
 			option->values.string_val = value;
 			nofree = true;
-
+			parsed = true;
 			/* no validation possible */
 			break;
 		default:

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] dblink vs SQL/MED - security and implementation details

2009-01-04 Thread Joe Conway
(changed the subject to hopefully get a few more eyes looking at this 
thread)


Martin Pihlak wrote:


> I'd vote for allowing arbitrary connect strings -- ordinary users cannot
> create servers and user mappings unless explicitly granted the privileges.
> It should probably be noted in the documentation that allowing ordinary
> users to create user mappings enables the use of the postgres .pgpass file.
> Not sure where to put this at the moment.


I'm mainly concerned about re-opening security holes that we spent a lot 
of time debating and subsequently closing. I suspect if we assume that 
any FDW-derived connect string can bypass the checks we put in place, we 
will regret it later. But I'm open to arguments on both sides...
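
Concretely, the conservative option would look roughly like this (a sketch
only; connstr_requires_password() and the specific errcode are stand-ins for
the password check dblink_connstr_check() already performs on raw connect
strings):

#include "postgres.h"
#include "miscadmin.h"

/* placeholder for the existing password check in dblink_connstr_check() */
extern bool connstr_requires_password(const char *connstr);

/*
 * Sketch only: apply the same check whether the connect string was typed by
 * the user or assembled from a foreign server / user mapping.
 */
static void
check_any_connstr(const char *connstr)
{
    if (!superuser() && !connstr_requires_password(connstr))
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("password is required"),
                 errdetail("Non-superusers must provide a password in the connection string.")));
}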



>>> 2. It seems like get_connect_string() is generically useful to any
>>>    client of postgresql_fdw.c -- should it go there instead of dblink?

>> I'm pretty sure get_connect_string() should be moved to postgresql_fdw.c
>> -- objections?


> There is some discussion in another thread about this:
> http://archives.postgresql.org/pgsql-hackers/2008-12/msg01875.php
> http://archives.postgresql.org/pgsql-hackers/2009-01/msg00021.php
> 
> The initial approach was to let each foreign data wrapper provide its own
> connection string/list builder function. The latest is to provide the lookup
> functions in foreign.c, and use the same functions for all the different
> FDWs. I was about to implement those but got distracted. Will resume now.


It seems to me that get_connect_string() (or whatever we decide to call
it) is very libpq-specific, and therefore belongs with postgresql_fdw.c
rather than foreign.c. But if we can't reach a consensus, there is no
harm in leaving it as a dblink.c-specific static function either.


Joe

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink vs SQL/MED

2009-01-04 Thread Martin Pihlak
Joe Conway wrote:
>> Two specific questions on this approach:
>> 1. This implies that the exact same dblink_connstr_check() is performed
>>on a predefined foreign server and user mapping as a raw connstr --
>>is this desireable? I'm not entirely clear on the intended purpose
>>and use of foreign data wrappers yet.
> 
> On the one hand, why be any less stringent on an fdw server than any
> other connect string? But on the other hand, the fdw server definition
> has supposedly been vetted by a superuser. Any thoughts of this?
> 

I'd vote for allowing arbitrary connect strings -- ordinary users cannot
create servers and user mappings unless explicitly granted the privileges.
It should probably be noted in the documentation that allowing ordinary
users to create user mappings enables the use of the postgres .pgpass file.
Not sure where to put this at the moment.

>> 2. It seems like get_connect_string() is generically useful to any
>>client of postgresql_fdw.c -- should it go there instead of dblink?
> 
> I'm pretty sure get_connect_string() should be moved to postgresql_fdw.c
> -- objections?
> 

There is some discussion in another thread about this:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01875.php
http://archives.postgresql.org/pgsql-hackers/2009-01/msg00021.php

The initial approach was to let each foreign data wrapper provide its own
connection string/list builder function. The latest is to provide the lookup
functions in foreign.c, and use the same functions for all the different
FDWs. I was about to implement those but got distracted. Will resume now.

regards,
Martin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel restore

2009-01-04 Thread Alvaro Herrera
Andrew Dunstan wrote:
>
> Attached is the latest parallel restore patch. I think this is getting  
> fairly close.

Some random comments

- please #define the return type of prestore().  Also, that it returns
on one platform and exits on another seems weird as an API.  I think it
should return in both cases, and have the caller act appropriately.

- RestoreArchiveParallel has too many #ifdef WIN32.  It should be
possible to split it so that the cruft is contained better.

- Aren't there too many checks for libz presence?

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stats queries versus per-database encodings

2009-01-04 Thread Tom Lane
Heikki Linnakangas  writes:
> Tom Lane wrote:
>> We could attack this by including source database's encoding in the
>> shared-memory entries, and performing a conversion on the fly when
>> reading out the data.  However, what happens if the conversion fails?

> The most useful behavior would be to replace the untranslatable 
> characters with "?". I'm not sure how invasive the changes to the 
> conversion functions would be to support that.

I agree, but it looks like fairly massive changes would be needed,
starting with redefining the API for conversion functions to add
an error/noerror boolean.  Not something that I care to tackle
right now.  Maybe we shall just have to live with it for another
release.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New patch for Column-level privileges

2009-01-04 Thread Tom Lane
Markus Wanner  writes:
> Stephen Frost wrote:
>>> BTW: how are long constant strings expected to be formatted? Are those
>>> allowed to exceed 80 columns, or are they expected to be split like so
>> 
>> Honestly, I think I've seen both done.

> Yeah, that's why I'm asking.

IMHO, the trouble with breaking up error strings like that is that it
can defeat attempts to grep the source for a particular error message.
(If you search for "foo bar baz", you won't find it if someone chose
to break the string between those words.)  This isn't too harmful if
you get no hits, because you can try again with a substring --- but it
really sucks if some instances of the string are broken up differently
than others, because you might see only the wrong instances and come
to a mistaken conclusion about the possible sources of an error reported
from the field.  So I tend not to like breaking up long strings at all,
and definitely look with disfavor on breaking them in random spots
rather than natural break points of the sentence.

Because of that, I tend not to worry about holding to the 80-column
window width when it comes to error message strings.  Other than that
case, though, you should try to stay to less than 80 columns.  If you
don't, pgindent will do it for you, and the end result will probably
be uglier than if you'd made the code fit manually.
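
For example, something in this style (an illustrative fragment, not a quote
of an actual message) stays greppable even though the errmsg line runs past
80 columns:

    ereport(ERROR,
            (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
             errmsg("permission denied for column \"%s\" of relation \"%s\"", colname, relname)));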

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel restore

2009-01-04 Thread Simon Riggs

On Mon, 2008-12-29 at 18:42 -0500, Andrew Dunstan wrote:
> Attached is the latest parallel restore patch. I think this is getting 
> fairly close.
> 
> Includes help text, docco and some extra error checking.

Very brief review.

Hopefully the --truncate-before-load option works in both parallel mode
and data-only mode?

A few typos in the docs.

No README or comments explain how the patch works. This part of the code
was probably the most opaque already, so it's really needed, unfortunately. 

* I'm particularly interested in error handling, for example if one
thread hits a deadlock, gets accidentally killed by the user, hits a bug
in custom add-in code, etc.

* Earlier bugs with pre/post data were related to missing objects or
putting them in wrong groups. Documenting that will help identify
errors.

A few strange names, sorry:

work_is_being_done --> work_in_progress
_inhibit_data?? --> _skip_data??
prestored --> parallel_restored
restore_one_te --> restore_TocEntry

Would like the ability to increase/decrease the number of parallel threads
as the restore progresses. Experience says you always pick the wrong number
and/or the situation changes while in progress.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New patch for Column-level privileges

2009-01-04 Thread Markus Wanner
Hello Stephen,

Stephen Frost wrote:
> I'm going to look into it but it's a bit complicated.  I was hoping
> someone who's more familiar with those parts would be able to look at
> it.

Good to hear. I've just been asking, because it remained unclear to me.

> I don't think that's the right approach, but I'll look into it.  I ran
> into a similiar issue though, and I don't believe it's too hard to fix
> (the issue here is that the REVOKE needs to remove the column-level grant
> as well).  I'll try and look into it tonight or tomorrow.

Cool, because that's the biggest issue, IMO.

>> test=# GRANT UPDATE(xmin) ON test TO joe;
>> GRANT
>> test=# GRANT INSERT(xmin) ON test TO joe;
>> GRANT
> 
> Hmm, ok, that's easy enough to fix.
> 
>> [ Note that user joe can INSERT or UPDATE tuples of relation test even
>> without those column level privileges, as long as he is allowed to
>> INSERT or UPDATE the affected non-hidden columns. ]
> 
> Right, that's correct.  You don't need table-level permissions so long
> as you have permissions on the columns you're trying to select/modify.

I was trying to check whether these privileges on hidden columns have any
effect. So far I haven't encountered any, except for SELECT.

>> Some minor nit-picks: some lines exceed 80 columns, multi-line comments
>> don't follow coding standards.
> 
> Hrmpf.  I'll go back and review the coding standards..  I don't recall
> that 80 column was a fixed limit.

Hm, sorry, looks like it's not mentioned in the docs. I'm pretty sure
pgindent wraps lines to something below 80 columns, though. (And I'm
personally used to terminals with exactly 80 columns, so everything longer
than that is hard on my eyes, thus the complaint. Don't bother much.)

>> BTW: how are long constant strings expected to be formatted? Are those
>> allowed to exceed 80 columns, or are they expected to be split like so
>> (i.e. for errmsg):
>>
>>   "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed "
>>   "do eiusmod tempor incididunt ut labore et dolore magna aliqua."
> 
> Honestly, I think I've seen both done.

Yeah, that's why I'm asking.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel restore

2009-01-04 Thread Andrew Dunstan



Tom Lane wrote:
> Andrew Dunstan  writes:
>> I propose to commit this unless someone wants more time for reviewing.
>
> A moment's eyeballing of the patch finds rather a lot of debugging cruft
> yet (inserted printfs, "#ifdef WIN32blah", etc); also I think the
> addition to include/port.h belongs in port/win32.h instead.
>
> More generally, though, I think it hasn't been looked at much by
> anyone (certainly not by me).

I must have sent the wrong patch file.

I'll recheck for debugging stuff and send a new patch.

I can certainly hold off committing it.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel restore

2009-01-04 Thread Tom Lane
Andrew Dunstan  writes:
> I propose to commit this unless someone wants more time for reviewing.

A moment's eyeballing of the patch finds rather a lot of debugging cruft
yet (inserted printfs, "#ifdef WIN32blah", etc); also I think the
addition to include/port.h belongs in port/win32.h instead.

More generally, though, I think it hasn't been looked at much by
anyone (certainly not by me).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Greg Stark
What I find interesting about this is that whereas I had been concerned
that adding hot standby late in the development cycle might destabilize
the tree and add a lot of time to the release cycle, it seems having it
might actually increase our ability to see problems in the recovery code,
which was previously quite hard to test.


--
Greg


On 4 Jan 2009, at 09:59, Heikki Linnakangas wrote:



Heikki Linnakangas wrote:
I can reproduce that too with CVS HEAD, so it's clearly a bug. I  
probably introduced it with the recent smgr changes; I'll try to  
hunt it down.


Now that was an embarrassingly simple bug:

--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -106,7 +106,7 @@ RelationCreateStorage(RelFileNode rnode, bool istemp)

   srel = smgropen(rnode);
   smgrcreate(srel, MAIN_FORKNUM, false);

-if (istemp)
+if (!istemp)
   {
   /*
* Make an XLOG entry showing the file creation.  If we abort, the file


Fixed, as well as the same bug in RelationTruncate. Thanks for the report,
I'll keep this brown paper bag on for a few days...


--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New patch for Column-level privileges

2009-01-04 Thread Stephen Frost
Markus,

* Markus Wanner (mar...@bluegap.ch) wrote:
> Stephen Frost wrote:
> > ..in the attached patch.
> 
> Thanks, I've reviewed this patch again.

Thanks!

> > Currently,
> > column-level privileges are not honored when JOINs are involved (you
> > must have the necessary table-level privileges, as you do today). It
> > would really be great to have that working and mainly involves
> > modifying the rewriter to add on to the appropriate range table column
> > list entries the columns which are used in the joins and output from
> > joins.
> 
> Understood. Do you plan to work on that? Or do you think the patch
> should go into 8.4 even without support for JOINs?

I'm going to look into it but it's a bit complicated.  I was hoping
someone who's more familiar with those parts would be able to look at
it.

> Experimenting with relation vs column level privileges, I've discovered
> a strange behavior:
> 
> test=# GRANT SELECT ON test TO joe;
> GRANT
> test=# GRANT SELECT(id) ON test TO joe;
> GRANT
> test=# REVOKE SELECT ON test FROM joe;
> ERROR:  tuple already updated by self
> 
> That's odd. Maybe you need to increment the command counter in between
> two updates of that tuple?

I don't think that's the right approach, but I'll look into it.  I ran
into a similar issue though, and I don't believe it's too hard to fix
(the issue here is that the REVOKE needs to remove the column-level grant
as well).  I'll try and look into it tonight or tomorrow.

> Also note, that I'm unsure about what to expect from this REVOKE. Is it
> intended to remove the column level privilege as well or not?

Yes, the SQL spec requires that a table-level REVOKE also revoke all
column-level grants as well.

> Removing the column level privilege first, then the relation level one
> works:
> 
> test=# REVOKE SELECT(id) ON test FROM joe;
> REVOKE
> test=# REVOKE SELECT ON test FROM joe;
> REVOKE

This is essentially what should happen automagically.

> I've also tested column level privileges on hidden attributes (xmin,
> xmax, ctid, tableoid, cmin), which works fine for SELECTs. However, you
> might want to filter out INSERT and UPDATE privileges on those, as those
> don't make much sense:
> 
> test=# GRANT UPDATE(xmin) ON test TO joe;
> GRANT
> test=# GRANT INSERT(xmin) ON test TO joe;
> GRANT

Hmm, ok, that's easy enough to fix.
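
For illustration, something along these lines in the column-GRANT processing
path would do it (a rough sketch only; the function name and placement are
hypothetical, not the actual patch code):

#include "postgres.h"
#include "access/attnum.h"
#include "utils/acl.h"

/*
 * Hypothetical check (sketch): system columns have attnum <= 0, so
 * reject INSERT/UPDATE column privileges on them.
 */
static void
check_column_grantable(AttrNumber attnum, const char *colname,
                       AclMode privileges)
{
    if (attnum <= 0 && (privileges & (ACL_INSERT | ACL_UPDATE)) != 0)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_GRANT_OPERATION),
                 errmsg("cannot grant INSERT or UPDATE on system column \"%s\"",
                        colname)));
}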

> [ Note that user joe can INSERT or UPDATE tuples of relation test even
> without those column level privileges, as long as he is allowed to
> INSERT or UPDATE the affected non-hidden columns. ]

Right, that's correct.  You don't need table-level permissions so long
as you have permissions on the columns you're trying to select/modify.

> Some minor nit-picks: some lines exceed 80 columns, multi-line comments
> don't follow coding standards.

Hrmpf.  I'll go back and review the coding standards..  I don't recall
that 80 columns was a fixed limit.

> BTW: how are long constant strings expected to be formatted? Are those
> allowed to exceed 80 columns, or are they expected to be split like so
> (i.e. for errmsg):
> 
>   "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed "
>   "do eiusmod tempor incididunt ut labore et dolore magna aliqua."

Honestly, I think I've seen both done.  I can do it either way, of
course.

> Nice work! I'd like to see this shipping with 8.4. The above mentioned
> bugs (the "updated by self" and the hidden columns) should be easy
> enough to fix, I think. Respecting column-level privileges for JOINs is
> probably going to be more work, but is required as well, IMO.

Thanks for the review!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Heikki Linnakangas

Heikki Linnakangas wrote:
I can reproduce that too with CVS HEAD, so it's clearly a bug. I 
probably introduced it with the recent smgr changes; I'll try to hunt it 
down.


Now that was an embarrassingly simple bug:

--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -106,7 +106,7 @@ RelationCreateStorage(RelFileNode rnode, bool istemp)
srel = smgropen(rnode);
smgrcreate(srel, MAIN_FORKNUM, false);

-   if (istemp)
+   if (!istemp)
{
/*
 * Make an XLOG entry showing the file creation.  If we abort, 
the file

Fixed, as well as the same bug in RelationTruncate. Thanks for the report,
I'll keep this brown paper bag on for a few days...


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Heikki Linnakangas

Simon Riggs wrote:

On Sun, 2009-01-04 at 22:18 +1300, Mark Kirkwood wrote:

bench=# select now(),count(*) from history;
ERROR:  could not open relation base/16384/16394: No such file or
directory




I'm guessing something tied up with the fact that history has no rows to
start with...


Good guess, thanks. I can recreate the error now, though not by
following the actions in the order you mentioned. I guess the files
hadn't been applied fully before you ran the test.

The problem I can re-create looks like this:

1. Create standby set-up, with both primary and standby active
2. Create new table on primary, but don't add data; wait for apply
3. Attempt to access new table on standby, throws ERROR as shown
4. Add 1 row on primary; wait for apply
5. Attempt to access new table on standby, no ERROR

It looks to me like WAL for CREATE TABLE doesn't actually create a file;
we just rely on the ability of mdextend() to create the file if required
during recovery.

So it looks to me like either an outstanding error in the current system,
or a new error introduced with the recent-ish md/smgr changes. Second
opinion please, Heikki, if you are available?


Hmm, that's odd. Table creation calls RelationCreateStorage, which calls 
smgrcreate and writes the WAL record. smgr_redo certainly does call 
smgrcreate.
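
For reference, the XLOG_SMGR_CREATE redo handling looks roughly like this
(paraphrased from memory as a sketch, so the exact record layout and naming
may differ slightly from what is in the tree):

/* sketch of the create-record redo path inside smgr_redo */
if (info == XLOG_SMGR_CREATE)
{
    xl_smgr_create *xlrec = (xl_smgr_create *) XLogRecGetData(record);
    SMgrRelation reln = smgropen(xlrec->rnode);

    /* isRedo = true: it is fine if the file already exists */
    smgrcreate(reln, MAIN_FORKNUM, true);
}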


I can reproduce that too with CVS HEAD, so it's clearly a bug. I 
probably introduced it with the recent smgr changes; I'll try to hunt it 
down.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Simon Riggs

On Sun, 2009-01-04 at 22:18 +1300, Mark Kirkwood wrote:
> >>>
> >>> bench=# select now(),count(*) from history;
> >>> ERROR:  could not open relation base/16384/16394: No such file or
> >>> directory
> >>> 

> I'm guessing something tied up with the fact that history has no rows
> to 
> start with...

Good guess, thanks. I can recreate the error now, though not by
following the actions in the order you mentioned. I guess the files
hadn't been applied fully before you ran the test.

The problem I can re-create looks like this:

1. Create standby set-up, with both primary and standby active
2. Create new table on primary, but don't add data; wait for apply
3. Attempt to access new table on standby, throws ERROR as shown
4. Add 1 row on primary; wait for apply
5. Attempt to access new table on standby, no ERROR

It looks to me like WAL for CREATE TABLE doesn't actually create a file;
we just rely on the ability of mdextend() to create the file if required
during recovery.

So it looks to me like either an outstanding error in the current system,
or a new error introduced with the recent-ish md/smgr changes. Second
opinion please, Heikki, if you are available?

I'll come back to this in a few hours, but I have some other things need
to do right now.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New patch for Column-level privileges

2009-01-04 Thread Markus Wanner
Hello Stephen,

Stephen Frost wrote:
> ..in the attached patch.

Thanks, I've reviewed this patch again.

> Please find attached an updated patch for column-level privileges
> which incorporates Alvaro's suggested changes and is updated to the
> latest CVS HEAD.

Cool, applies cleanly and compiles without any error or warning on my
Debian box. Regression tests pass fine as well.

> Regression tests have been added as well as
> documentation (though this could probably be improved).

Sorry I didn't get around to writing the docs. Looks good so far. Some more
hints on its usage wouldn't hurt, especially because the error messages
aren't overly verbose ('permission denied..' doesn't tell you much).

> Currently,
> column-level privileges are not honored when JOINs are involved (you
> must have the necessary table-level privileges, as you do today). It
> would really be great to have that working and mainly involves
> modifying the rewriter to add on to the appropriate range table column
> list entries the columns which are used in the joins and output from
> joins.

Understood. Do you plan to work on that? Or do you think the patch
should go into 8.4 even without support for JOINs?


Experimenting with relation vs column level privileges, I've discovered
a strange behavior:

test=# GRANT SELECT ON test TO joe;
GRANT
test=# GRANT SELECT(id) ON test TO joe;
GRANT
test=# REVOKE SELECT ON test FROM joe;
ERROR:  tuple already updated by self

That's odd. Maybe you need to increment the command counter in between
two updates of that tuple?
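
Just to illustrate what I mean, a minimal sketch (hypothetical code, not
taken from your patch) of updating the same catalog tuple twice within one
command:

#include "postgres.h"
#include "access/heapam.h"
#include "access/xact.h"
#include "catalog/indexing.h"

/*
 * Sketch: without the CommandCounterIncrement() in between, the second
 * update fails with "tuple already updated by self".
 */
static void
update_tuple_twice(Relation rel, HeapTuple newtup1, HeapTuple newtup2)
{
    simple_heap_update(rel, &newtup1->t_self, newtup1);
    CatalogUpdateIndexes(rel, newtup1);

    CommandCounterIncrement();  /* make the first update visible */

    simple_heap_update(rel, &newtup2->t_self, newtup2);
    CatalogUpdateIndexes(rel, newtup2);
}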

Also note, that I'm unsure about what to expect from this REVOKE. Is it
intended to remove the column level privilege as well or not?

Removing the column level privilege first, then the relation level one
works:

test=# REVOKE SELECT(id) ON test FROM joe;
REVOKE
test=# REVOKE SELECT ON test FROM joe;
REVOKE



I've also tested column level privileges on hidden attributes (xmin,
xmax, ctid, tableoid, cmin), which works fine for SELECTs. However, you
might want to filter out INSERT and UPDATE privileges on those, as those
don't make much sense:

test=# GRANT UPDATE(xmin) ON test TO joe;
GRANT
test=# GRANT INSERT(xmin) ON test TO joe;
GRANT

[ Note that user joe can INSERT or UPDATE tuples of relation test even
without those column level privileges, as long as he is allowed to
INSERT or UPDATE the affected non-hidden columns. ]


Some minor nit-picks: some lines exceed 80 columns, multi-line comments
don't follow coding standards.

BTW: how are long constant strings expected to be formatted? Are those
allowed to exceed 80 columns, or are they expected to be split like so
(i.e. for errmsg):

  "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed "
  "do eiusmod tempor incididunt ut labore et dolore magna aliqua."


Nice work! I'd like to see this shipping with 8.4. The above mentioned
bugs (the "updated by self" and the hidden columns) should be easy
enough to fix, I think. Respecting column-level privileges for JOINs is
probably going to be more work, but is required as well, IMO.

Regards

Markus Wanner

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lazy_truncate_heap()

2009-01-04 Thread Simon Riggs

On Sun, 2009-01-04 at 13:01 +0200, Heikki Linnakangas wrote:
> >> Why does an AccessExclusiveLock lead to cancelled queries 
> >> or pausing WAL application? I thought it'd just block other queries 
> >> trying to acquire a conflicting lock in the standby, just like holding 
> >> an AccessExclusiveLock on the primary does. It's unrelated to the xmin 
> >> horizon issue.
> > 
> > Yes, it is unrelated to the xmin horizon issue. There are two reasons
> > for delaying WAL apply:
> > * locks
> > * xmin horizon
> > 
> > When a lock is acquired on the primary it almost always precedes an
> > action which cannot occur concurrently. For example, if VACUUM did
> > truncate a table then queries could get errors because parts of their
> > table disappear from under them. Others are drop table etc..
> 
> Have you implemented the query cancellation mechanism for that scenario 
> too? (I'm cool either way, just curious..)

Yes, they both lead to a conflict between WAL and standby queries, so
are treated the same, currently: if conflict occurs, wait until
max_standby_delay expires, then cancel.

Logically, "xmin horizon" conflicts could be flexible/soft. That is, if
we implemented the idea to store a lastCleanedLSN for each buffer then
"xmin horizon" conflicts would be able to continue executing until they
see a buffer with buffer.lastCleanedLSN > conflictLSN. Whereas the lock
would be a hard limit beyond which a query could not progress.
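
To illustrate the wait-then-cancel behaviour, a rough sketch only (the
helper names below are hypothetical placeholders, not functions from the
patch, and MaxStandbyDelay stands in for the GUC, assumed to be in seconds):

#include "postgres.h"
#include "miscadmin.h"
#include "utils/timestamp.h"

/*
 * Sketch: on a recovery conflict, wait until max_standby_delay expires,
 * then cancel the conflicting standby queries.  AnyConflictRemains() and
 * CancelConflictingQueries() are made-up names for illustration.
 */
static void
resolve_recovery_conflict(void)
{
    TimestampTz start = GetCurrentTimestamp();

    while (AnyConflictRemains())
    {
        if (TimestampDifferenceExceeds(start, GetCurrentTimestamp(),
                                       MaxStandbyDelay * 1000))
        {
            CancelConflictingQueries();
            break;
        }
        pg_usleep(100000L);     /* 100ms, then re-check */
    }
}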

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Mark Kirkwood

Simon Riggs wrote:

On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:

  

bench=# \d history
  Table "public.history"
 Column |Type | Modifiers
+-+---
 tid| integer |
 bid| integer |
 aid| integer |
 delta  | integer |
 mtime  | timestamp without time zone |
 filler | character(22)   |

bench=# select now(),count(*) from history;
ERROR:  could not open relation base/16384/16394: No such file or
directory



From my recreating your test case, the oids are consistent with the
History table. So the cache looks good.

md.c should be caching the file descriptor so the second use of the
file should not be reopening it. I've not touched smgr/md so a missing
file error is a surprise.

I wonder if this is an error associated with large file handling and
file forks? Smells like an FSM or VM error.

Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394*

  

Yeah -
$ ls -l $PGDATA/base/16384/16394*
ls: /data0/pgslave/8.4/base/16384/16394*: No such file or directory

regards

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Simon Riggs

On Sun, 2009-01-04 at 22:13 +1300, Mark Kirkwood wrote:
> Simon Riggs wrote:
> >
> > Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394*
> >
> >   
> Yeah -
> $ ls -l $PGDATA/base/16384/16394*
> ls: /data0/pgslave/8.4/base/16384/16394*: No such file or directory

What else is missing? Files, Directories etc?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Mark Kirkwood

Mark Kirkwood wrote:

Simon Riggs wrote:

On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:

 

bench=# \d history
  Table "public.history"
 Column |Type | Modifiers
+-+---
 tid| integer |
 bid| integer |
 aid| integer |
 delta  | integer |
 mtime  | timestamp without time zone |
 filler | character(22)   |

bench=# select now(),count(*) from history;
ERROR:  could not open relation base/16384/16394: No such file or
directory



From my recreating your test case, the oids are consistent with the
History table. So the cache looks good.

md.c should be caching the file descriptor so the second use of the
file should not be reopening it. I've not touched smgr/md so a missing
file error is a surprise.

I wonder if this is an error associated with large file handling and
file forks? Smells like an FSM or VM error.

Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394*

  

Yeah -
$ ls -l $PGDATA/base/16384/16394*
ls: /data0/pgslave/8.4/base/16384/16394*: No such file or directory



This might be useful:

the other tables in the dataset (accounts, branches, tellers)  all 
behave as expected:


bench=# select now(),count(*) from branches;
 now  | count
---+---
2009-01-04 22:17:00.298597+13 |   100
(1 row)

I'm guessing something tied up with the fact that history has no rows to 
start with...


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Simon Riggs

On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:

> bench=# \d history
>   Table "public.history"
>  Column |Type | Modifiers
> +-+---
>  tid| integer |
>  bid| integer |
>  aid| integer |
>  delta  | integer |
>  mtime  | timestamp without time zone |
>  filler | character(22)   |
> 
> bench=# select now(),count(*) from history;
> ERROR:  could not open relation base/16384/16394: No such file or
> directory

From my recreating your test case, the oids are consistent with the
History table. So the cache looks good.

md.c should be caching the file descriptor so the second use of the
file should not be reopening it. I've not touched smgr/md so a missing
file error is a surprise.

I wonder if this is an error associated with large file handling and
file forks? Smells like an FSM or VM error.

Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394*

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lazy_truncate_heap()

2009-01-04 Thread Heikki Linnakangas

Simon Riggs wrote:

On Thu, 2009-01-01 at 12:00 +0200, Heikki Linnakangas wrote:

Greg Stark wrote:

On 31 Dec 2008, at 13:21, Simon Riggs  wrote:

Both of these bugs are minor, but the effect of either/both of them is
to cause more AccessExclusiveLocks than we might expect.

For Hot Standby this means that many VACUUMs take AccessExclusiveLocks
on relations, which would potentially lead to having queries cancelled
for no reason at all.
Well by default it would just cause wal to pause briefly until the 
queries with those locks finish, no?
Wait a minute. Why does an AccessExclusiveLock lead to cancelled queries 
or pausing WAL application? I thought it'd just block other queries 
trying to acquire a conflicting lock in the standby, just like holding 
an AccessExclusiveLock on the primary does. It's unrelated to the xmin 
horizon issue.


Yes, it is unrelated to the xmin horizon issue. There are two reasons
for delaying WAL apply:
* locks
* xmin horizon

When a lock is acquired on the primary it almost always precedes an
action which cannot occur concurrently. For example, if VACUUM did
truncate a table then queries could get errors because parts of their
table disappear from under them. Others are drop table etc..


Have you implemented the query cancellation mechanism for that scenario 
too? (I'm cool either way, just curious..)


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stats queries versus per-database encodings

2009-01-04 Thread Heikki Linnakangas

Tom Lane wrote:

I notice that the pg_stat_statements patch is applying pg_mbcliplen()
to query strings, in the fond illusion that it knows what encoding
they are in.

This brings up a bigger issue, namely that pg_stat_activity isn't
exactly encoding-proof either --- whatever encoding is in use in a
particular database is what query strings from backends in that database
will be stored in.  Readers in another database will be exposed to
strings that probably aren't encoded correctly for their DB.

We could attack this by including source database's encoding in the
shared-memory entries, and performing a conversion on the fly when
reading out the data.  However, what happens if the conversion fails?
Seems like this provides a way for users to hide their queries from
the DBA ... just include a comment with some characters that are
untranslatable.


The DBA could always connect to the same database to see the query in 
its original form, so I don't think it provides a very useful way to 
hide queries.


The most useful behavior would be to replace the untranslatable 
characters with "?". I'm not sure how invasive the changes to the 
conversion functions would be to support that.
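
Something along these lines, as a sketch (try_convert_one_char() is a
hypothetical per-character primitive that would have to be added, since the
existing conversion routines simply error out on untranslatable input):

#include "postgres.h"
#include "lib/stringinfo.h"
#include "mb/pg_wchar.h"

/* hypothetical: returns a palloc'd converted string, or NULL on failure */
extern char *try_convert_one_char(const char *s, int len,
                                  int src_encoding, int dest_encoding);

/*
 * Sketch of a lossy conversion: convert src to the destination encoding,
 * substituting '?' for characters that cannot be translated.
 */
static char *
convert_string_lossy(const char *src, int len,
                     int src_encoding, int dest_encoding)
{
    StringInfoData buf;
    int         offset = 0;

    initStringInfo(&buf);
    while (offset < len)
    {
        int     charlen = pg_encoding_mblen(src_encoding, src + offset);
        char   *converted = try_convert_one_char(src + offset, charlen,
                                                 src_encoding, dest_encoding);

        if (converted != NULL)
        {
            appendStringInfoString(&buf, converted);
            pfree(converted);
        }
        else
            appendStringInfoChar(&buf, '?');

        offset += charlen;
    }
    return buf.data;
}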


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 8.4 - psql output for \l

2009-01-04 Thread hubert depesz lubaczewski
Is it going to stay that way? I find it actually worse than the 8.3
behaviour:

(dep...@[local]:5840) 11:43:40 [depesz]
# \l
List of databases
   Name|  Owner   | Encoding |  Collation  |Ctype| Access privileges
---+--+--+-+-+---
 depesz| depesz   | UTF8 | pl_PL.UTF-8 | pl_PL.UTF-8 |
 postgres  | pgdba| UTF8 | pl_PL.UTF-8 | pl_PL.UTF-8 |
 template0 | pgdba| UTF8 | pl_PL.UTF-8 | pl_PL.UTF-8 | =c/pgdba
 : pgdba=CTc/pgdba
 template1 | pgdba| UTF8 | pl_PL.UTF-8 | pl_PL.UTF-8 | pgdba=CTc/pgdba
 : =c/pgdba
 test  | depesz   | UTF8 | pl_PL.UTF-8 | pl_PL.UTF-8 |
(7 rows)

Do we really have to introduce newlines even when the output would (without any
problem) fit on a single line?

I mean - I understand that if the grants are long and complex it makes sense
to display them like this, but displaying extra lines just for superuser
privileges looks weird.

Best regards,

depesz

-- 
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Mark Kirkwood

Simon Riggs wrote:

On Wed, 2008-12-17 at 15:21 +, Simon Riggs wrote:
  

http://wiki.postgresql.org/wiki/Hot_Standby

now contains a link to latest version of this patch. 



v6 of Hot Standby now uploaded to Wiki (link above), with these changes:

* Must ignore_killed_tuples and never kill_prior_tuple during index
scans in recovery (v6)
  * XLOG_BTREE_DELETE records handled correctly (v6)
  * btree VACUUM code - must scan every block of index (v6)
  * BEGIN TRANSACTION READ WRITE should throw error (v6)

New test cycle starting with this patch over next few days.

Work continues on other items.

Happy New Year everyone,

  

I'm running some tests on v6a. The setup is:

- install master, setup standby as usual, start standby
- create database bench on master
- initialize pgbench dataset size 100 on master
- start 4 clients doing 50 transactions each.

After about 10 transactions have been processed on the master, query 
the standby:


bench=# \d history
 Table "public.history"
Column |Type | Modifiers
+-+---
tid| integer |
bid| integer |
aid| integer |
delta  | integer |
mtime  | timestamp without time zone |
filler | character(22)   |

bench=# select now(),count(*) from history;
ERROR:  could not open relation base/16384/16394: No such file or directory

regards

Mark




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latest version of Hot Standby patch: unexpected error querying standby

2009-01-04 Thread Simon Riggs

On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:

> bench=# select now(),count(*) from history;
> ERROR:  could not open relation base/16384/16394: No such file or
> directory

Thanks for the report.

I'm attempting to recreate now.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers