Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Greg Smith

Darren Duncan wrote:
This matter reminds me of a discussion on the SQLite list years ago 
about whether pragma synchronous=normal or synchronous=full should be 
the default, and thankfully 'full' won.


Right now, when I see deployments in the field, serious database servers 
set up by professional DBAs tend to use the right hardware and setup to 
do the correct thing with PostgreSQL.  And trivial installs done for 
testing purposes cheat, but many of those users don't really care, 
because they know they are not running a real server, and expect that 
their desktop is neither reliable nor fast at database work.  The SQLite 
situation has a slightly different context, because the places it gets 
embedded don't so regularly have a DBA involved at all, even in 
situations where the data is important.  It's often just system software 
sitting in the background that nobody is even aware of.


I also remember when SQLite did come out of the background, when it was 
crucified for being the cause of Firefox slowdowns actually linked to 
changed kernel fsync behavior.  That's the sort of bad press this 
project really doesn't need right now, when it actually doesn't matter 
on so many production database servers.  You may not be aware that 
there's already such a change floating around out there.  PostgreSQL 
installs on Linux kernel 2.6.32 or later using ext4 are dramatically 
slower out of the box than they used to be, because the OS started doing 
the right thing by default; no change in the database code.  I remain in 
mild terror that this news is going to break in a bad way and push this 
community into damage control.  So far I've only seen that reported on 
Phoronix, and that report included testimony from a kernel developer 
that the kernel had introduced the change, so the coverage wasn't so 
bad.  The next such publicized report may not be so informed.


Some of this works out to when to change things rather than what to 
change.  PostgreSQL is at a somewhat critical spot right now.  If people 
grab a new version, and performance sucks compared to earlier ones, 
they're not going to think "oh, maybe they changed an option and the new 
version is tuned better for safety".  They're going to say "performance 
sucks on this database now" and give up on it.  Many evals are done on 
hardware that isn't representative of a real database server, and if we 
make a change that only hurts those people--while not actually impacting 
production-quality hardware--that needs to be done carefully.  And 
that's exactly what I think would happen here if this were just changed 
all of a sudden.


I don't think anyone is seriously opposed to changing the defaults for 
safety instead of performance.  The problem is that said change would 
need to be *preceded* by a major update to the database documentation, 
and perhaps even some code changes to issue warnings when you create a 
cluster with what is going to turn out to now be a slow configuration.  
We'd need to make it really obvious to people who upgrade and notice 
that performance tanks that it's because of a configuration change made 
for safety reasons, one that they can undo for test deployments.  That 
particular area, giving people better advice about what they should do 
to properly tune a new install for its intended workload, is something 
that's been making slow progress but still needs a lot of work.  I think 
if some better tools come along there, so that most people are expected 
to follow a path that involves a tuning tool, it will be much easier to 
stomach the idea of changing the default--knowing that the tool is 
likely to show the user an option that undoes the change, making the 
possibility obvious.


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance"Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book




Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Darren Duncan

Greg Smith wrote:
You didn't quote the next part of that, which says "fsync() is not 
sufficient to guarantee that your data is on stable
storage and on MacOS X we provide a fcntl(), called F_FULLFSYNC, to ask 
the drive to flush all buffered data to stable storage."  That's exactly 
what turning on fsync_writethrough does in PostgreSQL.  See 
http://archives.postgresql.org/pgsql-hackers/2005-04/msg00390.php as the 
first post on this topic that ultimately led to that behavior being 
implemented.


From the perspective of the database, whether or not the behavior is 
standards compliant isn't the issue.  Whether pages make it to physical 
disk or not when fsync is called, or when O_DSYNC writes are done on 
platforms that support them, is the important part.  If the OS doesn't 
do that, it is doing nothing useful from the perspective of the 
database's expectations.  And that's not true on Darwin unless you 
specify F_FULLFSYNC, which doesn't happen by default in PostgreSQL.  It 
only does that when you switch wal_sync_method=fsync_writethrough.


Greg Smith also wrote:

The main downside to switching the default on either OS X or Windows is 
that developers using those platforms for test deployments will suffer 
greatly from a performance drop for data they don't really care about.  
As those two in particular are much more likely to be client development 
platforms, too, that's a scary thing to consider.

I think that, bottom line, Postgres should default to whatever the safest 
and most reliable behavior is on each platform, because data integrity is 
the most important thing: ensuring that a commit that returns has actually 
written the data to disk.  If performance is worse, then so what?  Code that 
does nothing has the best performance of all, and is also generally useless.


Whenever there is a tradeoff to be made between reliability and speed, 
users should have to explicitly choose the less reliable option, which 
would demonstrate they know what they're doing.  Let the testers explicitly 
choose a faster and less reliable option for the data they don't care about; 
otherwise, by default, users who don't know better should get the safest 
option, for data they likely care about.  That is a DBMS priority.


This matter reminds me of a discussion on the SQLite list years ago about 
whether pragma synchronous=normal or synchronous=full should be the default, and 
thankfully 'full' won.


-- Darren Duncan



Re: [HACKERS] patch: SQL/MED(FDW) DDL

2010-09-30 Thread Robert Haas
On Sep 29, 2010, at 10:09 AM, Alvaro Herrera  wrote:
> Excerpts from Robert Haas's message of mar sep 28 10:26:54 -0400 2010:
> 
>> Then:
>> 
>> - Begin a sequential scan with the following set of quals.
>> - Begin an index scan using the index called X with the following set of 
>> quals.
>> - Fetch next tuple.
>> - End scan.
> 
> I'm not sure that it's a good idea to embed into the FDW API the set of
> operations known to the executor.  For example your proposal fails to
> consider bitmap scans.  Seems simpler and more general to hand the quals
> over saying "I need to scan this relation with these quals", and have it
> return an opaque iterable object; the remote executor would be in charge
> of determining their usage for indexscans; or even for filtering tuples
> with quals that cannot be part of the index condition.

Yeah, that might be better.  Is it reasonable to assume we always want to push 
down as much as possible, or do we need to think about local work vs. remote 
work trade-offs?

> There doesn't seem to be much point in knowing the names of remote indexes
> either (if it came to referencing them, better use OIDs)

Well, you can't assume the remote side is PG.

...Robert


Re: [HACKERS] security hook on table creation

2010-09-30 Thread Robert Haas
On Sep 30, 2010, at 9:01 PM, KaiGai Kohei  wrote:
> (2010/10/01 3:09), Robert Haas wrote:
>> 2010/9/29 KaiGai Kohei:
>>> In addition, I want to give these entrypoints names which
>>> represent the appropriate purpose of each hook, rather than
>>> a uniform one.
>> 
>> It sounds like you're proposing to create a vast number of hooks
>> rather than just one.  If we have ~20 object types in the system,
>> that's 40 hooks just for create and drop, and then many more to handle
>> comment, alter (perhaps in various flavors), etc.  I'm pretty
>> unexcited about that.  The main hook function can always dispatch
>> internally if it so desires, but I don't see any benefit to forcing
>> people to write the code that way.
>> 
> What I proposed is to create just one hook, plus wrapper functions
> with appropriate names that call the hook with appropriate parameters,
> in the style of SearchSysCache1, 2, 3 and 4.

Seems like you'd end up creating a lot of macros that were only used once.

> BTW, as an aside, the SearchSysCacheX() interface also inspired me.
> If the hook function can deliver a few Datum values depending on object
> types and event types, it may allow the main hook to handle most of
> the security checks, even if we need to add various flavors.

Good idea.  Let's leave that out for the first version of this, though.

...Robert



Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-09-30 Thread Robert Haas
On Sep 30, 2010, at 9:07 PM, Itagaki Takahiro  
wrote:
> Hi, Leonardo-san,
> 
> On Fri, Oct 1, 2010 at 4:04 AM, Tom Lane  wrote:
>> The wording should be something like "CLUSTER requires transient disk
>> space equal to about twice the size of the table plus its indexes".
> 
> Could you merge those discussions into the final patch?
> Also, please check whether my modification broke your patch.
> Thank you.

It sounds like the costing model might need a bit more work before we commit 
this.

...Robert


Re: [HACKERS] Adding getrusage profiling data to EXPLAIN output

2010-09-30 Thread Itagaki Takahiro
On Fri, Oct 1, 2010 at 9:16 AM, Greg Stark  wrote:
> Attached is a patch to display getrusage output to EXPLAIN output.
> This is the patch I mentioned previously in
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00684.php and
> it raises the same issues we were talking about then.

How much overhead does the "resource" option have?
Calling getrusage() for each tuple might have considerable overhead.
How much difference is there between the (analyze) and (analyze,
resource) options?

auto_explain and pg_stat_statements should also be adjusted to the change
once the patch is accepted.  I've been asked for "queries ordered by CPU
time" in pg_stat_statements several times; the getrusage infrastructure
will make that a real possibility.
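
For a rough sense of scale, here is a hypothetical standalone
microbenchmark (not part of the patch) that times raw getrusage() calls;
the per-call cost multiplied by the tuple count gives a floor on the
added per-tuple overhead:

/* Hypothetical microbenchmark: estimate the cost of one getrusage() call. */
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

int
main(void)
{
	const long	n = 1000000;
	struct rusage ru;
	struct timeval start, end;
	double		elapsed;
	long		i;

	gettimeofday(&start, NULL);
	for (i = 0; i < n; i++)
		getrusage(RUSAGE_SELF, &ru);
	gettimeofday(&end, NULL);

	elapsed = (end.tv_sec - start.tv_sec) +
		(end.tv_usec - start.tv_usec) / 1000000.0;
	printf("%ld calls in %.3f s (%.0f ns/call)\n",
		   n, elapsed, elapsed / n * 1e9);
	return 0;
}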

-- 
Itagaki Takahiro



Re: [HACKERS] Postgres vs. intel ccNUMA on Linux

2010-09-30 Thread Greg Smith

James Robinson wrote:
Any tips / conventional wisdom regarding running postgres on 
large-ish memory ccNUMA intel machines, such as a 32G dual-quad-core, 
showing two NUMA nodes of 16G each? I expect each postgres backend's 
non-shared memory usage to remain nice and reasonably sized, hopefully 
staying within the confines of its processor's local memory region, 
but how will accesses to shared memory and / or buffer cache play out? 
Do people tune their backends via 'numactl' ?


My gut feel here is that the odds this particular area will turn into 
your bottleneck are so slim that worrying about it in advance is 
premature optimization.
situation where processor time that might be altered via such fine 
control is your bottleneck, as opposed to disks, buffer cache 
contention, the ProcArray contention Robert mentioned, WAL contention, 
or something else like that--all things you can't segment usefully 
here--well maybe at that point I'd start chasing after numactl.  As for 
how likely that is, all I can say is I've never gotten there before 
finding a much more obvious bottleneck first.
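
If you do get to that point, the pinning itself is not much code.  Here
is a hypothetical Linux-only sketch of what numactl-style binding amounts
to; the core numbers are a pure assumption about which cores belong to
one node:

/* Hypothetical sketch: pin the current process (and anything it execs,
 * e.g. a postmaster) to cores 0-3, assumed to be one NUMA node's cores;
 * roughly what "numactl --physcpubind=0-3" does. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int
main(void)
{
	cpu_set_t	mask;
	int			cpu;

	CPU_ZERO(&mask);
	for (cpu = 0; cpu < 4; cpu++)
		CPU_SET(cpu, &mask);

	if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
	{
		perror("sched_setaffinity");
		return 1;
	}
	/* exec the real workload here; it inherits the affinity mask */
	return 0;
}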


However, I recently wrote a little utility to test memory speeds as 
increasing numbers of clients do things on a system, and it may provide 
you some insight into how your system responds as different numbers of 
them do things:  http://github.com/gregs1104/stream-scaling


I've gotten results submitted to me where you can see memory speeds 
fluctuate on servers where threads bounce between processors and their 
associated memory, stuff that goes away if you then lock the test 
program to specific cores.  If you want to discuss results from trying 
that on your system and how that might impact real-world server 
behavior, I'd recommend posting about that to the pgsql-performance list 
rather than this one.  pgsql-hackers is more focused on code-level 
issues with PostgreSQL.  There really aren't any of those in the area 
you're asking about, as the database is blind to what the OS is doing 
underneath of it here.


Furthermore, if one had more than one database being served by the 
machine, would it be advisable to do this via multiple clusters 
instead of a single cluster, tweaking the processor affinity of each 
postmaster accordingly, trying to ensure each cluster's shared memory 
segments and buffer cache pools remain local for the resulting backends?


If you have a database that basically fits in memory, that might 
actually work.  Note however that the typical useful tuning for 
PostgreSQL puts more cache into the operating system side of things than 
what's dedicated to the database, and that may end up mixed across 
"banks" as it were.  I'd still place my money on running into another 
limitation first, but the idea is much more sound.  What I would try 
doing here is running the SELECT-only version of pgbench against both 
clusters at once, and see if you really can get more total oomph out of 
the system than a single cluster of twice the size.  The minute disks 
start entering the picture, though, you're likely to end up back where 
processor/memory affinity is the least of your concerns.


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance"Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book




Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-09-30 Thread Itagaki Takahiro
Hi, Leonardo-san,

On Fri, Oct 1, 2010 at 4:04 AM, Tom Lane  wrote:
> The wording should be something like "CLUSTER requires transient disk
> space equal to about twice the size of the table plus its indexes".

Could you merge those discussions into the final patch?
Also, please check whether my modification broke your patch.
Thank you.

-- 
Itagaki Takahiro



Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Greg Smith

A.M. wrote:

That is not correct.  fsync and friends on Darwin synchronize I/O and flush 
dirty kernel caches to the disk, which meets the specification and is 
distinctly different from doing nothing...
"On MacOS X, fsync() always has and always will flush all file data
from host memory to the drive on which the file resides."
http://lists.apple.com/archives/Darwin-dev/2005/Feb/msg00072.html


You didn't quote the next part of that, which says "fsync() is not 
sufficient to guarantee that your data is on stable
storage and on MacOS X we provide a fcntl(), called F_FULLFSYNC, to ask 
the drive to flush all buffered data to stable storage."  That's exactly 
what turning on fsync_writethrough does in PostgreSQL.  See 
http://archives.postgresql.org/pgsql-hackers/2005-04/msg00390.php as the 
first post on this topic that ultimately led to that behavior being 
implemented.


From the perspective of the database, whether or not the behavior is 
standards compliant isn't the issue.  Whether pages make it to physical 
disk or not when fsync is called, or when O_DSYNC writes are done on 
platforms that support them, is the important part.  If the OS doesn't 
do that, it is doing nothing useful from the perspective of the 
database's expectations.  And that's not true on Darwin unless you 
specify F_FULLFSYNC, which doesn't happen by default in PostgreSQL.  It 
only does that when you switch wal_sync_method=fsync_writethrough.
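
To make the distinction concrete, here is a minimal hypothetical Darwin
test program (not PostgreSQL code); the fcntl() call is roughly what
fsync_writethrough adds on top of a plain fsync():

/* Hypothetical Darwin-only sketch: flush one write all the way to disk. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	int			fd = open("testfile", O_WRONLY | O_CREAT, 0600);
	const char *buf = "commit record\n";

	if (fd < 0 || write(fd, buf, strlen(buf)) < 0)
		return 1;
	if (fsync(fd) != 0)			/* flushes host memory to the drive's cache */
		return 1;
#ifdef F_FULLFSYNC
	if (fcntl(fd, F_FULLFSYNC, 0) < 0)	/* asks the drive itself to flush */
		return 1;
#endif
	close(fd);
	return 0;
}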


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance"Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book




Re: [HACKERS] security hook on table creation

2010-09-30 Thread KaiGai Kohei
(2010/10/01 3:09), Robert Haas wrote:
> 2010/9/29 KaiGai Kohei:
>> In addition, I want to give these entrypoints names which
>> represent the appropriate purpose of each hook, rather than
>> a uniform one.
> 
> It sounds like you're proposing to create a vast number of hooks
> rather than just one.  If we have ~20 object types in the system,
> that's 40 hooks just for create and drop, and then many more to handle
> comment, alter (perhaps in various flavors), etc.  I'm pretty
> unexcited about that.  The main hook function can always dispatch
> internally if it so desires, but I don't see any benefit to forcing
> people to write the code that way.
> 
What I proposed is to create just one hook, plus wrapper functions
with appropriate names that call the hook with appropriate parameters,
in the style of SearchSysCache1, 2, 3 and 4.

However, the reason I proposed the wrapper functions was mainly
a sense of beauty in the code, so call it my preference.
For now, I'll try to work on it as you suggested.

---
BTW, as an aside, the SearchSysCacheX() interface also inspired me.
If the hook function can deliver a few Datum values depending on object
types and event types, it may allow the main hook to handle most of
the security checks, even if we need to add various flavors.
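
To illustrate what I mean (all names below are hypothetical, assuming the
usual postgres.h and catalog headers; this is a sketch, not committed
code): a single function-pointer hook, with thin named wrappers so each
call site documents its purpose, like the SearchSysCacheX() family:

/* Hypothetical sketch of "one hook, named wrappers". */
typedef enum ObjectAccessType
{
	OAT_CREATE,
	OAT_DROP
} ObjectAccessType;

typedef void (*object_access_hook_type) (ObjectAccessType access,
										 Oid classId, Oid objectId);

extern object_access_hook_type object_access_hook;

static inline void
InvokeObjectAccessHook(ObjectAccessType access, Oid classId, Oid objectId)
{
	if (object_access_hook)
		(*object_access_hook) (access, classId, objectId);
}

/* Wrappers whose names state the purpose of the check. */
static inline void
CheckRelationCreate(Oid relationId)
{
	InvokeObjectAccessHook(OAT_CREATE, RelationRelationId, relationId);
}

static inline void
CheckRelationDrop(Oid relationId)
{
	InvokeObjectAccessHook(OAT_DROP, RelationRelationId, relationId);
}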

Thanks,
-- 
KaiGai Kohei 



Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Greg Smith

Tom Lane wrote:

I'm not sure whether we should select fsync_writethrough as the default
on OSX.  We don't make an equivalent attempt to prevent OS or storage
malfeasance on other Unixoid platforms --- in fact, I'd say OSX is a bit
ahead of the game in that you *can* force writethrough without resorting
to arcane hacks with hdparm or some such.
  


The same issue exists on Windows too, with the same workaround:  it's 
normally unsafe by default, and you have to switch to fsync_writethrough 
in order to get commit safety while keeping write caching for regular 
writes; still ahead of most Unixes, because at least it's that easy.


The main downside to switching the default on either OS X or Windows is 
developers using those platforms for test deployments will suffer 
greatly from a performance drop for data they don't really care about.  
As those two in particular are much more likely to be client development 
platforms, too, that's a scary thing to consider.


As for the documentation, I wrote 
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm a 
while ago trying to address this better, but never got to merging some 
of those notes into the official docs.  At the time, such heavy linking 
to external URLs was particularly frowned upon in the official docs.  
While that's eased up a bit, what I think I could do now is convert that 
article into something that's on the Wiki instead, and we could point 
the "Reliability" section of the docs toward there as something that 
could stay live as platform-specific changes happen.  I'm thinking of 
the whole ext4 behavior change when I say that; stuff like that will 
only be addressable in a live document, since changes were happening in 
between major releases in that area.


More recently, I've written a few things covering this area in what 
approaches excruciating detail, for this little document you might have 
been spammed in somebody's signature about.  I'm going to ask the 
publisher to make that specific material the public sample chapter for 
the book, in hopes of making it easier for people to find a detailed 
discussion of this topic.  Regardless of whether that works out, I have 
to clean up the documentation in this whole area for the checkpoint 
changes I'm working on for 9.1 anyway.  That tinkers with the timing on 
fsync calls, and whether that works or not is very filesystem specific.  
I'm basically stuck with documenting exactly what happens for most 
possibilities in order to provide a comprehensive manual section 
covering that.  I have a deadline for when I have to get my work in 
progress organized to share with the world now:  
https://www.postgresqlconference.org/content/righting-your-writes , so 
after I get back from that conference in November I'll see what I can do 
about the docs too.


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance"Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book




Re: [HACKERS] recovery.conf location

2010-09-30 Thread Josh Berkus
On 9/29/10 7:54 PM, Tom Lane wrote:
> Robert Haas  writes:
>> But that's not what Tom is talking about, I don't think: you might
>> also want a way to explicitly whack the flag in pg_control around.
>> That would probably be along the lines of pg_resetxlog.  I'm not sure
>> how much use case there is for such a thing, but if it's needed it
>> certainly wouldn't be hard to write.
> 
> Right, but instead of having to provide such a tool, we could just
> store the status as a text file.  There is a pretty time-honored
> tradition for that, ya know.

And then move all the other config parameters to postgresql.conf?  And
have PG poll that text file periodically so that you could update it and
it would fail over?

+1.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



[HACKERS] Adding getrusage profiling data to EXPLAIN output

2010-09-30 Thread Greg Stark
Attached is a patch to display getrusage output in EXPLAIN output.
This is the patch I mentioned previously in
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00684.php and
it raises the same issues we were talking about then. Should the
resource usage stats displayed be per-iteration, or totals for the whole
query execution for that node? I can't see dividing by nloops making
things clearer, but the discrepancy is becoming more noticeable. I
wonder if there's anything we can do to make things clearer.

Incidentally, this is a first cut at the patch written in a single
sitting. I need to go through it again to make sure I didn't do
anything silly. One thing I know is broken is Windows support. I
fleshed out our getrusage stub a bit but I can't test it at all, and
the INSTR_TIME_* macros actually are defined differently on windows so
I can't use them quite so simply on struct timeval. The simplest
option is to define the struct timeval versions always even if
INSTR_TIME_* doesn't use them and this code can use them directly.
Another would be to modify the struct rusage definition on Windows so
we use the native Windows time datatype -- which would be tempting
since it would avoid the loss of precision in "only" having
microsecond precision.
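
For reference, the struct timeval arithmetic involved is small enough to
define unconditionally; here's a sketch of the accumulate-difference step
(assuming <sys/time.h>; this mirrors what INSTR_TIME_ACCUM_DIFF does on
Unix, where instr_time is a struct timeval):

/* Sketch: acc += (end - start), in struct timeval arithmetic. */
static void
timeval_accum_diff(struct timeval *acc,
				   const struct timeval *end, const struct timeval *start)
{
	acc->tv_sec += end->tv_sec - start->tv_sec;
	acc->tv_usec += end->tv_usec - start->tv_usec;
	while (acc->tv_usec < 0)
	{
		acc->tv_usec += 1000000;
		acc->tv_sec--;
	}
	while (acc->tv_usec >= 1000000)
	{
		acc->tv_usec -= 1000000;
		acc->tv_sec++;
	}
}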


postgres=# explain (analyze,  resource) select * from x;
 QUERY PLAN
-
 Seq Scan on x  (cost=0.00..11540.00 rows=80 width=4) (actual
time=69.851..1287.025 rows=80 loops=1)
   Resources: sys=240.000ms user=940.000ms read=27.7MB
 Total runtime: 2431.237 ms
(3 rows)

[...flush buffers and drop caches between tests...]

postgres=# explain (analyze, verbose, buffers, resource) select * from x;
 QUERY PLAN

 Seq Scan on public.x  (cost=0.00..11540.00 rows=80 width=4)
(actual time=28.739..1290.786 rows=80 loops=1)
   Output: generate_series
   Resources: sys=280.000ms user=890.000ms read=27.7MB minflt=65
nvcsw=9 nivcsw=633
   Buffers: shared read=3540
 Total runtime: 2487.629 ms
(5 rows)


-- 
greg
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 105,110  static void ExplainJSONLineEnding(ExplainState *es);
--- 105,111 
  static void ExplainYAMLLineStarting(ExplainState *es);
  static void escape_json(StringInfo buf, const char *str);
  static void escape_yaml(StringInfo buf, const char *str);
+ static double normalize_memory(double amount, char **unit, int *precision);
  
  
  
***
*** 137,142  ExplainQuery(ExplainStmt *stmt, const char *queryString,
--- 138,145 
  			es.costs = defGetBoolean(opt);
  		else if (strcmp(opt->defname, "buffers") == 0)
  			es.buffers = defGetBoolean(opt);
+ 		else if (strcmp(opt->defname, "resource") == 0)
+ 			es.rusage = defGetBoolean(opt);
  		else if (strcmp(opt->defname, "format") == 0)
  		{
  			char	   *p = defGetString(opt);
***
*** 363,368  ExplainOnePlan(PlannedStmt *plannedstmt, ExplainState *es,
--- 366,373 
  		instrument_option |= INSTRUMENT_TIMER;
  	if (es->buffers)
  		instrument_option |= INSTRUMENT_BUFFERS;
+ 	if (es->rusage)
+ 		instrument_option |= INSTRUMENT_RUSAGE;
  
  	/*
  	 * Use a snapshot with an updated command ID to ensure this query sees
***
*** 1086,1091  ExplainNode(PlanState *planstate, List *ancestors,
--- 1091,1187 
  			break;
  	}
  
+ 	/* Show resource usage from getrusage */
+ 	if (es->rusage && es->format == EXPLAIN_FORMAT_TEXT)
+ 	{
+ 		const struct rusage *usage = &planstate->instrument->rusage;
+ 
+ 		bool has_rusage = (!INSTR_TIME_IS_ZERO(usage->ru_stime) || 
+ 		   !INSTR_TIME_IS_ZERO(usage->ru_utime) || 
+ 		   usage->ru_inblock > 0 || 
+ 		   usage->ru_oublock > 0);
+ 		bool has_verbose_rusage = (usage->ru_minflt   > 0 || 
+    usage->ru_majflt   > 0 ||
+    usage->ru_nswap> 0 || 
+    usage->ru_msgsnd   > 0 || 
+    usage->ru_msgrcv   > 0 || 
+    usage->ru_nsignals > 0 || 
+    usage->ru_nvcsw> 0 || 
+    usage->ru_nivcsw   > 0);
+ 
+ 		if (has_rusage || (es->verbose && has_verbose_rusage))
+ 		{
+ 			appendStringInfoSpaces(es->str, es->indent *2);
+ 			appendStringInfoString(es->str, "Resources:");
+ 			
+ 			if (!INSTR_TIME_IS_ZERO(usage->ru_stime)) 
+ 			{
+ double stime = INSTR_TIME_GET_DOUBLE(usage->ru_stime);
+ appendStringInfo(es->str, " sys=%.3fms", stime * 1000);
+ 			}
+ 			
+ 			if (!INSTR_TIME_IS_ZERO(usage->ru_utime)) 
+ 			{
+ double utime = INSTR_TIME_GET_DOUBLE(usage->ru_utime);
+ appendStringInfo(es->str, " user=%.3fms", utime * 1000);
+ 			}
+ 			
+ 			if (usage->ru_inblock > 0)
+ 			{
+ double inblock;
+ char *units;
+ int pr

Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Greg Stark
On Thu, Sep 30, 2010 at 2:22 PM, A.M.  wrote:
> That is not correct. fsync and friends on Darwin synchronizes I/O and flushes 
> dirty kernel caches to the disk which meets the specification and is 
> distinctly different from doing nothing.

How exactly is it different from doing nothing? That is, in what
situation does doing this change in any way the behaviour from the
user's point of view versus doing nothing?

People keep saying it "meets the specification" but it's a useless
interpretation of the specification. And it requires a pretty
malicious reading of "transferred to the storage device" to read it as
"transferred it from one set of ram buffers to another more closely
associated with the actual persistent storage".


> It's too bad there is no cross-platform way to ask what level of 
> hardware-syncing is available.

Why would the user want to ask this? As far as the user is concerned
either there are only two "levels": synced or not synced. If it's not
guaranteed to persist after a power failure it's not synced. It
doesn't matter whether it's in kernel buffers, drive buffers, or
anywhere else -- they're all the same from the user's point of view --
they're non-persistent.

The other useful syncing behaviour would be to get write barriers. But
that's a whole different API, not just a behaviour that can be
attached to fsync.



-- 
greg



Re: [HACKERS] Patch author name on commitfest page

2010-09-30 Thread Robert Haas
On Thu, Sep 30, 2010 at 5:51 PM, Gurjeet Singh  wrote:
> On Thu, Sep 30, 2010 at 11:45 PM, Robert Haas  wrote:
>>
>> On Thu, Sep 30, 2010 at 5:38 PM, Gurjeet Singh 
>> wrote:
>> > Can we please change the comment lines below the patch heading to have
>> > the
>> > real name instead of the postgresql.org ID?
>> >
>> > Patch by Pavel Stehule
>> > Patch by Gurjeet Singh
>> >
>> > instead of
>> >
>> > Patch by okbobcz
>> > Patch by singh.gurjeet
>> >
>> > https://commitfest.postgresql.org/action/commitfest_view?id=8
>>
>> I don't think that information is available to the app, at least at
>> present... when you log in, it is empowered to ask the community login
>> DB "is this combination of a username and a password valid?" but all
>> it gets back is "yes" or "no".
>
> If you click on the patch title on that page, it takes you to the patch
> details page, where the real author name seems to be available to the app.
>
> https://commitfest.postgresql.org/action/patch_view?id=393

You're mixing up two different things.

Each patch has an "Author" field and a "Reviewers" field.  These are
displayed on both the CF summary page (in the columns with those
names) and on the patch detail page (in the rows with those names).
They are text fields, so they display whatever someone types into
them.

Each *comment* on a patch has an author also.  This is also displayed
on both the CF summary page (for the most recent 3 comments) and on
the patch detail page (for all comments) and the authorship
information is automatically populated from the user's login.

It would be possible to display both of these using the same format if
that information is available in the community login database and can
be replicated into the CF database, but currently I don't have it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Patch author name on commitfest page

2010-09-30 Thread Andrew Dunstan



On 09/30/2010 05:51 PM, Gurjeet Singh wrote:
On Thu, Sep 30, 2010 at 11:45 PM, Robert Haas wrote:

On Thu, Sep 30, 2010 at 5:38 PM, Gurjeet Singh wrote:
> Can we please change the comment lines below the patch heading to have the
> real name instead of the postgresql.org ID?
>
> Patch by Pavel Stehule
> Patch by Gurjeet Singh
>
> instead of
>
> Patch by okbobcz
> Patch by singh.gurjeet
>
> https://commitfest.postgresql.org/action/commitfest_view?id=8

I don't think that information is available to the app, at least at
present... when you log in, it is empowered to ask the community login
DB "is this combination of a username and a password valid?" but all
it gets back is "yes" or "no".


If you click on the patch title on that page, it takes you to the patch 
details page, where the real author name seems to be available to the app.


https://commitfest.postgresql.org/action/patch_view?id=393




That's because there is a field to set the author's name. See


At the same time, I would have thought that allowing the Commitfest app 
access to the user's actual name would be acceptable.


cheers

andrew


Re: [HACKERS] Patch author name on commitfest page

2010-09-30 Thread Gurjeet Singh
On Thu, Sep 30, 2010 at 11:45 PM, Robert Haas  wrote:

> On Thu, Sep 30, 2010 at 5:38 PM, Gurjeet Singh 
> wrote:
> > Can we please change the comment lines below the patch heading to have
> the
> > real name instead of the postgresql.org ID?
> >
> > Patch by Pavel Stehule
> > Patch by Gurjeet Singh
> >
> > instead of
> >
> > Patch by okbobcz
> > Patch by singh.gurjeet
> >
> > https://commitfest.postgresql.org/action/commitfest_view?id=8
>
> I don't think that information is available to the app, at least at
> present... when you log in, it is empowered to ask the community login
> DB "is this combination of a username and a password valid?" but all
> it gets back is "yes" or "no".
>

If you click on the patch title on that page, it takes you to the patch
details page, where the real author name seems to be available to the app.

https://commitfest.postgresql.org/action/patch_view?id=393

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] Patch author name on commitfest page

2010-09-30 Thread Robert Haas
On Thu, Sep 30, 2010 at 5:38 PM, Gurjeet Singh  wrote:
> Can we please change the comment lines below the patch heading to have the
> real name instead of the postgresql.org ID?
>
> Patch by Pavel Stehule
> Patch by Gurjeet Singh
>
> instead of
>
> Patch by okbobcz
> Patch by singh.gurjeet
>
> https://commitfest.postgresql.org/action/commitfest_view?id=8

I don't think that information is available to the app, at least at
present... when you log in, it is empowered to ask the community login
DB "is this combination of a username and a password valid?" but all
it gets back is "yes" or "no".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] Patch author name on commitfest page

2010-09-30 Thread Gurjeet Singh
Can we please change the comment lines below the patch heading to have the
real name instead of the postgresql.org ID?

Patch by Pavel Stehule
Patch by Gurjeet Singh

instead of

Patch by okbobcz
Patch by singh.gurjeet

https://commitfest.postgresql.org/action/commitfest_view?id=8

Thanks,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Robert Haas
On Thu, Sep 30, 2010 at 5:02 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Oh, I missed that.  Actually, I wasn't really so concerned with
>> whether his benchmark is correct.  I *am* concerned about being broken
>> out of the box on MacOS X.
>
> Actually, the problem with OSX is that OSX is broken out of the box,
> at least by that standard.  The system's normal configuration is that
> fsync() does nothing, so it's hardly surprising that O_DSYNC is no
> better.  You have to use wal_sync_method = fsync_writethrough to get
> actual bits-to-the-platter behavior.
>
> I'm not sure whether we should select fsync_writethrough as the default
> on OSX.  We don't make an equivalent attempt to prevent OS or storage
> malfeasance on other Unixoid platforms --- in fact, I'd say OSX is a bit
> ahead of the game in that you *can* force writethrough without resorting
> to arcane hacks with hdparm or some such.
>
> We could definitely stand to be a bit more verbose about documenting
> the platform-specific issues in this area.

I think some documentation is definitely in order, at the least.  It's
certainly astonishing that the default settings aren't crash-safe.
I'd really like to understand how this shakes out on different
platforms.

Whether we should try to work around them is a trickier question, but
I'm somewhat inclined to say yes.  If we're trying to have the system
be performant in the default config, turning off synchronous_commit
would be saner than failing to make use of a system call which we know
absolutely for sure to be necessary to avoid the possibility of
database corruption.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread A.M.

On Sep 30, 2010, at 5:02 PM, Tom Lane wrote:

> Robert Haas  writes:
>> Oh, I missed that.  Actually, I wasn't really so concerned with
>> whether his benchmark is correct.  I *am* concerned about being broken
>> out of the box on MacOS X.
> 
> Actually, the problem with OSX is that OSX is broken out of the box,
> at least by that standard.  The system's normal configuration is that
> fsync() does nothing, 

That is not correct. fsync and friends on Darwin synchronize I/O and flush 
dirty kernel caches to the disk, which meets the specification and is 
distinctly different from doing nothing.

"The fsync() function can be used by an application to indicate that all data 
for the open file description named by fildes is to be transferred to the 
storage device associated with the file described by fildes in an 
implementation-dependent manner."
http://opengroup.org/onlinepubs/007908799/xsh/fsync.html

"On MacOS X, fsync() always has and always will flush all file data
from host memory to the drive on which the file resides."
http://lists.apple.com/archives/Darwin-dev/2005/Feb/msg00072.html

> I'm not sure whether we should select fsync_writethrough as the default
> on OSX.  We don't make an equivalent attempt to prevent OS or storage
> malfeasance on other Unixoid platforms --- in fact, I'd say OSX is a bit
> ahead of the game in that you *can* force writethrough without resorting
> to arcane hacks with hdparm or some such.
> 
> We could definitely stand to be a bit more verbose about documenting
> the platform-specific issues in this area.

Not only is this issue platform-specific, it is also bus-, controller- and 
disk-specific. Luckily, hardware that ships from Apple responds properly to 
F_FULLFSYNC. It's too bad there is no cross-platform way to ask what level of 
hardware-syncing is available.

Cheers,
M


Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Tom Lane
Robert Haas  writes:
> Oh, I missed that.  Actually, I wasn't really so concerned with
> whether his benchmark is correct.  I *am* concerned about being broken
> out of the box on MacOS X.

Actually, the problem with OSX is that OSX is broken out of the box,
at least by that standard.  The system's normal configuration is that
fsync() does nothing, so it's hardly surprising that O_DSYNC is no
better.  You have to use wal_sync_method = fsync_writethrough to get
actual bits-to-the-platter behavior.

I'm not sure whether we should select fsync_writethrough as the default
on OSX.  We don't make an equivalent attempt to prevent OS or storage
malfeasance on other Unixoid platforms --- in fact, I'd say OSX is a bit
ahead of the game in that you *can* force writethrough without resorting
to arcane hacks with hdparm or some such.

We could definitely stand to be a bit more verbose about documenting
the platform-specific issues in this area.

regards, tom lane



Re: [HACKERS] Postgres vs. intel ccNUMA on Linux

2010-09-30 Thread Robert Haas
On Wed, Sep 29, 2010 at 2:45 PM, James Robinson
 wrote:
> Hackers,
>
>        Any tips / conventional wisdom regarding running postgres on
> large-ish memory ccNUMA intel machines, such as a 32G dual-quad-core,
> showing two NUMA nodes of 16G each? I expect each postgres backend's
> non-shared memory usage to remain nice and reasonably sized, hopefully
> staying within the confines of its processor's local memory region, but how
> will accesses to shared memory and / or buffer cache play out? Do people
> tune their backends via 'numactl' ?
>
>        Furthermore, if one had more than one database being served by the
> machine, would it be advisable to do this via multiple clusters instead of a
> single cluster, tweaking the processor affinity of each postmaster
> accordingly, trying to ensure each cluster's shared memory segments and
> buffer cache pools remain local for the resulting backends?

I was hoping someone more knowledgeable about this topic would reply, but...

Generally, I don't recommend running more than one postmaster on one
machine, because one big pool for shared_buffers is generally going to
be more efficient than two smaller pools.  However, as you say, the
shared memory region might be a problem, particularly for things like
the ProcArray, which are pretty "hot" and are accessed by every
backend during every transaction.  But I'm not sure whether the
additional overhead is going to be more or less than the overhead of
splitting the shared_buffers arena in half, so I suspect you're going
to have to benchmark it to find out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] git diff --patience

2010-09-30 Thread Kevin Grittner
Gurjeet Singh  wrote:
 
> The with-patience version has only two hunks, removal of a big
> block of comment and addition of a big block of code.
> 
> The without-patience version is riddled with a mix of the two
> hunks, spread out until line 120.
> 
> --patience is a clear winner here.
 
When I read the description of the algorithm, I can't imagine a
situation where --patience would make the diff *worse*.  I was
somewhat afraid (based on the name) that it would be slow; but
if it is slower, it hasn't been by enough for me to notice it.
 
-Kevin



Re: [HACKERS] Patch to reindex primary keys

2010-09-30 Thread Robert Haas
On Wed, Sep 29, 2010 at 8:42 PM, Gurjeet Singh  wrote:
>     Attached is a patch that implements replacing a primary key with another
> index. This would help overcome the limitation that primary keys cannot be
> reindexed
> without taking exclusive locks.

Sounds useful.  You should add this patch here so it gets reviewed
during the next CommitFest.

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Robert Haas
On Thu, Sep 30, 2010 at 4:09 PM, Dave Page  wrote:
> On Thu, Sep 30, 2010 at 8:26 PM, Robert Haas  wrote:
>> Jayant Kumar did some benchmarking of InnoDB vs. PostgreSQL and PG
>> came out 5 times faster.  The benchmark isn't very thoroughly
>> described, but it turns out not to matter.
>>
>> http://jayant7k.blogspot.com/2010/09/database-speed-tests-mysql-and.html
>>
>> Apparently, the reason we're faster is that wal_sync_method =
>> open_datasync, which is the default on MacOS X, doesn't actually work.
>
> That might be true, but if you check the comments, Jayant replied to say:
>
> @Andrew : I am running linux - ubuntu 10.04 - kernel 2.6.32-24. Linux
> does not support fsync_writethrough
> http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm
>
> So I don't think that invalidates his benchmark. Something else might
> of course...

Oh, I missed that.  Actually, I wasn't really so concerned with
whether his benchmark is correct.  I *am* concerned about being broken
out of the box on MacOS X.

(I also suspect problems with the benchmark.  It's hard to believe
we're 5x faster than InnoDB on an apples-to-apples comparison on
trivial queries.  I'd believe 20% either way, but 5x is a lot.  But
that's a question for another day.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] git diff --patience

2010-09-30 Thread Gurjeet Singh
Attached are two versions of the same patch, with and without --patience.

The with-patience version has only two hunks, removal of a big block of
comment and addition of a big block of code.

The without-patience version is riddled with a mix of the two hunks, spread
out until line 120.

--patience is a clear winner here.

Regards,

On Wed, Sep 29, 2010 at 5:10 PM, Kevin Grittner  wrote:

> Peter Eisentraut  wrote:
>
> > Do you have an existing commit where you see a difference so I can
> > try it and see if there is some other problem that my local
> > configuration has?
>
> Random poking around in the postgresql.git commits didn't turn up
> any where it mattered, so here's before and after files for the
> example diff files already posted.  If you create branch, commit the
> before copy, and copy in the after copy, you should be able to
> replicate the results I posted.
>
> -Kevin
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>


-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


diff_without_patience.patch
Description: Binary data


diff_with_patience.patch
Description: Binary data



Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Dave Page
On Thu, Sep 30, 2010 at 8:26 PM, Robert Haas  wrote:
> Jayant Kumar did some benchmarking of InnoDB vs. PostgreSQL and PG
> came out 5 times faster.  The benchmark isn't very thoroughly
> described, but it turns out not to matter.
>
> http://jayant7k.blogspot.com/2010/09/database-speed-tests-mysql-and.html
>
> Apparently, the reason we're faster is that wal_sync_method =
> open_datasync, which is the default on MacOS X, doesn't actually work.

That might be true, but if you check the comments, Jayant replied to say:

@Andrew : I am running linux - ubuntu 10.04 - kernel 2.6.32-24. Linux
does not support fsync_writethrough
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm

So I don't think that invalidates his benchmark. Something else might
of course...

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] patch: subscripts function

2010-09-30 Thread Pavel Stehule
Hello

The attached patch contains an implementation of a "subscripts" function.
Its functionality is the same as the generate_subscripts function, but it
is designed as a basis for iteration by plpgsql's FOR-IN-ARRAY.

Regards

Pavel Stehule
*** ./doc/src/sgml/func.sgml.orig	2010-09-29 10:40:40.0 +0200
--- ./doc/src/sgml/func.sgml	2010-09-30 21:25:07.294900077 +0200
***
*** 9940,9945 
--- 9940,9956 
 
  
   
+   subscripts(anyarray, int, boolean)
+  
+ 
+ int[]
+ returns an array of subscripts of the given array; when the third parameter is true, the result is reversed
+ subscripts(ARRAY[10,20,NULL,30,1], 1, true)
+ {5,4,3,2,1}
+
+
+ 
+  
unnest(anyarray)
   
  
*** ./src/backend/utils/adt/arrayfuncs.c.orig	2010-09-29 10:40:40.0 +0200
--- ./src/backend/utils/adt/arrayfuncs.c	2010-09-30 20:18:50.339024900 +0200
***
*** 17,22 
--- 17,23 
  #include 
  
  #include "funcapi.h"
+ #include "catalog/pg_type.h"
  #include "libpq/pqformat.h"
  #include "parser/parse_coerce.h"
  #include "utils/array.h"
***
*** 4337,4342 
--- 4338,4401 
  	PG_RETURN_ARRAYTYPE_P(result);
  }
  
+ /*
+  * subscripts(array anyarray, dim int [, reverse bool])
+  *		Returns all subscripts of the array for any dimension as array
+  */
+ Datum
+ subscripts(PG_FUNCTION_ARGS)
+ {
+ 	ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
+ 	int			reqdim = PG_GETARG_INT32(1);
+ 	int		   *lb,
+ 			   *dimv;
+ 	int	lower;
+ 	int	upper;
+ 	bool		reverse;
+ 	ArrayBuildState *astate = NULL;
+ 
+ 	/* Sanity check: does it look like an array at all? */
+ 	if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
+ 		PG_RETURN_NULL();
+ 
+ 	/* Sanity check: was the requested dim valid */
+ 	if (reqdim <= 0 || reqdim > ARR_NDIM(v))
+ 		PG_RETURN_NULL();
+ 
+ 	lb = ARR_LBOUND(v);
+ 	dimv = ARR_DIMS(v);
+ 
+ 	lower = lb[reqdim - 1];
+ 	upper = dimv[reqdim - 1] + lb[reqdim - 1] - 1;
+ 	reverse = (PG_NARGS() < 3) ? false : PG_GETARG_BOOL(2);
+ 
+ 	while (lower <= upper)
+ 	{
+ 		Datum	value;
+ 		
+ 		if (!reverse)
+ 			value = Int32GetDatum(lower++);
+ 		else
+ 			value = Int32GetDatum(upper--);
+ 	
+ 		astate = accumArrayResult(astate,
+ 		value, false,
+ 		INT4OID, CurrentMemoryContext);
+ 	}
+ 
+ 	PG_RETURN_DATUM(makeArrayResult(astate, CurrentMemoryContext));
+ }
+ 
+ /*
+  * subscripts_nodir
+  *		Implements the 2-argument version of subscripts
+  */
+ Datum
+ subscripts_nodir(PG_FUNCTION_ARGS)
+ {
+ 	/* just call the other one -- it can handle both cases */
+ 	return subscripts(fcinfo);
+ }
  
  typedef struct generate_subscripts_fctx
  {
*** ./src/include/catalog/pg_proc.h.orig	2010-09-29 10:40:40.0 +0200
--- ./src/include/catalog/pg_proc.h	2010-09-30 21:01:31.043900248 +0200
***
*** 1043,1048 
--- 1043,1052 
  DESCR("array subscripts generator");
  DATA(insert OID = 1192 (  generate_subscripts PGNSP PGUID 12 1 1000 0 f f f t t i 2 0 23 "2277 23" _null_ _null_ _null_ _null_ generate_subscripts_nodir _null_ _null_ _null_ ));
  DESCR("array subscripts generator");
+ DATA(insert OID = 3811 (  subscripts PGNSP PGUID 12 1 0 0 f f f t f i 3 0 1007 "2277 23 16" _null_ _null_ _null_ _null_ subscripts _null_ _null_ _null_ ));
+ DESCR("array subscripts generator");
+ DATA(insert OID = 3812 (  subscripts PGNSP PGUID 12 1 0 0 f f f t f i 2 0 1007 "2277 23" _null_ _null_ _null_ _null_ subscripts_nodir _null_ _null_ _null_ ));
+ DESCR("array subscripts generator");
  DATA(insert OID = 1193 (  array_fill PGNSP PGUID 12 1 0 0 f f f f f i 2 0 2277 "2283 1007" _null_ _null_ _null_ _null_ array_fill _null_ _null_ _null_ ));
  DESCR("array constructor with value");
  DATA(insert OID = 1286 (  array_fill PGNSP PGUID 12 1 0 0 f f f f f i 3 0 2277 "2283 1007 1007" _null_ _null_ _null_ _null_ array_fill_with_lower_bounds _null_ _null_ _null_ ));
*** ./src/include/utils/array.h.orig	2010-09-29 10:40:40.0 +0200
--- ./src/include/utils/array.h	2010-09-30 19:53:29.215024927 +0200
***
*** 202,207 
--- 202,209 
  extern Datum array_length(PG_FUNCTION_ARGS);
  extern Datum array_larger(PG_FUNCTION_ARGS);
  extern Datum array_smaller(PG_FUNCTION_ARGS);
+ extern Datum subscripts(PG_FUNCTION_ARGS);
+ extern Datum subscripts_nodir(PG_FUNCTION_ARGS);
  extern Datum generate_subscripts(PG_FUNCTION_ARGS);
  extern Datum generate_subscripts_nodir(PG_FUNCTION_ARGS);
  extern Datum array_fill(PG_FUNCTION_ARGS);
*** ./src/test/regress/expected/arrays.out.orig	2010-09-29 10:40:40.0 +0200
--- ./src/test/regress/expected/arrays.out	2010-09-30 21:06:15.0 +0200
***
*** 1286,1288 
--- 1286,1313 
   [5:5]={"(42,43)"}
  (1 row)
  
+ -- subscripts tests
+ select subscripts(array[1,3,4,10],1);
+  subscripts 
+ 
+  {1,2,3,4}
+ (1 row)
+ 
+ select subscripts(array[1,3,4,10],2);
+  subscripts 
+ 
+  
+ (1

[HACKERS] O_DSYNC broken on MacOS X?

2010-09-30 Thread Robert Haas
Jayant Kumar did some benchmarking of InnoDB vs. PostgreSQL and PG
came out 5 times faster.  The benchmark isn't very thoroughly
described, but it turns out not to matter.

http://jayant7k.blogspot.com/2010/09/database-speed-tests-mysql-and.html

Apparently, the reason we're faster is that wal_sync_method =
open_datasync, which is the default on MacOS X, doesn't actually work.

[rhaas pgbench]$ pgbench -t 10 -j 4 -c 4 pgbench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 25
query mode: simple
number of clients: 4
number of threads: 4
number of transactions per client: 10
number of transactions actually processed: 40/40
tps = 1292.258304 (including connections establishing)
tps = 1292.281493 (excluding connections establishing)

Clearly we're not getting 1292 (or even 1292/4) fsyncs per second out
of whatever HD is in my laptop: a 5400 RPM laptop drive only rotates 90
times per second, which caps genuine synchronous commits at roughly that
rate.  So what happens if we change to fsync_writethrough, which is the
equivalent of what InnoDB apparently does out of the box?

[rhaas pgsql]$ pg_ctl reload
server signaled
LOG:  received SIGHUP, reloading configuration files
LOG:  parameter "wal_sync_method" changed to "fsync_writethrough"
[rhaas pgbench]$ pgbench -t 10 -j 4 -c 4 pgbench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 25
query mode: simple
number of clients: 4
number of threads: 4
number of transactions per client: 10
number of transactions actually processed: 40/40
tps = 27.845797 (including connections establishing)
tps = 27.845809 (excluding connections establishing)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-09-30 Thread Tom Lane
Simon Riggs  writes:
> On Wed, 2010-09-29 at 08:44 -0400, Alvaro Herrera wrote:
>>> So, I think "twice disk space of the sum of table and indexes" would be
>>> the simplest explanation for safe margin.
>> 
>> Agreed.

> Surely the peak space is x3?
> Old space + sort space + new space.

The wording should be something like "CLUSTER requires transient disk
space equal to about twice the size of the table plus its indexes".
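
As a pre-flight check along those lines, a sketch (database and table
names are placeholders; the 2x figure is this thread's rule of thumb):

# Estimate CLUSTER's transient space: ~2x (table + indexes + toast).
psql -d mydb -c \
  "SELECT pg_size_pretty(2 * pg_total_relation_size('my_table'));"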

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security hook on table creation

2010-09-30 Thread Robert Haas
2010/9/29 KaiGai Kohei :
> In addition, I want to give these entrypoints its name which
> represents an appropriate purpose of the hook, rather than
> a uniformed one.

It sounds like you're proposing to create a vast number of hooks
rather than just one.  If we have ~20 object types in the system,
that's 40 hooks just for create and drop, and then many more to handle
comment, alter (perhaps in various flavors), etc.  I'm pretty
unexcited about that.  The main hook function can always dispatch
internally if it so desires, but I don't see any benefit to forcing
people to write the code that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Robert Haas
On Thu, Sep 30, 2010 at 12:52 PM, Simon Riggs  wrote:
> On Thu, 2010-09-30 at 07:06 -0700, David Fetter wrote:
>> On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
>> > David Fetter  writes:
>> > > On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
>> > >> I don't see anything has stalled.
>> >
>> > > I do.  We're half way through this commitfest, so if no one's
>> > > actually ready to commit one of the patches, I kinda have to
>> > > bounce them both, at least to the next CF.
>> >
>> > [ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
>> > stuff.  The CF end is still two weeks away.
>>
>> If people are still wrangling over the design, I'd say two weeks is
>> a ludicrously short time, not a long one.
>
> Yes, there is design work still to do.
>
> What purpose would be served by "bouncing" these patches?

None whatsoever, IMHO.  That having been said, I would like to see us
make some forward progress.  I'm open to your ideas expressed
up-thread, but I'm not sure whether they'll be sufficient to resolve
the problem.  Seems worth a try, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: I: [HACKERS] About "Our CLUSTER implementation is pessimal" patch

2010-09-30 Thread Simon Riggs
On Wed, 2010-09-29 at 08:44 -0400, Alvaro Herrera wrote:

> > So, I think "twice disk space of the sum of table and indexes" would be
> > the simplest explanation for safe margin.
> 
> Agreed.

Surely the peak space is x3?

Old space + sort space + new space.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Simon Riggs
On Thu, 2010-09-30 at 07:06 -0700, David Fetter wrote:
> On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
> > David Fetter  writes:
> > > On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
> > >> I don't see anything has stalled.
> > 
> > > I do.  We're half way through this commitfest, so if no one's
> > > actually ready to commit one of the patches, I kinda have to
> > > bounce them both, at least to the next CF.
> > 
> > [ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
> > stuff.  The CF end is still two weeks away.
> 
> If people are still wrangling over the design, I'd say two weeks is
> a ludicrously short time, not a long one.

Yes, there is design work still to do.

What purpose would be served by "bouncing" these patches?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Kevin Grittner
Aidan Van Dyk  wrote:
 
> When the "being written to" segment completes and moves to the final
> location, he'll get an extra whole "copy" of the file.  But if the
> "move" can be an exec of his script, the compressed/gzipped final
> result shouldn't be that bad.  Certainly no worse than what he's
> currently getting with archive_command ;-)  And he's got the
> uncompressed incremental updates as they are happening.
 
Hmmm...  As long as streaming replication doesn't send the "tail" of
an incomplete WAL segment file, the only thing we'd be missing on
the send to the central location is the compression.  That
typically reduces the size of the transmission by 50% to 75% (e.g.,
the gzipped "full" files are usually in the range of 4MB to 8MB). 
At our WAN speeds, that is significant.  I don't suppose that
streaming replication uses (or offers as an option) a compressed
stream?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Kevin Grittner
Magnus Hagander  wrote:
 
> We'd need a second script/command to call to figure out where to
> restart from in that case, no?
 
I see your point; I guess we would need that.
 
> It should be safe to just rsync the archive directory as it's
> being written by pg_streamrecv. Doesn't that give you the property
> you're looking for - local machine gets data streamed in live,
> remote machine gets it rsynced every minute?
 
Well, the local target can't run pg_streamrecv -- it's a backup
machine where we pretty much have rsync and nothing else.  We could
run pg_streamrecv on the database server itself and rsync to the
local machine every minute.
 
I just checked with the DBA who monitors space issues for such
things, and it would be OK to rsync the uncompressed file to the
local backup as it is written (we have enough space for it without
compression) as long as we compress it before sending it to the
central location.  For that, your idea to fire a script on
completion of the file would work -- we could maintain both raw and
compressed files on the database server for rsync to the two
locations.
 
You can probably see the appeal of filtering it as it is written,
though, if that is feasible.  :-)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Magnus Hagander
On Thu, Sep 30, 2010 at 16:39, Aidan Van Dyk  wrote:
> On Thu, Sep 30, 2010 at 10:24 AM, Magnus Hagander  wrote:
>
>>> That would allow some nice options.  I've been thinking what would
>>> be the ideal use of this with our backup scheme, and the best I've
>>> thought up would be that each WAL segment file would be a single
>>> output stream, with the option of calling an executable (which could
>>> be a script) with the target file name and then piping the stream to
>>> it.  At 16MB or a forced xlog switch, it would close the stream and
>>> call the executable again with a new file name.  You could have a
>>> default executable for the default behavior, or just build in a
>>> default if no executable is specified.
>>
>> The problem with that one (which I'm sure is solvable somehow) is how
>> to deal with restarts. Both restarts in the middle of a segment
>> (happens all the time if you don't have an archive_timeout set), and
>> really also restarts between segments. How would the tool know where
>> to begin streaming again? Right now, it looks at the files - but doing
>> it by your suggestion there are no files to look at. We'd need a
>> second script/command to call to figure out where to restart from in
>> that case, no?
>
> And then think of the future, when sync rep is in... I'm hoping to be
> able to use something like this to do synchrous replication to my
> archive (instead of to a live server).

Right, that could be a future enhancement. Doesn't mean we shouldn't
still do our best with the async mode of course :P


>> It should be safe to just rsync the archive directory as it's being
>> written by pg_streamrecv. Doesn't that give you the property you're
>> looking for - local machine gets data streamed in live, remote machine
>> gets it rsynced every minute?
>
> When the "being written to" segment completes and moves to the final
> location, he'll get an extra whole "copy" of the file.  But if the

Ah, good point.

> "move" can be an exec of his scritpt, the compressed/gzipped final
> result shouldn't be that bad.  Certainly no worse then what he's
> currently getting with archive command ;-)  And he's got the
> uncompressed incimental updates as they are happening.

Yeah, it would be trivial to replace the rename() call with a call to
a script that gets to do whatever is suitable to the file. Actually,
it'd probably be better to rename() it *and* call the script, so that
we can continue properly if the script fails.
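
To make that concrete, a sketch of such a post-rename hook (the script
body, directories, and the gzip/rsync steps are assumptions about one
possible use, not part of pg_streamrecv):

#!/bin/sh
# Invoked with the final path of a completed WAL segment, after
# pg_streamrecv has already rename()d it safely into place; a failure
# here loses nothing, since the segment can be re-processed later.
seg="$1"
gzip -c "$seg" > "/backup/wal-gz/$(basename "$seg").gz"
rsync -a /backup/wal-gz/ central:/wal-archive/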

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Kevin Grittner
Heikki Linnakangas  wrote:
 
> You do realize that to be able to guarantee zero data loss, the
> master will have to stop committing new transactions if the
> streaming stops for any reason, like a network glitch. Maybe
> that's a tradeoff you want, but I'm asking because that point
> isn't clear to many people.
 
Yeah, I get that.  I do think the quorum approach or some simplified
special case of it would be important for us -- possibly even a
requirement -- for that reason.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Yeb Havinga

Heikki Linnakangas wrote:

On 30.09.2010 17:09, Kevin Grittner wrote:

Aidan Van Dyk  wrote:
Heikki Linnakangas  wrote:


I'm sure there's several things you can accomplish with
synchronous replication, perhaps you could describe what the
important use case for you is?



I'm looking for "data durability", not "server query-ability"


Same here.  If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database.  We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.


You do realize that to be able to guarantee zero data loss, the master 
will have to stop committing new transactions if the streaming stops 
for any reason, like a network glitch. Maybe that's a tradeoff you 
want, but I'm asking because that point isn't clear to many people.
If there's a network glitch, it'd probably affect networked client 
connections as well, so it would mean no extra degradation of service.


-- Yeb


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Aidan Van Dyk
On Thu, Sep 30, 2010 at 10:24 AM, Magnus Hagander  wrote:

>> That would allow some nice options.  I've been thinking what would
>> be the ideal use of this with our backup scheme, and the best I've
>> thought up would be that each WAL segment file would be a single
>> output stream, with the option of calling an executable (which could
>> be a script) with the target file name and then piping the stream to
>> it.  At 16MB or a forced xlog switch, it would close the stream and
>> call the executable again with a new file name.  You could have a
>> default executable for the default behavior, or just build in a
>> default if no executable is specified.
>
> The problem with that one (which I'm sure is solvable somehow) is how
> to deal with restarts. Both restarts in the middle of a segment
> (happens all the time if you don't have an archive_timeout set), and
> really also restarts between segments. How would the tool know where
> to begin streaming again? Right now, it looks at the files - but doing
> it by your suggestion there are no files to look at. We'd need a
> second script/command to call to figure out where to restart from in
> that case, no?

And then think of the future, when sync rep is in... I'm hoping to be
able to use something like this to do synchrous replication to my
archive (instead of to a live server).

> It should be safe to just rsync the archive directory as it's being
> written by pg_streamrecv. Doesn't that give you the property you're
> looking for - local machine gets data streamed in live, remote machine
> gets it rsynced every minute?

When the "being written to" segmnt copmletes moves to the final
location, he'll get an extra whole "copy" of the file.  But of the
"move" can be an exec of his scritpt, the compressed/gzipped final
result shouldn't be that bad.  Certainly no worse then what he's
currently getting with archive command ;-)  And he's got the
uncompressed incimental updates as they are happening.

a.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Standby registration

2010-09-30 Thread Heikki Linnakangas

On 29.09.2010 11:46, Fujii Masao wrote:

Aside from standby registration itself, I have another thought for C). Keeping
many WAL files in pg_xlog of the master is not a good design in the first
place. I cannot believe that pg_xlog on most systems has enough capacity to
store many WAL files for the standby.

Usually the place where many WAL files can be stored is the archive. So I've
been thinking of making walsender send archived WAL files to the standby.
That is, when a WAL file required by the standby is not found in pg_xlog,
walsender restores it from the archive by executing the restore_command that
the user specified. Then walsender reads the WAL file and sends it.

Currently, if pg_xlog is not large enough on your system, you have to struggle
with setting up a warm-standby environment on top of streaming replication, to
prevent the WAL files still required by the standby from being deleted before
shipping. Many people would be disappointed by that fact.

The archived-log-shipping approach eliminates the need for the warm-standby
setup and wal_keep_segments, which would make streaming replication easier
to use. Thoughts?


The standby can already use restore_command to fetch WAL files from the 
archive. I don't see why the master should be involved in that.
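
For reference, a sketch of that standby-side setup (paths and conninfo
are assumptions; recovery.conf parameters as of 9.0):

# Stream from the master, and fall back to the shared archive for any
# segment that has already been recycled out of the master's pg_xlog.
cat > "$PGDATA/recovery.conf" <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=master port=5432'
restore_command = 'cp /mnt/archive/%f "%p"'
EOF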


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Magnus Hagander
On Thu, Sep 30, 2010 at 15:45, Kevin Grittner
 wrote:
> Magnus Hagander  wrote:
>
>>> If you could keep the development "friendly" to such features, I
>>> may get around to adding them to support our needs
>>
>> Would it be enough to have kind of an "archive_command" switch
>> that says "whenever you've finished a complete wal segment, run
>> this command on it"?
>
> That would allow some nice options.  I've been thinking what would
> be the ideal use of this with our backup scheme, and the best I've
> thought up would be that each WAL segment file would be a single
> output stream, with the option of calling an executable (which could
> be a script) with the target file name and then piping the stream to
> it.  At 16MB or a forced xlog switch, it would close the stream and
> call the executable again with a new file name.  You could have a
> default executable for the default behavior, or just build in a
> default if no executable is specified.

The problem with that one (which I'm sure is solvable somehow) is how
to deal with restarts. Both restarts in the middle of a segment
(happens all the time if you don't have an archive_timeout set), and
really also restarts between segments. How would the tool know where
to begin streaming again? Right now, it looks at the files - but doing
it by your suggestion there are no files to look at. We'd need a
second script/command to call to figure out where to restart from in
that case, no?


> The reason I like this is that I could pipe the stream through
> pg_clearxlogtail and gzip pretty much "as is" to the locations on
> the database server currently used for rsync to the two targets, and
> the rsync commands would send the incremental changes once per
> minute to both targets.  I haven't thought of another solution which
> provides incremental transmission of the WAL to the local backup
> location, which would be a nice thing to have, since this is most
> crucial when the WAN is down and not only is WAL data not coming
> back to our central location, but our application-framework-based
> replication stream isn't making it back, either.

It should be safe to just rsync the archive directory as it's being
written by pg_streamrecv. Doesn't that give you the property you're
looking for - local machine gets data streamed in live, remote machine
gets it rsynced every minute?
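
Concretely, a minutely cron job would be enough (host and paths are
made up):

# Mirror pg_streamrecv's output directory, partial segment included.
rsync -a --partial /var/lib/pgsql/streamdir/ backuphost:/wal-mirror/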


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Heikki Linnakangas

On 30.09.2010 17:09, Kevin Grittner wrote:

Aidan Van Dyk  wrote:
Heikki Linnakangas  wrote:


I'm sure there's several things you can accomplish with
synchronous replication, perhaps you could describe what the
important use case for you is?



I'm looking for "data durability", not "server query-ability"


Same here.  If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database.  We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.


You do realize that to be able to guarantee zero data loss, the master 
will have to stop committing new transactions if the streaming stops for 
any reason, like a network glitch. Maybe that's a tradeoff you want, but 
I'm asking because that point isn't clear to many people.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Kevin Grittner
Aidan Van Dyk  wrote:
Heikki Linnakangas  wrote:
 
>> I'm sure there's several things you can accomplish with
>> synchronous replication, perhaps you could describe what the
>> important use case for you is?
 
> I'm looking for "data durability", not "server query-ability"
 
Same here.  If we used synchronous replication, the important thing
for us would be to hold up the master for the minimum time required
to ensure remote persistence -- not actual application to the remote
database.  We could tolerate some WAL replay time on recovery better
than poor commit performance on the master.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread David Fetter
On Thu, Sep 30, 2010 at 09:52:46AM -0400, Tom Lane wrote:
> David Fetter  writes:
> > On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
> >> I don't see anything has stalled.
> 
> > I do.  We're half way through this commitfest, so if no one's
> > actually ready to commit one of the patches, I kinda have to
> > bounce them both, at least to the next CF.
> 
> [ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
> stuff.  The CF end is still two weeks away.

If people are still wrangling over the design, I'd say two weeks is
a ludicrously short time, not a long one.

> But while I'm thinking about that...
> 
> The actual facts on the ground are that practically no CF work has
> gotten done yet (at least not in my house)

Your non-involvement in the first half or more--I'd say maybe 3 weeks
or so--is precisely what commitfests are for.  The point is that
people who are *not* committers need to do a bunch of QA on patches,
review them, get or create new patches as needed.  Only then should a
committer get involved.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Aidan Van Dyk
On Thu, Sep 30, 2010 at 2:09 AM, Heikki Linnakangas
 wrote:

> Agreed. Actually, given the lack of people jumping in and telling us what
> they'd like to do with the feature, maybe it's not that important after all.

>> The basic features that I mean is for most basic use case, that is, one
>> master and one synchronous standby case. In detail,
>
> ISTM the problem is exactly that there is no consensus on what the basic use
> case is. I'm sure there's several things you can accomplish with synchronous
> replication, perhaps you could describe what the important use case for you
> is?

OK, so I'll throw in my ideal use case.  I'm starting to play with
Magnus's "streaming -> archive".

*that's* what I want, with synchronous.  Yes, again, I'm looking for
"data durability", not "server query-ability", and I'ld like to rely
on the PG user-space side of things instead of praying that replicated
block-devices hold together

If my master flips out, I'm quite happy to do a normal archive
restore.  Except I don't want that last 16MB (or archive timeout) of
transactions lost.  The streaming -> archive in its current state
gets me pretty close, but I'd love to be able to guarantee that my
recovery from that archive has *every* transaction that the master
committed...

a.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Tom Lane
David Fetter  writes:
> On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
>> I don't see anything has stalled.

> I do.  We're half way through this commitfest, so if no one's actually
> ready to commit one of the patches, I kinda have to bounce them both,
> at least to the next CF.

[ raised eyebrow ]  You seem to be in an awfully big hurry to bounce
stuff.  The CF end is still two weeks away.

But while I'm thinking about that...

The actual facts on the ground are that practically no CF work has
gotten done yet (at least not in my house) due to the git move and the
9.0.0 release and the upcoming back-branch releases.  Maybe we shouldn't
have started the CF while all that was going on, but that's water over
the dam now.  What we can do is rethink the scheduled end date.  IMHO
we should push out the end date by at least a week to reflect the lack
of time spent on the CF so far.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Kevin Grittner
Magnus Hagander  wrote:
 
>> If you could keep the development "friendly" to such features, I
>> may get around to adding them to support our needs
> 
> Would it be enough to have kind of an "archive_command" switch
> that says "whenever you've finished a complete wal segment, run
> this command on it"?
 
That would allow some nice options.  I've been thinking what would
be the ideal use of this with our backup scheme, and the best I've
thought up would be that each WAL segment file would be a single
output stream, with the option of calling an executable (which could
be a script) with the target file name and then piping the stream to
it.  At 16MB or a forced xlog switch, it would close the stream and
call the executable again with a new file name.  You could have a
default executable for the default behavior, or just build in a
default if no executable is specified.
 
The reason I like this is that I could pipe the stream through
pg_clearxlogtail and gzip pretty much "as is" to the locations on
the database server currently used for rsync to the two targets, and
the rsync commands would send the incremental changes once per
minute to both targets.  I haven't thought of another solution which
provides incremental transmission of the WAL to the local backup
location, which would be a nice thing to have, since this is most
crucial when the WAN is down and not only is WAL data not coming
back to our central location, but our application-framework-based
replication stream isn't making it back, either.
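 
A sketch of that per-segment filter (pg_clearxlogtail's
stdin-to-stdout interface is assumed, and the paths are invented):
 
#!/bin/sh
# Hypothetical executable: handed the target file name as $1, with the
# 16MB segment arriving on stdin; zero the unused tail, then compress.
name="$1"
pg_clearxlogtail | gzip > "/backup/wal/$name.gz"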
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread David Fetter
On Thu, Sep 30, 2010 at 09:14:42AM +0100, Simon Riggs wrote:
> On Thu, 2010-09-30 at 09:09 +0300, Heikki Linnakangas wrote:
> > On 29.09.2010 10:56, Fujii Masao wrote:
> > > On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas  
> > > wrote:
> 
> > >> This feature is important, and we need to get it done.  How do
> > >> we get the ball rolling again?
> > 
> > Agreed. Actually, given the lack of people jumping in and telling
> > us what they'd like to do with the feature, maybe it's not that
> > important after all.
> 
> I don't see anything has stalled.

I do.  We're half way through this commitfest, so if no one's actually
ready to commit one of the patches, I kinda have to bounce them both,
at least to the next CF.

The very likely outcome of that, given that it's a pretty enormous
feature that involves even more enormous amounts of testing on various
hardware, networks, etc., is that we don't get SR in 9.1, and you
among others will be very unhappy.

So yes, it is stalled, and yes, there's a real urgency to actually
getting a baseline something in there in the next couple of weeks.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] final patch - plpgsql: for-in-array

2010-09-30 Thread Pavel Stehule
Hello

this patch implements a new iteration construct - iteration over an
array. The point of this new construct is:
  * a simpler and cleaner syntax
  * faster execution - it brings down the number of detoast operations

create or replace function subscripts(anyarray, int)
returns int[] as $$
select array(select generate_subscripts($1,$2));
$$ language sql;

create or replace function fora_test()
returns int as $$
declare x int; s int = 0;
   a int[] := array[1,2,3,4,5,6,7,8,9,10];
begin
  for x in array subscripts(a, 1)
  loop
s := s + a[x];
  end loop;
  return s;
end;
$$ language plpgsql;

create or replace function fora_test()
returns int as $$
declare x int; s int = 0;
begin
  for x in array array[1,2,3,4,5,6,7,8,9,10]
  loop
s := s + x;
  end loop;
  return s;
end;
$$ language plpgsql;

create or replace function fora_test()
returns int as $$
declare x int; y int;
   a fora_point[] := array[(1,2),(3,4),(5,6)];
begin
  for x, y in array a
  loop
raise notice 'point=%,%', x, y;
  end loop;
  return 0;
end;
$$ language plpgsql;
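
A usage sketch (assumes the patch is applied; the third variant also
needs a composite type, e.g. "create type fora_point as (x int, y int)",
which this mail does not define):

psql -c "select fora_test()"   # second definition: sums 1..10, returns 55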

Regards

Pavel Stehule


for-in-array
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: starting to review the Extend NOT NULL representation to pg_constraint patch

2010-09-30 Thread Andrew Geery
OK -- I've updated the commitfest page, linking in this review and
changing the status to waiting on a new patch from you.

Thanks
Andrew

On Thu, Sep 30, 2010 at 4:12 AM, Bernd Helmle  wrote:
>
>
> --On 29. September 2010 23:05:11 -0400 Andrew Geery 
> wrote:
>
>> Reference: https://commitfest.postgresql.org/action/patch_view?id=312
>>
>> The patch from
>> http://archives.postgresql.org/message-id/ca2e4c4762eae28d68404...@amenop
>> his does not apply cleanly to the current git master:
>
> Yeah, there were some changes in the meantime on master which generate
> some merge failures...will provide a new version along with other fixes
> soon. Are you going to update the commitfest page?
>
> Thanks
>       Bernd
>
>
>
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: tsearch dictionary initialization hook

2010-09-30 Thread Pavel Stehule
Hello

2010/9/28 Robert Haas :
> On Tue, Sep 7, 2010 at 12:55 PM, Pavel Stehule  
> wrote:
>> 2010/9/7 Teodor Sigaev :
>>> Hm, what is aim of this hook? It looks like a wrapper of dictionary init
>>> method.
>>
>> If I use a mmap for shared dictionary, then I have to prealloc and
>> maybe preread dictionary - it can be done in external module. But I
>> have to join preloaded dictionary to requested dictionary. This hook
>> allows this relation - and it's general - I don't need any special
>> support in ispell dictionary.
>
> Review:
>
> 1. Is this really necessary?  It seems that you're inserting a hook
> here when you could just as well change tmplinit to point to whatever
> function you want to call, which wouldn't require a code change.

Creating a new hacked template is a second option - I didn't think of
it. It is good for hacking and I'll probably use it, because I don't
have time to work on this problem. On the other hand, it is more or
less a bit of a dark hack - you have to modify system tables. Using a
hook is more transparent - you can simply load, or not load, a module
that uses the hook.

>
> 2. Our standard criteria for the inclusion of a hook is some sample
> code that demonstrates how it can be usefully used.  I think you need
> to provide that before we can consider this further.
>

Yes - I understand, but I don't have time to work on this subject now,
so it can be moved to the rejected patches queue. Some of my ideas
depend on the shared memory control proposed (by different people), but
I don't see any movement there, so there is no reason to implement the
hook, or modules that use it, for now.

Regards

Pavel Stehule

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using streaming replication as log archiving

2010-09-30 Thread Magnus Hagander
On Wed, Sep 29, 2010 at 23:45, Kevin Grittner
 wrote:
> Magnus Hagander  wrote:
>
>> Comments and contributions are most welcome.
>
> This is probably too esoteric to be worked on yet, but for this to
> be useful for us we would need to pass the resulting files through
> pg_clearxlogtail and gzip in an automated fashion.  And we would
> need to do regular log file archiving in parallel with it.
>
> As background, our databases around the state archive to a directory
> which is then pushed via rsync to a "dumb" backup location in the
> same room as the database server (we're lucky to have rsync on the
> target of this copy; any other executable is out of the question),
> and the same directory is pulled via rsync to a central location.
> We would be interested in using streaming replication to a tool such
> as you describe for the copy to the central location, but since we
> would still be forcing a wal-file switch once per hour we would need
> the current capability to shrink an "empty" file from 16MB to 16kB
> using the above-mentioned tools.

You could just have one stream going local and one stream going to the
other location in parallel, though?

Or use the stream to the local directory and rsync that off? While I
haven't tested it, rsyncing the partial WAL files *should* be fine, I
think...


> Also, the ability to limit bandwidth would be a nice feature for
> us, preferably in a way which could be changed on the fly.
>
> If you could keep the development "friendly" to such features, I may
> get around to adding them to support our needs

Would it be enough to have kind of an "archive_command" switch that
says "whenever you've finished a complete wal segment, run this
command on it"? Then that command could clear the tail, compress, and
send off?

And in that case, should it run inline or in the background with the
streaming? I would assume just fork it off and leave it to its own
business would be best?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wip: for-in-array patch

2010-09-30 Thread Pavel Stehule
sorry

fixed patch

Pavel

2010/9/30 Pavel Stehule :
> Hello
>
> This patch adds iteration over an array to plpgsql
>
> For now, only iteration over an array of scalars is supported
>
> Regards
>
> Pavel Stehule
>
*** ./gram.y.orig	2010-09-29 10:53:44.663270537 +0200
--- ./gram.y	2010-09-30 10:46:41.976025132 +0200
***
*** 239,244 
--- 239,245 
  %token 	K_ABSOLUTE
  %token 	K_ALIAS
  %token 	K_ALL
+ %token 	K_ARRAY
  %token 	K_BACKWARD
  %token 	K_BEGIN
  %token 	K_BY
***
*** 1051,1056 
--- 1052,1066 
  			new->body	  = $4.stmts;
  			$$ = (PLpgSQL_stmt *) new;
  		}
+ 		else if ($3->cmd_type == PLPGSQL_STMT_FORA)
+ 		{
+ 			PLpgSQL_stmt_fora		*new;
+ 			new = (PLpgSQL_stmt_fora *) $3;
+ 			new->lineno   = plpgsql_location_to_lineno(@2);
+ 			new->label	  = $1;
+ 			new->body	  = $4.stmts;
+ 			$$ = (PLpgSQL_stmt *) new;
+ 		}
  		else
  		{
  			PLpgSQL_stmt_forq		*new;
***
*** 1077,1083 
  		int			tok = yylex();
  		int			tokloc = yylloc;
  
! 		if (tok == K_EXECUTE)
  		{
  			/* EXECUTE means it's a dynamic FOR loop */
  			PLpgSQL_stmt_dynfors	*new;
--- 1087,1130 
  		int			tok = yylex();
  		int			tokloc = yylloc;
  
! 		if (tok == K_ARRAY)
! 		{
! 			PLpgSQL_stmt_fora	*new;
! 			PLpgSQL_expr		*expr;
! 
! 			new = palloc0(sizeof(PLpgSQL_stmt_fora));
! 			new->cmd_type = PLPGSQL_STMT_FORA;
! 
! 			expr = read_sql_expression(K_LOOP, "LOOP");
! 
! 			if ($1.rec)
! 			{
! new->rec = $1.rec;
! check_assignable((PLpgSQL_datum *) new->rec, @1);
! 			}
! 			else if ($1.row)
! 			{
! new->row = $1.row;
! check_assignable((PLpgSQL_datum *) new->row, @1);
! 			}
! 			else if ($1.scalar)
! 			{
! /* convert single scalar to list */
! new->var = (PLpgSQL_var *) $1.scalar;
! check_assignable((PLpgSQL_datum *) new->var, @1);
! 			}
! 			else
! 			{
! ereport(ERROR,
! 		(errcode(ERRCODE_DATATYPE_MISMATCH),
! 		 errmsg("loop variable of loop over rows must be a record or row variable or list of scalar variables"),
! 		 parser_errposition(@1)));
! 			}
! 
! 			new->expr = expr;
! 			$$ = (PLpgSQL_stmt *) new;
! 		}
! 		else if (tok == K_EXECUTE)
  		{
  			/* EXECUTE means it's a dynamic FOR loop */
  			PLpgSQL_stmt_dynfors	*new;
*** ./pl_exec.c.orig	2010-09-29 11:22:32.435395512 +0200
--- ./pl_exec.c	2010-09-30 10:47:37.199899936 +0200
***
*** 8,14 
   *
   *
   * IDENTIFICATION
!  *	  src/pl/plpgsql/src/pl_exec.c
   *
   *-
   */
--- 8,14 
   *
   *
   * IDENTIFICATION
!  *	  $PostgreSQL: pgsql/src/pl/plpgsql/src/pl_exec.c,v 1.266 2010/08/19 18:10:48 tgl Exp $
   *
   *-
   */
***
*** 107,112 
--- 107,114 
  			   PLpgSQL_stmt_fors *stmt);
  static int exec_stmt_forc(PLpgSQL_execstate *estate,
  			   PLpgSQL_stmt_forc *stmt);
+ static int exec_stmt_fora(PLpgSQL_execstate *estate,
+ 			   PLpgSQL_stmt_fora *stmt);
  static int exec_stmt_open(PLpgSQL_execstate *estate,
  			   PLpgSQL_stmt_open *stmt);
  static int exec_stmt_fetch(PLpgSQL_execstate *estate,
***
*** 1309,1314 
--- 1311,1320 
  			rc = exec_stmt_forc(estate, (PLpgSQL_stmt_forc *) stmt);
  			break;
  
+ 		case PLPGSQL_STMT_FORA:
+ 			rc = exec_stmt_fora(estate, (PLpgSQL_stmt_fora *) stmt);
+ 			break;
+ 
  		case PLPGSQL_STMT_EXIT:
  			rc = exec_stmt_exit(estate, (PLpgSQL_stmt_exit *) stmt);
  			break;
***
*** 1860,1893 
  	return rc;
  }
  
  
  /* --
!  * exec_stmt_fors			Execute a query, assign each
!  *	tuple to a record or row and
!  *	execute a group of statements
!  *	for it.
   * --
   */
  static int
! exec_stmt_fors(PLpgSQL_execstate *estate, PLpgSQL_stmt_fors *stmt)
  {
! 	Portal		portal;
! 	int			rc;
  
! 	/*
! 	 * Open the implicit cursor for the statement using exec_run_select
! 	 */
! 	exec_run_select(estate, stmt->query, 0, &portal);
  
  	/*
! 	 * Execute the loop
  	 */
! 	rc = exec_for_query(estate, (PLpgSQL_stmt_forq *) stmt, portal, true);
  
  	/*
! 	 * Close the implicit cursor
  	 */
! 	SPI_cursor_close(portal);
  
  	return rc;
  }
--- 1866,2072 
  	return rc;
  }
  
+ /*
+  * Check whether a specific array element is NULL
+  *
+  * nullbitmap: pointer to array's null bitmap (NULL if none)
+  * offset: 0-based linear element number of array element
+  */
+ static bool
+ array_get_isnull(const bits8 *nullbitmap, int offset)
+ {
+ 	if (nullbitmap == NULL)
+ 		return false;			/* assume not null */
+ 	if (nullbitmap[offset / 8] & (1 << (offset % 8)))
+ 		return false;			/* not null */
+ 	return true;
+ }
  
  /* --
!  * exec_stmt_fora			Execute a loop

[HACKERS] Re: starting to review the Extend NOT NULL representation to pg_constraint patch

2010-09-30 Thread Bernd Helmle



--On 29. September 2010 23:05:11 -0400 Andrew Geery 
 wrote:



Reference: https://commitfest.postgresql.org/action/patch_view?id=312

The patch from
http://archives.postgresql.org/message-id/ca2e4c4762eae28d68404...@amenop
his does not apply cleanly to the current git master:


Yeah, there were some changes in the meantime on master which generate 
some merge failures...will provide a new version along with other fixes 
soon. Are you going to update the commitfest page?


Thanks
   Bernd




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] is sync rep stalled?

2010-09-30 Thread Simon Riggs
On Thu, 2010-09-30 at 09:09 +0300, Heikki Linnakangas wrote:
> On 29.09.2010 10:56, Fujii Masao wrote:
> > On Wed, Sep 29, 2010 at 11:47 AM, Robert Haas  wrote:

> >> This feature is important, and we need to get it done.  How do we get
> >> the ball rolling again?
> 
> Agreed. Actually, given the lack of people jumping in and telling us 
> what they'd like to do with the feature, maybe it's not that important 
> after all.

I don't see anything has stalled. I've been busy for a few days, so
haven't had a chance to follow up on the use cases, as suggested. I'm
busy again today, so cannot reply further. Anyway, taking a few days to
let us think some more about the technical comments is no bad thing.

I think we need to relax about this feature some more because trying to
get something actually done when basic issues need analysis is hard and
that creates tension. Between us we can work out the code in a few days,
once we know which code to write.

What we actually need to do is talk and listen. I'd like to suggest that
we have an online "focus day" (onlist) on Sync Rep on Oct 5, and maybe 6
as well. Meeting in person is possible, but probably impractical. But a
design sprint, not a code sprint. 

This is important and I'm sure we'll work something out. 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-09-30 Thread Magnus Hagander
On Thu, Sep 30, 2010 at 04:40, Itagaki Takahiro
 wrote:
> Hi, I have a question about the latest patch.
>
> On Sun, Aug 22, 2010 at 11:03 PM, Quan Zongliang
>  wrote:
>> New patch attached. How about this?
>>> I don't see us ever using anything other than auto or demand. The
>>> others aren't for "regular services"
>
> + set_starttype(char *starttypeopt)
> + {
> +       if (strcmp(starttypeopt, "a") == 0 || strcmp(starttypeopt, "auto") == 
> 0)
> +               pgctl_start_type = SERVICE_AUTO_START;
> +       else if (strcmp(starttypeopt, "d") == 0 || strcmp(starttypeopt,
> "demand") == 0)
> +               pgctl_start_type = SERVICE_DEMAND_START;
>
> It accepts only "a" and "auto" for auto, but "au" or "aut" are rejected.
> Is this intended behavior?  I think we can use prefix matching here
> because we use that logic in some places. For example,

I think it's modeled on what we allow for the "-m" parameter, and ISTM
it's good to keep those consistent.
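
For comparison, the -m precedent accepts the full word or just its
first letter (a sketch; the start-type behavior is described per the
patch code above, not a committed interface):

pg_ctl stop -m fast
pg_ctl stop -m f       # same thing; "fa" is rejected
# By analogy, the patch takes "auto"/"a" and "demand"/"d",
# rejecting partial spellings like "au" or "aut".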

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] wip: for-in-array patch

2010-09-30 Thread Pavel Stehule
Hello

This patch adds iteration over an array to plpgsql

For now, only iteration over an array of scalars is supported

Regards

Pavel Stehule
*** ./gram.y.orig	2010-09-29 10:53:44.663270537 +0200
--- ./gram.y	2010-09-30 09:04:04.809900052 +0200
***
*** 239,244 
--- 239,245 
  %token 	K_ABSOLUTE
  %token 	K_ALIAS
  %token 	K_ALL
+ %token 	K_ARRAY
  %token 	K_BACKWARD
  %token 	K_BEGIN
  %token 	K_BY
***
*** 1057,1063 
  
  			Assert($3->cmd_type == PLPGSQL_STMT_FORS ||
     $3->cmd_type == PLPGSQL_STMT_FORC ||
!    $3->cmd_type == PLPGSQL_STMT_DYNFORS);
  			/* forq is the common supertype of all three */
  			new = (PLpgSQL_stmt_forq *) $3;
  			new->lineno   = plpgsql_location_to_lineno(@2);
--- 1058,1065 
  
  			Assert($3->cmd_type == PLPGSQL_STMT_FORS ||
     $3->cmd_type == PLPGSQL_STMT_FORC ||
!    $3->cmd_type == PLPGSQL_STMT_DYNFORS ||
!    $3->cmd_type == PLPGSQL_STMT_FORA);
  			/* forq is the common supertype of all three */
  			new = (PLpgSQL_stmt_forq *) $3;
  			new->lineno   = plpgsql_location_to_lineno(@2);
***
*** 1077,1083 
  		int			tok = yylex();
  		int			tokloc = yylloc;
  
! 		if (tok == K_EXECUTE)
  		{
  			/* EXECUTE means it's a dynamic FOR loop */
  			PLpgSQL_stmt_dynfors	*new;
--- 1079,1122 
  		int			tok = yylex();
  		int			tokloc = yylloc;
  
! 		if (tok == K_ARRAY)
! 		{
! 			PLpgSQL_stmt_fora	*new;
! 			PLpgSQL_expr		*expr;
! 
! 			new = palloc0(sizeof(PLpgSQL_stmt_fora));
! 			new->cmd_type = PLPGSQL_STMT_FORA;
! 
! 			expr = read_sql_expression(K_LOOP, "LOOP");
! 
! 			if ($1.rec)
! 			{
! new->rec = $1.rec;
! check_assignable((PLpgSQL_datum *) new->rec, @1);
! 			}
! 			else if ($1.row)
! 			{
! new->row = $1.row;
! check_assignable((PLpgSQL_datum *) new->row, @1);
! 			}
! 			else if ($1.scalar)
! 			{
! /* convert single scalar to list */
! new->var = (PLpgSQL_var *) $1.scalar;
! check_assignable((PLpgSQL_datum *) new->var, @1);
! 			}
! 			else
! 			{
! ereport(ERROR,
! 		(errcode(ERRCODE_DATATYPE_MISMATCH),
! 		 errmsg("loop variable of loop over rows must be a record or row variable or list of scalar variables"),
! 		 parser_errposition(@1)));
! 			}
! 
! 			new->expr = expr;
! 			$$ = (PLpgSQL_stmt *) new;
! 		}
! 		else if (tok == K_EXECUTE)
  		{
  			/* EXECUTE means it's a dynamic FOR loop */
  			PLpgSQL_stmt_dynfors	*new;
*** ./pl_exec.c.orig	2010-09-29 11:22:32.435395512 +0200
--- ./pl_exec.c	2010-09-30 09:53:53.310900849 +0200
***
*** 8,14 
   *
   *
   * IDENTIFICATION
!  *	  src/pl/plpgsql/src/pl_exec.c
   *
   *-
   */
--- 8,14 
   *
   *
   * IDENTIFICATION
!  *	  $PostgreSQL: pgsql/src/pl/plpgsql/src/pl_exec.c,v 1.266 2010/08/19 18:10:48 tgl Exp $
   *
   *-
   */
***
*** 107,112 
--- 107,114 
  			   PLpgSQL_stmt_fors *stmt);
  static int exec_stmt_forc(PLpgSQL_execstate *estate,
  			   PLpgSQL_stmt_forc *stmt);
+ static int exec_stmt_fora(PLpgSQL_execstate *estate,
+ 			   PLpgSQL_stmt_fora *stmt);
  static int exec_stmt_open(PLpgSQL_execstate *estate,
  			   PLpgSQL_stmt_open *stmt);
  static int exec_stmt_fetch(PLpgSQL_execstate *estate,
***
*** 1309,1314 
--- 1311,1320 
  			rc = exec_stmt_forc(estate, (PLpgSQL_stmt_forc *) stmt);
  			break;
  
+ 		case PLPGSQL_STMT_FORA:
+ 			rc = exec_stmt_fora(estate, (PLpgSQL_stmt_fora *) stmt);
+ 			break;
+ 
  		case PLPGSQL_STMT_EXIT:
  			rc = exec_stmt_exit(estate, (PLpgSQL_stmt_exit *) stmt);
  			break;
***
*** 1860,1893 
  	return rc;
  }
  
  
  /* --
!  * exec_stmt_fors			Execute a query, assign each
!  *	tuple to a record or row and
!  *	execute a group of statements
!  *	for it.
   * --
   */
  static int
! exec_stmt_fors(PLpgSQL_execstate *estate, PLpgSQL_stmt_fors *stmt)
  {
! 	Portal		portal;
! 	int			rc;
  
! 	/*
! 	 * Open the implicit cursor for the statement using exec_run_select
! 	 */
! 	exec_run_select(estate, stmt->query, 0, &portal);
  
  	/*
! 	 * Execute the loop
  	 */
! 	rc = exec_for_query(estate, (PLpgSQL_stmt_forq *) stmt, portal, true);
  
  	/*
! 	 * Close the implicit cursor
  	 */
! 	SPI_cursor_close(portal);
  
  	return rc;
  }
--- 1866,2072 
  	return rc;
  }
  
+ /*
+  * Check whether a specific array element is NULL
+  *
+  * nullbitmap: pointer to array's null bitmap (NULL if none)
+  * offset: 0-based linear element number of array element
+  */
+ static bool
+ array_get_isnull(const bits8 *nullbitmap, int offset)
+ {
+ 	if (nullbitmap == NULL)
+ 		return false;			/* assume not n

Re: [HACKERS] patch: SQL/MED(FDW) DDL

2010-09-30 Thread Shigeru HANADA
On Thu, 30 Sep 2010 09:26:54 +0300
Heikki Linnakangas  wrote:
> > FYI, HiRDB, that implements FDW routines, has CREATE FOREIGN INDEX.
> > I think it is a little ugly and won't work in some cases -- for example,
> > index organized tables -- but evidently it's a realistic solution.
> 
> A long time ago I used DB2's federated database feature, which is at 
> least close to SQL/MED if not fully compatible. When you create a 
> "federated index" there. it's just a planner hint to the local database, 
> so that it knows how expensive it is to evaluate a qual remotely vs. 
> locally. It shouldn't matter what technology the remote index uses in 
> that case, as long as the cost model is roughly the same as a b-tree.
> 
> I don't think we want to go down that path though, it's better to leave 
> the cost estimation altogether to the wrapper. It has much better 
> knowledge of how expensive various quals are.
> 
> However, the wrapper will likely need some local storage for indexes and 
> like to do the cost estimation. Or maybe it can just keep the 
> information in cache, loading it on first use from the remote database.
How about having cost hints in the generic options of the foreign table
or its columns?  Generic options are storage for wrappers, not for
PostgreSQL core modules.  Wrappers can use their own format to
represent various information, and use the hints to estimate the cost
of a path.

In addition, I think that the generic options of a server could be used
to store cost hints which depend on the particular server, such as
network transfer overhead for DBMS wrappers, or disk I/O for file
wrappers.
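
To illustrate, a hypothetical sketch of such hints (the option names
are invented; CREATE FOREIGN TABLE is the syntax from the SQL/MED patch
under discussion, not a released feature):

psql <<'EOF'
-- Only the wrapper interprets these options; core just stores them.
CREATE SERVER remote_pg FOREIGN DATA WRAPPER my_fdw
    OPTIONS (host 'db1', transfer_cost_per_row '0.05');
CREATE FOREIGN TABLE remote_orders (id int, total numeric)
    SERVER remote_pg
    OPTIONS (row_estimate '1000000', indexed_columns 'id');
EOF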

Regards,
--
Shigeru Hanada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Stalled post to pgsql-committers

2010-09-30 Thread Marc G. Fournier

On Wed, 29 Sep 2010, Alvaro Herrera wrote:


Excerpts from Peter Eisentraut's message of mié sep 29 04:08:35 -0400 2010:

On sön, 2010-09-26 at 17:11 +0200, Magnus Hagander wrote:



Yeah, that's what you need to do. I would guess you were previously
subscribed as pe...@postgresql.org, but the git commit script sends
the email from pete...@gmx.net, so you need to subscribe from that one
(with or without nomail).


No, that address was not subscribed to that list.  There must have been
some other mechanism at work.


Yes.  Marc had a "sublist" with the addresses of all committers, which
were accepted without moderation and without being subscribed.  See
restrict_post in the "moderate" section of the Mj2 settings page for
that list; it contains pgsql-committers:restricted.

It would be trivial to add the new list of committer addresses to that
list.  I don't know how that list is edited though; Marc would know.
I think either Magnus or Dave should have enough privilege to do the
edit itself.


it's a simple subscribe ... you just reference the sublist vs just the
list ...


if someone can send me a list, I can easily add them ... should the old 
list be eliminated first though ... ?




Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo:yscrappySkype: hub.orgICQ:7615664MSN:scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Stalled post to pgsql-committers

2010-09-30 Thread Alvaro Herrera
Excerpts from Peter Eisentraut's message of mié sep 29 04:08:35 -0400 2010:
> On sön, 2010-09-26 at 17:11 +0200, Magnus Hagander wrote:

> > Yeah, that's what you need to do. I would guess you were previously
> > subscribed as pe...@postgresql.org, but the git commit script sends
> > the email from pete...@gmx.net, so you need to subscribe from that one
> > (with or without nomail).
> 
> No, that address was not subscribed to that list.  There must have been
> some other mechanism at work.

Yes.  Marc had a "sublist" with the addresses of all committers, which
were accepted without moderation and without being subscribed.  See
restrict_post in the "moderate" section of the Mj2 settings page for
that list; it contains pgsql-committers:restricted.

It would be trivial to add the new list of committer addresses to that
list.  I don't know how that list is edited though; Marc would know.
I think either Magnus or Dave should have enough privilege to do the
edit itself.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers