Re: [HACKERS] [WIP PATCH] for Performance Improvement in Buffer Management

2012-11-22 Thread Amit Kapila
 

From: Pavan Deolasee [mailto:pavan.deola...@gmail.com] 
Sent: Thursday, November 22, 2012 12:26 PM
To: Amit kapila
Cc: Jeff Janes; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [WIP PATCH] for Performance Improvement in Buffer
Management

 

 

 

On Mon, Nov 19, 2012 at 8:52 PM, Amit kapila  wrote:

On Monday, November 19, 2012 5:53 AM Jeff Janes wrote:
On Sun, Oct 21, 2012 at 12:59 AM, Amit kapila 
wrote:
> On Saturday, October 20, 2012 11:03 PM Jeff Janes wrote:
>
>>Run the modes in reciprocating order?
>> Sorry, I didn't understand this. What do you mean by modes in
reciprocating order?

> Sorry for the long delay.  In your scripts, it looks like you always
> run the unpatched first, and then the patched second.

   Yes, that's true.


> By reciprocating, I mean to run them in the reverse order, or in random
order.

Today, for some configurations, I ran the tests in reciprocating order.
Below are the readings:
Configuration:
16GB (Database) - 7GB (Shared Buffers)

Here I ran them in the following order:
1. Run perf report with patch for 32 clients
2. Run perf report without patch for 32 clients
3. Run perf report with patch for 16 clients
4. Run perf report without patch for 16 clients

Each execution is 5 minutes.
 16 client /16 thread   |   32 client /32 thread
 @mv-free-lst  @9.3devl |   @mv-free-lst  @9.3devl
---------------------------------------------------
     3669        4056   |       5356        5258
     3987        4121   |       4625        5185
     4840        4574   |       4502        6796
     6465        6932   |       4558        8233
     6966        7222   |       4955        8237
     7551        7219   |       9115        8269
     8315        7168   |      43171        8340
     9102        7136   |      57920        8349
---------------------------------------------------
     6362        6054   |      16775        7333   (average)

 

> Sorry, I haven't followed this thread at all, but the numbers (43171 and
> 57920) in the last two runs of @mv-free-list for 32 clients look like
> aberrations, no? I wonder if that's skewing the average.

 

Yes, that is one of the main reasons, but it is consistent across all runs
that numbers like these are observed for 32 clients or above.

Jeff also pointed out something similar in one of his mails and suggested
running the tests so that the first test runs "with patch" and then "without
patch".

After doing what he suggested the observations are still similar.

 

 

> I also looked at the Results.htm file down thread. There seems to be a
> steep degradation when the shared buffers are increased from 5GB to 10GB,
> both with and without the patch. Is that expected? If so, isn't that worth
> investigating and possibly even fixing before we do anything else?

 

The reason for the decrease in performance is that when shared buffers are
increased from 5GB to 10GB, I/O starts, because after the increase the OS
buffers can no longer hold all the data.

 

With Regards,

Amit Kapila



Re: [HACKERS] PQconninfo function for libpq

2012-11-22 Thread Boszormenyi Zoltan

On 2012-11-22 08:35, Boszormenyi Zoltan wrote:

On 2012-11-21 22:15, Magnus Hagander wrote:

On Wed, Nov 21, 2012 at 10:01 PM, Boszormenyi Zoltan  wrote:

Hi,

On 2012-11-21 19:19, Magnus Hagander wrote:


I'm breaking this out into its own thread, for my own sanity if
nothing else :) And it's an isolated feature after all.

I still agree with the previous review at

http://archives.postgresql.org/message-id/1349321071.23971.0.ca...@vanquo.pezone.net
about keeping the data in more than one place.


OK, it seems I completely missed that comment.
(Or forgot about it if I happened to answer it.)



Based on that, I've created a different version of this patch,
attached. This way we keep all the data in one struct.


I like this single structure but not the way you handle the
options' classes. In your patch, each option belongs to only
one class. These classes are:

PG_CONNINFO_REPLICATION = "replication" only
PG_CONNINFO_PASSWORD = "password" only
PG_CONNINFO_NORMAL = everything else

How does it help pg_basebackup to filter out e.g. dbname and replication?

PG_CONNINFO_ALL should give pg_basebackup all it needs, no? Or
actually, it shouldn't have the replication=1 part, right? So it
should just use PG_CONNINFO_NORMAL|PG_CONNINFO_PASSWORD?



These are added by the walreceiver module anyway and adding them
to the primary_conninfo line should even be discouraged by the
documentation.

Hmm. I wasn't actually thinking about the dbname part here, I admit that.


And not only the dbname, libpqwalreceiver adds these three:

[zozo@localhost libpqwalreceiver]$ grep dbname *
libpqwalreceiver.c: "%s dbname=replication replication=true fallback_application_name=walreceiver",


I also excluded "application_name" from PG_CONNINFO_REPLICATION
by this reasoning:

- for async replication or single standby, it doesn't matter,
  the connection will show up as "walreceiver"
- for sync replication, the administrator has to add the node name
  manually via application_name.




In my view, the classes should be inclusive:

PG_CONNINFO_NORMAL = Everything that's usable for a regular client
connection. This means everything, maybe including "password" but
excluding "replication".

PG_CONNINFO_PASSWORD = "password" only.

PG_CONNINFO_REPLICATION = Everything usable for a replication client
not added by walreceiver. Maybe including/excluding "password".

Maybe there should be two flags for replication usage:

PG_CONNINFO_WALRECEIVER = everything except those not added
by walreceiver (and maybe "password" too)

PG_CONNINFO_REPLICATION = "replication" only

And every option can belong to more than one class, just as in my patch.

Hmm. I kind of liked having each option in just one class, but I see
the problem. Looking at the ones you suggest, all the non-password
ones *have* to be without password, otherwise there is no way to get
the conninfo without password - which is the whole reason for that
parameter to exist. So the PASSWORD one has to be additional - which
means that not making the other ones additional makes them
inconsistent. But maybe we don't really have a choice there.


Yes, the PASSWORD part can be on its own; this is what I meant above,
but I wanted a second opinion on whether having it completely separate
is better or not.

But the NORMAL and REPLICATION (or WALRECEIVER) classes
need to overlap.


At this point, the patch is untested beyond compiling and a trivial
psql check, because I ran out of time :) But I figured I'd throw it
out there for comments on which version people prefer. (And yes, it's
quite possible I've made a big think-mistake in it somewhere, but
again, better to get some eyes on it early)

My version also contains a fixed version of the docs that should be
moved back into Zoltan's version if that's the one we end up
preferring.


I also liked your version of the documentation better,
I am not too good at writing docs.

np.



Also, a question was buried in the other review which is - are we OK
to remove the requiressl parameter. Both these patches do so, because
the code becomes much simpler if we can do that. It has been
deprecated since 7.2. Is it OK to remove it, or do we need to put back
in the more complex code to deal with both?

Just going to highlight that we're looking for at least one third
party to comment on this :)


Yes, me too. A +1 for removing wouldn't count from me. ;-)




Attached are both Zoltan's latest patch (v16) and my slightly different
approach.

Comments on which approach is best?

Test results from somebody who has the time to look at it? :)

Do you happen to have a set of tests you've been running on your
patches? Can you try them against this one?


My set of tests are:

1. initdb the master
2. pg_basebackup -R the first standby from the master
3. pg_basebackup -R the second standby from the master
4. pg_basebackup -R the third standby from the first standby

and diff -durpN the different data directories while there is no load on either.

I w

Re: [HACKERS] FDW for PostgreSQL

2012-11-22 Thread Shigeru Hanada
After playing with some big SQLs for testing, I came to feel that
showing every remote query in EXPLAIN output is annoying, especially
when SELECT * is unfolded to long column list.

AFAIK no plan node shows this much information on one line, so I'm
inclined to make postgres_fdw show it only when VERBOSE is
specified.  This would make EXPLAIN output easier to read, even if many
foreign tables are used in a query.

Thoughts?

-- 
Shigeru HANADA


Re: [HACKERS] FDW for PostgreSQL

2012-11-22 Thread Kohei KaiGai
2012/11/22 Shigeru Hanada :
> After playing with some big SQLs for testing, I came to feel that
> showing every remote query in EXPLAIN output is annoying, especially
> when SELECT * is unfolded to long column list.
>
> AFAIK no plan node shows so many information in a line, so I'm
> inclined to make postgres_fdw to show it only when VERBOSE was
> specified.  This would make EXPLAIN output easy to read, even if many
> foreign tables are used in a query.
>
> Thoughts?
>
Indeed, I also think it is a reasonable solution.
-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW for PostgreSQL

2012-11-22 Thread Albe Laurenz
Kohei KaiGai wrote:
> 2012/11/22 Shigeru Hanada :
>> After playing with some big SQLs for testing, I came to feel that
>> showing every remote query in EXPLAIN output is annoying, especially
>> when SELECT * is unfolded to long column list.
>>
>> AFAIK no plan node shows so many information in a line, so I'm
>> inclined to make postgres_fdw to show it only when VERBOSE was
>> specified.  This would make EXPLAIN output easy to read, even if many
>> foreign tables are used in a query.
>>
>> Thoughts?
>>
> Indeed, I also think it is reasonable solution.

+1

That's the way I do it for oracle_fdw.

Yours,
Laurenz Albe


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Database object names and libpq in UTF-8 locale on Windows

2012-11-22 Thread Sebastien FLAESCH

Tom, Andrew,

We have the same issue in our product: Support UTF-8 on Windows.

You certainly know that the UTF-8 code page (65001) is not supported by MS
Windows when you set the locale with setlocale(). You cannot rely on standard
libc functions such as isalpha(), mbtowc(), mbstowcs(), wctomb(), wcstombs(),
or strcoll(), which depend on the current locale.

You should start by centralizing all basic character-set related functions
(upper/lower, comparison, etc.) in a library, to ease porting to Windows.

Then convert UTF-8 data to wide char and call wide char functions.

For example, to implement an uppercase() function:

1) Convert UTF-8 to Wide Char (algorithm can be easily found)
2) Use towupper()
3) Convert Wide Char result to UTF-8 (algorithm can be easily found)

To compare characters:

1) Convert s1 in UTF-8 to Wide Char => wcs1
2) Convert s2 in UTF-8 to Wide Char => wcs2
3) Use wcscoll(wcs1, wcs2)

Regards,
Seb

On 11/21/2012 06:07 PM, Tom Lane wrote:

Andrew Dunstan  writes:

On 11/21/2012 11:11 AM, Tom Lane wrote:

I'm not sure that's the only place we're doing this ...



Oh, Hmm, darn. Where else do you think we might?


Dunno, but grepping for isupper and/or tolower should find any such
places.

regards, tom lane





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] review: Deparsing DDL command strings

2012-11-22 Thread Pavel Stehule
Hello Dimitri

* patching (success)

pavel ~/src/postgresql $ patch -p1 < binBUNnKQVPBP
patching file src/backend/catalog/heap.c
patching file src/backend/commands/event_trigger.c
patching file src/backend/commands/extension.c
patching file src/backend/commands/sequence.c
patching file src/backend/commands/view.c
patching file src/backend/tcop/utility.c
patching file src/backend/utils/adt/Makefile
patching file src/backend/utils/adt/ddl_rewrite.c
patching file src/backend/utils/adt/ruleutils.c
patching file src/backend/utils/cache/evtcache.c
patching file src/bin/psql/describe.c
patching file src/include/catalog/heap.h
patching file src/include/catalog/pg_event_trigger.h
patching file src/include/commands/event_trigger.h
patching file src/include/commands/extension.h
patching file src/include/commands/sequence.h
patching file src/include/commands/view.h
patching file src/include/utils/builtins.h
patching file src/include/utils/evtcache.h
patching file src/pl/plpgsql/src/pl_comp.c
patching file src/pl/plpgsql/src/pl_exec.c
patching file src/pl/plpgsql/src/plpgsql.h
patching file src/test/regress/expected/event_trigger.out
patching file src/test/regress/sql/event_trigger.sql

* compilation (success)

All of PostgreSQL successfully made. Ready to install.

* regress tests (success)

 All 133 tests passed.

* issues

** missing doc

** statements are not deparsed for the ddl_command_start event

postgres=# create table fooa(a int, b int);
NOTICE:  event: ddl_command_start, context: TOPLEVEL, tag: CREATE
TABLE, operation: CREATE, type: TABLE, schema: , name: 
NOTICE:  command: 
NOTICE:  event: ddl_command_start, context: TOPLEVEL, tag: CREATE
TABLE, operation: CREATE, type: TABLE, schema: , name: 
NOTICE:  command: 
NOTICE:  event: ddl_command_end, context: TOPLEVEL, tag: CREATE TABLE,
operation: CREATE, type: TABLE, schema: public, name: fooa
NOTICE:  command: CREATE TABLE public.fooa (a pg_catalog.int4, b
pg_catalog.int4);

** CREATE FUNCTION is not supported

postgres=# create or replace function bubu(a int) returns int as
$$select $1$$ language sql;
NOTICE:  event: ddl_command_start, context: TOPLEVEL, tag: CREATE
FUNCTION, operation: CREATE, type: FUNCTION, schema: , name:

NOTICE:  command: 
NOTICE:  event: ddl_command_start, context: TOPLEVEL, tag: CREATE
FUNCTION, operation: CREATE, type: FUNCTION, schema: , name:

NOTICE:  command: 
CREATE FUNCTION

** something is wrong in deparsing - constraints are not supported

postgres=# alter table a add column bbbsss int check (bbbsss > 0);
NOTICE:  event: ddl_command_start, context: TOPLEVEL, tag: ALTER
TABLE, operation: ALTER, type: TABLE, schema: , name: 
NOTICE:  command: 
NOTICE:  event: ddl_command_end, context: TOPLEVEL, tag: ALTER TABLE,
operation: ALTER, type: TABLE, schema: public, name: a
NOTICE:  command: ALTER TABLE public.a ADD COLUMN bbbsss pg_catalog.int4, ;
ALTER TABLE


Regards

Pavel


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fix ecpg core dump when there's a very long struct variable name in .pgc file

2012-11-22 Thread Chen Huajun

Sorry, there was a mistake (memory was not freed) in the patch I sent just
now, so I'm resending it.

Best Regards,
Chen Huajun

(2012/11/22 18:09), Chen Huajun wrote:
> hi
> 
> I found a small bug in ecpg command and try to fix it.
> Please check if it is correct.
> 
> When using a struct variable whose name is very long (such as 12KB)
> in the .pgc source, ecpg will core dump because of a buffer overflow
> when precompiling the .pgc file.
> 
> $ ecpg testLongStructName.pgc
> Segmentation fault (core dumped)
> 
> 
> Normally nobody will write a variable with such a long name,
> but it's better to fix it anyway.
> 
> 
> Best Regards,
> Chen Huajun
> 
> 
> 
> 

-- 
Best Regards
--
  Fujitsu Nanda Software Technology Co., Ltd. (FNST)
  2nd Software Division, 3rd Development Department
  Chen Huajun
  Addr: Fujitsu Nanda Software Technology Co., Ltd. (FNST)
        No.6 Wenzhu Road, Yuhuatai District, Nanjing, China (210012)
  Mail: che...@cn.fujitsu.com
  Tel : +86+25-86630566-8406  ext. 7998-8406
  Fax : +86+25-83317685
--
diff --git a/postgresql-9.2rc1_org/src/interfaces/ecpg/preproc/type.c b/postgresql-9.2rc1_new/src/interfaces/ecpg/preproc/type.c
index c743616..cf2ff15 100644
--- a/postgresql-9.2rc1_org/src/interfaces/ecpg/preproc/type.c
+++ b/postgresql-9.2rc1_new/src/interfaces/ecpg/preproc/type.c
@@ -506,8 +506,8 @@ ECPGdump_a_struct(FILE *o, const char *name, const char *ind_name, char *arrsiz,
 	 */
 	struct ECPGstruct_member *p,
 			   *ind_p = NULL;
-	char		pbuf[BUFSIZ],
-				ind_pbuf[BUFSIZ];
+	char	   *pbuf = (char *) mm_alloc(strlen(name) + ((prefix == NULL) ? 0 : strlen(prefix)) + 3);
+	char	   *ind_pbuf = (char *) mm_alloc(strlen(ind_name) + ((ind_prefix == NULL) ? 0 : strlen(ind_prefix)) + 3);
 
 	if (atoi(arrsiz) == 1)
 		sprintf(pbuf, "%s%s.", prefix ? prefix : "", name);
@@ -540,6 +540,9 @@ ECPGdump_a_struct(FILE *o, const char *name, const char *ind_name, char *arrsiz,
 		if (ind_p != NULL && ind_p != &struct_no_indicator)
 			ind_p = ind_p->next;
 	}
+
+	free(pbuf);
+	free(ind_pbuf);
 }
 
 void

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP json generation enhancements

2012-11-22 Thread Dimitri Fontaine
Andrew Dunstan  writes:
> Here is a WIP patch for enhancements to json generation.
>
> First, there is the much_requested json_agg, which will aggregate rows
> directly to json. So the following will now work:
>
> select json_agg(my_table) from mytable;
> select json_agg(q) from () q;

Awesome, thanks!

How do you handle the nesting of the source elements? I would expect a
variant of the aggregate that takes two input parameters, the datum and
the current nesting level.

Consider a tree table using parent_id and a recursive query to display
the tree. You typically handle the nesting with an accumulator and a
call to repeat() to prepend some spaces before the value columns. What
about passing that nesting level (integer) to the json_agg()?

Here's a worked out example:

CREATE TABLE parent_child (
parent_id integer NOT NULL,
this_node_id integer NULL
);

INSERT INTO parent_child (parent_id, this_node_id) VALUES (0, 1);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 2);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 4);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 6);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 7);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 9);
INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10);


WITH RECURSIVE tree(id, level, parents) AS (
SELECT this_node_id as id, 0 as level, '{}'::int[] as parents
  FROM parent_child
 WHERE parent_id = 0
 
UNION ALL

SELECT this_node_id as id, t.level + 1, t.parents || c.parent_id
  FROM parent_child c
  JOIN tree t ON t.id = c.parent_id
)
SELECT json_agg(id, level)
  FROM tree;

I've left the parents column in the query above as a debug facility, but
it's not needed in that case.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQconninfo function for libpq

2012-11-22 Thread Magnus Hagander
On Thu, Nov 22, 2012 at 10:16 AM, Boszormenyi Zoltan  wrote:
> On 2012-11-22 08:35, Boszormenyi Zoltan wrote:
>
>> On 2012-11-21 22:15, Magnus Hagander wrote:
>>>
>>> On Wed, Nov 21, 2012 at 10:01 PM, Boszormenyi Zoltan 
>>> wrote:

 Hi,

 On 2012-11-21 19:19, Magnus Hagander wrote:

> I'm breaking this out into its own thread, for my own sanity if
> nothing else :) And it's an isolated feature after all.
>
> I still agree with the previous review at
>
>
> http://archives.postgresql.org/message-id/1349321071.23971.0.ca...@vanquo.pezone.net
> about keeping the data in more than one place.


 OK, it seems I completely missed that comment.
 (Or forgot about it if I happened to answer it.)


> Based on that, I've created a different version of this patch,
> attached. This way we keep all the data in one struct.


 I like this single structure but not the way you handle the
 options' classes. In your patch, each option belongs to only
 one class. These classes are:

 PG_CONNINFO_REPLICATION = "replication" only
 PG_CONNINFO_PASSWORD = "password" only
 PG_CONNINFO_NORMAL = everything else

 How does it help pg_basebackup to filter out e.g. dbname and
 replication?
>>>
>>> PG_CONNINFO_ALL should give pg_basebackup all it needs, no? Or
>>> actually, it shouldn't have the replication=1 part, right? So it
>>> should just use PG_CONNINFO_NORMAL|PG_CONNINFO_PASSWORD?
>>>
>>>
 These are added by the walreceiver module anyway and adding them
 to the primary_conninfo line should even be discouraged by the
 documentation.
>>>
>>> Hmm. I wasn't actually thinking about the dbname part here, I admit that.
>>
>>
>> And not only the dbname, libpqwalreceiver adds these three:
>>
>> [zozo@localhost libpqwalreceiver]$ grep dbname *
>> libpqwalreceiver.c: "%s dbname=replication replication=true fallback_application_name=walreceiver",
>>
>> I also excluded "application_name" from PG_CONNINFO_REPLICATION
>> by this reasoning:
>>
>> - for async replication or single standby, it doesn't matter,
>>   the connection will show up as "walreceiver"
>> - for sync replication, the administrator has to add the node name
>>   manually via application_name.
>>
>>>
 In my view, the classes should be inclusive:

 PG_CONNINFO_NORMAL = Everything that's usable for a regular client
 connection. This mean everything, maybe including "password" but
 excluding "replication".

 PG_CONNINFO_PASSWORD = "password" only.

 PG_CONNINFO_REPLICATION = Everything usable for a replication client
 not added by walreceiver. Maybe including/excluding "password".

 Maybe there should be two flags for replication usage:

 PG_CONNINFO_WALRECEIVER = everything except those not added
 by walreceiver (and maybe "password" too)

 PG_CONNINFO_REPLICATION = "replication" only

 And every option can belong to more than one class, just as in my patch.
>>>
>>> Hmm. I kind of liked having each option in just one class, but I see
>>> the problem. Looking at the ones you suggest, all the non-password
>>> ones *have* to be without password, otherwise there is no way to get
>>> the conninfo without password - which is the whole reason for that
>>> parameter to exist. So the PASSWORD one has to be additional - which
>>> means that not making the other ones additional makes them
>>> inconsistent. But maybe we don't really have a choice there.
>>
>>
>> Yes, the PASSWORD part can be on its own, this is what I meant above
>> but wanted a different opinion about having it completely separate
>> is better or not.
>>
>> But the NORMAL and REPLICATION (or WALRECEIVER) classes
>> need to overlap.
>>
> At this point, the patch is untested beyond compiling and a trivial
> psql check, because I ran out of time :) But I figured I'd throw it
> out there for comments on which version people prefer. (And yes, it's
> quite possible I've made a big think-mistake in it somewhere, but
> again, better to get some eyes on it early)
>
> My version also contains a fixed version of the docs that should be
> moved back into Zoltans version if that's the one we end up
> preferring.


 I also liked your version of the documentation better,
 I am not too good at writing docs.
>>>
>>> np.
>>>
>>>
> Also, a question was buried in the other review which is - are we OK
> to remove the requiressl parameter. Both these patches do so, because
> the code becomes much simpler if we can do that. It has been
> deprecated since 7.2. Is it OK to remove it, or do we need to put back
> in the more complex code to deal with both?
>>>
>>> Just going to highlight that we're looking for at least one third
>>> party to comment on this :)
>>
>>
>> Yes, me too. A +1 for removing wouldn't count from me. ;-)
>

Re: [HACKERS] WIP patch: add (PRE|POST)PROCESSOR options to COPY

2012-11-22 Thread Etsuro Fujita
I wrote:
> > The biggest problem this patch has had from the very beginning is
> > overdesign, and this is more of the same.  Let's please just define the
> > feature as "popen, not fopen, the given string" and have done.  You can
> > put all the warning verbiage you want in the documentation.  (But note
> > that the server-side version would be superuser-only in any flavor of
> > the feature.)
> 
> Agreed.  I'll reimplement the feature using the PROGRAM keyword:
> 
> > COPY TABLE FROM PROGRAM 'command line';
> > COPY TABLE TO PROGRAM 'command line';

I've reimplemented the feature.  Attached is an updated version of the patch.

Todo:
* More documentation
* More tests

Any comments are welcome.

Thanks,

Best regards,
Etsuro Fujita


copy-popen-20121122.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] C-function, don't load external dll file

2012-11-22 Thread pl65
Dimitri Fontaine-7 wrote
> You need to declare it in SQL, maybe like this:  create function
> public.transform(text) returns text  as '$libdir/fservice', 'transform'
> language C;

I'm afraid that won't work, because fservice.dll is written in C++, while the
DLL file that contains the transform function (the function exported to
PostgreSQL) was written in ANSI C. Secondly, does LOAD in PostgreSQL also
load all the DLLs mine depends on? I think I should use: LOAD 'fservice.dll'.
But I don't know whether that loads the dependent DLLs too, or whether I have
to use the functions/macros from windows.h (HINSTANCE, LoadLibrary(),
GetProcAddress())?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/C-function-don-t-load-external-dll-file-tp5732931p5733197.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

Re: [HACKERS] Switching timeline over streaming replication

2012-11-22 Thread Amit Kapila
On Wednesday, November 21, 2012 11:36 PM Heikki Linnakangas wrote:
> On 20.11.2012 15:33, Amit Kapila wrote:
> > Defect-2:
> >  1. start primary A
> >  2. start standby B following A
> >  3. start cascade standby C following B.
> >  4. Start another standby D following C.
> >  5. Execute the following commands in the primary A.
> > create table tbl(f int);
> > insert into tbl values(generate_series(1,1000));
> >  6. Promote standby B.
> >  7. Execute the following commands in the primary B.
> > insert into tbl values(generate_series(1001,2000));
> > insert into tbl values(generate_series(2001,3000));
> >
> >  The following logs are observed on standby C:
> >
> >  LOG:  restarted WAL streaming at position 0/700 on tli 2
> >  ERROR:  requested WAL segment 00020007 has
> > already been removed
> >  LOG:  record with zero length at 0/7028190
> >  LOG:  record with zero length at 0/7048540
> >  LOG:  out-of-sequence timeline ID 1 (after 2) in log segment
> > 00020007, offset 0
> 
> I propose the attached patch (against 9.2) to fix that. This should be
> backpatched to 9.0, where standby_mode was introduced. The code was the
> same in 8.4, too, but AFAICS there was no problem there because 8.4
> never tried to re-open the same WAL segment after replaying some of it.

Fixed.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP json generation enhancements

2012-11-22 Thread Andrew Dunstan


On 11/22/2012 05:54 AM, Dimitri Fontaine wrote:

Andrew Dunstan  writes:

Here is a WIP patch for enhancements to json generation.

First, there is the much_requested json_agg, which will aggregate rows
directly to json. So the following will now work:

 select json_agg(my_table) from mytable;
 select json_agg(q) from () q;

Awesome, thanks!

How do you handle the nesting of the source elements? I would expect a
variant of the aggregate that takes two input parameters, the datum and
the current nesting level.

Consider a tree table using parent_id and a recursive query to display
the tree. You typically handle the nesting with an accumulator and a
call to repeat() to prepend some spaces before the value columns. What
about passing that nesting level (integer) to the json_agg()?

Here's a worked out example:

 CREATE TABLE parent_child (
 parent_id integer NOT NULL,
 this_node_id integer NULL
 );
 
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (0, 1);

 INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 2);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 3);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (1, 4);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 5);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (2, 6);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 7);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 8);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (4, 9);
 INSERT INTO parent_child (parent_id, this_node_id) VALUES (9, 10);


 WITH RECURSIVE tree(id, level, parents) AS (
 SELECT this_node_id as id, 0 as level, '{}'::int[] as parents
   FROM parent_child
  WHERE parent_id = 0
  
 UNION ALL
 
 SELECT this_node_id as id, t.level + 1, t.parents || c.parent_id

   FROM parent_child c
   JOIN tree t ON t.id = c.parent_id
 )
 SELECT json_agg(id, level)
   FROM tree;

I've left the parents column in the query above as a debug facility, but
it's not needed in that case.



The function only takes a single argument and aggregates all the values 
into a json array. If the arguments are composites they will produce 
json objects.


People complained that to get a resultset as json in 9.2 you have to do

select array_to_json(array_agg(q)) ...

which is both a bit cumbersome and fairly inefficient. json_agg(q) is 
equivalent to the above expression but is both simpler and much more 
efficient.


If you want a tree structured object you'll need to construct it 
yourself - this function won't do the nesting for you. That's beyond its 
remit.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-22 Thread Amit kapila

On Tuesday, November 20, 2012 7:21 PM Amit Kapila wrote:
On Monday, November 19, 2012 9:07 PM Amit Kapila wrote:
> On Monday, November 19, 2012 8:36 PM Alvaro Herrera wrote:
> > Amit Kapila escribió:
> >
> > > The only point I can see against SET PERSISTENT is that other
> variants
> > of
> > > SET command can be used in
> > > transaction blocks means for them ROLLBACK TO SAVEPOINT
> functionality
> > works,
> > > but for SET PERSISTENT,
> > > it can't be done.
> > > So to handle that might be we need to mention this point in User
> > Manual, so
> > > that users can be aware of this usage.
> > > If that is okay, then I think SET PERSISTENT is good to go.
> >
> > I think that's okay.  There are other commands which have some forms
> > that can run inside a transaction block and others not.  CLUSTER is
> > one example (maybe the only one?  Not sure).
>

> If no objections to SET PERSISTENT .. syntax, I shall update the patch for
> implementation of same.

The patch to implement the SET PERSISTENT command is attached to this mail.
It can now be reviewed.

I have not updated the docs yet, as I first want feedback on the behaviour
of the implementation, so that the docs can be updated appropriately.

With Regards,
Amit Kapila.

set_persistent.v1.patch
Description: set_persistent.v1.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] auto_explain WAS: RFC: Timing Events

2012-11-22 Thread Alvaro Herrera
Greg Smith escribió:

> If I could just turn off logging just during those
> periods--basically, throwing them away only when some output rate
> throttled component hit its limit--I could still find them in the
> data collected the rest of the time.  There are some types of
> problems that also only occur under peak load that this idea would
> miss, but you'd still be likely to get *some* of them,
> statistically.

Uhm, what if you had a simple script that changed postgresql.conf and
SIGHUP'ed postmaster?  You could have it turn logging on and off
depending on predefined system load watermarks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PQconninfo function for libpq

2012-11-22 Thread Boszormenyi Zoltan

On 2012-11-22 12:44, Magnus Hagander wrote:

On Thu, Nov 22, 2012 at 10:16 AM, Boszormenyi Zoltan  wrote:

On 2012-11-22 08:35, Boszormenyi Zoltan wrote:


On 2012-11-21 22:15, Magnus Hagander wrote:

On Wed, Nov 21, 2012 at 10:01 PM, Boszormenyi Zoltan 
wrote:

Hi,

On 2012-11-21 19:19, Magnus Hagander wrote:


I'm breaking this out into its own thread, for my own sanity if
nothing else :) And it's an isolated feature after all.

I still agree with the previous review at


http://archives.postgresql.org/message-id/1349321071.23971.0.ca...@vanquo.pezone.net
about keeping the data in more than one place.


OK, it seems I completely missed that comment.
(Or forgot about it if I happened to answer it.)



Based on that, I've created a different version of this patch,
attached. This way we keep all the data in one struct.


I like this single structure but not the way you handle the
options' classes. In your patch, each option belongs to only
one class. These classes are:

PG_CONNINFO_REPLICATION = "replication" only
PG_CONNINFO_PASSWORD = "password" only
PG_CONNINFO_NORMAL = everything else

How does it help pg_basebackup to filter out e.g. dbname and
replication?

PG_CONNINFO_ALL should give pg_basebackup all it needs, no? Or
actually, it shouldn't have the replication=1 part, right? So it
should just use PG_CONNINFO_NORMAL|PG_CONNINFO_PASSWORD?



These are added by the walreceiver module anyway and adding them
to the primary_conninfo line should even be discouraged by the
documentation.

Hmm. I wasn't actually thinking about the dbname part here, I admit that.


And not only the dbname, libpqwalreceiver adds these three:

[zozo@localhost libpqwalreceiver]$ grep dbname *
libpqwalreceiver.c: "%s dbname=replication replication=true
fallback_application_name=walreceiver",

I also excluded "application_name" from PG_CONNINFO_REPLICATION
by this reasoning:

- for async replication or single standby, it doesn't matter,
   the connection will show up as "walreceiver"
- for sync replication, the administrator has to add the node name
   manually via application_name.


In my view, the classes should be inclusive:

PG_CONNINFO_NORMAL = Everything that's usable for a regular client
connection. This mean everything, maybe including "password" but
excluding "replication".

PG_CONNINFO_PASSWORD = "password" only.

PG_CONNINFO_REPLICATION = Everything usable for a replication client
not added by walreceiver. Maybe including/excluding "password".

Maybe there should be two flags for replication usage:

PG_CONNINFO_WALRECEIVER = everything except those not added
by walreceiver (and maybe "password" too)

PG_CONNINFO_REPLICATION = "replication" only

And every option can belong to more than one class, just as in my patch.

Hmm. I kind of liked having each option in just one class, but I see
the problem. Looking at the ones you suggest, all the non-password
ones *have* to be without password, otherwise there is no way to get
the conninfo without the password - which is the whole reason for that
parameter to exist. So the PASSWORD one has to be additional - which
means that not making the other ones additional makes them
inconsistent. But maybe we don't really have a choice there.


Yes, the PASSWORD part can be on its own; this is what I meant above,
but I wanted a second opinion on whether having it completely separate
is better or not.

But the NORMAL and REPLICATION (or WALRECEIVER) classes
need to overlap.


At this point, the patch is untested beyond compiling and a trivial
psql check, because I ran out of time :) But I figured I'd throw it
out there for comments on which version people prefer. (And yes, it's
quite possible I've made a big think-mistake in it somewhere, but
again, better to get some eyes on it early)

My version also contains a fixed version of the docs that should be
moved back into Zoltan's version if that's the one we end up
preferring.


I also liked your version of the documentation better,
I am not too good at writing docs.

np.



Also, a question was buried in the other review which is - are we OK
to remove the requiressl parameter. Both these patches do so, because
the code becomes much simpler if we can do that. It has been
deprecated since 7.2. Is it OK to remove it, or do we need to put back
in the more complex code to deal with both?

Just going to highlight that we're looking for at least one third
party to comment on this :)


Yes, me too. A +1 for removing wouldn't count from me. ;-)


Attached are both Zoltan's latest patch (v16) and my slightly different
approach.

Comments on which approach is best?

Test results from somebody who has the time to look at it? :)

Do you happen to have a set of tests you've been running on your
patches? Can you try them against this one?


My set of tests is:

1. initdb the master
2. pg_basebackup -R the first standby from the master
3. pg_basebackup -R the second standby from the master
4. pg_basebackup -R the third standby from the master

[HACKERS] pg_resetxlog defect with relative database path

2012-11-22 Thread Hari Babu

While testing pg_resetxlog with a relative database path, I found that
pg_resetxlog performs the transaction log reset even when the server is
running.

Steps to reproduce:
1. ./pg_ctl -D ../../data start
2. ./pg_resetxlog -f ../../data  -- fails to detect that the server
is already running.


Please find the patch for the same: 

*** a/src/bin/pg_resetxlog/pg_resetxlog.c 
--- b/src/bin/pg_resetxlog/pg_resetxlog.c 
*** 
*** 254,260  main(int argc, char *argv[]) 
   */ 
  snprintf(path, MAXPGPATH, "%s/postmaster.pid", DataDir); 
  
! if ((fd = open(path, O_RDONLY, 0)) < 0) 
  { 
  if (errno != ENOENT) 
  { 
--- 254,260  
   */ 
  snprintf(path, MAXPGPATH, "%s/postmaster.pid", DataDir); 
  
! if ((fd = open("postmaster.pid", O_RDONLY, 0)) < 0) 
  { 
  if (errno != ENOENT) 
  { 

Any suggestions/comments?

Regards, 
Hari babu.





Re: [HACKERS] pg_resetxlog defect with relative database path

2012-11-22 Thread Fujii Masao
On Thu, Nov 22, 2012 at 10:22 PM, Hari Babu  wrote:
>
> When I was testing pg_resetxlog with relative database path,
> The pg_resetxlog is doing the transaction log reset even when the server is
> running.
>
> Steps to reproduce:
> 1. ./pg_ctl -D ../../data start
> 2. ./pg_resetxlog -f ../../data  -- is not able to detect as server
> is already running.
>
>
> Please find the patch for the same:
>
> *** a/src/bin/pg_resetxlog/pg_resetxlog.c
> --- b/src/bin/pg_resetxlog/pg_resetxlog.c
> ***
> *** 254,260  main(int argc, char *argv[])
>*/
>   snprintf(path, MAXPGPATH, "%s/postmaster.pid", DataDir);
>
> ! if ((fd = open(path, O_RDONLY, 0)) < 0)
>   {
>   if (errno != ENOENT)
>   {
> --- 254,260 
>*/
>   snprintf(path, MAXPGPATH, "%s/postmaster.pid", DataDir);
>
> ! if ((fd = open("postmaster.pid", O_RDONLY, 0)) < 0)
>   {
>   if (errno != ENOENT)
>   {
>
> Any suggestions/comments?

Looks good to me.

Regards,

-- 
Fujii Masao




Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-11-22 Thread Tom Lane
Peter Eisentraut  writes:
> On 11/21/12 9:42 AM, Robert Haas wrote:
>> On Mon, Nov 19, 2012 at 6:19 PM, Peter Eisentraut  wrote:
>>> I continue to be of the opinion that allowing this second case to work
>>> is not desirable.

>> 1. Why?

> Because a strongly-typed system should not cast numbers to strings
> implicitly.  Does the equivalent of the lpad case work in any other
> strongly-typed programming language?

The argument here is basically between ease of use and ability to detect
common programming mistakes.  It's not clear to me that there is any
principled way to make such a tradeoff, because different people can
reasonably put different weights on those two goals.

>> 2. What's your counter-proposal?

> Leave things as they are.

FWIW, I agree with Peter.  It's been like this for a long time and
whether the system would be easier to use or not, it would definitely
be uglier and harder to explain.  ("Assignment casts are used only
for assignments ... except when they aren't.")

I notice that the proposed patch is devoid of documentation.  Perhaps
after Robert is done writing the necessary changes to the SGML docs
about type conversions and casts, he'll agree this is pretty ugly.

regards, tom lane




Re: [HACKERS] review: Deparsing DDL command strings

2012-11-22 Thread Dimitri Fontaine
Hi,

Thank you Pavel for reviewing my patch! :)

Pavel Stehule  writes:
> * issues
>
> ** missing doc

Yes. I first wanted to reach an agreement on what we need the deparsing
for. You can read the last comment I made about it at the following link;
I didn't get any answer to my proposal.

  http://archives.postgresql.org/message-id/m2391fu79m@2ndquadrant.fr

In that email, I said:
> Also, I'm thinking that publishing the normalized command string is
> something that must be maintained in the core code, whereas the external
> stable format might be done as a contrib extension, coded in C, working
> with the Node *parsetree. Because it needs to output a compatible format
> when applied to different major versions of PostgreSQL, I think it suits
> quite well the model of C coded extensions.

I'd like to hear what people think about that position. I could work on
the docs in the meantime, I guess, but a non-trivial amount of what I'm
going to document will be thrown away if we end up deciding that we
don't want the normalized command string…

> ** statements are not deparsed for the ddl_command_start event

Yes, that's by design, and that's why we have the ddl_command_trace
event alias too. Tom is right in saying that until the catalogs are
fully populated we cannot rewrite most of the command string if we want
normalized output (types, constraints, namespaces, all that jazz).

> ** CREATE FUNCTION is not supported

Not yet. The goal of this patch in this CF is to get a green light on
providing a full implementation of what information to add to event
triggers and in particular about rewriting command strings.

That said, if you think it would help you as a reviewer, I may attack
the CREATE FUNCTION support right now.

> * something wrong in deparsing - constraints are not supported
>
> postgres=# alter table a add column bbbsss int check (bbbsss > 0);
> NOTICE:  event: ddl_command_start, context: TOPLEVEL, tag: ALTER
> TABLE, operation: ALTER, type: TABLE, schema: , name: 
> NOTICE:  command: 
> NOTICE:  event: ddl_command_end, context: TOPLEVEL, tag: ALTER TABLE,
> operation: ALTER, type: TABLE, schema: public, name: a
> NOTICE:  command: ALTER TABLE public.a ADD COLUMN bbbsss pg_catalog.int4, ;
> ALTER TABLE

Took me some time to figure out how the constraint processing works in
CREATE TABLE, and apparently I didn't finish ALTER TABLE support yet.
Working on that, thanks.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] pg_resetxlog defect with relative database path

2012-11-22 Thread Tom Lane
Hari Babu  writes:
> When I was testing pg_resetxlog with relative database path,
> The pg_resetxlog is doing the transaction log reset even when the server is
> running.

Ooops.  Fixed, thanks for the report!

regards, tom lane




Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-22 Thread Heikki Linnakangas

On 21.11.2012 23:29, Alvaro Herrera wrote:

Alvaro Herrera wrote:

FWIW I have pushed this to github; see
https://github.com/alvherre/postgres/compare/bgworker

It's also attached.

The UnBlockSig stuff is the main stumbling block as I see it because it
precludes compilation on Windows.  Maybe we should fix that by providing
another function that the module is to call after initialization is done
and before it gets ready to work ... but having a function that only
calls PG_SETMASK() feels somewhat useless to me; and I don't see what
else we could do in there.


I cleaned up some more stuff and here's another version.  In particular
I added wrapper functions to block and unblock signals, so that this
doesn't need exported UnBlockSig.


Could you just unblock the signals before calling into the background 
worker's main() function?


- Heikki




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-22 Thread Fujii Masao
On Thu, Nov 22, 2012 at 9:38 PM, Amit kapila  wrote:
> Patch to implement SET PERSISTENT command is attached with this mail.
> Now it can be reviewed.

I got the following compile warnings:
xact.c:1847: warning: implicit declaration of function
'AtEOXact_UpdateAutoConfFiles'
guc.c:5134: warning: enumeration value 'PGC_ENUM' not handled in switch

"SET PERSISTENT name TO value" failed, though "SET PERSISTENT name = value"
succeeded.

=# SET PERSISTENT synchronous_commit TO 'local';
ERROR:  syntax error at or near "TO"
LINE 1: SET PERSISTENT synchronous_commit TO 'local';
=# SET PERSISTENT synchronous_commit = 'local';
SET

SET PERSISTENT succeeded even when an invalid setting value was given.

=# SET PERSISTENT synchronous_commit = 'hoge';
SET

The SET PERSISTENT syntax should be explained in the output of psql's "\help SET" command.

When I removed postgresql.auto.conf, SET PERSISTENT failed.

=# SET PERSISTENT synchronous_commit = 'local';
ERROR:  failed to open "postgresql.auto.conf" file

Should we implement "RESET PERSISTENT"? Otherwise there is no way to
get rid of a parameter setting in postgresql.auto.conf via SQL. Should
we also implement "SET PERSISTENT name TO DEFAULT"?

Would it be helpful to always output a notice message like 'Run
pg_reload_conf() or restart the server if you want the new settings to
take effect' after the SET PERSISTENT command?

Regards,

-- 
Fujii Masao




Re: [HACKERS] pg_trgm partial-match

2012-11-22 Thread Fujii Masao
On Mon, Nov 19, 2012 at 7:55 PM, Alexander Korotkov
 wrote:
> On Mon, Nov 19, 2012 at 10:05 AM, Alexander Korotkov 
> wrote:
>>
>> On Thu, Nov 15, 2012 at 11:39 PM, Fujii Masao 
>> wrote:
>>>
>>> Note that we cannot do a partial-match if KEEPONLYALNUM is disabled,
>>> i.e., if query key contains multibyte characters. In this case, byte
>>> length of
>>> the trigram string might be larger than three, and its CRC is used as a
>>> trigram key instead of the trigram string itself. Because of using CRC,
>>> we
>>> cannot do a partial-match. Attached patch extends pg_trgm so that it
>>> compares a partial-match query key only when KEEPONLYALNUM is
>>> enabled.
>>
>>
>> Didn't get this point. How does KEEPONLYALNUM guarantee that each trigram
>> character is singlebyte?
>>
>> CREATE TABLE test (val TEXT);
>> INSERT INTO test VALUES ('aa'), ('aaa'), ('шaaш');
>> CREATE INDEX trgm_idx ON test USING gin (val gin_trgm_ops);
>> ANALYZE test;
>> test=# SELECT * FROM test WHERE val LIKE '%aa%';
>>  val
>> --
>>  aa
>>  aaa
>>  шaaш
>> (3 rows)
>> test=# set enable_seqscan = off;
>> SET
>> test=# SELECT * FROM test WHERE val LIKE '%aa%';
>>  val
>> -
>>  aa
>>  aaa
>> (2 rows)
>>
>> I think we can use partial match only for singlebyte encodings. Or, at
>> most, in cases when all alpha-numeric characters are singlebyte (have no
>> idea how to check this).

Good catch! You're right.

> Actually, I also was fiddling around idea of partial match on trigrams when
> I was working on initial LIKE patch. But, I concluded that we would need a
> separate opclass which always keeps full trigram in entry.

Agreed.

My goal is to make pg_trgm full-text searches whose keyword consists of
only one or two Japanese characters use an index efficiently.
I implemented the partial-match patch first, and was planning to introduce
a new opclass in the next CommitFest that stores multibyte characters as
text data instead of int4. But obviously the order of development should
have been the opposite. I will work on the latter development first, and
add a new opclass like gin_trgm_mb_ops ("mb" means multibyte) which
ignores KEEPONLYALNUM and stores the GIN index key as a text value.

Regards,

-- 
Fujii Masao




Re: [HACKERS] pg_trgm partial-match

2012-11-22 Thread Fujii Masao
On Mon, Nov 19, 2012 at 10:56 AM, Tomas Vondra  wrote:
> I've done a quick review of the current patch:

Thanks for the commit!

As Alexander pointed out upthread, another infrastructure patch is required
before applying this patch. So I will implement the infra patch first.

Regards,

-- 
Fujii Masao




Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-22 Thread Alvaro Herrera
Heikki Linnakangas wrote:
> On 21.11.2012 23:29, Alvaro Herrera wrote:
> >Alvaro Herrera wrote:
> >>FWIW I have pushed this to github; see
> >>https://github.com/alvherre/postgres/compare/bgworker
> >>
> >>It's also attached.
> >>
> >>The UnBlockSig stuff is the main stumbling block as I see it because it
> >>precludes compilation on Windows.  Maybe we should fix that by providing
> >>another function that the module is to call after initialization is done
> >>and before it gets ready to work ... but having a function that only
> >>calls PG_SETMASK() feels somewhat useless to me; and I don't see what
> >>else we could do in there.
> >
> >I cleaned up some more stuff and here's another version.  In particular
> >I added wrapper functions to block and unblock signals, so that this
> >doesn't need exported UnBlockSig.
> 
> Could you just unblock the signals before calling into the
> background worker's main() function?

Yes, but what if a daemon wants to block/unblock signals later?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] review: plpgsql return a row-expression

2012-11-22 Thread Asif Rehman
Thanks for the review Pavel. I have taken care of the points you raised.
Please see the updated patch.

Regards,
--Asif

On Wed, Nov 21, 2012 at 1:47 AM, Pavel Stehule wrote:

> related to
> http://archives.postgresql.org/message-id/caauglxwpedfwae6dajmf7sxewfusa0f68p07retbbpf_fsa...@mail.gmail.com
>
> * patched and compiled withou warnings
>
> * All 133 tests passed.
>
>
> but
>
> I don't like
>
> * calling an invalid function from an anonymous block - it is messy (regress
> tests) - there is no reason to do it
>
> +create or replace function foo() returns footype as $$
> +declare
> +  v record;
> +  v2 record;
> +begin
> +  v := (1, 'hello');
> +  v2 := (1, 'hello');
> +  return (v || v2);
> +end;
> +$$ language plpgsql;
> +DO $$
> +declare
> +  v footype;
> +begin
> +  v := foo();
> +  raise info 'x = %', v.x;
> +  raise info 'y = %', v.y;
> +end; $$;
> +ERROR:  operator does not exist: record || record
> +LINE 1: SELECT (v || v2)
> +  ^
>
> * there is some performance issue
>
> create or replace function fx2(a int)
> returns footype as $$
> declare x footype;
> begin
>   x = (10,20);
>   return x;
> end;
> $$ language plpgsql;
>
> postgres=# select sum(fx2.x) from generate_series(1,10) g(i),
> lateral fx2(i);
>sum
> -
>  100
> (1 row)
>
> Time: 497.129 ms
>
> returns footype as $$
> begin
>   return (10,20);
> end;
> $$ language plpgsql;
>
> postgres=# select sum(fx2.x) from generate_series(1,10) g(i),
> lateral fx2(i);
>sum
> -
>  100
> (1 row)
>
> Time: 941.192 ms
>
> following code has same functionality and it is faster
>
> if (stmt->expr != NULL)
> {
> if (estate->retistuple)
> {
> TupleDesc   tupdesc;
> Datum   retval;
> Oid rettype;
> boolisnull;
> int32   tupTypmod;
> Oid tupType;
> HeapTupleHeader td;
> HeapTupleData   tmptup;
>
> retval = exec_eval_expr(estate,
> stmt->expr,
> &isnull,
> &rettype);
>
> /* Source must be of RECORD or composite type */
> if (!type_is_rowtype(rettype))
> ereport(ERROR,
>
> (errcode(ERRCODE_DATATYPE_MISMATCH),
>  errmsg("cannot return
> non-composite value from composite type returning function")));
>
> if (!isnull)
> {
> /* Source is a tuple Datum, so safe to
> do this: */
> td = DatumGetHeapTupleHeader(retval);
> /* Extract rowtype info and find a tupdesc
> */
> tupType = HeapTupleHeaderGetTypeId(td);
> tupTypmod = HeapTupleHeaderGetTypMod(td);
> tupdesc =
> lookup_rowtype_tupdesc(tupType, tupTypmod);
>
> /* Build a temporary HeapTuple control
> structure */
> tmptup.t_len =
> HeapTupleHeaderGetDatumLength(td);
> ItemPointerSetInvalid(&(tmptup.t_self));
> tmptup.t_tableOid = InvalidOid;
> tmptup.t_data = td;
>
> estate->retval =
> PointerGetDatum(heap_copytuple(&tmptup));
> estate->rettupdesc =
> CreateTupleDescCopy(tupdesc);
> ReleaseTupleDesc(tupdesc);
> }
>
> estate->retisnull = isnull;
> }
>
>
>
> * it is too restrictive (maybe) - almost all plpgsql statements do
> automatic conversions (I/O conversions when necessary)
>
> create type footype2 as (a numeric, b varchar)
>
> postgres=# create or replace function fx3(a int)
> returns footype2 as
> $$
> begin
>   return (1000.22234213412342143,'ewqerwqre');
> end;
> $$ language plpgsql;
> CREATE FUNCTION
> postgres=# select fx3(10);
> ERROR:  returned record type does not match expected record type
> DETAIL:  Returned type unknown does not match expected type character
> varying in column 2.
> CONTEXT:  PL/pgSQL function fx3(integer) while casting return value to
> function's return type
> postgres=#
>
> * it doesn't support RETURN NEXT
>
> postgres=# create or replace function fx4()
> postgres-# returns setof footype as $$
> postgres$# begin
> postgres$#   for i in 1..10
> postgres$#   loop
> postgres$# return next (10,20);
> postgres$#   end loop;
> postgres$#  

Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-22 Thread Heikki Linnakangas

On 22.11.2012 19:18, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

On 21.11.2012 23:29, Alvaro Herrera wrote:

Alvaro Herrera wrote:

The UnBlockSig stuff is the main stumbling block as I see it because it
precludes compilation on Windows.  Maybe we should fix that by providing
another function that the module is to call after initialization is done
and before it gets ready to work ... but having a function that only
calls PG_SETMASK() feels somewhat useless to me; and I don't see what
else we could do in there.


I cleaned up some more stuff and here's another version.  In particular
I added wrapper functions to block and unblock signals, so that this
doesn't need exported UnBlockSig.


Could you just unblock the signals before calling into the
background worker's main() function?


Yes, but what if a daemon wants to block/unblock signals later?


Ok. Can you think of an example of a daemon that would like to do that?

Grepping the backend for "BlockSig", the only thing it seems to be 
currently used for is to block nested signals in the SIGQUIT handler 
(see bg_quickdie() for an example). The patch provides a built-in 
SIGQUIT handler for the background workers, so I don't think you need 
BlockSig for that. Or do you envision that it would be OK for a 
background worker to replace the SIGQUIT handler with a custom one?


Even if we provide the BackgroundWorkerBlock/UnblockSignals() functions, 
I think it would still make sense to unblock the signals before calling 
the bgworker's main loop. One less thing for the background worker to 
worry about that way. Or are there some operations that can't be done 
safely after unblocking the signals? Also, I note that some worker 
processes call sigdelset(&BlockSig, SIGQUITE); that remains impossible 
to do in a background worker on Windows, the 
BackgroundWorkerBlock/UnblockSignals() wrapper functions don't help with 
that.


Some documentation on what a worker is allowed to do would be helpful here...

- Heikki




[Re] [Re] Re: [HACKERS] PANIC: could not write to log file

2012-11-22 Thread Cyril VELTER

On 21/11/12 20:39:01, cyril.vel...@metadys.com wrote:
> On 21/11/12 18:10:12, jeff.ja...@gmail.com wrote:
> > On Wed, Nov 21, 2012 at 8:51 AM, Cyril VELTER  
> > wrote:
> > >
> > >After upgrading a pretty big database (several hundred gig) from 8.2 
> > > to 9.2 I'm getting some "PANIC:  could not write to log file" messages. 
> > > Actually I
> > > got two of them. One yesterday and one today.
> > 
> > How was the upgrade done?
> 
>The upgrade was done with the following steps :
> 
>* create a new cluster using 9.2.1 initdb
>* run pg_dump | psql using the 9.2.1 binaries on a Linux machine (both 
> servers, the old 8.2 and the new 9.2.1, running on Windows machines).

   Following up on my previous message: I just got two more crashes today, very 
similar to the first ones:

PANIC:  could not write to log file 118, segment 237 at offset 2686976, length 
5578752: Invalid argument
STATEMENT:  COMMIT
PANIC:  could not write to log file 118, segment 227 at offset 9764864, length 
57344: Invalid argument
STATEMENT:  COMMIT

For reference, the first ones were:

PANIC:  could not write to log file 117, segment 117 at offset 5660672, length 
4096000: Invalid argument
STATEMENT:  COMMIT
PANIC:  could not write to log file 118, segment 74 at offset 12189696, length 
475136: Invalid argument
STATEMENT:  COMMIT

I dug a bit into the source code and the error is raised in XLogWrite in xlog.c. 
MSDN states that an EINVAL return code from write means that the buffer is NULL, 
which seems pretty strange. Any help on what I can do to resolve this situation 
would be greatly appreciated.

One thing I forgot to mention in my first message: I'm running the 32-bit 
binaries on a 64-bit Windows 2003 machine, and no antivirus is running on it.

Regards,

Cyril VELTER





Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-11-22 Thread Alvaro Herrera
Heikki Linnakangas wrote:
> On 22.11.2012 19:18, Alvaro Herrera wrote:
> >Heikki Linnakangas wrote:
> >>On 21.11.2012 23:29, Alvaro Herrera wrote:
> >>>Alvaro Herrera wrote:
> The UnBlockSig stuff is the main stumbling block as I see it because it
> precludes compilation on Windows.  Maybe we should fix that by providing
> another function that the module is to call after initialization is done
> and before it gets ready to work ... but having a function that only
> calls PG_SETMASK() feels somewhat useless to me; and I don't see what
> else we could do in there.
> >>>
> >>>I cleaned up some more stuff and here's another version.  In particular
> >>>I added wrapper functions to block and unblock signals, so that this
> >>>doesn't need exported UnBlockSig.
> >>
> >>Could you just unblock the signals before calling into the
> >>background worker's main() function?
> >
> >Yes, but what if a daemon wants to block/unblock signals later?
> 
> Ok. Can you think of an example of a daemon that would like to do that?

Not really, but I don't know what crazy stuff people might come up
with.  Somebody was talking about using a worker to do parallel
computation (of queries?) using FPUs, or something along those lines.  I
don't have enough background on that sort of thing, but it wouldn't
surprise me if they wanted to block signals for a while when the
daemons are busy talking to the FPU, say.

> Grepping the backend for "BlockSig", the only thing it seems to be
> currenlty used for is to block nested signals in the SIGQUIT handler
> (see bg_quickdie() for an example). The patch provides a built-in
> SIGQUIT handler for the background workers, so I don't think you
> need BlockSig for that. Or do you envision that it would be OK for a
> background worker to replace the SIGQUIT handler with a custom one?

I wasn't really considering that the SIGQUIT handler would be replaced.
Not really certain that the SIGTERM handler needs to fiddle with the
sigmask ...

> Even if we provide the BackgroundWorkerBlock/UnblockSignals()
> functions, I think it would still make sense to unblock the signals
> before calling the bgworker's main loop. One less thing for the
> background worker to worry about that way. Or are there some
> operations that can't be done safely after unblocking the signals?

Yes, that's probably a good idea.  I don't see anything that would need
to run with signals blocked in the supplied sample code (but then they
are pretty simplistic).

> Also, I note that some worker processes call sigdelset(&BlockSig,
> SIGQUIT); that remains impossible to do in a background worker on
> Windows, the BackgroundWorkerBlock/UnblockSignals() wrapper
> functions don't help with that.

Hmm.  Not really sure about that.  Maybe we should keep SIGQUIT
unblocked at all times, so postmaster.c needs to remove it from BlockSig
before invoking the bgworker's main function.

The path of least resistance seems to be to export BlockSig and
UnBlockSig, but I hesitate to do it.

> Some documentation on what a worker is allowed to do would be helpful
> here..

I will see about it.

If the bgworker developer gets really tense about this stuff (or
anything at all, really), they can create a completely new sigmask and
do sigaddset() etc.  Since this is all C code, we cannot keep them from
doing anything, really; I think what we need to provide here is just a
framework to ease development of simple cases.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] the number of pending entries in GIN index with FASTUPDATE=on

2012-11-22 Thread Fujii Masao
On Tue, Nov 20, 2012 at 4:44 PM, Kyotaro HORIGUCHI
 wrote:
> Hello,
>
>> Agreed. Attached patch introduces pgstatginindex(), which reports the
>> GIN version number, the number of pages in the pending list, and the
>> number of tuples in the pending list, as information about a GIN index.
>
> It seems fine on the whole, and I have some suggestions.

Thanks for the review!

> 1. This patch applies to current git head cleanly, but installation
>   fails because pgstattuple--1.0--1.1.sql, which is added in the
>   Makefile, is missing.

Added pgstattuple--1.0--1.1.sql.

> 2. I feel somewhat uneasy with the length of the size expressions for
>   the palloc calls, and with BuildTupleFromCStrings being used instead
>   of heap_form_tuple. But fixing that would mean touching existing code
>   that does the same thing.
>
>   You can leave it as-is if you prefer to keep this patch small, but in
>   some respects it seems preferable to construct the result tuple
>   without going through C strings. (*1)

OK. I changed the code as you suggested.

Updated version of the patch attached.

>
> 3. pgstatginindex shows only version, pending_pages, and
>   pending_tuples. Why don't you show the other properties such as
>   entry pages, data pages, entries, and total pages, as
>   pgstatindex does?

I didn't expose those because they are accurate only as of the last VACUUM.
But if you think they are useful, I don't object to exposing them.

Regards,

-- 
Fujii Masao


pgstatginindex_v2.patch
Description: Binary data



[HACKERS] Review: Patch to compute Max LSN of Data Pages

2012-11-22 Thread Muhammad Usama
Hi Amit,

I have reviewed and tested the patch. Following are my observations and
comments.

Basic stuff
-
- Patch applies OK
- Compiles cleanly with no warnings
- All regression tests pass.


Observations and Comments
---

- If no option is given to the pg_computemaxlsn utility, we get a wrong
error message:
./pg_computemaxlsn  ../data
pg_computemaxlsn: file or directory "(null)" exists
In my opinion we should treat -P (print max LSN from the whole data
directory) as the default when no option is specified, or otherwise throw a
more descriptive error.

- Options -p and -P should be mutually exclusive

- Long options missing for -p and -P

- The help string for -P should be something like "print max LSN from whole
data directory" instead of "print max LSN from whole database"

- The help output says nothing about the -V / --version option

- I think when finding the max LSN of a single file, the utility should
consider all relation segments.

- There is no check for extra arguments, e.g.
 ./pg_computemaxlsn  -P . ../data/ ../data/base/ ../data2/

- For the -p {file | dir} option, the utility expects the file path relative
to the specified data directory path, which makes things a little confusing.
For example:
./pg_computemaxlsn -p data/base/12286/12200 data
pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
Although data/base/12286/12200 is a valid file path, we get a "does not
exist" error.

- The utility should print the max LSN in hexadecimal notation and in LSN
format (logid/segno) for consistency with PG, and may also print the file
name which contains that LSN.

- When finding the max LSN from a directory, the utility does not consider
subdirectories, which I think could be useful in some cases, e.g. if we
want to find the max LSN in a tablespace directory. With the current
implementation we would need to run the utility multiple times, once for
each database.


Code Review
-

Code generally looks good, I have few small points.

- In the main() function, the variable maxLSN is defined as uint64.
Although XLogRecPtr and uint64 are the same thing, I think we should
use XLogRecPtr instead.

- A small thing that will hardly improve anything, but for the sake of
saving a few cycles I think you should use XLByteLT ( < ) instead of
XLByteLE ( <= ) to check whether the previous max LSN is less than the
current one.

- A '\n' is missing from one of the printf calls in the code :-)
fprintf(stderr, _("%s: file or directory \"%s\" does not exists"),
progname, LsnSearchPath);

- While finding the max LSN from a single file, you are not verifying that
the file actually exists inside the provided data directory path, and so
can end up looking at the wrong data directory when checking whether the
server is running.
For example:
bin/pg_computemaxlsn -p $PWD/data/base/12286/12200
$PWD/otherinstallation/data
Maximum LSN found is: 0, WAL segment file name (fileid,seg):


- The code does not verify that the provided data directory is a valid
data directory. This could make pg_computemaxlsn compute the max LSN from
a running server's files.
For example:
bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
Maximum LSN found is: 0, WAL segment file name (fileid,seg):


Questions
-

- I am wondering why you are not following the link inside the "pg_tblspc"
directory to get to the tablespace location, instead of building the
directory path. If you followed the link instead, the utility could be
used across different versions of PG.



Regards,
Muhammad Usama


Re: [HACKERS] PQconninfo function for libpq

2012-11-22 Thread Fujii Masao
On Thu, Nov 22, 2012 at 10:05 PM, Boszormenyi Zoltan  wrote:
> On 2012-11-22 12:44, Magnus Hagander wrote:
>>> Also, a question was buried in the other review which is - are we OK
>>> to remove the requiressl parameter. Both these patches do so, because
>>> the code becomes much simpler if we can do that. It has been
>>> deprecated since 7.2. Is it OK to remove it, or do we need to put
>>> back
>>> in the more complex code to deal with both?
>
> Just going to highlight that we're looking for at least one third
> party to comment on this :)


 Yes, me too. A +1 for removing wouldn't count from me. ;-)

+1

> The second one is the product of what caught my attention while
> I was looking at pg_receivexlog. The current coding may write
> beyond the end of the allocated arrays and the password may
> overwrite a previously set keyword/value pair.

ISTM that such a problem doesn't happen at all, because argcount is
incremented as follows.

if (dbhost)
argcount++;
if (dbuser)
argcount++;
if (dbport)
argcount++;

Regards,

-- 
Fujii Masao




Re: [HACKERS] [WIP PATCH] for Performance Improvement in Buffer Management

2012-11-22 Thread Pavan Deolasee
On Thu, Nov 22, 2012 at 2:05 PM, Amit Kapila  wrote:

> ** **
>
>
> ** **
>
> >Sorry, I haven't followed this thread at all, but the numbers (43171 and
> 57920) in the last two runs of @mv-free-list for 32 clients look
> aberrations, no ?  I wonder if *>*that's skewing the average.
>
> ** **
>
> Yes, that is one of the main reasons, but in all runs this is consistent
> that for 32 clients or above this kind of numbers  are observed.
>
> Even Jeff has pointed the similar thing in one of his mails and suggested
> to run the tests such that first test should run “with patch” and then
> “without patch”. 
>
> After doing what he suggested the observations are still similar.
>
> **
>

Are we convinced that the jump we are seeing is a real one, then? I'm
a bit surprised because it happens only with the patch and only for 32
clients. How would you explain that?



> > I also looked at the Results.htm file down thread. There seems to be
> > a steep degradation when the shared buffers are increased from 5GB to
> > 10GB, both with and without the patch. Is that expected? If so, isn't
> > that worth investigating and possibly even fixing before we do
> > anything else?
>
> The reason for the decrease in performance is that when shared buffers
> are increased from 5GB to 10GB, I/O starts, because after the increase
> the OS buffers can no longer hold all the data.
>

Shouldn't that data be in the shared buffers if not the OS cache, and hence
approximately the same I/O would be required? Again, the drop in performance
is so severe that it seems worth investigating further, especially because
you can reproduce it reliably.

Thanks,
Pavan