Re: [HACKERS] pg_system_identifier()

2013-08-25 Thread Michael Paquier
On Mon, Aug 26, 2013 at 7:47 AM, Jim Nasby  wrote:
> On 8/23/13 11:23 AM, Greg Stark wrote:
>>
>> This doesn't generate a unique id. You could back up a standby and restore
>> it and point it at the original master and end up with two standbys with
>> the same id.
>
>
> If you want to enforce something unique throughout a cluster, I think we're
> stuck with having the nodes communicate IDs across the entire cluster.
> AFAIK that's how both Slony and londiste 3 do it.
The same applies to Postgres-XC for node identifiers. Users can adapt
the settings of their cluster to their own needs.

> I think it's also noteworthy that Slony and londiste both rely on the user
> specifying node identifiers. They don't try to be magic about it. I think
> there are two advantages there:
>
> - Code is simpler
> - Users can choose a naming scheme that makes sense for them
Definitely agreed on that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_system_identifier()

2013-08-25 Thread Jim Nasby

On 8/23/13 11:23 AM, Greg Stark wrote:

This doesn't generate a unique id. You could back up a standby and restore it 
and point it at the original master and end up with two standbys with the same 
id.


If you want to enforce something unique throughout a cluster, I think we're 
stuck with having the nodes communicate IDs across the entire cluster. AFAIK 
that's how both Slony and londiste 3 do it.

I think it's also noteworthy that Slony and londiste both rely on the user 
specifying node identifiers. They don't try to be magic about it. I think 
there are two advantages there:

- Code is simpler
- Users can choose a naming scheme that makes sense for them (see the sketch below)
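
For illustration, a user-specified naming scheme can be as simple as a small
registry table (a sketch only; the table and names here are hypothetical, not
how either tool actually stores them):

CREATE TABLE replication_node (
    node_name text PRIMARY KEY,   -- chosen by the admin, e.g. 'dc1-primary'
    conninfo  text NOT NULL       -- how the other nodes reach this one
);

INSERT INTO replication_node
VALUES ('dc1-primary',   'host=10.0.0.1 dbname=app'),
       ('dc2-reporting', 'host=10.0.1.7 dbname=app');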
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for fail-back without fresh backup

2013-08-25 Thread Sawada Masahiko
On Sat, Aug 24, 2013 at 11:38 PM, Peter Eisentraut  wrote:
> On Thu, 2013-07-11 at 23:42 +0900, Sawada Masahiko wrote:
>> please find the attached patch.
>
> Please fix these compiler warnings:
>
> xlog.c:3117:2: warning: implicit declaration of function ‘SyncRepWaitForLSN’ 
> [-Wimplicit-function-declaration]
> syncrep.c:414:6: warning: variable ‘numdataflush’ set but not used 
> [-Wunused-but-set-variable]
Thank you for the information!

We are improving the patch for Commit Fest 2 now.
We will fix the above compiler warnings as soon as possible and submit an updated patch.

-- 
Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hstore: Query speedups with Gin index

2013-08-25 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Aug 22, 2013 at 11:55 PM, Blake Smith  wrote:
>> The combined entry is used to support "contains (@>)" queries, and the key
>> only item is used to support "key contains (?)" queries. This change seems
>> to help especially with hstore keys that have high cardinalities. Downsides
>> of this change are that it requires an index rebuild, and that the index will be
>> larger in size.

> Index rebuild would be a problem only for minor releases,

That's completely false; people have expected major releases to be
on-disk-compatible for several years now.  While there probably will be
future releases in which we are willing to break storage compatibility,
a contrib module doesn't get to dictate that.

What might be a practical solution, especially if this isn't always a
win (which seems likely given the index-bloat risk), is to make hstore
offer two different GIN index opclasses, one that works the traditional
way and one that works this way.
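
For illustration, a sketch of what offering two opclasses could look like from
the user's side (gin_hstore_ops is hstore's existing default; the "combined"
opclass name below is hypothetical):

CREATE INDEX t_h_gin_trad ON t USING gin (h gin_hstore_ops);
    -- traditional items: "Ka", "V1234", "Kb", "Vtest"

CREATE INDEX t_h_gin_comb ON t USING gin (h gin_hstore_combined_ops);
    -- combined items: "Ka", "KaV1234", "Kb", "KbVtest"

-- either index must still serve both query shapes under discussion:
SELECT * FROM t WHERE h @> 'a=>1234'::hstore;   -- containment
SELECT * FROM t WHERE h ? 'a';                  -- key existence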

Another thing that needs to be taken into account here is Oleg and
Teodor's in-progress work on extending hstore:
https://www.pgcon.org/2013/schedule/events/518.en.html
I'm not sure if this patch would conflict with that at all, but it
needs to be considered.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hstore: Query speedups with Gin index

2013-08-25 Thread Michael Paquier
On Thu, Aug 22, 2013 at 11:55 PM, Blake Smith  wrote:
> We've been experiencing slow "@>" queries involving an hstore column that's
> covered by a Gin index. At the current postgresql git HEAD, the hstore <->
> gin interface produces the following text items to be indexed:
>
> hstore: "'a'=>'1234', 'b'=>'test'"
> Produces indexed text items: "Ka", "V1234", "Kb", "Vtest"
>
> For the size of our production table (10s of millions of rows), I observed
> significant query speedups by changing the index strategy to the following:
What is the order of magnitude of the speedup?

> hstore: "'a'=>'1234', 'b'=>'test'"
> Produces indexed text items: "Ka", "KaV1234", "Kb", "KbVtest"
I am not a GIN expert, but do you see the same speedup for tables with
fewer rows, or even a degradation in performance?

> The combined entry is used to support "contains (@>)" queries, and the key
> only item is used to support "key contains (?)" queries. This change seems
> to help especially with hstore keys that have high cardinalities. Downsides
> of this change are that it requires an index rebuild, and that the index will be
> larger in size.
An index rebuild would be a problem only for minor releases; this patch
would be applied only to the current master branch, for 9.4 and above.

> Patch attached. Any thoughts on this change?
Please add your patch to the next commit fest, which will begin in 3
weeks, so that you can get a more formal review.
https://commitfest.postgresql.org/action/commitfest_view?id=19

Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance problem in PLPgSQL

2013-08-25 Thread Pavel Stehule
2013/8/23 Tom Lane 

> Pavel Stehule  writes:
> > please, can you send a self-explanatory test case?
> > this issue should be fixed, and we need examples.
>
> We already had a perfectly good example at the beginning of this thread.
> What's missing is a decision on how we ought to approximate the cost of
> planning (relative to execution costs).
>
> As I mentioned upthread, it doesn't seem unreasonable to me to do
> something quick-and-dirty based on the length of the plan's rangetable.
> Pretty nearly anything would fix these specific situations where the
> estimated execution cost is negligible.  It's possible that there are
> more complicated cases where we'll need a more accurate estimate, but
> we've not seen an example of that yet.
>
> My previous suggestion was to estimate planning cost as
>    10 * (length(plan->rangetable) + 1)
> but on reflection it ought to be scaled by one of the cpu cost constants,
> so perhaps
>    1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
> which'd mean a custom plan has to be estimated to save a minimum of
> about 5 cost units (more if more than 1 table is used) before it'll
> be chosen.  I'm tempted to make the multiplier be 1 not 1000,
> but it seems better to be conservative about changing the behavior
> until we see how well this works in practice.
>
> Objections, better ideas?
>

I think this problem is a little more complex, and using only
this formula is too much of a simplification - although it is a big step forward.

* The first 5 executions use a custom plan - which means a lot of procedures use
a custom plan forever (when they are executed without pooling, because almost all
functions containing SQL are not called twice in one connection) - and there are
really only a few reports related to prepared-statement or PL/pgSQL
performance - so this suggests that planning in PostgreSQL is a relatively
fast process, and we probably shouldn't be afraid of using custom
plans more widely. Custom plans also have a nice secondary effect - they solve problems
with predicates of the form: field = some_constant OR field IS NULL, without any
special support in the planner. But then sometimes the 6th execution can be slow, because
a generic plan is used.
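
For illustration, a sketch of the behavior described above (table and data are
hypothetical):

PREPARE q(int) AS SELECT * FROM t WHERE col = $1 OR col IS NULL;

EXECUTE q(42);   -- executions 1..5 get a custom plan, built for the actual value of $1
EXECUTE q(42);
EXECUTE q(42);
EXECUTE q(42);
EXECUTE q(42);
EXECUTE q(42);   -- from the 6th execution on, a generic plan may be chosen
                 -- if its estimated cost looks no worse than the custom plans'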

Where can we expect performance problems?

* frequently executed, fast, simple statements:

** INSERT INTO table, { UPDATE | DELETE | SELECT } ... WHERE PK = const - these
queries can use a generic plan directly

* queries that are expensive to plan but fast to execute - these can use a
generic plan too, with some logic like you describe.

In the other cases, using a custom plan probably doesn't cause a performance
issue, so we can use it directly.

From what I see, prepared statements are nowadays used more for
protection against SQL injection than for saving planner time - and a badly
chosen generic plan is worse than repeated planning.

P.S. Could the magic constant 5 (the number of custom-plan executions) be
controlled via a GUC? Then we would have very good control for special cases
where the default mechanism fails (0 .. always use a generic plan, -1 .. never
use a generic plan).

Regards

Pavel






> regards, tom lane
>


Re: [HACKERS] Unpacking scalar JSON values

2013-08-25 Thread Josh Berkus

> 
> This came up recently on -bugs. See
> ,
> where I suggested we expose the dequoting function.

+1.

As it is, I use btrim() heavily with JSON data.
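
For example (a sketch of the sort of workaround I mean - it strips the quotes
but does not undo JSON escaping, which is exactly why an exposed dequoting
function would be better):

select btrim(('"va\"lue"'::json)::text, '"');
    -- gives: va\"lue   (quotes gone, backslash escape still there)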


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: lob conversion functionality

2013-08-25 Thread Pavel Stehule
2013/8/25 Pavel Stehule 

> Hello
>
> here is a patch
>
> it introduces load_lo and make_lo functions
>
> postgres=# select make_lo(decode('ff00','hex'));
>  make_lo
> ─
>24629
> (1 row)
>
> Time: 40.724 ms
> postgres=# select load_lo(24628);
>   load_lo
> 
>  \xff00
> (1 row)
>
> postgres=# \lo_import ~/avatar.png
> lo_import 24630
>
> postgres=# select md5(load_lo(24630));
>md5
> ──
>  513f60836f3b625713acaf1c19b6ea78
> (1 row)
>
> postgres=# \q
> bash-4.1$ md5sum ~/avatar.png
> 513f60836f3b625713acaf1c19b6ea78  /home/pavel/avatar.png
>
>
My motivation and reason for implementing this patch was the somewhat
painful conversion of a blob to a base64 string.

I have a communication system based on XML documents. These documents are
created in PG with the XML functions. There was a customer request to support
binary (image) attachments. The LO API is really very good for transporting
binary data from the client side to the server side, but the subsequent
processing was less nice - the LO -> bytea transformation is not well
documented, and it is necessary to use magic integer constants. With these two
functions the transformation becomes very simple.

just

select xmlforest(encode(load_lo(24630), 'base64') AS attachment);
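
For comparison, a sketch of what the same read looks like today through the
low-level server-side LO functions (262144 is 0x40000, INV_READ - one of the
magic constants mentioned above; the length argument just has to be large
enough for the object):

begin;
select encode(loread(lo_open(24630, 262144), 10000000), 'base64');
commit;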

Regards

Pavel


> Regards
>
> Pavel Stehule
>
>
>
> 2013/8/22 Jov 
>
>> +1
>> We badly need the large object <-> bytea conversion functions.
>>
>> Once I had to use the ugly pg_read_file() to put some text into PG; I tried
>> to use a large object but found it is useless without a function to convert
>> a large object to bytea.
>>
>> Jov
>> blog: http://amutu.com/blog
>>
>>
>> 2013/8/10 Pavel Stehule 
>>
>>>  Hello
>>>
>>> I had to enhance my older project, where XML documents are parsed and
>>> created on the server side - in PL/pgSQL and PL/Perl procedures. We would like to
>>> use the LO API for client-server communication, but we have to
>>> parse/serialize LOs on the server side.
>>>
>>> I found that there is no simple API for working with LOs from PL without
>>> access to the file system. I had to use ugly hacks:
>>>
>>> CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
>>> RETURNS oid AS $$
>>> DECLARE
>>>   _loid oid;
>>>   _substr bytea;
>>> BEGIN
>>>   _loid := lo_creat(-1);
>>>   FOR i IN 0..length($1)/2048
>>>   LOOP
>>> _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
>>> IF _substr <> '' THEN
>>>   INSERT INTO pg_largeobject(loid, pageno, data)
>>> VALUES(_loid, i, _substr);
>>> END IF;
>>>   END LOOP;
>>>
>>>   EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
>>>   RETURN _loid;
>>> END;
>>> $$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path =
>>> 'pg_catalog';
>>>
>>> and
>>>
>>> CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
>>> RETURNS xml AS $$
>>> DECLARE
>>>   b_cum bytea = '';
>>>   b bytea;
>>> BEGIN
>>>   FOR b IN SELECT l.data
>>>   FROM pg_largeobject l
>>>  WHERE l.loid = attachment_to_xml.attachment
>>>  ORDER BY l.pageno
>>>   LOOP
>>> b_cum := b_cum || b;
>>>   END LOOP;
>>>   IF NOT FOUND THEN
>>> RETURN NULL;
>>>   ELSE
>>> RETURN xmlelement(NAME "attachment",
>>>encode(b_cum, 'base64'));
>>>   END IF;
>>> END;
>>> $$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path =
>>> 'pg_catalog';
>>>
>>> These functions could be simplified if we supported some functions like
>>> encode/decode for LOs.
>>>
>>> So my proposal is creating functions:
>>>
>>> * lo_encode(loid oid) .. returns bytea
>>> * lo_encode(loid oid, encoding text) .. returns text
>>> * lo_make(loid oid, data bytea)
>>> * lo_make(loid oid, data text, encoding text)
>>>
>>> This would simplify all transformations between LO and varlena. The known
>>> limit is 1 GB for varlena, but that is still relatively high.
>>>
>>> Notes, comments?
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>> --
>>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>
>>>
>>
>


Re: [HACKERS] proposal: lob conversion functionality

2013-08-25 Thread Pavel Stehule
Hello

here is a patch

it introduces load_lo and make_lo functions

postgres=# select make_lo(decode('ff00','hex'));
 make_lo
─
   24629
(1 row)

Time: 40.724 ms
postgres=# select load_lo(24628);
  load_lo

 \xff00
(1 row)

postgres=# \lo_import ~/avatar.png
lo_import 24630

postgres=# select md5(load_lo(24630));
   md5
──
 513f60836f3b625713acaf1c19b6ea78
(1 row)

postgres=# \q
bash-4.1$ md5sum ~/avatar.png
513f60836f3b625713acaf1c19b6ea78  /home/pavel/avatar.png

Regards

Pavel Stehule



2013/8/22 Jov 

> +1
> We badly need the large object <-> bytea conversion functions.
>
> Once I had to use the ugly pg_read_file() to put some text into PG; I tried
> to use a large object but found it is useless without a function to convert
> a large object to bytea.
>
> Jov
> blog: http://amutu.com/blog
>
>
> 2013/8/10 Pavel Stehule 
>
>>  Hello
>>
>> I had to enhance my older project, where XML documents are parsed and
>> created on the server side - in PL/pgSQL and PL/Perl procedures. We would like to
>> use the LO API for client-server communication, but we have to
>> parse/serialize LOs on the server side.
>>
>> I found that there is no simple API for working with LOs from PL without
>> access to the file system. I had to use ugly hacks:
>>
>> CREATE OR REPLACE FUNCTION parser.save_as_lob(bytea)
>> RETURNS oid AS $$
>> DECLARE
>>   _loid oid;
>>   _substr bytea;
>> BEGIN
>>   _loid := lo_creat(-1);
>>   FOR i IN 0..length($1)/2048
>>   LOOP
>> _substr := substring($1 FROM i * 2048 + 1 FOR 2048);
>> IF _substr <> '' THEN
>>   INSERT INTO pg_largeobject(loid, pageno, data)
>> VALUES(_loid, i, _substr);
>> END IF;
>>   END LOOP;
>>
>>   EXECUTE format('GRANT SELECT ON LARGE OBJECT %s TO ohs', _loid);
>>   RETURN _loid;
>> END;
>> $$ LANGUAGE plpgsql SECURITY DEFINER STRICT SET search_path =
>> 'pg_catalog';
>>
>> and
>>
>> CREATE OR REPLACE FUNCTION fbuilder.attachment_to_xml(attachment oid)
>> RETURNS xml AS $$
>> DECLARE
>>   b_cum bytea = '';
>>   b bytea;
>> BEGIN
>>   FOR b IN SELECT l.data
>>   FROM pg_largeobject l
>>  WHERE l.loid = attachment_to_xml.attachment
>>  ORDER BY l.pageno
>>   LOOP
>> b_cum := b_cum || b;
>>   END LOOP;
>>   IF NOT FOUND THEN
>> RETURN NULL;
>>   ELSE
>> RETURN xmlelement(NAME "attachment",
>>encode(b_cum, 'base64'));
>>   END IF;
>> END;
>> $$ LANGUAGE plpgsql STRICT SECURITY DEFINER SET search_path =
>> 'pg_catalog';
>>
>> These functions could be simplified if we supported some functions like
>> encode/decode for LOs.
>>
>> So my proposal is creating functions:
>>
>> * lo_encode(loid oid) .. returns bytea
>> * lo_encode(loid oid, encoding text) .. returns text
>> * lo_make(loid oid, data bytea)
>> * lo_make(loid oid, data text, encoding text)
>>
>> This would simplify all transformations between LO and varlena. The known
>> limit is 1 GB for varlena, but that is still relatively high.
>>
>> Notes, comments?
>>
>> Regards
>>
>> Pavel
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>>
>


load_lo.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unpacking scalar JSON values

2013-08-25 Thread Andrew Dunstan


On 08/24/2013 09:08 PM, Daniel Farina wrote:

On Sat, Aug 24, 2013 at 6:04 PM, Daniel Farina  wrote:

But there's no good way I can find from the documentation to do it
with a scalar: select ('"va\"lue"'::json)::text;

Triggered send by accident:

select ('"va\"lue"'::json)::text;
text
---
  "va\"lue"
(1 row)

the JSON escaping is retained.  That may be reasonable for a
text cast, so I'm not suggesting its reinterpretation, but there is no
operator I can identify immediately from the documentation that converts
a JSON string value into a Postgres one, the way json_each_text does,
when all you have is a json that contains a scalar JSON string.





This came up recently on -bugs. See 
, 
where I suggested we expose the dequoting function.
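
To illustrate the gap with a sketch reusing the example above: json_each_text
already dequotes string values that live inside an object, but there is no
equivalent for a bare scalar:

select value from json_each_text('{"k": "va\"lue"}');   -- returns: va"lue
select ('"va\"lue"'::json)::text;                        -- returns: "va\"lue"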


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance problem in PLPgSQL

2013-08-25 Thread Marc Cousin

On 24/08/2013 21:16, Tom Lane wrote:

Marc Cousin  writes:

On 23/08/2013 23:55, Tom Lane wrote:

My previous suggestion was to estimate planning cost as
10 * (length(plan->rangetable) + 1)
but on reflection it ought to be scaled by one of the cpu cost constants,
so perhaps
1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
which'd mean a custom plan has to be estimated to save a minimum of
about 5 cost units (more if more than 1 table is used) before it'll
be chosen.  I'm tempted to make the multiplier be 1 not 1000,
but it seems better to be conservative about changing the behavior
until we see how well this works in practice.

Objections, better ideas?

No better idea as far as I'm concerned, of course :)
But it is a bit tricky to understand what is going on when you get
hit by it, and using a rough approximation of the planning cost
seems the most logical approach to me. So I'm all for this solution.

I've pushed a patch along this line.  I verified it fixes your original
example, but maybe you could try it on your real application?
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=005f583ba4e6d4d19b62959ef8e70a3da4d188a5

regards, tom lane

I think that won't be possible :(

It's one of those environments where you have to ask for lots of permissions 
before doing anything. I'll do my best to have them run a test with this 
patch.


Thanks a lot.

Marc


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers