EXPLAIN BUFFERS and I/O timing accounting questions

2019-10-21 Thread Maciek Sakrejda
Hello,

I ran across an EXPLAIN plan and had some questions about some of its
details. The BUFFERS docs say

>The number of blocks shown for an upper-level node includes those used by
all its child nodes.

I initially assumed this would be cumulative, but I realized it's probably
not because some of the blocks affected by each child will actually
overlap. But this particular plan has a Shared Hit Blocks at the root (an
Aggregate) that is smaller than some of its children (three ModifyTables
and a CTE Scan). This seems to contradict the documentation (even if the
children's buffer usage overlaps completely, the parent should still show a
count at least equal to that of the costliest child)--any idea what's up? I
can send the whole plan (attached? inline? it's ~15kB) if that helps.

I also noticed the I/O Read Time (from track_io_timing) of two children in
this plan is equal to the I/O Read Time in the root. Is I/O time
potentially fully parallelized across children? There are no parallel
workers according to the plan, so I'm surprised at this and would like to
understand better.
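
For reference, I'm generating these plans with roughly this recipe (a sketch
against a hypothetical table, not my actual query):

    SET track_io_timing = on;

    EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
    WITH upd AS (
        -- a data-modifying CTE, so the plan gets a ModifyTable child
        UPDATE some_table SET counter = counter + 1 RETURNING id
    )
    SELECT count(*) FROM upd;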

Also, a tangential question: why is the top-level structure of a JSON plan
an array? I've only ever seen one root node with a Plan key there.

Thanks,
Maciek


Re: A question about sequences and backup/restore cycles

2019-10-21 Thread Adrian Klaver

On 10/21/19 5:13 PM, stan wrote:


I typically design a system with primary keys defined, like this:


CREATE SEQUENCE employee_key_serial;

CREATE TABLE employee (
    employee_key   integer DEFAULT nextval('employee_key_serial')
        PRIMARY KEY,
    -- ... other columns ...
);

I use scripts to build the database structures and load the data. I am
careful to get the dependencies in the correct order, so that the keys that
later structures depend on already exist.

Today I was going over the design for the current project with a friend,
whose expertise I respect. He said that he had issues on a system designed
by an OEM that had dependencies on keys generated from sequences, after a
backup/restore cycle.

Will I potentially have these issues? If so, what can I do differently to
avoid them?


It is not clear to me what you are doing:

1) Are you using pg_dump/pg_restore to populate a database?
If so it will take care of the dependencies.

2) Are you using a home-built method to populate the database?
In that case you take responsibility for the dependencies.
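
To illustrate 1), a plain-format dump recreates each sequence and then
restores its current value, so the restored database picks up numbering
where the old one left off. Roughly (using your example names; 42 is just a
stand-in for whatever the sequence's last value was):

    CREATE SEQUENCE public.employee_key_serial
        START WITH 1
        INCREMENT BY 1
        NO MINVALUE
        NO MAXVALUE
        CACHE 1;

    -- ... table definitions and COPY data ...

    SELECT pg_catalog.setval('public.employee_key_serial', 42, true);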







--
Adrian Klaver
adrian.kla...@aklaver.com




A question about sequences and backup/restore cycles

2019-10-21 Thread stan


I typically design a system with primary keys defined, like this:


CREATE SEQUENCE employee_key_serial;

CREATE TABLE employee (
    employee_key   integer DEFAULT nextval('employee_key_serial')
        PRIMARY KEY,
    -- ... other columns ...
);

I use scripts to build the database structures and load the data. I am
careful to get the dependencies in the correct order, so that the keys that
later structures depend on already exist.

Today I was going over the design for the current project with a friend,
whose expertise I respect. He said that he had issues on a system designed
by an OEM that had dependencies on keys generated from sequences, after a
backup/restore cycle.

Will I potentially have these issues? If so, what can I do differently to
avoid them?


-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




existing dblinks

2019-10-21 Thread Julie Nishimura
Hello, is there any way to find out whether any dblinks exist on a 9.6
PostgreSQL server?
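
The best I have come up with so far is this sketch (assuming dblink is
installed as an extension; I am not sure it covers every case):

    -- is the dblink extension installed in this database?
    SELECT extname, extversion FROM pg_extension WHERE extname = 'dblink';

    -- any named dblink connections currently open in this session?
    SELECT dblink_get_connections();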

Thanks


Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread raf
Steven Pousty wrote:

> On Sun, Oct 20, 2019 at 4:31 PM raf  wrote:
> 
> > Steven Pousty wrote:
> >
> > > I would think though that raising an exception is better than a
> > > default behavior which deletes data.
> >
> > I can't help but feel the need to make the point that
> > the function is not deleting anything. It is just
> > returning null. The deletion of data is being performed
> > by an update statement that uses the function's return
> > value to set a column value.
> >
> > I don't agree that raising an exception in the function
> > is a good idea (perhaps unless it's valid to assume
> > that this function will only ever be used in such a
> > context). Making the column not null (as already
> > suggested) and having the update statement itself raise
> > the exception seems more appropriate if an exception is
> > desirable. But that presumes an accurate understanding
> > of the behaviour of jsonb_set.
> >
> > Really, I think the best fix would be in the
> > documentation so that everyone who finds the function
> > in the documentation understands its behaviour
> > immediately.
> >
> Hey Raf
> 
> In a perfect world I would agree with you. But often users do not read ALL
> the documentation before they use the function in their code OR they are
> not sure that the condition applies to them (until it does).

I'm well aware of that, hence the statement that this
information needs to appear at the place in the
documentation where the user is first going to
encounter the function (i.e. in the table where its
examples are). Even putting it in a note box further
down the page might not be enough (but hopefully it
will be).

cheers,
raf





Re: Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Adrian Klaver

On 10/21/19 1:30 PM, Alexander Farber wrote:

Apologies, I should have shown the JSON structure in my very first email -

On Mon, Oct 21, 2019 at 4:45 PM Thomas Kellerer > wrote:


Use ->> to return the value as text (not as JSONB) and you need to
use the column alias, not the table alias:

     (t.tile ->> 'col')::int


It is a JSON array of JSON objects with properties col, row, value 
(integers) and letter (text):


words_ru=> SELECT * FROM words_moves LIMIT 5;

[sample output trimmed; see the full 5-row sample in Alexander's message below]

This stored function -

CREATE OR REPLACE FUNCTION words_get_move(
     in_mid integer
     ) RETURNS TABLE (
     out_bid    integer,
     out_mid    bigint,
     out_hand   text,
     out_col    integer,
     out_row    integer,
     out_letter text,
     out_value  integer
     ) AS
$func$
     SELECT
     g.bid,
     m.mid,
     m.hand,
     (t.tile->'col')::int AS col,
     (t.tile->'row')::int AS row,
     (t.tile->'letter')::text AS letter,
     (t.tile->'value')::int   AS value
     FROM words_moves m
     CROSS JOIN JSONB_ARRAY_ELEMENTS(m.tiles) AS t(tile)
     LEFT JOIN words_games g USING(gid)
     WHERE m.action = 'play' AND
     m.gid = (SELECT gid FROM words_moves WHERE mid = in_mid)
     AND m.played <= (SELECT played FROM words_moves WHERE mid = in_mid)
     ORDER BY m.played DESC;
$func$ LANGUAGE sql;

gives me the same error (why does it think it is jsonb and not integer?)


As Thomas pointed out, there is a difference between -> and ->>:

test_(postgres)# select pg_typeof('[{"one": 1, "two": 2}]'::jsonb -> 0 
-> 'one'), '[{"one": 1, "two": 2}]'::jsonb -> 0 -> 'one';

 pg_typeof | ?column?
-----------+----------
 jsonb     | 1
(1 row)

Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Adrian Klaver

On 10/21/19 12:50 PM, Tomas Vondra wrote:

On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote:

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me as I thought there are exception 
blocks in stored functions and now sub-transactions in stored procedures.




Sorry for the confusion - I've not been particularly careful when
writing that response.

Let me illustrate the issue with this example:

    CREATE TABLE t (a int);

    CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
    DECLARE
      msg TEXT;
    BEGIN
      -- SAVEPOINT s1;
      INSERT INTO t VALUES (1);
      -- COMMIT;
    EXCEPTION
      WHEN others THEN
        msg := SUBSTR(SQLERRM, 1, 100);
        RAISE NOTICE 'error: %', msg;
    END; $$;

    CALL test();

If you uncomment the SAVEPOINT, you get

    NOTICE:  error: unsupported transaction command in PL/pgSQL

because savepoints are not allowed in stored procedures. Fine.

If you uncomment the COMMIT, you get

    NOTICE:  error: cannot commit while a subtransaction is active

which happens because the EXCEPTION block creates a subtransaction, and
we can't commit when it's active.

But we can commit outside the exception block:

    CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
    DECLARE
      msg TEXT;
    BEGIN
      BEGIN
        INSERT INTO t VALUES (1);
      EXCEPTION
        WHEN others THEN
          msg := SUBSTR(SQLERRM, 1, 100);
          RAISE NOTICE 'error: %', msg;
      END;
      COMMIT;
    END; $$;


You can do something like the below though:

CREATE TABLE t (a int PRIMARY KEY);

CREATE OR REPLACE PROCEDURE public.test()
  LANGUAGE plpgsql
AS $procedure$
DECLARE
  msg TEXT;
BEGIN
  BEGIN
    INSERT INTO t VALUES (1);
  EXCEPTION
    WHEN others THEN
      msg := SUBSTR(SQLERRM, 1, 100);
      RAISE NOTICE 'error: %', msg;
      UPDATE t SET a = 2;
  END;
  COMMIT;
END; $procedure$

test_(postgres)# CALL test();
CALL
test_(postgres)# select * from t;
 a
---
 1
(1 row)

test_(postgres)# CALL test();
NOTICE:  error: duplicate key value violates unique constraint "t_pkey"
CALL
test_(postgres)# select * from t;
 a
---
 2
(1 row)





regards




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Alexander Farber
Apologies, I should have shown the JSON structure in my very first email -

On Mon, Oct 21, 2019 at 4:45 PM Thomas Kellerer  wrote:

> Use ->> to return the value as text (not as JSONB) and you need to use the
> column alias, not the table alias:
>
> (t.tile ->> 'col')::int
>
>
It is a JSON array of JSON objects with properties col, row, value
(integers) and letter (text):

words_ru=> SELECT * FROM words_moves LIMIT 5;

  mid   | action |  gid  | uid  |            played             | tiles | score | letters |  hand   | puzzle
--------+--------+-------+------+-------------------------------+-------+-------+---------+---------+--------
 385903 | play   | 17042 | 5476 | 2018-06-20 04:46:13.864758+02 | [{"col": 7, "row": 6, "value": 1, "letter": "А"}, {"col": 7, "row": 5, "value": 2, "letter": "Р"}, {"col": 7, "row": 11, "value": 2, "letter": "В"}, {"col": 7, "row": 10, "value": 1, "letter": "А"}, {"col": 7, "row": 9, "value": 2, "letter": "Л"}, {"col": 7, "row": 8, "value": 2, "letter": "П"}, {"col": 7, "row": 7, "value": 2, "letter": "С"}] | 29 | АРВАЛПС | ВРЛПААС | f
 391416 | play   | 17055 | 5476 | 2018-06-21 00:36:36.690012+02 | [{"col": 4, "row": 11, "value": 1, "letter": "А"}, {"col": 4, "row": 10, "value": 2, "letter": "К"}, {"col": 4, "row": 9, "value": 0, "letter": "Л"}, {"col": 4, "row": 8, "value": 1, "letter": "Е"}, {"col": 4, "row": 7, "value": 2, "letter": "Д"}, {"col": 4, "row": 5, "value": 2, "letter": "Р"}, {"col": 4, "row": 4, "value": 2, "letter": "П"}] | 34 | АКЛЕДРП | РКП*АДЕ | f
 394056 | play   | 17264 | 7873 | 2018-06-21 13:39:27.026943+02 | [{"col": 9, "row": 7, "value": 0, "letter": "Р"}, {"col": 8, "row": 7, "value": 0, "letter": "Е"}, {"col": 7, "row": 7, "value": 1, "letter": "Н"}, {"col": 6, "row": 7, "value": 1, "letter": "О"}, {"col": 5, "row": 7, "value": 2, "letter": "Р"}, {"col": 4, "row": 7, "value": 1, "letter": "О"}, {"col": 3, "row": 7, "value": 2, "letter": "К"}] | 24 | РЕНОРОК | ОК**ОНР | f
    131 | play   |   206 |  404 | 2018-02-20 09:26:05.234006+01 | [{"col": 9, "row": 7, "value": 5, "letter": "Ь"}, {"col": 8, "row": 7, "value": 2, "letter": "Д"}, {"col": 7, "row": 7, "value": 1, "letter": "Е"}, {"col": 6, "row": 7, "value": 2, "letter": "С"}, {"col": 5, "row": 7, "value": 1, "letter": "О"}, {"col": 4, "row": 7, "value": 2, "letter": "Р"}, {"col": 3, "row": 7, "value": 2, "letter": "П"}] | 32 | ЬДЕСОРП |         | f
  15676 | play   |  2785 | 2997 | 2018-04-18 16:56:58.368445+02 | [{"col": 12, "row": 7, "value": 5, "letter": "Ь"}, {"col": 11, "row": 7, "value": 1, "letter": "Н"}, {"col": 10, "row": 7, "value": 1, "letter": "Е"}, {"col": 8, "row": 7, "value": 0, "letter": "Г"}, {"col": 9, "row": 7, "value": 2, "letter": "Р"}, {"col": 7, "row": 7, "value": 1, "letter": "И"}, {"col": 6, "row": 7, "value": 2, "letter": "М"}] | 28 | МИЬРНГЕ |         | f
(5 rows)

This stored function -

CREATE OR REPLACE FUNCTION words_get_move(
in_mid integer
) RETURNS TABLE (
out_bid    integer,
out_mid    bigint,
out_hand   text,
out_col    integer,
out_row    integer,
out_letter text,
out_value  integer
) AS
$func$
SELECT
g.bid,
m.mid,
m.hand,
(t.tile->'col')::int AS col,
(t.tile->'row')::int AS row,
(t.tile->'letter')::text AS letter,
(t.tile->'value')::int   AS value
FROM words_moves m
CROSS JOIN JSONB_ARRAY_ELEMENTS(m.tiles) AS t(tile)
LEFT JOIN words_games g USING(gid)
WHERE m.action = 'play' AND
m.gid = (SELECT gid FROM words_moves WHERE mid = in_mid)
AND m.played <= (SELECT played FROM words_moves WHERE mid = in_mid)
ORDER BY m.played DESC;
$func$ LANGUAGE sql;

gives me the same error (why does it think it is jsonb and not integer?)

words_ru=> \i src/slova/dict/words_get_move.sql
psql:src/slova/dict/words_get_move.sql:28: ERROR:  cannot cast type jsonb
to integer
LINE 17: (t.tile->'col')::int AS col,
^

And I would prefer not to use ->> because I want col, row, value as
integers and not text.

Regards
Alex

P.S. Below is the table definition:

words_ru=> \d words_moves
  Table "public.words_moves"
 Column  |   Type   | Collation | Nullable
| Default
-+--+---+--+--
 mid | bigint   |   | not null |

Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Tomas Vondra

On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote:

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me as I thought there are exception 
blocks in stored functions and now sub-transactions in stored 
procedures.




Sorry for the confusion - I've not been particularly careful when
writing that response.

Let me illustrate the issue with this example:

    CREATE TABLE t (a int);

    CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
    DECLARE
      msg TEXT;
    BEGIN
      -- SAVEPOINT s1;
      INSERT INTO t VALUES (1);
      -- COMMIT;
    EXCEPTION
      WHEN others THEN
        msg := SUBSTR(SQLERRM, 1, 100);
        RAISE NOTICE 'error: %', msg;
    END; $$;

   CALL test();

If you uncomment the SAVEPOINT, you get

   NOTICE:  error: unsupported transaction command in PL/pgSQL

because savepoints are not allowed in stored procedures. Fine.

If you uncomment the COMMIT, you get

   NOTICE:  error: cannot commit while a subtransaction is active

which happens because the EXCEPTION block creates a subtransaction, and
we can't commit when it's active.

But we can commit outside the exception block:

    CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
    DECLARE
      msg TEXT;
    BEGIN
      BEGIN
        INSERT INTO t VALUES (1);
      EXCEPTION
        WHEN others THEN
          msg := SUBSTR(SQLERRM, 1, 100);
          RAISE NOTICE 'error: %', msg;
      END;
      COMMIT;
    END; $$;


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Steve Atkins


On 21/10/2019 17:39, Steven Pousty wrote:
 Turning a JSON null into a SQL null and thereby "deleting" the data
is not the path of least surprise.


In what situation does that happen? (If it's already been mentioned I 
missed it, long thread, sorry).


Cheers,
  Steve




Re: CPU SPIKE

2019-10-21 Thread Francisco Olarte
You'll have to provide more data:

On Mon, Oct 21, 2019 at 7:14 AM Pawan Sharma  wrote:
> Having a real high CPU issue (95-98%) with SELECT statements, and the select
> queries contain multiple AND operators; will it cause any CPU spike?

If you run a query, the only reason it is not hitting 100% cpu is that it is
waiting for something, like disk or network or locked data. So with all data
cached and fast networks ( or small results ) a standalone query will use
100% CPU during its run ( it must; if it is not doing that, something is
rotten in the scheduler ).

If some set of queries is using 100% cpu for, say, 5 minutes while working
on cached data, that is normal.

A proper question would be "x queries of type y are eating 95% CPU on my
machine Z, with this type of data; I think it is not correct because
blah-blah". A spike during query execution is the correct outcome. CPU time
is not like petrol; if you do not use it, it is lost, and care is taken,
with things like parallel jobs, to ensure queries use as much CPU time as
possible ( because typically a query needs, say, 3 core-seconds: if you use
10% of an octacore that is 3.75 wall secs, if you use 100% it is 0.375 ).

Also note that, locks and big result transmissions aside, a fully cached DB
is cpu-limited ( and would be for selects with complex conditions ). I would
expect 100% usage if enough clients ( for 24 cpus ) are doing complex
queries against a cached database.

Your problem may be "it is using 100% of 24 cpus for a minute" where you
think "it should be just a sec", but the 100% figure is the better one: you
do not want your cpu to sit idle.
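
If you want to see which queries are actually eating the time, a sketch
like this helps ( assuming the pg_stat_statements extension is installed;
column names as of PG11 ):

    -- top time-consuming statements since the stats were last reset
    SELECT query, calls, total_time, rows
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 5;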

> apps team is using sub-partition, PG11, CPU:24, Mem: 16GB
...
>  effective_cache_size
> --
>  22GB
...
>  max_worker_processes
> --
>  8

I may be misled, but isn't 16GB a little spartan for 24 CPUs with 8
workers ( per query? ). Also, I assume the 22GB is because you are
accounting for a huge host cache.

Francisco Olarte.




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Steven Pousty
On Sun, Oct 20, 2019 at 4:31 PM raf  wrote:

> Steven Pousty wrote:
>
> > I would think though that raising an exception is better than a
> > default behavior which deletes data.
>
> I can't help but feel the need to make the point that
> the function is not deleting anything. It is just
> returning null. The deletion of data is being performed
> by an update statement that uses the function's return
> value to set a column value.
>
> I don't agree that raising an exception in the function
> is a good idea (perhaps unless it's valid to assume
> that this function will only ever be used in such a
> context). Making the column not null (as already
> suggested) and having the update statement itself raise
> the exception seems more appropriate if an exception is
> desirable. But that presumes an accurate understanding
> of the behaviour of jsonb_set.
>
> Really, I think the best fix would be in the
> documentation so that everyone who finds the function
> in the documentation understands its behaviour
> immediately.
>
>
>
Hey Raf

In a perfect world I would agree with you. But often users do not read ALL
the documentation before they use the function in their code OR they are
not sure that the condition applies to them (until it does). Turning a
JSON null into a SQL null and thereby "deleting" the data is not the path
of least surprise.

So while we could say reading the documentation is the proper path, it is
not the most helpful path. I am not arguing against documenting the
behavior, whatever we decide on. What I am saying is that an exception is
better than the current situation if we can't agree on any other solution.
An exception is better than just documentation, but probably not the best
solution (and it seems like most other people have said so as well, but the
lag on the mailing list is getting us overlapping).

I see people saying null pointer exceptions are not helpful. I mostly
agree: they are not the most helpful kind of exception, BUT they are better
than some alternatives. So I think it would be better to say NPEs are not
as helpful as they possibly could be.


Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread David G. Johnston
On Sun, Oct 20, 2019 at 3:51 PM Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

> I'm not arguing against the idea of improving the situation. But I am
> arguing against a minimal fix that will not provide much of value to a
> careful app developer. i.e. I want to do more to support app devs.
> Ideally they would not need to use wrapper functions. There will be
> plenty of situations where it is mighty inconvenient to catch an
> exception thrown by jsonb_set(). And catching exceptions can be
> expensive. You want to avoid that if possible in your
> performance-critical plpgsql code.
>

As there is pretty much nothing that can be done at runtime if this
exception is raised, actually "catching" it anywhere deeper than near the
top of the application code is largely pointless. It's more like a
NullPointerException in Java - if the application raises it, there should be
a last line of defense error handler that basically says "a developer made
a mistake somewhere and needs to fix it - tell them this happened".

Performance-critical subsections (and pretty much the whole) of the
application can just raise the error to the caller using the normal
mechanisms for "SQLException" propagation.

David J.


Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Horacio Miranda
Hi

> On 22/10/2019, at 4:14 AM, Adrian Klaver  wrote:
> 
> On 10/21/19 8:10 AM, Avinash Kumar wrote:
>> Hi,
>> On Mon, Oct 21, 2019 at 8:16 PM Alan Hodgson > > wrote:
>>On Mon, 2019-10-21 at 16:40 +0530, Avinash Kumar wrote:
>>> 
>>>We need to ensure that we have safe backup locations, for example,
>>>push them to AWS S3 and forget about redundancy.
>>>Why do you think only Offline Backups are reliable today ? 

The only way to ensure you have a secondary protocol is to use some type of
pull approach, where the backup system pulls from the online system.

>>There have been examples of hackers gaining control of an
>>organization's servers or cloud accounts and not only destroying
>>their online systems but also methodically deleting all their backups.
>>  There are fewer things that can go catastrophically wrong if one has
>>actual offline backups. You have to be a lot more careful about
>>protecting anything attached to the Internet.
>> I do not agree with this. If a hacker is gaining control of your 
>> organizational servers to destroy your Online backups, can't he destroy the 
>> offline backups and your database ?

The only way to be safe is having an external company, or passwords
isolated from your organisation. My personal approach is having public
certs installed so that the secondary backup system can pull the backups
from the online platforms, and keeping generated passwords in a
KeePass-encrypted database isolated from the organisation.

> 
> Well to me off-line means you have introduced an air gap between your on-line 
> presence and your off-line backups. This would prevent an intruder from 
> accessing the off-line backups.

The only way is not having the access at all, or perhaps requiring 2FA to
log in to AWS, so that you know when someone is trying to log in to your
AWS accounts; Linux servers support 2FA too.

> 
>> This is not a right justification to encouraging Offline Backups over Online 
>> Backups.
>> If you are worried about storing your online backups through internet on 
>> cloud (I do not agree as you can still secure your data on cloud), store it 
>> on a server in your Organizational network and do not push them through 
>> internet.
>> Taking Offline Backups is not the only right way to ensure Reliable Backups.
>> We are way ahead of the days where you need to face downtime to take backups.
>> Online Backups are reliable in PostgreSQL.

I think offline backups are useful as a complement to the online backups. My 
current policy for offline backups is weekly, monthly and yearly, with a 
retention of 4 backups; only once was a yearly backup used to recover from a 
human mistake.

PS: If you are really worried about hackers, perhaps you need to chat with 
your security officer to ensure alerts, accountability and 2FA, plus other 
techniques, are implemented in your company. Nothing will stop a good hacker, 
and they may well be working inside your company; 80% of attacks come from 
people who work inside the company ( Orion Security Chile shared this figure 
in 2001 ), no clue what today's stats are.

PS2: don’t use passwords like secret or s3cr3t, etc., and don’t forget 
security is just a feeling.

> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com
> 
> 





Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Avinash Kumar
On Mon, Oct 21, 2019 at 8:47 PM Alan Hodgson 
wrote:

> On Mon, 2019-10-21 at 20:40 +0530, Avinash Kumar wrote:
>
> can't he destroy the offline backups and your database ?
> This is not a right justification to encouraging Offline Backups over
> Online Backups.
> If you are worried about storing your online backups through internet on
> cloud (I do not agree as you can still secure your data on cloud), store it
> on a server in your Organizational network and do not push them through
> internet.
> Taking Offline Backups is not the only right way to ensure Reliable
> Backups.
> We are way ahead of the days where you need to face downtime to take
> backups.
> Online Backups are reliable in PostgreSQL.
>
>
> I apologize, I think we have a difference in terminology here. Obviously
> you don't need to take PostgreSQL out of service to take a backup. I don't
> know that you ever did; even pg_dump has always worked fine while the
> database is available.
>
> When I say offline backup I mean a backup that is stored in a way that it
> cannot be accessed via the Internet.
>
Okay, this gives clarity, thanks. We had a mix-up of offline and online, in
terms of backups versus networks.


Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Alan Hodgson
On Mon, 2019-10-21 at 20:40 +0530, Avinash Kumar wrote:
> can't he destroy the offline backups and your database ?  
> This is not a right justification to encouraging Offline Backups over
> Online Backups. 
> If you are worried about storing your online backups through internet
> on cloud (i do not agree as you can still secure your data on cloud),
> store it in on a server in your Organizational network and do not push
> them through internet.
> Taking Offline Backups is not the only right way to ensure Reliable
> Backups. 
> We are way ahead of the days where you need to face downtime to take
> backups. 
> Online Backups are reliable in PostgreSQL. 

I apologize, I think we have a difference in terminology here. Obviously
you don't need to take PostgreSQL out of service to take a backup. I
don't know that you ever did; even pg_dump has always worked fine while
the database is available.

When I say offline backup I mean a backup that is stored in a way that
it cannot be accessed via the Internet.


Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Adrian Klaver

On 10/21/19 8:10 AM, Avinash Kumar wrote:

Hi,

On Mon, Oct 21, 2019 at 8:16 PM Alan Hodgson > wrote:


On Mon, 2019-10-21 at 16:40 +0530, Avinash Kumar wrote:


We need to ensure that we have safe backup locations, for example,
push them to AWS S3 and forget about redundancy.
Why do you think only Offline Backups are reliable today ? 



There have been examples of hackers gaining control of an
organization's servers or cloud accounts and not only destroying
their online systems but also methodically deleting all their backups. 



There are fewer things that can go catastrophically wrong if one has
actual offline backups. You have to be a lot more careful about
protecting anything attached to the Internet.

I do not agree with this. If a hacker is gaining control of your 
organizational servers to destroy your Online backups, can't he destroy 
the offline backups and your database ?


Well to me off-line means you have introduced an air gap between your 
on-line presence and your off-line backups. This would prevent an 
intruder from accessing the off-line backups.


This is not a right justification to encouraging Offline Backups over 
Online Backups.
If you are worried about storing your online backups through internet on 
cloud (I do not agree as you can still secure your data on cloud), store 
it on a server in your Organizational network and do not push them 
through internet.
Taking Offline Backups is not the only right way to ensure Reliable 
Backups.
We are way ahead of the days where you need to face downtime to take 
backups.

Online Backups are reliable in PostgreSQL.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Avinash Kumar
Hi,

On Mon, Oct 21, 2019 at 8:16 PM Alan Hodgson 
wrote:

> On Mon, 2019-10-21 at 16:40 +0530, Avinash Kumar wrote:
>
>
> We need to ensure that we have safe backup locations, for example, push
> them to AWS S3 and forget about redundancy.
> Why do you think only Offline Backups are reliable today ?
>
>
>
> There have been examples of hackers gaining control of an organization's
> servers or cloud accounts and not only destroying their online systems but
> also methodically deleting all their backups.
>

> There are fewer things that can go catastrophically wrong if one has
> actual offline backups. You have to be a lot more careful about protecting
> anything attached to the Internet.
>
I do not agree with this. If a hacker is gaining control of your
organizational servers to destroy your Online backups, can't he destroy the
offline backups and your database ?
This is not a right justification to encouraging Offline Backups over
Online Backups.
If you are worried about storing your online backups through internet on
cloud (I do not agree as you can still secure your data on cloud), store it
on a server in your Organizational network and do not push them through
internet.
Taking Offline Backups is not the only right way to ensure Reliable
Backups.
We are way ahead of the days where you need to face downtime to take
backups.
Online Backups are reliable in PostgreSQL.


Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Adrian Klaver

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me as I thought there are exception blocks 
in stored functions and now sub-transactions in stored procedures.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Alan Hodgson
On Mon, 2019-10-21 at 16:40 +0530, Avinash Kumar wrote:
> We need to ensure that we have safe backup locations, for example,
> push them to AWS S3 and forget about redundancy. 
> Why do you think only Offline Backups are reliable today ? 

There have been examples of hackers gaining control of an organization's
servers or cloud accounts and not only destroying their online systems
but also methodically deleting all their backups.

There are fewer things that can go catastrophically wrong if one has
actual offline backups. You have to be a lot more careful about
protecting anything attached to the Internet.



Re: Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Thomas Kellerer
> I am trying to create the following stored function based on your suggestion 
> (and I have forgotten to mention, that I also need the board id aka bid from 
> another table, words_games), but hit the next problem:
> 
> CREATE OR REPLACE FUNCTION words_get_move(
>     in_mid integer
>     ) RETURNS TABLE (
>     out_bid    integer,
>     out_mid    bigint,
>     out_hand   text,
>     out_col    integer,
>     out_row    integer,
>     out_letter text,
>     out_value  integer
>     ) AS
> $func$
>     SELECT
>     g.bid,
>     m.mid,
>     m.hand,
>     (t->'col')::int AS col,
>     (t->'row')::int AS row,
>     (t->'letter')::text AS letter,
>     (t->'value')::int   AS value
>     FROM words_moves m
>     CROSS JOIN JSONB_ARRAY_ELEMENTS(m.tiles) AS t(tile)
>     LEFT JOIN words_games g USING(gid)
>     WHERE m.action = 'play' AND
>     m.gid = (SELECT gid FROM words_moves WHERE mid = in_mid)
>     AND m.played <= (SELECT played FROM words_moves WHERE mid = in_mid)
>     ORDER BY m.played DESC;
> $func$ LANGUAGE sql;
> 
> words_ru=> \i src/slova/dict/words_get_move.sql
> psql:src/slova/dict/words_get_move.sql:28: ERROR:  cannot cast type jsonb to 
> integer
> LINE 17: (t->'col')::int AS col,
>    ^
> 

Use ->> to return the value as text (not as JSONB) and you need to use the 
column alias, not the table alias:

(t.tile ->> 'col')::int
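
A quick sanity check of what that yields (a minimal sketch):

    select pg_typeof(('{"col": 7}'::jsonb ->> 'col')::int);
    -- integer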





Re: Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Adrian Klaver

On 10/21/19 6:39 AM, Alexander Farber wrote:

Hello, good afternoon!

With PostgreSQL 10 I host a word game, which stores player moves as a 
JSON array of objects with properties: col, row, value, letter -


CREATE TABLE words_moves (
     mid BIGSERIAL PRIMARY KEY,
     action  text NOT NULL,
     gid integer NOT NULL REFERENCES words_games ON DELETE CASCADE,
     uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
     played  timestamptz NOT NULL,
     tiles   jsonb,
     letters text,
     hand    text,
     score   integer CHECK(score >= 0),
     puzzle  boolean NOT NULL DEFAULT false
);

I am trying to construct a query, which would draw a game board when 
given a move id (aka mid):


     SELECT
     hand,
     JSONB_ARRAY_ELEMENTS(tiles)->'col' AS col,
     JSONB_ARRAY_ELEMENTS(tiles)->'row' AS row,
     JSONB_ARRAY_ELEMENTS(tiles)->'letter' AS letter,
     JSONB_ARRAY_ELEMENTS(tiles)->'value' AS value
     FROM words_moves
     WHERE action = 'play' AND
     gid = (SELECT gid FROM words_moves WHERE mid = 391416)
     AND played <= (SELECT played FROM words_moves WHERE mid = 391416)
     ORDER BY played DESC

The above query works for me and fetches all moves performed in a game 
id (aka gid) up to the move id 391416.


In my Java program I then just draw the tiles at the board, one by one 
(here a picture: https://slova.de/game-62662/ )


I have however 3 questions please:

1. Is it okay to call JSONB_ARRAY_ELEMENTS four times in the query, will 
PostgreSQL optimize that to a single call?


What is the structure of the JSON in tiles?

In other words could you expand the data in one go using jsonb_to_record()?
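
For example, something along these lines (an untested sketch, assuming the
tile structure shown above):

    SELECT t.col, t."row", t.letter, t.value
    FROM words_moves m
    CROSS JOIN LATERAL jsonb_array_elements(m.tiles) AS e(tile)
    CROSS JOIN LATERAL jsonb_to_record(e.tile)
                       AS t(col int, "row" int, letter text, value int);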

2. Do you think it is okay to sort by the played timestamp, or would it be 
better to sort by mid?
3. Performance-wise, is it okay to use the 2 subqueries for finding gid 
and played when given a mid?


I could see collapsing them into a single query. Something like:

FROM
  words_moves
JOIN
  (SELECT gid, played FROM words_moves WHERE mid = 391416) AS m_id
ON
  words_moves.gid = m_id.gid
WHERE
  ...



Thank you
Alex




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Alexander Farber
Thank you Thomas -

On Mon, Oct 21, 2019 at 4:24 PM Thomas Kellerer  wrote:

> Alexander Farber schrieb am 21.10.2019 um 15:39:
> > I am trying to construct a query, which would draw a game board when
> given a move id (aka mid):
> >
> > SELECT
> > hand,
> > JSONB_ARRAY_ELEMENTS(tiles)->'col' AS col,
> > JSONB_ARRAY_ELEMENTS(tiles)->'row' AS row,
> > JSONB_ARRAY_ELEMENTS(tiles)->'letter' AS letter,
> > JSONB_ARRAY_ELEMENTS(tiles)->'value' AS value
> > FROM words_moves
> > WHERE action = 'play' AND
> > gid = (SELECT gid FROM words_moves WHERE mid = 391416)
> > AND played <= (SELECT played FROM words_moves WHERE mid =
> 391416)
> > ORDER BY played DESC
> >
> > The above query works for me and fetches all moves performed in a game
> id (aka gid) up to the move id 391416.
> >
> > 1. Is it okay to call JSONB_ARRAY_ELEMENTS four times in the query, will
> PostgreSQL optimize that to a single call?
>
> Typically set returning functions should be used in the FROM clause, not
> the SELECT list:
>
> SELECT
> hand,
> t.tile -> 'col' AS col,
> t.tile -> 'row' AS row,
> t.tile -> 'letter' AS letter,
> t.tile -> 'value' AS value
> FROM words_moves
>   cross join jsonb_array_elements(tiles) as t(tile)
> WHERE action = 'play'
>   AND gid = (SELECT gid FROM words_moves WHERE mid = 391416)
>   AND played <= (SELECT played FROM words_moves WHERE mid =
> 391416)
> ORDER BY played DESC
>
>
I am trying to create the following stored function based on your
suggestion (and I have forgotten to mention, that I also need the board id
aka bid from another table, words_games), but hit the next problem:

CREATE OR REPLACE FUNCTION words_get_move(
in_mid integer
) RETURNS TABLE (
out_bid    integer,
out_mid    bigint,
out_hand   text,
out_col    integer,
out_row    integer,
out_letter text,
out_value  integer
) AS
$func$
SELECT
g.bid,
m.mid,
m.hand,
(t->'col')::int AS col,
(t->'row')::int AS row,
(t->'letter')::text AS letter,
(t->'value')::int   AS value
FROM words_moves m
CROSS JOIN JSONB_ARRAY_ELEMENTS(m.tiles) AS t(tile)
LEFT JOIN words_games g USING(gid)
WHERE m.action = 'play' AND
m.gid = (SELECT gid FROM words_moves WHERE mid = in_mid)
AND m.played <= (SELECT played FROM words_moves WHERE mid = in_mid)
ORDER BY m.played DESC;
$func$ LANGUAGE sql;

words_ru=> \i src/slova/dict/words_get_move.sql
psql:src/slova/dict/words_get_move.sql:28: ERROR:  cannot cast type jsonb
to integer
LINE 17: (t->'col')::int AS col,
   ^

How do I cast col to integer here?

Thanks
Alex


Re: Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Thomas Kellerer
Alexander Farber schrieb am 21.10.2019 um 15:39:
> I am trying to construct a query, which would draw a game board when given a 
> move id (aka mid):
> 
>     SELECT
>     hand,
>     JSONB_ARRAY_ELEMENTS(tiles)->'col' AS col,
>     JSONB_ARRAY_ELEMENTS(tiles)->'row' AS row,
>     JSONB_ARRAY_ELEMENTS(tiles)->'letter' AS letter,
>     JSONB_ARRAY_ELEMENTS(tiles)->'value' AS value
>     FROM words_moves
>     WHERE action = 'play' AND
>     gid = (SELECT gid FROM words_moves WHERE mid = 391416)
>     AND played <= (SELECT played FROM words_moves WHERE mid = 391416)
>     ORDER BY played DESC
> 
> The above query works for me and fetches all moves performed in a game id 
> (aka gid) up to the move id 391416.
> 
> 1. Is it okay to call JSONB_ARRAY_ELEMENTS four times in the query, will 
> PostgreSQL optimize that to a single call?

Typically set returning functions should be used in the FROM clause, not the 
SELECT list: 

SELECT
hand,
t.tile -> 'col' AS col,
t.tile -> 'row' AS row,
t.tile -> 'letter' AS letter,
t.tile -> 'value' AS value
FROM words_moves
  cross join jsonb_array_elements(tiles) as t(tile)
WHERE action = 'play' 
  AND gid = (SELECT gid FROM words_moves WHERE mid = 391416)
  AND played <= (SELECT played FROM words_moves WHERE mid = 391416)
ORDER BY played DESC








Calling jsonb_array_elements 4 times in the same query

2019-10-21 Thread Alexander Farber
Hello, good afternoon!

With PostgreSQL 10 I host a word game, which stores player moves as a JSON
array of objects with properties: col, row, value, letter -

CREATE TABLE words_moves (
mid BIGSERIAL PRIMARY KEY,
action  text NOT NULL,
gid integer NOT NULL REFERENCES words_games ON DELETE CASCADE,
uid integer NOT NULL REFERENCES words_users ON DELETE CASCADE,
played  timestamptz NOT NULL,
tiles   jsonb,
letters text,
hand    text,
score   integer CHECK(score >= 0),
puzzle  boolean NOT NULL DEFAULT false
);

I am trying to construct a query, which would draw a game board when given
a move id (aka mid):

SELECT
hand,
JSONB_ARRAY_ELEMENTS(tiles)->'col' AS col,
JSONB_ARRAY_ELEMENTS(tiles)->'row' AS row,
JSONB_ARRAY_ELEMENTS(tiles)->'letter' AS letter,
JSONB_ARRAY_ELEMENTS(tiles)->'value' AS value
FROM words_moves
WHERE action = 'play' AND
gid = (SELECT gid FROM words_moves WHERE mid = 391416)
AND played <= (SELECT played FROM words_moves WHERE mid = 391416)
ORDER BY played DESC

The above query works for me and fetches all moves performed in a game id
(aka gid) up to the move id 391416.

In my Java program I then just draw the tiles at the board, one by one
(here a picture: https://slova.de/game-62662/ )

I have however 3 questions please:

1. Is it okay to call JSONB_ARRAY_ELEMENTS four times in the query, will
PostgreSQL optimize that to a single call?
2. Do you think it is okay to sort by the played timestamp, or would it be
better to sort by mid?
3. Performance-wise, is it okay to use the 2 subqueries for finding gid and
played when given a mid?

Thank you
Alex


Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Andrew Dunstan


On 10/21/19 2:07 AM, Tomas Vondra wrote:
> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:
>>
>>> I think the general premise of this thread is that the application
>>> developer does not realize that may be necessary, because it's a bit
>>> surprising behavior, particularly when having more experience with
>>> other
>>> databases that behave differently. It's also pretty easy to not notice
>>> this issue for a long time, resulting in significant data loss.
>>>
>>> Let's say you're used to the MSSQL or MySQL behavior, you migrate your
>>> application to PostgreSQL or whatever - how do you find out about this
>>> behavior? Users are likely to visit
>>>
>>>    https://www.postgresql.org/docs/12/functions-json.html
>>>
>>> but that says nothing about how jsonb_set works with NULL values :-(
>>
>>
>>
>> We should certainly fix that. I accept some responsibility for the
>> omission.
>>
>
> +1
>
>


So let's add something to the JSON funcs page  like this:


Note: All the above functions except for json_build_object,
json_build_array, json_to_recordset, json_populate_record and
json_populate_recordset, and their jsonb equivalents, are strict
functions. That is, if any argument is NULL, the function result will be
NULL and the function won't even be called. Particular care should
therefore be taken to avoid passing NULL arguments to those functions
unless a NULL result is expected. This is particularly true of the
jsonb_set and jsonb_insert functions.
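
To illustrate the strict behavior the note is warning about (example mine,
not part of the proposed wording):

    -- the NULL new_value makes the whole result NULL; no error is raised
    SELECT jsonb_set('{"a": 1}'::jsonb, '{a}', NULL) IS NULL;  -- true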



(We do have a heck of a lot of Note: sections on that page)


cheers


andrew



-- 
Andrew Dunstan            https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Fabio Ugo Venchiarutti

On 21/10/2019 12:10, Avinash Kumar wrote:



On Mon, Oct 21, 2019 at 4:19 PM Fabio Ugo Venchiarutti 
mailto:f.venchiaru...@ocado.com>> wrote:


On 21/10/2019 09:52, Luca Ferrari wrote:
 > On Sat, Oct 19, 2019 at 7:46 PM Daulat Ram
mailto:daulat@exponential.com>> wrote:
 >> One more questions is, how backups are useful if we have
streaming replication . As I know, we can promote the standby as
primary in case of disaster at primary side. Do we need to schedule
backups if we have streaming replication?
 >
 > Let's speculate a little on that: do you need backups if you have a
 > RAID-1 configuration?
 > Replication helps you reduce almost to zero the time to handle a
 > disaster, backups allow you to recover over a much larger time window.
 >
 > Luca
 >
 >


TBH I hear this argument more often than I wish.


Offline backups and data replication are nearly entirely orthogonal. 




Any form of instantaneous redundancy (RAID, instantaneous replication
and so on) is primarily a mitigation measure to protect data &
availability against loss of infrastructure.


Backups (preferably with PITR) also do that, but that's not their
primary purpose unless you can't afford live redundancy on top of them.


Offline backups address many failure scenarios that any form of live
replication is defenseless against (eg: logical damage to the data as a
result of human errors/bugs/vandalism would hit all your replicas, but
you always can perform DR from a backup).

I think we are way ahead of the days where we need to live on Offline 
backups involving downtimes.
Today, we have several Online Consistent Backup solutions such as 
pg_basebackup (built-in with PG), pgBackRest, BARMAN, WAL-G, etc.

Online Consistent Backups + Archiving of WALs are perfectly reliable.
We need to ensure that we have safe backup locations, for example, push 
them to AWS S3 and forget about redundancy.

Why do you think only Offline Backups are reliable today ?



Delayed replicas, or any online rollback capability (such as pg_rewind
off the server's own pg_wal or Oracle's flashback), somewhat live in a
grey area in between, and their effectiveness varies depending on which
level was compromised.

What if you delay your replica by a day before making a change tomorrow 
(one that may need a rollback)? A delayed replica is for emergencies 
IMO and of course does not satisfy every possible scenario.





Trade-offs come down to individual implementers, so fair enough.


Our biases must come from different use cases/experience: a good 
fraction of recovery/sanitisation operations I had to carry out were 
very selective/application specific and involved buildup of anomalies 
over time, sometimes months.



Wouldn't have been possible without an old frozen reference for users to 
compare with, and that had to come from very cold storage.





--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Avinash Kumar
On Mon, Oct 21, 2019 at 4:19 PM Fabio Ugo Venchiarutti <
f.venchiaru...@ocado.com> wrote:

> On 21/10/2019 09:52, Luca Ferrari wrote:
> > On Sat, Oct 19, 2019 at 7:46 PM Daulat Ram 
> wrote:
> >> One more questions is, how backups are useful if we have streaming
> replication . As I know, we can promote the standby as primary in case of
> disaster at primary side. Do we need to schedule backups if we have
> streaming replication?
> >
> > Let's speculate a little on that: do you need backups if you have a
> > RAID-1 configuration?
> > Replication helps you reduce almost to zero the time to handle a
> > disaster, backups allow you to recover over a much larger time window.
> >
> > Luca
> >
> >
>
>
> TBH I hear this argument more often than I wish.
>
>
> Offline backups and data replication are nearly entirely orthogonal.


>
> Any form of instantaneous redundancy (RAID, instantaneous replication
> and so on) is primarily a mitigation measure to protect data &
> availability against loss of infrastructure.
>
>
> Backups (preferably with PITR) also do that, but that's not their
> primary purpose unless you can't afford live redundancy on top of them.
>
>
> Offline backups address many failure scenarios that any form of live
> replication is defenseless against (eg: logical damage to the data as a
> result of human errors/bugs/vandalism would hit all your replicas, but
> you always can perform DR from a backup).
>
I think we are way ahead of the days where we need to live on Offline
backups involving downtimes.
Today, we have several Online Consistent Backup solutions such as
pg_basebackup (built-in with PG), pgBackRest, BARMAN, WAL-G, etc.
Online Consistent Backups + Archiving of WALs are perfectly reliable.
We need to ensure that we have safe backup locations, for example, push
them to AWS S3 and forget about redundancy.
Why do you think only Offline Backups are reliable today ?
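
For instance, a minimal online-backup setup is just ( paths below are
placeholders ):

    # postgresql.conf: continuous WAL archiving
    archive_mode = on
    archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'

    # plus a periodic base backup, e.g.
    #   pg_basebackup -D /backup/base/$(date +%F) -X stream -c fast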

>
>
> Delayed replicas, or any online rollback capability (such as pg_rewind
> off the server's own pg_wal or Oracle's flashback), somewhat live in a
> grey area in between, and their effectiveness varies depending on which
> level was compromised.
>
What if you delay your replica by a day before making a change tomorrow
(one that may need a rollback)? A delayed replica is for emergencies IMO
and of course does not satisfy every possible scenario.

>
>
>
>
>
> --
> Regards
>
> Fabio Ugo Venchiarutti
> OSPCFC Network Engineering Dpt.
> Ocado Technology
>


Thanks,
Avinash Vallarapu.


Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Fabio Ugo Venchiarutti

On 21/10/2019 09:52, Luca Ferrari wrote:

On Sat, Oct 19, 2019 at 7:46 PM Daulat Ram  wrote:

One more questions is, how backups are useful if we have streaming replication 
. As I know, we can promote the standby as primary in case of disaster at 
primary side. Do we need to schedule backups if we have streaming replication?


Let's speculate a little on that: do you need backups if you have a
RAID-1 configuration?
Replication helps you reduce almost to zero the time to handle a
disaster, backups allow you to recover over a much larger time window.

Luca





TBH I hear this argument more often than I wish.


Offline backups and data replication are nearly entirely orthogonal.


Any form of instantaneous redundancy (RAID, instantaneous replication 
and so on) is primarily a mitigation measure to protect data & 
availability against loss of infrastructure.



Backups (preferably with PITR) also do that, but that's not their 
primary purpose unless you can't afford live redundancy on top of them.



Offline backups address many failure scenarios that any form of live 
replication is defenseless against (eg: logical damage to the data as a 
result of human errors/bugs/vandalism would hit all your replicas, but 
you always can perform DR from a backup).



Delayed replicas, or any online rollback capability (such as pg_rewind 
off the server's own pg_wal or Oracle's flashback), somewhat live in a 
grey area in between, and their effectiveness varies depending on which 
level was compromised.






--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology






Re: Postgres Point in time Recovery (PITR),

2019-10-21 Thread Luca Ferrari
On Sat, Oct 19, 2019 at 7:46 PM Daulat Ram  wrote:
> One more questions is, how backups are useful if we have streaming 
> replication . As I know, we can promote the standby as primary in case of 
> disaster at primary side. Do we need to schedule backups if we have streaming 
> replication?

Let's speculate a little on that: do you need backups if you have a
RAID-1 configuration?
Replication helps you reduce almost to zero the time to handle a
disaster, backups allow you to recover over a much larger time window.

Luca