[no subject]

2020-07-12 Thread Anto Aravinth
Hello All,

I have the following table:

postgres=# \d so_rum;
Table "public.so_rum"
  Column   |  Type   | Collation | Nullable | Default
---+-+---+--+-
 id| integer |   |  |
 title | character varying(1000) |   |  |
 posts | text|   |  |
 body  | tsvector|   |  |
 parent_id | integer |   |  |
Indexes:
"so_rum_body_idx" rum (body)

I wanted to do phrase search query, so I came up with the below query, for
example:

select id from so_rum
where body @@ phraseto_tsquery('english','Is it possible to toggle
the visibility');

This gives me the results, which only match's the entire text. However,
there are documents, where the distance between lexmes are more and the
above query doesn't gives me back those data.  For example: `'it is
something possible to do toggle between the. . .  visibility'` doesn't get
returned. I know I can get it returned with `<2>` (for example) distance
operator by giving in the `to_tsquery`, manually.


But I wanted to understand, how to do this in my sql statement itself, so
that I get the results  first with distance of `1` and then `2` and so on
(may be till `6-7`). Finally append results with the actual count of the
search words like the following query:

select count(id) from so_rum
where body @@ to_tsquery('english','string & string . . . ')

Is it possible to do in a single query with good performance?


Re: Join optimization

2020-07-12 Thread luis . roberto


- Mensagem original -
De: "David Rowley" 
Para: "luis.roberto" 
Cc: "Fabrízio de Royes Mello" , "pgsql-general" 

Enviadas: Domingo, 12 de julho de 2020 5:29:08
Assunto: Re: Join optimization

On Sun, 12 Jul 2020 at 06:59,  wrote:
>
> I'm sorry for the bad example.
>
> Here is another, with some data on PG:  
> https://dbfiddle.uk/?rdbms=postgres_13=ccfd1c4fa291e74a6db9db1772e2b5ac
>   and Oracle:  
> https://dbfiddle.uk/?rdbms=oracle_18=21a98f499065ad4e2c35ff4bd1487e14.

I believe what you're talking about is join removals.  It appears as
though Oracle is able to remove the inner join to the users table as
the join only serves to check the user record exists. No columns are
being selected.  The record must exist due to the foreign key
referencing users.

PostgreSQL currently can only remove left joins. Likely what you could
do here is just change your inner join into a left join. If you're
happy enough that the user record will always exist then that should
allow it to work.

The reason PostgreSQL does not currently support inner join is that by
default, foreign key constraints are only triggered at the end of the
query, (or if deferred, at the end of the transaction). WIth
PostgreSQL, it's possible for a SELECT query to see a violated foreign
key constraint.  This can happen if your SELECT query calls a function
which updates a referenced record.  The cascade of the foreign key
won't occur until the end of the statement, so the select may stumble
upon a violated foreign key.

Here's a quick example of this case:
drop table t1,t2;
create table t1 (a int primary key);
create table t2 (a int references t1 on update cascade);

insert into t1 values(1),(2);
insert into t2 values(2),(2);
create or replace function update_t1 (p_a int) returns int as $$ begin
update t1 set a = a + 1 where a = p_a; return p_a; end; $$ language
plpgsql volatile;

-- in theory, this should never return any rows as we're asking for
all rows that
-- don't exist in the referenced table. However, we do get a result
since the function
-- call updates t1 setting the row with a=2 to a=3. The cascade to t2
does not occur
-- until the end of the statement.
select update_t1(a+1),t1.a from t1 where not exists(select 1 from t2
where t1.a=t2.a);
 update_t1 | a
---+---
 2 | 1
(1 row)

If you're happy that you don't have any functions like that which
could momentarily cause the foreign key to appear violated, then there
shouldn't be any harm in changing the INNER JOIN on users to a LEFT
JOIN.  PostgreSQL will be able to remove the join in that case.

David

-

Thanks for the reply David! I understand it better now.




Re: Listen/Notify feedback

2020-07-12 Thread Michel Pelletier
On Sat, Jul 11, 2020 at 10:44 AM Brian Dunavant  wrote:

> One aspect is if there is no one listening when a notify happens, the
> message is lost (e.g. no durability).   If this is important to you, it can
> be addressed by writing the messages to a table as well when you NOTIFY,
> and the listener deletes messages after they are processed.  On connection
> the listener can query the table to catch up on any missed messages, or
> messages that were mid-process during a crash.  This is trickier with more
> than one listener.   This isn't a whole lot more efficient than just using
> the table alone, but it saves you from having to poll so better response
> times.
>

Good advice from Brian here that mirrors my own experience, I'd like to
point out that if you do go the multiple listener route working a
persistent table, it's important to avoid races with SELECT FOR UPDATE SKIP
LOCKED.

https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE

I know this is old news to most of the people on this list, but I've run
into enough folks who don't know about this little gem that I figured I'd
mention it, it's saved my bacon more than once.

-Michel


> On Sat, Jul 11, 2020 at 8:58 AM Rita  wrote:
>
>> I am investigating various pub/sub tools such as ActiveMQ, Rabbit, Redis,
>> etc.I came across Postgresql Listen/Notify and was easily able to write
>> code to listen to messages. For the people who have been using this for a
>> while: what are its downsides, things to consider when writing good code
>> that use pub/sub, how do you deal with large messages, can I have
>> subscribers listen to replica nodes?
>>
>> Thanks
>> --
>> --- Get your facts first, then you can distort them as you please.--
>>
>


Re: Listen/Notify feedback

2020-07-12 Thread Rita
 Good to know about potential performance problems. I don't plan to have
more than 5 hosts. Also, good to know about MQTT.

On Sun, Jul 12, 2020 at 8:52 AM Andrew Smith  wrote:

> On Sun, 12 Jul 2020 at 21:39, Rita  wrote:
>
>> Thats good to know. Are there some standard patterns or best practices I
>> should follow when using messaging and with listen/notify?
>>
>> On Sat, Jul 11, 2020 at 1:44 PM Brian Dunavant 
>> wrote:
>>
>>> One aspect is if there is no one listening when a notify happens, the
>>> message is lost (e.g. no durability).   If this is important to you, it can
>>> be addressed by writing the messages to a table as well when you NOTIFY,
>>> and the listener deletes messages after they are processed.  On connection
>>> the listener can query the table to catch up on any missed messages, or
>>> messages that were mid-process during a crash.  This is trickier with more
>>> than one listener.   This isn't a whole lot more efficient than just using
>>> the table alone, but it saves you from having to poll so better response
>>> times.
>>>
>>> On Sat, Jul 11, 2020 at 8:58 AM Rita  wrote:
>>>
 I am investigating various pub/sub tools such as ActiveMQ, Rabbit,
 Redis, etc.I came across Postgresql Listen/Notify and was easily able to
 write code to listen to messages. For the people who have been using this
 for a while: what are its downsides, things to consider when writing good
 code that use pub/sub, how do you deal with large messages, can I have
 subscribers listen to replica nodes?

 Thanks
 --
 --- Get your facts first, then you can distort them as you please.--

>>>
> A couple of years ago I started looking into listen/notify in PG10 and
> found that the throughput decreased quite a bit as I added more and more
> listeners. Given the number of apps I needed to have listening and the
> number of messages that I expected to be consuming, I ended up writing a
> single listener app which then republished the messages via MQTT. Not sure
> if the performance has improved in subsequent versions (or whether this
> will affect you at all) but it's something to keep in mind.
>


-- 
--- Get your facts first, then you can distort them as you please.--


Re: Listen/Notify feedback

2020-07-12 Thread Tom Lane
Andrew Smith  writes:
> A couple of years ago I started looking into listen/notify in PG10 and
> found that the throughput decreased quite a bit as I added more and more
> listeners. Given the number of apps I needed to have listening and the
> number of messages that I expected to be consuming, I ended up writing a
> single listener app which then republished the messages via MQTT. Not sure
> if the performance has improved in subsequent versions (or whether this
> will affect you at all) but it's something to keep in mind.

FWIW, we made some efficiency improvements in v13 for the case of NOTIFY
with a lot of listeners [1].  Can't say of course whether that would
have fixed your problem, and v13 is still in beta anyway.

regards, tom lane

[1] https://git.postgresql.org/gitweb/?p=postgresql.git=commitdiff=51004c717




Re: Listen/Notify feedback

2020-07-12 Thread Andrew Smith
On Sun, 12 Jul 2020 at 21:39, Rita  wrote:

> Thats good to know. Are there some standard patterns or best practices I
> should follow when using messaging and with listen/notify?
>
> On Sat, Jul 11, 2020 at 1:44 PM Brian Dunavant  wrote:
>
>> One aspect is if there is no one listening when a notify happens, the
>> message is lost (e.g. no durability).   If this is important to you, it can
>> be addressed by writing the messages to a table as well when you NOTIFY,
>> and the listener deletes messages after they are processed.  On connection
>> the listener can query the table to catch up on any missed messages, or
>> messages that were mid-process during a crash.  This is trickier with more
>> than one listener.   This isn't a whole lot more efficient than just using
>> the table alone, but it saves you from having to poll so better response
>> times.
>>
>> On Sat, Jul 11, 2020 at 8:58 AM Rita  wrote:
>>
>>> I am investigating various pub/sub tools such as ActiveMQ, Rabbit,
>>> Redis, etc.I came across Postgresql Listen/Notify and was easily able to
>>> write code to listen to messages. For the people who have been using this
>>> for a while: what are its downsides, things to consider when writing good
>>> code that use pub/sub, how do you deal with large messages, can I have
>>> subscribers listen to replica nodes?
>>>
>>> Thanks
>>> --
>>> --- Get your facts first, then you can distort them as you please.--
>>>
>>
A couple of years ago I started looking into listen/notify in PG10 and
found that the throughput decreased quite a bit as I added more and more
listeners. Given the number of apps I needed to have listening and the
number of messages that I expected to be consuming, I ended up writing a
single listener app which then republished the messages via MQTT. Not sure
if the performance has improved in subsequent versions (or whether this
will affect you at all) but it's something to keep in mind.


Re: Listen/Notify feedback

2020-07-12 Thread Rita
Thats good to know. Are there some standard patterns or best practices I
should follow when using messaging and with listen/notify?

On Sat, Jul 11, 2020 at 1:44 PM Brian Dunavant  wrote:

> One aspect is if there is no one listening when a notify happens, the
> message is lost (e.g. no durability).   If this is important to you, it can
> be addressed by writing the messages to a table as well when you NOTIFY,
> and the listener deletes messages after they are processed.  On connection
> the listener can query the table to catch up on any missed messages, or
> messages that were mid-process during a crash.  This is trickier with more
> than one listener.   This isn't a whole lot more efficient than just using
> the table alone, but it saves you from having to poll so better response
> times.
>
> On Sat, Jul 11, 2020 at 8:58 AM Rita  wrote:
>
>> I am investigating various pub/sub tools such as ActiveMQ, Rabbit, Redis,
>> etc.I came across Postgresql Listen/Notify and was easily able to write
>> code to listen to messages. For the people who have been using this for a
>> while: what are its downsides, things to consider when writing good code
>> that use pub/sub, how do you deal with large messages, can I have
>> subscribers listen to replica nodes?
>>
>> Thanks
>> --
>> --- Get your facts first, then you can distort them as you please.--
>>
>

-- 
--- Get your facts first, then you can distort them as you please.--


Re: Join optimization

2020-07-12 Thread David Rowley
On Sun, 12 Jul 2020 at 06:59,  wrote:
>
> I'm sorry for the bad example.
>
> Here is another, with some data on PG:  
> https://dbfiddle.uk/?rdbms=postgres_13=ccfd1c4fa291e74a6db9db1772e2b5ac
>   and Oracle:  
> https://dbfiddle.uk/?rdbms=oracle_18=21a98f499065ad4e2c35ff4bd1487e14.

I believe what you're talking about is join removals.  It appears as
though Oracle is able to remove the inner join to the users table as
the join only serves to check the user record exists. No columns are
being selected.  The record must exist due to the foreign key
referencing users.

PostgreSQL currently can only remove left joins. Likely what you could
do here is just change your inner join into a left join. If you're
happy enough that the user record will always exist then that should
allow it to work.

The reason PostgreSQL does not currently support inner join is that by
default, foreign key constraints are only triggered at the end of the
query, (or if deferred, at the end of the transaction). WIth
PostgreSQL, it's possible for a SELECT query to see a violated foreign
key constraint.  This can happen if your SELECT query calls a function
which updates a referenced record.  The cascade of the foreign key
won't occur until the end of the statement, so the select may stumble
upon a violated foreign key.

Here's a quick example of this case:
drop table t1,t2;
create table t1 (a int primary key);
create table t2 (a int references t1 on update cascade);

insert into t1 values(1),(2);
insert into t2 values(2),(2);
create or replace function update_t1 (p_a int) returns int as $$ begin
update t1 set a = a + 1 where a = p_a; return p_a; end; $$ language
plpgsql volatile;

-- in theory, this should never return any rows as we're asking for
all rows that
-- don't exist in the referenced table. However, we do get a result
since the function
-- call updates t1 setting the row with a=2 to a=3. The cascade to t2
does not occur
-- until the end of the statement.
select update_t1(a+1),t1.a from t1 where not exists(select 1 from t2
where t1.a=t2.a);
 update_t1 | a
---+---
 2 | 1
(1 row)

If you're happy that you don't have any functions like that which
could momentarily cause the foreign key to appear violated, then there
shouldn't be any harm in changing the INNER JOIN on users to a LEFT
JOIN.  PostgreSQL will be able to remove the join in that case.

David