Re: can stored procedures with computational sql queries improve API performance?

2024-07-10 Thread Olivier Gautherot
Hi Krishnakant,

On Wed, Jul 10, 2024 at 2:58 AM Krishnakant Mane 
wrote:

> Hello.
>
> I have a straight forward question, but I am just trying to analyze the
> specifics.
>
> So I have a set of queries depending on each other in a sequence to
> compute some results for generating financial report.
>
> It involves summing up some amounts from tons of rows and also on
> certain conditions it categorizes the amounts into types (aka Debit
> Balance, Credit balance etc).
>
> There are at least 6 queries in this sequence and apart from 4 input
> parameters. these queries never change.
>
> So will I get any performance benefit by having them in a stored
> procedure rather than sending the queries from my Python based API?
>
> Regards.
>

Functions and procedures have significant potential to improve
performance, but there are a few things to watch. This is feedback from
experience, after writing a few hundred of them for a project. The list
below is by no means exhaustive.

1) Warning: Compatibility with other database engines
If your model needs to run on other technologies (Oracle, MySQL, MS-SQL,
etc.), the increase in maintenance effort may become prohibitive.

2) Warning: deadlocks
Make sure to design the procedures in such a way that they don't mutually
depend on other parallel invocations.

3) WARNING: Don't overload the server
Try to remain conservative in terms of computation on your server. Try to
stick to selects, updates, deletes, joins, and simple arithmetic and string
manipulations. You can do a lot more, but it may quickly affect the overall
performance.

The reasons why I would recommend using them:

A) Reduction of round trips
Even though it may only be a few ms at a time, it can add up and become
significant. This is where you gain performance (a minimal sketch follows
after point C).

B) Implicit transactions
A function will not free locks until it returns. This means that if your
queries depend on cells, or modify cells, the behavior will be coherent,
reducing the risk of race conditions. If other invocations depend on the
same data, the locks will take care of sequencing execution to maintain
integrity. In other words, you can safely invoke functions in parallel and
let the engine do the scheduling when necessary.

C) API evolution
As long as you have only 1 or 2 applications running against the database,
it may not be an issue. If you have more and your model needs to evolve,
you may get to a situation where updating them all at the same time can
become a challenge, especially if you depend on external providers. By
using procedures and functions, you can abstract the model and maintain a
standard interface to the application.
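
To illustrate point A, here is a minimal sketch of what wrapping such a
sequence could look like - the table, column and parameter names below are
hypothetical, only the structure matters:

-- Hypothetical sketch: one function call replaces several round trips.
CREATE OR REPLACE FUNCTION report_balances(
    p_from    date,
    p_to      date,
    p_account int,
    p_branch  int)
RETURNS TABLE (category text, total numeric) AS $$
BEGIN
    RETURN QUERY
    SELECT CASE WHEN sum(l.amount) >= 0 THEN 'Debit Balance'
                ELSE 'Credit Balance' END,
           sum(l.amount)
      FROM ledger l
     WHERE l.entry_date BETWEEN p_from AND p_to
       AND l.account_id = p_account
       AND l.branch_id  = p_branch;
    -- ...the remaining queries of the sequence would follow here, storing
    -- intermediate results in local variables where needed.
END;
$$ LANGUAGE plpgsql STABLE;

The Python side then issues a single statement, for example:
SELECT * FROM report_balances('2024-04-01', '2024-06-30', 42, 7);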

Note: *I DON'T RECOMMEND IT*, but in some cases it can be handy to have
the same function name with different sets of parameters (for instance, a
wrapper with default parameters and other variants with finer, more
complete control). It can also happen if you don't clean up older versions
of the API in a timely manner when you upgrade your model - and it can
become a nightmare.

Last recommendation: activate the logs and review the performance of your
functions regularly. You may identify invocations that run very fast and
others that do not. This can help you identify potential conflicts or model
optimizations.

Hope it helps
--
Olivier Gautherot


Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY

2024-02-06 Thread Olivier Gautherot
On Wed, Feb 7, 2024 at 8:07, Sean v  wrote:

> Exactly. I'm really just trying to understand if there's some functional
> limitation to it being able to do that with how it executes these types of
> queries, or if its just an optimization that hasn't been built into the
> query planner yet.
>
> I know I can get it to do precisely this if I use a CROSS JOIN LATERAL:
>
> SELECT o.* FROM company_users cu CROSS JOIN LATERAL (
>     SELECT *
>     FROM orders o
>     WHERE o.user_id = company_users.user_id
>     ORDER BY created_at DESC LIMIT 50
> ) cu WHERE cu.company_id = ? ORDER BY created_at DESC LIMIT 50
>
> That makes sense to me, it forces a nested loop and executes for each
> user. But doing a nested select like the query below doesn't use the index
> or limit the results to 50 per user - even though it does a nested loop
> just like the lateral join does:
>
> SELECT "orders".* FROM "orders" WHERE user_id IN (SELECT user_id FROM 
> company_users WHERE company_id = ?) ORDER BY "orders"."created_at" LIMIT 50
>
>
Joins will generally scan the whole tables, leading to long run times.
Have you tried preselecting the rows of interest with a "WITH ... SELECT
..." query to reduce the amount of data processed?
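
For reference, a minimal sketch of that idea against the tables of your
example (column names are taken from your queries, '?' being the company id
parameter):

WITH cu AS (
    SELECT user_id
      FROM company_users
     WHERE company_id = ?
)
SELECT o.*
  FROM orders o
  JOIN cu ON cu.user_id = o.user_id
 ORDER BY o.created_at DESC
 LIMIT 50;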

 On 2024-02-05 7:58 a.m., David G. Johnston wrote:
>
> On Mon, Feb 5, 2024 at 8:55 AM Ron Johnson 
> wrote:
>
>>
>> Who knows which users are going to be in that list???
>>
>>
> It doesn't matter.  Worse case scenario there is only one user in the
> result and so all 50 rows are their earliest 50 rows.  The system will thus
> never need more than the earliest 50 rows per user to answer this question.
>
> David J.
>
>
Cheers
Olivier

>
>


Re: Disk Groups/Storage Management for a Large Database in PostgreSQL

2024-01-22 Thread Olivier Gautherot
Hi Amit,

On Mon, Jan 22, 2024 at 18:44, Amit Sharma  wrote:

> Hi,
>
> We are building new VMs for PostgreSQL v15 on RHEL 8.x For a large
> database of 15TB-20TB.
>
> I would like to know from the experts that is it a good idea to create
> LVMs to manage storage for the database?
>
> Or are there any other better options/tools for disk groups in PostgreSQL,
> similar to ASM in Oracle?
>
> Thanks
> Amit
>

Simple question that requires a somewhat more complex answer. There are
actually 3 metrics to consider:

1) Capacity
Your database doesn't fit on a single disk, so you need to distribute your
data across several disks. LVM would indeed be an option (as well as ZFS or
RAID disk arrays)

2) Safety
If you lose 1 disk, your data is at risk, as you're likely to lose all
tables partially stored on that disk. LVM is still an option as long as it
is configured on a RAID array. ZFS can do that natively.

3) Performance
Oracle ASM ensures performance by automatically controlling the
distribution of the tables. I would need to see on a real case how it is
actually done. For sure, LVM and ZFS won't have this type of granularity.

On the other hand, you can distribute your data in table partitions to help
with this distribution. It is not automatic but will surely help you
distribute your workload.
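
As an illustration of that last point, a minimal sketch (paths and names
are hypothetical) combining declarative partitioning with tablespaces
placed on different volumes:

-- Hypothetical sketch: spread partitions across tablespaces on separate disks.
CREATE TABLESPACE disk1 LOCATION '/mnt/disk1/pgdata';
CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

CREATE TABLE measurements (
    id         bigint      NOT NULL,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE measurements_2024h1 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2024-07-01') TABLESPACE disk1;
CREATE TABLE measurements_2024h2 PARTITION OF measurements
    FOR VALUES FROM ('2024-07-01') TO ('2025-01-01') TABLESPACE disk2;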

Hope it helps
Olivier


>


Re: Very newbie question

2023-10-26 Thread Olivier Gautherot
Hi,

On Thu, Oct 26, 2023 at 11:15, Peter J. Holzer  wrote:

> On 2023-10-25 17:48:46 +0200, Olivier Gautherot wrote:
> > On Wed, Oct 25, 2023 at 16:58, Олег Самойлов  wrote:
> > Okey, I see no one was be able to solve this problem. But I could.
> May be
> > for someone this will be useful too. There is solution.
> [...]
> > Now query is:
> >
> > SELECT generate_series(min(id)/1000, max(id)/1000) AS n FROM
> > delivery) as part_numbers
> >  WHERE (SELECT max(created_at) from delivery where
> n*1000 <=id
> > and id < (n+1)*1000)
> > < CURRENT_DATE-'3 month'::interval;
> >
> > Return the same (number of partition need to archive), accelerated
> by two
> > btree index: on id and created_at. Works very quick, less then
> second.
> [...]
> > Your fast solution will work as long as you don't have missing sequences
> (like
> > deleted rows).
>
> Why do you think this would break with missing sequence numbers?
>
> hp
>

In the suggested query, the return value contains a list of sequential
numbers from a min to a max - they appear to be markers of the partitions.
Let's assume that a complete partition is deleted in the middle: its index
will still be returned by the query, although it no longer exists in the
table. It can be an issue if the list of indexes is actually used and
partitions are not deleted sequentially.

My two cents' worth to ensure data integrity.


Regards
Olivier Gautherot

>


Re: Very newbie question

2023-10-25 Thread Olivier Gautherot
Hi,

On Wed, Oct 25, 2023 at 16:58, Олег Самойлов  wrote:

> Okey, I see no one was be able to solve this problem. But I could. May be
> for someone this will be useful too. There is solution.
>
> Original query was:
>
> > On Oct 23, 2023, at 18:13, Олег Самойлов  wrote:
> >
> > SELECT id/1000 as partition
> >   FROM delivery
> >   GROUP BY partition
> >   HAVING max(created_at) < CURRENT_DATE - '3 month'::interval;
>
> And I was not able to accelerate it by any index, works 5 minutes. Now
> query is:
>
> SELECT generate_series(min(id)/1000, max(id)/1000) AS n FROM
> delivery) as part_numbers
>  WHERE (SELECT max(created_at) from delivery where n*1000 <=id
> and id < (n+1)*1000)
> < CURRENT_DATE-'3 month'::interval;
>
> Return the same (number of partition need to archive), accelerated by two
> btree index: on id and created_at. Works very quick, less then second.
>

If you happen to rework your design, consider partitioning on (created_at),
as it may simplify your maintenance.

The reason why you couldn't improve the performance with an index is that
the calls to min() and max() force every single row to be evaluated. You
may consider an index on an expression in this case.
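
A minimal sketch of such an index, reusing the 1000-row partition size from
your query:

CREATE INDEX delivery_part_created_idx ON delivery ((id/1000), created_at);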

Your fast solution will work as long as you don't have missing sequences
(like deleted rows).

Regards
Olivier

>


Re: PostgreSQL and local HDD

2023-08-15 Thread Olivier Gautherot
On Wed, Aug 16, 2023 at 6:54, Ron  wrote:

> On 8/15/23 23:48, Olivier Gautherot wrote:
>
>
> On Wed, Aug 16, 2023 at 5:39, Ron  wrote:
>
>> On 8/15/23 04:24, Olivier Gautherot wrote:
>>
>> [snip]
>>
>> Does the database have to be on a storage like EMC or QNAP?
>>>
>>
>> Faster storage can only help but I would start by discarding functional
>> overhead.
>>
>>
>> Functional overhead?
>>
>
> I mean inefficient design, which won't be solved by changing the storage.
>
>
> Ah, the normal aspects of physical design...
>

... and considering that the OP did not quantify what "slow" means and what
his reference/expectation is.

Let's wait for the OP to give more background.

>


Re: PostgreSQL and local HDD

2023-08-15 Thread Olivier Gautherot
On Wed, Aug 16, 2023 at 5:39, Ron  wrote:

> On 8/15/23 04:24, Olivier Gautherot wrote:
>
> [snip]
>
> Does the database have to be on a storage like EMC or QNAP?
>>
>
> Faster storage can only help but I would start by discarding functional
> overhead.
>
>
> Functional overhead?
>

I mean inefficient design, which won't be solved by changing the storage.

--
Olivier Gautherot

>
>


Re: PostgreSQL and local HDD

2023-08-15 Thread Olivier Gautherot
Hi Jason

On Tue, Aug 15, 2023 at 9:23 AM Jason Long  wrote:

> Hello,
> Why is the PostgreSQL database slow when it is on the server's hard drive?
> My servers are HPE ProLiant DL380p G8 and HPE ProLiant DL380 G9.
>

Can you be more specific in your question? The reasons why the server would
be "slow" can be many and can be addressed in many ways. So additional
information would be welcome:
- Size of the database
- Number of rows (and their number of columns, size...) in the tables
- Indices and other optimization tricks
- The query itself
- ... and other server configuration parameters

Does the database have to be on a storage like EMC or QNAP?
>

Faster storage can only help but I would start by discarding functional
overhead.

Does PostgreSQL have an option to increase speed?
>

It is usually achieved by reducing the number of rows you pull from the
disk.
Try the EXPLAIN command
(https://www.postgresql.org/docs/current/sql-explain.html) to see where
the server is wasting time.
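
For instance - a minimal sketch, the query itself being only a placeholder:

-- ANALYZE runs the query and reports actual timings, BUFFERS shows how
-- much data came from disk vs. cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM my_table WHERE created_at >= now() - interval '1 day';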


> Thank you.
>

--
Olivier Gautherot


Re: Alter the column data type of the large data volume table.

2020-12-04 Thread Olivier Gautherot
On Fri, Dec 4, 2020 at 5:22 PM Michael Lewis  wrote:

>
> On Fri, Dec 4, 2020 at 9:04 AM Kevin Brannen  wrote:
>
>> *>From:* Olivier Gautherot 
>>
>> >>5) If you're brave enough, convert your current table as a partition
>> (rename it to something like table_hist), duplicate the table model under
>> the same name as now (adjusting the primary key type) and set the INHERITS
>> on the primary key range. The inheritance should take care of the type
>> conversion (haven't tried it but it's worth a try). If it works, you will
>> reach your goal without downtime or significant overhead.
>>
>> >Sorry, just tried this one and it failed: type mismatch.
>>
>>
>>
>> Seems like a sound idea in general. I’d probably rename the tables, let’s
>> call them “big_hist” for the old big table and “big_split” for the new
>> partitioned table that being used go forward – assuming the original table
>> was called “big”. Then create a View that will look at both of those but
>> call it the same as the old table, and let the view do a type cast on the
>> old key like big_hist.id::bigint so it matches the new type, because the
>> view will probably be a union and the type need to match. That way your
>> application only has to pause long enough to do a few meta-commands then it
>> all can resume, and like Olivier pointed you, you can fix the data by
>> moving it from big_hist to big_split in the background as you have time.
>>
>>
>>
>> I’d probably put it all in a transaction too:
>>
>>
>>
>> Create table … -- all the commands to create your patitioned table
>> big_split here
>>
>> Begin;
>>
>> Alter table big rename to big_hist;
>>
>> Create view big select * from big_split union select id::bigint, /* other
>> cols */ from big_hist;
>>
>> Commit;
>>
>>
>>
>> Try it on a dev system and if it works you’re off and running. I’d expect
>> the view to slow things down a little, but probably not too much if you
>> have good indexes. But at least you could transition without major downtime
>> and then rename “big_split” back to “big” and drop “big_hist” when you’ve
>> finished the transition. I might even be tempted to add a trigger so that
>> all new inserts into “big” really go into “big_split” so “big_hist” doesn’t
>> grow any more. Your imagination is probably the limit. 😊
>>
>
> Yes, "instead of" triggers for insert/update/delete will make the change
> transparent to the application side, other than the potential for slowness
> while in the process of moving the data and still using the view.
>
> Also, I'd advocate for UNION ALL to avoid the "distinct" work that is
> required for plain UNION. I wish ALL were default behavior and "UNION
> DISTINCT" was how get that behavior.
>
> If partitioning is going to happen anyway in the near future, now is a
> great time to make that happen. Unfortunately, PG13 is required to use
> logical replication from a non-partitioned table to a partitioned table so
> moving the data still requires some creative work.
>

Rather than a union, use inheritance with a constraint on the primary key:
it will take care of the union transparently for you.
--
Olivier Gautherot



Re: Alter the column data type of the large data volume table.

2020-12-04 Thread Olivier Gautherot

Hi Charles,

On 04-12-2020 9:44, Olivier Gautherot wrote:

Hi Charles,

On Fri, Dec 4, 2020 at 9:12 AM charles meng  wrote:

What I mean is that it will take a lot of time to complete all
data processing.I have to say that it is a good solution to adjust
the column type without affecting online users.

I found a tool on github, see the link below, unfortunately, this
is for MySQL...

https://github.com/github/gh-ost <https://github.com/github/gh-ost>


MySQL has its own strategy with regard to column handling so what
works there does not necessarily fit here.

There are some good ideas in this thread but I would start with a few
checks:

1) What version of PG are you using ?

2) If you can try on your current setup (not necessarily in
Production) to add a column typed bigint - if it is recent enough it
will be a simple catalog update. Michael's suggestion is viable

3) Given the massive number of rows, I would update as suggested,
progressively in batches of a few tens of thousands. Make sure you
commit and vacuum after each to retrieve the space (or you may end up
with a storage space issue in addition to all the rest). In the
meantime, add a trigger to set the new column to the index value. Once
the new column is complete, drop the old column and set the new one as
primary key (it may take a while to recalculate the index).

4) If your table is still growing, I would definitely look into
partitioning as it can hardly make things worse.

5) If you're brave enough, convert your current table as a partition
(rename it to something like table_hist), duplicate the table model
under the same name as now (adjusting the primary key type) and set
the INHERITS on the primary key range. The inheritance should take
care of the type conversion (haven't tried it but it's worth a try).
If it works, you will reach your goal without downtime or significant
overhead.


Sorry, just tried this one and it failed: type mismatch.


Cheers
--
Olivier Gautherot





Re: Alter the column data type of the large data volume table.

2020-12-04 Thread Olivier Gautherot
Hi Charles,

On Fri, Dec 4, 2020 at 9:12 AM charles meng  wrote:

> What I mean is that it will take a lot of time to complete all data
> processing.I have to say that it is a good solution to adjust the column
> type without affecting online users.
>
> I found a tool on github, see the link below, unfortunately, this is for
> MySQL...
>
> https://github.com/github/gh-ost
>

MySQL has its own strategy with regard to column handling so what works
there does not necessarily fit here.

There are some good ideas in this thread but I would start with a few
checks:

1) What version of PG are you using ?

2) If you can, try on your current setup (not necessarily in Production) to
add a column typed bigint - if your version is recent enough, it will be a
simple catalog update and Michael's suggestion is viable.

3) Given the massive number of rows, I would update as suggested,
progressively in batches of a few tens of thousands. Make sure you commit
and vacuum after each batch to retrieve the space (or you may end up with a
storage space issue in addition to all the rest). In the meantime, add a
trigger to set the new column to the primary key value. Once the new column
is complete, drop the old column and set the new one as primary key (it may
take a while to recalculate the index). A sketch follows after point 5.

4) If your table is still growing, I would definitely look into
partitioning as it can hardly make things worse.

5) If you're brave enough, convert your current table as a partition
(rename it to something like table_hist), duplicate the table model under
the same name as now (adjusting the primary key type) and set the INHERITS
on the primary key range. The inheritance should take care of the type
conversion (haven't tried it but it's worth a try). If it works, you will
reach your goal without downtime or significant overhead.
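
To make point 3 concrete, here is a minimal sketch - the table and column
names are hypothetical, the batch size is arbitrary, and PostgreSQL 11+ is
needed for COMMIT inside a DO block:

-- New column, no default: a cheap catalog-only change.
ALTER TABLE big ADD COLUMN id_big bigint;

-- Keep new and updated rows in sync while the backfill runs.
CREATE OR REPLACE FUNCTION big_sync_id() RETURNS trigger AS $$
BEGIN
    NEW.id_big := NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_sync_id_trig
    BEFORE INSERT OR UPDATE ON big
    FOR EACH ROW EXECUTE FUNCTION big_sync_id();

-- Backfill in batches, committing between them so autovacuum can reclaim
-- space as you go.
DO $$
DECLARE
    n integer;
BEGIN
    LOOP
        UPDATE big
           SET id_big = id
         WHERE id IN (SELECT id FROM big
                       WHERE id_big IS NULL
                       LIMIT 50000);
        GET DIAGNOSTICS n = ROW_COUNT;
        EXIT WHEN n = 0;
        COMMIT;
    END LOOP;
END $$;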



> regards.
>
> Michael Lewis  wrote on Fri, Dec 4, 2020 at 1:04 PM:
>
>> Please do reply all.
>>
>> Not sure what you mean about it taking too much time. It's rewriting a
>> bunch of data. It's going to take a while. The best you can do is break the
>> work up into small pieces and commit each piece.
>>
>> On Thu, Dec 3, 2020, 7:11 PM charles meng  wrote:
>>
>>> Thanks for your help, I think the first method I tried (adding temporary
>>> column) is similar to what you said, but it takes too much time for me.
>>>
>>>
>>> Thanks again.
>>>
>>> Michael Lewis  wrote on Fri, Dec 4, 2020 at 1:11 AM:
>>>
>>>> On Wed, Dec 2, 2020 at 11:53 PM charles meng  wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a table with 1.6 billion records. The data type of the primary
>>>>> key column is incorrectly used as integer. I need to replace the type of
>>>>> the column with bigint. Is there any ideas for this?
>>>>>
>>>>> Solutions that have been tried:
>>>>> Adding temporary columns was too time-consuming, so I gave up.
>>>>> Using a temporary table, there is no good way to migrate the original
>>>>> table data to the temporary table
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>
>>>> You can add a new column with NO default value and null as default and
>>>> have it be very fast. Then you can gradually update rows in batches (if on
>>>> PG11+, perhaps use do script with a loop to commit after X rows) to set the
>>>> new column the same as the primary key. Lastly, in a transaction, update
>>>> any new rows where the bigint column is null, and change which column is
>>>> the primary key & drop the old one. This should keep each transaction
>>>> reasonably sized to not hold up other processes.
>>>>
>>>
Cheers
--
Olivier Gautherot




Re: vacuum vs vacuum full

2020-11-18 Thread Olivier Gautherot
On Wed, Nov 18, 2020 at 10:45 AM Ron  wrote:

> On 11/18/20 3:41 AM, Olivier Gautherot wrote:
>
> Hi Atul,
>
> On Wed, Nov 18, 2020 at 9:33 AM Atul Kumar  wrote:
>
>> Hi,
>>
>> We have a table of 3113GB, and we are planning to vacuum it in non
>> business hours i.e. 12AM to 4AM, So my queries are:
>>
> [snip]
>
>
>
> 3. Will the operation be completed in the given time frame? how to
>> check the same.
>>
>
> Given the size of the table, it will probably take several days.
>
>
> No matter how long it takes, this is an excellent argument for
> partitioning Very Large Tables: many maintenance tasks are made *much*
> easier.
>

I can only agree with this comment. The main issue I see is the available
disk space, as the partitioning process will include copying the whole
table.



Re: vacuum vs vacuum full

2020-11-18 Thread Olivier Gautherot
Hi Atul,

On Wed, Nov 18, 2020 at 9:33 AM Atul Kumar  wrote:

> Hi,
>
> We have a table of 3113GB, and we are planning to vacuum it in non
> business hours i.e. 12AM to 4AM, So my queries are:
>
> 1. What should be perform on the table Vacuum or Vacuum full ?
>

Vacuum full will do a complete rewrite of the table so you need to make
sure that you have the necessary space. I would recommend a simple
VACUUM, although it won't return the extra space to the OS.


> 2. Do we need to perform Analyze also?
>

It would be a good thing.


> 3. Will the operation be completed in the given time frame? how to
> check the same.
>

Given the size of the table, it will probably take several days.
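
You can follow its progress from another session with the
pg_stat_progress_vacuum view (available since 9.6, for plain VACUUM only) -
a minimal sketch:

-- Shows which phase the running VACUUM is in and how far it has scanned.
SELECT p.pid,
       c.relname,
       p.phase,
       round(100.0 * p.heap_blks_scanned / nullif(p.heap_blks_total, 0), 1)
           AS pct_scanned
  FROM pg_stat_progress_vacuum p
  JOIN pg_class c ON c.oid = p.relid;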


> 4. Who acquire lock on table vacuum or vacuum full.
>

VACUUM FULL acquires an exclusive lock on the table and blocks all other
access to it. Plain VACUUM does not block normal reads and writes.


> 5. If the activity goes beyond time frame, do we have any option to do
> continue doing t without acquiring lock on the table ?
>

VACUUM is a background activity. It does not block any other activity.


>
> If you also need the structure of the table, Please let me know.
>

It would be interesting to know the number of rows updated per hour or per
day to have an estimation of the needs.


> Please help me by responding my query wise.
>
> Regards,
> Atul
>

Cheers
Olivier





Re: autovacuum recommendations for Large tables

2020-11-17 Thread Olivier Gautherot
Hi Rob,

On Tue, Nov 17, 2020 at 10:24 PM Rob Sargent  wrote:

> If it helps, I put together a few thoughts and own experience on a blog:
> https://sites.google.com/gautherot.net/postgresql/vacuum
>
> Hope you find it useful.
> --
>
> That URL does not work for me (not even
> https://sites.google.com/gautherot.net)
>
>
Thanks  for the heads up - it looks like Google won't publish the site
outside of my domain.

I moved it to the old sites:
https://sites.google.com/a/gautherot.net/postgresql-tips/vacuum

Let's hope it works this time.

Cheers
Olivier




Re: autovacuum recommendations for Large tables

2020-11-17 Thread Olivier Gautherot
Hi Atul,

On Tue, Nov 17, 2020 at 12:05 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Mon, Nov 16, 2020 at 3:57 PM Atul Kumar  wrote:
>
>> I only have this one big table in the database of size 3113 GB with rows
>> 7661353111.
>>
>> Right Now the autovacuum setting for that table is set to
>>
>> {autovacuum_enabled=true,autovacuum_vacuum_scale_factor=0.2,autovacuum_analyze_scale_factor=0.2}
>>
>>>
>>>>
> auto-vacuum doesn't care directly about absolute size, it cares about
> change (relative to absolute size in many cases, hence the scale factors).
>
> David J.
>

David is correct.
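
For a table of that size, the scale factors are often replaced by absolute
thresholds, so that autovacuum does not wait for 20% of 7.6 billion rows to
change. A minimal sketch - the table name and the values are only an
illustration, not a recommendation:

ALTER TABLE my_big_table SET (
    autovacuum_vacuum_scale_factor  = 0,
    autovacuum_vacuum_threshold     = 1000000,
    autovacuum_analyze_scale_factor = 0,
    autovacuum_analyze_threshold    = 1000000
);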

If it helps, I put together a few thoughts and own experience on a blog:
https://sites.google.com/gautherot.net/postgresql/vacuum

Hope you find it useful.
--
Olivier Gautherot




Re: RITM18130676_Query_PostgreSQL support subscription

2020-10-20 Thread Olivier Gautherot
Dear Mary Jane,

On Tue, Oct 20, 2020 at 16:42, NECSG Purchasing  wrote:

> Dear Valued Supplier,
>
> May we follow up on our below inquiry please.
>

Please check with your technical teams where they downloaded the software
from. If it has been installed as a package from a Linux distribution or
directly from the Postgres web site, it is free software with a permissive
license, be it for Development or Production.

However there are some distributors who package the database engine with
their own add-ons and support - companies like EnterpriseDB for instance.
If you got the software from them, please check with their own commercial
services.

Hope it helps


> Thank you.
>
> Mary Jane Manalo
> Analyst, Vendor Master / Purchasing
> NEC Shared Service Centre
> For Internal NEC Query, please dial 500-63-51- 4004.
> -
>
> To: pgsql-gene...@postgresql.org
> From: purchasing...@necssc.com
> CC: NECSG Purchasing, SSC Purchasing Mailbox
> Date: 2020-10-16 09:47:32
> Subject: RE:RITM18130676_Query_PostgreSQL support subscription
>
> Greetings from NEC,
>
> May we ask for your confirmation about following whether is free license?
>
> PostgreSQL(9.4 - 11)
> PostgreSQL(9.4 - 11) support subscription
>
> Thank you.
>
> Mary Jane Manalo
> Analyst, Vendor Master / Purchasing
> NEC Shared Service Centre
> For Internal NEC Query, please dial 500-63-51- 4052.
>
> Ref:MSG44486374
>


Best regards
Olivier Gautherot

>


Re: "Go" (lang) standard driver

2020-08-18 Thread Olivier Gautherot
Hi Edson,

On Tue, Aug 18, 2020 at 09:36, Tony Shelver  wrote:

> -- Forwarded message -
> From: Tony Shelver 
> Date: Tue, 18 Aug 2020 at 09:33
> Subject: Re: "Go" (lang) standard driver
> To: Edson Richter 
>
>
> A quick Google search found https://github.com/lib/pq.   Has 6.1K stars
> so I would guess fairly well used.
>

I have used it successfully and it performed perfectly. Go for it.


> On Tue, 18 Aug 2020 at 05:56, Edson Richter 
> wrote:
>
>> Is there any standard PostgreSQL driver for Go language?
>> Do members of this group could share experiences with Go + PostgreSQL
>> development?
>>
>> Thanks,
>>
>> E.R.
>>
>
Olivier

>


Re: Install PostgreSQL on windows 10 home 64-bit machine

2020-05-27 Thread Olivier Gautherot
Hi Prashanth,

On Wed, May 27, 2020 at 8:59 AM Prashanth Talla 
wrote:

> Hi,
> I'm trying to install PostgreSQL database on my personal laptop that is
> running on Windows 10 Home 64-bit operating system.
>
> I found the s/w @ https://www.postgresql.org/download/windows/
>
> The webpage shows that PostgreSQL v10 (latest is v12) is available for
> Windows 10 64-bit, but it doesn't tell if its suitable for Windows 10 home
> or not.
>
> Did anyone installed this database s/w on their personal laptop that is
> running on Windows 10 home 64-bit? If so, can you please point me to the
> right version (latest) of PostgreSQL DB that I can install for practice?
>

It works fine on Windows 10 Home. If it is a personal laptop (and probably
not dedicated to the database), just make sure you don't load huge
datasets. That being said, it is fully functional.

Happy learning!
--
Olivier Gautherot


Re: Slow SELECT

2020-05-26 Thread Olivier Gautherot
Hi Frank,

On Tue, May 26, 2020 at 9:23 AM Frank Millman  wrote:

> Hi all
>
> I have a SELECT that runs over 5 times slower on PostgreSQL compared
> with Sql Server and sqlite3. I am trying to understand why.
>
> I have a table that looks like this (simplified) -
>
> CREATE TABLE my_table (
>  row_id SERIAL PRIMARY KEY,
>  deleted_id INT DEFAULT 0,
>  fld_1 INT REFERENCES table_1(row_id),
>  fld_2 INT REFERENCES table_2(row_id),
>  fld_3 INT REFERENCES table_3(row_id),
>  fld_4 INT REFERENCES table_4(row_id),
>  tran_date DATE,
>  tran_total DEC(21,2)
>  );
>
> CREATE UNIQUE INDEX my_table_ndx ON my_table (fld_1, fld_2, fld_3,
> fld_4, tran_date) WHERE deleted_id = 0;
>
> The table sizes are -
>  my_table : 167 rows
>  table_1 : 21 rows
>  table_2 : 11 rows
>  table_3 : 3 rows
>  table_4 : 16 rows
>
> Therefore for each tran_date in my_table there are potentially
> 21x11x3x16 = 11088 rows. Most will be null.
>
> I want to select the row_id for the last tran_date for each of those
> potential groups. This is my select -
>
>  SELECT (
>  SELECT a.row_id FROM my_table a
>  WHERE a.fld_1 = b.row_id
>  AND a.fld_2 = c.row_id
>  AND a.fld_3 = d.row_id
>  AND a.fld_4 = e.row_id
>  AND a.deleted_id = 0
>  ORDER BY a.tran_date DESC LIMIT 1
>  )
>  FROM table_1 b, table_2 c, table_3 d, table_4 e
>
> Out of 11088 rows selected, 103 are not null.
>
> On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on
> sqlite3, and 0.31 sec on PostgreSQL.
>

SQL Server does a good job at caching data in memory. PostgreSQL does too
on consecutive calls to the same table. What execution time do you get if
you issue the query a second time?

My first guess would be to add an index on my_table.tran_date and check in
EXPLAIN that you don't have a SEQUENTIAL SCAN on that table.
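
A minimal sketch of that first guess, before re-running EXPLAIN on the
query:

CREATE INDEX my_table_tran_date_idx ON my_table (tran_date);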


> I have looked at the EXPLAIN, but I don't really know what to look for.
> I can supply it if that would help.
>
> Thanks for any advice.
>
> Frank Millman
>
>
--
Olivier Gautherot


Re: Column reset all values

2020-05-13 Thread Olivier Gautherot
Hi Otar,

On Wed, May 13, 2020 at 10:15 AM otar shavadze  wrote:

> postgres version 12
> I have very simple update query, like this:
>
> update my_table
> set
> col = NULL
> where
> col IS NOT NULL;
>
> my_table contains few million rows, col is indexed column
>
> Fastest way would be   alter table, drop column and then add column again,
> but I can't do this just because on this column   depends bunch of views
> and materialized views.
>
> No much hope, but still asking, Is there some another way to just reset
> column all values? ( with NULL in my case)
>

If views depend on this column, you may need to drop them (in the right
order...) and then recreate them. Now, if they depend on a column that will
not contain significant data, you may wish to remove the column, or declare
it as null if you need to maintain compatibility.

Now, if you have time and downtime of the database is an issue, you may
run the UPDATE in lots of 1000 rows (or whatever number fits you). An
UPDATE is essentially an INSERT/DELETE/VACUUM sequence and this copying
around is the killer - doing it in one go can temporarily increase the disk
usage. I've had success with the following pseudo code:

SELECT rowid FROM mytable WHERE col IS NOT NULL

and fed the result to something like:

FOR chunk IN chunk_in_1000_rows(query_result)
DO
BEGIN
UPDATE my_table SET col = NULL WHERE rowid IN chunk
COMMIT
SLEEP(5)
DONE

You may wish to run a VACUUM FULL manually at the end.
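
For what it's worth, a runnable sketch of that pseudo code in PL/pgSQL
(PostgreSQL 11+ is needed for COMMIT inside a DO block; the 1000-row batch
and the 5-second pause are the ones from the pseudo code above):

DO $$
DECLARE
    n integer;
BEGIN
    LOOP
        WITH batch AS (
            SELECT ctid FROM my_table
             WHERE col IS NOT NULL
             LIMIT 1000
        )
        UPDATE my_table t
           SET col = NULL
          FROM batch
         WHERE t.ctid = batch.ctid;
        GET DIAGNOSTICS n = ROW_COUNT;
        EXIT WHEN n = 0;
        COMMIT;              -- release locks, let autovacuum reclaim space
        PERFORM pg_sleep(5); -- breathing room for concurrent activity
    END LOOP;
END $$;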

In my case, I had to compute individual numbers so the processing was a bit
more complex but it happily processed over 60 millions rows in a few days.

Hope it helps
--
Olivier Gautherot


Re: relationship of backend_start, query_start, state_change

2020-04-23 Thread Olivier Gautherot
Hi David,

On Thu, Apr 23, 2020 at 6:55 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Apr 23, 2020 at 9:37 AM Si Chen 
> wrote:
>
>> Hello,
>>
>> I'm looking at my pg_stat_activity and trying to figure out what is
>> causing some of these processes.  I'm using this query:
>>
>> SELECT pid, wait_event, state_change, backend_start, xact_start,
>> query_start, state_change - query_start, query from pg_stat_activity where
>> datname= 'my_database' and state in ('idle', 'idle in transaction', 'idle
>> in transaction (aborted)', 'disabled');
>>
>>
> Including the "state" field should clear things up considerably.
>
>
> https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW
>
>
> David J.
>

The transactions are idle, they are filtered in the WHERE statement.

--
Olivier Gautherot


Re: relationship of backend_start, query_start, state_change

2020-04-23 Thread Olivier Gautherot
On Thu, Apr 23, 2020 at 6:37 PM Si Chen 
wrote:

> Hello,
>
> I'm looking at my pg_stat_activity and trying to figure out what is
> causing some of these processes.  I'm using this query:
>
> SELECT pid, wait_event, state_change, backend_start, xact_start,
> query_start, state_change - query_start, query from pg_stat_activity where
> datname= 'my_database' and state in ('idle', 'idle in transaction', 'idle
> in transaction (aborted)', 'disabled');
>
> and what I see is the same PID which have been around for a while in the
> same query (COMMIT), but the query_start and state_change are updated and
> close to each other:
>
>   pid  | wait_event | state_change  | backend_start
> | xact_start |  query_start  |?column? |
> query
>
>
> ---++---+---++---+-+
>
>  32506 | ClientRead | 2020-04-23 09:29:05.6793-07   | 2020-04-23
> 01:00:19.612478-07 || 2020-04-23 09:29:05.679275-07 |
> 00:00:00.25 | COMMIT
>
>  32506 | ClientRead | 2020-04-23 09:30:33.247119-07 | 2020-04-23
> 01:00:19.612478-07 |   | 2020-04-23
> 09:30:33.247109-07 | 00:00:00.1  | COMMIT
>
>  32506 | ClientRead | 2020-04-23 09:31:31.506914-07 | 2020-04-23
> 01:00:19.612478-07 |   | 2020-04-23
> 09:31:31.506905-07 | 00:00:00.09 | COMMIT
>
>  32506 | ClientRead | 2020-04-23 09:32:32.06656-07  | 2020-04-23
> 01:00:19.612478-07 || 2020-04-23 09:32:32.066552-07 |
> 00:00:00.08 | COMMIT
>
>  32506 | ClientRead | 2020-04-23 09:36:51.579939-07 | 2020-04-23
> 01:00:19.612478-07 || 2020-04-23 09:36:51.579931-07 |
> 00:00:00.08 | COMMIT
>
> So can we close this thread because it's been around since 1 AM and the
> last query is always "COMMIT"?  Or should we keep it open because the
> query_start keeps updating, and the state_change is basically keeping up
> with query_start?
>
> -
> Si Chen
> Open Source Strategies, Inc.
>
> Our Mission: https://www.youtube.com/watch?v=Uc7lmvnuJHY
>
>
There is no significant harm in having this thread. Check why the client is
not disconnecting - it may have a good reason.

--
Olivier Gautherot
Tel: +33 6 02 71 92 23
https://www.linkedin.com/in/ogautherot/


Re: possibilities for SQL optimization

2020-04-16 Thread Olivier Gautherot
 Index Cond: (("pixelId" >= '10729566502912'::bigint) AND
>>> ("pixelId" <= '10729570697215'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..4.72 rows=15 width=0) (actual time=0.002..0.002 rows=0 loops=1)
>>>Index Cond: (("pixelId" >= '10729570959360'::bigint) AND
>>> ("pixelId" <= '10729571090431'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..4.67 rows=10 width=0) (actual time=0.002..0.003 rows=8 loops=1)
>>>Index Cond: (("pixelId" >= '10729571155968'::bigint) AND
>>> ("pixelId" <= '10729571221503'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..5.65 rows=108 width=0) (actual time=0.018..0.018 rows=108
>>> loops=1)
>>>Index Cond: (("pixelId" >= '10729571745792'::bigint) AND
>>> ("pixelId" <= '10729573056511'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..5.39 rows=82 width=0) (actual time=0.010..0.010 rows=105
>>> loops=1)
>>>Index Cond: (("pixelId" >= '10729573122048'::bigint) AND
>>> ("pixelId" <= '10729574105087'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..19.80 rows=723 width=0) (actual time=0.063..0.063 rows=674
>>> loops=1)
>>>Index Cond: (("pixelId" >= '10729574170624'::bigint) AND
>>> ("pixelId" <= '10729583280127'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..4.67 rows=10 width=0) (actual time=0.008..0.008 rows=1 loops=1)
>>>Index Cond: (("pixelId" >= '10729601105920'::bigint) AND
>>> ("pixelId" <= '10729601171455'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..11.92 rows=335 width=0) (actual time=0.074..0.074 rows=343
>>> loops=1)
>>>Index Cond: (("pixelId" >= '10729604251648'::bigint) AND
>>> ("pixelId" <= '10729608445951'::bigint))
>>>  ->  Bitmap Index Scan on "IDX_DiaSource_htmId20"
>>>  (cost=0.00..4.72 rows=15 width=0) (actual time=0.009..0.009 rows=16
>>> loops=1)
>>>Index Cond: ((&quo

Re: Recursive Queries

2020-04-16 Thread Olivier Gautherot
Hi Alex,


On Thu, Apr 16, 2020 at 10:36 AM Alex Magnum  wrote:

> Hi,
> I have a simple table with singup timestamps
>
> What I would like to do is to create a table as shown below that displays
> the counts per our for the past n dates.
>
> I can do this with a function but is there an easy way to use recursive
> queries?
>
>
> * Counts per hour for given date*
>
> *HR   2020-04-01  2020-04-02  ... 2020-04-10*
> 00  38   33  36
> 01  33   26  18
> 02  26   36  17
> 03  36   18  10
> 04  18   17   3
> 05  17   10   3
> 06  103   6
> 07   33  10
> .36  13
> .6   10  22
> .   10   13  12
> 22  13   22   9
> 23  22   11   8
>
> Thanks for any suggestions.
> A
>

You don't need recursive queries for this. A CASE expression can help you
here (a bit tedious to write but fast to run):

WITH q AS (select date_part('hour', ts) AS hr, ts::date AS mydate FROM
your_table_or_query)
SELECT hr,
sum(CASE WHEN mydate = '2020-04-01'::date THEN 1 ELSE 0 END),
sum(CASE WHEN mydate = '2020-04-02'::date THEN 1 ELSE 0 END),
...
FROM q GROUP BY hr ORDER BY hr;


Hope it helps
--
Olivier Gautherot
Tel: +33 6 02 71 92 23
https://www.linkedin.com/in/ogautherot/


Re: Using Oracle SQL Client commands with PSQL 12.2 DB

2020-04-13 Thread Olivier Gautherot
Hi Fred,

On Mon, Apr 13, 2020 at 21:49, Fred Richard  wrote:

> PGSQLCommunities,
>
>
> We migrated Oracle 11.x Database to PostgreSQL 12.x Database on a RH Linux
> 7.x server.
> On a different RH Linux 7.x Server, I have Oracle Client installed.  Since
> we have many scripts developed in Oracle SQL, is it possible for the
> PostgreSQL 12.x DB to process the Oracle Scripts?  Are there utilities or
> drivers that could be installed on the PostgreSQL 12.x Database or server
> for processing the Oracle SQL client commands?  We are trying to avoid
> updating our Oracle Client scripts on remote servers.
>
> Thanks
> Fred
>

I removed the hackers' list as this is a standard question. The real
question is how far your scripts stray from standard SQL. If you stuck to
the standard, it shouldn't take a big effort. If you didn't, the case
should be evaluated more carefully and you may face a bigger challenge.

A first evaluation would be to fire the requests and see how much breaks.

You can also look at
http://www.enterprisedb.com/enterprise-postgres/database-compatibility-oracle
as it seems to be a (possibly commercial) valid answer.

Orafce could be another one  https://github.com/orafce/orafce

I haven't tried these so I can't confirm how mature or applicable they are
to your problem.

Hope it helps
Olivier

>


Re: Using compression on TCP transfer

2020-04-05 Thread Olivier Gautherot
Hi Andrus,

On Sat, Apr 4, 2020 at 10:09, Andrus  wrote:

> Hi!
>
> >> In case of varchar field values will appear in database sometimes with
> >> trailing spaces and sometimes without.
> >> This requires major application re-design which much is more expensive
> than
> >> continuing using char fields.
> >A simple BEFORE INSERT OR UPDATE trigger would take care of that.
>
> Changing char to varchar will break commands where trailing space is used
> in comparison.
>
> For example query
>
> create table test ( test char(10) );
> insert into test values ('test');
> select * from test where test ='test '; -- note trailing space
>
> does not return data anymore if your recommendation is used:
>
> create table test ( test varchar );
> insert into test values ('test');
> select * from test where test ='test '; -- note trailing space
>
> In production 'test ' is query parameter coming from application with
> possible trailing space(s).
>
> Adding trigger does not fix this.
> How to fix this without re-writing huge number of sql commands?
>

In the end, your question is more at the application level than the
database itself. The real question is: which one is correct? With or
without the trailing space?

If you decide that it's without, you could apply a TRIM in a trigger on
each INSERT and UPDATE. Then, you could replace the table by a view of the
same name and implement the TRIM on SELECT there. This way, you don't have
to touch anything in the application.
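
A minimal sketch of that idea, reusing the names from your example and
assuming the column has been migrated to varchar (writes through the view
would additionally need INSTEAD OF triggers):

-- Keep the application-facing name "test" as a view and trim on both paths.
ALTER TABLE test RENAME TO test_data;

CREATE OR REPLACE FUNCTION trim_test() RETURNS trigger AS $$
BEGIN
    NEW.test := rtrim(NEW.test);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- EXECUTE FUNCTION needs PG 11+; use EXECUTE PROCEDURE on older versions.
CREATE TRIGGER trim_test_trig
    BEFORE INSERT OR UPDATE ON test_data
    FOR EACH ROW EXECUTE FUNCTION trim_test();

-- Reads through the old name come back trimmed.
CREATE VIEW test AS SELECT rtrim(test) AS test FROM test_data;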

Hope it helps
Olivier

>


Re: Using compression on TCP transfer

2020-03-31 Thread Olivier Gautherot
Sorry, forgot the reply all :-)

On Tue, Mar 31, 2020 at 11:39 AM Andrus  wrote:

> Hi,
>
> >See the section about sslcompression in
> https://www.postgresql.org/docs/9.2/libpq-connect.html. It should be your
> answer.
>
> I added
>
> sslcompression=1
>
> to psqlodbc connection string but log file shows that connection is still
> uncompressed:
>
> LOG:  connection authorized: user=me database=mydb SSL enabled
> (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, bits=256, compression=off)
>
> Maybe because according to TLSv1.3 standard, compression is no more
> allowed or psqlodbc does not pass it to pglib.
>
> How to compress ?
>
> Andrus.
>

Updated doc reference:
https://www.postgresql.org/docs/12/libpq-connect.html

They mention that compression is insecure and disabled by default. Taking
this into account, compression will require both ODBC and PostgreSQL to be
set up with compression enabled. I could not quickly figure out whether
this also requires recompiling the code...

--
Olivier Gautherot
Tel: +33 6 02 71 92 23
https://www.linkedin.com/in/ogautherot/


Re: ESQL/C FETCH of CHAR data delivers to much data for UTF-8

2020-01-10 Thread Olivier Gautherot
Hi Matthias,

On Thu, Jan 9, 2020, 20:21 Matthias Apitz  wrote:

> Hello,
>
> We encounter the following problem with ESQL/C: Imagine a table with two
> columns: CHAR(16) and DATE
>
> The CHAR column can contain not only 16 bytes, but 16 Unicode chars,
> which are longer than 16 bytes if one or more of the chars is a UTF-8
> multibyte
> encoded.
>
> If one provides in C a host structure to FETCH the data as:
>
> EXEC SQL BEGIN DECLARE SECTION;
> struct  r_d02ben_ec {
> charstring[17];
> chardate[11];
> };
> typedef struct r_d02ben_ec t_d02ben_ec;
> t_d02ben_ec *hp_d02ben, hrec_d02ben;
> EXEC SQL END DECLARE SECTION;
>
> and fetches the data with ESQL/C as:
>
> EXEC SQL FETCH hc_d02ben INTO :hrec_d02ben;
>
> The generated C-code looks like this:
>
> ...
> ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "fetch hc_d02ben",
> ECPGt_EOIT,
> ECPGt_char,&(hrec_d02ben.string),(long)17,(long)1,sizeof( struct
> r_d02ben_ec ),
> ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L,
> ECPGt_char,&(hrec_d02ben.date),(long)11,(long)1,sizeof( struct
> r_d02ben_ec ),
> ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L,
> ...
>
> As you can see for the first item the length 17 is sent to the PG server
> together with the pointer to where the data should be stored
> and for the second element the length 11 is sent (which is big enough to
> receive in ASCII MM.DD. and a trailing \0).
>
> What we now see using GDB is that for the first element all UTF-8 data
> is returned, lets asume only one multibyte char, which gives 17 bytes,
> not only 16, and the trailing NULL is already placed into the element for
> the date. Now the function ECPGdo() returns the date as MM.DD.
> into the area pointed to for the 2nd element and with this overwrites
> the NULL terminator of the string[17] element. Result is later a
> SIGSEGV because the expected string in string[17] is not NULL
> terminated anymore :-)
>
> I would call it a bug, that ECPGdo() puts more than 17 bytes (16 bytes +
> NULL) as return into the place pointed to by the host var pointer when
> the column in the database has more (UTF-8) chars as will fit into
> 16+1 byte.
>
> Comments?
> Proposals for a solution?
>
> Thanks
>
> matthias
>
>
> --
> Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/
> +49-176-38902045
> Public GnuPG key: http://www.unixarea.de/key.pub
>


I would be cautious about calling this a bug, as it is a classical buffer
overflow (i.e. design) issue: if you have UTF-8 characters, your text is no
longer 16 bytes long and you should plan extra space in your variables.


Re: Upgrade PostgreSQL 9.6 to 10.6

2020-01-10 Thread Olivier Gautherot
On Thu, Jan 9, 2020, 21:47 github kran  wrote:

>
>
> On Wed, Jan 8, 2020 at 11:03 PM Michael Lewis  wrote:
>
>> On Wed, Jan 8, 2020 at 8:52 PM github kran  wrote:
>>
>>> You are right on RDS but I believe the problem is on Aurora PostgreSQL
>>> where the pglogical throws an error during installation. Are you aware if
>>> this works on  Aurora PostGreSQL
>>>
>>
>> It seems like this question should be sent to AWS support for Aurora if
>> you are wanting to upgrade an Aurora 9.6 instance to 10x also in Aurora.
>>
>
> Yes you are right on that I dont believe in AWS support group but rather I
> feel this community has more experience and have more knowledgeable folks.
> I reached out to AWS support and got a reply as 'NO' ( This is not
> supported from Aurora PostgreSQL)
>

As far as I understand, Aurora is a "PostgreSQL-compatible" product. RDS is
a PostgreSQL database engine with limitations imposed by AWS. This group
will help you on standard installations, can guide you on standard features
but will send you to AWS when the questions come down to proprietary
features or configurations.


>


Re: READ UNCOMMITTED in postgres

2019-12-19 Thread Olivier Gautherot
On Thu, Dec 19, 2019 at 9:20 AM Simon Riggs  wrote:

> On Wed, 18 Dec 2019 at 23:13, Matthew Phillips 
> wrote:
>
>
>> With the current READ UNCOMMITTED discussion happening on pgsql-hackers
>> [1], It did raise a question/use-case I recently encountered and could not
>> find a satisfactory solution for. If someone is attempting to poll for new
>> records on a high insert volume table that has a monotonically increasing
>> id, what is the best way to do it? As is, with a naive implementation, rows
>> are not guaranteed to appear in monotonic order; so if you were to keep a
>> $MAX_ID, and SELECT WHERE p_id > $MAX_ID, you would hit gaps. Is there a
>> clean way to do this? I've seen READ UNCOMMITTED used for this with DB2.
>>
>
If READ UNCOMMITTED returns data belonging to transactions in progress,
there is a risk that you consider data that will end up in a ROLLBACK.


>
> Not sure it helps much. The new records aren't truly there until commit.
>

True. And to make things worse, the timestamp (probably an invocation of
now()) will record the beginning of the transaction. So if your transaction
takes a few seconds, or does not always take the same time, you will face a
challenge.


> Using max_id alone is not an effective technique. It's just an
> optimization.
>

I would recommend managing p_id with a sequence... as long as you're not
in multi-master (you will find out that each master handles its own set of
values and you could end up with some surprises). Doing it with MAX(p_id) +
1 is asking for concurrency problems.


> Just be careful to not advance max_id too quickly, and remember which ones
> you've already checked. Or wait for the next monontonic value each time,
> accepting the lag.
>

Again, as long as you can ensure that there won't be any ROLLBACK.
Otherwise you could end up waiting for ever...


>
> --
> Simon Riggs    http://www.2ndQuadrant.com/
> <http://www.2ndquadrant.com/>
> PostgreSQL Solutions for the Enterprise
>


--
Olivier Gautherot
Tel: +33 6 02 71 92 23
https://www.linkedin.com/in/ogautherot/


Re: How to get column and identifier names in UPPERCASE in postgres?

2019-11-29 Thread Olivier Gautherot
Hi Amine,

On Wed, Nov 20, 2019 at 10:11, Amine Tengilimoglu <
aminetengilimo...@gmail.com> wrote:

> Hi all;
>
>   I want to get the column and other identifier names in UPPERCASE
>  form rather than a lowercase one without changing application code like
> qouting the identifiers. Do you know any settings  function or extention
> for this purpose?
>

Use double quotes (") - for instance "UPPERCASE"


> thank you.
>


pg_logical for RHEL 8

2019-11-08 Thread Olivier Gautherot
Hi folks,

We are installing a server with RHEL 8. Installing PG 11 was child's play,
but pg_logical is apparently not available. We followed the instructions
at https://dl.2ndquadrant.com/default/release/site/ but yum does not find
the packages.

Following this documentation, I eventually found a link to the repository:
https://dl.2ndquadrant.com/default/release/browse/rpm/

and realised that RHEL 8 is not there yet.

As a workaround, we installed the release for RHEL 7, which seems to work.

Does someone have a date for the official release of the support for RHEL8?


Thanks in advance
--
Olivier Gautherot
https://www.linkedin.com/in/ogautherot/


Re: Monitor Postgres database status on Docker

2019-09-27 Thread Olivier Gautherot
Hi Daulat,

On Tue, Sep 24, 2019 at 07:05, Daulat Ram <
daulat@exponential.com> wrote:

> Thanks but how we can use it for docker container.
>

You have basically 2 ways:

1) Publish the port 5432 on the container and access it from the host, or

2) Use "docker exec" to run the commands natively inside the container.



> Regards,
>
> Daulat
>
>
>
> *From:* Fan Liu 
> *Sent:* Tuesday, September 24, 2019 3:02 PM
> *To:* Daulat Ram ;
> pgsql-general@lists.postgresql.org
> *Subject:* RE: Monitor Postgres database status on Docker
>
>
>
> Hi,
>
>
>
> I am not from PostgreSQL team.
>
> Just let you know that when we run PostgreSQL in Kubernetes, we use below
> command for liveness check.
>
>
>
> pg_isready --host localhost -p $PG_PORT -U $PATRONI_SUPERUSER_USERNAME
>
>
>
>
>
> BRs,
>
> Fan Liu
>
>
>
>
>
> *From:* Daulat Ram 
> *Sent:* Tuesday, September 24, 2019 5:18 PM
> *To:* pgsql-general@lists.postgresql.org
> *Subject:* Monitor Postgres database status on Docker
>
>
>
> Hi team,
>
>
>
> We want to check the postgres database status on docker container just
> like we monitor Postgres (up  / down) via /etc/init.d/postgresql status
>
>
>
> But I am not sure how we can do that with docker.
>
>
>
> Thanks,
>
> Daulat
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: FW: Re: FW: Re: Shouldn't this trigger be called?

2019-09-15 Thread Olivier Gautherot
Hi Stan,

On Sun, Sep 15, 2019 at 2:47 PM stan  wrote:

> Forgot to cc the list again. Have to look at settings in mutt.
>
> > > >
> > > > Sorry forgot to cc the list
> > > >
> > > > On Sun, Sep 15, 2019 at 08:42:02AM -0700, Adrian Klaver wrote:
> > > > > On 9/15/19 8:33 AM, stan wrote:
> > > > > > I have defined this function:
> > > > > >
> > > > > > CREATE FUNCTION fix_customer_types_case()
> > > > > >
> > > > > > and this trigger:
> > > > > >
> > > > > > CREATE TRIGGER fix_customer_types_case_trig BEFORE INSERT OR
> UPDATE ON customer
> > > > > > FOR EACH ROW EXECUTE FUNCTION fix_customer_types_case();
> > > > > >
> > > > > > and I put a RAISE NOTICE so I can tell if the function is
> called. Yet when I
> > > > > > do a :
> > > > > >
> > > > > > \copy to bring data into this table, I do not see the notice.
> > > > >
> > > > > What is the actual command you are using?
> > > >
> > > >
> > > >   \COPY customer(name, location, status , c_type , bill_attention ,
> bill_addresse , bill_address_1 , bill_address_2 , bill_city , bill_state ,
> bill_country , ship_attention , ship_addresse , ship_address_1 ,
> ship_address_2, ship_city ,ship_state  ) from
> '/home/stan/pm_db/live_data/ready/customer.csv'  DELIMITER ','  CSV HEADER ;
> > > >
> > > > and here is the function
> > > >
> > > > CREATE FUNCTION fix_customer_types_case()
> > > > RETURNS trigger AS $$
> > > > BEGIN
> > > >  if NEW.c_type  IS NOT NULL
> > > >   THEN
> > > >   NEW.c_type := upper(cast( NEW.c_type AS TEXT));
> > > >   END IF ;
> > > >  if NEW.status  IS NOT NULL
> > > >   THEN
> > > >   RAISE NOTICE 'Called With %', NEW.status;
> > > >   NEW.status := upper(cast( NEW.status AS TEXT));
> > > >   END IF ;
> > > >   RAISE NOTICE 'Left With With %', NEW.status;
> > > >   RAISE NOTICE 'Left With With %', NEW.c_type;
> > > >  return NEW;
> > > > END;
> > > > $$
> > > > LANGUAGE PLPGSQL;
> > > >
> > > > if I do an insert this function is called. However it IS NOT called
> for the
> > > > above copy command. How can I fix that?
> > >
> > > I thought you said it was fixed now.
> > >
> > I discovered that the function was not getting defined, and fixed that.
> Then I
> > rashly posted to the list that it was fixed, as i was certain that was
> the
> > only issue. But after I reported that, I tried testing, with he results
> in
> > this email.
> >
> > Works for INSERT, but does not fire on this \copy command.
> >
> More interesting data. I used vi to correct the incorrect case in the CSV
> file
> being imported, and re-ran the \copy command. At this point in time, I did
> see the messages from notice. I deleted the rows, re-edited back to the
> incorrect case in the csv file, and the import ((\copy) failed.
>
> So, my test tell me that the validity check is done BEFORE an attempt to
> insert (thus firing the trigger) occurs.
>
> Interesting, but not helpful for my application.
>
> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
> -- Benjamin Franklin
>

 Maybe you could try awk on your input:
https://thomas-cokelaer.info/blog/2018/01/awk-convert-into-lower-or-upper-cases/




Re: Handling case variations on a user defined type?

2019-09-15 Thread Olivier Gautherot
On Sun, Sep 15, 2019 at 09:37, stan  wrote:

> I have several enumerated types that look like:
>
> CREATE TYPE confidence_level AS ENUM ('HIGH' ,
> 'MEDIUM' ,
> 'LOW' ,
>  'WAG');
>
>
> I have someone creating data for this project in spreadsheets, and then am
> importing the data using \copy.
>
> Problem is the data in the spreadsheets is entered in mixed case.
>
> Is there a way that I can use the type itself, to enforce something like a
> to_upper(). Or is there a better way to  "human proof" this?
>
> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
> -- Benjamin Franklin
>

What about a trigger on INSERT, which would enforce the case?

>


Re: backing up the data from a single table?

2019-09-13 Thread Olivier Gautherot
Hi Stan,

On Fri, Sep 13, 2019 at 11:14, stan  wrote:

> My development methodology is to create scripts that init the db, and load
> test data, as I make changes.
>
> Now, I am starting to move toward a test instance with real data. The end
> user has provided "real" test data, n spreadsheets. I have the inputing of
> data from these working OK. (takes a bit of hand work). What I would like
> to do, is  continue to init the "test" database. To make that easy, I would
> like to be able to "export" individual tables, so I can use the scripts to
> reload them. I prefer not to have to convert these to my load script
> format.
>
> Is there a way to "export" a single table, that can be easily re
> "imported"?
>
>
> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
> -- Benjamin Franklin
>

It may be a strange suggestion but did you consider containers like Docker?
It's been a great help on a 100GB+ test database. Successful changes were
committed, failed changes were rolled back by creating a new instance based
on the same image (with data preloaded). It takes seconds to do a rollback.
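
As for the single-table export itself, plain \copy does the job; a minimal
sketch, with hypothetical table and file names (pg_dump -t tablename is
another option):

-- Dump one table to a flat file from psql
\copy parts TO 'parts.copy'

-- In the init script: empty the table, then load it back
TRUNCATE parts;
\copy parts FROM 'parts.copy'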

Olivier

>


Re: pgmodeler ?

2019-09-01 Thread Olivier Gautherot
On Sun, Sep 1, 2019 at 6:35 AM stan  wrote:

> On Sun, Sep 01, 2019 at 06:30:23AM -0400, stan wrote:
> > I apologize, as this is a bit off topic for this list.
> >
> > pgmodeler got mentioned in a thread that I started a day or two ago, and
> it looks
> > like it might be very useful.
> >
> > I am running on Ubuntu 18.04 (latest STABLE), and I found that there is
> a package,
> > which I downloaded. The idea is to import the project I am working on.
> >
> > But, when I try to import it I get an error message about:
> >
> > It says that the column pr.proisagg does not exist.
> >
> > I don't want to spend a lot of time trying to get this package to work,
> if it is a
> > dead end. I am using Postgres 11 if that matters.
> >
> > Has anyone gotten this to work? If so, can anyone point me to a quick
> HOWTO to
> > import my existing project?
> >
> This is the specific error message:
>
> Could not execute the SQL command.
> Message returned: ERROR: column pr.proisagg does not exist
> LINE 1: ...namespace AS ns ON pr.pronamespace = ns.oid WHERE pr.proisag...
> ^
> HINT: Perhaps you meant to reference the column
>
> --
> "They that would give up essential liberty for temporary safety deserve
> neither liberty nor safety."
> -- Benjamin Franklin
>

Can you confirm the versions of pgModeler and PG? pgModeler 0.91 does not
fully support PG 11.x and there is an update in preparation in 0.92 which
should. You may want to give it a try.
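
For reference, the error itself comes from a catalog change: PostgreSQL 11
removed pg_proc.proisagg (and proiswindow) in favour of a single prokind
column, so any tool still filtering on proisagg fails exactly like this.
The queries below only illustrate the change, they are not pgModeler's
actual code:

-- Pre-11 style: exclude aggregates
SELECT pr.proname, ns.nspname
FROM pg_proc AS pr
JOIN pg_namespace AS ns ON pr.pronamespace = ns.oid
WHERE NOT pr.proisagg;      -- ERROR on PG 11: column does not exist

-- PG 11 and later: prokind is 'f' (function), 'p' (procedure),
-- 'a' (aggregate) or 'w' (window function)
SELECT pr.proname, ns.nspname
FROM pg_proc AS pr
JOIN pg_namespace AS ns ON pr.pronamespace = ns.oid
WHERE pr.prokind <> 'a';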

Cheers
--
Olivier Gautherot
https://www.linkedin.com/in/ogautherot/



Re: Code of Conduct plan

2018-09-15 Thread Olivier Gautherot
Dear all,

On Fri, Sep 14, 2018 at 5:18 PM Tom Lane  wrote:

> Robert Haas  writes:
> > It's not clear to me that there IS a general consensus here.  It looks
> > to me like the unelected core team got together and decided to impose
> > a vaguely-worded code of conduct on a vaguely-defined group of people
> > covering not only their work on PostgreSQL but also their entire life.
>
> There's been quite a lot of input, from quite a lot of people, dating
> back at least as far as a well-attended session at PGCon 2016.  I find
> it quite upsetting to hear accusations that core is imposing this out
> of nowhere.  From my perspective, we're responding to a real need
> voiced by other people, not so much by us.
>
> > However, I also don't think it matters very much.
>
> Yeah, this.  The PG community is mostly nice people, AFAICT.  I'll be
> astonished (and worried) if the CoC committee finds much to do.  We're
> implementing this mostly to make newcomers to the project feel that
> it's a safe space.
>
> It's also worth reminding people that this is v1.0 of the CoC document.
> We plan to revisit it in a year or so, and thereafter as needed, to
> improve anything that's causing problems or not working well.
>
> regards, tom lane
>

I must admit that I'm impressed by the huge amount of contributions to this
thread and, to be honest, it is the only one I have witnessed that would
have deserved a CoC. I had a quick look at the proposal and it sounds to me
like the team is trying to handle excesses - as long as no one complains, I
would bet that they won't even chime in.

One thing to keep in mind is this simple definition: "One person's freedom
ends where another's begins" and all the work should go in this direction.
We are all different, have different sensitivities, come from different
cultures where we interpret words in a different way - it's a given, no way
to escape. But we have in common the love of a great piece of software
provided by a very active and efficient community.

Why don't we focus on what unites us, instead of what creates divisions?

Have a peaceful week-end
Olivier


Re: unorthodox use of PG for a customer

2018-08-27 Thread Olivier Gautherot
Hi Dave, lots of nice inputs but let's try to summarize your user's needs.

On Fri, Aug 24, 2018 at 3:18 PM, David Gauthier 
wrote:

> [...]
>
> He's writing a script/program that runs on a workstation and needs to
> write data to a DB.  This process also sends work to a batch system on a
> server farm external to the workstation that will create multiple, parallel
> jobs/processes that also have to write to the DB as well. The workstation
> may have many of these jobs running at the same time.  And there are 58
> workstations which all have/use locally mounted disks for this work.
>

Do the workstations work independently or do they need to share
information? If they share information, does it need to be ACID or is
"eventually consistent" enough? Also, what is the size of the database? A
local database is viable as long as it does not overload the workstation
but it you have many instances of the job per workstation and 58 of them, I
would rely on an optimized central DB.

Are the jobs mainly INSERT or a mix of INSERT and SELECT/UPDATE?

Things to consider: amount of RAM required, CPU, disk I/O... The cost
factor should be considered.

At first blush, this is easy.  Just create a DB on a server and have all
> those clients work with it.  But he's also adamant about having the DB on
> the same server(s) that ran the script AND on the locally mounted disk.  He
> said he doesn't want the overhead, dependencies and worries of anything
> like an external DB with a DBA, etc... . He also wants this to be fast.
>

If he insists on having a local copy of the database, suggest PG10 with
logical replication - ideally the bidirectional replication. Once he
realizes that every INSERT causes 57 replications and this will most likely
kill the network, he may reconsider the requirement... Queries to a central
DB are likely to cause less stress and will be easier to handle.

Now, if speed is critical, you may also consider an in-memory DB like
Redis. There are schemes running in multi-master.

My first thought was SQLite.  Apparently, they now have some sort of
> multiple, concurrent write ability.  But there's no way those batch jobs on
> remote machines are going to be able to get at the locally mounted disk on
> the workstation. So I dismissed that idea.  Then I thought about having 58
> PG installs, one per workstation, each serving all the jobs pertaining to
> that workstation.  That could work.  But 58 DB instances ?  If he didn't
> like the ideal of one DBA, 58 can't be good.  Still, the DB would be on the
> workstation which seems to be what he wants.
>

A distributed database is likely to cause failures at some point if not
handled properly - and it sounds like there won't be any DBA at all. I
would be extremely cautious in this case.


Re: Safe operations?

2018-08-12 Thread Olivier Gautherot
On Sun, Aug 12, 2018 at 11:06 PM, Tim Cross  wrote:

>
> On Mon, 13 Aug 2018 at 11:24, Adrian Klaver 
> wrote:
>
>> On 08/12/2018 05:41 PM, Samuel Williams wrote:
>> > I wish the documentation would include performance details, i.e. this
>> > operation is O(N) or O(1) relative to the number of rows.
>> >
>> > I found renaming a table was okay.
>> >
>> > How about renaming a column? Is it O(1) or proportional to the amount
>> of
>> > data?
>> >
>> > Is there any documentation about this?
>>
>> https://www.postgresql.org/docs/10/static/sql-altertable.html
>>
>> "RENAME
>>
>>  The RENAME forms change the name of a table (or an index, sequence,
>> view, materialized view, or foreign table), the name of an individual
>> column in a table, or the name of a constraint of the table. There is no
>> effect on the stored data.
>> "
>>
>> Just wondering - what about the case when the column being renamed is
>> also referenced in an index or check constraint? (I would guess you cannot
>> rename a column used in a check constraint without first removing it, but
>> for an index, would this result in the index being rebuilt (or do you have
>> to take care of that manually or are such references abstracted such that
>> the column name "text" is irrelevant to the actual structure of the
>> index?).
>
>
Tim, as far as I know, names are only an attribute tagged to an OID.
Internal references go through these OIDs, not names, so renaming a column
is really one-shot. Names are mainly a more convenient way of referring to
objects.
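
A quick illustration with hypothetical names (check constraints behave the
same way, since their expressions are stored in parsed form and follow the
rename as well):

CREATE TABLE demo (id int PRIMARY KEY, label text);
CREATE INDEX demo_label_idx ON demo (label);

-- Catalog-only change, no index rebuild
ALTER TABLE demo RENAME COLUMN label TO title;

-- The existing index simply reports the new name:
SELECT pg_get_indexdef('demo_label_idx'::regclass);
-- returns something like:
-- CREATE INDEX demo_label_idx ON public.demo USING btree (title)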

Olivier


Re: How can i install contrib modules in pg11 via source

2018-07-24 Thread Olivier Gautherot
On Tue, Jul 24, 2018 at 3:41 PM, Márcio Antônio Sepp <
mar...@zyontecnologia.com.br> wrote:

> >
> > If so I can tell you how I do it in Linux and you can make the
> > appropriate translations to BSD.
> >
> > 1) cd to contrib/hstore/
> >
> > 2) make
> >
> > 3) sudo make install
> >
> > 4) In psql CREATE EXTENSION hstore;
>
> Well, I always install via pkg and I didn't find the contrib folder  (:
>
> Now, the follow error message occur:
>
> root@pg11:/pg11/postgresql-11beta2/contrib/hstore # make
> make: "/pg11/postgresql-11beta2/contrib/hstore/Makefile" line 16: Need an
> operator
> make: "/pg11/postgresql-11beta2/contrib/hstore/Makefile" line 19: Could
> not
> find
> make: "/pg11/postgresql-11beta2/contrib/hstore/Makefile" line 20: Need an
> operator
>
>
Márcio, remember that it is FreeBSD, not Linux.
Use gmake (GNU make) instead of make (BSD make). They are not compatible...


Re: Can't compile postgresql 11 on FreeBSD 11.1

2018-07-17 Thread Olivier Gautherot
On Tue, Jul 17, 2018 at 4:30 PM, Christoph Moench-Tegeder <
c...@burggraben.net> wrote:

> ## Olivier Gautherot (oliv...@gautherot.net):
>
> > To: Christoph Moench-Tegeder 
>
> I think you meant to hit List-Reply...
>

Ooops, my bad...



> > I just pulled the .tar.bz2 archive and compiled it with no issue.
>
> In that case I suspect you have CFLAGS, CPPFLAGS and/or LDFLAGS
> set in your environment, or you are using a compiler with non-
> standard include/library paths. Default cc (clang) does not
> search in /usr/local by default (at least on 11.2, but that did
> not change with the update).
> Basically, the port's CONFIGURE_ARGS are there for a reason...
>

I compiled the standard archive and noticed that it used GCC 6.4.0 - and I
think it does include /usr/local/include in the standard header files path.
I checked the environment and there are no custom CFLAGS/CPPFLAGS/LDFLAGS.
My default shell is csh (I doubt it makes any difference).

My test machine was a FreeBSD 11.2, if that matters.

Márcio, do you have GCC installed?


Re: Extremely slow autovacuum:vacuum to prevent wraparound

2018-07-11 Thread Olivier Gautherot
On Wed, Jul 11, 2018 at 8:17 AM, Rijo Roy  wrote:

> +pgsql-general
>
> Sent from Yahoo Mail on Android
> 
>
> On Wed, 11 Jul 2018 at 5:43 pm, Rijo Roy
>  wrote:
> Hi Ravi,
>
> It was the application team's call to disable it, fearing the alteration of
> execution plans and slowness as per their explanation. I have joined 2 days
> back and I have tried educating them for enabling the same but they said
> after enough testing and evidence collected it will be enabled back in
> future..
> As of now, I have started the manual vacuum operation on the databases but
> in one of them I am facing this issue that my vacuum jobs for 3 tables are
> paused as there are 3 autovacuum jobs running for 5 days on the same
> tables to prevent wraparound.
> Could you please confirm whether I can stop these autovacuum jobs so that
> I can run my manual vacuum job.
>
> Rijo, from a technical perspective (see how this can apply in Production),
one thing you can do is disable autovacuum in postgresql.conf and restart
the cluster. This will kill the vacuum processes in an orderly fashion.

One thing you could try is to recreate these big tables (which gives you the
same effect as a full vacuum: a freshly packed table). There are 2 ways:
- COPY (SELECT * FROM bigtable) TO '/path/to/tempfile';
  COPY bigtable2 FROM '/path/to/tempfile';
- SELECT * INTO bigtable2 FROM (SELECT * FROM bigtable) AS t;

See which one works best for you. Note that this will only copy the columns
and the data - you will have to recreate the indexes and constraints by
hand afterwards.
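
To make the sequence concrete, a minimal sketch of the recreate-and-swap
approach (table, column and index names are hypothetical, and the swap
should happen in a window with no writes to the table):

-- Build a fresh, tightly packed copy
CREATE TABLE bigtable_new (LIKE bigtable INCLUDING DEFAULTS);
INSERT INTO bigtable_new SELECT * FROM bigtable;

-- Swap the names in one transaction
BEGIN;
ALTER TABLE bigtable RENAME TO bigtable_old;
ALTER TABLE bigtable_new RENAME TO bigtable;
COMMIT;

-- Recreate the indexes and constraints by hand, e.g.:
CREATE INDEX bigtable_ts_idx ON bigtable (ts);

-- When you are satisfied with the result:
DROP TABLE bigtable_old;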

As a comment, NEVER disable the autovacuum unless you're doing heavy
maintenance (like deleting millions of rows - and run the vacuum by hand
afterwards). From our experience, it can be a lot lighter to perform the
SELECT INTO described above if you happen to delete a majority of rows (it
happened to us).

Olivier


>
> Thanks,
> Rijo Roy
>
> Sent from Yahoo Mail on Android
> 
>
> On Wed, 11 Jul 2018 at 5:37 pm, Ravi Krishna
>  wrote:
> Who disabled auto vacuum and why ?  Basically you are paying the price for
> it now
>
> Sent from my iPad
>
> On Jul 11, 2018, at 7:33 AM, Rijo Roy  wrote:
>
> Hi Experts,
>
> I am running a PG10.4 in a RHEL 6.9. The tables stored in my database are
> mainly used for reads and very less writes happen. I have recently joined
> this environment and as soon as I logged into the postgresql servers,
> checked for dead tuples and relfrozenxids. And almost all have reached the
> autovacuum_freeze_max_age of 2 billion and autovacuum was disabled. Last
> vacuum was run 3 months back.
> Today I got permission to vacuum the tables but for 3 tables, Postgresql
> has already started the vacuum freeze using the autovacuum daemon running
> to prevent wraparound and these tables have TB'S of data.
> Experts, please confirm whether I can kill these sessions and run a manual
> vacuum job with parallel jobs.
>
> Thanks in advance.
>
> Regards,
> Rijo Roy
>
> Sent from Yahoo Mail on Android
> 
>
>


Re: Whither 1:1?

2018-06-01 Thread Olivier Gautherot
On Fri, Jun 1, 2018 at 1:25 PM, Guyren Howe  wrote:

> On Jun 1, 2018, at 10:16 , Olivier Gautherot 
> wrote:
>
>
> You will get a benefit in terms of space only if the optional fields in
> the second table exist in a reduced number of instances - and the second
> table is significantly wider. This can make a difference on big tables but
> this gain may be offset by the cost of the join. In this perspective, I
> don’t think that there is a clear benefit or drawback: it should be
> evaluated on a case-by-case basis.
>
>
> It seems to me that people take time to catch up with modern hardware
> reality. SSDs reduce seek time to virtually zero. Surely, joins are now
> much, much cheaper. If so, I’m inclined to describe wide tables as a
> premature optimization.
>

Guyren, I'm only saying it is not one-size-fits-all and we don't all have
the latest-generation hardware to run our systems on. I actually have a
use-case for the structure you describe, which supports optional fields in a
table: the main one is partitioned into 15GB monthly tables with 30+ million
rows. The parallel table is used for devices that generate additional
information but are a small subset of the fleet. This way, I could avoid
the overhead of 150 columns across the board (the main table has around
30). By keeping things simple (I mean a single table), you get more bloat
but you can reduce the response time and may eventually be able to run on a
smaller, cheaper server. It is really a project-specific design issue.
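
For what it's worth, a minimal sketch of that layout (all names are
hypothetical, and I leave out the partitioning of the main table):

-- Main table: one row per measurement, kept narrow
CREATE TABLE reading (
    reading_id  bigserial PRIMARY KEY,
    device_id   integer     NOT NULL,
    ts          timestamptz NOT NULL,
    value       numeric
);

-- 1:1 extension table, only for the few devices that report extra data;
-- it shares the primary key instead of adding 100+ mostly-NULL columns
CREATE TABLE reading_extra (
    reading_id  bigint PRIMARY KEY REFERENCES reading (reading_id),
    extra_a     numeric,
    extra_b     text
);

-- Typical read path: a LEFT JOIN keeps the rows without extra data
SELECT r.*, e.extra_a, e.extra_b
FROM reading AS r
LEFT JOIN reading_extra AS e USING (reading_id);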

Olivier Gautherot


Re: Whither 1:1?

2018-06-01 Thread Olivier Gautherot
On Fri, Jun 1, 2018 at 12:52 PM, Guyren Howe  wrote:

> It’s come to my attention that what seems an obvious and useful database
> design pattern — 1:1 relations between tables by having a shared primary
> key — is hardly discussed or used.
>
> It would seem to be a very simple pattern, and useful to avoid storing
> nulls or for groups of fields that tend to be used together.
>
> Thoughts? Is there some downside I can’t see?
>

You will get a benefit in terms of space only if the optional fields in the
second table exist in a reduced number of instances - and the second table
is significantly wider. This can make a difference on big tables but this
gain may be offset by the cost of the join. In this perspective, I don't
think that there is a clear benefit or drawback: it should be evaluated on
a case-by-case basis.

Olivier Gautherot


Re: unable to write 'raise' messages to log file?

2018-05-31 Thread Olivier Gautherot
Hi Ian,

On Thu, May 31, 2018 at 3:19 PM, Adrian Klaver 
wrote:

> On 05/31/2018 12:15 PM, Olivier Gautherot wrote:
>
>> On Thu, May 31, 2018 at 3:07 PM, Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote:
>>
>> On 05/31/2018 11:20 AM, Ian Bell wrote:
>>
>> I am having considerable difficulty logging information in
>> PL-pgSQL functions by using the ‘RAISE’ statement. I am
>> asking for comments/suggestions on what I am doing wrong.
>>
>> I’ve tried flushing/rotating the log files by executing *‘select
>> pg_rotate_logfile()’* in PSQL but my messages never appear in
>> the log files.   I’ve tried calling my PL-pgSQL functions in
>> PSQL, PgAdmin4, OmniDB and ADO.NET but again my
>> messages never appear in the log file.
>>
>> On very rare occasions, I see my messages in the log file if I
>> restart the PostgreSQL server; however, restarting the server
>> generally does not flush my messages to the log files.
>>
>> Do they show up in a client? For example psql:
>>
>> [snip]
>> test_(aklaver)> select testwithbasictypearguments(1, 2.5, 'test');
>> LOG:  Test.TestWithArguments: i = 1, n = 2.5, t = test
>>  testwithbasictypearguments
>> ----------------------------
>>                           0
>> (1 row)
>>
>> -- Adrian Klaver
>> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
>>
>>
>> I've used the logs successfully here. What are the values of
>> log_min_messages and log_min_error_statement in your postgresql.conf?
>>
>
> The settings are shown in the original post.
>

By any chance... did you try "RAISE NOTICE..." in your function? That's the
level I usually use.
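
A minimal sketch of what I mean, together with the two settings that decide
where notices end up (the values shown are assumptions, not your current
configuration):

-- In the session: notices are sent to the client from this level up
SET client_min_messages = notice;

-- In postgresql.conf (then reload): notices also reach the server log;
-- the default log_min_messages is 'warning', which hides NOTICE
-- log_min_messages = notice

DO $$
BEGIN
    RAISE NOTICE 'it is now %', now();
END;
$$;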

Olivier Gautherot


Re: unable to write 'raise' messages to log file?

2018-05-31 Thread Olivier Gautherot
On Thu, May 31, 2018 at 3:07 PM, Adrian Klaver 
wrote:

> On 05/31/2018 11:20 AM, Ian Bell wrote:
>
>> I am having considerable difficulty logging information in PL-pgSQL
>> functions by using the ‘RAISE’ statement. I am asking for
>> comments/suggestions on what I am doing wrong.
>>
>>
>
>> I’ve tried flushing/rotating the log files by executing *‘select
>> pg_rotate_logfile()’* in PSQL but my messages never appear in the log
>> files.   I’ve tried calling my PL-pgSQL functions in PSQL, PgAdmin4, OmniDB
>> and ADO.NET but again my messages never appear in the log file.
>>
>> On very rare occasions, I see my messages in the log file if I restart the
>> PostgreSQL server; however, restarting the server generally does not flush my
>> messages to the log files.
>>
>>
>
> Do they show up in a client? For example psql:
>
> [snip]

test_(aklaver)> select testwithbasictypearguments(1, 2.5, 'test');
> LOG:  Test.TestWithArguments: i = 1, n = 2.5, t = test
>  testwithbasictypearguments
> ----------------------------
>                           0
> (1 row)
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
I've used the logs successfully here. What are the values of
log_min_messages and log_min_error_statement in your postgresql.conf?

Olivier Gautherot


Re: Fast logical replication jump start with PG 10

2018-05-26 Thread Olivier Gautherot
On Sat, May 26, 2018 at 1:27 PM, Adrian Klaver 
wrote:

> On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
>
>> On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver <
>> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>> wrote:
>> On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
>> [snip]
>>
>> The sequence on the replicated machine was (pseudo-code to simplify the
>> syntax):
>> - pg_dump --section=pre-data -h master_machine master_database | psql -h
>> replication_machine replication_database
>> # This took seconds, "pre-data" discards the indexes
>>
>> - psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION
>> '...' PUBLICATION mypub;" replication_database
>> # This took about 1 hour for the initial sync
>>
>> - pg_dump --section=post-data -h master_machine master_database | psql -h
>> replication_machine replication_database
>> # This took 2h20m to load the various indexes
>>
>> This sequence follows the recommendation of section 14.4.3 in
>> https://www.postgresql.org/docs/10/static/populate.html . If I stick to
>> streaming as we do today (e.g. pg_upgrade and then rsync to the replication
>> server), I can be ready in about 1 hour (more acceptable for the customer).
>>
>
> I am still learning what logical replication is capable of so take the
> following with that in mind.
>
> 1) I used pg_basebackup (www.postgresql.org/docs/10/static/app-pgbasebackup.html)
> to create a new $DATA directory for a replica instance.
>

Good tip, I'll give it a try.

2) I configured the master and the replica for logical replication. Also
> changed the copied over conf files to work for the new instance e.g.
> changed the port number.
>
> 3) I set up the PUBLICATION:
>
> CREATE PUBLICATION everything FOR ALL TABLES;
>

This was what I was planning to do, so great.

4) I set up the SUBSCRIPTION:
>
> CREATE SUBSCRIPTION test_sub CONNECTION 'dbname=production user=postgres
> port=5432' PUBLICATION everything WITH(copy_data=false);
>
> *NOTE* the copy_data=false.
>

This was the bit I missed! Excellent point!

5) Then I started entering data in the master and it was replicated.
>
> Caveats:
>
> 1) This was a small database.
>

I don't think the size is relevant in this specific case.

2) The master and replica where on the same machine.
>

Same comment: different ports mean basically different instances.

3) There was no activity on the master between the pg_basebackup and the
> CREATE PUBLICATION/CREATE SUBSCRIPTION commands.
>

This is also my plan for Production, so it's fine.

Thanks!!!

[snip]
>

> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>

Olivier Gautherot
http://www.linkedin.com/in/ogautherot


Re: Fast logical replication jump start with PG 10

2018-05-26 Thread Olivier Gautherot
Hi Adrian!

On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver 
wrote:

> On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
>
>> Hi Adrian, thanks for your reply. Here is the clarification.
>>
>> 1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the test
>> machine, it runs in between 15 and 20 minutes for just over 100GB. I can
>> negotiate this time with our customer. The vacuum process took another 5 to
>> 7 minutes. This this what I was referring to with the 30 minutes (point 3
>> in your questions)
>>
>> 2) After pg_upgrade, I published the tables on the database (in the sense
>> "CREATE DATABASE") and subscribed to this publication on the second server
>> (logical replication). The data copy processed started immediately and took
>> around 1 hour. I then loaded the indexes, what took > another 2h20m. At
>> that point the active-passive cluster was ready to go.
>>
>
> The index creation was done on the replicated machine I presume, using
> what command?
>

The sequence on the replicated machine was (pseudo-code to simplify the
syntax):
- pg_dump --section=pre-data -h master_machine master_database | psql -h
replication_machine replication_database
# This took seconds, "pre-data" discards the indexes

- psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION
'...' PUBLICATION mypub;" replication_database
# This took about 1 hour for the initial sync

- pg_dump --section=post-data -h master_machine master_database | psql -h
replication_machine replication_database
# This took 2h20m to load the various indexes

This sequence follows the recommendation of section 14.4.3 in
https://www.postgresql.org/docs/10/static/populate.html . If I stick to
streaming as we do today (e.g. pg_upgrade and then rsync to the replication
server), I can be ready in about 1 hour (more acceptable for the customer).

The reason the indexes take so long is the large number of them on
big tables (for instance, 7 indexes on a partitioned table, with 3
partitions of 15GB of data in 30M rows). I will skip the reasons that got
us there (please no flames, I'm aware of the issue :-) ). I don't have
definite execution times for the Production environment (in a datacenter),
which tends to be kind of a lottery in terms of execution times compared to
testing (on a desktop in the office).


>
> Note that the active and the passive databases are on different machines.
>>
>> 4) By "database" I mean the result of "CREATE DATABASE" and we have 1 per
>> server (or "cluster" in your terminology - I tend to use this word for a
>> group of machines). We are currently using a streaming replication
>>
>
> Yeah I understand, it is just that database and cluster have specific
> meanings in Postgres and it helps to stick to those meanings when
> discussing replication operations. Lowers the confusion level:)
>
> between the 9.2 servers, so it could be a fall-back option after the
>> upgrade (I wanted to remove part of the indexes on the master to lower the
>> load, reason to use the logical replication... if the execution time is not
>> too excessive).
>>
>
> So the time you showed was with those indexes removed or not?
>

I did try to synchronize the database with the indexes installed and
eventually dropped the replication database after a full week-end of hectic
activity (apparently, the initial sync job was not finished...). I will try
it again just to make sure but I'm fairly positive that I will get to the
same result.


>
>> Hope it clarifies the question
>> Best regards
>> Olivier
>>
>>
>> Olivier Gautherot
>> oliv...@gautherot.net <mailto:oliv...@gautherot.net>
>> Cel:+56 98 730 9361
>> Skype: ogautherot
>> www.gautherot.net <http://www.gautherot.net>
>> http://www.linkedin.com/in/ogautherot
>>
>>
>> On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote:
>>
>> On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
>>
>> Hi,
>>
>> I just sent the question on StackOverflow but realized that this
>> audience may be more savvy. So sorry in advance for
>> cross-posting...
>>
>> I'm in the process of upgrading a PG from 9.2 to 10.4.
>> pg_upgrade worked fine on the master and was rather fast. The
>> problem is that the database is replicated and I'm planning to
>> switch from streaming to logical. The problem is that it is
>> rather slow (30 minutes for the master and over 3 hours for the
> replication, between data transfer and indexes).

Re: Fast logical replication jump start with PG 10

2018-05-25 Thread Olivier Gautherot
Hi Adrian, thanks for your reply. Here is the clarification.

1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the test
machine, it runs in between 15 and 20 minutes for just over 100GB. I can
negotiate this time with our customer. The vacuum process took another 5 to
7 minutes. This is what I was referring to with the 30 minutes (point 3
in your questions)

2) After pg_upgrade, I published the tables on the database (in the sense
"CREATE DATABASE") and subscribed to this publication on the second server
(logical replication). The data copy process started immediately and took
around 1 hour. I then loaded the indexes, which took another 2h20m. At that
point the active-passive cluster was ready to go. Note that the active and
the passive databases are on different machines.

4) By "database" I mean the result of "CREATE DATABASE" and we have 1 per
server (or "cluster" in your terminology - I tend to use this word for a
group of machines). We are currently using a streaming replication between
the 9.2 servers, so it could be a fall-back option after the upgrade (I
wanted to remove part of the indexes on the master to lower the load,
reason to use the logical replication... if the execution time is not too
excessive).

Hope it clarifies the question
Best regards
Olivier


Olivier Gautherot
oliv...@gautherot.net
Cel:+56 98 730 9361
Skype: ogautherot
www.gautherot.net
http://www.linkedin.com/in/ogautherot

On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver 
wrote:

> On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
>
>> Hi,
>>
>> I just sent the question on StackOverflow but realized that this audience
>> may be more savvy. So sorry in advance for cross-posting...
>>
>> I'm in the process of upgrading a PG from 9.2 to 10.4. pg_upgrade worked
>> fine on the master and was rather fast. The problem is that the database is
>> replicated and I'm planning to switch from streaming to logical. The
>> problem is that it is rather slow (30 minutes for the master and over 3
>> hours for the replication, between data transfer and indexes).
>>
>
> I am not clear on what you did, so can you clarify the following:
>
> 1) pg_upgrade from 9.2 master instance to 10.4 master instance, correct?
>
> 2) What replication are you talking about for the 3 hour value?
>
> 3) What is the 30 minute value referring to?
>
> 4) When you say database are you talking about a Postgres cluster or a
> database in the cluster?
>
>
>> Is there a way to speed up the replication or should I rather stick to
>> streaming replication? As I have only 1 database on the server, it would
>> not be a show-stopper.
>>
>
> See 4) above, but if you are talking about a single database in a cluster
> streaming replication will not work for that.
>
>
>
>>
>> Thanks in advance
>> Olivier Gautherot
>> http://www.linkedin.com/in/ogautherot
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Fast logical replication jump start with PG 10

2018-05-25 Thread Olivier Gautherot
Hi,

I just sent the question on StackOverflow but realized that this audience
may be more savvy. So sorry in advance for cross-posting...

I'm in the process of upgrading a PG from 9.2 to 10.4. pg_upgrade worked
fine on the master and was rather fast. The problem is that the database is
replicated and I'm planning to switch from streaming to logical. The
problem is that it is rather slow (30 minutes for the master and over 3
hours for the replication, between data transfer and indexes).

Is there a way to speed up the replication or should I rather stick to
streaming replication? As I have only 1 database on the server, it would
not be a show-stopper.


Thanks in advance
Olivier Gautherot
http://www.linkedin.com/in/ogautherot