Re: constant crashing hardware issue and thank you TAKE AWAY

2024-04-17 Thread Justin Clift

On 2024-04-17 23:06, jack wrote:


As a result of this I will be checking the RAM on all my machines once
a month or the moment a machine starts to act strange.


Once a month is overkill, and unlikely to be useful. :)

Server or enterprise grade hardware will support "ECC" memory.

That has extra memory chips + supporting circuitry on the memory board
so it can detect + correct most errors that happen, without them causing
problems.

For the errors that it can't *correct*, it'll still generate warnings
to your system software to let you know (if you've configured it).

If you do get such a warning - or if the system starts acting funny like
you saw - that's when you'd want to run memtest on the system.

---

The other time to run memtest on the system is when you first buy or
receive a new server.  You'd generally do a "burn in" test of all the
things (memory, hard disks/ssds, cpu, gpu, etc) just to make sure
everything is ok before you start using it for important stuff.

Regards and best wishes,

Justin Clift





Re: constant crashing hardware issue and thank you TAKE AWAY

2024-04-17 Thread Madalin Ignisca
That kind of support for “damaged RAM” is something you get with ECC memory on CPUs
that support it.

Xeon CPUs, for example.

> On 17 Apr 2024, at 15:06, jack  wrote:
> 
> uld advise me if there was ever an issue with me



re: constant crashing hardware issue and thank you

2024-04-15 Thread jack
It seems that the hardware may in fact be the issue.

I ran the exact same process during the night on a much slower machine (4 core 
Xeon E5-1620 v4 32G Ram) and it worked fine. It did not crash and completed all 
the updates on 113 million records successfully.

I will be bringing the i9 in for a checkup to find the problem.

Thank you all for your help with this issue.

Re: constant crashing

2024-04-15 Thread Francisco Olarte
Jack:

On Sun, 14 Apr 2024 at 22:18, jack  wrote:
> The CSV files are being produced by another system, a Windows app on a 
> Windows machine. I then copy them to a USB key and copy them onto the ubuntu 
> machine. The data is then imported via the COPY command.
>
> COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
> The fields are tab delimited.

Then call them tab-delimited, not CSV. It may be important. You have a
tendency to send your interpretation of the information, which hampers
people trying to help you.

As an example, only about three messages above did we see the "signal
11" in the server logs which indicates a probable bug. Signal 11 is
normally due to stray pointers or corrupted memory ( although faulty
ram can lead to it too ), which hints at a bug, not a resource
problem.

> But importing the data works. I can get all the data into a single table 
> without any problems. The issue is only when I start to update the single 
> table. And that is why I started using smaller temporary tables for each CSV 
> file, to do the updates in the smaller tables before I move them all to a 
> single large table.
> After all the data is loaded and updated, I run php programs on the large 
> table to generate reports. All of which works well EXCEPT for performing the 
> updates on the data. And I do not want to use perl or any outside tool. I 
> want it all done in SQL because I am required to document all my steps so that 
> someone else can take over, so everything needs to be as simple as possible.

I suggested perl ( or any similar thing, php should be equally fine )
as it normally makes your importing / debugging much easier / lighter.

On the documentation side, you can have the data and some
transformations in a big sql file, but having some steps in a php file
will have them documented too. But if you want to do text processing
in SQL, go ahead, the only problems are going to be making it harder
to debug and probably harder to document. Now people know the signal
11 stuff and the problem can probably be tracked.

Francisco Olarte.




Re: constant crashing

2024-04-14 Thread David G. Johnston
On Sun, Apr 14, 2024 at 6:58 PM jack  wrote:

> But the current i9 machine is a machine from only 4 years ago which should
> have no issues.
>
>
That is a non sequitur.  Besides, if the current code is a representative
example you've probably given it the effective usage of 8 years.

Whether related or not to this error you really should be doing all those
updates on a temporary or unlogged staging table and getting WAL generation
out of the equation.

David J.


Re: constant crashing

2024-04-14 Thread David Rowley
On Mon, 15 Apr 2024 at 02:25, jack  wrote:
> Then I tried to update the data in blocks of 100,000 records and it crashed 4 
> times on 4 different blocks.
> So I updated the first crashed block down to a block of 10 records, until 
> it crashed.
> Then I updated each of the 10 records individually to identify the record 
> that is problematic, but then all 10 records updated without crashing!
> Pure insanity!

> I have been working on this for just over 1 year now, documenting every step, 
> and I am still unable to get this to work without it crashing somewhere along 
> the way.
> I am beginning to wonder if postgreSQL is bi-polar.

There are two things I can think of that might make PostgreSQL appear
"bi-polar" here.

1) The plan isn't the same when you update a small number of rows vs a
large number of rows.  The planner may prefer an index depending on
what's in your WHERE clause.
2) JIT compilation is less likely with a small number of rows as the
plan will come out cheaper.

A stack trace as mentioned by Tom would be ideal, but if you can't get that...

I'd try:

a) Disable "jit" in postgresql.conf and reload the config file with
SELECT pg_reload_conf();

b) Still crashing? Try dropping all indexes before performing the UPDATE.

c) Still crashing? Can you make it consistently crash with one
problematic CSV file and UPDATE statement?

Does it still crash if you drop all indexes from the table before
performing the UPDATE?
Can you narrow the UPDATE's scope by forcing a TID Range scan by
starting with a WHERE clause such as:

WHERE ctid >= '(0,0)' AND ctid <= '(4294967294,0)';

then incrementally try to home in on the problematic value by "binary
searching" for the problematic record.  You can get the number of
blocks in the table with:

select pg_relation_size('your_table'::regclass) /
current_setting('block_size')::int;

replace the 4294967294 with the value from that and bisect the table
until you home in on the block with the issue.
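To make that a bit more concrete, here is a rough sketch of the bisection using
the us113 table from this thread (assuming it has the same category columns as the
main table; the block numbers are made up, halve whichever range still crashes):

-- How many blocks does the table have?
select pg_relation_size('us113'::regclass) / current_setting('block_size')::int;
-- Suppose that returns roughly 2000000.  Try the first half of the table:
UPDATE us113 SET category_modified = UPPER(category)
WHERE ctid >= '(0,0)' AND ctid < '(1000000,0)';
-- Crashes?  Split the range again: (0,0)..(500000,0), then (500000,0)..(1000000,0), etc.
-- No crash?  Try the other half: ctid >= '(1000000,0)' AND ctid <= '(4294967294,0)'.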

David




Re: constant crashing

2024-04-14 Thread jack
I wrote the windows app.
I export all data to simple ASCII text where fields are delimited with a tab 
and then run the file through a UTF8 converter (convertcp_v8.3_x86).

I will try the entire process on a Xeon E5-1620 and let it run during the night 
to see what happens. But the current i9 machine is a machine from only 4 years 
ago which should have no issues.

On Sunday, April 14th, 2024 at 8:50 PM, Adrian Klaver 
 wrote:

> 
> 
> On 4/14/24 14:50, jack wrote:
> 
> Reply to list also
> Ccing list
> 
> > Hello,
> > I am not sure what "locale" means.
> 
> 
> Go to the settings App for whatever version of Windows you are on and
> search for locale.
> 
> > The Windows app is an inhouse application which uses Actian-Zen SQL.
> > The data is exported to simple ASCII in a tab delimited format similar to 
> > CSV.
> 
> 
> And you know it is ASCII for a fact?
> 
> > Those files are then imported into the PostgreSQL table using COPY.
> > Importing the data is not an issue.
> > I am able to load all the data without any problems, even into 1 table 
> > which ends up with about 1.2 billion records.
> > But when I try to update the data in that table I get many errors, 
> > essentially crashes.
> 
> 
> Repeating what has been asked and answered is not really going anywhere.
> 
> > There may be some control characters (garbage) in the data but that should 
> > not crash postgresql, especially if it can import the data without issues.
> 
> 
> Unless it does. That is the point of the questions, getting to what is
> actually causing the issue. Until the problem can be boiled down to a
> reproducible test case there really is not much hope of anything more
> than the 'yes you have a problem' answer. And there is a difference 
> between dumping data into a table and then doing an UPDATE where the 
> data strings are manipulated by functions.
> 
> > Anyway, I hope I answered your questions.
> > Thanks for your help.
> > 
> > On Sunday, April 14th, 2024 at 4:28 PM, Adrian Klaver 
> > adrian.kla...@aklaver.com wrote:
> > 
> > > On 4/14/24 13:18, jack wrote:
> > > 
> > > > The CSV files are being produced by another system, a Windows app on a
> > > > Windows machine. I then copy them to a USB key and copy them onto the
> > > > ubuntu machine. The data is then imported via the COPY command.
> > > 
> > > The app?
> > > 
> > > The locale in use on the Windows machine?
> > > 
> > > The locale in use in the database?
> > > 
> > > > COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER 
> > > > E'\t'
> > > > The fields are tab delimited.
> > > > 
> > > > But importing the data works. I can get all the data into a single table
> > > > without any problems. The issue is only when I start to update the
> > > > single table. And that is why I started using smaller temporary tables
> > > > for each CSV file, to do the updates in the smaller tables before I move
> > > > them all to a single large table.
> > > 
> > > The import is just dumping the data in; my suspicion is the problem is
> > > related to using string functions on the data.
> > > 
> > > > After all the data is loaded and updated, I run php programs on the
> > > > large table to generate reports. All of which works well EXCEPT for
> > > > performing the updates on the data. And I do not want to use perl or any
> > > outside tool. I want it all done in SQL because I am required to document
> > > > all my steps so that someone else can take over, so everything needs to
> > > > be as simple as possible.
> > > 
> > > --
> > > Adrian Klaver
> > > adrian.kla...@aklaver.com
> 
> 
> --
> Adrian Klaver
> adrian.kla...@aklaver.com




Re: constant crashing

2024-04-14 Thread Tom Lane
Adrian Klaver  writes:
> Unless it does. That is the point of the questions, getting to what is 
> actually causing the issue. Until the problem can be boiled down to a 
> reproducible test case there really is not much hope of anything more 
> than the 'yes you have a problem' answer.

IIUC, the crashes are not reproducible for the OP either.  I hesitate
to suggest "maybe the hardware is flaky" because that seems like
passing the buck.  Still, from the info we have the table definitions
are extremely simple and so are the queries, making it hard to believe
that some rare Postgres bug is being tripped over.  It makes no sense
to be seeing this level of instability in mainstream usage on a stable
OS and stable Postgres release.  So I'm starting to think that a
hardware problem might be the most plausible explanation.

In any case, there is no information we have that suggests any
plausible line of investigation towards a software problem.
If we can see a crash stack trace or two, maybe that would change.

regards, tom lane




Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 14:50, jack wrote:

Reply to list also
Ccing list


Hello,
I am not sure what "locale" means.


Go to the settings App for whatever version of Windows you are on and 
search for locale.



The Windows app is an inhouse application which uses Actian-Zen SQL.
The data is exported to simple ASCII in a tab delimited format similar to CSV.


And you know it is ASCII for a fact?


Those files are then imported into the PostgreSQL table using COPY.
Importing the data is not an issue.
I am able to load all the data without any problems, even into 1 table which 
ends up with about 1.2 billion records.
But when I try to update the data in that table I get many errors, essentially 
crashes.


Repeating what has been asked and answered is not really going anywhere.


There may be some control characters (garbage) in the data but that should not 
crash postgresql, especially if it can import the data without issues.


Unless it does. That is the point of the questions, getting to what is 
actually causing the issue. Until the problem can be boiled down to a 
reproducible test case there really is not much hope of anything more 
than the 'yes you have a problem' answer. And there is a difference 
between dumping data into a table and then doing an UPDATE where the 
data strings are manipulated by functions.



Anyway, I hope I answered your questions.
Thanks for your help.



On Sunday, April 14th, 2024 at 4:28 PM, Adrian Klaver 
 wrote:




On 4/14/24 13:18, jack wrote:


The CSV files are being produced by another system, a Windows app on a
Windows machine. I then copy them to a USB key and copy them onto the
ubuntu machine. The data is then imported via the COPY command.



The app?

The locale in use on the Windows machine?

The locale in use in the database?


COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table
without any problems. The issue is only when I start to update the
single table. And that is why I started using smaller temporary tables
for each CSV file, to do the updates in the smaller tables before I move
them all to a single large table.



The import is just dumping the data in; my suspicion is the problem is
related to using string functions on the data.


After all the data is loaded and updated, I run php programs on the
large table to generate reports. All of which works well EXCEPT for
performing the updates on the data. And I do not want to use perl or any
outside tool. I want it all done in SQL because I am required to document
all my steps so that someone else can take over, so everything needs to
be as simple as possible.



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 13:18, jack wrote:
The CSV files are being produced by another system, a Windows app on a 
Windows machine. I then copy them to a USB key and copy them onto the 
ubuntu machine. The data is then imported via the COPY command.


The app?

The locale in use on the Windows machine?

The locale in use in the database?



COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table 
without any problems. The issue is only when I start to update the 
single table. And that is why I started using smaller temporary tables 
for each CSV file, to do the updates in the smaller tables before I move 
them all to a single large table.


The import is just dumping the data in; my suspicion is the problem is 
related to using string functions on the data.




After all the data is loaded and updated, I run php programs on the 
large table to generate reports. All of which works well EXCEPT for 
performing the updates on the data. And I do not want to use perl or any 
outside tool. I want it all done in SQL because I am required to document 
all my steps so that someone else can take over, so everything needs to 
be as simple as possible.




--
Adrian Klaver
adrian.kla...@aklaver.com





re: constant crashing

2024-04-14 Thread jack
The CSV files are being produced by another system, a Windows app on a Windows 
machine. I then copy them to a USB key and copy them onto the ubuntu machine. 
The data is then imported via the COPY command.

COPY master (field01,field02..fieldX) FROM '/data/file.text' DELIMITER E'\t'
The fields are tab delimited.

But importing the data works. I can get all the data into a single table 
without any problems. The issue is only when I start to update the single 
table. And that is why I started using smaller temporary tables for each CSV 
file, to do the updates in the smaller tables before I move them all to a 
single large table.

After all the data is loaded and updated, I run php programs on the large table 
to generate reports. All of which works well EXCEPT for performing the updates 
on the data. And I do not want to use perl or any outside tool. I want it all 
done in SQL because I am required to document all my steps so that someone else 
can take over, so everything needs to be as simple as possible.

re: constant crashing

2024-04-14 Thread jack
To show you how bi-polar this is really becoming, I tried a work-around...
I took the table called us113 with 113 million records and tried to break it 
down into 10 smaller tables each having about 10 million records, using the 
following code:

\set ECHO all
\set ON_ERROR_STOP on
-- Create 10 new tables (us113_01 to us113_10) similar to the original table 
(us113)
CREATE TABLE us113_01 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num <= 1000;
CREATE TABLE us113_02 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 1000 AND row_num <= 2000;
CREATE TABLE us113_03 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 2000 AND row_num <= 3000;
CREATE TABLE us113_04 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 3000 AND row_num <= 4000;
CREATE TABLE us113_05 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 4000 AND row_num <= 5000;
CREATE TABLE us113_06 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 5000 AND row_num <= 6000;
CREATE TABLE us113_07 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 6000 AND row_num <= 7000;
CREATE TABLE us113_08 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 7000 AND row_num <= 8000;
CREATE TABLE us113_09 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 8000 AND row_num <= 9000;
CREATE TABLE us113_10 AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS 
row_num FROM us113) sub WHERE row_num > 9000;
...and of course it crashed after creating 7 tables.

2024-04-14 15:59:12.294 EDT [1212] LOG: database system is ready to accept 
connections
2024-04-14 16:00:39.326 EDT [1668] postgres@lf ERROR: could not access status 
of transaction 3687904299
2024-04-14 16:00:39.326 EDT [1668] postgres@lf DETAIL: Could not open file 
"pg_xact/0DBD": No such file or directory.
2024-04-14 16:00:39.326 EDT [1668] postgres@lf STATEMENT: CREATE TABLE us113_08 
AS SELECT * FROM ( SELECT *, ROW_NUMBER() OVER () AS row_num FROM us113) sub 
WHERE row_num > 7000 AND row_num <= 8000;
This is what I am dealing with.
Every turn I take to get around a problem, I get more errors.

I am not sure if this makes a difference but the machine actually has 144 GIG 
of RAM not 128 GIG. I know that Windows may have an issue with this, but I 
would not think ubuntu would. But I thought I'd throw that into the mess anyway.

Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 12:22, jack wrote:

Here is an excerpt of /var/log/postgresql/postgresql-16-main.log


Where and how are the CSV files being produced?

What is the database locale?
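(On the database side, a quick way to see it, for what it's worth; just a
convenience query, nothing specific to this thread:)

-- Show server encoding plus per-database collation and ctype settings.
SELECT datname, pg_encoding_to_char(encoding) AS encoding, datcollate, datctype
FROM pg_database;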

--
Adrian Klaver
adrian.kla...@aklaver.com





re: constant crashing

2024-04-14 Thread jack
Here is the table structure:

The fields being updated are the ones that are NOT named field##.

Except for "3fc" which I left as is to show that it is named differently in the 
unlikely chance that this would be causing problems.

CREATE TABLE main (
field01 character(10) COLLATE pg_catalog."default",
field02 integer,
field03 character varying(100) COLLATE pg_catalog."default",
field04 character varying(50) COLLATE pg_catalog."default",
field05 character varying(100) COLLATE pg_catalog."default",
category character varying(100) COLLATE pg_catalog."default",
field07 character varying(100) COLLATE pg_catalog."default",
category_modified character varying(100) COLLATE pg_catalog."default",
field09 text COLLATE pg_catalog."default",
field10 character varying(100) COLLATE pg_catalog."default",
field11 character(1) COLLATE pg_catalog."default",
"3fc" character(3) COLLATE pg_catalog."default",
field12 text COLLATE pg_catalog."default",
field13 text COLLATE pg_catalog."default",
field14 text COLLATE pg_catalog."default",
field15 text COLLATE pg_catalog."default",
field16 COLLATE pg_catalog."default",
sub_category character(10) COLLATE pg_catalog."default",
field17 character varying(100) COLLATE pg_catalog."default",
field18 character varying(100) COLLATE pg_catalog."default",
field19 character varying(100) COLLATE pg_catalog."default",
field20 character varying(50) COLLATE pg_catalog."default",
code1 character(5) COLLATE pg_catalog."default",
code2 character(10) COLLATE pg_catalog."default",
field21 character varying(100) COLLATE pg_catalog."default",
field22 character varying(50) COLLATE pg_catalog."default",
field23 character varying(50) COLLATE pg_catalog."default",
field24 character varying(50) COLLATE pg_catalog."default",
field25 character varying(50) COLLATE pg_catalog."default",
field26 character varying(50) COLLATE pg_catalog."default",
field27 character varying(50) COLLATE pg_catalog."default",
field28 character varying(50) COLLATE pg_catalog."default",
field29 character varying(50) COLLATE pg_catalog."default",
field31 character varying(50) COLLATE pg_catalog."default",
field32 character varying(10) COLLATE pg_catalog."default",
field33 character varying(10) COLLATE pg_catalog."default",
field34 character varying(10) COLLATE pg_catalog."default",
field35 character varying(10) COLLATE pg_catalog."default",
field36 character varying(50) COLLATE pg_catalog."default",
field37 character(1) COLLATE pg_catalog."default",
field38 character varying(50) COLLATE pg_catalog."default",
field39 boolean,
field40 boolean,
field41 boolean);

Re: constant crashing

2024-04-14 Thread Tom Lane
jack  writes:
> Here is an excerpt of /var/log/postgresql/postgresql-16-main.log
> 2024-04-14 12:17:43.153 EDT [1227] LOG: server process (PID 7289) was 
> terminated by signal 11: Segmentation fault

That's pretty interesting; it suggests an actual PG bug.
Can you get a stack trace from that?  See

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Also, it's probably time you showed us the complete DDL definition
for the tables you're working with.  It's fairly hard to believe
this would happen without something rather out-of-the-ordinary
in the definitions.  Or ... do you have any PG extensions installed?
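(A quick way to check that last point, for reference:)

-- Lists the extensions installed in the current database.
SELECT extname, extversion FROM pg_extension;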

regards, tom lane




Re: constant crashing

2024-04-14 Thread Vijaykumar Jain
Ignore my thread, I guess there might be a bug given it segfaulted.

On Mon, Apr 15, 2024, 12:48 AM Vijaykumar Jain <
vijaykumarjain.git...@gmail.com> wrote:

>
>
> On Sun, 14 Apr 2024 at 21:50, jack  wrote:
>
>> The full error reads:
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> error: connection to server was lost
>>
>> PostgreSQL 16.2
>>
>> I also believe it is a resource issue which can be rectified with a
>> setting, but which setting?
>> If you were updating 100 million records what settings would you adjust?
>>
>> Here are the updates I am performing on the 100 million records:
>> UPDATE table SET category_modified = UPPER(category);
>> UPDATE table SET category_modified =
>> REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'),
>> '\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND
>> POSITION('--' IN category_modified)>0;
>> UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
>> UPDATE table SET category_modified = CASE WHEN category_modified IS NOT
>> NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
>> UPDATE table SET category_modified = regexp_replace(category_modified,
>> '-{2,}', '-', 'g');
>> UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1
>> FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND
>> category_modified LIKE '%-';
>>
>>
> Independent of best practices, I just want to check if there is a leak.
> I created a sample table with text data and ran updates like yours and I
> could not see mem growth, but I have a small vm and ofc your
> category_modified field might be more complex than simple text fields for
> 30-40 chars.
>
> can you grab the pid of your psql backend and (if you have pidstat
> installed) monitor resource usage for that pid
>
> postgres@pg:~/udemy/16$ psql
> psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))
> Type "help" for help.
>
> postgres=# select pg_backend_pid();
>   pg_backend_pid
> ----------------
>1214
> (1 row)
>
> # pidstat 2 100 -rud -h -p 1214
> (get all stats for that pid) that might help to figure out if there is a
> leak or the server has other things competing for memory and your updates
> were picked by the killer.
>
> Linux 5.15.0-101-generic (pg)   04/15/24_x86_64_(1 CPU)
>
> # Time     UID    PID   %usr %system  %guest   %wait    %CPU   CPU  minflt/s  majflt/s     VSZ     RSS   %MEM  kB_rd/s  kB_wr/s  kB_ccwr/s  iodelay  Command
> 00:40:25   113   1214   0.00    0.00    0.00    0.00    0.00     0      0.00      0.00  354112  220940  24.18     0.00     0.00       0.00        0  postgres
>
> # Time     UID    PID   %usr %system  %guest   %wait    %CPU   CPU  minflt/s  majflt/s     VSZ     RSS   %MEM  kB_rd/s  kB_wr/s  kB_ccwr/s  iodelay  Command
> 00:40:27   113   1214   0.00    0.00    0.00    0.00    0.00     0      0.00      0.00  354112  220940  24.18     0.00     0.00       0.00        0  postgres
> 
>
> ofc, if there is a genuine leak , then there might be more digging needed 
> Finding
> memory leaks in Postgres C code (enterprisedb.com)
> 
> just kill the process requesting more mem than available  Memory context:
> how PostgreSQL allocates memory - CYBERTEC (cybertec-postgresql.com)
> 
>
>
> --
> Thanks,
> Vijay
> LinkedIn - Vijaykumar Jain 
>


re: constant crashing

2024-04-14 Thread jack
Here is an excerpt of /var/log/postgresql/postgresql-16-main.log

2024-04-14 12:17:42.321 EDT [7124] LOG: checkpoint starting: wal
2024-04-14 12:17:43.153 EDT [1227] LOG: server process (PID 7289) was 
terminated by signal 11: Segmentation fault
2024-04-14 12:17:43.153 EDT [1227] DETAIL: Failed process was running: UPDATE 
main SET category_modified = UPPER(category), sub_category=UPPER(sub_category), 
code1=UPPER(code1), code2=UPPER(code2);
2024-04-14 12:17:43.153 EDT [1227] LOG: terminating any other active server 
processes
2024-04-14 12:17:43.166 EDT [1227] LOG: all server processes terminated; 
reinitializing
2024-04-14 12:17:43.206 EDT [7320] LOG: database system was interrupted; last 
known up at 2024-04-14 12:17:42 EDT
2024-04-14 12:17:43.484 EDT [7320] LOG: database system was not properly shut 
down; automatic recovery in progress
2024-04-14 12:17:43.487 EDT [7320] LOG: redo starts at 1260/66400608
2024-04-14 12:17:44.985 EDT [7320] LOG: unexpected pageaddr 1260/4A7CA000 in 
WAL segment 00011260008C, LSN 1260/8C7CA000, offset 8167424
2024-04-14 12:17:44.986 EDT [7320] LOG: redo done at 1260/8C7C9F50 system 
usage: CPU: user: 1.20 s, system: 0.28 s, elapsed: 1.49 s
2024-04-14 12:17:44.999 EDT [7321] LOG: checkpoint starting: end-of-recovery 
immediate wait
2024-04-14 12:17:45.251 EDT [7321] LOG: checkpoint complete: wrote 16284 
buffers (99.4%); 0 WAL file(s) added, 0 removed, 38 recycled; write=0.082 s, 
sync=0.143 s, total=0.252 s; sync files=6, longest=0.075 s, average=0.024 s; 
distance=626470 kB, estimate=626470 kB; lsn=1260/8C7CA048, redo 
lsn=1260/8C7CA048
2024-04-14 12:17:45.264 EDT [1227] LOG: database system is ready to accept 
connections
2024-04-14 12:28:32.526 EDT [7321] LOG: checkpoint starting: wal
2024-04-14 12:28:45.066 EDT [7321] LOG: checkpoint complete: wrote 12 buffers 
(0.1%); 0 WAL file(s) added, 0 removed, 33 recycled; write=12.371 s, sync=0.144 
s, total=12.541 s; sync files=11, longest=0.121 s, average=0.013 s; 
distance=536816 kB, estimate=617505 kB; lsn=1260/CB4790B8, redo 
lsn=1260/AD406208
2024-04-14 12:28:45.887 EDT [7321] LOG: checkpoints are occurring too 
frequently (13 seconds apart)
2024-04-14 12:28:45.887 EDT [7321] HINT: Consider increasing the configuration 
parameter "max_wal_size".
2024-04-14 12:28:45.887 EDT [7321] LOG: checkpoint starting: wal
Should I increase the max_wal_size to 2GB?
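(For reference, if that turns out to be worth trying, max_wal_size can be changed
without a restart; a sketch, with 2GB being just the value from the question above:)

ALTER SYSTEM SET max_wal_size = '2GB';
SELECT pg_reload_conf();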

Re: constant crashing

2024-04-14 Thread Vijaykumar Jain
On Sun, 14 Apr 2024 at 21:50, jack  wrote:

> The full error reads:
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> error: connection to server was lost
>
> PostgreSQL 16.2
>
> I also believe it is a resource issue which can be rectified with a
> setting, but which setting?
> If you were updating 100 million records what settings would you adjust?
>
> Here are the updates I am performing on the 100 million records:
> UPDATE table SET category_modified = UPPER(category);
> UPDATE table SET category_modified =
> REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'),
> '\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND
> POSITION('--' IN category_modified)>0;
> UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
> UPDATE table SET category_modified = CASE WHEN category_modified IS NOT
> NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
> UPDATE table SET category_modified = regexp_replace(category_modified,
> '-{2,}', '-', 'g');
> UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1
> FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND
> category_modified LIKE '%-';
>
>
Independent of best practices, I just want to check if there is a leak.
I created a sample table with text data and ran updates like yours and I
could not see mem growth, but I have a small vm and ofc your
category_modified field might be more complex than simple text fields for
30-40 chars.

can you grab the pid of your psql backend and (if you have pidstat
installed) monitor resource usage for that pid

postgres@pg:~/udemy/16$ psql
psql (16.2 (Ubuntu 16.2-1.pgdg22.04+1))
Type "help" for help.

postgres=# select pg_backend_pid();
  pg_backend_pid
 ----------------
   1214
(1 row)

# pidstat 2 100 -rud -h -p 1214
(get all stats for that pid) that might help to figure out if there is a
leak or the server has other things competing for memory and your updates
were picked by the killer.

Linux 5.15.0-101-generic (pg)   04/15/24_x86_64_(1 CPU)

# Time     UID    PID   %usr %system  %guest   %wait    %CPU   CPU  minflt/s  majflt/s     VSZ     RSS   %MEM  kB_rd/s  kB_wr/s  kB_ccwr/s  iodelay  Command
00:40:25   113   1214   0.00    0.00    0.00    0.00    0.00     0      0.00      0.00  354112  220940  24.18     0.00     0.00       0.00        0  postgres

# Time     UID    PID   %usr %system  %guest   %wait    %CPU   CPU  minflt/s  majflt/s     VSZ     RSS   %MEM  kB_rd/s  kB_wr/s  kB_ccwr/s  iodelay  Command
00:40:27   113   1214   0.00    0.00    0.00    0.00    0.00     0      0.00      0.00  354112  220940  24.18     0.00     0.00       0.00        0  postgres


ofc, if there is a genuine leak, then there might be more digging
needed: Finding memory leaks in Postgres C code (enterprisedb.com)

just kill the process requesting more mem than available  Memory context:
how PostgreSQL allocates memory - CYBERTEC (cybertec-postgresql.com)



-- 
Thanks,
Vijay
LinkedIn - Vijaykumar Jain 


Re: constant crashing

2024-04-14 Thread David G. Johnston
On Sun, Apr 14, 2024 at 10:20 AM Francisco Olarte 
wrote:

>
> If postgres does not suppress redundant updates ( lots of people can
> answer that )


It does not.  My other usual piece of advice along these lines, if doing
the transform outside the database is not desirable, is to at least ensure
you are doing them on a staging table that is defined as either being
temporary or unlogged.  WAL production probably isn't causing the crash but
can definitely be an issue and is pointless to incur during most
transformation processing.
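(Side note: rows that an UPDATE would leave completely unchanged can also be
skipped explicitly with the built-in suppress_redundant_updates_trigger(); a
sketch against a hypothetical staging table:)

-- Skip no-op row updates on the staging table.
CREATE TRIGGER z_suppress_noop_updates
BEFORE UPDATE ON staging
FOR EACH ROW
EXECUTE FUNCTION suppress_redundant_updates_trigger();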

David J.


Re: constant crashing

2024-04-14 Thread Francisco Olarte
Jack:

On Sun, 14 Apr 2024 at 18:20, jack  wrote:
> The full error reads:
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> error: connection to server was lost

This is a CLIENT-side error. What people are asking you for is the
corresponding error in the SERVER log. When the server has a severe
error ( probably resource exhaustion, which normally leads to
abnormal termination, as has been pointed out previously ) the
connection gets closed, and you should go to the server to look at the
underlying cause.



> PostgreSQL 16.2
> I also believe it is a resource issue which can be rectified with a setting, 
> but which setting?

Not all resource issues can be solved; sometimes one asks for
something which is too much for the available hardware. Anyway, as
pointed out, posting the configuration, the server error, and maybe
monitoring memory ( the usual culprit ) with top may help someone.
Look at memory because a common cause of this kind of thing in Linux
is having overcommit on ( overcommit is nice, but basically it lies to
postgres: it tells the server it can have X GB and then, when it tries
to use them, kills it with -9 ).

> If you were updating 100 million records what settings would you adjust?

From what you told earlier ( loading from CSV ) and with 

> Here are the updates I am performing on the 100 million records:
> UPDATE table SET category_modified = UPPER(category);
> UPDATE table SET category_modified = 
> REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'), 
> '\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND POSITION('--' 
> IN category_modified)>0;
> UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
> UPDATE table SET category_modified = CASE WHEN category_modified IS NOT NULL 
> THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
> UPDATE table SET category_modified = regexp_replace(category_modified, 
> '-{2,}', '-', 'g');
> UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 FOR 
> LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND 
> category_modified LIKE '%-';

this kind of updates I would recommend stream-filtering the data on
load. I do not know how it does it presently, but those are six
hundred million updates ( unless some are suppressed, which is what I
do not remember now ), which makes for huge transactions and is asking
for problems ( or maybe only 100_000_000 if you are autocommitting ).

If postgres does not suppress redundant updates ( lots of people can
answer that ) I would start by conditioning the updates ( UPDATE table
SET category_modified = UPPER(category) where category_modified is not
null and category_modified <> UPPER(category); ); this can shave time
and resource usage.
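Spelled out as a sketch, the same conditioning pattern applied to two of the later
passes (written against the main table named in the server log; the WHERE clauses
are the only additions):

-- Only rewrite rows that actually contain a space / a run of dashes.
UPDATE main SET category_modified = REPLACE(category_modified, ' ', '-')
WHERE category_modified LIKE '% %';
UPDATE main SET category_modified = regexp_replace(category_modified, '-{2,}', '-', 'g')
WHERE category_modified ~ '-{2,}';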

But if your updates are of this kind ( transformation of data on the
current row ( your sample is on a single value of a row, even easier )
without touching other things in the database ), I would encourage you
to use a small perl/awk/python/whatever program to filter and transform
the data before loading it. It has several advantages: one is speed,
another is minimizing server load, another is testability ( pipe sample
lines to the filter, check the result, when satisfied pipe the full file and
pipe the result to psql ), and choosing the right language ( I would say
perl, it was designed practically for this sort of thing ) makes the
filtering really easy. And it seems you are forcing it: I see null
handling in some updates, but not others, which hints at relying on
strict behaviour ( functions returning null on null input ). The only
apparent difficulty is parsing CSV, but there are millions of modules
for parsing the one understood by postgres ( if you generate your data,
the default text format for postgres is much easier to parse, i.e., in
perl normally a simple split// call does the trick ).

Doing it with a filtering stream would allow you to easily process
gigabytes of data using a few megabytes of ram in the client. I'm not
sure about the server, but stream filtering lends itself to very easy
batching of copies, and from what I read your server is beefy.

Francisco Olarte.




Re: constant crashing

2024-04-14 Thread Tom Lane
Adrian Klaver  writes:
> On 4/14/24 09:20, jack wrote:
>> The full error reads:
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> error: connection to server was lost

> Look at the OS system log.

Really the most detail would be in postgres' own log.  I'd only
expect the system log to contain relevant info if the problem
turns out to be an overeager OOM killer --- but if that's what's
happening, we should be able to diagnose from the postmaster's
log too (because it'd show up as the backend dying from SIGKILL).

regards, tom lane




Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 09:20, jack wrote:

The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost


Look at the OS system log.



PostgreSQL 16.2

I also believe it is a resource issue which can be rectified with a 
setting, but which setting?

If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified = 
REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'), 
'\s{2,}', ' ', 'g') WHERE  AND LENGTH(category_modified)>1 AND 
POSITION('--' IN category_modified)>0;

UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT 
NULL THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified, 
'-{2,}', '-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 
FOR LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND 
category_modified LIKE '%-';


Is the above all being done in one script/transaction?

Again what are the table definitions for the tables being copied into 
and/or modified?









--
Adrian Klaver
adrian.kla...@aklaver.com





re: constant crashing

2024-04-14 Thread jack
The full error reads:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
error: connection to server was lost
PostgreSQL 16.2
I also believe it is a resource issue which can be rectified with a setting, 
but which setting?
If you were updating 100 million records what settings would you adjust?

Here are the updates I am performing on the 100 million records:
UPDATE table SET category_modified = UPPER(category);
UPDATE table SET category_modified = 
REGEXP_REPLACE(REPLACE(REPLACE(category_modified, ,'-'), '`', '-'), 
'\s{2,}', ' ', 'g') WHERE AND LENGTH(category_modified)>1 AND POSITION('--' IN 
category_modified)>0;
UPDATE table SET category_modified = REPLACE(category_modified,' ','-');
UPDATE table SET category_modified = CASE WHEN category_modified IS NOT NULL 
THEN regexp_replace(category_modified, '[^a-zA-Z]$', '') ELSE NULL END;
UPDATE table SET category_modified = regexp_replace(category_modified, '-{2,}', 
'-', 'g');
UPDATE table SET category_modified = SUBSTRING(category_modified FROM 1 FOR 
LENGTH(category_modified) - 1) WHERE LENGTH(category_modified)>1 AND 
category_modified LIKE '%-';

Re: constant crashing

2024-04-14 Thread Tom Lane
jack  writes:
> I am trying to load about 1.4 billion records into a postgresql table.
> The data is in about 100 individual CSV files which I load individually into 
> the table using the COPY command.
> After the data is loaded I perform updates to the data, and this is when I 
> get errors. Some updates work, some crash.
> I am using psql and an SQL script file (update.sql) to perform the updating 
> (UPDATE table SET field=UPPER(field), etc.).

> The errors are all "Server closed the connection unexpectedly"

What appears in the postmaster log when this happens?

It would also be instructive to run "top" while you trigger
the problem, and see if there is obvious distress like the
server process's memory consumption ballooning.

It smells like a resource-exhaustion problem.  Which probably
shouldn't be happening on a machine with 128G RAM, but maybe
something is misconfigured to give up way below that.  In any
case, you'll never find the cause from the client side.

> Last attempts have been with ubuntu 22.04.04 and postgreSQL 16.

16 dot which?

regards, tom lane




Re: constant crashing

2024-04-14 Thread Adrian Klaver

On 4/14/24 07:24, jack wrote:

Hello,
I am trying to load about 1.4 billion records into a postgresql table.
The data is in about 100 individual CSV files which I load individually 
into the table using the COPY command.
After the data is loaded I perform updates to the data, and this is when 
I get errors. Some updates work, some crash.
I am using psql and an SQL script file (update.sql) to perform the 
updating (UPDATE table SET field=UPPER(field), etc.).


The errors are all "Server closed the connection unexpectedly"




I have been working on this for just over 1 year now, documenting every 
step, and I am still unable to get this to work without it crashing 
somewhere along the way.

I am beginning to wonder if postgreSQL is bi-polar.


More information needed, to start:

1) Complete example of the code in the script.

2) Table definition(s) of those being copied into.

3) Sample of the data being copied.

4) The error message(s) generated.

5) Database locale



Any help would be greatly appreciated.
Thank you


--
Adrian Klaver
adrian.kla...@aklaver.com