Re: [GENERAL] Large index operation crashes postgres

2010-05-02 Thread strk
On Mon, Mar 29, 2010 at 09:39:35AM +0200, Frans Hals wrote:
 Paul,
 
 took your advice and installed Geos 3.2.0.
 Index is now running for 14 hrs, postmaster is taking all the RAM.
 Sadly it looks like the Geos update didn't save me.

Frans, could you try installing GEOS 3.2.2 ?
A leak was fixed there. You should need no action
on the postgis side to get the fix.

--strk;

  ()   Free GIS  Flash consultant/developer
  /\   http://strk.keybit.net/services.html

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-29 Thread Frans Hals
Paul,

took your advice and installed Geos 3.2.0.
Index is now running for 14 hrs, postmaster is taking all the RAM.
Sadly it looks like the Geos update didn't save me.

Regards
Frans

2010/3/28 Paul Ramsey pram...@cleverelephant.ca:
 GEOS 3.2 is backwards compatible, so you can install it overtop of 3.1
 and things will still work.

 P


 2010/3/26 Paul Ramsey pram...@cleverelephant.ca:
 Occams razor says it's PostGIS. However, I'm concerned about how old
 the code being run is. In particular, the library underneath PostGIS,
 GEOS, had a *lot* of memory work done on it over the last year. I'd
 like to see if things improve if you upgrade to GEOS 3.2.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-29 Thread Frans Hals
Paul,

I have checked the different kinds of data in the table for their memory usage.
ST_LineSting is the one that's leaking, the other types complete
indexing without leakage.
Update to Geos 3.2.0 didn't improve the operation.

Kind regards
Frans

2010/3/28 Paul Ramsey pram...@cleverelephant.ca:
 MIght be random, might be a clue, we'll see. So it looks like much of
 the table is two-point lines and points.

 P

 On Sat, Mar 27, 2010 at 1:16 PM, Frans Hals fha...@googlemail.com wrote:

  ST_Point        |              | 20648939
  ST_MultiPolygon |              |     6188
  ST_Polygon      |              |  8054680

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-27 Thread Frans Hals
Paul,

I kindly received the information about the table data (quoting here):

It changes as it goes down the table, it's a right mixture.

 ST_LineString   |2 |  5398548
 ST_LineString   |3 |  2877681
 ST_LineString   |4 |  2160809
 ST_LineString   |5 |  1696900
 ST_LineString   |6 |  1362231
 ST_LineString   |7 |  1107733
 ST_LineString   |8 |   915616
 ST_LineString   |9 |   766904
 ST_LineString   |   10 |   646150
 ST_LineString   |   11 |   550356
 ST_LineString   |   12 |   473357
 ST_LineString   |   13 |   410038
 ST_LineString   |   14 |   358185
 ST_LineString   |   15 |   313985
 ST_LineString   |   16 |   278846
 ST_LineString   |   17 |   248253
 ST_LineString   |   18 |   220736
 ST_LineString   |   19 |   198809
 ST_LineString   |   20 |   179552
 ST_LineString   |   21 |   162140
 ST_LineString   |   22 |   147957
 ST_LineString   |   23 |   134321
 ST_LineString   |   24 |   123805
 ST_LineString   |   25 |   113805
 ST_LineString   |   26 |   105329
 ST_LineString   |   27 |96809
 ST_LineString   |   28 |90105
 ST_LineString   |   29 |83137
 ST_LineString   |   30 |77846
 ST_LineString   |   31 |72963
 ST_LineString   |   32 |67830
 ST_LineString   |   33 |63849
 ST_LineString   |   34 |60241
 ST_LineString   |   35 |56312
 ST_LineString   |   36 |52805
 ST_LineString   |   37 |49919
 ST_LineString   |   38 |47402
 ST_LineString   |   39 |44860
 ST_LineString   |   40 |41987
 ST_LineString   |   41 |40055
 ST_LineString   |   42 |38173
 ST_LineString   |   43 |36649
 ST_LineString   |   44 |34464
 ST_LineString   |   45 |32637
 ST_LineString   |   46 |31695
 ST_LineString   |   47 |29851
 ST_LineString   |   48 |28546
 ST_LineString   |   49 |27419
 ST_LineString   |   50 |26784

 ST_LineString   | 1993 |2
 ST_LineString   | 1995 |1
 ST_LineString   | 1997 |6
 ST_LineString   | 1998 |5
 ST_LineString   | 1999 |3
 ST_LineString   | 2000 |9
 ST_Point|  | 20648939
 ST_MultiPolygon |  | 6188
 ST_Polygon  |  |  8054680


One thing you could try is indexing each geometry type in turn and
watch the memory usage.

If indexing starts sequentially from the beginning to the end, we
reach ST_Point around row 22.000.000. This is from the count of
geometry operations exactly where the slowdown begins.
Does this track us to the leak?

Thanks  regards
Frans

2010/3/26 Paul Ramsey pram...@cleverelephant.ca:


 Could you
 - put in your version information
 - tell me what kind of spatial objects you have? polygons of  100
 vertices? lines of two vertices? etc. That will help me pick similar
 data for the memory testing.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-27 Thread Frans Hals
Paul,

do you know, if Postgis 1.3.6 will survive, if I install Geos 3.2 over
it? Currently, there's Geos 3.1.1 installed.
I remember, I had to downgrade Postgis/Geos to be compatible with
Postgres 8.3.9. As I herited the database from a  8.3.9 dump, I
needed to fit my installation before restoring the dump.

Thanks
Frans

2010/3/26 Paul Ramsey pram...@cleverelephant.ca:
 Occams razor says it's PostGIS. However, I'm concerned about how old
 the code being run is. In particular, the library underneath PostGIS,
 GEOS, had a *lot* of memory work done on it over the last year. I'd
 like to see if things improve if you upgrade to GEOS 3.2.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-26 Thread Frans Hals
Operation is now running for around 13 hrs.
Two postmaster processes above 1% memory usage are running.

One of them uses constantly 26.5% of memory.
The other one is growing:
After 1 hr25%
After 9 hrs  59%
After 13 hrs64%

Thanks  regards
Frans


2010/3/25 Frans Hals fha...@googlemail.com:
 Paul,

 I have started the operation right now after a fresh reboot of the machine.
 update placex set leakcheck = st_x(st_centroid(geometry));
  Please give me some time to collect the results...

 Thanks  regards
 Frans

 2010/3/25 Paul Ramsey pram...@cleverelephant.ca:
 If you build an index, or try to update a column, using just the

 ST_X(ST_Centroid())

 without all the extra pl/pgsql wrapping, do you still see the memory
 footprint grow substantially?

 P.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-26 Thread Tom Lane
Frans Hals fha...@googlemail.com writes:
 Operation is now running for around 13 hrs.
 Two postmaster processes above 1% memory usage are running.

 One of them uses constantly 26.5% of memory.
 The other one is growing:
 After 1 hr25%
 After 9 hrs  59%
 After 13 hrs64%

Well, it's pretty clear that you've found a memory leak, but that was
what we thought before; this data doesn't move us any closer to a fix.
In particular it's not possible to guess whether the leak should be
blamed on Postgres or Postgis code.  Even if we knew that, I'm not
sure we could fix the leak without tracing through actual execution.

Can you generate a self-contained test case that exhibits similar bloat?
I would think it's probably not very dependent on the specific data in
the column, so a simple script that constructs a lot of random data
similar to yours might be enough, if you would rather not show us your
real data.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-26 Thread Paul Ramsey
Occams razor says it's PostGIS. However, I'm concerned about how old
the code being run is. In particular, the library underneath PostGIS,
GEOS, had a *lot* of memory work done on it over the last year. I'd
like to see if things improve if you upgrade to GEOS 3.2.

On Fri, Mar 26, 2010 at 9:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Frans Hals fha...@googlemail.com writes:
 Operation is now running for around 13 hrs.
 Two postmaster processes above 1% memory usage are running.

 One of them uses constantly 26.5% of memory.
 The other one is growing:
 After 1 hr        25%
 After 9 hrs      59%
 After 13 hrs    64%

 Well, it's pretty clear that you've found a memory leak, but that was
 what we thought before; this data doesn't move us any closer to a fix.
 In particular it's not possible to guess whether the leak should be
 blamed on Postgres or Postgis code.  Even if we knew that, I'm not
 sure we could fix the leak without tracing through actual execution.

 Can you generate a self-contained test case that exhibits similar bloat?
 I would think it's probably not very dependent on the specific data in
 the column, so a simple script that constructs a lot of random data
 similar to yours might be enough, if you would rather not show us your
 real data.

                        regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-26 Thread Frans Hals
Tom,

I'm pretty new to memory debugging, so please be patient if I'm not as
precise as you suppose me to be.
For the records I have run a valgrind postmaster session, starting my
initial indexing routine until it crashes postgres.
If you think this might be enlightning for you, I'll send you the
transcript. It's too long for the list.

I'm not sure, what you're thinking about generating a self-contained
test that exhibits similar bloat.
I have started an index creation using my data without calling postgis
functions. Just to make it busy:
CREATE INDEX idx_placex_sector ON placex USING btree
(substring(geometry,1,100), rank_address, osm_type, osm_id);
This is now running against the 50.000.000 rows in placex. I will
update you about the memory usage it takes.

The data itself isn't a secret. I need your experience to find and fix
the problem. For myself I 'll try all necessary steps, you suggest me
to do.

Kind regards
Frans

2010/3/26 Tom Lane t...@sss.pgh.pa.us:
 Well, it's pretty clear that you've found a memory leak, but that was
 what we thought before; this data doesn't move us any closer to a fix.
 In particular it's not possible to guess whether the leak should be
 blamed on Postgres or Postgis code.  Even if we knew that, I'm not
 sure we could fix the leak without tracing through actual execution.

 Can you generate a self-contained test case that exhibits similar bloat?
 I would think it's probably not very dependent on the specific data in
 the column, so a simple script that constructs a lot of random data
 similar to yours might be enough, if you would rather not show us your
 real data.

                        regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-26 Thread Frans Hals
The index mentioned below has been created in some minutes without problems.
Dropped it and created it again. Uses around 36 % of memorywhile
creating, after completion postmaster stays at 26 %.


 I'm not sure, what you're thinking about generating a self-contained
 test that exhibits similar bloat.
 I have started an index creation using my data without calling postgis
 functions. Just to make it busy:
 CREATE INDEX idx_placex_sector ON placex USING btree
 (substring(geometry,1,100), rank_address, osm_type, osm_id);
 This is now running against the 50.000.000 rows in placex. I will
 update you about the memory usage it takes.

 Can you generate a self-contained test case that exhibits similar bloat?
 I would think it's probably not very dependent on the specific data in
 the column, so a simple script that constructs a lot of random data
 similar to yours might be enough, if you would rather not show us your
 real data.

                        regards, tom lane



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-25 Thread Frans Hals
5 x postmaster taking memory:

93.3 %
18.7 %
 0.3 %
 0.2 %
 0.0 %

112.5%

Looks like there is someone living beyond its means?

Frans

2010/3/24 Tom Lane t...@sss.pgh.pa.us:

 Hm.  I wonder about a memory leak in there somewhere.  Have you checked
 the process size while this is going on?

                        regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-25 Thread Frans Hals
Tom,

ran a CREATE INDEX to the gdb operated postmaster.
Nothing new due to missing debugging libraries, so this might not help:

Reading symbols from /usr/bin/postmaster...(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install
postgresql-server-8.3.9-1PGDG.f12.x86_64
(gdb) run
Starting program: /usr/bin/postmaster
[Thread debugging using libthread_db enabled]
Detaching after fork from child process 2869.
LOG:  database system was interrupted; last known up at 2010-03-24 05:51:12 PDT
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at CF/D482490
LOG:  unexpected pageaddr CE/AD49 in log file 207, segment 13,
offset 4784128
LOG:  redo done at CF/D48FB80
Detaching after fork from child process 2870.
Detaching after fork from child process 2871.
Detaching after fork from child process 2872.
LOG:  database system is ready to accept connections
Detaching after fork from child process 2996.
Detaching after fork from child process 3682.
Detaching after fork from child process 3685.
Detaching after fork from child process 3687.
Detaching after fork from child process 3688.
Detaching after fork from child process 3690.
Detaching after fork from child process 3691.
Detaching after fork from child process 3696.
Detaching after fork from child process 3698.
Detaching after fork from child process 3702.
Detaching after fork from child process 3706.
Detaching after fork from child process 3710.
LOG:  background writer process (PID 2870) was terminated by signal 9: Killed
LOG:  terminating any other active server processes
LOG:  statistics collector process (PID 2872) was terminated by signal 9: Killed
Detaching after fork from child process 5595.
FATAL:  the database system is in recovery mode

Is there anything meaningful for you?
Does it makes sense to install the debuginfos to catch the problem? If
yes, I may do so.

Paul asked me, to check another function of the postgis set for a
memory leak. I will try this now.

Thanks  regards
Frans

2010/3/24 Tom Lane t...@sss.pgh.pa.us:

 Can you provide a stack trace from the crash?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-25 Thread Frans Hals
Paul,

I have started the operation right now after a fresh reboot of the machine.
update placex set leakcheck = st_x(st_centroid(geometry));
 Please give me some time to collect the results...

Thanks  regards
Frans

2010/3/25 Paul Ramsey pram...@cleverelephant.ca:
 If you build an index, or try to update a column, using just the

 ST_X(ST_Centroid())

 without all the extra pl/pgsql wrapping, do you still see the memory
 footprint grow substantially?

 P.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Large index operation crashes postgres

2010-03-24 Thread Frans Hals
Hi,

running a geo-database from a dump restore where still one of the most
important indexes is missing and so the search is slow.
Whenever I try to add the follwing index to the table placex, one of
the postmaster processes dies and the server restarts.

I try:
CREATE INDEX idx_placex_sector ON placex USING btree
(geometry_sector(geometry), rank_address, osm_type, osm_id);

The table counts around 50.000.000 rows.
The first 20.000.000 are indexed in 20-30 minutes. Nice!
Then indexing becomes slow and slower, first taking 100.000 rows in
ten minutes while further consequently decreasing speed.
When the job reaches something around row 25.000.000 postgres goes down:

WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.

I have checked RAM and changed the HD with no success.
Experimenting with a lot of different memory settings in the conf-file
didn't help either.
Is there anybody else who experienced this and found a way to create this index?
Server is postgres 8.3.9 with 4 GB dedicated RAM.

gemoetry_sector function looks like this (postgis):

DECLARE
  NEWgeometry geometry;
BEGIN
-- RAISE WARNING '%',place;
  NEWgeometry := place;
  IF ST_IsEmpty(NEWgeometry) OR NOT ST_IsValid(NEWgeometry) OR
ST_X(ST_Centroid(NEWgeometry))::text in ('NaN','Infinity','-Infinity')
OR ST_Y(ST_Centroid(NEWgeometry))::text in
('NaN','Infinity','-Infinity') THEN
NEWgeometry := ST_buffer(NEWgeometry,0);
IF ST_IsEmpty(NEWgeometry) OR NOT ST_IsValid(NEWgeometry) OR
ST_X(ST_Centroid(NEWgeometry))::text in ('NaN','Infinity','-Infinity')
OR ST_Y(ST_Centroid(NEWgeometry))::text in
('NaN','Infinity','-Infinity') THEN
  RETURN NULL;
END IF;
  END IF;
  RETURN (500-ST_X(ST_Centroid(NEWgeometry))::integer)*1000 +
(500-ST_Y(ST_Centroid(NEWgeometry))::integer);
END;

The subcalled St_Centroid is a postgis C-function located in
/usr/lib/postgresql/8.3/lib/liblwgeom.


Anybody out there has an idea what happens or better how to reach the
50.000.000?

Thanks
Frans

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-24 Thread Tom Lane
Frans Hals fha...@googlemail.com writes:
 Whenever I try to add the follwing index to the table placex, one of
 the postmaster processes dies and the server restarts.

Can you provide a stack trace from the crash?

 I try:
 CREATE INDEX idx_placex_sector ON placex USING btree
 (geometry_sector(geometry), rank_address, osm_type, osm_id);

 The table counts around 50.000.000 rows.
 The first 20.000.000 are indexed in 20-30 minutes. Nice!
 Then indexing becomes slow and slower, first taking 100.000 rows in
 ten minutes while further consequently decreasing speed.
 When the job reaches something around row 25.000.000 postgres goes down:

On what exactly do you base these statements about number of rows
processed?  CREATE INDEX doesn't provide any such feedback that I'm
aware of.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Large index operation crashes postgres

2010-03-24 Thread Paul Ramsey
Can you do?

alter table placex add column geometry_sector integer;
update placex set geometry_sector = geometry_sector(geometry);

P.

On Wed, Mar 24, 2010 at 1:15 PM, Frans Hals fha...@googlemail.com wrote:
 Hi,

 running a geo-database from a dump restore where still one of the most
 important indexes is missing and so the search is slow.
 Whenever I try to add the follwing index to the table placex, one of
 the postmaster processes dies and the server restarts.

 I try:
 CREATE INDEX idx_placex_sector ON placex USING btree
 (geometry_sector(geometry), rank_address, osm_type, osm_id);

 The table counts around 50.000.000 rows.
 The first 20.000.000 are indexed in 20-30 minutes. Nice!
 Then indexing becomes slow and slower, first taking 100.000 rows in
 ten minutes while further consequently decreasing speed.
 When the job reaches something around row 25.000.000 postgres goes down:

 WARNING:  terminating connection because of crash of another server process
 DETAIL:  The postmaster has commanded this server process to roll back
 the current transaction and exit, because another server process
 exited abnormally and possibly corrupted shared memory.

 I have checked RAM and changed the HD with no success.
 Experimenting with a lot of different memory settings in the conf-file
 didn't help either.
 Is there anybody else who experienced this and found a way to create this 
 index?
 Server is postgres 8.3.9 with 4 GB dedicated RAM.

 gemoetry_sector function looks like this (postgis):

 DECLARE
  NEWgeometry geometry;
 BEGIN
 -- RAISE WARNING '%',place;
  NEWgeometry := place;
  IF ST_IsEmpty(NEWgeometry) OR NOT ST_IsValid(NEWgeometry) OR
 ST_X(ST_Centroid(NEWgeometry))::text in ('NaN','Infinity','-Infinity')
 OR ST_Y(ST_Centroid(NEWgeometry))::text in
 ('NaN','Infinity','-Infinity') THEN
    NEWgeometry := ST_buffer(NEWgeometry,0);
    IF ST_IsEmpty(NEWgeometry) OR NOT ST_IsValid(NEWgeometry) OR
 ST_X(ST_Centroid(NEWgeometry))::text in ('NaN','Infinity','-Infinity')
 OR ST_Y(ST_Centroid(NEWgeometry))::text in
 ('NaN','Infinity','-Infinity') THEN
      RETURN NULL;
    END IF;
  END IF;
  RETURN (500-ST_X(ST_Centroid(NEWgeometry))::integer)*1000 +
 (500-ST_Y(ST_Centroid(NEWgeometry))::integer);
 END;

 The subcalled St_Centroid is a postgis C-function located in
 /usr/lib/postgresql/8.3/lib/liblwgeom.


 Anybody out there has an idea what happens or better how to reach the
 50.000.000?

 Thanks
 Frans

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general