Re: [GENERAL] performance problem with loading data

2007-06-11 Thread Alban Hertroys
Sergey Karin wrote:
> Hi all.
> 
> I use PG 8.1.8 and PostGIS 1.1.1
> vka1=# select version();

> I develop application for loading geodata to database. In separate
> transaction the application  inserts the data in separate table that
> created dynamically when transaction started. All tables has equal
> structure.

> I found *VERY* strange problem: speed of the loading process is slow
> down (first 1 objects are loaded in 69792 ms and last 1 objects
> in 123737 ms). And futhermore, if I do NOT close current session but
> start new transaction, the first 1 objects will be loaded in 192279
> ms and last 1 objects in 251742 ms. And so on!! :-(
> 
> But if I drop the trigger speed of loading process is NOT slow down.
> 
> Who can explain me what I do incorrect?

I think I can explain what happens (to my understanding, I'm "just a user").

When you start you have an empty table (right?). If PG's statistics are
accurate to match that situation at that point, it will know there are 0
records in it. Querying a table with 0 records is fastest with a
sequential scan.

Next you put data into the table, and at a certain point a sequential
scan will not be optimal anymore. But PG doesn't know that, so it keeps
using sequential scans!

I think you'll see performance improve if you add ANALYZE 
statements periodically. That way PG updates its stats on that table.

I know my explanation is a bit simplified, there are other statistics in
play. That is why I didn't advise to run ANALYZE just once after a
certain number of inserts ;)

Now this would be problematic with INSERT ... SELECT, as there's no room
to run periodical ANALYZES (maybe if you'd put in LIMIT/OFFSET, but that
feels kludgy).

For the technical people; would it be possible to use the statistics on
the table(s) in the SELECT part of the statement to update the
statistics of the table being inserted into? Maybe they wouldn't be
entirely accurate, but it wouldn't it be more accurate than statistics
that say it's empty?

-- 
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] performance problem with loading data

2007-06-09 Thread Dann Corbit
Try using COPY instead of insert select, if that is possible for you.
It is much faster than insert.

 

Otherwise, you might try dropping the index and constraint, loading the
data, and recreating the index and constraint.

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Sergey Karin
Sent: Saturday, June 09, 2007 1:48 AM
To: pgsql-general@postgresql.org; PostGIS Users Discussion
Subject: [GENERAL] performance problem with loading data

 

Hi all.

I use PG 8.1.8 and PostGIS 1.1.1
vka1=# select version();
 version

- 
 PostgreSQL 8.1.8 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)

vka1=# select postgis_full_version();
   postgis_full_version

-- 
 POSTGIS="1.1.1" GEOS="2.2.3-CAPI-1.1.1" PROJ="Rel. 4.4.9, 29 Oct 2004"
USE_STATS

I develop application for loading geodata to database. In separate
transaction the application  inserts the data in separate table that
created dynamically when transaction started. All tables has equal
structure. 

Geodata has simple and similar structure (field geometry): POLYGON((x1
y1, x2 y2, x3 y3, x4 y4, x1 y1))
For loading geodata I use  INSERT into   via LIBPQ.
In big loop I call PQexec(conn, query).

I found *VERY* strange problem: speed of the loading process is slow
down (first 1 objects are loaded in 69792 ms and last 1 objects
in 123737 ms). And futhermore, if I do NOT close current session but
start new transaction, the first 1 objects will be loaded in 192279
ms and last 1 objects in 251742 ms. And so on!! :-(

But if I drop the trigger speed of loading process is NOT slow down.

Who can explain me what I do incorrect?

Thanks in advance

Sergey Karin

===
code===
===
create table  (
GID   SERIAL   not null, 
GEOM_ORDER INT4not null default 0,
ZOOMABLE  BOOL  not null default
false,
GEOM  GEOMETRY 
constraint  primary key (GID), 
);

create index  on  using gist ( geom
gist_geometry_ops );
create trigger trgOInsert 
before insert or update 
on  
for each row 
execute procedure oInsertCheck('GEOMETRYCOLLECTION',
0);

create or replace function oInsertCheck() returns trigger as'
declare
g_isvalid boolean; 
iSrid int4;
geomType varchar;
begin

if(new.geom isnull) then
new.geom := geomFromText(\'GEOMETRYCOLLECTION(EMPTY)\');
end if;
if(new.geom_order isnull) then 
new.geom_order := 0;
end if;

select isvalid(new.geom) into g_isvalid;
if(g_isvalid isnull) then
return NULL;
end if;

geomType := TG_ARGV[TG_NARGS-2]; 
iSrid:= TG_ARGV[TG_NARGS-1];

if(upper(geomType) = \'GEOMETRYCOLLECTION\') then
new.geom := force_collection(new.geom);
end if;

new.geom := setSrid( new.geom, iSrid);

return new;
end
'language 'plpgsql' security definer;

=end of code==



[GENERAL] performance problem with loading data

2007-06-09 Thread Sergey Karin

Hi all.

I use PG 8.1.8 and PostGIS 1.1.1
vka1=# select version();
version
-
PostgreSQL 8.1.8 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC)
4.1.220061115 (prerelease) (Debian
4.1.1-21)

vka1=# select postgis_full_version();
  postgis_full_version
--
POSTGIS="1.1.1" GEOS="2.2.3-CAPI-1.1.1" PROJ="Rel. 4.4.9, 29 Oct 2004"
USE_STATS

I develop application for loading geodata to database. In separate
transaction the application  inserts the data in separate table that created
dynamically when transaction started. All tables has equal structure.

Geodata has simple and similar structure (field geometry): POLYGON((x1 y1,
x2 y2, x3 y3, x4 y4, x1 y1))
For loading geodata I use  INSERT into   via LIBPQ.
In big loop I call PQexec(conn, query).

I found *VERY* strange problem: speed of the loading process is slow down
(first 1 objects are loaded in 69792 ms and last 1 objects in 123737
ms). And futhermore, if I do NOT close current session but start new
transaction, the first 1 objects will be loaded in 192279 ms and last
1 objects in 251742 ms. And so on!! :-(

But if I drop the trigger speed of loading process is NOT slow down.

Who can explain me what I do incorrect?

Thanks in advance

Sergey Karin

===
code===
===
create table  (
   GID   SERIAL   not null,
   GEOM_ORDER INT4not null default 0,
   ZOOMABLE  BOOL  not null default false,
   GEOM  GEOMETRY
   constraint  primary key (GID),
   );

create index  on  using gist ( geom
gist_geometry_ops );
create trigger trgOInsert
   before insert or update
   on 
   for each row
   execute procedure oInsertCheck('GEOMETRYCOLLECTION', 0);

create or replace function oInsertCheck() returns trigger as'
declare
   g_isvalid boolean;
   iSrid int4;
   geomType varchar;
begin

   if(new.geom isnull) then
   new.geom := geomFromText(\'GEOMETRYCOLLECTION(EMPTY)\');
   end if;
   if(new.geom_order isnull) then
   new.geom_order := 0;
   end if;

   select isvalid(new.geom) into g_isvalid;
   if(g_isvalid isnull) then
   return NULL;
   end if;

   geomType := TG_ARGV[TG_NARGS-2];
   iSrid:= TG_ARGV[TG_NARGS-1];

   if(upper(geomType) = \'GEOMETRYCOLLECTION\') then
   new.geom := force_collection(new.geom);
   end if;

   new.geom := setSrid(new.geom, iSrid);

   return new;
end
'language 'plpgsql' security definer;

=end of code==