Re: [GENERAL] Memory Issue with array_agg?

2013-08-21 Thread Robert Sosinski
Hi Pavel,

Here are the explains you asked for:

explain analyze select string_agg(id::text,',') from things group by guid;
QUERY PLAN

--
 GroupAggregate  (cost=400357.78..433784.93 rows=1337086 width=37) (actual
time=41434.485..53195.185 rows=2378626 loops=1)
   ->  Sort  (cost=400357.78..403700.49 rows=1337086 width=37) (actual
time=41434.433..44992.736 rows=2378626 loops=1)
 Sort Key: guid
 Sort Method: quicksort  Memory: 284135kB
 ->  Seq Scan on things  (cost=0.00..264304.86 rows=1337086
width=37) (actual time=0.027..21429.179 rows=2378626 loops=1)
 Total runtime: 56295.362 ms
(6 rows)


explain analyze select array_agg(id::text) from things group by guid;
   QUERY PLAN

-
 GroupAggregate  (cost=400357.78..433784.93 rows=1337086 width=37) (actual
time=23953.922..38157.059 rows=2378626 loops=1)
   ->  Sort  (cost=400357.78..403700.49 rows=1337086 width=37) (actual
time=23953.847..27527.316 rows=2378626 loops=1)
 Sort Key: guid
 Sort Method: quicksort  Memory: 284135kB
 ->  Seq Scan on things  (cost=0.00..264304.86 rows=1337086
width=37) (actual time=0.007..4941.752 rows=2378626 loops=1)
 Total runtime: 41280.897 ms
(6 rows)



Those queries are running on the machine now and memory is not inflating.
However, I just ran this one, and it blew up:

explain with t as (
  select id, guid,
         md5(concat_ws(':', fields -> 'a', fields -> 'b', fields -> 'c',
             fields -> 'd', fields -> 'e', foo_id::text))
  from things
)
select md5, count(id), array_agg(id)
from t
group by 1
having count(id) > 1;
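
For reference, a string_agg form of the same check (a sketch only, based on
the substitution discussed earlier in this thread; not benchmarked here):

explain with t as (
  select id, guid,
         md5(concat_ws(':', fields -> 'a', fields -> 'b', fields -> 'c',
             fields -> 'd', fields -> 'e', foo_id::text))
  from things
)
select md5, count(id), string_agg(id::text, ',')
from t
group by 1
having count(id) > 1;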

-Robert


On Tue, Aug 20, 2013 at 1:53 PM, Pavel Stehule wrote:

> Can you send the EXPLAIN results for both use cases?
>
> Pavel
>
>
> 2013/8/19 Robert Sosinski 
>
>> At the moment, all guids are distinct; however, before I zapped the
>> duplicates, there were 280 duplicates.
>>
>> Currently, there are over 2 million distinct guids.
>>
>> -Robert
>>
>>
>> On Mon, Aug 19, 2013 at 11:12 AM, Pavel Stehule 
>> wrote:
>>
>>>
>>>
>>>
>>> 2013/8/19 Robert Sosinski 
>>>
 Hi Pavel,

 What kind of example do you need?  I can't give you the actual data I
 have in the table, but I can give you an example query and the schema
 attached below.  From there, I would just put in 2 million rows, worth
 1.2 GB of data.  Average sizes of the extended columns (using the
 pg_column_size function), in bytes, are:

 guid: 33
 name: 2.41
 currency: 4
 fields: 120.32

 example query:

 -- find duplicate records using a guid
 select guid, array_agg(id) from orders group by guid;

>>>
>>> How many distinct guids are there, and how many duplicates?
>>>
>>> ??
>>>
>>> regards
>>>
>>> Pavel
>>>
>>>
>>>

 example schema:
  Table "public.things"

Column   |Type |
  Modifiers  | Storage  | Stats target | Description

 +-+-+--+--+-
  id | integer | not null default
 nextval('things_id_seq'::regclass) | plain|  |
  version| integer | not null
  | plain|  |
  created_at | timestamp without time zone | not null
  | plain|  |
  updated_at | timestamp without time zone | not null
  | plain|  |
  foo_id | integer | not null
  | plain|  |
  bar_id | integer | not null
  | plain|  |
  baz_id | integer | not null
  | plain|  |
  guid   | character varying   | not null
  | extended |  |
  name   | character varying   | not null
  | extended |  |
  price  | numeric(12,2)   | not null
  | main |  |
  currency   | character varying   | not null
  | extended |  |
  amount | integer | not null
  | plain|  |
  the_date   | date| not null
  | plain|  |
  fields | hstore   

Re: [GENERAL] Memory Issue with array_agg?

2013-08-20 Thread Pavel Stehule
Can you send the EXPLAIN results for both use cases?

Pavel


2013/8/19 Robert Sosinski 

> At the moment, all guids are distinct; however, before I zapped the
> duplicates, there were 280 duplicates.
>
> Currently, there are over 2 million distinct guids.
>
> -Robert
>
>
> On Mon, Aug 19, 2013 at 11:12 AM, Pavel Stehule 
> wrote:
>
>>
>>
>>
>> 2013/8/19 Robert Sosinski 
>>
>>> Hi Pavel,
>>>
>>> What kind of example do you need?  I can't give you the actual data I
>>> have in the table, but I can give you an example query and the schema
>>> attached below.  From there, I would just put in 2 million rows, worth
>>> 1.2 GB of data.  Average sizes of the extended columns (using the
>>> pg_column_size function), in bytes, are:
>>>
>>> guid: 33
>>> name: 2.41
>>> currency: 4
>>> fields: 120.32
>>>
>>> example query:
>>>
>>> -- find duplicate records using a guid
>>> select guid, array_agg(id) from orders group by guid;
>>>
>>
>> How many distinct guids are there, and how many duplicates?
>>
>> ??
>>
>> regards
>>
>> Pavel
>>
>>
>>
>>>
>>> example schema:
>>>  Table "public.things"
>>>
>>>Column   |Type |
>>>  Modifiers  | Storage  | Stats target | Description
>>>
>>> +-+-+--+--+-
>>>  id | integer | not null default
>>> nextval('things_id_seq'::regclass) | plain|  |
>>>  version| integer | not null
>>>| plain|  |
>>>  created_at | timestamp without time zone | not null
>>>| plain|  |
>>>  updated_at | timestamp without time zone | not null
>>>| plain|  |
>>>  foo_id | integer | not null
>>>| plain|  |
>>>  bar_id | integer | not null
>>>| plain|  |
>>>  baz_id | integer | not null
>>>| plain|  |
>>>  guid   | character varying   | not null
>>>| extended |  |
>>>  name   | character varying   | not null
>>>| extended |  |
>>>  price  | numeric(12,2)   | not null
>>>| main |  |
>>>  currency   | character varying   | not null
>>>| extended |  |
>>>  amount | integer | not null
>>>| plain|  |
>>>  the_date   | date| not null
>>>| plain|  |
>>>  fields | hstore  |
>>> | extended |  |
>>> Indexes:
>>> "things_pkey" PRIMARY KEY, btree (id)
>>> "things_foo_id_idx" btree (foo_id)
>>> "things_bar_id_idx" btree (bar_id)
>>> "things_baz_id_idx" btree (baz_id)
>>> "things_guid_uidx" UNIQUE, btree (guid)
>>> "things_lpad_lower_name_eidx" btree (lpad(lower(name::text), 10,
>>> '0'::text))
>>> "things_price_idx" btree (price)
>>>
>>> Foreign-key constraints:
>>> "things_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(id)
>>> "things_bar_id_fkey" FOREIGN KEY (bar_id) REFERENCES bars(id)
>>> "things_baz_id_fkey" FOREIGN KEY (baz_id) REFERENCES bazs(id)
>>> Triggers:
>>> timestamps_trig BEFORE INSERT OR UPDATE ON things FOR EACH ROW
>>> EXECUTE PROCEDURE timestamps_tfun()
>>>
>>> Let me know if you need anything else.
>>>
>>> Thanks,
>>>
>>>
>>> On Mon, Aug 19, 2013 at 3:29 AM, Pavel Stehule 
>>> wrote:
>>>
 Hello

 please, can you send some example or test?

 Regards

 Pavel Stehule


 2013/8/19 Robert Sosinski 

> When using array_agg on a large table, memory usage seems to spike up
> until Postgres crashes with the following error:
>
> 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
> because of crash of another server process
> 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
>
> I've definitely isolated it down to using array_agg, as when I changed
> the query to use string_agg, it worked fine.  I also tried using array_agg
> on a few different queries, all yielding the same issue.  Swapping in
> string_agg fixed the issue once more.
>
> This particular table has over 2 million rows and is 

Re: [GENERAL] Memory Issue with array_agg?

2013-08-20 Thread Pavel Stehule
Hello

It is strange. I am trying to simulate it without success. On 1M rows,
where every id appears twice:

processing string_agg .. approx. 30MB

processing array_agg .. approx. 32MB

postgres=# create table foo(a int, b varchar);
CREATE TABLE
postgres=# insert into foo select i, md5(i::text) from
generate_series(1,1000000) g(i);
INSERT 0 1000000
postgres=# insert into foo select i, md5(i::text) from
generate_series(1,1000000) g(i);
INSERT 0 1000000
postgres=# CREATE INDEX on foo(b);
CREATE INDEX
postgres=# ANALYZE foo;
ANALYZE
postgres=# explain analyze select string_agg(a::text,',') from foo group by
b;
 QUERY
PLAN

 GroupAggregate  (cost=410045.19..447831.37 rows=1022895 width=37) (actual
time=10195.972..14993.493 rows=1000000 loops=1)
   ->  Sort  (cost=410045.19..415045.19 rows=2000000 width=37) (actual
time=10195.944..13659.985 rows=2000000 loops=1)
 Sort Key: b
 Sort Method: external merge  Disk: 97768kB
 ->  Seq Scan on foo  (cost=0.00..36667.00 rows=2000000 width=37)
(actual time=0.018..321.197 rows=2000000 loops=1)
 Total runtime: 15066.397 ms
(6 rows)

postgres=# explain analyze select array_agg(a::text) from foo group by b;
 QUERY
PLAN

 GroupAggregate  (cost=410045.19..447831.37 rows=1022895 width=37) (actual
time=10062.095..15697.755 rows=1000000 loops=1)
   ->  Sort  (cost=410045.19..415045.19 rows=2000000 width=37) (actual
time=10062.059..13613.300 rows=2000000 loops=1)
 Sort Key: b
 Sort Method: external merge  Disk: 97768kB
 ->  Seq Scan on foo  (cost=0.00..36667.00 rows=2000000 width=37)
(actual time=0.029..311.423 rows=2000000 loops=1)
 Total runtime: 15799.226 ms
(6 rows)

Regards

Pavel


2013/8/19 Robert Sosinski 

> At the moment, all guids are distinct; however, before I zapped the
> duplicates, there were 280 duplicates.
>
> Currently, there are over 2 million distinct guids.
>
> -Robert
>
>
> On Mon, Aug 19, 2013 at 11:12 AM, Pavel Stehule 
> wrote:
>
>>
>>
>>
>> 2013/8/19 Robert Sosinski 
>>
>>> Hi Pavel,
>>>
>>> What kind of example do you need?  I can't give you the actual data I
>>> have in the table, but I can give you an example query and the schema
>>> attached below.  From there, I would just put in 2 million rows, worth
>>> 1.2 GB of data.  Average sizes of the extended columns (using the
>>> pg_column_size function), in bytes, are:
>>>
>>> guid: 33
>>> name: 2.41
>>> currency: 4
>>> fields: 120.32
>>>
>>> example query:
>>>
>>> -- find duplicate records using a guid
>>> select guid, array_agg(id) from orders group by guid;
>>>
>>
>> How many distinct guids are there, and how many duplicates?
>>
>> ??
>>
>> regards
>>
>> Pavel
>>
>>
>>
>>>
>>> example schema:
>>>  Table "public.things"
>>>
>>>Column   |Type |
>>>  Modifiers  | Storage  | Stats target | Description
>>>
>>> +-+-+--+--+-
>>>  id | integer | not null default
>>> nextval('things_id_seq'::regclass) | plain|  |
>>>  version| integer | not null
>>>| plain|  |
>>>  created_at | timestamp without time zone | not null
>>>| plain|  |
>>>  updated_at | timestamp without time zone | not null
>>>| plain|  |
>>>  foo_id | integer | not null
>>>| plain|  |
>>>  bar_id | integer | not null
>>>| plain|  |
>>>  baz_id | integer | not null
>>>| plain|  |
>>>  guid   | character varying   | not null
>>>| extended |  |
>>>  name   | character varying   | not null
>>>| extended |  |
>>>  price  | numeric(12,2)   | not null
>>>| main |  |
>>>  currency   | character varying   | not null
>>>| extended |  |
>>>  amount | integer | not null
>>>| plain|  |
>>>  the_date   | date| not null
>>>| plain|  |
>>>  fields | hstore  |
>>> | extended |  |
>>> Indexes:
>>> "things_pkey" PRI

Re: [GENERAL] Memory Issue with array_agg?

2013-08-19 Thread Robert Sosinski
At the moment, all guids are distinct; however, before I zapped the
duplicates, there were 280 duplicates.

Currently, there are over 2 million distinct guids.

-Robert


On Mon, Aug 19, 2013 at 11:12 AM, Pavel Stehule wrote:

>
>
>
> 2013/8/19 Robert Sosinski 
>
>> Hi Pavel,
>>
>> What kind of example do you need?  I can't give you the actual data I have
>> in the table, but I can give you an example query and the schema attached
>> below.  From there, I would just put in 2 million rows, worth 1.2 GB of
>> data.  Average sizes of the extended columns (using the pg_column_size
>> function), in bytes, are:
>>
>> guid: 33
>> name: 2.41
>> currency: 4
>> fields: 120.32
>>
>> example query:
>>
>> -- find duplicate records using a guid
>> select guid, array_agg(id) from orders group by guid;
>>
>
> How many distinct guids are there, and how many duplicates?
>
> ??
>
> regards
>
> Pavel
>
>
>
>>
>> example schema:
>>  Table "public.things"
>>
>>Column   |Type |
>>  Modifiers  | Storage  | Stats target | Description
>>
>> +-+-+--+--+-
>>  id | integer | not null default
>> nextval('things_id_seq'::regclass) | plain|  |
>>  version| integer | not null
>>| plain|  |
>>  created_at | timestamp without time zone | not null
>>| plain|  |
>>  updated_at | timestamp without time zone | not null
>>| plain|  |
>>  foo_id | integer | not null
>>| plain|  |
>>  bar_id | integer | not null
>>| plain|  |
>>  baz_id | integer | not null
>>| plain|  |
>>  guid   | character varying   | not null
>>| extended |  |
>>  name   | character varying   | not null
>>| extended |  |
>>  price  | numeric(12,2)   | not null
>>| main |  |
>>  currency   | character varying   | not null
>>| extended |  |
>>  amount | integer | not null
>>| plain|  |
>>  the_date   | date| not null
>>| plain|  |
>>  fields | hstore  |
>>   | extended |  |
>> Indexes:
>> "things_pkey" PRIMARY KEY, btree (id)
>> "things_foo_id_idx" btree (foo_id)
>> "things_bar_id_idx" btree (bar_id)
>> "things_baz_id_idx" btree (baz_id)
>> "things_guid_uidx" UNIQUE, btree (guid)
>> "things_lpad_lower_name_eidx" btree (lpad(lower(name::text), 10,
>> '0'::text))
>> "things_price_idx" btree (price)
>>
>> Foreign-key constraints:
>> "things_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(id)
>> "things_bar_id_fkey" FOREIGN KEY (bar_id) REFERENCES bars(id)
>> "things_baz_id_fkey" FOREIGN KEY (baz_id) REFERENCES bazs(id)
>> Triggers:
>> timestamps_trig BEFORE INSERT OR UPDATE ON things FOR EACH ROW
>> EXECUTE PROCEDURE timestamps_tfun()
>>
>> Let me know if you need anything else.
>>
>> Thanks,
>>
>>
>> On Mon, Aug 19, 2013 at 3:29 AM, Pavel Stehule 
>> wrote:
>>
>>> Hello
>>>
>>> please, can you send some example or test?
>>>
>>> Regards
>>>
>>> Pavel Stehule
>>>
>>>
>>> 2013/8/19 Robert Sosinski 
>>>
 When using array_agg on a large table, memory usage seems to spike up
 until Postgres crashes with the following error:

 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
 because of crash of another server process
 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has
 commanded this server process to roll back the current transaction and
 exit, because another server process exited abnormally and possibly
 corrupted shared memory.
 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be
 able to reconnect to the database and repeat your command.

 I've definitely isolated it down to using array_agg, as when I changed
 the query to use string_agg, it worked fine.  I also tried using array_agg
 on a few different queries, all yielding the same issue.  Swapping in
 string_agg fixed the issue once more.

 This particular table has over 2 million rows and is 1.2 Gigs, and when
 I ran the query while viewing htop, the virtual size of the Postgres
 process ballooned to 13.9G until crashing.

 The version of Postgres I am using is: PostgreSQL 9.2.4 on
 x86_64-unknown-l

Re: [GENERAL] Memory Issue with array_agg?

2013-08-19 Thread Pavel Stehule
2013/8/19 Robert Sosinski 

> Hi Pavel,
>
> What kind of example do you need?  I can't give you the actual data I have
> in the table, but I can give you an example query and the schema attached
> below.  From there, I would just put in 2 million rows, worth 1.2 GB of
> data.  Average sizes of the extended columns (using the pg_column_size
> function), in bytes, are:
>
> guid: 33
> name: 2.41
> currency: 4
> fields: 120.32
>
> example query:
>
> -- find duplicate records using a guid
> select guid, array_agg(id) from orders group by guid;
>

How many distinct guids are there, and how many duplicates?

??
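
For example, something like this (an untested sketch against the example
"things" table from your schema) would give those counts:

select count(*)                        as total_rows,
       count(distinct guid)            as distinct_guids,
       count(*) - count(distinct guid) as duplicate_rows
from things;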

regards

Pavel



>
> example schema:
>  Table "public.things"
>
>Column   |Type |  Modifiers
>  | Storage  | Stats target | Description
>
> +-+-+--+--+-
>  id | integer | not null default
> nextval('things_id_seq'::regclass) | plain|  |
>  version| integer | not null
>  | plain|  |
>  created_at | timestamp without time zone | not null
>  | plain|  |
>  updated_at | timestamp without time zone | not null
>  | plain|  |
>  foo_id | integer | not null
>  | plain|  |
>  bar_id | integer | not null
>  | plain|  |
>  baz_id | integer | not null
>  | plain|  |
>  guid   | character varying   | not null
>  | extended |  |
>  name   | character varying   | not null
>  | extended |  |
>  price  | numeric(12,2)   | not null
>  | main |  |
>  currency   | character varying   | not null
>  | extended |  |
>  amount | integer | not null
>  | plain|  |
>  the_date   | date| not null
>  | plain|  |
>  fields | hstore  |
>   | extended |  |
> Indexes:
> "things_pkey" PRIMARY KEY, btree (id)
> "things_foo_id_idx" btree (foo_id)
> "things_bar_id_idx" btree (bar_id)
> "things_baz_id_idx" btree (baz_id)
> "things_guid_uidx" UNIQUE, btree (guid)
> "things_lpad_lower_name_eidx" btree (lpad(lower(name::text), 10,
> '0'::text))
> "things_price_idx" btree (price)
>
> Foreign-key constraints:
> "things_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(id)
> "things_bar_id_fkey" FOREIGN KEY (bar_id) REFERENCES bars(id)
> "things_baz_id_fkey" FOREIGN KEY (baz_id) REFERENCES bazs(id)
> Triggers:
> timestamps_trig BEFORE INSERT OR UPDATE ON things FOR EACH ROW EXECUTE
> PROCEDURE timestamps_tfun()
>
> Let me know if you need anything else.
>
> Thanks,
>
>
> On Mon, Aug 19, 2013 at 3:29 AM, Pavel Stehule wrote:
>
>> Hello
>>
>> please, can you send some example or test?
>>
>> Regards
>>
>> Pavel Stehule
>>
>>
>> 2013/8/19 Robert Sosinski 
>>
>>> When using array_agg on a large table, memory usage seems to spike up
>>> until Postgres crashes with the following error:
>>>
>>> 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
>>> because of crash of another server process
>>> 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has commanded
>>> this server process to roll back the current transaction and exit, because
>>> another server process exited abnormally and possibly corrupted shared
>>> memory.
>>> 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be able
>>> to reconnect to the database and repeat your command.
>>>
>>> I've definitely isolated it down to using array_agg, as when I changed
>>> the query to use string_agg, it worked fine.  I also tried using array_agg
>>> on a few different queries, all yielding the same issue.  Swapping in
>>> string_agg fixed the issue once more.
>>>
>>> This particular table has over 2 million rows and is 1.2 Gigs, and when
>>> I ran the query while viewing htop, the virtual size of the Postgres
>>> process ballooned to 13.9G until crashing.
>>>
>>> The version of Postgres I am using is: PostgreSQL 9.2.4 on
>>> x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1)
>>> 4.7.2, 64-bit
>>>
>>> Any help would be much appreciated, thanks!
>>>
>>> -Robert
>>>
>>
>>
>


Re: [GENERAL] Memory Issue with array_agg?

2013-08-19 Thread Robert Sosinski
Hi Pavel,

What kind of example do you need?  I can't give you the actual data I have
in the table, but I can give you an example query and the schema attached
below.  From there, I would just put in 2 million rows, worth 1.2 GB of
data.  Average sizes of the extended columns (using the pg_column_size
function), in bytes, are:

guid: 33
name: 2.41
currency: 4
fields: 120.32
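
For reference, averages like these can be computed with pg_column_size
along these lines (a sketch only):

select avg(pg_column_size(guid))     as avg_guid,
       avg(pg_column_size(name))     as avg_name,
       avg(pg_column_size(currency)) as avg_currency,
       avg(pg_column_size(fields))   as avg_fields
from things;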

example query:

-- find duplicate records using a guid
select guid, array_agg(id) from orders group by guid;
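
A variant restricted to guids that actually have duplicates (a sketch; the
HAVING filter mirrors the one used elsewhere in this thread):

select guid, array_agg(id)
from orders
group by guid
having count(*) > 1;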

example schema:
 Table "public.things"

   Column   |            Type             |                      Modifiers                       | Storage  | Stats target | Description
------------+-----------------------------+------------------------------------------------------+----------+--------------+-------------
 id         | integer                     | not null default nextval('things_id_seq'::regclass)  | plain    |              |
 version    | integer                     | not null                                              | plain    |              |
 created_at | timestamp without time zone | not null                                              | plain    |              |
 updated_at | timestamp without time zone | not null                                              | plain    |              |
 foo_id     | integer                     | not null                                              | plain    |              |
 bar_id     | integer                     | not null                                              | plain    |              |
 baz_id     | integer                     | not null                                              | plain    |              |
 guid       | character varying           | not null                                              | extended |              |
 name       | character varying           | not null                                              | extended |              |
 price      | numeric(12,2)               | not null                                              | main     |              |
 currency   | character varying           | not null                                              | extended |              |
 amount     | integer                     | not null                                              | plain    |              |
 the_date   | date                        | not null                                              | plain    |              |
 fields     | hstore                      |                                                       | extended |              |
Indexes:
"things_pkey" PRIMARY KEY, btree (id)
"things_foo_id_idx" btree (foo_id)
"things_bar_id_idx" btree (bar_id)
"things_baz_id_idx" btree (baz_id)
"things_guid_uidx" UNIQUE, btree (guid)
"things_lpad_lower_name_eidx" btree (lpad(lower(name::text), 10,
'0'::text))
"things_price_idx" btree (price)

Foreign-key constraints:
"things_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(id)
"things_bar_id_fkey" FOREIGN KEY (bar_id) REFERENCES bars(id)
"things_baz_id_fkey" FOREIGN KEY (baz_id) REFERENCES bazs(id)
Triggers:
timestamps_trig BEFORE INSERT OR UPDATE ON things FOR EACH ROW EXECUTE
PROCEDURE timestamps_tfun()

Let me know if you need anything else.

Thanks,


On Mon, Aug 19, 2013 at 3:29 AM, Pavel Stehule wrote:

> Hello
>
> please, can you send some example or test?
>
> Regards
>
> Pavel Stehule
>
>
> 2013/8/19 Robert Sosinski 
>
>> When using array_agg on a large table, memory usage seems to spike up
>> until Postgres crashes with the following error:
>>
>> 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
>> because of crash of another server process
>> 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has commanded
>> this server process to roll back the current transaction and exit, because
>> another server process exited abnormally and possibly corrupted shared
>> memory.
>> 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be able
>> to reconnect to the database and repeat your command.
>>
>> I've definitely isolated it down to using array_agg, as when I changed
>> the query to use string_agg, it worked fine.  I also tried using array_agg
>> on a few different queries, all yielding the same issue.  Swapping in
>> string_agg fixed the issue once more.
>>
>> This particular table has over 2 million rows and is 1.2 Gigs, and when I
>> ran the query while viewing htop, the virtual size of the Postgres
>> process ballooned to 13.9G until crashing.
>>
>> The version of Postgres I am using is: PostgreSQL 9.2.4 on
>> x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1)
>> 4.7.2, 64-bit
>>
>> Any help would be much appreciated, thanks!
>>
>> -Robert
>>
>
>


Re: [GENERAL] Memory Issue with array_agg?

2013-08-19 Thread Pavel Stehule
Hello

please, can you send some example or test?

Regards

Pavel Stehule


2013/8/19 Robert Sosinski 

> When using array_agg on a large table, memory usage seems to spike up
> until Postgres crashes with the following error:
>
> 2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection
> because of crash of another server process
> 2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has commanded
> this server process to roll back the current transaction and exit, because
> another server process exited abnormally and possibly corrupted shared
> memory.
> 2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be able
> to reconnect to the database and repeat your command.
>
> I've definitely isolated it down to using array_agg, as when I changed the
> query to use string_agg, it worked fine.  I also tried using array_agg on a
> few different queries, all yielding the same issue.  Swapping in string_agg
> fixed the issue once more.
>
> This particular table has over 2 million rows and is 1.2 Gigs, and when I
> ran the query while viewing htop, the virtual size of the Postgres
> process ballooned to 13.9G until crashing.
>
> The version of Postgres I am using is: PostgreSQL 9.2.4 on
> x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1)
> 4.7.2, 64-bit
>
> Any help would be much appreciated, thanks!
>
> -Robert
>


Re: [GENERAL] Memory Issue with array_agg?

2013-08-18 Thread Christoph Moench-Tegeder
## Robert Sosinski (rsosin...@ticketevolution.com):

> When using array_agg on a large table, memory usage seems to spike up until
> Postgres crashes with the following error:

This sounds like bug #7916.
http://www.postgresql.org/message-id/e1uceeu-0004hy...@wrigleys.postgresql.org
As noted in that thread, changing the AllocSet parameters for
accumArrayResult may help (the call to AllocSetContextCreate() in
ArrayBuildState *accumArrayResult(), src/backend/utils/adt/arrayfuncs.c,
change ALLOCSET_DEFAULT_*SIZE to ALLOCSET_SMALL_*SIZE).
Also, lowering work_mem may help, depending on your SQL.
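
For the work_mem part, a session-level sketch (the value is only an
example, not a recommendation):

SHOW work_mem;            -- check the current setting
SET work_mem = '16MB';    -- lower it for this session only
-- ... run the aggregate query here ...
RESET work_mem;           -- restore the configured default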

Regards,
Christoph

-- 
Spare Space




[GENERAL] Memory Issue with array_agg?

2013-08-18 Thread Robert Sosinski
When using array_agg on a large table, memory usage seems to spike up until
Postgres crashes with the following error:

2013-08-17 18:41:02 UTC [2716]: [2] WARNING: terminating connection because
of crash of another server process
2013-08-17 18:41:02 UTC [2716]: [3] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2013-08-17 18:41:02 UTC [2716]: [4] HINT: In a moment you should be able to
reconnect to the database and repeat your command.

I've definitely isolated it down to using array_agg, as when I changed the
query to use string_agg, it worked fine.  I also tried using array_agg on a
few different queries, all yielding the same issue.  Swapping in string_agg
fixed the issue once more.

This particular table has over 2 million rows and is 1.2 Gigs, and when I
ran the query while viewing htop, the virtual size of the Postgres process
ballooned to 13.9G until crashing.
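
For reference, a sketch of how to confirm the on-disk size and find the
backend PID to watch in htop (the table name is taken from the example
schema elsewhere in this thread):

select pg_size_pretty(pg_total_relation_size('things'));  -- heap + toast + indexes
select pg_backend_pid();  -- PID of this session's backend process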

The version of Postgres I am using is: PostgreSQL 9.2.4 on
x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.7.2-2ubuntu1)
4.7.2, 64-bit

Any help would be much appreciated, thanks!

-Robert