Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-08 Thread Raphael Bauduin
On Thu, Nov 7, 2013 at 7:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I wrote:
  It looks like the problem is we're building a MergeAppend plan and not
  getting the targetlist for the MergeAppend node right.

 Found it --- simple oversight in building optimized min/max plans.
 If you need a patch now, see

 http://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=5d0731da521f090f80ea39529fe274ac6d6bffa1


Wow, the patch is available thes same day I supplied the steps to reproduce
the bug! I don't think it's possible to be faster :-)
Thanks a alot!

Raph


regards, tom lane




-- 
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-07 Thread Raphael Bauduin
Hi,

I have narrowed it a bit. It happens when I create said index on an empty
field. Here's the scenario to reproduce it:

Let me know if you need more info

Cheers

Raph


create table events(id SERIAL,
timestamp timestamp,
event json);


create table events_2012_01( CHECK (timestamp='2012-01-01' and
timestamp'2012-2-01' )) inherits (events) ;
create table events_2012_02( CHECK (timestamp='2012-02-01' and
timestamp'2012-3-01' )) inherits (events) ;
insert into events_2012_01 (timestamp, event) values ('2012-01-22
08:38:56', '{_id:4f93c3a044650105b5074c9a,type:t1}');
insert into events_2012_02 (timestamp, event) values ('2012-02-22
08:38:56', '{_id:5f93c3a044650105b5074c9a,type:t2}');


-- create empty table
create table events_2012_03( CHECK (timestamp='2012-03-01' and
timestamp'2012-4-01' )) inherits (events) ;
explain select max(event-'_id') from events where event is not null;
--OK

--create index
create index events_2012_03_event_id_index on events_2012_03
((event-'_id'));
explain select max(event-'_id') from events where event is not null;
--BANG

drop index events_2012_03_event_id_index;
explain select max(event-'_id') from events where event is not null;
--OK



On Mon, Nov 4, 2013 at 8:39 AM, Raphael Bauduin rbli...@gmail.com wrote:

 I'll look at providing such an example later this week.

 Raph


 On Thu, Oct 31, 2013 at 3:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Raphael Bauduin rbli...@gmail.com writes:
  An explain returns an error:
  = explain select max(event-'_id') from events;
  ERROR:  no tlist entry for key 2

 This is certainly a bug.  Can we see a self-contained example that
 triggers that?

 regards, tom lane




 --
 Web database: http://www.myowndb.com
 Free Software Developers Meeting: http://www.fosdem.org




-- 
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-07 Thread Raphael Bauduin
Correction: It happens when I create said index on an empty *table*.

Raph


On Thu, Nov 7, 2013 at 11:10 AM, Raphael Bauduin rbli...@gmail.com wrote:

 Hi,

 I have narrowed it a bit. It happens when I create said index on an empty
 field. Here's the scenario to reproduce it:

 Let me know if you need more info

 Cheers

 Raph


 create table events(id SERIAL,
 timestamp timestamp,
 event json);


 create table events_2012_01( CHECK (timestamp='2012-01-01' and
 timestamp'2012-2-01' )) inherits (events) ;
 create table events_2012_02( CHECK (timestamp='2012-02-01' and
 timestamp'2012-3-01' )) inherits (events) ;
 insert into events_2012_01 (timestamp, event) values ('2012-01-22
 08:38:56', '{_id:4f93c3a044650105b5074c9a,type:t1}');
 insert into events_2012_02 (timestamp, event) values ('2012-02-22
 08:38:56', '{_id:5f93c3a044650105b5074c9a,type:t2}');


 -- create empty table
 create table events_2012_03( CHECK (timestamp='2012-03-01' and
 timestamp'2012-4-01' )) inherits (events) ;
 explain select max(event-'_id') from events where event is not null;
 --OK

 --create index
 create index events_2012_03_event_id_index on events_2012_03
 ((event-'_id'));
 explain select max(event-'_id') from events where event is not null;
 --BANG

 drop index events_2012_03_event_id_index;
 explain select max(event-'_id') from events where event is not null;
 --OK



 On Mon, Nov 4, 2013 at 8:39 AM, Raphael Bauduin rbli...@gmail.com wrote:

 I'll look at providing such an example later this week.

 Raph


 On Thu, Oct 31, 2013 at 3:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Raphael Bauduin rbli...@gmail.com writes:
  An explain returns an error:
  = explain select max(event-'_id') from events;
  ERROR:  no tlist entry for key 2

 This is certainly a bug.  Can we see a self-contained example that
 triggers that?

 regards, tom lane




 --
 Web database: http://www.myowndb.com
 Free Software Developers Meeting: http://www.fosdem.org




 --
 Web database: http://www.myowndb.com
 Free Software Developers Meeting: http://www.fosdem.org




-- 
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-07 Thread Tom Lane
Raphael Bauduin rbli...@gmail.com writes:
 I have narrowed it a bit. It happens when I create said index on an empty
 field. Here's the scenario to reproduce it:

Thanks, I've reproduced the problem here.  The query still seems to run OK,
it's just EXPLAIN that's falling over --- do you see the same?

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-07 Thread Raphael Bauduin
The query is also problematic here, because it returns the full json, and
not only the data I selected in the json.
Below, it should return only '_id', and not the whole json stored in event:

test3= select max(event-'_id') from events where event is not null;
  max

 {_id:5f93c3a044650105b5074c9a,type:t2}

Thanks

raph



On Thu, Nov 7, 2013 at 4:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Raphael Bauduin rbli...@gmail.com writes:
  I have narrowed it a bit. It happens when I create said index on an empty
  field. Here's the scenario to reproduce it:

 Thanks, I've reproduced the problem here.  The query still seems to run OK,
 it's just EXPLAIN that's falling over --- do you see the same?

 regards, tom lane




-- 
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-07 Thread Tom Lane
Raphael Bauduin rbli...@gmail.com writes:
 The query is also problematic here, because it returns the full json, and
 not only the data I selected in the json.

Doh, right, you mentioned that in the original bug report, and now that
I'm paying a bit more attention I see it too.  I was looking for
some sort of error from running the query, not just wrong data.

It looks like the problem is we're building a MergeAppend plan and not
getting the targetlist for the MergeAppend node right.  I hacked EXPLAIN
very quickly to not fall over when it fails to find a sort key in the
node's targetlist, and here's what I see:

regression=# explain verbose select max(event-'_id') from events where event 
is not null;
 QUERY PLAN 
 
-
 Result  (cost=58.75..58.76 rows=1 width=0)
   Output: $0
   InitPlan 1 (returns $0)
 -  Limit  (cost=58.70..58.75 rows=1 width=32)
   Output: events.event
   -  Merge Append  (cost=58.70..200.88 rows=3268 width=32)
 Sort Key: [no tlist entry for key 2]
 -  Sort  (cost=0.01..0.02 rows=1 width=32)
   Output: events.event, ((events.event - '_id'::text))
   Sort Key: ((events.event - '_id'::text))
   -  Seq Scan on public.events  (cost=0.00..0.00 rows=1 
width=32)
 Output: events.event, (events.event - 
'_id'::text)
 Filter: ((events.event IS NOT NULL) AND 
((events.event - '_id'::text) IS NOT NULL))
 -  Sort  (cost=29.20..31.92 rows=1089 width=32)
   Output: events_2012_01.event, ((events_2012_01.event - 
'_id'::text))
   Sort Key: ((events_2012_01.event - '_id'::text))
   -  Seq Scan on public.events_2012_01  (cost=0.00..23.75 
rows=1089 width=32)
 Output: events_2012_01.event, 
(events_2012_01.event - '_id'::text)
 Filter: ((events_2012_01.event IS NOT NULL) AND 
((events_2012_01.event - '_id'::text) IS NOT NULL))
 -  Sort  (cost=29.20..31.92 rows=1089 width=32)
   Output: events_2012_02.event, ((events_2012_02.event - 
'_id'::text))
   Sort Key: ((events_2012_02.event - '_id'::text))
   -  Seq Scan on public.events_2012_02  (cost=0.00..23.75 
rows=1089 width=32)
 Output: events_2012_02.event, 
(events_2012_02.event - '_id'::text)
 Filter: ((events_2012_02.event IS NOT NULL) AND 
((events_2012_02.event - '_id'::text) IS NOT NULL))
 -  Index Scan Backward using events_2012_03_event_id_index on 
public.events_2012_03  (cost=0.15..63.30 rows=1089 width=32)
   Output: events_2012_03.event, (events_2012_03.event - 
'_id'::text)
   Index Cond: ((events_2012_03.event - '_id'::text) IS 
NOT NULL)
   Filter: (events_2012_03.event IS NOT NULL)
(29 rows)

So everything looks right for the individual table-scan subplans, but
something's going badly wrong when making the MergeAppend ...
dunno what yet.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-07 Thread Tom Lane
I wrote:
 It looks like the problem is we're building a MergeAppend plan and not
 getting the targetlist for the MergeAppend node right.

Found it --- simple oversight in building optimized min/max plans.
If you need a patch now, see
http://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=5d0731da521f090f80ea39529fe274ac6d6bffa1

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] problem with partitioned table and indexed json field

2013-11-03 Thread Raphael Bauduin
I'll look at providing such an example later this week.

Raph


On Thu, Oct 31, 2013 at 3:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Raphael Bauduin rbli...@gmail.com writes:
  An explain returns an error:
  = explain select max(event-'_id') from events;
  ERROR:  no tlist entry for key 2

 This is certainly a bug.  Can we see a self-contained example that
 triggers that?

 regards, tom lane




-- 
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org


[GENERAL] problem with partitioned table and indexed json field

2013-10-31 Thread Raphael Bauduin
Hi,

I have a partitioned table events, with one partition for each month, eg
events_2013_03. The partition is done on the field timestamp, and
constraints are set, but insertion of data is done in the partition
directly (so not with a trigger on the events table)
The field event is of type json, and has a field '_id', which I can access:

= select event-'_id' from events limit 1;
 ?column?
--
 4f9a786f44650105b50aafc9

I created an index on each partition of the table, but not on the events
table itself:
create index events_${y}_${m}_event_id_index on events_${y}_${m}
((event-'_id'));

Querying the max event_id from a partition works fine:
= select max(event-'_id') from events_2013_03;
   max
--
 5158cdfe4465012cff522b74


However, requesting on the parent table does return the whole json field,
and not only the '_id':
= select max(event-'_id') from events;
{_id:526eb3ad4465013e3e131a43,origin:. }

An explain returns an error:
= explain select max(event-'_id') from events;
ERROR:  no tlist entry for key 2

This problem appeared when I created the indexes, and removing the index
make the explain work fine, but the plan implies a sequential scan on the
tables which is exactly what I wanted to avoid with the indexes.

Does someone have an explanation, and possibly a way to solve this problem?

thanks

Raph


Re: [GENERAL] problem with partitioned table and indexed json field

2013-10-31 Thread Merlin Moncure
On Thu, Oct 31, 2013 at 5:46 AM, Raphael Bauduin rbli...@gmail.com wrote:

 Hi,

 I have a partitioned table events, with one partition for each month, eg
 events_2013_03. The partition is done on the field timestamp, and
 constraints are set, but insertion of data is done in the partition directly
 (so not with a trigger on the events table)
 The field event is of type json, and has a field '_id', which I can access:

 = select event-'_id' from events limit 1;
  ?column?
 --
  4f9a786f44650105b50aafc9

 I created an index on each partition of the table, but not on the events
 table itself:
 create index events_${y}_${m}_event_id_index on events_${y}_${m}
 ((event-'_id'));

 Querying the max event_id from a partition works fine:
 = select max(event-'_id') from events_2013_03;
max
 --
  5158cdfe4465012cff522b74


 However, requesting on the parent table does return the whole json field,
 and not only the '_id':
 = select max(event-'_id') from events;
 {_id:526eb3ad4465013e3e131a43,origin:. }

 An explain returns an error:
 = explain select max(event-'_id') from events;
 ERROR:  no tlist entry for key 2

 This problem appeared when I created the indexes, and removing the index
 make the explain work fine, but the plan implies a sequential scan on the
 tables which is exactly what I wanted to avoid with the indexes.

 Does someone have an explanation, and possibly a way to solve this problem?

wow, that looks like a bug.  Can you post the specific postgres version?

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] problem with partitioned table and indexed json field

2013-10-31 Thread Raphael Bauduin
It's postgresql 9.3, from the pgdg apt repository:
9.3.0-2.pgdg10.4+1

Raph


On Thu, Oct 31, 2013 at 1:48 PM, Merlin Moncure mmonc...@gmail.com wrote:

 On Thu, Oct 31, 2013 at 5:46 AM, Raphael Bauduin rbli...@gmail.com
 wrote:
 
  Hi,
 
  I have a partitioned table events, with one partition for each month, eg
  events_2013_03. The partition is done on the field timestamp, and
  constraints are set, but insertion of data is done in the partition
 directly
  (so not with a trigger on the events table)
  The field event is of type json, and has a field '_id', which I can
 access:
 
  = select event-'_id' from events limit 1;
   ?column?
  --
   4f9a786f44650105b50aafc9
 
  I created an index on each partition of the table, but not on the events
  table itself:
  create index events_${y}_${m}_event_id_index on events_${y}_${m}
  ((event-'_id'));
 
  Querying the max event_id from a partition works fine:
  = select max(event-'_id') from events_2013_03;
 max
  --
   5158cdfe4465012cff522b74
 
 
  However, requesting on the parent table does return the whole json field,
  and not only the '_id':
  = select max(event-'_id') from events;
  {_id:526eb3ad4465013e3e131a43,origin:. }
 
  An explain returns an error:
  = explain select max(event-'_id') from events;
  ERROR:  no tlist entry for key 2
 
  This problem appeared when I created the indexes, and removing the index
  make the explain work fine, but the plan implies a sequential scan on the
  tables which is exactly what I wanted to avoid with the indexes.
 
  Does someone have an explanation, and possibly a way to solve this
 problem?

 wow, that looks like a bug.  Can you post the specific postgres version?

 merlin




-- 
Web database: http://www.myowndb.com
Free Software Developers Meeting: http://www.fosdem.org


Re: [GENERAL] problem with partitioned table and indexed json field

2013-10-31 Thread Tom Lane
Raphael Bauduin rbli...@gmail.com writes:
 An explain returns an error:
 = explain select max(event-'_id') from events;
 ERROR:  no tlist entry for key 2

This is certainly a bug.  Can we see a self-contained example that
triggers that?

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general