Re: [HACKERS] Pull up aggregate subquery

2011-06-04 Thread Simon Riggs
On Tue, May 24, 2011 at 3:47 AM, Hitoshi Harada umi.tan...@gmail.com wrote:

 That's true. But if the planning cost is the only issue, why not add a
 new GUC for the user to choose whether they prefer it or not? Of course,
 if we have some method to predict which way to go before proving both
 ways, that's great. Do you have some blueprint for it?

I like your simple patch and it looks like it fixes your concern.

Your problem statement ignores the fact that most people would not
write the original query like this

select m_id, sum_len from size_m m inner join(select m_id,
sum(length(val)) as sum_len from size_l group by m_id)l on m.id =
l.m_id where val = '1';

they would write it like this

select m_id, sum(length(val))
from size_m m join size_l l on m.id = l.m_id
where val = '1'
group by m_id;

Which gives a far worse plan and one that is not solved by your patch.
Your way of writing the SQL is one of the hand-optimized forms that
an SQL expert would try when rewriting the SQL. We shouldn't be
optimizing only for hand-altered code, since it can always be further
tweaked by hand. We should be optimizing the original, simple queries
(as well as other forms of expressing the same thing).

This highlights that we do not have the infrastructure to push
aggregates up or down, and that the lack of a known primary key for
the output of each plan node prevents us from developing a general
transformation infrastructure to solve the general case. That
particular piece of infrastructure is also an essential step towards
materialized views, which would be pretty useless without the
capability to transform aggregates up and down the join tree.

In terms of costing, I think it is likely that we can apply
simple heuristics. We already assume that applying quals down at the
lowest level possible makes sense. I would guess that anything that
reduces the number of rows should be pushed down as far as possible.
I'm sure there are cases where that isn't true, but let's not be stopped
from solving simple general cases by the theoretical existence of
complex cases.
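As a toy illustration of that heuristic (invented row counts, not planner code): a qual that keeps 1% of the outer rows does far less join work when applied below the join than above it.

```python
# Toy cost model comparing where a selective qual is applied.
# Purely illustrative numbers; not PostgreSQL planner code.

def work_filter_above(outer, inner, sel):
    """Join first, filter after: the join touches every row pair."""
    return outer * inner

def work_filter_below(outer, inner, sel):
    """Filter the outer relation first: the join sees sel * outer rows."""
    return (outer * sel) * inner

above = work_filter_above(10_000, 1_000, 0.01)
below = work_filter_below(10_000, 1_000, 0.01)
print(above, below)  # 10000000 vs 100000.0
```

The gap grows with the join size, which is why pushing row-reducing quals down is a safe default heuristic even without full costing.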

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Simon Riggs
On Fri, Jun 3, 2011 at 2:17 PM, Robert Haas robertmh...@gmail.com wrote:

 I've now spent enough time working on this issue now to be convinced
 that the approach has merit, if we can work out the kinks.

Yes, the approach has merits and I'm sure we can work out the kinks.

 As you can see, this works out to a bit more than a 4% improvement on
 this two-core box.  I also got access (thanks to Nate Boley) to a
 24-core box and ran the same test with scale factor 100 and
 shared_buffers=8GB.  Here are the results of alternating runs without
 and with the patch on that machine:

 tps = 36291.996228 (including connections establishing)
 tps = 129242.054578 (including connections establishing)
 tps = 36704.393055 (including connections establishing)
 tps = 128998.648106 (including connections establishing)
 tps = 36531.208898 (including connections establishing)
 tps = 131341.367344 (including connections establishing)

 That's an improvement of about 3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.

Congratulations. I believe that is realistic based upon my investigations.
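The ~3.5x figure can be checked directly from the quoted alternating runs:

```python
# Averages of the quoted 24-core pgbench runs, without and with the patch.
without = [36291.996228, 36704.393055, 36531.208898]
with_patch = [129242.054578, 128998.648106, 131341.367344]

avg = lambda xs: sum(xs) / len(xs)
speedup = avg(with_patch) / avg(without)
print(round(speedup, 2))  # ~3.56
```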

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Cube Index Size

2011-06-04 Thread Nick Raj
2011/6/2 Teodor Sigaev teo...@sigaev.ru

  Can we figure out some information about the index, i.e. what is the height
 of the index tree, and how many values are placed in one leaf node and one
 non-leaf node?


 http://www.sigaev.ru/cvsweb/cvsweb.cgi/gevel/


To improve space utilization when a node is split, we have to assign
entries to two groups. Once one group reaches some threshold (m), the
remaining entries are inserted into the other group.

Can you suggest some way to choose 'm' (because cubes are stored as NDBOX,
which has variable length), or point me to some guidance in the code?
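A minimal sketch of the split policy being described: fill one side until its accumulated size reaches a byte threshold m, then put the rest in the other side. The entries here are invented (key, size) pairs standing in for variable-length NDBOX values.

```python
def threshold_split(entries, m):
    """Assign entries to two groups; once the first group's accumulated
    size reaches threshold m (bytes), the rest go to the second group.
    entries: list of (key, size_in_bytes) pairs."""
    left, right, left_size = [], [], 0
    for entry in entries:
        size = entry[1]
        if left_size < m:
            left.append(entry)
            left_size += size
        else:
            right.append(entry)
    return left, right

entries = [("a", 40), ("b", 40), ("c", 40), ("d", 40), ("e", 40)]
left, right = threshold_split(entries, m=100)
print(len(left), len(right))  # 3 2
```

Since NDBOX entries are variable-length, one plausible choice is to express m as a fraction of the page size rather than as an entry count; that is an assumption here, not something the thread settles.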

Thanks


 --
 Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW:
 http://www.sigaev.ru/



Re: [HACKERS] storing TZ along timestamps

2011-06-04 Thread Greg Stark
On Thu, Jun 2, 2011 at 8:58 PM, Jim Nasby j...@nasby.net wrote:

 I'm torn between whether the type should store the original time or the 
 original time converted to GMT.

This is the wrong way to think about it. We *never* store time
converted to GMT.  When we want to represent a point in time we
represent it as seconds since the epoch. This is a quantity that is
the same everywhere in the world (modulo relativistic effects...), i.e.
it's a concept which is completely independent of timezones.

The question here is how to represent more complex concepts than
simply points in time. I think the two concepts under discussion are
a) a composite type representing a point in time and a timezone it
should be interpreted in for operations and display and b) the
original input provided which is a text string with the constraint
that it's a valid input which can be interpreted as a point in time.
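Concept (a) can be sketched outside the database, assuming Python's zoneinfo: the stored instant is epoch seconds (zone-independent), and the zone kept alongside affects only interpretation and display.

```python
# Sketch of concept (a): a point in time stored as seconds since the
# epoch, plus the timezone it should be interpreted in for display.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

epoch_seconds = 1307200000          # the instant itself, zone-independent
zone = "America/New_York"           # stored alongside, for interpretation

as_utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
as_local = as_utc.astimezone(ZoneInfo(zone))

# The underlying instant is identical; only the rendering differs.
print(as_utc.timestamp() == as_local.timestamp())  # True
```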

-- 
greg



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Simon Riggs
On Sat, Jun 4, 2011 at 2:59 PM, Simon Riggs si...@2ndquadrant.com wrote:

 As you can see, this works out to a bit more than a 4% improvement on
 this two-core box.  I also got access (thanks to Nate Boley) to a
 24-core box and ran the same test with scale factor 100 and
 shared_buffers=8GB.  Here are the results of alternating runs without
 and with the patch on that machine:

 tps = 36291.996228 (including connections establishing)
 tps = 129242.054578 (including connections establishing)
 tps = 36704.393055 (including connections establishing)
 tps = 128998.648106 (including connections establishing)
 tps = 36531.208898 (including connections establishing)
 tps = 131341.367344 (including connections establishing)

 That's an improvement of about 3.5x.  According to the vmstat output,
 when running without the patch, the CPU state was about 40% idle.
 With the patch, it dropped down to around 6%.

 Congratulations. I believe that is realistic based upon my investigations.


Tom,

You should look at this. It's good.

The approach looks sound to me. It's a fairly isolated patch and we
should be considering this for inclusion in 9.1, not wait another
year.

I will happily add that it's a completely different approach to the one
I'd been working on, and, even more happily, one so different from the
Oracle approach that we are definitely unencumbered by patent issues here.
Well done Robert, Noah.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] WIP: Fast GiST index build

2011-06-04 Thread Heikki Linnakangas

On 03.06.2011 14:02, Alexander Korotkov wrote:

Hackers,

A WIP patch for fast GiST index build is attached. The code is dirty and
comments are lacking, but it works. It is now ready for first benchmarks,
which should prove the efficiency of the selected technique. It's time to
compare fast GiST index build with repeated-insert build on large enough
datasets (datasets which don't fit in cache). The aims of testing are:
1) Measure the acceleration of index build.
2) Measure the change in index quality.
I'm going to do the first testing using synthetic datasets. Everybody who
has interesting real-life datasets for testing is welcome.


I did some quick performance testing of this. I installed postgis 1.5, 
and loaded an extract of the OpenStreetMap data covering Finland. The 
biggest gist index in that data set is the idx_nodes_geom index on the 
nodes table. I have maintenance_work_mem and shared_buffers both set to 
512 MB, and this laptop has 4GB of RAM.


Without the patch, reindexing the index takes about 170 seconds and the 
index size is 321 MB. And with the patch, it takes about 150 seconds, 
and the resulting index size is 319 MB.


The nodes table is 618MB in size, so it fits in RAM. I presume the gain 
would be bigger if it doesn't, as the random I/O to update the index 
starts to hurt more. But this shows that even when it does, this patch 
helps a little bit, and the resulting index size is comparable.
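Working out the quoted REINDEX numbers:

```python
# Improvement implied by the quoted timings and index sizes.
before_s, after_s = 170, 150            # seconds to reindex
size_before_mb, size_after_mb = 321, 319  # resulting index size

pct_faster = (before_s - after_s) / before_s * 100
print(round(pct_faster, 1))  # ~11.8% faster build, near-identical size
```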


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] BLOB support

2011-06-04 Thread Radosław Smogura
Tom Lane t...@sss.pgh.pa.us Friday 03 of June 2011 16:44:13
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Excerpts from Radosław Smogura's message of Thu Jun 02 15:26:29 -0400 2011:
  So do I understand correctly that we should think about creating an
  improved TOAST to support values larger than a 30-bit length? I like this
  much more,
  
  Good :-)
  
  (BTW while it'd be good to have longer-than-30 bit length words for
  varlena, I'm not sure we have room for that.)
 
 You wouldn't want to push such values around as whole values anyway.
 Possibly what would work here is a variant form of TOAST pointer for
 which we'd simply throw error if you tried to fetch the entire value
 at once.
 
   regards, tom lane

I mentioned the JDBC calls.
1.
b = conn.createBlob();
ps = conn.prepareStatement("INSERT INTO t (blob) VALUES (?)");
ps.setBlob(1, b);
ps.executeUpdate();

2.
Statements like this must be supported as well:
SELECT blob_index_of(?, ?);
where the 1st ? is a blob and the 2nd is some text/sequence.

These calls must be supported to make BLOBs as similar as possible to other
types. This actually disallows putting the LOB in TOAST, as there is no
relation (2.) or the relation is not known (1.) while reading the LOB - in
any case you can't skip bytes in the protocol stream. So possible solutions
are:

1. Create a temporary LOB in a file or in memory, depending on its size.
2. Use the existing Large Object interface.
3. The client does not send the LOB, just a faked ID; during the call the
server asks the client to serialize this LOB by the faked ID.
4. Any other propositions?

I vote for 2.

For the pg_type/pg_class changes I am thinking about removing haslobs and 
putting this in attstorage or somewhere else (for example 'l' may stand for 
LOBs, or a -3 length), but currently TOAST of composites doesn't support 
recursion, and those types are toasted as a whole. I may add recursion for 
those types and special maintenance for LOBs. In any case handling this 
will be better in the toasting code, not in nodeModifyTable.

Any ideas about this?

Regards,
Radek



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Heikki Linnakangas

On 04.06.2011 18:01, Simon Riggs wrote:

It's a fairly isolated patch and we
should be considering this for inclusion in 9.1, not wait another
year.


-1

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] ts_count

2011-06-04 Thread Andrew Dunstan


One of our PostgreSQL Experts Inc customers wanted a function to count 
all the occurrences of terms in a tsquery in a tsvector. This has been 
written as a loadable module function, and initial testing shows it is 
working well. With the client's permission we are releasing the code - 
it's available at https://github.com/pgexperts/ts_count. The actual 
new code involved here is tiny, some of the code is CP'd from tsrank.c 
and much of the rest is boilerplate.


A snippet from the regression test:


   select ts_count(to_tsvector('managing managers manage peons managerially'),
                   to_tsquery('managers | peon'));
    ts_count
   ----------
           4
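The semantics can be modeled in miniature: count every occurrence in the vector of a lexeme matched by the (OR-only) query. The stemmer below is a crude stand-in for real text-search stemming, good enough only for this example.

```python
# Toy model of what ts_count computes.  A real tsvector stores stemmed
# lexemes with positions; here a trivial suffix-stripping stemmer is used.

def stem(word):
    # Crude stand-in for text-search stemming (illustration only).
    for suffix in ("ally", "ers", "ing", "e", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word

def ts_count(document_words, query_words):
    doc = [stem(w) for w in document_words]
    wanted = {stem(w) for w in query_words}
    return sum(1 for lexeme in doc if lexeme in wanted)

doc = "managing managers manage peons managerially".split()
print(ts_count(doc, ["managers", "peon"]))  # 4
```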

We'd like to add something like this for 9.2, so I'd like to get the API agreed 
and then I'll prepare a patch and submit it for the next CF.

Comments? cheers andrew




Re: [HACKERS] Error in PQsetvalue

2011-06-04 Thread Merlin Moncure
On Fri, Jun 3, 2011 at 10:36 PM, Andrew Chernow a...@esilo.com wrote:
 On 6/3/2011 10:26 PM, Andrew Chernow wrote:

 I disagree -- I think the fix is a one-liner. line 446:
 if (tup_num == res->ntups && !res->tuples[tup_num])

 should just become
 if (tup_num == res->ntups)

 also the memset of the tuple slots when the slot array is expanded can
 be removed. (In addition, the tuple array expansion should really be
 abstracted, but that isn't strictly necessary here.)


 All true. This is a cleaner fix to something that was in fact broken ;)
 You want

 Attached is a patch that fixes the OP's issue.  PQsetvalue now uses
 pqAddTuple to grow the tuple table and has removed the remnants of an
 older idea that caused the bug.


 Sorry, I attached the wrong patch.  Here is the correct one.


This looks good.  Pavel, want to test it?

merlin



Re: [HACKERS] creating CHECK constraints as NOT VALID

2011-06-04 Thread Simon Riggs
On Thu, Jun 2, 2011 at 5:48 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Alvaro Herrera's message of Wed Jun 01 20:56:12 -0400 2011:
 Excerpts from Thom Brown's message of Wed Jun 01 19:48:44 -0400 2011:

  Is this expected?
  [ pg_dump fails to preserve not-valid status of constraints ]

 Certainly not.

  Shouldn't the constraint be dumped as not valid too??

 Sure, I'll implement that tomorrow.

 Actually, it turns out that NOT VALID foreign keys were already buggy
 here, and fixing them automatically fixes this case as well, because the
 fix involves touching pg_get_constraintdef to dump the flag.  This also
 gets it into psql's \d.  Patch attached.

 (Maybe the changes in psql's describe.c should be reverted, not sure.)

Thanks. As soon as Thom said that, I thought ahh... didn't do that.

Patch looks fine. Will you commit this patch to 9.1 now, or would you
like me to?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] SIREAD lock versus ACCESS EXCLUSIVE lock

2011-06-04 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 03.06.2011 21:04, Kevin Grittner wrote:
 Also, if anyone has comments or hints about the placement of those
 calls, I'd be happy to receive them.

 heap_drop_with_catalog() schedules the relation for deletion at the end 
 of transaction, but it's still possible that the transaction aborts and 
 the heap doesn't get dropped after all. If you put the 
 DropAllPredicateLocksFromTable() call there, and the transaction later 
 aborts, you've lost all the locks already.

But on the third thought: is that wrong?  Surely locks taken by an
aborted transaction can be discarded.

regards, tom lane



Re: [HACKERS] SIREAD lock versus ACCESS EXCLUSIVE lock

2011-06-04 Thread Heikki Linnakangas

On 04.06.2011 19:19, Tom Lane wrote:

Heikki Linnakangas heikki.linnakan...@enterprisedb.com  writes:

On 03.06.2011 21:04, Kevin Grittner wrote:

Also, if anyone has comments or hints about the placement of those
calls, I'd be happy to receive them.



heap_drop_with_catalog() schedules the relation for deletion at the end
of transaction, but it's still possible that the transaction aborts and
the heap doesn't get dropped after all. If you put the
DropAllPredicateLocksFromTable() call there, and the transaction later
aborts, you've lost all the locks already.


But on the third thought: is that wrong?  Surely locks taken by an
aborted transaction can be discarded.


These are predicate locks - there can be locks on the table belonging 
to transactions that have already committed.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] plperl fails with perl 5.14.0

2011-06-04 Thread Tom Lane
It appears from the buildfarm configure logs that member anchovy was
updated yesterday from perl 5.12.3 to 5.14.0.  It doesn't like plperl.c
anymore:

ccache cc -flto -march=amdfam10 -fstack-protector -O3 -pipe -Wall 
-Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv -g -fpic -I. -I. 
-I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
-I/usr/lib/perl5/core_perl/CORE  -c -o plperl.o plperl.c
plperl.c: In function 'plperl_trusted_init':
plperl.c:929:36: error: lvalue required as left operand of assignment
make[3]: *** [plperl.o] Error 1

What it's unhappy about is evidently this:

GvCV(sv) = NULL;/* prevent call via GV */

I think we are going to have to find a different way of doing that...

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Kevin Grittner
Simon Riggs  wrote:
 
 we should be considering this for inclusion in 9.1, not wait
 another year.
 
-1
 
I'm really happy that we're addressing the problems with scaling to
a large number of cores, and this patch sounds great.  Adding a new
feature at this point in the release cycle would be horrible. 
Frankly, from the tone of Robert's post, it probably wouldn't be
appropriate to include it in a release if it showed up in this
condition at the start of the last CF for that release.
 
The nice thing about annual releases is there's never one too far
away -- unless, of course, we hold a release up to squeeze in
just one more feature.
 
-Kevin



Re: [HACKERS] Pull up aggregate subquery

2011-06-04 Thread Hitoshi Harada
2011/6/4 Simon Riggs si...@2ndquadrant.com:

 I like your simple patch and looks like it fixes your concern.

Thanks for your interest. I forgot to mention that this type of query
is quite common for one-to-many entities and is likely to be generated
by simple OR mappers.

 Your problem statement ignores the fact that most people would not
 write the original query like this

 select m_id, sum_len from size_m m inner join(select m_id,
 sum(length(val)) as sum_len from size_l group by m_id)l on m.id =
 l.m_id where val = '1';

 they would write it like this

 select m_id, sum(length(val))
 from size_m m join size_l l on m.id = l.m_id
 where val = '1'
 group by m_id;

Hm, I didn't notice this hand transformation. I will check what
happens. But my example is simplified, and it is likely that
something else joins (not uniquely) to size_m.

 In terms of costing, I think it would be likely that we can apply
 simple heuristics. We already assume that applying quals down to the
 lowest level possible make sense. I would guess that anything that
 reduces the number of rows should be pushed down as far as possible.
 I'm sure there are cases where that isn't true, but lets not stop from
 solving simple general cases because of the theoretical existence of
 complex cases

Agreed. After more thought, the push-down-qual approach would be better
than pushing down/pulling up aggregates. The only concern was the
multiple-aggregate-call case, where more rows than one are qualified in
size_m. But it is clear that each aggregate call scans only qualified
size_l rows if we can push the parameter qual down into the subquery. The
nestloop-with-parameter-pushed-down-to-aggregate-subquery approach is more
general because it doesn't depend on the primary key issue. You can push
it down whenever the total execution cost is smaller.
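The parameterized idea can be modeled in miniature (illustrative Python with invented sample data, not planner code): for each qualified outer row, the aggregate subquery is evaluated only over the inner rows matching the pushed-down parameter.

```python
# Toy tables: size_m(id, val) and size_l(m_id, val).
size_m = [(1, "1"), (2, "2"), (3, "1")]
size_l = [(1, "aa"), (1, "bbb"), (2, "c"), (3, "dddd")]

def agg_subquery(m_id):
    """Aggregate subquery with the join qual pushed down as a parameter:
    it scans only the size_l rows for this m_id."""
    return sum(len(val) for (mid, val) in size_l if mid == m_id)

# Nestloop: the outer qual val = '1' filters size_m, then one
# parameterized aggregate call runs per qualified outer row.
result = [(mid, agg_subquery(mid)) for (mid, val) in size_m if val == "1"]
print(result)  # [(1, 5), (3, 4)]
```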

So, I'm now walking through the planner code and I have finally found the
clue to start from. First, the current problem with the parameterized
nestloop idea in the general case is that while a nested index scan pushes
a parameter to the other join relation, the more general approach needs to
do so with a subquery. A nested index scan Path is generated in
match_unsorted_outer(), which is much deeper, in
make_rel_from_joinlist(), which runs after set_base_rel_pathlist(),
which contains set_subquery_pathlist(), which calls
subquery_planner(). This means that if you want to add code to
generate a general nestloop with parameter push-down during the join
search process, it is too late to push the parameter down to the subquery,
because the subquery was already planned by that time.

So we need to add a new operation before subquery_planner(). That is hard
because no join-relevant information is ready at that stage. But I
bet some simple cases like my aggregate-join can find it possible to
make a parameter from a join qual. (I haven't yet written any code nor
proven my theory.) In this case we need to plan the subquery twice, once
pure and once parameterized.

Other than the subquery case, LATERAL will be ok with something near the
nested index scan approach, since the joined relation is a FunctionScan,
which is not planned like a subquery. The s JOIN (l1 LEFT JOIN l2) case is
unclear - subquery or index scan? Maybe a third way, because l1 LEFT
JOIN l2 is inside the deconstructed join tree: it is not a planned subquery,
but also not a plain RelOptInfo like a base relation / function scan.

Regards,

-- 
Hitoshi Harada



Re: [HACKERS] Postmaster holding unlinked files for pg_largeobject table

2011-06-04 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 What surprises me is that the open references remain after a database
 drop.  Surely this means that no backends keep open file descriptors to
 any table in that database, because there are no connections.

bgwriter ...

regards, tom lane



Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch

2011-06-04 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 The approach looks sound to me. It's a fairly isolated patch and we
 should be considering this for inclusion in 9.1, not wait another
 year.

That suggestion is completely insane.  The patch is only WIP and full of
bugs, even according to its author.  Even if it were solid, it is way
too late to be pushing such stuff into 9.1.  We're trying to ship a
release, not find ways to cause it to slip more.

regards, tom lane



Re: [HACKERS] SIREAD lock versus ACCESS EXCLUSIVE lock

2011-06-04 Thread Kevin Grittner
Heikki Linnakangas  wrote:
 On 04.06.2011 19:19, Tom Lane wrote:
 Heikki Linnakangas writes:
 On 03.06.2011 21:04, Kevin Grittner wrote:
 Also, if anyone has comments or hints about the placement of
 those calls, I'd be happy to receive them.

 heap_drop_with_catalog() schedules the relation for deletion at
 the end of transaction, but it's still possible that the
 transaction aborts and the heap doesn't get dropped after all. If
 you put the DropAllPredicateLocksFromTable() call there, and the
 transaction later aborts, you've lost all the locks already.

 But on the third thought: is that wrong? Surely locks taken by an
 aborted transaction can be discarded.
 
 These are predicate locks - there can be locks on the table
 belonging to transactions that have already committed.
 
It took me a while to think this through, but if the transaction (T1)
which reads the table to create the SIREAD lock overlaps another
transaction (T2) with which it might interact to create a dangerous
structure, and T2 has not yet accessed the table in any way,
then after T1 commits a third transaction (T3) could try to drop the
table but roll back, and T2 could still proceed to do a write which
conflicts with the predicate lock.
 
That certainly sounds like a low frequency combination of events, but
one which can't be ignored if we want correct behavior (i.e., no
false negatives).
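A toy model of that interleaving, assuming predicate locks are cleared eagerly at the moment the DROP is merely scheduled (rather than when it actually commits):

```python
# SIREAD locks belong to *committed* transactions, so clearing them when
# a DROP is only scheduled -- and the dropping transaction later aborts --
# loses a lock that is still needed.

locks = set()                 # (owner, relation) SIREAD locks

# T1 reads t and commits; its predicate lock must outlive T1 because the
# concurrent T2 might still write t.
locks.add(("T1", "t"))

# T3 issues DROP TABLE t.  Eagerly clearing locks at schedule time:
locks = {l for l in locks if l[1] != "t"}
# ... but T3 aborts, so t still exists.

# T2 now writes t: the rw-conflict with committed T1 goes undetected.
conflict_detected = any(rel == "t" for (_, rel) in locks)
print(conflict_detected)  # False -> a false negative
```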
 
-Kevin




Re: [HACKERS] plperl fails with perl 5.14.0

2011-06-04 Thread Andrew Dunstan


There was some discussion on this about 6 weeks ago, at which time 5.14 
wasn't released, so I didn't want to apply the patch then.


I'll look at applying the fix now.

cheers

andrew


On 06/04/2011 12:30 PM, Tom Lane wrote:

It appears from the buildfarm configure logs that member anchovy was
updated yesterday from perl 5.12.3 to 5.14.0.  It doesn't like plperl.c
anymore:

ccache cc -flto -march=amdfam10 -fstack-protector -O3 -pipe -Wall 
-Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement 
-Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv -g -fpic -I. -I. 
-I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
-I/usr/lib/perl5/core_perl/CORE  -c -o plperl.o plperl.c
plperl.c: In function 'plperl_trusted_init':
plperl.c:929:36: error: lvalue required as left operand of assignment
make[3]: *** [plperl.o] Error 1

What it's unhappy about is evidently this:

 GvCV(sv) = NULL;/* prevent call via GV */

I think we are going to have to find a different way of doing that...

regards, tom lane





Re: [HACKERS] plperl fails with perl 5.14.0

2011-06-04 Thread Andrew Dunstan



On 06/04/2011 01:05 PM, Andrew Dunstan wrote:


There was some discussion on this about 6 weeks ago, at which time 
5.14 wasn't released, so I didn't want to apply the patch then.


I'll look at applying the fix now.





(and apologies for top-replying ;-) )

cheers

andrew



Re: [HACKERS] plperl fails with perl 5.14.0

2011-06-04 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 There was some discussion on this about 6 weeks ago, at which time 5.14 
 wasn't released, so I didn't want to apply the patch then.

Oh, right, I thought the issue seemed familiar:
http://archives.postgresql.org/message-id/BANLkTi=qrfqppfqto2r5amb_cfsyhdv...@mail.gmail.com

It looks like we were waiting to see if they'd put GvCV_set() into
ppport.h.  If they didn't, Alex's fix looks sane to me.

regards, tom lane



Re: [HACKERS] creating CHECK constraints as NOT VALID

2011-06-04 Thread Alvaro Herrera
Excerpts from Simon Riggs's message of Sat Jun 04 09:11:52 -0400 2011:
 On Thu, Jun 2, 2011 at 5:48 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

  Actually, it turns out that NOT VALID foreign keys were already buggy
  here, and fixing them automatically fixes this case as well, because the
  fix involves touching pg_get_constraintdef to dump the flag.  This also
  gets it into psql's \d.  Patch attached.
 
  (Maybe the changes in psql's describe.c should be reverted, not sure.)
 
 Thanks. As soon as Thom said that, I thought ahh... didn't do that.
 
 Patch looks fine. Will you commit this patch to 9.1 now, or would you
 like me to?

Thanks for the review.  I already committed it on 9.1:
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=048417511aef8d5fb2d541b17b73afc730935cd5

I'd still like your opinion on the psql bits.  Should they be reverted?
I haven't verified what the output currently looks like.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ts_count

2011-06-04 Thread Oleg Bartunov

Well, there are several functions available around tsearch2, so I suggest
somebody collect all of them and create one extension - ts_addon.
For example, these are the ones I remember:
1. tsvector2array
2. noccurences(tsvector, tsquery) - like your ts_count
3. nmatches(tsvector, tsquery) - # of matched lexemes in the query
Of course, we need to think about better names for the functions, since
ts_count is a bit ambiguous.


Oleg

On Sat, 4 Jun 2011, Andrew Dunstan wrote:



One of our PostgreSQL Experts Inc customers wanted a function to count all 
the occurrences of terms in a tsquery in a tsvector. This has been written as 
a loadable module function, and initial testing shows it is working well. 
With the client's permission we are releasing the code - it's available at 
https://github.com/pgexperts/ts_count. The actual new code involved here is 
tiny, some of the code is CP'd from tsrank.c and much of the rest is 
boilerplate.


A snippet from the regression test:


   select ts_count(to_tsvector('managing managers manage peons managerially'),
                   to_tsquery('managers | peon'));
    ts_count
   ----------
           4

We'd like to add something like this for 9.2, so I'd like to get the API 
agreed and then I'll prepare a patch and submit it for the next CF.


Comments? cheers andrew





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Assert failure when rechecking an exclusion constraint

2011-06-04 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 Commit d2f60a3ab055fb61c8e1056a7c5652f1dec85e00 added an assert to indexam.c's
 RELATION_CHECKS to block use of an index while it's being rebuilt.  This
 assert trips while rechecking an exclusion constraint after an ALTER TABLE
 rebuild:

   CREATE TABLE t (
   c   int,
   EXCLUDE (c WITH =)
   );

   INSERT INTO t VALUES (1), (2);
   ALTER TABLE t ALTER c TYPE smallint USING 1;

Mph ... obviously not tested enough ...

 I could not come up with an actual wrong behavior arising from this usage, so
 I'll tentatively call it a false positive.  reindex_index() could instead
 unconditionally clear indexInfo->ii_Exclusion* before calling index_build(),
 then pop pendingReindexedIndexes and call IndexCheckExclusion() itself.
 Popping pendingReindexedIndexes as we go can make the success of a
 reindex_relation() dependent on the order in which we choose to rebuild
 indexes, though.

 Another option is to just remove the assert as not worth preserving.

Removing the assert would be a seriously bad idea IMO.  I think we could
just allow index_build to call ResetReindexProcessing() midstream (ie,
before it calls IndexCheckExclusion).  This does raise the question of
whether the existing call to IndexCheckExclusion is badly placed and
we should move it to after the index is fully rebuilt.  That would
allow us to avoid doing ResetReindexProcessing until the index is
clearly safe to use.

So in short, I'm thinking move lines 1760-1772 (in HEAD) of index.c to
the end of index_build(), then insert a ResetReindexProcessing() call in
front of them; or maybe only do ResetReindexProcessing there if we
actually do call IndexCheckExclusion.

regards, tom lane



Re: [HACKERS] ts_count

2011-06-04 Thread Andrew Dunstan



On 06/04/2011 04:51 PM, Oleg Bartunov wrote:

Well, there are several functions available around tsearch2. so I suggest
somebody collect all of them and create one extension - ts_addon.
For example, these are what I remember:
1. tsvector2array
2. noccurences(tsvector, tsquery) - like your ts_count
3. nmatches(tsvector, tsquery) - # of matched lexemes in the query
Of course, we need to think about better names for functions, since
ts_count is a bit ambiguous.




Getting agreed names was one reason for posting. I don't know why these 
need to be an extension. I think they are of sufficiently general 
interest (and sufficiently lightweight) that we could just build them in.


cheers

andrew



Re: [HACKERS] Assert failure when rechecking an exclusion constraint

2011-06-04 Thread Noah Misch
On Sat, Jun 04, 2011 at 05:49:31PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  I could not come up with an actual wrong behavior arising from this usage,
  so I'll tentatively call it a false positive.  reindex_index() could instead
  unconditionally clear indexInfo->ii_Exclusion* before calling index_build(),
  then pop pendingReindexedIndexes and call IndexCheckExclusion() itself.
  Popping pendingReindexedIndexes as we go can make the success of a
  reindex_relation() dependent on the order in which we choose to rebuild
  indexes, though.
 
  Another option is to just remove the assert as not worth preserving.
 
 Removing the assert would be a seriously bad idea IMO.  I think we could
 just allow index_build to call ResetReindexProcessing() midstream (ie,
 before it calls IndexCheckExclusion).  This does raise the question of
 whether the existing call to IndexCheckExclusion is badly placed and
 we should move it to after the index is fully rebuilt.  That would
 allow us to avoid doing ResetReindexProcessing until the index is
 clearly safe to use.
 
 So in short, I'm thinking move lines 1760-1772 (in HEAD) of index.c to
 the end of index_build(), then insert a ResetReindexProcessing() call in
 front of them; or maybe only do ResetReindexProcessing there if we
 actually do call IndexCheckExclusion.

Sounds reasonable.  Need to remove the index from pendingReindexedIndexes, not
just call ResetReindexProcessing().  Also, wouldn't that specific construction
make the catalog updates fail due to running in the table owner's security
context?  But certainly something along those lines will do.



Re: [HACKERS] ts_count

2011-06-04 Thread Alvaro Herrera
Excerpts from Andrew Dunstan's message of Sat Jun 04 08:47:02 -0400 2011:

 A snippet from the regression test:
 
 
 select ts_count(to_tsvector('managing managers manage peons managerially'),
                 to_tsquery('managers | peon'));
  ts_count
 ----------
         4

Err, shouldn't this return 5?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ts_count

2011-06-04 Thread Andrew Dunstan



On 06/04/2011 08:59 PM, Alvaro Herrera wrote:

Excerpts from Andrew Dunstan's message of Sat Jun 04 08:47:02 -0400 2011:


A snippet from the regression test:


 select ts_count(to_tsvector('managing managers manage peons managerially'),
                 to_tsquery('managers | peon'));
  ts_count
 ----------
         4

Err, shouldn't this return 5?


No. 'managerially' doesn't get the same stemming.

cheers

andrew



[HACKERS] heap vacuum cleanup locks

2011-06-04 Thread Robert Haas
We've occasionally seen problems with VACUUM getting stuck for failure
to acquire a cleanup lock due to, for example, a cursor holding a pin
on the buffer page.  In the worst case, this can cause an undetected
deadlock, if the backend holding the buffer pin blocks trying to
acquire a heavyweight lock that is in turn blocked by VACUUM.  A while
back, someone (Greg Stark? me?) floated the idea of not waiting for
the cleanup lock.  If we can't get it immediately, or within some
short period of time, then we just skip the page and continue on.

Today I had what might be a better idea: don't try to acquire a
cleanup lock at all.  Instead, acquire an exclusive lock.  After
having done so, observe the pin count.  If there are no other buffer
pins, that means our exclusive lock is actually a cleanup lock, and we
proceed as now.  If other buffer pins do exist, then we can't
defragment the page, but that doesn't mean no useful work can be done:
we can still mark used line pointers dead, or dead line pointers
unused.  We cannot defragment, but that can be done either by the next
VACUUM or by a HOT cleanup.  We can even arrange, using existing
mechanisms, to leave behind a hint that the page is a good candidate
for a HOT cleanup, by setting pd_prune_xid to, say, FrozenXID.
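The decision rule above can be sketched as a toy model (a deliberate simplification; the names Buffer, vacuum_page, and pin_count are hypothetical and do not reflect PostgreSQL's actual buffer-manager API):

```python
# Toy model of the proposal: after taking the exclusive content lock,
# inspect the pin count.  If we hold the only pin, the exclusive lock is
# effectively a cleanup lock and full processing is safe; otherwise we
# still do the work that is safe without a cleanup lock, skip
# defragmentation, and leave a hint for a later HOT cleanup.
from dataclasses import dataclass

@dataclass
class Buffer:
    pin_count: int                    # pins on this page, including our own
    dead_pointers_marked: bool = False
    defragmented: bool = False
    hinted_for_prune: bool = False

def vacuum_page(buf: Buffer) -> str:
    # Assumption: the exclusive content lock is already held here.
    buf.dead_pointers_marked = True   # safe with just the exclusive lock
    if buf.pin_count == 1:
        buf.defragmented = True       # our lock is in effect a cleanup lock
        return "full cleanup"
    buf.hinted_for_prune = True       # candidate for a later HOT cleanup
    return "partial cleanup"

print(vacuum_page(Buffer(pin_count=1)))  # full cleanup
print(vacuum_page(Buffer(pin_count=3)))  # partial cleanup
```

The point of the model is only that no path waits: VACUUM always makes what progress it safely can and moves on.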

Like the idea of skipping pages on which we can't acquire a cleanup
lock altogether, this should prevent VACUUM from getting stuck trying
to lock a heap page.  While buffer pins can be held for extended
periods of time, I don't think there is any operation that holds a
buffer content lock more than very briefly.  Furthermore, unlike the
idea of skipping the page altogether, we could use this approach even
during an anti-wraparound vacuum.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
