Re: [HACKERS] json api WIP patch

2013-01-07 Thread james

The processing functions have been extended to provide populate_record() and 
populate_recordset() functions. The latter in particular could be useful in 
decomposing a piece of json representing an array of flat objects (a fairly 
common pattern) into a set of Postgres records in a single pass.


So this would allow an 'insert into ... select ... from (...)'?


I had been wondering how to do such an insertion efficiently in the 
context of SPI, but it seems that there is no SPI_copy equiv that would 
allow a query parse and plan to be avoided.
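(Absent that, a saved SPI plan at least amortises the parse/plan across
calls. A hedged sketch only -- the table name and the populate_recordset()
call are assumptions based on the WIP patch:)

    /*
     * Hedged sketch, not an SPI_copy equivalent: prepare once, save the
     * plan, and reuse it so the parse/plan cost is paid only once.
     */
    #include "executor/spi.h"
    #include "catalog/pg_type.h"

    static SPIPlanPtr saved_plan = NULL;

    static void
    insert_json_rows(Datum json_array)
    {
        Datum   values[1];

        if (saved_plan == NULL)
        {
            Oid         argtypes[1] = {JSONOID};
            SPIPlanPtr  plan;

            plan = SPI_prepare("INSERT INTO target "
                               "SELECT * FROM populate_recordset(null::target, $1)",
                               1, argtypes);
            saved_plan = SPI_saveplan(plan);
        }

        values[0] = json_array;
        SPI_execute_plan(saved_plan, values, NULL, false, 0);
    }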


Is this mechanism likely to be as fast as we can get at the moment in 
contexts where copy is not feasible?




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2013-01-07 Thread Pavan Deolasee
Sorry for a long pause on this thread. A new arrival at home kept me
occupied all the time.

This thread saw a lot of ideas and suggestions from different people. I
don't think we had an agreement one way or the other on any of them, but
let me summarize the discussion for archival and taking further action if
deemed necessary.

*Suggestion 1: Set the visibility map bit after HOT prune*
The rationale for this idea is to improve the chances of an index-only scan
happening after HOT prune. This is especially interesting when a table
gets random updates or deletes, each of which will clear the VM bit. The
table may still not come up for vacuum, either because the number of
updates/deletes is not over the vac threshold or because a subsequent HOT
prune did not leave any work for vacuum. The only place where we set the VM
bits again is during vacuum. So this idea would add another path where VM
bits are set. This would also help vacuums avoid visiting heap pages that
don't have any work to be done.

The main objection to this idea is that it may result in too much
flip-flopping of the bit, especially if the HOT prune is to be followed by
an UPDATE to the page. This is a valid concern. But the way HOT prune works
today, it has no linkage to future UPDATE operations other than the fact
that it frees up space for them, and the prune can happen even in a SELECT
code path. Suggestion 2 below is about changing this behavior, but my point
is to consider 1 unless and until we do 2. Tom and Simon opposed this,
saying we need to take a holistic view.  Another concern with this idea is
that the VM-bit-set operation now generates WAL, and repeated
setting/clearing of the bit may increase WAL activity. I suggested
piggybacking the VM-bit-set logging on the HOT prune WAL record. Robert
raised some doubts regarding increased full-page writes if the VM-set LSN
is recorded in the heap page LSN. I am not sure that applies if we
piggyback the operation, though, because the HOT prune WAL record would set
the heap page LSN anyway.

If we do this, we should also consider updating the FSM after prune,
because vacuum may not scan this page at all.
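(For concreteness, a hedged sketch of what the prune-time path could look
like, modeled on vacuumlazy.c's existing VM logic; the exact function
signatures vary across releases:)

    /*
     * Hedged sketch only: after pruning, if the page turns out to be
     * all-visible, set the page flag and the visibility map bit, as
     * vacuum does today.
     */
    TransactionId visibility_cutoff_xid;

    if (!PageIsAllVisible(page) &&
        heap_page_is_all_visible(buffer, &visibility_cutoff_xid))
    {
        PageSetAllVisible(page);
        MarkBufferDirty(buffer);
        visibilitymap_set(relation, blkno, buffer, InvalidXLogRecPtr,
                          vmbuffer, visibility_cutoff_xid);
    }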

*Suggestion 2: Move HOT prune logic somewhere else*
Tom/Simon suggested that we ought to consider moving HOT prune to some
other code path. When we implemented HOT a few years back, we wanted to
make it as minimally invasive as possible. But now that the code has proven
its stability, we can experiment with a few more things. In particular, we
would like to prune only if the page is going to receive an UPDATE soon.
Otherwise, pruning may unnecessarily add overhead to a simple read-only
query, and the space freed up by the prune may not even be used soon, or
ever. Tom suggested that we could teach the planner/executor to
distinguish between a scan on a normal relation vs. a result relation. I
had some concerns that even if we have such a mechanism, it may not be
enough, because a scan does not guarantee that the tuple will finally be
updated; it may fail qualification, etc.

Simon has strong views regarding burdening SELECTs with maintenance work,
but Robert and I are not convinced that it's necessarily a bad idea to let
SELECTs do a little extra work which can help to keep the overall state
healthy.  But in general, it might be a good idea to try such approaches
and see if we can extract more out of the system. Suggestions 5 and 6 also
popped up to handle this problem in a slightly different way.

*Suggestion 3: Don't clear visibility map bit after HOT update*
I proposed this during the course of discussion and Andreas F
liked/supported the idea. This could be useful when most/all updates are
HOT updates, so the page does not need any work during vacuum (assuming HOT
prune will take care of it) and index-only scans still work because the
index pointers will be pointing to a valid HOT chain. Tom/Simon didn't
quite like it because they were worried that this will change the meaning
of the VM. I (and I think even Andreas) don't see it that way. Of course,
there are some concerns, because this will break the use of the
PD_ALL_VISIBLE flag for avoiding MVCC checks during heap scans. There are a
couple of suggestions to fix that, like having another page-level bit to
differentiate these two states. Doing that would help us skip MVCC checks
even if there are one or more DEAD line pointers in the page. We should
also run some performance tests to see how much benefit is really gained by
skipping MVCC checks in heap scans. We can weigh that against the benefit
of keeping the VM bit set.
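(A purely hypothetical sketch of the two-bit idea; only PD_ALL_VISIBLE
exists today, and both the name and semantics of the second flag are
invented here for illustration:)

    #define PD_ALL_VISIBLE      0x0004   /* existing flag */
    #define PD_INDEX_VISIBLE    0x0008   /* hypothetical new flag:
                                          * matches the VM bit even when
                                          * DEAD line pointers remain */

    /* A HOT update would then do something like: */
    PageClearAllVisible(page);           /* MVCC checks needed again */
    /* ...while leaving PD_INDEX_VISIBLE and the VM bit alone. */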

*Suggestion 4: Freeze tuples during HOT prune*
I suggested that we can also freeze tuples during HOT prune. The rationale
for doing it this way is to remove unnecessary work from vacuum by
piggybacking the freeze logging on the HOT prune WAL record. Today, vacuum
will generate additional WAL and dirty the buffers again just to freeze the
tuples. There are a couple of objections to this idea. One is that it
pushes background work into a foreground operation.

[HACKERS] pg_upgrade with parallel tablespace copying

2013-01-07 Thread Bruce Momjian
Pg_upgrade by default (without --link) copies heap/index files from the
old to new cluster.  This patch implements parallel heap/index file
copying in pg_upgrade using the --jobs option.  It uses the same
infrastructure used for pg_upgrade parallel dump/restore.  Here are the
performance results:

              --- seconds ---
       GB       git    patched
        2     62.09      63.75
        4     95.93     107.22
        8    194.96     195.29
       16    494.38     348.93
       32    983.28     644.23
       64   2227.73    1244.08
      128   4735.83    2547.09

Because of the kernel cache, you only see a big win when the amount of
copy data exceeds the kernel cache.  For testing, I used a 24GB, 16-core
machine with two magnetic disks with one tablespace on each.  Using more
tablespaces would yield larger improvements.  My test script is
attached.  

I consider this patch ready for application.  This is the last
pg_upgrade performance improvement idea I am considering.
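(The dispatch shape, in rough outline -- a hedged sketch for fork()-based
platforms; the helper names are invented, not pg_upgrade's actual ones:)

    /*
     * Hedged sketch of per-tablespace parallelism: launch up to --jobs
     * children, each copying one tablespace's files.  Needs <sys/wait.h>;
     * copy_tablespace_files() is an invented helper.
     */
    int n_children = 0;

    for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
    {
        while (n_children >= user_opts.jobs)
        {
            wait(NULL);                 /* reap one child to free a slot */
            n_children--;
        }
        if (fork() == 0)
        {
            copy_tablespace_files(tblnum);
            exit(0);
        }
        n_children++;
    }
    while (n_children-- > 0)
        wait(NULL);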

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
new file mode 100644
index 59f8fd0..1780788
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** create_script_for_old_cluster_deletion(c
*** 606,612 
  	fprintf(script, RMDIR_CMD " %s\n", fix_path_separator(old_cluster.pgdata));
  
  	/* delete old cluster's alternate tablespaces */
! 	for (tblnum = 0; tblnum < os_info.num_tablespaces; tblnum++)
  	{
  		/*
  		 * Do the old cluster's per-database directories share a directory
--- 606,612 
  	fprintf(script, RMDIR_CMD " %s\n", fix_path_separator(old_cluster.pgdata));
  
  	/* delete old cluster's alternate tablespaces */
! 	for (tblnum = 0; tblnum < os_info.num_old_tablespaces; tblnum++)
  	{
  		/*
  		 * Do the old cluster's per-database directories share a directory
*** create_script_for_old_cluster_deletion(c
*** 621,634 
  			/* remove PG_VERSION? */
  			if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
  fprintf(script, RM_CMD " %s%s%cPG_VERSION\n",
! 		fix_path_separator(os_info.tablespaces[tblnum]), 
  		fix_path_separator(old_cluster.tablespace_suffix),
  		PATH_SEPARATOR);
  
  			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
  			{
  fprintf(script, RMDIR_CMD " %s%s%c%d\n",
! 		fix_path_separator(os_info.tablespaces[tblnum]),
  		fix_path_separator(old_cluster.tablespace_suffix),
  		PATH_SEPARATOR, old_cluster.dbarr.dbs[dbnum].db_oid);
  			}
--- 621,634 
  			/* remove PG_VERSION? */
  			if (GET_MAJOR_VERSION(old_cluster.major_version) <= 804)
  fprintf(script, RM_CMD " %s%s%cPG_VERSION\n",
! 		fix_path_separator(os_info.old_tablespaces[tblnum]), 
  		fix_path_separator(old_cluster.tablespace_suffix),
  		PATH_SEPARATOR);
  
  			for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
  			{
  fprintf(script, RMDIR_CMD " %s%s%c%d\n",
! 		fix_path_separator(os_info.old_tablespaces[tblnum]),
  		fix_path_separator(old_cluster.tablespace_suffix),
  		PATH_SEPARATOR, old_cluster.dbarr.dbs[dbnum].db_oid);
  			}
*** create_script_for_old_cluster_deletion(c
*** 640,646 
  			 * or a version-specific subdirectory.
  			 */
  			fprintf(script, RMDIR_CMD " %s%s\n",
! 	fix_path_separator(os_info.tablespaces[tblnum]), 
  	fix_path_separator(old_cluster.tablespace_suffix));
  	}
  
--- 640,646 
  			 * or a version-specific subdirectory.
  			 */
  			fprintf(script, RMDIR_CMD " %s%s\n",
! 	fix_path_separator(os_info.old_tablespaces[tblnum]), 
  	fix_path_separator(old_cluster.tablespace_suffix));
  	}
  
diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c
new file mode 100644
index 0c11ff8..7fd4584
*** a/contrib/pg_upgrade/info.c
--- b/contrib/pg_upgrade/info.c
*** create_rel_filename_map(const char *old_
*** 106,125 
  		 * relation belongs to the default tablespace, hence relfiles should
  		 * exist in the data directories.
  		 */
! 		snprintf(map->old_dir, sizeof(map->old_dir), "%s/base/%u", old_data,
!  old_db->db_oid);
! 		snprintf(map->new_dir, sizeof(map->new_dir), "%s/base/%u", new_data,
!  new_db->db_oid);
  	}
  	else
  	{
  		/* relation belongs to a tablespace, so use the tablespace location */
! 		snprintf(map->old_dir, sizeof(map->old_dir), "%s%s/%u", old_rel->tablespace,
!  old_cluster.tablespace_suffix, old_db->db_oid);
! 		snprintf(map->new_dir, sizeof(map->new_dir), "%s%s/%u", new_rel->tablespace,
!  new_cluster.tablespace_suffix, new_db->db_oid);
  	}
  
  	/*
  	 * old_relfilenode might differ from pg_class.oid (and hence
  	 * new_relfilenode) because of CLUSTER, REINDEX, or VACUUM FULL.
--- 106,130 
  		 * relation belongs to the default tablespace, hence relfiles

Re: [HACKERS] [PATCH] Patch to fix missing libecpg_compat.lib and libpgtypes.lib.

2013-01-07 Thread Peter Eisentraut
On Mon, 2012-11-19 at 14:03 +0800, JiangGuiqing wrote:
> hi
> 
> I install postgresql-9.1.5 from source code on windows successfully.
> But there is not "libecpg_compat.lib" and "libpgtypes.lib" in the
> installed lib directory .
> If someone uses functions of the pgtypes or ecpg_compat library in ecpg
> programs, the ecpg program build will fail.
> 
> So i modify src/tools/msvc/Install.pm to resolve this problem.
> The diff file refer to the attachment "Install.pm.patch".

Could someone with access to Windows verify this patch?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bad examples in pg_dump README

2013-01-07 Thread Peter Eisentraut
On Sat, 2013-01-05 at 15:34 +0100, Magnus Hagander wrote:
> On Fri, Jan 4, 2013 at 3:36 AM, Josh Kupershmidt  wrote:
> > I propose slimming down the pg_dump README, keeping intact the
> > introductory notes and details of the tar format.
> 
> Do we need to keep it at all, really? Certainly the introductory part
> is covered in the main documentation already...

I'd remove it and distribute the remaining information, if any, between
the source code and the man page.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PL/Python result object str handler

2013-01-07 Thread Peter Eisentraut
For debugging PL/Python functions, I'm often tempted to write something
like

rv = plpy.execute(...)
plpy.info(rv)

which prints something unhelpful like

    <PLyResult object at 0x...>

By implementing a "str" handler for the result object, it now prints
something like



Patch attached for review.

diff --git a/src/pl/plpython/expected/plpython_spi.out b/src/pl/plpython/expected/plpython_spi.out
index 3cda958..b07a429 100644
--- a/src/pl/plpython/expected/plpython_spi.out
+++ b/src/pl/plpython/expected/plpython_spi.out
@@ -263,6 +263,24 @@ CONTEXT:  PL/Python function "result_empty_test"
  
 (1 row)
 
+CREATE FUNCTION result_str_test(cmd text) RETURNS text
+AS $$
+plan = plpy.prepare(cmd)
+result = plpy.execute(plan)
+return str(result)
+$$ LANGUAGE plpythonu;
+SELECT result_str_test($$SELECT 1 AS foo, '11'::text AS bar UNION SELECT 2, '22'$$);
+   result_str_test
+--
+ 
+(1 row)
+
+SELECT result_str_test($$CREATE TEMPORARY TABLE foo1 (a int, b text)$$);
+   result_str_test
+--
+ 
+(1 row)
+
 -- cursor objects
 CREATE FUNCTION simple_cursor_test() RETURNS int AS $$
 res = plpy.cursor("select fname, lname from users")
diff --git a/src/pl/plpython/plpy_resultobject.c b/src/pl/plpython/plpy_resultobject.c
index ea93ad7..077bde6 100644
--- a/src/pl/plpython/plpy_resultobject.c
+++ b/src/pl/plpython/plpy_resultobject.c
@@ -22,6 +22,7 @@
 static PyObject *PLy_result_item(PyObject *arg, Py_ssize_t idx);
 static PyObject *PLy_result_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx);
 static int	PLy_result_ass_slice(PyObject *arg, Py_ssize_t lidx, Py_ssize_t hidx, PyObject *slice);
+static PyObject *PLy_result_str(PyObject *arg);
 static PyObject *PLy_result_subscript(PyObject *arg, PyObject *item);
 static int	PLy_result_ass_subscript(PyObject *self, PyObject *item, PyObject *value);
 
@@ -74,7 +75,7 @@
 	&PLy_result_as_mapping,		/* tp_as_mapping */
 	0,			/* tp_hash */
 	0,			/* tp_call */
-	0,			/* tp_str */
+	&PLy_result_str,			/* tp_str */
 	0,			/* tp_getattro */
 	0,			/* tp_setattro */
 	0,			/* tp_as_buffer */
@@ -249,6 +250,26 @@
 }
 
 static PyObject *
+PLy_result_str(PyObject *arg)
+{
+	PLyResultObject *ob = (PLyResultObject *) arg;
+
+#if PY_MAJOR_VERSION >= 3
+	return PyUnicode_FromFormat("<%s status=%S nrows=%S rows=%S>",
+Py_TYPE(ob)->tp_name,
+ob->status,
+ob->nrows,
+ob->rows);
+#else
+	return PyString_FromFormat("<%s status=%ld nrows=%ld rows=%s>",
+			   ob->ob_type->tp_name,
+			   PyInt_AsLong(ob->status),
+			   PyInt_AsLong(ob->nrows),
+			   PyString_AsString(PyObject_Str(ob->rows)));
+#endif
+}
+
+static PyObject *
 PLy_result_subscript(PyObject *arg, PyObject *item)
 {
 	PLyResultObject *ob = (PLyResultObject *) arg;
diff --git a/src/pl/plpython/sql/plpython_spi.sql b/src/pl/plpython/sql/plpython_spi.sql
index 6250e90..7a84473 100644
--- a/src/pl/plpython/sql/plpython_spi.sql
+++ b/src/pl/plpython/sql/plpython_spi.sql
@@ -169,6 +169,16 @@ CREATE FUNCTION result_empty_test() RETURNS void
 
 SELECT result_empty_test();
 
+CREATE FUNCTION result_str_test(cmd text) RETURNS text
+AS $$
+plan = plpy.prepare(cmd)
+result = plpy.execute(plan)
+return str(result)
+$$ LANGUAGE plpythonu;
+
+SELECT result_str_test($$SELECT 1 AS foo, '11'::text AS bar UNION SELECT 2, '22'$$);
+SELECT result_str_test($$CREATE TEMPORARY TABLE foo1 (a int, b text)$$);
+
 -- cursor objects
 
 CREATE FUNCTION simple_cursor_test() RETURNS int AS $$

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] lazy_vacuum_heap()'s removal of HEAPTUPLE_DEAD tuples

2013-01-07 Thread Noah Misch
Per this comment in lazy_scan_heap(), almost all tuple removal these days
happens in heap_page_prune():

			case HEAPTUPLE_DEAD:

				/*
				 * Ordinarily, DEAD tuples would have been removed by
				 * heap_page_prune(), but it's possible that the tuple
				 * state changed since heap_page_prune() looked.  In
				 * particular an INSERT_IN_PROGRESS tuple could have
				 * changed to DEAD if the inserter aborted.  So this
				 * cannot be considered an error condition.

vacuumlazy.c remains responsible for noticing the LP_DEAD line pointers left
by heap_page_prune(), removing corresponding index entries, and marking those
line pointers LP_UNUSED.

Nonetheless, lazy_vacuum_heap() retains the ability to remove actual
HEAPTUPLE_DEAD tuples and reclaim their LP_NORMAL line pointers.  This support
gets exercised only in the scenario described in the above comment.  For hot
standby, this capability requires its own WAL record, XLOG_HEAP2_CLEANUP_INFO,
to generate the necessary conflicts[1].  There is a bug in lazy_scan_heap()'s
bookkeeping for the xid to place in that WAL record.  Each call to
heap_page_prune() simply overwrites vacrelstats->latestRemovedXid, but
lazy_scan_heap() expects it to only ever increase the value.  I have
attached a minimal fix to be backpatched.  It has lazy_scan_heap() ignore
heap_page_prune()'s actions for the purpose of this conflict xid, because
heap_page_prune() emitted an XLOG_HEAP2_CLEAN record covering them.
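
In outline, the fix amounts to this much (a sketch; the throwaway variable
name is mine, not necessarily the patch's):

    /*
     * Give heap_page_prune() a throwaway variable instead of
     * &vacrelstats->latestRemovedXid, since the removals it performs are
     * already covered by its own XLOG_HEAP2_CLEAN record.
     */
    TransactionId prune_xid = InvalidTransactionId;    /* discarded */

    tups_vacuumed += heap_page_prune(onerel, buf, OldestXmin,
                                     false, &prune_xid);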

At that point in the investigation, I realized that the cost of being able to
remove entire tuples in lazy_vacuum_heap() greatly exceeds the benefit.
Again, the benefit is being able to remove tuples whose inserting transaction
aborted between the HeapTupleSatisfiesVacuum() call in heap_page_prune() and
the one in lazy_scan_heap().  To make that possible, lazy_vacuum_heap() grabs
a cleanup lock, calls PageRepairFragmentation(), and emits a WAL record for
every page containing LP_DEAD line pointers or HEAPTUPLE_DEAD tuples.  If we
take it out of the business of removing tuples, lazy_vacuum_heap() can skip
WAL and update lp_flags under a mere shared lock.  The second attached patch,
for HEAD, implements that.  Besides optimizing things somewhat, it simplifies
the code and removes rarely-tested branches.  (This patch supersedes the
backpatch-oriented patch rather than stacking with it.)
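
The second phase then shrinks to roughly this per page (a hedged sketch
following vacuumlazy.c's naming; locking and dirty-marking details elided):

    /*
     * Narrowed lazy_vacuum_page(): with whole-tuple removal gone, each
     * LP_DEAD item just becomes LP_UNUSED -- no PageRepairFragmentation(),
     * no WAL record.
     */
    for (; tupindex < vacrelstats->num_dead_tuples; tupindex++)
    {
        ItemPointer itemptr = &vacrelstats->dead_tuples[tupindex];
        ItemId      itemid;

        if (ItemPointerGetBlockNumber(itemptr) != blkno)
            break;              /* done with this page's entries */

        itemid = PageGetItemId(page,
                               ItemPointerGetOffsetNumber(itemptr));
        ItemIdSetUnused(itemid);
    }
    MarkBufferDirty(buffer);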

The bookkeeping behind the "page containing dead tuples is marked as
all-visible in relation" warning is also faulty; it only fires when
lazy_scan_heap() saw the HEAPTUPLE_DEAD tuple; again, heap_page_prune() will
be the one to see it in almost every case.  I changed the warning to fire
whenever the table cannot be marked all-visible for a reason other than the
presence of too-recent live tuples.

I considered renaming lazy_vacuum_heap() to lazy_heap_clear_dead_items(),
reflecting its narrower role.  Ultimately, I left function names unchanged.

This patch conflicts textually with Pavan's "Setting visibility map in
VACUUM's second phase" patch, but I don't see any conceptual incompatibility.

I can't give a simple statement of the performance improvement here.  The
XLOG_HEAP2_CLEAN record is fairly compact, so the primary benefit of avoiding
it is the possibility of avoiding a full-page image.  For example, if a
checkpoint lands just before the VACUUM and again during the index-cleaning
phase (assume just one such phase in this example), this patch reduces
heap-related WAL volume by almost 50%.  Improvements as extreme as 2% and 97%
are possible given other timings of checkpoints relative to the VACUUM.  In
general, expect this to help VACUUMs spanning several checkpoint cycles more
than it helps shorter VACUUMs.  I have attached a script I used as a reference
workload for testing different checkpoint timings.  There should also be some
improvement from keeping off WALInsertLock, not requiring WAL flushes to evict
from the ring buffer during the lazy_vacuum_heap() phase, and not taking a
second buffer cleanup lock.  I did not attempt to quantify those.

Thanks,
nm

[1] Normally, heap_page_prune() removes the tuple first (leaving an LP_DEAD
line pointer), and vacuumlazy.c removes index entries afterward.  When the
removal happens in this order, the XLOG_HEAP2_CLEAN record takes care of
conflicts.  However, in the rarely-used code path, we remove the index entries
before removing the tuple.  XLOG_HEAP2_CLEANUP_INFO conflicts with standby
snapshots that might need the vanishing index entries.
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***
*** 482,487  lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
--- 482,488 
boolall_

Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-07 Thread Shigeru Hanada
Hi Tomas,

On Tue, Jan 8, 2013 at 7:08 AM, Tomas Vondra  wrote:
>> * I found another extra space after asterisk.
>>
>> + RelFileNode * nodes;
>
> Thanks, fixed.

check

>> * Curly bracket which starts code block should be at the head of next line.
>>
>> + /* extend the array if needed (double the 
>> size) */
>> + if (maxrels <= nrels) {
>
> Fixed.

check

>> * There are several trailing tabs in src/backend/catalog/storage.c.
>
> Fixed (I believe).

check

>> * naming of DROP_RELATIONS_BSEARCH_LIMIT (or off-by-one bug?)
>> IIUC bsearch is used when # of relations to be dropped is *more* than
>> the value of DROP_RELATIONS_BSEARCH_LIMIT (10).  IMO this behavior is
>> not what the macro name implies; I thought that bsearch is used for 10
>> relations.  Besides, the word "LIMIT" is used as *upper limit* in
>> other macros.  How about MIN_DROP_REL[ATION]S_BSEARCH or
>> DROP_REL[ATION]S_LINEAR_SEARCH_LIMIT?
>> # using RELS instead of RELATIONS would be better to shorten the name
>>
>
> I've changed the name to DROP_RELS_BSEARCH_THRESHOLD. I believe this
> naming is consistent with options like "geqo_threshold" - when the
> number of relations reaches the specified value, the bsearch is used.
>
> I've also increased the value from 10 to 20, in accordance with the
> previous discussion.

New name sounds good to me, but the #define says that the value is 15.
Should it be fixed to 20?

>> * +1 for Alvaro's suggestion about avoiding palloc traffic by starting
>> with 8 elements or so.
>>
>
> Done.

Not yet.  Initial size of srels array is still 1 element.

Regards,
-- 
Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A very small typo in the comment

2013-01-07 Thread Bruce Momjian
On Mon, Jan  7, 2013 at 07:46:46PM -0500, Bruce Momjian wrote:
> On Sat, Jan  5, 2013 at 12:13:27AM +0800, 孟庆钟 wrote:
> > Hi,
> >   I am a student in China. I have read some source code of postgreSQL,
> > though only a little. I think maybe there is typo in this file: src\backend\
> > catalog\index.c. At line 649 in v9.2.0, line 653 in V9.2.2. The sentence is 
> > "
> > *indexInfo: same info executor uses to insert into the index", I think it
> > should not be 'same', but 'some'.
> >   I am not sure if my thoughts is right.
> >   Sorry for my poor English.
> 
> You are correct.  I have fixed it in the PG 9.3 source tree.  Thanks.

Oh, I see Tom's later comment now.  The change was not made to 9.3.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A very small typo in the comment

2013-01-07 Thread Bruce Momjian
On Sat, Jan  5, 2013 at 12:13:27AM +0800, 孟庆钟 wrote:
> Hi,
>   I am a student in China. I have read some source code of postgreSQL,
> though only a little. I think maybe there is typo in this file: src\backend\
> catalog\index.c. At line 649 in v9.2.0, line 653 in V9.2.2. The sentence is "
> *indexInfo: same info executor uses to insert into the index", I think it
> should not be 'same', but 'some'.
>   I am not sure if my thoughts is right.
>   Sorry for my poor English.

You are correct.  I have fixed it in the PG 9.3 source tree.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] in-catalog Extension Scripts and Control parameters (templates?)

2013-01-07 Thread Dimitri Fontaine
Hi,

Please find attached a preliminary patch following the TEMPLATE ideas,
and thanks in particular to Tom and Heikki for a practical design about
how to solve that problem!

Tom Lane  writes:
>>  - Extension Scripts are now stored in the catalogs, right?
>
> Only for these new-style thingies.  I am not suggesting breaking the
> existing file-based implementation, only offering a parallel
> catalog-based implementation too.  We'd have to think about what to do
> for name collisions --- probably having the catalog entry take
> precedence is okay, but is there an argument for something else?

The attached patch is implementing TEMPLATEs only for these new-style
thingies. Conflicts are checked at template creation time, and at create
extension time we do the file system lookup first, so that's the winner.

>> [ need separate catalogs for install scripts and update scripts ]
>
> Check.

You'll find the 3 of them in the attached unfinished patch (install,
update, control).

> Actually, given the text search precedent, I'm not sure why you're so
> against TEMPLATE.

So I called them TEMPLATE and I tried hard to leave that term open to
other uses. As a result the main syntax is

  CREATE TEMPLATE FOR EXTENSION …
  ALTER TEMPLATE FOR EXTENSION …
  DROP TEMPLATE FOR EXTENSION …

No new keyword has been added to the parser in the making of this patch.

You'll find some usage examples in the regression tests part of the
patch, and the new commands have received the very minimum documentation
coverage. I intend to fill in the docs some more before calling it ready
for commit, of course.

I'm at a point where I need feedback before continuing though, and I
think Tom is in the best position to provide it given the previous
exchanges.

>> The $2.56 question being what would be the pg_dump policy of the
>> "extension templates" objects?
>
> The ones that are catalog objects, not file objects, should be dumped
> I think.

So, the current version of the patch has no support for pg_dump and psql
yet, and most ALTER commands in the grammar are not yet implemented. The
list of missing pieces also includes ALTER … OWNER TO / RENAME and
COMMENT, both for the new catalog objects and for the extension to be
created from them.

I think we could transfer the COMMENT on the template from the
pg_extension_control (so that you can change the comment at upgrade) to
the extension, but wanted to talk about that first. The alternative is
to simply add a comment column to the pg_extension_control catalog,
along with a grammar rule to get the information from the commands.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



templates.v0.patch.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \l to accept patterns

2013-01-07 Thread Robert Haas
On Mon, Jan 7, 2013 at 5:14 PM, Peter Eisentraut  wrote:
> We removed showing system functions and operators from \df and \do
> without S.  Those are needed all the time.  This was controversial at
> the time, but it's the way it is now.  The definition of "S", I suppose,
> is more like "is there after database is created", not "typical users
> care about these objects".

System functions and operators are needed all the time, but so are
system tables and views, and the old behavior was that the latter were
suppressed by default and the former were included by default.  So I
consider that change to be well-justified on consistency grounds.
There's a practical consideration, as well.  Out of the box, there are
2400 entries for functions and 3 for databases.  This means that the
old \df behavior made it very hard to figure out what user-defined
functions exist in your database, but there's no corresponding problem
with \l.  Finally, note that you can drop the postgres database (and
everything else will still work) but you cannot drop
pg_table_is_visible(oid), because, as the error message will inform
you, it is required by the database system.

If we make the postgres database undroppable, unrenamable, and
strictly read-only, I will happily support a proposal to consider it a
system object.  Until then, it's no more a system object than the
public schema - which, you will note, \dn has no compunctions about
displaying, even without S.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \l to accept patterns

2013-01-07 Thread Peter Eisentraut
On 1/7/13 3:53 PM, Robert Haas wrote:
> On Mon, Jan 7, 2013 at 7:14 AM, Peter Eisentraut  wrote:
>> Here is a patch for psql's \l command to accept patterns, like \d
>> commands do.  While at it, I also added an "S" option to show system
>> objects and removed system objects from the default display.  This might
>> be a bit controversial, but it's how it was decided some time ago that
>> the \d commands should act.
> -1 from me on that last bit.  I don't think that you can really
> compare the postgres or template1 database to, say, the pg_toast
> schema.  There's no real reason for people to ever care about objects
> in the pg_toast schema, but the same cannot be said about template1,
> which it's often necessary to connect to when running many of the
> commands (vacuumdb -a, etc.) we ship with our distribution.  I think
> this will just be confusing to users without any real upside.

We removed showing system functions and operators from \df and \do
without S.  Those are needed all the time.  This was controversial at
the time, but it's the way it is now.  The definition of "S", I suppose,
is more like "is there after database is created", not "typical users
care about these objects".




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-07 Thread Tomas Vondra
On 7.1.2013 10:07, Shigeru Hanada wrote:
> Hi Tomas,
> 
> I reviewed v6 patch, and found that several places need fix.
> Sorry for extra nitpickings.
> 
> * I found another extra space after asterisk.
> 
> + RelFileNode * nodes;

Thanks, fixed.

> 
> * Curly bracket which starts code block should be at the head of next line.
> 
> + /* extend the array if needed (double the size) 
> */
> + if (maxrels <= nrels) {

Fixed.

> 
> * There are several trailing tabs in src/backend/catalog/storage.c.

Fixed (I believe).

> * naming of DROP_RELATIONS_BSEARCH_LIMIT (or off-by-one bug?)
> IIUC bsearch is used when # of relations to be dropped is *more* than
> the value of DROP_RELATIONS_BSEARCH_LIMIT (10).  IMO this behavior is
> not what the macro name implies; I thought that bsearch is used for 10
> relations.  Besides, the word "LIMIT" is used as *upper limit* in
> other macros.  How about MIN_DROP_REL[ATION]S_BSEARCH or
> DROP_REL[ATION]S_LINEAR_SEARCH_LIMIT?
> # using RELS instead of RELATIONS would be better to shorten the name
>

I've changed the name to DROP_RELS_BSEARCH_THRESHOLD. I believe this
naming is consistent with options like "geqo_threshold" - when the
number of relations reaches the specified value, the bsearch is used.

I've also increased the value from 10 to 20, in accordance with the
previous discussion.

> 
> * +1 for Alvaro's suggestion about avoiding palloc traffic by starting
> with 8 elements or so.
> 

Done.

regards
Tomas
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 8dc622a..d8dabc6 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -312,6 +312,11 @@ smgrDoPendingDeletes(bool isCommit)
 	PendingRelDelete *pending;
 	PendingRelDelete *prev;
 	PendingRelDelete *next;
+
+	SMgrRelation   *srels = palloc(sizeof(SMgrRelation));
+	int			nrels = 0,
+i = 0,
+maxrels = 8;
 
 	prev = NULL;
 	for (pending = pendingDeletes; pending != NULL; pending = next)
@@ -335,14 +340,32 @@ smgrDoPendingDeletes(bool isCommit)
 SMgrRelation srel;
 
 srel = smgropen(pending->relnode, pending->backend);
-smgrdounlink(srel, false);
-smgrclose(srel);
+
+/* extend the array if needed (double the size) */
+if (maxrels <= nrels)
+{
+	maxrels *= 2;
+	srels = repalloc(srels, sizeof(SMgrRelation) * maxrels);
+}
+
+srels[nrels++] = srel;
 			}
 			/* must explicitly free the list entry */
 			pfree(pending);
 			/* prev does not change */
 		}
 	}
+
+	if (nrels > 0)
+	{
+		smgrdounlinkall(srels, nrels, false);
+
+		for (i = 0; i < nrels; i++)
+			smgrclose(srels[i]);
+	}
+
+	pfree(srels);
+
 }
 
 /*
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index dddb6c0..1665630 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -62,6 +62,7 @@
 #define BUF_WRITTEN0x01
 #define BUF_REUSABLE			0x02
 
+#define DROP_RELS_BSEARCH_THRESHOLD		15
 
 /* GUC variables */
 bool		zero_damaged_pages = false;
@@ -108,6 +109,7 @@ static volatile BufferDesc *BufferAlloc(SMgrRelation smgr,
 static void FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln);
 static void AtProcExit_Buffers(int code, Datum arg);
 
+static int rnode_comparator(const void * p1, const void * p2);
 
 /*
  * PrefetchBuffer -- initiate asynchronous read of a block of a relation
@@ -2094,35 +2096,87 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
  * 
  */
 void
-DropRelFileNodeAllBuffers(RelFileNodeBackend rnode)
+DropRelFileNodeAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
 {
-	int			i;
+	int i, j, n = 0;
+	RelFileNode *nodes;
+
+	nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
 
 	/* If it's a local relation, it's localbuf.c's problem. */
-	if (RelFileNodeBackendIsTemp(rnode))
+	for (i = 0; i < nnodes; i++)
 	{
-		if (rnode.backend == MyBackendId)
-			DropRelFileNodeAllLocalBuffers(rnode.node);
+		if (RelFileNodeBackendIsTemp(rnodes[i]))
+		{
+			if (rnodes[i].backend == MyBackendId)
+DropRelFileNodeAllLocalBuffers(rnodes[i].node);
+		}
+		else
+		{
+			nodes[n++] = rnodes[i].node;
+		}
+	}
+
+	/*
+	 * If there are no non-local relations, then we're done. Release the memory
+	 * and return.
+	 */
+	if (n == 0)
+	{
+		pfree(nodes);
 		return;
 	}
 
+	/* sort the list of rnodes (but only if we're going to use the bsearch) */
+	if (n > DROP_RELS_BSEARCH_THRESHOLD)
+		pg_qsort(nodes, n, sizeof(RelFileNode), rnode_comparator);
+
 	for (i = 0; i < NBuffers; i++)
 	{
+		RelFileNode *rnode = NULL;
 		volatile BufferDesc *bufHdr = &BufferDescriptors[i];
-
+ 
 		/*
 		 * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
 		 * and saves some cycles.
 		 */
-		if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
+
+		/*
+		 * For low number of relations to drop just use a

Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Robert Haas
On Mon, Jan 7, 2013 at 4:19 PM, Tom Lane  wrote:
> Hm ... one of us is reading those results backwards, then.

*looks*

It's me.

Sorry for the noise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2013-01-07 Thread Alvaro Herrera
Tom Lane escribió:
> Robert Haas  writes:
> > On Mon, Jan 7, 2013 at 3:43 PM, Alvaro Herrera  
> > wrote:

> >> If this is considered a problem, I think the way to fix it is to have a
> >> getObjectDescriptionOids() variant that quotes the object name in the
> >> output.
> 
> > This sort of thing has been rejected repeatedly in the past on
> > translation grounds:
> 
> Yes.  I'm surprised Alvaro isn't well aware of the rules against trying
> to build error messages out of sentence fragments: see first item under
> http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES

Actually I'm fully aware of the limitations; I just had a brain fade.  I
knew there was something wrong with that usage of getObjectDescription,
but managed to miss what it was exactly.  Doh.  Thankfully there are
other hackers paying attention.

BTW this patch was correct in this regard in a previous version -- there
was one separate copy of the error message per object type.  I think
it's just a matter of returning to that older coding.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Andrew Dunstan


On 01/07/2013 04:19 PM, Tom Lane wrote:
> Robert Haas  writes:
>> On Mon, Jan 7, 2013 at 11:16 AM, Tom Lane  wrote:
>>> Why would that be a good tradeoff to make?  Larger stored values require
>>> more I/O, which is likely to swamp any CPU savings in the compression
>>> step.  Not to mention that a value once written may be read many times,
>>> so the extra I/O cost could be multiplied many times over later on.
>
>> I agree with this analysis, but I note that the test results show it
>> actually improving things along both parameters.
>
> Hm ... one of us is reading those results backwards, then.




I just went back and looked. Unless I'm misreading it he has about a 2.5 
times speed improvement but about a 20% worse compression result.


What would be interesting would be to see if the knobs he's tweaked 
could be tweaked a bit more to give us substantial speedup without 
significant space degradation.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Merlin Moncure
On Mon, Jan 7, 2013 at 2:41 PM, Tom Lane  wrote:
> Merlin Moncure  writes:
>> On Mon, Jan 7, 2013 at 10:16 AM, Tom Lane  wrote:
>>> Takeshi Yamamuro  writes:
 The attached is a patch to improve compression speeds with loss of
 compression ratios in backend/utils/adt/pg_lzcompress.c.
>
>>> Why would that be a good tradeoff to make?  Larger stored values require
>>> more I/O, which is likely to swamp any CPU savings in the compression
>>> step.  Not to mention that a value once written may be read many times,
>>> so the extra I/O cost could be multiplied many times over later on.
>
>> I disagree.  pg compression is so awful it's almost never a net win.
>> I turn it off.
>
> One report doesn't make it useless, but even if it is so on your data,
> why would making it even less effective be a win?

That's a fair point.  I'm neutral on the OP's proposal -- it's just
moving spots around the dog.  If we didn't have better options, maybe
offering options to tune what we have would be worth implementing...
but by your standard ISTM we can't even do *that*.

>>> Another thing to keep in mind is that the compression area in general
>>> is a minefield of patents.  We're fairly confident that pg_lzcompress
>>> as-is doesn't fall foul of any, but any significant change there would
>>> probably require more research.
>
>> A minefield of *expired* patents.  Fast lz based compression is used
>> all over the place -- for example by the lucene.
>
> The patents that had to be dodged for original LZ compression are gone,
> true, but what's your evidence for saying that newer versions don't have
> newer patents?

That's impossible (at least for a non-attorney) to do because the
patents are still flying (for example:
http://www.google.com/patents/US7650040).  That said, you've framed
the debate so that any improvement to postgres compression requires an
IP lawyer.  That immediately raises some questions:

*) why hold only compression type features in postgres to that
standard?  Patents get mentioned here and there in the context of
other features in the archives but only compression seems to require a
proven clean pedigree.   Why don't we require a patent search for
other interesting features? What evidence do *you* offer that lz4
violates any patents?

*) why is postgres the only FOSS project that cares about
patentability of say, lz4?  (google 'lz4 patent')

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jan 7, 2013 at 11:16 AM, Tom Lane  wrote:
>> Why would that be a good tradeoff to make?  Larger stored values require
>> more I/O, which is likely to swamp any CPU savings in the compression
>> step.  Not to mention that a value once written may be read many times,
>> so the extra I/O cost could be multiplied many times over later on.

> I agree with this analysis, but I note that the test results show it
> actually improving things along both parameters.

Hm ... one of us is reading those results backwards, then.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \l to accept patterns

2013-01-07 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jan 7, 2013 at 7:14 AM, Peter Eisentraut  wrote:
>> Here is a patch for psql's \l command to accept patterns, like \d
>> commands do.  While at it, I also added an "S" option to show system
>> objects and removed system objects from the default display.  This might
>> be a bit controversial, but it's how it was decided some time ago that
>> the \d commands should act.

> -1 from me on that last bit.  I don't think that you can really
> compare the postgres or template1 database to, say, the pg_toast
> schema.  There's no real reason for people to ever care about objects
> in the pg_toast schema, but the same cannot be said about template1,
> which it's often necessary to connect to when running many of the
> commands (vacuumdb -a, etc.) we ship with our distribution.  I think
> this will just be confusing to users without any real upside.

Suppressing the postgres DB is even worse.

I think that it might be sensible to have an "S" option and define
"system" DBs as those without datallowconn, which ordinarily would only
hide template0.  But I can't get real excited about that.  People do
need to know about the existence of template0 (for use in
CREATE DATABASE ... TEMPLATE ...), which is not so true of, say,
pg_temp_NNN schemas.  The "it reduces clutter" argument also seems
pretty weak if we're only hiding one database, or even three of them.

On the whole I lean towards not adding this notion.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2013-01-07 Thread Kohei KaiGai
2013/1/7 Tom Lane :
> Robert Haas  writes:
>> On Mon, Jan 7, 2013 at 3:43 PM, Alvaro Herrera  
>> wrote:
>>> I checked this patch.  It needed a rebase for the changes to return
>>> OIDs.  Attached patch applies to current HEAD.  In general this looks
>>> good, with one exception: it's using getObjectDescriptionOids() to build
>>> the messages to complain in case the object already exists in the
>>> current schema, which results in diffs like this:
>>>
>>> -ERROR:  event trigger "regress_event_trigger2" already exists
>>> +ERROR:  event trigger regress_event_trigger2 already exists
>>>
>>> I don't know how tense we are about keeping the quotes, but I fear there
>>> would be complaints because it took us lots of sweat, blood and tears to
>>> get where we are now.
>>>
>>> If this is considered a problem, I think the way to fix it is to have a
>>> getObjectDescriptionOids() variant that quotes the object name in the
>>> output.
>
>> This sort of thing has been rejected repeatedly in the past on
>> translation grounds:
>
> Yes.  I'm surprised Alvaro isn't well aware of the rules against trying
> to build error messages out of sentence fragments: see first item under
> http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES
>
> Presence or absence of quotes is the very least of this code's i18n
> problems.
>
> If we had no other choice, we might consider a workaround such as that
> suggested in "Assembling Error Messages"
> http://www.postgresql.org/docs/devel/static/error-style-guide.html#AEN98605
> but frankly I'm not convinced that this patch is attractive enough to
> justify a degradation in message readability.
>
Sorry, I forgot that Robert pointed out the same thing before.
I'll reconstruct the portion that raises the error message.

Thanks,
-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Robert Haas
On Mon, Jan 7, 2013 at 11:16 AM, Tom Lane  wrote:
> Why would that be a good tradeoff to make?  Larger stored values require
> more I/O, which is likely to swamp any CPU savings in the compression
> step.  Not to mention that a value once written may be read many times,
> so the extra I/O cost could be multiplied many times over later on.

I agree with this analysis, but I note that the test results show it
actually improving things along both parameters.

I'm not sure how general that result is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2013-01-07 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jan 7, 2013 at 3:43 PM, Alvaro Herrera  
> wrote:
>> I checked this patch.  It needed a rebase for the changes to return
>> OIDs.  Attached patch applies to current HEAD.  In general this looks
>> good, with one exception: it's using getObjectDescriptionOids() to build
>> the messages to complain in case the object already exists in the
>> current schema, which results in diffs like this:
>> 
>> -ERROR:  event trigger "regress_event_trigger2" already exists
>> +ERROR:  event trigger regress_event_trigger2 already exists
>> 
>> I don't know how tense we are about keeping the quotes, but I fear there
>> would be complaints because it took us lots of sweat, blood and tears to
>> get where we are now.
>> 
>> If this is considered a problem, I think the way to fix it is to have a
>> getObjectDescriptionOids() variant that quotes the object name in the
>> output.

> This sort of thing has been rejected repeatedly in the past on
> translation grounds:

Yes.  I'm surprised Alvaro isn't well aware of the rules against trying
to build error messages out of sentence fragments: see first item under
http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES

Presence or absence of quotes is the very least of this code's i18n
problems.

If we had no other choice, we might consider a workaround such as that
suggested in "Assembling Error Messages"
http://www.postgresql.org/docs/devel/static/error-style-guide.html#AEN98605
but frankly I'm not convinced that this patch is attractive enough to
justify a degradation in message readability.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \l to accept patterns

2013-01-07 Thread Robert Haas
On Mon, Jan 7, 2013 at 7:14 AM, Peter Eisentraut  wrote:
> Here is a patch for psql's \l command to accept patterns, like \d
> commands do.  While at it, I also added an "S" option to show system
> objects and removed system objects from the default display.  This might
> be a bit controversial, but it's how it was decided some time ago that
> the \d commands should act.

-1 from me on that last bit.  I don't think that you can really
compare the postgres or template1 database to, say, the pg_toast
schema.  There's no real reason for people to ever care about objects
in the pg_toast schema, but the same cannot be said about template1,
which it's often necessary to connect to when running many of the
commands (vacuumdb -a, etc.) we ship with our distribution.  I think
this will just be confusing to users without any real upside.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2013-01-07 Thread Robert Haas
On Mon, Jan 7, 2013 at 3:43 PM, Alvaro Herrera  wrote:
> Kohei KaiGai escribió:
>
>> The attached patch is a revised version.
>>
>> It follows the manner in ExecAlterObjectSchemaStmt(); that tries to check
>> name duplication for object classes that support catcache with name-key.
>> Elsewhere, it calls individual routines for each object class to solve object
>> name and check namespace conflicts.
>> For example, RenameFunction() is still called from ExecRenameStmt(),
>> however, it looks up the given function name and checks namespace
>> conflict only, then it just invokes AlterObjectRename_internal() in spite
>> of system catalog updates by itself.
>> It allows to use this consolidated routine by object classes with special
>> rule for namespace conflicts, such as collation.
>> In addition, this design allowed to eliminate get_object_type() to pull
>> OBJECT_* enum from class_id/object_id pair.
>
> I checked this patch.  It needed a rebase for the changes to return
> OIDs.  Attached patch applies to current HEAD.  In general this looks
> good, with one exception: it's using getObjectDescriptionOids() to build
> the messages to complain in case the object already exists in the
> current schema, which results in diffs like this:
>
> -ERROR:  event trigger "regress_event_trigger2" already exists
> +ERROR:  event trigger regress_event_trigger2 already exists
>
> I don't know how tense we are about keeping the quotes, but I fear there
> would be complaints because it took us lots of sweat, blood and tears to
> get where we are now.
>
> If this is considered a problem, I think the way to fix it is to have a
> getObjectDescriptionOids() variant that quotes the object name in the
> output.  This would be pretty intrusive: we'd have to change things
> so that, for instance,
>
> appendStringInfo(&buffer, _("collation %s"),
> NameStr(coll->collname));
>
> would become
>
> if (quotes)
> appendStringInfo(&buffer, _("collation \"%s\""),
> NameStr(coll->collname));
> else
> appendStringInfo(&buffer, _("collation %s"),
> NameStr(coll->collname));
>
> Not really thrilled with this idea.  Of course,
> getObjectDescription() itself should keep the same API as now, to avoid
> useless churn all over the place.

This sort of thing has been rejected repeatedly in the past on
translation grounds:

+   ereport(ERROR,
+   
(errcode(ERRCODE_DUPLICATE_OBJECT),
+errmsg("%s already exists in 
schema \"%s\"",
+   
getObjectDescriptionOids(classId, exists),
+   
get_namespace_name(namespaceId;

If we're going to start allowing that, there's plenty of other code
that can be hit with the same hammer, but I'm pretty sure that all
previous proposals in this area have been slapped down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recent ALTER whatever .. SET SCHEMA refactoring

2013-01-07 Thread Robert Haas
On Mon, Jan 7, 2013 at 2:14 PM, Alvaro Herrera  wrote:
> Kohei KaiGai escribió:
>
>> Function and collation are candidates of this special case handling;
>> here are just two kinds of object.
>>
>> Another idea is to add a function-pointer as argument of
>> AlterNamespace_internal for (upcoming) object classes that takes
>> special handling for detection of name collision.
>> My personal preference is the later one, rather than hardwired
>> special case handling.
>> However, it may be too elaborate to handle just two exceptions.
>
> I think this idea is fine.  Pass a function pointer which is only
> not-NULL for the two exceptional cases; the code should have an Assert
> that either the function pointer is passed, or there is a nameCacheId to
> use.  That way, the object types we already handle in the simpler way do
> not get any more complicated than they are today, and we're not forced
> to create useless callbacks for objects where the lookup is trivial.  The
> function pointer should return boolean, true when the function/collation
> is already in the given schema; that way, the message wording is only
> present in AlterObjectNamespace_internal.

It seems overly complex to me.  What's wrong with putting special-case
logic directly into the function?  That seems cleaner and easier to
understand, and there's no real downside AFAICS.  We have similar
special cases elsewhere; the code can't be simpler than the actual
logic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER command reworks

2013-01-07 Thread Alvaro Herrera
Kohei KaiGai escribió:

> The attached patch is a revised version.
> 
> It follows the manner in ExecAlterObjectSchemaStmt(); that tries to check
> name duplication for object classes that support catcache with name-key.
> Elsewhere, it calls individual routines for each object class to solve object
> name and check namespace conflicts.
> For example, RenameFunction() is still called from ExecRenameStmt(),
> however, it looks up the given function name and checks namespace
> conflict only, then it just invokes AlterObjectRename_internal() in spite
> of system catalog updates by itself.
> It allows to use this consolidated routine by object classes with special
> rule for namespace conflicts, such as collation.
> In addition, this design allowed to eliminate get_object_type() to pull
> OBJECT_* enum from class_id/object_id pair.

I checked this patch.  It needed a rebase for the changes to return
OIDs.  Attached patch applies to current HEAD.  In general this looks
good, with one exception: it's using getObjectDescriptionOids() to build
the messages to complain in case the object already exists in the
current schema, which results in diffs like this:

-ERROR:  event trigger "regress_event_trigger2" already exists
+ERROR:  event trigger regress_event_trigger2 already exists

I don't know how tense we are about keeping the quotes, but I fear there
would be complaints because it took us lots of sweat, blood and tears to
get where we are now.

If this is considered a problem, I think the way to fix it is to have a
getObjectDescriptionOids() variant that quotes the object name in the
output.  This would be pretty intrusive: we'd have to change things
so that, for instance,

appendStringInfo(&buffer, _("collation %s"),
NameStr(coll->collname));

would become

if (quotes)
appendStringInfo(&buffer, _("collation \"%s\""),
NameStr(coll->collname));
else
appendStringInfo(&buffer, _("collation %s"),
NameStr(coll->collname));

Not really thrilled with this idea.  Of course,
getObjectDescription() itself should keep the same API as now, to avoid
useless churn all over the place.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/commands/aggregatecmds.c
--- b/src/backend/commands/aggregatecmds.c
***
*** 29,34 
--- 29,35 
  #include "catalog/pg_aggregate.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_type.h"
+ #include "commands/alter.h"
  #include "commands/defrem.h"
  #include "miscadmin.h"
  #include "parser/parse_func.h"
***
*** 240,254  RenameAggregate(List *name, List *args, const char *newname)
HeapTuple   tup;
Form_pg_proc procForm;
Relationrel;
-   AclResult   aclresult;
  
rel = heap_open(ProcedureRelationId, RowExclusiveLock);
  
/* Look up function and make sure it's an aggregate */
procOid = LookupAggNameTypeNames(name, args, false);
  
!   tup = SearchSysCacheCopy1(PROCOID, ObjectIdGetDatum(procOid));
!   if (!HeapTupleIsValid(tup)) /* should not happen */
elog(ERROR, "cache lookup failed for function %u", procOid);
procForm = (Form_pg_proc) GETSTRUCT(tup);
  
--- 241,254 
HeapTuple   tup;
Form_pg_proc procForm;
Relationrel;
  
rel = heap_open(ProcedureRelationId, RowExclusiveLock);
  
/* Look up function and make sure it's an aggregate */
procOid = LookupAggNameTypeNames(name, args, false);
  
!   tup = SearchSysCache1(PROCOID, ObjectIdGetDatum(procOid));
!   if (!HeapTupleIsValid(tup)) /* should not happen */
elog(ERROR, "cache lookup failed for function %u", procOid);
procForm = (Form_pg_proc) GETSTRUCT(tup);
  
***
*** 268,291  RenameAggregate(List *name, List *args, const char *newname)

   procForm->proargtypes.values),

get_namespace_name(namespaceOid;
  
!   /* must be owner */
!   if (!pg_proc_ownercheck(procOid, GetUserId()))
!   aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_PROC,
!  NameListToString(name));
! 
!   /* must have CREATE privilege on namespace */
!   aclresult = pg_namespace_aclcheck(namespaceOid, GetUserId(), 
ACL_CREATE);
!   if (aclresult != ACLCHECK_OK)
!   aclcheck_error(aclresult, ACL_KIND_NAMESPACE,
!  get_namespace_name(namespaceOid));
! 
!   /* rename */
!   namestrcpy(&(((Form_pg_proc) GETSTRUCT(tup))->proname), newname);
!   simple_heap_update(rel, &tup->t_self, tup);
!   Catal

Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Tom Lane
Merlin Moncure  writes:
> On Mon, Jan 7, 2013 at 10:16 AM, Tom Lane  wrote:
>> Takeshi Yamamuro  writes:
>>> The attached is a patch to improve compression speeds with loss of
>>> compression ratios in backend/utils/adt/pg_lzcompress.c.

>> Why would that be a good tradeoff to make?  Larger stored values require
>> more I/O, which is likely to swamp any CPU savings in the compression
>> step.  Not to mention that a value once written may be read many times,
>> so the extra I/O cost could be multiplied many times over later on.

> I disagree.  pg compression is so awful it's almost never a net win.
> I turn it off.

One report doesn't make it useless, but even if it is so on your data,
why would making it even less effective be a win?

>> Another thing to keep in mind is that the compression area in general
>> is a minefield of patents.  We're fairly confident that pg_lzcompress
>> as-is doesn't fall foul of any, but any significant change there would
>> probably require more research.

> A minefield of *expired* patents.  Fast LZ-based compression is used
> all over the place -- for example by Lucene.

The patents that had to be dodged for original LZ compression are gone,
true, but what's your evidence for saying that newer versions don't have
newer patents?

regards, tom lane




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Merlin Moncure
On Mon, Jan 7, 2013 at 10:16 AM, Tom Lane  wrote:
> Takeshi Yamamuro  writes:
>> The attached is a patch to improve compression speeds with loss of
>> compression ratios in backend/utils/adt/pg_lzcompress.c.
>
> Why would that be a good tradeoff to make?  Larger stored values require
> more I/O, which is likely to swamp any CPU savings in the compression
> step.  Not to mention that a value once written may be read many times,
> so the extra I/O cost could be multiplied many times over later on.

I disagree.  pg compression is so awful it's almost never a net win.
I turn it off.

> Another thing to keep in mind is that the compression area in general
> is a minefield of patents.  We're fairly confident that pg_lzcompress
> as-is doesn't fall foul of any, but any significant change there would
> probably require more research.

A minefield of *expired* patents.  Fast LZ-based compression is used
all over the place -- for example by Lucene.

lz4.

merlin




Re: [HACKERS] json api WIP patch

2013-01-07 Thread Pavel Stehule
2013/1/7 Merlin Moncure :
> On Fri, Jan 4, 2013 at 3:03 PM, Pavel Stehule  wrote:
>> I understand - but hstore isn't in core - so it should not set a precedent
>>
>> regexp_split_to_table
>>
>> I am not a native speaker, and it sounds a little bit strange - but maybe
>> that's because I am not a native speaker :)
>
> it's common usage: http://api.jquery.com/jQuery.each/
>

ook

Regards

Pavel

> the patch looks fabulous.  There are still a few trivial whitespace
> issues, and I noticed a leaked hstore comment @ 2440:
> +   /*
> +* if the input hstore is empty, we can only skip the rest if we were
> +* passed in a non-null record, since otherwise there may be issues 
> with
> +* domain nulls.
> +*/
>
> merlin




Re: [HACKERS] psql \l to accept patterns

2013-01-07 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Here is a patch for psql's \l command to accept patterns, like \d
> commands do.  While at it, I also added an "S" option to show system
> objects and removed system objects from the default display.  This might
> be a bit controversial, but it's how it was decided some time ago that
> the \d commands should act.

How does this affect psql -l?  Should it, for instance, hide system DBs?
Accept an optional pattern?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] psql \l to accept patterns

2013-01-07 Thread Fabrízio de Royes Mello
On Mon, Jan 7, 2013 at 10:14 AM, Peter Eisentraut  wrote:

> Here is a patch for psql's \l command to accept patterns, like \d
> commands do.  While at it, I also added an "S" option to show system
> objects and removed system objects from the default display.  This might
> be a bit controversial, but it's how it was decided some time ago that
> the \d commands should act.
>
>
I applied the attached patch to the current master branch and everything is
ok.

The build works fine too... and I did some tests:

1) Now '\l' list only regular databases

postgres=# \l
   List of databases
 Name | Owner | Encoding | Collate | Ctype | Access privileges
--+---+--+-+---+---
(0 rows)

postgres=# CREATE DATABASE fabrizio;
CREATE DATABASE
postgres=# \l
   List of databases
   Name   |  Owner   | Encoding |   Collate   |Ctype| Access
privileges
--+--+--+-+-+---
 fabrizio | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
(1 row)

postgres=# \l+
  List of databases
   Name   |  Owner   | Encoding |   Collate   |Ctype| Access
privileges |  Size   | Tablespace | Description
--+--+--+-+-+---+-++-
 fabrizio | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
| 5945 kB | pg_default |
(1 row)

postgres=# DROP DATABASE fabrizio;
DROP DATABASE
postgres=# \l
   List of databases
 Name | Owner | Encoding | Collate | Ctype | Access privileges
--+---+--+-+---+---
(0 rows)


2) The new sub-command '\lS' list regular and systems databases

postgres=# \lS
  List of databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access
privileges
---+--+--+-+-+---
 postgres  | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 template0 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +
   |  |  | | |
fabrizio=CTc/fabrizio
 template1 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +
   |  |  | | |
fabrizio=CTc/fabrizio
(3 rows)

postgres=# CREATE DATABASE fabrizio;
CREATE DATABASE
postgres=# \lS
  List of databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access
privileges
---+--+--+-+-+---
 fabrizio  | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 postgres  | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 template0 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +
   |  |  | | |
fabrizio=CTc/fabrizio
 template1 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +
   |  |  | | |
fabrizio=CTc/fabrizio
(4 rows)

postgres=# \lS+
List of
databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access
privileges   |  Size   | Tablespace |Description

---+--+--+-+-+---+-++

 fabrizio  | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
  | 5945 kB | pg_default |
 postgres  | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
  | 6041 kB | pg_default | default administrative connecti
on database
 template0 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +| 5945 kB | pg_default | unmodifiable empty database
   |  |  | | |
fabrizio=CTc/fabrizio | ||
 template1 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +| 5945 kB | pg_default | default template for new databa
ses
   |  |  | | |
fabrizio=CTc/fabrizio | ||
(4 rows)

postgres=# DROP DATABASE fabrizio ;
DROP DATABASE
postgres=# \lS
  List of databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access
privileges
---+--+--+-+-+---
 postgres  | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
 template0 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +
   |  |  | | |
fabrizio=CTc/fabrizio
 template1 | fabrizio | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/fabrizio
 +

Re: [HACKERS] recent ALTER whatever .. SET SCHEMA refactoring

2013-01-07 Thread Alvaro Herrera
Kohei KaiGai wrote:

> Function and collation are candidates for this special-case handling;
> there are just two kinds of objects.
> 
> Another idea is to add a function pointer as an argument of
> AlterNamespace_internal for (upcoming) object classes that need
> special handling to detect name collisions.
> My personal preference is the latter one, rather than hardwired
> special-case handling.
> However, it may be too elaborate to handle just two exceptions.

I think this idea is fine.  Pass a function pointer which is only
not-NULL for the two exceptional cases; the code should have an Assert
that either the function pointer is passed, or there is a nameCacheId to
use.  That way, the object types we already handle in the simpler way do
not get any more complicated than they are today, and we're not forced
to create useless callbacks for objects where the lookup is trivial.  The
function pointer should return boolean, true when the function/collation
is already in the given schema; that way, the message wording is only
present in AlterObjectNamespace_internal.
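
As a standalone sketch (hypothetical names and error wording; this is not
actual PostgreSQL code), the shape being proposed looks roughly like this,
with the callback non-NULL only for the exceptional classes:

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    typedef unsigned int Oid;
    /* returns true when the object already exists in the target schema */
    typedef bool (*InSchemaCheck) (Oid objectId, Oid nspOid);

    /* exceptional case: collations/functions need more than a name lookup */
    static bool
    collation_in_schema(Oid objectId, Oid nspOid)
    {
        (void) objectId; (void) nspOid;   /* real code would probe pg_collation */
        return false;
    }

    static void
    alter_object_namespace_internal(Oid objectId, Oid nspOid,
                                    int nameCacheId, InSchemaCheck check)
    {
        /* exactly one lookup strategy must be supplied */
        assert(check != NULL || nameCacheId >= 0);

        if (check != NULL && check(objectId, nspOid))
        {
            /* the message wording lives in one place, as suggested above */
            fprintf(stderr, "object %u already exists in schema %u\n",
                    objectId, nspOid);
            return;
        }
        /* common path: name-keyed syscache lookup, then the catalog update */
    }

    int
    main(void)
    {
        alter_object_namespace_internal(42, 2200, -1, collation_in_schema);
        alter_object_namespace_internal(43, 2200, 0, NULL);
        return 0;
    }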

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles

2013-01-07 Thread Claudio Freire
On Mon, Jan 7, 2013 at 3:27 PM, Tom Lane  wrote:
>
> One issue that needs some thought is that the argument for this formula
> is based entirely on thinking about b-trees.  I think it's probably
> reasonable to apply it to gist, gin, and sp-gist as well, assuming we
> can get some estimate of tree height for those, but it's obviously
> hogwash for hash indexes.  We could possibly just take H=0 for hash,
> and still apply the log2(N) part ... not so much because that is right
> as because it's likely too small to matter.

Height would be more precisely "lookup cost" (in comparisons). Most
indexing structures have a well-studied lookup cost. For b-trees, it's
log_b(size), for hash it's 1 + size/buckets.
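
In symbols, with n entries and page fanout b (a small worked instance under
assumed round numbers follows):

$$C_{\text{b-tree}}(n) \approx \log_b n, \qquad C_{\text{hash}}(n) \approx 1 + \frac{n}{\text{buckets}}$$

For n = 10^6 and b = 256, a b-tree lookup touches about log_256(10^6) ≈ 2.5
page levels, while a hash index with 10^6 buckets costs about 1 + 1 = 2.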




Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles

2013-01-07 Thread Tom Lane
Simon Riggs  writes:
> On 7 January 2013 17:35, Tom Lane  wrote:
>> That gives a formula of
>>  cpu_operator_cost * log2(N) + cpu_operator_cost * 50 * (H+2)

> Again, this depends on N and H, so that's good.

> I think my retinas detached while reading your explanation, but I'm a
> long way from coming up with a better or more principled one.

> If we can describe this as a heuristic that appears to fit the
> observed costs, we may keep the door open for something better a
> little later.

I'm fairly happy with the general shape of this formula: it has a
principled explanation and the resulting numbers appear to be sane.
The specific cost multipliers obviously are open to improvement based
on future evidence.  (In particular, I intend to code it in a way that
doesn't tie the "startup overhead" and "cost per page" numbers to be
equal, even though I'm setting them equal for the moment for lack of a
better idea.)

One issue that needs some thought is that the argument for this formula
is based entirely on thinking about b-trees.  I think it's probably
reasonable to apply it to gist, gin, and sp-gist as well, assuming we
can get some estimate of tree height for those, but it's obviously
hogwash for hash indexes.  We could possibly just take H=0 for hash,
and still apply the log2(N) part ... not so much because that is right
as because it's likely too small to matter.

regards, tom lane




Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles

2013-01-07 Thread Simon Riggs
On 7 January 2013 17:35, Tom Lane  wrote:

> That gives a formula of
>
> cpu_operator_cost * log2(N) + cpu_operator_cost * 50 * (H+2)
>
> This would lead to the behavior depicted in the attached plot, wherein
> I've modified the comparison lines (historical, 9.2, and HEAD behaviors)
> to include the existing 100 * cpu_operator_cost startup cost charge in
> addition to the fudge factor we've been discussing so far.  The new
> proposed curve is a bit above the historical curve for indexes with
> 250-5000 tuples, but the value is still quite small there, so I'm not
> too worried about that.  The people who've been complaining about 9.2's
> behavior have indexes much larger than that.
>
> Thoughts?

Again, this depends on N and H, so that's good.

I think my retinas detached while reading your explanation, but I'm a
long way from coming up with a better or more principled one.

If we can describe this as a heuristic that appears to fit the
observed costs, we may keep the door open for something better a
little later.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles

2013-01-07 Thread Tom Lane
I wrote:
> Now, if we consider only CPU costs of index descent, we expect
> about log2(P) comparisons to be needed on each of the H upper pages
> to be descended through, that is we have total descent cost
>   cpu_index_tuple_cost * H * log2(P)
> If we ignore the ceil() step as being a second-order correction, this
> can be simplified to
>   cpu_index_tuple_cost * H * log2(N)/(H+1)

I thought some more about this and concluded that the above reasoning is
incorrect, because it ignores the fact that initial positioning on the
index leaf page requires another log2(P) comparisons (to locate the
first matching tuple if any).  If you include those comparisons then the
H/(H+1) factor drops out and you are left with just "cost * log2(N)",
independently of the tree height.
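
Restating that cancellation in symbols (with N ≈ P^(H+1), so
log2(P) = log2(N)/(H+1)):

$$\underbrace{H \log_2 P}_{\text{upper pages}} + \underbrace{\log_2 P}_{\text{leaf positioning}} = (H+1)\,\frac{\log_2 N}{H+1} = \log_2 N$$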

But all is not lost for including some representation of the physical
index size into this calculation, because it seems plausible to consider
that there is some per-page cost for descending through the upper pages.
It's not nearly as much as random_page_cost, if the pages are cached,
but we don't have to suppose it's zero.  So that reasoning leads to a
formula like
cost-per-tuple * log2(N) + cost-per-page * (H+1)
which is better than the above proposal anyway because we can now
twiddle the two cost factors separately rather than being tied to a
fixed idea of how much a larger H hurts.

As for the specific costs to use, I'm now thinking that the
cost-per-tuple should be just cpu_operator_cost (0.0025) not
cpu_index_tuple_cost (0.005).  The latter is meant to model costs such
as reporting a TID back out of the index AM to the executor, which is
not what we're doing at an upper index entry.  I also propose setting
the per-page cost to some multiple of cpu_operator_cost, since it's
meant to represent a CPU cost not an I/O cost.

There is already a charge of 100 times cpu_operator_cost in
genericcostestimate to model "general costs of starting an indexscan".
I suggest that we should consider half of that to be actual fixed
overhead and half of it to be per-page cost for the first page, then
add another 50 times cpu_operator_cost for each page descended through.
That gives a formula of

cpu_operator_cost * log2(N) + cpu_operator_cost * 50 * (H+2)
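
As a quick sanity check, here is a minimal C sketch of that charge (it
mirrors the formula above under the default cpu_operator_cost of 0.0025;
it is not the actual costing code):

    #include <math.h>
    #include <stdio.h>

    /* proposed index-descent charge: per-tuple comparisons plus a
     * per-page charge for H descended pages and the fixed overhead */
    static double
    descent_cost(double ntuples, int height)
    {
        const double cpu_operator_cost = 0.0025;    /* default GUC value */

        return cpu_operator_cost * log2(ntuples)
             + cpu_operator_cost * 50.0 * (height + 2);
    }

    int
    main(void)
    {
        /* e.g. a million-tuple b-tree with two upper levels */
        printf("%.4f\n", descent_cost(1e6, 2));     /* about 0.55 */
        return 0;
    }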

This would lead to the behavior depicted in the attached plot, wherein
I've modified the comparison lines (historical, 9.2, and HEAD behaviors)
to include the existing 100 * cpu_operator_cost startup cost charge in
addition to the fudge factor we've been discussing so far.  The new
proposed curve is a bit above the historical curve for indexes with
250-5000 tuples, but the value is still quite small there, so I'm not
too worried about that.  The people who've been complaining about 9.2's
behavior have indexes much larger than that.

Thoughts?

regards, tom lane

set terminal png small color
set output 'new_costs.png'
set xlabel "Index tuples"
set ylabel "Added cost"
set logscale x
set logscale y
h(x) = (x <= 256.0) ? 0 : (x <= 256.0*256) ? 1 : (x <= 256.0*256*256) ? 2 : (x <= 256.0*256*256*256) ? 3 : (x <= 256.0*256*256*256*256) ? 4 : 5
head(x) = 4*log(1 + x/1) + 0.25
historical(x) = 4 * x/10 + 0.25
ninepoint2(x) = 4 * x/1 + 0.25
plot [10:1e9][0.1:10] 0.0025*log(x)/log(2) + 0.125*(h(x)+2), head(x), historical(x), ninepoint2(x)



Re: [HACKERS] ALTER .. OWNER TO error mislabels schema as other object type

2013-01-07 Thread Robert Haas
On Wed, Jan 2, 2013 at 10:35 AM, Kohei KaiGai  wrote:
> The attached patch fixes this trouble.

Thanks.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Tom Lane
Takeshi Yamamuro  writes:
> The attached is a patch to improve compression speeds with loss of
> compression ratios in backend/utils/adt/pg_lzcompress.c.

Why would that be a good tradeoff to make?  Larger stored values require
more I/O, which is likely to swamp any CPU savings in the compression
step.  Not to mention that a value once written may be read many times,
so the extra I/O cost could be multiplied many times over later on.

Another thing to keep in mind is that the compression area in general
is a minefield of patents.  We're fairly confident that pg_lzcompress
as-is doesn't fall foul of any, but any significant change there would
probably require more research.

regards, tom lane




Re: [HACKERS] json api WIP patch

2013-01-07 Thread Merlin Moncure
On Fri, Jan 4, 2013 at 3:03 PM, Pavel Stehule  wrote:
> I understand - but hstore isn't in core - so it should not set a precedent
>
> regexp_split_to_table
>
> I am not a native speaker, and it sounds a little bit strange - but maybe
> that's because I am not a native speaker :)

it's common usage: http://api.jquery.com/jQuery.each/

the patch looks fabulous.  There are still a few trivial whitespace
issues, and I noticed a leaked hstore comment @ 2440:
+   /*
+* if the input hstore is empty, we can only skip the rest if we were
+* passed in a non-null record, since otherwise there may be issues with
+* domain nulls.
+*/

merlin




Re: Review of "pg_basebackup and pg_receivexlog to use non-blocking socket communication", was: Re: [HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2013-01-07 Thread Boszormenyi Zoltan

On 2013-01-04 13:43, Hari Babu wrote:

On January 02, 2013 12:41 PM Hari Babu wrote:

On January 01, 2013 10:19 PM Boszormenyi Zoltan wrote:

I am reviewing your patch.
• Is the patch in context diff format?
Yes.

Thanks for reviewing the patch.


• Does it apply cleanly to the current git master?
Not quite cleanly but it doesn't produce rejects or fuzz, only offset

warnings:

Will rebase the patch to head.


• Does it include reasonable tests, necessary doc patches, etc?
The test cases are not applicable. There is no test framework for
testing network outages in "make check".

There are no documentation patches for the new --recvtimeout=INTERVAL
and --conntimeout=INTERVAL options for either pg_basebackup or
pg_receivexlog.

I will add the documentation for the same.


Per the previous comment, no. But those are for the backend
to notice network breakdowns and as such, they need a
separate patch.

I also think it is better to handle it as a separate patch for walsender.


• Are the comments sufficient and accurate?
This chunk below removes a comment which seems obvious enough
so it's not needed:
***
*** 518,524  ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos,

uint32 timeline,

 goto error;
 }
   
!   /* Check the message type. */

 if (copybuf[0] == 'k')
 {
 int pos;
--- 559,568 
 goto error;
 }
   
!   /* Set the last reply timestamp */

!   last_recv_timestamp = localGetCurrentTimestamp();
!   ping_sent = false;
!
 if (copybuf[0] == 'k')
 {
 int pos;
***

Other comments are sufficient and accurate.

I will fix and update the patch.

The attached V2 patch in the mail handles all the review comments identified
above.

Regards,
Hari babu.


Since my other patch against pg_basebackup is now committed,
this patch doesn't apply cleanly; patch rejects two hunks.
The fixed-up patch is attached.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -dcrpN postgresql.orig/doc/src/sgml/ref/pg_basebackup.sgml postgresql/doc/src/sgml/ref/pg_basebackup.sgml
*** postgresql.orig/doc/src/sgml/ref/pg_basebackup.sgml	2013-01-05 17:34:30.742135371 +0100
--- postgresql/doc/src/sgml/ref/pg_basebackup.sgml	2013-01-07 15:11:40.787007890 +0100
*** PostgreSQL documentation
*** 400,405 
--- 400,425 
   
  
   
+   -r interval
+   --recvtimeout=interval
+   
+
+ time that the receiver waits for communication from the server (in seconds).
+
+   
+   
+ 
+  
+   -t interval
+   --conntimeout=interval
+   
+
+ time that the client waits for a connection to be established with the server (in seconds).
+
+   
+
+ 
+  
-U username
--username=username

diff -dcrpN postgresql.orig/doc/src/sgml/ref/pg_receivexlog.sgml postgresql/doc/src/sgml/ref/pg_receivexlog.sgml
*** postgresql.orig/doc/src/sgml/ref/pg_receivexlog.sgml	2012-11-08 13:13:04.152630639 +0100
--- postgresql/doc/src/sgml/ref/pg_receivexlog.sgml	2013-01-07 15:11:40.788007898 +0100
*** PostgreSQL documentation
*** 164,169 
--- 164,189 
   
  
   
+   -r interval
+   --recvtimeout=interval
+   
+
+ time that the receiver waits for communication from the server (in seconds).
+
+   
+   
+ 
+  
+   -t interval
+   --conntimeout=interval
+   
+
+ time that the client waits for a connection to be established with the server (in seconds).
+
+   
+
+  
+  
-U username
--username=username

diff -dcrpN postgresql.orig/src/bin/pg_basebackup/pg_basebackup.c postgresql/src/bin/pg_basebackup/pg_basebackup.c
*** postgresql.orig/src/bin/pg_basebackup/pg_basebackup.c	2013-01-05 17:34:30.778135625 +0100
--- postgresql/src/bin/pg_basebackup/pg_basebackup.c	2013-01-07 15:16:24.610037886 +0100
*** bool		streamwal = false;
*** 45,50 
--- 45,54 
  bool		fastcheckpoint = false;
  bool		writerecoveryconf = false;
  int			standby_message_timeout = 10 * 1000;		/* 10 sec = default */
+ int		standby_recv_timeout = 60*1000;		/* 60 sec = default */
+ char		*standby_connect_timeout = NULL;		
+ 
+ #define NAPTIME_PER_CYCLE 100	/* max sleep time between cycles (100ms) */
  
  /* Progress counters */
  static uint64 totalsize;
*** usage(void)
*** 130,135 
--- 134,143 
  	printf(_("  -p, --port=PORTdatabase server port number\n"));
  	printf(_("  -s, --status-interval=INTERVAL\n"
  			 "   

Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Andres Freund

Hi,

It seems worth rereading the thread around
http://archives.postgresql.org/message-id/CAAZKuFb59sABSa7gCG0vnVnGb-mJCUBBbrKiyPraNXHnis7KMw%40mail.gmail.com
for people wanting to work on this.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread k...@rice.edu
On Mon, Jan 07, 2013 at 01:36:33PM +, Greg Stark wrote:
> On Mon, Jan 7, 2013 at 10:21 AM, John R Pierce  wrote:
> > On 1/7/2013 2:05 AM, Andres Freund wrote:
> >>
> >> I think there should be enough bits available in the toast pointer to
> >> indicate the type of compression. I seem to remember somebody even
> >> posting a patch to that effect?
> >> I agree that it's probably too late in the 9.3 cycle to start with this.
> >
> >
> > so an upgraded database would have old toasted values in the old compression
> > format, and new toasted values in the new format in an existing table?
> > that's kind of ugly.
> 
> I haven't looked at the patch. It's not obvious to me from the
> description that the output isn't backwards compatible. The way the LZ
> toast compression works the output is self-describing. There are many
> different outputs that would decompress to the same thing and the
> compressing code can choose how hard to look for earlier matches and
> when to just copy bytes wholesale but the decompression will work
> regardless.
> 

I think this comment refers to the lz4 option. I do agree that the patch
that was posted to improve the current compression speed should be
implementable in a way that still allows existing values to be decompressed.

Regards,
Ken




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread k...@rice.edu
On Mon, Jan 07, 2013 at 09:10:31AM +, Simon Riggs wrote:
> On 7 January 2013 07:29, Takeshi Yamamuro
>  wrote:
> 
> > Anyway, the compression speed in lz4 is very fast, so in my
> > opinion, there is a room to improve the current implementation
> > in pg_lzcompress.
> 
> So why don't we use LZ4?
> 
+1

Regards,
Ken




Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-07 Thread Andres Freund
On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:
> On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:
> > On 7 January 2013 12:39, Amit Kapila  wrote:
> >
> > > So We can modify to change this in function LogStandbySnapshot as
> > below:
> > > running = GetRunningTransactionData();
> > > if (running->xcnt > 0)
> > > LogCurrentRunningXacts(running);
> > >
> > > So this check will make sure that if there is no operation happening
> > i.e. no
> > > new running transaction, then no need to log running transaction
> > snapshot
> > > and hence further checkpoint operations will be skipped.
> > >
> > > Let me know if I am missing something?
> >
> > It's not the same test. The fact that nothing is running at that
> > moment is not the same thing as saying nothing at all has run since
> > last checkpoint.
>
> But isn't the functionality of LogStandbySnapshot() is to log "all running
> xids" and "all current
> AccessExclusiveLocks". For RunningTransactionLocks, WAL is avoided in
> similar way.

The information that no transactions are currently running allows you to
build a recovery snapshot; without that information the standby won't
start answering queries.
have built a snapshot, but the primary cannot know that.
Having to issue a checkpoint while ensuring transactions are running
just to get a standby up doesn't seem like a good idea to me :)

Greetings,

Andres Freund




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Simon Riggs
On 7 January 2013 13:36, Greg Stark  wrote:
> On Mon, Jan 7, 2013 at 10:21 AM, John R Pierce  wrote:
>> On 1/7/2013 2:05 AM, Andres Freund wrote:
>>>
>>> I think there should be enough bits available in the toast pointer to
>>> indicate the type of compression. I seem to remember somebody even
>>> posting a patch to that effect?
>>> I agree that it's probably too late in the 9.3 cycle to start with this.
>>
>>
>> so an upgraded database would have old toasted values in the old compression
>> format, and new toasted values in the new format in an existing table?
>> that's kind of ugly.
>
> I haven't looked at the patch. It's not obvious to me from the
> description that the output isn't backwards compatible. The way the LZ
> toast compression works the output is self-describing. There are many
> different outputs that would decompress to the same thing and the
> compressing code can choose how hard to look for earlier matches and
> when to just copy bytes wholesale but the decompression will work
> regardless.

Good point, and a great reason to use this patch rather than LZ4 for 9.3.

We could even have tuning parameters for toast compression, as long as
we keep the on-disk format identical.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-07 Thread Simon Riggs
On 7 January 2013 13:33, Amit Kapila  wrote:

>> If we skip the WAL record in the way you suggest, we'd be unable to
>> start quickly in some cases.
>
> If there are any operations happened which have generated WAL, then on next
> checkpoint interval the checkpoint operation should happen.
> Which cases will it not able to start quickly?

The case where we do lots of work but momentarily we weren't doing
anything when we took the snapshot.

The absence of write transactions at one specific moment gives no
indication of behaviour at other points across the whole checkpoint
period.

If you make the correct test, I'd be more inclined to accept the premise.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: Proposal: Store "timestamptz" of database creation on "pg_database"

2013-01-07 Thread Stephen Frost
* Fabrízio de Royes Mello (fabriziome...@gmail.com) wrote:
> Understood... a "COMMENT" is a database object, then if we add a creation
> time column to pg_description/shdescription tables how we track his
> creation time?

When it's NULL it "doesn't exist"; in this case, when it transitions
from NULL, it becomes created.  A transition from non-NULL to non-NULL
would be an alter, and a transition from non-NULL to NULL would be a
drop/remove.

Thanks,

Stephen




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Greg Stark
On Mon, Jan 7, 2013 at 10:21 AM, John R Pierce  wrote:
> On 1/7/2013 2:05 AM, Andres Freund wrote:
>>
>> I think there should be enough bits available in the toast pointer to
>> indicate the type of compression. I seem to remember somebody even
>> posting a patch to that effect?
>> I agree that it's probably too late in the 9.3 cycle to start with this.
>
>
> so an upgraded database would have old toasted values in the old compression
> format, and new toasted values in the new format in an existing table?
> that's kind of ugly.

I haven't looked at the patch. It's not obvious to me from the
description that the output isn't backwards compatible. The way the LZ
toast compression works the output is self-describing. There are many
different outputs that would decompress to the same thing and the
compressing code can choose how hard to look for earlier matches and
when to just copy bytes wholesale but the decompression will work
regardless.
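
To make that concrete, here is a toy round-trip in C (this is not pglz; it
is a minimal self-describing literal/match format built on the same
principle): one encoding of "ababab" uses only literals, another uses a
back-reference, and a single decoder reproduces identical output from both.

    #include <stdio.h>
    #include <string.h>

    /* control byte 0: next byte is a literal; control byte 1: the next two
     * bytes are (offset, length), copied from already-decoded output */
    static size_t
    decode(const unsigned char *in, size_t inlen, unsigned char *out)
    {
        size_t o = 0;
        for (size_t i = 0; i < inlen;)
        {
            if (in[i++] == 0)
                out[o++] = in[i++];                    /* literal */
            else
            {
                unsigned off = in[i++], len = in[i++];
                for (unsigned k = 0; k < len; k++)     /* overlap is fine */
                    out[o + k] = out[o - off + k];
                o += len;
            }
        }
        return o;
    }

    int
    main(void)
    {
        /* "ababab" as six literals ... */
        const unsigned char lits[] = {0,'a',0,'b',0,'a',0,'b',0,'a',0,'b'};
        /* ... or as "ab" plus a 4-byte copy from offset 2 */
        const unsigned char match[] = {0,'a',0,'b',1,2,4};
        unsigned char b1[8], b2[8];
        size_t n1 = decode(lits, sizeof(lits), b1);
        size_t n2 = decode(match, sizeof(match), b2);

        printf("%s\n", (n1 == n2 && memcmp(b1, b2, n1) == 0)
               ? "identical output" : "differ");
        return 0;
    }

The same decoder handles both streams, which is why a faster, less
thorough compressor can stay format-compatible.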


-- 
greg




Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-07 Thread Amit Kapila
On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:
> On 7 January 2013 12:39, Amit Kapila  wrote:
> 
> > So We can modify to change this in function LogStandbySnapshot as
> below:
> > running = GetRunningTransactionData();
> > if (running->xcnt > 0)
> > LogCurrentRunningXacts(running);
> >
> > So this check will make sure that if there is no operation happening
> i.e. no
> > new running transaction, then no need to log running transaction
> snapshot
> > and hence further checkpoint operations will be skipped.
> >
> > Let me know if I am missing something?
> 
> It's not the same test. The fact that nothing is running at that
> moment is not the same thing as saying nothing at all has run since
> last checkpoint.

But isn't the functionality of LogStandbySnapshot() to log "all running
xids" and "all current AccessExclusiveLocks"?  For
RunningTransactionLocks, WAL is avoided in a similar way.

> If we skip the WAL record in the way you suggest, we'd be unable to
> start quickly in some cases.

If any operations happened that generated WAL, then at the next
checkpoint interval the checkpoint operation will happen.
In which cases will it not be able to start quickly?

With Regards,
Amit Kapila





[HACKERS] A very small typo in the comment

2013-01-07 Thread Meng Qingzhong
Hi,
  I am a student in China. I have read some of the source code of
PostgreSQL, though only a little. I think maybe there is a typo in this
file: src\backend\catalog\index.c, at line 649 in v9.2.0 (line 653 in
v9.2.2). The sentence is " *indexInfo: same info executor uses to insert
into the index"; I think it should not be 'same', but 'some'.
  I am not sure if my thoughts are right.
  Sorry for my poor English.


  Good Luck From Beijing!


  Meng Qingzhong

Re: [HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-07 Thread Simon Riggs
On 7 January 2013 12:39, Amit Kapila  wrote:

> So We can modify to change this in function LogStandbySnapshot as below:
> running = GetRunningTransactionData();
> if (running->xcnt > 0)
> LogCurrentRunningXacts(running);
>
> So this check will make sure that if there is no operation happening i.e. no
> new running transaction, then no need to log running transaction snapshot
> and hence further checkpoint operations will be skipped.
>
> Let me know if I am missing something?

It's not the same test. The fact that nothing is running at that
moment is not the same thing as saying nothing at all has run since
last checkpoint.

If we skip the WAL record in the way you suggest, we'd be unable to
start quickly in some cases.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Extra XLOG in Checkpoint for StandbySnapshot

2013-01-07 Thread Amit Kapila
The observation is that whenever a checkpoint happens and the configured
wal_level is hot_standby, one standby snapshot XLOG record gets written
with the running-transactions information.


So if the first checkpoint happens at the specified interval, it will
create a new XLOG record in LogStandbySnapshot, due to which the
checkpoint operation doesn't get skipped on the next interval. This is
okay if there are any running transactions, but it seems the XLOG record
is written even if there is no running xact.

As per the analysis, below is the code snippet doing this:
running = GetRunningTransactionData(); 
LogCurrentRunningXacts(running); 



So irrespective of the value of running, the snapshot gets logged.

So We can modify to change this in function LogStandbySnapshot as below: 
running = GetRunningTransactionData(); 
if (running->xcnt > 0) 
LogCurrentRunningXacts(running); 



So this check will make sure that if there is no operation happening, i.e. no
new running transactions, there is no need to log the running-transactions
snapshot, and hence further checkpoint operations will be skipped.



Let me know if I am missing something?

 

With Regards,

Amit Kapila.



[HACKERS] psql \l to accept patterns

2013-01-07 Thread Peter Eisentraut
Here is a patch for psql's \l command to accept patterns, like \d
commands do.  While at it, I also added an "S" option to show system
objects and removed system objects from the default display.  This might
be a bit controversial, but it's how it was decided some time ago that
the \d commands should act.

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index c41593c..a9770c0 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1689,16 +1689,19 @@ Meta-Commands
 
 
   
-\l (or \list)
-\l+ (or \list+)
+\l[S+] (or \list[S+])
 
 
-List the names, owners, character set encodings, and access privileges
-of all the databases in the server.
+List the databases in the server and show their names, owners, character
+set encodings, and access privileges.
+If pattern is specified,
+only databases whose names match the pattern are listed.
 If + is appended to the command name, database
-sizes, default tablespaces, and descriptions are also displayed.
-(Size information is only available for databases that the current
-user can connect to.)
+sizes, default tablespaces, and descriptions are also displayed.  (Size
+information is only available for databases that the current user can
+connect to.)  By default, only user-created databases are shown; supply
+a pattern or the S modifier to include system
+objects.
 
 
   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 59f8b03..aea0903 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -804,10 +804,27 @@ static bool do_edit(const char *filename_arg, PQExpBuffer query_buf,
 	}
 
 	/* \l is list databases */
-	else if (strcmp(cmd, "l") == 0 || strcmp(cmd, "list") == 0)
-		success = listAllDbs(false);
-	else if (strcmp(cmd, "l+") == 0 || strcmp(cmd, "list+") == 0)
-		success = listAllDbs(true);
+	else if (strcmp(cmd, "l") == 0 || strcmp(cmd, "list") == 0 ||
+			 strcmp(cmd, "l+") == 0 || strcmp(cmd, "list+") == 0 ||
+			 strcmp(cmd, "lS") == 0 || strcmp(cmd, "listS") == 0 ||
+			 strcmp(cmd, "l+S") == 0 || strcmp(cmd, "list+S") == 0 ||
+			 strcmp(cmd, "lS+") == 0 || strcmp(cmd, "listS+") == 0)
+	{
+		char	   *pattern;
+		bool		show_verbose,
+	show_system;
+
+		pattern = psql_scan_slash_option(scan_state,
+		 OT_NORMAL, NULL, true);
+
+		show_verbose = strchr(cmd, '+') ? true : false;
+		show_system = strchr(cmd, 'S') ? true : false;
+
+		success = listAllDbs(pattern, show_verbose, show_system);
+
+		if (pattern)
+			free(pattern);
+	}
 
 	/*
 	 * large object things
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 87174cc..c6ac40c 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -637,7 +637,7 @@ static bool describeOneTSConfig(const char *oid, const char *nspname,
  * for \l, \list, and -l switch
  */
 bool
-listAllDbs(bool verbose)
+listAllDbs(const char *pattern, bool verbose, bool showSystem)
 {
 	PGresult   *res;
 	PQExpBufferData buf;
@@ -680,6 +680,14 @@ static bool describeOneTSConfig(const char *oid, const char *nspname,
 	if (verbose && pset.sversion >= 8)
 		appendPQExpBuffer(&buf,
 		   "  JOIN pg_catalog.pg_tablespace t on d.dattablespace = t.oid\n");
+
+	if (pattern)
+		processSQLNamePattern(pset.db, &buf, pattern, false, false,
+			  NULL, "d.datname", NULL, NULL);
+
+	if (!showSystem && !pattern)
+		appendPQExpBuffer(&buf, "WHERE d.datname NOT IN ('postgres', 'template0', 'template1')\n");
+
 	appendPQExpBuffer(&buf, "ORDER BY 1;");
 	res = PSQLexec(buf.data, false);
 	termPQExpBuffer(&buf);
diff --git a/src/bin/psql/describe.h b/src/bin/psql/describe.h
index 9e71a88..721c363 100644
--- a/src/bin/psql/describe.h
+++ b/src/bin/psql/describe.h
@@ -55,7 +55,7 @@ extern bool listTSDictionaries(const char *pattern, bool verbose);
 extern bool listTSTemplates(const char *pattern, bool verbose);
 
 /* \l */
-extern bool listAllDbs(bool verbose);
+extern bool listAllDbs(const char *pattern, bool verbose, bool showSystem);
 
 /* \dt, \di, \ds, \dS, etc. */
 extern bool listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSystem);
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 1070bc5..ab32b32 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -230,7 +230,7 @@
 	fprintf(output, _("  \\dE[S+] [PATTERN]  list foreign tables\n"));
 	fprintf(output, _("  \\dx[+]  [PATTERN]  list extensions\n"));
 	fprintf(output, _("  \\dy [PATTERN]  list event triggers\n"));
-	fprintf(output, _("  \\l[+]  list all databases\n"));
+	fprintf(output, _("  \\l[S+]  [PATTERN]  list databases\n"));
 	fprintf(output, _("  \\sf[+] FUNCNAMEshow a function's definition\n"));
 	fprintf(output, _("  \\z  [PATTERN]  same as \\dp\n"));
 	fprintf(output, "\n");
diff --gi

Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Andres Freund
On 2013-01-07 02:21:26 -0800, John R Pierce wrote:
> On 1/7/2013 2:05 AM, Andres Freund wrote:
> >I think there should be enough bits available in the toast pointer to
> >indicate the type of compression. I seem to remember somebody even
> >posting a patch to that effect?
> >I agree that it's probably too late in the 9.3 cycle to start with this.
>
> so an upgraded database would have old toasted values in the old compression
> format, and new toasted values in the new format in an existing table?
> that's kind of ugly.

Well, ISTM that's just life. What would you prefer? Converting all toast values
during pg_upgrade kinda goes against the aim of quick upgrades.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread John R Pierce

On 1/7/2013 2:05 AM, Andres Freund wrote:

I think there should be enough bits available in the toast pointer to
indicate the type of compression. I seem to remember somebody even
posting a patch to that effect?
I agree that it's probably too late in the 9.3 cycle to start with this.


so an upgraded database would have old toasted values in the old 
compression format, and new toasted values in the new format in an 
existing table?  that's kind of ugly.







Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Andres Freund
On 2013-01-07 09:57:58 +, Simon Riggs wrote:
> On 7 January 2013 09:19, John R Pierce  wrote:
> > On 1/7/2013 1:10 AM, Simon Riggs wrote:
> >>
> >> On 7 January 2013 07:29, Takeshi Yamamuro
> >>   wrote:
> >>
> >>> >Anyway, the compression speed in lz4 is very fast, so in my
> >>> >opinion, there is a room to improve the current implementation
> >>> >in pg_lzcompress.
> >>
> >> So why don't we use LZ4?
> >
> > what will changing compression formats do for compatability?
> >
> > this is for the compressed data in pg_toast storage or something? will this
> > break pg_upgrade style operations?
>
> Anything that changes on-disk format would need to consider how to do
> pg_upgrade. It's the major blocker in that area.
>
> For this, it would be possible to have a new format and old format
> coexist, but that will take more time to think through than we have
> for this release, so this is a nice idea for further investigation in
> 9.4. Thanks for raising that point.

I think there should be enough bits available in the toast pointer to
indicate the type of compression. I seem to remember somebody even
posting a patch to that effect?
I agree that it's probably too late in the 9.3 cycle to start with this.
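
As a hedged sketch of what that could look like (this is not the actual
PostgreSQL toast format; the layout and names here are hypothetical), two
spare header bits could tag the method so old pglz values stay readable:

    #include <assert.h>
    #include <stdint.h>

    #define TOAST_METHOD_PGLZ 0x0u   /* existing values keep tag 0 */
    #define TOAST_METHOD_LZ4  0x1u   /* hypothetical second method */

    /* pack a 2-bit method tag above a 30-bit raw size */
    static uint32_t
    toast_hdr(uint32_t rawsize, uint32_t method)
    {
        return (method << 30) | (rawsize & 0x3FFFFFFFu);
    }

    static uint32_t toast_method(uint32_t hdr)  { return hdr >> 30; }
    static uint32_t toast_rawsize(uint32_t hdr) { return hdr & 0x3FFFFFFFu; }

    int
    main(void)
    {
        uint32_t hdr = toast_hdr(8192, TOAST_METHOD_LZ4);
        assert(toast_method(hdr) == TOAST_METHOD_LZ4);
        assert(toast_rawsize(hdr) == 8192);
        return 0;
    }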

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Simon Riggs
On 7 January 2013 09:19, John R Pierce  wrote:
> On 1/7/2013 1:10 AM, Simon Riggs wrote:
>>
>> On 7 January 2013 07:29, Takeshi Yamamuro
>>   wrote:
>>
>>> >Anyway, the compression speed in lz4 is very fast, so in my
>>> >opinion, there is a room to improve the current implementation
>>> >in pg_lzcompress.
>>
>> So why don't we use LZ4?
>
> what will changing compression formats do for compatability?
>
> this is for the compressed data in pg_toast storage or something? will this
> break pg_upgrade style operations?

Anything that changes on-disk format would need to consider how to do
pg_upgrade. It's the major blocker in that area.

For this, it would be possible to have a new format and old format
coexist, but that will take more time to think through than we have
for this release, so this is a nice idea for further investigation in
9.4. Thanks for raising that point.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread John R Pierce

On 1/7/2013 1:10 AM, Simon Riggs wrote:

On 7 January 2013 07:29, Takeshi Yamamuro
  wrote:


>Anyway, the compression speed in lz4 is very fast, so in my
>opinion, there is a room to improve the current implementation
>in pg_lzcompress.

So why don't we use LZ4?



what will changing compression formats do for compatability?

this is for the compressed data in pg_toast storage or something? will 
this break pg_upgrade style operations?







Re: [HACKERS] Improve compression speeds in pg_lzcompress.c

2013-01-07 Thread Simon Riggs
On 7 January 2013 07:29, Takeshi Yamamuro
 wrote:

> Anyway, the compression speed in lz4 is very fast, so in my
> opinion, there is a room to improve the current implementation
> in pg_lzcompress.

So why don't we use LZ4?
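
For reference, a minimal round-trip through liblz4's public C API looks
like this (a sketch, not PostgreSQL integration; assumes liblz4 is
installed and linked with -llz4):

    #include <lz4.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        const char src[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";  /* compressible */
        char comp[LZ4_COMPRESSBOUND(sizeof(src))];
        char back[sizeof(src)];

        int clen = LZ4_compress_default(src, comp, (int) sizeof(src),
                                        (int) sizeof(comp));
        int dlen = LZ4_decompress_safe(comp, back, clen, (int) sizeof(back));

        printf("%zu -> %d bytes, round-trip %s\n", sizeof(src), clen,
               (dlen == (int) sizeof(src) && memcmp(src, back, dlen) == 0)
               ? "ok" : "FAILED");
        return 0;
    }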

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-07 Thread Shigeru Hanada
Hi Tomas,

I reviewed the v6 patch and found several places that need fixes.
Sorry for the extra nitpicking.

* I found another extra space after asterisk.

+   RelFileNode * nodes;

* The curly bracket that starts a code block should be at the beginning
of the next line.

+   /* extend the array if needed (double the size) */
+   if (maxrels <= nrels) {

* There are several trailing tabs in src/backend/catalog/storage.c.

* naming of DROP_RELATIONS_BSEARCH_LIMIT (or off-by-one bug?)
IIUC bsearch is used when the # of relations to be dropped is *more* than
the value of DROP_RELATIONS_BSEARCH_LIMIT (10).  IMO this behavior is
not what the macro name implies; I thought that bsearch would be used for
10 relations.  Besides, the word "LIMIT" is used as an *upper limit* in
other macros.  How about MIN_DROP_REL[ATION]S_BSEARCH or
DROP_REL[ATION]S_LINEAR_SEARCH_LIMIT?  (See the sketch after this list.)
# using RELS instead of RELATIONS would be better to shorten the name

* +1 for Alvaro's suggestion about avoiding palloc traffic by starting
with 8 elements or so.
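
For clarity, here is the behavior being named (a sketch only; the macro
name and value are from the patch, the surrounding code is illustrative):

    #include <stdbool.h>
    #include <stdlib.h>

    #define DROP_RELATIONS_BSEARCH_LIMIT 10   /* value in the v6 patch */

    static int
    cmp_oid(const void *a, const void *b)
    {
        unsigned au = *(const unsigned *) a, bu = *(const unsigned *) b;
        return (au > bu) - (au < bu);
    }

    /* true if 'key' is among the nrels sorted entries in 'rels' */
    static bool
    relation_pending_drop(unsigned key, const unsigned *rels, size_t nrels)
    {
        if (nrels > DROP_RELATIONS_BSEARCH_LIMIT)   /* strictly more than 10 */
            return bsearch(&key, rels, nrels, sizeof(unsigned),
                           cmp_oid) != NULL;

        for (size_t i = 0; i < nrels; i++)          /* small arrays: linear */
            if (rels[i] == key)
                return true;
        return false;
    }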

Regards,
-- 
Shigeru HANADA

