[HACKERS] Question about STRICT

2009-06-03 Thread Gevik Babakhani
Could someone please clarify the difference between RETURNS NULL ON 
NULL INPUT and STRICT when creating a function? Do both options exist 
for historical reasons/SQL standard compliance?



Shouldn't we raise an error when calling a function with NULL argument 
values if the function is created as STRICT?
(This would of course have an impact on checking for NULLs in argument 
defaults if the above is implemented.)


--

Regards,
Gevik



Re: [HACKERS] A humble request

2009-06-03 Thread Gevik Babakhani

Thank you for making this available :)

Bruce Momjian wrote:

Gevik Babakhani wrote:

please, please, please


The video is here:

http://hosting3.epresence.tv/fosslc/1/watch/121.aspx

---



David Fetter wrote:

On Fri, May 22, 2009 at 10:39:13AM +0200, Gevik Babakhani wrote:
  

Hi,

For those who couldn't attend "How to Get Your PostgreSQL
Patch Accepted", could someone please post a summary? (Tom? Bruce?)
Thank you.


Video will get posted pretty soon after, and expect tweets, blog
posts, etc. :)

Cheers,
David.
  


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers





--
Regards,
Gevik



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-03 Thread Magnus Hagander
Josh Berkus wrote:
 Tom, all,
 
 * Path separator consistency on Windows
 
 -- This discussion does not appear to have concluded.  Magnus, Dave?

We've lived with this for a long time, it's nothing new. And it's not
critical. Bump to 8.5.


-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Markus Wanner

Hi,

Quoting Tom Lane t...@sss.pgh.pa.us:

I can't escape the feeling that we're missing something basic here.


Perhaps the power (and importance) of merging is still a bit  
underestimated, but otherwise I don't think there's much to miss.



But rebuilding the Linux kernel is hardly a zero-cost operation,
so how have Linus and co failed to notice this problem?  There
must be some trick they're using that I haven't heard about, or
they'd not be nearly so pleased with git.


Keep in mind that they don't have half as many back branches to  
maintain (taking only 2.4 and 2.6 into account). The minor version  
stable branches are not maintained for such a long time (for example,  
the last fix for 2.6.19 happened 2 years ago, from what I can tell).  
Overall, I think the differences are smaller than between the stable  
branches of Postgres' repository.


Regards

Markus Wanner




Re: [HACKERS] Question about STRICT

2009-06-03 Thread Greg Stark
On Wed, Jun 3, 2009 at 9:45 AM, Gevik Babakhani pg...@xs4all.nl wrote:
 Could someone please clarify the difference between RETURNS NULL ON NULL
 INPUT or STRICT when creating a function.

They're synonyms.

 Do both options exist because
 of historical reasons/SQL standard compliance?

One or the other, not sure which.

 Shouldn't we raise an error when calling a function with NULL arguments
 values if the function is created as STRICT?

No, what they do is return NULL automatically. The function doesn't
have to check for NULL arguments itself.
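A minimal illustration (the function names below are made up; the syntax is standard CREATE FUNCTION):

```sql
-- The two spellings are equivalent: with either one, the function body
-- is never entered when any argument is NULL; the call just evaluates
-- to NULL, and no error is raised.
CREATE FUNCTION add_one(int) RETURNS int AS 'SELECT $1 + 1'
    LANGUAGE sql STRICT;

CREATE FUNCTION add_two(int) RETURNS int AS 'SELECT $1 + 2'
    LANGUAGE sql RETURNS NULL ON NULL INPUT;

SELECT add_one(NULL);  -- NULL, no error
SELECT add_two(NULL);  -- NULL, no error
```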


-- 
greg



Re: [HACKERS] pg_standby -l might destory the archived file

2009-06-03 Thread Fujii Masao
Hi,

On Wed, Jun 3, 2009 at 3:54 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Tom Lane wrote:
 That's a good point; don't we recover files under names like
 RECOVERYXLOG, not under names that could possibly conflict with regular
 WAL files?

 Yes. But we rename RECOVERYXLOG to 00010057 or similar
 at the end of recovery, in exitArchiveRecovery().

 Thinking about this some more, I think we should've changed
 exitArchiveRecovery() rather than RemoveOldXlogFiles(): it would be more
 robust if exitArchiveRecovery() always copied the last WAL file rather
 than just renamed it. It doesn't seem safe to rely on the file the
 symlink points to to be valid after recovery is finished, and we might
 write to it before it's recycled, so the current fix isn't complete.

 Hmm.  I think really the reason it's coded that way is that we assumed
 the recovery command would be physically copying the file from someplace
 else.  pg_standby is violating the backend's expectations by using a
 symlink.  And I really doubt that the technique is saving anything, since
 the data has to be read in from the archive location anyway.

 I'm leaning back to the position that pg_standby's -l option is simply a
 bad idea and should be removed.

I'm OK with this. Should we also document this assumption for
restore_command? Otherwise, some users might wrongly use the
'ln' command to restore archived files.
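For reference, the documented style of restore_command physically copies the segment into place (the archive path below is illustrative):

```
# recovery.conf -- restore_command is expected to *copy* the archived
# segment to the location given by %p; the server may rename the file
# (RECOVERYXLOG -> a regular WAL segment name) and later recycle it.
restore_command = 'cp /mnt/server/archivedir/%f "%p"'

# Symlinking instead is what pg_standby's -l does, and is unsafe here:
# restore_command = 'ln -s /mnt/server/archivedir/%f "%p"'
```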

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] [PATCH v2] Add bit operations util header

2009-06-03 Thread Florian Weimer
* Jeremy Kerr:

 +#if defined(__GNUC__) && \
 +	(defined(__ppc__) || defined(__powerpc__) || \
 +	 defined(__ppc64__) || defined (__powerpc64__))

If you require GCC anyway, you can use __builtin_clz instead.
(It's been available since GCC 4.1 at least.)

-- 
Florian Weimer            fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99



Re: [HACKERS] 8.4b2 tsearch2 strange error

2009-06-03 Thread Tatsuo Ishii
  I have encountered strange errors while testing PostgreSQL 8.4 beta2.
 
  ERROR:  tuple offset out of range: 0
  (occasionally ERROR:  tuple offset out of range: 459)
 
 This is evidently coming from tbm_add_tuples, indicating that it's being
 passed bogus TID values from the GIN index.  We'll probably have to get
 Teodor to look at it --- can you provide a self-contained test case?

Sorry for delay. I have put a database dump at:

http://sylpheed.sraoss.jp/tmp/dump.sql.gz

We got the following errors using this database:

ERROR:  unrecognized operator: -50
STATEMENT:  SELECT msg_sid, hdr_from, hdr_to, hdr_subject, msg_date, folder_id, 
msgnum FROM msginfo LEFT JOIN msg_folderinfo USING (msg_sid) WHERE 
plainto_tsquery(E'a') @@ body_index
NOTICE:  text-search query contains only stop words or doesn't contain lexemes, 
ignored

ERROR:  tuple offset out of range: 609
STATEMENT:  SELECT msg_sid, hdr_from, hdr_to, hdr_subject, msg_date, folder_id, 
msgnum FROM msginfo LEFT JOIN msg_folderinfo USING (msg_sid) WHERE 
plainto_tsquery(E'tsearch') @@ body_index

ERROR:  tuple offset out of range: 0
STATEMENT:  SELECT msg_sid, hdr_from, hdr_to, hdr_subject, msg_date, folder_id, 
msgnum FROM msginfo LEFT JOIN msg_folderinfo USING (msg_sid) WHERE 
plainto_tsquery(E'interval') @@ body_index

ERROR:  tuple offset out of range: 924
STATEMENT:  SELECT msg_sid, hdr_from, hdr_to, hdr_subject, msg_date, folder_id, 
msgnum FROM msginfo LEFT JOIN msg_folderinfo USING (msg_sid) WHERE 
plainto_tsquery(E'pitr') @@ body_index
--
Tatsuo Ishii
SRA OSS, Inc. Japan



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Greg Stark
On Wed, Jun 3, 2009 at 12:39 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 David E. Wheeler da...@kineticode.com writes:
 Does that make sense?

 Maybe, but it still seems messy, brute force, and error-prone.

 I can't escape the feeling that we're missing something basic here.
 It's allegedly one of git's great strengths that it allows you to easily
 and quickly switch your attention among multiple development branches.

Well, as long as the development branches are all compatible, in the
sense that you don't need to reconfigure and rebuild everything when
you switch, then yes. Doing git-checkout to switch branches will work
fine for branches created to work on new code or review patches.

I really don't see what you mean by messy or brute force or
error-prone with regards to keeping a separate clone for each major
release. It's basically equivalent to having a CVS checkout for each
major release which you do now.

The main difference is that commit becomes a two-step process:
commit to your repository, then push to the public repository. That's
three steps if you include the add, but I suspect you're going to be
doing git-commit -a most of the time.

There's an advantage that you can commit several changes to your local
repo and push them all to the public repo together. That might be good
if you have, for example, a bug fix which requires an api change
elsewhere. You might want two separate commit messages and the ability
to merge one of them forward or back separately but not want to put
them in the public repo until both are available.
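The two-step workflow can be sketched with throwaway repositories (all paths and messages here are illustrative):

```shell
# Commit locally, then publish: two local commits, one push.
set -e
tmp=$(mktemp -d)
git init --bare -q "$tmp/public.git"

# One local clone, standing in for a per-branch working repository.
git clone -q "$tmp/public.git" "$tmp/work" 2>/dev/null
cd "$tmp/work"
git config user.email you@example.com
git config user.name "You"

# Two separate commits with separate messages...
echo fix >bugfix.c     && git add bugfix.c     && git commit -qm "bug fix"
echo api >api_change.c && git add api_change.c && git commit -qm "api change"

# ...published together in a single step.
git push -q origin HEAD
git --git-dir="$tmp/public.git" log --format=%s --all
```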

-- 
greg



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Markus Wanner

Hi,

Quoting David E. Wheeler da...@kineticode.com:

Monotone?


...one of the sources of inspiration for Linus to write git. He was not  
satisfied with its speed, and he didn't like C++ and SQL. Plus, the main  
contributors weren't around at the time Linus was on the mailing list.  
So he turned away and did his own thing, in C and filesystem-based.  
(Most ranting stripped.)


Regards

Markus Wanner





Re: [HACKERS] Question about STRICT

2009-06-03 Thread Gevik Babakhani

Shouldn't we raise an error when calling a function with NULL arguments
values if the function is created as STRICT?


No, what they do is return NULL automatically. The function doesn't
have to check for NULL arguments itself.


The RETURNS NULL ON NULL INPUT is logical and does the above 
accordingly. But when a function is STRICT you kind of expect to have a 
notification, perhaps an error, if a value for an argument is NULL.


STRICT is sort of puzzling when you want to make sure a function is only 
called if none of the arguments are NULL.


With STRICT, the call simply returns NULL, which causes the 
application code to happily execute further without noticing that 
calling the function did not do anything.


I am thinking about the following situation:

create table table1
(
col1 int,
col2 varchar
);

create or replace function insert_test(int,varchar) returns void as
$$
insert into table1 (col1,col2) values ($1,$2);
$$
language sql strict;

select * from insert_test(null,'a');

select * from table1;

--
Regards,
Gevik



Re: [HACKERS] Question about STRICT

2009-06-03 Thread Greg Stark
On Wed, Jun 3, 2009 at 11:04 AM, Gevik Babakhani pg...@xs4all.nl wrote:
 The RETURNS NULL ON NULL INPUT is logical and does the above accordingly.
 But when a function is STRICT you kind of expect to have a notification,
 perhaps an error if a value for an argument is NULL.

Uhm, you might but I'm not sure why. That's not what STRICT does. It's
a synonym for RETURNS NULL ON NULL INPUT.

-- 
greg



Re: [HACKERS] Question about STRICT

2009-06-03 Thread Gevik Babakhani

Greg Stark wrote:

On Wed, Jun 3, 2009 at 11:04 AM, Gevik Babakhani pg...@xs4all.nl wrote:

The RETURNS NULL ON NULL INPUT is logical and does the above accordingly.
But when a function is STRICT you kind of expect to have a notification,
perhaps an error if a value for an argument is NULL.


Uhm, you might but I'm not sure why. That's not what STRICT does. It's
a synonym for RETURNS NULL ON NULL INPUT.



Perhaps it would be an idea to have something like:
 RAISE ERROR ON NULL INPUT

--
Regards,
Gevik



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Markus Wanner

Hi,

Quoting Marko Kreen mark...@gmail.com:

The example was not an actual case from the Postgres CVS history,
but a hypothetical situation, written without checking whether it
already works with git.


Of course it is a simplified example, but it resembles what could  
happen, e.g. to the file doc/src/sgml/generate_history.pl, which got  
added from a backported patch after forking off REL8_3_STABLE.


If you create separate commits during the conversion, rename that file  
on the master branch and then - for whatever reason - try to merge the  
two branches, you will end up having that file twice. That's what I'm  
warning about. Changes on either or both sides of the merge make the  
situation worse.



Merging between branches with GIT is fine workflow in the future.


Do you consider the above scenario a fine merge?


My point is that we should avoid fake merges, to avoid obfuscating
history.


Understood. It looks like I'm pretty much the only one who cares more  
about merge capability than nice looking history :-(


Attached is my current options file for cvs2git; it includes the  
changes requested by Alvaro and additional names and emails as given  
by Tom (thanks again). A current conversion with cvs2git (and with the  
merges) results in a repository with exactly 0 differences against any  
branch or tag symbol compared to cvs checkout -kk.


Regards

Markus Wanner
# (Be in -*- mode: python; coding: utf-8 -*- mode.)

import re

from cvs2svn_lib import config
from cvs2svn_lib import changeset_database
from cvs2svn_lib.common import CVSTextDecoder
from cvs2svn_lib.log import Log
from cvs2svn_lib.project import Project
from cvs2svn_lib.git_revision_recorder import GitRevisionRecorder
from cvs2svn_lib.git_output_option import GitRevisionMarkWriter
from cvs2svn_lib.git_output_option import GitOutputOption
from cvs2svn_lib.revision_manager import NullRevisionRecorder
from cvs2svn_lib.revision_manager import NullRevisionExcluder
from cvs2svn_lib.fulltext_revision_recorder \
 import SimpleFulltextRevisionRecorderAdapter
from cvs2svn_lib.rcs_revision_manager import RCSRevisionReader
from cvs2svn_lib.cvs_revision_manager import CVSRevisionReader
from cvs2svn_lib.checkout_internal import InternalRevisionRecorder
from cvs2svn_lib.checkout_internal import InternalRevisionExcluder
from cvs2svn_lib.checkout_internal import InternalRevisionReader
from cvs2svn_lib.symbol_strategy import AllBranchRule
from cvs2svn_lib.symbol_strategy import AllTagRule
from cvs2svn_lib.symbol_strategy import BranchIfCommitsRule
from cvs2svn_lib.symbol_strategy import ExcludeRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceBranchRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ForceTagRegexpStrategyRule
from cvs2svn_lib.symbol_strategy import ExcludeTrivialImportBranchRule
from cvs2svn_lib.symbol_strategy import ExcludeVendorBranchRule
from cvs2svn_lib.symbol_strategy import HeuristicStrategyRule
from cvs2svn_lib.symbol_strategy import UnambiguousUsageRule
from cvs2svn_lib.symbol_strategy import HeuristicPreferredParentRule
from cvs2svn_lib.symbol_strategy import SymbolHintsFileRule
from cvs2svn_lib.symbol_transform import ReplaceSubstringsSymbolTransform
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform
from cvs2svn_lib.symbol_transform import IgnoreSymbolTransform
from cvs2svn_lib.symbol_transform import NormalizePathsSymbolTransform
from cvs2svn_lib.property_setters import AutoPropsPropertySetter
from cvs2svn_lib.property_setters import CVSBinaryFileDefaultMimeTypeSetter
from cvs2svn_lib.property_setters import CVSBinaryFileEOLStyleSetter
from cvs2svn_lib.property_setters import CVSRevisionNumberSetter
from cvs2svn_lib.property_setters import DefaultEOLStyleSetter
from cvs2svn_lib.property_setters import EOLStyleFromMimeTypeSetter
from cvs2svn_lib.property_setters import ExecutablePropertySetter
from cvs2svn_lib.property_setters import KeywordsPropertySetter
from cvs2svn_lib.property_setters import MimeMapper
from cvs2svn_lib.property_setters import SVNBinaryFileKeywordsPropertySetter

Log().log_level = Log.NORMAL
ctx.revision_recorder = SimpleFulltextRevisionRecorderAdapter(
CVSRevisionReader(cvs_executable=r'cvs'),
GitRevisionRecorder('cvs2git-tmp/git-blob.dat'),
)

ctx.revision_excluder = NullRevisionExcluder()

ctx.revision_reader = None

ctx.sort_executable = r'sort'

ctx.trunk_only = False

ctx.cvs_author_decoder = CVSTextDecoder(
['ascii', 'latin1'],
)
ctx.cvs_log_decoder = CVSTextDecoder(
['ascii', 'latin1'],
)
ctx.cvs_filename_decoder = CVSTextDecoder(
['ascii', 'latin1'],
)

ctx.initial_project_commit_message = (
'Standard project directories initialized by cvs2git.'
)

ctx.post_commit_message = (
'This commit was generated by cvs2git to track changes on a CVS '
'vendor branch.'
)

ctx.symbol_commit_message = (
'This commit was manufactured by cvs2git to create %(symbol_type)s '
"'%(symbol_name)s'."
)

ctx.decode_apple_single = False


Re: [HACKERS] [PATCH v2] Add bit operations util header

2009-06-03 Thread Jeremy Kerr
Florian,

  +#if defined(__GNUC__) && \
  +	(defined(__ppc__) || defined(__powerpc__) || \
  +	 defined(__ppc64__) || defined (__powerpc64__))

 If you require GCC anyway, you can use __builtin_clz instead.
 (It's been available since GCC 4.1 at least.)

Because now we have to test the compiler *and* the version as well?

But I do agree that using the builtins makes for much better code; I'm 
looking at a future change that does this.

Cheers,


Jeremy




Re: [HACKERS] [PATCH v2] Add bit operations util header

2009-06-03 Thread Florian Weimer
* Jeremy Kerr:

 Florian,

  +#if defined(__GNUC__) && \
  +	(defined(__ppc__) || defined(__powerpc__) || \
  +	 defined(__ppc64__) || defined (__powerpc64__))

 If you require GCC anyway, you can use __builtin_clz instead.
 (It's been available since GCC 4.1 at least.)

 Because now we have to test the compiler *and* the version as well?

This builtin is not architecture-specific, so you'd save the
architecture check.

-- 
Florian Weimer            fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Florian Weimer
* Tom Lane:

 I wondered for a second about symlinking .git from several checkout
 directories to a common master, but AFAICT .git stores both the
 repository and status information about the current checkout, so
 that's not gonna work.

git clone --reference stores just a reference and does not copy the
history.

It's not going to help in the long run because history accumulating on
the HEAD will be duplicated in your release branches.  This is not a
problem if you never merge stuff into them, but I don't know how much
(recent) history browsing you want to do from your release checkouts.
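A quick sketch of what --reference does (repository names are made up): the "light" clone records a pointer to another local repository's object store instead of copying the history.

```shell
# A --reference clone borrows objects via the alternates mechanism.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.email=you@example.com -c user.name=You \
    commit --allow-empty -qm "initial"

git clone -q "$tmp/upstream" "$tmp/full"                        # full copy
git clone -q --reference "$tmp/full" "$tmp/upstream" "$tmp/light"

# The borrowed object store is recorded in the alternates file:
cat "$tmp/light/.git/objects/info/alternates"
```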

 At the same time, I don't really buy the theory that relating commits on
 different branches via merges will work.  In my experience it is very
 seldom the case that a patch applies to each back branch with no manual
 effort whatever, which is what I gather the merge functionality could
 help with.  So maybe there's not much help to be had on this ...

Correct.  Merging doesn't work if you pick individual patches.  This
is a difficult problem, and few VCSes seem to have tackled it.

Working with a single tree and ccache would be another alternative
(ccache still runs the preprocessor and hashes its output, so it
doesn't care about file modification times).

-- 
Florian Weimer            fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Greg Stark
On Wed, Jun 3, 2009 at 12:10 PM, Markus Wanner mar...@bluegap.ch wrote:
 If you create separate commits during the conversion, rename that file on
 the master branch

This is all completely irrelevant to the CVS import. I don't think
we've ever renamed files because CVS can't handle it cleanly.

It does sound to me like we really ought to have merge commits marking
the bug fixes in old releases as merged in the equivalent commits to
later branches based on Tom's commit messages.

That would make the git history match Tom's same commit message
implicit CVS history that cvs2pcl was giving him. I find git-log's
output including merge commits kind of strange and annoying myself but
having them at least gives us a chance to have a tool that understands
them output something like cvs2pcl. Throwing away that information
because we don't like the clutter in the tool output seems like a
short-sighted plan.

That said, the commit log message isn't being lost. We could always
import the history linearly and add the merge commits later if we
decide having them would help some tool implement cvs2pcl summaries.

-- 
greg



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Andres Freund

Hi,

On 06/03/2009 02:08 PM, Greg Stark wrote:

On Wed, Jun 3, 2009 at 12:10 PM, Markus Wanner mar...@bluegap.ch wrote:
That would make the git history match Tom's same commit message
implicit CVS history that cvs2pcl was giving him. I find git-log's
output including merge commits kind of strange and annoying myself but
having them at least gives us a chance to have a tool that understands
them output something like cvs2pcl.
git log --no-merges hides the actual merge commits if that is what you 
want.


Andres



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andres Freund

On 06/03/2009 01:48 PM, Florian Weimer wrote:

I wondered for a second about symlinking .git from several checkout
directories to a common master, but AFAICT .git stores both the
repository and status information about the current checkout, so
that's not gonna work.

git clone --reference stores just a reference and does not copy the
history.
It's not going to help in the long run because history accumulating on
the HEAD will be duplicated in your release branches.  This is not a
problem if you never merge stuff into them, but I don't know how much
(recent) history browsing you want to do from your release checkouts.
As the referenced repository would be a mirror of the official 
repository, it should contain most of what is contained in the release 
checkouts - so repacking the release checkouts should remove duplicate 
objects, right?
The work on the release branches would hopefully get pushed to the 
official repository, so I don't see a long-term problem of duplicate objects.


I haven't really looked at the repack code, so I may be completely off...

Andres



[HACKERS] pg_migrator status for 8.4

2009-06-03 Thread Bruce Momjian
I believe pg_migrator has advanced as far as it is going to for 8.4 and
I am expecting to enter beta next week.  Here are the known pg_migrator
limitations:

---

Currently pg_migrator only supports upgrades from 8.3.X to 8.4.X.
pg_migrator will not work if:

o  a user column is of data type tsquery
o  a user column is of data type 'name' and is not the first
column

pg_migrator will require a table rebuild if:

o  a user column is of data type tsvector

pg_migrator will require a reindex if:

o  an index is of type hash or gin
o  an index uses bpchar_pattern_ops

All failure, rebuild, and reindex cases will be reported by pg_migrator
if they affect your installation;  post-migration scripts to rebuild
tables and indexes will be automatically generated.

In general it is unsafe to access tables referenced in rebuild scripts
until the rebuild scripts have run to completion;  doing so could yield
incorrect results or poor performance.  Tables not referenced in rebuild
scripts can be accessed immediately.
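For a rough preview of which indexes would need a reindex, a catalog query along these lines can be run on the 8.3 cluster (a sketch only; pg_migrator's own generated scripts are authoritative, and the bpchar_pattern_ops case needs a separate check against pg_index/pg_opclass):

```sql
-- List indexes whose access method is hash or gin; these will need a
-- REINDEX after migration.
SELECT c.relname AS index_name, am.amname AS access_method
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relkind = 'i'
  AND am.amname IN ('hash', 'gin');
```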


-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Tom Lane
Emmanuel Cecchet m...@frogthinker.org writes:
 Tom Lane wrote:
 AFAIK that doesn't really have anything to do with the temp-ness of the
 table; it'd be the same with a regular table.  The problem is you have
 an in-doubt tuple in pg_class for pg_temp_NNN.foo, and you are trying
 to create another one for the same schema/relname, and so the unique
 index check is blocking to see what happens to the other transaction
 that's creating/deleting the conflicting tuple.

 There should be no doubt about table foo because whether the
 transaction commits or rolls back, that table will not exist anymore (we
 can get rid of it at prepare time, actually).

True, but the problem is that the tuple might still be live to (some
snapshots in) that transaction, so we can't inject a duplicate tuple
without risking confusing it.  In this particular case that isn't an
issue because the transaction is done executing, but the tqual.c
rules don't know that.

regards, tom lane



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Emmanuel Cecchet

Tom Lane wrote:

Emmanuel Cecchet m...@frogthinker.org writes:
  

Take PG 8.3.0 and try:
BEGIN;
CREATE TEMP TABLE foo (x int) ON COMMIT DROP;
PREPARE TRANSACTION 't1';
[BEGIN;] -- doesn't really matter if you start a new transaction or not
CREATE TEMP TABLE foo (x int); -- blocks until t1 commits

I have been tracking down the problem and it looks like 
PostPrepare_Locks is holding the locks on 'foo' for some reason I don't 
really get.



AFAIK that doesn't really have anything to do with the temp-ness of the
table; it'd be the same with a regular table.  The problem is you have
an in-doubt tuple in pg_class for pg_temp_NNN.foo, and you are trying
to create another one for the same schema/relname, and so the unique
index check is blocking to see what happens to the other transaction
that's creating/deleting the conflicting tuple.
  


You are right (of course!), I tried:

BEGIN;
CREATE TABLE foo (x int);
DROP TABLE foo;
PREPARE TRANSACTION 't1';
[BEGIN;]
CREATE TABLE foo (x int); -- blocks


There should be no doubt about table foo because whether the 
transaction commits or rolls back, that table will not exist anymore (we 
can get rid of it at prepare time, actually).
I guess Postgres does not handle the special case of tables (temp or 
not) whose lifespan is limited to the scope of a transaction and 
therefore cannot optimize that case. Is that correct?


Thanks for your help.
Emmanuel



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Greg Stark
On Wed, Jun 3, 2009 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
 git log --no-merges hides the actual merge commits if that is what you
 want.

Ooh! Life seems so much sweeter now!

Given that we don't have to see them then I'm all for marking bug fix
patches which were applied to multiple branches as merges. That seems
like it would make it easier for tools like gitk to show useful
information analogous to the cvs2pcl info.

Given that Tom's been intentionally marking the commits with identical
commit messages we ought to be able to find *all* of them and mark
them properly. That would be way better than only finding patches that
are absolutely identical.

I'm not sure whether we should mark the old branches getting merges
down or the new branches getting merged up. I suspect I'm missing
something but I don't see any reason one is better than the other.

-- 
greg



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Magnus Hagander
Greg Stark wrote:
 On Wed, Jun 3, 2009 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
 git log --no-merges hides the actual merge commits if that is what you
 want.
 
 Ooh! Life seems so much sweeter now!
 
 Given that we don't have to see them then I'm all for marking bug fix
 patches which were applied to multiple branches as merges. That seems
 like it would make it easier for tools like gitk to show useful
 information analogous to the cvs2pcl info.

Right, if it adds additional metadata that lets the tools do their magic
better, and it's still easy to filter out, I don't see a downside.


 Given that Tom's been intentionally marking the commits with identical
 commit messages we ought to be able to find *all* of them and mark
 them properly. That would be way better than only finding patches that
 are absolutely identical.

Just to be clear, not just Tom. All committers. I was told to do that
right after my first backpatch which *didn't* do it :-)

So it's an established project practice. That has other advantages as
well, of course..


 I'm not sure whether we should mark the old branches getting merges
 down or the new branches getting merged up. I suspect I'm missing
 something but I don't see any reason one is better than the other.

If you go from older to newer, the automatic merge algorithms have a
better chance of doing something smart since they can track previous
changes. At least I think that's how it works.

I think for most of the changes it wouldn't make a huge difference,
though - manual merging would be needed anyway.

//Magnus



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Marko Kreen
On 6/3/09, Greg Stark st...@enterprisedb.com wrote:
 On Wed, Jun 3, 2009 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
   git log --no-merges hides the actual merge commits if that is what you
   want.


 Ooh! Life seems so much sweeter now!

  Given that we don't have to see them then I'm all for marking bug fix
  patches which were applied to multiple branches as merges. That seems
  like it would make it easier for tools like gitk or to show useful
  information analogous to the cvs2pcl info.

  Given that Tom's been intentionally marking the commits with identical
  commit messages we ought to be able to find *all* of them and mark
  them properly. That would be way better than only finding patches that
  are absolutely identical.

  I'm not sure whether we should mark the old branches getting merges
  down or the new branches getting merged up. I suspect I'm missing
  something but I don't see any reason one is better than the other.

Although marking Tom's back-branch fixes as merges makes much more
sense than marking new files as merges, it is quite a step up from
"do the tags match official releases".

It seems to require noticeable development effort to get an importer
to a level where it can do it.  Will this be a requirement for the
import?  Or just a good thing to have?  Also, how do we check whether
all such merges are sensible?

And note that such effort will affect only the old imported history;
it will not make it easier to handle back-branch fixes in the future...

Various scenarios with git cherry-pick and similar tools would still
result in duplicate commits, so we would need a git log post-processor
anyway if we want to somehow group them together for e.g. a weekly
commit summary.  And such a post-processor would work on old history too.

Maybe that's a better direction to work on than to risk a messy
history in Git?

-- 
marko



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Robert Haas
On Wed, Jun 3, 2009 at 10:13 AM, Magnus Hagander mag...@hagander.net wrote:
 Greg Stark wrote:
 On Wed, Jun 3, 2009 at 1:19 PM, Andres Freund and...@anarazel.de wrote:
 git log --no-merges hides the actual merge commits if that is what you
 want.

 Ooh! Life seems so much sweeter now!

 Given that we don't have to see them then I'm all for marking bug fix
 patches which were applied to multiple branches as merges. That seems
 like it would make it easier for tools like gitk or to show useful
 information analogous to the cvs2pcl info.

 Right, if it adds additional metadata that lets the tools do their magic
 better, and it's still easy to filter out, I don't see a downside.

 I'm not sure whether we should mark the old branches getting merges
 down or the new branches getting merged up. I suspect I'm missing
 something but I don't see any reason one is better than the other.

 If you go from older to newer, the automatic merge algorithms have a
 better chance of doing something smart since they can track previous
 changes. At least I think that's how it works.

 But I think for most of the changes it wouldn't make a huge difference,
 though - manual merging would be needed anyway.

In practice, isn't it more likely that you would develop the change on
the newest branch and then try to back-port it?  However you do the
import, you're going to want to do subsequent things the same way.

...Robert



Re: [HACKERS] display previous query string of idle-in-transaction

2009-06-03 Thread Greg Stark
Should this patch be on the commitfest page for 8.5? Or is there a
consensus already that it's a bad idea?

Personally I actually think this makes a lot of sense to do.

-- 
greg



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Marko Kreen
On 6/3/09, Magnus Hagander mag...@hagander.net wrote:
 Robert Haas wrote:
   On Wed, Jun 3, 2009 at 10:13 AM, Magnus Hagander mag...@hagander.net 
 wrote:

  I'm not sure whether we should mark the old branches getting merges
   down or the new branches getting merged up. I suspect I'm missing
   something but I don't see any reason one is better than the other.
   If you go from older to newer, the automatic merge algorithms have a
   better chance of doing something smart since they can track previous
   changes. At least I think that's how it works.
  
   But I think for most of the changes it wouldn't make a huge difference,
   though - manual merging would be needed anyway.
  
   In practice, isn't it more likely that you would develop the change on
   the newest branch and then try to back-port it?  However you do the
   import, you're going to want to do subsequent things the same way.


 That's definitely the order in which *I* work, and I think that's how
  most others do it as well.

That's true, but it's not representable in the VCS unless you use
cherry-pick, which is just UI around patch transport.  But with
separate local trees (which can optionally contain local per-fix
branches), it is possible to separate the fix development from the
final representation.

-- 
marko



Re: [HACKERS] Allow vacuumdb to only analyze

2009-06-03 Thread Bruce Momjian
Robert Haas wrote:
 I noticed in Bruce's talk that there are a number of post-migration
 steps which are currently partially manual.  Ideally we'd like to
 automate them all, preferably in some sort of well-thought-out order.
 I actually suspect this is something like: analyze each database,
 reindex those indices invalidated by the upgrade, analyze each
 database again.  Ideally we'd like to have some control over the
 degree of parallelism here too but that might be asking too much for
 8.4.

I can easily have pg_migrator run those scripts itself, but I pushed it
on to the administrator so pg_migrator could finish and they could
decide when to run those scripts.  For example, there might only be
issues in a few databases, and the other databases could be used fully
while the upgrade scripts are running.  The same holds for analyzing the
cluster --- anything I thought might take a while I gave to the
administrators.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] display previous query string of idle-in-transaction

2009-06-03 Thread Kevin Grittner
Greg Stark st...@enterprisedb.com wrote:
 Should this patch be on the commitfest page for 8.5? Or is there a
 consensus already that it's a bad idea?
 
 Personally I actually think this makes a lot of sense to do.
 
+1
 
It at least gives one a reasonable chance to get a useful clue.
 
-Kevin



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Ron Mayer
Robert Haas wrote:
 The problem with making each release a separate directory is that,
 just like using separate repositories, it will defeat one of the main
 strengths of git, which is the ability to move around commits easily.
 git-new-workdir is the only solution to the problem of having multiple
 branches checked out simultaneously that seems like it might not
 suffer from that weakness.

While I agree that git-new-workdir is best for typical postgres
workflows (so I won't dwell on separate repositories beyond this post),
I think you overstate the difficulty a bit.


It seems it's not that hard to cherry-pick from a remote repository by
setting up a temporary tracking branch and (optionally) removing it
when you're done with it if you don't think you'll need it often.

From: http://www.sourcemage.org/Git_Guide
$ git checkout --track -b tmp local branch origin/remote branch
$ git cherry-pick -x sha1 refspec of commit from other (local or remote) 
branch
$ git push origin tmp local branch
$ git branch -D tmp local branch

And if you know you'll be moving patches between external repositories
like origin/remote branch often, ISTM you don't have to do the first and
last steps (which create and remove the tracking branch) each time; you
can instead just leave the local tracking branch there.



IMVHO, moving commits around across *different* remote repositories is
also one of the main strengths of moving to a distributed VCS.



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Aidan Van Dyk
* Magnus Hagander mag...@hagander.net [090603 10:13]:
 
 Right, if it adds additional metadata that lets the tools do their magic
 better, and it's still easy to filter out, I don't see a downside.

Note that it could (and likely will) have a downside when you get to
doing real merge-based development... A merge means that *all* changes
in *both* parents have been combined in *this* commit, and all merge
tools depend on this.  That's the directed part of the DAG in git.  So
if you want to work in a way the merge tools understand, you
*don't* have master/HEAD merged into REL8_2_STABLE.  You can have
REL8_2_STABLE merged into master/HEAD.

I'll concede that in git it's flexible (some say arbitrary) enough that
you can *construct* the DAG otherwise, but then you've done something in
such a fashion that the DAG has no bearing on real merging, and thus
you lose all the power of the DAG's merge tracking when working on new,
real merging...

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Aidan Van Dyk
* Marko Kreen mark...@gmail.com [090603 10:26]:
 
 Thats true, but it's not representable in VCS, unless you use cherry-pick,
 which is just UI around patch transport.  But considering separate
 local trees (with can optionally contain local per-fix branches),
 it is possible to separate the fix-developement from final representation.

I'll note that in git, cherry-pick is *more* than just patch
transport.  I would rather call it patch commuting.  It actually
looks at the history between the picked patch and the current
tree, at any merge/fork points, and at the differences on each path
that led to the changes in the current tree and the picked patch.

a.
-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Magnus Hagander
Robert Haas wrote:
 On Wed, Jun 3, 2009 at 10:13 AM, Magnus Hagander mag...@hagander.net wrote:
 I'm not sure whether we should mark the old branches getting merges
 down or the new branches getting merged up. I suspect I'm missing
 something but I don't see any reason one is better than the other.
 If you go from older to newer, the automatic merge algorithms have a
 better chance of doing something smart since they can track previous
 changes. At least I think that's how it works.

 But I think for most of the changes it wouldn't make a huge difference,
 though - manual merging would be needed anyway.
 
 In practice, isn't it more likely that you would develop the change on
 the newest branch and then try to back-port it?  However you do the
 import, you're going to want to do subsequent things the same way.

That's definitely the order in which *I* work, and I think that's how
most others do it as well.


-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Robert Haas
On Wed, Jun 3, 2009 at 10:20 AM, Marko Kreen mark...@gmail.com wrote:
 Various scenarios with git cherry-pick and similar tools would still
 result in duplicate commits, so we would need a git log post-processor
 anyway if we want to somehow group them together for eg. weekly commit
 summary.  And such post-processor would work on old history too.

 Maybe that's better direction to work on, than to potentially risk in
 messy history in GIT?

I think it is.  cherry-picking seems like a much better way of
back-patching than merging, so putting a lot of effort into making
merges work doesn't seem like a good expenditure of effort.

It seems pretty clear that searching through the histories of each
branch for duplicate commit messages and producing a unified report is
pretty straightforward if we assume that the commit messages are
byte-for-byte identical (or even modulo whitespace changes).  But I
wonder if it would make more sense to include some kind of metadata in
the commit message (or some other property of the commit?  does git
support that?) to make it not depend on that.  I suppose Tom et al.
like the way they do it now, so maybe we should just stick with text
comparison, but it seems a bit awkward to me.

...Robert



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Alvaro Herrera
Robert Haas wrote:
 On Tue, Jun 2, 2009 at 7:54 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

  I think these are the two usable options.  They will probably end up
 
 ...wait a minute.   I just went and Googled this git-new-workdir thing
 and it looks like it's almost exactly what we need.  According to the
 docs, it lets you share the same local repository between multiple
 working copies, so all the commits are shared but the index is
 separate for each working directory.  Assuming it works, that sounds
 just about perfect for Tom's use case, since it would allow
 cherry-picking of commits without an intervening push/pull cycle.  Did
 you have some reason for passing over that as one of the usable
 options?

Well, it sounds about perfect for my use case too (which is
approximately the same as Tom's), but the description makes it sound
unsupported.  It doesn't work on Windows, which doesn't bother me
personally but may be a showstopper more generally.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Warnings in compile

2009-06-03 Thread Bruce Momjian
Tom Lane wrote:
 Michael Meskes mes...@postgresql.org writes:
  On Mon, May 25, 2009 at 10:19:40AM -0400, Tom Lane wrote:
  That sounds both dangerous and against our coding conventions.  The
  standard way to do that is do { ... } while (0)
 
  Which won't work here as the macros have continue and break commands in 
  them.
 
 Oh, right, that was Bruce's improvement of the COPY code.  I was less
 than thrilled with it, but didn't have an easy alternative.
 
 You can't just remove the else, or it's unsafe; and I'm afraid that
 changing the macros into else {} would still leave us with some
 warnings about empty statements ...

Wow, that must have been a long time ago because I had forgotten about
it (seems it was 2005-12-27).  At least I added a macro comment:

/*
 * These macros centralize code used to process line_buf and raw_buf buffers.
 * They are macros because they often do continue/break control and to avoid
 * function call overhead in tight COPY loops.
 *
 * We must use if (1) because do {} while(0) overrides the continue/break
 * processing.  See http://www.cit.gu.edu.au/~anthony/info/C/C.macros.
 */

As I remember this was an attempt to implement Greenplum's optimizations
in a coherent manner.

I have added a comment about why ((void) 0) is used.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Marko Kreen
On 6/3/09, Aidan Van Dyk ai...@highrise.ca wrote:
 * Marko Kreen mark...@gmail.com [090603 10:26]:
   Thats true, but it's not representable in VCS, unless you use cherry-pick,
   which is just UI around patch transport.  But considering separate
   local trees (with can optionally contain local per-fix branches),
   it is possible to separate the fix-developement from final representation.


 I'll note that in git, cherry-pick is *more* than just patch
  transport.  I would more call it patch commute.  It does actually
  look at the history between the picked patch, and the current
  tree, any merge/fork points, and the differences on each path that lead
  to the changes in the current tree and the picked patch.

Well, that's good to know, but this also seems to mean it's a rather
bad tool for back-patching, as you risk also including random unwanted
commits that happened in HEAD in the meantime.  But also, it's a very
good tool for forward-patching.

But my point was not about that - rather, I was pointing out that
this patch commuting will result in duplicate commits that have
no ties in the DAG.

-- 
marko



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Dave Page
On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Well, it sounds about perfect for my use case too (which is
 approximately the same as Tom's), but the description makes it sound
 unsupported.  It doesn't work on Windows which doesn't bother me
 personally but may be a showstopper more generally.

It's not a showstopper for me. Can't speak for Magnus, Andrew or
anyone else working on Windows though. I imagine those two are the
most likely to have issues if they're back-patching - and that should
just be a matter of disk space.


-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Aidan Van Dyk
* Marko Kreen mark...@gmail.com [090603 11:12]:
 
 Well, thats good to know, but this also seems to mean it's rather bad
 tool for back-patching, as you risk including random unwanted commits
 too that happened in the HEAD meantime.  But also, it's very good
 tool for forward-patching.

It doesn't pull in commits in the sense that darcs does... Rather,
it's more like: the patch changes $XXX in $file, but $file was
really $old_file at the common point between the two commits, and
$old_file is still $old_file in the commit I'm trying to apply the
patch to.

It looks at the history of the changes to figure out why (or why
not) they apply, and to see if they should still be applied to the same
file, or to another file (in case of a renamed/moved file in one
branch), or, if the changed area has moved drastically in the file in
one branch, whether the change should be applied there instead.

 But my point was not about that - rather I was pointing out that
 this patch-commute will result in duplicate commits, that have
 no ties in DAG.

Yes.  That's a cherry-pick; if you want a merge, you merge ;-)  But a
merge carries the baggage of the expectation that *all* changes in both
parents have been combined.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Marko Kreen
On 6/3/09, Aidan Van Dyk ai...@highrise.ca wrote:
 * Marko Kreen mark...@gmail.com [090603 11:12]:
   Well, thats good to know, but this also seems to mean it's rather bad
   tool for back-patching, as you risk including random unwanted commits
   too that happened in the HEAD meantime.  But also, it's very good
   tool for forward-patching.

 It doesn't pull in commits in the sense that darcs does... But rather,
  its more like the patch changes $XXX in $file, but that $file was
  really $old_file at the common point between the 2 commits, and
  $old_file is still $old file in the commit I'm trying to apply the patch
  to.

  It looks at the history of the changes to figure out why (or why
  not) they apply, and see if they should still be applied to the same
  file, or another file (in case of a rename/moved file in 1 branch), or
  if the changed area has been moved drastically in the file in one
  branch, and the change should be applied there instead.

I'm not certain, but I remember using cherry-pick and seeing
several commits in the result.  This seems to be a point that needs
to be checked.

   But my point was not about that - rather I was pointing out that
   this patch-commute will result in duplicate commits, that have
   no ties in DAG.


 Yes.  That's a cherry-pick, if you want a merge, you merge ;-)  But
  merge carries the baggage of expectation that *all* changes in both
  parents have been combined.

But in the forward-merge case it's true.

-- 
marko



[HACKERS] Synchronous replication: status of standby side

2009-06-03 Thread Kolb, Harald (NSN - DE/Munich)
Hi,

will there be a way to retrieve the standby's status in the case of
synchronous replication?
We would need a command or similar to report the current state and
state changes (like: syncing, in-sync, replication broken, ...).
Observing and scanning the logfile would not be appropriate.
Are there plans for such an interface?

Best regards 
Harald Kolb



Re: [HACKERS] Question about STRICT

2009-06-03 Thread Tom Lane
Gevik Babakhani pg...@xs4all.nl writes:
 Perhaps it is an idea to have something like:
   RAISE ERROR ON NULL INPUT

[ shrug... ]  There's really been no demand for that.  If you want a
particular function to do it, you can put suitable tests and error
reports into that function.  I can't see us adding extra cycles into
the core function-calling code (which is a very hot hot-spot) for a
feature with so little demand.

regards, tom lane



[HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Andres Freund

Hi,

While analyzing some complex queries and switching away from using the
materialized views to their underlying ones, I got interested in the
long plan times (minutes and up) and did some profiling work.

The queries are high-dimensional star-schema-like queries
(unfortunately quite private (health) data and a schema I may not make
public).


Using oprofile and
valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes
--simulate-cache=yes --simulate-hwpref=yes, I found that one of the
bitmapset functions is near the top of the profile.


When switching bitmapword and companions in bitmap.h to u64 and s64
respectively, I get an improvement of up to 15% in queries with 16+
joins.  The more joins, the bigger the win.


In the structurally very simple query with 16 joins the improvement is
around 1-2%.
With the most complex query I tested (the number of participating
relations is hard to count because of the many views) I get an
improvement of up to 15%.
I did not test with bigger/more complex queries because it got too slow
to get sufficiently thorough results.


When playing around with join_collapse_limit, from_collapse_limit, geqo,
and geqo_threshold, I found that unless the settings are set to really
low values I can find performance improvements for most combinations.


I could not find any regression in the queries we use - and I can't see
where there would be a significant overhead.


Unfortunately the more interesting trace seems to be the valgrind one,
which with these options currently only kcachegrind can read.  I could
not get a usable text export out of the latter.


Linked are two overview pictures before (32bit.png) and after
(64bit.png) the switch to using 64-bit bitmapsets, from the backend
evaluating a complex query once:


http://anarazel.de/pg/32bit_bitmapsets.png
http://anarazel.de/pg/64bit_bitmapsets.png

That seems like an easy change - is there a reason not to do it if the
arch is a 64-bit one?


Can anybody else with complex queries test my results? (I can provide a 
patch if wanted).


Andres

PS: If kcachegrind users want to see the trace, speak up...



[HACKERS] Postgres delays function returning large set of data

2009-06-03 Thread flippo00110001

Hello,

First some background:
For my application I need to collect and retrieve data at a very fast
rate, faster than postgres can handle, but I also need to store and
access a large range of metadata about the large sets of data I am
storing.  I am therefore attempting to use postgres for the metadata in
combination with the HDF5 filesystem for the large sets of data, to
store data fast and still have all of the power of a relational database
over the metadata.

Now onto the problem:
To combine the two sets of data I have created a C function which can be
dynamically loaded into postgres; it takes a primary-key index into the
data and returns a large table (around 3 elements) of the data (value
and timestamp).  When I access the data directly it takes me around 30ms
just to read the data out of my test case.  As the postgres C function
is called many times and I must convert the raw data to tuples, it
introduces more overhead, bringing the test case to around 2000ms; this
is still acceptable.  My problem is that when I perform an actual query
over my test case, postgres takes around 22000ms to retrieve the sets of
data.  My concern is: what is postgres doing in these extra 20 seconds,
and how can I make this run faster?

In summary:
I have a large test case.
I am running a C function from postgres which takes a primary key and
returns a setof (int4, int4), with around 3 tuples returned per call.
My C function, when timed over the duration of the test case, takes 2
seconds to run.
The postgres query select * from function(args); takes around 11 times
this amount of time to execute.

My function is defined as:
create or replace function(IN args, IN..., OUT name1 int4, OUT name2 int4)
returns setof record as 'libname.so', 'fcnname' LANGUAGE C IMMUTABLE STRICT;

I'm sorry if this is the wrong forum; I was torn between hackers and
performance.
Thank you for any assistance.





Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Aidan Van Dyk
* Marko Kreen mark...@gmail.com [090603 11:28]:
 
 I'm not certain, but I remember using cherry pick and seeing
 several commits in result.  This seems to be a point that needs
 to be checked.

I'm not sure what you're recalling, but git cherry-pick takes a single
commit, and applies it as a single commit (or, with -n, doesn't actually
commit it).  That's what it does... There are various *other* tools (like
rebase, am, cherry, etc) which operate on sets of commits.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCH v2] Add bit operations util header

2009-06-03 Thread Tom Lane
Florian Weimer fwei...@bfk.de writes:
 * Jeremy Kerr:
 Because now we have to test the compiler *and* the version as well?

 This builtin is not architecture-specific, so you'd save the
 architecture check.

The appropriate way to handle it would be a configure probe to see if
the function is available, thus avoiding any wired-in knowledge about
compiler or compiler version *or* architecture.

The other thing I didn't like about the patch was the assumption that
it's okay to have a static inline function in a header.  You can
get away with that in gcc but *not* in other compilers.  Look at the
existing coding patterns for, eg, list_head; then go thou and do
likewise.  Or, since there's currently no need for the code outside
aset.c, forget about putting it in a header and just plop it into
aset.c.

regards, tom lane



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Tom Lane
Dave Page dp...@pgadmin.org writes:
 On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Well, it sounds about perfect for my use case too (which is
 approximately the same as Tom's), but the description makes it sound
 unsupported.  It doesn't work on Windows which doesn't bother me
 personally but may be a showstopper more generally.

 It's not a showstopper for me. Can't speak for Magnus, Andrew or
 anyone else working on Windows though.

Seems like we'd want all committers to be using a similar work-flow
for back-patching, else we're going to have random variations in what
patch sets look like in the history.

I think the appropriate question is why doesn't it work on Windows,
and is that fixable?  Without having looked, I'm guessing the issue
is that it depends on hardlinks or symlinks --- and we know those are
available, as long as you're using recent Windows with NTFS.  Which
does not sound like an unreasonable baseline requirement for someone
committing from Windows.

regards, tom lane



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Joshua D. Drake
On Wed, 2009-06-03 at 12:01 -0400, Tom Lane wrote:

 I think the appropriate question is why doesn't it work on Windows,
 and is that fixable?  Without having looked, I'm guessing the issue
 is that it depends on hardlinks or symlinks --- and we know those are
 available, as long as you're using recent Windows with NTFS.  Which
 does not sound like an unreasonable baseline requirement for someone
 committing from Windows.

That was the mention in the channel, that it had to do with symlinks.

Joshua D. Drake


-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andrew Dunstan



Dave Page wrote:

On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:


Well, it sounds about perfect for my use case too (which is
approximately the same as Tom's), but the description makes it sound
unsupported.  It doesn't work on Windows which doesn't bother me
personally but may be a showstopper more generally.



It's not a showstopper for me. Can't speak for Magnus, Andrew or
anyone else working on Windows though. I imagine those two are the
most likely to have issues if they're back-patching - and that should
just be a matter of disk space.


Yeah, AFAIK Magnus doesn't commit direct from Windows, and neither do I, 
and this should not be a showstopper for anyone who isn't a committer.


cheers

andrew



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Dave Page
On Wed, Jun 3, 2009 at 5:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Dave Page dp...@pgadmin.org writes:
 On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Well, it sounds about perfect for my use case too (which is
 approximately the same as Tom's), but the description makes it sound
 unsupported.  It doesn't work on Windows which doesn't bother me
 personally but may be a showstopper more generally.

 It's not a showstopper for me. Can't speak for Magnus, Andrew or
 anyone else working on Windows though.

 Seems like we'd want all committers to be using a similar work-flow
 for back-patching, else we're going to have random variations in what
 patch sets look like in the history.

 I think the appropriate question is why doesn't it work on Windows,
 and is that fixable?  Without having looked, I'm guessing the issue
 is that it depends on hardlinks or symlinks --- and we know those are
 available, as long as you're using recent Windows with NTFS.  Which
 does not sound like an unreasonable baseline requirement for someone
 committing from Windows.

It's a simple perl script that uses symlinks:
http://git.kernel.org/?p=git/git.git;a=blob;f=contrib/workdir/git-new-workdir

But... it doesn't really break the workflow as far as I can see - it
will just mean Windows users need multiple full copies of the repo for
each branch until the script could be hacked up.


-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Dave Page
On Wed, Jun 3, 2009 at 5:14 PM, Dave Page dp...@pgadmin.org wrote:

 It's a simple perl script that uses symlinks:
 http://git.kernel.org/?p=git/git.git;a=blob;f=contrib/workdir/git-new-workdir

Err, shell script even.

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 When switching bitmapword and companions in bitmap.h to u64 and s64 
 respectively I get an improvement up to 15% in queries with 16+ joins. 

I find this *really* hard to believe, because I've never seen the bitmap
support operations show up noticeably at all in profiles.  What sort of
queries are you testing?

regards, tom lane



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andres Freund

On 06/03/2009 06:17 PM, Andrew Dunstan wrote:

Tom Lane wrote:

I think the appropriate question is why doesn't it work on Windows,
and is that fixable? Without having looked, I'm guessing the issue
is that it depends on hardlinks or symlinks --- and we know those are
available, as long as you're using recent Windows with NTFS. Which
does not sound like an unreasonable baseline requirement for someone
committing from Windows.

I think it could probably be made to work on Windows if really necessary 
(e.g. by translating into perl).
Is the fact that it's implemented as a shell script the real problem? 
Isn't it more that symlinks aka Junction Points are really dangerous on 
<= WinXP? (Deleting a symlink recurses into the target and deletes there.)


Andres



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andrew Dunstan



Tom Lane wrote:

Dave Page dp...@pgadmin.org writes:

On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:


Well, it sounds about perfect for my use case too (which is
approximately the same as Tom's), but the description makes it sound
unsupported.  It doesn't work on Windows which doesn't bother me
personally but may be a showstopper more generally.

It's not a showstopper for me. Can't speak for Magnus, Andrew or
anyone else working on Windows though.



Seems like we'd want all committers to be using a similar work-flow
for back-patching, else we're going to have random variations in what
patch sets look like in the history.

I think the appropriate question is why doesn't it work on Windows,
and is that fixable?  Without having looked, I'm guessing the issue
is that it depends on hardlinks or symlinks --- and we know those are
available, as long as you're using recent Windows with NTFS.  Which
does not sound like an unreasonable baseline requirement for someone
committing from Windows.


It's a shell script, IIRC.

I think it could probably be made to work on Windows if really necessary 
(e.g. by translating into perl).


cheers

andrew



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Andres Freund

Hi,

On 06/03/2009 06:21 PM, Tom Lane wrote:

Andres Freund and...@anarazel.de writes:

When switching bitmapword and companions in bitmap.h to u64 and s64
respectively I get an improvement up to 15% in queries with 16+ joins.

I find this *really* hard to believe, because I've never seen the bitmap
support operations show up noticeably at all in profiles.  What sort of
queries are you testing?
Many left joins from one base relation to additional dimensions. All the 
dimensions are relatively complex views consisting of multiple joins 
or subselects.
Some correlated subqueries and some [NOT] EXISTS() are also included in 
some of the queries.



I tested by repeatedly recompiling with and without 64-bit bitmaps, just 
changing those three definitions. I don't see how I could get false 
results that way.


I guess the biggest advantage comes from less cache thrashing?

Andres



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andrew Dunstan



Andres Freund wrote:

On 06/03/2009 06:17 PM, Andrew Dunstan wrote:

Tom Lane wrote:

I think the appropriate question is why doesn't it work on Windows,
and is that fixable? Without having looked, I'm guessing the issue
is that it depends on hardlinks or symlinks --- and we know those are
available, as long as you're using recent Windows with NTFS. Which
does not sound like an unreasonable baseline requirement for someone
committing from Windows.

I think it could probably be made to work on Windows if really necessary
(e.g. by translating into perl).
Is the fact that it's implemented as a shell script the real problem? 
Isn't it more that symlinks aka Junction Points are really dangerous on 
<= WinXP? (Deleting a symlink recurses into the target and deletes there.)





You have carefully left out the first sentence of my reply. Neither of 
the committers who actually do much work on Windows (namely Magnus and 
me) commit direct from *any* version of Windows. And the whole point of 
this was to overcome an issue relating to commits, so it should not 
affect anyone except a committer.


And yes, we know about junction points. I don't think either of us is 
doing any development work on XP. I do most of my Windows work on my 
laptop, which has Vista (and thus mklink as well as junction points).


And yes, the fact that it's a shell script can be a problem if you're 
not using a Unix-like shell environment.


cheers

andrew




Re: [HACKERS] Question about STRICT

2009-06-03 Thread Gevik Babakhani

Tom Lane wrote:

Gevik Babakhani pg...@xs4all.nl writes:

Perhaps it is an idea to have something like:
  RAISE ERROR ON NULL INPUT


[ shrug... ]  There's really been no demand for that.  If you want a
particular function to do it, you can put suitable tests and error
reports into that function.  I can't see us adding extra cycles into
the core function-calling code (which is a very hot hot-spot) for a
feature with so little demand.


Understood. Thank you (Tom and Greg) for clarifying this.

--
Regards,
Gevik



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 06/03/2009 06:21 PM, Tom Lane wrote:
 I find this *really* hard to believe, because I've never seen the bitmap
 support operations show up noticeably at all in profiles.  What sort of
 queries are you testing?

 Many left joins from one base relation to additional dimensions. All the 
 dimensions are relatively complex views consisting of multiple joins 
 or subselects.
 Some correlated subqueries and some [NOT] EXISTS() are also included in 
 some of the queries.

Hmmm, could you provide a complete test case?  I'm thinking the behavior
might indicate some other performance issue, ie an unreasonable number
of bitmapset calls in some particular planning path.

There used to be some performance issues in this area back when we
represented sets of relids as integer Lists :-(, but the change to
bitmap sets pretty much stomped that.  I'm just really surprised that
there would be anything measurable from changing the word width.

regards, tom lane



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andres Freund

On 06/03/2009 06:38 PM, Andrew Dunstan wrote:

Andres Freund wrote:

On 06/03/2009 06:17 PM, Andrew Dunstan wrote:

Tom Lane wrote:

I think the appropriate question is why doesn't it work on
Windows, and is that fixable? Without having looked, I'm
guessing the issue is that it depends on hardlinks or symlinks
--- and we know those are available, as long as you're using
recent Windows with NTFS. Which does not sound like an
unreasonable baseline requirement for someone committing from
Windows.

I think it could probably be made to work on Windows if really
necessary (e.g. by translating into perl).

Is the fact that it's implemented as a shell script the real
problem? Isn't it more that symlinks aka Junction Points are
really dangerous on <= WinXP? (Deleting a symlink recurses into
the target and deletes there.)

You have carefully left out the first sentence of my reply.

Sorry, I didn't want to imply anything by that.


And yes, we know about junction points. I don't think either of us is
doing any development work on XP. I do most of my Windows work on my
laptop, which has Vista (and thus mklink as well as junction
points).

Good then.


And yes, the fact that it's a shell script can be a problem if
you're not using a Unix-like shell environment.

The Git for Windows installation includes a functional Unix-like shell
(mingw, not cygwin or such). Some core parts of git are still written in
shell, so it would not work without that anyway.

Andres



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 You have carefully left out the first sentence of my reply. Neither of 
 the committers who actually do much work on Windows (namely Magnus and 
 me) commit direct from *any* version of Windows.

Nonetheless, that might not be true in future.  I'd be a bit worried
about establishing a project standard that excluded people from doing
commit work on Windows.

But it sounds like that problem could be dealt with if anyone cared to
put some work into it, so I'm feeling this is not a showstopper issue.


What it seems we need next is for someone to experiment with
git-new-workdir and committing patches that touch multiple branches,
to confirm whether this actually offers a good solution.

regards, tom lane



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Andres Freund

On 06/03/2009 06:42 PM, Tom Lane wrote:

Andres Freund and...@anarazel.de writes:

On 06/03/2009 06:21 PM, Tom Lane wrote:

I find this *really* hard to believe, because I've never seen the bitmap
support operations show up noticeably at all in profiles.  What sort of
queries are you testing?



Many left joins from one base relation to additional dimensions. All the
dimensions are relatively complex views consisting of multiple joins
or subselects.
Some correlated subqueries and some [NOT] EXISTS() are also included in
some of the queries.

Hmmm, could you provide a complete test case?  I'm thinking the behavior
might indicate some other performance issue, ie an unreasonable number
of bitmapset calls in some particular planning path.
Uh, I will try, but it probably won't be easy. (Once again, a 
data-warehouse-ish example database would be useful.)


The graph linked in the earlier email includes all callers with a relevant 
number of calls (generate_join_implied_equalities, join_is_legal, and 
have_join_order_restriction are, in that order, by far the most costly).


I put up the raw profile data at:
http://anarazel.de/pg/32bit_bitmaps.out.gz
http://anarazel.de/pg/64bit_bitmaps.out.gz
As I said, unfortunately only kcachegrind seems to be able to load the 
data - it is included in most linux distros though.



There used to be some performance issues in this area back when we
represented sets of relids as integer Lists :-(, but the change to
bitmap sets pretty much stomped that.  I'm just really surprised that
there would be anything measurable from changing the word width.
Well, the number of memory accesses is halved, and I think that bitwise 
NOT and AND take the same number of cycles whether they are operating on 
32 or 64 bits. That would explain some of the difference.


Andres



Re: [HACKERS] pg_migrator status for 8.4

2009-06-03 Thread Alvaro Herrera
Bruce Momjian wrote:

 pg_migrator will require a reindex if:
 
 o  an index is of type hash or gin
 o  an index uses bpchar_pattern_ops

What about varchar_pattern_ops and text_pattern_ops?  Are they not
affected?


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Kevin Grittner
Andres Freund and...@anarazel.de wrote: 
 
 long plan times (minutes and up)
 
Wow.  I thought I had some pretty complex queries, including some
which join using several views, each of which has several joins; but
I've never gone to multiple seconds on plan time (much less multiple
minutes!) without very high statistics targets and many indexes on the
tables.  Any rough estimates on those?
 
If you think your patch could have a significant impact on a query
with a 260 ms plan time, I could give it a try.
 
-Kevin



Re: [HACKERS] 8.4b2 tsearch2 strange error

2009-06-03 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 Sorry for delay. I have put a database dump at:
 http://sylpheed.sraoss.jp/tmp/dump.sql.gz

 We got following errors using this database:

Hmm.  I can confirm the "unrecognized operator" failure (though I get
8 not -50 ... uninitialized memory?).  The "tuple out of range"
failures are not happening here though.  Perhaps it is locale-dependent?
What locale are you using, and what is your default_text_search_config
setting?  Are you using any nondefault configuration settings?

regards, tom lane



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Alvaro Herrera
Markus Wanner wrote:
 Hi,

 Quoting David E. Wheeler da...@kineticode.com:
 Monotone?

 ..one of the sources of inspiration for Linus to write git. He was not  
 satisfied with its speed and he didn't like C++ and SQL. Plus the main  
 contributors weren't around at the time Linus was on the mailing list.  
 So he turned away and did his own thing, in C and filesystem based.  
 (Most ranting stripped).

The only rant I have about the outcome is that Linus did not copy more
of it.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Magnus Hagander
Andrew Dunstan wrote:
 
 
 Dave Page wrote:
 On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 Well, it sounds about perfect for my use case too (which is
 approximately the same as Tom's), but the description makes it sound
 unsupported.  It doesn't work on Windows which doesn't bother me
 personally but may be a showstopper more generally.

 It's not a showstopper for me. Can't speak for Magnus, Andrew or
 anyone else working on Windows though. I imagine those two are the
 most likely to have issues if they're back-patching - and that should
 just be a matter of disk space.

 Yeah, AFAIK Magnus doesn't commit direct from Windows, and neither do I,
 and this should not be a showstopper for anyone who isn't a committer.

Well, partially correct.

My workflow today is that I do the commit on a git repository in my
Windows VM. Which I then git push out to my linux box. Where I do a
make to be sure I didn't break things :-), and then just extract the
patch with git diff and apply it manually to the cvs tree, and finally
I commit in cvs...

Even if we move to git, I have no desire to push directly from Windows
into the core repository. I'll still stage it through a local one.

-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andrew Dunstan



Andres Freund wrote:

The Git for Windows installation includes a functional Unix-like shell
(mingw, not cygwin or such). Some core parts of git are still written in
shell, so it would not work without that anyway.




Ah. Ok. Good to know. Does it contain a built-in ln command? And does 
that use junction points?


cheers

andrew



Re: [HACKERS] pg_migrator status for 8.4

2009-06-03 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Bruce Momjian wrote:
 pg_migrator will require a reindex if:
 o  an index is of type hash or gin
 o  an index uses bpchar_pattern_ops

 What about varchar_pattern_ops and text_pattern_ops?  Are they not
 affected?

Nope.  See
http://archives.postgresql.org/pgsql-committers/2008-05/msg00346.php

regards, tom lane



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andres Freund

Hi,

On 06/03/2009 07:26 PM, Andrew Dunstan wrote:

Andres Freund wrote:

The git for windows installation includes a functional unix-alike shell
(mingw, not cygwin or such). Some core part of git are still written in
shell, so it would not work without that anyway.

Ah. Ok. Good to know. Does it contain a builtin ln command? And does
that use junction points?

It contains an ln.exe, but I do not know what exactly it does:
http://repo.or.cz/w/msysgit.git?a=tree;f=bin;h=ab9faa176dbed67a93aa223e0d84bff9f950a26d;hb=HEAD

I don't have Windows access for the next few hours, but if nobody has 
answered by then, I will try it.


Andres



Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Andrew Dunstan



Magnus Hagander wrote:

Andrew Dunstan wrote:

Dave Page wrote:


On Wed, Jun 3, 2009 at 4:01 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:


Well, it sounds about perfect for my use case too (which is
approximately the same as Tom's), but the description makes it sound
unsupported.  It doesn't work on Windows which doesn't bother me
personally but may be a showstopper more generally.



It's not a showstopper for me. Can't speak for Magnus, Andrew or
anyone else working on Windows though. I imagine those two are the
most likely to have issues if they're back-patching - and that should
just be a matter of disk space.

Yeah, AFAIK Magnus doesn't commit direct from Windows, and neither do I,
and this should not be a showstopper for anyone who isn't a committer.



Well, partially correct.

My workflow today is that I do the commit on a git repository in my
Windows VM. Which I then git push out to my linux box. Where I do a
make to be sure I didn't break things :-), and then just extract the
patch with git diff and apply it manually to the cvs tree, and finally
I commit in cvs...

Even if we move to git, I have no desire to push directly from Windows
into the core repository. I'll still stage it through a local one.


I see. In that case, though, you probably do need to be able to do things 
atomically across branches, so that you can push a single changeset, no?


Anyway, it sounds like it's not going to be a showstopper.

cheers

andrew



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-03 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 More suggested dispositions:

  * contrib/seg and contrib/cube GiST index support have performance issues

 -- If it's just a performance issue, I don't think this issue should 
 block 8.4.0; it can be fixed in 8.4.1 if necessary, since we'll probably 
 want to backpatch the fix anyway.

I might be mistaken, but I think that any fix would invalidate existing
indexes of these types; which would make it problematic to apply a fix
in a minor release.  This item is actually my single biggest concern
of the Open Items list.  I'd really like to see it resolved.  It's not
so much that seg and cube themselves are a big deal, as that I think
people have copied that logic in custom opclasses.  We need to be
shipping an example that's not buggy.

  * localeconv encoding issues
o proposed patch here

 -- Any reason not to apply patch?

It still needs work, per my review the other day.  In any case, it
being a Windows-specific patch, I'd want one of the Windows folk
to take responsibility for testing/committing it --- I'm not going to.

  * BUG #4622: xpath only work in utf-8 server encoding

 -- I think this is a doc patch.  Since libxml (as I understand it) only 
 supports UTF, this is not something we can fix without major engineering 
 anyway, certainly not before release.  I just think we need big warnings 
 in the docs in several places.

This resolution is okay with me.  I'm not sure how major the
engineering really is, but with no one stepping up to do the work,
I can't see letting this item block the release indefinitely.

  * autovacuum can run rebuild_database_list unreasonably often

 -- A possible quick workaround would be to put a lower limit of naptime 
 at 1s.

I'm assuming that Alvaro can fix this in a reasonable way as soon as he
gets a little time to spend on it.

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-03 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Josh Berkus wrote:
 * Path separator consistency on Windows
 
 -- This discussion does not appear to have concluded.  Magnus, Dave?

 We've lived with this for a long time, it's nothing new. And it's not
 critical. Bump to 8.5.

This is partly my fault, since I objected to the proposed refactoring
of path.c but didn't get around to offering a concrete alternative.
But I agree it's noncritical and a bit too late for 8.4.

I moved the entry to TODO, but maybe it should be on the next CommitFest
page instead?  There were proposed patches submitted.

regards, tom lane



Re: [HACKERS] It's June 1; do you know where your release is?

2009-06-03 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
 Josh Berkus wrote:
 * Path separator consistency on Windows

 -- This discussion does not appear to have concluded.  Magnus, Dave?
 
 We've lived with this for a long time, it's nothing new. And it's not
 critical. Bump to 8.5.
 
 This is partly my fault, since I objected to the proposed refactoring
 of path.c but didn't get around to offering a concrete alternative.
 But I agree it's noncritical and a bit too late for 8.4.
 
 I moved the entry to TODO, but maybe it should be on the next CommitFest
 page instead?  There were proposed patches submitted.

CommitFest sounds best to me.


-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] [GENERAL] trouble with to_char('L')

2009-06-03 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
 Tom Lane wrote:
 * This seems to be assuming that the user has set LC_MONETARY and
 LC_NUMERIC the same.  What if they're different?

 Strictly speaking, they should be handled individually.

I thought about this some more, and I wonder why you did it like this at
all.  The patch claimed to be copying the LC_TIME code, but the LC_TIME
code isn't trying to temporarily change any locale settings.  What we
are doing in that code is assuming that the system will give us back
the localized strings in the encoding identified by CP_ACP; so all we
have to do is convert CP_ACP to wide chars and then to UTF8.  Can't we
use a similar approach for the output of localeconv?

regards, tom lane



Re: [HACKERS] 8.4b2 tsearch2 strange error

2009-06-03 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 STATEMENT:  SELECT msg_sid, hdr_from, hdr_to, hdr_subject, msg_date, 
 folder_id, msgnum FROM msginfo LEFT JOIN msg_folderinfo USING (msg_sid) WHERE 
 plainto_tsquery(E'a') @@ body_index
 NOTICE:  text-search query contains only stop words or doesn't contain 
 lexemes, ignored
 ERROR:  unrecognized operator: -50

I found the cause of this one --- the new code for estimating the
selectivity of @@ was not guarding against the possibility that the
given TSQuery is empty, as it is in this example.

I'm still unable to reproduce the tuple offset out of range errors.

regards, tom lane



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Gregory Stark
Kevin Grittner kevin.gritt...@wicourts.gov writes:

 Andres Freund and...@anarazel.de wrote: 
  
 long plan times (minutes and up)
  
 Wow.  I thought I had some pretty complex queries, including some
 which join using several views, each of which has several joins; but
 I've never gone to multiple seconds on plan time (much less multiple
 minutes!) without very high statistics targets and many indexes on the
 tables.  Any rough estimates on those?

My money's still on very large statistics targets. If you have a lot of
columns and 1,000-element arrays for each column that can get big pretty
quickly.

But that doesn't explain the bitmap ops being important. Hm. Actually having a
lot of columns and then joining a lot of tables could mean having fairly large
bitmapsets.



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Tom Lane
Gregory Stark st...@enterprisedb.com writes:
 But that doesn't explain the bitmap ops being important. Hm. Actually
 having a lot of columns and then joining a lot of tables could mean
 having fairly large bitmapsets.

Yeah, but then you have a lot of *other* expensive operations too,
such as the aforementioned statistics-pushing.  It's still pretty
mystifying why bitmapsets would be eating a noticeable fraction
of the total.

regards, tom lane



Re: [HACKERS] PostgreSQL Developer meeting minutes up

2009-06-03 Thread Ron Mayer
Robert Haas wrote:
 But I
 wonder if it would make more sense to include some kind of metadata in
 the commit message (or some other property of the commit?  does git
 support that?) to make it not depend on that.

From elsewhere in this thread[1], 'The git cherry-pick ... -x flag adds
a note to the commit comment describing the relationship between the commits.'

If the commit on the main branch had this message
=
   added a line on the main branch
=
The commit on the cherry picked branch will have this comment
=
   added a line on the main branch
(cherry picked from commit 189ef03b4f4ed5078328f7965c7bfecce318490d)
=
where the big hex string identifies the comment on the other branch.


[1] http://archives.postgresql.org/pgsql-hackers/2009-06/msg00191.php





Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Kevin Grittner
Gregory Stark st...@enterprisedb.com wrote: 
 
 My money's still on very large statistics targets. If you have a lot
 of columns and 1,000-element arrays for each column that can get big
 pretty quickly.
 
I'm finding that even the ones that had a plan time in the range of
260 ms go down to 15 ms to 85 ms once the statistics are cached.  I
wonder if the long run time is because it's having to read statistics
multiple times because they don't fit in cache?  Like with really wide
values?  Would the wider bitmap type help with that situation in any
way?
 
-Kevin




Re: [HACKERS] Managing multiple branches in git

2009-06-03 Thread Aidan Van Dyk

My last post on the git issue...  If anyone wants to ask specific
questions, feel free to e-mail me directly...  But this thread has
digressed into way too much hand-waving...

If any of you are not familiar with git and want to get an overview of
it, this might be a good place to start:
http://excess.org/article/2008/07/ogre-git-tutorial/

It was a presentation Bart did here in Ottawa for a group of local
Ruby enthusiasts, not necessarily aimed at kernel or C hackers.

a.
-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] list_head naming conflict gcc 4.2/perl/solaris

2009-06-03 Thread Zdenek Kotala

Zdenek Kotala wrote on Mon 2009-06-01 at 22:45 +0200:
 Tom Lane wrote on Mon 2009-06-01 at 16:09 -0400:
  Zdenek Kotala zdenek.kot...@sun.com writes:
 
 
  What is sys/list.h, and why is it being imported by the Perl headers?
 
 It seems that the problem is with Perl. It includes sys/mode.h. The new
 change for gcc 4.2 is that mode.h includes vnode.h, which finally pulls in
 sys/list.h, a generic list for kernel structures.
 
  Can we do something like #define list_head solaris_list_head around 
  the problematic #include?  (Which one is that, anyway?)
 
 I will try it. There is the following hack:

 CFLAGS=-D_SYS_VNODE_H

 which disables vnode.h and breaks the include chain.

 I need to install fresh nightly build of Solaris to test it on live
 system.

Perl includes sys/mode.h, which is probably not necessary (it looks like
it is something for AIX), but it seems that the sys/*.h header
modifications break more than only PostgreSQL, and the kernel guys should
fix them. Which means no change for PostgreSQL. Only those who use
OpenSolaris devel build 116 could have a problem. I will skip these
build(s) on my buildfarm animals.

Zdenek






Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Robert Haas
On Wed, Jun 3, 2009 at 3:18 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Gregory Stark st...@enterprisedb.com wrote:

 My money's still on very large statistics targets. If you have a lot
 of columns and 1,000-element arrays for each column that can get big
 pretty quickly.

 I'm finding that even the ones that had a plan time in the range of
 260 ms go down to 15 ms to 85 ms once the statistics are cached.  I
 wonder if the long run time is because it's having to read statistics
 multiple times because they don't fit in cache?  Like with really wide
 values?  Would the wider bitmap type help with that situation in any
 way?

 -Kevin

I had some performance results back when we were last looking at
default_statistics_target that indicated that the time to repeatedly
decompress a toasted statistics array contributed significantly to the
total planning time, but my suggestion to disable compression for
pg_statistic was summarily poo-poohed for reasons that still aren't
quite clear to me.  When you say "don't fit in cache", exactly what
cache are you talking about?  It seems to me that the statistics
should be far smaller than the underlying tables, so if even your
statistics don't fit in shared buffers (let alone main memory), it
doesn't really matter how long your query takes to plan because it
will probably take literally forever to execute.  How many tables
would you have to be joining to get a GB of statistics, even with dst
= 1000?  A few hundred?

...Robert



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 When you say "don't fit in cache", exactly what
 cache are you talking about?  It seems to me that the statistics
 should be far smaller than the underlying tables, so if even your
 statistics don't fit in shared buffers (let alone main memory), it
 doesn't really matter how long your query takes to plan because it
 will probably take literally forever to execute.  How many tables
 would you have to be joining to get a GB of statistics, even with
 dst = 1000?  A few hundred?
 
Since he can't share the schema, and hasn't even given much of a hint,
I don't know whether one (or more) of the columns is a bytea filled
with 100 MB values; and I don't remember any description of the
hardware environment either.  Since the behavior seems so
out-of-the-ordinary, I was casting about for possible extraordinary
characteristics of his environment which might cause it.  I'm probably
way off base
 
-Kevin



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Emmanuel Cecchet

Tom Lane wrote:

True, but the problem is that the tuple might still be live to (some
snapshots in) that transaction, so we can't inject a duplicate tuple
without risking confusing it.  In this particular case that isn't an
issue because the transaction is done executing, but the tqual.c
rules don't know that.
  
Please excuse my ignorance. I am not sure I get how the tuple could 
still be live to some snapshots after the transaction has prepared. What 
could still happen to objects that were only visible to a transaction 
after it has prepared? An example would definitely help.
Is it possible in Postgres for a transaction to see an object that was 
created inside another transaction before it commits (assuming at least 
'read committed' of course)?


Thanks again,
Emmanuel



[HACKERS] Improving the ngettext() patch

2009-06-03 Thread Tom Lane
After looking through the current uses of ngettext(), I think that it
wouldn't be too difficult to modify the patch to address the concerns
I had about it.  What I propose doing is to add an additional elog.h
function

errmsg_plural(const char *fmt_singular, const char *fmt_plural,
  unsigned long n, ...)

and replace the current errmsg(ngettext(...)) calls with this.
Similarly add errdetail_plural to replace errdetail(ngettext(...)).
(We could also add errhint_plural and so on, but right offhand these
seem unlikely to be useful.)  The advantage of doing this is that
we avoid double translation and eliminate the current kluge whereby
usages in PL code have to be different from usages anywhere else.

I don't feel a need to touch the usages in client programs (pg_dump and
so on).  In principle the double-translation risk still exists there,
but it seems much less likely to be a real hazard because any one client
program has a *far* smaller pool of translatable messages than the
backend does.  Also, there's only one active text domain in a client
program, so the problem of needing to use dngettext in special cases
doesn't exist.

There are a few usages of ngettext() in the backend that are not tied
to ereport calls, but I think they can be left as-is.  There's no
double-translation risk, and with so few of them I don't see much of
a risk of wrongly copying the usage in PL code, either.


Also: one thought that came to me while looking at the existing usages
is that there are several places that are plural-ized that seem
completely pointless; why are we making our translators work
harder on them?  For example

ereport(ERROR,
(errcode(ERRCODE_TOO_MANY_ARGUMENTS),
errmsg(ngettext("functions cannot have more than %d argument",
                "functions cannot have more than %d arguments",
                FUNC_MAX_ARGS),
       FUNC_MAX_ARGS)));

It seems extremely far-fetched that FUNC_MAX_ARGS would ever be small
enough that it would make any language's special cases kick in.  Or
how about this one:

#if 0
write_msg(modulename, ngettext("read %lu byte into lookahead buffer\n",
                               "read %lu bytes into lookahead buffer\n",
                               AH->lookaheadLen),
          (unsigned long) AH->lookaheadLen);
#endif

I'm not sure why this debug support is still there at all, but surely
it's a crummy candidate for making translators sweat over.  So I'd like
to revert these.

Comments, objections?

regards, tom lane



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Tom Lane
Emmanuel Cecchet m...@frogthinker.org writes:
 Tom Lane wrote:
 True, but the problem is that the tuple might still be live to (some
 snapshots in) that transaction, so we can't inject a duplicate tuple
 without risking confusing it.  In this particular case that isn't an
 issue because the transaction is done executing, but the tqual.c
 rules don't know that.

 Please excuse my ignorance. I am not sure I get how the tuple could 
 still be live to some snapshots after the transaction has prepared.

Well, it couldn't be because there are no snapshots in that transaction
anymore.  The problem is that the *other* transaction doesn't have a
good way to know that.  It just sees an open transaction with
conflicting unique-index changes.

regards, tom lane



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Alvaro Herrera
Emmanuel Cecchet wrote:
 Tom Lane wrote:
 True, but the problem is that the tuple might still be live to (some
 snapshots in) that transaction, so we can't inject a duplicate tuple
 without risking confusing it.  In this particular case that isn't an
 issue because the transaction is done executing, but the tqual.c
 rules don't know that.
   
 Please excuse my ignorance. I am not sure I get how the tuple could  
 still be live to some snapshots after the transaction has prepared. What  
 could still happen to objects that were only visible to a transaction  
 after it has prepared? An example would definitely help.

The classic example is having an open cursor that references the table.
(In this case it doesn't work, but the snapshot management module and
tqual.c don't know it.)  If you want this to work you need to improve
those modules, and who knows what else ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Postgres delays function returning large set of data

2009-06-03 Thread Pavel Stehule
Hello

can you send source code? There are two types of C SRF functions. One
returns rows row by row; the second stores the returned rows and returns
them as one block. Which one did you use?

ignore the comments (they are in Czech) and look at the code -
http://www.postgres.cz/index.php/Iter%C3%A1tor_pole

regards
Pavel Stehule

2009/6/3 flippo00110001 dde...@umich.edu:

 Hello,

 First some background:
 For my application I need to collect and retrieve data at a very fast rate,
 faster than postgres can produce, but I also have the need to store and
 access a large range of metadata about the large sets of data I am storing.
 I am therefore attempting to use postgres for the metadata in combination
 with the HDF5 filesystem for the large sets of data to store data fast and
 have all of the power of a relational database over the metadata.

 Now onto the problem:
 To combine the two sets of data I have created a C function which can be
 dynamically loaded into postgres and takes a primary key index into the data
 and returns a large table (around 3 elements) of the data (value and
 timestamp).  When I access the data directly it takes me around 30ms to just
 read the data out of my test case.  As the postgres C function is called
 many times and I must convert the raw data to tuples it introduces more
 overhead bringing the data to take around 2000ms to run the testcase, this
 is still acceptable.  My problem is when I perform an actual query over my
 test case postgres takes around 22000 ms to retrieve the sets of data.  My
 concern is: what is postgres doing in these extra 20 seconds, and how can I
 make this run faster?

 In summary:
 I have a large testcase.
 I am running a C function from postgres which takes a primary key and
 returns a setof (int4, int4) with around 3 tuples returned per call
 my c function, when timed over the duration of the test case, takes 2
 seconds to run.
 the postgres query "select * from function(args);" takes around 11 times
 this amount of time to execute

 my function is defined as
 create or replace function(IN args, IN..., OUT name1 int4, OUT name2 int4)
 returns setof record as 'libname.so', 'fcnname' LANGUAGE C IMMUTABLE STRICT;

 I'm sorry if this is the wrong forum, I was torn between hackers and
 performance.
 Thank you for any assistance.

 --
 View this message in context: 
 http://www.nabble.com/Postgres-delays-function-returning-large-set-of-data-tp23853886p23853886.html
 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Since he can't share the schema, and hasn't even given much of a hint,
 I don't know whether one (or more) of the columns is a bytea filled
 with 100 MB values; and I don't remember any description of the
 hardware environment either.  Since the behavior seems so
 out-of-the-ordinary, I was casting about for possible extraordinary
 characteristics of his environment which might cause it.  I'm probably
 way off base

There's a hard-wired restriction in analyze.c that makes it discard data
values wider than 1KB on sight.  So no such value will ever be found in
a statistics array.  You could still have a few meg in a pg_statistic
row, I suppose, but not a really horrendous amount.

regards, tom lane



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Jun 3, 2009 at 3:18 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 I'm finding that even the ones that had a plan time in the range of
 260 ms go down to 15 ms to 85 ms once the statistics are cached.

 I had some performance results back when we were last looking at
 default_statistics_target that indicated that the time to repeatedly
 decompress a toasted statistics array contributed significantly to the
 total planning time, but my suggestion to disable compression for
 pg_statistic was summarily poo-poohed for reasons that still aren't
 quite clear to me.

Well, smaller is better.  Kevin's example demonstrates that it's good
to have the stats sucked into cache.  If they were uncompressed then
less of them would fit in however much cache space you have, and
whatever CPU savings you get would be lost to more I/O to read in
stats entries.

Of course, this all depends on total database size vs total RAM,
but that's how I'd interpret the observation.  PG is still mostly
optimized for databases bigger than RAM, so this decision still
makes sense.

(I think you could try marking the columns of pg_statistic as "don't
compress" if you have a DB you want to optimize for all-in-memory
behavior.)

regards, tom lane



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Emmanuel Cecchet

Tom Lane wrote:
Emmanuel Cecchet m...@frogthinker.org writes:
Tom Lane wrote:
True, but the problem is that the tuple might still be live to (some
snapshots in) that transaction, so we can't inject a duplicate tuple
without risking confusing it.  In this particular case that isn't an
issue because the transaction is done executing, but the tqual.c
rules don't know that.

Please excuse my ignorance. I am not sure I get how the tuple could
still be live to some snapshots after the transaction has prepared.

Well, it couldn't be because there are no snapshots in that transaction
anymore.  The problem is that the *other* transaction doesn't have a
good way to know that.  It just sees an open transaction with
conflicting unique-index changes.

But when the transaction prepares, we know that. What would prevent us
from doing at prepare time the same cleanup that commit does?


regards, manu (indentation (C) tom lane)



Re: [HACKERS] Locks on temp table and PREPARE

2009-06-03 Thread Tom Lane
Emmanuel Cecchet m...@frogthinker.org writes:
 But when the transaction prepares, we know that. What would prevent us 
 from doing at prepare time the same cleanup that commit does?

The entire point of PREPARE is that it's *not* committed yet.

regards, tom lane



Re: [HACKERS] 8.4b2 tsearch2 strange error

2009-06-03 Thread Tatsuo Ishii
 Hmm.  I can confirm the "unrecognized operator" failure (though I get
 8 not -50 ... uninitialized memory?).  The "tuple offset out of range"
 failures are not happening here though.  Perhaps it is locale dependent?
 What locale are you using, and what is your default_text_search_config
 setting?  Are you using any nondefault configuration settings?

From my previous post:
 Ubuntu 8.04
 ./configure --prefix=/usr/local/pgsql84
 initdb -E UTF-8 --no-locale /path/to/database

So locale is C. I believe we are not customizing
default_text_search_config. I can't think of any nondefault
configuration settings.
--
Tatsuo Ishii
SRA OSS, Inc. Japan



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Andres Freund

On 06/03/2009 07:05 PM, Kevin Grittner wrote:

Andres Freund and...@anarazel.de wrote:


long plan times (minutes and up)


Wow.  I thought I had some pretty complex queries, including some
which join using several views, each of which has several joins; but
I've never gone to multiple seconds on plan time (much less multiple
minutes!) without very high statistics targets and many indexes on the
tables.  Any rough estimates on those?
Statistics target is 250. Lowering to 10 lowers the query plan time 
somewhat but not significantly and increases query runtime significantly.


Real dataset is a bit less than 1.5TB without materialized views and a 
bit over 3TB with.

Production machine (old) is a 2xDualcore Xeon 5150, 32gig ram.

Test Dataset is about 15GB. Core2 Duo 2.4Ghz, 4GB ram.

Example query (from which the traces are) on the test dataset (I can't 
simply do a full analyze on the real data):

Stat target 10: 22283.187ms PREPARE
Stat target 1000: 23986.504ms PREPARE

So, no really interesting difference.

For the timings I always PREPARE'ed the query multiple times in a 
transaction to make sure there are no caching effects - a small drop but 
nothing significant.



On the average its about

If you think your patch could have a significant impact on a query
with a 260 ms plan time, I could give it a try.
From what I have seen so far I doubt that it will have a really 
measurable effect on relatively short planning times - if you want to 
try, it's a very simple change:


Just change all the 32s into their 64-bit equivalents in include/nodes/bitmapset.h:
#define BITS_PER_BITMAPWORD 32
typedef uint32 bitmapword;  /* must be an unsigned type */
typedef int32 signedbitmapword; /* must be the matching signed type */


Andres



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Greg Stark
Doesn't that still add up to 3GB for a table's stats in the worst
case? 1kb * 1,000 buckets * 1,500 attributes * 2 (histogram + mfv)

Except you can't actually get 1500 toast pointers on a page. I suppose
with games with nulls you could make this worst case happen though.

It does seem like it ought to be possible to truncate strings in the
histogram since any string between the actual values is equally good.


--
Greg


On 3 Jun 2009, at 22:11, Tom Lane t...@sss.pgh.pa.us wrote:


Kevin Grittner kevin.gritt...@wicourts.gov writes:
Since he can't share the schema, and hasn't even given much of a hint,
I don't know whether one (or more) of the columns is a bytea filled
with 100 MB values; and I don't remember any description of the
hardware environment either.  Since the behavior seems so
out-of-the-ordinary, I was casting about for possible extraordinary
characteristics of his environment which might cause it.  I'm probably
way off base

There's a hard-wired restriction in analyze.c that makes it discard data
values wider than 1KB on sight.  So no such value will ever be found in
a statistics array.  You could still have a few meg in a pg_statistic
row, I suppose, but not a really horrendous amount.

   regards, tom lane




Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Tom Lane
Greg Stark greg.st...@enterprisedb.com writes:
 It does seem like it ought to be possible to truncate strings in the
 histogram since any string between the actual values is equally good.

Yeah, that was the justification for dropping the wide values ---
that and the theory that they'd be unlikely to be most-common values.

regards, tom lane




Re: [HACKERS] 8.4b2 tsearch2 strange error

2009-06-03 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 initdb -E UTF-8 --no-locale /path/to/database

 So locale is C. I believe we are not customizing
 default_text_search_config. I can't think of any nondefault
 configuration settings.

Huh.  So that should be the same case I tested here.  I wonder why
I can't duplicate it.  Can anyone else get the failure?

regards, tom lane



Re: [HACKERS] 8.4b2 tsearch2 strange error

2009-06-03 Thread Oleg Bartunov

On Wed, 3 Jun 2009, Tom Lane wrote:


Tatsuo Ishii is...@postgresql.org writes:

initdb -E UTF-8 --no-locale /path/to/database



So locale is C. I believe we are not customizing
default_text_search_config. I can't think of any nondefault
configuration settings.


Huh.  So that should be the same case I tested here.  I wonder why
I can't duplicate it.  Can anyone else get the failure?


I also can't reproduce the problems reported, except the one you just fixed.
My system is Linux, locale ru_RU.UTF-8.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


