Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
[already sent a variant of that yesterday but it doesn't look like it
made it to the list]

Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Apparently we've made the planner a bit too optimistic about the savings
 that can be expected from repeated indexscans occurring on the inside of
 a join.
 
 effective_cache_size was set to 10GB(my fault for copying over the conf
 from a 16GB box) during the run - lowering it just a few megabytes(!) or
 to a more realistic 6GB results in the following MUCH better plan:
 http://www.kaltenbrunner.cc/files/dbt3_explain_analyze2.txt
 
 Interesting.  It used to be that effective_cache_size wasn't all that
 critical... what I think this report is showing is that with the 8.2
 changes to try to account for caching effects in repeated indexscans,
 we've turned that into a pretty significant parameter.

took me a while due to hardware issues on my testbox - but there are new
results(with 6GB for effective_cache_size) up at:

http://www.kaltenbrunner.cc/files/5/

there are still a few issues with the validity of the run like the rf
tests not actually being done right - but lowering effective_cache_size
gave a dramtic speedup on Q5,Q7 and Q8.

that is the explain for the 4h+ Q9:

http://www.kaltenbrunner.cc/files/analyze_q9.txt

increasing the the statistic_target up to 1000 does not seem to change
the plan btw.

disabling nested loop leads to the following (4 times faster) plan:

http://www.kaltenbrunner.cc/files/analyze_q9_no_nest.txt

since the hash-joins in there look rather slow (inappropriate hashtable
set up due to the wrong estimates?) I disabled hash_joins too:

http://www.kaltenbrunner.cc/files/analyze_q9_no_nest_no_hashjoin.txt

and amazingly this plan is by far the fastest one in runtime (15min vs
4,5h ...) except that the planner thinks it is 20 times more expensive ...


Stefan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Opinion about macro for the uuid datatype.

2006-09-17 Thread Gevik Babakhani
Understood, Thank you :)

On Sat, 2006-09-16 at 21:23 -0400, Tom Lane wrote:
 Gevik Babakhani [EMAIL PROTECTED] writes:
  I was wondering if I should go ahead and add a macro datatype like the
  SERIAL, only this time for the uuid.
 
 This assumes a fact not in evidence, which is that we're going to accept
 a uuid-generation function as part of core.  AFAIK the only reasonably
 non-contentious part of this proposal is the ability to *store* uuids.
 Generating new ones introduces a host of portability and other issues.
 
 Considering the amount of pain involved in supporting SERIAL in the
 parser, pg_dump, etc, I'd say that adding the above is a pretty certain
 route to getting your patch rejected as too invasive.  If, three or four
 versions down the road, large numbers of people are using uuid with the
 same generation function, *then* it might be time to think about
 introducing a macro type.
 
   regards, tom lane
 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Bitmap index status

2006-09-17 Thread Jie Zhang
Hi Heikki and all,

I just sent the latest bitmap index patch to the list. I am not sure if
there is any size limit for this mailing list. If you have received my
previous email, please let me know.

Thanks,
Jie

On 9/12/06 2:43 AM, Heikki Linnakangas [EMAIL PROTECTED] wrote:

 Hi,
 
 What's the status of the bitmap index patch? Have you worked on it since
 the last posted patch
 (http://archives.postgresql.org/pgsql-patches/2006-08/msg3.php)?
 
 I've started to review it, to get it into CVS early in the 8.3 cycle. I
 just want to make sure that I'm working on the latest version.
 
 Beside the issues already discussed, I found two minor bugs:
 * pg_am says that bitmap am supports unique indexes, while it doesn't.
 Second,
 * race condition in _bitmap_inserttuple if two backends try to insert
 the same, new value. If they both find that there's no lov item for the
 key, and try to create one, one backend will get a duplicate key error
 on the lov index.
 
 Also, vacuum actually does a reindex, which seems awfully wasteful. That
 needs to be looked at.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [HACKERS] Timezone List

2006-09-17 Thread Joachim Wieland
On Sat, Sep 16, 2006 at 04:19:48PM -0400, Tom Lane wrote:
 I've applied this along with some extra work to get it to show GMT
 offsets and DST status, which should be useful for helping people
 to choose which setting they want.  This effectively obsoletes
 Table B-5 as well as B-4 in the SGML docs ... we should probably
 remove both of those in favor of recommending people look at the
 views.

 I did the basic documentation work in catalogs.sgml for these views,
 but Appendix B still needs an update.  Joachim, you were going to do
 that, right?

Thats right.


Joachim

-- 
Joachim Wieland  [EMAIL PROTECTED]
   GPG key available

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Martijn van Oosterhout
On Sat, Sep 16, 2006 at 09:15:24PM -0400, Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
  Gregory Stark wrote:
  A TODO list people can freely add stuff to is precisely what would make it
  useful. It would have things we don't already know.
 
  I am just going to hope that you are kidding about this one.
 
 Fortunately, none of the real developers would have to pay any attention
 to any such page ... and you can bet they wouldn't.

Well, there is a reason why I put a big label there Unofficial TODO
List. I tried to make it clear that it's not an official stance of the
project.

If someone wants to spend an afternoon putting up a coherent
description of their wishlist item complete with possible problems and
solutions, then I don't see why we should stop them. The page someone
has put up covering XML told me more about the current state of XML
support in postgres than a few hours of archive searching would.

It's just not official, that doesn't make it any less useful.

Two points I'm not clear about on this thread though:

1. Authorized user: is that someone with an account, or someone who has
been authorized by someone else?

2. I can see the official todo list being in CVS, which gives it all
the access protection it needs. A wiki todo list can stay where it is,
just that it's not official.

[I've just made a reference to the TODO list in CVS from the wiki, that
should help].

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Reducing data type space usage

2006-09-17 Thread Martijn van Oosterhout
On Sat, Sep 16, 2006 at 08:56:11PM +0100, Gregory Stark wrote:

[Re inet and cidr]

 Why are these varlena? Just for ipv6 addresses? Is the network mask length not
 stored if it's not present? This gives us a strange corner case in that ipv4
 addresses will *always* fit in the smallfoo data type and ipv6 *never* fit.
 Ie, we'll essentially end up with an ipv4inet and an ipv6inet. Sad in a way.

Eh? Either will always fit. ipv6 is 128 *bits* and the new varlena
header goes to 128 *bytes*. That should fit easily, no?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Reducing data type space usage

2006-09-17 Thread Gregory Stark
Martijn van Oosterhout kleptog@svana.org writes:

 On Sat, Sep 16, 2006 at 08:56:11PM +0100, Gregory Stark wrote:

 [Re inet and cidr]

 Why are these varlena? Just for ipv6 addresses? Is the network mask length 
 not
 stored if it's not present? This gives us a strange corner case in that ipv4
 addresses will *always* fit in the smallfoo data type and ipv6 *never* fit.
 Ie, we'll essentially end up with an ipv4inet and an ipv6inet. Sad in a way.

 Eh? Either will always fit. ipv6 is 128 *bits* and the new varlena
 header goes to 128 *bytes*. That should fit easily, no?

Er, oops.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
Stefan Kaltenbrunner wrote:
 [already sent a variant of that yesterday but it doesn't look like it
 made it to the list]
 
 Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Apparently we've made the planner a bit too optimistic about the savings
 that can be expected from repeated indexscans occurring on the inside of
 a join.
 effective_cache_size was set to 10GB(my fault for copying over the conf
 from a 16GB box) during the run - lowering it just a few megabytes(!) or
 to a more realistic 6GB results in the following MUCH better plan:
 http://www.kaltenbrunner.cc/files/dbt3_explain_analyze2.txt
 Interesting.  It used to be that effective_cache_size wasn't all that
 critical... what I think this report is showing is that with the 8.2
 changes to try to account for caching effects in repeated indexscans,
 we've turned that into a pretty significant parameter.
 
 took me a while due to hardware issues on my testbox - but there are new
 results(with 6GB for effective_cache_size) up at:
 
 http://www.kaltenbrunner.cc/files/5/
 
 there are still a few issues with the validity of the run like the rf
 tests not actually being done right - but lowering effective_cache_size
 gave a dramtic speedup on Q5,Q7 and Q8.
 
 that is the explain for the 4h+ Q9:
 
 http://www.kaltenbrunner.cc/files/analyze_q9.txt
 
 increasing the the statistic_target up to 1000 does not seem to change
 the plan btw.
 
 disabling nested loop leads to the following (4 times faster) plan:
 
 http://www.kaltenbrunner.cc/files/analyze_q9_no_nest.txt
 
 since the hash-joins in there look rather slow (inappropriate hashtable
 set up due to the wrong estimates?) I disabled hash_joins too:
 
 http://www.kaltenbrunner.cc/files/analyze_q9_no_nest_no_hashjoin.txt
 
 and amazingly this plan is by far the fastest one in runtime (15min vs
 4,5h ...) except that the planner thinks it is 20 times more expensive ...

some additional numbers(first one is with default settings, second is
with enable_nestloop = 'off', third one is with enable_nestloop = 'off'
and enable_hashjoin='off'):

http://www.kaltenbrunner.cc/files/analyze_q7.txt

here we have a 3x speedup with disabling nested loops and a 2x speedup
(over the original plan) with nested loops and hashjoins disabled.


http://www.kaltenbrunner.cc/files/analyze_q20.txt

here we have a 180x(!) speedup with both disabled planner options ...

it is worth mentioning that for both queries the estimated costs in
relation to each other looks quite reasonable as soon as enable_nestloop
= 'off' (ie 5042928 vs 10715247 with 344sec vs 514 for Q7 and 101441851
vs 101445468 with 10sec vs 11sec)


Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Magnus Hagander
 The wiki has been sitting there for two weeks and hasn't had 
 any problems.

Uh, you mean apart from the fact that it took very little time (days,
IIRC) before we had people writing attempts at user documentation,
somthing that we already have *two* different systems (interactive docs
+ new techdocs) for, and specifically said we absolutely did not want on
this wiki? IIRC, that got on there long before *any* content related to
what was actually supposed to be there..


 It's already getting more attention and updates than the 
 techdocs wiki which still has articles up from 2001 that are 
 no longer relevant and in some cases are actively misleading.

It's in the process of being cleaned up, mainly by Robert Treat. I'm
sure he'd appreciate help.

Why would *this* wiki be less suceptible to the same kind of issues than
the old one? That's more an argument that we *will* have this problem on
the wiki.


//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Magnus Hagander
 Techdocs is a different problem all together. Josh has 
 already mentioned some problems with it. I can mention more.

[warning: thread hi-jack]


 1. It isn't easy to login

Really? You're kidding, right? You click a link that requires login, and
you get a browser login box. How much easier can it be?

 2. It is even harder to create a login

Again, really? If yo uget the login prompt and hit cancel (or just login
with an invalid password), that says you need a community login. If you
don't hav eone, click here to read about it. If you click here, you
get to the page where you sign up.

Now, explaining this process on the frontpage of the techdocs part of
the site might not be a bad idea at all (in fact, it's a good idea :-P),
but do you honestly think the process is complex? If so, what should we
do to make it easier?

 3. There is no creation of login for most people because they 
 don't know they have to go to the community portion of the 
 www site to get to it.

See above, you don't need to do this.


 I am sure their are other problems on the inside, I haven't 
 actually ever logged in ;)

You should, we'd like to know about them so we can fix them.


//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Magnus Hagander
 Two points I'm not clear about on this thread though:
 
 1. Authorized user: is that someone with an account, or 
 someone who has been authorized by someone else?

IIRC, the idea was someone with an account. Basically you add a (very
very small) hurdle so you only get the people who actually *care* to
write things. But if you do care, it's not a lot of work. You also get
traceability, so you can talk to whomever wrote a certain thing.

I don't see any gain in having someone specifically authorize who can
write to it.

//Magnus


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Martijn van Oosterhout
On Sun, Sep 17, 2006 at 03:09:29PM +0200, Magnus Hagander wrote:
  The wiki has been sitting there for two weeks and hasn't had 
  any problems.
 
 Uh, you mean apart from the fact that it took very little time (days,
 IIRC) before we had people writing attempts at user documentation,

snip

Really? Where was that? Did it get deleted in the meantime? Who's
responsible for that kind of thing?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Magnus Hagander
   The wiki has been sitting there for two weeks and hasn't had any 
   problems.
  
  Uh, you mean apart from the fact that it took very little 
 time (days,
  IIRC) before we had people writing attempts at user documentation,
 
 snip
 
 Really? Where was that? Did it get deleted in the meantime? 
 Who's responsible for that kind of thing?

Yes.
Dave took it off when he moved the wiki to it's correct place (being
developer.postgresql.org)

AFAIK, nobody has stepped up to actually take *responsibility* for
maintaining the wiki - both software and content-wise. But I may have
missed something while I speed-read some lists after getting back.

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Joshua D. Drake



1. Authorized user: is that someone with an account, or someone who has
been authorized by someone else?


In my mind it is someone who without threw a process of email 
confirmation. Just to help stave off the amount of trolling that may happen.


Joshua D. Drake




2. I can see the official todo list being in CVS, which gives it all
the access protection it needs. A wiki todo list can stay where it is,
just that it's not official.

[I've just made a reference to the TODO list in CVS from the wiki, that
should help].

Have a nice day,



--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Joshua D. Drake



1. It isn't easy to login


Really? You're kidding, right? You click a link that requires login, and
you get a browser login box. How much easier can it be?


What URL are you talking about?

http://www.postgresql.org/docs/techdocs

Where do I click login? Where do I click create account? Where do I 
click to login?



2. It is even harder to create a login


Again, really? If yo uget the login prompt and hit cancel (or just login
with an invalid password), that says you need a community login. If you
don't hav eone, click here to read about it. If you click here, you
get to the page where you sign up.


You are corrent, now that I have tried it.

If I click edit, and then cancel because I don't have a login I get a 
page that tells me:


* Login required

* Accessing this resource requires a community login. If you don't have 
* one, you can read about it here. To try again, just press your

* browsers Refresh button.

Which pretty much goes against how every other site in the world does 
it. I shouldn't have to throw an exception to perform the correct behavior.


That page that tells me where to login should come *BEFORE* I get a 
login prompt.




Now, explaining this process on the frontpage of the techdocs part of
the site might not be a bad idea at all (in fact, it's a good idea :-P),
but do you honestly think the process is complex? If so, what should we
do to make it easier?


Let me rephrase. It is not complex, it is not standard. Which makes it 
confusing.


What I expect is this:

Open web browser
Go to techdocs

Either the first thing I see is,

 * You are not logged in, if you wish to edit content click here to 
login or create an account.


 * When I click edit the above happens.

3. There is no creation of login for most people because they 
don't know they have to go to the community portion of the 
www site to get to it.


See above, you don't need to do this.



You are correct but most people are going to be confused. They are going 
to click edit, see a login/password they don't have and move on. Heck I 
probably have hit cancel before and didn't even read the text after.


Why?

Because the text after a login failure or cancel when using httpd auth 
is almost ALWAYS telling me I need a correct login. Not giving a link to 
login.


Sincerely,

Joshua D. Drake

--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Joshua D. Drake

Magnus Hagander wrote:

Two points I'm not clear about on this thread though:

1. Authorized user: is that someone with an account, or 
someone who has been authorized by someone else?


IIRC, the idea was someone with an account. Basically you add a (very
very small) hurdle so you only get the people who actually *care* to
write things. But if you do care, it's not a lot of work. You also get
traceability, so you can talk to whomever wrote a certain thing.

I don't see any gain in having someone specifically authorize who can
write to it.


Yeah I would agree. My idea was just that people would actually create 
an account and be email confirmed.


Joshua D. Drake




//Magnus


---(end of broadcast)---
TIP 6: explain analyze is your friend




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Gregory Stark
Joshua D. Drake [EMAIL PROTECTED] writes:

 1. Authorized user: is that someone with an account, or someone who has
 been authorized by someone else?

 In my mind it is someone who without threw a process of email confirmation.
 Just to help stave off the amount of trolling that may happen.

I guess it depends on whether you feel the larger of the project's problems is
too many people trying to help who must be stopped before they do something
that may need to be corrected or too few people getting past the natural
barriers to being able to contribute.

The former would be a great problem to have but I don't see any evidence of
it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Magnus Hagander
  1. It isn't easy to login
  
  Really? You're kidding, right? You click a link that 
 requires login, 
  and you get a browser login box. How much easier can it be?
 
 What URL are you talking about?
 
 http://www.postgresql.org/docs/techdocs

Yes.

 Where do I click login? Where do I click create account? 
 Where do I click to login?

You click to do whatever you want to do - say edit a page or create a
new page. When you do, you will be asked to log in. There is no point in
asking you to log in when you don't need it (such as for reading pages)


 You are corrent, now that I have tried it.
 
 If I click edit, and then cancel because I don't have a login 
 I get a page that tells me:
 
 * Login required
 
 * Accessing this resource requires a community login. If you 
 don't have
 * one, you can read about it here. To try again, just press your
 * browsers Refresh button.
 
 Which pretty much goes against how every other site in the 
 world does it. I shouldn't have to throw an exception to 
 perform the correct behavior.

No, that's correct. But in *normal* access, you just get the login
prompt and you go for it. The usability issue is definitly with the
signup though - do you think it'd be enough to just add a blurb about it
on the first page of techdocs?


 That page that tells me where to login should come *BEFORE* I 
 get a login prompt.

Here, we clearly disagree, I think. If you mean a system like pgFoundry,
where you find where you want to go and edit something (say a tracker),
then you have to specifically go log in (because you never remember to
do that when you get there in the first place - or you may have received
the link in email), at which point you are promptly sent off to a
completely different page than the one you wanted to edit...


 Let me rephrase. It is not complex, it is not standard. Which 
 makes it confusing.
 
 What I expect is this:
 
 Open web browser
 Go to techdocs
 
 Either the first thing I see is,
 
   * You are not logged in, if you wish to edit content click 
 here to login or create an account.
 
   * When I click edit the above happens.
Depends on whose standard you look at, I guess. This is how most
proper sites work, IMHO. There are a whole lot of sucky sites out
there, though :-P

Therere is anothe rproblem with that one - it does not scale. It
requires every pgae to be dynamic and look if you are logged in. 


 Why?
 
 Because the text after a login failure or cancel when using 
 httpd auth is almost ALWAYS telling me I need a correct 
 login. Not giving a link to login.

Yes, this is definitly a problem.

again, you think it'd be enough to stick it o nthe frontpage of
techdocs, or do we need a small blurb on every page next to the edit
links?

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Dave Page


-Original Message-
From: Magnus Hagander [EMAIL PROTECTED]
To: Martijn van Oosterhout kleptog@svana.org
Cc: Gregory Stark [EMAIL PROTECTED]; Josh Berkus josh@agliodbs.com; 
[EMAIL PROTECTED] [EMAIL PROTECTED]; Neil Conway [EMAIL PROTECTED]; 
Dave Page dpage@vale-housing.co.uk; pgsql-hackers@postgresql.org 
pgsql-hackers@postgresql.org
Sent: 17/09/06 14:22
Subject: RE: [pgsql-www] [HACKERS] Developer's Wiki

 Dave took it off when he moved the wiki to it's correct place (being
developer.postgresql.org)

I left it there, but un-linked from the frontpage in this case - but that was 
mainly because Devrim had clearly put a lot of effort in and I don't know if he 
has a copy.

/D

AFAIK, nobody has stepped up to actually take *responsibility* for
maintaining the wiki - both software and content-wise. But I may have
missed something while I speed-read some lists after getting back.

//Magnus



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Timezone List

2006-09-17 Thread Andrew - Supernews
On 2006-09-17, Tom Lane [EMAIL PROTECTED] wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Any view over the full timezone names should also include the
 corresponding data from zone.tab in the timezone library source.

 Just noticed this mail, so that's not included in my patch.

 BTW, now that the view is in, I can't help noticing that it shows 550
 different zone names, while there are only 392 entries in the zone.tab
 file.  I conclude that the zic people don't take maintenance of zone.tab
 very seriously, and hence that we probably shouldn't rely on it.

You're jumping to conclusions there.

Eliminating the alias names (i.e. anything not in the form Continent/*),
we get only 45 differences, of which 40 are backward-compatibility aliases
(see the backward source file). The remaining five are:

 Asia/Riyadh87
 Asia/Riyadh88
 Asia/Riyadh89   (the three Riyadh?? zones are local solar time for specific
  years)
 Asia/Istanbul   (alias for Europe/Istanbul)
 Europe/Nicosia  (alias for Asia/Nicosia)

So the list in zone.tab _is_ complete, it just doesn't list aliases (which
it isn't supposed to). The reason to include zone.tab in with the data
(unlike the current setup which doesn't bother to even install the file
anywhere) is to provide an answer to the question what timezone(s) are
applicable to a specific country. For that purpose aliases are irrelevent.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-www] [HACKERS] Developer`s Wiki

2006-09-17 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


 AFAIK, nobody has stepped up to actually take *responsibility* for
 maintaining the wiki - both software and content-wise. But I may have
 missed something while I speed-read some lists after getting back.

I'm responsible for the software. I'll certainly help out with the
content when I can as well, but it seems we are still tying to hash
out some ground rules at the moment.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200609171112
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-BEGIN PGP SIGNATURE-

iD8DBQFFDWXnvJuQZxSWSsgRAiy2AJ0eBq4uvb4V74WY/Ym8CZjxzGfnGACeN67+
ww7Qj91OCyk39MoidcqhDnA=
=MiZH
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Joshua D. Drake

Gregory Stark wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:


1. Authorized user: is that someone with an account, or someone who has
been authorized by someone else?

In my mind it is someone who without threw a process of email confirmation.
Just to help stave off the amount of trolling that may happen.


I guess it depends on whether you feel the larger of the project's problems is
too many people trying to help who must be stopped before they do something
that may need to be corrected or too few people getting past the natural
barriers to being able to contribute.


That is a good point. I see it as more of a problem with crap content 
that could occur and thus good content won't.


Joshua D. Drake




The former would be a great problem to have but I don't see any evidence of
it.




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Md.Abdul Aziz

On Sat, 16 Sep 2006, Josh Berkus wrote:
Hi,


Greg,


I think the lessons of wikipedia is precisely that you *don't* want to add
such barriers. You want to let people add stuff pretty much freely. That
encourages people to get involved and put up information.


The other lesson of Wikipedia is that maintaining wiki quality for a generally
editable wiki requires a full-time dedicated staff.   We don't even have any
volunteers who have 4 hours/week to commit to cleaning up the wiki, unless
you're volunteering.

Then it will need not be a wiki, just make a website.




This is *particularly* true of the TODO stuff.  We simply don't want Joe User
adding their personal wishlist to the TODOs, and that's exactly what will
happen if the TODO list is world-writable.  TODOs should be items which have
been hashed out here on the Hackers list, and the wiki page should list the
specification which is the general consensus.

If we had a user documentation wiki, then *that* should be world-editable,
but again that would require community volunteers to dedicate to cleaning it
up.  The developer wiki is by and for actual contributors.




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Sat, Sep 16, 2006 at 09:15:24PM -0400, Tom Lane wrote:
 Fortunately, none of the real developers would have to pay any attention
 to any such page ... and you can bet they wouldn't.

 If someone wants to spend an afternoon putting up a coherent
 description of their wishlist item complete with possible problems and
 solutions, then I don't see why we should stop them.

Because if they're willing to put any actual effort into it, the right
way is to post that same item to the mailing list where it can be
discussed.  If it survives such discussion (very possibly in a modified
form) *then* it belongs on a TODO list.  The first problem with a wiki
TODO is that it will not reflect any sort of community consensus, only
the opinions of whoever edited the page last.  The second problem is
that setting it up represents a unilateral attempt to redefine (bypass?)
the community's design/development process, which is a process that has
served us well for many years and is not showing any signs of being
broken.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] tiny patch to make vacuumdb -a's database order match pg_dumpall

2006-09-17 Thread Tom Lane
Dan Thomas [EMAIL PROTECTED] writes:
 I've been having trouble running vacuumdb -a and pg_dumpall
 concurrently because they run through the databases in a different
 order (so dumpall was getting stuck behind vacuum's lock, and my
 firewall was rather unhelpfully closing the idle connection).

Um, whaddya mean dumpall was getting stuck behind vacuum's lock?
A plain vacuum doesn't take any locks that would block pg_dump.

While the proposed patch looks harmless enough, I'm unconvinced that
it will solve your problem, or even quite what the problem is.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Andrew Dunstan



Tom Lane wrote:

Martijn van Oosterhout kleptog@svana.org writes:
  

On Sat, Sep 16, 2006 at 09:15:24PM -0400, Tom Lane wrote:


Fortunately, none of the real developers would have to pay any attention
to any such page ... and you can bet they wouldn't.
  


  

If someone wants to spend an afternoon putting up a coherent
description of their wishlist item complete with possible problems and
solutions, then I don't see why we should stop them.



Because if they're willing to put any actual effort into it, the right
way is to post that same item to the mailing list where it can be
discussed.  If it survives such discussion (very possibly in a modified
form) *then* it belongs on a TODO list.  The first problem with a wiki
TODO is that it will not reflect any sort of community consensus, only
the opinions of whoever edited the page last.  The second problem is
that setting it up represents a unilateral attempt to redefine (bypass?)
the community's design/development process, which is a process that has
served us well for many years and is not showing any signs of being
broken.


  


I agree with lots of this.

Being slightly more abstract, we are grappling with a couple of 
different kinds of objects here: discussions and decisions. The mailing 
list is a very good way of having a discussion, and a wiki is IMNSHO a 
poor substitute. Ditto, bulletin board, web forum, blog .  The 
reason is simply that with a mailing list all you need is a subscription 
to get the info delivered to you in a medium everybody uses. It's push, 
not pull, and that's very appealing. Any other mechanism requires the 
user to seek the location of the discussion actively to some degree. 
Conversely, the very unstructured nature of the mailing list(s) makes 
them a poor medium for capturing decisions. That's why some of us have 
advocated use of a tracker to capture decisions about development 
directions, because the TODO list doesn't seem appropriate. But an open 
wiki would be a horrible substitute for the TODO list - it would turn it 
from a list that reflects at least some discussion and consensus into a 
mere wish list of no authority whatsoever. IOW, it is the exact opposite 
of the direction I believe we should be headed.


I use wikis in my work as a good way of capturing all sorts of 
information I want to keep. But I have generally found them to be less 
than successful as a way of capturing discussions or developing coherent 
bodies of technical information and decisions. Comparisons have been 
made with WikiPedia - they are inappropriate. Quite apart from anything 
else Wikipedia survives through the work of a huge team of editors who 
review the work of contributors. And they still run into trouble. We 
don't have the resources and we don't need the fights. So let's not go 
there.


The only good purpose I can see for a developer wiki is as a place to 
publish information that is too large for the mailing lists. Currently 
we provide web and other space for a few users - a wiki would allow us 
to provide publishing facilities in a central spot for a significantly 
wider group of people, with very little cost.


Tom proposed a modest roadmap type experiment a week or so ago. I'd like 
to see that pursued. After all, we know of some things that are at least 
at first cut stage for 8.3, and a few things high on may people's 
agenda. I'd also like to see some work done on using a tracker (for 
features as well as bugs). The rest of what's been talked about strikes 
me as wasted effort, to be honest. We seem to be running in a few 
directions which look like dead ends to me. Let's pick one or two 
strategically, and follow those instead.


cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 that is the explain for the 4h+ Q9:
 http://www.kaltenbrunner.cc/files/analyze_q9.txt

The big problem there seems to be the drastic misestimation of the
number of rows matching the p_name ~~ '%ghost%' condition.  What does
pg_stats have for the p_name column?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Joshua D. Drake


Tom proposed a modest roadmap type experiment a week or so ago. I'd like 
to see that pursued. After all, we know of some things that are at least 
at first cut stage for 8.3, and a few things high on may people's 
agenda. I'd also like to see some work done on using a tracker (for 
features as well as bugs). The rest of what's been talked about strikes 
me as wasted effort, to be honest. We seem to be running in a few 
directions which look like dead ends to me. Let's pick one or two 
strategically, and follow those instead.


There are a couple of people helping me with pgbugs.commandprompt.com. 
We could always use a couple more.


Joshua D. Drake




cheers

andrew




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 that is the explain for the 4h+ Q9:
 http://www.kaltenbrunner.cc/files/analyze_q9.txt
 
 The big problem there seems to be the drastic misestimation of the
 number of rows matching the p_name ~~ '%ghost%' condition.  What does
 pg_stats have for the p_name column?

http://www.kaltenbrunner.cc/files/pg_stat_p_name.txt


Stefan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 some additional numbers(first one is with default settings, second is
 with enable_nestloop = 'off', third one is with enable_nestloop = 'off'
 and enable_hashjoin='off'):

 http://www.kaltenbrunner.cc/files/analyze_q7.txt

I'm inclined to think you still have effective_cache_size set too high;
or at least that the planner is being too optimistic about how much
cache space is actually available to each indexscan.

With the code as it currently stands, effective_cache_size has some of
the same properties as work_mem: the planner effectively assumes that
that much space is available to *each* indexscan, and so you'd need to
de-rate the setting based on the complexity of queries and the number of
concurrent sessions.

I'm not sure what we could do about the concurrent-sessions issue, but
we could make some sort of attack on the query complexity issue by
pro-rating the effective_cache_size among all the tables used by a
query.


 http://www.kaltenbrunner.cc/files/analyze_q20.txt
 here we have a 180x(!) speedup with both disabled planner options ...

There's something awfully bogus about that one --- how is it that the
aggregate subplan, with the exact same plan and same number of
executions in all three cases, has an actual runtime 200x more in the
first case?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] tiny patch to make vacuumdb -a's database order match pg_dumpall

2006-09-17 Thread Dan Thomas

Um, whaddya mean dumpall was getting stuck behind vacuum's lock?
A plain vacuum doesn't take any locks that would block pg_dump.


Dammit, just looked and the setup I originally encountered the problem
on and tracked it down to the vacuum process, and it is indeed set up
to perform a full vacuum.. I've incorrectly assumed the problem I'm
having now (with a normal vac) was for the same reason.


While the proposed patch looks harmless enough, I'm unconvinced that
it will solve your problem, or even quite what the problem is.


Yes, sorry about that, it does indeed appear that whatever is causing
my dumpall process to die isn't PG's fault.

Though I still think it makes a *bit* of sense to have vacuumdb use
the same order as pg_dumpall (clusterdb too now I think about it),
it's obviously not as much of an issue as I originally thought, and
not the source of my problem, which is a shame :)

Dan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The big problem there seems to be the drastic misestimation of the
 number of rows matching the p_name ~~ '%ghost%' condition.  What does
 pg_stats have for the p_name column?

 http://www.kaltenbrunner.cc/files/pg_stat_p_name.txt

Hmm ... pattern_sel already applies the operator directly to the
most_common_vals, but in this situation those aren't common enough
to help much.  With such an extensive histogram it is awfully tempting
to assume that the histogram members are a representative sample, and
take the selectivity as being the fraction of histogram entries that
match the pattern.  Maybe drop the first and last histogram entries
on the grounds they're probably outliers.  Thoughts?  What would be a
reasonable minimum histogram size to enable using this approach instead
of the guess-on-the-basis-of-the-pattern code?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 some additional numbers(first one is with default settings, second is
 with enable_nestloop = 'off', third one is with enable_nestloop = 'off'
 and enable_hashjoin='off'):
 
 http://www.kaltenbrunner.cc/files/analyze_q7.txt
 
 I'm inclined to think you still have effective_cache_size set too high;
 or at least that the planner is being too optimistic about how much
 cache space is actually available to each indexscan.

I have long term external monitoring on that server and it indeed shows
that that there was never less then about 5.8G of buffercache used (or
more then 2.2GB used by other means). So 6G might still be a bit on the
optimistic side but it is not actually that far of from reality.
I will redo with lower settings - do you have any suggestions for that ?

 
 With the code as it currently stands, effective_cache_size has some of
 the same properties as work_mem: the planner effectively assumes that
 that much space is available to *each* indexscan, and so you'd need to
 de-rate the setting based on the complexity of queries and the number of
 concurrent sessions.

concurrency is 1 here - there is never more than a single query running
in parallel in those tests.

 
 I'm not sure what we could do about the concurrent-sessions issue, but
 we could make some sort of attack on the query complexity issue by
 pro-rating the effective_cache_size among all the tables used by a
 query.

hmm not sure i understand what you mean here :-(

 
 
 http://www.kaltenbrunner.cc/files/analyze_q20.txt
 here we have a 180x(!) speedup with both disabled planner options ...
 
 There's something awfully bogus about that one --- how is it that the
 aggregate subplan, with the exact same plan and same number of
 executions in all three cases, has an actual runtime 200x more in the
 first case?

hmm - good question. I will redo those in a bit ...


Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] -HEAD planner issue wrt hash_joins on dbt3 ?

2006-09-17 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'm not sure what we could do about the concurrent-sessions issue, but
 we could make some sort of attack on the query complexity issue by
 pro-rating the effective_cache_size among all the tables used by a
 query.

 hmm not sure i understand what you mean here :-(

Per the comment for index_pages_fetched:

 * We assume that effective_cache_size is the total number of buffer pages
 * available for both table and index, and pro-rate that space between the
 * table and index.  (Ideally other_pages should include all the other
 * tables and indexes used by the query too; but we don't have a good way
 * to get that number here.)

A first-order approximation to this would be to add up the total sizes
of all the other tables used in the query.  I am thinking of leaving out
other indexes, mainly because we can't tell at this level which other
indexes are actually gonna get used.  This would tend to underestimate
by leaving out indexes, but not by a lot if you assume indexes are much
smaller than their tables.  It would also be an overestimate because
tables that are not indexscanned concurrently with the one under
consideration probably shouldn't be counted anyway.  So one might hope
these effects would more or less cancel out.  Anyway it seems to be a
better idea than what we have now.

 I will redo with lower settings - do you have any suggestions for that ?

Try reducing effective_cache_size to maybe a fourth of what it is now.
If that helps the thing pick better plans for these multi-table queries,
then we should try changing the other_pages calculation as above.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] One of our own begins a new life

2006-09-17 Thread Enver ALTIN
On Fri, 2006-09-15 at 09:38 -0700, Joshua D. Drake wrote:
 Hello,

Hi Joshua,

 Yeah, this is a cross post and it is slightly off topic but IMHO this is 
 important.
 
 Tomorrow one of our own, Devrim Gunduz is becoming a man. He is sucking 
 it up, and committing to the cvs repo of project marriage.
 
 May the patches reviewers be kind to him!
 
 Congratz Devrim, have a good honey moon and we look forward to having 
 you back in a couple of weeks!

I have done a quick post-commit review of his recent patch, it looked
(very) good to me :) I'd also like to raise the point of how good he's
on pretending to be a belly dancer.
-- 
Enver


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] One of our own begins a new life

2006-09-17 Thread Tomi NA

On 9/15/06, Joshua D. Drake [EMAIL PROTECTED] wrote:

Hello,

Yeah, this is a cross post and it is slightly off topic but IMHO this is
important.

Tomorrow one of our own, Devrim Gunduz is becoming a man. He is sucking
it up, and committing to the cvs repo of project marriage.

May the patches reviewers be kind to him!

Congratz Devrim, have a good honey moon and we look forward to having
you back in a couple of weeks!


The kind of cross post off topic post that's always welcome! :)

t.n.a.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] 8.2 beta blockers

2006-09-17 Thread Tom Lane
I see the following items standing between us and putting out 8.2beta1:

* Set client encoding based on OS environment - Peter E.

I'm not sure whether Peter is intending to complete this item for 8.2
or not, but if it's to be done it ought to be done before we start beta.

* The contrib/userlock replacement issue

We have three possible choices for this: do nothing, install a
bug-compatible, allegedly-clean-room implementation in contrib:
http://archives.postgresql.org/pgsql-patches/2006-09/msg00077.php
or put a hopefully-cleaner design into core, eg per my suggestions here:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00467.php
I favor the third alternative, mainly because by changing the API
we remove all doubt as to whether any intellectual property
remains from the original GPL'd code.  However, we've got to make up
our minds and get on with it.

* AIX linking issues

This isn't necessarily a beta-stopper, but it'd be nice to get it done
so we can be sure that any beta testing done on AIX will test the
revisions.

* Documentation: VALUES-list docs and release notes

Must-do so that beta testers will know what there is to test.


There are other open items on Bruce's list but none of them seem to
me to be must-fix-before-beta.

I do not know the status of the client-encoding change, but I think we
could get all the other items done tomorrow.  I propose that we try to
wrap these things and ship beta1 Tuesday or Wednesday.  Sitting around
waiting is not a productive use of time, especially when it seems that
many people's attention has already moved on to 8.3 development ideas.

Personally I'm willing to commit to making the VALUES-list docs and
userlock replacement code happen tomorrow.  Bruce seems to be close
on the release notes, and if the other two items aren't ready, well,
beta1 can ship without 'em.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.2 beta blockers

2006-09-17 Thread Christopher Browne
A long time ago, in a galaxy far, far away, [EMAIL PROTECTED] (Tom Lane) wrote:
 I see the following items standing between us and putting out 8.2beta1:
 * AIX linking issues

 This isn't necessarily a beta-stopper, but it'd be nice to get it done
 so we can be sure that any beta testing done on AIX will test the
 revisions.

This has to do with the discussion about shared vs static libs?  

If a patch is put out, it'll get build-farm results on AIX within a
couple hours.  If there's something worth trying, then try away...
-- 
select 'cbbrowne' || '@' || 'gmail.com';
http://cbbrowne.com/info/wp.html
When I met th'POPE back in '58, I scrubbed him with a MILD SOAP or
DETERGENT for 15 minutes.  He seemed to enjoy it ...

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [pgsql-advocacy] One of our own begins a new life

2006-09-17 Thread Jonah H. Harris

On 9/15/06, Joshua D. Drake [EMAIL PROTECTED] wrote:

Tomorrow one of our own, Devrim Gunduz is becoming a man. He is sucking
it up, and committing to the cvs repo of project marriage.


Congratulations Devrim!

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 8.2 beta blockers

2006-09-17 Thread James William Pye
On Sun, Sep 17, 2006 at 07:38:38PM -0400, Tom Lane wrote:
 We have three possible choices for this: do nothing, install a
 bug-compatible, allegedly-clean-room implementation in contrib:
 http://archives.postgresql.org/pgsql-patches/2006-09/msg00077.php
 or put a hopefully-cleaner design into core, eg per my suggestions here:
 http://archives.postgresql.org/pgsql-hackers/2006-09/msg00467.php
 I favor the third alternative, mainly because by changing the API
 we remove all doubt as to whether any intellectual property
 remains from the original GPL'd code.  However, we've got to make up
 our minds and get on with it.

FWIW, I'm +1 on the cleaner design you suggested. While I understand the
concerns of adding features/API this late; as a user, I'd rather not wait
another year to have these available in core(yes, I know alternative measures
would exist if it did not make it into core, but the convenience of having it
there would certainly be nice). That is, I really like the waiting variant.
It is something that I would use. The lack thereof(IIRC) in the current contrib
implementation is something that I have recently lamented about.

I understand that want is not a reason to compromise the feature freeze, so I
hope the legal concerns Tom mentions will be enough. =)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] 8.2 beta blockers

2006-09-17 Thread Andrew - Supernews
On 2006-09-18, James William Pye [EMAIL PROTECTED] wrote:
 FWIW, I'm +1 on the cleaner design you suggested. While I understand the
 concerns of adding features/API this late;

Adding features is one thing, breaking existing users of the code is another.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] Interesting CREATE TABLE AS misbehavior

2006-09-17 Thread Tom Lane
regression=# select * from int8_tbl union all select * from int8_tbl order by 
q2;
q1|q2 
--+---
 4567890123456789 | -4567890123456789
 4567890123456789 | -4567890123456789
 4567890123456789 |   123
 4567890123456789 |   123
  123 |   456
  123 |   456
 4567890123456789 |  4567890123456789
  123 |  4567890123456789
 4567890123456789 |  4567890123456789
  123 |  4567890123456789
(10 rows)

regression=# create table fooey(z1,z2) as select * from int8_tbl union all 
select * from int8_tbl order by q2;
ERROR:  column q2 does not exist
LINE 1: ... from int8_tbl union all select * from int8_tbl order by q2;
^

The reason for this behavior is that analyze.c attaches the CREATE TABLE
AS aliases to the targetlist (via applyColumnNames) before it processes
the ORDER BY clause.  (So, order by z2 works instead.)  This seems
like a bug: one would expect that the aliases do not change the
semantics of the SELECT part of the command.

Interestingly, 7.2 seems to get it right, the misbehavior appears in
7.3 and later.

I'm inclined to fix this in HEAD but not back-patch it, on the grounds
that there might be apps out there expecting the existing behavior,
and it's not a big enough deal to change behavior in a minor release.
Thoughts?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.2 beta blockers

2006-09-17 Thread Tom Lane
Christopher Browne [EMAIL PROTECTED] writes:
 A long time ago, in a galaxy far, far away, [EMAIL PROTECTED] (Tom Lane) 
 wrote:
 I see the following items standing between us and putting out 8.2beta1:
 * AIX linking issues

 This has to do with the discussion about shared vs static libs?  

 If a patch is put out, it'll get build-farm results on AIX within a
 couple hours.

Yeah, I know, which is why I don't find it absolutely critical that
this make it to beta1.  But one of the concerns mentioned in the thread
is that the changes might break things for older AIX versions.  If we
get it into beta1, we have a better chance of finding out before release
whether there are any issues with AIX versions that aren't represented
in buildfarm.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 8.2 beta blockers

2006-09-17 Thread Tom Lane
Andrew - Supernews [EMAIL PROTECTED] writes:
 On 2006-09-18, James William Pye [EMAIL PROTECTED] wrote:
 FWIW, I'm +1 on the cleaner design you suggested. While I understand the
 concerns of adding features/API this late;

 Adding features is one thing, breaking existing users of the code is another.

We already have an established solution for people who need backwards
compatibility: download the old userlock code from pgfoundry.  I'm
unclear why we should duplicate that project within core.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] polite request about syntax

2006-09-17 Thread Jim Nasby

On Sep 15, 2006, at 11:32 PM, Jeremy Drake wrote:
When I was first dealing with postgres, I found it extremely  
annoying that
I had to type out double precision rather than just double  
since every

sane programming language (as well as Java) uses double.  I eventually
figured out that it was because double precision is the standard  
name, but
I don't like to type and although I know I could use float8, I am  
used to

typing double.

I have found the same thing with the type timestamp without time  
zone.

The verbosity of type names seems rather extreme.  But it is just not
important enough to warrant me creating a domain or anything to do
anything about it, it just slightly irks me every time I have to type
them.


Luckily, it's pretty easy to create a 'double' type on your own,  
either as a DOMAIN or as it's own type. Same with datetime, etc.  
(BTW, timestamptz works great as a replacement for timestamp with  
time zone).


Going one step further, you could also create a compatibility package  
and put it on pgFoundry. In fact, I believe one already exists for  
MySQL. I'm sure users would love to see ones for other popular  
databases.


As for $$ quoting, I do think it would be nice if the docs adopted  
the standard of


CREATE FUNCTION my_function ... $my_function$
...
$my_function$

While more verbose than $$, it does a lot to help code readability.
--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Mid cycle release?

2006-09-17 Thread Jim Nasby

On Sep 16, 2006, at 9:31 PM, Tom Lane wrote:

Jim C. Nasby [EMAIL PROTECTED] writes:

Then we should change autovacuum so that it stays out of the way when
tables are being vacuumed frequently enough via an external means.


What makes you think it doesn't do that already?  Of course, it has  
its

own ideas about what frequently enough is, but it won't re-vacuum a
table that's been vacuumed within that interval.


Oh, I'd forgotten that autovac looked at manual vacuums too.


While I personally don't really want autovac on in my development
environment, I find it hard to deny the argument that it ought to be
on by default.  If you know enough to set up a cron job to do your own
vacuuming schedule, you *certainly* know enough to turn off autovac
if you don't want it, or better dial it down to the point where it's
just an emergency backstop for your cron job.  If you don't know  
enough

to turn off autovac, then you need it on.


And while we're certainly not MS Access, it's worthwhile to make  
things easy for newbies when it doesn't get in the way of the pros.  
It's certainly not hard to disable autovac, and anyone who's actually  
tuning an install is going to be tweaking stuff in postgresql.conf  
anyway...

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] New version of money type

2006-09-17 Thread Jim Nasby

On Sep 16, 2006, at 7:31 PM, Gregory Stark wrote:
Would that pose indexing issues?  It would also mean that when  
joining two
tables you'd have to handle some interesting type  conversion  
issues (at
times).  We had someone accidentally create a  largish table with  
userid as
numeric and other tables are bigint,  it was disastrous for  
performance

(joining).  I'd imagine that if the  above wasn't done cleverly, that
performance problem would be repeated.


That used to be a problem but Tom solved it a little while back.  
Not a perfect
solution in that it requires lots of cross-data-type operators as  
the number

of data types grows but it works.

In any case I think Jim was suggesting this be handled internally  
to the
numeric data type which wouldn't cause this problem. However I'm  
not sure
anything has to be done. A numeric is an array of 16 bit integers,  
so anything

under 64k *is* stored just as an integer.


Yes, I definitely meant for this to be internal-only... end users  
shouldn't notice any difference (except hopefully improved performance).


If all the math is done in 64k chunks then this might not be as big a  
help. Numbers between 2^16 and 2^64 (or 2^32 on some platforms) would  
probably be faster, but how much faster isn't clear. Perhaps the OP  
could do some testing if someone came up with a quick and dirty  
prototype patch.


Well, just an integer plus a useless exponent. I think it would be  
a neat
trick to normalize the exponent to the end of the last element of  
the mantissa

rather than the first digit so that integers don't need an exponent.


How would that help? If I'm understanding correctly you're just  
talking about storing how many places after the decimal instead of  
how many in front of it?

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Unique index: update error

2006-09-17 Thread Jim Nasby

On Sep 14, 2006, at 9:16 PM, Golden Liu wrote:

Suppose there are too tuples in a table t, named
id
---
1
2

and there is a unique index on id. Now we do an update on table t
update t set id=id+1

Since PG executes the update one tuple at a time, it updates tuple 1
to 2 and insert it into the index. Before insert into the index, it
check whether the id is still unique or not. No, it's not, old tuple
2 is still in the table. So an error is raised.

I try to solve this problem this way:
First, update the table t but DON'T update the index.
Next, find all the tuples updated by this command and insert them into
the unique index.


Isn't that what a deferred constraint normally does?

I suspect that your change adds a non-trivial overhead, which means  
we don't want it to be the normal case.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] relation cache statistics (was: -HEAD planner issue wrt hash_joins on dbt3 ?)

2006-09-17 Thread Jim C. Nasby
On Sun, Sep 17, 2006 at 04:18:36PM -0400, Tom Lane wrote:
  * table and index.  (Ideally other_pages should include all the other
  * tables and indexes used by the query too; but we don't have a good way
  * to get that number here.)
 
 A first-order approximation to this would be to add up the total sizes
 of all the other tables used in the query.  I am thinking of leaving out
 other indexes, mainly because we can't tell at this level which other
 indexes are actually gonna get used.  This would tend to underestimate
 by leaving out indexes, but not by a lot if you assume indexes are much
 smaller than their tables.  It would also be an overestimate because
 tables that are not indexscanned concurrently with the one under
 consideration probably shouldn't be counted anyway.  So one might hope
 these effects would more or less cancel out.  Anyway it seems to be a
 better idea than what we have now.

I think it'd be better to attack this problem from the other side;
namely looking at what's actually cached. Sadly, I don't think there's
any way to actually query the OS for info about what it has buffered,
but we can look at what's in shared_buffers and assume that it's a
reasonable proxy for the OS's cache. Something like...

relBufPages / shared_buffers * effective_cache_size

should give us a decent idea of what percentage of a relation will be in
cache somewhere. (relBufPages is the number of pages the relation in
question has in the buffer).

Of course, that raises the question of how to track how many pages are
in shared buffers for a relation. Given the criticality of locking
there, we probably don't want to update that info in real-time, but for
this application it's probably OK to just scan through the buffer every
X period of time (maybe after X number of pages read into the buffers).
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Unique index: update error

2006-09-17 Thread Tom Lane
Jim Nasby [EMAIL PROTECTED] writes:
 On Sep 14, 2006, at 9:16 PM, Golden Liu wrote:
 I try to solve this problem this way:
 First, update the table t but DON'T update the index.
 Next, find all the tuples updated by this command and insert them into
 the unique index.

 I suspect that your change adds a non-trivial overhead, which means  
 we don't want it to be the normal case.

There's a bigger problem:

begin;
update tab set col1 = ... where unique_key = ...;
update tab set col2 = ... where unique_key = ...;
commit;

If the first update doesn't insert index entries into unique_key's
index, then the second update won't find the tuples it needs to update
(unless we hack the planner to not trust the index as valid ... and
then it'd fall back on a seqscan, which is hardly acceptable anyway).

The scheme that I've thought about involves inserting index entries as
usual, but instead of having the aminsert code error out immediately
upon finding a duplicate, have it make an entry in a list of things
that need to be rechecked before commit.  This wins as long as potential
conflicts are uncommon.  Performance could suck if the list gets too
large --- but we have more or less the same hazard now for foreign-key
checks, and it mostly works well enough.  (In fact, maybe the existing
deferred trigger event list is the thing to use for the deferred
conflict rechecks.)

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] One of our own begins a new life

2006-09-17 Thread Jim C. Nasby
On Mon, Sep 18, 2006 at 12:29:56AM +0300, Enver ALTIN wrote:
 On Fri, 2006-09-15 at 09:38 -0700, Joshua D. Drake wrote:
  Hello,
 
 Hi Joshua,
 
  Yeah, this is a cross post and it is slightly off topic but IMHO this is 
  important.
  
  Tomorrow one of our own, Devrim Gunduz is becoming a man. He is sucking 
  it up, and committing to the cvs repo of project marriage.
  
  May the patches reviewers be kind to him!
  
  Congratz Devrim, have a good honey moon and we look forward to having 
  you back in a couple of weeks!
 
 I have done a quick post-commit review of his recent patch, it looked
 (very) good to me :) I'd also like to raise the point of how good he's
 on pretending to be a belly dancer.

Just be careful... this is one patch we don't want to be reviewing too
closely. ;P

Congrats Devrim!
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] relation cache statistics (was: -HEAD planner issue wrt hash_joins on dbt3 ?)

2006-09-17 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 I think it'd be better to attack this problem from the other side;
 namely looking at what's actually cached.

You can kiss goodbye to plan stability if you go that route... and
in any case I doubt the assumption that what's in shared buffers is
representative of what's in kernel cache.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] 8.2 beta blockers

2006-09-17 Thread Jim C. Nasby
On Sun, Sep 17, 2006 at 07:38:38PM -0400, Tom Lane wrote:
 * The contrib/userlock replacement issue
 
 We have three possible choices for this: do nothing, install a
 bug-compatible, allegedly-clean-room implementation in contrib:
 http://archives.postgresql.org/pgsql-patches/2006-09/msg00077.php
 or put a hopefully-cleaner design into core, eg per my suggestions here:
 http://archives.postgresql.org/pgsql-hackers/2006-09/msg00467.php
 I favor the third alternative, mainly because by changing the API
 we remove all doubt as to whether any intellectual property
 remains from the original GPL'd code.  However, we've got to make up
 our minds and get on with it.
 
One problem I see with userlock is that you're asking for lock ID
conflicts unless you control everything on the system that's using
userlocks. Right now, doing so probably isn't that difficult, but my
hope is that as PostgreSQL becomes more acceptable we'll see more
software written to use it... software that might need to play well with
other software within the same database.

I think the best way to fix this is to set aside some range of the lock
ID space for a naming scheme that is based on schemas, or some other
kind of text identifier. I think this really just amounts to putting
some recommended practices into the docs; perhaps not using any
int4,int4 IDs where the first int is larger than 16k or whatever the
maximum system OID is. That would allow for segmenting locks based on
schema OIDs. Down the road we might want a nicer, schema-based
interface, but just setting aside the space should be fine for now.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Interesting CREATE TABLE AS misbehavior

2006-09-17 Thread Jim C. Nasby
On Sun, Sep 17, 2006 at 10:10:52PM -0400, Tom Lane wrote:
 I'm inclined to fix this in HEAD but not back-patch it, on the grounds
 that there might be apps out there expecting the existing behavior,
 and it's not a big enough deal to change behavior in a minor release.
 Thoughts?

Isn't the existing behavior that you can't do something (ie, throws an
error)? Doesn't seem like much would break to fix it. Then again, I
can't remember anyone complaining about this, so it's probably not worth
the effort.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Jim C. Nasby
On Sun, Sep 17, 2006 at 12:32:13PM -0700, Joshua D. Drake wrote:
 
 Tom proposed a modest roadmap type experiment a week or so ago. I'd like 
 to see that pursued. After all, we know of some things that are at least 
 at first cut stage for 8.3, and a few things high on may people's 
 agenda. I'd also like to see some work done on using a tracker (for 
 features as well as bugs). The rest of what's been talked about strikes 
 me as wasted effort, to be honest. We seem to be running in a few 
 directions which look like dead ends to me. Let's pick one or two 
 strategically, and follow those instead.
 
 There are a couple of people helping me with pgbugs.commandprompt.com. 
 We could always use a couple more.

Sorry if I missed an email, but help doing what? Are we actively trying
to do something with that besides just play around with it (I'm already
pretty well-aware of bugzilla's capabilities...)
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] relation cache statistics (was: -HEAD planner issue wrt hash_joins on dbt3 ?)

2006-09-17 Thread Jim C. Nasby
On Mon, Sep 18, 2006 at 12:20:10AM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  I think it'd be better to attack this problem from the other side;
  namely looking at what's actually cached.
 
 You can kiss goodbye to plan stability if you go that route... and
 in any case I doubt the assumption that what's in shared buffers is
 representative of what's in kernel cache.

Well, there's 2 issues with caching:

1) Is something we're going to want actually in cache right now?
2) If we need to read something more than once (ie: higher level btree
pages), what are the odds it will still be in cache when we come around
to it the next time.

Once the caches are warmed up, looking at what's actually in them would
give you a very good probability for #1. I suspect that for large
relations, shared_buffers would also match the OS cache pretty well in
most cases, almost certainly better than whatever estimate we're using
now. But I'm not sure how useful that info is to the planner.

For #2 we'd have to know what kind of pressure the caches are under to
replace pages and have some kind of idea how frequently the system is
hitting them. The pg_statio info might be useful there, though
unfortunately in that case I think there's much less likely to be a good
correlation between the two.

If there was *some* way to track stats on page fetches that came out of
the OS cache, I suspect we could make great use of per-relation hit
rates to come up with better plans.
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [pgsql-www] [HACKERS] Developer's Wiki

2006-09-17 Thread Joshua D. Drake

Jim C. Nasby wrote:

On Sun, Sep 17, 2006 at 12:32:13PM -0700, Joshua D. Drake wrote:
Tom proposed a modest roadmap type experiment a week or so ago. I'd like 
to see that pursued. After all, we know of some things that are at least 
at first cut stage for 8.3, and a few things high on may people's 
agenda. I'd also like to see some work done on using a tracker (for 
features as well as bugs). The rest of what's been talked about strikes 
me as wasted effort, to be honest. We seem to be running in a few 
directions which look like dead ends to me. Let's pick one or two 
strategically, and follow those instead.
There are a couple of people helping me with pgbugs.commandprompt.com. 
We could always use a couple more.


Sorry if I missed an email, but help doing what? Are we actively trying
to do something with that besides just play around with it (I'm already
pretty well-aware of bugzilla's capabilities...)


Well yes, we are trying to use it :). If it becomes useful enough, we 
hope that the project as a whole will move to it. If it isn't useful 
enough, then we can say We have actually tried it for the project, it 
didn't work.


Joshua D. Drake


--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org