Re: [HACKERS] Make SIGHUP less painful if pg_hba.conf is not readable

2009-03-05 Thread Joshua Tolley
On Thu, Mar 05, 2009 at 09:47:55AM -0500, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: Yeah, the big question is if we want to backport something like this at all... Thoughts? The issue never even came up before, so I'd vote to not take any risks for it. How often do

[HACKERS] Operators based on non-IMMUTABLE functions

2009-03-05 Thread Joshua Tolley
I've recently run into a problem with a datatype whose operators are based on functions not marked IMMUTABLE. Although there might be good reasons to have such a thing, it seems like it might be a valuable warning message if you create an operator based on a non-IMMUTABLE function. Comments? -

Re: [HACKERS] Make SIGHUP less painful if pg_hba.conf is not readable

2009-03-05 Thread Joshua Tolley
On Thu, Mar 05, 2009 at 08:19:05PM +0100, Magnus Hagander wrote: Peter Eisentraut wrote: On Thursday 05 March 2009 18:04:42 Joshua Tolley wrote: As an aside, is access() adequately portable, ok to use within the backend, etc.? I just sort of took a shot in the dark. Using access

Re: [HACKERS] Make SIGHUP less painful if pg_hba.conf is not readable

2009-03-04 Thread Joshua Tolley
On Wed, Mar 04, 2009 at 10:28:42AM +0100, Magnus Hagander wrote: Joshua Tolley wrote: On Wed, Mar 04, 2009 at 09:43:55AM +0100, Magnus Hagander wrote: So. I've updated the comment, and applied your patch. Thanks! What would it take to get it applied to a few earlier versions as well

[HACKERS] SYNONYMs revisited

2009-03-04 Thread Joshua Tolley
Way back in this thread[1] one of the arguments against allowing some version of CREATE SYNONYM was that we couldn't create a synonym for an object in a remote database. Will the SQL/MED work make this sort of thing a possibility? I realize since it's not standard anyway, there's still a

Re: [HACKERS] SYNONYMs revisited

2009-03-04 Thread Joshua Tolley
On Wed, Mar 04, 2009 at 10:14:41AM -0500, Jonah H. Harris wrote: SQL/MED does support foreign tables, which are basically synonyms for remote tables. Other than that, it has no real similarity to synonym behavior for other database objects such as views, functions, or local

Re: [HACKERS] SYNONYMs revisited

2009-03-04 Thread Joshua Tolley
On Wed, Mar 04, 2009 at 03:15:23PM -0500, Tom Lane wrote: Joshua Tolley eggyk...@gmail.com writes: I didn't mean to suggest that SQL/MED on its own could be used to make SYNONYMs, but rather that given SQL/MED, perhaps we could reconsider some sort of CREATE SYNONYM functionality to go

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-02-26 Thread Joshua Tolley
On Wed, Feb 25, 2009 at 10:24:21PM -0500, Robert Haas wrote: I don't think we're really doing this the right way. EXPLAIN ANALYZE has a measurable effect on the results, and we probably ought to stop the database and drop the VM caches after each query. Are the Z1-Z7 datasets on line

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-02-26 Thread Joshua Tolley
On Thu, Feb 26, 2009 at 08:22:52AM -0500, Robert Haas wrote: On Thu, Feb 26, 2009 at 4:22 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Joshua, in the tests that you've been running, did you have to rig the planner with enable_mergejoin=off or similar, to get the queries

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-02-19 Thread Joshua Tolley
On Wed, Feb 18, 2009 at 11:20:03PM -0500, Robert Haas wrote: On Wed, Jan 7, 2009 at 9:14 AM, Joshua Tolley eggyk...@gmail.com wrote: On Tue, Jan 06, 2009 at 11:49:57PM -0500, Robert Haas wrote: Josh / eggyknap - Can you rerun your performance tests with this version of the patch

Re: [HACKERS] adding stuff to parser, question

2009-02-01 Thread Joshua Tolley
On Sun, Feb 01, 2009 at 12:12:47AM +, Grzegorz Jaskiewicz wrote: On 1 Feb 2009, at 00:05, Joshua Tolley wrote: to add new syntax, you might consider writing a function instead. This function might take parameters such as the privilege to grant and the user to grant it to, and be called

Re: [HACKERS] adding stuff to parser, question

2009-01-31 Thread Joshua Tolley
On Sat, Jan 31, 2009 at 05:40:57PM +, Grzegorz Jaskiewicz wrote: On 31 Jan 2009, at 17:30, Andrew Dunstan wrote: But the syntax you posted does not do this at all. Where does it restrict the grant to a single schema, like the syntax above? I am just starting the attempt here, obviously

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2009-01-07 Thread Joshua Tolley
On Tue, Jan 06, 2009 at 11:49:57PM -0500, Robert Haas wrote: Josh / eggyknap - Can you rerun your performance tests with this version of the patch? ...Robert Will do, as soon as I can.

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-12-23 Thread Joshua Tolley
On Tue, Dec 23, 2008 at 09:22:27AM -0500, Robert Haas wrote: On Tue, Dec 23, 2008 at 2:21 AM, Bryce Cutt pandas...@gmail.com wrote: Because there is no nice way in PostgreSQL (that I know of) to derive a histogram after a join (on an intermediate result) currently using MostCommonValues is

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-12-23 Thread Joshua Tolley
On Tue, Dec 23, 2008 at 10:14:29AM -0500, Robert Haas wrote: It's equivalent to our assumption that distributions of values in columns in the same table are independent. Making that assumption in this case would probably result in occasional dramatic speed improvements similar to the ones

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-12-22 Thread Joshua Tolley
On Sun, Dec 21, 2008 at 10:25:59PM -0500, Robert Haas wrote: [Some performance testing.] I (finally!) have a chance to post my performance testing results... my apologies for the really long delay. Excuses omitted Unfortunately I'm not seeing wonderful speedups with the particular queries I did

Re: [HACKERS] Simple postgresql.conf wizard

2008-11-27 Thread Joshua Tolley
On Thu, Nov 27, 2008 at 05:15:04PM -0500, Robert Haas wrote: A random thought: maybe the reason I'm not seeing any benefit is because my tables are just too small - most contain at most a few thousand rows, and some are much smaller. Maybe default_statistics_target should vary with the table

Re: [HACKERS] Simple postgresql.conf wizard -- Statistics idea...

2008-11-26 Thread Joshua Tolley
On Tue, Nov 25, 2008 at 06:59:25PM -0800, Dann Corbit wrote: I do have a statistics idea/suggestion (possibly useful with some future PostgreSQL 9.x or something): It is a simple matter to calculate lots of interesting univariate summary statistics with a single pass over the data (perhaps

Re: [HACKERS] Patch Review Complete: Multi-Batch Hash Join Improvements

2008-11-18 Thread Joshua Tolley
On Mon, Nov 17, 2008 at 10:42:21PM -0800, Jeff Davis wrote: On Mon, 2008-11-17 at 23:19 -0700, Joshua Tolley wrote: -- it speeds up joins by fairly significant margins in some cases The original claim in the message you cite says 10-50% for some data distributions. Were you able to observe

[HACKERS] Patch Review Complete: Multi-Batch Hash Join Improvements

2008-11-17 Thread Joshua Tolley
Note: this email is effectively a repeat of an email sent earlier to which there has been less response than I expected. If there's something else I'm supposed to do at this point, someone please let me know, because I don't know what it is :) --- I've finished

Re: [HACKERS] Question about SPI_prepare

2008-11-11 Thread Joshua Tolley
On Tue, Nov 11, 2008 at 11:33:41AM -0600, Tim Keitt wrote: I have an application where I am building a plan with SPI_plan and then this plan is called multiple times. There is one free parameter ($1) to the plan. The issue is with the order of the values returned. If $1 is identical during

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-10 Thread Joshua Tolley
On Wed, Nov 05, 2008 at 04:06:11PM -0800, Bryce Cutt wrote: The error is caused by me Asserting against the wrong variable. I never noticed this as I apparently did not have assertions turned on on my development machine. That is fixed now and with the new patch version I have attached all

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-06 Thread Joshua Tolley
On Wed, Nov 5, 2008 at 5:06 PM, Bryce Cutt [EMAIL PROTECTED] wrote: The error is caused by me Asserting against the wrong variable. I never noticed this as I apparently did not have assertions turned on on my development machine. That is fixed now and with the new patch version I have

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-06 Thread Joshua Tolley
On Thu, Nov 6, 2008 at 3:52 PM, Simon Riggs [EMAIL PROTECTED] wrote: On Thu, 2008-11-06 at 15:33 -0700, Joshua Tolley wrote: Stay tuned. Minor question on this patch. AFAICS there is another patch that seems to be aiming at exactly the same use case. Jonah's Bloom filter patch. Shouldn't

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-06 Thread Joshua Tolley
On Thu, Nov 6, 2008 at 5:31 PM, Lawrence, Ramon [EMAIL PROTECTED] wrote: -Original Message- Minor question on this patch. AFAICS there is another patch that seems to be aiming at exactly the same use case. Jonah's Bloom filter patch. Shouldn't we have a dust off to see which one

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-05 Thread Joshua Tolley
On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote: We propose a patch that improves hybrid hash join's performance for large multi-batch joins where the probe relation has skew. I'm running into problems with this patch. It applies cleanly, and the technique you provided for

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-05 Thread Joshua Tolley
On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote: We propose a patch that improves hybrid hash join's performance for large multi-batch joins where the probe relation has skew. I also recommend modifying docs/src/sgml/config.sgml to include the enable_hashjoin_usestatmcvs

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-05 Thread Joshua Tolley
On Wed, Nov 5, 2008 at 8:20 AM, Tom Lane wrote: Joshua Tolley writes: On Mon, Oct 20, 2008 at 03:42:49PM -0700, Lawrence, Ramon wrote: We propose a patch that improves hybrid hash join's performance for large multi-batch joins where the probe

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-05 Thread Joshua Tolley
On Wed, Nov 05, 2008 at 04:06:11PM -0800, Bryce Cutt wrote: The error is caused by me Asserting against the wrong variable. I never noticed this as I apparently did not have assertions turned on on my development machine. That is fixed now and with the new patch version I have attached all

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-02 Thread Joshua Tolley
On Sun, Nov 2, 2008 at 4:48 PM, Lawrence, Ramon [EMAIL PROTECTED] wrote: Joshua, Thank you for offering to review the patch. The easiest way to test would be to generate your own TPC-H data and load it into a database for testing. I have posted the TPC-H generator at:

Re: [HACKERS] Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

2008-11-01 Thread Joshua Tolley
On Mon, Oct 20, 2008 at 4:42 PM, Lawrence, Ramon [EMAIL PROTECTED] wrote: We propose a patch that improves hybrid hash join's performance for large multi-batch joins where the probe relation has skew. Project name: Histojoin Patch file: histojoin_v1.patch This patch implements the Histojoin

Re: [HACKERS] Lisp as a procedural language?

2008-10-20 Thread Joshua Tolley
On Mon, Oct 20, 2008 at 12:56 PM, John DeSoi [EMAIL PROTECTED] wrote: On Oct 19, 2008, at 1:27 PM, Douglas McNaught wrote: SBCL is a big and very sophisticated program. It's designed to be a self-contained Lisp system and has (AFAIK) no concessions to embeddability. It uses threads

Re: [HACKERS] Cross-column statistics revisited

2008-10-18 Thread Joshua Tolley
On Fri, Oct 17, 2008 at 7:54 PM, Nathan Boley [EMAIL PROTECTED] wrote: I'm still working my way around the math, but copulas sound better than anything else I've been playing with. I think the easiest way to think of them is, in 2-D finite spaces, they are just a plot of the order statistics

Re: [HACKERS] Cross-column statistics revisited

2008-10-17 Thread Joshua Tolley
On Fri, Oct 17, 2008 at 3:47 PM, Nathan Boley [EMAIL PROTECTED] wrote: Right now our histogram values are really quantiles; the statistics_target T for a column determines a number of quantiles we'll keep track of, and we grab values from into an ordered list L so that approximately 1/T of

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Joshua Tolley
On Thu, Oct 16, 2008 at 2:54 PM, Josh Berkus [EMAIL PROTECTED] wrote: Tom, (I'm not certain of how to do that efficiently, even if we had the right stats :-() I was actually talking to someone about this at pgWest. Apparently there's a fair amount of academic algorithms devoted to this

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Joshua Tolley
On Thu, Oct 16, 2008 at 6:32 PM, Tom Lane [EMAIL PROTECTED] wrote: It appears to me that a lot of people in this thread are confusing correlation in the sense of statistical correlation between two variables with correlation in the sense of how well physically-ordered a column is. For what

Re: [HACKERS] Cross-column statistics revisited

2008-10-16 Thread Joshua Tolley
On Thu, Oct 16, 2008 at 8:38 PM, Tom Lane [EMAIL PROTECTED] wrote: Joshua Tolley [EMAIL PROTECTED] writes: For what it's worth, neither version of correlation was what I had in mind. Statistical correlation between two variables is a single number, is fairly easy to calculate, and probably

Re: [HACKERS] Cross-column statistics revisited

2008-10-15 Thread Joshua Tolley
On Wed, Oct 15, 2008 at 7:51 AM, Gregory Stark [EMAIL PROTECTED] wrote: Joshua Tolley [EMAIL PROTECTED] writes: I've been interested in what it would take to start tracking cross-column statistics. A review of the mailing lists as linked from the TODO item on the subject [1] suggests

[HACKERS] Cross-column statistics revisited

2008-10-15 Thread Joshua Tolley
I've been interested in what it would take to start tracking cross-column statistics. A review of the mailing lists as linked from the TODO item on the subject [1] suggests the following concerns: 1) What information exactly would be tracked? 2) How would it be kept from exploding in size? 3) For

[HACKERS] \ef should probably append semicolons

2008-10-10 Thread Joshua Tolley
The new \ef psql command creates nicely usable CREATE OR REPLACE FUNCTION ... text based on the function I tell it to edit, but the text it creates *doesn't* include a final semicolon, so when I exit my editor-of-choice after messing with my function, it doesn't run the code I've given it until I

Re: [HACKERS] \ef should probably append semicolons

2008-10-10 Thread Joshua Tolley
On Fri, Oct 10, 2008 at 7:10 PM, Tom Lane [EMAIL PROTECTED] wrote: Joshua Tolley [EMAIL PROTECTED] writes: The new \ef psql command creates nicely usable CREATE OR REPLACE FUNCTION ... text based on the function I tell it to edit, but the text it creates *doesn't* include a final semicolon, so
