Re: [PATCHES] 2WRS [WIP]

2008-02-08 Thread Decibel!
On Fri, Feb 08, 2008 at 12:27:23AM -0500, Jaime Casanova wrote:
 On Feb 7, 2008 6:04 AM, Manolo _ [EMAIL PROTECTED] wrote:
 
  HI.
 
  I send you the diff of my code against the current CVS TIP.
  Please tell me if it's what you were asking for.
 
 
 not actually, because your patch removes an improvement that was
 included in 8.3...
 what you will have to do (if someone has a better solution feel free
 to comment on this) is to manually merge your 8.2's patch into the
 8.3's source and then generate a diff

s/8.3/HEAD/
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828




Re: [PATCHES] Proposed patch to disallow password=foo in database name parameter

2008-01-28 Thread Decibel!
On Tue, Dec 11, 2007 at 08:58:05AM -0500, Andrew Dunstan wrote:
 I'm actually inclined to vote with Stephen that this is a silly change.
 I just put up the patch to show the best way of doing it if we're gonna
 do it ...
 
 OK. I'm not going to die in a ditch over it.

On the other hand, warning about it in the docs would probably be a good
idea...
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828




Re: [PATCHES] Better default_statistics_target

2008-01-28 Thread Decibel!
On Wed, Dec 05, 2007 at 06:49:00PM +0100, Guillaume Smet wrote:
 On Dec 5, 2007 3:26 PM, Greg Sabino Mullane [EMAIL PROTECTED] wrote:
  Agreed, this would be a nice 8.4 thing. But what about 8.3 and 8.2? Is
  there a reason not to make this change? I know I've been lazy and not run
  any absolute figures, but rough tests show that raising it (from 10 to
  100) results in a very minor increase in analyze time, even for large
  databases. I think the burden of a slightly slower analyze time, which
  can be easily adjusted, both in postgresql.conf and right before running
  an analyze, is very small compared to the pain of some queries - which
  worked before - suddenly running much, much slower for no apparent
  reason at all.
 
 As Tom stated it earlier, the ANALYZE slow down is far from being the
 only consequence. The planner will also have more work to do and
 that's the hard point IMHO.

How much more? Doesn't it now use a binary search? If so, ISTM that
going from 10 to 100 would at worst double the time spent finding the
bucket we need. Considering that we're talking something that takes
microseconds, and that there's a huge penalty to be paid if you have bad
stats estimates, that doesn't seem that big a deal. And on modern
machines it's not like the additional space in the catalogs is going to
kill us.
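
A rough way to check the "at worst double" estimate, as a minimal sketch. It assumes the bucket lookup is a plain binary search over sorted histogram boundaries; the function names are illustrative and are not PostgreSQL's.

```python
def probes_to_locate(boundaries, value):
    """Lower-bound binary search.

    Returns (index of the first boundary >= value, comparisons made).
    """
    lo, hi, probes = 0, len(boundaries), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if boundaries[mid] < value:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes


def worst_case_probes(n_buckets):
    """Largest comparison count over every possible landing position."""
    boundaries = list(range(n_buckets))
    return max(
        probes_to_locate(boundaries, v)[1] for v in range(-1, n_buckets + 1)
    )


if __name__ == "__main__":
    for target in (10, 100):
        print(target, worst_case_probes(target))
```

Under that assumption the worst case goes from 4 comparisons at 10 buckets to 7 at 100, which is a bit less than double.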

FWIW, I've never seen anything but a performance increase or no change
when going from 10 to 100. In most cases there's a noticeable
improvement since it's common to have over 100k rows in a table, and
there's just no way to capture any kind of a real picture of that with
only 10 buckets.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828




Re: [PATCHES] [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

2007-12-21 Thread Decibel!

On Dec 20, 2007, at 2:36 AM, Gokulakannan Somasundaram wrote:
I checked it by creating a table with 10 columns on a 32 bit
machine. I inserted 100,000 rows with trailing nulls and observed
savings of 400 Kbytes.



That doesn't really tell us anything... how big was the table  
originally? Also, testing on 64 bit would be interesting.
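
For what it's worth, the reported figure works out to about 4 bytes per row, and a back-of-the-envelope sketch suggests why 64 bit could differ. Everything here is an assumption rather than something stated in the thread: a 23-byte fixed tuple header, one null-bitmap bit per column, 4-byte maximum alignment on the 32 bit machine and 8-byte on 64 bit, and a patch that lets a row with only trailing nulls drop its bitmap entirely.

```python
def maxalign(length, alignment):
    """Round length up to the next multiple of alignment."""
    return (length + alignment - 1) // alignment * alignment


def header_size(n_columns, has_null_bitmap, alignment, fixed_header=23):
    """Aligned tuple header size; fixed_header=23 is an assumed value."""
    bitmap_bytes = (n_columns + 7) // 8 if has_null_bitmap else 0
    return maxalign(fixed_header + bitmap_bytes, alignment)


def savings_per_row(n_columns, alignment):
    """Bytes saved per row when the null bitmap can be omitted."""
    with_bitmap = header_size(n_columns, True, alignment)
    without_bitmap = header_size(n_columns, False, alignment)
    return with_bitmap - without_bitmap


if __name__ == "__main__":
    rows = 100_000
    for label, alignment in (("32 bit", 4), ("64 bit", 8)):
        per_row = savings_per_row(10, alignment)
        print(label, per_row, "bytes/row,", rows * per_row, "bytes total")
```

With those assumptions the 32 bit case gives 4 bytes per row, roughly the 400 Kbytes reported for 100,000 rows, while the 64 bit case would give 8 bytes per row. Without the original table size this still says nothing about the relative saving.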

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828






Re: [PATCHES] HOT patch - version 14

2007-08-31 Thread Decibel!
On Fri, Aug 31, 2007 at 12:53:51PM +0530, Pavan Deolasee wrote:
 On 8/31/07, Pavan Deolasee [EMAIL PROTECTED] wrote:
 
 
 
  In fact, now that I think about it there is no other
  fundamental reason to not support HOT on system tables. So we
  can very well do what you are suggesting.
 
 
 
 On second thought, I wonder if there is really much to gain by
 supporting HOT on system tables and whether it would justify all
 the complexity. Initially I thought about CatalogUpdateIndexes to
 which we need to teach HOT. Later I also got worried about
 building the HOT attribute lists for system tables and handling
 all the corner cases for bootstrapping and catalog REINDEX.
 It might turn out to be straight forward, but I am not able to
 establish that with my limited knowledge in the area.
 
 I would still vote for disabling HOT on catalogs unless you see
 strong value in it.

What about ANALYZE? Doesn't that do a lot of updates?

BTW, I'm 100% in favor of pushing system catalog HOT until later; it'd
be silly to risk not getting HOT in 8.3 because of catalog HOT.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] Reduce the size of PageFreeSpaceInfo on 64bit platform

2007-08-10 Thread Decibel!
On Fri, Aug 10, 2007 at 10:32:35AM +0900, ITAGAKI Takahiro wrote:
 Here is a patch to reduce the size of PageFreeSpaceInfo on 64bit platforms.
 With the patch, maintenance_work_mem can hold twice as many entries.
 
 The sizeof(PageFreeSpaceInfo) is 16 bytes there because the type of 'avail'
 is 'Size', which is typically 8 bytes and must be aligned on 8-byte boundaries.
 I changed the type of the field to uint32. We could store the free space in a
 uint16 at smallest, but alignment padding would cancel the saving.

So... does that mean that the comment in the config file about 6 bytes
per page is incorrect?
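
The alignment effect described in the quoted patch can be reproduced with a small sketch. It assumes the struct is a 32-bit block number followed by 'avail'; the class names are made up for illustration, and ctypes is used only because it follows the platform's C layout rules.

```python
import ctypes


class FreeSpaceInfoSizeT(ctypes.Structure):
    # 'avail' as size_t: 8 bytes with 8-byte alignment on typical 64bit
    # platforms, so 4 bytes of padding follow the block number.
    _fields_ = [("blkno", ctypes.c_uint32), ("avail", ctypes.c_size_t)]


class FreeSpaceInfoU32(ctypes.Structure):
    # 'avail' as uint32: no padding is needed.
    _fields_ = [("blkno", ctypes.c_uint32), ("avail", ctypes.c_uint32)]


class FreeSpaceInfoU16(ctypes.Structure):
    # 'avail' as uint16: the struct is padded back up to a multiple of 4.
    _fields_ = [("blkno", ctypes.c_uint32), ("avail", ctypes.c_uint16)]


if __name__ == "__main__":
    for cls in (FreeSpaceInfoSizeT, FreeSpaceInfoU32, FreeSpaceInfoU16):
        print(cls.__name__, ctypes.sizeof(cls))
```

On a typical 64bit build the size_t variant is 16 bytes and both narrower variants are 8 bytes, which matches the quoted reasoning for stopping at uint32. None of the variants comes out at 6 bytes.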
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] strpos() KMP

2007-08-07 Thread Decibel!
On Wed, Aug 08, 2007 at 12:19:53AM +0500, Pavel Ajtkulov wrote:
 
  Do you have any performance test results for this?
 
 I described the worst case in my first message (searching for 'aaa..aab'
 in 'aa..aa', fully N^2). It takes some msec instead of several sec
 (in the current version).

You describe the test case but don't provide any results. Providing the
test case so others can duplicate your results is good, but you should
provide actual results as well. You should also make sure to test any
cases where performance would actually degrade so we can see what that
looks like (though I don't know if that's an issue in this case).
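
Until timings are posted, one deterministic way to see the worst case is to count character comparisons instead of measuring time. This is a minimal sketch of a brute-force search in Python, not the strpos() code in the backend; naive_strpos is a hypothetical helper that returns a 1-based position, or 0 when there is no match.

```python
def naive_strpos(haystack, needle):
    """Brute-force search.

    Returns (1-based position or 0, number of character comparisons).
    """
    comparisons = 0
    n, m = len(haystack), len(needle)
    for start in range(n - m + 1):
        matched = 0
        while matched < m:
            comparisons += 1
            if haystack[start + matched] != needle[matched]:
                break
            matched += 1
        if matched == m:
            return start + 1, comparisons
    return 0, comparisons


if __name__ == "__main__":
    haystack = "a" * 1000
    for needle in ("aaaaab", "a" * 499 + "b"):
        position, comparisons = naive_strpos(haystack, needle)
        print(len(needle), position, comparisons)
```

For a haystack of n 'a's and a needle of m-1 'a's followed by 'b', this performs (n - m + 1) * m comparisons, so the cost grows with the needle length as well as the string length.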
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)

2007-08-04 Thread Decibel!
On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
 You would have to recompile with the value at line 214 of
 src/backend/utils/adt/pg_lzcompress.c set to a lower value.

Doesn't seem to be working for me, even in the case of a table with a
bunch of rows containing nothing but 213 'x's. I can't do anything that
changes what pg_class.relpages shows.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)

2007-08-03 Thread Decibel!
On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
 Decibel! [EMAIL PROTECTED] writes:
 
  On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
  Fwiw, do we really not want to compress anything smaller than 256 bytes
  (everyone in Postgres uses the default strategy, not the always strategy).
 
  Is there actually a way to specify always compressing? I'm not seeing it
  on http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
 
 In the code there's an always strategy, but nothing in Postgres uses it so
 there's no way to set it using ALTER TABLE ... SET STORAGE.
 
 That might be an interesting approach though. We could add another SET STORAGE
 value COMPRESSIBLE which says to use the always strategy. The neat thing
 about this is we could set bpchar to use this storage type by default.

Yeah, we should have that. I'll add it to my TODO...

  ISTM that with things like CHAR(n) around we might very well have some
  databases where compression for smaller sized datums would be beneficial. I
  would suggest 32 for the minimum.
 
  CPU is generally cheaper than IO now-a-days, so I agree with something
  less than 256. Not sure what would be best though.
 
 Well it depends a lot on how large your database is. If your whole database
 fits in RAM and you use datatypes like CHAR(n) only for storing data which is
 exactly n characters long then there's really no benefit to trying to compress
 smaller data.
 
Well, something else to consider is that this could make a big
difference between a database fitting in memory and not...

 If on the other hand your database is heavily I/O-bound and you're using
 CHAR(n) or storing other highly repetitive short strings then compressing data
 will save I/O bandwidth at the expense of cpu cycles.
 
  I do have a database that has both user-entered information as well as
  things like email addresses, so I could do some testing on that if
  people want.
 
 You would have to recompile with the value at line 214 of
 src/backend/utils/adt/pg_lzcompress.c set to a lower value.

Ok, I'll work up some numbers. The only tests that come to mind are how long it
takes to load a dump (with indexes) and the resulting table size. Other ideas?
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] strpos() KMP

2007-08-03 Thread Decibel!
On Thu, Aug 02, 2007 at 01:18:22AM +0500, Pavel Ajtkulov wrote:
 Hello,
 
 this patch allows the strpos() function to use the Knuth-Morris-Pratt
 algorithm (see Cormen et al., Introduction to Algorithms, MIT Press, 2001).
 
 It also works with multibyte wchar.
 
 In the worst case the current brute-force strpos() takes O(n * m) time
 (n and m are the lengths of the strings); example: searching for
 'aaa...aaab' in 'aaa...aaa'.
 The KMP algorithm always takes O(n + m) time.
 To check this, create a table with one text attribute and insert several
 thousand records of 'aa..aa' (for example, with length = 1000). Then execute
 select count(*) from test where strpos(a, 'aaaaab') > 0;
 on the current and the modified version.
 
 Also, I advise using select .. where strpos(attr, 'word') > 0; instead of
 select .. where attr like '%word%'
 (strpos must be faster than regex).
 
 In general, this worst case only arises with artificial expressions. On
 natural language text, KMP's execution time is nearly equal to that of the
 current strpos().

Do you have any performance test results for this?
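
For readers following along, the algorithm under discussion can be sketched in a few lines. This is a textbook Knuth-Morris-Pratt search in Python, not the submitted patch; the function names are illustrative, and returning a 1-based position (0 for no match, 1 for an empty needle) is a convention chosen here to resemble strpos().

```python
def build_failure_table(needle):
    """fail[i] is the length of the longest proper prefix of
    needle[:i + 1] that is also a suffix of it."""
    fail = [0] * len(needle)
    k = 0
    for i in range(1, len(needle)):
        while k > 0 and needle[i] != needle[k]:
            k = fail[k - 1]
        if needle[i] == needle[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_strpos(haystack, needle):
    """Return the 1-based position of needle in haystack, or 0 if absent."""
    if not needle:
        return 1
    fail = build_failure_table(needle)
    k = 0  # number of needle characters currently matched
    for i, ch in enumerate(haystack):
        while k > 0 and ch != needle[k]:
            k = fail[k - 1]  # fall back without re-reading the haystack
        if ch == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 2
    return 0


if __name__ == "__main__":
    print(kmp_strpos("a" * 1000, "aaaaab"))
    print(kmp_strpos("hello world", "world"))
```

The haystack index never moves backwards, which is where the O(n + m) bound comes from. Whether that beats the brute-force loop on ordinary text is exactly what measured results would need to show.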
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)

2007-08-03 Thread Decibel!
On Fri, Aug 03, 2007 at 06:12:09PM -0500, Decibel! wrote:
 On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
  Decibel! [EMAIL PROTECTED] writes:
  
   On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
   Fwiw, do we really not want to compress anything smaller than 256 bytes
   (everyone in Postgres uses the default strategy, not the always strategy).
  
   Is there actually a way to specify always compressing? I'm not seeing it
   on http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
  
  In the code there's an always strategy, but nothing in Postgres uses it so
  there's no way to set it using ALTER TABLE ... SET STORAGE.
  
  That might be an interesting approach though. We could add another SET STORAGE
  value COMPRESSIBLE which says to use the always strategy. The neat thing
  about this is we could set bpchar to use this storage type by default.
 
 Yeah, we should have that. I'll add it to my TODO...

On second thought... how much work would it be to expose the first 3 (or
even all) of the elements in PGLZ_Strategy to SET STORAGE? It's
certainly possible that there are workloads out there that will never be
optimal with one of our pre-defined strategies...
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)

2007-07-28 Thread Decibel!
On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
 Fwiw, do we really not want to compress anything smaller than 256 bytes
 (everyone in Postgres uses the default strategy, not the always strategy).

Is there actually a way to specify always compressing? I'm not seeing it
on http://www.postgresql.org/docs/8.2/interactive/storage-toast.html

 ISTM that with things like CHAR(n) around we might very well have some
 databases where compression for smaller sized datums would be beneficial. I
 would suggest 32 for the minimum.

CPU is generally cheaper than IO now-a-days, so I agree with something
less than 256. Not sure what would be best though.

I do have a database that has both user-entered information as well as
things like email addresses, so I could do some testing on that if
people want.
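
A quick way to get a feel for the CHAR(n) case before touching the backend, as a rough sketch only. It uses zlib from the Python standard library as a stand-in for pglz, so the absolute numbers will not match what Postgres would produce. The sample values and the helper names are made up for illustration.

```python
import zlib


def padded(value, n):
    """Blank-pad value to n characters, the way CHAR(n) stores it."""
    return value.ljust(n).encode("ascii")


def compression_report(values, n):
    """Return (raw bytes, compressed bytes), compressing each datum
    separately rather than the column as a whole."""
    raw = sum(len(padded(v, n)) for v in values)
    packed = sum(len(zlib.compress(padded(v, n))) for v in values)
    return raw, packed


if __name__ == "__main__":
    sample = ["alice@example.com", "bob@example.org"]
    for width in (32, 100, 255):
        print(width, compression_report(sample, width))
```

Blank-padded values well under 256 bytes shrink noticeably because the padding is highly repetitive. A very short unpadded value such as 'abc' gets larger once the compressor's framing overhead is added, which is the argument for keeping some minimum threshold.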
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

