Re: [PATCHES] 2WRS [WIP]
On Fri, Feb 08, 2008 at 12:27:23AM -0500, Jaime Casanova wrote:
> On Feb 7, 2008 6:04 AM, Manolo _ [EMAIL PROTECTED] wrote:
> > Hi. I'm sending you the diff of my code against the current CVS TIP.
> > Please tell me if it's what you were asking for.
>
> Not exactly, because your patch removes an improvement that was
> included in 8.3... What you will have to do (if someone has a better
> solution, feel free to comment) is manually merge your 8.2 patch into
> the 8.3 source and then generate a diff.

s/8.3/HEAD/

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [PATCHES] Proposed patch to disallow password=foo in database name parameter
On Tue, Dec 11, 2007 at 08:58:05AM -0500, Andrew Dunstan wrote:
> I'm actually inclined to vote with Stephen that this is a silly
> change. I just put up the patch to show the best way of doing it if
> we're gonna do it ...

OK, I'm not going to die in a ditch over it. On the other hand, warning
about it in the docs would probably be a good idea...

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [PATCHES] Better default_statistics_target
On Wed, Dec 05, 2007 at 06:49:00PM +0100, Guillaume Smet wrote:
> On Dec 5, 2007 3:26 PM, Greg Sabino Mullane [EMAIL PROTECTED] wrote:
> > Agreed, this would be a nice 8.4 thing. But what about 8.3 and 8.2?
> > Is there a reason not to make this change? I know I've been lazy and
> > not run any absolute figures, but rough tests show that raising it
> > (from 10 to 100) results in a very minor increase in analyze time,
> > even for large databases. I think the burden of a slightly slower
> > analyze time, which can be easily adjusted both in postgresql.conf
> > and right before running an analyze, is very small compared to the
> > pain of some queries - which worked before - suddenly running much,
> > much slower for no apparent reason at all.
>
> As Tom stated earlier, the ANALYZE slowdown is far from the only
> consequence. The planner will also have more work to do, and that's
> the hard point IMHO.

How much more? Doesn't it now use a binary search? If so, ISTM that
going from 10 to 100 would at worst double the time spent finding the
bucket we need. Considering that we're talking about something that
takes microseconds, and that there's a huge penalty to be paid if you
have bad stats estimates, that doesn't seem like a big deal. And on
modern machines it's not as if the additional space in the catalogs is
going to kill us.

FWIW, I've never seen anything but a performance increase or no change
when going from 10 to 100. In most cases there's a noticeable
improvement, since it's common to have over 100k rows in a table, and
there's just no way to capture any kind of real picture of that with
only 10 buckets.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [PATCHES] [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
On Dec 20, 2007, at 2:36 AM, Gokulakannan Somasundaram wrote:
> I checked it by creating a table with 10 columns on a 32-bit machine.
> I inserted 100,000 rows with trailing NULLs and observed savings of
> 400 Kbytes.

That doesn't really tell us anything... how big was the table
originally? Also, testing on 64-bit would be interesting.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Re: [PATCHES] HOT patch - version 14
On Fri, Aug 31, 2007 at 12:53:51PM +0530, Pavan Deolasee wrote:
> On 8/31/07, Pavan Deolasee [EMAIL PROTECTED] wrote:
> > In fact, now that I think about it, there is no other fundamental
> > reason not to support HOT on system tables. So we can very well do
> > what you are suggesting.
>
> On second thought, I wonder if there is really much to gain by
> supporting HOT on system tables, and whether it would justify all the
> complexity. Initially I thought about CatalogUpdateIndexes, to which
> we would need to teach HOT. Later I also got worried about building
> the HOT attribute lists for system tables and handling all the corner
> cases for bootstrapping and catalog REINDEX. It might turn out to be
> straightforward, but I am not able to establish that with my limited
> knowledge of the area. I would still vote for disabling HOT on
> catalogs unless you see strong value in it.

What about ANALYZE? Doesn't that do a lot of updates?

BTW, I'm 100% in favor of pushing system catalog HOT until later; it'd
be silly to risk not getting HOT into 8.3 because of catalog HOT.

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] Reduce the size of PageFreeSpaceInfo on 64bit platform
On Fri, Aug 10, 2007 at 10:32:35AM +0900, ITAGAKI Takahiro wrote:
> Here is a patch to reduce the size of PageFreeSpaceInfo on 64-bit
> platforms. We can fit twice as many entries into maintenance_work_mem
> with the patch. sizeof(PageFreeSpaceInfo) is 16 bytes there because
> the type of 'avail' is 'Size', which is typically 8 bytes and needs to
> be aligned on 8-byte boundaries. I changed the type of the field to
> uint32. We could store the free space in a uint16 at smallest, but the
> alignment issue throws that away.

So... does that mean the comment in the config file about 6 bytes per
page is incorrect?

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] strpos() KMP
On Wed, Aug 08, 2007 at 12:19:53AM +0500, Pavel Ajtkulov wrote:
> > Do you have any performance test results for this?
>
> I describe the worst case in the first message (search 'aaa..aab' in
> 'aa..aa', a complete N^2). It takes some msec instead of several sec
> (in the current version).

You describe the test case but don't provide any results. Providing the
test case so others can duplicate your results is good, but you should
provide actual results as well. You should also make sure to test any
cases where performance would actually degrade, so we can see what that
looks like (though I don't know if that's an issue in this case).

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)
On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
> You would have to recompile with the value at line 214 of
> src/backend/utils/adt/pg_lzcompress.c set to a lower value.

Doesn't seem to be working for me, even in the case of a table with a
bunch of rows containing nothing but 213 'x's. I can't do anything that
changes what pg_class.relpages shows.

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)
On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
> Decibel! [EMAIL PROTECTED] writes:
> > On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
> > > Fwiw, do we really not want to compress anything smaller than 256
> > > bytes (everyone in Postgres uses the default strategy, not the
> > > always strategy)?
> >
> > Is there actually a way to specify always compressing? I'm not
> > seeing it on
> > http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
>
> In the code there's an always strategy, but nothing in Postgres uses
> it, so there's no way to set it using ALTER TABLE ... SET STORAGE.
> That might be an interesting approach though. We could add another SET
> STORAGE value, COMPRESSIBLE, which says to use the always strategy.
> The neat thing about this is that we could set bpchar to use this
> storage type by default.

Yeah, we should have that. I'll add it to my TODO...

> > ISTM that with things like CHAR(n) around we might very well have
> > some databases where compression for smaller-sized datums would be
> > beneficial. I would suggest 32 for the minimum. CPU is generally
> > cheaper than I/O nowadays, so I agree with something less than 256.
> > Not sure what would be best though.
>
> Well, it depends a lot on how large your database is. If your whole
> database fits in RAM and you use datatypes like CHAR(n) only for
> storing data which is exactly n characters long, then there's really
> no benefit to trying to compress smaller data.

Well, something else to consider is that this could make a big
difference between a database fitting in memory and not...

> If on the other hand your database is heavily I/O-bound and you're
> using CHAR(n) or storing other highly repetitive short strings, then
> compressing data will save I/O bandwidth at the expense of CPU
> cycles.

> > I do have a database that has both user-entered information as well
> > as things like email addresses, so I could do some testing on that
> > if people want.
>
> You would have to recompile with the value at line 214 of
> src/backend/utils/adt/pg_lzcompress.c set to a lower value.

Ok, I'll work up some numbers. The only tests that come to mind are how
long it takes to load a dump (with indexes) and the resulting table
size. Other ideas?

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] strpos() KMP
On Thu, Aug 02, 2007 at 01:18:22AM +0500, Pavel Ajtkulov wrote:
> Hello,
>
> this patch allows strpos() to use the Knuth-Morris-Pratt algorithm
> (see Cormen et al., Introduction to Algorithms, MIT Press, 2001). It
> also works with multibyte wchar.
>
> In the worst case the current brute-force strpos() takes O(n * m)
> time, where n and m are the lengths of the strings (example: searching
> for 'aaa...aaab' in 'aaa...aaa'). The KMP algorithm always takes
> O(n + m) time. To check this, create a table with one text attribute
> and insert several thousand records of 'aa..aa' (for example, with
> length = 1000). Then execute
>
>   select count(*) from test where strpos(a, 'aaaaab') > 0;
>
> on the current and modified versions. Also, I advise using
> select .. where strpos(attr, 'word') > 0 instead of
> select .. where attr like '%word%' (strpos should be faster than the
> regex machinery). In general, this matters for artificial inputs; on
> natural-language text, KMP is about equal to the current strpos() in
> execution time.

Do you have any performance test results for this?

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)
On Fri, Aug 03, 2007 at 06:12:09PM -0500, Decibel! wrote:
> On Sun, Jul 29, 2007 at 12:06:50PM +0100, Gregory Stark wrote:
> > Decibel! [EMAIL PROTECTED] writes:
> > > On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
> > > > Fwiw, do we really not want to compress anything smaller than
> > > > 256 bytes (everyone in Postgres uses the default strategy, not
> > > > the always strategy)?
> > >
> > > Is there actually a way to specify always compressing? I'm not
> > > seeing it on
> > > http://www.postgresql.org/docs/8.2/interactive/storage-toast.html
> >
> > In the code there's an always strategy, but nothing in Postgres uses
> > it, so there's no way to set it using ALTER TABLE ... SET STORAGE.
> > That might be an interesting approach though. We could add another
> > SET STORAGE value, COMPRESSIBLE, which says to use the always
> > strategy. The neat thing about this is that we could set bpchar to
> > use this storage type by default.
>
> Yeah, we should have that. I'll add it to my TODO...

On second thought... how much work would it be to expose the first 3
(or even all) of the elements in PGLZ_Strategy to SET STORAGE? It's
certainly possible that there are workloads out there that will never
be optimal with one of our pre-defined strategies...

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
Re: [PATCHES] Repair cosmetic damage (done by pg_indent?)
On Fri, Jul 27, 2007 at 04:07:01PM +0100, Gregory Stark wrote:
> Fwiw, do we really not want to compress anything smaller than 256
> bytes (everyone in Postgres uses the default strategy, not the always
> strategy)?

Is there actually a way to specify always compressing? I'm not seeing
it on http://www.postgresql.org/docs/8.2/interactive/storage-toast.html

ISTM that with things like CHAR(n) around we might very well have some
databases where compression for smaller-sized datums would be
beneficial. I would suggest 32 for the minimum. CPU is generally
cheaper than I/O nowadays, so I agree with something less than 256. Not
sure what would be best though.

I do have a database that has both user-entered information as well as
things like email addresses, so I could do some testing on that if
people want.

--
Decibel!, aka Jim Nasby                       [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)