Re: [HACKERS] Reducing NUMERIC size for 8.3
"Merlin Moncure" <[EMAIL PROTECTED]> writes: > would this break any application pulling a numeric field as binary > over the protocol? No. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
On 9/24/07, Gregory Stark <[EMAIL PROTECTED]> wrote: > > > We previously discussed compressing the numeric data type for small values: > > http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php > > We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse > the weight and sign/dscale so we could do this sometime without introducing > another incompatibility. > > I think we also should move the NumericData and declaration to numeric.c and > make the Numeric type an opaque pointer for the rest of the source tree. That > will catch any contrib modules or third-party modules which would be broken by > any new data representation. > > > --- numeric.h 27 Feb 2007 23:48:10 + 1.24 > +++ numeric.h 24 Sep 2007 16:07:24 +0100 > @@ -63,8 +63,8 @@ > typedef struct NumericData > { > int32 vl_len_;/* varlena header (do not > touch directly!) */ > - int16 n_weight; /* Weight of 1st digit */ > uint16 n_sign_dscale; /* Sign + display scale */ > + int16 n_weight; /* Weight of 1st digit */ > charn_data[1]; /* Digits (really array of > NumericDigit) */ > } NumericData; would this break any application pulling a numeric field as binary over the protocol? merlin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andrew Dunstan wrote: Zdenek Kotala wrote: Tom Lane wrote: Gregory Stark <[EMAIL PROTECTED]> writes: We previously discussed compressing the numeric data type for small values: http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse the weight and sign/dscale so we could do this sometime without introducing another incompatibility. I had forgotten about that, but it does seem like a good idea to do it now. Any objections? For in-place upgrade purpose It would be good change also OID for numeric type and preserve current OID for current implementation on updated system. If we want to get into that game we need a better way of allocating Oids. Right now anything not currently used is liable to be grabbed, so there's a high risk of reuse. Yes, it will be necessary. Or maybe second way is create only really base types (name, int, bool ...) on bootstrap and others types will be created in standard manner by CREATE TYPE, CREATE OPERATOR ... commands. Or third way is not remove old datatypes and only rename them to e.g. numeric_old1 ... Zdenek ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
Zdenek Kotala wrote: Tom Lane wrote: Gregory Stark <[EMAIL PROTECTED]> writes: We previously discussed compressing the numeric data type for small values: http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse the weight and sign/dscale so we could do this sometime without introducing another incompatibility. I had forgotten about that, but it does seem like a good idea to do it now. Any objections? For in-place upgrade purpose It would be good change also OID for numeric type and preserve current OID for current implementation on updated system. If we want to get into that game we need a better way of allocating Oids. Right now anything not currently used is liable to be grabbed, so there's a high risk of reuse. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing NUMERIC size for 8.3
Tom Lane wrote: Gregory Stark <[EMAIL PROTECTED]> writes: We previously discussed compressing the numeric data type for small values: http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse the weight and sign/dscale so we could do this sometime without introducing another incompatibility. I had forgotten about that, but it does seem like a good idea to do it now. Any objections? For in-place upgrade purpose It would be good change also OID for numeric type and preserve current OID for current implementation on updated system. Zdenek ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing NUMERIC size for 8.3
"Tom Lane" <[EMAIL PROTECTED]> writes: "Tom Lane" <[EMAIL PROTECTED]> writes: > Gregory Stark <[EMAIL PROTECTED]> writes: > >> I think we also should move the NumericData and declaration to numeric.c and >> make the Numeric type an opaque pointer for the rest of the source >> tree. > > I don't agree with that; we are not in the habit of doing it that way > for any other on-disk data type. All it will accomplish is to force > people to make private copies of the struct declaration, thereby > entirely guaranteeing that they fail to track changes. There will > always be legitimate reasons for external code to want to look at > on-disk bits. Well the macros to do so would become quite a bit more complex. I imagine they would become functions instead. I suppose a reasonable simple interface could be ginned up. But anyone currently accessing the data directly would have to go through the functions to access the bits. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
Gregory Stark <[EMAIL PROTECTED]> writes: > We previously discussed compressing the numeric data type for small values: > http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php > We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse > the weight and sign/dscale so we could do this sometime without introducing > another incompatibility. I had forgotten about that, but it does seem like a good idea to do it now. Any objections? > I think we also should move the NumericData and declaration to numeric.c and > make the Numeric type an opaque pointer for the rest of the source > tree. I don't agree with that; we are not in the habit of doing it that way for any other on-disk data type. All it will accomplish is to force people to make private copies of the struct declaration, thereby entirely guaranteeing that they fail to track changes. There will always be legitimate reasons for external code to want to look at on-disk bits. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing NUMERIC size for 8.3
We previously discussed compressing the numeric data type for small values: http://archives.postgresql.org/pgsql-hackers/2007-06/msg00715.php We didn't do this for 8.3 but in any case Tom did suggest we ought to reverse the weight and sign/dscale so we could do this sometime without introducing another incompatibility. I think we also should move the NumericData and declaration to numeric.c and make the Numeric type an opaque pointer for the rest of the source tree. That will catch any contrib modules or third-party modules which would be broken by any new data representation. --- numeric.h 27 Feb 2007 23:48:10 + 1.24 +++ numeric.h 24 Sep 2007 16:07:24 +0100 @@ -63,8 +63,8 @@ typedef struct NumericData { int32 vl_len_;/* varlena header (do not touch directly!) */ - int16 n_weight; /* Weight of 1st digit */ uint16 n_sign_dscale; /* Sign + display scale */ + int16 n_weight; /* Weight of 1st digit */ charn_data[1]; /* Digits (really array of NumericDigit) */ } NumericData; -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Simon Riggs wrote: > We've changed the on-disk database format in 8.3, so we have an > opportunity to change other things also. There is a patch thats been on > the patch queue for some time called numeric508, submitted Dec 2005; > I've updated this patch now for 8.3 to remove bit rot (an hour's work). > This is posted to pgsql-patches now and it works. > > The benefit of the patch is that it reduces each NUMERIC value by 2 > bytes, so will speed up things considerably. This is now especially > important if we are looking to reduce the speed of numeric division by a > factor of 4 (recent WIP patch). > > The objections to applying this patch originally were: > 1. it changes on-disk format (we've done this, so argument is void) > 2. it would restrict number of digits to 508 and there are allegedly > some people that want to store > 508 digits. > > The current patch passes all regression tests, but currently fails > numeric_big.sql since this explicitly checks for support of > numeric(1000,800). > > We could: > a) accept the patch as-is and restrict NUMERIC to 508 digits > b) refine the patch somewhat to allow 1000 digits > > (b) is possible in a couple of ways, both fairly quick: > - extend the patch so that one of the spare bits from the second digit > is used to represent dscale 508-1000. > - extend the patch so that if weight > 127 or dscale > 127 we would use > the first byte in the digits as an extra indicator byte holding the high > bits of both fields. > Neither change makes any difference to numbers below > 1,000,000,000,000,000(127 zeroes in total)...000 which probably > covers the vast majority of people's usage. > > Objections: True, we are passed feature freeze, but this patch has been > on the queue for 14 months prior to freeze and has been waiting on disk > format changes to make patch application acceptable. We definitely want > to reduce the size of Numeric by 2 bytes at some point. The question in > my mind is: When is the best time to make this change? If we put this > off until 8.4, then it will get rejected again because we won't want to > change the disk format again. So the best time to do this is now, > otherwise we'll put it off forever. > > Can I get somebody other than Tom to agree to review the patch? Clearly > waiting for Tom to review this is just going to delay release, which I > don't want to do. > > -- > Simon Riggs > EnterpriseDB http://www.enterprisedb.com > > > > ---(end of broadcast)--- > TIP 6: explain analyze is your friend -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Tom, > That sounds good, but there are corner cases where it wouldn't work --- > consider a page containing a single maximum-length tuple. Certainly any mature upgrade-in-place tool will require a "checker" which you run first which determines if you have a prohibitive corner case. Besides, I thought we didn't allow tuples to grow to more than 1/2 page size? Or am I thinking indexes here? > In general I don't see us accepting changes that would increase data > size, but there's at least one troubling exception on the horizon: > per-column or per-value locale support. Another problem is that a strict > rule of "no data size increase ever" might forbid acceptance of changes > that achieve average space savings at the cost of increasing the size of > some lesser-used cases. Our attitude at the meeting was "let's burn that bridge when we come to it". If we can develop a solid in-place upgrade too which will work for 8.1->8.2 and 8.2->8.3, then we've done something worthwhile even if we break it in 8.4 or 8.5. It's possible that we won't implement anything that breaks it for five years or that someone will invent another solution before then, in which case we'd feel pretty dumb for having kept it on the drawing board all that time. Or to put it another way, the page-grows-on-upgrade problem is hard enough that it will probably take as much effort as the whole rest of the upgrade process to solve. So let's tackle one problem at a time. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Heikki Linnakangas wrote: Since we're discussing upgrades, let me summarize the discussions we had over dinner in Ottawa for the benefit of all: Thanks for summary. As before, someone just needs to step up and do it. I'm now working on proposal. I hope that it will ready soon. Zdenek ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Heikki Linnakangas <[EMAIL PROTECTED]> writes: > * Page format changes that grow data size are problematic, because there > can be pages that can't be expanded to the new format because there's > not enough space. However, it would be possible to write a pre-upgrade > program to look for the problematic pages, and do a dummy UPDATE on some > tuples to move them away from such pages, to make room. The pre-upgrade > program could be run while the old database is still up, so it doesn't > cause downtime. That sounds good, but there are corner cases where it wouldn't work --- consider a page containing a single maximum-length tuple. In general I don't see us accepting changes that would increase data size, but there's at least one troubling exception on the horizon: per-column or per-value locale support. Another problem is that a strict rule of "no data size increase ever" might forbid acceptance of changes that achieve average space savings at the cost of increasing the size of some lesser-used cases. In short, this point seems to need more thought. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Since we're discussing upgrades, let me summarize the discussions we had over dinner in Ottawa for the benefit of all: * pg_migrator is a sound approach to handling catalog changes. * Handling any page format change that doesn't grow the space taken by data is straightforward. Note that all 8.2 -> 8.3 changes fall in this category. * Page format changes that grow data size are problematic, because there can be pages that can't be expanded to the new format because there's not enough space. However, it would be possible to write a pre-upgrade program to look for the problematic pages, and do a dummy UPDATE on some tuples to move them away from such pages, to make room. The pre-upgrade program could be run while the old database is still up, so it doesn't cause downtime. * Page format conversion can be done at upgrade time, scanning all relations and rewriting them. Or it can be done at runtime in ReadBuffer whenever a page in the old format is accessed. We don't know what changes upcoming releases will bring, but 8.2 -> 8.3 upgrade is feasible. Any future changes will need to be considered on a case-by-case basis, but we have a pretty good basic strategy that doesn't impose any strict restrictions on future development. Even if we can't guarantee that we can do a non dump/restore upgrade between every release, if we can do it say between every other release, that's *much* better than none. As before, someone just needs to step up and do it. Zdenek Kotala wrote: Andrew Sullivan wrote: On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote: FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 8.1->8.2. Unfortunately until the PGDG decides that in-place upgrade is a constraint their willing to place on development, I see them a good chicken/egg away from making it a continually usefull tool. Or maybe cart/horse. It seems to me that the rule more likely needs to be that the migrator follow the development of the database than that the database engine be strongly constrained by the needs of an upgrade tool. I agree that some commitment is needed, though. I don't think that upgrade will strongly constrain a database engine. But some constrains we can expect. Any new development in some area will have to take care about upgrade. If you break some "upgrade coding rules" you will not to able create upgrade procedure or upgrade process will be risky. For example upgrade between 8.1 -> 8.2 is risky, because you are not able easy determine what format of cidr/inet is used on the page. If something fails during table structure upgrade, then it is not easy to recovery from this situation. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Andrew Sullivan wrote: On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote: FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 8.1->8.2. Unfortunately until the PGDG decides that in-place upgrade is a constraint their willing to place on development, I see them a good chicken/egg away from making it a continually usefull tool. Or maybe cart/horse. It seems to me that the rule more likely needs to be that the migrator follow the development of the database than that the database engine be strongly constrained by the needs of an upgrade tool. I agree that some commitment is needed, though. I don't think that upgrade will strongly constrain a database engine. But some constrains we can expect. Any new development in some area will have to take care about upgrade. If you break some "upgrade coding rules" you will not to able create upgrade procedure or upgrade process will be risky. For example upgrade between 8.1 -> 8.2 is risky, because you are not able easy determine what format of cidr/inet is used on the page. If something fails during table structure upgrade, then it is not easy to recovery from this situation. Zdenek ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote: > FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for > 8.1->8.2. Unfortunately until the PGDG decides that in-place upgrade is a > constraint their willing to place on development, I see them a good > chicken/egg away from making it a continually usefull tool. Or maybe cart/horse. It seems to me that the rule more likely needs to be that the migrator follow the development of the database than that the database engine be strongly constrained by the needs of an upgrade tool. I agree that some commitment is needed, though. A -- Andrew Sullivan | [EMAIL PROTECTED] The whole tendency of modern prose is away from concreteness. --George Orwell ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
On Tuesday 19 June 2007 10:15, Tom Lane wrote: > Zdenek Kotala <[EMAIL PROTECTED]> writes: > > I'm little bit confused when we introduce new page layout version? I > > expect that new version become with changes with pageheader, tuple > > header or data encoding (varlen/TOAST ...). But in case when there is > > new data type internal implementation, there was not reason to update > > version (see inet/cidr between 8.1 -> 8.2). Can me somebody clarify this? > > Well, we've changed it when there was a benefit to an existing tool to > do so. So far that's meant page header and tuple header changes. If > we ever had a working in-place upgrade solution, I think we'd be willing > to make the page version account for datatype format changes too. > FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 8.1->8.2. Unfortunately until the PGDG decides that in-place upgrade is a constraint their willing to place on development, I see them a good chicken/egg away from making it a continually usefull tool. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Zdenek Kotala <[EMAIL PROTECTED]> writes: > I'm little bit confused when we introduce new page layout version? I > expect that new version become with changes with pageheader, tuple > header or data encoding (varlen/TOAST ...). But in case when there is > new data type internal implementation, there was not reason to update > version (see inet/cidr between 8.1 -> 8.2). Can me somebody clarify this? Well, we've changed it when there was a benefit to an existing tool to do so. So far that's meant page header and tuple header changes. If we ever had a working in-place upgrade solution, I think we'd be willing to make the page version account for datatype format changes too. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)
Simon Riggs wrote: The objections to applying this patch originally were: 1. it changes on-disk format (we've done this, so argument is void) I'm little bit confused when we introduce new page layout version? I expect that new version become with changes with pageheader, tuple header or data encoding (varlen/TOAST ...). But in case when there is new data type internal implementation, there was not reason to update version (see inet/cidr between 8.1 -> 8.2). Can me somebody clarify this? However, In upgrade point of view. It will be necessary to keep information which postgreSQL version creates page to correctly track data type implementation (or special structure). Information in pg_control file and PG_VERSION files are unusable for upgrade purpose. Zdenek ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Reducing NUMERIC size for 8.3
"Bruce Momjian" <[EMAIL PROTECTED]> writes: > Would someone please explain why we are considering this so far past > features freeze, and who suggtested that the 8.3->8.4 upgrade being a binary > upgrade was anything more than a pipe dream? Simon just updated a patch he had originally submitted over a year ago. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing NUMERIC size for 8.3
Bruce Momjian <[EMAIL PROTECTED]> writes: > Would someone please explain why we are considering this so far past > features freeze, and who suggtested that the 8.3->8.4 upgrade being a binary > upgrade was anything more than a pipe dream? Well, Greg had left further squeezing of numerics as an open item in his varlena patch, so it's not totally unreasonable to consider a patch for that now --- as long as it's pretty small and simple. I agree that in-place upgrade is a pipe dream until we see someone actually step forward and do the work to build a usable pg_upgrade utility. If nothing else, we should consider swapping the n_sign_dscale and n_weight fields now, since that would enable upward-compatible implementation of these ideas later. Otherwise any such patch would probably get rejected if pg_upgrade did happen to emerge out of nowhere. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs wrote: > On Mon, 2007-06-18 at 11:55 -0400, Tom Lane wrote: > > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > > Since this is your idea, would you like to do this, or should I? > > > > Go for it. > > OK > > > I'm not actually convinced this is worth spending time on, > > as Greg Stark's 1-byte-varlena patch already saved more for typical > > numerics than this will. > > Understood, thats why I left this till last. This will save even more on > top of those savings. My concern is to make this change now while we > can, since we will be aiming for 8.3->8.4 to be a binary upgrade. Would someone please explain why we are considering this so far past features freeze, and who suggtested that the 8.3->8.4 upgrade being a binary upgrade was anything more than a pipe dream? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
"Simon Riggs" <[EMAIL PROTECTED]> writes: > I think representing zero as compactly as possible is worth the trouble, Either of these proposals can do that in 2 bytes. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 12:44 -0400, Tom Lane wrote: > I wrote: > > Gregory Stark <[EMAIL PROTECTED]> writes: > >> Anything shorter than the shortest possible numeric representation can > >> implicitly be interpreted as some alternate compact representation. I > >> already had a patch that stored small integers in a single > >> NumericDigit without any numeric header at all. > > > That's getting well beyond the realm of reason IMHO. I doubt you can > > merge it with this proposal anyway --- how will you disambiguate from > > zero with a positive dscale ("0.00")? > > Hmm, I take that back: it could be merged with this proposal pretty > easily. Again assuming that we move the POS/NEG/NAN/SHORT "sign" > flags to the first header word, we can have: > > Sign=NAN: it's a NAN, regardless of anything else. We may as well > store NAN as just 2 bytes. > I'm not entirely convinced this is worth the extra pack/unpack > logic, since said integers would be 4 bytes (+1 byte header) in > the sign=SHORT representation, which is not that much more. > Also, this means at least three different representations of some > values, which is going to be a headache for comparison and hashing. > But the bit-space is there. I think representing zero as compactly as possible is worth the trouble, so that in itself is enough for me to say Yes. Rest seems good too. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
I wrote: > Gregory Stark <[EMAIL PROTECTED]> writes: >> Anything shorter than the shortest possible numeric representation can >> implicitly be interpreted as some alternate compact representation. I >> already had a patch that stored small integers in a single >> NumericDigit without any numeric header at all. > That's getting well beyond the realm of reason IMHO. I doubt you can > merge it with this proposal anyway --- how will you disambiguate from > zero with a positive dscale ("0.00")? Hmm, I take that back: it could be merged with this proposal pretty easily. Again assuming that we move the POS/NEG/NAN/SHORT "sign" flags to the first header word, we can have: Sign=NAN: it's a NAN, regardless of anything else. We may as well store NAN as just 2 bytes. Sign=SHORT: it's a non-NAN with limited weight and dscale range, as per my proposal. A zero value would still be only 2 bytes, but anything else is longer. (This would be needed only for zero with nonzero dscale, though.) Sign=POS or NEG: if length == 2 bytes then interpret the remaining 14 bits as a single NBASE digit, with assumed weight and dscale 0. This allows storing integers up to +/- in 2 bytes (+ 1 byte varlena header). If length > 2 then it is a traditional-format numeric. I'm not entirely convinced this is worth the extra pack/unpack logic, since said integers would be 4 bytes (+1 byte header) in the sign=SHORT representation, which is not that much more. Also, this means at least three different representations of some values, which is going to be a headache for comparison and hashing. But the bit-space is there. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
Gregory Stark <[EMAIL PROTECTED]> writes: > Anything shorter than the shortest possible numeric representation can > implicitly be interpreted as some alternate compact representation. I already > had a patch that stored small integers in a single NumericDigit without any > numeric header at all. That's getting well beyond the realm of reason IMHO. I doubt you can merge it with this proposal anyway --- how will you disambiguate from zero with a positive dscale ("0.00")? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing NUMERIC size for 8.3
"Tom Lane" <[EMAIL PROTECTED]> writes: > I had a thought though: it's possible to reduce the header overhead for > typical-size numbers without giving up the ability to store large ones. > This is because the POS/NEG/NAN sign possibilities leave one unused bit > pattern. Hence: I had a whack and doing something similar to this a while back -- though I was more concerned with optimizing very small numeric variables, especially small integers and typical currency-sized quantities. The bottleneck I ran into was desupporting the macros which access the sign, display scale, etc. Those are currently in numeric.h making them--in theory-- publicly available outside the internal functions. Afaik nothing actively uses them outside of numeric.c though. If we're happy desupporting those macros and not replacing them with anything comparable then there are additional bits of state hidden in the varlena size. Anything shorter than the shortest possible numeric representation can implicitly be interpreted as some alternate compact representation. I already had a patch that stored small integers in a single NumericDigit without any numeric header at all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 11:55 -0400, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > Since this is your idea, would you like to do this, or should I? > > Go for it. OK > I'm not actually convinced this is worth spending time on, > as Greg Stark's 1-byte-varlena patch already saved more for typical > numerics than this will. Understood, thats why I left this till last. This will save even more on top of those savings. My concern is to make this change now while we can, since we will be aiming for 8.3->8.4 to be a binary upgrade. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
"Simon Riggs" <[EMAIL PROTECTED]> writes: > Since this is your idea, would you like to do this, or should I? Go for it. I'm not actually convinced this is worth spending time on, as Greg Stark's 1-byte-varlena patch already saved more for typical numerics than this will. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 17:49 +0200, Andreas Pflug wrote: > I wonder if the currently waiting patch isn't Good Enough for > 999. % of use cases, and "all" others can use numeric > instead of numeric(1000,800) or so. Especially since there are many > patches waiting that do need further investigation and refining. That still has problems. Another approach would be to restrict the existing datatype NUMERIC to 508 digits, but introduce a new datatype of LONGNUM which allows arbitrary length numerics for those that *do* care. It might be more beneficial in the longer run to separate the use cases so we can further tune them (not sure how just yet...) -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andreas Pflug <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> In any case, no capability is lost, unlike the original proposal; and >> this would be much less invasive than the original patch since there's >> no need to play tricks with the content of the digit array. > I wonder if the currently waiting patch isn't Good Enough for > 999. % of use cases, and "all" others can use numeric > instead of numeric(1000,800) or so. Apparently you misunderstand that patch: it takes capability away from unconstrained numeric too. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 11:32 -0400, Tom Lane wrote: > Andreas Pflug <[EMAIL PROTECTED]> writes: > > Simon Riggs wrote: > >> The objections to applying this patch originally were: > >> 2. it would restrict number of digits to 508 and there are allegedly > >> some people that want to store > 508 digits. > >> > > If 508 digits are not enough, are1000 digits be sufficient? Both limits > > appear quite arbitrary to me. > > As per the recent discussion about factorial, the current limit of > numeric format is 10^131071 --- there is a whole lot of daylight between > that and 10^508. > > I had a thought though: it's possible to reduce the header overhead for > typical-size numbers without giving up the ability to store large ones. > This is because the POS/NEG/NAN sign possibilities leave one unused bit > pattern. Hence: > > 1. Switch the positions of the n_sign_dscale and n_weight fields in the > long format, so that the sign bits are in the first word. > > 2. Reserve the fourth "sign" bit pattern to denote a compressed-header > format in which there's just one uint16 header word and the > NumericDigits start right after that. The header word could contain: > 2 bits: "sign" distinguishing this from the two-word-header format > 1 bit: actual number sign (POS or NEG, disallow NaN) > 6 bits: weight, room for -32 .. 31 > 7 bits: dscale, room for 0 .. 127 > > 3. When packing a NumericVar into a Numeric, use this short format when > it's not a NaN and the weight and dscale are in range, else use the long > format. > > Since the weight is in base-1 digits, this bit allocation allows a > dynamic range of about +- 10^127 which fits well with the dscale range. > But I suspect that most of the use-cases for long numerics involve large > integers, so it might be more useful to shave another bit or two from > dscale and give 'em to weight. > > In any case, no capability is lost, unlike the original proposal; and > this would be much less invasive than the original patch since there's > no need to play tricks with the content of the digit array. Sounds good. I thought there'd be a way. Since this is your idea, would you like to do this, or should I? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Reducing NUMERIC size for 8.3
Tom Lane wrote: > Andreas Pflug <[EMAIL PROTECTED]> writes: > >> Simon Riggs wrote: >> >>> The objections to applying this patch originally were: >>> 2. it would restrict number of digits to 508 and there are allegedly >>> some people that want to store > 508 digits. >>> >>> >> If 508 digits are not enough, are1000 digits be sufficient? Both limits >> appear quite arbitrary to me. >> > > As per the recent discussion about factorial, the current limit of > numeric format is 10^131071 --- there is a whole lot of daylight between > that and 10^508. > > I had a thought though: it's possible to reduce the header overhead for > typical-size numbers without giving up the ability to store large ones. > This is because the POS/NEG/NAN sign possibilities leave one unused bit > pattern. Hence: > > 1. Switch the positions of the n_sign_dscale and n_weight fields in the > long format, so that the sign bits are in the first word. > > 2. Reserve the fourth "sign" bit pattern to denote a compressed-header > format in which there's just one uint16 header word and the > NumericDigits start right after that. The header word could contain: > 2 bits: "sign" distinguishing this from the two-word-header format > 1 bit: actual number sign (POS or NEG, disallow NaN) > 6 bits: weight, room for -32 .. 31 > 7 bits: dscale, room for 0 .. 127 > > 3. When packing a NumericVar into a Numeric, use this short format when > it's not a NaN and the weight and dscale are in range, else use the long > format. > > Since the weight is in base-1 digits, this bit allocation allows a > dynamic range of about +- 10^127 which fits well with the dscale range. > But I suspect that most of the use-cases for long numerics involve large > integers, so it might be more useful to shave another bit or two from > dscale and give 'em to weight. > > In any case, no capability is lost, unlike the original proposal; and > this would be much less invasive than the original patch since there's > no need to play tricks with the content of the digit array. > I wonder if the currently waiting patch isn't Good Enough for 999. % of use cases, and "all" others can use numeric instead of numeric(1000,800) or so. Especially since there are many patches waiting that do need further investigation and refining. Regards, Andreas ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andreas Pflug <[EMAIL PROTECTED]> writes: > Simon Riggs wrote: >> The objections to applying this patch originally were: >> 2. it would restrict number of digits to 508 and there are allegedly >> some people that want to store > 508 digits. >> > If 508 digits are not enough, are1000 digits be sufficient? Both limits > appear quite arbitrary to me. As per the recent discussion about factorial, the current limit of numeric format is 10^131071 --- there is a whole lot of daylight between that and 10^508. I had a thought though: it's possible to reduce the header overhead for typical-size numbers without giving up the ability to store large ones. This is because the POS/NEG/NAN sign possibilities leave one unused bit pattern. Hence: 1. Switch the positions of the n_sign_dscale and n_weight fields in the long format, so that the sign bits are in the first word. 2. Reserve the fourth "sign" bit pattern to denote a compressed-header format in which there's just one uint16 header word and the NumericDigits start right after that. The header word could contain: 2 bits: "sign" distinguishing this from the two-word-header format 1 bit: actual number sign (POS or NEG, disallow NaN) 6 bits: weight, room for -32 .. 31 7 bits: dscale, room for 0 .. 127 3. When packing a NumericVar into a Numeric, use this short format when it's not a NaN and the weight and dscale are in range, else use the long format. Since the weight is in base-1 digits, this bit allocation allows a dynamic range of about +- 10^127 which fits well with the dscale range. But I suspect that most of the use-cases for long numerics involve large integers, so it might be more useful to shave another bit or two from dscale and give 'em to weight. In any case, no capability is lost, unlike the original proposal; and this would be much less invasive than the original patch since there's no need to play tricks with the content of the digit array. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andreas Pflug wrote: Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store > 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. This 1000 is just a restriction on the typmod of numeric. You can still use a much higher number of digits, if you use unconstrained numeric: test=> create table test (n numeric); CREATE TABLE test=> insert into test values (10::numeric ^ 9); INSERT 0 1 test=> select length(n) from test; length 100017 (1 row) Best Regards Michael Paesold ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 10:54 -0400, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > We've changed the on-disk database format in 8.3, so we have an > > opportunity to change other things also. There is a patch thats been on > > the patch queue for some time called numeric508, submitted Dec 2005; > > I thought that idea had been rejected long since as being an unreasonable > reduction in the flexibility of numeric arithmetic. I've re-read all of the posts from Dec 2005 and I see everyone in favour of committing the patch, subject to review of whether limitation to 508 was acceptable. For me that was never finalised, at least I can't see a clear decision. The important point is that the patch can be easily modified to overcome these restrictions, a line of thought that was never explored. The current patch can be easily modified to allow 1000 digits, yet still saving 2 bytes per value when storing more common values. Or we can use the same techniques to represent arbitrary number of digits, again still saving 2 bytes on common values. We have the ground work done, we just need to agree further changes. > > The benefit of the patch is that it reduces each NUMERIC value by 2 > > bytes, so will speed up things considerably. This is now especially > > important if we are looking to reduce the speed of numeric division by a > > factor of 4 (recent WIP patch). > > The only way that patch would make division faster is by making the > slowest cases (longest input numbers) impossible. Which is hardly a > solution. It isn't directly related to division, but is a speed up nonetheless which can offset the loss. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 16:56 +0200, Andreas Pflug wrote: > Simon Riggs wrote: > > The objections to applying this patch originally were: > > 2. it would restrict number of digits to 508 and there are allegedly > > some people that want to store > 508 digits. > > > If 508 digits are not enough, are1000 digits be sufficient? Both limits > appear quite arbitrary to me. Thats the current limit; I agree, but I didn't choose it. IIRC if you don't specify a limit then you can have arbitrarily long numbers. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs wrote: > The objections to applying this patch originally were: > 2. it would restrict number of digits to 508 and there are allegedly > some people that want to store > 508 digits. > If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. Regards, Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
"Simon Riggs" <[EMAIL PROTECTED]> writes: > We've changed the on-disk database format in 8.3, so we have an > opportunity to change other things also. There is a patch thats been on > the patch queue for some time called numeric508, submitted Dec 2005; I thought that idea had been rejected long since as being an unreasonable reduction in the flexibility of numeric arithmetic. > The benefit of the patch is that it reduces each NUMERIC value by 2 > bytes, so will speed up things considerably. This is now especially > important if we are looking to reduce the speed of numeric division by a > factor of 4 (recent WIP patch). The only way that patch would make division faster is by making the slowest cases (longest input numbers) impossible. Which is hardly a solution. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate