Greg Stark wrote:
No, I think I'm OK there. These are programmatically-generated values and I've already been through them all once. Just the millisecond issue and the hex binary issue AFAIK.David Rysdam <[EMAIL PROTECTED]> writes:
In my brute force port, I just bulk copied the date
fields into temporary tables and then did a to_timestamp(field, 'Mon DD YYYY
HH:MI:SS:MSAM').
Again, I created a temporary table and did a decode(field, 'hex') to the
real table.
This is the standard approach. You're rather lucky these are the only data representation changes you've had to do so far. I fear you'll run into more and more complex changes over time and trying to avoid the temporary table will get harder and harder.
Very quick and easy to do one time. A little trickier to handle in an elegant, maintainable way for the dozens of data reloads I do every month for GBs of data onto two different server types.If it were me I would consider processing the files in perl. It should be pretty easy to do both of these modifications very quickly.
Why not create a type and then define the load function to be the equivalent of "decode('hex')"?If you really want to go with a custom C code then you might be able to just grab the byteain/byteaout functions from src/backend/util/adt/varlena into a separate module and create new functions with modified names. Load it with CREATE FUNCTION byteain ... AS 'my_bytea_funcs.so' 'my_byteain';
Or maybe create the function as my_byteain in postgres and then update the catalog entries somehow. I'm not sure how to do that but it shouldn't be too hard. And it might make it easier to do the substitution for the data load and then undo the change afterwards.
This is kind of a hybrid of my suggestions and the problems are a hybrid as well. :)Doing the same for timmestamp is a bit trickier but you could copy ParseDateTime from datetime.c as a static function for your module.
Be careful though, test this out thoroughly on a test database. I'm not sure of all the impacts of altering the in/out functions for data types. I expect it would break pg_dump, for example. And I would worry about the statistics tables too.
1) Just change the timestamp type so that it allows a ':' delimiter for milliseconds. Potential problems: Other parts of the code won't expect it. People don't want that.
2) Create a new type. Potential problem: Things like date ranges probably wouldn't work anymore, since the server wouldn't know it's a date now.
---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings