Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Am Sonntag, den 08.05.2005, 14:30 +0900 schrieb Tatsuo Ishii: ... Actually I myself thought as you are before. Later I found that it was not so good idea. People already have invalid encoded data in their precious database and have very hard time to migrate to newer version of PostgreSQL because of encoding validation. ... Now I need to upgrade to newer PostgreSQL within 1 day. I know I have to fix wrong encoded field but it will take more than 1 day. So I would like to import the data first then fix wrong encoded field on running database since I can reach members by phone or email even with wrong encoded address field... Well, if you are so in a hurry you better not migrate. Postgres is proud of validating the input and to have no invalid data. So if you have invalid data, better fix it. I saw this kind of situation in the real world and that's why we developed the patches. Why not developing a helper for contrib to help reencoding the database instead? Regards Tino ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Tatsuo Ishii wrote: Sent: Sunday, May 08, 2005 3:31 PM To: John Hansen Cc: [EMAIL PROTECTED]; pgsql-general@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem Tatsuo Ishii wrote: Sent: Sunday, May 08, 2005 12:01 PM To: [EMAIL PROTECTED] Cc: pgsql-general@postgresql.org; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem We have developed patches which relaxes the character validation so that PostgreSQL accepts invalid characters. It works like this: That is just plain 100% wrong!! Under no circumstances should there be invalid data in a database. And if you're trying to make a database of invalid data, then at least encode it using a valid encoding. In fact, I've proposed strengthening the validation routines for UTF-8. Actually I myself thought as you are before. Later I found that it was not so good idea. People already have invalid encoded data in their precious database and have very hard time to migrate to newer version of PostgreSQL because of encoding validation. Think about this kind of situation: There is a table t1(member_id integer primary key, member_name text, address text, phone text, email text). I have to reach each member by either adress, phone or email. Unfortunately some of address field have wrong encoded data. In this case I will use phone or email to reach them. Now I need to upgrade to newer PostgreSQL within 1 day. I know I have to fix wrong encoded field but it will take more than 1 day. So I would like to import the data first then fix wrong encoded field on running database since I can reach members by phone or email even with wrong encoded address field... Actually would be very simple, create function isvalidutf8(text) in your preferred language. C source is available from unicode.org. Create function converttoutf8(text) using whatever code is required to transform the _wrong_ encoding (SQL_ASCII - UTF8 for instance) to utf-8. Update table set field=converttoutf8(field) where !isvalidutf8(field); Now sit back and relax while your invalid data is converted to utf-8. When done, pg_dump the database, upgrade, and reload. This should take less than a day. I saw this kind of situation in the real world and that's why we developed the patches. -- Tatsuo Ishii ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Is there a way to store the name in raw binary? If so, would this not be safe because to postgresql it should no longer matter what data is or represents, right? Maybe there is a third option I am not yet concidering? In the backup rename the file and add another file file-name-with-invalids-set-to-?.README which explains the issue, details the steps taken (eg renaming) and offers some sort of raw binary value of the original file name. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
There is a table t1(member_id integer primary key, member_name text, address text, phone text, email text). I have to reach each member by either adress, phone or email. Unfortunately some of address field have wrong encoded data. In this case I will use phone or email to reach them. Now I need to upgrade to newer PostgreSQL within 1 day. I know I have to fix wrong encoded field but it will take more than 1 day. So I would like to import the data first then fix wrong encoded field on running database since I can reach members by phone or email even with wrong encoded address field... You could add a bytea field in the migration process which holds the original data and put some sloppily corrected data inside the real field for the time being. Then, once corrected set the additional bytea field to NULL. Pretty much the concept of staging tables for data import. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] SQL History
You can set postgresql.conf to log all queries to a file. Sadly you cant distinguish between the databases. Hrishikesh Deshmukh wrote: Hi All, Does Postgresql have a sql history of queries run capability? Any ideas/pointers will be a great help. Could one capture the order and SQL queries run?! Thanks in advance and your help is appreciated. Hrishi ---(end of broadcast)--- TIP 8: explain analyze is your friend -- -- Canaan Surfing Ltd. Internet Service Providers Ben-Nes Michael - Manager Tel: 972-4-6991122 Cel: 972-52-8555757 Fax: 972-4-6990098 http://www.canaan.net.il -- ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[GENERAL] Shorthand for foreign key indices
Does anybody else think it would be cool if you could use a shorthand expression for creating an index on a foreign key? I think it's fair to say that in the majority of cases, if you're using a foreign key, you're going to want an index on it. I know that it was decided a fair few releases ago to stop creating an implicit index for each foreign key, and that's cool, but should the creation syntax really require a separate statement for every single FK you want to index? Something like ... CREATE TABLE foo ( foo int NOT NULL REFERENCES bar INDEX ); ... would be marvellous My apologies if this has been answered before, but a search of the -general mailing list was not fruitful. BJ ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] SQL History
Michael Ben-Nes [EMAIL PROTECTED] writes: You can set postgresql.conf to log all queries to a file. Sadly you cant distinguish between the databases. See log_line_prefix in 8.0. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Shorthand for foreign key indices
On Mon, May 09, 2005 at 02:05:14AM +1000, Brendan Jurd wrote: CREATE TABLE foo ( foo int NOT NULL REFERENCES bar INDEX ); ... would be marvellous I agree that it would be handy. Another possibility is throwing a NOTICE or even WARNING if you create a foreign key that isn't covered by an index. My apologies if this has been answered before, but a search of the -general mailing list was not fruitful. FYI, often times new ideas are only discussed on -hackers, so you should search there as well. -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: Where do you want to go today? Linux: Where do you want to go tomorrow? FreeBSD: Are you guys coming, or what? ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] starting
I am interested in seting up postgres 7.4 or 8.0 which is best on SuSe 8.0 professional and where should I put the tar.gz for unzip and install? Thank you. Yahoo! Mail Stay connected, organized, and protected. Take the tour: http://tour.mail.yahoo.com/mailtour.html ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Howto prevent write based on date
rumor has it that Richard wrote: Franco Bruno Borghesi wrote: You could write a trigger like this: CREATE OR REPLACE FUNCTION checkDate() RETURNS TRIGGER LANGUAGE 'plpgsql' AS ' DECLARE limitDate DATE DEFAULT current_date-''1 year''::INTERVAL; BEGIN IF (OLD.date=limitDate) THEN RAISE EXCEPTION ''Cannot change record.''; END IF; RETURN NEW; END; '; CREATE TRIGGER _tg1 BEFORE UPDATE OR DELETE ON FOR EACH ROW EXECUTE PROCEDURE checkDate(); This should do the job :) I feel like I'm 1 meter tall and the wave on the beach are more than 3 meters high... Thank you for the code. It looks like it would need to be a part of any access to the database, so I imagine I would have to figure out where to put it into the front-end code. Is this correct? Franco's trigger function should do the job just fine, but speaking from experience you'll want to take further steps. Take a backup of the database, restore it to another system and also burn a copy to a CD. If the auditors come round it's simple to explain what you've done and demonstrate the data on the CD and backup system match. It also means that should any changes occur to your historical data despite your precautions you can prove that this happened. Ahh, that is a good idea! A database dump is a part of my daily backup. I guess I could also make a read-only copy of the year-end as a second database on the same system. That could make it easy to keep an eye on the main database so I (hopefully) spot any ripples that reach back to last year. Thanks for the help! Philip ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [GENERAL] Invalid unicode in COPY problem
Madison Kelly wrote: Is there a way to store the name in raw binary? Yes: bytea. -O ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq