[GENERAL] Supporting SQL/MED DATALINK
Hello,

Do you plan on supporting the SQL/MED features concerning DATALINKs? I've seen DATALINK mentioned on the wiki [1], but I couldn't find it on the TODO list [2].

By the way, do you know of any vendor that provides support for DATALINK? Apart from DB2, there seems to be very little visibility of this 2003 feature in the whole SQL world!

Regards,

[1] http://wiki.postgresql.org/wiki/DATALINK
[2] http://wiki.postgresql.org/wiki/Todo

--
Damiano ALBANI
Re: [GENERAL] using pg's internal timezone database?
On Tue, Dec 20, 2011 at 05:29:15AM -0700, Scott Marlowe wrote:
> On Tue, Dec 20, 2011 at 2:05 AM, Louis-David Mitterrand
> vindex+lists-pgsql-gene...@apartia.org wrote:
>> Hi,
>>
>> To provide my forum users with a 'timezone' preference in their
>> profile, how can I use PostgreSQL's internal table of timezones? I
>> found a reference to it here:
>> http://www.postgresql.org/docs/7.2/static/timezones.html
>> but not in the docs for recent versions.
>
> You can view the pg internal tz database by looking at what's in the
> tables pg_timezone_names and pg_timezone_abbrevs.

Hi,

Thanks for that info. There are 1146 timezones listed in that table! Which sublist is the most commonly used or standard? posix/* ?

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
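For user-facing timezone preferences, the "Area/Location" names from the IANA database are generally the ones to offer, rather than the posix/* duplicates or the bare abbreviations. A sketch of a query that narrows the list down (the view and column names are from the PostgreSQL docs; the continent filter is my own guess at a useful subset):

```sql
-- List the widely used IANA "Area/Location" zone names,
-- skipping posix/* entries and bare abbreviations.
SELECT name, utc_offset, is_dst
FROM pg_timezone_names
WHERE name ~ '^(Africa|America|Antarctica|Asia|Atlantic|Australia|Europe|Indian|Pacific)/'
ORDER BY name;
```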
Re: [GENERAL] Vacuum and Large Objects
-----Original Message-----
From: Stefan Keller [mailto:sfkel...@gmail.com]
Sent: Friday, January 06, 2012 1:12 AM
To: Igor Neyman
Cc: Simon Windsor; pgsql-general@postgresql.org
Subject: Re: Vacuum and Large Objects

Hi Igor

2011/12/16 Igor Neyman iney...@perceptron.com wrote:
> But I think, your problem is right here: "running VACUUM FULL
> pg_largeobject". If you are running VACUUM FULL ... on the table, you
> should follow it with REINDEX TABLE ..., at least on PG versions prior
> to 9.0.

I'm pretty sure that VACUUM FULL builds new indexes. That's at least how I understand the docs, especially the first tip here:
http://www.postgresql.org/docs/9.0/static/routine-vacuuming.html

Yours, Stefan

2011/12/16 Igor Neyman iney...@perceptron.com:
>> From: Simon Windsor [mailto:simon.wind...@cornfield.me.uk]
>> Sent: Wednesday, December 14, 2011 3:02 PM
>> To: pgsql-general@postgresql.org
>> Subject: Vacuum and Large Objects
>>
>> Hi
>>
>> I am having problems recovering storage from a Postgres 9.05 database
>> that is used to hold large XML blocks for a week, before they are
>> archived off line.
>>
>> The main tables are partitioned in daily partitions, and these are
>> easy to manage, however the DB keeps growing despite using Vacuum
>> (daily at 0700) and autovacuum (this does not seem to run, although
>> the process is running). The system is insert only, and partitions are
>> dropped when over 7 days of age.
>>
>> I believe the issue lies with pg_largeobject; it is split between 88
>> files of approx. 1 GB each.
>>
>> The Postgres settings are default, EXCEPT (grep ^[a-z] postgresql.conf):
>>
>>   listen_addresses = '*'                 # what IP address(es) to listen on;
>>   port = 5432                            # (change requires restart)
>>   max_connections = 1000                 # (change requires restart)
>>   shared_buffers = 256MB                 # min 128kB
>>   work_mem = 4MB                         # min 64kB
>>   maintenance_work_mem = 256MB           # min 1MB
>>   vacuum_cost_delay = 20ms               # 0-100 milliseconds
>>   checkpoint_segments = 32               # in logfile segments, min 1, 16MB each
>>   checkpoint_completion_target = 0.9     # checkpoint target duration, 0.0 - 1.0
>>   checkpoint_warning = 60s               # 0 disables
>>   archive_mode = off                     # allows archiving to be done
>>   constraint_exclusion = partition       # on, off, or partition
>>   log_destination = 'stderr'             # Valid values are combinations of
>>   logging_collector = on                 # Enable capturing of stderr and csvlog
>>   silent_mode = on                       # Run server silently.
>>   log_checkpoints = on
>>   log_line_prefix = '%t %d %u '          # special values:
>>   log_statement = 'none'                 # none, ddl, mod, all
>>   track_activities = on
>>   track_counts = on
>>   autovacuum = on                        # Enable autovacuum subprocess? 'on'
>>   log_autovacuum_min_duration = 250      # -1 disables, 0 logs all actions and
>>   autovacuum_max_workers = 3             # max number of autovacuum subprocesses
>>   autovacuum_naptime = 3min              # time between autovacuum runs
>>   autovacuum_vacuum_threshold = 500      # min number of row updates before
>>   autovacuum_analyze_threshold = 100     # min number of row updates before
>>   autovacuum_vacuum_scale_factor = 0.1   # fraction of table size before vacuum
>>   autovacuum_analyze_scale_factor = 0.05 # fraction of table size before analyze
>>   autovacuum_vacuum_cost_delay = 5ms     # default vacuum cost delay for
>>   autovacuum_vacuum_cost_limit = 200     # default vacuum cost limit for
>>   statement_timeout = 0                  # in milliseconds, 0 is disabled
>>   datestyle = 'iso, dmy'
>>   lc_messages = 'en_GB.UTF-8'            # locale for system error message
>>   lc_monetary = 'en_GB.UTF-8'            # locale for monetary formatting
>>   lc_numeric = 'en_GB.UTF-8'             # locale for number formatting
>>   lc_time = 'en_GB.UTF-8'                # locale for time formatting
>>   default_text_search_config = 'pg_catalog.english'
>>
>> Besides running VACUUM FULL pg_largeobject;, is there a way I can get
>> autovacuum to start and clear this up?
>>
>> All the best
>>
>> Simon
>>
>> Simon Windsor
>> Eml: simon.wind...@cornfield.org.uk
>> Tel: 01454 617689
>> Mob: 07590 324560
>>
>> "There is nothing in the world that some man cannot make a little
>> worse and sell a little cheaper, and he who considers price only is
>> that man's lawful prey."
>
> I might be a bit late in this discussion. But I think, your problem is
> right here: "running VACUUM FULL pg_largeobject". If you are running
> VACUUM FULL ... on the table, you should follow it with REINDEX TABLE
> ..., at least on PG versions prior to 9.0.
>
> Regards,
> Igor Neyman

Please, disregard my previous message. I missed "Postgres 9.05 database" in the original post.

Regards,
Igor Neyman
Re: [GENERAL] Vacuum and Large Objects
Stefan,

-----Original Message-----
From: Stefan Keller [mailto:sfkel...@gmail.com]
Sent: Friday, January 06, 2012 1:12 AM
To: Igor Neyman
Cc: Simon Windsor; pgsql-general@postgresql.org
Subject: Re: Vacuum and Large Objects

[Quoted thread trimmed; identical to the preceding message.]

I qualified my reply with PG version specifics, since the original poster didn't specify which Postgres version he's using.

Regards,
Igor Neyman
Re: [GENERAL] function return update count
Kevin Duffy kevind0...@gmail.com wrote on 6 January 2012 at 00:32:

> Hello:
>
> I am trying to get a function to return the count of the rows updated
> within the function. As in the following, I want the number of rows
> updated to be returned. This is a simple update; other update
> statements that I need to write will be complicated.
>
>   CREATE OR REPLACE FUNCTION est_idio_return_stats_update()
>   RETURNS integer AS '
>     update est_idiosyncratic_return_stats
>     set delta_avg_60      = avg_60      - period_61_return,
>         delta_avg_last_24 = avg_last_24 - period_61_return,
>         delta_avg_last_18 = avg_last_18 - period_61_return,
>         delta_avg_last_12 = avg_last_12 - period_61_return,
>         delta_avg_last_6  = avg_last_06 - period_61_return;
>   ' LANGUAGE SQL;
>
> The above returns the following:
>
>   ERROR: return type mismatch in function declared to return integer
>   DETAIL: Function's final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING.
>   CONTEXT: SQL function est_idio_return_stats_update
>   SQL state: 42P13
>
> Thanks for your kind assistance.
>
> KD

I think you can rewrite this in PL/pgSQL (the language, not SQL), and then you can use

  GET DIAGNOSTICS integer_var = ROW_COUNT;

Andreas
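Andreas's suggestion, spelled out as a sketch (the function, table, and column names are taken from the original post; this is untested against the poster's schema):

```sql
CREATE OR REPLACE FUNCTION est_idio_return_stats_update()
RETURNS integer AS $$
DECLARE
    n integer;
BEGIN
    UPDATE est_idiosyncratic_return_stats
    SET delta_avg_60      = avg_60      - period_61_return,
        delta_avg_last_24 = avg_last_24 - period_61_return,
        delta_avg_last_18 = avg_last_18 - period_61_return,
        delta_avg_last_12 = avg_last_12 - period_61_return,
        delta_avg_last_6  = avg_last_06 - period_61_return;
    -- ROW_COUNT holds the number of rows affected by the last command.
    GET DIAGNOSTICS n = ROW_COUNT;
    RETURN n;
END;
$$ LANGUAGE plpgsql;
```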
[GENERAL] How to code lo_creat lo_write lo_read in non-blocking mode
I just wonder if there is a way to use the large object client interfaces (lo_creat, lo_write, lo_read) in non-blocking mode. PQsendQueryParams works perfectly for executing a SQL command in non-blocking mode, but I couldn't find a way of handling large objects like that. Do you have any example?

Thank you,
Choon Park
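One possible workaround, sketched here as an assumption rather than a documented pattern: the client-side lo_* functions are wrappers around server-side functions, so the same operations can be dispatched as ordinary queries through the asynchronous API (PQsendQueryParams plus PQgetResult). The server-side functions lo_creat, lo_open, lowrite, and lo_close do exist; the :placeholders below stand for parameters you would bind yourself:

```sql
-- Send each statement via PQsendQueryParams() and collect results with
-- PQgetResult() in non-blocking mode, instead of calling lo_creat()/lo_write().
-- Large object descriptors only live for the transaction, hence the BEGIN/COMMIT.
BEGIN;
SELECT lo_creat(-1);           -- returns the OID of the new large object
SELECT lo_open(:oid, 131072);  -- 131072 = INV_WRITE; returns a descriptor
SELECT lowrite(:fd, :data);    -- :data bound as a bytea parameter
SELECT lo_close(:fd);
COMMIT;
```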
Re: [GENERAL] Running multiple versions
praka...@uci.edu writes:
> Thanks for your reply, Scott. Your suggestion sounds like it would be
> fine, but I have a few more questions now. The docs at
> http://www.postgresql.org/docs/9.1/interactive/upgrading.html
> recommend using the new version's pg_dumpall to back up the existing
> cluster for reload. In light of your reply, it seems you pretty much
> *have* to change the pg superuser's LD_LIBRARY_PATH first.

IMO you should not be depending on LD_LIBRARY_PATH at all; getting it right in a multi-installation scenario is simply too error-prone. Each installation should be built with an rpath specification that points at its library directory. End of problem.

--disable-rpath is really only useful when building packages for distributions that have a distro policy against using rpath (because all shared libraries are supposed to be placed in system directories). It's not a good idea in any case where you have to put the libraries in non-system-standard directories.

regards, tom lane
[GENERAL] URGENT: temporary table not recognized?
Hi. I'm using PostgreSQL 9.0.5, and the connection is made via pgbouncer.

I create a temporary table, and then want to import data into this table via a COPY command. Yet this just-created table is not being recognized. What's up? From my terminal:

  mydb=# create temporary table vl (
             alias varchar(35)
             ,dates timestamp without time zone
             ,referers text null
         );
  CREATE TABLE
  Time: 1.871 ms
  mydb=# copy vl from '/backup/data.txt' WITH DELIMITER AS '|';
  ERROR:  relation "vl" does not exist
  mydb=#

Weird! Welcome any thoughts. Thanks
Re: [GENERAL] function return update count
In 9.1 you could use an updatable CTE and, in the main query, perform and return a count. I would think plpgsql would be the better option though.

On Jan 6, 2012, at 2:29, Andreas Kretschmer andr...@a-kretschmer.de wrote:

> [Quoted message trimmed; see Kevin Duffy's post and Andreas's reply above.]
Re: [GENERAL] URGENT: temporary table not recognized?
Phoenix Kiula phoenix.ki...@gmail.com writes:
> Hi. I'm using Postgresql 9.0.5, and the connection is made via
> pgbouncer.

Perhaps pgbouncer is redirecting the second command to a different session?

regards, tom lane
Re: [GENERAL] function return update count
On 06/01/12 16:33, David Johnston wrote:
> In 9.1 you could use an updatable CTE and in the main query perform
> and return a count. I would think plpgsql would be the better option
> though.

For the SQL option, it would be this (9.1 only though - I think David's right there).

  CREATE FUNCTION f1() RETURNS int AS $$
    WITH rows AS (
      UPDATE t1 SET ... WHERE ... RETURNING 1
    )
    SELECT count(*)::int FROM rows
  $$ LANGUAGE SQL;

Unfortunately you can't do "UPDATE ... RETURNING count(*)" directly, so you need to go through this business with the CTE (WITH clause). Oh - the cast to int is because count() returns bigint.

--
Richard Huxton
Archonet Ltd
Re: [GENERAL] URGENT: temporary table not recognized?
Tom Lane wrote:
> Phoenix Kiula phoenix.ki...@gmail.com writes:
>> Hi. I'm using Postgresql 9.0.5, and the connection is made via
>> pgbouncer.
>
> Perhaps pgbouncer is redirecting the second command to a different
> session?

This may be OT, but are temp tables also removed when setting a new session authorization?

Bosco.
[GENERAL] DROP ROLE prevented by dependency
psql 8.3.16

I've done some searching of the mailing list, etc., but have either not searched the right way, or something. If anyone could throw a bone on this it would be appreciated. I have no idea what the procedure is for figuring out how to drop this role:

  DROP ROLE myuser;
  ERROR:  role "myuser" cannot be dropped because some objects depend on it
  DETAIL:  1 objects in database postgres

The detail seems kind of skimpy. Any pointers on how to hunt this down?

---
Kevin R. Bulgrien
Design and Development Engineer           http://www.gdsatcom.com/
General Dynamics SATCOM Technologies      Tel: 903-295-1480 x24109
3750 W. Loop 281                               903-381-4109
Longview, TX 75604-5438                   Fax: 903-295-1479

This message and/or attachments may include information subject to GD Corporate Policy 07-105 and is intended to be accessed only by authorized personnel of General Dynamics and approved service providers. Use, storage and transmission are governed by General Dynamics and its policies. Contractual restrictions apply to third parties. Recipients should refer to the policies or contract to determine proper handling. Unauthorized review, use, disclosure or distribution is prohibited. If you are not an intended recipient, please contact the sender and destroy all copies of the original message.
[GENERAL] Binary Large Objects (LOB/BLOB) in Hibernate and JDBC: Unresolved issues
Hi,

I've run into a nasty behavior of the current PostgreSQL JDBC driver. I maintain images (from webcams). In the Java/Hibernate (JPA) code I specified a @Lob annotation on class MyData and an attribute of data type byte[] mydata. Hibernate then generates a table MyData in PostgreSQL with a column mydata of type oid, while the binary data itself ends up in the internal catalog pg_largeobject (keyed by that oid). That's also explained in the JDBC docs [1], saying:

"PostgreSQL provides two distinct ways to store binary data. Binary data can be stored in a table using the data type bytea or by using the Large Object feature which stores the binary data in a separate table in a special format and refers to that table by storing a value of type oid in your table."

Now, when replacing the images (a few hundred) with new ones using Java, pg_largeobject grew constantly until the file system ran out of space. So old image data didn't get released! This is a bug to me, because the user/programmer must (and should) assume that there is a strict 1:1 relationship between the generated table MyData and its LOB column data (stored in pg_largeobject).

=> I finally found the supplied module 'lo' [2] which releases detached records. Is this the recommended way to resolve this problem?

Searching for explanations I found a ticket HHH-2244 [3] which was closed by the Hibernate team without action, referring to the JDBC spec, which says: "An SQL BLOB is a built-in type that stores a Binary Large Object as a column value in a row of a database table."

=> In other words: the PostgreSQL JDBC team should take action on this but didn't until now, right?

There is another issue about PostgreSQL and BLOBs [4]. First it cites the PostgreSQL JDBC docs [1]. The thread [4] ends somehow too in a 'deadlock', concluding "...the PostgreSQLDialect (as of 3.5.5) needs to change not to use MaterializedBlobType until the Postgres (JDBC) team changes their driver (which does not seem to have happened in the last 6 years)."

=> Any solutions or comments from insiders on this?

Yours, Stefan

[1] http://jdbc.postgresql.org/documentation/head/binary-data.html
[2] http://www.postgresql.org/docs/current/interactive/lo.html
[3] https://hibernate.onjira.com/browse/HHH-2244
[4] http://relation.to/Bloggers/PostgreSQLAndBLOBs
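For reference, the trigger-based cleanup from the 'lo' module [2] looks roughly like this; the table and column names below are invented for illustration, but the lo domain and the lo_manage trigger function are the documented ones:

```sql
-- 9.1+; on earlier versions, load the contrib lo SQL script instead.
CREATE EXTENSION lo;

-- Declare the column with the lo domain (an alias for oid) and attach the
-- lo_manage trigger, which unlinks the old large object whenever the
-- reference is updated or the row is deleted.
CREATE TABLE mydata (
    id     integer PRIMARY KEY,
    mydata lo
);

CREATE TRIGGER t_mydata
    BEFORE UPDATE OR DELETE ON mydata
    FOR EACH ROW EXECUTE PROCEDURE lo_manage(mydata);
```

Alternatively, the contrib client program vacuumlo can be run periodically to delete large objects that are no longer referenced from any oid column.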
[GENERAL] How PostgreSQL handles Binary Large Objects (LOB/BLOB): types BYTEA, OID/pg_largeobjects and DATALINK
Hi,

I'd like to get more insight into how to handle (binary) large object types (LOB/BLOB/CLOB) in Postgres, and I think we should clarify the situation, since to me there are some inconsistencies at least in the docs (or my understanding). I could try to summarize e.g. in the Postgres wiki.

In the current docs the BLOB data type is mentioned as unsupported (chapter D.2 "Unsupported Features", item T041-01 "Basic object support" [1]). But we have 1. type bytea, 2. oid/pg_largeobject (at least in JDBC), and now 3. type DATALINK [5].
=> Why not take one of those as the BLOB data type?

The wiki page about Oracle and BLOBs [2] seems to me a little bit outdated. It says "Binary large object support in Postgres is very poor and unsuitable for use in a 24/7 environment, because you can't dump them with pg_dump."
=> Still true? Do we want to dump them anyway?

The JDBC docs [3] say "PostgreSQL provides two distinct ways to store binary data. Binary data can be stored in a table using the data type bytea or by using the Large Object feature which stores the binary data in a separate table in a special format and refers to that table by storing a value of type oid in your table."
=> I assume the approach with a separate table (pg_largeobject) is also feasible in plain PostgreSQL schema/usage without JDBC involved?
=> Shouldn't there be some reference in the JDBC docs to the solution with module lo [4]?
=> Should one disable TOAST compression here by doing an ALTER column SET STORAGE EXTERNAL?

There's the new data type DATALINK [5]. A DATALINK allows referencing files from tables. Thanks to this, big files (like images) can be saved externally while maintaining support for access control and integrity mechanisms. Currently I only find something in the wiki [5] but can't find a reference to DATALINK in the PostgreSQL docs (except as a reserved keyword).
=> Is this an omission in the docs?
So to make an initial summary, in PostgreSQL AFAIK there are at least three common ways to manage binary large objects (LOB/BLOB):

1. A column of type BYTEA, which stores the LOB within the table (respectively using TOAST).
2. A column of type OID in the user table, with an internal separate table which stores the LOB out of line (the catalog pg_largeobject, keyed by that oid and holding bytea pages). Hint: see module 'lo' [4].
3. The DATALINK type, which maintains a link to a specific file in external storage. Example:
   CREATE TABLE mydata (myid Integer, myimage DATALINK(40) FILE LINK CONTROL INTEGRITY ALL);

I'm interested in any comments.

Yours, Stefan

[1] http://www.postgresql.org/docs/current/static/unsupported-features-sql-standard.html
[2] http://wiki.postgresql.org/wiki/Oracle_to_Postgres_Conversion#BLOBs
[3] http://jdbc.postgresql.org/documentation/head/binary-data.html
[4] http://www.postgresql.org/docs/current/interactive/lo.html
[5] http://wiki.postgresql.org/wiki/DATALINK
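For option 1, the TOAST question raised above would be handled along these lines; the table and column names are invented for illustration, while ALTER ... SET STORAGE EXTERNAL is the documented syntax:

```sql
-- Store binary data inline as bytea. EXTERNAL keeps large values
-- out-of-line in the TOAST table but disables compression, which can
-- speed up partial (substring-style) reads and avoids pointless
-- recompression of already-compressed data such as JPEGs.
CREATE TABLE images (
    id    serial PRIMARY KEY,
    image bytea
);

ALTER TABLE images ALTER COLUMN image SET STORAGE EXTERNAL;
```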
Re: [GENERAL] DROP ROLE prevented by dependency
"Bulgrien, Kevin" kevin.bulgr...@gdsatcom.com writes:
> psql 8.3.16
>
> I've done some searching of mailing list, etc., but have either not
> searched the right way, or something. If anyone could throw a bone on
> this it would be appreciated. I have no idea what the procedure is for
> figuring out how to drop this role:
>
>   DROP ROLE myuser;
>   ERROR:  role "myuser" cannot be dropped because some objects depend on it
>   DETAIL:  1 objects in database postgres

Try the command while connected to the postgres database. It can't give you more detail than that from where you are, for lack of visibility into the other database's system catalogs.

Also, read up on DROP OWNED BY.

regards, tom lane
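A sketch of the sequence Tom describes, using the role name from the original post (whether DROP OWNED BY alone suffices depends on what the dependent objects are):

```sql
-- In psql, connect to the database named in the DETAIL line, then:
\c postgres
DROP OWNED BY myuser;   -- drops objects owned by the role in THIS database
                        -- and revokes privileges granted to it here
DROP ROLE myuser;
```

Note that DROP OWNED BY only acts on the current database, so if several databases report dependencies it has to be repeated in each of them before the role can be dropped.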
Re: [GENERAL] How PostgreSQL handles Binary Large Objects (LOB/BLOB): types BYTEA, OID/pg_largeobjects and DATALINK
Stefan Keller sfkel...@gmail.com writes:
> The wiki page about Oracle and BLOBs [2] seems to me a little bit
> outdated. It says "Binary large object support in Postgres is very
> poor and unsuitable for use in a 24/7 environment, because you can't
> dump them with pg_dump."

That whole page is largely obsolete, I think, but that particular claim hasn't been true since 8.1 or so.

> There's the new data type DATALINK [5].

AFAICT that page is a wish-list. It certainly doesn't describe any committed code, and if there's work in progress for that, I don't know about it.

regards, tom lane
Re: [GENERAL] DROP ROLE prevented by dependency
-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: Friday, January 06, 2012 12:22 PM
To: Bulgrien, Kevin
Cc: 'pgsql-general@postgresql.org'
Subject: Re: [GENERAL] DROP ROLE prevented by dependency

>> I have no idea what the procedure is for figuring out how to drop
>> this role:
>>
>>   DROP ROLE myuser;
>>   ERROR:  role "myuser" cannot be dropped because some objects depend on it
>>   DETAIL:  1 objects in database postgres
>
> Try the command while connected to the postgres database. It can't
> give you more detail than that from where you are, for lack of
> visibility into the other database's system catalogs.

Hah. I was connecting to template1 to do the DROP, so I changed to postgres and got:

  DROP ROLE myuser;
  ERROR:  role "myuser" cannot be dropped because some objects depend on it
  DETAIL:  1 objects in database template1

That jostled loose a clue. It seemed to imply that role myuser had some setup (GRANT/REVOKE) done in both postgres AND template1. After undoing all custom role setup (GRANT/REVOKE) in both postgres and template1, the role dropped fine.

Apparently GRANTs and REVOKEs on a role are tied to the database you are connected to when they are issued, so they can only be undone by connecting to the same database. I have to be careful to do role management consistently in the same database context. Normally I do all the database-owner role stuff while connected to template1, but somehow that user had some setup in both template1 and postgres. I vaguely remember connecting to postgres some time ago when trying to figure out how to dump the global data not associated with an application database. I must have forgotten which context I was in when playing around with the role.

It's fixed now. Thanks for the tip! I think I learned something important in the process.
Kevin Bulgrien
[GENERAL] please help understand freeing shared buffers
Among the following queries, only THREE runs fast enough for me. I can't understand the logic behind this. I once thought that shared buffer replacement is independent of whether the backend which used it is still alive. But is it true? It seems as if shared buffers are returned to the pool or freed just after the client disconnects?

  $ cat test.sh
  sql="explain (analyze,buffers) select getlocationid_faster2('New York','10003','NY','US',40.73,-73.99);"
  psql="psql -X dev postgres"
  echo ONE
  echo "$sql" | $psql
  echo "TWO THREE"
  echo "$sql $sql" | $psql
  echo FOUR
  echo "$sql" | $psql

  $ . test.sh
  ONE
                                    QUERY PLAN
  Result (cost=0.00..0.26 rows=1 width=0) (actual time=3.733..3.735 rows=1 loops=1)
    Buffers: shared hit=294
  Total runtime: 3.769 ms
  (3 rows)

  TWO THREE
                                    QUERY PLAN
  Result (cost=0.00..0.26 rows=1 width=0) (actual time=3.717..3.719 rows=1 loops=1)
    Buffers: shared hit=294
  Total runtime: 3.754 ms
  (3 rows)

                                    QUERY PLAN
  Result (cost=0.00..0.26 rows=1 width=0) (actual time=0.521..0.523 rows=1 loops=1)
    Buffers: shared hit=4
  Total runtime: 0.540 ms
  (3 rows)

  FOUR
                                    QUERY PLAN
  Result (cost=0.00..0.26 rows=1 width=0) (actual time=3.740..3.742 rows=1 loops=1)
    Buffers: shared hit=294
  Total runtime: 3.777 ms
  (3 rows)
Re: [GENERAL] URGENT: temporary table not recognized?
On Fri, Jan 6, 2012 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Phoenix Kiula phoenix.ki...@gmail.com writes:
>> Hi. I'm using Postgresql 9.0.5, and the connection is made via
>> pgbouncer.
>
> Perhaps pgbouncer is redirecting the second command to a different
> session?

Thanks Tom. I'm in the exact same session in my terminal, and the commands are entered within 2 seconds of each other. With copy/paste, maybe split microseconds of each other. How can I make sure pgbouncer takes it all in the same session? I also tried the two commands within a transaction.
Re: [GENERAL] please help understand freeing shared buffers
Filip Rembiałkowski filip.rembialkow...@gmail.com writes:
> Among following queries, only THREE runs fast enough for me. I can't
> understand the logic behind this.

I'm not sure why you'd expect real answers when you haven't shown us what the query is doing, but my first thought is that the discrepancy comes from additional buffer touches in the first execution of a query in a given backend; which is not exactly surprising because that backend has to load up its system catalog caches. IOW, the excess touches represent accesses to system catalogs, not user tables.

In general, if you're annoyed by query execution times measured in milliseconds, you'd be best advised not to start a fresh connection for each one. A new connection not only involves a process launch but a fair amount of loading of local caches, and a large part of the latter work happens during the first few queries it processes.

regards, tom lane
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/2012 01:11 PM, Phoenix Kiula wrote: On Fri, Jan 6, 2012 at 11:46 AM, Tom Lanet...@sss.pgh.pa.us wrote: Phoenix Kiulaphoenix.ki...@gmail.com writes: Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer. Perhaps pgbouncer is redirecting the second command to a different session? Thanks Tom. I'm in the exact same session in my terminal, and the commands are entered within 2 seconds of each other. With copy/paste, maybe split microseconds of each other. How can I make sure pgbouncer takes it all in the same session? I also tried the two commands within a transaction. Sounds like you are using statement pooling - every statement can be assigned to a different server connection. You may need transaction pooling or session pooling: http://pgbouncer.projects.postgresql.org/doc/usage.html Cheers, Steve -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
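For reference, the pool mode Steve describes is set in pgbouncer.ini; this fragment is only a sketch (the database name, addresses and ports are assumptions, not taken from the thread):

```ini
[databases]
; clients connecting to "mydb" on the bouncer are routed to this server
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
; session     - one server connection per client session (safest for temp tables)
; transaction - a transaction stays on one server connection
; statement   - each statement may land on a different server connection
pool_mode = transaction
```

With statement pooling, a CREATE TEMP TABLE and a following COPY can land on different server connections, which is exactly the symptom described above.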
Re: [GENERAL] please help understand freeing shared buffers
2012/1/6 Tom Lane t...@sss.pgh.pa.us:
 Filip Rembiałkowski filip.rembialkow...@gmail.com writes:
 Among the following queries, only THREE runs fast enough for me. I can't understand the logic behind this.
 I'm not sure why you'd expect real answers when you haven't shown us what the query is doing,

it is a UDF, encapsulating a single SELECT where a=$1 and b=$2 and c=$3

 but my first thought is that the discrepancy comes from additional buffer touches in the first execution of a query in a given backend; which is not exactly surprising because that backend has to load up its system catalog caches. IOW, the excess touches represent accesses to system catalogs, not user tables. In general, if you're annoyed by query execution times measured in milliseconds, you'd be best advised not to start a fresh connection for each one. A new connection not only involves a process launch but a fair amount of loading of local caches, and a large part of the latter work happens during the first few queries it processes.

thank you, that explains a lot. I misinterpreted the number of buffer hits as true buffer reads.

sure, using persistent connections is what I will do (we have pgbouncer here)

Filip
Re: [GENERAL] JOIN column maximum
On Thu, Jan 5, 2012 at 10:19 PM, Darren Duncan dar...@darrenduncan.net wrote: This all being said, 23K values per row just sounds wrong, and I can't imagine any census forms having that many details. Do you, by chance, have multiple values of the same type that are in different fields, eg telephone_1, telephone_2 or child_1, child_2 etc? You should take any of those and collect them into array-typed fields, or separate tables with just telephone or child columns. Or do you say have a set of coordinates in separate fields? Or you may have other kinds of redundancy within single rows that are best normalized into separate rows. With 23K values, these probably have many mutual associations, and you could split that table into a bunch of other ones where columns that relate more closely together are collected. My question is already answered, so this is mostly for anyone curious about Census data, and if anyone wants to add to this, feel free. You're right that no census form has that many details. The American Community Survey replaced the old Census long form, so it does have more details than the form sent out for the 2010 Census, but not 23,000 questions. It might ask, e.g. income, sex, and how you travel to work. But the forms are private, so that data (the so-called microdata) is not released in its entirety. What I am working with is called the summary file, which presents the data in aggregate. That means you have an income table with 16 income classes, plus a total column. Then you have 9 more tables which show the same income classes by 9 racial and ethnic categories, for a total of 153 more columns. Then you also have a table which crosses 9 income classes by 5 mobility statuses (living in same house, moved within county, moved within state, etc.) for a total of 55 columns. Then you have income classes crossed with sex, income classes crossed with mode of transportation to work, sex crossed with mode of transportation to work, etc. 
When all is said and done, you have 23,000 columns. Believe me, we would all love to be working directly with the microdata. All of these different ways of slicing and dicing the categories are basically how the Census Bureau provides as much detail as possible without providing so much detail that privacy would be compromised (i.e., enough information is available that specific individuals could be identified). That inevitably leads to a great deal of redundancy in the data, since the same individuals are just being counted in different groups in different tables.

Given all that, one could still take the data that came from Census and try to normalize it and organize it, but my main goal, given the size of the dataset, is to keep it as similar as possible to the source, so that a researcher familiar with the Census data can work with our database using the Census documentation, without our having to produce a new set of documentation. The way I had done that was to store the sequences (which are merely a data dissemination convenience, and have no relationship to the logic of the data) in Postgres tables, and make the subject tables (which do represent logical groupings of data) into views. I'm thinking about making the sequences into array columns. The subject tables would still be views.

--Lee

--
Lee Hachadoorian
PhD, Earth Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/
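A rough sketch of the array-column layout Lee describes; the table, view and column names here are invented for illustration, not actual Census identifiers:

```sql
-- One Postgres table per Census "sequence", with the cells in an array
CREATE TABLE seq0001 (
    geoid text PRIMARY KEY,  -- geographic identifier
    cells numeric[]          -- all cells of the sequence, in file order
);

-- Subject tables remain views, mapping documented column numbers to
-- array positions so researchers can follow the Census documentation.
CREATE VIEW subject_income AS
SELECT geoid,
       cells[1] AS total_households,
       cells[2] AS income_lt_10k
       -- ... one entry per column in the documentation
FROM seq0001;
```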
Re: [GENERAL] URGENT: temporary table not recognized?
On Fri, Jan 6, 2012 at 11:24 PM, Steve Crawford scrawf...@pinpointresearch.com wrote: On 01/06/2012 01:11 PM, Phoenix Kiula wrote: On Fri, Jan 6, 2012 at 11:46 AM, Tom Lanet...@sss.pgh.pa.us wrote: Phoenix Kiulaphoenix.ki...@gmail.com writes: Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer. Perhaps pgbouncer is redirecting the second command to a different session? Thanks Tom. I'm in the exact same session in my terminal, and the commands are entered within 2 seconds of each other. With copy/paste, maybe split microseconds of each other. How can I make sure pgbouncer takes it all in the same session? I also tried the two commands within a transaction. Sounds like you are using statement pooling - every statement can be assigned to a different server connection. You may need transaction pooling or session pooling: http://pgbouncer.projects.postgresql.org/doc/usage.html Statement pooling throws error on open transaction. -- marko -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] URGENT: temporary table not recognized?
Marko Kreen mark...@gmail.com writes: On Fri, Jan 6, 2012 at 11:24 PM, Steve Crawford scrawf...@pinpointresearch.com wrote: On 01/06/2012 01:11 PM, Phoenix Kiula wrote: How can I make sure pgbouncer takes it all in the same session? I also tried the two commands within a transaction. Sounds like you are using statement pooling - every statement can be assigned to a different server connection. You may need transaction pooling or session pooling: http://pgbouncer.projects.postgresql.org/doc/usage.html Statement pooling throws error on open transaction. Yeah, if it still fails within a single transaction, it gets harder to blame pgbouncer. But there are not very many other candidates. I wondered about a funny setting of search_path, but ISTM that could at worst result in copying into the wrong table (ie some other table named vl), not failure to find any vl at all. It might be worth turning on statement logging and ensuring that log_line_prefix includes the process PID. Then the postmaster log would provide indisputable evidence whether the CREATE and the COPY are executed in the same session or not. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
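Concretely, Tom's suggestion corresponds to something like the following postgresql.conf settings (the exact prefix format is a matter of taste; %p is the backend process PID and %m a millisecond timestamp):

```
log_statement = 'all'        # log every statement
log_line_prefix = '%p %m '   # PID first, so sessions can be told apart
```

If the CREATE TEMP TABLE and the COPY show different PIDs in the log, the pooler is splitting them across sessions.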
Re: [GENERAL] URGENT: temporary table not recognized?
On Fri, Jan 6, 2012 at 4:24 PM, Steve Crawford scrawf...@pinpointresearch.com wrote:
 On 01/06/2012 01:11 PM, Phoenix Kiula wrote:
 On Fri, Jan 6, 2012 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Phoenix Kiula phoenix.ki...@gmail.com writes:
 Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer.
 Perhaps pgbouncer is redirecting the second command to a different session?
 Thanks Tom. I'm in the exact same session in my terminal, and the commands are entered within 2 seconds of each other. With copy/paste, maybe split microseconds of each other. How can I make sure pgbouncer takes it all in the same session? I also tried the two commands within a transaction.
 Sounds like you are using statement pooling - every statement can be assigned to a different server connection. You may need transaction pooling or session pooling: http://pgbouncer.projects.postgresql.org/doc/usage.html

Thanks Steve. YES! I changed it to transaction pooling and now it works.

Another problem, though. I need to COPY a huge text file into a table, with about 350 million lines in the file (i.e., 350 million rows in the table). While copying, some lines do not have data; they are empty values. How can I specify in the COPY command that if data is not found, it should be ignored? In my temp table definition, I set this column as nullable anyway, so it should be OK if this column was left empty! What can I do in my COPY command to circumvent this?

Thanks.
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/2012 03:12 PM, Phoenix Kiula wrote: On Fri, Jan 6, 2012 at 4:24 PM, Steve Crawford scrawf...@pinpointresearch.com wrote: On 01/06/2012 01:11 PM, Phoenix Kiula wrote: Thanks Steve. YES! I changed it to transaction pooling and now it works. Another problem through. I need to COPY a huge text file into a table, with about 350 million lines in the file (i.e., 350 million rows in the table). While copying, some lines do not have data. They are empty values. How can I specify in COPY command that if data is not found, it should be ignored? In my temp table definition, I set this column as NULL anyway, so it should be ok if this column was left empty! What can I do in my COPY command to circumvent this? http://www.postgresql.org/docs/9.0/interactive/sql-copy.html Search for NULL Thanks. -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
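For example (a sketch using the 9.0 option syntax; the file path is a placeholder): with the default text format, NULL '' makes an empty field load as NULL, and with CSV format an unquoted empty field is NULL by default. Note that neither setting helps with lines that are missing whole fields - those still fail with a missing-data error.

```sql
-- text format: empty fields become NULL (the default null marker is \N)
COPY vl FROM '/path/to/data.txt' WITH (FORMAT text, DELIMITER '|', NULL '');

-- CSV format: an unquoted empty field is already read as NULL
COPY vl FROM '/path/to/data.txt' WITH (FORMAT csv, DELIMITER '|');
```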
Re: [GENERAL] URGENT: temporary table not recognized?
On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 http://www.postgresql.org/docs/9.0/interactive/sql-copy.html
 Search for NULL

Thanks Adrian. Without examples, it's hard to predict the syntax. If the value after a pipe is missing altogether, I suppose the missing value is \n (newline). But this doesn't work:

copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n';

None of these work either:

copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n;
copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n;
copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '';

The first two give errors; the third one throws the same "missing value for column" error. The data is stored like this:

123|big string here|189209209|US|2001-01-01
123|big string here|189209209|US|2001-01-01
123|big string here|189209209|US|2001-01-01
123|big string here|189209209|US|2001-01-01

But sometimes the lines are:

|big string here|189209209|US|2001-01-01
|big string here|189209209|US|2001-01-01

Or:

123|big string here|189209209|US
123|big string here|189209209|US|

So you see, either the first column, which is the ID in a way, is missing, so the missing character is probably a blank (''?) - in this case I want COPY to just ignore the line - or the last column is missing, where the missing character can be a newline, I suppose. So how do I specify this in the COPY command so that it doesn't croak? If a line's ID is missing, it should ignore the line and go on, instead of not doing anything by throwing an error for EVERYTHING!

Thanks.
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/2012 03:42 PM, Phoenix Kiula wrote: On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaveradrian.kla...@gmail.com wrote: http://www.postgresql.org/docs/9.0/interactive/sql-copy.html Search for NULL Thanks Adrian. Without examples, it's hard to predict syntax. If the value after a pipe is missing altogether, I suppose the missing value is \n (newline). But this doesn't work: copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n'; None of these work either: copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n; copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n; copy vl from 'data.txt' WITH DELIMITER AS '|' NULL ''; The first two give errors, the third one throws the same missing value for column error. The data is stored like this: 123|big string here|189209209|US|2001-01-01 123|big string here|189209209|US|2001-01-01 123|big string here|189209209|US|2001-01-01 123|big string here|189209209|US|2001-01-01 But sometimes, the strings are: |big string here|189209209|US|2001-01-01 |big string here|189209209|US|2001-01-01 Or 123|big string here|189209209|US 123|big string here|189209209|US| So you see either the first column, which is the ID in a way, is missing so the missing character is probably a blank (''?). In this case I want COPY to just ignore this line. Or the last column is missing, where the missing character can be a newline I suppose? So how do I specify this in the COPY command so that it doesn't croak? If a line's ID is missing, it should ignore the line and go on instead of not doing anything by throwing an error for EVERYTHING! Thanks. Missing data is one thing, missing delimiters is another. Try doing a small copy of data with just a few lines to see which variants are actually causing the error. My money is on the one that has a mismatch between the table column count and the data column count. 
I.e., the row with three delimiters instead of four:

123|big string here|189209209|US

When you say "ignore", do you mean that you want PostgreSQL to assume a null value for the missing column, or to not import that row at all?

In general, when you have data-scrubbing issues like this, grep/sed/awk/... are your friends. Clean it up, then import it. I suppose you could import all rows into a big text field and process it in PostgreSQL, but I doubt you will find that to be an optimal solution.

Cheers,
Steve
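Steve's pre-cleaning suggestion can be sketched in one awk line - a hedged example built from the sample rows in this thread; the five-field layout, file names and final psql command are assumptions:

```shell
# Sample of the problem data described earlier in the thread
cat > data.txt <<'EOF'
123|big string here|189209209|US|2001-01-01
|big string here|189209209|US|2001-01-01
123|big string here|189209209|US
123|big string here|189209209|US|
EOF

# Keep only lines with exactly 5 pipe-separated fields and a non-empty ID;
# rows missing a whole field (only 4 fields) or the ID are dropped.
awk -F'|' 'NF == 5 && $1 != ""' data.txt > clean.txt
cat clean.txt

# The cleaned file can then be fed to COPY, e.g. (hypothetical):
#   psql -c "\copy vl from clean.txt with delimiter '|' null ''"
```

Note the trailing-pipe row survives: it has five fields with the last one empty, which COPY can load as NULL. Streaming the awk output straight into psql avoids writing the intermediate 18GB file at all.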
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/2012 03:42 PM, Phoenix Kiula wrote: On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaveradrian.kla...@gmail.com wrote: http://www.postgresql.org/docs/9.0/interactive/sql-copy.html Search for NULL Thanks Adrian. Without examples, it's hard to predict syntax. If the value after a pipe is missing altogether, I suppose the missing value is \n (newline). But this doesn't work: copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n'; None of these work either: copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n; copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n; copy vl from 'data.txt' WITH DELIMITER AS '|' NULL ''; Try: copy vl from 'data.txt' WITH CSV DELIMITER '|'; If that doesn't work take a look at pgloader: http://pgfoundry.org/projects/pgloader/ -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] URGENT: temporary table not recognized?
On Fri, Jan 6, 2012 at 6:53 PM, Steve Crawford scrawf...@pinpointresearch.com wrote: On 01/06/2012 03:42 PM, Phoenix Kiula wrote: On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaveradrian.kla...@gmail.com wrote: http://www.postgresql.org/docs/9.0/interactive/sql-copy.html Search for NULL Thanks Adrian. Without examples, it's hard to predict syntax. If the value after a pipe is missing altogether, I suppose the missing value is \n (newline). But this doesn't work: copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n'; None of these work either: copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n; copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n; copy vl from 'data.txt' WITH DELIMITER AS '|' NULL ''; The first two give errors, the third one throws the same missing value for column error. The data is stored like this: 123|big string here|189209209|US|2001-01-01 123|big string here|189209209|US|2001-01-01 123|big string here|189209209|US|2001-01-01 123|big string here|189209209|US|2001-01-01 But sometimes, the strings are: |big string here|189209209|US|2001-01-01 |big string here|189209209|US|2001-01-01 Or 123|big string here|189209209|US 123|big string here|189209209|US| So you see either the first column, which is the ID in a way, is missing so the missing character is probably a blank (''?). In this case I want COPY to just ignore this line. Or the last column is missing, where the missing character can be a newline I suppose? So how do I specify this in the COPY command so that it doesn't croak? If a line's ID is missing, it should ignore the line and go on instead of not doing anything by throwing an error for EVERYTHING! Thanks. Missing data is one thing, missing delimiters is another. Try doing a small copy of data with just a few lines to see which variants are actually causing the error. My money is on the one that has a mismatch between the table column count and the data column count. 
 I.e., the row with three delimiters instead of four:
 123|big string here|189209209|US
 When you say "ignore", do you mean that you want PostgreSQL to assume a null value for the missing column, or to not import that row at all? In general, when you have data-scrubbing issues like this, grep/sed/awk/... are your friends. Clean it up, then import it. I suppose you could import all rows into a big text field and process it in PostgreSQL, but I doubt you will find that to be an optimal solution.

Thanks Steve. The file has 350 million lines. sed, awk, etc. are a little painful when the file is 18GB with that many lines.

I'd want PostgreSQL to ignore the line altogether when something is missing. Is this an option we can use, or are rules hoisted on us?
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/2012 03:12 PM, Phoenix Kiula wrote:
 ...
 Sounds like you are using statement pooling - every statement can be assigned to a different server connection. You may need transaction pooling or session pooling: http://pgbouncer.projects.postgresql.org/doc/usage.html
 Thanks Steve. YES! I changed it to transaction pooling and now it works.

But Marko is correct. If you were using:

begin;
create temporary...
\copy
commit;

and your pooler was set to statement, then you should have seen an error. Are you by any chance routing stderr to /dev/null or otherwise hiding messages? If you are using "psql ... 2>/dev/null" then everything would *look* the same with pooling set to statement or transaction, but you would be blissfully ignorant of the errors.

Cheers,
Steve
Re: [GENERAL] URGENT: temporary table not recognized?
On Fri, Jan 6, 2012 at 6:54 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 Try:
 copy vl from 'data.txt' WITH CSV DELIMITER '|';

Doesn't work. Can't see what the difference between CSV and a text file is. The same errors are thrown.

 If that doesn't work take a look at pgloader:
 http://pgfoundry.org/projects/pgloader/

Wow, another geeky tool. Hard to find documentation. Archaic presentation, no simple steps to install and get going. Anyway, it doesn't seem to provide the options I need (http://pgloader.projects.postgresql.org/) --

a) Ability to assign more than one NULL value
b) Ability to ignore lines altogether that have any problem

Really, other databases have mechanisms to ignore problem lines while copying. Does PostgreSQL allow me to *ignore* lines while COPYing?
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/2012 03:55 PM, Phoenix Kiula wrote:
 ...
 In general, when you have data-scrubbing issues like this, grep/sed/awk/... are your friends. Clean it up, then import it.
 Thanks Steve. The file has 350 million lines. sed, awk, etc. are a little painful when the file is 18GB with that many lines. I'd want PostgreSQL to ignore the line altogether when something is missing. Is this an option we can use, or are rules hoisted on us?

I've found grep, sed and friends to be quite effective, and proper pre-cleaning to have a relatively minor impact on performance. Done properly, you will just be piping the data through a very simple grep/sed/awk/... into psql. No extra disk reads, minimal memory use, and a bit of CPU. And you will be in charge of deciding how suspect data is handled.

If by "rules hoisted" you mean "will PostgreSQL make arbitrary and possibly incorrect assumptions to attempt to force bad data into a table", then the answer is no. In fact, it has become more and more picky over time. Trust me, at some point you will thank it for doing so. For example, the following used to work:

select current_date < 2020-01-01;

But it returned false, which was probably not what the user wanted. (2020-01-01 is the integer 2018, which PostgreSQL interpreted as being less than current_date.) But it sure looks OK at first glance when you really meant '2020-01-01'::date. In current versions, that statement will throw an error, just as PostgreSQL does with dates like February 31 and a variety of other things that certain other DBMSs deem good.

Cheers,
Steve
Re: [GENERAL] URGENT: temporary table not recognized?
On Friday, January 06, 2012 4:16:09 pm Phoenix Kiula wrote:
 On Fri, Jan 6, 2012 at 6:54 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
 Try:
 copy vl from 'data.txt' WITH CSV DELIMITER '|';
 Doesn't work. Can't see what the difference in CSV is from a text file. Same errors are thrown.
 If that doesn't work take a look at pgloader:
 http://pgfoundry.org/projects/pgloader/
 Wow, another geeky tool. Hard to find documentation. Archaic presentation, no simple steps to install and get using. Anyway doesn't seem to provide the options I need (http://pgloader.projects.postgresql.org/) --

Install:

Download, then:
tar -xzvf pgloader-2.3.2.tar.gz
cd pgloader-2.3.2/
sudo make

Very difficult.

From http://pgloader.projects.postgresql.org/:

 a) Ability to assign more than one NULL value

null
  You can configure here how the null value is represented in your flat data file. This parameter is optional and defaults to '' (that is, the empty string). If defined at the table level, this local value will overwrite the global one.

empty_string
  You can configure here how empty values are represented in your flat data file. This parameter is optional and defaults to "\ " (that is, backslash followed by a space). If defined at the table level, this local value will overwrite the global one.

reformat
  Use this option when you need to preprocess some column data with pgloader reformatting modules, or your own. The value of this option is a comma-separated list of columns to rewrite, each of which is a colon-separated list of column name, reformat module name and reformat function name. Here's an example to reformat column dt_cx with the mysql.timestamp() reformatting function:
    reformat = dt_cx:mysql:timestamp
  See the global setting option reformat_path for configuring where pgloader will look for reformat packages and modules.

If you want to write a new formatting function, provide a python package called reformat (a directory of this name containing an empty __init__.py file will do) and place in there arbitrarily named modules (foo.py files) containing functions with the following signature:

  def bar(reject, input)

The reject object has a log(self, messages, data = None) method for you to log errors into the section.rej.log and section.rej files.

 b) Ability to ignore lines altogether that have any problem

reject_log
  In case of errors processing input data, a human-readable log per rejected input data line is produced into the reject_log file.

reject_data
  In case of errors processing input data, the rejected input line is appended to the reject_data file.

 Really, other databases have mechanisms to ignore problem lines while copying. Does Postgresql allow me to *ignore* lines while COPYING?

No. I await with bated breath your most excellent patch to COPY.

--
Adrian Klaver
adrian.kla...@gmail.com
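Pulling those options together, a pgloader 2.x configuration section might look roughly like this. This is a sketch only: the connection values and most key names are assumptions (only null, empty_string, reject_log and reject_data come from the documentation quoted above), so verify everything against the pgloader.conf shipped in the tarball's examples/ directory:

```ini
; hypothetical pgloader.conf sketch - verify key names against the bundled example
[pgsql]
host = localhost
port = 5432
base = mydb
user = me

[vl]
table       = vl
filename    = data.txt
format      = text
field_sep   = |
columns     = *
null        = ''
reject_log  = /tmp/vl.rej.log
reject_data = /tmp/vl.rej
```

The reject_log/reject_data pair is what gives you the "ignore bad lines but keep a record of them" behavior asked about above.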
Re: [GENERAL] URGENT: temporary table not recognized?
On 01/06/12 5:33 PM, Phoenix Kiula wrote:
 http://pgloader.projects.postgresql.org/
 Sorry. That I already did. But where's the config file? How to configure the config file? Where's the simple doc (not on that ugly PGFoundry website, I mean in English that people can understand, with a starting sample)?

you should check your attitude at the door. this isn't Microsoft Pay-per-Incident Tech Support.

what you're asking for is right on that very URL you so conveniently quoted: both the configuration file format ("Global Configuration Section") and where the sample pgloader.conf can be found...

 Please see the given configuration example which should be distributed in /usr/share/doc/pgloader/examples/pgloader.conf. The example configuration file comes with example data and can be used as a unit test of pgloader.

so, yes, you need to download the package and un-tar it to see the sample .conf:

$ wget http://pgfoundry.org/frs/download.php/2294/pgloader-2.3.2.tar.gz
$ tar xzf pgloader-2.3.2.tar.gz
$ cd pgloader-2.3.2
$ more examples/pgloader.conf