[GENERAL] Supporting SQL/MED DATALINK

2012-01-06 Thread Damiano ALBANI
Hello,

Do you plan on supporting SQL/MED features concerning DATALINKs?
I've seen DATALINK mentioned on the Wiki [1] but I couldn't find it on the
TODO list [2].

By the way, do you know of any vendor that provides support for DATALINK?
Apart from DB2, there seems to be very little visibility of this 2003 feature
in the whole SQL world!

Regards,

[1] http://wiki.postgresql.org/wiki/DATALINK
[2] http://wiki.postgresql.org/wiki/Todo

-- 
Damiano ALBANI


Re: [GENERAL] using pg's internal timezone database?

2012-01-06 Thread Louis-David Mitterrand
On Tue, Dec 20, 2011 at 05:29:15AM -0700, Scott Marlowe wrote:
 On Tue, Dec 20, 2011 at 2:05 AM, Louis-David Mitterrand
 vindex+lists-pgsql-gene...@apartia.org wrote:
  Hi,
 
  To provide my forum users with a 'timezone' preference in their profile,
  how can I use postgresql's internal table of timezones?
 
  I found a reference to it here:
 
  http://www.postgresql.org/docs/7.2/static/timezones.html
 
  but not in recent versions docs.
 
 You can view the pg internal tz database by looking at what's in the
 tables pg_timezone_names and pg_timezone_abbrevs

Hi,

Thanks for that info.

There are 1146 timezones listed in that table! Which sublist is the most
commonly used or standard? posix/* ?
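
For reference, pg_timezone_names and pg_timezone_abbrevs are views and can be
queried like any other relation; a quick way to peek at the posix/* entries
(a minimal sketch):

    select name, abbrev, utc_offset, is_dst
    from pg_timezone_names
    where name like 'posix/%'
    limit 5;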



Re: [GENERAL] Vacuum and Large Objects

2012-01-06 Thread Igor Neyman
 -Original Message-
 From: Stefan Keller [mailto:sfkel...@gmail.com]
 Sent: Friday, January 06, 2012 1:12 AM
 To: Igor Neyman
 Cc: Simon Windsor; pgsql-general@postgresql.org
 Subject: Re: Vacuum and Large Objects
 
 Hi Igor
 
 2011/12/16 Igor Neyman iney...@perceptron.com wrote:
  But I think your problem is right here:
 
   running VACUUM FULL pg_largeobject
 
  If you are running VACUUM FULL ... on the table, you should follow
  it with REINDEX TABLE ..., at least on PG versions prior to 9.0.
 
 I'm pretty sure that VACUUM FULL builds new indexes. That's at least
 how I understand the docs, especially the first tip here:
 http://www.postgresql.org/docs/9.0/static/routine-vacuuming.html
 
 Yours, Stefan
 
 
 2011/12/16 Igor Neyman iney...@perceptron.com:
  From: Simon Windsor [mailto:simon.wind...@cornfield.me.uk]
  Sent: Wednesday, December 14, 2011 3:02 PM
  To: pgsql-general@postgresql.org
  Subject: Vacuum and Large Objects
 
  Hi
 
  I am having problems recovering storage from a Postgres 9.05 database
 that is used to hold large XML blocks for a week before they are
 archived offline.
 
  The main tables are partitioned into daily partitions, and these are
 easy to manage; however, the DB keeps growing despite using Vacuum
 (daily at 0700) and autovacuum (which does not seem to run, although the
 process is running). The system is insert-only, and partitions are
 dropped when over 7 days old.
 
  I believe the issue lies with pg_largeobject, which is split across 88
 files of approx. 1GB each.
 
  The Postgres settings are default, EXCEPT
 
  grep ^[a-z] postgresql.conf
  listen_addresses = '*'                  # what IP address(es) to listen on;
  port = 5432                             # (change requires restart)
  max_connections = 1000                  # (change requires restart)
  shared_buffers = 256MB                  # min 128kB
  work_mem = 4MB                          # min 64kB
  maintenance_work_mem = 256MB            # min 1MB
  vacuum_cost_delay = 20ms                # 0-100 milliseconds
  checkpoint_segments = 32                # in logfile segments, min 1, 16MB each
  checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0
  checkpoint_warning = 60s                # 0 disables
  archive_mode = off                      # allows archiving to be done
  constraint_exclusion = partition        # on, off, or partition
  log_destination = 'stderr'              # Valid values are combinations of
  logging_collector = on                  # Enable capturing of stderr and csvlog
  silent_mode = on                        # Run server silently.
  log_checkpoints = on
  log_line_prefix = '%t %d %u '           # special values:
  log_statement = 'none'                  # none, ddl, mod, all
  track_activities = on
  track_counts = on
  autovacuum = on                         # Enable autovacuum subprocess?  'on'
  log_autovacuum_min_duration = 250       # -1 disables, 0 logs all actions
  autovacuum_max_workers = 3              # max number of autovacuum subprocesses
  autovacuum_naptime = 3min               # time between autovacuum runs
  autovacuum_vacuum_threshold = 500       # min number of row updates before
  autovacuum_analyze_threshold = 100      # min number of row updates before
  autovacuum_vacuum_scale_factor = 0.1    # fraction of table size before vacuum
  autovacuum_analyze_scale_factor = 0.05  # fraction of table size before analyze
  autovacuum_vacuum_cost_delay = 5ms      # default vacuum cost delay for
  autovacuum_vacuum_cost_limit = 200      # default vacuum cost limit for
  statement_timeout = 0                   # in milliseconds, 0 is disabled
  datestyle = 'iso, dmy'
  lc_messages = 'en_GB.UTF-8'             # locale for system error message
  lc_monetary = 'en_GB.UTF-8'             # locale for monetary formatting
  lc_numeric = 'en_GB.UTF-8'              # locale for number formatting
  lc_time = 'en_GB.UTF-8'                 # locale for time formatting
  default_text_search_config = 'pg_catalog.english'
 
  Besides running VACUUM FULL pg_largeobject, is there a way I can
 get autovacuum to start and clear this up?
 
  All the best
 
  Simon
 
  Simon Windsor
  Eml: simon.wind...@cornfield.org.uk
  Tel: 01454 617689
  Mob: 07590 324560
 
  There is nothing in the world that some man cannot make a little
 worse and sell a little cheaper, and he who considers price only is
 that man's lawful prey.
 
 
 
 
  I might be a bit late in this discussion.
  But I think your problem is right here:
 
   running VACUUM FULL pg_largeobject
 
  If you are running VACUUM FULL ... on the table, you should follow
 it with REINDEX TABLE ..., at least on PG versions prior to 9.0.
 
  Regards,
  Igor Neyman
 

Please disregard my previous message.
I missed "Postgres 9.05 database" in the original post.

Regards,
Igor Neyman


Re: [GENERAL] Vacuum and Large Objects

2012-01-06 Thread Igor Neyman
Stefan,

 -Original Message-
 From: Stefan Keller [mailto:sfkel...@gmail.com]
 Sent: Friday, January 06, 2012 1:12 AM
 To: Igor Neyman
 Cc: Simon Windsor; pgsql-general@postgresql.org
 Subject: Re: Vacuum and Large Objects
 
 Hi Igor
 
 2011/12/16 Igor Neyman iney...@perceptron.com wrote:
  But I think your problem is right here:
 
   running VACUUM FULL pg_largeobject
 
  If you are running VACUUM FULL ... on the table, you should follow
  it with REINDEX TABLE ..., at least on PG versions prior to 9.0.
 
 I'm pretty sure that VACUUM FULL builds new indexes. That's at least
 how I understand the docs, especially the first tip here:
 http://www.postgresql.org/docs/9.0/static/routine-vacuuming.html
 
 Yours, Stefan
 
 [remainder of the quoted thread trimmed; it is identical to the quote in
 the previous message]

I qualified my reply with PG version specifics, since the original poster
didn't specify which Postgres version he's using.
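
For the record, on pre-9.0 releases the sequence described above is simply:

    VACUUM FULL pg_largeobject;
    REINDEX TABLE pg_largeobject;
    -- on 9.0 and later, VACUUM FULL rewrites the table and rebuilds its
    -- indexes, so the REINDEX step is no longer needed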

Regards,
Igor Neyman

Re: [GENERAL] function return update count

2012-01-06 Thread Andreas Kretschmer

Kevin Duffy kevind0...@gmail.com wrote on 6 January 2012 at 00:32:

 Hello:
 
 I am trying to get a function to return the count of the rows updated within
 the function.
 As in the following, I want the number of rows updated to be returned.
 
 This is a simple update, other update statements that I need to write will be
 complicated.
 
 CREATE OR REPLACE FUNCTION est_idio_return_stats_update()
   RETURNS integer AS
 '
 update est_idiosyncratic_return_stats set
     delta_avg_60      = avg_60      - period_61_return,
     delta_avg_last_24 = avg_last_24 - period_61_return,
     delta_avg_last_18 = avg_last_18 - period_61_return,
     delta_avg_last_12 = avg_last_12 - period_61_return,
     delta_avg_last_6  = avg_last_06 - period_61_return;
 '
   LANGUAGE SQL ;
 
 The above returns the following:
 ERROR:  return type mismatch in function declared to return integer
 DETAIL:  Function's final statement must be SELECT or INSERT/UPDATE/DELETE
 RETURNING.
 CONTEXT:  SQL function est_idio_return_stats_update
 
 ** Error **
 
 ERROR: return type mismatch in function declared to return integer
 SQL state: 42P13
 Detail: Function's final statement must be SELECT or INSERT/UPDATE/DELETE
 RETURNING.
 Context: SQL function est_idio_return_stats_update
 
 
 Thanks for your kind assistance.
 
 
 KD

 I think you can rewrite this in pl/pgsql (the language, not sql) and then
 you can use

    GET DIAGNOSTICS integer_var = ROW_COUNT;

    Andreas
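
A minimal PL/pgSQL version along those lines (an untested sketch; the UPDATE
body is taken from Kevin's original function):

    CREATE OR REPLACE FUNCTION est_idio_return_stats_update()
      RETURNS integer AS
    $$
    DECLARE
        updated_rows integer;
    BEGIN
        UPDATE est_idiosyncratic_return_stats
           SET delta_avg_60      = avg_60      - period_61_return,
               delta_avg_last_24 = avg_last_24 - period_61_return,
               delta_avg_last_18 = avg_last_18 - period_61_return,
               delta_avg_last_12 = avg_last_12 - period_61_return,
               delta_avg_last_6  = avg_last_06 - period_61_return;
        -- ROW_COUNT holds the number of rows affected by the last command
        GET DIAGNOSTICS updated_rows = ROW_COUNT;
        RETURN updated_rows;
    END;
    $$ LANGUAGE plpgsql;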










[GENERAL] How to code lo_creat lo_write lo_read in non-blocking mode

2012-01-06 Thread ChoonSoo Park
I just wonder if there is a way to program lo client interfaces (lo_creat,
lo_write, lo_read) in non-blocking mode.
PQsendQueryParams works perfectly for executing a SQL command in non-blocking
mode, but I couldn't find a way to do the same for large objects.

Do you have any example?

Thank you,
Choon Park
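
For reference, the large-object operations also exist as server-side SQL
functions, so they can be issued through the same async path as any other
statement (e.g. via PQsendQueryParams). A rough, untested sketch; the OID
16385 and descriptor 0 are illustrative values:

    BEGIN;
    SELECT lo_creat(-1);                      -- returns a new large object OID, e.g. 16385
    SELECT lo_open(16385, 131072 | 262144);   -- INV_WRITE | INV_READ; returns a descriptor
    SELECT lowrite(0, decode('0102', 'hex')); -- 0 = descriptor returned by lo_open
    SELECT loread(0, 4096);                   -- read up to 4096 bytes
    COMMIT;                                   -- descriptors live only inside the transaction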


Re: [GENERAL] Running multiple versions

2012-01-06 Thread Tom Lane
praka...@uci.edu writes:
 Thanks for your reply, Scott.  Your suggestion sounds like it would be
 fine, but I have a few more questions now.  The docs at
 http://www.postgresql.org/docs/9.1/interactive/upgrading.html recommend
 using the new version's pg_dumpall to back up the existing cluster for
 reload.  In light of your reply, it seems you pretty much *have* to change
 the pg superuser's LD_LIBRARY_PATH first.

IMO you should not be depending on LD_LIBRARY_PATH at all; getting it
right in a multi-installation scenario is simply too error-prone.  Each
installation should be built with an rpath specification that points at
its library directory.  End of problem.

--disable-rpath is really only useful when building packages for
distributions that have distro policy against using rpath (because
all shared libraries are supposed to be placed in system directories).
It's not a good idea in any case where you have to put the libraries
in non-system-standard directories.

regards, tom lane



[GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Phoenix Kiula
Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer.

I create a temporary table and then want to import data into this
table via a COPY command. Yet this just-created table is not being
recognized. What's up?

From my terminal:



mydb=#
mydb=# create temporary table vl (
  alias varchar(35)
  ,dates timestamp without time zone
  ,referers text null
);
CREATE TABLE
Time: 1.871 ms

mydb=# copy vl from '/backup/data.txt' WITH DELIMITER AS '|';
ERROR:  relation "vl" does not exist

mydb=#



Weird! Any thoughts welcome.

Thanks



Re: [GENERAL] function return update count

2012-01-06 Thread David Johnston
In 9.1 you could use an updatable CTE and, in the main query, perform and
return a count.  I would think plpgsql would be the better option though.

On Jan 6, 2012, at 2:29, Andreas Kretschmer andr...@a-kretschmer.de wrote:

 
 Kevin Duffy kevind0...@gmail.com wrote on 6 January 2012 at 00:32:

 [Kevin's original question trimmed; it is quoted in full in the previous
 message]
 
  
 I think you can rewrite this in pl/pgsql (the language, not sql) and then
 you can use

    GET DIAGNOSTICS integer_var = ROW_COUNT;

 Andreas

Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Tom Lane
Phoenix Kiula phoenix.ki...@gmail.com writes:
 Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer.

Perhaps pgbouncer is redirecting the second command to a different
session?

regards, tom lane



Re: [GENERAL] function return update count

2012-01-06 Thread Richard Huxton

On 06/01/12 16:33, David Johnston wrote:

In 9.1 you could use an updatable CTE and in the main query perform
and return a count.  I would think plpgsql would be the better option
though.


For the SQL option, it would be this (9.1 only though - I think David's 
right there).


CREATE FUNCTION f1() RETURNS int AS $$
WITH rows AS (
UPDATE t1 SET ... WHERE ... RETURNING 1
)
SELECT count(*)::int FROM rows
$$ LANGUAGE SQL;


Unfortunately you can't do UPDATE ... RETURNING count(*) directly so you 
need to go through this business with the CTE (WITH clause).


Oh - the cast to int is because count() returns bigint.
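
Applied to Kevin's case, the same pattern would look roughly like this
(untested; just one of his columns shown for brevity):

CREATE OR REPLACE FUNCTION est_idio_return_stats_update() RETURNS int AS $$
WITH rows AS (
    UPDATE est_idiosyncratic_return_stats
       SET delta_avg_60 = avg_60 - period_61_return
    RETURNING 1
)
SELECT count(*)::int FROM rows
$$ LANGUAGE SQL;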

--
  Richard Huxton
  Archonet Ltd



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Bosco Rama
Tom Lane wrote:
 Phoenix Kiula phoenix.ki...@gmail.com writes:
 Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer.
 
 Perhaps pgbouncer is redirecting the second command to a different
 session?

This may be OT, but are temp tables also removed when setting a new session
authorization?

Bosco.



[GENERAL] DROP ROLE prevented by dependency

2012-01-06 Thread Bulgrien, Kevin
psql 8.3.16

I've done some searching of the mailing list, etc., but have either not
searched the right way, or something.  If anyone could throw me a bone on
this it would be appreciated.  I have no idea what the procedure is for
figuring out how to drop this role:

DROP ROLE myuser;
ERROR:  role "myuser" cannot be dropped because some objects depend on it
DETAIL:  1 objects in database postgres

The detail seems kind of skimpy.  Any pointers on how to hunt this down?

---
Kevin R. Bulgrien
Design and Development Engineer
 http://www.gdsatcom.com/
General Dynamics SATCOM Technologies Tel: 903-295-1480 x24109
3750 W. Loop 281  903-381-4109
Longview, TX 75604-5438  Fax: 903-295-1479




[GENERAL] Binary Large Objects (LOB/BLOB) in Hibernate and JDBC: Unresolved issues

2012-01-06 Thread Stefan Keller
Hi,

I run into a nasty behavior of current PostgreSQL JDBC.

I maintain images (from webcams). In the Java and Hibernate (JPA) code I
specified a @Lob annotation on class MyData with an attribute of data type
byte[] mydata. Hibernate then generates a table MyData in PostgreSQL with a
column mydata of type oid; the data itself lands in the internal table
pg_largeobject (which holds the chunks keyed by that oid). That's also
explained in the JDBC docs [1], saying "PostgreSQL provides two distinct
ways to store binary data. Binary data can be stored in a table using the
data type bytea or by using the Large Object feature which stores the binary
data in a separate table in a special format and refers to that table by
storing a value of type oid in your table."

Now, when replacing the images (a few hundred) with new ones using Java,
pg_largeobject grew constantly until the file system ran out of space.
So old image data didn't get released! This is to me a bug, because the
user/programmer must (and should) assume that there is a strict 1:1
relationship between the generated table MyData and its LOB column data
(stored in pg_largeobject).
= I finally found the supplied module 'lo' [2], which releases detached
records. Is this the recommended way to resolve this problem?
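
For reference, the usage pattern from the 'lo' module docs looks like this
(a sketch; the table and column names are illustrative):

CREATE TABLE image (title text, raster lo);
CREATE TRIGGER t_raster
    BEFORE UPDATE OR DELETE ON image
    FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
-- lo_manage() unlinks the old large object when the referencing row is
-- updated or deleted, so orphans no longer accumulate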

Searching for explanations I found a ticket, HHH-2244 [3], which was
closed by the Hibernate team without action, referring to the JDBC
spec, which says: "An SQL BLOB is a built-in type that stores a Binary
Large Object as a column value in a row of a database table."
= In other words: the PostgreSQL JDBC team should take action on this
but didn't until now, right?

There is another issue about PostgreSQL and BLOBs [4]. It starts by citing
the PostgreSQL JDBC docs [1], and the thread ends in a kind of deadlock
too, concluding "...the PostgreSQLDialect (as of 3.5.5) needs to change
not to use MaterializedBlobType until the Postgres (JDBC) team changes
their driver (which does not seem to have happened in the last 6 years)."
= Any solutions or comments from insiders on this?

Yours, Stefan

[1] http://jdbc.postgresql.org/documentation/head/binary-data.html
[2] http://www.postgresql.org/docs/current/interactive/lo.html
[3] https://hibernate.onjira.com/browse/HHH-2244
[4] http://relation.to/Bloggers/PostgreSQLAndBLOBs



[GENERAL] How PostgreSQL handles Binary Large Objects (LOB/BLOB): types BYTEA, OID/pg_largeobjects and DATALINK

2012-01-06 Thread Stefan Keller
Hi,

I'd like to get more insight of how to handle (binary) Large Object
Types (LOB/BLOB/CLOB) in Postgres and I think we should clarify the
situation since to me there are some inconsistencies at least in the
docs (or my understanding). I could try to summarize e.g. in the
Postgres Wiki.

In the current docs the BLOB data type is mentioned as unsupported
(chapter D.2 Unsupported Features, item T041-01, Basic object
support [1]).
But we have 1. type bytea, 2. oid/pg_largeobject (at least in JDBC),
and now 3. type DATALINK [5].
= Why not take one of those as the BLOB data type?

The Wiki page about ORACLE and BLOBs [2] seems to me a little bit
outdated. It says "Binary large object support in Postgres is very
poor and unsuitable for use in a 24/7 environment, because you can't
dump them with pg_dump."
= Still true? Do we want to dump them anyway?

The JDBC docs [3] say "PostgreSQL provides two distinct ways to store
binary data. Binary data can be stored in a table using the data type
bytea or by using the Large Object feature which stores the binary
data in a separate table in a special format and refers to that table
by storing a value of type oid in your table."
= I assume the approach with a separate table (pg_largeobject) is
also feasible in plain PostgreSQL schema/usage without JDBC involved?
= Shouldn't there be some reference in the JDBC docs to the solution
with module lo [4]?
= Should one disable TOAST compression here by doing an ALTER column
SET STORAGE EXTERNAL? (see the one-liner below)
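
That storage-mode change would be, assuming a bytea column named myimage
on a table mydata (names illustrative):

ALTER TABLE mydata ALTER COLUMN myimage SET STORAGE EXTERNAL;
-- EXTERNAL keeps the value out-of-line in TOAST but uncompressed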

There's the new data type DATALINK [5]. A DATALINK allows referencing
files from tables. Thanks to this, big files (like images) can be
saved externally while maintaining support for access control and
integrity mechanisms. Currently I only find something in the wiki [5]
but can't find a reference to DATALINK in the PostgreSQL docs (except
as a reserved keyword).
= Is this an omission in the docs?

So to make an initial summary, in PostgreSQL AFAIK there are at least
three common ways to manage Binary Large Objects (LOB/BLOB):

1. A column of type bytea, which stores the LOB within the table
(respectively using TOAST).
2. A column of type oid in the user table, with the LOB stored away in a
separate internal table (pg_largeobject, which holds the data as bytea
chunks keyed by that oid). Hint: see module 'lo' [4].
3. The DATALINK type, which maintains a link to a specific file in external
storage (example: CREATE TABLE mydata (myid Integer, myimage
DATALINK (40) FILE LINK CONTROL INTEGRITY ALL); ).

I'm interested in any comments.

Yours, Stefan

[1] 
http://www.postgresql.org/docs/current/static/unsupported-features-sql-standard.html
[2] http://wiki.postgresql.org/wiki/Oracle_to_Postgres_Conversion#BLOBs
[3] http://jdbc.postgresql.org/documentation/head/binary-data.html
[4] http://www.postgresql.org/docs/current/interactive/lo.html
[5] http://wiki.postgresql.org/wiki/DATALINK



Re: [GENERAL] DROP ROLE prevented by dependency

2012-01-06 Thread Tom Lane
Bulgrien, Kevin kevin.bulgr...@gdsatcom.com writes:
 psql 8.3.16
 I've done some searching of mailing list, etc., but have either not searched
 the right way, or something.  If anyone could throw a bone on this it would
 be appreciated.  I have no idea what the procedure is for figuring out how
 to drop this role:

 DROP ROLE myuser;
 ERROR:  role myuser cannot be dropped because some objects depend on it
 DETAIL:  1 objects in database postgres

Try the command while connected to the postgres database.  It can't give
you more detail than that from where you are, for lack of visibility
into the other database's system catalogs.

Also, read up on DROP OWNED BY.
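
A typical sequence, run while connected to the database named in the DETAIL
line (a sketch; the role name comes from the original post):

    DROP OWNED BY myuser;   -- drops objects owned by, and revokes privileges
                            -- granted to, the role in the current database
    DROP ROLE myuser;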

regards, tom lane



Re: [GENERAL] How PostgreSQL handles Binary Large Objects (LOB/BLOB): types BYTEA, OID/pg_largeobjects and DATALINK

2012-01-06 Thread Tom Lane
Stefan Keller sfkel...@gmail.com writes:
 The Wiki page about ORACLE and BLOBs [2] seems to me little bit
 outdated. It says Binary large object support in Postgres is very
 poor and unsuitable for use in a 24/7 environment, because you can't
 dump them with pg_dump.

That whole page is largely obsolete, I think, but that particular
claim hasn't been true since 8.1 or so.

 There's the new data type DATALINK [5].

AFAICT that page is a wish-list.  It certainly doesn't describe any
committed code, and if there's work in progress for that, I don't
know about it.

regards, tom lane



Re: [GENERAL] DROP ROLE prevented by dependency

2012-01-06 Thread Bulgrien, Kevin
 -Original Message-
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]
 Sent: Friday, January 06, 2012 12:22 PM
 To: Bulgrien, Kevin
 Cc: 'pgsql-general@postgresql.org'
 Subject: Re: [GENERAL] DROP ROLE prevented by dependency

   I have no idea what the procedure is for figuring out how
  to drop this role:

  DROP ROLE myuser;
  ERROR:  role myuser cannot be dropped because some
  objects depend on it
  DETAIL:  1 objects in database postgres

 Try the command while connected to the postgres database.  It
 can't give you more detail than that from where you are, for
 lack of visibility into the other database's system catalogs.

Hah.  I was connecting to template1 to do the DROP, so I changed to
postgres and got:

DROP ROLE myuser;
ERROR:  role "myuser" cannot be dropped because some objects depend on it
DETAIL:  1 objects in database template1

That jostled loose a clue.  That seemed to imply that ROLE myuser
had some setup (GRANT/REVOKE) done in both postgres AND template1.

After undoing all custom ROLE setup (GRANT/REVOKE) in both postgres
AND template1, the ROLE dropped fine.

Apparently GRANTs and REVOKEs involving a role are tied to the database
you are connected to when you issue them, so they can only be undone by
connecting to that same database.  I have to be consistent about doing
role management in the same database context.  Normally I do all the
database owner ROLE stuff when connected to template1, but somehow that
user had some setup in both template1 and postgres.  I vaguely remember
connecting to postgres some time ago when trying to figure out how to
dump the global data not associated with an application database.  I
must have forgotten which context I was in when playing around with
the ROLE.

It's fixed now.  Thanks for the tip!  I think I learned
something important in the process.

Kevin Bulgrien




[GENERAL] please help understand freeing shared buffers

2012-01-06 Thread Filip Rembiałkowski
Among the following queries, only THREE runs fast enough for me.

I can't understand the logic behind this.

I once thought that shared buffer replacement is independent of
whether the backend which used the buffers is still alive. But is that true?

It seems like shared buffers are returned to the pool or freed just
after the client disconnects?



$ cat test.sh
sql="explain (analyze,buffers) select getlocationid_faster2('New York','10003','NY','US',40.73,-73.99);"
psql="psql -X dev postgres"
echo ONE
echo $sql | $psql
echo TWO THREE
echo "$sql $sql" | $psql
echo FOUR
echo $sql | $psql


$ . test.sh
ONE
 QUERY PLAN

 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=3.733..3.735
rows=1 loops=1)
   Buffers: shared hit=294
 Total runtime: 3.769 ms
(3 rows)

TWO THREE
 QUERY PLAN

 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=3.717..3.719
rows=1 loops=1)
   Buffers: shared hit=294
 Total runtime: 3.754 ms
(3 rows)

 QUERY PLAN

 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.521..0.523
rows=1 loops=1)
   Buffers: shared hit=4
 Total runtime: 0.540 ms
(3 rows)

FOUR
 QUERY PLAN

 Result  (cost=0.00..0.26 rows=1 width=0) (actual time=3.740..3.742
rows=1 loops=1)
   Buffers: shared hit=294
 Total runtime: 3.777 ms
(3 rows)



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Phoenix Kiula
On Fri, Jan 6, 2012 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Phoenix Kiula phoenix.ki...@gmail.com writes:
 Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer.

 Perhaps pgbouncer is redirecting the second command to a different
 session?




Thanks Tom. I'm in the exact same session in my terminal, and the
commands are entered within 2 seconds of each other. With copy/paste,
maybe split microseconds of each other.

How can I make sure pgbouncer takes it all in the same session? I also
tried the two commands within a transaction.



Re: [GENERAL] please help understand freeing shared buffers

2012-01-06 Thread Tom Lane
Filip Rembiałkowski filip.rembialkow...@gmail.com writes:
 Among following queries, only THREE runs fast enough for me.
 I can't understand the logic behind this.

I'm not sure why you'd expect real answers when you haven't shown us
what the query is doing, but my first thought is that the discrepancy
comes from additional buffer touches in the first execution of a query
in a given backend; which is not exactly surprising because that backend
has to load up its system catalog caches.  IOW, the excess touches
represent accesses to system catalogs not user tables.

In general, if you're annoyed by query execution times measured in
milliseconds, you'd be best advised not to start a fresh connection
for each one.  A new connection not only involves a process launch
but a fair amount of loading of local caches, and a large part of
the latter work happens during the first few queries it processes.

regards, tom lane



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Steve Crawford

On 01/06/2012 01:11 PM, Phoenix Kiula wrote:

On Fri, Jan 6, 2012 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Phoenix Kiula phoenix.ki...@gmail.com writes:

Hi. I'm using Postgresql 9.0.5, and the connection is made via pgbouncer.

Perhaps pgbouncer is redirecting the second command to a different
session?




Thanks Tom. I'm in the exact same session in my terminal, and the
commands are entered within 2 seconds of each other. With copy/paste,
maybe split microseconds of each other.

How can I make sure pgbouncer takes it all in the same session? I also
tried the two commands within a transaction.



Sounds like you are using statement pooling - every statement can be 
assigned to a different server connection. You may need transaction 
pooling or session pooling:


http://pgbouncer.projects.postgresql.org/doc/usage.html
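
The relevant pgbouncer.ini setting would be something like this (per those
docs; a minimal sketch, not a full config):

    [pgbouncer]
    pool_mode = transaction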

Cheers,
Steve




Re: [GENERAL] please help understand freeing shared buffers

2012-01-06 Thread Filip Rembiałkowski
2012/1/6 Tom Lane t...@sss.pgh.pa.us:
 Filip Rembiałkowski filip.rembialkow...@gmail.com writes:
 Among following queries, only THREE runs fast enough for me.
 I can't understand the logic behind this.

 I'm not sure why you'd expect real answers when you haven't shown us
 what the query is doing,

it is a UDF, encapsulating a single SELECT where a=$1 and b=$2 and c=$3

 but my first thought is that the discrepancy
 comes from additional buffer touches in the first execution of a query
 in a given backend; which is not exactly surprising because that backend
 has to load up its system catalog caches.  IOW, the excess touches
 represent accesses to system catalogs not user tables.

 In general, if you're annoyed by query execution times measured in
 milliseconds, you'd be best advised not to start a fresh connection
 for each one.  A new connection not only involves a process launch
 but a fair amount of loading of local caches, and a large part of
 the latter work happens during the first few queries it processes.


thank you, that explains a lot.

I misinterpreted the number of buffer hits as true buffer reads.

sure, using persistent connections is what I will do (we have pgbouncer here)


Filip



Re: [GENERAL] JOIN column maximum

2012-01-06 Thread Lee Hachadoorian
On Thu, Jan 5, 2012 at 10:19 PM, Darren Duncan dar...@darrenduncan.net wrote:


 This all being said, 23K values per row just sounds wrong, and I can't
 imagine any census forms having that many details.

 Do you, by chance, have multiple values of the same type that are in
 different fields, eg telephone_1, telephone_2 or child_1, child_2 etc?  You
 should take any of those and collect them into array-typed fields, or
 separate tables with just telephone or child columns.  Or do you say have a
 set of coordinates in separate fields?  Or you may have other kinds of
 redundancy within single rows that are best normalized into separate rows.

 With 23K values, these probably have many mutual associations, and you
 could split that table into a bunch of other ones where columns that relate
 more closely together are collected.

 My question is already answered, so this is mostly for anyone curious
about Census data, and if anyone wants to add to this, feel free.

You're right that no census form has that many details. The American
Community Survey replaced the old Census long form, so it does have more
details than the form sent out for the 2010 Census, but not 23,000
questions. It might ask, e.g. income, sex, and how you travel to work. But
the forms are private, so that data (the so-called microdata) is not
released in its entirety. What I am working with is called the summary
file, which presents the data in aggregate. That means you have an income
table with 16 income classes, plus a total column. Then you have 9 more
tables which show the same income classes by 9 racial and ethnic
categories, for a total of 153 more columns. Then you also have a table
which crosses 9 income classes by 5 mobility statuses (living in same
house, moved within county, moved within state, etc.) for a total of 55
columns. Then you have income classes crossed with sex, income classes
crossed with mode of transportation to work, sex crossed with mode of
transportation to work, etc. When all is said and done, you have 23,000
columns.

Believe me, I would love to be working directly with the microdata. All
of these different ways of slicing and dicing the categories are basically
how the Census Bureau provides as much detail as possible without providing
so much detail that privacy would be compromised (i.e., enough information
is available that specific individuals could be identified). That
inevitably leads to a great deal of redundancy in the data, since the same
individuals are just being counted in different groups in different tables.

Given all that, one could still take the data that came from Census and try
to normalize it and organize it, but my main goal given the size of the
dataset is to keep it as similar as possible to the source, so that a
researcher familiar with the Census data can work with our database using
the Census documentation without our having to produce a new set of
documentation. The way I had done that was to store the sequences (which
are merely a data dissemination convenience, and have no relationship to
the logic of the data) in Postgres tables, and make the subject tables
(which do represent logical groupings of data) into views. I'm thinking
about making the sequences into array columns. The subject tables would
still be views.

--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/


Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Marko Kreen
On Fri, Jan 6, 2012 at 11:24 PM, Steve Crawford
scrawf...@pinpointresearch.com wrote:
 On 01/06/2012 01:11 PM, Phoenix Kiula wrote:

 On Fri, Jan 6, 2012 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Phoenix Kiula phoenix.ki...@gmail.com writes:

 Hi. I'm using Postgresql 9.0.5, and the connection is made via
 pgbouncer.

 Perhaps pgbouncer is redirecting the second command to a different
 session?



 Thanks Tom. I'm in the exact same session in my terminal, and the
 commands are entered within 2 seconds of each other. With copy/paste,
 maybe split microseconds of each other.

 How can I make sure pgbouncer takes it all in the same session? I also
 tried the two commands within a transaction.


 Sounds like you are using statement pooling - every statement can be
 assigned to a different server connection. You may need transaction pooling
 or session pooling:

 http://pgbouncer.projects.postgresql.org/doc/usage.html

Statement pooling throws error on open transaction.

-- 
marko



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Fri, Jan 6, 2012 at 11:24 PM, Steve Crawford
 scrawf...@pinpointresearch.com wrote:
 On 01/06/2012 01:11 PM, Phoenix Kiula wrote:
 How can I make sure pgbouncer takes it all in the same session? I also
 tried the two commands within a transaction.

 Sounds like you are using statement pooling - every statement can be
 assigned to a different server connection. You may need transaction pooling
 or session pooling:
 http://pgbouncer.projects.postgresql.org/doc/usage.html

 Statement pooling throws error on open transaction.

Yeah, if it still fails within a single transaction, it gets harder to
blame pgbouncer.  But there are not very many other candidates.  I
wondered about a funny setting of search_path, but ISTM that could at
worst result in copying into the wrong table (ie some other table named
vl), not failure to find any vl at all.

It might be worth turning on statement logging and ensuring that
log_line_prefix includes the process PID.  Then the postmaster log would
provide indisputable evidence whether the CREATE and the COPY are
executed in the same session or not.

regards, tom lane



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Phoenix Kiula
On Fri, Jan 6, 2012 at 4:24 PM, Steve Crawford
scrawf...@pinpointresearch.com wrote:
 On 01/06/2012 01:11 PM, Phoenix Kiula wrote:

 On Fri, Jan 6, 2012 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Phoenix Kiula phoenix.ki...@gmail.com writes:

 Hi. I'm using Postgresql 9.0.5, and the connection is made via
 pgbouncer.

 Perhaps pgbouncer is redirecting the second command to a different
 session?



 Thanks Tom. I'm in the exact same session in my terminal, and the
 commands are entered within 2 seconds of each other. With copy/paste,
 maybe split microseconds of each other.

 How can I make sure pgbouncer takes it all in the same session? I also
 tried the two commands within a transaction.


 Sounds like you are using statement pooling - every statement can be
 assigned to a different server connection. You may need transaction pooling
 or session pooling:

 http://pgbouncer.projects.postgresql.org/doc/usage.html



Thanks Steve. YES! I changed it to transaction pooling and now it works.

Another problem, though.

I need to COPY a huge text file into a table, with about 350 million
lines in the file (i.e., 350 million rows in the table).

While copying, some lines do not have data. They are empty values.

How can I specify in the COPY command that if data is not found, it should
be ignored? In my temp table definition I set this column as NULL
anyway, so it should be OK if this column is left empty!

What can I do in my COPY command to circumvent this?

Thanks.



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Adrian Klaver

On 01/06/2012 03:12 PM, Phoenix Kiula wrote:

On Fri, Jan 6, 2012 at 4:24 PM, Steve Crawford
scrawf...@pinpointresearch.com  wrote:

On 01/06/2012 01:11 PM, Phoenix Kiula wrote:





Thanks Steve. YES! I changed it to transaction pooling and now it works.

Another problem through.

I need to COPY a huge text file into a table, with about 350 million
lines in the file (i.e., 350 million rows in the table).

While copying, some lines do not have data. They are empty values.

How can I specify in COPY command that if data is not found, it should
be ignored? In my temp table definition, I set this column as NULL
anyway, so it should be ok if this column was left empty!

What can I do in my COPY command to circumvent this?


http://www.postgresql.org/docs/9.0/interactive/sql-copy.html

Search for
NULL



Thanks.




--
Adrian Klaver
adrian.kla...@gmail.com



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Phoenix Kiula
On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaver adrian.kla...@gmail.com wrote:

 http://www.postgresql.org/docs/9.0/interactive/sql-copy.html

 Search for
 NULL



Thanks Adrian.

Without examples, it's hard to predict syntax. If the value after a
pipe is missing altogether, I suppose the missing value is \n
(newline). But this doesn't work:

   copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n';

None of these work either:

   copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n;
   copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n;
   copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '';

The first two give errors, the third one throws the same missing value
for column error.

The data is stored like this:

123|big string here|189209209|US|2001-01-01
123|big string here|189209209|US|2001-01-01
123|big string here|189209209|US|2001-01-01
123|big string here|189209209|US|2001-01-01

But sometimes, the strings are:

|big string here|189209209|US|2001-01-01
|big string here|189209209|US|2001-01-01

Or

123|big string here|189209209|US
123|big string here|189209209|US|

So you see, either the first column (which is the ID, in a way) is
missing, where the missing character is probably a blank ('')? In this
case I want COPY to just ignore the line.

Or the last column is missing, where the missing character can be a
newline I suppose?

So how do I specify this in the COPY command so that it doesn't croak?
If a line's ID is missing, it should ignore the line and go on instead
of not doing anything by throwing an error for EVERYTHING!

Thanks.



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Steve Crawford

On 01/06/2012 03:42 PM, Phoenix Kiula wrote:

On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaver adrian.kla...@gmail.com wrote:


http://www.postgresql.org/docs/9.0/interactive/sql-copy.html

Search for
NULL



Thanks Adrian.

Without examples, it's hard to predict syntax. If the value after a
pipe is missing altogether, I suppose the missing value is \n
(newline). But this doesn't work:

copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n';

None of these work either:

copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n;
copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n;
copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '';

The first two give errors, the third one throws the same missing value
for column error.

The data is stored like this:

 123|big string here|189209209|US|2001-01-01
 123|big string here|189209209|US|2001-01-01
 123|big string here|189209209|US|2001-01-01
 123|big string here|189209209|US|2001-01-01

But sometimes, the strings are:

 |big string here|189209209|US|2001-01-01
 |big string here|189209209|US|2001-01-01

Or

 123|big string here|189209209|US
 123|big string here|189209209|US|

So you see either the first column, which is the ID in a way, is
missing so the missing character is probably a blank (''?). In this
case I want COPY to just ignore this line.

Or the last column is missing, where the missing character can be a
newline I suppose?

So how do I specify this in the COPY command so that it doesn't croak?
If a line's ID is missing, it should ignore the line and go on instead
of not doing anything by throwing an error for EVERYTHING!

Thanks.


Missing data is one thing, missing delimiters is another. Try doing a 
small copy of data with just a few lines to see which variants are 
actually causing the error. My money is on the one that has a mismatch 
between the table column count and the data column count. I.e., the row 
with three delimiters instead of four:


123|big string here|189209209|US

When you say ignore, do you mean that you want PostgreSQL to assume a 
null value for the missing column or to not import that row at all?


In general, when you have data scrubbing issues like this, 
grep/sed/awk/... are your friends. Clean it up then import it.


I suppose you could import all rows into a big text field and process it 
in PostgreSQL but I doubt you will find that to be an optimal solution.
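
For what it's worth, that last fallback would look something like this
(an untested sketch; E'\x01' is just a delimiter assumed not to occur in
the data, and the target column mapping is illustrative):

CREATE TEMPORARY TABLE raw_lines (line text);
COPY raw_lines FROM '/backup/data.txt' WITH DELIMITER E'\x01';

-- keep only lines that have a numeric ID and all five fields
INSERT INTO vl (alias, dates, referers)
SELECT split_part(line, '|', 2),
       split_part(line, '|', 5)::timestamp,
       split_part(line, '|', 3)
FROM raw_lines
WHERE line ~ E'^[0-9]+\\|([^|]*\\|){3}[^|]+$';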


Cheers,
Steve



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Adrian Klaver

On 01/06/2012 03:42 PM, Phoenix Kiula wrote:

On Fri, Jan 6, 2012 at 6:20 PM, Adrian Klaver adrian.kla...@gmail.com wrote:


http://www.postgresql.org/docs/9.0/interactive/sql-copy.html

Search for
NULL




Thanks Adrian.

Without examples, it's hard to predict syntax. If the value after a
pipe is missing altogether, I suppose the missing value is \n
(newline). But this doesn't work:

copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '\n';

None of these work either:

copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \n;
copy vl from 'data.txt' WITH DELIMITER AS '|' NULL \\n;
copy vl from 'data.txt' WITH DELIMITER AS '|' NULL '';


Try:
copy vl from 'data.txt' WITH CSV DELIMITER '|';


If that doesn't work take a look at pgloader:
http://pgfoundry.org/projects/pgloader/






--
Adrian Klaver
adrian.kla...@gmail.com



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Phoenix Kiula
On Fri, Jan 6, 2012 at 6:53 PM, Steve Crawford
scrawf...@pinpointresearch.com wrote:
 On 01/06/2012 03:42 PM, Phoenix Kiula wrote:

 [most of Steve's reply trimmed; see his message above. His key questions:]

 When you say ignore, do you mean that you want PostgreSQL to assume a null
 value for the missing column or to not import that row at all?

 In general, when you have data scrubbing issues like this, grep/sed/awk/...
 are your friends. Clean it up then import it.



Thanks Steve.

The file has 350 million lines. Sed, awk, etc. are a little painful when
the file is 18GB with that many lines.

I'd want Postgresql to ignore the line altogether when something is
missing. Is this an option we can use, or are rules foisted on us?



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Steve Crawford

On 01/06/2012 03:12 PM, Phoenix Kiula wrote:

...

Sounds like you are using statement pooling - every statement can be
assigned to a different server connection. You may need transaction pooling
or session pooling:

http://pgbouncer.projects.postgresql.org/doc/usage.html



Thanks Steve. YES! I changed it to transaction pooling and now it works.


But Marko is correct. If you were using:
begin;
create temporary...
\copy
commit;

and if your pooler was set to statement then you should have seen an error.

Are you by any chance routing stderr to /dev/null or otherwise hiding
messages? If you are using psql ... 2>/dev/null then everything would
*look* the same with pooling set to statement or transaction but you
would be blissfully ignorant of the errors.


Cheers,
Steve



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Phoenix Kiula
On Fri, Jan 6, 2012 at 6:54 PM, Adrian Klaver adrian.kla...@gmail.com wrote:

 Try:
 copy vl from 'data.txt' WITH CSV DELIMITER '|';


Doesn't work. Can't see what the different in CSV is from a text file.
Same errors are thrown.


 If that doesn't work take a look at pgloader:
 http://pgfoundry.org/projects/pgloader/


Wow, another geeky tool. Hard to find documentation. Archaic
presentation, no simple steps to install and get using. Anyway doesn't
seem to provide the options I need
(http://pgloader.projects.postgresql.org/) --

a) Ability to assign more than one NULL value
b) Ability to ignore lines altogether that have any problem

Really, other databases have mechanisms to ignore problem lines
while copying. Does Postgresql allow me to *ignore* lines while
COPYING?



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Steve Crawford

On 01/06/2012 03:55 PM, Phoenix Kiula wrote:

...

In general, when you have data scrubbing issues like this, grep/sed/awk/...
are your friends. Clean it up then import it.

Thanks Steve.

The file has 350 million lines. Sed, Awk etc are a little painful when
the file is 18GB witht hat many lines.

I'd want Postgresql to ignore the line altogether when something is
missing. Is this an option we can use, or are rules hoisted on us?



I've found grep, sed and friends to be quite effective and proper 
pre-cleaning to have a relatively minor impact on performance. Done 
properly, you will just be piping the data through a very simple 
grep/sed/awk/... into psql. No extra disk-reads, minimal memory use and 
a bit of CPU. And you will be in charge of deciding how suspect data is 
handled.


If by "rules foisted" you mean "will PostgreSQL make arbitrary and
possibly incorrect assumptions to attempt to force bad data into a
table" then the answer is no.  In fact, it has become more and more
picky over time. Trust me, at some point you will thank it for doing so.


For example, the following used to work:
select current_date < 2020-01-01;

But it returned false, which was probably not what the user wanted.
(2020-01-01 is the integer 2018, which PostgreSQL interpreted as being
less than current_date.) But it sure looks OK at first glance when you
really meant '2020-01-01'::date.


In current versions, that statement will throw an error, just as
PostgreSQL does with dates like February 31 and a variety of other
things that certain other DBMSs deem good.


Cheers,
Steve




Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread Adrian Klaver
On Friday, January 06, 2012 4:16:09 pm Phoenix Kiula wrote:
 On Fri, Jan 6, 2012 at 6:54 PM, Adrian Klaver adrian.kla...@gmail.com wrote:
  Try:
  copy vl from 'data.txt' WITH CSV DELIMITER '|';
 
 Doesn't work. Can't see what the different in CSV is from a text file.
 Same errors are thrown.
 
  If that doesn't work take a look at pgloader:
  http://pgfoundry.org/projects/pgloader/
 
 Wow, another geeky tool. Hard to find documentation. Archaic
 presentation, no simple steps to install and get using. Anyway doesn't
 seem to provide the options I need
 (http://pgloader.projects.postgresql.org/) --

Install:
Download
tar -xzvf pgloader-2.3.2.tar.gz 
cd pgloader-2.3.2/
sudo make

Very difficult.

http://pgloader.projects.postgresql.org/
 
 a) Ability to assign more than one NULL value
null
You can configure here how null value is represented into your flat data file.

This parameter is optional and defaults to '' (that is empty string). If 
defined 
on a table level, this local value will overwrite the global one.

empty_string
You can configure here how empty values are represented into your flat data 
file.

This parameter is optional and defaults to \ (that is backslash followed by 
space). If defined on a table level, this local value will overwrite the global 
one.

reformat
Use this option when you need to preprocess some column data with pgloader 
reformatting modules, or your own. The value of this option is a comma 
separated 
list of columns to rewrite, which are a colon separated list of column name, 
reformat module name, reformat function name. Here's an example to reformat 
column dt_cx with the mysql.timestamp() reformatting function:

reformat = dt_cx:mysql:timestamp
See global setting option reformat_path for configuring where pgloader will 
look 
for reformat packages and modules.

If you want to write a new formating function, provide a python package called 
reformat (a directory of this name containing an empty __init__.py file will 
do) 
and place in there arbitrary named modules (foo.py files) containing functions 
with the following signature:

def bar(reject, input)
The reject object has a log(self, messages, data = None) method for you to log 
errors into section.rej.log and section.rej files.

 b) Ability to ignore lines altogether that have any problem

reject_log
In case of errors processing input data, a human readable log per rejected 
input 
data line is produced into the reject_log file.
reject_data
In case of errors processing input data, the rejected input line is appended to 
the reject_data file.

 
 Really, other databases have mechanisms to ignore problem lines
 while copying. Does Postgresql allow me to *ignore* lines while
 COPYING?

No.

I await with bated breath your most excellent patch to COPY.


-- 
Adrian Klaver
adrian.kla...@gmail.com



Re: [GENERAL] URGENT: temporary table not recognized?

2012-01-06 Thread John R Pierce

On 01/06/12 5:33 PM, Phoenix Kiula wrote:

  http://pgloader.projects.postgresql.org/

Sorry. That I already did.

But where's the config file? How to configure the config file?
Where's the simple doc (not on that ugly PGFoundry website, I mean in
English that people can understand what to do, with a starting
sample?)





you should check your attitude at the door.  this isn't Microsoft Pay 
per Incident Tech Support.




what you're asking for is right on that very URL you so conveniently 
quoted, both the configuration file format (Global Configuration 
Section), and where the sample pgloader.conf can be found...



Please see the given configuration example which should be distributed
in /usr/share/doc/pgloader/examples/pgloader.conf.


The example configuration file comes with example data and can be used
as a unit test of pgloader.





so, yes, you need to download the package and un-tar it to see the 
sample .conf


$ wget http://pgfoundry.org/frs/download.php/2294/pgloader-2.3.2.tar.gz
$ tar xzf pgloader-2.3.2.tar.gz
$ cd pgloader-2.3.2
$ more examples/pgloader.conf


