Re: [GENERAL] postgres external table

2010-01-19 Thread Greg Stark
On Tue, Jan 19, 2010 at 4:41 AM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 How can that work without a transactional file system, though? If the
 external process writes to the file while you're half-way through reading
 it, what's the database to do? In general, how do external tables cope with
 the fact that they're on non-transactional storage?

Well if you use mv to replace the old file with the new one then it
should be safe. Unless your query involves opening the table multiple
times or your transactions are more complex than a single query...


-- 
greg

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-19 Thread Jayadevan M
Yes. We get quite a few files as 'feeds' from external systems. Once the 
files are in our network, we know that no changes will happen to those 
files. We access them using Oracle external tables and process them (the 
data, after some processing, end up in other real tables). If external 
tables were not there, we would have had to schedule some job to load 
these files.
Jayadevan




From:   Greg Stark gsst...@mit.edu
To: Craig Ringer cr...@postnewspapers.com.au
Cc: Vincenzo Romano vincenzo.rom...@notorand.it, Tom Lane 
t...@sss.pgh.pa.us, Greg Smith g...@2ndquadrant.com, Amy Smith 
vah...@gmail.com, pgsql-general@postgresql.org
Date:   01/19/2010 04:37 PM
Subject:Re: [GENERAL] postgres external table
Sent by:pgsql-general-ow...@postgresql.org



On Tue, Jan 19, 2010 at 4:41 AM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 How can that work without a transactional file system, though? If the
 external process writes to the file while you're half-way through 
reading
 it, what's the database to do? In general, how do external tables cope 
with
 the fact that they're on non-transactional storage?

Well if you use mv to replace the old file with the new one then it
should be safe. Unless your query involves opening the table multiple
times or your transactions are more complex than a single query...


-- 
greg

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general






DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






Re: [GENERAL] postgres external table

2010-01-19 Thread Alban Hertroys
On 19 Jan 2010, at 12:16, Jayadevan M wrote:

 Yes. We get quite a few files as 'feeds' from external systems. Once the 
 files are in our network, we know that no changes will happen to those files. 
 We access them using Oracle external tables and process them (the data, after 
 some processing, end up in other real tables). If external tables were not 
 there, we would have had to schedule some job to load these files. 

I'm surprised you even need to download them to your network. I suspect it 
would be quite possible to implement 'external tables' so that they could be 
read from eg. an RSS feed. That would be a pretty slick feature.

There's a gazillion number of data formats that people would want support for 
though; CSV, probably excel, RSS feeds, web services, etc.

Most of that should probably be implemented by means of stored procedures 
(using the unsafe languages - pl/pythonu, pl/perlu, etc) returning TABLE (...), 
which means this is in fact already possible I think? It's just that nobody's 
(publicly) thought of doing this so far.

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:737,4b559f7d10601504612258!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-19 Thread Jayadevan M
Hi,
We use Oracle's external tables to process the files (so no need to look 
for the other alternatives you mentioned). We do not have access to the 
data providers' network to access the files. That is why we process them 
after we receive the files.
If there is a database implementation of 'external tables' in the way you 
mentioned (CSV, probably excel, RSS feeds, web services, etc) at db level, 
that would be great.  Most ETL tools already do accept feeds of the type 
you mentioned and load those into tables.
Jayadevan




From:   Alban Hertroys dal...@solfertje.student.utwente.nl
To: Jayadevan M jayadevan.maym...@ibsplc.com
Cc: pgsql-general@postgresql.org
Date:   01/19/2010 05:31 PM
Subject:Re: [GENERAL] postgres external table



On 19 Jan 2010, at 12:16, Jayadevan M wrote:

 Yes. We get quite a few files as 'feeds' from external systems. Once the 
files are in our network, we know that no changes will happen to those 
files. We access them using Oracle external tables and process them (the 
data, after some processing, end up in other real tables). If external 
tables were not there, we would have had to schedule some job to load 
these files. 

I'm surprised you even need to download them to your network. I suspect it 
would be quite possible to implement 'external tables' so that they could 
be read from eg. an RSS feed. That would be a pretty slick feature.

There's a gazillion number of data formats that people would want support 
for though; CSV, probably excel, RSS feeds, web services, etc.

Most of that should probably be implemented by means of stored procedures 
(using the unsafe languages - pl/pythonu, pl/perlu, etc) returning TABLE 
(...), which means this is in fact already possible I think? It's just 
that nobody's (publicly) thought of doing this so far.

Alban Hertroys

--
Screwing up is the best way to attach something to the ceiling.


!DSPAM:1019,4b559f7b10604920110887!








DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






Re: [GENERAL] postgres external table

2010-01-19 Thread Greg Smith

Craig Ringer wrote:

On 19/01/2010 1:13 AM, Vincenzo Romano wrote:


Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...


How can that work without a transactional file system, though? If the 
external process writes to the file while you're half-way through 
reading it, what's the database to do? In general, how do external 
tables cope with the fact that they're on non-transactional storage?


With Oracle's implementation, you version the input files (timestamp or 
something) and then do ALTER TABLE to change the location the external 
table points to.  That will block waiting for exclusive access before it 
fires, then you get a clean switch to the new location.  There is no 
smartness here to cope with weird behavior built-in here--I expect it 
will just crash the query.  One thing you always have to be careful 
about when using these is that an external table might return a weird 
query error under odd circumstances such as you describe, which you 
might not normally expect from a simple SELECT.



--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Greg Smith

Craig Ringer wrote:

For those non-Oracle users among us, what's an external table?


External tables let you map a text file directly to a table without 
explicitly loading it.  In PostgreSQL, if you have data in a CSV file, 
usually you'd import it with COPY before you'd use it.  If external 
tables were available, you'd just say there's an external table as a CSV 
file and you could start running queries against it.


So the quick answer is no, there is no built-in external table support 
in PostgreSQL; normally people load the data using COPY instead.  
There's a work in progress on this topic at 
http://pgfoundry.org/projects/pgexternaltable/ , but it's extremely 
rough at this point and I'm not sure if it's even moving in the right 
direction--the main project would do something like this via SQL/MED, 
and I don't think that's how the prototype is being built at all.


The only PostgreSQL-based product I'm aware of that has working external 
table support already is Greenplum DB.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 Craig Ringer wrote:
 For those non-Oracle users among us, what's an external table?

 External tables let you map a text file directly to a table without 
 explicitly loading it.  In PostgreSQL, if you have data in a CSV file, 
 usually you'd import it with COPY before you'd use it.  If external 
 tables were available, you'd just say there's an external table as a CSV 
 file and you could start running queries against it.

I'm finding it hard to visualize a use-case for that.  We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it.  Which among other things
implies that every query will seqscan the whole table.  Where's the
savings?

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Sam Mason
On Mon, Jan 18, 2010 at 09:57:02AM -0500, Tom Lane wrote:
 Greg Smith g...@2ndquadrant.com writes:
  Craig Ringer wrote:
  For those non-Oracle users among us, what's an external table?
 
  External tables let you map a text file directly to a table without 
  explicitly loading it.  In PostgreSQL, if you have data in a CSV file, 
  usually you'd import it with COPY before you'd use it.  If external 
  tables were available, you'd just say there's an external table as a CSV 
  file and you could start running queries against it.
 
 I'm finding it hard to visualize a use-case for that.  We must postulate
 that the table is so big that you don't want to import it, and yet you
 don't feel a need to have any index on it.  Which among other things
 implies that every query will seqscan the whole table.  Where's the
 savings?

I've mostly wanted something like it when I've been dealing with
externally maintained data.  The best idea I've seen so far has just
been a function similar to:

  copy_csv_from(filename text) returns setof text[]

to be used as:

  SELECT d[0]::Int AS id, d[1] AS name
  FROM copy_csv_from('/tmp/usernames.csv') d;

This could be wrapped in a VIEW giving what I'd expect to be similar
semantics to an external table, however I've never used one so I could
be missing something.

It's possible to write this function at the moment, it's somewhat
suboptimal as the csv file is completely imported before anything else
happens so is only good for small files.

-- 
  Sam  http://samason.me.uk/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Greg Stark
On Mon, Jan 18, 2010 at 2:57 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm finding it hard to visualize a use-case for that.  We must postulate
 that the table is so big that you don't want to import it, and yet you
 don't feel a need to have any index on it.  Which among other things
 implies that every query will seqscan the whole table.  Where's the
 savings?

I think it's usually more my data is updated by other tools and it
would be hard/impossible/annoying to insert another step into the
pipeline to copy it to yet another place. The main benefit is that
you can access the authoritative data directly without having to copy
it and have some sort of process in place to do that regularly.

Text files are kind of useless but they're a baseline bit of
functionality on top of which to add more sophisticated external forms
such as data available over at some url or over some kind of rpc -- to
which various conditions could be pushed using external indexes -- or
ultimately in another database to which whole joins can be pushed.


-- 
greg

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Scott Marlowe
On Mon, Jan 18, 2010 at 7:57 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Smith g...@2ndquadrant.com writes:
 Craig Ringer wrote:
 For those non-Oracle users among us, what's an external table?

 External tables let you map a text file directly to a table without
 explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
 usually you'd import it with COPY before you'd use it.  If external
 tables were available, you'd just say there's an external table as a CSV
 file and you could start running queries against it.

 I'm finding it hard to visualize a use-case for that.  We must postulate
 that the table is so big that you don't want to import it, and yet you
 don't feel a need to have any index on it.  Which among other things
 implies that every query will seqscan the whole table.  Where's the
 savings?

I've used it mostly for importing in the past.  Saves the step of
loading a large file into a table with no constraints as a middle
step.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Greg Smith

Tom Lane wrote:

I'm finding it hard to visualize a use-case for that.  We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it.  Which among other things
implies that every query will seqscan the whole table.  Where's the
savings?
  


I've mainly seen it used for data loading where there's some sort of 
transformation going on, typically to cleanup junk fields that would 
fail a constraint or derive new columns.  If you have external tables, 
there's no need to load the data into a temporary table if all you're 
going to do is modify a few things and then write the result to 
somewhere else.  Most of these use cases process the whole file anyway, 
so having to do a whole scan isn't an issue.  I used to run an app that 
imported gigabytes a day of text files dumped from another server that 
used a weird date format I had to process via pl/pgsql function.  Having 
to pass them through COPY and then INSERT processed versions to 
somewhere else was really a drag, given that there was no use for the 
intermediate data.


It also can be handy for bootstrapping apps that are converting stuff 
out of a legacy system too.  Just make the mainframe/whatever dump a new 
text file periodically into where the external table looks for its data, 
and you skip having to schedule reloads when the content changes.  Can 
make your life easier while running the two systems in parallel initially.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Vincenzo Romano
2010/1/18 Tom Lane t...@sss.pgh.pa.us:
 Greg Smith g...@2ndquadrant.com writes:
 Craig Ringer wrote:
 For those non-Oracle users among us, what's an external table?

 External tables let you map a text file directly to a table without
 explicitly loading it.  In PostgreSQL, if you have data in a CSV file,
 usually you'd import it with COPY before you'd use it.  If external
 tables were available, you'd just say there's an external table as a CSV
 file and you could start running queries against it.

 I'm finding it hard to visualize a use-case for that.  We must postulate
 that the table is so big that you don't want to import it, and yet you
 don't feel a need to have any index on it.  Which among other things
 implies that every query will seqscan the whole table.  Where's the
 savings?

                        regards, tom lane

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general


Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...

-- 
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Craig Ringer

On 19/01/2010 1:13 AM, Vincenzo Romano wrote:


Another case, Tom, could be when the file is updated from a non-DB
application and you need to synchronize
the data with other DB applications ...


How can that work without a transactional file system, though? If the 
external process writes to the file while you're half-way through 
reading it, what's the database to do? In general, how do external 
tables cope with the fact that they're on non-transactional storage?


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-18 Thread Jayadevan M
Having 'external tables' lets us avoid the step of loading data from a 
file into the table. We do not have to check whether a load job has run 
successfully, whether the data in the table really corresponds to the data 
in the file etc. It also lets us decide how many rejects can be allowed 
and so forth.
http://www.adp-gmbh.ch/ora/misc/ext_table.html
Other than that, I have not found any advantage.
Jayadevan




From:   Tom Lane t...@sss.pgh.pa.us
To: Greg Smith g...@2ndquadrant.com
Cc: Craig Ringer cr...@postnewspapers.com.au, Amy Smith 
vah...@gmail.com, pgsql-general@postgresql.org
Date:   01/18/2010 08:25 PM
Subject:Re: [GENERAL] postgres external table
Sent by:pgsql-general-ow...@postgresql.org



Greg Smith g...@2ndquadrant.com writes:
 Craig Ringer wrote:
 For those non-Oracle users among us, what's an external table?

 External tables let you map a text file directly to a table without 
 explicitly loading it.  In PostgreSQL, if you have data in a CSV file, 
 usually you'd import it with COPY before you'd use it.  If external 
 tables were available, you'd just say there's an external table as a CSV 

 file and you could start running queries against it.

I'm finding it hard to visualize a use-case for that.  We must postulate
that the table is so big that you don't want to import it, and yet you
don't feel a need to have any index on it.  Which among other things
implies that every query will seqscan the whole table.  Where's the
savings?

 regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general






DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






[GENERAL] postgres external table

2010-01-17 Thread Amy Smith
all
is there a external table create method ( similar to oracle external table )
? where to find the information ?
thanks
Amy


Re: [GENERAL] postgres external table

2010-01-17 Thread Craig Ringer
Amy Smith wrote:
 all
 is there a external table create method ( similar to oracle external
 table ) ? where to find the information ?

For those non-Oracle users among us, what's an external table? What are
you trying to achieve?

Random guess: you might be looking for tablespaces.

--
Craig Ringer

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] postgres external table

2010-01-17 Thread David Fetter
On Sun, Jan 17, 2010 at 07:27:34PM -0800, Amy Smith wrote:
 all
 is there a external table create method ( similar to oracle external table )
 ? where to find the information ?

There is a project on pgfoundry which has had some activity lately
that's similar.  You might also try DBI-Link.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general