Re: [HACKERS] CSV mode option for pg_dump

PFC Tue, 13 Jun 2006 09:25:19 -0700


        From what I gather, the CSV format dump would only contain data.

I think pg_dump is the friend of pg_restore. It dumps everything, including user-defined functions, types, schemas, etc. CSV does not fit with this.

Besides, people will probably want to dump into CSV the result of any query, to load it into Excel, not just the full contents of a table.

So why not create a separate tool? Someone suggested pg_query for that, and I second it. This tool would take a query and format options, and would output a file in whatever format the user chooses (CSV, COPY format, XML, whatever).
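
As a rough sketch of what the formatting half of such a tool could look like: the name export_rows and the format keywords "csv" and "tsv" are made up for illustration, and it only uses the csv module from the Python 3 standard library. Where the rows come from (a psycopg2 cursor, for example) is left out.

```python
import csv
import sys


def export_rows(rows, out=None, fmt="csv"):
    """Write an iterable of row sequences to `out` in the chosen format.

    `fmt` is a hypothetical option name: "csv" uses the Excel dialect,
    "tsv" uses the tab-delimited Excel dialect.
    """
    if out is None:
        out = sys.stdout
    if fmt == "csv":
        writer = csv.writer(out, dialect=csv.excel)
    elif fmt == "tsv":
        writer = csv.writer(out, dialect=csv.excel_tab)
    else:
        raise ValueError("unsupported format: %r" % (fmt,))
    # The csv module handles quoting of delimiters and quote characters.
    writer.writerows(rows)
```

Adding another output format would then mean adding one more branch (or one more module), which is the kind of modularity I have in mind.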

A scripting language (Python) can be used, which will significantly shorten development time and allow easy modularity, as it is easier to add a module to a Python program than to a C program. I would vote for Python because I love it and it has a very good postgres adapter (psycopg2) which knows how to convert every postgres type to a native language type (yes, even multidimensional arrays of BOX get converted). And it's really fast at retrieving large volumes of data.

So you have a stable, fast tool for backup and restore (pg_dump) and a rapidly evolving, user-friendly and extendable tool for exporting data, and everyone is happy.

Mr Momjian talks about adding modular functionality to pg_dump. Is it really necessary? What is the objective? Is it to reuse code in pg_dump? I guess not; if a user wants to dump, for instance, all the tables in a schema, implementing this logic in Python is only a few lines of code (select from information_schema...).
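
For instance, enumerating and dumping the tables of a schema might look roughly like this. It is only a sketch: list_tables, dump_schema and quote_ident are hypothetical helper names, the cursor is assumed to be any DB-API cursor using the %s parameter style (as psycopg2 does), and the identifier quoting is plain double-quote escaping.

```python
def quote_ident(name):
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def list_tables(cursor, schema):
    """Return the names of the ordinary tables in `schema`, sorted by name."""
    cursor.execute(
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = %s AND table_type = 'BASE TABLE' "
        "ORDER BY table_name",
        (schema,),
    )
    return [row[0] for row in cursor.fetchall()]


def dump_schema(cursor, schema, export):
    """Call export(table_name, rows) for every table in `schema`.

    `export` is any callable supplied by the caller, e.g. one that
    writes the rows out as CSV.
    """
    for table in list_tables(cursor, schema):
        cursor.execute(
            "SELECT * FROM %s.%s" % (quote_ident(schema), quote_ident(table))
        )
        export(table, cursor.fetchall())
```

Note that this fetches each table completely into memory, which is fine for a sketch but not for very large tables.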

To be realistic, output format modules should be written in scripting languages. Nobody sane is eager to do string manipulation in C. Thus these modules would have to somehow fit with pg_dump, maybe with a pipe or something. This means designing another protocol. Reimplementing in a scripting language the parts of pg_dump which will be reused by this project (mainly, enumerating tables and stuff) will be far easier.


        Just look.

Python 2.4.2 (#1, Mar 30 2006, 14:34:35)
[GCC 3.4.4 (Gentoo 3.4.4-r1, ssp-3.4.4-1.0, pie-8.7.8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

...opens a db connection...

>>> c.execute( "SELECT * FROM test.csv" )
>>> data = c.fetchall()
>>> data

[[1, datetime.date(2006, 6, 13), 'this\tcontains\ttabulations'], [2, datetime.date(2006, 6, 13), "this'contains'quotes"], [3, datetime.date(2006, 6, 13), 'this"contains"double quotes']]

>>> import csv, sys
>>> c = csv.writer( sys.stdout, dialect = csv.excel )
>>> c.writerows( data )

1,2006-06-13,this       contains        tabulations
2,2006-06-13,this'contains'quotes
3,2006-06-13,"this""contains""double quotes"

