On 10/09/2013 11:23 AM, Dimitri Fontaine wrote:
Andrew Dunstan <and...@dunslane.net> writes:
I don't see at all that your suggested alternative has any advantages over
what's been written. If you can say "NULL FOR (foo) as '""'", how will you
specify the null for some other column(s)? Are we going to have multiple
such clauses? It looks like a real mess.
Basically the CSV files don't have out-of-band NULLs and it's then a
real mess. In the new pgloader version I've been adding per-column NULL
processing, where NULL can be either an empty string, any number of
space characters or any constant string such as "\N" or "****".

I first added a global per-file NULL representation setting, but that's
not flexible enough to make any sense really. The files we have to
import are way too "creative" in their formats.
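
As a rough illustration of the per-column idea, here is a small Python
sketch. It is not pgloader's code or configuration syntax; the names
NULL_RULES, is_blank, equals and load_rows are made up for the example,
and the sample data is invented.

```python
import csv
import io


def is_blank(value):
    """True for an empty string or any number of space characters."""
    return value.strip(" ") == ""


def equals(marker):
    """Build a rule that matches one constant string exactly."""
    return lambda value: value == marker


# Hypothetical per-column NULL rules (illustrative only): each column
# lists the tests that decide whether a raw CSV value means NULL.
NULL_RULES = {
    "name": [is_blank],
    "score": [equals("\\N"), equals("****")],
}


def load_rows(text, rules):
    """Parse CSV text, mapping values that match a column's rules to None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            column: None
            if any(rule(value) for rule in rules.get(column, []))
            else value
            for column, value in record.items()
        })
    return rows


# Invented sample: blanks are NULL for "name" but not for "score".
SAMPLE = "name,score\nalice,42\n   ,\\N\nbob,****\ncarol,\n"
ROWS = load_rows(SAMPLE, NULL_RULES)
```

The last sample row shows why a single per-file setting falls short: the
empty score for "carol" stays an empty string, because only the "name"
column treats blanks as NULL.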

In my view, we can slowly deprecate pgloader by including such features
in the core code or make pgloader and the like non-optional parts of
the external data loading tool chain.



The CSV code was somewhat controversial when adopted, and was never intended to cater for all cases. I think it was accepted because it gave good coverage of a large number of common cases without huge additional code complexity. I think we drew the line in about the right place for what we support, although we've extended it modestly over the years. I seriously doubt that it will ever fully replace a utility like pgloader, and I'm not sure that's a desirable goal in the first place.

cheers

andrew

