-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/30/2015 10:58 AM, Sylvain Pointeau wrote:
> is it possible? in a lot of cases, I cannot use sqlite (executable)
> because of the lack of a good CSV import. It would really great if
> this could be addressed.

Use the APSW shell:

  http://rogerbinns.github.io/apsw/shell.html

Behind the scenes it uses the Python CSV libraries which support a
number of dialects including excel (default) and excel-tab.

The APSW shell also supports an .autoimport command.  It automatically
works out dialects, separators and data types.  Here is the extended help:

==== 8< ====
sqlite> .help autoimport

.autoimport FILENAME ?TABLE?  Imports filename creating a table
                              and automatically working out
                              separators and data types
                              (alternative to .import command)

The import command requires that you precisely pre-setup the
table and schema, and set the data separators (eg commas or
tabs).  In many cases this information can be automatically
deduced from the file contents which is what this command does.
There must be at least two columns and two rows.

If the table is not specified then the basename of the file will
be used.

Additionally the type of the contents of each column is also
deduced - for example if it is a number or date.  Empty values
are turned into nulls.  Dates are normalized into YYYY-MM-DD
format and DateTime are normalized into ISO8601 format to allow
easy sorting and searching.  4 digit years must be used to detect
dates.  US (swapped day and month) versus rest of the world is
also detected providing there is at least one value that resolves
the ambiguity.

Care is taken to ensure that columns looking like numbers are
only treated as numbers if they do not have unnecessary leading
zeroes or plus signs.  This is to avoid treating phone numbers
and similar number like strings as integers.

This command can take quite some time on large files as they are
effectively imported twice.  The first time is to determine the
format and the types for each column while the second pass
actually imports the data.
==== 8< ====

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlW6dDEACgkQmOOfHg372QRG5ACgt/OpLOPVZ40YQ4B2EWBf7yk0
HCUAniZLLUkPs7ac7cvCLb7Bn3zyNguC
=YpCt
-----END PGP SIGNATURE-----

Reply via email to