On Feb 18, 2008 4:01 PM, Graham Cossey <[EMAIL PROTECTED]> wrote:

> On Feb 18, 2008 6:45 PM, Andrew Ballard <[EMAIL PROTECTED]> wrote:
> > On Feb 18, 2008 1:39 PM, Robert Cummings <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > On Mon, 2008-02-18 at 13:24 -0500, Andrew Ballard wrote:
> > > > On Feb 18, 2008 1:08 PM, Graham Cossey <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > My biggest gripe with tab delimited files is that they are quite
> > > > > a bit bigger than comma delimited files so I may have to split
> > > > > the large files I receive into smaller 'chunks' to allow them to
> > > > > be uploaded.
> > > > >
> > > > >
> > > > Why would tab-delimited files be larger than CSV? A tab character
> > > > takes up just as much space as a comma as far as document size is
> > > > concerned. Am I missing something?
> > >
> > > He's probably confusing tab delimited with fixed width columns.
> > >
> > > Cheers,
> > > Rob.
> > >
> >
> > Ah, yes. That would also explain why Excel would open it as a single
> > column by default if he didn't use the text import wizard.
> >
> > FWIW - If you do open a text file like this, there is a menu item to
> > convert text to columnar data once the sheet is already open so you
> > don't have to close the document and reopen it.
> >
> > Andrew
> >
>
> Nope, not fixed width, definitely tab delimited (longer fields
> 'overlap' ones above and below when viewed in a text editor). As for
> the size difference, I don't know why, but when I open the files in
> Excel and save them as CSV they come out smaller.
>
> I just opened a 50.2KB tab delimited file in Excel, saved it as CSV,
> and the new file is 25KB!!
>
> Maybe I really am finally going mad.
>
>
> --
> Graham
>
>
I can only think of two things that might cause that sort of "bloat" in a
file. Either the original had a lot of values quoted as strings that Excel
felt did not need to be (Excel usually only quotes strings if they have
commas in them, whereas I've seen CSV files from an Oracle DB where every
field was quoted), or else the original was in a multi-byte character
encoding like UTF-16 and Excel converted it to a single-byte encoding such
as Windows-1252 when you saved it. A drop to almost exactly half the size,
like your 50.2KB to 25KB, is what the second case would produce, since
UTF-16 spends two bytes on each ordinary Western character. Beyond that, I
can't see any reason why the files would vary that much. Differing line
endings (CR LF versus just LF) could account for a small difference in a
file with many short lines, but Excel would, if anything, write the longer
CR LF endings, which would make its output larger rather than smaller.
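
For what it's worth, here is a rough Python sketch of how those factors compare. It is only an illustration: the sample rows and the `render` helper are made up, and it says nothing certain about what your particular file contains. It writes the same data tab-delimited and comma-delimited, with and without quoting every field, and then counts the bytes under UTF-16 and Windows-1252.

```python
import csv
import io


def render(rows, delimiter, quoting=csv.QUOTE_MINIMAL):
    """Serialize rows as delimited text with CR LF line endings."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, quoting=quoting,
                        lineterminator="\r\n")
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample data, for illustration only.
rows = [
    ["id", "name", "city"],
    ["1", "Graham", "London"],
    ["2", "Andrew", "Atlanta"],
]

tab_text = render(rows, "\t")
comma_text = render(rows, ",")
quoted_text = render(rows, "\t", csv.QUOTE_ALL)

sizes = {
    # Two bytes per character, plus a 2-byte byte order mark.
    "tab, UTF-16": len(tab_text.encode("utf-16")),
    "tab, Windows-1252": len(tab_text.encode("cp1252")),
    "comma, Windows-1252": len(comma_text.encode("cp1252")),
    # Two extra quote characters per field.
    "tab, every field quoted": len(quoted_text.encode("cp1252")),
}

for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

With data like this, the tab and comma versions come out the same size in the same encoding, quoting every field adds two bytes per field, and the UTF-16 version is about double. To check a real file, look at its first two bytes in a hex viewer: FF FE or FE FF at the start is the usual sign of UTF-16.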

Andrew
