Kasper,

I think this is somewhat close to another thing I am describing here: https://github.com/svenvc/NeoCSV/issues/20

The problem with extending NeoCSV endlessly is that some of the things we need for "real-life" CSV files exist only because those files are not CSV files at all. So it is hard to tell whether it is a good idea to litter NeoCSV with methods for edge cases that stem from the fact that parts of our files are not CSV at all...

I somehow have the feeling that part of what we need is a subclass of Stream that knows about constraints like "a line break only ends a record if it is not part of a quoted field" (a rough sketch of the idea follows below). So I am somewhat torn between wanting that stuff in NeoCSV and not wanting to mix CSV parsing with the handling of stupid ideas people have when exporting data to some CSV-like file.
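
Purely to illustrate the idea (the class and selector names are invented, this is not NeoCSV code): a small wrapper - rather than a subclass - around a read stream that answers one logical record per call could look roughly like this:

    Object subclass: #CsvRecordStream
        instanceVariableNames: 'stream'
        classVariableNames: ''
        package: 'CSV-Experiments'

    CsvRecordStream class >> on: aReadStream
        ^ self new setStream: aReadStream

    CsvRecordStream >> setStream: aReadStream
        stream := aReadStream

    CsvRecordStream >> atEnd
        ^ stream atEnd

    CsvRecordStream >> nextRecord
        "Answer the next logical record as a String, or nil at end.
        A line break only ends the record when we are outside quotes.
        Escaped quotes (two quote characters in a row) toggle the flag
        twice, so the net state stays correct."
        | out inQuotes char |
        stream atEnd ifTrue: [ ^ nil ].
        out := WriteStream on: String new.
        inQuotes := false.
        [ stream atEnd
            or: [ inQuotes not
                and: [ stream peek = Character cr or: [ stream peek = Character lf ] ] ] ]
            whileFalse: [
                char := stream next.
                char = $" ifTrue: [ inQuotes := inQuotes not ].
                out nextPut: char ].
        "consume a single CR, LF or CRLF line ending"
        stream atEnd ifFalse: [
            (stream next = Character cr
                and: [ stream atEnd not and: [ stream peek = Character lf ] ])
                ifTrue: [ stream next ] ].
        ^ out contents

NeoCSV's parser itself already knows this rule, of course; the point of such a wrapper would be to apply it while skipping or counting lines outside of the parser.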

Maybe it is a good idea to collect a few concepts that have been mentioned in these threads:

 * Sometimes we want to skip lines (header, footer) without
   interpreting their contents and without any side effects (not
   CSV-compliant; see the sketch after this list)
 * Sometimes we want to ignore "additional data" after the end of a
   defined number of columns (not CSV-compliant)
 * Sometimes we need to know which line/column couldn't be parsed
   correctly (relevant for both CSV and non-CSV).
   A plus would be if we could add the column name to the error
   message, as in "The column 'amount' in line 34 cannot be
   interpreted as a monetary amount" - but this is surely quite some
   work!
 * Sometimes we want to interpret columns by the column names in the
   header line (which may or may not be the first line of the file;
   only the former is CSV-compliant)
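
Regarding the first point: pre-filtering before NeoCSV ever sees the stream may already be enough. A minimal sketch, assuming a Pharo ReadStream over the file contents and a file with exactly two junk lines before the real header (the line count and the sample data are of course made up):

    | raw reader |
    raw := 'report generated 2021-01-22
some free-form disclaimer
name,amount
"Miller, John",42' readStream.
    "skip the two non-CSV preamble lines before handing over to NeoCSV"
    2 timesRepeat: [ raw nextLine ].
    reader := NeoCSVReader on: raw.
    reader skipHeader.
    reader upToEnd. "an Array with the single record #('Miller, John' '42')"

Footer lines would need similar special-casing, e.g. reading records with #next and stopping before the known junk at the end.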


Of course none of this means I am not a fan of NeoCSV. It is well written, works very well for "real" CSV, and performs very well for my use cases. Most of the things we are talking about here are problems that arise when a CSV file is not really a CSV file...


Joachim




On 22.01.21 at 07:42, Kasper Osterbye wrote:
As it happened, I ran into the exact same scenario as Joachim just the other day: the external provider of my CSV had added some new columns. In my case this manifested itself as an error saying that an integer field was not an integer (because the new
columns were added in the middle).

Reading through this whole thread leaves me with the feeling that no matter what Sven adds, there is still a risk of error. Nevertheless, my suggestion would be to add this
functionality to #skipHeaders, or to add a sister method:
#assertAndSkipHeaders: numberOfColumns onFailDo: aBlock
where aBlock is passed the actual number of headers found.
That would give me a way to handle the error up front.

This is only interesting if your data has headers, of course.
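
Just to make the proposal concrete, a rough sketch of what that sister method could look like as an extension of NeoCSVReader (the selector and its use of #next to fetch the raw header record are my assumptions, not existing NeoCSV API):

    assertAndSkipHeaders: numberOfColumns onFailDo: aBlock
        "Read the header record; if its size does not match
        numberOfColumns, evaluate aBlock with the actual count so the
        mismatch can be handled up front instead of failing mid-parse."
        | headers |
        headers := self next. "no field definitions yet, so this answers the raw header strings"
        headers size = numberOfColumns
            ifFalse: [ ^ aBlock value: headers size ].
        ^ headers

which a client could then call like

    reader := NeoCSVReader on: input readStream.
    reader
        assertAndSkipHeaders: 12
        onFailDo: [ :actual |
            Error signal: 'Expected 12 columns, found ' , actual printString ].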

Thanks for NeoCSV, which I use all the time!

Best,

Kasper


--
-----------------------------------------------------------------------
Objektfabrik Joachim Tuchel          mailto:jtuc...@objektfabrik.de
Fliederweg 1                         http://www.objektfabrik.de
D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1

