There is now the following commit: https://github.com/svenvc/NeoCSV/commit/0acc2270b382f52533c478f2f1585341e390d4b5
which should address a couple of issues.

> On 22 Jan 2021, at 12:15, jtuc...@objektfabrik.de wrote:
>
> Tim,
>
> On 22.01.21 at 10:22, Tim Mackinnon wrote:
>> I’m not doing any CSV processing at the moment, but have in the past - so
>> I was interested in this thread.
>>
>> @Kasper, can’t you just use #readHeader upfront, do the assertion
>> yourself, and then proceed to loop through your records? It would seem that
>> Neo caters for what you are suggesting - and if you want to add a helper
>> method extension, you already have the building blocks to do this?
>>
> This is a good idea. One caveat, however: #readHeader in its current
> implementation does 2 things:
>
> • read the line respecting each field (thereby respecting line breaks
>   within quoted fields - perfect for this purpose)
> • update the number of columns for further reading (assuming
>   #readHeader's purpose is to interpret the header line)
>
> This second thing is in our way, because it may influence the way the
> following lines will be interpreted. That is exactly why I created an issue
> on GitHub (https://github.com/svenvc/NeoCSV/issues/20).
> A method that reads a line without any side effects (other than pushing the
> position pointer forward to the next line) would come in handy for such
> scenarios. But you can always argue that this has nothing to do with CSV,
> because in CSV all lines have the same number of columns, each of them
> containing the same kind of information, and there may be exactly one header
> line. Anything else is just some file that may contain CSV-y stuff in it. So
> I am really not sure if NeoCSV should build lots of stuff for such files. I'd
> love to have this, but I'd understand if Sven refused to integrate it.... ;-)
>
>> The only flaw I can think of is if there is no header present; then I can’t
>> recall what Neo does - ideally it throws an exception so you can decide what
>> to do - potentially continue if the number of columns is what you expect and
>> the data matches the columns - or you fail with an error that a header is
>> required. But I think you would always need to do some basic initial checks
>> when processing CSV due to the nature of the format?
>
> Right. You'd always have to write some specific logic for this particular
> file format and make NeoCSV ignore the right stuff...
>
> Joachim
>
>> Tim
>>
>> On Fri, 22 Jan 2021, at 6:42 AM, Kasper Osterbye wrote:
>>> As it happened, I ran into the exact same scenario as Joachim just the
>>> other day, that is, the external provider of my CSV had added some new
>>> columns. In my case it manifested itself in an error that an integer field
>>> was not an integer (because new columns were added in the middle).
>>>
>>> Reading through this whole thread leaves me with the feeling that no matter
>>> what Sven adds, there is still a risk for error. Nevertheless, my
>>> suggestion would be to add a functionality to #skipHeaders, or make a
>>> sister method:
>>> #assertAndSkipHeaders: numberOfColumns onFailDo: aBlock
>>> where aBlock is given the actual number of headers.
>>> That would give me a way to handle the error up front.
>>>
>>> This will only be interesting if your data has headers, of course.
>>>
>>> Thanks for NeoCSV which I use all the time!
>>>
>>> Best,
>>>
>>> Kasper
>
> --
> -----------------------------------------------------------------------
> Objektfabrik Joachim Tuchel          mailto:jtuc...@objektfabrik.de
> Fliederweg 1                         http://www.objektfabrik.de
> D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
> Telephone: +49 7141 56 10 86 0       Fax: +49 7141 56 10 86 1
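
For illustration, the approach Tim suggests in the quoted thread (read the header up front, check it yourself, then loop over the records) might look roughly like the following in a Pharo playground. This is only a minimal sketch, not code from the commit: the sample data and the `expected` column names are made up for the example, and it assumes that #readHeader answers the header field names as a collection of strings. As Joachim points out, #readHeader also sets the number of columns used for the lines that follow, so the sketch only fits files where the header and the data lines have the same column count.

```smalltalk
| expected reader header |

"Hypothetical column names; replace with what your provider promises."
expected := #('id' 'name' 'amount').

"Sample data made up for this example."
reader := NeoCSVReader on: 'id,name,amount
1,foo,10
2,bar,20' readStream.

"Read the header line. Per the thread, this also fixes the column count
 used when reading the remaining lines."
header := reader readHeader.

"Fail early if the provider changed the layout."
header asArray = expected
	ifFalse: [ Error signal: 'Unexpected CSV header: ' , header printString ].

"Header is as expected; process the remaining records."
reader do: [ :record |
	Transcript show: record printString; cr ]
```

Checking the header names rather than just the column count would also catch the case Kasper describes, where new columns are inserted in the middle.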