There is now the following commit:

https://github.com/svenvc/NeoCSV/commit/0acc2270b382f52533c478f2f1585341e390d4b5

which should address a couple of issues.

> On 22 Jan 2021, at 12:15, jtuc...@objektfabrik.de wrote:
> 
> Tim,
> 
> 
> 
> 
> On 22.01.21 at 10:22, Tim Mackinnon wrote:
>> I’m not doing any CSV processing at the moment, but I have in the past, so 
>> I was interested in this thread.
>> 
>> @Kasper, can’t you just use #readHeader up front, do the assertion 
>> yourself, and then proceed to loop through your records? It would seem 
>> that NeoCSV caters for what you are suggesting - and if you want to add a 
>> helper method extension, you already have the building blocks to do this.
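>> 
>> Something along these lines should do it (an untested sketch off the top 
>> of my head, assuming #readHeader answers the column names of the first 
>> line):
>> 
>> | data reader header |
>> data := 'name,age', String cr, 'Alice,42', String cr, 'Bob,7'.
>> reader := NeoCSVReader on: data readStream.
>> header := reader readHeader.
>> header asArray = #('name' 'age')
>>     ifFalse: [ Error signal: 'Unexpected columns: ', header printString ].
>> reader upToEnd.   "remaining records as arrays of field strings"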
>> 
> This is a good idea. One caveat, however: #readHeader in its current 
> implementation does two things: 
> 
>       • it reads the line field by field (thereby respecting line breaks 
> within quoted fields - perfect for this purpose)
>       • it updates the number of columns for further reading (assuming 
> #readHeader's purpose is to interpret the header line) 
> This second thing is in our way, because it may influence the way the 
> following lines will be interpreted. That is exactly why I created an issue 
> on GitHub (https://github.com/svenvc/NeoCSV/issues/20). 
> A method that reads a line without any side effects (other than pushing the 
> position pointer forward to the next line) would come in handy for such 
> scenarios. But you can always argue that this has nothing to do with CSV, 
> because in CSV all lines have the same number of columns, each of them 
> containing the same kind of information, and there is at most one header 
> line. Anything else is just some file that happens to contain CSV-like 
> stuff. So I am really not sure whether NeoCSV should build lots of 
> machinery for such files. I'd love to have this, but I'd understand if 
> Sven refused to integrate it... ;-)
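> 
> To make the caveat concrete: a validating helper is easy to write as an 
> extension on top of #readHeader (untested sketch, the selector 
> #readHeaderEnsuring: is just something I made up here), but it still 
> inherits #readHeader's side effect of fixing the column count:
> 
> NeoCSVReader >> readHeaderEnsuring: expectedColumnNames
>     "Read the header line and fail early when the column names differ 
>     from what the caller expects. Answer the header that was read."
>     | header |
>     header := self readHeader.
>     header asArray = expectedColumnNames asArray
>         ifFalse: [ Error signal: 'Unexpected CSV header: ', header printString ].
>     ^ header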
> 
> 
>> The only flaw I can think of is when there is no header present - I can’t 
>> recall what NeoCSV does then. Ideally it throws an exception so you can 
>> decide what to do: potentially continue if the number of columns is what 
>> you expect and the data matches the columns, or fail with an error that a 
>> header is required. But I think you would always need to do some basic 
>> initial checks when processing CSV, due to the nature of the format?
> Right. You'd always have to write some specific logic for this particular 
> file format and make NeoCSV ignore the right stuff...
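> 
> To illustrate, the specific logic for a file that may or may not start 
> with a header could look roughly like this (untested sketch; input stands 
> for the CSV contents as a String, and no field converters are configured, 
> so records come back as arrays of strings):
> 
> | expected reader firstLine records |
> expected := #('name' 'age').
> reader := NeoCSVReader on: input readStream.
> firstLine := reader readHeader.    "consumes the first line, header or not"
> records := reader upToEnd asOrderedCollection.
> firstLine asArray = expected
>     ifFalse: [ firstLine size = expected size
>         ifTrue: [ records addFirst: firstLine asArray ]    "no header: keep it as data"
>         ifFalse: [ Error signal: 'Unexpected first line: ', firstLine printString ] ].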
> 
> 
> 
> Joachim
> 
> 
> 
> 
> 
>> 
>> Tim
>> 
>> On Fri, 22 Jan 2021, at 6:42 AM, Kasper Osterbye wrote:
>>> As it happened, I ran into the exact same scenario as Joachim just the 
>>> other day,
>>> that is, the external provider of my csv had added some new columns. In my 
>>> case
>>> manifested itself in an error that an integer field was not an integer 
>>> (because new
>>> columns were added in the middle).
>>> 
>>> Reading through this whole thread leaves me with the feeling that no 
>>> matter what Sven adds, there is still a risk of error. Nevertheless, my 
>>> suggestion would be to add functionality to #skipHeaders, or to add a 
>>> sister method #assertAndSkipHeaders: numberOfColumns onFailDo: aBlock, 
>>> where the block is given the actual number of headers. 
>>> That would give me a way to handle the error up front. 
>>> 
>>> This will only be interesting if your data has headers, of course.
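>>> 
>>> As a sketch (untested, written on top of the existing #readHeader), the 
>>> sister method could look like this:
>>> 
>>> NeoCSVReader >> assertAndSkipHeaders: numberOfColumns onFailDo: aBlock
>>>     "Skip the header line, but first check that it holds the expected 
>>>     number of columns; on a mismatch, evaluate aBlock with the actual 
>>>     count."
>>>     | header |
>>>     header := self readHeader.
>>>     header size = numberOfColumns
>>>         ifFalse: [ ^ aBlock value: header size ].
>>>     ^ header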
>>> 
>>> Thanks for NeoCSV which I use all the time!
>>> 
>>> Best,
>>> 
>>> Kasper 
>> 
> 
> 
> -- 
> -----------------------------------------------------------------------
> Objektfabrik Joachim Tuchel          mailto:jtuc...@objektfabrik.de
> Fliederweg 1                         http://www.objektfabrik.de
> D-71640 Ludwigsburg                  http://joachimtuchel.wordpress.com
> 
> Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1
> 
> 
> 
> 
