Re: Review of Jesse Phillips's CSV Parser

Robert Jacques Fri, 11 Nov 2011 21:25:47 -0800

On Thu, 10 Nov 2011 22:17:10 -0500, Jesse Phillips <jessekphillip...@gmail.com> 
wrote:

On Mon, 07 Nov 2011 23:28:20 -0500, Robert Jacques wrote:

On Sun, 06 Nov 2011 19:00:42 -0500, Jesse Phillips
<jessekphillip...@gmail.com> wrote:

[snip]

I also have some API design concerns. Primarily, most of the type names
are overly broad and unclear. For example: Record, Records, Malformed
could just as easily be part of a database API. And because of this, not
only does one reading the code not know what the object is, but if
namespace conflicts can occur, then code reviewers will probably make
the wrong assumption half the time as to what Record is and not bother
looking it up.


The reason it is name like what you would find in a database, is because
they are the same thing. It is a row of data. In past discussion about
existing name clashes it was concluded that was what modules, static, and
named imports wore for.

Malformed, I just don't have any clue what to actually call it. No
CSVMalformed doesn't help.

But then you go on to the point below, that you shouldn't even see them.

Furthermore, Records and Record, are essentially internal ranges/structs
returned by csvReader. In the tradition of std.algorithms these should
be defined inside the csvReader function (preferably) or as CsvReader /
CsvReader.Record.


I was going to explain how wrong you are and how I tried having all the
options part of csvReader but it just muddies up the clean API when
customizing something. But actually I already found the solution with my
previous updates and incorrectly chose to move Malformed customization
out.

Long story short the problems I see:

* The documentation still needs to exist and will be a little harder to
do without these
* Records contains the heading field, this is abnormal for an InputRange
and means it won't show as an auto link if it becomes a hidden range.
* Currently you can find exactly which function call result in a specific
error, if I hide the ranges you don't.

I'm expecting that introducing CSVFormatter will require another review,
at that time I can try making two forms of documentation to see which is
preferred.

Thank you! Very useful food for thought.


Modules, static and named imports are extremely useful tools for making 
disparate third party libraries work together. But Phobos is one (big) library, 
not a bunch of separate ones. We shouldn't ever need to use these tools to 
resolve Phobos name conflicts. In fact one of the (many) goals of D is to scale 
down to quick and dirty scripts. rdmd helps with this a lot, but the other 
piece of this puzzle is 'import std.all;' which doesn't work today. And while I 
don't think we should break existing code just to fix this problem, going 
forward, as we update and add libraries to Phobos, we need to strive to prevent 
name collisions. And, like std.file and std.stdio, I'd expect std.csv and 
std.databases to be used together often. Furthermore, your primary case for 
exposing these structs is for documentation purposes. As the user isn't 
expected to actually instantiate these structs (or at least not often), naming 
them CsvReader and CsvReader.Row is perfectly acceptable. Lastly, I'd li!

ke todraw your attention to std.algorithm.findSplit. findSplit has to return three ranges, which it does so using a Tuple. Records is simply a manual tuple of the headings and the row ranges, so why not leverage Phobos for this instead?

Re: Review of Jesse Phillips's CSV Parser

Reply via email to