Re: Why does std.string.splitLines return an array?

2012-10-22 Thread Andrei Alexandrescu

On 10/22/12 1:05 AM, Chad J wrote:
> On 10/21/2012 06:35 PM, Jonathan M Davis wrote:
> > On Sun, 2012-10-21 at 18:00 -0400, Chad J wrote:
> > > std.string.splitLines returns an array, which is pretty grody. Why not
> > > return a lazily-evaluated range struct so that we can avoid allocations
> > > on this simple but common operation?
> >
> > If you want a lazy range, then use std.algorithm.splitter. std.string
> > operates on and returns strings, not general ranges.
> >
> > - Jonathan M Davis
>
> std.algorithm.splitter is simply not acceptable for this. It doesn't
> have this kind of logic:
>
> bool matchLineEnd( string text, size_t pos )
> {
>     if ( pos+1 < text.length
>       && text[pos] == '\r'
>       && text[pos+1] == '\n' )
>         return true;
>     else if ( pos < text.length
>       && (text[pos] == '\r' || text[pos] == '\n') )
>         return true;
>     else
>         return false;
> }


Agreed. We should add splitter() accepting only one argument of some 
string type. It would use the line splitting logic above.


Could you please adapt your code to do this and package it in a pull 
request? Thanks!
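
A minimal sketch of what such a one-argument, lazy line splitter could look like. The names `lineEndLength`, `LazyLines` and `lazyLines` are hypothetical and not part of Phobos; `lineEndLength` is an adaptation of the quoted matchLineEnd that returns the terminator length instead of a bool. Like that logic, it only recognizes '\r', '\n' and "\r\n", and it assumes a trailing terminator should not produce an extra empty line.

```d
import std.algorithm : equal;
import std.range : isForwardRange;

/// Hypothetical helper adapted from matchLineEnd: returns the length of the
/// line terminator starting at `pos` (2 for "\r\n", 1 for '\r' or '\n'),
/// or 0 if there is no terminator at that position.
size_t lineEndLength(string text, size_t pos)
{
    if (pos + 1 < text.length && text[pos] == '\r' && text[pos + 1] == '\n')
        return 2;
    if (pos < text.length && (text[pos] == '\r' || text[pos] == '\n'))
        return 1;
    return 0;
}

/// Hypothetical forward range that yields one line at a time as a slice of
/// the input, so no array of lines is ever allocated.
struct LazyLines
{
    private string rest;     // input not yet consumed, starting at the current line
    private size_t lineLen;  // length of the current line, terminator excluded
    private size_t termLen;  // length of the terminator that follows it
    private bool hasLine;    // false once the input is exhausted

    this(string s) { rest = s; prime(); }

    @property bool empty() { return !hasLine; }
    @property string front() { assert(hasLine); return rest[0 .. lineLen]; }
    @property LazyLines save() { return this; }

    void popFront()
    {
        assert(hasLine);
        rest = rest[lineLen + termLen .. $];
        prime();
    }

    // Locate the end of the line that starts at rest[0].
    private void prime()
    {
        hasLine = rest.length > 0;
        if (!hasLine) return;
        size_t i = 0;
        // '\r' and '\n' are ASCII, so scanning UTF-8 code units is safe.
        while (i < rest.length && lineEndLength(rest, i) == 0) ++i;
        lineLen = i;
        termLen = lineEndLength(rest, i);
    }
}

/// Hypothetical convenience constructor.
LazyLines lazyLines(string s) { return LazyLines(s); }

static assert(isForwardRange!LazyLines);

unittest
{
    assert(equal(lazyLines("a\r\nb\nc\rd"), ["a", "b", "c", "d"]));
    assert(lazyLines("").empty);
}
```

Because `front` only slices the input, a caller that still wants an array can pass the range to std.array.array.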



Andrei


Re: Why does std.string.splitLines return an array?

2012-10-21 Thread Chad J

On 10/21/2012 06:35 PM, Jonathan M Davis wrote:
> On Sun, 2012-10-21 at 18:00 -0400, Chad J wrote:
> > std.string.splitLines returns an array, which is pretty grody.  Why not
> > return a lazily-evaluated range struct so that we can avoid allocations
> > on this simple but common operation?
>
> If you want a lazy range, then use std.algorithm.splitter. std.string
> operates on and returns strings, not general ranges.
>
> - Jonathan M Davis



std.algorithm.splitter is simply not acceptable for this.  It doesn't 
have this kind of logic:


bool matchLineEnd( string text, size_t pos )
{
    if ( pos+1 < text.length
      && text[pos] == '\r'
      && text[pos+1] == '\n' )
        return true;
    else if ( pos < text.length
      && (text[pos] == '\r' || text[pos] == '\n') )
        return true;
    else
        return false;
}

I've never managed to use std.algorithm.splitter for line splitting,
despite trying.  It has always been more effective to write my own.
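
To make the gap concrete, here is a small check, assuming the obvious attempt of splitting on '\n' with std.algorithm.splitter. It is only an illustration of the behaviour described above, not a claim about every way splitter could be used.

```d
import std.algorithm : equal, splitter;

unittest
{
    // Windows, Unix and old Mac OS line endings mixed in one string.
    string text = "one\r\ntwo\nthree\rfour";

    // Splitting on '\n' alone leaves the '\r' of "\r\n" attached to "one"
    // and does not split at the lone '\r' at all.
    assert(equal(text.splitter('\n'), ["one\r", "two", "three\rfour"]));
}
```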


I'm with bearophile on this one:
http://d.puremagic.com/issues/show_bug.cgi?id=4764

I think his suggestions about naming also just make *sense*.  I'm not 
sure how practical some of those naming changes would be if there is a 
lot of D2 code in the wild that uses the current weirdly-named stuff that 
emphasizes eager evaluation and extraneous allocations.  I'm not sure 
how necessary it is to even /have/ functions that return arrays when 
there are lazy versions: the result of a lazy function can always be fed 
to std.array.array(range).  Heh, even parentheses nesting is nicely 
handled by UFCS now.
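
A short sketch of that last point, assuming a plain split on '\n' is good enough for the input (it does not handle '\r'). The function name `strippedLines` is hypothetical; the pipeline stays lazy until std.array.array materializes it at the end.

```d
import std.algorithm : map, splitter;
import std.array : array;
import std.string : strip;

/// Hypothetical helper: split lazily on '\n', strip each piece, and only
/// allocate an array at the very end.
string[] strippedLines(string text)
{
    // Without UFCS this would read inside-out:
    //     array(map!(a => strip(a))(splitter(text, '\n')))
    return text.splitter('\n').map!(a => a.strip()).array;
}

unittest
{
    assert(strippedLines(" a \n b ") == ["a", "b"]);
}
```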




Re: Why does std.string.splitLines return an array?

2012-10-21 Thread Jonathan M Davis
On Sun, 2012-10-21 at 18:00 -0400, Chad J wrote:
> std.string.splitLines returns an array, which is pretty grody.  Why not 
> return a lazily-evaluated range struct so that we can avoid allocations 
> on this simple but common operation?

If you want a lazy range, then use std.algorithm.splitter. std.string
operates on and returns strings, not general ranges.

- Jonathan M Davis



Re: Why does std.string.splitLines return an array?

2012-10-21 Thread bearophile

Chad J:

> std.string.splitLines returns an array, which is pretty grody.
> Why not return a lazily-evaluated range struct so that we can
> avoid allocations on this simple but common operation?


splitLines is probably modeled on Python's str.splitlines() string 
method, which returns a list (array) of strings (because Python was 
originally eager). Phobos has both split() and splitter(), which are 
eager and lazy respectively. So maybe you want a 
splitterLines().
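
For reference, a small sketch of the eager/lazy pair as it exists in Phobos, assuming `split` from std.array and `splitter` from std.algorithm with a string separator:

```d
import std.algorithm : equal, splitter;
import std.array : split;

unittest
{
    string text = "a b c";

    // split() is eager: the whole array of pieces is allocated up front.
    string[] eager = text.split(" ");

    // splitter() is lazy: it returns a range struct, and each piece is
    // computed only when the range is advanced.
    auto lazyRange = text.splitter(" ");
    static assert(!is(typeof(lazyRange) == string[]));

    // Both produce the same sequence of pieces.
    assert(equal(lazyRange, eager));
}
```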


I have asked for a lazy splitLines, vote here:
http://d.puremagic.com/issues/show_bug.cgi?id=4764

But I have also suggested a different naming scheme:
http://d.puremagic.com/issues/show_bug.cgi?id=5838

See also:
http://d.puremagic.com/issues/show_bug.cgi?id=6730
http://d.puremagic.com/issues/show_bug.cgi?id=7689

And especially:
http://d.puremagic.com/issues/show_bug.cgi?id=8013

Bye,
bearophile


Why does std.string.splitLines return an array?

2012-10-21 Thread Chad J
std.string.splitLines returns an array, which is pretty grody.  Why not 
return a lazily-evaluated range struct so that we can avoid allocations 
on this simple but common operation?