Re: tolf and detab

Andrei Alexandrescu Sun, 08 Aug 2010 20:20:27 -0700

On 08/08/2010 02:32 PM, bearophile wrote:

Walter Bright:

bearophile wrote:

In the D code I have added an idup to make the comparison more fair, because
in the Python code the "line" is a true newly allocated line, which you can
safely use as a dictionary key.


So it is with byLine, too. You've burdened D with double the amount of 
allocations.


I think you are wrong on two counts:

1) byLine() doesn't return a newly allocated line; you can see this with this
small program:

import std.stdio: File, writeln;

void main(string[] args) {
     char[][] lines;
     auto file = File(args[1]);
     foreach (rawLine; file.byLine()) {
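         // Print the address of the buffer backing the current line.
         writeln(rawLine.ptr);
         // Stores a slice of that buffer, not a copy of the line.
         lines ~= rawLine;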
     }
     file.close();
}


Its output shows that all "strings" (char[]) share the same pointer:

14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
...


2) You can't use the result of byLine() as a string key for an associative
array, as I have said you can in Python. Currently you can, but according
to Andrei this is a bug. And if it's not a bug, then I'll reopen the closed bug
4474:

http://d.puremagic.com/issues/show_bug.cgi?id=4474

Also, I object in general to this method of making things "more fair". Using a
less efficient approach in X because Y cannot use such an approach is not a
legitimate comparison.


I generally agree, but this is not the case here.
In some situations you indeed don't need a newly allocated string for each
iteration, because for example you just want to read the lines and process
them, not change or store them. You can't do this in Python, but this is not
what I want to test. As I have explained in bug 4474, this behaviour is useful,
but it is acceptable only if explicitly requested by the programmer, not as the
default. The language is safe, as Andrei explains there, because you are
supposed to idup the char[] to use it as a key for an associative array. (If
your associative array is declared as int[char[]], then it can accept such a
rawLine as a key, but you can clearly see those aren't strings. This is why I
have closed bug 4474.)

Bye,
bearophile

I think at the end of the day, regardless of the relative possibilities of file reading in the two languages, we should be faster than Python when allocating one new string per line.


Andrei
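
For reference, the Python behaviour that the thread compares against can be
sketched as follows. This is only an illustration under stated assumptions:
io.StringIO stands in for a real file, and the names count_lines, sample,
counts and kept are hypothetical, not taken from the benchmark being discussed.
Each iteration yields an immutable str, so the line can be stored or used as a
dictionary key without an explicit copy.

```python
import io


def count_lines(stream):
    """Count how often each line occurs, using the lines themselves as keys.

    Hypothetical helper for illustration only; not the benchmark code
    from the thread.
    """
    counts = {}
    kept = []
    for line in stream:
        # Strip the trailing newline; the result is an immutable str.
        line = line.rstrip("\n")
        # Safe to keep: later iterations cannot overwrite this string.
        kept.append(line)
        # Safe to use as a dictionary key for the same reason.
        counts[line] = counts.get(line, 0) + 1
    return counts, kept


# Assumption: an in-memory stream stands in for a file opened from disk.
sample = io.StringIO("alpha\nbeta\nalpha\n")
counts, kept = count_lines(sample)
print(counts)  # {'alpha': 2, 'beta': 1}
print(kept)    # ['alpha', 'beta', 'alpha']
```

The D program quoted above would need rawLine.idup at the two commented
points to get the same guarantee, which is the extra allocation per line that
the thread is arguing about.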
