On 08/08/2010 02:32 PM, bearophile wrote:
Walter Bright:
bearophile wrote:
In the D code I have added an idup to make the comparison more fair, because
in the Python code the "line" is a true newly allocated line, you can safely
use it as dictionary key.
So it is with byLine, too. You've burdened D with double the amount of
allocations.
I think you are wrong on two counts:
1) byLine() doesn't return a newly allocated line; you can see it with this
small program:
import std.stdio: File, writeln;

void main(string[] args) {
    char[][] lines;
    auto file = File(args[1]);
    foreach (rawLine; file.byLine()) {
        writeln(rawLine.ptr);
        lines ~= rawLine;
    }
    file.close();
}
Its output shows that all "strings" (char[]) share the same pointer:
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
14E5E00
...
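For contrast, here is a minimal sketch of the variant that copies each line before storing it, so the stored lines no longer depend on the buffer that byLine() reuses. The helper name readLinesCopied and the scratch file byline_demo.txt are hypothetical, introduced only for this illustration; they are not part of the original test program.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

// Hypothetical helper: reads every line of `path`, copying each one.
string[] readLinesCopied(string path)
{
    string[] lines;
    auto file = File(path);
    foreach (rawLine; file.byLine())
    {
        // rawLine is a char[] slice of a buffer that byLine() may reuse,
        // so take an immutable copy before storing it.
        lines ~= rawLine.idup;
    }
    file.close();
    return lines;
}

void main()
{
    // Hypothetical scratch file, created just for this demonstration.
    auto path = buildPath(tempDir(), "byline_demo.txt");
    write(path, "aaa\nbbb\nccc\n");
    scope (exit) remove(path);

    writeln(readLinesCopied(path));
}
```

The cost is one allocation per line, which is the same work the Python loop does for each line it yields.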
2) You can't use the result of byLine() as a string key for an associative
array, as I have said you can in Python. Currently you can, but according
to Andrei this is a bug. And if it's not a bug then I'll reopen this closed bug
4474:
http://d.puremagic.com/issues/show_bug.cgi?id=4474
Also, I object in general to this method of making things "more fair". Using a
less efficient approach in X because Y cannot use such an approach is not a
legitimate comparison.
I generally agree, but that is not the case here.
In some situations you indeed don't need a newly allocated string for each
iteration, because for example you just want to read and process the lines, not
change or store them. You can't do this in Python, but this is not what I want to
test. As I have explained in bug 4474 this behaviour is useful, but it is
acceptable only if explicitly requested by the programmer, and not as the
default. The language is safe, as Andrei explains there, because you are supposed
to idup the char[] to use it as a key for an associative array. (If your
associative array is declared as int[char[]] then it can accept such rawLine
values as keys, but you can clearly see those aren't strings. This is why I have
closed bug 4474.)
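The idup-before-keying pattern described above might look like the following sketch. It assumes an int[string] table that counts how often each distinct line occurs; the helper name countLines and the scratch file name are hypothetical and serve only as an illustration.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

// Hypothetical helper: counts how often each distinct line occurs.
int[string] countLines(string path)
{
    int[string] counts;
    auto file = File(path);
    foreach (rawLine; file.byLine())
    {
        // Copy the reused char[] buffer into an immutable string,
        // which is then safe to keep as an associative array key.
        string key = rawLine.idup;
        counts[key] = counts.get(key, 0) + 1;
    }
    file.close();
    return counts;
}

void main()
{
    // Hypothetical scratch file, created just for this demonstration.
    auto path = buildPath(tempDir(), "byline_count_demo.txt");
    write(path, "x\ny\nx\n");
    scope (exit) remove(path);

    writeln(countLines(path));
}
```

Here every line is copied, including repeated ones; whether that extra allocation matters is exactly what the benchmark is meant to measure.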
Bye,
bearophile
I think at the end of the day, regardless of the relative possibilities of
file reading in the two languages, we should be faster than Python when
allocating one new string per line.
Andrei