On Monday, 23 April 2012 at 23:52:41 UTC, bearophile wrote:
James Miller:

I realised that when you want the number of characters, you normally actually want to use walkLength, not length.

As with strlen() in C, unfortunately the result of walkLength(somestring) is recomputed every time you call it, because it isn't cached. A partial improvement would be to ensure that walkLength(somestring) is strongly pure, and that the D compiler is able to move this invariant pure computation out of loops.
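
Until the compiler can do that hoisting itself, the same effect can be had by hand. The following is only a minimal sketch: the function names slowCount and fastCount are hypothetical and exist purely for illustration. It contrasts calling walkLength in the loop condition with calling it once and storing the result.

```d
import std.range : walkLength;

// Hypothetical helper: walkLength is re-evaluated on every iteration,
// so each check of the loop condition walks the whole string again.
size_t slowCount(string s)
{
    size_t iterations = 0;
    for (size_t i = 0; i < s.walkLength; ++i)
        ++iterations;
    return iterations;
}

// Hypothetical helper: walk the string once, then reuse the stored value.
size_t fastCount(string s)
{
    immutable n = s.walkLength; // single O(n) pass
    size_t iterations = 0;
    for (size_t i = 0; i < n; ++i)
        ++iterations;
    return iterations;
}

void main()
{
    string s = "na\u00EFve"; // U+00EF takes two UTF-8 code units
    assert(slowCount(s) == fastCount(s));
}
```

Both functions return the same count; the difference is only in how often the string is traversed.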


Is it reasonable for the compiler to pick this up during semantic analysis and point out this situation?

This is not easy to do, because sometimes you want to know the number of code points, and sometimes of code units. I remember even a proposal to rename the "length" field to another name for narrow strings, to avoid such bugs.
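
For anyone following along, here is a small sketch of the difference being discussed, assuming a UTF-8 string with one non-ASCII character. The example string is made up for illustration: length counts code units, while walkLength counts code points.

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    // "héllo", with U+00E9 written as an escape so the encoding is unambiguous.
    string s = "h\u00E9llo";

    // .length is the number of UTF-8 code units (bytes for a string).
    assert(s.length == 6);

    // walkLength iterates the string as a range of code points.
    assert(s.walkLength == 5);

    writeln(s.length, " code units, ", s.walkLength, " code points");
}
```

Either number can be the right one: slicing and indexing work in code units, while counting characters for display usually means code points (or even graphemes, which this sketch does not cover).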

I was thinking about that. This is quite a vague suggestion, more just throwing the idea out there and seeing what people think. I am aware of the issue of walkLength being computed every time, rather than being a constant-time lookup. One option would be to make it only a warning in @safe code, so the worst-case scenario is that you mark the function as @trusted. I feel this fits in with the idea of @safe quite well, since you have to explicitly tell the compiler that you know what you're doing.

Another option would be to have some sort of general lint tool that picks up on these kinds of potential errors, though that has a much bigger scope...

--
James Miller
