On Tue, Jan 01, 2013 at 11:18:29PM +0100, Ondřej Bílka wrote:
> On Tue, Jan 01, 2013 at 09:12:07PM +0100, Loup Vaillant-David wrote:
> >
> >   void latin1_to_utf8(std::string & s);
> >
> Let me guess. They do it to save cycles caused by allocation of new
> string.
> > instead of
> >
> >   std::string utf8_of_latin1(std::string s)
> > or
> >   std::string utf8_of_latin1(const std::string & s)

You may have guessed right.  But then, *they* guessed wrong.

First, the program in which I saw this conversion routine is dead slow
anyway.  If they really cared about the performance of a few encoding
conversions, they should have started by unifying string handling to
begin with (there are 6 string types in the program, all actively
used, and sometimes converted back and forth).

Second, every time the conversion does actually do anything, the utf8
string will be longer than the original one, and require a realloc()
anyway (unless they wrote some very clever code, but the overall
quality of their monstrosity makes it unlikely).

Finally, I often needed to write this:

  std::string temp = compute_text();
  latin1_to_utf8(temp);
  call_function(temp);

Which does not reduce allocations in the slightest, compared to

  call_function(utf8_of_latin1(compute_text()));

My version may even be a bit more amenable to optimisation by the
compiler.  (In addition to being more readable, I dare say.)

So, they *may* have made this move because they cared about
performance.  A more likely explanation, though, is that they simply
thought "oh, I need to convert some strings to utf8", and
transliterated that into C++.  They could have thought "oh, I need
utf8 versions of some strings" instead, but that would be functional
thinking.

Loup.
_______________________________________________
fonc mailing list
http://vpri.org/mailman/listinfo/fonc
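
For illustration, here is a minimal sketch of what the value-returning
version discussed above could look like.  The post only gives the two
signatures, so the function bodies are an assumption, not the code from
the program in question.  The sketch assumes the input is ISO-8859-1,
where every byte at or above 0x80 becomes two bytes in UTF-8, which is
why a non-trivial conversion always grows the string.

```cpp
#include <cassert>
#include <string>

// Hypothetical body: the post shows only the signature.
// Returns a UTF-8 encoded copy of a Latin-1 (ISO-8859-1) string.
std::string utf8_of_latin1(const std::string& s)
{
    std::string out;
    out.reserve(s.size() * 2);  // worst case: every byte becomes two bytes
    for (unsigned char c : s) {
        if (c < 0x80) {
            // ASCII range: encoded identically in UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two-byte sequence 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// The in-place signature from the post, written here (as an assumption)
// on top of the value-returning one.  It still builds a new buffer.
void latin1_to_utf8(std::string& s)
{
    s = utf8_of_latin1(s);
}
```

With this shape, the call site from the message stays a single
expression, call_function(utf8_of_latin1(compute_text())), and the
in-place variant saves no allocation, since the converted text needs a
fresh buffer whenever it grows.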
