Georg Baum wrote:
Am Freitag, 27. Oktober 2006 17:03 schrieb Abdelrazak Younes:
Hello,
This patch brings a 20% improvement when loading a big document (my test
document is the UserGuide copied&Pasted 3 times). This is done by
avoiding multiple string/vector/docstring copying.
I can think of many other optimisation but I would like this one to go
in first. IMHO, most of unicode can go and only one iconv_convert()
should stay.
Objections?
You will probably get some from Lars, but I have one of my own: I don't
like the replacement of iconv_t by int and the resulting casting. Please
revert that. The casting of -1 to iconv_t is safer. Do you really know
that int will always be compatible to iconv_t?
I am not changing the semantics at all. The casting happens in any case,
be it internally or externally to the iconv_convert function. I just
factorized the number of casting, that's all. The result is identical to
the old code.
Besides, this static variable is declared all other the place is not
used anywhere except within iconv_convert so I really don't see why we
keep it.
IMO an important bit is missing: Completing a conversion that is incomplete
because the buffer was too small.
Read more closely the code, I don't need a fixed size buffer any more. I
am writing the converted data right into the docstring output that has
already enough place.
We cannot always increase the bufffer if
that happens, so IMO we need to do that for 1.5.0, because users will
otherwise get silent data corruption.
As said above, we don't need to do that anymore with my version.
With that in mind I do not like that you make iconv_convert a part of the
public interface. It has an ugly interface that should be kept private.
Come one, it is only used in docstring. I can remove it from the
unicode.h and declare it as extern in docstring.C if you prefer.
It
should be possible to do an efficient but still clean design IMHO.
Be my guest then.
Abdel.
PS: I have problems with news gmane interface so I am maybe going to
break the thread by posting directly. Sorry about that.