OK, thanks for your reply. But I thought the method public void writeVInt(int i) has nothing to do with UTF-8; it only writes an int in variable-length form. Will it also be changed as part of the upcoming work on Unicode character writing?
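To make sure we are talking about the same thing, here is a minimal sketch of the variable-length int writing as I understand it, with a check of the cast I asked about. The class name VIntSketch and the helper castsAgree are made up for illustration, and a ByteArrayOutputStream stands in for the real output stream's writeByte, so please read it as an approximation and not the actual Lucene code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-alone sketch; not the actual Lucene class.
class VIntSketch {

    // Writes i as 1 to 5 bytes: 7 payload bits per byte, low-order bits first.
    // A set high bit (0x80) means "more bytes follow".
    static byte[] writeVInt(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((i & ~0x7F) != 0) {
            // Proposed simplification: the (byte) cast keeps only the low
            // 8 bits, so masking with 0x7f first should not change the result.
            out.write((byte) (i | 0x80));
            i >>>= 7;
        }
        out.write((byte) i);
        return out.toByteArray();
    }

    // Compares the original expression with the simplified one for a given i.
    static boolean castsAgree(int i) {
        return (byte) ((i & 0x7F) | 0x80) == (byte) (i | 0x80);
    }

    public static void main(String[] args) {
        for (byte b : writeVInt(300)) {
            System.out.printf("%02x ", b & 0xFF);
        }
        System.out.println();
        System.out.println(castsAgree(300) && castsAgree(-1));
    }
}
```

Both expressions keep bits 0 to 6 of i and force bit 7 to 1 before the cast drops everything above bit 7, which is why I think they give the same byte.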
--
Best regards,
Charlie

>> I thought
>>
>> (byte)((i & 0x7f) | 0x80) == (byte)(i | 0x80)
>>
>> As (byte) is able to truncate the last byte for us already, no need of
>> (& 0x7f). If so, we may change that line to
>>
>> writeByte((byte)(i | 0x80));
>>
>> and may speed up a little bit. Correct me if (i & 0x7f) is necessary.
>> Thank you.
> I wouldn't bother optimizing these methods... I think they will be
> changed in the future anyway.
> 1) The current code outputs modified-UTF-8 instead of true UTF-8
> 2) I think we may be going to byte-oriented counts for length (away
> from number of java chars, which are variable-length with the latest
> unicode standards)
> Marvin Humphrey has done the first, and seems close to finishing #2.
> http://www.mail-archive.com/java-dev@lucene.apache.org/msg01970.html
> http://www.mail-archive.com/java-dev@lucene.apache.org/msg02109.html
> http://www.mail-archive.com/java-dev@lucene.apache.org/msg02468.html
> http://www.mail-archive.com/java-dev@lucene.apache.org/msg03801.html
> -Yonik
> http://incubator.apache.org/solr Solr, the open-source Lucene search server