Not true. You do not need to pre-scan it. When you use a CharsetEncoder, it writes the bytes to a buffer (expanding as needed). At the end of the encoding you can get the actual number of bytes needed.
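A minimal Java sketch of this idea, for illustration only. The class name Utf8StringWriter and the helpers writeVInt and writeString are hypothetical and not taken from Lucene; the VInt layout (7 bits per byte, low-order bits first, high bit as a continuation flag) is assumed, and StandardCharsets requires Java 7 or later. It relies on the convenience method CharsetEncoder.encode(CharBuffer), which allocates and grows the output buffer internally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Utf8StringWriter {

    // Assumed VInt layout: 7 bits per byte, low-order bits first,
    // high bit set on every byte except the last.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes the string once, then writes the byte count followed by the bytes.
    static byte[] writeString(String s) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();

        // encode() returns a buffer whose remaining() is the encoded length.
        ByteBuffer encoded = encoder.encode(CharBuffer.wrap(s));
        int byteLength = encoded.remaining();

        // Copy the encoded bytes out of the buffer.
        byte[] bytes = new byte[byteLength];
        encoded.get(bytes);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, byteLength);       // length prefix in bytes, not chars
        out.write(bytes, 0, byteLength);  // the UTF-8 bytes themselves
        return out.toByteArray();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // "naïve" is 5 chars but 6 UTF-8 bytes, plus 1 byte for the VInt prefix.
        byte[] record = writeString("na\u00EFve");
        System.out.println(record.length);
    }
}
```

Note that in this sketch the encoded bytes sit in memory until the length prefix has been written, so the string is buffered in encoded form rather than scanned twice.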
The pseudo-code is:

  use CharsetEncoder to write the String to a ByteBuffer
  write the VInt using ByteBuffer.getLength()
  write the bytes using ByteBuffer.getByte[]

Better yet, use NIO so you can pass the ByteBuffer directly.

-----Original Message-----
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 30, 2005 12:56 PM
To: java-dev@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: Lucene does NOT use UTF-8.

> I think you guys are WAY overcomplicating things, or you just don't know
> enough about the Java class libraries.

People were just pointing out that if the vint isn't String.length(), then
one has to either buffer the entire string, or pre-scan it.

It's a valid point, and CharsetEncoder doesn't change that.

-Yonik
Now hiring -- http://tinyurl.com/7m67g