Re: org.apache.poi.util.StringUtil

Avik Sengupta Fri, 13 May 2005 08:56:40 -0700

On Fri, 2005-05-13 at 16:35 +0100, Nick Burch wrote:
> Hi All
> 
> It has been suggested (in Bugzilla) that my PowerPoint code's 
> util.TextMunger class is largely a duplicate of util.StringUtil.
> 
Since I did the suggesting, I suppose it behoves me to reply :). But let
me say that I haven't looked very closely at what you require; the two
classes just looked similar.


> However, I'm really struggling to figure out exactly what that class does. 
> Comments like "write compressed unicode" don't really explain much...
> 
> Could someone perhaps tell me if there are any methods to do the 
> following?
> 
> * Take little endian unicode bytes, and return a string
public static String getFromUnicodeLE(
                final byte[] string,
                final int offset,
                final int len)

The javadoc is completely off! Also, I am not sure if the method that
takes only the byte array is correct... I think we mostly use the above
method.
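
To make the intended behaviour concrete, here is a minimal JDK-only sketch
of that kind of decode. It is not the POI source: the class and method
names are made up for illustration, and it assumes len counts characters
(two bytes each) rather than bytes.

```java
import java.nio.charset.StandardCharsets;

class UnicodeLeDecodeSketch {
    // Hypothetical helper, not POI's code: decodes `len` UTF-16LE characters
    // (two bytes each) starting at byte position `offset`.
    static String fromUnicodeLE(byte[] data, int offset, int len) {
        return new String(data, offset, len * 2, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // "Hi" as little-endian UTF-16, preceded by one unrelated byte.
        byte[] raw = {0x7F, 0x48, 0x00, 0x69, 0x00};
        System.out.println(fromUnicodeLE(raw, 1, 2)); // prints "Hi"
    }
}
```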


> * Take a string, and return little endian unicode bytes
public static void putUnicodeLE(
                final String input,
                final byte[] output,
                final int offset)
The output is not returned; it is written into the byte array, starting
at offset.
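
A rough JDK-only sketch of that calling convention, where the caller owns
the buffer and the method only fills it in. The names are hypothetical and
this is not the POI implementation; it assumes the buffer has room for two
bytes per character from offset onwards.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class UnicodeLeEncodeSketch {
    // Hypothetical helper, not POI's code: writes `input` as UTF-16LE
    // (no byte order mark) into `output`, starting at `offset`.
    static void writeUnicodeLE(String input, byte[] output, int offset) {
        byte[] bytes = input.getBytes(StandardCharsets.UTF_16LE);
        System.arraycopy(bytes, 0, output, offset, bytes.length);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[6];
        writeUnicodeLE("Hi", buf, 2);
        System.out.println(Arrays.toString(buf)); // prints [0, 0, 72, 0, 105, 0]
    }
}
```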

> * Take a string, and return the closest approximation in US-ASCII bytes
?? What's closest? Taking only the low bytes? I don't think there's
anything that does that (there were, but they were bugfixed out) :)
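
To show why "taking only the low bytes" is a poor approximation, here is a
small illustrative sketch (hypothetical names, plain JDK, not anything
from POI). Two different characters end up as the same byte once the high
byte is dropped.

```java
import java.util.Arrays;

class LowByteSketch {
    // Hypothetical helper: keeps only the low 8 bits of each char,
    // silently discarding the high byte.
    static byte[] lowBytes(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // U+20AC (euro sign) and U+00AC (not sign) share the low byte 0xAC.
        boolean same = Arrays.equals(lowBytes("\u20AC"), lowBytes("\u00AC"));
        System.out.println(same); // prints true
    }
}
```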

> * Take a string, try to convert it US-ASCII bytes, and either return the 
>    bytes or indicate (exception, null return etc) that it couldn't be 
>    done?

public static boolean isUnicodeString(final String value) does the
checking, and returns true or false.

public static void putCompressedUnicode(
                final String input,
                final byte[] output,
                final int offset)

converts to a US-ASCII byte array, or throws a java.lang.InternalError.
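
The check-then-convert pattern you describe could look roughly like this
with only the JDK. This is a sketch under my own assumptions, not the POI
code: the names are invented, and it returns null on failure, which is
just one of the options you listed.

```java
import java.nio.charset.StandardCharsets;

class AsciiOrFailSketch {
    // Hypothetical check: true if every character is representable in US-ASCII.
    static boolean fitsInAscii(String value) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(value);
    }

    // Hypothetical conversion: the ASCII bytes, or null if the string does not fit.
    static byte[] toAsciiOrNull(String value) {
        return fitsInAscii(value) ? value.getBytes(StandardCharsets.US_ASCII) : null;
    }

    public static void main(String[] args) {
        System.out.println(toAsciiOrNull("POI").length);     // prints 3
        System.out.println(toAsciiOrNull("\u20AC") == null); // prints true
    }
}
```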

> I'll happily do a patch the javadocs for the methods I end up using, once 
> I know what they do!
Thanks! The term Compressed/Uncompressed unicode is an unfortunate
Excel'ism that's got into our code.


Hope that helps. I'm pretty sure the above is correct, but... 

Shout if you need anything else. 

Regards
-
Avik



