On 5/15/09, Markus Wiederkehr <[email protected]> wrote:
> On Fri, May 15, 2009 at 12:02 AM, Alejandro Valdez
> <[email protected]> wrote:
>> Hi list, I'm using mime4j to extract the text content from the
>> e-mail's text/html parts, I
>> found that sometimes there are non-standard MIME parts that use
>> iso-8859-1 characters (i.e.
>> accented vowels) but don't declare any charset in the part's MIME header.
>>
>> In that cases I found that mime4j creates a Reader that uses us-ascii
>> as the charset (that is what
>> should be done when there is no charset declaration in the header).
>> Reading the content from that
>> Reader produces char[] with the unicode FFFD symbol in replacement of
>> the non us-ascii characters.
>>
>> Do anyone know some way to use the mime4j API to return a Reader with
>> iso-8859-1 charset set,
>> or some other solution to this (maybe common) problem?
>
> I looks indeed like this is not possible.
>
> For Mime4j 0.7 I would propose that we pull up getInputStream() from
> BinaryBody to SingleBody so that TextBody gets this method too.
>
> If that's okay I can open a JIRA and fix the issue.
>
>> This is the way I'm reading a TextPart content:
>>
>> TextBody textBody = (TextBody) part.getBody();
>> Reader reader = textBody.getReader();
>> char[] buffer = new char[16000];
>> StringBuilder sb = new StringBuilder();
>>
>> int bytesReaded = 1;
>> while (bytesReaded != -1) {
>>     bytesReaded = reader.read(buffer,0,buffer.length);
>>     if(bytesReaded != -1) {
>>         sb.append(buffer,0,bytesReaded);
>>     }
>> }
>> return sb.toString();
>
> Looks like you want to convert the TextBody to a String.. How about this:
>
> TextBody textBody = (TextBody) part.getBody();
> ByteArrayOutputStream baos = new ByteArrayOutputStream();
> textBody.writeTo(baos);
> return new String(baos.toByteArray(), "iso-8859-1");
>
> hth
> Markus
>
Hello Markus, thank you very much for your help. Your snippet works great: it creates a String from all the bytes in the MIME text part, decoded as iso-8859-1. I'm curious about how the writeTo() method actually works. I looked at SingleBody.java and TextPart.java in the mime4j 0.6 source code (under src\main\java\org\apache\james\mime4j\message), but I couldn't find the implementation of this method. Could you please point me in the right direction?
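
For anyone finding this thread later, here is a minimal JDK-only sketch of why decoding the raw bytes as iso-8859-1 keeps the accented characters while us-ascii replaces them with U+FFFD. It does not use mime4j at all; the class name CharsetDecodeSketch and the decode() helper are made up for illustration, and the byte array simply stands in for whatever writeTo() puts into the ByteArrayOutputStream. It assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of mime4j.
class CharsetDecodeSketch {

    // Decodes raw bytes with the given charset. Bytes that are not valid
    // in that charset are replaced with U+FFFD by the String constructor.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "más" encoded as iso-8859-1: 'm', 0xE1 (a with acute accent), 's'
        byte[] raw = { 'm', (byte) 0xE1, 's' };

        String latin1 = decode(raw, StandardCharsets.ISO_8859_1);
        String ascii = decode(raw, StandardCharsets.US_ASCII);

        // Print the code point of the middle character in each result.
        System.out.println(Integer.toHexString(latin1.charAt(1)));
        System.out.println(Integer.toHexString(ascii.charAt(1)));
    }
}
```

The same bytes give different Strings depending only on the charset passed to the decoder, which is why taking the bytes first and choosing the charset yourself works around the missing charset declaration.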
