[android-developers] Re: Writing files in UTF-8

2012-03-30 Thread Dirk Vranckaert
Ooh great, didn't think about that.
Now it works perfectly! Thanks for all the support with this issue!

Kr,

Dirk

On Friday, 30 March 2012 at 11:55:34 UTC+2, Remote Red wrote:
>
> Looping? 
>
> byte[] bom = {(byte)0xEF, (byte)0xBB, (byte)0xBF }; 
> byte[] byteResult = 
> Charset.forName("UTF-8").encode(result.toString()).array(); 
>
> fos = new FileOutputStream(file); 
> fos.write(bom); 
> fos.write(byteResult); 
>
> Doesn't this work?

-- 
You received this message because you are subscribed to the Google
Groups "Android Developers" group.
To post to this group, send email to android-developers@googlegroups.com
To unsubscribe from this group, send email to
android-developers+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/android-developers?hl=en

[android-developers] Re: Writing files in UTF-8

2012-03-30 Thread Remote Red
Looping?

byte[] bom = {(byte)0xEF, (byte)0xBB, (byte)0xBF };
byte[] byteResult =
Charset.forName("UTF-8").encode(result.toString()).array();

fos = new FileOutputStream(file);
fos.write(bom);
fos.write(byteResult);

Doesn't this work?
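
One caveat with the snippet above: ByteBuffer.array() returns the whole
backing array, which may be longer than the encoded content, so stray
trailing bytes could end up in the file. A minimal sketch that sidesteps
this with String.getBytes, assuming a hypothetical helper named
writeUtf8WithBom in a hypothetical class Utf8BomWriter (neither name is
from this thread):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;

// Hypothetical helper class; the names are illustrative, not from the thread.
final class Utf8BomWriter {
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Writes the three BOM bytes first, then the text encoded as UTF-8.
    static void writeUtf8WithBom(File file, String text) throws IOException {
        // getBytes returns an array sized exactly to the encoded content.
        byte[] body = text.getBytes(Charset.forName("UTF-8"));
        OutputStream fos = new FileOutputStream(file);
        try {
            fos.write(UTF8_BOM);
            fos.write(body);
        } finally {
            fos.close(); // release the file handle even if a write fails
        }
    }
}
```

Calling Utf8BomWriter.writeUtf8WithBom(file, result.toString()) would then
take the place of the two fos.write(...) calls.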



[android-developers] Re: Writing files in UTF-8

2012-03-30 Thread Dirk Vranckaert
Sadly there is no insert(), and put() does indeed overwrite the bytes.

Charset.encode(...) gives me a ByteBuffer that I cannot obtain
'empty'...

Adding (put) the BOM first should be OK, but then adding all the text bytes
will not work. Then again, I would have to loop over all the bytes...

On Friday, 30 March 2012 at 10:34:20 UTC+2, Remote Red wrote:
>
> > .put(0, (byte) 0xEF) 
> > .put(1, (byte) 0xBB) 
> > .put(2, (byte) 0xBF) 
>
> > However when the file is created and I open it with either notepad++ or
> > excel the first characters are not shown, the file always starts with
> > *artdate"*.
> > 
> > Any idea how to work around this? 
>
> You are using put(). Those three statements will overwrite the 
> first three bytes. 
>
> Isn't there an insert() ? 
>
> If not: just first write the bom and then the bytes.


[android-developers] Re: Writing files in UTF-8

2012-03-30 Thread Remote Red
> .put(0, (byte) 0xEF)
> .put(1, (byte) 0xBB)
> .put(2, (byte) 0xBF)

> However when the file is created and I open it with either notepad++ or
> excel the first characters are not shown, the file always starts with
> *artdate"*.
>
> Any idea how to work around this?

You are using put(). Those three statements will overwrite the
first three bytes.

Isn't there an insert() ?

If not: just first write the bom and then the bytes.
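
The difference can be sketched in memory, without touching a file. This
is only an illustration under assumptions: the class name BomPutDemo and
both method names are hypothetical. The first method mirrors the quoted
put(0..2) approach, the second builds a new buffer with the BOM in front.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical demo class; names are illustrative, not from the thread.
final class BomPutDemo {
    private static final Charset UTF8 = Charset.forName("UTF-8");
    private static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Absolute put(index, value) replaces the bytes already at 0, 1 and 2.
    // Assumes the encoded text is at least three bytes long.
    static byte[] overwriteWithBom(String text) {
        ByteBuffer encoded = UTF8.encode(text);
        encoded.put(0, BOM[0]).put(1, BOM[1]).put(2, BOM[2]);
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out); // copy only the valid region of the buffer
        return out;
    }

    // Relative put() appends: BOM first, then every encoded byte.
    static byte[] prependBom(String text) {
        ByteBuffer encoded = UTF8.encode(text);
        ByteBuffer out = ByteBuffer.allocate(BOM.length + encoded.remaining());
        out.put(BOM).put(encoded);
        return out.array(); // exact size here, because we allocated it ourselves
    }
}
```

With the input "Startdate" (double quotes included), overwriteWithBom
keeps the length at 11 bytes and leaves artdate" after the BOM, while
prependBom returns 14 bytes with the full text intact.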



[android-developers] Re: Writing files in UTF-8

2012-03-30 Thread Remote Red
> But the question marks in the file I attached are there because Google
> Groups did not handle the document correctly... Locally it works and I can send
> it through mail without issues.

Sorry, but I find it hard to believe that when Google offers us a
platform to discuss development, and the possibility to upload
files so we can show them to others, those files get changed and
corrupted.

A giant like Google would never do that.



[android-developers] Re: Writing files in UTF-8

2012-03-30 Thread Dirk Vranckaert
I did specify the type of result ;) It's a StringBuilder and does indeed 
not support the toString("UTF-8")

What I did now:
FileOutputStream fos = null;
byte[] byteResult = Charset.forName("UTF-8").encode(result.toString())
        .put(0, (byte) 0xEF)
        .put(1, (byte) 0xBB)
        .put(2, (byte) 0xBF)
        .array();
fos = new FileOutputStream(file);
fos.write(byteResult);

This works! It's showing my Hebrew and Chinese characters correctly!

But I still have one issue:
The string result.toString() starts with this data (double quotes
included): *"Startdate";"Starttime";"Enddate";*
However, when the file is created and I open it with either Notepad++ or
Excel, the first characters are not shown; the file always starts with
*artdate"*.

Any idea how to work around this? I thought that maybe the first few
characters (always 3) after the BOM were simply being dropped, so I added
some spaces (exactly 3) after the BOM, and then it's OK. It's a workaround,
but I hope there is a better way...

Kr,

Dirk

On Friday, 30 March 2012 at 00:38:08 UTC+2, Lew wrote:
>
> b0b wrote:
>
>> For your code to work you need:
>>
>> out.write(result.toString("UTF-8"));
>>
>
> How do you know that the type of 'result' supports such a method? The OP 
> did not indicate the type of 'result'.
>
> Surely you are aware that 'String' values in Java are always UTF-16?
> 
>
> -- 
> Lew
>
>
>  
>


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Lew
b0b wrote:

> For your code to work you need:
>
> out.write(result.toString("UTF-8"));
>

How do you know that the type of 'result' supports such a method? The OP 
did not indicate the type of 'result'.

Surely you are aware that 'String' values in Java are always UTF-16?


-- 
Lew
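
To illustrate the point: a String (or StringBuilder) holds UTF-16 code
units in memory, and a charset only comes into play when the text is
converted to bytes. A minimal sketch, assuming a hypothetical class
EncodingAtTheBoundary and a hypothetical helper encode:

```java
import java.nio.charset.Charset;

// Hypothetical demo class; names are illustrative, not from the thread.
final class EncodingAtTheBoundary {
    // Encodes text with the named charset. The String itself carries no
    // encoding choice; the choice is made here, at the byte boundary.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        StringBuilder result = new StringBuilder("\u9019"); // one CJK character
        String text = result.toString(); // toString() takes no charset argument
        System.out.println(encode(text, "UTF-8").length);    // prints 3
        System.out.println(encode(text, "UTF-16BE").length); // prints 2
    }
}
```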


 


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread b0b
For your code to work you need:

out.write(result.toString("UTF-8"));


Re: [android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Kostya Vasilyev
On 29 March 2012 at 22:18, Lew wrote:

> Remote Red wrote:
> > Dirk Vranckaert wrote:
>
>> > I've added "out.write('\ufeff');" to write the Byte Order Mark
>>
>> That is not the bom for utf-8. You wrote the bom for UTF-16.
>>
>> http://en.wikipedia.org/wiki/Byte_order_mark
>>
>> The bom for utf-8 is 0xEF, 0xBB, 0xBF.
>
>
> And that same article reminds us,
> "The Unicode standard recommends against the BOM for UTF-8.[26]"
>
> It also says that the need for it is something Notepad decided despite
> that it violates the recommendation.
>

IIRC, Notepad++ has a menu to switch encoding on an already loaded file.
Perhaps it was as simple as clicking there.

-- K


>
>
>> You are not done before it works with the right bom.
>>
>
> Which ought to be no BOM at all.
>
> --
> Lew
>
>


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Lew
Remote Red wrote:
> Dirk Vranckaert wrote:

> > I've added "out.write('\ufeff');" to write the Byte Order Mark 
>
> That is not the bom for utf-8. You wrote the bom for UTF-16. 
>
> http://en.wikipedia.org/wiki/Byte_order_mark 
>
> The bom for utf-8 is 0xEF, 0xBB, 0xBF. 


And that same article reminds us,
"The Unicode standard recommends against the BOM for UTF-8.[26]"

It also says that the need for it is something Notepad decided on, even
though it violates the recommendation.
 

> You are not done before it works with the right bom. 
>

Which ought to be no BOM at all.

-- 
Lew
 


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Dirk Vranckaert
Ooh yeah, I will have a look at the UTF-8 BOM tomorrow.

But the question marks in the file I attached are there because Google
Groups did not handle the document correctly... Locally it works and I can send
it through mail without issues.

Kr,

Dirk

On Thursday, 29 March 2012 at 14:43:33 UTC+2, Remote Red wrote:
>
> > I've added "out.write('\ufeff');" to write the Byte Order Mark 
>
> That is not the bom for utf-8. You wrote the bom for UTF-16. 
>
> http://en.wikipedia.org/wiki/Byte_order_mark 
>
> The bom for utf-8 is 0xEF, 0xBB, 0xBF. 
>
> > ... and you can 
> > see in the attachment the export now works fine! :) 
>
> Those attachments have no BOM and do not contain characters like
> €å”€å”€å”€à €à €à €ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€Ì€Ì€Ì€Ì€Ì€Ì€ 
> but a lot of question marks, like
> ? 
>
> You are not done before it works with the right bom. 
>


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Remote Red
> I've added "out.write('\ufeff');" to write the Byte Order Mark

That is not the bom for utf-8. You wrote the bom for UTF-16.

http://en.wikipedia.org/wiki/Byte_order_mark

The bom for utf-8 is 0xEF, 0xBB, 0xBF.

> ... and you can
> see in the attachment the export now works fine! :)

Those attachments have no BOM and do not contain characters like
€å”€å”€å”€à €à €à €ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€Ì€Ì€Ì€Ì€Ì€Ì€
but a lot of question marks, like
?

You are not done before it works with the right bom.
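
A note on the claim above: U+FEFF is the BOM code point in every Unicode
encoding, and its UTF-8 encoding is exactly the byte sequence 0xEF, 0xBB,
0xBF. So writing '\ufeff' through a writer configured for UTF-8 should
already emit those three bytes. A minimal sketch to check this in memory,
assuming a hypothetical class BomThroughWriter and a hypothetical method
bomBytesViaUtf8Writer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical demo class; names are illustrative, not from the thread.
final class BomThroughWriter {
    // Returns the bytes produced when U+FEFF is written through a UTF-8 writer.
    static byte[] bomBytesViaUtf8Writer() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Writer out = new OutputStreamWriter(sink, "UTF-8");
        out.write('\ufeff'); // the BOM code point, as a single char
        out.close();         // flushes the encoder into the sink
        return sink.toByteArray();
    }
}
```

The returned array holds three bytes: 0xEF, 0xBB, 0xBF.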



[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Dirk Vranckaert
All right, it works now!
Thank you for pointing out the Byte Order Mark to me.

I've added "out.write('\ufeff');" to write the Byte Order Mark and you can 
see in the attachment the export now works fine! :)

This is my code now:

out = new OutputStreamWriter(
        new FileOutputStream(file), "UTF-8"
);
out.write('\ufeff');
out.write(result.toString());
out.flush();

On Thursday, 29 March 2012 at 14:11:01 UTC+2, Dirk Vranckaert wrote:
>
> Oops, forgot the file ;)
>
> But update:
> I was wrong, ö ë é
> are not working either!
>
> File is now attached! :)
>

?"Startdate";"Starttime";"Enddate";"Endtime";"Comment";"Project";"Task";"Project-comment";
"3/29/12";"11:55 AM";"Until now...";"";"QaE???";"Your Project";"Your Task!";"";
"3/29/12";"07:03 AM";"3/29/12";"11:54 AM";"";"Your Project";"Your Task!";"";


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Dirk Vranckaert
Oops, forgot the file ;)

But update:
I was wrong, ö ë é
are not working either!

File is now attached! :)

"Startdate";"Starttime";"Enddate";"Endtime";"Comment";"Project";"Task";"Project-comment";
"3/29/12";"11:55 AM";"Until now...";"";"QaE???";"Your Project";"Your Task!";"";
"3/29/12";"07:03 AM";"3/29/12";"11:54 AM";"";"Your Project";"Your Task!";"";


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Dirk Vranckaert

> Ok. But then why also a BufferedWriter?

The BufferedWriter is indeed redundant, but leaving it out doesn't change
anything...

> so of what type is result in result.toString()?

Result is of type StringBuilder.

> Do characters like ö, é come through ok?

These do indeed come through correctly!

> Are you sure you are using a UTF-8 capable editor?

Yes, I'm sure :) I'm using Notepad++ ;)

I attached the file as it is exported!

The produced file does not have a BOM (or at least, I don't know whether it
does :) ).
How can I write 'them' myself?

Kr,

Dirk

On Thursday, 29 March 2012 at 13:50:28 UTC+2, Remote Red wrote:
>
> Just a thought: does the produced file have a BOM (Byte Order Mark)
> as the first bytes? 
>
> If not you have to write them yourself. 
>



[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Remote Red
Just a thought: does the produced file have a BOM (Byte Order Mark)
as the first bytes?

If not you have to write them yourself.



[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Remote Red


On 29 Mar, 13:03, Dirk Vranckaert wrote:
> > Why are you using FileOutputStream AND OutputStreamWriter AND
> > BufferedWriter?
>
> I need the OutputStreamWriter to specify the encoding,

Ok. But then why also a BufferedWriter?

> The result is a CSV file,

Sorry but you did not understand my question. You wrote
"result.toString()"
so of what type is result in result.toString()?

> So when I do so for Dutch, German, English, French,... (all western
> languages) it works.

Do characters like ö, é come through ok?

> ... You can open the CSV file
> in a text-editor


Are you sure you are using a UTF-8 capable editor?

> So instead of seeing something like this:
> 這是一個測試
> I see something like this:
> â €å”€å”€å”€à €à €à €ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€Ì€Ì€Ì€Ì€Ì€Ì€

That looks like what an editor that cannot handle utf-8 would display.

All this encoding is very interesting and I would like to help you.

Suggesting:
-make a small utf-8 file with those Chinese characters.
-put that file somewhere on the internet so we can download it.
-make an activity that just reads the file in a string and writes
 the contents of that string to another file.
-compare the two files.

If you publish your activity code here (keep it as small as possible)
I will experiment with it.
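
The read-then-write comparison suggested above can be sketched in plain
Java rather than as an Android activity. This is only a sketch under
assumptions: the class Utf8RoundTrip and the method roundTripMatches are
hypothetical names, and java.nio.file is assumed to be available on the
target platform.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative, not from the thread.
final class Utf8RoundTrip {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Reads source into a String, writes that String to target, and reports
    // whether the two files are byte-for-byte identical.
    static boolean roundTripMatches(Path source, Path target) throws IOException {
        String contents = new String(Files.readAllBytes(source), UTF8);
        Files.write(target, contents.getBytes(UTF8));
        return Arrays.equals(Files.readAllBytes(source), Files.readAllBytes(target));
    }
}
```

If the source file is valid UTF-8 and the result is false, the read or
the write step is changing the bytes.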




[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Dirk Vranckaert

> Why are you using FileOutputStream AND OutputStreamWriter AND
> BufferedWriter?

I need the OutputStreamWriter to specify the encoding; however, that doesn't
seem to work...

> What is the type of result? How did you put content in it?

The result is a CSV file. I just loop over some DB records and build a
string, then I write the string to the writer...

> You are not checking the return value of your write statement.

The Writer.write(..) method does not return anything... it's void...!

> Please explain exactly what is wrong with it.

I thought I did, but anyway, I'll do it again.
Explained very simply: in the application, the text that the user
enters is stored in a DB. I read the contents of the DB and construct a
string. Then I write the string to the writer (to the file)!
When I do so for Dutch, German, English, French,... (all Western
languages) it works. However, users started reporting that when inputting
Chinese or Hebrew characters the application works just fine, but the
content of the exported CSV file is unreadable. You can open the CSV file
in a text editor or a spreadsheet editor, but the characters that should be
Chinese or Hebrew are just unreadable characters.

So instead of seeing something like this:
這是一個測試
I see something like this:
â €å”€å”€å”€à €à €à €ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€ç¼€Ì€Ì€Ì€Ì€Ì€Ì€

So it's a matter of encoding that is not correct. But I am encoding in 
UTF-8 so I don't see the issue...
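
One common way such output arises is that the bytes on disk are valid
UTF-8 but the program opening the file decodes them with a different
charset. The sketch below only illustrates that general effect and is
not a diagnosis of this particular file; the class MojibakeDemo and the
method misread are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; names are illustrative, not from the thread.
final class MojibakeDemo {
    // Encodes text as UTF-8, then (wrongly) decodes those bytes with
    // another charset, the way a misconfigured viewer might.
    static String misread(String text, String wrongCharset) {
        byte[] utf8 = text.getBytes(Charset.forName("UTF-8"));
        return new String(utf8, Charset.forName(wrongCharset));
    }
}
```

For example, misread("\u00e9", "ISO-8859-1") turns the single character
é into the two characters Ã©.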

Kr,

Dirk



On Thursday, 29 March 2012 at 11:35:37 UTC+2, Remote Red wrote:
>
>
>
>
> > out = new BufferedWriter(new OutputStreamWriter( 
> > new FileOutputStream(file), "UTF-8" 
> > )); 
>
> Why are you using FileOutputStream AND OutputStreamWriter AND 
> BufferedWriter? 
>
> > out.write(result.toString()); 
>
> What is the type of result? How did you put content in it? 
>
> You are not checking the return value of your write statement.
>
> > I indeed see that the text is not correct. 
>
> Please explain exactly what is wrong with it.


[android-developers] Re: Writing files in UTF-8

2012-03-29 Thread Remote Red



>     out = new BufferedWriter(new OutputStreamWriter(
> new FileOutputStream(file), "UTF-8"
>     ));

Why are you using FileOutputStream AND OutputStreamWriter AND
BufferedWriter?

>     out.write(result.toString());

What is the type of result? How did you put content in it?

You are not checking the return value of your write statement.

> I indeed see that the text is not correct.

Please explain exactly what is wrong with it.
