ZY Zhou Wrote:
It doesn't make sense to add try/catch every time you use
tolower/toupper/foreach on a string. No one will do that.
You either throw an exception when converting invalid utf8 bytes to string, or never
throw an exception and use an invalid UTF32 code in dchar to represent invalid utf8
On Mar 14, 11 13:53, Jesse Phillips wrote:
KennyTM~ Wrote:
It is already throwing an exception called
core.exception.UnicodeException. This even provides you the index where
decoding failed.
(However Phobos is not using it, AFAIK.)
---
import core.exception, std.stdio, std.conv;
On Sunday 13 March 2011 22:45:38 ZY Zhou wrote:
It doesn't make sense to add try/catch every time you use
tolower/toupper/foreach on a string. No one will do that.
You either throw an exception when converting invalid utf8 bytes to string, or
never throw an exception and use an invalid UTF32 code in dchar
On 2011-03-13 23:36, KennyTM~ wrote:
On Mar 14, 11 02:55, Jacob Carlborg wrote:
I would say that the functions should NOT crash but instead throw an
exception. Then the developer can choose what to do when there's an
invalid unicode character.
It is already throwing an exception called
On 2011-03-14 06:45, ZY Zhou wrote:
It doesn't make sense to add try/catch every time you use
tolower/toupper/foreach on a string. No one will do that.
You either throw an exception when converting invalid utf8 bytes to string, or never
throw an exception and use an invalid UTF32 code in dchar to represent
Hi,
I wrote a small program to read and parse html(charset=UTF-8). It worked great
until some invalid utf8 chars appeared in that page.
When the string is invalid, things like foreach or std.string.tolower will
just crash.
This makes the string type totally unusable when processing files, since
On Sunday 13 March 2011 01:57:12 ZY Zhou wrote:
Hi,
I wrote a small program to read and parse html(charset=UTF-8). It worked
great until some invalid utf8 chars appeared in that page.
When the string is invalid, things like foreach or std.string.tolower will
just crash.
This makes the string
std.utf throws an exception instead of crashing the program, but you still need to add
try/catch everywhere.
My point is: this simple code should work instead of crashing; it is supposed to
leave all invalid codes untouched and just process the valid parts.
Stream file = new BufferedFile("sample.txt");
On Sunday 13 March 2011 04:34:24 ZY Zhou wrote:
std.utf throws an exception instead of crashing the program, but you still need to
add try/catch everywhere.
My point is: this simple code should work instead of crashing; it is supposed
to leave all invalid codes untouched and just process the valid
== Quote from ZY Zhou (rin...@geemail.com)'s article
std.utf throws an exception instead of crashing the program, but you still need to
add try/catch everywhere.
My point is: this simple code should work instead of crashing; it is supposed to
leave all invalid codes untouched and just process the
On 03/13/2011 10:57 AM, ZY Zhou wrote:
Hi,
I wrote a small program to read and parse html(charset=UTF-8). It worked great
until some invalid utf8 chars appeared in that page.
When the string is invalid, things like foreach or std.string.tolower will
just crash.
This makes the string type totally
but I think that it's completely unreasonable to expect
all of the string-based and/or range-based functions to be able to handle
invalid unicode.
As I explained in the first mail, if the utf8 parser converts all invalid utf8 chars
to low surrogate code points (0x80~0xFF -> 0xDC80~0xDCFF), other
On 03/13/2011 12:34 PM, ZY Zhou wrote:
std.utf throws an exception instead of crashing the program, but you still need to add
try/catch everywhere.
My point is: this simple code should work instead of crashing; it is supposed to
leave all invalid codes untouched and just process the valid parts.
Stream
On 03/13/2011 01:25 PM, ZY Zhou wrote:
but I think that it's completely unreasonable to expect
all of the string-based and/or range-based functions to be able to handle
invalid unicode.
As I explained in the first mail, if the utf8 parser converts all invalid utf8 chars
to low surrogate code
What if I'm making a text editor with D?
I know the text has something wrong, and I want to open it and fix it. The exception
won't help; if the editor just refuses to open an invalid file, then the editor is
useless.
Try opening an invalid utf file with a text editor like vim, and you will understand
what I
On 2011-03-13 10:18:24 -0400, ZY Zhou rin...@geeemail.com said:
What if I'm making a text editor with D?
I know the text has something wrong, and I want to open it and fix it. The
exception won't help; if the editor just refuses to open an invalid file, then
the editor is useless.
Try opening an invalid
If an invalid utf8 or utf16 code needs to be converted to utf32, then it should be
converted to an invalid utf32. That's why D800~DFFF are marked as invalid points
in the unicode standard.
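
For illustration only, here is a minimal D sketch of the mapping proposed in this thread: each byte that cannot be decoded as UTF-8 (always in the range 0x80~0xFF) is stored as the lone low surrogate 0xDC80~0xDCFF, so the original bytes can be recovered later. `decodeLossless` is a hypothetical helper name, not a Phobos function, and the sketch assumes `std.utf.decode` and `UTFException` as found in current Phobos, which may differ from the 2011 library. Whether surrogate values belong in a dchar at all is exactly what the replies dispute.

```d
import std.utf : decode, UTFException;

/// Hypothetical helper: decode UTF-8 without ever throwing.
/// Valid sequences become their code points; each undecodable byte b
/// (0x80..0xFF) becomes the lone low surrogate 0xDC00 + b.
dchar[] decodeLossless(const(char)[] s)
{
    dchar[] result;
    size_t i = 0;
    while (i < s.length)
    {
        immutable start = i;
        try
        {
            // On success, decode advances i past the whole sequence.
            result ~= decode(s, i);
        }
        catch (UTFException e)
        {
            // Keep the offending byte as 0xDC80..0xDCFF and resume
            // one byte further, regardless of where decode left i.
            result ~= cast(dchar)(0xDC00 + cast(ubyte) s[start]);
            i = start + 1;
        }
    }
    return result;
}
```

A caller that later writes the text back out would need the reverse step (0xDC80~0xDCFF back to the single byte) before encoding, since standard UTF-8 encoders reject surrogates.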
== Quote from spir (denis.s...@gmail.com)'s article
This is not a good idea, imo. Surrogate values /are/ invalid
On 2011-03-13 13:22, spir wrote:
On 03/13/2011 10:57 AM, ZY Zhou wrote:
Hi,
I wrote a small program to read and parse html(charset=UTF-8). It
worked great until some invalid utf8 chars appeared in that page.
When the string is invalid, things like foreach or std.string.tolower
will just crash.
On 03/13/2011 04:43 PM, ZY Zhou wrote:
If an invalid utf8 or utf16 code needs to be converted to utf32, then it should be
converted to an invalid utf32. That's why D800~DFFF are marked as invalid points
in the unicode standard.
You are wrong on both points.
First, there is no definition of invalid
On 3/13/11 1:55 PM, Jacob Carlborg wrote:
I would say that the functions should NOT crash but instead throw an
exception. Then the developer can choose what to do when there's an
invalid unicode character.
Yah. In addition, the exception should provide index information such
that an
Crash - Have fun stepping through your code with a debugger, or
worse, observe disassembly.
Throw - (Hopefully) get an informative error message, which could
mean you'll be able to fix the bug quickly.
It doesn't make sense to add try/catch every time you use
tolower/toupper/foreach on a string. No one will do that.
You either throw an exception when converting invalid utf8 bytes to string, or never
throw an exception and use an invalid UTF32 code in dchar to represent invalid utf8
code.
string s = "\x0A";
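
The first option mentioned above (reject invalid bytes once, at the point where raw bytes become a string) could look roughly like the following D sketch. `toCheckedString` is a hypothetical name used only for illustration; `std.utf.validate` and `UTFException` are taken from current Phobos and may not match the 2011 library under discussion.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: check the bytes once at the input boundary.
/// Throws UTFException if `raw` is not well-formed UTF-8.
string toCheckedString(const(ubyte)[] raw)
{
    auto s = cast(const(char)[]) raw; // reinterpret the bytes, no copy
    validate(s);                      // throws UTFException on bad input
    return s.idup;                    // immutable copy of the checked text
}
```

With a single check like this at the boundary, a string that passed validation does not need try/catch around every later foreach or case conversion.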
KennyTM~ Wrote:
It is already throwing an exception called
core.exception.UnicodeException. This even provides you the index where
decoding failed.
(However Phobos is not using it, AFAIK.)
---
import core.exception, std.stdio, std.conv;
void main() {
char[] s = [0x0f,