Re: [rust-dev] Proposed API for character encodings

2013-09-22 Thread Simon Sapin
On 21/09/2013 16:38, Olivier Renaud wrote: I'd expect this offset to be absolute. After all, the only thing that the programmer can do with this information at this point is to report it to the user; if the programmer wanted to handle the error, he could have done it by using a trap. A relati
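To make the "absolute offset" reading concrete, here is a minimal sketch in present-day Rust (not the 2013 syntax used in the thread). It assumes a toy ASCII-only decoder; `AsciiDecoder` and its `consumed` field are hypothetical names for illustration, and only `input_byte_offset` and `invalid_byte_sequence` come from the discussion. It is not the proposal's actual code.

```rust
/// Error reported when the input contains a byte that is not ASCII.
#[derive(Debug, PartialEq)]
pub struct DecodeError {
    /// Offset counted from the start of the whole stream,
    /// not from the start of the current chunk.
    pub input_byte_offset: usize,
    pub invalid_byte_sequence: Vec<u8>,
}

/// Hypothetical streaming decoder that accepts only ASCII.
pub struct AsciiDecoder {
    /// Bytes consumed by earlier successful `feed` calls.
    consumed: usize,
}

impl AsciiDecoder {
    pub fn new() -> Self {
        AsciiDecoder { consumed: 0 }
    }

    /// Decode one chunk, appending the decoded text to `output`.
    /// After an error the decoder is assumed to be discarded.
    pub fn feed(&mut self, input: &[u8], output: &mut String) -> Result<(), DecodeError> {
        for (i, &b) in input.iter().enumerate() {
            if b >= 0x80 {
                return Err(DecodeError {
                    // Absolute position: earlier chunks plus index in this chunk.
                    input_byte_offset: self.consumed + i,
                    invalid_byte_sequence: vec![b],
                });
            }
            output.push(b as char);
        }
        self.consumed += input.len();
        Ok(())
    }
}
```

With this reading, a caller that feeds the stream in arbitrary chunks can report the offset to the user without tracking chunk boundaries itself.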

Re: [rust-dev] Proposed API for character encodings

2013-09-21 Thread Olivier Renaud
On Saturday 21 September 2013 07:59:26, Simon Sapin wrote: > On 20/09/2013 20:07, Olivier Renaud wrote: > > I have one more question regarding the error handling: in DecodeError, > > what does 'input_byte_offset' mean? Is it relative to the > > 'invalid_byte_sequence' or to the beginning of th

Re: [rust-dev] Proposed API for character encodings

2013-09-21 Thread Simon Sapin
On 20/09/2013 20:07, Olivier Renaud wrote: I have one more question regarding the error handling: in DecodeError, what does 'input_byte_offset' mean? Is it relative to the 'invalid_byte_sequence' or to the beginning of the decoded stream? Good point. I’m not sure. (Remember I make this up

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Henri Sivonen
On Tue, Sep 10, 2013 at 6:47 PM, Simon Sapin wrote: > /// Call this to indicate the end of the input. > /// The Decoder instance should be discarded afterwards. > /// Some encodings may append some final output at this point. > /// May raise the decoding_error condition. > fn f

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Olivier Renaud
On Friday 20 September 2013 11:52:14, Simon Sapin wrote: > On 13/09/2013 23:03, Simon Sapin wrote: > > /// Takes the invalid byte sequence. > > /// Return a replacement string, or None to abort with a DecodeError. > > condition! { > > > > pub decoding_error : ~[u8] -> Option<~str>; >

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Olivier Renaud
On Friday 20 September 2013 11:47:04, Simon Sapin wrote: > On 20/09/2013 10:18, Olivier Renaud wrote: > > I really like the API you are proposing. In particular, the error handling > > is close to what I was expecting from such an API. > > > > I have some remarks, though. > > > > Is there

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Simon Sapin
On 10/09/2013 16:47, Simon Sapin wrote: TL;DR: the actual proposal is at the end of this email. I moved this to the wiki, to better deal with updates: https://github.com/mozilla/rust/wiki/Proposal-for-character-encoding-API

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Simon Sapin
On 20/09/2013 13:40, Henri Sivonen wrote: On Tue, Sep 10, 2013 at 6:47 PM, Simon Sapin wrote: /// Call this to indicate the end of the input. /// The Decoder instance should be discarded afterwards. /// Some encodings may append some final output at this point. /// May ra

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Simon Sapin
On 13/09/2013 23:03, Simon Sapin wrote: * Make the output generic in the low-level API by having StringWriter instead of ~str. This has the nice side effect of letting Servo use a different string type for decoding, but not for encoding. To fix the latter, the input of encoding could be generic

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Simon Sapin
On 13/09/2013 23:03, Simon Sapin wrote: /// Takes the invalid byte sequence. /// Return a replacement string, or None to abort with a DecodeError. condition! { pub decoding_error : ~[u8] -> Option<~str>; } /// Functions to be used with decoding_error::cond.trap mod decoding_error_handle
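The `condition!` mechanism quoted above no longer exists in Rust, so the following is only a rough modern analogue of the same contract: a handler receives the invalid byte sequence and returns either a replacement string or `None` to abort with a `DecodeError`. The ASCII-only `decode_ascii` function and the handler names `replacement` and `fatal` are assumptions made for illustration; they are not taken from the proposal.

```rust
/// Error returned when the handler declines to supply a replacement.
#[derive(Debug, PartialEq)]
pub struct DecodeError {
    /// The bytes that could not be decoded.
    pub invalid_byte_sequence: Vec<u8>,
}

/// Hypothetical handler: substitute U+FFFD for every invalid sequence.
pub fn replacement(_invalid: &[u8]) -> Option<String> {
    Some("\u{FFFD}".to_string())
}

/// Hypothetical handler: abort decoding at the first invalid sequence.
pub fn fatal(_invalid: &[u8]) -> Option<String> {
    None
}

/// Decode ASCII, consulting `on_error` for each byte >= 0x80.
/// The closure plays the role the trapped condition played in 2013.
pub fn decode_ascii<F>(input: &[u8], mut on_error: F) -> Result<String, DecodeError>
where
    F: FnMut(&[u8]) -> Option<String>,
{
    let mut out = String::new();
    for &b in input {
        if b < 0x80 {
            out.push(b as char);
        } else {
            match on_error(&[b]) {
                // The handler supplied replacement text: keep going.
                Some(s) => out.push_str(&s),
                // The handler declined: abort with an error.
                None => {
                    return Err(DecodeError {
                        invalid_byte_sequence: vec![b],
                    })
                }
            }
        }
    }
    Ok(out)
}
```

Passing `replacement` gives lossy decoding, while passing `fatal` turns the first invalid byte into an `Err`; a caller can also pass its own closure to log or count errors.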

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Simon Sapin
On 20/09/2013 10:18, Olivier Renaud wrote: I really like the API you are proposing. In particular, the error handling is close to what I was expecting from such an API. I have some remarks, though. Is there a reason for encoders and decoders to not be reusable? I think it would be reasonabl

Re: [rust-dev] Proposed API for character encodings

2013-09-20 Thread Olivier Renaud
I really like the API you are proposing. In particular, the error handling is close to what I was expecting from such an API. I have some remarks, though. Is there a reason for encoders and decoders to not be reusable? I think it would be reasonable to specify that they get back to their initi

Re: [rust-dev] Proposed API for character encodings

2013-09-19 Thread Simon Sapin
On 19/09/2013 13:39, Jeffery Olson wrote: As to the implementation: rust-encoding has a lot that could be adapted. https://github.com/__lifthrasiir/rust-encoding Can someone comment on whether we should look at adapting what's in str

Re: [rust-dev] Proposed API for character encodings

2013-09-19 Thread Jeffery Olson
On Thu, Sep 19, 2013 at 1:05 AM, Simon Sapin wrote: > On 18/09/2013 23:31, Brian Anderson wrote: > >> On 09/10/2013 08:47 AM, Simon Sapin wrote: >> >>> Iterator<u8> and Iterator<char> are tempting, but we may need to work >>> on big chunks at a time for efficiency: Iterator<~[u8]> and >>> Iterator<~str>

Re: [rust-dev] Proposed API for character encodings

2013-09-19 Thread Simon Sapin
On 18/09/2013 23:31, Brian Anderson wrote: On 09/10/2013 08:47 AM, Simon Sapin wrote: Iterator<u8> and Iterator<char> are tempting, but we may need to work on big chunks at a time for efficiency: Iterator<~[u8]> and Iterator<~str>. Or could single-byte/char iterators be reliably inlined to achieve simi

Re: [rust-dev] Proposed API for character encodings

2013-09-18 Thread Brian Anderson
On 09/10/2013 08:47 AM, Simon Sapin wrote: Hi, TL;DR: the actual proposal is at the end of this email. Thanks for working on this. It's crucial. Rust today has good support for UTF-8, which new content definitely should use, but many systems still have to deal with legacy content that uses

Re: [rust-dev] Proposed API for character encodings

2013-09-13 Thread Simon Sapin
Here is an updated proposal, based on email and IRC feedback. The changes are: * Fix .feed() and .flush() to have the self parameter they need. * Remove the iterator stuff. I don’t find it super useful, and it’s easy enough to build on top of the "push" API. KISS. * Duplicate the "one shot"
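As an illustration of the "push" style mentioned above, here is a small sketch in present-day Rust of a decoder with `feed` and `flush` methods. It assumes a UTF-8-only decoder that always substitutes U+FFFD for invalid input; the type name `Utf8Decoder` and its `pending` buffer are hypothetical, and the real proposal's signatures and error handling may differ. `flush` takes `self` by value, which matches the note elsewhere in the thread that the decoder should be discarded after the end of input is signalled.

```rust
/// Hypothetical push-style UTF-8 decoder.
pub struct Utf8Decoder {
    /// Bytes of an incomplete sequence carried over between chunks.
    pending: Vec<u8>,
}

impl Utf8Decoder {
    pub fn new() -> Self {
        Utf8Decoder { pending: Vec::new() }
    }

    /// Decode as much of `input` as possible, appending to `output`.
    /// An incomplete trailing sequence is kept for the next call.
    pub fn feed(&mut self, input: &[u8], output: &mut String) {
        self.pending.extend_from_slice(input);
        let mut start = 0;
        loop {
            match std::str::from_utf8(&self.pending[start..]) {
                Ok(s) => {
                    output.push_str(s);
                    start = self.pending.len();
                    break;
                }
                Err(e) => {
                    let valid = start + e.valid_up_to();
                    // The prefix up to `valid` is known to be well-formed.
                    output.push_str(std::str::from_utf8(&self.pending[start..valid]).unwrap());
                    match e.error_len() {
                        // Definitely invalid bytes: replace and skip them.
                        Some(n) => {
                            output.push('\u{FFFD}');
                            start = valid + n;
                        }
                        // Possibly incomplete: wait for more input.
                        None => {
                            start = valid;
                            break;
                        }
                    }
                }
            }
        }
        self.pending.drain(..start);
    }

    /// Signal the end of input. Consumes the decoder, so it cannot be
    /// reused; a dangling partial sequence yields one final U+FFFD.
    pub fn flush(self, output: &mut String) {
        if !self.pending.is_empty() {
            output.push('\u{FFFD}');
        }
    }
}
```

The case `flush` exists for is a chunk boundary that falls in the middle of a multi-byte sequence: `feed` cannot tell whether more bytes will arrive, so only `flush` can decide that the sequence was truncated.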

Re: [rust-dev] Proposed API for character encodings

2013-09-11 Thread Simon Sapin
On 11/09/2013 17:19, Marvin Löbel wrote: On 09/10/2013 05:47 PM, Simon Sapin wrote: Hi, TL;DR: the actual proposal is at the end of this email. Rust today has good support for UTF-8, which new content definitely should use, but many systems still have to deal with legacy content that uses ot

Re: [rust-dev] Proposed API for character encodings

2013-09-11 Thread Marvin Löbel
On 09/10/2013 05:47 PM, Simon Sapin wrote: Hi, TL;DR: the actual proposal is at the end of this email. Rust today has good support for UTF-8, which new content definitely should use, but many systems still have to deal with legacy content that uses other character encodings. There are several

[rust-dev] Proposed API for character encodings

2013-09-10 Thread Simon Sapin
Hi, TL;DR: the actual proposal is at the end of this email. Rust today has good support for UTF-8, which new content definitely should use, but many systems still have to deal with legacy content that uses other character encodings. There are several projects around to implement more encodings