Hi Niko,

Thank you for your response. I forgot to include it in the original
message, but here is a link to the Bitbucket repository with all the
code I have so far: https://bitbucket.org/googolplex/algo/src (see the
`io` module). Maybe it will be helpful.

I'm still inclined to think that this is a bug, and I really do not
see how I can do what I want with two lifetime parameters. There
really should be only one lifetime (that of the Reader), or maybe I'm
missing something?

As for internal iterators, yes, I tried that first, but since it is
impossible to break out of or return from inside the iterator loop
directly via `break` or `return`, it quickly becomes unwieldy: I have
to introduce a lot of boilerplate boolean flags and check them in many
places. This may not be so apparent for finite structures, but for
potentially infinite ones (like a wrapper for `Reader`) it is.
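
To make the boilerplate concrete, here is a minimal sketch of the
pattern. It is not the code from my repository: it is written in the
syntax of later Rust releases (closures, `Vec<u8>` instead of `~[u8]`),
it reads from a plain byte slice instead of a Reader, and `each_byte`
and `read_until_sep_internal` are hypothetical names made up for
illustration.

```rust
// Hypothetical internal iterator: calls `f` for every byte and stops
// as soon as `f` returns false. Returns false if it was stopped early.
fn each_byte<F: FnMut(u8) -> bool>(bytes: &[u8], mut f: F) -> bool {
    for &b in bytes {
        if !f(b) {
            return false;
        }
    }
    true
}

// Collects bytes up to and including the first `sep` and reports
// whether the separator was seen. The closure cannot `return` from
// the enclosing function, so the outcome has to be carried out of the
// loop through the `found` flag.
fn read_until_sep_internal(bytes: &[u8], sep: u8) -> (Vec<u8>, bool) {
    let mut part = Vec::new();
    let mut found = false; // boilerplate flag
    each_byte(bytes, |b| {
        part.push(b);
        if b == sep {
            found = true;
            return false; // plays the role of `break`
        }
        true // plays the role of `continue`
    });
    (part, found)
}
```

With one exit condition this is tolerable; with several nested loops
and several reasons to stop, every level needs its own flag and its
own check after the call.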

Nonetheless, as far as I understand, external iterators are the future
of iteration in Rust (generators won't appear in the near future,
will they?), so I want to use the most idiomatic style.
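
As a rough illustration of why I prefer that style, here is a sketch
of an external iterator whose consumer can leave the loop with a plain
`return`. It uses the syntax of later Rust releases and iterates over
a borrowed byte slice rather than a Reader, so it sidesteps the
lifetime problem from my original message entirely; `SliceBytes` and
`read_until_sep_external` are hypothetical names.

```rust
// Hypothetical external iterator over a borrowed byte slice.
struct SliceBytes<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Iterator for SliceBytes<'a> {
    type Item = u8;

    // Yields the next byte, or None once the slice is exhausted.
    fn next(&mut self) -> Option<u8> {
        let b = self.data.get(self.pos).copied()?;
        self.pos += 1;
        Some(b)
    }
}

// Collects bytes up to and including the first `sep` and reports
// whether the separator was seen. Because the loop is an ordinary
// `for`, the early exit is a direct `return` with no extra flags.
fn read_until_sep_external(data: &[u8], sep: u8) -> (Vec<u8>, bool) {
    let mut part = Vec::new();
    let bytes = SliceBytes { data, pos: 0 };
    for b in bytes {
        part.push(b);
        if b == sep {
            return (part, true);
        }
    }
    (part, false)
}
```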

2013/8/20 Niko Matsakis <[email protected]>:
> Hi,
>
> Sorry for not responding more quickly. I've been wanting to sit down
> and work out your example; I am confident that it can be made to work,
> although from reading it quickly it sounds like a case that might be
> better served with two lifetime parameters, which are not yet
> supported (on my list...).
>
> However, I did want to briefly point out that you can continue to use
> "internal" iterators, you just don't get the `for` syntax
> anymore. Just write a higher-order function as you always did,
> possibly returning bool to indicate whether to break or continue.
>
>
> Niko
>
>
> On Sat, Aug 17, 2013 at 01:54:09PM +0400, Vladimir Matveev wrote:
>> Hello,
>>
>> I'm writing a simple tokenizer which is defined by this trait:
>>
>> trait Tokenizer {
>>     fn next_token(&mut self) -> ~str;
>>     fn eof(&self) -> bool;
>> }
>>
>> An obvious application for a tokenizer is splitting a stream coming
>> from a Reader, so I have the following structure which should
>> implement Tokenizer:
>>
>> pub struct ReaderTokenizer<'self> {
>>     priv inner: &'self Reader,
>>     priv buffer: ~CyclicBuffer,
>>     priv seps: ~[~str]
>> }
>>
>> I have used the 'self lifetime parameter since I want the tokenizer
>> to work for any Reader. CyclicBuffer is another structure which
>> essentially is an array of u8 with special read/write operations.
>>
>> Implementation of a Tokenizer for ReaderTokenizer involves reading
>> from the Reader one byte at a time. I decided to use buffering to
>> improve performance. But I still want to keep the useful abstraction
>> of single byte reading, so I decided to implement Iterator<u8> for my
>> Reader+CyclicBuffer pair. BTW, internal iterators in 0.7 were much
>> better for this, because internal iterator code was very simple and
>> didn't use explicit lifetimes at all, but 0.7 compiler suffers from
>> several errors related to pointers to traits which prevented my
>> program from compiling (I couldn't pass a reference to Reader to
>> CyclicBuffer method; there were other errors I've encountered too).
>> So, I decided to use trunk version of the compiler in which these
>> errors are resolved according to github, but trunk version does not
>> allow internal iterators, which is very sad since now I'm forced to
>> create intermediate structures to achieve the same thing.
>>
>> So, I came up with the following iterator structure:
>>
>> struct RTBytesIterator<'self> {
>>     tokenizer: &'self mut ReaderTokenizer<'self>
>> }
>>
>> impl<'self> Iterator<u8> for RTBytesIterator<'self> {
>>     fn next(&mut self) -> Option<u8> {
>>         if self.tokenizer.eof() {
>>             return None;
>>         }
>>         if self.tokenizer.buffer.readable_bytes() > 0 ||
>>            self.tokenizer.buffer.fill_from_reader(self.tokenizer.inner) > 0 {
>>             return Some(self.tokenizer.buffer.read_unsafe());
>>         } else {
>>             return None;
>>         }
>>     }
>> }
>>
>> Note that tokenizer field is &'self mut since CyclicBuffer is mutable.
>> buffer.fill_from_reader() function reads as much as possible from the
>> reader (returning a number of bytes read), and buffer.read_unsafe()
>> returns next byte from the cyclic buffer.
>>
>> Then I've added the following method to ReaderTokenizer:
>>
>> impl<'self> ReaderTokenizer<'self> {
>> ...
>>     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>>         RTBytesIterator { tokenizer: self }
>>     }
>> ...
>> }
>>
>> This does not compile with the following error:
>>
>> io/convert_io.rs:98:37: 98:43 error: cannot infer an appropriate
>> lifetime due to conflicting requirements
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>>                                                          ^~~~~~
>> io/convert_io.rs:97:55: 99:5 note: first, the lifetime cannot outlive
>> the anonymous lifetime #1 defined on the block at 97:55...
>> io/convert_io.rs:97     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>> io/convert_io.rs:99     }
>> io/convert_io.rs:98:37: 98:43 note: ...due to the following expression
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>>                                                          ^~~~~~
>> io/convert_io.rs:97:55: 99:5 note: but, the lifetime must be valid for
>> the lifetime &'self  as defined on the block at 97:55...
>> io/convert_io.rs:97     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>> io/convert_io.rs:99     }
>> io/convert_io.rs:98:8: 98:23 note: ...due to the following expression
>> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>>                             ^~~~~~~~~~~~~~~
>> error: aborting due to previous error
>>
>> OK, fair enough, I guess I have to annotate the self parameter with
>> the 'self lifetime:
>>
>>     fn bytes_iter(&'self mut self) -> RTBytesIterator<'self> {
>>         RTBytesIterator { tokenizer: self }
>>     }
>>
>> This compiles, but now I'm getting another error at bytes_iter() usage
>> site, for example, the following code:
>>
>>     fn try_read_sep(&mut self, first: u8) -> (~[u8], bool) {
>>         let mut part = ~[first];
>>         for b in self.bytes_iter() {
>>             part.push(b);
>>             if !self.is_sep_prefix(part) {
>>                 return (part, false);
>>             }
>>             if self.is_sep(part) {
>>                 break;
>>             }
>>         }
>>         return (part, true);
>>     }
>>
>> fails to compile with this error:
>>
>> io/convert_io.rs:117:17: 117:36 error: cannot infer an appropriate
>> lifetime due to conflicting requirements
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~~~~~~~~~~~~~~~
>> io/convert_io.rs:117:17: 117:22 note: first, the lifetime cannot
>> outlive the expression at 117:17...
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~
>> io/convert_io.rs:117:17: 117:22 note: ...due to the following expression
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~
>> io/convert_io.rs:117:17: 117:36 note: but, the lifetime must be valid
>> for the method call at 117:17...
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~~~~~~~~~~~~~~~
>> io/convert_io.rs:117:17: 117:22 note: ...due to the following expression
>> io/convert_io.rs:117         for b in self.bytes_iter() {
>>                                       ^~~~~
>>
>> And now I'm completely stuck. I can't avoid these errors at all. This
>> looks like a bug to me, but I'm not completely sure - maybe it's me
>> who is wrong here.
>>
>> I've studied libstd/libextra code for clues and found out that some
>> iterable structures have code very similar to mine, for example,
>> RingBuf. Here is its mut_iter() method:
>>
>>     pub fn mut_iter<'a>(&'a mut self) -> RingBufMutIterator<'a, T> {
>>         RingBufMutIterator{index: 0, rindex: self.nelts, lo: self.lo,
>>                            elts: self.elts}
>>     }
>>
>> I have tried to implement the bytes_iter() method like this, but it
>> naturally didn't work because the 'a and 'self lifetimes conflict. In
>> my understanding, this works for RingBuf because it does not have a
>> lifetime parameter, so no conflict between the 'self and 'a lifetimes
>> is possible at all. But this will not work in my case, because I have
>> to have the 'self parameter because of the &'self Reader field.
>>
>> What can I do to implement my ReaderTokenizer? Maybe there are other
>> ways of which I'm unaware?
>>
>> Thank you very much in advance.
>>
>> Best regards,
>> Vladimir.
>> _______________________________________________
>> Rust-dev mailing list
>> [email protected]
>> https://mail.mozilla.org/listinfo/rust-dev
