Hi,

Sorry for not responding more quickly. I've been wanting to sit down
and work out your example; I am confident that it can be made to work,
although from reading it quickly it sounds like a case that might be
better served with two lifetime parameters, which are not yet
supported (on my list...).
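
To make the two-lifetime idea concrete, here is a rough sketch of
what that could look like. It is written in present-day Rust syntax,
which does accept several lifetime parameters, and not in the
0.7-era syntax of the quoted code. The names ReaderTokenizer and
bytes_iter echo your example; BytesIter, the plain Vec<u8> buffer
with a pos cursor (standing in for CyclicBuffer), and the 4-byte
refill size are assumptions made up for illustration:

```rust
use std::io::Read;

// Hypothetical stand-in for the tokenizer in the quoted message: it
// borrows a reader for 'r and owns a small byte buffer.
struct ReaderTokenizer<'r> {
    inner: &'r mut dyn Read,
    buffer: Vec<u8>, // assumed replacement for CyclicBuffer
    pos: usize,      // index of the next unread byte in `buffer`
}

// Two lifetimes: 'a is the (short) borrow of the tokenizer itself,
// 'r is the (longer) borrow of the underlying reader.
struct BytesIter<'a, 'r> {
    tokenizer: &'a mut ReaderTokenizer<'r>,
}

impl<'r> ReaderTokenizer<'r> {
    fn new(inner: &'r mut dyn Read) -> ReaderTokenizer<'r> {
        ReaderTokenizer { inner, buffer: Vec::new(), pos: 0 }
    }

    // The iterator may not outlive this borrow of `self` ('a), but
    // 'a is free to be much shorter than 'r.
    fn bytes_iter<'a>(&'a mut self) -> BytesIter<'a, 'r> {
        BytesIter { tokenizer: self }
    }
}

impl<'a, 'r> Iterator for BytesIter<'a, 'r> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        let t = &mut *self.tokenizer;
        if t.pos == t.buffer.len() {
            // Buffer exhausted: refill it from the reader. In this
            // sketch an I/O error simply ends the iteration.
            let mut chunk = [0u8; 4];
            let n = t.inner.read(&mut chunk).ok()?;
            if n == 0 {
                return None; // end of input
            }
            t.buffer.clear();
            t.buffer.extend_from_slice(&chunk[..n]);
            t.pos = 0;
        }
        let b = t.buffer[t.pos];
        t.pos += 1;
        Some(b)
    }
}
```

Because the iterator borrows the tokenizer only for 'a, an
expression such as tok.bytes_iter().take(2) gives the tokenizer back
when the statement ends, so bytes_iter() can be called again later
and picks up where the buffer left off.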

However, I did want to briefly point out that you can continue to use
"internal" iterators; you just don't get the `for` syntax anymore.
Just write a higher-order function as you always did, possibly
returning bool to indicate whether to break or continue.
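
As a minimal sketch of that pattern (again in present-day Rust
syntax, with each_byte and read_until as hypothetical names, and
assuming the convention that the closure returns true to continue
and false to break):

```rust
// Internal iterator: calls `f` on each byte in turn until `f` returns
// false. Returns true if every byte was visited, false if `f` asked
// to stop early.
fn each_byte<F>(bytes: &[u8], mut f: F) -> bool
where
    F: FnMut(u8) -> bool,
{
    let mut i = 0;
    while i < bytes.len() {
        if !f(bytes[i]) {
            return false; // the caller's closure requested a break
        }
        i += 1;
    }
    true
}

// Example caller: collects bytes up to (not including) `sep`.
// The bool in the result says whether the separator was found.
fn read_until(bytes: &[u8], sep: u8) -> (Vec<u8>, bool) {
    let mut part = Vec::new();
    let finished = each_byte(bytes, |b| {
        if b == sep {
            false // break
        } else {
            part.push(b);
            true // continue
        }
    });
    (part, !finished)
}
```

The closure body plays the role of the loop body, and its return
value replaces `break` and `continue`; no iterator struct or
explicit lifetime is needed on the caller's side.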


Niko


On Sat, Aug 17, 2013 at 01:54:09PM +0400, Vladimir Matveev wrote:
> Hello,
> 
> I'm writing a simple tokenizer which is defined by this trait:
> 
> trait Tokenizer {
>     fn next_token(&mut self) -> ~str;
>     fn eof(&self) -> bool;
> }
> 
> An obvious application for a tokenizer is splitting a stream coming
> from a Reader, so I have the following structure which should
> implement Tokenizer:
> 
> pub struct ReaderTokenizer<'self> {
>     priv inner: &'self Reader,
>     priv buffer: ~CyclicBuffer,
>     priv seps: ~[~str]
> }
> 
> I have used the 'self lifetime parameter since I want the tokenizer
> to work with any Reader. CyclicBuffer is another structure which is
> essentially an array of u8 with special read/write operations.
> 
> Implementing Tokenizer for ReaderTokenizer involves reading from the
> Reader one byte at a time. I decided to use buffering to improve
> performance, but I still want to keep the useful abstraction of
> single-byte reading, so I decided to implement Iterator<u8> for my
> Reader+CyclicBuffer pair. BTW, internal iterators in 0.7 were much
> better for this, because internal iterator code was very simple and
> didn't use explicit lifetimes at all. However, the 0.7 compiler
> suffers from several errors related to pointers to traits, which
> prevented my program from compiling (I couldn't pass a reference to
> a Reader to a CyclicBuffer method; I encountered other errors too).
> So I decided to use the trunk version of the compiler, in which
> these errors are resolved according to GitHub, but the trunk version
> does not allow internal iterators, which is very sad since now I'm
> forced to create intermediate structures to achieve the same thing.
> 
> So, I came up with the following iterator structure:
> 
> struct RTBytesIterator<'self> {
>     tokenizer: &'self mut ReaderTokenizer<'self>
> }
> 
> impl<'self> Iterator<u8> for RTBytesIterator<'self> {
>     fn next(&mut self) -> Option<u8> {
>         if self.tokenizer.eof() {
>             return None;
>         }
>         if self.tokenizer.buffer.readable_bytes() > 0 ||
>            self.tokenizer.buffer.fill_from_reader(self.tokenizer.inner) > 0 {
>             return Some(self.tokenizer.buffer.read_unsafe());
>         } else {
>             return None;
>         }
>     }
> }
> 
> Note that the tokenizer field is &'self mut because the CyclicBuffer
> is mutated. The buffer.fill_from_reader() function reads as much as
> possible from the reader (returning the number of bytes read), and
> buffer.read_unsafe() returns the next byte from the cyclic buffer.
> 
> Then I've added the following method to ReaderTokenizer:
> 
> impl<'self> ReaderTokenizer<'self> {
> ...
>     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
>         RTBytesIterator { tokenizer: self }
>     }
> ...
> }
> 
> This does not compile with the following error:
> 
> io/convert_io.rs:98:37: 98:43 error: cannot infer an appropriate
> lifetime due to conflicting requirements
> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>                                                          ^~~~~~
> io/convert_io.rs:97:55: 99:5 note: first, the lifetime cannot outlive
> the anonymous lifetime #1 defined on the block at 97:55...
> io/convert_io.rs:97     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
> io/convert_io.rs:99     }
> io/convert_io.rs:98:37: 98:43 note: ...due to the following expression
> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>                                                          ^~~~~~
> io/convert_io.rs:97:55: 99:5 note: but, the lifetime must be valid for
> the lifetime &'self  as defined on the block at 97:55...
> io/convert_io.rs:97     fn bytes_iter(&mut self) -> RTBytesIterator<'self> {
> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
> io/convert_io.rs:99     }
> io/convert_io.rs:98:8: 98:23 note: ...due to the following expression
> io/convert_io.rs:98         RTBytesIterator { tokenizer: self }
>                             ^~~~~~~~~~~~~~~
> error: aborting due to previous error
> 
> OK, fair enough, I guess I have to annotate the self parameter with
> the 'self lifetime:
> 
>     fn bytes_iter(&'self mut self) -> RTBytesIterator<'self> {
>         RTBytesIterator { tokenizer: self }
>     }
> 
> This compiles, but now I'm getting another error at the bytes_iter()
> usage site; for example, the following code:
> 
>     fn try_read_sep(&mut self, first: u8) -> (~[u8], bool) {
>         let mut part = ~[first];
>         for b in self.bytes_iter() {
>             part.push(b);
>             if !self.is_sep_prefix(part) {
>                 return (part, false);
>             }
>             if self.is_sep(part) {
>                 break;
>             }
>         }
>         return (part, true);
>     }
> 
> fails to compile with this error:
> 
> io/convert_io.rs:117:17: 117:36 error: cannot infer an appropriate
> lifetime due to conflicting requirements
> io/convert_io.rs:117         for b in self.bytes_iter() {
>                                       ^~~~~~~~~~~~~~~~~~~
> io/convert_io.rs:117:17: 117:22 note: first, the lifetime cannot
> outlive the expression at 117:17...
> io/convert_io.rs:117         for b in self.bytes_iter() {
>                                       ^~~~~
> io/convert_io.rs:117:17: 117:22 note: ...due to the following expression
> io/convert_io.rs:117         for b in self.bytes_iter() {
>                                       ^~~~~
> io/convert_io.rs:117:17: 117:36 note: but, the lifetime must be valid
> for the method call at 117:17...
> io/convert_io.rs:117         for b in self.bytes_iter() {
>                                       ^~~~~~~~~~~~~~~~~~~
> io/convert_io.rs:117:17: 117:22 note: ...due to the following expression
> io/convert_io.rs:117         for b in self.bytes_iter() {
>                                       ^~~~~
> 
> And now I'm completely stuck. I can't avoid these errors at all. This
> looks like a bug to me, but I'm not completely sure - maybe it's me
> who is wrong here.
> 
> I've studied libstd/libextra code for clues and found out that some
> iterable structures have code very similar to mine, for example,
> RingBuf. Here is its mut_iter() method:
> 
>     pub fn mut_iter<'a>(&'a mut self) -> RingBufMutIterator<'a, T> {
>         RingBufMutIterator{index: 0, rindex: self.nelts, lo: self.lo,
>                            elts: self.elts}
>     }
> 
> I have tried to implement the bytes_iter() method like this, but it
> naturally didn't work because the 'a and 'self lifetimes conflict.
> As I understand it, this works for RingBuf because RingBuf does not
> have a lifetime parameter, so no conflict between the 'self and 'a
> lifetimes is possible at all. But this will not work in my case,
> because I need the 'self parameter for the &'self Reader field.
> 
> What can I do to implement my ReaderTokenizer? Maybe there are other
> ways of which I'm unaware?
> 
> Thank you very much in advance.
> 
> Best regards,
> Vladimir.
> _______________________________________________
> Rust-dev mailing list
> [email protected]
> https://mail.mozilla.org/listinfo/rust-dev
