On Fri, 2008-09-05 at 00:43 +0200, Kinkie wrote:
> On Thu, Sep 4, 2008 at 7:30 PM, Alex Rousskov
> <[EMAIL PROTECTED]> wrote:
> > On Thu, 2008-09-04 at 18:00 +0200, Kinkie wrote:
> >
> >> > BTW, this is yet another case where a Tokenizer class would be better
> >> > than let-String-do-everything approach because a tokenizer object can at
> >> > any time return the current token, the current delimiter, and/or both,
> >> > without performance overhead or design complications.
> >>
> >> There is no such thing as "current delimiter"; it's supplied by the
> >> caller each time.
> >
> > According to your documentation the caller supplies a _set_ of delimiter
> > characters. Thus, the current or actual delimiter (i.e., the actual
> > character at the end of the returned token, if any) is unknown to the
> > caller if the caller used a multi-character delimiter set:
>
> Yes.
>
> You convinced me, but for a different reason.
> If the Tokenizer is a separate object, it must hold a reference to the
> KBuf it's parsing.
> If rather than a reference it holds a copy, this will have the
> practical effect of making the KBuf being parsed immutable.

Storing a reference to a String is prohibited!

Tokenizer has to keep a const copy of a String object. The underlying
memory buffer is refcounted and, hence, the buffer is not copied when
the String is.

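To illustrate why a const copy is cheap, here is a rough sketch. It is
not the real String: SketchString, Holder, owners() and demo() are
made-up names, and std::shared_ptr stands in for whatever refcounting
the real buffer uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a refcounted String (not the real class).
class SketchString {
public:
    explicit SketchString(const std::string &text)
        : buf_(std::make_shared<const std::string>(text)) {}

    // The implicit copy constructor copies only the shared_ptr, so the
    // character buffer is shared and just the reference count changes.
    const char *data() const { return buf_->data(); }
    std::size_t size() const { return buf_->size(); }
    long owners() const { return buf_.use_count(); }

private:
    std::shared_ptr<const std::string> buf_;
};

// Hypothetical parser-like object: keeps a const copy, not a reference.
class Holder {
public:
    explicit Holder(const SketchString &s) : input_(s) {}
    const SketchString &input() const { return input_; }

private:
    const SketchString input_;
};

// Usage: the copy adds an owner but allocates no new buffer.
inline void demo() {
    SketchString s("a,b;c");
    Holder h(s);
    assert(h.input().data() == s.data()); // same underlying buffer
    assert(s.owners() == 2);              // s and h.input() share it
}
```

The holder stays valid even if the caller's original object goes away,
which is the point of copying instead of storing a reference.
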
> Any preferences for the Tokenizer interface?

Just like String, the iterator interface is pretty standard. For our
Tokenizer, we can simplify it a little unless others think that
compatibility with standard library algorithms is worth the trouble.
Here is a sketch:

    class Tokenizer {
    public:
        Tokenizer(); // immediately atEnd
        Tokenizer(const String &aString, const String &delimiters);

        // current token, named and STL-like interfaces
        String token() const;
        String operator *() const { return token(); }

        // move to the next token, named and STL-like interfaces
        Tokenizer &operator ++() { advance(); return *this; }
        void advance();

        // end-of-file condition
        bool atEnd() const;

        // current delimiter (optional)
        String delimiter() const;
        ...
    };

I would not provide both named and STL-like interfaces. We should pick
one approach as it would simplify code maintenance and documentation. If
there are no strong opinions, let's use the interface with explicit
names.
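
For concreteness, a minimal sketch of the explicit-name variant could
look like the code below. It rests on several assumptions: std::string
stands in for String, findEnd() and collect() are made-up helpers, and
adjacent delimiters are assumed to yield empty tokens rather than being
skipped. None of this is settled by the interface above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Sketch only: std::string stands in for the refcounted String.
class Tokenizer {
public:
    Tokenizer() : start_(0), end_(0), atEnd_(true) {} // immediately atEnd
    Tokenizer(const std::string &aString, const std::string &delimiters)
        : input_(aString), delimiters_(delimiters),
          start_(0), end_(0), atEnd_(false) { findEnd(); }

    // current token
    std::string token() const { return input_.substr(start_, end_ - start_); }

    // delimiter that ended the current token; empty at end of input
    std::string delimiter() const {
        return end_ < input_.size() ? std::string(1, input_[end_])
                                    : std::string();
    }

    // move to the next token
    void advance() {
        if (atEnd_)
            return;
        if (end_ >= input_.size()) { // last token already returned
            start_ = end_ = input_.size();
            atEnd_ = true;
            return;
        }
        start_ = end_ + 1; // step over the single delimiter character
        findEnd();
    }

    bool atEnd() const { return atEnd_; }

private:
    // hypothetical helper: find where the current token stops
    void findEnd() {
        assert(start_ <= input_.size());
        end_ = input_.find_first_of(delimiters_, start_);
        if (end_ == std::string::npos)
            end_ = input_.size();
    }

    const std::string input_;      // const copy, never a reference
    const std::string delimiters_;
    std::string::size_type start_;
    std::string::size_type end_;
    bool atEnd_;
};

// hypothetical usage helper: gather (token, delimiter) pairs
std::vector<std::pair<std::string, std::string> > collect(Tokenizer t) {
    std::vector<std::pair<std::string, std::string> > out;
    for (; !t.atEnd(); t.advance())
        out.push_back(std::make_pair(t.token(), t.delimiter()));
    return out;
}
```

With input "a,b;c" and delimiter set ",;", the caller can see which
character actually ended each token, which is the information a
multi-character delimiter set otherwise hides.
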
HTH,
Alex.