On 12/09/2013 04:13 PM, Amos Jeffries wrote: > Two requests for additional scope:
> * can we place this is a separate src/parse/ library please? > - we have other generic parse code the deserves to all be bundled up > together instead of spread out. Might as well start that collection > process now. No objections. > * Lets do the charset boolean array earlier rather than later. No objections. I hope I will not be the one implementing the proposed API or the above additions though. :-) > CharacterSet(const char * const c, size_t len) You do not want the len argument. These character sets will be nearly always initialized once, from constant hard-coded c-strings. For esoteric cases, I would add an add(const char c) method to add a single character to the set (and use the merge operation to produce more elaborate sets as needed, see below for a sketch). It is a good idea to add a public label/name argument/member though, for debugging/triaging: CharacterSet(const char *label, const char *characters); ... const char *name; ///< a label for debugging > /// add all characters from the given CharacterSet to this one > void merge(const CharacterSet &src) const { Please provide a '+=' operator [that will use merge()]. Here is how I envision some sets might be created: const CharacterSet &MySpaces() { static CharacterSet cs; if (!cs.name) { cs = CharacterSet("MySpaces", "\n\t "); // RFC 12345 cs += OtherProtocolSpaces(); if (Config.weNeedToBeSpecial) cs.add('\r'); } return cs; } > private: > bool match_[256]; I suggest using std::vector<bool> to avoid sprinkling the code with 256 constants or adding another member to store the hard-coded length. I would also call this member chars_ instead of match_ because some sets will be used in negative sense and "match" will be a little confusing in that negative context. > NP: most of the time we will be wanting to define these CharacterSet as > global once-off objects. So I'm not sure if the merge() method is > useful, but shown here for completeness in case we want it for > generating composite character sets. Agreed on all counts. Cheers, Alex.