Re: A case for opImplicitCast: making string search work better
downs wrote:
> Consider this type:
>
>     struct StringPosition {
>         size_t pos;
>         void opImplicitCast(out size_t sz) { sz = pos; }
>         void opImplicitCast(out bool b) { b = pos != -1; }
>     }
>
> Wouldn't that effectively sidestep most problems people have with find returning -1? Or am I missing something?

Could work, but it looks overcomplicated. It could be intuitive, but even then someone new would not be able to figure out what is actually going on without digging deep into the internals of the library (or the D language).

I like my way better (returning two slices for search). Also, it wouldn't require this:

> Of course, this would require a way to resolve ambiguities, i.e. functions/statements with preferences - for instance, if() would prefer bool over int. I don't know if this is possible.

...and with my way, it's very simple to check if the search was successful, e.g.:

    void myfind(char[] text, char[] search_for, out char[] before, out char[] after);

    char[] before, after;
    myfind(text, something, before, after);
    // was it found?
    bool was_found = !!after.length;
    // where was it found?
    size_t at = before.length;

Both operations are frequently needed and don't require you to reference text or something again, which means they can be values returned by other functions, and you don't need to break the flow by putting them into temporary variables.

With multiple return values, the signature of myfind() could become nicer, too:

    auto before, after = myfind(text, something);

(Or at least allow static arrays as return values for functions.)

Am _I_ missing something?
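A minimal D2 sketch of the out-parameter idea above, for illustration only. The body of myfind is an assumption (the post gives only the signature); it assumes that `after` starts at the match, which is what makes `after.length` usable as a "found" test, and that a miss leaves the whole text in `before`. It uses `std.string.indexOf` from Phobos for the actual search.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical implementation of the proposed signature.
// Assumption: on a hit, `after` begins at the match; on a miss,
// `before` is the whole text and `after` is empty.
void myfind(const(char)[] text, const(char)[] searchFor,
            out const(char)[] before, out const(char)[] after)
{
    immutable pos = text.indexOf(searchFor); // -1 when not found
    if (pos < 0)
    {
        before = text;
        after = null;
    }
    else
    {
        before = text[0 .. pos];
        after = text[pos .. $];
    }
}

void main()
{
    const(char)[] before, after;
    myfind("hello world", "wor", before, after);

    bool wasFound = after.length != 0; // was it found?
    size_t at = before.length;         // where was it found?
    writeln(wasFound, " ", at);        // prints "true 6"
}
```

Under these assumptions, searching for an empty string in an empty text is reported as "not found", which may or may not be the behaviour you want.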
Re: A case for opImplicitCast: making string search work better
On Fri, 15 May 2009 09:36:51 -0400, grauzone n...@example.net wrote:
> downs wrote:
>> Consider this type:
>>
>>     struct StringPosition {
>>         size_t pos;
>>         void opImplicitCast(out size_t sz) { sz = pos; }
>>         void opImplicitCast(out bool b) { b = pos != -1; }
>>     }
>>
>> Wouldn't that effectively sidestep most problems people have with find returning -1? Or am I missing something?
>
> Could work, but it looks overcomplicated. It could be intuitive, but even then someone new would not be able to figure out what is actually going on without digging deep into the internals of the library (or the D language).
>
> I like my way better (returning two slices for search). Also, it wouldn't require this:
>
>> Of course, this would require a way to resolve ambiguities, i.e. functions/statements with preferences - for instance, if() would prefer bool over int. I don't know if this is possible.
>
> ...and with my way, it's very simple to check if the search was successful, e.g.:
>
>     void myfind(char[] text, char[] search_for, out char[] before, out char[] after);
>
>     char[] before, after;
>     myfind(text, something, before, after);
>     // was it found?
>     bool was_found = !!after.length;
>     // where was it found?
>     size_t at = before.length;
>
> Both operations are frequently needed and don't require you to reference text or something again, which means they can be values returned by other functions, and you don't need to break the flow by putting them into temporary variables.
>
> With multiple return values, the signature of myfind() could become nicer, too:
>
>     auto before, after = myfind(text, something);
>
> (Or at least allow static arrays as return values for functions.)
>
> Am _I_ missing something?

Your solution actually goes in the opposite direction from what I'd like. That is, it looks more complicated than simply returning an index or a slice. I don't want to have to declare return values ahead of time, and I'm not holding my breath for multiple return values. You may be able to return a pair struct, but still, what could be simpler than returning an index? It's easy to construct the value you want (before or after), and if you want both values, that is also possible (and probably results in simpler code).

-Steve
Re: A case for opImplicitCast: making string search work better
> to return a pair struct, but still, what could be simpler than returning an index? It's easy to construct the value you want (before or after), and if you want both values, that is also possible (and probably results in simpler code).

All you can do with the index is:
1. compare it against the length of the searched string to test if the search was successful
2. slice the searched string
3. do something rather special

What else would you do? You'd just have to store the searched string as a temporary, and then you'd slice the searched string (for 2.), or compare the index against the length of the searched string (for 1.). You always have to keep the searched string in a temporary. That's rather impractical.

Oh sure, if you _really_ need the index (for 3.), then directly returning an index is of course the best way.

With my approach, you don't need to grab the passed searched string again. All of these can be done in a single, trivial expression (for 3., getting the index only).

Actually, compared to your approach, this would just eliminate the trivial but annoying slicing code after the search call, that you'd type in... what, 90% of all cases?

The thing about multiple return values is true (sadly), but in this case, you could simply return a static array (char[][2]). At least that should be possible in D2 at some point.

Maybe a struct would work fine too. But I don't like it, because the programmer would have to look up the struct members first. He would have to memorize the struct members, and couldn't tell what the function returns just by looking at the function signature.

(Yay bikeshed issues.)
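To illustrate the static-array variant mentioned above, here is a rough D2 sketch. The names findSlices and fetchLine are made up for this example, and the convention (element 0 is the text before the match, element 1 starts at the match and is empty on a miss) is an assumption carried over from the two-slice proposal. It shows the case where the searched string is a temporary that never needs a name.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical function: returns [before, after] as a fixed-size array of slices.
const(char)[][2] findSlices(const(char)[] text, const(char)[] needle)
{
    const(char)[][2] result;
    immutable pos = text.indexOf(needle); // -1 when not found
    if (pos < 0)
    {
        result[0] = text;            // miss: everything counts as "before"
    }
    else
    {
        result[0] = text[0 .. pos];
        result[1] = text[pos .. $];  // starts at the match
    }
    return result;
}

// Stand-in for any call that produces the searched string as a temporary.
string fetchLine() { return "name: value"; }

void main()
{
    auto parts = findSlices(fetchLine(), ": ");
    // success test and index, without naming the searched string
    writeln(parts[1].length != 0, " ", parts[0].length); // prints "true 4"
}
```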
Re: A case for opImplicitCast: making string search work better
On Fri, 15 May 2009 10:30:17 -0400, grauzone n...@example.net wrote:
>> to return a pair struct, but still, what could be simpler than returning an index? It's easy to construct the value you want (before or after), and if you want both values, that is also possible (and probably results in simpler code).
>
> All you can do with the index is:
> 1. compare it against the length of the searched string to test if the search was successful
> 2. slice the searched string
> 3. do something rather special
>
> What else would you do? You'd just have to store the searched string as a temporary, and then you'd slice the searched string (for 2.), or compare the index against the length of the searched string (for 1.). You always have to keep the searched string in a temporary. That's rather impractical.
>
> Oh sure, if you _really_ need the index (for 3.), then directly returning an index is of course the best way.
>
> With my approach, you don't need to grab the passed searched string again. All of these can be done in a single, trivial expression (for 3., getting the index only).
>
> Actually, compared to your approach, this would just eliminate the trivial but annoying slicing code after the search call, that you'd type in... what, 90% of all cases?

I hadn't thought of the case where you are calling *on* a temporary; I always had in mind that the source string was already declared. This is a good point. The only drawback in this case is that you are constructing information you sometimes do not need or care about. If all you want is whether it succeeded or not, then you don't need two ranges constructed and returned. But therein lies a fundamental tradeoff that cannot be avoided. The very basic information you get is the index, and with that, you can construct any larger pieces from the pieces you have, but not always easily, and not without repeating identifiers.

I like your approach, but with a single return type, not out parameters. Having out parameters would be a deal breaker.

I'd prefer not to have two strings but a string that has an identified pivot point. You could generate the desired left and right hand sides dynamically, and it would work without any changes to the current syntax.

for example:

    struct partition(R)
    {
        R range;
        uint pivot;
        R lhs() { return range[0..pivot]; }
        R rhs() { return range[pivot..$]; }
        bool found() { return pivot < range.length; }
    }

    partition!string indexOf(string haystack, dchar needle);

usage:

    string s = str.find(hi).rhs; // or .lhs or .found or .pivot

> Maybe a struct would work fine too. But I don't like it, because the programmer would have to look up the struct members first. He would have to memorize the struct members, and couldn't tell what the function returns just by looking at the function signature.

If this were implemented, the return type would be very common. At some point you have to look up everything (what's a range?).

-Steve
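The pivot idea above could be fleshed out roughly as follows in D2. This is only a sketch under a few assumptions: the search function is renamed findPivot (a made-up name, to avoid clashing with Phobos' own `indexOf`), the pivot is stored as `size_t`, and a miss is represented by a pivot equal to the length of the range.

```d
import std.stdio : writeln;
import std.string : indexOf;

// A range plus an identified pivot point; the halves are built on demand.
struct Partition(R)
{
    R range;
    size_t pivot; // assumption: equals range.length when nothing was found

    R lhs() { return range[0 .. pivot]; }
    R rhs() { return range[pivot .. $]; }
    bool found() { return pivot < range.length; }
}

// Hypothetical search function returning the partition.
Partition!string findPivot(string haystack, dchar needle)
{
    immutable pos = haystack.indexOf(needle); // -1 when not found
    return Partition!string(haystack,
        pos < 0 ? haystack.length : cast(size_t) pos);
}

void main()
{
    // works on a temporary: no need to name the searched string
    string s = "say hi there".findPivot('h').rhs;
    writeln(s); // prints "hi there"
}
```

Nothing is sliced until lhs or rhs is called, so a caller who only wants found or pivot does not pay for constructing the two halves.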
Re: A case for opImplicitCast: making string search work better
downs wrote:
> Consider this type:
>
>     struct StringPosition {
>         size_t pos;
>         void opImplicitCast(out size_t sz) { sz = pos; }
>         void opImplicitCast(out bool b) { b = pos != -1; }
>     }
>
> Wouldn't that effectively sidestep most problems people have with find returning -1? Or am I missing something?
>
> Of course, this would require a way to resolve ambiguities, i.e. functions/statements with preferences - for instance, if() would prefer bool over int. I don't know if this is possible.

Just use two functions: find and contains.
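A small D2 sketch of the two-function split suggested above. Both functions are hypothetical thin wrappers written for this example; they are built on `std.string.indexOf` and keep its convention of returning -1 on a miss.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical wrappers: one answers "where?", the other answers "is it there?".
ptrdiff_t find(const(char)[] text, const(char)[] needle)
{
    return text.indexOf(needle); // -1 when not found
}

bool contains(const(char)[] text, const(char)[] needle)
{
    return text.indexOf(needle) >= 0;
}

void main()
{
    string text = "hello";
    if (text.contains("ell"))             // yes/no question, no sentinel comparison
        writeln("at ", text.find("ell")); // prints "at 1"
}
```

The cost of this split is that asking both questions searches the string twice.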
Re: A case for opImplicitCast: making string search work better
Christopher Wright:
> Just use two functions: find and contains.

Or better, define a built-in operator; you may call it "in" :-)

    'e' in hello = true

(The compiler may even cache the resulting position somewhere, so a successive find can be very fast.)

Bye,
bearophile
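The post above asks for a built-in operator on strings. As a user-level approximation only, D2 lets a type overload `in` through `opBinaryRight`. The wrapper type Text below is made up for this sketch, and it does none of the position caching mentioned above.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical wrapper type; not part of Phobos.
struct Text
{
    string data;

    // Enables `c in someText`; returns a plain bool here,
    // unlike the pointer result of `in` on associative arrays.
    bool opBinaryRight(string op)(dchar c) if (op == "in")
    {
        return data.indexOf(c) >= 0;
    }
}

void main()
{
    auto hello = Text("hello");
    writeln('e' in hello); // prints "true"
    writeln('z' in hello); // prints "false"
}
```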
Re: A case for opImplicitCast: making string search work better
> a good point. The only drawback in this case is that you are constructing information you sometimes do not need or care about. If all you want is whether it succeeded or not, then you don't need two ranges constructed and returned. But therein lies a fundamental tradeoff that cannot be avoided. The very basic information you get is the index, and with that, you can construct any larger pieces from the pieces you have, but not always easily, and not without repeating identifiers.

The whole point of the search function is to make programming easier, isn't it? Its implementation is rather trivial. You call it because it makes your life easier. I don't see why constructing this additional information is a problem. Anyway, you could always move this to a second function. I just think that returning a tuple of slices is the most useful way.

> I like your approach, but with a single return type, not out parameters. Having out parameters would be a deal breaker.

I just wanted to show something that works on D1 without memory allocation, and without returning a struct.

> If this were implemented, the return type would be very common. At some point you have to look up everything (what's a range?).

I think multiple return values are simpler, and more versatile, elegant and intuitive. In contrast, having to define structs for return values of (almost) trivial functions is not a good sign. You could as well pass all in-parameters of a function as a struct, claiming this is more practical, because then you can have named arguments and arbitrary default arguments. Huh.
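For comparison, current Phobos offers something close to the "tuple of slices" idea: `std.algorithm.searching.findSplit` returns three parts (before the match, the match itself, after the match). It is not the function proposed in this thread, only a related existing one. A short D2 sketch:

```d
import std.algorithm.searching : findSplit;
import std.stdio : writeln;

void main()
{
    // parts[0] = before the match, parts[1] = the match, parts[2] = after it
    auto parts = "key=value".findSplit("=");
    writeln(parts[0], " ", parts[2]); // prints "key value"
    writeln(parts[1].length != 0);    // prints "true": the match part is non-empty

    // On a miss, parts[0] is the whole input and the other two parts are empty.
    auto miss = "key".findSplit("=");
    writeln(miss[0], " ", miss[1].length); // prints "key 0"
}
```

The result is still a library-defined tuple type rather than true multiple return values, so the objection about having to look up the members applies to it as well.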