eliminate junk from std.string?
There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it. What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino. Thanks, Andrei
Re: eliminate junk from std.string?
I think the tr, replace, and translate functions are a bit awkward.
Re: eliminate junk from std.string?
IIRC someone on this NG mentioned that several functions are going away from std.string and into std.algorithm. This would be nice, considering I frequently get name clashes when importing both modules (but at least there's no function hijacking. Thanks, D!).
Re: eliminate junk from std.string?
On Sunday 09 January 2011 15:19:31 Jimmy Cao wrote:
> I think the tr, replace, and translate functions are a bit awkward.

Really? I use replace() fairly heavily in string-processing code, and I don't see anything about it which could be considered awkward.

tr() is definitely cool. I do think that it is a bit awkward if you want to deal with the modifiers, but my only real gripe is that I can't replace a character with a string of characters (e.g. replace all ":" with " - ") or a string of characters with a single character (e.g. replace all " - " with ":"). But I don't see a good way to do that. Maybe something with regexes would be better. I don't know. But the alternative to tr() in many cases is multiple calls to replace(), which would cause unnecessary heap allocations.

- Jonathan M Davis
Re: eliminate junk from std.string?
On Sun, 09 Jan 2011 16:51:57 -0600, Andrei Alexandrescu wrote:
> There's a lot of junk in std.string that should be gone. I'm trying to
> motivate myself to port some functions to different string widths and...
> it's not worth it.
>
> What functions do you think we should remove from std.string? Let's make
> a string and then send them the way of the dino.

My suggestions for things to remove:

hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
  - What are these arrays useful for?

capwords()
  - It tries to do too much.

zfill()
  - The ljustify(), rjustify(), and center() functions should instead take an optional padding character that defaults to a space.

maketrans(), translate()
  - I don't even understand what these do.

inPattern(), countchars(), removechars()
  - Pattern matching is std.regex's charter.

squeeze(), succ(), tr(), soundex(), column()
  - I am having a very hard time imagining myself ever using these functions...

-Lars
Re: eliminate junk from std.string?
Lars T. Kyllingstad:
> My suggestions for things to remove:
>
> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>   - What are these arrays useful for?
>
> capwords()
>   - It tries to do too much.
>
> zfill()
>   - The ljustify(), rjustify(), and center() functions should instead take an optional padding character that defaults to a space.
>
> maketrans(), translate()
>   - I don't even understand what these do.
>
> inPattern(), countchars(), removechars()
>   - Pattern matching is std.regex's charter.
>
> squeeze(), succ(), tr(), soundex(), column()
>   - I am having a very hard time imagining myself ever using these functions...

I agree with about nothing you have said :-)

How much string processing do you do day by day? I am using most of those things... If you are used to Python or Ruby you will probably find most of those things useful. If Andrei removes arrays like lowercase, letters, uppercase, I will have to write them myself in code.

ljustify(), rjustify(), and center() are very useful, even if they may be improved in some ways.

maketrans() and translate() (as other things) come from Python string functions, and I have used them a hundred times in string processing code. I have used squeeze() sometimes.

soundex is not hurting, because even if it's not commonly necessary, its name is easy to understand and it's not easy to mistake for something different, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex wrongly, while the one in std.string is correct.

I agree that too much stuff is generally bad in a library, because searching for something requires more time if there are more items to search through. In Bugzilla I have three or four bug reports that ask for a few small changes in std.string (like removing chop and keeping chomp). But please don't remove too much. In a library more is often better.

Bye,
bearophile
Re: eliminate junk from std.string?
On Mon, 10 Jan 2011 03:41:51 -0500, bearophile wrote:
> Lars T. Kyllingstad:
>
>> My suggestions for things to remove:
>>
>> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>>   - What are these arrays useful for?
>>
>> capwords()
>>   - It tries to do too much.
>>
>> zfill()
>>   - The ljustify(), rjustify(), and center() functions should instead take an optional padding character that defaults to a space.
>>
>> maketrans(), translate()
>>   - I don't even understand what these do.
>>
>> inPattern(), countchars(), removechars()
>>   - Pattern matching is std.regex's charter.
>>
>> squeeze(), succ(), tr(), soundex(), column()
>>   - I am having a very hard time imagining myself ever using these functions...
>
> I agree with about nothing you have said :-)
>
> How much string processing do you do day by day? I am using most of those things... If you are used to Python or Ruby you will probably find most of those things useful. If Andrei removes arrays like lowercase, letters, uppercase, I will have to write them myself in code.
>
> ljustify(), rjustify(), and center() are very useful, even if they may be improved in some ways. maketrans() and translate() (as other things) come from Python string functions, and I have used them a hundred times in string processing code. I have used squeeze() sometimes. soundex is not hurting, because even if it's not commonly necessary, its name is easy to understand and it's not easy to mistake for something different, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex wrongly, while the one in std.string is correct.

I think you may have misunderstood some of my suggestions. For instance, I never proposed to remove ljustify(), rjustify(), and center(). Rather, I would have them take an extra 'padding' parameter, so we can eliminate zfill().

    Before: auto s = zfill("123", 6);
    After:  auto s = rjustify("123", 6, '0');

As for the other things I suggested, well... those are the things I vote to remove from std.string. If they only get that one vote, they stay. ;)

By the way, since you seem to be using these things quite often, maybe you can answer this:

1. What are the hexdigits, digits, octdigits, lowercase, letters, uppercase, and whitespace arrays useful for? The only thing I can think of is to check whether a character belongs to one of them, but I think that is better done with the std.ctype functions.

2. What do maketrans() and translate() do? (A brief example would be nice.)

-Lars
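The padding-parameter idea above can be tried against a current Phobos, where the right-justify function is spelled rightJustify and accepts a fill character as its third argument. The sketch below is only an illustration under that assumption; zeroPad is a hypothetical helper name, not something from the thread or from Phobos.

```d
import std.string : rightJustify;

// Hypothetical helper: behaves like the zfill() call in the "Before" line
// by right-justifying with '0' as the padding character.
string zeroPad(string s, size_t width)
{
    return rightJustify(s, width, '0');
}

void main()
{
    assert(zeroPad("123", 6) == "000123");
    // Input that is already wide enough comes back unchanged.
    assert(zeroPad("1234567", 6) == "1234567");
}
```

With the padding character as a parameter, a separate zero-fill function adds nothing that a one-line wrapper cannot provide.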
Re: eliminate junk from std.string?
"Lars T. Kyllingstad" wrote in message news:igeia5$1t4...@digitalmars.com... > > 1. What are the hexdigits, digits, octdigits, lowercase, letters, > uppercase, and whitespace arrays useful for? The only thing I can think > of is to check whether a character belongs to one of them, but I think > that is better done with the std.ctype functions. > They're good for people like me who never noticed std.ctype ;)
Re: eliminate junk from std.string?
On 1/10/11 2:41 AM, bearophile wrote:
> Lars T. Kyllingstad:
>> My suggestions for things to remove:
>>
>> hexdigits, digits, octdigits, lowercase, letters, uppercase, whitespace
>>   - What are these arrays useful for?
>>
>> capwords()
>>   - It tries to do too much.
>>
>> zfill()
>>   - The ljustify(), rjustify(), and center() functions should instead take an optional padding character that defaults to a space.
>>
>> maketrans(), translate()
>>   - I don't even understand what these do.
>>
>> inPattern(), countchars(), removechars()
>>   - Pattern matching is std.regex's charter.
>>
>> squeeze(), succ(), tr(), soundex(), column()
>>   - I am having a very hard time imagining myself ever using these functions...
>
> I agree with about nothing you have said :-)
>
> How much string processing do you do day by day? I am using most of those things... If you are used to Python or Ruby you will probably find most of those things useful. If Andrei removes arrays like lowercase, letters, uppercase, I will have to write them myself in code.

The arrays letters, uppercase, and lowercase aren't all that useful because they only make sense for ASCII. Besides, they should be encoded as functions.

> ljustify(), rjustify(), and center() are very useful, even if they may be improved in some ways.

Hmmm. I suspected everyone's list will be different :o). I personally think the justification and centering functions are rarely useful - how often does one need to justify plain text? If you generate HTML the markup will do that for you, and if you generate some nice text then the font will be proportional, so the functions are useless. Nevertheless, I ported them (and also fixed them - they were broken for anything non-ASCII, which probably is telling of the extent of their usage).

What are your use cases for these three functions?

> maketrans() and translate() (as other things) come from Python string functions, and I have used them a hundred times in string processing code. I have used squeeze() sometimes. soundex is not hurting, because even if it's not commonly necessary, its name is easy to understand and it's not easy to mistake for something different, so it doesn't add much noise to the library. And I've seen that it's easy to implement soundex wrongly, while the one in std.string is correct.

I think maketrans/translate are okay (if a bit arcane) but they need to be ported to Unicode. Python apparently does mind Unicode as of 3.x, although I'm not sure exactly what the semantics are: http://stackoverflow.com/questions/3031045/how-come-string-maketrans-does-not-work-in-python-3-1. One odd thing is that you'd expect a dynamic language like Python to dynamically detect ASCII vs. non-ASCII. The example shows that Python rejects string-based translation tables even when they are, in fact, ASCII.

> I agree that too much stuff is generally bad in a library, because searching for something requires more time if there are more items to search through. In Bugzilla I have three or four bug reports that ask for a few small changes in std.string (like removing chop and keeping chomp). But please don't remove too much. In a library more is often better.

I think we should remove all functions that rely on patterns represented as strings: inPattern, countchars, removechars, squeeze, munch. Representing patterns as a convention on top of otherwise untyped strings doesn't seem a good solution for D. We should either go with regex or with a simple pattern structure and a helper function. That way people can say e.g. munch(s, pattern("[0-9]")).

Andrei
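Since the thread asks what translate() does, here is a minimal sketch of character-for-character translation. It assumes a current Phobos, where std.string.translate accepts a dchar[dchar] table (so the table is not limited to ASCII); the 2011 maketrans()/translate() pair discussed above worked on a string-based table instead. The helper name leet is hypothetical.

```d
import std.string : translate;

// Hypothetical helper: replaces each character that has an entry in the
// table and leaves every other character untouched.
string leet(string s)
{
    dchar[dchar] table = ['a': '4', 'e': '3', 'o': '0'];
    return translate(s, table);
}

void main()
{
    assert(leet("hello world") == "h3ll0 w0rld");
}
```

A single pass over the input does all the substitutions, which is the advantage over chaining several replace() calls.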
Re: eliminate junk from std.string?
> What are your use cases for these three functions? I don't know about bearophile, but I used a lot of the functions you are talking about removing in my HTML -> Plain Text conversion function used for emails and other similar environments. squeeze the whitespace, align text, wrap for the target, etc.
Re: eliminate junk from std.string?
Speaking of regex.. I see there are two enums in std.regex, email and url, which are regular expressions. Why not collect more of these common regexes? And we could pack them up in a struct to avoid polluting the local namespace. I think this might encourage the use of std.regex, since the average Joe wouldn't have to reach for the regex book whenever he's processing strings. E.g.:

    foreach(m; match("10abc20def30", regex(patterns.number))) // std.regex.patterns.number
    {
        writefln("%s[%s]%s", m.pre, m.hit, m.post);
    }

Just a passing thought..
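A rough sketch of how such a namespace could look, for anyone who wants to experiment. Everything here is an assumption for illustration: std.regex does not ship a patterns struct, the number pattern is a guess at what the post had in mind, and numbersIn is a hypothetical helper. It uses matchAll from current Phobos to iterate over every match.

```d
import std.regex : matchAll, regex;

// Hypothetical namespace for commonly used patterns (not part of std.regex).
struct patterns
{
    enum number = `[0-9]+`;
}

// Hypothetical helper: collects every run of digits found in the input.
string[] numbersIn(string s)
{
    string[] hits;
    foreach (m; matchAll(s, regex(patterns.number)))
        hits ~= m.hit;
    return hits;
}

void main()
{
    assert(numbersIn("10abc20def30") == ["10", "20", "30"]);
}
```

Because the patterns are enum members of a struct, they only become visible through the patterns. prefix and do not clash with local names.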
Re: eliminate junk from std.string?
"Andrej Mitrovic" wrote in message news:mailman.543.1294713068.4748.digitalmar...@puremagic.com... > Speaking of regex.. I see there are two enums in std.regex, email and > url, which are regular expressions. Why not collect more of these > common regexes? And we could pack them up in a struct to avoid > polluting the local namespace. I think this might encourage the use of > std.regex, since the average Joe wouldn't have to reach for the regex > book whenever he's processing strings. E.g.: > > foreach(m; match("10abc20def30", regex(patterns.number))) // > std.regex.patterns.number > { >writefln("%s[%s]%s", m.pre, m.hit, m.post); > } > > Just a passing thought.. I think that's a great idea.
Re: eliminate junk from std.string?
On 1/9/11 4:51 PM, Andrei Alexandrescu wrote: There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it. What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino. Thanks, Andrei I have uploaded a preview of the changed APIs here: http://d-programming-language.org/cutting-edge/phobos/std_string.html Let's work together on improving things. Andrei
Re: eliminate junk from std.string?
Hi Andrei,

It looks nice. Just a small comment: in many of your comments you use words that not all of us might know. For instance: "sans". I happen to know it because I studied French, but otherwise I wouldn't know that. I just showed that phrase to a colleague here in Argentina and he didn't understand it. He thought it maybe meant "since". Maybe "sans" and "in lieu" are memes there in the USA, but not everywhere. So please, stick with English. :-)
Re: eliminate junk from std.string?
Oh, one more thing: can the names be consistent?

inpattern
countChars
expandtabs
chompPrefix
toupper
toupperInPlace ??

If this can't be done for backwards compatibility maybe you can make aliases for the previous ones.

Also:

stripl
stripr
strip
Strips *l*eading and *t*railing whitespaces...

It took me some time to notice that it was strip*r* (for right), but the comment says "trailing", and I never think of "remove right space", always "remove trailing spaces" (like in the comment!). So why not name that function "stript"?
Re: eliminate junk from std.string?
On 01/11/2011 04:34 PM, Ary Borenszweig wrote:
> Oh, one more thing: can the names be consistent?
>
> inpattern
> countChars
> expandtabs
> chompPrefix
> toupper
> toupperInPlace ??
>
> If this can't be done for backwards compatibility maybe you can make aliases for the previous ones.
>
> Also:
>
> stripl
> stripr
> strip
> Strips *l*eading and *t*railing whitespaces...

stripLeft, stripRight

Anyway, the necessity for super-cryptic abbreviated names doesn't exist any more. Maybe they are justified for very frequently used stuff, but stripl/stripr is not the case.

> It took me some time to notice that it was strip*r* (for right), but the comment says "trailing", and I never think of "remove right space", always "remove trailing spaces" (like in the comment!). So why not name that function "stript"?
Re: eliminate junk from std.string?
Yes, what I meant was that the names are stripl and stripr, yet the descriptions of those functions are "strip leading" and "strip trailing"... at least put "strip left" and "strip right" in the description so it matches the names.
Re: eliminate junk from std.string?
On 01/11/2011 05:36 PM, Ary Borenszweig wrote:
> Yes, what I meant was that the names are stripl and stripr, yet the descriptions of those functions are "strip leading" and "strip trailing"... at least put "strip left" and "strip right" in the description so it matches the names.

Sorry for misunderstanding.

I don't think that the description needs to match the names literally. However, I would avoid "trailing" and "leading", because in RTL environments they can have the opposite meaning.
Re: eliminate junk from std.string?
On 1/11/11 6:29 AM, Ary Borenszweig wrote: Hi Andrei, It looks nice. Just a small comment: in many of your comments you use words that not all of us might now. For instance: "sans". I happen to know it because I studied French, but otherwise I wouldn't know that. I just showed that phrase to a colleague here in Argentina and he didn't understand it. He thought it maybe meant "since". Maybe "sans" and "in lieu" are memes there in the USA, but not everywhere. So please, stick with English. :-) Okay. I think "sans" is Walter's... Andrei
Re: eliminate junk from std.string?
On Tue, 11 Jan 2011 11:39:11 -0500, Andrei Alexandrescu wrote: On 1/11/11 6:29 AM, Ary Borenszweig wrote: Hi Andrei, It looks nice. Just a small comment: in many of your comments you use words that not all of us might now. For instance: "sans". I happen to know it because I studied French, but otherwise I wouldn't know that. I just showed that phrase to a colleague here in Argentina and he didn't understand it. He thought it maybe meant "since". Maybe "sans" and "in lieu" are memes there in the USA, but not everywhere. So please, stick with English. :-) Okay. I think "sans" is Walter's... sans is in the english dictionary: http://www.merriam-webster.com/dictionary/sans According to that reference, Shakespeare used it :) Don't think you can get more English than that... BTW, it would be impossible to phrase everything so everyone who has their specific dialect of English would understand it, I don't think there's much sense in worrying about it. That being said, using 'without' instead of 'sans' is probably fine. -Steve
Re: eliminate junk from std.string?
On 1/11/11 6:34 AM, Ary Borenszweig wrote: Oh, one more thing: can the names be consistent? inpattern countChars expandtabs chompPrefix toupper toupperInPlace ?? If this can't be done for backwards compatibility maybe you can make alias for the previous ones. The names are for compatibility with... other languages :o|. Also: stripl stripr strip Strips *l*eading and *t*railing whitespaces... It took me some time to notice that it was strip*r* (for right), but the comment says "trailing", and I never think of "remove right space", always "remove trailing spaces" (like in the comment!). So why not name that function "stript"? Same thing. These names are imported from other languages. Andrei
Re: eliminate junk from std.string?
"Andrei Alexandrescu" wrote in message news:igi18o$e5...@digitalmars.com...
> On 1/11/11 6:34 AM, Ary Borenszweig wrote:
>> Oh, one more thing: can the names be consistent?
>>
>> inpattern
>> countChars
>> expandtabs
>> chompPrefix
>> toupper
>> toupperInPlace ??
>>
>> If this can't be done for backwards compatibility maybe you can make aliases for the previous ones.
>
> The names are for compatibility with... other languages :o|.

Would that other language be Walterish or C? If C, it's not like using the wrong case will suddenly change the semantics of the function. And if the worry is other non-Phobos functions that might have the old C-style name (but different semantics), then Ary's suggestion of compatibly-named aliases would take care of that.
Re: eliminate junk from std.string?
"Steven Schveighoffer" wrote in message news:op.vo5kspmfeav...@steve-laptop... > On Tue, 11 Jan 2011 11:39:11 -0500, Andrei Alexandrescu > wrote: > >> On 1/11/11 6:29 AM, Ary Borenszweig wrote: >>> Hi Andrei, >>> >>> It looks nice. Just a small comment: in many of your comments you use >>> words that >>> not all of us might now. For instance: "sans". I happen to know it >>> because I >>> studied French, but otherwise I wouldn't know that. I just showed that >>> phrase to a >>> colleague here in Argentina and he didn't understand it. He thought it >>> maybe meant >>> "since". Maybe "sans" and "in lieu" are memes there in the USA, but not >>> everywhere. So please, stick with English. :-) >> >> Okay. I think "sans" is Walter's... > > sans is in the english dictionary: > > http://www.merriam-webster.com/dictionary/sans > > According to that reference, Shakespeare used it :) Don't think you can > get more English than that... > Thoust words are true. Seriously though, I'm pretty sure a lot of native english speakers don't know "sans" either, unless they're familiar with font-related terminology. "In lieu of" is widely-known though, at least in the US.
Re: eliminate junk from std.string?
Am 11.01.2011 19:07, schrieb Nick Sabalausky: Thoust words are true. Seriously though, I'm pretty sure a lot of native english speakers don't know "sans" either, unless they're familiar with font-related terminology. "In lieu of" is widely-known though, at least in the US. I'm neither representative nor a native speaker (I'm german) and I knew sans, but didn't know "In lieu of".
Re: eliminate junk from std.string?
On 01/11/2011 04:11 PM, Max Samukha wrote: Anyway, the necessity for super-cryptic abbreviated names doesn't exist any more. Maybe, they are justified for very frequently used stuff but stripl/stripr is not the case. +++ Standard names should all be as obvious as possible. Then, everyone is free to alias stripLeft & stripRight to sl & sr ;-) But standard lib should be super clear code; show the right example of what clarity means --not the opposite! And I ask again: what to do with all inherited junk breaking naming rules like uint, size_t, malloc...? Denis _ vita es estrany spir.wikidot.com
Re: eliminate junk from std.string?
"Daniel Gibson" wrote in message news:igi6n5$27p...@digitalmars.com... > Am 11.01.2011 19:07, schrieb Nick Sabalausky: >> Thoust words are true. >> >> Seriously though, I'm pretty sure a lot of native english speakers don't >> know "sans" either, unless they're familiar with font-related >> terminology. >> "In lieu of" is widely-known though, at least in the US. >> >> > > I'm neither representative nor a native speaker (I'm german) and I knew > sans, but didn't know "In lieu of". I guess that just goes to show, we should all just switch to Esperanto ;)
Re: eliminate junk from std.string?
"Max Samukha" wrote in message news:ighvca$ap...@digitalmars.com... > On 01/11/2011 05:36 PM, Ary Borenszweig wrote: >> Yes, what I meant was that the names are stripl and stripr yet the >> description of >> those functions are strip leading and strip trailing... at least put >> strip left >> and string right on the description so it matches the names. > > Sorry for misunderstanding. > > I don't think that the description needs to match the names literally. > However, I would aviod "trailing" and "leading", because in RTL > environments they can have the opposite meaning. I would have thought RTL languages got stored as RTL. If so, then "leading" and "trailing" would be correct and "left"/"right" would be wrong (unless the internal behavior of stripl and stripr takes language-direction into account, which would surprise me).
Re: eliminate junk from std.string?
On 01/11/2011 07:14 PM, Nick Sabalausky wrote: "Daniel Gibson" wrote in message news:igi6n5$27p...@digitalmars.com... Am 11.01.2011 19:07, schrieb Nick Sabalausky: Thoust words are true. Seriously though, I'm pretty sure a lot of native english speakers don't know "sans" either, unless they're familiar with font-related terminology. "In lieu of" is widely-known though, at least in the US. I'm neither representative nor a native speaker (I'm german) and I knew sans, but didn't know "In lieu of". I guess that just goes to show, we should all just switch to Esperanto ;) No, esperanto is just a heap of language-design errors! Denis _ vita es estrany spir.wikidot.com
Re: eliminate junk from std.string?
On 12/01/11 05:07, Nick Sabalausky wrote: "Steven Schveighoffer" wrote in message news:op.vo5kspmfeav...@steve-laptop... On Tue, 11 Jan 2011 11:39:11 -0500, Andrei Alexandrescu wrote: On 1/11/11 6:29 AM, Ary Borenszweig wrote: Hi Andrei, It looks nice. Just a small comment: in many of your comments you use words that not all of us might now. For instance: "sans". I happen to know it because I studied French, but otherwise I wouldn't know that. I just showed that phrase to a colleague here in Argentina and he didn't understand it. He thought it maybe meant "since". Maybe "sans" and "in lieu" are memes there in the USA, but not everywhere. So please, stick with English. :-) Okay. I think "sans" is Walter's... sans is in the english dictionary: http://www.merriam-webster.com/dictionary/sans According to that reference, Shakespeare used it :) Don't think you can get more English than that... Thoust words are true. As an aside you might find some amusement in "The Shakespeare Programming Language" http://shakespearelang.sourceforge.net/report/shakespeare/
Re: eliminate junk from std.string?
On 01/11/2011 07:01 PM, Nick Sabalausky wrote:
>> The names are for compatibility with... other languages :o|.
>
> Would that other language be Walterish or C? If C, it's not like using the wrong case will suddenly change the semantics of the function. And if the worry is other non-Phobos functions that might have the old C-style name (but different semantics), then Ary's suggestion of compatibly-named aliases would take care of that.

Agreed, Ary's suggestion makes much sense. Anyway, when shall we finally get rid of half-a-century-old naming issues? In the XXIInd century?

Denis
_
vita es estrany
spir.wikidot.com
Re: eliminate junk from std.string?
Nick Sabalausky wrote: "Andrei Alexandrescu" wrote in message news:igi18o$e5...@digitalmars.com... On 1/11/11 6:34 AM, Ary Borenszweig wrote: Oh, one more thing: can the names be consistent? inpattern countChars expandtabs chompPrefix toupper toupperInPlace ?? If this can't be done for backwards compatibility maybe you can make alias for the previous ones. The names are for compatibility with... other languages :o|. Would that other language be Walterish or C? The names generally come from Python, Ruby and Javascript.
Re: eliminate junk from std.string?
"spir" wrote in message news:mailman.550.1294771968.4748.digitalmar...@puremagic.com... > On 01/11/2011 07:14 PM, Nick Sabalausky wrote: >> "Daniel Gibson" wrote in message >> news:igi6n5$27p...@digitalmars.com... >>> Am 11.01.2011 19:07, schrieb Nick Sabalausky: Thoust words are true. Seriously though, I'm pretty sure a lot of native english speakers don't know "sans" either, unless they're familiar with font-related terminology. "In lieu of" is widely-known though, at least in the US. >>> >>> I'm neither representative nor a native speaker (I'm german) and I knew >>> sans, but didn't know "In lieu of". >> >> I guess that just goes to show, we should all just switch to Esperanto ;) > > No, esperanto is just a heap of language-design errors! > And that differs from English, how? ;)
Re: eliminate junk from std.string?
Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"?
Re: eliminate junk from std.string?
Nick Sabalausky wrote: Seriously though, I'm pretty sure a lot of native english speakers don't know "sans" either, unless they're familiar with font-related terminology. "In lieu of" is widely-known though, at least in the US. I used to keep a dictionary on my desk, but now I just google definitions. I don't see a good reason to dumb down the language. BTW, english is full of french words, thanks to the Battle of Hastings.
Re: eliminate junk from std.string?
Nick Sabalausky wrote: "Andrej Mitrovic" wrote in message news:mailman.543.1294713068.4748.digitalmar...@puremagic.com... Speaking of regex.. I see there are two enums in std.regex, email and url, which are regular expressions. Why not collect more of these common regexes? And we could pack them up in a struct to avoid polluting the local namespace. I think this might encourage the use of std.regex, since the average Joe wouldn't have to reach for the regex book whenever he's processing strings. E.g.: foreach(m; match("10abc20def30", regex(patterns.number))) // std.regex.patterns.number { writefln("%s[%s]%s", m.pre, m.hit, m.post); } Just a passing thought.. I think that's a great idea. I agree.
Re: eliminate junk from std.string?
Adam Ruppe wrote: I don't know about bearophile, but I used a lot of the functions you are talking about removing in my HTML -> Plain Text conversion function used for emails and other similar environments. squeeze the whitespace, align text, wrap for the target, etc. As has been pointed out, a lot of these seemingly odd functions come from Python/Ruby/Javascript. Users of those languages will be familiar with them, and they've proven themselves handy in those languages. Let's not be cavalier about dumping them just because they aren't familiar to C programmers.
Re: eliminate junk from std.string?
Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? Because people are used to those names due to their wide use. It's the same reason that we still use Qwerty keyboards.
Re: eliminate junk from std.string?
Am 11.01.2011 20:42, schrieb Walter Bright: Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? Because people are used to those names due to their wide use. It's the same reason that we still use Qwerty keyboards. And C++ :-P
Re: eliminate junk from std.string?
Agreed. So what's wrong with improving things and leaving old things as aliases?
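To make the alias suggestion concrete, here is a small sketch of what such compatibility names could look like in user code. It assumes a current Phobos, where the descriptive names stripLeft and stripRight exist in std.string; the short aliases are declared here purely for illustration and are not claimed to be part of the library.

```d
import std.string : stripLeft, stripRight;

// Illustrative compatibility aliases: the old short names simply forward
// to the descriptive ones, so existing call sites keep compiling.
alias stripl = stripLeft;
alias stripr = stripRight;

void main()
{
    assert(stripl("  abc  ") == "abc  ");
    assert(stripr("  abc  ") == "  abc");
}
```

An alias introduces no new function, only a second name for the same one, which is exactly the trade-off debated in the replies: it costs nothing at run time but adds an entry to the documentation.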
Re: eliminate junk from std.string?
"Welcome to D. Do you program in C, Javascript, Python or Ruby? Cool! Then you will feel at home." That phrase currently ends like this: "You don't? Oh, sorry, you will have to learn that some names are all lowercase, some not." But it could end like this: "You don't? Don't worry. D has the convention of writing all function names with X convention, but we keep some aliases for things that we want to keep backwards compatibility for."
Re: eliminate junk from std.string?
On 01/11/2011 09:42 PM, Walter Bright wrote:
> Ary Borenszweig wrote:
>> Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"?
>
> Because people are used to those names due to their wide use. It's the same reason that we still use Qwerty keyboards.

We should be careful in assuming what people are used to. Compare:

D/Python/Lisp/... - strip
.NET/Delphi/Java/Qt/Haskell/... - Trim/trim/trimmed

stripl/stripr are TrimStart/TrimEnd in .NET
Re: eliminate junk from std.string?
On 01/11/2011 08:18 PM, Nick Sabalausky wrote: "Max Samukha" wrote in message news:ighvca$ap...@digitalmars.com... On 01/11/2011 05:36 PM, Ary Borenszweig wrote: Yes, what I meant was that the names are stripl and stripr yet the description of those functions are strip leading and strip trailing... at least put strip left and string right on the description so it matches the names. Sorry for misunderstanding. I don't think that the description needs to match the names literally. However, I would aviod "trailing" and "leading", because in RTL environments they can have the opposite meaning. I would have thought RTL languages got stored as RTL. If so, then "leading" and "trailing" would be correct and "left"/"right" would be wrong (unless the internal behavior of stripl and stripr takes language-direction into account, which would surprise me). AFAIK, there is no universal standard on storing RTL text. There are recommendations to prefer logical order over visual order because visual order is extremely inflexible. I am not an expert in this field and have to shut up.
Re: eliminate junk from std.string?
Ary Borenszweig wrote:
> Agreed. So what's wrong with improving things and leaving old things as aliases?

Clutter.

One of the risks with Phobos development is it becoming a river miles wide, and only an inch deep. In other words, endless gobs of shallow, trite functions, with very little depth. (Aliases are as shallow as they get!)

As a general rule, I don't want functionality in Phobos that takes more time for a user to find/read/understand the documentation on than to reimplement it himself. Those things give the illusion of comprehensiveness, but are just useless wankery.

Do we really want a 1000 page reference manual on Phobos, but no database interface? No network interface? No D lexer? No disassembler? No superfast XML parser? No best-of-breed regex implementation? No CGI support? No HTML parsing? No sound support? No jpg reading?

I worry that by endless bikeshedding about perfecting the spelling of some name, we miss the whole show.

I'd like to see more meat. For example, Don has recently added gamma functions to the math library. These are hard to implement correctly, and are perfect for inclusion.
Re: eliminate junk from std.string?
Lars T. Kyllingstad wrote: 1. What are the hexdigits, digits, octdigits, lowercase, letters, uppercase, and whitespace arrays useful for? The only thing I can think of is to check whether a character belongs to one of them, One example is conversion from a number to text. hexdigits[n] comes to mind. If you do a strings dump on a random executable, you'll usually find such strings embedded in it. By putting them in std.string, hopefully you won't find several instances of the same string.
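The number-to-text use of hexdigits[n] mentioned above can be sketched roughly as follows. This is only an illustration: toHex is a hypothetical helper, and it assumes the modern Phobos spelling std.ascii.hexDigits (uppercase "0123456789ABCDEF") rather than the std.string.hexdigits array discussed in this thread.

```d
import std.ascii : hexDigits; // assumption: modern Phobos name for the digit table

/// Hypothetical helper: converts an unsigned value to uppercase hex text
/// by indexing the shared digit table, one nibble at a time.
string toHex(uint value)
{
    if (value == 0)
        return "0";

    char[8] buf;               // 32 bits need at most 8 hex digits
    size_t pos = buf.length;   // fill the buffer from the right
    while (value != 0)
    {
        buf[--pos] = hexDigits[value & 0xF]; // low nibble selects the digit
        value >>= 4;
    }
    return buf[pos .. $].idup; // copy only the digits actually written
}
```

Because every caller indexes the same table, the digit string should appear only once in the executable, which is the point being made about a strings dump.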
Re: eliminate junk from std.string?
"Walter Bright" wrote in message news:igib2q$12g...@digitalmars.com... > Adam Ruppe wrote: >> I don't know about bearophile, but I used a lot of the functions >> you are talking about removing in my HTML -> Plain Text conversion >> function used for emails and other similar environments. squeeze the >> whitespace, align text, wrap for the target, etc. > > As has been pointed out, a lot of these seemingly odd functions come from > Python/Ruby/Javascript. Users of those languages will be familiar with > them, and they've proven themselves handy in those languages. > > Let's not be cavalier about dumping them just because they aren't familiar > to C programmers. I agree with this reasoning for having them. However, I don't think it means we shouldn't D-ify or Phobos-ify them, at least as far as capitalization conventions.
Re: eliminate junk from std.string?
Nick Sabalausky wrote: I agree with this reasoning for having them. However, I don't think it means we shouldn't D-ify or Phobos-ify them, at least as far as capitalization conventions. I also object to rather pointlessly annoying people wanting to move their code from D1 to D2 by renaming everything. Endlessly renaming things searching for the perfect name gives the illusion of progress, whereas time would be better spent on improving the documentation, unittests, performance, etc. Naming of things isn't nearly as critical an issue in D as it is in, say, C, because of the excellent antihijacking support in D's module system. Some name changes have turned out to be a big win, like "invariant" => "immutable". But I don't think that implies open season for wholesale renaming of swaths of functions.
Re: eliminate junk from std.string?
Ary Borenszweig wrote: Agreed. So what's wrong with improving things and leaving old things as aliases? I want to add that having multiple names for the same thing doesn't really do anyone any good.
Re: eliminate junk from std.string?
On Tue, Jan 11, 2011 at 12:43:28PM -0800, Walter Bright wrote: > Naming of things isn't nearly as critical an issue in D as it is in, say, > C, because of the excellent antihijacking support in D's module system. And the spell checker will quickly point out messed up capitalization at compile time anyway.
Re: eliminate junk from std.string?
"Walter Bright" wrote in message news:igibu6$154...@digitalmars.com... > Ary Borenszweig wrote: >> Why care where they come from? Why not make them intuitive? Say, like, >> "Always >> camel case"? > > Because people are used to those names due to their wide use. It's the > same reason that we still use Qwerty keyboards. Then why switch languages at all? When you move to a different language you expect that language is going to have its own set of conventions. And even more than that, you also expect it to at least be internally-consistent, not a grab-bag of different styles. Are they really supposed to remember "Oh, oh, this func comes from this language, so it's capitalized this way, and that one comes from that language so it's capitalized that way..." Not only that, but D has far, far bigger, more significant differences from Ruby/Python/JS/etc than the capitalization of a few functions. If people are going to come over and get used to *those* changes, then using toLower instead of tolower is going to be a downright triviality for them. Your cart is before your horse.
Re: eliminate junk from std.string?
"Walter Bright" wrote in message news:igifgt$1cu...@digitalmars.com... > Nick Sabalausky wrote: >> I agree with this reasoning for having them. However, I don't think it >> means we shouldn't D-ify or Phobos-ify them, at least as far as >> capitalization conventions. > > I also object to rather pointlessly annoying people wanting to move their > code from D1 to D2 by renaming everything. Endlessly renaming things > searching for the perfect name gives the illusion of progress, whereas > time would be better spent on improving the documentation, unittests, > performance, etc. > > Naming of things isn't nearly as critical an issue in D as it is in, say, > C, because of the excellent antihijacking support in D's module system. > > > Some name changes have turned out to be a big win, like "invariant" => > "immutable". But I don't think that implies open season for wholesale > renaming of swaths of functions. We're not asking for free-for-all bikeshedding, we're asking to get rid of the free-for-all naming-convention-carnival in the std lib. Just basic sensible consistency, that's all. And breaking compatibility with D1 for the sake of progress is the whole point of D2.
Re: eliminate junk from std.string?
On 11.01.2011 21:11, Ary Borenszweig wrote: "Welcome to D. Do you program in C, Javascript, Python or Ruby? Cool! Then you will feel at home." That phrase currently ends like this: "You don't? Oh, sorry, you will have to learn that some names are all lowercase, some not." I agree. Using different conventions for naming functions etc makes a library look inconsistent. Yeah right, those names are used in other languages, so people who know C, Javascript, Python and Ruby may feel at home (even though there may be similar functions with different names/writing and signatures in e.g. JS and Ruby so one still has to know where exactly the function was "stolen" from). I can to some degree understand reusing C function names/signatures, with D being a successor and compatible and all, but reusing names (and especially their writing - lowercase, lowercase_with_underscores, CamelCase, ...) from a plethora of languages/libraries doesn't make Phobos look and feel consistent but stitched together like Frankenstein's Monster. It's definitely good to adapt functions that have proven useful in other languages/libraries. But they should be adjusted to fit the style of the library itself, especially when it's a standard library. There is a D style guide ( http://www.digitalmars.com/d/2.0/dstyle.html ), so at least D's own standard library should comply with it :-) Cheers, - Daniel
Re: eliminate junk from std.string?
Andrei Alexandrescu Wrote: > On 1/9/11 4:51 PM, Andrei Alexandrescu wrote: > > There's a lot of junk in std.string that should be gone. I'm trying to > > motivate myself to port some functions to different string widths and... > > it's not worth it. > > > > What functions do you think we should remove from std.string? Let's make > > a string and then send them the way of the dino. > > > > > > Thanks, > > > > Andrei > > I have uploaded a preview of the changed APIs here: > > http://d-programming-language.org/cutting-edge/phobos/std_string.html Unclear if iswhite() refers to ASCII whitespace or Unicode. If Unicode, which version of the standard? Same comment for icmp(). Also, in the Unicode standard, case folding can depend on the specific language. There is room for ascii-only functions, but unless a D version of ICU is going to be done separately, it would be nice to have full unicode-aware functions available. You've got chop() marked as deprecated. Is popBack() going to make sense as something that removes a variable number of chars from a string in the CR-LF case? That might be a bit too magical. Rather than zfill, what about modifying ljustify, rjustify, and center to take an optional fill character? One set of functions I'd like to see are startsWith() and endsWith(). I find them frequently useful in Java and an irritating lack in the C++ standard library. Jerry
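The suggestion to fold zfill into the justify functions via an optional fill character could look roughly like this. It is a sketch under stated assumptions: padLeft is a hypothetical name, not a Phobos function, and it counts UTF-8 code units and accepts only a char fill, so it sidesteps the general dchar handling that a real Phobos version would need.

```d
/// Hypothetical helper: right-justifies `s` in a field of `width` code units,
/// padding on the left with `fill` (a space unless the caller says otherwise).
/// Assumption: ASCII-oriented; width is measured in code units, not characters.
string padLeft(string s, size_t width, char fill = ' ')
{
    if (s.length >= width)
        return s;                       // already wide enough, nothing to pad

    auto pad = new char[width - s.length];
    pad[] = fill;                       // fill the whole padding slice
    return pad.idup ~ s;
}
```

With the default argument, padLeft("42", 5) behaves like a plain right-justify, while padLeft("42", 5, '0') covers the zfill case without a separate function.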
Re: eliminate junk from std.string?
Jerry Quinn Wrote: > One set of functions I'd like to see are startsWith() and endsWith(). I find > them frequently useful in Java and an irritating lack in the C++ standard > library. Just adding that these functions are useful because they're more efficient than doing a find and checking that the match is in the first position. Jerry
Re: eliminate junk from std.string?
On 12.01.2011 0:47, Jerry Quinn wrote: Jerry Quinn Wrote: One set of functions I'd like to see are startsWith() and endsWith(). I find them frequently useful in Java and an irritating lack in the C++ standard library. Just adding that these functions are useful because they're more efficient than doing a find and checking that the match is in the first position. Jerry Those are present in std.algorithm and seem to work just fine. What's wrong with them? -- Dmitry Olshansky
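For reference, a minimal sketch of the std.algorithm functions mentioned above, applied to strings. The helper names isDSource and isQuotedLine are made up for illustration, and the import path std.algorithm.searching is the modern Phobos layout (at the time of this thread they lived directly in std.algorithm).

```d
import std.algorithm.searching : endsWith, startsWith;

/// Hypothetical helper: true when the file name has a D source extension.
bool isDSource(string name)
{
    return name.endsWith(".d") || name.endsWith(".di");
}

/// Hypothetical helper: true when a mail line is a quoted reply line.
bool isQuotedLine(string line)
{
    // Only the prefix is compared; the rest of the line is never scanned,
    // unlike a find followed by a check that the match is at position 0.
    return line.startsWith(">");
}
```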
Re: eliminate junk from std.string?
On Tuesday, January 11, 2011 12:44:57 Nick Sabalausky wrote: > "Walter Bright" wrote in message > news:igibu6$154...@digitalmars.com... > > > Ary Borenszweig wrote: > >> Why care where they come from? Why not make them intuitive? Say, like, > >> "Always > >> camel case"? > > > > Because people are used to those names due to their wide use. It's the > > same reason that we still use Qwerty keyboards. > > Then why switch languages at all? > > When you move to a different language you expect that language is going to > have its own set of conventions. And even more than that, you also expect > it to at least be internally-consistent, not a grab-bag of different > styles. Are they really supposed to remember "Oh, oh, this func comes from > this language, so it's capitalized this way, and that one comes from that > language so it's capitalized that way..." > > Not only that, but D has far, far bigger, more significant differences from > Ruby/Python/JS/etc than the capitalization of a few functions. If people > are going to come over and get used to *those* changes, then using toLower > instead of tolower is going to be a downright triviality for them. Your > cart is before your horse. I agree. Having the functions named similarly so that they're quickly recognized is good - if a function has a particular name in a variety of languages, why not give it essentially the same name in D? But I don't see why it must be _exactly_ the same name. It should at least use the same casing as the rest of Phobos. Unless you're directly porting code, the fact that it's toLower instead of tolower really shouldn't be an issue. It's a new language, a new library, you're going to have to learn how it works anyway. The function names don't need to be _exactly_ the same as other languages. It does look bad when functions in Phobos don't follow the same naming conventions as the rest of it, and it makes it much harder to remember exactly how they're named. 
So, I'm all for picking names which are essentially the same as functions with the same functionality in other languages, but I think that insisting that the casing of the names match the casing of the functions from other languages when it doesn't match how functions are normally cased in Phobos is definitely a bad idea. Not to mention, I don't think that I've ever heard anyone complain that the casing on a function in Phobos didn't match the casing of a function with essentially the same name in another language, but complaints definitely pop up about how some of the std.string functions don't use the same casing as the rest of Phobos. I vote for consistency. Using essentially the same names for functions as is used in other languages is great. Insisting on the same casing for the function names strikes me as inconsistent and undesirable. I find that it increases the burden of remembering function names rather than reducing it. - Jonathan M Davis
Re: eliminate junk from std.string?
On Tuesday, January 11, 2011 11:12:44 Nick Sabalausky wrote: > "spir" wrote in message > news:mailman.550.1294771968.4748.digitalmar...@puremagic.com... > > > On 01/11/2011 07:14 PM, Nick Sabalausky wrote: > >> "Daniel Gibson" wrote in message > >> news:igi6n5$27p...@digitalmars.com... > >> > >>> Am 11.01.2011 19:07, schrieb Nick Sabalausky: > Thoust words are true. > > Seriously though, I'm pretty sure a lot of native english speakers > don't > know "sans" either, unless they're familiar with font-related > terminology. > "In lieu of" is widely-known though, at least in the US. > >>> > >>> I'm neither representative nor a native speaker (I'm german) and I knew > >>> sans, but didn't know "In lieu of". > >> > >> I guess that just goes to show, we should all just switch to Esperanto > >> ;) > > > > No, esperanto is just a heap of language-design errors! > > And that differs from English, how? ;) English wasn't designed. - Jonathan M Davis
Re: eliminate junk from std.string?
On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei
Re: eliminate junk from std.string?
On 1/11/11 12:09 PM, Ary Borenszweig wrote: Agreed. So what's wrong with improving things and leaving old things as aliases? Petrified lava. Andrei
Re: eliminate junk from std.string?
On 12.01.2011 00:00, Andrei Alexandrescu wrote: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei Please do, having different naming conventions of functions within the standard library makes it harder to remember the exact spelling of a function and also doesn't look professional. +1 vote for making the standard library comply with the D style guide[1] Cheers, - Daniel [1] http://digitalmars.com/d/2.0/dstyle.html
Re: eliminate junk from std.string?
On 1/12/11 12:00 AM, Andrei Alexandrescu wrote: If there's enough support for this, I'll do it. Andrei +1 from me – sticking to names commonly used in other programming languages is good for ease of adoption, but also inheriting the various naming conventions is, in my humble opinion, just plain weird. David
Re: eliminate junk from std.string?
On 1/11/11 1:47 PM, Jerry Quinn wrote: Jerry Quinn Wrote: One set of functions I'd like to see are startsWith() and endsWith(). I find them frequently useful in Java and an irritating lack in the C++ standard library. Just adding that these functions are useful because they're more efficient than doing a find and checking that the match is in the first position. Jerry They're in std.algorithm. Andrei
Re: eliminate junk from std.string?
So what's a good use for aliases?
Re: eliminate junk from std.string?
On 1/11/11 1:45 PM, Jerry Quinn wrote: Andrei Alexandrescu Wrote: On 1/9/11 4:51 PM, Andrei Alexandrescu wrote: There's a lot of junk in std.string that should be gone. I'm trying to motivate myself to port some functions to different string widths and... it's not worth it. What functions do you think we should remove from std.string? Let's make a string and then send them the way of the dino. Thanks, Andrei I have uploaded a preview of the changed APIs here: http://d-programming-language.org/cutting-edge/phobos/std_string.html Unclear if iswhite() refers to ASCII whitespace or Unicode. If Unicode, which version of the standard? Not sure. enum dchar LS = '\u2028'; /// UTF line separator enum dchar PS = '\u2029'; /// UTF paragraph separator bool iswhite(dchar c) { return c <= 0x7F ? indexOf(whitespace, c) != -1 : (c == PS || c == LS); } Which version? Same comment for icmp(). Also, in the Unicode standard, case folding can depend on the specific language. That uses toUniLower. Not sure how that works. There is room for ascii-only functions, but unless a D version of ICU is going to be done separately, it would be nice to have full unicode-aware functions available. Yah, I'm increasingly thinking of defining an AsciiChar entity and perhaps a Zstring one for zero-terminated strings. You've got chop() marked as deprecated. Is popBack() going to make sense as something that removes a variable number of chars from a string in the CR-LF case? That might be a bit too magical. Well I found little use for chop in e.g. Perl. People either use chomp or want to remove the last character. I think chop is useless. Rather than zfill, what about modifying ljustify, rjustify, and center to take an optional fill character? Yah, I wanted to do that but postponed because it's quite a bit of work with general dchars etc. One set of functions I'd like to see are startsWith() and endsWith(). I find them frequently useful in Java and an irritating lack in the C++ standard library. 
Yah, those are in std.algorithm. Ideally we'd move everything that's applicable beyond strings to std.algorithm. Andrei
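To make the ASCII-versus-Unicode question concrete, here is a self-contained sketch of the iswhite logic quoted in the post above. The name isWhiteSketch and the local asciiWhitespace constant are assumptions made so the example stands alone; the constant is assumed to match the ASCII whitespace set that std.string's whitespace array held.

```d
import std.string : indexOf;

enum dchar LS = '\u2028'; // Unicode line separator
enum dchar PS = '\u2029'; // Unicode paragraph separator

// Assumption: stands in for the std.string `whitespace` array from the post.
enum string asciiWhitespace = " \t\v\r\n\f";

/// Sketch of the quoted logic: ASCII code points are looked up in the table,
/// everything else counts as white only if it is LS or PS.
bool isWhiteSketch(dchar c)
{
    return c <= 0x7F ? indexOf(asciiWhitespace, c) != -1
                     : (c == PS || c == LS);
}
```

Under this logic a no-break space (U+00A0) is not reported as whitespace, which illustrates why the question of which Unicode definition is meant matters.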
Re: eliminate junk from std.string?
On Tuesday, January 11, 2011 15:29:54 Ary Borenszweig wrote: > So what's a good use for aliases? Oh, there's not necessarily anything wrong with aliases. The problem is if an API has a lot of them. The typical place to use typedef in C++ is when you have long, nasty template types which you don't want to actually have to type out, and while auto and D's improved templates reduce the need for that sort of typedef, I'm sure that folks will still want to use them for that sort of thing. Personally, I've used them for three things: 1. When there's a templated function that you want to be able to call with a set of specific names. A prime example would be get on core.time.Duration. It properly genericizes that functionality, but it would be annoying to have to type duration.get!"days"(), duration.get!"hours"(), etc. all over the place, so it aliases them to the properties days, hours, etc. 2. Deprecating a function name. For instance, let's say that we rename splitl to splitL or SplitLeft in std.string. Having a deprecated alias to splitl would avoid immediately breaking code. 3. In the new std.datetime, DateTimeException is an alias of core.time.TimeException, so that you can use the same exception type throughout the time stuff (std.datetime also publicly imports core.time) without worrying whether it was core.time or std.datetime which threw the exception and yet still have an exception type with the same name as the module as is typical in a number of Phobos modules. So, you get one exception type for all of the time code but still follow the typical naming convention. However, none of these are things that I'd do very often. alias is a tool that can be very handy at times, and I think that it's very good that we have it, but using it all over the place is likely ill-advised - especially if all you're really doing with it is making it possible to call the same function with different names. 
I'd say that, on the whole, aliases should be used when they simplify code or when renaming functions or types, and you want a good deprecation path, but other than that, in general, it's probably not a good idea to use them much. - Jonathan M Davis
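The deprecation path described in point 2 might be sketched like this. Both names are hypothetical stand-ins chosen for the example (they are not taken from Phobos), and the stripping logic is deliberately ASCII-only to keep the sketch short.

```d
/// Hypothetical new, camel-cased name: strips leading ASCII whitespace.
string stripLeftSketch(string s)
{
    size_t i = 0;
    while (i < s.length
        && (s[i] == ' ' || s[i] == '\t' || s[i] == '\r' || s[i] == '\n'))
    {
        ++i; // skip one leading whitespace code unit
    }
    return s[i .. $];
}

// Hypothetical old spelling, kept only for a transition period. Code that
// still calls it keeps compiling but gets a deprecation message pointing
// at the new name; the alias is deleted once the period is over.
deprecated("use stripLeftSketch instead")
alias striplSketch = stripLeftSketch;
```

The difference from a permanent alias is that the old name is visibly on its way out, so the API ends up with one name for the function rather than two.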
Re: eliminate junk from std.string?
On 12.01.2011 00:59, Jonathan M Davis wrote: On Tuesday, January 11, 2011 15:29:54 Ary Borenszweig wrote: So what's a good use for aliases? 2. Deprecating a function name. For instance, let's say that we rename splitl to splitL or SplitLeft in std.string. Having a deprecated alias to splitl would avoid immediately breaking code. Isn't this exactly what Ary had in mind? :-)
Re: eliminate junk from std.string?
On Tuesday, January 11, 2011 16:07:11 Daniel Gibson wrote: > On 12.01.2011 00:59, Jonathan M Davis wrote: > > On Tuesday, January 11, 2011 15:29:54 Ary Borenszweig wrote: > >> So what's a good use for aliases? > > > > 2. Deprecating a function name. For instance, let's say that we rename > > splitl to splitL or SplitLeft in std.string. Having a deprecated alias > > to splitl would avoid immediately breaking code. > > Isn't this exactly what Ary had in mind? :-) No, or at least that's not the impression that I got. I understood that he meant to keep the aliases around permanently. It's just confusing and adds clutter to do things like have both splitl and splitLeft (or splitL or whatever splitl got renamed to) around in the long run. _That_ is what Andrei and Walter are objecting to. Renaming a function and having a deprecated alias to the old name for a few releases to ease the transition would definitely be good practice. Aliasing a function just to have another name for the same thing wouldn't be good practice. There has to be a real benefit to having the second name. Providing a smooth deprecation route would be a case where there's a real benefit. - Jonathan M Davis
Re: eliminate junk from std.string?
On 12.01.2011 01:17, Jonathan M Davis wrote: On Tuesday, January 11, 2011 16:07:11 Daniel Gibson wrote: On 12.01.2011 00:59, Jonathan M Davis wrote: On Tuesday, January 11, 2011 15:29:54 Ary Borenszweig wrote: So what's a good use for aliases? 2. Deprecating a function name. For instance, let's say that we rename splitl to splitL or SplitLeft in std.string. Having a deprecated alias to splitl would avoid immediately breaking code. Isn't this exactly what Ary had in mind? :-) No, or at least that's not the impression that I got. I understood that he meant to keep the aliases around permanently. It's just confusing and adds clutter to do things like have both splitl and splitLeft (or splitL or whatever splitl got renamed to) around in the long run. _That_ is what Andrei and Walter are objecting to. Renaming a function and having a deprecated alias to the old name for a few releases to ease the transition would definitely be good practice. Aliasing a function just to have another name for the same thing wouldn't be good practice. There has to be a real benefit to having the second name. Providing a smooth deprecation route would be a case where there's a real benefit. - Jonathan M Davis Ok, you're right, that is a slight difference. Deprecating them is certainly a good idea, but I'd suggest keeping the deprecated aliases around for longer (until D3), so anybody porting a Phobos1-based application to D2/Phobos2 can use them, even if he doesn't do this within the next few releases. Cheers, - Daniel
Re: eliminate junk from std.string?
On 2011-01-12 01:00:51 +0200, Andrei Alexandrescu said: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei ++vote. Uniformity in how functions are named will improve readability.
Re: eliminate junk from std.string?
You are right, deprecating those names and removing them in the long run is what I think should be done.
Re: eliminate junk from std.string?
On Tuesday, January 11, 2011 16:23:13 Daniel Gibson wrote: > Am 12.01.2011 01:17, schrieb Jonathan M Davis: > > On Tuesday, January 11, 2011 16:07:11 Daniel Gibson wrote: > >> Am 12.01.2011 00:59, schrieb Jonathan M Davis: > >>> On Tuesday, January 11, 2011 15:29:54 Ary Borenszweig wrote: > So what's a good use for aliases? > >>> > >>> 2. Deprecating a function name. For instance, let's say that we rename > >>> splitl to splitL or SplitLeft in std.string. Having a deprecated alias > >>> to splitl would avoid immediately breaking code. > >> > >> Isn't this exactly what Ary had in mind? :-) > > > > No, or at least that's not the impression that I got. I understood that > > he meant to have to aliases around permanently. It's just confusing and > > adds clutter to do things like have both splitl and splitLeft (or splitL > > or whotever splitl got renamed to) around in the long run. _That_ is > > what Andrei and Walter is objecting to. > > > > Renaming a function and having a deprecated alias to the old name for a > > few releases eases the transition would definitely be good practice. > > aliasing a function just to have another name for the same thing > > wouldn't be good practice. There has to be a real benefit to having the > > second name. Providing a smooth deprecation route would be a case where > > there's a real benefit. > > > > - Jonathan M Davis > > Ok, you're right, that is a slight difference. > > Deprecating them is certainly a good idea, but I'd suggest to keep the > deprecated aliases around for longer (until D3), so anybody porting a > Phobos1-based application to D2/Phobos2 can use them, even if he doesn't > do this within the next few releases. Well, leaving an alias until D3 would equate to a permanent alias in D2, which is exactly what Walter and Andrei don't want (and I don't either). There's already plenty in Phobos 2 that's different from Phobos 1. 
So, while I don't think that we should rename stuff just to rename stuff, I also don't think that we should keep aliases around just to make porting D1 code easier - especially when most D1 code is probably using Tango anyway. We don't really have a policy in place for how long deprecation should last prior to outright removal, but until D3 is definitely too long. I would have thought that the question would be more along the lines of whether it should be a couple of releases or more like 6 months to a year before removing deprecated functions and modules at this point, not whether something will remain deprecated until D3. - Jonathan M Davis
Re: eliminate junk from std.string?
On 01/11/2011 09:11 PM, Ary Borenszweig wrote: "Welcome to D. Do you program in C, Javascript, Python or Ruby? Cool! Then you will feel at home." That phrase currently ends like this: "You don't? Oh, sorry, you will have to learn that some names are all lowercase, some not." But it could end like this: "You don't? Don't worry. D has the convention of writing all function names with X convention, but we keep some aliases for things that we want to keep backwards compatibility for." Yop. And anyway those legacy names are not all the same in C, Javascript, Python, Ruby, etc.. One has to be chosen or created for D, so why not follow a guideline for the standard D name? (I really cannot (under)stand this general policy of sticking to wrong design choices from the past for generations and generations --even in brand new languages. How do improvements happen in fields other than programming? One day or the other, one needs to throw away old (mental) garbage.) Denis _ vita es estrany spir.wikidot.com
Re: eliminate junk from std.string?
Am 12.01.2011 01:55, schrieb Jonathan M Davis: On Tuesday, January 11, 2011 16:23:13 Daniel Gibson wrote: Deprecating them is certainly a good idea, but I'd suggest to keep the deprecated aliases around for longer (until D3), so anybody porting a Phobos1-based application to D2/Phobos2 can use them, even if he doesn't do this within the next few releases. Well, leaving an alias until D3 would equate to a permanent alias in D2, which is exactly what Walter and Andrei don't want (and I don't either). There's already plenty in Phobos 2 that's different from Phobos 1. So, while I don't think that we should rename stuff just to rename stuff, I also don't think that we should keep aliases around just to make porting D1 code easier - especially when most D1 code is probably using Tango anyway. We don't really have a policy in place for how long deprecation should last prior to outright removal, but until D3 is definitely too long. I would have thought that the question would be more along the lines of whether it should be a couple of releases or more like 6 months to a year before removing deprecated functions and modules at this point, not whether something will remain deprecated until D3. - Jonathan M Davis Somewhere in this thread: Am 11.01.2011 21:43, schrieb Walter Bright: > Nick Sabalausky wrote: >> I agree with this reasoning for having them. However, I don't think it >> means we shouldn't D-ify or Phobos-ify them, at least as far as >> capitalization conventions. > > I also object to rather pointlessly annoying people wanting to move > their code from D1 to D2 by renaming everything. Endlessly renaming > things searching for the perfect name gives the illusion of progress, > whereas time would be better spent on improving the documentation, > unittests, performance, etc. 
So his objection was specifically that renaming those functions could annoy people migrating D1 code (and certainly he meant Phobos1 users, because Tango-people either port (parts of) Tango or will have to rewrite that anyway). So, to accomplish that goal (not annoying those people), these aliases should be kept for longer. (An alternative may be to add one or more phobos1-compat modules that contain such aliases and maybe even wrappers with old signatures for new functions, that could be imported to ease porting of old applications. That would have the benefit of not cluttering the regular Phobos2 modules with that legacy stuff.) Cheers, - Daniel
Re: eliminate junk from std.string?
On 01/12/2011 02:17 AM, Daniel Gibson wrote: Somewhere in this thread: Am 11.01.2011 21:43, schrieb Walter Bright: > Nick Sabalausky wrote: >> I agree with this reasoning for having them. However, I don't think it >> means we shouldn't D-ify or Phobos-ify them, at least as far as >> capitalization conventions. > > I also object to rather pointlessly annoying people wanting to move > their code from D1 to D2 by renaming everything. Endlessly renaming > things searching for the perfect name gives the illusion of progress, > whereas time would be better spent on improving the documentation, > unittests, performance, etc. > So his objection was specifically that renaming those functions could annoy people migrating D1 code (and certainly he meant Phobos1 users, because Tango-people either port (parts of) Tango or will have to rewrite that anyway). So, to accomplish that goal (not annoying those people), these aliases should be kept for longer. (An alternative may be to one/some phobos1-compat modules that contain such aliases and maybe even wrappers with old signatures for new functions, that could be imported to ease porting of old applications. That would have the benefit of not cluttering the regular Phobos2 modules with that legacy stuff.) When D2 / Phobos2 stabilise, what about a semi-automatic porting tool (at least signaling potential issues, first of all occurrences of deprecated stdlib names)? Denis _ vita es estrany spir.wikidot.com
Re: eliminate junk from std.string?
On Tuesday, January 11, 2011 17:17:43 Daniel Gibson wrote: > Am 12.01.2011 01:55, schrieb Jonathan M Davis: > > On Tuesday, January 11, 2011 16:23:13 Daniel Gibson wrote: > >> Deprecating them is certainly a good idea, but I'd suggest to keep the > >> deprecated aliases around for longer (until D3), so anybody porting a > >> Phobos1-based application to D2/Phobos2 can use them, even if he doesn't > >> do this within the next few releases. > > > > Well, leaving an alias until D3 would equate to a permanent alias in D2, > > which is exactly what Walter and Andrei don't want (and I don't either). > > There's already plenty in Phobos 2 that's different from Phobos 1. So, > > while I don't think that we should rename stuff just to rename stuff, I > > also don't think that we should keep aliases around just to make porting > > D1 code easier - especially when most D1 code is probably using Tango > > anyway. We don't really have a policy in place for how long deprecation > > should last prior to outright removal, but until D3 is definitely too > > long. I would have thought that the question would be more along the > > lines of whether it should be a couple of releases or more like 6 months > > to a year before removing deprecated functions and modules at this > > point, not whether something will remain deprecated until D3. > > > > - Jonathan M Davis > > Somewhere in this thread: > > Am 11.01.2011 21:43, schrieb Walter Bright: > > Nick Sabalausky wrote: > >> I agree with this reasoning for having them. However, I don't think it > >> means we shouldn't D-ify or Phobos-ify them, at least as far as > >> capitalization conventions. > > > > I also object to rather pointlessly annoying people wanting to move > > their code from D1 to D2 by renaming everything. Endlessly renaming > > things searching for the perfect name gives the illusion of progress, > > whereas time would be better spent on improving the documentation, > > unittests, performance, etc. 
> > So his objection was specifically that renaming those functions could > annoy people migrating D1 code (and certainly he meant Phobos1 users, > because Tango-people either port (parts of) Tango or will have to > rewrite that anyway). > So, to accomplish that goal (not annoying those people), these aliases > should be kept for longer. > > (An alternative may be to have one/some phobos1-compat modules that contain > such aliases and maybe even wrappers with old signatures for new > functions, that could be imported to ease porting of old applications. > That would have the benefit of not cluttering the regular Phobos2 > modules with that legacy stuff.) Well, I didn't say that Walter wasn't concerned about it. I just don't see the point. Phobos has changed enough from D1 to D2 that, even for D1 Phobos users (of which I get the impression there are relatively few), there's probably already plenty of stuff which is going to break for anyone porting over. I do think that keeping a deprecated alias around longer for a function which has been around longer makes sense, and the Phobos 1 functions have been around longer than anything else. So, deprecating a function that was added 2 releases ago probably shouldn't require a deprecated alias for as long as deprecating a function that was in Phobos 1 would, but there's still a limit to how long it makes sense. And given that your average D1 user uses Tango rather than Phobos, it makes that much less sense to keep aliases to Phobos 1 functions around for a long time. So, no, we shouldn't get rid of the deprecated alias for a Phobos 1 function after only a release or two, but I don't think that it makes sense to keep it around for a year or two either. - Jonathan M Davis
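For readers unfamiliar with the mechanism being debated, here is a minimal sketch of a deprecated alias in D. It assumes the present-day `alias new = old;` syntax and `deprecated("message")` attribute (code from the time of this thread would have written `deprecated alias toLower tolower;`), and the lowercase name `tolower` merely stands in for a renamed Phobos 1 function. It is an illustration, not how Phobos actually handled any particular rename.

```d
import std.stdio : writeln;
import std.uni : toLower;

// Hypothetical legacy spelling kept alive for a transition period.
// Any use of `tolower` still compiles, but the compiler emits a
// deprecation message pointing at the new name.
deprecated("tolower is deprecated; use toLower instead")
alias tolower = toLower;

void main()
{
    writeln(toLower("Hello, D")); // new name: hello, d
    writeln(tolower("Hello, D")); // old name: same result, plus a deprecation message at compile time
}
```

Removing the alias later turns the deprecation message into a hard error, which is the step whose timing is under discussion here.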
Re: eliminate junk from std.string?
Am 12.01.2011 03:10, schrieb Jonathan M Davis: On Tuesday, January 11, 2011 17:17:43 Daniel Gibson wrote: Am 12.01.2011 01:55, schrieb Jonathan M Davis: On Tuesday, January 11, 2011 16:23:13 Daniel Gibson wrote: Deprecating them is certainly a good idea, but I'd suggest to keep the deprecated aliases around for longer (until D3), so anybody porting a Phobos1-based application to D2/Phobos2 can use them, even if he doesn't do this within the next few releases. Well, leaving an alias until D3 would equate to a permanent alias in D2, which is exactly what Walter and Andrei don't want (and I don't either). There's already plenty in Phobos 2 that's different from Phobos 1. So, while I don't think that we should rename stuff just to rename stuff, I also don't think that we should keep aliases around just to make porting D1 code easier - especially when most D1 code is probably using Tango anyway. We don't really have a policy in place for how long deprecation should last prior to outright removal, but until D3 is definitely too long. I would have thought that the question would be more along the lines of whether it should be a couple of releases or more like 6 months to a year before removing deprecated functions and modules at this point, not whether something will remain deprecated until D3. - Jonathan M Davis Somewhere in this thread: Am 11.01.2011 21:43, schrieb Walter Bright: > Nick Sabalausky wrote: >> I agree with this reasoning for having them. However, I don't think it >> means we shouldn't D-ify or Phobos-ify them, at least as far as >> capitalization conventions. > > I also object to rather pointlessly annoying people wanting to move > their code from D1 to D2 by renaming everything. Endlessly renaming > things searching for the perfect name gives the illusion of progress, > whereas time would be better spent on improving the documentation, > unittests, performance, etc. 
So his objection was specifically that renaming those functions could annoy people migrating D1 code (and certainly he meant Phobos1 users, because Tango-people either port (parts of) Tango or will have to rewrite that anyway). So, to accomplish that goal (not annoying those people), these aliases should be kept for longer. (An alternative may be to have one/some phobos1-compat modules that contain such aliases and maybe even wrappers with old signatures for new functions, that could be imported to ease porting of old applications. That would have the benefit of not cluttering the regular Phobos2 modules with that legacy stuff.) Well, I didn't say that Walter wasn't concerned about it. I just don't see the point. Phobos has changed enough from D1 to D2 that, even for D1 Phobos users (of which I get the impression there are relatively few), there's probably already plenty of stuff which is going to break for anyone porting over. I do think that keeping a deprecated alias around longer for a function which has been around longer makes sense, and the Phobos 1 functions have been around longer than anything else. So, deprecating a function that was added 2 releases ago probably shouldn't require a deprecated alias for as long as deprecating a function that was in Phobos 1 would, but there's still a limit to how long it makes sense. And given that your average D1 user uses Tango rather than Phobos, it makes that much less sense to keep aliases to Phobos 1 functions around for a long time. So, no, we shouldn't get rid of the deprecated alias for a Phobos 1 function after only a release or two, but I don't think that it makes sense to keep it around for a year or two either. - Jonathan M Davis Hmm maybe. I guess there will be further similar discussions (e.g. the deprecation of std.stream once the successor is ready). I think those aliases should at least be kept until all Phobos1 stuff that is to be replaced is indeed replaced. 
That'd allow a decision that is at least consistent for most Phobos1 stuff (some has already been removed/replaced, e.g. by the druntime modules like core.thread). Cheers, - Daniel
Re: eliminate junk from std.string?
On 2011-01-11 18:00:51 -0500, Andrei Alexandrescu said: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. I support this. -- Michel Fortin michel.for...@michelf.com http://michelf.com/
Re: eliminate junk from std.string?
"Andrei Alexandrescu" wrote in message news:iginid$1rt...@digitalmars.com... > On 1/11/11 11:21 AM, Ary Borenszweig wrote: >> Why care where they come from? Why not make them intuitive? Say, like, >> "Always >> camel case"? > > If there's enough support for this, I'll do it. > I've already been arguing in favor of it :) vote++;
Re: eliminate junk from std.string?
Andrei Alexandrescu Wrote: > On 1/11/11 1:45 PM, Jerry Quinn wrote: > > Unclear if iswhite() refers to ASCII whitespace or Unicode. If Unicode, > > which version of the standard? > > Not sure. > > enum dchar LS = '\u2028'; /// UTF line separator > enum dchar PS = '\u2029'; /// UTF paragraph separator > > bool iswhite(dchar c) > { > return c <= 0x7F > ? indexOf(whitespace, c) != -1 > : (c == PS || c == LS); > } > > Which version? This looks pretty incomplete if the goal is to return true for any Unicode whitespace character. My comment was really that if we're going to offer things like this, they need to be more completely defined. > > Same comment for icmp(). Also, in the Unicode standard, case folding can > > depend on the specific language. > > That uses toUniLower. Not sure how that works. And doesn't mention details about the Unicode standard version it implements. > > You've got chop() marked as deprecated. Is popBack() going to make > > sense as something that removes a variable number of chars from a > > string in the CR-LF case? That might be a bit too magical. > > Well I found little use for chop in e.g. Perl. People either use chomp > or want to remove the last character. I think chop is useless. Agreed, chomp is more useful. My question is whether popBack() should automatically act like Perl's chomp() for strings or not? > > One set of functions I'd like to see are startsWith() and endsWith(). I > > find them frequently useful in Java and an irritating lack in the C++ > > standard library. > > Yah, those are in std.algorithm. Ideally we'd move everything that's > applicable beyond strings to std.algorithm. Ah, missed those. Jerry
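To make the "pretty incomplete" remark concrete, here is a rough sketch of what a fuller check could look like. The helper name `isUnicodeWhite` is hypothetical, and the code point list is my understanding of the Unicode White_Space property in recent versions of the standard, so it should be verified against the Unicode Character Database before being relied on. Present-day Phobos offers `std.uni.isWhite` for this job; the sketch is not that implementation.

```d
import std.stdio : writeln;

/// Hypothetical helper: true for code points believed to carry the
/// Unicode White_Space property (assumption: recent Unicode versions).
bool isUnicodeWhite(dchar c) pure nothrow @safe
{
    switch (c)
    {
        case '\t': .. case '\r':           // U+0009..U+000D: tab, LF, VT, FF, CR
        case ' ':                          // U+0020 space
        case '\u0085':                     // next line (NEL)
        case '\u00A0':                     // no-break space
        case '\u1680':                     // Ogham space mark
        case '\u2000': .. case '\u200A':   // en quad .. hair space
        case '\u2028':                     // line separator (LS above)
        case '\u2029':                     // paragraph separator (PS above)
        case '\u202F':                     // narrow no-break space
        case '\u205F':                     // medium mathematical space
        case '\u3000':                     // ideographic space
            return true;
        default:
            return false;
    }
}

void main()
{
    writeln(isUnicodeWhite('\u00A0')); // true
    writeln(isUnicodeWhite('\u2028')); // true
    writeln(isUnicodeWhite('x'));      // false
}
```

The quoted iswhite() would return false for U+00A0 and U+3000, which is the gap being pointed out; documenting which Unicode version such a table tracks would answer the "which version?" question.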
Re: eliminate junk from std.string?
Jerry Quinn Wrote: > > > Same comment for icmp(). Also, in the Unicode standard, case folding can > > > depend on the specific language. > > > > That uses toUniLower. Not sure how that works. > > And doesn't mention details about the Unicode standard version it implements. Actually it does. *munch* *munch* my words are delicious. It would be good to have better docs on what icmp() does. Also, it might make sense to do icmp() using Unicode case folding and normalization rather than simple lowercasing. Thinking out loud here.
Re: eliminate junk from std.string?
On 12/01/11 10:00, Andrei Alexandrescu wrote: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Yes please; it's got my vote.
Re: eliminate junk from std.string?
Andrei Alexandrescu wrote: > On 1/11/11 11:21 AM, Ary Borenszweig wrote: >> Why care where they come from? Why not make them intuitive? Say, like, >> "Always camel case"? > > If there's enough support for this, I'll do it. > > Andrei +1
Re: eliminate junk from std.string?
On Tue, 11 Jan 2011 15:00:51 -0800, Andrei Alexandrescu wrote: > On 1/11/11 11:21 AM, Ary Borenszweig wrote: >> Why care where they come from? Why not make them intuitive? Say, like, >> "Always camel case"? > > If there's enough support for this, I'll do it. > > Andrei ++vote IMO, this should be done throughout Phobos before it's too late. -Lars
Re: eliminate junk from std.string?
On 2011-01-12 00:00, Andrei Alexandrescu wrote: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei vote++ -- /Jacob Carlborg
Re: eliminate junk from std.string?
Andrei Alexandrescu wrote: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei ++vote. Bear in mind that with D's spell checker, the error message is: test.d(8): Error: undefined identifier tolower, did you mean function toLower? Which is pretty darn good.
Re: eliminate junk from std.string?
Andrei Alexandrescu Wrote: > If there's enough support for this, I'll do it. > > Andrei For what it's worth, +1 Paolo
Re: eliminate junk from std.string?
On 01/12/2011 07:22 AM, Jerry Quinn wrote: Jerry Quinn Wrote: Same comment for icmp(). Also, in the Unicode standard, case folding can depend on the specific language. That uses toUniLower. Not sure how that works. And doesn't mention details about the Unicode standard version it implements. Actually it does. *munch* *munch* my words are delicious. It would be good to have better docs on what icmp() does. Also, it might make sense to do icmp() using Unicode case folding and normalization rather than simple lowercasing. Thinking out loud here. You'll get this very soon. (see https://bitbucket.org/stephan/dunicode/src/bcf19471ebf9/unicodedata.d for details) Denis _ vita es estrany spir.wikidot.com
Re: eliminate junk from std.string?
Walter Bright Wrote: > Ary Borenszweig wrote: > > Agreed. So what's wrong with improving things and leaving old things as > > aliases? > > Clutter. > > One of the risks with Phobos development is it becoming a river miles wide, > and > only an inch deep. In other words, endless gobs of shallow, trite functions, > with very little depth. (Aliases are as shallow as they get!) > > As a general rule, I don't want functionality in Phobos that takes more time > for > a user to find/read/understand the documentation on than to reimplement it > himself. Those things give the illusion of comprehensiveness, but are just > useless wankery. > > Do we really want a 1000 page reference manual on Phobos, but no database > interface? No network interface? No D lexer? No disassembler? No superfast > XML > parser? No best-of-breed regex implementation? No CGI support? No HTML > parsing? > No sound support? No jpg reading? > > I worry by endless bikeshedding about perfecting the spelling of some name, > we > miss the whole show. > > I'd like to see more meat. For example, Don has recently added gamma > functions > to the math library. These are hard to implement correctly, and are perfect > for > inclusion. Trivial solution: have a separate set of modules that contain the backward compatible aliases. Have those modules documented *separately* in an appendix. No clutter and no problems. Want to use familiar functions from e.g. C++? Just use: import compatibility.cpp.string; instead of: import string; Providing such packages would help programmers to transition from other languages to D, and perhaps they should be optional add-ons to Phobos which are maintained separately.
Re: eliminate junk from std.string?
On Wed, 12 Jan 2011 08:00:51 +0900, Andrei Alexandrescu wrote: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. ++vote :) Masahiro
Re: eliminate junk from std.string?
On 12.01.2011 2:00, Andrei Alexandrescu wrote: > On 1/11/11 11:21 AM, Ary Borenszweig wrote: >> Why care where they come from? Why not make them intuitive? Say, >> like, "Always >> camel case"? > > If there's enough support for this, I'll do it. > > Andrei ++vote -- Dmitry Olshansky
Re: eliminate junk from std.string?
spir wrote: On 01/11/2011 09:11 PM, Ary Borenszweig wrote: "Welcome to D. Do you program in C, Javascript, Python or Ruby? Cool! Then you will feel at home." That phrase currently ends like this: "You don't? Oh, sorry, you will have to learn that some names are all lowercase, some not." But it could end like this: "You don't? Don't worry. D has the convention of writing all function names with X convention, but we keep some aliases for things that we want to keep backwards compatibility for." Yop. And anyway those legacy names are not all the same in C, Javascript, Python, Ruby, etc.. One has to be chosen or created for D, why not follow a guideline for the standard D name? (I really cannot (under)stand this general policy of sticking to wrong design choices from the past for generations and generations --even in brand new languages. How do improvements happen in other fields than programming? One day or the other, one needs to throw away old (mental) garbage.) Denis Yes, I recently did the same with many of the math functions. tgamma --> gamma, lgamma --> logGamma. It's pretty funny to try to find out why there is a 't' in front of 'tgamma' in C. http://pubs.opengroup.org/onlinepubs/009695399/functions/tgamma.html "RATIONALE This function is named tgamma() in order to avoid conflicts with the historical gamma() and lgamma() functions." And why the t? 't' stands for 'true'. Because the original gamma() had a bug. Bravo.
Re: eliminate junk from std.string?
On 1/11/2011 2:42 PM, Walter Bright wrote: Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? Because people are used to those names due to their wide use. It's the same reason that we still use Qwerty keyboards. I disagree strongly with this. I remember function names phonetically in my head. I don't want to have to remember which language it came from in order to know what the casing should be. And these function names aren't nearly as standardized as a Qwerty keyboard. And besides, D makes much more radical departures from other languages in other areas (which is usually a good thing). PHP is already ridiculed for its library having no internal consistency. I don't want the same thing to happen to Phobos.
Re: eliminate junk from std.string?
On 01/11/2011 07:17 PM, Jonathan M Davis wrote: > Renaming a function and having a deprecated alias to the old name for a few > releases to ease the transition would definitely be good practice. Aliasing a > function just to have another name for the same thing wouldn't be good > practice. > There has to be a real benefit to having the second name. Providing a smooth > deprecation route would be a case where there's a real benefit. How about wrapping the aliases up in, e.g., std.compatibility.{c, javascript, python, …} Well, maybe not in std; that might imply a commitment to keep up-to-date full library compatibility forever. But it might make a semi-useful contrib library. --Joel
Re: eliminate junk from std.string?
On 11/01/2011 23:00, Andrei Alexandrescu wrote: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei +1 vote. -- Bruno Medeiros - Software Engineer
D standard style [was: Re: eliminate junk from std.string?]
On 01/12/2011 12:07 AM, Daniel Gibson wrote: Am 12.01.2011 00:00, schrieb Andrei Alexandrescu: On 1/11/11 11:21 AM, Ary Borenszweig wrote: Why care where they come from? Why not make them intuitive? Say, like, "Always camel case"? If there's enough support for this, I'll do it. Andrei Please do, having different naming conventions of functions within the standard library makes it harder to remember the exact spelling of a function and also doesn't look professional. +1 vote for making the standard library comply with the D style guide[1] +1 as well But while we're at conventions, and before any change is actually done, we may take the opportunity to agree not only on morphology, but on semantics ;-) For instance, from online doc: string capitalize(string s); Capitalize first character of string s[], convert rest of string s[] to lower case. Then, use it: auto s = "capital"; s.capitalize(); writeln(s); // "capital" Uh? Not only the name is misleading, but the doc as well. For this kind of issue, some guidelines read like: * perform an action --> action verb (eg capitalise: changes the passed string) * return a result --> named after result (eg capitalised: return new string) Sure, the func's interface also tells the reader what's actually done. But having name (and doc) contradict it is not very helpful. And being forced to open the doc or even the source for every unknown bit is an annoying obstacle. There are probably other common issues like this. My personal evaluation is whether some newcomer can guess the purpose of the func, the type, the constant, etc... I would also vote for: * full words, except for rare exceptions used everywhere in programming _and_ really helpful (eg OS) * get rid of obscure, ambiguous, or misleading namings * when possible, use international words rather than english-only (eg section better than slice if everything else equal) Finally, take the opportunity to make the doc usable, eg: string format(...); Format arguments into a string. 
??? Denis _ vita es estrany spir.wikidot.com
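The capitalize() complaint above can be reproduced with a short sketch. It assumes the present-day `std.string.capitalize`, which returns a new string and leaves its argument untouched (D2 strings are immutable, so in-place modification is not possible anyway). The alias `capitalized` is hypothetical and only illustrates the "named after the result" guideline; it is not a Phobos name.

```d
import std.stdio : writeln;
import std.string : capitalize;

// Hypothetical result-style name, following the guideline
// "return a result --> named after result". Not part of Phobos.
alias capitalized = capitalize;

void main()
{
    string s = "capital";

    // capitalize() returns the converted string; `s` itself is unchanged,
    // so discarding the return value has no visible effect.
    string t = capitalize(s);
    writeln(s); // capital
    writeln(t); // Capital

    // The same call under a name that reads as a result, not an action.
    writeln(s.capitalized); // Capital
}
```

Whether a verb or a participle reads better is exactly the style question raised in this message; the sketch only shows that the behavior matches the "return a result" case.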
Re: D standard style [was: Re: eliminate junk from std.string?]
On 2011-01-11 20:28:27 -0500, spir said: But while we're at conventions, and before any change is actually done, we may take the opportunity to agree not only on morphology, but on semantics ;-) For instance, from online doc: string capitalize(string s); Capitalize first character of string s[], convert rest of string s[] to lower case. Then, use it: auto s = "capital"; s.capitalize(); writeln(s); // "capital" Uh? Not only the name is misleading, but the doc as well. For this kind of issue, some guidelines read like: * perform an action --> action verb (eg capitalise: changes the passed string) * return a result --> named after result (eg capitalised: return new string) Sure, the func's interface also tells the reader what's actually done. But having name (and doc) contradict it is not very helpful. And being forced to open the doc or even the source for every unknown bit is an annoying obstacle. There are probably other common issues like this. My personal evaluation is whether some newcomer can guess the purpose of the func, the type, the constant, etc... I would also vote for: * full words, except for rare exceptions used everywhere in programming _and_ really helpful (eg OS) * get rid of obscure, ambiguous, or misleading namings * when possible, use international words rather than english-only (eg section better than slice if everything else equal) I support this too. Names should be easy to read. That said, I'm not exactly sure what you mean by this "use international words" recommendation. I really don't get why "section" would be better than "slice". Words that exist in other languages don't always have the exact same meaning as in English, so they might also be more confusing to an international audience. I'd stick with the "choose a meaningful word" rule. -- Michel Fortin michel.for...@michelf.com http://michelf.com/