> On Dec 7, 2021, at 22:09, Taras Zakharko <taras.zakha...@uzh.ch> wrote:
> 
> Great summary, Avi. 
> 
> String concatenation could be trivially added to R, but it probably should
> not be. You will notice that modern languages tend not to use “+” to do
> string concatenation (they either have a custom operator or a special kind
> of pattern to do it) due to practical issues such an approach brings
> (implicit type casting, lack of commutativity, performance etc.). These
> issues will be felt even more so in R with its weak typing, idiosyncratic
> casting behavior and NAs.
> 
> As others have pointed out, any kind of behavior one wants from string
> concatenation can be implemented by custom operators as needed. This is not
> something that needs to be in base R. I would rather like the efforts to
> be directed at improving string formatting (such as glue-style built-in
> string interpolation).
> 

This is getting OT, but there is a very good reason why string interpolation is 
not in core R. As I recall it has been considered some time ago, but it is very 
dangerous as it implies evaluation on constants which opens a huge security 
hole and has questionable semantics (where you evaluate etc). Hence it's much 
easier to ban a package than to hack it out of R ;).
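
To make the concern concrete, here is a minimal sketch of a naive glue-style
interpolator in base R. The helper name interpolate() and the {expr} syntax
are hypothetical and chosen only for illustration; this is not how glue or
any proposed base R feature is implemented. It only shows that interpolation
means parsing and evaluating code found inside a string, in some environment
that has to be chosen.

```r
# Hypothetical helper (not part of base R): a naive glue-style interpolator.
# It replaces every {expr} in `template` with the result of evaluating expr.
interpolate <- function(template, envir = parent.frame()) {
  # Locate every {...} chunk in the template
  m <- gregexpr("\\{[^{}]*\\}", template)
  chunks <- regmatches(template, m)[[1]]
  # Strip the braces, then parse and evaluate each chunk as R code
  values <- vapply(chunks, function(chunk) {
    code <- substr(chunk, 2, nchar(chunk) - 1)
    as.character(eval(parse(text = code), envir = envir))
  }, character(1), USE.NAMES = FALSE)
  regmatches(template, m) <- list(values)
  template
}

# Benign use: the string refers to a variable in the caller's environment
name <- "world"
greeting <- interpolate("hello {name}")

# The same mechanism runs arbitrary code when the string comes from outside:
# here the "constant" silently modifies a variable in the caller's environment
flag <- FALSE
untrusted <- "total: {flag <- TRUE; 42}"
result <- interpolate(untrusted)
```

Note that the string in `untrusted` looks like plain data, yet evaluating it
changes `flag`. The choice of `envir` is exactly the "where you evaluate"
question: a different default would give different results for the same
string.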

Cheers,
Simon


> — Taras
> 
> 
>> On 7 Dec 2021, at 02:27, Avi Gross via R-devel <r-devel@r-project.org> wrote:
>> 
>> After seeing what others are saying, it is clear that you need to carefully
>> think things out before designing any implementation of a more native
>> concatenation operator whether it is called "+" or anything else. There may
>> not be any ONE right solution but unlike a function version like paste()
>> there is nowhere to place any options that specify what you mean.
>> 
>> You can obviously expand paste() to accept arguments like replace.NA="" or
>> replace.NA="<NA>" and similar arguments on what to do if you see a NaN, an
>> Inf or -Inf, a NULL or even an NA_character_ and so on. Heck, you might tell
>> to make other substitutions as in substitute=list(100=99, D=F) or any other
>> nonsense you can come up with.
>> 
>> But you have nowhere to put options when saying:
>> 
>> c <- a + b
>> 
>> Sure, you could set various global options before the addition and maybe
>> reset them after, but that is not a way I like to go for something this
>> basic.
>> 
>> And enough such tinkering makes me wonder if it is easier to ask a user to
>> use a slightly different function like this:
>> 
>> paste.no.na <- function(...) do.call(paste, Filter(Negate(is.na),
>> list(...)))
>> 
>> The above one-line function removes any NA from the argument list to make a
>> potentially shorter list before calling the real paste() using it.
>> 
>> Variations can, of course, be made that allow functionality as above. 
>> 
>> If R was a true object-oriented language in the same sense as others like
>> Python, operator overloading of "+" might be doable in more complex ways but
>> we can only work with what we have. I tend to agree with others that in some
>> places R is so lenient that all kinds of errors can happen because it makes
>> a guess on how to correct it. Generally, if you really want to mix numeric
>> and character, many languages require you to transform any arguments to make
>> all of compatible types. The paste() function is clearly stated to coerce
>> all arguments to be of type character for you. Whereas a+b makes no such
>> promises and also is not properly defined even if a and b are both of type
>> character. Sure, we can expand the language but it may still do things some
>> find not to be quite what they wanted as in "2"+"3" becoming "23" rather
>> than 5. Right now, I can use as.numeric("2")+as.numeric("3") and get the
>> intended result after making very clear to anyone reading the code that I
>> wanted strings converted to floating point before the addition.
>> 
>> As has been pointed out, the plus operator if used to concatenate does not
>> have a cognate for other operations like -*/ and R has used most other
>> special symbols for other purposes. So, sure, we can use something like ....
>> (4 periods) if it is not already being used for something but using + here
>> is a tad confusing. Having said that, the makers of Python did make that
>> choice.
>> 
>> -----Original Message-----
>> From: R-devel <r-devel-boun...@r-project.org> On Behalf Of Gabriel Becker
>> Sent: Monday, December 6, 2021 7:21 PM
>> To: Bill Dunlap <williamwdun...@gmail.com>
>> Cc: Radford Neal <radf...@cs.toronto.edu>; r-devel <r-devel@r-project.org>
>> Subject: Re: [Rd] string concatenation operator (revisited)
>> 
>> As I recall, there was a large discussion related to that which resulted in
>> the recycle0 argument being added (but defaulting to FALSE) for
>> paste/paste0.
>> 
>> I think a lot of these things ultimately mean that if there were to be a
>> string concatenation operator, it probably shouldn't have behavior identical
>> to paste0. Was that what you were getting at as well, Bill?
>> 
>> ~G
>> 
>> On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap <williamwdun...@gmail.com> wrote:
>> 
>>> Should paste0(character(0), c("a","b")) give character(0)?
>>> There is a fair bit of code that assumes that paste("X",NULL) gives "X"
>>> but c(1,2)+NULL gives numeric(0).
>>> 
>>> -Bill
>>> 
>>> On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch 
>>> <murdoch.dun...@gmail.com>
>>> wrote:
>>> 
>>>> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
>>>>> Gabe, I agree that missingness is important to factor in. To somewhat
>>>>> abuse the terminology, NA is often used to represent missingness.
>>>>> Perhaps concatenating character something with character something
>>>>> missing should result in the original character?
>>>> 
>>>> I think that's a bad idea.  If you wanted to represent an empty 
>>>> string, you should use "" or NULL, not NA.
>>>> 
>>>> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it 
>>>> should give NA.
>>>> 
>>>> Duncan Murdoch
>>>> 
>>>>> 
>>>>> Avi
>>>>> 
>>>>> On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker <gabembec...@gmail.com> wrote:
>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> Seeing this and the other thread (and admittedly not having clicked
>>>>>> through to the linked r-help thread), I wonder about NAs.
>>>>>> 
>>>>>> Should NA <concat> "hi there" not result in NA_character_? This is not
>>>>>> what any of the paste functions do, but in my opinion, NA +
>>>>>> <non_na_value> seems like it should be NA (not "NA"), particularly if
>>>>>> we are talking about `+` overloading, but potentially even in the case
>>>>>> of a distinct concatenation operator?
>>>>>> 
>>>>>> I guess what I'm saying is that in my head missingness propagation
>>>>>> rules should take priority in such an operator (i.e., NA + <anything>
>>>>>> should *always* be NA).
>>>>>> 
>>>>>> Is that something others disagree with, or has it just not come up yet
>>>>>> in (the parts I have read) of this discussion?
>>>>>> 
>>>>>> Best,
>>>>>> ~G
>>>>>> 
>>>>>> On Mon, Dec 6, 2021 at 10:03 AM Radford Neal 
>>>>>> <radf...@cs.toronto.edu>
>>>>>> wrote:
>>>>>> 
>>>>>>>>> In pqR (see pqR-project.org), I have implemented ! and !! as 
>>>>>>>>> binary string concatenation operators, equivalent to paste0 and 
>>>>>>>>> paste, respectively.
>>>>>>>>> 
>>>>>>>>> For instance,
>>>>>>>>> 
>>>>>>>>>> "hello" ! "world"
>>>>>>>>>     [1] "helloworld"
>>>>>>>>>> "hello" !! "world"
>>>>>>>>>     [1] "hello world"
>>>>>>>>>> "hello" !! 1:4
>>>>>>>>>     [1] "hello 1" "hello 2" "hello 3" "hello 4"
>>>>>>>> 
>>>>>>>> I'm curious about the details:
>>>>>>>> 
>>>>>>>> Would `1 ! 2` convert both to strings?
>>>>>>> 
>>>>>>> They're equivalent to paste0 and paste, so 1 ! 2 produces "12", 
>>>>>>> just like paste0(1,2) does.  Of course, they wouldn't have to be 
>>>>>>> exactly equivalent to paste0 and paste - one could impose 
>>>>>>> stricter requirements if that seemed better for error detection.  
>>>>>>> Off hand, though, I think automatically converting is more in
>>>>>>> keeping with the rest of R.  Explicitly converting with as.character
>>>>>>> could be tedious.
>>>>>>> 
>>>>>>> I suppose disallowing logical arguments might make sense to guard 
>>>>>>> against typos where ! was meant to be the unary-not operator, but 
>>>>>>> ended up being a binary operator, after some sort of typo.  I 
>>>>>>> doubt that this would be a common error, though.
>>>>>>> 
>>>>>>> (Note that there's no ambiguity when there are no typos, except
>>>>>>> that when negation is involved a space may be needed - so, for
>>>>>>> example, "x" !  !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE".
>>>>>>> Existing uses of double negation are still fine - eg, a <- !!TRUE
>>>>>>> still sets a to TRUE.  Parsing of operators is greedy, so
>>>>>>> "x"!!!TRUE is "x FALSE", not "xTRUE".)
>>>>>>> 
>>>>>>>> Where does the binary ! fit in the operator priority?  E.g. how 
>>>>>>>> is
>>>>>>>> 
>>>>>>>>  a ! b > c
>>>>>>>> 
>>>>>>>> parsed?
>>>>>>> 
>>>>>>> As (a ! b) > c.
>>>>>>> 
>>>>>>> Their precedence is between that of + and - and that of < and >.
>>>>>>> So "x" ! 1+2 evaluates to "x3" and "x" ! 1+2 < "x4" is TRUE.
>>>>>>> 
>>>>>>> (Actually, pqR also has a .. operator that fixes the problems 
>>>>>>> with generating sequences with the : operator, and it has 
>>>>>>> precedence lower than + and - and higher than ! and !!, but 
>>>>>>> that's not relevant if you don't have the .. operator.)
>>>>>>> 
>>>>>>>   Radford Neal
>>>>>>> 
>>>>>>> ______________________________________________
>>>>>>> R-devel@r-project.org mailing list 
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>> 