Re: [Rd] surprising behaviour of names<-
Thomas Lumley wrote:
> Wacek,
>
> In this case I think the *tmp* dates from the days before backticks,
> when it was not a legal name (it still isn't) and it was much, much
> harder to use illegal names, so the collision issue really didn't exist.

thanks for the explanation.

> You're right about the documentation.

thanks for the acknowledgement.

vQ

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > >> '*tmp*' = 0 >> `*tmp*` >> # 0 >> >> x = 1 >> names(x) = 'foo' >> `*tmp*` >> # error: object "*tmp*" not found >> >> `*ugly*` >> > > I agree, and I am a bit flabbergasted. I had not expected that > something like this would happen and I am indeed not aware of anything > in the documentation that warns about this; but others may prove me > wrong on this. > hopefully. > >> given that `*tmp*`is a perfectly legal (though some would say >> 'non-standard') name, it would be good if somewhere here a warning >> were issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is >> not just any non-standard name, but one that is 'obviously' used >> under the hood to perform black magic. >> > > Now I wonder whether there are any other objects (with non-standard) > names) that can be nuked by operations performed under the hood. > any such risk should be clearly documented, if not with a warning issued each time the user risks h{is,er} workspace corrupted by the under-the-hood. > I guess the best thing is to stay away from non-standard names, if only > to save the typing of back-ticks. :) > agree. but then, there may be -- and probably are -- other such 'best to stay away' things in r, all of which should be documented so that a user know what may happen on the surface, *without* having to peek under the hood. > Thanks for letting me know, I have learned something new today. > wow. most of my fiercely truculent ranting is meant to point out things that may not be intentional, or if they are, they seem to me design flaws rather than features -- so that either i learn that i am ignorant or wrong, or someone else does, pro bono. hopefully. vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Wacek, In this case I think the *tmp* dates from the days before backticks, when it was not a legal name (it still isn't) and it was much, much harder to use illegal names, so the collision issue really didn't exist. You're right about the documentation. -thomas On Sun, 15 Mar 2009, Wacek Kusnierczyk wrote: Berwin A Turlach wrote: Obviously, assuming that R really executes *tmp* <- x x <- "names<-"('*tmp*', value=c("a","b")) under the hood, in the C code, then *tmp* does not end up in the symbol table and does not persist beyond the execution of names(x) <- c("a","b") to prove that i take you seriously, i have peeked into the code, and found that indeed there is a temporary binding for *tmp* made behind the scenes -- sort of. unfortunately, it is not done carefully enough to avoid possible interference with the user's code: '*tmp*' = 0 `*tmp*` # 0 x = 1 names(x) = 'foo' `*tmp*` # error: object "*tmp*" not found `*ugly*` given that `*tmp*`is a perfectly legal (though some would say 'non-standard') name, it would be good if somewhere here a warning were issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is not just any non-standard name, but one that is 'obviously' used under the hood to perform black magic. it also appears that the explanation given in, e.g., the r language definition (draft, of course) sec. 3.4.4: " Assignment to subsets of a structure is a special case of a general mechanism for complex assignment: x[3:5] <- 13:15 The result of this commands is as if the following had been executed ‘*tmp*‘ <- x x <- "[<-"(‘*tmp*‘, 3:5, value=13:15) " is incomplete (because the final result is not '*tmp*' having the value of x, as it might seem, but rather '*tmp*' having been unbound). so the suggestion for the documenters is to add to the end of the section (or wherever else it is appropriate) a warning to the effect that in the end '*tmp*' will be removed, even if the user has explicitly defined it earlier in the same scope. 
or maybe have the implementation not rely on a user-forgeable name? for example, the '.Last.value' name is automatically bound to the most recently returned value, but it resides in package:base and does not collide with bindings using it made by the user: .Last.value = 0 1 .Last.value # 0, not 1 1 base::.Last.value # 1, not 0 why could not '*tmp*' be bound and unbound outside of the user's namespace? (i guess it's easier to update the docs -- or just ignore the issue.) on the margin, traceback('<-') will pick only one of the uses of '<-' suggested by the code above: x <- 1:10 trace('<-') x[3:5] <- 13:15 # trace: x[3:5] <- 13:15 # trace: x <- `[<-`(`*tmp*`, 3:5, value = 13:15) which is somewhat confusing, because then '*tmp*' appears in the trace somewhat ex machina. (again, the explanation is in the source code, but the traceback could have been more informative.) cheers, vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
G'day Wacek, On Sun, 15 Mar 2009 21:01:33 +0100 Wacek Kusnierczyk wrote: > Berwin A Turlach wrote: > > > > Obviously, assuming that R really executes > > *tmp* <- x > > x <- "names<-"('*tmp*', value=c("a","b")) > > under the hood, in the C code, then *tmp* does not end up in the > > symbol table and does not persist beyond the execution of > > names(x) <- c("a","b") > > > > > > to prove that i take you seriously, i have peeked into the code, and > found that indeed there is a temporary binding for *tmp* made behind > the scenes -- sort of. unfortunately, it is not done carefully enough > to avoid possible interference with the user's code: > > '*tmp*' = 0 > `*tmp*` > # 0 > > x = 1 > names(x) = 'foo' > `*tmp*` > # error: object "*tmp*" not found > > `*ugly*` I agree, and I am a bit flabbergasted. I had not expected that something like this would happen and I am indeed not aware of anything in the documentation that warns about this; but others may prove me wrong on this. > given that `*tmp*`is a perfectly legal (though some would say > 'non-standard') name, it would be good if somewhere here a warning > were issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is > not just any non-standard name, but one that is 'obviously' used > under the hood to perform black magic. Now I wonder whether there are any other objects (with non-standard) names) that can be nuked by operations performed under the hood. I guess the best thing is to stay away from non-standard names, if only to save the typing of back-ticks. :) Thanks for letting me know, I have learned something new today. Cheers, Berwin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > > Obviously, assuming that R really executes > *tmp* <- x > x <- "names<-"('*tmp*', value=c("a","b")) > under the hood, in the C code, then *tmp* does not end up in the symbol > table and does not persist beyond the execution of > names(x) <- c("a","b") > > to prove that i take you seriously, i have peeked into the code, and found that indeed there is a temporary binding for *tmp* made behind the scenes -- sort of. unfortunately, it is not done carefully enough to avoid possible interference with the user's code: '*tmp*' = 0 `*tmp*` # 0 x = 1 names(x) = 'foo' `*tmp*` # error: object "*tmp*" not found `*ugly*` given that `*tmp*`is a perfectly legal (though some would say 'non-standard') name, it would be good if somewhere here a warning were issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is not just any non-standard name, but one that is 'obviously' used under the hood to perform black magic. it also appears that the explanation given in, e.g., the r language definition (draft, of course) sec. 3.4.4: " Assignment to subsets of a structure is a special case of a general mechanism for complex assignment: x[3:5] <- 13:15 The result of this commands is as if the following had been executed ‘*tmp*‘ <- x x <- "[<-"(‘*tmp*‘, 3:5, value=13:15) " is incomplete (because the final result is not '*tmp*' having the value of x, as it might seem, but rather '*tmp*' having been unbound). so the suggestion for the documenters is to add to the end of the section (or wherever else it is appropriate) a warning to the effect that in the end '*tmp*' will be removed, even if the user has explicitly defined it earlier in the same scope. or maybe have the implementation not rely on a user-forgeable name? 
for example, the '.Last.value' name is automatically bound to the most recently returned value, but it resides in package:base and does not collide with bindings using it made by the user: .Last.value = 0 1 .Last.value # 0, not 1 1 base::.Last.value # 1, not 0 why could not '*tmp*' be bound and unbound outside of the user's namespace? (i guess it's easier to update the docs -- or just ignore the issue.) on the margin, traceback('<-') will pick only one of the uses of '<-' suggested by the code above: x <- 1:10 trace('<-') x[3:5] <- 13:15 # trace: x[3:5] <- 13:15 # trace: x <- `[<-`(`*tmp*`, 3:5, value = 13:15) which is somewhat confusing, because then '*tmp*' appears in the trace somewhat ex machina. (again, the explanation is in the source code, but the traceback could have been more informative.) cheers, vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
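For anyone who wants to try this at home, here is a minimal sketch in R (the variable names x, y, z are only illustrative and not from the thread). The first part checks that the infix subassignment and a prefix call to `[<-` produce the same value. The second part binds the non-syntactic name `*tmp*`, runs a complex assignment in the same environment, and reports whether the binding is still there. That outcome is deliberately left unasserted, on the assumption that it may depend on the R version and on whether the code is byte-compiled.

```r
# Infix complex assignment versus a prefix call to the replacement
# function, written without the temporary variable.
x <- 1:10
y <- 1:10
x[3:5] <- 13:15
y <- `[<-`(y, 3:5, value = 13:15)
same_result <- identical(x, y)
print(same_result)

# Bind the non-syntactic name `*tmp*`, perform a complex assignment in
# the same environment, then look the name up again.
assign("*tmp*", 0)
z <- 1
names(z) <- "foo"
tmp_survived <- exists("*tmp*", inherits = FALSE)
print(tmp_survived)  # may differ between R versions; not asserted here
```

If tmp_survived comes out FALSE, the user's binding was removed by the assignment machinery, which is the behaviour reported above.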
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > On Sat, 14 Mar 2009 07:22:34 +0100 > Wacek Kusnierczyk wrote: > > [...] > >>> Well, I don't see any new object created in my workspace after >>> x <- 4 >>> names(x) <- "foo" >>> Do you? >>> >>> >> of course not. that's why i'd say the two above are *not* >> equivalent. >> >> i haven't noticed the 'in the c code'; do you mean the r interpreter >> actually generates, in the c code, such r expressions for itself to >> evaluate? >> > > As I said before, I have little knowledge about how the parser works and > what goes on under the hood; and I have also little time and > inclination to learn about it. > > But if you are interested in these details, then by all means invest > the time to investigate. > > berwin, you're playing radio erewan now. i talk about what the user sees at the interface, and you talk about c code. then you admit you don't know the code, and suggest i examine it if i'm interested. i incidentally am, but the whole point was that the user should not be forced to look under the hood to know the interface to a function. prefix 'names<-' seems to have a certain behaviour that is not properly documented. > Alternatively, you would hope that Simon eventually finishes the book > that he is writing on programming in R; as I understand it, that book > would explain part of these issues in details. Hopefully, along with > the book he makes the tools that he has for introspection available. > simon: i'd be happy to contribute in any way you might find useful. > i guess you have looked under the hood; point me to the relevant code. >>> No I did not, because I am not interested in knowing such intimate >>> details of R, but it seems you were interested. >>> >>> >> yes, but then your claim about what happens under the hood, in the c >> code, is a pure stipulation. >> > > I made no claim about what is going on under the hood because I have no > knowledge about these matters. But, yes, I was speculating of what > might go on. 
> owe me a beer. > >> and you got the example from the r language definition sec. 10.2, >> which says the forms are equivalent, with no 'under the hood, in the >> c code' comment. >> > > Trying to figure out what a writer/painter actually means/says beyond > the explicitly stated/painted, something that is summed up in Australia > (and other places) under the term "critical thinking", was not high in > the curriculum of your school, was it? :-) > sure, but probably not the way you seem to think about. have you incidentally read ferdydurke by gombrowicz? > >> you're just showing that your statements cannot be taken seriously. >> > > Usually, my statement can be taken seriously, unless followed by some > indication that I said them tongue-in-cheek. Of course, statements > that I allegedly made but were in fact put into my mouth cannot, and > should not, be taken seriously. > i'm talking about your speculations about what the parser does (wrt. infix and prefix forms having exactly the same parse tree), rather vague statements such as "'names<-'(x,'foo') should create (more or less) a parse tree equivalent to that expression", and other statements (surely, qualified with 'assuming', 'strongly suggests', and the like), coupled with your admitting that you in fact don't know what happens there, is not particularly reassuring. > yes, *if* you are able to predict the refcount of the object passed to 'names<-' *then* you can predict what 'names<-' will do, [...] >>> I think Simon pointed already out that you seem to have a wrong >>> picture of what is going on. [...] >>> >> so what you quote effectively talks about a specific refcount >> mechanism. it's not refcount that would be used by the garbage >> collector, but it's a refcount, or maybe refflag. >> > > Fair enough, if you call this a refcount then there is no problem.
> Whenever I came across the term refcount in my readings, it was > referring to different mechanisms, typically mechanisms that kept exact > track on how often an object was referred too. So I would not call the > value of the named field a refcount. And we can agree to call it from > now on a refcount as long as we realise what mechanism is really used. > the major point of the discussion was that 'names<-' will sometimes modify and othertimes copy its argument. you chose to justify this by looking under the hood, and i suppose you were pretty clear what i meant by refcount, because it should have been clear from the context. > > >> yes, that's my opinion: the effects of implementation tricks should >> not be observable by the user, because they can lead to hard to >> explain and debug behaviour in the user's program. you surely don't >> suggest that all users consult the source code before writing >> programs in r. >> > > Indeed, I am not suggesting
Re: [Rd] surprising behaviour of names<-
On Sat, 14 Mar 2009 07:22:34 +0100 Wacek Kusnierczyk wrote: [...] > > Well, I don't see any new object created in my workspace after > > x <- 4 > > names(x) <- "foo" > > Do you? > > > > of course not. that's why i'd say the two above are *not* > equivalent. > > i haven't noticed the 'in the c code'; do you mean the r interpreter > actually generates, in the c code, such r expressions for itself to > evaluate? As I said before, I have little knowledge about how the parser works and what goes on under the hood; and I have also little time and inclination to learn about it. But if you are interested in these details, then by all means invest the time to investigate. Alternatively, you would hope that Simon eventually finishes the book that he is writing on programming in R; as I understand it, that book would explain part of these issues in details. Hopefully, along with the book he makes the tools that he has for introspection available. > >> i guess you have looked under the hood; point me to the relevant > >> code. > > > > No I did not, because I am not interested in knowing such intimate > > details of R, but it seems you were interested. > > > > yes, but then your claim about what happens under the hood, in the c > code, is a pure stipulation. I made no claim about what is going on under the hood because I have no knowledge about these matters. But, yes, I was speculating of what might go on. > and you got the example from the r language definition sec. 10.2, > which says the forms are equivalent, with no 'under the hood, in the > c code' comment. Trying to figure out what a writer/painter actually means/says beyond the explicitly stated/painted, something that is summed up in Australia (and other places) under the term "critical thinking", was not high in the curriculum of your school, was it? :-) > you're just showing that your statements cannot be taken seriously. 
Usually, my statement can be taken seriously, unless followed by some indication that I said them tongue-in-cheek. Of course, statements that I allegedly made but were in fact put into my mouth cannot, and should not, be taken seriously. > >> yes, *if* you are able to predict the refcount of the object > >> passed to 'names<-' *then* you can predict what 'names<-' will do, > >> [...] > > > > I think Simon pointed already out that you seem to have a wrong > > picture of what is going on. [...] > > so what you quote effectively talks about a specific refcount > mechanism. it's not refcount that would be used by the garbage > collector, but it's a refcount, or maybe refflag. Fair enough, if you call this a refcount then there is no problem. Whenever I came across the term refcount in my readings, it was referring to different mechanisms, typically mechanisms that kept exact track on how often an object was referred too. So I would not call the value of the named field a refcount. And we can agree to call it from now on a refcount as long as we realise what mechanism is really used. > yes, that's my opinion: the effects of implementation tricks should > not be observable by the user, because they can lead to hard to > explain and debug behaviour in the user's program. you surely don't > suggest that all users consult the source code before writing > programs in r. Indeed, I am not suggesting this. Only users who use/rely on features that are not sufficiently documented would have to study the source code to find out what the exact behaviour is. But, of course, this could be fraught with danger since the behaviour could change without warning. > i have indeed learned what prefix 'names<-' does and now i know that > the surprising behaviour is due to the observability of the internal > optimization. > > thanks to simon, peter, and you for your answers which allowed me to > learn this ugly detail. You are welcome. 
Cheers, Berwin
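The 'call by value' illusion described in the quoted manual passage can be checked from the interface alone, without reading any C code. A minimal sketch in R (the variable names are arbitrary): after b <- a the two names may share one underlying object, yet modifying one through the ordinary infix forms never changes the other.

```r
# After `b <- a` both names may refer to the same object internally;
# the copy is deferred until one of them is modified.
a <- c(1, 2, 3)
b <- a
b[1] <- 99
print(a)  # a keeps its original elements
print(b)

# The same holds for the infix replacement form names(y) <- value.
x <- c(1, 2)
y <- x
names(y) <- c("p", "q")
print(names(x))  # x has no names attribute
print(names(y))
```

This only exercises the documented infix forms. The prefix call 'names<-'(x, value) is the case the thread argues about, and it is deliberately not shown here.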
Re: [Rd] surprising behaviour of names<-
On Fri, 13 Mar 2009, William Dunlap wrote:

> Would it make anyone any happier if the manual said that the
> replacement functions should not be called in the form
>     xNew <- `func<-` (xOld, value)
> and should only be used as
>     func(xToBeChanged) <- value
> ?

That was my reaction, too. The discussion reminded me of old comp.lang.c threads about i=i++ and similar issues. The anomalies in

    xNew <- `func<-` (xOld, value)

arise precisely because it isn't supposed to be used that way.

My other proposal for 'rigidly defined areas of doubt and uncertainty' has been the evaluation order of the *apply family (eg, does apply process the columns left to right, or right to left, or however it feels like?).

-thomas

Thomas Lumley, Assoc. Professor, Biostatistics
tlum...@u.washington.edu, University of Washington, Seattle
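To make the two call forms concrete, here is a small sketch in R of a user-defined replacement function used only in the func(x) <- value form. The name `second<-` is hypothetical and invented for this illustration. The one firm requirement is that the last argument is called value.

```r
# A hypothetical replacement function: sets the second element of x.
# The final argument of a replacement function must be named `value`.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

x <- 1:5
second(x) <- 10L  # the form func(xToBeChanged) <- value
print(x)
```

Because x is modified inside the function body and then returned, the caller's x is rebound to the result and no other variable is touched.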
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > On Fri, 13 Mar 2009 19:41:42 +0100 > Wacek Kusnierczyk wrote: > > > >> indeed, you said "R supposedly uses call-by-value (though we know how >> to circumvent that, don't we?)". >> >> in that vain, R supposedly can be used to do valid statistical >> computations (though we know how to circumvent it) ;) >> > > Sure, use Excel? ;-) > no, it has a buggy round > > >>> Indeed, if you type these two commands on the command line, then it >>> is not surprising that a copy of tmp is returned since you create a >>> temporary object that ends up in the symbol table and persist after >>> the commands are finished. >>> >>> >> what does command line have to do with it? >> > > If you want to find out what goes on under the hood, it is not > necessarily sufficient to do the same calculations on the command line. > > >>> Obviously, assuming that R really executes >>> *tmp* <- x >>> x <- "names<-"('*tmp*', value=c("a","b")) >>> under the hood, in the C code, then *tmp* does not end up in the >>> symbol table >>> >> no? >> > > Well, I don't see any new object created in my workspace after > x <- 4 > names(x) <- "foo" > Do you? > of course not. that's why i'd say the two above are *not* equivalent. i haven't noticed the 'in the c code'; do you mean the r interpreter actually generates, in the c code, such r expressions for itself to evaluate? > >> i guess you have looked under the hood; point me to the relevant >> code. >> > > No I did not, because I am not interested in knowing such intimate > details of R, but it seems you were interested. > yes, but then your claim about what happens under the hood, in the c code, is a pure stipulation. and you got the example from the r language definition sec. 10.2, which says the forms are equivalent, with no 'under the hood, in the c code' comment. you're just showing that your statements cannot be taken seriously. 
> > >> yes, *if* you are able to predict the refcount of the object passed to >> 'names<-' *then* you can predict what 'names<-' will do, [...] >> > > I think Simon pointed already out that you seem to have a wrong > picture of what is going on. As far as I know, there is no refcount > for objects. > > The relevant documentation would be R Language Manual, 1.1 SEXPs: > > What R users think of as variables or objects are symbols which are > bound to a value. The value can be thought of as either a SEXP (a > pointer), or the structure it points to, a SEXPREC (and there are > alternative forms used for vectors, namely VECSXP pointing to > VECTOR_SEXPREC structures). > > and 1.1.2 Rest of header: > > The named field is set and accessed by the SET_NAMED > and NAMED macros, and take values 0, 1 and 2. R has a `call by value' > illusion, so an assignment like > > b <- a > > appears to make a copy of a and refer to it as b. However, if neither > a nor b are subsequently altered there is no need to copy. What really > happens is that a new symbol b is bound to the same value as a and the > named field on the value object is set (in this case to 2). When an > object is about to be altered, the named field is consulted. A value > of 2 means that the object must be duplicated before being changed. > (Note that this does not say that it is necessary to duplicate, only > that it should be duplicated whether necessary or not.) A value of 0 > means that it is known that no other SEXP shares data with this > object, and so it may safely be altered. A value of 1 is used for > situations like > > dim(a) <- c(7, 2) > > where in principle two copies of a exist for the duration of the > computation as (in principle) > > a <- `dim<-`(a, c(7, 2)) > > but for no longer, and so some primitive functions can be optimized to > avoid a copy in this case. > > so what you quote effectively talks about a specific refcount mechanism. 
it's not refcount that would be used by the garbage collector, but it's a refcount, or maybe refflag. >> and in general, this should not matter because it should be >> unobservable, but it isn't. >> > > That's your opinion (to which you are entitled). yes, that's my opinion: the effects of implementation tricks should not be observable by the user, because they can lead to hard to explain and debug behaviour in the user's program. you surely don't suggest that all users consult the source code before writing programs in r. > Unfortunately (for > you), the designers of R decided on a design which allows them to > reduce the number of copies that have to be made. > and that's excellent, only that they failed to hide the mechanism below the interface. or maybe they decided not to hide it? > I was under the impression that you were interested to understand what > happens if you issue the commands > names(x) <- "foo" > and > "names<-"(x, "foo") >
Re: [Rd] surprising behaviour of names<-
On Fri, 13 Mar 2009 19:41:42 +0100 Wacek Kusnierczyk wrote: > > Glad to see that we agree on this. > > > > owe you a beer. O.k., if we ever meet is is first your shout and then mine. > >> haven't objected to that. i object to your 'r uses pass by value', > >> which is only partially correct. > >> > > > > Well, I used qualifiers and did not stated it categorically. > > > > indeed, you said "R supposedly uses call-by-value (though we know how > to circumvent that, don't we?)". > > in that vain, R supposedly can be used to do valid statistical > computations (though we know how to circumvent it) ;) Sure, use Excel? ;-) > > Indeed, if you type these two commands on the command line, then it > > is not surprising that a copy of tmp is returned since you create a > > temporary object that ends up in the symbol table and persist after > > the commands are finished. > > > > what does command line have to do with it? If you want to find out what goes on under the hood, it is not necessarily sufficient to do the same calculations on the command line. > > Obviously, assuming that R really executes > > *tmp* <- x > > x <- "names<-"('*tmp*', value=c("a","b")) > > under the hood, in the C code, then *tmp* does not end up in the > > symbol table > > no? Well, I don't see any new object created in my workspace after x <- 4 names(x) <- "foo" Do you? > i guess you have looked under the hood; point me to the relevant > code. No I did not, because I am not interested in knowing such intimate details of R, but it seems you were interested. > yes, *if* you are able to predict the refcount of the object passed to > 'names<-' *then* you can predict what 'names<-' will do, [...] I think Simon pointed already out that you seem to have a wrong picture of what is going on. As far as I know, there is no refcount for objects. The relevant documentation would be R Language Manual, 1.1 SEXPs: What R users think of as variables or objects are symbols which are bound to a value. 
The value can be thought of as either a SEXP (a pointer), or the structure it points to, a SEXPREC (and there are alternative forms used for vectors, namely VECSXP pointing to VECTOR_SEXPREC structures). and 1.1.2 Rest of header: The named field is set and accessed by the SET_NAMED and NAMED macros, and take values 0, 1 and 2. R has a `call by value' illusion, so an assignment like b <- a appears to make a copy of a and refer to it as b. However, if neither a nor b are subsequently altered there is no need to copy. What really happens is that a new symbol b is bound to the same value as a and the named field on the value object is set (in this case to 2). When an object is about to be altered, the named field is consulted. A value of 2 means that the object must be duplicated before being changed. (Note that this does not say that it is necessary to duplicate, only that it should be duplicated whether necessary or not.) A value of 0 means that it is known that no other SEXP shares data with this object, and so it may safely be altered. A value of 1 is used for situations like dim(a) <- c(7, 2) where in principle two copies of a exist for the duration of the computation as (in principle) a <- `dim<-`(a, c(7, 2)) but for no longer, and so some primitive functions can be optimized to avoid a copy in this case. > but in general you may not have the chance. [...] Agreed. > and in general, this should not matter because it should be > unobservable, but it isn't. That's your opinion (to which you are entitled). Unfortunately (for you), the designers of R decided on a design which allows them to reduce the number of copies that have to be made. > >> you suggested that "One reads the manual, (...) one reflects and > >> investigates, ..." > >> > > > > Indeed, and I am not giving up hope that one day you will master > > this art. > > > > well, this time i meant you. Rest assure I have read and reflected on that part of the manual. 
And I guess it boils down to how you interpret what "is equivalent to" means. For me it means that those two commands are what is executed in the C engine once the "names(x)<-c("a","b")" expression is parsed and the parse list arrives at the interpreter. To investigate whether that is the case, one would have to look at the C code, and I have little inclination to do so. But that would be necessary to answer the question whether *tmp* or a copy of *tmp* is returned, if one is really interested in this question. Or whether a *tmp* object is created at all. You seem to take "is equivalent to" to mean that issuing "names(x)<-c("a","b")" on the command line has the same effect as issuing those two other commands on the command line and addressing whether *tmp* or a copy of *tmp* is returned in this case. Fair enough, but it addresses a different question. And, as you said yourself in a
Re: [Rd] surprising behaviour of names<-
Tony Plate wrote: > Wacek Kusnierczyk wrote: >> Tony Plate wrote: >> >>> Is there anything incorrect or missing in the help page for normal >>> usage of the replacement function for 'names'? (i.e., when used in an >>> expression like 'names(x) <- ...') >>> >> >> what is missing here in the first place is a specification of what >> 'normal' means. as far as i can see from the man page, 'normal' does >> not exclude prefix use. and if so, what is missing in the help page is >> a clear statement what an application of 'names<-' will do, in the sense >> of what a user may observe. >> > Fair enough. I looked at the help page for "names" after sending my > email, and was surprised to see the following in the "DETAILS" section: > > "It is possible to update just part of the names attribute via the > general rules: see the examples. This works because the expression > there is evaluated as |z <- "names<-"(z, "[<-"(names(z), 3, "c2"))|. " > > To me, this paragraph is far more confusing than enlightening, > especially as also gives the impression that it's OK to use a > replacement function in a functional form. In my own personal opinion > it would be a enhancement to remove that example from the > documentation, and just say you can do things like 'names(x)[2:3] <- > c("a","b")'. i must say that this part of the man page does explain things to me. much less the code [1] berwin suggested as a piece to read and investigate (slightly modified): tmp = x x = 'names<-'(tmp, 'foo') berwin's conclusion seemed to be that this code hints/suggests/fortune-tells the user that 'names<-' might be doing side effects. this code illustrates what names(x) = 'foo' (the infix form) does -- that it destructively modifies x. 
now, if the code were to illustrate that the prefix form does perform side effects too, then the following would be enough:

    'names<-'(x, 'foo')

if the code were to illustrate that the prefix form, unlike the infix form, does not perform side effects, then the following would suffice for a discussion:

    x = 'names<-'(x, 'foo')

if the code were to illustrate that the prefix form may or may not do side effects depending on the situation, then it surely fails to show that, unless the user performs some sophisticated inference which i am not capable of, or, more likely, unless the user already knows that this was to be shown.

without a discussion, the example is simply an unworked rubbish. and it's obviously wrong; it says that (slightly and irrelevantly simplified)

    names(x) = 'foo'

"is equivalent to"

    tmp = x
    x = 'names<-'(tmp, 'foo')

which is nonsense, because in the latter case you either have an additional binding that you don't have in the former case, or, worse, you rebind, possibly with a different value, a name that has had a binding already.

it's a gritty-nitty detail, but so is most of statistics based on nitty-gritty details which non-statisticians are happy to either ignore or be ignorant about.

[1] http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-lang.html#Comments

> I often use name replacement functions in a functional way, and
> because one can't use 'names<-' etc in this way,

note, this 'because' does not follow in any way from the man page, or the section of 'r language definition' referred to above.

> I define my own functions like the following:
>
> set.names <- function(n,x) {names(x) <- n; x}

it appears that

    set.names = function(n, x) 'names<-'(x, n)

would do the job (guess why).

> (and similarly for set.rownames(), set.colnames(), etc.)
>
> I would highly recommend you do this rather than try to use a call
> like "names<-"(x, ...).
i'm almost tempted to extend your recommendation to 'define your own function for about every function already in r' ;) vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
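A minimal sketch comparing the two wrapper definitions discussed in the message above. The name set.names2 is hypothetical and is used here only so that both variants can coexist; the check on y reflects what one would expect when the argument is bound to a variable in the caller, and is not a statement about every possible call.

```r
# Tony Plate's wrapper: infix replacement form applied to the local argument.
set.names <- function(n, x) { names(x) <- n; x }

# Wacek Kusnierczyk's variant: the replacement function called in prefix form.
# (set.names2 is a hypothetical name, not from the thread.)
set.names2 <- function(n, x) `names<-`(x, n)

y <- c(1, 2)
a <- set.names(c("a", "b"), y)
b <- set.names2(c("a", "b"), y)

identical(a, b)      # both wrappers return the same named vector
is.null(names(y))    # the caller's y is expected to stay unnamed
```

Either wrapper returns its result instead of relying on a side effect, so the caller decides whether to rebind.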
Re: [Rd] surprising behaviour of names<-
Wacek Kusnierczyk wrote:
> Tony Plate wrote:
>> Wacek Kusnierczyk wrote:
>>> [snip]
>>> i just can't get it why the manual does not manifestly explain what
>>> 'names<-' does, and leaves you doing the guesswork you suggest.
>>
>> I'm having trouble understanding the point of this discussion.
>> Someone is calling a replacement function in a way that it's not meant
>> to be used, and is then complaining about it not doing what he thinks
>> it should, or about the documentation not describing what happens when
>> one does that?
>
> where is it written that the function is not meant to be used this way?
> you get an example in the man page, showing precisely how it could be
> used that way. it also explains the value of 'names<-':
>
> "
> For 'names<-', the updated object. (Note that the value of
> 'names(x) <- value' is that of the assignment, 'value', not the
> return value from the left-hand side.)
> "
>
> it does speak of 'names<-' used in prefix form, and does not do it in
> any negative (discouraging) way.
>
>> Is there anything incorrect or missing in the help page for normal
>> usage of the replacement function for 'names'? (i.e., when used in an
>> expression like 'names(x) <- ...')
>
> what is missing here in the first place is a specification of what
> 'normal' means. as far as i can see from the man page, 'normal' does
> not exclude prefix use. and if so, what is missing in the help page is
> a clear statement what an application of 'names<-' will do, in the sense
> of what a user may observe.

Fair enough. I looked at the help page for "names" after sending my
email, and was surprised to see the following in the "DETAILS" section:

"It is possible to update just part of the names attribute via the
general rules: see the examples. This works because the expression
there is evaluated as |z <- "names<-"(z, "[<-"(names(z), 3, "c2"))|. "

To me, this paragraph is far more confusing than enlightening,
especially as it also gives the impression that it's OK to use a
replacement function in a functional form. In my own personal opinion
it would be an enhancement to remove that example from the
documentation, and just say you can do things like 'names(x)[2:3] <-
c("a","b")'.

I often use name replacement functions in a functional way, and
because one can't use 'names<-' etc in this way, I define my own
functions like the following:

set.names <- function(n,x) {names(x) <- n; x}

(and similarly for set.rownames(), set.colnames(), etc.)

I would highly recommend you do this rather than try to use a call
like "names<-"(x, ...).

-- Tony Plate

(I guess that if on the label of the fridge there is a picture of a guy
carrying it on his back, then Mr. Fridge-Racer might have some grounds
for suing.)
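The help-page expansion quoted above can be checked against the ordinary infix form. This is only a sketch: z and w are illustrative vectors built separately so that neither call can affect the other, and the expansion is copied from the quoted help text and then rebound to w.

```r
z <- c(a = 1, b = 2, c = 3)
w <- c(a = 1, b = 2, c = 3)

# Infix form: update only the third name.
names(z)[3] <- "c2"

# Expansion quoted from the help page, with the result rebound to w.
w <- "names<-"(w, "[<-"(names(w), 3, "c2"))

identical(z, w)   # TRUE: both forms yield the same object
names(z)          # "a" "b" "c2"
```

In both forms the final value is assigned back to the variable; the thread's disagreement concerns what happens when the prefix call is made without that rebinding.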
Re: [Rd] surprising behaviour of names<-
Tony Plate wrote: > Wacek Kusnierczyk wrote: >> [snip] >> i just can't get it why the manual does not manifestly explain what >> 'names<-' does, and leaves you doing the guesswork you suggest. >> >> > I'm having trouble understanding the point of this discussion. > Someone is calling a replacement function in a way that it's not meant > to be used, and is them complaining about it not doing what he thinks > it should, or about the documentation not describing what happens when > one does that? where is it written that the function is not meant to be used this way? you get an example in the man page, showing precisely how it could be used that way. it also explains the value of 'names<-': " For 'names<-', the updated object. (Note that the value of 'names(x) <- value' is that of the assignment, 'value', not the return value from the left-hand side.) " it does speak of 'names<-' used in prefix form, and does not do it in any negative (discouraging) way. > > Is there anything incorrect or missing in the help page for normal > usage of the replacement function for 'names'? (i.e., when used in an > expression like 'names(x) <- ...') what is missing here in the first place is a specification of what 'normal' means. as far as i can see from the man page, 'normal' does not exclude prefix use. and if so, what is missing in the help page is a clear statement what an application of 'names<-' will do, in the sense of what a user may observe. > > R does give one the ability to use its facilities in non-standard > ways. However, I don't see much value in the help page for 'gun' > attempting to describe the ways in which the bones in your foot will > be shattered should you choose to point the gun at your foot and pull > the trigger. Reminds me of the story of the guy in New York, who > after injuring his back in refrigerator-carrying race, sued the > manufacturer of the refrigerator for not having a warning label > against that sort of use. very funny. little relevant. 
vQ
Re: [Rd] surprising behaviour of names<-
William Dunlap wrote: > Would it make anyone any happier if the manual said > that the replacement functions should not be called > in the form >xNew <- `func<-` (xOld, value) > and should only be used as >func(xToBeChanged) <- value > surely better than guesswork. > ? > > The explanation > names(x) <- c("a","b") > is equivalent to > '*tmp*' <- x > x <- "names<-"('*tmp*', value=c("a","b")) > could also be extended a bit, adding a line like > rm(`*tmp*`) > Those 3 lines should be considered an atomic operation: > the value that `*tmp*` or `x` may have or what is > in the symbol table at various points in that sequence > is not defined. (Letting details be explicitly undefined > is important: it gives developers room to improve the > efficiency of the interpreter and tells users where not to go.) > there is a difference between letting things be undefined and explicitly stating that things are unspecified. the c99 standard [1], for example, is explicit about the non-determinism of expressions that involve side effects, as it is about that some expressions may actually not be evaluated if the optimizer decides so. berwin has already suggested that one reads from what docs do *not* say; it's a very bad idea. it's best that the documentation *does* say that, for example, a particular function should be used only in the infix form because the semantics of the prefix form are not guaranteed and may change in future versions. if the current state is that 'names<-' will modify the object it is given as an argument in some situations, but not in others, and this is visible to the user, the best thing to do is to give an explicit warning -- perhaps with an annotation that things may change, if they may. best, vQ [1] http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Wacek Kusnierczyk wrote:
> [snip]
> i just can't get it why the manual does not manifestly explain what
> 'names<-' does, and leaves you doing the guesswork you suggest.

I'm having trouble understanding the point of this discussion.
Someone is calling a replacement function in a way that it's not meant
to be used, and is then complaining about it not doing what he thinks
it should, or about the documentation not describing what happens when
one does that?

Is there anything incorrect or missing in the help page for normal
usage of the replacement function for 'names'? (i.e., when used in an
expression like 'names(x) <- ...')

R does give one the ability to use its facilities in non-standard
ways. However, I don't see much value in the help page for 'gun'
attempting to describe the ways in which the bones in your foot will
be shattered should you choose to point the gun at your foot and pull
the trigger. Reminds me of the story of the guy in New York, who
after injuring his back in a refrigerator-carrying race, sued the
manufacturer of the refrigerator for not having a warning label
against that sort of use.

-- Tony Plate
Re: [Rd] surprising behaviour of names<-
Would it make anyone any happier if the manual said
that the replacement functions should not be called
in the form
    xNew <- `func<-` (xOld, value)
and should only be used as
    func(xToBeChanged) <- value
?

The explanation
    names(x) <- c("a","b")
is equivalent to
    '*tmp*' <- x
    x <- "names<-"('*tmp*', value=c("a","b"))
could also be extended a bit, adding a line like
    rm(`*tmp*`)
Those 3 lines should be considered an atomic operation:
the value that `*tmp*` or `x` may have or what is
in the symbol table at various points in that sequence
is not defined. (Letting details be explicitly undefined
is important: it gives developers room to improve the
efficiency of the interpreter and tells users where not to go.)

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

> -----Original Message-----
> From: r-devel-boun...@r-project.org
> [mailto:r-devel-boun...@r-project.org] On Behalf Of Wacek Kusnierczyk
> Sent: Friday, March 13, 2009 11:42 AM
> To: Berwin A Turlach
> Cc: r-devel@r-project.org List
> Subject: Re: [Rd] surprising behaviour of names<-
> ... blah blah blah
> >> x = 1
> >> tmp = x
> >> x = 'names<-'(tmp, 'foo')
> >> names(tmp)
> >> # NULL
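The three-line expansion described above can be written out and compared with the infix form. This is a sketch under one assumption: the temporary is referenced with backticks, because a quoted '*tmp*' in argument position would be a character string, not a variable. It says nothing about what the interpreter actually does internally.

```r
x <- c(1, 2)
names(x) <- c("a", "b")                        # infix replacement form

y <- c(1, 2)
`*tmp*` <- y                                   # step 1: bind the temporary
y <- `names<-`(`*tmp*`, value = c("a", "b"))   # step 2: call and rebind
rm(`*tmp*`)                                    # step 3: drop the temporary

identical(x, y)   # the visible end result is the same in both cases
```

The comparison only covers the end state; as the message above stresses, the intermediate values of `*tmp*` and x are best treated as unspecified.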
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > >> sure! >> > > Glad to see that we agree on this. > owe you a beer. > >>> Read section 2.1.10 ("Environments") in the R >>> Language Definition, >>> >> haven't objected to that. i object to your 'r uses pass by value', >> which is only partially correct. >> > > Well, I used qualifiers and did not stated it categorically. > indeed, you said "R supposedly uses call-by-value (though we know how to circumvent that, don't we?)". in that vain, R supposedly can be used to do valid statistical computations (though we know how to circumvent it) ;) > > and actually, in the example we discuss, 'names<-' does *not* return an updated *tmp*, so there's even less to entertain. >>> How do you know? Are you sure? Have you by now studied what goes >>> on under the hood? >>> >> yes, a bit. but in this example, it's enough to look into *tmp* to >> see that it hasn't got the names added, and since x does have names, >> names<- must have returned a copy of *tmp* rather than *tmp* changed: >> >> x = 1 >> tmp = x >> x = 'names<-'(tmp, 'foo') >> names(tmp) >> # NULL >> > > Indeed, if you type these two commands on the command line, then it is > not surprising that a copy of tmp is returned since you create a > temporary object that ends up in the symbol table and persist after the > commands are finished. > what does command line have to do with it? > Obviously, assuming that R really executes > *tmp* <- x > x <- "names<-"('*tmp*', value=c("a","b")) > under the hood, in the C code, then *tmp* does not end up in the symbol > table no? > and does not persist beyond the execution of > names(x) <- c("a","b") > no? i guess you have looked under the hood; point me to the relevant code. > This looks to me as one of the situations where a value of 1 is used > for the named field of some of the objects involves so that a copy can > be avoided. That's why I asked whether you looked under the hood. 
> anyway, what happens under the hood is much less interesting from the user's perspective that what can be seen over the hood. what i can see, is that 'names<-' will incoherently perform in-place modification or copy-on-assignment. yes, *if* you are able to predict the refcount of the object passed to 'names<-' *then* you can predict what 'names<-' will do, but in general you may not have the chance. and in general, this should not matter because it should be unobservable, but it isn't. back to your i += i++ example, the outcome may differ from a compiler to a compiler, but, i guess, compilers will implement the order coherently, so that whatever version they choose, the outcome will be predictable, and not dependent on some earlier code. (prove me wrong. or maybe i'll do it myself.) > >> you suggested that "One reads the manual, (...) one reflects and >> investigates, ..." >> > > Indeed, and I am not giving up hope that one day you will master this > art. > well, this time i meant you. > >> -- had you done it, you wouldn't have asked the question. >> > > Sorry, I forgot that you have a tendency to interpret statements > extremely verbatim yes, i have two hooks installed: one says \begin{verbatim}, the other says \end{verbatim}. > and with little reference to the context in which > they are made. not that you're trying to be extremely accurate or polite here... > I will try to be more explicit in future. > it will certainly do good to you. >> >> i just can't get it why the manual does not manifestly explain what >> 'names<-' does, and leaves you doing the guesswork you suggest. >> > > As I said before, patched to documentation are also welcome. > i'll give it a try. > Best wishes, > hope you mean it. likewise, vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
On Fri, 13 Mar 2009 11:43:55 +0100 Wacek Kusnierczyk wrote: > Berwin A Turlach wrote: > > > And it is documented behaviour. > > sure! Glad to see that we agree on this. > > Read section 2.1.10 ("Environments") in the R > > Language Definition, > > haven't objected to that. i object to your 'r uses pass by value', > which is only partially correct. Well, I used qualifiers and did not stated it categorically. > >> and actually, in the example we discuss, 'names<-' does *not* > >> return an updated *tmp*, so there's even less to entertain. > >> > > > > How do you know? Are you sure? Have you by now studied what goes > > on under the hood? > > yes, a bit. but in this example, it's enough to look into *tmp* to > see that it hasn't got the names added, and since x does have names, > names<- must have returned a copy of *tmp* rather than *tmp* changed: > > x = 1 > tmp = x > x = 'names<-'(tmp, 'foo') > names(tmp) > # NULL Indeed, if you type these two commands on the command line, then it is not surprising that a copy of tmp is returned since you create a temporary object that ends up in the symbol table and persist after the commands are finished. Obviously, assuming that R really executes *tmp* <- x x <- "names<-"('*tmp*', value=c("a","b")) under the hood, in the C code, then *tmp* does not end up in the symbol table and does not persist beyond the execution of names(x) <- c("a","b") This looks to me as one of the situations where a value of 1 is used for the named field of some of the objects involves so that a copy can be avoided. That's why I asked whether you looked under the hood. > you suggested that "One reads the manual, (...) one reflects and > investigates, ..." Indeed, and I am not giving up hope that one day you will master this art. > -- had you done it, you wouldn't have asked the question. Sorry, I forgot that you have a tendency to interpret statements extremely verbatim and with little reference to the context in which they are made. 
I will try to be more explicit in future. > >> for fun and more guesswork, the example could have been: > >> > >> x = x > >> x = 'names<-'(x, value=c('a', 'b')) > >> > > > > But it is manifestly not written that way in the manual; and for > > good reasons since 'names<-' might have side effects which invokes > > in the last line undefined behaviour. Just as in the equivalent C > > snippet that I mentioned. > > i just can't get it why the manual does not manifestly explain what > 'names<-' does, and leaves you doing the guesswork you suggest. As I said before, patched to documentation are also welcome. Best wishes, Berwin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > >> foo = function(arg) arg$foo = foo >> >> e = new.env() >> foo(e) >> e$foo >> >> are you sure this is pass by value? >> > > But that is what environments are for, aren't they? might be. > And it is > documented behaviour. sure! > Read section 2.1.10 ("Environments") in the R > Language Definition, haven't objected to that. i object to your 'r uses pass by value', which is only partially correct. > in particular the last paragraph: > > Unlike most other R objects, environments are not copied when > passed to functions or used in assignments. Thus, if you assign the > same environment to several symbols and change one, the others will > change too. In particular, assigning attributes to an environment can > lead to surprises. > > [..] > >> and actually, in the example we discuss, 'names<-' does *not* return >> an updated *tmp*, so there's even less to entertain. >> > > How do you know? Are you sure? Have you by now studied what goes on > under the hood? > yes, a bit. but in this example, it's enough to look into *tmp* to see that it hasn't got the names added, and since x does have names, names<- must have returned a copy of *tmp* rather than *tmp* changed: x = 1 tmp = x x = 'names<-'(tmp, 'foo') names(tmp) # NULL you suggested that "One reads the manual, (...) one reflects and investigates, ..." -- had you done it, you wouldn't have asked the question. > >> for fun and more guesswork, the example could have been: >> >> x = x >> x = 'names<-'(x, value=c('a', 'b')) >> > > But it is manifestly not written that way in the manual; and for good > reasons since 'names<-' might have side effects which invokes in the > last line undefined behaviour. Just as in the equivalent C snippet > that I mentioned. > i just can't get it why the manual does not manifestly explain what 'names<-' does, and leaves you doing the guesswork you suggest. vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
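The environment example quoted above can be set beside an ordinary list to make the difference visible. A minimal sketch: set_foo is a hypothetical helper that stores a string, rather than the function itself as in the original snippet.

```r
# Hypothetical helper: assigns into its argument and returns nothing useful.
set_foo <- function(arg) { arg$foo <- "changed"; invisible(NULL) }

e <- new.env()
set_foo(e)
e$foo        # "changed": the caller's environment was modified

l <- list()
set_foo(l)
l$foo        # NULL: only a local copy of the list was modified
```

This matches the paragraph quoted from section 2.1.10 of the R Language Definition: environments are not copied when passed to functions, while most other objects behave as if they were.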
Re: [Rd] surprising behaviour of names<-
On Thu, 12 Mar 2009 21:26:15 +0100 Wacek Kusnierczyk wrote: > > YMMV, but when I read a passage like this in R documentation, I > > start to wonder why it is stated that > > names(x) <- c("a","b") > > is equivalent to > > *tmp* <- x > > x <- "names<-"('*tmp*', value=c("a","b")) > > and the simpler construct > > x <- "names<-"(x, value=c("a", "b")) > > is not used. There must be a reason, > > got an explanation: because it probably is as drafty as the > aforementioned document. Your grasp of what "draft manual" means in the context of R documentation seems to be as tenuous as the grasp of intelligent design/creationist proponents on what it means in science to label a body of knowledge a "(scientific) theory". :) [...] > but it is possible to send an argument to a function that makes an > assignment to the argument, and yet the assignment is made to the > original, not to a copy: > > foo = function(arg) arg$foo = foo > > e = new.env() > foo(e) > e$foo > > are you sure this is pass by value? But that is what environments are for, aren't they? And it is documented behaviour. Read section 2.1.10 ("Environments") in the R Language Definition, in particular the last paragraph: Unlike most other R objects, environments are not copied when passed to functions or used in assignments. Thus, if you assign the same environment to several symbols and change one, the others will change too. In particular, assigning attributes to an environment can lead to surprises. [..] > and actually, in the example we discuss, 'names<-' does *not* return > an updated *tmp*, so there's even less to entertain. How do you know? Are you sure? Have you by now studied what goes on under the hood? > for fun and more guesswork, the example could have been: > > x = x > x = 'names<-'(x, value=c('a', 'b')) But it is manifestly not written that way in the manual; and for good reasons since 'names<-' might have side effects which invokes in the last line undefined behaviour. 
Just as in the equivalent C snippet that I mentioned. > for your interest in well written documentation, ?names says that the > argument x is 'an r object', and nowhere does it say that environment > is not an r object. it also says what the value of 'names<-' applied > to pairlists is. the following error message is doubly surprising: > > e = new.env() > 'names<-'(e, 'foo') > # Error: names() applied to a non-vector But names are implemented by assigning a "name" attribute to the object; as you should know. And the above documentation suggests that it is not a good idea to assign attributed to environments. So why would you expect this to work? > firstly, because it would seem that there's nothing wrong in applying > names to an environment; from ?'$': > > " > x$name > > name: A literal character string or a name (possibly backtick > quoted). For extraction, this is normally (see under > 'Environments') partially matched to the 'names' of the > object. > " I fail to see the relevance of this. > secondly, because, as ?names says, names can be applied to pairlists, Yes, but it does not say that names can be applied to environment. And it explicitly says that the "default methods get and set the '"name"' attribute of..." and (other) documentation warns you about setting attributes on environments. > which are not vectors, and the following does not give an error as > above: > > p = pairlist() > is.vector(p) > # FALSE > names(p) > # names successfully applied to a non-vector > > assure me this is not a mess, but a well-documented design feature. It is documented, if it is well-documented depends on your definition of "well-documented". :) > ... and one wonders why r man pages have to be read in O(e^n) time. I believe patches to documentation are also welcome; and perhaps more readily accepted than patches to code. [...] > >>> I guess that would require a rewrite (or extension) of the parser. 
> >>> To me, Section 10.1.2 of the Language Definition manual suggests > >>> that once an expression is parsed, you cannot distinguish any more > >>> whether 'names<-' was called using infix syntax or prefix syntax. > >>> > >>> > >> but this must be nonsense, since: > >> > >> x = 1 > >> 'names<-'(x, 'foo') > >> names(x) > >> # NULL > >> > >> x = 1 > >> names(x) <- 'foo' > >> names(x) > >> # "foo" > >> > >> clearly, there is not only syntactic difference here. but it > >> might be that 10.1.2 does not suggest anything like what you say. > >> > > > > Please tell me how this example contradicts my reading of 10.1.2 > > that the expressions > > 'names<-'(x, 'foo') > > and > > names(x) <- 'foo' > > once they are parsed, produce exactly the same parse tree and that > > it becomes impossible to tell from the parse tree whether > > originally the infix syntax or
Re: [Rd] surprising behaviour of names<-
G. Jay Kerns wrote:
> Wacek Kusnierczyk wrote:
>
> I am prompted to imagine someone pointing out to the volunteers of the
> International Red Cross - on the field of a natural disaster, no less
> - that their uniforms are not an acceptably consistent shade of
> pink... or that the screws on their tourniquets do not have the
> appropriate pitch as to minimize the friction for the turner...
>

not that it is very accurate, because unintuitive and confusing semantics
may lead to hidden and dangerous errors in users' code. wrong shade of a
uniform might lead to the person being shot, for example, but then your
point vanishes.

> As a practicing statistician I am simply thankful that the bleeding is
> stopped. :-)
>

when it is stopped, not turned into internal bleeding, which you simply
don't see.

> Cheers to R-Core (and the hundreds of other volunteers).
>

absolutely.

vQ
Re: [Rd] surprising behaviour of names<-
On Thu, Mar 12, 2009 at 3:24 PM, G. Jay Kerns wrote: > Wacek Kusnierczyk wrote: > > [snip] > >> as i explained a few months ago, i study r to find examples of bad >> design. if anyone in the r core is interested in having the problems i >> report fixed, i'm happy to get involved in a discussion about the design >> and implementation. if not, i'm happy with just pointing out the issues. > > :-) > > I am prompted to imagine someone pointing out to the volunteers of the > International Red Cross - on the field of a natural disaster, no less > - that their uniforms are not an acceptably consistent shade of > pink... or that the screws on their tourniquets do not have the > appropriate pitch as to minimize the friction for the turner... Your analogy may overstate the case a bit, since R volunteers - while providing a valuable service to the community - are not dealing with matters of life and death. Habitat for Humanity (an organization that provides free housing to the under-privileged) would be a better comparison. I'm sure those volunteers would appreciate a critique of their work, provided the critique was not condescending and focused on serving the community better, not to showcase the acumen of the one giving the critique. > > As a practicing statistician I am simply thankful that the bleeding is > stopped. :-) > > Cheers to R-Core (and the hundreds of other volunteers). > Jay > I second that. Thanks to R-Core et al for all their generous efforts. > > > *** > G. Jay Kerns, Ph.D. 
> Associate Professor > Department of Mathematics & Statistics > Youngstown State University > Youngstown, OH 44555-0002 USA > Office: 1035 Cushwa Hall > Phone: (330) 941-3310 Office (voice mail) > -3302 Department > -3170 FAX > E-mail: gke...@ysu.edu > http://www.cc.ysu.edu/~gjkerns/ > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > Best, Josh -- http://quantemplation.blogspot.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Simon Urbanek wrote:
>
> On Mar 12, 2009, at 11:12 , Wacek Kusnierczyk wrote:
>
>> Simon Urbanek wrote:
>>>
>>> On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:
>>>
>>>> Wacek, Peter gave you a full answer explaining it very well. If you
>>>> really want to be able to trace each instance yourself, you have to
>>>> learn far more about R internals than you apparently know (and Peter
>>>> hinted at that). Internally x=1 and x=c(1) are slightly different in
>>>> that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0
>>>> which is what causes the difference in behavior as Peter explained.
>>>> The reason is that c(1) creates a copy of the 1 (which is a constant
>>>> [=immutable] thus requiring a copy) and the new copy has no other
>>>> references and thus can be modified and hence NAMED(x) = 0.
>>>
>>> Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above --
>>> since NAMED(c(1)) = 0 and once it's assigned to x it becomes NAMED(x)
>>> = 1 -- this is just a detail on how things work with assignment, the
>>> explanation above is still correct since duplication happens
>>> conditional on NAMED == 2.
>>
>> there is an interesting corollary. self-assignment seems to increase
>> the reference count:
>>
>>    x = 1; 'names<-'(x, 'foo'); names(x)
>>    # NULL
>>
>>    x = 1; x = x; 'names<-'(x, 'foo'); names(x)
>>    # "foo"
>>
>
> Not for me, at least in current R:

not for me either. i messed up the example, sorry. here's the intended
version:

    x = c(1); 'names<-'(x, 'foo'); names(x)
    # "foo"

    x = c(1); x = x; 'names<-'(x, 'foo'); names(x)
    # NULL

>
> > x = 1; 'names<-'(x, 'foo'); names(x)
> foo
>   1
> NULL
> > x = 1; x = x; 'names<-'(x, 'foo'); names(x)
> foo
>   1
> NULL
>
> (both R 2.8.1 and R-devel 3/11/09, darwin 9.6)
>
> In addition, you still got it backwards - your output suggests that
> the assignment created a new, clean copy. Functional call of `names<-`
> (whose side-effect on x is undefined BTW) is destructive when you get
> a clean copy (e.g. as a result of the c function) and non-destructive
> when the object was referenced. It is left as an exercise to the
> reader to reason why constants such as 1 are referenced.

all true, again because of my mistake. anyway, it may be surprising that
with all its smartness (i mean it) about copy-on-assignment, r does not
see that it makes no sense to increase refcount here. of course, you
can't judge from just the syntactic form 'x=x', but still it should not
be very difficult to have the interpreter see when it finds an object
named 'x' in the same environment where it attempts the assignment.

(of course, who'd do self-assignments in practical code?)

cheers,
vQ
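Since the thread reports that a bare prefix call may or may not modify its argument depending on the object's reference state, a usage that does not depend on that detail is to rebind the result explicitly. A minimal sketch; whether y itself ends up with names after the second call is deliberately not asserted, because the messages above describe it as undefined.

```r
x <- c(1)
x <- `names<-`(x, "foo")   # rebind the result; no reliance on in-place change
names(x)                   # "foo"

y <- c(1)
z <- `names<-`(y, "bar")   # use the returned object, not y, afterwards
names(z)                   # "bar"
```

Written this way, the code behaves the same whether or not the interpreter chose to duplicate the argument.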
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > On Thu, 12 Mar 2009 15:21:50 +0100 > Wacek Kusnierczyk wrote: > > >> seems to suggest? is not the purpose of documentation to clearly, >> ideally beyond any doubt, specify what is to be specified? >> > > The R Language Definition manual is still a draft. :) > this is indeed a good explanation for all sorts of nonsense. worse if stuff tends to persist despite critique. > >>> that in this case the infix and prefix syntax >>> is not equivalent as it does not say that >>> >>> >> are you suggesting fortune telling from what the docs do *not* say? >> > > My experience is that sometimes you have to realise what is not > stated. in general, yes. in r, this often ends up with 'have you seen the documentation saying that??' in response. > I remember a discussion with somebody who asked why he could > not run, on windows, R CMD INSTALL on a *.zip file. I pointed out to > him that the documentation states that you can run R CMD INSTALL on > *.tar.gz or *.tgz files and, thus, there should be no expectation that > it can be run on *.zip file. > yes, that's a good point. this reminds me of a (possibly anectodal) lady who sued the manufacturer of her microwave after she had dried in it her cat after a bath. > YMMV, but when I read a passage like this in R documentation, I start > to wonder why it is stated that > names(x) <- c("a","b") > is equivalent to > *tmp* <- x > x <- "names<-"('*tmp*', value=c("a","b")) > and the simpler construct > x <- "names<-"(x, value=c("a", "b")) > is not used. There must be a reason, got an explanation: because it probably is as drafty as the aforementioned document. > nobody likes to type > unnecessarily long code. And, after thinking about this for a while, > the penny might drop. > that's cool. instead of stating what 'names<-' does or does not, one expresses it in a convoluted way an makes you guess from a *tmp* variable. a nice exercise, i like it. > [...] > does this say anything about what 'names<-'(...) 
actually returns? updated *tmp*, or a copy of it? >>> Since R uses pass-by-value, >>> >> since? it doesn't! >> > > For all practical purposes it is as long as standard evaluation is > used. One just have to be aware that some functions evaluate their > arguments in a non-standard way. > it's maybe a bit of hairsplitting, but what you have in r is not exactly what is called 'pass by value'. here's a relevant quote from [1], p. 309: " In the call-by-name (CBN) mechanism, a formal parameter names the computation designated by an unevaluated argument expression. In the call-by-value (CBV) mechanism, a formal parameter names the value of an evaluated argument expression. In the call-by-need or lazy evaluation (CBL), the formal parameter name can be bound to a location that originally stores the computation of the argument expression. The first time the parameter is referenced, the computation is performed, but the resulting value is cached at the location and is used on every subsequent reference. Thus, the argument expression is evaluated at most once and is never evaluated at all if the parameter is never referenced. " note the 'unevaluated' and 'evaluated'. you're free to have your pick. but it is possible to send an argument to a function that makes an assignment to the argument, and yet the assignment is made to the original, not to a copy: foo = function(arg) arg$foo = foo e = new.env() foo(e) e$foo are you sure this is pass by value? it appears that r has a pass-by-need mechanism that dispatches to pass-by-value or pass-by-reference depending on the type of the object. with this semantics, all sorts of mess are possible, and 'names<-' provides one example. [1] design concepts in programming languages, turbak and gifford, mit press 2008 > [...] > >>> If you entertain the idea that 'names<-' updates *tmp* and >>> returns the updated *tmp*, then you believe that 'names<-' behaves >>> in a non-standard way and should take appropriate care. 
>>> >> i got lost in your argumentation. [..] >> > > I was commenting on "does this say anything about what 'names<-'(...) > actually returns? updated *tmp*, or a copy of it?" > > As I said, if you entertain the idea that 'names<-' returns an updated > *tmp*, then you believe that 'names<-' behaves in a non-standard way > and appropriate care has to be taken. > > i can check, by experimentation, whether 'names<-' returns a copy or the original; even if i can establish that it returns the original after having modified it, it's not something to entertain. maybe you entertain the idea of your users performing the guesswork instead of reading an unambiguous specification. you have already said that you don't care if your users get confused, it would fit the image. and actually, in the example we discuss, 'names<-' does *not*
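The contrast drawn in this message between environments and ordinary vectors, and the point about unevaluated arguments, can be checked directly. The following is only a minimal sketch: `set_field` and `ignore_arg` are hypothetical helper names that do not appear in the thread, and the helper stores a plain string rather than the function itself.

```r
# Hypothetical helper: assigns to a component of its argument.
set_field <- function(arg) {
  arg$field <- "changed"
  invisible(NULL)
}

# An environment is not copied when passed, so the caller sees the change.
e <- new.env()
set_field(e)
env_result <- e$field

# A list is only changed as a local copy inside the function.
l <- list()
set_field(l)
list_result <- l$field

# Arguments are passed unevaluated: one that is never used is never evaluated.
ignore_arg <- function(arg) "arg was never touched"
lazy_result <- ignore_arg(stop("this error is never raised"))

print(env_result)
print(is.null(list_result))
print(lazy_result)
```

Under these assumptions the environment carries the new field back to the caller, the list does not, and the call with `stop()` as its argument returns normally.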
Re: [Rd] surprising behaviour of names<-
Wacek Kusnierczyk wrote: [snip] > as i explained a few months ago, i study r to find examples of bad > design. if anyone in the r core is interested in having the problems i > report fixed, i'm happy to get involved in a discussion about the design > and implementation. if not, i'm happy with just pointing out the issues. :-) I am prompted to imagine someone pointing out to the volunteers of the International Red Cross - on the field of a natural disaster, no less - that their uniforms are not an acceptably consistent shade of pink... or that the screws on their tourniquets do not have the appropriate pitch as to minimize the friction for the turner... As a practicing statistician I am simply thankful that the bleeding is stopped. :-) Cheers to R-Core (and the hundreds of other volunteers). Jay *** G. Jay Kerns, Ph.D. Associate Professor Department of Mathematics & Statistics Youngstown State University Youngstown, OH 44555-0002 USA Office: 1035 Cushwa Hall Phone: (330) 941-3310 Office (voice mail) -3302 Department -3170 FAX E-mail: gke...@ysu.edu http://www.cc.ysu.edu/~gjkerns/ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
On Mar 12, 2009, at 11:12 , Wacek Kusnierczyk wrote:

> Simon Urbanek wrote:
>> On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:
>>
>>> Wacek,
>>>
>>> Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 and x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=immutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0.
>>
>> Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes NAMED(x) = 1 -- this is just a detail on how things work with assignment, the explanation above is still correct since duplication happens conditional on NAMED == 2.
>
> there is an interesting corollary. self-assignment seems to increase the reference count:
>
>     x = 1; 'names<-'(x, 'foo'); names(x)
>     # NULL
>
>     x = 1; x = x; 'names<-'(x, 'foo'); names(x)
>     # "foo"

Not for me, at least in current R:

    > x = 1; 'names<-'(x, 'foo'); names(x)
    foo
      1
    NULL
    > x = 1; x = x; 'names<-'(x, 'foo'); names(x)
    foo
      1
    NULL

(both R 2.8.1 and R-devel 3/11/09, darwin 9.6)

In addition, you still got it backwards - your output suggests that the assignment created a new, clean copy. Functional call of `names<-` (whose side-effect on x is undefined BTW) is destructive when you get a clean copy (e.g. as a result of the c function) and non-destructive when the object was referenced. It is left as an exercise to the reader to reason why constants such as 1 are referenced.

Cheers,
Simon
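The user-visible side of the sharing rule described here can be sketched without inspecting NAMED at all: when two names refer to the same value, the replacement form applied to one of them leaves the other untouched. This is only an illustration of ordinary replacement semantics. What a bare prefix call does to its argument is deliberately not asserted, since the thread itself shows it differing between setups.

```r
x <- c(1)
y <- x               # x and y now refer to the same value
names(y) <- "foo"    # replacement form: only y ends up with names

print(names(x))
print(names(y))

# Assigning the result of the prefix form back does not rely on side effects.
z <- c(1)
z <- `names<-`(z, "bar")
print(names(z))
```

Here x keeps no names, y is named "foo" and z is named "bar", whichever way the underlying copy is managed.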
Re: [Rd] surprising behaviour of names<-
On Thu, 12 Mar 2009 15:21:50 +0100 Wacek Kusnierczyk wrote: [...] > >>> And the R Language manual (ignoring for the moment that it is a > >>> draft and all that), > >>> > >> since we must... > >> > >> > >>> clearly states that > >>> > >>> names(x) <- c("a","b") > >>> > >>> is equivalent to > >>> > >>> '*tmp*' <- x > >>> x <- "names<-"('*tmp*', value=c("a","b")) > >>> > >>> > >> ... and? > >> > > > > This seems to suggest > > seems to suggest? is not the purpose of documentation to clearly, > ideally beyond any doubt, specify what is to be specified? The R Language Definition manual is still a draft. :) > > that in this case the infix and prefix syntax > > is not equivalent as it does not say that > > > > are you suggesting fortune telling from what the docs do *not* say? My experience is that sometimes you have to realise what is not stated. I remember a discussion with somebody who asked why he could not run, on windows, R CMD INSTALL on a *.zip file. I pointed out to him that the documentation states that you can run R CMD INSTALL on *.tar.gz or *.tgz files and, thus, there should be no expectation that it can be run on *.zip file. YMMV, but when I read a passage like this in R documentation, I start to wonder why it is stated that names(x) <- c("a","b") is equivalent to *tmp* <- x x <- "names<-"('*tmp*', value=c("a","b")) and the simpler construct x <- "names<-"(x, value=c("a", "b")) is not used. There must be a reason, nobody likes to type unnecessarily long code. And, after thinking about this for a while, the penny might drop. [...] > >> does this say anything about what 'names<-'(...) actually > >> returns? updated *tmp*, or a copy of it? > >> > > > > Since R uses pass-by-value, > > since? it doesn't! For all practical purposes it is as long as standard evaluation is used. One just have to be aware that some functions evaluate their arguments in a non-standard way. [...] 
> > If you entertain the idea that 'names<-' updates *tmp* and > > returns the updated *tmp*, then you believe that 'names<-' behaves > > in a non-standard way and should take appropriate care. > > i got lost in your argumentation. [..] I was commenting on "does this say anything about what 'names<-'(...) actually returns? updated *tmp*, or a copy of it?" As I said, if you entertain the idea that 'names<-' returns an updated *tmp*, then you believe that 'names<-' behaves in a non-standard way and appropriate care has to be taken. > > And the fact that a variable *tmp* is used hints to the fact that > > 'names<-' might have side-effect. > > are you suggesting fortune telling from the fact that a variable *tmp* > is used? Nothing to do with fortune telling. One reads the manual, one wonders why is this construct used instead of an apparently much more simple one, one reflects and investigates, one realises why the given construct is stated as the equivalent: because "names<-" has side-effects. > > This is similar to the discussion what value i should have in the > > following C snippet: > > i = 0; > > i += i++; > > > > nonsense, it's a *completely* different issue. here you touch the > issue of the order of evaluation, and not of whether an object is > copied or modified; above, the inverse is true. Sorry, there was a typo above. The second statement should have been i = i++; Then on some abstract level they are the same; an object appears on the left hand side of an assignment but is also modified in the expression assigned to it. So what value should it end up with? > >> why? you can still use the infix names<- with destructive > >> semantics to avoid copying. > >> > > > > I guess that would require a rewrite (or extension) of the parser. > > To me, Section 10.1.2 of the Language Definition manual suggests > > that once an expression is parsed, you cannot distinguish any more > > whether 'names<-' was called using infix syntax or prefix syntax. 
> > > > but this must be nonsense, since: > > x = 1 > 'names<-'(x, 'foo') > names(x) > # NULL > > x = 1 > names(x) <- 'foo' > names(x) > # "foo" > > clearly, there is not only syntactic difference here. but it might be > that 10.1.2 does not suggest anything like what you say. Please tell me how this example contradicts my reading of 10.1.2 that the expressions 'names<-'(x, 'foo') and names(x) <- 'foo' once they are parsed, produce exactly the same parse tree and that it becomes impossible to tell from the parse tree whether originally the infix syntax or the prefix syntax was used. In fact, the last sentence in section 10.1.2 strongly suggests to me that the parse tree stores all function calls as if prefix notation was used. But it is probably my English again. > > Thus, I guess you want to start a discussion with R Core whether it > > is worthwhile to change t
Re: [Rd] surprising behaviour of names<-
Simon Urbanek wrote: > > On Mar 11, 2009, at 10:52 , Simon Urbanek wrote: > >> Wacek, >> >> Peter gave you a full answer explaining it very well. If you really >> want to be able to trace each instance yourself, you have to learn >> far more about R internals than you apparently know (and Peter hinted >> at that). Internally x=1 an x=c(1) are slightly different in that the >> former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is >> what causes the difference in behavior as Peter explained. The reason >> is that c(1) creates a copy of the 1 (which is a constant >> [=unmutable] thus requiring a copy) and the new copy has no other >> references and thus can be modified and hence NAMED(x) = 0. >> > > Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above -- > since NAMED(c(1)) = 0 and once it's assigned to x it becomes NAMED(x) > = 1 -- this is just a detail on how things work with assignment, the > explanation above is still correct since duplication happens > conditional on NAMED == 2. there is an interesting corollary. self-assignment seems to increase the reference count: x = 1; 'names<-'(x, 'foo'); names(x) # NULL x = 1; x = x; 'names<-'(x, 'foo'); names(x) # "foo" vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Wacek Kusnierczyk wrote:
> Berwin A Turlach wrote:
>
>> This is similar to the discussion what value i should have in the
>> following C snippet:
>>     i = 0;
>>     i += i++;
>
> in fact, your example is useless because the result here is clearly
> specified by the semantics (as far as i know -- prove me wrong). you
> lookup i (0) and i (0) (the order does not matter here), add these
> values (0), assign to i (0), and increase i (1).

i'm happy to prove myself wrong. the c programming language, 2nd ed. by kernighan and ritchie, has the following discussion:

" One unhappy situation is typified by the statement

    a[i] = i++;

The question is whether the subscript is the old value of i or the new. Compilers can interpret this in different ways, and generate different answers depending on their interpretation. The standard intentionally leaves most such matters unspecified. "

vQ
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > On Thu, 12 Mar 2009 10:53:19 +0100 > Wacek Kusnierczyk wrote: > > >> well, ?'names<-' says: >> >> " >> Value: >> For 'names<-', the updated object. >> " >> >> which is only partially correct, in that the value will sometimes be >> an updated *copy* of the object. >> > > But since R supposedly *supposedly* > uses call-by-value (though we know how to > circumvent that, don't we?) we know how a lot of built-ins hack around this, don't we, and we also know that call-by-value is not really the argument passing mechanism in r. > wouldn't you always expect that a copy of > the object is returned? > indeed! that's what i have said previously, no? there is still space for the smart (i mean it) copy-on-assignment behaviour, but it should not be visible to the user, in particular, not in that 'names<-' destructively modifies the object it is given when the refcount is 1. in my humble opinion, there is either a design flaw or a bug here. > > >>> And the R Language manual (ignoring for the moment that it is a >>> draft and all that), >>> >> since we must... >> >> >>> clearly states that >>> >>> names(x) <- c("a","b") >>> >>> is equivalent to >>> >>> '*tmp*' <- x >>> x <- "names<-"('*tmp*', value=c("a","b")) >>> >>> >> ... and? >> > > This seems to suggest seems to suggest? is not the purpose of documentation to clearly, ideally beyond any doubt, specify what is to be specified? > that in this case the infix and prefix syntax > is not equivalent as it does not say that > are you suggesting fortune telling from what the docs do *not* say? > names(x) <- c("a","b") > is equivalent to > x <- "names<-"(x, value=c("a","b")) > and I was commenting on the claim that the infix syntax is equivalent > to the prefix syntax. > > >> does this say anything about what 'names<-'(...) actually >> returns? updated *tmp*, or a copy of it? >> > > Since R uses pass-by-value, since? it doesn't! > you would expect the latter, wouldn't > you? 
yes, that's what i'd expect in a functional language.

> If you entertain the idea that 'names<-' updates *tmp* and
> returns the updated *tmp*, then you believe that 'names<-' behaves in a
> non-standard way and should take appropriate care.

i got lost in your argumentation. i have given examples of where 'names<-' destructively modifies and returns the updated object, not a copy. what is your point here?

> And the fact that a variable *tmp* is used hints to the fact that
> 'names<-' might have side-effect.

are you suggesting fortune telling from the fact that a variable *tmp* is used?

> If 'names<-' has side effects,
> then it might not be well defined with what value x ends up with if
> one executes:
>     x <- 'names<-'(x, value=c("a","b"))

not really, unless you mean the returned object in the referential sense (memory location) versus value conceptually. here x will obviously have the value of the original x plus the names, *but* indeed you cannot tell from this snippet whether after the assignment x will be the same, though updated, object or will rather be an updated copy:

    x = c(1)
    x = 'names<-'(x, 'foo')
    # x is the same object

    x = c(1)
    y = x
    x = 'names<-'(x, 'foo')
    # x is another object

so, as you say, it is not well defined with what object will x end up as its value, though the value of the object visible to the user is well defined. rewrite the above and play:

    x = c(1)
    y = 'names<-'(x, 'foo')
    names(x)

what are the names of x? is y identical (sensu reference) with x, is y different (sensu reference) but indiscernible (sensu value) from x, or is y different (sensu value) from x in that y has names and x doesn't?

> This is similar to the discussion what value i should have in the
> following C snippet:
>     i = 0;
>     i += i++;

nonsense, it's a *completely* different issue. here you touch the issue of the order of evaluation, and not of whether an object is copied or modified; above, the inverse is true.
in fact, your example is useless because the result here is clearly specified by the semantics (as far as i know -- prove me wrong). you lookup i (0) and i (0) (the order does not matter here), add these values (0), assign to i (0), and increase i (1). i have a better example for you: int i = 0; i += ++i - ++i which will give different final values for i in c (2 with gcc 4.2, 1 with gcc 3.4), c# and java (-1), perl (2) and php (1). again, this has nothing to do with the above. > > [..] > >>> I am not sure whether R ever behaved in that way, but as Peter >>> pointed out, this would be quite undesirable from a memory >>> management and performance point of view. >>> >> why? you can still use the infix names<- with destructive semantics >> to avoid copying. >> > > I guess that would require a rewrite (or extension) of the par
Re: [Rd] surprising behaviour of names<-
On Thu, 12 Mar 2009 10:53:19 +0100 Wacek Kusnierczyk wrote: > well, ?'names<-' says: > > " > Value: > For 'names<-', the updated object. > " > > which is only partially correct, in that the value will sometimes be > an updated *copy* of the object. But since R supposedly uses call-by-value (though we know how to circumvent that, don't we?) wouldn't you always expect that a copy of the object is returned? > > And the R Language manual (ignoring for the moment that it is a > > draft and all that), > > since we must... > > > clearly states that > > > > names(x) <- c("a","b") > > > > is equivalent to > > > > '*tmp*' <- x > > x <- "names<-"('*tmp*', value=c("a","b")) > > > > ... and? This seems to suggest that in this case the infix and prefix syntax is not equivalent as it does not say that names(x) <- c("a","b") is equivalent to x <- "names<-"(x, value=c("a","b")) and I was commenting on the claim that the infix syntax is equivalent to the prefix syntax. > does this say anything about what 'names<-'(...) actually > returns? updated *tmp*, or a copy of it? Since R uses pass-by-value, you would expect the latter, wouldn't you? If you entertain the idea that 'names<-' updates *tmp* and returns the updated *tmp*, then you believe that 'names<-' behaves in a non-standard way and should take appropriate care. And the fact that a variable *tmp* is used hints to the fact that 'names<-' might have side-effect. If 'names<-' has side effects, then it might not be well defined with what value x ends up with if one executes: x <- 'names<-'(x, value=c("a","b")) This is similar to the discussion what value i should have in the following C snippet: i = 0; i += i++; [..] > > I am not sure whether R ever behaved in that way, but as Peter > > pointed out, this would be quite undesirable from a memory > > management and performance point of view. > > why? you can still use the infix names<- with destructive semantics > to avoid copying. 
I guess that would require a rewrite (or extension) of the parser. To me, Section 10.1.2 of the Language Definition manual suggests that once an expression is parsed, you cannot distinguish any more whether 'names<-' was called using infix syntax or prefix syntax. Thus, I guess you want to start a discussion with R Core whether it is worthwhile to change the parser such that it keeps track on whether a function was used with infix notation or prefix notation and to provide for most (all?) assignment operators implementations that use destructive semantics if the infix version was used and always copy if the prefix notation is used. Cheers, Berwin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
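What the parser produces for each spelling can be inspected with quote(). The sketch below uses only base R and is an illustration, not a statement about what Section 10.1.2 intends. It checks that operator and backquoted-function spellings of the same call are identical after parsing, and it looks at which function heads the call in names(x) <- "foo" and in the explicit prefix call.

```r
# Infix and prefix spellings of the same call give identical parse trees.
same_plus   <- identical(quote(1 + 2), quote(`+`(1, 2)))
same_assign <- identical(quote(names(x) <- "foo"),
                         quote(`<-`(names(x), "foo")))

# The function position of each call, as a string.
head_infix  <- as.character(quote(names(x) <- "foo")[[1]])
head_prefix <- as.character(quote(`names<-`(x, "foo"))[[1]])

print(c(same_plus, same_assign))
print(c(head_infix, head_prefix))
```

The replacement expression is parsed as a call to `<-` whose first argument is names(x), whereas the explicit prefix form is a call to `names<-`. The two expressions compared in the thread therefore already differ in the parse tree, and the call to `names<-` in the first case is built only when the assignment is evaluated.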
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > > Whoever said that must have been at that moment not as precise as he or > she could have been. Also, R does not behave according to what people > say on this list (which is good, because some times people they wrong > things on this list) but according to how it is documented to do; at > least that is what people on this list (and others) say. :) > well, ?'names<-' says: " Value: For 'names<-', the updated object. " which is only partially correct, in that the value will sometimes be an updated *copy* of the object. > And the R Language manual (ignoring for the moment that it is a draft > and all that), since we must... > clearly states that > > names(x) <- c("a","b") > > is equivalent to > > '*tmp*' <- x > x <- "names<-"('*tmp*', value=c("a","b")) > ... and? does this say anything about what 'names<-'(...) actually returns? updated *tmp*, or a copy of it? > [...] > >> well, i can imagine a user using the prefix 'names<-' precisely under >> the assumption that it will perform functionally; >> > > You mean > y <- 'names<-'(x, "foo") > instead of > y <- x > names(y) <- "foo" > ? > what i mean is, rather precisely, that 'names<-'(x, 'foo') will produce a *new* object with a copy of the value of x and names as specified, and will *not*, under any circumstances, modify x. the first line above does not quite address this, e.g.: x = c(1) y = 'names<-'(x, 'foo') names(x) # "foo", 'should' be NULL > Fair enough. But I would still prefer the latter version this it is > (for me) easier to read and to decipher the intention of the code. > you're welcome to use it. but this is personal preference, and i'm trying to discuss the semantics of r here. what you show is a way to clutter the code, and you need to explicitly name the new object, while, in functional programming, it is typical to operate on anonymous objects passed from one function to another, e.g. 
    f('names<-'(x, 'foo'))

which would have to become

    y = x
    names(y) = 'foo'
    f(y)

or

    f({y = x; names(y) = 'foo'; y})

with 'y' being a nuisance name.

>> i.e., 'names<-'(x, 'foo') will always produce a copy of x with the
>> new names, and never change the x.
>
> I am not sure whether R ever behaved in that way, but as Peter pointed
> out, this would be quite undesirable from a memory management and
> performance point of view.

why? you can still use the infix names<- with destructive semantics to avoid copying.

> Imagine that every time you modify a (name)
> component of a large object a new copy of that object is created.

see above. besides, r has been several times claimed here (but see your remark above) to be a functional language, and in this context it is surprising that the smart (i mean it) copy-on-assignment mechanism, which is an implementational optimization, not only becomes visible, but also makes functions (hmm, procedures?) such as 'names<-' non-functional -- in some, but not all, cases.

vQ
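For the anonymous, functional style described above, base R also provides stats::setNames(), which returns a named copy and leaves its argument alone. A brief sketch follows; `f` is a hypothetical stand-in for whatever function consumes the named object.

```r
# Hypothetical consumer of a named vector.
f <- function(v) paste(names(v), v)

x <- c(1)
result <- f(stats::setNames(x, "foo"))   # no temporary name needed

print(result)
print(names(x))                          # x itself is unchanged
```

Because setNames() does its replacement on its own local argument, x stays unnamed regardless of how the copy is handled internally.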
Re: [Rd] surprising behaviour of names<-
On Thu, 12 Mar 2009 10:05:36 +0100 Wacek Kusnierczyk wrote:

> well, as far as i remember, it has been said on this list that in r
> the infix syntax is equivalent to the prefix syntax, [...]

Whoever said that must have been at that moment not as precise as he or she could have been. Also, R does not behave according to what people say on this list (which is good, because sometimes people say wrong things on this list) but according to how it is documented to do; at least that is what people on this list (and others) say. :)

And the R Language manual (ignoring for the moment that it is a draft and all that), clearly states that

    names(x) <- c("a","b")

is equivalent to

    '*tmp*' <- x
    x <- "names<-"('*tmp*', value=c("a","b"))

[...]

> well, i can imagine a user using the prefix 'names<-' precisely under
> the assumption that it will perform functionally;

You mean

    y <- 'names<-'(x, "foo")

instead of

    y <- x
    names(y) <- "foo"

?

Fair enough. But I would still prefer the latter version since it is (for me) easier to read and to decipher the intention of the code.

> i.e., 'names<-'(x, 'foo') will always produce a copy of x with the
> new names, and never change the x.

I am not sure whether R ever behaved in that way, but as Peter pointed out, this would be quite undesirable from a memory management and performance point of view. Imagine that every time you modify a (name) component of a large object a new copy of that object is created.

> cheers, and thanks for the discussion.

You are welcome.

Cheers,
Berwin
Re: [Rd] surprising behaviour of names<-
Wacek Kusnierczyk wrote:
> is precisely why i'd think that the prefix 'names<-' should never do
> destructive modifications, because that's what x = 'names<-'(x, 'foo'),
> and thus also names(x) = 'foo', is for.

to make the point differently, i'd expect the following two to be equivalent:

    x = c(1); 'names<-'(x, 'foo'); names(x)
    # "foo"

    x = c(1); do.call('names<-', list(x, 'foo')); names(x)
    # NULL

but they're obviously not. and of course, just that i'd expect it is not a strong argument.

vQ
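The comparison above is easy to rerun on whatever R is at hand. The sketch below records the outcome instead of assuming it, because whether a bare prefix call modifies x in place has depended on the R version and on how x was created. The variable names are illustrative only.

```r
x1 <- c(1)
invisible(`names<-`(x1, "foo"))                 # direct prefix call, result discarded
direct <- names(x1)

x2 <- c(1)
invisible(do.call("names<-", list(x2, "foo")))  # the same call made through do.call
via_do_call <- names(x2)

cat(R.version.string, "\n")
cat("direct call changed x: ", !is.null(direct), "\n")
cat("do.call changed x:     ", !is.null(via_do_call), "\n")
```

If both lines report FALSE, the running R treats the discarded prefix call as side-effect free in this case. A TRUE on the first line reproduces the behaviour reported in the message.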
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > On Wed, 11 Mar 2009 20:29:14 +0100 > Wacek Kusnierczyk wrote: > > >> Simon Urbanek wrote: >> >>> Wacek, >>> >>> Peter gave you a full answer explaining it very well. If you really >>> want to be able to trace each instance yourself, you have to learn >>> far more about R internals than you apparently know (and Peter >>> hinted at that). Internally x=1 an x=c(1) are slightly different in >>> that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = >>> 0 which is what causes the difference in behavior as Peter >>> explained. The reason is that c(1) creates a copy of the 1 (which >>> is a constant [=unmutable] thus requiring a copy) and the new copy >>> has no other references and thus can be modified and hence NAMED(x) >>> = 0. >>> >> simon, thanks for the explanation, it's now as clear as i might >> expect. >> >> now i'm concerned with what you say: that to understand something >> visible to the user one needs to "learn far more about R internals >> than one apparently knows". your response suggests that to use r >> without confusion one needs to know the internals, >> > > Simon can probably speak for himself, but according to my reading he > has not suggested anything similar to what you suggest he suggested. :) > so i did not say *he* suggested this. 'your response suggests' does not, on my reading, imply any intention from simon's side. but it's you who is an expert in (a dialect of) english, so i won't argue. > >> and this would be a really bad thing to say.. >> > > No problems, since he did not say anything vaguely similar to what you > suggest he said. > let's not depart from the point. vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
Berwin A Turlach wrote: > On Wed, 11 Mar 2009 20:31:18 +0100 > Wacek Kusnierczyk wrote: > > >> Simon Urbanek wrote: >> >>> On Mar 11, 2009, at 10:52 , Simon Urbanek wrote: >>> >>> Wacek, Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 an x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=unmutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0. >>> Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above >>> -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes >>> NAMED(x) = 1 -- this is just a detail on how things work with >>> assignment, the explanation above is still correct since >>> duplication happens conditional on NAMED == 2. >>> >> i guess this is what every user needs to know to understand the >> behaviour one can observe on the surface? >> > > Nope, only users who prefer to write '+'(1,2) instead of 1+2, or > 'names<-'(x, 'foo') instead of names(x)='foo'. > > well, as far as i remember, it has been said on this list that in r the infix syntax is equivalent to the prefix syntax, so no one wanting to use the form above should be afraid of different semantics; these two forms should be perfectly equivalent. after all, x = 1 names(x) = 'foo' names(x) should return NULL, because when the second assignment is made, we need to make a copy of the value of x, so it is the copy that should have changed names, not the value of x (which would still be the original 1). 
on the other hand, the fact that names(x) = 'foo' is (or so it seems) a shorthand for x = 'names<-'(x, 'foo') is precisely why i'd think that the prefix 'names<-' should never do destructive modifications, because that's what x = 'names<-'(x, 'foo'), and thus also names(x) = 'foo', is for. i guess the above is sort of blasphemy. > Attempting to change the name attribute of x via 'names<-'(x, 'foo') > looks to me as if one relies on a side effect of the function > 'names<-'; which, in my book would be a bad thing. indeed; so, for coherence, 'names<-' should always do the modification on a copy. it would then have semantics different from the infix form of 'names<-', but at least consistently so. > I.e. relying on side > effects of a function, or writing functions with side effects which are > then called for their side-effects; this, of course, excludes > functions like plot() :) I never had the need to call 'names<-'() > directly and cannot foresee circumstances in which I would do so. > > Plenty of users, including me, are happy using the latter forms and, > hence, never have to bother with understanding these implementation > details or have to bother about them. > > Your mileage obviously varies, but that is when you have to learn about > these internal details. If you call functions because of their > side-effects, you better learn what the side-effects are exactly. > well, i can imagine a user using the prefix 'names<-' precisely under the assumption that it will perform functionally; i.e., 'names<-'(x, 'foo') will always produce a copy of x with the new names, and never change the x. that there will be a destructive modification made to x on some, but not all, occasions, is hardly a good thing in this context -- and it's not a situation where a user wants to use the function "because of its side effects", quite to the contrary. 
this was actually the situation i had when i first discovered the surprising behaviour of 'names<-'; i thought 'names<-' did *not* have side effects.

cheers, and thanks for the discussion.

vQ
Re: [Rd] surprising behaviour of names<-
On Wed, 11 Mar 2009 20:29:14 +0100 Wacek Kusnierczyk wrote: > Simon Urbanek wrote: > > Wacek, > > > > Peter gave you a full answer explaining it very well. If you really > > want to be able to trace each instance yourself, you have to learn > > far more about R internals than you apparently know (and Peter > > hinted at that). Internally x=1 an x=c(1) are slightly different in > > that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = > > 0 which is what causes the difference in behavior as Peter > > explained. The reason is that c(1) creates a copy of the 1 (which > > is a constant [=unmutable] thus requiring a copy) and the new copy > > has no other references and thus can be modified and hence NAMED(x) > > = 0. > > > simon, thanks for the explanation, it's now as clear as i might > expect. > > now i'm concerned with what you say: that to understand something > visible to the user one needs to "learn far more about R internals > than one apparently knows". your response suggests that to use r > without confusion one needs to know the internals, Simon can probably speak for himself, but according to my reading he has not suggested anything similar to what you suggest he suggested. :) > and this would be a really bad thing to say.. No problems, since he did not say anything vaguely similar to what you suggest he said. Cheers, Berwin __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] surprising behaviour of names<-
On Wed, 11 Mar 2009 20:31:18 +0100 Wacek Kusnierczyk wrote:

> Simon Urbanek wrote:
> >
> > On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:
> >
> >> Wacek,
> >>
> >> Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 and x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=immutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0.
> >
> > Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes NAMED(x) = 1 -- this is just a detail on how things work with assignment, the explanation above is still correct since duplication happens conditional on NAMED == 2.
>
> i guess this is what every user needs to know to understand the behaviour one can observe on the surface?

Nope, only users who prefer to write '+'(1,2) instead of 1+2, or 'names<-'(x, 'foo') instead of names(x)='foo'.

Attempting to change the name attribute of x via 'names<-'(x, 'foo') looks to me as if one relies on a side effect of the function 'names<-'; which, in my book, would be a bad thing. I.e. relying on side effects of a function, or writing functions with side effects which are then called for their side-effects; this, of course, excludes functions like plot() :) I never had the need to call 'names<-'() directly and cannot foresee circumstances in which I would do so.

Plenty of users, including me, are happy using the latter forms and, hence, never have to bother with these implementation details. Your mileage obviously varies, but that is when you have to learn about these internal details. If you call functions because of their side-effects, you had better learn what the side-effects are exactly.

Cheers,

Berwin
Re: [Rd] surprising behaviour of names<-
Simon Urbanek wrote:
>
> On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:
>
>> Wacek,
>>
>> Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 and x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=immutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0.
>
> Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes NAMED(x) = 1 -- this is just a detail on how things work with assignment, the explanation above is still correct since duplication happens conditional on NAMED == 2.

i guess this is what every user needs to know to understand the behaviour one can observe on the surface?

thanks for further clarifications.

vQ
Re: [Rd] surprising behaviour of names<-
Simon Urbanek wrote:
> Wacek,
>
> Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 and x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=immutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0.

simon, thanks for the explanation, it's now as clear as i might expect.

now i'm concerned with what you say: that to understand something visible to the user one needs to "learn far more about R internals than one apparently knows". your response suggests that to use r without confusion one needs to know the internals, and this would be a really bad thing to say..

i have long been concerned that r unnecessarily exposes users to its internals, and here's one more example of how the interface fails to hide the guts.

(and peter did not give me a full answer, but a vague hint.)

vQ
Re: [Rd] surprising behaviour of names<-
On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:

> Wacek,
>
> Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 and x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=immutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0.

Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes NAMED(x) = 1 -- this is just a detail on how things work with assignment, the explanation above is still correct since duplication happens conditional on NAMED == 2.

Cheers,
Simon

On Mar 10, 2009, at 18:16 , Wacek Kusnierczyk wrote:

> i got an offline response saying that my original post may have not been clear as to what the problem was, essentially, and that i may need to restate it in words, in addition to code.
>
> the problem is: the performance of 'names<-' is incoherent, in that in some situations it acts in a functional manner, producing a copy of its argument with the names changed, while in others it changes the object in-place (and returns it), without copying first. your explanation below is of course valid, but does not seem to address the issue. in the examples below, there is always (or so it seems) just one reference to the object.
>
> why are the following functional:
>
>     x = 1; 'names<-'(x, 'foo'); names(x)
>     x = 'foo'; 'names<-'(x, 'foo'); names(x)
>
> while these are destructive:
>
>     x = c(1); 'names<-'(x, 'foo'); names(x)
>     x = c('foo'); 'names<-'(x, 'foo'); names(x)
>
> it is claimed that in r a singular value is a one-element vector, and indeed,
>
>     identical(1, c(1))
>     # TRUE
>     all.equal(is(1), is(c(1)))
>     # TRUE
>
> i also do not understand the difference here:
>
>     x = c(1); 'names<-'(x, 'foo'); names(x)
>     # "foo"
>     x = c(1); names(x); 'names<-'(x, 'foo'); names(x)
>     # "foo"
>     x = c(1); print(x); 'names<-'(x, 'foo'); names(x)
>     # NULL
>     x = c(1); print(c(x)); 'names<-'(x, 'foo'); names(x)
>     # "foo"
>
> does print, but not names, increase the reference count for x when applied to x, but not to c(x)?
>
> if the issue is that there is, in those examples where x is left unchanged, an additional reference to x that causes the value of x to be copied, could you please explain how and when this additional reference is created?
>
> thanks,
> vQ
>
> Peter Dalgaard wrote:
>>> is there something i misunderstand here?
>>
>> Only the ideology/pragmatism... In principle, R has call-by-value semantics and a function does not destructively modify its arguments(*), and foo(x)<-bar behaves like x <- "foo<-"(x, bar). HOWEVER, this has obvious performance repercussions (think x <- rnorm(1e7); x[1] <- 0), so we do allow destructive modification by replacement functions, PROVIDED that the x is not used by anything else. On the least suspicion that something else is using the object, a copy of x is made before the modification.
>>
>> So
>>
>> (A) you should not use code like y <- "foo<-"(x, bar)
>>
>> because
>>
>> (B) you cannot (easily) predict whether or not x will be modified destructively
>>
>> (*) unless you mess with match.call() or substitute() and the like. But that's a different story.
>
> --
> Wacek Kusnierczyk, MD PhD
>
> Email: w...@idi.ntnu.no
> Phone: +47 73591875, +47 72574609
>
> Department of Computer and Information Science (IDI)
> Faculty of Information Technology, Mathematics and Electrical Engineering (IME)
> Norwegian University of Science and Technology (NTNU)
> Sem Saelands vei 7, 7491 Trondheim, Norway
> Room itv303
>
> Bioinformatics & Gene Regulation Group
> Department of Cancer Research and Molecular Medicine (IKM)
> Faculty of Medicine (DMF)
> Norwegian University of Science and Technology (NTNU)
> Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway
> Room 231.05.060
Re: [Rd] surprising behaviour of names<-
Wacek,

Peter gave you a full answer explaining it very well. If you really want to be able to trace each instance yourself, you have to learn far more about R internals than you apparently know (and Peter hinted at that). Internally x=1 and x=c(1) are slightly different in that the former has NAMED(x) = 2 whereas the latter has NAMED(x) = 0 which is what causes the difference in behavior as Peter explained. The reason is that c(1) creates a copy of the 1 (which is a constant [=immutable] thus requiring a copy) and the new copy has no other references and thus can be modified and hence NAMED(x) = 0.

Cheers,
Simon

On Mar 10, 2009, at 18:16 , Wacek Kusnierczyk wrote:

> i got an offline response saying that my original post may have not been clear as to what the problem was, essentially, and that i may need to restate it in words, in addition to code.
>
> the problem is: the performance of 'names<-' is incoherent, in that in some situations it acts in a functional manner, producing a copy of its argument with the names changed, while in others it changes the object in-place (and returns it), without copying first. your explanation below is of course valid, but does not seem to address the issue. in the examples below, there is always (or so it seems) just one reference to the object.
>
> why are the following functional:
>
>     x = 1; 'names<-'(x, 'foo'); names(x)
>     x = 'foo'; 'names<-'(x, 'foo'); names(x)
>
> while these are destructive:
>
>     x = c(1); 'names<-'(x, 'foo'); names(x)
>     x = c('foo'); 'names<-'(x, 'foo'); names(x)
>
> it is claimed that in r a singular value is a one-element vector, and indeed,
>
>     identical(1, c(1))
>     # TRUE
>     all.equal(is(1), is(c(1)))
>     # TRUE
>
> i also do not understand the difference here:
>
>     x = c(1); 'names<-'(x, 'foo'); names(x)
>     # "foo"
>     x = c(1); names(x); 'names<-'(x, 'foo'); names(x)
>     # "foo"
>     x = c(1); print(x); 'names<-'(x, 'foo'); names(x)
>     # NULL
>     x = c(1); print(c(x)); 'names<-'(x, 'foo'); names(x)
>     # "foo"
>
> does print, but not names, increase the reference count for x when applied to x, but not to c(x)?
>
> if the issue is that there is, in those examples where x is left unchanged, an additional reference to x that causes the value of x to be copied, could you please explain how and when this additional reference is created?
>
> thanks,
> vQ
>
> Peter Dalgaard wrote:
>>> is there something i misunderstand here?
>>
>> Only the ideology/pragmatism... In principle, R has call-by-value semantics and a function does not destructively modify its arguments(*), and foo(x)<-bar behaves like x <- "foo<-"(x, bar). HOWEVER, this has obvious performance repercussions (think x <- rnorm(1e7); x[1] <- 0), so we do allow destructive modification by replacement functions, PROVIDED that the x is not used by anything else. On the least suspicion that something else is using the object, a copy of x is made before the modification.
>>
>> So
>>
>> (A) you should not use code like y <- "foo<-"(x, bar)
>>
>> because
>>
>> (B) you cannot (easily) predict whether or not x will be modified destructively
>>
>> (*) unless you mess with match.call() or substitute() and the like. But that's a different story.
>
> --
> Wacek Kusnierczyk, MD PhD
>
> Email: w...@idi.ntnu.no
> Phone: +47 73591875, +47 72574609
>
> Department of Computer and Information Science (IDI)
> Faculty of Information Technology, Mathematics and Electrical Engineering (IME)
> Norwegian University of Science and Technology (NTNU)
> Sem Saelands vei 7, 7491 Trondheim, Norway
> Room itv303
>
> Bioinformatics & Gene Regulation Group
> Department of Cancer Research and Molecular Medicine (IKM)
> Faculty of Medicine (DMF)
> Norwegian University of Science and Technology (NTNU)
> Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway
> Room 231.05.060
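One way to watch the duplication Simon describes, rather than reasoning about NAMED by hand, is tracemem(). The following is only a rough sketch, assuming an R build with memory profiling enabled (checked here with capabilities("profmem")). The addresses it reports, and how many copies are reported, differ between R versions, so no output is shown.

```r
x <- c(1, 2)
if (capabilities("profmem")) {
  tracemem(x)            # ask R to report whenever this vector is duplicated
}
y <- x                   # a second reference to the same value
names(x) <- "foo"        # the value is shared with y, so R copies before modifying
if (capabilities("profmem")) {
  untracemem(x)          # stop tracing
}
print(names(x))          # names now attached to x
print(names(y))          # y was not touched
```

Whatever tracemem() reports, the visible result is the same: only x gains names, and y keeps the original unnamed value.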
Re: [Rd] surprising behaviour of names<-
i got an offline response saying that my original post may have not been clear as to what the problem was, essentially, and that i may need to restate it in words, in addition to code.

the problem is: the performance of 'names<-' is incoherent, in that in some situations it acts in a functional manner, producing a copy of its argument with the names changed, while in others it changes the object in-place (and returns it), without copying first. your explanation below is of course valid, but does not seem to address the issue. in the examples below, there is always (or so it seems) just one reference to the object.

why are the following functional:

    x = 1; 'names<-'(x, 'foo'); names(x)
    x = 'foo'; 'names<-'(x, 'foo'); names(x)

while these are destructive:

    x = c(1); 'names<-'(x, 'foo'); names(x)
    x = c('foo'); 'names<-'(x, 'foo'); names(x)

it is claimed that in r a singular value is a one-element vector, and indeed,

    identical(1, c(1))
    # TRUE
    all.equal(is(1), is(c(1)))
    # TRUE

i also do not understand the difference here:

    x = c(1); 'names<-'(x, 'foo'); names(x)
    # "foo"
    x = c(1); names(x); 'names<-'(x, 'foo'); names(x)
    # "foo"
    x = c(1); print(x); 'names<-'(x, 'foo'); names(x)
    # NULL
    x = c(1); print(c(x)); 'names<-'(x, 'foo'); names(x)
    # "foo"

does print, but not names, increase the reference count for x when applied to x, but not to c(x)?

if the issue is that there is, in those examples where x is left unchanged, an additional reference to x that causes the value of x to be copied, could you please explain how and when this additional reference is created?

thanks,
vQ

Peter Dalgaard wrote:
>> is there something i misunderstand here?
>
> Only the ideology/pragmatism... In principle, R has call-by-value semantics and a function does not destructively modify its arguments(*), and foo(x)<-bar behaves like x <- "foo<-"(x, bar). HOWEVER, this has obvious performance repercussions (think x <- rnorm(1e7); x[1] <- 0), so we do allow destructive modification by replacement functions, PROVIDED that the x is not used by anything else. On the least suspicion that something else is using the object, a copy of x is made before the modification.
>
> So
>
> (A) you should not use code like y <- "foo<-"(x, bar)
>
> because
>
> (B) you cannot (easily) predict whether or not x will be modified destructively
>
> (*) unless you mess with match.call() or substitute() and the like. But that's a different story.

--
Wacek Kusnierczyk, MD PhD

Email: w...@idi.ntnu.no
Phone: +47 73591875, +47 72574609

Department of Computer and Information Science (IDI)
Faculty of Information Technology, Mathematics and Electrical Engineering (IME)
Norwegian University of Science and Technology (NTNU)
Sem Saelands vei 7, 7491 Trondheim, Norway
Room itv303

Bioinformatics & Gene Regulation Group
Department of Cancer Research and Molecular Medicine (IKM)
Faculty of Medicine (DMF)
Norwegian University of Science and Technology (NTNU)
Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway
Room 231.05.060
Re: [Rd] surprising behaviour of names<-
Peter Dalgaard wrote:
>
> (*) unless you mess with match.call() or substitute() and the like. But that's a different story.

different or not, it is a story that happens quite often -- too often, perhaps -- to the degree that one may be tempted to say that the semantics of argument passing in r is a mess. which of course is not true, but since it is possible to mess with match.call & co, people (including r core) do mess with them, and the result is obviously a mess. on top of the clear call-by-need semantics -- and on the surface, you cannot tell how the arguments of a function will be taken (by value? by reference? not at all?), which in effect looks like a messy semantics.

vQ
Re: [Rd] surprising behaviour of names<-
Stavros Macrakis wrote:
>>> (B) you cannot (easily) predict whether or not x will be modified destructively
>>
>> that's fine, thanks, but i must be terribly stupid as i do not see how this explains the examples above. where is the x used by something else in the first example, so that 'names<-'(x, 'foo') does *not* modify x destructively, while it does in the other cases?
>>
>> i just can't see how your explanation fits the examples -- it probably does, but i beg you show it explicitly.
>
> I think the following shows what Peter was referring to:
>
> In this case, there is only one pointer to the value of x:
>
> > x <- c(1,2)
> > "names<-"(x,"foo")
>  foo <NA>
>    1    2
> > x
>  foo <NA>
>    1    2
>
> In this case, there are two:
>
> > x <- c(1,2)
> > y <- x
> > "names<-"(x,"foo")
>  foo <NA>
>    1    2
> > x
> [1] 1 2
> > y
> [1] 1 2

that is and was clear to me, but none of my examples was of the second form, and hence i think peter's answer did not answer my question. what's the difference here:

    x = 1
    'names<-'(x, 'foo')
    names(x)
    # NULL

    x = c(foo=1)
    'names<-'(x, 'foo')
    names(x)
    # "foo"

certainly not something like what you show. what's the difference here:

    x = 1
    'names<-'(x, 'foo')
    names(x)
    # NULL

    x = 1:2
    'names<-'(x, c('foo', 'bar'))
    names(x)
    # "foo" "bar"

certainly not something like what you show.

> It seems as though `names<-` and the like cannot be treated as R functions (which do not modify their arguments) but as special internal routines which do sometimes modify their arguments.

they seem to behave somewhat like macros:

    'names<-'(a, b)

with the destructive 'names<-' is sort of replaced with

    a = 'names<-'(a, b)

with a functional 'names<-'. but this still does not explain the incoherence above. my problem was and is not that 'names<-' is not a pure function, but that it sometimes is, sometimes is not, without any obvious explanation. that is, i suspect (not claim) that the behaviour is not a design feature, but an incident.

vQ
Re: [Rd] surprising behaviour of names<-
>> (B) you cannot (easily) predict whether or not x will be modified destructively
>
> that's fine, thanks, but i must be terribly stupid as i do not see how this explains the examples above. where is the x used by something else in the first example, so that 'names<-'(x, 'foo') does *not* modify x destructively, while it does in the other cases?
>
> i just can't see how your explanation fits the examples -- it probably does, but i beg you show it explicitly.

I think the following shows what Peter was referring to:

In this case, there is only one pointer to the value of x:

> x <- c(1,2)
> "names<-"(x,"foo")
 foo <NA>
   1    2
> x
 foo <NA>
   1    2

In this case, there are two:

> x <- c(1,2)
> y <- x
> "names<-"(x,"foo")
 foo <NA>
   1    2
> x
[1] 1 2
> y
[1] 1 2

It seems as though `names<-` and the like cannot be treated as R functions (which do not modify their arguments) but as special internal routines which do sometimes modify their arguments.

-s
Re: [Rd] surprising behaviour of names<-
Peter Dalgaard wrote:
> Wacek Kusnierczyk wrote:
>> playing with 'names<-', i observed the following:
>>
>>     x = 1
>>     names(x)
>>     # NULL
>>     'names<-'(x, 'foo')
>>     # c(foo=1)
>>     names(x)
>>     # NULL
>>
>> where 'names<-' has a functional flavour (does not change x), but:
>>
>>     x = 1:2
>>     names(x)
>>     # NULL
>>     'names<-'(x, 'foo')
>>     # c(foo=1, 2)
>>     names(x)
>>     # "foo" NA
>>
>> where 'names<-' seems to perform a side effect on x (destructively modifies x). furthermore:
>>
>>     x = c(foo=1)
>>     names(x)
>>     # "foo"
>>     'names<-'(x, NULL)
>>     names(x)
>>     # NULL
>>     'names<-'(x, 'bar')
>>     names(x)
>>     # "bar" !!!
>>
>>     x = c(foo=1)
>>     names(x)
>>     # "foo"
>>     'names<-'(x, 'bar')
>>     names(x)
>>     # "bar" !!!
>>
>> where 'names<-' is not only able to destructively remove names from x, but also destructively add or modify them (quite unlike in the first example above).
>>
>> analogous code but using 'dimnames<-' on a matrix performs a side effect on the matrix even if it initially does not have dimnames:
>>
>>     x = matrix(1,1,1)
>>     dimnames(x)
>>     # NULL
>>     'dimnames<-'(x, list('foo', 'bar'))
>>     dimnames(x)
>>     # list("foo", "bar")
>>
>> this is incoherent with the first example above, in that in both cases the structure initially has no names or dimnames attribute, but the end result is different in the two examples.
>>
>> is there something i misunderstand here?
>
> Only the ideology/pragmatism... In principle, R has call-by-value semantics and a function does not destructively modify its arguments(*), and foo(x)<-bar behaves like x <- "foo<-"(x, bar). HOWEVER, this has obvious performance repercussions (think x <- rnorm(1e7); x[1] <- 0), so we do allow destructive modification by replacement functions, PROVIDED that the x is not used by anything else. On the least suspicion that something else is using the object, a copy of x is made before the modification.
>
> So
>
> (A) you should not use code like y <- "foo<-"(x, bar)
>
> because
>
> (B) you cannot (easily) predict whether or not x will be modified destructively

that's fine, thanks, but i must be terribly stupid as i do not see how this explains the examples above. where is the x used by something else in the first example, so that 'names<-'(x, 'foo') does *not* modify x destructively, while it does in the other cases?

i just can't see how your explanation fits the examples -- it probably does, but i beg you show it explicitly.

thanks.
vQ
Re: [Rd] surprising behaviour of names<-
Wacek Kusnierczyk wrote:
> playing with 'names<-', i observed the following:
>
>     x = 1
>     names(x)
>     # NULL
>     'names<-'(x, 'foo')
>     # c(foo=1)
>     names(x)
>     # NULL
>
> where 'names<-' has a functional flavour (does not change x), but:
>
>     x = 1:2
>     names(x)
>     # NULL
>     'names<-'(x, 'foo')
>     # c(foo=1, 2)
>     names(x)
>     # "foo" NA
>
> where 'names<-' seems to perform a side effect on x (destructively modifies x). furthermore:
>
>     x = c(foo=1)
>     names(x)
>     # "foo"
>     'names<-'(x, NULL)
>     names(x)
>     # NULL
>     'names<-'(x, 'bar')
>     names(x)
>     # "bar" !!!
>
>     x = c(foo=1)
>     names(x)
>     # "foo"
>     'names<-'(x, 'bar')
>     names(x)
>     # "bar" !!!
>
> where 'names<-' is not only able to destructively remove names from x, but also destructively add or modify them (quite unlike in the first example above).
>
> analogous code but using 'dimnames<-' on a matrix performs a side effect on the matrix even if it initially does not have dimnames:
>
>     x = matrix(1,1,1)
>     dimnames(x)
>     # NULL
>     'dimnames<-'(x, list('foo', 'bar'))
>     dimnames(x)
>     # list("foo", "bar")
>
> this is incoherent with the first example above, in that in both cases the structure initially has no names or dimnames attribute, but the end result is different in the two examples.
>
> is there something i misunderstand here?

Only the ideology/pragmatism... In principle, R has call-by-value semantics and a function does not destructively modify its arguments(*), and foo(x)<-bar behaves like x <- "foo<-"(x, bar). HOWEVER, this has obvious performance repercussions (think x <- rnorm(1e7); x[1] <- 0), so we do allow destructive modification by replacement functions, PROVIDED that the x is not used by anything else. On the least suspicion that something else is using the object, a copy of x is made before the modification.

So

(A) you should not use code like y <- "foo<-"(x, bar)

because

(B) you cannot (easily) predict whether or not x will be modified destructively

(*) unless you mess with match.call() or substitute() and the like. But that's a different story.

--
   O__        Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ ---  Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) --  University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~ -          (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
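Peter's rule of thumb can be sketched in a few lines of R. This is an illustration only, assuming a current R session. setNames() from the stats package is brought in here as a purely functional alternative to calling "names<-" directly; it is not mentioned in the thread.

```r
# Replacement form: names(x) <- value is evaluated as x <- `names<-`(x, value),
# so x is always rebound to the result and other variables are never affected.
x <- c(1, 2)
y <- x                          # y refers to the same value as x
names(x) <- c("a", "b")
print(names(x))
print(names(y))                 # y still has no names

# Functional form: setNames() returns a named copy and leaves its argument alone,
# so there is no need to guess whether z was modified in place.
z <- c(1, 2)
w <- setNames(z, c("a", "b"))
print(names(w))
print(names(z))                 # z still has no names
```

Both forms avoid point (B): neither depends on whether the underlying vector happened to be shared at the time of the call.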