On Mon, 5 Sep 2005, [EMAIL PROTECTED] wrote:

> Luke Tierney <[EMAIL PROTECTED]> wrote:
>
>> It might or might not work now but is not guaranteed to do so reliably
>> in the future. Seeing the risks of leaving SETLENGTH exposed, it is
>> very likely that SETLENGTH will be removed from the sources after the
>> 2.2.0 release.
>
>> If you provide your own methods to read and write the external pointer
>> then you don't need this; this is safer than relying on undocumented
>> behavior of [ and [<- in any case. You also then don't need to use
>> R_PreserveObject unless you really need to use it from the C level
>> outside of a context where an R reference exists.
>
> I'm not sure I follow this. Maybe I should explain the context for
> the problem.
>
> textConnection("xyz", "w") creates a connection, the output of which
> is deposited in a char vector named "xyz", which is updated line by
> line as output is sent to the connection. The current code maintains
> a pointer to "xyz" in the form of an unprotected SEXP. Hence if the
> user does rm(xyz), bad things happen. A small bug, I admit.
>
> I think the best fix is to use a protected reference to the result
> vector. I think this is safe and doesn't rely on any abuse of the
> interfaces.
>
> There's also a performance issue, that the result is updated after
> every line of output, resulting in a vast amount of copying if a large
> result is accumulated. This is the part that could be fixed by using
> SETLENGTH to manage the length of the protected result vector.
>
> I'm not sure what you mean by undocumented behavior of [ and [<-. I
> think all I'm relying on is that as long as an outstanding reference
> to the result vector exists, that R has to make sure the reference
> remains valid, and hence can't change the memory allocation of the
> result vector in any way. I don't care what else happens to the
> contents of the vector, as long as I get to control when it is
> released. It is ok with me if the user modifies the result vector
> in-place, since my reference stays valid. So I don't actually care
> how [ and [<- work.

It would have helped to explain what you are up to. I had to guess and
guessed wrong, so forget the [ and [<- issue for now.

> I think the only undocumented thing I'm relying on, is that the memory
> manager doesn't pay attention to the LENGTH of objects that it isn't
> actively doing anything to. Currently, it actually only uses LENGTH
> in one spot: for updating R_LargeVallocSize when a large vector is
> released. The true allocation sizes for individual objects are always
> kept in another place (either by malloc, or in the node class of the
> object).
>
> It seems like in this limited usage, SETLENGTH does represent a useful
> feature, by permitting safe over-allocation of a protected object, and
> might be worth preserving (and documenting) for that purpose.

I am not comfortable making this available at this point. It might be
useful to have but would need careful thought. Without some way to
find out the true length there are potential problems. Without some
way of making sure the fields in VECSXP and STRSXP that are added are
valid there are potential problems (not the first time but if the size
is shrunk and then increased). Not that this can't be resolved but it
would take time that I don't have now, and this isn't high priority
enough to schedule in the near future. So for now you should not use
SETLENGTH if you want your code to work beyond 2.2.0.

> Of course, the real problem here is the semantics of textConnection(),
> which make life much more difficult and can't be changed because they
> are specified outside of R.

It may be possible to expand the semantics by adding a logical
argument that controls whether the vector is to be over-allocated and
filled with zero length strings and truncated to the true length on
close. Another variant would be to have a logical argument that says
to keep the input internally and provide a function, say
textConnectionOutput, to retrieve the internal output. I would then
use a linked list internally.
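
The linked-list variant can be sketched in standalone C. This is only a
rough illustration under stated assumptions, not R's connection code:
line_node, line_buffer and the three functions are hypothetical names,
and plain malloc stands in for whatever allocation R would use
internally. Each append is O(1), and nothing is copied until the lines
are collected once at the end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One captured output line (hypothetical type, not part of R). */
typedef struct line_node {
    char *text;
    struct line_node *next;
} line_node;

/* Accumulator: keeping a tail pointer makes each append O(1). */
typedef struct {
    line_node *head;
    line_node *tail;
    size_t count;
} line_buffer;

static void line_buffer_init(line_buffer *buf) {
    assert(buf != NULL);
    buf->head = NULL;
    buf->tail = NULL;
    buf->count = 0;
}

/* Append a copy of `text`. Returns 0 on success, -1 if allocation fails. */
static int line_buffer_append(line_buffer *buf, const char *text) {
    size_t n = strlen(text) + 1;
    line_node *node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->text = malloc(n);
    if (node->text == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->text, text, n);
    node->next = NULL;
    if (buf->tail != NULL)
        buf->tail->next = node;
    else
        buf->head = node;
    buf->tail = node;
    buf->count++;
    return 0;
}

/* Move all lines into one freshly allocated array and empty the buffer.
   The caller owns the array and each string in it. Returns NULL (with
   *n_out set to 0) if the buffer is empty or the allocation fails. */
static char **line_buffer_collect(line_buffer *buf, size_t *n_out) {
    char **out;
    size_t i = 0;
    line_node *node;

    *n_out = 0;
    if (buf->count == 0) return NULL;
    out = malloc(buf->count * sizeof *out);
    if (out == NULL) return NULL;
    node = buf->head;
    while (node != NULL) {
        line_node *next = node->next;
        out[i++] = node->text;  /* hand the string over, no copy */
        free(node);
        node = next;
    }
    *n_out = i;
    line_buffer_init(buf);
    return out;
}
```

A caller would invoke line_buffer_append once per completed output line
and line_buffer_collect when the output is retrieved, so the cost of
building the result vector is paid once rather than after every line.
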
The semantics of close complicate this a bit; this function would
probably need to optionally close the connection to get a final
complete line.

luke

--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:      [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel