[R] Question
Hi,

I was fitting this model with two matrices, but the bifit function didn't run in my RStudio and I got an error message. Could anyone guide me on how to install this function? I am giving the commands below for better understanding.

x.f <- cbind(log(Q5.all.pf), log(Q5.all.pf)^2)
y.f <- t(log(mx5x5.all.pf))
bifit.f <- bifit(x.f, y.f, c=6)

Ankita Shukla
Research Scholar
International Institute for Population Sciences
Mumbai, India

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] wiki link on main page is down?
On the left panel of http://www.r-project.org/ there is a link titled "Wiki" which points to http://rwiki.sciviews.org/ . However, this link is broken and gives a 404 Not Found error. Could you please fix it?

Thanks,
raju
[R] help
I want to analyse survival data using the type I half-logistic distribution. How can I go about it? The distributions installed in R in the survival package don't include this one. Or I need code that uses maximum likelihood to estimate the parameter in a survival analysis. A typical example for a distribution other than those installed in R would help.

Thanks
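A minimal sketch of one way to approach this, under assumptions that are not in the post. It takes the distribution to be the one-parameter (scale) half-logistic with density f(t) = (2/sigma) exp(-t/sigma) / (1 + exp(-t/sigma))^2 for t >= 0, which may differ from what "type I" means in a particular paper, and it assumes right censoring. The function name hl_negloglik and the simulated data are made up for illustration; only base R is used.

```r
# Negative log-likelihood for a half-logistic scale model with right censoring.
# time: observed times; status: 1 = event observed, 0 = right-censored.
# The scale is optimised on the log scale so that it stays positive.
hl_negloglik <- function(log_sigma, time, status) {
  sigma <- exp(log_sigma)
  z <- time / sigma
  log_f <- log(2) - log(sigma) - z - 2 * log1p(exp(-z))  # log density
  log_S <- log(2) - z - log1p(exp(-z))                   # log survival
  -sum(status * log_f + (1 - status) * log_S)
}

# Simulated data (illustration only): true scale 2, censoring at time 5.
set.seed(1)
n <- 2000
u <- runif(n)
t_true <- 2 * log((1 + u) / (1 - u))   # inverse-CDF draw from the half-logistic
time <- pmin(t_true, 5)
status <- as.numeric(t_true <= 5)

# One-dimensional maximum likelihood via optimize().
fit <- optimize(hl_negloglik, interval = c(-5, 5), time = time, status = status)
sigma_hat <- exp(fit$minimum)
sigma_hat
```

For a model with covariates or more than one parameter, the same negative log-likelihood idea carries over to optim(), with a vector of parameters in place of log_sigma.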
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, 25 Dec 2014, Bert Gunter wrote:

> You persist in failing to read the docs!

"The docs" -- do those exclude the ones I have been quoting and linking to?

> Moreover, neither Hadley Wickham, nor anyone else, is the authoritative source for R usage (other than for the (many!) packages he, himself has authored). R's Help pages and manuals -- and ultimately the source code -- are the only such source.

Very pedantic. Are you saying that Hadley Wickham's claim was incorrect? To repeat, he said that this would return TRUE if x were a vector:

is.atomic(x) || is.list(x)

If you think that is wrong, I'd be interested to know more about that.

> ?factor says in its very first line:
>
> "The function factor is used to encode a **vector** as a factor (the terms ‘category’ and ‘enumerated type’ are also used for factors)"
>
> (emphasis added)

So what? Are you saying that a factor *is* a vector? I quoted this before, but I'll repeat it here -- from the third paragraph of the Details section of ?vector:

Note that factors are _not_ vectors; ‘is.vector’ returns ‘FALSE’ and ‘as.vector’ converts a factor to a character vector for ‘mode = "any"’.

I guess that is an "authoritative source" by your criteria even though it isn't in the first line of the page.

> f <- factor (letters[1:3])
> f
> [1] a b c
> Levels: a b c
> attributes(f)
> $levels
> [1] "a" "b" "c"
>
> $class
> [1] "factor"
>
> is.vector(f)
> [1] FALSE
> attributes(f) <- NULL
> f
> [1] 1 2 3
> is.vector(f)
> [1] TRUE

And your point is what? Yes, we can convert between different kinds of objects. Are you saying that a factor *is* a vector because you can coerce it into a vector by removing its attributes?

I do think it is very central to this discussion that attributes(x) <- NULL makes x into a vector, and that is not true just for factors, but also matrices, as you showed me earlier. Following your lead, this is another example:

b <- 1:4
attr(b, "dim") <- c(2,2)
is.matrix(b)
[1] TRUE

Does that mean that "a matrix is a vector"? Not for me, but it does make it easy to see how that concept helps people to understand the internal workings of R.

Gabor Grothendieck wrote, "I think it's the idea that in R all data objects are vectors (for some notion of vector) in the sense that all Lisp objects are lists, all APL objects are arrays and all tcl objects are character strings."

That's how I've been thinking about it, too, but I'm not sure that *all* data objects are vectors in this sense. If that were the case, the Wickham test would always return TRUE.

> Don't you think it's time to call a halt to this?

You go first.

Mike

--
Michael B. Miller, Ph.D.
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4J
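Whether the test quoted from Advanced R can ever return FALSE is easy to try on a few objects. A small sketch; the helper name wickham_test is made up here and the objects are just examples.

```r
# Hypothetical helper wrapping the test quoted from Advanced R.
wickham_test <- function(x) is.atomic(x) || is.list(x)

wickham_test(1:3)                # atomic vector
wickham_test(factor("a"))        # factor: integer codes plus attributes
wickham_test(matrix(1:4, 2, 2))  # matrix: atomic data with a dim attribute
wickham_test(data.frame(a = 1))  # data frame: a list underneath
wickham_test(sum)                # function: neither atomic nor a list
wickham_test(globalenv())        # environment
wickham_test(quote(x))           # symbol
```

The first four calls return TRUE and the last three return FALSE, so functions, environments and symbols are examples of R objects that are not vectors in this sense.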
Re: [R] debugging R code and dealing with dependencies
On Thu, 25 Dec 2014, David Winsemius wrote:

> On Dec 25, 2014, at 1:04 AM, Mike Miller wrote:
>
>> I just wanted to put this out there. It's just some of my observations about things that happen with R, or happened in this particular investigation. There were definitely some lessons for me in this, and maybe that will be true of someone else. The main thing I picked up is that it is good to put plenty of checks into our code -- if we expect input of a certain type or class, then I should either coerce input into that structure or test the input and throw an error. If the function works very differently for different kinds of input, this should be documented. The more people are doing this, the better things will go for everyone.
>>
>> I was working with a CRAN package called RFGLS...
>>
>> http://cran.r-project.org/web/packages/RFGLS/index.html
>>
>> ...and I was getting an error. After a few rounds of testing I realized that the error was caused by a FAMID variable that was of character type.
>
> But the Details section of the help page does say that the accepted FTYPES are all integers between 1 and 6 and the INDIV variables are integers in range 1:4.

But FAMID and FTYPE are different variables, both required.

>> The problem seemed to be that gls.batch() expected FAMID to be integers, but the default ought to be character type because family and individual IDs in nearly all genetic-analysis software are character strings (they might even be people's names).
>
> You are making up rules that were not in accord with the documentation.

I think you are confusing FTYPE with FAMID.

>> This was the error:
>>
>> Error in sum(blocksize) : invalid 'type' (character) of argument
>> Calls: gls.batch -> bdsmatrix
>>
>> To figure out more about it, I spent a bunch of time to go from CMD BATCH mode to an interactive session so that I could look at traceback().
>
> Generally the first thing to check is the help page. And if there is a worked example to look at its data:
>
> data(pedigree, package="RFGLS")
> str(pedigree)
> 'data.frame': 4050 obs. of 5 variables:
>  $ FAMID: int 10 10 10 10 20 20 20 20 30 30 ...
>  $ ID   : int 11 12 13 14 21 22 23 24 31 32 ...
>  $ PID  : int 14 14 0 0 24 24 0 0 34 34 ...
>  $ MID  : int 13 13 0 0 23 23 0 0 33 33 ...
>  $ SEX  : num 1 1 2 1 2 2 2 1 2 2 ...

Thanks for your efforts, but you are mistaken. Before I wrote anything here I had already worked through this with Rob Kirkpatrick, we had run the data() examples, confirmed the error there, and more. I was a coauthor of the Human Heredity paper that introduced this software and it was based on other work I had done. I'm pretty sure I'm the #1 user of this package.

FTYPE != FAMID

Everything I said was correct.

Mike
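The lesson about checking inputs can be illustrated with a small sketch. The function family_sizes() below is hypothetical (it is not part of RFGLS or bdsmatrix): it coerces its ID argument to character up front and checks the result before returning, so that a type problem would surface where it arises rather than far downstream.

```r
# Hypothetical helper: for each individual, the number of members in
# that individual's family. Accepts character, numeric or factor IDs.
family_sizes <- function(famid) {
  if (is.factor(famid)) famid <- as.character(famid)  # use labels, not codes
  if (!is.atomic(famid) || !is.null(dim(famid))) {
    stop("famid must be a plain atomic vector or a factor")
  }
  if (anyNA(famid)) stop("famid must not contain missing values")

  ids <- as.character(famid)
  idx <- match(ids, unique(ids))   # integer group index per individual
  sizes <- tabulate(idx)[idx]      # tabulate() returns integer counts

  # Check the output type before handing it to other code.
  stopifnot(is.integer(sizes), length(sizes) == length(ids))
  sizes
}

family_sizes(c("smith", "smith", "jones"))  # character IDs
family_sizes(c(10, 10, 20))                 # numeric IDs give the same sizes
```

Coercing to character once at the top is a design choice made for this sketch; testing the type and throwing an error instead, as the message also suggests, would be equally reasonable.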
Re: [R] debugging R code and dealing with dependencies
> On Dec 25, 2014, at 1:04 AM, Mike Miller wrote:
>
> I just wanted to put this out there. It's just some of my observations about things that happen with R, or happened in this particular investigation. There were definitely some lessons for me in this, and maybe that will be true of someone else. The main thing I picked up is that it is good to put plenty of checks into our code -- if we expect input of a certain type or class, then I should either coerce input into that structure or test the input and throw an error. If the function works very differently for different kinds of input, this should be documented. The more people are doing this, the better things will go for everyone.
>
> I was working with a CRAN package called RFGLS...
>
> http://cran.r-project.org/web/packages/RFGLS/index.html
>
> ...and I was getting an error. After a few rounds of testing I realized that the error was caused by a FAMID variable that was of character type.

But the Details section of the help page does say that the accepted FTYPES are all integers between 1 and 6 and the INDIV variables are integers in range 1:4.

> The problem seemed to be that gls.batch() expected FAMID to be integers, but the default ought to be character type because family and individual IDs in nearly all genetic-analysis software are character strings (they might even be people's names).

You are making up rules that were not in accord with the documentation.

> This was the error:
>
> Error in sum(blocksize) : invalid 'type' (character) of argument
> Calls: gls.batch -> bdsmatrix
>
> To figure out more about it, I spent a bunch of time to go from CMD BATCH mode to an interactive session so that I could look at traceback().

Generally the first thing to check is the help page. And if there is a worked example to look at its data:

> data(pedigree, package="RFGLS")
> str(pedigree)
'data.frame': 4050 obs. of 5 variables:
 $ FAMID: int 10 10 10 10 20 20 20 20 30 30 ...
 $ ID   : int 11 12 13 14 21 22 23 24 31 32 ...
 $ PID  : int 14 14 0 0 24 24 0 0 34 34 ...
 $ MID  : int 13 13 0 0 23 23 0 0 33 33 ...
 $ SEX  : num 1 1 2 1 2 2 2 1 2 2 ...

-- David.

> That got me this additional info:
>
> > traceback()
> 2: bdsmatrix(sizelist, lme.out$sigma@blocks, dimnames = list(id, id))
>
> bdsmatrix() is from a package on which RFGLS depends:
>
> http://cran.r-project.org/web/packages/bdsmatrix/index.html
>
> The problem is that RFGLS's gls.batch() function is sending something to bdsmatrix's bdsmatrix() that it can't handle. So I look at the code for bdsmatrix() and I see this:
>
> if (any(blocksize <= 0))
>     stop("Block sizes must be >0")
> if (any(as.integer(blocksize) != blocksize))
>     stop("Block sizes must be integers")
> n1 <- as.integer(sum(blocksize))
>
> The condition any(as.integer(blocksize) != blocksize) fails (is TRUE) only if blocksize contains one or more noninteger numeric values. It doesn't fail if blocksize is character or logical if the character strings are integers. Example:
>
> > 4=="4"
> [1] TRUE
>
> That's an interesting feature of R, but I guess that's how it works. Also this:
>
> > 1=="1"
> [1] TRUE
> > 1==TRUE
> [1] TRUE
> > "1"==TRUE
> [1] FALSE
>
> bdsmatrix() has no test that blocksize is numeric, so it fails when sum(blocksize) cannot sum character strings.
>
> Next I had to figure out where RFGLS's gls.batch() is going wrong in producing sizelist. It is created in a number of steps, but I identified this line as especially suspicious:
>
> test.dat$famsize[test.dat$FTYPE!=6]=ave(test.dat$FAMID[test.dat$FTYPE!=6],test.dat$FAMID[test.dat$FTYPE!=6],FUN=length)
>
> famsize was later converted to sizelist, and this line also includes FAMID, so this is likely where the problem originates. Of course this is the big problem with debugging -- it's hard to find the source of an error that occurs far downstream in another function from a different package. I see that ave() is used, so I have to understand ave().
>
> William Dunlap provided some guidance:
>
> "ave() uses its first argument, 'x', to set the length of its output and to make an initial guess at the type of its output. The return value of FUN can alter the type, but only in an 'upward' direction where logical < integer < numeric < complex < character < list. (This is the same rule that x[i]<-newvalue uses.)"
>
> In other words, if x is of character type, the output cannot be of integer or numeric type even if the output of FUN is always of integer or numeric type. Looking at the ave() code, I can understand that choice:
>
> function (x, ..., FUN = mean)
> {
>     if (missing(...))
>         x[] <- FUN(x)
>     else {
>         g <- interaction(...)
>         split(x, g) <- lapply(split(x, g), FUN)
>     }
>     x
> }
>
> If the factor is missing an element, then the corresponding element of X is not changed in the output:
>
> > fact <- gl(2,2)
> > fact[3] <- NA
> > fact
> [1] 1    1    <NA> 2
> Levels: 1 2
Re: [R] ave(x, y, FUN=length) produces character output when x is character
You persist in failing to read the docs!

Moreover, neither Hadley Wickham, nor anyone else, is the authoritative source for R usage (other than for the (many!) packages he, himself has authored). R's Help pages and manuals -- and ultimately the source code -- are the only such source.

?factor says in its very first line:

"The function factor is used to encode a **vector** as a factor (the terms ‘category’ and ‘enumerated type’ are also used for factors)"

(emphasis added)

and:

> f <- factor (letters[1:3])
> f
[1] a b c
Levels: a b c
> attributes(f)
$levels
[1] "a" "b" "c"

$class
[1] "factor"

> is.vector(f)
[1] FALSE
> attributes(f) <- NULL
> f
[1] 1 2 3
> is.vector(f)
[1] TRUE

Don't you think it's time to call a halt to this?

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
Clifford Stoll

On Thu, Dec 25, 2014 at 2:45 PM, Mike Miller wrote:
> On Thu, 25 Dec 2014, Mike Miller wrote:
>
>> I was going to ask a question about how to test that an object is a vector, but then I found this:
>>
>> "is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names. Use is.atomic(x) || is.list(x) to test if an object is actually a vector."
>>
>> From here:
>>
>> http://adv-r.had.co.nz/Data-structures.html#vectors
>
> But here...
>
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/vector.html
>
> ...I read, "Note that factors are *not* vectors" (emphasis theirs), yet...
>
> > d <- gl(2,2)
> > is.factor(d)
> [1] TRUE
> > is.atomic(d) || is.list(d)
> [1] TRUE
> > is.list(d)
> [1] FALSE
> > is.atomic(d)
> [1] TRUE
> > is.vector(d)
> [1] FALSE
>
> So the factor is not a vector according to R documentation, but it is a vector according to the Wickham test, and it is not a vector according to is.vector(). Admittedly, the latter seems not to mean much to the R experts. Maybe a factor is just a vector with additional attributes.
>
> Mike
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, 25 Dec 2014, Mike Miller wrote:

> I was going to ask a question about how to test that an object is a vector, but then I found this:
>
> "is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names. Use is.atomic(x) || is.list(x) to test if an object is actually a vector."
>
> From here:
>
> http://adv-r.had.co.nz/Data-structures.html#vectors

But here...

https://stat.ethz.ch/R-manual/R-devel/library/base/html/vector.html

...I read, "Note that factors are *not* vectors" (emphasis theirs), yet...

d <- gl(2,2)
is.factor(d)
[1] TRUE
is.atomic(d) || is.list(d)
[1] TRUE
is.list(d)
[1] FALSE
is.atomic(d)
[1] TRUE
is.vector(d)
[1] FALSE

So the factor is not a vector according to R documentation, but it is a vector according to the Wickham test, and it is not a vector according to is.vector(). Admittedly, the latter seems not to mean much to the R experts. Maybe a factor is just a vector with additional attributes.

Mike
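The closing guess can be checked in a few lines. This is only an illustration with made-up variable names; it shows that the factor from the example stores integer codes and carries its labels in attributes.

```r
d <- gl(2, 2)

typeof(d)             # storage type of the underlying data
names(attributes(d))  # the attributes that make it a factor
unclass(d)            # the integer codes, with the levels still attached

# Rebuild an equivalent factor by hand from codes plus attributes.
rebuilt <- structure(c(1L, 1L, 2L, 2L), levels = c("1", "2"), class = "factor")
is.factor(rebuilt)
all(as.character(rebuilt) == as.character(d))
```

Both of the last two calls return TRUE, which fits the reading that a factor is an integer vector with levels and class attributes added.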
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, Dec 25, 2014 at 1:57 PM, Mike Miller wrote:
> I do think I get what is going on with this, but why should I buy into this conceptualization? Why is it better to say that a matrix *is* a vector than to say that a matrix *contains* a vector? The latter seems to be the more common way of thinking about such things. Even in R you've had to construct two different definitions of "vector" to deal with the inconsistency created by the "matrix is a vector" way of thinking. So there must be something really good about it that I am not understanding (and I'm not being facetious or ironic!)

I think it's the idea that in R all data objects are vectors (for some notion of vector) in the sense that all Lisp objects are lists, all APL objects are arrays and all tcl objects are character strings.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On 25/12/2014 3:59 PM, Bert Gunter wrote:
> On Thu, Dec 25, 2014 at 12:41 PM, Duncan Murdoch wrote:
>> Would you say a cube contains a polygon, or a cube is a polygon?
>
> Neither, actually. I'd say a cube is a polyhedron or a square is a polygon.
>
> :-)
>
> But point taken, of course.

I have trouble remembering how many dimensions I live in.

Duncan
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, 25 Dec 2014, Duncan Murdoch wrote:

> On 25/12/2014 1:57 PM, Mike Miller wrote:
>
>> I do think I get what is going on with this, but why should I buy into this conceptualization? Why is it better to say that a matrix *is* a vector than to say that a matrix *contains* a vector? The latter seems to be the more common way of thinking about such things.
>
> "More common"? The better way to think of this is as a class hierarchy. A matrix is a particular kind of vector (the kind that has a dimension attribute). A matrix has all the properties that a vector has, plus some more.
>
> Would you say a cube contains a polygon, or a cube is a polygon?

I would say that the sides of the cube are polygons, so I guess a cube "contains" six polygons, but "is" would be wrong because a cube is a polyhedron, not a polygon. A cube is not a polygon.

I get your point about hierarchy. Someone else showed me that setting the dim attribute to NULL changes the matrix into a vector so that is.vector() is TRUE.

Mike
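The point about the dim attribute can be tried directly. A short illustration in base R (the variable name m is arbitrary):

```r
m <- matrix(1:6, nrow = 2)

is.matrix(m)    # TRUE: it has a dim attribute of length 2
is.vector(m)    # FALSE: it has an attribute other than names
length(m)       # vector operations still see the six elements
m[5]            # linear indexing into the underlying data (column-major)

dim(m) <- NULL  # drop the dim attribute
is.matrix(m)    # FALSE
is.vector(m)    # TRUE
```

The data never change; only the attribute does, which is the sense in which a matrix is a vector with something extra.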
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, Dec 25, 2014 at 12:41 PM, Duncan Murdoch wrote:
> Would you say a cube contains a polygon, or a cube is a polygon?

Neither, actually. I'd say a cube is a polyhedron or a square is a polygon.

:-)

But point taken, of course.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
Clifford Stoll
Re: [R] debugging R code and dealing with dependencies
Thanks, but I was already in touch with Rob Kirkpatrick about it. We all work together at U Minnesota, or did until Rob went to VCU.

Mike

On Thu, 25 Dec 2014, Uwe Ligges wrote:

> This is a rather detailed analysis, thanks, but I think it should be sent to the maintainer of the "RFGLS" package (CCing).
>
> Best,
> Uwe Ligges
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On 25/12/2014 1:57 PM, Mike Miller wrote:
> On Thu, 25 Dec 2014, peter dalgaard wrote:
>
>>> On 25 Dec 2014, at 08:15 , Mike Miller wrote:
>>>
>>> "is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise."
>>>
>>> So that means that a vector in R has no attributes other than names.
>>
>> Wrong. Read carefully. There are
>>
>> - vectors
>> - vectors having no attributes other than names
>
> You are right. I was being difficult about the meaning of "is.vector()".
>
> But would you also say that a matrix is a vector?
>
> I was going to ask a question about how to test that an object is a vector, but then I found this:
>
> "is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names. Use is.atomic(x) || is.list(x) to test if an object is actually a vector."
>
> From here:
>
> http://adv-r.had.co.nz/Data-structures.html#vectors
>
> > a <- c(1,2,3,4)
> > names(a) <- LETTERS[1:4]
> > attr(a, "vecinfo") <- "yes, I'm a vector"
> > a
> A B C D
> 1 2 3 4
> attr(,"vecinfo")
> [1] "yes, I'm a vector"
> > attributes(a)
> $names
> [1] "A" "B" "C" "D"
>
> $vecinfo
> [1] "yes, I'm a vector"
>
> > is.vector(a)
> [1] FALSE
> > is.atomic(a) || is.list(a)
> [1] TRUE
>
> But then we also see this:
>
> > b <- matrix(1:4, 2,2)
> > is.atomic(b) || is.list(b)
> [1] TRUE
>
> "It is common to call the atomic types ‘atomic vectors’, but note that is.vector imposes further restrictions: an object can be atomic but not a vector (in that sense)."
>
> https://stat.ethz.ch/R-manual/R-devel/library/base/html/is.recursive.html
>
> I think a matrix is always atomic. So a matrix is "not a vector (in that sense)," but "is.matrix returns TRUE if x is a vector and has a 'dim' attribute of length 2."
>
> I do think I get what is going on with this, but why should I buy into this conceptualization? Why is it better to say that a matrix *is* a vector than to say that a matrix *contains* a vector? The latter seems to be the more common way of thinking about such things.

"More common"? The better way to think of this is as a class hierarchy. A matrix is a particular kind of vector (the kind that has a dimension attribute). A matrix has all the properties that a vector has, plus some more.

Would you say a cube contains a polygon, or a cube is a polygon?

Duncan Murdoch

> Even in R you've had to construct two different definitions of "vector" to deal with the inconsistency created by the "matrix is a vector" way of thinking. So there must be something really good about it that I am not understanding (and I'm not being facetious or ironic!)
>
> Mike
Re: [R] debugging R code and dealing with dependencies
This is a rather detailed analysis, thanks, but I think it should be sent to the maintainer of the "RFGLS" package (CCing).

Best,
Uwe Ligges

On 25.12.2014 10:04, Mike Miller wrote:

I just wanted to put this out there. It's just some of my observations about things that happen with R, or happened in this particular investigation. There were definitely some lessons for me in this, and maybe that will be true of someone else. The main thing I picked up is that it is good to put plenty of checks into our code -- if we expect input of a certain type or class, then I should either coerce input into that structure or test the input and throw an error. If the function works very differently for different kinds of input, this should be documented. The more people are doing this, the better things will go for everyone.

I was working with a CRAN package called RFGLS...

http://cran.r-project.org/web/packages/RFGLS/index.html

...and I was getting an error. After a few rounds of testing I realized that the error was caused by a FAMID variable that was of character type. The problem seemed to be that gls.batch() expected FAMID to be integers, but the default ought to be character type because family and individual IDs in nearly all genetic-analysis software are character strings (they might even be people's names).

This was the error:

Error in sum(blocksize) : invalid 'type' (character) of argument
Calls: gls.batch -> bdsmatrix

To figure out more about it, I spent a bunch of time to go from CMD BATCH mode to an interactive session so that I could look at traceback(). That got me this additional info:

traceback()
2: bdsmatrix(sizelist, lme.out$sigma@blocks, dimnames = list(id, id))

bdsmatrix() is from a package on which RFGLS depends:

http://cran.r-project.org/web/packages/bdsmatrix/index.html

The problem is that RFGLS's gls.batch() function is sending something to bdsmatrix's bdsmatrix() that it can't handle.
So I look at the code for bdsmatrix() and I see this:

if (any(blocksize <= 0))
    stop("Block sizes must be >0")
if (any(as.integer(blocksize) != blocksize))
    stop("Block sizes must be integers")
n1 <- as.integer(sum(blocksize))

The condition any(as.integer(blocksize) != blocksize) fails (is TRUE) only if blocksize contains one or more noninteger numeric values. It doesn't fail if blocksize is character or logical if the character strings are integers. Example:

4=="4"
[1] TRUE

That's an interesting feature of R, but I guess that's how it works. Also this:

1=="1"
[1] TRUE
1==TRUE
[1] TRUE
"1"==TRUE
[1] FALSE

bdsmatrix() has no test that blocksize is numeric, so it fails when sum(blocksize) cannot sum character strings.

Next I had to figure out where RFGLS's gls.batch() is going wrong in producing sizelist. It is created in a number of steps, but I identified this line as especially suspicious:

test.dat$famsize[test.dat$FTYPE!=6]=ave(test.dat$FAMID[test.dat$FTYPE!=6],test.dat$FAMID[test.dat$FTYPE!=6],FUN=length)

famsize was later converted to sizelist, and this line also includes FAMID, so this is likely where the problem originates. Of course this is the big problem with debugging -- it's hard to find the source of an error that occurs far downstream in another function from a different package. I see that ave() is used, so I have to understand ave().

William Dunlap provided some guidance:

"ave() uses its first argument, 'x', to set the length of its output and to make an initial guess at the type of its output. The return value of FUN can alter the type, but only in an 'upward' direction where logical < integer < numeric < complex < character < list. (This is the same rule that x[i]<-newvalue uses.)"

In other words, if x is of character type, the output cannot be of integer or numeric type even if the output of FUN is always of integer or numeric type. Looking at the ave() code, I can understand that choice:

function (x, ..., FUN = mean)
{
    if (missing(...))
        x[] <- FUN(x)
    else {
        g <- interaction(...)
        split(x, g) <- lapply(split(x, g), FUN)
    }
    x
}

If the factor is missing an element, then the corresponding element of X is not changed in the output:

fact <- gl(2,2)
fact[3] <- NA
fact
[1] 1    1    <NA> 2
Levels: 1 2
ave(1:4, fact)
[1] 1.5 1.5 3.0 4.0

That's a reasonable plan, but it isn't the documented functioning of ave(). From the document...

https://stat.ethz.ch/R-manual/R-devel/library/stats/html/ave.html

...you get next to nothing about what the function actually does.
It does say that x is "a numeric," but the function does not throw an error when x is not numeric. So if someone writes code expecting numeric x, but a user provides a non-numeric x, there may be trouble. I suspect that the programmer saw that the code worked in her examples and she went on to other things. I can't blame the documentation for that, but it is possible that if it said something about the relation between the type of the input and the type of the output she might have written it differently. In addition, I probably would have caught it sooner and I would have understood the problem. This is how I'll recommend they fix the bug in the code (thanks to those of you who helped with this): temp.vec <- as.character( test.dat$FAMID[ test.dat$FTYPE != 6 ] ) test.dat$famsize[ test.dat$FTYPE != 6 ] <- as.vector( table( temp.vec )[ temp.vec ] ) rm(temp.vec) I think we should force FAMID to be character from the beginning, though. Best, Mike FYI -- RFGLS code that fails in RFGLS version 1.1:
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, 25 Dec 2014, peter dalgaard wrote: On 25 Dec 2014, at 08:15 , Mike Miller wrote: "is.vector returns TRUE if x is a vector of the specified mode having no attributes other than names. It returns FALSE otherwise." So that means that a vector in R has no attributes other than names. Wrong. Read carefully. There are - vectors - vectors having no attributes other than names You are right. I was being difficult about the meaning of "is.vector()". But would you also say that a matrix is a vector? I was going to ask a question about how to test that an object is a vector, but then I found this: "is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names. Use is.atomic(x) || is.list(x) to test if an object is actually a vector." From here: http://adv-r.had.co.nz/Data-structures.html#vectors a <- c(1,2,3,4) names(a) <- LETTERS[1:4] attr(a, "vecinfo") <- "yes, I'm a vector" a A B C D 1 2 3 4 attr(,"vecinfo") [1] "yes, I'm a vector" attributes(a) $names [1] "A" "B" "C" "D" $vecinfo [1] "yes, I'm a vector" is.vector(a) [1] FALSE is.atomic(a) || is.list(a) [1] TRUE But then we also see this: b <- matrix(1:4, 2,2) is.atomic(b) || is.list(b) [1] TRUE "It is common to call the atomic types ‘atomic vectors’, but note that is.vector imposes further restrictions: an object can be atomic but not a vector (in that sense)." https://stat.ethz.ch/R-manual/R-devel/library/base/html/is.recursive.html I think a matrix is always atomic. So a matrix is "not a vector (in that sense)," but "is.matrix returns TRUE if x is a vector and has a 'dim' attribute of length 2." I do think I get what is going on with this, but why should I buy into this conceptualization? Why is it better to say that a matrix *is* a vector than to say that a matrix *contains* a vector? The latter seems to be the more common way of thinking about such things.
Even in R you've had to construct two different definitions of "vector" to deal with the inconsistency created by the "matrix is a vector" way of thinking. So there must be something really good about it that I am not understanding (and I'm not being facetious or ironic!) Mike
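For anyone who wants to try the two tests side by side, here is a small sketch. The helper name is_vec is made up for this example; it just wraps the is.atomic(x) || is.list(x) test quoted from Advanced R, and the three objects are the kinds discussed in this thread.

```r
# Hypothetical wrapper around the test quoted from Advanced R.
is_vec <- function(x) is.atomic(x) || is.list(x)

a <- c(A = 1, B = 2, C = 3, D = 4)
attr(a, "vecinfo") <- "extra attribute"   # named vector with one extra attribute
b <- matrix(1:4, 2, 2)                    # has a "dim" attribute
f <- factor(letters[1:3])                 # has "levels" and "class" attributes

checks <- data.frame(
  object         = c("a", "b", "f"),
  is.vector      = c(is.vector(a), is.vector(b), is.vector(f)),
  atomic_or_list = c(is_vec(a), is_vec(b), is_vec(f))
)
print(checks)

# Removing all attributes leaves something is.vector() accepts.
b_plain <- b
attributes(b_plain) <- NULL
is.vector(b_plain)
```

All three objects fail is.vector() because of their extra attributes, and all three pass the looser test, which is the gap between the two meanings of "vector" that the thread is arguing about.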
Re: [R] tcltk
On 25/12/2014 14:26, Ivan Calandra wrote:
> Dear useRs,
> I have just upgraded to R 3.1.2 for MacOS 10.6.8 (with the binary for Snow Leopard).

Rather belatedly

> Everything is fine except that I get an error when loading TclTk:
> >library(tcltk)
> Error : .onLoad a échoué dans loadNamespace() pour 'tcltk', détails :
> appel : system2("otool", c("-L", shQuote(DLL)), stdout = TRUE)
> erreur : erreur lors de l'exécution d'une commande
> Erreur : le chargement du package ou de l'espace de noms a échoué pour ‘tcltk’
> sh: otool: command not found
> [In English: Error : .onLoad failed in loadNamespace() for 'tcltk', details: call: system2("otool", c("-L", shQuote(DLL)), stdout = TRUE) error: error in running command. Error: package or namespace load failed for ‘tcltk’]
> What could be wrong? If needed I can translate the error. I had no problem with R 3.1.1.

As the posting guide asks, post Mac-specific questions on R-sig-mac. otool should be part of OS X.

> Thanks in advance and Merry Christmas!
> Ivan
> Here is my session info:
> > sessionInfo()
> R version 3.1.2 (2014-10-31)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)

That is the OS this was compiled under. You will need to tell people the OS you are running under, e.g. via (in R) system('uname -a')

> locale: [1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
> attached base packages: [1] stats graphics grDevices utils datasets methods base

-- Brian D. Ripley, rip...@stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK
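A small sketch of how the two points above could be checked from inside R. It uses only base functions (Sys.which() and Sys.info()); whether otool is found depends entirely on the machine, so no particular result is implied.

```r
# Sys.which() returns "" for a command that is not on the PATH R sees.
otool_path  <- Sys.which("otool")
otool_found <- unname(nzchar(otool_path))
otool_found

# Sys.info() describes the system R is running on, as opposed to the
# build platform that sessionInfo() prints on its "Platform:" line.
os_info <- Sys.info()[c("sysname", "release", "machine")]
os_info
```

If otool_found is FALSE, the "sh: otool: command not found" line in the error is consistent with the tool simply not being visible to the shell that R starts.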
[R] tcltk
Dear useRs, I have just upgraded to R 3.1.2 for MacOS 10.6.8 (with the binary for Snow Leopard). Everything is fine except that I get an error when loading TclTk: >library(tcltk) Error : .onLoad a échoué dans loadNamespace() pour 'tcltk', détails : appel : system2("otool", c("-L", shQuote(DLL)), stdout = TRUE) erreur : erreur lors de l'exécution d'une commande Erreur : le chargement du package ou de l'espace de noms a échoué pour ‘tcltk’ sh: otool: command not found [In English: Error : .onLoad failed in loadNamespace() for 'tcltk', details: call: system2("otool", c("-L", shQuote(DLL)), stdout = TRUE) error: error in running command. Error: package or namespace load failed for ‘tcltk’] What could be wrong? If needed I can translate the error. I had no problem with R 3.1.1. Thanks in advance and Merry Christmas! Ivan Here is my session info: > sessionInfo() R version 3.1.2 (2014-10-31) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base -- Ivan Calandra, ATER University of Reims Champagne-Ardenne GEGENA² - EA 3795 CREA - 2 esplanade Roland Garros 51100 Reims, France +33(0)3 26 77 36 89 ivan.calan...@univ-reims.fr https://www.researchgate.net/profile/Ivan_Calandra
Re: [R] ave(x, y, FUN=length) produces character output when x is character
> On 25 Dec 2014, at 08:15 , Mike Miller wrote:
>
>> "is.vector returns TRUE if x is a vector of the specified mode having
>> no attributes other than names. It returns FALSE otherwise."
>
> So that means that a vector in R has no attributes other than names.

Wrong. Read carefully. There are
- vectors
- vectors having no attributes other than names

-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, 25 Dec 2014, Mike Miller wrote: On Thu, 25 Dec 2014, Jeff Newmiller wrote: You have written a lot, Mike, as though we did not know it. You are not the only one with math and multiple computing languages under your belt. I'm not assuming that you and Bert don't know things, but I do expect to have a wider audience -- when I search for things, sometimes I find my postings from 10 years ago. I'm probably not the only one. To clarify one point: I write some things to organize my own ideas. I also think I'm not the only one who will want to see it. I don't believe that I know more about R or programming than you guys do. I consider that an extremely remote possibility, at best. That doesn't mean you are right about everything, or about any particular thing. Mike
Re: [R] ave(x, y, FUN=length) produces character output when x is character
On Thu, 25 Dec 2014, Jeff Newmiller wrote: You have written a lot, Mike, as though we did not know it. You are not the only one with math and multiple computing languages under your belt. I'm not assuming that you and Bert don't know things, but I do expect to have a wider audience -- when I search for things, sometimes I find my postings from 10 years ago. I'm probably not the only one. The point Bert made is that the concept that a matrix IS-A vector is not just an implementation detail in R... it helps the practitioner keep straight why things like a[3] is perfectly valid when a is a matrix, and why a*a [,1] [,2] [1,] 1 9 [2,] 4 16 is true. I understand why you are uncomfortable with it, as I was once, but this is how R works so you are only impeding your own effectiveness by clinging to theory on this point. Sorry, your concepts aren't helping me and I'm not uncomfortable with what R is doing. In fact, what I wrote earlier is that R seems to be much like Octave/MATLAB, which I have used even more than I've used R. The use of what Octave calls "fortran indexing" is an example. We can refer to matrix element a[3] in R or element a(3) in Octave and we're referring to the same element. It's also the third element of vec(a). That doesn't mean that 'a' is a vector. R says that it is not a vector. That doesn't confuse me. I can refer to the third element of a matrix even if the matrix is not a vector. I don't understand how your concept helps in understanding a*a. It's an element-by-element product (also called Hadamard product). In Octave it would be a.*a. The matrix product in R is a%*%a and in Octave it is a*a. I find nothing confusing about any of this and I don't see how conceiving of 'a' as a vector helps at all.
The difference between our concepts is that this makes no sense within your framework where you think of a matrix as being a vector: a <- matrix(1:4, 2,2) is.vector(a) [1] FALSE But to me it makes perfect sense because 'a' is a matrix and a matrix is not a vector. We might say that it contains a vector, or that the elements of the matrix are constructed from the elements of a vector, but not that the matrix is a vector. It is a different class of object: class(a) [1] "matrix" Mike
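The indexing and product behaviour discussed in this message can be reproduced with the same 2x2 matrix. This is only a sketch using base R, with the variable names had and prod_m made up for the example.

```r
a <- matrix(1:4, 2, 2)

# Single-index access runs down the columns, so a[3] is the [1,2] element.
a[3]
a[1, 2]

# Elementwise (Hadamard) product versus matrix product.
had    <- a * a
prod_m <- a %*% a
had
prod_m

# The underlying values in column-major order, with the dim attribute dropped.
as.vector(a)
is.matrix(a)
is.vector(a)
```

Both readings in the thread fit these results: the single index works because the data are stored as one column-major sequence, while is.vector(a) is FALSE because of the "dim" attribute.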
[R] debugging R code and dealing with dependencies
I just wanted to put this out there. It's just some of my observations about things that happen with R, or happened in this particular investigation. There were definitely some lessons for me in this, and maybe that will be true of someone else. The main thing I picked up is that it is good to put plenty of checks into our code -- if we expect input of a certain type or class, then I should either coerce input into that structure or test the input and throw an error. If the function works very differently for different kinds of input, this should be documented. The more people are doing this, the better things will go for everyone. I was working with a CRAN package called RFGLS... http://cran.r-project.org/web/packages/RFGLS/index.html ...and I was getting an error. After a few rounds of testing I realized that the error was caused by a FAMID variable that was of character type. The problem seemed to be that gls.batch() expected FAMID to be integers, but the default ought to be character type because family and individual IDs in nearly all genetic-analysis software are character strings (they might even be people's names). This was the error: Error in sum(blocksize) : invalid 'type' (character) of argument Calls: gls.batch -> bdsmatrix To figure out more about it, I spent a bunch of time to go from CMD BATCH mode to an interactive session so that I could look at traceback(). That got me this additional info: traceback() 2: bdsmatrix(sizelist, lme.out$sigma@blocks, dimnames = list(id, id)) bdsmatrix() is from a package on which RFGLS depends: http://cran.r-project.org/web/packages/bdsmatrix/index.html The problem is that RFGLS's gls.batch() function is sending something to bdsmatrix's bdsmatrix() that it can't handle. 
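As a sketch of the "coerce or test" idea, here are two small helpers. Both names (as_family_id and check_sizes) are hypothetical and are not part of RFGLS or bdsmatrix; they only show the two options in their simplest form.

```r
# Option 1 (coerce): always hand back character IDs, whatever came in.
as_family_id <- function(id) {
  as.character(id)   # works for numeric, factor and character input
}

# Option 2 (test): stop early with a clear message on the wrong type.
check_sizes <- function(size) {
  if (!is.numeric(size)) {
    stop("sizes must be numeric, got ", typeof(size))
  }
  if (anyNA(size) || any(size <= 0) || any(size != round(size))) {
    stop("sizes must be positive whole numbers")
  }
  as.integer(size)
}

ids   <- as_family_id(c(101, 102))
sizes <- check_sizes(c(2, 3, 1))

# Character input is rejected here instead of failing somewhere downstream.
bad_msg <- tryCatch(check_sizes(c("2", "3")),
                    error = function(e) conditionMessage(e))
bad_msg
```

The point of the second helper is the is.numeric() test coming first: comparisons such as `<=` and `!=` quietly coerce character input, so they cannot be relied on to catch a type problem.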
So I look at the code for bdsmatrix() and I see this: if (any(blocksize <= 0)) stop("Block sizes must be >0") if (any(as.integer(blocksize) != blocksize)) stop("Block sizes must be integers") n1 <- as.integer(sum(blocksize)) The condition any(as.integer(blocksize) != blocksize) fails (is TRUE) only if blocksize contains one or more noninteger numeric values. It doesn't fail if blocksize is logical, or if it is character and the strings represent integers. Example: 4=="4" [1] TRUE That's an interesting feature of R, but I guess that's how it works. Also this: 1=="1" [1] TRUE 1==TRUE [1] TRUE "1"==TRUE [1] FALSE bdsmatrix() has no test that blocksize is numeric, so it fails when sum(blocksize) cannot sum character strings. Next I had to figure out where RFGLS's gls.batch() is going wrong in producing sizelist. It is created in a number of steps, but I identified this line as especially suspicious: test.dat$famsize[test.dat$FTYPE!=6]=ave(test.dat$FAMID[test.dat$FTYPE!=6],test.dat$FAMID[test.dat$FTYPE!=6],FUN=length) famsize was later converted to sizelist, and this line also includes FAMID, so this is likely where the problem originates. Of course this is the big problem with debugging -- it's hard to find the source of an error that occurs far downstream in another function from a different package. I see that ave() is used, so I have to understand ave(). William Dunlap provided some guidance: "ave() uses its first argument, 'x', to set the length of its output and to make an initial guess at the type of its output. The return value of FUN can alter the type, but only in an 'upward' direction where logical < integer < numeric < complex < character < list. (This is the same rule that x[i]<-newvalue uses.)" In other words, if x is of character type, the output cannot be of integer or numeric type even if the output of FUN is always of integer or numeric type. Looking at the ave() code, I can understand that choice: function (x, ..., FUN = mean) { if (missing(...)) x[] <- FUN(x) else { g <- interaction(...)
split(x, g) <- lapply(split(x, g), FUN) } x } If the factor is missing an element, then the corresponding element of X is not changed in the output: fact <- gl(2,2) fact[3] <- NA fact [1] 1 1 <NA> 2 Levels: 1 2 ave(1:4, fact) [1] 1.5 1.5 3.0 4.0 That's a reasonable plan, but it isn't the documented functioning of ave(). From the document... https://stat.ethz.ch/R-manual/R-devel/library/stats/html/ave.html ...you get next to nothing about what the function actually does. It does say that x is "a numeric," but the function does not throw an error when x is not numeric. So if someone writes code expecting numeric x, but a user provides a non-numeric x, there may be trouble. I suspect that the programmer saw that the code worked in her examples and she went on to other things. I can't blame the documentation for that, but it is possible that if it said something about the relation between the type of the input and the type of the output she might have written it differently. In addition, I probably would have caught it sooner and I would have understood the problem. This is how I'll
Re: [R] ave(x, y, FUN=length) produces character output when x is character
You have written a lot, Mike, as though we did not know it. You are not the only one with math and multiple computing languages under your belt. The point Bert made is that the concept that a matrix IS-A vector is not just an implementation detail in R... it helps the practitioner keep straight why things like a[3] is perfectly valid when a is a matrix, and why a*a [,1] [,2] [1,] 1 9 [2,] 4 16 is true. I understand why you are uncomfortable with it, as I was once, but this is how R works so you are only impeding your own effectiveness by clinging to theory on this point. --- Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers) --- Sent from my phone. Please excuse my brevity. On December 24, 2014 11:15:28 PM PST, Mike Miller wrote: >On Wed, 24 Dec 2014, Bert Gunter wrote: > >> You are again misinterpreting because you have not read the docs, >> although this time I will grant that they are to some extent >misleading. >> >> First of all, a matrix _IS_ a vector: >> >>> a <- matrix(1:4, 2,2) >>> a[3] ## vector indexing works because it is a vector >> [1] 3 >> >> In fact, a matrix (or array) is a vector with a "dim" attribute. This >> is documented in ?matrix: >> >> "is.matrix returns TRUE if x is a vector and has a "dim" attribute of >> length 2) and FALSE otherwise." > >But a vector has no such attribute, so a matrix is not a vector which >is >why you see this: > >> a <- matrix(1:4, 2,2) >> is.vector(a) >[1] FALSE > >Of course the matrix can be coerced back into a vector just as the >vector >was coerced into a matrix: > >> b <- as.vector(a) >> is.vector(b) >[1] TRUE > > >> Your confusion arises because, despite its name, is.vector() does not > >> actually test whether something "is" a vector (after all these are >all >> abstractions; what it "is" is contents of memory, implemented as a >> linked list or some such).
?is.vector tells you: >> >> "is.vector returns TRUE if x is a vector of the specified mode having >> no attributes other than names. It returns FALSE otherwise." > >So that means that a vector in R has no attributes other than names. > > >> An array has a "dim" attribute, so is.vector() returns FALSE on it. >But >> it actually _is_ ("behaves like") a vector (in column major >> order,actually). > >An array is a vector with additional attributes which cause it to be an > >array rather than a vector. This is why R says FALSE when we query it >about an array using is.vector(). > > >> Now you may complain that this is confusing and I would agree. Why is >it >> this way? I dunno -- probably due to historical quirks -- evolution >is >> not necessarily orderly. But that's the way it is; that's the way >it's >> documented; and tutorials will tell you about this (that's how I >> learned). So please stop guessing and intuiting and read the docs to >> understand how things work. > >I don't think it is confusing. This is the kind of behavior I'm used >to >from other programs like Octave/MATLAB. A vector is just an ordered >list >of numbers. Those numbers can be put into matrices or >higher-dimensional >arrays, but they then become something more than just a vector. A >vector >like 1:4 becomes a 2x2 matrix when we do matrix(1:4, 2,2) such that the > >number 3 which was just the third element before (and still is) is now >also the [1,2] element of a matrix. It didn't have that before, back >when >it was a vector, but now that it has become something more than a >vector, >it has that new property. We can take that away using as.vector(). 
> >In many situations the behavior of the R vector and the same values in >a >matrix format will be very different: > >> a <- 1:4 >> b <- matrix(a, 2,2) >> a %*% a > [,1] >[1,] 30 >> b %*% b > [,1] [,2] >[1,] 7 15 >[2,] 10 22 >> b %*% t(b) > [,1] [,2] >[1,] 10 14 >[2,] 14 20 >> a %*% t(a) > [,1] [,2] [,3] [,4] >[1,] 1 2 3 4 >[2,] 2 4 6 8 >[3,] 3 6 9 12 >[4,] 4 8 12 16 > >That is not true in ave(), as I showed earlier, because it uses the >vector >ordering of elements in the x matrix or array (what one would get from >as.vector()) to form the correspondence with the factor. > >I get your idea, but I don't think it is correct to say "a matrix is a >vector." Rather, I would say that there is a standard way in which one > >can create a one-to-one correspondence between the elements of a matrix >of >given dimensions and the elements of a vector. I believe this is >usually >called "fortran indexing," or at least