Re: [R] Extract elements from objects in a list

2011-06-28 Thread Bill.Venables
x <- lapply(1:100, function(x) summary(runif(100))) head(x, 4) [[1]] Min. 1st Qu. Median Mean 3rd Qu. Max. 0.02922 0.38330 0.58120 0.58230 0.83430 0.99870 [[2]] Min. 1st Qu. Median Mean 3rd Qu. Max. 0.004903 0.281400 0.478900 0.497100 0.729900 0.990700 [[3]]

Re: [R] New to R, trying to use agnes, but can't load my ditance matrix

2011-06-27 Thread Bill.Venables
The first problem is that you are using a character string as the first argument to agnes(). The help information for agnes says that its first argument, x, is x: data matrix or data frame, or dissimilarity matrix, depending on the value of the 'diss' argument. Not a character

Re: [R] Accessing variables in a data frame

2011-06-26 Thread Bill.Venables
Just to start things off: var.name <- c("gdp","inf","unp") var.id <- c("w","i") x <- paste(var.name, rep(var.id, each=length(var.name)), sep="_") x [1] "gdp_w" "inf_w" "unp_w" "gdp_i" "inf_i" "unp_i" Now the three differences: gdp_w - gdp_i inf_w - inf_i unp_w - unp_i Can be got using dwi <- dat[, x[1:3]] -

Re: [R] extract worksheet names from an Excel file

2011-06-23 Thread Bill.Venables
Package XLConnect appears to provide this kind of thing. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Shi, Tao Sent: Friday, 24 June 2011 2:42 PM To: r-help@r-project.org Subject: [R] extract worksheet names from an Excel file

Re: [R] omitting columns from a data frame

2011-06-21 Thread Bill.Venables
Suppose names(xm1) <- c("alpha", "beta", "gamma", "delta") then xm2 <- subset(xm1, select = alpha:gamma) or xm2 <- subset(xm1, select = -delta) will do the same job as xm1[, -4] -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erin

Re: [R] and now a cut question

2011-06-21 Thread Bill.Venables
con <- textConnection("13053 13068 13068 13053 14853 14853 14850 14850 13053 13053 13068 13068") x <- scan(con) Read 12 items cut(x, 4) [1] (1.31e+04,1.35e+04] (1.31e+04,1.35e+04] (1.31e+04,1.35e+04] [4] (1.31e+04,1.35e+04] (1.44e+04,1.49e+04] (1.44e+04,1.49e+04] [7] (1.44e+04,1.49e+04]

Re: [R] converting character to numeric

2011-06-21 Thread Bill.Venables
..or something like that. Without more details it is hard to know just what is going on. Firstly in R the object is a 'data frame' (or object of class data.frame to be formal). There is no standard object in R called a 'database'. If you read in your data using read.csv, then mydata is

Re: [R] converting character to numeric

2011-06-21 Thread Bill.Venables
The point I would make is that for safety it's much better to use FALSE rather than F. FALSE is a reserved word in R, F is a pre-set variable, but can easily be changed at any time by the user. Secondly, doesn't this do the same as yours: readFF.csv <- function(..., stringsAsFactors = FALSE)
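
The difference between FALSE and F can be checked directly. The following is a minimal sketch; the variable names flagF, flagFALSE and reassignFALSE are made up for the illustration.

```r
# FALSE is a reserved word; F is an ordinary variable that merely starts out as FALSE.
F <- TRUE                 # legal, and silently changes what F means
flagF <- F                # picks up the redefined value
flagFALSE <- FALSE        # cannot be affected by the line above

# Trying to assign to the reserved word fails when the assignment is evaluated.
reassignFALSE <- tryCatch(
  eval(parse(text = "FALSE <- TRUE")),
  error = function(e) "error"
)

rm(F)                     # drop the masking copy so that F is base::F again
```

After rm(F) the original meaning is restored, but code that ran in between would have seen F as TRUE.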

Re: [R] Unreasonable syntax error

2011-06-20 Thread Bill.Venables
The advice is always NOT to use Microsoft Word to edit an R file. That stuff is poisonous. Microsoft Word, typical of all Microsoft software, does not do what you tell it to do but helpfully does what it thinks you meant to ask it to do but were too dumb to do so. Even notepad, gawdelpus,

Re: [R] setting breaks in hist

2011-06-20 Thread Bill.Venables
The way to guarantee a specific number of panels in the histogram, say n, is to specify n+1 breaks which cover the range of the data. As far as I know this is the only way. Bill Venables. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
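
A minimal sketch of the n+1 breaks idea, using simulated data; the names x, n, brks and h are chosen here for illustration only.

```r
set.seed(1)                          # simulated data, for illustration only
x <- rnorm(200)
n <- 7                               # number of panels wanted

# n + 1 equally spaced breaks that cover the range of the data
brks <- seq(min(x), max(x), length.out = n + 1)

h <- hist(x, breaks = brks, plot = FALSE)   # plot = FALSE returns the counts only
length(h$counts)                     # one count per panel
```

Passing a single number as breaks is only treated as a suggestion by hist(), which is why an explicit vector is used.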

Re: [R] please help! what are the different using log-link function and log transformation?

2011-06-19 Thread Bill.Venables
The two commands you give below are certain to lead to very different results, because they are fitting very different models. The first is a gaussian model for the response with a log link, and constant variance. The second is a gaussian model for a log-transformed response and identity
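
The poster's two commands are not shown in the archive, so the following is only a sketch of the two kinds of model being contrasted, fitted to simulated data. The variable names and the simulation settings are assumptions.

```r
set.seed(42)                                   # simulated data, purely illustrative
x <- seq(1, 10, length.out = 50)
y <- exp(0.5 + 0.2 * x) * exp(rnorm(50, sd = 0.1))   # strictly positive response

# Gaussian response with a log link:
# E[y] = exp(a + b * x), constant variance on the original scale
fit_link <- glm(y ~ x, family = gaussian(link = "log"))

# Gaussian model for the log-transformed response with identity link:
# E[log(y)] = a + b * x, constant variance on the log scale
fit_trans <- lm(log(y) ~ x)

rbind(log_link = coef(fit_link), log_transform = coef(fit_trans))
```

The two sets of coefficients are on a comparable scale, but they come from different models and will not agree exactly.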

Re: [R] For loop by factor.

2011-06-19 Thread Bill.Venables
If I understand you correctly, you are trying to find the cumulative maximum from the end within each level of the factor. If this is what you are trying to do, then here is one way you might like to do it. First, define the function: cumMax <- function(x) Reduce(max, x, right = TRUE,
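
The function definition is cut off in the archive. The sketch below assumes the missing argument is accumulate = TRUE, and shows an equivalent built on cummax() for comparison; the data and the grouping factor g are made up.

```r
# Cumulative maximum taken from the end of the vector towards the start.
# accumulate = TRUE is an assumption about the truncated call in the post.
cumMaxReduce <- function(x) Reduce(max, x, right = TRUE, accumulate = TRUE)
cumMaxRev    <- function(x) rev(cummax(rev(x)))   # same idea via base cummax()

x <- c(3, 1, 4, 1, 5, 9, 2, 6)
cumMaxReduce(x)
cumMaxRev(x)

# Applied within each level of a hypothetical grouping factor g
g <- factor(c("a", "a", "a", "a", "b", "b", "b", "b"))
byGroup <- ave(x, g, FUN = cumMaxRev)
byGroup
```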

Re: [R] can this sequence be generated easier?

2011-06-17 Thread Bill.Venables
Here is an idea that might be useful to adapt: fixedSumCombinations <- function(N, terms) if(terms == 1) return(N) else if(terms == 2) return(cbind(0:N, N:0)) else { X <- NULL for(i in 0:N) X <- rbind(X, cbind(i, Recall(N-i, terms-1))) X }

Re: [R] functions for polynomial and rational interplation?

2011-06-13 Thread Bill.Venables
Neville's algorithm is not an improvement on Lagrange interpolation, it is simply one way of calculating it that has some useful properties. The result is still the Lagrange interpolating polynomial, though, with all its flaws. Implementing Neville's algorithm is fairly easy using the PolynomF

Re: [R] Score Test Function

2011-06-11 Thread Bill.Venables
The score test looks at the effect of adding extra columns to the model matrix. The function glm.scoretest takes the fitted model object as the first argument and the extra column, or columns, as the second argument. Your x2 argument has length only 3. Is this really what you want? I would

Re: [R] Custom Sort on a Table object

2011-06-06 Thread Bill.Venables
Here is one way. tab f m 0 to 5 11.328000 6.900901 15 to 24 6.100570 5.190058 25 to 34 9.428707 6.567280 35 to 44 10.462158 7.513270 45 to 54 7.621988 5.692905 5 to 14 6.502741 6.119663 55 to 64 5.884737 4.319905 65 to 74 5.075606

Re: [R] Matrix Question

2011-06-02 Thread Bill.Venables
Here is one way you might do it. con <- textConnection(" characteristics_ch1.3 Stage: T1N0 Stage: T2N1 Stage: T0N0 Stage: T1N0 Stage: T0N3 ") txt <- scan(con, what = "") Read 11 items close(con) Ts <- grep("^T", txt, value = TRUE) Ts <- sub("T([[:digit:]]+)N([[:digit:]]+)", "\\1x\\2", Ts) out

Re: [R] count value changes in a column

2011-05-31 Thread Bill.Venables
I thought so too. If so, here is one way you could do it: fixSeq <- function(state) { shift1 <- function(x) c(1, x[-length(x)]) repeat { change <- state %in% c(4,5) & shift1(state) == 3 if(any(change)) state[change] <- 3 else break } state } e.g. state [1] 1 3 3 5 5 3 2 4 2 1

Re: [R] Forcing a negative slope in linear regression?

2011-05-31 Thread Bill.Venables
If you want to go ahead with this in cold blood, you might look at the 'nnls' package. It fits regressions with non-negative coefficients. This might seem like the very opposite of what you want, but it essentially gets you there. You have to be prepared for the coefficient to go to zero

Re: [R] Value of 'pi'

2011-05-30 Thread Bill.Venables
There is an urban legend that says Indiana passed a law implying pi = 3. (Because it says so in the bible...) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Joshua Wiley Sent: Monday, 30 May 2011 4:10 PM To: Vincy Pyne Cc:

Re: [R] Basic question about three factor Anova

2011-05-30 Thread Bill.Venables
This is really a question about the help file for gl. The arguments are gl(n, k, length = n*k, labels = 1:n, ordered = FALSE) 'n' is the number of factor levels. That seems to be easy enough 'k' is called the number of replications. This is perhaps not the best way to express what it is. k
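
A short sketch of how n, k and length interact in gl(). The three-factor layout at the end uses made-up factor names A, B and C.

```r
gl(2, 3)               # n = 2 levels, each in a run of k = 3
gl(2, 1, length = 6)   # runs of length 1, recycled up to length 6
gl(3, 2, labels = c("lo", "mid", "hi"))

# A 2 x 3 x 2 layout with one observation per cell
design <- data.frame(A = gl(2, 6, 12), B = gl(3, 2, 12), C = gl(2, 1, 12))
nrow(unique(design))   # number of distinct cells
```

So k is better read as the length of each run of a level than as a number of replications.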

Re: [R] DateTime Math in R - POSIXct

2011-05-30 Thread Bill.Venables
Perhaps because the timezone is specified as a character string and not a date-time object complete with timezone. From the help file for as.POSIXct.numeric: origin: a date-time object, or something which can be coerced by as.POSIXct(tz = "GMT") to such an object. Note the coercion. Bill

Re: [R] Adding a numeric to all values in the dataframe

2011-05-20 Thread Bill.Venables
Oops. The first line of my template should use data.matrix() rather than data.frame(). data.matrix() is guaranteed to return a numerical matrix from a data frame, making arithmetic always possible. Bill Venables. From: Venables, Bill (CMIS, Dutton Park)
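
The full template is not reproduced in the archive. The following sketch shows the corrected pattern on made-up data; the constant 10 and the contents of myData are assumptions.

```r
myData <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))   # made-up data

myMatrix  <- data.matrix(myData)       # always a numeric matrix
myMatrix2 <- myMatrix + 10             # add a number to every value at once
myData2   <- as.data.frame(myMatrix2)  # back to a data frame only at the end
myData2
```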

Re: [R] Contrasts in Penalized Package

2011-05-20 Thread Bill.Venables
The key line is prep <- .checkinput(match.call(), parent.frame()) Among other things the model matrix is built in .checkinput( ) which is not exported from the package namespace. So you have to get rough with it and use penalized:::.checkinput and then you see these lines of code

Re: [R] [r] regression coefficient for different factors

2011-05-20 Thread Bill.Venables
You have received suggestions about this already, but you may want to consider something like this as an alternative: require(english) lev <- as.character(as.english(0:9)) dat <- data.frame(f = factor(sample(lev, 500, rep=TRUE), levels = lev), B =

Re: [R] *not* using attach() *but* in one case ....

2011-05-19 Thread Bill.Venables
Martin Maechler writes: Well, then you don't know *THE ONE* case where modern users of R should use attach() ... as I have been teaching for a while, but seem not to have got enough students listening ;-) ... --- Use it instead of load() {for save()d R objects} --- The advantage of

Re: [R] extraction of mean square value from ANOVA

2011-05-19 Thread Bill.Venables
That only applies if you have the same factors a and b each time. If this is the case you can do things in a much more slick way. u <- matrix(rnorm(5000), nrow = 10) ## NB, nrow AB <- expand.grid(a = letters[1:2], b = letters[1:5]) M <- lm(u ~ a+b, AB) rmsq <- colSums(resid(M)^2)/M$df.resid and

Re: [R] Adding a numeric to all values in the dataframe

2011-05-19 Thread Bill.Venables
For that kind of operation (unusual as it is) work with numeric matrices. When you are finished, if you still want a data frame, make it then, not before. If your data starts off as a data frame to begin with, turn it into a matrix first. E.g. myMatrix <- data.frame(myData) myMatrix2 <-

Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-18 Thread Bill.Venables
Hi Bert, I think people should know about the Google Style Guide for R because, as I said, it represents a thoughtful contribution to the debate. Most of its advice is very good (meaning I agree with it!) but some is a bit too much (for example, the blanket advice never to use S4 classes and

Re: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

2011-05-18 Thread Bill.Venables
I used to think like that. However I have recently re-read John Chambers' Software for Data Analysis and now I'm starting to see the point. S4 classes and methods do require you to plan your classes and methods well and they do impose a discipline that can seem rigid and unnecessary. But I

Re: [R] Post-hoc tests in MASS using glm.nb

2011-05-17 Thread Bill.Venables
Amen to all of that, Bert. Nicely put. The Google style guide (not perfect, but a thoughtful contribution on these kinds of issues) has avoiding attach() as its very first line (see http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html). I would add, though, that not enough

Re: [R] Post-hoc tests in MASS using glm.nb

2011-05-17 Thread Bill.Venables
PS I should have followed the example with one using with() for something that would often be done with attach(): Consider: with(polyData, { plot(x, y, pch = ".") o <- order(x) lines(x[o], eta[o], col = "red") }) I use this kind of dodge a lot, too, but now you can mostly use data= arguments

Re: [R] simprof test using jaccard distance

2011-05-17 Thread Bill.Venables
The documentation for simprof says, with respect to the method.distance argument, "This value can also be any function which returns a dist object." So you should be able to use the Jaccard index by setting up your own function to compute it. e.g. Jaccard <- function(X) vegan::vegdist(X, method

Re: [R] Multiple plots on one device using stl

2011-05-17 Thread Bill.Venables
If you ?plot.stl you will see that the second argument, set.pars, is a list of argument settings for par(), including a (variable) default setting for mfrow. I.e. plot.stl overrides your external setting (which will also override any layout() setting). It looks like to override it

Re: [R] Extracting the dimnames of an array with variable dimensions

2011-05-16 Thread Bill.Venables
Here is an alternative solution: foo <- array(data = rnorm(32), dim = c(4,4,2), dimnames=list(letters[1:4], LETTERS[1:4], letters[5:6])) ind <- which(foo > 0, arr.ind = TRUE) row.names(ind) <- NULL ## to avoid warnings. mapply("[", dimnames(foo), data.frame(ind)) [,1] [,2] [,3] [1,] a
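
The same idea on a small deterministic array, as a sketch. The selection condition (values greater than 6) and the contents of foo are assumptions made so that the result is easy to check by eye.

```r
foo <- array(1:8, dim = c(2, 2, 2),
             dimnames = list(c("a", "b"), c("A", "B"), c("e", "f")))

ind <- which(foo > 6, arr.ind = TRUE)   # one row of indices per selected cell
row.names(ind) <- NULL                  # drop row names before building the data frame

# For each dimension, index its dimnames by the matching column of ind
labs <- mapply("[", dimnames(foo), data.frame(ind))
labs
```

Because mapply() walks along the list of dimnames, the same call works whatever the number of dimensions.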

Re: [R] Post-hoc tests in MASS using glm.nb

2011-05-16 Thread Bill.Venables
?relevel Also, you might want to fit the models as follows: Model1 <- glm.nb(Cells ~ Cryogel*Day, data = myData) myData2 <- within(myData, Cryogel <- relevel(Cryogel, ref = 2)) Model2 <- update(Model1, data = myData2) You should always specify the data set when you fit a model if at all

Re: [R] Filtering out bad data points

2011-05-09 Thread Bill.Venables
You could use a function to do the job: withinRange <- function(x, r = quantile(x, c(0.05, 0.95))) x >= r[1] & x <= r[2] dtest2 <- subset(dftest, withinRange(x)) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Robert A'gata Sent:
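
A runnable sketch of the filter on made-up data. The comparison operators were lost in the archive copy, so the test used here (keep values between the two quantiles, inclusive) is an assumption based on the function's name.

```r
# Keep values lying between the 5% and 95% sample quantiles (inclusive).
withinRange <- function(x, r = quantile(x, c(0.05, 0.95)))
  x >= r[1] & x <= r[2]

dftest <- data.frame(x = c(1:99, 1000))   # made-up data with one wild value
dtest2 <- subset(dftest, withinRange(x))
range(dtest2$x)
```

The default for r is evaluated lazily inside the function, so it can refer to x.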

Re: [R] Creating Observation ID

2011-05-09 Thread Bill.Venables
Here is one way: df <- data.frame(Value = rnorm(30), Group = sample(c('A','B','C'), 30, replace = TRUE)) ## make a little function to do the job iNumber <- function(f) { f <- as.factor(f) X <- outer(f, levels(f), "==")+0 rowSums(X * apply(X, 2, cumsum)) } ##
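
The function can be tried on a short vector where the answer is obvious. This sketch also shows an alternative based on ave(), which is not from the post and is included only for comparison.

```r
# Within-group sequence number: 1, 2, 3, ... in order of appearance in each group.
iNumber <- function(f) {
  f <- as.factor(f)
  X <- outer(f, levels(f), "==") + 0    # indicator matrix, one column per level
  rowSums(X * apply(X, 2, cumsum))      # running count within the matching column
}

Group <- c("A", "B", "A", "A", "B")     # made-up grouping variable
id1 <- iNumber(Group)
id2 <- ave(seq_along(Group), Group, FUN = seq_along)   # alternative, for comparison
rbind(id1, id2)
```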

Re: [R] is this an ANOVA ?

2011-04-13 Thread Bill.Venables
You probably want to do something like this: fm <- lm(y ~ x, MD) anova(fm) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) x 2 250 125.0 50 1.513e-06 Residuals 12 30 2.5 Answers to questions: 1. No. 2. Yes. (whoever you are).

Re: [R] multinom() residual deviance

2011-04-09 Thread Bill.Venables
The residual deviance from a multinomial model is numerically equal (up to round-off error) to that you would get had you fitted the model as a surrogate Poisson generalized linear model. Here is a short demo building on your example: set.seed(101) df <- data.frame(f = sample(letters[1:3],

Re: [R] multinom() residual deviance

2011-04-08 Thread Bill.Venables
The two models you fit are quite different. The first is a binomial model equivalent to fm <- glm(I(y == "a") ~ x, binomial, df) which you can check leads to the same result. I.e. this model amalgamates classes b and c into one. The second is a multivariate logistic model that considers all

Re: [R] Pulling strings from a Flat file

2011-04-06 Thread Bill.Venables
Isn't all you need read.fwf? From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Kalicin, Sarah [sarah.kali...@intel.com] Sent: 06 April 2011 09:48 To: r-help@r-project.org Subject: [R] Pulling strings from a Flat file Hi, I
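
A minimal sketch of read.fwf() on a made-up fixed-width file. The field widths and the column names code, num and tag are assumptions, since the poster's file layout is not shown.

```r
# Write two fixed-width records to a temporary file (made-up layout: 2 + 3 + 3 characters)
records <- c("AB123xyz", "CD456uvw")
tf <- tempfile(fileext = ".txt")
writeLines(records, tf)

flat <- read.fwf(tf, widths = c(2, 3, 3),
                 col.names = c("code", "num", "tag"),
                 stringsAsFactors = FALSE)
unlink(tf)                               # remove the temporary file
flat
```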

Re: [R] fraction with timelag

2011-03-24 Thread Bill.Venables
How about simply df <- data.frame(id = 1:6, xout = c(12.34, 21.34, 2.34, 4.56, 3.24, 3.45), xin = c(NA, 34,67,87,34, NA)) with(df, c(NA, xin[-1]/xout[-length(xout)])) [1] NA 2.755267 3.139644 37.179487 7.456140 NA BTW You seem to

Re: [R] R as a non-functional language

2011-03-21 Thread Bill.Venables
That's not the point. The point is that R has functions which have side-effects and hence does not meet the strict requirements for a functional language. -Original Message- From: ONKELINX, Thierry [mailto:thierry.onkel...@inbo.be] Sent: Monday, 21 March 2011 7:20 PM To:

Re: [R] Help with POSIXct

2011-03-21 Thread Bill.Venables
You might try dat$F1 <- format(as.Date(dat$F1), format = "%b-%y") although it rather depends on the class of F1 as it has been read. Bill Venables. (It would be courteous of you to give us your name, by the way.) -Original Message- From: r-help-boun...@r-project.org

Re: [R] How to substract a valur from dataframe with condition

2011-03-21 Thread Bill.Venables
dat <- within(dat, { X2 <- ifelse(X2 > 50, 100-X2, X2) X3 <- ifelse(X3 > 50, 100-X3, X3) }) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of joe82 Sent: Tuesday, 22 March 2011 7:40 AM To: r-help@r-project.org Subject:

Re: [R] R as a non-functional language

2011-03-19 Thread Bill.Venables
The idiom I prefer is pH <- structure(c(4.5,7,7.3,8.2,6.3), names = c('area1','area2','mud','dam','middle')) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gabor Grothendieck Sent: Sunday, 20 March 2011 2:33 PM

Re: [R] making dataframes

2011-03-16 Thread Bill.Venables
Firstly, the way you have constructed your data frame in the example will convert everything to factors. What you need to do is actually a bit simpler: ### dum <- data.frame(date, col1, col2) ### One way to turn this into the kind of data frame you want is to convert the main part of

Re: [R] Why doesn't this work ?

2011-03-16 Thread Bill.Venables
It doesn't work (in R) because it is not written in R. It's written in some other language that looks a bit like R. t <- 3 z <- t %in% 1:3 z [1] TRUE t <- 4 z <- t %in% 1:3 z [1] FALSE -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On

Re: [R] Subset using grepl

2011-03-16 Thread Bill.Venables
subset(data, grepl("[1-5]", section) & !grepl("0", section)) BTW grepl("[1:5]", section) does work. It checks for the characters 1, :, or 5. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kang Min Sent: Thursday, 17 March 2011
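
The difference between the two character classes can be seen on a few made-up strings. The vector section and the result names are assumptions for the illustration.

```r
section <- c("3", "7", "10", "4", "5:2", ":")   # made-up values

# "[1-5]" is a range: any one of the digits 1, 2, 3, 4, 5
keep <- grepl("[1-5]", section) & !grepl("0", section)

# "[1:5]" is a set of three literal characters: 1, : and 5
colonClass <- grepl("[1:5]", section)

data.frame(section, keep, colonClass)
```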

Re: [R] data.frame transformation

2011-03-14 Thread Bill.Venables
It is possible to do it with numeric comparisons, as well, but to make life comfortable you need to turn off the warning system temporarily. df <- data.frame(q1 = c(0,0,33.33,"check"), q2 = c(0,33.33,"check",9.156), q3 = c("check","check",25,100), q4 =

Re: [R] Stepwise Discriminant... in R

2011-03-13 Thread Bill.Venables
If you want to do a stepwise selection there is a function in the klaR package to do it. This is not what you are asking for, though. You want a way of finding the successive error rates as additional variables are added in the forward selection process. As far as I can see you have to do

Re: [R] troubles with logistic regression

2011-03-13 Thread Bill.Venables
It means you have selected a response variable from one data frame (unmarried.male) and a predictor from another data frame (fieder.male) and they have different lengths. You might be better off if you used the names in the data frame rather than selecting columns in a form such as

Re: [R] Cleaning date columns

2011-03-09 Thread Bill.Venables
Here is one possible way (I think - untested code): cData <- do.call(rbind, lapply(split(data, data$prochi), function(dat) { dat <- dat[order(dat$date), ] while(any(d <- (diff(dat$date) <= 3))) dat <- dat[-(min(which(d))+1), ]

Re: [R] Extracting only odd columns from a matrix

2011-03-09 Thread Bill.Venables
Xonly <- XY[, grep("^X", dimnames(XY)[[2]])] -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nixon, Matthew Sent: Thursday, 10 March 2011 12:20 AM To: r-help@R-project.org Subject: [R] Extracting only odd columns from a matrix Hi,
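
A sketch on a made-up matrix whose columns alternate X and Y, comparing selection by name with selection by position. The column names and the positional alternative are assumptions; only the grep() line is from the post.

```r
XY <- matrix(1:24, nrow = 3,
             dimnames = list(NULL, c("X1", "Y1", "X2", "Y2", "X3", "Y3", "X4", "Y4")))

Xonly  <- XY[, grep("^X", dimnames(XY)[[2]])]   # by name: columns starting with X
Xonly2 <- XY[, seq(1, ncol(XY), by = 2)]        # by position: odd-numbered columns
Xonly
```

Selecting by name keeps working if the column order changes, which the positional form does not.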

Re: [R] attr question

2011-03-07 Thread Bill.Venables
Erin You could use as.vector(t.test(buzz$var1, conf.level=.98)$conf.int) Bill Venables. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Erin Hodgess Sent: Monday, 7 March 2011 3:12 PM To: R help Subject: [R] attr question Dear

Re: [R] extract rows with unique values from data.frame

2011-03-07 Thread Bill.Venables
Here is possibly one method (if I have understood you correctly): con <- textConnection(" xloc yloc gonad ind Ene W Agent 1 23 20 516.74 1 0.02 20.21 0.25 2 23 20 1143.20 1 0.02 20.21 0.50 3 21 19 250.00 1 0.02 20.21 0.25 4 22 15

Re: [R] creating a count variable in R

2011-03-03 Thread Bill.Venables
You can probably simplify this if you can assume that the dates are in sorted order. Here is a way of doing it even if the days are in arbitrary order. The count refers to the number of times that this date has appeared so far in the sequence. con <- textConnection(" 01/01/2011 01/01/2011

Re: [R] R usage survey

2011-03-03 Thread Bill.Venables
No. That's not answering the question. ALL surveys are for collecting information. The substantive issue is what purpose do you have in seeking this information in the first place and what are you going to do with it when you get it? Do you have some commercial purpose in mind? If so, what

Re: [R] Non-conformable arrays

2011-03-02 Thread Bill.Venables
Here is one way. 1. Make sure y.test is a factor. 2. Use table(y.test, factor(PredictedTestCurrent, levels = levels(y.test))) 3. If PredictedTestCurrent is already a factor with the wrong levels, turn it back into a character string vector first. -Original Message- From:
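
The three steps can be sketched on made-up vectors in which one class is never predicted. The data values are assumptions; the variable names follow the post.

```r
y.test <- factor(c("a", "b", "c", "a", "b"))          # observed classes (made up)
PredictedTestCurrent <- c("a", "a", "a", "b", "b")    # "c" is never predicted

# Forcing the predictions onto the same levels keeps the table square
tab <- table(y.test, factor(PredictedTestCurrent, levels = levels(y.test)))
tab
dim(tab)
```

Without the explicit levels the table would have only two columns and would not conform with a 3 x 3 table.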

Re: [R] What am I doing wrong with this loop ?

2011-03-02 Thread Bill.Venables
Here is a start: x <- as.data.frame(runif(2000, 12, 38)) length(x) [1] 1 names(x) [1] "runif(2000, 12, 38)" Why are you turning x and y into data frames? It also looks as if you should be using if(...) ... else ... rather than ifelse(.,.,), too. You need to sort out a few issues, it seems.

Re: [R] Logistic Stepwise Criterion

2011-03-01 Thread Bill.Venables
The probability OF the residual deviance is zero. The significance level for the residual deviance according to its asymptotic Chi-squared distribution is a possible criterion, but a silly one. If you want to minimise that, just fit no variables at all. That's the best you can do. If you

Re: [R] How to prove the MLE estimators are normal distributed?

2011-03-01 Thread Bill.Venables
This is a purely statistical question and you should try asking it on some statistics list. This is for help with using R, mostly for data analysis and graphics. A glance at the posting guide (see the footnote below) might be a good idea. -Original Message- From:

Re: [R] Transforming list into an array

2011-02-27 Thread Bill.Venables
One way to do it is to use the 'abind' package: NCurvas <- 10 NumSim <- 15 dW <- replicate(NumSim, matrix(rnorm(NCurvas * 3), NCurvas, 3), simplify = FALSE) library(abind) DW <- do.call(abind, c(dW, rev.along = 0)) dim(DW) [1] 10 3 15 -Original Message- From:
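
If the abind package is not available, a base R route gives an array of the same shape. This is an alternative sketch, not the method in the post; it relies on simplify2array() from base R.

```r
NCurvas <- 10
NumSim  <- 15
dW <- replicate(NumSim, matrix(rnorm(NCurvas * 3), NCurvas, 3), simplify = FALSE)

# Stack the list of equally sized matrices along a new third dimension
DW <- simplify2array(dW)
dim(DW)
```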

Re: [R] Combinations

2011-02-27 Thread Bill.Venables
You can compute the logarithm of it easily enough lchoose(54323456, 2345) [1] 25908.4 Now, what did you want to do with it? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Silverton Sent: Monday, 28 February 2011 10:38 AM
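
A small sketch of working on the log scale. The conversion to decimal digits is an illustration of one thing that could be done with the logarithm, not something the post proposes.

```r
logC <- lchoose(54323456, 2345)     # natural log of the binomial coefficient
logC

digits10 <- logC / log(10)          # the same quantity on the log10 scale
digits10

# For small arguments the log form agrees with choose()
small <- exp(lchoose(10, 3))
small
```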

Re: [R] Weighted Mean By Factor Using BY

2011-02-23 Thread Bill.Venables
Here is the party line, perhaps by(data, data$TYPE, function(dat) with(dat, weighted.mean(MEASURE, COUNT))) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mike Schumacher Sent: Thursday, 24 February 2011 9:40 AM To:
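
The call can be tried on a small made-up data frame. The column names follow the post; the values and the split()/sapply() comparison are assumptions added for illustration.

```r
data <- data.frame(TYPE    = c("A", "A", "B", "B"),   # made-up data
                   MEASURE = c(1, 3, 10, 20),
                   COUNT   = c(1, 3, 1, 1))

# Weighted mean of MEASURE, weighted by COUNT, within each TYPE
res <- by(data, data$TYPE, function(dat) with(dat, weighted.mean(MEASURE, COUNT)))
res

# The same numbers as a plain named vector, via split() and sapply()
res2 <- sapply(split(data, data$TYPE),
               function(dat) with(dat, weighted.mean(MEASURE, COUNT)))
res2
```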

Re: [R] problem with rbind when data frame contains an date-time variable POSIXt POSIXlt

2011-02-17 Thread Bill.Venables
The solution is probably to make the date-time columns POSIXct: x <- read.table(textConnection(" ID event.date.time 1 '2009-07-23 00:20:00' 2 '2009-08-18 16:25:00' 3 '2009-08-13 08:30:00' "), header = TRUE) y <- read.table(textConnection(" ID event.date.time 4

Re: [R] Saturated model in binomial glm

2011-02-16 Thread Bill.Venables
This is a very good question. You have spotted something that not many people see and it is important. The bland assertion that the deviance can be used as a test of fit can be seriously misleading. For this data the response is clearly binary, Admitted (success) or Rejected (failure) and

Re: [R] How to get warning about implicit factor to integer coercion?

2011-02-14 Thread Bill.Venables
Your complaint is based on what you think a factor should be rather than what it actually is and how it works. The trick with R (BTW I think it's version 2.12.x rather than 12.x at this stage...) is learning to work *with* it as it is rather than making it work the way you would like it to do.

Re: [R] Predictions with missing inputs

2011-02-11 Thread Bill.Venables
With R it is always possible to shoot yourself squarely in the foot, as you seem keen to do, but R does at least often make it difficult. When you predict, you need to have values for ALL variables used in the model. Just leaving out the coefficients corresponding to absent predictors is

Re: [R] if a variable is defined

2011-02-10 Thread Bill.Venables
!is.null(my.obj...@my.data.frame$my.var) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kushan Thakkar Sent: Friday, 11 February 2011 10:30 AM To: r-help@r-project.org Subject: [R] if a variable is defined I have an object type

Re: [R] Getting p-value from summary output

2011-02-10 Thread Bill.Venables
Hi Alice, You can use pvals <- summary(myprobit)$coefficients[, "Pr(>|z|)"] Notice that if the p-value is very small, the printed version is abbreviated, but the object itself has full precision (not that it matters). Bill Venables. -Original Message- From: r-help-boun...@r-project.org
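
A sketch of the extraction on a simulated probit fit. The data and model are made up; only the name myprobit and the indexing of the coefficient table come from the post.

```r
set.seed(7)                                     # simulated data, for illustration
x <- rnorm(100)
y <- rbinom(100, 1, pnorm(0.3 + 0.8 * x))

myprobit <- glm(y ~ x, family = binomial(link = "probit"))

# The coefficient table is a matrix; the p-values are the column named "Pr(>|z|)"
pvals <- summary(myprobit)$coefficients[, "Pr(>|z|)"]
pvals
```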

Re: [R] factor.scores

2011-02-09 Thread Bill.Venables
The function factor.scores does not inherit anything. It is a generic function that provides methods for a number of classes, including those you mention. (The terminology is important if you are to understand what is going on here): library(ltm) Loading required package: MASS Loading

Re: [R] merge multiple .csv files

2011-02-09 Thread Bill.Venables
You want advice? 1. Write sentences that contain a subject and where appropriate, an object as well. This makes your email just that bit more polite. This list is not a paid service. 2. The sheets may have variables in common, but do they have the same name in both, and the same class, and

Re: [R] Newb Prediction Question using stepAIC and predict(), is R wrong?

2011-02-09 Thread Bill.Venables
Using complex names, like res[, 3+i] or res$var, in the formula for a model is a very bad idea, especially if you eventually want to predict to new data. (In fact it won't work, so that makes it very bad indeed.) So do not use '$' or '[..]' terms in model formulae - this is going
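
A sketch of the recommended pattern on made-up data: plain variable names in the formula, with the data frame supplied through the data argument. The names res, v1, v2 and newdat are assumptions.

```r
set.seed(3)
res <- data.frame(y = rnorm(20), v1 = rnorm(20), v2 = rnorm(20))   # made-up data

# Plain names in the formula; the data frame goes in data =
fit <- lm(y ~ v1 + v2, data = res)

# New data only needs columns with the same names as in the formula
newdat <- data.frame(v1 = c(0, 1), v2 = c(0, -1))
pred <- predict(fit, newdata = newdat)
pred
```

With terms such as res$v1 in the formula, predict() has no way to match the columns of newdat to the model terms.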

Re: [R] leap year and order function

2011-01-30 Thread Bill.Venables
yearLength <- function(year) 365 + (year %% 4 == 0) yearLength(1948:2010) [1] 366 365 365 365 366 365 365 365 366 365 365 365 366 365 365 365 366 365 365 365 366 [22] 365 365 365 366 365 365 365 366 365 365 365 366 365 365 365 366 365 365 365 366 365 [43] 365 365 366 365 365 365 366 365 365
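
The divisible-by-four rule is enough for 1948 to 2010. For years outside 1901 to 2099 the century exceptions of the Gregorian calendar matter; the function yearLengthGregorian below is a hypothetical extension, not part of the post.

```r
yearLength <- function(year) 365 + (year %% 4 == 0)      # as in the post

# Full Gregorian rule: century years are leap years only if divisible by 400
yearLengthGregorian <- function(year)
  365 + ((year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0)

yearLengthGregorian(c(1900, 2000, 2011, 2012))
all(yearLength(1948:2010) == yearLengthGregorian(1948:2010))
```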

Re: [R] Unexpected Gap in simple line plot

2011-01-20 Thread Bill.Venables
You do have missing values. Setting xlim does not subset the data. How about link <- "http://processtrends.com/files/RClimate_CTS_latest.csv" cts <- read.csv(link, header = TRUE) scts <- subset(cts, !is.na(GISS) & !is.na(cts)) ## remove defectives plot(GISS ~ yr_frac, scts, type = "l",

Re: [R] Log difference in a dataframe column

2011-01-18 Thread Bill.Venables
lag and as.ts are separate operations (which in fact commute) lag(as.ts(1:10), 1) Time Series: Start = 0 End = 9 Frequency = 1 [1] 1 2 3 4 5 6 7 8 9 10 as.ts(lag(1:10, 1)) Time Series: Start = 0 End = 9 Frequency = 1 [1] 1 2 3 4 5 6 7 8 9 10 You do NOT need to call

Re: [R] plot continuous data vs clock time

2011-01-17 Thread Bill.Venables
plot(y~x, type="p", xlim = x[c(2,4)]) ? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of wangxipei Sent: Tuesday, 18 January 2011 1:27 PM To: r-help Subject: [R] plot continuous data vs clock time Dear R users, I have a question

Re: [R] data prep question

2011-01-16 Thread Bill.Venables
Here is one way: con <- textConnection(" ID TIME OBS 001 2200 23 001 2400 11 001 3200 10 001 4500 22 003 3900 45 003 5605

Re: [R] Rounding variables in a data frame

2011-01-14 Thread Bill.Venables
If you can specify the omitted columns as numbers there is a quick way to do it. e.g. d d1 d2 d3 d4 1 9.586524 4.833417 0.8142588 -3.237877 2 11.481521 6.536360 2.3361894 -4.042314 3 10.243192 5.506440 2.0443788 -3.478543 4 9.969548 6.159666 3.0449121

Re: [R] Weighted least squares regression for an exponential decay function

2011-01-14 Thread Bill.Venables
nls in the stats package. ?nls From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Erik Thulin [ethu...@gmail.com] Sent: 15 January 2011 16:16 To: r-help@r-project.org Subject: [R] Weighted least squares regression for an

Re: [R] Problems creating a PNG file for a dendrogram: Error in plot.window(...) : need finite 'xlim' values

2011-01-11 Thread Bill.Venables
I very much doubt your first example does work. The value of plot() is NULL, which, if you plot it again, will give the error message you see in your second example. What were you trying to achieve by doing p <- plot(hc) plot(p) ### this one is trying to plot NULL ? Here is an example

Re: [R] Plotting Factors -- Sorting x-axis

2011-01-07 Thread Bill.Venables
That rather depends on what kind of plot you want to use. Here is one option that you can use without any changes: ## con <- textConnection(" Months Prec 1 Jan 102.1 2 Feb 69.7 3 Mar 44.7 4 Apr 32.1 5 May 24.0 6 Jun 18.7 7 Jul 14.0 8 Aug

Re: [R] anova vs aov commands for anova with repeated measures

2011-01-07 Thread Bill.Venables
lm() and aov() are not fully equivalent. They both fit linear models, but they use different algorithms, and this allows aov, for example, to handle some simple multistratum models. The algorithm used by lm does not allow this, but it has other advantages for simpler models. If you want to

Re: [R] packagename:::functionname vs. importFrom

2011-01-03 Thread Bill.Venables
If you use ::: to access non-exported functions, as Frank confesses he does, then you can't complain if in the next release of the package involved the non-exported objects are missing and things are being done another way entirely. That's the deal. On the other hand, sometimes package

Re: [R] Changing column names

2010-12-31 Thread Bill.Venables
You don't give us much to go on, but some variant of country <- c("US", "France", "UK", "NewZealand", "Germany", "Austria", "Italy", "Canada") result <- read.csv("result.csv", header = FALSE) names(result) <- country should do what you want. From:

Re: [R] access a column of a dataframe without qualifying the name of the column

2010-12-29 Thread Bill.Venables
Here is an alternative approach that is closer to that used by lm and friends. df <- data.frame(x=1:10,y=11:20) test <- function(col, dat) eval(substitute(col), envir = dat) test(x, df) [1] 1 2 3 4 5 6 7 8 9 10 test(y, df) [1] 11 12 13 14 15 16 17 18 19 20 There is a slight added
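
The function can be run as given. The second call below, with an expression rather than a bare column name, is an extra illustration and is not taken from the post.

```r
df <- data.frame(x = 1:10, y = 11:20)

# substitute() captures the unevaluated argument; eval() then looks it up in dat
test <- function(col, dat) eval(substitute(col), envir = dat)

test(x, df)        # the column x, without writing df$x
test(x + y, df)    # an expression in the columns is evaluated the same way
```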

Re: [R] filling up holes

2010-12-28 Thread Bill.Venables
Dear 'analyst41' (it would be a courtesy to know who you are) Here is a low-level way to do it. First create some dummy data: allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50) client_ID <- sample(LETTERS[1:5], 50, rep = TRUE) value <- 1:50 date <- sample(allDates) clientData <-

Re: [R] linear regression for grouped data

2010-12-28 Thread Bill.Venables
library(nlme) lmList(y ~ x | factor(ID), myData) This gives a list of fitted model objects. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Entropi ntrp Sent: Wednesday, 29 December 2010 12:24 PM To: r-help@r-project.org Subject:

Re: [R] monthly median in a daily dataset

2010-12-19 Thread Bill.Venables
I find this function useful for digging out months from Date objects: Month <- function(date, ...) factor(month.abb[as.POSIXlt(date)$mon + 1], levels = month.abb) For this little data set below this is what it gives: with(data, tapply(value, Month(date), median, na.rm = TRUE)) Jan Feb
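
The poster's data set is not shown in the archive, so this sketch applies the same two lines to a few made-up daily values.

```r
Month <- function(date, ...)
  factor(month.abb[as.POSIXlt(date)$mon + 1], levels = month.abb)

data <- data.frame(                        # made-up daily values
  date  = as.Date(c("2010-01-05", "2010-01-20", "2010-02-03",
                    "2010-02-10", "2010-02-25")),
  value = c(1, 3, 10, NA, 20)
)

# Median of value within each calendar month; months with no data come out as NA
med <- with(data, tapply(value, Month(date), median, na.rm = TRUE))
med[c("Jan", "Feb")]
```

Fixing the levels to month.abb keeps the months in calendar order rather than alphabetical order.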

Re: [R] Use generalised additive model to plot curve

2010-12-14 Thread Bill.Venables
Dear Lurker, If all you are trying to do is to plot something, isn't all you need something like the following? x <- c( 30, 50, 80, 90, 100) y <- c(160, 180, 250, 450, 300) sp <- spline(x, y, n = 500) plot(sp, type = "l", xlab = "x", ylab = "y", las = 1, main = "A Spline Interpolation") points(x,

Re: [R] LaTeX, MiKTeX, LyX: A Guide for the Perplexed

2010-12-07 Thread Bill.Venables
A new fortune is born? Sharing LaTeX documents with people using word processors only is no more difficult than giving driving directions to someone who is blindfolded and has all 4 limbs tied behind their back. Collaboration with people who insist on using programs that process their words

Re: [R] Newbie - want to view code for a function

2010-12-07 Thread Bill.Venables
For a substantial calculation like this the algorithms will likely be in C or Fortran. You will need to download the source for the stats package from CRAN (as a tar.gz file), expand it, and look at the source code in the appropriate sub-directories. You can get a bit of a road map in R by

Re: [R] Summing up Non-numeric column

2010-12-07 Thread Bill.Venables
?unique -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of zhiji19 Sent: Wednesday, 8 December 2010 2:57 PM To: r-help@r-project.org Subject: [R] Summing up Non-numeric column Dear All If I have the following dataset V1 V2 x y

Re: [R] Two time measures

2010-11-27 Thread Bill.Venables
I think all you need is ?split -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Eduardo de Oliveira Horta Sent: Sunday, 28 November 2010 8:02 AM To: r-help@r-project.org Subject: [R] Two time measures Hello! I have a csv file

Re: [R] Gap between graph and axis

2010-11-22 Thread Bill.Venables
Perhaps you need something like this. par(yaxs = "i") plot(runif(10), type = "h", ylim = c(0, 1.1)) -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sebastian Rudnick Sent: Tuesday, 23 November 2010 10:37 AM To: r-help@r-project.org

Re: [R] a philosophy R question

2010-11-20 Thread Bill.Venables
The conventional view used to be that S is the language and that R and S-PLUS are implementations of it. R is usually described as 'a programming environment for data analysis and graphics' (as was S-PLUS before it). However as the language that R implements diverges inexorably from the

Re: [R] Can't invert matrix

2010-11-20 Thread Bill.Venables
What you show below is only a representation of the matrix to 7dp. If you look at that, though, the condition number is suspiciously large (i.e. the matrix is very ill-conditioned): txt <- textConnection(" 0.99252358 0.93715047 0.7540535 0.4579895 0.01607797 0.09616267 0.2452471

Re: [R] density at particular values

2010-11-20 Thread Bill.Venables
It's actually not too difficult to write the density function itself as returning a function rather than a list of x and y values. Here is a no frills (well, few frills) version: ### cut here ### densityfun <- local({ normd <- function(value, bw) { force(value); force(bw) function(z)
