[R] Questions concerning function 'svm' in e1071 package
Greetings everyone,

I have the following problem (illustrative R code at the bottom of this mail): given a training sample with binary outcomes (-1/+1), I train a linear Support Vector Machine to separate them. Afterwards, I compute the weight vector w in the usual way and obtain the fitted values as w'x + b > 0 ==> yfitted = 1, otherwise -1. However, when I check against the 'predict' method, the outcomes do not match up as they should. I have already searched the R-help archives for information on this issue, but to no avail. Can any of you point me in the right direction?

Signed,
Johan Van Kerckhoven
ORSTAT and University Center of Statistics
Katholieke Universiteit Leuven

--
# Initialization of the problem
rm(list=ls())
library(e1071)
set.seed(2)
n = 50
d = 4
p = 0.5
x = matrix(rnorm(n*d), ncol=d)
mushift = c(1, -1, rep(0, d-2))
y = runif(n) > p
y = factor(2*y - 1)
x = x - outer(rep(1, n), mushift)
x[y == 1, ] = x[y == 1] + 2*outer(rep(1, sum(y == 1)), mushift)
svclass = svm(x, y, scale=FALSE, kernel="linear")

# Computation of the weight vector
w = t(svclass$coefs) %*% svclass$SV
if (y[1] == -1) { w = -w }

# Derivation of predicted class labels
# Using method in documentation
yfit = (x %*% t(w) + svclass$rho) > 0
yfit = factor(2*yfit - 1)

# Extracting them directly from the model
yfit2 = svclass$fitted

# Display where predictions differ from each other
yfit != yfit2

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problem with try()
Dear R-experts,

I am running a large simulation exercise that requires a fairly complicated integration. The integral is computed within a C function called Denom, using the function qags from the GSL library. Here is a piece of the R code:

denom <- try(.C("Denom", as.double(x), as.integer(n), as.integer(p), as.double(param), as.double(delta), res=as.double(results)))
denomres = if (class(denom)=="try-error") NA else denom$res

Sometimes the integration fails with the following error message

gsl: qags.c:553: ERROR: bad integrand behavior found in the integration interval
Default GSL error handler invoked.

and the whole simulation job is destroyed. My question is: why does try() not work here, and how can I fix this problem?

Many thanks,
Leonid Landsman.
[R] Query : Chi Square goodness of fit test
Suppose we have the following data on the number of frauds:

variable <- c(4,1,6,9,9,10,2,4,8,2,3,0,1,2,3,1,3,4,5,4,4,4,9,5,4,3,11,8,12,3,10,0,7)
pmf <- dpois(i, lambda, log = FALSE) # prob. mass function of variable

How do I apply the chi-square goodness-of-fit test to check whether the sample comes from a Poisson distribution? How do I calculate the observed and expected frequencies, and then compute the chi-square statistic and interpret the result?

The call I used and the output I get are as follows:

chisq.test(variable, p=pmf, simulate.p.value =FALSE, correct = FALSE)

Chi-squared test for given probabilities
data: No_of_Frouds
X-squared = 1.043111e+15, df = 32, p-value < 2.2e-16
Warning message:
Chi-squared approximation may be incorrect in: chisq.test(No_of_Frouds, p = pmf, simulate.p.value = FALSE, correct = FALSE)

But this answer is not correct. Please suggest the correct variable, calculations and formula in R. Awaiting your positive reply.

Regards,
Priti.
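One way to set this test up in base R is sketched below. It is only an illustration, not a reply from the thread: the bin boundaries are an arbitrary choice, lambda is estimated by the sample mean, and the names frauds, breaks, observed and expected are introduced here. The key point is that the test compares binned observed counts with expected counts, rather than passing the raw observations to chisq.test().

```r
# Fraud counts from the post above.
frauds <- c(4,1,6,9,9,10,2,4,8,2,3,0,1,2,3,1,3,4,5,4,4,4,9,5,
            4,3,11,8,12,3,10,0,7)

# Estimate the Poisson mean from the sample.
lambda_hat <- mean(frauds)

# Bin the counts as 0-2, 3-4, 5-6, 7-9, 10+ (illustrative cut points).
breaks   <- c(-Inf, 2, 4, 6, 9, Inf)
observed <- as.vector(table(cut(frauds, breaks)))

# Probability of each bin under Poisson(lambda_hat).
probs    <- diff(c(0, ppois(c(2, 4, 6, 9), lambda_hat), 1))
expected <- length(frauds) * probs

# Pearson statistic; one extra degree of freedom is lost for estimating lambda.
stat    <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1 - 1
p_value <- pchisq(stat, df, lower.tail = FALSE)

print(data.frame(observed, expected = round(expected, 2)))
print(c(statistic = stat, df = df, p.value = p_value))
```

A small p-value would speak against the Poisson assumption. With only 33 observations some expected counts are small, so the chi-square approximation is rough and merging bins may be advisable.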
Re: [R] Data Manipulations - Group By equivalent
Using the doBy package makes this easier.

# GENERATE A TREATMENT GROUP #
group<-as.factor(paste("treatment", rep(1:2, 4), sep = '_'));
# CREATE A SERIES OF RANDOM VALUES #
x<-rnorm(length(group));
# CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
data<-data.frame(group, x);
library(doBy)
summ2<-summaryBy(x~group,data=data,FUN=c(mean,sum),na.rm=T,prefix=c("mean","sum"))
combine2<-merge(data,summ2)

Ronggui

2006/7/2, Wensui Liu <[EMAIL PROTECTED]>:
> Zubin,
>
> I bet you are working for InterContinental Hotels, and I think you are
> probably not the real Zubin there, right? ^_^ If you have the chance,
> could you please say hi to him for me?
>
> Here is a piece of R code I copied from my blog, side by side with SAS.
> You might need to tweak it a little to get what you need.
>
> CALCULATE GROUP SUMMARY IN R
> ##
> # HOW TO CALCULATE GROUP SUMMARY IN R
> # DATE : DEC-13, 2005
> ##
> # EQUIVALENT SAS CODE:
> #
> # DATA DATA;
> #   DO I = 1 TO 2;
> #     DO J = 1 TO 4;
> #       GROUP = 'TREATMENT_'||PUT(I, 1.);
> #       X = RANNOR(1);
> #       OUTPUT;
> #     END;
> #   END;
> #   KEEP GROUP X;
> # RUN;
> #
> # PROC SQL;
> #   CREATE TABLE COMBINE AS
> #   SELECT *, MEAN(X) AS MEAN_X, SUM(X) AS SUM_X
> #   FROM DATA
> #   GROUP BY GROUP;
> # QUIT;
> ##
> # GENERATE A TREATMENT GROUP #
> group<-as.factor(paste("treatment", rep(1:2, 4), sep = '_'));
> # CREATE A SERIES OF RANDOM VALUES #
> x<-rnorm(length(group));
> # CREATE A DATA FRAME TO COMBINE THE ABOVE TWO #
> data<-data.frame(group, x);
> # CALCULATE SUMMARY FOR X #
> x.mean<-tapply(data$x, data$group, mean, na.rm = T);
> x.sum<-tapply(data$x, data$group, sum, na.rm = T);
> # CREATE A DATA FRAME TO COMBINE SUMMARIES #
> summ<-data.frame(x.mean, x.sum, group = names(x.mean));
> # COMBINE DATA AND SUMMARIES TOGETHER #
> combine<-merge(data, summ, by = "group");
>
> On 7/1/06, zubin <[EMAIL PROTECTED]> wrote:
> > Hello, a beginner R user - boy, I wish there was a book on just data
> > manipulations for SAS users learning R (equivalent to the SAS DATA
> > STEP).. Okay, my question:
> >
> > I have a panel data set, hotel occupancy data by month for 12 months,
> > 1000 hotels. I have a field labeled 'year' and want to consolidate the
> > monthly records using an average into 1000 occupancy numbers - just a
> > simple average of the 12 months by hotel. In SQL this operation is
> > pretty easy, a group by query (group by hotel where year = 2005, avg
> > occupancy) - how is this done in R? (in R language, not SQL). Thx!
> >
> > -zubin
>
> --
> WenSui Liu (http://spaces.msn.com/statcompute/blog)
> Senior Decision Support Analyst
> Health Policy and Clinical Effectiveness
> Cincinnati Children Hospital Medical Center

--
黄荣贵
Department of Sociology
Fudan University
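For the hotel question itself, base R's aggregate() is close to a direct translation of the SQL query. The sketch below uses a made-up panel of three hotels; the data frame hotels and its column names hotel, year, month and occupancy are assumptions for illustration, not taken from the original poster's data.

```r
# Hypothetical panel: 3 hotels x 12 months for 2005 (made-up numbers).
set.seed(1)
hotels <- data.frame(
  hotel     = rep(c("H1", "H2", "H3"), each = 12),
  year      = 2005,
  month     = rep(1:12, times = 3),
  occupancy = runif(36, min = 0.4, max = 0.95)
)

# SQL equivalent:
#   SELECT hotel, AVG(occupancy) FROM hotels WHERE year = 2005 GROUP BY hotel
avg_occ <- aggregate(occupancy ~ hotel,
                     data = subset(hotels, year == 2005),
                     FUN  = mean)
print(avg_occ)
```

The result has one row per hotel, so with the real data it would contain 1000 rows. tapply(hotels$occupancy, hotels$hotel, mean) gives the same averages as a named vector.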
Re: [R] how to recode in my dataset?
I always use the "recode" function (in the car package) to recode variables. It works well and I like that function.

2006/7/2, zhijie zhang <[EMAIL PROTECTED]>:
> Dear Rusers,
> My question is about "recode variables". First, I'd like to say
> something about the idea of recoding:
> My dataset has three variables: type, soiltem and airtem, which mean
> grass type, soil temperature and air temperature. As we all know, air
> temperature varies more than soil temperature, so the values of the two
> temperatures may cover different ranges.
> My recoding is to recode soiltem with 0.2 intervals, and airtem with
> 0.5 intervals, that is:
> In soiltem: 0~0.2<-0.1, 0.2~0.4<-0.3, 0.4~0.6<-0.5, ... etc;
> In airtem: 0~0.5<-0.25, 0.5~1<-0.75, 1~1.5<-1.25, ... etc;
> My example is like this:
> type<-c(1, 1, 2, 3,4,1,1,4,3,2)
> soiltem<-c(19.2,18.6,19.5,19.8,19.6,20.6,19.1,18.7,22.4,21.6)
> airtem<-c(19.9,20.5,21.6,25.6,22.6,21.3,23.7,21.5,24.7,24.4)
> mydata<-data.frame(type,soiltem,airtem) # copy the above four lines to generate the dataset
> mydata
>    type soiltem airtem
> 1     1    19.2   19.9
> 2     1    18.6   20.5
> 3     2    19.5   21.6
> 4     3    19.8   25.6
> 5     4    19.6   22.6
> 6     1    20.6   21.3
> 7     1    19.1   23.7
> 8     4    18.7   21.5
> 9     3    22.4   24.7
> 10    2    21.6   24.4
>
> Thanks very much!
> --
> Kind Regards,
> Zhi Jie,Zhang ,PHD
> Department of Epidemiology
> School of Public Health
> Fudan University
> Tel:86-21-54237149

--
黄荣贵
Department of Sociology
Fudan University
Re: [R] how to get the studentized residuals in lm()
help.search("studentized")

You will see:

studres(MASS)    Extract Studentized Residuals from a Linear Model

2006/7/3, zhijie zhang <[EMAIL PROTECTED]>:
> Dear friends,
> In S-PLUS, lm() generates the studentized residuals automatically for
> us, but in R it does not seem to return them: after I fitted lm(), I
> used attributes() to inspect the object and didn't find studentized
> residuals. How do I get the studentized residuals from lm()? Have I
> missed something?
> Thanks very much!
> --
> Kind Regards,
> Zhi Jie,Zhang ,PHD
> Department of Epidemiology
> School of Public Health
> Fudan University
> Tel:86-21-54237149

--
黄荣贵
Department of Sociology
Fudan University
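Besides studres() in MASS, the stats package that ships with R has rstandard() and rstudent(), which work on an lm fit without extra packages. The sketch below uses the built-in cars data purely as a stand-in for the poster's model, and also recomputes the internally studentized residuals by hand to show what the function returns.

```r
# Example fit on a built-in data set (stand-in for the poster's model).
fit <- lm(dist ~ speed, data = cars)

r_int <- rstandard(fit)  # internally studentized (standardized) residuals
r_ext <- rstudent(fit)   # externally studentized (leave-one-out sigma)

# The same internally studentized residuals computed by hand:
# e_i / (s * sqrt(1 - h_ii)), with h_ii the leverage of observation i.
h        <- hatvalues(fit)
s        <- summary(fit)$sigma
r_manual <- residuals(fit) / (s * sqrt(1 - h))

print(head(cbind(r_int, r_ext, r_manual)))
```

Which of the two variants S-PLUS reports is not stated in the thread, so it is worth checking both against a known S-PLUS result.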
Re: [R] panel ordering in nlme and augPred plots
Hi,

I'm new at this, I'm very confused, and I think I'm missing something important here. In our pet example we have this:

> fm <- lme(Orthodont)
> plot(Orthodont)
> plot(augPred(fm, level = 0:1))

which gives us a trellis plot with the females above the males, starting with "F03", "F04", "F11", "F06", etc.

I thought the point of this was to create an ordering where the females are ordered ("F01", "F02", "F03", etc., followed by the males in order). However, the solution given ...

> fm <- lme(Orthodont)
> plot(Orthodont)
> plot(augPred(fm1, level = 0:1), skip = rep(c(F,T), c(16, 2)))

... doesn't solve it, although it does plot all the females before starting on the males. That is, it starts with "F02", "F08", "F03", ... which isn't in order either.

Running Petr's code also gave output which wasn't ordered by subject. Could someone please explain to me how to order the panels of the trellis plot by subject?

thanks,
Nandor
Re: [R] large dataset!
Hello Jennifer,

I'm writing a package, SQLiteDF, for Google SOC2006, under the supervision of Prof. Bates & Prof. Riley. Basically, it stores data frames in SQLite databases (i.e. in a file) and aims to be transparently accessible from R using the same operators as ordinary data frames. Right now it is quite usable (the "indexers" are working, and some other generic methods), but only on Linux (I should have the Windows package soon, though). I would love to hear about your requirements so that I can test my package.

Cheers,
M. Manese

On 7/3/06, Andrew Robinson <[EMAIL PROTECTED]> wrote:
> Jennifer,
[R] how to get the studentized residuals in lm()
Dear friends,

In S-PLUS, lm() generates the studentized residuals automatically for us, but in R it does not seem to return them: after I fitted lm(), I used attributes() to inspect the object and didn't find studentized residuals. How do I get the studentized residuals from lm()? Have I missed something?

Thanks very much!

--
Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
Tel:86-21-54237149
Re: [R] large dataset!
Jennifer,

it sounds like that's too much data for R to hold in your computer's RAM. You should give serious consideration to whether you need all those data for the models that you're fitting, and if so, whether you need to fit them all at once. If not, think about pre-processing steps, e.g. SQL commands, to pull out the data that you need. For example, if the data are spatial, then think about analyzing them by patches.

Good luck,
Andrew

On Sun, Jul 02, 2006 at 10:12:25AM -0400, JENNIFER HILL wrote:
>
> Hi, I need to analyze data that has 3.5 million observations and
> about 60 variables and I was planning on using R to do this but
> I can't even seem to read in the data. It just freezes and ties
> up the whole system -- and this is on a Linux box purchased about
> 6 months ago on a dual-processor PC that was pretty much the top
> of the line. I've tried expanding R's memory limits but it
> doesn't help. I'll be hugely disappointed if I can't use R b/c
> I need to build tailor-made models (multilevel and other
> complexities). My fall-back is the SPlus big data package but
> I'd rather avoid it if anyone can provide a solution.
>
> Thanks
>
> Jennifer Hill

--
Andrew Robinson
Department of Mathematics and Statistics    Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
Email: [EMAIL PROTECTED]
http://www.ms.unimelb.edu.au
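If some of the roughly 60 variables are not needed, one way to cut memory use inside R itself is to declare the column types up front and skip unwanted columns while reading. The sketch below is only an illustration on a small temporary file; the column names id, y, junk and g are made up, and the real file layout is not known from the thread.

```r
# Stand-in for the large file: a small CSV written to a temporary file.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, y = rnorm(1000), junk = "x",
                     g = rep(1:4, 250)),
          path, row.names = FALSE)

# One class per column, in file order. "NULL" makes read.csv skip that
# column entirely, and declaring the types avoids costly type guessing.
wanted <- c("integer", "numeric", "NULL", "integer")

# nrows limits how many records are read, which is useful for testing
# the column specification before attempting the full file.
first <- read.csv(path, colClasses = wanted, nrows = 100)
str(first)
```

Whether this is enough for 3.5 million rows depends on how many columns can be dropped. Otherwise the database route suggested above remains the safer option.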
Re: [R] how to recode in my dataset?
?cut is probably what you're looking for, e.g., something like:

ind <- cut(mydata$soiltem, seq(0, 60, 0.2), labels = FALSE)
seq(0.1, 60, 0.2)[ind]

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

Quoting zhijie zhang <[EMAIL PROTECTED]>:
> Dear Rusers,
> My question is about "recode variables". First, I'd like to say
> something about the idea of recoding:
> My dataset has three variables: type, soiltem and airtem, which mean
> grass type, soil temperature and air temperature. As we all know, air
> temperature varies more than soil temperature, so the values of the two
> temperatures may cover different ranges.
> My recoding is to recode soiltem with 0.2 intervals, and airtem with
> 0.5 intervals, that is:
> In soiltem: 0~0.2<-0.1, 0.2~0.4<-0.3, 0.4~0.6<-0.5, ... etc;
> In airtem: 0~0.5<-0.25, 0.5~1<-0.75, 1~1.5<-1.25, ... etc;
> My example is like this:
> type<-c(1, 1, 2, 3,4,1,1,4,3,2)
> soiltem<-c(19.2,18.6,19.5,19.8,19.6,20.6,19.1,18.7,22.4,21.6)
> airtem<-c(19.9,20.5,21.6,25.6,22.6,21.3,23.7,21.5,24.7,24.4)
> mydata<-data.frame(type,soiltem,airtem) # copy the above four lines to generate the dataset
> mydata
>    type soiltem airtem
> 1     1    19.2   19.9
> 2     1    18.6   20.5
> 3     2    19.5   21.6
> 4     3    19.8   25.6
> 5     4    19.6   22.6
> 6     1    20.6   21.3
> 7     1    19.1   23.7
> 8     4    18.7   21.5
> 9     3    22.4   24.7
> 10    2    21.6   24.4
>
> Thanks very much!
> --
> Kind Regards,
> Zhi Jie,Zhang ,PHD
> Department of Epidemiology
> School of Public Health
> Fudan University
> Tel:86-21-54237149
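The cut() idea can be wrapped in a small helper that returns interval midpoints for any bin width. This is a sketch only: recode_mid and its upper argument are names introduced here, and the intervals are taken as (lower, upper], which the original question does not specify.

```r
# Map each value to the midpoint of its interval of the given width.
# Intervals are (lower, upper], so 20.5 belongs to (20, 20.5].
recode_mid <- function(x, width, upper = 60) {
  breaks <- seq(0, upper, by = width)
  idx <- cut(x, breaks, labels = FALSE)  # index of the interval for each value
  breaks[idx] + width / 2                # lower edge plus half the width
}

airtem  <- c(19.9, 20.5, 21.6, 25.6, 22.6, 21.3, 23.7, 21.5, 24.7, 24.4)
air_mid <- recode_mid(airtem, 0.5)
print(cbind(airtem, air_mid))
```

For the 0.2-wide soil temperature bins the break points are not exactly representable in binary floating point, so a value that sits exactly on a boundary (19.2, say) may land on either side. Rounding the data or shifting the breaks slightly avoids that ambiguity.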
Re: [R] curiosity question: new graphics vs. old graphics subsystem
Well, as a newbie, I believe your idea is great. However, the R Core team is, in my humble opinion, way too stretched (for a free software development team) to do this. A complementary development team (similar to, say, the Tinn-R team) might be able to address this issue. I wish I had the skills to contribute :-) Just my 2c.

The least of learning is done in the classrooms. - Thomas Merton

> Date: Sun, 2 Jul 2006 09:34:39 -0400
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] curiosity question: new graphics vs. old graphics subsystem
>
> hi mihai: it is more likely that the developers will take this more
> seriously if you echo my concern on r-help itself. regards, /iaw
>
> On 7/1/06, Mihai Nica <[EMAIL PROTECTED]> wrote:
> >
> > Wow, this is what I would say if I knew how to say it :-) For newbies
> > (such as myself) or those who lack programming expertise (and, why not, for
> > those not interested in programming) this approach would be great.
> > mihai
[R] large dataset!
Hi,

I need to analyze data that has 3.5 million observations and about 60 variables and I was planning on using R to do this, but I can't even seem to read in the data. It just freezes and ties up the whole system -- and this is on a Linux box purchased about 6 months ago on a dual-processor PC that was pretty much the top of the line. I've tried expanding R's memory limits but it doesn't help. I'll be hugely disappointed if I can't use R b/c I need to build tailor-made models (multilevel and other complexities). My fall-back is the SPlus big data package but I'd rather avoid it if anyone can provide a solution.

Thanks

Jennifer Hill
Re: [R] nlme: correlation structure in gls and zero distance
Joris De Wolf wrote:
> Have you tried to define 'an' as a group? Like in
>
> gls(IKAfox~an,correlation=corExp(2071,form=~x+y|an,nugget=1.22),data=renliev)
>
> A small data set might help to explain the problem.
>
> Joris

Thanks. Seems to work with a small artificial data set:

an<-as.factor(rep(2001:2004,each=10))
x<-rep(rnorm(10),times=4)
y<-rep(rnorm(10),times=4)
IKA<-rpois(40,2)
site<-as.factor(rep(letters[1:10],times=4))
library(nlme)

mod1<-gls(IKA~an-1,correlation=corExp(form=~x+y))
Error in getCovariate.corSpatial(object, data = data) :
  Cannot have zero distances in "corSpatial"

mod2<-gls(IKA~an-1,correlation=corExp(form=~x+y|an))
> mod2
Generalized least squares fit by REML
  Model: IKA ~ an - 1
  Data: NULL
  Log-restricted-likelihood: -73.63998

Coefficients:
  an2001   an2002   an2003   an2004
1.987611 2.454520 2.429907 2.761011

Correlation Structure: Exponential spatial correlation
 Formula: ~x + y | an
 Parameter estimate(s):
    range
0.4304012
Degrees of freedom: 40 total; 36 residual
Residual standard error: 1.746205

> Patrick Giraudoux wrote:
>> Dear listers,
>>
>> I am trying to model the distribution of fox density over years in
>> the Doubs department. Measurements have been taken on 470 plots in
>> March each year and georeferenced. Average density is supposed to be
>> different each year.
>>
>> In a first approach, I would like to use a general model of this
>> type, taking spatial correlation into account:
>>
>> gls(IKAfox~an,correlation=corExp(2071,form=~x+y,nugget=1.22),data=renliev)
>>
>> but I get
>>
>> > gls(IKAfox~an,correlation=corExp(2071,form=~x+y,nugget=1.22),data=renliev)
>> Error in getCovariate.corSpatial(object, data = data) :
>>   Cannot have zero distances in "corSpatial"
>>
>> I understand that the 470 geographical coordinates are repeated three
>> times (measurements are taken in each of the three years at the same
>> place), which obviously cannot be handled there.
>>
>> Does anybody know a way to work around that, other than slightly
>> jittering the geographical coordinates?
>>
>> Thanks in advance,
>>
>> Patrick
Re: [R] send output to printer
Matthias Braeunig wrote:
> It has to be a simple thing, but I could not figure it out:
>
> How do I send the text output from object x to the printer?
> As a shell user I would expect a pipe to the printer... "|kprinter" or
> "|lpr -Pmyprinter" somehow. And yes, I'm on Linux.

I think capture.output() helps to send stuff to a connection.

Uwe Ligges

> Thanks!
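A minimal sketch of the capture.output() route follows. So that it runs anywhere, the connection used here is a temporary file; the commented pipe() line shows how a printer command might be substituted on Linux, where the printer name myprinter is taken from the question and is an assumption about the local setup.

```r
# Any printable object will do; this one is a stand-in for 'x'.
x <- summary(c(2.5, 3.1, 4.7, 5.0))

# capture.output() returns the printed representation as character lines.
lines_out <- capture.output(print(x))

# Stand-in destination so the sketch runs without a printer.
dest <- tempfile(fileext = ".txt")
con  <- file(dest, open = "w")
# On Linux one might instead open a pipe to the print spooler (untested here):
# con <- pipe("lpr -Pmyprinter", open = "w")
writeLines(lines_out, con)
close(con)
```

Closing the connection matters in the pipe case, because the spooler typically starts the job only once its input ends.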
Re: [R] problems with simple statistical procedures
Thomas Preuth wrote:
> Hello,
>
> I use an imported dataframe and want to extract the mean value for one
> column.
> After typing "mean (rae.df$VOL_DEP)" I receive
> "[1] NA
> Warning message:
> argument is not numeric or logical: returning NA in:
> mean.default("rae.df$POINT_Y_CH") "

Well, rae.df$VOL_DEP != "rae.df$POINT_Y_CH"

I think this is really strange. Are you sure this is the exact call and its output? If so, please tell us the output of

str(rae.df)

Uwe Ligges

> But when I look into the dataframe the column is characterized as numeric.
>
> Sorry for bothering, but as a complete newbie I just cannot help myself.
>
> Greetings,
> thomas
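The warning quoted above is what mean() produces when it is handed a character string rather than a column, as the following sketch shows. The data frame below is a made-up stand-in that only reuses the two column names from the post, and whether a quoted expression is what actually happened is a guess based on the warning text.

```r
# Hypothetical stand-in for the imported data frame.
rae.df <- data.frame(VOL_DEP    = c(1.5, 2.5, 4.0),
                     POINT_Y_CH = c(10, 20, 30))

# Unquoted: the column itself is passed, so the mean is numeric.
m_ok <- mean(rae.df$VOL_DEP)

# Quoted: mean() receives the string "rae.df$POINT_Y_CH", which is
# neither numeric nor logical, so it returns NA with a warning.
m_bad <- suppressWarnings(mean("rae.df$POINT_Y_CH"))

# str() shows how each column is actually stored.
str(rae.df)
```

If str() reports a column as factor or character, converting it with as.numeric(as.character(...)) before calling mean() is the usual remedy.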
[R] sparse matrix tools
Dear R-Help list:

I'm using the Matrix package to operate on 600 x ~5000 element unsymmetric sparse arrays. So far, so good, but if I find I need more speed or functionality, how hard would it be to use other sparse matrix toolsets from within R, say MUMPS, PARDISO or UMFPACK, that do not have explicit R interfaces? More information on these is available here

www.cise.ufl.edu/research/sparse/umfpack/
www.computational.unibas.ch/cs/scicomp/software/pardiso
www.enseeiht.fr/lima/apo/MUMPS/

and in these reviews

ftp://ftp.numerical.rl.ac.uk/pub/reports/ghsNAGIR20051r1.pdf
http://www.cise.ufl.edu/research/sparse/codes/

neither of which reviewed the R Matrix package, unfortunately.

Thanks,
- John Thaden, Ph.D., U. Arkansas for Med. Sci., Little Rock.
[R] how to recode in my dataset?
Dear Rusers,

My question is about "recode variables". First, I'd like to say something about the idea of recoding:

My dataset has three variables: type, soiltem and airtem, which mean grass type, soil temperature and air temperature. As we all know, air temperature varies more than soil temperature, so the values of the two temperatures may cover different ranges.

My recoding is to recode soiltem with 0.2 intervals, and airtem with 0.5 intervals, that is:

In soiltem: 0~0.2<-0.1, 0.2~0.4<-0.3, 0.4~0.6<-0.5, ... etc;
In airtem: 0~0.5<-0.25, 0.5~1<-0.75, 1~1.5<-1.25, ... etc;

My example is like this:

type<-c(1, 1, 2, 3,4,1,1,4,3,2)
soiltem<-c(19.2,18.6,19.5,19.8,19.6,20.6,19.1,18.7,22.4,21.6)
airtem<-c(19.9,20.5,21.6,25.6,22.6,21.3,23.7,21.5,24.7,24.4)
mydata<-data.frame(type,soiltem,airtem) # copy the above four lines to generate the dataset
mydata
   type soiltem airtem
1     1    19.2   19.9
2     1    18.6   20.5
3     2    19.5   21.6
4     3    19.8   25.6
5     4    19.6   22.6
6     1    20.6   21.3
7     1    19.1   23.7
8     4    18.7   21.5
9     3    22.4   24.7
10    2    21.6   24.4

Thanks very much!

--
Kind Regards,
Zhi Jie,Zhang ,PHD
Department of Epidemiology
School of Public Health
Fudan University
Tel:86-21-54237149
Re: [R] Optional variables in function?
?missing

On 7/2/06, Jonathan Greenberg <[EMAIL PROTECTED]> wrote:
>
> I'm a bit new to writing R functions and I was wondering what the "best
> practice" for having optional variables in a function is, and how to test
> for optional and non-optional variables? e.g. if I have the following
> function:
>
> helpme <- function(a,b,c) {
>
>
> }
>
> In this example, I want c to be an optional variable, but a and b to be
> required. How do I:
> 1) test to see if the user has inputted c
> 2) break out of the function if the user has NOT inputted a or b.
>
> Thanks!
>
> --j
>
> --
>
> Jonathan A. Greenberg, PhD
> NRC Research Associate
> NASA Ames Research Center
> MS 242-4
> Moffett Field, CA 94035-1000
> Phone: 415-794-5043
> AIM: jgrn3007
> MSN: [EMAIL PROTECTED]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
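A minimal sketch of how missing() answers both parts of the question is given below. The function body (adding the arguments) is invented purely so that the example returns something; only the name helpme and the arguments a, b and c come from the post.

```r
helpme <- function(a, b, c) {
  # 2) stop early if a required argument was not supplied
  if (missing(a) || missing(b)) {
    stop("arguments 'a' and 'b' are required")
  }
  # 1) test whether the optional argument was supplied
  if (missing(c)) {
    a + b          # illustrative behaviour without c
  } else {
    a + b + c      # illustrative behaviour with c
  }
}

print(helpme(1, 2))      # c omitted
print(helpme(1, 2, 3))   # c supplied
```

An alternative convention is to give the optional argument a default such as c = NULL and test is.null(c), which also makes the optional status visible in the function signature.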
Re: [R] workaround for numeric problems
I'd compute this on the log scale (also taking advantage of the 'log' and 'log.p' arguments of dnorm() and pnorm(), respectively), and then transform back, e.g.,

fn1 <- function(B){
    -(pnorm(B) * dnorm(B) * B + dnorm(B)^2)/pnorm(B)^2
}
fn2 <- function(B){
    p1 <- dnorm(B, log = TRUE) + log(-B) - pnorm(B, log.p = TRUE)
    p2 <- 2 * (dnorm(B, log = TRUE) - pnorm(B, log.p = TRUE))
    exp(p1) - exp(p2)
}
fn1(c(-15, -25, -35, -55, -105))
fn2(c(-15, -25, -35, -55, -105))

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

Quoting Ott Toomet <[EMAIL PROTECTED]>:
> Dear R-people,
>
> I have to compute
>
> C <- -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2
>
> This expression seems to converge to -1 as B approaches -Inf
> (although I am unable to prove it). R has no problems until B
> is around -28 or less, where both numerator and denominator go to 0 and
> you get NaN. A simple workaround I did was
>
> C <- ifelse(B > -25,
>             -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2,
>             -1)
>
> It works well for me (32bit intel/linux platform). But what about
> other processors/platforms/compiler options? Are there any better
> ways of finding out at which values the numerical problems start?
> Can one derive something from .Machine$double.eps (but what about
> the precision of dnorm and other analytic functions)?
>
> Thanks in advance,
> Ott
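The log-scale version can be checked against the limit conjectured in the question. The standard Mills-ratio expansion suggests that, for large negative B, the expression behaves like -1 + 1/B^2; that approximation is an assumption added here, not something derived in the thread. The sketch below rewrites the log-scale formula (valid only for B < 0) under the new name stable_C and compares it with that approximation.

```r
# Log-scale evaluation of -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2.
# Valid only for B < 0, because it takes log(-B).
stable_C <- function(B) {
  l_ratio <- dnorm(B, log = TRUE) - pnorm(B, log.p = TRUE)  # log(phi/Phi)
  exp(l_ratio + log(-B)) - exp(2 * l_ratio)
}

B        <- c(-15, -35, -105)
approx_C <- -1 + 1 / B^2   # leading-order approximation (assumed, see above)

print(cbind(B, stable_C = stable_C(B), approx_C))
```

The two columns agree increasingly well as B becomes more negative, which is consistent with the limit of -1. For extremely negative B the two exponentials become huge and nearly cancel, so accuracy slowly degrades there and a fixed cutoff returning -1 remains a reasonable fallback.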
Re: [R] replace values?
# reproducing your example
xx<-"x y z
1 2 3
2 3 1
3 2 1
1 1 3
2 1 2
3 2 3
2 1 1"

# you did not tell us the class of your data, assuming data.frame
df<-read.table(textConnection(xx),header=T,colClasses="factor")

# a clean way to do what you want is using factors with ?levels
# (note that data has already been read as factor)
levels(df$x)<-c("a","b","c","d")
levels(df$y)<-c("b","a","c","d")
levels(df$z)<-c("d","c","b","a")

subset(df,x=="a")
  x y z
1 a a b
4 a b b

subset(df,x=="a"&y=="a")
  x y z
1 a a b

HTH, m

zhijie zhang wrote:
> Dear friends,
> I have a dataset like this:
> x y z
> 1 2 3
> 2 3 1
> 3 2 1
> 1 1 3
> 2 1 2
> 3 2 3
> 2 1 1
> I want to replace x with the following values: 1<-a, 2<-b, 3<-c, 4<-d;
> replace y with the following values: 1<-b, 2<-a, 3<-c, 4<-d;
> replace z with the following values: 1<-d, 2<-c, 3<-b, 4<-a;
> Finally, select two subsets:
> 1. if x='a';
> 2. x='a' and y='a';
> Thanks very much!
Re: [R] replace values?
On 07/02/06 12:39, zhijie zhang wrote:
> Dear friends,
> I have a dataset like this:
> x y z
> 1 2 3
> 2 3 1
> 3 2 1
> 1 1 3
> 2 1 2
> 3 2 3
> 2 1 1
> I want to replace x with the following values: 1<-a, 2<-b, 3<-c, 4<-d;
> replace y with the following values: 1<-b, 2<-a, 3<-c, 4<-d;
> replace z with the following values: 1<-d, 2<-c, 3<-b, 4<-a;

Here's one way. Call your dataset M, and assume it is a data.frame. This method of replacement works best when you are replacing consecutive integers, as you are. Note that X[1] is "a", X[2] is "b" and so on.

X <- c("a","b","c","d")
Y <- c("b","a","c","d")
Z <- c("d","c","b","a")
M$x <- X[M$x]
M$y <- Y[M$y]
M$z <- Z[M$z]

> Finally, select two subsets:
> 1. if x='a';
> 2. x='a' and y='a';

M[M$x=="a",]
M[M$x=="a" & M$y=="a",]

The subsets will be rows. I'm not sure that's what you mean.

Jon
--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
[R] workaround for numeric problems
Dear R-people,

I have to compute

C <- -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2

This expression seems to converge to -1 as B approaches -Inf (although I am unable to prove it). R has no problems until B is around -28 or less, where both numerator and denominator go to 0 and you get NaN. A simple workaround I did was

C <- ifelse(B > -25,
            -(pnorm(B)*dnorm(B)*B + dnorm(B)^2)/pnorm(B)^2,
            -1)

It works well for me (32bit intel/linux platform). But what about other processors/platforms/compiler options? Are there any better ways of finding out at which values the numerical problems start? Can one derive something from .Machine$double.eps (but what about the precision of dnorm and other analytic functions)?

Thanks in advance,
Ott
[R] problems with simple statistical procedures
Hello,

I use an imported dataframe and want to extract the mean value for one column. After typing "mean (rae.df$VOL_DEP)" I receive

"[1] NA
Warning message:
argument is not numeric or logical: returning NA in: mean.default("rae.df$POINT_Y_CH") "

But when I look into the dataframe the column is characterized as numeric.

Sorry for bothering, but as a complete newbie I just cannot help myself.

Greetings,
thomas