Re: [R] problem installing Rpmi : mpi.h...Found in /usr/include/lam, yet libmpi
Dear Mr Ripley, dear all,

Could you please help me find an appropriate RPM package to install on Red Hat Enterprise Linux 5? I have had trouble invoking R with R-2.5.1-1.fc7.i386.rpm. It starts normally and then drops me back to the prompt, as shown below. How can I deal with this segmentation fault?

Thanks for your help,
Faithfully yours,

Souleymane N'Doye
Statistician, Decision Support Systems consultant
Labstat Conseil
P. O. Box 347, 00606 Nairobi, Kenya
Email: [EMAIL PROTECTED]
tel.: +254 (20) 736 842 478
www.labstatconseil.com

 *** caught segfault ***
address (nil), cause 'memory not mapped'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Erreur de segmentation

______
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] HTML reading,
Hello,

Sorry for my English. In an R function, I want to read HTML files in order to analyse their text. Does somebody know how I can read only the text, in plain-text format?

Thanks
Re: [R] HTML reading,
On 9/15/07, christophe vuadens [EMAIL PROTECTED] wrote:
> Hello, sorry for my English. In an R function, I want to read HTML files
> in order to analyse their text. Does somebody know how I can read only the
> text, in plain-text format? Thanks

Have a look at this: http://gking.harvard.edu/readme/

I don't know exactly how they strip the HTML tags, but it is described in the documents there. It is also a good tool for text analysis.

-- 
Armin Goralczyk, M.D.
Dept. of General Surgery
University of Göttingen
Göttingen, Germany
http://www.chirurgie-goettingen.de
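The reply above does not say how the tags are stripped. As a rough sketch only (this is not the method used by the linked tool), tags can be removed in base R with regular expressions. The helper name strip_html and the sample HTML are made up for illustration, and a regex approach like this will mishandle malformed markup, comments and character entities.

```r
# Hypothetical helper: crude tag stripping with regular expressions.
# This is an approximation, not a real HTML parser.
strip_html <- function(lines) {
  txt <- paste(lines, collapse = "\n")
  # Drop <script> and <style> blocks together with their contents
  txt <- gsub("(?is)<(script|style)\\b.*?</\\1>", " ", txt, perl = TRUE)
  # Replace every remaining tag with a space
  txt <- gsub("<[^>]+>", " ", txt)
  # Collapse runs of whitespace and trim both ends
  txt <- gsub("[[:space:]]+", " ", txt)
  gsub("^ | $", "", txt)
}

html <- c("<html><body>", "<h1>Title</h1>",
          "<p>Some <b>bold</b> text.</p>", "</body></html>")
plain <- strip_html(html)
print(plain)
```

For a file on disk, the same helper could be fed with readLines("page.html"), where the file name is again only a placeholder.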
Re: [R] RPM package for Linux RED HAT ENTERPRISE 5
See http://cran.r-project.org/bin/linux/redhat/el5/i386/

The ReadMe there says they should work on RHEL5. I've not used RHEL5, but have a little experience with CentOS 5, where R builds from the tarball without any problems at all.

On Sat, 15 Sep 2007, Ndoye Souleymane wrote:
> Dear Mr Ripley, dear all,
>
> Could you please help me find an appropriate RPM package to install on
> Red Hat Enterprise Linux 5? I have had trouble invoking R with
> R-2.5.1-1.fc7.i386.rpm. It starts normally and then drops me back to the
> prompt, as shown below. How can I deal with this segmentation fault?
>
>  *** caught segfault ***
> address (nil), cause 'memory not mapped'
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Erreur de segmentation

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
[R] how to change print limit on screen
Hi all users,

Is there any way I can change the print limit (getOption("max.print")) to unlimited or to a specified limit?

Thanks in advance
[R] starting with a capital letter
Hi everyone,

I am wondering if there is any built-in function that can determine whether words in a character vector start with a capital letter or not. Help, please. Thanks.
Re: [R] starting with a capital letter
kevinchang wrote:
> I am wondering if there is any built-in function that can determine
> whether words in a character vector start with a capital letter or not.
> Help, please. Thanks.

DIY with tolower(): apply tolower() to the first letter and compare the result with the original letter.
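A minimal sketch of the tolower() suggestion, using a made-up vector named words. A first character that is not a letter (a digit, say) is unchanged by tolower() and is therefore reported as not capitalised.

```r
words <- c("Abc", "aBc", "abC")

# Take the first character of each word and see whether lower-casing changes it
first <- substr(words, 1, 1)
starts_upper <- first != tolower(first)

print(starts_upper)
```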
Re: [R] starting with a capital letter
On 15-Sep-07 10:21:19, kevinchang wrote:
> Hi everyone, I am wondering if there is any built-in function that can
> determine whether words in a character vector start with a capital letter
> or not. Help, please. Thanks.

Something like:

  C <- c("Abc", "aBc", "abC")
  for (i in 1:length(C)) {
    if (length(grep("^[A-Z]", C[i])) > 0) {
      print("Yes")
    } else {
      print("No")
    }
  }
  [1] "Yes"
  [1] "No"
  [1] "No"

The grep expression [A-Z] looks for one of A,B,C,...,Z and the ^ makes it look for it at the start of the string.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 15-Sep-07 Time: 12:55:00
-- XFMail --
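The loop can also be written without explicit iteration. This sketch assumes a version of R that provides grepl(), and swaps in the POSIX class [[:upper:]] in place of [A-Z] so that accented capitals are matched where the locale supports them; the vector words is illustrative.

```r
words <- c("Abc", "aBc", "abC")

# grepl() returns one logical per element, so no loop is needed
is_cap <- grepl("^[[:upper:]]", words)
answer <- ifelse(is_cap, "Yes", "No")

print(answer)
```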
Re: [R] HTML reading,
Check out: https://stat.ethz.ch/pipermail/r-help/2007-August/137742.html

On 9/15/07, christophe vuadens [EMAIL PROTECTED] wrote:
> Hello, sorry for my English. In an R function, I want to read HTML files
> in order to analyse their text. Does somebody know how I can read only the
> text, in plain-text format? Thanks
[R] Class probabilities in rpart
Hi,

the predict.rpart() function from the rpart library allows for calculating the class probabilities for a given test case instead of a discrete class label. How are these class probabilities derived? Is it simply the proportion of the majority class to all cases in a leaf node?

Thanks in advance,
Chris
[R] Storing Variables of different types
Hi there,

I have an i x j x k array in which I want to store dates in the first column of every sub-matrix (i.e. j = 1 is a column of dates) and real numbers in the rest of the columns. I have been trying many things, but I am not getting anywhere.

Thank you very much for your help,
Fabian
[R] generate ROC curve using randomForest package
Hi,

I am new here. I would like to compare the performance of a random forest model with that of a support vector machine. Can anybody let me know how to generate a ROC curve for a random forest model, given that there is no need to run cross-validation?

Thank you very much!
TL
[R] Survival model (time to event data)
Hi all,

I would appreciate input on how the following survival model can be fitted in R, and on how competing risk models can be fitted in general. I would also appreciate pointers to resources that explain the use of survival models in R in greater detail.

The structure of my data is shown below. My problem is that I don't know how to model 4 different events in the same hazard model when the hazards are conditional on some other factor.

Conditions:
0. All events are mutually exclusive.
1. Either no event, Event1, or one of Events 2-4 occurs (i.e. events 2-4 are competing).
2. Event1 can only occur if St.Beg=0 (it switches St.End from this period, and St.Beg from the following periods on, to 1 until Event4 occurs).
3. Events 2-4 can only occur if St.Beg=1.

Time St.Beg St.End Event1 Event2 Event3 Event4 Number
   1      0      0      0      0      0      0      0
   2      0      0      0      0      0      0      0
   3      0      1      1      0      0      0      0
   4      1      1      0      0      0      0     10
   5      1      1      0      1      0      0     10
   6      1      1      0      1      0      0     15
   7      1      1      0      0      0      0     20
   8      1      1      0      0      0      0     20
   9      1      1      0      0      1      0     10
  10      1      0      0      0      0      1      0

Thanks much for your help,
Daniel

- cuncta stricte discussurus
Re: [R] generate ROC curve using randomForest package
L L wrote:
> I am new here. I would like to compare the performance of a random forest
> model with that of a support vector machine. Can anybody let me know how
> to generate a ROC curve for a random forest model, given that there is no
> need to run cross-validation? Thank you very much!

The ROCR package provides performance measures such as AUC, sensitivity and ROC curves. The performance() function is of particular interest.

Chris
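A sketch of how the ROCR suggestion might be combined with randomForest, assuming both packages are installed. The two-class subset of iris is only a stand-in for the poster's data. The out-of-bag vote fractions stored in the fitted object are used as scores, which is one common way to avoid a separate cross-validation run.

```r
library(randomForest)
library(ROCR)

set.seed(1)
# Stand-in data: two of the three iris species, so the problem is binary
dat <- droplevels(iris[iris$Species != "setosa", ])

rf <- randomForest(Species ~ ., data = dat, ntree = 500)

# rf$votes holds out-of-bag vote fractions, one column per class
oob_score <- rf$votes[, "virginica"]
labels <- as.integer(dat$Species == "virginica")

pred <- prediction(oob_score, labels)
roc <- performance(pred, "tpr", "fpr")   # true vs false positive rate
plot(roc)

auc <- performance(pred, "auc")@y.values[[1]]
print(auc)
```

The same prediction() and performance() calls can be applied to decision values from a support vector machine, so both curves can be drawn on one plot for comparison.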
Re: [R] Storing Variables of different types
Garavito, Fabian wrote:
> Hi there, I have an i x j x k array in which I want to store dates in the
> first column of every sub-matrix (i.e. j = 1 is a column of dates) and
> real numbers in the rest of the columns. I have been trying many things,
> but I am not getting anywhere. Thank you very much for your help, Fabian

See ?data.frame, or think about storing the dates in the rownames.
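An array can hold only one type, which is why the dates cannot sit next to the numbers. One way to read the ?data.frame hint is to keep a list of k data frames, each with a Date column followed by numeric columns. The sketch below uses invented column names and values.

```r
k <- 3  # number of sub-matrices ("slices")

# One data frame per slice: a Date column plus numeric columns
slices <- lapply(seq_len(k), function(s) {
  data.frame(date = as.Date("2007-09-01") + 0:3,
             x = c(1.5, 2.5, 3.5, 4.5) * s,
             y = c(10, 20, 30, 40) + s)
})

# Row 2 of slice 3, and the classes of its columns
print(slices[[3]][2, ])
print(sapply(slices[[3]], class))
```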
Re: [R] how to change print limit on screen
?options

  options(max.print = 10)
  1:10
   [1]  1  2  3  4  5  6  7  8  9 10
   [ reached getOption("max.print") -- omitted 0 entries ]]

On 9/15/07, Abu Naser [EMAIL PROTECTED] wrote:
> Hi all users, is there any way I can change the print limit
> (getOption("max.print")) to unlimited or to a specified limit?
> Thanks in advance

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
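A small sketch of changing the limit and putting it back afterwards. The value 50 is arbitrary. There is no literal "unlimited" setting, so a very large number is the usual stand-in.

```r
# options() returns the previous values, which makes restoring easy
old <- options(max.print = 50)
limit_now <- getOption("max.print")

print(1:100)   # output is truncated after 50 entries

options(old)   # restore whatever the limit was before
limit_restored <- getOption("max.print")
```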
[R] applying math/stat functions to rows in data frame
Hi All,

There are a variety of functions that can be applied to a variable (column) in a data frame: mean, min, max, sd, range, IQR, etc. I am aware of only two that work on rows, using q1-q3 as example variables:

  rowMeans(cbind(q1, q2, q3), na.rm = TRUE)  # mean of multiple variables
  rowSums(cbind(q1, q2, q3), na.rm = TRUE)   # sum of multiple variables

Can the standard column functions (listed in the first sentence) be applied to rows, using the correct indexes to reference the columns of interest? Or must these summary functions be programmed separately to work on a row?

Thanks,
Gerard
Re: [R] applying math/stat functions to rows in data frame
At 12:02 PM 9/15/2007, Gerard wrote:
> Hi All, There are a variety of functions that can be applied to a variable
> (column) in a data frame: mean, min, max, sd, range, IQR, etc. I am aware
> of only two that work on rows, using q1-q3 as example variables:
>
>   rowMeans(cbind(q1, q2, q3), na.rm = TRUE)  # mean of multiple variables
>   rowSums(cbind(q1, q2, q3), na.rm = TRUE)   # sum of multiple variables
>
> Can the standard column functions (listed in the first sentence) be
> applied to rows, using the correct indexes to reference the columns of
> interest? Or must these summary functions be programmed separately to
> work on a row?

Try using t() to transpose the matrix, and then apply the column function of interest.

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS
e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.
URL: http://lcfltd.com/
824 Timberlake Drive
Tel: 757-467-0954
Virginia Beach, VA 23464-3239
Fax: 757-467-2947

Vere scire est per causas scire
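One way to read the t() suggestion: after transposing, each original row becomes a column, so a column-wise sapply() gives row-wise results. The values of q1 to q3 below are invented.

```r
q <- cbind(q1 = c(1, 4, 7), q2 = c(2, 5, 9), q3 = c(3, 6, 11))

# Rows of q become columns of t(q); sapply() then works column by column
row_sd <- sapply(as.data.frame(t(q)), sd)

print(row_sd)
```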
[R] Help with a problem
Hello,

I was wondering if anyone can help me with this problem. It seems trivial, but for some reason I cannot figure it out.

With a single R command, complete the following: create a vector called seqvec that repeats the sequence 1, 3, 6, 10, 15, 21 (I was trying to use c(), but this does not work); create a 5-row, 6-column matrix from seqvec, with each row containing the sequence above; and complete the two tasks above in a single step.

LTR
Re: [R] Help with a problem
On Sat, 2007-09-15 at 12:11 -0400, Letticia Ramlal wrote:
> With a single R command, complete the following: create a vector called
> seqvec that repeats the sequence 1, 3, 6, 10, 15, 21 (I was trying to use
> c(), but this does not work); create a 5-row, 6-column matrix from seqvec,
> with each row containing the sequence above; and complete the two tasks
> above in a single step.
>
> LTR

Is this what you want?

  seqvec <- cumsum(1:6)
  seqvec
  [1]  1  3  6 10 15 21

Or, to address both:

  matrix(rep(cumsum(1:6), 5), ncol = 6, byrow = TRUE)
       [,1] [,2] [,3] [,4] [,5] [,6]
  [1,]    1    3    6   10   15   21
  [2,]    1    3    6   10   15   21
  [3,]    1    3    6   10   15   21
  [4,]    1    3    6   10   15   21
  [5,]    1    3    6   10   15   21

See ?cumsum and ?rep

HTH,
Marc Schwartz
[R] Pulling out parts of a generated array in R
Hello all,

I was wondering whether it is possible to pull out certain parts of an array in R: not an array of data that I have created, but an array that R itself has produced. More specifically, in the lines below:

  summary(prcomp(USArrests))
  Importance of components:
                            PC1     PC2    PC3     PC4
  Standard deviation     83.732 14.2124 6.4894 2.48279
  Proportion of Variance  0.966  0.0278 0.0058 0.00085
  Cumulative Proportion   0.966  0.9933 0.9991 1.0

I was wondering if there is a way to extract only one piece of this array, specifically the Proportion of Variance for PC2, which is 0.0278. I know how to extract one entire line of data from this array, using the following code:

  result <- summary(prcomp(USArrests))
  m <- result$importance
  final <- m[2,]

These lines produce the following output:

  0.966 0.0278 0.0058 0.00085

Now I was wondering if there is any way to break this down even further and extract a single value from this one line. If anyone could help me out, I would really appreciate it.

Thanks,
Wayne
Re: [R] Pulling out parts of a generated array in R
Wayne Aldo Gavioli wrote:
> I was wondering if there is a way to extract only one piece of this array,
> specifically the Proportion of Variance for PC2, which is 0.0278. I know
> how to extract one entire line of data from this array, using the
> following code:
>
>   result <- summary(prcomp(USArrests))
>   m <- result$importance
>   final <- m[2,]
>
> These lines produce the following output:
>
>   0.966 0.0278 0.0058 0.00085
>
> Now I was wondering if there is any way to break this down even further
> and extract a single value from this one line.

I am sure you already read about matrix indexing in the manual "An Introduction to R", but here is a reminder of how easy it is to get the second row, third column:

  final <- m[2,3]

Uwe Ligges

> If anyone could help me out, I would really appreciate it.
>
> Thanks,
> Wayne
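Since the importance matrix carries row and column names, the value asked about (Proportion of Variance for PC2) can also be picked out by name, which avoids counting positions. A short sketch:

```r
m <- summary(prcomp(USArrests))$importance

# Index by name: row "Proportion of Variance", column "PC2"
pc2_prop <- m["Proportion of Variance", "PC2"]

# The same element by position: second row, second column
pc2_prop_by_pos <- m[2, 2]

print(pc2_prop)
```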
Re: [R] applying math/stat functions to rows in data frame
On Sat, 2007-09-15 at 09:02 -0700, Gerard Smits wrote:
> Can the standard column functions (listed in the first sentence) be
> applied to rows, using the correct indexes to reference the columns of
> interest? Or must these summary functions be programmed separately to
> work on a row?
>
> Thanks,
> Gerard

The answer is: it depends.

If the row can be coerced to a numeric vector, then yes. This presumes that the data frame contains a single data type, or that the subset of columns you need contains a single data type.

If the row contains multiple data types, then the row becomes a single-row data frame or a list, and you would have to consider other possible approaches.

For example, taking the first row of the 'iris' dataset gives a single-row data frame:

  str(iris[1, ])
  'data.frame': 1 obs. of 5 variables:
   $ Sepal.Length: num 5.1
   $ Sepal.Width : num 3.5
   $ Petal.Length: num 1.4
   $ Petal.Width : num 0.2
   $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1

or, if you set 'drop = TRUE', a list:

  str(iris[1, , drop = TRUE])
  List of 5
   $ Sepal.Length: num 5.1
   $ Sepal.Width : num 3.5
   $ Petal.Length: num 1.4
   $ Petal.Width : num 0.2
   $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1

If, however, you remove the last column, Species, which is a factor, you can coerce the remaining object to a numeric matrix:

  str(as.matrix(iris[, -5]))
   num [1:150, 1:4] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
   - attr(*, "dimnames")=List of 2
    ..$ : NULL
    ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"

Some functions will do this coercion internally. For example:

  rowSums(iris)
  Error in rowSums(x, prod(dn), p, na.rm) : 'x' must be numeric

However:

  head(rowSums(iris[, -5]))
  [1] 10.2  9.5  9.4  9.4 10.2 11.4

HTH,
Marc Schwartz
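Rather than dropping the factor column by position, the numeric columns can be selected programmatically before a row function is applied. A minimal sketch with the same iris data:

```r
# TRUE for each numeric column, FALSE for the factor Species
num_cols <- sapply(iris, is.numeric)

# Row sums over the numeric columns only
row_totals <- rowSums(iris[, num_cols])

print(head(row_totals))
```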
Re: [R] applying math/stat functions to rows in data frame
On Sat, 2007-09-15 at 09:02 -0700, Gerard Smits wrote:
> Hi All, There are a variety of functions that can be applied to a variable
> (column) in a data frame: mean, min, max, sd, range, IQR, etc.

But on their own, these are not equivalents of rowMeans, rowSums etc. below.

> I am aware of only two that work on rows, using q1-q3 as example
> variables:
>
>   rowMeans(cbind(q1, q2, q3), na.rm = TRUE)  # mean of multiple variables
>   rowSums(cbind(q1, q2, q3), na.rm = TRUE)   # sum of multiple variables

If you really want to apply a function to the individual rows of a matrix-like object, then apply() is your friend. ?rowMeans states:

  Details:
  These functions are equivalent to use of 'apply' with 'FUN = mean' or
  'FUN = sum' with appropriate margins, but are a lot faster.

So see ?apply and its argument 'MARGIN'. For rows use MARGIN = 1, e.g.:

  dat <- matrix(runif(1000), ncol = 100)
  apply(dat, 1, mean)
  rowMeans(dat)

> Can the standard column functions (listed in the first sentence) be
> applied to rows, using the correct indexes to reference the columns of
> interest? Or must these summary functions be programmed separately to
> work on a row?

You can only use those functions on a column via subsetting, e.g.:

  mean(dat[,4])
  min(dat[,4])

If all you want is a single row (the equivalent of what you seem to be asking), then these also work:

  mean(dat[4,])
  min(dat[4,])

HTH

G

> Thanks,
> Gerard

-- 
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
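To make the apply() route concrete for the summaries named in the question, here is a sketch with invented q1 to q3 values, including one missing value. Extra arguments such as na.rm = TRUE are passed on to the summary function.

```r
dat <- data.frame(q1 = c(1, 4, NA), q2 = c(2, 5, 9), q3 = c(3, 6, 11))
cols <- c("q1", "q2", "q3")

# MARGIN = 1 applies the function to each row
row_sd  <- apply(dat[, cols], 1, sd,  na.rm = TRUE)
row_max <- apply(dat[, cols], 1, max, na.rm = TRUE)

# range() returns two values per row; transpose to get one row per case
row_rng <- t(apply(dat[, cols], 1, range, na.rm = TRUE))

print(row_sd)
print(row_max)
print(row_rng)
```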
Re: [R] Help with a problem
On Sat, 2007-09-15 at 12:11 -0400, Letticia Ramlal wrote:
> With a single R command, complete the following: create a vector called
> seqvec that repeats the sequence 1, 3, 6, 10, 15, 21 (I was trying to use
> c(), but this does not work); create a 5-row, 6-column matrix from seqvec,
> with each row containing the sequence above; and complete the two tasks
> above in a single step.

If that is just an example of an arbitrary sequence, then the following does what you want:

  res <- matrix(rep(c(1,3,6,10,15,21), 5), nrow = 5, byrow = TRUE)
  res
       [,1] [,2] [,3] [,4] [,5] [,6]
  [1,]    1    3    6   10   15   21
  [2,]    1    3    6   10   15   21
  [3,]    1    3    6   10   15   21
  [4,]    1    3    6   10   15   21
  [5,]    1    3    6   10   15   21

But if there is something special in the quoted sequence (it is cumsum(1:6)), then the following also does what you want:

  res2 <- matrix(rep(cumsum(1:6), 5), nrow = 5, byrow = TRUE)
  res2
       [,1] [,2] [,3] [,4] [,5] [,6]
  [1,]    1    3    6   10   15   21
  [2,]    1    3    6   10   15   21
  [3,]    1    3    6   10   15   21
  [4,]    1    3    6   10   15   21
  [5,]    1    3    6   10   15   21

  all.equal(res, res2)
  [1] TRUE

Take a look at ?rep and, although not needed in this case, ?seq for generating sequences and repeats.

HTH

G

> LTR

-- 
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
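If "a single R command" is taken literally, the assignment of seqvec can be nested inside the matrix() call. This is only one possible reading of the exercise. It relies on matrix() recycling its data to fill the requested dimensions, so rep() is not needed.

```r
# Creates seqvec and the 5 x 6 matrix in one command:
# the inner assignment returns its value, which matrix() recycles row by row
m <- matrix(seqvec <- cumsum(1:6), nrow = 5, ncol = 6, byrow = TRUE)

print(seqvec)
print(m)
```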
Re: [R] starting with a capital letter
On Sat, 15 Sep 2007, kevinchang wrote:
> Hi everyone, I am wondering if there is any built-in function that can
> determine whether words in a character vector start with a capital letter
> or not. Help, please. Thanks.

Yes. But your query is not precise. See the posting guide and provide commented, minimal, self-contained, reproducible code (as is requested) to be sure the answers you get address the question you really want answered.

I see several possibilities. In this vector:

  my.charvec <- c( "Abc", "abc Abc", "abc aBc" )

you may wish to match element 1 only, or elements 1 and 2 only, and perhaps report where in each element the last match was found.

  res <- regexpr( "\\<[[:upper:]].*" , my.charvec )

should get you started. Examples:

  which( res == 1 )   # first case
  which( res != -1 )  # second case

See ?regexpr. Also see ?strsplit, which I think would be needed to recover the locations of each of several capitalized words in a single element, e.g. "abc Def Ghi".

Chuck

Charles C. Berry                        (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]              UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
[R] Question about VarSelRF
Dear list members,

I am analyzing Affymetrix gene expression data and would like to apply the R package varSelRF to identify small sets of genes that could be used for diagnostic purposes. The data matrix is composed of 22277 rows (genes) and 65 columns (samples). I did unsupervised clustering using pvclust to get 4 classes. What I would like to do is find, for each class, the unique genes that best characterize it. I tried this and ran into a problem when running the code. The error message is:

  rf.vs1 <- varSelRF(exprSet, cl, ntree = 200, ntreeIterat = 100, vars.drop.frac = 0.2)
  error in randomForest.default(x = xdata, y = Class, ntree = ntree, mtry = mtry, :
    length of response must be the same as predictors

My code is:

  library(varSelRF)
  exprSet <- as.matrix(read.table('varSelRF_x.txt', header = FALSE))
  cl <- factor(c(rep("C", 2), rep("D", 2), rep("B", 1), rep("A", 1), rep("D", 1),
                 rep("B", 1), rep("C", 2), rep("B", 1), rep("D", 1), rep("A", 1),
                 rep("D", 2), rep("B", 1), rep("A", 1), rep("B", 1), rep("D", 1),
                 rep("B", 1), rep("C", 1), rep("D", 2), rep("C", 2), rep("B", 2),
                 rep("D", 1), rep("C", 1), rep("D", 1), rep("D", 1), rep("C", 1),
                 rep("B", 1), rep("C", 1), rep("A", 1), rep("C", 1), rep("B", 1),
                 rep("D", 3), rep("D", 1), rep("C", 1), rep("B", 2), rep("D", 1),
                 rep("D", 1), rep("B", 2), rep("D", 1), rep("B", 1), rep("C", 1),
                 rep("D", 1), rep("B", 3), rep("D", 5), rep("B", 1), rep("D", 2),
                 rep("B", 1), rep("D", 1)))
  rf.vs1 <- varSelRF(exprSet, cl, ntree = 200, ntreeIterat = 100, vars.drop.frac = 0.2)
  rf.vs1
  plot(rf.vs1)

Could you give me some suggestions as to what might be causing the error message? Thank you very much, and I am looking forward to your reply!

Best Regards,
Alex
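One plausible cause, offered as an assumption and not a confirmed diagnosis: the class vector above has 65 labels (one per sample), while the matrix has 22277 rows. varSelRF, like randomForest, expects cases in rows and variables in columns, so a genes-by-samples matrix would need transposing. The toy sketch below uses a made-up 5 x 4 matrix, named expr here, to show the check. The varSelRF call itself is left commented out.

```r
# Toy stand-in for the expression matrix: 5 genes (rows) x 4 samples (columns)
set.seed(1)
expr <- matrix(rnorm(20), nrow = 5, ncol = 4)
cl <- factor(c("A", "B", "A", "B"))   # one label per sample

# Genes in rows: the number of rows does not match the number of labels
mismatch <- nrow(expr) != length(cl)

# Transpose so that samples are in rows and genes in columns
x <- t(expr)
print(dim(x))
print(nrow(x) == length(cl))

# With the real data the call would then be, assuming varSelRF is installed:
# library(varSelRF)
# rf.vs1 <- varSelRF(x, cl, ntree = 200, ntreeIterat = 100, vars.drop.frac = 0.2)
```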
[R] naming columns of data frame
Hey,

I am trying to make a data frame in which the name of a column is composed of a number, a dot, and a word, such as 1.whatever. But I always get this error message when printing the data frame out:

  syntax error, unexpected SYMBOL, expecting ',' in:

When I rename the column using letters only, everything works fine. Any suggestions about the cause or a solution? Thanks.
Re: [R] naming columns of data frame
Try this:

  df <- data.frame('1.test' = rnorm(100), '2.test' = runif(100), check.names = FALSE)

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

On 15/09/2007, kevinchang [EMAIL PROTECTED] wrote:
> Hey, I am trying to make a data frame in which the name of a column is
> composed of a number, a dot, and a word, such as 1.whatever. But I always
> get this error message when printing the data frame out:
>
>   syntax error, unexpected SYMBOL, expecting ',' in:
>
> When I rename the column using letters only, everything works fine. Any
> suggestions about the cause or a solution? Thanks.
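A short sketch of what check.names changes and how such a column can be used afterwards. A name that starts with a digit is not a syntactically valid R name, so by default data.frame() passes it through make.names(), and the unmodified name has to be quoted with backticks (or used as a string) whenever it appears in code. The column name and values are illustrative.

```r
# Name kept exactly as given
df <- data.frame("1.whatever" = 1:3, check.names = FALSE)

# Default behaviour: the name is altered to make it syntactically valid
df2 <- data.frame("1.whatever" = 1:3)

print(names(df))
print(names(df2))

# Non-syntactic names need backticks with $, or a string with [[ ]]
col_a <- df$`1.whatever`
col_b <- df[["1.whatever"]]
```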
Re: [R] my previous message: Memory Management
Hi, I obviously did not include the subject title. I am looking for advice on memory management on a 64-bit machine. Thank you. TK
Re: [R] (no subject)
When you say you cannot import 4.8GB, is this the size of the text file that you are reading in? If so, what is the structure of the file? How are you reading in the file ('read.table', 'scan', etc.)? Do you really need all the data, or can you work with a portion at a time? If so, then consider putting the data in a database and retrieving the data as needed. If all the data is in an object, how big do you think this object will be (# rows, # columns, mode of the data)? So you need to provide some more information about the problem that you are trying to solve. On 9/15/07, [EMAIL PROTECTED] wrote: Hi, Let me apologize for this simple question. I use 64-bit R on my Fedora Core 6 Linux workstation. 64-bit R has saved a lot of time. I am sure this has a lot to do with my memory limit, but I cannot import 4.8GB. My workstation has 8GB RAM, an Athlon X2 5600, and a 1200W PSU. This PC configuration is the best I could get. I know a bit of C and Perl. Should I use C or Perl to manage this large dataset, or should I even go to 16GB RAM? Sorry for this silly question. But I would appreciate it if anyone could give me advice. Thank you very much. TK -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
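One way to "work with a portion at a time" without a database is to read the file through an open connection, a fixed number of rows per call. The following is only a sketch under stated assumptions: the file, its two columns, and the chunk size are made up for illustration, and the per-chunk processing (a running sum) stands in for whatever analysis is actually needed.

```r
# Write a small stand-in file (hypothetical data, not the poster's file).
path <- tempfile(fileext = ".txt")
write.table(data.frame(a = 1:10, b = (1:10) / 2), path,
            row.names = FALSE, col.names = FALSE)

# Read through an open connection so each read.table() call continues
# where the previous one stopped.
con <- file(path, open = "r")
chunk_size <- 4          # rows per chunk; illustrative only
total <- 0
rows <- 0
repeat {
  chunk <- tryCatch(
    read.table(con, header = FALSE, nrows = chunk_size,
               colClasses = c("integer", "numeric")),
    error = function(e) NULL   # read.table() errors when no lines remain
  )
  if (is.null(chunk)) break
  total <- total + sum(chunk$V2)   # process the chunk, keep only a summary
  rows <- rows + nrow(chunk)
  if (nrow(chunk) < chunk_size) break   # a short chunk means end of file
}
close(con)
```

Only one chunk is held in memory at a time, so the peak memory use depends on chunk_size rather than on the size of the file.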
Re: [R] Memory management
Hi, I apologize again for posting something not suitable for this list. Basically, it sounds like I should put this large dataset into a database... The dataset I have had trouble with is the transportation network of the Chicago Consolidated Metropolitan Statistical Area. The number of samples is about 7,200 points, and every point has outbound and inbound traffic flows: volumes, times, distances, etc. So a quick approximation of the number of rows would be 49,000,000 rows (and 249 columns). This is a text file. I could work with a portion of the data at a time, such as nearest neighbors or pairs of points. I used read.table('filename',header=F). I should probably use some bits of the data at a time instead of loading everything at once... I am learning RSQLite and RMySQL. As Mr. Wan suggests, I will learn C a bit more. Thank you very much. TK
Re: [R] Memory management
If your data file has 49M rows and 249 columns, and each column has 5 characters, then you are looking at a text file of about 60GB. If these were all numerics (8 bytes per number), then you are looking at an R object that would be almost 100GB. If this is your data, then this is definitely a candidate for a database, since you would otherwise need a fairly large machine (at least 300GB of real memory). You probably need to give some serious thought to how you want to store your data and then what type of processing you need to do on it. BTW, do you need all 249 columns, or could you work with just 3-4 columns at a time? (This at least makes an R object of about 1.5GB, which might be easier to handle.) -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
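The size estimates in the reply above can be checked with a few lines of arithmetic, and the "3-4 columns at a time" idea can be sketched with the colClasses argument of read.table(), where "NULL" skips a column. This is a sketch only: the row and column counts are the approximate figures from the thread, sizes use decimal GB (1e9 bytes), separators are ignored in the text-file estimate, and the small stand-in file and the choice of columns are hypothetical.

```r
rows <- 49e6             # approximate row count given in the thread
cols <- 249
bytes_per_double <- 8    # R stores numerics as 8-byte doubles

text_gb   <- rows * cols * 5 / 1e9                  # 5 characters per field
full_gb   <- rows * cols * bytes_per_double / 1e9   # all 249 columns as numerics
subset_gb <- rows * 4 * bytes_per_double / 1e9      # 4 columns only

# Reading only selected columns: "NULL" in colClasses skips a column entirely.
path <- tempfile(fileext = ".txt")   # small stand-in file, hypothetical data
write.table(matrix(1:20, nrow = 4), path, row.names = FALSE, col.names = FALSE)
keep <- c(1, 3)                      # columns to keep; illustrative only
classes <- rep("NULL", 5)
classes[keep] <- "numeric"
part <- read.table(path, header = FALSE, colClasses = classes)
```

Here text_gb comes to about 61, full_gb to about 97.6, and subset_gb to about 1.57, in line with the rounded figures quoted in the thread.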