Re: [R] problem installing Rpmi : mpi.h...Found in /usr/include/lam, yet libmpi

2007-09-15 Thread Ndoye Souleymane
Dear Mr Ripley, Dear all,

Could you please help me find an appropriate RPM package to install on 
Red Hat Enterprise Linux 5?

I have had trouble invoking R installed from R-2.5.1-1.fc7.i386.rpm. It 
starts normally and then drops me back to the shell prompt, as shown below.

How can I deal with this segmentation fault?

Thanks for your help,

Faithfully Yours,

Souleymane N'Doye
Statistician, Decision Support Systems consultant

Labstat Conseil
P. O. BOX 347, 00606 Nairobi Kenya
Email: [EMAIL PROTECTED]
tel. : +254 (20) 736 842 478
www.labstatconseil.com

*** caught segfault ***
address (nil), cause 'memory not mapped'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Erreur de segmentation


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] HTML reading,

2007-09-15 Thread christophe vuadens

Hello,

Sorry for my English. Within an R function, I want to read HTML files in order to
analyse their text. Does anybody know how I can read only the text, as plain text?

Thanks
-- 
View this message in context: 
http://www.nabble.com/HTML-reading%2C-tf4447190.html#a12688719
Sent from the R help mailing list archive at Nabble.com.



Re: [R] HTML reading,

2007-09-15 Thread Armin Goralczyk
On 9/15/07, christophe vuadens [EMAIL PROTECTED] wrote:

 Hello,

 Sorry for my English. Within an R function, I want to read HTML files in order to
 analyse their text. Does anybody know how I can read only the text, as plain text?

 Thanks
 --

Have a look at this:

http://gking.harvard.edu/readme/

I don't know how they strip the html tags exactly, but it is described
in the documents there. This is also a good tool for text analysis.
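
One simple way to strip tags is sketched below. It assumes that crude
regular-expression cleanup is good enough for the files at hand: it is not a
real HTML parser, it does not decode entities such as &amp;, and the helper
name html_to_text is made up for this illustration. It is not necessarily how
the tool linked above does it.

```r
# Hypothetical helper: reduce HTML source to plain text with regular expressions.
html_to_text <- function(html) {
  txt <- paste(html, collapse = "\n")       # join the lines into one string
  # drop <script> and <style> elements together with their contents
  txt <- gsub("(?is)<(script|style)[^>]*>.*?</\\1>", " ", txt, perl = TRUE)
  txt <- gsub("<[^>]+>", " ", txt)          # drop all remaining tags
  txt <- gsub("[[:space:]]+", " ", txt)     # collapse runs of whitespace
  gsub("^ | $", "", txt)                    # trim leading/trailing blank
}

page <- c("<html><body><h1>Title</h1>",
          "<p>Some <b>bold</b> text.</p></body></html>")
html_to_text(page)
```

For a file on disk, the input would come from readLines(), e.g.
html_to_text(readLines("page.html")), where "page.html" is a placeholder name.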

-- 
Armin Goralczyk, M.D.
Dept. of General Surgery
University of Göttingen
Göttingen, Germany
http://www.chirurgie-goettingen.de


Re: [R] RPM package for Linux RED HAT ENTERPRISE 5

2007-09-15 Thread Prof Brian Ripley

See http://cran.r-project.org/bin/linux/redhat/el5/i386/
The ReadMe there says they should work on RHEL5.

I've not used RHEL5, but have a little experience with Centos5, where R 
builds from the tarball without any problems at all.


On Sat, 15 Sep 2007, Ndoye Souleymane wrote:


Dear Mr Ripley, Dear all,

Could you please help me find an appropriate RPM package to install on 
Red Hat Enterprise Linux 5?


I have had trouble invoking R installed from R-2.5.1-1.fc7.i386.rpm. It 
starts normally and then drops me back to the shell prompt, as shown below.


How can I deal with this segmentation fault?

Thanks for your help,

Faithfully Yours,

Souleymane N'Doye
Statistician, Decision Support Systems consultant

Labstat Conseil
P. O. BOX 347, 00606 Nairobi Kenya
Email: [EMAIL PROTECTED]
tel. : +254 (20) 736 842 478
www.labstatconseil.com

*** caught segfault ***
address (nil), cause 'memory not mapped'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Erreur de segmentation





--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] how to change print limit on screen

2007-09-15 Thread Abu Naser

Hi all,

Is there any way I can change the print limit (getOption("max.print")) to 
unlimited, or to a specified limit?

Thanks in advance


[R] starting with a capital letter

2007-09-15 Thread kevinchang

Hi everyone,

I am wondering if there is any built-in function that can determine whether
words in a character vector start with a capital letter or not. Help,
please. Thanks.


Re: [R] starting with a capital letter

2007-09-15 Thread mel
kevinchang wrote:
 I am wondering if there is any built-in function that can determine whether
 words in a character vector start with a capital letter or not. Help,
 please. Thanks.

Do it yourself with tolower():
apply tolower() to the first letter of each word and compare the result with
the original letter.
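
The tolower() idea can be sketched as follows. The vector `words` and the
variable names are made up for illustration; a first character that is not a
letter (such as a digit) is reported as not capitalised.

```r
words <- c("Abc", "aBc", "abC", "1bc")

first <- substr(words, 1, 1)               # first character of each word
# tolower() only changes upper-case letters, so a difference means
# the first character was a capital letter
starts_upper <- first != tolower(first)
starts_upper
```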



Re: [R] starting with a capital letter

2007-09-15 Thread Ted Harding
On 15-Sep-07 10:21:19, kevinchang wrote:
 
 Hi everyone,
 
 I am wondering if there is any built-in function that can
 determine whether words in a character vector start with
 a capital letter or not. Help, please. Thanks.

Something like:

C <- c("Abc", "aBc", "abC")

for (i in seq_along(C)) {
  if (length(grep("^[A-Z]", C[i])) > 0) {
    print("Yes")
  } else {
    print("No")
  }
}


[1] "Yes"
[1] "No"
[1] "No"


The grep expression "[A-Z]" looks for one of A,B,C,...,Z
and the "^" makes it look for it at the start of the string.
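
The same test can also be written without a loop. This is only a sketch of a
vectorised variant, not part of the original reply; it uses grepl() and the
character class [[:upper:]], which is less locale-dependent than the range A-Z.

```r
C <- c("Abc", "aBc", "abC")

# one logical value per element: does it start with an upper-case letter?
starts_cap <- grepl("^[[:upper:]]", C)
answer <- ifelse(starts_cap, "Yes", "No")
answer
```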

Hoping this helps,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 15-Sep-07   Time: 12:55:00
-- XFMail --



Re: [R] HTML reading,

2007-09-15 Thread Gabor Grothendieck
Check out:

https://stat.ethz.ch/pipermail/r-help/2007-August/137742.html

On 9/15/07, christophe vuadens [EMAIL PROTECTED] wrote:

 Hello,

 Sorry for my English. Within an R function, I want to read HTML files in order to
 analyse their text. Does anybody know how I can read only the text, as plain text?

 Thanks


[R] Class probabilities in rpart

2007-09-15 Thread Christian Schäfer
Hi,

the predict.rpart() function from the rpart package allows for 
calculating the class probabilities for a given test case instead of a 
discrete class label.

How are these class probabilities derived? Is it simply the proportion 
of the majority class to all cases in a leaf node?

Thanks in advance,
Chris



[R] Storing Variables of different types

2007-09-15 Thread Garavito,Fabian

Hi there,

I have an i x j x k array where I want to store dates in the first column of
all sub-matrices (i.e. j=1 is a column with dates) and real numbers in
the rest of the columns...I have been trying many things, but I am
not getting anywhere. 

Thank you very much for your help,

Fabian




[R] generate ROC curve using randomForest package

2007-09-15 Thread L L
Hi,

I am new here. I would like to compare the performance of a random forest 
model with that of a support vector machine. Can anybody tell me how to 
generate a ROC curve for a random forest model, given that there is no need 
to run cross-validation? Thank you very much!

TL



[R] Survival model (time to event data)

2007-09-15 Thread Daniel Malter
Hi all, I would appreciate input about how the following survival model
can be modeled in R and how competing risk models can generally be modeled.
I would also appreciate hints about resources that you are aware of that
explain the use of survival models in R in greater detail. 

The structure of my data is shown below. My problem is that I don't
know how to model 4 different events in the same hazard model for which the
hazards are conditional on some other factor.

Conditions: 

0. All events are mutually exclusive
1. Either no event, Event1, or one of the Events 2-4 occurs (i.e. events 2-4
are competing)
2. Event1 can only occur if St.Beg=0 (it switches St.End from this period
and St.Beg from the following periods on to 1 until Event4 occurs).
3. Event2-4 can only occur if St.Beg=1

Time  St.Beg  St.End  Event1  Event2  Event3  Event4  Number
1     0       0       0       0       0       0       0
2     0       0       0       0       0       0       0
3     0       1       1       0       0       0       0
4     1       1       0       0       0       0       10
5     1       1       0       1       0       0       10
6     1       1       0       1       0       0       15
7     1       1       0       0       0       0       20
8     1       1       0       0       0       0       20
9     1       1       0       0       1       0       10
10    1       0       0       0       0       1       0

Thanks much for your help,
Daniel





-
cuncta stricte discussurus



Re: [R] generate ROC curve using randomForest package

2007-09-15 Thread Christian Schäfer
L L wrote:

 I am new here. I would like to compare the performance of the random forest 
 model with support vector machine. Can  anybody let me know how to generate 
 a ROC curve for random forest model since there is no need to run the 
 cross-validation. Thank you very much!

The ROCR package provides performance measures like AUC, Sensitivity or 
ROC curves.
Especially the performance() function is of interest.
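
With ROCR installed, the usual route is prediction() followed by
performance(). To show what such a ROC computation boils down to, here is a
dependency-free sketch in base R. The scores and labels are invented toy data;
for a random forest they could be, for example, the out-of-bag vote fractions
for the positive class (an assumption about the workflow, not something stated
in this thread). Ties between scores are ignored for simplicity.

```r
# Toy data: higher score = more evidence for the positive class (label 1)
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
label <- c(1,   1,   0,   1,   0,   0)

# ROC points: sweep the threshold from the highest score downwards
roc_points <- function(score, label) {
  lab <- label[order(score, decreasing = TRUE)]
  data.frame(fpr = c(0, cumsum(lab == 0) / sum(lab == 0)),
             tpr = c(0, cumsum(lab == 1) / sum(lab == 1)))
}

# Area under the curve by the trapezoid rule
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

roc <- roc_points(score, label)
auc <- auc_trapezoid(roc)
plot(roc$fpr, roc$tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
```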

Chris



Re: [R] Storing Variables of different types

2007-09-15 Thread mel
Garavito,Fabian wrote:

 Hi there,
 I have an ixjxk array where I want to store dates in the first column of
 all sub-matrices (i.e. j=1 is a column with dates) and real numbers in
 the rest of the columns...I have been trying many things, but I am
 not getting anywhere. 
 Thank you very much for your help,
 Fabian

See ?data.frame,
or think about storing the dates in the row names.
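
Both suggestions can be sketched like this. An array can hold only one type,
so the sketch uses either a list of data frames (one per slice k) or a numeric
array whose row names carry the dates. All object names and values are
invented for illustration.

```r
dates <- as.Date("2007-09-01") + 0:2

# Option 1: one data frame per slice; each column keeps its own type
slices <- lapply(1:2, function(k) {
  data.frame(date = dates,
             x = k * c(1.5, 2.5, 3.5),
             y = k * c(10, 20, 30))
})
str(slices[[1]])

# Option 2: a purely numeric 3 x 2 x 2 array with the dates as row names
arr <- array(seq(0.1, 1.2, by = 0.1), dim = c(3, 2, 2),
             dimnames = list(format(dates), c("x", "y"), NULL))
arr["2007-09-02", "x", 1]      # look a value up by date
as.Date(rownames(arr))         # recover the dates as Date objects
```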



Re: [R] how to change print limit on screen

2007-09-15 Thread jim holtman
?options

> options(max.print=10)
> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
 [ reached getOption("max.print") -- omitted 0 entries ]]
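
A slightly fuller sketch of the same option, showing how to set a specific
limit, how to raise it to the largest integer (which is effectively unlimited)
and how to restore the previous value. The variable names are made up.

```r
old <- options(max.print = 50)             # returns the previous setting
limit <- getOption("max.print")            # now 50

options(max.print = .Machine$integer.max)  # effectively unlimited

options(old)                               # restore the earlier limit
```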


On 9/15/07, Abu Naser [EMAIL PROTECTED] wrote:

 Hi all user,

 Is there any way I can change the print limit (getOption("max.print")) to 
 unlimited, or to a specified limit?

 Thanks in advance



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



[R] applying math/stat functions to rows in data frame

2007-09-15 Thread Gerard Smits
Hi All,

There are a variety of functions that can be applied to a variable 
(column) in a data frame: mean, min, max, sd, range, IQR, etc.

I am aware of only two that work on the rows, using q1-q3 as example 
variables:

rowMeans(cbind(q1,q2,q3),na.rm=T)   #mean of multiple variables
rowSums (cbind(q1,q2,q3),na.rm=T)   #sum of multiple variables

Can the standard column functions (listed in the first sentence) be 
applied to rows, with the use of correct indexes to reference the 
columns of interest?  Or, must these summary functions be programmed 
separately to work on a row?

Thanks,

Gerard





Re: [R] applying math/stat functions to rows in data frame

2007-09-15 Thread Robert A LaBudde
At 12:02 PM 9/15/2007, Gerald wrote:
Hi All,

There are a variety of functions that can be applied to a variable
(column) in a data frame: mean, min, max, sd, range, IQR, etc.

I am aware of only two that work on the rows, using q1-q3 as example
variables:

rowMeans(cbind(q1,q2,q3),na.rm=T)   #mean of multiple variables
rowSums (cbind(q1,q2,q3),na.rm=T)   #sum of multiple variables

Can the standard column functions (listed in the first sentence) be
applied to rows, with the use of correct indexes to reference the
columns of interest?  Or, must these summary functions be programmed
separately to work on a row?

Try using t() to transpose the matrix, and then apply the column 
function of interest.
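
A minimal sketch of the transpose idea, with invented values for q1 to q3:
after t(), each original row is a column, so the usual column-wise tools
apply.

```r
q <- cbind(q1 = c(1, 4, 7), q2 = c(2, 5, 9), q3 = c(3, 6, 8))

qt <- as.data.frame(t(q))      # each original row is now a column (V1, V2, V3)
row_sd  <- sapply(qt, sd)      # standard deviation of each original row
row_max <- sapply(qt, max)     # maximum of each original row
row_sd
row_max
```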


Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
824 Timberlake Drive                     Tel: 757-467-0954
Virginia Beach, VA 23464-3239            Fax: 757-467-2947

Vere scire est per causas scire



[R] Help with a problem

2007-09-15 Thread Letticia Ramlal
Hello 
I was wondering if anyone can help me with this problem. It seems trivial, but 
for some reason I cannot figure it out.
 
With a single R command, complete the following:
create a vector called seqvec that repeats the sequence 1, 3, 6, 10, 15, 21 (I 
was trying to use c(), but this does not work);
create a 5-row, 6-column matrix from seqvec, with each row containing the sequence 
from before;
and complete the two tasks above in a single step.
 
LTR



Re: [R] Help with a problem

2007-09-15 Thread Marc Schwartz
On Sat, 2007-09-15 at 12:11 -0400, Letticia Ramlal wrote:
 Hello 
 I was wonderinf if anyone can help me with this problem, it seems trivial but 
 for some reason I can not figure it out.
  
 With a single R command complete the following:
 create a vector calles seqvec that repeats the sequence 1, 3,6, 10,15,21.( I 
 was trying to use c() but this does not work) 
 create a 5-row, 6-column matirx from seqvec wuth each row containg the 
 sequence from before 
 and complete the two task above in a single step.
  
 LTR

Is this what you want?

seqvec <- cumsum(1:6)

> seqvec
[1]  1  3  6 10 15 21


Or to address both:

> matrix(rep(cumsum(1:6), 5), ncol = 6, byrow = TRUE)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    3    6   10   15   21
[2,]    1    3    6   10   15   21
[3,]    1    3    6   10   15   21
[4,]    1    3    6   10   15   21
[5,]    1    3    6   10   15   21


See ?cumsum and ?rep
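
If "a single step" is meant literally, i.e. both seqvec and the matrix must
come out of one command, one possible reading is sketched below. It relies on
two things: an assignment can be written inside a function call, and matrix()
recycles its data when both dimensions are given. The name m is made up.

```r
# One command: seqvec is created while the matrix is being built, and
# matrix() recycles the six values to fill all five rows
m <- matrix(seqvec <- cumsum(1:6), nrow = 5, ncol = 6, byrow = TRUE)

seqvec
m
```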

HTH,

Marc Schwartz



[R] Pulling out parts of a generated array in R

2007-09-15 Thread Wayne Aldo Gavioli
Hello all,

I was wondering if it was possible to pull out certain parts of an array in R -
not an array of data that I have created, but an array of data that has been
spit out by R itself.

More specifically, in the lines of code below:


> summary(prcomp(USArrests))
Importance of components:
                          PC1     PC2    PC3     PC4
Standard deviation     83.732 14.2124 6.4894 2.48279
Proportion of Variance  0.966  0.0278 0.0058 0.00085
Cumulative Proportion   0.966  0.9933 0.9991 1.0



I was wondering if there is a way to only extract one piece of this array,
specifically the Proportion of Variance for PC2, which is 0.0278.  I know how
to extract one entire line of data from this array, using the following lines
of code:

> result <- summary(prcomp(USArrests))
> m <- result$importance
> final <- m[2,]


These lines of code will produce the following output:


0.966  0.0278 0.0058 0.00085


Now I was wondering if there is any way to break this down even further, and be
able to extract one piece of data from this one line.

If anyone could help me out, I would really appreciate it.


Thanks,


Wayne



Re: [R] Pulling out parts of a generated array in R

2007-09-15 Thread Uwe Ligges


Wayne Aldo Gavioli wrote:
 Hello all,
 
 I was wondering if it was possible to pull out certain parts of an array in R 
 -
 not an array of data that I have created, but an array of data that has been
 spit out by R itself.
 
 More specifically, in the lines of code below:
 
 
 > summary(prcomp(USArrests))
 Importance of components:
                           PC1     PC2    PC3     PC4
 Standard deviation     83.732 14.2124 6.4894 2.48279
 Proportion of Variance  0.966  0.0278 0.0058 0.00085
 Cumulative Proportion   0.966  0.9933 0.9991 1.0
 
 
 
 I was wondering if there is a way to only extract one piece of this array,
 specifically the Proportion of Variance for PC2, which is 0.0278.  I know how
 to extract one entire line of data from this array, using the following lines
 of code:
 
 > result <- summary(prcomp(USArrests))
 > m <- result$importance
 > final <- m[2,]
 
 
 These lines of code will produce the follwing output:
 
 
 0.966  0.0278 0.0058 0.00085

 
 Now I was wondering if there is anyway to break this down even further, and be
 able to extract one piece of data from this one line.

I am sure you have already read about matrix indexing in the manual "An 
Introduction to R", but here is a reminder of how easy it is to get the 
second row, third column:

final <- m[2,3]
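
For the value actually asked about (Proportion of Variance for PC2), the same
indexing applies with column 2. A short sketch, also showing indexing by row
and column names, which is often easier to read than positions; the variable
names by_position and by_name are made up:

```r
m <- summary(prcomp(USArrests))$importance

by_position <- m[2, 2]                              # second row, second column
by_name     <- m["Proportion of Variance", "PC2"]   # same cell, by name
by_position
```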

Uwe Ligges


 
 If anyone could help me out, I would really appreciate it.
 
 
 Thanks,
 
 
 Wayne
 


Re: [R] applying math/stat functions to rows in data frame

2007-09-15 Thread Marc Schwartz
On Sat, 2007-09-15 at 09:02 -0700, Gerard Smits wrote:
 Hi All,
 
 There are a variety of functions that can be applied to a variable 
 (column) in a data frame: mean, min, max, sd, range, IQR, etc.
 
 I am aware of only two that work on the rows, using q1-q3 as example 
 variables:
 
 rowMeans(cbind(q1,q2,q3),na.rm=T)   #mean of multiple variables
 rowSums (cbind(q1,q2,q3),na.rm=T)   #sum of multiple variables
 
 Can the standard column functions (listed in the first sentence) be 
 applied to rows, with the use of correct indexes to reference the 
 columns of interest?  Or, must these summary functions be programmed 
 separately to work on a row?
 
 Thanks,
 
 Gerard

The answer is: it depends

If the row can be coerced to a numeric vector, then yes. This presumes
that the data frame contains a single data type or the subset of columns
you need contains a single data type.

If the row contains multiple data types, then the row becomes a single
row data frame or a list and you would have to consider other possible
approaches.

For example:

Taking the first row of the 'iris' dataset becomes a single row data
frame:

> str(iris[1, ])
'data.frame':   1 obs. of  5 variables:
 $ Sepal.Length: num 5.1
 $ Sepal.Width : num 3.5
 $ Petal.Length: num 1.4
 $ Petal.Width : num 0.2
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1

or if you set 'drop = TRUE', a list:

> str(iris[1, , drop = TRUE])
List of 5
 $ Sepal.Length: num 5.1
 $ Sepal.Width : num 3.5
 $ Petal.Length: num 1.4
 $ Petal.Width : num 0.2
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1


If however, you remove the last column Species, which is a factor, you
can coerce the remaining object to a numeric matrix:

> str(as.matrix(iris[, -5]))
 num [1:150, 1:4] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"



Some functions will do this coercion internally:

For example:

> rowSums(iris)
Error in rowSums(x, prod(dn), p, na.rm) : 'x' must be numeric


However:

> head(rowSums(iris[, -5]))
[1] 10.2  9.5  9.4  9.4 10.2 11.4
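
Rather than dropping the factor column by position, the numeric columns can
be picked out programmatically. This is only a sketch of that idea; the names
num_cols and x are invented.

```r
num_cols <- sapply(iris, is.numeric)   # TRUE for the four measurement columns
x <- as.matrix(iris[, num_cols])       # numeric matrix, factor column left out

row_totals <- rowSums(x)               # row-wise sums
row_maxima <- apply(x, 1, max)         # any other summary, row by row
head(row_totals)
```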


HTH,

Marc Schwartz



Re: [R] applying math/stat functions to rows in data frame

2007-09-15 Thread Gavin Simpson
On Sat, 2007-09-15 at 09:02 -0700, Gerard Smits wrote:
 Hi All,
 
 There are a variety of functions that can be applied to a variable 
 (column) in a data frame: mean, min, max, sd, range, IQR, etc.

But on their own, these are not equivalent to rowMeans, rowSums etc.
below.

 
 I am aware of only two that work on the rows, using q1-q3 as example 
 variables:
 
 rowMeans(cbind(q1,q2,q3),na.rm=T)   #mean of multiple variables
 rowSums (cbind(q1,q2,q3),na.rm=T)   #sum of multiple variables

If you really want to apply a function to the individual rows of a
matrix-like object then apply() is your friend:

?rowMeans states:

Details:

 These functions are equivalent to use of 'apply' with 'FUN = mean'
 or 'FUN = sum' with appropriate margins, but are a lot faster.

So see ?apply and its argument 'MARGIN'. For rows use MARGIN = 1, e.g.:

dat <- matrix(runif(1000), ncol = 100)
apply(dat, 1, mean)
rowMeans(dat)
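
The same pattern works for the other summaries mentioned in the question.
A small sketch (the seed and the result names are arbitrary); note that a
function returning two values, such as range(), gives a matrix with one
column per row of the input:

```r
set.seed(1)
dat <- matrix(runif(1000), ncol = 100)   # 10 rows, 100 columns

row_sd  <- apply(dat, 1, sd)             # one value per row
row_iqr <- apply(dat, 1, IQR)
row_rng <- apply(dat, 1, range)          # 2 x 10 matrix: min and max per row

same_means <- all.equal(apply(dat, 1, mean), rowMeans(dat))
```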


 
 Can the standard column functions (listed in the first sentence) be 
 applied to rows, with the use of correct indexes to reference the 
 columns of interest?  Or, must these summary functions be programmed 
 separately to work on a row?

You can only use those functions on a column via subsetting, e.g.:

mean(dat[,4])
min(dat[,4])

If all you want is a single row (the equivalent of what you seem to be
asking) then these also work:

mean(dat[4,])
min(dat[4,])

HTH

G

 
 Thanks,
 
 Gerard
 
 
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Help with a problem

2007-09-15 Thread Gavin Simpson
On Sat, 2007-09-15 at 12:11 -0400, Letticia Ramlal wrote:
 Hello 
 I was wonderinf if anyone can help me with this problem, it seems
 trivial but for some reason I can not figure it out.
  
 With a single R command complete the following:
 create a vector calles seqvec that repeats the sequence 1, 3,6,
 10,15,21.( I was trying to use c() but this does not work) 
 create a 5-row, 6-column matirx from seqvec wuth each row containg the
 sequence from before 
 and complete the two task above in a single step.

If that is just an example of an arbitrary sequence, then the following
does what you want:

> res <- matrix(rep(c(1,3,6,10,15,21), 5), nrow = 5, byrow = TRUE)
> res
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    3    6   10   15   21
[2,]    1    3    6   10   15   21
[3,]    1    3    6   10   15   21
[4,]    1    3    6   10   15   21
[5,]    1    3    6   10   15   21

But if there is something special in the quoted sequence (it is
cumsum(1:6) ), then the following also does what you want:

> res2 <- matrix(rep(cumsum(1:6), 5), nrow = 5, byrow = TRUE)
> res2
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    3    6   10   15   21
[2,]    1    3    6   10   15   21
[3,]    1    3    6   10   15   21
[4,]    1    3    6   10   15   21
[5,]    1    3    6   10   15   21
> all.equal(res, res2)
[1] TRUE

Take a look at ?rep and, although not needed in this case, ?seq for
generating sequences and repeats.

HTH

G

  
 LTR
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] starting with a capital letter

2007-09-15 Thread Charles C. Berry
On Sat, 15 Sep 2007, kevinchang wrote:


 Hi everyone,

 I am wondering if there is any built-in function that can determine whether
 words in a character vector start with a capital letter or not. Help,
 please. Thanks.

Yes. But your query is not precise. See the posting guide and provide 
commented, minimal, self-contained, reproducible code (as is requested) to 
be sure the answers you get address the question you really want answered.

I see several possibilities.

In this vector:

my.charvec <- c( "Abc", "abc Abc", "abc aBc" )

You wish to match element 1 only or 1 and 2 only and perhaps report where 
in each element the match was found.


res <- regexpr( "\\<[[:upper:]].*" , my.charvec )

should get you started. Examples:

which( res == 1 ) # first case

which( res != -1 ) # second case

See

?regexpr

Also,

?strsplit

which I think would be needed to recover the locations of each of 
several capitalized words in a single element, e.g. "abc Def Ghi".
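
Put together, the two cases and the strsplit() idea might look like the
sketch below. It assumes the pattern is meant to start with the word-boundary
marker \\< (start of a word), which appears to have been lost in the archived
copy; the names words and caps are invented.

```r
my.charvec <- c("Abc", "abc Abc", "abc aBc")

# position of the first word that begins with an upper-case letter (-1 if none)
res <- regexpr("\\<[[:upper:]]", my.charvec)

which(res == 1)     # first case: the element itself starts with a capital
which(res != -1)    # second case: some word in the element does

# several capitalised words in one element: split into words first
words <- strsplit("abc Def Ghi", " ")[[1]]
caps  <- words[grepl("^[[:upper:]]", words)]
caps
```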

Chuck



Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] Question about VarSelRF

2007-09-15 Thread ssls sddd
Dear list members,

I am analyzing Affymetrix gene expression data and would like to
apply the R package varSelRF to identify small sets of genes that could
be used for diagnostic purposes.

Basically, the data matrix is composed of 22277 rows (genes) and 65 columns
(samples).
I did unsupervised clustering using pvclust to get 4 classes. What I would
like to do is
to get unique genes for each class which can best characterize them.

I tried this and ran into a problem when running the code.

The error message is:

> rf.vs1 <- varSelRF(exprSet, cl, ntree = 200, ntreeIterat = 100,
+                    vars.drop.frac = 0.2)
Error in randomForest.default(x = xdata, y = Class, ntree = ntree, mtry = mtry,  :
        length of response must be the same as predictors

My code is:

library(varSelRF)
exprSet <- as.matrix(read.table('varSelRF_x.txt', header = FALSE))
cl <- factor(c(rep("C", 2), rep("D", 2), rep("B", 1), rep("A", 1), rep("D", 1),
    rep("B", 1), rep("C", 2), rep("B", 1), rep("D", 1), rep("A", 1),
    rep("D", 2), rep("B", 1), rep("A", 1), rep("B", 1), rep("D", 1),
    rep("B", 1), rep("C", 1), rep("D", 2), rep("C", 2), rep("B", 2),
    rep("D", 1), rep("C", 1), rep("D", 1), rep("D", 1), rep("C", 1),
    rep("B", 1), rep("C", 1), rep("A", 1), rep("C", 1), rep("B", 1),
    rep("D", 3), rep("D", 1), rep("C", 1), rep("B", 2), rep("D", 1),
    rep("D", 1), rep("B", 2), rep("D", 1), rep("B", 1), rep("C", 1),
    rep("D", 1), rep("B", 3), rep("D", 5), rep("B", 1), rep("D", 2),
    rep("B", 1), rep("D", 1)))
rf.vs1 <- varSelRF(exprSet, cl, ntree = 200, ntreeIterat = 100,
                   vars.drop.frac = 0.2)
rf.vs1
plot(rf.vs1)

Could you give me some suggestions as to what might be causing this error
message?

Thank you very much and I am looking forward to your reply!

Best Regards,

Alex
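
One likely cause, assuming varSelRF_x.txt is laid out as described (genes in
rows, samples in columns): the class vector above holds 65 labels, one per
sample, while varSelRF and randomForest expect one row per sample, so the
labels end up being compared against 22277 rows. The sketch below uses a small
made-up matrix (the names expr and x and the toy sizes are illustrative only)
to show the dimension check and the transpose; the real call is left commented
out.

```r
set.seed(42)

# Toy stand-in for the expression matrix: 10 genes (rows) x 6 samples (columns)
expr <- matrix(rnorm(60), nrow = 10, ncol = 6)
cl <- factor(c("A", "A", "B", "B", "C", "C"))   # one label per sample

# Rows are genes here, so the number of rows does not match the labels
rows_match_before <- nrow(expr) == length(cl)

# Transpose so that samples are in rows and genes are in columns
x <- t(expr)
rows_match_after <- nrow(x) == length(cl)

print(c(before = rows_match_before, after = rows_match_after))

# With the real data the call would then look like this (not run here):
# rf.vs1 <- varSelRF(t(exprSet), cl, ntree = 200, ntreeIterat = 100,
#                    vars.drop.frac = 0.2)
```

If the dimensions already match after reading the file, the cause lies
elsewhere and this transpose should not be applied.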



[R] naming columns of data frame

2007-09-15 Thread kevinchang

Hey,

I am trying to make a data frame in which the name of a column is composed of a
number, a dot, and a word, such as "1.whatever". But I always get this error
message while printing the data frame out: "syntax error, unexpected SYMBOL,
expecting ',' in:". When I rename the column using only letters, everything
works fine. Any suggestions about the cause or a solution? Thanks.


Re: [R] naming columns of data frame

2007-09-15 Thread Henrique Dallazuanna
Try this:

df <- data.frame('1.test' = rnorm(100), '2.test' = runif(100), check.names = FALSE)
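
A short sketch of what check.names changes, and of how to refer to such a
column afterwards. The object names df_raw and df_fixed are illustrative only.
Names that start with a digit are not syntactic R names, so they need quoting
or backticks whenever they are typed in code.

```r
set.seed(1)

# check.names = FALSE keeps the names exactly as given
df_raw <- data.frame("1.test" = rnorm(5), "2.test" = runif(5),
                     check.names = FALSE)

# The default (check.names = TRUE) passes the names through make.names(),
# which prepends "X" to names that start with a digit
df_fixed <- data.frame("1.test" = rnorm(5), "2.test" = runif(5))

print(names(df_raw))
print(names(df_fixed))

# Non-syntactic names must be quoted or wrapped in backticks when used
col_by_string   <- df_raw[["1.test"]]
col_by_backtick <- df_raw$`1.test`
```

Writing df_raw$1.test without backticks is a syntax error, which may be what
produced the message in the original question.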


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] my previous message: Memory Management

2007-09-15 Thread tkobayas
Hi,

I obviously forgot to include a subject line. I am looking for advice on
memory management on a 64-bit machine.

Thank you.

TK



Re: [R] (no subject)

2007-09-15 Thread jim holtman
When you say you cannot import 4.8GB, is this the size of the text
file that you are reading in?  If so, what is the structure of the
file?  How are you reading in the file ('read.table', 'scan', etc.)?

Do you really need all the data or can you work with a portion at a
time?  If so, then consider putting the data in a database and
retrieving the data as needed.  If all the data is in an object, how
big do you think this object will be? (# rows, # columns, mode of the
data).

So you need to provide some more information as to the problem that
you are trying to solve.

On 9/15/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Hi,

 Let me apologize for this simple question.

 I use 64-bit R on my Fedora Core 6 Linux workstation. 64-bit R has
 saved me a lot of time. I am sure this has a lot to do with my memory
 limit, but I cannot import 4.8GB. My workstation has 8GB of RAM, an Athlon
 X2 5600, and a 1200W PSU. This PC configuration is the best I could get.

 I know a bit of C and Perl. Should I use C or Perl to manage this large
 dataset? Or should I even go to 16GB of RAM?

 Sorry for this silly question, but I would appreciate it if anyone could
 give me advice.

 Thank you very much.

 TK




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] Memory management

2007-09-15 Thread Takatsugu Kobayashi
Hi,

I apologize again for posting something not suitable on this list.

Basically, it sounds like I should put this large dataset into a
database. The dataset I have had trouble with is the transportation
network of the Chicago Consolidated Metropolitan Statistical Area. There
are about 7,200 sample points, and every point has outbound
and inbound traffic flows: volumes, times, distances, etc. So a quick
approximation of the number of rows would be
49,000,000 rows (and 249 columns).

This is a text file. I could work with a portion of the data at a time,
such as nearest neighbors or pairs of points.

I used read.table('filename', header=F). I should probably read a portion
of the data at a time instead of loading it all at once.
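
A minimal sketch of reading a text file a piece at a time with read.table on
an open connection, so that only one chunk is in memory at once. The temporary
file, the chunk size, and the running column sums are all made up for
illustration; the real file and the per-chunk processing would take their
place.

```r
# Small stand-in file: 20 rows, 3 columns, no header
tmp <- tempfile(fileext = ".txt")
write.table(matrix(1:60, ncol = 3), tmp, row.names = FALSE, col.names = FALSE)

con <- file(tmp, open = "rt")   # an open connection keeps its read position
chunk_size <- 7                 # rows per chunk (illustrative value)
total_rows <- 0
col_sums <- numeric(3)

repeat {
  # read.table signals an error once no lines are left, so treat that as the end
  chunk <- tryCatch(read.table(con, header = FALSE, nrows = chunk_size),
                    error = function(e) NULL)
  if (is.null(chunk)) break

  # Per-chunk work goes here; this example only accumulates simple totals
  total_rows <- total_rows + nrow(chunk)
  col_sums <- col_sums + colSums(chunk)

  if (nrow(chunk) < chunk_size) break   # a short chunk means the file is done
}
close(con)
unlink(tmp)

print(total_rows)
print(col_sums)
```

Passing colClasses to read.table, when the column types are known in advance,
usually reduces both the time and the memory each chunk needs.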

I am learning RSQLite and RMySQL. As Mr. Wan suggests, I will learn C a 
bit more.

Thank you very much.

TK



Re: [R] Memory management

2007-09-15 Thread jim holtman
If your data file has 49M rows and 249 columns, and each column has
5 characters, then you are looking at a text file of about 60GB.  If these
were all numerics (8 bytes per number), then you are looking at an R
object that would be almost 100GB.  If this is your data, then this is
definitely a candidate for a database, since you would need a fairly
large machine (at least 300GB of real memory).
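
The estimates above can be checked with a few lines of arithmetic. This is
only a back-of-the-envelope sketch: it uses decimal gigabytes (1e9 bytes),
ignores separators and line endings in the text file, and the variable names
are illustrative.

```r
n_rows <- 49e6   # rows quoted in the thread
n_cols <- 249    # columns quoted in the thread

# Text file: 5 characters (bytes) per field
text_gb <- n_rows * n_cols * 5 / 1e9

# R numeric matrix or data frame: 8 bytes per double
numeric_gb <- n_rows * n_cols * 8 / 1e9

# Working with only 4 numeric columns at a time
four_col_gb <- n_rows * 4 * 8 / 1e9

print(c(text = text_gb, numeric = numeric_gb, four_columns = four_col_gb))
```

The three figures come out at roughly 61, 98, and 1.6 GB, in line with the
rounded numbers in the message.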

You probably need to give some serious thought to how you want to
store your data and then what type of processing you need to do on it.
BTW, do you need all 249 columns, or could you work with just 3-4
columns at a time?  (This at least makes an R object of about 1.5GB,
which might be easier to handle.)
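
One way to read only a few columns, sketched under the assumption that the
file is plain delimited text: read.table skips any column whose colClasses
entry is "NULL". The temporary file, the choice of columns 1 and 3, and the
names keep, classes, and subset_df are illustrative only.

```r
# Small stand-in file with 4 columns and no header
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(a = 1:5, b = letters[1:5], c = (1:5) / 2, d = 5:1),
            tmp, row.names = FALSE, col.names = FALSE)

keep <- c(1, 3)               # positions of the columns to read
classes <- rep("NULL", 4)     # "NULL" tells read.table to skip a column
classes[keep] <- "numeric"    # read the kept columns as doubles

subset_df <- read.table(tmp, header = FALSE, colClasses = classes)
unlink(tmp)

print(dim(subset_df))
print(subset_df)
```

With 249 columns the same pattern applies, with rep("NULL", 249) and the
positions of the 3-4 columns of interest.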

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
