[R] Beginner question: select cases

2006-09-25 Thread Peter Wolkerstorfer - CURE
Hello all,

I hope I chose the right list, as my question is a beginner question.

I have a data set with 3 columns, London, Rome and Vienna. The
location is indicated by a 1, like this:
London  Rome  Vienna  q1
0       0     1       4
0       1     0       2
1       0     0       3




I just want to calculate the means of a variable q1.

I tried the following script:

# calculate the mean of all locations
results <- subset(results, subset == 1)
mean(results$q1)
# calculate the mean of London
results <- subset(results, subset == 1, select = c(London))
mean(results$q1)
# calculate the mean of Rome
results <- subset(results, subset == 1, select = c(Rome))
mean(results$q1)
# calculate the mean of Vienna
results <- subset(results, subset == 1, select = c(Vienna))
mean(results$q1)

All the results are 1.68, yet there is definitely a difference between the
three locations, so I wonder what's going on.
I am also confused because Rcmdr asks me to overwrite things and there is no
plain "filter" option.

Any help would be appreciated. Thank you in advance.

Regards
Peter



___CURE - Center for Usability Research & Engineering___
 
Peter Wolkerstorfer
Usability Engineer
Hauffgasse 3-5, 1110 Wien, Austria
 
[Tel]  +43.1.743 54 51.46
[Fax]  +43.1.743 54 51.30
 
[Mail] [EMAIL PROTECTED]
[Web]  http://www.cure.at

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Beginner question: select cases

2006-09-25 Thread ONKELINX, Thierry
Your problem would be a lot easier if you coded the location in one
variable instead of three variables. Then you could calculate the means
with one line of code:

by(results$q1, results$location, mean)

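With your dataset as it stands you could use
by(results$q1, results$London, mean)
by(results$q1, results$Rome, mean)
by(results$q1, results$Vienna, mean)
Each call reports the mean of q1 for the rows where that city's indicator
is 0 and for the rows where it is 1.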

see ?by for more information
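
A minimal sketch of the recoding suggested above, assuming the data frame is called results, has the columns shown in the question, and has exactly one 1 per row. The q1 values below are made up for illustration, and the column name location is only a suggestion:

```r
# Toy data in the layout from the question (values are illustrative only)
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

cities <- c("London", "Rome", "Vienna")

# For each row, find which indicator column holds the 1
# and use that column's name as the location
city_index <- max.col(as.matrix(results[cities]), ties.method = "first")
results$location <- factor(cities[city_index], levels = cities)

# One call now gives the mean of q1 for every location
by(results$q1, results$location, mean)
```

If a row could have no 1 at all, or more than one, max.col would still return a column, so that case would need a separate check first.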

And take a good look at your code. You take a subset of results and
then assign it to results. This means that you replace the original
results data frame with a subset of it. When you take the subset for the
next city, you are no longer subsetting the original dataset but the
previous subset!
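
One way to avoid that, sketched here with made-up q1 values and hypothetical object names (london, rome, vienna), is to store each subset under its own name so that results is never overwritten:

```r
# Toy data in the layout from the question (values are illustrative only)
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

# Each subset gets its own name; 'results' keeps all of its rows
london <- subset(results, London == 1)
rome   <- subset(results, Rome == 1)
vienna <- subset(results, Vienna == 1)

mean(results$q1)  # all locations
mean(london$q1)   # London only
mean(rome$q1)     # Rome only
mean(vienna$q1)   # Vienna only
```

Because results is untouched, the four means can be computed in any order and rerun without reloading the data.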

Cheers,

Thierry



ir. Thierry Onkelinx

Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest

Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance

Gaverstraat 4

9500 Geraardsbergen

Belgium

tel. + 32 54/436 185

[EMAIL PROTECTED]

www.inbo.be 



Re: [R] Beginner question: select cases

2006-09-25 Thread Doran, Harold
Peter,

There is a much easier way to do this. First, you should consider
organizing your data as follows:

set.seed(1) # for replication only

# Here is a sample data frame
tmp <- data.frame(city = gl(3, 10, labels = c("London", "Rome", "Vienna")),
                  q1 = rnorm(30))

# Compute the means
with(tmp, tapply(q1, city, mean))
    London       Rome     Vienna 
 0.1322028  0.2488450 -0.1336732 

I hope this helps


Re: [R] Beginner question: select cases

2006-09-25 Thread John Kane

I'm new to R too. However, I don't recognize your
syntax; I have not seen select used this way.

Try
results <- subset(results, London == 1)
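
For what it is worth, a small sketch of how the two arguments of subset() differ: the second argument is a condition on rows, while select picks columns. The data values and the object names (london_q1, london_only) are made up for illustration:

```r
# Toy data in the layout from the question (values are illustrative only)
results <- data.frame(
  London = c(0, 0, 1, 1, 0, 0),
  Rome   = c(0, 1, 0, 0, 1, 0),
  Vienna = c(1, 0, 0, 0, 0, 1),
  q1     = c(4, 2, 3, 5, 4, 2)
)

# Row condition plus select = q1: London rows, q1 column only
london_q1 <- subset(results, London == 1, select = q1)
mean(london_q1$q1)

# select = London keeps only the indicator column, so q1 is gone
london_only <- subset(results, London == 1, select = London)
names(london_only)
is.null(london_only$q1)
```

So selecting only the city column, as in the original script, leaves no q1 column to take a mean of.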
