[R] help with data layout

2008-07-17 Thread Iain Gallagher
Hello list

I have been given some Excel sheets with data laid like this:

Col1 Col2
A 3
   2
   3
B 4
   5
   4
C 1
   4
   3

I was hoping to import this into R as a csv and then get the mean and SD for 
each letter in column 1.

Could someone give me some guidance on how best to approach this?

Thanks

Iain


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help with data layout

2008-07-17 Thread Erik Iverson


Sure.  Reading in Excel sheets can be done in at least a few ways; see the 
R Data Import/Export manual on CRAN.  The only way I have done it is to 
save the Excel sheet as a CSV file and then use read.csv in R to get a 
data.frame.  One caveat is that an Excel sheet sometimes contains 
'missing' cells where someone has inserted blanks.  These may get 
written out to the CSV file, so you'll have to check.  For example, I've 
seen an Excel sheet with something like 10 rows of data write out 
about 100 rows to the CSV file, almost all of them empty.
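
A rough sketch of that check, in case it helps. The inline CSV text and the names csv_text, raw, and dat are made up for illustration and do not come from a real export; the idea is simply to drop rows in which every cell is blank or NA after reading.

```r
# Hypothetical CSV text standing in for an Excel export with stray blank rows
csv_text <- "Col1,Col2
A,3
,2
,
,
B,4
,"

raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# A row counts as empty when Col1 is blank (or NA) and Col2 is NA
is_empty <- (is.na(raw$Col1) | raw$Col1 == "") & is.na(raw$Col2)

# Keep only rows that carry some data
dat <- raw[!is_empty, ]
print(dat)
```

Rows where only Col1 is blank are kept on purpose, since those are the ones that still need their category filled in.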


Anyway, once you have the data.frame, I'd use na.locf from the zoo 
package to 'fill' in the missing Col1 values, and then use an R function 
such as ave, tapply, aggregate, or by to do whatever you'd like.
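
If installing zoo is not an option, one possible base R stand-in for na.locf is the cumsum indexing trick sketched below. This is a swapped-in technique, not na.locf itself, and it assumes the first row of Col1 is not blank. The data frame is typed in by hand to mirror the example in the question.

```r
# Hand-built copy of the example data, with blanks as empty strings
x <- data.frame(Col1 = c("A", "", "", "B", "", "", "C", "", ""),
                Col2 = c(3, 2, 3, 4, 5, 4, 1, 4, 3),
                stringsAsFactors = FALSE)

# TRUE where a category label is present
has_label <- x$Col1 != ""

# cumsum() counts labels seen so far, so it indexes the most recent one
# (assumption: the first row carries a label)
x$Col1 <- x$Col1[has_label][cumsum(has_label)]

# Per-group summaries with tapply
means <- tapply(x$Col2, x$Col1, mean)
sds   <- tapply(x$Col2, x$Col1, sd)
print(cbind(mean = means, sd = sds))
```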






Re: [R] help with data layout

2008-07-17 Thread Henrique Dallazuanna
Try this:

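#x <- read.csv('your_file.csv')

# repeat each non-blank label over its group
# (assumes every group has the same number of rows)
x$Col1 <- rep(as.character(x$Col1[x$Col1 != ""]),
  each = unique(diff(which(x$Col1 != ""))))
# sd and mean of Col2 for each label in Col1
with(x, sapply(list(sd = sd, mean = mean), function(f) tapply(Col2, Col1, f)))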
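
If the groups are not all the same size, a variation along these lines might work. It is only a sketch: the small data frame is invented for illustration, and it assumes the first row has a label. The run lengths come from the gaps between labelled rows, with the last run closed off by the end of the data.

```r
# Invented example with groups of size 2, 3 and 1
x <- data.frame(Col1 = c("A", "", "B", "", "", "C"),
                Col2 = c(3, 2, 4, 5, 4, 1),
                stringsAsFactors = FALSE)

# Row numbers where a new label starts
starts <- which(x$Col1 != "")

# Length of each run; nrow(x) + 1 closes the final run
run_lengths <- diff(c(starts, nrow(x) + 1))

# Repeat each label over its own run
x$Col1 <- rep(x$Col1[starts], times = run_lengths)
print(x)
```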




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] help with data layout

2008-07-17 Thread Mike Lawrence
Nice tip on filling missing Col1 values. I've always just done the
filling in Excel by hand. I then use that same hand to smack the
person that gave me data in Excel format :Op




--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

www.memetic.ca

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein



Re: [R] help with data layout

2008-07-17 Thread Stephen Tucker
Hi, hope this will help:

txt <- "Col1,Col2
A,3
, 2
,3
B,4
, 5
, 4
C,1
, 4
, 3"

## read data, treating empty cells as NA
dat <- read.csv(textConnection(txt), na.strings = "")
## fill in empty cells with the category from the row above
dat$Col1[] <-
  Reduce(function(x,y) c(x,ifelse(is.na(y),tail(x,1),y)),dat$Col1)
## calculate mean and standard deviation
mat <- t(sapply(split(dat$Col2,f=dat$Col1),function(X)
  c(mean=mean(X),sd=sd(X))))
## look at results (stored in a matrix)
> print(mat)
      mean        sd
A 2.666667 0.5773503
B 4.333333 0.5773503
C 2.666667 1.5275252





Re: [R] help with data layout

2008-07-17 Thread jim holtman
Does this do at least the means for you:

> x <- read.csv(textConnection("Col1 ,   Col2
+ A, 3
+  , 2
+  , 3
+ B, 4
+  , 5
+ ,  4
+ C   ,  1
+ ,  4
+ ,  3"), strip.white=TRUE)
> x
  Col1 Col2
1    A    3
2         2
3         3
4    B    4
5         5
6         4
7    C    1
8         4
9         3
> # replace blanks with NAs in first column
> is.na(x$Col1) <- x$Col1 == ''
> require(zoo)
> x$Col1 <- na.locf(x$Col1)
> x
  Col1 Col2
1    A    3
2    A    2
3    A    3
4    B    4
5    B    5
6    B    4
7    C    1
8    C    4
9    C    3
> aggregate(x$Col2, list(x$Col1), FUN=mean)
  Group.1        x
1       A 2.666667
2       B 4.333333
3       C 2.666667
>
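
To get the SD as well, one option is to have aggregate return both statistics at once through its formula interface. The following is a sketch on a hand-typed copy of the filled-in data; the names res and flat are just for illustration.

```r
# Filled-in data, typed by hand to match the example above
x <- data.frame(Col1 = rep(c("A", "B", "C"), each = 3),
                Col2 = c(3, 2, 3, 4, 5, 4, 1, 4, 3))

# FUN returns a named vector, so res$Col2 is a two-column matrix
res <- aggregate(Col2 ~ Col1, data = x,
                 FUN = function(v) c(mean = mean(v), sd = sd(v)))

# Flatten the matrix column into ordinary columns Col2.mean and Col2.sd
flat <- do.call(data.frame, res)
print(flat)
```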





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] Help with data layout - Thanks

2008-07-17 Thread Iain Gallagher
Thanks for all the excellent replies.

Iain
