Re: [R] Easy 'apply' question

2011-03-11 Thread Aaron Polhamus
Perfect, thanks Josh!

Cheers,
A

2011/3/10 Joshua Wiley 

> Dear Aaron,
>
> The problem is not with your function, but with using apply().  Look at the
> "Details" section of ?apply.  You will see that if the data is not an
> array or matrix, apply will coerce it to one (or try).  Now go over to
> the "Details" section of ?matrix and you will see that matrices can
> only contain a single class of data and that this follows a hierarchy.
> In short, your data frame is coerced to a matrix and the classes
> are all coerced to the highest: character.  You can use lapply()
> instead to get your desired results.  Here is an example:
>
> ## Construct (named) test dataframe
> tf <- data.frame(x = 1:3, y = 4:6, z = c("A","A","A"))
>
> ## Show why what you tried did not work
> (test <- apply(tf, 2, class))
>
> ## using lapply()
> (test <- lapply(tf, function(x) {
>  if(is.numeric(x)) mean(x) else unique(x)[1]}))
>
>
> Hope this helps,
>
> Josh
>
> On Thu, Mar 10, 2011 at 5:11 PM, Aaron Polhamus 
> wrote:
> > Dear list,
> >
> > I couldn't find a solution for this problem online, as simple as it
> > seems. Here's the problem:
> >
> >
> > #Construct test dataframe
> > tf <- data.frame(1:3,4:6,c("A","A","A"))
> >
> > #Try the apply function I'm trying to use
> > test <- apply(tf,2,function(x) if(is.numeric(x)) mean(x) else unique(x)[1])
> >
> > #Look at the output--all columns treated as character columns...
> > test
> >
> > #Look at the format of the original data--the first two columns are integers.
> > str(tf)
> >
> >
> > In general terms, I want to differentiate what function I apply over a
> > row/column based on what type of data that row/column contains. Here I
> > want a simple mean if the column is numeric, and the first unique value
> > if the column is a character column. As you can see, 'apply' treats all
> > columns as characters the way I've written this function.
> >
> > Any thoughts? Many thanks in advance,
> > Aaron
> >
> > ______
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
>



-- 
Aaron Polhamus
NASA Jet Propulsion Lab
Statistical consultant, Revolution Analytics

160 E Corson Street Apt 207, Pasadena, CA 91103
Cell: +1 (206) 380.3948



[R] Easy 'apply' question

2011-03-10 Thread Aaron Polhamus
Dear list,

I couldn't find a solution for this problem online, as simple as it seems.
Here's the problem:


#Construct test dataframe
tf <- data.frame(1:3,4:6,c("A","A","A"))

#Try the apply function I'm trying to use
test <- apply(tf,2,function(x) if(is.numeric(x)) mean(x) else unique(x)[1])

#Look at the output--all columns treated as character columns...
test

#Look at the format of the original data--the first two columns are integers.
str(tf)


In general terms, I want to differentiate what function I apply over a
row/column based on what type of data that row/column contains. Here I want
a simple mean if the column is numeric, and the first unique value if the
column is a character column. As you can see, 'apply' treats all columns as
characters the way I've written this function.

Any thoughts? Many thanks in advance,
Aaron
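
A short sketch of why the call above behaves this way, assuming (as the "Details" section of ?apply describes) that apply() first converts a data frame with as.matrix(). The column names x, y, z and the stringsAsFactors = FALSE argument are added here for illustration only and are not part of the original post.

```r
# Illustrative data frame; the names and stringsAsFactors = FALSE are assumptions
tf <- data.frame(x = 1:3, y = 4:6, z = c("A", "A", "A"),
                 stringsAsFactors = FALSE)

# apply() works on a matrix, and a matrix can hold only one type,
# so the integer columns are converted to character along with z
m <- as.matrix(tf)
matrix_type <- typeof(m)

# lapply() visits each column as its own vector, so the types survive
column_is_numeric <- sapply(tf, is.numeric)
result <- lapply(tf, function(col) {
  if (is.numeric(col)) mean(col) else unique(col)[1]
})

print(matrix_type)
print(column_is_numeric)
str(result)
```

The result is a list rather than a vector, which is what allows the numeric means and the character value to sit side by side without being coerced to a common type.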



[R] Difficult with round() function

2011-01-17 Thread Aaron Polhamus
Dear list,

I'm writing a function to re-grid a data set from finer to coarser
resolutions in R as follows (I use this function with sapply/apply):

gridResize <- function(startVec = stop("What's your input vector"),
                       to = stop("Missing 'to': How long do you want the final vector to be?")){
  from <- length(startVec)
  shortVec <- numeric()
  tics <- from*to
  for(j in 1:to){
    interval <- ((j/to)*tics - (1/to)*tics + 1):((j/to)*tics)
    benchmarks <- interval/to
    #FIRST RUN ASSUMES FINAL BENCHMARK/TO IS AN INTEGER...
    positions <- which(round(benchmarks) == benchmarks)
    indeces <- benchmarks[positions]
    fracs <- numeric()
    #SINCE MUCH OF THE TIME THIS WILL NOT BE THE CASE, THIS SCRIPT DEALS WITH
    #THE REMAINDER...
    for(i in 1:length(positions)){
      if(i == 1) fracs[i] <- positions[i]/length(benchmarks) else{
        fracs[i] <- (positions[i] - sum(positions[1:(i-1)]))/length(benchmarks)
      }
    }
    #AND UPDATES STARTVEC INDECES AND FRACTION MULTIPLIERS
    if(max(positions) != length(benchmarks)) indeces <- c(indeces, max(indeces) + 1)
    if(sum(fracs) != 1) fracs <- c(fracs, 1 - sum(fracs))
    fromVals <- startVec[indeces]
    if(any(is.na(fromVals))){
      NAindex <- which(is.na(fromVals))
      if(sum(fracs[-NAindex]) >= 0.5) shortVec[j] <- sum(fromVals*fracs, na.rm=TRUE) else shortVec[j] <- NA
    }else{shortVec[j] <- sum(fromVals*fracs)}
  }
  return(shortVec)
}


For the simple test case

test <- gridResize(startVec = c(2,4,6,8,10,8,6,4,2), to = 7)

the function works fine. For larger vectors, however, it breaks down. E.g.:

test <- gridResize(startVec = rnorm(300, 9, 20), to = 200)

This returns the error:

Error in positions[1:(i - 1)] :
  only 0's may be mixed with negative subscripts

and the problem seems to be in the line

positions <- which(round(benchmarks) == benchmarks)

In this particular example the code breaks at j = 27. When I set j = 27 and
run the calculation manually, I discover the following:

> benchmarks[200]
[1] 40
> benchmarks[200] == 40
[1] FALSE
> round(benchmarks[200]) == 40
[1] TRUE

Even though my benchmark calculation seems to be returning clean integers
to serve as inputs for the creation of the 'positions' variable, for
whatever reason R doesn't read them that way. I would be very grateful for
any advice on either how to alter my approach entirely (I am sure there is a
far more elegant way to regrid data in R) or a simple fix for this rounding
error.
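
The symptom described above (a value that prints as 40 but fails == 40) is typical of floating-point arithmetic, where a result can differ from the whole number in its last bit. A minimal sketch of a tolerance-based test follows; is_whole() is a hypothetical helper, not a base R function, and the tolerance is an assumed default.

```r
# Two values that print alike but are not identical doubles
x <- 0.1 + 0.2
exact_match <- (x == 0.3)                    # exact comparison
close_match <- isTRUE(all.equal(x, 0.3))     # comparison within a tolerance

# Hypothetical helper: TRUE where a value lies within `tol` of a whole number
is_whole <- function(v, tol = sqrt(.Machine$double.eps)) {
  abs(v - round(v)) < tol
}

# x * 10 is a hair above 3, so the exact test misses it
v <- c(1.5, x * 10, 4)
exact_positions <- which(round(v) == v)
tolerant_positions <- which(is_whole(v))

print(exact_match)
print(close_match)
print(exact_positions)
print(tolerant_positions)
```

In gridResize() the same idea might replace the exact comparison in the 'positions' line, with round() applied to the selected benchmarks before they are used as indices.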

Many thanks in advance,
Aaron
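
On the request for a different approach: one possible sketch is to avoid fractional benchmarks altogether and work with integer indices. The function regrid() below is hypothetical and is not the method in the post. It assumes plain area-weighted averaging and makes no attempt to reproduce the NA rule in gridResize().

```r
# Hypothetical alternative to gridResize(): area-weighted re-gridding
# using integer indices only. NA handling is deliberately left out.
regrid <- function(x, to) {
  from <- length(x)
  # Cut the axis into from * to equal ticks; every source cell owns `to`
  # consecutive ticks and every target cell owns `from` consecutive ticks
  src <- rep(seq_len(from), each = to)   # source cell of each tick
  dst <- rep(seq_len(to), each = from)   # target cell of each tick
  # Average the source values over the ticks inside each target cell
  as.vector(tapply(x[src], dst, mean))
}

small <- c(2, 4, 6, 8, 10, 8, 6, 4, 2)
coarse <- regrid(small, to = 7)
print(coarse)

set.seed(1)
big <- regrid(rnorm(300, 9, 20), to = 200)
print(length(big))
```

Because every comparison is made on integers, no rounding test is needed. The cost is a temporary vector of length from * to, which is small for the sizes mentioned here but may matter for very large grids.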

-- 
Aaron Polhamus 
Statistical consultant, Revolution Analytics
MSc Applied Statistics, The University of Oxford, 2009
838a NW 52nd St, Seattle, WA 98107
Cell: +1 (206) 380.3948



Re: [R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Aaron Polhamus
Ivan and Michael,

Many thanks for the tips, those solved my queries. Still interested in how
to force custom functions to work over rows rather than columns when using
apply, but the MAT/rowSums(MAT) technique is definitely the most efficient
way to go for this application.

Cheers,
Aaron
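
On the remaining question about rows: apply() with MARGIN = 1 does call the function once per row, but it returns each result as a column, so the output comes back transposed. The sketch below rebuilds MAT from the original post and compares a single transpose with two base R alternatives, prop.table() and sweep(); the variable names are illustrative.

```r
# MAT as defined in the original post
MAT <- matrix(c(5, 3, 1, 6, 7,
                9, 7, 3, 10, 11,
                1, 2, 3, 4, 5,
                2, 4, 6, 8, 10,
                9, 5, 2, 1, 1),
              nrow = 5, ncol = 5, byrow = TRUE,
              dimnames = list(LETTERS[1:5], LETTERS[1:5]))

# The function sees one row at a time, but each result becomes a column
by_row <- apply(MAT, 1, function(X) X / sum(X))

# One transpose restores the original orientation
TMAT_apply <- t(by_row)

# Base R alternatives that normalise rows without apply()
TMAT_prop <- prop.table(MAT, margin = 1)
TMAT_sweep <- sweep(MAT, 1, rowSums(MAT), "/")

print(colSums(by_row))
print(rowSums(TMAT_apply))
```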

2010/12/2 Michael Bedward 

> Hi Aaron,
>
> Following up on Ivan's suggestion, if you want the column order to
> mirror the row order...
>
> mo <- order(rowSums(MAT), decreasing=TRUE)
> MAT2 <- MAT[mo, mo]
>
> Also, you don't need all those extra c() calls when creating
> inputData, just the outermost one.
>
> Regarding your second question, your statements...
>
> TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
> TMAT <- t(TMAT)
>
> is actually just a complicated way of doing this...
>
> TMAT <- MAT / rowSums(MAT)
>
> You can confirm that by doing it your way and then this...
>
> TMAT == MAT / rowSums(MAT)
>
> ...and you should see a matrix of TRUE values
>
> Michael
>
>
> On 2 December 2010 20:43, Ivan Calandra 
> wrote:
> > Hi,
> >
> > Here is a not so easy way to do your first step, but it works:
> > MAT2 <- cbind(MAT, rowSums(MAT))
> > MAT[order(MAT2[,6], decreasing=TRUE),]
> >
> > For the second, I don't know!
> >
> > HTH,
> > Ivan
> >
> >
> > On 12/2/2010 09:46, Aaron Polhamus wrote:
> >>
> >> Greetings,
> >>
> >> My goal is to create a Markov transition matrix (probability of moving
> >> from one state to another) with the 'highest traffic' portion of the
> >> matrix occupying the top-left section. Consider the following sample:
> >>
> >> inputData<- c(
> >> c(5, 3, 1, 6, 7),
> >> c(9, 7, 3, 10, 11),
> >> c(1, 2, 3, 4, 5),
> >> c(2, 4, 6, 8, 10),
> >> c(9, 5, 2, 1, 1)
> >> )
> >>
> >> MAT<- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
> >> colnames(MAT)<- c("A", "B", "C", "D", "E")
> >> rownames(MAT)<- c("A", "B", "C", "D", "E")
> >>
> >> rowSums(MAT)
> >>
> >> I want to re-arrange the elements of this matrix such that the elements
> >> with the largest row sums are placed to the top-left, in descending
> >> order. Does this make sense? In this case the order I'm looking for
> >> would be B, D, A, E, C. Any thoughts?
> >>
> >> As an aside, here is the function I've written to construct the
> >> transition matrix. Is there a more elegant way to do this that doesn't
> >> involve a double transpose?
> >>
> >> TMAT<- apply(t(MAT), 2, function(X) X/sum(X))
> >> TMAT<- t(TMAT)
> >>
> >> I tried the following:
> >>
> >> TMAT<- apply(MAT, 1, function(X) X/sum(X))
> >>
> >> But my custom function is still getting applied over the columns of
> >> the array, rather than the rows. For a check try:
> >>
> >> rowSums(TMAT)
> >> colSums(TMAT)
> >>
> >> Row sums here should equal 1...
> >>
> >> Many thanks in advance,
> >> Aaron
> >>
> >>
> >
> > --
> > Ivan CALANDRA
> > PhD Student
> > University of Hamburg
> > Biozentrum Grindel und Zoologisches Museum
> > Abt. Säugetiere
> > Martin-Luther-King-Platz 3
> > D-20146 Hamburg, GERMANY
> > +49(0)40 42838 6231
> > ivan.calan...@uni-hamburg.de
> >
> > **
> > http://www.for771.uni-bonn.de
> > http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
> >
> >
>



-- 
Aaron Polhamus 
Statistical consultant, Revolution Analytics
MSc Applied Statistics, The University of Oxford, 2009
838a NW 52nd St, Seattle, WA 98107
Cell: +1 (206) 380.3948



[R] Arrange elements on a matrix according to rowSums + short 'apply' Q

2010-12-02 Thread Aaron Polhamus
Greetings,

My goal is to create a Markov transition matrix (probability of moving from
one state to another) with the 'highest traffic' portion of the matrix
occupying the top-left section. Consider the following sample:

inputData <- c(
c(5, 3, 1, 6, 7),
c(9, 7, 3, 10, 11),
c(1, 2, 3, 4, 5),
c(2, 4, 6, 8, 10),
c(9, 5, 2, 1, 1)
)

MAT <- matrix(inputData, nrow = 5, ncol = 5, byrow = TRUE)
colnames(MAT) <- c("A", "B", "C", "D", "E")
rownames(MAT) <- c("A", "B", "C", "D", "E")

rowSums(MAT)

I want to re-arrange the elements of this matrix such that the elements with
the largest row sums are placed to the top-left, in descending order. Does
this make sense? In this case the order I'm looking for would be B, D, A, E,
C. Any thoughts?

As an aside, here is the function I've written to construct the transition
matrix. Is there a more elegant way to do this that doesn't involve a double
transpose?

TMAT <- apply(t(MAT), 2, function(X) X/sum(X))
TMAT <- t(TMAT)

I tried the following:

TMAT <- apply(MAT, 1, function(X) X/sum(X))

But my custom function is still getting applied over the columns of the
array, rather than the rows. For a check try:

rowSums(TMAT)
colSums(TMAT)

Row sums here should equal 1...

Many thanks in advance,
Aaron
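
A minimal sketch of the two steps together, assuming the goal is to reorder both rows and columns by descending row sum and then normalise each row. The names ord, MAT_sorted and TMAT_sorted are illustrative.

```r
# MAT as defined above
MAT <- matrix(c(5, 3, 1, 6, 7,
                9, 7, 3, 10, 11,
                1, 2, 3, 4, 5,
                2, 4, 6, 8, 10,
                9, 5, 2, 1, 1),
              nrow = 5, ncol = 5, byrow = TRUE,
              dimnames = list(LETTERS[1:5], LETTERS[1:5]))

# Row indices from the largest row sum to the smallest
ord <- order(rowSums(MAT), decreasing = TRUE)

# Apply the same permutation to rows and columns so the states stay aligned
MAT_sorted <- MAT[ord, ord]

# Divide each row by its own sum to get transition probabilities
TMAT_sorted <- MAT_sorted / rowSums(MAT_sorted)

print(rownames(MAT_sorted))
print(rowSums(TMAT_sorted))
```

The division works row-wise because R recycles the vector of row sums down the columns of the matrix, so element [i, j] is divided by the sum of row i.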
