Re: [R] lapply and runif issue?

2017-11-15 Thread William Dunlap via R-help
Your lapply is making the call
   runif(n=3, min=i)
for i in 1:3.  runif's third argument is 'max', with default value 1,
so that is equivalent to calling
   runif(n=3, min=i, max=1)
When i>max the parameters fall outside the domain of the family of
uniform distributions, so runif returns NaNs.
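
A minimal sketch of the calls described above, using only base R. The last line is one way to get ordinary uniform draws for each element, on the assumption that this was the intent.

```r
# lapply(1:3, runif, n = 3) calls runif(n = 3, min = i) for i in 1:3;
# max keeps its default value of 1.
res <- suppressWarnings(lapply(1:3, runif, n = 3))

# The same three calls written out explicitly.
explicit <- suppressWarnings(list(
  runif(n = 3, min = 1, max = 1),  # min == max: every draw equals 1
  runif(n = 3, min = 2, max = 1),  # min > max: NaN
  runif(n = 3, min = 3, max = 1)   # min > max: NaN
))

# Ignoring the index inside the function gives plain U(0, 1) draws.
draws <- lapply(1:3, function(i) runif(n = 3))
```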


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Nov 14, 2017 at 5:11 PM, Bert Gunter  wrote:

> Could someone please explain the following? I did check bug reports, but
> did not recognize the issue there. I am reluctant to call it a bug, as it
> is much more likely my misunderstanding. Ergo my request for clarification:
>
> ## As expected:
>
> > lapply(1:3, rnorm, n = 3)
> [[1]]
> [1] 2.481575 1.998182 1.312786
>
> [[2]]
> [1] 2.858383 1.827863 1.699015
>
> [[3]]
> [1] 1.821910 2.530091 3.995677
>
>
> ## Unexpected by me:
>
> > lapply(1:3, runif, n = 3)
> [[1]]
> [1] 1 1 1
>
> [[2]]
> [1] NaN NaN NaN
>
> [[3]]
> [1] NaN NaN NaN
>
> Warning messages:
> 1: In FUN(X[[i]], ...) : NAs produced
> 2: In FUN(X[[i]], ...) : NAs produced
>
>
> ## But note, as expected:
>
> > lapply(1:3, function(x)runif(3))
> [[1]]
> [1] 0.2950459 0.8490556 0.4303680
>
> [[2]]
> [1] 0.5961144 0.5330914 0.2363679
>
> [[3]]
> [1] 0.8079495 0.1431838 0.3671915
>
>
>
> Many thanks for any clarification.
>
> -- Bert
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lapply and runif issue?

2017-11-14 Thread Bert Gunter
Thanks, Ista. That explains it.

What I missed is the following "note" in ?lapply:

"This means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. "

That being the case, X[[i]] gets passed to the first available argument,
which for runif(n=3, min, max) is the min argument, as you said. This is a
subtlety (to me, anyway) of which I was unaware. Which is why I hesitated
to call it a bug. It ain't! It is documented -- I just failed to read
carefully enough.
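
The matching rule can be checked without drawing any random numbers. This is a small sketch using base R's match.call(), which reports how the arguments of a call would be matched; the literal value 2 stands in for X[[i]].

```r
# The named argument n is matched first; the unnamed value then goes to
# the first formal argument that is still free.
matched_runif <- match.call(runif, quote(runif(2, n = 3)))
matched_rnorm <- match.call(rnorm, quote(rnorm(2, n = 3)))

print(matched_runif)  # the unnamed 2 is matched to min
print(matched_rnorm)  # the unnamed 2 is matched to mean
```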

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Nov 14, 2017 at 6:11 PM, Ista Zahn  wrote:

> Hi Bert,
>
> On Tue, Nov 14, 2017 at 8:11 PM, Bert Gunter 
> wrote:
> > Could someone please explain the following? I did check bug reports, but
> > did not recognize the issue there. I am reluctant to call it a bug, as it
> > is much more likely my misunderstanding. Ergo my request for
> clarification:
> >
> > ## As expected:
> >
> >> lapply(1:3, rnorm, n = 3)
> > [[1]]
> > [1] 2.481575 1.998182 1.312786
> >
> > [[2]]
> > [1] 2.858383 1.827863 1.699015
> >
> > [[3]]
> > [1] 1.821910 2.530091 3.995677
> >
>
> Exactly what expectation do you imagine the above is consistent with? Does
>
> > lapply(100*(1:3), rnorm, n = 3)
> [[1]]
> [1] 100.35425  99.29429  98.69429
>
> [[2]]
> [1] 198.2963 201.1031 201.1077
>
> [[3]]
> [1] 299.7012 298.3700 298.0684
>
> change your assessment?
>
> >
> > ## Unexpected by me:
> >
> >> lapply(1:3, runif, n = 3)
> > [[1]]
> > [1] 1 1 1
> >
> > [[2]]
> > [1] NaN NaN NaN
> >
> > [[3]]
> > [1] NaN NaN NaN
> >
> > Warning messages:
> > 1: In FUN(X[[i]], ...) : NAs produced
> > 2: In FUN(X[[i]], ...) : NAs produced
>
> The first argument to runif is named 'n'. Thus,
>
> lapply(1:3, runif)
>
> means roughly
>
> list(runif(n = 1), runif(n = 2), runif(n = 3))
>
> But you specified lapply(1:3, runif, n = 3). Since the first
> argument ('n') is already specified, the X values from lapply get
> "pushed" to the second argument. That is,
>
> lapply(1:3, runif, n = 3)
>
> means roughly
>
> list(runif(n = 3, min = 1), runif(n = 3, min = 2), runif(n = 3, min = 3))
>
> Note that this is exactly the same thing that happens with
>
> lapply(1:3, rnorm, n = 3), though it becomes more obvious with
>
> lapply(100*(1:3), rnorm, n = 3)
>
> That is,
>
> lapply(1:3, rnorm, n = 3)
>
> means roughly
>
> list(rnorm(n = 3, mean = 1), rnorm(n = 3, mean = 2), rnorm(n = 3, mean =
> 3))
>
> >
> >
> > ## But note, as expected:
> >
> >> lapply(1:3, function(x)runif(3))
> > [[1]]
> > [1] 0.2950459 0.8490556 0.4303680
> >
> > [[2]]
> > [1] 0.5961144 0.5330914 0.2363679
> >
> > [[3]]
> > [1] 0.8079495 0.1431838 0.3671915
>
> Sure, because you never use x in the body of your anonymous function.
>
> As a final note, what you seem to expect can be achieved with
>
> replicate(3, rnorm(n = 3), simplify = FALSE)
>
> and
>
> replicate(3, runif(n = 3), simplify = FALSE)
>
>
> Best,
> Ista
>
> >
> >
> >
> > Many thanks for any clarification.
> >
> > -- Bert
> >


Re: [R] lapply and runif issue?

2017-11-14 Thread Ista Zahn
Hi Bert,

On Tue, Nov 14, 2017 at 8:11 PM, Bert Gunter  wrote:
> Could someone please explain the following? I did check bug reports, but
> did not recognize the issue there. I am reluctant to call it a bug, as it
> is much more likely my misunderstanding. Ergo my request for clarification:
>
> ## As expected:
>
>> lapply(1:3, rnorm, n = 3)
> [[1]]
> [1] 2.481575 1.998182 1.312786
>
> [[2]]
> [1] 2.858383 1.827863 1.699015
>
> [[3]]
> [1] 1.821910 2.530091 3.995677
>

Exactly what expectation do you imagine the above is consistent with? Does

> lapply(100*(1:3), rnorm, n = 3)
[[1]]
[1] 100.35425  99.29429  98.69429

[[2]]
[1] 198.2963 201.1031 201.1077

[[3]]
[1] 299.7012 298.3700 298.0684

change your assessment?

>
> ## Unexpected by me:
>
>> lapply(1:3, runif, n = 3)
> [[1]]
> [1] 1 1 1
>
> [[2]]
> [1] NaN NaN NaN
>
> [[3]]
> [1] NaN NaN NaN
>
> Warning messages:
> 1: In FUN(X[[i]], ...) : NAs produced
> 2: In FUN(X[[i]], ...) : NAs produced

The first argument to runif is named 'n'. Thus,

lapply(1:3, runif)

means roughly

list(runif(n = 1), runif(n = 2), runif(n = 3))

But you specified lapply(1:3, runif, n = 3). Since the first
argument ('n') is already specified, the X values from lapply get
"pushed" to the second argument. That is,

lapply(1:3, runif, n = 3)

means roughly

list(runif(n = 3, min = 1), runif(n = 3, min = 2), runif(n = 3, min = 3))

Note that this is exactly the same thing that happens with

lapply(1:3, rnorm, n = 3), though it becomes more obvious with

lapply(100*(1:3), rnorm, n = 3)

That is,

lapply(1:3, rnorm, n = 3)

means roughly

list(rnorm(n = 3, mean = 1), rnorm(n = 3, mean = 2), rnorm(n = 3, mean = 3))

>
>
> ## But note, as expected:
>
>> lapply(1:3, function(x)runif(3))
> [[1]]
> [1] 0.2950459 0.8490556 0.4303680
>
> [[2]]
> [1] 0.5961144 0.5330914 0.2363679
>
> [[3]]
> [1] 0.8079495 0.1431838 0.3671915

Sure, because you never use x in the body of your anonymous function.

As a final note, what you seem to expect can be achieved with

replicate(3, rnorm(n = 3), simplify = FALSE)

and

replicate(3, runif(n = 3), simplify = FALSE)
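
A short sketch of what the replicate() calls return; the seed is arbitrary and only there for reproducibility.

```r
set.seed(42)  # arbitrary seed

# simplify = FALSE keeps the result as a list, like lapply() would.
reps <- replicate(3, runif(n = 3), simplify = FALSE)

# With the default simplify = TRUE the three vectors are bound into a
# 3 x 3 matrix, one column per replication.
mat <- replicate(3, runif(n = 3))
```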


Best,
Ista

>
>
>
> Many thanks for any clarification.
>
> -- Bert
>


[R] lapply and runif issue?

2017-11-14 Thread Bert Gunter
Could someone please explain the following? I did check bug reports, but
did not recognize the issue there. I am reluctant to call it a bug, as it
is much more likely my misunderstanding. Ergo my request for clarification:

## As expected:

> lapply(1:3, rnorm, n = 3)
[[1]]
[1] 2.481575 1.998182 1.312786

[[2]]
[1] 2.858383 1.827863 1.699015

[[3]]
[1] 1.821910 2.530091 3.995677


## Unexpected by me:

> lapply(1:3, runif, n = 3)
[[1]]
[1] 1 1 1

[[2]]
[1] NaN NaN NaN

[[3]]
[1] NaN NaN NaN

Warning messages:
1: In FUN(X[[i]], ...) : NAs produced
2: In FUN(X[[i]], ...) : NAs produced


## But note, as expected:

> lapply(1:3, function(x)runif(3))
[[1]]
[1] 0.2950459 0.8490556 0.4303680

[[2]]
[1] 0.5961144 0.5330914 0.2363679

[[3]]
[1] 0.8079495 0.1431838 0.3671915



Many thanks for any clarification.

-- Bert



Re: [R] lapply

2016-07-31 Thread David Winsemius

> On Jul 30, 2016, at 7:53 PM, roslinazairimah zakaria  
> wrote:
> 
> Dear r-users,
> 
> I would like to use lapply for the following task:
> 
> ## Kolmogorov-Smirnov
> ks.test(stn_all[,1][stn_all[,1] > 0],stn_all_gen[,1][stn_all_gen[,1] > 0])
> ks.test(stn_all[,2][stn_all[,2] > 0],stn_all_gen[,2][stn_all_gen[,2] > 0])
> ks.test(stn_all[,3][stn_all[,3] > 0],stn_all_gen[,3][stn_all_gen[,3] > 0])
> ks.test(stn_all[,4][stn_all[,4] > 0],stn_all_gen[,4][stn_all_gen[,4] > 0])
> ks.test(stn_all[,5][stn_all[,5] > 0],stn_all_gen[,5][stn_all_gen[,5] > 0])
> ks.test(stn_all[,6][stn_all[,6] > 0],stn_all_gen[,6][stn_all_gen[,6] > 0])
> 
> I would like to conduct the Kolmogorov Smirnov goodness of fit tests.

Either use `lapply` over the column indices or (probably more cleanly) use 
`mapply` over the two dataframes.
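
A rough sketch of both options. The data here are simulated stand-ins, since the poster's stn_all and stn_all_gen were not shown, and the helper names make_station_data and ks_positive are made up for illustration.

```r
set.seed(1)  # arbitrary, for reproducibility

# Hypothetical stand-in data: 6 columns of 100 values, some of them zero.
make_station_data <- function() {
  as.data.frame(matrix(rgamma(600, shape = 2) * rbinom(600, 1, 0.7), ncol = 6))
}
stn_all <- make_station_data()
stn_all_gen <- make_station_data()

# Two-sample KS test on the positive values of one pair of columns.
ks_positive <- function(obs, gen) ks.test(obs[obs > 0], gen[gen > 0])

# Option 1: lapply over the column indices.
res_lapply <- lapply(seq_len(ncol(stn_all)), function(j) {
  ks_positive(stn_all[, j], stn_all_gen[, j])
})

# Option 2: mapply walks the columns of both data frames in parallel.
res_mapply <- mapply(ks_positive, stn_all, stn_all_gen, SIMPLIFY = FALSE)

# Collect the p-values into one named numeric vector.
p_values <- vapply(res_mapply, function(r) r$p.value, numeric(1))
```

SIMPLIFY = FALSE keeps each test result as a complete "htest" object, so statistics other than the p-value stay available.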

-- 
David
> 
> Is it possible?
> 
> 
> Thank you very much.
> -- 
> *Dr. Roslinazairimah Binti Zakaria*
> *Tel: +609-5492370; Fax. No.+609-5492766*
> 
> *Email: roslinazairi...@ump.edu.my ;
> roslina...@gmail.com *
> Deputy Dean (Academic & Student Affairs)
> Faculty of Industrial Sciences & Technology
> University Malaysia Pahang
> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
> 

David Winsemius
Alameda, CA, USA



[R] lapply

2016-07-30 Thread roslinazairimah zakaria
Dear r-users,

I would like to use lapply for the following task:

## Kolmogorov-Smirnov
ks.test(stn_all[,1][stn_all[,1] > 0],stn_all_gen[,1][stn_all_gen[,1] > 0])
ks.test(stn_all[,2][stn_all[,2] > 0],stn_all_gen[,2][stn_all_gen[,2] > 0])
ks.test(stn_all[,3][stn_all[,3] > 0],stn_all_gen[,3][stn_all_gen[,3] > 0])
ks.test(stn_all[,4][stn_all[,4] > 0],stn_all_gen[,4][stn_all_gen[,4] > 0])
ks.test(stn_all[,5][stn_all[,5] > 0],stn_all_gen[,5][stn_all_gen[,5] > 0])
ks.test(stn_all[,6][stn_all[,6] > 0],stn_all_gen[,6][stn_all_gen[,6] > 0])

 I would like to conduct the Kolmogorov Smirnov goodness of fit tests.

Is it possible?


Thank you very much.
-- 
*Dr. Roslinazairimah Binti Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*

*Email: roslinazairi...@ump.edu.my ;
roslina...@gmail.com *
Deputy Dean (Academic & Student Affairs)
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia



[R] lapply on list of lists

2016-02-12 Thread Stefano Sofia
Dear R list users,
I have three lists of data frames, respectively temp_list, wind_list and 
snow_list.
The elements of these three lists are

temp_list$station1, temp_list$station2 and temp_list$station3 with columns date 
and temp;
wind_list$station1, wind_list$station2 and wind_list$station3 with columns 
date, wind_vel and wind_dir;
snow_list$station1, snow_list$station2 and snow_list$station3 with columns date 
and hs

where date has been transformed to character.
I need to merge temp_list$station1, wind_list$station1 and snow_list$station1, 
and same thing for station2 and station3.

If I create a list
list_all <- list(temp_list$station1, wind_list$station1, snow_list$station1)

then
Reduce(function(x, y) merge(x, y, by=c("date"), all=TRUE), list_all)

will do it. But then I have to create the other two lists and apply again 
Reduce.

I would like to create a list of lists and use lapply twice to make 
this process completely automatic.
I tried

list_all <- list(temp_list, wind_list, snow_list)
names(list_all) <- c("temp", "wind", "snow")
lapply(names(list_all), function(val){list_all$val, lapply(c("station1", 
"station2", "station3"), function(val){Reduce(function(x, y) merge(x$val, 
y$val, by=c("date"), all=TRUE), list_all)}), list_all})

but it gives me a syntax error and I am struggling to make it work.
Could somebody help me to create the correct loop?

Thank you for your help
Stefano
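
One possible way to automate the merge, sketched under assumptions: the example data below are invented to match the structure described above, the helper names make_list and merge_by_date are hypothetical, and all three lists are assumed to hold their stations in the same order.

```r
# Hypothetical example data: three stations, two dates each.
dates <- c("2016-02-10", "2016-02-11")
stations <- c("station1", "station2", "station3")
make_list <- function(make_df) {
  setNames(lapply(stations, function(s) make_df()), stations)
}
temp_list <- make_list(function() data.frame(date = dates, temp = c(1.5, 2.0)))
wind_list <- make_list(function() {
  data.frame(date = dates, wind_vel = c(3, 4), wind_dir = c(90, 180))
})
snow_list <- make_list(function() data.frame(date = dates, hs = c(10, 12)))

# Full outer join on the date column.
merge_by_date <- function(x, y) merge(x, y, by = "date", all = TRUE)

# Map() passes the station1 elements together, then station2, and so on;
# Reduce() merges the three data frames of each station in turn.
merged <- Map(function(temp, wind, snow) {
  Reduce(merge_by_date, list(temp, wind, snow))
}, temp_list, wind_list, snow_list)
```

The result is a named list with one merged data frame per station, for example merged$station1. If the station order could differ between the lists, indexing each list by station name inside an lapply over the names would be the safer variant.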




Re: [R] lapply function

2015-05-30 Thread Sarah Goslee
You need the vectorized ifelse() instead of if().

Also watch out for order of operations in the last line, and there is
already a base R function named scale(). And spelling of arguments, of
course.
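
A sketch of these points, assuming the goal is to rescale a numeric column linearly so that its minimum maps to 0 and its maximum to 10; the original post does not state this precisely. The name rescale10 is hypothetical and chosen to avoid masking base R's scale().

```r
rescale10 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  # Parentheses matter here: (x - min) / (max - min), not x - min / max.
  10 * (x - rng[1]) / (rng[2] - rng[1])
}

# ifelse() tests every element, whereas if() expects a single TRUE or FALSE.
x <- c(2, 4, 6)
flags <- ifelse(x == min(x), "min", ifelse(x == max(x), "max", "middle"))

# Applying the function to column v in each of a list of data frames.
dfs <- list(a = data.frame(v = c(2, 4, 6)), b = data.frame(v = c(0, 5, 20)))
scaled <- lapply(dfs, function(d) {
  d$v <- rescale10(d$v)
  d  # return the modified data frame
})
```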

On Saturday, May 30, 2015, Sohail Khan  wrote:

> Hi R Gurus,
>
> I am writing a simple function that takes a numeric vector column from a
> data frame and scales the vector column with certain criteria.  I would
> then pass this function to a list of dataframes by lapply.
>
> Question is how do I write a function that works on a numeric vector.  My
> function as is, seems to work on the first element of the vector.
>
> I.E.
> scale
> function(x,mn,max){
> if (x==min(x)) 0
>
> if (x==max(x)) 10
> else x=max-min/10
> }
>
> where x is the numeric column of the dataframe.
>
> Thank you.
>
>
>

-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org



[R] lapply function

2015-05-30 Thread Sohail Khan
Hi R Gurus,

I am writing a simple function that takes a numeric vector column from a
data frame and scales the vector column with certain criteria.  I would
then pass this function to a list of dataframes by lapply.

Question is how do I write a function that works on a numeric vector.  My
function as is, seems to work on the first element of the vector.

I.E.
scale
function(x,mn,max){
if (x==min(x)) 0

if (x==max(x)) 10
else x=max-min/10
}

where x is the numeric column of the dataframe.

Thank you.



Re: [R] lapply returns NULL ?

2014-07-12 Thread luke-tierney

Another option is

Filter(function(x) x[1] == 1, foo)
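
Applied to the list from the original question, this looks as follows (a minimal sketch; foo is copied from the question).

```r
foo <- list(A = c(1, 3), B = c(1, 2), C = c(3, 1))

# Filter() keeps only the elements for which the predicate is TRUE,
# so no NULL placeholder is left behind for C.
kept <- Filter(function(x) x[1] == 1, foo)
print(names(kept))
```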

Best,

luke
On Sat, 12 Jul 2014, ce wrote:



Thanks Jeff et al.,

This is exactly what I needed.

-Original Message-
From: "Jeff Newmiller" [jdnew...@dcn.davis.ca.us]
Date: 07/12/2014 10:38 AM
To: "Uwe Ligges" , "ce" , 
r-help@r-project.org
Subject: Re: [R] lapply returns NULL ?

I think that removing them is something the OP doesn't understand how to do.

The lapply function ALWAYS produces an output element for every input element. 
If this is not what you want then you need to choose a looping structure that 
is not so tightly linked to the input, such as a for loop (untested):

result <- list()
for (nm in names(foo)) {
 if ( 1 == foo[[nm]][1] ) {
   result[[ nm ]] <- foo[[ nm ]]
 }
}
result

or use vector indexing (lists are a special kind of vector) with the 
logical vector that the sapply loop returns:

foo[ sapply(foo,function(v){1==v[1]}) ]

---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
 Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

On July 12, 2014 6:37:44 AM PDT, Uwe Ligges  
wrote:



On 12.07.2014 15:25, ce wrote:



Dear all,

I have a list of arrays :

foo<-list(A = c(1,3), B =c(1, 2), C = c(3, 1))


foo

$A
[1] 1 3

$B
[1] 1 2

$C
[1] 3 1


if( foo$C[1] == 1 ) foo$C[1]



  lapply(foo, function(x) if(x[1] == 1 )  x  )


$A
[1] 1 3

$B
[1] 1 2

$C
NULL

I don't want to list $C NULL  in the output. How I can do that ?


Either use your own print function or, if you do not want NULL elements

in the object, remove them.

Best,
Uwe Ligges






--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



Re: [R] lapply returns NULL ?

2014-07-12 Thread ce

Thanks Jeff et al.,

This is exactly what I needed.

-Original Message-
From: "Jeff Newmiller" [jdnew...@dcn.davis.ca.us]
Date: 07/12/2014 10:38 AM
To: "Uwe Ligges" , "ce" , 
r-help@r-project.org
Subject: Re: [R] lapply returns NULL ?

I think that removing them is something the OP doesn't understand how to do.

The lapply function ALWAYS produces an output element for every input element. 
If this is not what you want then you need to choose a looping structure that 
is not so tightly linked to the input, such as a for loop (untested):

result <- list()
for (nm in names(foo)) {
  if ( 1 == foo[[nm]][1] ) {
    result[[ nm ]] <- foo[[ nm ]]
  }
}
result

or use vector indexing (lists are a special kind of vector) with the 
logical vector that the sapply loop returns:

foo[ sapply(foo,function(v){1==v[1]}) ]

---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On July 12, 2014 6:37:44 AM PDT, Uwe Ligges  
wrote:
>
>
>On 12.07.2014 15:25, ce wrote:
>>
>>
>> Dear all,
>>
>> I have a list of arrays :
>>
>> foo<-list(A = c(1,3), B =c(1, 2), C = c(3, 1))
>>
>>> foo
>> $A
>> [1] 1 3
>>
>> $B
>> [1] 1 2
>>
>> $C
>> [1] 3 1
>>
>>> if( foo$C[1] == 1 ) foo$C[1]
>>
>>>   lapply(foo, function(x) if(x[1] == 1 )  x  )
>>
>> $A
>> [1] 1 3
>>
>> $B
>> [1] 1 2
>>
>> $C
>> NULL
>>
>> I don't want to list $C NULL  in the output. How I can do that ?
>
>Either use your own print function or, if you do not want NULL elements
>
>in the object, remove them.
>
>Best,
>Uwe Ligges
>
>


Re: [R] lapply returns NULL ?

2014-07-12 Thread Jeff Newmiller
I think that removing them is something the OP doesn't understand how to do.

The lapply function ALWAYS produces an output element for every input element. 
If this is not what you want then you need to choose a looping structure that 
is not so tightly linked to the input, such as a for loop (untested):

result <- list()
for (nm in names(foo)) {
  if ( 1 == foo[[nm]][1] ) {
    result[[ nm ]] <- foo[[ nm ]]
  }
}
result

or use vector indexing (lists are a special kind of vector) with the 
logical vector that the sapply loop returns:

foo[ sapply(foo,function(v){1==v[1]}) ]
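
A tested variant of the two approaches above, as a sketch. vapply() is swapped in for sapply() here because it guarantees one logical value per element; that substitution is a suggestion of this sketch, not part of the original reply.

```r
foo <- list(A = c(1, 3), B = c(1, 2), C = c(3, 1))

# Logical index: one TRUE/FALSE per list element.
keep <- vapply(foo, function(v) v[1] == 1, logical(1))
by_index <- foo[keep]

# The for loop from above, collecting only the matching elements.
result <- list()
for (nm in names(foo)) {
  if (foo[[nm]][1] == 1) {
    result[[nm]] <- foo[[nm]]
  }
}
```

Both by_index and result contain elements A and B only.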

---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On July 12, 2014 6:37:44 AM PDT, Uwe Ligges  
wrote:
>
>
>On 12.07.2014 15:25, ce wrote:
>>
>>
>> Dear all,
>>
>> I have a list of arrays :
>>
>> foo<-list(A = c(1,3), B =c(1, 2), C = c(3, 1))
>>
>>> foo
>> $A
>> [1] 1 3
>>
>> $B
>> [1] 1 2
>>
>> $C
>> [1] 3 1
>>
>>> if( foo$C[1] == 1 ) foo$C[1]
>>
>>>   lapply(foo, function(x) if(x[1] == 1 )  x  )
>>
>> $A
>> [1] 1 3
>>
>> $B
>> [1] 1 2
>>
>> $C
>> NULL
>>
>> I don't want to list $C NULL  in the output. How I can do that ?
>
>Either use your own print function or, if you do not want NULL elements
>
>in the object, remove them.
>
>Best,
>Uwe Ligges
>
>


Re: [R] lapply returns NULL ?

2014-07-12 Thread Rui Barradas

Hello,

Try the following.

res <- lapply(foo, function(x) if(x[1] == 1 )  x  )
res[!sapply(res, is.null)]

Hope this helps,

Rui Barradas

Em 12-07-2014 14:25, ce escreveu:



Dear all,

I have a list of arrays :

foo<-list(A = c(1,3), B =c(1, 2), C = c(3, 1))


foo

$A
[1] 1 3

$B
[1] 1 2

$C
[1] 3 1


if( foo$C[1] == 1 ) foo$C[1]



  lapply(foo, function(x) if(x[1] == 1 )  x  )


$A
[1] 1 3

$B
[1] 1 2

$C
NULL

I don't want to list $C NULL  in the output. How I can do that ?


Re: [R] lapply returns NULL ?

2014-07-12 Thread Uwe Ligges



On 12.07.2014 15:25, ce wrote:



Dear all,

I have a list of arrays :

foo<-list(A = c(1,3), B =c(1, 2), C = c(3, 1))


foo

$A
[1] 1 3

$B
[1] 1 2

$C
[1] 3 1


if( foo$C[1] == 1 ) foo$C[1]



  lapply(foo, function(x) if(x[1] == 1 )  x  )


$A
[1] 1 3

$B
[1] 1 2

$C
NULL

I don't want to list $C NULL  in the output. How I can do that ?


Either use your own print function or, if you do not want NULL elements 
in the object, remove them.


Best,
Uwe Ligges





[R] lapply returns NULL ?

2014-07-12 Thread ce


Dear all,

I have a list of arrays :

foo<-list(A = c(1,3), B =c(1, 2), C = c(3, 1)) 

> foo
$A
[1] 1 3

$B
[1] 1 2

$C
[1] 3 1

> if( foo$C[1] == 1 ) foo$C[1]

>  lapply(foo, function(x) if(x[1] == 1 )  x  )

$A
[1] 1 3

$B
[1] 1 2

$C
NULL

I don't want $C NULL listed in the output. How can I do that?


Re: [R] Lapply to create sub categories based on categorical data

2014-02-02 Thread arun
Hi,
Try:
x <- 
c(rep("A",0.1*1),rep("B",0.2*1),rep("C",0.65*1),rep("D",0.05*1))
set.seed(24)
categorical_data <- sample(x,1)
set.seed(49)
p_val <- runif(1,0,1) 

combi <- data.frame(V1=categorical_data,V2=p_val) 
variables <- unique(combi$V1)
 res <- lapply(levels(variables),function(x){ combi$NEWVAR<-(combi$V1==x)*1; 
combi})
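
A self-contained sketch of the same idea. The sample size n is an assumption, since the sizes in the code above did not survive in the archive, and sample() with prob is swapped in for the rep() construction. The values are looped over via unique() rather than levels(), because data.frame() no longer converts strings to factors by default in R 4.0.0 and later.

```r
n <- 1000  # assumed sample size, not taken from the original post
set.seed(24)
combi <- data.frame(
  V1 = sample(c("A", "B", "C", "D"), n, replace = TRUE,
              prob = c(0.10, 0.20, 0.65, 0.05)),
  V2 = runif(n),
  stringsAsFactors = FALSE
)

categories <- sort(unique(combi$V1))
res <- lapply(categories, function(x) {
  out <- combi                           # work on a copy of the data frame
  out$NEWVAR <- as.integer(out$V1 == x)  # 1 for this category, 0 otherwise
  out                                    # return the whole data frame
})
names(res) <- categories
```

The attempt in the question returned only 0 for each element because a function returns the value of its last expression, which there was the assignment of 0.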


A.K.


I was wondering if you kind folks could answer a question I have. In the sample 
data I've provided below, in column 1 I have a categorical 
variable labeled A,B,C and D, and in column 2 simulated p-values. 

x <- 
c(rep("A",0.1*1),rep("B",0.2*1),rep("C",0.65*1),rep("D",0.05*1))
 
categorical_data=as.matrix(sample(x,1)) 
p_val=as.matrix(runif(1,0,1)) 
combi=as.data.frame(cbind(categorical_data,p_val)) 

This is simulated data, but my example comes out as 

head(combi) 
  V1                V2 
1  A 0.484525170875713 
2  C  0.48046557046473 
3  C 0.228440979029983 
4  B 0.216991128632799 
5  C 0.521497668232769 
6  D 0.358560319757089 

I want to now take one of the categorical variables, let's say 
"C", and create another variable (coded as 1 if it's C or 0 if it 
isn't). 

combi$NEWVAR[combi$V1=="C"] <-1 
combi$NEWVAR[combi$V1!="C"] <-0 

  V1                V2 NEWVAR 
1  A 0.484525170875713 0 
2  C  0.48046557046473 1 
3  C 0.228440979029983 1 
4  B 0.216991128632799 0 
5  C 0.521497668232769 1 
6  D 0.358560319757089 0 

I'd like to do this for each of the variables in V1, creating a new table each 
time, by looping over using lapply: 

variables=unique(combi$V1) 

loopeddata=lapply(variables,function(x){ 
combi$NEWVAR[combi$V1==x] <-1 
combi$NEWVAR[combi$V1!=x]<-0 
} 
) 

My output however looks like this: 

[[1]] 
[1] 0 

[[2]] 
[1] 0 

[[3]] 
[1] 0 

[[4]] 
[1] 0 

My desired output would be like the table in the second block of
 code, but when looping over the third column would be A=1, while 
B,C,D=0. Then B=1, A,C,D=0 etc, for each table created. 

Any help would be very much appreciated



Re: [R] lapply?

2013-11-14 Thread Toth, Denes

Hi,

the output of lapply() is a list; see ?lapply and ?sapply.

# if you know the length of your list in advance,
# this definition is better:
uu <- vector("list", 2)

# list elements
uu[[1]] <- c(1,2,3)
uu[[2]] <- c(3,4,5)


# some options to achieve what you want:
matrix(unlist(uu), 2, 3, byrow = TRUE)
do.call(rbind, uu)
t(sapply(uu, I))
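
A quick check that the options agree, as a sketch. identity() is used in place of I() in the third variant so that no "AsIs" class is involved; that change is a choice made for this sketch.

```r
uu <- list(c(1, 2, 3), c(3, 4, 5))

m1 <- matrix(unlist(uu), nrow = 2, ncol = 3, byrow = TRUE)
m2 <- do.call(rbind, uu)       # same as rbind(uu[[1]], uu[[2]])
m3 <- t(sapply(uu, identity))  # sapply binds by column, t() flips to rows
```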


HTH,
  Denes


> Hi,
>
> I was trying to use lapply to create a matrix from a list:
>
> uu <- list()
> uu[[1]] <- c(1,2,3)
> uu[[2]] <- c(3,4,5)
>
> The output I desire is a matrix with 2 rows and 3 columns, so I try:
>
> xx <- lapply(uu,rbind)
>
> Obviously, I'm not doing something right, but what!?
>


Re: [R] lapply?

2013-11-14 Thread Brian Smith
Thanks all! So many ways


On Thu, Nov 14, 2013 at 10:35 AM, Rui Barradas  wrote:

> Hello,
>
> You are applying rbind to each element of the list, not rbinding it with
> the others. Try instead
>
> do.call(rbind, uu)
>
> Hope this helps,
>
> Rui Barradas
>
> On 14-11-2013 15:20, Brian Smith wrote:
>
>> Hi,
>>
>> I was trying to use lapply to create a matrix from a list:
>>
>> uu <- list()
>> uu[[1]] <- c(1,2,3)
>> uu[[2]] <- c(3,4,5)
>>
>> The output I desire is a matrix with 2 rows and 3 columns, so I try:
>>
>> xx <- lapply(uu,rbind)
>>
>> Obviously, I'm not doing something right, but what!?
>>



Re: [R] lapply?

2013-11-14 Thread Rui Barradas

Hello,

You are applying rbind to each element of the list, not rbinding it with 
the others. Try instead


do.call(rbind, uu)
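
To see the difference, here is a small sketch using the uu from the question. lapply() calls rbind() once per element, so each call sees a single vector and returns a 1 x 3 matrix, whereas do.call() builds the single call rbind(uu[[1]], uu[[2]]).

```r
uu <- list(c(1, 2, 3), c(3, 4, 5))

xx <- lapply(uu, rbind)   # two separate calls: rbind(uu[[1]]), rbind(uu[[2]])
class(xx)                 # still a list
lapply(xx, dim)           # each element is a 1 x 3 matrix

yy <- do.call(rbind, uu)  # one call: rbind(uu[[1]], uu[[2]])
dim(yy)
```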

Hope this helps,

Rui Barradas

On 14-11-2013 15:20, Brian Smith wrote:

Hi,

I was trying to use lapply to create a matrix from a list:

uu <- list()
uu[[1]] <- c(1,2,3)
uu[[2]] <- c(3,4,5)

The output I desire is a matrix with 2 rows and 3 columns, so I try:

xx <- lapply(uu,rbind)

Obviously, I'm not doing something right, but what!?



Re: [R] lapply?

2013-11-14 Thread Berend Hasselman

On 14-11-2013, at 16:20, Brian Smith  wrote:

> Hi,
> 
> I was trying to use lapply to create a matrix from a list:
> 
> uu <- list()
> uu[[1]] <- c(1,2,3)
> uu[[2]] <- c(3,4,5)
> 
> The output I desire is a matrix with 2 rows and 3 columns, so I try:
> 
> xx <- lapply(uu,rbind)
> 
> Obviously, I'm not doing something right, but what!?


do.call(rbind,uu) 

Berend



[R] lapply?

2013-11-14 Thread Brian Smith
Hi,

I was trying to use lapply to create a matrix from a list:

uu <- list()
uu[[1]] <- c(1,2,3)
uu[[2]] <- c(3,4,5)

The output I desire is a matrix with 2 rows and 3 columns, so I try:

xx <- lapply(uu,rbind)

Obviously, I'm not doing something right, but what!?



Re: [R] lapply to multivariate function?

2013-09-01 Thread Rui Barradas

Hello,

I have no experience with packages foreach and doMC.
But I believe that parallel computing only pays if the datasets are 
really large, due to the setup time. Maybe "thousands of observations" 
is not that large.


Rui Barradas


Re: [R] lapply to multivariate function?

2013-09-01 Thread Ignacio Martinez
Thanks a lot Rui. Loops make sense to me. I made one modification to your
code. I have thousands of observations, so I would like to run it in
parallel. This is my reproducible example:

# Make Data Frame for video actions between given times for user X
DataVideoActionT <- function (userX, Time1, Time2, Time3){
  #Get data for user X
  videoActionsX<-subset(videoLectureActions, username==userX)
  #Time1 = before first attempt
  videoActionsX_T1<-subset(videoActionsX, eventTimestampTime1)
  #Time3= before last attempt
  videoActionsX_T3<-subset(videoActionsX, eventTimestampTime1)

  error1 = sum(videoActionsX_T1$type==" error ")
  pause1 = sum(videoActionsX_T1$type==" pause ")
  play1 = sum(videoActionsX_T1$type==" play ")
  ratechange1 = sum(videoActionsX_T1$type==" ratechange ")
  seeked1 = sum(videoActionsX_T1$type==" seeked ")
  stalled1 = sum(videoActionsX_T1$type==" stalled ")

  error2 = sum(videoActionsX_T2$type==" error ")
  pause2 = sum(videoActionsX_T2$type==" pause ")
  play2 = sum(videoActionsX_T2$type==" play ")
  ratechange2 = sum(videoActionsX_T2$type==" ratechange ")
  seeked2 = sum(videoActionsX_T2$type==" seeked ")
  stalled2 = sum(videoActionsX_T2$type==" stalled ")

  error3 = sum(videoActionsX_T3$type==" error ")
  pause3 = sum(videoActionsX_T3$type==" pause ")
  play3 = sum(videoActionsX_T3$type==" play ")
  ratechange3 = sum(videoActionsX_T3$type==" ratechange ")
  seeked3 = sum(videoActionsX_T3$type==" seeked ")
  stalled3 = sum(videoActionsX_T3$type==" stalled ")

  data<-data.frame(anon_ID=userX,
   error1 = error1,
   pause1 = pause1,
   play1 = play1,
   ratechange1 = ratechange1,
   seeked1=seeked1,
   stalled1=stalled1,
   error2 = error2,
   pause2 = pause2,
   play2 = play2,
   ratechange2 = ratechange2,
   seeked2 =seeked2,
   stalled2 = stalled2,
   error3 = error3,
   pause3 = pause3,
   play3 = play3,
   ratechange3 = ratechange3,
   seeked3 = seeked3,
   stalled3 = stalled3)
  return(data)
}

videoLectureActions<-structure(list(username = c("exampleID1", "exampleID1",
"exampleID1", "exampleID2", "exampleID2", "exampleID2", "exampleID3",
"exampleID3", "exampleID3", "exampleID3"), currentTime = c("103.701247",
"103.701247", "107.543877", "107.543877", "116.456507", "116.456507",
"119.987188", "177.816693", "183.417124", "183.417124"), playbackRate =
c("null", "null", "null", "null", "null", "null", "null", "null", "null",
"null"), pause = c("true", "false", "true", "false", "true", "false",
"true", "false", "true", "false"), error = c("null", "null", "null",
"null", "null", "null", "null", "null", "null", "null"), networkState =
c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1"), readyState = c("4",
"4", "4", "4", "4", "4", "4", "4", "4", "4"), lectureID =
c("exampleLectureID1", "exampleLectureID1", "exampleLectureID1",
"exampleLectureID1", "exampleLectureID1", "exampleLectureID1",
"exampleLectureID1", "exampleLectureID1", "exampleLectureID1",
"exampleLectureID1"), eventTimestamp = c("2013-03-04 18:51:49",
"2013-03-04 18:51:50", "2013-03-04 18:51:54", "2013-03-04 18:51:56",
"2013-03-04 18:52:05", "2013-03-04 18:52:07", "2013-03-04 18:52:11",
"2013-03-04 18:59:17", "2013-03-04 18:59:23", "2013-03-04 18:59:31"
), initTimestamp = c("2013-03-04 18:44:15", "2013-03-04 18:44:15",
"2013-03-04 18:44:15", "2013-03-04 18:44:15", "2013-03-04 18:44:15",
"2013-03-04 18:44:15", "2013-03-04 18:44:15", "2013-03-04 18:44:15",
"2013-03-04 18:44:15", "2013-03-04 18:44:15"), type = c(" pause ",
" play ", " pause ", " play ", " pause ", " play ", " pause ",
" play ", " pause ", " play "), prevTime = c("103.701247 ",
"103.701247 ", "107.543877 ", "107.543877 ", "116.456507 ",
"116.456507 ", "119.987188 ", "177.816693 ", "183.417124 ",
"183.417124 ")), .Names = c("username", "currentTime", "playbackRate",
"pause", "error", "networkState", "readyState", "lectureID",
"eventTimestamp", "initTimestamp", "type", "prevTime"), row.names =

Re: [R] lapply to multivariate function?

2013-09-01 Thread Rui Barradas

Hello,

Your example doesn't really run, but from what I've seen, if your second 
data frame is named dat2, something along the lines of the following
might work:


n <- nrow(dat2)
res <- vector("list", n)  # preallocate a list with n empty slots
for(i in seq_len(n)){
	res[[i]] <- with(dat2, DataVideoActionT(anon_ID[i], Time1[i], TimeM[i],
	                                        TimeL[i]))
}

do.call(rbind, res)


Rui Barradas


Re: [R] lapply to multivariate function?

2013-09-01 Thread Ignacio Martinez
I hope this reproducible example helps to explain what I'm trying to do.

This is the function:

# Make Data Frame for video actions between given times for user X
DataVideoActionT <- function (userX, Time1, Time2, Time3){
  #Get data for user X
  videoActionsX<-subset(videoLectureActions, username==userX)
  #Time1 = before first attempt
  videoActionsX_T1<-subset(videoActionsX, eventTimestampTime1)
  #Time3= before last attempt
  videoActionsX_T3<-subset(videoActionsX, eventTimestampTime1)

  error1 = sum(videoActionsX_T1$type==" error ")
  pause1 = sum(videoActionsX_T1$type==" pause ")
  play1 = sum(videoActionsX_T1$type==" play ")
  ratechange1 = sum(videoActionsX_T1$type==" ratechange ")
  seeked1 = sum(videoActionsX_T1$type==" seeked ")
  stalled1 = sum(videoActionsX_T1$type==" stalled ")

  error2 = sum(videoActionsX_T2$type==" error ")
  pause2 = sum(videoActionsX_T2$type==" pause ")
  play2 = sum(videoActionsX_T2$type==" play ")
  ratechange2 = sum(videoActionsX_T2$type==" ratechange ")
  seeked2 = sum(videoActionsX_T2$type==" seeked ")
  stalled2 = sum(videoActionsX_T2$type==" stalled ")

  error3 = sum(videoActionsX_T3$type==" error ")
  pause3 = sum(videoActionsX_T3$type==" pause ")
  play3 = sum(videoActionsX_T3$type==" play ")
  ratechange3 = sum(videoActionsX_T3$type==" ratechange ")
  seeked3 = sum(videoActionsX_T3$type==" seeked ")
  stalled3 = sum(videoActionsX_T3$type==" stalled ")

  data<-data.frame(anon_ID=userX,
   error1 = error1,
   pause1 = pause1,
   play1 = play1,
   ratechange1 = ratechange1,
   seeked1=seeked1,
   stalled1=stalled1,
   error2 = error2,
   pause2 = pause2,
   play2 = play2,
   ratechange2 = ratechange2,
   seeked2 =seeked2,
   stalled2 = stalled2,
   error3 = error3,
   pause3 = pause3,
   play3 = play3,
   ratechange3 = ratechange3,
   seeked3 = seeked3,
   stalled3 = stalled3)
  return(data)
}

This is the videoActionsX  dataframe:

structure(list(username = c("exampleID1", "exampleID1", "exampleID1",
"exampleID2", "exampleID2", "exampleID2", "exampleID3", "exampleID3",
"exampleID3", "exampleID3"), currentTime = c("103.701247", "103.701247",
"107.543877", "107.543877", "116.456507", "116.456507", "119.987188",
"177.816693", "183.417124", "183.417124"), playbackRate = c("null",
"null", "null", "null", "null", "null", "null", "null", "null",
"null"), pause = c("true", "false", "true", "false", "true",
"false", "true", "false", "true", "false"), error = c("null",
"null", "null", "null", "null", "null", "null", "null", "null",
"null"), networkState = c("1", "1", "1", "1", "1", "1", "1",
"1", "1", "1"), readyState = c("4", "4", "4", "4", "4", "4",
"4", "4", "4", "4"), lectureID = c("exampleLectureID1",
"exampleLectureID1", "exampleLectureID1", "exampleLectureID1",
"exampleLectureID1", "exampleLectureID1", "exampleLectureID1",
"exampleLectureID1", "exampleLectureID1", "exampleLectureID1"),
eventTimestamp = c("2013-03-04 18:51:49", "2013-03-04 18:51:50",
"2013-03-04 18:51:54", "2013-03-04 18:51:56", "2013-03-04 18:52:05",
"2013-03-04 18:52:07", "2013-03-04 18:52:11", "2013-03-04 18:59:17",
"2013-03-04 18:59:23", "2013-03-04 18:59:31"), initTimestamp =
c("2013-03-04 18:44:15", "2013-03-04 18:44:15", "2013-03-04 18:44:15",
"2013-03-04 18:44:15", "2013-03-04 18:44:15", "2013-03-04 18:44:15",
"2013-03-04 18:44:15", "2013-03-04 18:44:15", "2013-03-04 18:44:15",
"2013-03-04 18:44:15"), type = c(" pause ", " play ", " pause ",
" play ", " pause ", " play ", " pause ", " play ", " pause ",
" play "), prevTime = c("103.701247 ", "103.701247 ", "107.543877 ",
"107.543877 ", "116.456507 ", "116.456507 ", "119.987188 ",
"177.816693 ", "183.417124 ", "183.417124 ")), .Names = c("username",
"currentTime", "playbackRate", "pause", "error", "networkState",
"readySta

Re: [R] lapply to multivariate function?

2013-09-01 Thread Bert Gunter
Oh, another possibility is ?mapply, which I should have pointed out in my
previous reply. Sorry.
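
A minimal sketch of the mapply() approach follows. It assumes a toy stand-in for the real function: f and dat below are hypothetical, and only the column names userX, Time1, Time2 and Time3 come from the original question. SIMPLIFY = FALSE keeps the one-row data frames in a list so that they can be stacked with do.call(rbind, ...).

```r
# Hypothetical input: one row per user, columns of mixed types
dat <- data.frame(userX = c("u1", "u2", "u3"),
                  Time1 = c(1, 2, 3),
                  Time2 = c(4, 5, 6),
                  Time3 = c(7, 8, 9),
                  stringsAsFactors = FALSE)

# Hypothetical stand-in for a function returning a one-row data frame
f <- function(userX, Time1, Time2, Time3) {
  data.frame(anon_ID = userX, span = Time3 - Time1, mid = Time2,
             stringsAsFactors = FALSE)
}

# mapply() walks the four columns in parallel, one row's values per call
rows <- mapply(f, dat$userX, dat$Time1, dat$Time2, dat$Time3,
               SIMPLIFY = FALSE, USE.NAMES = FALSE)
res <- do.call(rbind, rows)
res
```

Because each column is passed separately, no coercion to a common type takes place, which is the problem with apply() on a mixed data frame.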

-- Bert


On Sun, Sep 1, 2013 at 8:30 AM, Bert Gunter  wrote:

> Rui et al.:
>
> But apply will not work if the data frame has columns of different
> classes/types, as appears to be the case here. Viz, from ?apply:
>
> "If X is not an array but an object of a class with a non-null dim value
> (such as a data frame), apply attempts to coerce it to an array via
> as.matrix if it is two-dimensional (e.g., a data frame) or via as.array."
>
> Simply looping by rows (via for() ) appears to be the simplest and
> probably fastest solution. There are other ways via tapply() and friends,
> but these are also essentially loops and are likely to incur some
> additional overhead.
>
> All assuming I understand what the OP has requested, of course.
>
> Cheers,
>
> Bert
>
>
> On Sun, Sep 1, 2013 at 7:31 AM, Rui Barradas  wrote:
>
>> Hello,
>>
>> Maybe you need apply, not lapply. It seems you want to apply() a function
>> to the first dimension of your data.frame, something like
>>
>> apply(dat, 1, fun)  #apply by rows
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> On 01-09-2013 15:00, Ignacio Martinez wrote:
>>
>>> I have a Data Frame that contains, between other things, the following
>>> fields: userX, Time1, Time2, Time3. The number of observations is 2000.
>>>
>>> I have a function that has as inputs userX, Time1, Time2, Time3 and
>>> return
>>> a data frame with 1 observation and 19 variables.
>>>
>>> I want to apply that function to all the observations of the first data
>>> frame to make a new data frame with 2000 observations and 19 variables.
>>>
>>> I thought about using lapply, but if I understand correctly, it only
>>> takes
>>> one variable.
>>>
>>> Can somebody point me in the right direction?
>>>
>>> Thanks!
>>>
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
>
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] lapply to multivariate function?

2013-09-01 Thread Bert Gunter
Rui et al.:

But apply will not work if the data frame has columns of different
classes/types, as appears to be the case here. Viz, from ?apply:

"If X is not an array but an object of a class with a non-null dim value
(such as a data frame), apply attempts to coerce it to an array via
as.matrix if it is two-dimensional (e.g., a data frame) or via as.array."

Simply looping by rows (via for() ) appears to be the simplest and probably
fastest solution. There are other ways via tapply() and friends, but these
are also essentially loops and are likely to incur some additional overhead.
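
The coercion can be seen in a small sketch (dat is a made-up data frame with one character and one numeric column). Because as.matrix() has to settle on a single type, every row reaches the function as a character vector.

```r
dat <- data.frame(userX = c("u1", "u2"), Time1 = c(1.5, 2.5),
                  stringsAsFactors = FALSE)

# apply() converts dat with as.matrix(); mixed columns become character
row_classes <- apply(dat, 1, function(r) class(r))
row_classes

# Column-wise access keeps the original types
col_classes <- vapply(dat, class, character(1))
col_classes
```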

All assuming I understand what the OP has requested, of course.

Cheers,

Bert


On Sun, Sep 1, 2013 at 7:31 AM, Rui Barradas  wrote:

> Hello,
>
> Maybe you need apply, not lapply. It seems you want to apply() a function
> to the first dimension of your data.frame, something like
>
> apply(dat, 1, fun)  #apply by rows
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 01-09-2013 15:00, Ignacio Martinez wrote:
>
>> I have a Data Frame that contains, between other things, the following
>> fields: userX, Time1, Time2, Time3. The number of observations is 2000.
>>
>> I have a function that has as inputs userX, Time1, Time2, Time3 and return
>> a data frame with 1 observation and 19 variables.
>>
>> I want to apply that function to all the observations of the first data
>> frame to make a new data frame with 2000 observations and 19 variables.
>>
>> I thought about using lapply, but if I understand correctly, it only takes
>> one variable.
>>
>> Can somebody point me in the right direction?
>>
>> Thanks!
>>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] lapply to multivariate function?

2013-09-01 Thread Rui Barradas

Hello,

Maybe you need apply, not lapply. It seems you want to apply() a 
function to the first dimension of your data.frame, something like


apply(dat, 1, fun)  #apply by rows


Hope this helps,

Rui Barradas

On 01-09-2013 15:00, Ignacio Martinez wrote:

I have a data frame that contains, among other things, the following
fields: userX, Time1, Time2, Time3. The number of observations is 2000.

I have a function that takes userX, Time1, Time2, Time3 as inputs and
returns a data frame with 1 observation and 19 variables.

I want to apply that function to all the observations of the first data
frame to make a new data frame with 2000 observations and 19 variables.

I thought about using lapply, but if I understand correctly, it only takes
one variable.

Can somebody point me in the right direction?

Thanks!



[R] lapply to multivariate function?

2013-09-01 Thread Ignacio Martinez
I have a data frame that contains, among other things, the following
fields: userX, Time1, Time2, Time3. The number of observations is 2000.

I have a function that takes userX, Time1, Time2, Time3 as inputs and
returns a data frame with 1 observation and 19 variables.

I want to apply that function to all the observations of the first data
frame to make a new data frame with 2000 observations and 19 variables.

I thought about using lapply, but if I understand correctly, it only takes
one variable.

Can somebody point me in the right direction?

Thanks!



Re: [R] lapply and SpatialGridDataFrame error

2013-01-28 Thread Roger Bivand
Irucka Embry writes:

> 
> Hi all, I have a set of 54 files that I need to convert from ASCII grid
> format to .shp files to .bnd files for BayesX.
> 
> I have the following R code to operate on those files:
> 
> library(maptools)
> library(Grid2Polygons)
> library(BayesX)
> library(BayesXsrc)
> library(R2BayesX)
> 
> readfunct <- function(x)
> {
> u <- readAsciiGrid(x)
> }
> 
> modfilesmore <- paste0("MaxFloodDepth_", 1:54, ".txt")
> modeldepthsmore <- lapply(modfilesmore, readfunct)
> 
> maxdepth.plys <- lapply(modeldepthsmore, Grid2Polygons(modeldepthsmore,
> level = FALSE))
> 
...
> 
> This is the error message that I receive:
> > maxdepth.plys <- lapply(modeldepthsmore,
> Grid2Polygons(modeldepthsmore, level = FALSE))
> Error in Grid2Polygons(modeldepthsmore, level = FALSE) : Grid object not
> of class SpatialGridDataFrame
> 
> Can someone assist me in modifying the R code so that I can convert the
> set of files to .shp files and then to .bnd files for BayesX?

You also posted on R-sig-geo a few hours after posting here - certainly a more
relevant choice of list, but you are rather impatient.

I'm assuming that you have read up on how lapply() works, and realised what is
wrong with your understanding. But just in case, 

> maxdepth.plys <- lapply(modeldepthsmore, Grid2Polygons(modeldepthsmore,
> level = FALSE))

does not pass the list component from modeldepthsmore anywhere, but tries to run
Grid2Polygons on the whole list. Something like (untried):

maxdepth.plys <- lapply(modeldepthsmore, function(x) Grid2Polygons(x, level =
FALSE))

should do that. Please summarise to R-sig-geo.
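
To illustrate the point with a toy function in place of Grid2Polygons (scale_by and grids are made-up names): the second argument of lapply() must be a function, not the result of calling one. Extra arguments can be supplied either through an anonymous function or through lapply()'s "..." argument.

```r
grids <- list(a = 1:3, b = 4:6)

# Hypothetical stand-in for a function that takes one object plus an option
scale_by <- function(x, factor = 1) x * factor

# Wrong: scale_by(grids, factor = 2) would be evaluated first, on the whole
# list, before lapply() ever runs:
# lapply(grids, scale_by(grids, factor = 2))

# Either of these passes one list element at a time
r1 <- lapply(grids, function(x) scale_by(x, factor = 2))
r2 <- lapply(grids, scale_by, factor = 2)

identical(r1, r2)
```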

Roger


> 
> Thank-you.
> 
> Irucka Embry 
>



[R] lapply and SpatialGridDataFrame error

2013-01-27 Thread Irucka Embry
Hi all, I have a set of 54 files that I need to convert from ASCII grid
format to .shp files to .bnd files for BayesX.

I have the following R code to operate on those files:

library(maptools)
library(Grid2Polygons)
library(BayesX)
library(BayesXsrc)
library(R2BayesX)

readfunct <- function(x)
{
u <- readAsciiGrid(x)
}

modfilesmore <- paste0("MaxFloodDepth_", 1:54, ".txt")
modeldepthsmore <- lapply(modfilesmore, readfunct)

maxdepth.plys <- lapply(modeldepthsmore, Grid2Polygons(modeldepthsmore,
level = FALSE))

layers <- paste0("examples/floodlayers_", 1:54)
polyshapes <- lapply(writePolyShape(maxdepth.plys, layers))
shpName <- sub(pattern="(.*)\\.dbf", replacement="\\1",
x=system.file("examples/Flood/layer_.dbf", package="BayesX")) 
floodmaps <- lapply(shp2bnd(shpname=shpName, regionnames="SP_ID"))

## draw the map
drawmap(map=floodmaps)


This is the error message that I receive:
> maxdepth.plys <- lapply(modeldepthsmore,
Grid2Polygons(modeldepthsmore, level = FALSE))
Error in Grid2Polygons(modeldepthsmore, level = FALSE) : Grid object not
of class SpatialGridDataFrame


Can someone assist me in modifying the R code so that I can convert the
set of files to .shp files and then to .bnd files for BayesX?

Thank-you.

Irucka Embry 




Re: [R] lapply (and friends) with data.frames are slow

2013-01-05 Thread Kevin Ushey
Hi David,

Yes, it is - although the SO question was more directed at figuring out why
sapply seemed slower, the question to R-help is more nuanced in "is this
coercion really necessary for data.frames?", and I figured it might take
some more knowledge of R internals / the difference between lists and
data.frames to answer that.

Ie, might we be introducing some weird subtle bug(s) if we called the
*apply functions on an un-coerced data.frame?
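
For reference, a small sketch of the checks being discussed (df is a made-up data frame). It only shows that a data frame fails the is.vector() test and that the results agree with and without the explicit coercion. It says nothing about the cost of as.list() on large data frames.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

is.vector(df)   # FALSE: a data frame carries attributes beyond names
is.object(df)   # TRUE: it has a class attribute
is.list(df)     # TRUE: the underlying storage is a list of columns

# lapply() over the data frame and over its explicit list form agree
same <- identical(lapply(df, class), lapply(as.list(df), class))
same
```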

-Kevin

On Sat, Jan 5, 2013 at 1:18 PM, David Winsemius wrote:

>
> On Jan 5, 2013, at 11:38 AM, Kevin Ushey wrote:
>
>  Hey guys,
>>
>> I noticed something curious in the lapply call. I'll copy+paste the
>> function call here because it's short enough:
>>
>> lapply <- function (X, FUN, ...)
>> {
>>     FUN <- match.fun(FUN)
>>     if (!is.vector(X) || is.object(X))
>>         X <- as.list(X)
>>     .Internal(lapply(X, FUN))
>> }
>>
>> Notice that lapply coerces X to a list if the !is.vector || is.object(X)
>> check passes.
>>
>> Curiously, data.frames fail the test (is.vector(data.frame()) returns
>> FALSE); but it seems that coercion of a data.frame
>> to a list would be unnecessary for the *apply family of functions.
>>
>> Is there a reason why we must coerce data.frames to list for these
>> functions? I thought data.frames were essentially just 'structured lists'?
>>
>> I ask because it is generally quite slow coercing a (large) data.frame to
>> a list, and it seems like this could be avoided for data.frames.
>>
>
> Is this related to this SO question that uses the microbenchmark function
> to illustrate the costs of the (possibly) superfluous coercion?
>
> http://stackoverflow.com/questions/14169818/why-is-sapply-relatively-slow-when-querying-attributes-on-variables-in-a-data-fr
>
> --
>
> David Winsemius, MD
> Alameda, CA, USA
>
>



Re: [R] lapply (and friends) with data.frames are slow

2013-01-05 Thread David Winsemius


On Jan 5, 2013, at 11:38 AM, Kevin Ushey wrote:


> Hey guys,
>
> I noticed something curious in the lapply call. I'll copy+paste the
> function call here because it's short enough:
>
> lapply <- function (X, FUN, ...)
> {
>    FUN <- match.fun(FUN)
>    if (!is.vector(X) || is.object(X))
>        X <- as.list(X)
>    .Internal(lapply(X, FUN))
> }
>
> Notice that lapply coerces X to a list if the !is.vector || is.object(X)
> check passes.
>
> Curiously, data.frames fail the test (is.vector(data.frame()) returns
> FALSE); but it seems that coercion of a data.frame
> to a list would be unnecessary for the *apply family of functions.
>
> Is there a reason why we must coerce data.frames to list for these
> functions? I thought data.frames were essentially just 'structured lists'?
>
> I ask because it is generally quite slow coercing a (large) data.frame to a
> list, and it seems like this could be avoided for data.frames.


Is this related to this SO question that uses the microbenchmark  
function to illustrate the costs of the (possibly) superfluous coercion?


http://stackoverflow.com/questions/14169818/why-is-sapply-relatively-slow-when-querying-attributes-on-variables-in-a-data-fr

--

David Winsemius, MD
Alameda, CA, USA



Re: [R] lapply (and friends) with data.frames are slow

2013-01-05 Thread R. Michael Weylandt
On Sat, Jan 5, 2013 at 7:38 PM, Kevin Ushey  wrote:
> Hey guys,
>
> I noticed something curious in the lapply call. I'll copy+paste the
> function call here because it's short enough:
>
> lapply <- function (X, FUN, ...)
> {
> FUN <- match.fun(FUN)
> if (!is.vector(X) || is.object(X))
> X <- as.list(X)
> .Internal(lapply(X, FUN))
> }
>
> Notice that lapply coerces X to a list if the !is.vector || is.object(X)
> check passes.
>
> Curiously, data.frames fail the test (is.vector(data.frame()) returns
> FALSE); but it seems that coercion of a data.frame
> to a list would be unnecessary for the *apply family of functions.
>
> Is there a reason why we must coerce data.frames to list for these
> functions? I thought data.frames were essentially just 'structured lists'?
>
> I ask because it is generally quite slow coercing a (large) data.frame to a
> list, and it seems like this could be avoided for data.frames.

Not sure it's a huge deal, but

It does seem to be an avoidable function call with something like this:

lapply1 <- function (X, FUN, ...)
{
    FUN <- match.fun(FUN)
    # Coerce unless X is already a plain vector/list or a data.frame
    if (!((is.vector(X) && !is.object(X)) || is.data.frame(X)))
        X <- as.list(X)
    .Internal(lapply(X, FUN))
}

On a microbenchmark:

xx <- data.frame(rnorm(5e7), rexp(5e7), runif(5e7))
xx <- cbind(xx, xx, xx, xx, xx)

system.time(lapply(xx, range))
system.time(lapply1(xx, range))

It saves me about 50% of the time -- that's of course only using a
relatively cheap FUN argument.

Others will hopefully comment more

M



[R] lapply (and friends) with data.frames are slow

2013-01-05 Thread Kevin Ushey
Hey guys,

I noticed something curious in the lapply call. I'll copy+paste the
function call here because it's short enough:

lapply <- function (X, FUN, ...)
{
    FUN <- match.fun(FUN)
    if (!is.vector(X) || is.object(X))
        X <- as.list(X)
    .Internal(lapply(X, FUN))
}

Notice that lapply coerces X to a list if the !is.vector || is.object(X)
check passes.

Curiously, data.frames fail the test (is.vector(data.frame()) returns
FALSE); but it seems that coercion of a data.frame
to a list would be unnecessary for the *apply family of functions.

Is there a reason why we must coerce data.frames to list for these
functions? I thought data.frames were essentially just 'structured lists'?

I ask because it is generally quite slow coercing a (large) data.frame to a
list, and it seems like this could be avoided for data.frames.
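
For readers following along, here is a small sketch of why a data.frame takes the as.list() branch in the function quoted above. It assumes only base R; the toy data frame is made up for illustration.

```r
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))

is.list(df)     # TRUE: the columns are stored as list elements
is.vector(df)   # FALSE: attributes other than names are present
is.object(df)   # TRUE: the object carries a class attribute

# Stripping the class and row names by hand gives the same plain list
# that as.list() returns for a data.frame.
plain <- unclass(df)
attr(plain, "row.names") <- NULL
same_as_as_list <- identical(plain, as.list(df))

# lapply() then walks over that list column by column.
col_ranges <- lapply(df, range)
```

So the coercion amounts to dropping the class and row.names attributes; whether that step is costly for a large data.frame is exactly what the timings discussed in this thread try to measure.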

Thanks,
-Kevin



[R] lapply and kernelUD (adehabitatHR package): Home Range kernel estimation for a list of individuals

2012-10-30 Thread nymphita
Dear R experts,

I'm using the adehabitatHR package in order to perform a kernel analysis and
estimate the home range of my input data (GPS relocations of 42
individuals).
I've done the analysis for one of the individuals and it worked perfectly
(see code below).

But now I'm trying to use a list and call the function lapply to do the same
thing through all the 42 individuals (also see code below), but I'm only
obtaining this error:
Error in seq.default(yli[1], yli[2], by = diff(xg[1:2])) : 
  invalid (to - from)/by in seq(.)

I have browsed the net to find out what it means, but I haven't
found a similar error, so I'm stuck with it...
Any thoughts on what I could be doing wrong would be much appreciated!


See below the code:
FOR ONE ANIMAL NAMED "Gael". IT WORKED PERFECTLY!

># Read a shapefile and convert it into a SpatialPointsDataFrame with its
corresponding CRS
>Gael_WGS84_WorldM <- readShapePoints("900_Gael_WGS84_WorldM", 
proj4string=CRS("+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0
+ellps=WGS84 +datum=WGS84 +units=m +no_defs"))

># Remove all the columns except the name of the animal to use the kernelUD
function. My data looks like this:
> head(Gael_WGS84_WorldM[-c(2:25)])
 coordinates Name
0 (-683614, 4459280) Gael
1 (-769563, 4516660) Gael
2 (-721607, 4431310) Gael
3 (-683613, 4459290) Gael
4 (-765266, 4502750) Gael
5 (-683602, 4459280) Gael
Coordinate Reference System (CRS) arguments: +proj=merc +lon_0=0 +k=1
+x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
+towgs84=0,0,0 

>#  Href Fixed Kernel Density Estimator
>K_Gael_estUDm <- kernelUD(Gael_WGS84_WorldM[-c(2:25)], h="href", grid=500,
kern="bivnorm")

FOR THE LIST OF ANIMALS. EVERYTHING SEEMS TO WORK FINE UNTIL I CALL THE
lapply FUNCTION, THEN IT GIVES AN ERROR:

> # Create a list (vector) with all the animal shapefiles in my folder
> AnimalShapeList <- list.files(path=".", pattern="WGS84_WorldM.shp")

>Animals <- lapply(AnimalShapeList, readShapePoints, 
+   proj4string=CRS("+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0
+ellps=WGS84 +datum=WGS84 +units=m +no_defs"))

> # Create a list of elements with only the coordinates (removing columns
> 2:25). 
> AnimalsXY <- lapply(Animals, "[", TRUE, -c(2:25))

>K_estUDm <- lapply(AnimalsXY, kernelUD, h="href", grid=20, kern="bivnorm")
Error in seq.default(yli[1], yli[2], by = diff(xg[1:2])) : 
  invalid (to - from)/by in seq(.)
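
As an aside on the lapply(Animals, "[", TRUE, -c(2:25)) step: extra arguments given to lapply() are passed on to the function after each list element, so that call is equivalent to Animals[[i]][TRUE, -c(2:25)] for every i. The sketch below shows this with plain data frames, which are only a stand-in for the SpatialPointsDataFrame objects in this post (an assumption; the spatial classes may subset differently). The second animal name is made up, and the sketch does not by itself explain the seq() error.

```r
# Two toy data frames standing in for the per-animal objects
animals <- list(
  data.frame(Name = "Gael", x = 1:3, y = 4:6, extra = 7:9),
  data.frame(Name = "Iris", x = 1:2, y = 3:4, extra = 5:6)
)

# Drop columns 2:3 from every element; TRUE keeps all rows
by_bracket <- lapply(animals, "[", TRUE, -c(2:3))

# The same thing written out with an anonymous function
by_function <- lapply(animals, function(d) d[TRUE, -c(2:3)])

# Inspecting each element before the expensive call can help locate
# the one that triggers an error
sizes <- vapply(by_bracket, nrow, integer(1))
```

Running the estimator on one element at a time, for example in a loop wrapped in tryCatch(), is one way to find out which individual fails.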



--
View this message in context: 
http://r.789695.n4.nabble.com/lapply-and-kernelUD-adehabitatHR-package-Home-Range-kernel-estimation-for-a-list-of-individuals-tp4647934.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lapply with different size lists?

2012-09-11 Thread Rui Barradas

Hello,

When you don't know what's going on, break long instructions into 
simpler ones.

R is good at doing a lot in one line, but when debugging, don't do that.

b = function (m, mat) {
n=nrow (mat)
p=ceiling (n/m)
lapply (1:p, function (l,n,m) {
inf = ((l-1)*m)+1
if (l [...]

If you want 4 column data.frames, uncomment the instruction above (and
comment out the previous one).


Hope this helps,

Rui Barradas
On 11-09-2012 19:44, Rui Esteves wrote:

Hello,

I have 2 functions (a and b)

a = function(n) { matrix (runif(n*2,0.0,1), n) }


b = function (m, matrix) {
 n=nrow (matrix)
 p=ceiling (n/m)
 lapply (1:p, function (l,n,m) {
 inf = ((l-1)*m)+1
 if (l
my.matrix = a(7)
my.matrix

 [,1]  [,2]
[1,] 0.708060983 0.3242221
[2,] 0.356736311 0.1454096
[3,] 0.402880340 0.4763676
[4,] 0.795947223 0.4052168
[5,] 0.001620093 0.2618591
[6,] 0.192215589 0.6595275
[7,] 0.539199304 0.5402015


b (m=6,matrix=my_matrix)

[[1]]
   matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
1  1.000, Inf, 0.7080610, 0.3242221
2  2.000, Inf, 0.3567363, 0.1454096
3  3.000, Inf, 0.4028803, 0.4763676
4  4.000, Inf, 0.7959472, 0.4052168
5  5.0, Inf, 0.001620093, 0.261859077
6  6.000, Inf, 0.1922156, 0.6595275

[[2]]
   matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
1  7.000, Inf, 0.5391993, 0.5402015
2  7.000, Inf, 0.5391993, 0.5402015
3  7.000, Inf, 0.5391993, 0.5402015
4  7.000, Inf, 0.5391993, 0.5402015
5  7.000, Inf, 0.5391993, 0.5402015
6  7.000, Inf, 0.5391993, 0.5402015


It seems like the second list is filled with repeated rows (from 2 to 6)
I would like the second list to stop in the last row of the my_matrix
So, I would like to have the following result:

b (m=6,matrix=my_matrix)

[[1]]
   matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
1  1.000, Inf, 0.7080610, 0.3242221
2  2.000, Inf, 0.3567363, 0.1454096
3  3.000, Inf, 0.4028803, 0.4763676
4  4.000, Inf, 0.7959472, 0.4052168
5  5.0, Inf, 0.001620093, 0.261859077
6  6.000, Inf, 0.1922156, 0.6595275

[[2]]
   matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
1  7.000, Inf, 0.5391993, 0.5402015


Can't I do this with an apply function? Is there any more efficient way
than substituting the lapply by a for loop?

Thanks,
Rui



[R] lapply with different size lists?

2012-09-11 Thread Rui Esteves
Hello,

I have 2 functions (a and b)

a = function(n) { matrix (runif(n*2,0.0,1), n) }
>
>
> b = function (m, matrix) {
> n=nrow (matrix)
> p=ceiling (n/m)
> lapply (1:p, function (l,n,m) {
> inf = ((l-1)*m)+1
> if (l [...] else sup=n
> data.frame (matrix (lapply (inf: sup, function(i)
> c(i, Inf, matrix[i,1], matrix[i,2]) ), nrow=m ) )
> }, n=n, m=m
> )
> }
>
> >my.matrix = a(7)
> >my.matrix
> [,1]  [,2]
> [1,] 0.708060983 0.3242221
> [2,] 0.356736311 0.1454096
> [3,] 0.402880340 0.4763676
> [4,] 0.795947223 0.4052168
> [5,] 0.001620093 0.2618591
> [6,] 0.192215589 0.6595275
> [7,] 0.539199304 0.5402015
>
> > b (m=6,matrix=my_matrix)
> [[1]]
>   matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
> 1  1.000, Inf, 0.7080610, 0.3242221
> 2  2.000, Inf, 0.3567363, 0.1454096
> 3  3.000, Inf, 0.4028803, 0.4763676
> 4  4.000, Inf, 0.7959472, 0.4052168
> 5  5.0, Inf, 0.001620093, 0.261859077
> 6  6.000, Inf, 0.1922156, 0.6595275
>
> [[2]]
>   matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
> 1  7.000, Inf, 0.5391993, 0.5402015
> 2  7.000, Inf, 0.5391993, 0.5402015
> 3  7.000, Inf, 0.5391993, 0.5402015
> 4  7.000, Inf, 0.5391993, 0.5402015
> 5  7.000, Inf, 0.5391993, 0.5402015
> 6  7.000, Inf, 0.5391993, 0.5402015
>

It seems like the second list is filled with repeated rows (from 2 to 6)
I would like the second list to stop in the last row of the my_matrix
So, I would like to have the following result:
> b (m=6,matrix=my_matrix)
[[1]]
  matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
1  1.000, Inf, 0.7080610, 0.3242221
2  2.000, Inf, 0.3567363, 0.1454096
3  3.000, Inf, 0.4028803, 0.4763676
4  4.000, Inf, 0.7959472, 0.4052168
5  5.0, Inf, 0.001620093, 0.261859077
6  6.000, Inf, 0.1922156, 0.6595275

[[2]]
  matrix.lapply.inf.sup..function.i..c.i..Inf..matrix.i..1...matrix.i..
1  7.000, Inf, 0.5391993, 0.5402015


Can't I do this with an apply function? Is there any more efficient way
than substituting the lapply by a for loop?

Thanks,
Rui
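
One possible way to get a shorter final chunk is sketched below. It assumes that each chunk should hold rows inf..sup of the matrix, with sup capped at the last row; the original if line is truncated in this archive, so that bound is a guess. split_rows() and the column names are hypothetical.

```r
split_rows <- function(m, mat) {
  n <- nrow(mat)
  p <- ceiling(n / m)                # number of chunks
  lapply(seq_len(p), function(l) {
    inf <- (l - 1) * m + 1           # first row of chunk l
    sup <- min(l * m, n)             # last row, never past the matrix
    idx <- inf:sup
    # One row per matrix row, so the last chunk is simply shorter
    data.frame(index = idx, upper = Inf, v1 = mat[idx, 1], v2 = mat[idx, 2])
  })
}

set.seed(1)
my_matrix <- matrix(runif(14), nrow = 7)
chunks <- split_rows(m = 6, mat = my_matrix)
chunk_sizes <- vapply(chunks, nrow, integer(1))
```

The repeated rows in the output above appear to come from matrix(..., nrow = m), which always builds m rows and recycles its data to fill them. Building the data.frame directly from the row indices avoids that.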



Re: [R] lapply and paste

2012-03-28 Thread Thomas Lumley
On Thu, Mar 29, 2012 at 7:44 AM, Ed Siefker  wrote:
> Thank you, I was confused about that.  What exactly is lapply for then,
> if R handles this kind of thing automatically?  Are there functions that are
> not "vectorized"?
>

There are, especially ones you write yourself that don't need to be
vectorised on all their arguments.

Also, there may be more than one direction to vectorise: eg mean or
median work on vectors, but need some form of loop to work on data
frames (lapply, apply, for)

And there are structures too complicated to put in a vector: if I have
a list of linear models and want the variance-covariance matrices from
each one, lapply() is a good way to do it.
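
To make the last two points concrete, here is a short sketch using the built-in mtcars data. The data set and the model formulas are arbitrary choices for illustration only.

```r
# median() works on a vector; to get one per column, loop over the columns
col_medians <- sapply(mtcars[, c("mpg", "wt", "hp")], median)

# A list of fitted models cannot be stored in an atomic vector,
# so lapply() is a natural way to pull something out of each one
fits <- list(
  small = lm(mpg ~ wt, data = mtcars),
  large = lm(mpg ~ wt + hp, data = mtcars)
)
vcovs <- lapply(fits, vcov)
```

Each element of vcovs is a matrix whose size depends on the model, which is why the result has to be a list rather than a vector.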

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] lapply and paste

2012-03-28 Thread R. Michael Weylandt
Yes, there are non-vectorized functions, e.g., integrate(). You can also use
lapply() to apply a vectorized function to each element of a list (which is not
what suff was) individually:

x <- list(1:3, 1:4, 1:5, 2:7)
mean(x) # bad
lapply(x, mean) #good

Michael

On Mar 28, 2012, at 2:44 PM, Ed Siefker  wrote:

> Thank you, I was confused about that.  What exactly is lapply for then,
> if R handles this kind of thing automatically?  Are there functions that are
> not "vectorized"?
> 
> 
> On Wed, Mar 28, 2012 at 1:37 PM, R. Michael Weylandt
>  wrote:
>> I think you're confused about the need for lapply -- paste is
>> vectorized so this
>> 
>> paste("filename_", suff, ".ext", sep = "")
>> 
>> will work. But if you want to use lapply (for whatever reason) try this:
>> 
>> lapply(suff, function(x) paste("filename_", x, ".ext", sep = ""))
>> 
>> Michael
>> 
>> On Wed, Mar 28, 2012 at 2:31 PM, Ed Siefker  wrote:
>>> I have a list of suffixes I want to turn into file names with extensions.
>>> 
>>> suff<- c("C1", "C2", "C3")
>>> paste("filename_", suff[[1]], ".ext", sep="")
>>> [1] "filename_C1.ext"
>>> 
>>> How do I use lapply() on that call to paste()?
>>> What's the right way to do this:
>>> 
>>> filenames <-  lapply(suff, paste, ...)
>>> 
>>> ?
>>> 
>>> Can I have lapply() reorder the arguments to FUN?
>>> 



Re: [R] lapply and paste

2012-03-28 Thread Sarah Goslee
suff isn't a list, so lapply() isn't the right choice. How about instead:

> suff<- c("C1", "C2", "C3")
> sapply(suff, function(x)paste("filename_", x, ".ext", sep=""))
               C1                C2                C3
"filename_C1.ext" "filename_C2.ext" "filename_C3.ext"


On Wed, Mar 28, 2012 at 2:31 PM, Ed Siefker  wrote:
> I have a list of suffixes I want to turn into file names with extensions.
>
> suff<- c("C1", "C2", "C3")
> paste("filename_", suff[[1]], ".ext", sep="")
> [1] "filename_C1.ext"
>
> How do I use lapply() on that call to paste()?
> What's the right way to do this:
>
> filenames <-  lapply(suff, paste, ...)
>
> ?
>
> Can I have lapply() reorder the arguments to FUN?
>



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] lapply and paste

2012-03-28 Thread Ed Siefker
Thank you, I was confused about that.  What exactly is lapply for then,
if R handles this kind of thing automatically?  Are there functions that are
not "vectorized"?


On Wed, Mar 28, 2012 at 1:37 PM, R. Michael Weylandt
 wrote:
> I think you're confused about the need for lapply -- paste is
> vectorized so this
>
> paste("filename_", suff, ".ext", sep = "")
>
> will work. But if you want to use lapply (for whatever reason) try this:
>
> lapply(suff, function(x) paste("filename_", x, ".ext", sep = ""))
>
> Michael
>
> On Wed, Mar 28, 2012 at 2:31 PM, Ed Siefker  wrote:
>> I have a list of suffixes I want to turn into file names with extensions.
>>
>> suff<- c("C1", "C2", "C3")
>> paste("filename_", suff[[1]], ".ext", sep="")
>> [1] "filename_C1.ext"
>>
>> How do I use lapply() on that call to paste()?
>> What's the right way to do this:
>>
>> filenames <-  lapply(suff, paste, ...)
>>
>> ?
>>
>> Can I have lapply() reorder the arguments to FUN?
>>



Re: [R] lapply and paste

2012-03-28 Thread R. Michael Weylandt
I think you're confused about the need for lapply -- paste is
vectorized so this

paste("filename_", suff, ".ext", sep = "")

will work. But if you want to use lapply (for whatever reason) try this:

lapply(suff, function(x) paste("filename_", x, ".ext", sep = ""))

Michael
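
For comparison, here are the two forms side by side, plus a vapply() variant that returns a character vector rather than a list. This is only a sketch; paste0() and vapply() are base R functions that were not mentioned in the thread.

```r
suff <- c("C1", "C2", "C3")

# paste() is vectorised over suff, so no loop is needed
vectorised <- paste("filename_", suff, ".ext", sep = "")

# lapply() with an anonymous function puts each element where it is wanted
# among the arguments, which is how the arguments get "reordered"
via_lapply <- lapply(suff, function(x) paste("filename_", x, ".ext", sep = ""))

# vapply() does the same but returns a character vector and checks the type
via_vapply <- vapply(suff, function(x) paste0("filename_", x, ".ext"),
                     character(1), USE.NAMES = FALSE)
```

All three give the same file names; only the container differs (a character vector for the first and third, a list for the second).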

On Wed, Mar 28, 2012 at 2:31 PM, Ed Siefker  wrote:
> I have a list of suffixes I want to turn into file names with extensions.
>
> suff<- c("C1", "C2", "C3")
> paste("filename_", suff[[1]], ".ext", sep="")
> [1] "filename_C1.ext"
>
> How do I use lapply() on that call to paste()?
> What's the right way to do this:
>
> filenames <-  lapply(suff, paste, ...)
>
> ?
>
> Can I have lapply() reorder the arguments to FUN?
>



[R] lapply and paste

2012-03-28 Thread Ed Siefker
I have a list of suffixes I want to turn into file names with extensions.

suff<- c("C1", "C2", "C3")
paste("filename_", suff[[1]], ".ext", sep="")
[1] "filename_C1.ext"

How do I use lapply() on that call to paste()?
What's the right way to do this:

filenames <-  lapply(suff, paste, ...)

?

Can I have lapply() reorder the arguments to FUN?



Re: [R] lapply to change variable names and variable values

2012-03-12 Thread Simon Kiss
Thanks both! That solves it! You've made a very happy newbie!
Simon
On 2012-03-12, at 2:52 PM, Sarah Goslee wrote:

> Hi Simon,
> 
> On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss  wrote:
>> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation 
>> With R and the R Book and can't find an answer.
>> 
>> Sample list of data frames looks as follows:
>> 
>> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 
>> 2006, 2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), 
>> Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 
>> 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
> 
> I tweaked this a bit so that it doesn't actually create df, df2, df3 as well 
> as
> making a list of them, and so that xx doesn't begin with a . and shows up with
> ls(). I don't need invisible objects in my testing session.
> 
> xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004,
> 2005, 2006, 2007), 400)), df2=data.frame(Var1=rep('Tennessee', 400),
> Var2=rep(c(2004,2005,2006,2007), 400)),
> df3=data.frame(Var1=rep('Alaska', 400),
> Var2=rep(c(2004,2005,2006,2007), 400)) )
> 
> 
>> I would like to accomplish the following two tasks.
>> First, I'd like to go through and change the names of each of the data 
>> frames within the list
>> to be 'State' and 'Year'
>> 
>> Second, I'd like to go through and add one year to each of the 'Var2'  
>> variables.
>> 
>> Third, I'd like to then delete those cases in the data frames that have 
>> values of Var2 (or Year) values of 2008.
>> 
>> I could do this manually, but my data are actually bigger than this, plus 
>> I'd really like to learn. I've been trying to use lapply, but I can't get my 
>> head around how it works:
>>  .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
>> just changes the actual list of data frames to a list of the character 
>> string ('State' and 'Year')  How do I actually change the underlying 
>> variable names?
> 
> Your function doesn't return the right thing. To see how it works, it's often 
> a
> good idea to write a stand-alone function and see what it does. For instance,
> 
> rename <- function(x) {
>   colnames(x)<-c('State', 'Year')
>   x
> }
> 
> To me at least, as soon as it's written as a stand-alone it's obvious that
> you have to return x in the last line. You can either use rename() in your
> lapply statement:
> xx<- lapply(xx, rename)
> 
> or you can write the full function into the lapply statement:
>> xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 
>> 2006, 2007), 400)), df2=data.frame(Var1=rep('Tennessee', 400), 
>> Var2=rep(c(2004,2005,2006,2007), 400)), df3=data.frame(Var1=rep('Alaska', 
>> 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
>> xx <- lapply(xx, function(x){ colnames(x)<-c('State', 'Year'); x} )
>> colnames(xx[[1]])
> [1] "State" "Year"
> 
> The same strategy should work for your other needs as well.
> 
> Sarah
> 
> 
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org

*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606



Re: [R] lapply to change variable names and variable values

2012-03-12 Thread R. Michael Weylandt
Your function doesn't return the new data frame but rather the new
names. Note, e.g.

x <- 1:2
names(x) <- letters[1:2]
.Last.value # Not x!

Try this:

.xx<- lapply(.xx, function(x) {colnames(x)<-c('State', 'Year');  x})

or more explicitly

.xx<- lapply(.xx, function(x) {colnames(x)<-c('State', 'Year');  return(x)})
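
The point about return values can be checked directly. In this sketch (toy data, hypothetical function names) the first function ends with the assignment and so returns the new names, while the second returns the modified data frame:

```r
d <- data.frame(Var1 = c("Alabama", "Alaska"), Var2 = c(2004, 2005))

# Last expression is an assignment: its value is the right-hand side
rename_wrong <- function(x) colnames(x) <- c("State", "Year")

# Last expression is x: the renamed copy is what comes back
rename_right <- function(x) {
  colnames(x) <- c("State", "Year")
  x
}

wrong <- rename_wrong(d)   # the character vector of new names
right <- rename_right(d)   # the data frame with its columns renamed
```

Note that d itself is untouched in both cases, because the function works on a copy; that is why the result of lapply() has to be assigned back.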

Michael

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss  wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation 
> With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 
> 2006, 2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), 
> Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 
> 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
>
> I would like to accomplish the following two tasks.
> First, I'd like to go through and change the names of each of the data frames 
> within the list
> to be 'State' and 'Year'
>
> Second, I'd like to go through and add one year to each of the 'Var2'  
> variables.
>
> Third, I'd like to then delete those cases in the data frames that have 
> values of Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd 
> really like to learn. I've been trying to use lapply, but I can't get my head 
> around how it works:
>  .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> just changes the actual list of data frames to a list of the character string 
> ('State' and 'Year')  How do I actually change the underlying variable names?
>
> I'm grateful for your suggestions!
> Simon Kiss
>
> *
> Simon J. Kiss, PhD
> Assistant Professor, Wilfrid Laurier University
> 73 George Street
> Brantford, Ontario, Canada
> N3T 2C9
> Cell: +1 905 746 7606
>



Re: [R] lapply to change variable names and variable values

2012-03-12 Thread Sarah Goslee
Hi Simon,

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss  wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation 
> With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 
> 2006, 2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), 
> Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 
> 400), Var2=rep(c(2004,2005,2006,2007), 400)) )

I tweaked this a bit so that it doesn't actually create df, df2, df3 as well as
making a list of them, and so that xx doesn't begin with a . and shows up with
ls(). I don't need invisible objects in my testing session.

xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004,
2005, 2006, 2007), 400)), df2=data.frame(Var1=rep('Tennessee', 400),
Var2=rep(c(2004,2005,2006,2007), 400)),
df3=data.frame(Var1=rep('Alaska', 400),
Var2=rep(c(2004,2005,2006,2007), 400)) )


> I would like to accomplish the following two tasks.
> First, I'd like to go through and change the names of each of the data frames 
> within the list
> to be 'State' and 'Year'
>
> Second, I'd like to go through and add one year to each of the 'Var2'  
> variables.
>
> Third, I'd like to then delete those cases in the data frames that have 
> values of Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd 
> really like to learn. I've been trying to use lapply, but I can't get my head 
> around how it works:
>  .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> just changes the actual list of data frames to a list of the character string 
> ('State' and 'Year')  How do I actually change the underlying variable names?

Your function doesn't return the right thing. To see how it works, it's often a
good idea to write a stand-alone function and see what it does. For instance,

rename <- function(x) {
   colnames(x)<-c('State', 'Year')
   x
}

To me at least, as soon as it's written as a stand-alone it's obvious that
you have to return x in the last line. You can either use rename() in your
lapply statement:
xx<- lapply(xx, rename)

or you can write the full function into the lapply statement:
> xx<-list(df=data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 
> 2007), 400)), df2=data.frame(Var1=rep('Tennessee', 400), 
> Var2=rep(c(2004,2005,2006,2007), 400)), df3=data.frame(Var1=rep('Alaska', 
> 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
> xx <- lapply(xx, function(x){ colnames(x)<-c('State', 'Year'); x} )
> colnames(xx[[1]])
[1] "State" "Year"

The same strategy should work for your other needs as well.

Sarah



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] lapply to change variable names and variable values

2012-03-12 Thread Steve Lianoglou
Hi,

On Mon, Mar 12, 2012 at 2:37 PM, Simon Kiss  wrote:
> Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation 
> With R and the R Book and can't find an answer.
>
> Sample list of data frames looks as follows:
>
> .xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 
> 2006, 2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), 
> Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 
> 400), Var2=rep(c(2004,2005,2006,2007), 400)) )
>
> I would like to accomplish the following two tasks.
> First, I'd like to go through and change the names of each of the data frames 
> within the list
> to be 'State' and 'Year'
>
> Second, I'd like to go through and add one year to each of the 'Var2'  
> variables.
>
> Third, I'd like to then delete those cases in the data frames that have 
> values of Var2 (or Year) values of 2008.
>
> I could do this manually, but my data are actually bigger than this, plus I'd 
> really like to learn. I've been trying to use lapply, but I can't get my head 
> around how it works:
>  .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
> just changes the actual list of data frames to a list of the character string 
> ('State' and 'Year')  How do I actually change the underlying variable names?

Almost there, you have to return the data.frame you've just changed, eg:

xx <- lapply(.xx, function(x) {
  colnames(x) <- c('state', 'year')
  x
})

If you want to remove the rows that correspond to 2008, you can do this:

xx <- lapply(.xx, function(x) {
  colnames(x) <- c('state', 'year')
  subset(x, year != 2008)
})
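
For completeness, all three steps from the original question (rename, add one year, drop 2008) can be combined in one function. A minimal sketch, assuming a shortened stand-in for the .xx list from the question and the capitalised names 'State' and 'Year':

```r
# Shortened stand-in for the .xx list in the question (4 rows per data frame)
.xx <- list(
  data.frame(Var1 = rep('Alabama', 4),   Var2 = c(2004, 2005, 2006, 2007)),
  data.frame(Var1 = rep('Tennessee', 4), Var2 = c(2004, 2005, 2006, 2007)),
  data.frame(Var1 = rep('Alaska', 4),    Var2 = c(2004, 2005, 2006, 2007))
)

xx <- lapply(.xx, function(x) {
  colnames(x) <- c('State', 'Year')   # step 1: rename the columns
  x$Year <- x$Year + 1                # step 2: add one year
  x[x$Year != 2008, ]                 # step 3: drop 2008; this value is returned
})

sapply(xx, nrow)       # number of rows left in each data frame
unique(xx[[1]]$Year)   # remaining years in the first data frame
```

The key point is the same as above: the last expression in the function is what lapply() collects, so it has to be the data frame itself.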

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] lapply to change variable names and variable values

2012-03-12 Thread Simon Kiss
Hi: I'm sure this is a very easy problem. I've consulted Data Manipulation With 
R and the R Book and can't find an answer.

Sample list of data frames looks as follows: 

.xx<-list(df<-data.frame(Var1=rep('Alabama', 400), Var2=rep(c(2004, 2005, 2006, 
2007), 400)), df2<-data.frame(Var1=rep('Tennessee', 400), 
Var2=rep(c(2004,2005,2006,2007), 400)), df3<-data.frame(Var1=rep('Alaska', 
400), Var2=rep(c(2004,2005,2006,2007), 400)) )

I would like to accomplish the following three tasks. 
First, I'd like to go through and change the names of each of the data frames 
within the list
to be 'State' and 'Year'

Second, I'd like to go through and add one year to each of the 'Var2'  
variables.

Third, I'd like to then delete those cases in the data frames that have values 
of Var2 (or Year) values of 2008.

I could do this manually, but my data are actually bigger than this, plus I'd 
really like to learn. I've been trying to use lapply, but I can't get my head 
around how it works: 
  .xx<- lapply(.xx, function(x) colnames(x)<-c('State', 'Year'))
just changes the actual list of data frames to a list of the character string 
('State' and 'Year')  How do I actually change the underlying variable names?

I'm grateful for your suggestions!
Simon Kiss

*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606



Re: [R] lapply to list of variables

2011-11-10 Thread Kenn Konstabel
Hi hi,

It is much easier to deal with lists than a large number of separate
objects. So the first answer to your question

> How can I apply a function to a list of variables.

.. might be to convert your "list of variables" to a regular list.
Instead of ...

monday <- 1:3
tuesday <- 4:7
wednesday <- 8:100

... you could have:

mydays <- list(monday=1:3, tuesday=4:7, wednesday=8:100)

... and things will be a lot easier. If you happen to have a thousand
of such variables in your workspace then you can convert them to a
list using ...

listvar=list("monday","tuesday","wednesday")
mylist <- lapply(listvar, get)

... but this is not always nice to you, for example, if you have
variables that have the same names as some functions in base packages.
Try this:

a<-1
b<-2
c<-3
mean<-4
listvar <- c("a", "b", "c", "mean")
lapply(listvar, get)

# a safer way would be:
lapply(listvar, get, envir=.GlobalEnv)
# and for a named list, the best I can think of is:
structure(lapply(listvar, get, envir=.GlobalEnv), .Names=listvar)
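
Base R's mget() does the lookup and the naming in one step. A small sketch, reusing the a, b, c and mean variables from above (the name `vals` is only for illustration, and the code is assumed to run at top level so the variables live in .GlobalEnv):

```r
a <- 1
b <- 2
c <- 3
mean <- 4
listvar <- c("a", "b", "c", "mean")

# mget() looks up several names at once and returns a named list;
# envir = .GlobalEnv makes sure the workspace variables are found,
# not the base functions c() and mean()
vals <- mget(listvar, envir = .GlobalEnv)
str(vals)

rm(c, mean)  # remove the masking variables again
```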

And then ...

> func=function(x){x[which(x<=10)]=NA}
> lapply(listvar, func)

do you want actually to change your variables with this? Or just to
have a list with your original variables where any x[x<=10] is set to
NA? In the first case you'll need to do nonstandard tricks that
everyone will say you should avoid (but you can use e.g. `assign`, or
macros - see `defmacro` in package gtools). In the second case you
just need to take care that your function would return a sensible
value. An assignment returns the value that was assigned, in this
case, it is  always NA but if that's not what you meant, you can try

func <- function(x){ x[x<=10] <- NA; x}
lapply(mylist, func)   # apply it to the list of values, not to the names
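
To see the difference, compare the two versions on a small named list. A sketch with made-up values (the names `func_bad` and `func_ok` exist only for this illustration):

```r
mydays <- list(monday = c(213, 5, 345), tuesday = c(3, 56, 9))

# Original version: the last expression is an assignment, so the function
# returns the assigned value (NA), not the modified vector
func_bad <- function(x) { x[which(x <= 10)] = NA }

# Corrected version: return x explicitly after modifying it
func_ok <- function(x) { x[x <= 10] <- NA; x }

str(lapply(mydays, func_bad))  # every element is just NA
str(lapply(mydays, func_ok))   # vectors with values <= 10 replaced by NA
```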

hth


On Tue, Nov 8, 2011 at 6:59 PM, Ana  wrote:
> Hi
>
> Can someone help me with this?
>
> How can I apply a function to a list of variables.
>
> something like this
>
>
> listvar=list("Monday","Tuesday","Wednesday")
> func=function(x){x[which(x<=10)]=NA}
>
> lapply(listvar, func)
>
> were
> Monday=[213,56,345,33,34,678,444]
> Tuesday=[213,56,345,33,34,678,444]
> ...
>
> in my case I have a neverending list of vectors.
>
> Thanks!
>



Re: [R] lapply to list of variables

2011-11-08 Thread Dennis Murphy
Hi:

Here's another way of doing this on the simplified version of your example:

L <- vector('list', 3)  # initialize a list of three components
## populate it
for(i in seq_along(L)) L[[i]] <- rnorm(20, 10, 3)
## name the components
names(L) <- c('Monday', 'Tuesday', 'Wednesday')
## replace values <= 10 with NA
lapply(L, function(x) replace(x, x <= 10, NA))

If you have a large number of atomic objects, you could create a
vector of the object names, collect the objects into a list (e.g., L <-
mget(objnames)) and then mimic the lapply() as above. replace() only
works on vectors, though - Uwe's solution is more general.

HTH,
Dennis

On Tue, Nov 8, 2011 at 8:59 AM, Ana  wrote:
> Hi
>
> Can someone help me with this?
>
> How can I apply a function to a list of variables.
>
> something like this
>
>
> listvar=list("Monday","Tuesday","Wednesday")
> func=function(x){x[which(x<=10)]=NA}
>
> lapply(listvar, func)
>
> were
> Monday=[213,56,345,33,34,678,444]
> Tuesday=[213,56,345,33,34,678,444]
> ...
>
> in my case I have a neverending list of vectors.
>
> Thanks!
>



Re: [R] lapply to list of variables

2011-11-08 Thread Uwe Ligges



On 08.11.2011 17:59, Ana wrote:
> Hi
>
> Can someone help me with this?
>
> How can I apply a function to a list of variables.
>
> something like this
>
> listvar=list("Monday","Tuesday","Wednesday")

This is a list of length one character vectors rather than a "list of 
variables".

> func=function(x){x[which(x<=10)]=NA}

To make it work, redefine:

func <-function(x){
 x <- get(x)
 is.na(x[x<=10]) <- TRUE
 x
}

> lapply(listvar, func)
>
> were
> Monday=[213,56,345,33,34,678,444]
> Tuesday=[213,56,345,33,34,678,444]

This is not R syntax.

> ...
>
> in my case I have a neverending list of vectors.

Then your function will take an infinite amount of time - or you will 
get amazing reputation in computer sciences.

Uwe Ligges

> Thanks!




[R] lapply to list of variables

2011-11-08 Thread Ana
Hi

Can someone help me with this?

How can I apply a function to a list of variables.

something like this


listvar=list("Monday","Tuesday","Wednesday")
func=function(x){x[which(x<=10)]=NA}

lapply(listvar, func)

where
Monday=[213,56,345,33,34,678,444]
Tuesday=[213,56,345,33,34,678,444]
...

in my case I have a neverending list of vectors.

Thanks!



Re: [R] lapply and Two TimeStamps as input

2011-10-31 Thread Pete Brecknock

alaios wrote:
> 
> Dear all,
> 
> I have a function that recognizes the following format for timestamps
> "%Y-%m-%d %H:%M:%S"
> 
> my function takes two input arguments the TimeStart and TimeEnd
> I would like to help me create the right list with pairs of TimeStart and
> TimeEnd which I can feed to lapply (I am using mclapply actually). 
> For every lapply I want two inputs to be fed to my function. I only know
> how to feed one input to the lapply
> 
> Could you please help me with that?
> I would like to thank you in advance for your help
> 
> B.R
> Alex

Is this of any use?

# Data in list - first time is start time, second time is end time
myListStartEnd = list(
  c(strptime("2010-01-01 09:00:00", "%Y-%m-%d %H:%M:%S"),
    strptime("2011-01-01 11:30:00", "%Y-%m-%d %H:%M:%S")),
  c(strptime("2010-12-01 10:00:00", "%Y-%m-%d %H:%M:%S"),
    strptime("2010-12-25 06:00:00", "%Y-%m-%d %H:%M:%S")))

lapply(myListStartEnd,function(x) x[2]-x[1])

# Output
[[1]]
Time difference of 365.1042 days

[[2]]
Time difference of 23.8 days
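
If the function really takes two separate arguments, Map() (or mapply()) walks over two vectors in parallel. A minimal sketch, where `myfun`, `starts` and `ends` are made-up names standing in for the poster's function and data, and the time zone is assumed to be UTC; parallel::mcmapply() accepts the same form of call:

```r
# Hypothetical two-argument function: elapsed time between two timestamps
myfun <- function(TimeStart, TimeEnd) {
  fmt <- "%Y-%m-%d %H:%M:%S"
  difftime(as.POSIXct(TimeEnd, format = fmt, tz = "UTC"),
           as.POSIXct(TimeStart, format = fmt, tz = "UTC"),
           units = "days")
}

starts <- c("2010-01-01 09:00:00", "2010-12-01 10:00:00")
ends   <- c("2011-01-01 11:30:00", "2010-12-25 06:00:00")

# Map() calls myfun(starts[1], ends[1]), then myfun(starts[2], ends[2]), ...
res <- Map(myfun, starts, ends)
res
```

Compared with packing each pair into one list element, this keeps the function signature unchanged.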

HTH

Pete



Re: [R] lapply and Two TimeStamps as input

2011-10-31 Thread Comcast
Provide some sample data. For instance, a  data frame with two columns and the 
function.


On Oct 31, 2011, at 6:37 PM, Alaios  wrote:

> Dear all,
> 
> I have a function that recognizes the following format for timestamps
> "%Y-%m-%d %H:%M:%S"
> 
> my function takes two input arguments the TimeStart and TimeEnd
> I would like to help me create the right list with pairs of TimeStart and 
> TimeEnd which I can feed to lapply (I am using mclapply actually). 
> For every lapply I want two inputs to be fed to my function. I only know how 
> to feed one input to the lapply
> 
> Could you please help me with that?
> I would like to thank you in advance for your help
> 
> B.R
> Alex



[R] lapply and Two TimeStamps as input

2011-10-31 Thread Alaios
Dear all,

I have a function that recognizes the following format for timestamps
"%Y-%m-%d %H:%M:%S"

my function takes two input arguments the TimeStart and TimeEnd
I would like help creating the right list with pairs of TimeStart and 
TimeEnd which I can feed to lapply (I am using mclapply actually). 
For every lapply call I want two inputs to be fed to my function. I only know 
how to feed one input to lapply.

Could you please help me with that?
I would like to thank you in advance for your help

B.R
Alex


Re: [R] lapply to return vector

2011-10-22 Thread Dennis Murphy
do.call(rbind, lapply(...))
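
For a plain vector, unlist() on the lapply() result, or vapply() directly, is usually enough. Note also that lapply() has no simplify argument: in the original call, simplify=TRUE is passed on to min(), which treats it as one more value (TRUE counts as 1). A small sketch with made-up data:

```r
data <- matrix(c(-131, -120, -137, -129, 5, 7), nrow = 2)
df <- as.data.frame(data)    # columns V1, V2, V3

unlist(lapply(df, min))      # named numeric vector of column minima
vapply(df, min, numeric(1))  # same values, with the return type checked

# Pitfall: simplify = TRUE ends up among min()'s arguments,
# so the minimum of V3 (values 5 and 7) is reported as 1
unlist(lapply(df, min, simplify = TRUE))
```

The same unlist() wrapper applies to the list returned by mclapply().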

HTH,
D.

On Sat, Oct 22, 2011 at 1:44 AM, Alaios  wrote:
> Dear all I have wrote the following line
>
>
> return(as.vector(lapply(as.data.frame(data),min,simplify=TRUE)));
>
>
> I want the lapply to return a vector as it returns a list with elements as 
> shown below
>
> List of 30001
> $ V1    : num -131
> $ V2    : num -131
> $ V3    : num -137
> $ V4    : num -129
> $ V5    : num -130
>
>
> as you can see I have already tried the simplify=TRUE and also the 
> as.vector() but both did not help
>
> Why I want to use lapply is because afterwards will be easier to convert it 
> to mclapply that can use many cores to do the work.
>
> I would like to thank you in advance for your help
>
> B.R
> Alex



[R] lapply to return vector

2011-10-22 Thread Alaios
Dear all, I have written the following line


return(as.vector(lapply(as.data.frame(data),min,simplify=TRUE)));


I want the lapply to return a vector as it returns a list with elements as 
shown below

List of 30001
$ V1: num -131
$ V2: num -131
$ V3: num -137
$ V4: num -129
$ V5: num -130


as you can see I have already tried the simplify=TRUE and also the as.vector() 
but both did not help

Why I want to use lapply is because afterwards will be easier to convert it to 
mclapply that can use many cores to do the work.

I would like to thank you in advance for your help

B.R
Alex


Re: [R] lapply, if statement and concatenating to a list

2011-05-05 Thread Kenn Konstabel
Hi Lorenzo,

On Thu, May 5, 2011 at 8:38 AM, Lorenzo Cattarino  wrote:
> Hi R users
>
> I was wondering on how to use lapply & co when the applied function has a 
> conditional statement and the output is a 'growing' object.
> See example below:
>
> list1 <- list('A','B','C')
> list2 <- c()
>
> myfun <- function(x,list2)
> {
>  one_elem <- x
>  cat('one_elem= ', one_elem, '\n')
>  random <- sample(1:2,1)
>  show(random)
>  if(random==2)
>  {
>    list2 <- c(list2,one_elem)
>  }else{
>    list2
>  }
> }
>
> lapply(list1,myfun,list2)
>
> Is there a way to get rid of the 'NULL' elements in the output (when there is 
> any?), without using a for loop?

I don't understand what your example is trying to do and which object
you expect to be "growing". list2 ain't growin', and it's not changing
(i.e., it remains NULL) in your code. Perhaps you meant to have a <<-
there; this would make your list2 "growing", if you really want it to,
but in general, that's a bad idea. Lapply goes best with the
functional style where everything your function does is computing and
returning a value but here you're (if I get your intentions correctly)
counting on side effects. If you like side effects, a for (or while)
loop may be more logical choice.

Getting rid of the NULL elements is simple. One way is:

foo <- lapply(list1, yourfun)
foo[!sapply(foo, is.null)]
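
A side-effect-free variant along the lines suggested above: have the function return either the element or NULL, then drop the NULLs afterwards. A minimal sketch; `keep_or_drop` is a made-up name, and set.seed() is only there to make the run repeatable:

```r
list1 <- list('A', 'B', 'C')

# Return the element itself when the random draw is 2, otherwise NULL
keep_or_drop <- function(x) {
  if (sample(1:2, 1) == 2) x else NULL
}

set.seed(1)  # any seed; only for reproducibility
foo <- lapply(list1, keep_or_drop)

# Filter() keeps the elements for which the predicate returns TRUE
list2 <- Filter(Negate(is.null), foo)
list2
```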

Regards,
Kenn



[R] lapply, if statement and concatenating to a list

2011-05-05 Thread Lorenzo Cattarino
Hi R users

I was wondering on how to use lapply & co when the applied function has a 
conditional statement and the output is a 'growing' object.
See example below:

list1 <- list('A','B','C')
list2 <- c()

myfun <- function(x,list2)
{
  one_elem <- x
  cat('one_elem= ', one_elem, '\n')
  random <- sample(1:2,1)
  show(random)
  if(random==2)
  {
list2 <- c(list2,one_elem)
  }else{
list2
  }
}

lapply(list1,myfun,list2)

Is there a way to get rid of the 'NULL' elements in the output (when there is 
any?), without using a for loop?

Thanks for your help
Lorenzo



Re: [R] lapply sequence

2011-04-20 Thread Duncan Murdoch

On 20/04/2011 7:26 AM, Dean Marks wrote:
> Good day,
>
> My question is: Does the lapply function guarantee a particular sequence in
> which elements are mapped? And, are we guaranteed that lapply will always be
> sequential (i.e. never map elements in parallel) ?

No.

> The reason I ask is if I use lapply with the mapping function set to
> something that has side-effects that need to be executed in a particular
> sequence.

Use a for loop.

> If this is not possible, is there an alternate method other than using a for
> loop?


while or repeat.
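
Where the order of side effects matters, an explicit loop makes the sequence part of the code rather than an assumption about lapply(). A minimal sketch with a made-up `log_step` function, assumed to be run at top level so that `<<-` updates the `trace_log` defined here:

```r
steps <- c("open", "write", "close")
trace_log <- character(0)

# Hypothetical function with a side effect: appends to trace_log
log_step <- function(s) {
  trace_log <<- c(trace_log, s)
  invisible(s)
}

# The for loop runs strictly in index order
for (i in seq_along(steps)) log_step(steps[i])

trace_log
```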

Duncan Murdoch



[R] lapply sequence

2011-04-20 Thread Dean Marks
Good day,

My question is: Does the lapply function guarantee a particular sequence in
which elements are mapped? And, are we guaranteed that lapply will always be
sequential (i.e. never map elements in parallel) ?

The reason I ask is if I use lapply with the mapping function set to
something that has side-effects that need to be executed in a particular
sequence.

If this is not possible, is there an alternate method other than using a for
loop?

-- 
Dean Marks



[R] lapply over list and the respective name of the list item

2011-04-12 Thread Thaler, Thorn, LAUSANNE, Applied Mathematics
Hi all,

I find myself sometimes in the situation where I lapply over a list and
in the particular function I'd like to use the name and or position of
the respective list item. What I usually do is to use mapply on the list
and the names of the list / a position list:

o <- list(A=1:3, B=1:2, C=1)
mapply(function (item, name) paste(name, sum(item), sep="="), o,
names(o))
mapply(function (item, pos) paste(pos, sum(item), sep="="), o,
seq_along(o))

[another way would be, of course, to use a for loop, but I'm slightly
reluctant to use for loops knowing that I definitely misuse lapply
sometimes]

Now I was wondering whether there is a better way of doing this? Or is
the mapply approach already the best way? In other words, is there any
possibility within lapply to get any information about the context of
the respective item?

To give you an example where this could be useful:

Imagine we have a dataframe and we want to replicate (more general apply
a function to) each column and the result should be a dataframe again,
where the new columns bear the name of their respective parent column
amended by some suffix. Using mapply I'd do something like:

d <- data.frame(A=1:3, B=2:4)
Reduce(cbind, mapply(function (col, nam) {
td <- as.data.frame(do.call(cbind, rep(list(col), 3)))
names(td) <- paste(nam, 1:3, sep = "_")
  td}, d, names(d), SIMPLIFY=FALSE))

Any suggestions? Or is it already the state of the art? Any help
appreciated.
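
One common base R idiom is to lapply over the positions or the names rather than the items, so that both are available inside the function. A sketch using the objects from above (`by_pos`, `by_name` and `res` are names chosen only for this illustration):

```r
o <- list(A = 1:3, B = 1:2, C = 1)

# Iterate over positions; look up the name and the item inside the function
by_pos <- lapply(seq_along(o), function(i) paste(names(o)[i], sum(o[[i]]), sep = "="))

# Iterate over names; sapply() on a character vector names the result after it
by_name <- sapply(names(o), function(nm) paste(nm, sum(o[[nm]]), sep = "="))

# Data frame example: three copies of each column, suffixed _1, _2, _3
d <- data.frame(A = 1:3, B = 2:4)
res <- do.call(cbind, lapply(names(d), function(nam) {
  as.data.frame(setNames(rep(list(d[[nam]]), 3), paste(nam, 1:3, sep = "_")))
}))
names(res)
```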

BR Thorn



Re: [R] lapply, strsplit, and list elements

2011-02-04 Thread Greg Snow
Darn,  Good catch, I fell victim to overthinking the problem.

I think I was more thinking of:
'[0-9]+(?=/)'

Which uses the whole match (then I switched thinking and captured the number, 
but did not simplify the other part).  Yours is the best.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Friday, February 04, 2011 12:22 PM
> To: Greg Snow
> Cc: Dick Harray; r-help@r-project.org
> Subject: Re: [R] lapply, strsplit, and list elements
> 
> On Fri, Feb 4, 2011 at 1:27 PM, Greg Snow  wrote:
> > Try this:
> >
> >> x <- c("349/077,349/074,349/100,349/117",
> > +          "340/384.2,340/513,367/139,455/128,D13/168",
> > +          "600/437,128/903,128/904")
> >>
> >> library(gsubfn)
> >> out <- strapply(x, '([0-9]+)(?=/)')
> >> out
> > [[1]]
> > [1] "349" "349" "349" "349"
> >
> > [[2]]
> > [1] "340" "340" "367" "455" "13"
> >
> > [[3]]
> > [1] "600" "128" "128"
> >
> >
> > The strapply looks for the pattern then returns every time it finds
> the pattern.  The pattern in this case is 1 or more digits that are
> followed by a /, but the slash is not included in the matched portion
> (a positive look ahead).
> >
> > If you need more than digits you can modify the pattern to whatever
> matches before the /.
> 
> Also this similar approach with a slight simplification of the regular
> expression:
> 
>strapply(x, '([0-9]+)/')
> 
> or to convert the numbers to numeric at the same time:
> 
>strapply(x, '([0-9]+)/', as.numeric)
> 
> 
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com



Re: [R] lapply, strsplit, and list elements

2011-02-04 Thread Gabor Grothendieck
On Fri, Feb 4, 2011 at 1:27 PM, Greg Snow  wrote:
> Try this:
>
>> x <- c("349/077,349/074,349/100,349/117",
> +          "340/384.2,340/513,367/139,455/128,D13/168",
> +          "600/437,128/903,128/904")
>>
>> library(gsubfn)
>> out <- strapply(x, '([0-9]+)(?=/)')
>> out
> [[1]]
> [1] "349" "349" "349" "349"
>
> [[2]]
> [1] "340" "340" "367" "455" "13"
>
> [[3]]
> [1] "600" "128" "128"
>
>
> The strapply looks for the pattern then returns every time it finds the 
> pattern.  The pattern in this case is 1 or more digits that are followed by a 
> /, but the slash is not included in the matched portion (a positive look 
> ahead).
>
> If you need more than digits you can modify the pattern to whatever matches 
> before the /.

Also this similar approach with a slight simplification of the regular
expression:

   strapply(x, '([0-9]+)/')

or to convert the numbers to numeric at the same time:

   strapply(x, '([0-9]+)/', as.numeric)


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] lapply, strsplit, and list elements

2011-02-04 Thread Henrique Dallazuanna
Try this:

strsplit(x, "/\\d+\\.\\d+,|/\\d+,|/\\d+")

On Fri, Feb 4, 2011 at 1:37 PM, Dick Harray  wrote:

> Hi there,
>
> I have a problem about lapply, strsplit, and accessing list elements,
> which I don't understand or cannot solve:
>
> I have e.g. a character vector with three elements:
>
> x = c("349/077,349/074,349/100,349/117",
> "340/384.2,340/513,367/139,455/128,D13/168",
> "600/437,128/903,128/904")
>
>
> The task I want to perform, is to generate a list, comprising the
> portion in front of the "/" of each element of x:
>
> neededResult = list(c("349","349", "349", "349"),
> c("340", "340", "367", "455", "D13"),
> c("600", "128", "128") )
>
>
> I figured out that for a single element of x the following works
>
> unlist( lapply( strsplit( unlist( strsplit(x[1], "\\,") ), "/"), "[", 1) )
>
> but due to "unlist" it doesn't provide the required result if extended
> to all elements of x
>
> unlist(lapply(strsplit( unlist( lapply(x, strsplit, "\\,")), "/"), "[",))
>
>
> Someone can help me to get the needed result?
>
> Thanks and regards,
>
> Dirk
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] lapply, strsplit, and list elements

2011-02-04 Thread Greg Snow
Try this:

> x <- c("349/077,349/074,349/100,349/117",
+  "340/384.2,340/513,367/139,455/128,D13/168",
+  "600/437,128/903,128/904")
> 
> library(gsubfn)
> out <- strapply(x, '([0-9]+)(?=/)')
> out
[[1]]
[1] "349" "349" "349" "349"

[[2]]
[1] "340" "340" "367" "455" "13" 

[[3]]
[1] "600" "128" "128"


The strapply looks for the pattern then returns every time it finds the 
pattern.  The pattern in this case is 1 or more digits that are followed by a 
/, but the slash is not included in the matched portion (a positive look ahead).

If you need more than digits you can modify the pattern to whatever matches 
before the /.
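
The same look-ahead idea also works in base R via gregexpr() and regmatches(). A sketch that assumes the part before each "/" never contains a comma or a slash, so it also keeps the leading "D" in "D13":

```r
x <- c("349/077,349/074,349/100,349/117",
       "340/384.2,340/513,367/139,455/128,D13/168",
       "600/437,128/903,128/904")

# One or more characters that are neither "," nor "/", followed by "/".
# The look-ahead keeps the slash out of the match; it needs perl = TRUE.
m <- gregexpr("[^,/]+(?=/)", x, perl = TRUE)
out <- regmatches(x, m)
out
```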

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Dick Harray
> Sent: Friday, February 04, 2011 8:37 AM
> To: r-help@r-project.org
> Subject: [R] lapply, strsplit, and list elements
> 
> Hi there,
> 
> I have a problem about lapply, strsplit, and accessing list elements,
> which I don't understand or cannot solve:
> 
> I have e.g. a character vector with three elements:
> 
> x = c("349/077,349/074,349/100,349/117",
>  "340/384.2,340/513,367/139,455/128,D13/168",
>  "600/437,128/903,128/904")
> 
> 
> The task I want to perform, is to generate a list, comprising the
> portion in front of the "/" of each element of x:
> 
> neededResult = list(c("349","349", "349", "349"),
>  c("340", "340", "367", "455", "D13"),
>  c("600", "128", "128") )
> 
> 
> I figured out that for a single element of x the following works
> 
> unlist( lapply( strsplit( unlist( strsplit(x[1], "\\,") ), "/"), "[",
> 1) )
> 
> but due to "unlist" it doesn't provide the required result if extended
> to all elements of x
> 
> unlist(lapply(strsplit( unlist( lapply(x, strsplit, "\\,")), "/"),
> "[",))
> 
> 
> Someone can help me to get the needed result?
> 
> Thanks and regards,
> 
> Dirk
> 



Re: [R] lapply, strsplit, and list elements

2011-02-04 Thread William Dunlap

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Dick Harray
> Sent: Friday, February 04, 2011 7:37 AM
> To: r-help@r-project.org
> Subject: [R] lapply, strsplit, and list elements
> 
> Hi there,
> 
> I have a problem about lapply, strsplit, and accessing list elements,
> which I don't understand or cannot solve:
> 
> I have e.g. a character vector with three elements:
> 
> x = c("349/077,349/074,349/100,349/117",
>  "340/384.2,340/513,367/139,455/128,D13/168",
>  "600/437,128/903,128/904")
> 
>   
> The task I want to perform, is to generate a list, comprising the
> portion in front of the "/" of each element of x:
> 
> neededResult = list(c("349","349", "349", "349"),
>  c("340", "340", "367", "455", "D13"),
>  c("600", "128", "128") )

Try the following, which first splits each string by commas
(returning a list), then removes the first slash and everything
after it (using lapply to maintain the list structure).

   > gotResult <- lapply(strsplit(x, ","), function(xi) gsub("/.*", "", xi))
   > identical(gotResult, neededResult)
   [1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> 
> 
> I figured out that for a single element of x the following works
> 
> unlist( lapply( strsplit( unlist( strsplit(x[1], "\\,") ), 
> "/"), "[", 1) )
> 
> but due to "unlist" it doesn't provide the required result if extended
> to all elements of x
> 
> unlist(lapply(strsplit( unlist( lapply(x, strsplit, "\\,")), 
> "/"), "[",))
> 
> 
> Someone can help me to get the needed result?
> 
> Thanks and regards,
> 
> Dirk
> 



[R] lapply, strsplit, and list elements

2011-02-04 Thread Dick Harray
Hi there,

I have a problem about lapply, strsplit, and accessing list elements,
which I don't understand or cannot solve:

I have e.g. a character vector with three elements:

x = c("349/077,349/074,349/100,349/117",
 "340/384.2,340/513,367/139,455/128,D13/168",
 "600/437,128/903,128/904")


The task I want to perform, is to generate a list, comprising the
portion in front of the "/" of each element of x:

neededResult = list(c("349","349", "349", "349"),
 c("340", "340", "367", "455", "D13"),
 c("600", "128", "128") )


I figured out that for a single element of x the following works

unlist( lapply( strsplit( unlist( strsplit(x[1], "\\,") ), "/"), "[", 1) )

but due to "unlist" it doesn't provide the required result if extended
to all elements of x

unlist(lapply(strsplit( unlist( lapply(x, strsplit, "\\,")), "/"), "[",))


Someone can help me to get the needed result?

Thanks and regards,

Dirk



Re: [R] lapply getting names of the list

2010-12-09 Thread David Winsemius


On Dec 9, 2010, at 2:21 PM, David Winsemius wrote:



On Dec 9, 2010, at 12:44 PM, Sashi Challa wrote:


Hello All,

I have a toy dataframe like this. It has 8 columns separated by tab.

Name       SampleID  Al1  Al2  X       Y       R       Th
rs191191   A1        A    B    0.999   0.09    0.78    0.090
abc928291  A1        B    J    0.3838  0.3839  0.028   0.888
abcnab     A1        H    K    0.3939  0.939   0.3939  0.77
rx82922    B1        J    K    0.3838  0.393   0.393   0.00
rcn3939    B1        M    O    0.000   0.000   0.000   0.77
tcn39399   B1        P    I    0.393   0.393   0.393   0.56


Those were not tabs after being processed by various portions of the  
various mail systems.





Note that the SampleID is repeating. So I want to be able to split  
the dataset based on the SampleID and write the splitted dataset of  
every SampleID into a new file.

I tried split followed by lapply to do this.

infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
infile.split  <- split(infile, infile$SampleID)
names(infile.split[1])  ## outputs “A1”
## now A1, B1 are two lists in infile.split as I understand it.  
Correct me if I am wrong.





See if this works any better:

lapply(names(infile.split), function(nm) {
    # deparse(substitute(x)) recovers the name of a function argument,
    # but inside lapply() it yields something like "X[[i]]" rather than
    # "A1", so loop over the names and index the list instead
    final_filename <- paste(nm, "toy_set.txt", sep = "_")
    write.table(infile.split[[nm]],
                file = paste("", final_filename, sep = "/"),
                row.names = FALSE, quote = FALSE, sep = "\t")
} )


I substituted "" for that path variable that you didn't provide, put
in the ")" that was missing from the write.table file=paste() call, and
I substituted regular double quotes for those damnable smart-quotes
that _your_ mailer inserted.
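
A caveat that may be worth checking before relying on deparse(substitute(x)) here: inside lapply() the argument is passed as an internal indexing expression, so the deparsed text is not the list name. The small sketch below uses a made-up list `pieces` to show this, together with Map() as one way to hand both the element and its name to the function.

```r
pieces <- list(A1 = 1:3, B1 = 4:6)

# What deparse(substitute(x)) sees when the function is called by lapply()
seen <- unlist(lapply(pieces, function(x) deparse(substitute(x))))
print(seen)

# Map() passes the element and its name side by side
labels <- Map(function(x, nm) paste(nm, length(x), sep = ":"),
              pieces, names(pieces))
print(unlist(labels))
```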


--
David


In lapply I wanted to give a unique filename to all the split  
Sample Ids, i.e. name them here as .
How do I get those names, i.e. A1, B1 to a create a filename like  
above.


names(infile.split) <- c("A1_toy_set.txt", "B1_toy_set.txt")

When I write each of the element in the list obtained after split  
into a file,


How are you proposing to do this "writing"?

the column names would have names like A1.Name, A1.SampleID,  
A1.Al1, …..


Are you sure? Why would you think that?

--
David.

Can I get rid of “A1” in the column names within the lapply (other  
than reading in the file again and changing the names) ?


Thanks for your time,

Regards
Sashi



David Winsemius, MD
West Hartford, CT



[R] lapply getting names of the list

2010-12-09 Thread Sashi Challa
Thanks a lot Joshua, that works perfectly.
It had not occurred to me to lapply over the names instead of the data itself.
I no longer see the SampleID names in the column names.
Thanks for your time,

-Sashi

-Original Message-
From: Joshua Wiley [mailto:jwiley.ps...@gmail.com] 
Sent: Thursday, December 09, 2010 10:07 AM
To: Sashi Challa
Cc: r-help@R-project.org
Subject: Re: [R] lapply getting names of the list

Hi Sashi,

On Thu, Dec 9, 2010 at 9:44 AM, Sashi Challa  wrote:
> Hello All,
>
> I have a toy dataframe like this. It has 8 columns separated by tab.
>
> Name    SampleID        Al1     Al2     X       Y       R       Th
> rs191191        A1      A       B       0.999   0.09    0.78    0.090
> abc928291       A1      B       J       0.3838  0.3839  0.028   0.888
> abcnab  A1      H       K       0.3939  0.939   0.3939  0.77
> rx82922 B1      J       K       0.3838  0.393   0.393   0.00
> rcn3939 B1      M       O       0.000   0.000   0.000   0.77
> tcn39399        B1      P       I       0.393   0.393   0.393   0.56
>
> Note that the SampleID is repeating. So I want to be able to split the 
> dataset based on the SampleID and write the splitted dataset of every 
> SampleID into a new file.
> I tried split followed by lapply to do this.
>
> infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
> infile.split  <- split(infile, infile$SampleID)
> names(infile.split[1])  ## outputs “A1”

correct, names() returns the top level names of infile.split (i.e.,
the two data frame names)

> ## now A1, B1 are two lists in infile.split as I understand it. Correct me if 
> I am wrong.

It is a single, named list containing two data frames (A1 and B1)
(though data frames are built from lists, I think, so I suppose in a
way it contains two lists, but that is not really the point).

>
> lapply(infile.split,function(x){
>              filename <- names(x)  here I expect to see A1 or B1, I 
> didn’t, I tried (names(x)[1]) and that gave me “Name” and not A1 or B1.

by using lapply() on the actual object, your function is getting each
element of the list.  That is:

infile.split[[1]]
infile.split[[2]]

trying names() on those:

names(infile.split[[1]])

should show what you are getting

>              final_filename <- paste(filename,”toy_set.txt”,sep=”_”)
>              write.table(x, file = paste(path, final_filename,sep=”/”, 
> row.names=FALSE, quote=FALSE,sep=”\t”)

FYI I think you are missing a parenthesis in there somewhere
>  } )
>
> In lapply I wanted to give a unique filename to all the split Sample Ids, 
> i.e. name them here as A1_toy_set.txt, B1_toy_set_txt.
> How do I get those names, i.e. A1, B1 to a create a filename like above.

Try this:

## read your data from the clipboard (obviously you do not need to)
infile <- read.table("clipboard", header = TRUE)
split.infile <- split(infile, infile$SampleID) # split data
path <- "~" # generic path

## rather than applying to the data itself, instead apply to the names
lapply(names(split.infile), function(x) {
  write.table(x = split.infile[[x]],
              file = paste(path, paste(x, "toy_set.txt", sep = "_"), sep = "/"),
              row.names = FALSE, quote = FALSE, sep = "\t")
  cat("wrote ", x, fill = TRUE)
})

It will return a list of two NULLs, but that is fine because it should have
written the files.

> When I write each of the element in the list obtained after split into a 
> file, the column names would have names like A1.Name, A1.SampleID, A1.Al1, 
> ….. Can I get rid of “A1” in the column names within the lapply (other than 
> reading in the file again and changing the names) ?

Can you report the results of str(yourdataframe) ?  I did not have
that issue just copying and pasting from your email and using the code
I showed above.

Cheers,

Josh

>
> Thanks for your time,
>
> Regards
> Sashi
>
>
>        [[alternative HTML version deleted]]
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] lapply getting names of the list

2010-12-09 Thread David Winsemius


On Dec 9, 2010, at 12:44 PM, Sashi Challa wrote:


Hello All,

I have a toy dataframe like this. It has 8 columns separated by tab.

Name       SampleID  Al1  Al2  X       Y       R       Th
rs191191   A1        A    B    0.999   0.09    0.78    0.090
abc928291  A1        B    J    0.3838  0.3839  0.028   0.888
abcnab     A1        H    K    0.3939  0.939   0.3939  0.77
rx82922    B1        J    K    0.3838  0.393   0.393   0.00
rcn3939    B1        M    O    0.000   0.000   0.000   0.77
tcn39399   B1        P    I    0.393   0.393   0.393   0.56

Note that the SampleID is repeating. So I want to be able to split  
the dataset based on the SampleID and write the splitted dataset of  
every SampleID into a new file.

I tried split followed by lapply to do this.

infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
infile.split  <- split(infile, infile$SampleID)
names(infile.split[1])  ## outputs “A1”
## now A1, B1 are two lists in infile.split as I understand it.  
Correct me if I am wrong.


lapply(infile.split,function(x){
 filename <- names(x)  here I expect to see A1 or  
B1, I didn’t, I tried (names(x)[1]) and that gave me “Name” and not  
A1 or B1.

 final_filename <- paste(filename,”toy_set.txt”,sep=”_”)
 write.table(x, file = paste(path,  
final_filename,sep=”/”, row.names=FALSE, quote=FALSE,sep=”\t”)

 } )

In lapply I wanted to give a unique filename to all the split Sample  
Ids, i.e. name them here as .
How do I get those names, i.e. A1, B1 to a create a filename like  
above.


names(infile.split) <- c("A1_toy_set.txt", "B1_toy_set.txt")

When I write each of the element in the list obtained after split  
into a file,


How are you proposing to do this "writing"?

the column names would have names like A1.Name, A1.SampleID, A1.Al1,  
…..


Are you sure? Why would you think that?

--
David.

Can I get rid of “A1” in the column names within the lapply (other  
than reading in the file again and changing the names) ?


Thanks for your time,

Regards
Sashi



David Winsemius, MD
West Hartford, CT



Re: [R] lapply getting names of the list

2010-12-09 Thread Joshua Wiley
Hi Sashi,

On Thu, Dec 9, 2010 at 9:44 AM, Sashi Challa  wrote:
> Hello All,
>
> I have a toy dataframe like this. It has 8 columns separated by tab.
>
> Name    SampleID        Al1     Al2     X       Y       R       Th
> rs191191        A1      A       B       0.999   0.09    0.78    0.090
> abc928291       A1      B       J       0.3838  0.3839  0.028   0.888
> abcnab  A1      H       K       0.3939  0.939   0.3939  0.77
> rx82922 B1      J       K       0.3838  0.393   0.393   0.00
> rcn3939 B1      M       O       0.000   0.000   0.000   0.77
> tcn39399        B1      P       I       0.393   0.393   0.393   0.56
>
> Note that the SampleID is repeating. So I want to be able to split the 
> dataset based on the SampleID and write the splitted dataset of every 
> SampleID into a new file.
> I tried split followed by lapply to do this.
>
> infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
> infile.split  <- split(infile, infile$SampleID)
> names(infile.split[1])  ## outputs “A1”

correct, names() returns the top level names of infile.split (i.e.,
the two data frame names)

> ## now A1, B1 are two lists in infile.split as I understand it. Correct me if 
> I am wrong.

It is a single, named list containing two data frames (A1 and B1)
(though data frames are built from lists, I think, so I suppose in a
way it contains two lists, but that is not really the point).

>
> lapply(infile.split,function(x){
>              filename <- names(x)  here I expect to see A1 or B1, I 
> didn’t, I tried (names(x)[1]) and that gave me “Name” and not A1 or B1.

by using lapply() on the actual object, your function is getting each
element of the list.  That is:

infile.split[[1]]
infile.split[[2]]

trying names() on those:

names(infile.split[[1]])

should show what you are getting

>              final_filename <- paste(filename,”toy_set.txt”,sep=”_”)
>              write.table(x, file = paste(path, final_filename,sep=”/”, 
> row.names=FALSE, quote=FALSE,sep=”\t”)

FYI I think you are missing a parenthesis in there somewhere
>  } )
>
> In lapply I wanted to give a unique filename to all the split Sample Ids, 
> i.e. name them here as A1_toy_set.txt, B1_toy_set_txt.
> How do I get those names, i.e. A1, B1 to a create a filename like above.

Try this:

## read your data from the clipboard (obviously you do not need to)
infile <- read.table("clipboard", header = TRUE)
split.infile <- split(infile, infile$SampleID) # split data
path <- "~" # generic path

## rather than applying to the data itself, instead apply to the names
lapply(names(split.infile), function(x) {
  write.table(x = split.infile[[x]],
              file = paste(path, paste(x, "toy_set.txt", sep = "_"), sep = "/"),
              row.names = FALSE, quote = FALSE, sep = "\t")
  cat("wrote ", x, fill = TRUE)
})

It will return a list of two NULLs, but that is fine because it should have
written the files.
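
For anyone who wants to try the idea without the original test.txt, here is a self-contained sketch of the same split-then-loop-over-names pattern. The data frame `toy` is a cut-down stand-in for the posted data, and writing to tempdir() and collecting the paths with vapply() are choices made only for this illustration.

```r
toy <- data.frame(
  Name     = c("rs191191", "abc928291", "abcnab", "rx82922", "rcn3939", "tcn39399"),
  SampleID = c("A1", "A1", "A1", "B1", "B1", "B1"),
  X        = c(0.999, 0.3838, 0.3939, 0.3838, 0.000, 0.393),
  stringsAsFactors = FALSE
)

pieces  <- split(toy, toy$SampleID)   # named list: one data frame per SampleID
out_dir <- tempdir()                  # scratch directory for the example

# Loop over the names so that each file name can be built from "A1", "B1", ...
files <- vapply(names(pieces), function(id) {
  f <- file.path(out_dir, paste(id, "toy_set.txt", sep = "_"))
  write.table(pieces[[id]], file = f,
              row.names = FALSE, quote = FALSE, sep = "\t")
  f
}, character(1))

print(basename(files))
print(names(read.delim(files[["A1"]])))   # column names carry no "A1." prefix
```

Returning the file path from the function gives a named character vector of what was written instead of a list of NULLs.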

> When I write each of the element in the list obtained after split into a 
> file, the column names would have names like A1.Name, A1.SampleID, A1.Al1, 
> ….. Can I get rid of “A1” in the column names within the lapply (other than 
> reading in the file again and changing the names) ?

Can you report the results of str(yourdataframe) ?  I did not have
that issue just copying and pasting from your email and using the code
I showed above.

Cheers,

Josh

>
> Thanks for your time,
>
> Regards
> Sashi
>
>
>        [[alternative HTML version deleted]]
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] lapply getting names of the list

2010-12-09 Thread Sashi Challa
Hello All,

I have a toy dataframe like this. It has 8 columns separated by tab.

Name       SampleID  Al1  Al2  X       Y       R       Th
rs191191   A1        A    B    0.999   0.09    0.78    0.090
abc928291  A1        B    J    0.3838  0.3839  0.028   0.888
abcnab     A1        H    K    0.3939  0.939   0.3939  0.77
rx82922    B1        J    K    0.3838  0.393   0.393   0.00
rcn3939    B1        M    O    0.000   0.000   0.000   0.77
tcn39399   B1        P    I    0.393   0.393   0.393   0.56

Note that the SampleID values repeat, so I want to split the dataset
by SampleID and write the subset for each SampleID into a
new file.
I tried split followed by lapply to do this.

infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
infile.split  <- split(infile, infile$SampleID)
names(infile.split[1])  ## outputs “A1”
## now A1, B1 are two lists in infile.split as I understand it. Correct me if I 
am wrong.

lapply(infile.split,function(x){
  filename <- names(x)  here I expect to see A1 or B1, I 
didn’t, I tried (names(x)[1]) and that gave me “Name” and not A1 or B1.
  final_filename <- paste(filename,”toy_set.txt”,sep=”_”)
  write.table(x, file = paste(path, final_filename,sep=”/”, 
row.names=FALSE, quote=FALSE,sep=”\t”)
  } )

In lapply I wanted to give a unique filename to each of the split SampleIDs, i.e.
name them here as A1_toy_set.txt, B1_toy_set.txt.
How do I get those names, i.e. A1, B1, to create a filename like the above?
When I write each of the elements in the list obtained after split into a file,
the column names have names like A1.Name, A1.SampleID, A1.Al1, ….. Can
I get rid of “A1” in the column names within the lapply (other than reading
in the file again and changing the names)?

Thanks for your time,

Regards
Sashi




Re: [R] lapply to subsets

2010-10-12 Thread Feng Li
Yes, that is what I want...
Thanks.

Feng

On Wed, Oct 13, 2010 at 6:38 AM, Michael Bedward
wrote:

> Hello Feng,
>
> I think you just want this...
>
> lapply(A, function(x) apply(x[,,-c(1,2)], c(1,2), mean))
>
> Michael
>
>
> On 13 October 2010 04:00, Feng Li  wrote:
> > Dear R,
> >
> > I have a silly question concerns with *apply. Say I have a list called A,
> >
> > A <- list(a  =  array(1:20, c(2, 2, 5)), b  = array(1:30, c(2, 3, 5)))
> >
> > I wish to calculate the mean of A$a, and A$b w.r.t. their third dimension
> so
> > I did
> >
> > lapply(A,apply,c(1,2),mean)
> >
> > Now if I still wish to do the above task but take away some burn-in, e.g.
> do
> > not take A$a[,,1:2],and A$b[,,1:2] into account. How can I do then?
> >
> >
> > Thanks!
> >
> >
> >
> > Feng
> >
> >
> > --
> > Feng Li
> > Department of Statistics
> > Stockholm University
> > 106 91 Stockholm, Sweden
> > http://feng.li/
> >
>



-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/



Re: [R] lapply to subsets

2010-10-12 Thread Michael Bedward
Hello Feng,

I think you just want this...

lapply(A, function(x) apply(x[,,-c(1,2)], c(1,2), mean))
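
A quick check of what this does on the posted list, written as a sketch: the name `burn_in` and the `drop = FALSE` guard are additions for the illustration, and rowMeans() with dims = 2 is shown only as a possible shortcut for the same averages.

```r
A <- list(a = array(1:20, c(2, 2, 5)), b = array(1:30, c(2, 3, 5)))
burn_in <- 1:2   # slices of the third dimension to discard

# Average over the third dimension after dropping the burn-in slices;
# drop = FALSE keeps three dimensions even if a single slice remains
means <- lapply(A, function(x)
  apply(x[, , -burn_in, drop = FALSE], c(1, 2), mean))

# rowMeans() with dims = 2 averages over everything past the first two dims
means2 <- lapply(A, function(x)
  rowMeans(x[, , -burn_in, drop = FALSE], dims = 2))

print(means$a)
```

For example, A$a[1, 1, ] is 1, 5, 9, 13, 17, so after dropping the first two slices the mean of 9, 13, 17 is 13.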

Michael


On 13 October 2010 04:00, Feng Li  wrote:
> Dear R,
>
> I have a silly question concerns with *apply. Say I have a list called A,
>
> A <- list(a  =  array(1:20, c(2, 2, 5)), b  = array(1:30, c(2, 3, 5)))
>
> I wish to calculate the mean of A$a, and A$b w.r.t. their third dimension so
> I did
>
> lapply(A,apply,c(1,2),mean)
>
> Now if I still wish to do the above task but take away some burn-in, e.g. do
> not take A$a[,,1:2],and A$b[,,1:2] into account. How can I do then?
>
>
> Thanks!
>
>
>
> Feng
>
>
> --
> Feng Li
> Department of Statistics
> Stockholm University
> 106 91 Stockholm, Sweden
> http://feng.li/
>
>        [[alternative HTML version deleted]]
>
>



[R] lapply to subsets

2010-10-12 Thread Feng Li
Dear R,

I have a silly question concerning *apply. Say I have a list called A,

A <- list(a  =  array(1:20, c(2, 2, 5)), b  = array(1:30, c(2, 3, 5)))

I wish to calculate the mean of A$a, and A$b w.r.t. their third dimension so
I did

lapply(A,apply,c(1,2),mean)

Now suppose I still wish to do the above task but discard some burn-in, e.g.
not take A$a[,,1:2] and A$b[,,1:2] into account. How can I do that?


Thanks!



Feng


-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/



Re: [R] lapply and boxplots with variable names

2010-06-22 Thread Henrique Dallazuanna
Try this:

library(lattice)
bwplot(values ~ TimePeriod | ind, cbind(stack(my.data), TimePeriod =
my.data$TimePeriod))

On Tue, Jun 22, 2010 at 1:45 PM, Shawn Morrison <
shawn.morri...@dryasresearch.com> wrote:

> Hi all,
>
> I have a dataset with several variables, each of which is a separate
> column. For each variable, I want to produce a boxplot and include the name
> of the variable (ie, column name) on each plot.
>
> I have included a sample dataset below. Can someone tell me where I am
> going wrong?
>
> Thank you for your help,
> Shawn Morrison
>
> # Generate a sample dataset
> var1 = rnorm(1000)
> var2 = rnorm(1000)
> TimePeriod = rep((LETTERS[1:4]), 250)
>
> my.data = as.data.frame(cbind(var1, var2, TimePeriod)); summary(my.data)
> attach(my.data)
>
> # Create box plots for var1 and var2 using TimePeriod on the x-axis
> lapply(my.data[,1:2], function(y) {
>boxplot(y~TimePeriod,
>main = y
>data = my.data)
>
>})
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] lapply and boxplots with variable names

2010-06-22 Thread Shawn Morrison

Thanks Josh,

I do want to see each plot. I took your code and modified it (below) and 
it appears to do what I wanted:


my.data <- data.frame(var1=rnorm(1000), var2=rnorm(1000),
TimePeriod=factor(rep((LETTERS[1:4]), 250)))
str(my.data)

lapply(names(my.data[ , 1:2]), function(y) {
 quartz()
  boxplot(my.data[, y] ~ my.data[, "TimePeriod"],
  main = y)
})



On 22/06/10 11:25 AM, Joshua Wiley wrote:

Hello Shawn,

Does this do what you want?  I'm assuming you want to look at each
plot, so I added a call to par().

###
my.data<- data.frame(var1=rnorm(1000), var2=rnorm(1000),
TimePeriod=factor(rep((LETTERS[1:4]), 250)))
str(my.data)

lapply(names(my.data[ , 1:2]), function(y) {
   old.par<- par(no.readonly = TRUE)
   on.exit(par(old.par))
   par("ask"=TRUE)
   boxplot(my.data[, y] ~ my.data[, "TimePeriod"],
   main = y)
})
##

HTH,

Josh


On Tue, Jun 22, 2010 at 9:45 AM, Shawn Morrison
  wrote:
   

Hi all,

I have a dataset with several variables, each of which is a separate column.
For each variable, I want to produce a boxplot and include the name of the
variable (ie, column name) on each plot.

I have included a sample dataset below. Can someone tell me where I am going
wrong?

Thank you for your help,
Shawn Morrison

# Generate a sample dataset
var1 = rnorm(1000)
var2 = rnorm(1000)
TimePeriod = rep((LETTERS[1:4]), 250)

my.data = as.data.frame(cbind(var1, var2, TimePeriod)); summary(my.data)
attach(my.data)

# Create box plots for var1 and var2 using TimePeriod on the x-axis
lapply(my.data[,1:2], function(y) {
boxplot(y~TimePeriod,
main = y
data = my.data)
})


 








Re: [R] lapply and boxplots with variable names

2010-06-22 Thread Joshua Wiley
Hello Shawn,

Does this do what you want?  I'm assuming you want to look at each
plot, so I added a call to par().

###
my.data <- data.frame(var1=rnorm(1000), var2=rnorm(1000),
TimePeriod=factor(rep((LETTERS[1:4]), 250)))
str(my.data)

lapply(names(my.data[ , 1:2]), function(y) {
  old.par <- par(no.readonly = TRUE)
  on.exit(par(old.par))
  par("ask"=TRUE)
  boxplot(my.data[, y] ~ my.data[, "TimePeriod"],
  main = y)
})
##

HTH,

Josh


On Tue, Jun 22, 2010 at 9:45 AM, Shawn Morrison
 wrote:
> Hi all,
>
> I have a dataset with several variables, each of which is a separate column.
> For each variable, I want to produce a boxplot and include the name of the
> variable (ie, column name) on each plot.
>
> I have included a sample dataset below. Can someone tell me where I am going
> wrong?
>
> Thank you for your help,
> Shawn Morrison
>
> # Generate a sample dataset
> var1 = rnorm(1000)
> var2 = rnorm(1000)
> TimePeriod = rep((LETTERS[1:4]), 250)
>
> my.data = as.data.frame(cbind(var1, var2, TimePeriod)); summary(my.data)
> attach(my.data)
>
> # Create box plots for var1 and var2 using TimePeriod on the x-axis
> lapply(my.data[,1:2], function(y) {
>    boxplot(y~TimePeriod,
>                main = y
>                data = my.data)
>    })
>
>



-- 
Joshua Wiley
Ph.D. Student
Health Psychology
University of California, Los Angeles



Re: [R] lapply and boxplots with variable names

2010-06-22 Thread Shawn Morrison

Many thanks Phil, that does help.

If I could ask a follow-up, how do I put each plot in its own device 
window? Right now, all I get is the boxplot for var2 (var1 gets 
overwritten?). I tried putting quartz() before the boxplot command but 
got an error message.


Cheers,
Shawn

On 22/06/10 11:13 AM, Phil Spector wrote:

Shawn -
   Does this example help?  (Please don't use cbind when creating
a data frame, since it first creates a matrix, which means everything
must be of the same mode.)


var1 = rnorm(1000)
var2 = rnorm(1000)
TimePeriod = rep((LETTERS[1:4]), 250)
my.data = data.frame(var1,var2,TimePeriod)
lapply(names(my.data)[1:2],
       function(y) boxplot(formula(paste(y, 'TimePeriod', sep = '~')),
                           main = y, data = my.data))


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Tue, 22 Jun 2010, Shawn Morrison wrote:


Hi all,

I have a dataset with several variables, each of which is a separate 
column. For each variable, I want to produce a boxplot and include 
the name of the variable (ie, column name) on each plot.


I have included a sample dataset below. Can someone tell me where I 
am going wrong?


Thank you for your help,
Shawn Morrison

# Generate a sample dataset
var1 = rnorm(1000)
var2 = rnorm(1000)
TimePeriod = rep((LETTERS[1:4]), 250)

my.data = as.data.frame(cbind(var1, var2, TimePeriod)); summary(my.data)
attach(my.data)

# Create box plots for var1 and var2 using TimePeriod on the x-axis
lapply(my.data[,1:2], function(y) {
   boxplot(y~TimePeriod,
   main = y
   data = my.data)
   })






Re: [R] lapply and boxplots with variable names

2010-06-22 Thread Phil Spector

Shawn -
   Does this example help?  (Please don't use cbind when creating
a data frame, since it first creates a matrix, which means everything
must be of the same mode.)
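
To see the effect concretely, the sketch below builds the same two columns both ways on a tiny made-up sample; `via_cbind` and `direct` are names invented for the illustration.

```r
var1 <- c(0.5, -1.2, 0.3, 2.0)
TimePeriod <- LETTERS[1:4]

# cbind() first builds a character matrix, so var1 stops being numeric
via_cbind <- as.data.frame(cbind(var1, TimePeriod), stringsAsFactors = FALSE)

# data.frame() keeps each column's own type
direct <- data.frame(var1, TimePeriod, stringsAsFactors = FALSE)

print(sapply(via_cbind, class))
print(sapply(direct, class))
```

With the cbind route a boxplot of var1 would be working from text, not numbers, which is one reason the original attempt misbehaves.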


var1 = rnorm(1000)
var2 = rnorm(1000)
TimePeriod = rep((LETTERS[1:4]), 250)
my.data = data.frame(var1,var2,TimePeriod)
lapply(names(my.data)[1:2],
       function(y) boxplot(formula(paste(y, 'TimePeriod', sep = '~')),
                           main = y, data = my.data))
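
If each plot should be kept separately rather than drawn over the previous one, one option is to open a device per variable. The sketch below is an assumption-laden variant of the code above: it writes each boxplot to its own PDF in tempdir(), builds the formula with reformulate() instead of paste(), and uses a smaller sample. For on-screen windows, dev.new() in place of pdf() should behave similarly on most platforms.

```r
set.seed(1)
my.data <- data.frame(var1 = rnorm(100), var2 = rnorm(100),
                      TimePeriod = factor(rep(LETTERS[1:4], 25)))

plot_files <- vapply(c("var1", "var2"), function(v) {
  f <- file.path(tempdir(), paste0(v, "_boxplot.pdf"))
  pdf(f)               # one device (here: one file) per variable
  on.exit(dev.off())   # close the device when this function returns
  boxplot(reformulate("TimePeriod", response = v),   # e.g. var1 ~ TimePeriod
          data = my.data, main = v)
  f
}, character(1))

print(basename(plot_files))
```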


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Tue, 22 Jun 2010, Shawn Morrison wrote:


Hi all,

I have a dataset with several variables, each of which is a separate column. 
For each variable, I want to produce a boxplot and include the name of the 
variable (ie, column name) on each plot.


I have included a sample dataset below. Can someone tell me where I am going 
wrong?


Thank you for your help,
Shawn Morrison

# Generate a sample dataset
var1 = rnorm(1000)
var2 = rnorm(1000)
TimePeriod = rep((LETTERS[1:4]), 250)

my.data = as.data.frame(cbind(var1, var2, TimePeriod)); summary(my.data)
attach(my.data)

# Create box plots for var1 and var2 using TimePeriod on the x-axis
lapply(my.data[,1:2], function(y) {
   boxplot(y~TimePeriod,
   main = y
   data = my.data)
   })






[R] lapply and boxplots with variable names

2010-06-22 Thread Shawn Morrison

Hi all,

I have a dataset with several variables, each of which is a separate 
column. For each variable, I want to produce a boxplot and include the 
name of the variable (ie, column name) on each plot.


I have included a sample dataset below. Can someone tell me where I am 
going wrong?


Thank you for your help,
Shawn Morrison

# Generate a sample dataset
var1 = rnorm(1000)
var2 = rnorm(1000)
TimePeriod = rep((LETTERS[1:4]), 250)

my.data = as.data.frame(cbind(var1, var2, TimePeriod)); summary(my.data)
attach(my.data)

# Create box plots for var1 and var2 using TimePeriod on the x-axis
lapply(my.data[,1:2], function(y) {
boxplot(y~TimePeriod,
main = y
data = my.data)
})



Re: [R] lapply or data.table to find a unit's previous transaction

2010-06-03 Thread Matthew Dowle
William,

Try a rolling join in data.table, something like this (untested) :

Data = data.table(Data)        # setkey() needs a data.table, not a data.frame
setkey(Data, UnitID, TranDt)   # sort by unit then date
previous = transform(Data, TranDt=TranDt-1)
# look up the prevailing date before, if any, for each row within that
# row's UnitID
Data[previous, roll=TRUE]

That's all it is, no loops required. That should be fast and memory
efficient. 100's of times faster than a subquery in SQL.
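
As a cross-check on small data, the same "previous transaction date" can also be obtained in base R by sorting and shifting. To be clear, this is a different technique from the rolling join above, not a data.table example; it assumes no two transactions of a unit share a date, and the sample values and the column name `PrevTranDt` are invented for the sketch.

```r
Data <- data.frame(
  TranID = c("t1", "t2", "t3", "t4", "t5"),
  UnitID = c("u1", "u1", "u2", "u1", "u2"),
  TranDt = as.Date(c("2001-03-01", "2000-01-15", "2005-06-30",
                     "2003-07-04", "2007-01-01")),
  stringsAsFactors = FALSE
)

# Sort by unit, then by date within unit
Data <- Data[order(Data$UnitID, Data$TranDt), ]

# Shift the date column down by one row ...
prev <- c(as.Date(NA), head(Data$TranDt, -1))

# ... and blank it wherever the row above belongs to a different unit
same_unit <- c(FALSE, head(Data$UnitID, -1) == tail(Data$UnitID, -1))
prev[!same_unit] <- NA

Data$PrevTranDt <- prev
print(Data)
```

This is a single pass over sorted vectors, so it avoids building one subset per transaction.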

If you have trouble please follow up on datatable-help.

Matthew


"William Rogers"  wrote in message 
news:aanlktikk_avupm7j108iseryo9fucpnjhanxpaqvt...@mail.gmail.com...
I have a dataset of property transactions that includes the
transaction ID (TranID), property ID (UnitID), and transaction date
(TranDt). I need to create a data frame (or data table) that includes
the previous transaction date, if one exists.
This is an easy problem in SQL, where I just run a sub-query, but I'm
trying to make R my one-stop-shopping program. The following code
works on a subset of my data, but I can't run this on my full dataset
because my computer runs out of memory after about 30 minutes. (Using
a 32-bit machine.)
Use the following synthetic data for example.

n<- 100
TranID<- lapply(n:(2*n), function(x) (
as.matrix(paste(x, sample(seq(as.Date('2000-01-01'),
as.Date('2010-01-01'), "days"), sample(1:5, 1)), sep= "D"), ncol= 1)))
TranID<- do.call("rbind", TranID)
UnitID<- substr(TranID, 1, nchar(n))
TranDt<- substr(TranID, nchar(n)+2, nchar(n)+11)
Data<- data.frame(TranID= TranID, UnitID= UnitID, TranDt= as.Date(TranDt))

#First I create a list of all the previous transactions by unit

TranList<- as.matrix(Data$TranID, ncol= 1)
PreTran<- lapply(TranList,
function(x) (with(Data,
Data[
UnitID== substr(x, 1, nchar(n))&
TranDt< Data[TranID== x, "TranDt"], ]
))
)

#I do get warnings about missing data because some transactions have
#no predecessor.
#Some transactions have no previous transactions, others have many so
#I pick the most recent

BeforeTran<- lapply(seq_along(PreTran), function(x) (
with(PreTran[[x]], PreTran[[x]][which(TranDt== max(TranDt)), ])))

#I need to add the current transaction's TranID to the list so I can merge later

BeforeTran<- lapply(seq_along(PreTran), function(x) (
transform(BeforeTran[[x]], TranID= TranList[x, 1])))

#Finally, I convert from a list to a data frame

BeforeTran<- do.call("rbind", BeforeTran)

#I have used a combination of data.table and for loops, but that seems
#cheesy and doesn't perform much better.

library(data.table)

#First I create a list of all the previous transactions by unit

TranList2<- vector(nrow(Data), mode= "list")
names(TranList2)<- levels(Data$TranID)
DataDT<- data.table(Data)

#Use a for loop and data.table to find the date of the previous transaction

for (i in levels(Data$TranID)) {
if (DataDT[UnitID== substr(i, 1, nchar(n))&
TranDt<= (DataDT[TranID== i, TranDt]),
length(TranDt)]> 1)
TranList2[[i]]<- cbind(TranID= i,
DataDT[UnitID== substr(i, 1, nchar(n))&
TranDt< (DataDT[TranID== i, TranDt]),
list(TranDt= max(TranDt))])
}

#Finally, I convert from a list to a data table

BeforeTran2<- do.call("rbind", TranList2)

#My intuition says that this code doesn't take advantage of
#data.table's attributes.
#Are there any ideas out there? Thank you.
#P.S. I've tried plyr and it does not help my memory problem.

--
William H. Rogers



Re: [R] lapply or data.table to find a unit's previous transaction

2010-06-02 Thread Gabor Grothendieck
On Wed, Jun 2, 2010 at 10:29 PM, William Rogers  wrote:
> I have a dataset of property transactions that includes the
> transaction ID (TranID), property ID (UnitID), and transaction date
> (TranDt). I need to create a data frame (or data table) that includes
> the previous transaction date, if one exists.
> This is an easy problem in SQL, where I just run a sub-query, but I'm
> trying to make R my one-stop-shopping program.  The following code

The sqldf package lets you use SQL queries on R data frames.  See its
home page at:
http://sqldf.googlecode.com



[R] lapply or data.table to find a unit's previous transaction

2010-06-02 Thread William Rogers
I have a dataset of property transactions that includes the
transaction ID (TranID), property ID (UnitID), and transaction date
(TranDt). I need to create a data frame (or data table) that includes
the previous transaction date, if one exists.
This is an easy problem in SQL, where I just run a sub-query, but I'm
trying to make R my one-stop-shopping program.  The following code
works on a subset of my data, but I can't run this on my full dataset
because my computer runs out of memory after about 30 minutes. (Using
a 32-bit machine.)
Use the following synthetic data for example.

n<- 100
TranID<- lapply(n:(2*n), function(x) (
as.matrix(paste(x, sample(seq(as.Date('2000-01-01'),
as.Date('2010-01-01'), "days"), sample(1:5, 1)), sep= "D"), ncol= 1)))
TranID<- do.call("rbind", TranID)
UnitID<- substr(TranID, 1, nchar(n))
TranDt<- substr(TranID, nchar(n)+2, nchar(n)+11)
Data<- data.frame(TranID= TranID, UnitID= UnitID, TranDt= as.Date(TranDt))

#First I create a list of all the previous transactions by unit

TranList<- as.matrix(Data$TranID, ncol= 1)
PreTran<- lapply(TranList,
  function(x) (with(Data,
  Data[
  UnitID== substr(x, 1, nchar(n))&
  TranDt< Data[TranID== x, "TranDt"], ]
  ))
  )

#I do get warnings about missing data because some transactions have
#no predecessor.
#Some transactions have no previous transactions, others have many so
#I pick the most recent

BeforeTran<- lapply(seq_along(PreTran), function(x) (
with(PreTran[[x]], PreTran[[x]][which(TranDt== max(TranDt)), ])))

#I need to add the current transaction's TranID to the list so I can merge later

BeforeTran<- lapply(seq_along(PreTran), function(x) (
transform(BeforeTran[[x]], TranID= TranList[x, 1])))

#Finally, I convert from a list to a data frame

BeforeTran<- do.call("rbind", BeforeTran)

#I have used a combination of data.table and for loops, but that seems
#cheesy and doesn't perform much better.

library(data.table)

#First I create a list of all the previous transactions by unit

TranList2<- vector(nrow(Data), mode= "list")
names(TranList2)<- levels(Data$TranID)
DataDT<- data.table(Data)

#Use a for loop and data.table to find the date of the previous transaction

for (i in levels(Data$TranID)) {
if (DataDT[UnitID== substr(i, 1, nchar(n))&
   TranDt<= (DataDT[TranID== i, TranDt]),
length(TranDt)]> 1)
TranList2[[i]]<- cbind(TranID= i,
DataDT[UnitID== substr(i, 1, nchar(n))&
TranDt< (DataDT[TranID== i, TranDt]),
list(TranDt= max(TranDt))])
}

#Finally, I convert from a list to a data table

BeforeTran2<- do.call("rbind", TranList2)

#My intuition says that this code doesn't take advantage of
#data.table's attributes.
#Are there any ideas out there?  Thank you.
#P.S. I've tried plyr and it does not help my memory problem.

--
William H. Rogers



Re: [R] lapply with functions with changing parameters

2010-06-02 Thread Bunny, lautloscrew.com
Henrique, 

thx, your suggestion worked perfectly fine for me. 


On 01.06.2010, at 23:01, Henrique Dallazuanna wrote:

>  lapply(mydf[-6], ccf, y = mydf[6])



Re: [R] lapply with functions with changing parameters

2010-06-01 Thread Erik Iverson



Bunny, lautloscrew.com wrote:

Dear all,

I am trying to avoid a for loop here and wonder if the following is
possible:

I have a data.frame with 6 columns and I want to get a
cross-correlogram (by using ccf). Obviously ccf only accepts two
columns at once and then returns a list. In fact, with a for loop I'd
do the following


for (i in 1:6) {

x[[i]]=ccf(mydf[,i],mydf[,6])


}

Is there any chance to do the same with lapply? e.g. lapply(mydf,"ccf",
...) with ... representing the changing arguments for ccf
functions (note only the first argument does actually change)


You don't give a reproducible example, but since you want to apply a 
function to a list, the answer is yes.  Just defining the function and 
the list is the trick, untested:


lapply(mydf[, 1:5], function(x) ccf(x, mydf[, 6]))
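
A minimal runnable sketch of this call, assuming a made-up six-column `mydf` of random numbers (the original data were not posted) and adding `plot = FALSE` so that `ccf()` only returns its result instead of drawing it:

```r
set.seed(1)
# Hypothetical data frame with six numeric columns, standing in for mydf
mydf <- as.data.frame(matrix(rnorm(600), ncol = 6))

# Cross-correlate each of the first five columns with the sixth;
# plot = FALSE returns the "acf" object without drawing anything
ccf_list <- lapply(mydf[, 1:5], function(x) ccf(x, mydf[, 6], plot = FALSE))

names(ccf_list)        # one entry per column: V1 ... V5
class(ccf_list[[1]])   # "acf"
```

The result is a named list, so a single cross-correlogram can be pulled out later with, for example, `ccf_list[["V2"]]`.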



Re: [R] lapply with functions with changing parameters

2010-06-01 Thread Henrique Dallazuanna
Try this:

 lapply(mydf[-6], ccf, y = mydf[6])

On Tue, Jun 1, 2010 at 5:50 PM, Bunny, lautloscrew.com <
bu...@lautloscrew.com> wrote:

> Dear all,
>
> I am trying to avoid a for loop here and wonder if the following is
> possible:
>
> I have a data.frame with 6 columns and I want to get a cross-correlogram
> (by using ccf). Obviously ccf only accepts two columns at once and then
> returns a list.
> In fact, with a for loop I'd do the following
>
>
> for (i in 1:6) {
>
>  x[[i]]=ccf(mydf[,i],mydf[,6])
>
>
> }
>
> Is there any chance to do the same with lapply? e.g. lapply(mydf,"ccf", ...)
> with ... representing the changing arguments for ccf functions (note only
> the first argument does actually change)
>
> thx for any suggestions in advance
>
> best
>
> matt



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



[R] lapply with functions with changing parameters

2010-06-01 Thread Bunny, lautloscrew.com
Dear all, 

I am trying to avoid a for loop here and wonder if the following is possible: 

I have a data.frame with 6 columns and I want to get a cross-correlogram (by
using ccf). Obviously ccf only accepts two columns at once and then returns a
list.
In fact, with a for loop I'd do the following


for (i in 1:6) {

 x[[i]]=ccf(mydf[,i],mydf[,6])


}

Is there any chance to do the same with lapply? e.g. lapply(mydf,"ccf", ...)
with ... representing the changing arguments for ccf functions (note only the
first argument does actually change)

thx for any suggestions in advance

best

matt


Re: [R] lapply - function with arguments

2010-04-13 Thread Arun.stat

another thought possibly

fn = function(n, a=1, b=3) return(n*(a+b))
sapply(1:3, fn)

-- 
View this message in context: 
http://n4.nabble.com/lapply-function-with-arguments-tp1838373p1838506.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lapply - function with arguments

2010-04-13 Thread Randall Wrong
Thank you Jim

2010/4/13 jim holtman 

> lapply(yourList, f, a=1, b=2)
>
>   On Tue, Apr 13, 2010 at 9:11 AM, Randall Wrong 
> wrote:
>
>>  Dear R users,
>>
>> I have created a function f of n, a and b : f(n,a,b)
>>
>> I would like to apply this function several times to some values of n. a
>> and
>> b are held constant. I was thinking of using lapply. How can I do this ?
>>
>> Thank you very much
>> Randall
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>



Re: [R] lapply - function with arguments

2010-04-13 Thread jim holtman
lapply(yourList, f, a=1, b=2)
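
A minimal sketch of how this call behaves, assuming a made-up `f` and `yourList` (neither was posted in the thread). Arguments given by name after the function are forwarded unchanged on every call, and each list element fills the first argument that is still unmatched, here `n`:

```r
# Hypothetical f: any function whose first argument is the one that varies
f <- function(n, a, b) n * (a + b)

yourList <- list(1, 2, 3)

# lapply() passes each element as the first argument of f and forwards
# a and b unchanged on every call
res <- lapply(yourList, f, a = 1, b = 2)
unlist(res)   # 3 6 9

# Equivalent form with an explicit anonymous function
res2 <- lapply(yourList, function(n) f(n, a = 1, b = 2))
identical(res, res2)   # TRUE
```

The anonymous-function form is longer but makes it explicit which argument receives the list element.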

On Tue, Apr 13, 2010 at 9:11 AM, Randall Wrong wrote:

> Dear R users,
>
> I have created a function f of n, a and b : f(n,a,b)
>
> I would like to apply this function several times to some values of n. a
> and
> b are held constant. I was thinking of using lapply. How can I do this ?
>
> Thank you very much
> Randall
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] lapply - function with arguments

2010-04-13 Thread Randall Wrong
Dear R users,

I have created a function f of n, a and b : f(n,a,b)

I would like to apply this function several times to some values of n. a and
b are held constant. I was thinking of using lapply. How can I do this ?

Thank you very much
Randall



[R] lapply and list indexing basics (after realizing I wasn't previously subscribed...sorry)

2010-03-07 Thread Simon Knapp
library(MASS)

dat <- data.frame(
col1=as.factor(sample(1:4, 100, T)),
col2=as.factor(sample(1:4, 100, T)),
col3=as.factor(sample(1:4, 100, T)),
isi=rnorm(100)
)

dat <- split(dat, as.factor(sample(1:3, 100, T)))
lapply(dat, function(x, densfun) fitdistr(x$isi, densfun), 'normal')

I have not used fitdistr before, and without your data I can't help with the
error you note below (which may well still occur).

Hope this helps.





On Mon, Mar 8, 2010 at 1:29 PM, Dgnn  wrote:

>
> I have split my original dataframe to generate a list of dataframes each of
> which has 3 columns of factors and a 4th column of numeric data.
> I would like to use lapply to apply the fitdistr() function to only the 4th
> column (x$isi) of the dataframes in the list.
>
> Is there a way to do this or am I misusing lapply?
>
> As a second solution I tried splitting only the numeric data column to
> yield
> a list of vectors and then using
> lapply(myList, fitdistr, densfun='gamma',start=list(scale=1, shape=2))
> returns the error:
> Error in optim(x = c(305, 290, 283, 363, 331, 293, 304, 312, 286, 339,  :
> non-finite finite-difference value [2]
> In addition: Warning message:
> In dgamma(x, shape, scale, log) : NaNs produced
>
> However, if I use fitdistr(myList[[i]]) on each element of the list, there
> are no errors.
>
> Thanks in advance for any comments.
>
> -jason
> --
> View this message in context:
> http://n4.nabble.com/lapply-and-list-indexing-basics-after-realizing-I-wasn-t-previously-subscribed-sorry-tp1584053p1584053.html
> Sent from the R help mailing list archive at Nabble.com.
>



Re: [R] lapply and list indexing basics

2010-03-07 Thread jim holtman
It would have been nice if you had at least posted what the structure of
myList is.  Assuming that this is the list of your data frames, then the
following might work:

lapply(myList, function(x) fitdistr(x$isi,
densfun='gamma',start=list(scale=1, shape=2)))
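
A minimal sketch of this pattern on made-up data, assuming a list of data frames that each have an `isi` column. The `dat`, `grp` and `fits` names are hypothetical. A normal density is swapped in for the gamma here because its fit is closed-form and needs no `start` values; whether the start values are what triggers the error reported in the question is not established in this thread.

```r
library(MASS)   # provides fitdistr()

set.seed(42)
# Hypothetical stand-in for the poster's data: one factor, one numeric column
dat <- data.frame(
  grp = factor(rep(c("a", "b", "c"), each = 50)),
  isi = rnorm(150, mean = 300, sd = 25)
)
myList <- split(dat, dat$grp)   # list of three data frames

# The anonymous function receives one data frame at a time and
# passes only its isi column on to fitdistr()
fits <- lapply(myList, function(x) fitdistr(x$isi, densfun = "normal"))

# Collect the fitted parameters into a matrix, one column per group
sapply(fits, function(fit) fit$estimate)
```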


On Sun, Mar 7, 2010 at 7:30 PM, Dgnn  wrote:

>
> I have split my original dataframe to generate a list of dataframes each of
> which has 3 columns of factors and a 4th column of numeric data.
> I would like to use lapply to apply the fitdistr() function to only the 4th
> column (x$isi) of the dataframes in the list.
>
> Is there a way to do this or am I misusing lapply?
>
> As a second solution I tried splitting only the numeric data column to
> yield
> a list of vectors and then using
> lapply(myList, fitdistr, densfun='gamma',start=list(scale=1, shape=2))
> returns the error:
> Error in optim(x = c(305, 290, 283, 363, 331, 293, 304, 312, 286, 339,  :
> non-finite finite-difference value [2]
> In addition: Warning message:
> In dgamma(x, shape, scale, log) : NaNs produced
>
> However, if I use fitdistr(myList[[i]]) on each element of the list, there
> are no errors.
>
> Thanks in advance for any comments.
>
> -jason
> --
> View this message in context:
> http://n4.nabble.com/lapply-and-list-indexing-basics-tp1584006p1584006.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] lapply and list indexing basics

2010-03-07 Thread Dgnn

I have split my original dataframe to generate a list of dataframes each of
which has 3 columns of factors and a 4th column of numeric data.
I would like to use lapply to apply the fitdistr() function to only the 4th
column (x$isi) of the dataframes in the list. 

Is there a way to do this or am I misusing lapply?

As a second solution I tried splitting only the numeric data column to yield
a list of vectors and then using
lapply(myList, fitdistr, densfun='gamma',start=list(scale=1, shape=2)) 
returns the error:
Error in optim(x = c(305, 290, 283, 363, 331, 293, 304, 312, 286, 339,  : 
non-finite finite-difference value [2]
In addition: Warning message:
In dgamma(x, shape, scale, log) : NaNs produced

However, if I use fitdistr(myList[[i]]) on each element of the list, there
are no errors.

Thanks in advance for any comments.

-jason
-- 
View this message in context: 
http://n4.nabble.com/lapply-and-list-indexing-basics-tp1584006p1584006.html
Sent from the R help mailing list archive at Nabble.com.



[R] lapply and list indexing basics (after realizing I wasn't previously subscribed...sorry)

2010-03-07 Thread Dgnn

I have split my original dataframe to generate a list of dataframes each of
which has 3 columns of factors and a 4th column of numeric data. 
I would like to use lapply to apply the fitdistr() function to only the 4th
column (x$isi) of the dataframes in the list. 

Is there a way to do this or am I misusing lapply? 

As a second solution I tried splitting only the numeric data column to yield
a list of vectors and then using 
lapply(myList, fitdistr, densfun='gamma',start=list(scale=1, shape=2)) 
returns the error: 
Error in optim(x = c(305, 290, 283, 363, 331, 293, 304, 312, 286, 339,  : 
non-finite finite-difference value [2] 
In addition: Warning message: 
In dgamma(x, shape, scale, log) : NaNs produced 

However, if I use fitdistr(myList[[i]]) on each element of the list, there
are no errors. 

Thanks in advance for any comments. 

-jason
-- 
View this message in context: 
http://n4.nabble.com/lapply-and-list-indexing-basics-after-realizing-I-wasn-t-previously-subscribed-sorry-tp1584053p1584053.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lapply with data frame

2010-02-28 Thread Bill.Venables
Oops!  My caveat about untested code was certainly appropriate.  The 
normalization code below will not work.  

Here is probably what I was thinking of doing:

data <- within(data, norm <- value / tapply(value, group, sum)[group])

The same caveats apply here as below!
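
A minimal sketch of the per-group normalization on toy data matching the table in the question. It spells out the `tapply()` lookup step by step and, as an alternative not mentioned in the thread, shows `ave()`; the `group_sums` and `norm_ave` names are assumptions for illustration.

```r
# Toy data matching the table in the question
data <- data.frame(
  id    = 1:6,
  group = c("A", "A", "A", "B", "B", "B"),
  value = c(3.2, 3.0, 3.1, 5.5, 6.0, 6.2),
  stringsAsFactors = FALSE
)

# Per-group sums, then look each row's sum up by its group label
group_sums <- tapply(data$value, data$group, sum)
data$norm  <- data$value / as.vector(group_sums[data$group])

# Same result with ave(), which applies FUN within each group
# and returns a vector aligned with the original rows
data$norm_ave <- ave(data$value, data$group, FUN = function(v) v / sum(v))

tapply(data$norm, data$group, sum)   # each group sums to 1
```

Both versions are vectorised over the whole column, so no explicit loop over groups is needed.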



From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
bill.venab...@csiro.au [bill.venab...@csiro.au]
Sent: 01 March 2010 17:18
To: n...@smartmediacorp.com; r-help@r-project.org
Subject: [ExternalEmail] Re: [R] lapply with data frame

Data frames are lists.  Each column of the data frame is a component of the 
list.  So in, e.g.

lapply(data, function(x) x)

the function would receive each column of the data frame in turn.

To apply a function to each row of the data frame (which may need some care) 
one tool you can use is apply(...)

apply(data, 1, function(x) ...)

The form of the result will depend on the value of the function.  If the value 
returned by the function is a vector, these will form the *columns* of the 
result of apply, not the rows, which will be a matrix.

For the normalization problem, here is one way to do it:

data <- within(data, norm <- tapply(value, group, function(x) x/sum(x))[group])


Warning 1: the second of these assignment operators may not be replaced by '='.
Warning 2: untested code!


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Noah Silverman [n...@smartmediacorp.com]
Sent: 28 February 2010 12:37
To: r-help@r-project.org
Subject: [R] lapply with data frame

I'm a bit confused on how to use lapply with a data.frame.

For example.

lapply(data, function(x) print(x))

WHAT exactly is passed to the function?  Is it each ROW in the data
frame, one by one, or each column, or the entire frame in one shot?

What I want to do is apply a function to each row in the data frame.  Is
lapply the right way?

A second application is to normalize a column value by group.  For
example, if I have the following table:
id  group  value  norm
1   A      3.2
2   A      3.0
3   A      3.1
4   B      5.5
5   B      6.0
6   B      6.2
etc...

The long version would be:
data$norm <- NA                     # make sure the column exists
for (g in unique(data$group)) {
  idx <- data$group == g            # rows belonging to this group
  data$norm[idx] <- data$value[idx] / sum(data$value[idx])
}

There must be a faster way to do this with lapply.  (Ideally, I'd then
use mclapply to run on multi-cores and really crank up the speed.)

Any suggestions?



Re: [R] lapply with data frame

2010-02-28 Thread Bill.Venables
Data frames are lists.  Each column of the data frame is a component of the 
list.  So in, e.g. 

lapply(data, function(x) x) 

the function would receive each column of the data frame in turn.
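
A minimal sketch that makes this visible, assuming a small made-up data frame `df`:

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# Each call receives one whole column, so the result has one entry per column
col_classes <- lapply(df, class)
str(col_classes)

length(col_classes) == ncol(df)   # TRUE
```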

To apply a function to each row of the data frame (which may need some care) 
one tool you can use is apply(...)

apply(data, 1, function(x) ...)

The form of the result will depend on the value of the function.  If the value 
returned by the function is a vector, these will form the *columns* of the 
result of apply, not the rows, which will be a matrix.
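
A minimal sketch of the row-wise case, assuming a small all-numeric data frame `m` made up for illustration. Part of the care needed is that `apply()` first converts the data frame with `as.matrix()`, so a frame with mixed column types reaches the function as character strings.

```r
m <- data.frame(a = 1:3, b = 4:6)   # 3 rows, 2 columns

# The function returns a length-2 vector for each of the 3 rows
res <- apply(m, 1, function(r) r * 10)

dim(res)   # 2 3: one column per original row
t(res)     # transpose to get back to one row per original row
```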

For the normalization problem, here is one way to do it:

data <- within(data, norm <- tapply(value, group, function(x) x/sum(x))[group])


Warning 1: the second of these assignment operators may not be replaced by '='. 
 
Warning 2: untested code!


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Noah Silverman [n...@smartmediacorp.com]
Sent: 28 February 2010 12:37
To: r-help@r-project.org
Subject: [R] lapply with data frame

I'm a bit confused on how to use lapply with a data.frame.

For example.

lapply(data, function(x) print(x))

WHAT exactly is passed to the function?  Is it each ROW in the data
frame, one by one, or each column, or the entire frame in one shot?

What I want to do is apply a function to each row in the data frame.  Is
lapply the right way?

A second application is to normalize a column value by group.  For
example, if I have the following table:
id  group  value  norm
1   A      3.2
2   A      3.0
3   A      3.1
4   B      5.5
5   B      6.0
6   B      6.2
etc...

The long version would be:
data$norm <- NA                     # make sure the column exists
for (g in unique(data$group)) {
  idx <- data$group == g            # rows belonging to this group
  data$norm[idx] <- data$value[idx] / sum(data$value[idx])
}

There must be a faster way to do this with lapply.  (Ideally, I'd then
use mclapply to run on multi-cores and really crank up the speed.)

Any suggestions?


