Re: [R] Data Manipulation

2010-09-10 Thread dfong

I'm actually importing it from a CSV, so I already have that in a table. But
I can't make a graph with text. I assume I need to do some counting in order
to draw the graph?
Any example of this?

thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534690.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data Manipulation

2010-09-10 Thread Joshua Wiley
Hi,

Look at the table() function.  Here is an example with your data:


dat <- read.table(textConnection("
Study
A
A
B
B
B
A
C
C
D"), header = TRUE)
closeAllConnections()

table(dat)
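
The counting step can then feed a chart. The following is only a sketch: the Date column and its values are invented for illustration, since the post does not show them, and the monthly grouping is just one possible choice.

```r
# Hypothetical data: the Study column from the post plus an invented Date column
dat <- data.frame(
  Study = c("A", "A", "B", "B", "B", "A", "C", "C", "D"),
  Date  = as.Date(c("2010-01-05", "2010-01-20", "2010-01-22",
                    "2010-02-03", "2010-02-14", "2010-02-27",
                    "2010-03-02", "2010-03-15", "2010-03-30"))
)

# Overall counts per study
counts <- table(dat$Study)

# Counts per study within each month (rows: months, columns: studies)
by_month <- table(format(dat$Date, "%Y-%m"), dat$Study)

# Bar chart of the overall counts
barplot(counts, xlab = "Study", ylab = "Count")

# Counts over time: one group of bars per month, one bar per study
barplot(t(by_month), beside = TRUE, legend.text = TRUE,
        xlab = "Month", ylab = "Count")
```

table() turns the text column into numbers, which is what the plotting functions need.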


Hope that helps,

Josh




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] Data Manipulation

2010-09-10 Thread dfong

Hi,

I just started using R and need some guidance.

I need to create a time series chart in R, but the problem is the data is
not numeric.
The data is in the following format

Study 
A
A
B
B
B
A
C
C
D

Then there is also another column with dates. How can I manipulate this in
order to count the number of occurrences of each entry and group them?
Say A = 3, B = 3, C = 2, D = 1.

Thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534662.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Manipulations and SQL

2010-08-26 Thread Gabor Grothendieck
On Thu, Aug 26, 2010 at 1:18 PM, stephenb  wrote:
> Is it possible to open a channel to a data frame in the default environment?
> There are cases when using an SQL update statement is the simplest
> alternative, so instead of dumping the df, updating it, and then
> reimporting it, I would like to update the df directly in R.

Assuming that this is a question about sqldf see these links. The
first has an example of update and the second discusses connections
that persist across sqldf commands:

http://code.google.com/p/sqldf/#8._Why_am_I_having_problems_with_update?
http://code.google.com/p/sqldf/#Example_10._Persistent_Connections

You may also wish to use the RSQLite package directly.
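
As a rough sketch of the RSQLite route, assuming the DBI and RSQLite packages are installed: the data frame df, its columns, and the UPDATE statement below are all made up for illustration.

```r
library(DBI)

# Hypothetical data frame to be updated
df <- data.frame(id = 1:3, score = c(10, 20, 30))

# Temporary in-memory SQLite database; nothing is written to disk
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "df", df)                              # copy the data frame in
dbExecute(con, "UPDATE df SET score = 0 WHERE id = 2")   # run the SQL update
df <- dbReadTable(con, "df")                             # copy the result back
dbDisconnect(con)
```

Note that the data frame is still copied into SQLite and back; the R object itself is not modified in place.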

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Data Manipulations and SQL

2010-08-26 Thread stephenb

Greetings Gabor,

Is it possible to open a channel to a data frame in the default environment?
There are cases when using an SQL update statement is the simplest
alternative, so instead of dumping the df, updating it, and then
reimporting it, I would like to update the df directly in R.

Thank you.
Stephen
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Manipulations-and-SQL-tp860420p2340098.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] data frame handling

2010-08-16 Thread David Winsemius


On Aug 16, 2010, at 5:53 AM, Lily_stats wrote:



> Dear all,
>
> I have an xts object, t.xts, with 4 columns: "v1" "DD1" "v2" "DD2" and
> created a data frame:
>
> t <- as.data.frame(t.xts)


"t" is not the best choice of names for an object because it is the  
name of a commonly used function


Let's instead assume you used "tt". Try:

tt[which(tt$DD1 >0 & tt$DD1 < 30), "v1"]

Or:

subset(tt, subset = DD1 > 0 & DD1 < 30, select = "v1")
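
A small sketch of both approaches on made-up data (the values in tt below are invented). Note that subset() expects a logical condition, and that single-column indexing with [ returns a plain vector unless drop = FALSE is given.

```r
# Made-up stand-in for the poster's data frame
tt <- data.frame(v1  = c(10, 20, 30, 40),
                 DD1 = c(-5, 12, 29, 45),
                 v2  = c(1, 2, 3, 4),
                 DD2 = c(0, 0, 0, 0))

# Rows where DD1 lies strictly between 0 and 30
keep <- tt$DD1 > 0 & tt$DD1 < 30

# Plain vector of the matching v1 values
v1_vec <- tt[which(keep), "v1"]

# One-column data frame instead of a vector
v1_df <- tt[which(keep), "v1", drop = FALSE]

# Same selection with subset(), which takes the logical condition directly
v1_sub <- subset(tt, DD1 > 0 & DD1 < 30, select = "v1")
```

which() also drops rows where the condition is NA, which plain logical indexing would return as NA rows.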



> I would like to extract data and create a new data frame for when the
> values in column DD1 fall between 0 and 30 and extract the corresponding
> v1 value.
>
> How can I do this?
>
> Thanks.



David Winsemius, MD
West Hartford, CT



[R] data frame handling

2010-08-16 Thread Lily_stats

Dear all, 

I have an xts object, t.xts, with 4 columns: "v1" "DD1" "v2" "DD2" and
created a data frame:

t <- as.data.frame(t.xts)

I would like to extract data and create a new data frame for when the values
in column DD1 fall between 0 and 30 and extract the corresponding v1 value.

How can I do this?

Thanks.



-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-frame-handling-tp2326617p2326617.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data manipulation search

2010-08-11 Thread Erik Iverson

?match, look at the %in% operator.
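
A minimal sketch of how that could look with the vector from the question (the names has_5 and pos are only illustrative):

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# %in% returns TRUE if the value on the left occurs anywhere in x
count <- if (5 %in% x) 1 else 0

# Equivalent test written with any()
has_5 <- any(x == 5)

# match() gives the position of the first occurrence, or NA if absent
pos <- match(5, x)
```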

Mestat wrote:

> Hi listers,
> I searched the forum but didn't find an answer.
> I have a data set.
> I would like to run a search (condition) on my data set.
>
> x<-c(1,2,3,4,5,6,7,8,9,10)
> count<-0
> if (CONDITION){count<-1}else{count<-0}
>
> My CONDITION would be: is the number 5 in my data set?
>
> Thanks in advance,
> Marcio




[R] Data manipulation search

2010-08-11 Thread Mestat

Hi listers,
I searched the forum but didn't find an answer.
I have a data set.
I would like to run a search (condition) on my data set.

x<-c(1,2,3,4,5,6,7,8,9,10)
count<-0
if (CONDITION){count<-1}else{count<-0}

My CONDITION would be: is the number 5 in my data set?

Thanks in advance,
Marcio
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-manipulation-search-tp2321927p2321927.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data frame reordering to time series

2010-08-08 Thread Gabor Grothendieck
On Sun, Aug 8, 2010 at 5:54 PM, steven mosher  wrote:
> z<-as.zooreg(as.ts(g))
>> z
>          X12345 X34567 X56789
> 1989(1)      NA      3      6
> 1989(2)      NA      3      6
> 1989(3)      NA      3      6
> 1989(4)      NA      3      6
> 1989(5)      NA      3      6
> 1989(6)      NA      3      6
> 1989(7)      NA      3      6
> 1989(8)      NA      3      6
> 1989(9)      NA      3      6
> 1989(10)     NA      3      6
> 1989(11)     NA      3      6
> 1989(12)     NA      3      6
> 1990(1)       2      4      6
> 1990(2)       2      4      6
> 1990(3)       2      4      6
> 1990(4)       2      4      6
> 1990(5)       2      4      6
> 1990(6)       2      4      6
> 1990(7)       2      4      6
> 1990(8)       2      4      6
> 1990(9)       2      4      6
> 1990(10)      2      4      6
> 1990(11)      2      4      6
> 1990(12)      2      4      6
> 1991(1)      NA      5     NA
> 1991(2)      NA      5     NA
> 1991(3)      NA      5     NA
> 1991(4)      NA      5     NA
> 1991(5)      NA      5     NA
> 1991(6)      NA      5     NA
> 1991(7)      NA      5     NA
> 1991(8)      NA      5     NA
> 1991(9)      NA      5     NA
> 1991(10)     NA      5     NA
> 1991(11)     NA      5     NA
> 1991(12)     NA      5     NA
> 1992(1)       2     NA     NA
> 1992(2)       2     NA     NA
>
> ***
> The interesting thing is the change from months to the (1)...

as.zooreg converts a ts series to a zooreg series with a numeric index and
the same frequency.

You can convert the index to "yearmon" class if you wish:

z<-as.zooreg(as.ts(g))
time(z) <- as.yearmon(time(z))

or

z <- aggregate(as.zooreg(as.ts(g)), as.yearmon, identity)
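
A self-contained sketch of the first variant, assuming the zoo package is installed. The small series g below is invented; it only stands in for the g built earlier in the thread.

```r
library(zoo)

# Hypothetical stand-in for g: four monthly values with a "yearmon" index
g <- zoo(c(3, 4, 5, 6), as.yearmon(1989 + 0:3 / 12))

# Going through ts and back gives a zooreg series with a numeric time index
z <- as.zooreg(as.ts(g))
index_before <- time(z)

# Convert the numeric index back to "yearmon"
time(z) <- as.yearmon(time(z))
index_after <- time(z)
```

Printing z before and after the conversion shows the row labels switching from the cycle notation back to month names.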


Re: [R] Data frame reordering to time series

2010-08-08 Thread steven mosher
Thanks again,

They worked for me as well. I did a simpler example with fewer years just to
show that it worked (shortened here for display):

> f <- function(x) {
+   dat <- x[-(1:2)]
+   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
+   zoo(c(as.matrix(dat)), tim)
+ }
> g<-do.call(cbind, by(Data, Data$Index, f))
> g
 X12345 X34567 X56789
Jan 1989 NA  3  6
Feb 1989 NA  3  6
Mar 1989 NA  3  6
Apr 1989 NA  3  6
May 1989 NA  3  6
Jun 1989 NA  3  6
Jul 1989 NA  3  6
Aug 1989 NA  3  6
Sep 1989 NA  3  6
Oct 1989 NA  3  6
Nov 1989 NA  3  6
Dec 1989 NA  3  6
Jan 1990  2  4  6
Feb 1990  2  4  6
Mar 1990  2  4  6
Apr 1990  2  4  6
May 1990  2  4  6
Jun 1990  2  4  6
Jul 1990  2  4  6
Aug 1990  2  4  6
Sep 1990  2  4  6
Oct 1990  2  4  6
Nov 1990  2  4  6
Dec 1990  2  4  6
Jan 1991 NA  5 NA

.

z<-as.zooreg(as.ts(g))
> z
 X12345 X34567 X56789
1989(1)  NA  3  6
1989(2)  NA  3  6
1989(3)  NA  3  6
1989(4)  NA  3  6
1989(5)  NA  3  6
1989(6)  NA  3  6
1989(7)  NA  3  6
1989(8)  NA  3  6
1989(9)  NA  3  6
1989(10) NA  3  6
1989(11) NA  3  6
1989(12) NA  3  6
1990(1)   2  4  6
1990(2)   2  4  6
1990(3)   2  4  6
1990(4)   2  4  6
1990(5)   2  4  6
1990(6)   2  4  6
1990(7)   2  4  6
1990(8)   2  4  6
1990(9)   2  4  6
1990(10)  2  4  6
1990(11)  2  4  6
1990(12)  2  4  6
1991(1)  NA  5 NA
1991(2)  NA  5 NA
1991(3)  NA  5 NA
1991(4)  NA  5 NA
1991(5)  NA  5 NA
1991(6)  NA  5 NA
1991(7)  NA  5 NA
1991(8)  NA  5 NA
1991(9)  NA  5 NA
1991(10) NA  5 NA
1991(11) NA  5 NA
1991(12) NA  5 NA
1992(1)   2 NA NA
1992(2)   2 NA NA


***
The interesting thing is the change from months to the (1)...




Re: [R] Data frame reordering to time series

2010-08-08 Thread Gabor Grothendieck
On Sun, Aug 8, 2010 at 11:55 AM, Gabor Grothendieck
 wrote:
> On Sun, Aug 8, 2010 at 11:21 AM, steven mosher  wrote:
>> Ok,
>> I'm a bit confused by what you mean by "regularly spaced"
>> After I do the  do.call I do get a data structure with all the times present
>> and every time has a NA or a data value.
>> Steve
>>
>
> regularly spaced means that every observation is one month later than
> the prior.  If there are missing 6 month chunks or missing entire
> years then the observations are not regularly spaced since there are
> some months not present.
>


And here it is with as.ts


> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>  
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
+  Oct=Values,Nov=Values,Dec=Values2)
>
> library(zoo)
> f <- function(x) {
+   dat <- x[-(1:2)]
+   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
+   zoo(c(as.matrix(dat)), tim)
+ }
> z <- do.call(cbind, by(Data, Data$Index, f))
> as.ts(z)
         X12345 X67543 X89765
Jan 1989     NA  12.00     NA
Feb 1989     NA   6.00     NA
Mar 1989     NA  12.00     NA
Apr 1989     NA  12.00     NA
May 1989     NA  12.00     NA
Jun 1989     NA   4.00     NA
Jul 1989     NA  12.00     NA
Aug 1989     NA  12.00     NA
Sep 1989     NA  12.00     NA
Oct 1989     NA  12.00     NA
Nov 1989     NA  12.00     NA
Dec 1989     NA     NA     NA
Jan 1990     NA  14.00     NA
Feb 1990     NA   7.00     NA
Mar 1990     NA     NA     NA
Apr 1990     NA     NA     NA
May 1990     NA  14.00     NA
Jun 1990     NA   4.67     NA
Jul 1990     NA     NA     NA
Aug 1990     NA  14.00     NA
Sep 1990     NA  14.00     NA
Oct 1990     NA  14.00     NA
Nov 1990     NA     NA     NA
Dec 1990     NA     NA     NA
Jan 1991  54.00  34.00  12.00
Feb 1991  27.00  17.00   6.00
Mar 1991     NA  34.00     NA
Apr 1991     NA  34.00     NA
May 1991  54.00  34.00  12.00
Jun 1991  18.00  11.33   4.00
Jul 1991     NA  34.00     NA
Aug 1991  54.00  34.00  12.00
Sep 1991  54.00  34.00  12.00
Oct 1991  54.00  34.00  12.00
Nov 1991     NA  34.00     NA
Dec 1991     NA     NA     NA
Jan 1992     NA  21.00  13.00
Feb 1992     NA  10.50   6.50
Mar 1992     NA  21.00  13.00
Apr 1992     NA  21.00  13.00
May 1992     NA  21.00  13.00
Jun 1992     NA   7.00   4.33
Jul 1992     NA  21.00  13.00
Aug 1992     NA  21.00  13.00
Sep 1992     NA  21.00  13.00
Oct 1992     NA  21.00  13.00
Nov 1992     NA  21.00  13.00
Dec 1992     NA     NA     NA
Jan 1993  65.00     NA  13.00
Feb 1993  32.50     NA   6.50
Mar 1993  65.00     NA     NA
Apr 1993  65.00     NA     NA
May 1993  65.00     NA  13.00
Jun 1993  21.67     NA   4.33
Jul 1993  65.00     NA     NA
Aug 1993  65.00     NA  13.00
Sep 1993  65.00     NA  13.00
Oct 1993  65.00     NA  13.00
Nov 1993  65.00     NA     NA
Dec 1993     NA     NA     NA
Jan 1994  23.00     NA  13.00
Feb 1994  11.50     NA   6.50
Mar 1994  23.00     NA  13.00
Apr 1994  23.00     NA  13.00
May 1994  23.00     NA  13.00
Jun 1994   7.67     NA   4.33
Jul 1994  23.00     NA  13.00
Aug 1994  23.00     NA  13.00
Sep 1994  23.00     NA  13.00
Oct 1994  23.00     NA  13.00
Nov 1994  23.00     NA  13.00
Dec 1994     NA     NA     NA
Jan 1995     NA     NA  14.00
Feb 1995     NA     NA   7.00
Mar 1995     NA     NA  14.00
Apr 1995     NA     NA  14.00
May 1995     NA     NA  14.00
Jun 1995     NA     NA   4.67
Jul 1995     NA     NA  14.00
Aug 1995     NA     NA  14.00
Sep 1995     NA     NA  14.00
Oct 1995     NA     NA  14.00
Nov 1995     NA     NA  14.00



Re: [R] Data frame reordering to time series

2010-08-08 Thread Gabor Grothendieck
On Sun, Aug 8, 2010 at 11:21 AM, steven mosher  wrote:
> Ok,
> I'm a bit confused by what you mean by "regularly spaced"
> After I do the  do.call I do get a data structure with all the times present
> and every time has a NA or a data value.
> Steve
>

regularly spaced means that every observation is one month later than
the prior.  If there are missing 6 month chunks or missing entire
years then the observations are not regularly spaced since there are
some months not present.
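
To make the definition concrete, here is a small base R sketch. The helper names month_index and is_regular and the two example series are invented for illustration; this is a strict check that each observation is exactly one month after the previous one.

```r
# Month index: consecutive months differ by exactly 1
month_index <- function(year, month) year * 12 + (month - 1)

# Hypothetical series a: every month of 1990 and 1991
a <- month_index(rep(1990:1991, each = 12), rep(1:12, times = 2))

# Hypothetical series b: every month of 1990 and 1992, with 1991 absent
b <- month_index(rep(c(1990, 1992), each = 12), rep(1:12, times = 2))

# Regularly spaced in the sense above: each step is one month
is_regular <- function(idx) all(diff(idx) == 1)

regular_a <- is_regular(a)
regular_b <- is_regular(b)
```

Series b has a value for every month of the years it covers, yet the jump from December 1990 to January 1992 makes it irregular.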

It works for me:

> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>  
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
+  Oct=Values,Nov=Values,Dec=Values2)
>
> library(zoo)
> f <- function(x) {
+   dat <- x[-(1:2)]
+   tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
+   zoo(c(as.matrix(dat)), tim)
+ }
> do.call(cbind, by(Data, Data$Index, f))
         X12345 X67543 X89765
Jan 1989     NA  12.00     NA
Feb 1989     NA   6.00     NA
Mar 1989     NA  12.00     NA
Apr 1989     NA  12.00     NA
May 1989     NA  12.00     NA
Jun 1989     NA   4.00     NA
Jul 1989     NA  12.00     NA
Aug 1989     NA  12.00     NA
Sep 1989     NA  12.00     NA
Oct 1989     NA  12.00     NA
Nov 1989     NA  12.00     NA
Jan 1990     NA  14.00     NA
Feb 1990     NA   7.00     NA
Mar 1990     NA     NA     NA
Apr 1990     NA     NA     NA
May 1990     NA  14.00     NA
Jun 1990     NA   4.67     NA
Jul 1990     NA     NA     NA
Aug 1990     NA  14.00     NA
Sep 1990     NA  14.00     NA
Oct 1990     NA  14.00     NA
Nov 1990     NA     NA     NA
Jan 1991  54.00  34.00  12.00
Feb 1991  27.00  17.00   6.00
Mar 1991     NA  34.00     NA
Apr 1991     NA  34.00     NA
May 1991  54.00  34.00  12.00
Jun 1991  18.00  11.33   4.00
Jul 1991     NA  34.00     NA
Aug 1991  54.00  34.00  12.00
Sep 1991  54.00  34.00  12.00
Oct 1991  54.00  34.00  12.00
Nov 1991     NA  34.00     NA
Jan 1992     NA  21.00  13.00
Feb 1992     NA  10.50   6.50
Mar 1992     NA  21.00  13.00
Apr 1992     NA  21.00  13.00
May 1992     NA  21.00  13.00
Jun 1992     NA   7.00   4.33
Jul 1992     NA  21.00  13.00
Aug 1992     NA  21.00  13.00
Sep 1992     NA  21.00  13.00
Oct 1992     NA  21.00  13.00
Nov 1992     NA  21.00  13.00
Jan 1993  65.00     NA  13.00
Feb 1993  32.50     NA   6.50
Mar 1993  65.00     NA     NA
Apr 1993  65.00     NA     NA
May 1993  65.00     NA  13.00
Jun 1993  21.67     NA   4.33
Jul 1993  65.00     NA     NA
Aug 1993  65.00     NA  13.00
Sep 1993  65.00     NA  13.00
Oct 1993  65.00     NA  13.00
Nov 1993  65.00     NA     NA
Jan 1994  23.00     NA  13.00
Feb 1994  11.50     NA   6.50
Mar 1994  23.00     NA  13.00
Apr 1994  23.00     NA  13.00
May 1994  23.00     NA  13.00
Jun 1994   7.67     NA   4.33
Jul 1994  23.00     NA  13.00
Aug 1994  23.00     NA  13.00
Sep 1994  23.00     NA  13.00
Oct 1994  23.00     NA  13.00
Nov 1994  23.00     NA  13.00
Jan 1995     NA     NA  14.00
Feb 1995     NA     NA   7.00
Mar 1995     NA     NA  14.00
Apr 1995     NA     NA  14.00
May 1995     NA     NA  14.00
Jun 1995     NA     NA   4.67
Jul 1995     NA     NA  14.00
Aug 1995     NA     NA  14.00
Sep 1995     NA     NA  14.00
Oct 1995     NA     NA  14.00
Nov 1995     NA     NA  14.00



Re: [R] Data frame reordering to time series

2010-08-08 Thread steven mosher
Ok,

I'm a bit confused by what you mean by "regularly spaced"
After I do the  do.call I do get a data structure with all the times present
and every time has a NA or a data value.

Steve




Re: [R] Data frame reordering to time series

2010-08-08 Thread Gabor Grothendieck
On Sun, Aug 8, 2010 at 2:01 AM, steven mosher  wrote:
> In the real data the months are all complete, but the years can be missing.
> So years can be missing up front, in the middle, or at the end, but if a year
> is present then every month has a value or NA.
> To create a regular R ts I had to plow through the data frame, collect a year,
> and calculate an index to put it into the final time series.
>
> I had tried zoo out and it handled the irregularly spaced data, but a large
> data structure of zoo objects had stumped me, especially since I need to do
> matching and selecting of the zoo objects.
> In the real data, there are about 7000 time series of 1500 months and those
> 7000 get averaged and combined in different ways.

If there are missing years and you want to get a regularly spaced
series out then use the zoo version of f (rather than the ts version of f)
and if this is the last statement (same as before but assigning
it to the variable z):

   z <- do.call(cbind, by(Data, Data$Index, f))

then to get a regularly spaced ts object just do this:

   as.ts(z)

or

   as.zooreg(as.ts(z))

to create a regularly spaced zooreg object.
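
A small sketch of that gap filling, assuming the zoo package is installed. The series below is invented: it has monthly values for 1990 and 1992 and nothing at all for 1991.

```r
library(zoo)

# Hypothetical irregular series: all of 1990 and all of 1992, no 1991
tim <- as.yearmon(c(1990 + 0:11 / 12, 1992 + 0:11 / 12))
z <- zoo(1:24, tim)

# Converting to ts yields a regular monthly series; the missing
# year shows up as NA values
r <- as.ts(z)

# Back to zoo, now as a regularly spaced zooreg object
zr <- as.zooreg(r)
```

The ts result covers every month from the first to the last observation, so the downstream averaging across series can rely on aligned rows.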



Re: [R] Data frame reordering to time series

2010-08-07 Thread steven mosher
In the real data the months are all complete, but the years can be missing.
So years can be missing up front, in the middle, or at the end, but if a year
is present then every month has a value or NA.

To create a regular R ts I had to plow through the data frame, collect a year,
and calculate an index to put it into the final time series.

I had tried zoo out and it handled the irregularly spaced data, but a large
data structure of zoo objects had stumped me, especially since I need to do
matching and selecting of the zoo objects.

In the real data, there are about 7000 time series of 1500 months and those
7000 get averaged and combined in different ways.





Re: [R] Data frame reordering to time series

2010-08-07 Thread Gabor Grothendieck
On Sat, Aug 7, 2010 at 9:18 PM, steven mosher  wrote:
> Very Slick.
> Gabor this is a Huge speed up for me. Thanks. ha, Now I want to rewrite a
> bunch of working code.
>
>
>
> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>  Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
>  Data
>    Index Year Jan  Feb Mar Apr Jun
> 1  67543 1989  12  6.0  12  12  12
> 2  67543 1990  14  7.0  NA  NA  14
> 3  67543 1991  34 17.0  34  34  34
> 4  67543 1992  21 10.5  21  21  21
> 5  12345 1991  54 27.0  NA  NA  54
> 6  12345 1993  65 32.5  65  65  65
> 7  12345 1994  23 11.5  23  23  23
> 8  89765 1991  12  6.0  NA  NA  12
> 9  89765 1992  13  6.5  13  13  13
> 10 89765 1993  13  6.5  NA  NA  13
> 11 89765 1994  13  6.5  13  13  13
> 12 89765 1995  14  7.0  14  14  14
> #  Gabor's solution
>  f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
>  do.call(cbind, by(Data, Data$Index, f))
>              12345 67543 89765


The original data had consecutive months in each series (actually
there was a missing 1992 in one case but I assumed that was an
inadvertent omission and the actual data was complete); however, here
we have missing months within each year in addition (only 5 of the 12
monthly columns are present).  That makes the series non-consecutive so
to solve that we could either pad each row out to 12 monthly values
(after putting the missing 1992 year back in):

Data <- cbind(Data, NA, NA, NA, NA, NA, NA, NA)

or we could use a time series class that can handle irregularly spaced data:

library(zoo)
f <- function(x) {
  dat <- x[-(1:2)]
  tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
  zoo(c(as.matrix(dat)), tim)
}
do.call(cbind, by(Data, Data$Index, f))

The last line is  unchanged from before.  This code will also handle
the original situation correctly even if the missing 1992 is truly
missing.
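
For the ts-based route, a minimal base R sketch on invented data (one ID, two consecutive years, five monthly columns). The names n_missing, padded and s are only illustrative. It pads each row to 12 monthly values before flattening, since ts with frequency 12 assumes 12 values per year.

```r
# Invented example: one series, two consecutive years, 5 month columns
Data <- data.frame(Index = 1, Year = 1989:1990,
                   Jan = c(1, 6), Feb = c(2, 7), Mar = c(3, 8),
                   Apr = c(4, 9), May = c(5, 10))

# Number of NA columns needed so each row holds 12 monthly values
n_missing <- 12 - (ncol(Data) - 2)
padded <- cbind(Data, matrix(NA_real_, nrow(Data), n_missing))

# Flatten the month columns row by row into one monthly series
f <- function(x) ts(c(t(x[-(1:2)])), frequency = 12, start = x$Year[1])
s <- f(padded)
```

Without the padding, the second year's values would start right after the fifth month of the first year and every later value would carry the wrong date.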



Re: [R] Data frame reordering to time series

2010-08-07 Thread steven mosher
Very Slick.

Gabor this is a Huge speed up for me. Thanks. ha, Now I want to rewrite a
bunch of working code.




Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
 Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
 Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
 
Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
 Data
   Index Year Jan  Feb Mar Apr Jun
1  67543 1989  12  6.0  12  12  12
2  67543 1990  14  7.0  NA  NA  14
3  67543 1991  34 17.0  34  34  34
4  67543 1992  21 10.5  21  21  21
5  12345 1991  54 27.0  NA  NA  54
6  12345 1993  65 32.5  65  65  65
7  12345 1994  23 11.5  23  23  23
8  89765 1991  12  6.0  NA  NA  12
9  89765 1992  13  6.5  13  13  13
10 89765 1993  13  6.5  NA  NA  13
11 89765 1994  13  6.5  13  13  13
12 89765 1995  14  7.0  14  14  14

#  Gabor's solution

 f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
 do.call(cbind, by(Data, Data$Index, f))
         12345 67543 89765
Jan 1989    NA  12.0    NA
Feb 1989    NA   6.0    NA
Mar 1989    NA  12.0    NA
Apr 1989    NA  12.0    NA
May 1989    NA  12.0    NA
Jun 1989    NA  14.0    NA
Jul 1989    NA   7.0    NA
Aug 1989    NA    NA    NA
Sep 1989    NA    NA    NA
Oct 1989    NA  14.0    NA
Nov 1989    NA  34.0    NA
Dec 1989    NA  17.0    NA
Jan 1990    NA  34.0    NA
Feb 1990    NA  34.0    NA
Mar 1990    NA  34.0    NA
Apr 1990    NA  21.0    NA
May 1990    NA  10.5    NA
Jun 1990    NA  21.0    NA
Jul 1990    NA  21.0    NA
Aug 1990    NA  21.0    NA
Sep 1990    NA    NA    NA
Oct 1990    NA    NA    NA
Nov 1990    NA    NA    NA
Dec 1990    NA    NA    NA
Jan 1991  54.0    NA  12.0
Feb 1991  27.0    NA   6.0
...

On Sat, Aug 7, 2010 at 5:09 PM, steven mosher wrote:

> Thanks Gabor, I probably should have done an example with fewer columns.
>
> i will rework the example and post it up so the next guys who has this
> issue can have a
> clear example with a solution.
>
>
>
> On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck <
> ggrothendi...@gmail.com> wrote:
>
>> On Sat, Aug 7, 2010 at 4:49 PM, steven mosher 
>> wrote:
>> > Given a data frame, or it could be a matrix if I choose to.
>> > The data consists of an ID, a year, and data for all 12 months.
>> > Missing values are a factor AND missing years.
>> >
>> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>> >  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>> >  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>> >  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>> >
>>  
>> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
>> > + Oct=Values,Nov=Values,Dec=Values2)
>> >  Data
>> >   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
>> > 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
>> > 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
>> > 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
>> > 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
>> > 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
>> > 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
>> > 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
>> > 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
>> > 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
>> > 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
>> > 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
>> > 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
>> >
>> >
>> > The Goal is to return a Time series object for each ID. Alternatively
>> one
>> > could return a matrix that I can turn into a Time series.
>> > The final structure would be something like this ( done in matrix form
>> for
>> > illustration)
>> >  1989.0  1989.083
>> >1991 ..19921993. 1994  1995
>> > 67543 12   6.0   12  12  12  4.00  12  12  12  12  12...
>> > .34...21.. NA.NANA
>> > 12345  NA, NA,
>> > NA,.54 27
>> >
>> > Basically the time series will have patches at the front, middle and end
>> > where you may have years of NA
>> > The must be column ordered by time and aligned so that averages for all
>> > series can be computed per month.
>> >
>> > Now I have looping code to do this, where I loop through all the IDs and
>> map
>> > the row of data into the correct
>> > column. and create column names based on the data and row names based on
>> the
>> > ID, but it's painfully
>> > slow. Any wizardry would help.
>>
>> Your email came out a bit garbled so its not clear what you want to
>> get out but this code will produce a multivariate ts series, i.e. an
>> mts series, with one column for each series:
>>
>> f <- function(x) ts(c(t(x[-(1:2

Re: [R] Data frame reordering to time series

2010-08-07 Thread steven mosher
Thanks Gabor, I probably should have done an example with fewer columns.

I will rework the example and post it so that the next person who has this
issue can have a clear example with a solution.



On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck
wrote:

> On Sat, Aug 7, 2010 at 4:49 PM, steven mosher 
> wrote:
> > Given a data frame, or it could be a matrix if I choose to.
> > The data consists of an ID, a year, and data for all 12 months.
> > Missing values are a factor AND missing years.
> >
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> >  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> >  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> >  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
>  
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> > + Oct=Values,Nov=Values,Dec=Values2)
> >  Data
> >   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
> > 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> > 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> > 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> > 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> > 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> > 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> > 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> > 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> > 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> > 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> > 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> > 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
> >
> >
> > The Goal is to return a Time series object for each ID. Alternatively one
> > could return a matrix that I can turn into a Time series.
> > The final structure would be something like this ( done in matrix form
> for
> > illustration)
> >  1989.0  1989.083
> >1991 ..19921993. 1994  1995
> > 67543 12   6.0   12  12  12  4.00  12  12  12  12  12...
> > .34...21.. NA.NANA
> > 12345  NA, NA,
> > NA,.54 27
> >
> > Basically the time series will have patches at the front, middle and end
> > where you may have years of NA
> > The must be column ordered by time and aligned so that averages for all
> > series can be computed per month.
> >
> > Now I have looping code to do this, where I loop through all the IDs and
> map
> > the row of data into the correct
> > column. and create column names based on the data and row names based on
> the
> > ID, but it's painfully
> > slow. Any wizardry would help.
>
> Your email came out a bit garbled so its not clear what you want to
> get out but this code will produce a multivariate ts series, i.e. an
> mts series, with one column for each series:
>
> f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> do.call(cbind, by(Data, Data$Index, f))
>



Re: [R] Data frame reordering to time series

2010-08-07 Thread Gabor Grothendieck
On Sat, Aug 7, 2010 at 4:49 PM, steven mosher  wrote:
> Given a data frame, or it could be a matrix if I choose to.
> The data consists of an ID, a year, and data for all 12 months.
> Missing values are a factor AND missing years.
>
> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>  Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> + Oct=Values,Nov=Values,Dec=Values2)
>  Data
>   Index Year Jan  Feb Mar Apr Jun      July Aug Sep Oct Nov Dec
> 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
>
>
> The Goal is to return a Time series object for each ID. Alternatively one
> could return a matrix that I can turn into a Time series.
> The final structure would be something like this ( done in matrix form for
> illustration)
>          1989.0  1989.083
>    1991 ..19921993. 1994  1995
> 67543 12       6.0   12  12  12  4.00  12  12  12  12  12...
> .34...21..     NA.NANA
> 12345  NA, NA,
> NA,.54 27
>
> Basically the time series will have patches at the front, middle and end
> where you may have years of NA
> The must be column ordered by time and aligned so that averages for all
> series can be computed per month.
>
> Now I have looping code to do this, where I loop through all the IDs and map
> the row of data into the correct
> column. and create column names based on the data and row names based on the
> ID, but it's painfully
> slow. Any wizardry would help.

Your email came out a bit garbled, so it's not clear what you want to
get out, but this code will produce a multivariate ts series, i.e. an
mts series, with one column for each series:

f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
do.call(cbind, by(Data, Data$Index, f))
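
One caveat with the two-liner above: it flattens whatever rows are present,
so it presumes that every ID has a row for each year and twelve month
columns (the posted Data skips 1992 for ID 12345 and has no May column).
A minimal sketch of one way to pad the missing years with NA first,
assuming month columns named as in month.abb. The helper name to_ts and
the toy data are made up for illustration:

```r
# Toy data: two IDs, twelve month columns, ID 1 has no row for 1990
m <- outer(c(10, 30, 20), 1:12, "+")
colnames(m) <- month.abb
Data <- data.frame(Index = c(1, 1, 2), Year = c(1989, 1991, 1990), m)

# Hypothetical helper: pad missing years with NA rows, then flatten
# the year-by-month block into one monthly series
to_ts <- function(x, months = month.abb) {
  years <- seq(min(x$Year), max(x$Year))
  full <- merge(data.frame(Year = years), x, by = "Year", all.x = TRUE)
  ts(c(t(as.matrix(full[, months]))), frequency = 12, start = c(min(years), 1))
}

# One series per ID; cbind() aligns them on a common time axis
series <- do.call(cbind, lapply(split(Data, Data$Index), to_ts))

# Per-month average across all series, ignoring the NA patches
monthly_mean <- rowMeans(series, na.rm = TRUE)
```

Because cbind() on ts objects takes the union of the time ranges, series
that start late or end early are padded with NA at the front or the end.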



[R] Data frame reordering to time series

2010-08-07 Thread steven mosher
I have a data frame (it could be a matrix if I chose).
The data consists of an ID, a year, and data for all 12 months.
Both missing values AND missing years are a factor.

Id <- c(rep(67543,4), rep(12345,3), rep(89765,5))
Years <- c(seq(1989,1992,by=1), 1991, 1993, 1994, seq(1991,1995,by=1))
Values2 <- c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
Values <- c(12,14,34,21,54,65,23,12,13,13,13,14)
Data <- data.frame(Index=Id, Year=Years, Jan=Values, Feb=Values/2, Mar=Values2,
                   Apr=Values2, Jun=Values, July=Values/3, Aug=Values2, Sep=Values,
                   Oct=Values, Nov=Values, Dec=Values2)
Data
   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14


The Goal is to return a Time series object for each ID. Alternatively one
could return a matrix that I can turn into a Time series.
The final structure would be something like this ( done in matrix form for
illustration)
  1989.0  1989.083
1991 ..19921993. 1994  1995
67543 12   6.0   12  12  12  4.00  12  12  12  12  12...
.34...21.. NA.NANA
12345  NA, NA,
NA,.54 27

Basically the time series will have patches at the front, middle and end
where you may have years of NA.
They must be column-ordered by time and aligned so that averages for all
series can be computed per month.

Now I have looping code to do this, where I loop through all the IDs and map
each row of data into the correct column, and create column names based on
the data and row names based on the ID, but it's painfully slow.
Any wizardry would help.



Re: [R] Data Handling

2010-08-02 Thread Lily_stats

Hi,

I have managed to convert my data frames into xts such as :

> str(z)
An ‘xts’ object from 1983-01-03 19:00:00 to 2006-01-01 22:00:00 containing:
  Data: num [1:182959, 1:2] 12.6 11.3 12.7 12.8 10.9 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "v" "DD"
  Indexed by objects of class: [POSIXt,POSIXct] TZ:
  Original class: 'data.frame'
  xts Attributes:
 NULL
I have a second set of data (z1) and would like to pull out the values of "v"
at the times where the timestamps of z and z1 match exactly.

I have tried to look at cbind etc, but I am stuck and very confused!

Any help is appreciated

On Fri, Jul 30, 2010 at 12:44 PM, raghu [via R] <
ml-node+2307836-1452362937-369...@n4.nabble.com
> wrote:

> Convert your datasets into xts objects and then do a cbind ordering by the
> column you want. Do a ?cbind.
>
> HTH
> Raghu
>
>  On Fri, Jul 30, 2010 at 10:33 AM, Lily_stats [via R] <[hidden 
> email]
> > wrote:
>
>> Hi,
>>
>> I am very new to R so these questions may seem simple!
>>
>> I have a huge 2 sets of data(matrix 5x2++) in the following formats ,
>> for example "data.txt" and "data2.txt":
>>
>> Date   Time X   Y
>> 03/03/1983  20:00   0.1  990
>>
>> I would like to recreate a new matrix which filters through "data.txt" and
>> "data2.txt" to get something as below :
>>
>> Date Time   X_data1  X_data2
>>  Y_data1  Y_data2
>> 31/12/2000 12:00 2.25
>>  0990
>>
>> So I basically need :
>> 1) When Date AND Time from data1.txt and data2.txt match, list the
>> corresponding X and Y values (X_data1,X_data2,Y_data1,Y_data2)
>>
>> Thank you in advance, and I hope I have been clear enough in my message
>>
>
>
> --
> 'Raghu'
>
>

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Handling-tp2307770p2310318.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Handling

2010-07-30 Thread raghu

Please try:
data <- xts(data[, 2:n], order.by = as.POSIXct(strptime(data[, 1], "%d/%m/%Y")))

Use a similar strptime format for the hours as well. n = number of columns.
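
To make the hint about the hours concrete, here is a minimal sketch that
pastes the Date and Time columns together and parses them in one step. It
assumes the dd/mm/yyyy and hh:mm layout shown in the original post; the
column names, the toy rows and the UTC time zone are assumptions:

```r
# Toy rows in the layout from the original post (Date, Time, X, Y)
data1 <- data.frame(Date = c("03/03/1983", "31/12/2000"),
                    Time = c("20:00", "12:00"),
                    X = c(0.1, 2.25),
                    Y = c(990, 0))

# Combine date and time into one string and parse it with an explicit format
stamp <- as.POSIXct(paste(data1$Date, data1$Time),
                    format = "%d/%m/%Y %H:%M", tz = "UTC")

# Rows that do not match the format come back as NA, so fail early
stopifnot(!anyNA(stamp))
```

With the xts package loaded, stamp could then be passed as the order.by
argument, for example xts(data1[, c("X", "Y")], order.by = stamp).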

Good Luck
Raghu

On Fri, Jul 30, 2010 at 2:02 PM, Lily_stats [via R] <
ml-node+2307936-1777222343-309...@n4.nabble.com
> wrote:

> Hi,
>
> I am trying to convert my dataset into xts. I have tried the following :
>
> data1<-read.table("data1.txt",header=F)
> data2<-read.table("data2.txt",header=F)
>
> data1.xts
> However, I get an error :
>
> Error in as.POSIXlt.character(x, tz, ...) :
>   character string is not in a standard unambiguous format
>
> I understand that my date and time format might not be accepted and have
> tried to convert this but failed.
>
> Could you suggest something ?
>
> My date is in the format : dd/mm/
> My time is in the format : hh:00
>
> Thank you in advance
>
>
>


-- 
'Raghu'

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Handling-tp2307770p2307959.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Handling

2010-07-30 Thread raghu

Convert your datasets into xts objects and then do a cbind ordering by the
column you want. Do a ?cbind.
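
If the aim is only to line up the rows whose Date and Time agree, base R's
merge() on the two data frames may already be enough, without converting to
xts first. A minimal sketch, assuming both files have the columns Date,
Time, X and Y; the toy values are made up:

```r
# Stand-ins for read.table("data1.txt", header = TRUE) and the second file
data1 <- data.frame(Date = c("03/03/1983", "31/12/2000"),
                    Time = c("20:00", "12:00"),
                    X = c(0.1, 2.25),
                    Y = c(990, 0))
data2 <- data.frame(Date = c("31/12/2000", "01/01/2001"),
                    Time = c("12:00", "13:00"),
                    X = c(0, 1.5),
                    Y = c(990, 985))

# Keep only the rows where Date AND Time occur in both data frames;
# the suffixes label which file each X and Y came from
both <- merge(data1, data2, by = c("Date", "Time"),
              suffixes = c("_data1", "_data2"))
both
```

The result has the columns Date, Time, X_data1, Y_data1, X_data2 and
Y_data2, with one row per matching timestamp.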

HTH
Raghu

On Fri, Jul 30, 2010 at 10:33 AM, Lily_stats [via R] <
ml-node+2307770-1033893256-309...@n4.nabble.com
> wrote:

> Hi,
>
> I am very new to R so these questions may seem simple!
>
> I have a huge 2 sets of data(matrix 5x2++) in the following formats ,
> for example "data.txt" and "data2.txt":
>
> Date   Time X   Y
> 03/03/1983  20:00   0.1  990
>
> I would like to recreate a new matrix which filters through "data.txt" and
> "data2.txt" to get something as below :
>
> Date Time   X_data1  X_data2
>  Y_data1  Y_data2
> 31/12/2000 12:00 2.25
>  0990
>
> So I basically need :
> 1) When Date AND Time from data1.txt and data2.txt match, list the
> corresponding X and Y values (X_data1,X_data2,Y_data1,Y_data2)
>
> Thank you in advance, and I hope I have been clear enough in my message
>


-- 
'Raghu'

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Handling-tp2307770p2307836.html
Sent from the R help mailing list archive at Nabble.com.



[R] Data Handling

2010-07-30 Thread Lily_stats

Hi, 

I am very new to R so these questions may seem simple!

I have two huge sets of data (matrix 5x2++) in the following format,
for example "data.txt" and "data2.txt":

Date        Time   X    Y
03/03/1983  20:00  0.1  990

I would like to create a new matrix which filters through "data.txt" and
"data2.txt" to get something like the following:

Date        Time   X_data1  X_data2  Y_data1  Y_data2
31/12/2000  12:00  2.25  0  990

So I basically need :
1) When Date AND Time from data1.txt and data2.txt match, list the
corresponding X and Y values (X_data1,X_data2,Y_data1,Y_data2)

Thank you in advance, and I hope I have been clear enough in my message
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Handling-tp2307770p2307770.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Handling

2010-07-30 Thread Gabor Grothendieck
On Fri, Jul 30, 2010 at 9:02 AM, Lily_stats  wrote:
>
> Hi,
>
> I am trying to convert my dataset into xts. I have tried the following :
>
> data1<-read.table("data1.txt",header=F)
> data2<-read.table("data2.txt",header=F)
>
> data1.xts
> However, I get an error :
>
> Error in as.POSIXlt.character(x, tz, ...) :
>  character string is not in a standard unambiguous format
>
> I understand that my date and time format might not be accepted and have
> tried to convert this but failed.
>
> Could you suggest something ?
>
> My date is in the format : dd/mm/
> My time is in the format : hh:00
>
> Thank you in advance
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Data-Handling-tp2307770p2307936.html
> Sent from the R help mailing list archive at Nabble.com.
>

Can't say too much since there is no detail in your post but you can
do something like this:

library(xts) # this also loads zoo
library(chron) # if you wish to use chron
z <- read.zoo(...)
x <- as.xts(z)

where you may need to use FUN= and possibly the index.column= and
other arguments to read.zoo.  See ?read.zoo and the R News 4/1 article
on dates.



Re: [R] Data Handling

2010-07-30 Thread Lily_stats

Hi,

I am trying to convert my dataset into xts. I have tried the following :

data1<-read.table("data1.txt",header=F)
data2<-read.table("data2.txt",header=F)

data1.xts
http://r.789695.n4.nabble.com/Data-Handling-tp2307770p2307936.html
Sent from the R help mailing list archive at Nabble.com.



[R] Data frame modification

2010-07-28 Thread siddharth . garg85
Hi

I am trying to modify a data frame D with lists x and y so that any value of
x equal to 0 is replaced with the last non-zero value in x, i.e.

for (i in 2:nrow(D)) {
  if (D$x[i] == 0)
    D$x[i] <- D$x[i-1]
}

The data frame is quite large, about 43000 rows, and this operation is taking
a long time. Can someone please suggest what the reason might be?
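
A likely reason is that each D$x[i] assignment goes through the data frame
replacement machinery, which is expensive when repeated tens of thousands
of times. Working on the plain vector avoids that. A minimal vectorized
sketch, assuming x has no NA values; the function name fill_zeros and the
toy data are made up for illustration:

```r
fill_zeros <- function(x) {
  # index of the most recent non-zero entry at or before each position
  last_nonzero <- cummax(seq_along(x) * (x != 0))
  # leading zeros have no earlier non-zero value, so they are left as they are
  has_prev <- last_nonzero > 0
  x[has_prev] <- x[last_nonzero[has_prev]]
  x
}

D <- data.frame(x = c(0, 3, 0, 0, 5, 0), y = 1:6)
D$x <- fill_zeros(D$x)   # one assignment to the data frame instead of one per row
```

If a loop is preferred, pulling the column out first (x <- D$x), looping
over x, and assigning it back once at the end should also help.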

Thanks
Regards
Siddharth


Re: [R] data arranged by p-values

2010-07-26 Thread ONKELINX, Thierry
Have a look at ?cumsum. Apply it to a TRUE/FALSE vector (p-value > 0.05).
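
A minimal sketch of that idea, assuming a row with p-value > 0.05 ends a
region, as in the examples in the original post. The locations and
p-values are the first nine rows of the posted table; the item_values are
made up so that the peak is visible, and the column name p is an
assumption:

```r
d <- data.frame(
  location = c(3002737, 3017821, 3027730, 3036220, 3053984,
               3063892, 3076333, 3090500, 3103304),
  item_values = c(0.2, 0.5, 0.3, 0.1, 0.4, 0.9, 0.8, 0.6, 0.7),  # made up
  p = c(0.01, 0.05, 0.02, 0.04, 0.03, 0.07, 0.08, 0.02, 0.03)
)

# TRUE marks a row that breaks a region; the running count of breaks
# gives the same id to all rows between two breaks
above <- d$p > 0.05
d$region <- cumsum(above)

# Drop the breaking rows themselves, then summarise each remaining region
sig <- d[!above, ]
regions <- data.frame(
  start = as.vector(tapply(sig$location, sig$region, min)),
  end   = as.vector(tapply(sig$location, sig$region, max)),
  peak  = as.vector(tapply(sig$item_values, sig$region, max))
)
regions
```

This takes min() and max() of location as start and end, which relies on
the rows being sorted by location within each region.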



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On behalf of jd6688
> Sent: Monday 26 July 2010 7:07
> To: r-help@r-project.org
> Subject: [R] data arranged by p-values
> 
> 
> Id    cat1  location  item_values  p-values  sequence
> a111  1     3002737   0.196504377  0.01      1
> a112  1     3017821   0.196504377  0.05      2
> a113  1     3027730   0.196504377  0.02      3
> a114  1     3036220   0.196504377  0.04      4
> a115  1     3053984   0.196504377  0.03      5
> a116  1     3063892   0.196504377  0.07      6
> a117  1     3076333   0.196504377  0.08      7
> a118  1     3090500   0.196504377  0.02      8
> a119  1     3103304   0.196504377  0.03      9
> a120  1     3119350   0.196504377  0.05      10
> a121  1     3129884   0.196504377  0.01      11
> a122  1     3154598   0.196504377  0.03      12
> a123  1     3170910   0.196504377  0.05      13
> a124  1     3180712   0.196504377  0.06      14
> a125  1     3186519   0.196504377  0.07      15
> a126  1     3192256   0.196504377  0.09      16
> a127  1     3198441   0.196504377  0.01      17
> a128  1     3205784   0.196504377  0.02      18
> a129  1     3210685   0.196504377  0.03      19
> a130  1     3218542   0.196504377  0.04      20
> a131  1     3234318   0.196504377  0.05      21
> a132  1     3239972   0.196504377  0.09      22
> a133  1     3245663   0.196504377  0.05      23
> a134  1     3257997   0.196504377  0.02      24
> a135  1     3273226   0.196504377  0.03      26
> a136  1     3285404   0.196504377  0.04      27
> a137  1     3290332   0.196504377  0.05      28
> a138  1     3300679   0.196504377  0.03      29
> a139  1     3310164   0.196504377  0.09      30
> 
> 
> 
> first of all, please pay attention to the P -values, all the 
> rows with the p-value <0.05 will be considered as one region 
> until the p-value >0.05 identified. for instance: REGION 1 is 
> the rows from id a111 to id A115 .
> REGION 2  is the rows from id a118 to a123, etc.
> 
> what i am going to accomplish is to pick the start and end 
> location, and the peak value from the item_values for each region.
> 
> option 1:
> 
>loop through each row until the p-value>0.05 identified then
> start_location=the first location value
> end_location=the location value before the p>0.05
> peak_value of the item_values=the maximum one
> 
> option 2
> 
> create a sequence number for each row;
> subset the raw dataframe by p<0.05;
> the p-value regions will be identified by the gapped 
> sequence number.
> for instance
>from sequence 1 to 5 will be considering one region.
> 
> Id    cat1  location  item_values  p-values  sequence
> a111  1     3002737   0.196504377  0.01      1
> a112  1     3017821   0.196504377  0.05      2
> a113  1     3027730   0.196504377  0.02      3
> a114  1     3036220   0.196504377  0.04      4
> a115  1     3053984   0.196504377  0.03      5
> a118  1     3090500   0.196504377  0.02      8
> a119  1     3103304   0.196504377  0.03      9
> 
> 
> I need your recommendation on the different approach to 
> implement this?
> Thanks,
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/data-arranged-by-p-values-tp2301
> 909p2301909.html
> Sent from the R help mailing list archive at Nabble.com.
> 

[R] data arranged by p-values

2010-07-25 Thread jd6688

Id    cat1  location  item_values  p-values  sequence
a111  1     3002737   0.196504377  0.01      1
a112  1     3017821   0.196504377  0.05      2
a113  1     3027730   0.196504377  0.02      3
a114  1     3036220   0.196504377  0.04      4
a115  1     3053984   0.196504377  0.03      5
a116  1     3063892   0.196504377  0.07      6
a117  1     3076333   0.196504377  0.08      7
a118  1     3090500   0.196504377  0.02      8
a119  1     3103304   0.196504377  0.03      9
a120  1     3119350   0.196504377  0.05      10
a121  1     3129884   0.196504377  0.01      11
a122  1     3154598   0.196504377  0.03      12
a123  1     3170910   0.196504377  0.05      13
a124  1     3180712   0.196504377  0.06      14
a125  1     3186519   0.196504377  0.07      15
a126  1     3192256   0.196504377  0.09      16
a127  1     3198441   0.196504377  0.01      17
a128  1     3205784   0.196504377  0.02      18
a129  1     3210685   0.196504377  0.03      19
a130  1     3218542   0.196504377  0.04      20
a131  1     3234318   0.196504377  0.05      21
a132  1     3239972   0.196504377  0.09      22
a133  1     3245663   0.196504377  0.05      23
a134  1     3257997   0.196504377  0.02      24
a135  1     3273226   0.196504377  0.03      26
a136  1     3285404   0.196504377  0.04      27
a137  1     3290332   0.196504377  0.05      28
a138  1     3300679   0.196504377  0.03      29
a139  1     3310164   0.196504377  0.09      30


First of all, please pay attention to the p-values: consecutive rows with
p-value <= 0.05 are considered one region, which ends when a row with
p-value > 0.05 is found. For instance, REGION 1 is the rows from Id a111 to
Id a115, REGION 2 is the rows from Id a118 to a123, etc.

What I want to accomplish is to pick the start and end location, and the
peak value of item_values, for each region.

Option 1:

   Loop through the rows until a p-value > 0.05 is found, then
start_location = the first location value
end_location = the location value before the p > 0.05 row
peak_value of the item_values = the maximum one

Option 2:

Create a sequence number for each row;
subset the raw data frame by p <= 0.05;
the p-value regions are then identified by the gaps in the sequence number.
For instance, sequence 1 to 5 will be considered one region.

Id    cat1  location  item_values  p-values  sequence
a111  1     3002737   0.196504377  0.01      1
a112  1     3017821   0.196504377  0.05      2
a113  1     3027730   0.196504377  0.02      3
a114  1     3036220   0.196504377  0.04      4
a115  1     3053984   0.196504377  0.03      5
a118  1     3090500   0.196504377  0.02      8
a119  1     3103304   0.196504377  0.03      9


I need your recommendation on which approach to use to implement this.
Thanks,

-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-arranged-by-p-values-tp2301909p2301909.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] data from SpatialGridDataFrame

2010-07-20 Thread Kingsford Jones
see ?sp::overlay and section 5.2 of Applied Spatial Data Analysis with R

I see there is now also raster::overlay, but I can't claim experience
with that function (however, my impression is that the raster package
is a powerful tool for working with potentially very large rasters in
R).

hth,
Kingsford



On Tue, Jul 20, 2010 at 6:12 AM,   wrote:
> Dear All,
>
> I have a raster map of the class 'SpatialPointsDataFrame' and coordinates
> of the class 'SpatialPoints'. I would like to retrieve the values that are
> contained in the raster map at the specific locations given by the
> coordinates.
>
> Can anyone help me out?
>
> Kind regards,
> Katrin Fleischer
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] data from SpatialGridDataFrame

2010-07-20 Thread chris howden
I'm not that familiar with this type of data.

I just had a similar issue, but had a GIS person do it in Arc view.

But maybe try some of the following functions?
match
%in%

Plus I'll forward you the replies I got to my post

Good luck :-)

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of kfl...@falw.vu.nl
Sent: Tuesday, 20 July 2010 9:42 PM
To: r-help@r-project.org
Subject: [R] data from SpatialGridDataFrame

Dear All,

I have a raster map of the class 'SpatialPointsDataFrame' and coordinates
of the class 'SpatialPoints'. I would like to retrieve the values that are
contained in the raster map at the specific locations given by the
coordinates.

Can anyone help me out?

Kind regards,
Katrin Fleischer



[R] data from SpatialGridDataFrame

2010-07-20 Thread kfleis
Dear All,

I have a raster map of the class 'SpatialPointsDataFrame' and coordinates
of the class 'SpatialPoints'. I would like to retrieve the values that are
contained in the raster map at the specific locations given by the
coordinates.

Can anyone help me out?

Kind regards,
Katrin Fleischer



Re: [R] data export HELP!

2010-07-11 Thread jim holtman
Not necessarily the best way if your dataframe will get large, but it
should work:

parameters <- NULL  # where you will collect the result
for(j in 1:dim(r)[2]){
   indiv=r[,j][which(r[,j]>-1)] #removes -1 growth data
   age.1=age[1:length(indiv)]
   length.ind=data.frame(age.1,indiv, row.names=TRUE) #data frame of ages and length

   # von B growth estimate
   est.ind=nls(indiv~Linf*(1-exp(-K*(age.1-to))),start=lkt.A,data=length.ind)
   summary=summary(est.ind, correlation=TRUE) #gives parameter estimate values

   # append this individual's parameter estimates to the collected results
   parameters <- rbind(parameters,
                       data.frame("Linf"=coef(est.ind)[1],"K"=coef(est.ind)[2]))

   }
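
If the number of individuals grows, one common alternative to calling
rbind() inside the loop is to collect each result in a list and bind once
at the end, then write a single file. A minimal sketch of that pattern;
fit_one is a made-up stand-in that fits a straight line with lm() so the
example runs anywhere (in the thread it would wrap the nls() call), and the
toy data and file name are assumptions:

```r
# Hypothetical stand-in for the per-individual fit
fit_one <- function(age, len) {
  fit <- lm(len ~ age)
  data.frame(intercept = coef(fit)[[1]], slope = coef(fit)[[2]])
}

# Toy data: one column per individual, -1 marks missing growth data
age <- 1:5
r <- cbind(fish1 = c(2, 4, 6, 8, 10),
           fish2 = c(1, 4, 7, -1, -1))

# Preallocate a list with one slot per individual and fill it in the loop
results <- vector("list", ncol(r))
for (j in seq_len(ncol(r))) {
  keep <- r[, j] > -1
  results[[j]] <- fit_one(age[keep], r[keep, j])
}

# Bind all rows once, label them, and write ONE csv file
parameters <- do.call(rbind, results)
parameters$individual <- colnames(r)
out_file <- tempfile(fileext = ".csv")
write.csv(parameters, out_file, row.names = FALSE)
```

The resulting file has one row per individual and can be opened directly
in Excel.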

On Sun, Jul 11, 2010 at 3:54 PM, adriana1986  wrote:
>
> Hello!
> So, this is going to seem like a very simple question - I am quite new to R
> and am having some trouble figuring out little nuances:
>
> I am running a loop that goes through my data and performs a nls parameter
> estimation on each data set. At the end of the loop, I would like to collect
> the parameter estimates in ONE SINGLE DATA FRAME or WRITE.TABLE that I can
> import into excel.
>
> Here is my code:
>
>
> for(j in 1:dim(r)[2]){
>        indiv=r[,j][which(r[,j]>-1)] #removes -1 growth data
>        age.1=age[1:length(indiv)]
>        length.ind=data.frame(age.1,indiv, row.names=TRUE) #data frame of
> ages and length
>
>
> est.ind=nls(indiv~Linf*(1-exp(-K*(age.1-to))),start=lkt.A,data=length.ind) #
> von b growth estimate
>        summary=summary(est.ind, correlation=TRUE) #gives parameter estimate
> values
>
>
>        parameters=data.frame("Linf"=coef(est.ind)[1],"K"=coef(est.ind)[2])
> #data frame of parameter estimates
>
>        }
>
> if I just type "parameters" I only get the parameter estimates for the LAST
> individual that was in the loop. How can i combine EVERY SINGLE parameter
> estimate into a data.frame or something of the sort and then export this set
> of data into ONE csv file. I can do it using write.table within my loop, but
> then I get 52 csv files -  I would like to avoid this.
>
> I hope this makes sense - any help would be GREATLY APPRECIATED
>
> thanks!



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] data export HELP!

2010-07-11 Thread adriana1986

Hello! 
So, this is going to seem like a very simple question - I am quite new to R
and am having some trouble figuring out little nuances:

I am running a loop that goes through my data and performs a nls parameter
estimation on each data set. At the end of the loop, I would like to collect
the parameter estimates in ONE SINGLE DATA FRAME or WRITE.TABLE that I can
import into excel. 

Here is my code: 


for(j in 1:dim(r)[2]){
    indiv=r[,j][which(r[,j]>-1)] #removes -1 growth data
    age.1=age[1:length(indiv)]
    #data frame of ages and length
    length.ind=data.frame(age.1,indiv, row.names=TRUE)

    # von b growth estimate
    est.ind=nls(indiv~Linf*(1-exp(-K*(age.1-to))),start=lkt.A,data=length.ind)
    #gives parameter estimate values
    summary=summary(est.ind, correlation=TRUE)

    #data frame of parameter estimates
    parameters=data.frame("Linf"=coef(est.ind)[1],"K"=coef(est.ind)[2])
}

If I just type "parameters", I only get the parameter estimates for the LAST
individual that was in the loop. How can I combine EVERY SINGLE parameter
estimate into a data.frame or something of the sort, and then export this set
of data into ONE CSV file? I can do it using write.table within my loop, but
then I get 52 CSV files, which I would like to avoid.

I hope this makes sense - any help would be GREATLY APPRECIATED 

thanks!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-export-HELP-tp2285445p2285445.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Frame Manipulation using function

2010-07-09 Thread David Winsemius
Really? I don't usually think of Vectorize as a performance
enhancement, probably because my use of it with a complex function then
gets applied to 4.5 million records. I need to go out, get a cup of
coffee, and leave it alone for about half an hour. I tried recently
to figure out how I could do the matrix look-up and function application
without the Vectorize route, but gave up after a couple of hours after
realizing that I had a method that worked and that I had spent way more
time on it than just doing it would have taken.

Glad it helped.
David.
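
For a simple whitelist check like the one in this thread, a single
vectorized grepl() call may avoid the per-element function calls altogether.
This is a sketch, not the method used in the thread: it assumes the patterns
can be joined with "|" into one regular expression, and the url and urlType
vectors are made up to mirror the poster's data.

```r
# Patterns from the thread; the url and urlType vectors are illustrative.
WHITELIST <- c("[?]query=", "[?]search=")
url <- c("www.yahoo.com", "www.google.com/?search=", "www.google.com",
         "www.yahoo.com/?query=", "www.gmail.com")
urlType <- c(1, 2, 1, 2, 1)

# One alternation pattern; grepl() returns one TRUE/FALSE per element.
pattern <- paste(WHITELIST, collapse = "|")
hit <- grepl(pattern, url)

url[urlType != 1 & hit]
```

Vectorize() is a convenience wrapper around mapply(), so it still calls the
wrapped function once per element; whether it beats an explicit loop will
depend on how the loop was written.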

On Jul 9, 2010, at 11:01 AM, harsh yadav wrote:

> Hi,
>
> Thanks a lot.
> The Vectorize method worked and it's much faster than looping through
> the data frame.
>
> Regards,
> Harsh Yadav
>
> On Thu, Jul 8, 2010 at 11:06 PM, David Winsemius wrote:
>
> On Jul 8, 2010, at 10:33 PM, Erik Iverson wrote:
>
>
> I have a data frame:
>   id url                     urlType
> 1  1 www.yahoo.com                 1
> 2  2 www.google.com/?search=       2
> 3  3 www.google.com                1
> 4  4 www.yahoo.com/?query=         2
> 5  5 www.gmail.com                 1
>
> This is not output from ?dput, which means more work to read it in.
>
>
> Yeah it was kind of pain, but ...
>
> dta <- read.table(textConnection(' id url urlType
> 1 1  "www.yahoo.com"  1
> 2 2  "www.google.com/?search=" 2
> 3 3  "www.google.com" 1
> 4 4  "www.yahoo.com/?query="   2
> 5 5  "www.gmail.com" 1') )
>
>
>
>
> Here is the definition for WHITELIST:-
> WHITELIST = "[?]query=, [?]search="
> WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
>
> What is the 'trim' function?  I do not have that defined.
>
> Perhaps David's answer will work for you...
>
> Seems to ... after I fixed my incorrect cmd-V paste of the function  
> name and guessing that trim was the one in gdata:
>
> > require(gdata)
>
> > checkBaseLine <- function(s){
> + for (listItem in WHITELIST){
> + if(regexpr(as.character(listItem), s)[1] > -1){
> + return(TRUE)
> + }
> + }
> + return(FALSE)
> + }
> >
> > #Here is the definition for WHITELIST:-
>
> >
> > WHITELIST = "[?]query=, [?]search="
> > WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
> > vcheck <- Vectorize(checkBaseLine)
> >
> > vcheck <- Vectorize(checkBaseLine)
> >
> > dta[ dta$urlType != 1 & vcheck(dta$url) , "url" ]
> [1] www.google.com/?search= www.yahoo.com/?query=
> 5 Levels: www.gmail.com www.google.com ... www.yahoo.com/?query=
>
> -- 
> David.
>

David Winsemius, MD
West Hartford, CT




Re: [R] Data Frame Manipulation using function

2010-07-09 Thread harsh yadav
Hi,

Thanks a lot.
The Vectorize method worked and it's much faster than looping through the
data frame.

Regards,
Harsh Yadav

On Thu, Jul 8, 2010 at 11:06 PM, David Winsemius wrote:

>
> On Jul 8, 2010, at 10:33 PM, Erik Iverson wrote:
>
>
>>> I have a data frame:
>>>   id url                     urlType
>>> 1  1 www.yahoo.com                 1
>>> 2  2 www.google.com/?search=       2
>>> 3  3 www.google.com                1
>>> 4  4 www.yahoo.com/?query=         2
>>> 5  5 www.gmail.com                 1
>>>
>>
>> This is not output from ?dput, which means more work to read it in.
>>
>>
> Yeah it was kind of pain, but ...
>
> dta <- read.table(textConnection(' id url urlType
> 1 1  "www.yahoo.com"  1
> 2 2  "www.google.com/?search=" 2
> 3 3  "www.google.com" 1
> 4 4  "www.yahoo.com/?query="   2
> 5 5  "www.gmail.com" 1') )
>
>
>
>
>>  Here is the definition for WHITELIST:-
>>> WHITELIST = "[?]query=, [?]search="
>>> WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
>>>
>>
>> What is the 'trim' function?  I do not have that defined.
>>
>> Perhaps David's answer will work for you...
>>
>
> Seems to ... after I fixed my incorrect cmd-V paste of the function name
> and guessing that trim was the one in gdata:
>
> > require(gdata)
>
> > checkBaseLine <- function(s){
> + for (listItem in WHITELIST){
> + if(regexpr(as.character(listItem), s)[1] > -1){
> + return(TRUE)
> + }
> + }
> + return(FALSE)
> + }
> >
> > #Here is the definition for WHITELIST:-
>
> >
> > WHITELIST = "[?]query=, [?]search="
> > WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
> > vcheck <- Vectorize(checkBaseLine)
> >
> > vcheck <- Vectorize(checkBaseLine)
> >
> > dta[ dta$urlType != 1 & vcheck(dta$url) , "url" ]
> [1] www.google.com/?search= www.yahoo.com/?query=
> 5 Levels: www.gmail.com www.google.com ... www.yahoo.com/?query=
>
> --
> David.
>



Re: [R] Data format question for triangle.plot package ade4

2010-07-09 Thread David Winsemius


On Jul 8, 2010, at 4:41 PM, steve_fried...@nps.gov wrote:

> hello,
>
> I am trying to develop a triangle plot but am having difficulty assigning
> the row.names to the 3 columns in the data.frame
>
> Here is what I've done,
>
> attach(SoilVegHydro)
>
> dim(SoilVegHydro)
> 129239
>
> # now  take 3 variables from main data.frame for plotting
>
> dat <- cbind.data.frame(TP, meanAnnualDepthAve, BulkDensity)  #  These are
> variables held in the data frame SoilVegHydro

Did that "dat" object have what you wanted? The function call did not
make any reference to SoilVegHydro. What does str(dat) return? Oh,
never mind, I now see you use attach.

> row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3),
> rep(1292, 3)), sep =" ")  # following the syntax from the help
> triangle.plot page

Generally row.names is used on a data frame rather than on a column
vector.

    > dat <- data.frame(1:3, LETTERS[1:3])
    > row.names(dat$X1)
    > row.names(dat)
    [1] "1" "2" "3"
    > length(row.names(dat$X1))
    [1] 0

> this is returned when the last line is submitted.
>
> row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3),
> rep(1292,3)), sep="")
> Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "1", "1", "1",  :
>   invalid 'row.names' length
>
> I'm not certain how to define the row.names .  If anyone can help I'd
> appreciate it.
>
> I'm using R 2.11.1 (2010-5-31) on Windows XP
>
> Thanks
> Steve
>
> Steve Friedman Ph. D.
> Spatial Statistical Analyst
> Everglades and Dry Tortugas National Park
> 950 N Krome Ave (3rd Floor)
> Homestead, Florida 33034
>
> steve_fried...@nps.gov
> Office (305) 224 - 4282
> Fax (305) 224 - 4147


David Winsemius, MD
West Hartford, CT
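
The "invalid 'row.names' length" error quoted above is what R reports when
the replacement vector does not have exactly one entry per row. If dat has
1292 rows, as the rep(1292, 3) call suggests, the quoted paste() call builds
3 * 1292 labels, and row.names() of a column vector contributes nothing to
them. A minimal sketch of one way to build valid row names follows; the small
data frame and the physio labels are hypothetical stand-ins, not the
poster's data.

```r
# Hypothetical stand-in for three columns of SoilVegHydro plus a category vector.
dat <- data.frame(TP = c(0.2, 0.5, 0.3, 0.4),
                  depth = c(10, 12, 9, 11),
                  bulk = c(1.1, 0.9, 1.0, 1.2))
physio <- c("marsh", "marsh", "prairie", "slough")  # assumed category labels

# Row names must be unique, and there must be exactly nrow(dat) of them.
# Numbering the rows within each category satisfies both requirements.
within_group <- ave(seq_along(physio), physio, FUN = seq_along)
row.names(dat) <- paste(physio, within_group, sep = " ")

row.names(dat)
```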



[R] Data format question for triangle.plot package ade4

2010-07-09 Thread Steve_Friedman


hello,

I am trying to develop a triangle plot but am having difficulty assigning
the row.names to the 3 columns in the data.frame

Here is what I've done,

attach(SoilVegHydro)

dim(SoilVegHydro)
129239

# now  take 3 variables from main data.frame for plotting

dat <- cbind.data.frame(TP, meanAnnualDepthAve, BulkDensity)  #  These are
variables held in the data frame SoilVegHydro

row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3),
rep(1292, 3)), sep =" ")  # following the syntax from the help
triangle.plot page

this is returned when the last line is submitted.

row.names(dat) <- paste(row.names(SoilVegHydro$Physiogomy), rep(c(1,2,3),
rep(1292,3)), sep="")
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "1", "1", "1",  :
  invalid 'row.names' length

I'm not certain how to define the row.names .  If anyone can help I'd
appreciate it.

I'm using R 2.11.1 (2010-5-31) on Windows XP

Thanks
Steve


Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147



Re: [R] Data Frame Manipulation using function

2010-07-08 Thread David Winsemius


On Jul 8, 2010, at 10:33 PM, Erik Iverson wrote:

>> I have a data frame:
>>   id url                     urlType
>> 1  1 www.yahoo.com                 1
>> 2  2 www.google.com/?search=       2
>> 3  3 www.google.com                1
>> 4  4 www.yahoo.com/?query=         2
>> 5  5 www.gmail.com                 1
>
> This is not output from ?dput, which means more work to read it in.

Yeah it was kind of pain, but ...

dta <- read.table(textConnection(' id url urlType
1 1  "www.yahoo.com"  1
2 2  "www.google.com/?search=" 2
3 3  "www.google.com" 1
4 4  "www.yahoo.com/?query="   2
5 5  "www.gmail.com" 1') )

>> Here is the definition for WHITELIST:-
>> WHITELIST = "[?]query=, [?]search="
>> WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
>
> What is the 'trim' function?  I do not have that defined.
>
> Perhaps David's answer will work for you...

Seems to ... after I fixed my incorrect cmd-V paste of the function
name and guessing that trim was the one in gdata:


> require(gdata)
> checkBaseLine <- function(s){
+ for (listItem in WHITELIST){
+ if(regexpr(as.character(listItem), s)[1] > -1){
+ return(TRUE)
+ }
+ }
+ return(FALSE)
+ }
>
> #Here is the definition for WHITELIST:-
>
> WHITELIST = "[?]query=, [?]search="
> WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
> vcheck <- Vectorize(checkBaseLine)
>
> vcheck <- Vectorize(checkBaseLine)
>
> dta[ dta$urlType != 1 & vcheck(dta$url) , "url" ]
[1] www.google.com/?search= www.yahoo.com/?query=
5 Levels: www.gmail.com www.google.com ... www.yahoo.com/?query=


--
David.



Re: [R] Data Frame Manipulation using function

2010-07-08 Thread Erik Iverson



> I have a data frame:
>
>   id url                     urlType
> 1  1 www.yahoo.com                 1
> 2  2 www.google.com/?search=       2
> 3  3 www.google.com                1
> 4  4 www.yahoo.com/?query=         2
> 5  5 www.gmail.com                 1

This is not output from ?dput, which means more work to read it in.

> Here is the definition for WHITELIST:-
>
> WHITELIST = "[?]query=, [?]search="
> WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))


What is the 'trim' function?  I do not have that defined.

Perhaps David's answer will work for you...
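
As the question above notes, trim is not a base R function; elsewhere in the
thread it is guessed to be the one from the gdata package. The same
split-and-strip step can be sketched with base R alone, assuming a version of
R that provides trimws() (it was added in R 3.2.0, after this thread was
written).

```r
WHITELIST <- "[?]query=, [?]search="

# Split on commas, then strip leading/trailing whitespace from each piece.
WHITELIST <- trimws(strsplit(WHITELIST, ",", fixed = TRUE)[[1]])
WHITELIST
```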



Re: [R] Data Frame Manipulation using function

2010-07-08 Thread David Winsemius


On Jul 8, 2010, at 10:09 PM, harsh yadav wrote:

> Hi,
>
> Here is a somewhat detailed explanation of what I want to achieve:
>
> I have a data frame:
>
>   id url                     urlType
> 1  1 www.yahoo.com                 1
> 2  2 www.google.com/?search=       2
> 3  3 www.google.com                1
> 4  4 www.yahoo.com/?query=         2
> 5  5 www.gmail.com                 1
>
> I want to get all the URLs that are not of type `1` and satisfy the
> condition defined by the following function:
>
> checkBaseLine <- function(s){
>     for (listItem in WHITELIST){
>         if(regexpr(as.character(listItem), s)[1] > -1){
>             return(TRUE)
>         }
>     }
>     return(FALSE)
> }
>
> Here is the definition for WHITELIST:-
>
> WHITELIST = "[?]query=, [?]search="
> WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))
>
> Now, for the given data frame I want to apply the above function for
> all row values for a given column:-
>
> That is:
>
> It works fine when I define a condition like:
> data <- data[data$urlType != 1,]


Arrrgh. Why do people keep using "data" as an object name? Is there
some water pump from which I can remove the handle?


Anyway ... try:

vcheck <- Vectorize(checkBaseLine)

data[ data$urlType != 1 & vcheck(data$url) , "url" ]

--
David
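
Why a wrapper is needed at all can be seen in a small sketch.
checkBaseLine() looks only at the first element of whatever it is given
(regexpr(...)[1]) and returns a single TRUE or FALSE, which R then recycles
across every row of the logical index. The url vector below is made up to
mirror the poster's example, and sapply() is shown only as an equivalent
alternative to Vectorize().

```r
WHITELIST <- c("[?]query=", "[?]search=")
checkBaseLine <- function(s) {
  for (listItem in WHITELIST) {
    if (regexpr(as.character(listItem), s)[1] > -1) return(TRUE)
  }
  FALSE
}

url <- c("www.yahoo.com", "www.google.com/?search=", "www.yahoo.com/?query=")

one_value <- checkBaseLine(url)        # a single value, based on url[1] only

vcheck <- Vectorize(checkBaseLine)
per_element <- unname(vcheck(url))     # one value per element of url

# sapply() applies the function element by element in the same way.
per_element_sapply <- unname(sapply(url, checkBaseLine))
```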


> However, I want to combine two logical conditions together like:
> data <- data[data$urlType != 1 & checkBaseLine(data$url),]
>
> This would check whether the column `urlType` contains row values that !=
> 1, and the column `url` contains row values that satisfy the function
> definition.
>
> Any ideas how this can be done?
>
> Thanks in advance.
>
> Regards,
> Harsh Yadav
>
>
> On Thu, Jul 8, 2010 at 9:43 PM, Erik Iverson wrote:
>
>> It will be a lot easier to help you if you follow the posting guide and
>> PLEASE do read the posting guide and provide commented, minimal,
>> self-contained, reproducible code.
>>
>> You gave your function definition, which is good.  Use ?dput to give us a
>> small data.frame that can accurately show what you want.
>>
>>
>> harsh yadav wrote:
>>
>>> Hi all,
>>>
>>> I have a data frame for which I want to limit the output by checking
>>> whether
>>> row values for specific column meets particular conditions.
>>>
>>> Here are the more specific details:
>>>
>>> I have a function that checks whether an input string exists in a defined
>>> list:-
>>>
>>> checkBaseLine <- function(s){
>>>  for (listItem in WHITELIST){
>>> if(regexpr(as.character(listItem), s)[1] > -1){
>>>  return(TRUE)
>>> }
>>> }
>>>  return(FALSE)
>>> }
>>>
>>> Now, I have a data frame for which I want to apply the above function for
>>> all row values for a given column:-
>>>
>>> This works fine when I define a condition like:
>>> data <- data[data$urlType != 1,]
>>>
>>> However, I want to combine two logical conditions together like:
>>> data <- data[data$urlType != 1 & checkBaseLine(data$url),]
>>>
>>> This would check whether the column `urlType` contains row values that !=
>>> 1,
>>> and the column `url` contains row values that gets evaluated using the
>>> defined function.
>>>
>>> Any ideas how this can be done?
>>>
>>> Thanks in advance.
>>>
>>> Regards,
>>> Harsh Yadav






David Winsemius, MD
West Hartford, CT



Re: [R] Data Frame Manipulation using function

2010-07-08 Thread harsh yadav
Hi,

Here is a somewhat detailed explanation of what I want to achieve:

I have a data frame:

  id url                     urlType
1  1 www.yahoo.com                 1
2  2 www.google.com/?search=       2
3  3 www.google.com                1
4  4 www.yahoo.com/?query=         2
5  5 www.gmail.com                 1

I want to get all the URLs that are not of type `1` and satisfy the
condition defined by the following function:

checkBaseLine <- function(s){
for (listItem in WHITELIST){
 if(regexpr(as.character(listItem), s)[1] > -1){
return(TRUE)
}
 }
return(FALSE)
}

Here is the definition for WHITELIST:-

WHITELIST = "[?]query=, [?]search="
WHITELIST <- unlist(trim(strsplit(trim(WHITELIST), ",")))

Now, for the given data frame I want to apply the above function for
all row values for a given column:-

That is:

It works fine when I define a condition like:
data <- data[data$urlType != 1,]

However, I want to combine two logical conditions together like:
data <- data[data$urlType != 1 & checkBaseLine(data$url),]

This would check whether the column `urlType` contains row values that !=
1, and the column `url` contains row values that satisfy the function
definition.

Any ideas how this can be done?

Thanks in advance.

Regards,
Harsh Yadav


On Thu, Jul 8, 2010 at 9:43 PM, Erik Iverson  wrote:

> It will be a lot easier to help you if you follow the posting guide and
> PLEASE do read the posting guide and provide commented, minimal,
> self-contained, reproducible code.
>
> You gave your function definition, which is good.  Use ?dput to give us a
> small data.frame that can accurately show what you want.
>
>
> harsh yadav wrote:
>
>> Hi all,
>>
>> I have a data frame for which I want to limit the output by checking
>> whether
>> row values for specific column meets particular conditions.
>>
>> Here are the more specific details:
>>
>> I have a function that checks whether an input string exists in a defined
>> list:-
>>
>> checkBaseLine <- function(s){
>>  for (listItem in WHITELIST){
>> if(regexpr(as.character(listItem), s)[1] > -1){
>>  return(TRUE)
>> }
>> }
>>  return(FALSE)
>> }
>>
>> Now, I have a data frame for which I want to apply the above function for
>> all row values for a given column:-
>>
>> This works fine when I define a condition like:
>> data <- data[data$urlType != 1,]
>>
>> However, I want to combine two logical conditions together like:
>> data <- data[data$urlType != 1 & checkBaseLine(data$url),]
>>
>> This would check whether the column `urlType` contains row values that !=
>> 1,
>> and the column `url` contains row values that gets evaluated using the
>> defined function.
>>
>> Any ideas how this can be done?
>>
>> Thanks in advance.
>>
>> Regards,
>> Harsh Yadav
>>
>
>



Re: [R] Data Frame Manipulation using function

2010-07-08 Thread Erik Iverson
It will be a lot easier to help you if you follow the posting guide and PLEASE 
do read the posting guide and provide commented, minimal, self-contained, 
reproducible code.


You gave your function definition, which is good.  Use ?dput to give us a small 
data.frame that can accurately show what you want.
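
What the dput() request above amounts to can be sketched as follows. The
data frame is a made-up stand-in for the poster's data; the round trip
through a temporary file is included only to show that the printed code
rebuilds the same object.

```r
# A small made-up data frame standing in for the poster's data.
df <- data.frame(id = 1:3,
                 url = c("www.yahoo.com", "www.google.com/?search=",
                         "www.gmail.com"),
                 urlType = c(1, 2, 1),
                 stringsAsFactors = FALSE)

# dput() prints R code that recreates the object; that text goes in the post.
dput(df)

# Round trip through a file: dget() evaluates the code that dput() wrote.
tmp <- tempfile(fileext = ".R")
dput(df, file = tmp)
df2 <- dget(tmp)
```

A reader can paste the printed structure(...) expression straight into an R
session, which is what makes the example reproducible.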



harsh yadav wrote:

Hi all,

I have a data frame for which I want to limit the output by checking whether
row values for specific column meets particular conditions.

Here are the more specific details:

I have a function that checks whether an input string exists in a defined
list:-

checkBaseLine <- function(s){
 for (listItem in WHITELIST){
if(regexpr(as.character(listItem), s)[1] > -1){
 return(TRUE)
}
}
 return(FALSE)
}

Now, I have a data frame for which I want to apply the above function for
all row values for a given column:-

This works fine when I define a condition like:
data <- data[data$urlType != 1,]

However, I want to combine two logical conditions together like:
data <- data[data$urlType != 1 & checkBaseLine(data$url),]

This would check whether the column `urlType` contains row values that !=
1,
and the column `url` contains row values that gets evaluated using the
defined function.

Any ideas how this can be done?

Thanks in advance.

Regards,
Harsh Yadav



[R] Data Frame Manipulation using function

2010-07-08 Thread harsh yadav
Hi all,

I have a data frame for which I want to limit the output by checking whether
row values for specific column meets particular conditions.

Here are the more specific details:

I have a function that checks whether an input string exists in a defined
list:-

checkBaseLine <- function(s){
 for (listItem in WHITELIST){
if(regexpr(as.character(listItem), s)[1] > -1){
 return(TRUE)
}
}
 return(FALSE)
}

Now, I have a data frame for which I want to apply the above function for
all row values for a given column:-

This works fine when I define a condition like:
data <- data[data$urlType != 1,]

However, I want to combine two logical conditions together like:
data <- data[data$urlType != 1 & checkBaseLine(data$url),]

This would check whether the column `urlType` contains row values that !=
1,
and the column `url` contains row values that gets evaluated using the
defined function.

Any ideas how this can be done?

Thanks in advance.

Regards,
Harsh Yadav



Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-07 Thread Greg Snow
> fortune(197)

If anything, there should be a Law: Thou Shalt Not Even Think Of Producing A
Graph That Looks Like Anything From A Spreadsheet.
   -- Ted Harding (in a discussion about producing graphics)
  R-help (August 2007)

Also read the discussion started with:
http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of RaoulD
> Sent: Sunday, July 04, 2010 9:44 PM
> To: r-help@r-project.org
> Subject: [R] Data Labels in a barchart (Lattice or otherwise)
> 
> 
> Hi,
> 
> Can anyone please help me with how I could add labels with the value
> for
> each bar in a barchart? (similar to how data labels can be added in
> Excel) I
> have done a lot of searching but havent been lucky.
> 
> Thanks,
> Raoul
> --
> View this message in context: http://r.789695.n4.nabble.com/Data-
> Labels-in-a-barchart-Lattice-or-otherwise-tp2278027p2278027.html
> Sent from the R help mailing list archive at Nabble.com.
> 



Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread David Winsemius


On Jul 5, 2010, at 1:14 PM, RaoulD wrote:

> Thank You David. Yes, I am using the lattice barchart and have managed to add
> data labels, however, they tend to be on the tip of each bar and are
> difficult to read as they are partially on the bar. Any help would be
> greatly appreciated.
>
> This is the code I am using:
>  levels(PR_SUMMARY$Bucket)=c("0-3 months","3-9 months","9-15 months","15-18
> months")
>  barchart(PrimaryReason ~ cInteractions| Bucket + Type, data = PR_SUMMARY,
> layout = c(4, 2),col="lightgreen",main="COMPARISON - PRIMARY REASON",
>    sub="L & R",xlab="Number of Customers",ylab="Primary Reasons",
>    auto.key = list(title = "COMPARISON - PRIMARY
> REASON",columns=2,points = FALSE, rectangles =  TRUE,space= "right"
> ),scales = list(x = list(abbreviate=TRUE,minlength=5,rot=45)),
>    panel = function(x,y,subscripts,groups,...){
>        panel.barchart(x,y,...)
>        ltext(x,y,label=round(PR_SUMMARY$cInteractions,1),
> cex=.99,rot=45)

# if you add or subtract a small amount from "y" in the prior line it
will move the labels up or down.

>        border="transparent"})
>
> I don't really understand the "ltext" part and found it with some other code,
> but it works.
>
> Thanks again,
> Raoul



David Winsemius, MD
West Hartford, CT
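
The label-offset idea above can be sketched with a single-panel chart. The
counts data frame is made up; the use of pos = 4 and the widened xlim are
illustrative choices, not something taken from the poster's code. Labelling
with the x values the panel function receives, rather than a whole column of
the original data frame, should also keep labels matched to their bars when
there are several panels.

```r
library(lattice)

# Made-up counts standing in for PR_SUMMARY.
counts <- data.frame(reason = factor(c("Billing", "Service", "Other")),
                     n = c(12, 7, 3))

p <- barchart(reason ~ n, data = counts, origin = 0,
              xlim = c(0, 15),   # leave room to the right of the longest bar
              panel = function(x, y, ...) {
                panel.barchart(x, y, ...)
                # pos = 4 draws each label to the right of its (x, y) point,
                # so the text starts just past the end of the bar.
                panel.text(x, y, labels = x, pos = 4)
              })
```

At the console, typing p draws the chart; inside a script or function it
needs an explicit print(p).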



Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread RaoulD

Thank You David. Yes, I am using the lattice barchart and have managed to add
data labels, however, they tend to be on the tip of each bar and are
difficult to read as they are partially on the bar. Any help would be
greatly appreciated.

This is the code I am using:
 levels(PR_SUMMARY$Bucket)=c("0-3 months","3-9 months","9-15 months","15-18
months")
 barchart(PrimaryReason ~ cInteractions| Bucket + Type, data = PR_SUMMARY,
layout = c(4, 2),col="lightgreen",main="COMPARISON - PRIMARY REASON",
   sub="L & R",xlab="Number of Customers",ylab="Primary Reasons",
   auto.key = list(title = "COMPARISON - PRIMARY
REASON",columns=2,points = FALSE, rectangles =  TRUE,space= "right"
),scales = list(x = list(abbreviate=TRUE,minlength=5,rot=45)),
   panel = function(x,y,subscripts,groups,...){
panel.barchart(x,y,...)
ltext(x,y,label=round(PR_SUMMARY$cInteractions,1),
cex=.99,rot=45)
border="transparent"}) 

I don't really understand the "ltext" part and found it with some other code,
but it works.

Thanks again,
Raoul
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Labels-in-a-barchart-Lattice-or-otherwise-tp2278027p2278646.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread David Winsemius


On Jul 4, 2010, at 11:43 PM, RaoulD wrote:

> Hi,
>
> Can anyone please help me with how I could add labels with the value for
> each bar in a barchart? (similar to how data labels can be added in Excel) I
> have done a lot of searching but haven't been lucky.


This is generally pretty easy with text(), at least if you are using
base graphics. If it is not clear after reading the help page, then
post an example with whatever barchart function you have chosen to
use. If it's the lattice barchart, there is an ltext example
immediately before the barchart example that can quickly be grafted
into the barchart code.


David Winsemius, MD
West Hartford, CT
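
The base-graphics route mentioned above can be sketched in a few lines. The
counts are made up, and drawing to a null PDF device is only a convenience so
that the sketch runs without opening a window or writing a file.

```r
# Made-up counts for illustration.
counts <- c(A = 3, B = 3, C = 2, D = 1)

pdf(NULL)  # null device: nothing is written to disk
# barplot() returns the x position of each bar's midpoint.
mids <- barplot(counts, ylim = c(0, max(counts) * 1.2))
# pos = 3 places each label just above its (x, y) point, i.e. above the bar.
text(x = mids, y = counts, labels = counts, pos = 3)
invisible(dev.off())
```

Extending ylim a little beyond the tallest bar keeps the top label inside
the plot region.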



[R] Data Labels in a barchart (Lattice or otherwise)

2010-07-05 Thread RaoulD

Hi,

Can anyone please help me with how I could add labels with the value for
each bar in a barchart? (similar to how data labels can be added in Excel) I
have done a lot of searching but haven't been lucky.

Thanks,
Raoul
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Labels-in-a-barchart-Lattice-or-otherwise-tp2278027p2278027.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] data frame row statistics (mean)?

2010-06-28 Thread Joshua Wiley
Hello Doug,

I just wanted to add that a faster way to initialize a vector is:

avg <- vector("numeric", nrow(d))

Also, you might prefer nrow(d) over length(d[ , 1]) if the number of rows
is what you are after.  Its sister function is ncol().

Best regards,

Josh


On Mon, Jun 28, 2010 at 11:37 AM, Douglas M. Hultstrand
 wrote:
> Hello,
>
> I am trying to calculate the mean value of each row in a data frame (d), I
> am having troubles and getting errors using the code I have written.  Below
> is a brief example of the code, any thought or suggestions would be great.
>
> Thank you for your time,
> Doug
>
>
> # Example Code:
> d <- data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7),
> st4=c(6,5,7,8))
> avg <- rep(NA,length(d[,1]))
>
> for (i in 1:length(d[,1])) {
>       avg[i] = mean(d[i,1:4], na.rm=TRUE)
> }
>
> # Final Output wanted.
>  st1 st2 st3 st4  avg
> 1   1   2   5   6 3.50
> 2   2   5   5   5 4.25
> 3   3   6  NA   7 5.33
> 4   4   7   7   8 6.50
>
> --
> -
> Douglas M. Hultstrand, MS
> Senior Hydrometeorologist
> Metstat, Inc. Windsor, Colorado
> voice: 720.771.5840
> email: dmhul...@metstat.com
> web: http://www.metstat.com
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] data frame row statistics (mean)?

2010-06-28 Thread Erik Iverson



Douglas M. Hultstrand wrote:

> Hello,
>
> I am trying to calculate the mean value of each row in a data frame (d),
> I am having troubles and getting errors using the code I have written.
> Below is a brief example of the code, any thought or suggestions would
> be great.
>
> Thank you for your time,
> Doug
>
> # Example Code:
> d <- data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7),
> st4=c(6,5,7,8))
>
> avg <- rep(NA,length(d[,1]))
>
> for (i in 1:length(d[,1])) {
>    avg[i] = mean(d[i,1:4], na.rm=TRUE)
> }
>
> # Final Output wanted.
>  st1 st2 st3 st4  avg
> 1   1   2   5   6 3.50
> 2   2   5   5   5 4.25
> 3   3   6  NA   7 5.33
> 4   4   7   7   8 6.50



d$avg <- rowMeans(d, na.rm = TRUE)



Re: [R] data frame row statistics (mean)?

2010-06-28 Thread Phil Spector

Doug -
   Try


d$avg = apply(d,1,mean,na.rm=TRUE)
d

  st1 st2 st3 st4  avg
1   1   2   5   6 3.50
2   2   5   5   5 4.25
3   3   6  NA   7 5.33
4   4   7   7   8 6.50

(If you must use a loop, calculate

mean(as.numeric(d[i,1:4]), na.rm=TRUE)

inside it. Take a look at mean(d[1,1:4]) to see why your
program doesn't work properly.)
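
A minimal sketch of the point above, using the data from the original post: d[i, 1:4] is itself a one-row data frame, not a numeric vector, so mean() does not treat it as a set of numbers (what exactly happens depends on the R version). Converting the row to a numeric vector first makes the loop agree with rowMeans().

```r
d <- data.frame(st1 = c(1, 2, 3, 4), st2 = c(2, 5, 6, 7),
                st3 = c(5, 5, NA, 7), st4 = c(6, 5, 7, 8))

# A single row selected with [i, ] is still a data frame
row_is_df <- is.data.frame(d[1, 1:4])

# Loop version: convert each row to a plain numeric vector first
avg <- numeric(nrow(d))
for (i in seq_len(nrow(d))) {
  avg[i] <- mean(as.numeric(d[i, 1:4]), na.rm = TRUE)
}

# Vectorised equivalent suggested elsewhere in this thread
avg_vec <- rowMeans(d[, 1:4], na.rm = TRUE)
```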



- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Mon, 28 Jun 2010, Douglas M. Hultstrand wrote:


Hello,

I am trying to calculate the mean value of each row in a data frame (d), I am 
having troubles and getting errors using the code I have written.  Below is a 
brief example of the code, any thought or suggestions would be great.


Thank you for your time,
Doug


# Example Code:
d <- data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7), 
st4=c(6,5,7,8))

avg <- rep(NA,length(d[,1]))

for (i in 1:length(d[,1])) {
  avg[i] = mean(d[i,1:4], na.rm=TRUE)
}

# Final Output wanted.
st1 st2 st3 st4  avg
1   1   2   5   6 3.50
2   2   5   5   5 4.25
3   3   6  NA   7 5.33
4   4   7   7   8 6.50

--
-
Douglas M. Hultstrand, MS
Senior Hydrometeorologist
Metstat, Inc. Windsor, Colorado
voice: 720.771.5840
email: dmhul...@metstat.com
web: http://www.metstat.com






[R] data frame row statistics (mean)?

2010-06-28 Thread Douglas M. Hultstrand

Hello,

I am trying to calculate the mean value of each row in a data frame (d),
but I am having trouble and getting errors with the code I have written.
Below is a brief example of the code; any thoughts or suggestions would
be great.


Thank you for your time,
Doug


# Example Code:
d <- data.frame(st1=c(1,2,3,4), st2=c(2,5,6,7), st3=c(5,5,NA,7), 
st4=c(6,5,7,8))

avg <- rep(NA,length(d[,1]))

for (i in 1:length(d[,1])) {
   avg[i] = mean(d[i,1:4], na.rm=TRUE)
}

# Final Output wanted.
 st1 st2 st3 st4  avg
1   1   2   5   6 3.50
2   2   5   5   5 4.25
3   3   6  NA   7 5.33
4   4   7   7   8 6.50

--
-
Douglas M. Hultstrand, MS
Senior Hydrometeorologist
Metstat, Inc. Windsor, Colorado
voice: 720.771.5840
email: dmhul...@metstat.com
web: http://www.metstat.com



Re: [R] data frame

2010-06-17 Thread Sarah Goslee
You've posted this repeatedly, and yet received no answer. Perhaps
that is because you haven't read the posting guide! You didn't
provide a reproducible example, you didn't tell us where ddply came
from, you didn't tell us what was wrong with the code you suggested.
It would also be rather easier to help you if you put your statement
of requirements in R syntax form. In your example the results column
suddenly switched from R_pivot to series - which do you want?

In the meantime, what's wrong with simply:
x[x$YEAR == 2006, "R_pivot"] <- x[x$YEAR == 2007, "R_pivot"] -
x[x$YEAR == 2007, "Delta"]
x[x$YEAR == 2005, "R_pivot"] <- x[x$YEAR == 2006, "R_pivot"] -
x[x$YEAR == 2006, "Delta"]

It works on your sample dataframe, and would work on a larger frame sorted
in the same way.
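
For more than three years, or for frames that are not already sorted, the same backward recursion can be applied per group. The following is only a sketch in base R (it does not use ddply): the helper name back_fill is made up for illustration, and the data frame x is rebuilt from the sample in the original post. Within each CLUSTER/variable group it starts from the last R_pivot and subtracts the cumulative Delta values going backwards in time.

```r
# Sample data rebuilt from the original post
x <- data.frame(
  CLUSTER  = rep(c("M1", "M2", "M1", "M2"), each = 3),
  YEAR     = rep(2005:2007, times = 4),
  variable = rep(c("EC01", "EC02"), each = 6),
  Delta    = c(NA, 2, 4, NA, 5, 8, NA, 3, 1, NA, 9, 6),
  R_pivot  = c(NA, NA, 5, NA, NA, 7, NA, NA, 8, NA, NA, 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: series[t] = series[t + 1] - Delta[t + 1],
# anchored at R_pivot in the last year of the group
back_fill <- function(g) {
  g <- g[order(g$YEAR), ]
  n <- nrow(g)
  g$series <- g$R_pivot[n] - rev(cumsum(c(0, rev(g$Delta[-1]))))
  g
}

groups <- split(x, list(x$CLUSTER, x$variable), drop = TRUE)
res <- do.call(rbind, lapply(groups, back_fill))
rownames(res) <- NULL
res[, c("CLUSTER", "YEAR", "variable", "series")]
```

The first Delta of each group is never used, which is why its NA does not propagate into the result.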


> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Sarah

On Thu, Jun 17, 2010 at 3:04 AM, n.via...@libero.it  wrote:
>
> Dear list,
> I have the following problem. I have a data frame like this
>
>
>
> CLUSTER        YEAR      variable        Delta         R_pivot
>
> M1                     2005         EC01           NA              NA
>
> M1                     2006         EC01            2                NA
>
> M1                     2007         EC01            4                5
>
> M2                     2005          EC01          NA              NA
>
> M2                     2006          EC01           5                NA
>
> M2                     2007          EC01           8                 7
>
> M1                     2005          EC02           NA               NA
>
> M1                     2006          EC02            3                 NA
>
> M1                    2007            EC02           1                 8
>
> M2                    2005            EC02           NA               NA
>
> M2                    2006            EC02           9                 NA
>
> M2                    2007            EC02            6                 10
>
>
>
>
> I'm trying to build the time series of the variables by applying the 
> following formulas in a recursive way(by starting from the value of R_pivot 
> at time 2007)
>
> R_EC01(2006)=R_EC01(2007)-Delta_EC01(2007)
>
> R_EC01(2005)=R_EC01(2006)-Delta_EC01(2006)
> In terms of number I would have:
> R_EC01(2006)=5-4=1
> R_eco1(2005)=1-2=-1
>
> And the same should be done for variable EC02. In addition, this calculations 
> should be down grouping by variable e cluster..so the result should be
> CLUSTER        YEAR      variable        series
>
> M1                     2005         EC01             -1
>
>
> M1                     2006         EC01             1
>
>
> M1                     2007         EC01              5
>
>
> M2                     2005          EC01             -6
>
>
>
> M2                     2006          EC01             -1
>
> M2                     2007          EC01              7
>
>
>
> M1                     2005          EC02              4
>
> M1                     2006          EC02               7
>
>
> M1                    2007            EC02              8
>
>
> M2                    2005            EC02             -5
>
>
> M2                    2006            EC02            4
>
>
> M2                    2007            EC02             10
> I applied the following formula which gives me a partial good result but not 
> at all:
> series=ddply(x,.(variable,CLUSTER),transform,series=rev(c(R_pivot,rev(R_pivot-cumsum(rev(Delta[-1]))
> Thanks for your attention!!!
>
>



-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] data frame

2010-06-17 Thread n.via...@libero.it

Dear list,
I have the following problem. I have a data frame like this



CLUSTER  YEAR  variable  Delta  R_pivot
M1       2005  EC01      NA     NA
M1       2006  EC01      2      NA
M1       2007  EC01      4      5
M2       2005  EC01      NA     NA
M2       2006  EC01      5      NA
M2       2007  EC01      8      7
M1       2005  EC02      NA     NA
M1       2006  EC02      3      NA
M1       2007  EC02      1      8
M2       2005  EC02      NA     NA
M2       2006  EC02      9      NA
M2       2007  EC02      6      10




I'm trying to build the time series of the variables by applying the following
formulas recursively (starting from the value of R_pivot at time 2007):

R_EC01(2006)=R_EC01(2007)-Delta_EC01(2007)

R_EC01(2005)=R_EC01(2006)-Delta_EC01(2006)

In terms of numbers I would have:
R_EC01(2006)=5-4=1
R_EC01(2005)=1-2=-1

The same should be done for variable EC02. In addition, these calculations
should be done grouping by variable and cluster, so the result should be:

CLUSTER  YEAR  variable  series
M1       2005  EC01      -1
M1       2006  EC01      1
M1       2007  EC01      5
M2       2005  EC01      -6
M2       2006  EC01      -1
M2       2007  EC01      7
M1       2005  EC02      4
M1       2006  EC02      7
M1       2007  EC02      8
M2       2005  EC02      -5
M2       2006  EC02      4
M2       2007  EC02      10

I applied the following formula, which gives me a partially correct result but
not a fully correct one:
series=ddply(x,.(variable,CLUSTER),transform,series=rev(c(R_pivot,rev(R_pivot-cumsum(rev(Delta[-1]))
Thanks for your attention!!!

















Re: [R] data-management: Rowwise NA

2010-06-03 Thread moleps
-Any- was my fix... Appreciate it.

//M


On 3. juni 2010, at 21.33, Phil Spector wrote:

> ?any
> 
> Not really a reproducible answer, but I think you're looking
> for
> 
> apply(tes[,sam],1,function(x)any(is.na(x)))
> 
> 
>   - Phil Spector
>Statistical Computing Facility
>Department of Statistics
>UC Berkeley
>spec...@stat.berkeley.edu
> 
> 
> On Thu, 3 Jun 2010, moleps wrote:
> 
>> Dear R´ers..
>> 
>> In this mock dataset how can I generate a logical variable based on whether 
>> just tes or tes3 are NA in each row??
>> 
>> test<-sample(c("A",NA,"B"),100,replace=T)
>> test2<-sample(c("A",NA,"B"),100,replace=T)
>> test3<-sample(c("A",NA,"B"),100,replace=T)
>> 
>> tes<-cbind(test,test2,test3)
>> 
>> sam<-c("test","test3")
>> apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
>> 
>> However this just tests whether each variable is missing or not per row. I´d 
>> like an -or- function in here that would provide one true/false per row 
>> based on whether test or tes3 are NA. I guess it would be easy to do it by 
>> subsetting in the example but I figure there is a more elegant way of doing 
>> it when -sam- contains 50 variables...
>> 
>> //M
>> 
>> 
>> 
>>  [[alternative HTML version deleted]]
>> 
>> 



Re: [R] data-management: Rowwise NA

2010-06-03 Thread David S Freedman

you probably want to use the apply function:

d=sample(1000,500); 
d[sample(500,50)]<-NA; #put 50 NAs into the data
d=data.frame(matrix(d,ncol=50)); 
names(d)=paste('var',1:50,sep='.')
d
apply(d,1,sum) #a row sum is NA if any value in that row is NA
apply(d,2,function(x)sum(is.na(x))) #how many values for each of the 50 variables are NA ?
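
As a hedged aside, the same per-row and per-column NA checks can also be written without apply(), using rowSums() and colSums() on the logical matrix returned by is.na(). The sketch below rebuilds the example data with a fixed seed (the seed value is an arbitrary choice) so that both forms can be compared.

```r
set.seed(42)                          # arbitrary seed, only for reproducibility
d <- sample(1000, 500)
d[sample(500, 50)] <- NA              # put 50 NAs into the data
d <- data.frame(matrix(d, ncol = 50))
names(d) <- paste("var", 1:50, sep = ".")

# Per row: is any value NA?
row_has_na_apply <- is.na(apply(d, 1, sum))
row_has_na_fast  <- rowSums(is.na(d)) > 0

# Per column: how many values are NA?
col_na_apply <- apply(d, 2, function(x) sum(is.na(x)))
col_na_fast  <- colSums(is.na(d))
```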

David Freedman, CDC


Re: [R] data-management: Rowwise NA

2010-06-03 Thread Marc Schwartz
On Jun 3, 2010, at 2:20 PM, moleps wrote:

> Dear R´ers..
> 
> In this mock dataset how can I generate a logical variable based on whether 
> just tes or tes3 are NA in each row?? 
> 
> test<-sample(c("A",NA,"B"),100,replace=T)
> test2<-sample(c("A",NA,"B"),100,replace=T)
> test3<-sample(c("A",NA,"B"),100,replace=T)
> 
> tes<-cbind(test,test2,test3)
> 
> sam<-c("test","test3")
> apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
> 
> However this just tests whether each variable is missing or not per row. I´d 
> like an -or- function in here that would provide one true/false per row based 
> on whether test or tes3 are NA. I guess it would be easy to do it by 
> subsetting in the example but I figure there is a more elegant way of doing 
> it when -sam- contains 50 variables...


How about this:

set.seed(1)
test <- sample(c("A", NA, "B"), 100, replace = TRUE)
test2 <- sample(c("A", NA, "B"), 100, replace = TRUE)
test3 <- sample(c("A", NA, "B"), 100, replace = TRUE)

tes <- cbind(test, test2, test3)

> str(tes)
 chr [1:100, 1:3] "A" NA NA "B" "A" "B" "B" NA NA ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:3] "test" "test2" "test3"

> head(tes)
     test test2 test3
[1,] "A"  NA    "A"  
[2,] NA   NA    "A"  
[3,] NA   "A"   NA   
[4,] "B"  "B"   "A"  
[5,] "A"  NA    "A"  
[6,] "B"  "A"   NA   


sam <- c("test","test3")

> rowSums(is.na(subset(tes, select = sam))) > 0
  [1] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
 [12] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE
 [23]  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 [34]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
 [45]  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
 [56] FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
 [67]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE
 [78]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
 [89] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE
[100]  TRUE


This avoids the looping involved in calling apply().
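
A quick way to check that the rowSums() form gives the same answer as the apply()/any() form suggested elsewhere in this thread is to run both on the same data. This is only a sketch; the variable names by_apply and by_rowsums are made up for the comparison.

```r
set.seed(1)
tes <- cbind(test  = sample(c("A", NA, "B"), 100, replace = TRUE),
             test2 = sample(c("A", NA, "B"), 100, replace = TRUE),
             test3 = sample(c("A", NA, "B"), 100, replace = TRUE))
sam <- c("test", "test3")

# Row-wise "or" via apply() and any()
by_apply <- apply(tes[, sam, drop = FALSE], 1, function(x) any(is.na(x)))

# Same result from the vectorised rowSums() form
by_rowsums <- rowSums(is.na(tes[, sam, drop = FALSE])) > 0

identical(unname(by_apply), unname(by_rowsums))
```

Both forms scale to any number of columns named in sam; drop = FALSE keeps the subset a matrix even if sam holds a single name.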

HTH,

Marc Schwartz



Re: [R] data-management: Rowwise NA

2010-06-03 Thread Jorge Ivan Velez
Hi there,

One option would be

apply(tes, 1, function(.row) any(is.na(.row[c(1,3)])))

See ?any, ?is.na and ?apply for more information.

HTH,
Jorge


On Thu, Jun 3, 2010 at 3:20 PM, moleps wrote:

> Dear R´ers..
>
> In this mock dataset how can I generate a logical variable based on whether
> just tes or tes3 are NA in each row??
>
> test<-sample(c("A",NA,"B"),100,replace=T)
> test2<-sample(c("A",NA,"B"),100,replace=T)
> test3<-sample(c("A",NA,"B"),100,replace=T)
>
> tes<-cbind(test,test2,test3)
>
> sam<-c("test","test3")
> apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
>
> However this just tests whether each variable is missing or not per row.
> I´d like an -or- function in here that would provide one true/false per row
> based on whether test or tes3 are NA. I guess it would be easy to do it by
> subsetting in the example but I figure there is a more elegant way of doing
> it when -sam- contains 50 variables...
>
> //M
>
>
>
>[[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Re: [R] data-management: Rowwise NA

2010-06-03 Thread Phil Spector

?any

Not really a reproducible answer, but I think you're looking
for

apply(tes[,sam],1,function(x)any(is.na(x)))


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Thu, 3 Jun 2010, moleps wrote:


Dear R´ers..

In this mock dataset how can I generate a logical variable based on whether 
just tes or tes3 are NA in each row??

test<-sample(c("A",NA,"B"),100,replace=T)
test2<-sample(c("A",NA,"B"),100,replace=T)
test3<-sample(c("A",NA,"B"),100,replace=T)

tes<-cbind(test,test2,test3)

sam<-c("test","test3")
apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))

However this just tests whether each variable is missing or not per row. I´d 
like an -or- function in here that would provide one true/false per row based 
on whether test or tes3 are NA. I guess it would be easy to do it by subsetting 
in the example but I figure there is a more elegant way of doing it when -sam- 
contains 50 variables...

//M



[[alternative HTML version deleted]]






[R] data-management: Rowwise NA

2010-06-03 Thread moleps
Dear R´ers..

In this mock dataset, how can I generate a logical variable based on whether 
test or test3 (ignoring test2) is NA in each row?

test<-sample(c("A",NA,"B"),100,replace=T)
test2<-sample(c("A",NA,"B"),100,replace=T)
test3<-sample(c("A",NA,"B"),100,replace=T)

tes<-cbind(test,test2,test3)

sam<-c("test","test3")
apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))

However, this just tests whether each variable is missing or not per row. I´d 
like an -or- function in here that would provide one true/false per row based 
on whether test or test3 is NA. I guess it would be easy to do it by subsetting 
in the example, but I figure there is a more elegant way of doing it when -sam- 
contains 50 variables...

//M





Re: [R] data frame manipulation with zero rows

2010-06-02 Thread arnaud Gaboury
I do really think it is a very good idea.
TY





> -Original Message-
> From: h.wick...@gmail.com [mailto:h.wick...@gmail.com] On Behalf Of
> Hadley Wickham
> Sent: Wednesday, June 02, 2010 3:31 PM
> To: arnaud Gaboury
> Cc: Peter Ehlers; r-help@r-project.org; Prof Brian Ripley
> Subject: Re: [R] data frame manipulation with zero rows
> 
> Hi Arnaud,
> 
> I've added this case to the set of test cases in plyr and it will be
> fixed in the next version.
> 
> Hadley
> 
> On Tue, Jun 1, 2010 at 2:33 PM, arnaud Gaboury
>  wrote:
> > Maybe not the cleanest way, but I create a fake data frame with one
> row so
> > ddply() is happy!!
> >> if (nrow(futures)==0) futures<-data.frame(...)
> >
> >
> >
> >
> >
> >> -Original Message-
> >> From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
> >> Sent: Tuesday, June 01, 2010 12:07 PM
> >> To: arnaud Gaboury
> >> Cc: 'Prof Brian Ripley'; r-help@r-project.org
> >> Subject: Re: [R] data frame manipulation with zero rows
> >>
> >> On 2010-06-01 1:53, arnaud Gaboury wrote:
> >> > Brian,
> >> >
> >> > If I do understand correctly, I must use in my function something
> >> else than
> >> > ddply() if I want to avoid any error each time my df has zero
> rows?
> >> > Am I correct?
> >> >
> >>
> >> You could define a function to handle the zero-rows case:
> >>
> >> f <- function(x){
> >>   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
> >>   else
> >>     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
> >>                      POSITION=sum(QUANTITY))[,c(1,3,2)]
> >>   out
> >> }
> >> f(futures)
> >>
> >>   -Peter Ehlers
> >>
> >> >
> >> >
> >> >> -Original Message-
> >> >> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> >> >> Sent: Tuesday, June 01, 2010 9:47 AM
> >> >> To: arnaud Gaboury
> >> >> Subject: Re: [R] data frame manipulation with zero rows
> >> >>
> >> >> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> >> >>
> >> >>> Dear group,
> >> >>>
> >> >>> Here is the kind of data.frame I obtain every day with my
> function
> >> :
> >> >>>
> >> >>> futures<-
> >> >>> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> >> >>> "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE
> Aug/10",
> >> >>> "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11
> Jul/10",
> >> >>> "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> >> >>> ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> >> >>> 18407, 18408, 18406, 18407, 18407, 18407, 18407), class =
> "Date"),
> >> >>>     QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT
> =
> >> >>> c("373.2500",
> >> >>>     "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >> >>>     "90.7750", "14.9200", "14.9200", "14.9200", "14.9200",
> >> "14.9200"
> >> >>>     )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> >> >>> "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >> >>>
> >> >>> I need then to apply to the df this following code line :
> >> >>>
> >> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >> >> POSITION=
> >> >>> sum(QUANTITY))[,c(1,3,2)]
> >> >>>
> >> >>> It works perfectly in most of case, BUT I have a new problem: it
> >> can
> >> >>> sometime occurs that my df "futures" is empty, with zero rows.
> >> >>>
> >> >>>
> >> >>> futures<-
> >> >>> structure(list(DESCRIPTION = character(0), CREATED.DATE =
> >> >>> structure(numeric(0), class = "Date"),
> >> >>>

Re: [R] data frame manipulation with zero rows

2010-06-02 Thread Hadley Wickham
Hi Arnaud,

I've added this case to the set of test cases in plyr and it will be
fixed in the next version.

Hadley

On Tue, Jun 1, 2010 at 2:33 PM, arnaud Gaboury  wrote:
> Maybe not the cleanest way, but I create a fake data frame with one row so
> ddply() is happy!!
>> if (nrow(futures)==0) futures<-data.frame(...)
>
>
>
>
>
>> -Original Message-
>> From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
>> Sent: Tuesday, June 01, 2010 12:07 PM
>> To: arnaud Gaboury
>> Cc: 'Prof Brian Ripley'; r-help@r-project.org
>> Subject: Re: [R] data frame manipulation with zero rows
>>
>> On 2010-06-01 1:53, arnaud Gaboury wrote:
>> > Brian,
>> >
>> > If I do understand correctly, I must use in my function something
>> else than
>> > ddply() if I want to avoid any error each time my df has zero rows?
>> > Am I correct?
>> >
>>
>> You could define a function to handle the zero-rows case:
>>
>> f <- function(x){
>>   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
>>   else
>>     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
>>                      POSITION=sum(QUANTITY))[,c(1,3,2)]
>>   out
>> }
>> f(futures)
>>
>>   -Peter Ehlers
>>
>> >
>> >
>> >> -Original Message-
>> >> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
>> >> Sent: Tuesday, June 01, 2010 9:47 AM
>> >> To: arnaud Gaboury
>> >> Subject: Re: [R] data frame manipulation with zero rows
>> >>
>> >> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
>> >>
>> >>> Dear group,
>> >>>
>> >>> Here is the kind of data.frame I obtain every day with my function
>> :
>> >>>
>> >>> futures<-
>> >>> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
>> >>> "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
>> >>> "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
>> >>> "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
>> >>> ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
>> >>> 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
>> >>>     QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
>> >>> c("373.2500",
>> >>>     "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
>> >>>     "90.7750", "14.9200", "14.9200", "14.9200", "14.9200",
>> "14.9200"
>> >>>     )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
>> >>> "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
>> >>>
>> >>> I need then to apply to the df this following code line :
>> >>>
>> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
>> >> POSITION=
>> >>> sum(QUANTITY))[,c(1,3,2)]
>> >>>
>> >>> It works perfectly in most of case, BUT I have a new problem: it
>> can
>> >>> sometime occurs that my df "futures" is empty, with zero rows.
>> >>>
>> >>>
>> >>> futures<-
>> >>> structure(list(DESCRIPTION = character(0), CREATED.DATE =
>> >>> structure(numeric(0), class = "Date"),
>> >>>     QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
>> >>> c("DESCRIPTION",
>> >>> "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
>> >> class =
>> >>> "data.frame")
>> >>>
>> >>> It is not the usual case, but it can happen. With this df, when I
>> >> pass the
>> >>> above mentione line, I get an error :
>> >>>
>> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
>> >> POSITION=
>> >>> sum(QUANTITY))[,c(1,3,2)]
>> >>> Error in tapply(1:nrow(data), splitv, list) :
>> >>>   arguments must have same length
>> >>>

Re: [R] data frame manipulation with zero rows

2010-06-01 Thread arnaud Gaboury
Maybe not the cleanest way, but I create a fake data frame with one row so
ddply() is happy!!
> if (nrow(futures)==0) futures<-data.frame(...)





> -Original Message-
> From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
> Sent: Tuesday, June 01, 2010 12:07 PM
> To: arnaud Gaboury
> Cc: 'Prof Brian Ripley'; r-help@r-project.org
> Subject: Re: [R] data frame manipulation with zero rows
> 
> On 2010-06-01 1:53, arnaud Gaboury wrote:
> > Brian,
> >
> > If I do understand correctly, I must use in my function something
> else than
> > ddply() if I want to avoid any error each time my df has zero rows?
> > Am I correct?
> >
> 
> You could define a function to handle the zero-rows case:
> 
> f <- function(x){
>   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
>   else
> out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
>  POSITION=sum(QUANTITY))[,c(1,3,2)]
>   out
> }
> f(futures)
> 
>   -Peter Ehlers
> 
> >
> >
> >> -Original Message-
> >> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> >> Sent: Tuesday, June 01, 2010 9:47 AM
> >> To: arnaud Gaboury
> >> Subject: Re: [R] data frame manipulation with zero rows
> >>
> >> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> >>
> >>> Dear group,
> >>>
> >>> Here is the kind of data.frame I obtain every day with my function
> :
> >>>
> >>> futures<-
> >>> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> >>> "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> >>> "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> >>> "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> >>> ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> >>> 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >>> QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
> >>> c("373.2500",
> >>> "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >>> "90.7750", "14.9200", "14.9200", "14.9200", "14.9200",
> "14.9200"
> >>> )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> >>> "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >>>
> >>> I need then to apply to the df this following code line :
> >>>
> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >> POSITION=
> >>> sum(QUANTITY))[,c(1,3,2)]
> >>>
> >>> It works perfectly in most of case, BUT I have a new problem: it
> can
> >>> sometime occurs that my df "futures" is empty, with zero rows.
> >>>
> >>>
> >>> futures<-
> >>> structure(list(DESCRIPTION = character(0), CREATED.DATE =
> >>> structure(numeric(0), class = "Date"),
> >>> QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
> >>> c("DESCRIPTION",
> >>> "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
> >> class =
> >>> "data.frame")
> >>>
> >>> It is not the usual case, but it can happen. With this df, when I
> >> pass the
> >>> above mentione line, I get an error :
> >>>
> >>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >> POSITION=
> >>> sum(QUANTITY))[,c(1,3,2)]
> >>> Error in tapply(1:nrow(data), splitv, list) :
> >>>   arguments must have same length
> >>>
> >>>
> >>> How can I avoid this when my df is empty?
> >>
> >> Ask the author of the (missing) function ddply() to correct the
> error
> >> of using 1:nrow(data) by replacing it by seq_len(nrow(data)).
> >>
> >> It's helpful to give example code, but much more helpful if you test
> >> it: yours cannot work without the function ddply() -- this is what
> >> 'self-contained' means in the footer here.
> >>
> >>
> >>>
> >>> Any help is appreciated
> >>>
> >>> __
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-
> >> guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >>
> >> --
> >> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> >> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> >> University of Oxford,             Tel:  +44 1865 272861 (self)
> >> 1 South Parks Road,                     +44 1865 272866 (PA)
> >> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >



Re: [R] data frame manipulation with zero rows

2010-06-01 Thread arnaud Gaboury
It is indeed ddply() from package plyr.





> -Original Message-
> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> Sent: Tuesday, June 01, 2010 12:24 PM
> To: Peter Ehlers
> Cc: arnaud Gaboury; r-help@r-project.org
> Subject: Re: [R] data frame manipulation with zero rows
> 
> On Tue, 1 Jun 2010, Peter Ehlers wrote:
> 
> > On 2010-06-01 1:53, arnaud Gaboury wrote:
> >> Brian,
> >>
> >> If I do understand correctly, I must use in my function something
> else than
> >> ddply() if I want to avoid any error each time my df has zero rows?
> >> Am I correct?
> >>
> >
> > You could define a function to handle the zero-rows case:
> >
> > f <- function(x){
> > if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
> > else
> >   out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
> >POSITION=sum(QUANTITY))[,c(1,3,2)]
> > out
> > }
> > f(futures)
> 
> Or simply fix ddply.  We don't know what that is or what it should do
> for the case of zero rows: it may or may not be the one in package
> plyr.
> 
> >
> > -Peter Ehlers
> >
> >>
> >>
> >>> -Original Message-
> >>> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> >>> Sent: Tuesday, June 01, 2010 9:47 AM
> >>> To: arnaud Gaboury
> >>> Subject: Re: [R] data frame manipulation with zero rows
> >>>
> >>> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> >>>
> >>>> Dear group,
> >>>>
> >>>> Here is the kind of data.frame I obtain every day with my function
> :
> >>>>
> >>>> futures<-
> >>>> structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> >>>> "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> >>>> "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> >>>> "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> >>>> ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> >>>> 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >>>> QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
> >>>> c("373.2500",
> >>>> "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >>>> "90.7750", "14.9200", "14.9200", "14.9200", "14.9200",
> "14.9200"
> >>>> )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> >>>> "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >>>>
> >>>> I need then to apply to the df this following code line :
> >>>>
> >>>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >>> POSITION=
> >>>> sum(QUANTITY))[,c(1,3,2)]
> >>>>
> >>>> It works perfectly in most of case, BUT I have a new problem: it
> can
> >>>> sometime occurs that my df "futures" is empty, with zero rows.
> >>>>
> >>>>
> >>>> futures<-
> >>>> structure(list(DESCRIPTION = character(0), CREATED.DATE =
> >>>> structure(numeric(0), class = "Date"),
> >>>> QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
> >>>> c("DESCRIPTION",
> >>>> "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
> >>> class =
> >>>> "data.frame")
> >>>>
> >>>> It is not the usual case, but it can happen. With this df, when I
> >>> pass the
> >>>> above mentione line, I get an error :
> >>>>
> >>>>> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >>> POSITION=
> >>>> sum(QUANTITY))[,c(1,3,2)]
> >>>> Error in tapply(1:nrow(data), splitv, list) :
> >>>>   arguments must have same length
> >>>>
> >>>>
> >>>> How can I avoid this when my df is empty?
> >>>
> >>> Ask the author of the (missing) function ddply() to correct the
> error
> >>> of using 1:nrow(data) by replacing it by seq_len(nrow(data)).
> >>>
> >>> It's helpful to give example code, but much more helpful if you
> test
> >>> it: yours cannot work without the function ddply() -- this is what
> >>> 'self-contained' means in the footer here.
> 
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] data frame manipulation with zero rows

2010-06-01 Thread Prof Brian Ripley

On Tue, 1 Jun 2010, Peter Ehlers wrote:


On 2010-06-01 1:53, arnaud Gaboury wrote:

Brian,

If I do understand correctly, I must use in my function something else than
ddply() if I want to avoid any error each time my df has zero rows?
Am I correct?



You could define a function to handle the zero-rows case:

f <- function(x){
if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
else
  out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
   POSITION=sum(QUANTITY))[,c(1,3,2)]
out
}
f(futures)


Or simply fix ddply.  We don't know what that is or what it should do 
for the case of zero rows: it may or may not be the one in package 
plyr.




-Peter Ehlers





-Original Message-
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: Tuesday, June 01, 2010 9:47 AM
To: arnaud Gaboury
Subject: Re: [R] data frame manipulation with zero rows

On Tue, 1 Jun 2010, arnaud Gaboury wrote:


Dear group,

Here is the kind of data.frame I obtain every day with my function :

futures<-
structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
"CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
"SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
c("373.2500",
"373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
"90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
"SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")

I then need to apply the following line of code to the df:

PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
             POSITION=sum(QUANTITY))[,c(1,3,2)]

It works perfectly in most cases, BUT I have a new problem: it can
sometimes occur that my df "futures" is empty, with zero rows.


futures<-
structure(list(DESCRIPTION = character(0), CREATED.DATE =
structure(numeric(0), class = "Date"),
QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
c("DESCRIPTION",
"CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
class = "data.frame")

It is not the usual case, but it can happen. With this df, when I pass the
above-mentioned line, I get an error:

PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
             POSITION=sum(QUANTITY))[,c(1,3,2)]
Error in tapply(1:nrow(data), splitv, list) :
  arguments must have same length


How can I avoid this when my df is empty?


Ask the author of the (missing) function ddply() to correct the error
of using 1:nrow(data) by replacing it by seq_len(nrow(data)).

It's helpful to give example code, but much more helpful if you test
it: yours cannot work without the function ddply() -- this is what
'self-contained' means in the footer here.
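
To illustrate the point about 1:nrow(data), here is a minimal sketch in
base R. It is not the code of ddply() itself; the data frame `empty` and
the variable names are made up for the illustration. It only shows why the
colon operator misbehaves for zero rows and why tapply() then complains.

```r
# A zero-row data frame standing in for the empty 'futures' (illustrative only)
empty <- data.frame(DESCRIPTION = character(0), QUANTITY = numeric(0))

# 1:nrow(empty) is 1:0, which counts DOWN and yields c(1, 0): two elements
bad_index <- 1:nrow(empty)

# seq_len(nrow(empty)) yields integer(0): no elements, as intended
good_index <- seq_len(nrow(empty))

# tapply() insists that X and the grouping vector have the same length,
# so the two-element index clashes with the zero-length grouping column
res <- try(tapply(bad_index, list(empty$DESCRIPTION), list), silent = TRUE)
failed <- inherits(res, "try-error")
```

Here `failed` should be TRUE, reproducing the "arguments must have same
length" error with the two-element index.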


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595



Re: [R] data frame manipulation ddply

2010-06-01 Thread arnaud Gaboury
Patrick,

When applied to the following df:

futures <-
structure(list(DESCRIPTION = character(0), CREATED.DATE =
structure(numeric(0), class = "Date"), 
QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
c("DESCRIPTION", 
"CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0), class =
"data.frame")


> PosFut <- aggregate(futures$QUANTITY, list(DESCRIPTION =
futures$DESCRIPTION,SETTLEMENT=futures$SETTLEMENT),sum)[,c(1,3,2)]
Error in aggregate.data.frame(as.data.frame(x), ...) : 
  no rows to aggregate
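
One possible way around this, sketched under assumptions: check for zero
rows before calling aggregate() and return an empty data frame with the
expected columns. The function name `pos_fut` and the small `sample_df` are
made up for the illustration; only aggregate() and the column names come
from the thread.

```r
# Sketch of a zero-row-safe replacement; 'pos_fut' is a made-up name
pos_fut <- function(x) {
  if (nrow(x) == 0L) {
    # Return an empty result with the same columns as the non-empty case
    return(data.frame(DESCRIPTION = character(0),
                      POSITION    = numeric(0),
                      SETTLEMENT  = character(0),
                      stringsAsFactors = FALSE))
  }
  out <- aggregate(x["QUANTITY"],
                   by  = list(DESCRIPTION = x$DESCRIPTION,
                              SETTLEMENT  = x$SETTLEMENT),
                   FUN = sum)
  names(out)[names(out) == "QUANTITY"] <- "POSITION"
  out[, c("DESCRIPTION", "POSITION", "SETTLEMENT")]
}

# Small made-up sample in the shape of 'futures'
sample_df <- data.frame(
  DESCRIPTION = c("CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10"),
  QUANTITY    = c(1, 1, 3),
  SETTLEMENT  = c("373.2500", "373.2500", "90.7750"),
  stringsAsFactors = FALSE)

full  <- pos_fut(sample_df)        # two summarised rows
empty <- pos_fut(sample_df[0, ])   # zero rows, same column names
```

Whether an empty result is the right answer for the zero-row case depends on
what the calling function expects downstream.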



> -Original Message-
> From: Patrick Hausmann [mailto:patrick.hausm...@uni-bremen.de]
> Sent: Tuesday, June 01, 2010 11:38 AM
> To: arnaud Gaboury
> Subject: Re: [R] data frame manipulation ddply
> 
> Hi Arnaud,
> 
> maybe "aggregate" can help:
> 
> PosFut <- aggregate(futures$QUANTITY, list(DESCRIPTION =
> futures$DESCRIPTION,
>   SETTLEMENT  = futures$SETTLEMENT),
> sum)[, c(1,3,2)]
> 
> HTH,
> Patrick
> 
> Am 01.06.2010 11:02, schrieb arnaud Gaboury:
> > Dear group,
> >
> > Here is my data frame:
> >
> >
> > futures<-
> > structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> > "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> > "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> > "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> > ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> > 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >  QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
> > c("373.2500",
> >  "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >  "90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
> >  )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> > "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >
> > Here is the line I pass :
> >
> >> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> POSITION=
> > sum(QUANTITY))[,c(1,3,2)]
> >
> > And here the result :
> >
> > PosFut<-
> > structure(list(DESCRIPTION = structure(1:3, .Label = c("CORN Jul/10",
> > "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10"), class = "factor"),
> >  POSITION = c(5, 4, 5), SETTLEMENT = structure(c(2L, 3L, 1L
> >  ), .Label = c("14.9200", "373.2500", "90.7750"), class =
> "factor")),
> > .Names = c("DESCRIPTION",
> > "POSITION", "SETTLEMENT"), class = "data.frame", row.names = c(NA,
> > -3L))
> >
> > I can no more use ddply, as this above command line is in a function,
> and
> > this line should be able to work with a data frame with zero rows,
> and in
> > this case ddply doesn't work.
> > Any suggestion how to obtain the same result without ddply?
> >
> > TY for any help.
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.



Re: [R] data frame manipulation with zero rows

2010-06-01 Thread Peter Ehlers

On 2010-06-01 1:53, arnaud Gaboury wrote:

Brian,

If I do understand correctly, I must use in my function something else than
ddply() if I want to avoid any error each time my df has zero rows?
Am I correct?



You could define a function to handle the zero-rows case:

f <- function(x){
 if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
 else
   out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
POSITION=sum(QUANTITY))[,c(1,3,2)]
 out
}
f(futures)
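
A note on the "or whatever" above, as a sketch only: in the zero-row branch,
x[, c(1,3,2)] returns the input columns DESCRIPTION, QUANTITY, CREATED.DATE,
whereas the summarised result has DESCRIPTION, POSITION, SETTLEMENT. If the
caller needs the same column names in both cases, an empty template could be
returned instead. The names `empty_posfut`, `f2` and `summarise_fun` are
hypothetical.

```r
# The empty 'futures' from the thread
futures <- data.frame(DESCRIPTION  = character(0),
                      CREATED.DATE = structure(numeric(0), class = "Date"),
                      QUANTITY     = numeric(0),
                      SETTLEMENT   = character(0),
                      stringsAsFactors = FALSE)

# What the zero-row branch of f() returns: input columns 1, 3, 2
names(futures[, c(1, 3, 2)])
# "DESCRIPTION" "QUANTITY" "CREATED.DATE"

# An empty template with the columns of the summarised result instead
empty_posfut <- data.frame(DESCRIPTION = character(0),
                           POSITION    = numeric(0),
                           SETTLEMENT  = character(0),
                           stringsAsFactors = FALSE)

# f2 is a hypothetical variant of f(): 'summarise_fun' is whatever does the
# real work on non-empty input (for instance the ddply() call above)
f2 <- function(x, summarise_fun) {
  if (nrow(x) < 1L) empty_posfut else summarise_fun(x)
}

# The summariser is never called for zero rows
out <- f2(futures, function(x) stop("not reached for zero rows"))
```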

 -Peter Ehlers





-Original Message-
From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
Sent: Tuesday, June 01, 2010 9:47 AM
To: arnaud Gaboury
Subject: Re: [R] data frame manipulation with zero rows

On Tue, 1 Jun 2010, arnaud Gaboury wrote:


Dear group,

Here is the kind of data.frame I obtain every day with my function :

futures<-
structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
"CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
"SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
c("373.2500",
"373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
"90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
"SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")

I then need to apply the following line of code to the df:

PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
             POSITION=sum(QUANTITY))[,c(1,3,2)]

It works perfectly in most cases, BUT I have a new problem: it can
sometimes occur that my df "futures" is empty, with zero rows.


futures<-
structure(list(DESCRIPTION = character(0), CREATED.DATE =
structure(numeric(0), class = "Date"),
QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
c("DESCRIPTION",
"CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
class = "data.frame")

It is not the usual case, but it can happen. With this df, when I pass the
above-mentioned line, I get an error:

PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
             POSITION=sum(QUANTITY))[,c(1,3,2)]
Error in tapply(1:nrow(data), splitv, list) :
  arguments must have same length


How can I avoid this when my df is empty?


Ask the author of the (missing) function ddply() to correct the error
of using 1:nrow(data) by replacing it by seq_len(nrow(data)).

It's helpful to give example code, but much more helpful if you test
it: yours cannot work without the function ddply() -- this is what
'self-contained' means in the footer here.




Any help is appreciated




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595






[R] data frame manipulation ddply

2010-06-01 Thread arnaud Gaboury
Dear group,

Here is my data frame:


futures <-
structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10", 
"CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10", 
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", 
"SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406, 
18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"), 
QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
c("373.2500", 
"373.2500", "373.2500", "373.2500", "373.2500", "90.7750", 
"90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY", 
"SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")

Here is the line I pass :

>PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise, POSITION=
sum(QUANTITY))[,c(1,3,2)]

And here the result :

PosFut <-
structure(list(DESCRIPTION = structure(1:3, .Label = c("CORN Jul/10", 
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10"), class = "factor"), 
POSITION = c(5, 4, 5), SETTLEMENT = structure(c(2L, 3L, 1L
), .Label = c("14.9200", "373.2500", "90.7750"), class = "factor")),
.Names = c("DESCRIPTION", 
"POSITION", "SETTLEMENT"), class = "data.frame", row.names = c(NA, 
-3L))

I can no longer use ddply: the command line above sits inside a function and
must also work on a data frame with zero rows, in which case ddply fails.
Any suggestions on how to obtain the same result without ddply?

TY for any help.



Re: [R] data frame manipulation with zero rows

2010-06-01 Thread arnaud Gaboury
Brian,

If I understand correctly, I must use something other than ddply() in my
function if I want to avoid an error each time my df has zero rows?
Am I correct?

TY




> -Original Message-
> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> Sent: Tuesday, June 01, 2010 9:47 AM
> To: arnaud Gaboury
> Subject: Re: [R] data frame manipulation with zero rows
> 
> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> 
> > Dear group,
> >
> > Here is the kind of data.frame I obtain every day with my function :
> >
> > futures <-
> > structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> > "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> > "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> > "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> > ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> > 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
> > c("373.2500",
> >"373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >"90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
> >)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> > "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >
> > I need then to apply to the df this following code line :
> >
> >> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> POSITION=
> > sum(QUANTITY))[,c(1,3,2)]
> >
> > It works perfectly in most of case, BUT I have a new problem: it can
> > sometime occurs that my df "futures" is empty, with zero rows.
> >
> >
> > futures <-
> > structure(list(DESCRIPTION = character(0), CREATED.DATE =
> > structure(numeric(0), class = "Date"),
> >QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
> > c("DESCRIPTION",
> > "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
> class =
> > "data.frame")
> >
> > It is not the usual case, but it can happen. With this df, when I
> pass the
> > above mentione line, I get an error :
> >
> >> PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> POSITION=
> > sum(QUANTITY))[,c(1,3,2)]
> > Error in tapply(1:nrow(data), splitv, list) :
> >  arguments must have same length
> >
> >
> > How can I avoid this when my df is empty?
> 
> Ask the author of the (missing) function ddply() to correct the error
> of using 1:nrow(data) by replacing it by seq_len(nrow(data)).
> 
> It's helpful to give example code, but much more helpful if you test
> it: yours cannot work without the function ddply() -- this is what
> 'self-contained' means in the footer here.
> 
> 
> >
> > Any help is appreciated
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] data frame manipulation with zero rows

2010-05-31 Thread arnaud Gaboury
Dear group,

Here is the kind of data.frame I obtain every day with my function :

futures <-
structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10", 
"CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10", 
"LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", 
"SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406, 
18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"), 
QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1), SETTLEMENT =
c("373.2500", 
"373.2500", "373.2500", "373.2500", "373.2500", "90.7750", 
"90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
)), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY", 
"SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")

I then need to apply the following line of code to the df:

>PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise, POSITION=
sum(QUANTITY))[,c(1,3,2)]

It works perfectly in most cases, BUT I have a new problem: it can
sometimes occur that my df "futures" is empty, with zero rows.


futures <-
structure(list(DESCRIPTION = character(0), CREATED.DATE =
structure(numeric(0), class = "Date"), 
QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
c("DESCRIPTION", 
"CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0), class =
"data.frame")

It is not the usual case, but it can happen. With this df, when I pass the
above-mentioned line, I get an error:

>PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise, POSITION=
sum(QUANTITY))[,c(1,3,2)]
Error in tapply(1:nrow(data), splitv, list) : 
  arguments must have same length


How can I avoid this when my df is empty?

Any help is appreciated



Re: [R] Data Frame as Hash Table

2010-05-30 Thread Don MacQueen
If you only need a single variable (in this case value), and just 
want to refer to it by the "key", there are other options.


   value <- rnorm(6)
   names(value) <- format(seq(0.5,3,0.5))

value['1.5']

But do watch out for numerical precision in the output of seq() if 
your vector of values is long.


Or, given the dataframe version, it's not essential to assign key to 
the row names


  d[ d$key==1.5 , ]
or
  subset(d , key==1.5)

(again with some potential for numerical precision issues)
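
A small sketch of the precision issue mentioned above, under assumptions:
the data frame `d` below uses made-up keys (0.1, 0.2, ...) chosen because
they are not exactly representable in binary floating point, and the
tolerance 1e-8 is an arbitrary choice for the illustration.

```r
# Keys that are NOT exactly representable in binary floating point
d <- data.frame(key = seq(0.1, 0.6, 0.1), value = 1:6)

# Exact comparison can miss the row: the third key is computed as
# 0.1 + 2 * 0.1, which is not the same double as the literal 0.3
exact <- d[d$key == 0.3, ]

# Comparing within a small tolerance is more robust
hit <- d[abs(d$key - 0.3) < 1e-8, ]

# Character names made with format() sidestep the comparison altogether
value <- d$value
names(value) <- format(d$key)
by_name <- value["0.3"]
```

With the original keys 0.5, 1.0, 1.5, ... the exact comparison happens to
work, because multiples of 0.5 are stored exactly.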

If you want your pseudo-hash-table to reference more complex
structures, use a list.


  myhash <- vector('list',6)   ## initialize a list of six elements
  names(myhash) <- letters[1:6] ## name the six elements
  ## assign something to the first element
  myhash$a <- data.frame(x=1:4, y=c('a','b','d','f'))

  myhash$b <- rnorm(10)   ## assign something to the second element
and so on for $c, $d, $e, and $f;
the elements don't even have to have the same structure.

-Don

At 1:03 AM -0700 5/30/10, Alan Lue wrote:

I'm interested in using a data frame as if it were a hash table.  For
instance if I had the following,


 (d <- data.frame(key=seq(0.5, 3, 0.5), value=rnorm(6)))

  key        value
1 0.5 -1.118665122
2 1.0  0.465122921
3 1.5 -0.529239211
4 2.0 -0.147324638
5 2.5 -1.531503795
6 3.0 -0.002720434

Then I'd like to be able to quickly retrieve the "value" of "key" 1.5
to get -0.53.  How would one go about doing this?

Yours,
Alan Lue




--
-
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov



Re: [R] Data Frame as Hash Table

2010-05-30 Thread Alan Lue
Thanks, guys!

Alan


On Sun, May 30, 2010 at 5:35 AM, Marshall Feldman  wrote:
> Besides data.table, there's the hash package. It does not use data.frame
> type structures but is a bit more flexible.
>
> Marsh Feldman
>
> On 5/30/10 [May 30, 10] 6:00 AM, r-help-requ...@r-project.org wrote:
>>
>> Message: 40
>> Date: Sun, 30 May 2010 09:24:22 +0100
>> From: Patrick Burns
>> To:r-help@r-project.org,alan@gmail.com
>> Subject: Re: [R] Data Frame as Hash Table
>> Message-ID:<4c0220b6.7090...@pburns.seanet.com>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> You might want to investigate the 'data.table'
>> package.
>>
>> On 30/05/2010 09:03, Alan Lue wrote:
>>
>>>
>>> >  I'm interested in using a data frame as if it were a hash table.  For
>>> >  instance if I had the following,
>>> >
>>>
>>>>
>>>> >>  (d<- data.frame(key=seq(0.5, 3, 0.5), value=rnorm(6)))
>>>>
>>>
>>> >      key        value
>>> >  1 0.5 -1.118665122
>>> >  2 1.0  0.465122921
>>> >  3 1.5 -0.529239211
>>> >  4 2.0 -0.147324638
>>> >  5 2.5 -1.531503795
>>> >  6 3.0 -0.002720434
>>> >
>>> >  Then I'd like to be able to quickly retrieve the "value" of "key" 1.5
>>> >  to get -0.53.  How would one go about doing this?
>>> >
>>> >  Yours,
>>> >  Alan Lue
>>> >
>>> >  __
>>> >  r-h...@r-project.org  mailing list
>>> >  https://stat.ethz.ch/mailman/listinfo/r-help
>>> >  PLEASE do read the posting
>>> > guidehttp://www.R-project.org/posting-guide.html
>>> >  and provide commented, minimal, self-contained, reproducible code.
>>> >
>>>
>>
>> -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home
>> of 'Some hints for the R beginner' and 'The R Inferno')
>
>



-- 
Alan Lue
Master of Financial Engineering
UCLA Anderson School of Management



Re: [R] Data Frame as Hash Table

2010-05-30 Thread Marshall Feldman
Besides data.table, there's the hash package. It does not use data.frame 
type structures but is a bit more flexible.


Marsh Feldman

On 5/30/10 [May 30, 10] 6:00 AM, r-help-requ...@r-project.org wrote:

Message: 40
Date: Sun, 30 May 2010 09:24:22 +0100
From: Patrick Burns
To:r-help@r-project.org,alan@gmail.com
Subject: Re: [R] Data Frame as Hash Table
Message-ID:<4c0220b6.7090...@pburns.seanet.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

You might want to investigate the 'data.table'
package.

On 30/05/2010 09:03, Alan Lue wrote:
   

>  I'm interested in using a data frame as if it were a hash table.  For
>  instance if I had the following,
>
 

>>  (d<- data.frame(key=seq(0.5, 3, 0.5), value=rnorm(6)))
   

>  keyvalue
>  1 0.5 -1.118665122
>  2 1.0  0.465122921
>  3 1.5 -0.529239211
>  4 2.0 -0.147324638
>  5 2.5 -1.531503795
>  6 3.0 -0.002720434
>
>  Then I'd like to be able to quickly retrieve the "value" of "key" 1.5
>  to get -0.53.  How would one go about doing this?
>
>  Yours,
>  Alan Lue
>
>  __
>  R-help@r-project.org  mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-help
>  PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
>  and provide commented, minimal, self-contained, reproducible code.
>
 
-- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com 
(home of 'Some hints for the R beginner' and 'The R Inferno')




Re: [R] Data Frame as Hash Table

2010-05-30 Thread Patrick Burns

You might want to investigate the 'data.table'
package.

On 30/05/2010 09:03, Alan Lue wrote:

I'm interested in using a data frame as if it were a hash table.  For
instance if I had the following,


(d<- data.frame(key=seq(0.5, 3, 0.5), value=rnorm(6)))

  key        value
1 0.5 -1.118665122
2 1.0  0.465122921
3 1.5 -0.529239211
4 2.0 -0.147324638
5 2.5 -1.531503795
6 3.0 -0.002720434

Then I'd like to be able to quickly retrieve the "value" of "key" 1.5
to get -0.53.  How would one go about doing this?

Yours,
Alan Lue




--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



Re: [R] Data Frame as Hash Table

2010-05-30 Thread Barry Rowlingson
On Sun, May 30, 2010 at 9:03 AM, Alan Lue  wrote:
> I'm interested in using a data frame as if it were a hash table.  For
> instance if I had the following,
>
>> (d <- data.frame(key=seq(0.5, 3, 0.5), value=rnorm(6)))
>  key        value
> 1 0.5 -1.118665122
> 2 1.0  0.465122921
> 3 1.5 -0.529239211
> 4 2.0 -0.147324638
> 5 2.5 -1.531503795
> 6 3.0 -0.002720434
>
> Then I'd like to be able to quickly retrieve the "value" of "key" 1.5
> to get -0.53.  How would one go about doing this?

Assign the key to the rownames:

> row.names(d)=d$key
> d
key  value
0.5 0.5 -0.1023732
1   1.0 -0.2005591
1.5 1.5  0.1204866

but note they are character strings:

> d["0.5",]
key  value
0.5 0.5 -0.1023732

 I'm not sure if R uses a fast hashing algorithm for lookups or a
simple sequential search. Looking at the source code, testing, or
waiting for someone else to answer that on here will tell.

 Or you could do it with a list, but again the keys are always
character strings.
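
Another option, sketched here as an illustration rather than a
recommendation from the thread: an R environment created with
new.env(hash = TRUE) is backed by a hash table, so it can serve as a lookup
structure keyed by character strings. The values below are made up, and
format() is one possible way to turn the numeric keys into strings (it gives
"1.0" where as.character() would give "1").

```r
# An environment created with hash = TRUE is backed by a hash table
h <- new.env(hash = TRUE)

# Keys must be character strings, so numeric keys are converted first
keys   <- format(seq(0.5, 3, 0.5))   # "0.5" "1.0" "1.5" "2.0" "2.5" "3.0"
values <- c(10, 20, 30, 40, 50, 60)  # made-up values for illustration
for (i in seq_along(keys)) assign(keys[i], values[i], envir = h)

found   <- get("1.5", envir = h)                       # look up one key
missing <- exists("9.9", envir = h, inherits = FALSE)  # test for an absent key
all_keys <- sort(ls(h))                                # list all keys
```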

Barry



[R] Data Frame as Hash Table

2010-05-30 Thread Alan Lue
I'm interested in using a data frame as if it were a hash table.  For
instance if I had the following,

> (d <- data.frame(key=seq(0.5, 3, 0.5), value=rnorm(6)))
  key        value
1 0.5 -1.118665122
2 1.0  0.465122921
3 1.5 -0.529239211
4 2.0 -0.147324638
5 2.5 -1.531503795
6 3.0 -0.002720434

Then I'd like to be able to quickly retrieve the "value" of "key" 1.5
to get -0.53.  How would one go about doing this?

Yours,
Alan Lue



Re: [R] Data frame manipulation

2010-05-29 Thread Tal Galili
Hi there,
I am glad it helped.
I used mean simply as a placeholder, not because I understood it to be
what you need, so if you believe sum is what you were after, go
with it :)
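
To make the mean-versus-sum difference concrete, here is a sketch in base R
using the data from the quoted message further down. It uses xtabs() and
tapply() rather than cast(), so it is an illustration of the idea and not
the reshape call itself; the names `emp`, `tab_sum` and `tab_mean` are made
up.

```r
# Data as posted in the quoted message
TotEmp      <- c(19, 6, 1, 1, 8, 44, 2, 33, 48, 1)
ClusterType <- c("AGF", "CNS", "OSV", "RTL", "RTL", "TRN", "REL",
                 "ACC_CLUST", "RTL", "WHL")
Taz         <- c(0, 0, 0, 100, 100, 100, 101, 101, 102, 103)
emp <- data.frame(TotEmp, ClusterType, Taz)

# Sum per Taz x ClusterType cell; empty cells become 0
tab_sum <- xtabs(TotEmp ~ Taz + ClusterType, data = emp)

# Mean per cell; empty cells are NA here
tab_mean <- tapply(emp$TotEmp, list(emp$Taz, emp$ClusterType), mean)

tab_sum["100", "RTL"]    # 1 + 8
tab_mean["100", "RTL"]   # (1 + 8) / 2
```

Taz 100 has two RTL entries, so that cell is where the two summaries differ.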

Regarding loving R and the time spent - everyone on this list probably knows
how you feel.  We have all spent time trying to reinvent a wheel, and then found
that someone else had put together a better solution than our patchwork.
So 1 - this is how we learn, I guess.  And 2 - each of us contributes in his
own way, so it is all fine :)

Best,
Tal



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--





Re: [R] Data frame manipulation

2010-05-28 Thread LCOG1

Tal,
   Wow, I can't believe how many different manipulations I went through trying
to coerce it into the format I wanted.  The code below works nearly perfectly; I
had to change the "mean" call to "sum".  I'm curious why you used mean?  Other
than that, thank you very much. I feel a little foolish about how long I spent
trying to do this.  Got to love R.



Re: [R] Methods to explore R data structures

2010-05-28 Thread Timothy Wu
Great, these are valuable tips. Thanks both of you. I appreciate it. :)

Timothy



Re: [R] Data frame manipulation

2010-05-28 Thread Tal Galili
Hi there,

The tool to learn for this is the cast function from the reshape package.
In your example you have more than one value for RTL (within Taz 100), so you
should think about how to aggregate them.
But basically, here is a solution to what you asked for (assuming I
understood you correctly)


require(reshape)
#?cast
cast(EmpTotCt.Zn..,  Taz ~ ClusterType  , value = "TotEmp", mean, fill = 0)
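
If every type listed in AllCtTypes_ should show up as a column (including
those with no observations), one base-R sketch is to make ClusterType a factor
with those levels and let xtabs() sum TotEmp. This is an alternative to the
cast() call above, not a description of what cast() does internally; the names
'emp' and 'wide' are made up for illustration.

```r
# Data as given in the original post
TotEmp      <- c(19, 6, 1, 1, 8, 44, 2, 33, 48, 1)
ClusterType <- c("AGF", "CNS", "OSV", "RTL", "RTL", "TRN", "REL",
                 "ACC_CLUST", "RTL", "WHL")
Taz         <- c(0, 0, 0, 100, 100, 100, 101, 101, 102, 103)
AllCtTypes_ <- c("AGF", "CNS", "OSV", "RTL", "TRN", "REL", "ACC_CLUST", "WHL",
                 "ADM_CLUST", "HLH", "HLH_CLUST", "ACC", "RTL_CLUST", "MFG",
                 "ADM", "MFG_CLUST", "CNS_CLUST", "PRF", "PUB", "FIN",
                 "INF_CLUST", "INF", "EDU_CLUST", "REC", "EDU", "MNG", "UTL",
                 "MIN")

# Declaring all possible types as factor levels keeps the empty ones
emp <- data.frame(TotEmp, Taz,
                  ClusterType = factor(ClusterType, levels = AllCtTypes_))

# Rows are Taz, columns are every level of ClusterType, cells are summed TotEmp
wide <- xtabs(TotEmp ~ Taz + ClusterType, data = emp)
wide
```

Because xtabs() sums, the two RTL rows for Taz 100 collapse into a single cell;
as.data.frame.matrix(wide) turns the result back into a data frame if needed.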



Best,
Tal

Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--






[R] Data frame manipulation

2010-05-27 Thread LCOG1

Hello All, 
Please consider the following:

TotEmp<-c(19,6,1,1,8,44,2,33,48,1)
ClusterType<-c("AGF","CNS","OSV","RTL","RTL","TRN","REL","ACC_CLUST","RTL","WHL")
Taz<-c(0,0,0,100,100,100,101,101,102,103)

AllCtTypes_<-c("AGF","CNS","OSV","RTL","TRN","REL","ACC_CLUST","WHL","ADM_CLUST",
"HLH","HLH_CLUST","ACC","RTL_CLUST","MFG","ADM","MFG_CLUST","CNS_CLUST","PRF","PUB",
"FIN","INF_CLUST","INF","EDU_CLUST","REC","EDU",
"MNG","UTL","MIN")
#Build data frame
EmpTotCt.Zn..<-data.frame(TotEmp,ClusterType,Taz)
#Reverse rows to columns
EmpTotCt.Zn2..<-as.data.frame(t(as.matrix(EmpTotCt.Zn..)))


"EmpTotCt.Zn.." is a data frame that I would like to alter by adding new
columns and filling in 0s where no values exist.  I tried the transpose line
above as it's the only way I know of switching columns to rows, but it's far
from what I am looking for.  So "EmpTotCt.Zn.." returns

   TotEmp ClusterType Taz
1      19         AGF   0
2       6         CNS   0
3       1         OSV   0
4       1         RTL 100
5       8         RTL 100
6      44         TRN 100
7       2         REL 101
8      33   ACC_CLUST 101
9      48         RTL 102
10      1         RTL 103

But what i want is to return the below:

    AGF CNS OSV RTL RTL TRN REL ACC_CLUST RTL
0    19   6   1   0   0   0   0         0   0
100   0   0   0   1   8  44   0         0   0
101   0   0   0   0   0   0   2        33   0
102   0   0   0   0   0   0   0         0  48
103   0   0   0   0   0   0   0         0   1

Where the rows represent "Taz" and the columns represent ALL "ClusterType"s
found in "AllCtTypes_". This would mean that the above output example would
have many more columns with 0s in all the rows, since there are no
observations for them.  It's taken me a while to get the data into the above
format, and I'm afraid I'm stuck on how to get it into the final computational
format, so hopefully someone can help.

Perhaps I have to build a blank data frame with the appropriate dimensions
first, but I am not sure if this is the most efficient way of accomplishing
this.

Thanks in advance.

  


Re: [R] Methods to explore R data structures

2010-05-27 Thread Greg Snow
The TkListView function in the TeachingDemos package is an interactive tool for 
looking at the structure and contents of lists and other objects.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] Methods to explore R data structures

2010-05-27 Thread Martin Morgan
On 05/27/2010 02:13 AM, Timothy Wu wrote:
> Hi,
>
> I'm very confused about R structures and the methods to go with them.
> I'm using R for microarray analysis with Bioconductors. Suppose
> without reading the documentations, what's the best way to explore a
> data structure when you know nothing about it?

probably by reading the documentation, especially vignettes

 > browseVignettes("Biobase")

and then switching to your web browser. If you're asking about
Bioconductor functionality in particular, then the Bioconductor mailing
list is appropriate

  http://bioconductor.org/docs/mailList.html

>
> I am currently using is() / class() to see what the object is. str()
> / attributes() to probe inside the object, and
> someth...@something$something to walk it and explore. Is there any

This looks at the structure, but many classes will want to be
manipulated by their API.

> other way? . Also, without reading documentations, is there a way to
> know what functions are available to extract data from it? For
> example, there is sampleNames() which works on ExpressionSet and
> AnnotatedDataFrame (which is a part of ExpressionSet). How do I know
> they are available (as sometimes I can't recall where I've seen them
> and I forgot the function names). And what are R functions? Are
> those

For an S4 object 'x', I'd

  cls <- class(x)
  getClass(cls)@package

followed by

  showMethods(classes='ExpressionSet',
  where=getNamespace('Biobase'))

or

  cls <- c(class(x), names(getClass(class(x))@contains))
  pkg <- getClass(class(x))@package
  showMethods(classes=cls, where=getNamespace(pkg))

and conversely

  showMethods(sampleNames, where=getNamespace(pkg))

Methods for S3 classes can be found in a similar way, but using
'methods'. Both of these only discover classes in packages that are
loaded in the currently active session. This will miss plain old
functions that don't declare what type of object they intend to operate
on. If when you say 'what are the R functions' you're asking for the
function definition, then

  selectMethod(sampleNames, 'ExpressionSet')
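
The same calls can be tried without Bioconductor installed. The sketch below
uses a toy S4 class and generic ('ToySet' and 'toyNames' are made-up names
standing in for ExpressionSet and sampleNames), followed by the S3 counterpart
with methods().

```r
library(methods)

# Hypothetical toy class and generic, standing in for ExpressionSet / sampleNames
setClass("ToySet", representation(ids = "character"))
setGeneric("toyNames", function(object) standardGeneric("toyNames"))
setMethod("toyNames", "ToySet", function(object) object@ids)

x <- new("ToySet", ids = c("s1", "s2"))

isS4(x)                            # is this an S4 instance at all?
cls <- class(x)
showMethods(classes = cls)         # S4 methods defined for this class
existsMethod("toyNames", cls)      # is there a method for exactly this class?
selectMethod("toyNames", cls)      # the method definition itself

# S3 counterpart: methods() lists S3 methods by class or by generic
methods(class = "data.frame")
methods("summary")
```

As noted above, all of these only see what is loaded in the current session.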

> two separate functions or polymorphic functions? I'm also pretty

sampleNames is a generic. There are methods that operate on eSet (a base
class of ExpressionSet), and on AnnotatedDataFrame.

> confused about S3, S4, or the regular list. I guess I'm fairly
> confused about R in general.

For S4

  ?Methods
  ?Classes

For S3, maybe section 10.9 of RShowDoc('R-intro')

Martin

>
> Any good source of reading (hopefully short and understandable, too)
> would be appreciated. Thanks.
>
> Timothy
>


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



Re: [R] data frame manipulation change elements meeting criteria

2010-05-27 Thread arnaud Gaboury
Sorry Joris, but I am totally lost on this issue!!


> tradenews<-sapply(trades$Buy.Sell..Cleared[which(trades$Trade.Status=="DEL")],
+   switch,Sell="Buy",Buy="Sell")

> tradenews
 Sell 
"Buy"

Not really what I want !!

From: Joris Meys [mailto:jorism...@gmail.com] 
Sent: Thursday, May 27, 2010 10:38 AM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation change elements meeting criteria

Of course. You put in a matrix to sapply, but sapply is for vectors. You
want to apply the switch command on every entry of the vector
trades$Buy.Sell..Cleared for which trades$Trade.Status equals "DEL". Why do
you try to put in a matrix with all variables for the observations where
status is DEL?

You should have done :

tradesnew<-sapply(trades$Buy.Sell..Cleared[which(trades$Trade.Status=="DEL")],
 switch,Sell="Buy",Buy="Sell")

Check the help files, and keep track of what goes in and out a function.

Cheers
Joris
On Thu, May 27, 2010 at 9:41 AM, arnaud Gaboury 
wrote:
Joris,

If i pass this line :

>tradesnew<-sapply(trades[which(trades$Trade.Status=="DEL"),],switch,Sell="Buy",Buy="Sell")

Here is what I get :

> tradesnew
$Trade.Status
NULL

$Instrument.Long.Name
NULL

$Delivery.Prompt.Date
NULL

$Buy.Sell..Cleared.
[1] "Buy"

$Volume
[1] "Buy"

$Price
NULL

$Net.Charges..sum.
NULL

That's certainly not what I want.




From: Joris Meys [mailto:jorism...@gmail.com]
Sent: Thursday, May 27, 2010 8:43 AM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation change elements meeting criteria

The loop is due to the switch statement, not the condition. Without
condition it would become:

for (i in seq_along(Y)){
  new.vect[i] <- switch(
    EXPR = X[i],
    Sell="Buy",
    Buy="Sell",
    X[i])
}
You can make an sapply construct too, of course:

new.vect <- sapply(X[which(Y=="DEL")],switch,Sell="Buy",Buy="Sell")

This will speed up things a little bit, but the effect is marginal.
Cheers
Joris
On Thu, May 27, 2010 at 8:33 AM, arnaud Gaboury 
wrote:
Thank you for the answer.
Is there any way to combine if() and switch() in one line? In my case,
something like :

>if(trade$Trade.Status=="DEL")switch(.)

I would like to avoid the loop .



From: Joris Meys [mailto:jorism...@gmail.com]
Sent: Wednesday, May 26, 2010 9:15 PM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation change elements meeting criteria

see ?switch

X<- rep(c("Buy","Sell","something else"),each=5)
Y<- rep(c("DEL","INS","DEL"),5)


new.vect <- X
for (i in which(Y=="DEL")){
  new.vect[i] <- switch(
    EXPR = X[i],
    Sell="Buy",
    Buy="Sell",
    X[i])
}
cbind(new.vect,X,Y)
On Wed, May 26, 2010 at 7:43 PM, arnaud Gaboury 
wrote:
Dear group,

Here is my df :

trade <-
structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name =
c("SUGAR NO.11",
"CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
"Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
c(4.01,
-8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
"Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
"Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")

Here is what I want :

If trade$Trade.Status=="DEL": then if trade$Buy.Sell..Cleared.=="Sell", change
it to "Buy"; if trade$Buy.Sell..Cleared.=="Buy", change it to "Sell".
If trade$Trade.Status=="INS", do nothing.
I tried to work around this with ifelse, but don't know how to deal with so many
conditions.

Any help is appreciated.

TY




--
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php




Re: [R] data frame manipulation change elements meeting criteria

2010-05-27 Thread arnaud Gaboury
Maybe I should be more precise. Here is what I have :

trades <-
structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name =
c("SUGAR NO.11",
"CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
"Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
c(4.01,
-8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
"Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
"Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")

Here is what I want :

tradesnew <-
structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name =
c("SUGAR NO.11",
"CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
"Jul/10"), Buy.Sell..Cleared. = c("Buy", "Buy", "Buy"), Volume = c(1L,
2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
c(4.01,
-8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
"Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
"Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")




Re: [R] data frame manipulation change elements meeting criteria

2010-05-27 Thread Joris Meys
Ah, OK. sapply -evidently- only gives an output for every case that goes in.
Which is only one, as there is only one DEL case. You can use that output to
change the corresponding value in the dataframe, like :

tradenews <- trades
tradenews$Buy.Sell..Cleared.[which(trades$Trade.Status=="DEL")] <-
 sapply(trades$Buy.Sell..Cleared.[which(trades$Trade.Status=="DEL")],
switch,Sell="Buy",Buy="Sell")
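
A loop-free alternative, offered as a sketch rather than as the one right way:
use a named character vector as a lookup table and index it by the current
values. The name 'flip' and the logical index 'del' are made up for
illustration, and only the two relevant columns of the data frame are
reproduced here.

```r
# Reduced version of the posted data frame (only the two columns involved)
trades <- data.frame(Trade.Status       = c("DEL", "INS", "INS"),
                     Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"),
                     stringsAsFactors   = FALSE)

# Lookup table: names are the old values, elements are the new values
flip <- c(Sell = "Buy", Buy = "Sell")

# Rows to change: status is DEL and the side is one we know how to flip
del <- trades$Trade.Status == "DEL" &
       trades$Buy.Sell..Cleared. %in% names(flip)

tradesnew <- trades
tradesnew$Buy.Sell..Cleared.[del] <- unname(flip[trades$Buy.Sell..Cleared.[del]])
tradesnew
```

Rows with any other status, or with a side that is neither "Buy" nor "Sell",
are left untouched.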

Also take a look at these help files and the examples mentioned in there.
?switch
?sapply
?which

And please, give your variables some decent names. All those points make
your code very error-prone.

Cheers
Joris


[R] Methods to explore R data structures

2010-05-27 Thread Timothy Wu
Hi,

I'm very confused about R data structures and the methods that go with them.
I'm using R for microarray analysis with Bioconductor. Without reading the
documentation, what's the best way to explore a data structure when you
know nothing about it?

I am currently using is() / class() to see what the object is, str() /
attributes() to probe inside the object, and something@something$something
to walk it and explore. Is there any other way? Also, without reading the
documentation, is there a way to know what functions are available to
extract data from it? For example, there is sampleNames(), which works on
ExpressionSet and AnnotatedDataFrame (which is a part of ExpressionSet). How
do I know such functions are available (sometimes I can't recall where I've
seen them and I forget the function names)? And what are these R functions?
Are they two separate functions or one polymorphic function? I'm also pretty
confused about S3, S4, and the regular list. I guess I'm fairly confused
about R in general.

Any good source of reading (hopefully short and understandable, too) would
be appreciated. Thanks.
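
The exploration steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the class "Sample" and the generic sampleLabel() are made up here to stand in for something like ExpressionSet and sampleNames(), so that the example runs without Bioconductor installed.

```r
library(methods)  # normally attached already; listed for completeness

# Hypothetical S4 class and generic, invented for this sketch only.
setClass("Sample", representation(name = "character", values = "numeric"))
setGeneric("sampleLabel", function(object) standardGeneric("sampleLabel"))
setMethod("sampleLabel", "Sample", function(object) object@name)

obj <- new("Sample", name = "s1", values = c(1, 2, 3))

class(obj)        # what kind of object is it?
isS4(obj)         # TRUE for S4 objects; S3 objects and plain lists give FALSE
slotNames(obj)    # S4 slots, accessed with @ (list elements use $)
str(obj)          # compact view of the whole structure

# Which S4 methods are defined for this class?
showMethods(classes = "Sample")
existsMethod("sampleLabel", "Sample")

# The S3 counterpart: list the methods registered for an S3 class.
head(methods(class = "data.frame"))
```

A generic such as sampleLabel() is one function that dispatches on the class of its argument, which is how a single name can work on several classes.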

Timothy


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data frame manipulation change elements meeting criteria

2010-05-27 Thread Joris Meys
Of course. You passed a whole data frame to sapply, so switch was applied to
each column rather than to each entry of one column. You want to apply the
switch command to every entry of the vector trades$Buy.Sell..Cleared. for
which trades$Trade.Status equals "DEL". Why pass in all variables for the
observations where the status is DEL?

You should have done:

tradesnew <- sapply(trades$Buy.Sell..Cleared.[which(trades$Trade.Status == "DEL")],
                    switch, Sell = "Buy", Buy = "Sell")

Check the help files, and keep track of what goes into and comes out of a
function.
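
For reference, the same change can be written without a loop or sapply. The following is a minimal sketch, not the method proposed in this thread: it rebuilds only the two relevant columns of the data frame from the original post, and the names del and side are introduced here for illustration.

```r
# Two columns taken from the data frame in the original post.
trade <- data.frame(
  Trade.Status       = c("DEL", "INS", "INS"),
  Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"),
  stringsAsFactors   = FALSE
)

del  <- trade$Trade.Status == "DEL"   # rows that should be flipped
side <- trade$Buy.Sell..Cleared.

# Nested ifelse: Sell -> Buy, Buy -> Sell, anything else is left alone.
trade$Buy.Sell..Cleared.[del] <- ifelse(side[del] == "Sell", "Buy",
                                 ifelse(side[del] == "Buy", "Sell", side[del]))

trade
```

Only the rows selected by del are touched, so rows with status "INS" keep their value.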

Cheers
Joris

On Thu, May 27, 2010 at 9:41 AM, arnaud Gaboury wrote:

> Joris,
>
> If i pass this line :
>
> >tradesnew<-sapply(trades[which(trades$Trade.Status=="DEL"),],switch,Sel
> >l="Buy",Buy="Sell")
>
> Here is what I get :
>
> > tradesnew
> $Trade.Status
> NULL
>
> $Instrument.Long.Name
> NULL
>
> $Delivery.Prompt.Date
> NULL
>
> $Buy.Sell..Cleared.
> [1] "Buy"
>
> $Volume
> [1] "Buy"
>
> $Price
> NULL
>
> $Net.Charges..sum.
> NULL
>
> That's certainly not what I want.
>
>
>
>
> From: Joris Meys [mailto:jorism...@gmail.com]
> Sent: Thursday, May 27, 2010 8:43 AM
> To: arnaud Gaboury
> Cc: r-help@r-project.org
> Subject: Re: [R] data frame manipulation change elements meeting criteria
>
> The loop is due to the switch statement, not the condition. Without
> condition it would become:
>
> for (i in 1:length(Y)){
> new.vect[i]<-switch(
>   EXPR = X[i],
>   Sell="Buy",
>   Buy="Sell",
>   X[i])
> }
> You can make an sapply construct too off course :
>
> new.vect <- sapply(X[which(Y=="DEL")],switch,Sell="Buy",Buy="Sell")
>
> This will speed up things a little bit, but the effect is marginal.
> Cheers
> Joris
> On Thu, May 27, 2010 at 8:33 AM, arnaud Gaboury 
> wrote:
> Thank you for the answer.
> Is there any way to combine if() and switch() in one line? In my case,
> something like :
>
> >if(trade$Trade.Status=="DEL")switch(.)
>
> I would like to avoid the loop .
>
>
>
> From: Joris Meys [mailto:jorism...@gmail.com]
> Sent: Wednesday, May 26, 2010 9:15 PM
> To: arnaud Gaboury
> Cc: r-help@r-project.org
> Subject: Re: [R] data frame manipulation change elements meeting criteria
>
> see ?switch
>
> X<- rep(c("Buy","Sell","something else"),each=5)
> Y<- rep(c("DEL","INS","DEL"),5)
>
>
> new.vect <- X
> for (i in which(Y=="DEL")){
> new.vect[i]<-switch(
>   EXPR = X[i],
>   Sell="Buy",
>   Buy="Sell",
>   X[i])
> }
> cbind(new.vect,X,Y)
> On Wed, May 26, 2010 at 7:43 PM, arnaud Gaboury 
> wrote:
> Dear group,
>
> Here is my df :
>
> trade <-
> structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name=
> c("SUGAR NO.11",
> "CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
> "Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
> 2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
> c(4.01,
> -8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
> "Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
> "Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")
>
> Here is what I want :
>
> If trade$Trade.Status=="DEL": then if trade$buy.Sell..Cleared==Sell ,
> change
> it to "Buy", if trade$buy.Sell..Cleared==Buy, change it to "Sell".
> If trade$Trade.Status=="INS", do nothing
> I tried to work around with ifelse, but don't know how to deal with so many
> conditions.
>
> Any help is appreciated.
>
> TY
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Joris Meys
> Statistical Consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Applied mathematics, biometrics and process control
>
> Coupure Links 653
> B-9000 Gent
>
> tel : +32 9 264 59 87
> joris.m...@ugent.be
> -

Re: [R] data frame manipulation change elements meeting criteria

2010-05-27 Thread arnaud Gaboury
Joris,

If I run this line:

> tradesnew <- sapply(trades[which(trades$Trade.Status=="DEL"),], switch, Sell="Buy", Buy="Sell")

Here is what I get :

> tradesnew
$Trade.Status
NULL

$Instrument.Long.Name
NULL

$Delivery.Prompt.Date
NULL

$Buy.Sell..Cleared.
[1] "Buy"

$Volume
[1] "Buy"

$Price
NULL

$Net.Charges..sum.
NULL

That's certainly not what I want.




From: Joris Meys [mailto:jorism...@gmail.com] 
Sent: Thursday, May 27, 2010 8:43 AM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation change elements meeting criteria

The loop is due to the switch statement, not the condition. Without
condition it would become:

for (i in 1:length(Y)){
    new.vect[i]<-switch(
  EXPR = X[i],
  Sell="Buy",
  Buy="Sell",
  X[i])
}
You can make an sapply construct too off course :

new.vect <- sapply(X[which(Y=="DEL")],switch,Sell="Buy",Buy="Sell") 

This will speed up things a little bit, but the effect is marginal.
Cheers
Joris
On Thu, May 27, 2010 at 8:33 AM, arnaud Gaboury 
wrote:
Thank you for the answer.
Is there any way to combine if() and switch() in one line? In my case,
something like :

>if(trade$Trade.Status=="DEL")switch(.)

I would like to avoid the loop .



From: Joris Meys [mailto:jorism...@gmail.com]
Sent: Wednesday, May 26, 2010 9:15 PM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation change elements meeting criteria

see ?switch

X<- rep(c("Buy","Sell","something else"),each=5)
Y<- rep(c("DEL","INS","DEL"),5)


new.vect <- X
for (i in which(Y=="DEL")){
    new.vect[i]<-switch(
  EXPR = X[i],
  Sell="Buy",
  Buy="Sell",
  X[i])
}
cbind(new.vect,X,Y)
On Wed, May 26, 2010 at 7:43 PM, arnaud Gaboury 
wrote:
Dear group,

Here is my df :

trade <-
structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name =
c("SUGAR NO.11",
"CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
"Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
c(4.01,
-8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
"Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
"Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")

Here is what I want :

If trade$Trade.Status=="DEL": then if trade$buy.Sell..Cleared==Sell , change
it to "Buy", if trade$buy.Sell..Cleared==Buy, change it to "Sell".
If trade$Trade.Status=="INS", do nothing
I tried to work around with ifelse, but don't know how to deal with so many
conditions.

Any help is appreciated.

TY

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering 
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be 
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php




Re: [R] data frame manipulation change elements meeting criteria

2010-05-26 Thread arnaud Gaboury
Thank you for the answer.
Is there any way to combine if() and switch() in one line? In my case,
something like :

>if(trade$Trade.Status=="DEL")switch(.)

I would like to avoid the loop.



From: Joris Meys [mailto:jorism...@gmail.com] 
Sent: Wednesday, May 26, 2010 9:15 PM
To: arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] data frame manipulation change elements meeting criteria

see ?switch

X<- rep(c("Buy","Sell","something else"),each=5)
Y<- rep(c("DEL","INS","DEL"),5)


new.vect <- X
for (i in which(Y=="DEL")){
    new.vect[i]<-switch(
  EXPR = X[i],
  Sell="Buy",
  Buy="Sell",
  X[i])
}
cbind(new.vect,X,Y)
On Wed, May 26, 2010 at 7:43 PM, arnaud Gaboury 
wrote:
Dear group,

Here is my df :

trade <-
structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name =
c("SUGAR NO.11",
"CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
"Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
c(4.01,
-8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
"Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
"Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")

Here is what I want :

If trade$Trade.Status=="DEL": then if trade$buy.Sell..Cleared==Sell , change
it to "Buy", if trade$buy.Sell..Cleared==Buy, change it to "Sell".
If trade$Trade.Status=="INS", do nothing
I tried to work around with ifelse, but don't know how to deal with so many
conditions.

Any help is appreciated.

TY

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering 
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be 
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] data frame manipulation change elements meeting criteria

2010-05-26 Thread Joris Meys
The loop is needed because of the switch statement, not the condition.
Without the condition it would become:

new.vect <- X
for (i in seq_along(X)) {
    new.vect[i] <- switch(
        EXPR = X[i],
        Sell = "Buy",
        Buy = "Sell",
        X[i])   # default: leave any other value unchanged
}

You can use an sapply construct too, of course:

new.vect <- sapply(X[which(Y=="DEL")], switch, Sell="Buy", Buy="Sell")

This will speed things up a little bit, but the effect is marginal.
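
Two details of the sapply call are worth spelling out: it returns only the "DEL" subset, and without a default branch switch gives NULL for values such as "something else". A sketch that writes the result back into the full vector, assuming the X and Y from the earlier example (flip and del are names introduced here), could look like this:

```r
X <- rep(c("Buy", "Sell", "something else"), each = 5)
Y <- rep(c("DEL", "INS", "DEL"), 5)

# Hypothetical helper: the final unnamed argument is switch's default,
# so unknown values pass through unchanged instead of becoming NULL.
flip <- function(x) switch(x, Sell = "Buy", Buy = "Sell", x)

del <- Y == "DEL"
new.vect <- X
# vapply guarantees one character value per element.
new.vect[del] <- vapply(X[del], flip, character(1), USE.NAMES = FALSE)

cbind(new.vect, X, Y)
```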
Cheers
Joris

On Thu, May 27, 2010 at 8:33 AM, arnaud Gaboury wrote:

> Thank you for the answer.
> Is there any way to combine if() and switch() in one line? In my case,
> something like :
>
> >if(trade$Trade.Status=="DEL")switch(.)
>
> I would like to avoid the loop .
>
>
>
> From: Joris Meys [mailto:jorism...@gmail.com]
> Sent: Wednesday, May 26, 2010 9:15 PM
> To: arnaud Gaboury
> Cc: r-help@r-project.org
> Subject: Re: [R] data frame manipulation change elements meeting criteria
>
> see ?switch
>
> X<- rep(c("Buy","Sell","something else"),each=5)
> Y<- rep(c("DEL","INS","DEL"),5)
>
>
> new.vect <- X
> for (i in which(Y=="DEL")){
> new.vect[i]<-switch(
>   EXPR = X[i],
>   Sell="Buy",
>   Buy="Sell",
>   X[i])
> }
> cbind(new.vect,X,Y)
> On Wed, May 26, 2010 at 7:43 PM, arnaud Gaboury 
> wrote:
> Dear group,
>
> Here is my df :
>
> trade <-
> structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name=
> c("SUGAR NO.11",
> "CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
> "Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
> 2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
> c(4.01,
> -8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
> "Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
> "Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")
>
> Here is what I want :
>
> If trade$Trade.Status=="DEL": then if trade$buy.Sell..Cleared==Sell ,
> change
> it to "Buy", if trade$buy.Sell..Cleared==Buy, change it to "Sell".
> If trade$Trade.Status=="INS", do nothing
> I tried to work around with ifelse, but don't know how to deal with so many
> conditions.
>
> Any help is appreciated.
>
> TY
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Joris Meys
> Statistical Consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Applied mathematics, biometrics and process control
>
> Coupure Links 653
> B-9000 Gent
>
> tel : +32 9 264 59 87
> joris.m...@ugent.be
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
>


-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] data frame manipulation change elements meeting criteria

2010-05-26 Thread Joris Meys
see ?switch

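# Example data: X holds the side, Y the trade status
X <- rep(c("Buy", "Sell", "something else"), each = 5)
Y <- rep(c("DEL", "INS", "DEL"), 5)

new.vect <- X
# Only visit the positions where the status is "DEL"
for (i in which(Y == "DEL")) {
    new.vect[i] <- switch(
        EXPR = X[i],
        Sell = "Buy",
        Buy = "Sell",
        X[i])   # default: leave any other value unchanged
}
cbind(new.vect, X, Y)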
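
The loop above can also be replaced by a named lookup vector with logical indexing. This is a sketch of an alternative, not part of the original answer; flip and idx are names chosen here for illustration.

```r
X <- rep(c("Buy", "Sell", "something else"), each = 5)
Y <- rep(c("DEL", "INS", "DEL"), 5)

# Lookup table: names are the old values, elements are the new values.
flip <- c(Sell = "Buy", Buy = "Sell")

# Change only entries that are "DEL" and have a replacement defined.
idx <- Y == "DEL" & X %in% names(flip)

new.vect <- X
new.vect[idx] <- unname(flip[X[idx]])

cbind(new.vect, X, Y)
```

Adding a further mapping only requires another entry in flip, which tends to scale better than nesting more conditions.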

On Wed, May 26, 2010 at 7:43 PM, arnaud Gaboury wrote:

> Dear group,
>
> Here is my df :
>
> trade <-
> structure(list(Trade.Status = c("DEL", "INS", "INS"), Instrument.Long.Name=
> c("SUGAR NO.11",
> "CORN", "CORN"), Delivery.Prompt.Date = c("Jul/10", "Jul/10",
> "Jul/10"), Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"), Volume = c(1L,
> 2L, 1L), Price = c("15.2500", "368.", "368.5000"), Net.Charges..sum. =
> c(4.01,
> -8.64, -4.32)), .Names = c("Trade.Status", "Instrument.Long.Name",
> "Delivery.Prompt.Date", "Buy.Sell..Cleared.", "Volume", "Price",
> "Net.Charges..sum."), row.names = c(NA, 3L), class = "data.frame")
>
> Here is what I want :
>
> If trade$Trade.Status=="DEL": then if trade$buy.Sell..Cleared==Sell ,
> change
> it to "Buy", if trade$buy.Sell..Cleared==Buy, change it to "Sell".
> If trade$Trade.Status=="INS", do nothing
> I tried to work around with ifelse, but don't know how to deal with so many
> conditions.
>
> Any help is appreciated.
>
> TY
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



[R] data frame manipulation change elements meeting criteria

2010-05-26 Thread arnaud Gaboury
Dear group,

Here is my df :

trade <- structure(list(
    Trade.Status = c("DEL", "INS", "INS"),
    Instrument.Long.Name = c("SUGAR NO.11", "CORN", "CORN"),
    Delivery.Prompt.Date = c("Jul/10", "Jul/10", "Jul/10"),
    Buy.Sell..Cleared. = c("Sell", "Buy", "Buy"),
    Volume = c(1L, 2L, 1L),
    Price = c("15.2500", "368.", "368.5000"),
    Net.Charges..sum. = c(4.01, -8.64, -4.32)),
  .Names = c("Trade.Status", "Instrument.Long.Name", "Delivery.Prompt.Date",
             "Buy.Sell..Cleared.", "Volume", "Price", "Net.Charges..sum."),
  row.names = c(NA, 3L), class = "data.frame")

Here is what I want :

If trade$Trade.Status == "DEL": then if trade$Buy.Sell..Cleared. == "Sell",
change it to "Buy"; if trade$Buy.Sell..Cleared. == "Buy", change it to "Sell".
If trade$Trade.Status == "INS", do nothing.
I tried to work around this with ifelse, but I don't know how to deal with so
many conditions.

Any help is appreciated.

TY



Re: [R] Data reconstruction following PCA using Eigen function

2010-05-24 Thread Julia El-Sayed Moustafa

Hi Thomas,

Thanks very much for your reply. I used svd and it worked perfectly for my
purposes!

Thanks again,
Julia


Re: [R] Data frames, passing by value, and performance (Matt Shotwell)

2010-05-24 Thread biostatmatt
R is pretty smart about duplicating only when necessary. That is,
arguments passed to a function are copy-on-write. Also, I think (someone
more knowledgeable please correct if I'm wrong) it may be better to use
the data frame, which is just a list internally, because if you only
modify one column, only that column is duplicated, not the entire data
frame. If you were to use a matrix, the entire matrix would require
duplication.
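
The copy-on-write behaviour described above can be observed with a small sketch. The function double_b is a made-up helper for this illustration, and the tracemem() part only runs if this build of R has memory profiling enabled.

```r
df <- data.frame(a = 1:3, b = c(4, 5, 6))

# Hypothetical helper: modifies one column of its argument.
double_b <- function(d) {
  d$b <- d$b * 2   # any copying happens here, on write, not at call time
  d
}

out <- double_b(df)

# The caller's data frame is untouched; the function returned a new one.
df$b    # original values
out$b   # doubled values

# Optional: tracemem() reports when R actually duplicates the object.
if (capabilities("profmem")) {
  tracemem(df)
  invisible(double_b(df))   # a duplication message is printed on the write
  untracemem(df)
}
```

Because the original is never changed, there is usually no need for globals or <<- merely to avoid copies made at call time.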

-Matt

On Mon, 2010-05-24 at 09:29 -0500, gschu...@scriptpro.com wrote:
> I understand that everything passed to an R function is passed "by
> value".  This would seem to include data frames, which my current
> application uses heavily, both for storing program inputs, and holding
> intermediate and final results.  In trying to get greater performance
> out of my R code, I am wondering if there is any clean way to access
> data frames without having them copied all the time.  Or is my only
> option to make them global, and write to them using <<-  ?
> 
> I have considered using matrices, but I like the self-documenting aspect
> of data frame column names.  Input/output to disk is not the issue here,
> as that does not take long in my case.  It's just the internal parameter
> passing that I'm concerned about.
> 
> (I've checked R-FAQ, R-lang and searched the R-help archives, but didn't
> see any specific mentions of this.)
> 
> Thanks.
> 
> Grant Schultz
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


