Re: [R] data frame manipulation with zero rows

2010-06-02 Thread arnaud Gaboury
I really do think that is a very good idea.
Thank you.





> -Original Message-
> From: h.wick...@gmail.com [mailto:h.wick...@gmail.com] On Behalf Of
> Hadley Wickham
> Sent: Wednesday, June 02, 2010 3:31 PM
> To: arnaud Gaboury
> Cc: Peter Ehlers; r-help@r-project.org; Prof Brian Ripley
> Subject: Re: [R] data frame manipulation with zero rows
> 
> Hi Arnaud,
> 
> I've added this case to the set of test cases in plyr and it will be
> fixed in the next version.
> 
> Hadley

Re: [R] data frame manipulation with zero rows

2010-06-02 Thread Hadley Wickham
Hi Arnaud,

I've added this case to the set of test cases in plyr and it will be
fixed in the next version.

Hadley

On Tue, Jun 1, 2010 at 2:33 PM, arnaud Gaboury  wrote:
> Maybe not the cleanest way, but I create a fake data frame with one row so
> ddply() is happy!!
>> if (nrow(futures)==0) futures<-data.frame(...)

Re: [R] data frame manipulation with zero rows

2010-06-01 Thread arnaud Gaboury
Maybe not the cleanest way, but I create a fake data frame with one row so
ddply() is happy!!
> if (nrow(futures)==0) futures<-data.frame(...)





> -Original Message-
> From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
> Sent: Tuesday, June 01, 2010 12:07 PM
> To: arnaud Gaboury
> Cc: 'Prof Brian Ripley'; r-help@r-project.org
> Subject: Re: [R] data frame manipulation with zero rows
> 
> On 2010-06-01 1:53, arnaud Gaboury wrote:
> > Brian,
> >
> > If I do understand correctly, I must use in my function something else than
> > ddply() if I want to avoid any error each time my df has zero rows?
> > Am I correct?
> >
>
> You could define a function to handle the zero-rows case:
>
> f <- function(x){
>   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
>   else
>     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
>                  POSITION=sum(QUANTITY))[,c(1,3,2)]
>   out
> }
> f(futures)
> 
>   -Peter Ehlers
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data frame manipulation with zero rows

2010-06-01 Thread arnaud Gaboury
It is indeed ddply() from package plyr.





> -Original Message-
> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> Sent: Tuesday, June 01, 2010 12:24 PM
> To: Peter Ehlers
> Cc: arnaud Gaboury; r-help@r-project.org
> Subject: Re: [R] data frame manipulation with zero rows
> 
> On Tue, 1 Jun 2010, Peter Ehlers wrote:
> 
> > On 2010-06-01 1:53, arnaud Gaboury wrote:
> >> Brian,
> >>
> >> If I do understand correctly, I must use in my function something else than
> >> ddply() if I want to avoid any error each time my df has zero rows?
> >> Am I correct?
> >>
> >
> > You could define a function to handle the zero-rows case:
> >
> > f <- function(x){
> >   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
> >   else
> >     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
> >                  POSITION=sum(QUANTITY))[,c(1,3,2)]
> >   out
> > }
> > f(futures)
> 
> Or simply fix ddply.  We don't know what that is or what it should do
> for the case of zero rows: it may or may not be the one in package
> plyr.
> 


Re: [R] data frame manipulation with zero rows

2010-06-01 Thread Prof Brian Ripley

On Tue, 1 Jun 2010, Peter Ehlers wrote:

> On 2010-06-01 1:53, arnaud Gaboury wrote:
>> Brian,
>>
>> If I do understand correctly, I must use in my function something else than
>> ddply() if I want to avoid any error each time my df has zero rows?
>> Am I correct?
>
> You could define a function to handle the zero-rows case:
>
> f <- function(x){
>   if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
>   else
>     out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
>                  POSITION=sum(QUANTITY))[,c(1,3,2)]
>   out
> }
> f(futures)

Or simply fix ddply.  We don't know what that is or what it should do
for the case of zero rows: it may or may not be the one in package
plyr.




--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] data frame manipulation with zero rows

2010-06-01 Thread Peter Ehlers

On 2010-06-01 1:53, arnaud Gaboury wrote:
> Brian,
>
> If I do understand correctly, I must use in my function something else than
> ddply() if I want to avoid any error each time my df has zero rows?
> Am I correct?

You could define a function to handle the zero-rows case:

f <- function(x){
  if(nrow(x) < 1) out <- x[, c(1,3,2)]  # or whatever
  else
    out <- ddply(x, c("DESCRIPTION","SETTLEMENT"), summarise,
                 POSITION=sum(QUANTITY))[,c(1,3,2)]
  out
}
f(futures)

  -Peter Ehlers
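
A variant of the same guard, as a minimal sketch: the zero-row branch returns an empty data frame with the same columns as the normal result, so downstream code sees one shape either way. The helper name position_summary() and the two small test frames are made up for illustration. Base R's aggregate() stands in for ddply() here only so that the example runs without plyr installed; it is not what was posted above.

```r
# Hypothetical helper (not from the thread); base R only.
position_summary <- function(x) {
  # Zero-row input: return an empty frame shaped like the normal result.
  if (nrow(x) == 0L) {
    return(data.frame(DESCRIPTION = character(0),
                      POSITION    = numeric(0),
                      SETTLEMENT  = character(0),
                      stringsAsFactors = FALSE))
  }
  # Sum QUANTITY within each DESCRIPTION/SETTLEMENT pair.
  out <- aggregate(QUANTITY ~ DESCRIPTION + SETTLEMENT, data = x, FUN = sum)
  names(out)[names(out) == "QUANTITY"] <- "POSITION"
  out[, c("DESCRIPTION", "POSITION", "SETTLEMENT")]
}

# Illustrative inputs (assumed, much smaller than the real 'futures').
empty <- data.frame(DESCRIPTION = character(0), QUANTITY = numeric(0),
                    SETTLEMENT = character(0), stringsAsFactors = FALSE)
small <- data.frame(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
                                    "LIVE CATTLE Aug/10"),
                    QUANTITY    = c(1, 1, 3),
                    SETTLEMENT  = c("373.2500", "373.2500", "90.7750"),
                    stringsAsFactors = FALSE)

res_empty <- position_summary(empty)
res_small <- position_summary(small)
```

Both results carry the columns DESCRIPTION, POSITION and SETTLEMENT, so code that consumes the summary does not need its own special case for an empty day.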







Re: [R] data frame manipulation with zero rows

2010-06-01 Thread arnaud Gaboury
Brian,

If I understand correctly, I must use something other than ddply() in my
function if I want to avoid an error whenever my df has zero rows.
Am I correct?

Thank you.




> -Original Message-
> From: Prof Brian Ripley [mailto:rip...@stats.ox.ac.uk]
> Sent: Tuesday, June 01, 2010 9:47 AM
> To: arnaud Gaboury
> Subject: Re: [R] data frame manipulation with zero rows
> 
> On Tue, 1 Jun 2010, arnaud Gaboury wrote:
> 
> > Dear group,
> >
> > Here is the kind of data.frame I obtain every day with my function:
> >
> > futures <-
> > structure(list(DESCRIPTION = c("CORN Jul/10", "CORN Jul/10",
> > "CORN Jul/10", "CORN Jul/10", "CORN Jul/10", "LIVE CATTLE Aug/10",
> > "LIVE CATTLE Aug/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10",
> > "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 Jul/10"
> > ), CREATED.DATE = structure(c(18403, 18406, 18406, 18406, 18406,
> > 18407, 18408, 18406, 18407, 18407, 18407, 18407), class = "Date"),
> >     QUANTITY = c(1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1),
> >     SETTLEMENT = c("373.2500",
> >     "373.2500", "373.2500", "373.2500", "373.2500", "90.7750",
> >     "90.7750", "14.9200", "14.9200", "14.9200", "14.9200", "14.9200"
> >     )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANTITY",
> > "SETTLEMENT"), row.names = c(NA, 12L), class = "data.frame")
> >
> > I then need to apply the following line of code to the df:
> >
> > PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >              POSITION=sum(QUANTITY))[,c(1,3,2)]
> >
> > It works perfectly in most cases, BUT I have a new problem: it can
> > sometimes happen that my df "futures" is empty, with zero rows.
> >
> >
> > futures <-
> > structure(list(DESCRIPTION = character(0), CREATED.DATE =
> > structure(numeric(0), class = "Date"),
> >     QUANTITY = numeric(0), SETTLEMENT = character(0)), .Names =
> > c("DESCRIPTION",
> > "CREATED.DATE", "QUANTITY", "SETTLEMENT"), row.names = integer(0),
> > class = "data.frame")
> >
> > This is not the usual case, but it can happen. With this df, when I
> > run the above-mentioned line, I get an error:
> >
> > PosFut=ddply(futures, c("DESCRIPTION","SETTLEMENT"), summarise,
> >              POSITION=sum(QUANTITY))[,c(1,3,2)]
> > Error in tapply(1:nrow(data), splitv, list) :
> >   arguments must have same length
> >
> >
> > How can I avoid this when my df is empty?
> 
> Ask the author of the (missing) function ddply() to correct the error
> of using 1:nrow(data) by replacing it by seq_len(nrow(data)).
> 
> It's helpful to give example code, but much more helpful if you test
> it: yours cannot work without the function ddply() -- this is what
> 'self-contained' means in the footer here.
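
To see why 1:nrow(data) goes wrong for an empty data frame, here is a small sketch. It is a reduced reproduction and not plyr's actual code: the empty_df object and the use of its DESCRIPTION column as the grouping vector are assumptions made for illustration.

```r
# Assumed stand-in for the empty 'futures' data frame from the post.
empty_df <- data.frame(DESCRIPTION = character(0), QUANTITY = numeric(0),
                       stringsAsFactors = FALSE)
n <- nrow(empty_df)          # 0

bad_index  <- 1:n            # 1:0 counts down, giving c(1, 0)
good_index <- seq_len(n)     # integer(0), one index per row

length(bad_index)            # 2
length(good_index)           # 0

# The length-2 index no longer matches the length-0 grouping vector,
# so tapply() stops with an error; capture its message rather than abort.
msg <- tryCatch(tapply(bad_index, empty_df$DESCRIPTION, list),
                error = function(e) conditionMessage(e))
```

With seq_len() the index has zero elements, which matches the zero rows, so the length mismatch reported in the error above does not arise.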
> 
> 
> >
> > Any help is appreciated