[R] ddply function nesting problems
While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to be able to use cut, range, and fullseq within ddply. (For some background, see http://finzi.psych.upenn.edu/Rhelp08/2009-February/187331.html)

Thus, in order to preserve that functionality and put my code within functions, I needed an architecture similar to the following, where you end up running:

function_nesting()

Unfortunately this produces errors within ddply: it does not appear to recognize the variables or functions used inside its call. Thank you for any advice about how to proceed.

determine_counts <- function() {
  min_range <- 1
  max_range <- 30
  bin_range_size <- 5

  Me_df <- data.frame(Data = c(1:15), Person = "Me")
  You_df <- data.frame(Data = c(10:20), Person = "You")
  Them_df <- data.frame(Data = c(15:25), Person = "Them")
  Group_df_tmp <- rbind(Me_df, You_df)
  Group_df <- rbind(Group_df_tmp, Them_df)
  Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))

  #counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(Data), 5)), Person), nrow)

  # Approach 1
  counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(c(Group_df$Data, min_range, max_range)), bin_range_size)), Person), nrow)

  # Approach 2
  range_tmp <- range(c(Group_df$Data, min_range, max_range))
  counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range_tmp, bin_range_size)), Person), nrow)

  names(counts) <- c("Bin", "Person", "Frequency")
  qplot(Person, Frequency, data = counts, fill = Person, geom = "bar", stat = "identity", width = 0.9, xlab = "Person") + facet_grid(. ~ Bin)
}

function_nesting <- function() {
  determine_counts()
}

However, if the code is just run straight through without being nested, it works fine:

min_range <- 1
max_range <- 30
bin_range_size <- 5

Me_df <- data.frame(Data = c(1:15), Person = "Me")
You_df <- data.frame(Data = c(10:20), Person = "You")
Them_df <- data.frame(Data = c(15:25), Person = "Them")
Group_df_tmp <- rbind(Me_df, You_df)
Group_df <- rbind(Group_df_tmp, Them_df)
Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))

#counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(Data), 5)), Person), nrow)
counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(c(Group_df$Data, min_range, max_range)), bin_range_size)), Person), nrow)

Unfortunately this is not within a function, so thanks again for any advice on how to approach this issue.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] ddply function nesting problems
Hi,

I think your ddply call with a calculation inside .( ) is the problem. Are you sure you need to do this? Performing the cut outside ddply seems to work fine:

determine_counts <- function() {
  min_range <- 1
  max_range <- 30
  bin_range_size <- 5

  Me_df <- data.frame(Data = c(1:15), Person = "Me")
  You_df <- data.frame(Data = c(10:20), Person = "You")
  Them_df <- data.frame(Data = c(15:25), Person = "Them")
  Group_df_tmp <- rbind(Me_df, You_df)
  Group_df <- rbind(Group_df_tmp, Them_df)
  Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))

  # Compute the bin as an ordinary column first, then group by that column.
  Group_df <- transform(Group_df, cut = cut(Data, breaks = fullseq(range(c(Data, min_range, max_range)), bin_range_size)))
  counts <- ddply(Group_df, .(cut, Person), nrow)

  names(counts) <- c("Bin", "Person", "Frequency")
  qplot(Person, Frequency, data = counts, fill = Person, geom = "bar", stat = "identity", width = 0.9, xlab = "Person") + facet_grid(. ~ Bin)
}

function_nesting()

HTH,

baptiste

2009/11/19 Jason Rupert jasonkrup...@yahoo.com:
> While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. [...]
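The suggestion in the reply above (compute the bin as a plain column first, then group by that column) can be sketched without any add-on packages. This is only an illustration under stated assumptions: full_breaks() is a hypothetical base-R stand-in for fullseq(), and table() is swapped in for ddply(..., nrow), so it is not the code from the thread.

```r
# Hypothetical stand-in for fullseq(): breaks at multiples of `size` covering `rng`.
full_breaks <- function(rng, size) {
  seq(floor(rng[1] / size) * size, ceiling(rng[2] / size) * size, by = size)
}

determine_counts <- function(min_range = 1, max_range = 30, bin_range_size = 5) {
  Group_df <- rbind(
    data.frame(Data = 1:15,  Person = "Me"),
    data.frame(Data = 10:20, Person = "You"),
    data.frame(Data = 15:25, Person = "Them")
  )
  Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))

  # Step 1: build the bin column here, where the local variables are in scope.
  breaks <- full_breaks(range(c(Group_df$Data, min_range, max_range)), bin_range_size)
  Group_df$Bin <- cut(Group_df$Data, breaks = breaks)

  # Step 2: count rows per (Bin, Person) using table() instead of ddply().
  counts <- as.data.frame(table(Bin = Group_df$Bin, Person = Group_df$Person),
                          responseName = "Frequency")

  # Keep only the combinations that actually occur in the data.
  counts[counts$Frequency > 0, ]
}

function_nesting <- function() determine_counts()

counts <- function_nesting()
print(counts)
```

Because the grouping step only refers to columns that already exist in the data frame, it no longer matters which environment the grouping expression is evaluated in.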
Re: [R] ddply function nesting problems
Awesome! Thanks a ton!

I guess I had overlooked how it was really working. I will still have to reflect on why it worked when run straight through but not when nested. That is kind of a mystery. Oh well...

Thanks again.

----- Original Message -----
From: baptiste auguie baptiste.aug...@googlemail.com
To: Jason Rupert jasonkrup...@yahoo.com
Cc: R-help@r-project.org
Sent: Thu, November 19, 2009 9:24:29 AM
Subject: Re: [R] ddply function nesting problems

> Hi, I think your ddply call with a calculation inside .( ) is the problem. Are you sure you need to do this? Performing the cut outside ddply seems to work fine [...]
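One plausible explanation for the "mystery" is name lookup. If the quoted grouping expression is evaluated in the data frame plus the global environment, rather than in the environment of the calling function, then variables defined at top level are found while locals such as min_range are not. That is an assumption about how plyr behaved at the time; the thread does not confirm it. The base-R sketch below only illustrates the scoping effect, and every name in it is hypothetical.

```r
# A quoted expression, similar in spirit to what .() captures.
quoted <- quote(Data * local_factor)
df <- data.frame(Data = 1:3)

# Looks names up in `data` first, then only in the global environment.
eval_without_caller <- function(expr, data) {
  eval(expr, data, globalenv())
}

# Looks names up in `data` first, then in the caller's environment.
eval_with_caller <- function(expr, data, env = parent.frame()) {
  eval(expr, data, env)
}

nested <- function() {
  local_factor <- 10  # local to this function, not a global variable
  list(
    without = tryCatch(eval_without_caller(quoted, df),
                       error = function(e) "not found"),
    with    = eval_with_caller(quoted, df)
  )
}

result <- nested()
print(result)
```

Run at top level, local_factor would be a global and both evaluators would find it, which matches the observation that the un-nested code worked.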