Re: [R] Question regarding sqldf
My sqldf package was already updated. The result from the example you ran is also missing 2 of the five days so something still isn't working correctly. | | From: | | | |milton ruser milton.ru...@gmail.com | | | | To:| | | |Michael Rizzo/RTP/USEPA/u...@epa | | | | Cc:| | | |r-help@r-project.org | | | | Date: | | | |07/30/2009 05:56 PM | | | | Subject: | | | |Re: [R] Question regarding sqldf | | How about update R/sqldf? require(sqlf) Loading required package: sqlf Warning message: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called 'sqlf' testframe-data.frame(sdate=rep(paste + (day,1:5,sep=),each=5),bext=runif(25,1,100),otherstuff=runif + (25,45,60),stuff=runif(25,3,25)) maxdays-sqldf(select distinct * from testframe group by sdate having bext=max(bext)) maxdays sdate bext otherstuff stuff 1 day1 77.20252 51.77503 12.045835 2 day2 91.37352 50.07259 4.468679 3 day5 74.29047 58.58456 10.027444 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question regarding sqldf
Your sql statement is using a variable in the having clause that is not in the group by clause. Given that group by is supposed to produce a single output per group we see that this construct is problematic since what should happen in the case that there are two equal maximum values in the same group? Its possible that some databases can support that anyways but I don't think sqlite does. Write your query like this: sqldf(select * from testframe a where a.bext = + (select max(bext) from testframe where a.sdate = sdate)) sdate bext otherstuff stuff 1 day1 94.10626 47.20670 15.340856 2 day2 89.34949 55.36058 19.572773 3 day3 95.72650 52.16694 17.632534 4 day4 95.49586 47.28667 12.676297 5 day5 99.43271 48.98959 3.013745 On Thu, Jul 30, 2009 at 5:30 PM, rizzo.mich...@epamail.epa.gov wrote: Here is a test data frame: testframe-data.frame(sdate=rep(paste (day,1:5,sep=),each=5),bext=runif(25,1,100),otherstuff=runif (25,45,60),stuff=runif(25,3,25)) testframe sdate bext otherstuff stuff 1 day1 37.863859 49.19816 10.036211 2 day1 58.557049 59.23145 21.793954 3 day1 70.345462 48.82313 24.630742 4 day1 1.245913 57.70718 14.131270 5 day1 40.170634 50.38963 21.420328 6 day2 36.126846 51.97043 22.815169 7 day2 95.801896 53.30061 13.540535 8 day2 19.959486 50.46775 9.403084 9 day2 5.772996 55.14082 17.137113 10 day2 77.956966 51.44969 13.448527 11 day3 6.743810 54.49836 17.650795 12 day3 60.758896 55.35407 5.005359 13 day3 86.924873 57.74122 8.691572 14 day3 91.933544 48.17538 8.559737 15 day3 23.627126 59.28906 4.766172 16 day4 76.200158 55.78072 19.493428 17 day4 67.809049 51.06784 7.116428 18 day4 70.309643 54.38067 10.736461 19 day4 91.152166 58.39768 23.199943 20 day4 30.358417 54.16645 24.153823 21 day5 64.064953 58.31813 19.856882 22 day5 49.220106 48.49721 16.615940 23 day5 21.554696 47.03421 3.585458 24 day5 30.227874 55.51724 14.435731 25 day5 51.518716 54.31445 20.837326 If I run the following sqldf statement: maxdays-sqldf(select distinct * from testframe group by sdate having bext=max(bext)) I get: maxdays data frame with 0 columns and 0 rows I want to get the entire observation for each day where bext is the maximum for the day. | | From: | | | |milton ruser milton.ru...@gmail.com | | | | To: | | | |Michael Rizzo/RTP/USEPA/u...@epa | | | | Cc: | | | |r-help@r-project.org | | | | Date: | | | |07/30/2009 03:03 PM | | | | Subject: | | | |Re: [R] Question regarding sqldf | | Hi Rizzi, how about a reproducible example/data.frame? :-) milton On Thu, Jul 30, 2009 at 1:46 PM, rizzo.mich...@epamail.epa.gov wrote: Hello, I am having a problem using sqldf. I'm trying to choose
Re: [R] Question regarding sqldf
Hi Rizzi, how about a reproducible example/data.frame? :-) milton On Thu, Jul 30, 2009 at 1:46 PM, rizzo.mich...@epamail.epa.gov wrote: Hello, I am having a problem using sqldf. I'm trying to choose a subset of observations from a data set based on the date and maximum value of a variable by date. Here is the code I am using: test-sqldf(select distinct * from bextuse group by sdate having bext=max(bext),method=raw); The result I get back is a data frame with 0 rows and 0 columns. I have tried the code in another program that utilizes SQL, and I retrieve the 14 rows I was expecting to get. I looked at the SQLite information on the web, and it mentions that the having clause can be used to select observations from grouped data using aggregating functions. Any help or advice is greatly appreciated. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question regarding sqldf
Here is a test data frame: testframe-data.frame(sdate=rep(paste (day,1:5,sep=),each=5),bext=runif(25,1,100),otherstuff=runif (25,45,60),stuff=runif(25,3,25)) testframe sdate bext otherstuff stuff 1 day1 37.863859 49.19816 10.036211 2 day1 58.557049 59.23145 21.793954 3 day1 70.345462 48.82313 24.630742 4 day1 1.245913 57.70718 14.131270 5 day1 40.170634 50.38963 21.420328 6 day2 36.126846 51.97043 22.815169 7 day2 95.801896 53.30061 13.540535 8 day2 19.959486 50.46775 9.403084 9 day2 5.772996 55.14082 17.137113 10 day2 77.956966 51.44969 13.448527 11 day3 6.743810 54.49836 17.650795 12 day3 60.758896 55.35407 5.005359 13 day3 86.924873 57.74122 8.691572 14 day3 91.933544 48.17538 8.559737 15 day3 23.627126 59.28906 4.766172 16 day4 76.200158 55.78072 19.493428 17 day4 67.809049 51.06784 7.116428 18 day4 70.309643 54.38067 10.736461 19 day4 91.152166 58.39768 23.199943 20 day4 30.358417 54.16645 24.153823 21 day5 64.064953 58.31813 19.856882 22 day5 49.220106 48.49721 16.615940 23 day5 21.554696 47.03421 3.585458 24 day5 30.227874 55.51724 14.435731 25 day5 51.518716 54.31445 20.837326 If I run the following sqldf statement: maxdays-sqldf(select distinct * from testframe group by sdate having bext=max(bext)) I get: maxdays data frame with 0 columns and 0 rows I want to get the entire observation for each day where bext is the maximum for the day. | | From: | | | |milton ruser milton.ru...@gmail.com | | | | To:| | | |Michael Rizzo/RTP/USEPA/u...@epa | | | | Cc:| | | |r-help@r-project.org | | | | Date: | | | |07/30/2009 03:03 PM | | | | Subject: | | | |Re: [R] Question regarding sqldf | | Hi Rizzi, how about a reproducible example/data.frame? :-) milton On Thu, Jul 30, 2009 at 1:46 PM, rizzo.mich...@epamail.epa.gov wrote: Hello, I am having a problem using sqldf. I'm trying to choose a subset of observations from a data set based on the date and maximum value of a variable by date. Here is the code I am using: test-sqldf(select distinct * from bextuse group by sdate having bext=max(bext),method=raw); The result I get back is a data frame with 0 rows and 0 columns. I have tried the code in another program that utilizes SQL, and I retrieve the 14 rows I was expecting to get. I looked at the SQLite information on the web, and it mentions that the having clause can be used to select observations from grouped data using aggregating functions. Any help or advice is greatly appreciated. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal
Re: [R] Question regarding sqldf
How about update R/sqldf? require(sqlf) Loading required package: sqlf Warning message: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called 'sqlf' testframe-data.frame(sdate=rep(paste + (day,1:5,sep=),each=5),bext=runif(25,1,100),otherstuff=runif + (25,45,60),stuff=runif(25,3,25)) maxdays-sqldf(select distinct * from testframe group by sdate having bext=max(bext)) maxdays sdate bext otherstuff stuff 1 day1 77.20252 51.77503 12.045835 2 day2 91.37352 50.07259 4.468679 3 day5 74.29047 58.58456 10.027444 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question regarding sqldf
I just answered essentially the same question from another person offline within the last day. Anyways: Date class variables are stored as numbers, i.e. days since the Epoch, in R and when transferred to sqlite get transferred as numbers. Try this: library(sqldf) DF - data.frame(a = Sys.Date() + 1:5, b = 1:5) DF a b 1 2009-07-31 1 2 2009-08-01 2 3 2009-08-02 3 4 2009-08-03 4 5 2009-08-04 5 Sys.Date() + 2 [1] 2009-08-01 s - sprintf(select * from DF where a = %d, Sys.Date() + 2) s [1] select * from DF where a = 14457 sqldf(s) a b 1 2009-08-01 2 2 2009-08-02 3 3 2009-08-03 4 4 2009-08-04 5 # to compare against character string store a as character DF2 - transform(DF, a = as.character(a)) sqldf(select * from DF2 where a = '2009-08-01') a b 1 2009-08-01 2 2 2009-08-02 3 3 2009-08-03 4 4 2009-08-04 5 On Thu, Jul 30, 2009 at 1:46 PM, rizzo.mich...@epamail.epa.gov wrote: Hello, I am having a problem using sqldf. I'm trying to choose a subset of observations from a data set based on the date and maximum value of a variable by date. Here is the code I am using: test-sqldf(select distinct * from bextuse group by sdate having bext=max(bext),method=raw); The result I get back is a data frame with 0 rows and 0 columns. I have tried the code in another program that utilizes SQL, and I retrieve the 14 rows I was expecting to get. I looked at the SQLite information on the web, and it mentions that the having clause can be used to select observations from grouped data using aggregating functions. Any help or advice is greatly appreciated. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.