Hi,

I have a query about sqldf, and dates in general. I couldnt find much on
the net or on the forums, hence I am here. Here is the issue:

I want to write a function that accepts 3 arguments: date1, date2 and a
dataframe, say 'df'. Within the function, I want to populate a temp
dataframe which essentially contains the output of the query "select * from
df where DATE between date1 and date2". DATE is a column (of class Date)
which will be present in the input dataframe. This is how the function will
look like:

testfun = function("date1", "date2", df){
temp = sqldf("select * from df where DATE between 'date1' and 'date2'")
return(temp) #not sure if I'm doing this correctly
}

There are multiple issues here:

One is, how do I accept date as a parameter? Do I need to change the class
of date1 and date2 at input level or within the sql query? In general how
to format the input and how to return the temp dataframe once the function
ends, kindly help. It may be basic but I've spent hours on it with no
success.

The above was the bigger picture. Another issue that I faced is when
running a simple select (using sqldf) on a sample dataframe, I could not
get a result. But when I use 'subset', I get an output.

Below is the sample dataset named 'dlfhead':

structure(list(NAME = c("DLF", "DLF", "DLF", "DLF", "DLF", "DLF"
), EXPDATE = structure(c(15001, 15001, 15001, 15001, 15001, 15001
), class = "Date"), STRIKE = c(280, 280, 280, 280, 280, 280),
    TMSTMP = c("14:54:17", "14:42:38", "14:09:04", "14:20:25",
    "12:27:26", "11:59:36"), PRICE = c(22, 23, 23, 23, 23, 23.2
    ), DATE = structure(c(14977, 14977, 14977, 14977, 14977,
    14977), class = "Date"), TTE = c(24, 24, 24, 24, 24, 24),
    DTTM = structure(c(1294046657, 1294045958, 1294043944, 1294044625,
    1294037846, 1294036176), class = c("POSIXct", "POSIXt")),
    SPOT = c(295, 294.5, 294.8, 293.85, 294.7, 294.55), MONEYNESS =
c(1.05357142857143,
    1.05178571428571, 1.05285714285714, 1.04946428571429, 1.0525,
    1.05196428571429), TTEBY7 = c(4, 4, 4, 4, 4, 4), COMB =
c("2011-01-034",
    "2011-01-034", "2011-01-034", "2011-01-034", "2011-01-034",
    "2011-01-034"), RATE = c(7.55219988366433, 7.55260732560891,
    7.55302228192484, 7.55344470721609, 7.55387455629645, 7.82914555610361
    )), .Names = c("NAME", "EXPDATE", "STRIKE", "TMSTMP", "PRICE",
"DATE", "TTE", "DTTM", "SPOT", "MONEYNESS", "TTEBY7", "COMB",
"RATE"), row.names = c(NA, 6L), class = "data.frame")

Below are the commands and their outputs:

> dlftest = sqldf("select * from dlfhead where DATE = '2011-01-03'")
> nrow(dlftest)
[1] 0

Kindly note that in the above, the query does complete successfully with
'dlftest' created with 0 rows. When I run the select on a column with class
other than date I get the output.

> dlftest = sqldf("select * from dlfhead where TTE=24")
> nrow(dlftest)
[1] 6

But when I use the 'subset' command, I do get an answer:

> dlftest = subset(dlfhead, DATE == '2011-01-03')
> nrow(dlftest)
[1] 6

So the issue has to be the date-class columns in the dataframe, but what is
the issue I have no idea. Can anyone help please?

Thanks a lot for reading though this long email. Kindly reply if you have
an idea about either issue.

Regards,
Shivam

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to