Hi Yohan,  Thanks.

On Wed, Jan 28, 2009 at 4:57 AM, Yohan Chalabi <chal...@phys.ethz.ch> wrote:

> >>>> "TB" == Ted Byers <r.ted.by...@gmail.com>
> >>>> on Tue, 27 Jan 2009 16:00:27 -0500
>
>   TB> I wasn't even aware I was using midnightStandard.  You won't
>   TB> find it in my
>   TB> script.
>   TB>
>   TB> Here is the relevant loop:
>   TB>
>   TB> date1 = timeDate(charvec = Sys.Date(), format = "%Y-%m-%d")
>   TB> date1
>   TB> dow = 3;
>   TB> for (i in 1:length(V4) ) {
>   TB> x = read.csv(as.character(V4[[i]]), header = FALSE,
>   TB> na.strings="");
>   TB> y = x[,1];
>   TB> year = V2[[i]];
>   TB> week = V3[[i]];
>   TB> dtstr = sprintf("%i-%i-%i",year,week,dow);
>   TB> date2 = timeDate(dtstr, format = "%Y-%U-%w");
>   TB> resultsdataframe[[i]] <- difftimeDate(date1,date2,units =
>   TB> "weeks");
>   TB> fp = fitdistr(y,"exponential");
>   TB> print(c(V1[[i]],V2[[i]],V3[[i]],fp,fp));
>   TB> print(c(year,week,date2,resultsdataframe[[i]]));
>   TB> resultsdataframe[[i]] <- fp;
>   TB> resultsdataframe[[i]] <- fp;
>   TB> }
>   TB>
>   TB> It fails with a little more than 100 records left in V4.
>   TB>
>   TB> The full error message is:
>   TB>
>   TB> Error in midnightStandard(charvec, format) :
>   TB> 'charvec' has non-NA entries of different number of characters
>
> timeDate() uses the midnight standard. The function 'midnightStandard'
> assumes that all entries in 'charvec' have the same 'format'. Can you
> please check if this is the case?
>

It is certain that all entries have the same format, but I'm starting to
think that the error message is something of a red herring.  Consider this:

> year = 2009
> week = 0
> day = 3
> datestr = sprintf("%i-%i-%i",year,week,day);datestr
[1] "2009-0-3"
> date1 = timeDate(datestr, format = "%Y-%U-%w");
> date1
GMT
[1] [NA]
> day = 4
> datestr = sprintf("%i-%i-%i",year,week,day);datestr
[1] "2009-0-4"
> date1 = timeDate(datestr, format = "%Y-%U-%w");
> date1
GMT
[1] [2009-01-01]
>
> datestr = sprintf("%i-%i-%i",year,week,3);datestr
[1] "2009-0-3"
> date2 = timeDate(datestr, format = "%Y-%U-%w");date2
GMT
[1] [NA]
> difftimeDate(date2,date1, units = "weeks")
Error in midnightStandard(charvec, format) :
  'charvec' has non-NA entries of different number of characters
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf



The first values for year, week and day are the values on which my loop
dies.  timeDate returns 'NA' for them, evidently because the date they
correspond to is 2008-12-31, which falls outside 2009.

The error is being produced by difftimeDate rather than timeDate (as shown
by the above session), and I think that represents a flaw in the function's
design.  Taking the elapsed time between a null and the present cannot give
a valid result, but if I wrote such a function, I'd have it return null
(perhaps with a warning) rather than just die.

A bigger issue is that timeDate ought never to give null here (which is what
I assume 'NA' means), since all the data comes from transaction data with
real dates, so the elapsed time, measured in weeks, ought always to be a
valid, non-negative real number.  I have not yet come to any conclusion as
to how it ought to behave: whether to return New Year's Day along with a
warning, or to return the requested date by reinvoking itself with the year
and week adjusted so that a valid date results.

On a practical side, how would I test date2 to see if it is null, so I can
give it a sensible default value?
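
For what it's worth, here is a minimal sketch of the kind of test I have in
mind.  It uses only base R's Date class, on the assumption (which I have not
verified) that is.na() behaves the same way for timeDate objects.  The
helper name fill_missing_date is hypothetical and not part of any package.

```r
# Hypothetical helper: replace missing (NA) dates with a default value.
fill_missing_date <- function(d, default) {
  d[is.na(d)] <- default   # is.na() flags the entries that failed to parse
  d
}

# One valid date and one missing one, standing in for parsed results.
parsed <- as.Date(c("2009-01-01", NA))
filled <- fill_missing_date(parsed, as.Date("2009-01-01"))

# Elapsed time in weeks from each date to a fixed reference date.
weeks_elapsed <- as.numeric(
  difftime(as.Date("2009-01-28"), filled, units = "weeks")
)
```

Indexing with is.na() keeps the Date class intact, whereas ifelse() would
drop it and return plain numbers.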

A more troubling thought is that with this combination of SQL to get the
data (my GROUP BY clause uses YEAR(transaction_date), WEEK(transaction_date))
and R to process it, the week containing New Year's Day will ALWAYS be split
in two at the first second of the new year.  I'm going to have to either
figure out a way to correct this or ignore it (it doesn't actually make
things wrong; it just splits one sample into two unequal parts).
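
One possible correction, sketched here in base R under the assumption that
individual transaction dates are available rather than only year and week
numbers, would be to label each week by the date of its Sunday.  A week that
straddles New Year's Day then keeps a single label.  The helper name
week_start is hypothetical.

```r
# Hypothetical helper: the Sunday on or before each date.
# format(d, "%w") gives the weekday as "0" (Sunday) through "6" (Saturday).
week_start <- function(d) {
  d - as.integer(format(d, "%w"))
}

# Two dates in the same calendar week, on either side of New Year's Day.
dates <- as.Date(c("2008-12-31", "2009-01-01"))
labels <- week_start(dates)   # both fall in the week starting 2008-12-28
```

Grouping by such a label, instead of by separate year and week numbers,
would keep the sample for that week in one piece.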

Thoughts?

Thanks

Ted

