Re: [R] ISOdate() and strptime()
Thanks for this clarification. I have learned in the meantime that it is necessary to be very careful when using all these POSIX things.

As another example, here is something that made me scratch my head just yesterday: when I create a sequence of days that starts before and ends in daylight saving time, I seem to lose a day:

> seq(from = strptime("20030329", format="%Y%m%d"), to = strptime("20030402", format="%Y%m%d"), by="DSTday")
[1] "2003-03-29 Westeuropäische Normalzeit" "2003-03-30 Westeuropäische Normalzeit"
[3] "2003-03-31 Westeuropäische Sommerzeit" "2003-04-01 Westeuropäische Sommerzeit"
> seq(from = strptime("20030329", format="%Y%m%d"), to = strptime("20030402", format="%Y%m%d"), by="day")
[1] "2003-03-29 00:00:00 Westeuropäische Normalzeit" "2003-03-30 00:00:00 Westeuropäische Normalzeit"
[3] "2003-03-31 01:00:00 Westeuropäische Sommerzeit" "2003-04-01 01:00:00 Westeuropäische Sommerzeit"

Again, my expectations might be wrong here, and there will be good reasons why I get this result (my OS again?). But considering all the subtle issues I have encountered so far, I can understand why some people suggested that it may be easier to use the chron or date package (especially if you are a beginner, have no prior experience with these things, and don't want to worry about time zones, DST, or the pitfalls of your OS). At least it was useful for me to cross-check the results I obtained with POSIX against the results from chron.

The POSIX classes are a great thing, but as they are much more powerful, they are also more complex and have more things to watch out for and more "traps" to fall into (for me at least ;-)).

-Heinrich.

> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, 22 November 2003 20:56
> To: RINNER Heinrich
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: [R] ISOdate() and strptime()
>
> Confirmation that this *is* an OS-specific problem: A professional
> implementation of the POSIX standard (Solaris) gets all of these correct.
>
> Your so-called OS lacks any implementation of strptime, so we borrowed one
> from glibc. Unfortunately, that is buggy, even to the extent that
>
> unclass(strptime("2003-22-20", format="%Y-%m-%d"))
> unclass(strptime("2003 22 20", format="%Y %m %d"))
>
> give different answers! (And RH8.0 gives the same answers as the
> substitute code used on R for Windows.)
>
> I believe Simon Fear owes the R-developers a public apology for his (not
> properly referenced in the archives) reply to this thread.
>
> BDR

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
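The one-hour shift in the by="day" output above is what one would expect if each step is a fixed 86400 seconds, because the local day on which DST starts is only 23 hours long. What follows is a minimal sketch of that arithmetic, not an account of what R 1.8.0 does internally. The zone name "Europe/Berlin" is an assumption standing in for the poster's Windows setting, and the sketch relies on the system's time zone database knowing that name. For pure calendar-day sequences, working in UTC avoids the shift.

```r
# Assumption: "Europe/Berlin" stands in for the poster's locale; DST began
# there on 2003-03-30.
tz <- "Europe/Berlin"
before <- as.POSIXct("2003-03-30 00:00:00", tz = tz)
after  <- as.POSIXct("2003-03-31 00:00:00", tz = tz)

# Length, in hours, of the local calendar day on which the clocks go forward.
hours_in_day <- as.numeric(difftime(after, before, units = "hours"))
print(hours_in_day)

# In UTC every day has 24 hours, so a day sequence keeps its midnight stamps
# and includes both end points.
utc_days <- seq(as.POSIXct("2003-03-29", tz = "UTC"),
                as.POSIXct("2003-04-02", tz = "UTC"), by = "day")
print(format(utc_days, "%Y-%m-%d"))
```

With a working time zone database, the first value should be 23 and the UTC sequence should contain all five days from 2003-03-29 to 2003-04-02.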
Re: [R] ISOdate() and strptime()
Confirmation that this *is* an OS-specific problem: A professional implementation of the POSIX standard (Solaris) gets all of these correct.

Your so-called OS lacks any implementation of strptime, so we borrowed one from glibc. Unfortunately, that is buggy, even to the extent that

unclass(strptime("2003-22-20", format="%Y-%m-%d"))
unclass(strptime("2003 22 20", format="%Y %m %d"))

give different answers! (And RH8.0 gives the same answers as the substitute code used on R for Windows.)

I believe Simon Fear owes the R-developers a public apology for his (not properly referenced in the archives) reply to this thread.

BDR

On Fri, 14 Nov 2003, Prof Brian Ripley wrote:

> On Fri, 14 Nov 2003, RINNER Heinrich wrote:
>
> > Is this considered to be a user error ("If you put garbage in, expect to get
> > garbage out"), or would it be safer to generally return NAs, as in
> > ISOdate(year=2003,month=2,day=35)?
>
> Expect to get the best guess at what you intended, and expect this to
> depend on your OS.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
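Since the result for an invalid date string depends on the strptime implementation, one way to get the same answer everywhere is to reject any input that does not survive a round trip through parsing and formatting. The sketch below assumes a current R; `parse_strict` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper (not part of base R): parse strictly, returning NA
# whenever the parsed value does not print back as the input string.
parse_strict <- function(x, fmt = "%Y-%m-%d") {
  parsed <- strptime(x, format = fmt, tz = "UTC")
  round_trip <- format(parsed, fmt)
  ok <- !is.na(round_trip) & round_trip == x
  out <- as.Date(parsed)
  out[!ok] <- NA            # silently normalised or unparsable input
  out
}

x <- c("2003-02-20", "2003-02-30", "2003-02-35", "2003-02-40", "2003-22-20")
res <- parse_strict(x)
print(res)
```

Only the first string should come back as a date. The check is deliberately strict: an unpadded input such as "2003-2-20" is also rejected, because it does not print back identically.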
Re: [R] ISOdate() and strptime()
I have followed with interest the discussion on date handling. I am no expert in these things; all I want to do is convert a character vector that has been read into R (and which may contain some erroneous dates) to a "date format", and then do some work with it [e.g., use it in a plot]. Classes "POSIXlt" and "POSIXct" seem fine to me - for example, they have very nice and useful "seq" and "plot" methods.

So now I have two more questions:

1. Is it only incomplete or erroneous dates that might be handled "differently" by ISOdate() or strptime()? Do correct specifications of year, month and day always give the same results, no matter where or who I am?

2. Can someone point me to a reference that helps me understand why R's (or the operating system's?) "best guess at what I intended" turns out to be the results in the examples I posted in my earlier mail?

Regards,
Heinrich.
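On the second question, one possible reading of the quoted results is that each numeric field is read digit by digit and reading stops as soon as another digit would push the value past the field's maximum. The sketch below is a hypothetical model written to match the outputs posted in this thread. It is not taken from the glibc or R sources, and `read_bounded` is an invented name.

```r
# Hypothetical model of a bounded field reader: consume digits only while
# the value can still fit under `hi`; fail (NULL) if the result is out of range.
read_bounded <- function(s, lo, hi, width = 2) {
  s <- sub("^ +", "", s)                 # skip leading blanks
  val <- 0
  n <- 0
  while (grepl("^[0-9]", s)) {
    val <- val * 10 + as.integer(substr(s, 1, 1))
    s <- substring(s, 2)
    n <- n + 1
    if (n >= width || val * 10 > hi) break
  }
  if (n == 0 || val < lo || val > hi) return(NULL)
  list(value = val, rest = s)
}

day40   <- read_bounded("40", 1, 31)   # stops after "4": 40 would exceed 31
day35   <- read_bounded("35", 1, 31)   # reads "35", which is out of range
month22 <- read_bounded("22", 1, 12)   # stops after the first "2"

# Reading blank-separated fields one after another from "22 20 12 0 0".
s <- "22 20 12 0 0"
ranges <- list(month = c(1, 12), day = c(1, 31), hour = c(0, 23), minute = c(0, 59))
parsed <- list()
for (nm in names(ranges)) {
  r <- read_bounded(s, ranges[[nm]][1], ranges[[nm]][2])
  parsed[[nm]] <- r$value
  s <- r$rest
}
print(unlist(parsed))
```

Under this model, day "40" is read as 4, day "35" fails, and month "22" is read as 2 with a stray "2" left over. That would explain why "2003-22-20" fails against a "-" separator, while the blank-separated fields yield month 2, day 2, hour 20, minute 12. This is consistent with the "2003-02-02 21:12:00" result posted earlier, if that value is a GMT time printed one hour ahead in local time.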
RE: [R] ISOdate() and strptime()
I think I do understand how difficult dates are. All I'm saying is that by adopting a "standard" that is OS dependent (and hence, almost by definition, OS varying) you make R behave differently on different OSs - and that is NOT "making R portable across multiple OSs".

This is a theoretical whinge. I'm not going to program it!

Please don't let me make too much of this anyway. For one thing, although it is not guaranteed, it seems that many OSs DO in fact behave identically. Also, it is only incomplete or erroneous dates that might be handled differently - and in most applications, one needs to pre-process incomplete date-times in R, rather than leave them to any default interpretation (even if that default was strictly fixed).

> -Original Message-
> From: Jason Turner [mailto:[EMAIL PROTECTED]]
> Sent: 15 November 2003 06:17
> To: Simon Fear
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] ISOdate() and strptime()
>
> Thomas Lumley wrote:
>
> > On Fri, 14 Nov 2003, Simon Fear wrote:
> >
> >> Is the behaviour of ISOdate() and strptime() determined by ISO
> >> or POSIX standard? Seems not to fit R's "no nannying" policy
> >> at all.
> >
> > It's determined by your operating system, so you're complaining to the
> > wrong people.
>
> And since R is written to be portable across multiple OSs, you might get
> an idea how tricky this becomes. Hence the "iron fist" approach to date
> handling. Believe me, I've programmed date handling - it's always a
> terrible, nasty, messy business when international locales and different
> operating systems clash. I'm stunned it's as good as it is, subtle
> traps and all.
>
> Cheers
>
> Jason

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
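The pre-processing mentioned above can be done on the year, month and day components before anything is handed to ISOdate() or strptime(), so that no OS-level guess is involved. A minimal sketch follows; `valid_ymd` is a hypothetical helper name, and it assumes the Gregorian leap-year rule.

```r
# Hypothetical helper: TRUE only for year/month/day triples that name a
# real calendar date (Gregorian leap-year rule assumed).
valid_ymd <- function(year, month, day) {
  days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
  leap <- (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
  ok_month <- !is.na(month) & month >= 1 & month <= 12
  # Look up the month length; use a dummy index where the month is invalid.
  max_day <- days_in_month[ifelse(ok_month, month, 1)]
  max_day <- ifelse(month == 2 & leap, 29, max_day)
  ok_month & !is.na(day) & day >= 1 & day <= max_day
}

# The five cases from the original post.
ok <- valid_ymd(2003, c(2, 2, 2, 2, 22), c(20, 30, 35, 40, 20))
print(ok)
```

Only the first case (20 February 2003) should pass. Rows that fail can then be set to NA or reported before any conversion takes place.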
Re: [R] ISOdate() and strptime()
Thomas Lumley wrote:

> On Fri, 14 Nov 2003, Simon Fear wrote:
>
>> Is the behaviour of ISOdate() and strptime() determined by ISO
>> or POSIX standard? Seems not to fit R's "no nannying" policy
>> at all.
>
> It's determined by your operating system, so you're complaining to the
> wrong people.

And since R is written to be portable across multiple OSs, you might get an idea how tricky this becomes. Hence the "iron fist" approach to date handling. Believe me, I've programmed date handling - it's always a terrible, nasty, messy business when international locales and different operating systems clash. I'm stunned it's as good as it is, subtle traps and all.

Cheers

Jason
RE: [R] ISOdate() and strptime()
On Fri, 14 Nov 2003, Simon Fear wrote:

> Is the behaviour of ISOdate() and strptime() determined by ISO
> or POSIX standard? Seems not to fit R's "no nannying" policy
> at all. Or maybe it's the future: in version 1.9 will I be able to
> type glm() and have R take a best guess at the model
> specification I had in mind?

It's determined by your operating system, so you're complaining to the wrong people.

R does not have enough information to work out time zones and daylight saving itself; it has to rely on the OS. as.date doesn't have this problem because it works only with whole days.

-thomas
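The point about whole days can be illustrated with a day-only class. The sketch below uses the Date class found in later releases of base R, which is a swapped-in stand-in and not the as.date() function mentioned above. Such a class stores only a day count, so no time zone or DST rule enters the calculation.

```r
# Whole-day arithmetic: no time zone or DST information is involved.
days <- seq(as.Date("2003-03-29"), as.Date("2003-04-02"), by = "day")
print(days)

# With an explicit format, an impossible calendar date comes back as NA.
impossible <- as.Date("2003-02-30", format = "%Y-%m-%d")
print(impossible)
```

The sequence should contain all five days, including the day of the DST switch and both end points.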
RE: [R] ISOdate() and strptime()
People who don't like this behaviour (and particularly those who dislike it as much as I do) should consider as.date() from the date package as an alternative. It gives you an NA if the specified date is impossible (at least in all the examples given earlier).

Is the behaviour of ISOdate() and strptime() determined by ISO or POSIX standard? Seems not to fit R's "no nannying" policy at all. Or maybe it's the future: in version 1.9 will I be able to type glm() and have R take a best guess at the model specification I had in mind?

> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]]
>
> Expect to get the best guess at what you intended, and expect this to
> depend on your OS.

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 69
Fax: +44 (0) 1379 65
email: [EMAIL PROTECTED]
web: http://www.synequanon.com
Re: [R] ISOdate() and strptime()
On Fri, 14 Nov 2003, RINNER Heinrich wrote:

> Is this considered to be a user error ("If you put garbage in, expect to get
> garbage out"), or would it be safer to generally return NAs, as in
> ISOdate(year=2003,month=2,day=35)?

Expect to get the best guess at what you intended, and expect this to depend on your OS.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] ISOdate() and strptime()
Dear R-people!

I am using R 1.8.0, under Windows XP.
While using ISOdate() and strptime(), I noticed the following behaviour when "wrong" arguments (e.g., months>12) are given to these functions:

> ISOdate(year=2003,month=2,day=20) #ok
[1] "2003-02-20 13:00:00 Westeuropäische Normalzeit"
> ISOdate(year=2003,month=2,day=30) #wrong day, but returns a value
[1] "2003-03-02 13:00:00 Westeuropäische Normalzeit"
> ISOdate(year=2003,month=2,day=35) #wrong day, and returns NA
[1] NA
> ISOdate(year=2003,month=2,day=40) #wrong day, but returns a value
[1] "2003-02-04 01:12:00 Westeuropäische Normalzeit"
> ISOdate(year=2003,month=22,day=20) #wrong month, but returns a value
[1] "2003-02-02 21:12:00 Westeuropäische Normalzeit"

And almost the same with strptime():

> strptime("2003-02-20", format="%Y-%m-%d")
[1] "2003-02-20"
> strptime("2003-02-30", format="%Y-%m-%d")
[1] "2003-03-02"
> strptime("2003-02-35", format="%Y-%m-%d")
[1] NA
> strptime("2003-02-40", format="%Y-%m-%d")
[1] "2003-02-04"
> strptime("2003-22-20", format="%Y-%m-%d")
[1] NA

Is this considered to be a user error ("If you put garbage in, expect to get garbage out"), or would it be safer to generally return NAs, as in ISOdate(year=2003,month=2,day=35)?

-Heinrich.