RE: [R] Fwd: strptime() problem? - Resolved

2004-08-22 Thread javier garcia - CEBAS
Hi Gabor and everybody;

Thanks Gabor, the alternative approach you suggested resolves the problem.
Comparing the two procedures:

Extract from the source 'character' data:

> rain$ts[2039:2046]
[1] "25/03/2000 22:00:00 UTC" "25/03/2000 23:00:00 UTC"
[3] "26/03/2000 00:00:00 UTC" "26/03/2000 01:00:00 UTC"
[5] "26/03/2000 02:00:00 UTC" "26/03/2000 03:00:00 UTC"
[7] "26/03/2000 04:00:00 UTC" "26/03/2000 05:00:00 UTC"

Proc 1. The 5th element of the resulting POSIXct series is out of sequence (it repeats the 6th)
-
> rain.strptime <- strptime(rain$ts, format="%d/%m/%Y %H:%M:%S")
> rain.strptime.ct <- as.POSIXct(rain.strptime, tz="GMT")
> rain.strptime.ct[2039:2046]
[1] "2000-03-25 23:00:00 CET"  "2000-03-26 00:00:00 CET"
[3] "2000-03-26 01:00:00 CET"  "2000-03-26 03:00:00 CEST"
[5] "2000-03-26 05:00:00 CEST" "2000-03-26 05:00:00 CEST"
[7] "2000-03-26 06:00:00 CEST" "2000-03-26 07:00:00 CEST"
> format(rain.strptime.ct[2039:2046], tz="GMT", usetz=TRUE)
[1] "2000-03-25 22:00:00 GMT" "2000-03-25 23:00:00 GMT"
[3] "2000-03-26 00:00:00 GMT" "2000-03-26 01:00:00 GMT"
[5] "2000-03-26 03:00:00 GMT" "2000-03-26 03:00:00 GMT"
[7] "2000-03-26 04:00:00 GMT" "2000-03-26 05:00:00 GMT"
> as.numeric(rain.strptime.ct[2039:2046])
[1] 954021600 954025200 954028800 954032400 954039600 954039600 954043200
[8] 954046800
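
One way to pinpoint where a series like this breaks is to difference the epoch seconds. The sketch below uses the eight values printed by as.numeric() above; the names secs, step_hours and bad_steps are only illustrative and do not come from the thread.

```r
# Epoch seconds copied from the as.numeric() output of Proc 1.
secs <- c(954021600, 954025200, 954028800, 954032400,
          954039600, 954039600, 954043200, 954046800)

# Hourly data should advance by exactly 3600 seconds at every step.
step_hours <- diff(secs) / 3600

# Positions of the steps that are not exactly one hour long.
bad_steps <- which(step_hours != 1)

print(step_hours)
print(bad_steps)
```

A two-hour step followed by a zero-hour step is the signature of one timestamp having been shifted forward by an hour, which is what happens to the 5th element here.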

Proc 2. The resulting POSIXct series is continuous, and it looks correct to me.
-
> rain.chron <- chron(substring(rain$ts,1,10), substring(rain$ts,12,19),
+                     format=c("d/m/y","h:m:s"))
> rain.chron.ct <- as.POSIXct(rain.chron, tz="GMT")
> rain.chron.ct[2039:2046]
[1] "2000-03-25 23:00:00 CET"  "2000-03-26 00:00:00 CET"
[3] "2000-03-26 01:00:00 CET"  "2000-03-26 03:00:00 CEST"
[5] "2000-03-26 04:00:00 CEST" "2000-03-26 05:00:00 CEST"
[7] "2000-03-26 06:00:00 CEST" "2000-03-26 07:00:00 CEST"
> format(rain.chron.ct[2039:2046], tz="GMT", usetz=TRUE)
[1] "2000-03-25 22:00:00 GMT" "2000-03-25 23:00:00 GMT"
[3] "2000-03-26 00:00:00 GMT" "2000-03-26 01:00:00 GMT"
[5] "2000-03-26 02:00:00 GMT" "2000-03-26 03:00:00 GMT"
[7] "2000-03-26 04:00:00 GMT" "2000-03-26 05:00:00 GMT"
> as.numeric(rain.chron.ct[2039:2046])
[1] 954021600 954025200 954028800 954032400 954036000 954039600 954043200
[8] 954046800


For me the problem is resolved by means of the 'chron' package, and it is as
direct as the first procedure. Just as a comment, I think that, to behave
properly, the first procedure should give the same result. Shouldn't
it?
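
For readers on a current R release, a base-R route may also give the continuous result, because as.POSIXct() accepts a tz argument when parsing character data. This is a sketch under that assumption and not something tested on the R 1.9.1 used in this thread; the vector ts_chr simply retypes the extract shown above.

```r
# The eight timestamps from the extract, retyped as a character vector.
ts_chr <- c("25/03/2000 22:00:00 UTC", "25/03/2000 23:00:00 UTC",
            "26/03/2000 00:00:00 UTC", "26/03/2000 01:00:00 UTC",
            "26/03/2000 02:00:00 UTC", "26/03/2000 03:00:00 UTC",
            "26/03/2000 04:00:00 UTC", "26/03/2000 05:00:00 UTC")

# Keep only the date-time part and declare the zone while parsing,
# so the local daylight-saving rules never come into play.
ct <- as.POSIXct(substr(ts_chr, 1, 19),
                 format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

secs <- as.numeric(ct)
print(format(ct, usetz = TRUE))
print(diff(secs))   # every step should be 3600 seconds
```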

Thanks all and best regards.
---
---

On Tue 17 Aug 2004 20:02, javier garcia - CEBAS wrote:
 --  Forwarded message  --

 Subject: RE: [R] Fwd: strptime() problem?
 Date: Tue, 17 Aug 2004 11:57:46 -0400 (EDT)
 From: Gabor Grothendieck [EMAIL PROTECTED]
 To: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]

 I am in a different time zone, EDT, on Windows XP and can't
 replicate this but you might try reading the latest R News
 article on dates and times for some ideas, viz. page 32 of:

http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pd

 In particular, try converting them to chron and then doing
 your manipulations in chron or else convert them from chron to
 POSIXct:

require(chron)
r.asc <- raincida$ts
r.chron <- chron(substring(r.asc, 1, 10),
  substring(r.asc, 12, 19), format = c("d/m/y", "h:m:s"))

r.ct <- as.POSIXct(r.chron)
format(r.ct, tz="GMT")   # display POSIXct in GMT

 Date: Tue, 17 Aug 2004 16:25:12 +0200
 From: javier garcia - CEBAS [EMAIL PROTECTED]
 To:   Prof Brian Ripley [EMAIL PROTECTED], [EMAIL PROTECTED]
 Subject:  [R] Fwd: strptime() problem?

 Hi all;
 I've already sent a similar e-mail to the list and Prof. Brian Ripley
 answered me, but my doubts remain unresolved. Thanks for the clarification,
 but perhaps I wasn't clear enough in posting my questions.

 I've got a postgres database which I read into R. The first column is
 timestamp with time zone, and my data are already in UTC. A printed
 extract of the R character column resulting from the timestamptz
 field is:

 raincida$ts:

 [2039] "25/03/2000 22:00:00 UTC" "25/03/2000 23:00:00 UTC"
 [2041] "26/03/2000 00:00:00 UTC" "26/03/2000 01:00:00 UTC"
 [2043] "26/03/2000 02:00:00 UTC" "26/03/2000 03:00:00 UTC"
 [2045] "26/03/2000 04:00:00 UTC" "26/03/2000 05:00:00 UTC"

 # And I need to convert this character column into POSIXct for later work.
 # As far as I can see in the documentation, the process is to use
 # strptime(), which creates a POSIXlt object and doesn't allow specifying
 # that the time zone of the data is already UTC, followed by as.POSIXct():

 > lluvia.strptime <- strptime(raincida$ts, format="%d/%m/%Y %H:%M:%S")
 > lluvia.strptime.POSIXct <- as.POSIXct(lluvia.strptime, tz="GMT")

 A printed extract is:

 [2039] "2000-03-25 22:00:00 GMT" "2000-03-25 23:00:00 GMT"
 [2041] "2000-03-26 00:00:00 GMT" "2000-03-26 01:00:00 GMT"
 [2043] "2000-03-26 03:00:00 GMT" "2000-03-26 03:00:00 GMT"
 [2045] "2000-03-26 04:00:00 GMT" "2000-03-26 05:00:00 GMT"

RE: [R] Fwd: strptime() problem? - Resolved

2004-08-22 Thread Gabor Grothendieck

Hi,

Unfortunately, in my time zone I cannot reproduce your problem.
For example,

> rain
[1] "25/03/2000 22:00:00 UTC" "25/03/2000 23:00:00 UTC"
[3] "26/03/2000 00:00:00 UTC" "26/03/2000 01:00:00 UTC"
[5] "26/03/2000 02:00:00 UTC" "26/03/2000 03:00:00 UTC"
[7] "26/03/2000 04:00:00 UTC" "26/03/2000 05:00:00 UTC"
> str(rain)
 chr [1:8] "25/03/2000 22:00:00 UTC" "25/03/2000 23:00:00 UTC" ...
> rain.lt <- strptime(rain, format="%d/%m/%Y %H:%M:%S")
> rain.lt
[1] "2000-03-25 22:00:00" "2000-03-25 23:00:00" "2000-03-26 00:00:00"
[4] "2000-03-26 01:00:00" "2000-03-26 02:00:00" "2000-03-26 03:00:00"
[7] "2000-03-26 04:00:00" "2000-03-26 05:00:00"
> rain.ct <- as.POSIXct(rain.lt, tz="GMT")
> rain.ct
[1] "2000-03-25 22:00:00 GMT" "2000-03-25 23:00:00 GMT"
[3] "2000-03-26 00:00:00 GMT" "2000-03-26 01:00:00 GMT"
[5] "2000-03-26 02:00:00 GMT" "2000-03-26 03:00:00 GMT"
[7] "2000-03-26 04:00:00 GMT" "2000-03-26 05:00:00 GMT"
> R.version.string  # Windows XP
[1] "R version 1.9.1, 2004-08-03"

Without being able to reproduce it, it's difficult for me to
figure out what is wrong.  It seems to be ignoring the tz=
argument in the conversion to POSIXct.  I mentioned in my
recent R News article that it sometimes seems to ignore this
argument, but I have never attempted to track down
this behavior further.

What I can say is that, in general,
I have found that I wasted a lot of time on subtle aspects
related to time zones even when my underlying problem actually
had nothing to do with time zones. To avoid all
those difficulties I converted all my software from POSIXt
to chron (which does not use time zones in the first place
and so cannot run into such problems), and I provided a table
in the latest R News showing the translation of some idioms.
This solved everything for me.
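
Another way to keep local time zone rules out of the picture, different from the chron route described above and not something proposed in this thread, is to run the whole session in UTC. This is a minimal sketch, assuming a platform where R picks up changes to the TZ environment variable during a session; the variable names are illustrative.

```r
# Remember the current setting so it can be restored afterwards.
old_tz <- Sys.getenv("TZ", unset = NA)
Sys.setenv(TZ = "UTC")

# With the session in UTC, the two-step strptime()/as.POSIXct() route
# has no daylight-saving transition to trip over.
lt <- strptime("26/03/2000 02:00:00", format = "%d/%m/%Y %H:%M:%S")
secs <- as.numeric(as.POSIXct(lt))
print(secs)

# Restore the previous time zone setting.
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

The trade-off is that every date-time in the session is then handled as UTC, which suits data stored in UTC but not code that also needs local times.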


Date:   Wed, 18 Aug 2004 11:20:01 +0200
From:   javier garcia - CEBAS [EMAIL PROTECTED]
To: Gabor Grothendieck [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject:RE: [R] Fwd: strptime() problem? - Resolved


Re: [R] Fwd: strptime() problem?

2004-08-17 Thread Gabor Grothendieck
javier garcia - CEBAS <rn001 at cebas.csic.es> writes:

: 
: Hi all;
: I've already sent a similar e-mail to the list and Prof. Brian Ripley
: answered me, but my doubts remain unresolved. Thanks for the clarification,
: but perhaps I wasn't clear enough in posting my questions.
: 
: I've got a postgres database which I read into R. The first column is
: timestamp with time zone, and my data are already in UTC. A printed
: extract of the R character column resulting from the timestamptz field is:
: 
: raincida$ts:
: 
:  [2039] "25/03/2000 22:00:00 UTC" "25/03/2000 23:00:00 UTC"
:  [2041] "26/03/2000 00:00:00 UTC" "26/03/2000 01:00:00 UTC"
:  [2043] "26/03/2000 02:00:00 UTC" "26/03/2000 03:00:00 UTC"
:  [2045] "26/03/2000 04:00:00 UTC" "26/03/2000 05:00:00 UTC"
: 
: # And I need to convert this character column into POSIXct for later work.
: # As far as I can see in the documentation, the process is to use
: # strptime(), which creates a POSIXlt object and doesn't allow specifying
: # that the time zone of the data is already UTC, followed by as.POSIXct():
: 
: > lluvia.strptime <- strptime(raincida$ts, format="%d/%m/%Y %H:%M:%S")
: > lluvia.strptime.POSIXct <- as.POSIXct(lluvia.strptime, tz="GMT")
: 
: A printed extract is:
: 
:  [2039] "2000-03-25 22:00:00 GMT" "2000-03-25 23:00:00 GMT"
:  [2041] "2000-03-26 00:00:00 GMT" "2000-03-26 01:00:00 GMT"
:  [2043] "2000-03-26 03:00:00 GMT" "2000-03-26 03:00:00 GMT"
:  [2045] "2000-03-26 04:00:00 GMT" "2000-03-26 05:00:00 GMT"
: 
: As we can see, elements [2043] differ. Shouldn't they match, like the rest
: of the elements shown? I thought this was a bug, but it seems that I've
: made a conceptual error (?). This happens several times in my data, and
: causes errors later on.
: 
: Please, how can I resolve this?

[Sorry if this gets posted twice.  I had a problem posting and
not sure if the first one ever got sent.]

I am in a different time zone, EDT, on Windows XP and can't
replicate this but you might try reading the latest R News
article on dates and times for some ideas, viz. page 32 of:

   http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pd

In particular, try converting the datetimes to chron and then doing 
your manipulations in chron or else converting them from chron to 
POSIXct rather than going through POSIXlt:

   require(chron)
   r.asc <- raincida$ts
   r.chron <- chron(substring(r.asc, 1, 10),
     substring(r.asc, 12, 19), format = c("d/m/y", "h:m:s"))

   r.ct <- as.POSIXct(r.chron)
   format(r.ct, tz="GMT") # display in GMT

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Fwd: strptime() problem?

2004-08-17 Thread Whit Armstrong
Javier,

I recently had a problem with dates.  This example might shed some light on
your problem.

> x <- ISOdate(rep(2000,2),rep(3,2),rep(26,2),hour=0)
> x

[1] "2000-03-26 GMT" "2000-03-26 GMT"

> unclass(x)

[1] 954028800 954028800
attr(,"tzone")
[1] "GMT"

When one creates a date with ISOdate, the resulting object is of class
POSIXct and is given the attribute tzone which is set to GMT.

When one prints an object of class POSIXct the function print.POSIXct is
called:
> print.POSIXct
function (x, ...) 
{
    print(format(x, usetz = TRUE, ...), ...)
    invisible(x)
}
<environment: namespace:base>
 

So, that function is just calling format which gets dispatched to
format.POSIXct:

> format.POSIXct
function (x, format = "", tz = "", usetz = FALSE, ...) 
{
    if (!inherits(x, "POSIXct")) 
        stop("wrong class")
    if (missing(tz) && !is.null(tzone <- attr(x, "tzone"))) 
        tz <- tzone
    structure(format.POSIXlt(as.POSIXlt(x, tz), format, usetz, 
        ...), names = names(x))
}
<environment: namespace:base>
 

Now, if you look carefully at this code, you will see that it tests for the
attribute tzone on the object that is passed in.  If it finds that
attribute, its value is used as tz for the conversion whose result goes to
format.POSIXlt (the function that ultimately does the formatting).  If there
is no tzone attribute, then tz stays "" and the object is
printed in your local time zone.

See:

> attr(x,"tzone") <- ""
> x
[1] "2000-03-25 19:00:00 Eastern Standard Time"
[2] "2000-03-25 19:00:00 Eastern Standard Time"
> attr(x,"tzone") <- "GMT"
> x
[1] "2000-03-26 GMT" "2000-03-26 GMT"
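
The same point can be checked without relying on the local time zone of whoever runs it: the tzone attribute only changes how the value is displayed, never the stored number of seconds. A small sketch, with illustrative variable names:

```r
x <- ISOdate(2000, 3, 26, hour = 0)    # POSIXct carrying tzone = "GMT"
with_tz <- format(x, "%Y-%m-%d %H:%M:%S", usetz = TRUE)

y <- x
attr(y, "tzone") <- NULL               # drop the attribute on a copy
# Without the attribute, formatting falls back to the session time zone.
without_tz <- format(y, "%Y-%m-%d %H:%M:%S", usetz = TRUE)

print(with_tz)
print(without_tz)
print(as.numeric(x) == as.numeric(y))  # the underlying instant is unchanged
```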
 

Now this is the part that really got me confused:

> x
[1] "2000-03-26 GMT" "2000-03-26 GMT"
> x[1]
[1] "2000-03-25 19:00:00 Eastern Standard Time"
 

What happens in the above case is that the code for [.POSIXct looks like
this:

> get("[.POSIXct")
function (x, ..., drop = TRUE) 
{
    cl <- oldClass(x)
    class(x) <- NULL
    val <- NextMethod("[")
    class(val) <- cl
    val
}
<environment: namespace:base>
 

The attribute tzone is not preserved!!  When val is created from the
call to NextMethod, its class is restored, but not its tzone attribute.
So any dates of class POSIXct that are printed after they have been
subscripted ([) will have their tzone attribute stripped, and will print
in the local time zone.
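
A small wrapper can guard against this by copying the attribute back after subscripting. The helper name subset_keep_tzone is hypothetical, not part of base R, and as far as I know current R releases already preserve tzone when subscripting, so this mainly matters for old versions such as the one shown here.

```r
# Hypothetical helper: subscript a POSIXct vector and re-attach its tzone.
subset_keep_tzone <- function(x, i) {
  tz <- attr(x, "tzone")   # remember the time zone before subscripting
  val <- x[i]
  attr(val, "tzone") <- tz # put it back in case "[" dropped it
  val
}

x <- ISOdate(rep(2000, 2), rep(3, 2), rep(26, 2), hour = 0)
first <- subset_keep_tzone(x, 1)
print(attr(first, "tzone"))
print(first)
```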

For your specific case, I would convert all my dates to POSIXct, then set
the attribute tzone to "GMT".  After that, be very careful when
subscripting them, or you will find them printing in the local time zone
again.

For you:
> y <- strptime("4/3/2000",format="%m/%d/%Y")
> y
[1] "2000-04-03"
> y <- as.POSIXct(y,"GMT")
> y
[1] "2000-04-03 GMT"
> unclass(y)
[1] 954720000
attr(,"tzone")
[1] "GMT"
 

I think that should straighten out your problem.

Hope that helps,
Whit
