If it were simply deprecated and then changed, everyone using it would get a warning during the deprecation period, so it would not be so bad. Given that its current behavior is not very useful, I suspect it's not widely used anyway. I haven't followed the whole discussion, so sorry if these points have already been made.
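To illustrate what I mean by a deprecation period, here is a minimal sketch. `transitional_length` is a hypothetical name of my own, not an existing R function; in a real transition something like its body would be installed as the length method for POSIXlt. It assumes the time points can be counted from the `sec` component, as in the `length(x$sec)` suggestion quoted below.

```r
# Sketch only: 'transitional_length' is a hypothetical helper, not part of R.
# It returns the number of time points but warns when that differs from the
# number of list components, so code relying on the old answer can be found.
transitional_length <- function(x) {
  stopifnot(inherits(x, "POSIXlt"))
  n_components <- length(unclass(x))      # length of the underlying list
  n_points     <- length(unclass(x)$sec)  # assumption: one 'sec' entry per time point
  if (n_components != n_points) {
    warning("length() of a POSIXlt object is changing from ", n_components,
            " (list components) to ", n_points, " (time points)",
            call. = FALSE)
  }
  n_points
}

x <- as.POSIXlt(c("2005-06-13 12:01:50", "2005-06-13 12:10:00"), tz = "UTC")
n <- suppressWarnings(transitional_length(x))  # number of time points in x
```

During the transition the warning would point users at every call whose result is about to change; once the period ends, the warning is dropped and only the new value remains.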
On Dec 15, 2007 5:17 PM, Martin Maechler <[EMAIL PROTECTED]> wrote:
> >>>>> "TP" == Tony Plate <[EMAIL PROTECTED]>
> >>>>>     on Fri, 14 Dec 2007 13:58:30 -0700 writes:
>
>
> TP> Duncan Murdoch wrote:
> >> On 12/13/2007 1:59 PM, Tony Plate wrote:
> >>> Duncan Murdoch wrote:
> >>>> On 12/11/2007 6:20 AM, [EMAIL PROTECTED] wrote:
> >>>>> Full_Name: Petr Simecek
> >>>>> Version: 2.5.1, 2.6.1
> >>>>> OS: Windows XP
> >>>>> Submission from: (NULL) (195.113.231.2)
> >>>>>
> >>>>>
> >>>>> Several times I have experienced that a length of a POSIXt vector
> >>>>> has not been computed right.
> >>>>>
> >>>>> Example:
> >>>>>
> >>>>> tv<-structure(list(sec = c(50, 0, 55, 12, 2, 0, 37, NA, 17, 3, 31
> >>>>> ), min = c(1L, 10L, 11L, 15L, 16L, 18L, 18L, NA, 20L, 22L, 22L
> >>>>> ), hour = c(12L, 12L, 12L, 12L, 12L, 12L, 12L, NA, 12L, 12L, 12L),
> >>>>> mday = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, NA, 13L, 13L, 13L), mon
> >>>>> = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, NA, 5L, 5L, 5L), year = c(105L,
> >>>>> 105L, 105L, 105L, 105L, 105L, 105L, NA, 105L, 105L, 105L), wday =
> >>>>> c(1L, 1L, 1L, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L), yday = c(163L, 163L,
> >>>>> 163L, 163L, 163L, 163L, 163L, NA, 163L, 163L, 163L), isdst = c(1L,
> >>>>> 1L, 1L, 1L, 1L, 1L, 1L, -1L, 1L, 1L, 1L)), .Names = c("sec", "min",
> >>>>> "hour", "mday", "mon", "year", "wday", "yday", "isdst"
> >>>>> ), class = c("POSIXt", "POSIXlt"))
> >>>>>
> >>>>> print(tv)
> >>>>> # prints 11 time points (right)
> >>>>>
> >>>>> length(tv)
> >>>>> # returns 9 (wrong)
> >>>>
> >>>> tv is a list of length 9. The answer is right, your expectation is
> >>>> wrong.
> >>>>> I have tried that on several computers with/without switching to
> >>>>> English locales, i.e. Sys.setlocale("LC_TIME", "en"). I have searched
> >>>>> the help pages but I cannot imagine how that could be OK.
> >>>>
> >>>> See this in ?POSIXt:
> >>>>
> >>>> Class '"POSIXlt"' is a named list of vectors...
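
To make the quoted help text concrete, here is a small sketch of how one might look at the list underneath a POSIXlt object. The variable names are mine, and the set of components depends on the R version (nine in the report above), so I show no output.

```r
# Sketch: inspect the named list that underlies a POSIXlt object.
x <- as.POSIXlt(c("2005-06-13 12:01:50", "2005-06-13 12:10:00", NA), tz = "UTC")

parts <- unclass(x)               # drop the class to see the plain named list
component_names <- names(parts)   # "sec", "min", "hour", ... (set depends on R version)
n_components <- length(parts)     # the kind of number length(tv) reported above
n_points <- length(parts$sec)     # one entry per time point, NA included

print(component_names)
print(c(components = n_components, points = n_points))
```

The two numbers differ because one counts the fields of the list and the other counts the time points stored in each field.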
> >>>>
> >>>> You could define your own length measurement as
> >>>>
> >>>> length.POSIXlt <- function(x) length(x$sec)
> >>>>
> >>>> and you'll get the answer you expect, but be aware that length.XXX
> >>>> methods are quite rare, and you may surprise some of your users.
> >>>>
> >>>
> >>> On the other hand, isn't the fact that length() currently always
> >>> returns 9 for POSIXlt objects likely to be a surprise to many users
> >>> of POSIXlt?
> >>>
> >>> The back of "The New S Language" says "Easy-to-use facilities allow
> >>> you to organize, store and retrieve all sorts of data. ... S
> >>> functions and data organization make applications easy to write."
> >>>
> >>> Now, POSIXlt has methods for c() and vector subsetting "[" (and many
> >>> other vector-manipulation methods - see methods(class="POSIXlt")).
> >>> Hence, from the point of view of intending to supply "easy-to-use
> >>> facilities ... [for] all sorts of data", isn't it a little
> >>> incongruous that length() is not also provided -- since these 3
> >>> functions (any others?) comprise a core set of vector-manipulation
> >>> functions?
> >>>
> >>> Would it make sense to have an informal prescription (e.g., in
> >>> R-exts) that a class that implements a vector-like object and
> >>> provides at least one of the functions 'c', '[' and 'length' should
> >>> provide all three?
> >>> It would also be easy to describe a test-suite
> >>> that should be included in the 'tests' directory of a package
> >>> implementing such a class, that had some tests of the basic
> >>> vector-manipulation functionality, such as:
> >>>
> >>> > # at this point, x0, x1, x3, & x10 should exist, as vectors of the
> >>> > # class being tested, of length 0, 1, 3, and 10, and they should
> >>> > # contain no duplicate elements
> >>> > length(x0)
> >>> [1] 0
> >>> > length(c(x0, x1))
> >>> [1] 1
> >>> > length(c(x1,x10))
> >>> [1] 11
> >>> > all(x3 == x3[seq(len=length(x3))])
> >>> [1] TRUE
> >>> > all(x3 == c(x3[1], x3[2], x3[3]))
> >>> [1] TRUE
> >>> > length(c(x3[2], x10[5:7]))
> >>> [1] 4
> >>> >
> >>>
> >>> It would also be possible to describe a larger set of vector
> >>> manipulation functions that should be implemented together, including
> >>> e.g., 'rep', 'unique', 'duplicated', '==', 'sort', '[<-', 'is.na',
> >>> 'head', 'tail' ... (many of which are provided for POSIXlt).
> >>>
> >>> Or is there some good reason that length() cannot be provided (while
> >>> 'c' and '[' can) for some vector-like classes such as "POSIXlt"?
> >>
> >> What you say sounds good in general, but the devil is in the details.
> >> Changing the meaning of length(x) for some objects has fairly
> >> widespread effects. Are they all positive? I don't know.
> >>
> >> Adding a prescription like the one you suggest would be good if it's
> >> easy to implement, but bad if it's already widely violated. How many
> >> base or CRAN or Bioconductor packages violate it currently? Do the
> >> ones that provide all 3 methods do so in a consistent way, i.e. does
> >> "length(x)" mean the same thing in all of them?
> TP> I'm not sure doing something like this would be so bad even if it is
> TP> already widely violated. R has evolved significantly over time, and
> TP> many rough edges have been cleaned up, sometimes in ways that were not
> TP> backward compatible.
> TP> This is a great thing & my thanks go to the people
> TP> working on R.
>
> TP> If some base or CRAN or Bioconductor packages currently don't implement
> TP> vector operations consistently, wouldn't it be good to know that?
> TP> Wouldn't it be useful to have an automatic way of determining whether a
> TP> particular vector-like class is consistent with a generally agreed set of
> TP> principles for how basic vector operations should work -- things like
> TP> length(x)+length(y)==length(c(x,y))? This could help developers check,
> TP> document & improve their code, and it could help users understand how to
> TP> use a class, and to evaluate the software quality of a class
> TP> implementation and whether or not it provides the functionality they need.
> >> I agree that the current state is less than perfect, but making it
> >> better would really be a lot of work. I suspect there are better ways
> >> to spend my time, so I'm not going to volunteer to do it. I'm not
> >> even going to invite someone else to do it, or offer to review your
> >> work if you volunteer. I think this falls into the class of "next
> >> time we write a language, let's handle this better" problems.
>
> TP> Thanks very much for the thoughtful (and honest) feedback! I suspect
> TP> that the current state could be improved with just a little work, and
> TP> without forcing anyone to do any work they don't want to do. I'll think
> TP> about this more and try to come back with a better & more concrete
> TP> suggestion.
>
> Good. From "the outside" (i.e. superficial gut feeling :-)
> I've sympathized with your suggestion, Tony, quite a bit.
> Further, my own taste would probably also have led me to define
> length.POSIXlt differently ..
> OTOH, I agree with Duncan that it may be too late to change it
> and even more to enforce the consistency rules you propose.
> If with a small bit of code (and some patience) we could check
> all of CRAN and hopefully Bioconductor packages and find only a
> very few where it was violated, the whole endeavor may be worth it
> ... for the sake of making R more consistent, easier to teach, etc.
>
> Unfortunately I don't remember now what happened many months ago
> when I indeed did experiment with having something like
>
> length.POSIXlt <- function(x) length(x$sec)
>
> Martin Maechler
>
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
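
As a rough idea of the "small bit of code" Martin mentions, the checks Tony lists could be wrapped in one function. This is only a sketch: `check_vector_like` is a hypothetical name, the checks are just the ones from the quoted test outline plus the length(x)+length(y)==length(c(x,y)) rule, and it assumes the class supports c(), "[" and "==" and that the test vectors contain no NA.

```r
# Sketch only: 'check_vector_like' is a hypothetical helper, not an R function.
# x0, x1, x3, x10 are vectors of the class under test with lengths 0, 1, 3, 10
# and no NA elements.
check_vector_like <- function(x0, x1, x3, x10) {
  c(
    len_empty    = length(x0) == 0L,
    len_concat   = length(c(x1, x10)) == length(x1) + length(x10),
    len_concat0  = length(c(x0, x1)) == length(x1),
    subset_all   = all(x3 == x3[seq_along(x3)]),
    rebuild      = all(x3 == c(x3[1], x3[2], x3[3])),
    subset_slice = length(c(x3[2], x10[5:7])) == 4L
  )
}

# Usage with Date, a class that is expected to behave like a vector
d <- as.Date("2007-12-01") + 0:9
results <- check_vector_like(d[0], d[1], d[1:3], d)
print(results)
```

Returning a named logical vector rather than stopping at the first failure makes it easy to tabulate which rules a given class breaks when running over many packages.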