Re: [Rd] [bug] droplevels() also drop object attributes (comment…)

2017-06-14 Thread Suharto Anggono Suharto Anggono via R-devel
In R devel r72789, the added part in 'factor' documentation (factor.Rd) is the 
following.
Undocumentedly for a long time, \code{factor(x)} loses all 
\code{\link{attributes}(x)} but \code{"names"}, and resets \code{"levels"} and 
\code{"class"}.

In the code of function 'factor', names(x) is copied to the result. As I 
mentioned before, names(x) is _not_ "names" attribute of 'x' when 'x' is a 
"POSIXlt" object. In R devel r72789,
factor(x)
is successful when 'x' is a "POSIXlt" object.

I think, it is better to accurately state what the code does, maybe like this.
Undocumentedly for a long time, \code{factor(x)} loses all 
\code{\link{attributes}(x)}, but has a copy of \code{\link{names}(x)}.

Attributes "levels" and "class" are already documented right before the 
statement.


To be more balanced, I am pointing out that, currently, levels replacement of a 
factor ('levels<-.factor') keeps attributes. My previous statement about 
"contrasts" attribute also holds there. By replacing levels, number of levels 
can change and, consequently, the original contrasts matrix is no longer valid. 
It can be argued that 'levels<-.factor' doesn't know "contrasts" attribute, as 
function 'contrasts' is in package stats, different from 'levels<-.factor' that 
is in package base. However, factor subsetting ('[.factor') knows "contrasts" 
attribute.

--------------------
On Sat, 10/6/17, Martin Maechler <maech...@stat.math.ethz.ch> wrote:

 Subject: Re: [Rd] [bug] droplevels() also drop object attributes (comment…)
 To: "R Development" <R-devel@r-project.org>
 Cc: "Serge Bibauw" <sbib...@gmail.com>, "Suharto Anggono" >>>> Suharto Anggono Suharto Anggono via R-devel 
>>>>>     on Thu, 8 Jun 2017 16:43:48 + writes:

    > * Be careful with "contrasts" attribute. If the number of levels is 
reduced, the original contrasts matrix is no longer valid.
    > Example case:
    > x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
    > contrasts(x) <- contr.treatment(levels(x), contrasts=FALSE)[, -2, 
drop=FALSE]
    > droplevels(x)

    > * If function 'factor' is changed, make sure that as.factor(x) and 
factor(x) is the same for 'x' where is.integer(x) is TRUE. Currently, 
as.factor() is treated specially.

    > * It is possible that names(x) is not attr(x, "names"). For example, 'x' 
is a "POSIXlt" object.
    > Look at this example, which works in R 3.3.2.
    > x <- as.POSIXlt("2017-01-01", tz="UTC")
    > factor(x, levels=x)


    > By the way, in NEWS, in "CHANGES IN R 3.4.0", in "SIGNIFICANT 
USER-VISIBLE CHANGES", there is "factor() now uses order() to sort its levels". 
It is false. Code of function 'factor' in R 3.4.0 
(https://svn.r-project.org/R/tags/R-3-4-0/src/library/base/R/factor.R) still 
uses 'sort.list', not 'order'.

    > 
>>>>> Martin Maechler 
>>>>>     on Tue, 16 May 2017 11:01:23 +0200 writes:

>>>>> Serge Bibauw 
>>>>>     on Mon, 15 May 2017 11:59:32 -0400 writes:

    >>> Hi,

    >>> Just reporting a small bug… not really a big deal, but I
    >>> don’t think that is intended: droplevels() also drops all
    >>> object’s attributes.

    >> Yes.  The help page for droplevels (or the simple
    >> definition of 'droplevels.factor') clearly indicate that
    >> the method for factors is really just a call to factor(x,
    >> exclude = *)

    >> and that _is_ quite an important base function whose
    >> semantic should not be changed lightly. Still, let's
    >> continue :

    >> Looking a bit, I see that the current behavior of factor()
    >> {and hence droplevels} has been unchanged in this respect
    >> for the whole history of R, well, at least for more than
    >> 17 years (R 1.0.1, April 2000).

    >> I'd agree there _is_ a bug, at least in the documentation
    >> which does *not* mention that currently, all attributes
    >> are dropped but "names", "levels" (and "class").

    >> OTOH, factor() would only need a small change to make it
    >> preserve all attributes (but "class" and "levels" which
    >> are set explicitly).

    >> I'm sure this will break some checks in some packages.  Is
    >> it worth it?

    >> e.g., our own R  QC checks currently check (the printing of) the
    >> following (in tests/reg-tests-2.R ):

    >> > ## some te

Re: [Rd] [bug] droplevels() also drop object attributes (comment…)

2017-06-09 Thread Martin Maechler
> Suharto Anggono Suharto Anggono via R-devel 
> on Thu, 8 Jun 2017 16:43:48 + writes:

> * Be careful with "contrasts" attribute. If the number of levels is 
reduced, the original contrasts matrix is no longer valid.
> Example case:
> x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
> contrasts(x) <- contr.treatment(levels(x), contrasts=FALSE)[, -2, 
drop=FALSE]
> droplevels(x)

> * If function 'factor' is changed, make sure that as.factor(x) and 
factor(x) is the same for 'x' where is.integer(x) is TRUE. Currently, 
as.factor() is treated specially.

> * It is possible that names(x) is not attr(x, "names"). For example, 'x' 
is a "POSIXlt" object.
> Look at this example, which works in R 3.3.2.
> x <- as.POSIXlt("2017-01-01", tz="UTC")
> factor(x, levels=x)


> By the way, in NEWS, in "CHANGES IN R 3.4.0", in "SIGNIFICANT 
USER-VISIBLE CHANGES", there is "factor() now uses order() to sort its levels". 
It is false. Code of function 'factor' in R 3.4.0 
(https://svn.r-project.org/R/tags/R-3-4-0/src/library/base/R/factor.R) still 
uses 'sort.list', not 'order'.

> 
> Martin Maechler 
> on Tue, 16 May 2017 11:01:23 +0200 writes:

> Serge Bibauw 
> on Mon, 15 May 2017 11:59:32 -0400 writes:

>>> Hi,

>>> Just reporting a small bug… not really a big deal, but I
>>> don’t think that is intended: droplevels() also drops all
>>> object’s attributes.

>> Yes.  The help page for droplevels (or the simple
>> definition of 'droplevels.factor') clearly indicate that
>> the method for factors is really just a call to factor(x,
>> exclude = *)

>> and that _is_ quite an important base function whose
>> semantic should not be changed lightly. Still, let's
>> continue :

>> Looking a bit, I see that the current behavior of factor()
>> {and hence droplevels} has been unchanged in this respect
>> for the whole history of R, well, at least for more than
>> 17 years (R 1.0.1, April 2000).

>> I'd agree there _is_ a bug, at least in the documentation
>> which does *not* mention that currently, all attributes
>> are dropped but "names", "levels" (and "class").

>> OTOH, factor() would only need a small change to make it
>> preserve all attributes (but "class" and "levels" which
>> are set explicitly).

>> I'm sure this will break some checks in some packages.  Is
>> it worth it?

>> e.g., our own R  QC checks currently check (the printing of) the
>> following (in tests/reg-tests-2.R ):

>> > ## some tests of factor matrices
>> > A <- factor(7:12)
>> > dim(A) <- c(2, 3)
>> > A
>> [,1] [,2] [,3]
>> [1,] 7911  
>> [2,] 810   12  
>> Levels: 7 8 9 10 11 12
>> > str(A)
>> factor [1:2, 1:3] 7 8 9 10 ...
>> - attr(*, "levels")= chr [1:6] "7" "8" "9" "10" ...
>> > A[, 1:2]
>> [,1] [,2]
>> [1,] 79   
>> [2,] 810  
>> Levels: 7 8 9 10 11 12
>> > A[, 1:2, drop=TRUE]
>> [1] 7  8  9  10
>> Levels: 7 8 9 10
>> 
>> with the proposed change to factor(),
>> the last call would change its result:
>> 
>> > A[, 1:2, drop=TRUE]
>> [,1] [,2]
>> [1,] 79   
>> [2,] 810  
>> Levels: 7 8 9 10

>> because 'drop=TRUE' calls factor(..) and that would also
>> preserve the "dim" attribute.  I would think that the
>> changed behavior _is_ better, and is also according to
>> documentation, because the help page for [.factor explains
>> that 'drop = TRUE' drops levels, but _not_ that it
>> transforms a factor matrix into a factor (vector).

>> Martin

> I'm finally coming back to this.
> It still seems to make sense to change factor() and hence
> droplevels() behavior here, and plan to commit this change
> within a day.

I had committed these (including regression checks and *.Rd
changes) as  svn rev 72771  to R-devel..

As Suharto also warned, this change had more severe effects in
package space than I expected.

Notably the fact that the "dim" attribute was kept there, lead
to several errors in CRAN packages / their checks.

Of course, I could have patched the change by explicitely *not*
keeping certain attributes such as "dim" and "contrasts".

But I've decided that in this case it seems more rational,
notably for back compatibility reasons, to keep factor()'s and hence
droplevel()'s behaviour  and rather document it.
So the "bug" is solved differently, by documenting the behavior
that has been in place "forever".

With svn's 72773, we are back to the previous state with the
addition of mentioning in help(factor) that attributes are "lost".

Martin Maechler
ETH Zurich

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [bug] droplevels() also drop object attributes (comment…)

2017-06-08 Thread Suharto Anggono Suharto Anggono via R-devel
* Be careful with "contrasts" attribute. If the number of levels is reduced, 
the original contrasts matrix is no longer valid.
Example case:
x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
contrasts(x) <- contr.treatment(levels(x), contrasts=FALSE)[, -2, drop=FALSE]
droplevels(x)

* If function 'factor' is changed, make sure that as.factor(x) and factor(x) is 
the same for 'x' where is.integer(x) is TRUE. Currently, as.factor() 
is treated specially.

* It is possible that names(x) is not attr(x, "names"). For example, 'x' is a 
"POSIXlt" object.
Look at this example, which works in R 3.3.2.
x <- as.POSIXlt("2017-01-01", tz="UTC")
factor(x, levels=x)


By the way, in NEWS, in "CHANGES IN R 3.4.0", in "SIGNIFICANT USER-VISIBLE 
CHANGES", there is "factor() now uses order() to sort its levels". It is false. 
Code of function 'factor' in R 3.4.0 
(https://svn.r-project.org/R/tags/R-3-4-0/src/library/base/R/factor.R) still 
uses 'sort.list', not 'order'.


> Martin Maechler 
> on Tue, 16 May 2017 11:01:23 +0200 writes:

> Serge Bibauw 
> on Mon, 15 May 2017 11:59:32 -0400 writes:

>> Hi,

>> Just reporting a small bug… not really a big deal, but I
>> don’t think that is intended: droplevels() also drops all
>> object’s attributes.

> Yes.  The help page for droplevels (or the simple
> definition of 'droplevels.factor') clearly indicate that
> the method for factors is really just a call to factor(x,
> exclude = *)

> and that _is_ quite an important base function whose
> semantic should not be changed lightly. Still, let's
> continue :

> Looking a bit, I see that the current behavior of factor()
> {and hence droplevels} has been unchanged in this respect
> for the whole history of R, well, at least for more than
> 17 years (R 1.0.1, April 2000).

> I'd agree there _is_ a bug, at least in the documentation
> which does *not* mention that currently, all attributes
> are dropped but "names", "levels" (and "class").

> OTOH, factor() would only need a small change to make it
> preserve all attributes (but "class" and "levels" which
> are set explicitly).

> I'm sure this will break some checks in some packages.  Is
> it worth it?

> e.g., our own R  QC checks currently check (the printing of) the
> following (in tests/reg-tests-2.R ):

>   > ## some tests of factor matrices
>   > A <- factor(7:12)
>   > dim(A) <- c(2, 3)
>   > A
>[,1] [,2] [,3]
>   [1,] 7911  
>   [2,] 810   12  
>   Levels: 7 8 9 10 11 12
>   > str(A)
>factor [1:2, 1:3] 7 8 9 10 ...
>- attr(*, "levels")= chr [1:6] "7" "8" "9" "10" ...
>   > A[, 1:2]
>[,1] [,2]
>   [1,] 79   
>   [2,] 810  
>   Levels: 7 8 9 10 11 12
>   > A[, 1:2, drop=TRUE]
>   [1] 7  8  9  10
>   Levels: 7 8 9 10
> 
> with the proposed change to factor(),
> the last call would change its result:
> 
>   > A[, 1:2, drop=TRUE]
>[,1] [,2]
>   [1,] 79   
>   [2,] 810  
>   Levels: 7 8 9 10

> because 'drop=TRUE' calls factor(..) and that would also
> preserve the "dim" attribute.  I would think that the
> changed behavior _is_ better, and is also according to
> documentation, because the help page for [.factor explains
> that 'drop = TRUE' drops levels, but _not_ that it
> transforms a factor matrix into a factor (vector).

> Martin

I'm finally coming back to this.
It still seems to make sense to change factor() and hence
droplevels() behavior here, and plan to commit this change
within a day.

Martin Maechler
ETH Zurich

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [bug] droplevels() also drop object attributes (comment…)

2017-06-06 Thread Martin Maechler
> Martin Maechler 
> on Tue, 16 May 2017 11:01:23 +0200 writes:

> Serge Bibauw 
> on Mon, 15 May 2017 11:59:32 -0400 writes:

>> Hi,

>> Just reporting a small bug… not really a big deal, but I
>> don’t think that is intended: droplevels() also drops all
>> object’s attributes.

> Yes.  The help page for droplevels (or the simple
> definition of 'droplevels.factor') clearly indicate that
> the method for factors is really just a call to factor(x,
> exclude = *)

> and that _is_ quite an important base function whose
> semantic should not be changed lightly. Still, let's
> continue :

> Looking a bit, I see that the current behavior of factor()
> {and hence droplevels} has been unchanged in this respect
> for the whole history of R, well, at least for more than
> 17 years (R 1.0.1, April 2000).

> I'd agree there _is_ a bug, at least in the documentation
> which does *not* mention that currently, all attributes
> are dropped but "names", "levels" (and "class").

> OTOH, factor() would only need a small change to make it
> preserve all attributes (but "class" and "levels" which
> are set explicitly).

> I'm sure this will break some checks in some packages.  Is
> it worth it?

> e.g., our own R  QC checks currently check (the printing of) the
> following (in tests/reg-tests-2.R ):

>   > ## some tests of factor matrices
>   > A <- factor(7:12)
>   > dim(A) <- c(2, 3)
>   > A
>[,1] [,2] [,3]
>   [1,] 7911  
>   [2,] 810   12  
>   Levels: 7 8 9 10 11 12
>   > str(A)
>factor [1:2, 1:3] 7 8 9 10 ...
>- attr(*, "levels")= chr [1:6] "7" "8" "9" "10" ...
>   > A[, 1:2]
>[,1] [,2]
>   [1,] 79   
>   [2,] 810  
>   Levels: 7 8 9 10 11 12
>   > A[, 1:2, drop=TRUE]
>   [1] 7  8  9  10
>   Levels: 7 8 9 10
> 
> with the proposed change to factor(),
> the last call would change its result:
> 
>   > A[, 1:2, drop=TRUE]
>[,1] [,2]
>   [1,] 79   
>   [2,] 810  
>   Levels: 7 8 9 10

> because 'drop=TRUE' calls factor(..) and that would also
> preserve the "dim" attribute.  I would think that the
> changed behavior _is_ better, and is also according to
> documentation, because the help page for [.factor explains
> that 'drop = TRUE' drops levels, but _not_ that it
> transforms a factor matrix into a factor (vector).

> Martin

I'm finally coming back to this.
It still seems to make sense to change factor() and hence
droplevels() behavior here, and plan to commit this change
within a day.

Martin Maechler
ETH Zurich

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [bug] droplevels() also drop object attributes (comment…)

2017-05-16 Thread Martin Maechler
> Serge Bibauw 
> on Mon, 15 May 2017 11:59:32 -0400 writes:

> Hi,

> Just reporting a small bug… not really a big deal, but I don’t think that 
is intended: droplevels() also drops all object’s attributes.

Yes.  The help page for droplevels (or the simple definition of
'droplevels.factor') clearly indicate that the method for
factors is really just a call to   factor(x, exclude = *)

and that _is_ quite an important base function whose semantic
should not be changed lightly. Still, let's continue :

Looking a bit, I see that the current behavior of factor() {and
hence droplevels} has been unchanged in this respect  for the
whole history of R, well, at least for more than 17 years (R 1.0.1, April 2000).

I'd agree there _is_ a bug, at least in the documentation which
does *not* mention that currently, all attributes are dropped but "names",
"levels" (and "class").

OTOH, factor() would only need a small change to make it
preserve all attributes (but "class" and "levels" which are set explicitly).

I'm sure this will break some checks in some packages.
Is it worth it?

e.g., our own R  QC checks currently check (the printing of) the
following (in tests/reg-tests-2.R ):

> ## some tests of factor matrices
> A <- factor(7:12)
> dim(A) <- c(2, 3)
> A
 [,1] [,2] [,3]
[1,] 7911  
[2,] 810   12  
Levels: 7 8 9 10 11 12
> str(A)
 factor [1:2, 1:3] 7 8 9 10 ...
 - attr(*, "levels")= chr [1:6] "7" "8" "9" "10" ...
> A[, 1:2]
 [,1] [,2]
[1,] 79   
[2,] 810  
Levels: 7 8 9 10 11 12
> A[, 1:2, drop=TRUE]
[1] 7  8  9  10
Levels: 7 8 9 10

with the proposed change to factor(),
the last call would change its result:

> A[, 1:2, drop=TRUE]
 [,1] [,2]
[1,] 79   
[2,] 810  
Levels: 7 8 9 10

because 'drop=TRUE' calls factor(..) and that would also
preserve the "dim" attribute.
I would think that the changed behavior _is_ better, and is also
according to documentation, because the help page for
 [.factor
explains that 'drop = TRUE' drops levels, but _not_ that it
transforms a factor matrix into a factor (vector).


Martin


> Example:

>> > test <- c("hello", "something", "hi")
>> > test <- factor(test)
>> > comment(test) <- "this is a test"
>> > attr(test, "description") <- "this is another test"
>> > attributes(test)
>> $levels
>> [1] "hello"     "hi"        "something"
>> 
>> $class
>> [1] "factor"
>> 
>> $comment
>> [1] "this is a test"
>> 
>> $description
>> [1] "this is another test"
>> 
>> > test <- droplevels(test)
>> > attributes(test)
>> $levels
>> [1] "hello"     "hi"        "something"
>> 
>> $class
>> [1] "factor"


> Serge

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] [bug] droplevels() also drop object attributes (comment…)

2017-05-15 Thread Serge Bibauw
Hi,

Just reporting a small bug… not really a big deal, but I don’t think that is 
intended: droplevels() also drops all object’s attributes.

Example:

> > test <- c("hello", "something", "hi")
> > test <- factor(test)
> > comment(test) <- "this is a test"
> > attr(test, "description") <- "this is another test"
> > attributes(test)
> $levels
> [1] "hello"     "hi"        "something"
>
> $class
> [1] "factor"
>
> $comment
> [1] "this is a test"
>
> $description
> [1] "this is another test"
>
> > test <- droplevels(test)
> > attributes(test)
> $levels
> [1] "hello"     "hi"        "something"
>
> $class
> [1] "factor"


Serge

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel