Re: [Rd] 'ordered' destroyed to 'factor'

2017-06-18 Thread Joris Meys
Dear Jens,

Multiple people have given you multiple reasons why your request cannot be
implemented, on basic logical grounds. You also got a workaround for the
special case where all factors have the same levels in exactly the same
order.

If you believe it's possible to implement this in a way that doesn't break
anything else, please give at least an algorithm that explains HOW R should
do this, and possibly provide a patch. If you fail to do either,
it's rather ungrateful to piss on the very people who devote tons of FREE
time to the development of something you have been using for 17 years now.

And for the record: R handles ordinal data pretty well, thank you very much.
Maybe after 17 years, you could make the effort to take a look at
options("contrasts"). Let it be an eye-opener.
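
For reference, a minimal sketch of what options("contrasts") controls, assuming a session with default options: unordered factors are coded with treatment contrasts, ordered factors with orthogonal polynomial contrasts. The variable `dose` and its levels are made up for illustration.

```r
# Default contrast settings (a session that changed them would differ)
ctr <- getOption("contrasts")
print(ctr)

# Hypothetical ordinal variable with three levels
dose <- factor(c("low", "mid", "high", "mid"),
               levels = c("low", "mid", "high"), ordered = TRUE)

# Contrast matrix used for an ordered factor in model fitting:
# one column per polynomial term (.L = linear, .Q = quadratic)
cm <- contrasts(dose)
print(cm)
```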

On Sun, Jun 18, 2017 at 12:34 PM, "Jens Oehlschlägel" <
jens.oehlschlae...@truecluster.com> wrote:

> Defending the status quo misses the point that R *could* handle ordinal
> data with a fixed set of levels but actually *does not*, although that
> would be useful, even if it does not mean handling every possible straw-man
> situation. Having data types for nominal, ordinal, and interval-scale data
> is - in theory - one of the major advantages of S over SAS. But *having*
> without *handling* means: only in theory, not in practice. Has r-devel
> really lost the momentum for continuous improvement, to converge R to an
> optimum? I struggle to recognize the project I loved in 2000.



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
joris.m...@ugent.be

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] 'ordered' destroyed to 'factor'

2017-06-18 Thread Jens Oehlschlägel
Defending the status quo misses the point that R *could* handle ordinal data
with a fixed set of levels but actually *does not*, although that would be
useful, even if it does not mean handling every possible straw-man
situation. Having data types for nominal, ordinal, and interval-scale data is
- in theory - one of the major advantages of S over SAS. But *having* without
*handling* means: only in theory, not in practice. Has r-devel really lost the
momentum for continuous improvement, to converge R to an optimum? I struggle to
recognize the project I loved in 2000.
 


Re: [Rd] 'ordered' destroyed to 'factor'

2017-06-16 Thread peter dalgaard

> On 16 Jun 2017, at 15:59, Robert McGehee wrote:
> 
> For instance, what would you expect to get from unlist() if each element of 
> the list had different levels, or were both ordered, but in a different way, 
> or if some elements of the list were factors and others were ordered factors?
>> unlist(list(ordered(c("a","b")), ordered(c("b","a"))))
> [1] ?

Those actually have the same levels in the same order: a < b

Possibly, this brings the point home more clearly

unlist(list(ordered(c("a","c")), ordered(c("b","d"))))

(Notice that alphabetical order is largely irrelevant, so all of these level 
orderings are equally possible:

a < c < b < d
a < b < c < d
a < b < d < c
b < a < c < d
b < a < d < c
b < d < a < c

).
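
The six orderings listed above are exactly the arrangements of a, b, c, d that keep a before c and b before d. A small sketch that enumerates them; the helpers `perms` and `consistent` are hypothetical names written for this illustration, not base R functions.

```r
# All permutations of a short vector (fine for four elements)
perms <- function(v) {
  if (length(v) <= 1L) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1L]] <- c(v[i], p)
  }
  out
}

# Keep a permutation only if it respects both original orders
consistent <- function(p) {
  pos <- match(c("a", "c", "b", "d"), p)
  pos[1] < pos[2] && pos[3] < pos[4]   # a before c, and b before d
}

orderings <- Filter(consistent, perms(c("a", "b", "c", "d")))
length(orderings)   # number of merged orderings compatible with both factors
for (p in orderings) cat(paste(p, collapse = " < "), "\n")
```

Nothing in the two input factors singles out one of these orderings, which is the ambiguity unlist() would have to resolve.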

-pd
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] 'ordered' destroyed to 'factor'

2017-06-16 Thread Joris Meys
This can be traced back to the following line in unlist():

structure(res, levels = lv, names = nm, class = "factor")

The Details section of ?unlist states specifically how it treats factors,
so this is documented and expected behaviour.

This is also the appropriate behaviour. In your case one could argue that
unlist should maintain the order, as there's only a single factor. However,
the moment you have two ordered factors, there's no guarantee that the levels
are the same, or even in the same order, so it is impossible to determine
what the correct order should be. For this reason, the only logical object
to return for a list of factors is an unordered factor.
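
A short sketch of the behaviour described above, using the two factors from Peter Dalgaard's example; the variable names are chosen only for illustration.

```r
f1 <- ordered(c("a", "c"))   # levels a < c
f2 <- ordered(c("b", "d"))   # levels b < d

u <- unlist(list(f1, f2))
class(u)        # the "ordered" class is not kept
levels(u)       # union of the levels, in order of first appearance
is.ordered(u)

# Even when every element has identical levels, the ordering is dropped
g <- ordered(c("a", "b"), levels = c("a", "b", "c"))
is.ordered(unlist(list(g, g)))
```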

In your use case (so with a list of factors with identical ordered levels)
the solution is one extra step:

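x <- list(
  factor(c("a","b"),
         levels = c("a","b","c"),
         ordered = TRUE),
  factor(c("b","c"),
         levels = c("a","b","c"),
         ordered = TRUE)
)
# sapply() simplifies the result, so res is a plain (unordered) factor
res <- sapply(x, min)
# restore the order; valid only because all elements share the same levels
res <- ordered(res, levels = levels(res))
min(res)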
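
The same extra step can be wrapped in a small function. This is only a sketch: `unlist_ordered` is a hypothetical helper, not part of base R, and it restores the order only when every element is ordered and all level vectors are identical.

```r
# Hypothetical helper: unlist a list of factors and keep the ordering
# only in the unambiguous case (all ordered, identical levels).
unlist_ordered <- function(x) {
  stopifnot(is.list(x), length(x) > 0L)
  lv <- levels(x[[1]])
  all_ordered <- all(vapply(x, is.ordered, logical(1)))
  same_levels <- all(vapply(x, function(f) identical(levels(f), lv), logical(1)))
  res <- unlist(x)
  if (all_ordered && same_levels) {
    res <- factor(res, levels = lv, ordered = TRUE)
  } else {
    warning("elements differ in levels or orderedness; returning unlist() result")
  }
  res
}

lv <- c("a", "b", "c")
x <- list(factor(c("a", "b"), levels = lv, ordered = TRUE),
          factor(c("b", "c"), levels = lv, ordered = TRUE))
res <- unlist_ordered(x)
min(res)
```

In the ambiguous case the helper deliberately falls back to the documented unlist() result instead of guessing an order.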


I hope this explains it.

Cheers
Joris






-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
joris.m...@ugent.be

Re: [Rd] 'ordered' destroyed to 'factor'

2017-06-16 Thread Robert McGehee
Hi,
It's been my experience that when you combine or aggregate vectors of factors 
using a function, you should be prepared for surprises, as it's not obvious 
what the "right" way to combine factors is (ordered or not), especially if two 
vectors of factors have different levels or (if ordered) are ordered in a 
different way.

For instance, what would you expect to get from unlist() if each element of the 
list had different levels, or were both ordered, but in a different way, or if 
some elements of the list were factors and others were ordered factors?
> unlist(list(ordered(c("a","b")), ordered(c("b","a"))))
[1] ?

Honestly, my biggest surprise from your question was that unlist even returned 
a factor at all. For example, the c() function just converts factors to 
integers.
> c(ordered(c("a","b")), ordered(c("a","b")))
[1] 1 2 1 2

And here's one that's especially weird. When you rbind() data frames that 
contain an ordered factor, you still get an ordered factor back, but the order 
may differ from either of the original orders:

> x1 <- data.frame(a=ordered(c("b","c")))
> x2 <- data.frame(a=ordered(c("a","b","c")))
> str(rbind(x1,x2)) #  Note b < a
 'data.frame':  5 obs. of  1 variable:
 $ a: Ord.factor w/ 3 levels "b"<"c"<"a": 1 2 3 1 2

Should rbind just have returned an integer like c(), or returned a factor like 
unlist(), or should it have kept the result as an ordered factor but ordered 
it in a different way? I have no idea.
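
One way to sidestep the surprise in the rbind() example is to state the intended level order once and build every factor with it. A minimal sketch, assuming the intended order is a < b < c; the name `lv` is introduced only for this illustration.

```r
lv <- c("a", "b", "c")   # the intended order, stated explicitly

x1 <- data.frame(a = factor(c("b", "c"), levels = lv, ordered = TRUE))
x2 <- data.frame(a = factor(c("a", "b", "c"), levels = lv, ordered = TRUE))

# Both columns now carry identical levels, so nothing has to be guessed
r <- rbind(x1, x2)
str(r)
levels(r$a)
```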

So in short, IMO, there are definitely inconsistencies in how ordered factors 
and factors are handled across functions, but I think it would be hard to 
point to any single function and say it is wrong or needs to be changed. My 
best advice is to be careful when combining or aggregating factors.
--Robert


[Rd] 'ordered' destroyed to 'factor'

2017-06-16 Thread Jens Oehlschlägel
Dear all,
 
I don't know if you consider this a bug or feature, but it breaks reasonable 
code: 'unlist' and 'sapply' convert 'ordered' to 'factor' even if all levels 
are equal. Here is a simple example:

o <- ordered(letters)
o[[1]]
lapply(o, min)[[1]]  # ordered factor
unlist(lapply(o, min))[[1]]  # no longer ordered
sapply(o, min)[[1]]  # no longer ordered

Jens Oehlschlägel
 
 
P.S: The above examples are silly for simple reproduction. The current behavior 
broke my use-case which had a structure like this
 
# have some data
x <- 1:20
# apply some function to each element
somefunc <- function(x){
  # do something and return an ordinal level
  sample(o, 1)
}
x <- sapply(x, somefunc)
# get minimum result
min(x)
# Error in Summary.factor(c(2L, 26L), na.rm = FALSE) :
#   ‘min’ not meaningful for factors
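
A possible way to keep this use case working under the current behaviour is to collect integer codes and rebuild the ordered factor from the known, fixed levels. This is a sketch under that assumption, not a fix to sapply(); `codes` is a name introduced here, and `somefunc` is the same stand-in as above.

```r
o <- ordered(letters)

# Stand-in for the real per-element function; returns one ordinal level
somefunc <- function(i) sample(o, 1)

# Integer codes simplify to a plain integer vector without losing information
codes <- vapply(1:20, function(i) as.integer(somefunc(i)), integer(1))

# Rebuild an ordered factor from the codes and the fixed set of levels
x <- factor(codes, levels = seq_along(levels(o)),
            labels = levels(o), ordered = TRUE)
is.ordered(x)
min(x)
```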
 
 
> version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          4.0
year           2017
month          04
day            21
svn rev        72570
language       R
version.string R version 3.4.0 (2017-04-21)
nickname       You Stupid Darkness
