Hi,
It's been my experience that when you combine or aggregate factors using a 
function, you should be prepared for surprises. It's not obvious what the 
"right" way to combine factors is (ordered or not), especially if the factors 
have different levels or (if ordered) are ordered in different ways.

For instance, what would you expect to get from unlist() if the elements of the 
list had different levels, or were all ordered but in different ways, or if 
some elements of the list were factors and others were ordered factors?
> unlist(list(ordered(c("a","b")), ordered(c("b","a"), levels=c("b","a"))))
[1] ?
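
One way to answer the question is simply to try it. The sketch below sets the 
levels explicitly so the two inputs really are ordered in opposite ways; the 
names f1, f2 and u are made up for illustration. As far as I can tell from the 
documentation of unlist(), the result is a plain factor whose levels are the 
union of the inputs' levels, so the ordering is silently dropped.

```r
# Two ordered factors over the same values but with opposite level order.
f1 <- ordered(c("a", "b"), levels = c("a", "b"))  # a < b
f2 <- ordered(c("b", "a"), levels = c("b", "a"))  # b < a

u <- unlist(list(f1, f2))

is.factor(u)     # still a factor
is.ordered(u)    # but no longer an ordered one
levels(u)        # union of the level sets, in order of first appearance
as.character(u)  # the labels themselves survive
```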

Honestly, my biggest surprise from your question was that unlist even returned 
a factor at all. For example, the c() function just converts factors to 
integers.
> c(ordered(c("a","b")), ordered(c("a","b")))
[1] 1 2 1 2

And here's one that's especially weird. When you rbind() data frames with an 
ordered factor, you still get an ordered factor back, but the order may be 
different from either of the original orders:

> x1 <- data.frame(a=ordered(c("b","c")))
> x2 <- data.frame(a=ordered(c("a","b","c")))
> str(rbind(x1,x2)) #  Note b < a
 'data.frame':  5 obs. of  1 variable:
 $ a: Ord.factor w/ 3 levels "b"<"c"<"a": 1 2 3 1 2

Should rbind just have returned an integer like c(), or returned a factor like 
unlist(), or should it have kept the result as an ordered factor, but ordered 
the result in a different way? I have no idea.
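
A possible way to sidestep the rbind() surprise, sketched under the assumption 
that you know the intended order up front: give every data frame the same 
explicit level set before binding. The names lev, y1, y2 and combined are 
illustrative only.

```r
# Same values as in the rbind() example, but both columns share one explicit
# level order, including the level "a" that is unused in the first data frame.
lev <- c("a", "b", "c")
y1 <- data.frame(a = ordered(c("b", "c"), levels = lev))
y2 <- data.frame(a = ordered(c("a", "b", "c"), levels = lev))

combined <- rbind(y1, y2)
str(combined)        # still an ordered factor
levels(combined$a)   # the shared order is kept as given in lev
```

With identical level sets on both sides, rbind() has no new levels to append, 
so it has no reason to reorder anything.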

So in short, IMO, there are definitely inconsistencies in how ordered/factors 
are handled across functions, but I think it would be hard to point to any 
single function and say it is wrong or needs to be changed. My best advice is 
to just be careful when combining or aggregating factors.
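
As one illustration of "being careful", here is a minimal sketch of a helper 
that combines ordered factors only when they agree on their levels. 
combine_ordered() is a hypothetical function, not part of base R, and going 
through character vectors is just one way to avoid depending on how c() or 
unlist() treat factors in a given version of R.

```r
# Hypothetical helper: combine ordered factors only when they share exactly
# the same levels in the same order, and fail loudly otherwise.
combine_ordered <- function(...) {
  parts <- list(...)
  stopifnot(length(parts) > 0,
            all(vapply(parts, is.ordered, logical(1))))
  lev <- levels(parts[[1]])
  same <- vapply(parts, function(p) identical(levels(p), lev), logical(1))
  if (!all(same)) {
    stop("ordered factors have different levels or a different level order")
  }
  # Rebuild from the labels so the result does not depend on factor codes.
  values <- unlist(lapply(parts, as.character))
  factor(values, levels = lev, ordered = TRUE)
}

sizes <- c("lo", "mid", "hi")
a <- ordered(c("lo", "hi"), levels = sizes)
b <- ordered("mid", levels = sizes)
r <- combine_ordered(a, b)
min(r)  # meaningful again, because r is still ordered
```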
--Robert

-----Original Message-----
From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of "Jens 
Oehlschlägel"
Sent: Friday, June 16, 2017 9:04 AM
To: r-devel@r-project.org
Cc: jens.oehlschlae...@truecluster.com
Subject: [Rd] 'ordered' destroyed to 'factor'

Dear all,
 
I don't know if you consider this a bug or feature, but it breaks reasonable 
code: 'unlist' and 'sapply' convert 'ordered' to 'factor' even if all levels 
are equal. Here is a simple example:

o <- ordered(letters)
o[[1]]
lapply(o, min)[[1]]          # ordered factor
unlist(lapply(o, min))[[1]]  # no longer ordered
sapply(o, min)[[1]]          # no longer ordered

Jens Oehlschlägel
 
 
P.S: The above examples are deliberately silly, to keep the reproduction 
simple. The current behavior broke my use-case, which had a structure like this:
 
# have some data
x <- 1:20
# apply some function to each element
somefunc <- function(x){
  # do something and return an ordinal level
  sample(o, 1)
}
x <- sapply(x, somefunc)
# get minimum result
min(x)
# Error in Summary.factor(c(2L, 26L), na.rm = FALSE) :
#   ‘min’ not meaningful for factors
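
# A possible workaround, sketched under the assumption that every result
# comes from the known level set of o: collect the labels as character and
# rebuild the ordered factor afterwards. res_chr and res are illustrative names.

```r
o <- ordered(letters)
somefunc <- function(x) {
  # do something and return an ordinal level
  sample(o, 1)
}

# vapply() returns a plain character vector, so nothing is silently
# converted from 'ordered' to 'factor' along the way.
res_chr <- vapply(1:20, function(i) as.character(somefunc(i)), character(1))

# Re-attach the original levels and their order.
res <- factor(res_chr, levels = levels(o), ordered = TRUE)
min(res)  # works, because res is an ordered factor
```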
 
 
> version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          4.0                         
year           2017                        
month          04                          
day            21                          
svn rev        72570                       
language       R                           
version.string R version 3.4.0 (2017-04-21)
nickname       You Stupid Darkness        

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel