The factor approach is horrifically ugly and dangerous.

Even if it didn't have the extraordinarily poor behavior documented
below, what it should do simply isn't well-defined.  The explicit
approximation route is far preferable in every way: more
predictable, more controllable, and even (though it usually hardly
matters) faster.
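
As a minimal sketch of the explicit approximation route: approx_match and
its default tol below are hypothetical, chosen only for illustration, and
are not part of R. The point is that the tolerance is stated, not implied.

```r
# Hypothetical helper: index of the first element of `table` that lies
# within an absolute tolerance `tol` of `x`, or NA if there is none.
approx_match <- function(x, table, tol = 1e-8) {
  hits <- which(abs(table - x) <= tol)
  if (length(hits) > 0) hits[1] else NA_integer_
}

cv <- c(0.2, 0.2 + 0.1)   # second element is not bit-identical to 0.3
match(0.3, cv)            # exact comparison
approx_match(0.3, cv)     # comparison within tol
```

Here match() finds nothing, while approx_match() returns index 2.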

Let's look at the extraordinarily poor behavior I was mentioning. Consider:

> nums <- (.3 + 2e-16 * c(-2,-1,1,2)); nums
[1] 0.3 0.3 0.3 0.3

Though they all print as .3 with the default precision (which is
normal and expected), they are all different from .3:

nums - .3 =>  -3.885781e-16 -2.220446e-16  2.220446e-16  3.885781e-16
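
One way to make the differences visible, sketched here with base R only,
is to print more significant digits and to count the distinct values:

```r
nums <- .3 + 2e-16 * c(-2, -1, 1, 2)
sprintf("%.17g", nums)   # 17 significant digits distinguish any two doubles
nums == .3               # exact comparison with .3, element by element
length(unique(nums))     # number of distinct values
```

None of the four values compares equal to .3, and all four are distinct.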

When we convert nums to a factor, we get:

> fact <- as.factor(nums); fact
[1] 0.300000000000000 0.3               0.3               0.300000000000000
Levels: 0.300000000000000 0.3 0.3 0.300000000000000

It is not clear what the difference between the labels 0.300000000000000
and 0.3 is supposed to be, nor why one 0.300000000000000 is < .3 and the
other is > .3, but let's put that aside for the moment.

Now let's look at the relations among the factor values:

> fact[1]==fact[4]
[1] TRUE

So though nums[1] < nums[2] < nums[3] < nums[4], fact[1] compares
*unequal* to fact[2] though it compares *equal* to fact[4].
Apparently R is comparing the *names* (labels) of the levels rather than
the underlying integer codes of the factor.  This would be weird even if
it didn't lead to this very bad case.
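
By contrast, comparing the numbers themselves keeps things predictable.
A minimal sketch follows; near_equal and its default tol are hypothetical
and only illustrate an explicitly stated tolerance:

```r
nums <- .3 + 2e-16 * c(-2, -1, 1, 2)

# Exact comparison on the numbers: distinct values compare unequal,
# and the ordering nums[1] < nums[2] < nums[3] < nums[4] is preserved.
nums[1] == nums[4]
all(diff(nums) > 0)

# Hypothetical helper: equality within an explicit absolute tolerance.
near_equal <- function(a, b, tol = 1e-9) abs(a - b) <= tol
near_equal(nums[1], nums[4])   # treated as equal because tol says so
near_equal(nums[1], nums[2])   # likewise, by the same stated rule
```

Whether two values count as "the same" is then decided by tol alone, not
by how the values happen to print.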

Hope this helps,


On Mon, Mar 16, 2009 at 6:53 PM, Daniel Murphy wrote:
> I have a matrix whose columns were filled with values which were functions
> of cvseq<-seq(.2,.3,by=.1) (and a row value of mode integer). To do a lookup
> for cv=.3 later, I wanted to match(.3,cvseq), which gave me NA, hence my
> question. I thought R would match .3 in cvseq within .Machine$double.eps,
> but I can understand it if .3 and the second element of cvseq would not have
> identical bits.
> Besides the helpful suggestions below, I also tried
>> cvseqf <- as.factor(cvseq)
>> match(.3,cvseqf)
> [1] 2
> which worked.
> In general, would it be better to go the enumeration route via as.factor or
> the approximation route?
> Thanks for the help.
> -Dan
> On Mon, Mar 16, 2009 at 8:24 AM, Stavros Macrakis wrote:
>> Well, first of all, seq(from=.2,to=.3) gives c(0.2), so I assume you
>> really mean something like seq(from=.2,to=.3,by=.1), which gives
>> c(0.2, 0.3).
>> %in% tests for exact equality, which is almost never a good idea with
>> floating-point numbers.
>> You need to define what exactly you mean by "in" for floating-point
>> numbers.  What sort of tolerance are you willing to allow?
>> Some possibilities would be for example:
>> approxin <- function(x,list,tol) any(abs(list-x)<tol)   # absolute tolerance
>> # relative tolerance; only exact 0 will match 0
>> rapproxin <- function(x,list,tol) (x==0 && 0 %in% list) ||
>>     any(abs((list-x)/x)<=tol,na.rm=TRUE)
>> Hope this helps,
>>          -s
>> On Mon, Mar 16, 2009 at 9:36 AM, Daniel Murphy wrote:
>> > Hello: I am trying to match the value 0.3 in the sequence seq(.2,.3). I get
>> >> 0.3 %in% seq(from=.2,to=.3)
>> > [1] FALSE
>> > Yet
>> >> 0.3 %in% c(.2,.3)
>> > [1] TRUE
>> > For arbitrary sequences, this "invisible .3" has been problematic. What is
>> > the best way to work around this?

