Andrei Alexandrescu:
> I'm a bit leery of adopting this feature (it has been discussed). To me
> "in" implies a fast operation, and substring searching isn't quite it.
>
> One thing that could be done is to allow "in" with literal arrays to
> their right:
>
> if (x in ["abcde", "asd"]) { ... }
>
> The size of the operand is constant, known, and visible.
The "in" operator is useful to tell whether an item is present in a collection
(dynamic arrays, fixed-size arrays, sorted arrays, tuples, typetuples, hash
sets, tree sets, hash maps, tree maps, trees, graphs, deques, skip lists,
finger trees, etc.), and to tell whether a substring is present in a string.
Both purposes come up very often in programs.
Even though those purposes are very common, an "in" syntax is not strictly
necessary: in dlibs1 I created an isIn() function that was good enough.
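The idea behind such a helper can be sketched in Python (this is my own
reconstruction of the concept, not the dlibs1 code):

```python
def is_in(item, collection):
    # Plain linear scan: works on any iterable, needs no language
    # support, and the O(n) cost is explicit in the code.
    for element in collection:
        if element == item:
            return True
    return False

print(is_in(3, [1, 2, 3]))  # True
print(is_in(4, [1, 2, 3]))  # False
```

A substring search would need a separate overload, since a plain scan over a
string compares single characters, not substrings.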
D supports the "in" operator for associative arrays and in operator
overloading (my Set()/set() implementation supports the "in" operator), so
it's natural for people to want that operator to be usable for other kinds of
collections too, like arrays. Many of those people have probably used Python,
where the in operator (__contains__) is widely used.
Regarding generic programming: Python doesn't have templates, but it has
dynamic typing, which serves purposes similar to generic programming. To a
Python function that uses the "in" operator you may pass a list, an
associative array, or any other object that supports __contains__. The Python
code then performs an O(n) or O(1) operation according to the dynamic type
given to the function.
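A minimal sketch of that point (the helper name is mine, not from any
library): the same generic code works for every type with __contains__, and
the cost depends on the runtime type of the argument.

```python
def has_item(collection, item):
    # One generic membership test; its complexity is decided by
    # the dynamic type of `collection`, not by this code.
    return item in collection

print(has_item([10, 20, 30], 20))       # list: O(n) linear scan
print(has_item({10, 20, 30}, 20))       # set: O(1) average hash lookup
print(has_item({"a": 1, "b": 2}, "b"))  # dict: O(1) average, tests keys
print(has_item("abcdef", "cde"))        # str: substring search
```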
I can live without the "in" operation for arrays and strings (where it would
also be useful for substring search). So one solution is to not change the D
language, and not add support for "in" on arrays.
Another solution is simply to accept O(n) as the worst-case complexity for the
"in" operator. I don't see what the problem is with this.
Another solution is to support the "in" operator for dynamic arrays too, and
define a new attribute, like @complexity(), plus an enum that allows
specifying the worst-case complexity. The associative-array lookup is then
annotated with @complexity(O.constant), while the function that searches for
items/substrings in arrays/strings is annotated with @complexity(O.linear).
At compile time, generic code is then able to query the computational
complexity of the "in" operator implementation. The problem is that the
compiler today can't enforce that a function annotated with
@complexity(O.constant) actually avoids a linear search (but I don't think
that's a real problem: today, if the Range protocol asks an operation to be
no worse than O(n log n), the compiler doesn't enforce it either).
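The proposal is about a D attribute, but the shape of the idea can be sketched
in Python with a decorator that attaches queryable complexity metadata (all
names here are my own invention, not an existing API):

```python
from enum import Enum

class O(Enum):
    constant = "O(1)"
    linear = "O(n)"

def complexity(bound):
    """Attach a worst-case complexity bound as queryable metadata."""
    def decorate(func):
        func.complexity = bound
        return func
    return decorate

@complexity(O.constant)
def in_hash_map(table, key):
    return key in table   # hash lookup

@complexity(O.linear)
def in_array(items, item):
    return item in items  # linear scan

# Generic code can inspect the declared bound before picking an
# implementation (nothing checks that the body honors the claim):
print(in_hash_map.complexity)  # O.constant
print(in_array.complexity)     # O.linear
```

As in the D proposal, the annotation is a promise, not something the language
verifies.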
I don't like the idea of allowing the "in" operator for sorted arrays, tuples,
fixed-size arrays, skip lists, maps, sets, and trees while disallowing it for
dynamic arrays. It looks a bit silly. People will then spend the next ten
years asking for "in" to be extended to unsorted dynamic arrays and substring
search too; you can bet on it.
Bye,
bearophile