On Mon, Jul 23, 2012 at 6:36 PM, Ronan Lamy <ronan.l...@gmail.com> wrote:
> Le dimanche 22 juillet 2012 à 23:07 +0300, Sergiu Ivanov a écrit :
>>
>> Currently, FiniteSet stores a non-.args frozenset attribute which is
>> used exactly for this purposes.  Thus, no matter whether one sorts
>> .args or not, membership testing is O(1).
>
> OK, I should have checked. However having two copies of the arguments is
> inefficient and complicates the code.

I don't think it's that inefficient, especially if FiniteSet is
created from a tuple.  Also, it actually speeds up things like
membership checking, which ultimately pays off.  (That's suppositions,
no statistics.)

Apparently, FiniteSet cannot store only the frozenset, because it
derives from Basic and thus should be rebuildable from .args.

>> > Sorting also turns set creation from an O(n) operation into
>> > O(n*log(n)), which in itself is probably only significant with large
>> > sets.
>>
>> Well, if we do patch FiniteSet.sort_key to sort .args _every time_ a
>> FiniteSet sort key is needed, you will have that slowdown _every time_
>> you sort a container with a FiniteSet.
>
> Well, if we don't sort sets, we don't sort sets of sets either, so we
> don't the sort key when we put a set in another set.

Please keep in mind that you _do_ sort the sets at least when printing
them.  For example, right now you cannot write a doctest checking the
printing of FiniteSet(FiniteSet(1, 3, 5), FiniteSet(2, 4, 6)), because
it will get printed as {{1, 3, 5}, {2, 4, 6}} or as {{2, 4, 6}, {1, 3,
5}}.  (I haven't checked this particular example; I have a more
complicated scenario in my code, but what I show should convey the
idea.)

Therefore I just cannot imagine how we can get away without assuring a
consistent FiniteSet.sort_key.  I'd call the current state a
fundamentally broken state, because it doesn't behave as it should.

>> Now, I do admit that not every use case of FiniteSet implies storing
>> it in a container.  Therefore, perhaps, we could go with a lazy
>> approach: don't sort .args at FiniteSet creation; however, whenever a
>> certain order of .args is required, sort it and leave it like that
>> afterwards.  On the one hand, this will bring in sorting strictly when
>> needed and, on the other hand, it will avoid the sorting in use cases
>> when it is not required.
>>
>> Do you think that's a better approach?
>
> Changing the args during the life of the object seems risky and adds
> complications. For instance, computing the hash will require that the
> args are sorted. And what if something keeps a reference to the original
> args after they are sorted?

That I have thought of a bit.  That does sound tricky, on the overall.
I don't think people are supposed to use .args independently of the
class (i.e., even if someone has a copy to the original, unsorted
args, one shouldn't refer to them directly but rather construct a new
FiniteSet from them), but I may be very wrong and, of course, this
does not diminish the trickiness of the approach.

>> In [21]: %timeit frozenset(s)
>> 100000 loops, best of 3: 2.4 us per loop
>>
>> In [23]: %timeit FiniteSet(s)
>> 1000 loops, best of 3: 191 us per loop
>>
>> Which puts the two more or less on par, if I am reading the %timeit
>> stats correctly (I've never used it before though, so I probably am
>> not).
>
> Er, no, if you're comparing frozenset(s) with FiniteSet(s), then the
> former is about 80 times faster than the latter. This isn't surprising,
> given that set to frozenset is O(1) while set to list to frozenset to
> tuple is O(n).

All right, so one doesn't compare those results as 10^5 * 2.4 vs. 1000
* 191 :-)

I get your point with the O(n) slowdown of storing set elements in
.args, but I don't think it is possible to get away without it and
complying with the behaviour expected from a Basic instance.

Sergiu

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to sympy@googlegroups.com.
To unsubscribe from this group, send email to 
sympy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Reply via email to