Josiah Carlson wrote:
"Vladimir 'Yu' Stepanov" <[EMAIL PROTECTED]> wrote:

Comparison of functions of sorting and binary trees not absolutely correctly. I think that function sort will lose considerably on
greater lists. Especially after an insert or removal of all one element.

Generally speaking, people who understand at least some of Python's
internals (list internals specifically), will not be *removing* entries
from lists one at a time (at least not using list.pop(0) ), because that
is slow.  If they need to remove items one at a time from the smallest
to the largest, they will usually use list.reverse(), then repeatedly
list.pop(), as this is quite fast (in general).
Yes. I understood it when resulted a set example.
However, as I just said, people usually don't remove items from
just-sorted lists, they tend to iterate over them via 'for i in list:' .
Such problem arises at creation of the list of timers. And this list is in
constant change: addition/removal of elements in the list.
collections.deque here does not approach, as if addition in the big list is
made or search of the nearest value on the average it is necessary to lead
quantity of checks N/2 is made. For a binary tree the quantity of necessary
checks on former is equal log2 (N).

Other variant of use of binary trees: search of concurrence to ranges. Such
ranges can be blocks IP of addresses. Also this kind of the dictionary can
be used for a fast finding, whether the given point enters into one of
pieces. These problems can be realized by means of binary search. For
binary search the same lacks, as for above resulted example are
characteristic: the insert and removal for lists are carried out slowly and
after an insert sorting of the list is required. Except for that function
of binary search is absent in standard package Python. It is possible to
write its analogue on pure Python, but it will be more than twice more
slowly.
 - Josiah

As an aside, I have personally implemented trees a few times for
different reasons.  One of the issues I have with most tree
implementations is that it is generally difficult to do things like
"give me the kth smallest/largest item".  Of course the standard
solution is what is generally called a "partial order" or "order
statistic" tree, but then you end up with yet another field in your
structure.  One nice thing about Red-Black trees combined with
order-statistic trees, is that you can usually use the uppermost bit of
the order statistics for red/black information.  Of course, this is
really only interesting from an "implement this in C" perspective;
because if one were implementing it in Python, one may as well be really
lazy and not bother implementing guaranteed balanced trees and be
satisfied with randomized-balancing via Treaps.
Here that I have found through Google on a line " order statistic binary tree ":

   http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic20/

I have understood, that similar I have already made something :). The result of
the test shows on the most bad result, but is certainly worse, than a usual
tree. Badly that up to the end I have not tested this realization. Percent on
90 I am assured that it is efficient.

There is a question. What API should be used? Therefore as if to address to
object, as to MAP to object __getitem__ will return one result. And if as to
the list - another.

Here the list of already existing attributes and methods:

   class xtree(__builtin__.object)
    |  X-Tree. Binary tree with AVL balance mechanism.
    |
    |  Methods defined here:
    |
    |  __contains__(...)
    |      x.__contains__(y) <==> y in x
    |
    |  __delitem__(...)
    |      x.__delitem__(y) <==> del x[y]
    |
    |  __getitem__(...)
    |      x.__getitem__(y) <==> x[y]
    |
    |  __iter__(...)
    |      x.__iter__() <==> iter(x)
    |
    |  __len__(...)
    |      x.__len__() <==> len(x)
    |
    |  __repr__(...)
    |      x.__repr__() <==> repr(x)
    |
    |  __setitem__(...)
    |      x.__setitem__(i, y) <==> x[i]=y
    |
    |  append(...)
| D.append((k: v)) -> None, append new pair element into tree with sorting.
    |      Return pair element if argument is exist.
    |
    |  clear(...)
    |      D.clear() -> None. Remove all items from D.
    |
    |  copy(...)
    |      D.copy() -> a shallow copy of D.
    |
    |  get(...)
    |      D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
    |
    |  has_key(...)
    |      D.has_key(k) -> True if D has a key k, else False.
    |
    |  items(...)
    |      D.items() -> list of D's (key, value) pairs.
    |
    |  iteritems(...)
    |      D.iteritems() -> an iterator over the (key, value) items of D.
    |
    |  iterkeys(...)
    |      D.iterkeys() -> an iterator over the keys of D.
    |
    |  itervalues(...)
    |      D.itervalues() -> an iterator over the values of D.
    |
    |  keys(...)
    |      D.keys() -> list of D's keys.
    |
    |  pop(...)
| D.pop(k[,d]) -> v, remove specified key and return the corresponding value. | If key is not found, d is returned if given, otherwise KeyError is raised.
    |
    |  popitem(...)
| D.popmin() -> (k, v), remove and return minimal (key, value) pair as a
    |      2-tuple; but raise KeyError if D is empty.
    |
    |  popmax(...)
| D.popmax() -> (k, v), remove and return maximal (key, value) pair as a
    |      2-tuple; but raise KeyError if D is empty.
    |
    |  popmin(...)
| D.popmin() -> (k, v), remove and return minimal (key, value) pair as a
    |      2-tuple; but raise KeyError if D is empty.
    |
    |  pushmax(...)
| D.pushmax(item) -> (k, v), remove and return maximal (key, value) pair as a
    |      2-tuple; but raise KeyError if D is empty.
    |
    |  pushmin(...)
| D.pushmin(item) -> (k, v), remove and return minimal (key, value) pair as a
    |      2-tuple; but raise KeyError if D is empty.
    |
    |  rebuild(...)
| D.rebuild() -> None. Take compare function for rebuild dictionary.
    |      It works without use of additional memory on a place.
    |
    |  setdefault(...)
| D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D.
    |
    |  update(...)
| D.update(E) -> None. Update D from E and F: for k in E: D[k] = E[k]
    |      (if E has keys else: for (k, v) in E: D[k] = v).
    |
    |  values(...)
    |      D.values() -> list of D's values.
    |
| ----------------------------------------------------------------------
    |  Data descriptors defined here:
    |
    |  cmp
    |      function of comparison of objects
    |
    |  freeze
    |      to block an opportunity change of object.
    |
| ----------------------------------------------------------------------
    |  Data and other attributes defined here:
    |
    |  __new__ = <built-in method __new__ of type object at 0x282a4aa0>
    |      T.__new__(S, ...) -> a new object with type S, a subtype of T

Of course all of this discussion about trees in Python is fine, though
there is one major problem.  With minimal work, one can construct 3
values such that ((a < b < c) and (c < a)) is true.
The same problem is inherent in the standard dictionary - dict (), only
roots at it, it is others. For example:

class test:
       def __hash__(self):
               return random.randint(0, 1000000)

The binary tree is very easy for tearing down. Therefore when any reference
to it is made, blocking either on reading, or on record should be
established. If it is opened iterator on the given object also blocking on
reading should be established where iterator will not come to the end or
while it will not be closed or destroyed. Thus it is possible to guarantee
security of object. Functions of blocking are similar on
pthread_rdlock_trylock, and pthread_wrlock_trylock. If blocking is
impossible, exception RDLockError, WRLockError, FreezeLockError is
generated.

Thanks.

--
Vladimir Stepanov
#!/usr/local/bin/python

# Initialize modules
import avl
import xtree
import collections
import types
import sys

import timeit

if len(sys.argv) != 3:
        iterations = 1000000
        count = 10
else:
        iterations = int(sys.argv[1])
        count = int(sys.argv[2])

print "Iterations", iterations
print "Repeats", count
print

result_avl = timeit.Timer("""
import avl
a = avl.new()
for i in xrange(%d):
        a.insert(i)
""" % iterations).timeit(count)
print "object avl.new()", result_avl

result_xtree = timeit.Timer("""
import xtree
a = xtree.xtree()
for i in xrange(%d):
        a[i] = i
""" % iterations).timeit(count)
print "object xtree.xtree()", result_xtree

result_rbtree = timeit.Timer("""
import collections
a = collections.rbtree()
for i in xrange(%d):
        a[i] = i
""" % iterations).timeit(count)
print "object collections.rbtree()", result_rbtree

result_dict = timeit.Timer("""
import types
a = dict()
for i in xrange(%d):
        a[i] = i
""" % iterations).timeit(count)
print "object dict()", result_dict

print "       Dict   Rbtree Xtree  AVL"
print "Dict   %.2f   %.2f   %.2f   %.2f" % (
        float(result_dict)/result_dict,
        float(result_dict)/result_rbtree,
        float(result_dict)/result_xtree,
        float(result_dict)/result_avl)
print "Rbtree %.2f   %.2f   %.2f   %.2f" % (
        float(result_rbtree)/result_dict,
        float(result_rbtree)/result_rbtree,
        float(result_rbtree)/result_xtree,
        float(result_rbtree)/result_avl)
print "Xtree  %.2f   %.2f   %.2f   %.2f" % (
        float(result_xtree)/result_dict,
        float(result_xtree)/result_rbtree,
        float(result_xtree)/result_xtree,
        float(result_xtree)/result_avl)
print "AVL    %.2f   %.2f   %.2f   %.2f" % (
        float(result_avl)/result_dict,
        float(result_avl)/result_rbtree,
        float(result_avl)/result_xtree,
        float(result_avl)/result_avl)
Iterations 1000000
Repeats 10

object avl.new() 34.5930659771
object xtree.xtree() 16.2341389656
object collections.rbtree() 79.2434329987
object dict() 4.01597499847
       Dict   Rbtree Xtree  AVL
Dict   1.00   0.05   0.25   0.12
Rbtree 19.73   1.00   4.88   2.29
Xtree  4.04   0.20   1.00   0.47
AVL    8.61   0.44   2.13   1.00

Iterations 1000000
Repeats 10

object avl.new() 35.8164830208
object xtree.xtree() 18.5456049442
object collections.rbtree() 74.5128099918
object dict() 4.09129810333
       Dict   Rbtree Xtree  AVL
Dict   1.00   0.05   0.22   0.11
Rbtree 18.21   1.00   4.02   2.08
Xtree  4.53   0.25   1.00   0.52
AVL    8.75   0.48   1.93   1.00
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to