Re: [Sugar-devel] I'm looking for a tree...

2009-06-08 Thread James Zaki
Ben,

Perhaps try communicating the problem you're trying to solve with the data
structure. ie, what you're storing, what would you be looking for when
searching stored objects...

This would yield better advice, and share information within the context of
sugar.


James.


2009/6/8 Benjamin M. Schwartz 

> I've had amazing difficulty communicating what I'm looking for here.
>
> Those closest thing is
>
> http://en.wikipedia.org/wiki/Rope_(computer_science)
>
> A rope is a binary tree that _imposes_ an ordering on its leaves that has
> nothing whatsoever to do with their values (the values are essentially
> opaque).  This is very unusual; almost all standard tree structures
> _derive_ an ordering _from_ the values.
>
> Unfortunately, all implementations of Ropes that I have seen so far are
> designed only to store Chars, whereas I need to store arbitrary python
> objects.  I'd like to avoid coding such a beast myself, but I may have no
> choice.  It would help if someone could identify Ropes as a specialization
> of a more general type of tree, but I have not seen any claims of this
> kind.
>
> The only property I'm looking for that Ropes don't intrinsically satisfy
> is "Reverse lookup".  That is to say, I would like to be able to hold a
> pointer to a particular object in the tree, and at some later time, walk
> back up the tree from that pointer to work out the object's current index.
>  It does seem like that should be doable with a Rope, especially if we
> move to the special case in which the leaves are "arrays of length 1".
>
> --Ben
>
>
> ___
> Sugar-devel mailing list
> Sugar-devel@lists.sugarlabs.org
> http://lists.sugarlabs.org/listinfo/sugar-devel
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-08 Thread Benjamin M. Schwartz
I've had amazing difficulty communicating what I'm looking for here.

Those closest thing is

http://en.wikipedia.org/wiki/Rope_(computer_science)

A rope is a binary tree that _imposes_ an ordering on its leaves that has
nothing whatsoever to do with their values (the values are essentially
opaque).  This is very unusual; almost all standard tree structures
_derive_ an ordering _from_ the values.

Unfortunately, all implementations of Ropes that I have seen so far are
designed only to store Chars, whereas I need to store arbitrary python
objects.  I'd like to avoid coding such a beast myself, but I may have no
choice.  It would help if someone could identify Ropes as a specialization
of a more general type of tree, but I have not seen any claims of this kind.

The only property I'm looking for that Ropes don't intrinsically satisfy
is "Reverse lookup".  That is to say, I would like to be able to hold a
pointer to a particular object in the tree, and at some later time, walk
back up the tree from that pointer to work out the object's current index.
 It does seem like that should be doable with a Rope, especially if we
move to the special case in which the leaves are "arrays of length 1".

--Ben



signature.asc
Description: OpenPGP digital signature
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-08 Thread James Zaki
Not sure if you have found an answer to this yet, but with more information
about what you're trying to do could help with communicating the right
solution. Again I think a hash table can answer your questions about
fast-lookup, but it just depends your chosen key, and other requirements you
might have about sorting.

If its not too sugar specific, feel free to contact me directly.

Regards,
James.


2009/6/7 Benjamin M. Schwartz 

> Lucian Branescu wrote:
> > Would an ordered dictionary otherwise be all right, or am I
> > misunderstanding your requirements? There are other implementations,
> > like 
> > http://www.xs4all.nl/~anthon/Python/ordereddict/
>
> No, an ordered dictionary is not enough.  All these ordered dictionaries
> are ordered either by time or by sorting the keys.  Neither will work for
> me.  The problem is simple: if I insert something at position 5, I need
> the object currently at position 5 to move to position 6, and the object
> currently at position 6 to move to position 7, etc.
>
> To accomplish this in an time-ordered odict, I would have to remove all
> keys subsequent to the insertion point, make the insertion, and then
> re-insert those keys.  That's O(n).
>
> To accomplish this in a sorted-order odict, I would have to generate a new
> key that causes my insertion to occur at the right location.  If there are
> repeated insertions at the same location, generating such keys becomes an
> O(n) operation.  To see this, suppose I am repeatedly inserting at the
> second position.  Initially, the keys are (0,1).  The first insertion has
> to pick a key between 0 and 1, e.g. 0.5.  The second insertion has to pick
> a key between 0 and 0.5: e.g. 0.25.  The number of bits required to store
> these keys to sufficient precision increases by one bit on each insertion.
>  This means that the length of the keys is O(n), so every comparison is
> O(n).
>
> --Ben
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-08 Thread Albert Cahalan
Benjamin M. Schwartz writes:

> I am looking for a fast data structure with the following properties:
>
> Maintains an indexed list of arbitrary, non-ordered objects
> (like a python List or C array)
> Allows fast:
> Insertion at any location
> Deletion at any location
> Lookup of an object by its index
> Reverse lookup, to determine the index of an object

It may have been better to describe what you wanted to do.
For example:

Might the amount of data ever exceed the size of RAM?

What data type is an index? (string, int, arbitrary blob...)

What data type is an object? (string, int, arbitrary blob...)

How can you say "Insertion at any location" yet "non-ordered"?
This doesn't seem to make sense. Did you change your mind?

> Python's List has O(1) lookup, but O(N) insert, delete, and
> reverse-lookup.

Heh. A "high level" language is supposed to not have that
kind of problem. Try perl. :-)

> To make reverse lookup O(1) I could maintain a separate
> Dict mapping objects to indices, but this would cost an
> additional O(N) on every insertion and deletion.

If it did, you're still O(N) because O(N)+O(N)==O(N).
It should not take O(N) for insert/delete, even if you
decide you want things ordered.

Reverse mapping is easy: store a copy of the index in (with)
the object.

> A linked list has O(1) insertion and deletion,

I wouldn't say that. You have to find the object's location
to do that, so you need to include the O(N) lookup cost.

> A standard self-balancing tree cannot be used because the objects
> are not ordered, and self-balancing trees require ordered keys.
> I could use the index of an object as the sort key, but then
> insertion and deletion are O(N) because all subsequent keys must
> be altered.  I could fabricate new sort keys to ensure that
> insertions occur at the desired location, but then the length of
> the keys will grow like O(N), making all operations at least O(N).

Standard self-balancing trees don't have these problems.
Neither do hash tables, assuming reasonable table size.

Here you go:

http://en.wikipedia.org/wiki/Judy_array
http://en.wikipedia.org/wiki/Red-black_tree
http://en.wikipedia.org/wiki/AVL_tree
http://en.wikipedia.org/wiki/Splay_tree
http://en.wikipedia.org/wiki/B%2B_tree
http://en.wikipedia.org/wiki/Radix_tree
http://en.wikipedia.org/wiki/Scapegoat_tree
http://en.wikipedia.org/wiki/AA_tree
http://en.wikipedia.org/wiki/Heap_(data_structure)
http://en.wikipedia.org/wiki/Skip_list
http://en.wikipedia.org/wiki/Hash_table

If you end up swapping or otherwise going to disk, pick a
tree with a large branching factor.
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Benjamin M. Schwartz
Lucian Branescu wrote:
> Would an ordered dictionary otherwise be all right, or am I
> misunderstanding your requirements? There are other implementations,
> like http://www.xs4all.nl/~anthon/Python/ordereddict/

No, an ordered dictionary is not enough.  All these ordered dictionaries
are ordered either by time or by sorting the keys.  Neither will work for
me.  The problem is simple: if I insert something at position 5, I need
the object currently at position 5 to move to position 6, and the object
currently at position 6 to move to position 7, etc.

To accomplish this in an time-ordered odict, I would have to remove all
keys subsequent to the insertion point, make the insertion, and then
re-insert those keys.  That's O(n).

To accomplish this in a sorted-order odict, I would have to generate a new
key that causes my insertion to occur at the right location.  If there are
repeated insertions at the same location, generating such keys becomes an
O(n) operation.  To see this, suppose I am repeatedly inserting at the
second position.  Initially, the keys are (0,1).  The first insertion has
to pick a key between 0 and 1, e.g. 0.5.  The second insertion has to pick
a key between 0 and 0.5: e.g. 0.25.  The number of bits required to store
these keys to sufficient precision increases by one bit on each insertion.
 This means that the length of the keys is O(n), so every comparison is O(n).

--Ben



signature.asc
Description: OpenPGP digital signature
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Lucian Branescu
Would an ordered dictionary otherwise be all right, or am I
misunderstanding your requirements? There are other implementations,
like http://www.xs4all.nl/~anthon/Python/ordereddict/

2009/6/7 Benjamin M. Schwartz :
> Lucian Branescu wrote:
>> This http://www.python.org/dev/peps/pep-0372/ might be interesting.
>> Perhaps it could get backported to 2.5.
>>
>> But it still has O(n) deletion.
>
> It also doesn't have insertion at all (only append), and indexing (and
> reverse indexing) is O(n).
>
> --Ben
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Carol Farlow Lerche
hash chaining only approaches order n if the table is very full (because of
collisions).

On Sun, Jun 7, 2009 at 8:07 AM, Benjamin M. Schwartz <
bmsch...@fas.harvard.edu> wrote:

> Lucian Branescu wrote:
> > This http://www.python.org/dev/peps/pep-0372/ might be interesting.
> > Perhaps it could get backported to 2.5.
> >
> > But it still has O(n) deletion.
>
> It also doesn't have insertion at all (only append), and indexing (and
> reverse indexing) is O(n).
>
> --Ben
>
>
> ___
> Sugar-devel mailing list
> Sugar-devel@lists.sugarlabs.org
> http://lists.sugarlabs.org/listinfo/sugar-devel
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Benjamin M. Schwartz
Lucian Branescu wrote:
> This http://www.python.org/dev/peps/pep-0372/ might be interesting.
> Perhaps it could get backported to 2.5.
> 
> But it still has O(n) deletion.

It also doesn't have insertion at all (only append), and indexing (and
reverse indexing) is O(n).

--Ben



signature.asc
Description: OpenPGP digital signature
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Lucian Branescu
This http://www.python.org/dev/peps/pep-0372/ might be interesting.
Perhaps it could get backported to 2.5.

But it still has O(n) deletion.

2009/6/7 Benjamin M. Schwartz :
> James Zaki wrote:
>> Taking a guess with the information you've given perhaps a hash
>> tablecould help?
>
> Python uses the term "Dict" to describe its built-in hash table.  I do
> think a hash table could be helpful, for example, to maintain the reverse
> lookup mapping if the forward lookup is stored in a List (which is
> actually python's version of a dynamic array, like C++'s Vector).
> However, I am still stuck with O(N) performance in this case for insertion
> and deletion, because each such modification shifts the position of all
> subsequent objects.  This would correspond to changing the value
> associated with half the keys in the hash table, on average, for each
> insertion.
>
> I was hoping for a structure with log-time performance.
>
> --Ben
>
>
> ___
> Sugar-devel mailing list
> Sugar-devel@lists.sugarlabs.org
> http://lists.sugarlabs.org/listinfo/sugar-devel
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Sean DALY
Have you looked into an awk associative array? I understand it is
stored internally as a hash table. I remember reading about a
simulated multidimensional array (index,subscript); inserting a record
in that case didn't involve any reindexing.

Older awks were considered slow though, I have no idea if that would
be in the running performancewise.

Sean


On Sun, Jun 7, 2009 at 3:39 PM, Benjamin M.
Schwartz wrote:
> James Zaki wrote:
>> Taking a guess with the information you've given perhaps a hash
>> tablecould help?
>
> Python uses the term "Dict" to describe its built-in hash table.  I do
> think a hash table could be helpful, for example, to maintain the reverse
> lookup mapping if the forward lookup is stored in a List (which is
> actually python's version of a dynamic array, like C++'s Vector).
> However, I am still stuck with O(N) performance in this case for insertion
> and deletion, because each such modification shifts the position of all
> subsequent objects.  This would correspond to changing the value
> associated with half the keys in the hash table, on average, for each
> insertion.
>
> I was hoping for a structure with log-time performance.
>
> --Ben
>
>
> ___
> Sugar-devel mailing list
> Sugar-devel@lists.sugarlabs.org
> http://lists.sugarlabs.org/listinfo/sugar-devel
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread Benjamin M. Schwartz
James Zaki wrote:
> Taking a guess with the information you've given perhaps a hash
> tablecould help?

Python uses the term "Dict" to describe its built-in hash table.  I do
think a hash table could be helpful, for example, to maintain the reverse
lookup mapping if the forward lookup is stored in a List (which is
actually python's version of a dynamic array, like C++'s Vector).
However, I am still stuck with O(N) performance in this case for insertion
and deletion, because each such modification shifts the position of all
subsequent objects.  This would correspond to changing the value
associated with half the keys in the hash table, on average, for each
insertion.

I was hoping for a structure with log-time performance.

--Ben



signature.asc
Description: OpenPGP digital signature
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] I'm looking for a tree...

2009-06-07 Thread James Zaki
Hey Benjamin,

Taking a guess with the information you've given perhaps a hash
tablecould help?

Fast lookup/retrieval, but just have to consider what you would enumerate as
the key, and loading.

Let me know if you want some help applying this, memories of a uni
assignment have just come flooding back.


James.



2009/6/7 Benjamin M. Schwartz 

> I am looking for a fast data structure with the following properties:
>
> Maintains an indexed list of arbitrary, non-ordered objects (like a python
> List or C array)
> Allows fast:
> Insertion at any location
> Deletion at any location
> Lookup of an object by its index
> Reverse lookup, to determine the index of an object
>
> Python's List has O(1) lookup, but O(N) insert, delete, and
> reverse-lookup.  To make reverse lookup O(1) I could maintain a separate
> Dict mapping objects to indices, but this would cost an additional O(N) on
> every insertion and deletion.
>
> A linked list has O(1) insertion and deletion, but O(N) lookup and O(N)
> reverse lookup.  I could maintain a separate Dict for the forward and
> reverse mappings, but this would cost O(N) on every insertion and deletion.
>
> A standard self-balancing tree cannot be used because the objects are not
> ordered, and self-balancing trees require ordered keys.  I could use the
> index of an object as the sort key, but then insertion and deletion are
> O(N) because all subsequent keys must be altered.  I could fabricate new
> sort keys to ensure that insertions occur at the desired location, but
> then the length of the keys will grow like O(N), making all operations at
> least O(N).
>
> I feel like there should be some kind of standard tree-like data structure
> that meets my requirements, but I can't find one.  Do you know of one?  Am
> I on a unicorn hunt?
>
> --Ben
>
>
> ___
> Sugar-devel mailing list
> Sugar-devel@lists.sugarlabs.org
> http://lists.sugarlabs.org/listinfo/sugar-devel
>
>
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel