Re: Map with maintained insertion order

2012-03-23 Thread Jonathan M Davis
On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
> Does someone have a map implementation that maintains the insertion
> order of the keys?
> 
> E.g.:
> 
> string[string] map;
> map["foo"] = "x";
> map["bar"] = "y";
> 
> When iterating over map keys I want to visit "foo" first, then "bar", and
> so on.

I can't think of any data structure that does that off the top of my head, 
though there may very well be one. If you don't mind the extra memory 
overhead, it wouldn't be all that hard to do it with multiple maps wrapped 
in a single type. You could do something like

class InsertedOrderMap
{
string[string] _map;
RedBlackTree!(Tuple!(int, string), "a[0] < b[0]") _insertionOrder;

...
}

_map holds the normal map and _insertionOrder holds pairs of the insertion 
ordered and the key to _map.

You can then have your range iterate over _insertionOrder and use the keys 
to return either the keys, values, or pairs of keys and values from _map, 
depending on what you want to iterate over.

That _does_ require having two data structures in one, and there may a be a 
better way to do it, but just off the top of my head, that's the best that I 
can come up with. I don't know how you could have a data structure where 
it's efficient to index by both insertion order and key without effectively 
having two data structures. But I really should go reread my data structures 
books. It's been too long since I studied that stuff.

- Jonathan M Davis


Re: Map with maintained insertion order

2012-03-23 Thread Andrej Mitrovic
On 3/23/12, Andrej Mitrovic  wrote:
> Does someone have a map implementation that maintains the insertion
> order of the keys?

Here's a sloppy implementation:

module test;

import std.stdio;
import std.traits;

template KeyType(V : V[K], K)
if (isAssociativeArray!(V[K]))
{
alias K KeyType;
}

template ValueType(V : V[K], K)
if (isAssociativeArray!(V[K]))
{
alias V ValueType;
}

template BaseElement(T : U[], U)
{
alias U BaseElement;
}

struct LinkedHash(T)
if (isAssociativeArray!T)
{
alias KeyType!T Key;
alias ValueType!T Value;

static if (isArray!Value)
template isValue(T) { enum bool isValue = (is(T == Value) ||
is(BaseElement!Value == T)); }
else
template isValue(T) { enum bool isValue = is(T == Value); }

Value[] payload;
size_t[Key] keyToIndex;
Key[size_t] indexToKey;
size_t lastIndex;

size_t getNewIndex()
{
payload.length = payload.length + 1;
return lastIndex++;
}

size_t* getKeyIndex(string key)
{
if (auto index = key in keyToIndex)
return index;
else
return null;
}

Value opIndex(in Key key)
{
if (auto index = getKeyIndex(key))
return payload[*index];
else
assert(0, "Key '" ~ key ~ "' not found.");
}

void opIndexAssign(V)(V value, Key key)
if (isValue!V)
{
if (auto index = getKeyIndex(key))
payload[*index] = value;
else
{
auto index = getNewIndex();
payload[index] = value;
keyToIndex[key] = index;
indexToKey[index] = key;
}
}

void opIndexOpAssign(string op, V)(V value, Key key)
if (isValue!V)
{
if (auto index = getKeyIndex(key))
mixin("payload[*index] " ~ op ~ "= value;");
else
{
auto index = getNewIndex();
mixin("payload[index] " ~ op ~ "= value;");
keyToIndex[key] = index;
indexToKey[index] = key;
}
}

int opApply(scope int delegate(ref Key, ref Value) dg)
{
foreach (index; 0 .. lastIndex)
{
auto result = dg(indexToKey[index], payload[index]);
if (result)
return result;
}

return 0;
}
}

void main()
{
auto names = ["1", "2", "3", "4", "5", "6", "7", "8"];

LinkedHash!(string[string]) stringMap;
foreach (name; names)
stringMap[name] ~= "test";

foreach (key, val; stringMap)
writefln("%s: %s", key, val);

LinkedHash!(string[][string]) stringArrMap;
stringArrMap["100"] = ["foo"];
stringArrMap["101"] ~= "bar";

foreach (name; names)
stringArrMap[name] ~= "test";

foreach (key, val; stringArrMap)
writefln("%s: %s", key, val);
}


Re: Map with maintained insertion order

2012-03-23 Thread Andrej Mitrovic
On 3/24/12, Jonathan M Davis  wrote:
> I can't think of any data structure that does that off the top of my head

Java has it and they call it a LinkedHashMap:
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/LinkedHashMap.html

> That _does_ require having two data structures in one, and there may a be a
> better way to do it, but just off the top of my head, that's the best that I
> can come up with. I don't know how you could have a data structure where
> it's efficient to index by both insertion order and key without effectively
> having two data structures.

Yeah it's not efficient, but in my use-case I'm not looking so much
for efficiency but convenience. Sometimes I have to look up a key
based on a name to modify some values, and I have to keep the order of
the keys.


Re: Map with maintained insertion order

2012-03-23 Thread H. S. Teoh
On Fri, Mar 23, 2012 at 06:07:42PM -0700, Jonathan M Davis wrote:
> On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
> > Does someone have a map implementation that maintains the insertion
> > order of the keys?
> > 
> > E.g.:
> > 
> > string[string] map;
> > map["foo"] = "x";
> > map["bar"] = "y";
> > 
> > When iterating over map keys I want to visit "foo" first, then
> > "bar", and so on.
> 
> I can't think of any data structure that does that off the top of my
> head, though there may very well be one.
[...]

Yes there is one. It's called a B-tree. And it's probably total overkill
for what you need. :-) But also, B-trees have O(log n) access time, not
O(1). :-(


T

-- 
Right now I'm having amnesia and deja vu at the same time. I think I've 
forgotten this before.


Re: Map with maintained insertion order

2012-03-23 Thread H. S. Teoh
On Fri, Mar 23, 2012 at 06:33:54PM -0700, H. S. Teoh wrote:
> On Fri, Mar 23, 2012 at 06:07:42PM -0700, Jonathan M Davis wrote:
> > On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
> > > Does someone have a map implementation that maintains the insertion
> > > order of the keys?
> > > 
> > > E.g.:
> > > 
> > > string[string] map;
> > > map["foo"] = "x";
> > > map["bar"] = "y";
> > > 
> > > When iterating over map keys I want to visit "foo" first, then
> > > "bar", and so on.
> > 
> > I can't think of any data structure that does that off the top of my
> > head, though there may very well be one.
> [...]
> 
> Yes there is one. It's called a B-tree. And it's probably total overkill
> for what you need. :-) But also, B-trees have O(log n) access time, not
> O(1). :-(
[...]

Actually, nevermind that. B-trees are probably not what you're looking
for at all, and they aren't really suited to the task anyway (you want
to sort by insertion order, not key order).

What you *could* do is implement a hash augmented with next/prev
pointers in each slot, IOW, a linked list. The hash itself stores a
pointer to the head and tail of the list. When a new slot is created,
the tail of the list is updated to link to it. When a slot is deleted,
its neighbours are updated. This will maintain O(1) performance of the
hash.

In fact, you could use the current AA implementation if you wrap your
data in a struct that contains the extra pointers, and wrap around the
AA operations to update these pointers as needed.


T

-- 
Valentine's Day: an occasion for florists to reach into the wallets of nominal 
lovers in dire need of being reminded to profess their hypothetical love for 
their long-forgotten.


Re: Map with maintained insertion order

2012-03-23 Thread Andrej Mitrovic
On 3/24/12, Andrej Mitrovic  wrote:
> On 3/23/12, Andrej Mitrovic  wrote:
>> Does someone have a map implementation that maintains the insertion
>> order of the keys?
>
> Here's a sloppy implementation:

I forgot to make opIndex ref, but also the isValue check fails on
structs with alias this, it's better to use is(typeof()) instead of
directly checking the type. The fix would be:

static if (isArray!Value)
{
template isValue(T)
{
   enum bool isValue = is(typeof( { Value v; T t; v = t; } )) ||
is(BaseElement!Value == T);
}
}
else
{
template isValue(T) { enum bool isValue = is(typeof( { Value v; T
t; v = t; } )); }
}


Re: Map with maintained insertion order

2012-03-23 Thread Andrej Mitrovic
On 3/24/12, Andrej Mitrovic  wrote:
> static if (isArray!Value)
> {
> template isValue(T)
> {
>enum bool isValue = is(typeof( { Value v; T t; v = t; } )) ||
> is(BaseElement!Value == T);
> }
> }

Correction, it's:
template isValue(T)
{
enum bool isValue = is(typeof( { Value v; T t; v = t; } ))
|| is(typeof( { BaseElement!Value v; T t; v = t; } ))
|| is(BaseElement!Value == T);
}

Complicated beast, but I have to take into acount both 'alias this'
tricks and the fact that ~= works when RHS is either an element type
or an array.


Re: Map with maintained insertion order

2012-03-23 Thread Andrej Mitrovic
On 3/24/12, Andrej Mitrovic  wrote:
> Correction, it's:
> template isValue(T)
> {
> enum bool isValue = is(typeof( { Value v; T t; v = t; } ))
> || is(typeof( { BaseElement!Value v; T t; v = t; } ))
> || is(BaseElement!Value == T);
> }

Last check not needed, so:
> template isValue(T)
> {
> enum bool isValue = is(typeof( { Value v; T t; v = t; } ))
> || is(typeof( { BaseElement!Value v; T t; v = t; } ));
> }

Anyway I'll stop spamming.


Re: Map with maintained insertion order

2012-03-23 Thread Brad Anderson
On Saturday, 24 March 2012 at 01:07:56 UTC, Jonathan M Davis 
wrote:

On Friday, March 23, 2012 23:48:51 Andrej Mitrovic wrote:
Does someone have a map implementation that maintains the 
insertion

order of the keys?

E.g.:

string[string] map;
map["foo"] = "x";
map["bar"] = "y";

When iterating over map keys I want to visit "foo" first, then 
"bar", and

so on.


I can't think of any data structure that does that off the top 
of my head,
though there may very well be one. If you don't mind the extra 
memory
overhead, it wouldn't be all that hard to do it with multiple 
maps wrapped

in a single type. You could do something like

class InsertedOrderMap
{
string[string] _map;
RedBlackTree!(Tuple!(int, string), "a[0] < b[0]") 
_insertionOrder;


...
}

_map holds the normal map and _insertionOrder holds pairs of 
the insertion

ordered and the key to _map.

You can then have your range iterate over _insertionOrder and 
use the keys
to return either the keys, values, or pairs of keys and values 
from _map,

depending on what you want to iterate over.

That _does_ require having two data structures in one, and 
there may a be a
better way to do it, but just off the top of my head, that's 
the best that I
can come up with. I don't know how you could have a data 
structure where
it's efficient to index by both insertion order and key without 
effectively
having two data structures. But I really should go reread my 
data structures

books. It's been too long since I studied that stuff.

- Jonathan M Davis


Ruby's Hash in 1.9 maintains insertion order.  It does it using a 
linked list between the nodes as they are created (so yes, 
essentially two data structures).


Regards,
Brad Anderson


Re: Map with maintained insertion order

2012-03-25 Thread Michael Rynn
On Fri, 23 Mar 2012 23:48:51 +0100, Andrej Mitrovic wrote:

> Does someone have a map implementation that maintains the insertion
> order of the keys?
> 
> E.g.:
> 
> string[string] map;
> map["foo"] = "x";
> map["bar"] = "y";
> 
> When iterating over map keys I want to visit "foo" first, then "bar",
> and so on.

http://bazaar.launchpad.net/~michael-rynn-500/d2-xml/trunk/view/head:/alt/
arraymap.d

I took and adapted the idea from Ultimate++ library code. Ultimate++ is 
worth a try, even if it is C++.

arraymap.d implements a hashmap, that will maintain insertion order, but 
only if no deletions are made.

Each added key-value pair is appended into a separate key and values 
array.

Also maintained is an array of hash_t.

Also maintained is an array of index integers. Matching hash keys are 
related by the linked list method (no pointers).

So starting with an empty map, each insertion appends the key and value 
array storage.   But a key removal will create a hole, and the hole is 
added to the free links list.  The hash_t array points to one of the 
linked list entries which indexes the key/value array.

arraymap.d would appear to have the property, that the hash values cannot 
be mistaken as aliased pointers by the GC, because they can be stored in 
a non-scanned array.

In the same module is a template version copycat implementation of 
druntime AA, aahash.d  with some customizations that I have played around 
with, like being able to use char[] for lookups to string keys (but not 
for op_put). aahash module requires hashutil.d and blockheap.  arraymap 
requires hashutil.d


In performance tests the arraymap.d was slower than the D runtime look 
alike, but acceptable.  I think it has scope for improvement in coding 
and performance.













Re: Map with maintained insertion order

2012-03-26 Thread Richard Webb
I've done that in C++ before using Boost.MultiIndex, and i saw a 
post about a D port of that recently (I don't have the link to 
hand though).