Re: [Collections] [SUBMIT] Trie

Stephen Colebourne Thu, 05 Dec 2002 14:42:27 -0800

I am reluctantly minded to not include Trie in [collections]

Unfortunately, at present I am the only active committer for [collections].
This means that I am unable to get feedback from other committers (who know
collections in more detail than I).


My decision is based on a feeling that different Trie implementations may
differ significantly. I believe that you have put together a good
implementation which attempts to address this, but without other Trie
requirements (from other people) it is very difficult to know if your
interface definition is the right one.

Any new interface in [collections] also requires maintanance, and should
have a TrieUtils class with synchronized, unmodifiable etc. implementations.
This is quite a bit of work.

Thanks for the contribution, and if someone else suggests a Trie in the
future, we will have a good basis to start from.

Stephen

----- Original Message -----
From: "Rich Dougherty" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, December 05, 2002 1:31 AM
Subject: Re: [Collections] [SUBMIT] Trie


> >> I've written an implementation for a trie which I'd like to
> >> contribute. I've attached a simple implementation and a test
> >> case.
> >
> > I've taken a quick look at the code here, and that all looks fine.
> >
> > Unfortunately, I don't really know what I'm looking at! What I mean is
> > that I've never heard of a 'Trie' before, and am thus wondering as to
> > why I would use it, and is it general enough to be in [collections].
> >
> > In the past we have had discussions about Tree implementations (and I
> > have coded some before). This may be related, as the Trie appears to be
> > a recursive structure.
> >
> > Perhaps, some use cases as to why a Trie is useful, especially in a
> > general/server environment would be useful. Thanks.
>
> That sounds reasonable to me. :-)
>
> I wrote this trie to solve a specific problem for me. In a web
> application I'm writing I have a set or URIs that map to a handler for
> a request. I have mappings like:
>
>   / -> home page
>   /article -> article list page
>   /article/view -> view article page
>   /article/edit -> edit article page
>   /auth/login -> login page
>   /auth/logout -> logout page
>
> I wanted a data structure to store these mappings. I have written a
> class called SimplifiedURITrie which extends AbstractTransitionMapTrie
> to store mappings of these type.
>
> I submitted StringTrie as a concrete example of the Trie interface. I
> haven't submitted SimplifiedURITrie since I felt it was very specific
> to my application, but if you're interseted I've attached it. If you
> want to include it in Commons (which I doubt), then I'm happy to tidy
> it up a bit.
>
> Back to my application. A trie that held these mappings would look
> like this:
>
> + -> home page
> |
> +-article-+ -> article list page
> |         |
> |         +-view-+ -> view article page
> |         |
> |         +-edit-+ -> edit article page
> |
> +-auth-+
>        |
>        +-login-+ -> login page
>        |
>        +-logout-+ -> logout page
>
> Notice how the key (the URI) is broken into parts with one part being
> stored on each transition. All keys that share a common prefix hang
> off a particular node. See the diagram at
> http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Tree/Trie.html for an
> example with strings.
>
> For my application a trie was useful for a couple of reasons:
>
> - Space efficiency. Each part of the key is only stored once. I would
>   only need to store the transition "article" once even if I had
>   hundreds of URLs starting with "/article".
>
>   The actual space efficiency of a trie depends on how efficiently the
>   transitions are stored, and the nature of the data stored in it. A
>   dictionary of strings could be stored in a trie quite
>   efficiently. However, I wouldn't suggest the current StringTrie
>   implementation be used for this reason. It uses a HashMap for
>   storing the transitions from each node, which means it uses more
>   space than is strictly necessary.
>
> - Can efficiently find the entries whose keys are the prefix of a
>   given value. The method getLongest() returns the longest of these
>   keys. This is useful for me in my URI matching problem. If
>   someone gives me the URI "/article/view/43" I can search the trie
>   and find that the best match is "/article/view". (I'll pass the
>   "43" along to my handler so it can show the appropriate article.)
>
> I also added the method prefixEntrySet() to the interface because it
> was efficient to perform and seemed like it might be useful. Prompted
> by your question I did a bit of searching and found that it would be
> handy for substring matching using a "suffix trie":
>
>   http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Tree/Suffix/
>
> Cool huh.
>
> I'm not really a trie zealot, but I think a trie is a reasonably
> useful data structure. I couldn't find an implementation with a
> suitable license anywhere, so I wrote my own. I found an
> implementation in a servlet container written by someone at HP (open
> source, with heavy restrictions) and one in LimeWire (GPL, not LGPL or
> Apache). Also many implementations were too limited for me. Most
> operated only on Strings or byte arrays, I wanted something more
> general. I hope that helps.
>
> Rich
>
>


----------------------------------------------------------------------------
----


> --
> To unsubscribe, e-mail:
<mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail:
<mailto:[EMAIL PROTECTED]>


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Re: [Collections] [SUBMIT] Trie

Reply via email to