[Jprogramming] Re: Programming Digest, Vol 23, Issue 8

Elmer Fittery Sat, 04 Aug 2007 07:40:31 -0700

My thoughts on this are:

It isn't so much the fastest sort that I was interested in, but rather
the "trie" structure itself.  The sort is fast because of the "trie"
structure.


I found this description which might be of interest:

The term "trie" comes from the word retrieval, but is generally
pronounced like "try". A trie is a tree structure (or directed acyclic
graph), the nodes of which represent letters in a word. For example, the
final lookup for the word 'bob' would look something like
$ref->{'b'}{'o'}{'b'}{'00'} (the 00 being an end marker). Only nodes
which would represent words in the trie exist, making the structure
slightly smaller than a hash of the same data set.

The advantages of the trie over other data storage methods is that
lookup times are O(1) WRT the size of the index. For sparse data sets,
it is probably not as efficient as performing a binary search on a
sorted list, and for small files, it has a lot of overhead. The main
advantage (at least from my perspective) is that it provides a
relatively cheap method for finding a list of words in a large (45 GB),
dense data set which begin with a certain string.

The term "word" in this documentation can refer to one of two things:
either a reference to an array of strings, or a scalar which is not a
reference. In the case of the former, each element of the array is
treated as a "letter" of the "word". In the case of the latter, the
scalar is evaluated in string context and it is split into its component
letters. Return values of methods match the values of what is passed in
-- that is, if you call lookup() with an array reference, the return
value will be an array reference (if appropriate).



> ------------------------------
> 
> Message: 2
> Date: Thu, 2 Aug 2007 21:56:57 -0400
> From: "Devon McCormick" <[EMAIL PROTECTED]>
> Subject: Re: [Jprogramming] burstsort - using Burst Tries: fastest
>       sort?
> To: "Programming forum" <[email protected]>
> Message-ID:
>       <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> I would be happy to let Roger implement this for J, just as he has already
> implemented very good random number generators and factoring algorithms.
> 
> However, for my purposes, "grade" is usually more useful than "sort".
> 
> I don't mean this as a put-down.  Any real breakthrough in sorting is
> potentially
> of great benefit to us all, but not necessarily at the level of a J
> application.
> The power of the language is in encapsulating high-level concepts, not
> in the low-level
> details of the implementation of a well-known concept like sorting.


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

[Jprogramming] Re: Programming Digest, Vol 23, Issue 8

Reply via email to