Hi Eric,

Given the likely need to map back from an alternate name (string search in
the definition?) to the auth name (maybe the most common use for such a
service?), I think this route might be on the inefficient side.
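
To make the concern concrete, here is a minimal sketch of the reverse lookup, assuming the authority data is already available as a mapping from authorized heading to its alternate names. The function names and the sample headings used with it are hypothetical, not taken from the attached scripts. The index is built once, so each lookup is a dictionary hit rather than a string search through every definition.

```python
from collections import defaultdict


def build_reverse_index(records):
    """Map each alternate name (case-folded) to the authorized headings that list it.

    `records` is assumed to be a dict: authorized heading -> list of alternate names.
    """
    index = defaultdict(set)
    for heading, alternates in records.items():
        for alternate in alternates:
            index[alternate.casefold()].add(heading)
    return index


def authorized_for(index, name):
    """Return the authorized headings for an alternate name, sorted; [] if unknown."""
    return sorted(index.get(name.casefold(), ()))
```

A set is used per alternate name because the same alternate form could, in principle, point at more than one authorized heading.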

I've been wondering about names as handles, with a CrossRef-like middleman
piece, but I haven't done anything about such ideas.


On Mon, 31 Mar 2008, Eric Lease Morgan wrote:

Over the weekend I had fun with the DICT protocol, a DICT server, a
DICT client, and the creation of dictionaries for the aforementioned.

The DICT protocol seems to be a simple client/server protocol for
searching remote content and returning "definitions" of the query.
[1] I was initially drawn to the protocol for its content.
Specifically, I wanted a dictionary because I thought it would be
useful in a "next generation" library catalog application. The server
was trivial to install because it is available via yum. Since it is a
protocol there are a number of clients and libraries available.
There's also bunches o' data to be had, albeit a bit dated. Some of
it includes: 1913 dictionary, version 2.0 of WordNet, the CIA World
Fact Book (2000), Moby's Thesaurus, a gazetteer, and quite a number
of English to other dictionaries.
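
For a sense of how simple the protocol is, the sketch below builds a MATCH command line and parses the list of matches a server sends back, following the reply layout described in RFC 2229 (a 152 status line, one `database "headword"` pair per line, then a line holding a single period). It does no networking, the function names are hypothetical, and headwords containing embedded quotes are not handled.

```python
def match_command(database, strategy, word):
    """Build a MATCH command line; words containing whitespace are double-quoted.

    Simplification: embedded double quotes in `word` are not escaped.
    """
    if any(ch.isspace() for ch in word):
        word = '"' + word + '"'
    return f"MATCH {database} {strategy} {word}\r\n"


def parse_match_response(lines):
    """Return (database, headword) pairs from a 152 reply, or [] for anything else."""
    lines = [line.rstrip("\r\n") for line in lines]
    if not lines or not lines[0].startswith("152"):
        return []
    matches = []
    for line in lines[1:]:
        if line == ".":  # a lone period ends the textual body
            break
        database, _, headword = line.partition(" ")
        matches.append((database, headword.strip('"')))
    return matches
```

A real client would send the command over a TCP connection to the server (the registered port is 2628) and feed the received lines to the parser.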

What's interesting is the DICT protocol data is not limited to
"dictionaries" as the Fact Book exemplifies. The data really only has
two fields: headword (key), and note (definition). After thinking
about it, I thought authority lists would be a pretty good candidate
for DICT. The headword would be the term, and the definition would be
the See From and See Also listings.
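
The headword/definition mapping might look something like the sketch below, which renders one authority record as a headword line followed by an indented definition. The layout (headword in column 0, definition lines indented) and the "See From"/"See Also" labels are assumptions chosen for illustration; the attached scripts may format entries differently.

```python
def format_entry(heading, see_from=(), see_also=()):
    """Render one authority record as a headword plus an indented definition."""
    lines = [heading]  # the headword starts in column 0
    for label, terms in (("See From", see_from), ("See Also", see_also)):
        for term in terms:
            lines.append(f"    {label}: {term}")  # definition lines are indented
    return "\n".join(lines) + "\n"
```

Calling `format_entry("Blues (Music)", see_from=["Blues (Songs, etc.)"], see_also=["Jazz"])` yields a three-line entry with the heading on the first line; the headings here are sample values only.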

Off on an adventure, I downloaded subject authorities from FRED. [2]
I used a shell script to loop through my data (subjects2dictd,
attached) which employed XSLT to parse the MARCXML
(subjects2dict.xsl, attached) and then ran various dict* utilities.
The end result is a "dictionary" query-able with your favorite DICT
client. From a Linux shell, try:

dict -h -d subjects -s substring blues
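
The extraction step described above was done with XSLT; as a rough equivalent, the sketch below pulls the same three pieces out of MARCXML with Python's standard library. It assumes topical subject authorities, where field 150 carries the heading, 450 the See From tracings, and 550 the See Also tracings, and it reads only subfield $a, so subdivisions are ignored. The function names are hypothetical and this is not a transcription of subjects2dict.xsl.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_NS}


def subfield_a_values(record, tag):
    """Collect the text of subfield $a from every datafield with the given tag."""
    values = []
    for datafield in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        subfield = datafield.find("marc:subfield[@code='a']", NS)
        if subfield is not None and subfield.text:
            values.append(subfield.text.strip())
    return values


def parse_authorities(xml_text):
    """Return one dict per record: heading, See From list, See Also list."""
    root = ET.fromstring(xml_text)
    entries = []
    for record in root.iter(f"{{{MARC_NS}}}record"):
        headings = subfield_a_values(record, "150")
        if not headings:
            continue  # skip records without a topical heading
        entries.append({
            "heading": headings[0],
            "see_from": subfield_a_values(record, "450"),
            "see_also": subfield_a_values(record, "550"),
        })
    return entries
```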

While I think this is pretty kewl, I wonder whether or not DICT is
the correct approach. Maybe I should use a more robust, full-text
indexer for this problem? After all, DICT servers only look at the
headword when searching, not the definitions. On the other hand, DICT
was *pretty* easy to get up and running, and authority lists are a
type of dictionary.

[1] http://www.dict.org
[2] http://www.ibiblio.org/fred2.0/authorities/

Eric Lease Morgan
University Libraries of Notre Dame

Attachment: subjects2dictd
Description: Binary data

Attachment: subjects2dict.xsl
Description: Binary data
