Greets,

Both of Lucy's present target platforms provide much of the functionality missing from C and present in Java -- for instance, portable filepath handling. We could also get that from APR a la Lucene4C, but while it may be possible to add a C target to Lucy someday that uses APR as its foundation, we don't need to complicate the install process by making APR a prerequisite for all targets.

There are a few dependencies I think we should bundle with Lucy:

  * Zlib
  * Snowball stemmers
  * some variant of vsnprintf

While Zlib is provided as part of core Perl and possibly as part of all other platforms Lucy might target, bundling it means we don't have to call back to the native API should we wish to access it from C, as we might if FieldsWriter and/or FieldsReader end up implemented in C.

The Snowball stemmers are also available via CPAN; I now maintain that distribution (Lingua::Stem::Snowball). However, other platforms probably won't have something like that available, and even within the Perl world, bundling Snowball gives us greater flexibility in how Lucy interacts with it.

We need vsnprintf for formatting error messages, which may include user-controllable input and are therefore ripe for buffer overflow attacks. There are many variants available -- see <http://www.ijs.si/software/snprintf/> for links to a few (some are outdated). We may be able to derive something from APR's implementation if we can't find one with a compatible license that we can just bundle and #include.
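
To make the concern concrete, here is a rough sketch of the kind of bounded formatting I have in mind. The name format_error is hypothetical, not anything in Lucy, and the sketch assumes a C99-conforming vsnprintf (one that always NUL-terminates when size > 0 and returns the length the full message would have had); a bundled variant would need to match those semantics.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only, not Lucy's API).
 * Formats an error message into buf without ever writing more than
 * `size` bytes.  Assumes C99 vsnprintf semantics.
 * Returns 1 if the whole message fit, 0 if it was truncated or if
 * formatting failed. */
static int
format_error(char *buf, size_t size, const char *pattern, ...)
{
    va_list args;
    int     needed;

    assert(buf != NULL && size > 0);

    va_start(args, pattern);
    needed = vsnprintf(buf, size, pattern, args);
    va_end(args);

    if (needed < 0) {
        /* Encoding or output error: leave an empty string behind. */
        buf[0] = '\0';
        return 0;
    }

    /* C99: `needed` excludes the terminating NUL, so the message fit
     * only if needed < size. */
    return (size_t)needed < size;
}
```

The caller should pass user-supplied text as an argument, e.g. format_error(buf, sizeof(buf), "bad field: %s", name), and never as the pattern itself, since a user-controlled pattern opens a different hole.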

If those are our only external dependencies, that implies we'll be building a lot from scratch. Here are some of the utilities we'll need to code up:

  * hashtable
  * priority queue
  * byte buffer (an array of bytes that knows its own length)
  * bit vector
  * external sort
  * C test harness
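
As an illustration of the byte buffer item, something along these lines might do. The ByteBuf type and the bytebuf_* function names are placeholders I made up for this sketch, and the doubling growth policy is just one reasonable choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type (illustration only): an array of bytes that knows
 * its own length, so it can hold embedded NULs. */
typedef struct ByteBuf {
    char   *ptr;  /* heap-allocated bytes */
    size_t  len;  /* bytes in use */
    size_t  cap;  /* bytes allocated */
} ByteBuf;

/* Allocate an empty buffer with at least one byte of capacity.
 * Returns NULL if allocation fails. */
static ByteBuf*
bytebuf_new(size_t cap)
{
    ByteBuf *self = malloc(sizeof(ByteBuf));
    if (self == NULL) { return NULL; }
    self->cap = cap ? cap : 1;
    self->len = 0;
    self->ptr = malloc(self->cap);
    if (self->ptr == NULL) { free(self); return NULL; }
    return self;
}

/* Append n bytes, growing the allocation as needed.
 * Returns 1 on success, 0 on overflow or allocation failure. */
static int
bytebuf_cat(ByteBuf *self, const void *bytes, size_t n)
{
    assert(self != NULL);
    if (n == 0) { return 1; }
    if (n > (size_t)-1 - self->len) { return 0; }  /* len + n would wrap */

    if (self->len + n > self->cap) {
        size_t  new_cap = self->cap;
        char   *new_ptr;
        while (new_cap < self->len + n) {
            if (new_cap > (size_t)-1 / 2) {
                /* Doubling would wrap; fall back to the exact size. */
                new_cap = self->len + n;
                break;
            }
            new_cap *= 2;
        }
        new_ptr = realloc(self->ptr, new_cap);
        if (new_ptr == NULL) { return 0; }
        self->ptr = new_ptr;
        self->cap = new_cap;
    }

    memcpy(self->ptr + self->len, bytes, n);
    self->len += n;
    return 1;
}

/* Release the buffer and its bytes; NULL is tolerated. */
static void
bytebuf_destroy(ByteBuf *self)
{
    if (self != NULL) {
        free(self->ptr);
        free(self);
    }
}
```

Because the length travels with the pointer, callers never need strlen, and appending binary data such as "ab\0cd" (5 bytes) works the same as appending text.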

How does that sound?

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
