Greets,
Both of Lucy's present target platforms provide much of the
functionality missing from C and present in Java -- for instance,
portable filepath handling. We could also get that from APR a la
Lucene4C, but while it may be possible to add a C target to Lucy
someday that uses APR as its foundation, we don't need to complicate
the install process by making APR a prerequisite for all targets.
There are a few dependencies I think we should bundle with Lucy:
* Zlib
* Snowball stemmers
* some variant of vsnprintf
While Zlib is provided as part of core Perl and possibly as part of
all other platforms Lucy might target, bundling it means we don't
have to call back to the native API should we wish to access it from
C, as we might if FieldsWriter and/or FieldsReader end up implemented
in C.
The Snowball stemmers are also available via CPAN; I now maintain
that distribution (Lingua::Stem::Snowball). However, other platforms
probably won't have something like that available, and even within
the Perl world, bundling Snowball means greater flexibility with
regard to how Lucy interacts with it.
We need vsnprintf for formatting error messages, which may include
user-controllable input and are therefore ripe for buffer overflow
attacks. There are many variants available -- see
<http://www.ijs.si/software/snprintf/> for links to a few (some are
outdated). We may be able to derive something from APR's
implementation if we can't find one with a compatible license that we
can just bundle and #include.
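To make the vsnprintf point concrete, here is a rough sketch of a bounded error formatter. The name format_error and its return convention are made up for illustration and are not existing Lucy code. It also assumes C99 vsnprintf semantics (the return value is the length the full message would have had); some older platform implementations behave differently, which is part of the argument for bundling one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an error message into a caller-supplied
 * buffer without ever writing past buf_size bytes.
 * Returns 0 if the message fit, 1 if it was truncated, -1 on a
 * formatting error. Assumes C99 vsnprintf return-value semantics. */
static int
format_error(char *buf, size_t buf_size, const char *pattern, ...)
{
    va_list args;
    int     needed;

    assert(buf != NULL && buf_size > 0);

    va_start(args, pattern);
    needed = vsnprintf(buf, buf_size, pattern, args);
    va_end(args);

    if (needed < 0) {
        buf[0] = '\0';   /* leave a valid empty string on failure */
        return -1;
    }
    /* C99: needed excludes the terminating NUL, so needed >= buf_size
     * means the output was cut short (but is still NUL-terminated). */
    return (size_t)needed >= buf_size ? 1 : 0;
}
```

A caller would pass user-supplied text only as an argument, never as the pattern itself, e.g. format_error(buf, sizeof(buf), "bad field: %s", name).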
If those are our only external dependencies, that implies we'll be
building a lot from scratch. Here are some of the utilities we'll
need to code up:
* hashtable
* priority queue
* byte buffer (an array of bytes that knows its own length)
* bit vector
* external sort
* C test harness
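As an illustration of the scale of these utilities, here is a minimal sketch of the byte buffer item. The ByteBuf struct and the bytebuf_* function names are placeholders I made up, not a proposed API; the doubling growth policy is likewise just an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte buffer: an array of bytes that knows its own
 * length, so embedded NULs are fine and strlen is never needed. */
typedef struct ByteBuf {
    char   *ptr;   /* heap-allocated storage */
    size_t  len;   /* bytes currently in use */
    size_t  cap;   /* bytes allocated */
} ByteBuf;

/* Allocate an empty buffer with room for at least one byte. */
static ByteBuf*
bytebuf_new(size_t cap)
{
    ByteBuf *self = malloc(sizeof(ByteBuf));
    if (self == NULL) { return NULL; }
    self->cap = cap > 0 ? cap : 1;
    self->len = 0;
    self->ptr = malloc(self->cap);
    if (self->ptr == NULL) {
        free(self);
        return NULL;
    }
    return self;
}

/* Append n bytes, growing the storage by doubling when necessary.
 * Returns 0 on success, -1 on overflow or allocation failure. */
static int
bytebuf_cat(ByteBuf *self, const void *bytes, size_t n)
{
    size_t needed;

    assert(self != NULL);
    if (n == 0) { return 0; }
    if (n > (size_t)-1 - self->len) { return -1; }  /* size_t overflow */

    needed = self->len + n;
    if (needed > self->cap) {
        size_t  new_cap = self->cap;
        char   *new_ptr;
        while (new_cap < needed) {
            if (new_cap > (size_t)-1 / 2) { new_cap = needed; break; }
            new_cap *= 2;
        }
        new_ptr = realloc(self->ptr, new_cap);
        if (new_ptr == NULL) { return -1; }  /* old storage still valid */
        self->ptr = new_ptr;
        self->cap = new_cap;
    }
    memcpy(self->ptr + self->len, bytes, n);
    self->len += n;
    return 0;
}

/* Release the buffer and its storage; NULL is accepted. */
static void
bytebuf_destroy(ByteBuf *self)
{
    if (self != NULL) {
        free(self->ptr);
        free(self);
    }
}
```

The other items on the list would presumably follow the same pattern: a small struct plus a handful of functions, each covered by the C test harness.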
How does that sound?
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/