On 6/21/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote:
Greets,

KinoSearch uses many routines provided by Perl's C API, called
"XS" (for eXternal Subroutine).  Some of these are specific to Perl's
data structures, but others are just stand-ins for common library
functions: Safefree() for free(), Copy() for memcpy(), and so on.  The
really nice thing about these is that they've been vetted by Perl's
configuration process and benefit from a zillion smoke tests and Perl
installs.  They Just Work.

Lucy won't be able to use those, or similarly reliable tools provided
by any other target platform, because we need it to work with many
targets.  So we're going to be stuck solving all the classic
portability headaches all over again.

My goal with Lucy installation via CPAN is to require only standard
Perl build tools.  No shell scripts.  No installation as a library
available to other non-perl apps.  That approach has worked great for
KinoSearch, which currently builds fine on a wide variety of
operating systems, including Windows.  But it rests on all the
configuration that was done when Perl itself was compiled.

Happily, Perl's Configure script is public domain:

   # Yes, you may rip this off to use in other distribution
   # packages. This script belongs to the public domain and
   # cannot be copyrighted.

Unfortunately, it's an enormous sh script.  I could rewrite the
relevant parts of it in Perl, but then we'd also have to rewrite it
for any other target.  We only need a small fraction of its 22,000
lines, but we'd still end up violating DRY egregiously.  There's
another way we can adapt it though.

Much of the script consists of attempting to compile minuscule chunks
of C code, and running tiny tests when the chunks successfully
compile.  We can put this C code into a place where multiple scripts
can read it, and then the individual scripts won't need to be nearly
as long.  We can isolate the C bits as key => value pairs; the values
will be replaced by the output, if any, from the compiled test program.

   /* has_long_long */
   #include <stdio.h>
   int main() { long long foo = 5; printf("1"); return 0; }
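
A probe can also report a value instead of a yes/no flag. The following is
only a minimal sketch of that idea, not part of any existing Configurator:
the key sizeof_int comes from this message, while sizeof_long, sizeof_ptr,
the function name lucy_probe_sizes, and the "key=value" line format are
assumptions made for illustration. It also assumes a C99 snprintf() is
available on the target.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical probe body: writes "sizeof_int=4"-style lines into buf.
 * Returns the number of characters written, or -1 if buf is too small
 * or formatting fails. */
int lucy_probe_sizes(char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len,
                 "sizeof_int=%d\nsizeof_long=%d\nsizeof_ptr=%d\n",
                 (int)sizeof(int), (int)sizeof(long), (int)sizeof(void *));

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A real probe would wrap this in a main() that prints the buffer to stdout,
so the build script can capture each line as a Configurator value.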

Call this aspect of the build process "Configurator".  After all the
tests have run, our primary build scripts would be able to take
Configurator's output and generate a nice, reliable lucyconf.h file.

   if ( $configurator{sizeof_int} == 4 ) {
      print LUCYCONF "typedef int lucy_i32_t;\n";
   }
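
The selection logic itself is small enough to sketch in C as well. This is
an illustration under stated assumptions rather than a proposed
implementation: the function names lucy_pick_i32 and
lucy_format_i32_typedef are hypothetical, and the fallback to long when
int is not 4 bytes wide is an addition that the snippet above does not
make. Note the typedef order: existing type first, new name second.

```c
#include <assert.h>
#include <stdio.h>

/* Pick a C type that is exactly 4 bytes wide, given probed sizes.
 * Returns NULL when neither candidate fits. Sizes are in chars, so
 * "4" means 32 bits only on platforms where CHAR_BIT is 8. */
const char *lucy_pick_i32(int sizeof_int, int sizeof_long)
{
    if (sizeof_int == 4)  return "int";
    if (sizeof_long == 4) return "long";   /* assumed fallback */
    return NULL;
}

/* Write the typedef line destined for lucyconf.h into buf.
 * Returns 0 on success, -1 if no type fits or buf is too small. */
int lucy_format_i32_typedef(char *buf, size_t len,
                            int sizeof_int, int sizeof_long)
{
    const char *type = lucy_pick_i32(sizeof_int, sizeof_long);
    int n;

    assert(buf != NULL);
    if (type == NULL) return -1;

    n = snprintf(buf, len, "typedef %s lucy_i32_t;\n", type);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Returning an error when no 4-byte type exists lets the build stop with a
clear message instead of emitting a lucyconf.h with a wrong-sized type.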

Writing Configurator will be a little labor-intensive, but I think it
will go a long way towards making Lucy portable, while still
requiring only the native build tools provided by each platform.

I'm liking this line of thought. Even better, I think, would be to
design a very simple domain-specific language that builds the
lucyconf.h. That way we only need to write the parser once for each
target platform. Every time we need to add something to the lucyconf.h,
we only need to edit (e.g.) lucy.config and add a mini C file. What do
you think?
