On Wed, Mar 25, 2009 at 07:34:01AM -0400, Michael McCandless wrote:
> >> What does "incremented" mean?
> >
> > It means that the caller has to take responsibility for one refcount.
> > Usually you'll see that on constructors and factory methods.
> >
> > Having "incremented" as part of the method/function signature makes it
> > easier to autogenerate binding code that doesn't make refcounting errors
> > and leak memory.
>
> OK got it. It's like when Python's docs say "returns a reference".
> It's great to make this a "formal" part of the API.

I'm pretty sure you grok this already but for clarity's sake: this is
Boilerplater syntax -- so it's a "formal" part of an *internal* API.

Even though Boilerplater is a very small language, I was deeply reluctant to
write it. Naturally I hate all programming languages and I have fantasies of
replacing C with something "better" :) -- but I recognize the challenges that
language authors face and have no desire to expose Boilerplater outside of
Lucy. It's just a means to an end.

The C API docs -- which I expect we'll autogenerate from the .bp source files
just as I'm currently generating Perl POD docs from .bp files -- will probably
be HTML files and will say "returns a new reference" or "returns a borrowed
reference" just like the Python docs.

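To make the new-versus-borrowed distinction concrete, here is a minimal sketch in plain C. Everything in it (Obj, Holder, and their functions) is a hypothetical stand-in invented for illustration, not Lucy's actual object model: Holder_fetch_item() behaves the way an "incremented" method would, while Holder_get_item() hands back a borrowed reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal refcounted object; not Lucy's real Obj. */
typedef struct Obj {
    unsigned refcount;
} Obj;

/* "incremented": the caller owns one refcount and must release it. */
static Obj*
Obj_new(void)
{
    Obj *self = malloc(sizeof(Obj));
    if (!self) { abort(); }
    self->refcount = 1;
    return self;
}

static Obj*
Obj_inc_ref(Obj *self)
{
    self->refcount++;
    return self;
}

/* Returns the refcount left after the decrement; frees the object at 0. */
static unsigned
Obj_dec_ref(Obj *self)
{
    assert(self->refcount > 0);
    unsigned remaining = --self->refcount;
    if (remaining == 0) { free(self); }
    return remaining;
}

typedef struct Holder {
    Obj *item;
} Holder;

/* Borrowed reference: the Holder keeps ownership, so the caller must
 * NOT call Obj_dec_ref() on the result. */
static Obj*
Holder_get_item(Holder *self)
{
    return self->item;
}

/* "incremented": returns a new reference that the caller must release
 * with Obj_dec_ref(). */
static Obj*
Holder_fetch_item(Holder *self)
{
    return Obj_inc_ref(self->item);
}
```

With the ownership rule stated in the signature, a binding generator knows mechanically whether the wrapper it emits has to release the return value.
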
> Instead of having a bunch of version constants at the top of a class
> (eg FieldsReader.java), we'd invoke the "Versions.add(...)" to create
> each version.

Where would we keep track of the registrations? Will each DataReader subclass
keep a class Hash variable?

    /* Class-level registry, built lazily on first access. */
    static Hash* versions = NULL;

    static void
    S_init_versions_hash(void)
    {
        versions = Hash_new(2);
        /* Key, key length, then a description of that format version. */
        Hash_Store_Str(versions, "1", 1, CB_newf("initial format"));
        Hash_Store_Str(versions, "2", 1, CB_newf("fixed stoopid mistake"));
    }

    Hash*
    LexWriter_versions(LexWriter *self)
    {
        UNUSED_VAR(self);
        if (!versions) { S_init_versions_hash(); }
        return versions;
    }

Actually that'll leak memory without an atexit() or something like that, but
you get the idea.

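One way to plug that leak, sketched in standalone C so it compiles without Lucy: register a cleanup function with atexit() the first time the table is built. VersionTable is a hypothetical stand-in for the Hash, and this LexWriter_versions() drops the self argument for brevity; treat it as an illustration of the pattern rather than a proposed implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the class-level versions Hash. */
typedef struct VersionTable {
    size_t       count;
    const char **descriptions;
} VersionTable;

static VersionTable *versions = NULL;

/* Frees the table.  Safe to call more than once. */
static void
S_destroy_versions(void)
{
    if (versions) {
        free(versions->descriptions);
        free(versions);
        versions = NULL;
    }
}

static void
S_init_versions(void)
{
    assert(versions == NULL);
    versions = malloc(sizeof(VersionTable));
    if (!versions) { abort(); }
    versions->count = 2;
    versions->descriptions = malloc(versions->count * sizeof(char*));
    if (!versions->descriptions) { abort(); }
    versions->descriptions[0] = "initial format";
    versions->descriptions[1] = "fixed stoopid mistake";
    /* If registration fails, the table simply leaks at exit as before. */
    (void)atexit(S_destroy_versions);
}

/* Returns a borrowed reference to the shared table, building it on
 * first use. */
static VersionTable*
LexWriter_versions(void)
{
    if (!versions) { S_init_versions(); }
    return versions;
}
```

Like the original, this is not thread-safe: two threads racing through the first call could both run the initializer.
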
> Introspection/transparency is the primary reason I can think of --
> it's the same motivation that led you to JSON over private binary.
> Ie, it'd be great to see a string description of what "format: '2'"
> means; eg if each int has a known corresponding description, you could
> add a comment on that line in the JSON.
>
> And, in the source code, we of course assign symbolic names to these
> constants anyway.
>
> Also, having an explicit method call to "add" a new version avoids
> silly risks that when adding a new version someone messes up adding
> one to the int :) Or, messes up keeping track of the latest format
> (the format that's written). It may help with the back compat unit
> tests, too, ensuring that each supported version is tested.
>
> I guess it's a matter of where do you draw the line b/w browseability
> of your JSON metadata vs "you must pull in an external tool to get
> more details".

OK, I'm cool with this so long as we can come up with a sensible API.
There are no performance implications or significant shared-object-bloat issues.

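As a starting point for that API discussion, here is a rough sketch of a registry where the format number is assigned by the "add" call itself, so nobody hand-increments an int or separately tracks the latest format. The Versions struct, its functions, and MAX_VERSIONS are all hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VERSIONS 16

/* Hypothetical registry: format numbers start at 1 and are handed out
 * in registration order. */
typedef struct Versions {
    const char *descriptions[MAX_VERSIONS];
    int         latest;   /* 0 means nothing has been registered yet */
} Versions;

/* Registers a description and returns the format number assigned to it,
 * or -1 if the registry is full. */
static int
Versions_add(Versions *self, const char *description)
{
    assert(self != NULL);
    if (self->latest >= MAX_VERSIONS) { return -1; }
    self->descriptions[self->latest] = description;
    return ++self->latest;
}

/* The format that a writer should emit: the most recently added one. */
static int
Versions_latest(const Versions *self)
{
    return self->latest;
}

/* Returns the description for a format number, or NULL if unknown. */
static const char*
Versions_describe(const Versions *self, int format)
{
    if (format < 1 || format > self->latest) { return NULL; }
    return self->descriptions[format - 1];
}
```

A back-compat test could loop from 1 to Versions_latest() and fail if any registered format lacks a test index, and a JSON dumper could call Versions_describe() to annotate the "format" line.
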
> You are needing to bring online a scary amount of basic
> infrastructure (GC, exception handling, object vtables, etc.) just to
> get the ball rolling.

True to an extent, but there's a huge payoff: the actual search code -- where
the rubber hits the road -- is only marginally harder to follow than Java.

Marvin Humphrey