LeHouillier, Frank D. wrote:
While making this change wouldn't affect us in any way as far as I can see now,
it would still be possible to use the Features in the Result Spec in a
similar way.
Suppose you have an information extraction component that extracts
entities with attributes and you want to control which attributes are
actually being added to the CAS with the Result Spec.  You might have
type Person, with a range of features such as Address, Phone number,
Age, etc., some of which you want to output in a given configuration and
others not.  Suppose the information extraction component also extracts
attributes which are so useless that you don't include them as features
in the type system at all, such as an internal id number.  Currently,
with a compiled Result Spec you could have the annotator look up the
feature on the basis of the name of the feature and then you could
reliably instantiate the feature without further ado.  After your
change, the feature would have to be checked to see if it actually
exists.
We added code in the actual change that now checks to see if the feature actually exists (for a "compiled" Result Spec). I thought it was better to preserve the status quo here, rather than remove this check (for performance reasons). It didn't seem like it would have any measurable performance impact - it's one hash table lookup, basically.

Cheers. -Marshall
Again, this doesn't seem like it is that big a deal to me but I thought
I might just point out that it might have a use case.  In practice, it
seems to me that most annotators figure out the features available
either during compilation by using the JCas or during the initialization
of the Annotator.
-----Original Message-----
From: Marshall Schor [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 3:57 PM
To: uima-dev@incubator.apache.org
Subject: Re: capabilityLangugaeFlow - computeResultSpec

LeHouillier, Frank D. wrote:
We have an annotator that wraps a black box information extraction
component that can return objects of a variety of types.  We check the
result specification to see if the object is something we want to
output, based on the actual string of the name of the type.  If you take
away the compiled version of the ResultSpecification then we will also
have to check whether the type that we get back from the type system is
null or not.
Hi Frank -

This change would *not* take away the compiled version of the Result Spec. It would only change 1 behavior - that of returning "true" if a *feature* (not a type, as in your example above) was associated with a type where the capability was marked "allAnnotatorFeatures", even if the feature didn't exist.

Suppose you had a type T1, and a type T2 whose super-type was T1, with features T1:f1 and T2:f2, and an output capability = T1 with allAnnotatorFeatures = true; and finally a type T3 (not inheriting from T1) with feature T3:f3, and the output capability including T3 with allAnnotatorFeatures = false.


Here's the current behavior:

Before compile, the following all return true except as marked:
   containsType(T1)
   containsType(T2)  << returns false: T2 is not in the output capability, and before compile T2 isn't recognized as a subtype of T1
   containsFeature(T2:f2)  << returns false: not in the output capability, etc.
   containsFeature(T1:f1)
   containsFeature(T1:asdfasdfasdfasdf)  << yes... that's what it does - it ignores the actual feature name because allAnnotatorFeatures is true

After compile, the following all return true except as marked:
   containsType(T1)
   containsType(T2)  << T2 is not in the output capability, but is recognized as a subtype of T1
   containsFeature(T2:f2)  << T1's *allAnnotatorFeatures* is "inherited"
   containsFeature(T1:f1)
   containsFeature(T1:asdfasdfasdfasdf)  << returns false: the actual features are looked up

After the change I'm proposing, everything would be the same except that
   containsFeature(T1:asdfasdfasdfasdf) would return true.
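
For anyone who wants to play with the T1/T2/T3 example, here is a small stand-alone toy model of the behavior described above. It is only a sketch: ToyResultSpec and all of its methods are made-up names for illustration, not the UIMA ResultSpecification class or its implementation, and the toy type system is just a map from type to super-type. The checkFeatureExists flag is an assumption used to switch between the current compiled behavior (true) and the proposed one (false).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the example above; NOT the UIMA implementation. */
class ToyResultSpec {
    // Toy type system: type -> super-type (null for a root type).
    private final Map<String, String> superOf = new HashMap<>();
    // Fully qualified "Type:feature" names that exist in the toy type system.
    private final Set<String> knownFeatures = new HashSet<>();
    // Output capability: type -> its allAnnotatorFeatures flag.
    private final Map<String, Boolean> outputTypes = new HashMap<>();
    // Features listed explicitly in the output capability.
    private final Set<String> outputFeatures = new HashSet<>();
    private final boolean checkFeatureExists; // true = current, false = proposed
    private boolean compiled = false;

    ToyResultSpec(boolean checkFeatureExists) {
        this.checkFeatureExists = checkFeatureExists;
    }

    void addType(String type, String superType, String... features) {
        superOf.put(type, superType);
        for (String f : features) {
            knownFeatures.add(type + ":" + f);
        }
    }

    void addOutputType(String type, boolean allAnnotatorFeatures) {
        outputTypes.put(type, allAnnotatorFeatures);
    }

    void addOutputFeature(String fullName) {
        outputFeatures.add(fullName);
    }

    void compile() {
        compiled = true;
    }

    // Nearest output type: the type itself, or (only once compiled) an ancestor.
    private String outputOwner(String type) {
        for (String t = type; t != null; t = compiled ? superOf.get(t) : null) {
            if (outputTypes.containsKey(t)) {
                return t;
            }
        }
        return null;
    }

    boolean containsType(String type) {
        return outputOwner(type) != null;
    }

    // Assumes fullName has the form "Type:feature".
    boolean containsFeature(String fullName) {
        if (outputFeatures.contains(fullName)) {
            return true;
        }
        String type = fullName.substring(0, fullName.indexOf(':'));
        String owner = outputOwner(type);
        if (owner == null || !outputTypes.get(owner)) {
            return false;
        }
        // The one check the proposal would drop: does the feature really exist?
        if (compiled && checkFeatureExists) {
            return knownFeatures.contains(fullName);
        }
        return true;
    }

    /** Builds the T1/T2/T3 setup from the example. */
    static ToyResultSpec example(boolean checkFeatureExists) {
        ToyResultSpec rs = new ToyResultSpec(checkFeatureExists);
        rs.addType("T1", null, "f1");
        rs.addType("T2", "T1", "f2");
        rs.addType("T3", null, "f3");
        rs.addOutputType("T1", true);
        rs.addOutputType("T3", false);
        return rs;
    }

    public static void main(String[] args) {
        ToyResultSpec rs = example(true);
        System.out.println("before compile, T2: " + rs.containsType("T2"));
        rs.compile();
        System.out.println("after compile, T2: " + rs.containsType("T2"));
        System.out.println("after compile, bogus feature: "
                + rs.containsFeature("T1:asdfasdfasdfasdf"));
    }
}
```

Building the example with example(false) instead and compiling it gives the proposed behavior, where the bogus feature name on T1 is accepted.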

I don't think this would affect the way you are using result specs, but please let me know if I've misunderstood something. We don't want to impact users with this change.

Thanks for your comments :-)

-Marshall
-----Original Message-----
From: Marshall Schor [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 5:06 AM
To: uima-dev@incubator.apache.org
Subject: Re: capabilityLangugaeFlow - computeResultSpec

The implementation for checking if a feature is in the result spec does
the following:

If the result-spec is not "compiled", it says the feature is present if
it is specifically put in, or if its type has the allAnnotatorFeatures
flag set.

If the result-spec is "compiled", it says the feature is present if it
is specifically put in, or if its type has the allAnnotatorFeatures flag
set and the feature exists in the type system.
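
The two cases above boil down to a small boolean rule. The sketch below is only a paraphrase of that description, not the actual UIMA source: FeatureRule and its parameter names are hypothetical, and each input is assumed to have been worked out already by the caller.

```java
/** Paraphrase of the two cases described above; NOT the UIMA source. */
final class FeatureRule {
    private FeatureRule() {
    }

    /** Current rule: a compiled spec also requires the feature to exist. */
    static boolean current(boolean listedExplicitly,
                           boolean typeHasAllAnnotatorFeatures,
                           boolean compiled,
                           boolean existsInTypeSystem) {
        if (listedExplicitly) {
            return true;
        }
        if (!typeHasAllAnnotatorFeatures) {
            return false;
        }
        // Only the compiled case consults the type system.
        return !compiled || existsInTypeSystem;
    }

    /** Rule with the existence check dropped: same answer compiled or not. */
    static boolean proposed(boolean listedExplicitly,
                            boolean typeHasAllAnnotatorFeatures) {
        return listedExplicitly || typeHasAllAnnotatorFeatures;
    }
}
```

The two rules disagree for exactly one combination of inputs: a compiled spec, a feature that is not listed explicitly, a type with allAnnotatorFeatures set, and a feature name that is not in the type system.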

For performance / space reasons, I'd like to drop the extra type-system check in the 2nd case; this would have the consequence of changing the result spec to return true for features not in the type system where the type had the allAnnotatorFeatures flag set. This case shouldn't come up in practice because I can't think of a good reason an annotator would ask if a feature not in its type system was present.
Any objections?

-Marshall





