Re: capabilityLangugaeFlow - computeResultSpec

Marshall Schor Fri, 01 Feb 2008 06:50:54 -0800

LeHouillier, Frank D. wrote:

While making this change wouldn't affect us in any way as I can see now,
it would still be possible to use the Features in the Result Spec in a
similar way.

Suppose you have an information extraction component that extracts
entities with attributes and you want to control which attributes are
actually being added to the CAS with the Result Spec.  You might have
type Person, with a range of features such as Address, Phone number,
Age, etc. some of which you want to output in a given configuration and
others not.  Suppose the information extraction component also extracts
attributes which are so useless that you don't include them as features
in the type system at all such as an internal id number.  Currently,
with a compiled Result Spec you could have the annotator look up the
feature on the basis of the name of the feature and then you could
reliably instantiate the feature without further ado.  After your
change, the feature would have to be checked to see if it actually
exists.

We added code in the actual change that now checks to see if the feature
actually exists (for a "compiled" Result Spec).  I thought it was better
to preserve the status quo here, rather than remove this check (for
performance reasons).  It didn't seem like it would have any measurable
performance impact - it's one hash table lookup, basically.


Cheers. -Marshall

Again, this doesn't seem like it is that big a deal to me but I thought
I might just point out that it might have a use case.  In practice, it
seems to me that most annotators figure out the features available
either during compilation by using the JCas or during the initialization
of the Annotator.
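
Frank's scenario can be sketched roughly as below. This is a hypothetical illustration, not code from the thread or from UIMA: `selectOutputs`, the `Predicate` standing in for the result spec's feature check, and the `Set` standing in for the type system are all assumed names. It shows the extra existence check an annotator would need if the result spec answered true for any feature name under allAnnotatorFeatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch of an annotator filtering extracted attributes.
final class AttributeFilterSketch {

    // Keeps only the attributes that the result spec asks for AND that
    // are defined as features in the (assumed) type system.
    static Map<String, String> selectOutputs(String typeName,
                                             Map<String, String> extracted,
                                             Predicate<String> resultSpecContainsFeature,
                                             Set<String> typeSystemFeatures) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : extracted.entrySet()) {
            String fullName = typeName + ":" + e.getKey();
            boolean wanted = resultSpecContainsFeature.test(fullName);
            // Guard needed when the result spec does not verify existence.
            boolean exists = typeSystemFeatures.contains(fullName);
            if (wanted && exists) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> extracted = new LinkedHashMap<>();
        extracted.put("address", "1 Main St");
        extracted.put("age", "42");
        extracted.put("internalId", "x17"); // not a feature in the type system

        // Models allAnnotatorFeatures on Person with no existence check.
        Predicate<String> spec = name -> name.startsWith("Person:");
        Set<String> features = Set.of("Person:address", "Person:age");

        System.out.println(selectOutputs("Person", extracted, spec, features).keySet());
    }
}
```

With a result spec that already verifies existence, the second condition is redundant but harmless.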
-----Original Message-----
From: Marshall Schor [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 3:57 PM
To: uima-dev@incubator.apache.org
Subject: Re: capabilityLangugaeFlow - computeResultSpec

LeHouillier, Frank D. wrote:
We have an annotator that wraps a black box information extraction
component that can return objects of a variety of types.  We check the
result specification to see if the object is something we want to
output based on the actual string of the name of the type.  If you take
away the compiled version of the ResultSpecification then we will have
to also check whether the type that we get back from the type system is
null or not.

Hi Frank -
This change would *not* take away the compiled version of the
ResultSpec.  It would only change 1 behavior - that of returning "true"
if a *feature* (not a type, as in your example above) was associated
with a type where the capability was marked "allAnnotatorFeatures", even
if the Feature didn't exist.

Suppose you had a type T1, and a type T2 whose super-type was T1, and
features T1:f1 T2:f2, with an output capability = T1 with
allAnnotatorFeatures = true, and finally T3 (not inheriting from T1) and
feature T3:f3, and the output capability including T3 with
allAnnotatorFeatures = false.

Here's the current behavior:

Before compile:  The following would all return true except as marked:
   containsType(T1)
   containsType(T2)  << returns false, T2 not in output capability, and
                        before compile, T2 isn't recognized as a subtype
                        of T1
   containsFeature(T2:f2)  << returns false, not in output, etc.
   containsFeature(T1:f1)
   containsFeature(T1:asdfasdfasdfasdf)  <<< yes... that's what it does -
                        it ignores the actual feature name because
                        allAnnotatorFeatures is true

After compile the following return true except as marked:
   containsType(T1)
   containsType(T2)  << T2 not in output capability, but is recognized
                        as a subtype of T1
   containsFeature(T2:f2)  << T1's *allAnnotatorFeatures* is "inherited"
   containsFeature(T1:f1)
   containsFeature(T1:asdfasdfasdfasdf)  << false: the actual features
                        are looked up

After the change I'm proposing, everything would be the same except that
   containsFeature(T1:asdfasdfasdfasdf) would return true.

I don't think this would affect the way you are using result specs, but
please let me know if I've misunderstood something.  We don't want to
impact users with this change.

Thanks for your comments :-)

-Marshall
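
The T1/T2/T3 example in this message can be reproduced with a small toy model. This is a sketch under stated assumptions, not the UIMA implementation: the class `ResultSpecModel` and all of its members are hypothetical, explicitly listed features are not modeled (the example has none), and feature inheritance is ignored. The `checkFeatureExists` switch selects between the behavior described as current (true) and the proposed one (false).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the example in the message; hypothetical, not UIMA code.
final class ResultSpecModel {
    // Output capability: type name -> allAnnotatorFeatures flag.
    private final Map<String, Boolean> declared = new HashMap<>();
    // Toy type system: type -> supertype, plus the defined feature names.
    private final Map<String, String> superTypeOf = new HashMap<>();
    private final Set<String> definedFeatures = new HashSet<>();
    private final boolean checkFeatureExists; // true = current, false = proposed
    private boolean compiled = false;

    ResultSpecModel(boolean checkFeatureExists) {
        this.checkFeatureExists = checkFeatureExists;
    }

    // Builds the T1/T2/T3 setup from the message.
    static ResultSpecModel example(boolean checkFeatureExists) {
        ResultSpecModel m = new ResultSpecModel(checkFeatureExists);
        m.superTypeOf.put("T2", "T1");
        m.definedFeatures.add("T1:f1");
        m.definedFeatures.add("T2:f2");
        m.definedFeatures.add("T3:f3");
        m.declared.put("T1", true);   // allAnnotatorFeatures = true
        m.declared.put("T3", false);  // allAnnotatorFeatures = false
        return m;
    }

    void compile() {
        compiled = true;
    }

    // Finds the declared type that covers 'type'. Supertypes are only
    // consulted once the spec has been compiled.
    private String coveringType(String type) {
        String t = type;
        while (t != null) {
            if (declared.containsKey(t)) {
                return t;
            }
            t = compiled ? superTypeOf.get(t) : null;
        }
        return null;
    }

    boolean containsType(String type) {
        return coveringType(type) != null;
    }

    // Expects a full name of the form "Type:feature".
    boolean containsFeature(String fullName) {
        String type = fullName.substring(0, fullName.indexOf(':'));
        String covering = coveringType(type);
        if (covering == null || !declared.get(covering)) {
            return false;
        }
        if (compiled && checkFeatureExists) {
            return definedFeatures.contains(fullName);
        }
        return true;
    }

    public static void main(String[] args) {
        ResultSpecModel current = example(true);
        System.out.println("before compile: " + current.containsFeature("T1:asdfasdfasdfasdf"));
        current.compile();
        System.out.println("compiled, current: " + current.containsFeature("T1:asdfasdfasdfasdf"));

        ResultSpecModel proposed = example(false);
        proposed.compile();
        System.out.println("compiled, proposed: " + proposed.containsFeature("T1:asdfasdfasdfasdf"));
    }
}
```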
-----Original Message-----
From: Marshall Schor [mailto:[EMAIL PROTECTED]
Sent: Friday, January 25, 2008 5:06 AM
To: uima-dev@incubator.apache.org
Subject: Re: capabilityLangugaeFlow - computeResultSpec

The implementation for checking if a feature is in the result spec does
the following:

If the result-spec is not "compiled", it says the feature is present if
it is specifically put in, or if its type has the allAnnotatorFeatures
flag set.

If the result-spec is "compiled", it says the feature is present if it
is specifically put in, or if its type has the allAnnotatorFeatures flag
set and the feature exists in the type system.

For performance / space reasons, I'd like to drop the 2nd case; this
would have the consequence of changing the result spec to return true
for features not in the type system where the type had the
allAnnotatorFeatures flag set.  This case shouldn't come up in practice
because I can't think of a good reason an annotator would ask if a
feature not in its type system was present.

Any objections?

-Marshall
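
The rule described in this message, and the proposed simplification, can be written as two boolean functions. This is a minimal sketch for illustration only, not the UIMA code; the class name, method names, and parameters are assumptions.

```java
// Hypothetical model of the rule described in the message; not UIMA code.
final class FeatureCheckSketch {

    // Current behavior: a compiled result spec additionally requires that
    // the feature exists in the type system.
    static boolean currentRule(boolean listedExplicitly, boolean allAnnotatorFeatures,
                               boolean compiled, boolean existsInTypeSystem) {
        if (listedExplicitly) {
            return true;
        }
        if (!allAnnotatorFeatures) {
            return false;
        }
        return !compiled || existsInTypeSystem;
    }

    // Proposed behavior: the compiled case is treated like the uncompiled one.
    static boolean proposedRule(boolean listedExplicitly, boolean allAnnotatorFeatures) {
        return listedExplicitly || allAnnotatorFeatures;
    }

    public static void main(String[] args) {
        // Enumerate every input combination and print where the rules disagree.
        boolean[] values = {false, true};
        for (boolean listed : values) {
            for (boolean all : values) {
                for (boolean compiled : values) {
                    for (boolean exists : values) {
                        boolean current = currentRule(listed, all, compiled, exists);
                        boolean proposed = proposedRule(listed, all);
                        if (current != proposed) {
                            System.out.printf(
                                "differs: listed=%b all=%b compiled=%b exists=%b%n",
                                listed, all, compiled, exists);
                        }
                    }
                }
            }
        }
    }
}
```

In this model the two rules disagree only for a compiled spec, a type with allAnnotatorFeatures set, and a feature that is neither listed explicitly nor present in the type system.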
