[ https://issues.apache.org/jira/browse/UIMA-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin De Boe updated UIMA-5041:
----------------------------------
Description:
Our type system includes a type named "com.intersys.uima.annotation.iknow.TOP",
which inherits directly from "uima.cas.TOP" and then has a number of subtypes
specific to our AE. When serializing this through the JsonCasSerializer, it
generates the shortname TOP twice:
{"_context":
  {"_types": [
    ...
    "TOP": {"_id":"com.intersys.uima.annotation.iknow.TOP",
            "_subtypes":["Entity","ProximityScore"]},
    "TOP": {"_id":"uima.cas.TOP",
            "_subtypes":["TOP","AnnotationBase","ArrayBase","Sofa"]},
    ...]
  }
}
While we can work around this by renaming our top type, the documentation
explicitly states this shouldn't pose a problem and shortnames would be
de-duplicated automatically:
https://uima.apache.org/d/uimaj-2.8.1/references.html#ugr.ref.json.overview
Section 9.2.2:
In the _types section, the key (e.g. "Sofa" or
"A_Typical_User_or_built_in_Type") is the "short" name for the type used in the
serialization. It is either just the last segment of the full type name (e.g.
for the type x.y.z.TypeName, it's TypeName), or, if name would collide with
another type name if just the last segment was used (example:
some.package.cname.Foo, and some.other.package.cname.Foo), then the key is made
up of the next-to-last segment, with an optional suffixed incrementing integer
in case of collisions on that name, a colon (:) and then the last name.
I see there are unit tests checking for this, so maybe the problem arises
because uima.cas.TOP is something of a special case? Or because neither
uima.cas.TOP nor our custom TOP is actually used directly (only their subtypes
are)?
So before I go ahead and change our root type name, I'd like to make sure this
isn't something the framework should have taken care of itself.
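To make the expectation concrete, here is a minimal sketch of the naming rule
as quoted from Section 9.2.2 above. It is not the actual JsonCasSerializer
code: the class and method names (ShortNameSketch, shortNames) are
hypothetical, and it assumes that every type whose last segment collides gets
a qualified key and that the incrementing integer starts at 1. The
documentation does not spell out either detail.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the documented short-name rule; not UIMA code.
class ShortNameSketch {

    // Last segment of a dotted type name, e.g. "x.y.z.TypeName" -> "TypeName".
    static String lastSegment(String full) {
        return full.substring(full.lastIndexOf('.') + 1);
    }

    // Next-to-last segment, e.g. "uima.cas.TOP" -> "cas"; empty if there is none.
    static String nextToLastSegment(String full) {
        int lastDot = full.lastIndexOf('.');
        if (lastDot < 0) {
            return "";
        }
        String pkg = full.substring(0, lastDot);
        return pkg.substring(pkg.lastIndexOf('.') + 1);
    }

    // Maps each (distinct) full type name to a unique key for the _types section.
    static Map<String, String> shortNames(List<String> fullNames) {
        // Count occurrences of each last segment to detect collisions.
        Map<String, Integer> counts = new HashMap<>();
        for (String full : fullNames) {
            counts.merge(lastSegment(full), 1, Integer::sum);
        }
        Map<String, String> result = new LinkedHashMap<>();
        Set<String> used = new HashSet<>();
        for (String full : fullNames) {
            String last = lastSegment(full);
            if (counts.get(last) == 1) {
                // No collision: the key is just the last segment.
                used.add(last);
                result.put(full, last);
                continue;
            }
            // Collision: next-to-last segment, a colon, then the last segment.
            String prefix = nextToLastSegment(full);
            String key = prefix + ":" + last;
            // Assumption: append 1, 2, ... to the prefix until the key is unused.
            for (int n = 1; !used.add(key); n++) {
                key = prefix + n + ":" + last;
            }
            result.put(full, key);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList(
                "uima.cas.TOP",
                "uima.cas.Sofa",
                "com.intersys.uima.annotation.iknow.TOP",
                "some.package.cname.Foo",
                "some.other.package.cname.Foo");
        shortNames(types).forEach((full, key) -> System.out.println(key + " -> " + full));
    }
}
```

Under these assumptions the two TOP types would be keyed as cas:TOP and
iknow:TOP rather than both as TOP, which is the behaviour I expected from the
serializer.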
was:
Our type system includes a type named "com.intersys.uima.annotation.iknow.TOP",
which inherits directly from "uima.cas.TOP" and then has a number of subtypes
specific to our AE. When serializing this through the JsonCasSerializer, it
generates the shortname TOP twice:
{"_context":
{"_types": [
...
"TOP": {"_id":"com.intersys.uima.annotation.iknow.TOP",
"_subtypes":["Entity","ProximityScore"]},
"TOP": {"_id":"uima.cas.TOP",
"_subtypes":["TOP","AnnotationBase","ArrayBase","Sofa"]},
...]
}
}
While we can work around this by renaming our top type, the documentation
explicitly states this shouldn't pose a problem and shortnames would be
de-duplicated automatically:
https://uima.apache.org/d/uimaj-2.8.1/references.html#ugr.ref.json.overview
Section 9.2.2:
In the _types section, the key (e.g. "Sofa" or
"A_Typical_User_or_built_in_Type") is the "short" name for the type used in the
serialization. It is either just the last segment of the full type name (e.g.
for the type x.y.z.TypeName, it's TypeName), or, if name would collide with
another type name if just the last segment was used (example:
some.package.cname.Foo, and some.other.package.cname.Foo), then the key is made
up of the next-to-last segment, with an optional suffixed incrementing integer
in case of collisions on that name, a colon (:) and then the last name.
I see there are unit tests checking for this, so maybe the problem arises
because uima.cas.TOP is something of a special case? Or because neither
uima.cas.TOP nor our custom TOP is actually used directly (only their subtypes
are)?
While I can definitely change our type system to use a different root type
name, I'd like to make sure this isn't something the framework should have
taken care of itself.
> JsonCasSerializer creates duplicate shortname
> ---------------------------------------------
>
> Key: UIMA-5041
> URL: https://issues.apache.org/jira/browse/UIMA-5041
> Project: UIMA
> Issue Type: Bug
> Components: Core Java Framework
> Affects Versions: 2.8.1SDK
> Reporter: Benjamin De Boe
> Priority: Minor