Additional detail: there is a serializer for type systems,
TypeSystem2Xml.typeSystem2Xml(typeSystem, outputStream).
The implementation excludes from the serialization any types it thinks are
"built-in", and this includes all array types. So no serialization is done of
types like Annotation[].

-Marshall

On 4/20/2016 3:38 PM, Marshall Schor wrote:
> Apologies for the long email. Short version - it appears that arrays of
> specific Feature Structure types (e.g. myFoo[]) have some holes in the
> support; some possible ways forward.
>
> -----------------
>
> UIMA has some support for arrays and lists of FeatureStructures (FSs) with the
> elements restricted to a particular FS type. This is supported in the type
> system descriptors, where you can specify in the "featureDescription" an
> "elementType".
>
> One use could be to use these types with indexing; you can get an index over
> all instances of arrays of some specific type.
>
> In the implementation, I see further support. It is possible to create a type
> which is a FS array with a component type, using the TypeSystemManager API:
> getArrayType(component_type). This creates (or just retrieves, if already
> created) a type whose name is the name of the component_type, suffixed with
> "[]". Example: "uima.tcas.Annotation[]".
>
> You can also specify these types in the XML type descriptor, but not directly;
> you can only specify them in the "feature" description for another type, where
> that feature is referencing it.
>
> To actually create instances of these types seems not quite implemented. To
> create an array, the API needs to include the array length. Looking at the
> non-JCas APIs, we have in the CAS Interface methods for creating arrays:
>
>     createBooleanArray(length)
>     createStringArray(length)
>     etc.
>     createArrayFS(length)
>
> but there's no
>
>     createArray(type, length)
>
> The LowLevelCAS interface has this though:
>
>     ll_createArray(type, length)
>
> I couldn't find any tests that actually create one of these objects, using
> this API.
>
> Modifying a test case to create one of these, and then attempting to serialize
> it with both XMI and XCAS serialization, produced invalid XML if the array was
> in fact serialized as a separate object. This is the case in XCAS, and in XMI
> when the array is referenced from a feature description, and that feature
> description is marked as "multipleReferencesAllowed".
>
> In these cases, the convention to serialize a FeatureStructure is to serialize
> it using the name of the type as the XML element name. For example, the type
> "Foo" gets serialized as <Foo ... />. But the name of these types ends in
> "[]", e.g. Annotation[]. And the characters "[]" are not legal as part of an
> XML element name.
>
> There is some code that in some (but not all) cases serializes this using the
> element name "FSArray" instead. But for this, the deserialization code
> produces FSArray instances instead of instances of the more specific type.
> When the deserialized object is referenced from another type via a feature
> having an "elementType" specification (in the receiving type system), that
> information could be used to fix up the deserialized array instance type, to
> that spec's component type.
>
> It also appears that the CasCopier doesn't support creating these kinds of
> objects.
>
> I've probably missed some things in my analysis of this. I'm thinking we
> ought to fix the CasCopier and XMI and XCAS serialization to work when
> serializing these objects (by serializing them as FSArray, although that loses
> the component type info). When deserializing XMI and XCAS, these FSArray
> objects could be updated to include the element-type information when and if
> that was available (for instance, if there was a reference from some typed
> feature having an element type).
>
> This isn't perfect; to be 100% accurate, we would need to be able to record
> the element type in the serialized stream for these instances.
>
> I haven't (yet) thought much about JCas for this issue, or support for
> fslists.
>
> Other thoughts?
>
> -Marshall
>
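
To make the element-name problem from the quoted message concrete, here is a
minimal sketch using only the JDK's DOM API (no UIMA classes). It checks
whether a type name is accepted as an XML element name, and shows the "FSArray"
fallback the message describes. The class ArrayTypeXmlName and both method
names are hypothetical, made up for illustration; this is not the actual UIMA
serializer code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical helper for illustration only; not part of UIMA.
class ArrayTypeXmlName {

    // Returns true if the JDK's DOM implementation accepts 'name'
    // as an XML element name, false if it rejects it.
    static boolean isLegalElementName(String name) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
        try {
            doc.createElement(name);
            return true;
        } catch (DOMException e) {
            // Raised for names containing illegal characters such as '[' or ']'.
            return false;
        }
    }

    // Assumed fallback, following the email: typed array names ending in "[]"
    // are written under the generic element name "FSArray". The component
    // type is lost by this mapping.
    static String elementNameFor(String typeName) {
        return typeName.endsWith("[]") ? "FSArray" : typeName;
    }

    public static void main(String[] args) {
        String typed = "uima.tcas.Annotation[]";
        System.out.println(typed + " legal? " + isLegalElementName(typed));
        System.out.println("serialized as: " + elementNameFor(typed));
    }
}
```

The fallback keeps the output well-formed but drops the component type, which
is why the deserializer would need the "elementType" of a referencing feature
(when there is one) to restore it.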
