[
https://issues.apache.org/jira/browse/UIMA-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marshall Schor updated UIMA-4824:
---------------------------------
Description:
The first design for uv3x supported two different styles of storing data. One
style, done for cases where there was no corresponding JCas definition for a
feature, was to store the feature value either in _intData or _refData arrays.
The other style was to have a field in the JCas definition for the item; for
example, a string value would be stored in a locally declared String field.
The JCas API would directly get/set the field in this case. The Plain API
would test for a method handle in the Feature Impl, which would be set if there
was a JCas field implementation; otherwise it would inherit the general set/get
code that used the storage in the _intData or _refData.
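
For illustration only, the two-style dispatch described above might look roughly like the sketch below. All class and method names here (SketchFs, SketchToken, SketchFeature, fieldBacked, arrayBacked) are hypothetical and are not the actual UIMA classes; only the general idea (a feature that either carries a method handle to a JCas field or falls back to the shared array storage) comes from the description.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical stand-in for a feature structure with generic array storage.
class SketchFs {
    final Object[] refData = new Object[2];
}

// Hypothetical JCas-style cover class that keeps one feature in its own field.
class SketchToken extends SketchFs {
    String lemma; // field-backed feature value

    String getLemma() { return lemma; }          // "JCas API": direct field access
    void setLemma(String value) { lemma = value; }
}

// Hypothetical feature descriptor: holds a method handle only when a
// cover-class field exists for the feature.
class SketchFeature {
    private final int refOffset;
    private final MethodHandle fieldGetter; // null when there is no cover-class field

    private SketchFeature(int refOffset, MethodHandle fieldGetter) {
        this.refOffset = refOffset;
        this.fieldGetter = fieldGetter;
    }

    static SketchFeature arrayBacked(int refOffset) {
        return new SketchFeature(refOffset, null);
    }

    static SketchFeature fieldBacked(Class<?> coverClass, String field, Class<?> type) {
        try {
            MethodHandle getter = MethodHandles.lookup().findGetter(coverClass, field, type);
            return new SketchFeature(-1, getter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Plain API": test for a method handle, otherwise use the generic array storage.
    Object get(SketchFs fs) {
        if (fieldGetter == null) {
            return fs.refData[refOffset];
        }
        try {
            return fieldGetter.invoke(fs);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

In this sketch every field-backed feature ends up with its own getter code path, while array-backed features all share the same few lines.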
The method handle approach was very efficient: single-stepping in Eclipse showed
only one step to get the value, and previous testing showed this approach was
even JIT-efficient.
However, running the CasCopier test with and without JCas cover classes showed
a large slow-down for the JCas case. Eventually, the evidence seemed to
suggest some kind of memory cache loading issue, perhaps the instruction cache,
since each getter / setter (with JCas) was running a different bit of code,
whereas the non-JCas case was running common code.
Based on this observation, change the impl to simplify things by always using
the array of ints / refs approach to storing data, and change the JCas versions
to reference that storage. As part of this, code the offsets in the JCas class
as static final int values computed when the class is loaded. This imposes a
restriction in the use case where, under a single class loader, multiple type
systems and one set of JCas classes are being used. Add appropriate checks to
ensure that the static final offset values remain correct when multiple type
systems are being used. Also add checks to ensure the JCas type hierarchy is
consistent with the UIMA type system hierarchy, and that the range in the JCas
getters is consistent with the UIMA type system range for each feature.
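
A minimal sketch of the proposed model, under stated assumptions: the names TypeSystemSketch, FirstTypeSystem, and TokenSketch are hypothetical, a "type system" is reduced to an ordered list of ref-valued feature names, and the first type system seen under the class loader is hard-coded. It shows only the static final offsets and the offset check; the hierarchy and range checks are not shown.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical type system: a feature's position in the list is its slot offset.
final class TypeSystemSketch {
    private final List<String> refFeatures;

    TypeSystemSketch(String... refFeatures) {
        this.refFeatures = Arrays.asList(refFeatures);
    }

    int refOffset(String featureName) { return refFeatures.indexOf(featureName); }
    int refSlotCount() { return refFeatures.size(); }
}

// Hypothetical holder for the type system that was in effect when the
// cover class was loaded (hard-coded here to keep the sketch self-contained).
final class FirstTypeSystem {
    static final TypeSystemSketch INSTANCE = new TypeSystemSketch("lemma", "pos");
}

// Hypothetical JCas-style cover class: no per-feature fields, only offsets
// into the shared ref array.
class TokenSketch {
    // Computed once, when the class is loaded.
    static final int LEMMA_OFFSET = FirstTypeSystem.INSTANCE.refOffset("lemma");
    static final int POS_OFFSET = FirstTypeSystem.INSTANCE.refOffset("pos");

    final Object[] refData;

    TokenSketch(TypeSystemSketch typeSystem) {
        checkOffsets(typeSystem);
        this.refData = new Object[typeSystem.refSlotCount()];
    }

    // Reject a later type system whose layout disagrees with the static offsets.
    static void checkOffsets(TypeSystemSketch typeSystem) {
        if (typeSystem.refOffset("lemma") != LEMMA_OFFSET
                || typeSystem.refOffset("pos") != POS_OFFSET) {
            throw new IllegalStateException(
                "feature offsets differ from those fixed at class load time");
        }
    }

    String getLemma() { return (String) refData[LEMMA_OFFSET]; }
    void setLemma(String value) { refData[LEMMA_OFFSET] = value; }
}
```

Because the offsets are constants, all getters and setters reduce to an array access at a fixed index, at the cost of rejecting a second type system that lays out the same features differently.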
was:
The first design for uv3x supported two different styles of storing data. One
style, done for cases where there was no corresponding JCas definition for a
feature, was to store the feature value either in _intData or _refData arrays.
The other style was to have a field in the JCas definition for the item; for
example, a string value would be stored in a locally declared String field.
The JCas API would directly get/set the field in this case. The Plain API
would test for a method handle in the Feature Impl, which would be set if there
was a JCas field implementation; otherwise it would inherit the general set/get
code that used the storage in the _intData or _refData.
The method handle approach was very efficient: single-stepping in Eclipse showed
only one step to get the value, and previous testing showed this approach was
even JIT-efficient.
However, running the CasCopier test with and without JCas cover classes showed
a large slow-down for the JCas case. Eventually, the evidence seemed to
suggest some kind of memory cache loading issue, perhaps the instruction cache,
since each getter / setter (with JCas) was running a different bit of code,
whereas the non-JCas case was running common code.
Based on this observation, change the impl to simplify things by always using
the array of ints / refs approach to storing data, and change the JCas versions
to reference that storage. As part of this, code the offsets in the JCas class
as static final int values computed when the class is loaded. This imposes a
restriction in the use case where, under a single class loader, multiple type
systems and one set of JCas classes are being used. Add appropriate checks to
ensure that the static final offset values remain correct when multiple type
systems are being used.
> uv3 change storage model to always use int and ref arrays
> ---------------------------------------------------------
>
> Key: UIMA-4824
> URL: https://issues.apache.org/jira/browse/UIMA-4824
> Project: UIMA
> Issue Type: Sub-task
> Components: Core Java Framework
> Reporter: Marshall Schor
> Assignee: Marshall Schor
> Priority: Minor
> Fix For: 3.0.0SDKexp
>