[ https://issues.apache.org/jira/browse/UIMA-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Eckart de Castilho updated UIMA-6232:
---------------------------------------------
Component/s: uimaFIT
> Reduce overhead of createTypeSystemDescription() and friends
> ------------------------------------------------------------
>
> Key: UIMA-6232
> URL: https://issues.apache.org/jira/browse/UIMA-6232
> Project: UIMA
> Issue Type: Improvement
> Components: uimaFIT
> Reporter: Richard Eckart de Castilho
> Assignee: Richard Eckart de Castilho
> Priority: Major
> Fix For: 2.6.0uimaFIT
>
>
> uimaFIT offers a range of factory methods which use classpath scanning to
> locate type system descriptions, type priority definitions and index
> definitions.
> The present implementation scans for each type of object once and then stores
> the locations in which the descriptors were found in a global static
> variable. The user can call a method to clear this variable and force a
> re-scan.
> Whenever client code calls a method such as {{createTypeSystemDescription()}},
> the descriptors at the cached locations are read and parsed, and a
> corresponding Java descriptor object is created and returned.
> This issue is about two problems with this approach:
> 1) The search for descriptor locations only considers the ClassLoader
> situation at the time of the first scan. If
> {{createTypeSystemDescription()}} is later called in the context of a
> ClassLoader with access to a different set of descriptions, this is not
> taken into account.
> 2) Parsing the XML files on every call to e.g.
> {{createTypeSystemDescription()}} slows uimaFIT down overall. These methods
> are potentially called very often, in particular every time
> {{createEngineDescription()}} or a similar method is called. Depending on the
> context, the parse overhead can have a significant impact on the overall
> execution time.
> As a solution for 1), we could adopt an approach similar to the one used for
> JCas wrapper classes in the JCasImpl: the locations are stored in a
> {{WeakHashMap}} mapping the current ClassLoader to the discovered locations.
> The "current" ClassLoader is obtained via the Spring
> {{ClassUtils.getDefaultClassLoader()}}, which is also (indirectly) used in
> many other places in uimaFIT. In particular, this method uses the thread
> context ClassLoader if one is available.
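The location cache proposed for 1) could look roughly like the following. This is a minimal sketch, not the uimaFIT implementation: the class name ClassLoaderLocationCache, the scanner callback and the descriptor paths are hypothetical, and currentClassLoader() is a hand-written stand-in for the Spring helper so that the example has no dependencies.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

class ClassLoaderLocationCache {
    // Weak keys: an entry can be dropped once its ClassLoader is no longer
    // strongly referenced anywhere else.
    private static final Map<ClassLoader, List<String>> LOCATIONS =
            Collections.synchronizedMap(new WeakHashMap<>());

    // Stand-in (assumption) for Spring's ClassUtils.getDefaultClassLoader():
    // prefer the thread context ClassLoader, then fall back.
    static ClassLoader currentClassLoader() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoaderLocationCache.class.getClassLoader();
        }
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl;
    }

    // Runs the (hypothetical) classpath scan once per ClassLoader and reuses
    // the result on later calls made in the same ClassLoader context.
    static List<String> locations(Function<ClassLoader, List<String>> scanner) {
        return LOCATIONS.computeIfAbsent(currentClassLoader(), scanner);
    }

    // Forces a re-scan on the next call, for every ClassLoader.
    static void clear() {
        LOCATIONS.clear();
    }

    public static void main(String[] args) {
        Function<ClassLoader, List<String>> scanner = cl -> {
            System.out.println("scanning for " + cl);
            return Arrays.asList("classpath*:desc/types/example.xml"); // made-up path
        };
        locations(scanner); // triggers a scan
        locations(scanner); // served from the cache, no second scan
    }
}
```

Keying the map by ClassLoader means a call made under a different thread context ClassLoader triggers its own scan instead of reusing the locations found by the first caller.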
> As a solution for 2), we keep a {{WeakHashMap}} cache not only for the
> locations, but also for the parsed and aggregated XML files. When e.g.
> {{createTypeSystemDescription()}} is called and the cache already contains a
> corresponding descriptor, a deep clone of it is returned. A similar
> approach (cloning a descriptor) was recently also introduced into UIMA Core
> to avoid repeatedly loading and parsing default flow controller definitions.
>
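The descriptor cache proposed for 2) can be sketched along the same lines. Everything here is illustrative: Descriptor is a hypothetical stand-in for a parsed UIMA descriptor object, deepCopy() stands in for whatever cloning the real descriptor type offers, and the parser callback represents reading and aggregating the XML files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;

class DescriptorCacheSketch {
    /** Hypothetical stand-in for a parsed, mutable descriptor. */
    static final class Descriptor {
        final List<String> typeNames;

        Descriptor(List<String> typeNames) {
            this.typeNames = new ArrayList<>(typeNames);
        }

        // Copies the contents so that callers never share mutable state.
        Descriptor deepCopy() {
            return new Descriptor(typeNames);
        }
    }

    // One parsed master copy per ClassLoader, held under a weak key.
    private static final Map<ClassLoader, Descriptor> CACHE =
            Collections.synchronizedMap(new WeakHashMap<>());

    // Parses at most once per ClassLoader and hands out a fresh copy each time.
    static Descriptor create(Supplier<Descriptor> parser) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        Descriptor master = CACHE.computeIfAbsent(cl, key -> parser.get());
        return master.deepCopy();
    }

    static void clear() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        int[] parses = {0};
        Supplier<Descriptor> parser = () -> {
            parses[0]++; // counts how often the expensive parse runs
            return new Descriptor(Arrays.asList("Token"));
        };
        Descriptor first = create(parser);
        first.typeNames.add("Sentence"); // caller modifies its own copy
        Descriptor second = create(parser);
        System.out.println(parses[0]);        // 1
        System.out.println(second.typeNames); // [Token]
    }
}
```

Returning a copy rather than the cached instance keeps modifications made by one caller from leaking into descriptors handed to later callers. One caveat with this design: the cached values must not hold a strong reference to their ClassLoader key, otherwise the weak key can never be collected.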