[ https://issues.apache.org/jira/browse/UIMA-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Eckart de Castilho updated UIMA-6413:
---------------------------------------------
    Description:
This is essentially a follow-up issue to UIMA-6276.

So, when a CAS is created, a cache is filled in {{FSClassRegistry.cl_to_type2JCas}} which maintains information about the JCas representation of the different types. This is a per-classloader cache - so for every classloader which is involved in the creation of a (J)CAS, an entry is added. Normally, classloaders are pretty long-lived objects and you only have so many during the runtime of a program. But there are cases where classloaders are created in volumes, and in this case we run into trouble. UIMA-6276 turned the cache into a weak map, hoping that once a classloader is garbage-collected, the cache would get cleaned up automatically. However, that idea was not thought through entirely because one of the pieces of information stored in the map is a {{FsGenerator3}}, and that generator is actually generated via the particular classloader that is the key in the map. Thus, a value in the map has a strong reference to the weak key, causing the key never to get garbage collected... and there might be other fields contributing to that cycle as well.

In particular, a new classloader is generated whenever a new {{ResourceManager}} with a custom datapath, classpath, or both is created. A typical case for this to happen is when a PEAR is used. But there can be other reasons why somebody would create new custom resource managers. Limiting the number of {{ResourceManager}}s in a system may not be feasible because typically there should be one per pipeline (to allow for shared resources), so if you are in a situation where pipelines are instantiated and destroyed repeatedly, it makes sense to create {{ResourceManager}}s alongside. However, typically the number of classloaders in a system is pretty set.
The {{ResourceManager}} internally wraps these classloaders with an {{UimaClassLoader}} (only if a specific classloader or a custom datapath is passed to the resource manager). So assuming that essentially always the same set of classloaders is provided (and maybe only a limited set of datapaths), it should be ok to introduce another cache of {{[classloader, datapath] -> UimaClassLoader}} to limit the number of {{UimaClassLoader}} instances and therefore limit the size of {{FSClassRegistry.cl_to_type2JCas}}.

  was:
This is essentially a follow-up issue to UIMA-6276.

So, when a CAS is created, then a cache is filled in

> Memory leak in FSClassRegistry
> ------------------------------
>
>                 Key: UIMA-6413
>                 URL: https://issues.apache.org/jira/browse/UIMA-6413
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Richard Eckart de Castilho
>            Assignee: Richard Eckart de Castilho
>            Priority: Major
>             Fix For: 3.3.0SDK

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
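
For illustration only, the {{[classloader, datapath] -> UimaClassLoader}} cache proposed in the description could look roughly like the sketch below. This is not the UIMA implementation: the class name {{LoaderCacheSketch}}, the method {{wrapperFor}}, and the use of a plain {{URLClassLoader}} as a stand-in for {{UimaClassLoader}} are all assumptions made for this example, and the datapath is treated as an opaque string. The wrappers are held through weak references because each wrapper strongly references its parent; storing them strongly in a weak-keyed map would recreate the cycle described above.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical sketch of a [classloader, datapath] -> wrapper cache. */
class LoaderCacheSketch {
    // Outer map has weak keys so an unused parent classloader can be collected.
    // Inner values are weak references: a wrapper strongly references its parent,
    // so a strongly held wrapper would keep the weak key alive forever.
    private final Map<ClassLoader, Map<String, WeakReference<ClassLoader>>> cache =
            new WeakHashMap<>();

    /** Returns the cached wrapper for (parent, datapath), creating it if needed. */
    synchronized ClassLoader wrapperFor(ClassLoader parent, String datapath) {
        Map<String, WeakReference<ClassLoader>> byDatapath =
                cache.computeIfAbsent(parent, k -> new HashMap<>());
        WeakReference<ClassLoader> ref = byDatapath.get(datapath);
        ClassLoader wrapper = (ref == null) ? null : ref.get();
        if (wrapper == null) {
            // Stand-in for wrapping the parent in a UimaClassLoader (assumption).
            wrapper = new URLClassLoader(new URL[0], parent);
            byDatapath.put(datapath, new WeakReference<>(wrapper));
        }
        return wrapper;
    }

    public static void main(String[] args) {
        LoaderCacheSketch sketch = new LoaderCacheSketch();
        ClassLoader parent = ClassLoader.getSystemClassLoader();
        ClassLoader first = sketch.wrapperFor(parent, "/data/a");
        ClassLoader second = sketch.wrapperFor(parent, "/data/a");
        // Reports whether the second lookup reused the first wrapper.
        System.out.println("reused wrapper: " + (first == second));
    }
}
```

With a cache of this shape, repeatedly creating resource managers for the same classloader and datapath would reuse one wrapper while it is still in use, which in turn would bound the number of entries added to {{FSClassRegistry.cl_to_type2JCas}}.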