[jira] [Commented] (FELIX-4656) Improve memory usage of the resolver

Richard S. Hall (JIRA) Fri, 03 Oct 2014 08:52:37 -0700

    [ 
https://issues.apache.org/jira/browse/FELIX-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158115#comment-14158115
 ]


Richard S. Hall commented on FELIX-4656:
----------------------------------------

>From mailing list:

Guillaume Nodet Tue, 30 Sep 2014 01:45:07 -0700

I recently run into OutOfMemory problems when using the resolver in
addition to the slowness I already raised.
I worked a bit on the resolver and pushed my results to a private branch
for review:
    https://github.com/gnodet/felix/commits/resolver-improvements


The memory problem was addressed by replacing the internal data structures
hold by the Candidates class.
This class was using a HashMap to store the Map<Requirement,
List<Capability>> for candidates and for each permutation, the whole data
structure was copied.
Unfortunately, the HashMap object is very memory intensive and the
List<Capability> were duplicated a lot during resolutions.
Therefore, I introduced a few collections which are optimized for memory
consumption.  Those are the OpenHashMap (derived from the mahout
collections) and the CopyOnWriteList/CopyOnWriteSet.
The OpenHashMap uses open adressing and two Object[] internally instead of
tons of entry objects.  Copy should also be faster as much less objects are
created.   The values in this map are now CopyOnWriteList instead of
ArrayList, which has the big benefit of not duplicating the entire arrays
when the structure is copied.  Those two collections work roughly the same
way as CopyOnWriteArrayList but without any thread safety.   In addition,
creating a new list from a CopyOnWriteList does not lead to creating a new
Object[] to store the data, but merely  assign the same pointer internally.
In terms of memory consumption, this means overall, that copying a
Candidates class, will lead to two Object[] arrays creation (for the
OpenHashMap) and the creation of small objects for the lists (but with no
copy of the data itself).
This is mainly commit
https://github.com/gnodet/felix/commit/0bf1523f21f9983b21b2737b4f78bb8d78cd35fd

The slowness problem was partially addressed because I found out that the
resolver was attempting the same resolution multiple times.  I think this
is due to the order or removing possible candidates and multiple paths to
the same resolution were executed.  In order to solve this problem, the
Candidates now holds a m_path structure which contains all the removed
candidates.  This object is used in the resolver loop to make sure we
haven't already tried the same resolution.  For big resolutions, I think it
will improve things a lot.
This is fixed by commit
https://github.com/gnodet/felix/commit/090a67a7fc05170291ad9cff808229a0292b6fb2


I'm going to do further testing, and I'm going to add a big resolution test
to measure performances and memory consumption.
In the mean time, I'd like others to review and test it if possible.


> Improve memory usage of the resolver
> ------------------------------------
>
>                 Key: FELIX-4656
>                 URL: https://issues.apache.org/jira/browse/FELIX-4656
>             Project: Felix
>          Issue Type: Improvement
>          Components: Resolver
>            Reporter: Guillaume Nodet
>            Assignee: Guillaume Nodet
>             Fix For: resolver-1.2.0
>
>
> During big resolutions (> 100 bundles), the memory consumption can become 
> very huge, mostly by keeping a lot of copies of the Candidates object.
> I want to lower the memory requirements of the resolver without touching the 
> algorithm at all (which would be a different improvement).
> This can be done by using :
>   * lower memory intensive collections
>   * do smart copies of those collections (where they would only actually copy 
> the data when modify)
> The second item is slightly more difficult to achieve, as the maps in the 
> Candidate objects contains Set and List, which would mean that those must be 
> copied too.  So it could actually be complementary, if achievable.
> For the first one, the HashMap and HashSet are very memory intensive.  I'll 
> introduce two new collections which will lower the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FELIX-4656) Improve memory usage of the resolver

Reply via email to