[ https://issues.apache.org/jira/browse/OAK-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Dürig updated OAK-2896:
-------------------------------
    Fix Version/s:     (was: 1.3.5)
                       1.4

> Putting many elements into a map results in many small segments.
> -----------------------------------------------------------------
>
>                 Key: OAK-2896
>                 URL: https://issues.apache.org/jira/browse/OAK-2896
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: performance
>             Fix For: 1.4
>
>         Attachments: OAK-2896.png, OAK-2896.xlsx
>
>
> There is an issue with how the HAMT implementation
> ({{SegmentWriter.writeMap()}}) interacts with the 256 segment references limit
> when putting many entries into the map: this limit is regularly reached
> once the map contains about 200k entries. At that point segments get
> flushed prematurely, resulting in more segments, thus more references, and thus
> even smaller segments. It is common for segments to be as small as 7k, with a
> tar file containing up to 35k segments. This is problematic because at this
> point handling the segment graph becomes expensive, both memory- and CPU-wise.
> I have seen persisted segment graphs as big as 35M where the usual size is a
> couple of k.
> As the HAMT map is used for storing the children of a node, this might have an
> adverse effect on nodes with many child nodes.
> The following code can be used to reproduce the issue:
> {code}
> // segmentStore, getTracker() and rnd come from the surrounding test context;
> // rnd is assumed to be a java.util.Random. newHashMap() and Stopwatch are Guava.
> SegmentWriter writer = new SegmentWriter(segmentStore, getTracker(), V_11);
> MapRecord baseMap = null;
> // Runs until interrupted: each iteration adds 1000 random entries to the map.
> for (;;) {
>     Map<String, RecordId> map = newHashMap();
>     for (int k = 0; k < 1000; k++) {
>         RecordId stringId =
>                 writer.writeString(String.valueOf(rnd.nextLong()));
>         map.put(String.valueOf(rnd.nextLong()), stringId);
>     }
>     // Time only the map update and print the resulting map size.
>     Stopwatch w = Stopwatch.createStarted();
>     baseMap = writer.writeMap(baseMap, map);
>     System.out.println(baseMap.size() + " " + w.elapsed());
> }
> {code}
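The feedback loop described above (more segments, thus more references, thus earlier flushes) can be illustrated with a toy model. This is only a sketch and not Oak code: the class `SegmentFlushModel`, the method `recordsUntilFlush`, the `MAX_RECORDS` cap, and the assumption that each record references one uniformly random existing segment are all hypothetical. Only the limit of 256 referenced segments is taken from the issue.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/**
 * Toy model (not Oak code): counts how many records fit into one segment
 * before it has to be flushed because it references too many other segments.
 */
class SegmentFlushModel {

    /** Limit on distinct segments a single segment may reference (256 per the issue). */
    static final int REFERENCE_LIMIT = 256;

    /** Hypothetical cap standing in for "the segment is full by size". */
    static final int MAX_RECORDS = 10_000;

    /**
     * Writes records until the size cap or the reference limit is hit.
     * Assumption: every record references one uniformly random existing segment.
     */
    static int recordsUntilFlush(int existingSegments, Random rnd) {
        Set<Integer> referenced = new HashSet<>();
        int records = 0;
        while (records < MAX_RECORDS && referenced.size() < REFERENCE_LIMIT) {
            referenced.add(rnd.nextInt(existingSegments));
            records++;
        }
        return records;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int segments : new int[] {100, 256, 1_000, 35_000}) {
            System.out.println(segments + " existing segments -> "
                    + recordsUntilFlush(segments, rnd) + " records before flush");
        }
    }
}
```

In this model a segment fills up normally as long as fewer than 256 segments exist. Once references are spread over many segments, the reference limit is reached after roughly 256 records, which matches the qualitative behaviour reported in the issue: the more segments there are, the smaller new segments become.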