[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969377#comment-13969377 ]
Benedict commented on CASSANDRA-7030:
-------------------------------------

FTR, though, I think the problem with your test is that jemalloc is synchronised and malloc is not. With unsynchronised malloc the CLHM does not obey its limits as readily as it is asked to (it seems to keep ~3x as much data around in my test):

{noformat}
concurrent malloc:
Elapsed: 55.433s
Allocated: 2973Mb
VM total:177
vsz: 6221
rsz: 4501

synchronized malloc:
Elapsed: 96.507s
Allocated: 1026Mb
VM total:187
vsz: 3341
rsz: 1681

synchronized jemalloc:
Elapsed: 263.686s
Allocated: 1027Mb
VM total:192
vsz: 3628
rsz: 1525
{noformat}

and for posterity, the code I was running:

{code}
public static void main(String[] args) throws InterruptedException, IOException
{
    String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
    final IAllocator allocator = new NativeAllocator();
    // bytes currently held by the map, as tracked by this test
    final AtomicLong total = new AtomicLong();
    EvictionListener<UUID, Memory> listener = new EvictionListener<UUID, Memory>()
    {
        public void onEviction(UUID k, Memory mem)
        {
            total.addAndGet(-mem.size());
            mem.free(allocator);
        }
    };
    // singleton weigher: every entry weighs 1, so the cap is 2 * 65536 entries
    final Map<UUID, Memory> map = new ConcurrentLinkedHashMap.Builder<UUID, Memory>()
                                  .weigher(Weighers.<Memory> singleton())
                                  .initialCapacity(8 * 65536)
                                  .maximumWeightedCapacity(2 * 65536)
                                  .listener(listener)
                                  .build();
    // time spent inside allocation only, summed across all threads
    final AtomicLong elapsed = new AtomicLong();
    final AtomicLong count = new AtomicLong();
    final ExecutorService exec = Executors.newFixedThreadPool(8);
    for (int i = 0 ; i < 8 ; i++)
    {
        final Random rand = new Random(i);
        exec.execute(new Runnable()
        {
            public void run()
            {
                byte[] keyBytes = new byte[16];
                for (int i = 0; i < 1000000; i++)
                {
                    int size = rand.nextInt(128 * 128);
                    if (size <= 0)
                        continue;
                    rand.nextBytes(keyBytes);
                    long start = System.nanoTime();
                    Memory mem = new Memory(allocator, size);
                    elapsed.addAndGet(System.nanoTime() - start);
                    mem.setMemory(0, mem.size(), (byte) 2);
                    Memory r = map.put(UUID.nameUUIDFromBytes(keyBytes), mem);
                    if (r != null)
                        r.free();
                    total.addAndGet(size);
                    // with at most 8 x 1000000 increments this threshold is never reached
                    if (count.incrementAndGet() % 10000000 == 0)
                        System.out.println("1M");
                }
            }
        });
    }
    exec.shutdown();
    exec.awaitTermination(1L, TimeUnit.HOURS);
    System.out.println(String.format("Elapsed: %.3fs", elapsed.get() * 0.000000001d));
    System.out.println(String.format("Allocated: %.0fMb", total.get() / (double) (1 << 20)));
    System.out.println(String.format("VM total:%.0f", Runtime.getRuntime().totalMemory() / (double) (1 << 20)));
    memuse("vsz", pid);
    memuse("rsz", pid);
    Thread.sleep(100);
}

// prints the process's vsz or rsz as reported by ps (in KB), converted to MB
private static void memuse(String type, String pid) throws IOException
{
    Process p = new ProcessBuilder().command("ps", "-o", type, pid).redirectErrorStream(true).start();
    BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
    reader.readLine(); // skip the column header
    // trim: ps may pad the value with leading whitespace
    System.out.println(String.format("%s: %.0f", type, Integer.parseInt(reader.readLine().trim()) / 1024d));
}
{code}

> Remove JEMallocAllocator
> ------------------------
>
>                 Key: CASSANDRA-7030
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7030
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1 beta2
>
>         Attachments: 7030.txt
>
>
> JEMalloc, whilst having some nice performance properties by comparison to
> Doug Lea's standard malloc algorithm in principle, is pointless in practice
> because of the JNA cost. In general it is around 30x more expensive to call
> than unsafe.allocate(); malloc does not have a variability of response time
> as extreme as the JNA overhead, so using JEMalloc in Cassandra is never a
> sensible idea. I doubt if custom JNI would make it worthwhile either.
> I propose removing it.

--
This message was sent by Atlassian JIRA (v6.2#6252)
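For anyone reproducing the three configurations in the comment: the thread does not show how the "synchronized" variants were set up, so the following is only a minimal sketch of one way to do it, assuming a wrapper that serialises every allocate/free call behind a single monitor. The `Allocator` interface, `CountingAllocator` and `SynchronizedAllocator` are hypothetical names for illustration; they are not Cassandra's `IAllocator` or its implementations, and the fake allocator hands out made-up addresses rather than real native memory.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SynchronizedAllocatorSketch {

    /** Hypothetical allocator interface for illustration (not Cassandra's IAllocator). */
    interface Allocator {
        long allocate(long size);
        void free(long peer);
    }

    /** Fake allocator: hands out increasing "addresses" and tracks live allocations. */
    static final class CountingAllocator implements Allocator {
        final AtomicLong nextAddress = new AtomicLong(1);
        final AtomicLong live = new AtomicLong();

        public long allocate(long size) {
            live.incrementAndGet();
            return nextAddress.getAndAdd(size);
        }

        public void free(long peer) {
            live.decrementAndGet();
        }
    }

    /** Serialises every call to the delegate behind one monitor. */
    static final class SynchronizedAllocator implements Allocator {
        private final Allocator delegate;

        SynchronizedAllocator(Allocator delegate) {
            this.delegate = delegate;
        }

        public synchronized long allocate(long size) {
            return delegate.allocate(size);
        }

        public synchronized void free(long peer) {
            delegate.free(peer);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final CountingAllocator base = new CountingAllocator();
        final Allocator allocator = new SynchronizedAllocator(base);
        ExecutorService exec = Executors.newFixedThreadPool(8);
        for (int t = 0; t < 8; t++) {
            exec.execute(new Runnable() {
                public void run() {
                    // each thread allocates and immediately frees through the shared wrapper
                    for (int i = 0; i < 1000; i++)
                        allocator.free(allocator.allocate(64));
                }
            });
        }
        exec.shutdown();
        exec.awaitTermination(1L, TimeUnit.MINUTES);
        System.out.println("live allocations: " + base.live.get());
    }
}
```

Swapping the wrapper in or out around the same underlying allocator keeps the locking behaviour the only variable between runs, which is the comparison the numbers above are making.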