[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977788#comment-13977788 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

Hi Benedict,

Sorry, I missed the update earlier. I am not sure why we are comparing synchronization, so I removed it; here are the results on RHEL (32-core box): http://pastebin.com/ZXSytn70. JEMalloc, including the JNI overhead, is faster and more efficient.

Remove JEMallocAllocator

Key: CASSANDRA-7030
URL: https://issues.apache.org/jira/browse/CASSANDRA-7030
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
Labels: performance
Fix For: 2.1 beta2
Attachments: 7030.txt, benchmark.21.diff.txt

JEMalloc, whilst having some nice performance properties by comparison to Doug Lea's standard malloc algorithm in principle, is pointless in practice because of the JNA cost. In general it is around 30x more expensive to call than unsafe.allocate(); malloc does not have a variability of response time as extreme as the JNA overhead, so using JEMalloc in Cassandra is never a sensible idea. I doubt if custom JNI would make it worthwhile either. I propose removing it.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970297#comment-13970297 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

You are right: I had the synchronization in the test attached to the old ticket because we initially had some segfaults, which were fixed in later JEMalloc releases. The synchronization was never committed to the Cassandra repo because the problem had been fixed by then.

Rerunning the same old test classes with the locks removed, the time taken is much better with jemalloc (you might need more runs). The memory footprint is better too: malloc is slower and uses more memory by comparison in my tests.

http://pastebin.com/JtixVvGU

As mentioned earlier, I don't mind removing it either :)
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969377#comment-13969377 ]

Benedict commented on CASSANDRA-7030:
-------------------------------------

FTR, though, I think the problem with your test is that jemalloc is synchronised and malloc is not. This leads to the CLHM not obeying its limits as readily as it is asked to (it seems to keep ~3x as much data around in my test):

{noformat}
concurrent malloc:
Elapsed: 55.433s
Allocated: 2973Mb
VM total:177
vsz: 6221
rsz: 4501

synchronized malloc:
Elapsed: 96.507s
Allocated: 1026Mb
VM total:187
vsz: 3341
rsz: 1681

synchronized jemalloc:
Elapsed: 263.686s
Allocated: 1027Mb
VM total:192
vsz: 3628
rsz: 1525
{noformat}

and for posterity, the code I was running:

{code}
public static void main(String[] args) throws InterruptedException, IOException
{
    String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
    final IAllocator allocator = new NativeAllocator();
    final AtomicLong total = new AtomicLong();
    // Evicted entries give their memory back and are subtracted from the live total
    EvictionListener<UUID, Memory> listener = new EvictionListener<UUID, Memory>()
    {
        public void onEviction(UUID k, Memory mem)
        {
            total.addAndGet(-mem.size());
            mem.free(allocator);
        }
    };
    final Map<UUID, Memory> map = new ConcurrentLinkedHashMap.Builder<UUID, Memory>()
                                  .weigher(Weighers.<Memory>singleton())
                                  .initialCapacity(8 * 65536).maximumWeightedCapacity(2 * 65536)
                                  .listener(listener).build();
    final AtomicLong elapsed = new AtomicLong();
    final AtomicLong count = new AtomicLong();
    final ExecutorService exec = Executors.newFixedThreadPool(8);
    for (int i = 0 ; i < 8 ; i++)
    {
        final Random rand = new Random(i);
        exec.execute(new Runnable()
        {
            public void run()
            {
                byte[] keyBytes = new byte[16];
                for (int i = 0; i < 100; i++)
                {
                    int size = rand.nextInt(128 * 128);
                    if (size <= 0)
                        continue;
                    rand.nextBytes(keyBytes);
                    // Only the allocation itself is timed
                    long start = System.nanoTime();
                    Memory mem = new Memory(allocator, size);
                    elapsed.addAndGet(System.nanoTime() - start);
                    mem.setMemory(0, mem.size(), (byte) 2);
                    Memory r = map.put(UUID.nameUUIDFromBytes(keyBytes), mem);
                    if (r != null)
                        r.free();
                    total.addAndGet(size);
                    if (count.incrementAndGet() % 1000 == 0)
                        System.out.println("1M");
                }
            }
        });
    }
    exec.shutdown();
    exec.awaitTermination(1L, TimeUnit.HOURS);
    // elapsed is summed across threads, in nanoseconds
    System.out.println(String.format("Elapsed: %.3fs", elapsed.get() * 0.000000001d));
    System.out.println(String.format("Allocated: %.0fMb", total.get() / (double) (1 << 20)));
    System.out.println(String.format("VM total:%.0f", Runtime.getRuntime().totalMemory() / (double) (1 << 20)));
    memuse("vsz", pid);
    memuse("rsz", pid);
    Thread.sleep(100);
}

// Prints the process's vsz or rsz as reported by ps (in Kb), converted to Mb
private static void memuse(String type, String pid) throws IOException
{
    Process p = new ProcessBuilder().command("ps", "-o", type, pid).redirectErrorStream(true).start();
    BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
    reader.readLine(); // skip the header row
    System.out.println(String.format("%s: %.0f", type, Integer.parseInt(reader.readLine()) / 1024d));
}
{code}
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969419#comment-13969419 ]

Benedict commented on CASSANDRA-7030:
-------------------------------------

bq. This leads to the CLHM not obeying its limits as readily as it is asked to

Confirmed that the problem I am seeing with concurrent execution (and that I would guess is leading to your test results) is down to CLHM. By replacing the CLHM with an AtomicReferenceArray to guarantee the bounds I get:

{noformat}
concurrent malloc:
Total Elapsed: 9.708s
Allocate Elapsed: 21.271s
Free Elapsed: 26.023s
Total Allocated: 62483Mb
Rate: 1.290Gb/s
Live Allocated: 1020Mb
VM total:117
vsz: 3149
rsz: 1280

synchronized malloc:
Total Elapsed: 36.526s
Allocate Elapsed: 134.114s
Free Elapsed: 128.416s
Total Allocated: 62483Mb
Rate: 0.232Gb/s
Live Allocated: 1020Mb
VM total:117
vsz: 3213
rsz: 1427

synchronized jemalloc:
Total Elapsed: 217.113s
Allocate Elapsed: 162.753s
Free Elapsed: 1531.215s
Total Allocated: 62483Mb
Rate: 0.036Gb/s
Live Allocated: 1020Mb
VM total:70
vsz: 4084
rsz: 1410
{noformat}

Can you rerun your test with either synchronised malloc, or with an AtomicReferenceArray instead of the CLHM, to confirm?

Note I have reverted my position to "let's get rid of jemalloc", absent more evidence to the contrary. The test I was running that initiated the creation of this ticket measured elapsed time for both allocate() *and* free(); I dropped the latter from the tests based on your benchmark because it's difficult to time the free() calls (they live in the eviction listener). Now I am timing both, and you can see that the real elapsed time and the per-CPU elapsed times are dramatically higher for jemalloc once both are included. The cost of calling free() appears to be disproportionately higher for jemalloc.

Note the throughput rate for jemalloc: 36Mb/s. This is really really pathetic!
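For readers following along, here is a minimal sketch of the bounded-slot approach described above, not the code that produced those numbers. It assumes a hypothetical `Block` class backed by a plain `byte[]` in place of Cassandra's `Memory`/`IAllocator`, so it shows the harness structure only: a fixed-size AtomicReferenceArray caps the number of live blocks, and allocate and free are timed separately. The `rateGbPerSec` helper reflects an inference from the figures: dividing Total Allocated by the sum of Allocate Elapsed and Free Elapsed reproduces all three reported Rate values, but the comment does not state how Rate was computed.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

class BoundedSlotBench
{
    // Hypothetical stand-in for an off-heap block; a real run would wrap
    // a native allocator here instead of a byte[].
    static final class Block
    {
        final byte[] data;
        Block(int size) { data = new byte[size]; }
    }

    final AtomicReferenceArray<Block> slots;
    final AtomicLong live = new AtomicLong();       // bytes currently held
    final AtomicLong total = new AtomicLong();      // bytes ever allocated
    final AtomicLong allocNanos = new AtomicLong(); // time spent allocating
    final AtomicLong freeNanos = new AtomicLong();  // time spent freeing

    BoundedSlotBench(int slotCount)
    {
        slots = new AtomicReferenceArray<>(slotCount);
    }

    // Allocate a block, swap it into a random slot and release whatever it
    // displaced, so at most slots.length() blocks are live at any time.
    void step(Random rand, int maxSize)
    {
        int size = 1 + rand.nextInt(maxSize);
        long t0 = System.nanoTime();
        Block fresh = new Block(size);
        allocNanos.addAndGet(System.nanoTime() - t0);
        live.addAndGet(size);
        total.addAndGet(size);

        Block old = slots.getAndSet(rand.nextInt(slots.length()), fresh);
        if (old != null)
        {
            long t1 = System.nanoTime();
            release(old);
            freeNanos.addAndGet(System.nanoTime() - t1);
        }
    }

    // With a native allocator this is where free() would be called.
    void release(Block b)
    {
        live.addAndGet(-b.data.length);
    }

    // Rate in Gb/s: total Mb allocated over summed allocate + free seconds.
    static double rateGbPerSec(double totalMb, double allocSec, double freeSec)
    {
        return (totalMb / 1024d) / (allocSec + freeSec);
    }

    public static void main(String[] args)
    {
        BoundedSlotBench bench = new BoundedSlotBench(1024);
        Random rand = new Random(0);
        for (int i = 0; i < 100_000; i++)
            bench.step(rand, 16 * 1024);
        System.out.println("live bytes: " + bench.live.get());
        System.out.println(String.format("allocate: %.3fs free: %.3fs",
                                         bench.allocNanos.get() * 1e-9, bench.freeNanos.get() * 1e-9));
    }
}
```

Because `getAndSet` hands each displaced block to exactly one caller, `step` can be driven from several threads (each with its own `Random`) without double-freeing, which is what makes the bound hold where the CLHM's eviction did not keep up.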
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968370#comment-13968370 ]

Jason Brown commented on CASSANDRA-7030:
----------------------------------------

[~vijay2...@yahoo.com] Vijay, you might have an interest in this one.
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968515#comment-13968515 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

Hi Benedict,

Interesting. How slow is slow in terms of Cassandra's throughput/latencies? Isn't it a tradeoff between memory fragmentation (use) and speed? Here is the test for memory footprint:

https://issues.apache.org/jira/browse/CASSANDRA-3997?focusedCommentId=13243924&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13243924
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968529#comment-13968529 ]

Benedict commented on CASSANDRA-7030:
-------------------------------------

Hmm. Interesting: I didn't expect such a dramatic difference in actual memory utilisation (some, but not much), only in allocation speed. The speed is quite problematic for use cases where you want to allocate small and often: at 3 micros per call it is much too expensive for per-cell operations. For the row cache, however, it's probably quite acceptable.

Do you still have the benchmark that you were running? It would be nice to see if we could figure out why it was wasting so much memory; fragmentation doesn't seem an adequate explanation for so many GBs (not saying it isn't, it just seems quite extreme without specifically trying to break the algorithm).
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968728#comment-13968728 ]

Benedict commented on CASSANDRA-7030:
-------------------------------------

[~vijay2...@yahoo.com] I think there was something wrong with your benchmark. Since it uses a different random seed each time it's run, the end state that you check may vary a great deal. Or possibly reporting total system memory isn't accurate in some way. I find completely the opposite effect: ~2.3Gb resident memory for the process working with unsafe, and ~2.7Gb resident for JEMalloc.

{noformat}
UNSAFE:
Items: 1M
Elapsed: 4.373s
Allocated: 2046Mb
VM total:172
vsz: 3717
rsz: 2268

Items: 10M
Allocated: 2050Mb
VM total: 186
vsz: 3717
rsz: 2308

Items: 100M
Elapsed: 329.087s
Allocated: 2047Mb
VM total:186
vsz: 3717
rsz: 2308

JEMALLOC:
Items: 1M
Elapsed: 8.259s
Allocated: 2046Mb
VM total:161
vsz: 4128
rsz: 2651

Items: 10M
Allocated: 2050
VM total: 192
vsz: 4132
rsz: 2706

Items: 100M
Elapsed: 791.370s
Allocated: 2047Mb
VM total:192
vsz: 4136
rsz: 2710
{noformat}

{code}
public static void main(String[] args) throws InterruptedException, IOException
{
    String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
    System.out.println(pid);
    final IAllocator allocator = new JEMallocAllocator();
    // Fixed seed so that every run sees the same workload
    final Random rand = new Random(0);
    final AtomicLong total = new AtomicLong();
    EvictionListener<UUID, Memory> listener = new EvictionListener<UUID, Memory>()
    {
        public void onEviction(UUID k, Memory mem)
        {
            total.addAndGet(-mem.size());
            mem.free(allocator);
        }
    };
    Map<UUID, Memory> map = new ConcurrentLinkedHashMap.Builder<UUID, Memory>()
                            .weigher(Weighers.<Memory>singleton())
                            .initialCapacity(8 * 65536).maximumWeightedCapacity(4 * 65536)
                            .listener(listener).build();
    long start = System.nanoTime();
    byte[] keyBytes = new byte[16];
    for (int i = 0 ; i < 100 ; i++)
    {
        int size = rand.nextInt(128);
        if (size <= 0)
            continue;
        rand.nextBytes(keyBytes);
        Memory mem = new Memory(allocator, size * 128);
        mem.setMemory(0, mem.size(), (byte) 2);
        map.put(UUID.nameUUIDFromBytes(keyBytes), mem);
        total.addAndGet(size * 128);
    }
    long end = System.nanoTime();
    // nanoseconds to seconds
    System.out.println(String.format("Elapsed: %.3fs", (end - start) * 0.000000001d));
    System.out.println(String.format("Allocated: %.0fMb", total.get() / (double) (1 << 20)));
    System.out.println(String.format("VM total:%.0f", Runtime.getRuntime().totalMemory() / (double) (1 << 20)));
    memuse("vsz", pid);
    memuse("rsz", pid);
    Thread.sleep(100);
}

// Prints the process's vsz or rsz as reported by ps (in Kb), converted to Mb
private static void memuse(String type, String pid) throws IOException
{
    Process p = new ProcessBuilder().command("ps", "-o", type, pid).redirectErrorStream(true).start();
    BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
    reader.readLine(); // skip the header row
    System.out.println(String.format("%s: %.0f", type, Integer.parseInt(reader.readLine()) / 1024d));
}
{code}
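The seed concern raised in this comment can be illustrated in isolation. The sketch below is a simplified assumption, not part of either benchmark: a hypothetical `sizes` helper draws allocation sizes the same way the loop above does (a multiplier from `rand.nextInt(128)` times 128 bytes, skipping zero) but omits the interleaved `nextBytes` call. With a fixed seed, two runs see an identical sequence of sizes, so differences in the end state can be attributed to the allocator rather than to the workload.

```java
import java.util.Arrays;
import java.util.Random;

class SeededSizes
{
    // Draw n allocation sizes: a multiplier in [1, 127] times a 128-byte
    // unit, with zero draws skipped as in the benchmark loop.
    static int[] sizes(long seed, int n)
    {
        Random rand = new Random(seed);
        int[] out = new int[n];
        int filled = 0;
        while (filled < n)
        {
            int size = rand.nextInt(128);
            if (size <= 0)
                continue;
            out[filled++] = size * 128;
        }
        return out;
    }

    public static void main(String[] args)
    {
        int[] a = sizes(0, 1000);
        int[] b = sizes(0, 1000);
        // java.util.Random is specified to repeat its sequence for a given seed
        System.out.println("same seed, same workload: " + Arrays.equals(a, b));
    }
}
```

Seeding from the clock (the default `new Random()` behaviour) gives each run a different sequence, which is the source of run-to-run variation the comment points at.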
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968847#comment-13968847 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

JEMalloc will do better if you have differently sized objects (if they are all the same size there isn't much fragmentation). The benchmark in the other ticket uses 5 200, which gives you a distribution that can be compared... I don't think the test you are proposing is valid...

Anyway, I don't really have any bias on this ticket, as we don't use this in production and are thinking about JNI alternatives for the serialized cache anyway...
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968859#comment-13968859 ]

Benedict commented on CASSANDRA-7030:
-------------------------------------

Nope, I copied that behaviour from your earlier ticket to make it as comparable as possible. Your code:

{code}
for (int i = 0; i < loops; i++)
{
    long size = rand.nextInt(100);
    if (size <= 0)
        continue;
    tryMemory(size * 1024);
}
{code}

I played with various multipliers, and I also tried a completely uniform size distribution. I see no difference in results (results for a uniform distribution of 64K sizes follow):

{noformat}
malloc:
Items: 10M
Elapsed: 2.911s
Allocated: 2047Mb
VM total:187
vsz: 3717
rsz: 2307

jemalloc:
Items: 10M
Elapsed: 23.869s
Allocated: 2047Mb
VM total:192
vsz: 4148
rsz: 2721
{noformat}

That said, it would be interesting to see how jemalloc performs with JNI rather than JNA. But as things stand, it doesn't seem a sensible option to offer, as it uses more memory and is slower.
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968863#comment-13968863 ]

Benedict commented on CASSANDRA-7030:
-------------------------------------

Also, I obviously can't easily try this on a machine with as much memory as your original test, and it is possible malloc does not scale well to larger volumes of memory. However, I would be surprised if in those cases it resulted in worse memory overhead; again, I would expect a penalty in allocation _time_ only.
[jira] [Commented] (CASSANDRA-7030) Remove JEMallocAllocator
[ https://issues.apache.org/jira/browse/CASSANDRA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968960#comment-13968960 ]

Vijay commented on CASSANDRA-7030:
----------------------------------

Not enough iterations, I guess. I just ran the old tool again (to verify that I am still making sense :) ) and I do get the same kind of results (http://pastebin.com/LPFUutaY)... Things might be different, like the kernel version, etc.

Also, there is no difference in the allocator except that malloc is overridden to do threaded allocation; you can see the benchmarks inline too.