[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2016-04-07 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229806#comment-15229806
 ] 

Branimir Lambov commented on CASSANDRA-4338:


Switching to byte buffers, and making the direct/on-heap choice that makes the 
most sense for the subclass or compressor, was implemented as part of 
CASSANDRA-8709, which is included in 2.2.

The issue is now obsolete, unless we want to backport the patch to 2.1.

> Experiment with direct buffer in SequentialWriter
> -
>
> Key: CASSANDRA-4338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>  Labels: performance
> Fix For: 2.1.x
>
> Attachments: 4338-gc.tar.gz, 4338.benchmark.png, 
> 4338.benchmark.snappycompressor.png, 4338.single_node.read.png, 
> 4338.single_node.write.png, gc-4338-patched.png, gc-trunk-me.png, 
> gc-trunk.png, gc-with-patch-me.png
>
>
> Using a direct buffer instead of a heap-based byte[] should let us avoid a 
> copy into native memory when we flush the buffer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2016-04-06 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228440#comment-15228440
 ] 

Sylvain Lebresne commented on CASSANDRA-4338:
-

[~blambov] Can you comment on Jonathan's question above?



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2015-08-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682261#comment-14682261
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Is this obsoleted by CASSANDRA-9500?

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Branimir Lambov
Priority: Minor
  Labels: performance
 Fix For: 2.1.x

 Attachments: 4338-gc.tar.gz, 4338.benchmark.png, 
 4338.benchmark.snappycompressor.png, 4338.single_node.read.png, 
 4338.single_node.write.png, gc-4338-patched.png, gc-trunk-me.png, 
 gc-trunk.png, gc-with-patch-me.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.





[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796473#comment-13796473
 ] 

Marcus Eriksson commented on CASSANDRA-4338:


Read performance should be exactly the same; nothing has been touched there.

I want to do the same experiment for RAR/CRAR (reading into a direct BB and 
decompressing off-heap), and hope to do that soon.


 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338.benchmark.png, 4338.benchmark.snappycompressor.png, 
 4338-gc.tar.gz, 4338.single_node.read.png, 4338.single_node.write.png, 
 gc-4338-patched.png, gc-trunk-me.png, gc-trunk.png, gc-with-patch-me.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.





[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-11 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792686#comment-13792686
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

Hmm, reading from a single node may not have a very high statistical 
significance:

[data from second 
attempt|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.4338.CompressedSequentialWriter.single_node.2.json&metric=interval_op_rate&operation=stress-read&smoothing=4]



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792732#comment-13792732
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Maybe we need those stress improvements [~benedict] was working on.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-11 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792787#comment-13792787
 ] 

Benedict commented on CASSANDRA-4338:
-

I've not deliberately tested out my patch on writes, but I wouldn't expect as 
dramatic an improvement in consistency once I/O starts entering the picture. 
Might well make some difference, though. For the read run, not sure what 
happened there on the Marcus branch. It looks to me like (maybe) some of the 
stress workers get ahead and finish first, leaving the cache less polluted for 
the remaining workers. Inconsistent worker count was the cause of persistent 
drops in performance for my read tests (but here it could explain peaks). If 
so, my patch will fix that, though could also try running with a lower thread 
count to confirm.

If you want to try with my patch (which will maintain same thread count 
throughout), any of the linked repos in ticket 
[CASSANDRA-4718|https://issues.apache.org/jira/browse/CASSANDRA-4718] will do.

Btw, have we considered benchmarking these snappy changes for messaging service 
connections? Might well reduce the software side of the network overhead, 
although not as dramatically. I do see most of the connection CPU being used in 
snappy native arrayCopy.




[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-10 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791289#comment-13791289
 ] 

Marcus Eriksson commented on CASSANDRA-4338:


OK, thanks, doesn't look like a big difference then.

I kind of like that the big dips in performance (probably caused by GC) are 
basically gone though.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338.benchmark.png, 4338.benchmark.snappycompressor.png, 
 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk-me.png, gc-trunk.png, 
 gc-with-patch-me.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.





[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791547#comment-13791547
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Ryan, can you also test on a single node?

The single-node improvements may still be swamped by the network overhead, 
but if we can reduce that with some of the other efforts going on 
(CASSANDRA-1632, CASSANDRA-4718) then local performance will matter more.

But if Ryan doesn't see much difference on a single node either then we should 
figure out what the environment difference is.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791548#comment-13791548
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

I also did some quick looking for a ByteBuffer-capable Checksum implementation.

hadoop-common has a NativeCrc32 (using the new Intel instructions, I think), but 
only for verifying checksums and not generating them.

Adler32 gets {{update(ByteBuffer)}}... in jdk8.
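
As an illustration of the JDK 8 overload mentioned above, here is a minimal sketch of checksumming a direct buffer without first copying it into a byte[]. The class and method names (DirectBufferChecksum, adler32) are made up for this example and are not taken from any patch on this ticket.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

// Hypothetical helper, for illustration only.
public class DirectBufferChecksum {

    /** Checksums the remaining bytes of buf; works for heap and direct buffers (JDK 8+). */
    public static long adler32(ByteBuffer buf) {
        Adler32 checksum = new Adler32();
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer position is left untouched.
        checksum.update(buf.duplicate());
        return checksum.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "some chunk of data".getBytes(StandardCharsets.UTF_8);

        // Same bytes, once off-heap and once as a plain array.
        ByteBuffer direct = ByteBuffer.allocateDirect(data.length);
        direct.put(data);
        direct.flip();

        Adler32 onHeap = new Adler32();
        onHeap.update(data, 0, data.length);

        // Both paths should agree on the checksum value.
        System.out.println(adler32(direct) == onHeap.getValue());
    }
}
```

On JDK 7 and earlier this overload does not exist, which is presumably why a custom native Adler32 was needed for the experiment.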




[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-10 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791713#comment-13791713
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

On a single node:

!4338.single_node.write.png!

The read was weird, I don't know what that spike is. I'm rerunning this to see 
if it does it again:

!4338.single_node.read.png!



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-09 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790208#comment-13790208
 ] 

Marcus Eriksson commented on CASSANDRA-4338:


yep, that looks very similar

did you run it with -I SnappyCompressor ?

I've rebased and (force) pushed to 
https://github.com/krummas/cassandra/tree/marcuse/4338 to get the latency stuff 
in.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338.benchmark.png, 4338-gc.tar.gz, gc-4338-patched.png, 
 gc-trunk-me.png, gc-trunk.png, gc-with-patch-me.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.





[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-09 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790383#comment-13790383
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

{quote}
did you run it with -I SnappyCompressor ?
{quote}

No, I missed that variable, I'll rerun with that as well as for latency metrics.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-09 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790573#comment-13790573
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

With SnappyCompressor:

!4338.benchmark.snappycompressor.png!



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-08 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789713#comment-13789713
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

trunk is working again, so I have a baseline now:

!4338.benchmark.png!

[data 
here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.4338.CompressedSequentialWriter.json&metric=interval_op_rate&operation=stress-write&smoothing=3]



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-08 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789722#comment-13789722
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

And I notice that the marcuse/4338 line still doesn't have latency metrics, if 
you'd like me to re-run for those stats, I can. Just need to rebase off of 
CASSANDRA-6153 (or rewrite my tool to use a known good cassandra-stress; right 
now it just takes the one from the same branch it's testing.)



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-07 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788367#comment-13788367
 ] 

Ryan McGuire commented on CASSANDRA-4338:
-

I started to run a benchmark for this but I found CASSANDRA-6153 and 
CASSANDRA-6154 standing in my way.

[Here's the 
data|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.4338.CompressedSequentialWriter.json&metric=interval_op_rate&operation=stress-write&smoothing=4]
 for my test with [~krummas]' patch, but it's missing any sort of baseline 
because of those above bugs. 

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk-me.png, 
 gc-trunk.png, gc-with-patch-me.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.





[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783895#comment-13783895
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Promising!



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-10-02 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783883#comment-13783883
 ] 

Marcus Eriksson commented on CASSANDRA-4338:


So, I got the CompressedSequentialWriter working; code pushed to github: 
https://github.com/krummas/cassandra/commits/marcuse/4338. It uses Snappy's 
direct ByteBuffer support, and a custom Adler32 made by me that can checksum 
direct byte buffers (code here: https://github.com/krummas/adler32; it probably 
only builds on Linux, I did not spend much time on it).

Micro benchmarks look great, almost no GC at all with the patched version (the 
benchmark is left in main(...) in CompressedSequentialWriter.java):
h2. Trunk
!gc-trunk-me.png!

h2. Patched
!gc-with-patch-me.png!

Proper single-node stress benchmarks look good as well:
h2. Trunk
{noformat}
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
394141,39414,39414,0.0,0.0,0.0,10
1078321,68418,68418,0.0,0.0,0.0,20
1726219,64789,64789,0.0,0.0,0.0,30
2327295,60107,60107,0.0,0.0,0.0,40
2928533,60123,60123,0.0,0.0,0.0,50
3533878,60534,60534,0.0,0.0,0.0,60
3602168,6829,6829,0.0,0.0,0.0,70
3967820,36565,36565,0.0,0.0,0.0,80
4647217,67939,67939,0.0,0.0,0.0,91
5248142,60092,60092,0.0,0.0,0.0,101
5930662,68252,68252,0.0,0.0,0.0,111
6417903,48724,48724,0.0,0.0,0.0,121
6952933,53503,53503,0.0,0.0,0.0,131
7221662,26872,26872,0.0,0.0,0.0,141
7221662,0,0,0.0,0.0,0.0,151
7221662,0,0,0.0,0.0,0.0,161
7221662,0,0,0.0,0.0,0.0,172
7221662,0,0,0.0,0.0,0.0,182
7221662,0,0,0.0,0.0,0.0,192
7509240,28757,28757,0.0,0.0,0.0,202
7780984,27174,27174,0.0,0.0,0.0,212
7780984,0,0,0.0,0.0,0.0,222
7780984,0,0,0.0,0.0,0.0,232
7780984,0,0,0.0,0.0,0.0,242
8414140,63315,63315,0.0,0.0,0.0,252
8968246,55410,55410,0.0,0.0,0.0,263
9669857,70161,70161,0.0,0.0,0.0,273
10236467,56661,56661,0.0,0.0,0.0,283
10774593,53812,53812,0.0,0.0,0.0,293
10824657,5006,5006,0.0,0.0,0.0,303
11165174,34051,34051,0.0,0.0,0.0,313
11165174,0,0,0.0,0.0,0.0,323
11165174,0,0,0.0,0.0,0.0,333
11165174,0,0,0.0,0.0,0.0,343
11304248,13907,13907,0.0,0.0,0.0,354
11927380,62313,62313,0.0,0.0,0.0,364
12526960,59958,59958,0.0,0.0,0.0,374
13234647,70768,70768,0.0,0.0,0.0,384
13792652,55800,55800,0.0,0.0,0.0,394
14329718,53706,53706,0.0,0.0,0.0,404
14512350,18263,18263,0.0,0.0,0.0,414
14512929,57,57,0.0,0.0,0.0,424
14710476,19754,19754,0.0,0.0,0.0,434
14710476,0,0,0.0,0.0,0.0,445
14710476,0,0,0.0,0.0,0.0,455
15061043,35056,35056,0.0,0.0,0.0,465
15760509,69946,69946,0.0,0.0,0.0,475
16461318,70080,70080,0.0,0.0,0.0,485
17126749,66543,66543,0.0,0.0,0.0,495
17708154,58140,58140,0.0,0.0,0.0,505
18226801,51864,51864,0.0,0.0,0.0,515
18226801,0,0,0.0,0.0,0.0,526
18227225,42,42,0.0,0.0,0.0,536
18858228,63100,63100,0.0,0.0,0.0,546
19459047,60081,60081,0.0,0.0,0.0,556
19988583,52953,52953,0.0,0.0,0.0,566
2000,1141,1141,0.0,0.0,0.0,567


Averages from the middle 80% of values:
interval_op_rate  : 34003
interval_key_rate : 34003
latency median: 0.0
latency 95th percentile   : 0.0
latency 99.9th percentile : 0.0
Total operation time  : 00:09:27
END
{noformat}
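
For readers wondering how a figure like "Averages from the middle 80% of values" can be derived, here is a rough sketch of one plausible reading: sort the interval samples, drop the lowest and highest 10%, and average the rest. This is an assumption for illustration, not the actual cassandra-stress code, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper; the trimming rule is an assumption, not taken from cassandra-stress.
public class MiddleEightyAverage {

    /** Averages what is left after dropping the lowest and highest 10% of samples (rounded down). */
    public static double middle80Average(long[] values) {
        long[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);
        int trim = sorted.length / 10;    // number of samples dropped at each end
        long sum = 0;
        for (int i = trim; i < sorted.length - trim; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2 * trim);
    }

    public static void main(String[] args) {
        // First ten interval_op_rate samples from the trunk run above.
        long[] rates = {39414, 68418, 64789, 60107, 60123, 60534, 6829, 36565, 67939, 60092};
        System.out.println(middle80Average(rates));
    }
}
```

Trimming both tails keeps a few stalled intervals (the zero rows) or a single burst from dominating the summary, although long stalls still pull the average down.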
h2. Patched version
{noformat}
total,interval_op_rate,interval_key_rate,latency,95th,99.9th,elapsed_time
398380,39838,39838,0.0,0.0,0.0,10
1090332,69195,69195,0.0,0.0,0.0,20
1756859,66652,66652,0.0,0.0,0.0,30
2408330,65147,65147,0.0,0.0,0.0,40
3021314,61298,61298,0.0,0.0,0.0,50
3602221,58090,58090,0.0,0.0,0.0,60
3602221,0,0,0.0,0.0,0.0,70
4086404,48418,48418,0.0,0.0,0.0,80
4670997,58459,58459,0.0,0.0,0.0,91
5328657,65766,65766,0.0,0.0,0.0,101
5950535,62187,62187,0.0,0.0,0.0,111
6544475,59394,59394,0.0,0.0,0.0,121
7163644,61916,61916,0.0,0.0,0.0,131
7307634,14399,14399,0.0,0.0,0.0,141
7331684,2405,2405,0.0,0.0,0.0,151
7989707,65802,65802,0.0,0.0,0.0,161
8653302,66359,66359,0.0,0.0,0.0,172
9273188,61988,61988,0.0,0.0,0.0,182
9935986,66279,66279,0.0,0.0,0.0,192
10489010,55302,55302,0.0,0.0,0.0,202
10909996,42098,42098,0.0,0.0,0.0,212
10962871,5287,5287,0.0,0.0,0.0,222
11274293,31142,31142,0.0,0.0,0.0,232
11274293,0,0,0.0,0.0,0.0,242
11274293,0,0,0.0,0.0,0.0,252
11297105,2281,2281,0.0,0.0,0.0,263
11946842,64973,64973,0.0,0.0,0.0,273
12509283,56244,56244,0.0,0.0,0.0,283
13205933,69665,69665,0.0,0.0,0.0,293
13809534,60360,60360,0.0,0.0,0.0,303
14334735,52520,52520,0.0,0.0,0.0,313
14615255,28052,28052,0.0,0.0,0.0,323
14615958,70,70,0.0,0.0,0.0,333
14841997,22603,22603,0.0,0.0,0.0,343
14841997,0,0,0.0,0.0,0.0,354
14841997,0,0,0.0,0.0,0.0,364
15262968,42097,42097,0.0,0.0,0.0,374
15943731,68076,68076,0.0,0.0,0.0,384
16619205,67547,67547,0.0,0.0,0.0,394
17197417,57821,57821,0.0,0.0,0.0,404
17776353,57893,57893,0.0,0.0,0.0,414
18235461,45910,45910,0.0,0.0,0.0,424
18267460,3199,3199,0.0,0.0,0.0,434
18592152,32469,32469,0.0,0.0,0.0,445
18732480,14032,14032,0.0,0.0,0.0,455
19328150,59567,59567,0.0,0.0,0.0,465
19930114,60196,60196,0.0,0.0,0.0,475
2000,6988,6988,0.0,0.0,0.0,479


Averages 

[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-26 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778871#comment-13778871
 ] 

Marcus Eriksson commented on CASSANDRA-4338:


So, using a direct ByteBuffer in SequentialWriter generates a lot less garbage 
in my micro benchmarks (will post patch and graphs later), mostly by not 
having to copy the incoming byte array, instead just pushing the data to a 
direct BB. It is also a bit faster (~5%), maybe just because of less GC.
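
To make the idea concrete, here is a minimal sketch of a writer that stages incoming bytes in a direct ByteBuffer and hands that buffer to a FileChannel on flush. It is not the patch from this ticket: the class name DirectBufferWriter, its constructor and its methods are hypothetical, and it assumes a positive buffer size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not Cassandra's SequentialWriter.
public class DirectBufferWriter implements AutoCloseable {
    private final FileChannel channel;
    private final ByteBuffer buffer;

    /** bufferSize is assumed to be greater than zero. */
    public DirectBufferWriter(Path path, int bufferSize) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
        // Off-heap staging buffer: the channel can write from it directly.
        this.buffer = ByteBuffer.allocateDirect(bufferSize);
    }

    /** Copies the caller's bytes into the direct buffer, flushing whenever it fills up. */
    public void write(byte[] src, int offset, int length) throws IOException {
        while (length > 0) {
            if (!buffer.hasRemaining()) {
                flush();
            }
            int n = Math.min(length, buffer.remaining());
            buffer.put(src, offset, n);
            offset += n;
            length -= n;
        }
    }

    /** Writes everything buffered so far to the file and resets the buffer. */
    public void flush() throws IOException {
        buffer.flip();
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
        buffer.clear();
    }

    @Override
    public void close() throws IOException {
        try {
            flush();
        } finally {
            channel.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("direct-writer", ".bin");
        try (DirectBufferWriter writer = new DirectBufferWriter(tmp, 16)) {
            byte[] payload = new byte[100];
            writer.write(payload, 0, payload.length);
        }
        System.out.println(Files.size(tmp));
        Files.delete(tmp);
    }
}
```

The one long-lived direct buffer is reused for every flush, which is where the reduction in garbage would be expected to come from; whether that shows up as a throughput gain depends on the workload, as the benchmarks in this thread suggest.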

Making it work with CompressedSequentialWriter is not as easy, since we then 
need to either use a standard byte[] buffer and compress that before pushing it 
off-heap/to disk, or copy to the heap, compress, and then push it back. Neither 
would be an improvement.

But then I found out that Snappy can compress a direct byte buffer without 
copying anything to the heap:
https://github.com/xerial/snappy-java/blob/develop/src/main/java/org/xerial/snappy/Snappy.java#L126

The problem is that LZ4 does not support that (yet?):
https://github.com/jpountz/lz4-java/issues/9

Hadoop seems to ship their own native code to solve this problem:
https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/

Also related:
https://issues.apache.org/jira/browse/HADOOP-8148

I will experiment with making this work with Snappy and see how much we can 
gain by doing it.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778902#comment-13778902
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

/throws up the [~jpountz] signal



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-26 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779098#comment-13779098
 ] 

Adrien Grand commented on CASSANDRA-4338:
-

Interesting, I was wondering whether people actually need to compress from/to 
byte buffers. Now that I know that some do, I can try to move this issue 
forward.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774798#comment-13774798
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Also relevant, Radim said he got a large improvement from mmap-based writes in 
CASSANDRA-5473.
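
For readers unfamiliar with the term, an mmap-based write in Java looks roughly like the sketch below. This is a generic illustration and not the code from CASSANDRA-5473; the class and method names are hypothetical. The file is mapped into memory and the bytes are stored into the mapping directly, rather than being passed through `write()` calls.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not taken from any Cassandra patch.
final class MmapWriteSketch {

    // Maps the first data.length bytes of the file and copies data into the mapping.
    // Assumes data is non-empty.
    static void writeMapped(Path file, byte[] data) throws IOException {
        // READ_WRITE mappings require the channel to be open for reading and writing.
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // Mapping beyond the current end of file grows the file to the mapped size.
            MappedByteBuffer region = channel.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            region.put(data);
            // Ask the OS to push the dirty pages to the storage device.
            region.force();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap", ".bin");
        byte[] data = new byte[1 << 16];
        writeMapped(file, data);
        System.out.println("file size: " + Files.size(file));
    }
}
```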

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Marcus Eriksson
Priority: Minor
  Labels: performance
 Fix For: 2.1

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-04-11 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629663#comment-13629663
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Relevant: 
http://mechanical-sympathy.blogspot.com/2011/12/java-sequential-io-performance.html

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Aleksey Yeschenko
Priority: Minor
 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2012-08-21 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439010#comment-13439010
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Any difference in CPU usage with the direct buffer patch? If we're not maxing 
out the CPU, the patch wouldn't necessarily run faster even if it's more efficient.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.2.0

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2012-06-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401848#comment-13401848
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

I'd vote for:

- test with LCS, with and without compression (maybe even reduce sstable size to 
1MB to really stress sstable creation)
- enable GC logging and count promotion failures so we have quantitative data (if 
we see zero both ways, we may need a more complex test)
- if instead we see nonzero promotion failures both ways, at about the same 
rate, we might need to look at using our cleaner hack to free the direct 
buffers, or use a buffer based on FreeableMemory, to avoid the PhantomReference 
crap that DirectBuffer normally inflicts on GC
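
The "count promotion failures" step above could be done with something as small as the sketch below. It assumes the GC log marks such events with the literal text "promotion failed", as the ParNew/CMS collectors do when detailed GC logging is enabled; other collectors or log formats would need a different marker. The class name and the command-line usage are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper for comparing GC logs from a trunk run and a patched run.
final class PromotionFailureCounter {

    // Assumed marker text; adjust for the collector and log format in use.
    static final String MARKER = "promotion failed";

    // Counts the lines that contain the marker.
    static long count(Stream<String> lines) {
        return lines.filter(line -> line.contains(MARKER)).count();
    }

    // Counts marker lines in a GC log file.
    static long count(Path gcLog) throws IOException {
        try (Stream<String> lines = Files.lines(gcLog)) {
            return count(lines);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: PromotionFailureCounter <gc-log> [<gc-log> ...]");
            return;
        }
        // Print one count per log so the two runs can be compared side by side.
        for (String arg : args) {
            System.out.println(arg + ": " + count(Paths.get(arg)));
        }
    }
}
```

Running it against one log from each configuration gives the two numbers the test plan asks for.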

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.2

 Attachments: gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2012-06-15 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295771#comment-13295771
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Using direct buffers for RAR and CRAR (RandomAccessReader and 
CompressedRandomAccessReader) may also help avoid heap fragmentation.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.2


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.
