[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS

2016-01-20 Thread Christian Winther (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110147#comment-15110147
 ] 

Christian Winther commented on CASSANDRA-9666:
--

I'm running 2.2.4, and prior to changing I did not see the same stability and 
performance from DTCS as I now do from TWCS - especially in compaction.

> Provide an alternative to DTCS
> --
>
> Key: CASSANDRA-9666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9666
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
> Fix For: 2.1.x, 2.2.x
>
> Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png
>
>
> DTCS is great for time series data, but it comes with caveats that make it 
> difficult to use in production (typical operator behaviors such as bootstrap, 
> removenode, and repair have MAJOR caveats as they relate to 
> max_sstable_age_days, and hints/read repair break the selection algorithm).
> I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices 
> the tiered nature of DTCS in order to address some of DTCS' operational 
> shortcomings. I believe it is necessary to propose an alternative rather than 
> simply adjusting DTCS, because it fundamentally removes the tiered nature in 
> order to remove the parameter max_sstable_age_days - the result is very very 
> different, even if it is heavily inspired by DTCS. 
> Specifically, rather than creating a number of windows of ever increasing 
> sizes, this strategy allows an operator to choose the window size, compact 
> with STCS within the first window of that size, and aggressively compact down 
> to a single sstable once that window is no longer current. The window size is 
> a combination of unit (minutes, hours, days) and size (1, etc), such that an 
> operator can expect all data using a block of that size to be compacted 
> together (that is, if your unit is hours, and size is 6, you will create 
> roughly 4 sstables per day, each one containing roughly 6 hours of data). 
> The result addresses a number of the problems with 
> DateTieredCompactionStrategy:
> - At the present time, DTCS’s first window is compacted using an unusual 
> selection criterion, which prefers files with earlier timestamps but ignores 
> sizes. In TimeWindowCompactionStrategy, the first window data will be 
> compacted with the well tested, fast, reliable STCS. All STCS options can be 
> passed to TimeWindowCompactionStrategy to configure the first window’s 
> compaction behavior.
> - HintedHandoff may put old data in new sstables, but it will have little 
> impact other than slightly reduced efficiency (sstables will cover a wider 
> range, but the old timestamps will not impact sstable selection criteria 
> during compaction)
> - ReadRepair may put old data in new sstables, but it will have little impact 
> other than slightly reduced efficiency (sstables will cover a wider range, 
> but the old timestamps will not impact sstable selection criteria during 
> compaction)
> - Small, old sstables resulting from streams of any kind will be swiftly and 
> aggressively compacted with the other sstables matching their similar 
> maxTimestamp, without causing sstables in neighboring windows to grow in size.
> - The configuration options are explicit and straightforward - the tuning 
> parameters leave little room for error. The window is set in common, easily 
> understandable terms such as “12 hours”, “1 Day”, “30 days”. The 
> minute/hour/day options are granular enough for users keeping data for hours, 
> and users keeping data for years. 
> - There is no explicitly configurable max sstable age, though sstables will 
> naturally stop compacting once new data is written in that window. 
> - Streaming operations can create sstables with old timestamps, and they'll 
> naturally be joined together with sstables in the same time bucket. This is 
> true for bootstrap/repair/sstableloader/removenode. 
> - It remains true that if old data and new data are written into the memtable 
> at the same time, the resulting sstables will be treated as if they were new 
> sstables, however, that no longer negatively impacts the compaction 
> strategy’s selection criteria for older windows. 
> Patch provided for: 
> - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 
> - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2
> - trunk (post-8099):  https://github.com/jeffjirsa/cassandra/commits/twcs 
> Rebased, force-pushed July 18, with bug fixes for estimated pending 
> compactions and potential starvation if more than min_threshold tables 
> existed in current window but STCS did not consider them viable candidates
> Rebased, force-pushed Aug 20 to bring in relevant logic from CASSANDRA-9882
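
For illustration, a table using the proposed strategy would be declared roughly 
as follows. This is a sketch only: the option names are taken from the linked 
patch branches, the short class name assumes a build with the patch applied, 
and the keyspace, table, and columns are placeholders.

{code}
-- Hypothetical time-series table compacted with the proposed TWCS,
-- using 6-hour windows (roughly 4 sstables per day, as described above).
CREATE TABLE metrics.sensor_readings (
    sensor_id text,
    ts        timestamp,
    value     double,
    PRIMARY KEY (sensor_id, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'HOURS',
    'compaction_window_size': '6'
  };
{code}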




[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS

2016-01-20 Thread Christian Winther (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110055#comment-15110055
 ] 

Christian Winther commented on CASSANDRA-9666:
--

I'd be sad about it not getting into C*.

I've also done the DTCS -> TWCS migration on my data set, and the number of 
hours I spend on Cassandra maintenance and monitoring per week has dropped from 
5-10 to 1. It just works: no crazy compactions, no 100GB SSTables, stable 
read/write performance, and no running amok with sstables during 2-3 day 
compactions.

It's just a far better experience for a C* newbie than DTCS - and quite a bit 
easier to understand and tweak as well.

C* would be less awesome in my mind if TWCS were not included.
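
For anyone attempting the same DTCS -> TWCS migration, the switch itself is a 
single schema change. The following is a sketch only, assuming a build with the 
TWCS patch applied; the table name and window settings are placeholders.

{code}
-- Existing sstables are re-bucketed by their max timestamp, so expect
-- a one-off burst of compaction activity right after the change.
ALTER TABLE my_ks.my_timeseries
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
  };
{code}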


[jira] [Comment Edited] (CASSANDRA-8167) sstablesplit tool can be made much faster with few JVM settings

2016-01-10 Thread Christian Winther (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091037#comment-15091037
 ] 

Christian Winther edited comment on CASSANDRA-8167 at 1/10/16 1:28 PM:
---

I've seen similar - I've simply added

{code}
. /etc/cassandra/cassandra-env.sh
{code}

before

{code}
"$JAVA" $JAVA_AGENT -ea -cp "$CLASSPATH" $JVM_OPTS \
    -Dcassandra.storagedir="$cassandra_storagedir" \
    -Dlogback.configurationFile=logback-tools.xml \
    org.apache.cassandra.tools.StandaloneSplitter "$@"
{code}

and removed all other JVM opts from the original execution



> sstablesplit tool can be made much faster with few JVM settings
> ---
>
> Key: CASSANDRA-8167
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8167
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Nikolai Grigoriev
>Priority: Trivial
>
> I had to use sstablesplit tool intensively to split some really huge 
> sstables. The tool is painfully slow as it does compaction in one single 
> thread.
> I have just found that on one of my machines the tool crashed when I was 
> almost done with 152Gb sstable (!!!). 
> {code}
>  INFO 16:59:22,342 Writing Memtable-compactions_in_progress@1948660572(0/0 
> serialized/live bytes, 1 ops)
>  INFO 16:59:22,352 Completed flushing 
> /cassandra-data/disk1/system/compactions_in_progress/system-compactions_in_progress-jb-79242-Data.db
>  (42 bytes) for commitlog position ReplayPosition(segmentId=1413904450653, 
> position=69178)
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit 
> exceeded
> at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:586)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
> at 
> org.apache.cassandra.db.RangeTombstoneList$InOrderTester.isDeleted(RangeTombstoneList.java:751)
> at 
> org.apache.cassandra.db.DeletionInfo$InOrderTester.isDeleted(DeletionInfo.java:422)
> at 
> org.apache.cassandra.db.DeletionInfo$InOrderTester.isDeleted(DeletionInfo.java:403)
> at 
> org.apache.cassandra.db.ColumnFamily.hasIrrelevantData(ColumnFamily.java:489)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.removeDeleted(PrecompactedRow.java:66)
> at 
> org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
> at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
> at 
> org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
> at 
> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:204)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:154)
> at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
> at 
> org.apache.cassandra.db.compaction.SSTableSplitter.split(SSTableSplitter.java:38)
> at 
> org.apache.cassandra.tools.StandaloneSplitter.main(StandaloneSplitter.java:150)
> {code}
> This has triggered my desire to see what memory settings are used for the JVM 
> running the tool... and I have found that it runs with default Java settings 
> (no settings at all).
> I have tried to apply the settings from C* itself and this resulted in over 
> 40% speed increase. It went from ~5Mb/s to 7Mb/s - from the compressed output 
> perspective. I believe this is mostly due to concurrent GC. I see my CPU 
> usage has increased to ~200%. But this is fine, this is an offline tool, the 
> node is down anyway. I know that concurrent GC (at least something like 
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled) 
> normally improves the performance of even primitive single-threaded 
> heap-intensive
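
Putting the description and the comment above together, here is a minimal 
sketch of a patched tool wrapper. It assumes the stock script has already set 
$JAVA, $CLASSPATH, and $cassandra_storagedir as the shipped tools/bin scripts 
do, and that cassandra-env.sh lives at the Debian packaging path; adjust both 
for your install.

{code}
# Source the server's env file so the offline tool inherits Cassandra's
# heap sizing and concurrent GC settings instead of JVM defaults.
. /etc/cassandra/cassandra-env.sh

# Fall back to the GC flags named in the description if nothing was set.
[ -z "$JVM_OPTS" ] && JVM_OPTS="-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled"

"$JAVA" -ea -cp "$CLASSPATH" $JVM_OPTS \
    -Dcassandra.storagedir="$cassandra_storagedir" \
    -Dlogback.configurationFile=logback-tools.xml \
    org.apache.cassandra.tools.StandaloneSplitter "$@"
{code}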


[jira] [Commented] (CASSANDRA-10478) Seek position is not within mmap segment

2015-10-12 Thread Christian Winther (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953025#comment-14953025
 ] 

Christian Winther commented on CASSANDRA-10478:
---

Okay, I changed to standard disk mode; my dataset is just a few TB, so 
rebuilding the summary files went pretty fast.

Will the next release be 2.2.3?

> Seek position is not within mmap segment
> 
>
> Key: CASSANDRA-10478
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10478
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.2.2 
> Java 1.8.0.60
>Reporter: Omri Iluz
>Assignee: Benedict
>Priority: Critical
> Fix For: 2.2.3, 2.1.11
>
>
> After upgrading to 2.2.2 we started seeing timeouts accompanied by the 
> following error in the log. Disabling mmap (by using "disk_access_mode: 
> standard") completely solves the problem.
> We did not experience this problem in 2.2.1.
> The change to src/java/org/apache/cassandra/io/util/ByteBufferDataInput.java 
> in the following commit seems interesting as it changes the calculation of 
> the mmap boundaries (and moves from <= to <) 
> https://github.com/apache/cassandra/commit/25de92e321604626d6c098233082904832c07814
>  
> {noformat}
> WARN  [SharedPool-Worker-1] 2015-10-07 03:40:39,771 
> AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.RuntimeException: 
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: 
> Seek position 717680 is not within mmap segment (seg offs: 0, length: 717680)
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-2.2.2.jar:2.2.2]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: 
> java.io.IOException: Seek position 717680 is not within mmap segment (seg 
> offs: 0, length: 717680)
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) 
> ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   ... 4 common frames omitted
> Caused by: java.io.IOException: Seek position 717680 is not within mmap 
> segment (seg offs: 0, length: 717680)
>   at 
> org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]
>   at 
> org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33)
>  ~[apache-cassandra-2.2.2.jar:2.2.2]

[jira] [Commented] (CASSANDRA-10478) Seek position is not within mmap segment

2015-10-12 Thread Christian Winther (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952919#comment-14952919
 ] 

Christian Winther commented on CASSANDRA-10478:
---

When will this be released as a binary? We are currently hitting the same issue.

Is there any workaround I can apply configuration-wise?
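
The workaround asked about here is the one from the issue description: disable 
mmap'd reads via an undocumented cassandra.yaml setting, applied per node and 
taking effect after a restart.

{code}
# cassandra.yaml - bypass mmap'd reads entirely until the fix ships.
disk_access_mode: standard
{code}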


[jira] [Updated] (CASSANDRA-10393) LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref)

2015-09-24 Thread Christian Winther (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Winther updated CASSANDRA-10393:
--
Description: 
When trying to run a full repair on a table with the following schema, my nodes 
stall and end up spamming the errors below.

I've recently changed the table from SizeTieredCompactionStrategy to 
LeveledCompactionStrategy.

Coming from 2.1.9 -> 2.2.0 -> 2.2.1, I ran upgradesstables without issue as well.

When trying to do a full repair after the compaction change, I got "out of 
order" errors. A few Google searches later, I was told to "scrub" the keyspace 
- did that during the night (no problems logged, and no data lost).

Now a repair just stalls and outputs memory-leak errors all over the place:

{code}
CREATE KEYSPACE sessions WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '3'}  AND durable_writes = true;

CREATE TABLE sessions.sessions (
id text PRIMARY KEY,
client_ip text,
controller text,
controller_action text,
created timestamp,
data text,
expires timestamp,
http_host text,
modified timestamp,
request_agent text,
request_agent_bot boolean,
request_path text,
site_id int,
user_id int
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"NONE", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
{code}

{code}
ERROR [Reference-Reaper:1] 2015-09-24 10:25:28,475 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@4428a373) to class 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@184765:/data/1/cassandra/sessions/sessions-77dd22f0ab9711e49cbc410c6b6f53a6/la-104037-big
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-09-24 10:25:28,475 Ref.java:187 - LEAK 
DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@368dd97) 
to class 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@184765:/data/1/cassandra/sessions/sessions-77dd22f0ab9711e49cbc410c6b6f53a6/la-104037-big
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-09-24 10:25:28,475 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@66fb78be) to class 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@184765:/data/1/cassandra/sessions/sessions-77dd22f0ab9711e49cbc410c6b6f53a6/la-104037-big
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-09-24 10:25:28,475 Ref.java:187 - LEAK 
DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@9fdd2e6) 
to class 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@1460906269:/data/1/cassandra/sessions/sessions-77dd22f0ab9711e49cbc410c6b6f53a6/la-104788-big
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-09-24 10:25:28,475 Ref.java:187 - LEAK 
DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@84fcb91) 
to class 
org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@1460906269:/data/1/cassandra/sessions/sessions-77dd22f0ab9711e49cbc410c6b6f53a6/la-104788-big
 was not released before the reference was garbage collected
{code}
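
For reference, a sketch of the scrub step mentioned above, using the keyspace 
and table names from the schema; it assumes nodetool is run on each affected 
node in turn.

{code}
# Scrub rewrites the table's sstables in place; a snapshot is taken
# first by default, so make sure there is disk headroom.
nodetool scrub sessions sessions
{code}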


[jira] [Created] (CASSANDRA-10393) LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref)

2015-09-24 Thread Christian Winther (JIRA)
Christian Winther created CASSANDRA-10393:
-

 Summary: LEAK DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref)
 Key: CASSANDRA-10393
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10393
 Project: Cassandra
  Issue Type: Bug
 Environment: v 2.2.1 (from apt)

-> lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:Debian GNU/Linux 7.8 (wheezy)
Release:7.8
Codename:   wheezy

-> java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Reporter: Christian Winther





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)