[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217517#comment-15217517 ] Jeff Jirsa commented on CASSANDRA-9666: --- {quote} Btw, as a side note, I believe the code for DTCS and TWCS is very similar, the alternatives of adding tiering to TWCS vs renaming DTCS options would probably end up with the same thing. {quote} I strongly encourage the project not to go down either of these routes - tiering is not worth the added complexity it introduces into the code or the user experience, and I assert that, perhaps counterintuitively, tiering increases write amplification in a time series workload rather than decreasing it. > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. 
> Specifically, rather than creating a number of windows of ever-increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressively compact down > to a single sstable once that window is no longer current. The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). > The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using unusual > selection criteria, which prefer files with earlier timestamps but ignore > sizes. In TimeWindowCompactionStrategy, data in the first window will be > compacted with the well-tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction). > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction). > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. 
The window is set in common, easily > understandable terms such as “12 hours”, “1 day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and for users keeping data for years. > - There is no explicitly configurable max sstable age, though sstables will > naturally stop being compacted once new data is no longer written in their window. > - Streaming operations can create sstables with old timestamps, and they'll > naturally be joined together with sstables in the same time bucket. This is > true for bootstrap/repair/sstableloader/removenode. > - It remains true that if old data and new data are written into the memtable > at the same time, the resulting sstables will be treated as if they were new > sstables; however, that no longer negatively impacts the compaction > strategy’s selection criteria for older windows. > Patch provided for: > - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 > - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 > - trunk (post-8099): https://github.com/jeffjirsa/cassandra
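The window configuration described above can be sketched in CQL. This is an illustrative sketch only: the table and column names are invented, and the option names ({{compaction_window_unit}}, {{compaction_window_size}}) are the ones used by the TWCS patch, so treat them as assumptions here.

```sql
-- Hypothetical time-series table using the proposed TimeWindowCompactionStrategy.
-- With unit HOURS and size 6, each window covers 6 hours of data, so roughly
-- 4 sstables per day remain once their windows are no longer current.
CREATE TABLE sensor_data (
    sensor_id uuid,
    reported_at timestamp,
    reading double,
    PRIMARY KEY (sensor_id, reported_at)
) WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'HOURS',
    'compaction_window_size': '6'
};
```

Data in the current 6-hour window is compacted with STCS (any STCS options passed through apply there); once a window closes, its sstables are aggressively compacted down to a single sstable.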
[jira] [Issue Comment Deleted] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-9666: -- Comment: was deleted (was: {quote} Btw, as a side note, I believe the code for DTCS and TWCS is very similar, the alternatives of adding tiering to TWCS vs renaming DTCS options would probably end up with the same thing. {quote} I strongly encourage the project not to go down either of these routes - tiering is not worth the added complexity it introduces into the code or the user experience, and I assert that, perhaps counterintuitively, tiering increases write amplification in a time series workload rather than decreasing it. ) > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. 
> Specifically, rather than creating a number of windows of ever-increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressively compact down > to a single sstable once that window is no longer current. The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). > The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using unusual > selection criteria, which prefer files with earlier timestamps but ignore > sizes. In TimeWindowCompactionStrategy, data in the first window will be > compacted with the well-tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction). > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction). > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. 
The window is set in common, easily > understandable terms such as “12 hours”, “1 day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and for users keeping data for years. > - There is no explicitly configurable max sstable age, though sstables will > naturally stop being compacted once new data is no longer written in their window. > - Streaming operations can create sstables with old timestamps, and they'll > naturally be joined together with sstables in the same time bucket. This is > true for bootstrap/repair/sstableloader/removenode. > - It remains true that if old data and new data are written into the memtable > at the same time, the resulting sstables will be treated as if they were new > sstables; however, that no longer negatively impacts the compaction > strategy’s selection criteria for older windows. > Patch provided for: > - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 > - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 > - trunk (post-8099): https://github.com/jeffjirsa/cassandra/commits/twcs > Rebased,
[jira] [Updated] (CASSANDRA-11460) memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stone updated CASSANDRA-11460: -- Description: env: cassandra3.3, jdk8, 8G RAM, so we set MAX_HEAP_SIZE="2G" and HEAP_NEWSIZE="400M". 1. We hit the same problem as: https://issues.apache.org/jira/browse/CASSANDRA-9549 I am confused, because this was fixed in release 3.3 according to this page: https://github.com/apache/cassandra/blob/trunk/CHANGES.txt So I changed to 3.4 and hit the problem again. I think this fix should be included in 3.3 and 3.4. Can you explain this? 2. Our write rate exceeds what our Cassandra environment can support, but I think Cassandra should decrease the write rate, or block and consume the written data to keep memory usage down before accepting more writes, rather than failing with an out-of-memory error. was: env: cassandra3.3 jdk8 met same problem about this: https://issues.apache.org/jira/browse/CASSANDRA-9549 I confuse about that this was fixed in release 3.3 according this page: https://github.com/apache/cassandra/blob/trunk/CHANGES.txt so I change to 3.4,and have not found this problem again at this time. I think this fix should be included in 3.3. can you explain about this? > memory leak > --- > > Key: CASSANDRA-11460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11460 > Project: Cassandra > Issue Type: Bug >Reporter: stone >Priority: Critical > Attachments: aaa.jpg > > > env: > cassandra3.3 > jdk8 > 8G RAM > so we set > MAX_HEAP_SIZE="2G" > HEAP_NEWSIZE="400M" > 1. We hit the same problem as: > https://issues.apache.org/jira/browse/CASSANDRA-9549 > I am confused, because this was fixed in release 3.3 according to this page: > https://github.com/apache/cassandra/blob/trunk/CHANGES.txt > So I changed to 3.4 and hit the problem again. > I think this fix should be included in 3.3 and 3.4. > Can you explain this? 
> 2. Our write rate exceeds what our Cassandra environment can support, > but I think Cassandra should decrease the write rate, or block and consume the written > data to keep memory usage down before accepting more writes, rather than failing with an out-of-memory error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11460) memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stone updated CASSANDRA-11460: -- Attachment: aaa.jpg > memory leak > --- > > Key: CASSANDRA-11460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11460 > Project: Cassandra > Issue Type: Bug >Reporter: stone >Priority: Critical > Attachments: aaa.jpg > > > env: > cassandra3.3 > jdk8 > We hit the same problem as: > https://issues.apache.org/jira/browse/CASSANDRA-9549 > I am confused, because this was fixed in release 3.3 according to this page: > https://github.com/apache/cassandra/blob/trunk/CHANGES.txt > So I changed to 3.4 and have not seen the problem again so far. > I think this fix should be included in 3.3. > Can you explain this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11459) cassandra performance problem when streaming large data
[ https://issues.apache.org/jira/browse/CASSANDRA-11459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217393#comment-15217393 ] Jeff Jirsa commented on CASSANDRA-11459: 2.0.x isn't supported anymore, but it sounds like CASSANDRA-9754. > cassandra performance problem when streaming large data > > > Key: CASSANDRA-11459 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11459 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: ubuntu 14.04, 3 nodes in each datacenter, > 1g networking, each node has 128G ram, 3*300G SSD in RAID5, dual E5-2620v3 > processors >Reporter: Yan Cui > > We found the problem on Cassandra 2.0.15, and have not tested on other > versions. > There is one core table, and the schema is > [user_id int, device_token text, deleted bool, device_info map, > human_code text]. > user_id and device_token are the primary key, and user_id is the partition key. > We have a statement that causes latency spikes (3500 ms to 4000 ms): > select * from table where user_id = . The hotuserid has roughly > 8 rows, averaging about 200 bytes per row. We understand this query may be > slower because it returns more results, but we did not expect it to be this > slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
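For reference, the schema described in the report corresponds to something like the following CQL sketch. The map's key/value type parameters were lost from the ticket text and the table name is not given, so both are assumptions here:

```sql
-- Sketch of the reported table: user_id is the partition key and
-- device_token the clustering key. The map parameters are assumed.
CREATE TABLE devices (
    user_id int,
    device_token text,
    deleted boolean,
    device_info map<text, text>,
    human_code text,
    PRIMARY KEY (user_id, device_token)
);

-- The slow statement reads an entire hot partition:
SELECT * FROM devices WHERE user_id = ?;
```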
[jira] [Commented] (CASSANDRA-9692) Print sensible units for all log messages
[ https://issues.apache.org/jira/browse/CASSANDRA-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217339#comment-15217339 ] Joel Knighton commented on CASSANDRA-9692: -- Great work on this - it's hard enough to contribute to one OSS project, let alone coordinate a change across three spanning two programming languages and two processes to contribute. The CCM changes were merged in [PR #485|https://github.com/pcmanus/ccm/pull/485] to the pcmanus/ccm repo. The dtest changes have been PRed in [PR #899|https://github.com/riptano/cassandra-dtest/pull/899] to the riptano/cassandra-dtest repo and will be merged after this ticket is committed to C*. Latest CI runs with the changes made to CCM and dtest look clean relative to upstream. Final branch here: ||branch||testall||dtest|| |[9692-trunk|https://github.com/jkni/cassandra/tree/9692-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/jkni/job/jkni-9692-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/jkni/job/jkni-9692-trunk-dtest]| Note to committer: this should only go to trunk. It is rebased on latest trunk so should merge fine. > Print sensible units for all log messages > - > > Key: CASSANDRA-9692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9692 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Giampaolo >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: > Cassandra9692-Rev1-trunk-giampaolo-trapasso-at-radicalbit-io.diff, > Cassandra9692-Rev2-trunk-giampaolo.trapasso-at-radicalbit-io.diff, > Cassandra9692-trunk-giampaolo-trapasso-at-radicalbit-io.diff, > ccm-bb08b6798f3fda39217f2daf710116a84a3ede84.patch, > dtests-8a1017398ab55a4648fcc307a9be0644c02602dd.patch > > > Like CASSANDRA-9691, this has bugged me too long. it also adversely impacts > log analysis. I've introduced some improvements to the bits I touched for > CASSANDRA-9681, but we should do this across the codebase. 
It's a small > investment for a lot of long term clarity in the logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9692) Print sensible units for all log messages
[ https://issues.apache.org/jira/browse/CASSANDRA-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-9692: - Status: Ready to Commit (was: Patch Available) > Print sensible units for all log messages > - > > Key: CASSANDRA-9692 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9692 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Giampaolo >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: > Cassandra9692-Rev1-trunk-giampaolo-trapasso-at-radicalbit-io.diff, > Cassandra9692-Rev2-trunk-giampaolo.trapasso-at-radicalbit-io.diff, > Cassandra9692-trunk-giampaolo-trapasso-at-radicalbit-io.diff, > ccm-bb08b6798f3fda39217f2daf710116a84a3ede84.patch, > dtests-8a1017398ab55a4648fcc307a9be0644c02602dd.patch > > > Like CASSANDRA-9691, this has bugged me too long. it also adversely impacts > log analysis. I've introduced some improvements to the bits I touched for > CASSANDRA-9681, but we should do this across the codebase. It's a small > investment for a lot of long term clarity in the logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11225) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217213#comment-15217213 ] Stefania commented on CASSANDRA-11225: -- There was a problem with the test patch of yesterday, {{query_value()}} now returns a value, not a boolean. I'm really sorry but we need to repeat the tests with the fixed patch. If we still have failures, either our test is too demanding for counters, in which case we should relax it, perhaps by pausing a few milliseconds, or we may have a bug hiding somewhere. Here's what the test does: it reads at CL.ALL to find the current counter value, it increments the counter at various consistency levels, it checks that we can read back the incremented counter value from at least the number of replicas we wrote to, and then it starts all over again but it contacts another host. I would expect that a read at CL.ALL would trigger a digest mismatch and a subsequent repair, so that all nodes start from the same counter value when we apply the next increment. [~iamaleksey] is this statement correct? > dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters > > > Key: CASSANDRA-11225 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11225 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Russ Hatch > Labels: dtest > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/209/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters > Failed on CassCI build cassandra-2.1_novnode_dtest #209 > error: "AssertionError: Failed to read value from sufficient number of nodes, > required 2 but got 1 - [574, 2]" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11460) memory leak
[ https://issues.apache.org/jira/browse/CASSANDRA-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stone updated CASSANDRA-11460: -- Description: env: cassandra3.3 jdk8 We hit the same problem as: https://issues.apache.org/jira/browse/CASSANDRA-9549 I am confused, because this was fixed in release 3.3 according to this page: https://github.com/apache/cassandra/blob/trunk/CHANGES.txt So I changed to 3.4 and have not seen the problem again so far. I think this fix should be included in 3.3. Can you explain this? was: env:cassandra3.3 met same problem about this: https://issues.apache.org/jira/browse/CASSANDRA-9549 I confuse about that this was fixed in release 3.3 according this page: https://github.com/apache/cassandra/blob/trunk/CHANGES.txt so I change to 3.4,and have not found this problem again at this time. I think this fix should be included in 3.3. can you explain about this? > memory leak > --- > > Key: CASSANDRA-11460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11460 > Project: Cassandra > Issue Type: Bug >Reporter: stone >Priority: Critical > > env: > cassandra3.3 > jdk8 > We hit the same problem as: > https://issues.apache.org/jira/browse/CASSANDRA-9549 > I am confused, because this was fixed in release 3.3 according to this page: > https://github.com/apache/cassandra/blob/trunk/CHANGES.txt > So I changed to 3.4 and have not seen the problem again so far. > I think this fix should be included in 3.3. > Can you explain this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11460) memory leak
stone created CASSANDRA-11460: - Summary: memory leak Key: CASSANDRA-11460 URL: https://issues.apache.org/jira/browse/CASSANDRA-11460 Project: Cassandra Issue Type: Bug Reporter: stone Priority: Critical env: cassandra3.3 We hit the same problem as: https://issues.apache.org/jira/browse/CASSANDRA-9549 I am confused, because this was fixed in release 3.3 according to this page: https://github.com/apache/cassandra/blob/trunk/CHANGES.txt So I changed to 3.4 and have not seen the problem again so far. I think this fix should be included in 3.3. Can you explain this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11389) Case sensitive in LIKE query althogh index created with false
[ https://issues.apache.org/jira/browse/CASSANDRA-11389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217187#comment-15217187 ] Pavel Yaskevich commented on CASSANDRA-11389: - I think what is going on here is that "case_sensetive" is a feature of analyzer, indexes are not analyzed by default that's why index returns no results since that flag is simply ignored. To fix this you should set - either "analyzed": "true" or ‘analyzer_class’: ‘org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer’ in the index options. > Case sensitive in LIKE query althogh index created with false > - > > Key: CASSANDRA-11389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11389 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Alon Levi >Priority: Minor > Labels: sasi > Fix For: 3.x > > > I created an index on user's first name as following: > CREATE CUSTOM INDEX ON users (first_name) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > with options = { > 'mode' : 'CONTAINS', > 'case_sensitive' : 'false' > }; > This is the data I have in my table > user_id | first_name > | last_name > ---+---+--- > daa312ae-ecdf-4eb4-b6e9-206e33e5ca24 | Shlomo | Cohen > ab38ce9d-2823-4e6a-994f-7783953baef1 | Elad | Karakuli > 5e8371a7-3ed9-479f-9e4b-e4a07c750b12 | Alon | Levi > ae85cdc0-5eb7-4f08-8e42-2abd89e327ed | Gil | Elias > Although i mentioned the option 'case_sensitive' : 'false' > when I run this query : > select user_id, first_name from users where first_name LIKE '%shl%'; > The query returns no results. > However, when I run this query : > select user_id, first_name from users where first_name LIKE '%Shl%'; > The query returns the right results, > and the strangest thing is when I run this query: > select user_id, first_name from users where first_name LIKE 'shl%'; > suddenly the query is no more case sensitive and the results are fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
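Pavel's suggested fix, applied to the index definition from the ticket, would look like the sketch below. Per the comment, either {{analyzed}} or {{analyzer_class}} alone should suffice; both are shown here for clarity, and the table/column names follow the ticket:

```sql
-- Recreate the index with the analyzer enabled so 'case_sensitive' is honored.
-- Without an analyzer the index is not analyzed, and the flag is ignored.
CREATE CUSTOM INDEX ON users (first_name)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzed': 'true',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'false'
};
```

With the analyzer in place, a query such as {{LIKE '%shl%'}} should match "Shlomo" regardless of case.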
[jira] [Comment Edited] (CASSANDRA-11389) Case sensitive in LIKE query althogh index created with false
[ https://issues.apache.org/jira/browse/CASSANDRA-11389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217187#comment-15217187 ] Pavel Yaskevich edited comment on CASSANDRA-11389 at 3/30/16 1:14 AM: -- I think what is going on here is that "case_sensitive" is a feature of analyzer, indexes are not analyzed by default that's why index returns no results since that flag is simply ignored. To fix this you should set - either "analyzed": "true" or ‘analyzer_class’: ‘org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer’ in the index options. was (Author: xedin): I think what is going on here is that "case_sensetive" is a feature of analyzer, indexes are not analyzed by default that's why index returns no results since that flag is simply ignored. To fix this you should set - either "analyzed": "true" or ‘analyzer_class’: ‘org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer’ in the index options. > Case sensitive in LIKE query althogh index created with false > - > > Key: CASSANDRA-11389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11389 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Alon Levi >Priority: Minor > Labels: sasi > Fix For: 3.x > > > I created an index on user's first name as following: > CREATE CUSTOM INDEX ON users (first_name) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > with options = { > 'mode' : 'CONTAINS', > 'case_sensitive' : 'false' > }; > This is the data I have in my table > user_id | first_name > | last_name > ---+---+--- > daa312ae-ecdf-4eb4-b6e9-206e33e5ca24 | Shlomo | Cohen > ab38ce9d-2823-4e6a-994f-7783953baef1 | Elad | Karakuli > 5e8371a7-3ed9-479f-9e4b-e4a07c750b12 | Alon | Levi > ae85cdc0-5eb7-4f08-8e42-2abd89e327ed | Gil | Elias > Although i mentioned the option 'case_sensitive' : 'false' > when I run this query : > select user_id, first_name from users where first_name LIKE '%shl%'; > The query returns no results. 
> However, when I run this query : > select user_id, first_name from users where first_name LIKE '%Shl%'; > The query returns the right results, > and the strangest thing is when I run this query: > select user_id, first_name from users where first_name LIKE 'shl%'; > suddenly the query is no more case sensitive and the results are fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11444) Upgrade ohc to 0.4.3
[ https://issues.apache.org/jira/browse/CASSANDRA-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-11444: Status: Ready to Commit (was: Patch Available) > Upgrade ohc to 0.4.3 > > > Key: CASSANDRA-11444 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11444 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Trivial > Fix For: 3.0.x > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11444) Upgrade ohc to 0.4.3
[ https://issues.apache.org/jira/browse/CASSANDRA-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217031#comment-15217031 ] Tyler Hobbs commented on CASSANDRA-11444: - +1 > Upgrade ohc to 0.4.3 > > > Key: CASSANDRA-11444 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11444 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Trivial > Fix For: 3.0.x > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11459) cassandra performance problem when streaming large data
[ https://issues.apache.org/jira/browse/CASSANDRA-11459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-11459: --- Reviewer: (was: Jonathan Ellis) > cassandra performance problem when streaming large data > > > Key: CASSANDRA-11459 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11459 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: ubuntu 14.04, 3 nodes in each datacenter, > 1g networking, each node has 128G ram, 3*300G SSD in RAID5, dual E5-2620v3 > processors >Reporter: Yan Cui > > We found the problem on Cassandra 2.0.15, and have not tested on other > versions. > There is one core table, and the schema is > [user_id int, device_token text, deleted bool, device_info map, > human_code text]. > user_id and device_token are the primary key, and user_id is the partition key. > We have a statement that causes latency spikes (3500 ms to 4000 ms): > select * from table where user_id = . The hotuserid has roughly > 8 rows, averaging about 200 bytes per row. We understand this query may be > slower because it returns more results, but we did not expect it to be this > slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216937#comment-15216937 ] Jack Krupansky commented on CASSANDRA-11383: +1 for using [~jrwest]'s most recent two comments here as the source for the doc changes that I myself was referring to here. > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216917#comment-15216917 ] Jordan West commented on CASSANDRA-11383: - The docs look pretty comprehensive. Thanks! I'll make a more detailed pass through them when I get a chance. I think the only thing we would like to clarify, based on the discussion in this ticket, is when to choose {{SPARSE}} over {{PREFIX}} for numerical data. My last comment (https://issues.apache.org/jira/browse/CASSANDRA-11383?focusedCommentId=15216337&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15216337) mentions a way to do that. Otherwise, specific to {{SPARSE}} the only recommendation I have is that the {{SPARSE}} example on the "CREATE CUSTOM INDEX (SASI)" page (https://docs.datastax.com/en/cql/3.3/cql/cql_reference/refCreateSASIIndex.html) uses {{age}}, which typically would not be a good candidate for a {{SPARSE}} index (the answer to question number 2 in my linked comment would be: no, there are not millions of ages with each term having a small number of matching keys). 
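Jordan's rule of thumb can be illustrated in CQL. The table and column names below are invented for illustration: {{SPARSE}} fits a column with very many distinct values where each term matches only a few rows (e.g. an event timestamp), while a column like {{age}} has few distinct terms with many matching rows each and is better left in the default {{PREFIX}} mode:

```sql
-- Good SPARSE candidate: nearly unique values, few rows per term.
CREATE CUSTOM INDEX ON events (created_at)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = { 'mode': 'SPARSE' };

-- Poor SPARSE candidate: 'age' has few distinct terms, each matching
-- many rows; the default PREFIX mode is the better choice here.
CREATE CUSTOM INDEX ON users (age)
USING 'org.apache.cassandra.index.sasi.SASIIndex';
```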
> Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11459) cassandra performance problem when streaming large data
[ https://issues.apache.org/jira/browse/CASSANDRA-11459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Cui updated CASSANDRA-11459: Reviewer: Jonathan Ellis > cassandra performance problem when streaming large data > > > Key: CASSANDRA-11459 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11459 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: ubuntu 14.04, 3 nodes in each datacenter, > 1g networking, each node has 128G ram, 3*300G SSD in RAID5, dual E5-2620v3 > processors >Reporter: Yan Cui > > We found the problem on Cassandra 2.0.15, and have not tested on other > versions. > there is one core table, and the schema is > [user_id int, device_token text, deleted bool, device_info map, > human_code text] > user_id and device token is the primary key, and user_id is the partition key, > we have the statement that caused latency spike (3500ms to 4000 ms). > select * from table where user_id = . the hotuserid has roughly > 8 rows. On average, there is 200 bytes for each row. We feel this should > be slow because of more results out there, but it is not expected to be that > slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10587) sstablemetadata NPE on cassandra 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-10587: --- Status: Patch Available (was: In Progress) Sorry this took time. I ended up fixing the {{Descriptor}} object to always hold the canonical path as its {{directory}}. This way we don't need to consider whether the given {{directory}} is relative or absolute. In fact, right now a {{Descriptor}} (and the corresponding {{SSTable}}) is not considered equal when one {{Descriptor}}'s {{directory}} is relative and the other's is absolute. (Added a simple unit test to {{DescriptorTest}}.) ||branch||testall||dtest|| |[10587-2.2|https://github.com/yukim/cassandra/tree/10587-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-2.2-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-2.2-dtest/lastCompletedBuild/testReport/]| |[10587-3.0|https://github.com/yukim/cassandra/tree/10587-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-3.0-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-3.0-dtest/lastCompletedBuild/testReport/]| |[10587-3.5|https://github.com/yukim/cassandra/tree/10587-3.5]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-3.5-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-3.5-dtest/lastCompletedBuild/testReport/]| |[10587-trunk|https://github.com/yukim/cassandra/tree/10587-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-trunk-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10587-trunk-dtest/lastCompletedBuild/testReport/]| For 3.0 and up, I needed to fix some unit tests to work. 
> sstablemetadata NPE on cassandra 2.2 > > > Key: CASSANDRA-10587 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10587 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Tiago Batista >Assignee: Yuki Morishita >Priority: Minor > Fix For: 2.2.x, 3.x > > > I have recently upgraded my cassandra cluster to 2.2, currently running > 2.2.3. After running the first repair, cassandra renames the sstables to the > new naming schema that does not contain the keyspace name. > This causes sstablemetadata to fail with the following stack trace: > {noformat} > Exception in thread "main" java.lang.NullPointerException > at > org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:275) > at > org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:172) > at > org.apache.cassandra.tools.SSTableMetadataViewer.main(SSTableMetadataViewer.java:52) > {noformat}
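The canonical-path fix described in the comment above can be sketched in miniature. This is a toy Python stand-in, not the Cassandra Java code; it only shows why canonicalizing once at construction makes relative and absolute spellings of the same directory compare equal:

```python
import os

class Descriptor:
    """Toy analogue of Cassandra's Descriptor: canonicalize the
    directory up front so path spelling never affects equality."""
    def __init__(self, directory: str, generation: int):
        self.directory = os.path.realpath(directory)  # resolves '.', '..', symlinks
        self.generation = generation

    def __eq__(self, other):
        return (self.directory, self.generation) == (other.directory, other.generation)

    def __hash__(self):
        return hash((self.directory, self.generation))

# A relative and an absolute path to the same directory now yield equal objects.
a = Descriptor(".", 1)
b = Descriptor(os.getcwd(), 1)
print(a == b)  # True
```

Without the `realpath` call, the two objects would compare unequal, which mirrors the inconsistency the patch removes.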
[jira] [Created] (CASSANDRA-11459) cassandra performance problem when streaming large data
Yan Cui created CASSANDRA-11459: --- Summary: cassandra performance problem when streaming large data Key: CASSANDRA-11459 URL: https://issues.apache.org/jira/browse/CASSANDRA-11459 Project: Cassandra Issue Type: Bug Components: Core Environment: ubuntu 14.04, 3 nodes in each datacenter, 1g networking, each node has 128G ram, 3*300G SSD in RAID5, dual E5-2620v3 processors Reporter: Yan Cui We found the problem on Cassandra 2.0.15, and have not tested other versions. There is one core table, with the schema [user_id int, device_token text, deleted bool, device_info map, human_code text]. user_id and device_token together form the primary key, with user_id as the partition key. The following statement causes a latency spike (3500 ms to 4000 ms): select * from table where user_id = . The hot user_id has roughly 8 rows, averaging about 200 bytes per row. We understand this query should be slower because it returns more results, but we did not expect it to be that slow.
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216827#comment-15216827 ] Lorina Poland commented on CASSANDRA-11383: --- Some summary of how the docs should be changed would be appreciated. The commentary is rather long and involved. I can sort it out, but it will take me quite a bit of time to do so. I'm creating a ticket to make changes. > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11458) Complete support for CustomExpression
[ https://issues.apache.org/jira/browse/CASSANDRA-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216819#comment-15216819 ] Henry Manasseh edited comment on CASSANDRA-11458 at 3/29/16 8:57 PM: - I can provide a patch for this. I have done the changes locally. Who needs to approve this change? I'll write a test if approved. was (Author: henryman): I can provide a patch for this. I have done the changes locally. Who needs to approve this change? > Complete support for CustomExpression > - > > Key: CASSANDRA-11458 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11458 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Henry Manasseh >Priority: Minor > Attachments: Custom-expression-Change.png, addCustomIndexExpression > change.png > > > This is a proposal to complete the CustomExpression support first introduced > as part of https://issues.apache.org/jira/browse/CASSANDRA-10217. > The current support for custom expressions is partial. There is no clean way > to implement queries making use of the "exp('index', 'value)" syntax due to > the fact CustomExpression is declared as final and there is no way to for > developers to cleanly plug-in their own expressions. > https://github.com/apache/cassandra/blob/6e69c75900f3640195130085ad69daa1659184eb/src/java/org/apache/cassandra/db/filter/RowFilter.java#L869 > The proposal is to make CustomExpression not final so that developers can > extend and create their own subclass and provide their own isSatisfiedBy > operation (which currently always returns true). > Introducing a new custom expression would be done as follows: > 1. Developer would create a subclass of CustomExpression and override > isSatisfiedBy method with their logic (public boolean > isSatisfiedBy(CFMetaData metadata, DecoratedKey partitionKey, Row row)) > 2. 
This class would be packaged in a jar and copied to the cassandra lib > directory along with a secondary index class which overrides > Index.customExpressionValueType > 2. Create the custom index with an option which identifies the > CustomExpression subclass (custom_expression_class). > CREATE CUSTOM INDEX ON keyspace.my_table(my_indexed_column) USING > 'org.custom.MyCustomIndex' > WITH OPTIONS = { 'custom_expression_class': 'org.custom.MyCustomExpression' }; > I have prototyped the change and works as designed. In my case I do a type of > "IN" query filter which will simplify my client logic significantly. > The default behavior of using the CustomExpression class would be maintained > if the developer does not provide a custom class in the create index options.
[jira] [Updated] (CASSANDRA-11458) Complete support for CustomExpression
[ https://issues.apache.org/jira/browse/CASSANDRA-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Manasseh updated CASSANDRA-11458: --- Attachment: addCustomIndexExpression change.png > Complete support for CustomExpression > - > > Key: CASSANDRA-11458 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11458 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Henry Manasseh >Priority: Minor > Attachments: Custom-expression-Change.png, addCustomIndexExpression > change.png > > > This is a proposal to complete the CustomExpression support first introduced > as part of https://issues.apache.org/jira/browse/CASSANDRA-10217. > The current support for custom expressions is partial. There is no clean way > to implement queries making use of the "exp('index', 'value)" syntax due to > the fact CustomExpression is declared as final and there is no way to for > developers to cleanly plug-in their own expressions. > https://github.com/apache/cassandra/blob/6e69c75900f3640195130085ad69daa1659184eb/src/java/org/apache/cassandra/db/filter/RowFilter.java#L869 > The proposal is to make CustomExpression not final so that developers can > extend and create their own subclass and provide their own isSatisfiedBy > operation (which currently always returns true). > Introducing a new custom expression would be done as follows: > 1. Developer would create a subclass of CustomExpression and override > isSatisfiedBy method with their logic (public boolean > isSatisfiedBy(CFMetaData metadata, DecoratedKey partitionKey, Row row)) > 2. This class would be packaged in a jar and copied to the cassandra lib > directory along with a secondary index class which overrides > Index.customExpressionValueType > 2. Create the custom index with an option which identifies the > CustomExpression subclass (custom_expression_class). 
> CREATE CUSTOM INDEX ON keyspace.my_table(my_indexed_column) USING > 'org.custom.MyCustomIndex' > WITH OPTIONS = { 'custom_expression_class': 'org.custom.MyCustomExpression' }; > I have prototyped the change and works as designed. In my case I do a type of > "IN" query filter which will simplify my client logic significantly. > The default behavior of using the CustomExpression class would be maintained > if the developer does not provide a custom class in the create index options.
[jira] [Commented] (CASSANDRA-11458) Complete support for CustomExpression
[ https://issues.apache.org/jira/browse/CASSANDRA-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216819#comment-15216819 ] Henry Manasseh commented on CASSANDRA-11458: I can provide a patch for this. I have done the changes locally. Who needs to approve this change? > Complete support for CustomExpression > - > > Key: CASSANDRA-11458 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11458 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Henry Manasseh >Priority: Minor > Attachments: Custom-expression-Change.png > > > This is a proposal to complete the CustomExpression support first introduced > as part of https://issues.apache.org/jira/browse/CASSANDRA-10217. > The current support for custom expressions is partial. There is no clean way > to implement queries making use of the "exp('index', 'value)" syntax due to > the fact CustomExpression is declared as final and there is no way to for > developers to cleanly plug-in their own expressions. > https://github.com/apache/cassandra/blob/6e69c75900f3640195130085ad69daa1659184eb/src/java/org/apache/cassandra/db/filter/RowFilter.java#L869 > The proposal is to make CustomExpression not final so that developers can > extend and create their own subclass and provide their own isSatisfiedBy > operation (which currently always returns true). > Introducing a new custom expression would be done as follows: > 1. Developer would create a subclass of CustomExpression and override > isSatisfiedBy method with their logic (public boolean > isSatisfiedBy(CFMetaData metadata, DecoratedKey partitionKey, Row row)) > 2. This class would be packaged in a jar and copied to the cassandra lib > directory along with a secondary index class which overrides > Index.customExpressionValueType > 2. Create the custom index with an option which identifies the > CustomExpression subclass (custom_expression_class). 
> CREATE CUSTOM INDEX ON keyspace.my_table(my_indexed_column) USING > 'org.custom.MyCustomIndex' > WITH OPTIONS = { 'custom_expression_class': 'org.custom.MyCustomExpression' }; > I have prototyped the change and works as designed. In my case I do a type of > "IN" query filter which will simplify my client logic significantly. > The default behavior of using the CustomExpression class would be maintained > if the developer does not provide a custom class in the create index options.
[jira] [Updated] (CASSANDRA-11458) Complete support for CustomExpression
[ https://issues.apache.org/jira/browse/CASSANDRA-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Manasseh updated CASSANDRA-11458: --- Attachment: Custom-expression-Change.png > Complete support for CustomExpression > - > > Key: CASSANDRA-11458 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11458 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Henry Manasseh >Priority: Minor > Attachments: Custom-expression-Change.png > > > This is a proposal to complete the CustomExpression support first introduced > as part of https://issues.apache.org/jira/browse/CASSANDRA-10217. > The current support for custom expressions is partial. There is no clean way > to implement queries making use of the "exp('index', 'value)" syntax due to > the fact CustomExpression is declared as final and there is no way to for > developers to cleanly plug-in their own expressions. > https://github.com/apache/cassandra/blob/6e69c75900f3640195130085ad69daa1659184eb/src/java/org/apache/cassandra/db/filter/RowFilter.java#L869 > The proposal is to make CustomExpression not final so that developers can > extend and create their own subclass and provide their own isSatisfiedBy > operation (which currently always returns true). > Introducing a new custom expression would be done as follows: > 1. Developer would create a subclass of CustomExpression and override > isSatisfiedBy method with their logic (public boolean > isSatisfiedBy(CFMetaData metadata, DecoratedKey partitionKey, Row row)) > 2. This class would be packaged in a jar and copied to the cassandra lib > directory along with a secondary index class which overrides > Index.customExpressionValueType > 2. Create the custom index with an option which identifies the > CustomExpression subclass (custom_expression_class). 
> CREATE CUSTOM INDEX ON keyspace.my_table(my_indexed_column) USING > 'org.custom.MyCustomIndex' > WITH OPTIONS = { 'custom_expression_class': 'org.custom.MyCustomExpression' }; > I have prototyped the change and works as designed. In my case I do a type of > "IN" query filter which will simplify my client logic significantly. > The default behavior of using the CustomExpression class would be maintained > if the developer does not provide a custom class in the create index options.
[jira] [Created] (CASSANDRA-11458) Complete support for CustomExpression
Henry Manasseh created CASSANDRA-11458: -- Summary: Complete support for CustomExpression Key: CASSANDRA-11458 URL: https://issues.apache.org/jira/browse/CASSANDRA-11458 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Henry Manasseh Priority: Minor This is a proposal to complete the CustomExpression support first introduced as part of https://issues.apache.org/jira/browse/CASSANDRA-10217. The current support for custom expressions is partial. There is no clean way to implement queries using the "expr('index', 'value')" syntax, because CustomExpression is declared final and there is no way for developers to cleanly plug in their own expressions. https://github.com/apache/cassandra/blob/6e69c75900f3640195130085ad69daa1659184eb/src/java/org/apache/cassandra/db/filter/RowFilter.java#L869 The proposal is to make CustomExpression non-final so that developers can extend it with their own subclass and provide their own isSatisfiedBy operation (which currently always returns true). Introducing a new custom expression would work as follows: 1. The developer creates a subclass of CustomExpression and overrides the isSatisfiedBy method with their logic (public boolean isSatisfiedBy(CFMetaData metadata, DecoratedKey partitionKey, Row row)). 2. This class is packaged in a jar and copied to the cassandra lib directory along with a secondary index class which overrides Index.customExpressionValueType. 3. The custom index is created with an option which identifies the CustomExpression subclass (custom_expression_class): CREATE CUSTOM INDEX ON keyspace.my_table(my_indexed_column) USING 'org.custom.MyCustomIndex' WITH OPTIONS = { 'custom_expression_class': 'org.custom.MyCustomExpression' }; I have prototyped the change and it works as designed. In my case I implement a type of "IN" query filter, which will simplify my client logic significantly.
The default behavior of the CustomExpression class would be maintained if the developer does not provide a custom class in the create index options.
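The subclass-and-override mechanism proposed above is essentially a template-method hook. Here is a minimal Python sketch of the pattern; the {{InExpression}} subclass, its comma-separated value format, and the column name are hypothetical illustrations, not part of the proposal:

```python
class CustomExpression:
    """Base expression; the default implementation accepts every row,
    mirroring the always-true isSatisfiedBy described in the proposal."""
    def __init__(self, value: str):
        self.value = value

    def is_satisfied_by(self, row: dict) -> bool:
        return True

class InExpression(CustomExpression):
    """Hypothetical subclass implementing an IN-style filter over a
    comma-separated list of allowed values (invented for this sketch)."""
    def is_satisfied_by(self, row: dict) -> bool:
        return str(row.get("my_indexed_column")) in self.value.split(",")

expr = InExpression("a,b,c")
print(expr.is_satisfied_by({"my_indexed_column": "b"}))  # True
print(expr.is_satisfied_by({"my_indexed_column": "z"}))  # False
```

The base class keeps today's accept-everything behavior, so existing users are unaffected unless they register a subclass.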
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216763#comment-15216763 ] Jack Krupansky commented on CASSANDRA-11383: Thanks, [~jrwest]. I think that I finally don't have any additional questions! BTW, the DataStax Distribution of Cassandra (DDC) for 3.4 is out now, so the DataStax Cassandra doc has been updated for 3.4, including SASI: https://docs.datastax.com/en/cql/3.3/cql/cql_using/useSASIIndexConcept.html https://docs.datastax.com/en/cql/3.3/cql/cql_using/useSASIIndex.html https://docs.datastax.com/en/cql/3.3/cql/cql_reference/refCreateSASIIndex.html That happened four days ago, so maybe some of our recent discussion since then should get cycled into the doc. For example, your comments about range queries on SPARSE data. I'll ping docs to alert them of the discussion here, but you guys are free to highlight whatever info you think users should know about. > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a 
while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin]
[jira] [Resolved] (CASSANDRA-11457) Upgrade OHC to 0.4.3
[ https://issues.apache.org/jira/browse/CASSANDRA-11457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs resolved CASSANDRA-11457. - Resolution: Duplicate > Upgrade OHC to 0.4.3 > > > Key: CASSANDRA-11457 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11457 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Tyler Hobbs >Assignee: Tyler Hobbs >Priority: Minor > Fix For: 3.6 > >
[jira] [Created] (CASSANDRA-11457) Upgrade OHC to 0.4.3
Tyler Hobbs created CASSANDRA-11457: --- Summary: Upgrade OHC to 0.4.3 Key: CASSANDRA-11457 URL: https://issues.apache.org/jira/browse/CASSANDRA-11457 Project: Cassandra Issue Type: Improvement Components: Local Write-Read Paths Reporter: Tyler Hobbs Assignee: Tyler Hobbs Priority: Minor Fix For: 3.6
[jira] [Commented] (CASSANDRA-11430) forceRepairRangeAsync hangs sometimes
[ https://issues.apache.org/jira/browse/CASSANDRA-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216715#comment-15216715 ] Nick Bailey commented on CASSANDRA-11430: - This is happening because the progress notification mechanism has completely changed. The old method signatures were left in place but this is fairly pointless since old clients won't be able to understand the new progress/status reporting mechanism. > forceRepairRangeAsync hangs sometimes > - > > Key: CASSANDRA-11430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11430 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > Fix For: 3.x > > > forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available > for older clients though. Unfortunately it sometimes hangs when you call it. > It looks like it completes fine but the notification to the client that the > operation is done is never sent. This is easiest to see by using nodetool > from 2.1 against a 3.x cluster. > {noformat} > [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 OpsCenter > [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' > [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 system_distributed > ... > ... > {noformat} > (I added the ellipses) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11430) forceRepairRangeAsync hangs sometimes
[ https://issues.apache.org/jira/browse/CASSANDRA-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216715#comment-15216715 ] Nick Bailey edited comment on CASSANDRA-11430 at 3/29/16 7:38 PM: -- This is happening because the progress notification mechanism has completely changed. The old method signatures were left in place but this is fairly pointless since old clients won't be able to understand the new progress/status reporting mechanism. Anyone have any idea on how much work it would be to pull back in the old progress reporting mechanism for the old method signatures? I'm guessing quite a bit. was (Author: nickmbailey): This is happening because the progress notification mechanism has completely changed. The old method signatures were left in place but this is fairly pointless since old clients won't be able to understand the new progress/status reporting mechanism. > forceRepairRangeAsync hangs sometimes > - > > Key: CASSANDRA-11430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11430 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > Fix For: 3.x > > > forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available > for older clients though. Unfortunately it sometimes hangs when you call it. > It looks like it completes fine but the notification to the client that the > operation is done is never sent. This is easiest to see by using nodetool > from 2.1 against a 3.x cluster. > {noformat} > [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 OpsCenter > [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' > [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 system_distributed > ... > ... > {noformat} > (I added the ellipses) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
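The hang described above can be illustrated abstractly: the server now emits a new style of progress events, and a legacy client that waits for the old notification type never sees a match, so the call appears to run forever. A hedged Python sketch with invented event shapes (these do not mirror Cassandra's actual JMX notifications):

```python
def legacy_client_wait(notifications):
    """Simulate an old client: it only completes when it sees the
    legacy 'repair' completion event it was written against."""
    for n in notifications:
        if n["type"] == "repair" and n["status"] == "finished":
            return "done"
    return "hung"  # stream ended without the event the client expects

# Old-style server: emits the notification the 2.1 client understands.
old_server_events = [{"type": "repair", "status": "finished"}]
# New-style server: repair completed, but only new progress events were sent.
new_server_events = [{"type": "progress", "status": "complete"}]
print(legacy_client_wait(old_server_events))  # done
print(legacy_client_wait(new_server_events))  # hung
```

This is why keeping the old method signatures without the old notification mechanism is, as the comment puts it, fairly pointless.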
[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216704#comment-15216704 ] Jeff Jirsa commented on CASSANDRA-9666: --- [~Bj0rn] and if you'll indulge me, what sort of TTLs, what base_time_seconds, what max_window_size, and how did you arrive at those values? > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. > Specifically, rather than creating a number of windows of ever increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressive compact down > to a single sstable once that window is no longer current. 
The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). > The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using an unusual > selection criteria, which prefers files with earlier timestamps, but ignores > sizes. In TimeWindowCompactionStrategy, the first window data will be > compacted with the well tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction) > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction) > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. The window is set in common, easily > understandable terms such as “12 hours”, “1 Day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and users keeping data for years. 
> - There is no explicitly configurable max sstable age, though sstables will > naturally stop compacting once new data is written in that window. > - Streaming operations can create sstables with old timestamps, and they'll > naturally be joined together with sstables in the same time bucket. This is > true for bootstrap/repair/sstableloader/removenode. > - It remains true that if old data and new data is written into the memtable > at the same time, the resulting sstables will be treated as if they were new > sstables, however, that no longer negatively impacts the compaction > strategy’s selection criteria for older windows. > Patch provided for : > - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 > - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 > - trunk (post-8099): https://github.com/jeffjirsa/cassandra/commits/twcs > Rebased, force-pushed July 18, with bug fixes for estimated pending > compactions and potential starvation if more than min_threshold tables > existed in current window but STCS did not consider them viable candidates > Rebased, force-pushed Aug 20 to bring in relevant logic from CASSANDRA-9882 -- This message was sent by Atlassian JIRA (v6.3.4#6332
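The window arithmetic in the description above (window = unit × size; unit hours with size 6 giving roughly 4 sstables per day) can be sketched as follows. This assumes epoch-aligned buckets, which approximates but may not exactly match the patch's grouping:

```python
from datetime import datetime, timezone

UNIT_SECONDS = {"minutes": 60, "hours": 3600, "days": 86400}

def window_bucket(ts: datetime, unit: str = "hours", size: int = 6) -> int:
    """Map a timestamp to its compaction window (epoch-aligned sketch)."""
    width = UNIT_SECONDS[unit] * size
    return int(ts.timestamp()) // width

day = [datetime(2016, 3, 29, h, 0, tzinfo=timezone.utc) for h in range(24)]
# 24 hourly timestamps collapse into 4 distinct 6-hour windows.
print(len({window_bucket(t) for t in day}))  # 4
# 05:00 and 06:00 fall on opposite sides of a window boundary.
print(window_bucket(day[5]) == window_bucket(day[6]))  # False
```

All sstables whose maxTimestamp maps to the same bucket are candidates to be compacted together once that window is no longer current.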
[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216688#comment-15216688 ] Jonathan Shook commented on CASSANDRA-9666: --- There are two areas of concern that we should discuss more directly: 1. The pacing of memtable flushing on a given system can be matched up with the base window size with DTCS, avoiding logical write amplification that can occur before the scheduling discipline kicks in. This is not so easy when you water down the configuration and remove the ability to manage the fresh sstables. The benefits from time-series friendly compaction can be had for both the newest and the oldest tables, and both are relevant here. 2. The window placement. From what I've seen, the anchoring point for whether a cell goes into a bucket or not is different between the two approaches. To me this is fairly arbitrary in terms of processing overhead comparisons, all else assumed close enough. However, when trying to reconcile, shifting all of your data to a different bucket will not be a welcome event for most users. This makes "graceful" reconciliation difficult at best. Can we simply try to make DTCS as (perceptually) easy to use for the default case as TWCS? To me, this is more about the user entry point and understanding behavior as designed than it is about the machinery that makes it happen. The basic design between them has so much in common that reconciling them completely would be mostly a shell game of parameter names, as well as lopping off some functionality that can be completely bypassed, given the right settings. Can we identify the functionally equivalent settings for TWCS that DTCS needs to emulate, given proper settings (possibly including anchoring point), and then simply provide the same simple configuration to users, without having to maintain two separate sibling compaction strategies? 
One sticking point that I've had with this suggestion in conversation is that the bucketing logic is too difficult to think about. If we were able to provide the self-same behavior for TWCS-like configuration, the bucketing logic could be used only when the parameters require non-uniform windows. Would that make everyone happy? 
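The anchoring-point concern in point 2 above can be made concrete with a small sketch. This is illustrative only, not either strategy's actual code, and the helper name is hypothetical: flooring a timestamp to a window boundary measured from two different origins puts the same cell into different buckets whenever the origins differ by a non-multiple of the window.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch: how the choice of anchoring point changes bucket
// assignment. Not DTCS or TWCS code; bucketStart is a hypothetical helper.
// Assumes non-negative timestamps (integer division floors toward zero).
public class BucketAnchoring {
    // Floor the timestamp to a window boundary measured from a given origin.
    static long bucketStart(long tsMillis, long windowMillis, long originMillis) {
        return originMillis + ((tsMillis - originMillis) / windowMillis) * windowMillis;
    }

    public static void main(String[] args) {
        long window = TimeUnit.HOURS.toMillis(6);
        long ts = TimeUnit.HOURS.toMillis(100);            // some cell timestamp
        long epochAnchored = bucketStart(ts, window, 0L);  // epoch-aligned origin
        long shifted = bucketStart(ts, window, TimeUnit.HOURS.toMillis(2)); // shifted origin
        // Same cell, different bucket boundary: switching anchoring
        // conventions shifts existing data into different buckets.
        System.out.println(epochAnchored + " vs " + shifted); // 96h vs 98h, in millis
    }
}
```

The two calls land the same timestamp in buckets starting at hour 96 and hour 98 respectively, which is exactly the "shifting all of your data to a different bucket" problem when trying to reconcile the strategies.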
[jira] [Commented] (CASSANDRA-11225) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216667#comment-15216667 ] Russ Hatch commented on CASSANDRA-11225: Looks like there was still 1 failure out of 300 runs, but maybe that's an acceptable noise level. Guess we just want to be sure there's not a legitimate bug hiding here. http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/50/testReport/node_0.consistency_test/TestAccuracy/test_simple_strategy_counters/ > dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters > > > Key: CASSANDRA-11225 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11225 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Russ Hatch > Labels: dtest > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/209/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters > Failed on CassCI build cassandra-2.1_novnode_dtest #209 > error: "AssertionError: Failed to read value from sufficient number of nodes, > required 2 but got 1 - [574, 2]" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11451) Don't mark sstables as repairing when doing sub range repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216650#comment-15216650 ] Marcus Eriksson commented on CASSANDRA-11451: - and a dtest: https://github.com/krummas/cassandra-dtest/commits/marcuse/11451 (tests above were run against this branch) > Don't mark sstables as repairing when doing sub range repair > > > Key: CASSANDRA-11451 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11451 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 2.2.x, 3.0.x, 3.x > > > Since CASSANDRA-10422 we don't do anticompaction when a user issues a sub > range repair (-st X -et Y), but we still mark sstables as repairing. > We should avoid marking them as users might want to run many sub range repair > sessions in parallel. > The reason we mark sstables is that we don't want another repair session to > steal the sstables before we do anticompaction, and since we do no > anticompaction with sub range repair we have no benefit from the marking. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11451) Don't mark sstables as repairing when doing sub range repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-11451: Assignee: Marcus Eriksson Reviewer: Yuki Morishita Status: Patch Available (was: Open) patch against 2.2 here: https://github.com/krummas/cassandra/commits/marcuse/11451 tests: http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11451-dtest/ http://cassci.datastax.com/view/Dev/view/krummas/job/krummas-marcuse-11451-testall/ > Don't mark sstables as repairing when doing sub range repair > > > Key: CASSANDRA-11451 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11451 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 2.2.x, 3.0.x, 3.x > > > Since CASSANDRA-10422 we don't do anticompaction when a user issues a sub > range repair (-st X -et Y), but we still mark sstables as repairing. > We should avoid marking them as users might want to run many sub range repair > sessions in parallel. > The reason we mark sstables is that we don't want another repair session to > steal the sstables before we do anticompaction, and since we do no > anticompaction with sub range repair we have no benefit from the marking. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11455) Re-executing incremental repair does not restore data on wiped node
[ https://issues.apache.org/jira/browse/CASSANDRA-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216645#comment-15216645 ] Paulo Motta commented on CASSANDRA-11455: - That's right, thanks all for the clarification! A general assumption when you run repair (with default options) is that data will be consistent after you execute it. In the case of accidental data deletion, or partial disk failure, a user may wrongly think their data is safe after running incremental repair, since the repair executed without errors. Maybe we should detect a situation where the unrepaired set is inconsistent between replicas, log a warning, and fail the incremental repair or fall back to full repair? Perhaps we could maintain on each node the last repairedAt per range and compare that during the prepare phase to make sure all replicas are on the same page before starting? We would probably need to reload that info on startup, since that's when data can go missing. And we would also need to deal with fully expired sstables somehow, so I'm not sure if this would bring more complication than benefit. 
> Re-executing incremental repair does not restore data on wiped node > --- > > Key: CASSANDRA-11455 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11455 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Paulo Motta > > Reproduction steps: > {noformat} > ccm create test -n 3 -s > ccm node1 stress "write n=100K cl=QUORUM -rate threads=300 -schema > replication(factor=3) > compaction(strategy=org.apache.cassandra.db.compaction.LeveledCompactionStrategy,sstable_size_in_mb=1)" > ccm flush > ccm node1 nodetool repair keyspace1 standard1 > ccm flush > ccm node2 stop > rm -rf ~/.ccm/test/node2/commitlogs/* > rm -rf ~/.ccm/test/node2/data0/keyspace1/* > ccm node2 start > ccm node1 nodetool repair keyspace1 standard1 > ccm node1 stress "read n=100k cl=ONE -rate threads=3" > {noformat} > This is log on node1 (repair coordinator): > {noformat} > INFO [Thread-8] 2016-03-29 13:01:16,990 RepairRunnable.java:125 - Starting > repair command #2, repairing keyspace keyspace1 with repair options > (parallelism: parallel, primary range: false, incremental: true, job threads: > 1, ColumnFamilies: [standard1], dataCenters: [], hosts: [], # of ranges: 3) > INFO [Thread-8] 2016-03-29 13:01:17,021 RepairSession.java:237 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] new session: will sync /127.0.0.1, > /127.0.0.2, /127.0.0.3 on range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] for keyspace1.[standard1] > INFO [Repair#2:1] 2016-03-29 13:01:17,044 RepairJob.java:100 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] requesting merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > INFO [Repair#2:1] 2016-03-29 13:01:17,045 RepairJob.java:174 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Requesting merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > DEBUG [AntiEntropyStage:1] 2016-03-29 13:01:17,054 > 
RepairMessageVerbHandler.java:118 - Validating > ValidationRequest{gcBefore=1458403277} > org.apache.cassandra.repair.messages.ValidationRequest@56ed77cd > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,062 StorageService.java:3100 > - Forcing flush on keyspace keyspace1, CF standard1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,066 > CompactionManager.java:1290 - Created 3 merkle trees with merkle trees size > 3, 0 partitions, 277 bytes > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:123 - > Prepared AEService trees of size 3 for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:233 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partitions > per leaf are: > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:235 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partition > sizes are: > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,070 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0.1 > DEBUG [ValidationExecutor:3] 201
[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216637#comment-15216637 ] Marcus Eriksson commented on CASSANDRA-9666: [~Bj0rn] are you running DTCS in prod? How much data per node if so? > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. > Specifically, rather than creating a number of windows of ever increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressive compact down > to a single sstable once that window is no longer current. The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). 
> The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using an unusual > selection criteria, which prefers files with earlier timestamps, but ignores > sizes. In TimeWindowCompactionStrategy, the first window data will be > compacted with the well tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction) > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction) > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. The window is set in common, easily > understandable terms such as “12 hours”, “1 Day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and users keeping data for years. > - There is no explicitly configurable max sstable age, though sstables will > naturally stop compacting once new data is written in that window. > - Streaming operations can create sstables with old timestamps, and they'll > naturally be joined together with sstables in the same time bucket. This is > true for bootstrap/repair/sstableloader/removenode. 
> - It remains true that if old data and new data is written into the memtable > at the same time, the resulting sstables will be treated as if they were new > sstables, however, that no longer negatively impacts the compaction > strategy’s selection criteria for older windows. > Patch provided for : > - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 > - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 > - trunk (post-8099): https://github.com/jeffjirsa/cassandra/commits/twcs > Rebased, force-pushed July 18, with bug fixes for estimated pending > compactions and potential starvation if more than min_threshold tables > existed in current window but STCS did not consider them viable candidates > Rebased, force-pushed Aug 20 to bring in relevant logic from CASSANDRA-9882 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
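The windowing arithmetic described in the quoted proposal (unit of hours, size of 6 giving roughly 4 sstables per day) can be sketched as below. This is an illustrative reconstruction under assumed names, not the actual TWCS implementation: a window is unit times size, and a bucket is a timestamp floored to the window boundary.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of TWCS-style windowing (hypothetical names, not the
// actual TimeWindowCompactionStrategy code): sstables whose max timestamps
// floor to the same epoch-aligned window boundary compact together.
public class TimeWindowSketch {
    // Floor a timestamp to the start of its (unit x size) window.
    static long windowLowerBound(long tsMillis, TimeUnit unit, int size) {
        long windowMillis = unit.toMillis(size);
        return (tsMillis / windowMillis) * windowMillis;
    }

    public static void main(String[] args) {
        // unit = HOURS, size = 6: a day of hourly flushes lands in 4 buckets,
        // matching the "roughly 4 sstables per day" in the description.
        Set<Long> buckets = new HashSet<>();
        for (int h = 0; h < 24; h++)
            buckets.add(windowLowerBound(TimeUnit.HOURS.toMillis(h), TimeUnit.HOURS, 6));
        System.out.println(buckets.size()); // 4
    }
}
```

Because the bound depends only on the timestamp and the fixed window, old data arriving via hints, read repair, or streaming simply falls into its original bucket, which is the operational property the proposal emphasizes.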
[jira] [Commented] (CASSANDRA-11067) Improve SASI syntax
[ https://issues.apache.org/jira/browse/CASSANDRA-11067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216616#comment-15216616 ] Pavel Yaskevich commented on CASSANDRA-11067: - Definitely sounds like a bug, I've created CASSANDRA-11456 to track that. > Improve SASI syntax > --- > > Key: CASSANDRA-11067 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11067 > Project: Cassandra > Issue Type: Task > Components: CQL >Reporter: Jonathan Ellis >Assignee: Pavel Yaskevich > Labels: client-impacting > Fix For: 3.4 > > > I think everyone agrees that a LIKE operator would be ideal, but that's > probably not in scope for an initial 3.4 release. > Still, I'm uncomfortable with the initial approach of overloading = to mean > "satisfies index expression." The problem is that it will be very difficult > to back out of this behavior once people are using it. > I propose adding a new operator in the interim instead. Call it MATCHES, > maybe. With the exact same behavior that SASI currently exposes, just with a > separate operator rather than being rolled into =. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11430) forceRepairRangeAsync hangs sometimes
[ https://issues.apache.org/jira/browse/CASSANDRA-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216569#comment-15216569 ] Nick Bailey commented on CASSANDRA-11430: - I was able to reproduce this with non system_distributed keyspaces. > forceRepairRangeAsync hangs sometimes > - > > Key: CASSANDRA-11430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11430 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > Fix For: 3.x > > > forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available > for older clients though. Unfortunately it sometimes hangs when you call it. > It looks like it completes fine but the notification to the client that the > operation is done is never sent. This is easiest to see by using nodetool > from 2.1 against a 3.x cluster. > {noformat} > [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 OpsCenter > [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' > [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 system_distributed > ... > ... > {noformat} > (I added the ellipses) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11430) forceRepairRangeAsync hangs sometimes
[ https://issues.apache.org/jira/browse/CASSANDRA-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Bailey updated CASSANDRA-11430: Description: forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available for older clients though. Unfortunately it sometimes hangs when you call it. It looks like it completes fine but the notification to the client that the operation is done is never sent. This is easiest to see by using nodetool from 2.1 against a 3.x cluster. {noformat} [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair -st 0 -et 1 OpsCenter [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair -st 0 -et 1 system_distributed ... ... {noformat} (I added the ellipses) was: forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available for older clients though. Unfortunately it hangs when you call it with the system_distributed table. It looks like it completes fine but the notification to the client that the operation is done is never sent. This is easiest to see by using nodetool from 2.1 against a 3.x cluster. {noformat} [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair -st 0 -et 1 OpsCenter [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair -st 0 -et 1 system_distributed ... ... 
{noformat} (I added the ellipses) > forceRepairRangeAsync hangs sometimes > - > > Key: CASSANDRA-11430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11430 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > Fix For: 3.x > > > forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available > for older clients though. Unfortunately it sometimes hangs when you call it. > It looks like it completes fine but the notification to the client that the > operation is done is never sent. This is easiest to see by using nodetool > from 2.1 against a 3.x cluster. > {noformat} > [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 OpsCenter > [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' > [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 system_distributed > ... > ... > {noformat} > (I added the ellipses) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11430) forceRepairRangeAsync hangs sometimes
[ https://issues.apache.org/jira/browse/CASSANDRA-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Bailey updated CASSANDRA-11430: Summary: forceRepairRangeAsync hangs sometimes (was: forceRepairRangeAsync hangs on system_distributed keyspace.) > forceRepairRangeAsync hangs sometimes > - > > Key: CASSANDRA-11430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11430 > Project: Cassandra > Issue Type: Bug >Reporter: Nick Bailey > Fix For: 3.x > > > forceRepairRangeAsync is deprecated in 2.2/3.x series. It's still available > for older clients though. Unfortunately it hangs when you call it with the > system_distributed table. It looks like it completes fine but the > notification to the client that the operation is done is never sent. This is > easiest to see by using nodetool from 2.1 against a 3.x cluster. > {noformat} > [Nicks-MacBook-Pro:16:06:21 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 OpsCenter > [2016-03-24 16:06:50,165] Nothing to repair for keyspace 'OpsCenter' > [Nicks-MacBook-Pro:16:06:50 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ > [Nicks-MacBook-Pro:16:06:55 cassandra-2.1] cassandra$ ./bin/nodetool repair > -st 0 -et 1 system_distributed > ... > ... > {noformat} > (I added the ellipses) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11456) support for PreparedStatement with LIKE
[ https://issues.apache.org/jira/browse/CASSANDRA-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-11456: Fix Version/s: (was: 3.4) 3.5 > support for PreparedStatement with LIKE > --- > > Key: CASSANDRA-11456 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11456 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Pavel Yaskevich >Assignee: Pavel Yaskevich >Priority: Minor > Fix For: 3.5 > > > Using the Java driver for example: > {code} > PreparedStatement pst = session.prepare("select * from test.users where > first_name LIKE ?"); > BoundStatement bs = pst.bind("Jon%"); > {code} > The first line fails with {{SyntaxError: line 1:47 mismatched input '?' > expecting STRING_LITERAL}} (which makes sense since it's how it's declared in > the grammar). Other operators declare the right-hand side value as a > {{Term.Raw}}, which can also be a bind marker. > I think users will expect to be able to bind the argument this way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10528) Proposal: Integrate RxJava
[ https://issues.apache.org/jira/browse/CASSANDRA-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216549#comment-15216549 ] T Jake Luciani commented on CASSANDRA-10528: Tested with an i2.8xlarge with 12 stress clients. The throughput improvement was still ~15% but tail latency improvement was > 50%. I was able to get to ~480k/sec > Proposal: Integrate RxJava > -- > > Key: CASSANDRA-10528 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10528 > Project: Cassandra > Issue Type: Improvement >Reporter: T Jake Luciani >Assignee: T Jake Luciani > Fix For: 3.x > > Attachments: rxjava-stress.png > > > The purpose of this ticket is to discuss the merits of integrating the > [RxJava|https://github.com/ReactiveX/RxJava] framework into C*. Enabling us > to incrementally make the internals of C* async and move away from SEDA to a > more modern thread per core architecture. > Related tickets: >* CASSANDRA-8520 >* CASSANDRA-8457 >* CASSANDRA-5239 >* CASSANDRA-7040 >* CASSANDRA-5863 >* CASSANDRA-6696 >* CASSANDRA-7392 > My *primary* goals in raising this issue are to provide a way of: > * *Incrementally* making the backend async > * Avoiding code complexity/readability issues > * Avoiding NIH where possible > * Building on an extendable library > My *non*-goals in raising this issue are: > >* Rewrite the entire database in one big bang >* Write our own async api/framework > > - > I've attempted to integrate RxJava a while back and found it not ready mainly > due to our lack of lambda support. Now with Java 8 I've found it very > enjoyable and have not hit any performance issues. A gentle introduction to > RxJava is [here|http://blog.danlew.net/2014/09/15/grokking-rxjava-part-1/] as > well as their > [wiki|https://github.com/ReactiveX/RxJava/wiki/Additional-Reading]. 
The > primary concept of RX is the > [Observable|http://reactivex.io/documentation/observable.html] which is > essentially a stream of stuff you can subscribe to and act on, chain, etc. > This is quite similar to [Java 8 streams > api|http://www.oracle.com/technetwork/articles/java/ma14-java-se-8-streams-2177646.html] > (or I should say streams api is similar to it). The difference is Java 8 > streams can't be used for asynchronous events while RxJava can. > Another improvement since I last tried integrating RxJava is the completion > of CASSANDRA-8099 which provides a very iterable/incremental approach to > our storage engine. *Iterators and Observables are well paired conceptually > so morphing our current Storage engine to be async is much simpler now.* > In an effort to show how one can incrementally change our backend I've done a > quick POC with RxJava and replaced our non-paging read requests to become > non-blocking. > https://github.com/apache/cassandra/compare/trunk...tjake:rxjava-3.0 > As you can probably see the code is straight-forward and sometimes quite nice! 
> *Old*
> {code}
> private static PartitionIterator fetchRows(List<SinglePartitionReadCommand> commands, ConsistencyLevel consistencyLevel)
> throws UnavailableException, ReadFailureException, ReadTimeoutException
> {
>     int cmdCount = commands.size();
>     SinglePartitionReadLifecycle[] reads = new SinglePartitionReadLifecycle[cmdCount];
>     for (int i = 0; i < cmdCount; i++)
>         reads[i] = new SinglePartitionReadLifecycle(commands.get(i), consistencyLevel);
>     for (int i = 0; i < cmdCount; i++)
>         reads[i].doInitialQueries();
>     for (int i = 0; i < cmdCount; i++)
>         reads[i].maybeTryAdditionalReplicas();
>     for (int i = 0; i < cmdCount; i++)
>         reads[i].awaitResultsAndRetryOnDigestMismatch();
>     for (int i = 0; i < cmdCount; i++)
>         if (!reads[i].isDone())
>             reads[i].maybeAwaitFullDataRead();
>     List<PartitionIterator> results = new ArrayList<>(cmdCount);
>     for (int i = 0; i < cmdCount; i++)
>     {
>         assert reads[i].isDone();
>         results.add(reads[i].getResult());
>     }
>     return PartitionIterators.concat(results);
> }
> {code}
> *New*
> {code}
> private static Observable<PartitionIterator> fetchRows(List<SinglePartitionReadCommand> commands, ConsistencyLevel consistencyLevel)
> throws UnavailableException, ReadFailureException, ReadTimeoutException
> {
>     return Observable.from(commands)
>                      .map(command -> new SinglePartitionReadLifecycle(command, consistencyLevel))
>                      .flatMap(read -> read.getPartitionIterator())
>                      .toList()
>                      .map(results -> PartitionIterators.concat(results));
> }
> {code}
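For readers without RxJava on the classpath, the shape of the composed pipeline above can be mimicked with only the JDK. This is a hedged sketch, not Cassandra code: the class and method names (`FetchRowsSketch`, `readPartition`) are hypothetical, and `CompletableFuture` stands in for `Observable`:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class FetchRowsSketch
{
    // Hypothetical stand-in for a per-partition read; not Cassandra's API.
    static CompletableFuture<String> readPartition(String command)
    {
        return CompletableFuture.supplyAsync(() -> "result:" + command);
    }

    // Same shape as the Observable version: map each command to an async
    // read, then gather all results, without a blocking loop per command.
    static CompletableFuture<List<String>> fetchRows(List<String> commands)
    {
        List<CompletableFuture<String>> reads = commands.stream()
                                                        .map(FetchRowsSketch::readPartition)
                                                        .collect(Collectors.toList());
        // allOf completes when every read has; reads' order is preserved.
        return CompletableFuture.allOf(reads.toArray(new CompletableFuture[0]))
                                .thenApply(ignored -> reads.stream()
                                                           .map(CompletableFuture::join)
                                                           .collect(Collectors.toList()));
    }

    public static void main(String[] args)
    {
        System.out.println(fetchRows(Arrays.asList("a", "b")).join()); // prints [result:a, result:b]
    }
}
```

The caller only blocks at the final `join()`; everything in between composes asynchronously, which is the incremental property the ticket is after.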
[jira] [Commented] (CASSANDRA-11432) Counter values become under-counted when running repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216546#comment-15216546 ] Dikang Gu commented on CASSANDRA-11432: --- We did more experiments, and we found that the counter value was still inconsistent even after the repair finished. It seems the repair caused some permanent damage to the counters. > Counter values become under-counted when running repair. > > > Key: CASSANDRA-11432 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11432 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu >Assignee: Aleksey Yeschenko > > We are experimenting with Counters in Cassandra 2.2.5. Our setup is that we have 6 > nodes across three different regions, and in each region the replication > factor is 2. Basically, each node holds a full copy of the data. > We are writing to the cluster with CL = 2, and reading with CL = 1. > While doing 30k/s counter increments/decrements per node, we are also > double-writing to our mysql tier, so that we can measure > the accuracy of the C* counter compared to mysql. > The experiment result was great at the beginning; the counter values in C* and > mysql were very close. The difference was less than 0.1%. > But when we started to run repair on one node, the counter value in C* > became much less than the value in mysql; the difference became larger than > 1%. > My question is: is it a known problem that counter values will become > under-counted while repair is running? Should we avoid running repair on > counter tables? > Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11456) support for PreparedStatement with LIKE
Pavel Yaskevich created CASSANDRA-11456: --- Summary: support for PreparedStatement with LIKE Key: CASSANDRA-11456 URL: https://issues.apache.org/jira/browse/CASSANDRA-11456 Project: Cassandra Issue Type: Bug Components: CQL Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Fix For: 3.4 Using the Java driver for example: {code} PreparedStatement pst = session.prepare("select * from test.users where first_name LIKE ?"); BoundStatement bs = pst.bind("Jon%"); {code} The first line fails with {{SyntaxError: line 1:47 mismatched input '?' expecting STRING_LITERAL}} (which makes sense since it's how it's declared in the grammar). Other operators declare the right-hand side value as a {{Term.Raw}}, which can also be a bind marker. I think users will expect to be able to bind the argument this way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
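Concretely, the asymmetry the ticket describes looks like this (using the ticket's own example table; only the literal form parses today):

{code}
-- accepted by the current grammar: LIKE requires a STRING_LITERAL
SELECT * FROM test.users WHERE first_name LIKE 'Jon%';

-- what this ticket asks to also accept: a bind marker, as other operators allow
SELECT * FROM test.users WHERE first_name LIKE ?;
{code}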
[jira] [Updated] (CASSANDRA-11438) dtest failure in consistency_test.TestAccuracy.test_network_topology_strategy_users
[ https://issues.apache.org/jira/browse/CASSANDRA-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-11438: Resolution: Fixed Status: Resolved (was: Patch Available) > dtest failure in > consistency_test.TestAccuracy.test_network_topology_strategy_users > --- > > Key: CASSANDRA-11438 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11438 > Project: Cassandra > Issue Type: Test >Reporter: Philip Thompson >Assignee: Philip Thompson > Labels: dtest > > This test and > consistency_test.TestAvailability.test_network_topology_strategy have begun > failing now that we dropped the instance size we run CI with. The tests > should be altered to reflect the constrained resources. They are ambitious > for dtests, regardless. > example failure: > http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/221/testReport/consistency_test/TestAccuracy/test_network_topology_strategy_users > Failed on CassCI build cassandra-2.1_novnode_dtest #221 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11454) 2.2 Documentation conflicts with observed behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-11454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216421#comment-15216421 ] Terry Liu commented on CASSANDRA-11454: --- Aha, thanks [~philipthompson]. I didn't realize the distinction before but that makes sense now. > 2.2 Documentation conflicts with observed behavior > -- > > Key: CASSANDRA-11454 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11454 > Project: Cassandra > Issue Type: Task > Components: Documentation and Website > Environment: CentOS 6.6 > [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] >Reporter: Terry Liu >Priority: Minor > > Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query > would return as soon as it found enough rows to fulfill your limit. > For example, > {noformat} > SELECT COUNT(*) > FROM some_table > LIMIT 1 > {noformat} > would always return a count of 1 as long as there is at least one row in the > table. > I've noticed that Cassandra 2.2 no longer behaves in this way and yet the > documentation continues to suggest otherwise: > http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit > Cassandra 2.2 seems to return the full count despite what you set the LIMIT > to. > Looking through the version changes, it seems likely that the changes for the > following note might be related (from > https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): > {noformat} > Allow count(*) and count(1) to be use as normal aggregation > count() can now be used in aggregation. > {noformat} > If so, the related ticket seems to be > https://issues.apache.org/jira/browse/CASSANDRA-10114. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11225) dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters
[ https://issues.apache.org/jira/browse/CASSANDRA-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216391#comment-15216391 ] Russ Hatch commented on CASSANDRA-11225: doing another bulk run here: http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/50/ > dtest failure in consistency_test.TestAccuracy.test_simple_strategy_counters > > > Key: CASSANDRA-11225 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11225 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Russ Hatch > Labels: dtest > > example failure: > http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/209/testReport/consistency_test/TestAccuracy/test_simple_strategy_counters > Failed on CassCI build cassandra-2.1_novnode_dtest #209 > error: "AssertionError: Failed to read value from sufficient number of nodes, > required 2 but got 1 - [574, 2]" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11372) Make CQL grammar more easily extensible
[ https://issues.apache.org/jira/browse/CASSANDRA-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-11372: -- Resolution: Fixed Fix Version/s: (was: 3.x) 3.6 Status: Resolved (was: Ready to Commit) > Make CQL grammar more easily extensible > > > Key: CASSANDRA-11372 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11372 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Mike Adamson >Assignee: Mike Adamson > Fix For: 3.6 > > > The CQL grammar ({{Cql.g}}) is currently a composite grammar and, as such, is > not easy to extend. > We now have a number of 3rd parties who are extending the grammar (custom > index grammars, for example) and it would be helpful if the grammar could be > split in a parser and lexer in order to make extension easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11372) Make CQL grammar more easily extensible
[ https://issues.apache.org/jira/browse/CASSANDRA-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216344#comment-15216344 ] Aleksey Yeschenko commented on CASSANDRA-11372: --- Committed as [eea0a0cef993959354200fcde94c2664454039b6|https://github.com/apache/cassandra/commit/eea0a0cef993959354200fcde94c2664454039b6] to trunk, thanks. > Make CQL grammar more easily extensible > > > Key: CASSANDRA-11372 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11372 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Mike Adamson >Assignee: Mike Adamson > Fix For: 3.6 > > > The CQL grammar ({{Cql.g}}) is currently a composite grammar and, as such, is > not easy to extend. > We now have a number of 3rd parties who are extending the grammar (custom > index grammars, for example) and it would be helpful if the grammar could be > split in a parser and lexer in order to make extension easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Break the CQL grammar into separate Parser and Lexer
Break the CQL grammar into separate Parser and Lexer patch by Mike Adamson; reviewed by Aleksey Yeschenko for CASSANDRA-11372 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eea0a0ce Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eea0a0ce Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eea0a0ce Branch: refs/heads/trunk Commit: eea0a0cef993959354200fcde94c2664454039b6 Parents: 9fd6322 Author: Mike Adamson Authored: Thu Mar 17 11:33:20 2016 + Committer: Aleksey Yeschenko Committed: Tue Mar 29 18:07:12 2016 +0100 -- CHANGES.txt |1 + build.xml|7 +- src/antlr/Cql.g | 121 ++ src/antlr/Lexer.g| 319 src/antlr/Parser.g | 1613 src/java/org/apache/cassandra/cql3/Cql.g | 1949 - 6 files changed, 2058 insertions(+), 1952 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/eea0a0ce/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 4810a12..417d43f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.6 + * Break the CQL grammar into separate Parser and Lexer (CASSANDRA-11372) * Compress only inter-dc traffic by default (CASSANDRA-) * Add metrics to track write amplification (CASSANDRA-11420) * cassandra-stress: cannot handle "value-less" tables (CASSANDRA-7739) http://git-wip-us.apache.org/repos/asf/cassandra/blob/eea0a0ce/build.xml -- diff --git a/build.xml b/build.xml index 5dc1e06..4758d18 100644 --- a/build.xml +++ b/build.xml @@ -34,6 +34,7 @@ + @@ -211,12 +212,12 @@ --> - Building Grammar ${build.src.java}/org/apache/cassandra/cql3/Cql.g ... + Building Grammar ${build.src.antlr}/Cql.g ... - + http://git-wip-us.apache.org/repos/asf/cassandra/blob/eea0a0ce/src/antlr/Cql.g -- diff --git a/src/antlr/Cql.g b/src/antlr/Cql.g new file mode 100644 index 000..7cc16a3 --- /dev/null +++ b/src/antlr/Cql.g @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +grammar Cql; + +options { +language = Java; +} + +import Parser,Lexer; + +@header { +package org.apache.cassandra.cql3; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.EnumSet; +import java.util.HashSet; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import org.apache.cassandra.auth.*; +import org.apache.cassandra.cql3.*; +import org.apache.cassandra.cql3.restrictions.CustomIndexExpression; +import org.apache.cassandra.cql3.statements.*; +import org.apache.cassandra.cql3.selection.*; +import org.apache.cassandra.cql3.functions.*; +import org.apache.cassandra.db.marshal.CollectionType; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.exceptions.InvalidRequestException; +import org.apache.cassandra.exceptions.SyntaxException; +import org.apache.cassandra.utils.Pair; +} + +@members { +public void addErrorListener(ErrorListener listener) +{ +gParser.addErrorListener(listener); +} + +public void removeErrorListener(ErrorListener listener) +{ +gParser.removeErrorListener(listener); +} + +public void displayRecognitionError(String[] tokenNames, RecognitionException e) +{ 
+gParser.displayRecognitionError(tokenNames, e); +} + +protected void addRecognitionError(String msg) +{ +gParser.addRecog
[1/2] cassandra git commit: Break the CQL grammar into separate Parser and Lexer
Repository: cassandra Updated Branches: refs/heads/trunk 9fd6322dc -> eea0a0cef http://git-wip-us.apache.org/repos/asf/cassandra/blob/eea0a0ce/src/java/org/apache/cassandra/cql3/Cql.g -- diff --git a/src/java/org/apache/cassandra/cql3/Cql.g b/src/java/org/apache/cassandra/cql3/Cql.g deleted file mode 100644 index f7841fd..000 --- a/src/java/org/apache/cassandra/cql3/Cql.g +++ /dev/null @@ -1,1949 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. 
- */ - -grammar Cql; - -options { -language = Java; -} - -@header { -package org.apache.cassandra.cql3; - -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collections; -import java.util.EnumSet; -import java.util.HashSet; -import java.util.HashMap; -import java.util.LinkedHashMap; -import java.util.List; -import java.util.Map; -import java.util.Set; - -import org.apache.cassandra.auth.*; -import org.apache.cassandra.cql3.*; -import org.apache.cassandra.cql3.restrictions.CustomIndexExpression; -import org.apache.cassandra.cql3.statements.*; -import org.apache.cassandra.cql3.selection.*; -import org.apache.cassandra.cql3.functions.*; -import org.apache.cassandra.db.marshal.CollectionType; -import org.apache.cassandra.exceptions.ConfigurationException; -import org.apache.cassandra.exceptions.InvalidRequestException; -import org.apache.cassandra.exceptions.SyntaxException; -import org.apache.cassandra.utils.Pair; -} - -@members { -private final List listeners = new ArrayList(); -private final List bindVariables = new ArrayList(); - -public static final Set reservedTypeNames = new HashSet() -{{ -add("byte"); -add("complex"); -add("enum"); -add("date"); -add("interval"); -add("macaddr"); -add("bitstring"); -}}; - -public AbstractMarker.Raw newBindVariables(ColumnIdentifier name) -{ -AbstractMarker.Raw marker = new AbstractMarker.Raw(bindVariables.size()); -bindVariables.add(name); -return marker; -} - -public AbstractMarker.INRaw newINBindVariables(ColumnIdentifier name) -{ -AbstractMarker.INRaw marker = new AbstractMarker.INRaw(bindVariables.size()); -bindVariables.add(name); -return marker; -} - -public Tuples.Raw newTupleBindVariables(ColumnIdentifier name) -{ -Tuples.Raw marker = new Tuples.Raw(bindVariables.size()); -bindVariables.add(name); -return marker; -} - -public Tuples.INRaw newTupleINBindVariables(ColumnIdentifier name) -{ -Tuples.INRaw marker = new Tuples.INRaw(bindVariables.size()); -bindVariables.add(name); -return marker; -} - 
-public Json.Marker newJsonBindVariables(ColumnIdentifier name) -{ -Json.Marker marker = new Json.Marker(bindVariables.size()); -bindVariables.add(name); -return marker; -} - -public void addErrorListener(ErrorListener listener) -{ -this.listeners.add(listener); -} - -public void removeErrorListener(ErrorListener listener) -{ -this.listeners.remove(listener); -} - -public void displayRecognitionError(String[] tokenNames, RecognitionException e) -{ -for (int i = 0, m = listeners.size(); i < m; i++) -listeners.get(i).syntaxError(this, tokenNames, e); -} - -private void addRecognitionError(String msg) -{ -for (int i = 0, m = listeners.size(); i < m; i++) -listeners.get(i).syntaxError(this, msg); -} - -public Map convertPropertyMap(Maps.Literal map) -{ -if (map == null || map.entries == null || map.entries.isEmpty()) -return Collections.emptyMap(); - -Map res = new HashMap(map.entries.size()); - -for (Pair entry : map.entries) -{ -// Because the parser tries to be smart and recover on error (to -// allow displaying more than one error I suppose), we have null -// entries in there. Just skip those, a proper error will be thrown in the en
[jira] [Comment Edited] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216337#comment-15216337 ] Jordan West edited comment on CASSANDRA-11383 at 3/29/16 5:02 PM: -- bq. Maybe that leaves one last question as to whether non-SPARSE (PREFIX) mode is considered advisable/recommended for high cardinality column data, where SPARSE mode is nominally a better choice. Maybe that is strictly a matter of whether the prefix/LIKE feature is to be utilized - if so, than PREFIX mode is required, but if not, SPARSE mode sounds like the better choice. But I don't have a handle on the internal index structures to know if that's absolutely the case - that a PREFIX index for SPARSE data would necessarily be larger and/or slower than a SPARSE index for high cardinality data. I would hope so, but it would be good to have that confirmed. {{SPARSE}} is only for numeric data so LIKE queries do not apply. For data that is sparse (every term/column value has less than 5 matching keys), such as indexing the {{created_at}} field in time series data (where there is typically few matching rows/events per {{created_at}} timestamp), it is best to use {{SPARSE}}, always, and especially in cases where range queries are used. {{SPARSE}} is primarily an optimization for range queries on this sort of data. Its biggest effect is visible on large ranges (e.g. spanning multiple days of time series data). The decision process for whether or not to use {{SPARSE}} should be: 1. is the data a numeric type? 2. is it expected that there will be a large (millions or more) number of terms (column values) in the index with each term having a small (5 or less) set of matching tokens (partition keys)? 3. will range queries be performed against this index? If the answer to all three questions is Yes then use {{SPARSE}}. >From the docs >(https://github.com/xedin/cassandra/blob/trunk/doc/SASI.md#ondiskindexbuilder): bq. 
The SPARSE mode differs from PREFIX in that for every 64 blocks of terms a TokenTree is built merging all the TokenTrees for each term into a single one. This copy of the data is used for efficient iteration of large ranges of e.g. timestamps. The index "mode" is configurable per column at index creation time. was (Author: jrwest): bq. Maybe that leaves one last question as to whether non-SPARSE (PREFIX) mode is considered advisable/recommended for high cardinality column data, where SPARSE mode is nominally a better choice. Maybe that is strictly a matter of whether the prefix/LIKE feature is to be utilized - if so, than PREFIX mode is required, but if not, SPARSE mode sounds like the better choice. But I don't have a handle on the internal index structures to know if that's absolutely the case - that a PREFIX index for SPARSE data would necessarily be larger and/or slower than a SPARSE index for high cardinality data. I would hope so, but it would be good to have that confirmed. {{SPARSE}} is only for numeric data so LIKE queries do not apply. For data that is sparse (every term/column value has less than 5 matching keys), such as indexing the {{created_at}} field in time series data (where there is typically few matching rows/events per {{created_at}} timestamp), it is best to use {{SPARSE}}, always, and especially in cases where range queries are used. {{SPARSE}} is primarily an optimization for range queries on this sort of data. Its biggest effect is visible on large ranges (e.g. spanning multiple days of time series data). The decision process for whether or not to use {{SPARSE}} should be: 1. is the data a numeric type? 2. is it expected that there will be a large (millions or more) number of terms (column values) in the index with each term having a small (5 or less) set of matching tokens (partition keys)? If the answer to both is Yes then use {{SPARSE}}. >From the docs >(https://github.com/xedin/cassandra/blob/trunk/doc/SASI.md#ondiskindexbuilder): bq. 
The SPARSE mode differs from PREFIX in that for every 64 blocks of terms a TokenTree is built merging all the TokenTrees for each term into a single one. This copy of the data is used for efficient iteration of large ranges of e.g. timestamps. The index "mode" is configurable per column at index creation time. > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_bu
[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216339#comment-15216339 ] Björn Hegerfors commented on CASSANDRA-9666: Nope, there's no feature in DTCS (minus the min/max) that I'm seeing complaints about. I think it works the way it should, and to people's needs. But I feel like I keep hearing that DTCS is confusing to configure. So what I'm saying is that if that's enough to make people feel strongly about it, then maybe an overhaul which renames the options and maybe even the strategy (in response to Marcus's quote "I'm never touching that again") would be the right thing to do. The alternative windowing algorithm isn't important; it's just a response to me having heard a couple of times that the window tiering itself is difficult to understand if you try to. So all in all, I'm not suggesting any functional changes to DTCS (other than going through with CASSANDRA-11056), just superficial changes, because on that level I hear complaints. But the cost of renaming options may very well outweigh the benefit. If people after all don't feel that strongly about it, I guess the saner choice is to do nothing. Btw, as a side note, I believe the code for DTCS and TWCS is very similar; the alternatives of adding tiering to TWCS vs renaming DTCS options would probably end up with the same thing. > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). 
> I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. > Specifically, rather than creating a number of windows of ever increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressive compact down > to a single sstable once that window is no longer current. The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). > The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using an unusual > selection criteria, which prefers files with earlier timestamps, but ignores > sizes. In TimeWindowCompactionStrategy, the first window data will be > compacted with the well tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. 
> - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction) > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction) > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. The window is set in common, easily > understandable terms such as “12 hours”, “1 Day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and users keeping data for years. > - There is no explicitly configurable max sstable age, though sstables will > naturally stop compacting once new data is written in that window. > - Streaming
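The window arithmetic described in the summary (a unit plus a size, e.g. 6 hours giving roughly 4 sstables per day) can be sketched with plain JDK code. This is an illustration of the bucketing idea only, with hypothetical names, not TWCS's actual implementation:

```java
import java.util.concurrent.TimeUnit;

public class TimeWindowSketch
{
    // Every timestamp maps to the start of its window; sstables whose
    // maxTimestamp lands in the same window get compacted together.
    static long windowStartMillis(long timestampMillis, TimeUnit unit, long size)
    {
        long windowMillis = unit.toMillis(size);
        return (timestampMillis / windowMillis) * windowMillis;
    }

    public static void main(String[] args)
    {
        long w = TimeUnit.HOURS.toMillis(6);
        // two timestamps inside the same 6-hour block share a window start...
        System.out.println(windowStartMillis(w + 1, TimeUnit.HOURS, 6) ==
                           windowStartMillis(2 * w - 1, TimeUnit.HOURS, 6)); // true
        // ...while a timestamp in the next block gets the next window start
        System.out.println(windowStartMillis(2 * w, TimeUnit.HOURS, 6) == 2 * w); // true
    }
}
```

Because the mapping is pure integer arithmetic on the timestamp, an sstable's window never changes as new data arrives, which is why old windows naturally stop compacting.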
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216337#comment-15216337 ] Jordan West commented on CASSANDRA-11383: - bq. Maybe that leaves one last question as to whether non-SPARSE (PREFIX) mode is considered advisable/recommended for high cardinality column data, where SPARSE mode is nominally a better choice. Maybe that is strictly a matter of whether the prefix/LIKE feature is to be utilized - if so, then PREFIX mode is required, but if not, SPARSE mode sounds like the better choice. But I don't have a handle on the internal index structures to know if that's absolutely the case - that a PREFIX index for SPARSE data would necessarily be larger and/or slower than a SPARSE index for high cardinality data. I would hope so, but it would be good to have that confirmed. {{SPARSE}} is only for numeric data, so LIKE queries do not apply. For data that is sparse (every term/column value has less than 5 matching keys), such as indexing the {{created_at}} field in time series data (where there are typically few matching rows/events per {{created_at}} timestamp), it is best to use {{SPARSE}}, always, and especially in cases where range queries are used. {{SPARSE}} is primarily an optimization for range queries on this sort of data. Its biggest effect is visible on large ranges (e.g. spanning multiple days of time series data). The decision process for whether or not to use {{SPARSE}} should be: 1. is the data a numeric type? 2. is it expected that there will be a large (millions or more) number of terms (column values) in the index, with each term having a small (5 or less) set of matching tokens (partition keys)? If the answer to both is Yes then use {{SPARSE}}. From the docs (https://github.com/xedin/cassandra/blob/trunk/doc/SASI.md#ondiskindexbuilder): bq. 
The SPARSE mode differs from PREFIX in that for every 64 blocks of terms a TokenTree is built merging all the TokenTrees for each term into a single one. This copy of the data is used for efficient iteration of large ranges of e.g. timestamps. The index "mode" is configurable per column at index creation time. > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
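The decision process described in the comment ends at index creation. Assuming the standard SASI {{CREATE CUSTOM INDEX}} syntax (the keyspace, table, and index names here are hypothetical; the {{created_at}} column is the comment's own example), a SPARSE-mode index would look like:

{code}
-- SPARSE mode: numeric, high-cardinality column with few rows per term
CREATE CUSTOM INDEX events_created_at_idx ON ks.events (created_at)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'SPARSE'};
{code}

Omitting the {{mode}} option falls back to the default (PREFIX) behavior, which is what the LIKE-style queries require.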
[jira] [Commented] (CASSANDRA-11375) COPY FROM fails when importing blob
[ https://issues.apache.org/jira/browse/CASSANDRA-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216335#comment-15216335 ] Aleksey Yeschenko commented on CASSANDRA-11375: --- Committed as [5a45aa62dd57f59753396c5d5541dbf3a0a1b220|https://github.com/apache/cassandra/commit/5a45aa62dd57f59753396c5d5541dbf3a0a1b220] to 2.1 only and merged upwards with {{-s ours}}, thanks. > COPY FROM fails when importing blob > --- > > Key: CASSANDRA-11375 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11375 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Philippe Thibaudeau >Assignee: Stefania > Fix For: 2.1.14 > > > When we try to COPY TO a table containing a blob, we get this error > COPY test.blobTable FROM '/tmp/test1.csv' WITH NULL='null' AND DELIMITER=',' > AND QUOTE='"'; > /opt/apache-cassandra-2.1.13.4/bin/../pylib/cqlshlib/copyutil.py:1602: > DeprecationWarning: BaseException.message has been deprecated as of Python 2.6 > /opt/apache-cassandra-2.1.13.4/bin/../pylib/cqlshlib/copyutil.py:1850: > DeprecationWarning: BaseException.message has been deprecated as of Python 2.6 > Failed to import 5 rows: ParseError - fromhex() argument 1 must be unicode, > not str - given up without retries > Failed to process 5 rows; failed rows written to import_test_blobTable.err > Same COPY TO function worked fine with 2.1.9 > The csv is generated by doing a COPY FROM on the same table. > Is there any work around this issue? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11375) COPY FROM fails when importing blob
[ https://issues.apache.org/jira/browse/CASSANDRA-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-11375: -- Resolution: Fixed Fix Version/s: (was: 2.1.x) 2.1.14 Status: Resolved (was: Ready to Commit) > COPY FROM fails when importing blob > --- > > Key: CASSANDRA-11375 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11375 > Project: Cassandra > Issue Type: Bug > Components: CQL >Reporter: Philippe Thibaudeau >Assignee: Stefania > Fix For: 2.1.14 > > > When we try to COPY TO a table containing a blob, we get this error > COPY test.blobTable FROM '/tmp/test1.csv' WITH NULL='null' AND DELIMITER=',' > AND QUOTE='"'; > /opt/apache-cassandra-2.1.13.4/bin/../pylib/cqlshlib/copyutil.py:1602: > DeprecationWarning: BaseException.message has been deprecated as of Python 2.6 > /opt/apache-cassandra-2.1.13.4/bin/../pylib/cqlshlib/copyutil.py:1850: > DeprecationWarning: BaseException.message has been deprecated as of Python 2.6 > Failed to import 5 rows: ParseError - fromhex() argument 1 must be unicode, > not str - given up without retries > Failed to process 5 rows; failed rows written to import_test_blobTable.err > Same COPY TO function worked fine with 2.1.9 > The csv is generated by doing a COPY FROM on the same table. > Is there any work around this issue? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[07/15] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ff00655 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ff00655 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ff00655 Branch: refs/heads/cassandra-2.2 Commit: 6ff006554f09b549414ce6af89ba0f7f2982ce62 Parents: fc972b9 5a45aa6 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:04 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:04 2016 +0100 -- --
[10/15] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c9e81ea7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c9e81ea7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c9e81ea7 Branch: refs/heads/trunk Commit: c9e81ea7fc6a8a2acfbd4778ca846542d250c8c4 Parents: 1de5342 6ff0065 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:21 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:21 2016 +0100 -- --
[06/15] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ff00655 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ff00655 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ff00655 Branch: refs/heads/trunk Commit: 6ff006554f09b549414ce6af89ba0f7f2982ce62 Parents: fc972b9 5a45aa6 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:04 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:04 2016 +0100 -- --
[03/15] cassandra git commit: COPY FROM fails when importing blob
COPY FROM fails when importing blob patch by Stefania Alborghetti; reviewed by Paulo Motta for CASSANDRA-11375 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a45aa62 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a45aa62 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a45aa62 Branch: refs/heads/trunk Commit: 5a45aa62dd57f59753396c5d5541dbf3a0a1b220 Parents: 8b8a3f5 Author: Stefania Alborghetti Authored: Fri Mar 18 16:46:37 2016 +0800 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:56:33 2016 +0100 -- CHANGES.txt| 1 + pylib/cqlshlib/copyutil.py | 16 ++-- 2 files changed, 11 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 7794d4f..65d094f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * COPY FROM fails when importing blob (CASSANDRA-11375) * Backport CASSANDRA-10679 (CASSANDRA-9598) * Don't do defragmentation if reading from repaired sstables (CASSANDRA-10342) * Fix streaming_socket_timeout_in_ms not enforced (CASSANDRA-11286) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/pylib/cqlshlib/copyutil.py -- diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py index ba2a47b..28e08b1 100644 --- a/pylib/cqlshlib/copyutil.py +++ b/pylib/cqlshlib/copyutil.py @@ -1176,7 +1176,7 @@ class FeedingProcess(mp.Process): if rows: sent += self.send_chunk(ch, rows) except Exception, exc: -self.outmsg.send(ImportTaskError(exc.__class__.__name__, exc.message)) +self.outmsg.send(ImportTaskError(exc.__class__.__name__, str(exc))) if reader.exhausted: break @@ -1679,7 +1679,11 @@ class ImportConversion(object): return converters.get(t.typename, convert_unknown)(unprotect(v), ct=t) def convert_blob(v, **_): -return bytearray.fromhex(v[2:]) +try: +return bytearray.fromhex(v[2:]) +except TypeError: +# Work-around for Python 
2.6 bug +return bytearray.fromhex(unicode(v[2:])) def convert_text(v, **_): return v @@ -1869,7 +1873,7 @@ class ImportConversion(object): try: return [conv(val) for conv, val in zip(converters, row)] except Exception, e: -raise ParseError(e.message) +raise ParseError(str(e)) def get_null_primary_key_message(self, idx): message = "Cannot insert null value for primary key column '%s'." % (self.columns[idx],) @@ -2183,7 +2187,7 @@ class ImportProcess(ChildProcess): try: return conv.convert_row(r) except Exception, err: -errors[err.message].append(r) +errors[str(err)].append(r) return None converted_rows = filter(None, [convert_row(r) for r in rows]) @@ -2248,7 +2252,7 @@ class ImportProcess(ChildProcess): pk = get_row_partition_key_values(row) rows_by_ring_pos[get_ring_pos(ring, pk_to_token_value(pk))].append(row) except Exception, e: -errors[e.message].append(row) +errors[str(e)].append(row) if errors: for msg, rows in errors.iteritems(): @@ -2286,7 +2290,7 @@ class ImportProcess(ChildProcess): def report_error(self, err, chunk, rows=None, attempts=1, final=True): if self.debug: traceback.print_exc(err) -self.outmsg.send(ImportTaskError(err.__class__.__name__, err.message, rows, attempts, final)) +self.outmsg.send(ImportTaskError(err.__class__.__name__, str(err), rows, attempts, final)) if final: self.update_chunk(rows, chunk)
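The `convert_blob` change in the patch above can be shown in isolation. This is a sketch, not the shipped function: cqlsh renders blobs as `0x`-prefixed hex strings, and on Python 2.6 `bytearray.fromhex()` only accepted unicode input, so the patch retries after converting (the original uses `unicode(...)`; `str(...)` is used here so the sketch also runs on Python 3, where the fallback branch is simply never taken).

```python
def convert_blob(v):
    # cqlsh represents a blob as a "0x"-prefixed hex string, e.g. "0xcafe";
    # strip the two-character prefix before decoding the hex digits.
    try:
        return bytearray.fromhex(v[2:])
    except TypeError:
        # Work-around for the Python 2.6 bug: fromhex() there rejects
        # byte strings, so retry with an explicit unicode conversion.
        return bytearray.fromhex(str(v[2:]))

print(convert_blob("0xcafe"))  # bytearray(b'\xca\xfe')
```

On Python 2.7+ and 3.x the first branch succeeds directly, which is why the bug only surfaced on 2.6 installations.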
[15/15] cassandra git commit: Merge branch 'cassandra-3.5' into trunk
Merge branch 'cassandra-3.5' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9fd6322d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9fd6322d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9fd6322d Branch: refs/heads/trunk Commit: 9fd6322dc3539dd0327f886734211a17aa89b9ad Parents: 761dfc2 bed6aae Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:48 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:48 2016 +0100 -- --
[08/15] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ff00655 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ff00655 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ff00655 Branch: refs/heads/cassandra-3.0 Commit: 6ff006554f09b549414ce6af89ba0f7f2982ce62 Parents: fc972b9 5a45aa6 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:04 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:04 2016 +0100 -- --
[05/15] cassandra git commit: COPY FROM fails when importing blob
COPY FROM fails when importing blob patch by Stefania Alborghetti; reviewed by Paulo Motta for CASSANDRA-11375 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a45aa62 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a45aa62 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a45aa62 Branch: refs/heads/cassandra-3.5 Commit: 5a45aa62dd57f59753396c5d5541dbf3a0a1b220 Parents: 8b8a3f5 Author: Stefania Alborghetti Authored: Fri Mar 18 16:46:37 2016 +0800 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:56:33 2016 +0100 -- CHANGES.txt| 1 + pylib/cqlshlib/copyutil.py | 16 ++-- 2 files changed, 11 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 7794d4f..65d094f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * COPY FROM fails when importing blob (CASSANDRA-11375) * Backport CASSANDRA-10679 (CASSANDRA-9598) * Don't do defragmentation if reading from repaired sstables (CASSANDRA-10342) * Fix streaming_socket_timeout_in_ms not enforced (CASSANDRA-11286) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/pylib/cqlshlib/copyutil.py -- diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py index ba2a47b..28e08b1 100644 --- a/pylib/cqlshlib/copyutil.py +++ b/pylib/cqlshlib/copyutil.py @@ -1176,7 +1176,7 @@ class FeedingProcess(mp.Process): if rows: sent += self.send_chunk(ch, rows) except Exception, exc: -self.outmsg.send(ImportTaskError(exc.__class__.__name__, exc.message)) +self.outmsg.send(ImportTaskError(exc.__class__.__name__, str(exc))) if reader.exhausted: break @@ -1679,7 +1679,11 @@ class ImportConversion(object): return converters.get(t.typename, convert_unknown)(unprotect(v), ct=t) def convert_blob(v, **_): -return bytearray.fromhex(v[2:]) +try: +return bytearray.fromhex(v[2:]) +except TypeError: +# Work-around for 
Python 2.6 bug +return bytearray.fromhex(unicode(v[2:])) def convert_text(v, **_): return v @@ -1869,7 +1873,7 @@ class ImportConversion(object): try: return [conv(val) for conv, val in zip(converters, row)] except Exception, e: -raise ParseError(e.message) +raise ParseError(str(e)) def get_null_primary_key_message(self, idx): message = "Cannot insert null value for primary key column '%s'." % (self.columns[idx],) @@ -2183,7 +2187,7 @@ class ImportProcess(ChildProcess): try: return conv.convert_row(r) except Exception, err: -errors[err.message].append(r) +errors[str(err)].append(r) return None converted_rows = filter(None, [convert_row(r) for r in rows]) @@ -2248,7 +2252,7 @@ class ImportProcess(ChildProcess): pk = get_row_partition_key_values(row) rows_by_ring_pos[get_ring_pos(ring, pk_to_token_value(pk))].append(row) except Exception, e: -errors[e.message].append(row) +errors[str(e)].append(row) if errors: for msg, rows in errors.iteritems(): @@ -2286,7 +2290,7 @@ class ImportProcess(ChildProcess): def report_error(self, err, chunk, rows=None, attempts=1, final=True): if self.debug: traceback.print_exc(err) -self.outmsg.send(ImportTaskError(err.__class__.__name__, err.message, rows, attempts, final)) +self.outmsg.send(ImportTaskError(err.__class__.__name__, str(err), rows, attempts, final)) if final: self.update_chunk(rows, chunk)
[01/15] cassandra git commit: COPY FROM fails when importing blob
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 8b8a3f5b9 -> 5a45aa62d refs/heads/cassandra-2.2 fc972b9f2 -> 6ff006554 refs/heads/cassandra-3.0 1de534276 -> c9e81ea7f refs/heads/cassandra-3.5 daf7606a6 -> bed6aae00 refs/heads/trunk 761dfc2c8 -> 9fd6322dc COPY FROM fails when importing blob patch by Stefania Alborghetti; reviewed by Paulo Motta for CASSANDRA-11375 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a45aa62 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a45aa62 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a45aa62 Branch: refs/heads/cassandra-2.1 Commit: 5a45aa62dd57f59753396c5d5541dbf3a0a1b220 Parents: 8b8a3f5 Author: Stefania Alborghetti Authored: Fri Mar 18 16:46:37 2016 +0800 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:56:33 2016 +0100 -- CHANGES.txt| 1 + pylib/cqlshlib/copyutil.py | 16 ++-- 2 files changed, 11 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 7794d4f..65d094f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * COPY FROM fails when importing blob (CASSANDRA-11375) * Backport CASSANDRA-10679 (CASSANDRA-9598) * Don't do defragmentation if reading from repaired sstables (CASSANDRA-10342) * Fix streaming_socket_timeout_in_ms not enforced (CASSANDRA-11286) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/pylib/cqlshlib/copyutil.py -- diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py index ba2a47b..28e08b1 100644 --- a/pylib/cqlshlib/copyutil.py +++ b/pylib/cqlshlib/copyutil.py @@ -1176,7 +1176,7 @@ class FeedingProcess(mp.Process): if rows: sent += self.send_chunk(ch, rows) except Exception, exc: -self.outmsg.send(ImportTaskError(exc.__class__.__name__, exc.message)) +self.outmsg.send(ImportTaskError(exc.__class__.__name__, str(exc))) if 
reader.exhausted: break @@ -1679,7 +1679,11 @@ class ImportConversion(object): return converters.get(t.typename, convert_unknown)(unprotect(v), ct=t) def convert_blob(v, **_): -return bytearray.fromhex(v[2:]) +try: +return bytearray.fromhex(v[2:]) +except TypeError: +# Work-around for Python 2.6 bug +return bytearray.fromhex(unicode(v[2:])) def convert_text(v, **_): return v @@ -1869,7 +1873,7 @@ class ImportConversion(object): try: return [conv(val) for conv, val in zip(converters, row)] except Exception, e: -raise ParseError(e.message) +raise ParseError(str(e)) def get_null_primary_key_message(self, idx): message = "Cannot insert null value for primary key column '%s'." % (self.columns[idx],) @@ -2183,7 +2187,7 @@ class ImportProcess(ChildProcess): try: return conv.convert_row(r) except Exception, err: -errors[err.message].append(r) +errors[str(err)].append(r) return None converted_rows = filter(None, [convert_row(r) for r in rows]) @@ -2248,7 +2252,7 @@ class ImportProcess(ChildProcess): pk = get_row_partition_key_values(row) rows_by_ring_pos[get_ring_pos(ring, pk_to_token_value(pk))].append(row) except Exception, e: -errors[e.message].append(row) +errors[str(e)].append(row) if errors: for msg, rows in errors.iteritems(): @@ -2286,7 +2290,7 @@ class ImportProcess(ChildProcess): def report_error(self, err, chunk, rows=None, attempts=1, final=True): if self.debug: traceback.print_exc(err) -self.outmsg.send(ImportTaskError(err.__class__.__name__, err.message, rows, attempts, final)) +self.outmsg.send(ImportTaskError(err.__class__.__name__, str(err), rows, attempts, final)) if final: self.update_chunk(rows, chunk)
[14/15] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.5
Merge branch 'cassandra-3.0' into cassandra-3.5 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bed6aae0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bed6aae0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bed6aae0 Branch: refs/heads/trunk Commit: bed6aae00e271846184d9ba7317d4e038701ca86 Parents: daf7606 c9e81ea Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:35 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:35 2016 +0100 -- --
[02/15] cassandra git commit: COPY FROM fails when importing blob
COPY FROM fails when importing blob patch by Stefania Alborghetti; reviewed by Paulo Motta for CASSANDRA-11375 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a45aa62 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a45aa62 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a45aa62 Branch: refs/heads/cassandra-2.2 Commit: 5a45aa62dd57f59753396c5d5541dbf3a0a1b220 Parents: 8b8a3f5 Author: Stefania Alborghetti Authored: Fri Mar 18 16:46:37 2016 +0800 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:56:33 2016 +0100 -- CHANGES.txt| 1 + pylib/cqlshlib/copyutil.py | 16 ++-- 2 files changed, 11 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 7794d4f..65d094f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * COPY FROM fails when importing blob (CASSANDRA-11375) * Backport CASSANDRA-10679 (CASSANDRA-9598) * Don't do defragmentation if reading from repaired sstables (CASSANDRA-10342) * Fix streaming_socket_timeout_in_ms not enforced (CASSANDRA-11286) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/pylib/cqlshlib/copyutil.py -- diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py index ba2a47b..28e08b1 100644 --- a/pylib/cqlshlib/copyutil.py +++ b/pylib/cqlshlib/copyutil.py @@ -1176,7 +1176,7 @@ class FeedingProcess(mp.Process): if rows: sent += self.send_chunk(ch, rows) except Exception, exc: -self.outmsg.send(ImportTaskError(exc.__class__.__name__, exc.message)) +self.outmsg.send(ImportTaskError(exc.__class__.__name__, str(exc))) if reader.exhausted: break @@ -1679,7 +1679,11 @@ class ImportConversion(object): return converters.get(t.typename, convert_unknown)(unprotect(v), ct=t) def convert_blob(v, **_): -return bytearray.fromhex(v[2:]) +try: +return bytearray.fromhex(v[2:]) +except TypeError: +# Work-around for 
Python 2.6 bug +return bytearray.fromhex(unicode(v[2:])) def convert_text(v, **_): return v @@ -1869,7 +1873,7 @@ class ImportConversion(object): try: return [conv(val) for conv, val in zip(converters, row)] except Exception, e: -raise ParseError(e.message) +raise ParseError(str(e)) def get_null_primary_key_message(self, idx): message = "Cannot insert null value for primary key column '%s'." % (self.columns[idx],) @@ -2183,7 +2187,7 @@ class ImportProcess(ChildProcess): try: return conv.convert_row(r) except Exception, err: -errors[err.message].append(r) +errors[str(err)].append(r) return None converted_rows = filter(None, [convert_row(r) for r in rows]) @@ -2248,7 +2252,7 @@ class ImportProcess(ChildProcess): pk = get_row_partition_key_values(row) rows_by_ring_pos[get_ring_pos(ring, pk_to_token_value(pk))].append(row) except Exception, e: -errors[e.message].append(row) +errors[str(e)].append(row) if errors: for msg, rows in errors.iteritems(): @@ -2286,7 +2290,7 @@ class ImportProcess(ChildProcess): def report_error(self, err, chunk, rows=None, attempts=1, final=True): if self.debug: traceback.print_exc(err) -self.outmsg.send(ImportTaskError(err.__class__.__name__, err.message, rows, attempts, final)) +self.outmsg.send(ImportTaskError(err.__class__.__name__, str(err), rows, attempts, final)) if final: self.update_chunk(rows, chunk)
[13/15] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.5
Merge branch 'cassandra-3.0' into cassandra-3.5 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bed6aae0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bed6aae0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bed6aae0 Branch: refs/heads/cassandra-3.5 Commit: bed6aae00e271846184d9ba7317d4e038701ca86 Parents: daf7606 c9e81ea Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:35 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:35 2016 +0100 -- --
[04/15] cassandra git commit: COPY FROM fails when importing blob
COPY FROM fails when importing blob patch by Stefania Alborghetti; reviewed by Paulo Motta for CASSANDRA-11375 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5a45aa62 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5a45aa62 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5a45aa62 Branch: refs/heads/cassandra-3.0 Commit: 5a45aa62dd57f59753396c5d5541dbf3a0a1b220 Parents: 8b8a3f5 Author: Stefania Alborghetti Authored: Fri Mar 18 16:46:37 2016 +0800 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:56:33 2016 +0100 -- CHANGES.txt| 1 + pylib/cqlshlib/copyutil.py | 16 ++-- 2 files changed, 11 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 7794d4f..65d094f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.14 + * COPY FROM fails when importing blob (CASSANDRA-11375) * Backport CASSANDRA-10679 (CASSANDRA-9598) * Don't do defragmentation if reading from repaired sstables (CASSANDRA-10342) * Fix streaming_socket_timeout_in_ms not enforced (CASSANDRA-11286) http://git-wip-us.apache.org/repos/asf/cassandra/blob/5a45aa62/pylib/cqlshlib/copyutil.py -- diff --git a/pylib/cqlshlib/copyutil.py b/pylib/cqlshlib/copyutil.py index ba2a47b..28e08b1 100644 --- a/pylib/cqlshlib/copyutil.py +++ b/pylib/cqlshlib/copyutil.py @@ -1176,7 +1176,7 @@ class FeedingProcess(mp.Process): if rows: sent += self.send_chunk(ch, rows) except Exception, exc: -self.outmsg.send(ImportTaskError(exc.__class__.__name__, exc.message)) +self.outmsg.send(ImportTaskError(exc.__class__.__name__, str(exc))) if reader.exhausted: break @@ -1679,7 +1679,11 @@ class ImportConversion(object): return converters.get(t.typename, convert_unknown)(unprotect(v), ct=t) def convert_blob(v, **_): -return bytearray.fromhex(v[2:]) +try: +return bytearray.fromhex(v[2:]) +except TypeError: +# Work-around for 
Python 2.6 bug +return bytearray.fromhex(unicode(v[2:])) def convert_text(v, **_): return v @@ -1869,7 +1873,7 @@ class ImportConversion(object): try: return [conv(val) for conv, val in zip(converters, row)] except Exception, e: -raise ParseError(e.message) +raise ParseError(str(e)) def get_null_primary_key_message(self, idx): message = "Cannot insert null value for primary key column '%s'." % (self.columns[idx],) @@ -2183,7 +2187,7 @@ class ImportProcess(ChildProcess): try: return conv.convert_row(r) except Exception, err: -errors[err.message].append(r) +errors[str(err)].append(r) return None converted_rows = filter(None, [convert_row(r) for r in rows]) @@ -2248,7 +2252,7 @@ class ImportProcess(ChildProcess): pk = get_row_partition_key_values(row) rows_by_ring_pos[get_ring_pos(ring, pk_to_token_value(pk))].append(row) except Exception, e: -errors[e.message].append(row) +errors[str(e)].append(row) if errors: for msg, rows in errors.iteritems(): @@ -2286,7 +2290,7 @@ class ImportProcess(ChildProcess): def report_error(self, err, chunk, rows=None, attempts=1, final=True): if self.debug: traceback.print_exc(err) -self.outmsg.send(ImportTaskError(err.__class__.__name__, err.message, rows, attempts, final)) +self.outmsg.send(ImportTaskError(err.__class__.__name__, str(err), rows, attempts, final)) if final: self.update_chunk(rows, chunk)
[09/15] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ff00655 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ff00655 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ff00655 Branch: refs/heads/cassandra-3.5 Commit: 6ff006554f09b549414ce6af89ba0f7f2982ce62 Parents: fc972b9 5a45aa6 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:04 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:04 2016 +0100 -- --
[11/15] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c9e81ea7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c9e81ea7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c9e81ea7 Branch: refs/heads/cassandra-3.5 Commit: c9e81ea7fc6a8a2acfbd4778ca846542d250c8c4 Parents: 1de5342 6ff0065 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:21 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:21 2016 +0100 -- --
[12/15] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0
Merge branch 'cassandra-2.2' into cassandra-3.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c9e81ea7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c9e81ea7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c9e81ea7 Branch: refs/heads/cassandra-3.0 Commit: c9e81ea7fc6a8a2acfbd4778ca846542d250c8c4 Parents: 1de5342 6ff0065 Author: Aleksey Yeschenko Authored: Tue Mar 29 17:57:21 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:57:21 2016 +0100 -- --
[jira] [Resolved] (CASSANDRA-11454) 2.2 Documentation conflicts with observed behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-11454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson resolved CASSANDRA-11454. - Resolution: Invalid Fix Version/s: (was: 2.2.x) Reproduced In: (was: 2.2.x) The current behavior is correct, as stated on other tickets as you've noticed. The incorrect documentation is DataStax's, not the Apache project's, and the problem should be referred to them. I'll go ahead and send their documentation team an email. > 2.2 Documentation conflicts with observed behavior > -- > > Key: CASSANDRA-11454 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11454 > Project: Cassandra > Issue Type: Task > Components: Documentation and Website > Environment: CentOS 6.6 > [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] >Reporter: Terry Liu >Priority: Minor > > Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query > would return as soon as it found enough rows to fulfill your limit. > For example, > {noformat} > SELECT COUNT(*) > FROM some_table > LIMIT 1 > {noformat} > would always return a count of 1 as long as there is at least one row in the > table. > I've noticed that Cassandra 2.2 no longer behaves in this way and yet the > documentation continues to suggest otherwise: > http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit > Cassandra 2.2 seems to return the full count despite what you set the LIMIT > to. > Looking through the version changes, it seems likely that the changes for the > following note might be related (from > https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): > {noformat} > Allow count(*) and count(1) to be use as normal aggregation > count() can now be used in aggregation. > {noformat} > If so, the related ticket seems to be > https://issues.apache.org/jira/browse/CASSANDRA-10114. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8888) Compress only inter-dc traffic by default
[ https://issues.apache.org/jira/browse/CASSANDRA-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216325#comment-15216325 ] Aleksey Yeschenko commented on CASSANDRA-8888: -- Committed as [761dfc2c834a10c6a1468c1a4cabe7f0f746df5f|https://github.com/apache/cassandra/commit/761dfc2c834a10c6a1468c1a4cabe7f0f746df5f] to trunk, thanks. > Compress only inter-dc traffic by default > - > > Key: CASSANDRA-8888 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8888 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Matt Stump >Assignee: Kaide Mu > Labels: lhf > Fix For: 3.6 > > Attachments: 8888-3.0.txt > > > Internode compression increases GC load, and can cause high CPU utilization > for high throughput use cases. Very rarely are customers restricted by > intra-DC or cross-DC network bandwidth. I'd rather we optimize for the 75% > of cases where internode compression isn't needed and then selectively enable > it for customers where it would provide a benefit. Currently I'm advising all > field consultants to disable compression by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8888) Compress only inter-dc traffic by default
[ https://issues.apache.org/jira/browse/CASSANDRA-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8888: - Resolution: Fixed Fix Version/s: (was: 3.x) 3.6 Status: Resolved (was: Ready to Commit) > Compress only inter-dc traffic by default > - > > Key: CASSANDRA-8888 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8888 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Matt Stump >Assignee: Kaide Mu > Labels: lhf > Fix For: 3.6 > > Attachments: 8888-3.0.txt > > > Internode compression increases GC load, and can cause high CPU utilization > for high throughput use cases. Very rarely are customers restricted by > intra-DC or cross-DC network bandwidth. I'd rather we optimize for the 75% > of cases where internode compression isn't needed and then selectively enable > it for customers where it would provide a benefit. Currently I'm advising all > field consultants to disable compression by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11454) 2.2 Documentation conflicts with observed behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-11454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-11454: Fix Version/s: 2.2.x > 2.2 Documentation conflicts with observed behavior > -- > > Key: CASSANDRA-11454 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11454 > Project: Cassandra > Issue Type: Task > Components: Documentation and Website > Environment: CentOS 6.6 > [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] >Reporter: Terry Liu >Priority: Minor > Fix For: 2.2.x > > > Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query > would return as soon as it found enough rows to fulfill your limit. > For example, > {noformat} > SELECT COUNT(*) > FROM some_table > LIMIT 1 > {noformat} > would always return a count of 1 as long as there is at least one row in the > table. > I've noticed that Cassandra 2.2 no longer behaves in this way and yet the > documentation continues to suggest otherwise: > http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit > Cassandra 2.2 seems to return the full count despite what you set the LIMIT > to. > Looking through the version changes, it seems likely that the changes for the > following note might be related (from > https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): > {noformat} > Allow count(*) and count(1) to be use as normal aggregation > count() can now be used in aggregation. > {noformat} > If so, the related ticket seems to be > https://issues.apache.org/jira/browse/CASSANDRA-10114. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Compress only inter-dc traffic by default
Repository: cassandra Updated Branches: refs/heads/trunk 93c750ae3 -> 761dfc2c8 Compress only inter-dc traffic by default patch by Kaide Mu; reviewed by Paulo Motta for CASSANDRA-8888 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/761dfc2c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/761dfc2c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/761dfc2c Branch: refs/heads/trunk Commit: 761dfc2c834a10c6a1468c1a4cabe7f0f746df5f Parents: 93c750a Author: Kaide Mu Authored: Fri Mar 25 14:28:51 2016 +0100 Committer: Aleksey Yeschenko Committed: Tue Mar 29 17:51:58 2016 +0100 -- CHANGES.txt | 1 + conf/cassandra.yaml | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/761dfc2c/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 25eff53..4810a12 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.6 + * Compress only inter-dc traffic by default (CASSANDRA-8888) * Add metrics to track write amplification (CASSANDRA-11420) * cassandra-stress: cannot handle "value-less" tables (CASSANDRA-7739) * Add/drop multiple columns in one ALTER TABLE statement (CASSANDRA-10411) http://git-wip-us.apache.org/repos/asf/cassandra/blob/761dfc2c/conf/cassandra.yaml -- diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml index 4abe96e..58bd1b6 100644 --- a/conf/cassandra.yaml +++ b/conf/cassandra.yaml @@ -930,7 +930,7 @@ client_encryption_options: # can be: all - all traffic is compressed # dc - traffic between different datacenters is compressed # none - nothing is compressed. -internode_compression: all +internode_compression: dc # Enable or disable tcp_nodelay for inter-dc communication. # Disabling it will result in larger (but fewer) network packets being sent,
[jira] [Resolved] (CASSANDRA-11455) Re-executing incremental repair does not restore data on wiped node
[ https://issues.apache.org/jira/browse/CASSANDRA-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-11455. - Resolution: Not A Problem This is expected, as other nodes won't include already repaired sstables (and that data is the data that went missing on the wiped node) > Re-executing incremental repair does not restore data on wiped node > --- > > Key: CASSANDRA-11455 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11455 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Paulo Motta > > Reproduction steps: > {noformat} > ccm create test -n 3 -s > ccm node1 stress "write n=100K cl=QUORUM -rate threads=300 -schema > replication(factor=3) > compaction(strategy=org.apache.cassandra.db.compaction.LeveledCompactionStrategy,sstable_size_in_mb=1)" > ccm flush > ccm node1 nodetool repair keyspace1 standard1 > ccm flush > ccm node2 stop > rm -rf ~/.ccm/test/node2/commitlogs/* > rm -rf ~/.ccm/test/node2/data0/keyspace1/* > ccm node2 start > ccm node1 nodetool repair keyspace1 standard1 > ccm node1 stress "read n=100k cl=ONE -rate threads=3" > {noformat} > This is log on node1 (repair coordinator): > {noformat} > INFO [Thread-8] 2016-03-29 13:01:16,990 RepairRunnable.java:125 - Starting > repair command #2, repairing keyspace keyspace1 with repair options > (parallelism: parallel, primary range: false, incremental: true, job threads: > 1, ColumnFamilies: [standard1], dataCenters: [], hosts: [], # of ranges: 3) > INFO [Thread-8] 2016-03-29 13:01:17,021 RepairSession.java:237 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] new session: will sync /127.0.0.1, > /127.0.0.2, /127.0.0.3 on range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] for keyspace1.[standard1] > INFO [Repair#2:1] 2016-03-29 13:01:17,044 RepairJob.java:100 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] requesting 
merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > INFO [Repair#2:1] 2016-03-29 13:01:17,045 RepairJob.java:174 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Requesting merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > DEBUG [AntiEntropyStage:1] 2016-03-29 13:01:17,054 > RepairMessageVerbHandler.java:118 - Validating > ValidationRequest{gcBefore=1458403277} > org.apache.cassandra.repair.messages.ValidationRequest@56ed77cd > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,062 StorageService.java:3100 > - Forcing flush on keyspace keyspace1, CF standard1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,066 > CompactionManager.java:1290 - Created 3 merkle trees with merkle trees size > 3, 0 partitions, 277 bytes > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:123 - > Prepared AEService trees of size 3 for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:233 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partitions > per leaf are: > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:235 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. 
Partition > sizes are: > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,070 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0.1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,070 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > CompactionManager.java:1253 - Validation finished in 4 msec, for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0.2 > INFO [AntiEntro
[jira] [Updated] (CASSANDRA-8888) Compress only inter-dc traffic by default
[ https://issues.apache.org/jira/browse/CASSANDRA-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8888: - Assignee: Kaide Mu > Compress only inter-dc traffic by default > - > > Key: CASSANDRA-8888 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8888 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Matt Stump >Assignee: Kaide Mu > Labels: lhf > Fix For: 3.x > > Attachments: -3.0.txt > > > Internode compression increases GC load, and can cause high CPU utilization > for high throughput use cases. Very rarely are customers restricted by > intra-DC or cross-DC network bandwidth. I'd rather we optimize for the 75% > of cases where internode compression isn't needed and then selectively enable > it for customers where it would provide a benefit. Currently I'm advising all > field consultants disable compression by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11096) Upgrade netty to >= 4.0.34
[ https://issues.apache.org/jira/browse/CASSANDRA-11096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-11096: -- Status: Ready to Commit (was: Patch Available) > Upgrade netty to >= 4.0.34 > -- > > Key: CASSANDRA-11096 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11096 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Brandon Williams >Assignee: Benjamin Lerer > Fix For: 3.6 > > > Amongst other things, the native protocol will not bind ipv6 easily (see > CASSANDRA-11047) until we upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216267#comment-15216267 ] Jonathan Ellis commented on CASSANDRA-9666: --- bq. the problems that DTCS still suffers from with the latest patches, according to what's being said here, would perhaps be best attacked by a breaking change I haven't seen any problems asserted since my comment that "my understanding is that TWCS is basically a subset of DTCS, so I'm reluctant to include both." (I don't think max vs min timestamp is a problem so much as a different choice.) Am I missing something? > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very > different, even if it is heavily inspired by DTCS. > Specifically, rather than creating a number of windows of ever increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressively compact down > to a single sstable once that window is no longer current. 
The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). > The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using an unusual > selection criteria, which prefers files with earlier timestamps, but ignores > sizes. In TimeWindowCompactionStrategy, the first window data will be > compacted with the well tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction) > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction) > - Small, old sstables resulting from streams of any kind will be swiftly and > aggressively compacted with the other sstables matching their similar > maxTimestamp, without causing sstables in neighboring windows to grow in size. > - The configuration options are explicit and straightforward - the tuning > parameters leave little room for error. The window is set in common, easily > understandable terms such as “12 hours”, “1 Day”, “30 days”. The > minute/hour/day options are granular enough for users keeping data for hours, > and users keeping data for years. 
> - There is no explicitly configurable max sstable age, though sstables will > naturally stop compacting once new data is written in that window. > - Streaming operations can create sstables with old timestamps, and they'll > naturally be joined together with sstables in the same time bucket. This is > true for bootstrap/repair/sstableloader/removenode. > - It remains true that if old data and new data is written into the memtable > at the same time, the resulting sstables will be treated as if they were new > sstables, however, that no longer negatively impacts the compaction > strategy’s selection criteria for older windows. > Patch provided for : > - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 > - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2 > - trunk (post-8099): https://github.com/jeffjirsa/cassandra/commits/twcs > Rebased, force-pushed July 18, with bug fixes for estimated p
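The bucketing described above (truncate each sstable's max timestamp to the configured unit × size) can be sketched in a few lines. This is an illustrative Python model with assumed names and epoch-seconds arithmetic, not the actual TWCS implementation:

```python
# Illustrative sketch of TWCS-style window bucketing; function name and
# epoch-seconds arithmetic are assumptions, not the actual TWCS code.

HOUR = 3600  # seconds

def window_lower_bound(timestamp_s, unit_seconds, window_size):
    """Lower bound of the time window containing timestamp_s.

    All sstables whose max timestamp falls in the same
    [lower, lower + unit_seconds * window_size) interval are candidates
    to be compacted together once the window is no longer current."""
    window_s = unit_seconds * window_size
    return timestamp_s - (timestamp_s % window_s)

# unit = hours, size = 6 -> four windows (so roughly 4 sstables) per day
ts = 1_459_255_277  # an arbitrary epoch timestamp in seconds
lb = window_lower_bound(ts, HOUR, 6)
assert lb % (6 * HOUR) == 0 and lb <= ts < lb + 6 * HOUR

# Two timestamps inside the same 6-hour block land in the same window,
# so streamed-in old sstables rejoin their original bucket:
assert window_lower_bound(lb + 100, HOUR, 6) == window_lower_bound(lb + 20_000, HOUR, 6)
```

Because the bucket is a pure function of timestamp, sstables arriving late via hints, read repair, or streaming fall deterministically into their original window, which is the property the description above leans on.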
[jira] [Commented] (CASSANDRA-11455) Re-executing incremental repair does not restore data on wiped node
[ https://issues.apache.org/jira/browse/CASSANDRA-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216254#comment-15216254 ] Yuki Morishita commented on CASSANDRA-11455: Incremental repair is not for restoring a wiped node. And as your edited comment notes, {{--full}} is used for that, IIRC. > Re-executing incremental repair does not restore data on wiped node > --- > > Key: CASSANDRA-11455 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11455 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Paulo Motta > > Reproduction steps: > {noformat} > ccm create test -n 3 -s > ccm node1 stress "write n=100K cl=QUORUM -rate threads=300 -schema > replication(factor=3) > compaction(strategy=org.apache.cassandra.db.compaction.LeveledCompactionStrategy,sstable_size_in_mb=1)" > ccm flush > ccm node1 nodetool repair keyspace1 standard1 > ccm flush > ccm node2 stop > rm -rf ~/.ccm/test/node2/commitlogs/* > rm -rf ~/.ccm/test/node2/data0/keyspace1/* > ccm node2 start > ccm node1 nodetool repair keyspace1 standard1 > ccm node1 stress "read n=100k cl=ONE -rate threads=3" > {noformat} > This is log on node1 (repair coordinator): > {noformat} > INFO [Thread-8] 2016-03-29 13:01:16,990 RepairRunnable.java:125 - Starting > repair command #2, repairing keyspace keyspace1 with repair options > (parallelism: parallel, primary range: false, incremental: true, job threads: > 1, ColumnFamilies: [standard1], dataCenters: [], hosts: [], # of ranges: 3) > INFO [Thread-8] 2016-03-29 13:01:17,021 RepairSession.java:237 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] new session: will sync /127.0.0.1, > /127.0.0.2, /127.0.0.3 on range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] for keyspace1.[standard1] > INFO [Repair#2:1] 2016-03-29 13:01:17,044 RepairJob.java:100 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] requesting 
merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > INFO [Repair#2:1] 2016-03-29 13:01:17,045 RepairJob.java:174 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Requesting merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > DEBUG [AntiEntropyStage:1] 2016-03-29 13:01:17,054 > RepairMessageVerbHandler.java:118 - Validating > ValidationRequest{gcBefore=1458403277} > org.apache.cassandra.repair.messages.ValidationRequest@56ed77cd > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,062 StorageService.java:3100 > - Forcing flush on keyspace keyspace1, CF standard1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,066 > CompactionManager.java:1290 - Created 3 merkle trees with merkle trees size > 3, 0 partitions, 277 bytes > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:123 - > Prepared AEService trees of size 3 for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:233 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partitions > per leaf are: > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:235 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. 
Partition > sizes are: > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,070 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0.1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,070 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > CompactionManager.java:1253 - Validation finished in 4 msec, for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0.2 > INFO [AntiEntr
[jira] [Updated] (CASSANDRA-11455) Re-executing incremental repair does not restore data on wiped node
[ https://issues.apache.org/jira/browse/CASSANDRA-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-11455: Description: Reproduction steps: {noformat} ccm create test -n 3 -s ccm node1 stress "write n=100K cl=QUORUM -rate threads=300 -schema replication(factor=3) compaction(strategy=org.apache.cassandra.db.compaction.LeveledCompactionStrategy,sstable_size_in_mb=1)" ccm flush ccm node1 nodetool repair keyspace1 standard1 ccm flush ccm node2 stop rm -rf ~/.ccm/test/node2/commitlogs/* rm -rf ~/.ccm/test/node2/data0/keyspace1/* ccm node2 start ccm node1 nodetool repair keyspace1 standard1 ccm node1 stress "read n=100k cl=ONE -rate threads=3" {noformat} This is log on node1 (repair coordinator): {noformat} INFO [Thread-8] 2016-03-29 13:01:16,990 RepairRunnable.java:125 - Starting repair command #2, repairing keyspace keyspace1 with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [standard1], dataCenters: [], hosts: [], # of ranges: 3) INFO [Thread-8] 2016-03-29 13:01:17,021 RepairSession.java:237 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] new session: will sync /127.0.0.1, /127.0.0.2, /127.0.0.3 on range [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]] for keyspace1.[standard1] INFO [Repair#2:1] 2016-03-29 13:01:17,044 RepairJob.java:100 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] requesting merkle trees for standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) INFO [Repair#2:1] 2016-03-29 13:01:17,045 RepairJob.java:174 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Requesting merkle trees for standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) DEBUG [AntiEntropyStage:1] 2016-03-29 13:01:17,054 RepairMessageVerbHandler.java:118 - Validating ValidationRequest{gcBefore=1458403277} org.apache.cassandra.repair.messages.ValidationRequest@56ed77cd DEBUG 
[ValidationExecutor:3] 2016-03-29 13:01:17,062 StorageService.java:3100 - Forcing flush on keyspace keyspace1, CF standard1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,066 CompactionManager.java:1290 - Created 3 merkle trees with merkle trees size 3, 0 partitions, 277 bytes DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:123 - Prepared AEService trees of size 3 for [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]]] DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:233 - Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partitions per leaf are: DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:235 - Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. 
Partition sizes are: INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,070 RepairSession.java:181 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for standard1 from /127.0.0.1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,070 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 CompactionManager.java:1253 - Validation finished in 4 msec, for [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]]] INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for standard1 from /127.0.0.2 INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for standard1 from /127.0.0.3 INFO [RepairJobTask:1] 2016-03-29 13:01:17,078 SyncTask.java:66 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Endpoints /127.0.0.2 and /127.0.0.3 are consistent for standard1 INFO [RepairJobTask:1] 2016-03-29 13:01:17,079 SyncTask.java:66 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Endpoints /127.0.0.3 and /127.0.0.1 are consistent for standard1 INFO [RepairJobTask:3] 2016-03-29 13:01:17,079 SyncTask.java:66 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Endpoints /127.0.0.2 and /127.0.0.1 are consistent for standard1 INFO [RepairJob
[jira] [Commented] (CASSANDRA-11455) Re-executing incremental repair does not restore data on wiped node
[ https://issues.apache.org/jira/browse/CASSANDRA-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216242#comment-15216242 ] Paulo Motta commented on CASSANDRA-11455: - Got this while working on another issue and registered it without investigation (will have a better look later). Any ideas [~yukim] [~krummas]? > Re-executing incremental repair does not restore data on wiped node > --- > > Key: CASSANDRA-11455 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11455 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Paulo Motta > > Reproduction steps: > {noformat} > ccm create test -n 3 -s > ccm node1 stress "write n=100K cl=QUORUM -rate threads=300 -schema > replication(factor=3) > compaction(strategy=org.apache.cassandra.db.compaction.LeveledCompactionStrategy,sstable_size_in_mb=1)" > ccm flush > ccm node1 nodetool repair keyspace1 standard1 > ccm flush > ccm node2 stop > rm -rf ~/.ccm/test/node2/commitlogs/* > rm -rf ~/.ccm/test/node2/data0/keyspace1/* > ccm node2 start > ccm node1 nodetool repair keyspace1 standard1 > ccm node1 stress "read n=100k cl=ONE -rate threads=3" > {noformat} > This is log on node1 (repair coordinator): > {noformat} > INFO [Thread-8] 2016-03-29 13:01:16,990 RepairRunnable.java:125 - Starting > repair command #2, repairing keyspace keyspace1 with repair options > (parallelism: parallel, primary range: false, incremental: true, job threads: > 1, ColumnFamilies: [standard1], dataCenters: [], hosts: [], # of ranges: 3) > INFO [Thread-8] 2016-03-29 13:01:17,021 RepairSession.java:237 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] new session: will sync /127.0.0.1, > /127.0.0.2, /127.0.0.3 on range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] for keyspace1.[standard1] > INFO [Repair#2:1] 2016-03-29 13:01:17,044 RepairJob.java:100 - [repair > 
#784bf8d0-f5c7-11e5-9f80-d30f63ad009f] requesting merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > INFO [Repair#2:1] 2016-03-29 13:01:17,045 RepairJob.java:174 - [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Requesting merkle trees for standard1 > (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > DEBUG [AntiEntropyStage:1] 2016-03-29 13:01:17,054 > RepairMessageVerbHandler.java:118 - Validating > ValidationRequest{gcBefore=1458403277} > org.apache.cassandra.repair.messages.ValidationRequest@56ed77cd > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,062 StorageService.java:3100 > - Forcing flush on keyspace keyspace1, CF standard1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,066 > CompactionManager.java:1290 - Created 3 merkle trees with merkle trees size > 3, 0 partitions, 277 bytes > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:123 - > Prepared AEService trees of size 3 for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:233 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partitions > per leaf are: > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:235 - > Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. 
Partition > sizes are: > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,070 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0.1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,070 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > EstimatedHistogram.java:304 - [0..0]: 1 > DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 > CompactionManager.java:1253 - Validation finished in 4 msec, for [repair > #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, > [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]]] > INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - > [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for > standard1 from /127.0.0
[jira] [Created] (CASSANDRA-11455) Re-executing incremental repair does not restore data on wiped node
Paulo Motta created CASSANDRA-11455: --- Summary: Re-executing incremental repair does not restore data on wiped node Key: CASSANDRA-11455 URL: https://issues.apache.org/jira/browse/CASSANDRA-11455 Project: Cassandra Issue Type: Bug Components: Streaming and Messaging Reporter: Paulo Motta Reproduction steps: {noformat} ccm create test -n 3 -s ccm node1 stress "write n=100K cl=QUORUM -rate threads=300 -schema replication(factor=3) compaction(strategy=org.apache.cassandra.db.compaction.LeveledCompactionStrategy,sstable_size_in_mb=1)" ccm flush ccm node1 nodetool repair keyspace1 standard1 ccm flush ccm node2 stop rm -rf ~/.ccm/test/node2/commitlogs/* rm -rf ~/.ccm/test/node2/data0/keyspace1/* ccm node2 start ccm node1 nodetool repair keyspace1 standard1 ccm node1 stress "read n=100k cl=ONE -rate threads=3" {noformat} This is log on node1 (repair coordinator): {noformat} INFO [Thread-8] 2016-03-29 13:01:16,990 RepairRunnable.java:125 - Starting repair command #2, repairing keyspace keyspace1 with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [standard1], dataCenters: [], hosts: [], # of ranges: 3) INFO [Thread-8] 2016-03-29 13:01:17,021 RepairSession.java:237 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] new session: will sync /127.0.0.1, /127.0.0.2, /127.0.0.3 on range [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]] for keyspace1.[standard1] INFO [Repair#2:1] 2016-03-29 13:01:17,044 RepairJob.java:100 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] requesting merkle trees for standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) INFO [Repair#2:1] 2016-03-29 13:01:17,045 RepairJob.java:174 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Requesting merkle trees for standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) DEBUG [AntiEntropyStage:1] 2016-03-29 13:01:17,054 RepairMessageVerbHandler.java:118 - Validating 
ValidationRequest{gcBefore=1458403277} org.apache.cassandra.repair.messages.ValidationRequest@56ed77cd DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,062 StorageService.java:3100 - Forcing flush on keyspace keyspace1, CF standard1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,066 CompactionManager.java:1290 - Created 3 merkle trees with merkle trees size 3, 0 partitions, 277 bytes DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:123 - Prepared AEService trees of size 3 for [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]]] DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:233 - Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. Partitions per leaf are: DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,067 Validator.java:235 - Validated 0 partitions for 784bf8d0-f5c7-11e5-9f80-d30f63ad009f. 
Partition sizes are: INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,070 RepairSession.java:181 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for standard1 from /127.0.0.1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,070 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 EstimatedHistogram.java:304 - [0..0]: 1 DEBUG [ValidationExecutor:3] 2016-03-29 13:01:17,071 CompactionManager.java:1253 - Validation finished in 4 msec, for [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f on keyspace1/standard1, [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]]] INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for standard1 from /127.0.0.2 INFO [AntiEntropyStage:1] 2016-03-29 13:01:17,077 RepairSession.java:181 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Received merkle tree for standard1 from /127.0.0.3 INFO [RepairJobTask:1] 2016-03-29 13:01:17,078 SyncTask.java:66 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Endpoints /127.0.0.2 and /127.0.0.3 are consistent for standard1 INFO [RepairJobTask:1] 2016-03-29 13:01:17,079 SyncTask.java:66 - [repair #784bf8d0-f5c7-11e5-9f80-d30f63ad009f] Endpoints /127.0.0.3 and /127.0.0.1 are consistent for standard1
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216233#comment-15216233 ] Jack Krupansky commented on CASSANDRA-11383: Thanks, [~jrwest] and [~doanduyhai]. I think I finally have the SASI terminology down now - SPARSE mode means that the index is sparse (few index entries per original column value) while the column data is dense (many distinct values). And non-SPARSE (AKA PREFIX) mode, the default mode, supports any cardinality of data, especially the low-cardinality data that SPARSE mode does not support. Maybe that leaves one last question as to whether non-SPARSE (PREFIX) mode is considered advisable/recommended for high-cardinality column data, where SPARSE mode is nominally a better choice. Maybe that is strictly a matter of whether the prefix/LIKE feature is to be utilized - if so, then PREFIX mode is required, but if not, SPARSE mode sounds like the better choice. But I don't have a handle on the internal index structures to know if that's absolutely the case - that a PREFIX index for sparse data would necessarily be larger and/or slower than a SPARSE index for high-cardinality data. I would hope so, but it would be good to have that confirmed. 
> Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11372) Make CQL grammar more easily extensible
[ https://issues.apache.org/jira/browse/CASSANDRA-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216230#comment-15216230 ] Aleksey Yeschenko commented on CASSANDRA-11372: --- LGTM > Make CQL grammar more easily extensible > > > Key: CASSANDRA-11372 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11372 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Mike Adamson >Assignee: Mike Adamson > Fix For: 3.x > > > The CQL grammar ({{Cql.g}}) is currently a composite grammar and, as such, is > not easy to extend. > We now have a number of 3rd parties who are extending the grammar (custom > index grammars, for example) and it would be helpful if the grammar could be > split in a parser and lexer in order to make extension easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS
[ https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216231#comment-15216231 ] Björn Hegerfors commented on CASSANDRA-9666: So, I also think that the tiering should have some value, though I lack concrete benchmarks to show it. That said, the problems that DTCS still suffers from with the latest patches, according to what's being said here, would perhaps be best attacked by a breaking change. So I see an opportunity to do something about it, whether the name stays or changes to TWCS (I kind of like that name). For example, what's holding back CASSANDRA-11056 is a fear that the transition won't be completely smooth. The names of the options in DTCS are something that I would like to change towards what TWCS has. max_window_size has become more central than base_time_seconds. Renaming max_window_size to window_size and base_time_seconds to min_window_size (or something even more obscure, or even removing the option) would be better. If we require user interaction anyway, then applying CASSANDRA-11056 while we're at it would be easier. Also, the ideas in CASSANDRA-9013 and CASSANDRA-11407 and the recent dev@ mail thread show another way of doing the tiering that would arguably be easier to understand. Combine these suggestions, and we end up expressing the windowing forwards in time, rather than backwards. DTCS was always about "what's the size of the smallest window", "how many of those make a larger window" and "how long do we wait until we stop compacting a value". Instead, we would get the TWCS way, which people seem to like more, where the main question is "how large are the windows that we put the values into". And beyond that there's this ramp-down in window sizes just towards the end to accommodate very write-heavy workloads, but the options controlling that part are more of a detail, like the min_sstable_size option of STCS. Is something like this what people really want? 
I.e. TWCS, but with tiering near the end of the timeline, that users don't need to know or care about? > Provide an alternative to DTCS > -- > > Key: CASSANDRA-9666 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9666 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeff Jirsa >Assignee: Jeff Jirsa > Fix For: 2.1.x, 2.2.x > > Attachments: dtcs-twcs-io.png, dtcs-twcs-load.png > > > DTCS is great for time series data, but it comes with caveats that make it > difficult to use in production (typical operator behaviors such as bootstrap, > removenode, and repair have MAJOR caveats as they relate to > max_sstable_age_days, and hints/read repair break the selection algorithm). > I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices > the tiered nature of DTCS in order to address some of DTCS' operational > shortcomings. I believe it is necessary to propose an alternative rather than > simply adjusting DTCS, because it fundamentally removes the tiered nature in > order to remove the parameter max_sstable_age_days - the result is very very > different, even if it is heavily inspired by DTCS. > Specifically, rather than creating a number of windows of ever increasing > sizes, this strategy allows an operator to choose the window size, compact > with STCS within the first window of that size, and aggressive compact down > to a single sstable once that window is no longer current. The window size is > a combination of unit (minutes, hours, days) and size (1, etc), such that an > operator can expect all data using a block of that size to be compacted > together (that is, if your unit is hours, and size is 6, you will create > roughly 4 sstables per day, each one containing roughly 6 hours of data). 
> The result addresses a number of the problems with > DateTieredCompactionStrategy: > - At the present time, DTCS’s first window is compacted using an unusual > selection criteria, which prefers files with earlier timestamps, but ignores > sizes. In TimeWindowCompactionStrategy, the first window data will be > compacted with the well tested, fast, reliable STCS. All STCS options can be > passed to TimeWindowCompactionStrategy to configure the first window’s > compaction behavior. > - HintedHandoff may put old data in new sstables, but it will have little > impact other than slightly reduced efficiency (sstables will cover a wider > range, but the old timestamps will not impact sstable selection criteria > during compaction) > - ReadRepair may put old data in new sstables, but it will have little impact > other than slightly reduced efficiency (sstables will cover a wider range, > but the old timestamps will not impact sstable selection criteria during > compaction) > - Small, old sstables resultin
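[Editor's note: the window bucketing described in the quoted TWCS proposal above (unit = hours, size = 6 giving roughly 4 sstables per day) can be sketched as follows. This is an illustrative sketch only, not Cassandra's actual TimeWindowCompactionStrategy code; the function name and parameters are invented for illustration.]

```python
# Hypothetical sketch of TWCS-style window assignment: each sstable is
# bucketed by flooring its max timestamp to a fixed window boundary, so
# with unit=hours and size=6, a day of data lands in ~4 windows.
def window_start(timestamp_s, window_unit_s=3600, window_size=6):
    span = window_unit_s * window_size       # e.g. 6 hours in seconds
    return (timestamp_s // span) * span      # floor to the window boundary

# sstables whose max timestamps fall in the same 6h block share a window
# and get compacted together once the window is no longer current
assert window_start(0) == window_start(5 * 3600)   # same 6h window
assert window_start(0) != window_start(7 * 3600)   # next window
```

Old timestamps delivered late (hints, read repair) merely widen the range a window's sstable covers; they never change which window the sstable belongs to, which is the operational point the proposal makes.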
[jira] [Updated] (CASSANDRA-11372) Make CQL grammar more easily extensible
[ https://issues.apache.org/jira/browse/CASSANDRA-11372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-11372: -- Status: Ready to Commit (was: Patch Available) > Make CQL grammar more easily extensible > > > Key: CASSANDRA-11372 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11372 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Mike Adamson >Assignee: Mike Adamson > Fix For: 3.x > > > The CQL grammar ({{Cql.g}}) is currently a composite grammar and, as such, is > not easy to extend. > We now have a number of 3rd parties who are extending the grammar (custom > index grammars, for example) and it would be helpful if the grammar could be > split in a parser and lexer in order to make extension easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8777) Streaming operations should log both endpoint and port associated with the operation
[ https://issues.apache.org/jira/browse/CASSANDRA-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216219#comment-15216219 ] Paulo Motta commented on CASSANDRA-8777: Thanks for the patch. LGTM, I moved the CHANGES message up to the top (as it should be), and also made a minor change on the stream error message. Updated patch and tests are below (will mark as ready to commit when CI is happy): ||trunk|| |[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-8777]| |[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8777-testall/lastCompletedBuild/testReport/]| |[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-8777-dtest/lastCompletedBuild/testReport/]| > Streaming operations should log both endpoint and port associated with the > operation > > > Key: CASSANDRA-8777 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8777 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeremy Hanna > Labels: lhf > Fix For: 2.1.x > > Attachments: 8777-2.2.txt > > > Currently we log the endpoint for a streaming operation. If the port has > been overridden, it would be valuable to know that that setting is getting > picked up. Therefore, when logging the endpoint address, it would be nice to > also log the port it's trying to use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11395) dtest failure in upgrade_tests.cql_tests.TestCQLNodes3RF3_2_1_UpTo_2_2_HEAD.cas_and_list_index_test
[ https://issues.apache.org/jira/browse/CASSANDRA-11395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch updated CASSANDRA-11395: --- Resolution: Fixed Status: Resolved (was: Patch Available) > dtest failure in > upgrade_tests.cql_tests.TestCQLNodes3RF3_2_1_UpTo_2_2_HEAD.cas_and_list_index_test > --- > > Key: CASSANDRA-11395 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11395 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Philip Thompson >Assignee: Russ Hatch > Labels: dtest > > {code} > Expected [[0, ['foo', 'bar'], 'foobar']] from SELECT * FROM test, but got > [[0, [u'foi', u'bar'], u'foobar']] > {code} > example failure: > http://cassci.datastax.com/job/upgrade_tests-all/24/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3_2_1_UpTo_2_2_HEAD/cas_and_list_index_test > Failed on CassCI build upgrade_tests-all #24 > Probably a consistency issue in the test code, but I haven't looked into it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11454) 2.2 Documentation conflicts with observed behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-11454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terry Liu updated CASSANDRA-11454: -- Description: Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query would return as soon as it found enough rows to fulfill your limit. For example, {noformat} SELECT COUNT(*) FROM some_table LIMIT 1 {noformat} would always return a count of 1 as long as there is at least one row in the table. I've noticed that Cassandra 2.2 no longer behaves in this way and yet the documentation continues to suggest otherwise: http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit Cassandra 2.2 seems to return the true count despite what you set the LIMIT to. Looking through the version changes, it seems likely that the changes for the following note might be related (from https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): {noformat} Allow count(*) and count(1) to be use as normal aggregation count() can now be used in aggregation. {noformat} If so, the related ticket seems to be https://issues.apache.org/jira/browse/CASSANDRA-10114. was: Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query would return as soon as it found enough rows to fulfill your limit. For example, {noformat} SELECT COUNT(*) FROM some_table LIMIT 1 {noformat} would always return a count of 1 as long as there is at least one row in the table. 
I've noticed that Cassandra 2.2 no longer behaves in this way and yet the documentation continues to suggest otherwise: http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit Looking through the version changes, it seems likely that the changes for the following note might be related (from https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): {noformat} Allow count(*) and count(1) to be use as normal aggregation count() can now be used in aggregation. {noformat} If so, the related ticket seems to be https://issues.apache.org/jira/browse/CASSANDRA-10114. > 2.2 Documentation conflicts with observed behavior > -- > > Key: CASSANDRA-11454 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11454 > Project: Cassandra > Issue Type: Task > Components: Documentation and Website > Environment: CentOS 6.6 > [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] >Reporter: Terry Liu >Priority: Minor > > Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query > would return as soon as it found enough rows to fulfill your limit. > For example, > {noformat} > SELECT COUNT(*) > FROM some_table > LIMIT 1 > {noformat} > would always return a count of 1 as long as there is at least one row in the > table. > I've noticed that Cassandra 2.2 no longer behaves in this way and yet the > documentation continues to suggest otherwise: > http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit > Cassandra 2.2 seems to return the true count despite what you set the LIMIT > to. > Looking through the version changes, it seems likely that the changes for the > following note might be related (from > https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): > {noformat} > Allow count(*) and count(1) to be use as normal aggregation > count() can now be used in aggregation. 
> {noformat} > If so, the related ticket seems to be > https://issues.apache.org/jira/browse/CASSANDRA-10114. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11454) 2.2 Documentation conflicts with observed behavior
[ https://issues.apache.org/jira/browse/CASSANDRA-11454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Terry Liu updated CASSANDRA-11454: -- Description: Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query would return as soon as it found enough rows to fulfill your limit. For example, {noformat} SELECT COUNT(*) FROM some_table LIMIT 1 {noformat} would always return a count of 1 as long as there is at least one row in the table. I've noticed that Cassandra 2.2 no longer behaves in this way and yet the documentation continues to suggest otherwise: http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit Cassandra 2.2 seems to return the full count despite what you set the LIMIT to. Looking through the version changes, it seems likely that the changes for the following note might be related (from https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): {noformat} Allow count(*) and count(1) to be use as normal aggregation count() can now be used in aggregation. {noformat} If so, the related ticket seems to be https://issues.apache.org/jira/browse/CASSANDRA-10114. was: Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query would return as soon as it found enough rows to fulfill your limit. For example, {noformat} SELECT COUNT(*) FROM some_table LIMIT 1 {noformat} would always return a count of 1 as long as there is at least one row in the table. I've noticed that Cassandra 2.2 no longer behaves in this way and yet the documentation continues to suggest otherwise: http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit Cassandra 2.2 seems to return the true count despite what you set the LIMIT to. 
Looking through the version changes, it seems likely that the changes for the following note might be related (from https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): {noformat} Allow count(*) and count(1) to be use as normal aggregation count() can now be used in aggregation. {noformat} If so, the related ticket seems to be https://issues.apache.org/jira/browse/CASSANDRA-10114. > 2.2 Documentation conflicts with observed behavior > -- > > Key: CASSANDRA-11454 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11454 > Project: Cassandra > Issue Type: Task > Components: Documentation and Website > Environment: CentOS 6.6 > [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] >Reporter: Terry Liu >Priority: Minor > > Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query > would return as soon as it found enough rows to fulfill your limit. > For example, > {noformat} > SELECT COUNT(*) > FROM some_table > LIMIT 1 > {noformat} > would always return a count of 1 as long as there is at least one row in the > table. > I've noticed that Cassandra 2.2 no longer behaves in this way and yet the > documentation continues to suggest otherwise: > http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit > Cassandra 2.2 seems to return the full count despite what you set the LIMIT > to. > Looking through the version changes, it seems likely that the changes for the > following note might be related (from > https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): > {noformat} > Allow count(*) and count(1) to be use as normal aggregation > count() can now be used in aggregation. > {noformat} > If so, the related ticket seems to be > https://issues.apache.org/jira/browse/CASSANDRA-10114. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11454) 2.2 Documentation conflicts with observed behavior
Terry Liu created CASSANDRA-11454: - Summary: 2.2 Documentation conflicts with observed behavior Key: CASSANDRA-11454 URL: https://issues.apache.org/jira/browse/CASSANDRA-11454 Project: Cassandra Issue Type: Task Components: Documentation and Website Environment: CentOS 6.6 [cqlsh 5.0.1 | Cassandra 2.2.5 | CQL spec 3.3.1 | Native protocol v4] Reporter: Terry Liu Priority: Minor Cassandra 2.1 allowed you to LIMIT a COUNT and have it mean that the query would return as soon as it found enough rows to fulfill your limit. For example, {noformat} SELECT COUNT(*) FROM some_table LIMIT 1 {noformat} would always return a count of 1 as long as there is at least one row in the table. I've noticed that Cassandra 2.2 no longer behaves in this way and yet the documentation continues to suggest otherwise: http://docs.datastax.com/en/cql/3.3/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__specifying-rows-returned-using-limit Looking through the version changes, it seems likely that the changes for the following note might be related (from https://docs.datastax.com/en/cassandra/2.2/cassandra/features.html): {noformat} Allow count(*) and count(1) to be use as normal aggregation count() can now be used in aggregation. {noformat} If so, the related ticket seems to be https://issues.apache.org/jira/browse/CASSANDRA-10114. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
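[Editor's note: the semantic change reported above can be summarized with a small sketch. This is an editorial illustration of the two behaviors, not Cassandra code; the function names are invented.]

```python
# Sketch of the LIMIT/COUNT semantics described in CASSANDRA-11454:
# pre-2.2, LIMIT effectively capped the count itself; from 2.2 on,
# COUNT(*) is a normal aggregate that yields a single result row, so
# LIMIT applies to result rows and the full count is returned.
def count_pre22(rows, limit):
    # 2.1-style: stop counting as soon as `limit` rows have been seen
    n = 0
    for _ in rows:
        n += 1
        if n >= limit:
            break
    return n

def count_22plus(rows, limit):
    # 2.2-style: aggregate over all rows; LIMIT caps result rows (1 anyway)
    return sum(1 for _ in rows)

rows = range(100)
assert count_pre22(rows, 1) == 1        # old behavior: count capped at 1
assert count_22plus(rows, 1) == 100     # new behavior: true count returned
```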
[jira] [Commented] (CASSANDRA-8643) merkle tree creation fails with NoSuchElementException
[ https://issues.apache.org/jira/browse/CASSANDRA-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216195#comment-15216195 ] Thom Valley commented on CASSANDRA-8643: Just saw this on a 2.1.13 cluster during repair on Friday (only one instance on 210 nodes):
INFO [AntiEntropyStage:1] 2016-03-25 00:36:32,753 Validator.java:257 - [repair #9000eb10-f221-11e5-b4de-094c21294f66] Sending completed merkle tree to /10.255.226.29 for cds/relation_control
ERROR [ValidationExecutor:1850] 2016-03-25 00:36:35,044 Validator.java:245 - Failed creating a merkle tree for [repair #6fc9d910-f221-11e5-b4de-094c21294f66 on domain_1300/xml_doc_1300, (-9123863930802393274,-9122148273801490227]], /10.255.226.29 (see log for details)
ERROR [ValidationExecutor:1850] 2016-03-25 00:36:35,055 CassandraDaemon.java:229 - Exception in thread Thread[ValidationExecutor:1850,1,main]
java.util.NoSuchElementException: null
at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:154) ~[guava-16.0.1.jar:na]
at org.apache.cassandra.repair.Validator.add(Validator.java:138) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1051) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:89) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:662) ~[cassandra-all-2.1.13.1131.jar:2.1.13.1131]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> merkle tree creation fails with NoSuchElementException > -- > > Key: CASSANDRA-8643 > URL: 
https://issues.apache.org/jira/browse/CASSANDRA-8643 > Project: Cassandra > Issue Type: Bug > Environment: We are running on a three node cluster with three in > replication(C* 2.1.1). It uses a default C* installation and STCS. >Reporter: Jan Karlsson > Fix For: 2.1.3 > > > We have a problem that we encountered during testing over the weekend. > During the tests we noticed that repairs started to fail. This error has > occured on multiple non-coordinator nodes during repair. It also ran at least > once without producing this error. > We run repair -pr on all nodes on different days. CPU values were around 40% > and disk was 50% full. > From what I understand, the coordinator asked for merkle trees from the other > two nodes. However one of the nodes fails to create his merkle tree. > Unfortunately we do not have a way to reproduce this problem. > The coordinator receives: > {noformat} > 2015-01-09T17:55:57.091+0100 INFO [RepairJobTask:4] RepairJob.java:145 > [repair #59455950-9820-11e4-b5c1-7797064e1316] requesting merkle trees for > censored (to [/xx.90, /xx.98, /xx.82]) > 2015-01-09T17:55:58.516+0100 INFO [AntiEntropyStage:1] > RepairSession.java:171 [repair #59455950-9820-11e4-b5c1-7797064e1316] > Received merkle tree for censored from /xx.90 > 2015-01-09T17:55:59.581+0100 ERROR [AntiEntropySessions:76] > RepairSession.java:303 [repair #59455950-9820-11e4-b5c1-7797064e1316] session > completed with the following error > org.apache.cassandra.exceptions.RepairException: [repair > #59455950-9820-11e4-b5c1-7797064e1316 on censored/censored, > (-6476420463551243930,-6471459119674373580]] Validation failed in /xx.98 > at > org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) > ~[apache-cassandra-2.1.1.jar:2.1.1] > at > org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:384) > ~[apache-cassandra-2.1.1.jar:2.1.1] > at > 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:126) > ~[apache-cassandra-2.1.1.jar:2.1.1] > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) > ~[apache-cassandra-2.1.1.jar:2.1.1] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_51] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_51] > at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51] > 2015-01-09T17:55:59.582+0100 ERROR [AntiEntropySessions:76] > CassandraDaemon.java:153 Exception in thread > Thread[AntiEntropySessions:76,5,RMI Runtime] > java.lang.RuntimeException: org.a
[jira] [Commented] (CASSANDRA-7779) Add option to sstableloader to only stream to the local dc
[ https://issues.apache.org/jira/browse/CASSANDRA-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216190#comment-15216190 ] Thom Valley commented on CASSANDRA-7779: This grows to an even bigger scale when you move up to 5 DCs. SSTABLELOADER is very inefficient when trying to load multiple DCs. Most global multi-DC implementations are bandwidth sensitive and streaming RF copies of the data to each target DC is very expensive / can be impacted by bandwidth limitations. Being able to LOAD data to a single DC and have Cassandra do the replication to the additional DCs would be much more efficient, as Cassandra does a great job of limiting resource consumption. I realize that's not really SSTABLELOADER as it exists today, but didn't want to file yet another ticket for something so closely related. > Add option to sstableloader to only stream to the local dc > -- > > Key: CASSANDRA-7779 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7779 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Nick Bailey > Fix For: 2.1.x > > > This is meant to be a potential workaround for CASSANDRA-4756. Due to that > ticket, trying to load a cluster wide snapshot via sstableloader will > potentially stream an enormous amount of data. In a 3 datacenter cluster with > rf=3 in each datacenter, 81 copies of the data would be streamed. Once we > have per range sstables we can optimize sstableloader to merge data and only > stream one copy, but until then we need a workaround. By only streaming to > the local datacenter we can load the data locally in each datacenter and only > have 9 copies of the data rather than 81. > This could potentially be achieved by the option to ignore certain nodes that > already exists in sstableloader, but in the case of vnodes and topology > changes in the cluster, this could require specifying every node in the > cluster as 'ignored' on the command line which could be problematic. 
This is > just a shortcut to avoid that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
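[Editor's note: the 81-vs-9 copy arithmetic in the quoted CASSANDRA-7779 description can be made explicit with a back-of-the-envelope sketch. This is an editorial illustration of the counting argument only; the function names are invented.]

```python
# Sketch of the streaming amplification from CASSANDRA-7779: a
# cluster-wide snapshot contains dcs * rf copies of each row, and
# sstableloader naively streams every loaded copy to every replica.
def naive_copies_streamed(dcs, rf):
    # dcs*rf loaded copies, each streamed to all dcs*rf replicas
    return (dcs * rf) ** 2

def local_dc_copies_streamed(rf):
    # per datacenter: rf loaded copies streamed to the rf local replicas
    return rf * rf

# 3 datacenters with rf=3 each, as in the ticket
assert naive_copies_streamed(3, 3) == 81   # cluster-wide load
assert local_dc_copies_streamed(3) == 9    # per-DC load with local-only streaming
```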
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216186#comment-15216186 ] Jordan West commented on CASSANDRA-11383: - bq. Was the conclusion that a SPARSE SASI index would work well even for low cardinality data (as in the original reported case, for period_end_month_int), or was there some application-level change required to adapt to a SASI change as well? {{period_end_month_int}} is still the incorrect use case for {{SPARSE}}. That did not change. {{SPARSE}} is still intended for indexes/terms where there are a large number of terms and a low number of tokens/keys per term (the token trees in the index are sparse). The {{period_end_month_int}} use-case is a dense index: there are few terms and each term has a large number of tokens/keys (the token trees in the index are dense). The merged patch improves memory overhead in either case when building indexes from a large sstable. What was modified is that indexes marked {{SPARSE}} that have more than 5 tokens for any term in the index will fail to build and an exception will be logged. bq. Is it now official that a non-SPARSE SASI index (e.g., PREFIX) can be used for non-TEXT data (int in particular), at least for the case of exact match lookup? {{PREFIX}} mode has always been supported for numeric data and was/continues to be the default mode if none is specified. PREFIX mode should be considered "NOT SPARSE" for numerical data. 
> Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
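[Editor's note: the SPARSE-mode constraint Jordan West describes above (indexes marked SPARSE fail to build if any term has more than 5 tokens) can be sketched as a simple validation loop. This is an illustrative sketch, not SASI's actual implementation; the names and the in-memory structure are invented, and only the 5-token threshold comes from the comment.]

```python
# Hypothetical sketch of the SPARSE-mode check: while indexing, track
# partition tokens per term and reject a SPARSE index as soon as any
# term accumulates more than 5 tokens, since SPARSE assumes few
# rows per indexed value.
from collections import defaultdict

MAX_SPARSE_TOKENS_PER_TERM = 5  # threshold stated in the comment above

def build_index(rows, sparse=True):
    index = defaultdict(set)        # term -> set of partition tokens
    for token, term in rows:
        index[term].add(token)
        if sparse and len(index[term]) > MAX_SPARSE_TOKENS_PER_TERM:
            raise ValueError(
                "term %r has too many tokens for a SPARSE index" % (term,))
    return index

# high-cardinality data (roughly one token per term) suits SPARSE mode
build_index([(i, "user-%d" % i) for i in range(100)], sparse=True)
```

A dense column like the reporter's period_end_month_int (few terms, many tokens each) trips the check, which is why PREFIX mode remains the right choice there.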
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216185#comment-15216185 ] DOAN DuyHai commented on CASSANDRA-11383: - 1. {{SPARSE}} index (in the sense that for one index value there are very few matching partition keys) is working well. During the indexing process, if there are more than 5 partition keys for the same index value, SASI will throw an exception and skip indexing the current value, moving on to the next value 2. mode {{PREFIX}} is working fine for a {{DENSE}} numerical index (period_end_month_int). The index supports equality and range queries > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)