[jira] [Commented] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179524#comment-15179524 ] Sylvain Lebresne commented on CASSANDRA-11302: -- Definitely looks fishy, but since you're the author, can you have a quick look [~aweisberg]? > Invalid time unit conversion causing write timeouts > --- > > Key: CASSANDRA-11302 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11302 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Mike Heffner > Attachments: nanosec.patch > > > We've been debugging a write timeout that we saw after upgrading from the > 2.0.x release line, with our particular workload. Details of that process can > be found in this thread: > https://www.mail-archive.com/user@cassandra.apache.org/msg46064.html > After bisecting various patch release versions, and then commits, on the > 2.1.x release line we've identified version 2.1.5 and this commit as the > point where the timeouts first start appearing: > https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9 > After examining the commit we believe this line was a typo: > https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9#diff-c7ef124561c4cde1c906f28ad3883a88L467 > as it doesn't properly convert the timeout value from milliseconds to > nanoseconds. > After testing with the attached patch applied, we do not see timeouts on > version 2.1.5 nor against 2.2.5 when we bring the patch forward. While we've > tested our workload against this and we are fairly confident in the patch, we > are not experts with the code base so we would prefer additional review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
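For readers skimming the thread, the failure class being reported can be sketched in a few lines. The sketch below is a hypothetical illustration, not the actual Cassandra code or the attached nanosec.patch: a nanosecond-based wait that is handed an unconverted millisecond count fires roughly a million times too early, which surfaces as spurious timeouts.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the suspected bug class: the method names
// are invented for this sketch and do not appear in Cassandra.
public class TimeoutConversion {
    // Buggy shape: the millisecond value is passed through unchanged, so a
    // 2000 ms timeout is interpreted as 2000 ns (2 microseconds).
    static long buggyTimeoutNanos(long timeoutMillis) {
        return timeoutMillis;
    }

    // Fixed shape, in the spirit of the attached patch: convert explicitly
    // before handing the value to a nanosecond-based API.
    static long fixedTimeoutNanos(long timeoutMillis) {
        return TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }
}
```

A wait that uses the buggy value expires almost immediately, so the caller reports a timeout even though replicas never had a chance to respond.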
[jira] [Updated] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-11302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-11302: - Reproduced In: 2.2.5, 2.1.5 (was: 2.1.5, 2.2.5) Reviewer: Ariel Weisberg > Invalid time unit conversion causing write timeouts > --- > > Key: CASSANDRA-11302 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11302 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Mike Heffner > Attachments: nanosec.patch > > > We've been debugging a write timeout that we saw after upgrading from the > 2.0.x release line, with our particular workload. Details of that process can > be found in this thread: > https://www.mail-archive.com/user@cassandra.apache.org/msg46064.html > After bisecting various patch release versions, and then commits, on the > 2.1.x release line we've identified version 2.1.5 and this commit as the > point where the timeouts first start appearing: > https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9 > After examining the commit we believe this line was a typo: > https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9#diff-c7ef124561c4cde1c906f28ad3883a88L467 > as it doesn't properly convert the timeout value from milliseconds to > nanoseconds. > After testing with the attached patch applied, we do not see timeouts on > version 2.1.5 nor against 2.2.5 when we bring the patch forward. While we've > tested our workload against this and we are fairly confident in the patch, we > are not experts with the code base so we would prefer additional review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11301) Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads
[ https://issues.apache.org/jira/browse/CASSANDRA-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179421#comment-15179421 ] Dominik Keil commented on CASSANDRA-11301: -- [~christianmovi]: I saw similar symptoms on that particular cluster and increasing the compaction throughput seemed to help. Then again, the symptoms are elusive even on 2.2, where this is confirmed and I did not create a thread dump. If it happens again I will do that (if I get on it in time) and we'll see. > Non-obsoleting compaction operations over compressed files can impose rate > limit on normal reads > > > Key: CASSANDRA-11301 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11301 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.2.6 > > > Broken by CASSANDRA-9240; the rate limiting reader passes the ICompressedFile > interface to its parent, which uses this to attach an "owner" - which means > the reader gets recycled on close, i.e. pooled, for normal use. If the > compaction were to replace the sstable there would be no problem, which is > presumably why this hasn't been encountered frequently. However validation > compactions on long lived sstables would permit these rate limited readers to > accumulate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-11303) New inbound throughput parameters for streaming
Satoshi Konno created CASSANDRA-11303: - Summary: New inbound throughput parameters for streaming Key: CASSANDRA-11303 URL: https://issues.apache.org/jira/browse/CASSANDRA-11303 Project: Cassandra Issue Type: New Feature Components: Configuration Reporter: Satoshi Konno Priority: Minor Attachments: cassandra_inbound_stream.diff Hi, To specify the stream throughputs of a node more clearly, I would like to add the following new inbound parameters, analogous to the existing outbound parameters in cassandra.yaml: - stream_throughput_inbound_megabits_per_sec - inter_dc_stream_throughput_inbound_megabits_per_sec We use only the existing outbound parameters now, but it is difficult to control the total throughput of a node. In our production network, some critical alerts occur when a node exceeds the specified total throughput, which is the sum of the input and output throughputs. In our operation of Cassandra, the alerts occur during the bootstrap or repair processing when a new node is added. In the worst case, we have to stop the operation of the exceeding node. I have attached the patch under consideration. I would like to add a new limiter class, StreamInboundRateLimiter, and use the limiter class in the StreamDeserializer class. I use Row::dataSize() to get the input throughput in StreamDeserializer::newPartition(), but I am not sure whether dataSize() returns the correct data size. Can someone please tell me how to do it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
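The proposal above can be sketched as a simple token-bucket limiter on the receive path. The class name StreamInboundRateLimiter and the megabits-per-second convention come from the proposal itself; the implementation below is only an assumed sketch for illustration (the attached diff may well reuse the same Guava RateLimiter machinery that the outbound path uses), called with the byte size of each deserialized partition:

```java
// Hypothetical sketch of an inbound stream limiter: a token bucket that the
// receive path would invoke with the byte count of each received partition.
// Not the attached cassandra_inbound_stream.diff.
public class StreamInboundRateLimiter {
    private final double bytesPerSec;
    private double available;      // tokens (bytes) currently available
    private long lastRefillNanos;

    StreamInboundRateLimiter(double megabitsPerSec) {
        this.bytesPerSec = toBytesPerSec(megabitsPerSec);
        this.available = bytesPerSec;
        this.lastRefillNanos = System.nanoTime();
    }

    // cassandra.yaml expresses stream throughput in megabits/s; convert to bytes/s.
    static double toBytesPerSec(double megabitsPerSec) {
        return megabitsPerSec * 1_000_000d / 8d;
    }

    // Block until `bytes` tokens are available, then consume them.
    synchronized void acquire(long bytes) throws InterruptedException {
        refill();
        while (available < bytes) {
            Thread.sleep(1);
            refill();
        }
        available -= bytes;
    }

    private void refill() {
        long now = System.nanoTime();
        double elapsedSec = (now - lastRefillNanos) / 1e9;
        available = Math.min(bytesPerSec, available + elapsedSec * bytesPerSec);
        lastRefillNanos = now;
    }
}
```

A deserializer would construct one limiter per node from the yaml setting and call acquire() with each partition's data size, which is where the Row::dataSize() question in the description comes in.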
[jira] [Created] (CASSANDRA-11302) Invalid time unit conversion causing write timeouts
Mike Heffner created CASSANDRA-11302: Summary: Invalid time unit conversion causing write timeouts Key: CASSANDRA-11302 URL: https://issues.apache.org/jira/browse/CASSANDRA-11302 Project: Cassandra Issue Type: Bug Components: Core Reporter: Mike Heffner Attachments: nanosec.patch We've been debugging a write timeout that we saw after upgrading from the 2.0.x release line, with our particular workload. Details of that process can be found in this thread: https://www.mail-archive.com/user@cassandra.apache.org/msg46064.html After bisecting various patch release versions, and then commits, on the 2.1.x release line we've identified version 2.1.5 and this commit as the point where the timeouts first start appearing: https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9 After examining the commit we believe this line was a typo: https://github.com/apache/cassandra/commit/828496492c51d7437b690999205ecc941f41a0a9#diff-c7ef124561c4cde1c906f28ad3883a88L467 as it doesn't properly convert the timeout value from milliseconds to nanoseconds. After testing with the attached patch applied, we do not see timeouts on version 2.1.5 nor against 2.2.5 when we bring the patch forward. While we've tested our workload against this and we are fairly confident in the patch, we are not experts with the code base so we would prefer additional review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance
[ https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179075#comment-15179075 ] Stefania commented on CASSANDRA-11053: -- Thank you, CI results are clean. This patch is ready for commit, merge information and patches available in the 7th comment above and repeated here: {quote} ||2.1||2.2||2.2 win||3.0||3.5||trunk|| |[patch|https://github.com/stef1927/cassandra/commits/11053-2.1]|[patch|https://github.com/stef1927/cassandra/commits/11053-2.2]| |[patch|https://github.com/stef1927/cassandra/commits/11053-3.0]|[patch|https://github.com/stef1927/cassandra/commits/11053-3.5]|[patch|https://github.com/stef1927/cassandra/commits/11053]| |[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11053-2.1-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11053-2.2-dtest/]|[win dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11053-2.2-windows-dtest_win32/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11053-3.0-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11053-3.5-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11053-dtest/]| There are conflicts all the way up to 3.5, only patch to merge cleanly is 3.5 into trunk. {quote} > COPY FROM on large datasets: fix progress report and debug performance > -- > > Key: CASSANDRA-11053 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11053 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Stefania >Assignee: Stefania > Labels: doc-impacting > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: copy_from_large_benchmark.txt, > copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, > worker_profiles.txt, worker_profiles_2.txt > > > h5. 
Description > Running COPY from on a large dataset (20G divided in 20M records) revealed > two issues: > * The progress report is incorrect, it is very slow until almost the end of > the test at which point it catches up extremely quickly. > * The performance in rows per second is similar to running smaller tests with > a smaller cluster locally (approx 35,000 rows per second). As a comparison, > cassandra-stress manages 50,000 rows per second under the same set-up, > therefore running roughly 1.5 times faster. > See attached file _copy_from_large_benchmark.txt_ for the benchmark details. > h5. Doc-impacting changes to COPY FROM options > * A new option was added: PREPAREDSTATEMENTS - it indicates if prepared > statements should be used; it defaults to true. > * The default value of CHUNKSIZE changed from 1000 to 5000. > * The default value of MINBATCHSIZE changed from 2 to 10. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance
[ https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-11053: - Description: h5. Description Running COPY from on a large dataset (20G divided in 20M records) revealed two issues: * The progress report is incorrect, it is very slow until almost the end of the test at which point it catches up extremely quickly. * The performance in rows per second is similar to running smaller tests with a smaller cluster locally (approx 35,000 rows per second). As a comparison, cassandra-stress manages 50,000 rows per second under the same set-up, therefore resulting 1.5 times faster. See attached file _copy_from_large_benchmark.txt_ for the benchmark details. h5. Doc-impacting changes to COPY FROM options * A new option was added: PREPAREDSTATEMENTS - it indicates if prepared statements should be used; it defaults to true. * The default value of CHUNKSIZE changed from 1000 to 5000. * The default value of MINBATCHSIZE changed from 2 to 10. was: Running COPY from on a large dataset (20G divided in 20M records) revealed two issues: * The progress report is incorrect, it is very slow until almost the end of the test at which point it catches up extremely quickly. * The performance in rows per second is similar to running smaller tests with a smaller cluster locally (approx 35,000 rows per second). As a comparison, cassandra-stress manages 50,000 rows per second under the same set-up, therefore resulting 1.5 times faster. See attached file _copy_from_large_benchmark.txt_ for the benchmark details. 
> COPY FROM on large datasets: fix progress report and debug performance > -- > > Key: CASSANDRA-11053 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11053 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Stefania >Assignee: Stefania > Labels: doc-impacting > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: copy_from_large_benchmark.txt, > copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, > worker_profiles.txt, worker_profiles_2.txt > > > h5. Description > Running COPY from on a large dataset (20G divided in 20M records) revealed > two issues: > * The progress report is incorrect, it is very slow until almost the end of > the test at which point it catches up extremely quickly. > * The performance in rows per second is similar to running smaller tests with > a smaller cluster locally (approx 35,000 rows per second). As a comparison, > cassandra-stress manages 50,000 rows per second under the same set-up, > therefore resulting 1.5 times faster. > See attached file _copy_from_large_benchmark.txt_ for the benchmark details. > h5. Doc-impacting changes to COPY FROM options > * A new option was added: PREPAREDSTATEMENTS - it indicates if prepared > statements should be used; it defaults to true. > * The default value of CHUNKSIZE changed from 1000 to 5000. > * The default value of MINBATCHSIZE changed from 2 to 10. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance
[ https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-11053: - Status: Ready to Commit (was: Patch Available) > COPY FROM on large datasets: fix progress report and debug performance > -- > > Key: CASSANDRA-11053 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11053 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Stefania >Assignee: Stefania > Labels: doc-impacting > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: copy_from_large_benchmark.txt, > copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, > worker_profiles.txt, worker_profiles_2.txt > > > h5. Description > Running COPY from on a large dataset (20G divided in 20M records) revealed > two issues: > * The progress report is incorrect, it is very slow until almost the end of > the test at which point it catches up extremely quickly. > * The performance in rows per second is similar to running smaller tests with > a smaller cluster locally (approx 35,000 rows per second). As a comparison, > cassandra-stress manages 50,000 rows per second under the same set-up, > therefore resulting 1.5 times faster. > See attached file _copy_from_large_benchmark.txt_ for the benchmark details. > h5. Doc-impacting changes to COPY FROM options > * A new option was added: PREPAREDSTATEMENTS - it indicates if prepared > statements should be used; it defaults to true. > * The default value of CHUNKSIZE changed from 1000 to 5000. > * The default value of MINBATCHSIZE changed from 2 to 10. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance
[ https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-11053: - Labels: doc-impacting (was: ) > COPY FROM on large datasets: fix progress report and debug performance > -- > > Key: CASSANDRA-11053 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11053 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Stefania >Assignee: Stefania > Labels: doc-impacting > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: copy_from_large_benchmark.txt, > copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, > worker_profiles.txt, worker_profiles_2.txt > > > Running COPY from on a large dataset (20G divided in 20M records) revealed > two issues: > * The progress report is incorrect, it is very slow until almost the end of > the test at which point it catches up extremely quickly. > * The performance in rows per second is similar to running smaller tests with > a smaller cluster locally (approx 35,000 rows per second). As a comparison, > cassandra-stress manages 50,000 rows per second under the same set-up, > therefore resulting 1.5 times faster. > See attached file _copy_from_large_benchmark.txt_ for the benchmark details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
svn commit: r1733539 - in /cassandra/site: publish/index.html src/content/index.html
Author: eevans
Date: Fri Mar 4 00:54:57 2016
New Revision: 1733539

URL: http://svn.apache.org/viewvc?rev=1733539&view=rev
Log:
provide better IRC links

Modified:
    cassandra/site/publish/index.html
    cassandra/site/src/content/index.html

Modified: cassandra/site/publish/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1733539&r1=1733538&r2=1733539&view=diff
==============================================================================
--- cassandra/site/publish/index.html (original)
+++ cassandra/site/publish/index.html Fri Mar 4 00:54:57 2016
@@ -205,7 +205,7 @@
 Chat
-Many of the Cassandra developers and community members hang out in the #cassandra channel on <a href="http://freenode.net/" title="Freenode IRC network">irc.freenode.net</a>.
+Many of the Cassandra developers and community members hang out in the #cassandra channel on <a href="https://en.wikipedia.org/wiki/Freenode" title="Freenode IRC network">irc.freenode.net</a>.
 If you are new to IRC, you can use <a href="http://webchat.freenode.net/?channels=#cassandra" title="Connect to #cassandra using webchat">a web-based client</a>.

Modified: cassandra/site/src/content/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/src/content/index.html?rev=1733539&r1=1733538&r2=1733539&view=diff
==============================================================================
--- cassandra/site/src/content/index.html (original)
+++ cassandra/site/src/content/index.html Fri Mar 4 00:54:57 2016
@@ -143,7 +143,7 @@
 Chat
-Many of the Cassandra developers and community members hang out in the #cassandra channel on <a href="http://freenode.net/" title="Freenode IRC network">irc.freenode.net</a>.
+Many of the Cassandra developers and community members hang out in the #cassandra channel on <a href="https://en.wikipedia.org/wiki/Freenode" title="Freenode IRC network">irc.freenode.net</a>.
 If you are new to IRC, you can use <a href="http://webchat.freenode.net/?channels=#cassandra" title="Connect to #cassandra using webchat">a web-based client</a>.
[jira] [Commented] (CASSANDRA-11299) AssertionError when querying by secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178783#comment-15178783 ] Tyler Hobbs commented on CASSANDRA-11299: - What is the schema of the table? > AssertionError when querying by secondary index > -- > > Key: CASSANDRA-11299 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11299 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.3 >Reporter: Michał Matłoka > > Hi, > Recently we have upgraded from Cassandra 2.2.4 to 3.3. I have issues with one > table. When I try to query using any secondary index I get e.g. in cqlsh > {code} > Traceback (most recent call last): > File "/usr/bin/cqlsh.py", line 1249, in perform_simple_statement > result = future.result() > File > "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", > line 3122, in result > raise self._final_exception > ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation > failed - received 0 responses and 1 failures" info={'failures': 1, > 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} > {code} > The node log then shows: > {code} > WARN [SharedPool-Worker-2] 2016-03-03 00:47:01,679 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.AssertionError: null > at > org.apache.cassandra.index.internal.composites.CompositesSearcher$1Transform.findEntry(CompositesSearcher.java:225) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.index.internal.composites.CompositesSearcher$1Transform.applyToRow(CompositesSearcher.java:215) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:116) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:133) > 
~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:294) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:292) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1789) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2457) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_66] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [apache-cassandra-3.3.0.jar:3.3.0] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [apache-cassandra-3.3.0.jar:3.3.0] > at 
java.lang.Thread.run(Thread.java:745) [na:1.8.0_66] > {code} > SSTables are upgraded, I have tried repair and scrub. I have tried to rebuild > indexes, and even remove them and re-add them. It occurs on every cluster node. > Additionally, I have seen in this table a case where the PRIMARY KEY was > duplicated!!! (there were two rows with the same primary key; from seeing which > columns were set I can say one was older, and the second was from a newer query > which sets only a subset of columns) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10112) Refuse to start and print txn log information in case of disk corruption
[ https://issues.apache.org/jira/browse/CASSANDRA-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178729#comment-15178729 ] Tyler Hobbs commented on CASSANDRA-10112: - Overall the patch looks good. Can you verify that the failing {{org.apache.cassandra.io.sstable.SSTableWriterTest.testAbortTxnWithOpenEarlyShouldRemoveSSTable}} utest is not a regression? Other than that, I just have a few nitpicks: * It would be nice to use constants instead of magic numbers for {{StartupException}} exit status codes. * In {{LogRecord.make()}}, why do we catch {{Throwable}}? Should we be passing that through {{JVMStabilityInspector}}? * {{removeUnfinishedCompactionLeftovers()}} could use some javadocs (especially explaining the return value). * I have a slight preference for using the term "directories" instead of "folders" (but it's not worth changing existing code for this) * I think this ticket needs a {{doc-impacting}} label > Refuse to start and print txn log information in case of disk corruption > > > Key: CASSANDRA-10112 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10112 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Stefania >Assignee: Stefania > Fix For: 3.x > > > Transaction logs were introduced by CASSANDRA-7066 and are read during > start-up. In case of file system errors, such as disk corruption, we > currently log a panic error and leave the sstable files and transaction logs > as they are; this is to avoid rolling back a transaction (i.e. deleting > files) by mistake. > We should instead look at the {{disk_failure_policy}} and refuse to start > unless the failure policy is {{ignore}}. > We should also consider stashing files that cannot be read during startup, > either transaction logs or sstables, by moving them to a dedicated > sub-folder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
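The first nitpick above (named constants instead of magic numbers for exit status codes) might look like the following sketch. The constant names and values here are illustrative assumptions, not taken from the actual StartupException class:

```java
// Illustrative only: named exit-status constants for a startup exception,
// replacing bare integers at the throw sites. Names and values are
// hypothetical, not Cassandra's actual codes.
public class StartupException extends Exception {
    public static final int ERR_WRONG_DISK_STATE = 3;
    public static final int ERR_WRONG_CONFIG = 100;

    public final int returnCode;

    public StartupException(int returnCode, String message) {
        super(message);
        this.returnCode = returnCode;
    }
}
```

A caller would then do `throw new StartupException(StartupException.ERR_WRONG_DISK_STATE, ...)` and exit with `e.returnCode`, instead of scattering literal integers through the startup checks.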
[jira] [Resolved] (CASSANDRA-11211) pushed_notifications_test.TestPushedNotifications.restart_node_test flaps infrequently
[ https://issues.apache.org/jira/browse/CASSANDRA-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch resolved CASSANDRA-11211. Resolution: Fixed fixed on dtest pr above. > pushed_notifications_test.TestPushedNotifications.restart_node_test flaps > infrequently > -- > > Key: CASSANDRA-11211 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11211 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng >Priority: Minor > > Pretty infrequent, but we're seeing some flakiness with this test: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/ > history: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-11220: Resolution: Fixed Status: Resolved (was: Patch Available) > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing > --- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > recent occurence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11211) pushed_notifications_test.TestPushedNotifications.restart_node_test flaps infrequently
[ https://issues.apache.org/jira/browse/CASSANDRA-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178582#comment-15178582 ] Russ Hatch commented on CASSANDRA-11211: better fix here: https://github.com/riptano/cassandra-dtest/pull/836 > pushed_notifications_test.TestPushedNotifications.restart_node_test flaps > infrequently > -- > > Key: CASSANDRA-11211 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11211 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng >Priority: Minor > > Pretty infrequent, but we're seeing some flakiness with this test: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/ > history: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11301) Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads
[ https://issues.apache.org/jira/browse/CASSANDRA-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178551#comment-15178551 ] Benedict commented on CASSANDRA-11301: -- I did not look too closely as I don't work actively on the project at the moment, however I don't think 2.1 should be affected. > Non-obsoleting compaction operations over compressed files can impose rate > limit on normal reads > > > Key: CASSANDRA-11301 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11301 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.2.6 > > > Broken by CASSANDRA-9240; the rate limiting reader passes the ICompressedFile > interface to its parent, which uses this to attach an "owner" - which means > the reader gets recycled on close, i.e. pooled, for normal use. If the > compaction were to replace the sstable there would be no problem, which is > presumably why this hasn't been encountered frequently. However validation > compactions on long lived sstables would permit these rate limited readers to > accumulate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11301) Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads
[ https://issues.apache.org/jira/browse/CASSANDRA-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178546#comment-15178546 ] Christian Spriegel commented on CASSANDRA-11301: Benedict: CASSANDRA-9240 does not specify 2.1 as fixVersion. Am I correct to assume that 2.1 should not be affected by this? [~luxifer]: Didn't you say that our 2.1 installation is also affected? Did you test with setcompactionthroughput? > Non-obsoleting compaction operations over compressed files can impose rate > limit on normal reads > > > Key: CASSANDRA-11301 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11301 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict > Fix For: 2.2.6 > > > Broken by CASSANDRA-9240; the rate limiting reader passes the ICompressedFile > interface to its parent, which uses this to attach an "owner" - which means > the reader gets recycled on close, i.e. pooled, for normal use. If the > compaction were to replace the sstable there would be no problem, which is > presumably why this hasn't been encountered frequently. However validation > compactions on long lived sstables would permit these rate limited readers to > accumulate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11032) Full trace returned on ReadFailure by cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15178537#comment-15178537 ] Tyler Hobbs commented on CASSANDRA-11032: - Yes, I think that's all we need to do. You can just add {{cassandra.CoordinationFailure}}, which is the parent class for {{ReadFailure}} and {{WriteFailure}}. > Full trace returned on ReadFailure by cqlsh > --- > > Key: CASSANDRA-11032 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11032 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Chris Splinter >Priority: Minor > Labels: cqlsh > > I noticed that the full traceback is returned on a read failure where I > expected this to be a one line exception with the ReadFailure message. It is > minor, but would it be better to only return the ReadFailure details? > {code} > cqlsh> SELECT * FROM test_encryption_ks.test_bad_table; > Traceback (most recent call last): > File "/usr/local/lib/dse/bin/../resources/cassandra/bin/cqlsh.py", line > 1276, in perform_simple_statement > result = future.result() > File > "/usr/local/lib/dse/resources/cassandra/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", > line 3122, in result > raise self._final_exception > ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation > failed - received 0 responses and 1 failures" info={'failures': 1, > 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-11220: Reproduced In: (was: 2.1.x, 2.2.x) Status: Patch Available (was: Open) https://github.com/riptano/cassandra-dtest/pull/835 > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing > --- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > recent occurrence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178511#comment-15178511 ] Philip Thompson edited comment on CASSANDRA-11220 at 3/3/16 8:03 PM: - Why would both SSTables expect "Repaired at: 0", if we've run sstablerepairedset against node2? EDIT: I've just seen that the failing assertion is for max(matchcount), which fits with what I would expect. Opening a PR to correct that. was (Author: philipthompson): Why would both SSTables expect "Repaired at: 0", if we've run sstablerepairedset against node2? > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing > --- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > recent occurrence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178511#comment-15178511 ] Philip Thompson commented on CASSANDRA-11220: - Why would both SSTables expect "Repaired at: 0", if we've run sstablerepairedset against node2? > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing > --- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > recent occurrence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11032) Full trace returned on ReadFailure by cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-11032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178392#comment-15178392 ] Yuki Morishita commented on CASSANDRA-11032: [~thobbs] Should we add new Failures added in CASSANDRA-8592 to cqlsh's {{CQL_ERRORS}}? > Full trace returned on ReadFailure by cqlsh > --- > > Key: CASSANDRA-11032 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11032 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Chris Splinter >Priority: Minor > Labels: cqlsh > > I noticed that the full traceback is returned on a read failure where I > expected this to be a one line exception with the ReadFailure message. It is > minor, but would it be better to only return the ReadFailure details? > {code} > cqlsh> SELECT * FROM test_encryption_ks.test_bad_table; > Traceback (most recent call last): > File "/usr/local/lib/dse/bin/../resources/cassandra/bin/cqlsh.py", line > 1276, in perform_simple_statement > result = future.result() > File > "/usr/local/lib/dse/resources/cassandra/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", > line 3122, in result > raise self._final_exception > ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation > failed - received 0 responses and 1 failures" info={'failures': 1, > 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11220) repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test failing
[ https://issues.apache.org/jira/browse/CASSANDRA-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178384#comment-15178384 ] Yuki Morishita commented on CASSANDRA-11220: I think we need to fix dtest itself. bq. 1 not greater than or equal to 2 is expected for initial (pre repair) behavior, since both SSTables should have "Repaired at: 0". I don't know why [cassandra-dtest:8c66b903f51f2d4fe0eaf9d7f98f47733d734d9f|https://github.com/riptano/cassandra-dtest/commit/8c66b903f51f2d4fe0eaf9d7f98f47733d734d9f] added the change. > repair_tests.incremental_repair_test.TestIncRepair.sstable_repairedset_test > failing > --- > > Key: CASSANDRA-11220 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11220 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: Philip Thompson > Labels: dtest > > recent occurrence: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/ > last 2 runs failed: > http://cassci.datastax.com/job/cassandra-2.1_dtest/427/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_repairedset_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11207) Can not remove TTL on table with default_time_to_live
[ https://issues.apache.org/jira/browse/CASSANDRA-11207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-11207: Status: Ready to Commit (was: Patch Available) > Can not remove TTL on table with default_time_to_live > - > > Key: CASSANDRA-11207 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11207 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Matthieu Nantern >Assignee: Benjamin Lerer > Fix For: 3.x > > > I've created a table with a default TTL: > {code:sql} > CREATE TABLE testmna.ndr ( > device_id text, > event_year text, > event_time timestamp, > active boolean, > PRIMARY KEY ((device_id, event_year), event_time) > ) WITH CLUSTERING ORDER BY (event_time DESC) > AND bloom_filter_fp_chance = 0.01 > AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} > AND compression = {'sstable_compression': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 600 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99.0PERCENTILE'; > {code} > When I insert data with a "runtime TTL" (INSERT ... USING TTL 86400) > everything works as expected (ttl is set to 86400). > But I can't insert data without TTL at runtime: INSERT ... USING TTL 0; does > not work. > Tested on C* 2.2.4, CentOS 7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11207) Can not remove TTL on table with default_time_to_live
[ https://issues.apache.org/jira/browse/CASSANDRA-11207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178294#comment-15178294 ] Tyler Hobbs commented on CASSANDRA-11207: - +1 > Can not remove TTL on table with default_time_to_live > - > > Key: CASSANDRA-11207 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11207 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Matthieu Nantern >Assignee: Benjamin Lerer > Fix For: 3.x > > > I've created a table with a default TTL: > {code:sql} > CREATE TABLE testmna.ndr ( > device_id text, > event_year text, > event_time timestamp, > active boolean, > PRIMARY KEY ((device_id, event_year), event_time) > ) WITH CLUSTERING ORDER BY (event_time DESC) > AND bloom_filter_fp_chance = 0.01 > AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'} > AND compression = {'sstable_compression': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 600 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99.0PERCENTILE'; > {code} > When I insert data with a "runtime TTL" (INSERT ... USING TTL 86400) > everything works as expected (ttl is set to 86400). > But I can't insert data without TTL at runtime: INSERT ... USING TTL 0; does > not work. > Tested on C* 2.2.4, CentOS 7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11211) pushed_notifications_test.TestPushedNotifications.restart_node_test flaps infrequently
[ https://issues.apache.org/jira/browse/CASSANDRA-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178277#comment-15178277 ] Russ Hatch commented on CASSANDRA-11211: Looks like the initial fix wasn't quite right; there is possibly a small race condition in the test code. Vetting a fix now. > pushed_notifications_test.TestPushedNotifications.restart_node_test flaps > infrequently > -- > > Key: CASSANDRA-11211 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11211 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng >Priority: Minor > > Pretty infrequent, but we're seeing some flakiness with this test: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/ > history: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11059) In cqlsh show static columns in a different color
[ https://issues.apache.org/jira/browse/CASSANDRA-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-11059: --- Fix Version/s: (was: 2.2.3) 3.x Status: Patch Available (was: Open) [~pavel.trukhanov] Thanks for the patch! Which version did you create the patch against? If not trunk, can you create one for that, since this is "improvement"? Marking this as "Patch Available" anyway so we can review shortly. > In cqlsh show static columns in a different color > - > > Key: CASSANDRA-11059 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11059 > Project: Cassandra > Issue Type: Improvement > Components: Tools > Environment: [cqlsh 5.0.1 | Cassandra 2.2.3 | CQL spec 3.3.1 | Native > protocol v4] >Reporter: Cédric Hernalsteens >Priority: Minor > Fix For: 3.x > > > The partition key columns are shown in red, the clustering columns in cyan, > it would be great to also distinguish between the static columns and the > other. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11211) pushed_notifications_test.TestPushedNotifications.restart_node_test flaps infrequently
[ https://issues.apache.org/jira/browse/CASSANDRA-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178195#comment-15178195 ] Russ Hatch commented on CASSANDRA-11211: fix at https://github.com/riptano/cassandra-dtest/pull/834 > pushed_notifications_test.TestPushedNotifications.restart_node_test flaps > infrequently > -- > > Key: CASSANDRA-11211 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11211 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng >Priority: Minor > > Pretty infrequent, but we're seeing some flakiness with this test: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/ > history: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11092) EXPAND breaks when the result has 0 rows in cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-11092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuki Morishita updated CASSANDRA-11092: --- Assignee: Yuki Morishita Fix Version/s: 3.x 3.0.x 2.2.x Status: Patch Available (was: Open) Version 2.1 shows just the header, in non-expanded style, when displaying an empty result with {{EXPAND ON}}. 2.2+ removed that behavior and errors out as reported, so I put it back in. https://github.com/yukim/cassandra/tree/11092-2.2 Will update with the test results and other branches. > EXPAND breaks when the result has 0 rows in cqlsh > - > > Key: CASSANDRA-11092 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11092 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Kishan Karunaratne >Assignee: Yuki Morishita >Priority: Trivial > Labels: cqlsh > Fix For: 2.2.x, 3.0.x, 3.x > > {noformat} > cqlsh> EXPAND ON; > Now Expanded output is enabled > cqlsh> select * from system.local; > @ Row 1 > -+- > key | local > bootstrapped| COMPLETED > broadcast_address | 127.0.0.1 > cluster_name| dse_50_graph > cql_version | 3.4.0 > data_center | Graph > dse_version | 5.0.0 > gossip_generation | 1454032824 > graph | True > host_id | ad30ccb2-04a1-4511-98b6-a72e4ea182c0 > listen_address | 127.0.0.1 > native_protocol_version | 4 > partitioner | org.apache.cassandra.dht.Murmur3Partitioner > rack| rack1 > release_version | 3.0.1.816 > rpc_address | 127.0.0.1 > schema_version | 5667501a-4ac3-3f00-ab35-9040efb927ad > server_id | A0-CE-C8-01-CC-CA > thrift_version | 20.1.0 > tokens | {'-9223372036854775808'} > truncated_at| null > workload| Cassandra > (1 rows) > cqlsh> select * from system.peers; > max() arg is an empty sequence > cqlsh> EXPAND OFF; > Disabled Expanded output.
> cqlsh> select * from system.peers; > peer | data_center | dse_version | graph | host_id | preferred_ip | rack | > release_version | rpc_address | schema_version | server_id | tokens | workload > --+-+-+---+-+--+--+-+-++---++-- > (0 rows) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
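The {{max() arg is an empty sequence}} error above is the classic symptom of computing column widths over zero rows. A hedged sketch of the guard, not the actual cqlsh patch:

```python
def column_width(header, values):
    # max() raises "max() arg is an empty sequence" on an empty list;
    # seeding it with the header width both guards the zero-row case
    # and keeps the column at least as wide as its name.
    # (Python 3.4+ alternatively offers max(..., default=...).)
    return max([len(header)] + [len(v) for v in values])

assert column_width("peer", []) == len("peer")          # 0 rows
assert column_width("rpc_address", ["127.0.0.1"]) == len("rpc_address")
```

With a guard like this the empty result prints just the header row, matching the 2.1 behavior being restored.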
[jira] [Created] (CASSANDRA-11301) Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads
Benedict created CASSANDRA-11301: Summary: Non-obsoleting compaction operations over compressed files can impose rate limit on normal reads Key: CASSANDRA-11301 URL: https://issues.apache.org/jira/browse/CASSANDRA-11301 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Fix For: 2.2.6 Broken by CASSANDRA-9240; the rate limiting reader passes the ICompressedFile interface to its parent, which uses this to attach an "owner" - which means the reader gets recycled on close, i.e. pooled, for normal use. If the compaction were to replace the sstable there would be no problem, which is presumably why this hasn't been encountered frequently. However validation compactions on long lived sstables would permit these rate limited readers to accumulate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance
[ https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178132#comment-15178132 ] Adam Holmberg commented on CASSANDRA-11053: --- +1 @28f7713 > COPY FROM on large datasets: fix progress report and debug performance > -- > > Key: CASSANDRA-11053 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11053 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Stefania >Assignee: Stefania > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: copy_from_large_benchmark.txt, > copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, > worker_profiles.txt, worker_profiles_2.txt > > > Running COPY from on a large dataset (20G divided in 20M records) revealed > two issues: > * The progress report is incorrect, it is very slow until almost the end of > the test at which point it catches up extremely quickly. > * The performance in rows per second is similar to running smaller tests with > a smaller cluster locally (approx 35,000 rows per second). As a comparison, > cassandra-stress manages 50,000 rows per second under the same set-up, > therefore resulting 1.5 times faster. > See attached file _copy_from_large_benchmark.txt_ for the benchmark details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11211) pushed_notifications_test.TestPushedNotifications.restart_node_test flaps infrequently
[ https://issues.apache.org/jira/browse/CASSANDRA-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178102#comment-15178102 ] Russ Hatch commented on CASSANDRA-11211: Local testing seems to confirm it's a timing problem, with the test flapping when things are a bit slower. A wait time increase on the test appears likely to fix it; will investigate a bit more. > pushed_notifications_test.TestPushedNotifications.restart_node_test flaps > infrequently > -- > > Key: CASSANDRA-11211 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11211 > Project: Cassandra > Issue Type: Test >Reporter: Russ Hatch >Assignee: DS Test Eng >Priority: Minor > > Pretty infrequent, but we're seeing some flakiness with this test: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/ > history: > http://cassci.datastax.com/job/cassandra-2.1_dtest/424/testReport/pushed_notifications_test/TestPushedNotifications/restart_node_test/history/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
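A common cure for this kind of timing flap is to replace a fixed sleep with a bounded poll, so slow environments get more time without inflating the common case. This is a generic sketch, not the actual dtest change:

```python
import time

def wait_for(condition, timeout=30.0, interval=0.25):
    """Poll `condition` until it returns truthy or `timeout` elapses.

    Returns True on success, False on timeout. Polling tolerates slow
    CI machines without slowing down the fast path.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False

# Example: a condition that only becomes true after a short delay.
start = time.monotonic()
assert wait_for(lambda: time.monotonic() - start > 0.5, timeout=5.0)
```

With a helper like this, the test asserts on the notification arriving within a generous deadline instead of after an exact sleep.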
[jira] [Commented] (CASSANDRA-10855) Use Caffeine (W-TinyLFU) for on-heap caches
[ https://issues.apache.org/jira/browse/CASSANDRA-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178101#comment-15178101 ] Ben Manes commented on CASSANDRA-10855: --- It sounds like we all thought Weibull was a good choice. Another option is to use [YCSB|https://github.com/brianfrankcooper/YCSB], which handles correlated omission and is a popular benchmark for comparing data stores. > Use Caffeine (W-TinyLFU) for on-heap caches > --- > > Key: CASSANDRA-10855 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10855 > Project: Cassandra > Issue Type: Improvement >Reporter: Ben Manes > Labels: performance > > Cassandra currently uses > [ConcurrentLinkedHashMap|https://code.google.com/p/concurrentlinkedhashmap] > for performance critical caches (key, counter) and Guava's cache for > non-critical (auth, metrics, security). All of these usages have been > replaced by [Caffeine|https://github.com/ben-manes/caffeine], written by the > author of the previously mentioned libraries. > The primary incentive is to switch from LRU policy to W-TinyLFU, which > provides [near optimal|https://github.com/ben-manes/caffeine/wiki/Efficiency] > hit rates. It performs particularly well in database and search traces, is > scan resistant, and adds only a very small time/space overhead to LRU. > Secondarily, Guava's caches never obtained similar > [performance|https://github.com/ben-manes/caffeine/wiki/Benchmarks] to CLHM > due to some optimizations not being ported over. This change results in > faster reads and not creating garbage as a side-effect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
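For reference, a Weibull-shaped key-popularity distribution like the one discussed can be sampled directly with the standard library; the shape/scale parameters below are arbitrary illustration, not values from the benchmark:

```python
import random

random.seed(42)  # reproducible sketch

# random.weibullvariate(scale, shape): with shape < 1 the draws skew
# heavily toward small values, i.e. a few very "hot" keys dominate.
# These parameters are illustrative only.
samples = [random.weibullvariate(1.0, 0.5) for _ in range(10000)]

hot = sum(1 for s in samples if s < 0.1)
print("fraction below 0.1: %.2f" % (hot / len(samples)))
```

Mapping each draw to a key id then yields a synthetic trace whose hit-rate behavior can be compared across eviction policies.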
[jira] [Commented] (CASSANDRA-11258) Repair scheduling - Resource locking API
[ https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178016#comment-15178016 ] Paulo Motta commented on CASSANDRA-11258: - bq. For this lock table to work correctly later on, it should be set up to have replicas in all data centers, right? Should this be automatically configured or should this be something that the user would have to configure when adding/removing data centers? From a usability point of view I think it would be great if this was handled automatically, and it would probably not be too hard to create a replication strategy defined as "at most X replicas in each dc", but I'm not sure if this might cause problems if someone were to use it for other purposes? With our current design of dc-local locks, do we actually need locks to be replicated cross-DC? Will we need to read locks of remote dcs from the local DC? Assuming that we first try to acquire/read the lock from the local DC before acquiring remote locks, I think we don't need to replicate locks cross DC. So, I think your 4th proposal (create separate lock keyspaces for each dc) has its value, but instead of having one keyspace per DC, we could have a single keyspace with a new {{LocalDataCenterReplicationStrategy}} that would restrict replication to the local dc, so we wouldn't need to change replication settings when adding new DCs. WDYT of this approach? If you like this approach, we could do the initial version assuming a single DC with {{SimpleStrategy}} replication + {{SERIAL}} consistency, while power users could still have multi-DC support by manually changing replication settings of the lock keyspace. We could later add transparent/efficient multi-DC support via CASSANDRA-11300 and {{LocalDataCenterReplicationStrategy}}.
> Repair scheduling - Resource locking API > > > Key: CASSANDRA-11258 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11258 > Project: Cassandra > Issue Type: Sub-task >Reporter: Marcus Olsson >Assignee: Marcus Olsson >Priority: Minor > > Create a resource locking API & implementation that is able to lock a > resource in a specified data center. It should handle priorities to avoid > node starvation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
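The acquire-local-then-remote flow under discussion rests on an "insert if absent, with TTL" primitive. This is a toy in-memory model of that primitive for intuition only, not Cassandra's LWT implementation; the resource name format is invented:

```python
import time

class LockTable:
    """Toy lock table: a lock is held until released or until its TTL
    expires, and acquisition is insert-if-absent (compare-and-set)."""

    def __init__(self):
        self._locks = {}  # resource -> (holder, expiry)

    def try_acquire(self, resource, holder, ttl=30.0):
        now = time.monotonic()
        current = self._locks.get(resource)
        if current is not None and current[1] > now:
            return current[0] == holder  # still held; re-entrant only
        # Absent or expired: take (over) the lock.
        self._locks[resource] = (holder, now + ttl)
        return True

    def release(self, resource, holder):
        if self._locks.get(resource, (None,))[0] == holder:
            del self._locks[resource]

table = LockTable()
assert table.try_acquire("RepairResource:dc1:1", "node1")
assert not table.try_acquire("RepairResource:dc1:1", "node2")
table.release("RepairResource:dc1:1", "node1")
assert table.try_acquire("RepairResource:dc1:1", "node2")
```

The TTL is what lets a lock held by a crashed node eventually free itself, which is why the real design keeps re-inserting the lock row while the repair runs.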
[jira] [Commented] (CASSANDRA-11295) Make custom filtering more extensible via custom classes
[ https://issues.apache.org/jira/browse/CASSANDRA-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177999#comment-15177999 ] Sam Tunnicliffe commented on CASSANDRA-11295: - Force-pushed an update which adds a new expression type, {{UserExpression}}, to preserve {{CustomExpression}} for its originally intended use and avoid overloading it. {{UserExpression}} is declared abstract and concrete implementations must be registered before they're used to ensure that they can be correctly deserialized. In general, I think this is a cleaner and more robust solution, though use of such expressions during upgrades will still require some finesse. > Make custom filtering more extensible via custom classes > - > > Key: CASSANDRA-11295 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11295 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Minor > Fix For: 3.x > > > At the moment, the implementation of {{RowFilter.CustomExpression}} is > tightly bound to the syntax designed to support non-CQL search syntax for > custom 2i implementations. It might be interesting to decouple the two things > by making the custom expression implementation and serialization a bit more > pluggable. This would allow users to add their own custom expression > implementations to experiment with custom filtering strategies without having > to patch the C* source. As a minimally invasive first step, custom > expressions could be added programmatically via {{QueryHandler}}. Further > down the line, if this proves useful and we can figure out some reasonable > syntax we could think about adding the capability in CQL in a separate > ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
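The register-before-use requirement can be illustrated with a minimal registry keyed by an identifier: deserialization looks the implementation up by name, so a subclass must be registered before any serialized expression referencing it arrives. All names here are illustrative, not the actual Java API:

```python
# Hypothetical registry sketch: maps an identifier to a concrete
# expression class so serialized expressions can be reconstructed.
_registry = {}

def register(name, cls):
    _registry[name] = cls

class UserExpression:
    def matches(self, row):
        raise NotImplementedError

class AgeAbove(UserExpression):
    """Example concrete expression: row['age'] > threshold."""
    def __init__(self, threshold):
        self.threshold = threshold
    def matches(self, row):
        return row.get("age", 0) > self.threshold

register("age-above", AgeAbove)

def deserialize(name, *args):
    # An unregistered expression cannot be decoded -- hence the
    # "must be registered before use" rule in the comment above.
    try:
        cls = _registry[name]
    except KeyError:
        raise ValueError("unregistered expression: %s" % name)
    return cls(*args)

expr = deserialize("age-above", 30)
assert expr.matches({"age": 42})
assert not expr.matches({"age": 18})
```

This also shows why upgrades need finesse: a node that has not yet registered a class cannot decode expressions sent by a node that has.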
[jira] [Commented] (CASSANDRA-11176) SSTableRewriter.InvalidateKeys should have a weak reference to cache
[ https://issues.apache.org/jira/browse/CASSANDRA-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177958#comment-15177958 ] Ariel Weisberg commented on CASSANDRA-11176: My mistake. I thought the ticket was scoped to the test change. > SSTableRewriter.InvalidateKeys should have a weak reference to cache > > > Key: CASSANDRA-11176 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11176 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Jeremiah Jordan >Assignee: Marcus Eriksson > Fix For: 2.1.14, 2.2.6, 3.5, 3.0.5 > > > From [~aweisberg] > bq. The SSTableReader.DropPageCache runnable references > SSTableRewriter.InvalidateKeys which references the cache. The cache > reference should be a WeakReference. > {noformat} > ERROR [Strong-Reference-Leak-Detector:1] 2016-02-17 14:51:52,111 > NoSpamLogger.java:97 - Strong self-ref loop detected > [/var/lib/cassandra/data/keyspace1/standard1-990bc741d56411e591d5590d7a7ad312/ma-20-big, > private java.lang.Runnable > org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier.runOnClose-org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache, > final java.lang.Runnable > org.apache.cassandra.io.sstable.format.SSTableReader$DropPageCache.andThen-org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys, > final org.apache.cassandra.cache.InstrumentingCache > org.apache.cassandra.io.sstable.SSTableRewriter$InvalidateKeys.cache-org.apache.cassandra.cache.AutoSavingCache, > protected volatile java.util.concurrent.ScheduledFuture > org.apache.cassandra.cache.AutoSavingCache.saveTask-java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask, > final java.util.concurrent.ScheduledThreadPoolExecutor > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.this$0-org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor, > private final java.util.concurrent.BlockingQueue > 
java.util.concurrent.ThreadPoolExecutor.workQueue-java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue, > private final java.util.concurrent.BlockingQueue > java.util.concurrent.ThreadPoolExecutor.workQueue-java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask, > private java.util.concurrent.Callable > java.util.concurrent.FutureTask.callable-java.util.concurrent.Executors$RunnableAdapter, > final java.lang.Runnable > java.util.concurrent.Executors$RunnableAdapter.task-org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable, > private final java.lang.Runnable > org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.runnable-org.apache.cassandra.db.ColumnFamilyStore$3, > final org.apache.cassandra.db.ColumnFamilyStore > org.apache.cassandra.db.ColumnFamilyStore$3.this$0-org.apache.cassandra.db.ColumnFamilyStore, > public final org.apache.cassandra.db.Keyspace > org.apache.cassandra.db.ColumnFamilyStore.keyspace-org.apache.cassandra.db.Keyspace, > private final java.util.concurrent.ConcurrentMap > org.apache.cassandra.db.Keyspace.columnFamilyStores-java.util.concurrent.ConcurrentHashMap, > private final java.util.concurrent.ConcurrentMap > org.apache.cassandra.db.Keyspace.columnFamilyStores-org.apache.cassandra.db.ColumnFamilyStore, > private final org.apache.cassandra.db.lifecycle.Tracker > org.apache.cassandra.db.ColumnFamilyStore.data-org.apache.cassandra.db.lifecycle.Tracker, > final java.util.concurrent.atomic.AtomicReference > org.apache.cassandra.db.lifecycle.Tracker.view-java.util.concurrent.atomic.AtomicReference, > private volatile java.lang.Object > java.util.concurrent.atomic.AtomicReference.value-org.apache.cassandra.db.lifecycle.View, > public final java.util.List > org.apache.cassandra.db.lifecycle.View.liveMemtables-com.google.common.collect.SingletonImmutableList, > final transient java.lang.Object > 
com.google.common.collect.SingletonImmutableList.element-org.apache.cassandra.db.Memtable, > private final org.apache.cassandra.utils.memory.MemtableAllocator > org.apache.cassandra.db.Memtable.allocator-org.apache.cassandra.utils.memory.SlabAllocator, > private final > org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator > org.apache.cassandra.utils.memory.MemtableAllocator.onHeap-org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator, > private final org.apache.cassandra.utils.memory.MemtablePool$SubPool > org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.parent-org.apache.cassandra.utils.memory.MemtablePool$SubPool, > final org.apache.cassandra.utils.memory.MemtablePool > org.apache.cassandra.utils.memory.MemtablePool$SubPool.this$0-org.apache.cassandra.utils.memory.SlabPool, > final
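The proposed fix, holding the cache through a weak reference so a long-lived close hook cannot pin it (and everything it transitively references, as in the trace above), behaves like this Python {{weakref}} sketch; the class and function names are illustrative:

```python
import gc
import weakref

class Cache:
    def invalidate(self, key):
        pass  # stand-in for real invalidation

cache = Cache()

# A strong reference held by a long-lived runnable would pin the
# cache forever; a weak reference lets it be collected once nothing
# else holds it.
ref = weakref.ref(cache)
assert ref() is cache

del cache
gc.collect()
assert ref() is None  # target collected; the weak ref now yields None

def invalidate_keys(ref, keys):
    target = ref()
    if target is None:
        return  # cache already collected; hook becomes a safe no-op
    for k in keys:
        target.invalidate(k)

invalidate_keys(ref, ["k1", "k2"])  # no-op after collection
```

Java's {{WeakReference.get()}} plays the same role as calling the Python weak ref: the hook checks for null and skips the work instead of keeping the cache alive.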
[jira] [Commented] (CASSANDRA-7957) improve active/pending compaction monitoring
[ https://issues.apache.org/jira/browse/CASSANDRA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177915#comment-15177915 ] Nikolai Grigoriev commented on CASSANDRA-7957: -- OK, I see your point. Well, then what I was thinking about is simply not doable, I guess. > improve active/pending compaction monitoring > > > Key: CASSANDRA-7957 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7957 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Nikolai Grigoriev >Priority: Minor > > I think it might be useful to create a way to see what sstables are being > compacted into what new sstable. Something like an extension of "nodetool > compactionstats". I think it would be easier with this feature to > troubleshoot and understand how compactions are happening on your data. Not > sure how it is useful in everyday life but I could use such a feature when > dealing with CASSANDRA-7949. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11296) Run dtests with -Dcassandra.debugrefcount=true and increase checking frequency
[ https://issues.apache.org/jira/browse/CASSANDRA-11296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177841#comment-15177841 ] Marcus Eriksson commented on CASSANDRA-11296: - first set of errors are due to having strong references to the subscribers in Tracker. But it looks like [fixing|https://github.com/krummas/cassandra/commits/marcuse/11296-trunk-fix] that only moves the loop: http://cassci.datastax.com/job/krummas-marcuse-11296-trunk-fix-dtest/1/testReport/junit/cql_tests/AbortedQueriesTester/index_query_test/ > Run dtests with -Dcassandra.debugrefcount=true and increase checking frequency > -- > > Key: CASSANDRA-11296 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11296 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > > We should run dtests with refcount debugging and check every second instead > of every 15 minutes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9430) Add startup options to cqlshrc
[ https://issues.apache.org/jira/browse/CASSANDRA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177751#comment-15177751 ] Pavel Trukhanov commented on CASSANDRA-9430: Maybe it would be better (and it would solve all future requests for improvements like this one) to add an "interactive mode" flag like the one in python etc. Here are quotes from the python and ipython usage help for this mode: {quote} If running code from the command line, become interactive afterwards. {quote} {quote} inspect interactively after running script; {quote} So for example it'd work like this: {code} echo "paging off;" | cqlsh -i {code} and it would be perfectly combinable with {code} -f FILE, --file=FILE Execute commands from FILE {code} but without exiting afterwards > Add startup options to cqlshrc > -- > > Key: CASSANDRA-9430 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9430 > Project: Cassandra > Issue Type: Improvement > Components: Tools >Reporter: Jeremy Hanna >Priority: Minor > Labels: cqlsh, lhf > > There are certain settings that would be nice to set defaults for in the > cqlshrc file. For example, a user may want to set the paging to off by > default for their environment. You can't simply do > {code} > echo "paging off;" | cqlsh > {code} > because this would disable paging and immediately exit cqlsh. > So it would be nice to have a section of the cqlshrc to include default > settings on startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
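The interactive-after-execution behaviour proposed above can be sketched in Python (the language cqlsh is written in). This is a hypothetical illustration, not actual cqlsh code; `run_statement` and the exact option handling are assumptions:

```python
# Hypothetical sketch of an "-i / --interactive" flag for a cqlsh-like tool:
# execute statements from a file (or stdin) first, then keep the session
# open instead of exiting. run_statement() is a stand-in, not a real API.
import argparse
import sys

def run_statement(stmt):
    # Stand-in for real statement execution (e.g. "paging off;").
    print("executed: %s" % stmt)

def run(argv):
    parser = argparse.ArgumentParser(prog="cqlsh-sketch")
    parser.add_argument("-f", "--file", help="Execute commands from FILE")
    parser.add_argument("-i", "--interactive", action="store_true",
                        help="stay in the interactive prompt afterwards")
    opts = parser.parse_args(argv)

    source = open(opts.file) if opts.file else sys.stdin
    for line in source:
        stmt = line.strip()
        if stmt:
            run_statement(stmt)
    if opts.file:
        source.close()

    # Without -i, piped input ends the session here; with -i we would fall
    # through into the normal interactive REPL loop instead.
    return "interactive" if opts.interactive else "exit"
```

With such a flag, `echo "paging off;" | cqlsh -i` would execute the piped statement and then leave the user at the prompt, instead of exiting immediately.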
[jira] [Commented] (CASSANDRA-11059) In cqlsh show static columns in a different color
[ https://issues.apache.org/jira/browse/CASSANDRA-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177725#comment-15177725 ] Pavel Trukhanov commented on CASSANDRA-11059: - Here's a small patch: https://github.com/okmeter/cassandra/commit/6f03489e4d869dccaf6f075e84a8ce6eb92ca6ce Should I do anything else to make this happen? > In cqlsh show static columns in a different color > - > > Key: CASSANDRA-11059 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11059 > Project: Cassandra > Issue Type: Improvement > Components: Tools > Environment: [cqlsh 5.0.1 | Cassandra 2.2.3 | CQL spec 3.3.1 | Native > protocol v4] >Reporter: Cédric Hernalsteens >Priority: Minor > Fix For: 2.2.3 > > > The partition key columns are shown in red, the clustering columns in cyan; > it would be great to also distinguish between the static columns and the > others. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9161) Add random interleaving for flush/compaction when running CQL unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177651#comment-15177651 ] Sylvain Lebresne commented on CASSANDRA-9161: - bq. the problems we faced with upgrade tests due to the randomness of how the data is distributed Yeah, but that's a bad example: that randomness was not properly controlled, it was making things hard to reproduce and that was the problem. But this is not what I want to do here, this will _not_ make tests non reproducible at all. In testing, sometimes (always really, but that's a different subject) the state space is just too big to be systematically explored on every run. Because of that, you do your best at exploring the most meaningful part of the space (and I'm not saying we can't improve on that part btw, we can and we should), but there is still space you can't explore. Hoping you'll be so good at finding the meaningful subset of the states to test that no bug will lurk in the remaining space is just wishful thinking. So this is just about getting incrementally better coverage of the full space by using some new random state on every run _for the parts we can't reasonably explore systematically_ (and I'm happy to discuss which parts can reasonably be explored systematically and which aren't btw). bq. For flushing, the main problem that I have seen is that only the read or write path for memtables was tested not the one for SSTables It's really more complex. Unless your test has a single insert, there isn't _just_ one path for memtables and one for sstables. There are cases where some data is in a memtable and some in sstables, where there is more than one sstable involved (and we can flush for every insert, or only in some places), where we compact before reading etc... 
I have seen bugs in pretty much all of those cases (I'm genuinely not kidding: there have been cases with range tombstones in particular where things only got messed up when data was flushed at a specific point and compaction was run before reading). And here again, don't get me wrong: for some tests, there may be a clear place where we want to systematically test both with and without flush because we want that to be tested every time and that's fine, we can do it. But we just can't systematically test all combinations. > Add random interleaving for flush/compaction when running CQL unit tests > > > Key: CASSANDRA-9161 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9161 > Project: Cassandra > Issue Type: Test >Reporter: Sylvain Lebresne > Labels: retrospective_generated > > Most CQL tests don't bother flushing, which means that they overwhelmingly > test the memtable path and not the sstables one. A simple way to improve on > that would be to make {{CQLTester}} issue flushes and compactions randomly > between statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
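The controlled randomness argued for above can be sketched as follows. This is an illustrative Python sketch (the real harness would be Java, in {{CQLTester}}); the callback names and probabilities are assumptions. The key point is that the seed is chosen once and logged, so a failing interleaving can be replayed exactly:

```python
# Sketch: randomly interleave flush/compaction between statements while
# keeping runs reproducible. All randomness flows from one logged seed,
# so passing the same seed back in replays the exact same interleaving.
import random

def run_with_random_interleaving(statements, execute, flush, compact, seed=None):
    if seed is None:
        seed = random.randrange(2 ** 32)
    rng = random.Random(seed)  # private RNG: no other code perturbs it
    print("test seed: %d (pass it back in to reproduce this run)" % seed)
    for stmt in statements:
        execute(stmt)
        r = rng.random()
        if r < 0.3:      # sometimes flush, so later reads hit sstables
            flush()
            if r < 0.1:  # occasionally also compact what was flushed
                compact()
    return seed
```

Because the interleaving is a pure function of the seed, a failure report only needs to include that one number to be reproducible, unlike uncontrolled randomness.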
[jira] [Commented] (CASSANDRA-11299) AssertionError when querying by secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177560#comment-15177560 ] Michał Matłoka commented on CASSANDRA-11299: I finally had to restore a pre-upgrade snapshot of this table and re-upgrade it to make our cluster work, but the CQL was just of this type, e.g.: {code} select * from table where indexedcolumn = true; {code} I had a secondary index on a boolean column, a timestamp and map keys, and each of them reacted exactly the same. > AssertionError when querying by secondary index > -- > > Key: CASSANDRA-11299 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11299 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.3 >Reporter: Michał Matłoka > > Hi, > Recently we have upgraded from Cassandra 2.2.4 to 3.3. I have issues with one > table. When I try to query using any secondary index I get e.g. in cqlsh > {code} > Traceback (most recent call last): > File "/usr/bin/cqlsh.py", line 1249, in perform_simple_statement > result = future.result() > File > "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", > line 3122, in result > raise self._final_exception > ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation > failed - received 0 responses and 1 failures" info={'failures': 1, > 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'} > {code} > The node logs then show: > {code} > [[AWARN [SharedPool-Worker-2] 2016-03-03 00:47:01,679 > AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread > Thread[SharedPool-Worker-2,5,main]: {} > java.lang.AssertionError: null > at > org.apache.cassandra.index.internal.composites.CompositesSearcher$1Transform.findEntry(CompositesSearcher.java:225) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.index.internal.composites.CompositesSearcher$1Transform.applyToRow(CompositesSearcher.java:215) 
> ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:116) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:133) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:294) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:127) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse$LocalDataResponse.(ReadResponse.java:123) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:292) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1789) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2457) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_66] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) > ~[apache-cassandra-3.3.0.jar:3.3.0] > at > 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) > [apache-cassandra-3.3.0.jar:3.3.0] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) > [apache-cassandra-3.3.0.jar:3.3.0] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66] > {code} > SStables are upgraded, I have tried repair and scrub. I have tried to rebuild > indexes, and even remove them and re-add them. It occurs on every cluster node. > Additionally I have seen in this table a case where the PRIMARY KEY was > duplicated!!! (there were two rows with the same primary key; by seeing which >
[jira] [Commented] (CASSANDRA-11299) AssertionError when querying by secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177553#comment-15177553 ] DOAN DuyHai commented on CASSANDRA-11299: - Can you please show the CQL query that triggers this error? > AssertionError when querying by secondary index > -- > > Key: CASSANDRA-11299 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11299 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.3 >Reporter: Michał Matłoka > > [issue description and stack trace identical to the previous message] > SStables are upgraded, I have tried repair and scrub. I have tried to rebuild > indexes, and even remove them and re-add them. It occurs on every cluster node. > Additionally I have seen in this table a case where the PRIMARY KEY was > duplicated!!! (there were two rows with the same primary key; by seeing which > columns were set I can say one was older and the second was from a newer query > which sets only a subset of columns) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11258) Repair scheduling - Resource locking API
[ https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177525#comment-15177525 ] Marcus Olsson commented on CASSANDRA-11258: --- bq. While I think we could add a new VERB (REMOTE_CAS) to the messaging service without a protocol bump (by reusing the UNUSED_X verbs), I think we could do this in a separate ticket to avoid losing focus here. Great, I'll create a JIRA for it and link it to this one. bq. So I propose we use a global CAS (SERIAL consistency) for each DC lock for the first version, which should make multi-dc scheduled repairs work when there is no network partition, and improve later when the REMOTE_CAS verb is in place. WDYT? +1 For this lock table to work correctly later on, it should be set up to have replicas in all data centers, right? Should this be automatically configured or should this be something that the user would have to configure when adding/removing data centers? From a usability point of view I think it would be great if this were handled automatically, and it would probably not be too hard to create a replication strategy defined as "at most X replicas in each dc", but I'm not sure if this might cause problems if someone were to use it for other purposes? > Repair scheduling - Resource locking API > > > Key: CASSANDRA-11258 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11258 > Project: Cassandra > Issue Type: Sub-task >Reporter: Marcus Olsson >Assignee: Marcus Olsson >Priority: Minor > > Create a resource locking API & implementation that is able to lock a > resource in a specified data center. It should handle priorities to avoid > node starvation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
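As a rough illustration of the CAS-based lock discussed above, the scheduler could claim a per-DC resource with a conditional (lightweight-transaction) insert. Everything concrete here (table name, columns, TTL) is an assumption made for illustration, not the actual CASSANDRA-11258 design; the sketch only builds the CQL strings:

```python
# Hedged sketch: a per-data-center repair lock claimed via an LWT
# (compare-and-set) INSERT. Table name, columns and TTL are assumptions.
LOCK_TABLE_DDL = (
    "CREATE TABLE IF NOT EXISTS system_distributed.repair_lock ("
    "resource text PRIMARY KEY, "  # e.g. 'RepairResource-DC1'
    "holder uuid, "
    "priority int)"
)

def acquire_lock_cql(resource, holder_id, ttl_seconds=30):
    # IF NOT EXISTS makes this a CAS. Executed at SERIAL it is a global
    # lock (the proposed first version); at LOCAL_SERIAL it becomes a
    # per-DC lock, which is what REMOTE_CAS forwarding would enable from
    # another data center. The TTL auto-releases the lock if the holder
    # dies without cleaning up.
    return ("INSERT INTO system_distributed.repair_lock "
            "(resource, holder, priority) VALUES ('%s', %s, 1) "
            "IF NOT EXISTS USING TTL %d" % (resource, holder_id, ttl_seconds))
```

A non-applied result from the CAS would mean another node already holds the lock; the priority column hints at where starvation-avoidance could plug in, but the actual priority mechanism is out of scope for this sketch.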
[jira] [Created] (CASSANDRA-11300) Support for forwarding of CAS requests
Marcus Olsson created CASSANDRA-11300: - Summary: Support for forwarding of CAS requests Key: CASSANDRA-11300 URL: https://issues.apache.org/jira/browse/CASSANDRA-11300 Project: Cassandra Issue Type: New Feature Reporter: Marcus Olsson Priority: Minor For CASSANDRA-11258 to be able to lock a resource in a specific data center, the CAS request would need to be forwarded to a node in that data center, which would act as the coordinator for the request with LOCAL_SERIAL consistency. The proposal is to add a Verb (REMOTE_CAS) that is used to forward the request to another node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)