[jira] [Commented] (CASSANDRA-7688) Add data sizing to a system table
[ https://issues.apache.org/jira/browse/CASSANDRA-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394179#comment-14394179 ]

Piotr Kołaczkowski commented on CASSANDRA-7688:
-----------------------------------------------

So I must have had some dump saved by some early development branch then. Thanks for the clarification.

> Add data sizing to a system table
> ---------------------------------
>
>                 Key: CASSANDRA-7688
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7688
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jeremiah Jordan
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1.5
>
>         Attachments: 7688.txt
>
> Currently you can't implement something similar to describe_splits_ex purely from a native protocol driver. https://datastax-oss.atlassian.net/browse/JAVA-312 is open to expose easily getting ownership information to a client in the java-driver. But you still need the data sizing part to get splits of a given size. We should add the sizing information to a system table so that native clients can get to it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7688) Add data sizing to a system table
[ https://issues.apache.org/jira/browse/CASSANDRA-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394179#comment-14394179 ]

Piotr Kołaczkowski edited comment on CASSANDRA-7688 at 4/3/15 8:03 AM:
-----------------------------------------------------------------------

So I must have had a dump saved by an early development branch then. Thanks for the clarification.

was (Author: pkolaczk):
So I must have had some dump saved by some early development branch then. Thanks for the clarification.
[jira] [Commented] (CASSANDRA-8893) RandomAccessReader should share its FileChannel with all instances (via SegmentedFile)
[ https://issues.apache.org/jira/browse/CASSANDRA-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394193#comment-14394193 ]

Stefania commented on CASSANDRA-8893:
-------------------------------------

Benedict, take a look at the attached patch and let me know if this is what you had in mind. The entry point is ChannelProxy, which wraps a file channel in a ref-counted way and ensures that only thread-safe operations are accessible. It also translates IO exceptions into unchecked exceptions. The channel proxy is shared by the Builder, SegmentedFile and RandomAccessReader instances. In the Builder we can receive different file paths in the complete methods, in which case we close the old channel and create a new one; this is the part I was not entirely sure about. The remaining changes are either mechanical, to pass the channel around, or fixes to remove leaks of the channel, mostly in the unit tests.

> RandomAccessReader should share its FileChannel with all instances (via SegmentedFile)
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8893
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8893
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>             Fix For: 3.0
>
> There's no good reason to open a FileChannel for each (Compressed)?RandomAccessReader, and this would simplify RandomAccessReader to just a thin wrapper.
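The ref-counting scheme described above can be sketched roughly as follows. This is a simplified illustration only, not Cassandra's actual ChannelProxy API; the class and method names here are invented for the example:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: share one FileChannel among many readers, closing it only when
// the last reference is released. Only thread-safe, positional operations
// are exposed, and IOExceptions are translated to unchecked exceptions.
final class ChannelProxySketch
{
    private final FileChannel channel;
    private final AtomicInteger refs = new AtomicInteger(1);

    ChannelProxySketch(Path file)
    {
        try
        {
            this.channel = FileChannel.open(file, StandardOpenOption.READ);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // Each new reader takes a reference before using the shared channel.
    ChannelProxySketch sharedCopy()
    {
        refs.incrementAndGet();
        return this;
    }

    // Positional read: safe to call concurrently, unlike a relative read()
    // that mutates the channel's position.
    int read(ByteBuffer dst, long position)
    {
        try
        {
            return channel.read(dst, position);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // The channel is closed only when the last reference is dropped.
    void release()
    {
        if (refs.decrementAndGet() == 0)
        {
            try { channel.close(); }
            catch (IOException e) { throw new UncheckedIOException(e); }
        }
    }

    boolean isOpen()
    {
        return channel.isOpen();
    }
}
```

The point of the design is that the expensive OS resource (the file descriptor) has a single owner with a reference count, while the cheap per-reader state (buffers, positions) stays in each RandomAccessReader.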
[jira] [Commented] (CASSANDRA-9106) disable secondary indexes by default
[ https://issues.apache.org/jira/browse/CASSANDRA-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394213#comment-14394213 ]

Sylvain Lebresne commented on CASSANDRA-9106:
---------------------------------------------

I generally agree that it's too easy to misuse, so I'm in favor of trying to make it less so, and not allowing them by default does sound like it goes in that direction. I'm definitely not in favor of using the yaml to deal with that: if we do decide to disable them by default, then I think we should simply make that capability disabled by default in the context of CASSANDRA-8303.

> disable secondary indexes by default
> ------------------------------------
>
>                 Key: CASSANDRA-9106
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9106
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jon Haddad
>             Fix For: 3.0
>
> This feature is misused constantly. Can we disable it by default, and provide a yaml config to explicitly enable it? Along with a massive warning about how they aren't there for performance, maybe with a link to documentation that explains why?
[jira] [Commented] (CASSANDRA-7557) User permissions for UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394192#comment-14394192 ]

Sylvain Lebresne commented on CASSANDRA-7557:
---------------------------------------------

bq. The only alternative I could come up with was to defer execution of terminal functions depending on the configured {{IAuthorizer}}

Alternatively, we could defer execution of functions to statement execution unconditionally. I mean, executing functions at preparation time when all terms are terminal is just a minor optimization that was done because it was easy to do, but in practice it's unlikely to be terribly useful: for non-prepared statements, executing at preparation time or at execution time doesn't matter at all, and for prepared statements, not only are function calls with only terminal terms probably not that common, but if you really care about optimizing the call, it's easy enough to compute the function client side before preparation. So honestly, if that minor optimization becomes a pain to preserve, and it does seem so with this (I would even argue that doing permission checking at preparation time is always a bad idea, because if the permission is revoked after preparation, a user would expect further executions to be rejected), I submit that we should just get rid of it and simplify the code accordingly.

> User permissions for UDFs
> -------------------------
>
>                 Key: CASSANDRA-7557
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7557
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Sam Tunnicliffe
>              Labels: client-impacting, cql, udf
>             Fix For: 3.0
>
> We probably want some new permissions for user defined functions. Most RDBMSes split function permissions roughly into {{EXECUTE}} and {{CREATE}}/{{ALTER}}/{{DROP}} permissions.
[jira] [Commented] (CASSANDRA-8893) RandomAccessReader should share its FileChannel with all instances (via SegmentedFile)
[ https://issues.apache.org/jira/browse/CASSANDRA-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394195#comment-14394195 ]

Stefania commented on CASSANDRA-8893:
-------------------------------------

This patch fixes the third point of CASSANDRA-8952.
[jira] [Commented] (CASSANDRA-8952) Remove transient RandomAccessFile usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394196#comment-14394196 ]

Stefania commented on CASSANDRA-8952:
-------------------------------------

The third point will be fixed by CASSANDRA-8893.

> Remove transient RandomAccessFile usage
> ---------------------------------------
>
>                 Key: CASSANDRA-8952
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8952
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Joshua McKenzie
>            Assignee: Stefania
>            Priority: Minor
>              Labels: Windows
>             Fix For: 3.0
>
> There are a few places within the code base where we use a RandomAccessFile transiently to either grab fds or channels for other operations. This is prone to access violations on Windows (see CASSANDRA-4050 and CASSANDRA-8709); while these usages don't appear to be causing issues at this time, there's no reason to keep them. The less RandomAccessFile usage in the code base, the more stable we'll be on Windows.
> [SSTableReader.dropPageCache|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L2021]
> * Used to getFD; a FileChannel version exists
> [FileUtils.truncate|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/util/FileUtils.java#L188]
> * Used to get a file channel for the channel truncate call. The only use is in index file close, so down-only channel truncation is acceptable.
> [MMappedSegmentedFile.createSegments|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/util/MmappedSegmentedFile.java#L196]
> * Used to get a file channel for mapping.
> Keeping these in a single ticket as all three should be fairly trivial refactors.
[jira] [Commented] (CASSANDRA-8952) Remove transient RandomAccessFile usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394202#comment-14394202 ]

Stefania commented on CASSANDRA-8952:
-------------------------------------

Regarding the first point, I only found dropPageCache() in SegmentedFile. We need to replace the transient RAF with NIO calls in CLibrary.getfd(String path), correct?
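The direction discussed in this ticket, dropping the transient RandomAccessFile in favor of opening a FileChannel directly, might look roughly like this for the FileUtils.truncate case. This is an illustrative sketch under the ticket's assumptions, not the actual patch, and the class name is invented:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Sketch: open the FileChannel via NIO instead of constructing a transient
// RandomAccessFile just to reach its channel. (Illustrative names only.)
final class TruncateSketch
{
    static void truncate(String path, long size)
    {
        try (FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.WRITE))
        {
            // FileChannel.truncate never extends the file: if size is >= the
            // current length the file is left unmodified, which matches the
            // "down-only truncation is acceptable" note in the ticket.
            channel.truncate(size);
        }
        catch (IOException e)
        {
            throw new RuntimeException(e);
        }
    }
}
```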
[jira] [Commented] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394218#comment-14394218 ]

Sylvain Lebresne commented on CASSANDRA-8979:
---------------------------------------------

To avoid any confusion, I never suggested we wouldn't do this in a minor version, just that we basically add what the last patches from [~spo...@gmail.com] add. So [~yukim], if you go ahead and commit those last patches, I'm good with closing this.

> MerkleTree mismatch for deleted and non-existing rows
> -----------------------------------------------------
>
>                 Key: CASSANDRA-8979
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8979
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Stefan Podkowinski
>            Assignee: Stefan Podkowinski
>             Fix For: 2.1.5
>
>         Attachments: 8979-AvoidBufferAllocation-2.0_patch.txt, 8979-LazilyCompactedRow-2.0.txt, 8979-RevertPrecompactedRow-2.0.txt, cassandra-2.0-8979-lazyrow_patch.txt, cassandra-2.0-8979-validator_patch.txt, cassandra-2.0-8979-validatortest_patch.txt, cassandra-2.1-8979-lazyrow_patch.txt, cassandra-2.1-8979-validator_patch.txt
>
> Validation compaction will currently create different hashes for rows that have been deleted compared to nodes that have not seen the rows at all or have already compacted them away. In case this sounds familiar to you, see CASSANDRA-4905, which was supposed to prevent hashing of expired tombstones. This still seems to be in place, but does not address the issue completely; or there was a change in 2.0 that rendered the patch ineffective. The problem is that rowHash() in the Validator will return a new hash in any case, whether the PrecompactedRow actually updated the digest or not. This leads to the case where a purged PrecompactedRow will not change the digest, but we still end up with a different tree compared to not having rowHash called at all (as when the row doesn't exist in the first place).
> As an implication, repair jobs will constantly detect mismatches between older sstables containing purgeable rows and nodes that have already compacted these rows away. After transferring the reported ranges, the newly created sstables will immediately get deleted again during the following compaction. This will happen again on each repair run until the sstable with the purgeable row finally gets compacted.
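The spirit of the fix can be shown with a toy example. This is purely illustrative and not Cassandra's Validator code; the idea is simply that a row is only folded into the Merkle tree when purging left something to hash, so a fully purged row hashes the same as a row that never existed:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch (invented names, not Cassandra's classes): return a
// row hash only when the row actually contributed bytes to the digest after
// purging. Callers skip updating the tree when null is returned, so a
// purged row and a non-existing row produce identical trees.
final class ValidatorSketch
{
    static byte[] rowHash(byte[] rowBytesAfterPurge)
    {
        if (rowBytesAfterPurge.length == 0)
            return null; // nothing survived purging: don't touch the tree

        try
        {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(rowBytesAfterPurge);
            return md.digest();
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new AssertionError(e);
        }
    }
}
```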
[jira] [Commented] (CASSANDRA-9037) Terminal UDFs evaluated at prepare time throw protocol version error
[ https://issues.apache.org/jira/browse/CASSANDRA-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394198#comment-14394198 ]

Sylvain Lebresne commented on CASSANDRA-9037:
---------------------------------------------

FYI, as I said in CASSANDRA-7557, I suggest we just entirely get rid of function execution at prepare time. The short version is that it's imo starting to add way more complexity than it's worth as an optimization.

> Terminal UDFs evaluated at prepare time throw protocol version error
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-9037
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9037
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0
>
> When a pure function with only terminal arguments (or with no arguments) is used in a where clause, it's executed at prepare time and {{Server.CURRENT_VERSION}} is passed as the protocol version for serialization purposes. For native functions, this isn't a problem, but UDFs use classes in the bundled java-driver-core jar for (de)serialization of args and return values.
When {{Server.CURRENT_VERSION}} is greater than the highest version supported by the bundled java driver, the execution fails with the following exception:
{noformat}
ERROR [SharedPool-Worker-1] 2015-03-24 18:10:59,391 QueryMessage.java:132 - Unexpected error during query
org.apache.cassandra.exceptions.FunctionExecutionException: execution of 'ks.overloaded[text]' failed: java.lang.IllegalArgumentException: No protocol version matching integer version 4
	at org.apache.cassandra.exceptions.FunctionExecutionException.create(FunctionExecutionException.java:35) ~[main/:na]
	at org.apache.cassandra.cql3.udf.gen.Cksoverloaded_1.execute(Cksoverloaded_1.java) ~[na:na]
	at org.apache.cassandra.cql3.functions.FunctionCall.executeInternal(FunctionCall.java:78) ~[main/:na]
	at org.apache.cassandra.cql3.functions.FunctionCall.access$200(FunctionCall.java:34) ~[main/:na]
	at org.apache.cassandra.cql3.functions.FunctionCall$Raw.execute(FunctionCall.java:176) ~[main/:na]
	at org.apache.cassandra.cql3.functions.FunctionCall$Raw.prepare(FunctionCall.java:161) ~[main/:na]
	at org.apache.cassandra.cql3.SingleColumnRelation.toTerm(SingleColumnRelation.java:108) ~[main/:na]
	at org.apache.cassandra.cql3.SingleColumnRelation.newEQRestriction(SingleColumnRelation.java:143) ~[main/:na]
	at org.apache.cassandra.cql3.Relation.toRestriction(Relation.java:127) ~[main/:na]
	at org.apache.cassandra.cql3.restrictions.StatementRestrictions.<init>(StatementRestrictions.java:126) ~[main/:na]
	at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepareRestrictions(SelectStatement.java:787) ~[main/:na]
	at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:740) ~[main/:na]
	at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:488) ~[main/:na]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:252) ~[main/:na]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:246) ~[main/:na]
	at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) ~[main/:na]
	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:475) [main/:na]
	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:371) [main/:na]
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
	at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_71]
	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na]
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: java.lang.IllegalArgumentException: No protocol version matching integer version 4
	at com.datastax.driver.core.ProtocolVersion.fromInt(ProtocolVersion.java:89) ~[cassandra-driver-core-2.1.2.jar:na]
	at
[jira] [Comment Edited] (CASSANDRA-8952) Remove transient RandomAccessFile usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394202#comment-14394202 ]

Stefania edited comment on CASSANDRA-8952 at 4/3/15 9:37 AM:
-------------------------------------------------------------

Regarding the first point, I only found dropPageCache() in SegmentedFile. We need to replace the transient RAF with a FileChannel in CLibrary.getfd(String path), correct? Have a quick look here for the first two points: https://github.com/stef1927/cassandra/commits/8952

was (Author: stefania):
Regarding the first point, I only found dropPageCache() in SegmentedFile. We need to replace the transient RAF with a FileChannel in CLibrary.getfd(String path), correct?
[jira] [Commented] (CASSANDRA-8820) Broken package dependency in Debian repository
[ https://issues.apache.org/jira/browse/CASSANDRA-8820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394316#comment-14394316 ]

Stephan Wienczny commented on CASSANDRA-8820:
---------------------------------------------

The reason is that the Packages file doesn't refer to the new version:

Package: cassandra
Version: 2.1.4
...
Package: cassandra-tools
Version: 2.1.3
...

cassandra-tools is available: http://dl.bintray.com/apache/cassandra/pool/main/c/cassandra/cassandra-tools_2.1.4_all.deb

So the release process has a problem: the Packages file is not updated correctly.

> Broken package dependency in Debian repository
> ----------------------------------------------
>
>                 Key: CASSANDRA-8820
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8820
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Packaging
>         Environment: Ubuntu 14.04 LTS amd64
>            Reporter: Terry Moschou
>            Assignee: T Jake Luciani
>
> The Apache Debian package repository currently has unmet dependencies.
> Configured repos:
> deb http://www.apache.org/dist/cassandra/debian 21x main
> deb-src http://www.apache.org/dist/cassandra/debian 21x main
> Problem file: cassandra/dists/21x/main/binary-amd64/Packages
> $ sudo apt-get update
> $ sudo apt-get install cassandra-tools
> ...(omitted)
> Reading state information... Done
> Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming. The following information may help to resolve the situation:
> The following packages have unmet dependencies:
>  cassandra-tools : Depends: cassandra (= 2.1.2) but it is not going to be installed
> E: Unable to correct problems, you have held broken packages.
[jira] [Comment Edited] (CASSANDRA-8952) Remove transient RandomAccessFile usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394202#comment-14394202 ]

Stefania edited comment on CASSANDRA-8952 at 4/3/15 9:18 AM:
-------------------------------------------------------------

Regarding the first point, I only found dropPageCache() in SegmentedFile. We need to replace the transient RAF with a FileChannel in CLibrary.getfd(String path), correct?

was (Author: stefania):
Regarding the first point, I only found dropPageCache() in SegmentedFile. We need to replace the transient RAF with NIO calls in CLibrary.getfd(String path), correct?
[jira] [Commented] (CASSANDRA-8915) Improve MergeIterator performance
[ https://issues.apache.org/jira/browse/CASSANDRA-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394244#comment-14394244 ]

Stefania commented on CASSANDRA-8915:
-------------------------------------

In case you guys have not seen it yet, please check the changes proposed by CASSANDRA-8180, specifically this comment: https://issues.apache.org/jira/browse/CASSANDRA-8180?focusedCommentId=14381674&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14381674. The idea is that there will be two types of candidates: a greedy one that knows its first value, as is the case right now, and a lazy one that gets compared based on a less accurate lower bound. What this means is that once this lazy candidate is picked, only then will it access the iterator to determine the exact first value, which could be much higher than the initial lower bound. The way I implemented this with the present merge iterator is to add the lazy candidate back to the priority queue after it has calculated its first accurate value. It's not very elegant, however, and it is kind of wasteful. If it is too complex to merge both approaches into one algorithm, we can always specialize a separate merge iterator implementation to support lazy candidates.

> Improve MergeIterator performance
> ---------------------------------
>
>                 Key: CASSANDRA-8915
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8915
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>            Priority: Minor
>
> The implementation of {{MergeIterator}} uses a priority queue and applies a pair of {{poll}}+{{add}} operations for every item in the resulting sequence. This is quite inefficient, as {{poll}} necessarily applies at least {{log N}} comparisons (up to {{2log N}}), and {{add}} often requires another {{log N}}, for example in the case where the inputs largely don't overlap (where {{N}} is the number of iterators being merged).
> This can easily be replaced with a simple custom structure that can perform replacement of the top of the queue in a single step, which will very often complete after a couple of comparisons, and in the worst case will match the complexity of the current implementation. This should significantly improve merge performance for iterators with limited overlap (e.g. levelled compaction).
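The replace-top idea from the ticket description can be sketched with a toy binary min-heap over ints (a simplified illustration; the actual patch operates on candidate iterators, not ints):

```java
import java.util.Arrays;

// Sketch of the proposed optimization: instead of poll() followed by add()
// (two O(log N) passes through the heap), replace the top element in place
// and sift it down once. When the merged inputs barely overlap, the new
// value usually stays at the top after a single comparison per child.
final class ReplaceTopHeap
{
    private final int[] heap;
    private final int size;

    ReplaceTopHeap(int[] values)
    {
        heap = Arrays.copyOf(values, values.length);
        size = values.length;
        // Standard bottom-up heapify.
        for (int i = size / 2 - 1; i >= 0; i--)
            siftDown(i);
    }

    int top()
    {
        return heap[0];
    }

    // One sift-down instead of poll() + add().
    void replaceTop(int value)
    {
        heap[0] = value;
        siftDown(0);
    }

    private void siftDown(int i)
    {
        while (true)
        {
            int smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < size && heap[l] < heap[smallest]) smallest = l;
            if (r < size && heap[r] < heap[smallest]) smallest = r;
            if (smallest == i) return;
            int tmp = heap[i]; heap[i] = heap[smallest]; heap[smallest] = tmp;
            i = smallest;
        }
    }
}
```

In the merge-iterator setting, "replace the top" corresponds to advancing the iterator that just produced the smallest element and re-positioning it with a single sift-down, rather than removing and re-inserting it.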
cassandra git commit: Share file handles between all instances of a SegmentedFile
Repository: cassandra
Updated Branches:
  refs/heads/trunk cf925bdfa -> 4e29b7a9a


Share file handles between all instances of a SegmentedFile

patch by stefania; reviewed by benedict for CASSANDRA-8893


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e29b7a9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e29b7a9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e29b7a9

Branch: refs/heads/trunk
Commit: 4e29b7a9a4736e7e70757dc514849c5af7e2d7d1
Parents: cf925bd
Author: Stefania Alborghetti <stefania.alborghe...@datastax.com>
Authored: Fri Apr 3 12:32:42 2015 +0100
Committer: Benedict Elliott Smith <bened...@apache.org>
Committed: Fri Apr 3 12:32:42 2015 +0100

----------------------------------------------------------------------
 .../compress/CompressedRandomAccessReader.java  |  26 ++--
 .../io/compress/CompressedThrottledReader.java  |  10 +-
 .../io/sstable/format/SSTableReader.java        |  84 ++--
 .../io/sstable/format/big/BigTableWriter.java   |  13 +-
 .../io/util/BufferedPoolingSegmentedFile.java   |  14 +-
 .../io/util/BufferedSegmentedFile.java          |  24 ++--
 .../io/util/CompressedPoolingSegmentedFile.java |  20 +--
 .../io/util/CompressedSegmentedFile.java        |  20 +--
 .../cassandra/io/util/MmappedSegmentedFile.java |  65 +++---
 .../cassandra/io/util/PoolingSegmentedFile.java |  22 ++--
 .../cassandra/io/util/RandomAccessReader.java   | 128 ++-
 .../apache/cassandra/io/util/SegmentedFile.java |  74 ---
 .../cassandra/io/util/ThrottledReader.java      |   9 +-
 .../compress/CompressedStreamWriter.java        |  14 +-
 .../apache/cassandra/db/RangeTombstoneTest.java |  27 ++--
 .../unit/org/apache/cassandra/db/ScrubTest.java |  17 +--
 .../org/apache/cassandra/db/VerifyTest.java     |   3 +-
 .../db/compaction/AntiCompactionTest.java       |  36 +++---
 .../db/compaction/CompactionsTest.java          |   4 +-
 .../cassandra/db/compaction/TTLExpiryTest.java  |   5 +-
 .../CompressedRandomAccessReaderTest.java       |  22 +++-
 .../CompressedSequentialWriterTest.java         |  10 +-
 .../cassandra/io/sstable/SSTableReaderTest.java |  21 +--
 .../io/sstable/SSTableScannerTest.java          |  28 ++--
 .../cassandra/io/sstable/SSTableUtils.java      |  20 +--
 .../io/util/BufferedRandomAccessFileTest.java   |  11 +-
 .../cassandra/io/util/DataOutputTest.java       |  14 +-
 .../apache/cassandra/io/util/MemoryTest.java    |   1 +
 28 files changed, 377 insertions(+), 365 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e29b7a9/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
index b1b4dd4..1b3cd06 100644
--- a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
@@ -33,10 +33,7 @@ import org.apache.cassandra.config.Config;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.io.FSReadError;
 import org.apache.cassandra.io.sstable.CorruptSSTableException;
-import org.apache.cassandra.io.util.CompressedPoolingSegmentedFile;
-import org.apache.cassandra.io.util.FileUtils;
-import org.apache.cassandra.io.util.PoolingSegmentedFile;
-import org.apache.cassandra.io.util.RandomAccessReader;
+import org.apache.cassandra.io.util.*;
 import org.apache.cassandra.utils.FBUtilities;

 /**
@@ -47,15 +44,15 @@ public class CompressedRandomAccessReader extends RandomAccessReader
 {
     private static final boolean useMmap = DatabaseDescriptor.getDiskAccessMode() == Config.DiskAccessMode.mmap;

-    public static CompressedRandomAccessReader open(String dataFilePath, CompressionMetadata metadata)
+    public static CompressedRandomAccessReader open(ChannelProxy channel, CompressionMetadata metadata)
     {
-        return open(dataFilePath, metadata, null);
+        return open(channel, metadata, null);
     }

-    public static CompressedRandomAccessReader open(String path, CompressionMetadata metadata, CompressedPoolingSegmentedFile owner)
+    public static CompressedRandomAccessReader open(ChannelProxy channel, CompressionMetadata metadata, CompressedPoolingSegmentedFile owner)
     {
         try
         {
-            return new CompressedRandomAccessReader(path, metadata, owner);
+            return new CompressedRandomAccessReader(channel, metadata, owner);
         }
         catch (FileNotFoundException e)
         {
@@ -78,9 +75,9 @@ public class CompressedRandomAccessReader extends RandomAccessReader
     // raw checksum bytes
     private ByteBuffer
[2/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
http://git-wip-us.apache.org/repos/asf/cassandra/blob/23c84b16/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java -- diff --cc src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java index 06234cd,000..a761e6a mode 100644,00..100644 --- a/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java +++ b/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java @@@ -1,2117 -1,0 +1,2127 @@@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.cassandra.io.sstable.format; + +import java.io.*; +import java.nio.ByteBuffer; +import java.util.*; +import java.util.concurrent.*; +import java.util.concurrent.atomic.AtomicBoolean; +import java.util.concurrent.atomic.AtomicLong; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Predicate; +import com.google.common.collect.Iterators; +import com.google.common.collect.Ordering; +import com.google.common.primitives.Longs; +import com.google.common.util.concurrent.RateLimiter; + +import com.clearspring.analytics.stream.cardinality.CardinalityMergeException; +import com.clearspring.analytics.stream.cardinality.HyperLogLogPlus; +import com.clearspring.analytics.stream.cardinality.ICardinality; +import org.apache.cassandra.cache.CachingOptions; +import org.apache.cassandra.cache.InstrumentingCache; +import org.apache.cassandra.cache.KeyCacheKey; +import org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor; +import org.apache.cassandra.concurrent.ScheduledExecutors; +import org.apache.cassandra.config.*; +import org.apache.cassandra.db.*; +import org.apache.cassandra.db.columniterator.OnDiskAtomIterator; +import org.apache.cassandra.db.commitlog.ReplayPosition; +import org.apache.cassandra.db.composites.CellName; +import org.apache.cassandra.db.filter.ColumnSlice; +import org.apache.cassandra.db.index.SecondaryIndex; +import org.apache.cassandra.dht.*; +import org.apache.cassandra.io.compress.CompressionMetadata; +import org.apache.cassandra.io.sstable.*; +import org.apache.cassandra.io.sstable.metadata.*; +import org.apache.cassandra.io.util.*; +import org.apache.cassandra.metrics.RestorableMeter; +import org.apache.cassandra.metrics.StorageMetrics; +import org.apache.cassandra.service.ActiveRepairService; +import org.apache.cassandra.service.CacheService; +import org.apache.cassandra.service.StorageService; +import org.apache.cassandra.utils.*; +import 
org.apache.cassandra.utils.concurrent.OpOrder; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.apache.cassandra.utils.concurrent.Ref; +import org.apache.cassandra.utils.concurrent.SelfRefCounted; + +import static org.apache.cassandra.db.Directories.SECONDARY_INDEX_NAME_SEPARATOR; + +/** + * An SSTableReader can be constructed in a number of places, but typically is either + * read from disk at startup, or constructed from a flushed memtable, or after compaction + * to replace some existing sstables. However once created, an sstablereader may also be modified. + * + * A reader's OpenReason describes its current stage in its lifecycle, as follows: + * + * NORMAL + * From: None => Reader has been read from disk, either at startup or from a flushed memtable + * EARLY => Reader is the final result of a compaction + * MOVED_START => Reader WAS being compacted, but this failed and it has been restored to NORMAL status + * + * EARLY + * From: None => Reader is a compaction replacement that is either incomplete and has been opened + * to represent its partial result status, or has been finished but the compaction + * it is a part of has not yet completed fully + * EARLY => Same as from None, only it is not the first time it has been + * + * MOVED_START + * From: NORMAL => Reader is being compacted. This compaction has not finished, but the compaction result + * is either partially or fully opened, to either partially or
[jira] [Commented] (CASSANDRA-9111) SSTables originated from the same incremental repair session have different repairedAt timestamps
[ https://issues.apache.org/jira/browse/CASSANDRA-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394394#comment-14394394 ] Philip Thompson commented on CASSANDRA-9111: Thanks for the patch! The file you contributed seems to have some odd characters in it. Did you create it via the steps described here: http://wiki.apache.org/cassandra/HowToContribute ? SSTables originated from the same incremental repair session have different repairedAt timestamps - Key: CASSANDRA-9111 URL: https://issues.apache.org/jira/browse/CASSANDRA-9111 Project: Cassandra Issue Type: Bug Components: Core Reporter: prmg Attachments: CASSANDRA-9111-v0.txt CASSANDRA-7168 optimizes QUORUM reads by skipping incrementally repaired SSTables on other replicas that were repaired on or before the maximum repairedAt timestamp of the coordinating replica's SSTables for the query partition. One assumption of that optimization is that SSTables originating from the same repair session on different nodes will have the same repairedAt timestamp, since the objective is to skip reading SSTables originating in the same repair session (or before). However, currently, each node independently timestamps SSTables originating from the same repair session, so they almost never have the same timestamp. 
Steps to reproduce the problem:
{code}
ccm create test
ccm populate -n 3
ccm start
ccm node1 cqlsh;
{code}
{code:sql}
CREATE KEYSPACE foo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE foo.bar ( key int, col int, PRIMARY KEY (key) );
INSERT INTO foo.bar (key, col) VALUES (1, 1);
exit;
{code}
{code}
ccm node1 flush;
ccm node2 flush;
ccm node3 flush;
nodetool -h 127.0.0.1 -p 7100 repair -par -inc foo bar
[2015-04-02 21:56:07,726] Starting repair command #1, repairing 3 ranges for keyspace foo (parallelism=PARALLEL, full=false)
[2015-04-02 21:56:07,816] Repair session 3655b670-d99c-11e4-b250-9107aba35569 for range (3074457345618258602,-9223372036854775808] finished
[2015-04-02 21:56:07,816] Repair session 365a4a50-d99c-11e4-b250-9107aba35569 for range (-9223372036854775808,-3074457345618258603] finished
[2015-04-02 21:56:07,818] Repair session 365bf800-d99c-11e4-b250-9107aba35569 for range (-3074457345618258603,3074457345618258602] finished
[2015-04-02 21:56:07,995] Repair command #1 finished

sstablemetadata ~/.ccm/test/node1/data/foo/bar-377b5540d99d11e49cc09107aba35569/foo-bar-ka-1-Statistics.db ~/.ccm/test/node2/data/foo/bar-377b5540d99d11e49cc09107aba35569/foo-bar-ka-1-Statistics.db ~/.ccm/test/node3/data/foo/bar-377b5540d99d11e49cc09107aba35569/foo-bar-ka-1-Statistics.db | grep Repaired
Repaired at: 1428023050318
Repaired at: 1428023050322
Repaired at: 1428023050340
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
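The three repairedAt values in the repro output differ by only a few milliseconds, but that is enough to defeat the skip logic entirely. A minimal sketch of why the optimization never fires, using the values from the output above (this is an illustration of the comparison, not Cassandra code):

```java
// Illustration of the CASSANDRA-7168 skip condition, using the repairedAt
// values from the repro above. A remote sstable is skipped only when its
// repairedAt is <= the coordinator's maximum; because every node stamps
// independently, sstables from the same session almost always look "newer".
public class RepairedAtSkew {
    public static boolean canSkip(long remoteRepairedAt, long coordinatorMax) {
        return remoteRepairedAt <= coordinatorMax;
    }

    public static void main(String[] args) {
        long coordinatorMax = 1428023050318L;              // node1's sstable
        long[] remote = { 1428023050322L, 1428023050340L }; // node2, node3
        for (long repairedAt : remote)
            System.out.println(canSkip(repairedAt, coordinatorMax)); // prints false twice
    }
}
```

If all three nodes carried the same session timestamp, `canSkip` would return true for both remote sstables, which is the behavior the patch aims to restore.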
cassandra git commit: follow up to CASSANDRA-8670: providing small improvements to performance of writeUTF; and improving safety of DataOutputBuffer when size is known upfront
Repository: cassandra Updated Branches: refs/heads/trunk 4e29b7a9a - c2ecfe7b7 follow up to CASSANDRA-8670: providing small improvements to performance of writeUTF; and improving safety of DataOutputBuffer when size is known upfront patch by ariel and benedict for CASSANDRA-8670 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c2ecfe7b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c2ecfe7b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c2ecfe7b Branch: refs/heads/trunk Commit: c2ecfe7b7bffbced652b4da9dcf4ca263d345695 Parents: 4e29b7a Author: Ariel Weisberg ariel.wesib...@datastax.com Authored: Fri Apr 3 12:29:17 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Apr 3 12:33:29 2015 +0100 -- .../cassandra/db/commitlog/CommitLog.java | 5 +- .../cassandra/db/marshal/CompositeType.java | 3 +- .../io/util/BufferedDataOutputStreamPlus.java | 4 +- .../io/util/DataOutputBufferFixed.java | 65 .../cassandra/service/pager/PagingState.java| 3 +- .../streaming/messages/StreamInitMessage.java | 3 +- .../org/apache/cassandra/utils/FBUtilities.java | 3 +- .../io/util/BufferedDataOutputStreamTest.java | 39 8 files changed, 117 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2ecfe7b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java -- diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java index 7fa7575..cf38d44 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java @@ -29,10 +29,10 @@ import com.google.common.annotations.VisibleForTesting; import org.slf4j.Logger; import org.slf4j.LoggerFactory; - import org.apache.commons.lang3.StringUtils; import com.github.tjake.ICRC32; + import org.apache.cassandra.config.Config; import 
org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.ParameterizedClass; @@ -41,6 +41,7 @@ import org.apache.cassandra.io.FSWriteError; import org.apache.cassandra.io.compress.CompressionParameters; import org.apache.cassandra.io.compress.ICompressor; import org.apache.cassandra.io.util.BufferedDataOutputStreamPlus; +import org.apache.cassandra.io.util.DataOutputBufferFixed; import org.apache.cassandra.metrics.CommitLogMetrics; import org.apache.cassandra.net.MessagingService; import org.apache.cassandra.service.StorageService; @@ -251,7 +252,7 @@ public class CommitLog implements CommitLogMBean { ICRC32 checksum = CRC32Factory.instance.create(); final ByteBuffer buffer = alloc.getBuffer(); -BufferedDataOutputStreamPlus dos = new BufferedDataOutputStreamPlus(null, buffer); +BufferedDataOutputStreamPlus dos = new DataOutputBufferFixed(buffer); // checksummed length dos.writeInt((int) size); http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2ecfe7b/src/java/org/apache/cassandra/db/marshal/CompositeType.java -- diff --git a/src/java/org/apache/cassandra/db/marshal/CompositeType.java b/src/java/org/apache/cassandra/db/marshal/CompositeType.java index 9ee9fb3..1bc772d 100644 --- a/src/java/org/apache/cassandra/db/marshal/CompositeType.java +++ b/src/java/org/apache/cassandra/db/marshal/CompositeType.java @@ -32,6 +32,7 @@ import org.apache.cassandra.exceptions.SyntaxException; import org.apache.cassandra.cql3.ColumnIdentifier; import org.apache.cassandra.cql3.Operator; import org.apache.cassandra.io.util.DataOutputBuffer; +import org.apache.cassandra.io.util.DataOutputBufferFixed; import org.apache.cassandra.serializers.MarshalException; import org.apache.cassandra.utils.ByteBufferUtil; @@ -403,7 +404,7 @@ public class CompositeType extends AbstractCompositeType { try { -DataOutputBuffer out = new DataOutputBuffer(serializedSize); +DataOutputBuffer out = new DataOutputBufferFixed(serializedSize); if (isStatic) 
out.writeShort(STATIC_MARKER); http://git-wip-us.apache.org/repos/asf/cassandra/blob/c2ecfe7b/src/java/org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.java -- diff --git a/src/java/org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.java b/src/java/org/apache/cassandra/io/util/BufferedDataOutputStreamPlus.java index f4f46a1..5669a8d 100644 ---
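The CompositeType hunk above swaps a growable DataOutputBuffer for DataOutputBufferFixed once serializedSize has been precomputed: if the size calculation is wrong, the write should fail loudly rather than silently reallocate and mask the bug. A rough sketch of that safety idea (class and method names here are illustrative, not Cassandra's actual API):

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical illustration of the DataOutputBufferFixed idea: wrap a
// fixed-capacity buffer that fails fast on overflow instead of growing,
// so an incorrect up-front size estimate surfaces immediately.
class FixedSizeOutput {
    private final ByteBuffer buffer;

    FixedSizeOutput(int exactSize) {
        this.buffer = ByteBuffer.allocate(exactSize);
    }

    void writeShort(int v) {
        if (buffer.remaining() < 2)
            throw new BufferOverflowException(); // size estimate was wrong
        buffer.putShort((short) v);
    }

    void writeInt(int v) {
        if (buffer.remaining() < 4)
            throw new BufferOverflowException(); // size estimate was wrong
        buffer.putInt(v);
    }

    ByteBuffer toByteBuffer() {
        ByteBuffer r = buffer.duplicate();
        r.flip(); // expose only the bytes written so far
        return r;
    }
}
```

A growable buffer would happily accept the extra write; the fixed variant turns the miscalculation into an immediate, debuggable exception.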
[jira] [Commented] (CASSANDRA-9092) Nodes in DC2 die during and after huge write workload
[ https://issues.apache.org/jira/browse/CASSANDRA-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394368#comment-14394368 ] Sam Tunnicliffe commented on CASSANDRA-9092: What consistency level are you writing at? How are your clients performing the writes, thrift or native protocol? How do your clients balance requests? Are they simply sending them round robin or using token aware routing? Are you writing in only one DC or to both? Are there errors or warnings in the logs of the nodes which don't fail? Also, I don't think the schema you posted is complete, as the primary key includes a {{chunk}} column not in the table definition. If this is not your regular workload (i.e. it's a periodic bulk load) and you expect the normal usage pattern to be different, disabling hinted handoff temporarily may be a reasonable workaround for you, provided you aren't relying on CL.ANY and your clients handle {{UnavailableException}} sanely. You'll also need to run repair after the load completes. If that isn't an option, bumping the delivery threads and opening the throttle might prevent a huge hints buildup if you have sufficient bandwidth and CPU, but I doubt it will help much, as the nodes or network are clearly already overwhelmed; otherwise there wouldn't be so many hints being written in the first place. Nodes in DC2 die during and after huge write workload - Key: CASSANDRA-9092 URL: https://issues.apache.org/jira/browse/CASSANDRA-9092 Project: Cassandra Issue Type: Bug Environment: CentOS 6.2 64-bit, Cassandra 2.1.2, java version 1.7.0_71 Java(TM) SE Runtime Environment (build 1.7.0_71-b14) Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode) Reporter: Sergey Maznichenko Assignee: Sam Tunnicliffe Fix For: 2.1.5 Attachments: cassandra_crash1.txt Hello, We have Cassandra 2.1.2 with 8 nodes, 4 in DC1 and 4 in DC2. 
Each node is a VM with 8 CPUs and 32GB RAM. During a significant workload (loading several million blobs of ~3.5MB each), one node in DC2 stops, and after some time the next two nodes in DC2 also stop. Now, two of the nodes in DC2 do not work and stop 5-10 minutes after starting. I see many files in the system.hints table, and the error appears 2-3 minutes after system.hints auto compaction starts. "Stops" means:
ERROR [CompactionExecutor:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:1,1,main] java.lang.OutOfMemoryError: Java heap space
ERROR [HintedHandoff:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[HintedHandoff:1,1,main] java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
The full error listing is attached in cassandra_crash1.txt. The problem exists only in DC2. We have 1GbE between DC1 and DC2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9112) Remove ternary construction of SegmentedFile.Builder in readers
[ https://issues.apache.org/jira/browse/CASSANDRA-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-9112: Attachment: 9112.txt Remove ternary construction of SegmentedFile.Builder in readers --- Key: CASSANDRA-9112 URL: https://issues.apache.org/jira/browse/CASSANDRA-9112 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 3.0 Attachments: 9112.txt Self explanatory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9111) SSTables originated from the same incremental repair session have different repairedAt timestamps
[ https://issues.apache.org/jira/browse/CASSANDRA-9111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9111: --- Reviewer: Yuki Morishita -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Conflicts: src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/23c84b16 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/23c84b16 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/23c84b16 Branch: refs/heads/trunk Commit: 23c84b169febc59d3d2927bdc6389104d7d869e7 Parents: c2ecfe7 345455d Author: Benedict Elliott Smith bened...@apache.org Authored: Fri Apr 3 12:58:07 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Apr 3 12:58:07 2015 +0100 -- CHANGES.txt | 1 + .../io/sstable/format/SSTableReader.java| 24 ++-- 2 files changed, 18 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/23c84b16/CHANGES.txt -- diff --cc CHANGES.txt index d049640,9ddb9c9..e8cb20b --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,94 -1,5 +1,95 @@@ +3.0 + * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) + * Make it possible to major compact LCS (CASSANDRA-7272) + * Make FunctionExecutionException extend RequestExecutionException + (CASSANDRA-9055) + * Add support for SELECT JSON, INSERT JSON syntax and new toJson(), fromJson() + functions (CASSANDRA-7970) + * Optimise max purgeable timestamp calculation in compaction (CASSANDRA-8920) + * Constrain internode message buffer sizes, and improve IO class hierarchy (CASSANDRA-8670) + * New tool added to validate all sstables in a node (CASSANDRA-5791) + * Push notification when tracing completes for an operation (CASSANDRA-7807) + * Delay node up and node added notifications until native protocol server is started (CASSANDRA-8236) + * Compressed Commit Log (CASSANDRA-6809) + * Optimise IntervalTree (CASSANDRA-8988) + * Add a key-value payload for third party usage (CASSANDRA-8553) + * Bump metrics-reporter-config dependency for metrics 3.0 (CASSANDRA-8149) + * 
Partition intra-cluster message streams by size, not type (CASSANDRA-8789) + * Add WriteFailureException to native protocol, notify coordinator of + write failures (CASSANDRA-8592) + * Convert SequentialWriter to nio (CASSANDRA-8709) + * Add role based access control (CASSANDRA-7653, 8650, 7216, 8760, 8849, 8761, 8850) + * Record client ip address in tracing sessions (CASSANDRA-8162) + * Indicate partition key columns in response metadata for prepared + statements (CASSANDRA-7660) + * Merge UUIDType and TimeUUIDType parse logic (CASSANDRA-8759) + * Avoid memory allocation when searching index summary (CASSANDRA-8793) + * Optimise (Time)?UUIDType Comparisons (CASSANDRA-8730) + * Make CRC32Ex into a separate maven dependency (CASSANDRA-8836) + * Use preloaded jemalloc w/ Unsafe (CASSANDRA-8714) + * Avoid accessing partitioner through StorageProxy (CASSANDRA-8244, 8268) + * Upgrade Metrics library and remove deprecated metrics (CASSANDRA-5657) + * Serializing Row cache alternative, fully off heap (CASSANDRA-7438) + * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707) + * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560) + * Support direct buffer decompression for reads (CASSANDRA-8464) + * DirectByteBuffer compatible LZ4 methods (CASSANDRA-7039) + * Group sstables for anticompaction correctly (CASSANDRA-8578) + * Add ReadFailureException to native protocol, respond + immediately when replicas encounter errors while handling + a read request (CASSANDRA-7886) + * Switch CommitLogSegment from RandomAccessFile to nio (CASSANDRA-8308) + * Allow mixing token and partition key restrictions (CASSANDRA-7016) + * Support index key/value entries on map collections (CASSANDRA-8473) + * Modernize schema tables (CASSANDRA-8261) + * Support for user-defined aggregation functions (CASSANDRA-8053) + * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419) + * Refactor SelectStatement, return IN results in natural order 
instead + of IN value list order and ignore duplicate values in partition key IN restrictions (CASSANDRA-7981) + * Support UDTs, tuples, and collections in user-defined + functions (CASSANDRA-7563) + * Fix aggregate fn results on empty selection, result column name, + and cqlsh parsing (CASSANDRA-8229) + * Mark sstables as repaired after full repair (CASSANDRA-7586) + * Extend Descriptor to include a format value and refactor reader/writer + APIs (CASSANDRA-7443) + * Integrate JMH for microbenchmarks (CASSANDRA-8151) + * Keep sstable levels when bootstrapping (CASSANDRA-7460) + * Add Sigar library and perform basic OS settings
[1/3] cassandra git commit: Do not load read meters for offline operations
Repository: cassandra Updated Branches: refs/heads/trunk c2ecfe7b7 - 23c84b169 Do not load read meters for offline operations patch by benedict; reviewed by tyler for CASSANDRA-9082 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/345455de Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/345455de Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/345455de Branch: refs/heads/trunk Commit: 345455dee2b154e5a9b10a7a615bcc0c7092775d Parents: 49d64c2 Author: Benedict Elliott Smith bened...@apache.org Authored: Fri Apr 3 12:53:45 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Apr 3 12:53:45 2015 +0100 -- CHANGES.txt | 1 + .../cassandra/io/sstable/SSTableReader.java | 24 ++-- 2 files changed, 18 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/345455de/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index b1499c1..9ddb9c9 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.5 + * Do not load read meter for offline operations (CASSANDRA-9082) * cqlsh: Make CompositeType data readable (CASSANDRA-8919) * cqlsh: Fix display of triggers (CASSANDRA-9081) * Fix NullPointerException when deleting or setting an element by index on http://git-wip-us.apache.org/repos/asf/cassandra/blob/345455de/src/java/org/apache/cassandra/io/sstable/SSTableReader.java -- diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java index 8fd7b85..c73d4a1 100644 --- a/src/java/org/apache/cassandra/io/sstable/SSTableReader.java +++ b/src/java/org/apache/cassandra/io/sstable/SSTableReader.java @@ -378,6 +378,7 @@ public class SSTableReader extends SSTable implements SelfRefCountedSSTableRead return open(descriptor, components, metadata, partitioner, true); } +// use only for offline or Standalone operations public static SSTableReader 
openNoValidation(Descriptor descriptor, Set<Component> components, CFMetaData metadata) throws IOException { return open(descriptor, components, metadata, StorageService.getPartitioner(), false); @@ -434,7 +435,7 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> sstable.ifile = ibuilder.complete(sstable.descriptor.filenameFor(Component.PRIMARY_INDEX)); sstable.dfile = dbuilder.complete(sstable.descriptor.filenameFor(Component.DATA)); sstable.bf = FilterFactory.AlwaysPresent; -sstable.setup(); +sstable.setup(true); return sstable; } @@ -478,7 +479,7 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> sstable.load(validationMetadata); logger.debug("INDEX LOAD TIME for {}: {} ms.", descriptor, TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start)); -sstable.setup(); +sstable.setup(!validate); if (validate) sstable.validate(); @@ -599,7 +600,7 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> this.dfile = dfile; this.indexSummary = indexSummary; this.bf = bloomFilter; -this.setup(); +this.setup(false); } public static long getTotalBytes(Iterable<SSTableReader> sstables) @@ -2010,9 +2011,9 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> return selfRef.ref(); } -void setup() +void setup(boolean isOffline) { -tidy.setup(this); +tidy.setup(this, isOffline); this.readMeter = tidy.global.readMeter; } @@ -2059,7 +2060,7 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> private boolean setup; -void setup(SSTableReader reader) +void setup(SSTableReader reader, boolean isOffline) { this.setup = true; this.bf = reader.bf; @@ -2070,6 +2071,8 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> this.typeRef = DescriptorTypeTidy.get(reader); this.type = typeRef.get(); this.global = type.globalRef.get(); +if (!isOffline) +global.ensureReadMeter(); } InstanceTidier(Descriptor descriptor, 
CFMetaData metadata) @@ -2212,7 +2215,7 @@ public class SSTableReader extends SSTable implements SelfRefCounted<SSTableReader> private RestorableMeter readMeter; // the scheduled persistence of the readMeter, that we
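The hunks above thread an isOffline flag from the open methods down to the tidy object, so that offline tools never trigger loading or persisting the read meter. A toy sketch of that pattern (illustrative names only, not the real SSTableReader/GlobalTidy classes):

```java
// Illustrative-only sketch of the CASSANDRA-9082 pattern: offline
// operations open readers without touching system tables, so the read
// meter must be skipped for them and loaded only for online readers.
class ReaderSketch {
    // stands in for the side effect of GlobalTidy.ensureReadMeter()
    static int metersLoaded = 0;

    final boolean offline;

    ReaderSketch(boolean offline) {
        this.offline = offline;
        setup(offline);
    }

    private void setup(boolean isOffline) {
        if (!isOffline)
            metersLoaded++; // only online readers register a read meter
    }
}
```

The key point of the commit is that the decision is made once, at setup time, rather than each caller remembering to avoid the meter.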
[jira] [Commented] (CASSANDRA-9092) Nodes in DC2 die during and after huge write workload
[ https://issues.apache.org/jira/browse/CASSANDRA-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394352#comment-14394352 ] Sergey Maznichenko commented on CASSANDRA-9092: --- Should I provide any additional information from the failed node? I want to delete all hints and run repair on this node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9112) Remove ternary construction of SegmentedFile.Builder in readers
Benedict created CASSANDRA-9112: --- Summary: Remove ternary construction of SegmentedFile.Builder in readers Key: CASSANDRA-9112 URL: https://issues.apache.org/jira/browse/CASSANDRA-9112 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 3.0 Self explanatory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9110) Bounded/RingBuffer CQL Collections
[ https://issues.apache.org/jira/browse/CASSANDRA-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9110: --- Fix Version/s: 3.1 Bounded/RingBuffer CQL Collections -- Key: CASSANDRA-9110 URL: https://issues.apache.org/jira/browse/CASSANDRA-9110 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jim Plush Priority: Minor Fix For: 3.1 Feature Request: I've had frequent use cases for bounded and RingBuffer-based collections. For example: I want to store the first 100 times I've seen this thing, or the last 100 times I've seen this thing. Currently that means having to do application-level READ/WRITE operations, and we like to keep some of our high-scale apps write-only where possible. While probably expensive for exactly N items, an approximation should be good enough for most applications, where the N in our example could be 100 or 102; it could even be made tunable on the type or table. For the RingBuffer example, consider that I only want to store the last N login attempts for a user. Once N+1 comes in, it issues a delete for the oldest one in the collection, or waits until compaction to drop the overflow data as long as the CQL returns the right bounds. A potential implementation idea, given the row key would live on a single node, would be to have an LRU-based counter cache (tunable in the yaml settings in MB) that keeps a current count of how many items are already in the collection for that row key; if an insert would exceed the bound, toss it. Something akin to: CREATE TABLE users ( user_id text PRIMARY KEY, first_name text, first_logins set<text, 100, oldest>, last_logins set<text, 100, newest> ); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
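The "last N" semantics requested above can be approximated client-side today with read-modify-write. A minimal Java sketch of the ring-buffer behavior the ticket describes (illustrative application code; the proposed bounded-set CQL syntax does not exist):

```java
import java.util.ArrayDeque;
import java.util.Collection;

// Client-side approximation of the requested "last N" collection:
// keep at most maxSize elements, evicting the oldest when a new one
// arrives — the eviction the ticket proposes doing server-side.
class LastN<T> {
    private final int maxSize;
    private final ArrayDeque<T> items = new ArrayDeque<>();

    LastN(int maxSize) { this.maxSize = maxSize; }

    void add(T item) {
        if (items.size() == maxSize)
            items.removeFirst(); // evict oldest, like the proposed delete
        items.addLast(item);
    }

    Collection<T> snapshot() { return new ArrayDeque<>(items); }
}
```

The ticket's point is precisely that this eviction currently requires application-level reads and writes, whereas a bounded server-side collection would keep the workload write-only.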
[jira] [Created] (CASSANDRA-9113) Improve error message when bootstrap fails
Philip Thompson created CASSANDRA-9113: -- Summary: Improve error message when bootstrap fails Key: CASSANDRA-9113 URL: https://issues.apache.org/jira/browse/CASSANDRA-9113 Project: Cassandra Issue Type: Wish Components: Core Reporter: Philip Thompson Fix For: 3.1 Currently when bootstrap fails, users see a {{RuntimeException: Stream failed}} with a long stack trace. This typically brings them to IRC, the mailing list, or JIRA. However, most of the time it is not due to a C* server failure, but to network or machine issues. While there are probably improvements that could be made to the resiliency of streaming, it would be nice if, assuming no server errors are detected, users were shown a less traumatic error message instead of the RuntimeException, one that includes or points to documentation on how to resolve a failed bootstrap stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8905) IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12
[ https://issues.apache.org/jira/browse/CASSANDRA-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8905: --- Fix Version/s: 2.0.15 IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12 --- Key: CASSANDRA-8905 URL: https://issues.apache.org/jira/browse/CASSANDRA-8905 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Fix For: 2.0.15 After upgrade from 1.2.18 to 2.0.12, I've started to get exceptions like: {noformat} ERROR [CompactionExecutor:1149] 2015-03-04 11:48:46,045 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1149,1,main] java.lang.IllegalArgumentException: Illegal Capacity: -2147483648 at java.util.ArrayList.init(ArrayList.java:142) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:182) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:194) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:138) at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186) at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98) at org.apache.cassandra.db.compaction.PrecompactedRow.init(PrecompactedRow.java:85) at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55) at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} I've identified which sstable is causing this, it's an -ic- format sstable, i.e. something written before the upgrade. I can repeat with forceUserDefinedCompaction. Running upgradesstables also causes the same exception. Scrub helps, but skips a row as incorrect. I can share the sstable privately if it helps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
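The `Illegal Capacity: -2147483648` above is characteristic of a corrupted (or mis-framed) 4-byte length field: when the high bit of a big-endian 32-bit count is set, Java's `DataInputStream.readInt()` decodes it as a large negative number, which `new ArrayList<>(count)` then rejects. A minimal sketch of that failure mode (illustrative helper names, not Cassandra's code):

```python
import struct

def read_subcolumn_count(raw: bytes) -> int:
    """Decode a big-endian signed 32-bit count, the way Java's
    DataInputStream.readInt() would (hypothetical helper for illustration)."""
    (count,) = struct.unpack(">i", raw)
    return count

def allocate(count: int) -> list:
    # Mirror ArrayList's constructor check: negative capacities are rejected.
    if count < 0:
        raise ValueError(f"Illegal Capacity: {count}")
    return [None] * count

# A length field whose high bit is set decodes as -2**31, exactly the
# capacity reported in the stack trace above.
corrupt = bytes([0x80, 0x00, 0x00, 0x00])
count = read_subcolumn_count(corrupt)
```

This is consistent with scrub "fixing" the problem by skipping the row: the on-disk -ic- data at that position does not contain a valid count where the 2.0 super-column deserializer expects one.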
[jira] [Commented] (CASSANDRA-8905) IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12
[ https://issues.apache.org/jira/browse/CASSANDRA-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394511#comment-14394511 ] Philip Thompson commented on CASSANDRA-8905: [~krummas], if scrubbing solved the issue, do we consider this a problem? IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12 --- Key: CASSANDRA-8905 URL: https://issues.apache.org/jira/browse/CASSANDRA-8905 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Fix For: 2.0.15 After upgrade from 1.2.18 to 2.0.12, I've started to get exceptions like: {noformat} ERROR [CompactionExecutor:1149] 2015-03-04 11:48:46,045 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1149,1,main] java.lang.IllegalArgumentException: Illegal Capacity: -2147483648 at java.util.ArrayList.init(ArrayList.java:142) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:182) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:194) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:138) at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186) at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98) at org.apache.cassandra.db.compaction.PrecompactedRow.init(PrecompactedRow.java:85) at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55) at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} I've identified which sstable is causing this, it's an -ic- format sstable, i.e. something written before the upgrade. I can repeat with forceUserDefinedCompaction. Running upgradesstables also causes the same exception. Scrub helps, but skips a row as incorrect. I can share the sstable privately if it helps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8589) Reconciliation in presence of tombstone might yield stale data
[ https://issues.apache.org/jira/browse/CASSANDRA-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394526#comment-14394526 ] Philip Thompson commented on CASSANDRA-8589: [~slebresne], would you like this on your backlog? Or should I assign it to Benjamin, Tyler, or Carl? Reconciliation in presence of tombstone might yield state data -- Key: CASSANDRA-8589 URL: https://issues.apache.org/jira/browse/CASSANDRA-8589 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Consider 3 replica A, B, C (so RF=3) and consider that we do the following sequence of actions at {{QUORUM}} where I indicate the replicas acknowledging each operation (and let's assume that a replica that don't ack is a replica that don't get the update): {noformat} CREATE TABLE test (k text, t int, v int, PRIMARY KEY (k, t)) INSERT INTO test(k, t, v) VALUES ('k', 0, 0); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 1, 1); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 2, 2); // acked by A, B and C DELETE FROM test WHERE k='k' AND t=1; // acked by A and C UPDATE test SET v = 3 WHERE k='k' AND t=2;// acked by B and C SELECT * FROM test WHERE k='k' LIMIT 2; // answered by A and B {noformat} Every operation has achieved quorum, but on the last read, A will respond {{0-0, tombstone 1, 2-2}} and B will respond {{0-0, 1-1}}. As a consequence we'll answer {{0-0, 2-2}} which is incorrect (we should respond {{0-0, 2-3}}). Put another way, if we have a limit, every replica honors that limit but since tombstones can suppress results from other nodes, we may have some cells for which we actually don't get a quorum of response (even though we globally have a quorum of replica responses). 
In practice, this probably occurs rather rarely, so the simpler fix is probably to do something similar to the short-read protection: detect when this could have happened (based on how replica responses are reconciled) and do an additional request in that case. That detection will have potential false positives, but I suspect we can be precise enough that those false positives will be very rare (we should nonetheless track how often this code gets triggered, and if we see that it's more often than we think, we could proactively bump user limits internally to reduce those occurrences). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
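The scenario in the ticket can be reproduced with a small simulation (a sketch of the reconciliation logic, not Cassandra's code): each replica ships rows until LIMIT live cells are included, tombstones don't count toward the limit but do participate in the merge, and the highest write timestamp wins per cell. A's tombstone then suppresses B's cell 1, while B never shipped its newer value for cell 2, so the coordinator answers with A's stale value:

```python
TOMBSTONE = object()

def replica_response(rows, limit):
    """Return rows in clustering order until `limit` live cells are included.
    Tombstones don't count toward the limit but are shipped for reconciliation."""
    out, live = [], 0
    for t in sorted(rows):
        out.append((t,) + rows[t])
        if rows[t][0] is not TOMBSTONE:
            live += 1
        if live == limit:
            break
    return out

def reconcile(responses, limit):
    """Merge per cell by highest write timestamp, drop tombstones, apply limit."""
    merged = {}
    for resp in responses:
        for t, value, ts in resp:
            if t not in merged or ts > merged[t][1]:
                merged[t] = (value, ts)
    live = [(t, v) for t, (v, ts) in sorted(merged.items()) if v is not TOMBSTONE]
    return live[:limit]

# Replica state after the ticket's sequence, as (value, writetime):
A = {0: (0, 1), 1: (TOMBSTONE, 4), 2: (2, 3)}   # A missed the UPDATE
B = {0: (0, 1), 1: (1, 2), 2: (3, 5)}           # B missed the DELETE

answer = reconcile([replica_response(A, 2), replica_response(B, 2)], 2)
```

With the per-replica limit of 2, `answer` is the incorrect `[(0, 0), (2, 2)]`; merging the replicas' full data instead yields the expected `[(0, 0), (2, 3)]`, which is what a follow-up request (the proposed fix) would recover.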
[jira] [Updated] (CASSANDRA-9113) Improve error message when bootstrap fails
[ https://issues.apache.org/jira/browse/CASSANDRA-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9113: --- Priority: Minor (was: Major) Improve error message when bootstrap fails -- Key: CASSANDRA-9113 URL: https://issues.apache.org/jira/browse/CASSANDRA-9113 Project: Cassandra Issue Type: Wish Components: Core Reporter: Philip Thompson Priority: Minor Fix For: 3.1 Currently when bootstrap fails, users see a {{RuntimeException: Stream failed}} with a long stack trace. This typically brings them to IRC, the mailing list, or JIRA. However, most of the time it is not due to a C* server failure, but to network or machine issues. While there are probably improvements that could be made to the resiliency of streaming, it would be nice if, assuming no server errors are detected, users were shown a less alarming error message instead of the RuntimeException, one that includes or points to documentation on how to resolve a failed bootstrap stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8589) Reconciliation in presence of tombstone might yield stale data
[ https://issues.apache.org/jira/browse/CASSANDRA-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394530#comment-14394530 ] Sylvain Lebresne commented on CASSANDRA-8589: - It would actually be nice to start by ensuring we can reproduce it through a dtest. It shouldn't be too hard to write one, and there's no point in chasing a complex solution if, like for CASSANDRA-8933, something I forgot about in the code made this not a problem. Also, CASSANDRA-8099 should actually solve this, so if that's confirmed by said reproduction dtest, maybe we're good with fixing it in 3.0 only. Reconciliation in presence of tombstone might yield stale data -- Key: CASSANDRA-8589 URL: https://issues.apache.org/jira/browse/CASSANDRA-8589 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Consider 3 replicas A, B, C (so RF=3) and consider that we do the following sequence of actions at {{QUORUM}}, where I indicate the replicas acknowledging each operation (and let's assume that a replica that doesn't ack is a replica that doesn't get the update): {noformat} CREATE TABLE test (k text, t int, v int, PRIMARY KEY (k, t)) INSERT INTO test(k, t, v) VALUES ('k', 0, 0); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 1, 1); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 2, 2); // acked by A, B and C DELETE FROM test WHERE k='k' AND t=1; // acked by A and C UPDATE test SET v = 3 WHERE k='k' AND t=2; // acked by B and C SELECT * FROM test WHERE k='k' LIMIT 2; // answered by A and B {noformat} Every operation has achieved quorum, but on the last read, A will respond {{0-0, tombstone 1, 2-2}} and B will respond {{0-0, 1-1}}. As a consequence we'll answer {{0-0, 2-2}}, which is incorrect (we should respond {{0-0, 2-3}}).
Put another way, if we have a limit, every replica honors that limit, but since tombstones can suppress results from other nodes, we may have some cells for which we actually don't get a quorum of responses (even though we globally have a quorum of replica responses). In practice, this probably occurs rather rarely, so the simpler fix is probably to do something similar to the short-read protection: detect when this could have happened (based on how replica responses are reconciled) and do an additional request in that case. That detection will have potential false positives, but I suspect we can be precise enough that those false positives will be very rare (we should nonetheless track how often this code gets triggered, and if we see that it's more often than we think, we could proactively bump user limits internally to reduce those occurrences). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8589) Reconciliation in presence of tombstone might yield stale data
[ https://issues.apache.org/jira/browse/CASSANDRA-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8589: --- Tester: Ryan McGuire Fix Version/s: 3.0 Reconciliation in presence of tombstone might yield state data -- Key: CASSANDRA-8589 URL: https://issues.apache.org/jira/browse/CASSANDRA-8589 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Fix For: 3.0 Consider 3 replica A, B, C (so RF=3) and consider that we do the following sequence of actions at {{QUORUM}} where I indicate the replicas acknowledging each operation (and let's assume that a replica that don't ack is a replica that don't get the update): {noformat} CREATE TABLE test (k text, t int, v int, PRIMARY KEY (k, t)) INSERT INTO test(k, t, v) VALUES ('k', 0, 0); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 1, 1); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 2, 2); // acked by A, B and C DELETE FROM test WHERE k='k' AND t=1; // acked by A and C UPDATE test SET v = 3 WHERE k='k' AND t=2;// acked by B and C SELECT * FROM test WHERE k='k' LIMIT 2; // answered by A and B {noformat} Every operation has achieved quorum, but on the last read, A will respond {{0-0, tombstone 1, 2-2}} and B will respond {{0-0, 1-1}}. As a consequence we'll answer {{0-0, 2-2}} which is incorrect (we should respond {{0-0, 2-3}}). Put another way, if we have a limit, every replica honors that limit but since tombstones can suppress results from other nodes, we may have some cells for which we actually don't get a quorum of response (even though we globally have a quorum of replica responses). In practice, this probably occurs rather rarely and so the simpler fix is probably to do something similar to the short reads protection: detect when this could have happen (based on how replica response are reconciled) and do an additional request in that case. 
That detection will have potential false positives but I suspect we can be precise enough that those false positives will be very very rare (we should nonetheless track how often this code gets triggered and if we see that it's more often than we think, we could pro-actively bump user limits internally to reduce those occurrences). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8915) Improve MergeIterator performance
[ https://issues.apache.org/jira/browse/CASSANDRA-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394250#comment-14394250 ] Benedict commented on CASSANDRA-8915: - I perhaps should have commented when I first saw the link. It should be quite viable to merge the behaviours; the Candidate just needs to have a flag indicating if the value is real or not, and to just discard the not-real values it encounters. Improve MergeIterator performance - Key: CASSANDRA-8915 URL: https://issues.apache.org/jira/browse/CASSANDRA-8915 Project: Cassandra Issue Type: Improvement Reporter: Branimir Lambov Assignee: Branimir Lambov Priority: Minor The implementation of {{MergeIterator}} uses a priority queue and applies a pair of {{poll}}+{{add}} operations for every item in the resulting sequence. This is quite inefficient as {{poll}} necessarily applies at least {{log N}} comparisons (up to {{2log N}}), and {{add}} often requires another {{log N}}, for example in the case where the inputs largely don't overlap (where {{N}} is the number of iterators being merged). This can easily be replaced with a simple custom structure that can perform replacement of the top of the queue in a single step, which will very often complete after a couple of comparisons and in the worst case scenarios will match the complexity of the current implementation. This should significantly improve merge performance for iterators with limited overlap (e.g. levelled compaction). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
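The optimization Branimir describes, replacing the top of the queue in a single step rather than a {{poll}}+{{add}} pair, can be sketched with Python's `heapq.heapreplace`, which does exactly one sift-down (a sketch of the idea, not the Cassandra patch):

```python
import heapq

def merge_with_replace_top(*iterators):
    """Merge sorted iterators. Instead of pop (up to 2 log N comparisons)
    followed by push (another log N), replace the heap top in place with the
    exhausted iterator's next item: one sift-down, which terminates after a
    comparison or two when the inputs barely overlap."""
    its = [iter(i) for i in iterators]
    heap = []
    for idx, it in enumerate(its):
        for first in it:                 # seed the heap with each head element
            heap.append((first, idx))
            break
    heapq.heapify(heap)
    out = []
    while heap:
        value, idx = heap[0]             # peek, don't pop
        out.append(value)
        nxt = next(its[idx], None)
        if nxt is None:
            heapq.heappop(heap)          # iterator exhausted: shrink the heap
        else:
            heapq.heapreplace(heap, (nxt, idx))  # single sift-down, not pop+push
    return out
```

For non-overlapping inputs (the levelled-compaction case mentioned above), the replacement element usually stays at the top, so the sift-down exits after one comparison per item instead of walking the heap twice.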
[jira] [Commented] (CASSANDRA-7688) Add data sizing to a system table
[ https://issues.apache.org/jira/browse/CASSANDRA-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394294#comment-14394294 ] Piotr Kołaczkowski commented on CASSANDRA-7688: --- Will there be a command to manually refresh statistics of a table from CQL (like ANALYZE TABLE ...)? I need a way to trigger this in an integration test and I don't want to wait until it automatically refreshes after the update interval... 1. create table 2. add data 3. analyze (?) 4. check stats Add data sizing to a system table - Key: CASSANDRA-7688 URL: https://issues.apache.org/jira/browse/CASSANDRA-7688 Project: Cassandra Issue Type: New Feature Reporter: Jeremiah Jordan Assignee: Aleksey Yeschenko Fix For: 2.1.5 Attachments: 7688.txt Currently you can't implement something similar to describe_splits_ex purely from a native protocol driver. https://datastax-oss.atlassian.net/browse/JAVA-312 is open to expose easily getting ownership information to a client in the java-driver. But you still need the data sizing part to get splits of a given size. We should add the sizing information to a system table so that native clients can get to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Share file handles between all instances of a SegmentedFile
Repository: cassandra Updated Branches: refs/heads/trunk 868457de2 - cf925bdfa Share file handles between all instances of a SegmentedFile patch by stefania; reviewed by benedict for CASSANDRA-8893 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cf925bdf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cf925bdf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cf925bdf Branch: refs/heads/trunk Commit: cf925bdfa2f211784eb22d2b98b7176e551dda69 Parents: 868457d Author: Stefania Alborghetti stefania.alborghe...@datastax.com Authored: Fri Apr 3 11:43:30 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Apr 3 11:43:30 2015 +0100 -- CHANGES.txt | 1 + .../apache/cassandra/io/util/ChannelProxy.java | 182 +++ .../cassandra/io/RandomAccessReaderTest.java| 234 +++ 3 files changed, 417 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/cf925bdf/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index bda5bb7..d049640 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0 + * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) * Make it possible to major compact LCS (CASSANDRA-7272) * Make FunctionExecutionException extend RequestExecutionException (CASSANDRA-9055) http://git-wip-us.apache.org/repos/asf/cassandra/blob/cf925bdf/src/java/org/apache/cassandra/io/util/ChannelProxy.java -- diff --git a/src/java/org/apache/cassandra/io/util/ChannelProxy.java b/src/java/org/apache/cassandra/io/util/ChannelProxy.java new file mode 100644 index 000..79954a5 --- /dev/null +++ b/src/java/org/apache/cassandra/io/util/ChannelProxy.java @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. 
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.cassandra.io.util; + +import java.io.File; +import java.io.IOException; +import java.nio.ByteBuffer; +import java.nio.MappedByteBuffer; +import java.nio.channels.FileChannel; +import java.nio.channels.WritableByteChannel; +import java.nio.file.StandardOpenOption; + +import org.apache.cassandra.io.FSReadError; +import org.apache.cassandra.utils.CLibrary; +import org.apache.cassandra.utils.concurrent.RefCounted; +import org.apache.cassandra.utils.concurrent.SharedCloseableImpl; + +/** + * A proxy of a FileChannel that: + * + * - implements reference counting + * - exports only thread safe FileChannel operations + * - wraps IO exceptions into runtime exceptions + * + * Tested by RandomAccessReaderTest. 
+ */ +public final class ChannelProxy extends SharedCloseableImpl +{ +private final String filePath; +private final FileChannel channel; + +public static FileChannel openChannel(File file) +{ +try +{ +return FileChannel.open(file.toPath(), StandardOpenOption.READ); +} +catch (IOException e) +{ +throw new RuntimeException(e); +} +} + +public ChannelProxy(String path) +{ +this (new File(path)); +} + +public ChannelProxy(File file) +{ +this(file.getAbsolutePath(), openChannel(file)); +} + +public ChannelProxy(String filePath, FileChannel channel) +{ +super(new Cleanup(filePath, channel)); + +this.filePath = filePath; +this.channel = channel; +} + +public ChannelProxy(ChannelProxy copy) +{ +super(copy); + +this.filePath = copy.filePath; +this.channel = copy.channel; +} + +private final static class Cleanup implements RefCounted.Tidy +{ +final String filePath; +final FileChannel channel; + +protected Cleanup(String filePath, FileChannel channel) +{ +this.filePath = filePath; +this.channel = channel; +} + +public String name() +{ +
[jira] [Updated] (CASSANDRA-8893) RandomAccessReader should share its FileChannel with all instances (via SegmentedFile)
[ https://issues.apache.org/jira/browse/CASSANDRA-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-8893: Reviewer: Benedict RandomAccessReader should share its FileChannel with all instances (via SegmentedFile) -- Key: CASSANDRA-8893 URL: https://issues.apache.org/jira/browse/CASSANDRA-8893 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Fix For: 3.0 There's no good reason to open a FileChannel for each \(Compressed\)\?RandomAccessReader, and this would simplify RandomAccessReader to just a thin wrapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8589) Reconciliation in presence of tombstone might yield stale data
[ https://issues.apache.org/jira/browse/CASSANDRA-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394536#comment-14394536 ] Philip Thompson commented on CASSANDRA-8589: Okay, I've set Ryan as tester, he'll forward it along. Reconciliation in presence of tombstone might yield state data -- Key: CASSANDRA-8589 URL: https://issues.apache.org/jira/browse/CASSANDRA-8589 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Fix For: 3.0 Consider 3 replica A, B, C (so RF=3) and consider that we do the following sequence of actions at {{QUORUM}} where I indicate the replicas acknowledging each operation (and let's assume that a replica that don't ack is a replica that don't get the update): {noformat} CREATE TABLE test (k text, t int, v int, PRIMARY KEY (k, t)) INSERT INTO test(k, t, v) VALUES ('k', 0, 0); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 1, 1); // acked by A, B and C INSERT INTO test(k, t, v) VALUES ('k', 2, 2); // acked by A, B and C DELETE FROM test WHERE k='k' AND t=1; // acked by A and C UPDATE test SET v = 3 WHERE k='k' AND t=2;// acked by B and C SELECT * FROM test WHERE k='k' LIMIT 2; // answered by A and B {noformat} Every operation has achieved quorum, but on the last read, A will respond {{0-0, tombstone 1, 2-2}} and B will respond {{0-0, 1-1}}. As a consequence we'll answer {{0-0, 2-2}} which is incorrect (we should respond {{0-0, 2-3}}). Put another way, if we have a limit, every replica honors that limit but since tombstones can suppress results from other nodes, we may have some cells for which we actually don't get a quorum of response (even though we globally have a quorum of replica responses). In practice, this probably occurs rather rarely and so the simpler fix is probably to do something similar to the short reads protection: detect when this could have happen (based on how replica response are reconciled) and do an additional request in that case. 
That detection will have potential false positives but I suspect we can be precise enough that those false positives will be very very rare (we should nonetheless track how often this code gets triggered and if we see that it's more often than we think, we could pro-actively bump user limits internally to reduce those occurrences). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8948) cassandra-stress does not honour consistency level (cl) parameter when used in combination with user command
[ https://issues.apache.org/jira/browse/CASSANDRA-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8948: --- Fix Version/s: (was: 2.1.5) cassandra-stress does not honour consistency level (cl) parameter when used in combination with user command Key: CASSANDRA-8948 URL: https://issues.apache.org/jira/browse/CASSANDRA-8948 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Andreas Flinck Assignee: T Jake Luciani Fix For: 2.1.5 Attachments: 8948.txt The stress test tool does not honour cl parameter when used in combination with the user command. Consistency level will be default ONE no matter what is set by cl=. Works fine with write command. How to reproduce: 1. Create a suitable yaml-file to use in test 2. Run e.g. {code}./cassandra-stress user profile=./file.yaml cl=ALL no-warmup duration=10s ops\(insert=1\) -rate threads=4 -port jmx=7100{code} 3. Observe that cl=ONE in trace logs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8949) CompressedSequentialWriter.resetAndTruncate can lose data
[ https://issues.apache.org/jira/browse/CASSANDRA-8949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8949: --- Fix Version/s: 2.0.15 CompressedSequentialWriter.resetAndTruncate can lose data - Key: CASSANDRA-8949 URL: https://issues.apache.org/jira/browse/CASSANDRA-8949 Project: Cassandra Issue Type: Bug Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Critical Fix For: 2.0.15, 2.1.5 If the FileMark passed into this method fully fills the buffer, a subsequent call to write will reBuffer and drop the data currently in the buffer. We need to mark the buffer contents as dirty in resetAndTruncate to prevent this - see CASSANDRA-8709 notes for details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
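The bug class described above can be modelled in a few lines: after resetting to a mark that lands exactly at the end of the buffer, the next write triggers a re-buffer, and unless the restored bytes are flagged as dirty, the re-buffer discards them. This is a minimal illustrative sketch of that invariant, with hypothetical names, not Cassandra's `CompressedSequentialWriter`:

```python
class MarkableWriter:
    """Toy buffered writer with mark/reset-and-truncate. The `dirty` flag is
    the point of the sketch: without it, a rebuffer after resetting to a mark
    that exactly fills the buffer would silently drop the buffered bytes."""
    def __init__(self, capacity=4):
        self.flushed = bytearray()
        self.buffer = bytearray()
        self.capacity = capacity
        self.dirty = False

    def write(self, data: bytes):
        for b in data:
            if len(self.buffer) == self.capacity:
                self._rebuffer()
            self.buffer.append(b)
            self.dirty = True

    def _rebuffer(self):
        if self.dirty:                   # the crucial check: flush, don't drop
            self.flushed += self.buffer
        self.buffer = bytearray()
        self.dirty = False

    def mark(self) -> int:
        return len(self.flushed) + len(self.buffer)

    def reset_and_truncate(self, mark: int):
        kept = (self.flushed + self.buffer)[:mark]
        split = len(kept) - self.capacity if len(kept) > self.capacity else 0
        self.flushed = bytearray(kept[:split])
        self.buffer = bytearray(kept[split:])
        self.dirty = len(self.buffer) > 0   # mark restored bytes as unwritten

    def data(self) -> bytes:
        return bytes(self.flushed + self.buffer)
```

If `reset_and_truncate` left `dirty` as False, a reset to a full-buffer mark followed by another write would lose the buffer's contents on rebuffer, which is the data-loss path the ticket fixes.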
[jira] [Updated] (CASSANDRA-8948) cassandra-stress does not honour consistency level (cl) parameter when used in combination with user command
[ https://issues.apache.org/jira/browse/CASSANDRA-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8948: --- Fix Version/s: 2.0.15 cassandra-stress does not honour consistency level (cl) parameter when used in combination with user command Key: CASSANDRA-8948 URL: https://issues.apache.org/jira/browse/CASSANDRA-8948 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Andreas Flinck Assignee: T Jake Luciani Fix For: 2.1.5 Attachments: 8948.txt The stress test tool does not honour cl parameter when used in combination with the user command. Consistency level will be default ONE no matter what is set by cl=. Works fine with write command. How to reproduce: 1. Create a suitable yaml-file to use in test 2. Run e.g. {code}./cassandra-stress user profile=./file.yaml cl=ALL no-warmup duration=10s ops\(insert=1\) -rate threads=4 -port jmx=7100{code} 3. Observe that cl=ONE in trace logs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8934) COPY command has inherent 128KB field size limit
[ https://issues.apache.org/jira/browse/CASSANDRA-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8934: --- Fix Version/s: 2.0.15 COPY command has inherent 128KB field size limit Key: CASSANDRA-8934 URL: https://issues.apache.org/jira/browse/CASSANDRA-8934 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Brian Hess Assignee: Philip Thompson Labels: cqlsh, docs-impacting Fix For: 2.0.15, 2.1.5 Attachments: 8934-2.0.txt, 8934-2.1.txt In using the COPY command as follows: {{cqlsh -e COPY test.test1mb(pkey, ccol, data) FROM 'in/data1MB/data1MB_9.csv'}} the following error is thrown: {{stdin:1:field larger than field limit (131072)}} The data file contains a field that is greater than 128KB (it's more like almost 1MB). A work-around (thanks to [~jjordan] and [~thobbs]) is to modify the cqlsh script and add the line {{csv.field_size_limit(10)}} anywhere after the line {{import csv}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
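The 131072 in the error is the default of Python's `csv.field_size_limit()` (128 KB), which cqlsh, being a Python script, inherits. The workaround's effect can be demonstrated directly; the replacement limit here is an arbitrary large value, as long as it exceeds the largest expected field:

```python
import csv
import io

# Python's csv module caps any single field at 128 KB by default; cqlsh's
# COPY FROM inherits this, producing "field larger than field limit (131072)".
old_limit = csv.field_size_limit()

big_field = "x" * (old_limit + 1)          # just over the 128 KB cap
data = f"key,{big_field}\n"

error_message = None
try:
    list(csv.reader(io.StringIO(data)))
except csv.Error as e:
    error_message = str(e)                 # the same error the ticket reports

# The workaround: raise the limit before parsing.
csv.field_size_limit(1_000_000_000)
rows = list(csv.reader(io.StringIO(data)))
```

`csv.field_size_limit(new_limit)` returns the old limit and applies process-wide, which is why a one-line edit anywhere after `import csv` in the cqlsh script is sufficient.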
[jira] [Commented] (CASSANDRA-7688) Add data sizing to a system table
[ https://issues.apache.org/jira/browse/CASSANDRA-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394926#comment-14394926 ] Aleksey Yeschenko commented on CASSANDRA-7688: -- There most definitely won't be a separate CQL command just for that, but when we switch this to a virtual table implementation (when we have those) it might be as simple as {{UPDATE}}ing a boolean field in that table to trigger recalc. We could temporarily add a JMX method. Or you could set the interval to be really low for now, and add some sleep. I know it's a bit ugly, but it's just an interim measure. Add data sizing to a system table - Key: CASSANDRA-7688 URL: https://issues.apache.org/jira/browse/CASSANDRA-7688 Project: Cassandra Issue Type: New Feature Reporter: Jeremiah Jordan Assignee: Aleksey Yeschenko Fix For: 2.1.5 Attachments: 7688.txt Currently you can't implement something similar to describe_splits_ex purely from a native protocol driver. https://datastax-oss.atlassian.net/browse/JAVA-312 is open to expose easily getting ownership information to a client in the java-driver. But you still need the data sizing part to get splits of a given size. We should add the sizing information to a system table so that native clients can get to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
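The interim approach suggested above, a low refresh interval plus some sleep, is more robust as a bounded poll than a fixed sleep. A generic sketch of that test helper; in a real integration test, `fetch` would run a query against the sizing system table through the native-protocol driver (that wiring is assumed, the helper itself is self-contained):

```python
import time

def poll_until(fetch, predicate, interval=0.5, timeout=30.0):
    """Re-run `fetch` every `interval` seconds until `predicate` accepts its
    result or `timeout` elapses. Intended for tests that must wait for a
    server-side refresh (e.g. size estimates) rather than sleep a fixed time."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if predicate(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before timeout")
        time.sleep(interval)
```

This keeps the test fast when the refresh lands early and still fails deterministically (with a clear error) when it never lands.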
[jira] [Updated] (CASSANDRA-7712) temporary files need to be cleaned by unit tests
[ https://issues.apache.org/jira/browse/CASSANDRA-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-7712: --- Fix Version/s: 2.0.15 temporary files need to be cleaned by unit tests Key: CASSANDRA-7712 URL: https://issues.apache.org/jira/browse/CASSANDRA-7712 Project: Cassandra Issue Type: Test Components: Tests Reporter: Michael Shuler Assignee: Michael Shuler Priority: Minor Labels: bootcamp, lhf Fix For: 2.0.15, 2.1.5 Attachments: 7712-hung-CliTest_system.log.gz, 7712-v2.txt, 7712-v3.txt, 7712_workaround.txt, CASSANDRA-7712_apache_cassandra_2.0.txt There are many unit test temporary files left behind after test runs. In the case of CI servers, I have seen 70,000 files accumulate in /tmp over a period of time. Each unit test should make an effort to remove its temporary files when the test is completed. My current unit test cleanup block: {noformat} # clean up after unit tests.. rm -rf /tmp/140*-0 /tmp/CFWith* /tmp/Counter1* /tmp/DescriptorTest* /tmp/Keyspace1* \ /tmp/KeyStreamingTransferTestSpace* /tmp/SSTableExportTest* /tmp/SSTableImportTest* \ /tmp/Standard1* /tmp/Statistics.db* /tmp/StreamingTransferTest* /tmp/ValuesWithQuotes* \ /tmp/cassandra* /tmp/jna-* /tmp/ks-cf-ib-1-* /tmp/lengthtest* /tmp/liblz4-java*.so /tmp/readtest* \ /tmp/set_length_during_read_mode* /tmp/set_negative_length* /tmp/snappy-*.so {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8950) NullPointerException in nodetool getendpoints with non-existent keyspace or table
[ https://issues.apache.org/jira/browse/CASSANDRA-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8950: --- Fix Version/s: 2.0.15 NullPointerException in nodetool getendpoints with non-existent keyspace or table - Key: CASSANDRA-8950 URL: https://issues.apache.org/jira/browse/CASSANDRA-8950 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Tyler Hobbs Assignee: Stefania Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: 8950-2.0.txt, 8950-2.1.txt If {{nodetool getendpoints}} is run with a non-existent keyspace or table table, a NullPointerException will occur: {noformat} ~/cassandra $ bin/nodetool getendpoints badkeyspace badtable mykey error: null -- StackTrace -- java.lang.NullPointerException at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2914) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at 
com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
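The NPE comes from the lookup proceeding with null metadata for the unknown keyspace/table. The shape of the fix is to validate both names up front and fail with a descriptive error. A minimal sketch only, with hypothetical class and method names, not the committed 8950 patch:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EndpointLookup {
    private final Map<String, List<String>> tablesByKeyspace = new HashMap<>();

    public EndpointLookup() {
        // Toy schema standing in for the cluster's real metadata.
        tablesByKeyspace.put("thing", List.of("user", "object"));
    }

    // Fail fast with a descriptive IllegalArgumentException instead of
    // letting a null metadata reference propagate into an NPE.
    public List<String> getNaturalEndpoints(String keyspace, String table) {
        List<String> tables = tablesByKeyspace.get(keyspace);
        if (tables == null)
            throw new IllegalArgumentException("Unknown keyspace: " + keyspace);
        if (!tables.contains(table))
            throw new IllegalArgumentException("Unknown table: " + keyspace + "." + table);
        return List.of("127.0.0.1"); // placeholder for the real token-based resolution
    }
}
```

With this guard, `nodetool getendpoints badkeyspace badtable mykey` would report the unknown name instead of `error: null`.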
[jira] [Updated] (CASSANDRA-8559) OOM caused by large tombstone warning.
[ https://issues.apache.org/jira/browse/CASSANDRA-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8559: --- Fix Version/s: 2.0.15 OOM caused by large tombstone warning. -- Key: CASSANDRA-8559 URL: https://issues.apache.org/jira/browse/CASSANDRA-8559 Project: Cassandra Issue Type: Bug Components: Core Environment: 2.0.11 / 2.1 Reporter: Dominic Letz Assignee: Aleksey Yeschenko Labels: tombstone Fix For: 2.0.15, 2.1.5 Attachments: 8559.txt, Selection_048.png, cassandra-2.0.11-8559.txt, stacktrace.log When running with a high number of tombstones, the error message generation from CASSANDRA-6117 can lead to an out-of-memory situation with the default settings. Attached is a heapdump viewed in VisualVM showing how this construct created two 777 MB strings to print the error message for a read query and then crashed with an OOM.
{code}
if (respectTombstoneThresholds() && columnCounter.ignored() > DatabaseDescriptor.getTombstoneWarnThreshold())
{
    StringBuilder sb = new StringBuilder();
    CellNameType type = container.metadata().comparator;
    for (ColumnSlice sl : slices)
    {
        assert sl != null;
        sb.append('[');
        sb.append(type.getString(sl.start));
        sb.append('-');
        sb.append(type.getString(sl.finish));
        sb.append(']');
    }
    logger.warn("Read {} live and {} tombstoned cells in {}.{} (see tombstone_warn_threshold). {} columns was requested, slices={}, delInfo={}",
                columnCounter.live(), columnCounter.ignored(), container.metadata().ksName, container.metadata().cfName, count, sb, container.deletionInfo());
}
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
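One mitigation for this class of bug is to cap the size of the diagnostic string instead of describing every slice of a pathological query. A minimal sketch, assuming a hypothetical cap constant and a simplified slice representation; this is not the committed 8559.txt patch:

```java
public class SliceLogger {
    // Illustrative cap; the real patch may choose a different limit.
    private static final int MAX_WARNING_CHARS = 1024;

    // Build the slice description but stop once the cap is reached, so a
    // warning for a huge query cannot allocate multi-hundred-MB strings.
    public static String describeSlices(String[][] slices) {
        StringBuilder sb = new StringBuilder();
        for (String[] sl : slices) {
            if (sb.length() >= MAX_WARNING_CHARS) {
                sb.append("... (truncated)");
                break;
            }
            sb.append('[').append(sl[0]).append('-').append(sl[1]).append(']');
        }
        return sb.toString();
    }
}
```

The warning stays useful for small queries while bounding memory for the degenerate ones that triggered the OOM.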
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394805#comment-14394805 ] Jonathan Ellis commented on CASSANDRA-7066: --- I like it. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: compaction Fix For: 3.0 Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so this can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394656#comment-14394656 ] Jonathan Ellis commented on CASSANDRA-7066: --- bq. if we're compacting multiple files into one, we write that the new file(s) are in progress, then when they're done, we write a new log file saying we're swapping these files (as a checkpoint), then clear the in progress log file and write that we're deleting the old files, followed by immediately promoting the new ones and deleting our swapping log entry Since all writes are idempotent now, I think we are okay simplifying this to ... write that the new file(s) are in progress, then when they're done, we clear the in progress log file and delete the old files. If the process dies in between those two steps (very rare, deletes are fast), we have some extra redundant data left, but correctness is preserved. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: compaction Fix For: 3.0 Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so this can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. 
It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
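The startup rule proposed in the ticket, delete any sstable that appears in the union of all ancestor sets, can be sketched in a few lines. Integer generation numbers stand in for sstables here; this is an illustration of the proposal, not Cassandra's implementation:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LeftoverCleanup {
    // Each on-disk sstable (keyed by generation) records the generations of
    // its direct ancestors. Any sstable whose generation appears in some
    // other sstable's ancestor set has been superseded by a completed
    // compaction and is a leftover that can be deleted at startup.
    public static Set<Integer> findLeftovers(Map<Integer, Set<Integer>> ancestorsBySSTable) {
        Set<Integer> ancestorUnion = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsBySSTable.values())
            ancestorUnion.addAll(ancestors);
        Set<Integer> leftovers = new HashSet<>();
        for (Integer generation : ancestorsBySSTable.keySet())
            if (ancestorUnion.contains(generation))
                leftovers.add(generation);
        return leftovers;
    }
}
```

For example, if generation 3 was compacted from generations 1 and 2 but the process died before deleting them, all three files are present at startup and 1 and 2 are identified as leftovers.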
[jira] [Resolved] (CASSANDRA-8952) Remove transient RandomAccessFile usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie resolved CASSANDRA-8952. Resolution: Fixed In retrospect, linking to line #'s on trunk in a ticket isn't useful. Changes look good and cover the few places I had concerns about w/regards to Windows. Committed w/1 nit: added copyright header to the 2 new test files. Remove transient RandomAccessFile usage --- Key: CASSANDRA-8952 URL: https://issues.apache.org/jira/browse/CASSANDRA-8952 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Joshua McKenzie Assignee: Stefania Priority: Minor Labels: Windows Fix For: 3.0 There are a few places within the code base where we use a RandomAccessFile transiently to either grab fd's or channels for other operations. This is prone to access violations on Windows (see CASSANDRA-4050 and CASSANDRA-8709) - while these usages don't appear to be causing issues at this time there's no reason to keep them. The less RandomAccessFile usage in the code-base the more stable we'll be on Windows. [SSTableReader.dropPageCache|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L2021] * Used to getFD, have FileChannel version [FileUtils.truncate|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/util/FileUtils.java#L188] * Used to get file channel for channel truncate call. Only use is in index file close so channel truncation down-only is acceptable. [MMappedSegmentedFile.createSegments|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/util/MmappedSegmentedFile.java#L196] * Used to get file channel for mapping. Keeping these in a single ticket as all three should be fairly trivial refactors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
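The direction described for FileUtils.truncate, getting a channel without a transient RandomAccessFile, can be sketched with NIO's FileChannel.open. A hedged sketch of the idea, not the committed patch:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class Truncator {
    // Open a FileChannel directly for the truncate call, avoiding the
    // transient RandomAccessFile that can trigger access violations on
    // Windows (CASSANDRA-4050, CASSANDRA-8709).
    public static void truncate(Path path, long size) {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
            channel.truncate(size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

FileChannel.truncate only shrinks (it is a no-op when the file is already smaller), which matches the ticket's note that down-only truncation is acceptable for index file close.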
[jira] [Updated] (CASSANDRA-7816) Duplicate DOWN/UP Events Pushed with Native Protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-7816: --- Fix Version/s: 2.0.15 Duplicate DOWN/UP Events Pushed with Native Protocol Key: CASSANDRA-7816 URL: https://issues.apache.org/jira/browse/CASSANDRA-7816 Project: Cassandra Issue Type: Bug Components: API Reporter: Michael Penick Assignee: Stefania Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: 7816-v2.0.txt, tcpdump_repeating_status_change.txt, trunk-7816.txt Added MOVED_NODE as a possible type of topology change and also specified that it is possible to receive the same event multiple times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8734) Expose commit log archive status
[ https://issues.apache.org/jira/browse/CASSANDRA-8734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8734: --- Fix Version/s: 2.0.15 Expose commit log archive status Key: CASSANDRA-8734 URL: https://issues.apache.org/jira/browse/CASSANDRA-8734 Project: Cassandra Issue Type: New Feature Components: Config Reporter: Philip S Doctor Assignee: Chris Lohfink Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: 8734-cassandra-2.0.txt, 8734-cassandra-2.1.txt The operational procedure for modifying commit log archiving is to edit commitlog_archiving.properties and then perform a restart. However, this has troublesome edge cases: 1) It is possible for people to modify commitlog_archiving.properties but then not perform a restart. 2) It is possible for people to modify commitlog_archiving.properties on only some nodes. 3) It is possible for people to have modified the file and restarted, but then later add more nodes without the correct modifications. For these reasons, it is operationally useful to be able to audit the commit log archive state of a node. Simply parsing commitlog_archiving.properties is insufficient due to #1, so I would suggest exposing this via either a system table or JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
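The key point is to expose the *effective* (in-memory) archiving configuration, which can differ from whatever is currently on disk in commitlog_archiving.properties. A hypothetical holder class illustrating the idea; the names and property keys here are assumptions, and the committed patch's JMX surface may differ:

```java
import java.util.Properties;

public class CommitLogArchiveStatus {
    // Snapshot of the properties the node actually loaded at startup,
    // as opposed to the current on-disk file contents.
    private final Properties active;

    public CommitLogArchiveStatus(Properties active) {
        this.active = active;
    }

    // "archive_command" is the key used in commitlog_archiving.properties.
    public String getArchiveCommand() {
        return active.getProperty("archive_command", "");
    }

    // Archiving is effectively enabled only when a command is configured.
    public boolean isArchivingEnabled() {
        return !getArchiveCommand().isEmpty();
    }
}
```

Comparing this reported state against the file on disk catches edge case #1 (edited file, no restart) directly.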
[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2341e945 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2341e945 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2341e945 Branch: refs/heads/trunk Commit: 2341e945b950afd631faaad9189e61191d2cc2fe Parents: 7d68ced e4072cf Author: Marcus Eriksson marc...@apache.org Authored: Fri Apr 3 20:58:37 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Apr 3 20:58:37 2015 +0200 -- CHANGES.txt | 1 + .../apache/cassandra/tools/SSTableOfflineRelevel.java | 14 -- 2 files changed, 13 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2341e945/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2341e945/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java --
[jira] [Updated] (CASSANDRA-8056) nodetool snapshot keyspace -cf table -t sametagname does not work on multiple tables of the same keyspace
[ https://issues.apache.org/jira/browse/CASSANDRA-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8056: --- Fix Version/s: 2.0.15 nodetool snapshot keyspace -cf table -t sametagname does not work on multiple tables of the same keyspace -- Key: CASSANDRA-8056 URL: https://issues.apache.org/jira/browse/CASSANDRA-8056 Project: Cassandra Issue Type: Improvement Components: Core Environment: Cassandra 2.0.6 debian wheezy and squeeze Reporter: Esha Pathak Priority: Trivial Labels: lhf Fix For: 2.0.15, 2.1.5 Attachments: CASSANDRA-8056.txt Scenario: keyspace thing has tables thing:user, thing:object, thing:user_details. Steps to reproduce:
1. nodetool snapshot thing --column-family user --tag tagname
Requested creating snapshot for: thing and table: user
Snapshot directory: tagname
2. nodetool snapshot thing --column-family object --tag tagname
Requested creating snapshot for: thing and table: object
Exception in thread "main" java.io.IOException: Snapshot tagname already exists. 
at org.apache.cassandra.service.StorageService.takeColumnFamilySnapshot(StorageService.java:2274) at sun.reflect.GeneratedMethodAccessor129.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at 
sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
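The symptom suggests the guard rejects a tag if any table in the keyspace already has a snapshot with that name, which is why reusing one tag across tables of the same keyspace fails. A hedged reconstruction of such a guard, consistent with the observed behavior but not taken from StorageService:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class SnapshotGuard {
    // Reject the tag if any table directory under the keyspace already
    // contains snapshots/<tag>. This keyspace-wide check is what makes
    // per-table snapshots with a shared tag fail on the second table.
    public static void ensureTagUnused(Path keyspaceDir, String tag) {
        try (Stream<Path> tables = Files.list(keyspaceDir)) {
            if (tables.anyMatch(t -> Files.exists(t.resolve("snapshots").resolve(tag))))
                throw new IllegalStateException("Snapshot " + tag + " already exists.");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The improvement tracked here is essentially to scope this check per table rather than per keyspace.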
[jira] [Comment Edited] (CASSANDRA-8905) IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12
[ https://issues.apache.org/jira/browse/CASSANDRA-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394661#comment-14394661 ] Philip Thompson edited comment on CASSANDRA-8905 at 4/3/15 4:40 PM: If this was fixed by scrub, it was most likely a corrupted sstable. was (Author: philipthompson): Fixed by scrub. Probably corrupted sstable. IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12 --- Key: CASSANDRA-8905 URL: https://issues.apache.org/jira/browse/CASSANDRA-8905 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Fix For: 2.0.15 After upgrade from 1.2.18 to 2.0.12, I've started to get exceptions like: {noformat} ERROR [CompactionExecutor:1149] 2015-03-04 11:48:46,045 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1149,1,main] java.lang.IllegalArgumentException: Illegal Capacity: -2147483648 at java.util.ArrayList.<init>(ArrayList.java:142) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:182) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:194) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:138) at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186) at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98) at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85) at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55) at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98) at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} I've identified which sstable is causing this, it's an -ic- format sstable, i.e. something written before the upgrade. I can repeat with forceUserDefinedCompaction. Running upgradesstables also causes the same exception. Scrub helps, but skips a row as incorrect. I can share the sstable privately if it helps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
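The capacity value -2147483648 is Integer.MIN_VALUE, a telltale sign that a corrupt or mis-interpreted on-disk length field reached an ArrayList constructor unchecked. Defensive deserializers validate such counts before allocating; a minimal sketch of the pattern, not Cassandra's actual SuperColumns reader:

```java
import java.io.IOException;

public class LengthValidator {
    // Reject obviously corrupt element counts read from disk before they
    // reach new ArrayList<>(count) and surface as a confusing
    // "Illegal Capacity" IllegalArgumentException mid-compaction.
    public static int checkedCount(int count, int maxReasonable) throws IOException {
        if (count < 0 || count > maxReasonable)
            throw new IOException("Corrupt element count read from sstable: " + count);
        return count;
    }
}
```

Failing with an I/O-level "corrupt sstable" error makes the scrub-it remedy obvious, instead of the generic IllegalArgumentException above.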
[jira] [Updated] (CASSANDRA-9036) disk full when running cleanup (on a far from full disk)
[ https://issues.apache.org/jira/browse/CASSANDRA-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9036: --- Fix Version/s: 2.0.15 disk full when running cleanup (on a far from full disk) -- Key: CASSANDRA-9036 URL: https://issues.apache.org/jira/browse/CASSANDRA-9036 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Assignee: Robert Stupp Fix For: 2.0.15, 2.1.5 Attachments: 9036-2.0.txt, 9036-2.1.txt, 9036-3.0.txt I'm trying to run cleanup, but get this: {noformat} INFO [CompactionExecutor:18] 2015-03-25 10:29:16,355 CompactionManager.java (line 564) Cleaning up SSTableReader(path='/cassandra/production/Data_daily/production-Data_daily-jb-4345750-Data.db') ERROR [CompactionExecutor:18] 2015-03-25 10:29:16,664 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:18,1,main] java.io.IOException: disk full at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:567) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63) at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:281) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:225) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} Now that's odd, since: * Disk has some 680G left * The sstable it's trying to cleanup is far less than 680G: {noformat} # ls -lh *4345750* -rw-r--r-- 1 cassandra cassandra 64M Mar 21 04:42 production-Data_daily-jb-4345750-CompressionInfo.db -rw-r--r-- 1 cassandra cassandra 219G Mar 21 04:42 production-Data_daily-jb-4345750-Data.db -rw-r--r-- 1 cassandra cassandra 503M Mar 21 04:42 production-Data_daily-jb-4345750-Filter.db 
-rw-r--r-- 1 cassandra cassandra 42G Mar 21 04:42 production-Data_daily-jb-4345750-Index.db -rw-r--r-- 1 cassandra cassandra 5.9K Mar 21 04:42 production-Data_daily-jb-4345750-Statistics.db -rw-r--r-- 1 cassandra cassandra 81M Mar 21 04:42 production-Data_daily-jb-4345750-Summary.db -rw-r--r-- 1 cassandra cassandra 79 Mar 21 04:42 production-Data_daily-jb-4345750-TOC.txt {noformat} Sure, it's large, but it's not 680G. No other compactions are running on that server. I'm getting this on 12 / 56 servers right now. Could it be some bug in the calculation of the expected size of the new sstable, perhaps? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
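The reporter's theory, a bug in the expected-size calculation, can be illustrated. Cleanup performs a pre-flight free-space check against an estimate of the rewritten sstable's size; if that estimate does not account for cleanup discarding keys the node no longer owns (or is otherwise inflated), a 219G input can trip "disk full" despite 680G free. A purely hypothetical sketch with made-up ratios, not the committed 9036 patch:

```java
public class CleanupSpaceCheck {
    // Cleanup rewrites only the keys the node still owns, so a sane
    // estimate scales the input size by the owned-key ratio.
    public static long estimateOutputBytes(long inputBytes, double ownedKeyRatio) {
        return (long) (inputBytes * ownedKeyRatio);
    }

    // The pre-flight check: refuse to start if the estimate exceeds
    // the free space on the target disk.
    public static boolean hasSpaceFor(long estimatedBytes, long freeBytes) {
        return estimatedBytes <= freeBytes;
    }
}
```

With a 219G input and 680G free, a ratio-scaled estimate passes easily, while an inflated estimate of several times the input size fails, matching the reported behavior.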
[jira] [Commented] (CASSANDRA-8584) Add strerror output on failed trySkipCache calls
[ https://issues.apache.org/jira/browse/CASSANDRA-8584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394779#comment-14394779 ] Joshua McKenzie commented on CASSANDRA-8584: Removed NoSpamLogger in deference to CASSANDRA-9029. A quick run against 2.1-HEAD w/this patch gives: {noformat} grep trySkipCache 8584_utest.txt | wc -l 432 {noformat} I'll either wait until 9029's in or track down the source of the failing trySkipCache calls and create another ticket for that. I'd prefer to have a clean slate w/regards to our page-cache prompting before committing this. Add strerror output on failed trySkipCache calls Key: CASSANDRA-8584 URL: https://issues.apache.org/jira/browse/CASSANDRA-8584 Project: Cassandra Issue Type: Improvement Reporter: Joshua McKenzie Assignee: Ariel Weisberg Priority: Trivial Fix For: 2.1.5 Attachments: 8584_v1.txt, NoSpamLogger.java, nospamlogger.txt Since trySkipCache returns an errno directly, rather than returning -1 and setting errno like our other CLibrary calls, it's thread-safe and we could print out more helpful information if we failed to prompt the kernel to skip the page cache. That system call should always succeed unless we have an invalid fd, as it's free to ignore us. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
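Because posix_fadvise (which backs trySkipCache) returns the errno value directly instead of setting the thread-global errno, attaching a human-readable message is a simple lookup on the returned code. A minimal sketch with a hand-rolled table of a few errno values; production code would call the C library's strerror(3) through JNA rather than maintain its own table:

```java
import java.util.Map;

public class ErrnoMessages {
    // Tiny illustrative subset of Linux errno values. EBADF is the case
    // the ticket expects (an invalid fd); anything else from fadvise
    // deserves a descriptive log line rather than a bare number.
    private static final Map<Integer, String> MESSAGES = Map.of(
        0, "Success",
        9, "Bad file descriptor",   // EBADF
        22, "Invalid argument");    // EINVAL

    public static String describe(int errno) {
        return MESSAGES.getOrDefault(errno, "Unknown error " + errno);
    }
}
```

A failed call could then log, e.g., "trySkipCache failed: Bad file descriptor" instead of an opaque return code.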
[1/3] cassandra git commit: Don't include tmp files in offline relevel
Repository: cassandra Updated Branches: refs/heads/trunk 7d68cedec - 2341e945b Don't include tmp files in offline relevel Patch by marcuse; reviewed by carlyeks for CASSANDRA-9088 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67038a32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67038a32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67038a32 Branch: refs/heads/trunk Commit: 67038a32e118c6a8a0a9de50c8c099b85ccd7b07 Parents: 2e6492a Author: Marcus Eriksson marc...@apache.org Authored: Wed Apr 1 17:28:50 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Apr 3 20:53:23 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/tools/SSTableOfflineRelevel.java | 16 +--- 2 files changed, 14 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/67038a32/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index c569de5..28f79f9 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.15: + * Don't include tmp files when doing offline relevel (CASSANDRA-9088) * Use the proper CAS WriteType when finishing a previous round during Paxos preparation (CASSANDRA-8672) * Avoid race in cancelling compactions (CASSANDRA-9070) http://git-wip-us.apache.org/repos/asf/cassandra/blob/67038a32/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java -- diff --git a/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java b/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java index 3fb2f7a..6293faa 100644 --- a/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java +++ b/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java @@ -28,6 +28,8 @@ import java.util.List; import java.util.Map; import java.util.Set; +import com.google.common.base.Throwables; + import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.Schema; import org.apache.cassandra.db.DecoratedKey; @@ 
-93,12 +95,20 @@ public class SSTableOfflineRelevel
         Keyspace.openWithoutSSTables(keyspace);
         Directories directories = Directories.create(keyspace, columnfamily);
         Set<SSTableReader> sstables = new HashSet<>();
-        for (Map.Entry<Descriptor, Set<Component>> sstable : directories.sstableLister().list().entrySet())
+        for (Map.Entry<Descriptor, Set<Component>> sstable : directories.sstableLister().skipTemporary(true).list().entrySet())
         {
             if (sstable.getKey() != null)
             {
-                SSTableReader reader = SSTableReader.open(sstable.getKey());
-                sstables.add(reader);
+                try
+                {
+                    SSTableReader reader = SSTableReader.open(sstable.getKey());
+                    sstables.add(reader);
+                }
+                catch (Throwable t)
+                {
+                    out.println("Couldn't open sstable: " + sstable.getKey().filenameFor(Component.DATA));
+                    Throwables.propagate(t);
+                }
             }
         }
         if (sstables.isEmpty())
[2/3] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e4072cf0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e4072cf0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e4072cf0 Branch: refs/heads/trunk Commit: e4072cf09f92385eac1a64525e8ec8b2624d94cd Parents: 9449a70 67038a3 Author: Marcus Eriksson marc...@apache.org Authored: Fri Apr 3 20:58:15 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Apr 3 20:58:15 2015 +0200 -- CHANGES.txt | 1 + .../apache/cassandra/tools/SSTableOfflineRelevel.java | 14 -- 2 files changed, 13 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e4072cf0/CHANGES.txt -- diff --cc CHANGES.txt index b1499c1,28f79f9..5cd914a --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,73 -1,5 +1,74 @@@ -2.0.15: +2.1.5 + * cqlsh: Make CompositeType data readable (CASSANDRA-8919) + * cqlsh: Fix display of triggers (CASSANDRA-9081) + * Fix NullPointerException when deleting or setting an element by index on + a null list collection (CASSANDRA-9077) + * Buffer bloom filter serialization (CASSANDRA-9066) + * Fix anti-compaction target bloom filter size (CASSANDRA-9060) + * Make FROZEN and TUPLE unreserved keywords in CQL (CASSANDRA-9047) + * Prevent AssertionError from SizeEstimatesRecorder (CASSANDRA-9034) + * Avoid overwriting index summaries for sstables with an older format that + does not support downsampling; rebuild summaries on startup when this + is detected (CASSANDRA-8993) + * Fix potential data loss in CompressedSequentialWriter (CASSANDRA-8949) + * Make PasswordAuthenticator number of hashing rounds configurable (CASSANDRA-8085) + * Fix AssertionError when binding nested collections in DELETE (CASSANDRA-8900) + * Check for overlap with non-early sstables in LCS 
(CASSANDRA-8739) + * Only calculate max purgable timestamp if we have to (CASSANDRA-8914) + * (cqlsh) Greatly improve performance of COPY FROM (CASSANDRA-8225) + * IndexSummary effectiveIndexInterval is now a guideline, not a rule (CASSANDRA-8993) + * Use correct bounds for page cache eviction of compressed files (CASSANDRA-8746) + * SSTableScanner enforces its bounds (CASSANDRA-8946) + * Cleanup cell equality (CASSANDRA-8947) + * Introduce intra-cluster message coalescing (CASSANDRA-8692) + * DatabaseDescriptor throws NPE when rpc_interface is used (CASSANDRA-8839) + * Don't check if an sstable is live for offline compactions (CASSANDRA-8841) + * Don't set clientMode in SSTableLoader (CASSANDRA-8238) + * Fix SSTableRewriter with disabled early open (CASSANDRA-8535) + * Allow invalidating permissions and cache time (CASSANDRA-8722) + * Log warning when queries that will require ALLOW FILTERING in Cassandra 3.0 + are executed (CASSANDRA-8418) + * Fix cassandra-stress so it respects the CL passed in user mode (CASSANDRA-8948) + * Fix rare NPE in ColumnDefinition#hasIndexOption() (CASSANDRA-8786) + * cassandra-stress reports per-operation statistics, plus misc (CASSANDRA-8769) + * Add SimpleDate (cql date) and Time (cql time) types (CASSANDRA-7523) + * Use long for key count in cfstats (CASSANDRA-8913) + * Make SSTableRewriter.abort() more robust to failure (CASSANDRA-8832) + * Remove cold_reads_to_omit from STCS (CASSANDRA-8860) + * Make EstimatedHistogram#percentile() use ceil instead of floor (CASSANDRA-8883) + * Fix top partitions reporting wrong cardinality (CASSANDRA-8834) + * Fix rare NPE in KeyCacheSerializer (CASSANDRA-8067) + * Pick sstables for validation as late as possible inc repairs (CASSANDRA-8366) + * Fix commitlog getPendingTasks to not increment (CASSANDRA-8856) + * Fix parallelism adjustment in range and secondary index queries + when the first fetch does not satisfy the limit (CASSANDRA-8856) + * Check if the filtered sstables is non-empty in STCS 
(CASSANDRA-8843) + * Upgrade java-driver used for cassandra-stress (CASSANDRA-8842) + * Fix CommitLog.forceRecycleAllSegments() memory access error (CASSANDRA-8812) + * Improve assertions in Memory (CASSANDRA-8792) + * Fix SSTableRewriter cleanup (CASSANDRA-8802) + * Introduce SafeMemory for CompressionMetadata.Writer (CASSANDRA-8758) + * 'nodetool info' prints exception against older node (CASSANDRA-8796) + * Ensure SSTableReader.last corresponds exactly with the file end (CASSANDRA-8750) + * Make SSTableWriter.openEarly more robust and obvious (CASSANDRA-8747) + * Enforce SSTableReader.first/last (CASSANDRA-8744) + * Cleanup
[2/2] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e4072cf0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e4072cf0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e4072cf0 Branch: refs/heads/cassandra-2.1 Commit: e4072cf09f92385eac1a64525e8ec8b2624d94cd Parents: 9449a70 67038a3 Author: Marcus Eriksson marc...@apache.org Authored: Fri Apr 3 20:58:15 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Apr 3 20:58:15 2015 +0200 -- CHANGES.txt | 1 + .../apache/cassandra/tools/SSTableOfflineRelevel.java | 14 -- 2 files changed, 13 insertions(+), 2 deletions(-) --
(CASSANDRA-8843) + * Upgrade java-driver used for cassandra-stress (CASSANDRA-8842) + * Fix CommitLog.forceRecycleAllSegments() memory access error (CASSANDRA-8812) + * Improve assertions in Memory (CASSANDRA-8792) + * Fix SSTableRewriter cleanup (CASSANDRA-8802) + * Introduce SafeMemory for CompressionMetadata.Writer (CASSANDRA-8758) + * 'nodetool info' prints exception against older node (CASSANDRA-8796) + * Ensure SSTableReader.last corresponds exactly with the file end (CASSANDRA-8750) + * Make SSTableWriter.openEarly more robust and obvious (CASSANDRA-8747) + * Enforce SSTableReader.first/last (CASSANDRA-8744) + * Cleanup
cassandra git commit: Don't include tmp files in offline relevel
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 2e6492a18 - 67038a32e Don't include tmp files in offline relevel Patch by marcuse; reviewed by carlyeks for CASSANDRA-9088 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67038a32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67038a32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67038a32 Branch: refs/heads/cassandra-2.0 Commit: 67038a32e118c6a8a0a9de50c8c099b85ccd7b07 Parents: 2e6492a Author: Marcus Eriksson marc...@apache.org Authored: Wed Apr 1 17:28:50 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Apr 3 20:53:23 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/tools/SSTableOfflineRelevel.java | 16 +--- 2 files changed, 14 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/67038a32/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index c569de5..28f79f9 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.15: + * Don't include tmp files when doing offline relevel (CASSANDRA-9088) * Use the proper CAS WriteType when finishing a previous round during Paxos preparation (CASSANDRA-8672) * Avoid race in cancelling compactions (CASSANDRA-9070) http://git-wip-us.apache.org/repos/asf/cassandra/blob/67038a32/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java -- diff --git a/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java b/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java index 3fb2f7a..6293faa 100644 --- a/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java +++ b/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java @@ -28,6 +28,8 @@ import java.util.List; import java.util.Map; import java.util.Set; +import com.google.common.base.Throwables; + import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.Schema; import 
org.apache.cassandra.db.DecoratedKey; @@ -93,12 +95,20 @@ public class SSTableOfflineRelevel Keyspace.openWithoutSSTables(keyspace); Directories directories = Directories.create(keyspace, columnfamily); Set<SSTableReader> sstables = new HashSet<>(); -for (Map.Entry<Descriptor, Set<Component>> sstable : directories.sstableLister().list().entrySet()) +for (Map.Entry<Descriptor, Set<Component>> sstable : directories.sstableLister().skipTemporary(true).list().entrySet()) { if (sstable.getKey() != null) { -SSTableReader reader = SSTableReader.open(sstable.getKey()); -sstables.add(reader); +try +{ +SSTableReader reader = SSTableReader.open(sstable.getKey()); +sstables.add(reader); +} +catch (Throwable t) +{ +out.println("Couldn't open sstable: " + sstable.getKey().filenameFor(Component.DATA)); +Throwables.propagate(t); +} } } if (sstables.isEmpty())
[1/2] cassandra git commit: Don't include tmp files in offline relevel
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 9449a7016 - e4072cf09 Don't include tmp files in offline relevel Patch by marcuse; reviewed by carlyeks for CASSANDRA-9088 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67038a32 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67038a32 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67038a32 Branch: refs/heads/cassandra-2.1 Commit: 67038a32e118c6a8a0a9de50c8c099b85ccd7b07 Parents: 2e6492a Author: Marcus Eriksson marc...@apache.org Authored: Wed Apr 1 17:28:50 2015 +0200 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Apr 3 20:53:23 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/tools/SSTableOfflineRelevel.java | 16 +--- 2 files changed, 14 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/67038a32/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index c569de5..28f79f9 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.15: + * Don't include tmp files when doing offline relevel (CASSANDRA-9088) * Use the proper CAS WriteType when finishing a previous round during Paxos preparation (CASSANDRA-8672) * Avoid race in cancelling compactions (CASSANDRA-9070) http://git-wip-us.apache.org/repos/asf/cassandra/blob/67038a32/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java -- diff --git a/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java b/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java index 3fb2f7a..6293faa 100644 --- a/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java +++ b/src/java/org/apache/cassandra/tools/SSTableOfflineRelevel.java @@ -28,6 +28,8 @@ import java.util.List; import java.util.Map; import java.util.Set; +import com.google.common.base.Throwables; + import org.apache.cassandra.config.DatabaseDescriptor; import org.apache.cassandra.config.Schema; import 
org.apache.cassandra.db.DecoratedKey; @@ -93,12 +95,20 @@ public class SSTableOfflineRelevel Keyspace.openWithoutSSTables(keyspace); Directories directories = Directories.create(keyspace, columnfamily); Set<SSTableReader> sstables = new HashSet<>(); -for (Map.Entry<Descriptor, Set<Component>> sstable : directories.sstableLister().list().entrySet()) +for (Map.Entry<Descriptor, Set<Component>> sstable : directories.sstableLister().skipTemporary(true).list().entrySet()) { if (sstable.getKey() != null) { -SSTableReader reader = SSTableReader.open(sstable.getKey()); -sstables.add(reader); +try +{ +SSTableReader reader = SSTableReader.open(sstable.getKey()); +sstables.add(reader); +} +catch (Throwable t) +{ +out.println("Couldn't open sstable: " + sstable.getKey().filenameFor(Component.DATA)); +Throwables.propagate(t); +} } } if (sstables.isEmpty())
[jira] [Commented] (CASSANDRA-9114) cqlsh: Formatting of map contents broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394929#comment-14394929 ] Philip Thompson commented on CASSANDRA-9114: +1 cqlsh: Formatting of map contents broken Key: CASSANDRA-9114 URL: https://issues.apache.org/jira/browse/CASSANDRA-9114 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Tyler Hobbs Assignee: Tyler Hobbs Labels: cqlsh Fix For: 2.1.5 Attachments: 9114-2.1.txt In CASSANDRA-9081, we upgraded the bundled python driver to version 2.5.0. This upgrade changed the class that's used for map collections, and we failed to add a new formatting adaptor for the new class. This was causing the {{cqlsh_tests.TestCqlsh.test_eat_glass}} dtest to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8905) IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12
[ https://issues.apache.org/jira/browse/CASSANDRA-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394649#comment-14394649 ] Marcus Eriksson commented on CASSANDRA-8905: [~philipthompson] no, then I assume it is a corrupt sstable IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12 --- Key: CASSANDRA-8905 URL: https://issues.apache.org/jira/browse/CASSANDRA-8905 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Fix For: 2.0.15 After upgrade from 1.2.18 to 2.0.12, I've started to get exceptions like: {noformat} ERROR [CompactionExecutor:1149] 2015-03-04 11:48:46,045 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1149,1,main] java.lang.IllegalArgumentException: Illegal Capacity: -2147483648 at java.util.ArrayList.<init>(ArrayList.java:142) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:182) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:194) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:138) at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186) at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98) at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85) at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55) at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} I've identified which sstable is causing this, it's an -ic- format sstable, i.e. something written before the upgrade. I can repeat with forceUserDefinedCompaction. Running upgradesstables also causes the same exception. Scrub helps, but skips a row as incorrect. I can share the sstable privately if it helps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8085: --- Fix Version/s: 2.0.15 Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 2.0.15, 2.1.5 Attachments: 8085-2.0.txt, 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8740) java.lang.AssertionError when reading saved cache
[ https://issues.apache.org/jira/browse/CASSANDRA-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8740: --- Fix Version/s: 2.0.15 java.lang.AssertionError when reading saved cache - Key: CASSANDRA-8740 URL: https://issues.apache.org/jira/browse/CASSANDRA-8740 Project: Cassandra Issue Type: Bug Components: Core Environment: OEL 6.5, DSE 4.6.0, Cassandra 2.0.11.83 Reporter: Nikolai Grigoriev Assignee: Dave Brosius Fix For: 2.0.15, 2.1.5 Attachments: 8740.txt I have started seeing it recently. Not sure from which version but now it happens relatively often on some of my nodes. {code} INFO [main] 2015-02-04 18:18:09,253 ColumnFamilyStore.java (line 249) Initializing duo_xxx INFO [main] 2015-02-04 18:18:09,254 AutoSavingCache.java (line 114) reading saved cache /var/lib/cassandra/saved_caches/duo_xxx-RowCache-b.db ERROR [main] 2015-02-04 18:18:09,256 CassandraDaemon.java (line 513) Exception encountered during startup java.lang.AssertionError at org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:41) at org.apache.cassandra.cache.SerializingCacheProvider$RowCacheSerializer.serialize(SerializingCacheProvider.java:37) at org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:118) at org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:177) at org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:44) at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:130) at org.apache.cassandra.db.ColumnFamilyStore.initRowCache(ColumnFamilyStore.java:592) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:119) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:92) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:305) at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:419) at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496) at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:659) INFO [Thread-2] 2015-02-04 18:18:09,259 DseDaemon.java (line 505) DSE shutting down... ERROR [Thread-2] 2015-02-04 18:18:09,279 CassandraDaemon.java (line 199) Exception in thread Thread[Thread-2,5,main] java.lang.AssertionError at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1274) at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:171) at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:506) at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:408) INFO [main] 2015-02-04 18:18:49,144 CassandraDaemon.java (line 135) Logging initialized INFO [main] 2015-02-04 18:18:49,169 DseDaemon.java (line 382) DSE version: 4.6.0 {code} Cassandra version: 2.0.11.83 (DSE 4.6.0) Looks like similar issues were reported and fixed in the past - like CASSANDRA-6325. Maybe I am missing something, but I think that Cassandra should not crash and stop at startup if it cannot read a saved cache. This does not make the node inoperable and does not necessarily indicate a severe data corruption. I have applied a small change to my cluster config, restarted it and 30% of my nodes did not start because of that. Of course the solution is simple, but it requires to go to every node that failed to start, wipe the cache and start. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8360) In DTCS, always compact SSTables in the same time window, even if they are fewer than min_threshold
[ https://issues.apache.org/jira/browse/CASSANDRA-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8360: --- Fix Version/s: 2.0.15 In DTCS, always compact SSTables in the same time window, even if they are fewer than min_threshold --- Key: CASSANDRA-8360 URL: https://issues.apache.org/jira/browse/CASSANDRA-8360 Project: Cassandra Issue Type: Improvement Reporter: Björn Hegerfors Assignee: Björn Hegerfors Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: cassandra-2.0-CASSANDRA-8360.txt DTCS uses min_threshold to decide how many time windows of the same size that need to accumulate before merging into a larger window. The age of an SSTable is determined as its min timestamp, and it always falls into exactly one of the time windows. If multiple SSTables fall into the same window, DTCS considers compacting them, but if they are fewer than min_threshold, it decides not to do it. When do more than 1 but fewer than min_threshold SSTables end up in the same time window (except for the current window), you might ask? In the current state, DTCS can spill some extra SSTables into bigger windows when the previous window wasn't fully compacted, which happens all the time when the latest window stops being the current one. Also, repairs and hints can put new SSTables in old windows. I think, and [~jjordan] agreed in a comment on CASSANDRA-6602, that DTCS should ignore min_threshold and compact tables in the same windows regardless of how few they are. I guess max_threshold should still be respected. [~jjordan] suggested that this should apply to all windows but the current window, where all the new SSTables end up. That could make sense. I'm not clear on whether compacting many SSTables at once is more cost efficient or not, when it comes to the very newest and smallest SSTables. Maybe compacting as soon as 2 SSTables are seen is fine if the initial window size is small enough? 
I guess the opposite could be the case too; that the very newest SSTables should be compacted very many at a time? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
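The rule proposed above can be sketched in a few lines. This is a hypothetical simplification (the class name, method, and bucket representation are invented for illustration; it is not the actual DateTieredCompactionStrategy code): bucket sstables by the time window their min timestamp falls into, and for every window except the current one, compact as soon as the bucket holds at least two tables, ignoring min_threshold but still respecting max_threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class DtcsSketch {
    // Each key is a window start; each value is the sstables (here just
    // names) whose min timestamp falls in that window. Every window except
    // the current one becomes a candidate once it holds >= 2 sstables,
    // capped at maxThreshold tables per compaction.
    static List<List<String>> chooseCandidates(SortedMap<Long, List<String>> windows,
                                               long currentWindow, int maxThreshold) {
        List<List<String>> candidates = new ArrayList<>();
        for (Map.Entry<Long, List<String>> e : windows.entrySet()) {
            if (e.getKey() == currentWindow)
                continue; // current window keeps the old min_threshold behaviour
            List<String> bucket = e.getValue();
            if (bucket.size() >= 2) // proposed: 2, not min_threshold
                candidates.add(bucket.subList(0, Math.min(bucket.size(), maxThreshold)));
        }
        return candidates;
    }

    public static void main(String[] args) {
        SortedMap<Long, List<String>> windows = new TreeMap<>();
        windows.put(0L, Arrays.asList("a", "b"));        // old window, 2 tables
        windows.put(100L, Arrays.asList("c"));           // old window, 1 table
        windows.put(200L, Arrays.asList("d", "e", "f")); // current window
        // only the two-table old window is selected
        System.out.println(chooseCandidates(windows, 200L, 32));
    }
}
```

With the pre-ticket behaviour a min_threshold of 4 would have left both old windows untouched; under this rule the two-table window is merged immediately.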
[jira] [Commented] (CASSANDRA-7032) Improve vnode allocation
[ https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394730#comment-14394730 ] Benedict commented on CASSANDRA-7032: - One other thing to consider optimising for: hash bit distribution. A lot of algorithmic optimisations can be made if we can assume each node has an approximately uniform distribution of hash bits. We should introduce some scoring based on this. e.g. try the best N candidates by our current evaluation, and select the one that delivers the best resulting hash bit distribution. Improve vnode allocation Key: CASSANDRA-7032 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Branimir Lambov Labels: performance, vnodes Fix For: 3.0 Attachments: TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java It's been known for a little while that random vnode allocation causes hotspots of ownership. It should be possible to improve dramatically on this with deterministic allocation. I have quickly thrown together a simple greedy algorithm that allocates vnodes efficiently, and will repair hotspots in a randomly allocated cluster gradually as more nodes are added, and also ensures that token ranges are fairly evenly spread between nodes (somewhat tunably so). The allocation still permits slight discrepancies in ownership, but it is bound by the inverse of the size of the cluster (as opposed to random allocation, which strangely gets worse as the cluster size increases). I'm sure there is a decent dynamic programming solution to this that would be even better. 
If on joining the ring a new node were to CAS a shared table where a canonical allocation of token ranges lives after running this (or a similar) algorithm, we could then get guaranteed bounds on the ownership distribution in a cluster. This will also help for CASSANDRA-6696. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
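The greedy idea can be illustrated with a toy allocator. All names and the scoring function here are invented for illustration (this is not the attached TestVNodeAllocation.java): for each candidate token, compute the ownership share each node would end up with, and keep the candidate that minimizes the largest share.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class GreedyTokenSketch {
    // Fraction of a ring of size ringSize owned by each node; a token owns
    // the range from the previous token (wrapping around) up to itself.
    static Map<String, Double> ownership(SortedMap<Long, String> ring, long ringSize) {
        Map<String, Double> own = new HashMap<>();
        long prev = ring.lastKey() - ringSize; // wrap-around predecessor
        for (Map.Entry<Long, String> e : ring.entrySet()) {
            own.merge(e.getValue(), (e.getKey() - prev) / (double) ringSize, Double::sum);
            prev = e.getKey();
        }
        return own;
    }

    // Greedily pick the candidate token that minimizes the worst ownership share.
    static long bestToken(SortedMap<Long, String> ring, String newNode,
                          List<Long> candidates, long ringSize) {
        long best = candidates.get(0);
        double bestScore = Double.MAX_VALUE;
        for (long cand : candidates) {
            SortedMap<Long, String> trial = new TreeMap<>(ring);
            trial.put(cand, newNode);
            double worst = Collections.max(ownership(trial, ringSize).values());
            if (worst < bestScore) { bestScore = worst; best = cand; }
        }
        return best;
    }

    public static void main(String[] args) {
        SortedMap<Long, String> ring = new TreeMap<>();
        ring.put(0L, "A"); // A currently owns the whole ring of size 400
        // the candidate that splits A's range evenly wins: prints 200
        System.out.println(bestToken(ring, "B", Arrays.asList(100L, 200L, 300L), 400L));
    }
}
```

Benedict's hash-bit-distribution suggestion would slot in as a tiebreaker: take the best N candidates under this score and rank them by a second score.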
[jira] [Comment Edited] (CASSANDRA-7032) Improve vnode allocation
[ https://issues.apache.org/jira/browse/CASSANDRA-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394730#comment-14394730 ] Benedict edited comment on CASSANDRA-7032 at 4/3/15 6:22 PM: - One other thing to consider optimising for: hash bit distribution. A lot of algorithmic optimisations can be made if we can assume each node has an approximately uniform distribution of hash bits. We should introduce some scoring based on this. e.g. try the best N candidates by our current evaluation, and select the one that delivers the best resulting hash bit distribution. I've filed CASSANDRA-9115 as a follow up ticket to investigate this. We can file a ticket for the anti-clumping if we decide this is too onerous to introduce now (although I think it just requires a little bit of thought to ensure the filtering is helpful and safe, since implementing the filtration should be relatively easy) was (Author: benedict): One other thing to consider optimising for: hash bit distribution. A lot of algorithmic optimisations can be made if we can assume each node has an approximately uniform distribution of hash bits. We should introduce some scoring based on this. e.g. try the best N candidates by our current evaluation, and select the one that delivers the best resulting hash bit distribution. Improve vnode allocation Key: CASSANDRA-7032 URL: https://issues.apache.org/jira/browse/CASSANDRA-7032 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Branimir Lambov Labels: performance, vnodes Fix For: 3.0 Attachments: TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java, TestVNodeAllocation.java It's been known for a little while that random vnode allocation causes hotspots of ownership. It should be possible to improve dramatically on this with deterministic allocation. 
I have quickly thrown together a simple greedy algorithm that allocates vnodes efficiently, and will repair hotspots in a randomly allocated cluster gradually as more nodes are added, and also ensures that token ranges are fairly evenly spread between nodes (somewhat tunably so). The allocation still permits slight discrepancies in ownership, but it is bound by the inverse of the size of the cluster (as opposed to random allocation, which strangely gets worse as the cluster size increases). I'm sure there is a decent dynamic programming solution to this that would be even better. If on joining the ring a new node were to CAS a shared table where a canonical allocation of token ranges lives after running this (or a similar) algorithm, we could then get guaranteed bounds on the ownership distribution in a cluster. This will also help for CASSANDRA-6696. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394673#comment-14394673 ] Benedict commented on CASSANDRA-7066: - Even better. It hadn't occurred to me the current code was all due to the lack of idempotency; I assumed there was just concern about leaving a large amount of data around. There _is_ still the risk that this could be a prohibitive danger on some systems (say, you have a multi-Tb file that's just been compacted). So to offer one further alternative that is perhaps only slightly more complicated and retains the safety: * create two logs files: A and B; both log _each other_; file A also logs the new file(s) as they're created; file B also logs the old file(s) * once done delete file A; then delete the old files; then delete file B * if we find file A we delete its contents (including file B); If we find file B only, we delete its contents Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: compaction Fix For: 3.0 Currently we manage a list of in-progress compactions in a system table, which we use to cleanup incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily cleanup completed files, or conversely not cleanup files that have been superceded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. 
This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
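Benedict's two-log alternative from the comment above can be sketched with plain files. File names, the one-path-per-line log format, and the helper methods are invented for illustration; this is not a proposed Cassandra implementation:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class TwoLogCleanup {
    // Log A lists the new files and log B; log B lists the old files and
    // log A. Commit sequence once the new files are durable: deleting log A
    // is the point of no return, then the old files go, then log B.
    static void commit(Path logA, Path logB, List<Path> oldFiles) throws IOException {
        Files.delete(logA);
        for (Path p : oldFiles)
            Files.deleteIfExists(p);
        Files.delete(logB);
    }

    // Startup recovery. If A still exists the compaction never committed:
    // roll back by deleting everything A lists (the new files and B itself).
    // If only B exists the commit was interrupted mid-delete: roll forward
    // by deleting everything B lists (the old files and the already-gone A).
    static void recover(Path logA, Path logB) throws IOException {
        if (Files.exists(logA)) {
            for (String line : Files.readAllLines(logA))
                Files.deleteIfExists(Paths.get(line));
            Files.delete(logA);
        } else if (Files.exists(logB)) {
            for (String line : Files.readAllLines(logB))
                Files.deleteIfExists(Paths.get(line));
            Files.delete(logB);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("twolog");
        Path newFile = Files.createFile(dir.resolve("new.db"));
        Path oldFile = Files.createFile(dir.resolve("old.db"));
        Path logA = dir.resolve("A.log");
        Path logB = dir.resolve("B.log");
        Files.write(logB, Arrays.asList(oldFile.toString(), logA.toString()));
        Files.write(logA, Arrays.asList(newFile.toString(), logB.toString()));
        // simulate a crash before commit(): recovery rolls the new file back
        recover(logA, logB);
        System.out.println("new kept? " + Files.exists(newFile)
                + ", old kept? " + Files.exists(oldFile));
    }
}
```

Because each log names the other, at most one survives a crash, so recovery never has to guess which direction to resolve in.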
[jira] [Updated] (CASSANDRA-9032) Reduce logging level for MigrationTask abort due to down node from ERROR to INFO
[ https://issues.apache.org/jira/browse/CASSANDRA-9032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9032: --- Fix Version/s: 2.0.15 Reduce logging level for MigrationTask abort due to down node from ERROR to INFO Key: CASSANDRA-9032 URL: https://issues.apache.org/jira/browse/CASSANDRA-9032 Project: Cassandra Issue Type: Improvement Reporter: Tyler Hobbs Assignee: Tyler Hobbs Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: 9032.txt A lot of the dtests are failing during Jenkins runs due to the following error message in the logs: {noformat} ERROR [MigrationStage:1] 2015-03-24 20:02:03,464 MigrationTask.java:62 - Can't send migration request: node /127.0.0.3 is down.\n] {noformat} This log message happens when a schema pull is scheduled, but the target endpoint is down when the scheduled task actually runs. The failing dtests generally stop a node as part of the test, which results in this. I believe the log message should be moved from ERROR to INFO (or perhaps even DEBUG). This isn't an unexpected type of problem (nodes go down all the time), and it's not actionable by the user. This would also have the nice side effect of fixing the dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8516) NEW_NODE topology event emitted instead of MOVED_NODE by moving node
[ https://issues.apache.org/jira/browse/CASSANDRA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8516: --- Fix Version/s: 2.0.15 NEW_NODE topology event emitted instead of MOVED_NODE by moving node Key: CASSANDRA-8516 URL: https://issues.apache.org/jira/browse/CASSANDRA-8516 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Stefania Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: 8516-v2.1-a.txt, 8516-v2.1-b.txt, cassandra_8516_dtest.txt As discovered in CASSANDRA-8373, when you move a node in a single-node cluster, a {{NEW_NODE}} event is generated instead of a {{MOVED_NODE}} event. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8905) IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12
[ https://issues.apache.org/jira/browse/CASSANDRA-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson resolved CASSANDRA-8905. Resolution: Not a Problem Fixed by scrub. Probably corrupted sstable. IllegalArgumentException in compaction of -ic- file after upgrade to 2.0.12 --- Key: CASSANDRA-8905 URL: https://issues.apache.org/jira/browse/CASSANDRA-8905 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Fix For: 2.0.15 After upgrade from 1.2.18 to 2.0.12, I've started to get exceptions like: {noformat} ERROR [CompactionExecutor:1149] 2015-03-04 11:48:46,045 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:1149,1,main] java.lang.IllegalArgumentException: Illegal Capacity: -2147483648 at java.util.ArrayList.<init>(ArrayList.java:142) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:182) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:194) at org.apache.cassandra.db.SuperColumns$SCIterator.next(SuperColumns.java:138) at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186) at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98) at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85) at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74) at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55) at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:161) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60) at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} I've identified which sstable is causing this, it's an -ic- format sstable, i.e. something written before the upgrade. I can repeat with forceUserDefinedCompaction. Running upgradesstables also causes the same exception. Scrub helps, but skips a row as incorrect. I can share the sstable privately if it helps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8808) CQLSSTableWriter: close does not work + more than one table throws ex
[ https://issues.apache.org/jira/browse/CASSANDRA-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8808: --- Fix Version/s: 2.0.15 CQLSSTableWriter: close does not work + more than one table throws ex - Key: CASSANDRA-8808 URL: https://issues.apache.org/jira/browse/CASSANDRA-8808 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sebastian YEPES FERNANDEZ Assignee: Benjamin Lerer Labels: cql Fix For: 2.0.15, 2.1.5 Attachments: CASSANDRA-8808-2.0-V2.txt, CASSANDRA-8808-2.0.txt, CASSANDRA-8808-2.1-V2.txt, CASSANDRA-8808-2.1.txt, CASSANDRA-8808-trunk-V2.txt, CASSANDRA-8808-trunk.txt I have encountered the following two issues: - When closing the CQLSSTableWriter, it just hangs the process and does nothing. (https://issues.apache.org/jira/browse/CASSANDRA-8281) - When writing to more than one table, an exception is thrown. (https://issues.apache.org/jira/browse/CASSANDRA-8251) These issues can be reproduced with the following code:
{code:title=test.java|borderStyle=solid}
import org.apache.cassandra.config.Config;
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public static void main(String[] args) {
    Config.setClientMode(true);

    CQLSSTableWriter w1 = CQLSSTableWriter.builder()
        .inDirectory("/tmp/kspc/t1")
        .forTable("CREATE TABLE kspc.t1 ( id int, PRIMARY KEY (id));")
        .using("INSERT INTO kspc.t1 (id) VALUES ( ? );")
        .build();

    CQLSSTableWriter w2 = CQLSSTableWriter.builder()
        .inDirectory("/tmp/kspc/t2")
        .forTable("CREATE TABLE kspc.t2 ( id int, PRIMARY KEY (id));")
        .using("INSERT INTO kspc.t2 (id) VALUES ( ? );")
        .build();

    try {
        w1.addRow(1);
        w2.addRow(1);
        w1.close();
        w2.close();
    } catch (Exception e) {
        System.out.println(e);
    }
}
{code}
{code:title=The error|borderStyle=solid}
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:324)
at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:277)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:119)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:96)
at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:101)
at org.apache.cassandra.io.sstable.CQLSSTableWriter.rawAddRow(CQLSSTableWriter.java:226)
at org.apache.cassandra.io.sstable.CQLSSTableWriter.addRow(CQLSSTableWriter.java:145)
at org.apache.cassandra.io.sstable.CQLSSTableWriter.addRow(CQLSSTableWriter.java:120)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSite.invoke(PojoMetaMethodSite.java:189)
at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:53)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
at com.allthingsmonitoring.utils.BulkDataLoader.main(BulkDataLoader.groovy:415)
Caused by: java.lang.NullPointerException
at org.apache.cassandra.config.DatabaseDescriptor.getFlushWriters(DatabaseDescriptor.java:1053)
at org.apache.cassandra.db.ColumnFamilyStore.<clinit>(ColumnFamilyStore.java:85)
... 18 more
{code}
I have just tested in the cassandra-2.1 branch and the issue still persists.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8086) Cassandra should have ability to limit the number of native connections
[ https://issues.apache.org/jira/browse/CASSANDRA-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8086: --- Fix Version/s: 2.0.15 Cassandra should have ability to limit the number of native connections --- Key: CASSANDRA-8086 URL: https://issues.apache.org/jira/browse/CASSANDRA-8086 Project: Cassandra Issue Type: Improvement Reporter: Vishy Kasar Assignee: Norman Maurer Fix For: 2.0.15, 2.1.5 Attachments: 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-2.1.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-2.0.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-final-v2.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c-final.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.patch, 0001-CASSANDRA-8086-Allow-to-limit-the-number-of-native-c.txt We have a production cluster with 72 instances spread across 2 DCs. We have a large number ( ~ 40,000 ) of clients hitting this cluster. A client normally connects to 4 Cassandra instances. Some event (we think it is a schema change on the server side) triggered the clients to establish connections to all Cassandra instances in the local DC. This brought the server to its knees. The client connections failed and the clients attempted re-connections. Cassandra should protect itself from such attacks from clients. Do we have any knobs to control the maximum number of connections? If not, we need to add that knob. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
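A limit like this is typically enforced at accept time: count live connections, and refuse the socket once the cap is hit. A minimal sketch of that pattern — the class and method names here are hypothetical and illustrative only, not taken from the CASSANDRA-8086 patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical accept-time connection guard, in the spirit of what an
// "allow to limit the number of native connections" knob needs to do.
public class ConnectionLimiter {
    private final long maxConnections;                      // negative value disables the limit
    private final AtomicLong liveConnections = new AtomicLong();

    public ConnectionLimiter(long maxConnections) {
        this.maxConnections = maxConnections;
    }

    /** Called when a client connects; returns false if the socket should be rejected. */
    public boolean tryAccept() {
        if (maxConnections < 0)
            return true;
        long count = liveConnections.incrementAndGet();
        if (count > maxConnections) {
            liveConnections.decrementAndGet();              // roll back the optimistic increment
            return false;
        }
        return true;
    }

    /** Called when a client disconnects, freeing a slot. */
    public void release() {
        liveConnections.decrementAndGet();
    }
}
```

The optimistic increment-then-rollback keeps the check lock-free, which matters on a hot accept path handling tens of thousands of clients.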
[jira] [Updated] (CASSANDRA-8909) Replication Strategy creation errors are lost in try/catch
[ https://issues.apache.org/jira/browse/CASSANDRA-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8909: --- Fix Version/s: 2.0.15 Replication Strategy creation errors are lost in try/catch -- Key: CASSANDRA-8909 URL: https://issues.apache.org/jira/browse/CASSANDRA-8909 Project: Cassandra Issue Type: Improvement Reporter: Alan Boudreault Assignee: Alan Boudreault Priority: Trivial Fix For: 2.0.15, 2.1.5 Attachments: replication-strategy-exception-2.0.patch I was initially executing a bad cassandra-stress command and was getting this error: {code} Unable to create stress keyspace: Error constructing replication strategy class {code} with the following command: {code} cassandra-stress -o insert --replication-strategy NetworkTopologyStrategy --strategy-properties dc1:1,dc2:1 --replication-factor 1 {code} After digging in the code, I noticed that the error displayed was not the one thrown by the replication strategy code and that the try/catch block could be improved. Basically, Constructor.newInstance can throw an InvocationTargetException, which provides a better error report. I think this improvement can also be done in 2.1 (not tested yet). If my attached patch is acceptable, I will test and provide the right versions for 2.1 and trunk. With the patch, I can see the proper error when executing my bad command: {code} Unable to create stress keyspace: replication_factor is an option for SimpleStrategy, not NetworkTopologyStrategy {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
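The fix idea is small: `Constructor.newInstance` wraps whatever the strategy's constructor threw inside an `InvocationTargetException`, so catching that wrapper specifically and reporting its cause surfaces the real message. A self-contained sketch — `BadStrategy` is a stand-in for a misconfigured replication strategy, not a Cassandra class:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

// Sketch: report the cause carried by InvocationTargetException instead of
// a generic "Error constructing replication strategy class" wrapper message.
public class StrategyLoader {
    // Stand-in for a replication strategy whose constructor rejects its options.
    static class BadStrategy {
        BadStrategy() {
            throw new IllegalArgumentException(
                "replication_factor is an option for SimpleStrategy, not NetworkTopologyStrategy");
        }
    }

    public static String describeFailure(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return null;                          // construction succeeded
        } catch (InvocationTargetException e) {
            return e.getCause().getMessage();     // the real error thrown by the strategy
        } catch (Exception e) {
            return e.getMessage();                // reflective plumbing failure, much less useful
        }
    }
}
```

Catching a broad `Exception` first, as the original stress code effectively did, is what hides the useful message behind the reflective wrapper.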
[jira] [Updated] (CASSANDRA-8613) Regression in mixed single and multi-column relation support
[ https://issues.apache.org/jira/browse/CASSANDRA-8613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8613: --- Fix Version/s: 2.0.15 Regression in mixed single and multi-column relation support Key: CASSANDRA-8613 URL: https://issues.apache.org/jira/browse/CASSANDRA-8613 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Benjamin Lerer Fix For: 2.0.15, 2.1.5 Attachments: 8613-2.0-v2.txt, 8613-2.1-v2.txt, 8613-trunk-v2.txt, CASSANDRA-8613-2.0.txt, CASSANDRA-8613-2.1.txt, CASSANDRA-8613-trunk.txt In 2.0.6 through 2.0.8, a query like the following was supported: {noformat} SELECT * FROM mytable WHERE clustering_0 = ? AND (clustering_1, clustering_2) > (?, ?) {noformat} However, after CASSANDRA-6875, you'll get the following error: {noformat} Clustering columns may not be skipped in multi-column relations. They should appear in the PRIMARY KEY order. Got (c, d) > (0, 0) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
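The rule behind the error message is that the columns in a multi-column relation must be consecutive clustering columns in PRIMARY KEY order, starting immediately after any columns already pinned by single-column relations; the regression was that the check ignored those single-column restrictions. A toy checker illustrating the rule — this is an illustration of the semantics, not Cassandra's implementation:

```java
import java.util.List;

// Toy model of the "clustering columns may not be skipped" rule for
// multi-column relations. Illustrative only.
public class MultiColumnRuleCheck {
    /**
     * clusteringOrder: clustering columns in PRIMARY KEY order.
     * restrictedBefore: how many leading columns are already fixed by
     *   single-column relations (e.g. clustering_0 = ?).
     * tuple: the columns named in the multi-column relation.
     */
    public static boolean ok(List<String> clusteringOrder, int restrictedBefore, List<String> tuple) {
        // The tuple must start at the next unrestricted clustering column
        // and follow PRIMARY KEY order without skipping any column.
        if (restrictedBefore + tuple.size() > clusteringOrder.size())
            return false;
        for (int i = 0; i < tuple.size(); i++)
            if (!clusteringOrder.get(restrictedBefore + i).equals(tuple.get(i)))
                return false;
        return true;
    }
}
```

Under this reading, the reported query is legal: `clustering_0 = ?` pins the first clustering column, so the tuple `(clustering_1, clustering_2)` does begin at the next clustering column and skips nothing.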
cassandra git commit: Don't execute any functions at prepare-time
Repository: cassandra Updated Branches: refs/heads/trunk 75409a185 - 7d68cedec Don't execute any functions at prepare-time Patch by Sam Tunnicliffe; reviewed by Tyler Hobbs for CASSANDRA-9037 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7d68cede Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7d68cede Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7d68cede Branch: refs/heads/trunk Commit: 7d68cedecd537535e48074c38686ccc76f9c Parents: 75409a1 Author: Sam Tunnicliffe s...@beobal.com Authored: Fri Apr 3 12:51:24 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Apr 3 12:51:24 2015 -0500 -- CHANGES.txt | 1 + .../cassandra/cql3/functions/FunctionCall.java | 22 +--- 2 files changed, 2 insertions(+), 21 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7d68cede/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index e8cb20b..92c09b1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0 + * Don't execute any functions at prepare-time (CASSANDRA-9037) * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) * Make it possible to major compact LCS (CASSANDRA-7272) * Make FunctionExecutionException extend RequestExecutionException http://git-wip-us.apache.org/repos/asf/cassandra/blob/7d68cede/src/java/org/apache/cassandra/cql3/functions/FunctionCall.java -- diff --git a/src/java/org/apache/cassandra/cql3/functions/FunctionCall.java b/src/java/org/apache/cassandra/cql3/functions/FunctionCall.java index 90ebaaf..a3bd669 100644 --- a/src/java/org/apache/cassandra/cql3/functions/FunctionCall.java +++ b/src/java/org/apache/cassandra/cql3/functions/FunctionCall.java @@ -140,33 +140,13 @@ public class FunctionCall extends Term.NonTerminal fun, fun.argTypes().size(), terms.size())); ListTerm parameters = new ArrayList(terms.size()); -boolean allTerminal = true; for (int i = 0; i terms.size(); i++) { Term t 
= terms.get(i).prepare(keyspace, Functions.makeArgSpec(receiver.ksName, receiver.cfName, scalarFun, i));
-            if (t instanceof NonTerminal)
-                allTerminal = false;
             parameters.add(t);
         }
-        // If all parameters are terminal and the function is pure, we can
-        // evaluate it now, otherwise we'd have to wait execution time
-        return allTerminal && scalarFun.isPure()
-             ? makeTerminal(scalarFun, execute(scalarFun, parameters), QueryOptions.DEFAULT.getProtocolVersion())
-             : new FunctionCall(scalarFun, parameters);
-    }
-
-    // All parameters must be terminal
-    private static ByteBuffer execute(ScalarFunction fun, List<Term> parameters) throws InvalidRequestException
-    {
-        List<ByteBuffer> buffers = new ArrayList<>(parameters.size());
-        for (Term t : parameters)
-        {
-            assert t instanceof Term.Terminal;
-            buffers.add(((Term.Terminal) t).get(QueryOptions.DEFAULT.getProtocolVersion()));
-        }
-
-        return executeInternal(Server.CURRENT_VERSION, fun, buffers);
+        return new FunctionCall(scalarFun, parameters);
     }

     public AssignmentTestable.TestResult testAssignment(String keyspace, ColumnSpecification receiver)
[jira] [Commented] (CASSANDRA-9092) Nodes in DC2 die during and after huge write workload
[ https://issues.apache.org/jira/browse/CASSANDRA-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394627#comment-14394627 ] Sergey Maznichenko commented on CASSANDRA-9092: --- We have OpsCenter Agent. Such errors repeat 1-2 times per hour during data loading. In DC1 we currently don't have any hints. I guess that traffic can go to all nodes because of client settings; I will check it. I had tried to perform 'nodetool repair' from the node in DC2 and after a 30-hour delay, I got a bunch of errors in the console, like: [2015-04-02 19:32:14,352] Repair session 6ff4f071-d94d-11e4-9257-f7b14a924a15 for range (-3563451573336693456,-3535530477916720868] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/10.XX.XX.11) is dead: session failed but 'nodetool status' reports that all nodes are live, and I can see successful communication between nodes in their logs. It's strange... Nodes in DC2 die during and after huge write workload - Key: CASSANDRA-9092 URL: https://issues.apache.org/jira/browse/CASSANDRA-9092 Project: Cassandra Issue Type: Bug Environment: CentOS 6.2 64-bit, Cassandra 2.1.2, java version 1.7.0_71 Java(TM) SE Runtime Environment (build 1.7.0_71-b14) Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode) Reporter: Sergey Maznichenko Assignee: Sam Tunnicliffe Fix For: 2.1.5 Attachments: cassandra_crash1.txt Hello, We have Cassandra 2.1.2 with 8 nodes, 4 in DC1 and 4 in DC2. Each node is a VM with 8 CPUs and 32 GB RAM. During a significant workload (loading several million blobs of ~3.5 MB each), 1 node in DC2 stops, and after some time the next 2 nodes in DC2 also stop. Now, 2 of the nodes in DC2 do not work and stop 5-10 minutes after start. I see many files in the system.hints table, and the error appears 2-3 minutes after system.hints auto compaction starts.
Stops, means ERROR [CompactionExecutor:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:1,1,main] java.lang.OutOfMemoryError: Java heap space ERROR [HintedHandoff:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[HintedHandoff:1,1,main] java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space Full errors listing attached in cassandra_crash1.txt The problem exists only in DC2. We have 1GbE between DC1 and DC2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Remove transient RAF usage
Repository: cassandra Updated Branches: refs/heads/trunk 51908e240 - 75409a185 Remove transient RAF usage Patch by stefania; reviewed by jmckenzie for CASSANDRA-8952 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/75409a18 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/75409a18 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/75409a18 Branch: refs/heads/trunk Commit: 75409a185d97c566430ab6e6cfd823ceb80ff40b Parents: 51908e2 Author: Stefania Alborghetti stefania.alborghe...@datastax.com Authored: Fri Apr 3 11:37:28 2015 -0500 Committer: Joshua McKenzie jmcken...@apache.org Committed: Fri Apr 3 11:37:28 2015 -0500 -- .../org/apache/cassandra/io/util/FileUtils.java | 27 ++ .../org/apache/cassandra/utils/CLibrary.java| 25 +++-- .../apache/cassandra/io/util/FileUtilsTest.java | 55 .../apache/cassandra/utils/CLibraryTest.java| 37 + 4 files changed, 104 insertions(+), 40 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/75409a18/src/java/org/apache/cassandra/io/util/FileUtils.java -- diff --git a/src/java/org/apache/cassandra/io/util/FileUtils.java b/src/java/org/apache/cassandra/io/util/FileUtils.java index ef9d23b..8007039 100644 --- a/src/java/org/apache/cassandra/io/util/FileUtils.java +++ b/src/java/org/apache/cassandra/io/util/FileUtils.java @@ -19,10 +19,8 @@ package org.apache.cassandra.io.util; import java.io.*; import java.nio.ByteBuffer; -import java.nio.file.AtomicMoveNotSupportedException; -import java.nio.file.Files; -import java.nio.file.Path; -import java.nio.file.StandardCopyOption; +import java.nio.channels.FileChannel; +import java.nio.file.*; import java.text.DecimalFormat; import java.util.Arrays; @@ -185,28 +183,13 @@ public class FileUtils } public static void truncate(String path, long size) { -RandomAccessFile file; - -try -{ -file = new RandomAccessFile(path, rw); -} -catch (FileNotFoundException e) -{ -throw new 
RuntimeException(e); -} - -try +try(FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.READ, StandardOpenOption.WRITE)) { -file.getChannel().truncate(size); +channel.truncate(size); } catch (IOException e) { -throw new FSWriteError(e, path); -} -finally -{ -closeQuietly(file); +throw new RuntimeException(e); } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/75409a18/src/java/org/apache/cassandra/utils/CLibrary.java -- diff --git a/src/java/org/apache/cassandra/utils/CLibrary.java b/src/java/org/apache/cassandra/utils/CLibrary.java index 25f7e5a..fed314b 100644 --- a/src/java/org/apache/cassandra/utils/CLibrary.java +++ b/src/java/org/apache/cassandra/utils/CLibrary.java @@ -18,9 +18,12 @@ package org.apache.cassandra.utils; import java.io.FileDescriptor; +import java.io.IOException; import java.io.RandomAccessFile; import java.lang.reflect.Field; import java.nio.channels.FileChannel; +import java.nio.file.Paths; +import java.nio.file.StandardOpenOption; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -316,29 +319,15 @@ public final class CLibrary public static int getfd(String path) { -RandomAccessFile file = null; -try +try(FileChannel channel = FileChannel.open(Paths.get(path), StandardOpenOption.READ)) { -file = new RandomAccessFile(path, r); -return getfd(file.getFD()); +return getfd(channel); } -catch (Throwable t) +catch (IOException e) { -JVMStabilityInspector.inspectThrowable(t); +JVMStabilityInspector.inspectThrowable(e); // ignore return -1; } -finally -{ -try -{ -if (file != null) -file.close(); -} -catch (Throwable t) -{ -// ignore -} -} } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/75409a18/test/unit/org/apache/cassandra/io/util/FileUtilsTest.java -- diff --git a/test/unit/org/apache/cassandra/io/util/FileUtilsTest.java b/test/unit/org/apache/cassandra/io/util/FileUtilsTest.java new file mode 100644 index 000..7110504 --- /dev/null +++ b/test/unit/org/apache/cassandra/io/util/FileUtilsTest.java
[jira] [Updated] (CASSANDRA-7533) Let MAX_OUTSTANDING_REPLAY_COUNT be configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-7533: --- Fix Version/s: 2.0.15 Let MAX_OUTSTANDING_REPLAY_COUNT be configurable Key: CASSANDRA-7533 URL: https://issues.apache.org/jira/browse/CASSANDRA-7533 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jeremiah Jordan Assignee: Jeremiah Jordan Priority: Minor Fix For: 2.0.15, 2.1.5 Attachments: 0001-CASSANDRA-7533.txt There are some workloads where commit log replay will run into contention issues, with multiple things updating the same partition. Through some testing it was found that lowering MAX_OUTSTANDING_REPLAY_COUNT in CommitLogReplayer.java can help with this issue. The calculations added in CASSANDRA-6655 are one such place where things get bottlenecked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
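`MAX_OUTSTANDING_REPLAY_COUNT` bounds how many replayed mutations may be in flight at once, so lowering it trades parallelism for less contention on hot partitions. A minimal sketch of that throttling pattern using a semaphore — illustrative only, not the `CommitLogReplayer` code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: cap the number of outstanding replayed mutations with a semaphore.
public class BoundedReplay {
    public static int replay(int mutations, int maxOutstanding) {
        Semaphore outstanding = new Semaphore(maxOutstanding);
        AtomicInteger applied = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < mutations; i++) {
                outstanding.acquire();              // block once the cap is reached
                pool.execute(() -> {
                    try {
                        applied.incrementAndGet();  // stand-in for applying the mutation
                    } finally {
                        outstanding.release();      // free a slot for the next mutation
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return applied.get();
    }
}
```

With a smaller `maxOutstanding`, fewer tasks compete for the same partition's locks at any moment, which is exactly the effect the ticket reports.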
[jira] [Created] (CASSANDRA-9114) cqlsh: Formatting of map contents broken
Tyler Hobbs created CASSANDRA-9114: -- Summary: cqlsh: Formatting of map contents broken Key: CASSANDRA-9114 URL: https://issues.apache.org/jira/browse/CASSANDRA-9114 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Tyler Hobbs Assignee: Tyler Hobbs Fix For: 2.1.5 In CASSANDRA-9081, we upgraded the bundled python driver to version 2.5.0. This upgrade changed the class that's used for map collections, and we failed to add a new formatting adaptor for the new class. This was causing the {{cqlsh_tests.TestCqlsh.test_eat_glass}} dtest to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9114) cqlsh: Formatting of map contents broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9114: --- Attachment: 9114-2.1.txt The attached patch adds an adaptor and fixes the failing dtest. cqlsh: Formatting of map contents broken Key: CASSANDRA-9114 URL: https://issues.apache.org/jira/browse/CASSANDRA-9114 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Tyler Hobbs Assignee: Tyler Hobbs Labels: cqlsh Fix For: 2.1.5 Attachments: 9114-2.1.txt In CASSANDRA-9081, we upgraded the bundled python driver to version 2.5.0. This upgrade changed the class that's used for map collections, and we failed to add a new formatting adaptor for the new class. This was causing the {{cqlsh_tests.TestCqlsh.test_eat_glass}} dtest to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9037) Terminal UDFs evaluated at prepare time throw protocol version error
[ https://issues.apache.org/jira/browse/CASSANDRA-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9037: --- Attachment: 9037-final.txt Terminal UDFs evaluated at prepare time throw protocol version error Key: CASSANDRA-9037 URL: https://issues.apache.org/jira/browse/CASSANDRA-9037 Project: Cassandra Issue Type: Bug Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Fix For: 3.0 Attachments: 9037-final.txt When a pure function with only terminal arguments (or with no arguments) is used in a where clause, it's executed at prepare time and {{Server.CURRENT_VERSION}} passed as the protocol version for serialization purposes. For native functions, this isn't a problem, but UDFs use classes in the bundled java-driver-core jar for (de)serialization of args and return values. When {{Server.CURRENT_VERSION}} is greater than the highest version supported by the bundled java driver the execution fails with the following exception: {noformat} ERROR [SharedPool-Worker-1] 2015-03-24 18:10:59,391 QueryMessage.java:132 - Unexpected error during query org.apache.cassandra.exceptions.FunctionExecutionException: execution of 'ks.overloaded[text]' failed: java.lang.IllegalArgumentException: No protocol version matching integer version 4 at org.apache.cassandra.exceptions.FunctionExecutionException.create(FunctionExecutionException.java:35) ~[main/:na] at org.apache.cassandra.cql3.udf.gen.Cksoverloaded_1.execute(Cksoverloaded_1.java) ~[na:na] at org.apache.cassandra.cql3.functions.FunctionCall.executeInternal(FunctionCall.java:78) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall.access$200(FunctionCall.java:34) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall$Raw.execute(FunctionCall.java:176) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall$Raw.prepare(FunctionCall.java:161) ~[main/:na] at org.apache.cassandra.cql3.SingleColumnRelation.toTerm(SingleColumnRelation.java:108) ~[main/:na] 
at org.apache.cassandra.cql3.SingleColumnRelation.newEQRestriction(SingleColumnRelation.java:143) ~[main/:na] at org.apache.cassandra.cql3.Relation.toRestriction(Relation.java:127) ~[main/:na] at org.apache.cassandra.cql3.restrictions.StatementRestrictions.init(StatementRestrictions.java:126) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepareRestrictions(SelectStatement.java:787) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:740) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:488) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:252) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:246) ~[main/:na] at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) ~[main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:475) [main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:371) [main/:na] at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_71] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na] at 
java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: java.lang.IllegalArgumentException: No protocol version matching integer version 4 at com.datastax.driver.core.ProtocolVersion.fromInt(ProtocolVersion.java:89) ~[cassandra-driver-core-2.1.2.jar:na] at org.apache.cassandra.cql3.functions.UDFunction.compose(UDFunction.java:177) ~[main/:na] ... 25 common frames omitted {noformat} This is currently the case on trunk following the bump of {{Server.CURRENT_VERSION}} to 4 by CASSANDRA-7660.
[jira] [Created] (CASSANDRA-9115) Improve vnode allocation hash bit distribution
Benedict created CASSANDRA-9115: --- Summary: Improve vnode allocation hash bit distribution Key: CASSANDRA-9115 URL: https://issues.apache.org/jira/browse/CASSANDRA-9115 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Branimir Lambov Following on from CASSANDRA-7032, we should explore if it is possible to further improve the vnode allocation strategy to consider hash bit distribution for the selected vnodes, so that our hash based data structures can ensure good behaviour -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9027) Error processing org.apache.cassandra.metrics:type=HintedHandOffManager,name=Hints_created-IPv6 address
[ https://issues.apache.org/jira/browse/CASSANDRA-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9027: --- Fix Version/s: 2.0.15 Error processing org.apache.cassandra.metrics:type=HintedHandOffManager,name=Hints_created-IPv6 address - Key: CASSANDRA-9027 URL: https://issues.apache.org/jira/browse/CASSANDRA-9027 Project: Cassandra Issue Type: Bug Reporter: Erik Forsberg Assignee: Erik Forsberg Fix For: 2.0.15, 2.1.5 Attachments: cassandra-2.0-9027.txt, cassandra-2.0-9027.txt Getting some of these on 2.0.13: {noformat} WARN [MutationStage:92] 2015-03-24 08:57:20,204 JmxReporter.java (line 397) Error processing org.apache.cassandra.metrics:type=HintedHandOffManager,name=Hints_created-2001:4c28:1:413:0:1:4:1 javax.management.MalformedObjectNameException: Invalid character ':' in value part of property at javax.management.ObjectName.construct(ObjectName.java:618) at javax.management.ObjectName.init(ObjectName.java:1382) at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395) at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516) at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491) at com.yammer.metrics.core.MetricsRegistry.newCounter(MetricsRegistry.java:115) at com.yammer.metrics.Metrics.newCounter(Metrics.java:108) at org.apache.cassandra.metrics.HintedHandoffMetrics$2.load(HintedHandoffMetrics.java:58) at org.apache.cassandra.metrics.HintedHandoffMetrics$2.load(HintedHandoffMetrics.java:55) at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278) at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) at com.google.common.cache.LocalCache.get(LocalCache.java:3932) at 
com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3936) at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4806) at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4812) at org.apache.cassandra.metrics.HintedHandoffMetrics.incrCreatedHints(HintedHandoffMetrics.java:64) at org.apache.cassandra.db.HintedHandOffManager.hintFor(HintedHandOffManager.java:124) at org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:957) at org.apache.cassandra.service.StorageProxy$6.runMayThrow(StorageProxy.java:927) at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:2069) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} Seems to be about the same as CASSANDRA-5298. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
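The root cause is at the JMX level: a raw colon is illegal in the value part of an `ObjectName`, and IPv6 addresses are full of them. Quoting the value with `ObjectName.quote` is the standard escape hatch; the actual Cassandra patch may sanitize the name differently, so treat this as a sketch of the cause and one cure:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Sketch: build a metric ObjectName safely when the endpoint may be an
// IPv6 address containing ':' characters.
public class HintsMetricName {
    public static ObjectName nameFor(String endpoint) {
        String value = "Hints_created-" + endpoint;
        // Raw colons are illegal in a key property value, so quote when needed.
        if (value.indexOf(':') >= 0)
            value = ObjectName.quote(value);
        try {
            return new ObjectName("org.apache.cassandra.metrics:type=HintedHandOffManager,name=" + value);
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException("unexpected: quoted value should always be legal", e);
        }
    }
}
```

Without the quoting step, `new ObjectName(...)` throws exactly the `MalformedObjectNameException: Invalid character ':' in value part of property` seen in the log above.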
[jira] [Updated] (CASSANDRA-8948) cassandra-stress does not honour consistency level (cl) parameter when used in combination with user command
[ https://issues.apache.org/jira/browse/CASSANDRA-8948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8948: --- Fix Version/s: (was: 2.0.15) 2.1.5 cassandra-stress does not honour consistency level (cl) parameter when used in combination with user command Key: CASSANDRA-8948 URL: https://issues.apache.org/jira/browse/CASSANDRA-8948 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Andreas Flinck Assignee: T Jake Luciani Fix For: 2.1.5 Attachments: 8948.txt The stress test tool does not honour the cl parameter when used in combination with the user command. The consistency level will default to ONE no matter what is set by cl=. It works fine with the write command. How to reproduce: 1. Create a suitable yaml-file to use in the test 2. Run e.g. {code}./cassandra-stress user profile=./file.yaml cl=ALL no-warmup duration=10s ops\(insert=1\) -rate threads=4 -port jmx=7100{code} 3. Observe that cl=ONE in the trace logs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8978) CQLSSTableWriter causes ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CASSANDRA-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8978: -- Attachment: 8978-2.1-v2.txt test-8978.txt The test that I added had been failing for me when I posted this patch, but I can't get it to fail anymore. I'm attaching a new test instead (test-8978.txt), which does fail on 2.1. The issue is that {{UpdateStatement}} has a {{ColumnFamily}} which it applies the modification to. When we hit the size that we are targeting in {{ABSC.addColumn}}, we replace the current column family with a new one and send the previous one to the writer thread. Since the update statement doesn't have the new column family, it continues to write columns to the old one, which should no longer be modified. The change that I made moves the replacement of the column family to a point where the update statement is complete. CQLSSTableWriter causes ArrayIndexOutOfBoundsException -- Key: CASSANDRA-8978 URL: https://issues.apache.org/jira/browse/CASSANDRA-8978 Project: Cassandra Issue Type: Bug Components: Core Environment: 3.8.0-42-generic #62~precise1-Ubuntu SMP Wed Jun 4 22:04:18 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux java version 1.8.0_20 Java(TM) SE Runtime Environment (build 1.8.0_20-b26) Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode) Reporter: Thomas Borg Salling Assignee: Carl Yeksigian Fix For: 2.1.5 Attachments: 8978-2.1-v2.txt, 8978-2.1.txt, test-8978.txt On long-running jobs with CQLSSTableWriter preparing sstables for later bulk load via sstableloader, occasionally I get the sporadic error shown below. I can run the exact same job again - and it will succeed or fail with the same error at another location in the input stream. The error appears to occur randomly - with the same input it may occur never, early or late in the run, with no apparent logic or system. I use five instances of CQLSSTableWriter in the application (to write redundantly to five different tables).
But these instances do not exist at the same time, and thus they are never used concurrently. {code} 09:26:33.582 [main] INFO d.dma.ais.store.FileSSTableConverter - Finished processing directory, 369582175 packets was converted from /nas1/ Exception in thread main java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at dk.dma.commons.app.CliCommandList$1.execute(CliCommandList.java:50) at dk.dma.commons.app.CliCommandList.invoke(CliCommandList.java:80) at dk.dma.ais.store.Main.main(Main.java:34) Caused by: java.lang.ArrayIndexOutOfBoundsException: 297868 at org.apache.cassandra.db.ArrayBackedSortedColumns.append(ArrayBackedSortedColumns.java:196) at org.apache.cassandra.db.ArrayBackedSortedColumns.appendOrReconcile(ArrayBackedSortedColumns.java:191) at org.apache.cassandra.db.ArrayBackedSortedColumns.sortCells(ArrayBackedSortedColumns.java:176) at org.apache.cassandra.db.ArrayBackedSortedColumns.maybeSortCells(ArrayBackedSortedColumns.java:125) at org.apache.cassandra.db.ArrayBackedSortedColumns.access$1100(ArrayBackedSortedColumns.java:44) at org.apache.cassandra.db.ArrayBackedSortedColumns$CellCollection.iterator(ArrayBackedSortedColumns.java:622) at org.apache.cassandra.db.ColumnFamily.iterator(ColumnFamily.java:476) at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:129) at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:233) at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:218) at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter$DiskWriter.run(SSTableSimpleUnsortedWriter.java:215){code} So far I have overcome this problem by simply retrying with another run of the application in an attempt to generate the sstables.
But this is a rather time-consuming and shaky approach - and I feel a bit uneasy relying on the produced sstables, though their contents appear to be correct when I sample them with cqlsh 'select' after loading them into Cassandra. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
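The stale-reference failure Carl describes can be sketched in a few lines of self-contained Java. Everything below is illustrative: plain lists stand in for {{ColumnFamily}} and the writer-thread hand-off, and all names are hypothetical rather than Cassandra's actual classes. The point is that a statement which caches the live buffer keeps mutating it after the swap, so a buffer that has already been handed to the "writer thread" grows underneath it - the precondition for the {{ArrayIndexOutOfBoundsException}} above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the stale-buffer bug discussed above (names hypothetical;
// plain lists stand in for ColumnFamily and the writer-thread hand-off).
public class StaleBufferSketch {
    static final int FLUSH_THRESHOLD = 3;

    List<String> current = new ArrayList<>();        // the live buffer
    List<List<String>> flushed = new ArrayList<>();  // buffers handed to the "writer thread"

    // Buggy flow: the statement caches `current` once, but the swap can happen
    // after any cell, so the cached reference goes stale.
    static int buggyFlushedFirstSize() {
        StaleBufferSketch w = new StaleBufferSketch();
        List<String> cached = w.current;             // statement grabs the buffer up front
        for (int i = 0; i < 6; i++) {
            cached.add("cell" + i);                  // may be the retired buffer!
            if (w.current.size() >= FLUSH_THRESHOLD) {
                w.flushed.add(w.current);            // handed off; should now be immutable
                w.current = new ArrayList<>();       // swap happens mid-statement
            }
        }
        // The retired buffer kept growing after the hand-off.
        return w.flushed.get(0).size();
    }

    // Fixed flow (the shape of the patch): only swap buffers once the current
    // statement has finished applying its modification.
    static int fixedFlushedFirstSize() {
        StaleBufferSketch w = new StaleBufferSketch();
        for (int i = 0; i < 6; i++) {
            w.current.add("cell" + i);               // statement writes the live buffer
            // statement complete here, so the swap is safe
            if (w.current.size() >= FLUSH_THRESHOLD) {
                w.flushed.add(w.current);
                w.current = new ArrayList<>();
            }
        }
        return w.flushed.get(0).size();              // never mutated after hand-off
    }
}
```

In the buggy flow the first flushed buffer ends up holding all six cells because it kept being mutated after the hand-off; in the fixed flow, where swaps happen only at statement boundaries, it holds exactly the three cells it had when flushed.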
[jira] [Commented] (CASSANDRA-9106) disable secondary indexes by default
[ https://issues.apache.org/jira/browse/CASSANDRA-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394808#comment-14394808 ] Jon Haddad commented on CASSANDRA-9106: --- [~slebresne] Agreed - I was not aware of CASSANDRA-8303. Much better to include it there than in the yaml. disable secondary indexes by default Key: CASSANDRA-9106 URL: https://issues.apache.org/jira/browse/CASSANDRA-9106 Project: Cassandra Issue Type: Improvement Reporter: Jon Haddad Fix For: 3.0 This feature is misused constantly. Can we disable it by default, and provide a yaml config to explicitly enable it? Along with a massive warning about how secondary indexes aren't there for performance, maybe with a link to documentation that explains why? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9099) Validation compaction not working for parallel repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394904#comment-14394904 ] Marcus Eriksson commented on CASSANDRA-9099: +1 Validation compaction not working for parallel repair - Key: CASSANDRA-9099 URL: https://issues.apache.org/jira/browse/CASSANDRA-9099 Project: Cassandra Issue Type: Bug Reporter: Yuki Morishita Assignee: Yuki Morishita Fix For: 3.0 Attachments: 0001-Fix-wrong-check-when-validating-in-parallel.patch Because the boundary check is inverted, we are validating the wrong SSTables. This only affects trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9029) Add support for rate limiting log statements
[ https://issues.apache.org/jira/browse/CASSANDRA-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394984#comment-14394984 ] Benedict commented on CASSANDRA-9029: - I'm not sure we need all of the myriad accessors (so many different ways of providing the timing interval, the logger, etc). I would personally pare it back to just the NoSpamStatement. I particularly don't see a great deal of benefit to the provision of the current time, since the clock semantics may not match, and we shouldn't be logging in code _that_ performance sensitive. But no super strong feelings, and it looks like it works. +1 Add support for rate limiting log statements Key: CASSANDRA-9029 URL: https://issues.apache.org/jira/browse/CASSANDRA-9029 Project: Cassandra Issue Type: Improvement Reporter: Ariel Weisberg Assignee: Ariel Weisberg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9029) Add support for rate limiting log statements
[ https://issues.apache.org/jira/browse/CASSANDRA-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14394984#comment-14394984 ] Benedict edited comment on CASSANDRA-9029 at 4/3/15 9:08 PM: - I'm not sure we need all of the myriad accessors (so many different ways of providing the timing interval, the logger, etc). I would personally pare it back to just the NoSpamStatement. I particularly don't see a great deal of benefit to the provision of the current time, since the clock semantics may not match, and we shouldn't be logging in code _that_ performance sensitive. But no super strong feelings, and it looks like it works. +1 (Since I'm no doubt committing, make whatever tweaks you feel necessary post my comments, and I'll commit the result) was (Author: benedict): I'm not sure we need all of the myriad accessors (so many different ways of providing the timing interval, the logger, etc). I would personally pare it back to just the NoSpamStatement. I particularly don't see a great deal of benefit to the provision of the current time, since the clock semantics may not match, and we shouldn't be logging in code _that_ performance sensitive. But no super strong feelings, and it looks like it works. +1 Add support for rate limiting log statements Key: CASSANDRA-9029 URL: https://issues.apache.org/jira/browse/CASSANDRA-9029 Project: Cassandra Issue Type: Improvement Reporter: Ariel Weisberg Assignee: Ariel Weisberg -- This message was sent by Atlassian JIRA (v6.3.4#6332)
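As a concrete illustration of what a pared-back rate-limited log statement could look like, here is a minimal self-contained sketch. The names and signatures are hypothetical - Cassandra's actual NoSpamLogger API may differ - and the current time is taken as a parameter purely to make the sketch deterministic for testing, which is exactly the kind of accessor Benedict suggests dropping from the real API.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch of a rate-limited ("no spam") log statement: the statement
// remembers when it last fired and suppresses repeats within the interval.
// A CAS keeps it thread-safe without locking. Names are illustrative only.
public class NoSpamStatementSketch {
    private static final long NEVER = Long.MIN_VALUE;  // sentinel: not yet logged

    private final long intervalNanos;
    private final AtomicLong lastLogged = new AtomicLong(NEVER);

    public NoSpamStatementSketch(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
    }

    // Returns true if the caller should emit the message, false if suppressed.
    // Taking nowNanos as a parameter is a testing convenience, not a proposal
    // for the real API (see the comment above about clock semantics).
    public boolean shouldLog(long nowNanos) {
        long last = lastLogged.get();
        if (last != NEVER && nowNanos - last < intervalNanos)
            return false;                                 // still inside the interval
        return lastLogged.compareAndSet(last, nowNanos);  // one winner per interval
    }
}
```

A caller would wrap its logger invocation in `if (statement.shouldLog(System.nanoTime())) logger.warn(...)`, so a message flood collapses to one line per interval.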
[jira] [Assigned] (CASSANDRA-9084) Do not generate line number in logs
[ https://issues.apache.org/jira/browse/CASSANDRA-9084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict reassigned CASSANDRA-9084: --- Assignee: Benedict Do not generate line number in logs --- Key: CASSANDRA-9084 URL: https://issues.apache.org/jira/browse/CASSANDRA-9084 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Andrey Assignee: Benedict Priority: Minor According to logback documentation (http://logback.qos.ch/manual/layouts.html): {code} Generating the line number information is not particularly fast. Thus, its use should be avoided unless execution speed is not an issue. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Remove line numbers from default logback.xml
Repository: cassandra Updated Branches: refs/heads/trunk 57e1a2654 - 77c40dfc4 Remove line numbers from default logback.xml ninja patch by benedict for CASSANDRA-9084 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/77c40dfc Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/77c40dfc Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/77c40dfc Branch: refs/heads/trunk Commit: 77c40dfc4910654fa0bad5be030d33d487cf2105 Parents: 57e1a26 Author: Benedict Elliott Smith bened...@apache.org Authored: Fri Apr 3 23:22:57 2015 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Fri Apr 3 23:22:57 2015 +0100 -- CHANGES.txt | 1 + conf/logback.xml | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/77c40dfc/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index f351666..9449386 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.0 + * Remove line number generation from default logback.xml * Don't execute any functions at prepare-time (CASSANDRA-9037) * Share file handles between all instances of a SegmentedFile (CASSANDRA-8893) * Make it possible to major compact LCS (CASSANDRA-7272) http://git-wip-us.apache.org/repos/asf/cassandra/blob/77c40dfc/conf/logback.xml -- diff --git a/conf/logback.xml b/conf/logback.xml index e170d41..1c94a2c 100644 --- a/conf/logback.xml +++ b/conf/logback.xml @@ -31,7 +31,7 @@ maxFileSize20MB/maxFileSize /triggeringPolicy encoder - pattern%-5level [%thread] %date{ISO8601} %F:%L - %msg%n/pattern + pattern%-5level [%thread] %date{ISO8601} %F: %msg%n/pattern !-- old-style log format pattern%5level [%thread] %date{ISO8601} %F (line %L) %msg%n/pattern --
[jira] [Updated] (CASSANDRA-9116) Indexes lost on upgrading to 2.1.4
[ https://issues.apache.org/jira/browse/CASSANDRA-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9116: -- Assignee: Sam Overton Indexes lost on upgrading to 2.1.4 -- Key: CASSANDRA-9116 URL: https://issues.apache.org/jira/browse/CASSANDRA-9116 Project: Cassandra Issue Type: Bug Components: Core Reporter: Mark Dewey Assignee: Sam Overton Priority: Blocker Fix For: 2.1.5 How to reproduce: # Create a 2.0.12 cluster # Create the following keyspace/table (or something similar, it's primarily the indexes that matter to this case afaict) {noformat} CREATE KEYSPACE tshirts WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '1'} AND durable_writes = true; CREATE TABLE tshirts.tshirtorders ( store text, order_time timestamp, order_number uuid, color text, qty int, size text, PRIMARY KEY (store, order_time, order_number) ) WITH CLUSTERING ORDER BY (order_time ASC, order_number ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX color ON tshirts.tshirtorders (color); CREATE INDEX size ON tshirts.tshirtorders (size); {noformat} # Load it with data # Stop the node (one node cluster is enough to replicate) # Upgrade the node to 2.1.4 # Start the node # Optional: Run nodetool upgradesstables # Run the following queries: {noformat} SELECT * FROM tshirts.tshirtorders WHERE store = 'store 65'; SELECT store, color, qty, size FROM tshirts.tshirtorders WHERE 
store = 'store 65' AND color = 'red'; {noformat} No rows will appear for the indexed query. Sample output: {noformat}
cqlsh SELECT * FROM tshirts.tshirtorders WHERE store = 'store 65';
 store    | order_time           | order_number                         | color  | qty  | size
----------+----------------------+--------------------------------------+--------+------+------
 store 65 | 2000-01-03 18:20:20+ | 457e60e6-da39-11e4-add3-42010af08298 | red    | 1295 | M
 store 65 | 2000-01-04 01:29:21+ | 45947304-da39-11e4-add3-42010af08298 | grey   | 2805 | M
 store 65 | 2000-01-04 19:55:51+ | 45d69220-da39-11e4-add3-42010af08298 | brown  | 3380 | XXXL
 store 65 | 2000-01-04 22:45:07+ | 45e16894-da39-11e4-add3-42010af08298 | yellow | 7000 | XXL
 store 65 | 2000-01-05 17:09:56+ | 46083bd6-da39-11e4-add3-42010af08298 | purple | 2440 | S
 store 65 | 2000-01-05 19:16:48+ | 460cadd8-da39-11e4-add3-42010af08298 | green  | 5690 | L
 store 65 | 2000-01-06 00:26:06+ | 461ccdbc-da39-11e4-add3-42010af08298 | brown  | 9890 | P
 store 65 | 2000-01-06 11:35:11+ | 4633aa00-da39-11e4-add3-42010af08298 | black  | 9350 | P
 store 65 | 2000-01-07 06:07:20+ | 4658e0ea-da39-11e4-add3-42010af08298 | black  | 1300 | S
 store 65 | 2000-01-07 06:47:40+ | 465be93e-da39-11e4-add3-42010af08298 | purple | 9630 | XL
 store 65 | 2000-01-09 12:42:38+ | 46bafdd4-da39-11e4-add3-42010af08298 | purple | 1470 | M
 store 65 | 2000-01-09 19:07:35+ | 46c43e08-da39-11e4-add3-42010af08298 | pink   | 6005 | S
 store 65 | 2000-01-10 04:47:56+ | 46d4b170-da39-11e4-add3-42010af08298 | red    | 345  | XL
 store 65 | 2000-01-10 20:25:44+ | 46ef7d52-da39-11e4-add3-42010af08298 | pink   | 420  | XXL
 store 65 | 2000-01-11 00:55:27+ | 46f7a84c-da39-11e4-add3-42010af08298 | purple | 9045 | S
 store 65 | 2000-01-11 17:54:25+ | 4724ea00-da39-11e4-add3-42010af08298 | green  | 5030 | XXL
 store 65 | 2000-01-12 08:21:15+ | 473c0370-da39-11e4-add3-42010af08298 | white  | 2860 | XL
 store 65 | 2000-01-12 17:09:19+ | 47497d2a-da39-11e4-add3-42010af08298 | red    | 6425 | L
 store 65 | 2000-01-14 07:27:37+ | 478662a8-da39-11e4-add3-42010af08298 | pink   | 330  | XXXL
 store 65 | 2000-01-14 11:31:38+ | 478b43cc-da39-11e4-add3-42010af08298 | pink   | 3335 | XXL
 store 65 | 2000-01-14 18:55:59+ | 47955a24-da39-11e4-add3-42010af08298 | yellow | 500  | P
 store 65 | 2000-01-15 01:59:52+ | 479f0c5e-da39-11e4-add3-42010af08298 | red    | 8415 | XL
 store 65 |
[jira] [Commented] (CASSANDRA-9116) Indexes lost on upgrading to 2.1.4
[ https://issues.apache.org/jira/browse/CASSANDRA-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395096#comment-14395096 ] Mark Dewey commented on CASSANDRA-9116: --- In tests against a three-node cluster, when two nodes are on 2.1.4 and one node is on 2.0.12, the following exception shows up in the system.log: {noformat} ERROR [Thread-18] 2015-04-03 16:02:04,366 CassandraDaemon.java (line 199) Exception in thread Thread[Thread-18,5,main] java.lang.NullPointerException at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:247) at org.apache.cassandra.db.RangeSliceCommandSerializer.deserialize(RangeSliceCommand.java:156) at org.apache.cassandra.net.MessageIn.read(MessageIn.java:99) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:149) at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:131) at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:74) {noformat} Indexes lost on upgrading to 2.1.4 -- Key: CASSANDRA-9116 URL: https://issues.apache.org/jira/browse/CASSANDRA-9116 Project: Cassandra Issue Type: Bug Components: Core Reporter: Mark Dewey Assignee: Sam Tunnicliffe Priority: Blocker Fix For: 2.1.5 How to reproduce: # Create a 2.0.12 cluster # Create the following keyspace/table (or something similar, it's primarily the indexes that matter to this case afaict) {noformat} CREATE KEYSPACE tshirts WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '1'} AND durable_writes = true; CREATE TABLE tshirts.tshirtorders ( store text, order_time timestamp, order_number uuid, color text, qty int, size text, PRIMARY KEY (store, order_time, order_number) ) WITH CLUSTERING ORDER BY (order_time ASC, order_number ASC) AND bloom_filter_fp_chance = 0.01 AND caching = '{keys:ALL, rows_per_partition:NONE}' AND comment = '' AND compaction = {'min_threshold': '4', 'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; CREATE INDEX color ON tshirts.tshirtorders (color); CREATE INDEX size ON tshirts.tshirtorders (size); {noformat} # Load it with data # Stop the node (one node cluster is enough to replicate) # Upgrade the node to 2.1.4 # Start the node # Optional: Run nodetool upgradesstables # Run the following queries: {noformat} SELECT * FROM tshirts.tshirtorders WHERE store = 'store 65'; SELECT store, color, qty, size FROM tshirts.tshirtorders WHERE store = 'store 65' AND color = 'red'; {noformat} No rows will appear for the indexed query. Sample output: {noformat}
cqlsh SELECT * FROM tshirts.tshirtorders WHERE store = 'store 65';
 store    | order_time           | order_number                         | color  | qty  | size
----------+----------------------+--------------------------------------+--------+------+------
 store 65 | 2000-01-03 18:20:20+ | 457e60e6-da39-11e4-add3-42010af08298 | red    | 1295 | M
 store 65 | 2000-01-04 01:29:21+ | 45947304-da39-11e4-add3-42010af08298 | grey   | 2805 | M
 store 65 | 2000-01-04 19:55:51+ | 45d69220-da39-11e4-add3-42010af08298 | brown  | 3380 | XXXL
 store 65 | 2000-01-04 22:45:07+ | 45e16894-da39-11e4-add3-42010af08298 | yellow | 7000 | XXL
 store 65 | 2000-01-05 17:09:56+ | 46083bd6-da39-11e4-add3-42010af08298 | purple | 2440 | S
 store 65 | 2000-01-05 19:16:48+ | 460cadd8-da39-11e4-add3-42010af08298 | green  | 5690 | L
 store 65 | 2000-01-06 00:26:06+ | 461ccdbc-da39-11e4-add3-42010af08298 | brown  | 9890 | P
 store 65 | 2000-01-06 11:35:11+ | 4633aa00-da39-11e4-add3-42010af08298 | black  | 9350 | P
 store 65 | 2000-01-07 06:07:20+ | 4658e0ea-da39-11e4-add3-42010af08298 | black  | 1300 | S
 store 65 | 2000-01-07 06:47:40+ | 465be93e-da39-11e4-add3-42010af08298 | purple | 9630 | XL
 store 65 | 2000-01-09 12:42:38+ | 46bafdd4-da39-11e4-add3-42010af08298 | purple | 1470 | M
 store 65 | 2000-01-09 19:07:35+ | 46c43e08-da39-11e4-add3-42010af08298 | pink   | 6005 | S
 store 65 | 2000-01-10 04:47:56+ | 46d4b170-da39-11e4-add3-42010af08298 | red    | 345  | XL
[jira] [Commented] (CASSANDRA-9117) LEAK DETECTED during repair, startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395154#comment-14395154 ] Benedict commented on CASSANDRA-9117: - I was looking into a flood of LEAK DETECTED earlier today on trunk, but since they don't affect 2.1, I was hoping to wait until CASSANDRA-8984 is committed, since I fully expect that to fix them if they're bugs anywhere in the main innards of resource management. LEAK DETECTED during repair, startup Key: CASSANDRA-9117 URL: https://issues.apache.org/jira/browse/CASSANDRA-9117 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Benedict Fix For: 3.0 Attachments: node1.log, node2.log.gz When running the {{incremental_repair_test.TestIncRepair.multiple_repair_test}} dtest, the following error logs show up: {noformat} ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,491 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@83f047e) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1631580268:Memory@[7f354800bdc0..7f354800bde8) was not released before the reference was garbage collected ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,493 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@50bc8f67) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@191552666:Memory@[7f354800ba90..7f354800bdb0) was not released before the reference was garbage collected ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,493 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@7fd10877) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1954741807:Memory@[7f3548101190..7f3548101194) was not released before the reference was garbage collected ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,494 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@578550ac) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1903393047:[[OffHeapBitSet]] was not released before the reference was garbage collected {noformat} The test is being run against trunk (commit {{1dff098e}}). I've attached a DEBUG-level log from the test run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
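For readers unfamiliar with the Ref machinery producing these messages, its reference-counting core can be sketched deterministically. The real Ref additionally registers phantom references so the Reference-Reaper thread can log LEAK DETECTED when a Ref is garbage-collected without release(); that GC-driven part is only described in comments here, and all names below are simplified stand-ins rather than Cassandra's actual API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified sketch of shared-resource reference counting: the tidy action runs
// exactly once, when the last reference is released. The real Ref class would
// also watch for Refs that are GC'd while counts > 0 and report them as leaks.
public class RefSketch {
    interface Tidy { void tidy(); }   // cleanup for the underlying resource

    private final AtomicInteger counts = new AtomicInteger(1);  // creator holds one ref
    private final Tidy tidy;
    private volatile boolean tidied = false;

    RefSketch(Tidy tidy) { this.tidy = tidy; }

    RefSketch ref() {                 // take an additional reference
        counts.incrementAndGet();
        return this;
    }

    void release() {                  // drop one reference; tidy on the last drop
        if (counts.decrementAndGet() == 0) {
            tidied = true;
            tidy.tidy();              // e.g. free the off-heap Memory seen in the log
        }
    }

    boolean isTidied() { return tidied; }
}
```

A "LEAK DETECTED" corresponds to an instance like this being garbage-collected while release() was still owed, so the tidy action never ran and the off-heap memory it guarded was never freed.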
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395089#comment-14395089 ] Benedict commented on CASSANDRA-7066: - Just an implementation note: we need to ensure we sync the directory file descriptor after each log file creation/deletion action, since they need a happens-before relation. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: compaction Fix For: 3.0 Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, which can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store its direct ancestors in its metadata, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way, as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
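Benedict's implementation note - fsync the parent directory after each log file creation or deletion so the directory entry is durable before dependent actions proceed - can be sketched as follows. This is an illustrative assumption of how such a sync could be done from Java, not Cassandra's implementation; the open-a-directory-and-force trick is known to work on Linux but not on all platforms, hence the hedged catch.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: make a newly created log file's directory entry durable before
// returning, so crash recovery observes file creation and any later dependent
// action in a consistent (happens-before) order. Names are illustrative.
public final class DurableLogFiles {
    public static Path createDurably(Path dir, String name) throws IOException {
        Path file = Files.createFile(dir.resolve(name));
        syncDirectory(dir);   // entry for `name` is on disk before we proceed
        return file;
    }

    static void syncDirectory(Path dir) {
        // On Linux a directory can be opened read-only and force()d, which
        // performs fsync(2) on the directory's file descriptor.
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            // Some platforms (e.g. Windows) refuse to open directories this way;
            // a real implementation would need a platform-specific fallback.
        }
    }
}
```

Deletion would mirror this: unlink the file, then sync the directory, so the removal is durable before the replacement sstables are considered live.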
[jira] [Commented] (CASSANDRA-9071) CQLSSTableWriter gives java.lang.AssertionError: Empty partition
[ https://issues.apache.org/jira/browse/CASSANDRA-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395195#comment-14395195 ] Benedict commented on CASSANDRA-9071: - Looking at the code, it appears that the bug that could be introduced there has been fixed. I suspect you may simply be providing zero mutations for a partition. Can you post the code you're using to interact with it? CQLSSTableWriter gives java.lang.AssertionError: Empty partition Key: CASSANDRA-9071 URL: https://issues.apache.org/jira/browse/CASSANDRA-9071 Project: Cassandra Issue Type: Bug Components: Tools Environment: java 7 / 8 cassandra 2.1.3 snapshot build locally with last commit https://github.com/apache/cassandra/commit/6ee4b0989d9a3ae3e704918622024fa57fdf63e7 macos Yosemite 10.10.2 Reporter: Ajit Joglekar Assignee: Benedict Fix For: 2.1.5 I am always getting the following error: Exception in thread main java.lang.AssertionError: Empty partition at org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter$DiskWriter.run(SSTableSimpleUnsortedWriter.java:228) It happens at a certain point that seems to be repeatable. The only issue is that I am converting 400 million records into multiple SSTables, and creating a small test is a challenge. The last comment from Benedict looks relevant here: https://issues.apache.org/jira/browse/CASSANDRA-8619 Is there a workaround or quick fix that I can try out locally? Thanks, -Ajit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: cqlsh: fix formatting of map keys and values
Repository: cassandra Updated Branches: refs/heads/trunk 2341e945b - adcb8a439 cqlsh: fix formatting of map keys and values Patch by Tyler Hobbs; reviewed by Philip Thompson for CASSANDRA-9114 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f7162293 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f7162293 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f7162293 Branch: refs/heads/trunk Commit: f7162293d2d61319be25aad49e76546ab335b9ff Parents: e4072cf Author: Tyler Hobbs ty...@datastax.com Authored: Fri Apr 3 15:17:20 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Apr 3 15:17:20 2015 -0500 -- pylib/cqlshlib/formatting.py | 1 + 1 file changed, 1 insertion(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f7162293/pylib/cqlshlib/formatting.py -- diff --git a/pylib/cqlshlib/formatting.py b/pylib/cqlshlib/formatting.py index ac12fe6..868ec28 100644 --- a/pylib/cqlshlib/formatting.py +++ b/pylib/cqlshlib/formatting.py @@ -264,6 +264,7 @@ def format_value_map(val, encoding, colormap, date_time_format, float_precision, return FormattedValue(bval, coloredval, displaywidth) formatter_for('OrderedDict')(format_value_map) formatter_for('OrderedMap')(format_value_map) +formatter_for('OrderedMapSerializedKey')(format_value_map) def format_value_utype(val, encoding, colormap, date_time_format, float_precision, nullval, **_):
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/adcb8a43 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/adcb8a43 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/adcb8a43 Branch: refs/heads/trunk Commit: adcb8a439a4826db6d5db1672c882525b60c66cc Parents: 2341e94 f716229 Author: Tyler Hobbs ty...@datastax.com Authored: Fri Apr 3 15:18:17 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Apr 3 15:18:17 2015 -0500 -- pylib/cqlshlib/formatting.py | 1 + 1 file changed, 1 insertion(+) --