[jira] [Assigned] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu reassigned CASSANDRA-12106:
---------------------------------------

    Assignee: Sumanth Pasupuleti  (was: Geoffrey Yu)

> Add ability to blacklist a CQL partition so all requests are ignored
> --------------------------------------------------------------------
>
>          Key: CASSANDRA-12106
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
>      Project: Cassandra
>   Issue Type: New Feature
>     Reporter: Geoffrey Yu
>     Assignee: Sumanth Pasupuleti
>     Priority: Minor
>      Fix For: 4.x
>
>  Attachments: 12106-trunk.txt
>
> Sometimes reads/writes to a given partition may cause problems due to the data present. It would be useful to have a manual way to blacklist such partitions so all read and write requests to them are rejected.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition
[ https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495387#comment-15495387 ]

Geoffrey Yu commented on CASSANDRA-12367:
-----------------------------------------

Thanks for the first pass [~slebresne]! I added another commit to address your comments [here|https://github.com/geoffxy/cassandra/commit/a71968ebba8b67591b88cafd2daf3b37e17fec52]. I added {{rowCount()}} to the {{Partition}} interface so that a {{rowEstimate}} can be passed to {{UnfilteredRowIteratorSerializer.serializedSize()}}; all the implementing classes already had that method available. Please let me know how it looks now!

{quote}
Wonders if it wouldn't be more user friendly to return 0 if the key is not hosted on that replica (which will simply happen if we don't check anything). Genuine question though, I could see both options having advantages, so mentioning it for the sake of discussion.
{quote}

I don't feel strongly either way since I agree that both options have merit. I've left the check in for now, but I have no objection to removing it if others feel strongly.

> Add an API to request the size of a CQL partition
> -------------------------------------------------
>
>          Key: CASSANDRA-12367
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-12367
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Geoffrey Yu
>     Assignee: Geoffrey Yu
>     Priority: Minor
>      Fix For: 3.x
>
>  Attachments: 12367-trunk-v2.txt, 12367-trunk.txt
>
> It would be useful to have an API that we could use to get the total serialized size of a CQL partition, scoped by keyspace and table, on disk.
[jira] [Commented] (CASSANDRA-12367) Add an API to request the size of a CQL partition
[ https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15461776#comment-15461776 ]

Geoffrey Yu commented on CASSANDRA-12367:
-----------------------------------------

[~slebresne]: Are [these changes|https://github.com/geoffxy/cassandra/compare/trunk...geoffxy:CASSANDRA-12367?w=1] similar to what you had in mind? The patch subtracts the offsets of the {{RowIndexEntry}} objects corresponding to the partition key and the next partition key in the file to get a size in bytes. I also kept the code that reads the partition from the memtable, so the operator can get information on the partition's footprint in the memtable as well; however, that path ignores {{Unfiltered}} objects that are not {{Row}}s.
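The offset-subtraction idea above can be sketched as follows. This is a minimal, illustrative model, not Cassandra's actual index API: the map stands in for the partition index, and the names are hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: a partition's on-disk size is the gap between its
// starting offset and the next partition's starting offset in the data file
// (or end-of-file for the last partition). In Cassandra this offset
// information comes from the index (RowIndexEntry).
class PartitionSizeSketch {
    // indexOffsets maps a partition's token to its starting byte offset.
    static long sizeOf(NavigableMap<Long, Long> indexOffsets, long token, long dataFileLength) {
        Long start = indexOffsets.get(token);
        if (start == null)
            return 0; // partition not present in this sstable
        // Offset of the next partition, or end of file if this is the last one.
        Map.Entry<Long, Long> next = indexOffsets.higherEntry(token);
        long end = next == null ? dataFileLength : next.getValue();
        return end - start;
    }

    public static void main(String[] args) {
        NavigableMap<Long, Long> idx = new TreeMap<>();
        idx.put(10L, 0L);    // partition at token 10 starts at byte 0
        idx.put(20L, 338L);  // next partition starts at byte 338
        System.out.println(sizeOf(idx, 10L, 700L)); // 338
        System.out.println(sizeOf(idx, 20L, 700L)); // 362
    }
}
```

The sketch also shows why the last partition in the file needs the data file length as a fallback "next offset".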
[jira] [Updated] (CASSANDRA-12367) Add an API to request the size of a CQL partition
[ https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12367:
------------------------------------
    Attachment: 12367-trunk-v2.txt

I've attached another patch that adds a new statement to CQL, as described in the ticket, for some early feedback on the approach. It's implemented as a new statement because its semantics didn't fit well into the existing {{SELECT}} statement.

{code}
cqlsh> SELECT SIZE FROM demo.test WHERE type = 'person';

 endpoint  | size (bytes)
-----------+--------------
 127.0.0.2 |          338
 127.0.0.3 |          338

(2 rows)
{code}

The statement must be restricted to a single partition and returns results based on the consistency level (here it was {{ALL}} on a keyspace with RF=2).

{quote}
Could we use SSTableReader.getScanner(Range range, ...) instead of scanning all the partitions in the sstable? We would need to create the range so that it includes the token requested but I think it should save us some time by seeking to the correct position directly.
{quote}

Using {{SSTableReader.getScanner(Range range, ...)}} makes sense. Is there a recommended approach for creating a small {{Range}} that will wrap the requested token? For a {{LongToken}} it seems straightforward to decrease the token value slightly to create a range, but I'm not sure what a reasonable approach looks like across the different token types.
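The "small range wrapping a token" question above can be sketched for the numeric-token case. Cassandra's {{Range}} is start-exclusive / end-inclusive, so for a {{LongToken}} t the range (t - 1, t] selects exactly that token; the {{Range}} type below is a hypothetical stand-in, not Cassandra's class.

```java
// Hypothetical sketch of wrapping a single numeric token in the smallest
// possible (start-exclusive, end-inclusive] range. Caveat mirroring the JIRA
// question: this only works for token types with a predecessor (e.g. longs,
// and even there token == Long.MIN_VALUE would wrap around).
class TokenRangeSketch {
    record Range(long leftExclusive, long rightInclusive) {
        boolean contains(long token) {
            return token > leftExclusive && token <= rightInclusive;
        }
    }

    static Range wrap(long token) {
        return new Range(token - 1, token);
    }

    public static void main(String[] args) {
        Range r = wrap(42L);
        System.out.println(r.contains(42L)); // true
        System.out.println(r.contains(41L)); // false
    }
}
```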
[jira] [Resolved] (CASSANDRA-12075) Include whether or not the client should retry the request when throwing a RequestExecutionException
[ https://issues.apache.org/jira/browse/CASSANDRA-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu resolved CASSANDRA-12075.
-------------------------------------
    Resolution: Won't Fix

> Include whether or not the client should retry the request when throwing a RequestExecutionException
> ----------------------------------------------------------------------------------------------------
>
>          Key: CASSANDRA-12075
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-12075
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Geoffrey Yu
>     Assignee: Geoffrey Yu
>     Priority: Minor
>
> Some requests that result in an error should not be retried by the client. Right now if the client gets an error, it has no way of knowing whether or not it should retry. We can include an extra field in each {{RequestExecutionException}} that will indicate whether the client should retry, retry on a different host, or not retry at all.
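The extra field proposed in the ticket (ultimately resolved Won't Fix) can be sketched as an enum carried by the exception. All names here are hypothetical; Cassandra's real {{RequestExecutionException}} carries no such field.

```java
// Hypothetical sketch of the proposal: tag each request failure with a hint
// about whether the client should retry, retry elsewhere, or give up.
class RetryHintSketch {
    enum RetryDecision { RETRY, RETRY_OTHER_HOST, DO_NOT_RETRY }

    static class RequestExecutionException extends RuntimeException {
        final RetryDecision retryDecision;

        RequestExecutionException(String message, RetryDecision decision) {
            super(message);
            this.retryDecision = decision;
        }
    }

    public static void main(String[] args) {
        try {
            throw new RequestExecutionException("replica overloaded", RetryDecision.RETRY_OTHER_HOST);
        } catch (RequestExecutionException e) {
            // The client can branch on the hint instead of guessing.
            System.out.println(e.retryDecision); // RETRY_OTHER_HOST
        }
    }
}
```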
[jira] [Commented] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433601#comment-15433601 ]

Geoffrey Yu commented on CASSANDRA-9875:
----------------------------------------

Thanks! I've opened a PR for the dtests [here|https://github.com/riptano/cassandra-dtest/pull/1273]. I'll keep an eye on the tests and look into any failures that come up.

> Rebuild from targeted replica
> -----------------------------
>
>          Key: CASSANDRA-9875
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-9875
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: sankalp kohli
>     Assignee: Geoffrey Yu
>     Priority: Minor
>       Labels: lhf
>      Fix For: 3.x
>
>  Attachments: 9875-dtest-master-v2.txt, 9875-dtest-master.txt, 9875-trunk-v2.txt, 9875-trunk.txt
>
> The nodetool rebuild command will rebuild all the token ranges handled by the endpoint. Sometimes we want to rebuild only a certain token range. We should add this ability to the rebuild command. We should also add the ability to stream from a given replica.
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-9875:
-----------------------------------
    Attachment: 9875-dtest-master-v2.txt

I've attached a new dtest patch with the changes. I ended up adding a new test so we can get more granularity in the reporting. Please let me know how it looks!
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-9875:
-----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-9875:
-----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-9875:
-----------------------------------
    Attachment: 9875-dtest-master.txt
                9875-trunk-v2.txt

I've attached a new patch that uses a source filter, as well as a patch for two new dtests. One test verifies the behavior of rebuilding with a specific range, and the other verifies that {{nodetool rebuild}} disallows rebuilding a range that the current node does not own. Please let me know how these look!
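The two checks described in the update above can be sketched as follows. This is an illustrative model only; the names are hypothetical and do not match Cassandra's {{RangeStreamer}} API.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the rebuild changes: (1) restrict streaming sources
// to a whitelisted replica via a source filter, and (2) reject a requested
// rebuild range that the node does not own.
class RebuildFilterSketch {
    record Range(long left, long right) {
        boolean contains(Range other) {
            return left <= other.left && other.right <= right;
        }
    }

    // Source filter: keep only candidate endpoints in the whitelist.
    static List<String> filterSources(List<String> candidates, Set<String> whitelist) {
        return candidates.stream().filter(whitelist::contains).toList();
    }

    // Validation: the requested range must fall within a locally owned range.
    static void validateRequestedRange(List<Range> owned, Range requested) {
        boolean ok = owned.stream().anyMatch(r -> r.contains(requested));
        if (!ok)
            throw new IllegalArgumentException("node does not own requested range " + requested);
    }

    public static void main(String[] args) {
        System.out.println(filterSources(List.of("10.0.0.1", "10.0.0.2"), Set.of("10.0.0.2")));
        validateRequestedRange(List.of(new Range(0, 100)), new Range(10, 50)); // ok, contained
    }
}
```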
[jira] [Updated] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-2848:
-----------------------------------
    Attachment: 2848-trunk-v2.txt

I'm attaching a second version of the patch that incorporates the changes in CASSANDRA-12256.

*TL;DR:* The timeout is represented as an {{OptionalLong}} that is encoded in {{QueryOptions}}. It is passed all the way to the replica nodes on reads through {{ReadCommand}}, but is only kept on the coordinator for writes.

The optional client-specified timeout is decoded as part of {{QueryOptions}}. Since this timeout may or may not be specified by a client, I opted to use an {{OptionalLong}} to make it clearer in the code that the value is optional. I've gated the use of the new timeout flag (and encoding the timeout) to protocol v5 and above.

On the read path, the timeout is kept within the {{ReadCommand}} and referenced in {{ReadCallback.awaitResults()}}. It is also serialized within the {{ReadCommand}} so that replica nodes can use it when setting the monitoring time in {{ReadCommandVerbHandler}}. Of course, because the time when the query started is not propagated to the replicas, this will only enforce the timeout from when the {{MessageIn}} was constructed.

On the write path, the timeout is just passed through the call stack into the {{AbstractWriteResponseHandler}}/{{AbstractPaxosCallback}}, where it is referenced in the respective {{await()}} calls. I had investigated passing the timeout to the replicas on the write path as well. To do so, we'd need to incorporate it into the outgoing internode message when making a write, meaning placing it into {{Mutation}} or otherwise creating some sort of wrapper around a mutation that can hold the timeout. That seemed like a very invasive change for minimal gain, since being able to abort an in-progress write didn't seem as useful as aborting an in-progress read.

This still requires a version bump in the internode protocol to support the change in serialization of {{ReadCommand}} (I haven't touched {{MessagingService.current_version}} yet, though). If we don't want to wait until 4.0, we can delay this part of the patch and just retain the custom timeout on the coordinator (i.e. not serialize the timeout). Once the branch for 4.0 is available, we can modify the serialization to allow us to pass the timeout to the replicas.

I'd also like to include some dtests for this, namely to validate which timeout is being used on the coordinator. Is the accepted practice for something like this to log a message and assert on the presence of the log entry? I want to avoid relying on the actual timeout observed, since that can make the test flaky.

> Make the Client API support passing down timeouts
> -------------------------------------------------
>
>          Key: CASSANDRA-2848
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Chris Goffinet
>     Assignee: Geoffrey Yu
>     Priority: Minor
>      Fix For: 3.x
>
>  Attachments: 2848-trunk-v2.txt, 2848-trunk.txt
>
> Having a max server RPC timeout is good for the worst case, but many applications that have middleware in front of Cassandra might have tighter timeout requirements. In a fail-fast environment, if my application, starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra we might only have 10ms. I propose we provide the ability to optionally specify the timeout on each call we do.
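The "optional timeout encoded behind a flag" scheme described above can be sketched with a presence flag followed by the value only when set. This is a generic illustration of the technique, not Cassandra's actual {{QueryOptions}} wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.OptionalLong;

// Hypothetical sketch: serialize an OptionalLong timeout as a presence flag
// plus the value only when present, and decode it back symmetrically.
class OptionalTimeoutCodec {
    static void write(OptionalLong timeoutMillis, DataOutput out) throws IOException {
        out.writeBoolean(timeoutMillis.isPresent()); // the "timeout set" flag
        if (timeoutMillis.isPresent())
            out.writeLong(timeoutMillis.getAsLong());
    }

    static OptionalLong read(DataInput in) throws IOException {
        return in.readBoolean() ? OptionalLong.of(in.readLong()) : OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        write(OptionalLong.of(250), new DataOutputStream(buf));
        OptionalLong decoded = read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(decoded); // OptionalLong[250]
    }
}
```

The same round-trip works for the absent case: a single false byte decodes back to {{OptionalLong.empty()}}.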
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428853#comment-15428853 ]

Geoffrey Yu commented on CASSANDRA-12311:
-----------------------------------------

Thanks for all the help as well! :)

> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
>          Key: CASSANDRA-12311
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Geoffrey Yu
>     Assignee: Geoffrey Yu
>     Priority: Minor
>       Labels: client-impacting, doc-impacting
>      Fix For: 3.10
>
>  Attachments: 12311-dtest.txt, 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk-v4.txt, 12311-trunk-v5.txt, 12311-trunk.txt
>
> Right now if a data node fails to perform a read because it ran into a {{TombstoneOverwhelmingException}}, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed, and subsequently the client only gets a generic {{ReadFailureException}}. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-9875:
-----------------------------------
    Status: Open  (was: Patch Available)
[jira] [Commented] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428458#comment-15428458 ]

Geoffrey Yu commented on CASSANDRA-9875:
----------------------------------------

No worries about the delay! I totally agree that adding a host whitelist would be a better interface; I somehow missed the source filter in {{RangeStreamer}}. I'll take a look, make the changes, and add a dtest.
[jira] [Commented] (CASSANDRA-12256) Count entire coordinated request against timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423289#comment-15423289 ]

Geoffrey Yu commented on CASSANDRA-12256:
-----------------------------------------

Thanks for the review and help along the way!

> Count entire coordinated request against timeout
> ------------------------------------------------
>
>          Key: CASSANDRA-12256
>          URL: https://issues.apache.org/jira/browse/CASSANDRA-12256
>      Project: Cassandra
>   Issue Type: Improvement
>     Reporter: Sylvain Lebresne
>     Assignee: Geoffrey Yu
>      Fix For: 3.10
>
>  Attachments: 12256-trunk-v1v2.diff, 12256-trunk-v2.txt, 12256-trunk.txt
>
> We have a number of {{request_timeout_*}} options that probably every user expects to be an upper bound on how long the coordinator will wait before timing out a request, but that's actually not always the case, especially for read requests.
>
> I believe we don't respect those timeouts properly in at least the following cases:
> * On a digest mismatch: in that case, we reset the timeout for the data query, which means the overall query might take up to twice the configured timeout before timing out.
> * On a range query: the timeout is reset for every sub-range that is queried. With many nodes and vnodes, a range query could span tons of sub-ranges, and so a range query could take pretty much arbitrarily long before actually timing out for the user.
> * On short reads: we also reset the timeout for every short-read "retry".
>
> It's also worth noting that even outside those cases, the timeouts don't take most of the processing done by the coordinator (query parsing and CQL handling, for instance) into account.
>
> Now, in all fairness, the reason it is this way is that the timeouts currently are *not* timeouts for the full user request, but rather how long a coordinator should wait on any given replica for any given internal query before giving up. *However*, I'm pretty sure this is not what users intuitively expect and want, *especially* in the context of CASSANDRA-2848, where the goal is explicitly to have an upper bound on the query from the user's point of view.
>
> So I'm suggesting we change how those timeouts are handled to really be timeouts on the whole user query. By that I basically just mean that we'd mark the start of each query as soon as possible in the processing, and use that starting time as the base in {{ReadCallback.await}} and {{AbstractWriteResponseHandler.get()}}. It won't be perfect in the sense that we'll still only possibly time out during "blocking" operations, so if parsing a query takes more than your timeout, you still won't time out until that query is sent, but I think that's fine in practice because 1) if your timeouts are small enough that this matters, you're probably doing it wrong, and 2) we can totally improve on that later if need be.
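The deadline-based approach suggested in the ticket, marking the query start once and budgeting every subsequent wait against it, can be sketched as follows. The class and method names are illustrative, not Cassandra's.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: record the query start time once, then compute the
// remaining budget at each blocking wait instead of restarting the full
// timeout (as happens today on digest mismatch, per-sub-range queries, and
// short-read retries).
class QueryDeadlineSketch {
    final long startNanos;
    final long timeoutNanos;

    QueryDeadlineSketch(long startNanos, long timeoutMillis) {
        this.startNanos = startNanos;
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    // Remaining wait budget at 'nowNanos'; zero once the deadline has passed.
    // Each internal retry would wait at most this long, never the full timeout.
    long remainingNanos(long nowNanos) {
        return Math.max(0, startNanos + timeoutNanos - nowNanos);
    }

    public static void main(String[] args) {
        QueryDeadlineSketch q = new QueryDeadlineSketch(0, 100); // 100 ms budget
        System.out.println(q.remainingNanos(TimeUnit.MILLISECONDS.toNanos(30)));  // 70 ms left, in nanos
        System.out.println(q.remainingNanos(TimeUnit.MILLISECONDS.toNanos(150))); // 0: deadline passed
    }
}
```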
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421934#comment-15421934 ]

Geoffrey Yu commented on CASSANDRA-12311:
-----------------------------------------

I looked through the dtest failures and they all seem to be related to https://github.com/riptano/cassandra-dtest/pull/1147. I think they should pass if you rebase your {{CASSANDRA-12311-tests}} dtest branch off of the latest upstream master and rerun them. I was able to rebase the branch locally without any merge conflicts.
[jira] [Commented] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421866#comment-15421866 ]

Geoffrey Yu commented on CASSANDRA-12256:
-----------------------------------------

Thanks for rerunning! I looked through a handful of the remaining failing dtests and they all seem to be failing due to timeouts. I wasn't able to replicate the failures locally when running them individually this time, which leads me to _suspect_ that they fail because the existing dtest timeouts are now too strict.

I'm not super familiar with the dtest setup, so I'm looking for input on how best to proceed. Do the tests use the timeouts configured in {{dtest.py}} if they don't specify their own custom values? If so, would it be a good approach to try with those defaults increased, versus specifying custom values for just the failing tests? I do realize that increasing the defaults could make the tests run even longer than they already do, however.

Also, is there a way for me to kick off a subset of these tests myself on Jenkins? I don't want to have to keep bugging you with these failures :)
[jira] [Updated] (CASSANDRA-12367) Add an API to request the size of a CQL partition
[ https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12367:
------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (CASSANDRA-12367) Add an API to request the size of a CQL partition
[ https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12367:
------------------------------------
    Fix Version/s: 3.x
[jira] [Updated] (CASSANDRA-12367) Add an API to request the size of a CQL partition
[ https://issues.apache.org/jira/browse/CASSANDRA-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12367:
------------------------------------
    Attachment: 12367-trunk.txt

I've attached a patch that exposes a new method through JMX that allows an operator to get the size of a partition on disk, scoped by keyspace and table. I implemented it by iterating through the sstables (leveraging the bloom filter) and adding up the sizes of the CQL rows that fall within the partition. The patch also adds a nodetool command that can be used to invoke the API.
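The per-table aggregation described above, checking each sstable's bloom filter before summing, can be sketched with hypothetical names (this is not Cassandra's {{SSTableReader}} API):

```java
import java.util.List;

// Hypothetical sketch of the JMX-exposed computation: consult each sstable's
// bloom filter first and only add up sizes for sstables that may contain the
// partition; a bloom-filter miss means the key is definitely absent.
class PartitionSizeAggregator {
    interface SSTable {
        boolean mightContain(String partitionKey);      // bloom filter check
        long partitionSizeOnDisk(String partitionKey);  // 0 if actually absent
    }

    // Stand-in sstable for demonstration; real sstables would read from disk.
    record FakeSSTable(boolean maybe, long size) implements SSTable {
        public boolean mightContain(String key) { return maybe; }
        public long partitionSizeOnDisk(String key) { return size; }
    }

    static long totalSize(List<? extends SSTable> sstables, String key) {
        return sstables.stream()
                       .filter(s -> s.mightContain(key)) // bloom filter skips sstables cheaply
                       .mapToLong(s -> s.partitionSizeOnDisk(key))
                       .sum();
    }

    public static void main(String[] args) {
        List<FakeSSTable> tables = List.of(
            new FakeSSTable(true, 338),  // contains the partition
            new FakeSSTable(false, 999), // bloom filter miss: never read
            new FakeSSTable(true, 24));  // contains the partition
        System.out.println(totalSize(tables, "person")); // 362
    }
}
```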
[jira] [Comment Edited] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417999#comment-15417999 ]

Geoffrey Yu edited comment on CASSANDRA-12256 at 8/11/16 9:52 PM:
------------------------------------------------------------------

I've attached a v2 patch that removes the query start timestamp from {{QueryState}} and instead records it inside {{Message}} and passes it through the call chain. It turned out that {{QueryState}} is not exactly the best place to keep the query start timestamp because it is reused for queries that have the same {{streamId}}. I also attached a diff file between version 1 and 2 of the patches so it is easier to review since the changes are quite noisy.

If the changes are alright, could you trigger the tests again? This should fix the majority of them, which will make it easier to identify and address any further failures.

was (Author: geoffxy):
I've attached a v2 patch that removes the query start timestamp from {{QueryState}} and instead records it inside {{Message}} and passes it through the call chain. It turned out that {{QueryState}} is not exactly the best place to keep the query start timestamp because it is reused for queries that have the same {{streamId}}. I also attached a diff file between version 1 and 2 of the patches so it is easier to review since the changes are quite noisy. If the changes are alright, could you trigger the tests again? This should fix the majority of them, which will make it easier to address any further failures.
> Properly respect the request timeouts > - > > Key: CASSANDRA-12256 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12256 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Geoffrey Yu > Fix For: 3.x > > Attachments: 12256-trunk-v1v2.diff, 12256-trunk-v2.txt, > 12256-trunk.txt > > > We have a number of {{request_timeout_*}} options that probably every user > expects to be an upper bound on how long the coordinator will wait before > timing out a request, but that's actually not always the case, especially for > read requests. > I believe we don't respect those timeouts properly in at least the following > cases: > * On a digest mismatch: in that case, we reset the timeout for the data > query, which means the overall query might take up to twice the configured > timeout before timing out. > * On a range query: the timeout is reset for every sub-range that is queried. > With many nodes and vnodes, a range query could span tons of sub-ranges, and so > a range query could take pretty much arbitrarily long before actually > timing out for the user. > * On short reads: we also reset the timeout for every short read "retry". > It's also worth noting that even outside those, the timeouts don't take most > of the processing done by the coordinator (query parsing and CQL handling, for > instance) into account. > Now, in all fairness, the reason it is this way is that the timeouts > currently are *not* timeouts for the full user request, but rather how long a > coordinator should wait on any given replica for any given internal query > before giving up. *However*, I'm pretty sure this is not what users > intuitively expect and want, *especially* in the context of CASSANDRA-2848, > where the goal is explicitly to have an upper bound on the query from the > user's point of view. > So I'm suggesting we change how those timeouts are handled to really be > timeouts on the whole user query.
> And by that I basically just mean that we'd mark the start of each query as > soon as possible in the processing, and use that starting time as the base in > {{ReadCallback.await}} and {{AbstractWriteResponseHandler.get()}}. It won't > be perfect in the sense that we'll still only possibly time out during > "blocking" operations, so typically if parsing a query takes more than your > timeout, you still won't time out until that query is sent, but I think that's > probably fine in practice because 1) if your timeouts are small enough that > this matters, you're probably doing it wrong and 2) we can totally improve on > that later if need be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
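The proposal above amounts to measuring every internal wait against the original query start time, so each retry shrinks the remaining budget instead of resetting it. A minimal generic sketch with hypothetical names ({{QueryDeadline}} is not Cassandra's API):

```python
# Illustrative sketch only: measure every internal wait against the query's
# original start time, so digest retries, sub-range queries, and short-read
# retries spend the remaining budget rather than getting a fresh timeout.
import time

class RequestTimeout(Exception):
    pass

class QueryDeadline:
    """Track one user query's overall timeout from its start time."""
    def __init__(self, timeout_s):
        self.start = time.monotonic()
        self.timeout_s = timeout_s

    def remaining(self):
        # Called before each internal wait: returns what is left of the
        # original budget, or raises once the whole query is out of time.
        left = self.timeout_s - (time.monotonic() - self.start)
        if left <= 0:
            raise RequestTimeout("query exceeded its overall timeout")
        return left

deadline = QueryDeadline(timeout_s=2.0)
time.sleep(0.1)                    # pretend the initial data query took 100 ms
assert deadline.remaining() < 2.0  # the retry budget has shrunk, not reset
```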
[jira] [Updated] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12256: Status: Patch Available (was: Open) > Properly respect the request timeouts > - > > Key: CASSANDRA-12256 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12256 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Geoffrey Yu > Fix For: 3.x > > Attachments: 12256-trunk-v1v2.diff, 12256-trunk-v2.txt, > 12256-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12256: Attachment: 12256-trunk-v1v2.diff 12256-trunk-v2.txt I've attached a v2 patch that removes the query start timestamp from {{QueryState}} and instead records it inside {{Message}} and passes it through the call chain. It turned out that {{QueryState}} is not exactly the best place to keep the query start timestamp because it is reused for queries that have the same {{streamId}}. I also attached a diff file between version 1 and 2 of the patches so it is easier to review since the changes are quite noisy. If the changes are alright, could you trigger the tests again? This should fix the majority of them, which will make it easier to address any further failures. > Properly respect the request timeouts > - > > Key: CASSANDRA-12256 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12256 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Geoffrey Yu > Fix For: 3.x > > Attachments: 12256-trunk-v1v2.diff, 12256-trunk-v2.txt, > 12256-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416338#comment-15416338 ] Geoffrey Yu commented on CASSANDRA-12256: - Thanks for the review! I looked through the test results and it seems like there are quite a few failures that are timeouts. I'll take a look and see what I can do. > Properly respect the request timeouts > - > > Key: CASSANDRA-12256 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12256 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Geoffrey Yu > Fix For: 3.x > > Attachments: 12256-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415774#comment-15415774 ] Geoffrey Yu edited comment on CASSANDRA-9876 at 8/10/16 8:37 PM: - I've attached a v3 patch that makes some changes to {{RepairOptionTest}} to address the new behavior of {{RepairOption.parse()}}. The patch has four commits. The first is the v2 patch, the second and third are your ninja commits, and the fourth is the test changes. Please take a look and let me know what you think! I've also updated the dtest PR to address the failures in {{deprecated_repair_test.py}}. It looks like the failures are related to the changes in {{RepairOption.toString()}}. was (Author: geoffxy): I've attached a v3 patch that makes some changes to {{RepairOptionTest}} to address the new behavior of {{RepairOption.parse()}}. The patch has four commits. The first is the v2 patch, the second and third are your ninja commits, and the fourth are the test changes. Please take a look and let me know what you think! > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-dtest-master.txt, 9876-trunk-v2.txt, > 9876-trunk-v3.txt, 9876-trunk.txt > > > Many applications use C* by writing to one local DC. The other DC is used > when the local DC is unavailable. When the local DC becomes available, we > want to run a targeted repair b/w one endpoint from each DC to minimize the > data transfer over WAN. In this case, it will be helpful to do a one way > repair in which data will only be streamed from other DC to local DC instead > of streaming the data both ways. This will further minimize the traffic over > WAN. This feature should only be supported if a targeted repair is run > involving 2 hosts. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9876: --- Attachment: 9876-trunk-v3.txt I've attached a v3 patch that makes some changes to {{RepairOptionTest}} to address the new behavior of {{RepairOption.parse()}}. The patch has four commits. The first is the v2 patch, the second and third are your ninja commits, and the fourth is the test changes. Please take a look and let me know what you think! > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-dtest-master.txt, 9876-trunk-v2.txt, > 9876-trunk-v3.txt, 9876-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415711#comment-15415711 ] Geoffrey Yu commented on CASSANDRA-9876: It looks like the new check for ensuring both {{--dc}} and {{--hosts}} are not specified together is causing the {{RepairOptionTest.testParseOptions}} test to fail. I'll take a look at fixing it and add some new tests for the pull repair option parsing. > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-dtest-master.txt, 9876-trunk-v2.txt, 9876-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
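The new validation mentioned above is a standard mutually-exclusive-options check. A generic sketch, not {{RepairOption.parse()}}'s actual code:

```python
# Illustrative sketch of the kind of check being tested: a repair may be
# scoped to datacenters or to specific hosts, but not both at once.
def parse_repair_options(dcs=(), hosts=()):
    if dcs and hosts:
        raise ValueError("Cannot combine --dc and --hosts options")
    return {"dcs": list(dcs), "hosts": list(hosts)}

parse_repair_options(hosts=["10.0.0.1", "10.0.0.2"])      # fine
try:
    parse_repair_options(dcs=["dc1"], hosts=["10.0.0.1"])  # rejected
except ValueError as e:
    print(e)
```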
[jira] [Commented] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415629#comment-15415629 ] Geoffrey Yu commented on CASSANDRA-9876: Thanks for the quick follow up and review! Your changes look good to me. I've opened a PR for the dtest here: https://github.com/riptano/cassandra-dtest/pull/1209 > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-dtest-master.txt, 9876-trunk-v2.txt, 9876-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9875: --- Summary: Rebuild from targeted replica (was: Rebuild with start and end token and from targeted replica) > Rebuild from targeted replica > - > > Key: CASSANDRA-9875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9875 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: 9875-trunk.txt > > > Nodetool rebuild command will rebuild all the token ranges handled by the > endpoint. Sometimes we want to rebuild only a certain token range. We should > add this ability to rebuild command. We should also add the ability to stream > from a given replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9875: --- Fix Version/s: 3.x Status: Patch Available (was: Open) > Rebuild from targeted replica > - > > Key: CASSANDRA-9875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9875 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: 9875-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9875) Rebuild from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu reassigned CASSANDRA-9875: -- Assignee: Geoffrey Yu > Rebuild from targeted replica > - > > Key: CASSANDRA-9875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9875 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Labels: lhf > Fix For: 3.x > > Attachments: 9875-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9875) Rebuild with start and end token and from targeted replica
[ https://issues.apache.org/jira/browse/CASSANDRA-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9875: --- Attachment: 9875-trunk.txt Since CASSANDRA-10406 already implements the ability to specify ranges for {{nodetool rebuild}}, I attached a patch that adds the ability to specify the sources to stream from during the rebuild (which is the other improvement this ticket mentions). *Usage:* {{nodetool rebuild --keyspace <keyspace> --tokens <token_ranges> --sources <sources>}} The implementation in this ticket requires that if {{--sources}} is used, a source must be specified for every single token range provided using {{--tokens}}. I also added some code to validate the input ranges to make sure that the current node owns all of them. > Rebuild with start and end token and from targeted replica > -- > > Key: CASSANDRA-9875 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9875 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Priority: Minor > Labels: lhf > Attachments: 9875-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
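The constraints described above (one source per token range, and every requested range must be owned by the local node) can be sketched as follows; the function and its argument shapes are hypothetical, not the patch's real signature:

```python
# Illustrative sketch of the patch's validation rules; ranges are modeled as
# simple (start, end) tuples rather than real token ranges.
def validate_rebuild_args(local_ranges, tokens, sources):
    # One streaming source must be given for every requested token range.
    if len(sources) != len(tokens):
        raise ValueError("a source must be specified for every token range")
    # The node can only rebuild ranges it actually owns.
    owned = set(local_ranges)
    for rng in tokens:
        if rng not in owned:
            raise ValueError(f"this node does not own range {rng}")
    # Pair each range with the replica it will stream from.
    return dict(zip(tokens, sources))

plan = validate_rebuild_args(
    local_ranges=[("0", "100"), ("100", "200")],
    tokens=[("0", "100")],
    sources=["10.0.0.5"],
)
print(plan)  # {('0', '100'): '10.0.0.5'}
```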
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414589#comment-15414589 ] Geoffrey Yu commented on CASSANDRA-12311: - Thanks, that sounds great! > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-dtest.txt, 12311-trunk-v2.txt, 12311-trunk-v3.txt, > 12311-trunk-v4.txt, 12311-trunk-v5.txt, 12311-trunk.txt > > > Right now if a data node fails to perform a read because it ran into a > {{TombstoneOverwhelmingException}}, it only responds back to the coordinator > node with a generic failure. Under this scheme, the coordinator won't be able > to know exactly why the request failed and subsequently the client only gets > a generic {{ReadFailureException}}. It would be useful to inform the client > that their read failed because we read too many tombstones. We should have > the data nodes reply with a failure type so the coordinator can pass this > information to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414586#comment-15414586 ] Geoffrey Yu edited comment on CASSANDRA-9876 at 8/10/16 1:51 AM: - Thanks for the quick review! I’ve attached a new patch that addresses your comments, with the exception of one of them for which I wanted to get some more feedback first. I also attached a patch that adds one dtest to test the pull repair. It works nearly identically to the token range repair with the exception that it asserts that one of the nodes only sends data and the other only receives. {quote} I don't think it's necessary to make specifying --start-token and --end-token mandatory, since if that is not specified it will just pull repair all common ranges between specified hosts. {quote} The reason I added the check for a token range was that the repair code as it is now doesn’t actually add only the common ranges between the specified hosts. I wasn’t sure if this was the intended behavior or a bug. To replicate the issue, just create a 3 node cluster, add a keyspace with replication factor 2, and run a regular repair through nodetool on that keyspace with exactly two nodes specified. The reason it happens is that if no ranges are specified, the repair will [add all ranges on the local node|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L3137]. Then when we hit {{RepairRunnable}}, we try to [find a list of neighbors for each range|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairRunnable.java#L160-L162]. The problem here is that it isn’t always true that every range the local node owns is also owned by the remote node we specified through the nodetool command. 
Because of this the [check here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ActiveRepairService.java#L246-L251] may result in an exception being thrown, which aborts the repair. If this is intended behavior, then forcing the user to specify a token range that is common between the nodes prevents that exception from being thrown. Otherwise the error message, “Repair requires at least two endpoints that are neighbours before it can continue” can be confusing to the operator since the two specified nodes may actually share a common range. What do you think? was (Author: geoffxy): Thanks for the quick review! I’ve attached a new patch that addresses your comments, with the exception of one of them for which I wanted to get some more feedback first. I also attached a patch that adds one dtest to test the pull repair. It works nearly identically to the token range repair with the exception that it asserts that one of the nodes only sends data and the other only receives. {quote} I don't think it's necessary to make specifying --start-token and --end-token mandatory, since if that is not specified it will just pull repair all common ranges between specified hosts. {quote} The reason why I added in the check for a token range was that the repair code as it is now doesn’t actually add only the common ranges between the specified hosts. I wasn’t sure if this is was the intended behavior or a bug. To replicate the issue, just create a 3 node cluster, add a keyspace with replication factor 2, and run a regular repair through nodetool on that keyspace with exactly two nodes specified. The reason it happens is that if no ranges are specified, the repair will [add all ranges on the local node|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L3137]. 
Then when we hit {{RepairRunnable}}, we try to [find a list of neighbors for each range|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairRunnable.java#L160-L162]. The problem here is that it isn’t always true that every range the local node owns is also owned by the remote node we specified through the nodetool command. In the example above, only one range will be common between any two nodes. Because of this the [check here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ActiveRepairService.java#L246-L251] may result in an exception being thrown, which aborts the repair. If this is intended behavior, then forcing the user to specify a token range that is common between the nodes prevents that exception from being thrown. Otherwise the error message, “Repair requires at least two endpoints that are neighbours before it can continue” can be confusing to the operator since the two specified nodes may actually share a common range. What do you think? > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://i
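The failure mode described in this comment is easiest to see with a toy replica map: with three nodes and replication factor 2, any two nodes share only some of their ranges. A hedged sketch (hypothetical range names, no real token math):

```python
# Illustrative model of the scenario from the comment above: a 3-node
# cluster with RF=2, where each range is replicated on exactly two nodes.
def common_ranges(replica_map, node_a, node_b):
    # Ranges replicated on BOTH endpoints: the only ranges a two-host
    # targeted repair can meaningfully cover.
    return [r for r, replicas in replica_map.items()
            if node_a in replicas and node_b in replicas]

replica_map = {
    "r1": {"n1", "n2"},
    "r2": {"n2", "n3"},
    "r3": {"n3", "n1"},
}
# n1 owns r1 and r3, but only r1 is shared with n2; starting from ALL of
# n1's ranges (as the current code does) hits the "no neighbours" error
# for r3 instead of repairing just the common range.
print(common_ranges(replica_map, "n1", "n2"))  # ['r1']
```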
[jira] [Comment Edited] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414586#comment-15414586 ] Geoffrey Yu edited comment on CASSANDRA-9876 at 8/10/16 1:50 AM: - Thanks for the quick review! I’ve attached a new patch that addresses your comments, with the exception of one of them for which I wanted to get some more feedback first. I also attached a patch that adds one dtest to test the pull repair. It works nearly identically to the token range repair with the exception that it asserts that one of the nodes only sends data and the other only receives. {quote} I don't think it's necessary to make specifying --start-token and --end-token mandatory, since if that is not specified it will just pull repair all common ranges between specified hosts. {quote} The reason I added the check for a token range was that the repair code as it is now doesn’t actually add only the common ranges between the specified hosts. I wasn’t sure if this was the intended behavior or a bug. To replicate the issue, just create a 3 node cluster, add a keyspace with replication factor 2, and run a regular repair through nodetool on that keyspace with exactly two nodes specified. The reason it happens is that if no ranges are specified, the repair will [add all ranges on the local node|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L3137]. Then when we hit {{RepairRunnable}}, we try to [find a list of neighbors for each range|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairRunnable.java#L160-L162]. The problem here is that it isn’t always true that every range the local node owns is also owned by the remote node we specified through the nodetool command. In the example above, only one range will be common between any two nodes. 
Because of this the [check here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ActiveRepairService.java#L246-L251] may result in an exception being thrown, which aborts the repair. If this is intended behavior, then forcing the user to specify a token range that is common between the nodes prevents that exception from being thrown. Otherwise the error message, “Repair requires at least two endpoints that are neighbours before it can continue” can be confusing to the operator since the two specified nodes may actually share a common range. What do you think? > One way targeted repair >
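As a rough sketch of the common-range behavior discussed in this comment (using a simplified, non-wrapping, hypothetical stand-in for Cassandra's {{Range<Token>}}, not the project's actual classes): repairing two hosts should cover only the intersection of the ranges they both own, not everything the local node owns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: intersect the token ranges owned by two nodes.
// With RF=2 on a 3-node cluster, any two nodes share exactly one range,
// even though each node owns two.
public final class CommonRanges {
    static final class Range {
        final long start, end; // half-open (start, end], no wrap-around
        Range(long start, long end) { this.start = start; this.end = end; }
    }

    static List<Range> intersect(List<Range> local, List<Range> remote) {
        List<Range> common = new ArrayList<>();
        for (Range a : local)
            for (Range b : remote) {
                long s = Math.max(a.start, b.start);
                long e = Math.min(a.end, b.end);
                if (s < e) common.add(new Range(s, e)); // keep real overlaps only
            }
        return common;
    }

    public static void main(String[] args) {
        List<Range> node1 = new ArrayList<>();
        node1.add(new Range(0, 100));
        node1.add(new Range(100, 200));
        List<Range> node2 = new ArrayList<>();
        node2.add(new Range(100, 200));
        node2.add(new Range(200, 300));
        // Only (100, 200] is common to both nodes.
        System.out.println(intersect(node1, node2).size()); // 1
    }
}
```

A repair restricted to this intersection would never hit the "requires at least two endpoints that are neighbours" exception described above.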
[jira] [Updated] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9876: --- Status: Awaiting Feedback (was: Open) > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-dtest-master.txt, 9876-trunk-v2.txt, 9876-trunk.txt > > > Many applications use C* by writing to one local DC. The other DC is used > when the local DC is unavailable. When the local DC becomes available, we > want to run a targeted repair b/w one endpoint from each DC to minimize the > data transfer over WAN. In this case, it will be helpful to do a one way > repair in which data will only be streamed from other DC to local DC instead > of streaming the data both ways. This will further minimize the traffic over > WAN. This feature should only be supported if a targeted repair is run > involving 2 hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9876: --- Attachment: 9876-dtest-master.txt 9876-trunk-v2.txt Thanks for the quick review! I’ve attached a new patch that addresses your comments, with the exception of one of them, for which I wanted to get some more feedback first. I also attached a patch that adds one dtest to test the pull repair. It works nearly identically to the token range repair, except that it asserts that one of the nodes only sends data and the other only receives. {quote} I don't think it's necessary to make specifying --start-token and --end-token mandatory, since if that is not specified it will just pull repair all common ranges between specified hosts. {quote} The reason I added the check for a token range is that the repair code as it stands doesn’t actually restrict the repair to only the common ranges between the specified hosts. I wasn’t sure whether this was the intended behavior or a bug. To reproduce the issue, create a 3-node cluster, add a keyspace with replication factor 2, and run a regular repair through nodetool on that keyspace with exactly two nodes specified. The reason it happens is that if no ranges are specified, the repair will [add all ranges on the local node|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L3137]. Then when we hit {{RepairRunnable}}, we try to [find a list of neighbors for each range|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/repair/RepairRunnable.java#L160-L162]. The problem here is that it isn’t always true that every range the local node owns is also owned by the remote node we specified through the nodetool command. In the example above, only one range will be common between any two nodes.
Because of this the [check here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ActiveRepairService.java#L246-L251] may result in an exception being thrown, which aborts the repair. If this is intended behavior, then forcing the user to specify a token range that is common between the nodes prevents that exception from being thrown. Otherwise the error message, “Repair requires at least two endpoints that are neighbours before it can continue” can be confusing to the operator since the two specified nodes may actually share a common range. What do you think? > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-dtest-master.txt, 9876-trunk-v2.txt, 9876-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12256: Attachment: 12256-trunk.txt I've attached a first pass at this ticket. The majority of the changes pass the query start timestamp all the way down to the {{ReadCallback}} and {{AbstractWriteResponseHandler}}. The timestamp is recorded when the {{QueryState}} is created for a particular query. > Properly respect the request timeouts > - > > Key: CASSANDRA-12256 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12256 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Geoffrey Yu > Fix For: 3.x > > Attachments: 12256-trunk.txt > > > We have a number of {{request_timeout_*}} options that probably every user expects to be an upper bound on how long the coordinator will wait before timing out a request, but that's actually not always the case, especially for read requests. > I believe we don't respect those timeouts properly in at least the following cases: > * On a digest mismatch: in that case, we reset the timeout for the data query, which means the overall query might take up to twice the configured timeout before timing out. > * On a range query: the timeout is reset for every sub-range that is queried. With many nodes and vnodes, a range query could span tons of sub-ranges, so a range query could take pretty much arbitrarily long before actually timing out for the user. > * On short reads: we also reset the timeout for every short-read "retry". > It's also worth noting that even outside those cases, the timeouts don't take most of the processing done by the coordinator (query parsing and CQL handling, for instance) into account. > Now, in all fairness, the reason it is this way is that the timeouts currently are *not* timeouts for the full user request, but rather how long a coordinator should wait on any given replica for any given internal query before giving up. *However*, I'm pretty sure this is not what users intuitively expect and want, *especially* in the context of CASSANDRA-2848, where the goal is explicitly to have an upper bound on the query from the user's point of view. > So I'm suggesting we change how those timeouts are handled to really be timeouts on the whole user query. > And by that I basically just mean that we'd mark the start of each query as soon as possible in the processing, and use that starting time as the base in {{ReadCallback.await}} and {{AbstractWriteResponseHandler.get()}}. It won't be perfect in the sense that we'll still only possibly time out during "blocking" operations, so typically if parsing a query takes more than your timeout, you still won't time out until that query is sent, but I think that's probably fine in practice because 1) if your timeouts are small enough that this matters, you're probably doing it wrong and 2) we can totally improve on that later if need be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
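The fix the ticket proposes, timing out against the query's start rather than resetting per internal request, can be sketched as follows. This is a hypothetical illustration of the idea, not the actual code in {{ReadCallback.await}} or the attached patch:

```java
// Sketch (hypothetical class): one deadline anchored at the query's start.
// A digest-mismatch data read, a sub-range query, or a short-read retry all
// wait only for whatever remains of the original budget, instead of each
// getting a fresh full timeout.
public final class QueryDeadline {
    private final long queryStartNanos;
    private final long timeoutNanos;

    public QueryDeadline(long queryStartNanos, long timeoutNanos) {
        this.queryStartNanos = queryStartNanos;
        this.timeoutNanos = timeoutNanos;
    }

    /** Nanoseconds left before the whole user query times out (never negative). */
    public long remainingNanos(long nowNanos) {
        long elapsed = nowNanos - queryStartNanos;
        return Math.max(0L, timeoutNanos - elapsed);
    }

    public static void main(String[] args) {
        QueryDeadline d = new QueryDeadline(0L, 2_000_000_000L); // 2s read timeout
        // After 1.5s spent on the digest round, the data read gets only 0.5s.
        System.out.println(d.remainingNanos(1_500_000_000L)); // 500000000
    }
}
```

Passing the recorded {{QueryState}} creation time down as {{queryStartNanos}} is exactly what makes every blocking wait share one budget.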
[jira] [Updated] (CASSANDRA-12256) Properly respect the request timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12256: Fix Version/s: 3.x Status: Patch Available (was: Open) > Properly respect the request timeouts > - > > Key: CASSANDRA-12256 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12256 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Geoffrey Yu > Fix For: 3.x > > Attachments: 12256-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12311: Attachment: 12311-dtest.txt Thanks for the help with the driver and the example test! I really appreciate it. :) I've attached a patch meant to be applied on top of your {{CASSANDRA-12311-tests}} branch. It modifies {{paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions}} and {{write_failure_tests.py}} so that they check for the failure map when protocol v5 is used. I also added another file, {{read_failure_tests.py}} to test read failures due to reading too many tombstones. I modeled it after {{write_failure_tests.py}} and added tests for protocol v3 and v4 as well. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-dtest.txt, 12311-trunk-v2.txt, 12311-trunk-v3.txt, > 12311-trunk-v4.txt, 12311-trunk-v5.txt, 12311-trunk.txt > > > Right now if a data node fails to perform a read because it ran into a > {{TombstoneOverwhelmingException}}, it only responds back to the coordinator > node with a generic failure. Under this scheme, the coordinator won't be able > to know exactly why the request failed and subsequently the client only gets > a generic {{ReadFailureException}}. It would be useful to inform the client > that their read failed because we read too many tombstones. We should have > the data nodes reply with a failure type so the coordinator can pass this > information to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410124#comment-15410124 ] Geoffrey Yu commented on CASSANDRA-12311: - Thanks! I really appreciate it. :) > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk-v4.txt, 12311-trunk-v5.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410118#comment-15410118 ] Geoffrey Yu commented on CASSANDRA-12311: - Unfortunately I'm not quite familiar with the python driver. :( > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk-v4.txt, 12311-trunk-v5.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12311: Attachment: 12311-trunk-v5.txt Thanks! I've attached a patch with changes to the documentation and also unit tests to cover the serialization and deserialization of read/write failure error messages. I took a look at the dtests but, since this changes the encoding for the client-facing protocol, won't the python driver need to be changed first to recognize the new failure code map? > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk-v4.txt, 12311-trunk-v5.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12311: Attachment: 12311-trunk-v4.txt I've attached a patch with {{failures}} removed. I removed it from the exceptions themselves, which does have the implication that we lose some information when decoding an {{ErrorMessage}} while using protocol v4 (i.e. we can't meaningfully re-create the failure reason code map with just the number of failures). I feel that this is okay since as far as I'm aware, decoding the number of failures is meaningful (in this codebase) only when it is actually being used by {{o.a.c.transport.Client}} client-side. Let me know if I should change this. I'll get working on the dtests and update here once I have them done. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk-v4.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9876: --- Fix Version/s: 3.x Status: Patch Available (was: Open) > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 3.x > > Attachments: 9876-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-9876: --- Attachment: 9876-trunk.txt I've attached a patch that implements what is described in the ticket. Specifically, it adds a new option {{--pull-repair}} to {{nodetool repair}} that can be used as follows: {{nodetool repair --in-hosts <host1>,<host2> --start-token <start> --end-token <end> --pull-repair}} Suppose {{<host1>}} is the node where the command is being run. Then {{<host1>}} will only request data from {{<host2>}} during the streaming step (if there is a mismatch) but will not send any data to {{<host2>}}. The node where the command is being run must be one of the two nodes specified by {{--in-hosts}}. And of course, the token range specified must be a range that both nodes own. > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor > Attachments: 9876-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
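The two preconditions stated in that comment, exactly two hosts and the local node among them, can be sketched as a validation helper. The class and messages below are hypothetical illustrations, not code from the attached patch:

```java
import java.util.Set;

// Hypothetical sketch of the pull-repair preconditions described above:
// pull repair is only supported between exactly two hosts, and the node
// running the command must be one of the two --in-hosts entries.
public final class PullRepairValidation {
    public static void validate(String localHost, Set<String> inHosts) {
        if (inHosts.size() != 2)
            throw new IllegalArgumentException("Pull repair requires exactly two hosts");
        if (!inHosts.contains(localHost))
            throw new IllegalArgumentException("The local node must be one of the --in-hosts hosts");
    }
}
```

With this shape, a misconfigured invocation fails fast with a message naming the violated constraint instead of surfacing a confusing neighbor-check error later in the repair.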
[jira] [Assigned] (CASSANDRA-9876) One way targeted repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu reassigned CASSANDRA-9876: -- Assignee: Geoffrey Yu > One way targeted repair > --- > > Key: CASSANDRA-9876 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9876 > Project: Cassandra > Issue Type: Improvement >Reporter: sankalp kohli >Assignee: Geoffrey Yu >Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408173#comment-15408173 ] Geoffrey Yu commented on CASSANDRA-12311: - Also, for what it's worth, while going through the protocol documentation I noticed that {{\[byte\]}} is referenced a few times but never explicitly defined under the "Notations" section. This could lead to ambiguity when it is used to define an encoding for an integer (i.e. signed or unsigned). Is this something we should consider adding to the specification? (I'm guessing that, when used as an integer, it was intended to be interpreted as an unsigned 8-bit integer.) > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
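The signed-vs-unsigned ambiguity flagged in this comment is concrete in Java, where {{byte}} is always signed: a decoder that forgets to mask sign-extends {{0xFF}} to -1. A minimal sketch (hypothetical helper, not driver code):

```java
// Sketch: if the protocol spec defines [byte] as an unsigned 8-bit integer,
// a Java decoder must mask off the sign extension when widening to int.
public final class UnsignedByte {
    public static int decode(byte b) {
        return b & 0xFF; // 0xFF read as a signed Java byte is -1; masked, it is 255
    }

    public static void main(String[] args) {
        System.out.println(decode((byte) 0xFF)); // 255
    }
}
```

Reading the same wire byte without the mask would yield -1, which is exactly the kind of divergence an explicit "Notations" entry would prevent.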
[jira] [Comment Edited] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406949#comment-15406949 ] Geoffrey Yu edited comment on CASSANDRA-12311 at 8/4/16 1:21 AM: - Those ideas sound good to me. I can see how having extensibility in the failure codes can be useful so that we don't need to wait for protocol version bumps. Also passing back an endpoint to failure code map would be nice since we won't need to interpret the potentially different responses from the replicas at the coordinator to determine which (single) failure code should be used. I attached a patch with those changes incorporated. Since we need to pass some sort of failure code back from the replicas, I wanted to use the same set of failure codes between nodes as between the client and coordinator. So I placed the codes in a new enum {{RequestFailureReason}} and placed the map under {{RequestFailureException}}, meaning {{WriteFailureException}}s will carry this endpoint to failure code map as well. Please let me know what you think. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12311: Attachment: 12311-trunk-v3.txt Those ideas sound good to me. I can see how having extensibility in the failure codes can be useful so that we don't need to wait for protocol version bumps. Also passing back an endpoint to failure code map would be nice since we won't need to interpret the potentially different responses from the replicas at the coordinator to determine which (single) failure code should be used. I attached a patch with those changes incorporated. Since we need to pass some sort of failure code back from the replicas, I wanted to use the same set of failure codes between nodes as between the client and coordinator. So I placed the codes in a new enum {{RequestFailureReason}} and placed the map under {{RequestFailureException}}, meaning {{WriteFailureException}}s will carry this endpoint to failure code map as well. Please let me know what you think. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Labels: client-impacting, doc-impacting > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk-v3.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
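The design described in this comment, a shared failure-reason enum with stable numeric codes plus an endpoint-to-reason map on the coordinator, can be sketched as below. The enum values and codes here are illustrative assumptions, not the actual constants in the patch:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the comment's idea: one RequestFailureReason enum shared between
// node-to-node and coordinator-to-client paths, with numeric codes so new
// reasons can be added without a protocol version bump. Codes are made up.
public final class FailureMapSketch {
    enum RequestFailureReason {
        UNKNOWN(0),
        READ_TOO_MANY_TOMBSTONES(1);

        final int code;
        RequestFailureReason(int code) { this.code = code; }

        // Unrecognized codes fall back to UNKNOWN, keeping decoding extensible.
        static RequestFailureReason fromCode(int code) {
            for (RequestFailureReason r : values())
                if (r.code == code) return r;
            return UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // Coordinator-side map of replica endpoint -> failure reason,
        // forwarded to the client instead of a single collapsed code.
        Map<String, RequestFailureReason> failureReasonByEndpoint = new HashMap<>();
        failureReasonByEndpoint.put("10.0.0.2", RequestFailureReason.READ_TOO_MANY_TOMBSTONES);
        System.out.println(failureReasonByEndpoint.get("10.0.0.2"));
    }
}
```

Because the client receives the whole map, it can see which replica failed for which reason rather than guessing from an aggregate count.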
[jira] [Created] (CASSANDRA-12367) Add an API to request the size of a CQL partition
Geoffrey Yu created CASSANDRA-12367: --- Summary: Add an API to request the size of a CQL partition Key: CASSANDRA-12367 URL: https://issues.apache.org/jira/browse/CASSANDRA-12367 Project: Cassandra Issue Type: Improvement Reporter: Geoffrey Yu Assignee: Geoffrey Yu Priority: Minor It would be useful to have an API that we could use to get the total serialized size of a CQL partition, scoped by keyspace and table, on disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Geoffrey Yu updated CASSANDRA-12311: Attachment: 12311-trunk-v2.txt I've attached an updated patch that removes the new exception and instead adds a new {{reason}} field within {{ReadFailureException}} that can be used to indicate why the read query failed. > Propagate TombstoneOverwhelmingException to the client > -- > > Key: CASSANDRA-12311 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12311 > Project: Cassandra > Issue Type: Improvement >Reporter: Geoffrey Yu >Assignee: Geoffrey Yu >Priority: Minor > Fix For: 4.x > > Attachments: 12311-trunk-v2.txt, 12311-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400265#comment-15400265 ]

Geoffrey Yu commented on CASSANDRA-12311:
-----------------------------------------

Sure, that sounds reasonable to me. I'll make the changes and update the patch.

> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a {{TombstoneOverwhelmingException}}, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic {{ReadFailureException}}. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12311:
------------------------------------
Description: 
Right now if a data node fails to perform a read because it ran into a {{TombstoneOverwhelmingException}}, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic {{ReadFailureException}}. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

  was:
Right now if a data node fails to perform a read because it ran into a TombstoneOverwhelmingException, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic ReadFailureException. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a {{TombstoneOverwhelmingException}}, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic {{ReadFailureException}}. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12311:
------------------------------------
Attachment: 12311-trunk.txt

> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Attachments: 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a TombstoneOverwhelmingException, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic ReadFailureException. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
[ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12311:
------------------------------------
Fix Version/s: 4.x
Status: Patch Available  (was: Open)

I've attached a proposed patch that implements these changes. It adds a new exception code and also makes changes to internode messaging, so I've marked it for 4.x.

> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
> Key: CASSANDRA-12311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a TombstoneOverwhelmingException, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic ReadFailureException. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
Geoffrey Yu created CASSANDRA-12311:
---------------------------------------

Summary: Propagate TombstoneOverwhelmingException to the client
Key: CASSANDRA-12311
URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
Project: Cassandra
Issue Type: Improvement
Reporter: Geoffrey Yu
Assignee: Geoffrey Yu
Priority: Minor


Right now if a data node fails to perform a read because it ran into a TombstoneOverwhelmingException, it only responds back to the coordinator node with a generic failure. Under this scheme, the coordinator won't be able to know exactly why the request failed and subsequently the client only gets a generic ReadFailureException. It would be useful to inform the client that their read failed because we read too many tombstones. We should have the data nodes reply with a failure type so the coordinator can pass this information to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12106:
------------------------------------
Fix Version/s: 4.x
Status: Patch Available  (was: Open)

I've attached a patch that implements this. There are a lot of changes, so I thought I'd highlight the high-level approach I took to make it easier to understand what is going on.

This patch lets us blacklist any particular CQL partition, scoped by keyspace and table. Any reads/writes to a blacklisted partition will be rejected, and the client will receive a Read/WriteRejectedException accordingly. The mechanism for blacklisting a partition is exposed through a nodetool command.

The approach is to perform the rejection at the data replica level, so that each node only needs to be aware of blacklisted partitions for ranges it owns, allowing this to scale to larger clusters. The blacklist is stored in a new table under the {{system_distributed}} keyspace. Each node then maintains an in-memory cache of the blacklist entries corresponding to its token ranges.

For single-partition reads and writes, we reject the request as soon as one replica responds with a rejection. For partition range reads, we reject the request if there is a blacklisted partition within the range. CAS writes are rejected by the data nodes only on the prepare/promise step, and potentially when the coordinator performs the read before the propose/accept step; if the write proceeds past these points, the mutation will be allowed to be applied.

A mutation in a batch log that is rejected will not be considered a "failure" in the replay. This means that if all mutations were either applied successfully or rejected, the replay is considered successful and the batch log is deleted. Mutations that are rejected are not hinted, and any hints that are rejected when replayed will still be considered "successful" and deleted.

There are also changes included to keep the cache consistent when a node starts up, undergoes a range movement, or is decommissioned.

> Add ability to blacklist a CQL partition so all requests are ignored
> --------------------------------------------------------------------
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the data present. It would be useful to have a manual way to blacklist such partitions so all read and write requests to them are rejected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
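The replica-side check at the heart of the approach can be illustrated with a minimal, hypothetical sketch of the in-memory blacklist cache. The class and method names here are invented for the example; the real patch would populate its cache from the {{system_distributed}} blacklist table rather than via direct calls.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch (not the patch's actual classes) of a per-replica
// blacklist cache keyed by (keyspace, table, partition key).
public class PartitionBlacklistCache {
    private final Set<String> entries = new HashSet<>();

    private static String key(String keyspace, String table, String partitionKey) {
        return keyspace + "/" + table + "/" + partitionKey;
    }

    // In the real patch, entries would be loaded from the blacklist table in
    // the system_distributed keyspace, restricted to token ranges this node owns.
    public void blacklist(String keyspace, String table, String partitionKey) {
        entries.add(key(keyspace, table, partitionKey));
    }

    // Consulted on the read/write path: a hit means the replica rejects the
    // request instead of executing it.
    public boolean isBlacklisted(String keyspace, String table, String partitionKey) {
        return entries.contains(key(keyspace, table, partitionKey));
    }
}
```

Keeping the check replica-local like this is what lets the design scale: no node needs a view of the whole cluster's blacklist, only of the ranges it serves.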
[jira] [Updated] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored
[ https://issues.apache.org/jira/browse/CASSANDRA-12106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12106:
------------------------------------
Attachment: 12106-trunk.txt

> Add ability to blacklist a CQL partition so all requests are ignored
> --------------------------------------------------------------------
>
> Key: CASSANDRA-12106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Attachments: 12106-trunk.txt
>
>
> Sometimes reads/writes to a given partition may cause problems due to the data present. It would be useful to have a manual way to blacklist such partitions so all read and write requests to them are rejected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-2848:
-----------------------------------
Status: Patch Available  (was: Open)

I've attached a patch implementing this, and would love some feedback! At a high level, the approach I took was to use the last flag available in the protocol to let the client indicate whether it supplied a timeout (as a {{long}}, in milliseconds). Cassandra then uses the minimum of the client-specified timeout and the configured RPC timeout. The rest of the changes are essentially for passing the client-supplied timeout down to where it's actually needed. I also bumped the messaging service version to allow passing the timeout to the replica nodes as part of serialization/deserialization for {{ReadCommand}} and {{Mutation}}.

> Make the Client API support passing down timeouts
> -------------------------------------------------
>
> Key: CASSANDRA-2848
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Goffinet
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 3.x
>
> Attachments: 2848-trunk.txt
>
>
> Having a max server RPC timeout is good for worst case, but many applications that have middleware in front of Cassandra, might have higher timeout requirements. In a fail fast environment, if my application starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra, we might only have 10ms. I propose we provide the ability to specify the timeout on each call we do optionally.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
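The timeout-selection rule described in the patch (take the smaller of the client-supplied timeout and the configured RPC timeout) can be sketched as below. The class and method names are hypothetical, with a null client value standing in for the protocol flag being unset.

```java
// Hypothetical sketch of the effective-timeout rule from the patch: use the
// minimum of the client-supplied timeout and the configured RPC timeout.
// Names are illustrative, not Cassandra's actual API.
public class TimeoutResolver {
    static long effectiveTimeoutMillis(Long clientTimeoutMillis, long rpcTimeoutMillis) {
        if (clientTimeoutMillis == null)  // protocol flag unset: client sent no timeout
            return rpcTimeoutMillis;
        // A client may tighten the deadline but never loosen it past the
        // server's configured maximum.
        return Math.min(clientTimeoutMillis, rpcTimeoutMillis);
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeoutMillis(10L, 10000L));   // 10
        System.out.println(effectiveTimeoutMillis(null, 10000L));  // 10000
    }
}
```

This matches the fail-fast motivation in the ticket: a front-end with only 10 ms of budget left can pass that down, while the server-side RPC timeout still caps worst-case behavior.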
[jira] [Updated] (CASSANDRA-2848) Make the Client API support passing down timeouts
[ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-2848:
-----------------------------------
Attachment: 2848-trunk.txt

> Make the Client API support passing down timeouts
> -------------------------------------------------
>
> Key: CASSANDRA-2848
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Goffinet
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 3.x
>
> Attachments: 2848-trunk.txt
>
>
> Having a max server RPC timeout is good for worst case, but many applications that have middleware in front of Cassandra, might have higher timeout requirements. In a fail fast environment, if my application starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra, we might only have 10ms. I propose we provide the ability to specify the timeout on each call we do optionally.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378423#comment-15378423 ]

Geoffrey Yu commented on CASSANDRA-12178:
-----------------------------------------

Okay, that makes sense. Thanks for the quick review!

> Add prefixes to the name of snapshots created before a truncate or drop
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-12178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12178
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 3.x
>
> Attachments: 12178-3.0.txt, 12178-trunk.txt
>
>
> It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12178:
------------------------------------
Attachment: 12178-3.0.txt

> Add prefixes to the name of snapshots created before a truncate or drop
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-12178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12178
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 12178-3.0.txt, 12178-trunk.txt
>
>
> It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12178:
------------------------------------
Fix Version/s: 3.0.x

> Add prefixes to the name of snapshots created before a truncate or drop
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-12178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12178
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 12178-3.0.txt, 12178-trunk.txt
>
>
> It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12178:
------------------------------------
Attachment: 12178-trunk.txt

> Add prefixes to the name of snapshots created before a truncate or drop
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-12178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12178
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Attachments: 12178-trunk.txt
>
>
> It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
[ https://issues.apache.org/jira/browse/CASSANDRA-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12178:
------------------------------------
Status: Patch Available  (was: Open)

> Add prefixes to the name of snapshots created before a truncate or drop
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-12178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12178
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Attachments: 12178-trunk.txt
>
>
> It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-12178) Add prefixes to the name of snapshots created before a truncate or drop
Geoffrey Yu created CASSANDRA-12178:
---------------------------------------

Summary: Add prefixes to the name of snapshots created before a truncate or drop
Key: CASSANDRA-12178
URL: https://issues.apache.org/jira/browse/CASSANDRA-12178
Project: Cassandra
Issue Type: Improvement
Reporter: Geoffrey Yu
Assignee: Geoffrey Yu
Priority: Minor


It would be useful to be able to identify snapshots that are taken because a table was truncated or dropped. We can do this by prepending a prefix to snapshot names for snapshots that are created before a truncate/drop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
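The prefixing idea above can be sketched minimally as follows. The prefix strings and method names are assumptions for illustration; the ticket does not specify the exact prefixes the patch uses.

```java
// Hypothetical sketch of the naming scheme: prepend a marker so snapshots
// taken automatically before a truncate or drop are identifiable at a
// glance. The actual prefixes chosen by the patch may differ.
public class SnapshotNames {
    static String preTruncateName(long timestampMillis) {
        return "truncated-" + timestampMillis;
    }

    static String preDropName(long timestampMillis) {
        return "dropped-" + timestampMillis;
    }
}
```

With a scheme like this, an operator listing snapshots can tell at a glance which ones were created as safety copies before destructive operations.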
[jira] [Created] (CASSANDRA-12106) Add ability to blacklist a CQL partition so all requests are ignored
Geoffrey Yu created CASSANDRA-12106:
---------------------------------------

Summary: Add ability to blacklist a CQL partition so all requests are ignored
Key: CASSANDRA-12106
URL: https://issues.apache.org/jira/browse/CASSANDRA-12106
Project: Cassandra
Issue Type: New Feature
Reporter: Geoffrey Yu
Assignee: Geoffrey Yu
Priority: Minor


Sometimes reads/writes to a given partition may cause problems due to the data present. It would be useful to have a manual way to blacklist such partitions so all read and write requests to them are rejected.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12076) Add username to AuthenticationException messages
[ https://issues.apache.org/jira/browse/CASSANDRA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12076:
------------------------------------
Attachment: 12076-dtest-master.txt

> Add username to AuthenticationException messages
> ------------------------------------------------
>
> Key: CASSANDRA-12076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Trivial
>
> Attachments: 12076-dtest-master.txt, 12076-trunk-v2.txt, 12076-trunk.txt
>
>
> When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12076) Add username to AuthenticationException messages
[ https://issues.apache.org/jira/browse/CASSANDRA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353742#comment-15353742 ]

Geoffrey Yu commented on CASSANDRA-12076:
-----------------------------------------

I attached a patch for the affected dtests in auth_test.py. The patched tests ran fine locally.

> Add username to AuthenticationException messages
> ------------------------------------------------
>
> Key: CASSANDRA-12076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Trivial
>
> Attachments: 12076-dtest-master.txt, 12076-trunk-v2.txt, 12076-trunk.txt
>
>
> When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12076) Add username to AuthenticationException messages
[ https://issues.apache.org/jira/browse/CASSANDRA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347482#comment-15347482 ]

Geoffrey Yu commented on CASSANDRA-12076:
-----------------------------------------

Absolutely - I made the changes and attached a new patch. How do the messages look now? As for the dtests, which version should I be restricting the existing relevant tests to?

> Add username to AuthenticationException messages
> ------------------------------------------------
>
> Key: CASSANDRA-12076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Trivial
>
> Attachments: 12076-trunk-v2.txt, 12076-trunk.txt
>
>
> When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12076) Add username to AuthenticationException messages
[ https://issues.apache.org/jira/browse/CASSANDRA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12076:
------------------------------------
Attachment: 12076-trunk-v2.txt

> Add username to AuthenticationException messages
> ------------------------------------------------
>
> Key: CASSANDRA-12076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Trivial
>
> Attachments: 12076-trunk-v2.txt, 12076-trunk.txt
>
>
> When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12076) Add username to AuthenticationException messages
[ https://issues.apache.org/jira/browse/CASSANDRA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12076:
------------------------------------
Status: Patch Available  (was: Open)

> Add username to AuthenticationException messages
> ------------------------------------------------
>
> Key: CASSANDRA-12076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Trivial
>
> Attachments: 12076-trunk.txt
>
>
> When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12076) Add username to AuthenticationException messages
[ https://issues.apache.org/jira/browse/CASSANDRA-12076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12076:
------------------------------------
Attachment: 12076-trunk.txt

> Add username to AuthenticationException messages
> ------------------------------------------------
>
> Key: CASSANDRA-12076
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Trivial
>
> Attachments: 12076-trunk.txt
>
>
> When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-12076) Add username to AuthenticationException messages
Geoffrey Yu created CASSANDRA-12076:
---------------------------------------

Summary: Add username to AuthenticationException messages
Key: CASSANDRA-12076
URL: https://issues.apache.org/jira/browse/CASSANDRA-12076
Project: Cassandra
Issue Type: Improvement
Reporter: Geoffrey Yu
Assignee: Geoffrey Yu
Priority: Trivial


When an {{AuthenticationException}} is thrown, there are a few places where the user that initiated the request is not included in the exception message. It can be useful to have this information included for logging purposes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12075) Include whether or not the client should retry the request when throwing a RequestExecutionException
[ https://issues.apache.org/jira/browse/CASSANDRA-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-12075:
------------------------------------
Issue Type: Improvement  (was: New Feature)

> Include whether or not the client should retry the request when throwing a RequestExecutionException
> ----------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-12075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12075
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Some requests that result in an error should not be retried by the client. Right now if the client gets an error, it has no way of knowing whether or not it should retry. We can include an extra field in each {{RequestExecutionException}} that will indicate whether the client should retry, retry on a different host, or not retry at all.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-12075) Include whether or not the client should retry the request when throwing a RequestExecutionException
Geoffrey Yu created CASSANDRA-12075:
---------------------------------------

Summary: Include whether or not the client should retry the request when throwing a RequestExecutionException
Key: CASSANDRA-12075
URL: https://issues.apache.org/jira/browse/CASSANDRA-12075
Project: Cassandra
Issue Type: New Feature
Reporter: Geoffrey Yu
Assignee: Geoffrey Yu
Priority: Minor


Some requests that result in an error should not be retried by the client. Right now if the client gets an error, it has no way of knowing whether or not it should retry. We can include an extra field in each {{RequestExecutionException}} that will indicate whether the client should retry, retry on a different host, or not retry at all.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
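The extra field proposed in this ticket might look something like the sketch below. The enum values, class name, and field name are invented for illustration; the ticket only specifies the three retry outcomes, not any concrete API.

```java
// Hypothetical sketch of a retry hint carried by request-execution
// exceptions so clients know whether retrying can help. Names are
// illustrative only.
enum RetryDecision { RETRY_SAME_HOST, RETRY_OTHER_HOST, DO_NOT_RETRY }

public class RequestExecutionFailure extends RuntimeException {
    final RetryDecision retryDecision;

    public RequestExecutionFailure(String message, RetryDecision retryDecision) {
        super(message);
        this.retryDecision = retryDecision;
    }
}
```

A driver's retry policy could then branch on this field directly instead of guessing from the exception type alone.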
[jira] [Updated] (CASSANDRA-11880) Display number of tables in cfstats
[ https://issues.apache.org/jira/browse/CASSANDRA-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-11880:
------------------------------------
Attachment: 11880-trunk.txt

> Display number of tables in cfstats
> -----------------------------------
>
> Key: CASSANDRA-11880
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11880
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Attachments: 11880-trunk.txt
>
>
> We should display the number of tables in a Cassandra cluster in {{nodetool cfstats}}. This would be useful for monitoring.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11880) Display number of tables in cfstats
[ https://issues.apache.org/jira/browse/CASSANDRA-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Yu updated CASSANDRA-11880:
------------------------------------
Status: Patch Available  (was: Open)

> Display number of tables in cfstats
> -----------------------------------
>
> Key: CASSANDRA-11880
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11880
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Geoffrey Yu
> Assignee: Geoffrey Yu
> Priority: Minor
>
> Attachments: 11880-trunk.txt
>
>
> We should display the number of tables in a Cassandra cluster in {{nodetool cfstats}}. This would be useful for monitoring.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-11880) Display number of tables in cfstats
Geoffrey Yu created CASSANDRA-11880:
---------------------------------------

Summary: Display number of tables in cfstats
Key: CASSANDRA-11880
URL: https://issues.apache.org/jira/browse/CASSANDRA-11880
Project: Cassandra
Issue Type: Improvement
Reporter: Geoffrey Yu
Priority: Minor


We should display the number of tables in a Cassandra cluster in {{nodetool cfstats}}. This would be useful for monitoring.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)