[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-09-18 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140004#comment-14140004
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

[~jbellis]  What do you think?


> Acceptable time skew for C*
> ---
>
> Key: CASSANDRA-7979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7979
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
>
> It is very hard to know the bounds on clock skew required for C* to work 
> properly. Since the resolution is based on time and is at the Thrift column 
> level, it depends on the application: how fast is the application updating 
> the same column? If you update a column after, say, 5 milliseconds and the 
> clock skew is more than that, you might not see the updates in the correct 
> order. 
> In this JIRA, I am proposing a change which will answer this question: "How 
> much clock skew is acceptable for a given application?" This will help answer 
> the question of whether the system needs some alternate NTP algorithms to 
> keep time in sync. 
> If we measure the time difference between two updates to the same column, we 
> will be able to answer the question on clock skew. 
> We can implement this in the memtable (AtomicSortedColumns.addColumn). If we 
> find that a column is updated within, say, 100 milliseconds, add the diff to 
> a histogram. Since this might have performance issues, we might want to have 
> some throttling, like randomization, or only enable it for a short time via 
> nodetool. 
> With this histogram, we will know what is an acceptable clock skew. 
> Also, apart from column resolution, is there any other area which will be 
> affected by clock skew? 
> Note: For the sake of argument, I am not talking about back-dated deletes or 
> application-modified timestamps. 
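
A minimal sketch of the measurement proposed in the description, written in Python for brevity rather than Cassandra's Java. It is not the attached patch: the class name, the `sample_rate` parameter, and the dictionary of last-seen timestamps are all assumptions made for illustration. It records the gap between consecutive writes to the same column when that gap is under 100 ms, and uses random sampling as the throttle.

```python
import random
from collections import Counter

THRESHOLD_MS = 100  # only gaps below this are recorded, as in the proposal


class UpdateDeltaRecorder:
    """Hypothetical recorder; not Cassandra's AtomicSortedColumns code."""

    def __init__(self, sample_rate=1.0, rng=None):
        self.sample_rate = sample_rate      # fraction of qualifying gaps to keep
        self.rng = rng or random.Random()
        self.last_timestamp_ms = {}         # column key -> newest timestamp seen
        self.histogram = Counter()          # gap in ms -> number of occurrences

    def add_column(self, key, timestamp_ms):
        previous = self.last_timestamp_ms.get(key)
        if previous is None:
            # First write to this column: nothing to compare against yet.
            self.last_timestamp_ms[key] = timestamp_ms
            return
        # Keep the newest timestamp, mirroring timestamp-based resolution.
        self.last_timestamp_ms[key] = max(previous, timestamp_ms)
        delta = abs(timestamp_ms - previous)
        if delta < THRESHOLD_MS and self.rng.random() < self.sample_rate:
            self.histogram[delta] += 1


recorder = UpdateDeltaRecorder()
recorder.add_column("a", 1000)
recorder.add_column("a", 1005)   # 5 ms after the previous write: recorded
recorder.add_column("a", 2000)   # 995 ms later: above the threshold, ignored
recorder.add_column("b", 1001)   # different column: no previous write
```

Lowering `sample_rate` trades accuracy for less work on the write path, which is the kind of throttling the description asks about.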



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-09-18 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140078#comment-14140078
 ] 

Benedict commented on CASSANDRA-7979:
-

I think this is an excellent idea. We can ensure low cost by only taking the 
minimum delta for all columns in a given update. Once we deliver improved 
latency metrics there will be effectively no cost/competition for maintaining 
this, but in the meantime this will keep it low enough.
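
One way to read the "minimum delta" suggestion, as a hedged Python sketch: each update contributes at most one sample, namely the smallest gap among the columns it overwrites. The function name and the dict-based row representation are assumptions for illustration, not the patch's code.

```python
def min_delta_for_update(existing, update):
    """Smallest timestamp gap among columns the update overwrites.

    `existing` and `update` map column name -> timestamp in ms.
    Returns None when the update touches no existing column.
    """
    deltas = [
        abs(timestamp - existing[name])
        for name, timestamp in update.items()
        if name in existing
    ]
    return min(deltas) if deltas else None


existing_row = {"a": 1000, "b": 1500}
incoming = {"a": 1040, "b": 1503, "c": 1503}
sample = min_delta_for_update(existing_row, incoming)  # gaps are 40 and 3
```

Only `sample` would be added to the histogram, so the cost per update is one histogram write regardless of how many columns the update carries.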



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-18 Thread Jon Haddad (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252466#comment-14252466
 ] 

Jon Haddad commented on CASSANDRA-7979:
---

Maybe I'm being dumb here, but I don't see the point of this.  If your clocks 
are skewed, why do you need Cassandra to detect it?  Don't you have to deal 
with clock skew on non-Cassandra servers as well?  If you're using 
client-supplied timestamps, wouldn't clock skew on Cassandra be a non-issue? 

> Acceptable time skew for C*
> ---
>
> Key: CASSANDRA-7979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7979
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Fix For: 2.0.12, 2.1.2, 3.0
>
> Attachments: 2.0_7979.diff, 2.0_7979_v2.txt, 2.0_7979_v3.txt, 
> 2.1_7979_v2.txt, trunk_7979.diff, trunk_7979_v2.txt


[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-18 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253089#comment-14253089
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

This does not detect clock skew. It tells you, with some accuracy, how much 
clock skew is acceptable for your application. 
There will always be clock skew across machines. This gives you a rough bound 
on how much of it you can tolerate. 



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-19 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253386#comment-14253386
 ] 

Ryan Svihla commented on CASSANDRA-7979:


Except it doesn't even do that, unless you define "rough" as "only when things 
are accurate do you have some idea how much clock skew you COULD tolerate at a 
later date".

Recently I had a client with a 2.5-minute clock skew between nodes. They were 
writing and updating their partitions in 'buckets', so their domain was not 
very tolerant of clock skew. With these metrics, however, it's quite possible 
that two writes could go back to back and, because of the variance in which 
node accepted each write, end up showing as "2.5 minutes apart" even though 
they happened at the same time. This would incorrectly show an acceptable 
clock skew of 2.5 minutes. I would have ended up arguing with the client that 
this can't be correct, hunting down this JIRA, explaining to them how it's 
calculated, and then convincing them that they were wrong and that the 
software I'm supporting is actually lying to them. 

I'm using an extreme case, but it goes to show that this number is only even 
"roughly" accurate when there is no time skew, and it's wildly inaccurate and 
misleading when there is time skew. So in the case where it would be MOST 
USEFUL, it has NO VALUE, and in all other cases, where it's roughly accurate 
(because there is no time skew), no one would care. Therefore, I would file 
this idea under the category of "bad experience for new users". 

Last-write-wins systems, like Cassandra, cannot accurately calculate 
"acceptable clock skew" any more than an individual person can diagnose 
whether their view of the world is imbalanced.
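
The 2.5-minute scenario can be reproduced with a small Python model. This is a sketch under stated assumptions: timestamps are assigned by whichever node accepts the write, each node's clock is modelled as real time plus a fixed offset, and all function names are made up for illustration.

```python
def server_timestamp_ms(true_time_ms, node_offset_ms):
    """Timestamp a node assigns: real time plus that node's clock offset."""
    return true_time_ms + node_offset_ms


def observed_delta_ms(first_write, second_write):
    """Gap the metric would record; each write is (true_time_ms, node_offset_ms)."""
    return abs(server_timestamp_ms(*second_write) - server_timestamp_ms(*first_write))


def last_write_wins(first_write, second_write):
    """Return 'first' or 'second', whichever carries the higher timestamp."""
    first_ts = server_timestamp_ms(*first_write)
    second_ts = server_timestamp_ms(*second_write)
    return "second" if second_ts >= first_ts else "first"


SKEW_MS = 150_000  # 2.5 minutes between the two nodes' clocks

# Two writes at the same real instant, accepted by different nodes.
simultaneous = observed_delta_ms((0, 0), (0, SKEW_MS))

# A write 5 ms later in real time, stamped by the node whose clock is behind.
winner = last_write_wins((0, SKEW_MS), (5, 0))
```

Here `simultaneous` equals the skew itself rather than the true gap of zero, and `winner` is the earlier write, which is the misordering the issue description worries about.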





[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-19 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253398#comment-14253398
 ] 

Ryan Svihla commented on CASSANDRA-7979:


For the record, if we renamed it to something more innocuous, such as "key 
update frequency" (a horrible name, but I'm tossing it out there to start the 
conversation), I'd be a lot less concerned.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-19 Thread Sebastian Estevez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253470#comment-14253470
 ] 

Sebastian Estevez commented on CASSANDRA-7979:
--

This kind of instrumentation probably belongs at the app level, or even in the 
driver, where you can accurately measure the time deltas between column 
updates without worrying about server clocks being in sync.

At the very least, I agree with Ryan's thoughts on naming.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253565#comment-14253565
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

"Now I'm using this extreme use case, but this goes to show, this number is 
only even "roughly" accurate when there is no time skew, and it's wildly 
inaccurate and misleading when there is time skew, so in the case where it 
would be MOST USEFUL, it has NO VALUE, and all other cases where it's roughly 
accurate (because there is no time skew) no one would care. Therefore, I would 
file this idea under the category of "bad experience for new users"."

This won't be accurate when there is a time skew. But you can use it together 
with the measured clock skew among your machines. 
There are applications which update the same columns minutes apart and some 
which update them within a few millis. This tells you which kind of 
application you have, and helps you decide which time infrastructure you will 
need in your DC.  
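
As a rough Python illustration of combining the two numbers: given the recorded update gaps and a clock skew measured separately (for example from NTP monitoring), estimate what share of updates fall inside the skew window. The function name, input shape, and sample figures are all hypothetical.

```python
def fraction_at_risk(delta_counts, measured_skew_ms):
    """Share of recorded gaps that are no larger than the measured skew.

    `delta_counts` maps a gap in ms to how many times it was seen.
    """
    total = sum(delta_counts.values())
    if total == 0:
        return 0.0
    at_risk = sum(
        count for delta, count in delta_counts.items() if delta <= measured_skew_ms
    )
    return at_risk / total


# Made-up figures: 10 updates 2 ms apart, 30 at 50 ms, 60 a minute apart.
observed = {2: 10, 50: 30, 60_000: 60}
with_tight_sync = fraction_at_risk(observed, 10)    # 10 ms skew
with_loose_sync = fraction_at_risk(observed, 100)   # 100 ms skew
```

An application whose gaps are mostly minutes apart yields a small fraction even with loose synchronization, while one updating within a few millis does not.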



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-19 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253566#comment-14253566
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

[~sebastian.este...@datastax.com]  How can you measure it from the client?

Also I don't mind changing the name of this but it is very useful to us. 



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-12-19 Thread Sebastian Estevez (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254186#comment-14254186
 ] 

Sebastian Estevez commented on CASSANDRA-7979:
--

Disregard, it would only be possible if they were coming from the same app 
server.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-08 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164375#comment-14164375
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

I have added the patch for 2.0 and trunk. I am using a Histogram to collect all 
the time diffs. So if a column is updated after 2 seconds, I will add that to 
the histogram. What I actually want is a reverse histogram, since we are 
interested in knowing what percentage of updates happened within 50 ms, 10 ms, 
etc. We want more accuracy at the lower end. Should I use 
EstimatedHistogram? 
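
A hedged Python sketch of what "more accuracy at the lower end" could look like: bucket bounds that grow geometrically, plus a cumulative share per bucket to answer "what fraction of updates happened within N ms". This is not Cassandra's EstimatedHistogram; the helper names and the growth factor of 2 are arbitrary choices for illustration.

```python
import bisect


def geometric_bounds(first_ms, factor, count):
    """Upper bounds first_ms, first_ms*factor, ... (fine at the low end)."""
    bounds = []
    bound = first_ms
    for _ in range(count):
        bounds.append(bound)
        bound *= factor
    return bounds


def bucket_counts(deltas_ms, bounds):
    """Bucket i counts gaps d with bounds[i-1] < d <= bounds[i]; last is overflow."""
    counts = [0] * (len(bounds) + 1)
    for delta in deltas_ms:
        counts[bisect.bisect_left(bounds, delta)] += 1
    return counts


def cumulative_share(counts):
    """Fraction of all samples at or below each bucket's upper bound."""
    total = sum(counts)
    shares = []
    running = 0
    for count in counts:
        running += count
        shares.append(running / total if total else 0.0)
    return shares


bounds = geometric_bounds(1, 2, 8)          # 1, 2, 4, ..., 128 ms
counts = bucket_counts([1, 3, 3, 10, 50, 2000], bounds)
shares = cumulative_share(counts)           # shares[2] covers gaps <= 4 ms
```

With these sample gaps, half of the updates fall within 4 ms, and the single 2-second gap lands in the overflow bucket without costing any resolution at the low end.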

> Acceptable time skew for C*
> ---
>
> Key: CASSANDRA-7979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7979
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
> Attachments: 2.0_7979.diff, trunk_7979.diff


[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-08 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164380#comment-14164380
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

[~benedict]  Can you please review? 



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-14 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171251#comment-14171251
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

[~JoshuaMcKenzie]  What do you think about the Histogram? 



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-21 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178475#comment-14178475
 ] 

Joshua McKenzie commented on CASSANDRA-7979:


h5. 2.0
h6. General:
# I don't see anything in there to limit the amount of sampling we're doing - 
right now it looks like we're sampling all updates rather than min delta for 
all columns as Benedict mentioned earlier.

h6. AtomicSortedColumns
# nit: Spacing on addAllWithSizeDelta. Remove the extra space after the assignment of pair
# Update javadoc for return type

h6. ColumnFamilyStore
# nit: extra space after 'timeDelta  ='

h5. trunk
h6. AtomicBTreeColumns
# In ColumnUpdater.apply, the Math.min check is redundant.  Anything is always 
going to be <= Long.MAX_VALUE


Looks pretty straightforward and appears to work as expected.  Also - we should 
probably have a 2.0 patch and a 2.1 and merge 2.1 up to trunk.
Once we've limited it to min delta per column on update we should be good to go.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-29 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189050#comment-14189050
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

"In ColumnUpdater.apply, the Math.min check is redundant. Anything is always 
going to be <= Long.MAX_VALUE"

This is where the limiting happens, as requested by Benedict. The apply method 
is called for every column and I keep the minimum. This is true for both the 
2.0 and trunk patches. 

If this sounds good, I can provide the patches for 2.0, 2.1 and trunk. 
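
A minimal sketch of the running-minimum pattern described above, not the patch itself: MinDeltaTracker and its methods are hypothetical names, and only the idea of seeding with Long.MAX_VALUE and folding with Math.min is taken from the thread. The first Math.min call against the Long.MAX_VALUE seed looks redundant in isolation, but because the result is stored back, later calls compare against the smallest delta seen so far.

```java
// Hypothetical sketch; stands in for the per-column apply step in the thread.
final class MinDeltaTracker {
    // Seeded with Long.MAX_VALUE so the first real delta always replaces it.
    private long minDelta = Long.MAX_VALUE;

    // Called once for every column that an update replaces.
    void apply(long existingTimestamp, long updateTimestamp) {
        long delta = Math.abs(updateTimestamp - existingTimestamp);
        // Store the result back so only the minimum across columns survives.
        minDelta = Math.min(minDelta, delta);
    }

    // Smallest delta seen so far, or Long.MAX_VALUE if no column was replaced.
    long minDelta() {
        return minDelta;
    }

    // True once at least one real delta has been folded in.
    boolean hasSample() {
        return minDelta != Long.MAX_VALUE;
    }
}
```

With this shape, one update that touches many columns contributes a single value (its minimum delta) to the histogram rather than one value per column, which is the sampling limit asked for in the review.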



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-30 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190296#comment-14190296
 ] 

Joshua McKenzie commented on CASSANDRA-7979:


Ah - that's on me.  I missed that we were storing the value back in 
colUpdateTimeDelta - ignore that bit of feedback.

That kills 2 birds with one stone w/regards to my notes above.  If you could 
clean up the spacing inconsistencies and comment the colUpdateTimeDelta 
holding min value logic, a 2.0 branch and 2.1 branch patch ought to do it.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-30 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190593#comment-14190593
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

Added 3 patches for 2.0, 2.1 and trunk.

> Acceptable time skew for C*
> ---
>
> Key: CASSANDRA-7979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7979
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: sankalp kohli
> Priority: Minor
> Attachments: 2.0_7979.diff, 2.0_7979_v2.txt, 2.1_7979_v2.txt, 
> trunk_7979.diff, trunk_7979_v2.txt


[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-31 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192067#comment-14192067
 ] 

Joshua McKenzie commented on CASSANDRA-7979:


Didn't even think about this until going to commit - [~jbellis]: are we frozen 
on new features for 2.0?  If so, I'll go ahead and commit to 2.1 and merge up.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-10-31 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192182#comment-14192182
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

We will need this in 2.0. 



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-11-03 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194805#comment-14194805
 ] 

Joshua McKenzie commented on CASSANDRA-7979:


[~kohlisankalp]: since new features are targeted at 2.1 or trunk and 2.0 is 
more or less feature-frozen, could you modify the 2.0 patch to be disabled by 
default and require a -D param to the JVM to enable it?  I'd be comfortable 
committing to 2.0 w/that change.



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-11-03 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194850#comment-14194850
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

Sure. Let me give you that patch. If you like, you can commit the other 2.1 and 
trunk patches. 



[jira] [Commented] (CASSANDRA-7979) Acceptable time skew for C*

2014-11-03 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194878#comment-14194878
 ] 

sankalp kohli commented on CASSANDRA-7979:
--

Attached a patch for 2.0 which makes it off by default. When it is off, the 
histogram will not have any values since it will always return Long.MAX_VALUE. 
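
A rough sketch of how an off-by-default switch with a Long.MAX_VALUE sentinel could look. The class name TimeDeltaFeature and the property name example.enable_col_update_time_delta are hypothetical; the thread only says the 2.0 patch requires a -D param to the JVM and does not name it. The property is read on each call here purely to keep the sketch easy to exercise; a real implementation would more likely read it once at startup.

```java
// Hypothetical sketch; the real flag name is not given in the thread.
final class TimeDeltaFeature {
    // Assumed property name, enabled with -Dexample.enable_col_update_time_delta=true
    static final String PROPERTY = "example.enable_col_update_time_delta";

    // Boolean.getBoolean returns true only if the system property equals "true".
    static boolean enabled() {
        return Boolean.getBoolean(PROPERTY);
    }

    // Returns the sentinel Long.MAX_VALUE when the feature is off, so callers
    // see "no sample" without a separate code path.
    static long delta(long existingTimestamp, long updateTimestamp) {
        if (!enabled()) {
            return Long.MAX_VALUE;
        }
        return Math.abs(updateTimestamp - existingTimestamp);
    }

    // The histogram is only updated for real deltas, never for the sentinel.
    static boolean shouldRecord(long delta) {
        return delta != Long.MAX_VALUE;
    }
}
```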


> Acceptable time skew for C*
> ---
>
> Key: CASSANDRA-7979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7979
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: sankalp kohli
> Priority: Minor
> Attachments: 2.0_7979.diff, 2.0_7979_v2.txt, 2.0_7979_v3.txt, 
> 2.1_7979_v2.txt, trunk_7979.diff, trunk_7979_v2.txt