[jira] [Commented] (CASSANDRA-7296) Add CL.COORDINATOR_ONLY

2016-10-07 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556007#comment-15556007
 ] 

Tupshin Harper commented on CASSANDRA-7296:
---

Given the fresh activity, I'd like to re-emphasize my support for this ticket. 
I think node/data debugging via request pinning is an excellent use of it, and 
is basically the original reason for the ticket. Spark turned out to be an 
irrelevant tangent, but there is significant benefit in supporting this 
(degenerately simple) form of consistency. If [~jjirsa]'s patch is still 
applicable (or can be made so), I'd love to see it given a fair shake.

> Add CL.COORDINATOR_ONLY
> ---
>
> Key: CASSANDRA-7296
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7296
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Tupshin Harper
>
> For reasons such as CASSANDRA-6340 and similar, it would be nice to have a 
> read that never gets distributed, and only works if the coordinator you are 
> talking to is an owner of the row.
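The proposed semantics can be sketched roughly as follows (a hypothetical illustration only; all names here are assumptions, not Cassandra internals):

```python
def coordinator_only_read(coordinator, replicas_for_key, local_data, key):
    # A COORDINATOR_ONLY read never fans out to other nodes: it succeeds
    # only when the node the client is talking to is itself a replica
    # for the row, and otherwise fails rather than forwarding.
    if coordinator not in replicas_for_key:
        raise RuntimeError("coordinator does not own this row")
    return local_data.get(key)
```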



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9779) Append-only optimization

2016-06-22 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15344434#comment-15344434
 ] 

Tupshin Harper commented on CASSANDRA-9779:
---

Basically we are talking about frozen rows (as an analogy to frozen 
collections), and I am very much in favor of this. *Many* use cases, 
particularly IoT, would be able to use such an optimization while still 
benefiting from representing data in highly structured columns.

> Append-only optimization
> 
>
> Key: CASSANDRA-9779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9779
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Jonathan Ellis
> Fix For: 3.x
>
>
> Many common workloads are append-only: that is, they insert new rows but do 
> not update existing ones.  However, Cassandra has no way to infer this and so 
> it must treat all tables as if they may experience updates in the future.
> If we added syntax to tell Cassandra about this ({{WITH INSERTS ONLY}} for 
> instance) then we could do a number of optimizations:
> - Compaction would only need to worry about defragmenting partitions, not 
> rows.  We could default to DTCS or similar.
> - CollationController could stop scanning sstables as soon as it finds a 
> matching row
> - Most importantly, materialized views wouldn't need to worry about deleting 
> prior values, which would eliminate the majority of the MV overhead
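The CollationController point above can be sketched as follows (illustrative only; real sstable reads involve merging, tombstones, and memtables, all of which are elided here):

```python
def read_row(sstables, key):
    # Under an insert-only guarantee, a row lives in at most one sstable,
    # so the read can stop at the first match instead of merging row
    # fragments from every sstable that might contain an update.
    for table in sstables:  # assumed ordered newest-first
        if key in table:
            return table[key]
    return None
```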





[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels

2016-06-14 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330525#comment-15330525
 ] 

Tupshin Harper commented on CASSANDRA-8119:
---

I like the overall approach that Tyler proposes, but I have been convinced for 
a long time that the ultimate desirable functionality would be to combine the 
above expressive consistency levels with multiple CL callbacks per request 
(e.g. one callback at LQ and another at EQ). I would love to see this ticket 
prepare the protocol/conceptual changes to support that, even though it would 
surely be prohibitive to implement multiple CL callbacks within the scope of 
this ticket.

> More Expressive Consistency Levels
> --
>
> Key: CASSANDRA-8119
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8119
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Tyler Hobbs
> Fix For: 3.x
>
>
> For some multi-datacenter environments, the current set of consistency levels 
> is too restrictive.  For example, the following consistency requirements 
> cannot be expressed:
> * LOCAL_QUORUM in two specific DCs
> * LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC
> * LOCAL_QUORUM in the local DC plus N remote replicas in any DC
> I propose that we add a new consistency level: CUSTOM.  In the v4 (or v5) 
> protocol, this would be accompanied by an additional map argument.  A map of 
> {DC: CL} or a map of {DC: int} is sufficient to cover the first example.  If 
> we accept special keys to represent "any datacenter", the second case can 
> be handled.  A similar technique could be used for "any other nodes".
> I'm not in love with the special keys, so if anybody has ideas for something 
> more elegant, feel free to propose them.  The main idea is that we want to be 
> flexible enough to cover any reasonable consistency or durability 
> requirements.
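Evaluating such a per-DC requirement map could look roughly like this (a sketch under assumed names and shapes, not the actual protocol encoding):

```python
def custom_cl_satisfied(requirements, acks_by_dc, replicas_by_dc):
    # requirements: {dc: "LOCAL_QUORUM"} or {dc: int}, as in the proposal.
    # Every listed DC must independently meet its requirement.
    for dc, req in requirements.items():
        acks = acks_by_dc.get(dc, 0)
        if req == "LOCAL_QUORUM":
            needed = replicas_by_dc[dc] // 2 + 1  # quorum of that DC's replicas
        else:
            needed = req  # explicit replica count
        if acks < needed:
            return False
    return True
```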





[jira] [Commented] (CASSANDRA-7666) Range-segmented sstables

2016-06-10 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325050#comment-15325050
 ] 

Tupshin Harper commented on CASSANDRA-7666:
---

In addition to being relevant to CASSANDRA-11989, I believe range-segmented 
sstables represent an under-appreciated potential optimization for compaction 
strategies. As a rule of thumb, we tend to recommend that STCS workloads be 
kept under 2TB or so per node. The main reason for this (besides operational 
concerns involving time to bootstrap/repair/etc.) is that STCS compaction 
performance scales sublinearly with the amount of data in a table/node, and 
the write amplification factor is substantially higher at 10TB than at 2TB. 
With range-segmented sstables, just 5 segments would allow 10TB to be isolated 
into 2TB segment sections, and as long as the cumulative IO and CPU of the 
nodes was sufficient for the total workload, performance could be sustained at 
that scale. 

I suggest that this ticket be re-opened for those two reasons.

> Range-segmented sstables
> 
>
> Key: CASSANDRA-7666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7666
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Jonathan Ellis
>  Labels: dense-storage
>
> It would be useful to segment sstables by data range (not just token range as 
> envisioned by CASSANDRA-6696).
> The primary use case is to allow deleting those data ranges for "free" by 
> dropping the sstables involved.  We should also (possibly as a separate 
> ticket) be able to leverage this information in query planning to avoid 
> unnecessary sstable reads.
> Relational databases typically call this "partitioning" the table, but 
> obviously we use that term already for something else: 
> http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
> Tokutek's take for mongodb: 
> http://docs.tokutek.com/tokumx/tokumx-partitioned-collections.html





[jira] [Updated] (CASSANDRA-11989) Rehabilitate Byte Ordered Partitioning

2016-06-10 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-11989:
---
Description: 
This is a placeholder ticket to aid in NGCC discussion and should lead to a 
design doc.

The general idea is that Byte Ordered Partitioning is the only way to maximize 
locality (beyond the healthy size of a single partition). Because random/murmur 
partitioners cannot do so, BOP has intrinsic value, assuming its operational 
downsides are eliminated. This ticket aims to address the operational 
challenges of BOP and proposes that it should become the default in the 
distant future.

http://slides.com/tupshinharper/rehabilitating_bop

https://docs.google.com/a/datastax.com/document/d/1zcvLbyZAebmvrqnKidpXlTtdICNox92pWYGKSd7SS7M/edit?usp=docslist_api

  was:
This is a placeholder ticket to aid in NGCC discussion and should lead to a 
design doc.

The general idea is that Byte Ordered Partitioning is the only way to maximize 
locality (beyond the healthy size of a single partition). Because random/murmur 
partitioners cannot do so, BOP has intrinsic value, assuming its operational 
downsides are eliminated. This ticket aims to address the operational 
challenges of BOP and proposes that it should become the default in the 
distant future.

http://slides.com/tupshinharper/rehabilitating_bop


> Rehabilitate Byte Ordered Partitioning
> --
>
> Key: CASSANDRA-11989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11989
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>  Labels: ponies
> Fix For: 4.x
>
>
> This is a placeholder ticket to aid in NGCC discussion and should lead to a 
> design doc.
> The general idea is that Byte Ordered Partitioning is the only way to maximize 
> locality (beyond the healthy size of a single partition). Because 
> random/murmur partitioners cannot do so, BOP has intrinsic value, assuming 
> its operational downsides are eliminated. This ticket aims to address the 
> operational challenges of BOP and proposes that it should become the default 
> in the distant future.
> http://slides.com/tupshinharper/rehabilitating_bop
> https://docs.google.com/a/datastax.com/document/d/1zcvLbyZAebmvrqnKidpXlTtdICNox92pWYGKSd7SS7M/edit?usp=docslist_api





[jira] [Commented] (CASSANDRA-11989) Rehabilitate Byte Ordered Partitioning

2016-06-10 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324754#comment-15324754
 ] 

Tupshin Harper commented on CASSANDRA-11989:


I'm envisioning that everything would be built off of low-level "acquire_token" 
and "release_token" type operations, and that giving nodes the ability to 
dynamically perform those two operations safely will be a prerequisite, so this 
would require a gossip enhancement. I'm avoiding depending on any more complex 
semantics, and am working on mechanisms to dynamically reallocate based on 
just those two primitives.
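The two primitives above might behave like this (a toy single-process sketch; the real prerequisite is making this safe across nodes via gossip, which is not modeled here):

```python
class TokenOwnership:
    # Illustrative only: a registry where a token has at most one owner.
    def __init__(self):
        self.owner = {}

    def acquire_token(self, node, token):
        # Fails if another node currently owns the token; idempotent
        # for the current owner.
        if self.owner.get(token) not in (None, node):
            return False
        self.owner[token] = node
        return True

    def release_token(self, node, token):
        # Only the current owner may release.
        if self.owner.get(token) == node:
            del self.owner[token]
            return True
        return False
```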






[jira] [Updated] (CASSANDRA-11989) Rehabilitate Byte Ordered Partitioning

2016-06-10 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-11989:
---
Labels: ponies  (was: )
Issue Type: Improvement  (was: Bug)






[jira] [Created] (CASSANDRA-11989) Rehabilitate Byte Ordered Partitioning

2016-06-10 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-11989:
--

 Summary: Rehabilitate Byte Ordered Partitioning
 Key: CASSANDRA-11989
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11989
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Tupshin Harper
 Fix For: 4.x


This is a placeholder ticket to aid in NGCC discussion and should lead to a 
design doc.

The general idea is that Byte Ordered Partitioning is the only way to maximize 
locality (beyond the healthy size of a single partition). Because random/murmur 
partitioners cannot do so, BOP has intrinsic value, assuming its operational 
downsides are eliminated. This ticket aims to address the operational 
challenges of BOP and proposes that it should become the default in the 
distant future.

http://slides.com/tupshinharper/rehabilitating_bop





[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-07 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319696#comment-15319696
 ] 

Tupshin Harper commented on CASSANDRA-7622:
---

+1 from me. I just wanted to make sure that write support would stay on the 
short-term roadmap. I'm fine staging it that way.

> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Tupshin Harper
>Assignee: Jeff Jirsa
> Fix For: 3.x
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.





[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-07 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319483#comment-15319483
 ] 

Tupshin Harper commented on CASSANDRA-7622:
---

As the filer of this ticket, I largely agree about the scope issue. I was 
surprised to see any notion of replication being discussed, because I didn't 
view persistence or cross-node awareness/aggregation as being a feature of 
virtual tables.

I do think that exposing JMX provides the best initial use case, and would like 
to target that first. The higher-level interface that Sylvain proposes is also 
very much in the right direction. 

That said, I disagree with one aspect. There's no reason to restrict the API to 
read-only, even initially. Most JMX metrics are read-only, and those would 
either be ignored or raise an error if a write were attempted. But JMX metrics 
that are settable should be exposed as r/w (with separate read vs. write 
permissions, of course).

If an interface is designed sufficiently to allow elegant reading and writing 
of JMX metrics, it will be widely usable for many other plugins/virtual tables 
as well.
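The read/write distinction argued for here can be sketched as follows (hypothetical names; getters/setters stand in for JMX attributes, not any actual Cassandra API):

```python
class VirtualTable:
    # A "virtual table" row backed by live attributes rather than sstables.
    # Writes to attributes not declared writable fail, mirroring how
    # read-only JMX metrics would reject writes.
    def __init__(self, attrs, writable):
        self.attrs = attrs
        self.writable = set(writable)

    def select(self, name):
        return self.attrs[name]

    def update(self, name, value):
        if name not in self.writable:
            raise PermissionError(f"{name} is read-only")
        self.attrs[name] = value
```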






[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-07 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088221#comment-15088221
 ] 

Tupshin Harper commented on CASSANDRA-8844:
---

Relying on the Cassandra libs doesn't prevent you from copying the logs 
elsewhere and processing them there, and doesn't require Cassandra to be 
running on those machines. It does require the consumer to be implemented in a 
JVM language, however. I'm not fond of that last part, and would love it if we 
formalized the format, but I suppose I'll start by reverse engineering it. :)
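A non-JVM consumer of the per-table logs (as the quoted ticket envisions) might checkpoint and resume roughly like this; the file layout, names, and marker-file convention here are all assumptions, not a specified format:

```python
import os

def consume_logs(log_dir, process):
    # Record the last fully processed logfile in a marker file inside the
    # CDC directory, and skip up to it on restart, so the daemon resumes
    # where it left off instead of re-delivering everything.
    marker = os.path.join(log_dir, "consumer.checkpoint")
    done = open(marker).read().strip() if os.path.exists(marker) else ""
    for name in sorted(f for f in os.listdir(log_dir) if f.endswith(".log")):
        if name <= done:
            continue  # processed before a restart
        process(os.path.join(log_dir, name))
        with open(marker, "w") as fh:
            fh.write(name)  # checkpoint after each completed file
```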

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to.
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred to a subsequent release, in order to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters, 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility.
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continu

[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-12-23 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069834#comment-15069834
 ] 

Tupshin Harper commented on CASSANDRA-8844:
---

While I haven't really followed how MVs are doing mutation-based repair, your 
idea to go down that path mirrors my own thinking. 

To clarify, I believe there are two separate issues:
1) Currently, nothing, including repair, is able to cause a partially 
replicated CDC table to converge towards fully CDC-replicated, even when only 
worrying about delivering the latest copy and not caring about intermediate 
mutations.
2) Intermediate mutations aren't retained, and therefore any plausible fix to 
#1, short of mutation-based repair, will still not recover all mutations that 
were applied to mutable-state columns.

So +1 to [~JoshuaMcKenzie]'s suggestion.


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-12-21 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066606#comment-15066606
 ] 

Tupshin Harper commented on CASSANDRA-8844:
---

It is designed to be RF copies for redundancy and high availability. If 
Cassandra were to deduplicate, and then the node that owned the remaining copy 
goes down, you have CDC data loss (failure to capture and send some data to a 
remote system). It is essential that the consumer be given enough capability 
that they can build a highly reliable system out of it. I believe that there 
will need to be a small number of reliably-enqueuing implementations built on 
top of CDC that will have any necessary de-dupe logic built in. What I would 
*most* like to see is a Kafka consumer of CDC that could then be used as the 
delivery mechanism to other systems. 
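The consumer-side de-dupe logic described above could be sketched like this (hypothetical: it assumes each mutation carries some stable identifier, which the CDC format would need to provide):

```python
def dedupe(stream):
    # RF copies of each mutation arrive (one per replica's CDC log).
    # The consumer, not Cassandra, keeps delivery-side state so each
    # mutation is emitted exactly once downstream (e.g. into Kafka).
    seen = set()
    for mutation_id, payload in stream:
        if mutation_id in seen:
            continue  # duplicate copy from another replica's log
        seen.add(mutation_id)
        yield payload
```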


[jira] [Commented] (CASSANDRA-7464) Retire/replace sstable2json and json2sstable

2015-09-04 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730687#comment-14730687
 ] 

Tupshin Harper commented on CASSANDRA-7464:
---

With sstable2json going away in 3.0, but with no activity or timeline on this 
ticket, it seems we are going to be left in a situation where we have no 
way to debug the contents of an sstable. That would seem to be a requirement 
for a 3.0-final release.

> Retire/replace sstable2json and json2sstable
> 
>
> Key: CASSANDRA-7464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7464
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Priority: Minor
>
> Both tools are pretty awful. They are primarily meant for debugging (there 
> are much more efficient and convenient ways to import/export data), but their 
> output manages to be hard to handle both for humans and for tools (especially 
> as soon as you have modern stuff like composites).
> There is value to having tools to export sstable contents into a format that 
> is easy to manipulate by human and tools for debugging, small hacks and 
> general tinkering, but sstable2json and json2sstable are not that.  
> So I propose that we deprecate those tools and consider writing better 
> replacements. It shouldn't be too hard to come up with an output format that 
> is more aware of modern concepts like composites, UDTs, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631882#comment-14631882
 ] 

Tupshin Harper edited comment on CASSANDRA-6477 at 7/17/15 8:55 PM:


OK, so let me summarize my view of the conflicting viewpoints here
# If the MV shares the same partition key (and only reorders the partition 
based on different clustering columns), then the problem is relatively easy. 
Unfortunately the general consensus is that a common case will be to have 
different partition keys in the MV than the base table, so we can't support 
only that easy case.
# If the MV has a different partition key than the base table, then there are 
inherently more nodes involved in fulfilling the entire request, and we have to 
address that case.
# As [~tjake] and [~jbellis] say, the more nodes involved in a query, the 
higher the risk of unavailability if the MV is updated synchronously.
# Some use cases expect synchronous updates (as argued by [~rustyrazorblade] 
and [~brianmhess]).
# But other use cases definitely do not. I think it is absurd to say that just 
because a table has an MV, every write should care about the MV. Even more 
absurd to say that adding an MV to a table will reduce the availability of all 
writes to the base table. 

Given all of those, the conclusion that both sync and async forms are necessary 
seems totally unavoidable.

Ideally, I'd like to see an extension of what [~iamaleksey] proposed above but 
be much more thorough and flexible about it.

If each request were able to pass multiple consistency-level contracts to the 
coordinator, each one could represent the expectation for a separate callback 
at the driver level.
e.g. A query to a table with a MV could express the following compound 
consistency levels. {noformat} {LQ, LOCAL_ONE{DC3,DC4}, LQ{MV1,MV2}} {noformat}
That would tell the coordinator to deliver three separate notifications back to 
the client. One when LQ in the local dc was fulfilled. Another when at least 
one copy was delivered to each of DC3 and DC4, and another when LQ was 
fulfilled in the local dc for MV1 and MV2.

and yes, you would need more flexible syntax that could express per-dc per 
table consistency, e.g. {noformat}LQ{DCs:DC3,DC4,VIEWS:MV1,MV2}{noformat}

I realize that this is a very far-fetched proposal, but I wanted to throw it 
out there as, imo, it reflects the theoretically best option that fulfills 
everybody's requirements. (and is also a very general mechanism that could be 
used in other scenarios).
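A toy model of this compound-contract idea: each write carries several consistency "contracts", and the coordinator fires a separate client notification as each one is fulfilled. The names and shapes here are purely illustrative, not a real driver API:

```python
from dataclasses import dataclass

@dataclass
class Contract:
    name: str            # e.g. "LQ", "LOCAL_ONE{DC3,DC4}", "LQ{MV1,MV2}"
    required_acks: int   # responses needed to satisfy this contract
    acks: int = 0
    fulfilled: bool = False

class CompoundWrite:
    """One write, several consistency contracts, one callback per fulfillment."""

    def __init__(self, contracts, on_fulfilled):
        self.contracts = contracts
        self.on_fulfilled = on_fulfilled  # called with the contract name

    def ack(self, tags):
        # A replica response counts toward every contract it is tagged with
        # (e.g. a response from DC3 carries the "LOCAL_ONE{DC3,DC4}" tag).
        for c in self.contracts:
            if c.name in tags and not c.fulfilled:
                c.acks += 1
                if c.acks >= c.required_acks:
                    c.fulfilled = True
                    self.on_fulfilled(c.name)
```

With contracts {LQ, LQ{MV1,MV2}}, the client would get one notification when the base-table quorum is met and a second, possibly much later, when the view quorum is met.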

Short of that, I don't think there is any choice but to support both sync and 
async forms of writes to tables with MVs.

One more point (not to distract from the above). With the current design of 
MVs, there will always be a risk of inconsistent reads (timeouts leaving data 
queryable in the primary table but not in one or more MVs) until the data is 
eventually propagated to the MV. While it would come at a high cost, RAMP would 
still be useful to provide read isolation in that scenario. 


was (Author: tupshin):
OK, so let me summarize my view of the conflicting viewpoints here
# If the MV shares the same partition key (and only reorders the partition 
based on different clustering columns), then the problem is relatively easy. 
Unfortunately the general consensus is that a common case will be to have 
different partition keys in the MV than the base table, so we can't support 
only that easy case.
# If the MV has a different partition key than the base table, then there are 
inherently more nodes involved in fulfilling the entire request, and we have to 
address that case.
# As [~tjake] and [~jbellis] say, the more nodes involved in a query, the 
higher the risk of unavailability if the MV is updated synchronously.
# Some use cases expect synchronous updates (as argued by [~rustyrazorblade] 
and [~brianmhess]
# But others use cases definitely do not. I think it is absurd to say that just 
because a table has a MV, every write should care about the MV. Even more 
absurd to say that adding an MV to a table will reduce the availability of all 
writes to the base table. 

Given all of those, the conclusion that both sync and async forms are necessary 
seems totally unavoidable.

Ideally, I'd like to see an extension of what [~iamaleksey] proposed above but 
be much more thorough and flexible about it.

If each request were able to pass multiple consistency-level contracts to the 
coordinator, each one could represent the expectation for a separate callback 
at the driver level.
e.g. A query to a table with a MV could express the following compound 
consistency levels. {noformat} {LQ, LOCAL_ONE{DC3,DC4}, LQ{MV1,MV2}} {noformat}
and yes, you would need more flexible syntax that could express per-dc per 
table consistency, e.g. {noformat}LQ{DCs:DC3,DC4,VIEWS:MV1,MV2}{noformat}
That would tell the coordinator to deliver three separate notifications back to 
the client.

[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631882#comment-14631882
 ] 

Tupshin Harper commented on CASSANDRA-6477:
---

OK, so let me summarize my view of the conflicting viewpoints here
# If the MV shares the same partition key (and only reorders the partition 
based on different clustering columns), then the problem is relatively easy. 
Unfortunately the general consensus is that a common case will be to have 
different partition keys in the MV than the base table, so we can't support 
only that easy case.
# If the MV has a different partition key than the base table, then there are 
inherently more nodes involved in fulfilling the entire request, and we have to 
address that case.
# As [~tjake] and [~jbellis] say, the more nodes involved in a query, the 
higher the risk of unavailability if the MV is updated synchronously.
# Some use cases expect synchronous updates (as argued by [~rustyrazorblade] 
and [~brianmhess]).
# But other use cases definitely do not. I think it is absurd to say that just 
because a table has an MV, every write should care about the MV. Even more 
absurd to say that adding an MV to a table will reduce the availability of all 
writes to the base table. 

Given all of those, the conclusion that both sync and async forms are necessary 
seems totally unavoidable.

Ideally, I'd like to see an extension of what [~iamaleksey] proposed above but 
be much more thorough and flexible about it.

If each request were able to pass multiple consistency-level contracts to the 
coordinator, each one could represent the expectation for a separate callback 
at the driver level.
e.g. A query to a table with a MV could express the following compound 
consistency levels. {noformat} {LQ, LOCAL_ONE{DC3,DC4}, LQ{MV1,MV2}} {noformat}
That would tell the coordinator to deliver three separate notifications back to 
the client. One when LQ in the local dc was fulfilled. Another when at least 
one copy was delivered to each of DC3 and DC4, and another when LQ was 
fulfilled in the local dc for MV1 and MV2.

I realize that this is a very far-fetched proposal, but I wanted to throw it 
out there as, imo, it reflects the theoretically best option that fulfills 
everybody's requirements. (and is also a very general mechanism that could be 
used in other scenarios).

Short of that, I don't think there is any choice but to support both sync and 
async forms of writes to tables with MVs.

One more point (not to distract from the above). With the current design of 
MVs, there will always be a risk of inconsistent reads (timeouts leaving data 
queryable in the primary table but not in one or more MVs) until the data is 
eventually propagated to the MV. While it would come at a high cost, RAMP would 
still be useful to provide read isolation in that scenario. 

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631500#comment-14631500
 ] 

Tupshin Harper commented on CASSANDRA-6477:
---

Just a reminder (since it was a loong time ago in this ticket) that we were 
going to target immediate consistency once we could leverage RAMP, and not 
before. 

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-14 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626761#comment-14626761
 ] 

Tupshin Harper commented on CASSANDRA-6477:
---

I find myself disagreeing with the hard requirement that all rows in the table 
must show up in the materialized views. While it would be nice, I believe that 
clearly documenting the limitation and providing a couple of reasonable choices 
is far preferable to encouraging using rope sufficient to hang the user.

My suggestion:
* Create a formal notion of NOT NULL columns in the schema that can be applied 
to a table, irrespective of any MV usage. 
* Columns that are NOT NULL would have the exact same restrictions as PK 
columns, namely that they need to be included in all inserts and updates (with 
the possible exception of LWT updates)
* Document (and warn in cqlsh) the fact that if you create an MV with a PK using 
a nullable column from the table, then those values will not be in the view

It seems to me that this is far less dangerous (and in many ways less 
surprising) than automatically creating a hotspot in the MV because lots of 
data with NULLs gets added.

Now with 8099 supporting NULLs for clustering columns, this might only apply to 
columns that would be a partition key in the MV, and that seems appealing. But 
I can't talk myself into liking inserting nulls into a MV partition key.

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-14 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-6477:
--
Comment: was deleted

(was: I find myself disagreeing with the hard requirement that all rows in the 
table must show up in the materialized views. While it would be nice, I believe 
that clearly documenting the limitation and providing a couple of reasonable 
choices is far preferable to encouraging using rope sufficient to hang the 
user.

My suggestion:
* Create a formal notion of NOT NULL columns in the schema that can be applied 
to a table, irrespective of any MV usage. 
* Columns that are NOT NULL would have the exact same restrictions as PK 
columns, namely that they need to be included in all inserts and updates (with 
the possible exception of LWT updates)
* Document (and warn in cqlsh) the fact that if you create a MV with a PK using 
a nullable column from the table, then those values will not be in the view

It seems to me that this is far less dangerous (and in many ways less 
surprising) than automatically creating a hotspot in the MV because lots of 
data with NULLs gets added.

Now with 8099 supporting NULLs for clustering columns, this might only apply to 
columns that would be a partition key in the MV, and that seems appealing. But 
I can't talk myself into liking inserting nulls into a MV partition key.)

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-14 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626760#comment-14626760
 ] 

Tupshin Harper commented on CASSANDRA-6477:
---

I find myself disagreeing with the hard requirement that all rows in the table 
must show up in the materialized views. While it would be nice, I believe that 
clearly documenting the limitation and providing a couple of reasonable choices 
is far preferable to encouraging using rope sufficient to hang the user.

My suggestion:
* Create a formal notion of NOT NULL columns in the schema that can be applied 
to a table, irrespective of any MV usage. 
* Columns that are NOT NULL would have the exact same restrictions as PK 
columns, namely that they need to be included in all inserts and updates (with 
the possible exception of LWT updates)
* Document (and warn in cqlsh) the fact that if you create an MV with a PK using 
a nullable column from the table, then those values will not be in the view

It seems to me that this is far less dangerous (and in many ways less 
surprising) than automatically creating a hotspot in the MV because lots of 
data with NULLs gets added.

Now with 8099 supporting NULLs for clustering columns, this might only apply to 
columns that would be a partition key in the MV, and that seems appealing. But 
I can't talk myself into liking inserting nulls into a MV partition key.

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9200) Sequences

2015-07-01 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610346#comment-14610346
 ] 

Tupshin Harper commented on CASSANDRA-9200:
---

An example of an application domain where strictly increasing integers are 
required is the IMAP protocol 
(https://tools.ietf.org/html/rfc3501#page-8), where this is mandatory.

{{A 32-bit value assigned to each message, which when used with the unique 
identifier validity value (see below) forms a 64-bit value that MUST NOT refer 
to any other message in the mailbox or any subsequent mailbox with the same 
name forever.  Unique identifiers are assigned in a strictly ascending fashion 
in the mailbox; as each message is added to the mailbox it is assigned a higher 
UID than the message(s) which were added previously.  Unlike message sequence 
numbers, unique identifiers are not necessarily contiguous.}}

Building this kind of system on top of C* today requires an external CP system 
(ick operational complexity), though it is likely the case that the sequences 
here really only need to be modeled as clustering keys and not partition keys.
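The block-reservation scheme mentioned in the ticket description can be sketched as follows. Here an in-process lock stands in for the LWT/Paxos round that would guard the shared high-water mark in Cassandra, so this is purely illustrative:

```python
import threading

class SequenceBlockAllocator:
    """Each node CASes a shared high-water mark forward by `block` ids, then
    hands out ids locally with no further coordination. Per-node ids are
    strictly increasing; cluster-wide they are increasing but not contiguous,
    so a single node would need to own each IMAP mailbox's UID space."""

    def __init__(self, block=100):
        self.block = block
        self._hwm = 0                  # stand-in for a LWT-guarded counter row
        self._cas = threading.Lock()   # stand-in for the Paxos round
        self._next = 0
        self._limit = 0

    def _reserve_block(self):
        # LWT analogue: UPDATE seq SET hwm = hwm + block IF hwm = expected
        with self._cas:
            start = self._hwm
            self._hwm += self.block
        self._next, self._limit = start, start + self.block

    def next_id(self):
        if self._next >= self._limit:
            self._reserve_block()  # only pay coordination cost per block
        self._next += 1
        return self._next - 1
```

The contention tradeoff is the point of the design: a Paxos round happens once per `block` ids instead of once per id.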

> Sequences
> -
>
> Key: CASSANDRA-9200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9200
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> UUIDs are usually the right choice for surrogate keys, but sometimes 
> application constraints dictate an increasing numeric value.
> We could do this by using LWT to reserve "blocks" of the sequence for each 
> member of the cluster, which would eliminate paxos contention at the cost of 
> not being strictly increasing.
> PostgreSQL syntax: 
> http://www.postgresql.org/docs/9.4/static/sql-createsequence.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-06-09 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578794#comment-14578794
 ] 

Tupshin Harper commented on CASSANDRA-7066:
---

Users don't care about SSTables, users care about their data. It's unclear 
what, if any, impact this would have on the availability/existence of data. So 
a few questions about failure conditions, all of which would apply to a 
single-node cluster with commitlog durability set to batch, for simplicity of 
discussion.

Could this result in any circumstances where:
# a write was acknowledged to be written (consistency level met), but then no 
longer exists on disk through this sstable cleanup/deletion?
# a datum was queryable (through a memtable or sstable read), but then is no 
longer on disk or queryable?
# a datum was deleted (tombstone?) and then comes back?
# similar questions to above when a snapshot/backup occurred prior to the 
sstable cleanup, and restoration from that backup was necessary.

If the answer to all of those is "no", then I have a hard time imagining any 
objections, though would love additional input from others. If yes, then huge 
problem. :)

Given the reference to "partial results" above, I'd also like some clarity on 
whether that has had any user-facing impact on data availability/queryability.


> Simplify (and unify) cleanup of compaction leftovers
> 
>
> Key: CASSANDRA-7066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Stefania
>Priority: Minor
>  Labels: compaction
> Fix For: 3.x
>
> Attachments: 7066.txt
>
>
> Currently we manage a list of in-progress compactions in a system table, 
> which we use to clean up incomplete compactions when we're done. The problem 
> with this is that 1) it's a bit clunky (and leaves us in positions where we 
> can unnecessarily clean up completed files, or conversely not clean up files 
> that have been superseded); and 2) it's only used for a regular compaction - 
> no other compaction types are guarded in the same way, so this can result in 
> duplication if we fail before deleting the replacements.
> I'd like to see each sstable store in its metadata its direct ancestors, and 
> on startup we simply delete any sstables that occur in the union of all 
> ancestor sets. This way as soon as we finish writing we're capable of 
> cleaning up any leftovers, so we never get duplication. It's also much easier 
> to reason about.
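The ancestor-union cleanup proposed above reduces to a small pure function, assuming each sstable's metadata carries the names of its direct ancestors:

```python
def leftovers(sstables):
    """sstables: dict mapping sstable name -> set of direct-ancestor names.

    Any sstable appearing in some live sstable's ancestor set was an input
    to a compaction whose replacement finished writing, so it is safe to
    delete on startup."""
    superseded = set().union(*sstables.values()) if sstables else set()
    return superseded & set(sstables)  # only delete files actually present
```

For example, if `c` lists ancestors `{a, b}`, then `a` and `b` are leftovers to be deleted on startup, while `c` is kept; no separate in-progress table is needed.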



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2015-05-12 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539854#comment-14539854
 ] 

Tupshin Harper commented on CASSANDRA-7622:
---

An additional thought is that the capabilities framework (CASSANDRA-8303) could 
be used to restrict the available commands that would make it through to the 
virtual table implementation.

A possibly controversial example of this would be to only support UPDATE 
operations and not INSERT operations to semantically denote the fact that this 
table doesn't support adding new hosts, metrics, or attributes, but does 
support updating them. This wouldn't restrict all unsupported behavior, and the 
table implementation would still have to return errors if a read-only (or 
non-existent) attribute were updated, but it seems a bit cleaner than having 
the table claim to support INSERTs.

A (maybe) less controversial use of 8303 would be to disallow all write 
operations (both UPDATE and INSERT as well as others) for tables that are truly 
read-only.

And in the JMX case, it would certainly make sense to have different users have 
either SELECT only permissions or SELECT and UPDATE.



> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>Assignee: Benjamin Lerer
> Fix For: 3.x
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7622) Implement virtual tables

2015-05-12 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539841#comment-14539841
 ] 

Tupshin Harper edited comment on CASSANDRA-7622 at 5/12/15 1:54 PM:


I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations, great; otherwise, noops or unsupported exceptions should be fine 
when a given operation doesn't make sense for the table.

The locality of the data (and whether distributed or not), should be internal 
to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
)
CREATE INDEX host_by_ip ON jmx (host_ip) #this will work after CASSANDRA-8103 

SELECT metric_type, attributes FROM jmx where node_id = 
'eedea3e3-e36d-4371-8937-57f5a8303165' #returns all metrics for a given node
SELECT attributes FROM jmx where host_ip = '10.10.10.10'  and 
metric_type='CompactionManager' #returns all compaction metrics for a given 
node, looking up the node by a pseudo secondary index
{code}
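The "subset of operations" assumption could be expressed as a small dispatch layer. The class and method names below are illustrative sketches, not Cassandra's internal API:

```python
class VirtualTable:
    supported = {"SELECT"}  # read-only by default

    def execute(self, verb, *args):
        # Verbs outside the declared subset fail fast, per the assumption above.
        if verb not in self.supported:
            raise NotImplementedError(
                f"{type(self).__name__} does not support {verb}")
        return getattr(self, verb.lower())(*args)

class JmxTable(VirtualTable):
    # Metrics can be read and tuned, but new hosts/metrics cannot be INSERTed.
    supported = {"SELECT", "UPDATE"}

    def select(self, node_id):
        # Would read through to the MBean server; canned row for illustration.
        return {"node_id": node_id, "metric_type": "CompactionManager"}

    def update(self, node_id, attributes):
        # Would write through to writable MBean attributes.
        return True
```

Whether a given verb is even attempted could then be gated earlier by the capabilities framework, with per-user SELECT-only or SELECT+UPDATE permissions as described above.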




was (Author: tupshin):
I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations great, but otherwise noops or unsupported exceptions should be fine 
if a given operation doesn't make sense for the table.

The locality of the data (and whether distributed or not), should be internal 
to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
)
CREATE INDEX host_by_ip ON jmx (host_ip) #this will work after CASSANDRA-8103 

SELECT metrics_type, attributes FROM jmx where node_id = 
'eedea3e3-e36d-4371-8937-57f5a8303165' #returns all metrics for a given node
SELECT attributes FROM jmx where host_ip = '10.10.10.10'  and 
metric_type='CompactionManager' #returns all compaction metrics for a given 
node, looking up the node by a pseudo secondary index
{code}



> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>Assignee: Benjamin Lerer
> Fix For: 3.x
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7622) Implement virtual tables

2015-05-12 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539841#comment-14539841
 ] 

Tupshin Harper edited comment on CASSANDRA-7622 at 5/12/15 1:53 PM:


I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations, great; otherwise, no-ops or unsupported exceptions should be fine 
if a given operation doesn't make sense for the table.

The locality of the data (and whether it is distributed or not) should be 
internal to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
);
CREATE INDEX host_by_ip ON jmx (host_ip); -- this will work after CASSANDRA-8103

-- returns all metrics for a given node
SELECT metric_type, attributes FROM jmx
WHERE node_id = eedea3e3-e36d-4371-8937-57f5a8303165;
-- returns all compaction metrics for a given node, looking up the node by a
-- pseudo secondary index
SELECT attributes FROM jmx
WHERE host_ip = '10.10.10.10' AND metric_type = 'CompactionManager';
{code}




was (Author: tupshin):
I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations, great; otherwise, no-ops or unsupported exceptions should be fine 
if a given operation doesn't make sense for the table.

The locality of the data (and whether it is distributed or not) should be 
internal to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
);
CREATE INDEX host_by_ip ON jmx (host_ip); -- this will work after CASSANDRA-8103

-- returns all metrics for a given node
SELECT attributes FROM jmx
WHERE node_id = eedea3e3-e36d-4371-8937-57f5a8303165;
-- returns all compaction metrics for a given node, looking up the node by a
-- pseudo secondary index
SELECT attributes FROM jmx
WHERE host_ip = '10.10.10.10' AND metric_type = 'CompactionManager';
{code}








[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2015-05-12 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539841#comment-14539841
 ] 

Tupshin Harper commented on CASSANDRA-7622:
---

I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations, great; otherwise, no-ops or unsupported exceptions should be fine 
if a given operation doesn't make sense for the table.

The locality of the data (and whether it is distributed or not) should be 
internal to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
);
CREATE INDEX host_by_ip ON jmx (host_ip); -- this will work after CASSANDRA-8103

-- returns all metrics for a given node
SELECT attributes FROM jmx
WHERE node_id = eedea3e3-e36d-4371-8937-57f5a8303165;
-- returns all compaction metrics for a given node, looking up the node by a
-- pseudo secondary index
SELECT attributes FROM jmx
WHERE host_ip = '10.10.10.10' AND metric_type = 'CompactionManager';
{code}









[jira] [Comment Edited] (CASSANDRA-7622) Implement virtual tables

2015-05-12 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539841#comment-14539841
 ] 

Tupshin Harper edited comment on CASSANDRA-7622 at 5/12/15 1:48 PM:


I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations, great; otherwise, no-ops or unsupported exceptions should be fine 
if a given operation doesn't make sense for the table.

The locality of the data (and whether it is distributed or not) should be 
internal to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
);
CREATE INDEX host_by_ip ON jmx (host_ip); -- this will work after CASSANDRA-8103

-- returns all metrics for a given node
SELECT attributes FROM jmx
WHERE node_id = eedea3e3-e36d-4371-8937-57f5a8303165;
-- returns all compaction metrics for a given node, looking up the node by a
-- pseudo secondary index
SELECT attributes FROM jmx
WHERE host_ip = '10.10.10.10' AND metric_type = 'CompactionManager';
{code}




was (Author: tupshin):
I think the assumption should be that each virtual table supports a subset of 
the queries performed on regular tables. If the virtual table can support all 
operations, great; otherwise, no-ops or unsupported exceptions should be fine 
if a given operation doesn't make sense for the table.

The locality of the data (and whether it is distributed or not) should be 
internal to the implementation of each virtual table.

Using JMX, I suggest this as a simplified starting point:
{code}
CREATE TABLE jmx (
  node_id uuid,
  metric_type text,
  attributes map<text, text>,
  host_ip text static,
  PRIMARY KEY ((node_id), metric_type)
);
CREATE INDEX host_by_ip ON jmx (host_ip); -- this will work after CASSANDRA-8103

-- returns all metrics for a given node
SELECT attributes FROM jmx
WHERE node_id = eedea3e3-e36d-4371-8937-57f5a8303165;
-- returns all compaction metrics for a given node, looking up the node by a
-- pseudo secondary index
SELECT attributes FROM jmx
WHERE host_ip = '10.10.10.10' AND metric_type = 'CompactionManager';
{code}









[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2015-05-07 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533578#comment-14533578
 ] 

Tupshin Harper commented on CASSANDRA-7622:
---

The correct decision to make JMX bind to localhost only for security reasons 
creates additional importance and urgency for this as a feature.

I'd like to promote it from hand-wavey 3.x to more concrete 3.1 in hopes that 
it wouldn't slip from there. We really need to simplify the access patterns and 
reduce the surface area.






[jira] [Commented] (CASSANDRA-9242) Add PerfDisableSharedMem to default JVM params

2015-04-24 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512125#comment-14512125
 ] 

Tupshin Harper commented on CASSANDRA-9242:
---

Big plus one on this. Since that linked article came out, I've heard of a 
couple of cases where this was tried, and in each case, it helped with long 
tail latencies.
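For anyone applying this before it becomes a default: the flag can be appended to the JVM options in conf/cassandra-env.sh (the conventional pattern; the exact file and location vary by version and packaging):

```shell
# Stop the JVM from mmap'ing its hsperfdata stats file, so safepoint
# bookkeeping no longer stalls on page writeback during GC pauses.
# Note: this also disables jps/jstat monitoring of the process, since
# those tools read the shared perf memory.
JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"
```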

> Add PerfDisableSharedMem to default JVM params
> --
>
> Key: CASSANDRA-9242
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9242
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Config
>Reporter: Matt Stump
>
> We should add PerfDisableSharedMem to default JVM params. The JVM will save 
> stats to a memory mapped file when reaching a safepoint. This is performed 
> synchronously and the JVM remains paused while this action takes place. 
> Occasionally the OS will stall the calling thread while this happens 
> resulting in significant impact to worst case JVM pauses. By disabling the 
> save in the JVM these mysterious multi-second pauses disappear.
> The behavior is outlined in [this 
> article|http://www.evanjones.ca/jvm-mmap-pause.html]. Another manifestation 
> is significant time spent in sys during GC pauses. In [the linked 
> test|http://cstar.datastax.com/graph?stats=762d9c2a-eace-11e4-8236-42010af0688f&metric=gc_max_ms&operation=1_write&smoothing=1&show_aggregates=true&xmin=0&xmax=110.77&ymin=0&ymax=10421.4]
>  you'll notice multiple seconds spent in sys during the longest pauses.





[jira] [Commented] (CASSANDRA-8692) Coalesce intra-cluster network messages

2015-03-13 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360966#comment-14360966
 ] 

Tupshin Harper commented on CASSANDRA-8692:
---

I commented on CASSANDRA-7032 that "It seems like there might be a way to 
constrain vnode RDF (replication distribution factor) in the general scope of 
this ticket as well."

I feel like there are some very compelling availability arguments (in addition 
to these possible performance optimizations) in favor of being able to 
constrain how many other nodes (within a DC) that a given vnode-enabled node 
actually replicates with. 

e.g. you could have 256 vnodes, but guarantee that those 256 would only 
replicate to 32 (out of possibly thousands) of other nodes.

> Coalesce intra-cluster network messages
> ---
>
> Key: CASSANDRA-8692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8692
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Fix For: 2.1.4
>
> Attachments: batching-benchmark.png
>
>
> While researching CASSANDRA-8457 we found that it is effective and can be 
> done without introducing additional latency at low concurrency/throughput.
> The patch from that was used and found to be useful in a real life scenario 
> so I propose we implement this in 2.1 in addition to 3.0.
> The change set is a single file and is small enough to be reviewable.





[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-02-21 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330198#comment-14330198
 ] 

Tupshin Harper commented on CASSANDRA-8844:
---

To clarify what I think is the minimum viable feature set that Cassandra should 
support:
# A DDL mechanism for turning on and off logging for a given table
# Either file-based logging built in, or a pluggable interface where such 
logging could be built
# If it's a pluggable interface, the ability to specify the classname of the 
logger in the DDL command

Ideally, I'd love to see the pluggable interface to allow for other logging 
mechanisms, but for Cassandra itself to include a bare-bones logger that could 
be integrated with out of the box, and to serve as an example for how others 
should implement the interface.
I certainly see the CQL delivery mechanism, as well as the more flexible 
logging (multiple logs per table along with filtering), as out of scope for 
this ticket. I would create another "future" one for both of those.
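A hypothetical CQL shape for items 1 and 3 — the property names and syntax below are purely illustrative, not an existing or agreed design:

```
-- enable per-table CDC logging (illustrative property name)
ALTER TABLE sensors.readings WITH cdc_log = true;

-- with a pluggable interface, name the logger implementation in the DDL
ALTER TABLE sensors.readings
  WITH cdc_log = true
   AND cdc_logger = 'com.example.cdc.FileLogger';

-- and turn it off again
ALTER TABLE sensors.readings WITH cdc_log = false;
```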

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Tupshin Harper
> Fix For: 3.1
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.

[jira] [Updated] (CASSANDRA-8844) Change Data Capture (CDC)

2015-02-21 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-8844:
--
Description: 
"In databases, change data capture (CDC) is a set of software design patterns 
used to determine (and track) the data that has changed so that action can be 
taken using the changed data. Also, Change data capture (CDC) is an approach to 
data integration that is based on the identification, capture and delivery of 
the changes made to enterprise data sources."
-Wikipedia

As Cassandra is increasingly being used as the Source of Record (SoR) for 
mission critical data in large enterprises, it is increasingly being called 
upon to act as the central hub of traffic and data flow to other systems. In 
order to try to address the general need, we (cc [~brianmhess]), propose 
implementing a simple data logging mechanism to enable per-table CDC patterns.

h2. The goals:
# Use CQL as the primary ingestion mechanism, in order to leverage its 
Consistency Level semantics, and in order to treat it as the single 
reliable/durable SoR for the data.
# To provide a mechanism for implementing good and reliable 
(deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
continuous semi-realtime feeds of mutations going into a Cassandra cluster.
# To eliminate the developmental and operational burden of users so that they 
don't have to do dual writes to other systems.
# For users that are currently doing batch export from a Cassandra system, give 
them the opportunity to make that realtime with a minimum of coding.

h2. The mechanism:
We propose a durable logging mechanism that functions similarly to a commitlog, 
with the following nuances:
- Takes place on every node, not just the coordinator, so RF number of copies 
are logged.
- Separate log per table.
- Per-table configuration. Only tables that are specified as CDC_LOG would do 
any logging.
- Per DC. We are trying to keep the complexity to a minimum to make this an 
easy enhancement, but most likely use cases would prefer to only implement CDC 
logging in one (or a subset) of the DCs that are being replicated to
- In the critical path of ConsistencyLevel acknowledgment. Just as with the 
commitlog, failure to write to the CDC log should fail that node's write. If 
that means the requested consistency level was not met, then clients *should* 
experience UnavailableExceptions.
- Be written in a Row-centric manner such that it is easy for consumers to 
reconstitute rows atomically.
- Written in a simple format designed to be consumed *directly* by daemons 
written in non JVM languages

h2. Nice-to-haves
I strongly suspect that the following features will be asked for, but I also 
believe that they can be deferred for a subsequent release, and to gauge actual 
interest.
- Multiple logs per table. This would make it easy to have multiple 
"subscribers" to a single table's changes. A workaround would be to create a 
forking daemon listener, but that's not a great answer.
- Log filtering. Being able to apply filters, including UDF-based filters would 
make Cassandra a much more versatile feeder into other systems, and again, 
reduce complexity that would otherwise need to be built into the daemons.

h2. Format and Consumption
- Cassandra would only write to the CDC log, and never delete from it. 
- Cleaning up consumed logfiles would be the client daemon's responsibility
- Logfile size should probably be configurable.
- Logfiles should be named with a predictable naming schema, making it trivial 
to process them in order.
- Daemons should be able to checkpoint their work, and resume from where they 
left off. This means they would have to leave some file artifact in the CDC 
log's directory.
- A sophisticated daemon should be able to be written that could 
-- Catch up, in written-order, even when it is multiple logfiles behind in 
processing
-- Be able to continuously "tail" the most recent logfile and get 
low-latency(ms?) access to the data as it is written.
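A minimal Python sketch of the catch-up-and-checkpoint side of such a daemon. The sortable ".log" segment naming and the checkpoint-file name are assumptions for illustration only, since no on-disk format has been specified:

```python
import os

def next_segments(cdc_dir, checkpoint_file="consumer.checkpoint"):
    """Return not-yet-consumed CDC segment paths in written order.

    Relies on the 'predictable naming schema' point above: lexical sort
    order equals write order. Segment suffix and checkpoint name are
    illustrative, not a real Cassandra convention."""
    done = set()
    cp = os.path.join(cdc_dir, checkpoint_file)
    if os.path.exists(cp):
        with open(cp) as f:
            done = {line.strip() for line in f if line.strip()}
    segments = sorted(name for name in os.listdir(cdc_dir)
                      if name.endswith(".log"))
    return [os.path.join(cdc_dir, name) for name in segments
            if name not in done]

def mark_consumed(cdc_dir, segment, checkpoint_file="consumer.checkpoint"):
    """Record a fully processed segment; this checkpoint file is the
    'file artifact in the CDC log's directory' the description mentions."""
    with open(os.path.join(cdc_dir, checkpoint_file), "a") as f:
        f.write(os.path.basename(segment) + "\n")
```

Because the checkpoint is just an append-only list of finished segment names, a daemon that crashes mid-segment simply reprocesses that segment, giving at-least-once delivery.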

h2. Alternate approach
In order to make consuming a change log easy and efficient to do with low 
latency, the following could supplement the approach outlined above
- Instead of writing to a logfile, by default, Cassandra could expose a socket 
for a daemon to connect to, and from which it could pull each row.
- Cassandra would have a limited buffer for storing rows, should the listener 
become backlogged, but it would immediately spill to disk in that case, never 
incurring large in-memory costs.

h2. Additional consumption possibility
With all of the above, still relevant:
- Instead of (or in addition to) using the other logging mechanisms, use the 
CQL transport itself as a logger.
- Extend the CQL protocol slightly so that rows of data can be returned to a 
listener that didn't explicitly make a query, but instead registered itself 
with Cassandra as a listener for a particular event type.

[jira] [Updated] (CASSANDRA-8844) Change Data Capture (CDC)

2015-02-20 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-8844:
--
Description: 
"In databases, change data capture (CDC) is a set of software design patterns 
used to determine (and track) the data that has changed so that action can be 
taken using the changed data. Also, Change data capture (CDC) is an approach to 
data integration that is based on the identification, capture and delivery of 
the changes made to enterprise data sources."
-Wikipedia

As Cassandra is increasingly being used as the Source of Record (SoR) for 
mission critical data in large enterprises, it is increasingly being called 
upon to act as the central hub of traffic and data flow to other systems. In 
order to try to address the general need, we (cc [~brianmhess]), propose 
implementing a simple data logging mechanism to enable per-table CDC patterns.

h2. The goals:
# Use CQL as the primary ingestion mechanism, in order to leverage its 
Consistency Level semantics, and in order to treat it as the single 
reliable/durable SoR for the data.
# To provide a mechanism for implementing good and reliable 
(deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
continuous semi-realtime feeds of mutations going into a Cassandra cluster.
# To eliminate the developmental and operational burden of users so that they 
don't have to do dual writes to other systems.
# For users that are currently doing batch export from a Cassandra system, give 
them the opportunity to make that realtime with a minimum of coding.

h2. The mechanism:
We propose a durable logging mechanism that functions similarly to a commitlog, 
with the following nuances:
- Takes place on every node, not just the coordinator, so RF number of copies 
are logged.
- Separate log per table.
- Per-table configuration. Only tables that are specified as CDC_LOG would do 
any logging.
- Per DC. We are trying to keep the complexity to a minimum to make this an 
easy enhancement, but most likely use cases would prefer to only implement CDC 
logging in one (or a subset) of the DCs that are being replicated to
- In the critical path of ConsistencyLevel acknowledgment. Just as with the 
commitlog, failure to write to the CDC log should fail that node's write. If 
that means the requested consistency level was not met, then clients *should* 
experience UnavailableExceptions.
- Be written in a Row-centric manner such that it is easy for consumers to 
reconstitute rows atomically.
- Written in a simple format designed to be consumed *directly* by daemons 
written in non JVM languages

h2. Nice-to-haves
I strongly suspect that the following features will be asked for, but I also 
believe that they can be deferred for a subsequent release, and to gauge actual 
interest.
- Multiple logs per table. This would make it easy to have multiple 
"subscribers" to a single table's changes. A workaround would be to create a 
forking daemon listener, but that's not a great answer.
- Log filtering. Being able to apply filters, including UDF-based filters would 
make Cassandra a much more versatile feeder into other systems, and again, 
reduce complexity that would otherwise need to be built into the daemons.

h2. Format and Consumption
- Cassandra would only write to the CDC log, and never delete from it. 
- Cleaning up consumed logfiles would be the client daemon's responsibility
- Logfile size should probably be configurable.
- Logfiles should be named with a predictable naming schema, making it trivial 
to process them in order.
- Daemons should be able to checkpoint their work, and resume from where they 
left off. This means they would have to leave some file artifact in the CDC 
log's directory.
- A sophisticated daemon should be able to be written that could 
-- Catch up, in written-order, even when it is multiple logfiles behind in 
processing
-- Be able to continuously "tail" the most recent logfile and get 
low-latency(ms?) access to the data as it is written.

h2. Alternate approach
In order to make consuming a change log easy and efficient to do with low 
latency, the following could supplement the approach outlined above
- Instead of writing to a logfile, by default, Cassandra could expose a socket 
for a daemon to connect to, and from which it could pull each row.
- Cassandra would have a limited buffer for storing rows, should the listener 
become backlogged, but it would immediately spill to disk in that case, never 
incurring large in-memory costs.

h2. Additional consumption possibility
With all of the above, still relevant:
- Instead of (or in addition to) using the other logging mechanisms, use the 
CQL transport itself as a logger.
- Extend the CQL protocol slightly so that rows of data can be returned to a 
listener that didn't explicitly make a query, but instead registered itself 
with Cassandra as a listener for a particular event type.

[jira] [Created] (CASSANDRA-8844) Change Data Capture (CDC)

2015-02-20 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-8844:
-

 Summary: Change Data Capture (CDC)
 Key: CASSANDRA-8844
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Tupshin Harper
 Fix For: 3.1


"In databases, change data capture (CDC) is a set of software design patterns 
used to determine (and track) the data that has changed so that action can be 
taken using the changed data. Also, Change data capture (CDC) is an approach to 
data integration that is based on the identification, capture and delivery of 
the changes made to enterprise data sources."
-Wikipedia

As Cassandra is increasingly being used as the Source of Record (SoR) for 
mission critical data in large enterprises, it is increasingly being called 
upon to act as the central hub of traffic and data flow to other systems. In 
order to try to address the general need, we (cc [~brianmhess]), propose 
implementing a simple data logging mechanism to enable per-table CDC patterns.

h2. The goals:
# Use CQL as the primary ingestion mechanism, in order to leverage its 
Consistency Level semantics, and in order to treat it as the single 
reliable/durable SoR for the data.
# To provide a mechanism for implementing good and reliable 
(deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
continuous semi-realtime feeds of mutations going into a Cassandra cluster.
# To eliminate the developmental and operational burden of users so that they 
don't have to do dual writes to other systems.
# For users that are currently doing batch export from a Cassandra system, give 
them the opportunity to make that realtime with a minimum of coding.

h2. The mechanism:
We propose a durable logging mechanism that functions similarly to a commitlog, 
with the following nuances:
- Takes place on every node, not just the coordinator, so RF number of copies 
are logged.
- Separate log per table.
- Per-table configuration. Only tables that are specified as CDC_LOG would do 
any logging.
- Per DC. We are trying to keep the complexity to a minimum to make this an 
easy enhancement, but most likely use cases would prefer to only implement CDC 
logging in one (or a subset) of the DCs that are being replicated to.
- In the critical path of ConsistencyLevel acknowledgment. Just as with the 
commitlog, failure to write to the CDC log should fail that node's write. If 
that means the requested consistency level was not met, then clients *should* 
experience UnavailableExceptions.
- Be written in a Row-centric manner such that it is easy for consumers to 
reconstitute rows atomically.
- Written in a simple format designed to be consumed *directly* by daemons 
written in non-JVM languages.

h2. Nice-to-haves
I strongly suspect that the following features will be asked for, but I also 
believe that they can be deferred to a subsequent release, in order to gauge 
actual interest.
- Multiple logs per table. This would make it easy to have multiple 
"subscribers" to a single table's changes. A workaround would be to create a 
forking daemon listener, but that's not a great answer.
- Log filtering. Being able to apply filters, including UDF-based filters would 
make Cassandra a much more versatile feeder into other systems, and again, 
reduce complexity that would otherwise need to be built into the daemons.

h2. Format and Consumption
- Cassandra would only write to the CDC log, and never delete from it. 
- Cleaning up consumed logfiles would be the client daemon's responsibility.
- Logfile size should probably be configurable.
- Logfiles should be named with a predictable naming schema, making it trivial 
to process them in order.
- Daemons should be able to checkpoint their work, and resume from where they 
left off. This means they would have to leave some file artifact in the CDC 
log's directory.
- It should be possible to write a sophisticated daemon that could:
-- Catch up, in written-order, even when it is multiple logfiles behind in 
processing
-- Be able to continuously "tail" the most recent logfile and get 
low-latency (ms?) access to the data as it is written.
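The checkpoint-and-catch-up behavior described above can be sketched in a few lines. This is an illustrative non-JVM consumer, assuming a hypothetical {{cdc-<sequence>.log}} naming schema and a checkpoint file kept in the log directory — none of these names come from Cassandra itself:

```python
import os

# Hypothetical checkpoint artifact left in the CDC log directory.
CHECKPOINT = "daemon.checkpoint"

def ordered_logfiles(cdc_dir):
    """Return logfile names in written order (zero-padded sequence sorts lexically)."""
    return sorted(f for f in os.listdir(cdc_dir)
                  if f.startswith("cdc-") and f.endswith(".log"))

def load_checkpoint(cdc_dir):
    """Resume point: the last fully processed logfile name, or None."""
    path = os.path.join(cdc_dir, CHECKPOINT)
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip() or None
    return None

def consume(cdc_dir, handle_row):
    """Catch up from the checkpoint, handing each row to the consumer in order."""
    done = load_checkpoint(cdc_dir)
    for name in ordered_logfiles(cdc_dir):
        if done is not None and name <= done:
            continue  # already processed before a restart
        with open(os.path.join(cdc_dir, name)) as f:
            for line in f:
                handle_row(line.rstrip("\n"))
        # Checkpoint after each file so a crash resumes from here, giving
        # deliver-at-least-once semantics at file granularity.
        with open(os.path.join(cdc_dir, CHECKPOINT), "w") as f:
            f.write(name)
```

A real daemon would additionally tail the newest file rather than only processing closed ones, but the ordering and checkpointing mechanics are the same.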

h2. Alternate approach
In order to make consuming a change log easy and efficient to do with low 
latency, the following could supplement the approach outlined above
- Instead of writing to a logfile, by default, Cassandra could expose a socket 
for a daemon to connect to, and from which it could pull each row.
- Cassandra would have a limited buffer for storing rows, should the listener 
become backlogged, but it would immediately spill to disk in that case, never 
incurring large in-memory costs.
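The limited buffer with spill-to-disk might look roughly like this — class and file names are invented, and a real implementation would need fsync and rotation details this sketch omits:

```python
import collections
import os

class SpillBuffer:
    """Bounded in-memory row buffer that spills to disk when the
    listener falls behind, so memory cost never grows past `limit`."""

    def __init__(self, limit, spill_path):
        self.limit = limit
        self.mem = collections.deque()
        self.spill_path = spill_path

    def push(self, row):
        if len(self.mem) < self.limit:
            self.mem.append(row)
        else:
            # Backlogged listener: spill to disk instead of growing memory.
            with open(self.spill_path, "a") as f:
                f.write(row + "\n")

    def drain(self):
        """Yield buffered rows in arrival order: memory first, then spill."""
        while self.mem:
            yield self.mem.popleft()
        if os.path.exists(self.spill_path):
            with open(self.spill_path) as f:
                for line in f:
                    yield line.rstrip("\n")
            os.remove(self.spill_path)
```

The design point is that `push` never blocks the write path and never allocates past the limit; only the consumer pays the disk-read cost of catching up.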

h2. Additional consumption possibility
With all of the above, still relevant:
- Instead of (or in addition to) using the other logging mechanisms, use the CQL 
transport itself as a logger.
- Extend the CQL protocol slightly.

[jira] [Commented] (CASSANDRA-8754) Required consistency level

2015-02-06 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309509#comment-14309509
 ] 

Tupshin Harper commented on CASSANDRA-8754:
---

Something like a set of ALLOWED_CONSISTENCY_LEVELS (maybe separate ones for 
reads and writes?) per table. The biggest benefit would be to enforce sanity on 
LWT operations not mixing with non-LWT, but in general, it would be useful to 
reduce the amount of rope users have to hang themselves with.
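The per-table allow-list idea could be enforced as a check before a statement executes. A minimal sketch — the table name, policy shape, and function are all illustrative, not an actual Cassandra API:

```python
# Hypothetical per-table policy with separate allow-lists for reads and writes.
ALLOWED_CLS = {
    "ks.payments": {"read": {"LOCAL_QUORUM", "SERIAL"},
                    "write": {"LOCAL_QUORUM"}},
}

def check_cl(table, op, cl):
    """Reject a request whose consistency level the table does not permit."""
    policy = ALLOWED_CLS.get(table)
    if policy is None:
        return  # no restriction configured for this table
    if cl not in policy[op]:
        raise ValueError("CL %s not permitted for %ss on %s" % (cl, op, table))
```

Splitting reads from writes is what lets the policy forbid mixing LWT (SERIAL reads, conditional writes) with plain-CL writes on the same table.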

> Required consistency level
> --
>
> Key: CASSANDRA-8754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8754
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Ryan Svihla
>
> Idea is to prevent a query based on a consistency level not being met. For 
> example we can specify that all queries should be at least CL LOCAL_QUORUM.
> Lots of customers struggle with getting all their dev teams on board with 
> consistency levels and all the ramifications. The normal solution for this 
> has traditionally to build a service in front of Cassandra that the entire 
> dev team accesses. However, this has proven challenging for some 
> organizations to do correctly, and I think an easier approach would be to 
> require a given consistency level as a matter of enforced policy in the 
> database. 
> I'm open for where this belongs. The most flexible approach is at a table 
> level, however I'm concerned this is potentially error prone and labor 
> intensive. It could be a table attribute similar to compaction strategy.
> The simplest administratively is a cluster level, in say the cassandra.yaml
> The middle ground is at the keyspace level; the only downside I could 
> foresee is keyspace explosion to fit the minimum-CL schemes involved. It could be a 
> keyspace attribute such as replication strategy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8754) Required consistency level

2015-02-06 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309493#comment-14309493
 ] 

Tupshin Harper commented on CASSANDRA-8754:
---

-1 on cluster level (too limiting, IMO), but big +1 for table level 
restrictions of CL

> Required consistency level
> --
>
> Key: CASSANDRA-8754
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8754
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Ryan Svihla
>
> Idea is to prevent a query based on a consistency level not being met.
> Lots of customers struggle with getting all their dev teams on board with 
> consistency levels and all the ramifications. The normal solution for this 
> has traditionally to build a service in front of Cassandra that the entire 
> dev team accesses. However, this has proven challenging for some 
> organizations to do correctly, and I think an easier approach would be to 
> require a given consistency level as a matter of enforced policy in the 
> database. 
> I'm open for where this belongs. The most flexible approach is at a table 
> level, however I'm concerned this is potentially error prone and labor 
> intensive. It could be a table attribute similar to compaction strategy.
> The simplest administratively is a cluster level, in say the cassandra.yaml
> The middle ground is at the keyspace level; the only downside I could 
> foresee is keyspace explosion to fit the minimum-CL schemes involved. It could be a 
> keyspace attribute such as replication strategy.





[jira] [Commented] (CASSANDRA-8692) Coalesce intra-cluster network messages

2015-01-30 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299100#comment-14299100
 ] 

Tupshin Harper commented on CASSANDRA-8692:
---

At least 2.1 inclusion please. This is looking to be a pretty substantial win.

> Coalesce intra-cluster network messages
> ---
>
> Key: CASSANDRA-8692
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8692
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Ariel Weisberg
>Assignee: Ariel Weisberg
> Attachments: batching-benchmark.png
>
>
> While researching CASSANDRA-8457 we found that it is effective and can be 
> done without introducing additional latency at low concurrency/throughput.
> The patch from that was used and found to be useful in a real life scenario 
> so I propose we implement this in 2.1 in addition to 3.0.
> The change set is a single file and is small enough to be reviewable.





[jira] [Created] (CASSANDRA-8586) support millions of sstables by lazily acquiring/caching/dropping filehandles

2015-01-08 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-8586:
-

 Summary: support millions of sstables by lazily 
acquiring/caching/dropping filehandles
 Key: CASSANDRA-8586
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8586
 Project: Cassandra
  Issue Type: New Feature
Reporter: Tupshin Harper
Assignee: Aleksey Yeschenko


This might turn into a meta ticket if other obstacles are found in the goal of 
supporting a huge number of sstables.

Technically, the only gap that I know of to prevent us from supporting absurd 
numbers of sstables is the fact that we hold on to an open filehandle for every 
single sstable. 

For use cases that are willing to take a hit to read-performance in order to 
achieve high densities and low write amplification, a mechanism for only 
retaining file handles for recently read sstables could be very valuable.

This would allow for alternate compaction strategies, and compaction-strategy 
tuning, that don't try to optimize for read performance as aggressively.
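The acquire/cache/drop idea amounts to an LRU cache of open filehandles. A toy sketch, with names invented for illustration:

```python
import collections

class FileHandleCache:
    """Keep at most `capacity` sstable filehandles open, lazily opening on
    read and dropping the least recently used when over capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.handles = collections.OrderedDict()  # path -> open file

    def get(self, path):
        if path in self.handles:
            self.handles.move_to_end(path)  # mark as recently read
            return self.handles[path]
        handle = open(path, "rb")  # lazily acquire on first read
        self.handles[path] = handle
        if len(self.handles) > self.capacity:
            _, evicted = self.handles.popitem(last=False)
            evicted.close()  # drop the coldest handle
        return handle
```

This is exactly the trade described in the ticket: cold sstables cost an extra `open()` on read, in exchange for an open-file count bounded by the cache capacity rather than by the sstable count.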






[jira] [Commented] (CASSANDRA-7666) Range-segmented sstables

2014-12-23 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257751#comment-14257751
 ] 

Tupshin Harper commented on CASSANDRA-7666:
---

I think that it's sufficient to let this be dormant until or unless it is 
needed to support other features. DTCS covers most of the immediate benefit. 
Future possible features such as tiered storage and the ability to drop whole 
segments at a time, however, mean that we should not defer this one 
indefinitely.

> Range-segmented sstables
> 
>
> Key: CASSANDRA-7666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7666
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Sam Tunnicliffe
> Fix For: 3.0
>
>
> It would be useful to segment sstables by data range (not just token range as 
> envisioned by CASSANDRA-6696).
> The primary use case is to allow deleting those data ranges for "free" by 
> dropping the sstables involved.  We should also (possibly as a separate 
> ticket) be able to leverage this information in query planning to avoid 
> unnecessary sstable reads.
> Relational databases typically call this "partitioning" the table, but 
> obviously we use that term already for something else: 
> http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
> Tokutek's take for mongodb: 
> http://docs.tokutek.com/tokumx/tokumx-partitioned-collections.html





[jira] [Commented] (CASSANDRA-7275) Errors in FlushRunnable may leave threads hung

2014-12-17 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250554#comment-14250554
 ] 

Tupshin Harper commented on CASSANDRA-7275:
---

Strongly in favor of the opt-in, policy-based approach that [~jbellis] 
mentioned. There isn't a one-size-fits-all approach to dealing with this.

> Errors in FlushRunnable may leave threads hung
> --
>
> Key: CASSANDRA-7275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7275
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 2.0.12
>
> Attachments: 0001-Move-latch.countDown-into-finally-block.patch, 
> 7252-2.0-v2.txt, CASSANDRA-7275-flush-info.patch
>
>
> In Memtable.FlushRunnable, the CountDownLatch will never be counted down if 
> there are errors, which results in hanging any threads that are waiting for 
> the flush to complete.  For example, an error like this causes the problem:
> {noformat}
> ERROR [FlushWriter:474] 2014-05-20 12:10:31,137 CassandraDaemon.java (line 
> 198) Exception in thread Thread[FlushWriter:474,5,main]
> java.lang.IllegalArgumentException
> at java.nio.Buffer.position(Unknown Source)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:64)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
> at 
> org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:138)
> at 
> org.apache.cassandra.io.sstable.ColumnNameHelper.minComponents(ColumnNameHelper.java:103)
> at 
> org.apache.cassandra.db.ColumnFamily.getColumnStats(ColumnFamily.java:439)
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:194)
> at 
> org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:397)
> at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:350)
> at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> {noformat}





[jira] [Commented] (CASSANDRA-8494) incremental bootstrap

2014-12-16 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249114#comment-14249114
 ] 

Tupshin Harper commented on CASSANDRA-8494:
---

bq. I think the improved feedback will make a huge difference for people 
wondering if bootstrap is working!
So much this. Would make the ticket worthwhile by itself.

> incremental bootstrap
> -
>
> Key: CASSANDRA-8494
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8494
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jon Haddad
>Assignee: Yuki Morishita
>Priority: Minor
>  Labels: density
> Fix For: 3.0
>
>
> Current bootstrapping involves (to my knowledge) picking tokens and streaming 
> data before the node is available for requests.  This can be problematic with 
> "fat nodes", since it may require 20TB of data to be streamed over before the 
> machine can be useful.  This can result in a massive window of time before 
> the machine can do anything useful.
> As a potential approach to mitigate the huge window of time before a node is 
> available, I suggest modifying the bootstrap process to only acquire a single 
> initial token before being marked UP.  This would likely be a configuration 
> parameter "incremental_bootstrap" or something similar.
> After the node is bootstrapped with this one token, it could go into UP 
> state, and could then acquire additional tokens (one or a handful at a time), 
> which would be streamed over while the node is active and serving requests.  
> The benefit here is that with the default 256 tokens a node could become an 
> active part of the cluster with less than 1% of it's final data streamed over.
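The proposed flow — stream one token's data, mark the node UP, then acquire the remaining tokens while serving — can be sketched as follows (every function name here is a hypothetical stand-in for the real bootstrap machinery):

```python
def incremental_bootstrap(tokens, stream_token, mark_up, initial=1):
    """Bootstrap incrementally: stream `initial` tokens before joining,
    mark the node UP, then stream the rest while serving requests."""
    for t in tokens[:initial]:
        stream_token(t)   # blocking: node is not yet available
    mark_up()             # node joins the ring with ~1/num_tokens of its data
    for t in tokens[initial:]:
        stream_token(t)   # background: node already serves requests
```

With the default 256 vnodes, `initial=1` is what makes the node useful after streaming well under 1% of its final data.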





[jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting

2014-12-01 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230450#comment-14230450
 ] 

Tupshin Harper commented on CASSANDRA-8371:
---

FWIW, I believe that setting the parameter to less than a day will be a common 
case, and not an unusual one. For write-heavy, high velocity workloads, the 
additional read cost of reading from an extra (repair-created) sstable in the 
case of repair taking place after the segment is frozen will often be the 
correct optimization, in order to minimize write amplification at the expense 
of tiny additional read overhead.

> DateTieredCompactionStrategy is always compacting 
> --
>
> Key: CASSANDRA-8371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: mck
>Assignee: Björn Hegerfors
>  Labels: compaction, performance
> Attachments: java_gc_counts_rate-month.png, 
> read-latency-recommenders-adview.png, read-latency.png, 
> sstables-recommenders-adviews.png, sstables.png, vg2_iad-month.png
>
>
> Running 2.0.11 and having switched a table to 
> [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
> disk IO and gc count increase, along with the number of reads happening in 
> the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always 
> happening, and pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in 
> slides 7+8 of https://prezi.com/b9-aj6p2esft/
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
> screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
> to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under 
> (3) in the description of CASSANDRA-6602 ?





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226474#comment-14226474
 ] 

Tupshin Harper commented on CASSANDRA-7438:
---

[~xedin] I'm lost in too many layers of snark and indirection (not just yours). 
Can you elaborate on what strategy you actually find appealing?

> Serializing Row cache alternative (Fully off heap)
> --
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux
>Reporter: Vijay
>Assignee: Vijay
>  Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch
>
>
> Currently SerializingCache is partially off heap, keys are still stored in 
> JVM heap as BB, 
> * There is a higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tunning.
> * Overhead in Memory for the cache entries are relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with cache. We might want to ensure that the new 
> implementation match the existing API's (ICache), and the implementation 
> needs to have safe memory access, low overhead in memory and less memcpy's 
> (As much as possible).
> We might also want to make this cache configurable.





[jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting

2014-11-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226341#comment-14226341
 ] 

Tupshin Harper commented on CASSANDRA-8371:
---

And I'd also like to see an option to change max_sstable_age_days to be a 
smaller unit of time. Right now, you can only set it to integer days. 
Particularly with high ingestion rates, and low TTL, I see legitimate use cases 
where that could benefit from being as low as an hour, or even less, in order 
to minimize any write amplification.

Just switching to seconds as the unit of time here would make a lot of 
sense to me. 365 days would then be expressible as 31536000. :)


> DateTieredCompactionStrategy is always compacting 
> --
>
> Key: CASSANDRA-8371
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: mck
>Assignee: Björn Hegerfors
>  Labels: compaction, performance
> Attachments: java_gc_counts_rate-month.png, read-latency.png, 
> sstables.png, vg2_iad-month.png
>
>
> Running 2.0.11 and having switched a table to 
> [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602] we've seen that 
> disk IO and gc count increase, along with the number of reads happening in 
> the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always 
> happening, and pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in 
> slides 7+8 of https://prezi.com/b9-aj6p2esft
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and 
> screenshots of our munin graphs as we have gone from STCS, to LCS (week ~44), 
> to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under 
> (3) in the description of CASSANDRA-6602 ?





[jira] [Commented] (CASSANDRA-7826) support arbitrary nesting of collection

2014-11-06 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200853#comment-14200853
 ] 

Tupshin Harper commented on CASSANDRA-7826:
---

Correct interpretation of my end goal. I'm neutral on the need/benefit of doing 
frozen/nested in 2.x. Personally I'd be OK deferring full nesting of 
collections until unfrozen nested support in 3.0.

> support arbitrary nesting of collection
> ---
>
> Key: CASSANDRA-7826
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>Assignee: Tyler Hobbs
>  Labels: ponies
>
> The inability to nest collections is one of the bigger data modelling 
> limitations we have right now.





[jira] [Commented] (CASSANDRA-8225) Production-capable COPY FROM

2014-11-03 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194640#comment-14194640
 ] 

Tupshin Harper commented on CASSANDRA-8225:
---

FWIW, I agree wholeheartedly with Sylvain: the cqlsh-based approach (executing 
Python code) is a dead end for getting decent performance out of bulk loading.

> Production-capable COPY FROM
> 
>
> Key: CASSANDRA-8225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8225
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
> Fix For: 2.1.2
>
>
> Via [~schumacr],
> bq. I pulled down a sourceforge data generator and created a mock file of 
> 500,000 rows that had an incrementing sequence number, date, and SSN. I then 
> used our COPY command and MySQL's LOAD DATA INFILE to load the file on my 
> Mac. Results were: 
> {noformat}
> mysql> load data infile '/Users/robin/dev/datagen3.txt'  into table p_test  
> fields terminated by ',';
> Query OK, 50 rows affected (2.18 sec)
> {noformat}
> C* 2.1.0 (pre-CASSANDRA-7405)
> {noformat}
> cqlsh:dev> copy p_test from '/Users/robin/dev/datagen3.txt' with 
> delimiter=',';
> 50 rows imported in 16 minutes and 45.485 seconds.
> {noformat}
> Cassandra 2.1.1:
> {noformat}
> cqlsh:dev> copy p_test from '/Users/robin/dev/datagen3.txt' with 
> delimiter=',';
> Processed 50 rows; Write: 4037.46 rows/s
> 50 rows imported in 2 minutes and 3.058 seconds.
> {noformat}
> [jbellis] 7405 gets us almost an order of magnitude improvement.  
> Unfortunately we're still almost 2 orders slower than mysql.
> I don't think we can continue to tell people, "use sstableloader instead."  
> The number of users sophisticated enough to use the sstable writers is small 
> and (relatively) decreasing as our user base expands.





[jira] [Commented] (CASSANDRA-8168) Require Java 8

2014-10-28 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187203#comment-14187203
 ] 

Tupshin Harper commented on CASSANDRA-8168:
---

I'm also +1 on it, but more so if we can endorse openjdk8 (as opposed to just 
oracle jdk) from day 1.

> Require Java 8
> --
>
> Key: CASSANDRA-8168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8168
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
> Fix For: 3.0
>
>
> This is to discuss requiring Java 8 for version >= 3.0  
> There are a couple big reasons for this.
> * Better support for complex async work  e.g (CASSANDRA-5239)
> http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html
> * Use Nashorn for Javascript UDFs CASSANDRA-7395





[jira] [Reopened] (CASSANDRA-7028) Allow C* to compile under java 8

2014-09-14 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper reopened CASSANDRA-7028:
---

> Allow C* to compile under java 8
> 
>
> Key: CASSANDRA-7028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7028
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dave Brosius
>Assignee: Aleksey Yeschenko
>Priority: Minor
> Fix For: 2.1.1, 3.0
>
> Attachments: 7028.txt, 7028_v2.txt, 7028_v3.txt, 7028_v4.txt, 
> 7028_v5.patch
>
>
> antlr 3.2 has a problem with java 8, as described here: 
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8015656
> updating to antlr 3.5.2 solves this, however they have split up the jars 
> differently, which adds some changes, but also the generation of 
> CqlParser.java causes a method to be too large, so I needed to split that 
> method to reduce the size of it.
> (patch against trunk)





[jira] [Commented] (CASSANDRA-7849) Server logged error messages (in binary protocol) for unexpected exceptions could be more helpful

2014-09-07 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125066#comment-14125066
 ] 

Tupshin Harper commented on CASSANDRA-7849:
---

Strong +1 to disabling those kinds of messages except at debug level.  Less 
noise, please. 

> Server logged error messages (in binary protocol) for unexpected exceptions 
> could be more helpful
> -
>
> Key: CASSANDRA-7849
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7849
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: graham sanderson
> Fix For: 1.2.19, 2.0.11
>
> Attachments: cassandra-1.2-7849.txt
>
>
> From time to time (actually quite frequently) we get error messages in the 
> server logs like this
> {code}
> ERROR [Native-Transport-Requests:288] 2014-08-29 04:48:07,118 
> ErrorMessage.java (line 222) Unexpected exception during request
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> These particular cases are almost certainly problems with the client driver, 
> client machine, client process, however after the fact this particular 
> exception is practically impossible to debug because there is no indication 
> in the underlying JVM/netty exception of who the peer was. I should note we 
> have lots of different types of applications running against the cluster so 
> it is very hard to correlate these to anything





[jira] [Comment Edited] (CASSANDRA-7857) Ability to froze UDT

2014-09-01 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117740#comment-14117740
 ] 

Tupshin Harper edited comment on CASSANDRA-7857 at 9/1/14 9:39 PM:
---

 I'll suggest "fixed", as an alternative to frozen, static, or serialized. 


was (Author: tupshin):
 I'll sugget "fixed", as an alternative to frozen, static, or serialized. 

> Ability to froze UDT
> 
>
> Key: CASSANDRA-7857
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7857
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 2.1.0
>
> Attachments: 7857-v2.txt, 7857.txt
>
>
> Currently, UDT are serialized into a single value. For 3.0, we want to change 
> that somewhat and allow updating individual subfields: CASSANDRA-7423 (and 
> ultimately, we'll probably allow querying subparts of UDT to some extent). 
> Also for 3.0, we want to allow some nesting of collections (CASSANDRA-7826).
> However, migrating the currently serialized UDT would be challenging. Besides 
> that, even with nested collections, we probably won't be able to support 
> nesting within map keys and sets without serializing (at the very least, not 
> initially). Also, it can be useful in some specific case to have UDT or 
> collections for PK columns, even if those are serialized.
> So we need a better way to distinguish when a composite types (collections & 
> UDT) are serialized (which imply you can't update subpart of the value, you 
> have to rewrite it fully) and when they are not. The suggestion is then to 
> introduce a new keyword, {{frozen}}, to indicate that a type is serialized:
> {noformat}
> CREATE TYPE foo (a int, b int);
> CREATE TABLE bar (
> k frozen<foo> PRIMARY KEY,
> m map<frozen<list<int>>, text>
> )
> {noformat}
> A big advantage is that it makes the downside (you can't update the value 
> without rewriting it all) clear and upfront.
> Now, as of 2.1, we only support frozen UDT, and so we should make this clear 
> by 1) adding the frozen keyword and 2) don't allow use of UDT unless they are 
> "frozen" (since that's all we really support). This is what this ticket 
> proposes to do. And this should be done in 2.1.0 or this will be a breaking 
> change.
> We will have a follow-up ticket that will extend {{frozen}} to collection, 
> but this is less urgent since this will be strictly an improvement.
> I'll note that in term of syntax, {{serialized}} was suggested as an 
> alternative to {{frozen}}. I personally have a minor preference for 
> {{serialized}} but it was argued that it had a "sequential" connotation which 
> {{frozen}} don't have. Changing that is still up for discussion, but we need 
> to reach a decision quickly.





[jira] [Commented] (CASSANDRA-7857) Ability to freeze UDT

2014-09-01 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117740#comment-14117740
 ] 

Tupshin Harper commented on CASSANDRA-7857:
---

I'll suggest "fixed" as an alternative to frozen, static, or serialized. 

> Ability to freeze UDT
> 
>
> Key: CASSANDRA-7857
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7857
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 2.1.0
>
> Attachments: 7857-v2.txt, 7857.txt
>
>
> Currently, UDT are serialized into a single value. For 3.0, we want to change 
> that somewhat and allow updating individual subfields: CASSANDRA-7423 (and 
> ultimately, we'll probably allow querying subparts of UDTs to some extent). 
> Also for 3.0, we want to allow some nesting of collections (CASSANDRA-7826).
> However, migrating the currently serialized UDT would be challenging. Besides 
> that, even with nested collections, we probably won't be able to support 
> nesting within map keys and sets without serializing (at the very least, not 
> initially). Also, it can be useful in some specific cases to have UDT or 
> collections for PK columns, even if those are serialized.
> So we need a better way to distinguish when composite types (collections & 
> UDTs) are serialized (which implies you can't update a subpart of the value; you 
> have to rewrite it fully) and when they are not. The suggestion is then to 
> introduce a new keyword, {{frozen}}, to indicate that a type is serialized:
> {noformat}
> CREATE TYPE foo (a int, b int);
> CREATE TABLE bar (
> k frozen<foo> PRIMARY KEY,
> m map<frozen<map<int, int>>, text>
> )
> {noformat}
> A big advantage is that it makes the downside (you can't update the value 
> without rewriting it all) clear and upfront.
> Now, as of 2.1, we only support frozen UDT, and so we should make this clear 
> by 1) adding the frozen keyword and 2) don't allow use of UDT unless they are 
> "frozen" (since that's all we really support). This is what this ticket 
> proposes to do. And this should be done in 2.1.0 or this will be a breaking 
> change.
> We will have a follow-up ticket that will extend {{frozen}} to collection, 
> but this is less urgent since this will be strictly an improvement.
> I'll note that in terms of syntax, {{serialized}} was suggested as an 
> alternative to {{frozen}}. I personally have a minor preference for 
> {{serialized}}, but it was argued that it had a "sequential" connotation that 
> {{frozen}} doesn't have. Changing that is still up for discussion, but we need 
> to reach a decision quickly.





[jira] [Commented] (CASSANDRA-7826) support arbitrary nesting of collection

2014-08-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111690#comment-14111690
 ] 

Tupshin Harper commented on CASSANDRA-7826:
---

I'd much rather see it done right, with individual cell-level access, in 3.0 
rather than have it rushed in.

> support arbitrary nesting of collection
> ---
>
> Key: CASSANDRA-7826
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>Assignee: Tyler Hobbs
>  Labels: ponies
>
> The inability to nest collections is one of the bigger data modelling 
> limitations we have right now.





[jira] [Comment Edited] (CASSANDRA-7028) Allow C* to compile under java 8

2014-08-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110777#comment-14110777
 ] 

Tupshin Harper edited comment on CASSANDRA-7028 at 8/26/14 2:54 PM:


Re-opening and adding additional 2.1.1 target for [~tuxslayer]


was (Author: tupshin):
Re-opening and adding additional 2.1.1 target for [~skyline81]

> Allow C* to compile under java 8
> 
>
> Key: CASSANDRA-7028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7028
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dave Brosius
>Assignee: Aleksey Yeschenko
>Priority: Minor
> Fix For: 2.1.1, 3.0
>
> Attachments: 7028.txt, 7028_v2.txt, 7028_v3.txt, 7028_v4.txt, 
> 7028_v5.patch
>
>
> antlr 3.2 has a problem with java 8, as described here: 
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8015656
> updating to antlr 3.5.2 solves this; however, they have split up the jars 
> differently, which adds some changes. Also, the generation of 
> CqlParser.java causes a method to be too large, so I needed to split that 
> method to reduce its size.
> (patch against trunk)





[jira] [Updated] (CASSANDRA-7028) Allow C* to compile under java 8

2014-08-26 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7028:
--

Fix Version/s: 2.1.1
 Assignee: Aleksey Yeschenko  (was: Dave Brosius)

Re-opening and adding additional 2.1.1 target for [~skyline81]

> Allow C* to compile under java 8
> 
>
> Key: CASSANDRA-7028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7028
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Dave Brosius
>Assignee: Aleksey Yeschenko
>Priority: Minor
> Fix For: 2.1.1, 3.0
>
> Attachments: 7028.txt, 7028_v2.txt, 7028_v3.txt, 7028_v4.txt, 
> 7028_v5.patch
>
>
> antlr 3.2 has a problem with java 8, as described here: 
> http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8015656
> updating to antlr 3.5.2 solves this; however, they have split up the jars 
> differently, which adds some changes. Also, the generation of 
> CqlParser.java causes a method to be too large, so I needed to split that 
> method to reduce its size.
> (patch against trunk)





[jira] [Commented] (CASSANDRA-7826) support arbitrary nesting of collection

2014-08-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109449#comment-14109449
 ] 

Tupshin Harper commented on CASSANDRA-7826:
---

Actually, it might not require a specific nesting depth if you can nest the 
same UDT in itself (haven't tried since it doesn't matter in this case).

> support arbitrary nesting of collection
> ---
>
> Key: CASSANDRA-7826
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>  Labels: ponies
>
> The inability to nest collections is one of the bigger data modelling 
> limitations we have right now.





[jira] [Commented] (CASSANDRA-7826) support arbitrary nesting of collection

2014-08-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109443#comment-14109443
 ] 

Tupshin Harper commented on CASSANDRA-7826:
---

UDTs would require predefining a specific nesting depth, though that's not 
necessarily a huge obstacle. But without CASSANDRA-7423 I couldn't begin to 
recommend UDTs for most use cases.

> support arbitrary nesting of collection
> ---
>
> Key: CASSANDRA-7826
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>  Labels: ponies
>
> The inability to nest collections is one of the bigger data modelling 
> limitations we have right now.





[jira] [Created] (CASSANDRA-7826) support arbitrary nesting of collection

2014-08-25 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7826:
-

 Summary: support arbitrary nesting of collection
 Key: CASSANDRA-7826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7826
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
 Fix For: 3.0


The inability to nest collections is one of the bigger data modelling 
limitations we have right now.





[jira] [Commented] (CASSANDRA-7642) Adaptive Consistency

2014-08-22 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107052#comment-14107052
 ] 

Tupshin Harper commented on CASSANDRA-7642:
---

I don't like the min/max consistency terminology in the context of:

"Transparent downgrading violates the CL contract, and that contract is
considered to be just about the most important element of Cassandra's
runtime behaviour. Fully transparent downgrading without any contract
is dangerous. However, would it be a problem if we specify explicitly
only two discrete CL levels - MIN_CL and MAX_CL?"

I strongly believe that it is a problem even with only two explicit
levels specified.

As such, I propose two changes to the spec:

1) the terminology changes from min/max to terms representing "block
until" for max and "actual contractual consistency level" for min.
2) Even more critically, ensure that the protocol and driver
provide a communication mechanism back to the client indicating, for every
operation, which of the two CL levels was fulfilled by the request.

> Adaptive Consistency
> 
>
> Key: CASSANDRA-7642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7642
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Rustam Aliyev
> Fix For: 3.0
>
>
> h4. Problem
> At minimum, the application requires a consistency level of X, which must be a 
> fault-tolerant CL. However, when there is no failure it would be advantageous to 
> use a stronger consistency Y (Y > X).
> h4. Suggestion
> Application defines minimum (X) and maximum (Y) consistency levels. C* can 
> apply adaptive consistency logic to use Y whenever possible and downgrade to 
> X when failure occurs.
> Implementation should not negatively impact performance. Therefore, state has 
> to be maintained globally (not per request).
> h4. Example
> {{MIN_CL=LOCAL_QUORUM}}
> {{MAX_CL=EACH_QUORUM}}
> h4. Use Case
> Consider a case where a user wants to maximize their uptime and consistency. 
> They are designing a system using C* where transactions are read/written with 
> LOCAL_QUORUM and distributed across 2 DCs. Occasional inconsistencies between 
> DCs can be tolerated. R/W with LOCAL_QUORUM is satisfactory in most of the 
> cases.
> Application requires new transactions to be read back right after they were 
> generated. Write and read could be done through different DCs (no 
> stickiness). In some cases when user writes into DC1 and reads immediately 
> from DC2, replication delay may cause problems. Transaction won't show up on 
> read in DC2, user will retry and create duplicate transaction. Occasional 
> duplicates are fine and the goal is to minimize number of dups.
> Therefore, we want to perform writes with stronger consistency (EACH_QUORUM) 
> whenever possible without compromising on availability. Using adaptive 
> consistency they should be able to define:
>{{Read CL = LOCAL_QUORUM}}
>{{Write CL = ADAPTIVE (MIN:LOCAL_QUORUM, MAX:EACH_QUORUM)}}
> Similar scenario can be described for {{Write CL = ADAPTIVE (MIN:QUORUM, 
> MAX:ALL)}} case.
> h4. Criticism
> # This functionality can/should be implemented by the user.
> bq. It will be hard for an average user to implement topology monitoring and 
> state machine. Moreover, this is a pattern which repeats.
> # Transparent downgrading violates the CL contract, and that contract is 
> considered to be just about the most important element of Cassandra's runtime 
> behavior.
> bq. Fully transparent downgrading without any contract is dangerous. However, 
> would it be a problem if we specify explicitly only two discrete CL levels - 
> MIN_CL and MAX_CL?
> # If you have split brain DCs (partitioned in CAP), you have to sacrifice 
> either consistency or availability, and auto downgrading sacrifices the 
> consistency in dangerous ways if the application isn't designed to handle it. 
> And if the application is designed to handle it, then it should be able to 
> handle it in normal circumstances, not just degraded/extraordinary ones.
> bq. Agreed. Application should be designed for MIN_CL. In that case, MAX_CL 
> will not be causing much harm, only adding flexibility.
> # It might be a better idea to loudly downgrade, instead of silently 
> downgrading, meaning that the client code does an explicit retry with lower 
> consistency on failure and takes some other kind of action to attempt to 
> inform either users or operators of the problem. It is the silent part of the 
> downgrading that could be dangerous.
> bq. There are certainly cases where user should be informed when consistency 
> changes in order to perform custom action. For this purpose we could 
> allow/require user to register callback function which will be triggered when 
> consistency level changes. Best practices could be enforced by requiring 
> callback.
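The proposed behavior — attempt MAX_CL, downgrade loudly to MIN_CL on failure, and report back which level was actually fulfilled — could be sketched roughly as follows. All names here (`AdaptiveWriter`, `UnavailableError`, the string CL values) are invented for illustration; this is not any driver's real API.

```python
class UnavailableError(Exception):
    """Raised when too few replicas are alive for the requested CL."""

class AdaptiveWriter:
    """Try MAX_CL first; on failure, downgrade loudly to MIN_CL."""
    def __init__(self, min_cl, max_cl, on_downgrade=None):
        self.min_cl = min_cl
        self.max_cl = max_cl
        self.on_downgrade = on_downgrade  # callback, per the "loud downgrade" point

    def write(self, do_write):
        try:
            return do_write(self.max_cl), self.max_cl
        except UnavailableError:
            if self.on_downgrade:
                self.on_downgrade(self.max_cl, self.min_cl)
            # Report back which CL was actually fulfilled for this operation.
            return do_write(self.min_cl), self.min_cl

events = []

def dc2_down(cl):
    # Simulated cluster where a remote DC is unreachable: EACH_QUORUM fails.
    if cl == "EACH_QUORUM":
        raise UnavailableError("DC2 unreachable")
    return "ok"

writer = AdaptiveWriter("LOCAL_QUORUM", "EACH_QUORUM",
                        on_downgrade=lambda hi, lo: events.append((hi, lo)))
result, used_cl = writer.write(dc2_down)
```

Returning the fulfilled CL alongside the result is the key point: the downgrade is visible per operation, not silent.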




[jira] [Created] (CASSANDRA-7730) altering a table to add a static column bypasses clustering column requirement check

2014-08-09 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7730:
-

 Summary: altering a table to add a static column bypasses 
clustering column requirement check
 Key: CASSANDRA-7730
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7730
 Project: Cassandra
  Issue Type: Bug
Reporter: Tupshin Harper
 Fix For: 2.1.0


cqlsh:test_ks> create TABLE foo ( bar int, primary key (bar));
cqlsh:test_ks> alter table foo add bar2 text static;
cqlsh:test_ks> describe table foo;

CREATE TABLE foo (
  bar int,
  bar2 text static,
  PRIMARY KEY ((bar))
) 

cqlsh:test_ks> select * from foo;
TSocket read 0 bytes


ERROR [Thrift:12] 2014-08-09 15:08:22,518 CassandraDaemon.java (line 199) 
Exception in thread Thread[Thrift:12,5,main]
java.lang.AssertionError
at 
org.apache.cassandra.config.CFMetaData.getStaticColumnNameBuilder(CFMetaData.java:2142)
at 
org.apache.cassandra.cql3.statements.SelectStatement.makeFilter(SelectStatement.java:454)
at 
org.apache.cassandra.cql3.statements.SelectStatement.getRangeCommand(SelectStatement.java:360)
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:206)
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:61)
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:158)







[jira] [Commented] (CASSANDRA-7370) Create a new system table "node_config" to load cassandra.yaml config data.

2014-07-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074541#comment-14074541
 ] 

Tupshin Harper commented on CASSANDRA-7370:
---

While I'm very much in favor of this feature, I'd like to propose that the 
implementation get deferred and ultimately redone in terms of CASSANDRA-7622, 
so that we will have a more general mechanism for other similar needs.

> Create a new system table "node_config" to load cassandra.yaml config data.
> ---
>
> Key: CASSANDRA-7370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7370
> Project: Cassandra
>  Issue Type: Wish
>  Components: Config
>Reporter: Hayato Shimizu
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: ponies
> Attachments: 7370-v3.txt
>
>
> Currently the node configuration information specified in cassandra.yaml can 
> only be viewed via JMX or by looking at the file on individual machines.
> As an administrator, it would be extremely useful to be able to execute 
> queries like the following example;
> select concurrent_reads from system.node_config;
> which will list all the concurrent_reads value from all of the nodes in a 
> cluster.
> This will require a new table in the system keyspace and the data to be 
> loaded (if required) during the bootstrap, and updated when MBeans attribute 
> value updates are performed. The data from other nodes in the cluster is also 
> required in the table.





[jira] [Created] (CASSANDRA-7622) Implement virtual tables

2014-07-25 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7622:
-

 Summary: Implement virtual tables
 Key: CASSANDRA-7622
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
 Fix For: 3.0


There are a variety of reasons to want virtual tables, which would be any table 
that would be backed by an API, rather than data explicitly managed and stored 
as sstables.

One possible use case would be to expose JMX data through CQL as a resurrection 
of CASSANDRA-3527.

Another is a more general framework to implement the ability to expose yaml 
configuration information. So it would be an alternate approach to 
CASSANDRA-7370.

A possible implementation would be in terms of CASSANDRA-7443, but I am not 
presupposing.





[jira] [Updated] (CASSANDRA-7075) Add the ability to automatically distribute your commitlogs across all data volumes

2014-07-17 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7075:
--

Fix Version/s: 3.0

> Add the ability to automatically distribute your commitlogs across all data 
> volumes
> ---
>
> Key: CASSANDRA-7075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7075
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>  Labels: performance
> Fix For: 3.0
>
>
> Given the prevalence of SSDs (no need to separate commitlog and data) and 
> improved JBOD support, along with CASSANDRA-3578, it seems like we should 
> have an option to have one commitlog per data volume, to even out the load. I've 
> been seeing more and more cases where there isn't an obvious "extra" volume 
> to put the commitlog on, and sticking it on only one of the JBOD SSD 
> volumes leads to IO imbalance.





[jira] [Updated] (CASSANDRA-7026) CQL:WHERE ... IN with full partition keys

2014-07-16 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7026:
--

Fix Version/s: (was: 3.0)

> CQL:WHERE ... IN with full partition keys
> -
>
> Key: CASSANDRA-7026
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7026
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Dan Hunt
>Priority: Minor
>  Labels: cql
>
> It would be handy to be able to pass in a list of fully qualified composite 
> partition keys in an IN filter to retrieve multiple distinct rows with a 
> single select.  Not entirely sure how that would work.  It looks like maybe 
> it could be done with the existing token() function, like:
> SELECT * FROM table WHERE token(keyPartA, keyPartB) IN (token(1, 1), token(4, 
> 2))
> Though, I guess you'd also want some way to pass a list of tokens to a 
> prepared statement through the driver.  This of course all assumes that an IN 
> filter could be faster than a bunch of prepared statements, which might not 
> be true.





[jira] [Updated] (CASSANDRA-7026) CQL:WHERE ... IN with full partition keys

2014-07-16 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7026:
--

Fix Version/s: 3.0

> CQL:WHERE ... IN with full partition keys
> -
>
> Key: CASSANDRA-7026
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7026
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Dan Hunt
>Priority: Minor
>  Labels: cql
>
> It would be handy to be able to pass in a list of fully qualified composite 
> partition keys in an IN filter to retrieve multiple distinct rows with a 
> single select.  Not entirely sure how that would work.  It looks like maybe 
> it could be done with the existing token() function, like:
> SELECT * FROM table WHERE token(keyPartA, keyPartB) IN (token(1, 1), token(4, 
> 2))
> Though, I guess you'd also want some way to pass a list of tokens to a 
> prepared statement through the driver.  This of course all assumes that an IN 
> filter could be faster than a bunch of prepared statements, which might not 
> be true.





[jira] [Created] (CASSANDRA-7471) sstableloader should have the ability to strip ttls

2014-06-30 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7471:
-

 Summary: sstableloader should have the ability to strip ttls
 Key: CASSANDRA-7471
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7471
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Tupshin Harper
Priority: Minor


When restoring data from backup, for data recovery or analysis, if the data was 
written with a TTL, then some or all of the data will be inaccessible unless you 
either force your entire cluster to have its clocks set in the past, or slowly 
and painfully use sstable2json, strip the TTLs there, and then run json2sstable 
before loading. 

I propose a flag "-ignore-ttl" that could be passed to sstableloader and would 
automatically strip any TTLs from cells as they are loaded
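Conceptually, the proposed flag would rewrite each expiring cell as a regular cell during loading. A toy sketch of that transformation follows; the dict-based cell layout is invented for illustration and does not match sstableloader's internal cell representation.

```python
def strip_ttl(cell):
    # An expiring cell carries a ttl and a local deletion time; dropping both
    # turns it back into a regular, non-expiring cell.
    stripped = dict(cell)
    stripped.pop("ttl", None)
    stripped.pop("local_deletion_time", None)
    return stripped

cells = [
    {"name": "temp", "value": 21.5, "timestamp": 100, "ttl": 86400,
     "local_deletion_time": 1400000000},
    {"name": "id", "value": 7, "timestamp": 100},          # already regular
]
restored = [strip_ttl(c) for c in cells]
```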





[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions

2014-06-28 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046902#comment-14046902
 ] 

Tupshin Harper commented on CASSANDRA-7056:
---

I also want to point out that [~iamaleksey]'s response to global indexes 
(CASSANDRA-6477) was: "I think we should leave it to people's client code. We 
don't need more complexity on our read/write paths when this can be done 
client-side."

That combined with "alternatively, we just don't invent new unnecessary 
concepts (batch reads) to justify hypothetical things we could do that nobody 
asked us for" would leave us with absolutely no approach to achieving 
consistent cross-partition indexes through either client- or server-side code.

> Add RAMP transactions
> -
>
> Key: CASSANDRA-7056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.
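For context, the core idea of RAMP-Fast from the linked paper: each write carries the keys of its sibling writes in the same transaction, so a reader that sees only part of a transaction can detect the fractured read from that metadata and fetch the missing versions in a second round. A toy in-memory simulation (heavily simplified, with invented names; not the paper's full protocol):

```python
store = {}   # key -> {ts: (value, sibling_keys)}: all prepared versions
latest = {}  # key -> ts of the latest version visible to first-round reads

def prepare(ts, items):
    # Phase 1: install versions for every key in the transaction.
    for key, value in items.items():
        store.setdefault(key, {})[ts] = (value, frozenset(items) - {key})

def commit(ts, key):
    # Phase 2: advance the visibility pointer (commits may race per key).
    latest[key] = max(latest.get(key, 0), ts)

def ramp_read(keys):
    # Round 1: read the latest visible version of each key.
    seen = {k: latest.get(k, 0) for k in keys}
    result = {k: store[k][seen[k]][0] for k in keys if seen[k]}
    # Round 2: repair fractured reads using the sibling metadata.
    for k in keys:
        if not seen[k]:
            continue
        _, siblings = store[k][seen[k]]
        for sib in siblings & set(keys):
            if seen[sib] < seen[k]:
                # A sibling from our transaction is missing; fetch its
                # prepared version directly by transaction timestamp.
                result[sib] = store[sib][seen[k]][0]
    return result

prepare(1, {"x": "x1", "y": "y1"}); commit(1, "x"); commit(1, "y")
prepare(2, {"x": "x2", "y": "y2"}); commit(2, "x")  # commit to y delayed
out = ramp_read({"x", "y"})  # read-atomic despite the delayed commit
```

The relevance to this ticket is that read atomicity comes from metadata plus an extra read round, not from locking, which is why it looks attractive as a cheaper alternative to LWT for some cases.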





[jira] [Commented] (CASSANDRA-7463) Update CQLSSTableWriter to allow parallel writing of SSTables on the same table within the same JVM

2014-06-27 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046082#comment-14046082
 ] 

Tupshin Harper commented on CASSANDRA-7463:
---

Since we push people towards doing SSTableLoading for fast import, and since 
the CQLSSTableWriter is the new shiny way to create sstables, we need to make 
it easy to generate sstables in parallel. High priority, imo.

> Update CQLSSTableWriter to allow parallel writing of SSTables on the same 
> table within the same JVM
> ---
>
> Key: CASSANDRA-7463
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7463
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Johnny Miller
>
> Currently it is not possible to programatically write multiple SSTables for 
> the same table in parallel using the CQLSSTableWriter. This is quite a 
> limitation and the workaround of attempting to do this in a separate JVM is 
> not a great solution.
> See: 
> http://stackoverflow.com/questions/24396902/using-cqlsstablewriter-concurrently





[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions

2014-06-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044162#comment-14044162
 ] 

Tupshin Harper commented on CASSANDRA-7056:
---

I am absolutely fine with vetting it as part of another feature (indexes) before 
exposing new API to provide explicit support for RAMP transactions. I'm simply 
refuting the "hypothetical things we could do that nobody asked us for" part. 
Just because nobody thought to ask for this specific form of consistency 
doesn't mean the practical benefits are at all unclear.

> Add RAMP transactions
> -
>
> Key: CASSANDRA-7056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.





[jira] [Comment Edited] (CASSANDRA-7056) Add RAMP transactions

2014-06-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043367#comment-14043367
 ] 

Tupshin Harper edited comment on CASSANDRA-7056 at 6/25/14 12:25 PM:
-

Cross table consistent reads are of fundamental importance. 

Once you allow that they are useful for consistent index reads, then you have 
admitted that they are useful for direct consumption by users, since we are 
constantly advising them to build their own index solutions since 2i are 
horrendously weak.  That pressure will be only slightly reduced with global 
indexes. 

Even separate from custom (client-side) 2i implementations, having all-or-nothing 
read visibility of writes spanning tables captures fundamental business 
logic that is either painfully worked around today, or else is glossed over as 
statistically unlikely (depending on the r/w patterns) and the race conditions 
duly ignored. 

It would be a tragic mistake to ignore the benefits of the gains in correctness 
that can be achieved.


was (Author: tupshin):
Cross table consistent reads are of fundamental importance. 

Once you allow that they are useful for consistent index reads, then you have 
admitted that they are useful for for direction consumption by users, since we 
are constantly advising them to build their own index solutions since 2i are 
horrendously weak.  That pressure will be only slightly reduced with global 
indexes. 

Even separate from custom (client-side) 2i implementations, having all or 
nothing read visibility of writes spanning tables captures fundamental business 
logic that is either painfully worked around today, or else is glossed over as 
statistically unlikely (depending on the r/w patterns) and the race conditions 
duly ignored. 

It would be a tragic mistake to ignore the benefits of the gains in correctness 
that can be achieved.

> Add RAMP transactions
> -
>
> Key: CASSANDRA-7056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.





[jira] [Comment Edited] (CASSANDRA-7056) Add RAMP transactions

2014-06-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043367#comment-14043367
 ] 

Tupshin Harper edited comment on CASSANDRA-7056 at 6/25/14 12:05 PM:
-

Cross table consistent reads are of fundamental importance. 

Once you allow that they are useful for consistent index reads, then you have 
admitted that they are useful for for direction consumption by users, since we 
are constantly advising them to build their own index solutions since 2i are 
horrendously weak.  That pressure will be only slightly reduced with global 
indexes. 

Even separate from custom (client-side) 2i implementations, having all or 
nothing read visibility of writes spanning tables captures fundamental business 
logic that is either painfully worked around today, or else is glossed over as 
statistically unlikely (depending on the r/w patterns) and the race conditions 
duly ignored. 

It would be a tragic mistake to ignore the benefits of the gains in correctness 
that can be achieved.


was (Author: tupshin):
Cross table consistent reads are of fundamental importance. 

Once you allow that they are useful for consistent index reads, then you have 
admitted that they are useful for for direction consumption by users, since we 
are constantly advising them to build their own index solutions since 2i are 
horrendously weak.  That pressure will be only slightly reduced with global 
indexes. 

Even separate from custom (client-side) 2i implementations, having all or 
nothing read visibility of writes spanning partitions/tables captures 
fundamental business logic that is either painfully worked around today, or 
else is glossed over as statistically unlikely (depending on the r/w patterns) 
and the race conditions duly ignored. 

It would be a tragic mistake to ignore the benefits of the gains in correctness 
that can be achieved.

> Add RAMP transactions
> -
>
> Key: CASSANDRA-7056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.





[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions

2014-06-25 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043367#comment-14043367
 ] 

Tupshin Harper commented on CASSANDRA-7056:
---

Cross table consistent reads are of fundamental importance. 

Once you allow that they are useful for consistent index reads, then you have 
admitted that they are useful for direct consumption by users, since we have 
long been advising them to build their own index solutions because 2i are 
horrendously weak.  That pressure will be only slightly reduced with global 
indexes. 

Even separate from custom (client-side) 2i implementations, having all-or-nothing 
read visibility of writes spanning partitions/tables captures 
fundamental business logic that is either painfully worked around today, or 
else is glossed over as statistically unlikely (depending on the r/w patterns) 
and the race conditions duly ignored. 

It would be a tragic mistake to ignore the benefits of the gains in correctness 
that can be achieved.
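To make the proposal concrete, here is a minimal in-memory sketch of the RAMP-Fast read-atomicity idea from the linked post. Everything here is illustrative: the class, method names, and data layout are invented for this sketch and are neither Cassandra internals nor the paper's actual API. Each write installs versions tagged with the transaction timestamp and the full set of sibling keys; a reader uses that metadata to detect a fractured first-round read and repair it with a second round against already-prepared versions.

```python
# Minimal in-memory sketch of RAMP-Fast style read atomicity. Illustrative
# only: names and layout are invented, not Cassandra or RAMP-paper APIs.
import itertools

_tx_ids = itertools.count(1)

class RampStore:
    def __init__(self):
        self.versions = {}  # key -> {tx_ts: (value, sibling_keys)}
        self.latest = {}    # key -> highest committed (visible) tx_ts

    def prepare(self, items):
        """Phase 1: install versions everywhere, tagged with the full set of
        sibling keys, but do not make them visible yet."""
        ts = next(_tx_ids)
        sibs = frozenset(items)
        for k, v in items.items():
            self.versions.setdefault(k, {})[ts] = (v, sibs)
        return ts

    def commit(self, ts, keys):
        """Phase 2: make prepared versions visible (may land key by key)."""
        for k in keys:
            self.latest[k] = max(ts, self.latest.get(k, 0))

    def read_all(self, keys):
        # Round 1: read each key's latest visible version plus its metadata.
        first = {}
        for k in keys:
            ts = self.latest.get(k, 0)
            v, sibs = self.versions.get(k, {}).get(ts, (None, frozenset()))
            first[k] = (ts, v, sibs)
        # The sibling metadata exposes fractured reads: if another result came
        # from a newer transaction that also wrote k, re-fetch k at that
        # transaction's (already prepared) version in a second round.
        out = {}
        for k in keys:
            ts, v, _ = first[k]
            required = max([ts] + [tj for j in keys
                                   for (tj, _vj, sj) in [first[j]] if k in sj])
            out[k] = v if required == ts else self.versions[k][required][0]
        return out
```

Committing `x` before `y` simulates a multi-key write that is only partially visible; the reader nonetheless returns both values from the same transaction, which is exactly the all-or-nothing visibility argued for above.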

> Add RAMP transactions
> -
>
> Key: CASSANDRA-7056
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Tupshin Harper
>Priority: Minor
>
> We should take a look at 
> [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/]
>  transactions, and figure out if they can be used to provide more efficient 
> LWT (or LWT-like) operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7423) make user defined types useful for non-trivial use cases

2014-06-20 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7423:
--

Description: 
Since user defined types were implemented in CASSANDRA-5590 as blobs (you have 
to rewrite the entire type in order to make any modifications), they can't be 
safely used without LWT for any operation that wants to modify a subset of the 
UDT's fields by any client process that is not authoritative for the entire 
blob. 

When trying to use UDTs to model complex records (particularly with nesting), 
this is not an exceptional circumstance, this is the totally expected normal 
situation. 

The use of UDTs for anything non-trivial is harmful to either performance or 
consistency or both.

edit: to clarify, I believe that most potential uses of UDTs should be 
considered anti-patterns until/unless we have field-level r/w access to 
individual elements of the UDT, with individual timestamps and standard LWW 
semantics

  was:
Since user defined types were implemented in CASSANDRA-5590 as blobs (you have 
to rewrite the entire type in order to make any modifications), they can't be 
safely used without LWT for any operation that wants to modify a subset of the 
UDT's fields by any client process that is not authoritative for the entire 
blob. 

When trying to use UDTs to model complex records (particularly with nesting), 
this is not an exceptional circumstance, this is the totally expected normal 
situation. 

The use of UDTs for anything non-trivial is harmful to either performance or 
consistency or both.


> make user defined types useful for non-trivial use cases
> 
>
> Key: CASSANDRA-7423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7423
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API, Core
>Reporter: Tupshin Harper
>
> Since user defined types were implemented in CASSANDRA-5590 as blobs (you 
> have to rewrite the entire type in order to make any modifications), they 
> can't be safely used without LWT for any operation that wants to modify a 
> subset of the UDT's fields by any client process that is not authoritative 
> for the entire blob. 
> When trying to use UDTs to model complex records (particularly with nesting), 
> this is not an exceptional circumstance, this is the totally expected normal 
> situation. 
> The use of UDTs for anything non-trivial is harmful to either performance or 
> consistency or both.
> edit: to clarify, I believe that most potential uses of UDTs should be 
> considered anti-patterns until/unless we have field-level r/w access to 
> individual elements of the UDT, with individual timestamps and standard LWW 
> semantics
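The race described above can be shown in a few lines of plain Python (not Cassandra code; the merge functions and data shapes are invented for illustration). With blob semantics, two clients that each rewrite the whole UDT to change one field silently lose one of the updates; with per-field timestamps and standard LWW, both survive.

```python
# Illustrative sketch: whole-blob UDT updates vs field-level last-write-wins.

def merge_blob(a, b):
    """Whole-UDT semantics: the write with the higher timestamp replaces the
    entire value, so a concurrent single-field update is silently lost."""
    return a if a["ts"] >= b["ts"] else b

def merge_fields(a, b):
    """Field-level LWW: each field carries its own (value, timestamp) pair."""
    out = {}
    for f in set(a) | set(b):
        fa, fb = a.get(f, (None, -1)), b.get(f, (None, -1))
        out[f] = fa if fa[1] >= fb[1] else fb
    return out

# Two clients each read {street: Elm, city: Boston} and rewrite the whole
# UDT to change one field; the street update from ts=2 is clobbered.
w1 = {"ts": 2, "street": "Oak", "city": "Boston"}
w2 = {"ts": 3, "street": "Elm", "city": "Austin"}
lost = merge_blob(w1, w2)["street"]   # reverts to "Elm": w1's write vanished

# With per-field timestamps, both single-field updates survive.
base = {"street": ("Elm", 1), "city": ("Boston", 1)}
merged = merge_fields(merge_fields(base, {"street": ("Oak", 2)}),
                      {"city": ("Austin", 3)})
```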



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7370) Create a new system table "node_config" to load cassandra.yaml config data.

2014-06-20 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039248#comment-14039248
 ] 

Tupshin Harper commented on CASSANDRA-7370:
---

I'm +1 on abusing the system keyspace with virtual/phantom tables. They are a 
well-established RDBMS pattern that is conceptually simple (albeit not 
particularly elegant) and well understood.

Reflection could be leveraged to eliminate the need to maintain an up-to-date 
list of settings by hand.
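The reflection idea can be sketched as follows (illustrative Python; Cassandra's real Config class is Java, and the `Config` stand-in and `config_rows` helper here are invented): the virtual table's columns are derived from the config object itself, so no hand-maintained column list can go stale.

```python
# Sketch: derive the virtual table's rows from the config object by
# reflection, instead of keeping a hand-written list of settings.

class Config:  # invented stand-in for a node's parsed cassandra.yaml
    concurrent_reads = 32
    concurrent_writes = 32
    commitlog_sync = "periodic"

def config_rows(node_id, cfg):
    """One (node, setting, value) row per public attribute, found by
    reflecting over the config class; private names are skipped."""
    return [
        (node_id, name, getattr(cfg, name))
        for name in sorted(vars(type(cfg)))
        if not name.startswith("_")
    ]

rows = config_rows("10.0.0.1", Config())
# each row backs a result like: SELECT concurrent_reads FROM system.node_config
```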


> Create a new system table "node_config" to load cassandra.yaml config data.
> ---
>
> Key: CASSANDRA-7370
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7370
> Project: Cassandra
>  Issue Type: Wish
>  Components: Config
>Reporter: Hayato Shimizu
>Priority: Minor
>  Labels: ponies
>
> Currently the node configuration information specified in cassandra.yaml can 
> only be viewed via JMX or by looking at the file on individual machines.
> As an administrator, it would be extremely useful to be able to execute 
> queries like the following example;
> select concurrent_reads from system.node_config;
> which will list all the concurrent_reads value from all of the nodes in a 
> cluster.
> This will require a new table in the system keyspace and the data to be 
> loaded (if required) during the bootstrap, and updated when MBeans attribute 
> value updates are performed. The data from other nodes in the cluster is also 
> required in the table.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7423) make user defined types useful for non-trivial use cases

2014-06-20 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7423:
-

 Summary: make user defined types useful for non-trivial use cases
 Key: CASSANDRA-7423
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7423
 Project: Cassandra
  Issue Type: Improvement
  Components: API, Core
Reporter: Tupshin Harper


Since user defined types were implemented in CASSANDRA-5590 as blobs (you have 
to rewrite the entire type in order to make any modifications), they can't be 
safely used without LWT for any operation that wants to modify a subset of the 
UDT's fields by any client process that is not authoritative for the entire 
blob. 

When trying to use UDTs to model complex records (particularly with nesting), 
this is not an exceptional circumstance, this is the totally expected normal 
situation. 

The use of UDTs for anything non-trivial is harmful to either performance or 
consistency or both.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7156) Add a new seed provider for Apache Cloudstack platforms

2014-06-17 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033815#comment-14033815
 ] 

Tupshin Harper commented on CASSANDRA-7156:
---

I'm fine with linking to external github repos to provide additional seed 
providers, at least initially. There just needs to be very clear and 
straightforward instructions for building and deploying them.

> Add a new seed provider for Apache Cloudstack platforms
> ---
>
> Key: CASSANDRA-7156
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7156
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: needs access to a cloudstack API endpoint
>Reporter: Pierre-Yves Ritschard
>Assignee: Michael Shuler
>Priority: Minor
> Fix For: 2.0.9, 2.1.1
>
> Attachments: 0001-initial-work-on-a-cloudstack-seed-provider.patch
>
>
> The attached patch adds a new seed provider which queries a cloudstack API 
> endpoint for instances having a specific tag.
> The tag key and value can be controlled in the configuration file and will 
> default to 'cassandra_seed' and 'default'.
> The Cloudstack endpoint is configured by three parameters in the 
> configuration file: 'cloudstack_api_endpoint', 'cloudstack_api_key' and 
> 'cloudstack_api_secret'.
> By default, CloudstackSeedProvider fetches the IP address of the first 
> interface; if another index should be used, the nic_index parameter will 
> hold it.
> A typical configuration file would thus have:
> {code:yaml}
> seed_provider:
> - class_name: org.apache.cassandra.locator.CloudstackSeedProvider
>   parameters:
> - cloudstack_api_endpoint: "https://some.cloudstack.host"
> cloudstack_api_key: "X"
> cloudstack_api_secret: "X"
> tag_value: "my_cluster_name"
> {code}
> This introduces no new dependency and, together with CASSANDRA-7147, gives an 
> easy way of getting started on Cloudstack platforms.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6178) Consider allowing timestamp at the protocol level ... and deprecating server side timestamps

2014-06-13 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031203#comment-14031203
 ] 

Tupshin Harper commented on CASSANDRA-6178:
---

FWIW, I am very negative on client-side timestamps ever being mandatory.

> Consider allowing timestamp at the protocol level ... and deprecating server 
> side timestamps
> 
>
> Key: CASSANDRA-6178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6178
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>
> Generating timestamps server side by default for CQL has been done for 
> convenience, so that end-user don't have to provide one with every query.  
> However, doing it server side has the downside that updates made sequentially 
> by one single client (thread) are not guaranteed to have sequentially 
> increasing timestamps. Unless a client thread is always pinned to one 
> specific server connection, that is, but no good client driver out there 
> (including the Thrift driver) does that, because that's contradictory to 
> abstracting fault tolerance away from the driver user (and goes against most 
> sane load-balancing strategies).
> Very concretely, this means that if you write a very trivial test program 
> that sequentially insert a value and then erase it (or overwrite it), then, 
> if you let CQL pick timestamp server side, the deletion might not erase the 
> just inserted value (because the delete might reach a different coordinator 
> than the insert and thus get a lower timestamp). From the user point of view, 
> this is a very confusing behavior, and understandably so: if timestamps are 
> optional, you'd hope that they are least respect the sequentiality of 
> operation from a single client thread.
> Of course we do support client-side assigned timestamps so it's not like the 
> test above is not fixable. And you could argue that it's not a bug per se. 
> Still, it's a very confusing "default" behavior for something very simple, 
> which suggests it's not the best default.
> You could also argue that inserting a value and deleting/overwriting right 
> away (in the same thread) is not something real program often do. And indeed, 
> it's likely that in practice server-side timestamps work fine for most real 
> application. Still, it's too easy to get counter-intuitive behavior with 
> server-side timestamps and I think we should consider moving away from them.
> So what I'd suggest is that we push back the job of providing timestamp 
> client side. But to make it easy for the driver to generate it (rather than 
> the end user), we should allow providing said timestamp at the protocol level.
> As a side note, letting the client provide the timestamp would also have the 
> advantage of making it easy for the driver to retry failed operations with 
> their initial timestamp, so that retries are truly idempotent.
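The driver-side generator this implies can be sketched as below. This is a hedged illustration: production drivers use a similar monotonic scheme, but the class name and API here are invented, not any real driver's interface.

```python
# Sketch of a client-side monotonic timestamp generator: microsecond
# timestamps that never repeat or go backwards for one client, so a
# sequential INSERT-then-DELETE keeps its order no matter which coordinator
# each statement reaches. Illustrative only; not an actual driver API.
import threading
import time

class MonotonicTimestamps:
    def __init__(self):
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        """Return max(wall clock in microseconds, last + 1)."""
        with self._lock:
            now = int(time.time() * 1_000_000)
            self._last = max(now, self._last + 1)
            return self._last

gen = MonotonicTimestamps()
a, b = gen.next(), gen.next()
assert b > a  # the delete's timestamp always exceeds the insert's
```

Reusing the same generated timestamp on a retry is what makes the retry idempotent, as the last paragraph of the description notes.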



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7362) Be able to selectively "mount" a snapshot of a table as a read-only version of that table

2014-06-06 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7362:
--

Description: When doing batch jobs (thinking hive and shark as prominent 
examples) or repeated analysis of the same data, it can be challenging to get a 
consistent result if the data is changing under your feet. Rather than the low 
level CASSANDRA-2527, I propose that we add the capability to take a named 
snapshot  (exact uuid in 2.1 and later), and be able to activate and deactivate 
it as a regular sstable (e.g. myks.mytable snapshot could be activated as 
myks.mytable-longuuid). That table would be queryable just like any other, but 
would not be writable. Any attempt to insert or update would throw an 
exception.   (was: When doing batch jobs (thinking hive and shark as prominent 
examples) or repeated analysis of the same data, it can be challenging to get a 
consistent result if the data is changing under your feet. Rather than the low 
level CASSANDRA-2527, I propose that we add the capability to take a named 
snapshot  (exact uuid in 2.1 and later), and be able to activate and deactivate 
it as a regular sstable (e.g. myks.mytable snapshot could be activated as 
myks.mytable-longuuid). That table would be queryable just like any other, but 
would not be writable. Any attempt to insert or update would throw an 
exception. Because it would )

> Be able to selectively "mount" a snapshot of a table as a read-only version 
> of that table
> -
>
> Key: CASSANDRA-7362
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7362
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core, Tools
>Reporter: Tupshin Harper
>Priority: Minor
> Fix For: 3.0
>
>
> When doing batch jobs (thinking hive and shark as prominent examples) or 
> repeated analysis of the same data, it can be challenging to get a consistent 
> result if the data is changing under your feet. Rather than the low level 
> CASSANDRA-2527, I propose that we add the capability to take a named 
> snapshot  (exact uuid in 2.1 and later), and be able to activate and 
> deactivate it as a regular sstable (e.g. myks.mytable snapshot could be 
> activated as myks.mytable-longuuid). That table would be queryable just like 
> any other, but would not be writable. Any attempt to insert or update would 
> throw an exception. 
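The read-only mount semantics can be sketched in a few lines (illustrative Python; the class and its methods are invented, not Cassandra code): reads pass through to the snapshot's frozen data, and any mutation attempt raises, mirroring how `myks.mytable-longuuid` would behave.

```python
# Sketch of a mounted read-only snapshot table: queryable, never writable.
class ReadOnlySnapshot:
    def __init__(self, cells):
        self._cells = dict(cells)   # frozen view of the snapshot's data

    def get(self, key):
        return self._cells[key]

    def put(self, key, value):
        # Any insert or update against the mounted snapshot fails loudly.
        raise PermissionError("snapshot tables are read-only")

snap = ReadOnlySnapshot({"row1": "v"})
value = snap.get("row1")   # reads work like any other table
```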



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7362) Be able to selectively "mount" a snapshot of a table as a read-only version of that table

2014-06-06 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7362:
-

 Summary: Be able to selectively "mount" a snapshot of a table as a 
read-only version of that table
 Key: CASSANDRA-7362
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7362
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Tupshin Harper
Priority: Minor
 Fix For: 3.0


When doing batch jobs (thinking hive and shark as prominent examples) or 
repeated analysis of the same data, it can be challenging to get a consistent 
result if the data is changing under your feet. Rather than the low level 
CASSANDRA-2527, I propose that we add the capability to take a named snapshot  
(exact uuid in 2.1 and later), and be able to activate and deactivate it as a 
regular sstable (e.g. myks.mytable snapshot could be activated as 
myks.mytable-longuuid). That table would be queryable just like any other, but 
would not be writable. Any attempt to insert or update would throw an 
exception. Because it would 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7306) Support "edge dcs" with more flexible gossip

2014-05-27 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010194#comment-14010194
 ] 

Tupshin Harper commented on CASSANDRA-7306:
---

#1 is definitely more ill-defined than it should be. The main thing I'd want to 
see is good overall cluster stability and behavior with 100s of spoke DCs that 
each could be offline up to 50% of the time (as a useful baseline). Until, and 
unless, that is formally tested, I don't have too much to add.

> Support "edge dcs" with more flexible gossip
> 
>
> Key: CASSANDRA-7306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>  Labels: ponies
>
> As Cassandra clusters get bigger and bigger, and their topology becomes more 
> complex, there is more and more need for a notion of "hub" and "spoke" 
> datacenters.
> One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
> is the assumption that all dcs need to talk to each other (and be connected 
> all the time).
> This ticket is a vague placeholder with the goals of achieving:
> 1) better behavioral support for occasionally disconnected datacenters
> 2) explicit support for custom dc to dc routing. A simple approach would be 
> an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-7306) Support "edge dcs" with more flexible gossip

2014-05-27 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010194#comment-14010194
 ] 

Tupshin Harper edited comment on CASSANDRA-7306 at 5/27/14 7:56 PM:


#1 is definitely more ill-defined than it should be. The main thing I'd want to 
see is good overall cluster stability and behavior with 100s of spoke DCs, 
where each DC could be offline up to 50% of the time (as a useful baseline). 
Until, and unless, that is formally tested, I don't have too much to add.


was (Author: tupshin):
#1 is definitely more ill-defined than it should be. The main thing I'd want to 
see is good overall cluster stability and behavior with 100s of spoke DCs that 
each could be offline up to 50% of the time (as a useful baseline). Until, and 
unless, that is formally tested, I don't have too much to add.

> Support "edge dcs" with more flexible gossip
> 
>
> Key: CASSANDRA-7306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>  Labels: ponies
>
> As Cassandra clusters get bigger and bigger, and their topology becomes more 
> complex, there is more and more need for a notion of "hub" and "spoke" 
> datacenters.
> One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
> is the assumption that all dcs need to talk to each other (and be connected 
> all the time).
> This ticket is a vague placeholder with the goals of achieving:
> 1) better behavioral support for occasionally disconnected datacenters
> 2) explicit support for custom dc to dc routing. A simple approach would be 
> an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7306) Support "edge dcs" with more flexible gossip

2014-05-27 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7306:
-

 Summary: Support "edge dcs" with more flexible gossip
 Key: CASSANDRA-7306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper


As Cassandra clusters get bigger and bigger, and their topology becomes more 
complex, there is more and more need for a notion of "hub" and "spoke" 
datacenters.

One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
is the assumption that all dcs need to talk to each other (and be connected all 
the time).

This ticket is a vague placeholder with the goals of achieving:
1) better behavioral support for occasionally disconnected datacenters
2) explicit support for custom dc to dc routing. A simple approach would be an 
optional per-dc annotation of which other DCs that DC could gossip with.
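Goal 2, the per-DC annotation, can be sketched as a simple adjacency check (hypothetical config shape; the `GOSSIP_PEERS` map and `may_gossip` function are invented for illustration): spokes reach their hub, hubs reach each other and their spokes, and spokes never gossip directly with one another.

```python
# Sketch of per-DC gossip annotations: which other DCs each DC may gossip
# with. Names and the config shape are invented for illustration.
GOSSIP_PEERS = {
    "hub-us": {"hub-eu", "spoke-1", "spoke-2"},
    "hub-eu": {"hub-us"},
    "spoke-1": {"hub-us"},
    "spoke-2": {"hub-us"},
}

def may_gossip(dc_a, dc_b):
    """Allow gossip only if either side's annotation lists the other DC."""
    return (dc_b in GOSSIP_PEERS.get(dc_a, set())
            or dc_a in GOSSIP_PEERS.get(dc_b, set()))
```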



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (CASSANDRA-7306) Support "edge dcs" with more flexible gossip

2014-05-27 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper updated CASSANDRA-7306:
--

Labels: ponies  (was: )

> Support "edge dcs" with more flexible gossip
> 
>
> Key: CASSANDRA-7306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>  Labels: ponies
>
> As Cassandra clusters get bigger and bigger, and their topology becomes more 
> complex, there is more and more need for a notion of "hub" and "spoke" 
> datacenters.
> One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
> is the assumption that all dcs need to talk to each other (and be connected 
> all the time).
> This ticket is a vague placeholder with the goals of achieving:
> 1) better behavioral support for occasionally disconnected datacenters
> 2) explicit support for custom dc to dc routing. A simple approach would be 
> an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7297) semi-immutable CQL rows

2014-05-26 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008902#comment-14008902
 ] 

Tupshin Harper commented on CASSANDRA-7297:
---

The functionality described in CASSANDRA-6412 would provide a superset of this 
ticket.

> semi-immutable CQL rows
> ---
>
> Key: CASSANDRA-7297
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7297
> Project: Cassandra
>  Issue Type: Improvement
>  Components: API, Core
>Reporter: Tupshin Harper
>
> There are many use cases where data is immutable at the domain model level. 
> Most time-series/audit trail/logging applications fit this approach.
> A relatively simple way to implement a bare-bones version of this would be to 
> have a table-level schema option for "first writer wins", so that in the 
> event of any conflict, the more recent version would be thrown on the floor.
> Obviously, this is not failure proof in the face of inconsistent timestamps, 
> but that is a problem to be addressed outside of Cassandra.
> Optional additional features could include logging any non-identical cells 
> discarded due to collision.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7297) semi-immutable CQL rows

2014-05-24 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7297:
-

 Summary: semi-immutable CQL rows
 Key: CASSANDRA-7297
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7297
 Project: Cassandra
  Issue Type: Improvement
  Components: API, Core
Reporter: Tupshin Harper


There are many use cases where data is immutable at the domain model level. 
Most time-series/audit trail/logging applications fit this approach.

A relatively simple way to implement a bare-bones version of this would be to 
have a table-level schema option for "first writer wins", so that in the event 
of any conflict, the more recent version would be thrown on the floor.

Obviously, this is not failure proof in the face of inconsistent timestamps, 
but that is a problem to be addressed outside of Cassandra.

Optional additional features could include logging any non-identical cells 
discarded due to collision.
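The proposed reconciliation rule can be sketched directly (illustrative Python, with invented names; a cell here is just a `(value, timestamp)` pair): it is the inverse of Cassandra's usual last-write-wins merge, plus the optional collision logging mentioned above.

```python
# Sketch of "first writer wins" cell reconciliation.
def reconcile_fww(a, b, collision_log=None):
    """Keep the cell with the *lower* timestamp; optionally log the discarded
    cell when the values actually differ (the optional logging feature)."""
    keep, drop = (a, b) if a[1] <= b[1] else (b, a)
    if collision_log is not None and drop[0] != keep[0]:
        collision_log.append(drop)
    return keep

collisions = []
cell = reconcile_fww(("v1", 100), ("v2", 200), collisions)
# the more recent write ("v2", 200) is thrown on the floor and logged
```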




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7296) Add CL.COORDINATOR_ONLY

2014-05-24 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7296:
-

 Summary: Add CL.COORDINATOR_ONLY
 Key: CASSANDRA-7296
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7296
 Project: Cassandra
  Issue Type: Improvement
Reporter: Tupshin Harper


For reasons such as CASSANDRA-6340 and similar, it would be nice to have a read 
that never gets distributed, and only works if the coordinator you are talking 
to is an owner of the row.
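A hedged sketch of the proposed semantics (illustrative Python; the function, the `replicas` lookup, and the flat dict store are invented stand-ins): the coordinator answers purely from local data and errors out rather than forwarding the read.

```python
# Sketch of CL.COORDINATOR_ONLY: never distribute the read; fail if this
# coordinator is not a replica for the row.
def read_coordinator_only(local_node, replicas, key, local_store):
    """Serve the read only if this node owns the row; never proxy it."""
    if local_node not in replicas(key):
        raise RuntimeError("CL.COORDINATOR_ONLY: coordinator does not own row")
    return local_store.get(key)

owners = lambda key: {"n1", "n2"}   # pretend token-aware replica lookup
local_data = {"row": 42}
value = read_coordinator_only("n1", owners, "row", local_data)
```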



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-5394) Allow assigning disk quotas by keyspace

2014-05-24 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008189#comment-14008189
 ] 

Tupshin Harper commented on CASSANDRA-5394:
---

Grouping together multitenant feature requests. There might be a "soft cap" 
approach to make this one viable.

> Allow assigning disk quotas by keyspace
> ---
>
> Key: CASSANDRA-5394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5394
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: J.B. Langston
>Assignee: Tupshin Harper
>Priority: Minor
>
> A customer is requesting this. They are implementing a multi-tenant Cassandra 
> Service offering.  They want to limit the amount of diskspace that a user or 
> application can consume.  They would also want to be able to modify the quota 
> after the keyspace is set up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-5394) Allow assigning disk quotas by keyspace

2014-05-24 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper reassigned CASSANDRA-5394:
-

Assignee: Tupshin Harper

> Allow assigning disk quotas by keyspace
> ---
>
> Key: CASSANDRA-5394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5394
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: J.B. Langston
>Assignee: Tupshin Harper
>Priority: Minor
>
> A customer is requesting this. They are implementing a multi-tenant Cassandra 
> Service offering.  They want to limit the amount of diskspace that a user or 
> application can consume.  They would also want to be able to modify the quota 
> after the keyspace is set up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-841) Track statistics by user as well as CF

2014-05-24 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008188#comment-14008188
 ] 

Tupshin Harper commented on CASSANDRA-841:
--

Grouping together multitenant feature requests

> Track statistics by user as well as CF
> --
>
> Key: CASSANDRA-841
> URL: https://issues.apache.org/jira/browse/CASSANDRA-841
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Tupshin Harper
>Priority: Minor
> Fix For: 0.8 beta 1
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-841) Track statistics by user as well as CF

2014-05-24 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper reassigned CASSANDRA-841:


Assignee: Tupshin Harper

> Track statistics by user as well as CF
> --
>
> Key: CASSANDRA-841
> URL: https://issues.apache.org/jira/browse/CASSANDRA-841
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Tupshin Harper
>Priority: Minor
> Fix For: 0.8 beta 1
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (CASSANDRA-2068) Improvements for Multi-tenant clusters

2014-05-24 Thread Tupshin Harper (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tupshin Harper reassigned CASSANDRA-2068:
-

Assignee: Tupshin Harper

> Improvements for Multi-tenant clusters
> --
>
> Key: CASSANDRA-2068
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2068
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Goffinet
>Assignee: Tupshin Harper
>Priority: Minor
>
> It would be helpful if we could actually set some limits per CF to help 
> Multi-tenant clusters. Here are some ideas I was thinking:
> (per CF)
> 1.  Set an upper bound (max) for count when slicing or multi/get calls
> 2.  Set an upper bound (max) for how much data in bytes can be returned 
> (64KB, 512KB, 1MB, etc) 
> This would introduce new exceptions that can be thrown. 
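The two per-CF caps above can be sketched as a single guard over the rows a read returns (illustrative Python; the function and exception messages are invented, not Cassandra's):

```python
# Sketch of per-CF multi-tenant read limits: an upper bound on slice count
# and on bytes returned, raising the "new exceptions" the ticket mentions.
def enforce_limits(rows, max_count, max_bytes):
    """Stream rows through, aborting the read once either cap trips."""
    total_bytes, out = 0, []
    for row in rows:
        if len(out) >= max_count:
            raise RuntimeError("slice count limit exceeded")
        total_bytes += len(row)
        if total_bytes > max_bytes:
            raise RuntimeError("slice byte limit exceeded")
        out.append(row)
    return out
```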



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6602) Compaction improvements to optimize time series data

2014-05-16 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1401#comment-1401
 ] 

Tupshin Harper commented on CASSANDRA-6602:
---

Two comments:
1) Promising solution that I'd love to see validated and backported to at least 
2.1, and if at all possible, all the way to 2.0.x
2) I don't want to end up closing the issue and losing track of the approaches 
Benedict and I were talking about, so one or the other should become a new 
ticket.

> Compaction improvements to optimize time series data
> 
>
> Key: CASSANDRA-6602
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6602
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Tupshin Harper
>Assignee: Björn Hegerfors
>  Labels: compaction, performance
> Fix For: 3.0
>
> Attachments: 
> cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt, 
> cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v2.txt
>
>
> There are some unique characteristics of many/most time series use cases that 
> both provide challenges, as well as provide unique opportunities for 
> optimizations.
> One of the major challenges is in compaction. The existing compaction 
> strategies will tend to re-compact data on disk at least a few times over the 
> lifespan of each data point, greatly increasing the cpu and IO costs of that 
> write.
> Compaction exists to
> 1) ensure that there aren't too many files on disk
> 2) ensure that data that should be contiguous (part of the same partition) is 
> laid out contiguously
> 3) delete data due to TTLs or tombstones
> The special characteristics of time series data allow us to optimize away all 
> three.
> Time series data
> 1) tends to be delivered in time order, with relatively constrained exceptions
> 2) often has a pre-determined and fixed expiration date
> 3) Never gets deleted prior to TTL
> 4) Has relatively predictable ingestion rates
> Note that I filed CASSANDRA-5561 and this ticket potentially replaces or 
> lowers the need for it. In that ticket, jbellis reasonably asks how that 
> compaction strategy is better than disabling compaction.
> Taking that to heart, here is a compaction-strategy-less approach that could 
> be extremely efficient for time-series use cases that follow the above 
> pattern.
> (For context, I'm thinking of an example use case involving lots of streams 
> of time-series data with a 5GB per day ingestion rate, and a 1000 day 
> retention with TTL, resulting in an eventual steady state of 5TB per node)
> 1) You have an extremely large memtable (preferably off heap, if/when doable) 
> for the table, and that memtable is sized to be able to hold a lengthy window 
> of time. A typical period might be one day. At the end of that period, you 
> flush the contents of the memtable to an sstable and move to the next one. 
> This is basically identical to current behaviour, but with thresholds 
> adjusted so that you can ensure flushing at predictable intervals. (Open 
> question is whether predictable intervals is actually necessary, or whether 
> just waiting until the huge memtable is nearly full is sufficient)
> 2) Combine the behaviour with CASSANDRA-5228 so that whole sstables will be 
> efficiently dropped once all of their columns have expired. (Another side note, it 
> might be valuable to have a modified version of CASSANDRA-3974 that doesn't 
> bother storing per-column TTL since it is required that all columns have the 
> same TTL)
> 3) Be able to mark column families as read/write only (no explicit deletes), 
> so no tombstones.
> 4) Optionally add back an additional type of delete that would delete all 
> data earlier than a particular timestamp, resulting in immediate dropping of 
> obsoleted sstables.
> The result is that for in-order delivered data, every cell will be laid out 
> optimally on disk on the first pass, and over the course of 1000 days and 5TB 
> of data, there will "only" be 1000 5GB sstables, so the number of filehandles 
> will be reasonable.
> For exceptions (out-of-order delivery), most cases will be caught by the 
> extended (24 hour+) memtable flush times and merged correctly automatically. 
> For those that were slightly askew at flush time, or were delivered so far 
> out of order that they go in the wrong sstable, there is relatively low 
> overhead to reading from two sstables for a time slice, instead of one, and 
> that overhead would be incurred relatively rarely unless out-of-order 
> delivery was the common case, in which case, this strategy should not be used.
> Another possible optimization to address out-of-order would be to maintain 
> more than one time-centric memtable in memory at a time (e.g. two 12 hour 
> ones), and then you always ins
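The whole-sstable expiry behaviour proposed in step 2 can be sketched in a few lines. This is an illustrative toy, not Cassandra's implementation; all names (`SSTable`, `droppable`, `fully_expired`) are hypothetical. The key property is that with one uniform TTL per table and day-sized flushes, a file can simply be unlinked, with no compaction at all, once even its newest cell is past TTL:

```python
class SSTable:
    """Hypothetical stand-in for an immutable on-disk table."""
    def __init__(self, min_ts, max_ts, ttl_seconds):
        self.min_ts = min_ts      # oldest write timestamp in the file (seconds)
        self.max_ts = max_ts      # newest write timestamp in the file (seconds)
        self.ttl = ttl_seconds    # uniform TTL shared by every cell

    def fully_expired(self, now):
        # The whole file is safe to delete once even its newest cell is past TTL.
        return now >= self.max_ts + self.ttl

def droppable(sstables, now):
    """Return sstables that can be unlinked without any compaction."""
    return [s for s in sstables if s.fully_expired(now)]

DAY = 86_400
# Three daily flushes with a 1000-day TTL, as in the 5 GB/day example above.
tables = [SSTable(d * DAY, (d + 1) * DAY, 1000 * DAY) for d in range(3)]
now = 1001 * DAY + 1               # just past day 1001
expired = droppable(tables, now)   # only the day-0 file qualifies
```

In steady state this check replaces all compaction work for expiry: each daily file is written once and deleted whole 1000 days later.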

[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-05 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989553#comment-13989553
 ] 

Tupshin Harper commented on CASSANDRA-6696:
---

It does, thanks.

> Drive replacement in JBOD can cause data to reappear. 
> --
>
> Key: CASSANDRA-6696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Marcus Eriksson
> Fix For: 3.0
>
>
> In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
> empty one and repair is run. 
> This can cause deleted data to come back in some cases. This is also true for 
> corrupt sstables, where we delete the corrupt sstable and run repair. 
> Here is an example:
> Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
> row=sankalp col=sankalp is written 20 days back and successfully went to all 
> three nodes. 
> Then a delete/tombstone was written successfully for the same row column 15 
> days back. 
> Since this tombstone is older than gc grace, it got purged on nodes A and B 
> when it was compacted together with the actual data. So there is no trace of 
> this row column in nodes A and B.
> Now in node C, say the original data is in drive1 and tombstone is in drive2. 
> Compaction has not yet reclaimed the data and tombstone.  
> Drive2 becomes corrupt and was replaced with new empty drive. 
> Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
> has come back to life. 
> Now after replacing the drive we run repair. This data will be propagated to 
> all nodes. 
> Note: This is still a problem even if we run repair every gc grace. 
>  
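The resurrection sequence quoted above can be reproduced with a toy model. Replicas are dicts mapping cell name to write day; `compact`, `lose_drive`, and `repair` are hypothetical simplifications of the real mechanisms (real repair streams ranges via Merkle trees, not a dict union), but the ordering of events is the one from the description:

```python
GC_GRACE_DAYS = 10

def compact(replica, today):
    # When data and its tombstone meet in one compaction and the tombstone is
    # past gc_grace, both are purged for good.
    ts = replica.get("tombstone")
    if ts is not None and today - ts > GC_GRACE_DAYS and "data" in replica:
        del replica["tombstone"]
        del replica["data"]

def lose_drive(replica, cell):
    # A failed JBOD drive takes whichever cells lived on it.
    replica.pop(cell, None)

def repair(replicas):
    # Naive anti-entropy: every replica ends up with the union of all cells.
    merged = {}
    for r in replicas:
        merged.update(r)
    for r in replicas:
        r.update(merged)

today = 20
a = {"data": 0, "tombstone": 5}     # written day 0, deleted day 5
b = {"data": 0, "tombstone": 5}
c = {"data": 0, "tombstone": 5}     # on C: data on drive1, tombstone on drive2

compact(a, today)                   # A purges data + tombstone (15 > gc_grace)
compact(b, today)                   # so does B
lose_drive(c, "tombstone")          # drive2 dies before C ever compacts
repair([a, b, c])                   # the "deleted" data spreads back to A and B
```

After `repair`, all three replicas hold the data and none holds the tombstone, which is exactly the resurrection the ticket describes, and running repair every gc_grace does not prevent it.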



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-05 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989543#comment-13989543
 ] 

Tupshin Harper commented on CASSANDRA-6696:
---

I may be misunderstanding, but this seems to be optimizing for compaction 
throughput/parallelization, but at the expense of doing more total compaction 
activity (number of compactions per mutation over the life of that mutation, a 
form of write amplification) by starting with smaller sstables. 

If that's not the case, then please ignore, but it is important to note that 
for the largest scale, highest velocity, longest retained use cases, it's the 
number of recompactions/write amplification that really hurts.
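For a sense of scale, here is a back-of-envelope calculation using the 5 GB/day ingestion figure from the time-series example; the "~4 recompactions over a cell's lifetime" number is an assumption for illustration, not a measured value:

```python
def total_device_writes(ingest_bytes, recompactions):
    # Every compaction a cell participates in rewrites that cell once more,
    # on top of its original flush.
    return ingest_bytes * (1 + recompactions)

GB = 1024 ** 3
daily_ingest = 5 * GB                                 # 5 GB/day of new data
tiered = total_device_writes(daily_ingest, 4)         # assume ~4 recompactions
append_only = total_device_writes(daily_ingest, 0)    # flushed once, never rewritten
ratio = tiered / append_only                          # device writes per ingested byte
```

Under these assumed numbers, the tiered layout pushes 5x more bytes through the devices than an append-only one for the same ingest, which is the write-amplification cost the comment is pointing at.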

> Drive replacement in JBOD can cause data to reappear. 
> --
>
> Key: CASSANDRA-6696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Marcus Eriksson
> Fix For: 3.0
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7136) Change default paths to ~ instead of /var

2014-05-02 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987842#comment-13987842
 ] 

Tupshin Harper commented on CASSANDRA-7136:
---

+1

> Change default paths to ~ instead of /var
> -
>
> Key: CASSANDRA-7136
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7136
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Albert P Tobey
> Fix For: 2.1.0
>
>
> Defaulting to /var makes it more difficult for both multi-user systems and 
> people unfamiliar with the command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-02 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987835#comment-13987835
 ] 

Tupshin Harper commented on CASSANDRA-6696:
---

They are basically splittable and resizable vnodes, as if you were to use 
shuffled vnodes with a byte-ordered partitioner. That gives them more in common 
with CQL partitions than with vnodes, from a "range of data" point of view, 
except that the size of the ranges doesn't vary with the data model like it 
does with Cassandra. 

> Drive replacement in JBOD can cause data to reappear. 
> --
>
> Key: CASSANDRA-6696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Marcus Eriksson
> Fix For: 3.0
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-6696) Drive replacement in JBOD can cause data to reappear.

2014-05-02 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987837#comment-13987837
 ] 

Tupshin Harper commented on CASSANDRA-6696:
---

Hbase actually has pluggable compaction strategies these days. 

> Drive replacement in JBOD can cause data to reappear. 
> --
>
> Key: CASSANDRA-6696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Marcus Eriksson
> Fix For: 3.0
>
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7136) Change default paths to ~ instead of /var

2014-05-02 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987799#comment-13987799
 ] 

Tupshin Harper commented on CASSANDRA-7136:
---

$CASSANDRA_HOME, and if not set, extracted_location/data. 
That's the only right answer. 

> Change default paths to ~ instead of /var
> -
>
> Key: CASSANDRA-7136
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7136
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Albert P Tobey
> Fix For: 2.1.0
>
>
> Defaulting to /var makes it more difficult for both multi-user systems and 
> people unfamiliar with the command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-3783) Add 'null' support to CQL 3.0

2014-04-23 Thread Tupshin Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978277#comment-13978277
 ] 

Tupshin Harper commented on CASSANDRA-3783:
---

Hi Dmytro,

This ticket contracted from its original scope and turned into just support for 
upserting a null actually performing a delete operation on the cell. There is 
currently no select support for indexed nulls, and given the design of 
Cassandra, that is considered a difficult/prohibitive problem.

> Add 'null' support to CQL 3.0
> -
>
> Key: CASSANDRA-3783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3783
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Reporter: Sylvain Lebresne
>Assignee: Michał Michalski
>Priority: Minor
>  Labels: cql3
> Fix For: 1.2.4
>
> Attachments: 3783-v2.patch, 3783-v3.patch, 3783-v4.txt, 3783-v5.txt, 
> 3783-wip-v1.patch
>
>
> Dense composites support adding records where only a prefix of all the 
> components specifying the key is defined. In other words, with:
> {noformat}
> CREATE TABLE connections (
>userid int,
>ip text,
>port int,
>protocol text,
>time timestamp,
>PRIMARY KEY (userid, ip, port, protocol)
> ) WITH COMPACT STORAGE
> {noformat}
> you can insert
> {noformat}
> INSERT INTO connections (userid, ip, port, time) VALUES (2, '192.168.0.1', 
> 80, 123456789);
> {noformat}
> You cannot however select that column specifically (i.e, without selecting 
> column (2, '192.168.0.1', 80, 'http') for instance).
> This ticket proposes to allow that though 'null', i.e. to allow
> {noformat}
> SELECT * FROM connections WHERE userid = 2 AND ip = '192.168.0.1' AND port = 
> 80 AND protocol = null;
> {noformat}
> It would then also make sense to support:
> {noformat}
> INSERT INTO connections (userid, ip, port, protocol, time) VALUES (2, 
> '192.168.0.1', 80, null, 123456789);
> {noformat}
> as an equivalent to the insert query above.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7075) Add the ability to automatically distribute your commitlogs across all data volumes

2014-04-23 Thread Tupshin Harper (JIRA)
Tupshin Harper created CASSANDRA-7075:
-

 Summary: Add the ability to automatically distribute your 
commitlogs across all data volumes
 Key: CASSANDRA-7075
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7075
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
 Environment: given the prevalence of ssds (no need to separate 
commitlog and data), and improved jbod support, along with 
[#3578|https://issues.apache.org/jira/browse/CASSANDRA-3578], it seems like we 
should have an option to have one commitlog per data volume, to even the load. 
i've been seeing more and more cases where there isn't an obvious "extra" 
volume to put the commitlog on, and sticking it on only one of the jbodded ssd 
volumes leads to IO imbalance.
Reporter: Tupshin Harper
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.2#6252)

