from:"Kenneth Brotman"

Is node restart required to update yaml changes in 2.1x

2018-03-12 Thread Kenneth Brotman

Can you apply changes to cassandra.yaml in version 2.1.x without restarting
the node?

 

Kenneth Brotman

RE: system.size_estimates - safe to remove sstables?

2018-03-12 Thread Kenneth Brotman

Kunal,

Is this the GCE cluster you are speaking of in the “Adding new DC?” thread?

Kenneth Brotman

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] 
Sent: Sunday, March 11, 2018 2:18 PM
To: user@cassandra.apache.org
Subject: Re: system.size_estimates - safe to remove sstables?

Finally, got a chance to work on it over the weekend.

It worked as advertised. :)

Thanks a lot, Chris.

Kunal

On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

Thanks a lot, Chris.

Will try it today/tomorrow and update here.

Thanks,

Kunal

On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

While it's off, you can delete the files in the directory, yeah.

Chris

On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

Hi Chris,

I checked for snapshots and backups - none found.

Also, we're not using opscenter, hadoop or spark or any such tool.

So, do you think we can just remove the cf and restart the service?

Thanks,

Kunal

On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

Any chance the space is used by snapshots? What files exist there that are 
taking up space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> 
> wrote:
>

> Hi all,
>
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it shows 
> up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> output.
>
> This is while the other node is chugging along - shows only 25MiB consumed by 
> size_estimates (du -sh output).
>
> Any idea why there is this discrepancy?
> Is it safe to remove the size_estimates sstables from the affected node and 
> restart the service?
>
> Thanks,
> Kunal
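
The `du -sh` comparison in the question can be scripted to see which table directory is consuming the space. Below is a minimal Python sketch, assuming each table lives in its own subdirectory under a keyspace directory. The path in the usage note is an assumption, not something stated in the thread. The totals count file sizes rather than disk blocks, so they may differ slightly from `du`.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file may disappear between listing and stat.
                pass
    return total


def largest_subdirs(parent, top=5):
    """Return up to `top` (size_in_bytes, name) pairs, largest first."""
    sizes = []
    for entry in os.scandir(parent):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]
```

For example, `largest_subdirs("/var/lib/cassandra/data/system")` would list the biggest table directories in the system keyspace, assuming that is where the data directory lives on the node.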

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org

What snitch to use with AWS and Google

2018-03-12 Thread Kenneth Brotman

Quick question:  If you have one cluster made of nodes of a datacenter in
AWS and a datacenter in Google, what snitch do you use?

 

Kenneth Brotman

RE: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kenneth Brotman

I didn’t understand something.  Are you saying you are using one data center on 
Google and one on Amazon?

 

Kenneth Brotman

 

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] 
Sent: Monday, March 12, 2018 4:24 PM
To: user@cassandra.apache.org
Cc: Nikhil Soman
Subject: Re: [EXTERNAL] RE: Adding new DC?

 

 

On 13 March 2018 at 03:28, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

You can’t migrate and upgrade at the same time perhaps but you could do one and 
then the other so as to end up on new version.  I’m guessing it’s an error in 
the yaml file or a port not open.  Is there any good reason for a production 
cluster to still be on version 2.1x?

 

I'm not trying to migrate AND upgrade at the same time. However, the apt repo 
shows only 2.1.20 as the available version.

This is the output from the new node in AWS

 

ubuntu@ip-10-0-43-213:~$ apt-cache policy cassandra 
cassandra: 
 Installed: 2.1.20 
 Candidate: 2.1.20 
 Version table: 
*** 2.1.20 500 
   500 http://www.apache.org/dist/cassandra/debian 21x/main amd64 Packages 
   100 /var/lib/dpkg/status

Regarding open ports, I can cqlsh into the GCE node(s) from the AWS node.

As I mentioned earlier, I've opened the ports 9042, 7000, 7001 in GCE firewall 
for the public IP of the AWS instance.
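
A quick way to confirm the firewall rules from either side is to attempt a plain TCP connection to each port. The sketch below is a minimal Python check. The port list uses Cassandra's usual defaults (7000 inter-node, 7001 inter-node over SSL, 9042 native CQL), and the host you pass in is whatever public IP you are testing. A successful connect only shows that the port is reachable; it does not prove that gossip is advertising the right address.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=(7000, 7001, 9042), timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Running `check_ports(...)` from the AWS node against each GCE public IP, and again in the opposite direction, would show whether the rules are symmetric.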

 

I mentioned earlier - there are some differences in the column types - for 
example, date (>= 2.2) vs. timestamp (2.1.x)

The application has not been updated yet.

Hence sticking to 2.1.x for now.

 

And, so far, 2.1.x has been serving the purpose.



Kunal

 

 

Kenneth Brotman

 

From: Durity, Sean R [mailto:sean_r_dur...@homedepot.com] 
Sent: Monday, March 12, 2018 11:36 AM
To: user@cassandra.apache.org
Subject: RE: [EXTERNAL] RE: Adding new DC?

 

You cannot migrate and upgrade at the same time across major versions. 
Streaming is (usually) not compatible between versions.

 

As to the migration question, I would expect that you may need to put the 
external-facing ip addresses in several places in the cassandra.yaml file. And, 
yes, it would require a restart. Why is a non-restart more desirable? Most 
Cassandra changes require a restart, but you can do a rolling restart and not 
impact your application. This is fairly normal admin work and can/should be 
automated.

 

How large is the cluster to migrate (# of nodes and size of data)? The 
preferred method might depend on how much data needs to move. Is any 
application outage acceptable?

 

Sean Durity

lord of the (C*) rings (Staff Systems Engineer – Cassandra)

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] 
Sent: Sunday, March 11, 2018 10:20 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] RE: Adding new DC?

 

Hi Kenneth,

 

Replies inline below.

 

On 12-Mar-2018 3:40 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote:

Hi Kunal,

 

That version of Cassandra is too far before me so I’ll let others answer.  I 
was wondering why you wouldn’t want to end up on 3.0x if you’re going through 
all the trouble of migrating anyway?

 

 

Application side constraints - some data types are different between 2.1.x and 
3.x (for example, date vs. timestamp).

 

Besides, this is production setup - so, cannot take risk.

Are both data centers in the same region on AWS?  Can you provide yaml file for 
us to see?

 

 

No, they are in different regions - GCE setup is in us-east while AWS setup is 
in Asia-south (Mumbai)

 

Thanks,

Kunal

Kenneth Brotman

 


RE: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kenneth Brotman

Kunal,

 

Please provide the following settings from the yaml files you are using:

 

seeds: 

listen_address: 

broadcast_address: 

rpc_address: 

endpoint_snitch: 

auto_bootstrap: 
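
Collecting those values by hand from several nodes is error-prone, so a small script can pull them out. This is a minimal Python sketch under stated assumptions: it is a line-based scan rather than a real YAML parser, it relies on `seeds` appearing as a `- seeds: "..."` entry nested under `seed_provider` (as in the stock cassandra.yaml), and it skips commented-out lines. A key that is not set in the file comes back as `None`.

```python
import re

KEYS = ("seeds", "listen_address", "broadcast_address",
        "rpc_address", "endpoint_snitch", "auto_bootstrap")

# Optional leading "- " so that the nested "- seeds: ..." entry also matches.
_LINE = re.compile(r"^\s*(?:-\s*)?([A-Za-z_]+):\s*(.*?)\s*$")


def extract_settings(yaml_text, keys=KEYS):
    """Return {key: value or None} for the requested top-level style settings."""
    found = {key: None for key in keys}
    for line in yaml_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # commented-out setting, not in effect
        match = _LINE.match(line)
        if match and match.group(1) in found:
            # Drop a trailing inline comment, then surrounding quotes.
            value = match.group(2).split(" #", 1)[0].strip()
            found[match.group(1)] = value.strip("'\"") or None
    return found
```

Something like `extract_settings(open(path).read())` on each node, where `path` is wherever cassandra.yaml lives on that install, gives one dictionary per node that can be pasted into a reply.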

 

Kenneth Brotman

 

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] 
Sent: Monday, March 12, 2018 4:13 PM
To: user@cassandra.apache.org
Cc: Nikhil Soman
Subject: Re: [EXTERNAL] RE: Adding new DC?

 

 

On 13 March 2018 at 00:06, Durity, Sean R <sean_r_dur...@homedepot.com> wrote:

You cannot migrate and upgrade at the same time across major versions. 
Streaming is (usually) not compatible between versions.

 

I'm not trying to upgrade as of now - first priority is the migration.

We can look at version upgrade later on.

 

 

As to the migration question, I would expect that you may need to put the 
external-facing ip addresses in several places in the cassandra.yaml file. And, 
yes, it would require a restart. Why is a non-restart more desirable? Most 
Cassandra changes require a restart, but you can do a rolling restart and not 
impact your application. This is fairly normal admin work and can/should be 
automated.

 

I just tried setting the broadcast_address in one of the instances in GCE to 
its public IP and restarted the service.

However, it now shows all other nodes (in GCE) as DN in nodetool status output 
and the other nodes also report this node as DN with its internal/private IP 
address.

 

I also tried setting the broadcast_rpc_address to the internal/private IP 
address - still the same.

 

 

How large is the cluster to migrate (# of nodes and size of data)? The 
preferred method might depend on how much data needs to move. Is any 
application outage acceptable?

 

No. of nodes: 5

RF: 3

Data size (as reported by the load factor in nodetool status output): ~30GB per 
node
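
Those three figures allow a rough estimate of how much data would have to be streamed to a new datacenter. The sketch below is back-of-the-envelope arithmetic only. It assumes data is spread evenly and that the reported load is all live data. The size and replication factor of the new DC are hypothetical inputs, since the thread does not state them.

```python
def migration_estimate(nodes, load_per_node_gb, rf, new_dc_nodes, new_dc_rf):
    """Rough data-volume estimate for adding a datacenter (all values in GB)."""
    total_on_disk = nodes * load_per_node_gb   # all replicas in the old DC
    unique_data = total_on_disk / rf           # one copy of every row
    streamed = unique_data * new_dc_rf         # copies the new DC must hold
    return {
        "total_on_disk_gb": total_on_disk,
        "unique_data_gb": unique_data,
        "streamed_to_new_dc_gb": streamed,
        "per_new_node_gb": streamed / new_dc_nodes,
    }
```

With 5 nodes at about 30 GB each and RF 3, that is about 150 GB on disk and about 50 GB of unique data. A new DC with the same node count and RF would again hold about 30 GB per node.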

 

Thanks,
Kunal

 

 

Sean Durity

lord of the (C*) rings (Staff Systems Engineer – Cassandra)


RE: [EXTERNAL] RE: Adding new DC?

2018-03-12 Thread Kenneth Brotman

You can’t migrate and upgrade at the same time perhaps but you could do one and 
then the other so as to end up on new version.  I’m guessing it’s an error in 
the yaml file or a port not open.  Is there any good reason for a production 
cluster to still be on version 2.1x?

 

Kenneth Brotman

 


RE: What versions should the documentation support now?

2018-03-12 Thread Kenneth Brotman

It seems like the documentation in trunk for version 3.0 should include 
information for users of versions 3.0 and 2.1; the documentation for 4.0 
(when it's released) should include information for users of 4.0 and at 
least one previous version, etc.

 

How about if we do it that way?

 

Kenneth Brotman

 

From: Jonathan Haddad [mailto:j...@jonhaddad.com] 
Sent: Monday, March 12, 2018 9:10 AM
To: user@cassandra.apache.org
Subject: Re: What versions should the documentation support now?

 

Right now they can’t.

On Mon, Mar 12, 2018 at 9:03 AM Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

I see how that makes sense Jon but how does a user then select the 
documentation for the version they are running on the Apache Cassandra web site?

 

Kenneth Brotman

 

From: Jonathan Haddad [mailto:j...@jonhaddad.com] 
Sent: Monday, March 12, 2018 8:40 AM


To: user@cassandra.apache.org
Subject: Re: What versions should the documentation support now?

 

The docs are in tree, meaning they are versioned, and should be written for the 
version they correspond to. Trunk docs should reflect the current state of 
trunk, and shouldn’t have caveats for other versions. 

On Mon, Mar 12, 2018 at 8:15 AM Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

If we use DataStax’s example, we would have instructions for v3.0 and v2.1.  
How’s that?  

 

We should have instructions for the cloud platforms like AWS, but how do 
you do that and stay vendor neutral?

 

Kenneth Brotman

 

From: Hannu Kröger [mailto:hkro...@gmail.com] 
Sent: Monday, March 12, 2018 7:40 AM
To: user@cassandra.apache.org
Subject: Re: What versions should the documentation support now?

 

In my opinion, good documentation should somehow include version-specific 
pieces of information, whether it is a nodetool command that came in a certain 
version, a parameter for something, or something else.

 

That would be very useful. It’s confusing if I see documentation talking about 
4.0 specifics and I try to find that in my 3.11.x.

 

Hannu

 

On 12 Mar 2018, at 16:38, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> wrote:

 

I’m unclear what versions are most popular right now? What version are you 
running?

 

What version should still be supported in the documentation?  For example, I’m 
turning my attention back to writing a section on adding a data center.  What 
versions should I support in that information?

 

I’m working on it right now.  Thanks,

 

Kenneth Brotman

RE: What versions should the documentation support now?

2018-03-12 Thread Kenneth Brotman

I see how that makes sense Jon but how does a user then select the 
documentation for the version they are running on the Apache Cassandra web site?

 

Kenneth Brotman

 


RE: What versions should the documentation support now?

2018-03-12 Thread Kenneth Brotman

If we use DataStax’s example, we would have instructions for v3.0 and v2.1.  
How’s that?  

 

We should have instructions for the cloud platforms like AWS, but how do 
you do that and stay vendor neutral?

 

Kenneth Brotman

 


What versions should the documentation support now?

2018-03-12 Thread Kenneth Brotman

I'm unclear what versions are most popular right now? What version are you
running?

 

What version should still be supported in the documentation?  For example,
I'm turning my attention back to writing a section on adding a data center.
What versions should I support in that information?

 

I'm working on it right now.  Thanks,

 

Kenneth Brotman

RE: Adding new DC?

2018-03-11 Thread Kenneth Brotman

Hi Kunal,

 

That version of Cassandra is too far before me so I’ll let others answer.  I 
was wondering why you wouldn’t want to end up on 3.0x if you’re going through 
all the trouble of migrating anyway?

 

Are both data centers in the same region on AWS?  Can you provide yaml file for 
us to see?

 

Kenneth Brotman

 

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] 
Sent: Sunday, March 11, 2018 2:32 PM
To: user@cassandra.apache.org
Subject: Adding new DC?

 

Hi all,

 

We currently have a cluster in GCE for one of the customers.

They want it to be migrated to AWS.

 

I have setup one node in AWS to join into the cluster by following:

https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html

 

Will add more nodes once the first one joins successfully.

 

The node in AWS has an elastic IP - which is white-listed for ports 7000-7001, 
7199, 9042 in GCE firewall.

 

The snitch is set to GossipingPropertyFileSnitch. The GCE setup has dc=DC1, 
rack=RAC1 while on AWS, I changed the DC to dc=DC2.
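
With GossipingPropertyFileSnitch, each node reads its own datacenter and rack from `cassandra-rackdc.properties`, so it is worth confirming what each node actually has on disk. The sketch below is a minimal Python parser for that simple `key=value` format. The sample text mirrors the dc/rack names from this message, and the file's location varies by install, so no path is assumed here.

```python
def parse_rackdc(text):
    """Parse key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '=' sign
            props[key.strip()] = value.strip()
    return props


# Illustrative contents for the AWS node described above.
aws_node = parse_rackdc("# new datacenter\ndc=DC2\nrack=RAC1\n")
```

Comparing the parsed `dc` value across nodes makes a typo or a stale DC name easy to spot before the node is started.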

 

When I start cassandra service on the AWS instance, I see the version handshake 
msgs in the logs trying to connect to the public IPs of the GCE nodes:

OutboundTcpConnection.java:496 - Handshaking version with /xx.xx.xx.xx

However, nodetool status output on both sides don't show the other side at all. 
That is, the GCE setup doesn't show the new DC (dc=DC2) and the AWS setup 
doesn't show old DC (dc=DC1).

 

In cassandra.yaml file, I'm only using listen_interface and rpc_interface 
settings - no explicit IP addresses used - so, ends up using the internal 
private IP ranges.

 

Do I need to explicitly add the broadcast_address? for both side?

Would that require restarting of cassandra service on GCE side? Or is it 
possible to change that setting on-the-fly without a restart?

 

I would prefer a non-restart option.

 

PS: The cassandra version running in GCE is 2.1.18 while the new node setup in 
AWS is running 2.1.20 - just in case if that's relevant

 

Thanks,


Kunal

Your Invitation to Participate in Fourteen Really Cool Really Important Projects

2018-03-11 Thread Kenneth Brotman

that uses artificial intelligence and block chains to assemble bodies of
knowledge and competencies, assess competencies, identify gaps in desired
competencies, identify learning solutions and curate reliable credentials of
competencies demonstrated.

14.  Virtual Distributed Asynchronous Conferences: A project to
provide all the benefits of attending a conference while reaching everyone
in a way where no one is geographically disadvantaged, with no travel time,
no travel expenses and no ticket fees which makes it accessible to a lot of
people that otherwise would have to miss out, and with lower production
costs and simpler administrative workloads that allow faster
implementations.

 

Kenneth Brotman

RE: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Kenneth Brotman

Yasir,

 

How many nodes are in the cluster?  

What is num_tokens set to in the cassandra.yaml file?  

Is it just this one node doing this?  

What replication factor do you use that affects the ranges on that disk?

 

Kenneth Brotman

 

From: Kyrylo Lebediev [mailto:kyrylo_lebed...@epam.com] 
Sent: Friday, March 09, 2018 4:14 AM
To: user@cassandra.apache.org
Subject: Re: uneven data movement in one of the disk in Cassandra

 

Not sure where I heard this, but AFAIK data imbalance when multiple
data_directories are in use is a known issue for older versions of
Cassandra. This might be the root-cause of your issue. 

Which version of C* are you using?

Unfortunately, I don't remember in which version this imbalance issue was
fixed.

 

-- Kyrill

  _  

From: Yasir Saleem <yasirsaleem9...@gmail.com>
Sent: Friday, March 9, 2018 1:34:08 PM
To: user@cassandra.apache.org
Subject: Re: uneven data movement in one of the disk in Cassandra 

 

Hi Alex, 

 

no active compaction, right now.

 



 

On Fri, Mar 9, 2018 at 3:47 PM, Oleksandr Shulgin
<oleksandr.shul...@zalando.de> wrote:

On Fri, Mar 9, 2018 at 11:40 AM, Yasir Saleem <yasirsaleem9...@gmail.com>
wrote:

Thanks, Nicolas Guyomar 

 

I am new to cassandra, here are the properties which I can see in yaml file: 

 

# of compaction, including validation compaction.

compaction_throughput_mb_per_sec: 16

compaction_large_partition_warning_threshold_mb: 100

 

To check currently active compaction please use this command:

 

nodetool compactionstats -H

 

on the host which shows the problem.

 

--

Alex

Cassandra at Instagram with Dikang Gu interview by Jeff Carpenter

2018-03-06 Thread Kenneth Brotman

Just released on the DataStax Distributed Data Show: Dikang Gu of Instagram,
interviewed by author Jeff Carpenter.

Found it really interesting: shadow clustering, migrating from 2.2 to 3.0,
using RocksDB as a pluggable storage engine for Cassandra.

https://academy.datastax.com/content/distributed-data-show-episode-37-cassandra-instagram-dikang-gu

 

Kenneth Brotman

Please give input or approval of JIRA 14128 so we can continue document cleanup

2018-03-04 Thread Kenneth Brotman

Two months ago Kurt Greaves cleaned up the home page of the website, which
currently has broken links and other issues.  We need to get that JIRA
wrapped up.  Further improvements, scores of them, are coming.  Get ready!
Please take time soon to review the patch he submitted.  

https://issues.apache.org/jira/browse/CASSANDRA-14128

 

Kenneth Brotman

Which version is the best version to run now?

2018-03-02 Thread Kenneth Brotman

Seems like a lot of people are running old versions of Cassandra.  What is
the best, most reliable, stable version to use now?

 

Kenneth Brotman

RE: failing GOSSIP on localhost flooding the debug log

2018-03-02 Thread Kenneth Brotman

Way to go Marco!

From: Marco Giovannini [mailto:usern...@gmail.com] 
Sent: Friday, March 02, 2018 2:45 AM
To: Nicolas Guyomar
Cc: user@cassandra.apache.org
Subject: Re: failing GOSSIP on localhost flooding the debug log

CASSANDRA-14285  

Regards,
Marco

On Fri, Mar 2, 2018 at 11:33 AM, Marco Giovannini  wrote:

Hi, 

I'll use your code to fill up a Jira ticket.

Regards,
Marco

On Fri, Mar 2, 2018 at 11:26 AM, Marco Giovannini  wrote:

Hi

Your morning guess ended up being right. :)

Sometimes a couple of fresh eyes are priceless. 

Thanks Nicolas. 

Regards,
Marco

On Fri, Mar 2, 2018 at 11:14 AM, Nicolas Guyomar  
wrote:

Hi Marco,

Could that be because your seed list has an extra comma at the end of the line, 
thus being interpreted by default as localhost by Cassandra? And because you 
are listening on the node IP, localhost is not reachable (need to check the 
code to be sure).

Here => seeds: '10.1.20.10,10.1.21.10,10.1.22.10,' 

Wild morning guess ;) 
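
Nicolas's guess, which Marco confirms later in the thread, was that the trailing comma leaves an empty entry in the seed list. The sketch below is a minimal Python check for spotting that before a restart. It only inspects the string and does not reproduce how Cassandra itself parses the value.

```python
def parse_seeds(seeds_value):
    """Split a comma-separated seed string.

    Returns (seeds, empty_positions): the non-empty entries, and the
    zero-based positions of any empty entries caused by stray commas.
    """
    entries = [part.strip() for part in seeds_value.split(",")]
    empty_positions = [i for i, part in enumerate(entries) if not part]
    seeds = [part for part in entries if part]
    return seeds, empty_positions


seeds, empty = parse_seeds("10.1.20.10,10.1.21.10,10.1.22.10,")
if empty:
    print("empty seed entries at positions:", empty)
```

A non-empty second value means the line needs cleaning up; for the seed string quoted above it reports one empty entry after the third address.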

On 2 March 2018 at 11:06, Marco Giovannini  wrote:

Hi,

I'm running a Cassandra cluster of 3 nodes on AWS across 3 AZ (every instance 
has only one interface).

Cassandra version is 3.11.1. 

My debug log gets flooded with messages like this one, but the cluster works 
fine.

DEBUG [MessagingService-Outgoing-localhost/127.0.0.1-Gossip] 2018-02-28 
15:53:57,314 OutboundTcpConnection.java:545 - Unable to connect to 
localhost/127.0.0.1

Have you ever seen this issue on your cluster?

I attach my conf. 

I tried to play with the broadcast setting too as I found in the previous 
discussion but without success. 

#broadcast_address: "10.1.20.10"

https://lists.apache.org/thread.html/%3c561a65ed-7828-40ab-834d-de2dc8e57...@cisco.com%3E

Regards,
Marco


RE: Slender Cassandra Cluster Project

2018-03-01 Thread Kenneth Brotman

For Slender Cassandra (an 18 node cluster for illustrating best practices) to 
communicate between regions I seem to have three choices for Amazon Web 
Services:
1. VPN
2. VPC Peering
3. Over the world wide web

What is the correct choice?

This article on best practices for running Cassandra on AWS just came out by 
the way: 
https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-cassandra-on-amazon-ec2/
I'm not sure all of the information in it is correct.  Is it?

Kenneth Brotman

-Original Message-
From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Wednesday, January 31, 2018 4:20 PM
To: 'user@cassandra.apache.org'
Subject: RE: Slender Cassandra Cluster Project

Thank you Yuri and Michael for the suggestion.  Yes, a Terraform version makes 
sense.  Will do.

Kenneth Brotman

-Original Message-
From: Yuri Subach [mailto:ysub...@gmail.com]
Sent: Wednesday, January 31, 2018 7:20 AM
To: user@cassandra.apache.org
Subject: Re: Slender Cassandra Cluster Project

Yes, I'd prefer Terraform too.

On 2018-01-31 06:32:21, Michael Mior <mm...@uwaterloo.ca> wrote:
> While whatever format this comes out in would be helpful, you might 
> want to consider Terraform. 1Password recently published a blog post 
> on their experience with Terraform vs. CloudFormation.
> 
> https://blog.agilebits.com/2018/01/25/terraforming-1password/
> 
> --
> Michael Mior
> mm...@apache.org
> 
> 2018-01-31 2:34 GMT-05:00 Kenneth Brotman <kenbrot...@yahoo.com.invalid>:
> 
> > Hi Yuri,
> >
> > If possible I will do everything with AWS Cloudformation.  I'm 
> > working on it now.  Nothing published yet.
> >
> > Kenneth Brotman
> >
> > -Original Message-
> > From: Yuri Subach [mailto:ysub...@gmail.com]
> > Sent: Tuesday, January 30, 2018 7:02 PM
> > To: user@cassandra.apache.org
> > Subject: RE: Slender Cassandra Cluster Project
> >
> > Hi Kenneth,
> >
> > I like this project idea!
> >
> > A couple of questions:
> > - What tools are you going to use for AWS cluster setup?
> > - Do you have anything published already (github)?
> >
> > On 2018-01-22 22:42:11, Kenneth Brotman 
> > <kenbrot...@yahoo.com.INVALID>
> > wrote:
> > > Thanks Anthony!  I’ve made a note to include that information in 
> > > the
> > documentation. You’re right.  It won’t work as intended unless that 
> > is configured properly.
> > >
> > >
> > >
> > > I’m also favoring a couple other guidelines for Slender Cassandra:
> > >
> > > 1.   SSD’s only, no spinning disks
> > >
> > > 2.   At least two cores per node
> > >
> > >
> > >
> > > For AWS, I’m favoring the c3.large on Linux.  It’s available in 
> > > these
> > regions: US-East, US-West and US-West2.  The specifications are listed as:
> > >
> > > · Two (2) vCPU’s
> > >
> > > · 3.7 Gib Memory
> > >
> > > · Two (2) 16 GB SSD’s
> > >
> > > · Moderate I/O
> > >
> > >
> > >
> > > It’s going to be hard to beat the inexpensive cost of operating a
> > Slender Cluster on demand in the cloud – and it fits a lot of the 
> > use cases
> > well:
> > >
> > >
> > >
> > > · For under a $100 a month, in current pricing for EC2
> > instances, you can operate an eighteen (18) node Slender Cluster for 
> > five
> > (5) hours a day, ten (10) days a month.  That’s fine for 
> > demonstrations, teaching or experiments that last half a day or less.
> > >
> > > · For under $20, you can have that Slender Cluster up all day
> > long, up to ten (10) hours, for whatever demonstrations or 
> > experiments you want it for.
> > >
> > >
> > >
> > > As always, feedback is encouraged.
> > >
> > >
> > >
> > > Thanks,
> > >
> > >
> > >
> > > Kenneth Brotman
> > >
> > >
> > >
> > > From: Anthony Grasso [mailto:anthony.gra...@gmail.com]
> > > Sent: Sunday, January 21, 2018 3:57 PM
> > > To: user
> > > Subject: Re: Slender Cassandra Cluster Project
> > >
> > >
> > >
> > > Hi Kenneth,
> > >
> > >
> > >
> > > Fantastic idea!
> > >
> > >
> > >
> > > One thing that came to mind from my reading of the proposed setup 
> > > was
> > rack awareness of each node. Given that the proposed setup contains 
> > three

RE: The home page of Cassandra is mobile friendly but the link to the third parties is not

2018-03-01 Thread Kenneth Brotman

Thanks Nicolas.  That worked.  Definitely overlap.  Nice work on hints and read 
repair Kurt.  I added the relevant proposed text to Issue #’s 14270 and 14271.

 

Kenneth Brotman

 

From: Nicolas Guyomar [mailto:nicolas.guyo...@gmail.com] 
Sent: Thursday, March 01, 2018 1:42 AM
To: user@cassandra.apache.org
Subject: Re: The home page of Cassandra is mobile friendly but the link to the 
third parties is not

 

Works for me : https://issues.apache.org/jira/browse/CASSANDRA-14128 

 

On 1 March 2018 at 10:36, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

The link for 14128 doesn’t work and I can’t find it anywhere.  

 

Kenneth Brotman

 

From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Wednesday, February 28, 2018 8:39 PM
To: User
Subject: Re: The home page of Cassandra is mobile friendly but the link to the 
third parties is not

 

Already addressed in CASSANDRA-14128 
<https://issues.apache.org/jira/browse/CASSANDRA-14128>; however, it is waiting 
on review/comments regarding what we actually do with this page.

 

If you want to bring attention to JIRA's, the user list is probably 
appropriate. I'd avoid spamming it too much though.

 

On 26 February 2018 at 19:22, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

The home page of Cassandra is mobile friendly but the link to the third parties 
from that web page is not.  Any suggestions?  

 

I made a JIRA for it: https://issues.apache.org/jira/browse/CASSANDRA-14263

 

Should posts about JIRA’s be on this list or the dev list?

 

Kenneth Brotman

RE: Filling in the blank To Do sections on the Apache Cassandra web site

2018-03-01 Thread Kenneth Brotman

I’ve got Sphinx on my PC, in a VirtualBox image and going in a Docker 
container.  I’m tinkering with it to figure out the best setup going forward.  
If the VirtualBox image ends up easy to use I might add an image file to the 
repository for others. 

I’m really wanting to get the documentation on the website caught up and get us 
past that so I’m going to:

1. Hunt down content from different sources including posts to 
this group and start throwing content into the various JIRA’s, then

2. Begin to render away the duplicate information and post on the 
JIRA a cleaner draft, then

3. Rewrite the content (of myself and others) into more and more 
well composed drafts.

From there, aside from some of the graphics that might need to be 
contributed, it’s pretty much low hanging fruit at this point to finish it up. 
Hopefully everyone is okay with this approach.

Kenneth Brotman 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Tuesday, February 27, 2018 9:56 AM
To: 'user@cassandra.apache.org'
Subject: RE: Filling in the blank To Do sections on the Apache Cassandra web 
site

I was just getting ready to install sphinx.  Cool.  

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Tuesday, February 27, 2018 9:51 AM
To: user@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

The docs have been in tree for years :)

https://github.com/apache/cassandra/tree/trunk/doc

There’s even a docker image to build them so you don’t need to mess with 
sphinx.  Check the README for instructions.

Jon

On Feb 27, 2018, at 9:49 AM, Carl Mueller <carl.muel...@smartthings.com> wrote:

If there was a github for the docs, we could start posting content to it for 
review. Not sure what the review/contribution process is on Apache. Google 
searches on apache documentation and similar run into lots of noise from actual 
projects.

I wouldn't mind trying to do a little doc work on the regular if there was a 
wiki, a proven means to do collaborative docs. 

On Tue, Feb 27, 2018 at 11:42 AM, Kenneth Brotman 
<kenbrot...@yahoo.com.invalid> wrote:

It’s just content for web pages.  There isn’t a working outline or any draft on 
any of the JIRA’s yet.  I like to keep things simple.  Did I miss something?  
What does it matter right now?

Thanks Carl,

Kenneth Brotman

From: Carl Mueller [mailto:carl.muel...@smartthings.com] 
Sent: Tuesday, February 27, 2018 8:50 AM
To: user@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

so... are those pages in the code tree of github? I don't see them or a 
directory structure under /doc. Is mirroring the documentation between the 
apache site and a github source a big issue?

On Tue, Feb 27, 2018 at 7:50 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

I was debating that.  Splitting it up into smaller tasks makes each one seem 
less over-whelming.  

Kenneth Brotman

From: Josh McKenzie [mailto:jmcken...@apache.org] 
Sent: Tuesday, February 27, 2018 5:44 AM
To: cassandra
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

Might help, organizationally, to put all these efforts under a single ticket of 
"Improve web site Documentation" and add these as sub-tasks. Should be able to 
do that translation post-creation (i.e. in its current state) if that's 
something that makes sense to you.

On Mon, Feb 26, 2018 at 5:24 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Here are the related JIRA’s.  Please add content even if it’s not well formed 
compositionally.  Someone else or I will take it from there.

https://issues.apache.org/jira/browse/CASSANDRA-14274  The troubleshooting 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14273  The Bulk Loading web 
page on the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14272  The Backups web page on 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14271  The Hints web page in 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14270  The Read repair web page 
is empty

https://issues.apache.org/jira/browse/CASSANDRA-14269  The Data Modeling 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14268  The 
Architecture:Guarantees web page is empty

https://issues.apache.org/jira/browse/CASSANDRA-14267  The Dynamo web page 
on the Apache Cassandra site is missing content

https://issues.apache.org/jira/browse/CASSANDRA-14266  The Architecture 
Overview web page on the Apache Cassandra site is empty

Thanks fo

RE: The home page of Cassandra is mobile friendly but the link to the third parties is not

2018-03-01 Thread Kenneth Brotman

The link for 14128 doesn’t work and I can’t find it anywhere.  

 

Kenneth Brotman

 

From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Wednesday, February 28, 2018 8:39 PM
To: User
Subject: Re: The home page of Cassandra is mobile friendly but the link to the 
third parties is not

 

Already addressed in CASSANDRA-14128 
<https://issues.apache.org/jira/browse/CASSANDRA-14128>; however, it is waiting 
on review/comments regarding what we actually do with this page.

 

If you want to bring attention to JIRA's, the user list is probably 
appropriate. I'd avoid spamming it too much though.

 

On 26 February 2018 at 19:22, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

The home page of Cassandra is mobile friendly but the link to the third parties 
from that web page is not.  Any suggestions?  

 

I made a JIRA for it: https://issues.apache.org/jira/browse/CASSANDRA-14263

 

Should posts about JIRA’s be on this list or the dev list?

 

Kenneth Brotman

RE: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-27 Thread Kenneth Brotman

I was just getting ready to install sphinx.  Cool.  

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Tuesday, February 27, 2018 9:51 AM
To: user@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

The docs have been in tree for years :)

https://github.com/apache/cassandra/tree/trunk/doc

There’s even a docker image to build them so you don’t need to mess with 
sphinx.  Check the README for instructions.

Jon

On Feb 27, 2018, at 9:49 AM, Carl Mueller <carl.muel...@smartthings.com> wrote:

If there was a github for the docs, we could start posting content to it for 
review. Not sure what the review/contribution process is on Apache. Google 
searches on apache documentation and similar run into lots of noise from actual 
projects.

I wouldn't mind trying to do a little doc work on the regular if there was a 
wiki, a proven means to do collaborative docs. 

On Tue, Feb 27, 2018 at 11:42 AM, Kenneth Brotman 
<kenbrot...@yahoo.com.invalid> wrote:

It’s just content for web pages.  There isn’t a working outline or any draft on 
any of the JIRA’s yet.  I like to keep things simple.  Did I miss something?  
What does it matter right now?

Thanks Carl,

Kenneth Brotman

From: Carl Mueller [mailto:carl.muel...@smartthings.com] 
Sent: Tuesday, February 27, 2018 8:50 AM
To: user@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

so... are those pages in the code tree of github? I don't see them or a 
directory structure under /doc. Is mirroring the documentation between the 
apache site and a github source a big issue?

On Tue, Feb 27, 2018 at 7:50 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

I was debating that.  Splitting it up into smaller tasks makes each one seem 
less over-whelming.  

Kenneth Brotman

From: Josh McKenzie [mailto:jmcken...@apache.org] 
Sent: Tuesday, February 27, 2018 5:44 AM
To: cassandra
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

Might help, organizationally, to put all these efforts under a single ticket of 
"Improve web site Documentation" and add these as sub-tasks. Should be able to 
do that translation post-creation (i.e. in its current state) if that's 
something that makes sense to you.

On Mon, Feb 26, 2018 at 5:24 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Here are the related JIRA’s.  Please add content even if it’s not well formed 
compositionally.  Someone else or I will take it from there.

https://issues.apache.org/jira/browse/CASSANDRA-14274  The troubleshooting 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14273  The Bulk Loading web 
page on the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14272  The Backups web page on 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14271  The Hints web page in 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14270  The Read repair web page 
is empty

https://issues.apache.org/jira/browse/CASSANDRA-14269  The Data Modeling 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14268  The 
Architecture:Guarantees web page is empty

https://issues.apache.org/jira/browse/CASSANDRA-14267  The Dynamo web page 
on the Apache Cassandra site is missing content

https://issues.apache.org/jira/browse/CASSANDRA-14266  The Architecture 
Overview web page on the Apache Cassandra site is empty

Thanks for pitching in  

Kenneth Brotman

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Monday, February 26, 2018 1:54 PM
To: user@cassandra.apache.org
Subject: RE: Filling in the blank To Do sections on the Apache Cassandra web 
site

Nice!  Thanks for the help Oliver!

Kenneth Brotman

From: Oliver Ruebenacker [mailto:cur...@gmail.com] 
Sent: Sunday, February 25, 2018 7:12 AM
To: user@cassandra.apache.org
Cc: dev@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 Hello,

  I have some slides about Cassandra 
<https://docs.google.com/presentation/d/1JZYugL4WC9grgZswg1i6gAfWmFBqg9iQ0YcPLUiQ-6w/edit?usp=sharing>; 
feel free to borrow.

 Best, Oliver

On Fri, Feb 23, 2018 at 7:28 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

These nine web pages on the Apache Cassandra web site have blank To Do 
sections.  Most of the web pages are completely blank.  Mind you there is a lot 
of hard work already done on the documentation.  I’ll make JIRA’s for any of 
the blank sections where there is not already a

RE: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-27 Thread Kenneth Brotman

It’s just content for web pages.  There isn’t a working outline or any draft on 
any of the JIRA’s yet.  I like to keep things simple.  Did I miss something?  
What does it matter right now?

 

Thanks Carl,

 

Kenneth Brotman

 

From: Carl Mueller [mailto:carl.muel...@smartthings.com] 
Sent: Tuesday, February 27, 2018 8:50 AM
To: user@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

so... are those pages in the code tree of github? I don't see them or a 
directory structure under /doc. Is mirroring the documentation between the 
apache site and a github source a big issue?

 

On Tue, Feb 27, 2018 at 7:50 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

I was debating that.  Splitting it up into smaller tasks makes each one seem 
less over-whelming.  

 

Kenneth Brotman

 

From: Josh McKenzie [mailto:jmcken...@apache.org] 
Sent: Tuesday, February 27, 2018 5:44 AM
To: cassandra
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

Might help, organizationally, to put all these efforts under a single ticket of 
"Improve web site Documentation" and add these as sub-tasks. Should be able to 
do that translation post-creation (i.e. in its current state) if that's 
something that makes sense to you.

 

On Mon, Feb 26, 2018 at 5:24 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Here are the related JIRA’s.  Please add content even if it’s not well formed 
compositionally.  Someone else or I will take it from there.

 

https://issues.apache.org/jira/browse/CASSANDRA-14274  The troubleshooting 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14273  The Bulk Loading web 
page on the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14272  The Backups web page on 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14271  The Hints web page in 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14270  The Read repair web page 
is empty

https://issues.apache.org/jira/browse/CASSANDRA-14269  The Data Modeling 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14268  The 
Architecture:Guarantees web page is empty

https://issues.apache.org/jira/browse/CASSANDRA-14267  The Dynamo web page 
on the Apache Cassandra site is missing content

https://issues.apache.org/jira/browse/CASSANDRA-14266  The Architecture 
Overview web page on the Apache Cassandra site is empty

 

Thanks for pitching in  

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Monday, February 26, 2018 1:54 PM
To: user@cassandra.apache.org
Subject: RE: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

Nice!  Thanks for the help Oliver!

 

Kenneth Brotman

 

From: Oliver Ruebenacker [mailto:cur...@gmail.com] 
Sent: Sunday, February 25, 2018 7:12 AM
To: user@cassandra.apache.org
Cc: dev@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

 

 Hello,

  I have some slides about Cassandra 
<https://docs.google.com/presentation/d/1JZYugL4WC9grgZswg1i6gAfWmFBqg9iQ0YcPLUiQ-6w/edit?usp=sharing>; 
feel free to borrow.

     Best, Oliver

 

On Fri, Feb 23, 2018 at 7:28 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

These nine web pages on the Apache Cassandra web site have blank To Do 
sections.  Most of the web pages are completely blank.  Mind you there is a lot 
of hard work already done on the documentation.  I’ll make JIRA’s for any of 
the blank sections where there is not already a JIRA.  Then it will be on to 
writing up those sections.  If you have any text to help me get started for any 
of these sections that would be really cool. 

 

http://cassandra.apache.org/doc/latest/architecture/overview.html

 

http://cassandra.apache.org/doc/latest/architecture/dynamo.html

 

http://cassandra.apache.org/doc/latest/architecture/guarantees.html

 

http://cassandra.apache.org/doc/latest/data_modeling/index.html

 

http://cassandra.apache.org/doc/latest/operating/read_repair.html

 

http://cassandra.apache.org/doc/latest/operating/hints.html

 

http://cassandra.apache.org/doc/latest/operating/backups.html

 

http://cassandra.apache.org/doc/latest/operating/bulk_loading.html

 

http://cassandra.apache.org/doc/latest/troubleshooting/index.html

 

Kenneth Brotman

 




-- 

Oliver Ruebenacker

Senior Software Engineer

RE: Version Rollback

2018-02-27 Thread Kenneth Brotman

I wonder if you could use Apache Spark to do it?

 

Kenneth Brotman

 

From: Carl Mueller [mailto:carl.muel...@smartthings.com] 
Sent: Tuesday, February 27, 2018 9:22 AM
To: user@cassandra.apache.org
Subject: Re: Version Rollback

 

My speculation is that IF (big if) the sstable formats are compatible between 
the versions, which probably isn't the case for major versions, then you could 
drop back. 

If the sstables changed format, then you'll probably need to figure out how to 
rewrite the sstables in the older format and then sstableloader them in the 
older-version cluster if need be. Alas, while there is an sstable upgrader, 
there isn't a downgrader AFAIK. 

And I don't have an intimate view of version-by-version sstable format changes 
and compatibilities. You'd probably need to check the upgrade instructions 
(which you presumably did if you're upgrading versions) to tell.

Basically, version rollback is pretty unlikely to be done.

The OTHER option:

1) build a new cluster with the new version, no new data. 

2) code your driver interfaces to interface with both clusters. Write to both, 
but read preferentially from the new, then fall through to the old. Yes, that 
gets hairy on multiple row queries. Port your data with sstable loading from 
the old to the new gradually. 

When you've done a full load of all the data from old to new, and you're 
satisfied with the new cluster stability, retire the old cluster.
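
The dual-cluster idea above can be sketched roughly as follows. This is only an 
illustration under stated assumptions: plain Python dicts stand in for the two 
clusters, and the class name DualClusterStore is hypothetical. Real code would 
hold two driver sessions instead and would have to deal with timeouts and 
partial write failures, which this sketch ignores.

```python
class DualClusterStore:
    """Writes go to both clusters; reads prefer the new one, then the old."""

    def __init__(self, new_cluster: dict, old_cluster: dict):
        # Assumption for illustration: each "cluster" is just a dict.
        self.new_cluster = new_cluster
        self.old_cluster = old_cluster

    def write(self, key, value):
        # Double-write so either cluster can serve the row afterwards.
        self.old_cluster[key] = value
        self.new_cluster[key] = value

    def read(self, key):
        # Read preferentially from the new cluster, then fall through.
        if key in self.new_cluster:
            return self.new_cluster[key]
        return self.old_cluster.get(key)


old = {"a": 1, "b": 2}  # rows that have not been ported yet
new = {}
store = DualClusterStore(new, old)
store.write("c", 3)
print(store.read("a"), store.read("c"))
```

Single-row reads are the easy case shown here; multi-row queries need the two 
result sets merged, as discussed next.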

For merging two multirow sets you'll probably need your multirow queries to 
return the partition hash value (or extract the code that generates the hash), 
and have two simultaneous java-driver ResultSets going, and merge their results, 
providing the illusion of a single database query. You'll need to pay attention 
to both the row key ordering and column key ordering to ensure the combined 
results are properly ordered.

Writes will be slowed by the double-writes, and reads will be bound by the 
worse-performing cluster.
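
A rough sketch of the merge step, in Python for brevity rather than with the 
java-driver. It assumes each result set is already an iterable of (key, row) 
pairs sorted by key, where the key stands for whatever the clusters order by 
(for example the partition hash plus the column key). The function name 
merge_ordered_rows is hypothetical, and preferring the new cluster's row when 
both clusters hold the same key is an assumption that follows the read 
preference described above.

```python
import heapq
from itertools import groupby


def merge_ordered_rows(new_rows, old_rows):
    """Merge two key-sorted row streams into one, preferring the new cluster."""
    # Tag rows with their source (0 = new, 1 = old) so that, for equal keys,
    # the new cluster's row sorts first.
    tagged_new = ((key, 0, row) for key, row in new_rows)
    tagged_old = ((key, 1, row) for key, row in old_rows)
    merged = heapq.merge(tagged_new, tagged_old, key=lambda t: (t[0], t[1]))
    # Keep only the first row seen for each key.
    for key, group in groupby(merged, key=lambda t: t[0]):
        _, _, row = next(group)
        yield key, row


new_rows = [(1, "new-1"), (3, "new-3")]
old_rows = [(1, "old-1"), (2, "old-2"), (4, "old-4")]
result = list(merge_ordered_rows(new_rows, old_rows))
print(result)
```

Because both inputs are consumed lazily, the merge never needs to hold a full 
result set in memory, which matters for large multirow queries.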

 

On Tue, Feb 27, 2018 at 8:23 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Could you tell us the size and configuration of your Cassandra cluster?

 

Kenneth Brotman

 

From: shalom sagges [mailto:shalomsag...@gmail.com] 
Sent: Tuesday, February 27, 2018 6:19 AM
To: user@cassandra.apache.org
Subject: Version Rollback

 

Hi All, 

I'm planning to upgrade my C* cluster to version 3.x and was wondering what's 
the best way to perform a rollback if need be. 

If I used snapshot restoration, I would be facing data loss, depending on when 
I took the snapshot (i.e. a rollback might be required after upgrading half the 
cluster, for example). 

If I add another DC to the cluster with the old version, then I could point the 
apps to talk to that DC if anything bad happens, but building it is really time 
consuming and requires a lot of resources. 

Can anyone provide recommendations on this matter? Any ideas on how to make the 
upgrade foolproof, or at least "really really safe"? 

 

Thanks!

RE: Cassandra Summit 2019 / Cassandra Summit 2018

2018-02-27 Thread Kenneth Brotman

The Meetup groups might be a good way to implement a distributed asynchronous 
event that can still be optionally attended in person.  We could make a nice 
kit for a few meetings for them that ties together with the event and connects 
them as participants in it.  

 

Here is the Meetup website with the Cassandra groups loaded: 
<https://www.meetup.com/find/?allMeetups=false=Cassandra=Infinity=Sacramento%2C+CA=Sacramento%2C+CA=default>.

I stopped counting after I easily got to 10,000 members.

 

Kenneth Brotman

 

From: Rahul Singh [mailto:rahul.xavier.si...@gmail.com] 
Sent: Tuesday, February 27, 2018 6:27 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra Summit 2019 / Cassandra Summit 2018

 

I can help organize. I organize three meetups in the area here and know several 
venues that would be able to lend space if needed.

We can host about 20-30 people at our office or get a location through a 
coworking spot / one of the universities ( Georgetown or George Washington)

The community should do something this year, even if it is semi-virtual. 
I’ve seen some decent implementations of it in other disciplines: some 
combination of a physical get-together on a certain day around the world, with 
groups presenting locally and then some folks presenting on a global hangout. 
Breaking momentum is the worst killer of community.

We can rally around one date and see how we do. You can count on DC Cassandra 
committing to make our part of it happen. Best,


--
Rahul Singh
rahul.si...@anant.us

Anant Corporation


On Feb 27, 2018, 5:43 AM -0600, Carlos Rolo <r...@pythian.com>, wrote:



Hello all, 

 

I'm interested in planning/organizing a small NGCC-style event in Lisbon, 
Portugal in late May or early June. I'm just waiting for the venue to confirm 
possible dates.

 

It would be a one-day event, like last year's. Is this something people would 
be interested in? I can push out a Google form today to assess the interest.

 




Regards,

 

Carlos Juzarte Rolo

Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

 

Pythian - Love your data

 

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
linkedin.com/in/carlosjuzarterolo


Mobile: +351 918 918 100

www.pythian.com

 

On Tue, Feb 27, 2018 at 11:39 AM, Kenneth Brotman 
<kenbrot...@yahoo.com.invalid> wrote:

Event planning is fun as long as you can pace it out properly.  Once you set a 
firm date for an event, the pressure on you to keep everything on track is 
nerve-racking.  To do something on the order of Cassandra Summit 2016, I think 
we should plan for 2020.  It’s too late for 2018, and even trying to meet the 
timeline for everything that would have to come together makes 2019 too 
nerve-racking a target date.  The steps should be:

1. Form a planning committee

2. Bring potential sponsors into the planning early

3. Select an event planning vendor to guide us and to do the heavy 
lifting for us



In the meantime, we could have a World-wide Distributed Asynchronous Cassandra 
Convention which offers four benefits:

1. It allows us to address the fact that we are a world-wide group 
that needs a way to reach everyone in a way where no one is geographically 
disadvantaged

2. No travel time, no travel expenses and no ticket fees make it 
accessible to a lot of people who otherwise would have to miss out

3. The lower production costs and simpler administrative workload allow us to 
reach implementation sooner 

4. It’s cutting edge, world class innovation like Cassandra

    

Kenneth Brotman

 

From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Monday, February 26, 2018 9:38 PM
To: cassandra
Subject: Re: Cassandra Summit 2019 / Cassandra Summit 2018

 

Instaclustr sponsored the 2017 NGCC (Next Gen Cassandra Conference), which was 
developer/development focused (vs user focused).

 

For 2018, we're looking at options for both a developer conference and a user 
conference. There's a lot of logistics involved, and I think it's fairly 
obvious that most of the PMC members aren't professional event planners, so 
it's possible that either/both conferences may not happen, but we're doing our 
best to try to put something together.

 

 

On Mon, Feb 26, 2018 at 3:00 PM, Rahul Singh <rahul.xavier.si...@gmail.com> 
wrote:

I think some of the Instaclustr folks did one last year, which I really 
wanted to go to. With a distributed/async format it would be easier to get 
people to write papers, make slides, and do YouTube videos, and then we could 
do a virtual web conference of the best submissions. 


On Feb 26, 2018, 1:04 PM -0600, Kenneth Brotman <kenbrot...@yahoo.com.invalid>, 
wrote:

Is there any planning yet for a Cass

RE: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Kenneth Brotman

Nicolas,

 

I think you had the link to the other version I was thinking of.  I couldn’t 
find it.  I think it might have gotten taken down; a lot of other stuff seems 
to be gone too.  Maybe it will be back.  Maybe they are just redoing stuff.  
Either way, it’s another sign of Mom and Dad drifting apart – I’m not sure 
who’s Mom and who’s Dad: DataStax or ASF.  Hopefully, for the sake of everyone 
in the family they will reconcile.

 

It’s gems like that presentation that will keep us vital.  

 

Kenneth Brotman

 

From: Nicolas Guyomar [mailto:nicolas.guyo...@gmail.com] 
Sent: Tuesday, February 27, 2018 8:21 AM
To: user@cassandra.apache.org
Subject: Re: Jon Haddad on Diagnosing Performance Problems in Production

 

Has Jon's blog post 
https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
been relocated somewhere?

 

On 27 February 2018 at 16:34, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

One presentation that I hope can get updated is Jon Haddad’s very thorough 
presentation on Diagnosing Performance Problems in Production.  I’ve seen 
another version somewhere where I believe he says something like “This should 
help you fix 99% of the problems you see.”  Seems right.

 

I’m sure it will be well attended and well viewed for some time.  Here’s the 
version I found: https://www.youtube.com/watch?v=2JlUpgsEdN8

 

If Jon did a new version I’d probably stop and watch it three times right now.  

 

If we started with that video inline on the Apache Cassandra web site in the 
troubleshooting section, that would help a lot of people because of the quality 
of the content and the density of the content.  

 

Kenneth Brotman

Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Kenneth Brotman

One presentation that I hope can get updated is Jon Haddad's very thorough
presentation on Diagnosing Performance Problems in Production.  I've seen
another version somewhere where I believe he says something like "This
should help you fix 99% of the problems you see."  Seems right.

 

I'm sure it will be well attended and well viewed for some time.  Here's the
version I found: https://www.youtube.com/watch?v=2JlUpgsEdN8

 

If Jon did a new version I'd probably stop and watch it three times right
now.  

 

If we started with that video inline on the Apache Cassandra web site in the
troubleshooting section, that would help a lot of people because of the
quality of the content and the density of the content.  

 

Kenneth Brotman

RE: Version Rollback

2018-02-27 Thread Kenneth Brotman

Could you tell us the size and configuration of your Cassandra cluster?

 

Kenneth Brotman

 

From: shalom sagges [mailto:shalomsag...@gmail.com] 
Sent: Tuesday, February 27, 2018 6:19 AM
To: user@cassandra.apache.org
Subject: Version Rollback

 

Hi All, 

I'm planning to upgrade my C* cluster to version 3.x and was wondering what's 
the best way to perform a rollback if need be. 

If I used snapshot restoration, I would be facing data loss, depends when I 
took the snapshot (i.e. a rollback might be required after upgrading half the 
cluster for example). 

If I add another DC to the cluster with the old version, then I could point the 
apps to talk to that DC if anything bad happens, but building it is really time 
consuming and requires a lot of resources. 

Can anyone provide recommendations on this matter? Any ideas on how to make the 
upgrade foolproof, or at least "really really safe"? 

 

Thanks!

RE: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-27 Thread Kenneth Brotman

I was debating that.  Splitting it up into smaller tasks makes each one seem 
less overwhelming.  

 

Kenneth Brotman

 

From: Josh McKenzie [mailto:jmcken...@apache.org] 
Sent: Tuesday, February 27, 2018 5:44 AM
To: cassandra
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

Might help, organizationally, to put all these efforts under a single ticket of 
"Improve web site Documentation" and add these as sub-tasks. Should be able to 
do that translation post-creation (i.e. in its current state) if that's 
something that makes sense to you.

 

On Mon, Feb 26, 2018 at 5:24 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Here are the related JIRAs.  Please add content even if it’s not well formed 
compositionally.  I or someone else will take it from there.

 

https://issues.apache.org/jira/browse/CASSANDRA-14274  The troubleshooting 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14273  The Bulk Loading web 
page on the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14272  The Backups web page on 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14271  The Hints web page in 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14270  The Read repair web page 
is empty

https://issues.apache.org/jira/browse/CASSANDRA-14269  The Data Modeling 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14268  The 
Architecture:Guarantees web page is empty

https://issues.apache.org/jira/browse/CASSANDRA-14267  The Dynamo web page on 
the Apache Cassandra site is missing content

https://issues.apache.org/jira/browse/CASSANDRA-14266  The Architecture 
Overview web page on the Apache Cassandra site is empty

 

Thanks for pitching in.  

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Monday, February 26, 2018 1:54 PM
To: user@cassandra.apache.org
Subject: RE: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

Nice!  Thanks for the help Oliver!

 

Kenneth Brotman

 

From: Oliver Ruebenacker [mailto:cur...@gmail.com] 
Sent: Sunday, February 25, 2018 7:12 AM
To: user@cassandra.apache.org
Cc: d...@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

 

 Hello,

  I have some slides about Cassandra 
<https://docs.google.com/presentation/d/1JZYugL4WC9grgZswg1i6gAfWmFBqg9iQ0YcPLUiQ-6w/edit?usp=sharing>
 , feel free to borrow.

 Best, Oliver

 

On Fri, Feb 23, 2018 at 7:28 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

These nine web pages on the Apache Cassandra web site have blank To Do 
sections.  Most of the web pages are completely blank.  Mind you there is a lot 
of hard work already done on the documentation.  I’ll make JIRAs for any of 
the blank sections where there is not already a JIRA.  Then it will be on to 
writing up those sections.  If you have any text to help me get started for any 
of these sections that would be really cool. 

 

http://cassandra.apache.org/doc/latest/architecture/overview.html

 

http://cassandra.apache.org/doc/latest/architecture/dynamo.html

 

http://cassandra.apache.org/doc/latest/architecture/guarantees.html

 

http://cassandra.apache.org/doc/latest/data_modeling/index.html

 

http://cassandra.apache.org/doc/latest/operating/read_repair.html

 

http://cassandra.apache.org/doc/latest/operating/hints.html

 

http://cassandra.apache.org/doc/latest/operating/backups.html

 

http://cassandra.apache.org/doc/latest/operating/bulk_loading.html

 

http://cassandra.apache.org/doc/latest/troubleshooting/index.html

 

Kenneth Brotman

 




-- 

Oliver Ruebenacker

Senior Software Engineer, Diabetes Portal 
<http://www.type2diabetesgenetics.org/> , Broad Institute 
<http://www.broadinstitute.org/>

RE: Cassandra Summit 2019 / Cassandra Summit 2018

2018-02-27 Thread Kenneth Brotman

Event planning is fun as long as you can pace it out properly.  Once you set a 
firm date for an event, the pressure on you to keep everything on track is 
nerve-racking.  To do something on the order of Cassandra Summit 2016, I think we 
should plan for 2020.  It’s too late for 2018, and trying to meet the timeline 
for everything that would have to come together makes 2019 too nerve-racking a 
target date.  The steps should be:

Form a planning committee

Bring potential sponsors into the planning early

Select an event planning vendor to guide us and to do the heavy 
lifting for us



In the meantime, we could have a World-wide Distributed Asynchronous Cassandra 
Convention which offers four benefits:

It allows us to address the fact that we are a world-wide group 
that needs a way to reach everyone in a way where no one is geographically 
disadvantaged

No travel time, no travel expenses and no ticket fees makes it 
accessible to a lot of people that otherwise would have to miss out

The lower production costs and simpler administrative workload allow us to 
reach implementation sooner 

It’s cutting edge, world class innovation like Cassandra



Kenneth Brotman

 

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Monday, February 26, 2018 9:38 PM
To: cassandra
Subject: Re: Cassandra Summit 2019 / Cassandra Summit 2018

 

Instaclustr sponsored the 2017 NGCC (Next Gen Cassandra Conference), which was 
developer/development focused (vs user focused).

 

For 2018, we're looking at options for both a developer conference and a user 
conference. There's a lot of logistics involved, and I think it's fairly 
obvious that most of the PMC members aren't professional event planners, so 
it's possible that either/both conferences may not happen, but we're doing our 
best to try to put something together.

 

 

On Mon, Feb 26, 2018 at 3:00 PM, Rahul Singh <rahul.xavier.si...@gmail.com> 
wrote:

I think some of the Instaclustr folks had done one last year which I really 
wanted to go to.. Distributed / Async both would be easier to get people to 
write papers, make slides, do youtube videos with.. and then we could do a 
virtual web conf of the best submissions. 


On Feb 26, 2018, 1:04 PM -0600, Kenneth Brotman <kenbrot...@yahoo.com.invalid>, 
wrote:



Is there any planning yet for a Cassandra Summit 2019 or Cassandra Summit 2018 
(probably too late)?

 

Is there a planning committee?

 

Who wants there to be a Cassandra Summit 2019 and who thinks there is a better 
way?

 

We could try a Cassandra Distributed Summit 2019 where we meet virtually and 
perhaps asynchronously, but there would be a lot more energy and bonding if 
it’s not virtual.  I’m up for any of these.

 

Kenneth Brotman

RE: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-26 Thread Kenneth Brotman

Here are the related JIRAs.  Please add content even if it’s not well formed 
compositionally.  I or someone else will take it from there.

 

https://issues.apache.org/jira/browse/CASSANDRA-14274  The troubleshooting 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14273  The Bulk Loading web 
page on the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14272  The Backups web page on 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14271  The Hints web page in 
the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14270  The Read repair web page 
is empty

https://issues.apache.org/jira/browse/CASSANDRA-14269  The Data Modeling 
section of the web site is empty

https://issues.apache.org/jira/browse/CASSANDRA-14268  The 
Architecture:Guarantees web page is empty

https://issues.apache.org/jira/browse/CASSANDRA-14267  The Dynamo web page on 
the Apache Cassandra site is missing content

https://issues.apache.org/jira/browse/CASSANDRA-14266  The Architecture 
Overview web page on the Apache Cassandra site is empty

 

Thanks for pitching in.  

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Monday, February 26, 2018 1:54 PM
To: user@cassandra.apache.org
Subject: RE: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

Nice!  Thanks for the help Oliver!

 

Kenneth Brotman

 

From: Oliver Ruebenacker [mailto:cur...@gmail.com] 
Sent: Sunday, February 25, 2018 7:12 AM
To: user@cassandra.apache.org
Cc: d...@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

 

 Hello,

  I have some slides about Cassandra 
<https://docs.google.com/presentation/d/1JZYugL4WC9grgZswg1i6gAfWmFBqg9iQ0YcPLUiQ-6w/edit?usp=sharing>
 , feel free to borrow.

 Best, Oliver

 

On Fri, Feb 23, 2018 at 7:28 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

These nine web pages on the Apache Cassandra web site have blank To Do 
sections.  Most of the web pages are completely blank.  Mind you there is a lot 
of hard work already done on the documentation.  I’ll make JIRAs for any of 
the blank sections where there is not already a JIRA.  Then it will be on to 
writing up those sections.  If you have any text to help me get started for any 
of these sections that would be really cool. 

 

http://cassandra.apache.org/doc/latest/architecture/overview.html

 

http://cassandra.apache.org/doc/latest/architecture/dynamo.html

 

http://cassandra.apache.org/doc/latest/architecture/guarantees.html

 

http://cassandra.apache.org/doc/latest/data_modeling/index.html

 

http://cassandra.apache.org/doc/latest/operating/read_repair.html

 

http://cassandra.apache.org/doc/latest/operating/hints.html

 

http://cassandra.apache.org/doc/latest/operating/backups.html

 

http://cassandra.apache.org/doc/latest/operating/bulk_loading.html

 

http://cassandra.apache.org/doc/latest/troubleshooting/index.html

 

Kenneth Brotman

 




-- 

Oliver Ruebenacker

Senior Software Engineer, Diabetes Portal 
<http://www.type2diabetesgenetics.org/> , Broad Institute 
<http://www.broadinstitute.org/>

RE: Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-26 Thread Kenneth Brotman

Nice!  Thanks for the help Oliver!

 

Kenneth Brotman

 

From: Oliver Ruebenacker [mailto:cur...@gmail.com] 
Sent: Sunday, February 25, 2018 7:12 AM
To: user@cassandra.apache.org
Cc: d...@cassandra.apache.org
Subject: Re: Filling in the blank To Do sections on the Apache Cassandra web 
site

 

 

 Hello,

  I have some slides about Cassandra 
<https://docs.google.com/presentation/d/1JZYugL4WC9grgZswg1i6gAfWmFBqg9iQ0YcPLUiQ-6w/edit?usp=sharing>
 , feel free to borrow.

 Best, Oliver

 

On Fri, Feb 23, 2018 at 7:28 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

These nine web pages on the Apache Cassandra web site have blank To Do 
sections.  Most of the web pages are completely blank.  Mind you there is a lot 
of hard work already done on the documentation.  I’ll make JIRAs for any of 
the blank sections where there is not already a JIRA.  Then it will be on to 
writing up those sections.  If you have any text to help me get started for any 
of these sections that would be really cool. 

 

http://cassandra.apache.org/doc/latest/architecture/overview.html

 

http://cassandra.apache.org/doc/latest/architecture/dynamo.html

 

http://cassandra.apache.org/doc/latest/architecture/guarantees.html

 

http://cassandra.apache.org/doc/latest/data_modeling/index.html

 

http://cassandra.apache.org/doc/latest/operating/read_repair.html

 

http://cassandra.apache.org/doc/latest/operating/hints.html

 

http://cassandra.apache.org/doc/latest/operating/backups.html

 

http://cassandra.apache.org/doc/latest/operating/bulk_loading.html

 

http://cassandra.apache.org/doc/latest/troubleshooting/index.html

 

Kenneth Brotman

 




-- 

Oliver Ruebenacker

Senior Software Engineer, Diabetes Portal 
<http://www.type2diabetesgenetics.org/> , Broad Institute 
<http://www.broadinstitute.org/>

Add explanation of vNodes to online documentation

2018-02-26 Thread Kenneth Brotman

JIRA 14265

Add explanation of vNodes to online documentation:

https://issues.apache.org/jira/browse/CASSANDRA-14265

 

There are a lot of inquiries on the mailing list about how vNodes work and how
to set configuration properly.  We should add an explanation to the documentation.

 

Kenneth Brotman

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-26 Thread Kenneth Brotman

Eric,

 

My tone changed as I studied the thread in more detail.  He began with a 
well-intended but ill-advised inquiry, and a very public inquiry at that, which 
itself was problematic.  It’s not a board member’s place to throw their weight 
around like that.  That’s board member training 101.  Not his job.  He stepped in it.  
Go through staff.  Very poorly handled.  I’ll give him the benefit of the doubt 
that he meant well.  We have a problem.  It must be fixed.

 

As to getting caught in the middle I will let you ponder that.  I have to help 
get Cassandra out of Document Hell!!!

 

Kenneth Brotman

  

 

From: Eric Plowe [mailto:eric.pl...@gmail.com] 
Sent: Monday, February 26, 2018 1:14 PM
To: user@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

Kenneth, 

 

How did you get "caught in the middle" of this "stuff"? You are the one 
bringing it up? Also, your tone switched between calling Chris a "well intended 
ASF" board member, to calling him an "idiot". He asked a perfectly reasonable 
question, and then other questions followed as a result. If you want to 
contribute to the community, please start by being respectful to all members of 
the community. 

 

Regards,

 

Eric Plowe

 

On Mon, Feb 26, 2018 at 12:35 PM Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

I got caught in the middle of this stuff.  I feel for everyone.  I said my two 
cents.  I had to vent.  I’m back to concentrating on helping the group.

 

Kenneth Brotman

 

From: Eric Evans [mailto:john.eric.ev...@gmail.com] 
Sent: Monday, February 26, 2018 9:16 AM
To: user@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

 

 

On Sun, Feb 25, 2018 at 8:45 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Chris Mattmann acted without authority and completely improperly as an Apache 
Software Foundation board member, as a board member on their own has no 
authority.  Their authority is to participate and vote at board meetings.  They 
are not allowed to transact business, they are not supposed to force themselves 
on anyone or order anyone around.  The one that was acting controlling was this 
idiot board member that has caused this situation between DataStax and the rest 
of our community.

 

Furthermore, when he instructed Cassandra legend Jonathan Ellis, the Cassandra 
PMC Chair, to include certain information in a report to the Apache Software 
Foundation board, that escalated the matter to something that was before the 
board.  

 

I am not an attorney and this should not be taken as legal advice!

 

It is clear to me as someone who is experienced and trained as a board 
member that Chris Mattmann and the ASF itself probably will find themselves in 
court over this.  I think a lot of folks should raise this matter with their 
legal counsel.

 

What happened is not trivial.  It is newsworthy.  I suggest people talk to the 
media about this story.  Ask them to investigate and report the story.  

 

Is APC interfering with other communities?

 

Kenneth, I really think you need to pump the brakes here.  You're leveling some 
pretty serious accusations, and have now resorted to personal attacks; this is 
not constructive.

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 3:29 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns
Importance: High

 

If you read the email message, the first link below, you’ll see that it’s a 
well-intending Apache Foundation board member who could not grasp how our 
community functioned.  Apache Foundation messed up our community by the way 
they handled a routine inquiry, leaving no option for DataStax but to seek 
legal counsel.  I’ve been there.  Your own legal counsel deals the final blow. 
They tell you all communication has to go through them.  They tell you there 
has to be clear separation.  They say you have to take their advice or they 
will not keep defending you and you will not have any personal protection.  Anyone 
can be sued and you will be liable for defending yourself.  Sound familiar?  

 

Everyone kept saying that everything was good.  That the community, our 
community liked the way things worked.  

 

I call on Apache Foundation to reach out to DataStax and fix the mess 
forthwith!  Report openly on your efforts.  You can fix your mess Apache 
Foundation.   This email says it all.  A total miscall: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09090.html.  And the 
guy has a PhD!

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 12:58 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

Jon,

 

This is co

The home page of Cassandra is mobile friendly but the link to the third parties is not

2018-02-26 Thread Kenneth Brotman

The home page of Cassandra is mobile friendly but the link to the third
parties from that web page is not.  Any suggestions?  

 

I made a JIRA for it: https://issues.apache.org/jira/browse/CASSANDRA-14263

 

Should posts about JIRAs be on this list or the dev list?

 

Kenneth Brotman

Cassandra Summit 2019 / Cassandra Summit 2018

2018-02-26 Thread Kenneth Brotman

Is there any planning yet for a Cassandra Summit 2019 or Cassandra Summit
2018 (probably too late)?

 

Is there a planning committee?

 

Who wants there to be a Cassandra Summit 2019 and who thinks there is a
better way?

 

We could try a Cassandra Distributed Summit 2019 where we meet virtually and
perhaps asynchronously, but there would be a lot more energy and bonding if
it's not virtual.  I'm up for any of these.

 

Kenneth Brotman

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-26 Thread Kenneth Brotman

I got caught in the middle of this stuff.  I feel for everyone.  I said my two 
cents.  I had to vent.  I’m back to concentrating on helping the group.

 

Kenneth Brotman

 

From: Eric Evans [mailto:john.eric.ev...@gmail.com] 
Sent: Monday, February 26, 2018 9:16 AM
To: user@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

 

 

On Sun, Feb 25, 2018 at 8:45 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Chris Mattmann acted without authority and completely improperly as an Apache 
Software Foundation board member, as a board member on their own has no 
authority.  Their authority is to participate and vote at board meetings.  They 
are not allowed to transact business, they are not supposed to force themselves 
on anyone or order anyone around.  The one that was acting controlling was this 
idiot board member that has caused this situation between DataStax and the rest 
of our community.

 

Furthermore, when he instructed Cassandra legend Jonathan Ellis, the Cassandra 
PMC Chair, to include certain information in a report to the Apache Software 
Foundation board, that escalated the matter to something that was before the 
board.  

 

I am not an attorney and this should not be taken as legal advice!

 

It is clear to me as someone who is experienced and trained as a board 
member that Chris Mattmann and the ASF itself probably will find themselves in 
court over this.  I think a lot of folks should raise this matter with their 
legal counsel.

 

What happened is not trivial.  It is newsworthy.  I suggest people talk to the 
media about this story.  Ask them to investigate and report the story.  

 

Is APC interfering with other communities?

 

Kenneth, I really think you need to pump the brakes here.  You're leveling some 
pretty serious accusations, and have now resorted to personal attacks; this is 
not constructive.

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 3:29 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns
Importance: High

 

If you read the email message, the first link below, you’ll see that it’s a 
well-intending Apache Foundation board member who could not grasp how our 
community functioned.  Apache Foundation messed up our community by the way 
they handled a routine inquiry, leaving no option for DataStax but to seek 
legal counsel.  I’ve been there.  Your own legal counsel deals the final blow. 
They tell you all communication has to go through them.  They tell you there 
has to be clear separation.  They say you have to take their advice or they 
will not keep defending you and you will not have any personal protection.  Anyone 
can be sued and you will be liable for defending yourself.  Sound familiar?  

 

Everyone kept saying that everything was good.  That the community, our 
community liked the way things worked.  

 

I call on Apache Foundation to reach out to DataStax and fix the mess 
forthwith!  Report openly on your efforts.  You can fix your mess Apache 
Foundation.   This email says it all.  A total miscall: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09090.html.  And the 
guy has a PhD!

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 12:58 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

Jon,

 

This is considered the start of the problem: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09050.html

 

That’s according to this well sourced article called “Fear of Staxit: What next 
for ASF’s Cassandra as biggest donor cuts back” 
https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/

 

I am one of the people who didn’t know the history and is now, as this article 
describes, caught between “A Rock and a hard place…”: 

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/

 

I bet it’s been painful for everyone.  It’s really sad.

 

Kenneth Brotman




-- 

Eric Evans
john.eric.ev...@gmail.com

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-25 Thread Kenneth Brotman

Chris Mattmann acted without authority and completely improperly as an Apache 
Software Foundation board member, as a board member on their own has no 
authority.  Their authority is to participate and vote at board meetings.  They 
are not allowed to transact business, they are not supposed to force themselves 
on anyone or order anyone around.  The one that was acting controlling was this 
idiot board member that has caused this situation between DataStax and the rest 
of our community.

 

Furthermore, when he instructed Cassandra legend Jonathan Ellis, the Cassandra 
PMC Chair, to include certain information in a report to the Apache Software 
Foundation board, that escalated the matter to something that was before the 
board.  

 

I am not an attorney and this should not be taken as legal advice!

 

It is clear to me as someone who is experienced and trained as a board 
member that Chris Mattmann and the ASF itself probably will find themselves in 
court over this.  I think a lot of folks should raise this matter with their 
legal counsel.

 

What happened is not trivial.  It is newsworthy.  I suggest people talk to the 
media about this story.  Ask them to investigate and report the story.  

 

Is APC interfering with other communities?

 

Kenneth Brotman

  

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 3:29 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns
Importance: High

 

If you read the email message, the first link below, you’ll see that it’s a 
well-intending Apache Foundation board member who could not grasp how our 
community functioned.  Apache Foundation messed up our community by the way 
they handled a routine inquiry, leaving no option for DataStax but to seek 
legal counsel.  I’ve been there.  Your own legal counsel deals the final blow. 
They tell you all communication has to go through them.  They tell you there 
has to be clear separation.  They say you have to take their advice or they 
will not keep defending you and you will not have any personal protection.  Anyone 
can be sued and you will be liable for defending yourself.  Sound familiar?  

 

Everyone kept saying that everything was good.  That the community, our 
community liked the way things worked.  

 

I call on Apache Foundation to reach out to DataStax and fix the mess 
forthwith!  Report openly on your efforts.  You can fix your mess Apache 
Foundation.   This email says it all.  A total miscall: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09090.html.  And the 
guy has a PhD!

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 12:58 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

Jon,

 

This is considered the start of the problem: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09050.html

 

That’s according to this well sourced article called “Fear of Staxit: What next 
for ASF’s Cassandra as biggest donor cuts back” 
https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/

 

I am one of the people who didn’t know the history and is now, as this article 
describes, caught between “A Rock and a hard place…”: 

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/

 

I bet it’s been painful for everyone.  It’s really sad.

 

Kenneth Brotman

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-24 Thread Kenneth Brotman

If you read the email message, the first link below, you’ll see that it’s a 
well-intending Apache Foundation board member who could not grasp how our 
community functioned.  Apache Foundation messed up our community by the way 
they handled a routine inquiry, leaving no option for DataStax but to seek 
legal counsel.  I’ve been there.  Your own legal counsel deals the final blow. 
They tell you all communication has to go through them.  They tell you there 
has to be clear separation.  They say you have to take their advice or they 
will not keep defending you and you will not have any personal protection.  Anyone 
can be sued and you will be liable for defending yourself.  Sound familiar?  

 

Everyone kept saying that everything was good.  That the community, our 
community liked the way things worked.  

 

I call on Apache Foundation to reach out to DataStax and fix the mess 
forthwith!  Report openly on your efforts.  You can fix your mess Apache 
Foundation.   This email says it all.  A total miscall: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09090.html.  And the 
guy has a PhD!

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] 
Sent: Saturday, February 24, 2018 12:58 PM
To: user@cassandra.apache.org
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

Jon,

 

This is considered the start of the problem: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09050.html

 

That’s according to this well sourced article called “Fear of Staxit: What next 
for ASF’s Cassandra as biggest donor cuts back” 
https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/

 

I am one of the people who didn’t know the history and is now, as this article 
describes, caught between “A Rock and a hard place…”: 

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/

 

I bet it’s been painful for everyone.  It’s really sad.

 

Kenneth Brotman

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-24 Thread Kenneth Brotman

Jon,

 

This is considered the start of the problem: 
https://www.mail-archive.com/dev@cassandra.apache.org/msg09050.html

 

That’s according to this well sourced article called “Fear of Staxit: What next 
for ASF’s Cassandra as biggest donor cuts back” 
https://www.theregister.co.uk/2016/11/14/datastax_versus_asf_staxeit/

 

I am one of the people who didn’t know the history and is now, as this article 
describes, caught between “A Rock and a hard place…”: 

http://www.zdnet.com/article/a-rock-and-a-hard-place-between-scylladb-and-cassandra/

 

I bet it’s been painful for everyone.  It’s really sad.

 

Kenneth Brotman

 

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Saturday, February 24, 2018 12:26 PM
To: Kenneth Brotman
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

I really don’t want to continue this discussion any further on the ML, because 
I used to work at DataStax and I’d rather not have this turn into a mess.  Take 
a look at the closed JIRAs and git history: they’ve mostly pulled out of Apache 
Cassandra development and ship their own fork.  They are done contributing docs 
as well.

 

https://www.datastax.com/2016/11/serving-customers-serving-the-community

 

Any discussion on the matter is a waste of time, so this is the last email from 
me on the topic.

 

Jon

 





On Feb 24, 2018, at 12:08 PM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
wrote:

 

Hey Jon,

 

If that was the issue the whole time, it’s a big nothing to fix.  All DataStax 
and Apache Foundation ever had to do, and it’s really really easy, is execute a 
property rights sharing agreement that makes everyone comfortable and protects 
the parties from being controlled by the other party.  Super, super easy stuff 
to work out WHEN you have two parties that want it to work out.  If they would 
just do that, we could go back to being one big healthy family.  I could work 
that out with them.  I’ve done this type of thing before.  I’m not kidding, 
it’s really easy.  Just so you know.  Just for the record.  Just in case the 
right people are following along.

 

Kenneth Brotman

 

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Saturday, February 24, 2018 10:44 AM
To: user@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

DataStax academy is great but no, no work needs to be or should be aligned with 
it.  Datastax is an independent company trying to make a profit, they could 
yank their docs at any time.  There’s a reason why we started doing the docs 
in-tree, there was too much of a reliance on DS documentation.

 

DataStax isn’t Cassandra.






On Feb 24, 2018, at 10:42 AM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
wrote:

 

Any efforts described below should be aligned with, complement, enhance, fill 
in the outstanding work of DataStax Academy. 

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Saturday, February 24, 2018 10:16 AM
To: 'user@cassandra.apache.org'
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

To Rahul,

 

This is your official email (just from me as an individual) requesting your 
assistance to help solve the knowledge management problem. I can appreciate the 
work you put into the Awesome Cassandra list.  It is difficult to keep 
everything up to date.  I’ve been there too.

 

The golden trophy, if you want to do the absolute best thing, is a full-fledged 
professional development initiative for Cassandra.  From an instructional 
design view, what you do is create a body of knowledge and an exhaustive list of 
competencies (some call them KSAs: knowledge, skills and abilities); then you do a 
gap analysis to find the areas in practice where gaps exist between the 
competencies desired and those of practitioners; then you generate a mix of media 
for different learning styles in a structured, properly sequenced series of 
easy-to-work-through steps, complete with apperception exercises, and everyone 
will then have a smooth path towards mastery.  It’s that easy.

 

So, yes let’s turn it up a few notches.

 

Thank you,

 

Kenneth Brotman

 


--
Rahul Singh
rahul.si...@anant.us

Anant Corporation


On Feb 23, 2018, 5:56 PM -0500, Carl Mueller <carl.muel...@smartthings.com>, 
wrote:

Isn't a github markdown site about the easiest collaborative platform 
there is for stuff like this? I'm not saying the end product will knock 
anyone's socks off.

 

On Thu, Feb 22, 2018 at 10:55 AM, Rahul Singh <rahul.xavier.si...@gmail.com> 
wrote:

There’s always a reason to complain

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-24 Thread Kenneth Brotman

Hey Jon,

 

If that was the issue the whole time, it’s a big nothing to fix.  All DataStax 
and Apache Foundation ever had to do, and it’s really really easy, is execute a 
property rights sharing agreement that makes everyone comfortable and protects 
the parties from being controlled by the other party.  Super, super easy stuff 
to work out WHEN you have two parties that want it to work out.  If they would 
just do that  we could go back to being one big healthy family.  I could work 
that out with them.  I’ve done this type of thing before.   I’m not kidding 
it’s really easy.  Just so you know.  Just for the record. Just in case the 
right people are following along.

 

Kenneth Brotman

 

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Saturday, February 24, 2018 10:44 AM
To: user@cassandra.apache.org
Subject: Re: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

DataStax academy is great but no, no work needs to be or should be aligned with 
it.  Datastax is an independent company trying to make a profit, they could 
yank their docs at any time.  There’s a reason why we started doing the docs 
in-tree, there was too much of a reliance on DS documentation.

 

DataStax isn’t Cassandra.





On Feb 24, 2018, at 10:42 AM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
wrote:

 

Any efforts described below should be aligned with, complement, enhance, fill 
in the outstanding work of DataStax Academy. 

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Saturday, February 24, 2018 10:16 AM
To: 'user@cassandra.apache.org'
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

To Rahul,

 

This is your official email (just from me as an individual) requesting your 
assistance to help solve the knowledge management problem. I can appreciate the 
work you put into the Awesome Cassandra list.  It is difficult to keep 
everything up to date.  I’ve been there too.

 

The golden trophy, if you want to do the absolute best thing, is a full-fledged 
professional development initiative for Cassandra.  From an instructional 
design view, what you do is create a body of knowledge and an exhaustive list of 
competencies (some call them KSAs: knowledge, skills and abilities); then you do a 
gap analysis to find the areas in practice where gaps exist between the 
competencies desired and those of practitioners; then you generate a mix of media 
for different learning styles in a structured, properly sequenced series of 
easy-to-work-through steps, complete with apperception exercises, and everyone 
will then have a smooth path towards mastery.  It’s that easy.

 

So, yes let’s turn it up a few notches.

 

Thank you,

 

Kenneth Brotman

 


--
Rahul Singh
rahul.si...@anant.us

Anant Corporation


On Feb 23, 2018, 5:56 PM -0500, Carl Mueller <carl.muel...@smartthings.com>, 
wrote:

Isn't a github markdown site about the easiest collaborative platform 
there is for stuff like this? I'm not saying the end product will knock 
anyone's socks off.

 

On Thu, Feb 22, 2018 at 10:55 AM, Rahul Singh <rahul.xavier.si...@gmail.com> 
wrote:

There’s always a reason to complain if you aren’t paying for something. There’s 
always a reason to complain if you are paying for something. 

 

TLDR; If you want to help curate / organize / gather knowledge about Cassandra, 
send me an email. I’d love to solve at least the knowledge management problem. 

Complaining itself is not a solution or a step in the right direction. Defining 
an issue helps by identifying specifically what the pain is and a decision can 
be made to resolve or not resolve it.

RE: Gathering / Curating / Organizing Cassandra Best Practices & Patterns

2018-02-24 Thread Kenneth Brotman

Any efforts described below should be aligned with, complement, enhance, fill 
in the outstanding work of DataStax Academy. 

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Saturday, February 24, 2018 10:16 AM
To: 'user@cassandra.apache.org'
Subject: RE: Gathering / Curating / Organizing Cassandra Best Practices & 
Patterns

 

To Rahul,

 

This is your official email (just from me as an individual) requesting your 
assistance to help solve the knowledge management problem. I can appreciate the 
work you put into the Awesome Cassandra list.  It is difficult to keep 
everything up to date.  I’ve been there too.

 

The golden trophy, if you want to do the absolute best thing, is a full-fledged 
professional development initiative for Cassandra.  From an instructional 
design view, what you do is create a body of knowledge and an exhaustive list of 
competencies (some call them KSAs: knowledge, skills and abilities); then you do a 
gap analysis to find the areas in practice where gaps exist between the 
competencies desired and those of practitioners; then you generate a mix of media 
for different learning styles in a structured, properly sequenced series of 
easy-to-work-through steps, complete with apperception exercises, and everyone 
will then have a smooth path towards mastery.  It’s that easy.

 

So, yes let’s turn it up a few notches.

 

Thank you,

 

Kenneth Brotman

 


--
Rahul Singh
rahul.si...@anant.us

Anant Corporation


On Feb 23, 2018, 5:56 PM -0500, Carl Mueller <carl.muel...@smartthings.com>, 
wrote:

Isn't a github markdown site about the easiest collaborative platform 
there is for stuff like this? I'm not saying the end product will knock 
anyone's socks off.

 

On Thu, Feb 22, 2018 at 10:55 AM, Rahul Singh <rahul.xavier.si...@gmail.com> 
wrote:

There’s always a reason to complain if you aren’t paying for something. There’s 
always a reason to complain if you are paying for something. 

 

TLDR; If you want to help curate / organize / gather knowledge about Cassandra, 
send me an email. I’d love to solve at least the knowledge management problem. 

Complaining itself is not a solution or a step in the right direction. Defining 
an issue helps by identifying specifically what the pain is and a decision can 
be made to resolve or not resolve it.

RE: Alignment and coodination between DataStax's online documentation and Apache Cassandra online documentation

2018-02-24 Thread Kenneth Brotman

Myrle,

 

I see what you’re saying.  That does make sense.  From now on this thread will 
just be on the dev mailing list unless someone on this list wants to reply or 
comment elsewhere on this mailing list of course.

 

If you see anything beyond what is being worked on, please make suggestions or 
contribute material to use. We will get past the pain. 

 

Kenneth Brotman

 

From: Myrle Krantz [mailto:my...@apache.org] 
Sent: Friday, February 23, 2018 10:56 PM
To: user@cassandra.apache.org
Subject: Re: Alignment and coodination between DataStax's online documentation 
and Apache Cassandra online documentation

 

Hey Kenneth,

 

I think it’s great that you’re working on improving the Cassandra 
documentation. As a user, I thank you. This really is a pain point.

 

I have one suggestion: I feel like the process of improving that documentation 
belongs on the dev mailing list, and the end-result belongs on user. This would 
help improve the signal/noise ratio in my inbox.

 

Thank you again for helping to improve Cassandra,

Myrle 

 

On Fri 23. Feb 2018 at 21:34 Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

To the amazing people of DataStax,

 

The DataStax website is a little unwieldy as it tries to support open source 
Cassandra and DataStax’s version.  Meanwhile some of that information would 
help fill in the Apache Cassandra web site.  If you would like to work on this, 
I’m up for it.  

 

Kenneth Brotman

Filling in the blank To Do sections on the Apache Cassandra web site

2018-02-23 Thread Kenneth Brotman

These nine web pages on the Apache Cassandra web site have blank To Do
sections.  Most of the web pages are completely blank.  Mind you there is a
lot of hard work already done on the documentation.  I'll make JIRA's for
any of the blank sections where there is not already a JIRA.  Then it will
be on to writing up those sections.  If you have any text to help me get
started for any of these sections that would be really cool. 

 

http://cassandra.apache.org/doc/latest/architecture/overview.html

 

http://cassandra.apache.org/doc/latest/architecture/dynamo.html

 

http://cassandra.apache.org/doc/latest/architecture/guarantees.html

 

http://cassandra.apache.org/doc/latest/data_modeling/index.html

 

http://cassandra.apache.org/doc/latest/operating/read_repair.html

 

http://cassandra.apache.org/doc/latest/operating/hints.html

 

http://cassandra.apache.org/doc/latest/operating/backups.html

 

http://cassandra.apache.org/doc/latest/operating/bulk_loading.html

 

http://cassandra.apache.org/doc/latest/troubleshooting/index.html

 

Kenneth Brotman

Alignment and coodination between DataStax's online documentation and Apache Cassandra online documentation

2018-02-23 Thread Kenneth Brotman

To the amazing people of DataStax,

 

The DataStax website is a little unwieldy as it tries to support open source
Cassandra and DataStax's version.  Meanwhile some of that information would
help fill in the Apache Cassandra web site.  If you would like to work on
this, I'm up for it.  

 

Kenneth Brotman

Three simple JIRA's on the documention

2018-02-23 Thread Kenneth Brotman

Three quick and easy JIRA's to begin work on the documentation.  I await
your feedback.  

 

https://issues.apache.org/jira/browse/CASSANDRA-14257

Add a separate Installing Cassandra section on the menu and move the content
there

 

https://issues.apache.org/jira/browse/CASSANDRA-14256

Renaming Reporting Bugs and Contributions to just Reporting Bugs

 

https://issues.apache.org/jira/browse/CASSANDRA-14255

Moving the Configuring Cassandra web page

 

Kenneth Brotman

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-23 Thread Kenneth Brotman

A sincere thank you to everyone who replied.  I will heavy lift the docs for 
a while, do my Slender Cassandra reference project and then I’ll try to find 
one or two areas where I can contribute code to get going on that.  

I'll have a few JIRA's started by the end of the workday.

Kenneth Brotman



RE: Initializing a multiple node cluster (multiple datacenters)

2018-02-22 Thread Kenneth Brotman

I will heavy lift the docs for a while, do my Slender Cassandra reference 
project and then I’ll try to find one or two areas where I can contribute code 
to get going on that.  I have read the section on contributing before I start.  
I’ll self-assign the JIRA right now.

 

Kenneth Brotman

 

From: Jonathan Haddad [mailto:j...@jonhaddad.com] 
Sent: Thursday, February 22, 2018 1:21 PM
To: user@cassandra.apache.org
Subject: Re: Initializing a multiple node cluster (multiple datacenters)

 

Kenneth, if you want to take the JIRA, feel free to self-assign it to yourself 
and put up a pull request or patch, and I'll review.  I'd be very happy to get 
more people involved in the docs.

 

On Thu, Feb 22, 2018 at 12:56 PM Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

That information would have saved me time too.  Thanks for making a JIRA for it 
Jon.  Perhaps this is a good JIRA for me to begin with.

 

Kenneth Brotman  

 

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Thursday, February 22, 2018 11:11 AM
To: user
Subject: Re: Initializing a multiple node cluster (multiple datacenters)

 

Great question.  Unfortunately, our OSS docs lack a step by step process on how 
to add a DC, I’ve created a JIRA to do that: 
https://issues.apache.org/jira/browse/CASSANDRA-14254

 

The datastax docs are pretty good for this though: 
https://docs.datastax.com/en/cassandra/latest/cassandra/operations/opsAddDCToCluster.html

 

Regarding token allocation, it was random prior to 3.0.  In 3.0 and up, it is 
calculated a little more intelligently.  In 3.11.2, which was just released, 
CASSANDRA-13080 was backported, which will help out when you add your second DC.  
If you go this route, you can drop your token count down to 16 and get all the 
benefits with no drawbacks.  
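
Why the token count matters under random allocation can be illustrated with a small simulation. The sketch below is an illustration only, not Cassandra's allocation algorithm: it assumes a hypothetical 6-node cluster, draws tokens uniformly from the Murmur3Partitioner range, and reports the smallest and largest share of the ring that any node ends up owning with 16 and with 256 tokens per node. The function names are made up for this example.

```python
import random

TOKEN_MIN = -2**63      # lower bound of the Murmur3Partitioner token range
TOKEN_MAX = 2**63 - 1   # upper bound of the Murmur3Partitioner token range
RING_SIZE = 2**64       # number of distinct token values


def random_ring(nodes, num_tokens, rng):
    """Give every node num_tokens distinct random tokens; return {token: node}."""
    ring = {}
    for node in range(nodes):
        assigned = 0
        while assigned < num_tokens:
            token = rng.randint(TOKEN_MIN, TOKEN_MAX)
            if token not in ring:  # redraw in the (very unlikely) case of a clash
                ring[token] = node
                assigned += 1
    return ring


def ownership(ring, nodes):
    """Fraction of the ring owned by each node.

    Each token owns the range that starts after the previous token
    and ends at the token itself, wrapping around the ring.
    """
    tokens = sorted(ring)
    owned = [0] * nodes
    for i, token in enumerate(tokens):
        prev = tokens[i - 1]  # for i == 0 this wraps to the last token
        span = (token - prev) % RING_SIZE or RING_SIZE  # a lone token owns everything
        owned[ring[token]] += span
    return [o / RING_SIZE for o in owned]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so repeated runs are comparable
    for num_tokens in (16, 256):
        own = ownership(random_ring(6, num_tokens, rng), 6)
        print(num_tokens, f"min={min(own):.3f} max={max(own):.3f}")
```

With purely random tokens, a lower token count usually gives a wider spread between the least and most loaded node, which is the imbalance a smarter allocation is meant to reduce.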

 

At this point I would go straight to 3.11.2 and skip 3.0 as there were quite a 
few improvements that make it worthwhile along the way, in my opinion.  We work 
with several customers that are running 3.11 and are pretty happy with it 

 

Yes, if there’s no data, you can initialize the cluster with auto_bootstrap: 
true.  Be sure to change any keyspaces using SimpleStrategy to 
NetworkTopologyStrategy (NTS) first, and replicate them to the new DC as well. 

 

Jon

 

 

On Feb 22, 2018, at 10:53 AM, Jean Carlo <jean.jeancar...@gmail.com> wrote:

 

Hi jonathan

 

Thank you for the answer. Do you know where to look to understand why this 
works? As I understood it, all the nodes will then choose random tokens. How can I 
ensure the correctness of the ring?

 

So, as you said, under the condition that there is no data in the cluster, I 
can initialize a multi-DC cluster without disabling auto_bootstrap?

 

On Feb 22, 2018 5:43 PM, "Jonathan Haddad" <j...@jonhaddad.com> wrote:

If it's a new cluster, there's no need to disable auto_bootstrap.  That setting 
prevents the first node in the second DC from being a replica for all the data 
in the first DC.  If there's no data in the first DC, you can skip a couple 
steps and just leave it on.

 

Leave it on, and enjoy your afternoon.

 

Seeds don't bootstrap by the way, changing the setting on those nodes doesn't 
do anything.

 

On Thu, Feb 22, 2018 at 8:36 AM Jean Carlo <jean.jeancar...@gmail.com> wrote:

Hello

I would like to clarify this,

 

In order to initialize a Cassandra multi-DC cluster without data, I follow the 
DataStax documentation:




https://docs.datastax.com/en/cassandra/2.1/cassandra/initialize/initializeMultipleDS.html

It says

*   auto_bootstrap: false (Add this setting only when initializing a clean 
node with no data.) 

But I don't understand how this works with regard to auto_bootstrap. 

If all the machines generate their own tokens randomly using 
Murmur3Partitioner and vnodes, isn't it probable that two nodes will have 
tokens in common?

Isn't it better to bootstrap the seeds first with auto_bootstrap: false and 
then the rest of the nodes with auto_bootstrap: true?
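
The chance of two nodes drawing the same random token can be estimated with the birthday bound. This is a rough sketch, assuming tokens are drawn uniformly and independently from the 2^64 values of the Murmur3Partitioner range; the cluster of 6 nodes with 256 vnodes each is a hypothetical example.

```python
import math

RING_SIZE = 2**64  # number of possible Murmur3Partitioner tokens


def collision_probability(total_tokens, ring_size=RING_SIZE):
    """Birthday-bound estimate that at least two of total_tokens
    uniformly random tokens are equal."""
    pairs = total_tokens * (total_tokens - 1) / 2
    # 1 - exp(-pairs / ring_size), computed with expm1 to keep precision
    # when the probability is extremely small.
    return -math.expm1(-pairs / ring_size)


if __name__ == "__main__":
    nodes, vnodes = 6, 256  # hypothetical cluster size
    print(collision_probability(nodes * vnodes))
```

Under these assumptions the estimate is vanishingly small for clusters of ordinary size, because the token space is so much larger than the number of tokens in use.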

 

Thank you for the help

 

Jean Carlo


"The best way to predict the future is to invent it" Alan Kay

RE: Initializing a multiple node cluster (multiple datacenters)

2018-02-22 Thread Kenneth Brotman

That information would have saved me time too.  Thanks for making a JIRA for it 
Jon.  Perhaps this is a good JIRA for me to begin with.

 

Kenneth Brotman  

 

From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Thursday, February 22, 2018 11:11 AM
To: user
Subject: Re: Initializing a multiple node cluster (multiple datacenters)

 

Great question.  Unfortunately, our OSS docs lack a step by step process on how 
to add a DC, I’ve created a JIRA to do that: 
https://issues.apache.org/jira/browse/CASSANDRA-14254

 

The datastax docs are pretty good for this though: 
https://docs.datastax.com/en/cassandra/latest/cassandra/operations/opsAddDCToCluster.html

 

Regarding token allocation, it was random prior to 3.0.  In 3.0 and up, it is 
calculated a little more intelligently.  In 3.11.2, which was just released, 
CASSANDRA-13080 was backported, which will help out when you add your second DC.  
If you go this route, you can drop your token count down to 16 and get all the 
benefits with no drawbacks.  

 

At this point I would go straight to 3.11.2 and skip 3.0 as there were quite a 
few improvements that make it worthwhile along the way, in my opinion.  We work 
with several customers that are running 3.11 and are pretty happy with it. 

 

Yes, if there’s no data, you can initialize the cluster with auto_bootstrap: 
true.  Be sure to change any keyspaces using SimpleStrategy to 
NetworkTopologyStrategy (NTS) first, and replicate them to the new DC as well. 

 

Jon

 





On Feb 22, 2018, at 10:53 AM, Jean Carlo <jean.jeancar...@gmail.com> wrote:

 

Hi jonathan

 

Thank you for the answer. Do you know where to look to understand why this 
works? As I understood it, all the nodes will then choose random tokens. How can I 
ensure the correctness of the ring?

 

So, as you said, under the condition that there is no data in the cluster, I 
can initialize a multi-DC cluster without disabling auto_bootstrap?

 

On Feb 22, 2018 5:43 PM, "Jonathan Haddad" <j...@jonhaddad.com> wrote:

If it's a new cluster, there's no need to disable auto_bootstrap.  That setting 
prevents the first node in the second DC from being a replica for all the data 
in the first DC.  If there's no data in the first DC, you can skip a couple 
steps and just leave it on.

 

Leave it on, and enjoy your afternoon.

 

Seeds don't bootstrap by the way, changing the setting on those nodes doesn't 
do anything.

 

On Thu, Feb 22, 2018 at 8:36 AM Jean Carlo <jean.jeancar...@gmail.com> wrote:

Hello

I would like to clarify this,

 

In order to initialize a Cassandra multi-DC cluster without data, I follow the 
DataStax documentation:




https://docs.datastax.com/en/cassandra/2.1/cassandra/initialize/initializeMultipleDS.html



It says

*   auto_bootstrap: false (Add this setting only when initializing a clean 
node with no data.) 

But I don't understand how this works with regard to auto_bootstrap. 

If all the machines generate their own tokens randomly using 
Murmur3Partitioner and vnodes, isn't it probable that two nodes will have 
tokens in common?

Isn't it better to bootstrap the seeds first with auto_bootstrap: false and 
then the rest of the nodes with auto_bootstrap: true?

 

Thank you for the help

 

Jean Carlo


"The best way to predict the future is to invent it" Alan Kay

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman

 

Jeff,

 

I already addressed everything you said.  Boy! Would I like to bring up the out 
of date articles on the web that trip people up and the lousy documentation on 
the Apache website but I can’t because a lot of folks don’t know me or why I’m 
saying these things.  

 

I will be making another post that I hope clarifies what’s going on with me.  
After that I will either be a freakishly valuable asset to this community or I 
will be a freakishly valuable asset to another community.  

 

You sure have a funny way of reining in people that are used to helping out.  
You sure misjudged me.  Wow.

 

Kenneth Brotman

 

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Wednesday, February 21, 2018 3:12 PM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

 

 

On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Hi Akash,

I get the part about outside work which is why in replying to Jeff Jirsa I was 
suggesting the big companies could justify taking it on easy enough and you 
know actually pay the people who would be working at it so those people could 
have a life.

The part I don't get is the aversion to usability.  Isn't that what you think 
about when you are coding?  "Am I making this thing I'm building easy to use?"  
If you were programming for me, we would be constantly talking about what we 
are building and how we can make things easier for users.  If I had to fight 
with a developer, architect or engineer about usability all the time, they 
would be gone and quick.  How do you approach programming if you aren't trying to 
make things easy?

 

 

There's no aversion to usability; you're assuming things that just aren't true. 
Nobody's against usability, we've just prioritized other things HIGHER. We make 
those decisions in part by looking at open JIRAs and determining what's asked 
for the most, what members of the community have contributed, and then balance 
that against what we ourselves care about. You're making a statement that it 
should be the top priority for the next release, with no JIRA, no history of 
contributing (and indeed, no real clear sign that you even understand the full 
extent of the database), no sign that you're willing to do the work yourself, 
and making a ton of assumptions about the level of effort and ROI.

 

I would love for Cassandra to be easier to use; I'm sure everyone would. There's 
a dozen features I'd love to add if I had infinite budget and infinite 
manpower. But what you're asking for is A LOT of effort and / or A LOT of 
money, and you're assuming someone's going to step up and foot the bill, but 
there's no real reason to believe that's the case. 

 

In the meantime, everyone's spending hours replying to this thread that is 0% 
actionable. We would all have been objectively better off had everyone ignored 
this thread and just spent 10 minutes writing some section of the docs. So the 
next time I get the urge to reply, I'm just going to do that instead.

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman

Hi Akash,

I get the part about outside work which is why in replying to Jeff Jirsa I was 
suggesting the big companies could justify taking it on easy enough and you 
know actually pay the people who would be working at it so those people could 
have a life.

The part I don't get is the aversion to usability.  Isn't that what you think 
about when you are coding?  "Am I making this thing I'm building easy to use?"  
If you were programming for me, we would be constantly talking about what we 
are building and how we can make things easier for users.  If I had to fight 
with a developer, architect or engineer about usability all the time, they 
would be gone and quick.  How do you approach programming if you aren't trying to 
make things easy?

Kenneth Brotman

-Original Message-
From: Akash Gangil [mailto:akashg1...@gmail.com] 
Sent: Wednesday, February 21, 2018 2:24 PM
To: d...@cassandra.apache.org
Cc: user@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

I would second Jon in the arguments he made. Contributing outside work is 
draining and really requires a lot of commitment. If someone requires features 
around usability etc, just pay for it, period.

On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman < 
kenbrot...@yahoo.com.invalid> wrote:

> Jon,
>
> Very sorry that you don't see the value of the time I'm taking for this.
> I don't have demands; I do have a stern warning and I'm right Jon.  
> Please be very careful not to mischaracterize my words Jon.
>
> You suggest I put things in JIRA's, then seem to suggest that I'd be 
> lucky if anyone looked at it and did anything. That's what I figured too.
>
> I don't appreciate the hostility.  You will understand more fully in 
> the next post where I'm coming from.  Try to keep the conversation civilized.
> I'm trying or at least so you understand I think what I'm doing is 
> saving your gig and mine.  I really like a lot of people in this group.
>
> I've come to a preliminary assessment on things.  Soon the cloud will 
> clear or I'll be gone.  Don't worry.  I'm a very peaceful person and 
> like you I am driven by real important projects that I feel compelled 
> to work on for the good of others.  I don't have time for people to 
> hand hold a database and I can't get stuck with my projects on the wrong 
> stuff.
>
> Kenneth Brotman
>
>
> -Original Message-
> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon 
> Haddad
> Sent: Wednesday, February 21, 2018 12:44 PM
> To: user@cassandra.apache.org
> Cc: d...@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> Ken,
>
> Maybe it’s not clear how open source projects work, so let me try to 
> explain.  There’s a bunch of us who either get paid by someone or 
> volunteer in our free time.  The folks that get paid, (yay!) usually 
> take direction on what the priorities are, and work on projects that 
> directly affect our jobs.  That means that someone needs to care 
> enough about the features you want to work on them, if you’re not going to do 
> it yourself.
>
> Now as others have said already, please put your list of demands in 
> JIRA, if someone is interested, they will work on it.  You may need to 
> contribute a little more than you’ve done already, be prepared to get 
> involved if you actually want to see something get done.  Perhaps 
> learning a little more about Cassandra’s internals and the people 
> involved will reveal some of the design decisions and priorities of the 
> project.
>
> Third, you seem to be a little obsessed with market share.  While 
> market share is fun to talk about, *most* of us that are working on 
> and contributing to Cassandra do so because it does actually solve a 
> problem we have, and solves it reasonably well.  If some magic open 
> source DB appears out of nowhere and does everything you want 
> Cassandra to, and is bug free, keeps your data consistent, 
> automatically does backups, comes with really nice cert management, ad 
> hoc querying, amazing materialized views that are perfect, no caveats 
> to secondary indexes, and somehow still gives you linear scalability 
> without any mental overhead whatsoever then sure, people might start 
> using it.  And that’s actually OK, because if that happens we’ll all 
> be incredibly pumped out of our minds because we won’t have to work as 
> hard.  If on the slim chance that doesn’t manifest, those of us that 
> use Cassandra and are part of the community will keep working on the 
> things we care about, iterating, and improving things.  Maybe someone will 
> even take a look at your JIRA issues.
>
> Further filling the mailing list with your grievances will likely not 
> help you progress tow

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman

Jon,

Very sorry that you don't see the value of the time I'm taking for this.  I 
don't have demands; I do have a stern warning and I'm right Jon.  Please be 
very careful not to mischaracterize my words Jon.

You suggest I put things in JIRA's, then seem to suggest that I'd be lucky if 
anyone looked at it and did anything. That's what I figured too.  

I don't appreciate the hostility.  You will understand more fully in the next 
post where I'm coming from.  Try to keep the conversation civilized.  I'm 
trying or at least so you understand I think what I'm doing is saving your gig 
and mine.  I really like a lot of people in this group.

I've come to a preliminary assessment on things.  Soon the cloud will clear or 
I'll be gone.  Don't worry.  I'm a very peaceful person and like you I am 
driven by real important projects that I feel compelled to work on for the good 
of others.  I don't have time for people to hand hold a database and I can't 
get stuck with my projects on the wrong stuff.  

Kenneth Brotman

-Original Message-
From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Wednesday, February 21, 2018 12:44 PM
To: user@cassandra.apache.org
Cc: d...@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Ken,

Maybe it’s not clear how open source projects work, so let me try to explain.  
There’s a bunch of us who either get paid by someone or volunteer in our free 
time.  The folks that get paid, (yay!) usually take direction on what the 
priorities are, and work on projects that directly affect our jobs.  That means 
that someone needs to care enough about the features you want to work on them, 
if you’re not going to do it yourself. 

Now as others have said already, please put your list of demands in JIRA, if 
someone is interested, they will work on it.  You may need to contribute a 
little more than you’ve done already, be prepared to get involved if you 
actually want to see something get done.  Perhaps learning a little more 
about Cassandra’s internals and the people involved will reveal some of the 
design decisions and priorities of the project.  

Third, you seem to be a little obsessed with market share.  While market share 
is fun to talk about, *most* of us that are working on and contributing to 
Cassandra do so because it does actually solve a problem we have, and solves it 
reasonably well.  If some magic open source DB appears out of nowhere and does 
everything you want Cassandra to, and is bug free, keeps your data consistent, 
automatically does backups, comes with really nice cert management, ad hoc 
querying, amazing materialized views that are perfect, no caveats to secondary 
indexes, and somehow still gives you linear scalability without any mental 
overhead whatsoever then sure, people might start using it.  And that’s 
actually OK, because if that happens we’ll all be incredibly pumped out of our 
minds because we won’t have to work as hard.  If on the slim chance that 
doesn’t manifest, those of us that use Cassandra and are part of the community 
will keep working on the things we care about, iterating, and improving things. 
 Maybe someone will even take a look at your JIRA issues.  

Further filling the mailing list with your grievances will likely not help you 
progress towards your goal of a Cassandra that’s easier to use, so I encourage 
you to try to be a little more productive and try to help rather than just 
complain, which is not constructive.  I did a quick search for your name on the 
mailing list, and I’ve seen very little from you, so to everyone who’s been 
around for a while and trying to help you it looks like you’re just some random 
dude asking for people to work for free on the things you’re asking for, 
without offering anything back in return.

Jon

> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
> wrote:
> 
> Josh,
> 
> To say nothing is indifference.  If you care about your community, sometimes 
> don't you have to bring up a subject even though you know it's also 
> temporarily adding some discomfort?  
> 
> As to opening a JIRA, I've got a very specific topic to try in mind now.  An 
> easy one I'll work on and then announce.  Someone else will have to do the 
> coding.  A year from now I would probably just knock it out to make sure it's 
> as easy as I expect it to be but to be honest, as I've been saying, I'm not 
> set up to do that right now.  For one, I've barely looked at any Cassandra 
> code; secondly, everyone on this list probably codes more than I do; and 
> lastly, it's a good one for someone that wants an easy one to start with: 
> vNodes.  I've already seen too many people seeking assistance with the vNode 
> setting.
> 
> And you can expect as others have been mentioning that there should be 
> similar ones on compaction, repair and backup. 
>

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman

 Josh, 

To say nothing is indifference.  If you care about your community, sometimes 
don't you have to bring up a subject even though you know it's also temporarily 
adding some discomfort?  

As to opening a JIRA, I've got a very specific topic to try in mind now.  An 
easy one I'll work on and then announce.  Someone else will have to do the 
coding.  A year from now I would probably just knock it out to make sure it's 
as easy as I expect it to be but to be honest, as I've been saying, I'm not set 
up to do that right now.  For one, I've barely looked at any Cassandra code; 
secondly, everyone on this list probably codes more than I do; and lastly, it's 
a good one for someone that wants an easy one to start with: vNodes.  I've 
already seen too many people seeking assistance with the vNode setting.

And you can expect as others have been mentioning that there should be similar 
ones on compaction, repair and backup. 

Microsoft knows poor usability gives them an easy market to take over. And they 
make it easy to switch.

Beginning at 4:17 in the video, it says the following:

"You don't need to worry about replica sets, quorum or read repair.  
You can focus on writing correct application logic."

At 4:42, it says:
"Hopefully this gives you a quick idea of how seamlessly you can bring 
your existing Cassandra applications to Azure Cosmos DB.  No code changes are 
required.  It works with your favorite Cassandra tools and drivers including 
for example native Cassandra driver for Spark. And it takes seconds to get 
going, and it's elastically and globally scalable."

More to come,

Kenneth Brotman

-Original Message-
From: Josh McKenzie [mailto:jmcken...@apache.org] 
Sent: Wednesday, February 21, 2018 8:28 AM
To: d...@cassandra.apache.org
Cc: User
Subject: Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad, and here's 
what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a 
patch to move the needle on *any* of these things being complained about in 
this thread.

For the Apache Way <https://www.apache.org/foundation/governance/> to work, 
people need to step up and meaningfully contribute to a project to scratch 
their own itch instead of just waiting for a random corporation-subsidized 
engineer to happen to have interests that align with them and contribute that 
to the project.

Beating a dead horse for things everyone on the project knows are serious pain 
points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin < 
oleksandr.shul...@zalando.de> wrote:

> On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman < 
> kenbrot...@yahoo.com.invalid> wrote:
>
> >
> > >> Cluster wide management should be a big theme in any next major release.
> > >>
> > >Na. Stability and testing should be a big theme in the next major release.
> > >
> >
> > Double Na on that one Jeff.  I think you have a concern there about 
> > the need to test sufficiently to ensure the stability of the next 
> > major release.  That makes perfect sense - for every release, 
> > especially the major ones.  Continuous improvement is not a phase of 
> > development for example.  CI should be in everything, in every 
> > phase.  Stability and testing are a part of every release, not just one.  
> > A major release should be a nice step from the previous major release 
> > though.
> >
>
> I guess what Jeff refers to is the tick-tock release cycle experiment, 
> which has proven to be a complete disaster by popular opinion.
>
> There's also the "materialized views" feature which failed to 
> materialize in the end (pun intended) and had to be declared 
> experimental retroactively.
>
> Another prominent example is incremental repair which was introduced 
> as the default option in 2.2 and now is not recommended to use because 
> of so many corner cases where it can fail.  So again experimental as an 
> afterthought.
>
> Not to mention that even if you are aware of the default incremental 
> and go with full repair instead, you're still up for a sad surprise:
> anti-compaction will be triggered despite the "full" repair.  Because 
> anti-compaction is only disabled in case of sub-range repair (don't 
> ask why), so you need to use something advanced like Reaper if you 
> want to avoid that.  I don't think you'll ever find this in the documentation.
>
> Honestly, for an eventually-consistent system like Cassandra 
> anti-entropy repair is one of the most important pieces to get right.  
> And Cassandra fails really badly on that one: the feature is not 
> really well d

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-20 Thread Kenneth Brotman

If you watch this video through you'll see why usability is so important.  You 
can't ignore usability issues.  

Cassandra does not exist in a vacuum.  The competitors are world class.  

The video is on the New Cassandra API for Azure Cosmos DB:
https://www.youtube.com/watch?v=1Sf4McGN1AQ

Kenneth Brotman

-Original Message-
From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-in...@bitmovin.com] 
Sent: Tuesday, February 20, 2018 1:28 AM
To: user@cassandra.apache.org; James Briggs
Cc: d...@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Hi,

I have to add my own two cents here as the main thing that keeps me from really 
running Cassandra is the amount of pain running it incurs.
Not so much because it's actually painful but because the tools are so 
different and the documentation and best practices are scattered across a dozen 
outdated DataStax articles and this mailing list etc. We've been hesitant 
(although our use case is perfect for using Cassandra) to deploy Cassandra to 
any critical systems as even after a year of running it we still don't have the 
operational experience to confidently run critical systems with it.

Simple things like a foolproof / safe cluster-wide S3 Backup (like 
Elasticsearch has it) would for example solve a TON of issues for new people. I 
don't need it auto-scheduled or something, but having to configure cron jobs 
across the whole cluster is a pain in the ass for small teams.
To be honest, even the way snapshots are done right now is already super 
painful. Every other system I operated so far will just create one backup 
folder I can export, in C* the Backup is scattered across a bunch of different 
Keyspace folders etc. Needless to say, it took a while until I trusted my 
backup scripts fully.
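
The scattered snapshot layout described above can be illustrated with a small sketch. It assumes the usual on-disk layout of <data_dir>/<keyspace>/<table>/snapshots/<tag>/; the function names, the "nightly" tag and the /backup/nightly destination are hypothetical and only there for illustration. It does not copy anything, it only builds a plan that maps every per-table snapshot folder to one export tree.

```python
from pathlib import Path
import tempfile


def find_snapshot_dirs(data_dir, tag):
    """Return every <keyspace>/<table>/snapshots/<tag> directory under data_dir."""
    root = Path(data_dir)
    # Assumed layout: <data_dir>/<keyspace>/<table>/snapshots/<tag>/
    return sorted(p for p in root.glob(f"*/*/snapshots/{tag}") if p.is_dir())


def export_plan(data_dir, tag, dest):
    """Map each snapshot directory to dest/<keyspace>/<table> (nothing is copied)."""
    root = Path(data_dir)
    plan = {}
    for snap in find_snapshot_dirs(root, tag):
        keyspace, table = snap.relative_to(root).parts[:2]
        plan[snap] = Path(dest) / keyspace / table
    return plan


if __name__ == "__main__":
    # Build a throwaway directory tree that mimics two keyspaces with one table each.
    with tempfile.TemporaryDirectory() as tmp:
        for ks, tbl in [("ks1", "users-abc"), ("ks2", "events-def")]:
            (Path(tmp) / ks / tbl / "snapshots" / "nightly").mkdir(parents=True)
        for src, dst in export_plan(tmp, "nightly", "/backup/nightly").items():
            print(src, "->", dst)
```

A real backup script would still have to copy the files and handle failures; the point here is only that one tag ends up spread over as many folders as there are tables.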

And especially for a Database I believe Backup/Restore needs to be a non-issue 
that's documented front and center. If not, smaller teams just don't have the 
resources to dedicate to learning and building the tools around it.

Now that the team is getting larger we could spare the resources to operate 
these things, but switching from a well-understood RDBMS schema to Cassandra is 
now incredibly hard and will probably take years.

greetings Daniel

On Tue, 20 Feb 2018 at 05:56 James Briggs <james.bri...@yahoo.com.invalid>
wrote:

> Kenneth:
>
> What you said is not wrong.
>
> Vertica and Riak are examples of distributed databases that don't 
> require hand-holding.
>
> Cassandra is for Java-programmer DIYers, or more often Datastax 
> clients, at this point.
> Thanks, James.
>
> ------
> *From:* Kenneth Brotman <kenbrot...@yahoo.com.INVALID>
> *To:* user@cassandra.apache.org
> *Cc:* d...@cassandra.apache.org
> *Sent:* Monday, February 19, 2018 4:56 PM
>
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Jeff, you helped me figure out what I was missing.  It just took me a 
> day to digest what you wrote.  I’m coming over from another type of 
> engineering.  I didn’t know and it’s not really documented.  Cassandra 
> runs in a data center.  Nowadays that means the nodes are going to be 
> in managed containers, Docker containers, managed by Kubernetes, 
> Mesos or something, and for that reason anyone operating Cassandra in a 
> real world setting would not encounter the issues I raised in the way I 
> described.
>
> Shouldn’t the architectural diagrams people reference indicate that in 
> some way?  That would have helped me.
>
> Kenneth Brotman
>
> *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com]
> *Sent:* Monday, February 19, 2018 10:43 AM
> *To:* 'user@cassandra.apache.org'
> *Cc:* 'd...@cassandra.apache.org'
> *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>
> Well said.  Very fair.  I wouldn’t mind hearing from others still.  
> You’re a good guy!
>
> Kenneth Brotman
>
> *From:* Jeff Jirsa [mailto:jji...@gmail.com <jji...@gmail.com>]
> *Sent:* Monday, February 19, 2018 9:10 AM
> *To:* cassandra
> *Cc:* Cassandra DEV
> *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>
> There's a lot of things below I disagree with, but it's ok. I 
> convinced myself not to nit-pick every point.
>
> https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of 
> Stefan's work with cert management
>
> Beyond that, I encourage you to do what Michael suggested: open JIRAs 
> for things you care strongly about, work on them if you have time. 
> Sometime this year we'll schedule a NGCC (Next Generation Cassandra 
> Conference) where we talk about future project work and direction, I 
> encourage you to attend if you're able (I encourage anyone who cares 
> about the direction of Cassandra to attend, it'll probably be either 
> free or very low cost, just to cover a venue an

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Kenneth Brotman

Jeff, you helped me figure out what I was missing.  It just took me a day to 
digest what you wrote.  I’m coming over from another type of engineering.  I 
didn’t know and it’s not really documented.  Cassandra runs in a data center.  
Nowadays that means the nodes are going to be in managed containers, Docker 
containers, managed by Kubernetes, Mesos or something, and for that reason 
anyone operating Cassandra in a real world setting would not encounter the 
issues I raised in the way I described.

Shouldn’t the architectural diagrams people reference indicate that in some 
way?  That would have helped me.

Kenneth Brotman

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Monday, February 19, 2018 10:43 AM
To: 'user@cassandra.apache.org'
Cc: 'd...@cassandra.apache.org'
Subject: RE: Cassandra Needs to Grow Up by Version Five!

Well said.  Very fair.  I wouldn’t mind hearing from others still.  You’re a 
good guy!

Kenneth Brotman

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

There's a lot of things below I disagree with, but it's ok. I convinced myself 
not to nit-pick every point.

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work 
with cert management

Beyond that, I encourage you to do what Michael suggested: open JIRAs for 
things you care strongly about, work on them if you have time. Sometime this 
year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk 
about future project work and direction, I encourage you to attend if you're 
able (I encourage anyone who cares about the direction of Cassandra to attend, 
it'll probably be either free or very low cost, just to cover a venue and some 
food). If nothing else, you'll meet some of the teams who are working on the 
project, and learn why they've selected the projects on which they're working. 
You'll have an opportunity to pitch your vision, and maybe you can talk some 
folks into helping out. 

- Jeff

On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Comments inline

>-Original Message-
>From: Jeff Jirsa [mailto:jji...@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: user@cassandra.apache.org
>Cc: d...@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that 
> >it’s open source or cutting edge.  It’s an open source cutting edge program 
> >that lacks some of its basic functionality.  We are all stuck addressing 
> >fundamental mechanical tasks for Cassandra because the basic code that would 
> >do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very 
>narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. 
>We don’t ship fancy UIs. We ship a database. I think for the most part the 
>narrow vision has been for the best, but maybe it’s time to reconsider some of 
>the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
>know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
>the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is the King. 
 When users have to learn the database, then learn what they have to automate, 
then learn an automation tool and then use the automation tool to do something 
that is as fundamental as the fundamental tasks I described, then something is 
missing from the database itself that is adversely affecting usability - and 
that is very bad.  Where those big companies need to calculate the ROI is in 
the cost of acquiring or training the next group of users.  Consider how steep 
the learning curve is for new users.  Consider the business case for improving 
ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of 
>the companies working on/with it tend to be big companies. Big companies often 
>have pre-existing automation that solved the stuff you consider fundamental 
>tasks, so there’s probably nobody actively working on the solved problems that 
>you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the 
companies would take the time to contribute more code, then the rest of the 
code needed could be generated easily.

>3) It’s not nearly as basic as you think it

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Kenneth Brotman

Well said.  Very fair.  I wouldn’t mind hearing from others still.  You’re a 
good guy!

Kenneth Brotman

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

There's a lot of things below I disagree with, but it's ok. I convinced myself 
not to nit-pick every point.

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work 
with cert management

Beyond that, I encourage you to do what Michael suggested: open JIRAs for 
things you care strongly about, work on them if you have time. Sometime this 
year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk 
about future project work and direction, I encourage you to attend if you're 
able (I encourage anyone who cares about the direction of Cassandra to attend, 
it'll probably be either free or very low cost, just to cover a venue and some 
food). If nothing else, you'll meet some of the teams who are working on the 
project, and learn why they've selected the projects on which they're working. 
You'll have an opportunity to pitch your vision, and maybe you can talk some 
folks into helping out. 

- Jeff

On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Comments inline

>-Original Message-
>From: Jeff Jirsa [mailto:jji...@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: user@cassandra.apache.org
>Cc: d...@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that 
> >it’s open source or cutting edge.  It’s an open source cutting edge program 
> >that lacks some of its basic functionality.  We are all stuck addressing 
> >fundamental mechanical tasks for Cassandra because the basic code that would 
> >do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very 
>narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. 
>We don’t ship fancy UIs. We ship a database. I think for the most part the 
>narrow vision has been for the best, but maybe it’s time to reconsider some of 
>the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
>know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
>the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is the King. 
 When users have to learn the database, then learn what they have to automate, 
then learn an automation tool and then use the automation tool to do something 
that is as fundamental as the fundamental tasks I described, then something is 
missing from the database itself that is adversely affecting usability - and 
that is very bad.  Where those big companies need to calculate the ROI is in 
the cost of acquiring or training the next group of users.  Consider how steep 
the learning curve is for new users.  Consider the business case for improving 
ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of 
>the companies working on/with it tend to be big companies. Big companies often 
>have pre-existing automation that solved the stuff you consider fundamental 
>tasks, so there’s probably nobody actively working on the solved problems that 
>you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the 
companies would take the time to contribute more code, then the rest of the 
code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a 
>multi-person team on opscenter, and while it was better than anything else 
>around last time I used it (before it stopped supporting the OSS version), it 
>left a lot to be desired. It’s probably 2-3 engineers working for a month  to 
>have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and 
>I can think of about 10 JIRAs I’d rather see that time be spent on first.

How about 6-9 engineers working 12 months a year on it then.  I'm not kidding.  
For a big company with revenues in the tens of billions or more, and a heavy 
use of Cassandra nodes, it's easy to make a case for having a full time person 
or more that involved.  They aren't paying for using the open source code that 
is Cassandra.  Let's see what would the licensing fees be for a big company if 
the costs were like

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Kenneth Brotman

Comments inline

>-Original Message-
>From: Jeff Jirsa [mailto:jji...@gmail.com] 
>Sent: Sunday, February 18, 2018 10:58 PM
>To: user@cassandra.apache.org
>Cc: d...@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline 
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> 
>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that 
> >it’s open source or cutting edge.  It’s an open source cutting edge program 
> >that lacks some of its basic functionality.  We are all stuck addressing 
> >fundamental mechanical tasks for Cassandra because the basic code that would 
> >do that part has not been contributed yet.
>> 
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very 
>narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. 
>We don’t ship fancy UIs. We ship a database. I think for the most part the 
>narrow vision has been for the best, but maybe it’s time to reconsider some of 
>the scope. 
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
>know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
>the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is the King. 
 When users have to learn the database, then learn what they have to automate, 
then learn an automation tool and then use the automation tool to do something 
that is as fundamental as the fundamental tasks I described, then something is 
missing from the database itself that is adversely affecting usability - and 
that is very bad.  Where those big companies need to calculate the ROI is in 
the cost of acquiring or training the next group of users.  Consider how steep 
the learning curve is for new users.  Consider the business case for improving 
ease of use. 

>2) Cassandra is, by definition, a database for large scale problems. Most of 
>the companies working on/with it tend to be big companies. Big companies often 
>have pre-existing automation that solved the stuff you consider fundamental 
>tasks, so there’s probably nobody actively working on the solved problems that 
>you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the 
companies would take the time to contribute more code, then the rest of the 
code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a 
>multi-person team on opscenter, and while it was better than anything else 
>around last time I used it (before it stopped supporting the OSS version), it 
>left a lot to be desired. It’s probably 2-3 engineers working for a month  to 
>have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and 
>I can think of about 10 JIRAs I’d rather see that time be spent on first. 

How about 6-9 engineers working 12 months a year on it then.  I'm not kidding.  
For a big company with revenues in the tens of billions or more, and a heavy 
use of Cassandra nodes, it's easy to make a case for having a full time person 
or more that involved.  They aren't paying for using the open source code that 
is Cassandra.  Let's see, what would the licensing fees be for a big company if 
the costs were like what Microsoft or Oracle would charge for their enterprise 
level relational database?  What's the contribution of one or two people in 
comparison?

>> Ease of use issues need to be given much more attention.  For an 
>> administrator, the ease of use of Cassandra is very poor. 
>>
>>Furthermore, currently Cassandra is an idiot.  We have to do everything for 
>>Cassandra. Contrast that with the fact that we are in the dawn of artificial 
>>intelligence.
>> 
>
>And for everything you think is obvious, there’s a 50% chance someone else 
>will have already solved differently, and your obvious new solution will be 
>seen as an inconvenient assumption and complexity they won’t appreciate. Open 
>source projects get to walk a fine line of trying to be useful without making 
>too many assumptions, being “too” opinionated, or overstepping bounds. We may 
>be too conservative, but it’s very easy to go too far in the opposite 
>direction. 
>

I appreciate that but when such concerns result in inaction instead of 
resolution that is no good.

>> Software exists to automate tasks for humans, not mechanize humans to 
>> administer tasks for a database.  I’m an engineering type.  My job is to 
>> apply science and technology to solve real world problems.  And th

Cassandra Needs to Grow Up by Version Five!

2018-02-18 Thread Kenneth Brotman

Cassandra feels like an unfinished program to me.  The problem is not that
it's open source or cutting edge.  It's an open source cutting edge program
that lacks some of its basic functionality.  We are all stuck addressing
fundamental mechanical tasks for Cassandra because the basic code that would
do that part has not been contributed yet.

Ease of use issues need to be given much more attention.  For an
administrator, the ease of use of Cassandra is very poor.  

Furthermore, currently Cassandra is an idiot.  We have to do everything for
Cassandra. Contrast that with the fact that we are in the dawn of artificial
intelligence.

Software exists to automate tasks for humans, not mechanize humans to
administer tasks for a database.  I'm an engineering type.  My job is to
apply science and technology to solve real world problems.  And that's where
I need an organization's I.T. talent to focus; not in crank starting an
unfinished database.

For example, I should be able to go to any node, replace the Cassandra.yaml
file and have a prompt on the display ask me if I want to update all the
yaml files across the cluster.  I shouldn't have to manually modify yaml
files on each node or have to create a script for some third party
automation tool to do it.  
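
As a rough illustration of the script an administrator ends up writing today, here is a minimal dry-run sketch. The host names and the remote path /etc/cassandra/cassandra.yaml are assumptions (the path varies by install method), and the function name is hypothetical. It only builds the scp commands and prints them; nothing is executed, and copying the file does not by itself apply it, since a running node generally needs a restart to pick up yaml changes.

```python
import shlex


def build_push_commands(hosts, local_yaml, remote_yaml="/etc/cassandra/cassandra.yaml"):
    """Return one scp command (as an argv list) per host; nothing is executed."""
    return [["scp", local_yaml, f"{host}:{remote_yaml}"] for host in hosts]


if __name__ == "__main__":
    # Hypothetical host names, used only for the dry run.
    hosts = ["node1.example.com", "node2.example.com"]
    for cmd in build_push_commands(hosts, "cassandra.yaml"):
        print(shlex.join(cmd))
```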

I should not have to turn off service, clear directories, restart service in
coordination with the other nodes.  It's already a computer system.  It can
do those things on its own.

How about read repair.  First there is something wrong with the name.  Maybe
it should be called Consistency Repair.  An administrator shouldn't have to
do anything.  It should be a behavior of Cassandra that is programmed in. It
should consider the GC setting of each node, calculate how often it has to
run repair, when it should run it so all the nodes aren't trying at the same
time and when other circumstances indicate it should also run it.
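
A minimal sketch of the scheduling idea, assuming the "GC setting" meant here is gc_grace_seconds (which is configured per table, with a default of 864000 seconds, i.e. ten days).  The function name and the safety_factor parameter are hypothetical.  It spreads one repair per node evenly across a cycle that is shorter than the grace period, so that no two nodes start at the same time.

```python
def repair_schedule(nodes, gc_grace_seconds, safety_factor=0.5):
    """Stagger one repair per node across a cycle shorter than gc_grace_seconds.

    Returns (cycle_seconds, {node: start_offset_seconds}).
    """
    if not nodes:
        raise ValueError("need at least one node")
    # Finish a full round of repairs well inside the grace period.
    cycle = int(gc_grace_seconds * safety_factor)
    # Each node gets its own slot so that starts never coincide.
    slot = cycle // len(nodes)
    return cycle, {node: i * slot for i, node in enumerate(nodes)}


if __name__ == "__main__":
    cycle, offsets = repair_schedule(["n1", "n2", "n3"], gc_grace_seconds=864000)
    print("cycle length (s):", cycle)
    for node, start in offsets.items():
        print(node, "starts at offset", start)
```

This ignores how long each repair actually takes and any load-based triggers, which a real scheduler would have to account for.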

Certificate management should be automated.

Cluster wide management should be a big theme in any next major release.
What is a major release?  How many major releases could a program have
before all the coding for basic stuff like installation, configuration and
maintenance is included!

Finish the basic coding of Cassandra, make it easy to use for
administrators, make it smart, add cluster wide management.  Keep Cassandra
competitive or it will soon be the old Model T we all remember fondly.

I ask the Committee to compile a list of all such items, make a plan, and
commit to including the completed and tested code as part of major release
5.0.  I further ask that release 4.0 not be delayed and then there be an
unusually short skip to version 5.0. 

Kenneth Brotman

A Big Thumbs Up to You for the VOTE

2018-02-14 Thread Kenneth Brotman

I'm watching the voting underway, my first time seeing it and must say to: 

Nate McCall, 

Jon Haddad, 

Michael Shuler, 

Brandon Williams, 

Kurt Greaves, 

MCK, 

Jeff Jirsa, 

Aleksey Yeshchenko, 

Marcus Eriksson, 

Gary Dusbabek, 

Josh McKenzie, 

Jason Brown,

and others,

 

how impressive you all are.  Thanks for stepping up and contributing - even
though you are all obviously really well paid I'm sure for all that
knowledge and ability you possess.

 

With admiration,

 

Kenneth Brotman

RE: Slender Cassandra Cluster Project

2018-01-31 Thread Kenneth Brotman

Thank you Yuri and Michael for the suggestion.  Yes, a Terraform version makes 
sense.  Will do.

Kenneth Brotman

-Original Message-
From: Yuri Subach [mailto:ysub...@gmail.com] 
Sent: Wednesday, January 31, 2018 7:20 AM
To: user@cassandra.apache.org
Subject: Re: Slender Cassandra Cluster Project

Yes, I'd prefer Terraform too.

On 2018-01-31 06:32:21, Michael Mior <mm...@uwaterloo.ca> wrote:
> While whatever format this comes out in would be helpful, you might 
> want to consider Terraform. 1Password recently published a blog post 
> on their experience with Terraform vs. CloudFormation.
> 
> https://blog.agilebits.com/2018/01/25/terraforming-1password/
> 
> --
> Michael Mior
> mm...@apache.org
> 
> 2018-01-31 2:34 GMT-05:00 Kenneth Brotman <kenbrot...@yahoo.com.invalid>:
> 
> > Hi Yuri,
> >
> > If possible I will do everything with AWS Cloudformation.  I'm 
> > working on it now.  Nothing published yet.
> >
> > Kenneth Brotman
> >
> > -Original Message-
> > From: Yuri Subach [mailto:ysub...@gmail.com]
> > Sent: Tuesday, January 30, 2018 7:02 PM
> > To: user@cassandra.apache.org
> > Subject: RE: Slender Cassandra Cluster Project
> >
> > Hi Kenneth,
> >
> > I like this project idea!
> >
> > A couple of questions:
> > - What tools are you going to use for AWS cluster setup?
> > - Do you have anything published already (github)?
> >
> > On 2018-01-22 22:42:11, Kenneth Brotman 
> > <kenbrot...@yahoo.com.INVALID>
> > wrote:
> > > Thanks Anthony!  I’ve made a note to include that information in 
> > > the
> > documentation. You’re right.  It won’t work as intended unless that 
> > is configured properly.
> > >
> > >
> > >
> > > I’m also favoring a couple other guidelines for Slender Cassandra:
> > >
> > > 1.   SSD’s only, no spinning disks
> > >
> > > 2.   At least two cores per node
> > >
> > >
> > >
> > > For AWS, I’m favoring the c3.large on Linux.  It’s available in these
> > > regions: US-East, US-West and US-West2.  The specifications are listed as:
> > >
> > > · Two (2) vCPU’s
> > >
> > > · 3.7 GiB Memory
> > >
> > > · Two (2) 16 GB SSD’s
> > >
> > > · Moderate I/O
> > >
> > >
> > >
> > > It’s going to be hard to beat the inexpensive cost of operating a
> > Slender Cluster on demand in the cloud – and it fits a lot of the 
> > use cases
> > well:
> > >
> > >
> > >
> > > · For under $100 a month, in current pricing for EC2
> > instances, you can operate an eighteen (18) node Slender Cluster for 
> > five
> > (5) hours a day, ten (10) days a month.  That’s fine for 
> > demonstrations, teaching or experiments that last half a day or less.
> > >
> > > · For under $20, you can have that Slender Cluster up all day
> > long, up to ten (10) hours, for whatever demonstrations or 
> > experiments you want it for.
> > >
> > >
> > >
> > > As always, feedback is encouraged.
> > >
> > >
> > >
> > > Thanks,
> > >
> > >
> > >
> > > Kenneth Brotman
> > >
> > >
> > >
> > > From: Anthony Grasso [mailto:anthony.gra...@gmail.com]
> > > Sent: Sunday, January 21, 2018 3:57 PM
> > > To: user
> > > Subject: Re: Slender Cassandra Cluster Project
> > >
> > >
> > >
> > > Hi Kenneth,
> > >
> > >
> > >
> > > Fantastic idea!
> > >
> > >
> > >
> > > One thing that came to mind from my reading of the proposed setup 
> > > was
> > rack awareness of each node. Given that the proposed setup contains 
> > three DCs, I assume that each node will be made rack aware? If not, 
> > consider defining three racks for each DC and placing two nodes in 
> > each rack. This will ensure that all the nodes in a single rack 
> > contain at most one replica of the data.
> > >
> > >
> > >
> > > Regards,
> > >
> > > Anthony
> > >
> > >
> > >
> > > On 17 January 2018 at 11:24, Kenneth Brotman
> > <kenbrot...@yahoo.com.invalid> wrote:
> > >
> > > Sure.  That takes the project from awesome to 10X awesome.  I 
> > > absolutely
> > would be willing to do that.  Thanks Kurt!

RE: Slender Cassandra Cluster Project

2018-01-30 Thread Kenneth Brotman

Hi Yuri,

If possible I will do everything with AWS Cloudformation.  I'm working on it 
now.  Nothing published yet.

Kenneth Brotman

-Original Message-
From: Yuri Subach [mailto:ysub...@gmail.com] 
Sent: Tuesday, January 30, 2018 7:02 PM
To: user@cassandra.apache.org
Subject: RE: Slender Cassandra Cluster Project

Hi Kenneth,

I like this project idea!

A couple of questions:
- What tools are you going to use for AWS cluster setup?
- Do you have anything published already (github)?

On 2018-01-22 22:42:11, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> wrote:
> Thanks Anthony!  I’ve made a note to include that information in the 
> documentation. You’re right.  It won’t work as intended unless that is 
> configured properly.
> 
>  
> 
> I’m also favoring a couple other guidelines for Slender Cassandra:
> 
> 1.   SSD’s only, no spinning disks
> 
> 2.   At least two cores per node
> 
>  
> 
> For AWS, I’m favoring the c3.large on Linux.  It’s available in these 
> regions: US-East, US-West and US-West2.  The specifications are listed as:
> 
> · Two (2) vCPU’s
> 
> · 3.7 GiB Memory
> 
> · Two (2) 16 GB SSD’s
> 
> · Moderate I/O
> 
>  
> 
> It’s going to be hard to beat the inexpensive cost of operating a Slender 
> Cluster on demand in the cloud – and it fits a lot of the use cases well:  
> 
>  
> 
> · For under $100 a month, in current pricing for EC2 instances, you 
> can operate an eighteen (18) node Slender Cluster for five (5) hours a day, 
> ten (10) days a month.  That’s fine for demonstrations, teaching or 
> experiments that last half a day or less. 
> 
> · For under $20, you can have that Slender Cluster up all day long, 
> up to ten (10) hours, for whatever demonstrations or experiments you want it 
> for.
> 
>  
> 
> As always, feedback is encouraged.
> 
>  
> 
> Thanks,
> 
>  
> 
> Kenneth Brotman
> 
>  
> 
> From: Anthony Grasso [mailto:anthony.gra...@gmail.com] 
> Sent: Sunday, January 21, 2018 3:57 PM
> To: user
> Subject: Re: Slender Cassandra Cluster Project
> 
>  
> 
> Hi Kenneth,
> 
>  
> 
> Fantastic idea!
> 
>  
> 
> One thing that came to mind from my reading of the proposed setup was rack 
> awareness of each node. Given that the proposed setup contains three DCs, I 
> assume that each node will be made rack aware? If not, consider defining 
> three racks for each DC and placing two nodes in each rack. This will ensure 
> that all the nodes in a single rack contain at most one replica of the data.
> 
>  
> 
> Regards,
> 
> Anthony
> 
>  
> 
> On 17 January 2018 at 11:24, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
> wrote:
> 
> Sure.  That takes the project from awesome to 10X awesome.  I absolutely 
> would be willing to do that.  Thanks Kurt!
> 
>  
> 
> Regarding your comment on the keyspaces, I agree.  There should be a few 
> simple examples one way or the other that can be duplicated and observed, and 
> then an example to duplicate and play with that has a nice real world mix, 
> with some keyspaces that replicate over only a subset of DC’s and some that 
> replicate to all DC’s.
> 
>  
> 
> Kenneth Brotman 
> 
>  
> 
> From: kurt greaves [mailto:k...@instaclustr.com] 
> Sent: Tuesday, January 16, 2018 1:31 PM
> To: User
> Subject: Re: Slender Cassandra Cluster Project
> 
>  
> 
> Sounds like a great idea. Probably would be valuable to add to the official 
> docs as an example set up if you're willing.
> 
>  
> 
> Only thing I'd add is that you should have keyspaces that replicate over only 
> a subset of DC's, plus one/some replicated to all DC's
> 
>  
> 
> On 17 Jan. 2018 03:26, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote:
> 
> I’ve begun working on a reference project intended to provide guidance on 
> configuring and operating a modest Cassandra cluster of about 18 nodes 
> suitable for the economic study, demonstration, experimentation and testing 
> of a Cassandra cluster.
> 
>  
> 
> The slender cluster would be designed to be as inexpensive as possible while 
> still using real world hardware in order to lower the cost to those with 
> limited initial resources. Sorry no Raspberry Pi’s for this project.  
> 
>  
> 
> There would be an on-premises version and a cloud version.  Guidance would be 
> provided on configuring the cluster, on demonstrating key Cassandra 
> behaviors, on file sizes, capacity to use with the Slender Cassandra 
> Cluster, and so on.
> 
>  
> 
> Why about eighteen nodes? I tried to figure out what the minimu

RE: TWCS not deleting expired sstables

2018-01-30 Thread Kenneth Brotman

Wow!  It’s in the DataStax documentation: 
https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSStabExpiredBlockers.html

 

Other nice tools there as well: 
https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableUtilitiesTOC.html

 

Kenneth Brotman

 

From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Monday, January 29, 2018 8:20 PM
To: User
Subject: Re: TWCS not deleting expired sstables

 

Likely a read repair caused old data to be brought into a newer SSTable. Try 
running sstableexpiredblockers to find out if there's a newer SSTable blocking 
that one from being dropped.
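
A minimal sketch of how that check could be scripted, assuming the tool is on the PATH and takes a keyspace and a table as positional arguments, as described in the documentation linked above. The names my_ks and my_table are hypothetical placeholders, and the sketch only builds and prints the command rather than running it.

```python
import shlex


# ASSUMPTION: "my_ks" and "my_table" are hypothetical names; substitute the
# keyspace and table whose expired SSTables are not being dropped.
def expired_blockers_cmd(keyspace: str, table: str) -> list:
    """Build the argv for `sstableexpiredblockers <keyspace> <table>`."""
    return ["sstableexpiredblockers", keyspace, table]


cmd = expired_blockers_cmd("my_ks", "my_table")
# Print the command for review; to execute it on a node, pass `cmd` to
# subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```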

RE: Nodes show different number of tokens than initially

2018-01-26 Thread Kenneth Brotman

Oleksandr,

 

Could it be that after distributing the data, some of the nodes did not need to 
have a fourth token?

 

Kenneth Brotman

 

From: Oleksandr Shulgin [mailto:oleksandr.shul...@zalando.de] 
Sent: Thursday, January 25, 2018 3:44 AM
To: User
Subject: Nodes show different number of tokens than initially

 

Hello,

 

While testing token allocation with version 3.0.15 we are experiencing a 
quite unexpected result.

 

We have deployed a secondary virtual DC with 6 nodes, 4 tokens per node.  We 
then started adding a 7th node to the new DC in order to observe the effect of 
ownership re-distribution.

 

To set up the new DC we've used the following steps:

 

1. Alter all keyspaces to replicate to the upcoming new DC.

2. Deploy 3 seed nodes (IP ends with .31) with num_tokens=4 and tokens 
specified by initial_token list, auto_bootstrap=false.

3. Deploy 3 more nodes (IP ends with .32) with num_tokens=4 and 
allocate_tokens_for_keyspace=data_ks, auto_bootstrap=true.

4. Rebuild all new nodes specifying eu-central as the source DC (for the 3 
already bootstrapped nodes, workaround by truncating system.available_ranges 
first).
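
The per-node settings from steps 2 and 3 can be sketched as generated cassandra.yaml fragments. This is only an illustration: the thread does not say how the initial_token values were chosen, so the evenly spaced, interleaved Murmur3 tokens below are an assumption, as are the helper names.

```python
# Sketch of the cassandra.yaml settings used in steps 2 and 3.
# ASSUMPTION: evenly spaced, interleaved Murmur3 tokens; the thread does not
# state how the initial_token lists were actually computed.
MIN_TOKEN = -2**63   # lowest Murmur3Partitioner token
RING_SIZE = 2**64    # number of possible tokens


def seed_tokens(node_index: int, node_count: int = 3, num_tokens: int = 4) -> list:
    """Tokens for one seed node, interleaved with the other seed nodes."""
    step = RING_SIZE // (node_count * num_tokens)
    return [MIN_TOKEN + (node_index + k * node_count) * step
            for k in range(num_tokens)]


def seed_yaml(node_index: int) -> str:
    """Step 2: fixed tokens, no bootstrap."""
    tokens = ",".join(str(t) for t in seed_tokens(node_index))
    return f"num_tokens: 4\ninitial_token: {tokens}\nauto_bootstrap: false\n"


def joining_yaml(keyspace: str = "data_ks") -> str:
    """Step 3: let Cassandra allocate tokens for the given keyspace."""
    return (f"num_tokens: 4\nallocate_tokens_for_keyspace: {keyspace}\n"
            "auto_bootstrap: true\n")


for i in range(3):
    print(f"# seed node {i + 1}\n{seed_yaml(i)}")
print(f"# non-seed node\n{joining_yaml()}")
```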

 

The following is the output of nodetool status after starting to bootstrap the 
7th node (172.31.128.33):

 

Datacenter: eu-central
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.160.12  26.4 GB    256     48.9%             89067222-b0eb-49e5-be7d-758ea24ace9a  1c
UN  172.31.144.12  28.92 GB   256     52.6%             2ab4786f-9722-4418-ba78-9c435cbb30e5  1b
UN  172.31.128.12  28.13 GB   256     47.9%             c4733a5c-abc5-4bab-9449-1e3f584cf64f  1a
UN  172.31.128.11  29.84 GB   256     52.2%             6083369c-1a0f-4098-a420-313dacd429b6  1a
UN  172.31.160.11  28.25 GB   256     51.1%             4dc361fc-818a-4b7f-abd3-9121488a7db1  1c
UN  172.31.144.11  28.14 GB   256     47.4%             05e5df92-d196-46d5-8812-e843fbbd2922  1b

Datacenter: eu-central_4vn
==========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.128.31  24.83 GB   4       45.8%             4d7decb3-8692-4aec-a2e1-2ac89aed8c5a  1a
UN  172.31.144.31  26.52 GB   4       45.8%             2eb29602-2df5-4f4f-b419-b5a94cf785f0  1b
UN  172.31.160.31  24.8 GB    4       45.8%             f1bd4696-c25c-4bc3-8c30-292f2bd027c1  1c
UJ  172.31.128.33  568.94 MB  4       ?                 ffa21d50-9bb4-4d2b-9e3e-7a6945f6f071  1a
UN  172.31.144.32  29.3 GB    4       54.2%             5ce019f6-99fd-4333-b231-d04a266229bb  1b
UN  172.31.160.32  27.8 GB    4       54.2%             193bef27-eea8-4aa6-9d5f-8baf3decdd76  1c
UN  172.31.128.32  30.5 GB    4       54.2%             6a046b64-31f9-4881-85b0-ab3a2f6dcdc4  1a

 

 

Then we wanted to start testing distribution with 8 vnodes.  For that we 
started to deploy yet another DC.

 

The following is the output of nodetool status after deploying the 3 seed nodes 
of the 8-tokens DC:

 

Datacenter: eu-central
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.160.12  26.4 GB    256     48.9%             89067222-b0eb-49e5-be7d-758ea24ace9a  1c
UN  172.31.144.12  28.92 GB   256     52.6%             2ab4786f-9722-4418-ba78-9c435cbb30e5  1b
UN  172.31.128.12  28.13 GB   256     47.9%             c4733a5c-abc5-4bab-9449-1e3f584cf64f  1a
UN  172.31.128.11  29.84 GB   256     52.2%             6083369c-1a0f-4098-a420-313dacd429b6  1a
UN  172.31.160.11  28.25 GB   256     51.1%             4dc361fc-818a-4b7f-abd3-9121488a7db1  1c
UN  172.31.144.11  28.14 GB   256     47.4%             05e5df92-d196-46d5-8812-e843fbbd2922  1b

Datacenter: eu-central_4vn
==========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.128.31  24.83 GB   3       45.8%             4d7decb3-8692-4aec-a2e1-2ac89aed8c5a  1a
UN  172.31.144.31  26.52 GB   4       45.8%             2eb29602-2df5-4f4f-b419-b5a94cf785f0  1b
UN  172.31.160.31  24.8 GB    4       45.8%             f1bd4696-c25c-4bc3-8c30-292f2bd027c1  1c
UJ  172.31.128.33  4.21 GB    4       ?                 ffa21d50-9bb4-4d2b-9e3e-7a6945f6f071  1a
UN  172.31.160.32  27.8 GB    4       54.2%             193bef27-eea8-4aa6-9d5f-8baf3decdd76  1c
UN  172.31.144.32  29.3 GB    3       54.2%             5ce019f6-99fd-4333-b231-d04a266229bb  1b
UN  172.31.128.32  30.5 GB    4       54.2

RE: Slender Cassandra Cluster Project

2018-01-22 Thread Kenneth Brotman

Thanks Anthony!  I’ve made a note to include that information in the 
documentation. You’re right.  It won’t work as intended unless that is 
configured properly.

 

I’m also favoring a couple other guidelines for Slender Cassandra:

1.   SSD’s only, no spinning disks

2.   At least two cores per node

 

For AWS, I’m favoring the c3.large on Linux.  It’s available in these regions: 
US-East, US-West and US-West2.  The specifications are listed as:

· Two (2) vCPU’s

· 3.7 GiB Memory

· Two (2) 16 GB SSD’s

· Moderate I/O

 

It’s going to be hard to beat the inexpensive cost of operating a Slender 
Cluster on demand in the cloud – and it fits a lot of the use cases well:  

 

· For under $100 a month, in current pricing for EC2 instances, you 
can operate an eighteen (18) node Slender Cluster for five (5) hours a day, ten 
(10) days a month.  That’s fine for demonstrations, teaching or experiments 
that last half a day or less. 

· For under $20, you can have that Slender Cluster up all day long, up 
to ten (10) hours, for whatever demonstrations or experiments you want it for.
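
The two cost figures above can be checked with simple arithmetic. The sketch below assumes an on-demand Linux rate of $0.105 per hour for a c3.large; that rate is an assumption and is not stated in this thread, so substitute the current price for your region. Storage and data-transfer charges are left out.

```python
# Rough cost check for an on-demand Slender Cluster.
# ASSUMPTION: HOURLY_RATE_USD is an illustrative c3.large on-demand price,
# not a figure taken from the thread; replace it with current pricing.
HOURLY_RATE_USD = 0.105
NODES = 18


def cluster_cost(nodes: int, hours_per_day: float, days: int, rate: float) -> float:
    """Return the total instance cost in USD (storage and traffic excluded)."""
    return nodes * hours_per_day * days * rate


# Eighteen nodes, five hours a day, ten days a month.
monthly = cluster_cost(NODES, hours_per_day=5, days=10, rate=HOURLY_RATE_USD)
# Eighteen nodes for a single ten-hour day.
single_day = cluster_cost(NODES, hours_per_day=10, days=1, rate=HOURLY_RATE_USD)

print(f"5 h/day, 10 days/month: ${monthly:.2f}")
print(f"one 10-hour day:        ${single_day:.2f}")
```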

 

As always, feedback is encouraged.

 

Thanks,

 

Kenneth Brotman

 

From: Anthony Grasso [mailto:anthony.gra...@gmail.com] 
Sent: Sunday, January 21, 2018 3:57 PM
To: user
Subject: Re: Slender Cassandra Cluster Project

 

Hi Kenneth,

 

Fantastic idea!

 

One thing that came to mind from my reading of the proposed setup was rack 
awareness of each node. Given that the proposed setup contains three DCs, I 
assume that each node will be made rack aware? If not, consider defining three 
racks for each DC and placing two nodes in each rack. This will ensure that all 
the nodes in a single rack contain at most one replica of the data.
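
The layout Anthony describes (three racks per DC, two nodes per rack) can be sketched as follows. The DC, rack and node names are hypothetical, and the sketch assumes the cluster uses GossipingPropertyFileSnitch, which reads each node's dc and rack from cassandra-rackdc.properties; the thread does not say which snitch the project will use.

```python
# Sketch of the suggested rack layout: 3 DCs x 3 racks x 2 nodes = 18 nodes.
# ASSUMPTION: the DC, rack and node names are hypothetical placeholders.
from itertools import product

DCS = ["dc1", "dc2", "dc3"]
RACKS = ["rack1", "rack2", "rack3"]
NODES_PER_RACK = 2


def build_layout() -> list:
    """Return a list of (node_name, dc, rack) tuples for all nodes."""
    layout = []
    for dc, rack in product(DCS, RACKS):
        for i in range(1, NODES_PER_RACK + 1):
            layout.append((f"{dc}-{rack}-node{i}", dc, rack))
    return layout


def rackdc_properties(dc: str, rack: str) -> str:
    """Render the dc/rack lines of a node's cassandra-rackdc.properties."""
    return f"dc={dc}\nrack={rack}\n"


layout = build_layout()
for name, dc, rack in layout:
    # One line per node: its name and the two properties it would carry.
    print(name, "->", rackdc_properties(dc, rack).split())
```

With RF=3 per DC and three racks per DC, NetworkTopologyStrategy can then place one replica in each rack.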

 

Regards,

Anthony

 

On 17 January 2018 at 11:24, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

Sure.  That takes the project from awesome to 10X awesome.  I absolutely would 
be willing to do that.  Thanks Kurt!

 

Regarding your comment on the keyspaces, I agree.  There should be a few simple 
examples one way or the other that can be duplicated and observed, and then an 
example to duplicate and play with that has a nice real world mix, with some 
keyspaces that replicate over only a subset of DC’s and some that replicate to 
all DC’s.

 

Kenneth Brotman 

 

From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Tuesday, January 16, 2018 1:31 PM
To: User
Subject: Re: Slender Cassandra Cluster Project

 

Sounds like a great idea. Probably would be valuable to add to the official 
docs as an example set up if you're willing.

 

Only thing I'd add is that you should have keyspaces that replicate over only a 
subset of DC's, plus one/some replicated to all DC's

 

On 17 Jan. 2018 03:26, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote:

I’ve begun working on a reference project intended to provide guidance on 
configuring and operating a modest Cassandra cluster of about 18 nodes suitable 
for the economic study, demonstration, experimentation and testing of a 
Cassandra cluster.

 

The slender cluster would be designed to be as inexpensive as possible while 
still using real world hardware in order to lower the cost to those with 
limited initial resources. Sorry no Raspberry Pi’s for this project.  

 

There would be an on-premises version and a cloud version.  Guidance would be 
provided on configuring the cluster, on demonstrating key Cassandra behaviors, 
on file sizes, capacity to use with the Slender Cassandra Cluster, and so on.

 

Why about eighteen nodes? I tried to figure out the minimum number of nodes 
needed for Cassandra to be Cassandra.  Here were my considerations:

 

• A user wouldn’t run Cassandra in just one data center; so at 
least two datacenters.

• A user probably would want a third data center available for 
analytics.

• There needs to be enough nodes for enough parallelism to observe 
Cassandra’s distributed nature.

• The cluster should have enough nodes that one gets a sense of the 
need for cluster wide management tools to do things like repairs, snapshots and 
cluster monitoring.

• The cluster should be able to demonstrate a RF=3 with local 
quorum.  If replicated in all three data centers, one write would impact half 
the 18 nodes, 3 datacenters X 3 nodes per data center = 9 nodes of 18 nodes.  If 
replicated in two of the data centers, one write would still impact one third 
of the 18 nodes, 2 DC’s X 3 nodes per DC = 6 of the 18 nodes.  

 

So eighteen seems like the minimum number of nodes needed.  That’s six nodes in 
each of three data centers.

 

Before I get too carried away with this project, I’m looking for some feedback 
on whether this project would indeed be helpful to others? Also, should the 
project be changed in any way?

 

It’s always a pleasure to connect wi

RE: Slender Cassandra Cluster Project

2018-01-16 Thread Kenneth Brotman

Sure.  That takes the project from awesome to 10X awesome.  I absolutely would 
be willing to do that.  Thanks Kurt!

 

Regarding your comment on the keyspaces, I agree.  There should be a few simple 
examples one way or the other that can be duplicated and observed, and then an 
example to duplicate and play with that has a nice real world mix, with some 
keyspaces that replicate over only a subset of DC’s and some that replicate to 
all DC’s.
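
A minimal sketch of such a mix, assuming NetworkTopologyStrategy and three data centers. The keyspace names (ks_all, ks_subset) and DC names (dc1, dc2, dc3) are hypothetical; the function only builds the CQL text and does not connect to a cluster.

```python
# Build CREATE KEYSPACE statements for keyspaces with different DC coverage.
# ASSUMPTION: keyspace and DC names are hypothetical placeholders.
def create_keyspace_cql(name: str, rf_by_dc: dict) -> str:
    """Return a CQL statement using NetworkTopologyStrategy."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in rf_by_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Replicated to all three DCs.
all_dcs = create_keyspace_cql("ks_all", {"dc1": 3, "dc2": 3, "dc3": 3})
# Replicated to a subset of the DCs only.
subset = create_keyspace_cql("ks_subset", {"dc1": 3, "dc2": 3})

print(all_dcs)
print(subset)
```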

 

Kenneth Brotman 

 

From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Tuesday, January 16, 2018 1:31 PM
To: User
Subject: Re: Slender Cassandra Cluster Project

 

Sounds like a great idea. Probably would be valuable to add to the official 
docs as an example set up if you're willing.

 

Only thing I'd add is that you should have keyspaces that replicate over only a 
subset of DC's, plus one/some replicated to all DC's

 

On 17 Jan. 2018 03:26, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid> wrote:

I’ve begun working on a reference project intended to provide guidance on 
configuring and operating a modest Cassandra cluster of about 18 nodes suitable 
for the economic study, demonstration, experimentation and testing of a 
Cassandra cluster.

 

The slender cluster would be designed to be as inexpensive as possible while 
still using real world hardware in order to lower the cost to those with 
limited initial resources. Sorry no Raspberry Pi’s for this project.  

 

There would be an on-premises version and a cloud version.  Guidance would be 
provided on configuring the cluster, on demonstrating key Cassandra behaviors, 
on file sizes, capacity to use with the Slender Cassandra Cluster, and so on.

 

Why about eighteen nodes? I tried to figure out the minimum number of nodes 
needed for Cassandra to be Cassandra.  Here were my considerations:

 

• A user wouldn’t run Cassandra in just one data center; so at 
least two datacenters.

• A user probably would want a third data center available for 
analytics.

• There needs to be enough nodes for enough parallelism to observe 
Cassandra’s distributed nature.

• The cluster should have enough nodes that one gets a sense of the 
need for cluster wide management tools to do things like repairs, snapshots and 
cluster monitoring.

• The cluster should be able to demonstrate a RF=3 with local 
quorum.  If replicated in all three data centers, one write would impact half 
the 18 nodes, 3 datacenters X 3 nodes per data center = 9 nodes of 18 nodes.  
If replicated in two of the data centers, one write would still impact one 
third of the 18 nodes, 2 DC’s X 3 nodes per DC = 6 of the 18 nodes.  

 

So eighteen seems like the minimum number of nodes needed.  That’s six nodes in 
each of three data centers.

 

Before I get too carried away with this project, I’m looking for some feedback 
on whether this project would indeed be helpful to others? Also, should the 
project be changed in any way?

 

It’s always a pleasure to connect with the Cassandra users’ community.  Thanks 
for all the hard work, the expertise, the civil dialog.

 

Kenneth Brotman

Slender Cassandra Cluster Project

2018-01-16 Thread Kenneth Brotman

I've begun working on a reference project intended to provide guidance on
configuring and operating a modest Cassandra cluster of about 18 nodes
suitable for the economic study, demonstration, experimentation and testing
of a Cassandra cluster.

 

The slender cluster would be designed to be as inexpensive as possible while
still using real world hardware in order to lower the cost to those with
limited initial resources. Sorry no Raspberry Pi's for this project.  

 

There would be an on-premises version and a cloud version.  Guidance would
be provided on configuring the cluster, on demonstrating key Cassandra
behaviors, on file sizes, capacity to use with the Slender Cassandra
Cluster, and so on.

 

Why about eighteen nodes? I tried to figure out the minimum number of
nodes needed for Cassandra to be Cassandra.  Here were my considerations:

 

• A user wouldn't run Cassandra in just one data center; so at
least two data centers.

• A user probably would want a third data center available for
analytics.

• There need to be enough nodes for enough parallelism to
observe Cassandra's distributed nature.

• The cluster should have enough nodes that one gets a sense of
the need for cluster wide management tools to do things like repairs,
snapshots and cluster monitoring.

• The cluster should be able to demonstrate an RF=3 with local
quorum.  If replicated in all three data centers, one write would impact
half the 18 nodes, 3 datacenters X 3 nodes per data center = 9 nodes of 18
nodes.  If replicated in two of the data centers, one write would still
impact one third of the 18 nodes, 2 DC's X 3 nodes per DC = 6 of the 18
nodes.

 

So eighteen seems like the minimum number of nodes needed.  That's six nodes
in each of three data centers.
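
The replica arithmetic behind that conclusion can be written out as a short sketch. The function names are illustrative only; the quorum size uses the usual floor(RF / 2) + 1 rule.

```python
# Worked arithmetic for the RF=3 consideration above.
from fractions import Fraction

TOTAL_NODES = 18   # 3 data centers x 6 nodes each
RF_PER_DC = 3


def replicas_per_write(dcs_replicated: int, rf: int = RF_PER_DC) -> int:
    """Number of nodes that end up storing a single write."""
    return dcs_replicated * rf


def local_quorum(rf: int = RF_PER_DC) -> int:
    """Replicas in the local DC that must acknowledge at LOCAL_QUORUM."""
    return rf // 2 + 1


for dcs in (3, 2):
    touched = replicas_per_write(dcs)
    share = Fraction(touched, TOTAL_NODES)
    print(f"{dcs} DCs: {touched} of {TOTAL_NODES} nodes ({share}), "
          f"LOCAL_QUORUM waits for {local_quorum()} local replicas")
```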

 

Before I get too carried away with this project, I'm looking for some
feedback on whether this project would indeed be helpful to others? Also,
should the project be changed in any way?

 

It's always a pleasure to connect with the Cassandra users' community.
Thanks for all the hard work, the expertise, the civil dialog.

 

Kenneth Brotman
