For UI and interactive data exploration there is already the Cassandra
interpreter for Apache Zeppelin, which is more than decent for the job.

On Wed, Feb 21, 2018 at 9:19 AM, Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:

> But what does this video really show? That Microsoft managed to run
> Cassandra as a SaaS product with nice UI?
> Google did that years ago with BigTable and Amazon with DynamoDB.
>
> I agree that we need more tools, but not so much for querying (although
> that would also help a bit), but just in general the project feels
> unapproachable right now.
> Besides the excellent DataStax documentation there is little best practice
> knowledge about how to operate and provision Cassandra clusters.
> Having some recipes for Chef, Puppet or Ansible that show the most common
> settings (or some Cloudfoundry/GCP Templates or Helm Charts) would be
> really useful.
> Also a list of all the projects that Cassandra goes well with (like TLP
> Reaper and Netflix's Priam, etc.)
>
> greetings Daniel
>
> On Wed, 21 Feb 2018 at 07:23 Kenneth Brotman <kenbrot...@yahoo.com.invalid>
> wrote:
>
>> If you watch this video through you'll see why usability is so
>> important.  You can't ignore usability issues.
>>
>> Cassandra does not exist in a vacuum.  The competitors are world class.
>>
>> The video is on the New Cassandra API for Azure Cosmos DB:
>> https://www.youtube.com/watch?v=1Sf4McGN1AQ
>>
>> Kenneth Brotman
>>
>> -----Original Message-----
>> From: Daniel Hölbling-Inzko [mailto:daniel.hoelbling-in...@bitmovin.com]
>> Sent: Tuesday, February 20, 2018 1:28 AM
>> To: user@cassandra.apache.org; James Briggs
>> Cc: d...@cassandra.apache.org
>> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>>
>> Hi,
>>
>> I have to add my own two cents here as the main thing that keeps me from
>> really running Cassandra is the amount of pain running it incurs.
>> Not so much because it's actually painful but because the tools are so
>> different and the documentation and best practices are scattered across a
>> dozen outdated DataStax articles and this mailing list, etc. We've been
>> hesitant (although our use case is perfect for using Cassandra) to deploy
>> Cassandra to any critical systems as even after a year of running it we
>> still don't have the operational experience to confidently run critical
>> systems with it.
>>
>> Simple things like a foolproof / safe cluster-wide S3 Backup (like
>> Elasticsearch has it) would for example solve a TON of issues for new
>> people. I don't need it auto-scheduled or something, but having to
>> configure cron jobs across the whole cluster is a pain in the ass for small
>> teams.
>> To be honest, even the way snapshots are done right now is already super
>> painful. Every other system I operated so far will just create one backup
>> folder I can export; in C* the backup is scattered across a bunch of
>> different keyspace folders, etc. Needless to say, it took a while until
>> I trusted my backup scripts fully.
>>
>> And especially for a Database I believe Backup/Restore needs to be a
>> non-issue that's documented front and center. If not, smaller teams just
>> don't have the resources to dedicate to learning and building the tools
>> around it.
>>
>> Now that the team is getting larger we could spare the resources to
>> operate these things, but switching from a well-understood RDBMS schema to
>> Cassandra is now incredibly hard and will probably take years.
>>
>> greetings Daniel
>>
>> On Tue, 20 Feb 2018 at 05:56 James Briggs <james.bri...@yahoo.com.invalid>
>> wrote:
>>
>> > Kenneth:
>> >
>> > What you said is not wrong.
>> >
>> > Vertica and Riak are examples of distributed databases that don't
>> > require hand-holding.
>> >
>> > Cassandra is for Java-programmer DIYers, or more often Datastax
>> > clients, at this point.
>> > Thanks, James.
>> >
>> > ------------------------------
>> > *From:* Kenneth Brotman <kenbrot...@yahoo.com.INVALID>
>> > *To:* user@cassandra.apache.org
>> > *Cc:* d...@cassandra.apache.org
>> > *Sent:* Monday, February 19, 2018 4:56 PM
>> >
>> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>> >
>> > Jeff, you helped me figure out what I was missing.  It just took me a
>> > day to digest what you wrote.  I’m coming over from another type of
>> > engineering.  I didn’t know and it’s not really documented.  Cassandra
>> > runs in a data center.  Nowadays that means the nodes are going to be
>> > in managed containers, Docker containers, managed by Kubernetes,
>> > Mesos or something, and for that reason anyone operating Cassandra in a
>> > real world setting would not encounter the issues I raised in the way I
>> > described.
>> >
>> > Shouldn’t the architectural diagrams people reference indicate that in
>> > some way?  That would have helped me.
>> >
>> > Kenneth Brotman
>> >
>> > *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com]
>> > *Sent:* Monday, February 19, 2018 10:43 AM
>> > *To:* 'user@cassandra.apache.org'
>> > *Cc:* 'd...@cassandra.apache.org'
>> > *Subject:* RE: Cassandra Needs to Grow Up by Version Five!
>> >
>> > Well said.  Very fair.  I wouldn’t mind hearing from others still.
>> > You’re a good guy!
>> >
>> > Kenneth Brotman
>> >
>> > *From:* Jeff Jirsa [mailto:jji...@gmail.com]
>> > *Sent:* Monday, February 19, 2018 9:10 AM
>> > *To:* cassandra
>> > *Cc:* Cassandra DEV
>> > *Subject:* Re: Cassandra Needs to Grow Up by Version Five!
>> >
>> > There's a lot of things below I disagree with, but it's ok. I
>> > convinced myself not to nit-pick every point.
>> >
>> > https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of
>> > Stefan's work with cert management
>> >
>> > Beyond that, I encourage you to do what Michael suggested: open JIRAs
>> > for things you care strongly about, work on them if you have time.
>> > Sometime this year we'll schedule a NGCC (Next Generation Cassandra
>> > Conference) where we talk about future project work and direction, I
>> > encourage you to attend if you're able (I encourage anyone who cares
>> > about the direction of Cassandra to attend, it'll probably be either
>> > free or very low cost, just to cover a venue and some food). If
>> > nothing else, you'll meet some of the teams who are working on the
>> > project, and learn why they've selected the projects on which they're
>> > working. You'll have an opportunity to pitch your vision, and maybe you
>> > can talk some folks into helping out.
>> >
>> > - Jeff
>> >
>> >
>> >
>> >
>> > On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
>> > kenbrot...@yahoo.com.invalid> wrote:
>> > Comments inline
>> >
>> > >-----Original Message-----
>> > >From: Jeff Jirsa [mailto:jji...@gmail.com]
>> > >Sent: Sunday, February 18, 2018 10:58 PM
>> > >To: user@cassandra.apache.org
>> > >Cc: d...@cassandra.apache.org
>> > >Subject: Re: Cassandra Needs to Grow Up by Version Five!
>> > >
>> > >Comments inline
>> > >
>> > >
>> > >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman
>> > >> <kenbrot...@yahoo.com.INVALID> wrote:
>> > >>
>> > >> Cassandra feels like an unfinished program to me. The problem is
>> > >> not that it’s open source or cutting edge.  It’s an open source cutting
>> > >> edge program that lacks some of its basic functionality.  We are all
>> > >> stuck addressing fundamental mechanical tasks for Cassandra because
>> > >> the basic code that would do that part has not been contributed yet.
>> > >>
>> > >There’s probably 2-3 reasons why here:
>> > >
>> > >1) Historically the PMC has tried to keep the scope of the project
>> > >very narrow. It’s a database. We don’t ship drivers. We don’t ship
>> > >developer tools. We don’t ship fancy UIs. We ship a database. I think
>> > >for the most part the narrow vision has been for the best, but maybe
>> > >it’s time to reconsider some of the scope.
>> > >
>> > >Postgres will autovacuum to prevent wraparound (hopefully), but
>> > >everyone I know running Postgres uses flexible-freeze in cron -
>> > >sometimes it’s ok to let the database have its opinions and let third
>> > >party tools fill in the gaps.
>> > >
>> >
>> > I can appreciate the desire to stay in scope.  I believe usability is
>> > the King.  When users have to learn the database, then learn what they
>> > have to automate, then learn an automation tool and then use the
>> > automation tool to do something that is as fundamental as the
>> > fundamental tasks I described, then something is missing from the
>> > database itself that is adversely affecting usability - and that is
>> > very bad.  Where those big companies need to calculate the ROI is in
>> > the cost of acquiring or training the next group of users.  Consider
>> > how steep the learning curve is for new users.
>> > Consider the business case for improving ease of use.
>> >
>> > >2) Cassandra is, by definition, a database for large scale problems.
>> > >Most of the companies working on/with it tend to be big companies. Big
>> > >companies often have pre-existing automation that solved the stuff you
>> > >consider fundamental tasks, so there’s probably nobody actively
>> > >working on the solved problems that you may consider missing features
>> > >- for many people they’re already solved.
>> > >
>> >
>> > I could be wrong but it sounds like a lot of the code work is done,
>> > and if the companies would take the time to contribute more code, then
>> > the rest of the code needed could be generated easily.
>> >
>> > >3) It’s not nearly as basic as you think it is. DataStax seemingly
>> > >had a multi-person team on OpsCenter, and while it was better than
>> > >anything else around last time I used it (before it stopped supporting
>> > >the OSS version), it left a lot to be desired. It’s probably 2-3
>> > >engineers working for a month to have any sort of meaningful, reliable,
>> > >mostly trivial cluster-managing UI, and I can think of about 10 JIRAs
>> > >I’d rather see that time be spent on first.
>> >
>> > How about 6-9 engineers working 12 months a year on it then.  I'm not
>> > kidding.  For a big company with revenues in the tens of billions or
>> > more, and a heavy use of Cassandra nodes, it's easy to make a case for
>> > having a full time person or more that involved.  They aren't paying
>> > for using the open source code that is Cassandra.  Let's see, what
>> > would the licensing fees be for a big company if the costs were like
>> > what Microsoft or Oracle would charge for their enterprise level
>> > relational database?  What's the contribution of one or two people
>> > in comparison?
>> >
>> > >> Ease of use issues need to be given much more attention.  For an
>> > >> administrator, the ease of use of Cassandra is very poor.
>> > >>
>> > >> Furthermore, currently Cassandra is an idiot.  We have to do
>> > >> everything for Cassandra. Contrast that with the fact that we are in
>> > >> the dawn of artificial intelligence.
>> > >>
>> > >
>> > >And for everything you think is obvious, there’s a 50% chance someone
>> > >else will have already solved it differently, and your obvious new
>> > >solution will be seen as an inconvenient assumption and complexity
>> > >they won’t appreciate. Open source projects get to walk a fine line of
>> > >trying to be useful without making too many assumptions, being “too”
>> > >opinionated, or overstepping bounds. We may be too conservative, but
>> > >it’s very easy to go too far in the opposite direction.
>> > >
>> >
>> > I appreciate that but when such concerns result in inaction instead of
>> > resolution that is no good.
>> >
>> > >> Software exists to automate tasks for humans, not mechanize humans
>> > >> to administer tasks for a database.  I’m an engineering type.  My job
>> > >> is to apply science and technology to solve real world problems.  And
>> > >> that’s where I need an organization’s I.T. talent to focus; not in
>> > >> crank starting an unfinished database.
>> > >>
>> > >
>> > >And that’s why nobody’s done it - we all have bigger problems we’re
>> > >being paid to solve, and nobody’s felt it necessary. Because it’s not
>> > >necessary, it’s nice, but not required.
>> > >
>> >
>> > Of course you would say that, you're Jeff Jirsa.  In apprenticeship
>> > speak, you’re a master.  It's the classic challenge of trying to get
>> > a master to see the legitimate issues of the apprentices.  I do
>> > appreciate the time you give to answer posts to the groups, like this
>> > post.  So I don't want you to take anything the wrong way.  Where it's
>> > going to bite everyone is in the future adoption rate.  It has to be
>> > addressed.
>> >
>> > [snip]
>> >
>> > >> Certificate management should be automated.
>> > >>
>> > >Stefan (in particular) has done a fair amount of work on this, but
>> > >I’d bet 90% of users don’t use SSL and genuinely don’t care.
>> > >
>> >
>> > I didn't realize.  Could I trouble you for a link so I could get up to
>> > speed?
>> >
>> > >> Cluster wide management should be a big theme in any next major
>> > >> release.
>> > >>
>> > >Na. Stability and testing should be a big theme in the next major
>> > >release.
>> > >
>> >
>> > Double Na on that one Jeff.  I think you have a concern there about
>> > the need to test sufficiently to ensure the stability of the next
>> > major release.  That makes perfect sense - for every release,
>> > especially the major ones.  Continuous improvement is not a phase of
>> > development for example.  CI should be in everything, in every phase.
>> > Stability and testing are a part of every release, not just one.  A major
>> > release should be a nice step from the previous major release though.
>> >
>> > >> What is a major release?  How many major releases could a program
>> > >> have before all the coding for basic stuff like installation,
>> > >> configuration and maintenance is included!
>> > >>
>> > >> Finish the basic coding of Cassandra, make it easy to use for
>> > >> administrators, make it smart, add cluster wide management.  Keep
>> > >> Cassandra competitive or it will soon be the old Model T we all
>> > >> remember fondly.
>> > >>
>> > >
>> > >Let’s keep some perspective. Most of us came to Cassandra from RDBMS
>> > >worlds where we were building solutions out of a bunch of master/slave
>> > >MySQL / Postgres type databases. I started using Cassandra 0.6 when I
>> > >needed to store something like 400gb/day in 200whatever on spinning
>> > >disks when 100gb felt like a “big” database, and the thought of
>> > >writing runbooks and automation to automatically pick the most up to
>> > >date slave as the new master, promote it, repoint the other slave to
>> > >the new master, then reformat the old master and add it as a new slave
>> > >without downtime and without potentially deleting the company’s whole
>> > >dataset sounded awful.
>> > >Cassandra solved that problem, at the cost of maintaining a few yaml
>> > >(then xml) files. Yes there are rough edges - they get slightly less
>> > >rough on each new release. Can we do better? Sure, use your engineering
>> > >time and send some patches. But the basic stuff is the nuts and bolts
>> > >of the database: I care way more about streaming and compaction than
>> > >I’ll ever care about installation.
>> > >
>> >
>> > I can relate.  I was studying the enterprise level MS SQL Server
>> > stuff. I noticed exactly what you described.  I decided maybe I'll
>> > just do other stuff and wait for things to develop more.  I'm very
>> > excited about the way Cassandra addresses things.  Streaming and
>> > compaction - very good.  I'm glad.  Items related to usability are not
>> > optional though.
>> >
>> > >> I ask the Committee to compile a list of all such items, make a
>> > >> plan, and commit to including the completed and tested code as part
>> > >> of major release 5.0.  I further ask that release 4.0 not be delayed
>> > >> and then there be an unusually short skip to version 5.0.
>> > >>
>> > >
>> > >The committers are working their ass off on all sorts of hard problems.
>> > >Some of those are probably even related to Cassandra. If you have an
>> > >idea, open a JIRA. If you have time, send a patch. Or review a patch.
>> > >But don’t expect a bunch of people to set down work on optimizing the
>> > >database to work on packaging and installation, because there’s no ROI
>> > >in it for 99% of the existing committers: we’re working on the
>> > >database to solve problems, and installation isn’t one of those
>> > >problems.
>> >
>> > I'm sure they are working very hard on all kinds of hard problems.  I
>> > actually wrote "Committee", not "committers".  There is an obvious
>> > shortage of contributors when you consider the size of the
>> > organizations using Cassandra.  That leaves the burden on an unfair
>> > few.  Installation or more generally I would say usability is not that
>> > big a problem for the big companies out there. Good for them.
>> >
>> > Ask a new organization or a modest size organization that is
>> > struggling to manage their Cassandra cluster whether usability is not a
>> > big problem. It truly is a big problem for many stakeholders of
>> > Cassandra. It needs to be given a bigger priority.  Hopefully others
>> > will weigh in.
>> >
>> > Kenneth Brotman
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> > For additional commands, e-mail: user-h...@cassandra.apache.org
>> >
>> >
>> >
>> >
