Hi,

Embedded Cassandra as a way to speed up getting started with the project may well work for developers; we used it for JUnit tests. But with a simple clone and Maven build, I guess you will end up with a single-node Cassandra cluster. Remember that Cassandra is a distributed database: you need more than one node to get performance and fault tolerance. I also would not recommend adding and removing cluster nodes at high frequency as applications start and stop.
To help people get up and running, provide a small README for downloading and starting Cassandra. On Mac and Linux, unpacking the tar.gz and running the cassandra start script (bin/cassandra) is not too complicated. Or point to the DataStax Community Edition installers. Apart from installing Java, that is a five-minute step to a single-node "TestCluster". Configuring a distributed setup is somewhat, or a lot, more difficult and definitely needs more understanding and planning.

Just as a hint, and off-topic: I have seen people use Cassandra as application glue for interprocess communication, where every app server started a node (for communication, sessions, as a queue, and so on). If that is eventually a use case, have a look at Hazelcast.

Jan

Sent from my iPhone

> On 14.02.2016 at 23:26, John Sanda <john.sa...@gmail.com> wrote:
>
> The motivation was to make it easy for someone to get up and running quickly
> with the project. Clone the git repo, run the maven build, and then you are
> all set. It definitely does lower the learning curve for someone just getting
> started with a project and who is not really thinking about Cassandra. It
> also is convenient for non-devs who need to quickly get the project up and
> running. For development, we have people working on Linux, Mac OS X, and
> Windows. I am not a Windows user and not even sure if ccm works on Windows,
> so ccm can't be the de facto standard for development.
>
>> On Sun, Feb 14, 2016 at 2:52 PM, Jack Krupansky <jack.krupan...@gmail.com>
>> wrote:
>> What motivated the use of an embedded instance for development - as opposed
>> to simply spawning a process for Cassandra?
>>
>> -- Jack Krupansky
>>
>>> On Sun, Feb 14, 2016 at 2:05 PM, John Sanda <john.sa...@gmail.com> wrote:
>>> The project I work on day to day uses an embedded instance of Cassandra,
>>> but it is intended primarily for development. We embed Cassandra in a
>>> WildFly (i.e., JBoss) server. It is packaged and deployed as an EAR. I
>>> personally do not do this. I use and recommend ccm for development. If you
>>> do use WildFly, there is also wildfly-cassandra which deploys Cassandra as
>>> a custom WildFly extension. In other words it is deployed in WildFly like
>>> other subsystems like EJB, web, etc, not like an application. There isn't a
>>> whole lot of active development on this, but it could be another option.
>>>
>>> For production, we have to support single node clusters (not embedded
>>> though), and it has been challenging for pretty much all the reasons you
>>> find people saying not to do so.
>>>
>>> As for failure detection and cluster membership changes, are you using the
>>> DataStax driver? You can register an event listener with the driver to
>>> receive notifications for those things.
>>>
>>>> On Sat, Feb 13, 2016 at 6:33 PM, Jonathan Haddad <j...@jonhaddad.com>
>>>> wrote:
>>>> +1 to what Jack said. Don't mess with embedded till you understand the
>>>> basics of the db. You're not making your system any less complex, I'd say
>>>> you're most likely going to shoot yourself in the foot.
>>>>
>>>>> On Sat, Feb 13, 2016 at 2:22 PM Jack Krupansky <jack.krupan...@gmail.com>
>>>>> wrote:
>>>>> HA requires an odd number of replicas - 3, 5, 7 - so that split-brain can
>>>>> be avoided. Two nodes would not support HA. You need to be able to reach
>>>>> a quorum, which is defined as n/2+1 where n is the number of replicas.
>>>>> IOW, you cannot update the data if a quorum cannot be reached. The data
>>>>> on any given node needs to be replicated on at least two other nodes.
>>>>>
>>>>> Embedded Cassandra is only for extremely sophisticated developers - not
>>>>> those who are new to Cassandra, with a "superficial understanding".
>>>>>
>>>>> As a general proposition, you should not be running application code on
>>>>> Cassandra nodes.
>>>>>
>>>>> That said, if any of the senior Cassandra developers wish to personally
>>>>> support your efforts towards embedded clusters, they are certainly free
>>>>> to do so. We'll see if any of them step forward.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>>> On Sat, Feb 13, 2016 at 3:47 PM, Binil Thomas
>>>>>> <binil.thomas.pub...@gmail.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> TL;DR: I have a very superficial understanding of Cassandra and am
>>>>>> currently evaluating it for a project.
>>>>>>
>>>>>> * Can Cassandra be embedded into another JVM application?
>>>>>> * Can such embedded instances form a cluster?
>>>>>> * Can the application use the failure detection and cluster
>>>>>>   membership dissemination infrastructure of embedded Cassandra?
>>>>>>
>>>>>> ----
>>>>>>
>>>>>> I am in the process of re-packaging a SaaS system written in Java to be
>>>>>> deployed on-premise by customers. The SaaS system currently uses AWS
>>>>>> DynamoDB. The data storage needs for this application are modest, but I
>>>>>> would like to keep the deployment complexity to a minimum. Here are
>>>>>> three different usecases the on-premise system should support:
>>>>>>
>>>>>> 1. single-node deployments with minimal complexity
>>>>>> 2. two-node HA deployments; the data and processing needs dictated by
>>>>>>    the load on the system are well under what a single node can do, but
>>>>>>    the second node is there to satisfy the HA requirement as a hot
>>>>>>    standby
>>>>>> 3. a multi-node clustered deployment, where higher operational
>>>>>>    complexity is justified
>>>>>>
>>>>>> I am considering Cassandra for these usecases.
>>>>>>
>>>>>> For usecase #1, I hope to embed Cassandra into the same JVM as my
>>>>>> application. I read on the web that CassandraDaemon can be used this
>>>>>> way. Is that accurate? What other applications embed Cassandra this way?
>>>>>> I *think* JetBrains Upsource does, but do you know other ones?
>>>>>> (Incidentally, my Java application embeds Jetty webserver also).
>>>>>>
>>>>>> For usecase #2, I am hoping that I can deploy two instances of this
>>>>>> ensemble and have the embedded Cassandra instances form a cluster. If I
>>>>>> configure every write to be replicated on both nodes synchronously, then
>>>>>> it will satisfy the HA needs of this usecase. Is it feasible to form
>>>>>> clusters of embedded Cassandra instances?
>>>>>>
>>>>>> For usecase #3, I can form a large cluster of the ensemble where all
>>>>>> writes are replicated synchronously to a quorum of nodes.
>>>>>>
>>>>>> Finally, in usecase #2 and #3, I'd like to use the failure detection and
>>>>>> cluster membership dissemination infrastructure of Cassandra from within
>>>>>> my application. Is it possible to be notified of membership changes when
>>>>>> embedding Cassandra? I could use a separate library to do this (say,
>>>>>> with JGroups or Akka) but I fear that if this library and the embedded
>>>>>> Cassandra instances disagree, it could lead to subtle bugs.
>>>>>>
>>>>>> Thanks,
>>>>>> Binil
>>>>>>
>>>>>> PS: Cross-posted at
>>>>>> http://stackoverflow.com/questions/35384983/forming-a-cluster-of-embedded-cassandra-instances
>>>
>>> --
>>>
>>> - John
>
> --
>
> - John
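
To make the quorum arithmetic from Jack's message concrete (quorum = n/2+1, where n is the number of replicas), here is a minimal Java sketch. It assumes the division is integer division, and the class and method names are hypothetical, made up for illustration; they are not part of Cassandra or any driver. It only shows why a two-replica setup cannot lose a node and still reach a quorum, while three replicas can lose one.

```java
// Hypothetical helper for illustration only; not a Cassandra or driver API.
class QuorumSketch {

    // Quorum as described in the thread: n/2 + 1, using integer division.
    static int quorum(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1;
    }

    // Number of replicas that can be unreachable while a quorum is still possible.
    static int tolerableFailures(int replicas) {
        return replicas - quorum(replicas);
    }

    public static void main(String[] args) {
        // Print the quorum size and failure tolerance for a few replica counts.
        for (int n : new int[] {1, 2, 3, 5, 7}) {
            System.out.printf("replicas=%d quorum=%d tolerable failures=%d%n",
                    n, quorum(n), tolerableFailures(n));
        }
    }
}
```

With two replicas the quorum is two, so losing either node blocks quorum reads and writes; with three replicas the quorum is still two, so one node may be down.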