But again, you could also simply spawn a process running Cassandra as-is, in its intended form, which would eliminate the potential for conflict between the app heap and Cassandra's JVM heap.
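
As a rough illustration of the "spawn a process" option, here is a minimal Java sketch. It assumes a stock tarball install on Linux or Mac, where the launch script is bin/cassandra and -f keeps it in the foreground. The class name CassandraLauncher, the method foregroundCommand, and the default path /opt/cassandra are hypothetical names chosen for this example, not anything from the project discussed here. On Windows the script name differs, so treat this as a starting point only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch: run a stock Cassandra install as a child process instead of embedding it. */
final class CassandraLauncher {

    /** Builds (but does not start) the command for a foreground Cassandra process. */
    static ProcessBuilder foregroundCommand(Path cassandraHome) {
        // Assumption: tarball layout, where bin/cassandra is the launch script.
        String script = cassandraHome.resolve("bin").resolve("cassandra").toString();
        // -f keeps Cassandra in the foreground so the parent can wait on it.
        return new ProcessBuilder(script, "-f")
                .directory(cassandraHome.toFile())
                .inheritIO(); // forward Cassandra's output to this console
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical install location; pass the real one as the first argument.
        Path home = Paths.get(args.length > 0 ? args[0] : "/opt/cassandra");
        Process cassandra = foregroundCommand(home).start();
        // Stop the child process when the application JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(cassandra::destroy));
        System.exit(cassandra.waitFor());
    }
}
```

Because Cassandra runs in its own JVM here, its heap settings stay in its own configuration and do not compete with the application's heap.
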
-- Jack Krupansky

On Mon, Feb 15, 2016 at 12:56 AM, Jan Kesten <j.kes...@enercast.de> wrote:

> Hi,
>
> Using embedded Cassandra to speed up getting started with the project
> may well work for developers; we used it for JUnit. But with a simple
> clone and Maven build, I guess it will end in a single-node Cassandra
> cluster. Remember that Cassandra is a distributed database: one will
> need more than one node to get performance and fault tolerance. Also, I
> would not recommend adding and removing cluster nodes at high frequency
> with application start-stop cycles.
>
> To help in getting things up and running, provide a small README for
> downloading and starting Cassandra. For Mac and Linux, unpacking the
> tar.gz and running bin/cassandra is not too complicated. Or point to the
> DataStax Community Edition installers. Apart from installing Java, that
> is a five-minute step to a single-node "TestCluster".
>
> Configuring a distributed setup is a bit more or a lot more difficult
> and definitely needs more understanding and planning.
>
> Just as a hint, and off-topic: I have seen people using Cassandra as
> application glue for interprocess communication, where every app server
> started a node (for communication, sessions, as a queue, and so on). If
> that might be your use case, have a look at Hazelcast.
>
> Jan
>
> Sent from my iPhone
>
> On 14.02.2016 at 23:26, John Sanda <john.sa...@gmail.com> wrote:
>
> The motivation was to make it easy for someone to get up and running
> quickly with the project. Clone the git repo, run the Maven build, and
> then you are all set. It definitely does lower the learning curve for
> someone who is just getting started with a project and is not really
> thinking about Cassandra. It is also convenient for non-devs who need to
> quickly get the project up and running. For development, we have people
> working on Linux, Mac OS X, and Windows.
> I am not a Windows user and am not even sure whether ccm works on
> Windows, so ccm can't be the de facto standard for development.
>
> On Sun, Feb 14, 2016 at 2:52 PM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
>> What motivated the use of an embedded instance for development, as
>> opposed to simply spawning a process for Cassandra?
>>
>> -- Jack Krupansky
>>
>> On Sun, Feb 14, 2016 at 2:05 PM, John Sanda <john.sa...@gmail.com> wrote:
>>
>>> The project I work on day to day uses an embedded instance of Cassandra,
>>> but it is intended primarily for development. We embed Cassandra in a
>>> WildFly (i.e., JBoss) server. It is packaged and deployed as an EAR. I
>>> personally do not do this. I use and recommend ccm
>>> <https://github.com/pcmanus/ccm> for development. If you do use
>>> WildFly, there is also wildfly-cassandra
>>> <https://github.com/hawkular/wildfly-cassandra>, which deploys Cassandra
>>> as a custom WildFly extension. In other words, it is deployed in WildFly
>>> like other subsystems such as EJB, web, etc., not like an application.
>>> There isn't a whole lot of active development on this, but it could be
>>> another option.
>>>
>>> For production, we have to support single-node clusters (not embedded,
>>> though), and it has been challenging for pretty much all the reasons
>>> people give for not doing so.
>>>
>>> As for failure detection and cluster membership changes, are you using
>>> the DataStax driver? You can register an event listener with the driver
>>> to receive notifications for those things.
>>>
>>> On Sat, Feb 13, 2016 at 6:33 PM, Jonathan Haddad <j...@jonhaddad.com>
>>> wrote:
>>>
>>>> +1 to what Jack said. Don't mess with embedded till you understand the
>>>> basics of the db. You're not making your system any less complex; I'd
>>>> say you're most likely going to shoot yourself in the foot.
>>>>
>>>> On Sat, Feb 13, 2016 at 2:22 PM Jack Krupansky
>>>> <jack.krupan...@gmail.com> wrote:
>>>>
>>>>> HA requires an odd number of replicas - 3, 5, 7 - so that split-brain
>>>>> can be avoided. Two nodes would not support HA. You need to be able to
>>>>> reach a quorum, which is defined as n/2+1 (with n/2 rounded down),
>>>>> where n is the number of replicas. IOW, you cannot update the data if
>>>>> a quorum cannot be reached. The data on any given node needs to be
>>>>> replicated on at least two other nodes.
>>>>>
>>>>> Embedded Cassandra is only for extremely sophisticated developers,
>>>>> not those who are new to Cassandra, with a "superficial understanding".
>>>>>
>>>>> As a general proposition, you should not be running application code
>>>>> on Cassandra nodes.
>>>>>
>>>>> That said, if any of the senior Cassandra developers wish to
>>>>> personally support your efforts towards embedded clusters, they are
>>>>> certainly free to do so. We'll see if any of them step forward.
>>>>>
>>>>> -- Jack Krupansky
>>>>>
>>>>> On Sat, Feb 13, 2016 at 3:47 PM, Binil Thomas
>>>>> <binil.thomas.pub...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> TL;DR: I have a very superficial understanding of Cassandra and am
>>>>>> currently evaluating it for a project.
>>>>>>
>>>>>> * Can Cassandra be embedded into another JVM application?
>>>>>> * Can such embedded instances form a cluster?
>>>>>> * Can the application use the failure detection and cluster
>>>>>>   membership dissemination infrastructure of embedded Cassandra?
>>>>>>
>>>>>> ----
>>>>>>
>>>>>> I am in the process of re-packaging a SaaS system written in Java to
>>>>>> be deployed on-premise by customers. The SaaS system currently uses
>>>>>> AWS DynamoDB. The data storage needs for this application are modest,
>>>>>> but I would like to keep the deployment complexity to a minimum. Here
>>>>>> are three different use cases the on-premise system should support:
>>>>>>
>>>>>> 1. single-node deployments with minimal complexity
>>>>>> 2. two-node HA deployments; the data and processing needs dictated by
>>>>>>    the load on the system are well under what a single node can do,
>>>>>>    but the second node is there to satisfy the HA requirement as a
>>>>>>    hot standby
>>>>>> 3. a multi-node clustered deployment, where higher operational
>>>>>>    complexity is justified
>>>>>>
>>>>>> I am considering Cassandra for these use cases.
>>>>>>
>>>>>> For use case #1, I hope to embed Cassandra into the same JVM as my
>>>>>> application. I read on the web that CassandraDaemon can be used this
>>>>>> way. Is that accurate? What other applications embed Cassandra this
>>>>>> way? I *think* JetBrains Upsource does, but do you know other ones?
>>>>>> (Incidentally, my Java application also embeds the Jetty web server.)
>>>>>>
>>>>>> For use case #2, I am hoping that I can deploy two instances of this
>>>>>> ensemble and have the embedded Cassandra instances form a cluster. If
>>>>>> I configure every write to be replicated on both nodes synchronously,
>>>>>> then it will satisfy the HA needs of this use case. Is it feasible to
>>>>>> form clusters of embedded Cassandra instances?
>>>>>>
>>>>>> For use case #3, I can form a large cluster of the ensemble where all
>>>>>> writes are replicated synchronously to a quorum of nodes.
>>>>>>
>>>>>> Finally, in use cases #2 and #3, I'd like to use the failure
>>>>>> detection and cluster membership dissemination infrastructure of
>>>>>> Cassandra from within my application. Is it possible to be notified
>>>>>> of membership changes when embedding Cassandra? I could use a
>>>>>> separate library to do this (say, with JGroups or Akka), but I fear
>>>>>> that if this library and the embedded Cassandra instances disagree,
>>>>>> it could lead to subtle bugs.
>>>>>>
>>>>>> Thanks,
>>>>>> Binil
>>>>>>
>>>>>> PS: Cross-posted at
>>>>>> http://stackoverflow.com/questions/35384983/forming-a-cluster-of-embedded-cassandra-instances
>>>
>>> --
>>>
>>> - John
>
> --
>
> - John
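
The quorum rule quoted earlier in the thread (n/2+1, where n is the number of replicas) is easy to check with a few lines of Java. This is only a sketch of the arithmetic, assuming integer division; the class and method names (QuorumMath, quorum, tolerableFailures) are made up for this example and are not part of Cassandra or any driver.

```java
/** Sketch of the quorum arithmetic, with n as the number of replicas. */
final class QuorumMath {

    /** Quorum size: n/2 + 1 using integer division. */
    static int quorum(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1;
    }

    /** Number of replicas that can be down while a quorum is still reachable. */
    static int tolerableFailures(int replicas) {
        return replicas - quorum(replicas);
    }

    public static void main(String[] args) {
        // Print the quorum size and failure tolerance for 1 to 5 replicas.
        for (int n = 1; n <= 5; n++) {
            System.out.printf("replicas=%d quorum=%d tolerable failures=%d%n",
                    n, quorum(n), tolerableFailures(n));
        }
    }
}
```

With two replicas the quorum is two, so losing either node makes quorum operations impossible, which is the arithmetic behind the remark that two nodes would not support HA. Three replicas give a quorum of two and tolerate one failure.
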