Re: Certified Cassandra for Enterprise use

2018-05-31 Thread Rahul Singh
To be as objective as possible: Product vendors: DataStax, Stratio. Infrastructure / Database as a Service: Instaclustr, CosmosDB on Azure. Container orchestration: Mesosphere (DCOS creator) has limited support for “certified” Cassandra and DSE containers on Mesos. Disclosure: our firm is a DataStax

Mongo DB vs Cassandra

2018-05-31 Thread Sudhakar Ganesan
Team, I need to make a decision on MongoDB vs Cassandra for loading CSV file data and storing the CSV files as well. If any of you have done such a study in the last couple of months, please share your analysis or observations. Regards, Sudhakar

Re: Mongo DB vs Cassandra

2018-05-31 Thread Joseph Arriola
Hi Sudhakar! Each one has different goals, which means that they are complementary. Could you share more detail about the use case so we can give you better advice? On Thu, May 31, 2018 at 5:50 AM, Sudhakar Ganesan wrote: > Team, > > > > I need to make a decision on Mongo DB vs Cas

Re: Mongo DB vs Cassandra

2018-05-31 Thread Russell Bateman
Sudhakar, MongoDB will accommodate loading CSV without regard to schema while still creating identifiable "columns" in the database, but you'll have to predict or back-impose some schema later if you're going to create indices for fast searching of the data. You can perform searching of data
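
To illustrate the point about loading CSV without a predefined schema, here is a minimal sketch using only the Python standard library. The function name `csv_to_documents` and the sample column names are hypothetical, chosen for illustration; no MongoDB call is made. Each row becomes a dictionary keyed by the header names, which is the document shape a store like MongoDB could accept.

```python
import csv
import io


def csv_to_documents(csv_text):
    """Turn each CSV row into a dict keyed by the header names.

    No schema is declared up front: the "columns" come from the header
    row, and every value stays a string until a schema is imposed later.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample data; the column names are assumptions.
sample = "machine_id,reading\nm1,20.5\nm2,19.8\n"
docs = csv_to_documents(sample)
for doc in docs:
    print(doc)
```

Note that every value is still a string after loading, which is one concrete form of the schema that would have to be imposed later before building useful indices.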

how to immediately delete tombstones

2018-05-31 Thread onmstester onmstester
Hi, I've deleted 50% of my data row by row, and now disk usage of the Cassandra data is more than 80%. The table's gc_grace was the default (10 days); I have now set it to 0, but although many compactions have finished, no space has been reclaimed so far. How could I force deletion of tombstones in sstables and reclaim th

Re: how to immediately delete tombstones

2018-05-31 Thread Nicolas Guyomar
Hi, You need to manually force a compaction, if you do not mind ending up with one big sstable (nodetool compact). On 31 May 2018 at 11:07, onmstester onmstester wrote: > Hi, > I've deleted 50% of my data row by row now disk usage of cassandra data is > more than 80%. > The gc_grace of table was de
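
As a rough aid to the gc_grace discussion in this thread, the sketch below models only the time-based part of tombstone eligibility: a tombstone becomes a candidate for removal once its deletion time is older than "now minus gc_grace". This is a simplified model, not Cassandra's implementation; the function name `past_gc_grace` is hypothetical, and a real compaction also has to consider whether other sstables may still hold data that the tombstone shadows.

```python
# 10 days in seconds, the default gc_grace mentioned in the thread.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60


def past_gc_grace(deletion_time, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return True if a tombstone written at deletion_time is older than
    the gc_grace window at time `now` (all values in epoch seconds).

    Simplified assumption: only the time check is modelled here.
    """
    gc_before = now - gc_grace_seconds
    return deletion_time < gc_before


deleted_at = 1_527_760_000  # hypothetical deletion timestamp
one_day_later = deleted_at + 24 * 60 * 60

print(past_gc_grace(deleted_at, one_day_later))                      # default window
print(past_gc_grace(deleted_at, one_day_later, gc_grace_seconds=0))  # gc_grace set to 0
```

Even when this check passes, space is reclaimed only when the sstables holding the tombstones are actually compacted, which is why a forced compaction comes up in the replies.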

RE: Mongo DB vs Cassandra

2018-05-31 Thread Sudhakar Ganesan
At a high level, on the production line, machines will provide data as CSV files at intervals ranging from every 1 second to every 1 minute to every 1 day (depending on the machine type used in the line operations). I need to parse those files, load them into a DB, and build an API layer to expose the data to downstream systems. Number of

Re: Mongo DB vs Cassandra

2018-05-31 Thread daemeon reiydelle
If you are starting with a modest amount of data (e.g. under 0.25 PB) and do not have extremely high availability requirements, then it is easier to start with MongoDB, avoiding HA clusters. I would suggest you start with MongoDB. Both are great, but C* scales far beyond MongoDB FOR A GIVEN LEVEL OF

Re: Mongo DB vs Cassandra

2018-05-31 Thread Jeff Jirsa
277 TB/day seems like the type of task I'd not trust to random mailing list advice. Cassandra can do that, but it's nontrivial. MongoDB may be able to do it, too (not sure). A lot of it will depend on how you're trying to query the data. On Thu, May 31, 2018 at 9:00 AM, Sudhakar Ganesan < sudha
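
To put the 277 TB/day figure in perspective, here is a small arithmetic sketch that converts a daily volume into the sustained write rate it implies. It assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and a perfectly even load over 24 hours; the function name is hypothetical, and real ingest would be burstier and multiplied by the replication factor.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_gb_per_s(tb_per_day):
    """Convert a daily volume in TB to an average rate in GB/s.

    Assumes decimal units and a uniform load across the whole day.
    """
    bytes_per_day = tb_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**9


rate = sustained_rate_gb_per_s(277)
print(f"{rate:.2f} GB/s")  # roughly 3.2 GB/s sustained, before replication
```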

Re: Mongo DB vs Cassandra

2018-05-31 Thread Jonathan Haddad
I haven’t seen any query requirements, which is going to be the thing that makes Cassandra difficult. If you can’t define your queries beforehand, Cassandra is a no-go. If you just want to store data somewhere, and it’s just CSV, I’d go with a simple blob store like S3 and pick a DB later when you

Re: how to immediately delete tombstones

2018-05-31 Thread Alain RODRIGUEZ
Hello, It's a very common but somewhat complex topic. We wrote about it 2 years ago and I really think this post might have answers you are looking for: http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html Something that you could try (if you do care ending up with one big s

Re: Mongo DB vs Cassandra

2018-05-31 Thread Joseph Arriola
Based on the metrics you mention, I think the big data architecture could be Cassandra with Spark. You mention high availability. The APIs could use Node.js. This combination is powerful; the challenge is in the data model. On the other hand, if you are willing to sacrifice high availability and slow r

REMINDER: Apache EU Roadshow 2018 in Berlin is less than 2 weeks away!

2018-05-31 Thread sharan
Hello Apache Supporters and Enthusiasts, This is a reminder that our Apache EU Roadshow in Berlin is less than two weeks away and we need your help to spread the word. Please let your work colleagues, friends and anyone interested in attending know about our Apache EU Roadshow event. We h