Any idea?

============ Forwarded message ============
From: onmstester onmstester <onmstes...@zoho.com>
To: "user" <user@hbase.apache.org>
Date: Sat, 08 Sep 2018 10:46:25 +0430
Subject: Migrating from Apache Cassandra to Hbase

Hi,

Currently I'm using Apache Cassandra as the backend for my
RESTful application. With a cluster of 30 nodes (each with 12 cores, 64 GB of
RAM, and a 6 TB disk that is 50% used), write and read throughput is more than
satisfactory for us.

The input is a fixed set of long and int columns, and we need to query it by
every column. With 8 columns, that means 8 tables, following Cassandra's
query-based data modeling recommendation. The Cassandra
keyspace schema would be something like this:

Table 1 (timebucket, col1, ..., col8, primary key(timebucket, col1))
  to handle: select * from input where timebucket = X and col1 = Y
....
Table 8 (timebucket, col1, ..., col8, primary key(timebucket, col8))

So for each input row, there would be 8 inserts in
Cassandra (not considering RF), and with a TTL of 12 months the production
cluster should keep about 2 petabytes of data. With the recommended node
density for a Cassandra cluster (2 TB per node), I would need a cluster of more
than 1000 nodes, which I cannot afford.

So, long story short: I'm looking for an alternative to
Apache Cassandra for this application.

How would HBase solve these problems:

1. 8x data redundancy due to the needed queries.
2. Nodes with large data density (30 TB of data on each node if No. 1 cannot
   be solved in HBase): how would HBase handle compaction and node join/remove
   problems when only 5 * 6 TB 7200 RPM SATA disks are available on each node?
   How much empty space does HBase need for the temporary files of compaction?
3. Also, I read in some documents (including DataStax's) that HBase is more of
   an offline, data-lake backend that is better not used as a web application
   backend that needs response times (QoS) under a few seconds.

Thanks in advance
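
As a rough check of the sizing above, the node counts can be reproduced with a few lines of Python. This is only a sketch under assumptions that the message does not state explicitly: decimal units (1 PB = 1000 TB), the full 5 * 6 TB per node being usable for data, and the 2 PB figure already including all 8 copies of each row. The helper name `nodes_needed` is made up for illustration.

```python
import math

TB_PER_PB = 1000  # assumption: decimal units


def nodes_needed(total_tb: float, tb_per_node: float) -> int:
    """Smallest number of nodes that can hold total_tb at the given density."""
    return math.ceil(total_tb / tb_per_node)


total_tb = 2 * TB_PER_PB   # "about 2 petabytes" with a 12-month TTL
copies = 8                 # one table per queried column

# Recommended Cassandra density from the message: 2 TB per node.
cassandra_nodes = nodes_needed(total_tb, 2)

# Dense nodes: 5 disks of 6 TB each, assumed completely usable for data.
dense_nodes = nodes_needed(total_tb, 5 * 6)

# Volume of a single copy, assuming the 2 PB already counts all 8 copies.
single_copy_tb = total_tb / copies

print(cassandra_nodes, dense_nodes, single_copy_tb)
```

Under these assumptions the 2 TB-per-node density gives the 1000-node figure, while 30 TB per node would need far fewer machines, provided compaction and node join/remove remain workable at that density. Headroom for compaction and replication is deliberately left out, since those are the open questions.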