This is great, but in the future please refrain from borderline marketing of a commercial product on these lists. This is not the appropriate venue for that.
It is especially poor form to dump on a fellow open source project, given that you claim to be one yourself. That, I think, is the tell behind the commercial motivation. I should also point out, being pretty familiar with Phoenix in operation where I work, and from my interactions with various Phoenix committers and PMC members, that the negative view apparently shared by that particular group of HBasers - which I will not comment on; they are entitled to their opinions, and more choice in SQL access to HBase is good! - should not be claimed to be universal or even representative.

On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <rohit.j...@esgyn.com> wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia. I would like to address the points I have pulled out from the write-up (at the bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache Phoenix, there has been a project called Apache Trafodion, contributed by Hewlett-Packard in 2015, which has now been a top-level project for a while. Apache Trafodion is essentially technology from Tandem-Compaq-HP that started its OLTP / operational journey as NonStop SQL in the early 1990s. Granted, it is a C++ project, but it includes 170+ patents that were contributed to Apache. These are capabilities that still don't exist in other databases.
>
> It is a full-fledged SQL relational database engine with broad ANSI SQL support, including the OLAP functions mentioned, as well as many de facto standard functions from databases like Oracle. You can go to the Apache Trafodion wiki to see the documentation on everything Trafodion supports.
>
> When we introduced Apache Trafodion, we implemented a completely distributed transaction management capability right into the HBase engine using coprocessors, which is completely scalable with no bottlenecks whatsoever. We have made this infrastructure very efficient over time, e.g.
> reducing two-phase commit overhead for single-region transactions. We have presented this at HBaseCon.
>
> The engine also supports secondary indexes. However, because of our patented Multi-dimensional Access Method technology, the need to use a secondary index is substantially reduced. All DDL and index updates are completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the project, and potentially other reasons, we could not get the community involvement we were expecting. That is why you may see that while we are maintaining the code base and introducing enhancements to it, much of our focus has shifted to the commercial product based on Apache Trafodion, namely EsgynDB. But if community involvement increases, we can certainly refresh Trafodion with some of the additional functionality we have added on the HBase side of the product.
>
> But let me be clear. We are about 150 employees at Esgyn, with 40 or so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and Guiyang. We cannot sustain the company on service revenue alone. You have seen that companies that tried to do that have not been successful, unless they had a way to leverage the open source project for a different business model – enhanced capabilities, cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery; Point-in-Time, fuzzy Backup and Restore; manageability via a Database Manager; multi-tenancy; and a large number of other capabilities for high-availability scale-out production deployments. EsgynDB also provides full BI and analytics capabilities - again because of our heritage products supporting up to 250TB EDWs for HP and customers like Walmart, competing with Teradata - leveraging Apache ORC and Parquet. So yes, it can integrate with other storage engines as needed.
> However, in spite of all this, the pricing on EsgynDB is very competitive – in other words, "cheap" compared to anything else with the same caliber of capabilities.
>
> We have demonstrated the capability of the product by running the TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high concurrency, which our product is especially well suited for based on its architecture and patents. (The TPC-DS benchmarks are run on ORC and Parquet, for obvious reasons.)
>
> We just closed a couple of very large core banking deals in Guiyang, where we are replacing the entire core banking systems of these banks, moving them off their current Oracle implementations – where they were having challenges scaling at a reasonable cost. But we have many customers, both in the US and China, that are using EsgynDB for operational, BI, and analytics needs. And now finally … OLTP.
>
> I know that this is sounding more like a commercial for Esgyn, but that is not my intent. I would like to make you aware of Apache Trafodion as a solution to many of the issues the community is facing. We will provide full support for Trafodion with community involvement, and hope that some of that involvement results in EsgynDB revenue that we can sustain the company on 😊. I would like to encourage the community to look at Trafodion to address many of the concerns cited below.
>
> “Allan Yang said that most of their customers want secondary index, even more than SQL. And for global strong consistent secondary index, we agree that the only safe way is to use transaction. Other 'local' solutions will be in trouble when splitting/merging.”
>
> “We talked about Phoenix, the problem for Phoenix is well known: not stable enough. We even had a user on the mailing-list said he/she will never use Phoenix again.”
>
> “Some guys said that the current feature set for 3.0.0 is not good enough to attract more users, especially for small companies.
> Only internal improvements, no users visible features. SQL and secondary index are very important.”
>
> “Then we back to SQL again. Alibaba said that most of their customers are migrate from old business, so they need 'full' SQL support. That's why they need Phoenix. And lots of small companies wants to run OLAP queries directly on the database, they do no want to use ETL. So maybe in the SQL proxy (planned above), we should delegate the OLAP queries to spark SQL or something else, rather than just rejecting them.”
>
> “And a Phoenix committer said that, the Phoenix community are currently re-evaluate the relationship with HBase, because when upgrading to HBase 2.1.x, lots of things are broken. They plan to break the tie between Phoenix and HBase, which means Phoenix plans to also run on other storage systems. Note: This is not on the meeting but personally, I think this maybe a good news, since Phoenix is not HBase only, we have more reasons to introduce our own SQL layer.”
>
> Rohit Jain
> CTO
> Esgyn
>
> -----Original Message-----
> From: Stack <st...@duboce.net>
> Sent: Friday, July 26, 2019 12:01 PM
> To: HBase Dev List <d...@hbase.apache.org>
> Cc: hbase-user <user@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> > Thanks for the thorough write-up Duo. Made for a good read....
> >
> > S
> >
> > On Fri, Jul 26, 2019 at 6:43 AM 张铎 (Duo Zhang) <palomino...@gmail.com> wrote:
> >
> > > The conclusion of HBaseConAsia 2019 will be available later. And here is the note from the round table meeting after the conference. A bit long...
> > >
> > > First we talked about splittable meta. At Xiaomi we have a cluster which has nearly 200k regions, and meta is very easy to overload and cannot recover.
> > > Anoop said we can try read replicas, but agreed that read replicas cannot solve all the problems; ultimately we still need to split meta.
> > >
> > > Then we talked about SQL. Allan Yang said that most of their customers want secondary indexes, even more than SQL. And for a global strongly consistent secondary index, we agree that the only safe way is to use transactions. Other 'local' solutions will be in trouble when splitting/merging. Xiaomi has a global secondary index solution; open source it?
> > >
> > > Then back to SQL. We talked about Phoenix; the problem with Phoenix is well known: not stable enough. We even had a user on the mailing-list say he/she will never use Phoenix again. Alibaba and Huawei both have their own in-house SQL solutions, and Huawei also talked about theirs at HBaseConAsia 2019; they will try to open source it. And we could introduce a SQL proxy in the hbase-connector repo. No push down support at first; all logic is done at the proxy side, which can be optimized later.
> > >
> > > Some guys said that the current feature set for 3.0.0 is not good enough to attract more users, especially for small companies. Only internal improvements, no user visible features. SQL and secondary indexes are very important.
> > >
> > > Yu Li talked about the CCSMap; we still want it to be released in 3.0.0. One problem is the relationship with in memory compaction. Theoretically they should have no conflicts, but actually they do. And the Xiaomi guys mentioned that in memory compaction still has some bugs; even for basic mode, the MVCC writePoint may get stuck and hang the region server. And Jieshan Bi asked why not just use CCSMap to replace CSLM. Yu Li said this is for better memory usage, as the index and data can be placed together.
> > >
> > > Then we started to talk about HBase on cloud.
> > > For now, it is a bit difficult to deploy HBase on cloud, as we need to deploy zookeeper and HDFS first. Then we talked about HBOSS and the WAL abstraction (HBASE-20952). Wellington said HBOSS basically works; it uses s3a and zookeeper to help simulate the semantics of HDFS. We could introduce our own 'FileSystem' interface, not the hadoop one, and we could remove the 'atomic renaming' dependency so that 'FileSystem' implementations will be easier to write. And on the WAL abstraction, Wellington said there are still some guys working on it, but for now they are focused on patching ratis rather than abstracting the WAL system first. We agreed that a better way is to abstract the WAL system at a level higher than FileSystem, so maybe we could even use Kafka to store the WAL.
> > >
> > > Then we talked about the FPGA usage for compaction at Alibaba. Jieshan Bi said that at Huawei they offload compaction to the storage layer. As an open source solution, maybe we could offload compaction to Spark, and then use something like bulkload to let the region server load the new HFiles. The problem with doing compaction inside the region server is the CPU cost and GC pressure. We need to scan every cell, so the CPU cost is high. Yu Li talked about their page based compaction in the flink state store; maybe it could also benefit HBase.
> > >
> > > Then it was time for MOB. Huawei said MOB cannot solve their problem. We still need to read the data through RPC, and it also introduces pressure on the memstore, since the memstore is still a bit small compared to a MOB cell. And we will also flush a lot even though there are only a small number of MOB cells in the memstore, so we still need to compact a lot.
> > > So maybe the suitable scenario for using MOB is this: most of your data is still small, and a small amount of it is a bit larger; there MOB can improve performance, and users do not need another system to store the larger data.
> > > Huawei said that they implement the logic at the client side. If the data is larger than a threshold, the client goes to another storage system rather than HBase.
> > > Alibaba said that if we want to support large blobs, we need to introduce a streaming API.
> > > And Kuaishou said that they do not use MOB; they just store the data on HDFS and the index in HBase, the typical solution.
> > >
> > > Then we talked about which company will host next year's HBaseConAsia. It will be Tencent or Huawei, or both, probably in Shenzhen. And since there is no HBaseCon in America any more (it is called 'NoSQL Day' now), maybe next year we could just call the conference HBaseCon.
> > >
> > > Then we got back to SQL again. Alibaba said that most of their customers migrated from old businesses, so they need 'full' SQL support. That's why they need Phoenix. And lots of small companies want to run OLAP queries directly on the database; they do not want to use ETL. So maybe in the SQL proxy (planned above), we should delegate the OLAP queries to Spark SQL or something else, rather than just rejecting them.
> > >
> > > And a Phoenix committer said that the Phoenix community is currently re-evaluating its relationship with HBase, because when upgrading to HBase 2.1.x, lots of things broke. They plan to break the tie between Phoenix and HBase, which means Phoenix plans to also run on other storage systems.
> > > Note: This was not at the meeting, but personally I think this may be good news; since Phoenix is not HBase only any more, we have more reason to introduce our own SQL layer.
> > >
> > > Then we talked about Kudu.
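[As a rough illustration of the client-side routing Huawei described above - values over a size threshold bypass HBase for a separate blob store, with only a reference kept in HBase - a minimal sketch might look like the following. The class and store interfaces here are hypothetical stand-ins, not the actual HBase client API.]

```python
# Sketch of client-side large-value routing: values over a threshold
# go to an external blob store (think HDFS / object storage), and only
# a small reference cell is written to the KV store (think HBase).
# All names here are hypothetical, for illustration only.

LARGE_VALUE_THRESHOLD = 1024 * 1024  # assumed 1 MB cutoff

class RoutingClient:
    def __init__(self, kv_store, blob_store, threshold=LARGE_VALUE_THRESHOLD):
        self.kv = kv_store        # stands in for an HBase table
        self.blobs = blob_store   # stands in for HDFS / object storage
        self.threshold = threshold

    def put(self, key, value: bytes):
        if len(value) > self.threshold:
            # Large value: store the blob externally, keep only a reference.
            ref = self.blobs.write(key, value)
            self.kv[key] = b"ref:" + ref.encode()
        else:
            # Small value: store inline, as usual.
            self.kv[key] = b"val:" + value

    def get(self, key) -> bytes:
        cell = self.kv[key]
        if cell.startswith(b"ref:"):
            # Indirect read: follow the reference to the blob store.
            return self.blobs.read(cell[4:].decode())
        return cell[4:]

class DictBlobStore:
    """Toy in-memory blob store standing in for HDFS or an object store."""
    def __init__(self):
        self.data = {}
    def write(self, key, value):
        self.data[key] = value
        return key  # use the key itself as the reference
    def read(self, ref):
        return self.data[ref]

client = RoutingClient({}, DictBlobStore(), threshold=8)
client.put("small", b"tiny")     # stays inline in the KV store
client.put("large", b"x" * 100)  # routed to the blob store
assert client.get("small") == b"tiny"
assert client.get("large") == b"x" * 100
```

[The same shape also covers the Kuaishou pattern mentioned above - index in HBase, data on HDFS - the difference is only in the threshold policy (always route vs. route above a cutoff).]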
> > > It is faster than HBase on scan. If we want to increase scan performance, we should use a larger block size, but this leads to slower random reads, so there is a trade-off. The Kuaishou guys asked whether HBase could support storing HFiles in a columnar format. The answer is no; as said above, it would slow down random reads. But we could learn from what Google did in Bigtable. We could write a copy of the data in parquet format to another FileSystem, and users could just scan the parquet files for better analysis performance. And if they want the newest data, they could ask HBase for it, and it should be small. This is more like a solution, where not only HBase is involved. But at least we could introduce some APIs in HBase so users can build the solution in their own environments. And if you do not care about the newest data, you could also use replication to replicate the data to ES or other systems, and search there.
> > >
> > > And Didi talked about their problems using HBase. They use kylin, so they also have lots of regions, so meta is a problem for them too. And the pressure on zookeeper is also a problem, as the replication queues are stored on zk. After 2.1, zookeeper is only used as external storage in the replication implementation, so it is possible to switch to other storages, such as etcd. But it is still a bit difficult to store the data in a system table: now we need to start the replication system before the WAL system, but if we want to store the replication data in an hbase table, obviously the WAL system must be started before the replication system, as we need the region of the system table online first, and opening it will write an open marker to the WAL. We need to find a way to break the deadlock.
> > > And they also mentioned that the rsgroup feature creates a big znode on zookeeper, as they have lots of tables.
> > > We have HBASE-22514, which aims to solve the problem.
> > > And last, they shared their experience upgrading from 0.98 to 1.4.x. These releases should be compatible, but in practice there were problems. They agreed to post a blog about this.
> > >
> > > And the Flipkart guys said they will open source their test-suite, which focuses on consistency (Jepsen?). This is good news; hopefully we will have another useful tool besides ITBLL.
> > >
> > > That's all. Thanks for reading.

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
   - A23, Crosstalk