This is great, but in the future please refrain from borderline marketing of a commercial product on these lists. This is not the appropriate venue for that.
It is especially poor form to dump on a fellow open source project, given that you claim to be one yourself. That, I think, is the tell behind the commercial motivation. I should also point out, being pretty familiar with Phoenix in operation where I work, and from my interactions with various Phoenix committers and PMC members, that the negative view apparently shared by that particular group of HBasers - which I will not comment on; they are entitled to their opinions, and more choice in SQL access to HBase is good! - should not be claimed to be universal or even representative.

On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <rohit.j...@esgyn.com> wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia. I would like to address the points I have pulled out from the write-up (at the bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache Phoenix, there has been a project called Apache Trafodion, contributed by Hewlett-Packard in 2015, which has now been a top-level project for a while. Apache Trafodion is essentially technology from Tandem-Compaq-HP that started its OLTP / operational journey as NonStop SQL in the early 1990s. Granted, it is a C++ project, but it includes 170+ patents that were contributed to Apache. These are capabilities that still don't exist in other databases.
>
> It is a full-fledged SQL relational database engine with broad ANSI SQL support, including the OLAP functions mentioned, as well as many de facto standard functions from databases like Oracle. You can go to the Apache Trafodion wiki to see the documentation on everything Trafodion supports.
>
> When we introduced Apache Trafodion, we implemented a completely distributed transaction management capability right into the HBase engine using coprocessors, which is completely scalable with no bottlenecks whatsoever. We have made this infrastructure very efficient over time, e.g.
> reducing two-phase commit overhead for single-region transactions. We have presented this at HBaseCon.
>
> The engine also supports secondary indexes. However, because of our patented Multi-dimensional Access Method technology, the need to use a secondary index is substantially reduced. All DDL and index updates are completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the project, and potentially other reasons, we could not get the community involvement we were expecting. That is why you may see that while we are maintaining the code base and introducing enhancements to it, much of our focus has shifted to the commercial product based on Apache Trafodion, namely EsgynDB. But if community involvement increases, we can certainly refresh Trafodion with some of the additional functionality we have added on the HBase side of the product.
>
> But let me be clear. We are about 150 employees at Esgyn, with 40 or so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and Guiyang. We cannot sustain the company on service revenue alone. You have seen that companies that tried to do that have not been successful, unless they had a way to leverage the open source project for a different business model – enhanced capabilities, cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery; Point-in-Time, fuzzy Backup and Restore; manageability via a Database Manager; multi-tenancy; and a large number of other capabilities for high-availability scale-out production deployments. EsgynDB also provides full BI and analytics capabilities - again because of our heritage products supporting up to 250TB EDWs for HP and customers like Walmart, competing with Teradata - leveraging Apache ORC and Parquet. So yes, it can integrate with other storage engines as needed.
> However, in spite of all this, the pricing on EsgynDB is very competitive – in other words, "cheap" compared to anything else with the same caliber of capabilities.
>
> We have demonstrated the capability of the product by running the TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high concurrency, which our product is especially well suited for based on its architecture and patents. (The TPC-DS benchmarks are run on ORC and Parquet, for obvious reasons.)
>
> We just closed a couple of very large core banking deals in Guiyang, where we are replacing the entire core banking systems of these banks, moving them off their current Oracle implementations – where they were having challenges scaling at a reasonable cost. But we have many customers, both in the US and China, that are using EsgynDB for operational, BI, and analytics needs. And now finally … OLTP.
>
> I know that this is sounding more like a commercial for Esgyn, but that is not my intent. I would like to make you aware of Apache Trafodion as a solution to many of the issues the community is facing. We will provide full support for Trafodion with community involvement, and hope that some of that involvement results in EsgynDB revenue that we can sustain the company on 😊. I would like to encourage the community to look at Trafodion to address many of the concerns cited below.
>
> “Allan Yang said that most of their customers want secondary index, even more than SQL. And for global strong consistent secondary index, we agree that the only safe way is to use transaction. Other 'local' solutions will be in trouble when splitting/merging.”
>
> “We talked about Phoenix, the problem for Phoenix is well known: not stable enough. We even had a user on the mailing-list said he/she will never use Phoenix again.”
>
> “Some guys said that the current feature set for 3.0.0 is not good enough to attract more users, especially for small companies.
> Only internal improvements, no users visible features. SQL and secondary index are very important.”
>
> “Then we back to SQL again. Alibaba said that most of their customers are migrate from old business, so they need 'full' SQL support. That's why they need Phoenix. And lots of small companies wants to run OLAP queries directly on the database, they do no want to use ETL. So maybe in the SQL proxy (planned above), we should delegate the OLAP queries to spark SQL or something else, rather than just rejecting them.”
>
> “And a Phoenix committer said that, the Phoenix community are currently re-evaluate the relationship with HBase, because when upgrading to HBase 2.1.x, lots of things are broken. They plan to break the tie between Phoenix and HBase, which means Phoenix plans to also run on other storage systems. Note: This is not on the meeting but personally, I think this maybe a good news, since Phoenix is not HBase only, we have more reasons to introduce our own SQL layer.”
>
> Rohit Jain
> CTO
> Esgyn
>
> -----Original Message-----
> From: Stack <st...@duboce.net>
> Sent: Friday, July 26, 2019 12:01 PM
> To: HBase Dev List <d...@hbase.apache.org>
> Cc: hbase-user <user@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> > Thanks for the thorough write-up Duo. Made for a good read....
> >
> > S
> >
> > On Fri, Jul 26, 2019 at 6:43 AM 张铎 (Duo Zhang) <palomino...@gmail.com> wrote:
> >
> > > The conclusion of HBaseConAsia 2019 will be available later. And here is the note from the round table meeting after the conference. A bit long...
> > >
> > > First we talked about splittable meta. At Xiaomi we have a cluster which has nearly 200k regions, and meta is very easy to overload and cannot recover.
> > > Anoop said we can try read replicas, but agreed that read replicas cannot solve all the problems; ultimately we still need to split meta.
> > >
> > > Then we talked about SQL. Allan Yang said that most of their customers want secondary indexes, even more than SQL. And for a global strongly consistent secondary index, we agree that the only safe way is to use transactions. Other 'local' solutions will be in trouble when splitting/merging. Xiaomi has a global secondary index solution; open source it?
> > >
> > > Then back to SQL. We talked about Phoenix; the problem with Phoenix is well known: not stable enough. We even had a user on the mailing-list say he/she will never use Phoenix again. Alibaba and Huawei both have their own in-house SQL solutions, and Huawei also talked about theirs at HBaseConAsia 2019; they will try to open source it. And we could introduce a SQL proxy in the hbase-connector repo. No push down support at first; all logic is done at the proxy side, which can be optimized later.
> > >
> > > Some guys said that the current feature set for 3.0.0 is not good enough to attract more users, especially for small companies. Only internal improvements, no user visible features. SQL and secondary indexes are very important.
> > >
> > > Yu Li talked about the CCSMap; we still want it to be released in 3.0.0. One problem is the relationship with in memory compaction. Theoretically they should have no conflicts, but actually they do. And the Xiaomi guys mentioned that in memory compaction still has some bugs; even for basic mode, the MVCC writePoint may get stuck and hang the region server. And Jieshan Bi asked why not just use CCSMap to replace CSLM. Yu Li said this is for better memory usage, as the index and data can be placed together.
> > >
> > > Then we started to talk about HBase on cloud.
> > > For now, it is a bit difficult to deploy HBase on cloud, as we need to deploy zookeeper and HDFS first. Then we talked about HBOSS and the WAL abstraction (HBASE-20952). Wellington said HBOSS basically works; it uses s3a and zookeeper to help simulate the semantics of HDFS. We could introduce our own 'FileSystem' interface, not the hadoop one, and we could remove the 'atomic renaming' dependency so that 'FileSystem' implementations will be easier to write. And on the WAL abstraction, Wellington said there are still some guys working on it, but for now they are focused on patching ratis rather than abstracting the WAL system first. We agreed that a better way is to abstract the WAL system at a level higher than FileSystem, so maybe we could even use Kafka to store the WAL.
> > >
> > > Then we talked about the FPGA usage for compaction at Alibaba. Jieshan Bi said that at Huawei they offload compaction to the storage layer. As an open source solution, maybe we could offload compaction to Spark, and then use something like bulkload to let the region server load the new HFiles. The problem with doing compaction inside the region server is the CPU cost and GC pressure. We need to scan every cell, so the CPU cost is high. Yu Li talked about their page based compaction in the flink state store; maybe it could also benefit HBase.
> > >
> > > Then it was time for MOB. Huawei said MOB cannot solve their problem. We still need to read the data through RPC, and it also introduces pressure on the memstore, since the memstore is still a bit small compared to a MOB cell. And we will also flush a lot even though there are only a small number of MOB cells in the memstore, so we still need to compact a lot.
> > > So maybe the suitable scenario for using MOB is this: most of your data is still small, and a small amount of it is a bit larger; there MOB can improve performance, and users do not need another system to store the larger data.
> > > Huawei said that they implement the logic at the client side. If the data is larger than a threshold, the client goes to another storage system rather than HBase.
> > > Alibaba said that if we want to support large blobs, we need to introduce a streaming API.
> > > And Kuaishou said that they do not use MOB; they just store the data on HDFS and the index in HBase, the typical solution.
> > >
> > > Then we talked about which company will host next year's HBaseConAsia. It will be Tencent or Huawei, or both, probably in Shenzhen. And since there is no HBaseCon in America any more (it is called 'NoSQL Day' now), maybe next year we could just call the conference HBaseCon.
> > >
> > > Then we got back to SQL again. Alibaba said that most of their customers migrated from old businesses, so they need 'full' SQL support. That's why they need Phoenix. And lots of small companies want to run OLAP queries directly on the database; they do not want to use ETL. So maybe in the SQL proxy (planned above), we should delegate the OLAP queries to Spark SQL or something else, rather than just rejecting them.
> > >
> > > And a Phoenix committer said that the Phoenix community is currently re-evaluating its relationship with HBase, because when upgrading to HBase 2.1.x, lots of things broke. They plan to break the tie between Phoenix and HBase, which means Phoenix plans to also run on other storage systems.
> > > Note: This was not at the meeting, but personally I think this may be good news; since Phoenix is not HBase only any more, we have more reason to introduce our own SQL layer.
> > >
> > > Then we talked about Kudu.
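[As a rough illustration of the client-side routing Huawei described above - values over a size threshold bypass HBase for a separate blob store, with only a reference kept in HBase - a minimal sketch might look like the following. The class and store interfaces here are hypothetical stand-ins, not the actual HBase client API.]

```python
# Sketch of client-side large-value routing: values over a threshold
# go to an external blob store (think HDFS / object storage), and only
# a small reference cell is written to the KV store (think HBase).
# All names here are hypothetical, for illustration only.

LARGE_VALUE_THRESHOLD = 1024 * 1024  # assumed 1 MB cutoff

class RoutingClient:
    def __init__(self, kv_store, blob_store, threshold=LARGE_VALUE_THRESHOLD):
        self.kv = kv_store        # stands in for an HBase table
        self.blobs = blob_store   # stands in for HDFS / object storage
        self.threshold = threshold

    def put(self, key, value: bytes):
        if len(value) > self.threshold:
            # Large value: store the blob externally, keep only a reference.
            ref = self.blobs.write(key, value)
            self.kv[key] = b"ref:" + ref.encode()
        else:
            # Small value: store inline, as usual.
            self.kv[key] = b"val:" + value

    def get(self, key) -> bytes:
        cell = self.kv[key]
        if cell.startswith(b"ref:"):
            # Indirect read: follow the reference to the blob store.
            return self.blobs.read(cell[4:].decode())
        return cell[4:]

class DictBlobStore:
    """Toy in-memory blob store standing in for HDFS or an object store."""
    def __init__(self):
        self.data = {}
    def write(self, key, value):
        self.data[key] = value
        return key  # use the key itself as the reference
    def read(self, ref):
        return self.data[ref]

client = RoutingClient({}, DictBlobStore(), threshold=8)
client.put("small", b"tiny")     # stays inline in the KV store
client.put("large", b"x" * 100)  # routed to the blob store
assert client.get("small") == b"tiny"
assert client.get("large") == b"x" * 100
```

[The same shape also covers the Kuaishou pattern mentioned above - index in HBase, data on HDFS - the difference is only in the threshold policy (always route vs. route above a cutoff).]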
> > > It is faster than HBase on scan. If we want to increase scan performance, we should use a larger block size, but this leads to slower random reads, so there is a trade-off. The Kuaishou guys asked whether HBase could support storing HFiles in a columnar format. The answer is no; as said above, it would slow down random reads. But we could learn from what Google did in Bigtable. We could write a copy of the data in parquet format to another FileSystem, and users could just scan the parquet files for better analysis performance. And if they want the newest data, they could ask HBase for it, and it should be small. This is more like a solution, where not only HBase is involved. But at least we could introduce some APIs in HBase so users can build the solution in their own environments. And if you do not care about the newest data, you could also use replication to replicate the data to ES or other systems, and search there.
> > >
> > > And Didi talked about their problems using HBase. They use kylin, so they also have lots of regions, so meta is a problem for them too. And the pressure on zookeeper is also a problem, as the replication queues are stored on zk. After 2.1, zookeeper is only used as external storage in the replication implementation, so it is possible to switch to other storages, such as etcd. But it is still a bit difficult to store the data in a system table: now we need to start the replication system before the WAL system, but if we want to store the replication data in an hbase table, obviously the WAL system must be started before the replication system, as we need the region of the system table online first, and opening it will write an open marker to the WAL. We need to find a way to break the deadlock.
> > > And they also mentioned that the rsgroup feature creates a big znode on zookeeper, as they have lots of tables.
> > > We have HBASE-22514, which aims to solve the problem.
> > > And last, they shared their experience upgrading from 0.98 to 1.4.x. These releases should be compatible, but in practice there were problems. They agreed to post a blog about this.
> > >
> > > And the Flipkart guys said they will open source their test-suite, which focuses on consistency (Jepsen?). This is good news; hopefully we will have another useful tool besides ITBLL.
> > >
> > > That's all. Thanks for reading.

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
   - A23, Crosstalk