Re: Performance issue in the Join query on the HBase tables

2017-09-29 Thread wenxing zheng
@Eric: I will take a look at Trafodion. @Nick: As for Hive/Spark over snapshots, I just tried Hive over HBase snapshots, and the select(count) is much faster than with Hive over HBase. Since the HBase tables are all so big, how can we make the engine respect data locality? Tha

Re: Performance issue in the Join query on the HBase tables

2017-09-29 Thread Nick Dimiduk
Have you considered running Hive/Spark over snapshots of your HBase tables? If you're seeing network saturation over HBase but not HDFS, that makes me think data locality is not being honored. Might be worth investigating as well. On Fri, Sep 29, 2017 at 3:26 AM wenxing zheng wrote: > Dear all, > >

RE: Performance issue in the Join query on the HBase tables

2017-09-29 Thread Eric Owhadi
Subject: Re: Performance issue in the Join query on the HBase tables Thanks to Ted. We haven't tried Phoenix yet. In the performance tests on the official Phoenix site, I couldn't find a report on join queries, so I'm not sure whether it is much better or not. On Fri, Sep 29, 2017 at

Re: Performance issue in the Join query on the HBase tables

2017-09-29 Thread wenxing zheng
Thanks to Ted. We haven't tried Phoenix yet. In the performance tests on the official Phoenix site, I couldn't find a report on join queries, so I'm not sure whether it is much better or not. On Fri, Sep 29, 2017 at 8:01 PM, Ted Yu wrote: > Have you looked at Phoenix ? > > https://phoenix.apache.

Re: Performance issue in the Join query on the HBase tables

2017-09-29 Thread Ted Yu
Have you looked at Phoenix? https://phoenix.apache.org/joins.html On Fri, Sep 29, 2017 at 3:25 AM, wenxing zheng wrote: > Dear all, > > I have 3 big HBase tables, which all have millions of rows (rows are synced > from MySQL DB via binlog) and for each HBase table, we have an external > table