Hi Siva/Jaime,

In my opinion, HBase is meant for quick key/value lookups or short range scans, while Hive is meant for analytical/data-warehouse workloads. Full table scans are not what HBase is known for, and joins that require full table scans are not a sweet spot for it. If you do need a full table scan in HBase, you can try running a MapReduce job over an HBase snapshot. Or you could just use Hive for OLAP-type workloads.
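
To make the access-pattern point concrete, here is a toy sketch in Python. It is not the HBase API: `SortedKeyStore` and its methods are hypothetical names, and `rows_read` is only a rough stand-in for I/O cost. It assumes nothing more than a table kept sorted by row key, and counts how many rows each access pattern touches.

```python
from bisect import bisect_left


class SortedKeyStore:
    """Toy stand-in for a table sorted by row key (hypothetical, not HBase)."""

    def __init__(self, rows):
        self._keys = sorted(rows)   # row keys in sorted order
        self._rows = dict(rows)
        self.rows_read = 0          # rows touched, a rough cost proxy

    def get(self, key):
        # Point lookup: a single row is touched.
        self.rows_read += 1
        return self._rows.get(key)

    def range_scan(self, start, stop):
        # Short range scan: seek to `start`, read until `stop` (exclusive).
        i = bisect_left(self._keys, start)
        out = []
        while i < len(self._keys) and self._keys[i] < stop:
            key = self._keys[i]
            out.append((key, self._rows[key]))
            self.rows_read += 1
            i += 1
        return out

    def full_scan(self, predicate):
        # Full table scan: every row is read, however few the predicate keeps.
        out = []
        for key in self._keys:
            self.rows_read += 1
            if predicate(self._rows[key]):
                out.append((key, self._rows[key]))
        return out


store = SortedKeyStore({f"user{i:05d}": i for i in range(10_000)})

store.get("user00042")
lookup_cost = store.rows_read

store.rows_read = 0
store.range_scan("user00100", "user00110")
range_cost = store.rows_read

store.rows_read = 0
store.full_scan(lambda value: value % 1000 == 0)
full_cost = store.rows_read

print(lookup_cost, range_cost, full_cost)
```

The lookup and the short range scan touch only the rows they return, while the full scan reads all 10,000 rows to keep 10 of them. A join that cannot be driven by row key ends up in that last category on each side.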

Thanks,
Anil Gupta

On Tue, Jun 2, 2015 at 4:43 PM, Siva <[email protected]> wrote:

> Hi Jaime,
>
> When we ran queries with complex joins (involving ~10 tables) on
> Phoenix against tables with large data, we initially saw a lot of
> issues, and queries failed with errors. We started to tune both HBase and
> Phoenix; now a few queries are running fine, but queries with larger data sets
> still have the same issues. We are still working on tuning them. The reason for
> the failures could be the small cluster, limited by memory and IO.
>
> On the other hand, the same queries with the same data size on Hive 14 (with Tez +
> ORC format + SNAPPY compression) finished within 70~100 seconds. It
> would be good if Phoenix could publish performance results on join
> queries.
>
> Thanks,
> Siva.
>
> On Tue, Jun 2, 2015 at 1:47 PM, Jaime Solano <[email protected]> wrote:
>
>> Hi guys,
>>
>> Are there benchmarks or numbers showing how Phoenix performs during the
>> join of two or more huge tables? I'm not familiar with the join
>> implementation, so I'm not sure if there's a limitation regarding number of
>> regions, memory, disk, etc.
>>
>> Any thoughts?
>>
>> Thanks,
>> -Jaime
>>
>
>

--
Thanks & Regards,
Anil Gupta
