Hi,

I am benchmarking Phoenix to better understand its strengths and weaknesses. My baseline is PostgreSQL for OLTP workloads and Hive LLAP for OLAP workloads. I am testing on a 10-node cluster running Hive (2.1) and Phoenix (4.8) with 220 GB RAM / 32 CPUs, versus a PostgreSQL (9.6) server with 128 GB RAM / 32 CPUs.

Right now, my opinion is:

- when fetching a subset of a large table, Phoenix performs the best
- when fetching a subset from multiple large tables, PostgreSQL performs the best
- when fetching a subset of a large table joined to one or many small tables, Phoenix performs the best
- when ingesting high-frequency data, Phoenix performs the best
- for GROUP BY queries: Hive > PostgreSQL > Phoenix
- for windowing, transforming, and grouping, Hive performs the best and Phoenix the worst

Finally, my conclusion is that Phoenix is not intended at all for analytical queries such as grouping, windowing, and joining large tables. It is well suited to very specific use cases, like maintaining one very large table, possibly with small tables to join against (such as time-series data, or binary storage data with HBase MOB enabled).

Am I missing something?

Thanks,
-- nicolas