On Fri, Aug 19, 2016 at 3:01 PM, Andrew Purtell <apurt...@apache.org> wrote:
> > I have a long interest in 'canned' loadings. Interesting ones are hard to
> > come by. If Phoenix ran any or a subset of TPCs, I'd like to try it.
>
> Likewise
>
> > But I don't want to be the first to try it. I am not a Phoenix expert.
>
> Same here, I'd just email dev@phoenix with a report that TPC query XYZ
> didn't work and that would be as far as I could get.
>

I don't think the first phase would require Phoenix experience. It's more
around the automation for running each TPC benchmark so the process is
repeatable:
- pulling in the data
- scripting the jobs
- having a test harness they run inside
- identifying the queries that don't work (ideally you wouldn't stop at the
  first error)
- filing JIRAs for these

The entire framework could be built and tested using standard JDBC APIs,
and then initially run using MySQL or some other RDBMS before trying it
with Phoenix. Maybe there's such a test harness that already exists for TPC?

Then I think the next phase would require more Phoenix & HBase experience:
- tweaking queries where possible given any limitations in Phoenix
- adding missing syntax (or potentially using the calcite branch which
  supports more)
- tweaking Phoenix schema declarations to optimize
- tweaking Phoenix & HBase configs to optimize
- determining which secondary indexes to add (though I think there's an
  academic paper on this, I can't seem to find it)

Both phases would require a significant amount of time and effort. Each
benchmark would likely require unique tweaks.

Thanks,
James
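
To make the first-phase idea concrete, here is a rough sketch of a harness
that runs every query and records failures rather than stopping at the first
error. It is only an illustration under assumptions: it uses Python's DB-API
with an in-memory SQLite database as a stand-in for JDBC and MySQL/Phoenix,
and the names QueryResult, run_queries, failures, demo, the orders_toy table,
and the three queries are all made up here (they are not real TPC queries).

```python
import sqlite3
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryResult:
    """Outcome of one query; hypothetical record type for this sketch."""
    name: str
    ok: bool
    rows: int
    seconds: float
    error: Optional[str] = None


def run_queries(conn, queries):
    """Run every named query, recording failures instead of stopping."""
    results = []
    for name, sql in queries.items():
        start = time.perf_counter()
        cur = conn.cursor()
        try:
            cur.execute(sql)
            rows = len(cur.fetchall())
            results.append(
                QueryResult(name, True, rows, time.perf_counter() - start))
        except Exception as exc:
            # Broad on purpose: any driver error is logged and the run goes on.
            results.append(
                QueryResult(name, False, 0, time.perf_counter() - start,
                            str(exc)))
        finally:
            cur.close()
    return results


def failures(results):
    """Queries that did not run; candidates for a bug report each."""
    return [r for r in results if not r.ok]


def demo():
    """Toy run against in-memory SQLite (assumed stand-in for a real RDBMS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders_toy (order_id INTEGER, qty INTEGER)")
    conn.executemany("INSERT INTO orders_toy VALUES (?, ?)",
                     [(1, 5), (1, 7), (2, 3)])
    queries = {
        "q_sum": "SELECT order_id, SUM(qty) FROM orders_toy GROUP BY order_id",
        "q_bad": "SELECT no_such_function(qty) FROM orders_toy",
        "q_count": "SELECT COUNT(*) FROM orders_toy",
    }
    results = run_queries(conn, queries)
    conn.close()
    return results


if __name__ == "__main__":
    for r in demo():
        status = "OK" if r.ok else "FAILED: " + str(r.error)
        print(r.name, status)
```

Because run_queries only relies on cursor(), execute() and fetchall(), the
same loop should work with any DB-API connection, so it could be pointed at
another RDBMS first and at Phoenix later; the list returned by failures() is
what would feed the JIRA-filing step.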