While solving the problem in Phoenix is great, I've also been mulling this over in Apache Avatica (the tech behind PQS). I neglected to mention it earlier: https://issues.apache.org/jira/browse/CALCITE-1311. Being able to dump these metrics anywhere is ideal, as James points out.

But, to your last question, Michael, I don't think you should expect to see any analysis tools provided by Phoenix or Avatica -- just the data in a consumable format. However, dumping the data to a file and then using Phoenix to import those query metrics for analysis sounds pretty groovy.
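For anyone wanting to try that today over the thick driver, the pattern James describes is roughly the following. Treat it as an untested sketch: the PhoenixRuntime method names are taken from the metrics page linked further down in the thread and have shifted between releases, and the connection URL, table name, and property name are placeholders to check against your own install.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

import org.apache.phoenix.util.PhoenixRuntime;

public class QueryMetricsLogLine {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Request-level metric collection is off by default (assumed property name;
        // verify against your Phoenix release).
        props.setProperty("phoenix.query.request.metrics.enabled", "true");

        String sql = "SELECT * FROM MY_TABLE LIMIT 100";
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {

            while (rs.next()) {
                // drain the result set so the metrics cover the whole scan
            }

            // One line per query: overall wall-clock metrics plus per-table read metrics.
            System.out.println("query=\"" + sql + "\""
                + " overall=" + PhoenixRuntime.getOverAllReadRequestMetrics(rs)
                + " perTable=" + PhoenixRuntime.getRequestReadMetrics(rs));

            // Clear the accumulated metrics before the next statement.
            PhoenixRuntime.resetMetrics(rs);
        }
    }
}

Swap the System.out line for a log appender, a flat file, or an UPSERT into a metrics table and you have something Splunk -- or Phoenix itself -- can chew on.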
On Wed, Apr 12, 2017 at 9:55 PM, James Taylor <[email protected]> wrote:
> The JIRA for making sure that the metrics can be collected through PQS is
> PHOENIX-3655. At SFDC, we have a layer on top of Phoenix that's outputting a
> log line with the metrics information. A better abstraction IMHO would be to
> output metrics through JMX (see PHOENIX-3247) to make it easier to hook up
> other clients to show this metric information than getting it from logs
> through a tool like Splunk.
>
> If you're interested in contributing to either of those JIRAs, please let us
> know.
>
> Thanks,
> James
>
> On Wed, Apr 12, 2017 at 6:24 PM, Michael Young <[email protected]> wrote:
>>
>> James,
>>
>> Our users access Phoenix via the query server. Is there a way to use this
>> metric framework with PQS to log user query data (e.g. number of queries,
>> number of failures, execution time, 90/95/99 percentiles, topN queries,
>> etc.)? We'd like to know what users are running and what might cause
>> performance issues or errors during usage.
>>
>> Also, any other way to analyze these logs besides Splunk?
>>
>> On Tue, Apr 11, 2017 at 4:05 PM, James Taylor <[email protected]> wrote:
>>>
>>> FWIW, we use our metric framework [1] to collect that information,
>>> outputting one log line per statement or query, and then Splunk to
>>> look at it.
>>>
>>> [1] http://phoenix.apache.org/metrics.html
>>>
>>> On Tue, Apr 11, 2017 at 3:24 PM, Michael Young <[email protected]> wrote:
>>>>
>>>> Yes, the tracing webapp is working, at least in our distro, the HDP 2.5
>>>> release.
>>>>
>>>> However, it seemed to negatively impact our performance and created a
>>>> large volume of trace data, which was somewhat overwhelming.
>>>>
>>>> We could not get simple SELECT query logging and query time info from
>>>> the trace logs it produces, so it didn't seem suited to the kind of
>>>> simple query logging we need in our use case. I suppose it is more
>>>> for detailed tracing use cases.
>>>>
>>>> Cheers,
>>>> Michael
>>>>
>>>> On Mon, Apr 3, 2017 at 2:28 PM, Ryan Templeton <[email protected]> wrote:
>>>>>
>>>>> I see there's a phoenix-tracing-webapp project in the build plus this
>>>>> on the website - https://phoenix.apache.org/tracing.html
>>>>>
>>>>> Is this project still working and usable? The project looks like it's
>>>>> had updates as of a few months ago…
>>>>>
>>>>> Thanks,
>>>>> Ryan
>>>>>
>>>>> On 3/3/17, 10:33 AM, "Josh Elser" <[email protected]> wrote:
>>>>>
>>>>> > https://github.com/apache/calcite/blob/master/avatica/server/src/main/java/org/apache/calcite/avatica/jdbc/JdbcMeta.java
>>>>> >
>>>>> > This is ultimately where the requests from the client using the thin
>>>>> > JDBC driver get executed inside of PQS. The APIs aren't a 100% match,
>>>>> > but it should be obvious what is getting invoked once you're familiar
>>>>> > with the JDBC APIs.
>>>>> >
>>>>> > Michael Young wrote:
>>>>> >> Josh,
>>>>> >> I am interested in looking at adding this to Avatica myself,
>>>>> >> although I'm not familiar with that code base.
>>>>> >>
>>>>> >> Can you point me to where in the avatica code I should look at to
>>>>> >> add this logging?
>>>>> >>
>>>>> >> Cheers
>>>>> >>
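To give a sense of how small that stopgap could be: it's essentially a timed log line around statement execution. The sketch below is written against plain JDBC with a made-up class and logger, not the actual Avatica code path -- in PQS the equivalent spot is where JdbcMeta hands the SQL to the underlying JDBC Statement.

import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical wrapper: logs every query's SQL text and wall-clock duration.
public final class LoggingStatement {
    private static final Logger LOG = LoggerFactory.getLogger(LoggingStatement.class);

    private final Statement delegate;

    public LoggingStatement(Statement delegate) {
        this.delegate = delegate;
    }

    public ResultSet executeQuery(String sql) throws SQLException {
        long start = System.currentTimeMillis();
        boolean failed = true;
        try {
            ResultSet rs = delegate.executeQuery(sql);
            failed = false;
            return rs;
        } finally {
            // One line per query: the raw SQL, elapsed millis, and success/failure.
            LOG.info("sql=\"{}\" durationMs={} failed={}",
                sql, System.currentTimeMillis() - start, failed);
        }
    }
}

Note that this only times executeQuery() itself; fetching the rows back to the client isn't included, which matters if you want true end-to-end numbers.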
>>>>> >> On Tue, Feb 28, 2017 at 4:15 AM, Josh Elser <[email protected]> wrote:
>>>>> >>
>>>>> >> No, I don't believe there is any log4j logging done in PQS that
>>>>> >> would show queries being executed.
>>>>> >>
>>>>> >> Ideally, we would have a "query log" in Phoenix which would present
>>>>> >> an interface to this data, and it wouldn't require anything special
>>>>> >> in PQS. However, I wouldn't be opposed to some trivial additions to
>>>>> >> PQS (Avatica, really) to add some simple logging as a stopgap.
>>>>> >>
>>>>> >> On Feb 27, 2017 20:49, "Michael Young" <[email protected]> wrote:
>>>>> >>
>>>>> >> I hadn't seen a reply to my earlier question.
>>>>> >>
>>>>> >> We have business analysts running queries using BI tools (like
>>>>> >> Tableau) which connect via the Phoenix Query Server.
>>>>> >>
>>>>> >> How can we log all SELECT queries (raw query, start time, end
>>>>> >> time, etc.)?
>>>>> >>
>>>>> >> Any way to tweak log4j or other properties to get this? The TRACE
>>>>> >> logging I tried (mentioned in my post above) was way too dense to
>>>>> >> be useful for reporting usage, and doesn't seem to show the full
>>>>> >> SQL query params and query start/end times. Also, it logs every
>>>>> >> UPSERT during data load (which overwhelms the log files). We
>>>>> >> really just need SELECTs logged.
>>>>> >>
>>>>> >> On Tue, Jan 31, 2017 at 5:10 PM, Michael Young <[email protected]> wrote:
>>>>> >>
>>>>> >> Does the Phoenix Query Server have an option to log the SQL
>>>>> >> statements which are executed?
>>>>> >>
>>>>> >> We see there are ways to get various PQS trace logs by modifying
>>>>> >> the log4j settings used by queryserver.py:
>>>>> >>
>>>>> >> log4j.org.apache.phoenix.jdbc.PhoenixStatementFactory=TRACE (or DEBUG)
>>>>> >> log4j.org.apache.phoenix.jdbc.PhoenixStatement=TRACE
>>>>> >> log4j.logger.org.apache.calcite.avatica=TRACE
>>>>> >> log4j.logger.org.apache.phoenix.queryserver.server=TRACE
>>>>> >> etc...
>>>>> >>
>>>>> >> but the data in the trace logs (which show SQL statements) is not
>>>>> >> particularly user friendly, and it does not seem straightforward
>>>>> >> to get end-to-end query execution times.
>>>>> >>
>>>>> >> Any suggestions how to get simple SQL logs (raw query, execution
>>>>> >> time, ...)? The idea is to monitor user activity and take action
>>>>> >> if query times are slow, or queries time out.
>>>>> >>
>>>>> >> Thanks,
>>>>> >> Michael
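Lastly, to put a bit more shape on James's JMX suggestion (PHOENIX-3247) above: publishing query counters over the platform MBeanServer needs nothing beyond the standard javax.management API. Everything in this sketch -- the object name, the attributes, the record() hook -- is invented for illustration and is not what the JIRA will actually define.

// QueryStats.java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.ObjectName;

public class QueryStats implements QueryStatsMBean {
    private final AtomicLong queries = new AtomicLong();
    private final AtomicLong failures = new AtomicLong();

    public long getQueryCount()   { return queries.get(); }
    public long getFailureCount() { return failures.get(); }

    // Call this from wherever a query finishes (e.g. the execute path sketched earlier).
    public void record(boolean failed) {
        queries.incrementAndGet();
        if (failed) {
            failures.incrementAndGet();
        }
    }

    // Register once at server startup; the ObjectName is made up for this sketch.
    public static QueryStats registerMBean() throws Exception {
        QueryStats stats = new QueryStats();
        ManagementFactory.getPlatformMBeanServer()
            .registerMBean(stats, new ObjectName("org.example.pqs:type=QueryStats"));
        return stats;
    }
}

// QueryStatsMBean.java (separate file; the standard MBean pattern requires a
// public interface named <ImplClass>MBean)
public interface QueryStatsMBean {
    long getQueryCount();
    long getFailureCount();
}

Register it once at startup, bump the counters from the execute path, and JConsole or any JMX scraper can read the numbers without grepping logs.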
