Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8529 )

Change subject: [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web 
UI.
......................................................................


Patch Set 2:

(28 comments)

Thanks for the review Bharath. Many nice suggestions. There will be a sequence 
of patches to improve our ability to monitor/debug the catalog. I just want to 
avoid cramming too much stuff in a single patch which is already kind of big.

http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.h
File be/src/catalog/catalog-server.h:

http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.h@201
PS2, Line 201: used
> Clarify what is 'used'?
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.h@217
PS2, Line 217: Update
> nit: May be 'GetTopNTableUsageMetrics()' or something (I know it sounds hor
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.h@224
PS2, Line 224: object_name
> nit: probably just 'name'? Its implicit (from the URL) that its a table.
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc@359
PS2, Line 359:   UpdateCatalogUsage(document);
> Whats your take on moving top-x stuff to the /metrics page and populating t
In principal, that's a good idea. I plan on doing it in the future for more 
generic catalog metrics. It will require some thinking into how to best 
integrate the JVM metrics with the metrics management system we use in the 
backend, so I thought of postponing it for another patch.


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc@390
PS2, Line 390:  has_metrics.SetBool(true);
> Just curious, whats the use of has_metrics=true/has_large_tables=true? We c
That's for preventing the impalad web ui from trying to print the catalog usage 
metrics. I had issue doing this from mustache, so I resorted to using this 
boolean flag.


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog-server.cc@492
PS2, Line 492:   Webserver::ArgumentMap::const_iterator object_name_arg = 
args.find("object_name");
> should we enable json view for this? (?json)
Good idea. Left a TODO.


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog.h
File be/src/catalog/catalog.h:

http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog.h@90
PS2, Line 90: in
> nit: into?
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/be/src/catalog/catalog.h@96
PS2, Line 96:   Status GetCatalogUsage(TGetCatalogUsageResponse* response);
> Same comment as in the other header, may be rename to convey that it is Get
I renamed the other one to GetCatalogUsage. I thought of keeping it a bit more 
generic in case we want to add other stuff as well. Thoughts?


http://gerrit.cloudera.org:8080/#/c/8529/2/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/8529/2/common/thrift/Frontend.thrift@90
PS2, Line 90: // Arguments to getTableMetrics, which returns the metrics of a 
specific table.
            : struct TGetTableMetricsParams {
            :   1: required string db
            :   2: required string tbl
            : }
            :
            : // Response to a getTableMetrics request. The response contains 
all the collected metrics
            : // pretty-printed into a string.
            : struct TGetTableMetricsResponse {
            :   1: required string metrics
            : }
> Can't we use TTableName and String directly? (or) probably use typedefs
Good point about TTableName (used that). I kept the TGetTableMetrics* structs 
for consistency with the other API calls.


http://gerrit.cloudera.org:8080/#/c/8529/2/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/8529/2/common/thrift/JniCatalog.thrift@604
PS2, Line 604: Currently, it keeps track of the
             : // estimated memory usage and the number of metadata operations 
that were performed on
             : // that table since it was loaded.
> Remove here and describe below may be, can likely become stale soon (Same f
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/common/thrift/JniCatalog.thrift@607
PS2, Line 607: Stats
> Looks like we are using Stats/Metrics interchangeably. Maybe use Metrics ev
Good point. Done


http://gerrit.cloudera.org:8080/#/c/8529/2/common/thrift/JniCatalog.thrift@608
PS2, Line 608: required string table_name
> TTableName?
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/Catalog.java@154
PS2, Line 154: Table tbl = db.getTable(tableName);
             :     if (tbl != null) {
             :       tbl.incrementMetadataOpsCount();
             :       
CatalogUsageMonitor.INSTANCE.updateFrequentlyAccessedTables(tbl);
             :     }
> Looks subtle and flaky to me. This seems to work on the premise that we cal
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1007
PS2, Line 1007:     final Timer.Context context
> Should the scope include tryLockTable()?
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1372
PS2, Line 1372:   public TGetCatalogUsageResponse getCatalogUsage() {
> getTopN...Metrics?
See previous comment. Thought best to keep it more generic. Let me know what 
you think.


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1391
PS2, Line 1391:   public String getTableMetrics(String dbName, String tblName) 
throws CatalogException {
> Doc?
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1392
PS2, Line 1392:     Table tbl = getTable(dbName, tblName);
> This is the problem I was talking about. This call increments the metadatOp
Yeap, good point. Fixed it.


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1398
PS2, Line 1398: No metrics available for table " + dbName + "." + tblName
> Also may be add "Table not yet loaded..."? Its unclear why metrics are not
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
File fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@47
PS2, Line 47: true
> It is unclear why evict should be true (or) whats the usecase for that.
I wanted to avoid the scenario where 25 tables that were accessed in the past 
prevent any other table from showing in the top n. In your example, if the 25 
tables are still relevant (accessed frequently), they will keep showing up in 
the topN. Essentially, all the infrequent ones will be competing for the last 
spot in the list. Obviously, there are other better ways to implement something 
like that but that was just one simple approach. What I really want is an LFU 
cache with dynamic aging.


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@221
PS2, Line 221: StorageStats
> Maybe call FileMetadataStats to be ni sync with rest of the class?
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1201
PS2, Line 1201:   public void load(boolean reuseMetadata, IMetaStoreClient 
client,
> Make rest of the load()'s private so that all loads happen via this call an
Unfortunately, they both need to be public (sigh). Since the former calls the 
latter, I moved the timer there to make sure we capture all calls.


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@234
PS2, Line 234:                                         kuduTableName_, e);
> formatting (below too)
Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/catalog/Table.java@135
PS2, Line 135: public void setEstimatedMetadataSize(long estimatedMetadataSize) 
{
             :     estimatedMetadataSize_.set(estimatedMetadataSize);
             :   }
             :
             :   public void incrementMetadataOpsCount() { 
metadataOpsCount_.incrementAndGet(); }
> Can these make a call to CatalogUsageMonitor.increment*(this) too? We can a
Good point. Done


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/common/Metrics.java
File fe/src/main/java/org/apache/impala/common/Metrics.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/common/Metrics.java@143
PS2, Line 143:
> %le ?
Not sure I follow. Why le? I just want to print the % to indicate percentile.


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/common/Metrics.java@147
PS2, Line 147:   }
> Same as my comment in the backend, can we expose json versions of these? to
Added a TODO.


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@259
PS2, Line 259:
> Can we add gauge metrics around this switch-case block to count DDLs by typ
Good point. That will be added in a follow up patch where we will add these 
types of metrics (not tied to specific objects). We also want to add a a view 
of in-flight catalog operations (similar to what impalads have in queries/ tab).


http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/util/TopNCache.java
File fe/src/main/java/org/apache/impala/util/TopNCache.java:

http://gerrit.cloudera.org:8080/#/c/8529/2/fe/src/main/java/org/apache/impala/util/TopNCache.java@41
PS2, Line 41: R extends Long>
> Do we need this? It is almost always Long, no?
Yes, it's kind of annoying to have to specify it here. I need it for the 
function_ declaration though.


http://gerrit.cloudera.org:8080/#/c/8529/2/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/8529/2/www/catalog.tmpl@23
PS2, Line 23: {{?has_large_tables}}
> Like I mentioned elsewhere, my feeling is that these belong to metrics and
You can get ?json of that page and get the metrics (I just tried it). We need 
to think about the catalog metrics a bit more. The metrics exposed in this 
patch are tied to specific objects (tables that come and go). To my mind there 
is another class of "generic" metrics to capture the catalog's behavior and I 
think this should definitely be in the /metrics page.



--
To view, visit http://gerrit.cloudera.org:8080/8529
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e
Gerrit-Change-Number: 8529
Gerrit-PatchSet: 2
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Mon, 11 Dec 2017 19:43:56 +0000
Gerrit-HasComments: Yes

Reply via email to