[jira] [Commented] (YARN-4265) Provide new timeline plugin storage to support fine-grained entity caching

Junping Du (JIRA) Mon, 04 Jan 2016 09:54:53 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081472#comment-15081472
 ]


Junping Du commented on YARN-4265:
----------------------------------

Thanks [~gtCarrera9] for clarification. Assume Jason is fine with going on with 
this patch, I quickly go through the v3 patch. 
with following comments so far (I haven't finished my review yet as patch is 
pretty big):

In YarnConfiguration.java,
{code}
TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_SCAN_INTERVAL_SECONDS_DEFAULT = 60;
{code}
I noticed that we are setting 1 minutes as default scan interval but original 
patch in HDFS-3942 is 5 minutes. Why shall we do any update here? The same 
question on "app-cache-size", the default value in HDFS-3942 is 5 but here is 
10. Any reason to update the value?

In yarn-default.xml,
{code}
+    <description>DFS path to store active application’s timeline 
data</description>
...
+    <description>DFS path to store done application’s timeline 
data</description>
{code}
DFS is very old name, use HDFS instead to be more clear.

Why we don't have any default value specified in property of 
"yarn.timeline-service.entity-group-fs-store.group-id-plugin-classes"?

In hadoop-yarn-server-timeline-pluginstorage/pom.xml,


For EmptyTimelineEntityGroupPlugin.java, why we need this class? I didn't see 
any help provided even in tests. We should remove it if useless.

In EntityCacheItem.java,
We should have a description for this class in Javadoc.

Can we optimize the synchronization logic here? Like in synchronized method 
refreshCache, we are intialize/start/stop TimelineDataManager (and 
MemoryTimelineStore) which is quite expensive and unnecessary to block other 
synchronized operations. Shall we move these operations out of synchronized 
block?

{code}
+      LOG.warn("Error closing datamanager", e);
{code}
I think we are closing store here instead of datamanager. Isn't it?

{code}
+  public boolean needRefresh() {
+    //TODO: make a config for cache freshness
+    return (Time.monotonicNow() - lastRefresh > 10000);
+  }
{code}
Does refresh interval here need to do any coordination with scan interval 
specificed in 
"yarn.timeline-service.entity-group-fs-store.scan-interval-seconds"?

In EntityGroupFSTimelineStore.java,

{code}
+      if (appState != AppState.UNKNOWN) {
+        appLogs = new AppLogs(applicationId, appDirPath, appState);
+        LOG.debug("Create and try to add new appLogs to appIdLogMap for {}",
+            applicationId);
+        AppLogs oldAppLogs = appIdLogMap.putIfAbsent(applicationId, appLogs);
+        if (oldAppLogs != null) {
+          appLogs = oldAppLogs;
+        }
+      }
{code}
This logic is very similiar with method of getAndSetActiveLog(). Can we 
consolidate them together?

If parseSummaryLogs() is synchronized, it seems getSummaryLogs() should be 
synchronized too or the getter will get stale(half-done) result.

Still checking if other multi-threads issues. More comments will come soon.

> Provide new timeline plugin storage to support fine-grained entity caching
> --------------------------------------------------------------------------
>
>                 Key: YARN-4265
>                 URL: https://issues.apache.org/jira/browse/YARN-4265
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-4265-trunk.001.patch, 
> YARN-4265.YARN-4234.001.patch, YARN-4265.YARN-4234.002.patch
>
>
> To support the newly proposed APIs in YARN-4234, we need to create a new 
> plugin timeline store. The store may have similar behavior as the 
> EntityFileTimelineStore proposed in YARN-3942, but cache date in cache id 
> granularity, instead of application id granularity. Let's have this storage 
> as a standalone one, instead of updating EntityFileTimelineStore, to keep the 
> existing store (EntityFileTimelineStore) stable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (YARN-4265) Provide new timeline plugin storage to support fine-grained entity caching

Reply via email to