[
https://issues.apache.org/jira/browse/YARN-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081472#comment-15081472
]
Junping Du commented on YARN-4265:
----------------------------------
Thanks [~gtCarrera9] for clarification. Assume Jason is fine with going on with
this patch, I quickly go through the v3 patch.
with following comments so far (I haven't finished my review yet as patch is
pretty big):
In YarnConfiguration.java,
{code}
TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_SCAN_INTERVAL_SECONDS_DEFAULT = 60;
{code}
I noticed that we are setting 1 minutes as default scan interval but original
patch in HDFS-3942 is 5 minutes. Why shall we do any update here? The same
question on "app-cache-size", the default value in HDFS-3942 is 5 but here is
10. Any reason to update the value?
In yarn-default.xml,
{code}
+ <description>DFS path to store active application’s timeline
data</description>
...
+ <description>DFS path to store done application’s timeline
data</description>
{code}
DFS is very old name, use HDFS instead to be more clear.
Why we don't have any default value specified in property of
"yarn.timeline-service.entity-group-fs-store.group-id-plugin-classes"?
In hadoop-yarn-server-timeline-pluginstorage/pom.xml,
For EmptyTimelineEntityGroupPlugin.java, why we need this class? I didn't see
any help provided even in tests. We should remove it if useless.
In EntityCacheItem.java,
We should have a description for this class in Javadoc.
Can we optimize the synchronization logic here? Like in synchronized method
refreshCache, we are intialize/start/stop TimelineDataManager (and
MemoryTimelineStore) which is quite expensive and unnecessary to block other
synchronized operations. Shall we move these operations out of synchronized
block?
{code}
+ LOG.warn("Error closing datamanager", e);
{code}
I think we are closing store here instead of datamanager. Isn't it?
{code}
+ public boolean needRefresh() {
+ //TODO: make a config for cache freshness
+ return (Time.monotonicNow() - lastRefresh > 10000);
+ }
{code}
Does refresh interval here need to do any coordination with scan interval
specificed in
"yarn.timeline-service.entity-group-fs-store.scan-interval-seconds"?
In EntityGroupFSTimelineStore.java,
{code}
+ if (appState != AppState.UNKNOWN) {
+ appLogs = new AppLogs(applicationId, appDirPath, appState);
+ LOG.debug("Create and try to add new appLogs to appIdLogMap for {}",
+ applicationId);
+ AppLogs oldAppLogs = appIdLogMap.putIfAbsent(applicationId, appLogs);
+ if (oldAppLogs != null) {
+ appLogs = oldAppLogs;
+ }
+ }
{code}
This logic is very similiar with method of getAndSetActiveLog(). Can we
consolidate them together?
If parseSummaryLogs() is synchronized, it seems getSummaryLogs() should be
synchronized too or the getter will get stale(half-done) result.
Still checking if other multi-threads issues. More comments will come soon.
> Provide new timeline plugin storage to support fine-grained entity caching
> --------------------------------------------------------------------------
>
> Key: YARN-4265
> URL: https://issues.apache.org/jira/browse/YARN-4265
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Li Lu
> Assignee: Li Lu
> Attachments: YARN-4265-trunk.001.patch,
> YARN-4265.YARN-4234.001.patch, YARN-4265.YARN-4234.002.patch
>
>
> To support the newly proposed APIs in YARN-4234, we need to create a new
> plugin timeline store. The store may have similar behavior as the
> EntityFileTimelineStore proposed in YARN-3942, but cache date in cache id
> granularity, instead of application id granularity. Let's have this storage
> as a standalone one, instead of updating EntityFileTimelineStore, to keep the
> existing store (EntityFileTimelineStore) stable.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)