[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544659#comment-14544659 ]
Sangjin Lee commented on YARN-3411: ----------------------------------- The latest patch looks pretty good. I think we're almost there. I went over it one more time bit more closely, and have mostly minor comments. Some are performance-related so you might want to consider them. Hopefully it won't be too difficult to address these... (EntityColumnDetails.java) - let's add Private and Unstable to this (EntityTableDetails.java) - let's add Private and Unstable to this - l.27: this also should be non-public? (HBaseTimelineWriterImpl.java) - l.88: (nit) you must be sick of me saying it, but let's declare row inside the for loop, and also use a more standard declaration (byte[] row = ...) - l.165: it might be a minor optimization, but we might want to reuse bytes for te.getType() and te.getId() for other calls that need them (e.g. getInfoPut) - l.234: we should get the bytes for the key *once* outside the inner for loop and reuse it - l.274: we should get the bytes for the id *once* outside the inner for loop and reuse it (Range.java) - let's add Private and Unstable to this (TimelineSchemaCreator.java) - l.137: I think we should use LOG.error() as we have the logger - l.144-147: I'm unsure why we're using the log4j logger when we're using the Apache Commons logger; is this for HBase debug logging? (TimelineWriterUtil.java) - l.53: do we need to handle the case where a component byte array is empty? - l.53: it's not critical but it could be good to handle the case where components.length = 1 (just return it) - l.181,197: what are "taskValues"? (TestHBaseTimelineWriterImpl.java) - l.56-60: the javadoc should be before the class name - l.56: (nit) remove PoC - l.62: (nit) it should be util (not upper case) - l.141: (nit) it should call init() (best practice) - l.147: (nit) should be long, not Long - l.212: (nit) it should call stop() (best practice) > [Storage implementation] explore the native HBase write schema for storage > -------------------------------------------------------------------------- > > Key: YARN-3411 > URL: https://issues.apache.org/jira/browse/YARN-3411 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver > Reporter: Sangjin Lee > Assignee: Vrushali C > Priority: Critical > Attachments: ATSv2BackendHBaseSchemaproposal.pdf, > YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, > YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, > YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, > YARN-3411.poc.txt > > > There is work that's in progress to implement the storage based on a Phoenix > schema (YARN-3134). > In parallel, we would like to explore an implementation based on a native > HBase schema for the write path. Such a schema does not exclude using > Phoenix, especially for reads and offline queries. > Once we have basic implementations of both options, we could evaluate them in > terms of performance, scalability, usability, etc. and make a call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)