----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/55154/#review160533 -----------------------------------------------------------
metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java (line 53) <https://reviews.apache.org/r/55154/#comment231682> I'm not convinced that this is a good method to add, since this is repl-specific, and adds complexity. Any presence of checksum must be encoded into the uris, so that when we call getFiles(), it contains it. Also, the files have no explicit meaning without the checksum, since they will not be stable uris. The getFiles() returned by InsertMessage should already be a CM uri that encodes the checksum, for eg: cm://hdfs%3A%2F%2Fblah$2Ffile1#abcdef1234567890 might imply the file hdfs://blah/file1 with checksum "abcdef1234567890". I'm not super pick on the actual encoding mechanism used, but we want the getFiles() results to be uris that are stable uris - ones which, even if we don't have a FileSystem object associated with it directly, we can extract the info we want from it at the endpoint when we use it, and generate it when we generate it, and all areas in between simply pass it on without doing anything additional with it. Thus, the places I see "generating" this are either DbNotificationListener or fireInsertEvent(), or ReplCopyTask during a bootstrap dump. The only place I see extracting/consuming this uri would be in ReplCopyTask on destination. All other areas should not split this. metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java (line 376) <https://reviews.apache.org/r/55154/#comment231687> We should not be adding more of these methods into JSONMessageFactory that add field names here. That knowledge should belong to the domain of the message itself. The existing methods that do this are currently slated for removal once we refactor DbNotificationListener to not depend on them. ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java (line 576) <https://reviews.apache.org/r/55154/#comment231678> The partspec can be obtained from insertMsg.getPartitionKeyValues() - we should'nt make calls to JSONMessageFactory here. JSONInsertMessage, in its implementation of getPartitionKeyValues, can, in turn, then call generic functions from JSONMessageFactory using knowledge it has about itself. There should'nt be any explicit calls to JSONMessageFactory from any class which is not a JSON*Message. See the previous ALTER patch and how it changed the ADD_PTNS/CREATE_TABLE processing for a reference. ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java (line 595) <https://reviews.apache.org/r/55154/#comment231681> We should not be making calls to JSONMessageFactory, or getting fields with knowledge of names such as "fileChecksums" or "files". Knowledge of fieldnames should be restricted to inside the message itself, which exposes api via its parent Message class. This should simply be a dump of what the InsertMessage.getFiles() returns and no more. Any encoding of checksum/etc that we do must happen in DbNotificationListener, or even possibly in fireInsertEvent, since the location is meaningless without the checksum. - Sushanth Sowmyan On Jan. 4, 2017, 12:59 p.m., Vaibhav Gumashta wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/55154/ > ----------------------------------------------------------- > > (Updated Jan. 4, 2017, 12:59 p.m.) > > > Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair. > > > Bugs: HIVE-15366 > https://issues.apache.org/jira/browse/HIVE-15366 > > > Repository: hive-git > > > Description > ------- > > https://issues.apache.org/jira/browse/HIVE-15366 > > > Diffs > ----- > > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java > e29aa22 > > metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java > fe747df > > metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java > bd9f9ec > > metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java > 9954902 > ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 > ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f > ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java > f61274b > ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java > 5561e06 > > ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java > 9b83407 > > Diff: https://reviews.apache.org/r/55154/diff/ > > > Testing > ------- > > > Thanks, > > Vaibhav Gumashta > >