-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55154/#review160533
-----------------------------------------------------------




metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
 (line 53)
<https://reviews.apache.org/r/55154/#comment231682>

    I'm not convinced that this is a good method to add, since this is 
repl-specific, and adds complexity. Any presence of checksum must be encoded 
into the uris, so that when we call getFiles(), it contains it. Also, the files 
have no explicit meaning without the checksum, since they will not be stable 
uris. The getFiles() returned by InsertMessage should already be a CM uri that 
encodes the checksum, for eg: cm://hdfs%3A%2F%2Fblah$2Ffile1#abcdef1234567890 
might imply the file hdfs://blah/file1 with checksum "abcdef1234567890". I'm 
not super pick on the actual encoding mechanism used, but we want the 
getFiles() results to be uris that are stable uris - ones which, even if we 
don't have a FileSystem object associated with it directly, we can extract the 
info we want from it at the endpoint when we use it, and generate it when we 
generate it, and all areas in between simply pass it on without doing anything 
additional with it.
    
    Thus, the places I see "generating" this are either DbNotificationListener 
or fireInsertEvent(), or ReplCopyTask during a bootstrap dump. The only place I 
see extracting/consuming this uri would be in ReplCopyTask on destination. All 
other areas should not split this.



metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
 (line 376)
<https://reviews.apache.org/r/55154/#comment231687>

    We should not be adding more of these methods into JSONMessageFactory that 
add field names here. That knowledge should belong to the domain of the message 
itself. The existing methods that do this are currently slated for removal once 
we refactor DbNotificationListener to not depend on them.



ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
(line 576)
<https://reviews.apache.org/r/55154/#comment231678>

    The partspec can be obtained from insertMsg.getPartitionKeyValues() - we 
should'nt make calls to JSONMessageFactory here.
    
    JSONInsertMessage, in its implementation of getPartitionKeyValues, can, in 
turn, then call generic functions from JSONMessageFactory using knowledge it 
has about itself.
    
    There should'nt be any explicit calls to JSONMessageFactory from any class 
which is not a JSON*Message.
    
    See the previous ALTER patch and how it changed the ADD_PTNS/CREATE_TABLE 
processing for a reference.



ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
(line 595)
<https://reviews.apache.org/r/55154/#comment231681>

    We should not be making calls to JSONMessageFactory, or getting fields with 
knowledge of names such as "fileChecksums" or "files". Knowledge of fieldnames 
should be restricted to inside the message itself, which exposes api via its 
parent Message class.
    
    This should simply be a dump of what the InsertMessage.getFiles() returns 
and no more. Any encoding of checksum/etc that we do must happen in 
DbNotificationListener, or even possibly in fireInsertEvent, since the location 
is meaningless without the checksum.


- Sushanth Sowmyan


On Jan. 4, 2017, 12:59 p.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55154/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2017, 12:59 p.m.)
> 
> 
> Review request for hive, Daniel Dai, Sushanth Sowmyan, and Thejas Nair.
> 
> 
> Bugs: HIVE-15366
>     https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/HIVE-15366
> 
> 
> Diffs
> -----
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestReplicationScenarios.java
>  e29aa22 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/InsertMessage.java
>  fe747df 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONInsertMessage.java
>  bd9f9ec 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
>  9954902 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 4c0f817 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 6e9602f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java 
> f61274b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
> 5561e06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 
> 9b83407 
> 
> Diff: https://reviews.apache.org/r/55154/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>

Reply via email to