[ https://issues.apache.org/jira/browse/HADOOP-16621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999436#comment-16999436 ]

Vinayakumar B edited comment on HADOOP-16621 at 12/18/19 7:03 PM:
------------------------------------------------------------------

Sorry for being late.

The following public APIs were introduced by HADOOP-12563 back in 2016, in Hadoop 3.0.0:
{code:java}
// Construct a Token from its protobuf representation.
public Token(TokenProto tokenPB);

// Return the protobuf representation of this Token.
public TokenProto toTokenProto();
{code}
Ideally there should be no @Public API with protobuf types in its signature.
 Right now this breaks binary compatibility for downstream projects because of the protobuf version upgrade: the generated proto classes' superclass changed from {{GeneratedMessage}} (protobuf 2.5.0) to {{GeneratedMessageV3}} (protobuf 3.x).
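
To illustrate, here is a hypothetical downstream snippet (not taken from Spark; the class name is made up). It compiles and links fine against Hadoop built with protobuf 2.5.0, but once Hadoop's {{TokenProto}} is generated with protobuf 3.x the caller also has to build/run with the same protobuf 3.x on its classpath:
{code:java}
// Hypothetical downstream usage, for illustration only.
// Token#toTokenProto() exposes a protobuf-generated type in a @Public API, so
// this code is tied to whichever protobuf runtime Hadoop was generated against.
import org.apache.hadoop.security.proto.SecurityProtos.TokenProto;
import org.apache.hadoop.security.token.Token;

public class DownstreamTokenUser {
  static byte[] serialize(Token<?> token) {
    TokenProto proto = token.toTokenProto(); // returns a protobuf-generated class
    return proto.toByteArray();              // resolved via the protobuf runtime on the classpath
  }
}
{code}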

So the only possible options to proceed are:
 # Remove all public methods with a protobuf type in the signature and replace them with helper classes that do the same job, as is done in HDFS' {{PBHelperClient.java}}. This breaks compatibility if, by any chance, these methods are being used outside the hadoop-common module (and outside the Hadoop project overall, since all Hadoop components are upgraded together).
 # Mark the methods deprecated, keep the old {{TokenProto}} class generated with protobuf 2.5.0 committed to the repo, rename the current {{TokenProto}} to {{TokenProto3}} along with all its occurrences throughout the project (hopefully {{TokenProto}} is not used outside the Hadoop project), and skip shading of the 2.5.0 {{TokenProto}}. The deprecated methods and the committed {{TokenProto}} class can be removed later.

 

Approach #1 would be an easy and direct change, but again there is a compatibility issue if these methods are used by other projects, which is most unlikely.

[~ste...@apache.org] / [~vinodkv] / [~raviprak] is it okay to remove the above-mentioned methods and replace them with something similar to {{PBHelperClient#convert(Token<?> tok)}} and {{PBHelperClient#convert(TokenProto tok)}}?
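
For reference, a minimal sketch of what such replacement helpers could look like. This is not the actual Hadoop code; the class name {{TokenPBHelper}}, the imports, and the exact field mapping are assumptions modelled on HDFS' {{PBHelperClient}}:
{code:java}
// Sketch only: a helper that keeps protobuf types out of Token's public API.
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.proto.SecurityProtos.TokenProto;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;
import com.google.protobuf.ByteString;

public final class TokenPBHelper {
  private TokenPBHelper() {}

  /** Token -> TokenProto, replacing Token#toTokenProto(). */
  public static TokenProto convert(Token<?> tok) {
    return TokenProto.newBuilder()
        .setIdentifier(ByteString.copyFrom(tok.getIdentifier()))
        .setPassword(ByteString.copyFrom(tok.getPassword()))
        .setKind(tok.getKind().toString())
        .setService(tok.getService().toString())
        .build();
  }

  /** TokenProto -> Token, replacing the Token(TokenProto) constructor. */
  public static Token<? extends TokenIdentifier> convert(TokenProto proto) {
    return new Token<>(proto.getIdentifier().toByteArray(),
        proto.getPassword().toByteArray(),
        new Text(proto.getKind()),
        new Text(proto.getService()));
  }
}
{code}
Downstream users would then call {{TokenPBHelper.convert(token)}} instead of {{token.toTokenProto()}}, so protobuf no longer appears in {{Token}}'s signature.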

 

Approach #2 is a workaround that still keeps compatibility, but unnecessary (most likely unused) code will be present in the repo.

Also, #2 is possible only after HADOOP-16596 is in, to support protobuf 2.5.0 and 3.x together.

This change is essentially mandatory to allow Spark (and other projects that merely import the Token class) to compile and run successfully without having to explicitly set their protobuf version to the same one Hadoop uses.

 

Please let me know your opinions.


> [pb-upgrade] spark-hive doesn't compile against hadoop trunk because of 
> Token's marshalling
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16621
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16621
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> the move to protobuf 3.x stops Spark building because Token has a method 
> which returns a protobuf, and now it's returning some v3 types.
> if we want to isolate downstream code from protobuf changes, we need to move 
> that marshalling method out of Token and put it in a helper class.


