Re: hadoop-hdfs-client splitoff is going to break code

2015-10-23 Thread Kihwal Lee
I think a lot of "client-side" tests use MiniDFSCluster. I know mechanical 
division is possible, but what about test coverage?
Kihwal

  From: Haohui Mai 
 To: hdfs-dev@hadoop.apache.org; Kihwal Lee  
Cc: "common-...@hadoop.apache.org"  
 Sent: Friday, October 23, 2015 4:43 PM
 Subject: Re: hadoop-hdfs-client splitoff is going to break code
   
All tests that need to spin up a MiniDFSCluster will need to stay in
hadoop-hdfs. Other client-only tests are being moved to the
hadoop-hdfs-client module, which is tracked in HDFS-9168.

~Haohui



On Fri, Oct 23, 2015 at 2:14 PM, Kihwal Lee
 wrote:
> I am not sure whether it was mentioned by anyone before, but I noticed that
> client-only changes do not trigger running any tests in hdfs-precommit. This is
> because hadoop-hdfs-client does not have any tests.
> Kihwal
>
>      From: Colin P. McCabe 
>  To: "hdfs-dev@hadoop.apache.org" 
> Cc: "common-...@hadoop.apache.org" 
>  Sent: Monday, October 19, 2015 4:01 PM
>  Subject: Re: hadoop-hdfs-client splitoff is going to break code
>
> Thanks for being proactive here, Steve.  I think this is a good example of
> why this change should have been done in a branch rather than having been
> done directly in trunk.
>
> regards,
> Colin
>
>
>
>
> On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran 
> wrote:
>
>> just an FYI, the split off of hadoop hdfs into client and server is going
>> to break things.
>>
>> I know that, as my code is broken; DFSConfigKeys off the path,
>> HdfsConfiguration, the class I've been loading to force pickup of
>> hdfs-site.xml -all missing.
>>
>> This is because hadoop-client  POM now depends on hadoop-hdfs-client, not
>> hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad
>> about DFSConfigKeys, as everybody uses it as the one hard-coded resource of
>> HDFS constants, HDFS-6566 covering the issue of making this public,
>> something that's been sitting around for a year.
>>
>> I'm fixing my build by explicitly adding a hadoop-hdfs dependency.
>>
>> Any application which used stuff which has now been declared server-side
>> isn't going to compile any more, which does appear to break the
>> compatibility guidelines we've adopted, specifically "The hadoop-client
>> artifact (maven groupId:artifactId) stays compatible within a major release"
>>
>>
>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts
>>
>>
>> We need to do one of
>>
>> 1. Agree that this change is considered acceptable according to policy,
>> and mark it as incompatible in hdfs/CHANGES.TXT
>> 2. Change the POMs to add both hadoop-hdfs-client and hadoop-hdfs (server) in
>> hadoop-client -with downstream users free to exclude the server code
>>
>> We unintentionally caused similar grief with the move of the s3n clients
>> to hadoop-aws , HADOOP-11074 -something we should have picked up and -1'd.
>> This time we know the problems going to arise, so let's explicitly make a
>> decision this time, and share it with our users.
>>
>> -steve
>>
>
>
>


  

Re: hadoop-hdfs-client splitoff is going to break code

2015-10-23 Thread Haohui Mai
All tests that need to spin up a MiniDFSCluster will need to stay in
hadoop-hdfs. Other client-only tests are being moved to the
hadoop-hdfs-client module, which is tracked in HDFS-9168.

~Haohui

On Fri, Oct 23, 2015 at 2:14 PM, Kihwal Lee
 wrote:
> I am not sure whether it was mentioned by anyone before, but I noticed that
> client-only changes do not trigger running any tests in hdfs-precommit. This is
> because hadoop-hdfs-client does not have any tests.
> Kihwal
>
>   From: Colin P. McCabe 
>  To: "hdfs-dev@hadoop.apache.org" 
> Cc: "common-...@hadoop.apache.org" 
>  Sent: Monday, October 19, 2015 4:01 PM
>  Subject: Re: hadoop-hdfs-client splitoff is going to break code
>
> Thanks for being proactive here, Steve.  I think this is a good example of
> why this change should have been done in a branch rather than having been
> done directly in trunk.
>
> regards,
> Colin
>
>
>
>
> On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran 
> wrote:
>
>> just an FYI, the split off of hadoop hdfs into client and server is going
>> to break things.
>>
>> I know that, as my code is broken; DFSConfigKeys off the path,
>> HdfsConfiguration, the class I've been loading to force pickup of
>> hdfs-site.xml -all missing.
>>
>> This is because hadoop-client  POM now depends on hadoop-hdfs-client, not
>> hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad
>> about DFSConfigKeys, as everybody uses it as the one hard-coded resource of
>> HDFS constants, HDFS-6566 covering the issue of making this public,
>> something that's been sitting around for a year.
>>
>> I'm fixing my build by explicitly adding a hadoop-hdfs dependency.
>>
>> Any application which used stuff which has now been declared server-side
>> isn't going to compile any more, which does appear to break the
>> compatibility guidelines we've adopted, specifically "The hadoop-client
>> artifact (maven groupId:artifactId) stays compatible within a major release"
>>
>>
>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts
>>
>>
>> We need to do one of
>>
>> 1. Agree that this change is considered acceptable according to policy,
>> and mark it as incompatible in hdfs/CHANGES.TXT
>> 2. Change the POMs to add both hadoop-hdfs-client and hadoop-hdfs (server) in
>> hadoop-client -with downstream users free to exclude the server code
>>
>> We unintentionally caused similar grief with the move of the s3n clients
>> to hadoop-aws , HADOOP-11074 -something we should have picked up and -1'd.
>> This time we know the problems going to arise, so let's explicitly make a
>> decision this time, and share it with our users.
>>
>> -steve
>>
>
>
>


Re: hadoop-hdfs-client splitoff is going to break code

2015-10-23 Thread Kihwal Lee
I am not sure whether it was mentioned by anyone before, but I noticed that
client-only changes do not trigger running any tests in hdfs-precommit. This is
because hadoop-hdfs-client does not have any tests.
Kihwal

  From: Colin P. McCabe 
 To: "hdfs-dev@hadoop.apache.org"  
Cc: "common-...@hadoop.apache.org"  
 Sent: Monday, October 19, 2015 4:01 PM
 Subject: Re: hadoop-hdfs-client splitoff is going to break code
   
Thanks for being proactive here, Steve.  I think this is a good example of
why this change should have been done in a branch rather than having been
done directly in trunk.

regards,
Colin




On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran 
wrote:

> just an FYI, the split off of hadoop hdfs into client and server is going
> to break things.
>
> I know that, as my code is broken; DFSConfigKeys off the path,
> HdfsConfiguration, the class I've been loading to force pickup of
> hdfs-site.xml -all missing.
>
> This is because hadoop-client  POM now depends on hadoop-hdfs-client, not
> hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad
> about DFSConfigKeys, as everybody uses it as the one hard-coded resource of
> HDFS constants, HDFS-6566 covering the issue of making this public,
> something that's been sitting around for a year.
>
> I'm fixing my build by explicitly adding a hadoop-hdfs dependency.
>
> Any application which used stuff which has now been declared server-side
> isn't going to compile any more, which does appear to break the
> compatibility guidelines we've adopted, specifically "The hadoop-client
> artifact (maven groupId:artifactId) stays compatible within a major release"
>
>
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts
>
>
> We need to do one of
>
> 1. Agree that this change is considered acceptable according to policy,
> and mark it as incompatible in hdfs/CHANGES.TXT
> 2. Change the POMs to add both hadoop-hdfs-client and hadoop-hdfs (server) in
> hadoop-client -with downstream users free to exclude the server code
>
> We unintentionally caused similar grief with the move of the s3n clients
> to hadoop-aws , HADOOP-11074 -something we should have picked up and -1'd.
> This time we know the problems going to arise, so let's explicitly make a
> decision this time, and share it with our users.
>
> -steve
>


  

Re: hadoop-hdfs-client splitoff is going to break code

2015-10-20 Thread Steve Loughran

> On 19 Oct 2015, at 22:01, Colin P. McCabe  wrote:
> 
> Thanks for being proactive here, Steve.

No, just building downstream things. I also caught a failure of Spark to build
against trunk, but that's a one-liner to import the non-deprecated Auth Exception

>  I think this is a good example of
> why this change should have been done in a branch rather than having been
> done directly in trunk.

Given the size of the change, I'm now convinced that yes, the hadoop-client
split should have been in a branch. What it offers there is the ability to 
choose when to merge in. As it is, any Hadoop 2.8 release will have this 
feature. It's going to be visible, and that's going to add more testing. We 
should expect this to cause things to surface in the release process. We also 
need to consider what's going to be the policy if 2.8.0 turns out to break 
something: what are we prepared to roll back?



Re: hadoop-hdfs-client splitoff is going to break code

2015-10-19 Thread Colin P. McCabe
Thanks for being proactive here, Steve.  I think this is a good example of
why this change should have been done in a branch rather than having been
done directly in trunk.

regards,
Colin


On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran 
wrote:

> just an FYI, the split off of hadoop hdfs into client and server is going
> to break things.
>
> I know that, as my code is broken; DFSConfigKeys off the path,
> HdfsConfiguration, the class I've been loading to force pickup of
> hdfs-site.xml -all missing.
>
> This is because hadoop-client  POM now depends on hadoop-hdfs-client, not
> hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad
> about DFSConfigKeys, as everybody uses it as the one hard-coded resource of
> HDFS constants, HDFS-6566 covering the issue of making this public,
> something that's been sitting around for a year.
>
> I'm fixing my build by explicitly adding a hadoop-hdfs dependency.
>
> Any application which used stuff which has now been declared server-side
> isn't going to compile any more, which does appear to break the
> compatibility guidelines we've adopted, specifically "The hadoop-client
> artifact (maven groupId:artifactId) stays compatible within a major release"
>
>
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts
>
>
> We need to do one of
>
> 1. Agree that this change is considered acceptable according to policy,
> and mark it as incompatible in hdfs/CHANGES.TXT
> 2. Change the POMs to add both hadoop-hdfs-client and hadoop-hdfs (server) in
> hadoop-client -with downstream users free to exclude the server code
>
> We unintentionally caused similar grief with the move of the s3n clients
> to hadoop-aws , HADOOP-11074 -something we should have picked up and -1'd.
> This time we know the problems going to arise, so let's explicitly make a
> decision this time, and share it with our users.
>
> -steve
>


Re: hadoop-hdfs-client splitoff is going to break code

2015-10-14 Thread Mingliang Liu
The jira tracking this issue is: https://issues.apache.org/jira/browse/HDFS-9241

 +1 on option 2

I think it makes sense to make hadoop-client directly depend on hadoop-hdfs 
(which itself depends on hadoop-hdfs-client).

Ciao,

Mingliang Liu
Member of Technical Staff - HDFS,
Hortonworks Inc.
m...@hortonworks.com



> On Oct 14, 2015, at 10:36 AM, Steve Loughran  wrote:
> 
> just an FYI, the split off of hadoop hdfs into client and server is going to 
> break things.
> 
> I know that, as my code is broken; DFSConfigKeys off the path, 
> HdfsConfiguration, the class I've been loading to force pickup of 
> hdfs-site.xml -all missing.
> 
> This is because hadoop-client  POM now depends on hadoop-hdfs-client, not 
> hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad 
> about DFSConfigKeys, as everybody uses it as the one hard-coded resource of
> HDFS constants, HDFS-6566 covering the issue of making this public, something 
> that's been sitting around for a year.
> 
> I'm fixing my build by explicitly adding a hadoop-hdfs dependency.
> 
> Any application which used stuff which has now been declared server-side 
> isn't going to compile any more, which does appear to break the compatibility 
> guidelines we've adopted, specifically "The hadoop-client artifact (maven 
> groupId:artifactId) stays compatible within a major release"
> 
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts
> 
> 
> We need to do one of
> 
> 1. Agree that this change is considered acceptable according to policy, and
> mark it as incompatible in hdfs/CHANGES.TXT
> 2. Change the POMs to add both hadoop-hdfs-client and hadoop-hdfs (server) in
> hadoop-client -with downstream users free to exclude the server code
> 
> We unintentionally caused similar grief with the move of the s3n clients to 
> hadoop-aws , HADOOP-11074 -something we should have picked up and -1'd. This 
> time we know the problems going to arise, so let's explicitly make a decision
> this time, and share it with our users.
> 
> -steve



hadoop-hdfs-client splitoff is going to break code

2015-10-14 Thread Steve Loughran
just an FYI, the split off of hadoop hdfs into client and server is going to 
break things.

I know that, as my code is broken; DFSConfigKeys off the path, 
HdfsConfiguration, the class I've been loading to force pickup of hdfs-site.xml 
-all missing.

This is because hadoop-client  POM now depends on hadoop-hdfs-client, not 
hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad about 
DFSConfigKeys, as everybody uses it as the one hard-coded resource of HDFS
constants, HDFS-6566 covering the issue of making this public, something that's 
been sitting around for a year.

I'm fixing my build by explicitly adding a hadoop-hdfs dependency.

Any application which used stuff which has now been declared server-side isn't 
going to compile any more, which does appear to break the compatibility 
guidelines we've adopted, specifically "The hadoop-client artifact (maven 
groupId:artifactId) stays compatible within a major release"

http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts


We need to do one of

1. Agree that this change is considered acceptable according to policy, and
mark it as incompatible in hdfs/CHANGES.TXT
2. Change the POMs to add both hadoop-hdfs-client and hadoop-hdfs (server) in
hadoop-client -with downstream users free to exclude the server code

We unintentionally caused similar grief with the move of the s3n clients to 
hadoop-aws , HADOOP-11074 -something we should have picked up and -1'd. This 
time we know the problems going to arise, so let's explicitly make a decision
this time, and share it with our users.

-steve
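
A minimal sketch of how a downstream project could detect the breakage
described above at startup, rather than at compile time. It assumes only the
fully qualified names org.apache.hadoop.hdfs.DFSConfigKeys and
org.apache.hadoop.hdfs.HdfsConfiguration; the HdfsClasspathProbe class and its
methods are hypothetical helpers invented for illustration. It uses plain
reflection, so it compiles whether or not hadoop-hdfs is on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HdfsClasspathProbe {
    // The two classes the message reports as missing once hadoop-client
    // depends on hadoop-hdfs-client instead of hadoop-hdfs.
    public static final String[] SERVER_SIDE_CLASSES = {
        "org.apache.hadoop.hdfs.DFSConfigKeys",
        "org.apache.hadoop.hdfs.HdfsConfiguration"
    };

    /** Returns true if the named class can be located; the class is not initialized. */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, HdfsClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Probes each name in order and records whether it was found. */
    public static Map<String, Boolean> probe(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isOnClasspath(name));
        }
        return result;
    }

    public static void main(String[] args) {
        probe(SERVER_SIDE_CLASSES).forEach((name, present) ->
            System.out.println(name + (present
                ? ": found"
                : ": missing, consider a direct hadoop-hdfs dependency")));
    }
}
```

Passing false as the second argument to Class.forName avoids running static
initializers, so the probe only reports whether the class is reachable and
does not itself trigger any loading of hdfs-site.xml.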