Re: hadoop-hdfs-client splitoff is going to break code
I think a lot of "client-side" tests use MiniDFSCluster. I know mechanical division is possible, but what about test coverage?

Kihwal

From: Haohui Mai
To: hdfs-dev@hadoop.apache.org; Kihwal Lee
Cc: "common-...@hadoop.apache.org"
Sent: Friday, October 23, 2015 4:43 PM
Subject: Re: hadoop-hdfs-client splitoff is going to break code

All tests that need to spin up a MiniDFSCluster will need to stay in hadoop-hdfs. Other client-only tests are being moved to the hadoop-hdfs-client module, which is tracked in HDFS-9168.

~Haohui
Re: hadoop-hdfs-client splitoff is going to break code
All tests that need to spin up a MiniDFSCluster will need to stay in hadoop-hdfs. Other client-only tests are being moved to the hadoop-hdfs-client module, which is tracked in HDFS-9168.

~Haohui

On Fri, Oct 23, 2015 at 2:14 PM, Kihwal Lee wrote:
> I am not sure whether it was mentioned by anyone before, but I noticed that
> client-only changes do not trigger running any test in hdfs-precommit. This is
> because hadoop-hdfs-client does not have any test.
>
> Kihwal
Re: hadoop-hdfs-client splitoff is going to break code
I am not sure whether it was mentioned by anyone before, but I noticed that client-only changes do not trigger running any test in hdfs-precommit. This is because hadoop-hdfs-client does not have any test.

Kihwal

From: Colin P. McCabe
To: "hdfs-dev@hadoop.apache.org"
Cc: "common-...@hadoop.apache.org"
Sent: Monday, October 19, 2015 4:01 PM
Subject: Re: hadoop-hdfs-client splitoff is going to break code

Thanks for being proactive here, Steve. I think this is a good example of why this change should have been done in a branch rather than having been done directly in trunk.

regards,
Colin
Re: hadoop-hdfs-client splitoff is going to break code
> On 19 Oct 2015, at 22:01, Colin P. McCabe wrote:
>
> Thanks for being proactive here, Steve.

no, just building downstream things. Caught a failure of spark to build against trunk too, but that's a one liner to import the non-deprecated Auth Exception

> I think this is a good example of
> why this change should have been done in a branch rather than having been
> done directly in trunk.

Given the size of the change, I'm now convinced that yes, the hadoop-client split should have been in a branch. What it offers there is the ability to choose when to merge in. As it is, any Hadoop 2.8 release will have this feature. It's going to be visible, and that's going to add more testing. We should expect this to cause things to surface in the release process.

We also need to consider what's going to be the policy if 2.8.0 turns out to break something: what are we prepared to roll back?
Re: hadoop-hdfs-client splitoff is going to break code
Thanks for being proactive here, Steve. I think this is a good example of why this change should have been done in a branch rather than having been done directly in trunk.

regards,
Colin

On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran wrote:
> just an FYI, the split off of hadoop hdfs into client and server is going
> to break things.
>
> [...]
Re: hadoop-hdfs-client splitoff is going to break code
The jira tracking this issue is: https://issues.apache.org/jira/browse/HDFS-9241

+1 on option 2

I think it makes sense to make hadoop-client directly depend on hadoop-hdfs (which itself depends on hadoop-hdfs-client).

Ciao,

Mingliang Liu
Member of Technical Staff - HDFS,
Hortonworks Inc.
m...@hortonworks.com

> On Oct 14, 2015, at 10:36 AM, Steve Loughran wrote:
>
> just an FYI, the split off of hadoop hdfs into client and server is going to
> break things.
>
> [...]
>
> We need to do one of
>
> 1. agree that this change is considered acceptable according to policy, and
> mark it as incompatible in hdfs/CHANGES.TXT
> 2. Change the POMs to add both hdfs-client and hdfs server in hadoop-client,
> with downstream users free to exclude the server code
>
> [...]
hadoop-hdfs-client splitoff is going to break code
just an FYI, the split off of hadoop hdfs into client and server is going to break things.

I know that, as my code is broken; DFSConfigKeys off the path, HdfsConfiguration, the class I've been loading to force pickup of hdfs-site.xml - all missing.

This is because hadoop-client POM now depends on hadoop-hdfs-client, not hadoop-hdfs, so the things I'm referencing are gone. I'm particularly sad about DFSConfigKeys, as everybody uses it as the one hard-coded resource of HDFS constants, HDFS-6566 covering the issue of making this public, something that's been sitting around for a year.

I'm fixing my build by explicitly adding a hadoop-hdfs dependency.

Any application which used stuff which has now been declared server-side isn't going to compile any more, which does appear to break the compatibility guidelines we've adopted, specifically "The hadoop-client artifact (maven groupId:artifactId) stays compatible within a major release"

http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Build_artifacts

We need to do one of

1. agree that this change is considered acceptable according to policy, and mark it as incompatible in hdfs/CHANGES.TXT
2. Change the POMs to add both hdfs-client and hdfs server in hadoop-client, with downstream users free to exclude the server code

We unintentionally caused similar grief with the move of the s3n clients to hadoop-aws, HADOOP-11074 - something we should have picked up and -1'd. This time we know the problems that are going to arise, so let's explicitly make a decision this time, and share it with our users.

-steve
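The workaround mentioned above, explicitly adding a hadoop-hdfs dependency, might look roughly like this in a downstream pom.xml. This is a sketch, not the actual build change: the `hadoop.version` property is a placeholder assumed to be defined elsewhere in the project, and the version should match the hadoop-client version in use.

```xml
<!-- Sketch only: restores server-side classes such as DFSConfigKeys and
     HdfsConfiguration to the compile classpath of a downstream project. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <!-- hadoop.version is a placeholder property, not taken from the thread -->
    <version>${hadoop.version}</version>
  </dependency>
  <!-- Explicit dependency, needed once hadoop-client only pulls in hadoop-hdfs-client -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
```

Against a Hadoop release where hadoop-client still depends on hadoop-hdfs, the extra entry should simply be redundant, so the same POM can build against both layouts.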