Re: Compatibility guidelines for toString overrides
How about this idea: define a new annotation, "StableImplUnstableInterface", meaning consumers can't assume stability but producers can't change things. Mark all toString methods with this annotation. Then, lazily, as the need arises to change a given toString method, diligence can be done first to see whether any legacy code depends on the method in a compatibility-breaking manner; those dependencies can be fixed, and the method can then be changed and re-marked as unstable. Conversely, there might be circumstances where a toString method should be marked as stable. (Certainly it's reasonable to assume that Integer.toString returns a parsable result, for example; the point being that for some classes it makes sense to have a stable spec for toString.) Over the years, one would hope that the StableImplUnstableInterface annotations would disappear.

Sent from my iPhone

> On May 12, 2016, at 1:40 PM, Sean Busbey wrote:
>
> As a downstream user of Hadoop, it would be much clearer if the
> toString functions included the appropriate annotations to say they're
> non-public, evolving, or whatever.
>
> Most downstream users of Hadoop aren't going to remember in-detail
> exceptions to the Java API compatibility rules; once they see that a
> class is labeled Public/Stable, they're going to presume that applies
> to all non-private members.
>
>> On Thu, May 12, 2016 at 9:32 AM, Colin McCabe wrote:
>> Hi all,
>>
>> Recently a discussion came up on HADOOP-13028 about the wisdom of
>> overloading S3AInputStream#toString to output statistics information.
>> It's a difficult judgement for me to make, since I'm not aware of any
>> compatibility guidelines for InputStream#toString. Do we have
>> compatibility guidelines for toString functions?
>>
>> It seems like the output of toString functions is usually used as a
>> debugging aid, rather than as a stable format suitable for UI display or
>> object serialization.
>> Clearly, there are a few cases where we might
>> want to specifically declare that a toString method is a stable API.
>> However, I think if we attempt to treat the toString output of all
>> public classes as stable, we will have greatly increased the API
>> surface. Should we formalize this and declare that toString functions
>> are @Unstable, Evolving unless declared otherwise?
>>
>> best,
>> Colin

--
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
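The annotation proposed above doesn't exist in Hadoop; the name and placement are hypothetical. A minimal sketch of what such a marker could look like, with a made-up example class standing in for something like S3AInputStream:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Hypothetical marker: the implementation of this method must stay stable
 * for now (producers can't change it), even though consumers should not
 * rely on its output (the interface itself is unstable).
 */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface StableImplUnstableInterface { }

// Stand-in class; not a real Hadoop type.
class StatsStreamExample {
    @StableImplUnstableInterface
    @Override
    public String toString() {
        return "StatsStreamExample{bytesRead=0}";
    }
}

public class AnnotationDemo {
    public static void main(String[] args) throws Exception {
        // Tooling (or an audit script) could find all marked toStrings
        // reflectively before changing one.
        boolean marked = StatsStreamExample.class
                .getMethod("toString")
                .isAnnotationPresent(StableImplUnstableInterface.class);
        System.out.println("toString marked: " + marked);  // prints "toString marked: true"
    }
}
```

A marker like this would at least make the "do diligence before changing" step mechanically discoverable, in the same spirit as the existing InterfaceStability annotations.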
[jira] [Created] (HADOOP-11806) Test issue for JIRA automation scripts
Raymie Stata created HADOOP-11806:
-------------------------------------

             Summary: Test issue for JIRA automation scripts
                 Key: HADOOP-11806
                 URL: https://issues.apache.org/jira/browse/HADOOP-11806
             Project: Hadoop Common
          Issue Type: Test
            Reporter: Raymie Stata
            Assignee: Raymie Stata
            Priority: Trivial

I'm writing some scripts to automate some JIRA clean-up activities. I've created this issue for testing these scripts. Please ignore...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
Re: Looking to a Hadoop 3 release
Avoiding the use of JDK8 language features (and, presumably, APIs) means you've abandoned #1, i.e., you haven't (really) bumped the JDK source version to JDK8. Also, note that releasing from trunk is a way of achieving #3, not a way of abandoning it.

On Mon, Mar 9, 2015 at 7:10 PM, Andrew Wang andrew.w...@cloudera.com wrote:
Hi Raymie,

Konst proposed just releasing off of trunk rather than cutting a branch-2, and there was general agreement there. So, consider #3 abandoned. #1 and #2 can be achieved at the same time; we just need to avoid using JDK8 language features in trunk so things can be backported.

Best,
Andrew

On Mon, Mar 9, 2015 at 7:01 PM, Raymie Stata rst...@altiscale.com wrote:
In this (and the related threads), I see the following three requirements:

1. Bump the source JDK version to JDK8 (i.e., drop JDK7 support).

2. We'll still be releasing 2.x releases for a while, with similar feature sets as 3.x.

3. Avoid the risk of split-brain behavior by minimizing backporting headaches. Pulling trunk -> branch-2 -> branch-2.x is already tedious. Adding a branch-3, branch-3.x would be obnoxious.

These three cannot be achieved at the same time. Which do we abandon?

On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia sanjayo...@gmail.com wrote:
On Mar 5, 2015, at 3:21 PM, Siddharth Seth ss...@apache.org wrote:
2) Simplification of configs - potentially separating client side configs and those used by daemons. This is another source of perpetual confusion for users.

+1 on this.
sanjay
Re: Looking to a Hadoop 3 release
In this (and the related threads), I see the following three requirements:

1. Bump the source JDK version to JDK8 (i.e., drop JDK7 support).

2. We'll still be releasing 2.x releases for a while, with similar feature sets as 3.x.

3. Avoid the risk of split-brain behavior by minimizing backporting headaches. Pulling trunk -> branch-2 -> branch-2.x is already tedious. Adding a branch-3, branch-3.x would be obnoxious.

These three cannot be achieved at the same time. Which do we abandon?

On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia sanjayo...@gmail.com wrote:
On Mar 5, 2015, at 3:21 PM, Siddharth Seth ss...@apache.org wrote:
2) Simplification of configs - potentially separating client side configs and those used by daemons. This is another source of perpetual confusion for users.

+1 on this.
sanjay
Re: Plans of moving towards JDK7 in trunk
There's an outstanding question addressed to me: "Are there particular features or new dependencies that you would like to contribute (or see contributed) that require using the Java 1.7 APIs?" The question misses the point: we'd figure out how to write something we wanted to contribute to Hadoop against the APIs of Java4 if that's what it took to get them into a stable release. And at current course and speed, that's how ridiculous things could get.

To summarize, it seems like there's a vague consensus that it might be okay to eventually allow the use of Java7 in trunk, but there's no decision. And there's been no answer to the concern that even if such dependencies were allowed in trunk, the only people using them would be people who are uninterested in getting their patches into a stable release of Hadoop on any knowable timeframe, which doesn't bode well for the ability to stabilize that Java7 code when the time comes to attempt it.

I don't have more to add, so I'll go back to lurking. It'll be interesting to see where we'll be standing a year from now.

On Sun, Apr 13, 2014 at 2:09 AM, Tsuyoshi OZAWA ozawa.tsuyo...@gmail.com wrote:
Hi,

+1 for Karthik's idea (non-binding). IMO, we should keep the compatibility between JDK 6 and JDK 7 on both branch-1 and branch-2, because users can be using them. For future releases where we can declare breaking compatibility (e.g. a 3.0.0 release), we can use JDK 7 features if we can get benefits. However, it can increase maintenance costs and spread thin the effort of contributors maintaining branches. Then, I think it is a reasonable approach to use limited and minimal JDK 7 APIs when we have reasons to need the features. By the way, if we start to use JDK 7 APIs, we should declare the basis for when to use JDK 7 APIs on the wiki, so as not to confuse contributors.
Thanks,
- Tsuyoshi

On Wed, Apr 9, 2014 at 11:44 AM, Raymie Stata rst...@altiscale.com wrote:
It might make sense to try to enumerate the benefits of switching to Java7 APIs and dependencies.

- Java7 introduced a huge number of language, byte-code, API, and tooling enhancements! Just to name a few: try-with-resources, newer and stronger encryption methods, more scalable concurrency primitives. See http://www.slideshare.net/boulderjug/55-things-in-java-7

- We can't update current dependencies, and we can't add cool new ones.

- Putting language/APIs aside, don't forget that a huge amount of effort goes into qualifying for Java6 (at least, I hope the folks claiming to support Java6 are putting in such an effort :-). Wouldn't Hadoop users/customers be better served if qualification effort went into Java7/8 versus Java6/7?

Getting to Java7 as a development env (and Java8 as a runtime env) seems like a no-brainer. Question is: How?

On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
It might make sense to try to enumerate the benefits of switching to Java7 APIs and dependencies. IMO, the ones listed so far on this thread don't make a compelling enough case to drop Java6 in branch-2 on any time frame, even if this means supporting Java6 through 2015. For example, the change in RawLocalFileSystem semantics might be an incompatible change for branch-2 anyway.

On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla ka...@cloudera.com wrote:
+1 to NOT breaking compatibility in branch-2.

I think it is reasonable to require JDK7 for trunk, if we limit use of JDK7-only APIs to security fixes etc. If we make other optimizations (like IO), it would be a pain to backport things to branch-2.

I guess this all depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins e...@cloudera.com wrote:
On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi davi.ottenhei...@emc.com wrote:
From: Eli Collins [mailto:e...@cloudera.com]
Sent: Monday, April 07, 2014 11:54 AM

IMO we should not drop support for Java 6 in a minor update of a stable release (v2). I don't think the larger Hadoop user base would find it acceptable that upgrading to a minor update caused their systems to stop working because they didn't upgrade Java. There are people still getting support for Java 6. ...

Thanks,
Eli

Hi Eli,

Technically you are correct: those with extended support get critical security fixes for 6 until the end of 2016. I am curious whether many of those are in the Hadoop user base. Do you know? My guess is the vast majority are within Oracle's official public end of life, which was over 12 months ago. Even Premier support ended Dec 2013: http://www.oracle.com/technetwork/java/eol-135779.html

The end of Java 6 support carries much risk. It has to be considered in terms of serious security vulnerabilities such as CVE-2013-2465, with CVSS score 10.0. http://www.cvedetails.com/cve/CVE-2013-2465/

Since you mentioned caused systems
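Since try-with-resources keeps coming up as a headline Java7 feature in this thread, a minimal sketch contrasting it with the Java6 idiom (self-contained; the in-memory reader just stands in for a real resource):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class TryWithResourcesDemo {
    public static void main(String[] args) throws IOException {
        // Java 6 style: an explicit finally block is needed to guarantee close().
        BufferedReader r1 = new BufferedReader(new StringReader("java6"));
        try {
            System.out.println(r1.readLine());
        } finally {
            r1.close();
        }

        // Java 7 style: the resource is closed automatically when the block
        // exits, even if readLine() throws, because BufferedReader
        // implements AutoCloseable.
        try (BufferedReader r2 = new BufferedReader(new StringReader("java7"))) {
            System.out.println(r2.readLine());
        }
    }
}
```

The second form removes a whole class of leaked-handle bugs, which is part of why the feature is attractive for code like Hadoop's stream-heavy I/O paths.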
Re: Plans of moving towards JDK7 in trunk
I think the problem to be solved here is to define a point in time when the average Hadoop contributor can start using Java7 dependencies in their code. The "use Java7 dependencies in trunk(/branch3)" plan, by itself, does not solve this problem. The average Hadoop contributor wants to see their contributions make it into a stable release in a predictable amount of time. Putting code with a Java7 dependency into trunk means the exact opposite: there is no timeline to a stable release. So most contributors will stay away from Java7 dependencies, despite the nominal policy that they're allowed in trunk. (And the few that do use Java7 dependencies are people who do not value releasing code into stable releases, which arguably could lead to a situation where the Java7-dependent code in trunk is, on average, on the buggy side.)

I'm not saying the branch2-in-the-future plan is the only way to solve the problem of putting Java7 dependencies on a known time-table, but at least it solves it. Is there another solution?

On Thu, Apr 10, 2014 at 1:11 AM, Steve Loughran ste...@hortonworks.com wrote:
On 9 April 2014 23:52, Eli Collins e...@cloudera.com wrote:

For the sake of this discussion we should separate the runtime from the programming APIs. Users are already migrating to the java7 runtime for most of the reasons listed below (support, performance, bugs, etc), and the various distributions cert their Hadoop 2 based distributions on java7. This gives users many of the benefits of java7, without forcing users off java6. I.e., Hadoop does not need to switch to the java7 programming APIs to make sure everyone has a supported runtime.

+1: you can use Java 7 today; I'm not sure how tested Java 8 is

The question here is really about when Hadoop, and the Hadoop ecosystem (since adjacent projects often end up in the same classpath), start using the java7 programming APIs and therefore break compatibility with java6 runtimes.
I think our java6 runtime users would consider dropping support for their java runtime in an update of a major release to be an incompatible change (the binaries stop working on their current jvm).

do you mean major 2.x -> 3.y or minor 2.x -> 2.(x+1) here?

That may be worth it if we can articulate sufficient value to offset the cost (they have to upgrade their environment, might make rolling upgrades stop working, etc), but I've not yet heard an argument that articulates the value relative to the cost. Eg upgrading to the java7 APIs allows us to pull in dependencies with new major versions, but only if those dependencies don't break compatibility (which is likely given that our classpaths aren't so isolated), and, realistically, only if the entire Hadoop stack moves to java7 as well (eg we have to recompile HBase to generate v1.7 binaries even if they stick on API v1.6). I'm not aware of a feature, bug, etc that really motivates this.

I don't see that being needed unless we move up to new java7+ only libraries and HBase needs to track this. The big recompile-to-work issue is google guava, which is troublesome enough I'd be tempted to say "can we drop it entirely?"

An alternate approach is to keep the current stable release series (v2.x) as is, and start using new APIs in trunk (for v3). This will be a major upgrade for Hadoop, and therefore an incompatible change like this is to be expected (it would be great if this came with additional changes to better isolate classpaths and dependencies from each other). It allows us to continue to support multiple types of users with different branches, vs forcing all users onto a new version. It of course means that 2.x users will not get the benefits of the new API, but it's unclear what those benefits are given they can already get the benefits of adopting the newer java runtimes today.
I'm (personally) +1 to this. I also think we should plan to do the switch some time this year, to not only get the benefits, but discover the costs.

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Plans of moving towards JDK7 in trunk
Is there broad consensus that, by end of 3Q2014 at the latest, the average contributor to Hadoop should be free to use Java7 features? And start pulling in libraries that have a Java7 dependency? And start doing the janitorial work of taking advantage of the Java7 APIs? Or do we think that the bulk of Hadoop work will be done against Java6 APIs (and avoiding Java7-dependent libraries) through the end of the year?

If the consensus is that we introduce Java7 into the bulk of Hadoop coding, what's the plan for getting there? The answer can't be "right now, in trunk." Even if we agreed to start allowing Java7 dependencies into trunk, as a practical matter this isn't enough. Right now, if I'm a random Hadoop contributor, I'd be stupid to contribute to trunk: I know that any stable release in the near term will be from branch2, so if I want a prayer of seeing my change in a stable release, I'd better contribute to branch2.

If we want a path to allowing Java7 dependencies by Q4, then we need one of the following:

1) branch3 plan: The major Hadoop vendors (you know who you are) commit to shipping a v3 of Hadoop in Q4 that allows Java7 dependencies, and show signs of living up to that commitment (e.g., a branch3 is created sometime soon). This puts us all on a path towards a real release of Hadoop that allows Java7 dependencies.

2) branch2 plan: Deprecate Java6 as a runtime environment now, publicly declare a time frame (e.g., 4Q2014) when _future development_ stops supporting the Java6 runtime, and work with our customers in the meantime to get them off a crazy-old version of Java (that's what we're doing right now).

I don't see another path to allowing Java7 dependencies. In the current state of indecision, the smart programmer would be assuming no Java7 dependencies into 2015.

On the one hand, I don't see the branch3 plan actually happening. This is a big decision involving marketing, engineering, and customer support.
Plus it creates a problem for sales: come summertime, they'll have a hard time selling 2.x-based releases because they've pre-announced support for 3.x. It's just not going to happen.

On the other hand, I don't see the problem with the branch2 plan. The branch2 plan also requires commitment from the major vendors, but this decision is not nearly as galactic. By the time 3Q2014 comes along, this problem will be very rarified. Also, don't forget that it typically takes a customer 3-6 months to upgrade their Hadoop -- and a customer who's afraid to shift off Java6 in 3Q2014 will probably take a year to upgrade. The branch2 plan implies a last Java6 release of Hadoop in 3Q2014. If we assume a Java7-averse customer will take a year to upgrade to this release -- and then will take another year to upgrade their cluster after that -- then they can be happily using Java6 all the way into 2016. (Another point: if 3Q2014 comes along and vendors find they have so many customers still on Java6 that they can't afford the discontinuity, then they can shift the MAJOR version number of their product to communicate the discontinuity -- there's nothing that says that a vendor's versioning scheme must agree exactly with Hadoop's.)

In short, we don't currently have a realistic path for introducing Java7 dependencies into Hadoop. Simply allowing them into trunk will NOT solve this problem: any contributor who wants to see their code in a stable release knows it'll have to flow through branch2 -- and thus they'll have to avoid Java7 dependencies. The branch2 plan is the only plan proposed so far that gets us to Java7 dependencies by Q4. And the important part of the branch2 plan is that we make the decision soon -- so we have time to notify folks and otherwise work that decision out into the field.
Raymie

On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins e...@cloudera.com wrote:
On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi davi.ottenhei...@emc.com wrote:
From: Eli Collins [mailto:e...@cloudera.com]
Sent: Monday, April 07, 2014 11:54 AM

IMO we should not drop support for Java 6 in a minor update of a stable release (v2). I don't think the larger Hadoop user base would find it acceptable that upgrading to a minor update caused their systems to stop working because they didn't upgrade Java. There are people still getting support for Java 6. ...

Thanks,
Eli

Hi Eli,

Technically you are correct: those with extended support get critical security fixes for 6 until the end of 2016. I am curious whether many of those are in the Hadoop user base. Do you know? My guess is the vast majority are within Oracle's official public end of life, which was over 12 months ago. Even Premier support ended Dec 2013: http://www.oracle.com/technetwork/java/eol-135779.html

The end of Java 6 support carries much risk. It has to be considered in terms of serious security vulnerabilities such as CVE-2013-2465, with CVSS score 10.0. http://www.cvedetails.com/cve/CVE-2013-2465/

Since you
Re: Plans of moving towards JDK7 in trunk
It might make sense to try to enumerate the benefits of switching to Java7 APIs and dependencies.

- Java7 introduced a huge number of language, byte-code, API, and tooling enhancements! Just to name a few: try-with-resources, newer and stronger encryption methods, more scalable concurrency primitives. See http://www.slideshare.net/boulderjug/55-things-in-java-7

- We can't update current dependencies, and we can't add cool new ones.

- Putting language/APIs aside, don't forget that a huge amount of effort goes into qualifying for Java6 (at least, I hope the folks claiming to support Java6 are putting in such an effort :-). Wouldn't Hadoop users/customers be better served if qualification effort went into Java7/8 versus Java6/7?

Getting to Java7 as a development env (and Java8 as a runtime env) seems like a no-brainer. Question is: How?

On Tue, Apr 8, 2014 at 10:21 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
It might make sense to try to enumerate the benefits of switching to Java7 APIs and dependencies. IMO, the ones listed so far on this thread don't make a compelling enough case to drop Java6 in branch-2 on any time frame, even if this means supporting Java6 through 2015. For example, the change in RawLocalFileSystem semantics might be an incompatible change for branch-2 anyway.

On Tue, Apr 8, 2014 at 10:05 AM, Karthik Kambatla ka...@cloudera.com wrote:
+1 to NOT breaking compatibility in branch-2.

I think it is reasonable to require JDK7 for trunk, if we limit use of JDK7-only APIs to security fixes etc. If we make other optimizations (like IO), it would be a pain to backport things to branch-2.

I guess this all depends on when we see ourselves shipping Hadoop-3. Any ideas on that?
On Tue, Apr 8, 2014 at 9:19 AM, Eli Collins e...@cloudera.com wrote:
On Tue, Apr 8, 2014 at 2:00 AM, Ottenheimer, Davi davi.ottenhei...@emc.com wrote:
From: Eli Collins [mailto:e...@cloudera.com]
Sent: Monday, April 07, 2014 11:54 AM

IMO we should not drop support for Java 6 in a minor update of a stable release (v2). I don't think the larger Hadoop user base would find it acceptable that upgrading to a minor update caused their systems to stop working because they didn't upgrade Java. There are people still getting support for Java 6. ...

Thanks,
Eli

Hi Eli,

Technically you are correct: those with extended support get critical security fixes for 6 until the end of 2016. I am curious whether many of those are in the Hadoop user base. Do you know? My guess is the vast majority are within Oracle's official public end of life, which was over 12 months ago. Even Premier support ended Dec 2013: http://www.oracle.com/technetwork/java/eol-135779.html

The end of Java 6 support carries much risk. It has to be considered in terms of serious security vulnerabilities such as CVE-2013-2465, with CVSS score 10.0. http://www.cvedetails.com/cve/CVE-2013-2465/

Since you mentioned "caused systems to stop" as an example of what would be a concern to Hadoop users, please note the CVE-2013-2465 availability impact: "Complete (There is a total shutdown of the affected resource. The attacker can render the resource completely unavailable.)"

This vulnerability was patched in Java 6 Update 51, but post end of life. Apple pushed out the update specifically because of this vulnerability (http://support.apple.com/kb/HT5717), as did some other vendors privately, but for the majority of people using Java 6 means they have a ticking time bomb. Allowing it to stay should be considered in terms of accepting the whole risk posture. There are some who get extended support, but I suspect many just have an if-it's-not-broke mentality when it comes to production deployments.
The current code supports both java6 and java7 and so allows these people to remain compatible, while enabling others to upgrade to the java7 runtime. This seems like the right compromise for a stable release series. Again, absolutely makes sense for trunk (ie v3) to require java7 or greater.
Re: Plans of moving towards JDK7 in trunk
To summarize the thread so far:

a) Java7 is already a supported compile- and runtime environment for Hadoop branch2 and trunk
b) Java6 must remain a supported compile- and runtime environment for Hadoop branch2
c) (b) implies that branch2 must stick to Java6 APIs

I wonder if point (b) should be revised. We could immediately deprecate Java6 as a runtime (and thus compile-time) environment for Hadoop. We could end support for it in some published time frame (perhaps 3Q2014). That is, we'd say that all future 2.x releases past some date would not be guaranteed to run on Java6. This would set us up for using Java7 APIs in branch2.

An alternative might be to keep branch2 on Java6 APIs forever, and to start using Java7 APIs in trunk relatively soon. The concern here would be that trunk isn't getting the kind of production torture-testing that branch2 is subjected to, and won't be for a while. If trunk and branch2 diverge too much too quickly, trunk could become a nest of bugs, endangering the timeline and quality of Hadoop 3. This would argue for keeping trunk and branch2 in closer sync (maybe until a branch3 is created and starts getting used by bleeding-edge users). However, as just suggested, keeping them in closer sync need _not_ imply that Java7 features be avoided indefinitely: again, with sufficient warning, Java6 support could be sunset within branch2.

On a related note, Steve points out that we need to start thinking about Java8. YES!! Lambdas are a Really Big Deal! If we sunset Java6 in a few quarters, maybe we can add Java8 compile and runtime (but not API) support at about the same time. This does NOT imply bringing Java8 APIs into branch2: even if we do allow Java7 APIs into branch2 in the future, I doubt that bringing Java8 APIs into it will ever make sense. However, if Java8 is a supported runtime environment for Hadoop 2, that sets us up for using Java8 APIs for the eventual branch3 sometime in 2015.
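Since Java8 lambdas come up above as "a Really Big Deal", a minimal, purely illustrative sketch of the difference they make; the branch names are just sample data:

```java
import java.util.Arrays;
import java.util.List;

public class LambdaDemo {
    public static void main(String[] args) {
        List<String> branches = Arrays.asList("branch-2", "trunk", "branch-2.2");

        // Java 7 and earlier: every callback is an anonymous inner class.
        Runnable pre8 = new Runnable() {
            @Override
            public void run() {
                System.out.println("running (pre-8 style)");
            }
        };
        pre8.run();

        // Java 8: the same callback as a lambda, plus streams over collections.
        Runnable with8 = () -> System.out.println("running (lambda)");
        with8.run();

        long n = branches.stream()
                         .filter(b -> b.startsWith("branch"))
                         .count();
        System.out.println("branch count: " + n);  // prints "branch count: 2"
    }
}
```

The boilerplate reduction is exactly why a codebase full of callbacks and collection-processing, like Hadoop, stands to gain from an eventual Java8-API branch.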
On Sat, Apr 5, 2014 at 10:52 AM, Steve Loughran ste...@hortonworks.com wrote:
On 5 April 2014 11:53, Colin McCabe cmcc...@alumni.cmu.edu wrote:

I've been using JDK7 for Hadoop development for a while now, and I know a lot of other folks have as well. Correct me if I'm wrong, but what we're talking about here is not moving towards JDK7 but breaking compatibility with JDK6.

+1

There are a lot of good reasons to ditch JDK6. It would let us use new APIs in JDK7, especially the new file APIs. It would let us update a few dependencies to newer versions.

+1

I don't like the idea of breaking compatibility with JDK6 in trunk, but not in branch-2. The traditional reason for putting something in trunk but not in branch-2 is that it is new code that needs some time to prove itself.

+1. branch-2 must continue to run on JDK6

This doesn't really apply to incrementing min.jdk -- we could do that easily whenever we like. Meanwhile, if trunk starts accumulating jdk7-only code and dependencies, backports from trunk to branch-2 will become harder and harder over time.

I agree, but note that trunk diverges from branch-2 over time anyway -- it's happening.

Since we make stable releases off of branch-2, and not trunk, I don't see any upside to this. To be honest, I see only negatives here. More time backporting, more issues that show up only in production (branch-2) and not on dev machines (trunk). Maybe it's time to start thinking about what version of branch-2 will drop jdk6 support. But until there is such a version, I don't think trunk should do it.

1. Let's assume that branch-2 will never drop JDK6 -- clusters are committed to it, and saying "JDK update needed" will simply stop updates.

2. By the time Hadoop 3.0 ships (2015?), JDK6 will be EOL, Java 8 will be in common use, and even JDK7 seen as trailing edge.

3. JDK7 improves JVM performance: NUMA, native IO, etc. -- which you get for free; as we're confident it's stable, there's no reason not to move to it in production.

4.
As we update the dependencies in Hadoop 3, we'll end up upgrading to libraries that are JDK7+ only (jetty!), so JDK6 is implicitly abandoned.

5. There are new packages and APIs in Java7 which we can adopt to make our lives better and development more productive -- as well as improving the user experience.

As a case in point, java.io.File.mkdirs() says "true if and only if the directory was created; false otherwise", and returns false in either of these two cases:
- the path resolves to a directory that already exists
- the path resolves to a file

Think about that: anyone using local filesystems could write code that assumes a false return from mkdirs() is harmless, because if you apply it more than once to a directory, it is. But call it on a file and you don't get told it's only a file until you try to do something under it, and then things stop behaving. In comparison, java.nio.file.Files differentiates this case by declaring FileAlreadyExistsException -
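The mkdirs() ambiguity described above can be demonstrated directly; a self-contained sketch using a throwaway temp directory:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MkdirsDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("mkdirs-demo");
        File existingDir = tmp.toFile();
        File existingFile = Files.createTempFile(tmp, "f", ".txt").toFile();

        // java.io.File.mkdirs(): returns false for BOTH cases below,
        // so the caller can't tell "already a directory" from "it's a file".
        System.out.println("mkdirs on existing dir:  " + existingDir.mkdirs());   // false
        System.out.println("mkdirs on existing file: " + existingFile.mkdirs());  // false

        // java.nio.file.Files.createDirectories() (Java 7) distinguishes them:
        // a no-op when the directory already exists...
        Files.createDirectories(existingDir.toPath());
        // ...but a checked exception when the path is actually a file.
        try {
            Files.createDirectories(existingFile.toPath());
        } catch (FileAlreadyExistsException e) {
            System.out.println("createDirectories on a file: FileAlreadyExistsException");
        }
    }
}
```

This is the kind of local-filesystem corner case where the Java7 NIO.2 API is a genuine correctness improvement, not just a convenience.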
Re: Will there be a 2.2 patch releases?
I took a look at items in 2.3 and 2.4, as well as CDH5 and HDP2 (also looked at a few of the patches to assess their risk levels), and came up with the following strawman proposal of bug-patches to be included in a 2.2.1 release:

HADOOP-10029 [major] - Specifying har file to MR job fails in secure cluster
HDFS-5089 [major] - When a LayoutVersion supports SNAPSHOT, it must support FSIMAGE_NAME_OPTIMIZATION
HDFS-5403 [major] - WebHdfs client cannot communicate with older WebHdfs servers post HDFS-5306
HDFS-5433 [critical] - When reloading fsimage during checkpointing, we should clear existing snapshottable directories
MAPREDUCE-5028 [critical] - Maps fail when io.sort.mb is set to high value
YARN-1295 [major] - In UnixLocalWrapperScriptBuilder, using "bash -c" can cause "Text file busy" errors
YARN-1374 [blocker] - Resource Manager fails to start due to ConcurrentModificationException
YARN-1176 [critical] - RM web services ClusterMetricsInfo total nodes doesn't include unhealthy nodes

There are lots of outstanding bug fixes, so this list is definitely a bit arbitrary, but it seemed like a good list to me. Any thoughts?

On Fri, Jan 3, 2014 at 5:26 PM, Sandy Ryza sandy.r...@cloudera.com wrote:
Re-reading the thread, it seems what I said about 2.2.1 never happening was incorrect. My impression is still that nobody has plans to drive a 2.2.1 release on any particular timeline. The changes that are now in 2.3 have been moved out of branch-2.2.1. I suppose the idea is that changes slated for 2.2.1 should be committed both to branch-2.2 and branch-2.2.1.

-Sandy

On Fri, Jan 3, 2014 at 4:57 PM, Raymie Stata rst...@altiscale.com wrote:
Yes, that thread is part of what's confusing me. Arun's initial 11/8 message suggests that there would be room for blocker fixes leading to a 2.2.1 patch release ("...and then be very careful about including only *blocker* fixes in branch-2.2"). And nothing else in that thread suggests that there wouldn't be a patch release.
And yet, Sandy seems to think that 2.2.1 isn't happening at all (YARN-1295), a view that's consistent with the currently confused state of the repo (branch-2.2.1 exists but not released, branch-2.2 version is 2.2.2-SNAPSHOT). Seems to me that we should be planning for a 2.2.1 patch release at some point...

Raymie

On Fri, Jan 3, 2014 at 1:17 AM, Steve Loughran ste...@hortonworks.com wrote:
the last discussion on this was in november -- I presume that's still the plan
http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201311.mbox/%3CA31E1430-33BE-437C-A61E-050F9A67C109%40hortonworks.com%3E

On 3 January 2014 04:10, Raymie Stata rst...@altiscale.com wrote:
Nudge, any thoughts?

On Sun, Dec 29, 2013 at 1:25 AM, Raymie Stata rst...@altiscale.com wrote:
In discussing YARN-1295 it's become clear that I'm confused about the outcome of the "Next releases" thread. I had assumed there would be patch releases to 2.2, and indeed one would be coming out early Q1. Is this correct?

If so, then things seem a little messed-up right now in 2.2-land. There already is a branch-2.2.1, but there hasn't been a release. And branch-2.2 has Maven version 2.2.2-SNAPSHOT. Due to the 2.3 rename a few weeks ago, it might be that the first patch release for 2.2 needs to be 2.2.2. But if so, notice these lists of fixes for 2.2.1:

https://issues.apache.org/jira/browse/YARN/fixforversion/12325667
https://issues.apache.org/jira/browse/HDFS/fixforversion/12325666

Do these need to have their fix-versions updated?

Raymie

P.S. While we're on the subject of point releases, let me check my assumptions. I assumed that, for release x.y.z, fixes deemed to be critical bug fixes would be put into branch-x.y as a matter of course. The Maven release-number in branch-x.y would be x.y.(z+1)-SNAPSHOT, and JIRAs (to be) committed to branch-x.y would have x.y.(z+1) as one of their fix-versions.
When enough fixes have accumulated to warrant a release, or when a fix comes up that is critical enough to warrant an immediate release, then branch-x-y is branched to branch-x.y.(z+1), and a release is made. (As Hadoop itself moves from x.y to x.(y+1) and then x.(y+2), the threshold for what is considered to be a critical bug would naturally start to rise, as the effort of back-porting goes up.) Do I have it right? -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please
Re: Will there be 2.2 patch releases?
Yes, that thread is part of what's confusing me. Arun's initial 11/8
message suggests that there would be room for blocker fixes leading to a
2.2.1 patch release ("...and then be very careful about including only
*blocker* fixes in branch-2.2"). And nothing else in that thread suggests
that there wouldn't be a patch release.

And yet, Sandy seems to think that 2.2.1 isn't happening at all
(YARN-1295), a view that's consistent with the currently confused state of
the repo (branch-2.2.1 exists but hasn't been released, and the branch-2.2
version is 2.2.2-SNAPSHOT). It seems to me that we should be planning for a
2.2.1 patch release at some point...

Raymie

On Fri, Jan 3, 2014 at 1:17 AM, Steve Loughran <ste...@hortonworks.com> wrote:
> The last discussion on this was in November - I presume that's still the
> plan:
> http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201311.mbox/%3CA31E1430-33BE-437C-A61E-050F9A67C109%40hortonworks.com%3E
>
> On 3 January 2014 04:10, Raymie Stata <rst...@altiscale.com> wrote:
>> Nudge, any thoughts?
>>
>> On Sun, Dec 29, 2013 at 1:25 AM, Raymie Stata <rst...@altiscale.com> wrote:
>>> In discussing YARN-1295 it's become clear that I'm confused about the
>>> outcome of the "Next releases" thread. I had assumed there would be
>>> patch releases to 2.2, and indeed that one would be coming out early
>>> in Q1. Is this correct?
>>>
>>> If so, then things seem a little messed up right now in 2.2-land.
>>> There already is a branch-2.2.1, but there hasn't been a release. And
>>> branch-2.2 has Maven version 2.2.2-SNAPSHOT. Due to the 2.3 rename a
>>> few weeks ago, it might be that the first patch release for 2.2 needs
>>> to be 2.2.2. But if so, notice these lists of fixes for 2.2.1:
>>>
>>> https://issues.apache.org/jira/browse/YARN/fixforversion/12325667
>>> https://issues.apache.org/jira/browse/HDFS/fixforversion/12325666
>>>
>>> Do these need to have their fix-versions updated?
>>>
>>> Raymie
>>>
>>> P.S. While we're on the subject of point releases, let me check my
>>> assumptions. I assumed that, for release x.y.z, fixes deemed to be
>>> critical bug fixes would be put into branch-x.y as a matter of course.
>>> The Maven release number in branch-x.y would be x.y.(z+1)-SNAPSHOT,
>>> and JIRAs (to be) committed to branch-x.y would have x.y.(z+1) as one
>>> of their fix-versions. When enough fixes have accumulated to warrant a
>>> release, or when a fix comes up that is critical enough to warrant an
>>> immediate release, then branch-x.y is branched to branch-x.y.(z+1),
>>> and a release is made. (As Hadoop itself moves from x.y to x.(y+1) and
>>> then x.(y+2), the threshold for what is considered a critical bug
>>> would naturally rise, as the effort of back-porting goes up.) Do I
>>> have it right?
Will there be 2.2 patch releases?
In discussing YARN-1295 it's become clear that I'm confused about the
outcome of the "Next releases" thread. I had assumed there would be patch
releases to 2.2, and indeed that one would be coming out early in Q1. Is
this correct?

If so, then things seem a little messed up right now in 2.2-land. There
already is a branch-2.2.1, but there hasn't been a release. And branch-2.2
has Maven version 2.2.2-SNAPSHOT. Due to the 2.3 rename a few weeks ago, it
might be that the first patch release for 2.2 needs to be 2.2.2. But if so,
notice these lists of fixes for 2.2.1:

https://issues.apache.org/jira/browse/YARN/fixforversion/12325667
https://issues.apache.org/jira/browse/HDFS/fixforversion/12325666

Do these need to have their fix-versions updated?

Raymie

P.S. While we're on the subject of point releases, let me check my
assumptions. I assumed that, for release x.y.z, fixes deemed to be critical
bug fixes would be put into branch-x.y as a matter of course. The Maven
release number in branch-x.y would be x.y.(z+1)-SNAPSHOT, and JIRAs (to be)
committed to branch-x.y would have x.y.(z+1) as one of their fix-versions.
When enough fixes have accumulated to warrant a release, or when a fix
comes up that is critical enough to warrant an immediate release, then
branch-x.y is branched to branch-x.y.(z+1), and a release is made. (As
Hadoop itself moves from x.y to x.(y+1) and then x.(y+2), the threshold for
what is considered a critical bug would naturally rise, as the effort of
back-porting goes up.) Do I have it right?
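For what it's worth, the x.y.z branching model in the P.S. can be sketched
with plain git. This is a minimal sketch under stated assumptions: the repo
name, the VERSION file (standing in for the Maven version in the POMs), the
commit messages, and HDFS-XXXX are all illustrative placeholders, not
actual Hadoop release tooling:

```shell
set -e

# Throwaway repo standing in for the Hadoop tree, seeded at the 2.2.0 release.
tmp=$(mktemp -d)
git init -q "$tmp/release-demo"
cd "$tmp/release-demo"
git config user.name "Demo" && git config user.email "demo@example.com"
git commit -q --allow-empty -m "2.2.0 release"

# Maintenance line: branch-x.y carries the next patch version as a SNAPSHOT.
git checkout -q -b branch-2.2
echo "2.2.1-SNAPSHOT" > VERSION
git add VERSION && git commit -q -m "branch-2.2 at 2.2.1-SNAPSHOT"

# Critical fixes land on branch-x.y as a matter of course, with
# fix-version x.y.(z+1) on the JIRA (HDFS-XXXX is a placeholder).
git commit -q --allow-empty -m "HDFS-XXXX: critical fix (fix-version 2.2.1)"

# Enough fixes have accumulated: cut the release branch from branch-x.y,
# then bump the maintenance line to the next SNAPSHOT.
git branch branch-2.2.1                   # release branch for 2.2.1
echo "2.2.2-SNAPSHOT" > VERSION
git add VERSION && git commit -q -m "branch-2.2 moves to 2.2.2-SNAPSHOT"

git branch --list 'branch-2.2*'           # lists branch-2.2 and branch-2.2.1
```

This would also reproduce the state described above: branch-2.2.1 exists
(unreleased) while branch-2.2 sits at 2.2.2-SNAPSHOT.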