Re: [VOTE] Release Apache Hadoop 2.4.1
+1

- verified checksum and signature on the source tarball
- verified the CHANGES.txt files
- ran apache-rat:check on the source
- built the source and installed a pseudo-distributed cluster
- successfully ran a few MR sample jobs
- verified HttpFS

Thanks Arun.

On Mon, Jun 16, 2014 at 9:27 AM, Arun C Murthy a...@hortonworks.com wrote:

Folks, I've created a release candidate (rc0) for hadoop-2.4.1 (bug-fix release) that I would like to push out.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc0
The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc0
The maven artifacts are available via repository.apache.org.

Please try the release and vote; the vote will run for the usual 7 days.

thanks, Arun
--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/hdp/

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.

--
Alejandro
[jira] [Created] (HADOOP-10717) Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode.
Dapeng Sun created HADOOP-10717:
---
Summary: Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode.
Key: HADOOP-10717
URL: https://issues.apache.org/jira/browse/HADOOP-10717
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Dapeng Sun
Fix For: 3.0.0

When a user tries to start the NameNode, the following exception is thrown; it is caused by the missing dependency org.mortbay.jetty:jsp-2.1-jetty:jar:6.1.26 in the pom.xml:

14/06/18 14:55:30 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
14/06/18 14:55:30 INFO http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
14/06/18 14:55:30 INFO http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
14/06/18 14:55:30 INFO http.HttpServer2: Jetty bound to port 50070
14/06/18 14:55:30 INFO mortbay.log: jetty-6.1.26
14/06/18 14:55:30 INFO mortbay.log: NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet
14/06/18 14:57:38 WARN mortbay.log: EXCEPTION java.net.ConnectException: Connection timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) at 
java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:529) at java.net.Socket.connect(Socket.java:478) at sun.net.NetworkClient.doConnect(NetworkClient.java:163) at sun.net.www.http.HttpClient.openServer(HttpClient.java:395) at sun.net.www.http.HttpClient.openServer(HttpClient.java:530) at sun.net.www.http.HttpClient.init(HttpClient.java:234) at sun.net.www.http.HttpClient.New(HttpClient.java:307) at sun.net.www.http.HttpClient.New(HttpClient.java:324) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:677) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282) at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1194) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1090) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1003) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648) at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140) at 
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737) at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119) at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205) at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522) at javax.xml.parsers.SAXParser.parse(SAXParser.java:395) at
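The fix described in the report amounts to restoring the missing Jetty JSP artifact to the build. A sketch of the kind of dependency declaration involved — the coordinates are taken verbatim from the report, but the exact module and placement within Hadoop's POM hierarchy are assumptions:

```xml
<!-- Hypothetical sketch: re-adding the JSP support artifact named in the report.
     Exact module and placement (e.g. under <dependencyManagement>) are assumptions. -->
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jsp-2.1-jetty</artifactId>
  <version>6.1.26</version>
</dependency>
```

With this artifact on the classpath, Jetty 6 can locate org.apache.jasper.servlet.JspServlet and the "NO JSP Support for /" warning should go away.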
[jira] [Created] (HADOOP-10718) IOException: An existing connection was forcibly closed by the remote host frequently happens on Windows
Zhijie Shen created HADOOP-10718:
---
Summary: IOException: An existing connection was forcibly closed by the remote host frequently happens on Windows
Key: HADOOP-10718
URL: https://issues.apache.org/jira/browse/HADOOP-10718
Project: Hadoop Common
Issue Type: Bug
Components: ipc
Reporter: Zhijie Shen

After HADOOP-317, we still observed a number of "IOException: An existing connection was forcibly closed by the remote host" errors on the Windows platform when running an MR job. For example,

{code}
2014-06-09 09:11:40,675 INFO [Socket Reader #3 for port 59622] org.apache.hadoop.ipc.Server: Socket Reader #3 for port 59622: readAndProcess from client 10.215.30.53 threw exception [java.io.IOException: An existing connection was forcibly closed by the remote host]
java.io.IOException: An existing connection was forcibly closed by the remote host at sun.nio.ch.SocketDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225) at sun.nio.ch.IOUtil.read(IOUtil.java:198) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359) at org.apache.hadoop.ipc.Server.channelRead(Server.java:2558) at org.apache.hadoop.ipc.Server.access$2800(Server.java:130) at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1459) at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:750) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:624) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:595)
{code}

{code}
2014-06-09 09:15:38,539 WARN [main] org.apache.hadoop.mapred.Task: Failure sending commit pending: java.io.IOException: Failed on local exception: java.io.IOException: An existing connection was forcibly closed by the remote host; Host Details : local host is: sdevin-clster53/10.215.16.72; destination host is: sdevin-clster54:63415; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at 
org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231) at com.sun.proxy.$Proxy9.commitPending(Unknown Source) at org.apache.hadoop.mapred.Task.done(Task.java:1006) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:397) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host at sun.nio.ch.SocketDispatcher.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225) at sun.nio.ch.IOUtil.read(IOUtil.java:198) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359) at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.FilterInputStream.read(FilterInputStream.java:133) at java.io.FilterInputStream.read(FilterInputStream.java:133) at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:510) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read(BufferedInputStream.java:254) at java.io.DataInputStream.readInt(DataInputStream.java:387) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1054) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:949) {code} And the latter one results in the issue of MAPREDUCE-5924. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Edit permission to Hadoop Wiki page
Thanks Harsh and Steve. I verified that I can edit the page.
--
Asokan

On 06/18/2014 12:57 AM, Harsh J wrote:

Hi,

You should be able to edit pages on the wiki.apache.org/hadoop wiki as your username's in there (thanks Steve!). Are you unable to? Let us know.

On Tue, Jun 17, 2014 at 1:55 AM, Asokan, M maso...@syncsort.com wrote:

I would like to update the page http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support with my company's Hadoop-related offerings. My Wiki user id is: masokan. Can someone point out how I can get edit permission? Thanks in advance.
--
Asokan

ATTENTION: - The information contained in this message (including any files transmitted with this message) may contain proprietary, trade secret or other confidential and/or legally privileged information. Any pricing information contained in this message or in any files transmitted with this message is always confidential and cannot be shared with any third parties without prior written approval from Syncsort. This message is intended to be read only by the individual or entity to whom it is addressed or by their designee. If the reader of this message is not the intended recipient, you are on notice that any use, disclosure, copying or distribution of this message, in any form, is strictly prohibited. If you have received this message in error, please immediately notify the sender and/or Syncsort and destroy all copies of this message in your possession, custody or control.
Re: Plans of moving towards JDK7 in trunk
I think we should come up with a plan for when the next Hadoop release will drop support for JDK6. We all know that day needs to come... the only question is when. I agree that writing the JDK7-only code doesn't seem very productive unless we have a plan for when it will be released and usable.

best,
Colin

On Tue, Jun 17, 2014 at 10:08 PM, Andrew Wang andrew.w...@cloudera.com wrote:

Reviving this thread, I noticed there's been a patch and +1 on HADOOP-10530, and I don't think we actually reached a conclusion. I (and others) have expressed concerns about moving to JDK7 for trunk. Summarizing a few points:

- We can't move to JDK7 in branch-2 because of compatibility
- branch-2 is currently the only Hadoop release vehicle; there are no plans for a trunk-based Hadoop 3
- Introducing JDK7-only APIs in trunk will increase divergence with branch-2 and make backports harder
- Almost all developers care only about branch-2, since it is the only release vehicle

With this in mind, I struggle to see any upsides to introducing JDK7-only APIs to trunk. Please let's not do anything on HADOOP-10530 or related until we agree on this.

Thanks,
Andrew

On Mon, Apr 14, 2014 at 3:31 PM, Steve Loughran ste...@hortonworks.com wrote:

On 14 April 2014 17:46, Andrew Purtell apurt...@apache.org wrote:

How well is trunk tested? Does anyone deploy it with real applications running on top? When will the trunk codebase next be the basis for a production release? An impromptu diff of hadoop-common trunk against branch-2 as of today is 38,625 lines. Can they be said to be the same animal? I ask because any disincentive toward putting code in trunk is beside the point, if the only target worth pursuing today is branch-2 - unless one doesn't care if the code is released for production use. Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter for the vast majority of Hadoopers if talking about branch-2.

I think it's partly a timescale issue; it's also because the 1-2 transition was so significant, especially at the YARN layer, that it's still taking time to trickle through. If you do want code to ship this year, branch-2 is where you are going to try and get it in - and like you say, that's where things get tried in the field. At the same time, the constraints of stability are holding us back already. I don't see why we should have another such major 1-2 transition in the future; at the rate Arun is pushing out 2.x releases, it's almost back to the 0.1x timescale - though at that point most people were fending for themselves and expectations of stability were less. We do want smaller version increments in future, which branch-2 is, mostly, delivering. While Java 7 doesn't have some must-have features, Java 8 is a significant improvement in the language, and we should be looking ahead to that, maybe even doing some leading-edge work on the side, so the same discussion doesn't come up in two years' time when Java 7 goes EOL.

-steve (personal opinions only, etc.)

On Mon, Apr 14, 2014 at 9:22 AM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

I think the bottom line here is that as long as our stable release uses JDK6, there is going to be a very, very strong disincentive to put any code which can't run on JDK6 into trunk. Like I said earlier, the traditional reason for putting something in trunk but not the stable release is that it needs more testing. If a stable release that drops support for JDK6 is more than a year away, does it make sense to put anything in trunk like that? What might need more than a year of testing? Certainly not changes to LocalFileSystem to use the new APIs. I also don't think an upgrade to various libraries qualifies. It might be best to shelve this for now, like we've done in the past, until we're ready to talk about a stable release that requires JDK7+. At least that's my feeling.

If we're really desperate for the new file APIs JDK7 provides, we could consider using loadable modules for it in branch-2. This is similar to how we provide JNI versions of certain things on certain platforms, without dropping support for the other platforms.

best,
Colin

On Sun, Apr 13, 2014 at 10:39 AM, Raymie Stata rst...@altiscale.com wrote:

There's an outstanding question addressed to me: "Are there particular features or new dependencies that you would like to contribute (or see contributed) that require using the Java 1.7 APIs?" The question misses the point: We'd figure out how to write something we wanted to contribute to Hadoop against the APIs of Java4 if that's what it took to get them into a stable release. And at current course and speed, that's how ridiculous things could get. To summarize, it seems like there's a vague consensus that it might be okay to eventually allow the use of Java7 in trunk, but there's no
[jira] [Created] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
Alejandro Abdelnur created HADOOP-10719:
---
Summary: Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
Key: HADOOP-10719
URL: https://issues.apache.org/jira/browse/HADOOP-10719
Project: Hadoop Common
Issue Type: Improvement
Components: security
Affects Versions: 3.0.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur

This is a follow up on [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]

KeyProvider API should have 2 new methods:

* KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
* KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion encryptedKey)

The implementation would do a known transformation on the IV (i.e.: xor with 0xff the original IV).
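The "known transformation on the IV" mentioned in the issue — XOR-ing each byte of the original IV with 0xff — can be sketched as below. The class and method names are illustrative only, not the actual KeyProvider API:

```java
// Hypothetical sketch of the IV transformation described in HADOOP-10719:
// derive the encryption IV by XOR-ing every byte of the original IV with 0xff.
// IvDemo/flipIV are invented names for illustration.
public class IvDemo {

    /** Returns a new IV with every byte XOR-ed with 0xff; the input is not modified. */
    static byte[] flipIV(byte[] iv) {
        byte[] out = new byte[iv.length];
        for (int i = 0; i < iv.length; i++) {
            out[i] = (byte) (iv[i] ^ 0xff);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] iv = {0x00, 0x0f, (byte) 0xff};
        byte[] derived = flipIV(iv);
        // mask with 0xff so negative bytes print as two hex digits
        System.out.printf("%02x %02x %02x%n",
                derived[0] & 0xff, derived[1] & 0xff, derived[2] & 0xff);
        // prints: ff f0 00
    }
}
```

Since the transformation is fixed and public, it ties the encrypted key's IV deterministically to the original IV without a second round trip to the provider.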
Re: Plans of moving towards JDK7 in trunk
I also think we need to recognise that it's been three months since that last discussion, and Java 6 has not suddenly burst back into popularity:

- nobody providing commercial support for Hadoop is offering branch-2 support on Java 6 AFAIK
- therefore, nobody is testing it at scale except privately, and they aren't reporting bugs if they are
- if someone actually did file a bug on something on branch-2 which didn't work on Java 6 but went away on Java 7+, we'd probably close it as a WORKSFORME

Whether we acknowledge it or not, Hadoop 2.x is now really Java 7+. We do all agree that Hadoop 3 will not be Java 6, so the only issue is when and how to make that transition. That patch of mine just makes it possible to do today.

I have actually jumped to Java 7 in the slider project, and have actually been using Java 8 and Twill; the new language features there are significant and would be great to use in Hadoop *at some point in the future*. For Java 7 though, based on that experience, the language changes are convenient but not essential:

- try-with-resources simply swallows close failures without the log integration we have with IOUtils.closeStream(), so shouldn't be used in hadoop core anyway
- string-based switching: convenient, but not critical
- type inference on template constructors: modern IDEs handle the pain anyway

The only feature I like is multi-catch and typed rethrow:

  catch (IOException | ExitException e) {
    log.warn(e.toString());
    throw e;
  }

This would make e look like Exception, but when rethrown it goes back to its original type. This reduces duplicate work, and is the bit I actually value. Is it enough to justify making code incompatible across branches? No.

So I'm going to propose this, and would like to start a vote on it soon:

1. we parameterize java versions in the POMs on all branches, with separate JDK versions and Java language
2. branch-2: java-6-language and JDK-6 minimum JDK
3. trunk: java-6-language and JDK-7 minimum JDK

This would guarantee that none of the Java 7 language features went in, but we could move trunk up to Java 7+ only libraries (jersey, guava). Adopting JDK7 features then becomes no more different from adopting Java 7+ libraries: those bits of code that have moved can't be backported.

-Steve

On 17 June 2014 22:08, Andrew Wang andrew.w...@cloudera.com wrote:

Reviving this thread, I noticed there's been a patch and +1 on HADOOP-10530, and I don't think we actually reached a conclusion. I (and others) have expressed concerns about moving to JDK7 for trunk. Summarizing a few points:

- We can't move to JDK7 in branch-2 because of compatibility
- branch-2 is currently the only Hadoop release vehicle; there are no plans for a trunk-based Hadoop 3
- Introducing JDK7-only APIs in trunk will increase divergence with branch-2 and make backports harder
- Almost all developers care only about branch-2, since it is the only release vehicle

With this in mind, I struggle to see any upsides to introducing JDK7-only APIs to trunk. Please let's not do anything on HADOOP-10530 or related until we agree on this.

Thanks,
Andrew

On Mon, Apr 14, 2014 at 3:31 PM, Steve Loughran ste...@hortonworks.com wrote:

On 14 April 2014 17:46, Andrew Purtell apurt...@apache.org wrote:

How well is trunk tested? Does anyone deploy it with real applications running on top? When will the trunk codebase next be the basis for a production release? An impromptu diff of hadoop-common trunk against branch-2 as of today is 38,625 lines. Can they be said to be the same animal? I ask because any disincentive toward putting code in trunk is beside the point, if the only target worth pursuing today is branch-2 - unless one doesn't care if the code is released for production use. Questions on whither JDK6 or JDK7+ (or JRE6 versus JRE7+) only matter for the vast majority of Hadoopers if talking about branch-2.
I think it's partly a timescale issue; it's also because the 1-2 transition was so significant, especially at the YARN layer, that it's still taking time to trickle through. If you do want code to ship this year, branch-2 is where you are going to try and get it in - and like you say, that's where things get tried in the field. At the same time, the constraints of stability are holding us back already. I don't see why we should have another such major 1-2 transition in the future; at the rate Arun is pushing out 2.x releases, it's almost back to the 0.1x timescale - though at that point most people were fending for themselves and expectations of stability were less. We do want smaller version increments in future, which branch-2 is, mostly, delivering. While Java 7 doesn't have some must-have features, Java 8 is a significant improvement in the language, and we should be looking ahead to that, maybe even doing some leading-edge work on the side, so the same
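The multi-catch and precise-rethrow idiom Steve praises above can be completed into a compilable sketch. ExitException is a stand-in class here (the original snippet does not define it), and the logger is replaced with stderr:

```java
import java.io.IOException;

// Compilable sketch of Java 7 multi-catch with typed rethrow.
// ExitException is an invented stand-in; real code would use a logging framework.
public class MultiCatchDemo {

    static class ExitException extends RuntimeException {
        ExitException(String msg) { super(msg); }
    }

    // Although e is statically typed to the common supertype inside the catch,
    // Java 7's "precise rethrow" lets the method declare only IOException
    // (ExitException is unchecked), instead of widening to plain Exception.
    static void run() throws IOException {
        try {
            doWork();
        } catch (IOException | ExitException e) {
            System.err.println(e.toString()); // log once, no duplicated catch blocks
            throw e;                          // rethrown with its original type
        }
    }

    static void doWork() throws IOException {
        throw new IOException("simulated failure");
    }

    public static void main(String[] args) {
        try {
            run();
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Under Java 6 this requires either two identical catch blocks or `catch (Exception e)` plus an unsafe rethrow, which is exactly the duplication the idiom removes.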
Re: [VOTE] Release Apache Hadoop 2.4.1
There is one item [MAPREDUCE-5830 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3] marked for 2.4. Should we include it?

There is no patch there yet, and it doesn't really help much other than letting older clients compile; even if we put the API back in, the URL returned is invalid.

+Vinod

On Jun 16, 2014, at 9:27 AM, Arun C Murthy a...@hortonworks.com wrote:

Folks, I've created a release candidate (rc0) for hadoop-2.4.1 (bug-fix release) that I would like to push out.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc0
The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc0
The maven artifacts are available via repository.apache.org.

Please try the release and vote; the vote will run for the usual 7 days.

thanks, Arun
--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/hdp/
Re: Plans of moving towards JDK7 in trunk
Actually, a lot of our customers are still on JDK6, so if anything, its popularity hasn't significantly decreased. We still test and support JDK6 for CDH4 and CDH5. The claim that branch-2 is effectively JDK7 because no one supports JDK6 is untrue.

One issue with your proposal is that Java 7+ libraries can have incompatible APIs compared to their Java 6 versions. Guava moves very quickly with regard to the deprecate+remove cycle. This means branch-2 and trunk divergence, as we're stuck using different Guava APIs to do the same thing.

No one's arguing against moving to Java 7+ in trunk eventually, but there isn't a clear plan for a trunk-based release. I don't see any point to switching trunk over until that's true, for the aforementioned reasons.

Best,
Andrew

On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran ste...@hortonworks.com wrote:

I also think we need to recognise that it's been three months since that last discussion, and Java 6 has not suddenly burst back into popularity: nobody providing commercial support for Hadoop is offering branch-2 support on Java 6 AFAIK; therefore, nobody is testing it at scale except privately, and they aren't reporting bugs if they are; if someone actually did file a bug on something on branch-2 which didn't work on Java 6 but went away on Java 7+, we'd probably close it as a WORKSFORME. Whether we acknowledge it or not, Hadoop 2.x is now really Java 7+. We do all agree that Hadoop 3 will not be Java 6, so the only issue is when and how to make that transition. That patch of mine just makes it possible to do today.
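For concreteness, Steve's earlier proposal to parameterize the Java language level separately from the minimum JDK could look roughly like the following POM sketch. The property names are invented for illustration; Hadoop's actual POMs may structure this differently:

```xml
<!-- Hypothetical sketch only: property names are invented for illustration.
     branch-2 would set both to 1.6; trunk would keep the 1.6 language level
     but require a 1.7 JDK, per the proposal in this thread. -->
<properties>
  <javac.source.version>1.6</javac.source.version>
  <min.jdk.version>1.7</min.jdk.version>
</properties>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>${javac.source.version}</source>
        <target>${javac.source.version}</target>
      </configuration>
    </plugin>
  </plugins>
</build>
```

Keeping `source`/`target` at 1.6 while building on a 1.7 JDK is what would let trunk adopt Java 7+ only libraries without letting Java 7 language features in.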
[jira] [Resolved] (HADOOP-10358) libhadoop doesn't compile on Mac OS X
[ https://issues.apache.org/jira/browse/HADOOP-10358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ilya Maykov resolved HADOOP-10358.
--
Resolution: Duplicate

Resolving dupe.

libhadoop doesn't compile on Mac OS X
-
Key: HADOOP-10358
URL: https://issues.apache.org/jira/browse/HADOOP-10358
Project: Hadoop Common
Issue Type: Improvement
Components: native
Environment: Mac OS X 10.8.5, Oracle JDK 1.7.0_51
Reporter: Ilya Maykov
Priority: Minor
Attachments: HADOOP-10358-fix-hadoop-common-native-on-os-x.patch

The native component of hadoop-common (libhadoop.so on Linux, libhadoop.dylib on Mac) fails to compile on Mac OS X. The problem is in hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsNetgroupMapping.c at lines 76-78:

[exec] /Users/ilyam/src/github/apache/hadoop-common/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/JniBasedUnixGroupsNetgroupMapping.c:77:26: error: invalid operands to binary expression ('void' and 'int')
[exec] if(setnetgrent(cgroup) == 1) {
[exec] ~~~ ^ ~

There are two problems in the code:

1) The #ifndef guard only checks for __FreeBSD__ but should check for either one of __FreeBSD__ or __APPLE__. This is because Mac OS X inherits its syscalls from FreeBSD rather than Linux, and thus the setnetgrent() syscall returns void.

2) setnetgrentCalledFlag = 1 is set outside the #ifndef guard, but the syscall is only invoked inside the guard. This means that on FreeBSD, endnetgrent() can be called in the cleanup code without a corresponding setnetgrent() invocation.

I have a patch that fixes both issues (will attach in a bit). With this patch, I'm able to compile libhadoop.dylib on Mac OS X, which in turn lets me install native snappy, lzo, etc. compressor libraries on my client. That lets me run commands like 'hadoop fs -text somefile.lzo' from the MacBook rather than having to ssh to a Linux box, etc.

Note that this patch only fixes the native build of hadoop-common-project. Some other components of Hadoop still fail to build their native components, but libhadoop.dylib is enough for the client.
Re: Plans of moving towards JDK7 in trunk
On 18 June 2014 12:32, Andrew Wang andrew.w...@cloudera.com wrote:

Actually, a lot of our customers are still on JDK6, so if anything, its popularity hasn't significantly decreased. We still test and support JDK6 for CDH4 and CDH5. The claim that branch-2 is effectively JDK7 because no one supports JDK6 is untrue.

Really? I was misinformed:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Requirements-and-Supported-Versions/cdhrsv_jdk.html
RE: Plans of moving towards JDK7 in trunk
Andrew, “I don't see any point to switching” is an interesting perspective, given the well-known risks of running unsafe software. Clearly customer best interest is stability. JDK6 is in a known unsafe state. The longer anyone delays the necessary transition to safety the longer the door is left open to predictable disaster. You also said we still test and support JDK6. I searched but have not been able to find Cloudera critical security fixes for JDK6. Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In other words, did you release to your customers any kind of public alert or warning of this CVSS 10.0 event as part of your JDK6 support? http://www.cvedetails.com/cve/CVE-2013-2465/ If you are not releasing your own security fixes for JDK6 post-EOL would it perhaps be safer to say Cloudera is hands-off; neither supports, nor opposes the known insecure and deprecated/unpatched JDK? I mentioned before in this thread the Oracle support timeline: - official public EOL (end of life) was more than a year ago - premier support ended more than six months ago - extended support may get critical security fixes until the end of 2016 Given this timeline, does Cloudera officially take responsibility for Hadoop customer safety? Are you going to be releasing critical security fixes to a known unsafe JDK? Davi -Original Message- From: Andrew Wang [mailto:andrew.w...@cloudera.com] Sent: Wednesday, June 18, 2014 12:33 PM To: common-dev@hadoop.apache.org Subject: Re: Plans of moving towards JDK7 in trunk Actually, a lot of our customers are still on JDK6, so if anything, its popularity hasn't significantly decreased. We still test and support JDK6 for CDH4 and CDH5. The claim that branch-2 is effectively JDK7 because no one supports JDK6 is untrue. One issue with your proposal is that java 7+ libraries can have incompatible APIs compared to their java 6 versions. Guava moves very quickly with regard to the deprecate+remove cycle. 
This means branch-2 and trunk divergence, as we're stuck using different Guava APIs to do the same thing. No one's arguing against moving to Java 7+ in trunk eventually, but there isn't a clear plan for a trunk-based release. I don't see any point to switching trunk over until that's true, for the aforementioned reasons.

Best, Andrew

On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran ste...@hortonworks.com wrote:

I also think we need to recognise that it's been three months since that last discussion, and Java 6 has not suddenly burst back into popularity:
- nobody providing commercial support for Hadoop is offering branch-2 support on Java 6 AFAIK
- therefore, nobody is testing it at scale except privately, and they aren't reporting bugs if they are
- if someone actually did file a bug on something on branch-2 which didn't work on Java 6 but went away on Java 7+, we'd probably close it as a WORKSFORME

Whether we acknowledge it or not, Hadoop 2.x is now really Java 7+. We do all agree that Hadoop 3 will not be Java 6, so the only issue is when and how to make that transition. That patch of mine just makes it possible to do today.

I have actually jumped to Java 7 in the Slider project, and am actually using Java 8 and Twill; the new language features there are significant and would be great to use in Hadoop *at some point in the future*. For Java 7 though, based on that experience, the language changes are convenient but not essential:
- try-with-resources simply swallows close failures without the log integration we have with IOUtils.closeStream(), so it shouldn't be used in Hadoop core anyway
- string-based switching: convenient, but not critical
- type inference on template constructors: modern IDEs handle the pain anyway

The only feature I like is multi-catch and typed rethrow:

catch (IOException | ExitException e) {
  log.warn(e.toString());
  throw e;
}

This would make e look like Exception, but when rethrown it goes back to its original type. This reduces duplicate work, and is the bit I actually value. Is it enough to justify making code incompatible across branches? No.

So I'm going to propose this, and would like to start a vote on it soon:
1. we parameterize Java versions in the POMs on all branches, with separate JDK versions and Java language levels
2. branch-2: Java 6 language and JDK 6 minimum JDK
3. trunk: Java 6 language and JDK 7 minimum JDK

This would guarantee that none of the Java 7 language features went in, but we could move trunk up to Java 7+-only libraries (jersey, guava). Adopting JDK7 features then becomes no different from adopting Java 7+ libraries: those bits of code that have moved can't be
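The multi-catch and typed-rethrow behaviour described above can be seen in a minimal, self-contained sketch. ExitException here is a hypothetical checked exception invented for the example (the real one isn't shown in the thread), and the logger call is replaced by System.out so the sketch runs standalone:

```java
import java.io.IOException;

public class MultiCatchDemo {
    // Hypothetical stand-in for the ExitException mentioned in the thread.
    // It is deliberately NOT a subclass of IOException: multi-catch
    // alternatives must not be in a subtype relationship.
    static class ExitException extends Exception {
        ExitException(String msg) { super(msg); }
    }

    // Java 7 "precise rethrow": inside the catch block 'e' is typed as the
    // common supertype, but the compiler knows only IOException or
    // ExitException can reach it, so the method may declare exactly those
    // two types instead of a broad 'throws Exception'.
    static void run(boolean exit) throws IOException, ExitException {
        try {
            if (exit) throw new ExitException("exit requested");
            throw new IOException("plain I/O failure");
        } catch (IOException | ExitException e) {
            System.out.println("warn: " + e.getMessage());
            throw e; // rethrown with its original static type preserved
        }
    }

    public static void main(String[] args) {
        try {
            run(true);
        } catch (ExitException e) {
            System.out.println("caught ExitException: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("caught IOException: " + e.getMessage());
        }
    }
}
```

Without multi-catch, the two exception types would each need their own catch block with duplicated log-and-rethrow code, which is exactly the duplication the feature removes.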
[jira] [Resolved] (HADOOP-10717) Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode.
[ https://issues.apache.org/jira/browse/HADOOP-10717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HADOOP-10717. - Resolution: Invalid Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode. - Key: HADOOP-10717 URL: https://issues.apache.org/jira/browse/HADOOP-10717 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0 Reporter: Dapeng Sun Assignee: Dapeng Sun Priority: Blocker Fix For: 3.0.0 Attachments: HADOOP-10717.patch When user want to start NameNode, user would got the following exception, it is caused by missing org.mortbay.jetty:jsp-2.1-jetty:jar:6.1.26 in the pom.xml 14/06/18 14:55:30 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs 14/06/18 14:55:30 INFO http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter) 14/06/18 14:55:30 INFO http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/* 14/06/18 14:55:30 INFO http.HttpServer2: Jetty bound to port 50070 14/06/18 14:55:30 INFO mortbay.log: jetty-6.1.26 14/06/18 14:55:30 INFO mortbay.log: NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet 14/06/18 14:57:38 WARN mortbay.log: EXCEPTION java.net.ConnectException: Connection timed out at 
java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:529) at java.net.Socket.connect(Socket.java:478) at sun.net.NetworkClient.doConnect(NetworkClient.java:163) at sun.net.www.http.HttpClient.openServer(HttpClient.java:395) at sun.net.www.http.HttpClient.openServer(HttpClient.java:530) at sun.net.www.http.HttpClient.init(HttpClient.java:234) at sun.net.www.http.HttpClient.New(HttpClient.java:307) at sun.net.www.http.HttpClient.New(HttpClient.java:324) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:677) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282) at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1194) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1090) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1003) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648) at 
com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140) at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511) at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808) at
[jira] [Reopened] (HADOOP-10717) Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode.
[ https://issues.apache.org/jira/browse/HADOOP-10717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai reopened HADOOP-10717: - Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode. - Key: HADOOP-10717 URL: https://issues.apache.org/jira/browse/HADOOP-10717 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0 Reporter: Dapeng Sun Assignee: Dapeng Sun Priority: Blocker Fix For: 3.0.0 Attachments: HADOOP-10717.patch When user want to start NameNode, user would got the following exception, it is caused by missing org.mortbay.jetty:jsp-2.1-jetty:jar:6.1.26 in the pom.xml 14/06/18 14:55:30 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs 14/06/18 14:55:30 INFO http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter) 14/06/18 14:55:30 INFO http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/* 14/06/18 14:55:30 INFO http.HttpServer2: Jetty bound to port 50070 14/06/18 14:55:30 INFO mortbay.log: jetty-6.1.26 14/06/18 14:55:30 INFO mortbay.log: NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet 14/06/18 14:57:38 WARN mortbay.log: EXCEPTION java.net.ConnectException: Connection timed out at 
[jira] [Created] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
Alejandro Abdelnur created HADOOP-10720: --- Summary: KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API Key: HADOOP-10720 URL: https://issues.apache.org/jira/browse/HADOOP-10720 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur KMS client/server should implement support for generating encrypted keys and decrypting them via the REST API being introduced by HADOOP-10719. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Plans of moving towards JDK7 in trunk
Most of the security problems in Java are sandbox jailbreaking and not relevant. Anything related to Kerberos, HTTPS or other in-cluster security issues would be a different story...I haven't heard anything. It's a different matter client-side, but anyone who enables Java in their web browsers is doomed already. Java security issues may matter developer-side, as if you really want to support Java 6, you need a Java 6 JVM to hand. There's a risk there...but if you run an OS X box, Apple keeps them around for you even after you upgrade (try /usr/libexec/java_home -V to see this).

On 18 June 2014 13:41, Sandy Ryza sandy.r...@cloudera.com wrote: We do release warnings when we are aware of vulnerabilities in our dependencies. However, unless I'm grossly misunderstanding, the vulnerability that you point out is not a vulnerability within the context of our software. Hadoop doesn't try to sandbox within JVMs. In a secure setup, any JVM running non-trusted user code is running as that user, so breaking out doesn't offer the ability to do anything malicious. -Sandy

On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi davi.ottenhei...@emc.com wrote: Andrew, “I don't see any point to switching” is an interesting perspective, given the well-known risks of running unsafe software. Clearly customer best interest is stability. JDK6 is in a known unsafe state. The longer anyone delays the necessary transition to safety the longer the door is left open to predictable disaster. You also said we still test and support JDK6. I searched but have not been able to find Cloudera critical security fixes for JDK6. Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In other words, did you release to your customers any kind of public alert or warning of this CVSS 10.0 event as part of your JDK6 support?
http://www.cvedetails.com/cve/CVE-2013-2465/ If you are not releasing your own security fixes for JDK6 post-EOL would it perhaps be safer to say Cloudera is hands-off; neither supports, nor opposes the known insecure and deprecated/unpatched JDK? I mentioned before in this thread the Oracle support timeline: - official public EOL (end of life) was more than a year ago - premier support ended more than six months ago - extended support may get critical security fixes until the end of 2016 Given this timeline, does Cloudera officially take responsibility for Hadoop customer safety? Are you going to be releasing critical security fixes to a known unsafe JDK? Davi -Original Message- From: Andrew Wang [mailto:andrew.w...@cloudera.com] Sent: Wednesday, June 18, 2014 12:33 PM To: common-dev@hadoop.apache.org Subject: Re: Plans of moving towards JDK7 in trunk Actually, a lot of our customers are still on JDK6, so if anything, its popularity hasn't significantly decreased. We still test and support JDK6 for CDH4 and CDH5. The claim that branch-2 is effectively JDK7 because no one supports JDK6 is untrue. One issue with your proposal is that java 7+ libraries can have incompatible APIs compared to their java 6 versions. Guava moves very quickly with regard to the deprecate+remove cycle. This means branch-2 and trunk divergence, as we're stuck using different Guava APIs to do the same thing. No one's arguing against moving to Java 7+ in trunk eventually, but there isn't a clear plan for a trunk-based release. I don't see any point to switching trunk over until that's true, for the aforementioned reasons. 
Best, Andrew On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran ste...@hortonworks.commailto:ste...@hortonworks.com wrote: I also think we need to recognise that its been three months since that last discussion, and Java 6 has not suddenly burst back into popularity - nobody providing commercial support for Hadoop is offering branch-2 support on Java 6 AFAIK - therefore, nobody is testing it at scale except privately, and they aren't reporting bugs if they are - if someone actually did file a bug on something on branch-2 which didn't work on Java 6 but went away on Java7+, we'd probably close it as a WORKSFORME whether we acknowledge it or not, Hadoop 2.x is now really Java 7+. We do all agree that hadoop 3 will not be java 6, so the only issue is when and how to make that transition. That patch of mine just makes it possible to do today. I have actually jumped to Java7 in the slider project, and actually being using Java 8 and twill; the new language features there are significant and would be great to use in Hadoop *at some point in the future* For
Re: [VOTE] Release Apache Hadoop 2.4.1
I think we should fix this one that will help older clients 2.2/2.3 not to be updated if not absolutely required. Thanks, Mayank

On Wed, Jun 18, 2014 at 12:13 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: There is one item [MAPREDUCE-5830 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3] marked for 2.4. Should we include it? There is no patch there yet, it doesn't really help much other than letting older clients compile - even if we put the API back in, the URL returned is invalid. +Vinod

On Jun 16, 2014, at 9:27 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.4.1 (bug-fix release) that I would like to push out. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc0 The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/hdp/
Re: Plans of moving towards JDK7 in trunk
In CDH5, Cloudera encourages people to use JDK7. JDK6 has been EOL for a while now and is not something we recommend. As we discussed before, everyone is in favor of upgrading to JDK7. Every cluster operator of a reasonably modern Hadoop should do it, whatever distro or release you run. As developers, we run JDK7 as well. I'd just like to see a plan for when branch-2 (or some other branch) will create a stable release that drops support for JDK1.6. If we don't have such a plan, I feel like it's too early to talk about this stuff. If we drop support for 1.6 in trunk but not in branch-2, we are fragmenting the project. People will start writing unreleasable code (because it doesn't work on branch-2) and we'll be back to the bad old days of Hadoop version fragmentation that branch-2 was intended to solve. Backports will become harder. The biggest problem is that trunk will start to depend on libraries or Maven plugins that branch-2 can't even use, because they're JDK7+-only. Steve wrote: "if someone actually did file a bug on something on branch-2 which didn't work on Java 6 but went away on Java7+, we'd probably close it as a WORKSFORME." Steve, if this is true, we should just bump the minimum supported version for branch-2 to 1.7 today and resolve this. If we truly believe that there are no issues here, then let's just decide to drop 1.6 in a specific future release of Hadoop 2. If there are issues with releasing JDK1.7+ only code, then let's figure out what they are before proceeding. best, Colin

On Wed, Jun 18, 2014 at 1:41 PM, Sandy Ryza sandy.r...@cloudera.com wrote: We do release warnings when we are aware of vulnerabilities in our dependencies. However, unless I'm grossly misunderstanding, the vulnerability that you point out is not a vulnerability within the context of our software. Hadoop doesn't try to sandbox within JVMs.
In a secure setup, any JVM running non-trusted user code is running as that user, so breaking out doesn't offer the ability to do anything malicious. -Sandy On Wed, Jun 18, 2014 at 1:30 PM, Ottenheimer, Davi davi.ottenhei...@emc.com wrote: Andrew, “I don't see any point to switching” is an interesting perspective, given the well-known risks of running unsafe software. Clearly customer best interest is stability. JDK6 is in a known unsafe state. The longer anyone delays the necessary transition to safety the longer the door is left open to predictable disaster. You also said we still test and support JDK6. I searched but have not been able to find Cloudera critical security fixes for JDK6. Can you clarify, for example, Java 6 Update 51 for CVE-2013-2465? In other words, did you release to your customers any kind of public alert or warning of this CVSS 10.0 event as part of your JDK6 support? http://www.cvedetails.com/cve/CVE-2013-2465/ If you are not releasing your own security fixes for JDK6 post-EOL would it perhaps be safer to say Cloudera is hands-off; neither supports, nor opposes the known insecure and deprecated/unpatched JDK? I mentioned before in this thread the Oracle support timeline: - official public EOL (end of life) was more than a year ago - premier support ended more than six months ago - extended support may get critical security fixes until the end of 2016 Given this timeline, does Cloudera officially take responsibility for Hadoop customer safety? Are you going to be releasing critical security fixes to a known unsafe JDK? Davi -Original Message- From: Andrew Wang [mailto:andrew.w...@cloudera.com] Sent: Wednesday, June 18, 2014 12:33 PM To: common-dev@hadoop.apache.org Subject: Re: Plans of moving towards JDK7 in trunk Actually, a lot of our customers are still on JDK6, so if anything, its popularity hasn't significantly decreased. We still test and support JDK6 for CDH4 and CDH5. 
The claim that branch-2 is effectively JDK7 because no one supports JDK6 is untrue. One issue with your proposal is that java 7+ libraries can have incompatible APIs compared to their java 6 versions. Guava moves very quickly with regard to the deprecate+remove cycle. This means branch-2 and trunk divergence, as we're stuck using different Guava APIs to do the same thing. No one's arguing against moving to Java 7+ in trunk eventually, but there isn't a clear plan for a trunk-based release. I don't see any point to switching trunk over until that's true, for the aforementioned reasons. Best, Andrew On Wed, Jun 18, 2014 at 12:08 PM, Steve Loughran ste...@hortonworks.commailto:ste...@hortonworks.com wrote: I also think we need to recognise that its been three months since that last discussion, and Java 6 has not suddenly burst back into popularity - nobody providing commercial support for Hadoop is offering branch-2 support on Java 6 AFAIK - therefore, nobody
Renewable Ticket using Keytab through JAAS API
Hi,

Checking if you have come across the same problem while implementing security in Hadoop, specifically auto ticket renewal. I am using a keytab file with the below JAAS configuration:

com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
useTicketCache=true
keyTab=xyz.keytab
storeKey=true
principal=user/xyz.com

This configuration works only if kinit is called beforehand and the ticket is present in the cache. I check the renewable-ticket condition using the JAAS API, and it works. Now if I modify the JAAS configuration not to use the ticket cache, i.e. by setting useTicketCache=false, then without calling kinit and using just the keytab, the renewable flag fails to be set, although I am able to get the ticket authenticated by Kerberos using the JAAS API. Below is the JAAS configuration:

com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
useTicketCache=false
keyTab=xyz.keytab
storeKey=true
principal=user/xyz.com

Please let me know how to use a keytab with the JAAS API, bypassing the kinit command, so that the renewable ticket flag is set.

Thanks, Raghav
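One thing worth checking (an assumption on my part, not something confirmed in this thread): the Krb5LoginModule options control only how credentials are obtained and cached; whether the TGT comes back with the RENEWABLE flag is negotiated with the KDC, typically driven by renew_lifetime in krb5.conf and by the KDC's maximum renewable lifetime for the principal. A sketch of a keytab-only login entry, with an illustrative entry name:

```
KeytabLogin {
  com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    useTicketCache=false
    keyTab="xyz.keytab"
    storeKey=true
    principal="user/xyz.com";
};
```

Combined with something like `renew_lifetime = 7d` in the [libdefaults] section of krb5.conf, the login module should request a renewable TGT; the KDC must also permit renewable tickets for that principal (e.g. a non-zero maxrenewlife in an MIT KDC) or the flag will still come back unset.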
Re: [VOTE] Release Apache Hadoop 2.4.1
Point to another MR compatibility issue marked for 2.4.1: MAPREDUCE-5831 Old MR client is not compatible with new MR application, though it happens since 2.3. It would be good to figure out whether we include it now or later. It seems that we're going to be in a better position once we have versioning for MR components. Other than that, +1 (non-binding) for rc0. I've downloaded the source code, built the executable from it, run through MR examples and DS jobs, checked the metrics in the timeline server, and passed the test cases mentioned in the change log. - Zhijie

On Thu, Jun 19, 2014 at 5:45 AM, Mayank Bansal maban...@gmail.com wrote: I think we should fix this one that will help older clients 2.2/2.3 not to be updated if not absolutely required. Thanks, Mayank

On Wed, Jun 18, 2014 at 12:13 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: There is one item [MAPREDUCE-5830 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3] marked for 2.4. Should we include it? There is no patch there yet, it doesn't really help much other than letting older clients compile - even if we put the API back in, the URL returned is invalid. +Vinod

On Jun 16, 2014, at 9:27 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.4.1 (bug-fix release) that I would like to push out. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc0 The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/hdp/ -- Thanks and Regards, Mayank Cell: 408-718-9370 -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
[jira] [Created] (HADOOP-10721) The result does not show up after running hive query on Swift.
YongHun Jeon created HADOOP-10721: - Summary: The result does not show up after running hive query on Swift. Key: HADOOP-10721 URL: https://issues.apache.org/jira/browse/HADOOP-10721 Project: Hadoop Common Issue Type: Bug Components: fs/swift Reporter: YongHun Jeon Priority: Critical

I configured Hadoop and Swift as described at http://docs.openstack.org/developer/sahara/userdoc/hadoop-swift.html, and succeeded in accessing Swift from Hadoop. I am running the TPC-H performance test on the Hadoop system integrated with Swift. I ran the hive query below:

DROP TABLE lineitem;
DROP TABLE q1_pricing_summary_report;

-- create tables and load data
CREATE EXTERNAL TABLE lineitem (
  L_ORDERKEY INT, L_PARTKEY INT, L_SUPPKEY INT, L_LINENUMBER INT,
  L_QUANTITY DOUBLE, L_EXTENDEDPRICE DOUBLE, L_DISCOUNT DOUBLE, L_TAX DOUBLE,
  L_RETURNFLAG STRING, L_LINESTATUS STRING, L_SHIPDATE STRING,
  L_COMMITDATE STRING, L_RECEIPTDATE STRING, L_SHIPINSTRUCT STRING,
  L_SHIPMODE STRING, L_COMMENT STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'swift://test.provider/tpch/lineitem';

-- create the target table
CREATE EXTERNAL TABLE q1_pricing_summary_report (
  L_RETURNFLAG STRING, L_LINESTATUS STRING, SUM_QTY DOUBLE,
  SUM_BASE_PRICE DOUBLE, SUM_DISC_PRICE DOUBLE, SUM_CHARGE DOUBLE,
  AVE_QTY DOUBLE, AVE_PRICE DOUBLE, AVE_DISC DOUBLE, COUNT_ORDER INT)
LOCATION 'swift://test.provider/user/result/q1_pricing_summary_report';

set mapred.min.split.size=536870912;

-- the query
INSERT OVERWRITE TABLE q1_pricing_summary_report
SELECT L_RETURNFLAG, L_LINESTATUS, SUM(L_QUANTITY), SUM(L_EXTENDEDPRICE),
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)),
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)),
       AVG(L_QUANTITY), AVG(L_EXTENDEDPRICE), AVG(L_DISCOUNT), COUNT(1)
FROM lineitem
WHERE L_SHIPDATE='1998-09-02'
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS;

You can get the files (such as lineitem) for the test by running dbgen, which is available at
http://www.tpc.org/tpch/. I saw that some temporary files were generated and then deleted; however, the result does not show up after running the hive query. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Introducing ConsensusNode and a Coordination Engine
Guys, In the last couple of weeks we had a very good and productive initial round of discussions on the JIRAs. I think it is worth keeping the momentum going and having a more detailed conversation. To that end, we'd like to host a Hadoop developers meetup to get into the bowels of the consensus-based coordination implementation for HDFS. The proposed venue is our office in San Ramon, CA. Considering that it is already mid-week and the following week looks short because of the holidays, how does the week of July 7th look for y'all? Tuesday or Thursday look pretty good on our end. Please chime in with your preference either here or reach out to me directly. Once I have a few RSVPs I will set up an event on Eventbrite or similar. Looking forward to your input. Regards, Cos On Thu, May 29, 2014 at 02:09PM, Konstantin Shvachko wrote: Hello hadoop developers, I just opened two jiras proposing to introduce ConsensusNode into HDFS and a Coordination Engine into Hadoop Common. The latter should benefit HDFS and HBase as well as potentially other projects. See HDFS-6469 and HADOOP-10641 for details. The effort is based on the system we built at WANdisco with my colleagues, who are glad to contribute it to Apache, as quite a few people in the community have expressed interest in these ideas and their potential applications. We should probably keep technical discussions in the jiras. Here on the dev list I wanted to touch base on any logistical issues / questions. - First of all, any ideas and help are very much welcome. - We would like to set up a meetup to discuss this if people are interested. Hadoop Summit next week may be a potential time and place to meet; not sure in what form. If not, we can organize one in our San Ramon office later on. - The effort may take a few months depending on the contributors' schedules. Would it make sense to open a branch for the ConsensusNode work?
- The APIs and the implementation of the Coordination Engine should be fairly independent, so it may be reasonable to add it directly to Hadoop Common trunk. Thanks, --Konstantin
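To make the "fairly independent API" point above concrete, here is a purely illustrative Java sketch of what a pluggable Coordination Engine contract might look like, with a trivial single-node implementation. The real interface is what HADOOP-10641 defines; every name below is an assumption, not the actual proposal. The essential idea is that nodes submit proposals and the engine delivers agreed proposals back to every node in one global order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch only; see HADOOP-10641 for the real API. The contract:
// submit a proposal, and the engine later delivers agreed proposals to every
// participating node in the same globally agreed order.
interface CoordinationEngine<P> {
    void submitProposal(P proposal);
    void registerAgreementListener(Consumer<P> listener);
}

// Trivial single-node engine that "agrees" on every proposal immediately, in
// submission order. A real engine would run a consensus protocol (e.g. Paxos)
// across nodes, so agreement could lag or reorder concurrent submissions.
class LocalCoordinationEngine<P> implements CoordinationEngine<P> {
    private final List<Consumer<P>> listeners = new ArrayList<>();

    public void submitProposal(P proposal) {
        // Deliver the agreed proposal to all registered listeners.
        for (Consumer<P> l : listeners) l.accept(proposal);
    }

    public void registerAgreementListener(Consumer<P> listener) {
        listeners.add(listener);
    }
}
```

With such a seam, a ConsensusNode could apply namespace updates only when the engine delivers them, which is why the API can live in Hadoop Common while the HDFS-specific work proceeds on its own branch.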
[jira] [Reopened] (HADOOP-10717) Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode.
[ https://issues.apache.org/jira/browse/HADOOP-10717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun reopened HADOOP-10717: - Hi Steve, thanks a lot for your review. Hi Haohui, the time-out exception blocked me for about 2 minutes every time. I kept only the property fs.defaultFS and cleaned all the cache files (including the Maven cache in my environment), and I still got the time-out error. After I applied the patch, the error was fixed. Thanks also for your review. Missing JSP support in Jetty, 'NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet' when user want to start namenode. - Key: HADOOP-10717 URL: https://issues.apache.org/jira/browse/HADOOP-10717 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0 Reporter: Dapeng Sun Assignee: Dapeng Sun Priority: Blocker Fix For: 3.0.0 Attachments: HADOOP-10717.patch When the user wants to start the NameNode, they get the following exception; it is caused by the missing org.mortbay.jetty:jsp-2.1-jetty:jar:6.1.26 dependency in the pom.xml 14/06/18 14:55:30 INFO http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static 14/06/18 14:55:30 INFO http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs 14/06/18 14:55:30 INFO http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter) 14/06/18 14:55:30 INFO http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/* 14/06/18 14:55:30
INFO http.HttpServer2: Jetty bound to port 50070 14/06/18 14:55:30 INFO mortbay.log: jetty-6.1.26 14/06/18 14:55:30 INFO mortbay.log: NO JSP Support for /, did not find org.apache.jasper.servlet.JspServlet 14/06/18 14:57:38 WARN mortbay.log: EXCEPTION java.net.ConnectException: Connection timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:529) at java.net.Socket.connect(Socket.java:478) at sun.net.NetworkClient.doConnect(NetworkClient.java:163) at sun.net.www.http.HttpClient.openServer(HttpClient.java:395) at sun.net.www.http.HttpClient.openServer(HttpClient.java:530) at sun.net.www.http.HttpClient.init(HttpClient.java:234) at sun.net.www.http.HttpClient.New(HttpClient.java:307) at sun.net.www.http.HttpClient.New(HttpClient.java:324) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:677) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315) at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282) at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1194) at 
com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1090) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1003) at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648) at
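Since the report pins the failure on the missing org.mortbay.jetty:jsp-2.1-jetty:jar:6.1.26 coordinate, the fix presumably amounts to declaring that artifact in Maven. A hedged sketch of the dependency declaration; the exact module and placement within the Hadoop pom hierarchy are assumptions (HADOOP-10717.patch has the authoritative change):

```xml
<!-- Sketch only: coordinate taken from the error report above; which pom.xml
     it belongs in is an assumption, not confirmed by this thread. -->
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jsp-2.1-jetty</artifactId>
  <version>6.1.26</version>
</dependency>
```

Without this artifact on the classpath, Jetty 6 logs "NO JSP Support for /" and the NameNode web UI's JSP pages cannot be served, matching the log excerpt above.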