Re: HADOOP_ROOT_LOGGER
In my experience the default HADOOP_ROOT_LOGGER definition will override any root logger defined in log4j.properties, which is where the problems have arisen. If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh were removed, wouldn't the root logger defined in the log4j.properties file be used? Or do the client commands not read that configuration file? I'm trying to understand why the root logger should be defined outside of the log4j.properties file.

Rob

On 05/22/2014 12:53 AM, Vinayakumar B wrote:

Hi Robert,

I understand your confusion. HADOOP_ROOT_LOGGER is set to the default value INFO,console if it hasn't been set to anything, and logs will be displayed on the console itself. This will be true for any client command you run, for example: hdfs dfs -ls /

But for the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.) HADOOP_ROOT_LOGGER will be set to INFO,RFA if the HADOOP_ROOT_LOGGER env variable is not defined, so that all the log messages of the server daemons go to log files maintained by RollingFileAppender.

If you want to override these defaults and set your own log level, define it as the env variable HADOOP_ROOT_LOGGER, for example: export HADOOP_ROOT_LOGGER=DEBUG,RFA. Export the env variable above and then start the server scripts or execute client commands; all logs will go to files maintained by RollingFileAppender.

Regards,
Vinay

On Wed, May 21, 2014 at 6:42 PM, Robert Rati rr...@redhat.com wrote:

I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

which is setting a root logger if HADOOP_ROOT_LOGGER isn't set. Why is this here/needed? There is a log4j.properties file provided that defines a default logger. I believe the line above will result in overriding whatever is set for the root logger in the log4j.properties file. This has caused some confusion and hacks to work around this.

Is there a reason not to remove the above code and just have all the logger definitions in the log4j.properties file? Is there maybe a compatibility concern?

Rob
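The fallback Vinay describes is ordinary shell parameter expansion; a minimal sketch of the behavior (runnable without Hadoop, using the same variable name as hadoop-config.sh):

```shell
# Default expansion: use HADOOP_ROOT_LOGGER if set, otherwise fall back to
# INFO,console, mirroring ${HADOOP_ROOT_LOGGER:-INFO,console} in hadoop-config.sh.
unset HADOOP_ROOT_LOGGER
echo "client default: ${HADOOP_ROOT_LOGGER:-INFO,console}"
# prints: client default: INFO,console

# Exporting the variable beforehand wins over the fallback:
export HADOOP_ROOT_LOGGER=DEBUG,RFA
echo "override:       ${HADOOP_ROOT_LOGGER:-INFO,console}"
# prints: override:       DEBUG,RFA
```

In the real scripts the expanded value is then passed to the JVM as -Dhadoop.root.logger, which is how it reaches log4j.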
Re: HADOOP_ROOT_LOGGER
Ah, that makes sense. Would it make sense to default the root logger to the one defined in the log4j.properties file instead of the static value in the script then? That way an admin can set all desired logging properties in the log4j.properties file, but can override with HADOOP_ROOT_LOGGER to debug. It feels a little black-box-y that if HADOOP_ROOT_LOGGER isn't set then the root logger set in log4j.properties is ignored. Maybe this is all very well known and just a bit black-box-y to me since I'm new-ish to Hadoop.

Rob

On 05/22/2014 03:41 PM, Colin McCabe wrote:

It's not always practical to edit the log4j.properties file. For one thing, if you're using a management system, there may be many log4j.properties files sprinkled around the system, and it could be difficult to figure out which is the one you need to edit. For another, you may not (should not?) have permission to do this on a production cluster.

Doing something like HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -cat /foo has helped me diagnose problems in the past.

best,
Colin

On Thu, May 22, 2014 at 6:34 AM, Robert Rati rr...@redhat.com wrote:

In my experience the default HADOOP_ROOT_LOGGER definition will override any root logger defined in log4j.properties, which is where the problems have arisen. If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh were removed, wouldn't the root logger defined in the log4j.properties file be used? Or do the client commands not read that configuration file? I'm trying to understand why the root logger should be defined outside of the log4j.properties file.

Rob

On 05/22/2014 12:53 AM, Vinayakumar B wrote:

Hi Robert,

I understand your confusion. HADOOP_ROOT_LOGGER is set to the default value INFO,console if it hasn't been set to anything, and logs will be displayed on the console itself. This will be true for any client command you run, for example: hdfs dfs -ls /

But for the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.) HADOOP_ROOT_LOGGER will be set to INFO,RFA if the HADOOP_ROOT_LOGGER env variable is not defined, so that all the log messages of the server daemons go to log files maintained by RollingFileAppender.

If you want to override these defaults and set your own log level, define it as the env variable HADOOP_ROOT_LOGGER, for example: export HADOOP_ROOT_LOGGER=DEBUG,RFA. Export the env variable above and then start the server scripts or execute client commands; all logs will go to files maintained by RollingFileAppender.

Regards,
Vinay

On Wed, May 21, 2014 at 6:42 PM, Robert Rati rr...@redhat.com wrote:

I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

which is setting a root logger if HADOOP_ROOT_LOGGER isn't set. Why is this here/needed? There is a log4j.properties file provided that defines a default logger. I believe the line above will result in overriding whatever is set for the root logger in the log4j.properties file. This has caused some confusion and hacks to work around this.

Is there a reason not to remove the above code and just have all the logger definitions in the log4j.properties file? Is there maybe a compatibility concern?

Rob
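The override behavior discussed in this thread follows from how the shipped configuration is wired: the root logger line in Hadoop's log4j.properties reads the same hadoop.root.logger system property that hadoop-config.sh sets. A paraphrased sketch of the relevant stanza (from memory of the 2.x default file; the exact appender list may differ by release):

```properties
# Sketch of Hadoop 2.x conf/log4j.properties (check your own copy):
hadoop.root.logger=INFO,console
log4j.rootLogger=${hadoop.root.logger}, EventCounter
```

Because -Dhadoop.root.logger from the launcher scripts replaces the first line's default, editing that line by hand appears to have no effect, which is the black-box behavior Rob describes.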
HADOOP_ROOT_LOGGER
I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"

which is setting a root logger if HADOOP_ROOT_LOGGER isn't set. Why is this here/needed? There is a log4j.properties file provided that defines a default logger. I believe the line above will result in overriding whatever is set for the root logger in the log4j.properties file. This has caused some confusion and hacks to work around this.

Is there a reason not to remove the above code and just have all the logger definitions in the log4j.properties file? Is there maybe a compatibility concern?

Rob
Re: Plans of moving towards JDK7 in trunk
Just an FYI, but I'm working on updating that jetty patch for the current 2.4.0 release. The one that is there won't apply cleanly because so much has changed since it was posted. I'll post a new patch when it's done.

Rob

On 04/11/2014 04:24 AM, Steve Loughran wrote:

On 10 April 2014 18:12, Eli Collins e...@cloudera.com wrote:

Let's speak less abstractly: are there particular features or new dependencies that you would like to contribute (or see contributed) that require using the Java 1.7 APIs? Breaking compat in v2 or rolling a v3 release are both non-trivial, not something I suspect we'd want to do just because it would be, for example, nicer to have a newer version of Jetty.

Oddly enough, rolling the web framework is something I'd like to see in a v3. The shuffle may be off jetty, but webhdfs isn't. Moving up also lets us reliably switch to servlet API v3.

But... I think we may be able to increment Jetty more without going to java7, see https://issues.apache.org/jira/browse/HADOOP-9650
Re: Plans of moving towards JDK7 in trunk
I don't mean to be dense, but can you expand on why jetty 8 can't go into branch-2? What is the concern?

Rob

On 04/11/2014 10:55 AM, Alejandro Abdelnur wrote:

If you mean updating jetty on branch-2, we cannot do that. It has to be done in trunk.

thx

Alejandro (phone typing)

On Apr 11, 2014, at 4:46, Robert Rati rr...@redhat.com wrote:

Just an FYI, but I'm working on updating that jetty patch for the current 2.4.0 release. The one that is there won't apply cleanly because so much has changed since it was posted. I'll post a new patch when it's done.

Rob

On 04/11/2014 04:24 AM, Steve Loughran wrote:

On 10 April 2014 18:12, Eli Collins e...@cloudera.com wrote:

Let's speak less abstractly: are there particular features or new dependencies that you would like to contribute (or see contributed) that require using the Java 1.7 APIs? Breaking compat in v2 or rolling a v3 release are both non-trivial, not something I suspect we'd want to do just because it would be, for example, nicer to have a newer version of Jetty.

Oddly enough, rolling the web framework is something I'd like to see in a v3. The shuffle may be off jetty, but webhdfs isn't. Moving up also lets us reliably switch to servlet API v3.

But... I think we may be able to increment Jetty more without going to java7, see https://issues.apache.org/jira/browse/HADOOP-9650
Improved HDFS Web UI
Did something happen to the improved HDFS Web UI from this jira? https://issues.apache.org/jira/browse/HDFS-2933

The fixed version is 2.1.1-beta, but I don't see the code on any of the branch-2.2.x or branch-2.3 branches.

Rob
Re: Hadoop in Fedora updated to 2.2.0
Just to clarify, the tomcat/jasper updates and the jersey updates should be able to go in without any jetty changes. There is also a separate BZ for updating jetty to jetty 8, which is the last jetty version that will run on java 6, if there is a desire to update jetty without requiring java 7.

If jetty 9 is being looked at for inclusion, it will affect the jasper bits in the poms. Jetty 9 has its own jsp compiler and would need to replace jasper, but that's largely just pom changes, iirc. Jetty 9 does revamp some APIs and should definitely be looked at by people more knowledgeable with how jetty is used, especially as it relates to secure mode.

Rob

On 11/13/2013 03:31 PM, Steve Loughran wrote:

I've just been through some of these as part of my background project to fix up the POMs: https://issues.apache.org/jira/browse/HADOOP-9991

1. I've applied the simple low-risk ones.
2. I've not done the bookkeeper one, as people working with that code need to play with it first.
3. I've not touched anything related to {jersey, tomcat, jetty}.

This is more than just a java6/7 issue; it is that Jetty has been very brittle in the past, and there's code in hadoop to detect when it's not actually servicing requests properly. Moving up Jetty/web server versions is something that needs to be done carefully and with consensus, and once you leave Jetty alone, I don't know where the jersey and tomcat changes go. There is always the option of s/jetty/grizzly/.

-steve

On 1 November 2013 14:57, Robert Rati rr...@redhat.com wrote:

Putting the java 6 vs java 7 issue aside, what about the other patches to update dependencies? Can those be looked at and planned for inclusion into a release?

Rob

On 10/31/2013 05:51 PM, Andrew Wang wrote:

I'm in agreement with Steve on this one. We're aware that Java 6 is EOL, but we can't drop support for the lifetime of the 2.x line since it's a (very) incompatible change. AFAIK a 3.x release fixing this isn't on any of our horizons yet.
Best,
Andrew

On Thu, Oct 31, 2013 at 6:15 AM, Robert Rati rr...@redhat.com wrote:

https://issues.apache.org/jira/browse/HADOOP-9594
https://issues.apache.org/jira/browse/MAPREDUCE-5431
https://issues.apache.org/jira/browse/HADOOP-9611
https://issues.apache.org/jira/browse/HADOOP-9613
https://issues.apache.org/jira/browse/HADOOP-9623
https://issues.apache.org/jira/browse/HDFS-5411
https://issues.apache.org/jira/browse/HADOOP-10067
https://issues.apache.org/jira/browse/HDFS-5075
https://issues.apache.org/jira/browse/HADOOP-10068
https://issues.apache.org/jira/browse/HADOOP-10075
https://issues.apache.org/jira/browse/HADOOP-10076
https://issues.apache.org/jira/browse/HADOOP-9849

most (all?) of these are pom changes

A good number are basically pom changes to update to newer versions of dependencies. A few, such as commons-math3, required code changes as well because of a namespace change. Some are minor code changes to enhance compatibility with newer dependencies.
[jira] [Created] (HADOOP-10096) Missing dependency on commons-collections
Robert Rati created HADOOP-10096:

Summary: Missing dependency on commons-collections
Key: HADOOP-10096
URL: https://issues.apache.org/jira/browse/HADOOP-10096
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Robert Rati
Priority: Minor
Attachments: HADOOP-10096.patch

There's a missing dependency on commons-collections

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HADOOP-10096) Missing dependency on commons-collections
[ https://issues.apache.org/jira/browse/HADOOP-10096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Rati resolved HADOOP-10096.
Resolution: Duplicate

Missing dependency on commons-collections

Key: HADOOP-10096
URL: https://issues.apache.org/jira/browse/HADOOP-10096
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Robert Rati
Priority: Minor
Attachments: HADOOP-10096.patch

There's a missing dependency on commons-collections

--
This message was sent by Atlassian JIRA (v6.1#6144)
Re: Hadoop in Fedora updated to 2.2.0
https://issues.apache.org/jira/browse/HADOOP-9594
https://issues.apache.org/jira/browse/MAPREDUCE-5431
https://issues.apache.org/jira/browse/HADOOP-9611
https://issues.apache.org/jira/browse/HADOOP-9613
https://issues.apache.org/jira/browse/HADOOP-9623
https://issues.apache.org/jira/browse/HDFS-5411
https://issues.apache.org/jira/browse/HADOOP-10067
https://issues.apache.org/jira/browse/HDFS-5075
https://issues.apache.org/jira/browse/HADOOP-10068
https://issues.apache.org/jira/browse/HADOOP-10075
https://issues.apache.org/jira/browse/HADOOP-10076
https://issues.apache.org/jira/browse/HADOOP-9849

most (all?) of these are pom changes

A good number are basically pom changes to update to newer versions of dependencies. A few, such as commons-math3, required code changes as well because of a namespace change. Some are minor code changes to enhance compatibility with newer dependencies. Even tomcat is mostly changes in pom files.

Most of the changes are minor. There are 2 big updates though: Jetty 9 (which requires java 7) and tomcat 7. These are also the most difficult patches to rebase when hadoop produces a new release.

that's not going to go in the 2.x branch. Java 6 is still a common platform that people are using, because historically java7 (or any leading-edge java version) is buggy. That said, our QA team did test hadoop 2 / HDP-2 at scale on java7 and openjdk 7, so it all works; it's just that committing to java7-only is a big decision.

I realize moving to java 7 is a big decision and wasn't trying to imply this should happen without discussion and planning, just that it would be nice to have the discussion and see where things land. It can also help minimize work. There is an open BZ for updating jetty to jetty 8 (the last version that would work on java 6), but if there are plans to move to java 7, maybe it makes sense to just go to jetty 9 and not test a new version of jetty twice. With Hadoop in Fedora running on these newer deps, there is a test bed to play with to give some level of confidence before taking the plunge on any major change.

Rob
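As an aside on the commons-math3 change mentioned above: that update is mostly a package rename (org.apache.commons.math to org.apache.commons.math3), so the "code changes" largely amount to rewriting imports. A hypothetical one-line demo of the rename (the source line is illustrative, not from the actual patch):

```shell
# Rewrite the old commons-math package prefix to the math3 namespace.
echo 'import org.apache.commons.math.util.FastMath;' |
  sed 's/org\.apache\.commons\.math\./org.apache.commons.math3./g'
# prints: import org.apache.commons.math3.util.FastMath;
```

A bulk version of the same sed over a source tree is roughly what such a migration looks like, though the real patches also touch the pom version coordinates.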
Hadoop in Fedora updated to 2.2.0
I've updated the version of Hadoop in Fedora 20 to 2.2.0. This means Hadoop 2.2.0 will be included in the official release of Fedora 20. Hadoop on Fedora is running against numerous updated dependencies, including:

Java 7 (OpenJDK IcedTea)
Jetty 9
Tomcat 7
Jets3t 0.9.0

I've logged/updated jiras for all the changes we've made that could be useful to the Hadoop project:

https://issues.apache.org/jira/browse/HADOOP-9594
https://issues.apache.org/jira/browse/MAPREDUCE-5431
https://issues.apache.org/jira/browse/HADOOP-9611
https://issues.apache.org/jira/browse/HADOOP-9613
https://issues.apache.org/jira/browse/HADOOP-9623
https://issues.apache.org/jira/browse/HDFS-5411
https://issues.apache.org/jira/browse/HADOOP-10067
https://issues.apache.org/jira/browse/HDFS-5075
https://issues.apache.org/jira/browse/HADOOP-10068
https://issues.apache.org/jira/browse/HADOOP-10075
https://issues.apache.org/jira/browse/HADOOP-10076
https://issues.apache.org/jira/browse/HADOOP-9849

Most of the changes are minor. There are 2 big updates though: Jetty 9 (which requires java 7) and Tomcat 7. These are also the most difficult patches to rebase when hadoop produces a new release.

It would be great to get some feedback on these proposed changes and discuss how/when/if these could make it into a Hadoop release.

Rob
[jira] [Created] (HADOOP-10075) Update jetty dependency to version 9
Robert Rati created HADOOP-10075:

Summary: Update jetty dependency to version 9
Key: HADOOP-10075
URL: https://issues.apache.org/jira/browse/HADOOP-10075
Project: Hadoop Common
Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati

Jetty6 is no longer maintained. Update the dependency to jetty9.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HADOOP-10067) Missing POM dependency on jsr305
Robert Rati created HADOOP-10067:

Summary: Missing POM dependency on jsr305
Key: HADOOP-10067
URL: https://issues.apache.org/jira/browse/HADOOP-10067
Project: Hadoop Common
Issue Type: Improvement
Reporter: Robert Rati
Priority: Minor

Compiling for Fedora reveals a missing declaration for javax.annotation.Nullable. This is the result of a missing explicit dependency on jsr305.

--
This message was sent by Atlassian JIRA (v6.1#6144)
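A sketch of the kind of pom change this issue calls for: declaring jsr305 explicitly so javax.annotation.Nullable resolves at compile time rather than arriving transitively. The coordinates below are the commonly used ones for jsr305; the version is illustrative and not necessarily what the attached patch uses.

```xml
<!-- Hypothetical fix sketch: explicit jsr305 dependency
     (version shown is illustrative). -->
<dependency>
  <groupId>com.google.code.findbugs</groupId>
  <artifactId>jsr305</artifactId>
  <version>1.3.9</version>
</dependency>
```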
[jira] [Created] (HADOOP-10068) Improve log4j regex in testFindContainingJar
Robert Rati created HADOOP-10068:

Summary: Improve log4j regex in testFindContainingJar
Key: HADOOP-10068
URL: https://issues.apache.org/jira/browse/HADOOP-10068
Project: Hadoop Common
Issue Type: Improvement
Reporter: Robert Rati
Priority: Trivial

Improved the regular expression in TestClassUtil:testFindContainingJar to work in both Fedora and non-Fedora environments

--
This message was sent by Atlassian JIRA (v6.1#6144)
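The Fedora/non-Fedora difference here is typically a versionless jar filename (log4j.jar, as Fedora ships) versus a versioned one (log4j-1.2.17.jar). A hypothetical pattern tolerant of both, demonstrated with grep (the actual regex in TestClassUtil may differ):

```shell
# Accept both versioned and unversioned log4j jar filenames.
pattern='^log4j(-[0-9.]+)?\.jar$'
for jar in log4j-1.2.17.jar log4j.jar; do
  echo "$jar" | grep -Eq "$pattern" && echo "$jar matches"
done
# prints: log4j-1.2.17.jar matches
# prints: log4j.jar matches
```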
Hadoop is in Fedora
The hadoop package has passed review and has been built for Fedora 20. Rob