Re: HADOOP_ROOT_LOGGER

2014-05-22 Thread Robert Rati
In my experience the default HADOOP_ROOT_LOGGER definition will override 
any root logger defined in log4j.properties, which is where the problems 
have arisen.  If the HADOOP_ROOT_LOGGER definition in hadoop-config.sh 
were removed, wouldn't the root logger defined in the log4j.properties 
file be used?  Or do the client commands not read that configuration file?


I'm trying to understand why the root logger should be defined outside 
of the log4j.properties file.


Rob

On 05/22/2014 12:53 AM, Vinayakumar B wrote:

Hi Robert,

I understand your confusion.

HADOOP_ROOT_LOGGER defaults to INFO,console if it has not been set to
anything, so logs are displayed on the console itself.
This is true for any client command you run, for example: hdfs dfs -ls /

For the server scripts (hadoop-daemon.sh, yarn-daemon.sh, etc.), however,
HADOOP_ROOT_LOGGER is set to INFO,RFA if the HADOOP_ROOT_LOGGER env
variable is not defined, so that all log messages of the server daemons go
to log files maintained by RollingFileAppender. If you want to override
these defaults and set your own log level, define it in the env variable
HADOOP_ROOT_LOGGER.

For example:
export HADOOP_ROOT_LOGGER=DEBUG,RFA
Export the above env variable and then start the server scripts or execute
client commands; all logs then go to files maintained by RollingFileAppender.
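The defaults described above can be sketched roughly as follows. This is only an illustration of the behaviour as described in this message, not code taken from the Hadoop scripts; the helper name resolve_root_logger and its "daemon"/"client" argument are made up for the example.

```shell
#!/bin/sh
# resolve_root_logger is a hypothetical helper, not part of any Hadoop script.
# $1 is "daemon" for server scripts; anything else is treated as a client command.
resolve_root_logger() {
  if [ "$1" = "daemon" ]; then
    default="INFO,RFA"       # daemons log to files via RollingFileAppender
  else
    default="INFO,console"   # client commands log to the console
  fi
  # A value exported by the user wins over either default.
  printf '%s\n' "${HADOOP_ROOT_LOGGER:-$default}"
}

unset HADOOP_ROOT_LOGGER
resolve_root_logger client
resolve_root_logger daemon

HADOOP_ROOT_LOGGER=DEBUG,RFA
resolve_root_logger client
```

With the variable unset, the two modes resolve to different defaults; once it is set, both modes use the same value.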


Regards,
Vinay


On Wed, May 21, 2014 at 6:42 PM, Robert Rati rr...@redhat.com wrote:


I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS=$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}

which is setting a root logger if HADOOP_ROOT_LOGGER isn't set.  Why is
this needed here?  There is a log4j.properties file provided that defines a
default logger.  I believe the line above will result in overriding
whatever is set for the root logger in the log4j.properties file.  This has
caused some confusion and hacks to work around this.
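To make the expansion in the quoted line concrete, here is a minimal sketch of how ${VAR:-default} behaves. The helper build_opts is hypothetical and exists only for this example; the point is that the -Dhadoop.root.logger option is always produced, whether or not the env variable is set.

```shell
#!/bin/sh
# build_opts is a hypothetical helper, not a function from hadoop-config.sh.
build_opts() {
  # ${VAR:-word} falls back to word when VAR is unset or empty.
  printf '%s\n' "-Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
}

unset HADOOP_ROOT_LOGGER
build_opts    # the fallback INFO,console is used

HADOOP_ROOT_LOGGER=DEBUG,RFA
build_opts    # the caller's value is used
```

Because the option is emitted in both cases, the JVM always receives a value for hadoop.root.logger from the script.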

Is there a reason not to remove the above code and just have all the
logger definitions in the log4j.properties file?  Is there maybe a
compatibility concern?

Rob







Re: HADOOP_ROOT_LOGGER

2014-05-22 Thread Robert Rati
Ah, that makes sense.  Would it make sense to default the root logger to 
the one defined in log4j.properties file instead of the static value in 
the script then?  That way an admin can set all logging properties 
desired in the log4j.properties file, but can override with 
HADOOP_ROOT_LOGGER to debug.
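One way to sketch this proposal is to emit the option only when the env variable is actually set. This is a hypothetical variant, not the shipped hadoop-config.sh; the helper name logger_opt is made up, and it assumes log4j.properties supplies a usable root logger of its own when the option is absent.

```shell
#!/bin/sh
# logger_opt is a hypothetical helper illustrating the proposal in this thread.
logger_opt() {
  # ${VAR:+word} expands to word only when VAR is set and non-empty,
  # so nothing is passed to the JVM unless the admin asked for an override.
  printf '%s' "${HADOOP_ROOT_LOGGER:+-Dhadoop.root.logger=$HADOOP_ROOT_LOGGER}"
}

unset HADOOP_ROOT_LOGGER
echo "unset: [$(logger_opt)]"

HADOOP_ROOT_LOGGER=DEBUG,console
echo "set:   [$(logger_opt)]"
```

With this shape, log4j.properties stays the single source of defaults and the env variable remains available as a debugging override.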


It feels a little black box-y that if HADOOP_ROOT_LOGGER isn't set then 
the root logger set in log4j.properties is ignored.


Maybe this is all very well known and just a bit black box-y to me since 
I'm new-ish to hadoop.


Rob

On 05/22/2014 03:41 PM, Colin McCabe wrote:

It's not always practical to edit the log4j.properties file.  For one
thing, if you're using a management system, there may be many log4j
properties sprinkled around the system, and it could be difficult to figure
out which is the one you need to edit.  For another, you may not (should
not?) have permission to do this on a production cluster.

Doing something like HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -cat
/foo has helped me diagnose problems in the past.
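The one-shot form works because a VAR=value prefix applies to that single command only. A small sketch, using sh -c as a stand-in for the hadoop client so it runs anywhere:

```shell
#!/bin/sh
unset HADOOP_ROOT_LOGGER

# The assignment is placed in the environment of this one command only.
HADOOP_ROOT_LOGGER=DEBUG,console sh -c 'echo "child sees: ${HADOOP_ROOT_LOGGER:-unset}"'

# The calling shell is unaffected afterwards.
echo "parent sees: ${HADOOP_ROOT_LOGGER:-unset}"
```

Nothing needs to be edited or exported, which is why this is convenient on a cluster where the config files are off limits.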

best,
Colin












HADOOP_ROOT_LOGGER

2014-05-21 Thread Robert Rati

I noticed in hadoop-config.sh there is this line:

HADOOP_OPTS=$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}


which is setting a root logger if HADOOP_ROOT_LOGGER isn't set.  Why is 
this needed here?  There is a log4j.properties file provided that 
defines a default logger.  I believe the line above will result in 
overriding whatever is set for the root logger in the log4j.properties 
file.  This has caused some confusion and hacks to work around this.


Is there a reason not to remove the above code and just have all the 
logger definitions in the log4j.properties file?  Is there maybe a 
compatibility concern?


Rob


Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Robert Rati
Just an FYI, but I'm working on updating that jetty patch for the 
current 2.4.0 release.  The one that is there won't cleanly apply 
because so much has changed since it was posted.  I'll post a new patch 
when it's done.


Rob

On 04/11/2014 04:24 AM, Steve Loughran wrote:

On 10 April 2014 18:12, Eli Collins e...@cloudera.com wrote:


Let's speak less abstractly, are there particular features or new
dependencies that you would like to contribute (or see contributed) that
require using the Java 1.7 APIs?  Breaking compat in v2 or rolling a v3
release are both non-trivial, not something I suspect we'd want to do just
because it would be, for example, nicer to have a newer version of Jetty.



Oddly enough, rolling the web framework is something I'd like to see in a
v3. The shuffle may be off jetty, but webhdfs isn't. Moving up also lets us
reliably switch to servlet API v3.

But.. I think we may be able to increment Jetty more without going to
java7, see https://issues.apache.org/jira/browse/HADOOP-9650 .



Re: Plans of moving towards JDK7 in trunk

2014-04-11 Thread Robert Rati
I don't mean to be dense, but can you expand on why jetty 8 can't go 
into branch2?  What is the concern?


Rob

On 04/11/2014 10:55 AM, Alejandro Abdelnur wrote:

if you mean updating jetty on branch2, we cannot do that. it has to be done in 
trunk.

thx

Alejandro
(phone typing)





Improved HDFS Web UI

2013-12-20 Thread Robert Rati

Did something happen to the improved HDFS Web UI from this jira?

https://issues.apache.org/jira/browse/HDFS-2933

The fixed version is 2.1.1-beta, but I don't see the code on any of the 
branch-2.2.x or branch-2.3 branches.


Rob


Re: Hadoop in Fedora updated to 2.2.0

2013-11-21 Thread Robert Rati
Just to clarify, the tomcat/jasper updates and the jersey updates should 
be able to go in without any jetty changes.  There is also a separate BZ 
for updating jetty to jetty 8, which is the last jetty version that will 
run on java 6, if there is a desire to update jetty without requiring 
java 7.


If jetty 9 is being looked at for inclusion it will affect the jasper 
bits in the poms.  Jetty 9 has its own jsp compiler and would need to 
replace jasper, but that's largely just pom changes iirc.  Jetty 9 does 
revamp some apis and should definitely be looked at by people more 
knowledgeable with how jetty is used, especially as it relates to secure 
mode.


Rob

On 11/13/2013 03:31 PM, Steve Loughran wrote:

I've just been through some of these as part of my background project, fix
up the POMs https://issues.apache.org/jira/browse/HADOOP-9991.


1. I've applied the simple low-risk ones.
2. I've not done the bookkeeper one, as people working with that code
need to play with it first.
3. I've not touched anything related to {jersey, tomcat, jetty}

This is more than just a java6/7 issue; it is that Jetty has been very
brittle in the past, and there's code in hadoop to detect when it's not
actually servicing requests properly. Moving up Jetty/web server versions
is something that needs to be done carefully and with consensus, and once
you leave Jetty alone, I don't know where the jersey and tomcat changes go.

There is always the option of s/jetty/grizzly/

-steve




On 1 November 2013 14:57, Robert Rati rr...@redhat.com wrote:


Putting the java 6 vs java 7 issue aside, what about the other patches to
update dependencies?  Can those be looked at and planned for inclusion into
a release?

Rob


On 10/31/2013 05:51 PM, Andrew Wang wrote:


I'm in agreement with Steve on this one. We're aware that Java 6 is EOL,
but we can't drop support for the lifetime of the 2.x line since it's a
(very) incompatible change. AFAIK a 3.x release fixing this isn't on any of
our horizons yet.

Best,
Andrew


On Thu, Oct 31, 2013 at 6:15 AM, Robert Rati rr...@redhat.com wrote:

https://issues.apache.org/jira/browse/HADOOP-9594
https://issues.apache.org/jira/browse/MAPREDUCE-5431
https://issues.apache.org/jira/browse/HADOOP-9611
https://issues.apache.org/jira/browse/HADOOP-9613
https://issues.apache.org/jira/browse/HADOOP-9623
https://issues.apache.org/jira/browse/HDFS-5411
https://issues.apache.org/jira/browse/HADOOP-10067
https://issues.apache.org/jira/browse/HDFS-5075
https://issues.apache.org/jira/browse/HADOOP-10068
https://issues.apache.org/jira/browse/HADOOP-10075
https://issues.apache.org/jira/browse/HADOOP-10076
https://issues.apache.org/jira/browse/HADOOP-9849





   most (all?) of these are  pom changes





A good number are basically pom changes to update to newer versions of
dependencies.  A few, such as commons-math3, required code changes as well
because of a namespace change.  Some are minor code changes to enhance
compatibility
[jira] [Created] (HADOOP-10096) Missing dependency on commons-collections

2013-11-13 Thread Robert Rati (JIRA)
Robert Rati created HADOOP-10096:


 Summary: Missing dependency on commons-collections
 Key: HADOOP-10096
 URL: https://issues.apache.org/jira/browse/HADOOP-10096
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Robert Rati
Priority: Minor
 Attachments: HADOOP-10096.patch

There's a missing dependency on commons-collections



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (HADOOP-10096) Missing dependency on commons-collections

2013-11-13 Thread Robert Rati (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Rati resolved HADOOP-10096.
--

Resolution: Duplicate

 Missing dependency on commons-collections
 -

 Key: HADOOP-10096
 URL: https://issues.apache.org/jira/browse/HADOOP-10096
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Robert Rati
Priority: Minor
 Attachments: HADOOP-10096.patch


 There's a missing dependency on commons-collections





Re: Hadoop in Fedora updated to 2.2.0

2013-10-31 Thread Robert Rati

https://issues.apache.org/jira/browse/HADOOP-9594
https://issues.apache.org/jira/browse/MAPREDUCE-5431
https://issues.apache.org/jira/browse/HADOOP-9611
https://issues.apache.org/jira/browse/HADOOP-9613
https://issues.apache.org/jira/browse/HADOOP-9623
https://issues.apache.org/jira/browse/HDFS-5411
https://issues.apache.org/jira/browse/HADOOP-10067
https://issues.apache.org/jira/browse/HDFS-5075
https://issues.apache.org/jira/browse/HADOOP-10068
https://issues.apache.org/jira/browse/HADOOP-10075
https://issues.apache.org/jira/browse/HADOOP-10076
https://issues.apache.org/jira/browse/HADOOP-9849



most (all?) of these are  pom changes


A good number are basically pom changes to update to newer versions of 
dependencies.  A few, such as commons-math3, required code changes as 
well because of a namespace change.  Some are minor code changes to 
enhance compatibility with newer dependencies.  Even tomcat is mostly 
changes in pom files.
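As an illustration of the kind of namespace change mentioned for commons-math3: version 3 moved its classes from the org.apache.commons.math package to org.apache.commons.math3, so imports have to be rewritten. The sketch below is not the actual patch; the helper name migrate_math_imports is made up, and a package rename alone may not cover classes whose names or APIs also changed.

```shell
#!/bin/sh
# migrate_math_imports is a hypothetical helper: it rewrites Java source
# lines on stdin from the commons-math 2 package to the commons-math3 one.
migrate_math_imports() {
  # "math\." does not match "math3.", so already-migrated lines are untouched.
  sed 's/org\.apache\.commons\.math\./org.apache.commons.math3./g'
}

printf 'import org.apache.commons.math.util.MathUtils;\n' | migrate_math_imports
printf 'import org.apache.commons.math3.util.FastMath;\n' | migrate_math_imports
```

The pom side of such an update is just the dependency coordinates; the rewrite above is the part that touches code.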



Most of the changes are minor.  There are 2 big updates though: Jetty 9
(which requires java 7) and tomcat 7.  These are also the most difficult
patches to rebase when hadoop produces a new release.



that's not going to go in the 2.x branch. Java 6 is still a common platform
that people are using, because historically java7 (or any leading edge java
version) is buggy.

that said, our QA team did test hadoop 2 and HDP-2 at scale on java7 and
openjdk 7, so it all works. It's just that committing to java7 only is a big
decision that


I realize moving to java 7 is a big decision and wasn't trying to imply 
this should happen without discussion and planning, just that it would 
be nice to have the discussion and see where things land.  It can also 
help minimize work.  There is an open bz for updating jetty to jetty 8 
(the last version that would work on java 6), but if there are plans to 
move to java7, maybe it makes sense to just go to jetty 9 and not test a 
new version of jetty twice.


With Hadoop in Fedora running on these newer deps there is a test bed to 
play with to give some level of confidence before taking the plunge on 
any major change.


Rob


Hadoop in Fedora updated to 2.2.0

2013-10-30 Thread Robert Rati
I've updated the version of Hadoop in Fedora 20 to 2.2.0.  This means 
Hadoop 2.2.0 will be included in the official release of Fedora 20.


Hadoop on Fedora is running against numerous updated dependencies, 
including:

Java 7 (OpenJDK IcedTea)
Jetty 9
Tomcat 7
Jets3t 0.9.0

I've logged/updated jiras for all the changes we've made that could be 
useful to the Hadoop project:


https://issues.apache.org/jira/browse/HADOOP-9594
https://issues.apache.org/jira/browse/MAPREDUCE-5431
https://issues.apache.org/jira/browse/HADOOP-9611
https://issues.apache.org/jira/browse/HADOOP-9613
https://issues.apache.org/jira/browse/HADOOP-9623
https://issues.apache.org/jira/browse/HDFS-5411
https://issues.apache.org/jira/browse/HADOOP-10067
https://issues.apache.org/jira/browse/HDFS-5075
https://issues.apache.org/jira/browse/HADOOP-10068
https://issues.apache.org/jira/browse/HADOOP-10075
https://issues.apache.org/jira/browse/HADOOP-10076
https://issues.apache.org/jira/browse/HADOOP-9849

Most of the changes are minor.  There are 2 big updates though: Jetty 9 
(which requires java 7) and tomcat 7.  These are also the most difficult 
patches to rebase when hadoop produces a new release.


It would be great to get some feedback on these proposed changes and 
discuss how/when/if these could make it into a Hadoop release.


Rob


[jira] [Created] (HADOOP-10075) Update jetty dependency to version 9

2013-10-28 Thread Robert Rati (JIRA)
Robert Rati created HADOOP-10075:


 Summary: Update jetty dependency to version 9
 Key: HADOOP-10075
 URL: https://issues.apache.org/jira/browse/HADOOP-10075
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Robert Rati


Jetty6 is no longer maintained.  Update the dependency to jetty9.





[jira] [Created] (HADOOP-10067) Missing POM dependency on jsr305

2013-10-24 Thread Robert Rati (JIRA)
Robert Rati created HADOOP-10067:


 Summary: Missing POM dependency on jsr305
 Key: HADOOP-10067
 URL: https://issues.apache.org/jira/browse/HADOOP-10067
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Rati
Priority: Minor


Compiling for Fedora reveals a missing declaration for 
javax.annotation.Nullable.  This is the result of a missing explicit dependency 
on jsr305.





[jira] [Created] (HADOOP-10068) Improve log4j regex in testFindContainingJar

2013-10-24 Thread Robert Rati (JIRA)
Robert Rati created HADOOP-10068:


 Summary: Improve log4j regex in testFindContainingJar
 Key: HADOOP-10068
 URL: https://issues.apache.org/jira/browse/HADOOP-10068
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Robert Rati
Priority: Trivial


Improved the regular expression in TestClassUtil:testFindContainingJar to work 
in both Fedora and non-Fedora environments.
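
The issue does not quote the regular expression itself, so the following is only a guess at the shape of the problem. It assumes, purely for illustration, that a Maven-style build sees a versioned jar name while a distribution package installs an unversioned one; both patterns, both paths, and the helper name matches are hypothetical.

```shell
#!/bin/sh
# Hypothetical patterns; the real regex in TestClassUtil is not shown in the issue.
strict='log4j-[0-9.]+\.jar$'    # requires a version suffix in the jar name
relaxed='log4j[^/]*\.jar$'      # also accepts an unversioned jar name

# matches PATTERN PATH: succeeds when PATH matches the extended regex PATTERN.
matches() {
  printf '%s\n' "$2" | grep -Eq "$1"
}

versioned='/home/user/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar'
unversioned='/usr/share/java/log4j.jar'

for path in "$versioned" "$unversioned"; do
  matches "$strict" "$path" && echo "strict  matches $path"
  matches "$relaxed" "$path" && echo "relaxed matches $path"
done
```

A pattern that tolerates a missing version suffix passes in both environments, which is the kind of relaxation the summary describes.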





Hadoop is in Fedora

2013-08-20 Thread Robert Rati

The hadoop package has passed review and has been built for Fedora 20.

Rob