[jira] [Commented] (HADOOP-12747) support wildcard in libjars argument
[ https://issues.apache.org/jira/browse/HADOOP-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403094#comment-15403094 ] Luke Lu commented on HADOOP-12747:

Hey [~sjlee0], this is a very convenient feature. The patch looks good to me. +1.

> support wildcard in libjars argument
> Key: HADOOP-12747
> URL: https://issues.apache.org/jira/browse/HADOOP-12747
> Project: Hadoop Common
> Issue Type: New Feature
> Components: util
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Attachments: HADOOP-12747.01.patch, HADOOP-12747.02.patch, HADOOP-12747.03.patch, HADOOP-12747.04.patch, HADOOP-12747.05.patch, HADOOP-12747.06.patch, HADOOP-12747.07.patch
>
> There is a problem when a user job adds too many dependency jars on its command line. The HADOOP_CLASSPATH part can be addressed, including by using wildcards (\*), but the same cannot be done with the -libjars argument. Today it accepts only fully specified file paths.
> We may want to consider supporting wildcards as a way to help users in this situation. The idea is to handle it the same way the JVM does: \* expands to the list of jars in that directory. It does not traverse into any child directory.
> Also, it would probably be a good idea to do this only for -libjars (i.e. don't do it for -files and -archives).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
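For illustration, the JVM-style expansion the issue describes could be sketched as below. This is only a sketch of the intended semantics; the class and method names are invented, and the actual patch may differ:

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

public class LibjarsWildcard {

  // Expands a path ending in "*" into the jars directly inside that
  // directory, mirroring JVM classpath wildcard semantics: only .jar
  // files in the directory itself, with no recursion into children.
  static List<String> expandWildcard(String path) {
    List<String> result = new ArrayList<String>();
    if (!path.endsWith("*")) {
      result.add(path);  // fully specified path: pass through unchanged
      return result;
    }
    File dir = new File(path.substring(0, path.length() - 1));
    File[] jars = dir.listFiles(new FileFilter() {
      @Override
      public boolean accept(File f) {
        return f.isFile() && f.getName().endsWith(".jar");
      }
    });
    if (jars != null) {
      for (File jar : jars) {
        result.add(jar.getPath());
      }
    }
    return result;
  }
}
```

Under these semantics, a hypothetical `-libjars /opt/app/lib/*` would pick up every jar directly inside /opt/app/lib, while -files and -archives would keep their literal-path behavior.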
[jira] [Commented] (HADOOP-7266) Deprecate metrics v1
[ https://issues.apache.org/jira/browse/HADOOP-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533575#comment-14533575 ] Luke Lu commented on HADOOP-7266:

[~ajisakaa], thanks for taking this over. Changed jobs, have different priorities :) I remember that the reason v1 is still around is mostly that downstream projects (e.g. Hive?) needed it. I think that's no longer the case now.

> Deprecate metrics v1
> Key: HADOOP-7266
> URL: https://issues.apache.org/jira/browse/HADOOP-7266
> Project: Hadoop Common
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 2.8.0
> Reporter: Luke Lu
> Assignee: Akira AJISAKA
> Priority: Blocker
> Attachments: HADOOP-7266.001.patch
[jira] [Commented] (HADOOP-10062) TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
[ https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278835#comment-14278835 ] Luke Lu commented on HADOOP-10062:

This would fix the test, but the test will probably fail again if one sets the regular metrics publish interval to be small. A cleaner fix would change the system to use ScheduledExecutorService (instead of the timer) with a thread pool of one thread, and have publishMetricsNow do an out-of-band service.execute(...).

> TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
> Key: HADOOP-10062
> URL: https://issues.apache.org/jira/browse/HADOOP-10062
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 3.0.0
> Environment: CentOS 6.4, Oracle JDK 1.6.0_31, JDK 1.7.0_45
> Reporter: Shinichi Yamashita
> Assignee: Sangjin Lee
> Priority: Minor
> Attachments: HADOOP-10062-failed.txt, HADOOP-10062-success.txt, HADOOP-10062.003.patch, HADOOP-10062.patch, HADOOP-10062.patch
>
> TestMetricsSystemImpl#testMultiThreadedPublish failed with "Metric not collected!"
> {code}
> Running org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.688 sec FAILURE! - in org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
> testMultiThreadedPublish(org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl) Time elapsed: 0.056 sec FAILURE!
> java.lang.AssertionError: Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Passed
>   at org.junit.Assert.fail(Assert.java:93)
>   at org.junit.Assert.assertTrue(Assert.java:43)
>   at org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl.testMultiThreadedPublish(TestMetricsSystemImpl.java:232)
>
> Results :
> Failed tests:
>   TestMetricsSystemImpl.testMultiThreadedPublish:232 Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Metric not collected! Passed
>
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0
> {code}
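The alternative suggested above, a single-threaded ScheduledExecutorService that serializes both the periodic publish and out-of-band publishMetricsNow requests, could look roughly like this. This is a sketch of the idea only, with invented class and member names, not the actual MetricsSinkAdapter code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SinkPublisher {
  // One thread serializes all publishes, replacing java.util.Timer.
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();

  public void start(long periodMillis) {
    executor.scheduleAtFixedRate(this::publishMetrics,
        periodMillis, periodMillis, TimeUnit.MILLISECONDS);
  }

  // Out-of-band publish: queued on the same single thread, so it
  // cannot race with the periodic task.
  public void publishMetricsNow() throws InterruptedException {
    CountDownLatch done = new CountDownLatch(1);
    executor.execute(() -> {
      try {
        publishMetrics();
      } finally {
        done.countDown();
      }
    });
    done.await();  // block the caller until the publish completes
  }

  protected void publishMetrics() {
    // snapshot sources and push to sinks (elided)
  }

  public void stop() {
    executor.shutdown();
  }
}
```

Because both paths run on the one executor thread, the immediate publish can never interleave with a periodic publish, regardless of how small the publish interval is.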
[jira] [Commented] (HADOOP-10062) TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
[ https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279158#comment-14279158 ] Luke Lu commented on HADOOP-10062:

You're right. Your current fix is good enough for correctness and is low risk. +1 for the patch. Using ScheduledExecutorService is just a little cleaner, and can be deferred.
[jira] [Created] (HADOOP-11152) Better random number generator
Luke Lu created HADOOP-11152:

Summary: Better random number generator
Key: HADOOP-11152
URL: https://issues.apache.org/jira/browse/HADOOP-11152
Project: Hadoop Common
Issue Type: Improvement
Reporter: Luke Lu

HDFS-7122 showed that naive ThreadLocal usage of the simple LCG-based j.u.Random creates an unacceptable distribution of random numbers for block placement. Similarly, ThreadLocalRandom in Java 7 (the same static thread local, with the synchronized methods overridden) has the same problem.

"Better" is defined as better quality and faster than j.u.Random (which is already much faster (20x) than SecureRandom). People (e.g. Numerical Recipes) have shown that by combining LCG and XORShift we can have a better fast RNG. It'd be worthwhile to investigate thread-local versions of these better RNGs.
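A thread-local combined generator of the kind described above (in the style of Numerical Recipes' Ranq1: an XORShift step whose output is fed through a multiplicative step) might be sketched like this. The shift and multiplier constants are from the published recipe; the class itself is illustrative, not a proposed patch:

```java
import java.util.concurrent.ThreadLocalRandom;

// 64-bit XORShift combined with a multiplicative step; one instance
// per thread avoids the contention of a shared j.u.Random.
public class FastRandom {
  private long v;

  public FastRandom(long seed) {
    v = seed ^ 4101842887655102017L;  // recommended initialization
    v = nextLong();                   // warm up once
  }

  public long nextLong() {
    v ^= v >>> 21;                    // XORShift triple
    v ^= v << 35;
    v ^= v >>> 4;
    return v * 2685821657736338717L;  // multiplicative output step
  }

  public int nextInt(int bound) {
    long r = nextLong() >>> 1;        // force non-negative
    return (int) (r % bound);
  }

  // Per-thread instances, each independently seeded.
  private static final ThreadLocal<FastRandom> LOCAL =
      ThreadLocal.withInitial(
          () -> new FastRandom(ThreadLocalRandom.current().nextLong()));

  public static FastRandom current() {
    return LOCAL.get();
  }
}
```

Each thread draws from its own generator state via FastRandom.current(), so no synchronization is needed on the hot path.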
[jira] [Updated] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9704:

Hadoop Flags: Reviewed
Status: Patch Available (was: Open)

+1 pending Jenkins results.

> Write metrics sink plugin for Hadoop/Graphite
> Key: HADOOP-9704
> URL: https://issues.apache.org/jira/browse/HADOOP-9704
> Project: Hadoop Common
> Issue Type: New Feature
> Affects Versions: 2.0.3-alpha
> Reporter: Chu Tong
> Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch, HADOOP-9704.patch
>
> Write a metrics sink plugin for Hadoop to send metrics directly to Graphite, in addition to the current ganglia and file ones.
[jira] [Commented] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011733#comment-14011733 ] Luke Lu commented on HADOOP-9704:

[~raviprak]: Exceptions in sink impls won't bring down the daemon. The metrics system is designed to be resilient to transient back-end errors. It'll also do retries according to config.
[jira] [Commented] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011743#comment-14011743 ] Luke Lu commented on HADOOP-9704:

The patch looks good overall. Thanks [~babakbehzad]! Please remove the tabs in the source and format it according to https://wiki.apache.org/hadoop/CodeReviewChecklist
[jira] [Commented] (HADOOP-10577) Fix some minors error and compile on macosx
[ https://issues.apache.org/jira/browse/HADOOP-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13991279#comment-13991279 ] Luke Lu commented on HADOOP-10577:

lgtm. +1.

> Fix some minors error and compile on macosx
> Key: HADOOP-10577
> URL: https://issues.apache.org/jira/browse/HADOOP-10577
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Binglin Chang
> Assignee: Binglin Chang
> Priority: Minor
> Attachments: HADOOP-10577.v1.patch, HADOOP-10577.v2.patch
[jira] [Commented] (HADOOP-10389) Native RPCv9 client
[ https://issues.apache.org/jira/browse/HADOOP-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986134#comment-13986134 ] Luke Lu commented on HADOOP-10389:

I assume this is a prototype/partial demo, as all the security stuff is missing? What else is missing (including not yet published)? Is there any work left for the newly minted branch committers? :)

> Native RPCv9 client
> Key: HADOOP-10389
> URL: https://issues.apache.org/jira/browse/HADOOP-10389
> Project: Hadoop Common
> Issue Type: Sub-task
> Affects Versions: HADOOP-10388
> Reporter: Binglin Chang
> Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10388.001.patch, HADOOP-10389.002.patch, HADOOP-10389.004.patch
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830403#comment-13830403 ] Luke Lu commented on HADOOP-10090:

The 1.4 patch lgtm. +1.

> Jobtracker metrics not updated properly after execution of a mapreduce job
> Key: HADOOP-10090
> URL: https://issues.apache.org/jira/browse/HADOOP-10090
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 1.2.1
> Reporter: Ivan Mitic
> Assignee: Ivan Mitic
> Attachments: HADOOP-10090.branch-1.2.patch, HADOOP-10090.branch-1.3.patch, HADOOP-10090.branch-1.4.patch, HADOOP-10090.branch-1.patch, OneBoxRepro.png
>
> After executing a wordcount mapreduce sample job, jobtracker metrics are not updated properly. Oftentimes the response from the jobtracker has a higher number of job_completed than job_submitted (for example 8 jobs completed and 7 jobs submitted). Issue reported by Toma Paunovic.
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825516#comment-13825516 ] Luke Lu commented on HADOOP-10090:

bq. doing only the backport is not enough, the bug from this Jira will remain.

I must be missing something obvious. Why would the bug remain if lastRecs always contains a complete snapshot?
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824041#comment-13824041 ] Luke Lu commented on HADOOP-10090:

I now recall some hesitation about the extra lock on source:
* It could adversely affect application performance by holding the source lock while doing a snapshot. Currently sources have a choice of whether and how the snapshot is synchronized, depending on the nature of the metrics involved. In many cases, the source is implemented by a real object that has application locking logic. Holding a lock while doing a potentially large snapshot (many metrics) _could_ increase lock contention significantly.
* Locking far away from the object is considered an anti-pattern that makes it hard to reason about locking by looking at the source alone.

bq. do you think we should backport YARN-1043? Looks like an incompatible change so not sure whether we want it back to 1.0 line.

Always updating all should be a compatible change, semantics-wise, apart from extra objects for non-changing metrics. It seems that simply backporting the one-line change (all is true always) from YARN-1043 (and keeping the test) is less risky than changing the locking mechanisms.
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824237#comment-13824237 ] Luke Lu commented on HADOOP-10090:

It seems to me that just porting the one-line change from YARN-1043 will fix the problem and be less risky? The unit test in the current patch is still useful. The corresponding issue should already be fixed in trunk due to YARN-1043.
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822772#comment-13822772 ] Luke Lu commented on HADOOP-10090:

bq. I'm not sure where #3 differs from #2.

#3 is an improvement on #2 when cache TTL > regular snapshot interval, where jmx will get at least the same freshness as sinks, even with a longer TTL. Anyway, it appears #2 is easier to understand and serves the typical use case (cache TTL <= regular snapshot interval) well enough.

bq. JMX will always return complete result, but the sink might miss some changes

Your patch already introduces forceAllMetricsOnSource _after_ TTL expiry; it might be able to eliminate the problem with the following changes?

Comments on the patch:
# forceAllMetricsOnSource doesn't need to be volatile, as it's always read/written in synchronized sections.
# updateJmxCache now copies some logic of getMetrics and doesn't work with source metrics filtering (a feature regression). It seems to me that you can still reuse getMetrics by adding a check {{if (!calledWithAll)}} for resetting forceAllMetricsOnSource to false, so that the next sink update will be consistent?
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823152#comment-13823152 ] Luke Lu commented on HADOOP-10090:

[~cnauroth]: That's a good idea. I actually thought about doing that for HADOOP-8050, as MetricsSystemImpl is already implementing MetricsSource in trunk. OTOH, this seems to be no longer an issue in trunk due to YARN-1043, which disabled sparse updates completely. I was not aware of that until now. In retrospect, I probably should've done that by default, as all the sinks except the first (and then sole) production sink that the system was designed for cannot handle sparse updates anyway. I still cringe at wasting resources snapshotting all the failure metrics that don't change much. Maybe I'll scratch that itch one day.
[jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated properly after execution of a mapreduce job
[ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820421#comment-13820421 ] Luke Lu commented on HADOOP-10090:

I was aware of the suboptimal behavior and hoped it'd be OK for metrics, which don't require strong consistency.

#1 incurs unnecessary overhead (updating the jmx cache) for people who don't use JMX. This is the reason for the current cache logic. #2 is risky, as we don't know all the existing jmx query patterns (especially due to HDFS-5333). Users (admins) actually already have the choice to use a small JMX cache TTL for freshness.

How about #3: we only initialize and update the JMX cache when JMX is first used, stop updating after a period of inactivity, and reinitialize and update the JMX cache upon renewed activity. Initialize/reinitialize is a dense update, while update means the sparse update with the current lastRecs mechanisms. I think #3 should be a fairly straightforward patch and more flexible than #1 and #2.
[jira] [Commented] (HADOOP-10062) TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
[ https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818638#comment-13818638 ] Luke Lu commented on HADOOP-10062:

I think the following has more to do with the failure than anything else:

{code}
WARN impl.MetricsSinkAdapter (MetricsSinkAdapter.java:putMetricsImmediate(112)) - hanging couldn't fulfill an immediate putMetrics request in time. Abandoning.
{code}

Is your environment physical or virtual?
[jira] [Commented] (HADOOP-10062) TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
[ https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817920#comment-13817920 ] Luke Lu commented on HADOOP-10062:

Since publishMetrics is already synchronized, I don't see how the additional lock helps. The barrier fix in the test looks more promising. Can you verify that the barrier-only fix works?
[jira] [Commented] (HADOOP-10059) RPC authentication and authorization metrics overflow to negative values on busy clusters
[ https://issues.apache.org/jira/browse/HADOOP-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815812#comment-13815812 ] Luke Lu commented on HADOOP-10059:

The .2 patch lgtm. +1.

> RPC authentication and authorization metrics overflow to negative values on busy clusters
> Key: HADOOP-10059
> URL: https://issues.apache.org/jira/browse/HADOOP-10059
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 0.23.9, 2.2.0
> Reporter: Jason Lowe
> Assignee: Tsuyoshi OZAWA
> Priority: Minor
> Attachments: HADOOP-10059.1.patch, HADOOP-10059.2.patch
>
> The RPC metrics for authorization and authentication successes can easily overflow to negative values on a busy cluster that has been up for a long time. We should consider providing 64-bit values for these counters.
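The overflow in question is ordinary 32-bit wraparound: a signed int counter that passes Integer.MAX_VALUE goes negative, while a 64-bit long has ample headroom. A minimal illustration (plain Java, not Hadoop's mutable metric types):

```java
public class CounterOverflow {
  public static void main(String[] args) {
    int narrow = Integer.MAX_VALUE; // 2147483647; one more increment...
    narrow++;                       // ...wraps around to -2147483648
    long wide = Integer.MAX_VALUE;  // same count, widened to 64 bits
    wide++;                         // 2147483648, no wraparound
    System.out.println(narrow + " " + wide);
  }
}
```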
[jira] [Commented] (HADOOP-10062) TestMetricsSystemImpl#testMultiThreadedPublish fails on trunk
[ https://issues.apache.org/jira/browse/HADOOP-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804023#comment-13804023 ] Luke Lu commented on HADOOP-10062:

Platform, JDK version?
[jira] [Commented] (HADOOP-9559) When metrics system is restarted MBean names get incorrectly flagged as dupes
[ https://issues.apache.org/jira/browse/HADOOP-9559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804026#comment-13804026 ] Luke Lu commented on HADOOP-9559:

The patch looks reasonable, with one minor nit: the new static DefaultMetricsSystem#sourceName method is not used anywhere.

> When metrics system is restarted MBean names get incorrectly flagged as dupes
> Key: HADOOP-9559
> URL: https://issues.apache.org/jira/browse/HADOOP-9559
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Mostafa Elhemali
> Attachments: HADOOP-9559.patch
>
> In the Metrics2 system, every source gets registered as an MBean name, which gets put into a unique name pool in the singleton DefaultMetricsSystem object. The problem is that when the metrics system is shut down (which unregisters the MBeans), this unique name pool is left as is, so if the metrics system is started again, every attempt to register the same MBean names fails (the exception is eaten and a warning is logged).
> I think the fix here is to remove the name from the unique name pool when an MBean is unregistered, since it's OK at that point to add it again.
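The fix described amounts to making the unique-name pool symmetric with MBean unregistration. A minimal sketch of the idea, with invented names, not the actual DefaultMetricsSystem code:

```java
import java.util.HashSet;
import java.util.Set;

// Tracks registered MBean source names so duplicates are rejected
// while registered, but a name can be reused after unregistration.
public class UniqueNamePool {
  private final Set<String> names = new HashSet<String>();

  public synchronized String newName(String name) {
    if (!names.add(name)) {
      throw new IllegalArgumentException("duplicate name: " + name);
    }
    return name;
  }

  // Called when the MBean is unregistered (e.g. on metrics system
  // shutdown), so a restart can register the same name again.
  public synchronized void removeName(String name) {
    names.remove(name);
  }
}
```

Without removeName being called on unregistration, a stop/start cycle hits the duplicate branch for every previously registered source, which is exactly the symptom in this issue.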
[jira] [Commented] (HADOOP-10015) UserGroupInformation prints out excessive ERROR warnings
[ https://issues.apache.org/jira/browse/HADOOP-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804032#comment-13804032 ] Luke Lu commented on HADOOP-10015:

Would excluding common innocuous exceptions like FileNotFoundException from error logging (while still logging all exceptions at debug level) cover most cases here?

> UserGroupInformation prints out excessive ERROR warnings
> Key: HADOOP-10015
> URL: https://issues.apache.org/jira/browse/HADOOP-10015
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Haohui Mai
> Assignee: Haohui Mai
> Attachments: HADOOP-10015.000.patch, HADOOP-10015.001.patch, HADOOP-10015.002.patch
>
> In UserGroupInformation::doAs(), it prints out a log at ERROR level whenever it catches an exception. However, it prints benign warnings in the following paradigm:
> {noformat}
> try {
>   ugi.doAs(new PrivilegedExceptionAction<FileStatus>() {
>     @Override
>     public FileStatus run() throws Exception {
>       return fs.getFileStatus(nonExist);
>     }
>   });
> } catch (FileNotFoundException e) {
> }
> {noformat}
> For example, FileSystem#exists() follows this paradigm. Distcp uses this paradigm too. The exception is expected, therefore no ERROR logs should be printed in the namenode logs.
> Currently, the user quickly finds out that the namenode log is filled with _benign_ ERROR logs when he or she runs distcp in a secure setup. This behavior confuses the operators.
> This jira proposes to move the log to DEBUG level.
[jira] [Commented] (HADOOP-10040) hadoop.cmd in UNIX format and would not run by default on Windows
[ https://issues.apache.org/jira/browse/HADOOP-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794596#comment-13794596 ] Luke Lu commented on HADOOP-10040: -- Chris, thanks for taking care of this! Some notes for posterity: Due to the different ways subversion and git handle text EOLs, the build artifacts are currently VCS (version control system) dependent. Currently, if a release is built with git on either windows or \*nix platforms, the result will be correct (windows text (\*.cmd etc.) will have CRLF as EOL). If a release is built with subversion on \*nix platforms, windows text will have LF as EOL. This is not a major issue yet, as people typically build windows releases on windows boxes (required for windows native bits) and \*nix releases on corresponding machines. hadoop.cmd in UNIX format and would not run by default on Windows - Key: HADOOP-10040 URL: https://issues.apache.org/jira/browse/HADOOP-10040 Project: Hadoop Common Issue Type: Bug Reporter: Yingda Chen Assignee: Chris Nauroth Fix For: 3.0.0, 2.2.1 The hadoop.cmd currently checked into hadoop-common is in UNIX format, same as most other src files. However, hadoop.cmd is meant to be used on Windows only; the fact that it is in UNIX format makes it unrunnable as is on the Windows platform. An exception shall be made for hadoop.cmd (and other cmd files for that matter) to make sure they are in DOS format, so they are runnable as is when checked out from the source repository. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-10040) hadoop.cmd in UNIX format and would not run by default on Windows
[ https://issues.apache.org/jira/browse/HADOOP-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793469#comment-13793469 ] Luke Lu commented on HADOOP-10040: -- [~cnauroth]: You'll need to revert the change first before redoing the propset, so that the repository contains normalized text. When you change .gitattributes, make sure it contains *.vcxproj, which was not there. [~ste...@apache.org]: A build-time fix is a good idea, but orthogonal to the repository fix. hadoop.cmd in UNIX format and would not run by default on Windows - Key: HADOOP-10040 URL: https://issues.apache.org/jira/browse/HADOOP-10040 Project: Hadoop Common Issue Type: Bug Reporter: Yingda Chen Assignee: Chris Nauroth Fix For: 3.0.0, 2.2.1 The hadoop.cmd currently checked into hadoop-common is in UNIX format, same as most other src files. However, hadoop.cmd is meant to be used on Windows only; the fact that it is in UNIX format makes it unrunnable as is on the Windows platform. An exception shall be made for hadoop.cmd (and other cmd files for that matter) to make sure they are in DOS format, so they are runnable as is when checked out from the source repository. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Reopened] (HADOOP-10040) hadoop.cmd in UNIX format and would not run by default on Windows
[ https://issues.apache.org/jira/browse/HADOOP-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu reopened HADOOP-10040: -- Woah, this completely messes up git. Short answer: you should svn propset windows files as eol-style *native* Long answer: in order for .gitattributes to work correctly with eol attributes, all text files with eol attributes are stored with LF in the repository and converted to the value of eol upon checkout. This is not compatible with svn eol-style CRLF, which changes the content in the repository as well. With svn eol-style native, an svn checkout will convert normalized text files (stored with LF) to the platform's native EOL (CRLF on windows). I committed a workaround (to trunk and branch-2, so people can work with git) with .gitattributes marking windows files as binary, so git won't touch them. hadoop.cmd in UNIX format and would not run by default on Windows - Key: HADOOP-10040 URL: https://issues.apache.org/jira/browse/HADOOP-10040 Project: Hadoop Common Issue Type: Bug Reporter: Yingda Chen Assignee: Chris Nauroth Fix For: 3.0.0, 2.2.1 The hadoop.cmd currently checked into hadoop-common is in UNIX format, same as most other src files. However, hadoop.cmd is meant to be used on Windows only; the fact that it is in UNIX format makes it unrunnable as is on the Windows platform. An exception shall be made for hadoop.cmd (and other cmd files for that matter) to make sure they are in DOS format, so they are runnable as is when checked out from the source repository. -- This message was sent by Atlassian JIRA (v6.1#6144)
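The two approaches discussed above can be illustrated with a .gitattributes sketch. These patterns are hypothetical, not the exact lines committed to the Hadoop repository:

```
# Workaround committed here: treat Windows-only files as binary so git
# never rewrites their line endings.
*.cmd binary
*.bat binary

# The eol-attribute alternative (only safe once the repository stores
# these files normalized with LF; the svn side would use eol-style native):
# *.cmd eol=crlf
# *.bat eol=crlf
```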
[jira] [Updated] (HADOOP-9964) O.A.H.U.ReflectionUtils.printThreadInfo() is not thread-safe which cause TestHttpServer pending 10 minutes or longer.
[ https://issues.apache.org/jira/browse/HADOOP-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9964: Resolution: Fixed Fix Version/s: 2.3.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks Junping for the patch! O.A.H.U.ReflectionUtils.printThreadInfo() is not thread-safe which cause TestHttpServer pending 10 minutes or longer. - Key: HADOOP-9964 URL: https://issues.apache.org/jira/browse/HADOOP-9964 Project: Hadoop Common Issue Type: Bug Components: util Reporter: Junping Du Assignee: Junping Du Fix For: 2.3.0 Attachments: HADOOP-9964.patch, jstack-runTestHttpServer.log The printThreadInfo() in ReflectionUtils is not thread-safe which cause two or more threads calling this method from StackServlet to deadlock. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781494#comment-13781494 ] Luke Lu commented on HADOOP-9704: - Thanks for the patch, Alex and Chu! Both of you will be credited :) Feedback for the latest patch: # There is no need to do context filtering in a sink implementation, as metrics filtering is provided by the framework itself for any sink. Currently, context filtering is either all or a particular context, though individual records and metrics can handle glob patterns. File another JIRA to support context globbing (simple to implement) if that's what you want. # Use StringBuilder instead of StringBuffer, as the buffer is not shared across threads, hence no need to pay the synchronized penalty. # ArgumentCaptor<String> should work with a mockito mock writer (use #getAllValues to access values from multiple invocations). There is no need to do manual capture. Write metrics sink plugin for Hadoop/Graphite - Key: HADOOP-9704 URL: https://issues.apache.org/jira/browse/HADOOP-9704 Project: Hadoop Common Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Chu Tong Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch Write a metrics sink plugin for Hadoop to send metrics directly to Graphite in addition to the current ganglia and file ones. -- This message was sent by Atlassian JIRA (v6.1#6144)
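The StringBuilder point above can be sketched as follows. This is a hypothetical helper, not the actual sink class from the patch; it builds a line in Graphite's plaintext protocol ("metric.path value timestamp\n"). The buffer is confined to one thread per call, so the unsynchronized StringBuilder is the right choice over StringBuffer.

```java
// Hypothetical Graphite line formatter illustrating the StringBuilder advice.
public class GraphiteLines {
  public static String format(String prefix, String name, double value, long epochSecs) {
    // StringBuilder, not StringBuffer: the buffer is never shared across
    // threads, so there is no reason to pay the synchronized penalty.
    StringBuilder buf = new StringBuilder();
    buf.append(prefix).append('.').append(name)
       .append(' ').append(value)
       .append(' ').append(epochSecs)
       .append('\n');
    return buf.toString();
  }

  public static void main(String[] args) {
    System.out.print(format("hadoop.namenode", "FilesTotal", 42.0, 1380000000L));
  }
}
```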
[jira] [Commented] (HADOOP-9964) O.A.H.U.ReflectionUtils.printThreadInfo() is not thread-safe which cause TestHttpServer pending 10 minutes or longer.
[ https://issues.apache.org/jira/browse/HADOOP-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13779776#comment-13779776 ] Luke Lu commented on HADOOP-9964: - The patch lgtm. +1. O.A.H.U.ReflectionUtils.printThreadInfo() is not thread-safe which cause TestHttpServer pending 10 minutes or longer. - Key: HADOOP-9964 URL: https://issues.apache.org/jira/browse/HADOOP-9964 Project: Hadoop Common Issue Type: Bug Components: util Reporter: Junping Du Assignee: Junping Du Attachments: HADOOP-9964.patch, jstack-runTestHttpServer.log The printThreadInfo() in ReflectionUtils is not thread-safe which cause two or more threads calling this method from StackServlet to deadlock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9332) Crypto codec implementations for AES
[ https://issues.apache.org/jira/browse/HADOOP-9332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760393#comment-13760393 ] Luke Lu commented on HADOOP-9332: - Preliminary comments: * I had to dig to find out that both SimpleAESCodec and AESCodec are using CTR mode, which is fine. I was thrown off by the comment at the beginning of SimpleAESCodec, as we know AES in the default ECB mode is pretty much worthless. Suggest renaming the codec to AESCTRCodec, unless you're going to support other modes (CBC etc.), which would require an explicit comment. * There is much code duplication between SimpleAESCodec (without compressor option) and AESCodec (with compressor option). Suggest consolidating the code into one codec. Crypto codec implementations for AES Key: HADOOP-9332 URL: https://issues.apache.org/jira/browse/HADOOP-9332 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 3.0.0 Reporter: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-9332.patch, HADOOP-9332.patch This JIRA task provides three crypto codec implementations based on the Hadoop crypto codec framework. They are: 1. Simple AES Codec. AES codec implementation based on AES-NI. (Not splittable) 2. AES Codec. AES codec implementation based on AES-NI in splittable format. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
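The CTR-versus-ECB point above can be shown with plain JCE (this is a sketch, not the AES-NI codec in the patch): CTR turns AES into a seekable stream cipher, which is what makes a splittable codec feasible, while default ECB leaks equal-block patterns.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Demo of AES in CTR mode via the standard JCE transformation string.
public class AesCtrDemo {
  public static byte[] ctr(byte[] key, byte[] iv, byte[] data) throws Exception {
    Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
    c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    // In CTR mode, encryption and decryption are the same XOR-with-keystream
    // operation, so applying ctr() twice round-trips the data.
    return c.doFinal(data);
  }

  public static void main(String[] args) throws Exception {
    byte[] key = new byte[16]; // demo only: never use an all-zero key in production
    byte[] iv = new byte[16];
    byte[] msg = "hello".getBytes(StandardCharsets.UTF_8);
    byte[] ct = ctr(key, iv, msg);
    System.out.println(Arrays.equals(ctr(key, iv, ct), msg)); // round-trips
  }
}
```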
[jira] [Updated] (HADOOP-9916) Race condition in ipc.Client causes TestIPC timeout
[ https://issues.apache.org/jira/browse/HADOOP-9916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9916: Resolution: Fixed Fix Version/s: 2.1.1-beta Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, branch-2, 2.1-beta. Thanks Binglin for the patch! Race condition in ipc.Client causes TestIPC timeout --- Key: HADOOP-9916 URL: https://issues.apache.org/jira/browse/HADOOP-9916 Project: Hadoop Common Issue Type: Bug Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Fix For: 2.1.1-beta Attachments: HADOOP-9916.v1.patch TestIPC times out occasionally, for example: [https://issues.apache.org/jira/browse/HDFS-5130?focusedCommentId=13749870&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13749870] [https://issues.apache.org/jira/browse/HADOOP-9915?focusedCommentId=13753302&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13753302] Looking into the code, there is a race condition in oah.ipc.Client between the RPC call thread and the connection read-response thread: {code} if (status == RpcStatusProto.SUCCESS) { Writable value = ReflectionUtils.newInstance(valueClass, conf); value.readFields(in); // read value call.setRpcResponse(value); calls.remove(callId); {code} Read Thread: Connection.receiveRpcResponse -> call.setRpcResponse(value) -> notify Call Thread Call Thread: Client.call -> Connection.addCall(retry with the same callId) -> notify read thread Read Thread: calls.remove(callId) # intends to remove the old call, but removes the newly added call... Connection.waitForWork ends up waiting maxIdleTime and closing the connection. The call never gets a response and is dead. The problem didn't show before because the callId was unique, so we never accidentally removed newly added calls, but when retry was added this race condition became possible. To solve this, we can simply change the order: remove the call first, then notify the call thread. 
Note there are many places that need this order change (normal case, error case, cleanup case). And there are some minor issues in TestIPC: 1. there are two methods with the same name: void testSerial() void testSerial(int handlerCount, boolean handlerSleep, ...) the second is not a test case (so should not use the testXXX prefix), but somehow it causes testSerial (the first one) to run two times, see test report: {code} <testcase time="26.896" classname="org.apache.hadoop.ipc.TestIPC" name="testSerial"/> <testcase time="25.426" classname="org.apache.hadoop.ipc.TestIPC" name="testSerial"/> {code} 2. a timeout annotation should be added, so next time the related log is available. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
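The reordering fix described in this issue can be sketched with a minimal, hypothetical model (not the real ipc.Client classes): the response reader must remove the call from the table before waking the caller, because a woken caller may immediately retry under the same callId.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical mini model of the ipc.Client call table and the fixed ordering.
public class CallTable {
  public static class Call {
    public volatile Object response;
    public synchronized void setRpcResponse(Object value) {
      response = value;
      notifyAll(); // wakes the caller, which may immediately retry with the same callId
    }
  }

  private final ConcurrentHashMap<Integer, Call> calls = new ConcurrentHashMap<>();

  public void add(int callId, Call call) { calls.put(callId, call); }

  // Fixed order: remove the call from the table first, THEN notify it.
  // With the old order (notify, then remove), a retry re-added under the
  // same callId could be removed by the stale calls.remove(callId) and
  // then wait forever for a response that never comes.
  public void receiveResponse(int callId, Object value) {
    Call call = calls.remove(callId);
    if (call != null) {
      call.setRpcResponse(value);
    }
  }

  public int size() { return calls.size(); }
}
```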
[jira] [Commented] (HADOOP-9916) Race condition in ipc.Client causes TestIPC timeout
[ https://issues.apache.org/jira/browse/HADOOP-9916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756883#comment-13756883 ] Luke Lu commented on HADOOP-9916: - Nice catch Binglin! The patch lgtm. +1. Will commit shortly. Race condition in ipc.Client causes TestIPC timeout --- Key: HADOOP-9916 URL: https://issues.apache.org/jira/browse/HADOOP-9916 Project: Hadoop Common Issue Type: Bug Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HADOOP-9916.v1.patch TestIPC times out occasionally, for example: [https://issues.apache.org/jira/browse/HDFS-5130?focusedCommentId=13749870&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13749870] [https://issues.apache.org/jira/browse/HADOOP-9915?focusedCommentId=13753302&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13753302] Looking into the code, there is a race condition in oah.ipc.Client between the RPC call thread and the connection read-response thread: {code} if (status == RpcStatusProto.SUCCESS) { Writable value = ReflectionUtils.newInstance(valueClass, conf); value.readFields(in); // read value call.setRpcResponse(value); calls.remove(callId); {code} Read Thread: Connection.receiveRpcResponse -> call.setRpcResponse(value) -> notify Call Thread Call Thread: Client.call -> Connection.addCall(retry with the same callId) -> notify read thread Read Thread: calls.remove(callId) # intends to remove the old call, but removes the newly added call... Connection.waitForWork ends up waiting maxIdleTime and closing the connection. The call never gets a response and is dead. The problem didn't show before because the callId was unique, so we never accidentally removed newly added calls, but when retry was added this race condition became possible. To solve this, we can simply change the order: remove the call first, then notify the call thread. 
Note there are many places that need this order change (normal case, error case, cleanup case). And there are some minor issues in TestIPC: 1. there are two methods with the same name: void testSerial() void testSerial(int handlerCount, boolean handlerSleep, ...) the second is not a test case (so should not use the testXXX prefix), but somehow it causes testSerial (the first one) to run two times, see test report: {code} <testcase time="26.896" classname="org.apache.hadoop.ipc.TestIPC" name="testSerial"/> <testcase time="25.426" classname="org.apache.hadoop.ipc.TestIPC" name="testSerial"/> {code} 2. a timeout annotation should be added, so next time the related log is available. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9784) Add a builder for HttpServer
[ https://issues.apache.org/jira/browse/HADOOP-9784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9784: Resolution: Fixed Fix Version/s: 2.3.0 Target Version/s: (was: 3.0.0, 2.1.0-beta) Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks Junping for the patch! Add a builder for HttpServer Key: HADOOP-9784 URL: https://issues.apache.org/jira/browse/HADOOP-9784 Project: Hadoop Common Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Junping Du Assignee: Junping Du Fix For: 2.3.0 Attachments: HADOOP-9784.patch, HADOOP-9784-v2.patch, HADOOP-9784-v3.patch, HADOOP-9784-v4.patch, HADOOP-9784-v5.1.patch, HADOOP-9784-v5.patch There are quite a lot of constructors in class of HttpServer to create instance. Create a builder class to abstract the building steps which helps to avoid more constructors in the future. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9784) Add a builder for HttpServer
[ https://issues.apache.org/jira/browse/HADOOP-9784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745345#comment-13745345 ] Luke Lu commented on HADOOP-9784: - This is more palatable than the anonymous inner class blob to init SPNEGO. The v5.1 patch lgtm. +1. Will commit shortly. Add a builder for HttpServer Key: HADOOP-9784 URL: https://issues.apache.org/jira/browse/HADOOP-9784 Project: Hadoop Common Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Junping Du Assignee: Junping Du Attachments: HADOOP-9784.patch, HADOOP-9784-v2.patch, HADOOP-9784-v3.patch, HADOOP-9784-v4.patch, HADOOP-9784-v5.1.patch, HADOOP-9784-v5.patch There are quite a lot of constructors in class of HttpServer to create instance. Create a builder class to abstract the building steps which helps to avoid more constructors in the future. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9871) Fix intermittent findbug warnings in DefaultMetricsSystem
[ https://issues.apache.org/jira/browse/HADOOP-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739411#comment-13739411 ] Luke Lu commented on HADOOP-9871: - The patch lgtm. +1. Will commit shortly. Fix intermittent findbug warnings in DefaultMetricsSystem - Key: HADOOP-9871 URL: https://issues.apache.org/jira/browse/HADOOP-9871 Project: Hadoop Common Issue Type: Bug Reporter: Luke Lu Assignee: Junping Du Priority: Minor Attachments: HADOOP-9871.patch Findbugs sometimes picks up warnings from DefaultMetricsSystem due to some of the fields not being transient for serializable class (DefaultMetricsSystem is an Enum (which is serializable)). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9872) Improve protoc version handling and detection
[ https://issues.apache.org/jira/browse/HADOOP-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740012#comment-13740012 ] Luke Lu commented on HADOOP-9872: - As long as I can override protoc.path and protobuf.version, I'm happy. +1 pending jenkins. Improve protoc version handling and detection - Key: HADOOP-9872 URL: https://issues.apache.org/jira/browse/HADOOP-9872 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Priority: Blocker Fix For: 2.1.0-beta Attachments: HADOOP-9872.patch HADOOP-9845 bumped up protoc from 2.4.1 to 2.5.0, but we ran into a few quirks: * 'protoc --version' in 2.4.1 exits with 1 * 'protoc --version' in 2.5.0 exits with 0 * if you have multiple protoc in your environment, you have to put the one you want to use in the PATH before building hadoop * build documentation and requirements of protoc are outdated This patch does: * handles the protoc version correctly independently of the exit code * if the HADOOP_PROTOC_PATH env var is defined, it uses it as the protoc executable * if HADOOP_PROTOC_PATH is not defined, it picks protoc from the PATH * documentation updated to reflect 2.5.0 is required * enforces that the versions of protoc and the protobuf JAR are the same * Added to VersionInfo the protoc version used (sooner or later this will be useful in a troubleshooting situation). [~vicaya] suggested making the version check for protoc lax (i.e. 2.5.*). While working on the patch I've thought about that. But that would introduce a potential mismatch between protoc and the protobuf JAR. Still, if you want to use a different version of protoc/protobuf from the one defined in the POM, you can use -Dprotobuf.version= to specify your alternate version. 
But I would recommend not to do this, because if you publish the artifacts to a Maven repo, the fact you used -Dprotobuf.version= will be lost and the version defined in the POM properties will be used (IMO Maven should use the effective POM on deploy, but they don't). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9871) Fix intermittent findbug warnings in DefaultMetricsSystem
[ https://issues.apache.org/jira/browse/HADOOP-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9871: Resolution: Fixed Fix Version/s: 2.3.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks, Junping! Fix intermittent findbug warnings in DefaultMetricsSystem - Key: HADOOP-9871 URL: https://issues.apache.org/jira/browse/HADOOP-9871 Project: Hadoop Common Issue Type: Bug Reporter: Luke Lu Assignee: Junping Du Priority: Minor Fix For: 2.3.0 Attachments: HADOOP-9871.patch Findbugs sometimes picks up warnings from DefaultMetricsSystem due to some of the fields not being transient for serializable class (DefaultMetricsSystem is an Enum (which is serializable)). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-9871) Fix intermittent findbug warnings in DefaultMetricsSystem
Luke Lu created HADOOP-9871: --- Summary: Fix intermittent findbug warnings in DefaultMetricsSystem Key: HADOOP-9871 URL: https://issues.apache.org/jira/browse/HADOOP-9871 Project: Hadoop Common Issue Type: Bug Reporter: Luke Lu Assignee: Junping Du Priority: Minor Findbugs sometimes (not always) picks up warnings from DefaultMetricsSystem due to some of the fields not being transient for serializable class (DefaultMetricsSystem is an Enum (which is serializable)). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
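The fix direction can be sketched as follows (hypothetical enum and field names, not the actual DefaultMetricsSystem): every enum is implicitly java.io.Serializable, so findbugs can flag non-transient instance fields; marking the mutable state transient silences the warning, and is harmless because Java serializes enum constants by name only, so instance fields never hit the wire anyway.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical enum singleton illustrating the transient-field fix.
public enum SingletonWithState {
  INSTANCE;

  // transient: never part of the (name-only) enum serialized form,
  // and it keeps findbugs from flagging the field on this Serializable type.
  private final transient AtomicReference<String> impl =
      new AtomicReference<>("prototype");

  public String current() { return impl.get(); }
  public void set(String v) { impl.set(v); }
}
```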
[jira] [Commented] (HADOOP-9446) Support Kerberos HTTP SPNEGO authentication for non-SUN JDK
[ https://issues.apache.org/jira/browse/HADOOP-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13738848#comment-13738848 ] Luke Lu commented on HADOOP-9446: - The v2 patch lgtm. +1. Filed HADOOP-9871 to address #2. Will commit shortly. Support Kerberos HTTP SPNEGO authentication for non-SUN JDK --- Key: HADOOP-9446 URL: https://issues.apache.org/jira/browse/HADOOP-9446 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.1.1, 2.0.2-alpha Reporter: Yu Gao Assignee: Yu Gao Attachments: HADOOP-9446-branch-2.patch, HADOOP-9446-branch-2-v2.patch, HADOOP-9446.patch, HADOOP-9446-v2.patch, TestKerberosHttpSPNEGO.java, TEST-org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator.xml, TEST-org.apache.hadoop.security.authentication.server.TestKerberosAuthenticationHandler.xml Class KerberosAuthenticator and KerberosAuthenticationHandler currently only support running with SUN JDK when Kerberos is enabled. In order to support alternative JDKs like IBM JDK which has different options supported by Krb5LoginModule and different login module classes, the HTTP Kerberos authentication classes need to be changed. In addition, NT_GSS_KRB5_PRINCIPAL, which is used in KerberosAuthenticator to get the corresponding oid instance, is a field defined in SUN JDK, but not in IBM JDK. This JIRA is to fix the existing problems and add support for Kerberos HTTP SPNEGO authentication with non-SUN JDK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9446) Support Kerberos HTTP SPNEGO authentication for non-SUN JDK
[ https://issues.apache.org/jira/browse/HADOOP-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9446: Resolution: Fixed Fix Version/s: 2.1.1-beta Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, branch-2, 2.1-beta. Thanks Yu for the patch! Support Kerberos HTTP SPNEGO authentication for non-SUN JDK --- Key: HADOOP-9446 URL: https://issues.apache.org/jira/browse/HADOOP-9446 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.1.1, 2.0.2-alpha Reporter: Yu Gao Assignee: Yu Gao Fix For: 2.1.1-beta Attachments: HADOOP-9446-branch-2.patch, HADOOP-9446-branch-2-v2.patch, HADOOP-9446.patch, HADOOP-9446-v2.patch, TestKerberosHttpSPNEGO.java, TEST-org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator.xml, TEST-org.apache.hadoop.security.authentication.server.TestKerberosAuthenticationHandler.xml Class KerberosAuthenticator and KerberosAuthenticationHandler currently only support running with SUN JDK when Kerberos is enabled. In order to support alternative JDKs like IBM JDK which has different options supported by Krb5LoginModule and different login module classes, the HTTP Kerberos authentication classes need to be changed. In addition, NT_GSS_KRB5_PRINCIPAL, which is used in KerberosAuthenticator to get the corresponding oid instance, is a field defined in SUN JDK, but not in IBM JDK. This JIRA is to fix the existing problems and add support for Kerberos HTTP SPNEGO authentication with non-SUN JDK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9871) Fix intermittent findbug warnings in DefaultMetricsSystem
[ https://issues.apache.org/jira/browse/HADOOP-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9871: Description: Findbugs sometimes picks up warnings from DefaultMetricsSystem due to some of the fields not being transient for serializable class (DefaultMetricsSystem is an Enum (which is serializable)). (was: Findbugs sometimes (not always) picks up warnings from DefaultMetricsSystem due to some of the fields not being transient for serializable class (DefaultMetricsSystem is an Enum (which is serializable)). ) Fix intermittent findbug warnings in DefaultMetricsSystem - Key: HADOOP-9871 URL: https://issues.apache.org/jira/browse/HADOOP-9871 Project: Hadoop Common Issue Type: Bug Reporter: Luke Lu Assignee: Junping Du Priority: Minor Findbugs sometimes picks up warnings from DefaultMetricsSystem due to some of the fields not being transient for serializable class (DefaultMetricsSystem is an Enum (which is serializable)). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9446) Support Kerberos HTTP SPNEGO authentication for non-SUN JDK
[ https://issues.apache.org/jira/browse/HADOOP-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737447#comment-13737447 ] Luke Lu commented on HADOOP-9446: - bq. Should PlatformName.java be moved to hadoop-auth package? this will affect classes referring to it. To the hadoop-auth module, I think so. We should keep it in the o.a.h.util package, since it's less disruptive, as hadoop-common already depends on hadoop-auth for KerberosName etc, though eventually we might need to come up with something like hadoop-common-util that hadoop-auth, hadoop-common and some other modules (e.g. hadoop-nfs, hadoop-fs-extras, hadoop-webapp etc.) in the common project can depend on. It's not worth creating a new module now for this class only. Support Kerberos HTTP SPNEGO authentication for non-SUN JDK --- Key: HADOOP-9446 URL: https://issues.apache.org/jira/browse/HADOOP-9446 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.1.1, 2.0.2-alpha Reporter: Yu Gao Assignee: Yu Gao Attachments: HADOOP-9446-branch-2.patch, HADOOP-9446.patch, TestKerberosHttpSPNEGO.java, TEST-org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator.xml, TEST-org.apache.hadoop.security.authentication.server.TestKerberosAuthenticationHandler.xml Class KerberosAuthenticator and KerberosAuthenticationHandler currently only support running with SUN JDK when Kerberos is enabled. In order to support alternative JDKs like IBM JDK which has different options supported by Krb5LoginModule and different login module classes, the HTTP Kerberos authentication classes need to be changed. In addition, NT_GSS_KRB5_PRINCIPAL, which is used in KerberosAuthenticator to get the corresponding oid instance, is a field defined in SUN JDK, but not in IBM JDK. This JIRA is to fix the existing problems and add support for Kerberos HTTP SPNEGO authentication with non-SUN JDK. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9446) Support Kerberos HTTP SPNEGO authentication for non-SUN JDK
[ https://issues.apache.org/jira/browse/HADOOP-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735781#comment-13735781 ] Luke Lu commented on HADOOP-9446: - Which version of the IBM JDK is this for? IBM JDK 6 SR14? Is the patch needed for IBM JDK 7? You should still use PlatformName.IBM_JAVA instead of introducing a new one in KerberosUtil. Support Kerberos HTTP SPNEGO authentication for non-SUN JDK --- Key: HADOOP-9446 URL: https://issues.apache.org/jira/browse/HADOOP-9446 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 1.1.1, 2.0.2-alpha Reporter: Yu Gao Assignee: Yu Gao Attachments: HADOOP-9446-branch-2.patch, HADOOP-9446.patch, TestKerberosHttpSPNEGO.java, TEST-org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator.xml, TEST-org.apache.hadoop.security.authentication.server.TestKerberosAuthenticationHandler.xml The classes KerberosAuthenticator and KerberosAuthenticationHandler currently only support running with the SUN JDK when Kerberos is enabled. In order to support alternative JDKs like the IBM JDK, which has different options supported by Krb5LoginModule and different login module classes, the HTTP Kerberos authentication classes need to be changed. In addition, NT_GSS_KRB5_PRINCIPAL, which is used in KerberosAuthenticator to get the corresponding oid instance, is a field defined in the SUN JDK, but not in the IBM JDK. This JIRA is to fix the existing problems and add support for Kerberos HTTP SPNEGO authentication with non-SUN JDKs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
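The point about a single vendor flag like PlatformName.IBM_JAVA can be sketched as follows. This is an assumption-laden illustration (class and method names here are hypothetical; the actual Hadoop implementation may differ): detect the JVM vendor once, in one place, and branch Kerberos login-module names on that flag instead of re-detecting it in every security class.

```java
// Hypothetical vendor probe mirroring the idea behind PlatformName.IBM_JAVA.
public class PlatformProbe {
  public static final String JAVA_VENDOR = System.getProperty("java.vendor", "");
  public static final boolean IBM_JAVA = JAVA_VENDOR.contains("IBM");

  // Example use: pick the Krb5LoginModule class per vendor. Both class
  // names below are the documented login modules of the respective JDKs.
  public static String krb5LoginModule() {
    return IBM_JAVA
        ? "com.ibm.security.auth.module.Krb5LoginModule"
        : "com.sun.security.auth.module.Krb5LoginModule";
  }
}
```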
[jira] [Commented] (HADOOP-9820) RPCv9 wire protocol is insufficient to support multiplexing
[ https://issues.apache.org/jira/browse/HADOOP-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732175#comment-13732175 ] Luke Lu commented on HADOOP-9820: - IMO, when the stream is being wrapped, no unwrapped exception should be thrown across RPC for the stream, as it's a breach of confidentiality. The server should log such exceptions. Minor nits: # SaslRpcClient.SaslRpc*Stream should be named SaslRpcClient.Wrapped*Stream. # The default stream buffer size should be configurable instead of the hard-coded 64*1024. RPCv9 wire protocol is insufficient to support multiplexing --- Key: HADOOP-9820 URL: https://issues.apache.org/jira/browse/HADOOP-9820 Project: Hadoop Common Issue Type: Bug Components: ipc, security Affects Versions: 3.0.0, 2.1.0-beta Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HADOOP-9820.patch RPCv9 is intended to allow future support of multiplexing. This requires all wire messages to be tagged with an RPC header so a demux can decode and route the messages accordingly. RPC ping packets and SASL QOP wrapped data are known to not be tagged with a header.
[jira] [Commented] (HADOOP-9820) RPCv9 wire protocol is insufficient to support multiplexing
[ https://issues.apache.org/jira/browse/HADOOP-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732752#comment-13732752 ] Luke Lu commented on HADOOP-9820: - bq. Client and server are using mismatched ciphers. That should not happen after the SASL negotiation is done. Given that even timing differences can leak information, we should not even tell a potentially adversarial client the fact that unwrap failed. We should log the exception at the server side for debugging purposes and close the connection after waiting for a random interval. bq. That's the spec default if the buffer size isn't negotiated so it can't be a configurable option. It needs to be a constant (with a pointer to the RFC) instead of literals, for future maintenance. RPCv9 wire protocol is insufficient to support multiplexing --- Key: HADOOP-9820 URL: https://issues.apache.org/jira/browse/HADOOP-9820 Project: Hadoop Common Issue Type: Bug Components: ipc, security Affects Versions: 3.0.0, 2.1.0-beta Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HADOOP-9820.patch RPCv9 is intended to allow future support of multiplexing. This requires all wire messages to be tagged with an RPC header so a demux can decode and route the messages accordingly. RPC ping packets and SASL QOP wrapped data are known to not be tagged with a header.
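The "named constant instead of literals" nit above can be sketched as follows. The class and constant names are hypothetical, not the identifiers used in the Hadoop patch; the 64*1024 value and its rationale come from the comment itself.

```java
// Hypothetical home for the SASL wrap-buffer default discussed above.
// Naming the value (with a pointer to the SASL spec it comes from)
// avoids scattering the literal 64*1024 through the stream code.
public final class SaslConstants {
  private SaslConstants() {}  // constants holder, not instantiable

  // Default maximum receive buffer when no size is negotiated,
  // per the SASL spec (see the maxbuf discussion in the relevant RFC).
  public static final int DEFAULT_MAX_WRAP_BUFFER = 64 * 1024;
}
```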
[jira] [Updated] (HADOOP-9319) Update bundled lz4 source to latest version
[ https://issues.apache.org/jira/browse/HADOOP-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9319: Resolution: Fixed Fix Version/s: 2.3.0 Target Version/s: 2.3.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1 for v5. Committed to trunk and branch-2. Thanks Binglin for the patch and Arpit for reviewing! Update bundled lz4 source to latest version --- Key: HADOOP-9319 URL: https://issues.apache.org/jira/browse/HADOOP-9319 Project: Hadoop Common Issue Type: Improvement Reporter: Arpit Agarwal Assignee: Binglin Chang Fix For: 2.3.0 Attachments: HADOOP-9319-addendum-windows.patch, HADOOP-9319.patch, HADOOP-9319.v2.patch, HADOOP-9319.v3.patch, HADOOP-9319.v4.patch, HADOOP-9319.v5.patch There is a newer version available at https://code.google.com/p/lz4/source/detail?r=89 Among other fixes, r75 fixes compile warnings generated by Visual Studio. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-9785) LZ4 code may need upgrade (lz4.c embedded in libHadoop is r43 18 months ago, while latest version is r98)
[ https://issues.apache.org/jira/browse/HADOOP-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu resolved HADOOP-9785. - Resolution: Duplicate Fix Version/s: (was: 2.0.4-alpha) (was: 3.0.0) 2.3.0 LZ4 code may need upgrade (lz4.c embedded in libHadoop is r43 18 months ago, while latest version is r98) - Key: HADOOP-9785 URL: https://issues.apache.org/jira/browse/HADOOP-9785 Project: Hadoop Common Issue Type: Improvement Components: io, native Affects Versions: 3.0.0, 2.0.4-alpha Environment: [german@localhost lz4-read-only]$ lscpu Architecture: x86_64 CPU op-mode(s):32-bit, 64-bit Byte Order:Little Endian CPU(s):4 On-line CPU(s) list: 0-3 Thread(s) per core:1 Core(s) per socket:4 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU family:6 Model: 23 Stepping: 10 CPU MHz: 2667.000 BogoMIPS: 5319.82 Virtualization:VT-x L1d cache: 32K L1i cache: 32K L2 cache: 2048K NUMA node0 CPU(s): 0-3 [german@localhost lz4-read-only]$ uname -r 2.6.32-358.14.1.el6.x86_64 Reporter: German Florez-Larrahondo Priority: Minor Fix For: 2.3.0 While analyzing compression performance of different Hadoop codecs I noticed that the LZ4 code was taken from revision 43 of https://code.google.com/p/lz4/. The latest version is r98 and there may be extra performance benefits we can gain from using r98. We may involve the original LZ4 author Yann Collet on these discussions, as the current LZ4 code includes additional algorithms and parameters. To start the investigation, I ran preliminary experiments with the Silesia corpus and there seems to be an improvement on throughput for compression and decompression in the latest release when compared with r43 (haven't done enough analysis to conclude anything statistically, but looks good). 
Here is raw output using LZ4 from r43 with a SUBSET of the silesia corpus (http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia) File: silesia/dickens *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Compressed 10192446 bytes into 6433123 bytes == 63.12% Done in 0.07 s == 138.86 MB/s *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Successfully decoded 10192446 bytes Done in 0.02 s == 486.01 MB/s File: silesia/mozilla *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Compressed 51220480 bytes into 26379814 bytes == 51.50% Done in 0.25 s == 195.39 MB/s *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Successfully decoded 51220480 bytes Done in 0.12 s == 407.06 MB/s File: silesia/mr *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Compressed 9970564 bytes into 5669268 bytes == 56.86% Done in 0.04 s == 237.72 MB/s *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Successfully decoded 9970564 bytes Done in 0.02 s == 475.43 MB/s File: silesia/nci *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Compressed 33553445 bytes into 5880292 bytes == 17.53% Done in 0.08 s == 399.99 MB/s *** Compression CLI using LZ4 algorithm , by Yann Collet (Jul 29 2013) *** Successfully decoded 33553445 bytes Done in 0.06 s == 533.32 MB/s And here is the raw output of LZ4 from the latest release r98 File: silesia/dickens *** Full LZ4 speed analyzer , by Yann Collet (Jul 29 2013) *** Loading silesia/dickens... 1-LZ4_compress: 10192446 - 6434313 (63.13%), 172.3 MB/s 1-LZ4_decompress_fast: 10192446 - 676.0 MB/s File: silesia/mozilla *** Full LZ4 speed analyzer , by Yann Collet (Jul 29 2013) *** Loading silesia/mozilla...
1-LZ4_compress: 51220480 - 26382113 (51.51%), 281.7 MB/s 1-LZ4_decompress_fast: 51220480 - 1003.1 MB/s File: silesia/mr *** Full LZ4 speed analyzer , by Yann Collet (Jul 29 2013) *** Loading silesia/mr... 1-LZ4_compress: 9970564 - 5669255 (56.86%), 268.3 MB/s 1-LZ4_decompress_fast: 9970564 - 788.7 MB/s File: silesia/nci *** Full LZ4 speed analyzer , by Yann Collet (Jul 29 2013) ***
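The ratios and throughputs in the raw output above follow from simple arithmetic: the percentage is compressed/original bytes, and MB/s uses 1 MB = 1048576 bytes, as the lz4 CLI appears to. A quick check against the r43 silesia/dickens numbers:

```java
import java.util.Locale;

// Reproduce the compression ratio and throughput figures from the
// lz4 CLI output quoted above (r43, silesia/dickens).
public class Lz4BenchMath {
  // Ratio as reported by the CLI: compressed/original * 100.
  static double ratioPct(long original, long compressed) {
    return 100.0 * compressed / original;
  }
  // Throughput in MB/s, with 1 MB = 1048576 bytes.
  static double mbPerSec(long bytes, double seconds) {
    return bytes / 1048576.0 / seconds;
  }
  public static void main(String[] args) {
    // silesia/dickens, r43: 10192446 -> 6433123 bytes in 0.07 s
    System.out.printf(Locale.ROOT, "%.2f%% %.2f MB/s%n",
        ratioPct(10192446L, 6433123L), mbPerSec(10192446L, 0.07));
    // prints 63.12% 138.86 MB/s, matching the log above
  }
}
```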
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730061#comment-13730061 ] Luke Lu commented on HADOOP-9160: - bq. Making it optional is pretty straightforward. As I pointed out in the previous review, the patch needs to be improved to incorporate hadoop authz (HttpServer#isInstrumentationAccessAllowed for read and HttpServer#hasAdministratorAccess for write), which would involve a simple wrapper to extend the original AgentServlet. I think a _compile_ time dependency (a la jersey) would be more appropriate if you really want the runtime dependency to be optional. OTOH, I personally prefer a static dependency for simplicity and UX, because if a user enables jolokia and the jar is not there, there'd be another cryptic class not found error in the log to chase down. The jolokia-core jar is only 200+KB, which is a small fraction of common dependencies. bq. it may end up on the classpath of client side apps. The hadoop-client module (HADOOP-8009) should address most of the client side issues. For YARN app-side dependencies, monitoring is too useful to forgo just to save the 200KB initial dist-cache copies... Adopt Jolokia as the JMX HTTP/JSON bridge. -- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java.
It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
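The authz split proposed in the comment above (HttpServer#isInstrumentationAccessAllowed for reads, HttpServer#hasAdministratorAccess for writes, via a wrapper around Jolokia's AgentServlet) reduces to a simple gate. This standalone sketch is illustrative only: `JmxAccessGate` and its method are hypothetical names, and the real wrapper would call the actual HttpServer checks against the request:

```java
// Illustrative gate for the read/write authz split discussed above:
// HTTP GETs (Jolokia read/list/search) need instrumentation access,
// while writes/execs (typically POST) need administrator access.
public class JmxAccessGate {
  static boolean isAllowed(String httpMethod,
                           boolean instrumentationAccess,
                           boolean adminAccess) {
    if ("GET".equalsIgnoreCase(httpMethod)) {
      return instrumentationAccess;
    }
    return adminAccess;  // writes require the stronger check
  }
}
```

A wrapper servlet would evaluate this gate before delegating to the Jolokia agent, so the existing Hadoop auth filters and ACLs stay in force.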
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726555#comment-13726555 ] Luke Lu commented on HADOOP-9160: - bq. HttpServer won't start up if jolokia isn't on the classpath. It'll always be part of the classpath from packaging just like other maven dependencies (jackson etc). Adopt Jolokia as the JMX HTTP/JSON bridge. -- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9784) Add builder in creating HttpServer
[ https://issues.apache.org/jira/browse/HADOOP-9784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13725444#comment-13725444 ] Luke Lu commented on HADOOP-9784: - Do not delete public methods but deprecate them; otherwise it'd be an incompatible change. It'd be better to convert the deprecated usages instead of just adding suppressions. Add builder in creating HttpServer -- Key: HADOOP-9784 URL: https://issues.apache.org/jira/browse/HADOOP-9784 Project: Hadoop Common Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Junping Du Assignee: Junping Du Attachments: HADOOP-9784.patch, HADOOP-9784-v2.patch There are quite a lot of constructors in the HttpServer class. Create a builder class to abstract the building steps, which helps avoid adding more constructors in the future.
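The builder proposed in this issue replaces a growing set of overloaded constructors with one fluent entry point. A minimal hypothetical shape (the names and fields here are illustrative; the builder actually committed to Hadoop has a different, larger surface):

```java
// Minimal illustration of the builder idea from HADOOP-9784.
public class HttpServerSketch {
  final String name; final String host; final int port;
  private HttpServerSketch(Builder b) {
    this.name = b.name; this.host = b.host; this.port = b.port;
  }
  public static class Builder {
    private String name;
    private String host = "0.0.0.0";  // illustrative defaults
    private int port = 0;
    public Builder setName(String n) { name = n; return this; }
    public Builder setBindAddress(String h) { host = h; return this; }
    public Builder setPort(int p) { port = p; return this; }
    public HttpServerSketch build() {
      // One place to validate, instead of N constructors.
      if (name == null) throw new IllegalStateException("name is required");
      return new HttpServerSketch(this);
    }
  }
}
```

Callers then chain only the settings they need, e.g. `new HttpServerSketch.Builder().setName("nn").setPort(50070).build()`, and new options extend the builder rather than adding another constructor.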
[jira] [Updated] (HADOOP-9784) Add a builder for HttpServer
[ https://issues.apache.org/jira/browse/HADOOP-9784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9784: Summary: Add a builder for HttpServer (was: Add builder in creating HttpServer) Add a builder for HttpServer Key: HADOOP-9784 URL: https://issues.apache.org/jira/browse/HADOOP-9784 Project: Hadoop Common Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Junping Du Assignee: Junping Du Attachments: HADOOP-9784.patch, HADOOP-9784-v2.patch There are quite a lot of constructors in the HttpServer class. Create a builder class to abstract the building steps, which helps avoid adding more constructors in the future.
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13725608#comment-13725608 ] Luke Lu commented on HADOOP-9160: - bq. I'm surprised they speak Jolokia but not HTTP. They speak JMX and/or HTTP with the java builtin http server (not servlet), which are not amenable to the servlet filters (for SPNEGO etc.) that we have. Customers/users wish/want to use these monitoring/tuning agents they already have with the auth mechanisms that hadoop already has. Jolokia can provide a great out-of-the-box experience. bq. Similarly, I'd like to understand whether, in your ideal world, you could, say, read a file or call hdfs upgrade over JMX? In an ideal world, JMX/Jolokia, being a JVM builtin management mechanism, can be used to manage any application component without restarting the server JVM, to minimize service downtime. The current metrics system is already restartable for different configs via JMX without a server restart. It could certainly read a status/log file and/or do an hdfs upgrade if needed. Though short term, we only plan to expose a small set of admin functions related to resource management. bq. We've spent considerable effort getting backwards-compatible protocols. Opening up another layer of RPC exposes us to more issues here. JMX/Jolokia are even friendlier to backward compatibility, as they natively support attribute/method reflection/introspection without the need for IDLs or any specific client libraries, which is great for interoperability. Please see the section on interoperability in my previous comment for more details. Technically JMX is not another layer of RPC but the JVM's native management mechanism, which could in theory manage the Hadoop RPC subsystem as well. Adopt Jolokia as the JMX HTTP/JSON bridge.
-- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9319) Update bundled lz4 source to latest version
[ https://issues.apache.org/jira/browse/HADOOP-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13725672#comment-13725672 ] Luke Lu commented on HADOOP-9319: - git mirror delay/glitch prevented git-svn from working for me :) The addendum lgtm as well. Update bundled lz4 source to latest version --- Key: HADOOP-9319 URL: https://issues.apache.org/jira/browse/HADOOP-9319 Project: Hadoop Common Issue Type: Improvement Reporter: Arpit Agarwal Assignee: Binglin Chang Attachments: HADOOP-9319-addendum-windows.patch, HADOOP-9319.patch, HADOOP-9319.v2.patch, HADOOP-9319.v3.patch, HADOOP-9319.v4.patch There is a newer version available at https://code.google.com/p/lz4/source/detail?r=89 Among other fixes, r75 fixes compile warnings generated by Visual Studio. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9319) Update bundled lz4 source to latest version
[ https://issues.apache.org/jira/browse/HADOOP-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724012#comment-13724012 ] Luke Lu commented on HADOOP-9319: - The patch lgtm. +1. Will commit shortly. Update bundled lz4 source to latest version --- Key: HADOOP-9319 URL: https://issues.apache.org/jira/browse/HADOOP-9319 Project: Hadoop Common Issue Type: Improvement Reporter: Arpit Agarwal Assignee: Binglin Chang Attachments: HADOOP-9319.patch, HADOOP-9319.v2.patch, HADOOP-9319.v3.patch, HADOOP-9319.v4.patch There is a newer version available at https://code.google.com/p/lz4/source/detail?r=89 Among other fixes, r75 fixes compile warnings generated by Visual Studio. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724028#comment-13724028 ] Luke Lu commented on HADOOP-9160: - bq. I don't like the prospect of having yet another RPC mechanism to do the same thing. While it's usually not good to duplicate code for the underlying mechanisms, it's usually a good thing (adds customer value) to provide alternative access points for the underlying mechanisms. For example: WebHDFS, NFS-HDFS gateways, HttpFS, etc. There are also added benefits of redundancy for admin protocols, as I mentioned in my [previous comment|#comment-13549249]. It's also false that they do the same thing. Here is an example of something the current RPC mechanism cannot do at all: third-party JVM agents that provide advanced runtime monitoring and tuning of the JVM and expose JMX as the API. Jolokia can provide Hadoop auth to such extensions with zero code change. Adopt Jolokia as the JMX HTTP/JSON bridge. -- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem.
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724269#comment-13724269 ] Luke Lu commented on HADOOP-9160: - bq. If you're running a third party JVM agent, by all means let it expose whatever APIs it would like--it's got everything it needs to bind to addresses and listen for 'em. It's not me or us who are running 3rd-party agents that I/we have no control over. It's the customers/users who chose/bought this software. These third-party agents usually have JMX, and some have REST APIs, but they cannot use Hadoop auth directly without code changes, which is out of the question as they're not Hadoop specific. Jolokia can extend Hadoop auth to these agents without code changes. bq. I have no particular objections to alternative access endpoints (e.g., NFS proxies). OK. bq. I do have objections to alternatives for write access. This seems to directly contradict your previous statement. bq. I completely agree with Allen W: we've got to have a way to turn it off. As I mentioned in a [previous comment|#comment-13707692], JMX/Jolokia write access will be off by default to avoid any surprises, which should be few, as people who care about security have site-wide HTTP hadoop auth configured in core-site.xml. BTW, Jolokia has per-attribute/method ACLs as well. We're talking about a small and low-risk patch to properly expose the builtin java management facility here. The code to expose JMX/Jolokia access to a subset of admin functions is trivial compared with that using custom Hadoop RPC and/or web services. Adopt Jolokia as the JMX HTTP/JSON bridge.
-- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722240#comment-13722240 ] Luke Lu commented on HADOOP-9160: - The patch needs to preserve the authz a la JMXJsonServlet. Adopt Jolokia as the JMX HTTP/JSON bridge. -- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem.
[jira] [Commented] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723032#comment-13723032 ] Luke Lu commented on HADOOP-9160: - bq. I don't think we should introduce yet another protocol for admin 'write' operations. * JMX is already used for write/admin operations (JVM, metrics system etc.) * Jolokia is basically a firewall- and non-java-language-friendly transport for JMX. * Jolokia is an enhancement to the existing JMX http proxy. * JMX via Jolokia makes backward-compatible API evolution trivial for both Hadoop 1 and beyond. * We'll only carefully expose an admin API in JMX/Jolokia that's common to Hadoop 1, Hadoop 2+ and external systems. I've also addressed people's concerns in detail in [#comment-13549249]. This is a light-weight feature that makes building a management system for multiple clusters of different Hadoop versions and other systems easier. The risk of the feature is minimal, as it's not in typical production paths. If you still have concerns, I'd be interested in your specific reasons, so I can address them specifically. Adopt Jolokia as the JMX HTTP/JSON bridge. -- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Assignee: Junping Du Labels: features Attachments: hadoop-9160-demo-branch-1.txt, HADOOP-9160.patch The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms.
Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9698) RPCv9 client must honor server's SASL negotiate response
[ https://issues.apache.org/jira/browse/HADOOP-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722135#comment-13722135 ] Luke Lu commented on HADOOP-9698: - The patch lgtm. So does the doc -- Thanks Daryn. +1. RPCv9 client must honor server's SASL negotiate response Key: HADOOP-9698 URL: https://issues.apache.org/jira/browse/HADOOP-9698 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 3.0.0, 2.1.0-beta Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HADOOP-9698.patch, HADOOP-9698.patch, HADOOP-9698.patch, RPCv9 Authentication.pdf As of HADOOP-9421, a RPCv9 server will advertise its authentication methods. This is meant to support features such as IP failover, better token selection, and interoperability in a heterogenous security environment. Currently the client ignores the negotiate response and just blindly attempts to authenticate instead of choosing a mutually agreeable auth method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
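The fix this issue describes, having the client pick a mutually agreeable auth method from the server's advertised list instead of blindly authenticating, reduces to a preference-ordered intersection. The sketch below is hypothetical (`AuthNegotiator` and the string method names are not the RPCv9 wire types):

```java
import java.util.List;

// Sketch of honoring the server's SASL negotiate response: take the
// first server-advertised method the client also supports, so the
// server's preference order wins among mutually supported methods.
public class AuthNegotiator {
  static String chooseAuth(List<String> serverAdvertised,
                           List<String> clientSupported) {
    for (String method : serverAdvertised) {
      if (clientSupported.contains(method)) {
        return method;
      }
    }
    return null;  // no mutually agreeable method: fail the connection
  }
}
```

With this shape, a heterogeneous cluster (or an IP-failover target) that advertises e.g. TOKEN before KERBEROS steers capable clients to the cheaper method instead of a blind, possibly failing attempt.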
[jira] [Commented] (HADOOP-9755) HADOOP-9164 breaks the windows native build
[ https://issues.apache.org/jira/browse/HADOOP-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718544#comment-13718544 ] Luke Lu commented on HADOOP-9755: - HADOOP-9759 is resolved/fixed first :) HADOOP-9164 breaks the windows native build --- Key: HADOOP-9755 URL: https://issues.apache.org/jira/browse/HADOOP-9755 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinay Assignee: Binglin Chang Priority: Blocker Attachments: HADOOP-9755.patch, HADOOP-9755.v2.patch After HADOOP-9164 the hadoop windows native build is broken. {noformat} NativeCodeLoader.c src\org\apache\hadoop\util\NativeCodeLoader.c(41): error C2065: 'Dl_info' : undeclared identifier [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(41): error C2146: syntax error : missing ';' before identifier 'dl_info' [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(41): error C2065: 'dl_info' : undeclared identifier [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(42): error C2143: syntax error : missing ';' before 'type' [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(45): error C2065: 'ret' : undeclared identifier [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(45): error C2065: 'dl_info' : undeclared identifier [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(45): error C2224: left of '.dli_fname' must have struct/union type [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] src\org\apache\hadoop\util\NativeCodeLoader.c(45): fatal error C1903: unable to
recover from previous error(s); stopping compilation [D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj] NativeCrc32.c Done Building Project D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.vcxproj (default targets) -- FAILED. Done Building Project D:\hdp2\hadoop-common-project\hadoop-common\src\main\native\native.sln (default targets) -- FAILED.{noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9755) HADOOP-9164 breaks the windows native build
[ https://issues.apache.org/jira/browse/HADOOP-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9755: Fix Version/s: (was: 2.1.0-beta) HADOOP-9164 breaks the windows native build --- Key: HADOOP-9755 URL: https://issues.apache.org/jira/browse/HADOOP-9755 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinay Assignee: Binglin Chang Priority: Blocker Attachments: HADOOP-9755.patch, HADOOP-9755.v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9755) HADOOP-9164 breaks the windows native build
[ https://issues.apache.org/jira/browse/HADOOP-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9755: Resolution: Duplicate Fix Version/s: 2.1.0-beta Status: Resolved (was: Patch Available) HADOOP-9164 breaks the windows native build --- Key: HADOOP-9755 URL: https://issues.apache.org/jira/browse/HADOOP-9755 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinay Assignee: Binglin Chang Priority: Blocker Fix For: 2.1.0-beta Attachments: HADOOP-9755.patch, HADOOP-9755.v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9755) HADOOP-9164 breaks the windows native build
[ https://issues.apache.org/jira/browse/HADOOP-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13715369#comment-13715369 ] Luke Lu commented on HADOOP-9755: - I agree with Binglin that Unavailable is ambiguous. How about something more explicit, like library path detection not implemented? HADOOP-9164 breaks the windows native build --- Key: HADOOP-9755 URL: https://issues.apache.org/jira/browse/HADOOP-9755 Project: Hadoop Common Issue Type: Bug Components: native Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinay Assignee: Binglin Chang Priority: Blocker Attachments: HADOOP-9755.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9164) Print paths of loaded native libraries in NativeLibraryChecker
[ https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9164: Target Version/s: 2.0.3-alpha, 3.0.0 (was: 3.0.0, 2.0.3-alpha) Summary: Print paths of loaded native libraries in NativeLibraryChecker (was: Add version number and/or library file name to native library for easy tracking) The patch lgtm. +1. Will commit shortly. Print paths of loaded native libraries in NativeLibraryChecker -- Key: HADOOP-9164 URL: https://issues.apache.org/jira/browse/HADOOP-9164 Project: Hadoop Common Issue Type: Improvement Components: native Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, HADOOP-9164.v3.patch, HADOOP-9164.v4.2.patch, HADOOP-9164.v4.patch, HADOOP-9164.v4.patch, HADOOP-9164.v5.patch, HADOOP-9164.v6.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9164) Print paths of loaded native libraries in NativeLibraryChecker
[ https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9164: Resolution: Fixed Fix Version/s: 2.1.0-beta Target Version/s: 2.0.3-alpha, 3.0.0 (was: 3.0.0, 2.0.3-alpha) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, branch-2, 2.1-beta, and 2.1.0-beta. Thanks Binglin for the patch and Colin for reviewing! Print paths of loaded native libraries in NativeLibraryChecker -- Key: HADOOP-9164 URL: https://issues.apache.org/jira/browse/HADOOP-9164 Project: Hadoop Common Issue Type: Improvement Components: native Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Fix For: 2.1.0-beta Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, HADOOP-9164.v3.patch, HADOOP-9164.v4.2.patch, HADOOP-9164.v4.patch, HADOOP-9164.v4.patch, HADOOP-9164.v5.patch, HADOOP-9164.v6.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9164) Add version number and/or library file name to native library for easy tracking
[ https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710908#comment-13710908 ] Luke Lu commented on HADOOP-9164: - It'd be nice to print nothing instead of system-native when a library is not loaded, as system-native really doesn't make any sense in that case. Otherwise the patch lgtm. Add version number and/or library file name to native library for easy tracking --- Key: HADOOP-9164 URL: https://issues.apache.org/jira/browse/HADOOP-9164 Project: Hadoop Common Issue Type: Improvement Components: native Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, HADOOP-9164.v3.patch, HADOOP-9164.v4.2.patch, HADOOP-9164.v4.patch, HADOOP-9164.v4.patch, HADOOP-9164.v5.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9164) Add version number and/or library file name to native library for easy tracking
[ https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707680#comment-13707680 ] Luke Lu commented on HADOOP-9164: - IMO the patch to print the resolved paths of shared libs is already very useful and orthogonal to the naming conventions. I think we should defer the naming/version discussions to another JIRA. Add version number and/or library file name to native library for easy tracking --- Key: HADOOP-9164 URL: https://issues.apache.org/jira/browse/HADOOP-9164 Project: Hadoop Common Issue Type: Improvement Components: native Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, HADOOP-9164.v3.patch, HADOOP-9164.v4.2.patch, HADOOP-9164.v4.patch, HADOOP-9164.v4.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-9160) Adopt Jolokia as the JMX HTTP/JSON bridge.
[ https://issues.apache.org/jira/browse/HADOOP-9160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9160: Description: The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. was: Currently we use Hadoop RPC (and some HTTP, notably fsck) for admin protocols. We should consider adopt JMX for future admin protocols, as it's the industry standard for java server management with wide client support. Having an alternative/redundant RPC mechanism is very desirable for admin protocols. I've seen in the past in multiple cases, where NN and/or JT RPC were locked up solid due to various bugs and/or RPC thread pool exhaustion, while HTTP and/or JMX worked just fine. Other desirable benefits include admin protocol backward compatibility and introspectability, which is convenient for a centralized management system to manage multiple Hadoop clusters of different versions. Another notable benefit is that it's much easier to implement new admin commands in JMX (especially with MXBean) than Hadoop RPC, especially in trunk (as well as 0.23+ and 2.x). Since Hadoop RPC doesn't guarantee backward compatibility (probably not ever for branch-1), there are few external tools depending on it. We can keep the old protocols for as long as needed. New commands should be in JMX. The transition can be gradual and backward-compatible. Summary: Adopt Jolokia as the JMX HTTP/JSON bridge. 
(was: Adopt JMX for management protocols) Limit the scope to Jolokia adoption. The current planned write operations will be limited to resource management for multi-framework/platform (including Hadoop 1 and 2+) integration. The default config will be read-only to be compatible with common JMX security setup. Adopt Jolokia as the JMX HTTP/JSON bridge. -- Key: HADOOP-9160 URL: https://issues.apache.org/jira/browse/HADOOP-9160 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Attachments: hadoop-9160-demo-branch-1.txt The current JMX HTTP bridge has served its purpose, while a more complete solution: Jolokia (formerly Jmx4Perl) has been developed/matured over the years. Jolokia provides comprehensive JMX features over HTTP/JSON including search and list of JMX attributes and operations metadata, which helps to support inter framework/platform compatibility. It has first class language bindings for Perl, Python, Javascript, Java. It's trivial (see demo patch) to incorporate Jolokia servlet into Hadoop HTTP servers and use the same security mechanisms. Adopting Jolokia will substantially improve the manageability of Hadoop and its ecosystem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
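Since Jolokia speaks plain JSON over HTTP, the attribute-level and batch querying mentioned above needs nothing beyond an HTTP client. A minimal sketch of the request bodies involved, assuming standard Jolokia "read" request semantics; the JVM MBeans used here are generic examples, not Hadoop-specific beans:

```python
import json

def jolokia_read_request(mbean, attribute=None, path=None):
    """Build the JSON body for a Jolokia 'read' request.

    Jolokia accepts POSTed JSON like {"type": "read", "mbean": ...};
    attribute and inner path are optional, which is what enables
    attribute-level (rather than whole-bean) queries.
    """
    req = {"type": "read", "mbean": mbean}
    if attribute is not None:
        req["attribute"] = attribute
    if path is not None:
        req["path"] = path
    return req

def jolokia_batch(requests):
    """Jolokia also accepts a JSON array of requests for batch queries."""
    return json.dumps(requests)

# Attribute-level query: one value out of one attribute, instead of
# serializing the whole MBean as the old JMX HTTP bridge does.
single = jolokia_read_request("java.lang:type=Memory", "HeapMemoryUsage", "used")

# Batch query: several reads in a single HTTP round trip.
batch = jolokia_batch([
    jolokia_read_request("java.lang:type=Memory", "HeapMemoryUsage"),
    jolokia_read_request("java.lang:type=Threading", "ThreadCount"),
])
```

The same bodies work from Perl, Javascript, or Java via the first-class Jolokia client bindings the description mentions.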
[jira] [Commented] (HADOOP-9164) Add version number and/or library file name to native library for easy tracking
[ https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707595#comment-13707595 ] Luke Lu commented on HADOOP-9164: - There are quite a few different conventions: # http://apr.apache.org/versioning.html # http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html I'm leaning towards the Apache runtime library naming convention even though the trailing version convention is a little unconventional, because it handles include directory naming consistently as well. Add version number and/or library file name to native library for easy tracking --- Key: HADOOP-9164 URL: https://issues.apache.org/jira/browse/HADOOP-9164 Project: Hadoop Common Issue Type: Improvement Components: native Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, HADOOP-9164.v3.patch, HADOOP-9164.v4.2.patch, HADOOP-9164.v4.patch, HADOOP-9164.v4.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9562) Create REST interface for HDFS health data
[ https://issues.apache.org/jira/browse/HADOOP-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707601#comment-13707601 ] Luke Lu commented on HADOOP-9562: - I think this patch is a little controversial in that # It duplicates functionality already mostly provided by existing mechanisms: JMX and its HTTP bridge. # The jersey stuff doesn't follow Hadoop authentication convention and doesn't honor hadoop.http.filter.initializers to setup SPNEGO and other custom auth filters. # No authorization mechanism at all. # Augmenting existing JMX mechanism is trivial (NameNodeInfoMXBean doesn't have to be implemented by FSNameSystem, it was so for convenience). # Returning data at the whole bean level is a limitation of the current HTTP bridge, which is addressed by HADOOP-9160, which adopts a new popular HTTP/JSON bridge (Jolokia) that supports advanced querying mechanism with both batch and attribute level queries along with first class language bindings for Perl, Python and Javascript. Create REST interface for HDFS health data -- Key: HADOOP-9562 URL: https://issues.apache.org/jira/browse/HADOOP-9562 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Trevor Lorimer Priority: Minor Attachments: HADOOP-9562.diff The HDFS health screen (dfshealth.jsp) displays basic Version, Security and Health information concerning the NameNode, currently this information is accessible from classes in the org.apache.hadoop,hdfs.server.namenode package and cannot be accessed outside the NameNode. This becomes prevalent if the data is required to be displayed using a new user interface. The proposal is to create a REST interface to expose the NameNode information displayed on dfshealth.jsp using GET methods. Wrapper classes will be created to serve the data to the REST root resource within the hadoop-hdfs project. This will enable the HDFS health screen information to be accessed remotely. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9164) Add version number and/or library file name to native library for easy tracking
[ https://issues.apache.org/jira/browse/HADOOP-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13705981#comment-13705981 ] Luke Lu commented on HADOOP-9164: - Release version and so version are different things. The latter is for ABI compatibility, i.e., if we just fixed some bugs and didn't break compatibility in a release, we should not increase the so version, which is more similar to our RPC version. The trailing version in libname.so.version is supposed to be the so version. If we really want to embed a release version in the so file name, we should do something like: libname-release-version.so.so_version. The so version should also be set in the ELF header (trivial in cmake) and be consistent with the one in the file name. For compatibility, a symlink should also be provided. Add version number and/or library file name to native library for easy tracking --- Key: HADOOP-9164 URL: https://issues.apache.org/jira/browse/HADOOP-9164 Project: Hadoop Common Issue Type: Improvement Components: native Affects Versions: 2.0.2-alpha Reporter: Binglin Chang Assignee: Binglin Chang Priority: Minor Attachments: HADOOP-9164.v1.patch, HADOOP-9164.v2.patch, HADOOP-9164.v3.patch, HADOOP-9164.v4.2.patch, HADOOP-9164.v4.patch, HADOOP-9164.v4.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
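The naming scheme sketched in the comment above can be made concrete with a small helper; the library name and version strings below are purely illustrative, and a real build would additionally let cmake set the soname in the ELF header as the comment notes:

```python
def native_lib_names(base, release, so_version):
    """Sketch of the proposed convention: embed the release in the file
    name, keep the trailing so (ABI) version, and provide a plain
    symlink for compatibility. Names are illustrative only.
    """
    # Real file: libname-release-version.so.so_version
    real = "lib{0}-{1}.so.{2}".format(base, release, so_version)
    # Compatibility symlink without any version: libname.so -> real
    compat_link = "lib{0}.so".format(base)
    return real, compat_link

# A bugfix release changes the release string but, since the ABI is
# unchanged, should NOT bump the so version.
real, link = native_lib_names("hadoop", "2.1.0", "1")
# real == "libhadoop-2.1.0.so.1", link == "libhadoop.so"
```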
[jira] [Commented] (HADOOP-9700) Snapshot support for distcp
[ https://issues.apache.org/jira/browse/HADOOP-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13706371#comment-13706371 ] Luke Lu commented on HADOOP-9700: - bq. The smallest unit of distcp is file MAPREDUCE-2257 aims to change the copy unit to block. Snapshot support for distcp --- Key: HADOOP-9700 URL: https://issues.apache.org/jira/browse/HADOOP-9700 Project: Hadoop Common Issue Type: New Feature Components: tools/distcp Reporter: Binglin Chang Assignee: Binglin Chang Attachments: HADOOP-9700-demo.patch Add snapshot incremental copy ability to distcp, so we can do iterative consistent backup between hadoop clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9698) RPCv9 client must honor server's SASL negotiate response
[ https://issues.apache.org/jira/browse/HADOOP-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13705086#comment-13705086 ] Luke Lu commented on HADOOP-9698: - Although it's desirable to make the mechanism negotiation work, which is a new feature in RPC v9, I'm not sure why this would be a blocker, as no protocol change is necessary and there is no real regression compared to earlier versions. AFAICT, it'd require a non-trivial change to the current client to really make the negotiation work properly. I see no need to rush the change for 2.2. RPCv9 client must honor server's SASL negotiate response Key: HADOOP-9698 URL: https://issues.apache.org/jira/browse/HADOOP-9698 Project: Hadoop Common Issue Type: Sub-task Components: ipc Affects Versions: 3.0.0, 2.1.0-beta Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical As of HADOOP-9421, an RPCv9 server will advertise its authentication methods. This is meant to support features such as IP failover, better token selection, and interoperability in a heterogeneous security environment. Currently the client ignores the negotiate response and just blindly attempts to authenticate instead of choosing a mutually agreeable auth method. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
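The "mutually agreeable auth method" selection the issue asks the client to perform can be sketched in a few lines; the mechanism names and preference order below are illustrative assumptions, not Hadoop's actual negotiation lists:

```python
def choose_auth(client_prefs, server_advertised):
    """Pick the first client-preferred mechanism that the server
    advertised in its SASL negotiate response, instead of blindly
    attempting one (the behavior the issue criticizes).

    Mechanism names are illustrative; returns None when no mutually
    agreeable method exists.
    """
    advertised = set(server_advertised)
    for mech in client_prefs:
        if mech in advertised:
            return mech
    return None

# Server advertises its methods; client honors the response.
mech = choose_auth(["GSSAPI", "DIGEST-MD5", "SIMPLE"],
                   ["DIGEST-MD5", "SIMPLE"])
# mech == "DIGEST-MD5"
```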
[jira] [Commented] (HADOOP-9700) Snapshot support for distcp
[ https://issues.apache.org/jira/browse/HADOOP-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703635#comment-13703635 ] Luke Lu commented on HADOOP-9700: - The patch uses SnapshotDiffReport, which only supports diffs at file granularity. One use case we should support is the file append/flush case, where only newly appended blocks should be copied and concatenated to the previous snapshot on the remote side. Maybe we should use SnapshotDiffInfo directly? Snapshot support for distcp --- Key: HADOOP-9700 URL: https://issues.apache.org/jira/browse/HADOOP-9700 Project: Hadoop Common Issue Type: New Feature Components: tools/distcp Reporter: Binglin Chang Assignee: Binglin Chang Attachments: HADOOP-9700-demo.patch Add snapshot incremental copy ability to distcp, so we can do iterative consistent backup between hadoop clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
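The append/flush case above reduces to a range computation: given a file's length in the old and new snapshots, only the tail needs to be copied and concatenated onto the remote copy. A minimal sketch, with a helper name that is hypothetical and not part of distcp or SnapshotDiffInfo:

```python
def appended_range(old_len, new_len):
    """For an append-only file between two snapshots, only the bytes
    in [old_len, new_len) need copying; the remote side can concat
    them onto its copy of the previous snapshot.

    Returns None when nothing was appended. Hypothetical helper, not
    from the HADOOP-9700 patch.
    """
    if new_len <= old_len:
        return None  # unchanged (an append-only model forbids shrinking)
    return (old_len, new_len)

# 60 new bytes appended since the last snapshot: copy only those,
# not the whole file as a file-level diff would force.
rng = appended_range(100, 160)
# rng == (100, 160)
```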
[jira] [Commented] (HADOOP-9683) Wrap IpcConnectionContext in RPC headers
[ https://issues.apache.org/jira/browse/HADOOP-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702241#comment-13702241 ] Luke Lu commented on HADOOP-9683: - The current count is not very useful in the catch block. I was hoping that you can propagate the count via your RpcReplyException. I'm fine with addressing it in a separate jira (useful as a unique signature for debugging of alternative client impls) though. +1 for the patch. Wrap IpcConnectionContext in RPC headers Key: HADOOP-9683 URL: https://issues.apache.org/jira/browse/HADOOP-9683 Project: Hadoop Common Issue Type: Sub-task Components: ipc Reporter: Luke Lu Assignee: Daryn Sharp Priority: Blocker Attachments: HADOOP-9683.patch After HADOOP-9421, all RPC exchanges (including SASL) are wrapped in RPC headers except IpcConnectionContext, which is still raw protobuf, which makes request pipelining (a desirable feature for things like HDFS-2856) impossible to achieve in a backward compatible way. Let's finish the job and wrap IpcConnectionContext with the RPC request header with the call id of SET_IPC_CONNECTION_CONTEXT. Or simply make it an optional field in the RPC request header that gets set for the first RPC call of a given stream. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9432) Add support for markdown .md files in site documentation
[ https://issues.apache.org/jira/browse/HADOOP-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702289#comment-13702289 ] Luke Lu commented on HADOOP-9432: - The doxia markdown module is based on [pegdown|https://github.com/sirthias/pegdown], which supports tables nicely. Add support for markdown .md files in site documentation Key: HADOOP-9432 URL: https://issues.apache.org/jira/browse/HADOOP-9432 Project: Hadoop Common Issue Type: New Feature Components: build, documentation Affects Versions: 3.0.0 Reporter: Steve Loughran Priority: Minor Attachments: HADOOP-9432.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The markdown syntax for marking up text is something which the {{mvn site}} build can be set up to support alongside the existing APT formatted text. Markdown offers many advantages # It's more widely understood. # There's tooling support in various text editors (TextMate, an IDEA plugin and others) # It can be directly rendered in github # the {{.md}} files can be named {{.md.vm}} to trigger velocity preprocessing, at the expense of direct viewing in github feature #3 is good as it means that you can point people directly at a doc via a github mirror, and have it rendered. I propose adding the options to Maven to enable content to be written as {{.md}} and {{.md.vm}} files in the directory {{src/site/markdown}}. This does not require any changes to the existing {{.apt}} files, which can co-exist and cross-reference each other. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HADOOP-9683) Wrap IpcConnectionContext in RPC headers
[ https://issues.apache.org/jira/browse/HADOOP-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699520#comment-13699520 ] Luke Lu commented on HADOOP-9683: - The patch lgtm. Minor nit: it'll be nice to preserve the bytes read in the exception log in Listener#doRead. Wrap IpcConnectionContext in RPC headers Key: HADOOP-9683 URL: https://issues.apache.org/jira/browse/HADOOP-9683 Project: Hadoop Common Issue Type: Sub-task Components: ipc Reporter: Luke Lu Assignee: Daryn Sharp Priority: Blocker Attachments: HADOOP-9683.patch After HADOOP-9421, all RPC exchanges (including SASL) are wrapped in RPC headers except IpcConnectionContext, which is still raw protobuf, which makes request pipelining (a desirable feature for things like HDFS-2856) impossible to achieve in a backward compatible way. Let's finish the job and wrap IpcConnectionContext with the RPC request header with the call id of SET_IPC_CONNECTION_CONTEXT. Or simply make it an optional field in the RPC request header that gets set for the first RPC call of a given stream. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-9683) Wrap IpcConnectionContext in RPC headers
Luke Lu created HADOOP-9683: --- Summary: Wrap IpcConnectionContext in RPC headers Key: HADOOP-9683 URL: https://issues.apache.org/jira/browse/HADOOP-9683 Project: Hadoop Common Issue Type: Sub-task Components: ipc Reporter: Luke Lu Assignee: Daryn Sharp Priority: Blocker After HADOOP-9421, all RPC exchanges (including SASL) are wrapped in RPC headers except IpcConnectionContext, which is still raw protobuf, which makes request pipelining (a desirable feature for things like HDFS-2856) impossible to achieve in a backward compatible way. Let's finish the job and wrap IpcConnectionContext with the RPC request header with the call id of SET_IPC_CONNECTION_CONTEXT. Or simply make it an optional field in the RPC request header that gets set for the first RPC call of a given stream. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
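The payoff of the proposal above is that a single framed-message loop can read everything on the stream, connection context included, with no special raw-protobuf case. A toy sketch of that idea — the wire layout and the sentinel call id below are illustrative, not the actual Hadoop RPC header encoding:

```python
import struct

# Hypothetical sentinel call id for the connection context; the real
# constant in the Hadoop RPC request header is not given in this issue.
SET_IPC_CONNECTION_CONTEXT = -3

def frame(call_id, payload):
    """Wrap a payload in a minimal length-prefixed header carrying a
    call id, so every message on the stream -- including the
    connection context -- has the same shape. Uniform framing is what
    makes request pipelining feasible."""
    body = struct.pack(">i", call_id) + payload
    return struct.pack(">I", len(body)) + body

def read_frames(data):
    """Uniform reader: one loop handles context and RPC calls alike."""
    frames, off = [], 0
    while off < len(data):
        (length,) = struct.unpack_from(">I", data, off)
        off += 4
        (call_id,) = struct.unpack_from(">i", data, off)
        frames.append((call_id, data[off + 4 : off + length]))
        off += length
    return frames

# Context first, then a pipelined call, on one stream.
stream = frame(SET_IPC_CONNECTION_CONTEXT, b"context") + frame(1, b"call-1")
```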
[jira] [Commented] (HADOOP-9675) releasenotes.html always shows up as modified because of line endings issues
[ https://issues.apache.org/jira/browse/HADOOP-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696220#comment-13696220 ] Luke Lu commented on HADOOP-9675: - The problem is due to relnotes.py generating the html containing _some_ CRLF (from JIRA) and the release manager not using git-svn, which caused the html with mixed eols to get checked in. The problem will manifest for git users due to text=auto in .gitattributes (see HADOOP-8912), which auto-converts CRLF to LF, hence the persistent modified status of the releasenotes.html. I'm not sure svn:eol-style=native will fix the problem as it's for checkout only. I think the right fix is fixing relnotes.py to normalize eols in the JIRA fields. releasenotes.html always shows up as modified because of line endings issues Key: HADOOP-9675 URL: https://issues.apache.org/jira/browse/HADOOP-9675 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Colin Patrick McCabe Attachments: HADOOP-9675.001.patch hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html shows up as modified even though I haven't touched it, and I can't check it out or reset to a previous version to make that go away. The only thing I can do to neutralize it is to put it in a dummy commit, but I have to do this every time I switch branches or rebase. This appears to have begun after the release notes commit (8c5676830bb176157b2dc28c48cd3dd0a9712741), and must be due to a line endings change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
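The relnotes.py fix suggested above amounts to normalizing line endings in each JIRA-sourced field before emitting the HTML. A minimal sketch; the function name is hypothetical, not from relnotes.py:

```python
def normalize_eols(text):
    """Normalize CRLF (and stray bare CR) to LF, so the generated HTML
    never contains mixed line endings -- the mix is what keeps
    releasenotes.html perpetually 'modified' for git users with
    text=auto in .gitattributes."""
    return text.replace("\r\n", "\n").replace("\r", "\n")

# A JIRA field as fetched, with mixed endings:
field = "Fixed a bug.\r\nSee the design doc.\rDone.\n"
clean = normalize_eols(field)
# clean == "Fixed a bug.\nSee the design doc.\nDone.\n"
```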
[jira] [Commented] (HADOOP-9675) releasenotes.html always shows up as modified because of line endings issues
[ https://issues.apache.org/jira/browse/HADOOP-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696261#comment-13696261 ] Luke Lu commented on HADOOP-9675: - bq. git rm -r --cached * git reset HEAD --hard Have you tried this? It doesn't work :)
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and provide negotiation capabilities
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13693154#comment-13693154 ] Luke Lu commented on HADOOP-9421: - Yes, the connection would be lost if you simply use a VIP instead of an HA proxy for failover. I think we all agree that server name/id negotiation is needed for general SASL support. OTOH, multiple-server-principal HA itself is insane in that negotiation is always _required_ in case of failover. The fact of the matter is that the server id doesn't _have to_ change for HA, so the argument for server id and negotiation using HA as the main use case is weak. That said, I'm actually pretty happy with how the _protocol_ turned out. My original title for the JIRA, Generalize SASL Support with Protocol Buffer, was spot on in the end :) Convert SASL to use ProtoBuf and provide negotiation capabilities - Key: HADOOP-9421 URL: https://issues.apache.org/jira/browse/HADOOP-9421 Project: Hadoop Common Issue Type: Sub-task Affects Versions: 2.0.3-alpha Reporter: Sanjay Radia Assignee: Daryn Sharp Priority: Blocker Fix For: 3.0.0, 2.1.0-beta, 2.2.0 Attachments: HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421.patch, HADOOP-9421-v2-demo.patch
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and provide negotiation capabilities
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692445#comment-13692445 ] Luke Lu commented on HADOOP-9421: - bq. Is the server generated opaque id in a sense a logical principal for the HA's service? The serverId is essentially the key to look up the new service ticket for INITIATE. But there is a race condition that will make it fail anyway: a failover right after the original server sends out the NEGOTIATE. I stand by the claim that multiple-server-principal HA is insane :)
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and provide negotiation capabilities
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692453#comment-13692453 ] Luke Lu commented on HADOOP-9421: - bq. The serverid is essentially the key to lookup the new service ticket for INITIATE More precisely: the server id is the key for the KDC (TGS) to look up the corresponding server principal and generate the service ticket from the client's TGT, which the client then uses to INITIATE Kerberos auth. Sharing the server principal between HA servers will make the HA seamless. For tokens it really doesn't matter, as the old server id is the logical id for the digest URI in the Digest-MD5 token. SCRAM doesn't care about server id at all.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and provide negotiation capabilities
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692480#comment-13692480 ] Luke Lu commented on HADOOP-9421: - I do recognize what server id can enable: distinct services on the same ip:port, a la the HTTP Host header and SSL Server Name Indication. It's an insane solution for HA, because HA will not be seamless due to the race condition I mentioned. It's fixable with more negotiate rounds and more logic, though :)
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690136#comment-13690136 ] Luke Lu commented on HADOOP-9421: - bq. There is nothing in the protocol that would prevent SCRAM being supported. I meant you'll be SOL trying to make the token optimization work: your protocol *requires* an extra round trip to support SCRAM. bq. Guessing a supported auth/mechanism For the most common Hadoop auth workload, distributed containers/tasks, you don't need to guess: it's the delegation token with digest-md5/scram, as it's a framework-internal token bootstrapped by other public-facing mechanisms. For other use cases, you can use cached values. For the remaining cases, the client can send an empty INITIATE and use NEGOTIATE and REINITIATE, with the same total round-trip cost as yours in all cases. With the optional client initiate, my protocol gives the choice to practicing system designers instead of the original protocol designers. bq. Dealing with the mishaps when the client blows itself up trying an auth the server doesn't even support INITIATE contains the same extra info, like protocol and serverId, for the preferred auth. The server can simply send a NEGOTIATE if it decides it cannot support the preferred auth choice; the client can then decide to REINITIATE or abort. bq. if it even attempts kerberos with a non-kerberos server. It won't even succeed far enough to send the INITIATE For this contrived case, the client can catch exceptions for the preferred auth when generating the initial token (which would apply to fetching a service ticket from a non-Kerberos server) and send an empty INITIATE to NEGOTIATE and REINITIATE. Again, an integration client that needs to talk to multiple servers with different auths can simply use an empty INITIATE to NEGOTIATE, and cache the server auths/mechs for later use to save a round trip.
Imagine a busy interactive web console that talks to multiple back-end Hadoop servers. It's not feasible to keep a connection per user open to all these servers, so you often need to constantly create new connections to the back-end servers (a connection pool helps). My protocol allows the web console to save a mandatory round trip compared to yours, which can make the interactive user experience much better due to lowered latencies. In summary, my protocol gives that choice to real system designers; yours takes that choice away, because you could not possibly think of all use cases where auth latency matters.
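The client-side choice being argued for above can be modeled in a few lines: with a cached (server → mechanism) hint the client sends INITIATE immediately, saving a round trip; with a cold cache it falls back to asking the server first. This is an illustrative sketch of the argument, not Hadoop code; `AUTH_CACHE` and `first_message` are hypothetical names.

```python
# serverId -> last known auth mechanism (hypothetical client-side cache)
AUTH_CACHE = {}

def first_message(server_id, trace):
    """Decide the first SASL message for a new connection.
    A cache hit skips the NEGOTIATE round trip; a miss asks the server."""
    mech = AUTH_CACHE.get(server_id)
    if mech is not None:
        trace.append(("INITIATE", mech))   # optimistic: no extra round trip
    else:
        trace.append(("NEGOTIATE", None))  # ask server for supported mechs
    return trace

trace = []
first_message("nn1", trace)                # cold cache -> NEGOTIATE first
AUTH_CACHE["nn1"] = "DIGEST-MD5"
first_message("nn1", trace)                # warm cache -> direct INITIATE
# trace == [("NEGOTIATE", None), ("INITIATE", "DIGEST-MD5")]
```

If the optimistic INITIATE turns out to be wrong (e.g. after a server upgrade), the server replies NEGOTIATE and the client retries, which is the REINITIATE path discussed in the thread.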
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690496#comment-13690496 ] Luke Lu commented on HADOOP-9421: - bq. Ok, so now how would you handle improved serverId based tokens? Always take the REINITIATE hit? Maintain a persistent (serverId → mech) cache in /tmp/hadoop-$user/rpc. Again: it should be the choice of protocol users (system designers). bq. In this case, can't the client just blast the INITIATE before getting a NEGOTIATE? Like I said before, ignoring NEGOTIATE in the client increases client complexity: the client has to ignore the right NEGOTIATE. What if a server upgrade/security fix happened in between? bq. do you believe there will actually be a measurable performance difference? Depending on the use case, e.g. a console managing multiple Hadoop clusters in multiple regions, the additional round trip could be hundreds of ms, which is extremely annoying. bq. Will having the client ignore the inflight NEGOTIATE on a reconnect have a measurable latency Like I said, the client will have to decide when to ignore the NEGOTIATE, which increases client complexity while still feeling dirty. I've shown that this ugliness is not necessary. bq. If so, is the extra round trip for REINITIATEs a bad thing REINITIATE is only supposed to happen when the mech cache goes stale. bq. I'll just add it. I'll re-leverage the existing states to avoid adding the reinitate. IMO, an explicit REINITIATE makes things a lot easier to reason about. bq. prevent a client seizing a connection thread. What are you talking about? The new protocol allows a completely non-blocking client.
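The persistent (serverId → mech) cache under /tmp/hadoop-$user/rpc suggested above could look roughly like this. The file name, JSON format, and helper names are assumptions for illustration, not anything from the Hadoop codebase.

```python
import json
import os
import tempfile

def cache_dir():
    """Per-user cache directory, modeled on the suggested /tmp/hadoop-$user/rpc."""
    user = os.environ.get("USER", "unknown")
    d = os.path.join(tempfile.gettempdir(), "hadoop-%s" % user, "rpc")
    os.makedirs(d, exist_ok=True)
    return d

def _cache_path():
    return os.path.join(cache_dir(), "mech-cache.json")  # hypothetical file name

def load_all():
    """Read the serverId -> mech map; an absent or corrupt file is a cold cache."""
    try:
        with open(_cache_path()) as f:
            return json.load(f)
    except (OSError, ValueError):
        return {}

def save_mech(server_id, mech):
    """Record the mechanism that worked for a server, for the next connection."""
    cache = load_all()
    cache[server_id] = mech
    with open(_cache_path(), "w") as f:
        json.dump(cache, f)

save_mech("nn1:8020", "DIGEST-MD5")
assert load_all().get("nn1:8020") == "DIGEST-MD5"
```

A stale entry is harmless by design: a wrong INITIATE just costs the NEGOTIATE/REINITIATE round trip that a cold cache would have paid anyway.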
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690510#comment-13690510 ] Luke Lu commented on HADOOP-9421: - bq. I'll just add it. Thank you! It's important to note that we don't need to add the optimizations in this jira, which should focus on fixing the protocol. Once the protocol is fixed, we can explore new non-blocking clients in different languages and enable a Cambrian explosion of new use cases! bq. REINITIATE is only supposed to happen when mech cache goes stale. s/only//. A system designer can make it happen rarely: when the mech cache is empty or stale, and when a serverId failover happens.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690539#comment-13690539 ] Luke Lu commented on HADOOP-9421: - bq. if the client decided to guess and blast a INITIATE, it could simply set a ignoreNegotiate boolean. When a failover/upgrade happens, the INITIATE can fail, potentially producing two NEGOTIATEs in a row, and you need to pick the right one. I think this increases client complexity unnecessarily. bq. A server upgrade is exactly one of the reasons the client shouldn't be guessing. Or one of the reasons to REINITIATE? :)
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690558#comment-13690558 ] Luke Lu commented on HADOOP-9421: - bq. Server will not SASL respond to a client until it sends either NEGOTIATE or INITIATE. OK, use a client NEGOTIATE to replace my empty INITIATE. That's cool. bq. If the client sends INITIATE, and guesses wrong, the server responds with NEGOTIATE. Again, the client now has one shot to send a valid INITIATE. I believe the code could get tricky without a REINITIATE. bq. Your patch would allow a client to spam INITIATE and keep the socket tied up indefinitely. My patch only sends one INITIATE after the connection header; it either succeeds or gets a NEGOTIATE to REINITIATE, which doesn't have a transition back to NEGOTIATE. The code is simple and succinct.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690585#comment-13690585 ] Luke Lu commented on HADOOP-9421: - I'm fine with the new protocol. I agree that REINITIATE is not necessary if the server keeps the sentNegotiate state for the connection. One bug in the patch: in the Server's INITIATE case, in the authMethod == null branch, the if (sentNegotiate) should be if (!sentNegotiate)...
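The guard being discussed can be modeled as a tiny state machine. This is a sketch of the intended behavior (with the corrected `!sentNegotiate` guard), not the actual Server code; the function and return values are illustrative.

```python
def handle_initiate(auth_method, state):
    """Server-side handling of a client INITIATE, per connection.
    auth_method is None when the client named an unsupported auth."""
    if auth_method is None:
        if not state["sentNegotiate"]:     # corrected guard (patch had it inverted)
            state["sentNegotiate"] = True
            return "NEGOTIATE"             # give the client its one retry
        return "FATAL"                     # second bad guess: fail the connection
    return "CHALLENGE"                     # supported auth: proceed with SASL

state = {"sentNegotiate": False}
assert handle_initiate(None, state) == "NEGOTIATE"   # first wrong guess
assert handle_initiate(None, state) == "FATAL"       # no second NEGOTIATE
assert handle_initiate("TOKEN", {"sentNegotiate": False}) == "CHALLENGE"
```

With the inverted guard, the first wrong guess would be fatal and a repeat offender would get an endless NEGOTIATE, which is exactly the INITIATE/NEGOTIATE/INITIATE path the follow-up comment asks to cover with a unit test.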
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690600#comment-13690600 ] Luke Lu commented on HADOOP-9421: - In light of the bug, can you add a unit test for the INITIATE/NEGOTIATE/INITIATE path? The sentNegotiate state will also need to be kept per flow if you want to multiplex the connection.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690626#comment-13690626 ] Luke Lu commented on HADOOP-9421: - You're right; I somehow read it as trying to buildNegotiateResponse in that branch. Though the patch always sends a client NEGOTIATE, which means a slight performance regression on typical clusters, I think the protocol is in good shape now. Can you add some unit tests, as it's easy to break the code given all the states? Cluster testing results would be appreciated as well.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690650#comment-13690650 ] Luke Lu commented on HADOOP-9421: - Filed HADOOP-9662 for more unit tests. +1 for the patch to unblock 2.1 scale testing.
[jira] [Created] (HADOOP-9662) More unit tests for RPC v9
Luke Lu created HADOOP-9662: --- Summary: More unit tests for RPC v9 Key: HADOOP-9662 URL: https://issues.apache.org/jira/browse/HADOOP-9662 Project: Hadoop Common Issue Type: Improvement Components: ipc Reporter: Luke Lu Unit tests for HADOOP-9421.
[jira] [Created] (HADOOP-9663) Nonblocking RPC client
Luke Lu created HADOOP-9663: --- Summary: Nonblocking RPC client Key: HADOOP-9663 URL: https://issues.apache.org/jira/browse/HADOOP-9663 Project: Hadoop Common Issue Type: Improvement Reporter: Luke Lu Now that we have [RPC v9|HADOOP-8990], we can write a nonblocking client.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and provide negotiation capabilities
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690673#comment-13690673 ] Luke Lu commented on HADOOP-9421: - Filed HADOOP-9664 to document RPC v9.
[jira] [Updated] (HADOOP-9664) Documentation for RPC v9
[ https://issues.apache.org/jira/browse/HADOOP-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9664: Description: [RPC v9|HADOOP-8990] should be ready for alternative implementations. Let's document the RPC wire protocol somewhere. (was: [RPC v9|HADOOP-8990] should be ready for alternative implementations. Let's document RPC wire protocol somewhere.) Documentation for RPC v9 Key: HADOOP-9664 URL: https://issues.apache.org/jira/browse/HADOOP-9664 Project: Hadoop Common Issue Type: Improvement Components: ipc Reporter: Luke Lu
[jira] [Commented] (HADOOP-9661) Allow metrics sources to be extended
[ https://issues.apache.org/jira/browse/HADOOP-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13690702#comment-13690702 ] Luke Lu commented on HADOOP-9661: - Patch lgtm. +1. Allow metrics sources to be extended Key: HADOOP-9661 URL: https://issues.apache.org/jira/browse/HADOOP-9661 Project: Hadoop Common Issue Type: Improvement Components: metrics Affects Versions: 2.0.5-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: HADOOP-9661.patch My use case is to create an FSQueueMetrics that extends QueueMetrics and includes some additional fair-scheduler-specific information.
[jira] [Commented] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13688991#comment-13688991 ] Luke Lu commented on HADOOP-9421: - bq. As described, optimize token path. [Your patch|https://issues.apache.org/jira/secure/attachment/12588738/HADOOP-9421.patch] still doesn't have proper client initiation support: # It only works with token auths that use digest-md5; it would require a major protocol change to optimize for SCRAM (the modern digest-md5 replacement), Kerberos, or any SASL mechanism that hasInitialResponse. # Fallback prevention seems broken, as the patch didn't modify TestSaslRpc#testSimpleServerWith*Token and still passes these tests.
[jira] [Updated] (HADOOP-9421) Convert SASL to use ProtoBuf and add lengths for non-blocking processing
[ https://issues.apache.org/jira/browse/HADOOP-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Lu updated HADOOP-9421: Attachment: HADOOP-9421.patch Patch with proper client initiation and fallback prevention unit tests. Please review.