Apache Hadoop qbt Report: trunk+JDK8 on Linux/aarch

2020-05-18 Thread Apache Jenkins Server
ARM Build Details: https://builds.apache.org/job/Hadoop-qbt-linux-ARM-trunk/202/

 
CHANGES 
 

[May 18, 2020 6:36:20 AM] (aajisaka) HADOOP-17042. Hadoop distcp throws 'ERROR: Tools helper
[May 18, 2020 7:29:07 AM] (aajisaka) Revert "YARN-9606. Set sslfactory for AuthenticatedURL() while creating
[May 18, 2020 2:04:04 PM] (github) HDFS-15202 Boost short circuit cache (rebase PR-1884) (#2016)
[May 18, 2020 2:09:43 PM] (weichiu) HDFS-13183. Standby NameNode process getBlocks request to reduce Active
[May 18, 2020 3:40:38 PM] (weichiu) HDFS-15207. VolumeScanner skip to scan blocks accessed during recent


 
TESTS 
 

There are 17831 total tests, of which 2 test(s) failed and 2507 test(s) were skipped.

 
FAILED TESTS 
 

2 tests failed.
FAILED:  org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
FAILED:  org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType


The complete test report can be accessed at 
https://builds.apache.org/job/Hadoop-qbt-linux-ARM-trunk/202/testReport

Compressed build logs are attached.


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2020-05-18 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/145/

[May 18, 2020 6:36:20 AM] (Akira Ajisaka) HADOOP-17042. Hadoop distcp throws 'ERROR: Tools helper ///usr/lib/hadoop/libexec/tools/hadoop-distcp.sh was not found'. Contributed by Aki Tanaka.


Re: [DISCUSS] making Ozone a separate Apache project

2020-05-18 Thread Mingliang Liu
+1

On Mon, May 18, 2020 at 12:37 AM Elek, Marton  wrote:

>
>
> > One question, for the committers who contributed to Ozone before and got
> > the committer-role in the past (like me), will they carry the
> > committer-role to the new repo?
>
>
> In short: yes.
>
>
> In more details:
>
> This discussion (if there is an agreement) should be followed by a further
> discussion + vote on a very specific proposal that contains all the
> technical information (including the committer list).
>
> I support the same approach that we followed with Submarine:
>
> ALL the existing (Hadoop) committers should have a free / opt-in
> opportunity to be a committer in Ozone.
>
> (After the proposal is created on the wiki, you can add your name or
> request to be added. But as the initial list can be created based on
> statistics from Jira, your name may already be there ;-) )
>
>
>
> Marton
>



Re: [NOTICE] Removal of protobuf classes from Hadoop Token's public APIs' signature

2020-05-18 Thread Eric Yang
ProtobufHelper should not be a public API.  Hadoop uses protobuf
serialization to improve RPC performance, but it comes with many drawbacks.
The generalized objects usually require another indirection to map to usable
Java objects, which makes Hadoop code messy, but that is a topic for another
day.  The main challenge with the UGI class is that it makes the system
difficult to secure.

In Google's world, gRPC is built on top of protobuf and the HTTP/2 binary
protocol, and is secured by JWT tokens.  This means that before
deserializing a protobuf object from the wire, the server must deserialize a
JSON token to determine whether the call is authenticated, and only then
deserialize the application objects.  Hence, performance gain over JSON is
no longer a good reason to use protobuf for RPC, because JWT token
deserialization happens on every gRPC call to ensure the request is properly
secured.
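
To make that ordering concrete, below is a minimal sketch of an
authenticate-before-deserialize step written as a gRPC server interceptor.
It is illustrative only: the JwtVerifier helper and the "authorization"
header name are assumptions for the example, not an actual Google, gRPC, or
Hadoop API.

{code:java}
import io.grpc.Metadata;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import io.grpc.Status;

// Illustrative only: authenticate a call before the protobuf request message
// is ever parsed, based on a JWT carried in the request metadata.
public class JwtAuthInterceptor implements ServerInterceptor {

  /** Hypothetical verifier; a real one would check signature, expiry, etc. */
  public interface JwtVerifier {
    boolean isValid(String token);
  }

  private static final Metadata.Key<String> AUTH_HEADER =
      Metadata.Key.of("authorization", Metadata.ASCII_STRING_MARSHALLER);

  private final JwtVerifier verifier;

  public JwtAuthInterceptor(JwtVerifier verifier) {
    this.verifier = verifier;
  }

  @Override
  public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
      ServerCall<ReqT, RespT> call, Metadata headers,
      ServerCallHandler<ReqT, RespT> next) {
    String token = headers.get(AUTH_HEADER);
    // The token is checked on every call, before the handler ever
    // deserializes the application's protobuf payload.
    if (token == null || !verifier.isValid(token)) {
      call.close(Status.UNAUTHENTICATED.withDescription("invalid or missing token"),
          new Metadata());
      return new ServerCall.Listener<ReqT>() { };  // drop the request body
    }
    return next.startCall(call, headers);
  }
}
{code}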

In the Hadoop world, we are not using JWT tokens for authentication; we have
pluggable token implementations: SPNEGO, delegation tokens, or some kind of
SASL.  The UGI class should not allow a protobuf token to be exposed as a
public interface, otherwise a downstream application can forge the protobuf
token and it becomes a privilege escalation issue.  In my opinion, the UGI
class must be as private as possible to prevent forgery.  Downstream
applications are discouraged from using UGI.doAs for impersonation, to reduce
privilege escalation.  Instead, a downstream application should run like a
Unix daemon instead of as root.  This ensures that a vulnerability in one
application does not spill over into security problems for another
application.  Some people will disagree with this statement because existing
applications are already written to take advantage of UGI.doAs, such as Hive
loading external tables.  Fortunately, Hive provides an option to run without
doAs.
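
For readers less familiar with the pattern being discouraged here, the sketch
below shows roughly what doAs-based impersonation looks like in a downstream
service. It uses the existing UserGroupInformation API, but the user name and
the filesystem call are placeholders invented for illustration.

{code:java}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsExample {
  public static void main(String[] args) throws Exception {
    // The discouraged pattern: a long-running service holding powerful
    // credentials impersonates an end user ("alice" is a placeholder) for
    // each request. If the service is compromised, every user it is allowed
    // to impersonate is exposed.
    UserGroupInformation proxy = UserGroupInformation.createProxyUser(
        "alice", UserGroupInformation.getLoginUser());
    boolean exists = proxy.doAs(
        (PrivilegedExceptionAction<Boolean>) () ->
            FileSystem.get(new Configuration()).exists(new Path("/user/alice")));
    System.out.println("exists as alice: " + exists);
  }
}
{code}

The alternative argued for above is to have the downstream daemon run under
its own unprivileged identity and rely on the cluster's normal authorization
checks, instead of switching identities in-process.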

Protobuf is not a suitable candidate for security token transport because it
is a strongly typed transport.  If multiple tokens are transported with the
UGI protobuf, small differences between ASCII, UTF-8, or UTF-16 can cause
conversion ambiguity that might create security holes or type-casting
headaches.  I am +1 on removing protobuf from the Hadoop Token API.
Representing a Hadoop Token as a byte array, with a default JSON serializer,
is probably the simpler solution to keep the system robust without repeating
past mistakes.
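
As a rough illustration of that byte-array-plus-JSON idea (a sketch, not an
existing Hadoop API), a token identifier could be a plain value object
serialized to an opaque byte[] with a JSON library; the class and field names
below are invented for the example.

{code:java}
import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;

// Invented example: what a JSON-backed token identifier payload might look
// like if Token exposed only an opaque byte[] plus a pluggable serializer.
// None of these field names come from Hadoop; they are placeholders.
public class JsonTokenIdentifier {
  public String owner;
  public String renewer;
  public String kind;
  public long issueDate;
  public long maxDate;

  private static final ObjectMapper MAPPER = new ObjectMapper();

  /** Serialize to the opaque byte[] that a Token would carry. */
  public byte[] toBytes() throws IOException {
    return MAPPER.writeValueAsBytes(this);
  }

  /** Deserialize from the opaque byte[] without any protobuf types involved. */
  public static JsonTokenIdentifier fromBytes(byte[] raw) throws IOException {
    return MAPPER.readValue(raw, JsonTokenIdentifier.class);
  }
}
{code}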

regards,
Eric

On Sun, May 17, 2020 at 11:56 PM Vinayakumar B wrote:

> Hi Wei-Chiu and Steve,
>
> Thanks for sharing insights.
>
> I have also tried to compile and execute Ozone pointing to
> trunk (3.4.0-SNAPSHOT), which has the shaded and upgraded protobuf.
>
> Other than the usage of internal protobuf APIs, which would break
> compilation, I found that another major problem is the Hadoop RPC
> implementations in downstreams, which are based on non-shaded protobuf
> classes.
>
> 'ProtobufRpcEngine' takes arguments and tries to typecast them to protobuf
> 'Message', which it expects to be of the 3.7 version and from the shaded
> package (i.e. o.a.h.thirdparty.*).
>
> So, unless downstreams upgrade their protobuf classes to
> 'hadoop-thirdparty', this issue will continue to occur, even after solving
> the compilation issues due to internal usage of private APIs with protobuf
> signatures.
>
> I found a possible workaround for this problem.
> Please check https://issues.apache.org/jira/browse/HADOOP-17046
>   This Jira proposes to keep the existing ProtobufRpcEngine as-is (without
> shading and with the protobuf-2.5.0 implementation) to support downstream
> implementations.
>   Use the new ProtobufRpcEngine2 for the shaded protobuf classes within
> Hadoop, and later for projects that wish to upgrade their protobuf to 3.x.
>
> For Ozone compilation:
>   I have submitted two PRs to prepare for adopting the Hadoop 3.3+
> upgrade. These PRs remove the dependency on Hadoop for those internal APIs
> and implement their own copy in Ozone with non-shaded protobuf.
> HDDS-3603: https://github.com/apache/hadoop-ozone/pull/932
> HDDS-3604: https://github.com/apache/hadoop-ozone/pull/933
>
> Also, I ran some tests on Ozone after applying these PRs and
> HADOOP-17046 with 3.4.0, and the tests seem to pass.
>
> Please help review these PRs.
>
> Thanks,
> -Vinay
>
>
> On Wed, Apr 29, 2020 at 5:02 PM Steve Loughran wrote:
>
> > Okay.
> >
> > I am not going to be a purist and say "what were they doing, using our
> > private APIs?" because, as we all know, with things like UGI tagged
> > @private there's been no way to get something done without getting into
> > the private stuff.
> >
> > But why did we do the protobuf changes? So that we could update our
> > private copy of protobuf without breaking every single downstream
> > application. The great protobuf upgrade to 2.5 is not something we wanted
> > to repeat. When was that? Before hadoop-2.2 shipped? I certainly remember
> > a couple of weeks where absolutely nothing would build whatsoever, not
> > until every downstream project had upgraded t

Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-05-18 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/689/

[May 17, 2020 3:59:10 AM] (yqlin) HDFS-15264. Backport Datanode detection to branch-2.10. Contributed by




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s):
   hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

FindBugs :

   module:hadoop-common-project/hadoop-minikdc 
   Possible null pointer dereference in 
org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value of called 
method Dereferenced at 
MiniKdc.java:org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value 
of called method Dereferenced at MiniKdc.java:[line 515] 

FindBugs :

   module:hadoop-common-project/hadoop-auth 
   
org.apache.hadoop.security.authentication.server.MultiSchemeAuthenticationHandler.authenticate(HttpServletRequest,
 HttpServletResponse) makes inefficient use of keySet iterator instead of 
entrySet iterator At MultiSchemeAuthenticationHandler.java:of keySet iterator 
instead of entrySet iterator At MultiSchemeAuthenticationHandler.java:[line 
192] 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   org.apache.hadoop.crypto.CipherSuite.setUnknownValue(int) 
unconditionally sets the field unknownValue At CipherSuite.java:unknownValue At 
CipherSuite.java:[line 44] 
   org.apache.hadoop.crypto.CryptoProtocolVersion.setUnknownValue(int) 
unconditionally sets the field unknownValue At 
CryptoProtocolVersion.java:unknownValue At CryptoProtocolVersion.java:[line 67] 
   Possible null pointer dereference in 
org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to return value of 
called method Dereferenced at 
FileUtil.java:org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to 
return value of called method Dereferenced at FileUtil.java:[line 118] 
   Possible null pointer dereference in 
org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path, 
File, Path, File) due to return value of called method Dereferenced at 
RawLocalFileSystem.java:org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path,
 File, Path, File) due to return value of called method Dereferenced at 
RawLocalFileSystem.java:[line 383] 
   Useless condition:lazyPersist == true at this point At 
CommandWithDestination.java:[line 502] 
   org.apache.hadoop.io.DoubleWritable.compareTo(DoubleWritable) 
incorrectly handles double value At DoubleWritable.java: At 
DoubleWritable.java:[line 78] 
   org.apache.hadoop.io.DoubleWritable$Comparator.compare(byte[], int, int, 
byte[], int, int) incorrectly handles double value At DoubleWritable.java:int) 
incorrectly handles double value At DoubleWritable.java:[line 97] 
   org.apache.hadoop.io.FloatWritable.compareTo(FloatWritable) incorrectly 
handles float value At FloatWritable.java: At FloatWritable.java:[line 71] 
   org.apache.hadoop.io.FloatWritable$Comparator.compare(byte[], int, int, 
byte[], int, int) incorrectly handles float value At FloatWritable.java:int) 
incorrectly handles float value At FloatWritable.java:[line 89] 
   Possible null pointer dereference in 
org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) due to return 
value of called method Dereferenced at 
IOUtils.java:org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) 
due to return value of called method Dereferenced at IOUtils.java:[line 389] 
   Possible bad parsing of shift operation in 
org.apache.hadoop.io.file.tfile.Utils$Version.hashCode() At 
Utils.java:operation in 
org.apache.hadoop.io.file.tfile.Utils$Version.hashCode() At Utils.java:[line 
398] 
   
org.apache.hadoop.metrics2.lib.DefaultMetricsFactory.setInstance(MutableMetricsFactory)
 unconditionally sets the field mmfImpl At DefaultMetricsFactory.java:mmfImpl 
At DefaultMetricsFactory.java:[line 49] 
   
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.setMiniClusterMode(boolean) 
unconditionally sets the field miniClusterMode At 
DefaultMetricsSystem.java:miniClusterMode At DefaultMetricsSystem.java:[line 
92] 
   Useless object stored in variable seqOs of method 
org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.addOrUpdateToken(AbstractDelegationTokenIdentifier,
 AbstractDelegationTokenSecretManager$DelegationTokenIn
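
Many of the "possible null pointer dereference ... due to return value of
called method" findings above stem from the same Java pitfall: calls such as
File#listFiles() return null, rather than throwing, when a directory cannot
be listed, so dereferencing the result without a check trips the warning.
A minimal, hypothetical sketch of the usual guard (not the actual Hadoop
fix) is:

{code:java}
import java.io.File;
import java.io.IOException;

// Illustrative guard for the recurring FindBugs complaint: listFiles()
// returns null (instead of throwing) when the path is not a readable
// directory, so the result must be checked before it is used.
public final class SafeListing {
  private SafeListing() {
  }

  public static File[] listChildren(File dir) throws IOException {
    File[] children = dir.listFiles();
    if (children == null) {
      throw new IOException("Could not list contents of " + dir);
    }
    return children;
  }
}
{code}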

[jira] [Created] (YARN-10270) TODO comments exist in trunk while the related issues are already fixed.

2020-05-18 Thread Rungroj Maipradit (Jira)
Rungroj Maipradit created YARN-10270:


 Summary: TODO comments exist in trunk while the related issues are 
already fixed.
 Key: YARN-10270
 URL: https://issues.apache.org/jira/browse/YARN-10270
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Rungroj Maipradit


In a research project, we analyzed the source code of Hadoop looking for 
comments with on-hold SATDs (self-admitted technical debt) that could already 
be fixed. An on-hold SATD is a TODO/FIXME comment blocked by an issue. If the 
blocking issue is already resolved, the related TODO can be implemented (or 
sometimes it is already implemented, but the comment is left in the code, 
causing confusion). As we found a few instances of these in YARN, we decided 
to collect them in a ticket, so they are documented and can be addressed sooner 
or later.

A list of code comments that mention already-closed issues:

* A code comment recommends rewriting the mergeSkyline method once YARN-5328 is 
committed. The comment still exists in trunk, although YARN-5328 has been 
resolved as Fixed.
{code:java}
// TODO:
// rewrite this function with shift and merge once YARN-5328 is committed
/** First, getHistory the pipeline submission time. */
{code}
Comment location: 
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-resourceestimator/src/main/java/org/apache/hadoop/resourceestimator/solver/preprocess/SolverPreprocessor.java#L137

* A code comment mentions a temporary fix for the NM callback and refers to 
YARN-8265. YARN-8265 is already closed, but a comment on it says that the 
feature will be implemented in YARN-8286 
(https://issues.apache.org/jira/browse/YARN-8265?focusedCommentId=16473105&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16473105).
YARN-8286 could therefore be referenced in the source code comment instead.
{code:java}
// A docker container might get a different IP if the container is
// relaunched by the NM, so we need to keep checking the status.
// This is a temporary fix until the NM provides a callback for
// container relaunch (see YARN-8265).
{code}
Comment location: 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/instance/ComponentInstance.java#L684

* The code comment suggests revisiting the AMRMProxyService.initializePipeline 
method in YARN-6128, but YARN-6128 is already fixed. The comment was added in a 
commit referencing YARN-6127 
(https://github.com/apache/hadoop/commit/49aa60e50d20f8c18ed6f00fa8966244536fe7da).
It seems that this part of the code remained untouched in the commit related 
to YARN-6128 
(https://github.com/apache/hadoop/commit/d5f66888b8d767ee6706fab9950c194a1bf26d32).
{code:java}
// TODO: revisit in AMRMProxy HA in YARN-6128
// Remove the existing pipeline
{code}
Comment location: 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/AMRMProxyService.java#L463




--
This message was sent by Atlassian Jira
(v8.3.4#803005)




Re: [DISCUSS] making Ozone a separate Apache project

2020-05-18 Thread Elek, Marton





One question, for the committers who contributed to Ozone before and got
the committer-role in the past (like me), will they carry the
committer-role to the new repo?



In short: yes.


In more details:

This discussion (if there is an agreement) should be followed by a further 
discussion + vote on a very specific proposal that contains all the technical 
information (including the committer list).


I support the same approach that we followed with Submarine:

ALL the existing (Hadoop) committers should have a free / opt-in 
opportunity to be a committer in Ozone.


(After the proposal is created on the wiki, you can add your name or 
request to be added. But as the initial list can be created based on 
statistics from Jira, your name may already be there ;-) )




Marton
