Re: Client Caller Context Through Router(RBF)

2020-05-07 Thread Hui Fei
Ayush, thanks for bringing this up; it is very meaningful!

One more typical problem: the NameNode should log the IP of both the real
client and the Router. Currently the NN logs only the Router's IP, which makes
troubleshooting difficult.
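
To make the logging point concrete, here is a minimal sketch of an audit record that carries both addresses. It is a toy model, not Hadoop code: the class `AuditRecord`, its field names, and the `peer=`/`client=` output format are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry; not the real NameNode audit log format."""
    peer_ip: str                          # address of the directly connected peer (the Router)
    real_client_ip: Optional[str] = None  # assumed to be forwarded by the Router, if at all

    def format(self) -> str:
        parts = [f"peer={self.peer_ip}"]
        if self.real_client_ip is not None:
            # Only present when the Router forwards the original client address.
            parts.append(f"client={self.real_client_ip}")
        return " ".join(parts)


# Today: only the Router's address is visible.
print(AuditRecord("10.0.0.5").format())
# Desired: both the Router and the real client are visible.
print(AuditRecord("10.0.0.5", "192.168.1.20").format())
```

With only the first form available, every request arriving through the same Router looks identical in the log, which is what makes troubleshooting hard.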


Ayush Saxena wrote on Monday, May 4, 2020 at 5:26 PM:

> Hi All,
> Wanted to share and discuss a problem that we are currently facing
> with Router Based Federation. At present, when a client
> connects through the Router to the Namenode, the Namenode receives the
> caller context of the Router rather than that of the actual client. This
> can cause a couple of problems, two of which we have identified
> so far:
>
> Firstly, data locality doesn't work correctly when connecting through
> the Router, as the Namenode considers the Router to be the actual
> client and performs all its optimizations/computations based on the
> Router's location rather than the actual client's location.
>
> Secondly, the Namenode Retry Cache cannot be used. If, after a
> failover or similar event, the client retries and connects to another
> Router, the Call Id comes from the Router and not from
> the actual client, so the Retry Cache doesn't identify it as a repeated
> call and serves it as a whole new call, which creates inconsistencies.
>
> We have been discussing and trying solutions for a long time now and
> have tried out a few:
>
> - Add proxy address in IPC connection (HADOOP-16254) --> This raised some
>   security concerns from Daryn.
> - The RouterRPCServer should transfer CallerContext and client IP to
>   NamenodeRpcServer (HDFS-13293) --> This tends to be a little opaque,
>   and a couple more problems were stated in HDFS-13248 by Ajay Kumar and
>   Arpit Agarwal.
> - Favored Nodes --> Pass the local node as a favored node. But this isn't
>   a complete solution. It doesn't take into account the fallback when
>   local nodes are unavailable, and a couple of other cases. It isn't a
>   solution for the Retry Cache problem either.
>
>
> The related JIRAs where most of the discussion happened, if anyone wants
> to follow:
> HDFS-13248: For the Data Locality problem. Has a patch at the end with
> Solution 3 (Favored Nodes).
> HDFS-15079, HDFS-15078 & HDFS-15310: For the Retry Cache problem.
> HADOOP-16254: Solution 1: Add proxy address in IPC connection.
> HDFS-13293: Solution 2: Passing Caller Context.
>
> Do let us know if you can help here: any further solutions, workarounds, or
> a way to unblock or improve the tried solutions.
>
> Thanx!!!
> -Ayush
>
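
The Retry Cache problem described in the quoted message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the NameNode implementation: it assumes the cache is keyed on a (client id, call id) pair, and the classes `RetryCache` and `Router` plus the `forward_client_identity` switch are hypothetical names invented for the example.

```python
import uuid


class RetryCache:
    """Toy cache keyed on (client_id, call_id); an assumption for illustration."""

    def __init__(self):
        self._entries = {}

    def execute(self, client_id, call_id, operation):
        key = (client_id, call_id)
        if key in self._entries:
            # Same caller, same call id: treat as a retry and replay the result.
            return self._entries[key], True
        result = operation()
        self._entries[key] = result
        return result, False


class Router:
    """Hypothetical router that either forwards or replaces the caller identity."""

    def __init__(self, forward_client_identity):
        self.router_id = uuid.uuid4().hex
        self._next_call_id = 0
        self.forward_client_identity = forward_client_identity

    def forward(self, cache, client_id, call_id, operation):
        if self.forward_client_identity:
            return cache.execute(client_id, call_id, operation)
        # Current behaviour in this model: the router uses its own ids.
        self._next_call_id += 1
        return cache.execute(self.router_id, self._next_call_id, operation)


def run_retry(forward_client_identity):
    """Send a call via one router, then retry the same call via another."""
    cache = RetryCache()
    applied = []
    operation = lambda: applied.append("mkdir") or len(applied)
    first = Router(forward_client_identity)
    second = Router(forward_client_identity)
    first.forward(cache, "client-A", 7, operation)
    _, was_retry_hit = second.forward(cache, "client-A", 7, operation)
    return was_retry_hit, len(applied)


print(run_retry(False))  # retry is not recognised, operation applied twice
print(run_retry(True))   # retry is recognised, operation applied once
```

In this model the retry through a second router is served as a new call unless the original client's identity reaches the cache, which is the inconsistency the message describes.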


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2020-05-07 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/133/

[May 6, 2020 11:25:04 AM] (Ayush Saxena) HDFS-14283. DFSInputStream to prefer cached replica. Contributed by Lisheng Sun.
[May 6, 2020 5:27:17 PM] (Jonathan Turner Eagles) YARN-8959. TestContainerResizing fails randomly (Ahmed Hussein via jeagles)
[May 6, 2020 8:18:32 PM] (Iñigo Goiri) HDFS-15332. Quota Space consumed was wrong in truncate with Snapshots. Contributed by hemanthboyina.
[May 6, 2020 8:22:54 PM] (Iñigo Goiri) YARN-9017. PlacementRule order is not maintained in CS. Contributed by Bilwa S T.




-1 overall


The following subsystems voted -1:
asflicense findbugs mvnsite pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s):
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

findbugs :

   module:hadoop-yarn-project/hadoop-yarn 
   Uncallable method 
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage$1.getInstance()
 defined in anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:[line 87] 
   Dead store to entities in 
org.apache.hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown.checkQuery(HBaseTimelineReaderImpl)
 At 
TestTimelineReaderHBaseDown.java:org.apache.hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown.checkQuery(HBaseTimelineReaderImpl)
 At TestTimelineReaderHBaseDown.java:[line 190] 
   org.apache.hadoop.yarn.server.webapp.WebServiceClient.sslFactory should 
be package protected At WebServiceClient.java: At WebServiceClient.java:[line 
42] 

findbugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server 
   Uncallable method 
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage$1.getInstance()
 defined in anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:[line 87] 
   Dead store to entities in 
org.apache.hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown.checkQuery(HBaseTimelineReaderImpl)
 At 
TestTimelineReaderHBaseDown.java:org.apache.hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown.checkQuery(HBaseTimelineReaderImpl)
 At TestTimelineReaderHBaseDown.java:[line 190] 
   org.apache.hadoop.yarn.server.webapp.WebServiceClient.sslFactory should 
be package protected At WebServiceClient.java: At WebServiceClient.java:[line 
42] 

findbugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common
 
   org.apache.hadoop.yarn.server.webapp.WebServiceClient.sslFactory should 
be package protected At WebServiceClient.java: At WebServiceClient.java:[line 
42] 

findbugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 
   Uncallable method 
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage$1.getInstance()
 defined in anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:[line 87] 
   Dead store to entities in 
org.apache.hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown.checkQuery(HBaseTimelineReaderImpl)
 At 
TestTimelineReaderHBaseDown.java:org.apache.hadoop.yarn.server.timelineservice.storage.TestTimelineReaderHBaseDown.checkQuery(HBaseTimelineReaderImpl)
 At TestTimelineReaderHBaseDown.java:[line 190] 

findbugs :

   module:hadoop-yarn-project 
   Uncallable method 
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage$1.getInstance()
 defined in anonymous class At 
TestTimelineReaderWebServicesHBaseStorage.java:anonymous 

[jira] [Created] (YARN-10261) DistributedShell ExecScript cannot be found

2020-05-07 Thread Markus Jelsma (Jira)
Markus Jelsma created YARN-10261:


             Summary: DistributedShell ExecScript cannot be found
                 Key: YARN-10261
                 URL: https://issues.apache.org/jira/browse/YARN-10261
             Project: Hadoop YARN
          Issue Type: Bug
          Components: distributed-shell
    Affects Versions: 3.2.0
            Reporter: Markus Jelsma


When running the DistributedShell with a custom script, the Client uploads it
to HDFS as ExecScript. The ApplicationMaster cannot find the script because it
tries to download ExecScript.sh.

I have patched the DistributedShell subproject so that the Client uploads it as
ExecScript.sh (and ExecScript.bat on Windows).

With the change, the ApplicationMaster now correctly runs my custom script.
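
The name mismatch can be shown with a short sketch. This is not the attached patch and not DistributedShell code; `uploaded_name` and `am_can_find` are hypothetical helpers that only model the two sides agreeing (or not) on the file name.

```python
def uploaded_name(with_extension, windows=False):
    """Name the (modelled) Client gives the script when staging it."""
    if not with_extension:
        return "ExecScript"  # behaviour described in the report
    return "ExecScript" + (".bat" if windows else ".sh")


def am_can_find(staged_files, windows=False):
    """The (modelled) ApplicationMaster looks for the name with an extension."""
    expected = "ExecScript" + (".bat" if windows else ".sh")
    return expected in staged_files


# Before the change: the upload has no extension, so the lookup fails.
print(am_can_find({uploaded_name(with_extension=False)}))
# After the change: both sides use the same name.
print(am_can_find({uploaded_name(with_extension=True)}))
```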

 

I'll attach a patch.






Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-05-07 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/678/

[May 6, 2020 5:48:12 PM] (jeagles) YARN-8959. TestContainerResizing fails 
randomly (Ahmed Hussein via




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s):
   hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

FindBugs :

   module:hadoop-common-project/hadoop-minikdc 
   Possible null pointer dereference in 
org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value of called 
method Dereferenced at 
MiniKdc.java:org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value 
of called method Dereferenced at MiniKdc.java:[line 515] 

FindBugs :

   module:hadoop-common-project/hadoop-auth 
   
org.apache.hadoop.security.authentication.server.MultiSchemeAuthenticationHandler.authenticate(HttpServletRequest,
 HttpServletResponse) makes inefficient use of keySet iterator instead of 
entrySet iterator At MultiSchemeAuthenticationHandler.java:of keySet iterator 
instead of entrySet iterator At MultiSchemeAuthenticationHandler.java:[line 
192] 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   org.apache.hadoop.crypto.CipherSuite.setUnknownValue(int) 
unconditionally sets the field unknownValue At CipherSuite.java:unknownValue At 
CipherSuite.java:[line 44] 
   org.apache.hadoop.crypto.CryptoProtocolVersion.setUnknownValue(int) 
unconditionally sets the field unknownValue At 
CryptoProtocolVersion.java:unknownValue At CryptoProtocolVersion.java:[line 67] 
   Possible null pointer dereference in 
org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to return value of 
called method Dereferenced at 
FileUtil.java:org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to 
return value of called method Dereferenced at FileUtil.java:[line 118] 
   Possible null pointer dereference in 
org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path, 
File, Path, File) due to return value of called method Dereferenced at 
RawLocalFileSystem.java:org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path,
 File, Path, File) due to return value of called method Dereferenced at 
RawLocalFileSystem.java:[line 383] 
   Useless condition:lazyPersist == true at this point At 
CommandWithDestination.java:[line 502] 
   org.apache.hadoop.io.DoubleWritable.compareTo(DoubleWritable) 
incorrectly handles double value At DoubleWritable.java: At 
DoubleWritable.java:[line 78] 
   org.apache.hadoop.io.DoubleWritable$Comparator.compare(byte[], int, int, 
byte[], int, int) incorrectly handles double value At DoubleWritable.java:int) 
incorrectly handles double value At DoubleWritable.java:[line 97] 
   org.apache.hadoop.io.FloatWritable.compareTo(FloatWritable) incorrectly 
handles float value At FloatWritable.java: At FloatWritable.java:[line 71] 
   org.apache.hadoop.io.FloatWritable$Comparator.compare(byte[], int, int, 
byte[], int, int) incorrectly handles float value At FloatWritable.java:int) 
incorrectly handles float value At FloatWritable.java:[line 89] 
   Possible null pointer dereference in 
org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) due to return 
value of called method Dereferenced at 
IOUtils.java:org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) 
due to return value of called method Dereferenced at IOUtils.java:[line 389] 
   Possible bad parsing of shift operation in 
org.apache.hadoop.io.file.tfile.Utils$Version.hashCode() At 
Utils.java:operation in 
org.apache.hadoop.io.file.tfile.Utils$Version.hashCode() At Utils.java:[line 
398] 
   
org.apache.hadoop.metrics2.lib.DefaultMetricsFactory.setInstance(MutableMetricsFactory)
 unconditionally sets the field mmfImpl At DefaultMetricsFactory.java:mmfImpl 
At DefaultMetricsFactory.java:[line 49] 
   
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.setMiniClusterMode(boolean) 
unconditionally sets the field miniClusterMode At 
DefaultMetricsSystem.java:miniClusterMode At DefaultMetricsSystem.java:[line 
92] 
   Useless object stored in variable seqOs of method 
org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.addOrUpdateToken(AbstractDelegationTokenIdentifier,