date:20190426

[jira] [Commented] (HDFS-14403) Cost-Based RPC FairCallQueue

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827421#comment-16827421
 ] 

Hadoop QA commented on HDFS-14403:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 25m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 25m 35s{color} | {color:red} root generated 3 new + 1481 unchanged - 0 fixed = 1484 total (was 1481) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  3m 37s{color} | {color:orange} root: The patch generated 9 new + 385 unchanged - 6 fixed = 394 total (was 391) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m 41s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}110m 11s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}244m 28s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ipc.TestProcessingDetails |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14403 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12967181/HDFS-14403.006.combined.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8b7ab1ac0fb7 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
|

[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827417#comment-16827417
 ] 

Hadoop QA commented on HDFS-14440:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 54s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 14s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 19s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 32s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 50s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 20s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 24m 54s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 45s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14440 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12967170/HDFS-14440-HDFS-13891-02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8bc5cd3a2f21 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / 55f2f7a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26715/testReport/ |
| Max. process+thread count | 1331 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26715/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |



[jira] [Work logged] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1471?focusedWorklogId=233828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233828
 ]

ASF GitHub Bot logged work on HDDS-1471:


Author: ASF GitHub Bot
Created on: 27/Apr/19 01:06
Start Date: 27/Apr/19 01:06
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on issue #777: HDDS-1471. Update 
ratis dependency to 0.3.0. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/777#issuecomment-487241855
 
 
   /retest
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233828)
Time Spent: 40m  (was: 0.5h)

> Update ratis dependency to 0.3.0
> 
>
> Key: HDDS-1471
> URL: https://issues.apache.org/jira/browse/HDDS-1471
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Update ratis dependency to 0.3.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Ayush Saxena (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14440:

Status: Patch Available  (was: Open)

> RBF: Optimize the file write process in case of multiple destinations.
> --
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch, 
> HDFS-14440-HDFS-13891-02.patch
>
>
> With multiple destinations, we need to check whether the file already exists 
> in one of the subclusters. For this we use the existing getBlockLocation() 
> API, which by default is a sequential call.
> In the common case where the file needs to be created, every subcluster is 
> checked sequentially; these checks can be done concurrently to save time.
> In the other case, where the file is found and the last block is null, we 
> need to call getFileInfo on all the locations to find the one where the 
> file exists. This can also be avoided by using a concurrent call, since we 
> would then have the remoteLocation for which getBlockLocation returned a 
> non-null entry.
>  
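
The idea in the description above can be sketched roughly as follows. This is only an illustration in plain Java, not the Router code: the class name ConcurrentLookupSketch, the method findExisting, and the use of strings for subcluster locations are all hypothetical, and the predicate stands in for whatever per-subcluster check (such as getBlockLocation) is actually used.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: probe every location concurrently instead of one by one.
final class ConcurrentLookupSketch {

  /**
   * Starts one asynchronous probe per location, then returns the first
   * location (in list order) whose probe reported that the file exists.
   */
  static Optional<String> findExisting(List<String> locations,
                                       Predicate<String> exists) {
    // All probes are submitted before any result is awaited.
    List<CompletableFuture<Boolean>> probes = locations.stream()
        .map(loc -> CompletableFuture.supplyAsync(() -> exists.test(loc)))
        .collect(Collectors.toList());

    for (int i = 0; i < locations.size(); i++) {
      if (probes.get(i).join()) {
        // The matching location is known here, so no second pass
        // over all locations is needed to find where the file lives.
        return Optional.of(locations.get(i));
      }
    }
    return Optional.empty();
  }
}
```

Because the probe result is paired with its location, the caller already knows which subcluster holds the file, which is the second saving the description mentions.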




[jira] [Created] (HDDS-1477) Recon Server stops after failed attempt to get snapshot from OM

2019-04-26 Thread Vivek Ratnavel Subramanian (JIRA)

Vivek Ratnavel Subramanian created HDDS-1477:


 Summary: Recon Server stops after failed attempt to get snapshot 
from OM
 Key: HDDS-1477
 URL: https://issues.apache.org/jira/browse/HDDS-1477
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Recon
Affects Versions: 0.4.0
Reporter: Vivek Ratnavel Subramanian
Assignee: Aravindan Vijayan


The Recon server stops after it fails to connect to OM.

 
{code:java}
2019-04-26 14:55:03,441 INFO org.apache.hadoop.utils.db.DBStoreBuilder: using custom profile for table: default
2019-04-26 14:55:03,441 INFO org.apache.hadoop.utils.db.DBStoreBuilder: Using default column profile:DBProfile.DISK for Table:default
2019-04-26 14:55:03,444 INFO org.apache.hadoop.utils.db.DBStoreBuilder: Using default options. DBProfile.DISK
2019-04-26 14:55:03,659 INFO org.apache.hadoop.conf.Configuration.deprecation: No unit for recon.om.connection.request.timeout(5000) assuming MILLISECONDS
2019-04-26 14:56:05,389 INFO org.apache.hadoop.ozone.recon.tasks.ContainerKeyMapperTask: Starting a run of ContainerKeyMapperTask.
2019-04-26 14:56:05,454 ERROR org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Unable to obtain Ozone Manager DB Snapshot.
org.apache.http.conn.HttpHostConnectException: Connect to 0.0.0.0:9874 [/0.0.0.0] failed: Connection refused (Connection refused)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:158)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at org.apache.hadoop.ozone.recon.ReconUtils.makeHttpCall(ReconUtils.java:161)
at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.getOzoneManagerDBSnapshot(OzoneManagerServiceProviderImpl.java:173)
at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.updateReconOmDBWithNewSnapshot(OzoneManagerServiceProviderImpl.java:144)
at org.apache.hadoop.ozone.recon.tasks.ContainerKeyMapperTask.run(ContainerKeyMapperTask.java:69)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
... 20 more
2019-04-26 14:56:05,456 ERROR org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Null snapshot location got from OM.
2019-04-26 16:09:09,557 INFO org.apache.hadoop.ozone.recon.ReconServer: Stopping Recon server
{code}




[jira] [Assigned] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Aravindan Vijayan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-1475:
---

Assignee: Aravindan Vijayan

> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: newbie
>
> In OzoneContainer start() we have 
> {code:java}
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);{code}
>  
> Suppose readChannel.start() fails for some reason; VersionEndPointTask then 
> tries to start OzoneContainer again. This can cause a problem in 
> writeChannel.start() if it has already been started. 
>  
> Fix the logic so that a service that is already started is not started 
> again. Similar changes are needed for stop().
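
One common way to get the behaviour requested above is to guard each service with a started flag. The following is a minimal sketch under that assumption, not the actual OzoneContainer change: the class StartOnceService and its Runnable-based start and stop actions are hypothetical stand-ins for the real channels.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper that makes start() and stop() safe to call repeatedly.
final class StartOnceService {
  private final AtomicBoolean started = new AtomicBoolean(false);
  private final Runnable startAction;
  private final Runnable stopAction;

  StartOnceService(Runnable startAction, Runnable stopAction) {
    this.startAction = startAction;
    this.stopAction = stopAction;
  }

  /** Runs the start action only if the service is not already started. */
  void start() {
    if (!started.compareAndSet(false, true)) {
      return; // already started: a retry of the outer start() skips this service
    }
    try {
      startAction.run();
    } catch (RuntimeException e) {
      started.set(false); // start failed, so allow a later retry
      throw e;
    }
  }

  /** Runs the stop action only if the service is currently started. */
  void stop() {
    if (started.compareAndSet(true, false)) {
      stopAction.run();
    }
  }

  boolean isStarted() {
    return started.get();
  }
}
```

With such a guard around each channel, a retry after readChannel.start() fails would skip the write channel that is already running and only retry the part that failed. Note that the flag flips before the start action finishes, so a concurrent caller may observe isStarted() as true slightly early.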




[jira] [Updated] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1475:
-
Description: 
In OzoneContainer start() we have 
{code:java}
startContainerScrub();
writeChannel.start();
readChannel.start();
hddsDispatcher.init();
hddsDispatcher.setScmId(scmId);{code}
 

Suppose readChannel.start() fails for some reason; VersionEndPointTask then 
tries to start OzoneContainer again. This can cause a problem in 
writeChannel.start() if it has already been started. 

 

Fix the logic so that a service that is already started is not started again. 
Similar changes are needed for stop().

  was:
In OzoneContainer start() we have 
{code:java}
startContainerScrub();
writeChannel.start();
readChannel.start();
hddsDispatcher.init();
hddsDispatcher.setScmId(scmId);{code}
 

Suppose here if readChannel.start() failed due to some reason, from 
VersionEndPointTask, we try to start OzoneContainer again. This will cause if a 
service is already started. 

 

Fix the logic such a way that if service is started, don't attempt to start the 
service again. Similar changes needed to be done for stop().


> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> In OzoneContainer start() we have 
> {code:java}
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);{code}
>  
> Suppose readChannel.start() fails for some reason; VersionEndPointTask then 
> tries to start OzoneContainer again. This can cause a problem in 
> writeChannel.start() if it has already been started. 
>  
> Fix the logic so that a service that is already started is not started 
> again. Similar changes are needed for stop().




[jira] [Commented] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread JIRA



[ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827365#comment-16827365
 ] 

Íñigo Goiri commented on HDFS-14454:


The test failure is caused by the random mount point. If all 10 files end 
up in the same subcluster, we get a failure. We can increase the number of 
files to write. Maybe we do that in a separate JIRA? 
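
As a rough illustration of why writing more files helps: assuming (this is an assumption, not something stated in the thread) that each file lands independently and uniformly at random in one of k subclusters, the chance that all n files end up in the same subcluster is k^(1-n). The helper below is hypothetical and only does that arithmetic.

```java
// Hypothetical helper: probability that n files all land in one of k subclusters,
// assuming independent, uniformly random placement.
final class SameSubclusterOdds {

  /** Returns k * (1/k)^n, which simplifies to k^(1 - n). */
  static double allInSame(int subclusters, int files) {
    return Math.pow(subclusters, 1 - files);
  }
}
```

For example, with 2 subclusters and 10 files the result is 2^-9 = 1/512, about 0.2% per run; doubling the file count to 20 gives 2^-19, which makes the flaky outcome far rarer.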

> RBF: getContentSummary() should allow non-existing folders
> --
>
> Key: HDFS-14454
> URL: https://issues.apache.org/jira/browse/HDFS-14454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14454-HDFS-13891.000.patch, 
> HDFS-14454-HDFS-13891.001.patch, HDFS-14454-HDFS-13891.002.patch, 
> HDFS-14454-HDFS-13891.003.patch, HDFS-14454-HDFS-13891.004.patch, 
> HDFS-14454-HDFS-13891.005.patch
>
>
> We have a mount point with HASH_ALL and one of the subclusters does not 
> contain the folder.
> In this case, getContentSummary() returns FileNotFoundException.




[jira] [Updated] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1475:
-
Description: 
In OzoneContainer start() we have 
{code:java}
startContainerScrub();
writeChannel.start();
readChannel.start();
hddsDispatcher.init();
hddsDispatcher.setScmId(scmId);{code}
 

Suppose here if readChannel.start() failed due to some reason, from 
VersionEndPointTask, we try to start OzoneContainer again. This will cause if a 
service is already started. 

 

Fix the logic such a way that if service is started, don't attempt to start the 
service again. Similar changes needed to be done for stop().

  was:
In OzoneContainer start() we have 

startContainerScrub();
writeChannel.start();
readChannel.start();
hddsDispatcher.init();
hddsDispatcher.setScmId(scmId);

 

Suppose here if readChannel.start() failed due to some reason, from 
VersionEndPointTask, we try to start OzoneContainer again. This will cause if a 
service is already started. 

 

Fix the logic such a way that if service is started, don't attempt to start the 
service again. Similar changes needed to be done for stop().


> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> In OzoneContainer start() we have 
> {code:java}
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);{code}
>  
> Suppose here if readChannel.start() failed due to some reason, from 
> VersionEndPointTask, we try to start OzoneContainer again. This will cause if 
> a service is already started. 
>  
> Fix the logic such a way that if service is started, don't attempt to start 
> the service again. Similar changes needed to be done for stop().




[jira] [Updated] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1475:
-
Component/s: Ozone Datanode

> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> In OzoneContainer start() we have 
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);
>  
> Suppose here if readChannel.start() failed due to some reason, from 
> VersionEndPointTask, we try to start OzoneContainer again. This will cause if 
> a service is already started. 
>  
> Fix the logic such a way that if service is started, don't attempt to start 
> the service again. Similar changes needed to be done for stop().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDDS-1476) Fix logIfNeeded logic in EndPointStateMachine

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1476:
-
Component/s: Ozone Datanode

> Fix logIfNeeded logic in EndPointStateMachine
> -
>
> Key: HDDS-1476
> URL: https://issues.apache.org/jira/browse/HDDS-1476
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> {code:java}
> public void logIfNeeded(Exception ex) {
>   LOG.trace("Incrementing the Missed count. Ex : {}", ex);
>   this.incMissed();
>   if (this.getMissedCount() % getLogWarnInterval(conf) == 0) {
>     LOG.error(
>         "Unable to communicate to SCM server at {} for past {} seconds.",
>         this.getAddress().getHostString() + ":" + this.getAddress().getPort(),
>         TimeUnit.MILLISECONDS.toSeconds(
>             this.getMissedCount() * getScmHeartbeatInterval(this.conf)), ex);
>   }
> }{code}
> This method is called whenever an exception occurs in the state machine, to 
> log the exception. To avoid logging too aggressively, the 
> ozone.scm.heartbeat.log.warn.interval.count property controls how often we log. 
>  
> There is a small issue here: we don't log the exception the first time it 
> occurs. So we need to log the first occurrence and then increment the 
> missed count.
>  
> The fix is to move this.incMissed() to the end of the method, so that we log 
> the first time an exception occurs and then once every 
> log.warn.interval.count exceptions after that.
>  
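
The proposed ordering (check first, then increment) can be shown in isolation. This is a simplified sketch rather than the EndPointStateMachine code: the class MissedCountLogger and the method recordFailure are hypothetical, and the method returns a boolean instead of calling a logger so the behaviour is easy to check.

```java
// Hypothetical sketch of "log the first miss, then every Nth miss".
final class MissedCountLogger {
  private final int warnInterval; // stands in for log.warn.interval.count
  private long missedCount = 0;

  MissedCountLogger(int warnInterval) {
    this.warnInterval = warnInterval;
  }

  /** Records one failure and reports whether it should be logged. */
  boolean recordFailure() {
    // Check BEFORE incrementing: count 0 passes, so the first failure is logged.
    boolean shouldLog = missedCount % warnInterval == 0;
    missedCount++;
    return shouldLog;
  }
}
```

With an interval of 3, the sequence of decisions is log, skip, skip, log, and so on. If the increment came first, as in the quoted method, the first failure would be skipped and logging would start only at the third.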




[jira] [Comment Edited] (HDFS-14434) webhdfs that connect secure hdfs should not use user.name parameter

2019-04-26 Thread Eric Yang (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827361#comment-16827361
 ] 

Eric Yang edited comment on HDFS-14434 at 4/26/19 11:04 PM:


[~magnum], thank you for the patch.  The patch looks good to me.  [~kihwal], 
does it look good on your side?


was (Author: eyang):
[~magnum], thank you for the patch.  The patch looks good to me if we can clean 
up the checkstyle problem.  [~kihwal], does it look good on your side?

> webhdfs that connect secure hdfs should not use user.name parameter
> ---
>
> Key: HDFS-14434
> URL: https://issues.apache.org/jira/browse/HDFS-14434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Assignee: KWON BYUNGCHANG
>Priority: Minor
> Attachments: HDFS-14434.001.patch, HDFS-14434.002.patch, 
> HDFS-14434.003.patch, HDFS-14434.004.patch, HDFS-14434.005.patch, 
> HDFS-14434.006.patch, HDFS-14434.007.patch, HDFS-14434.008.patch
>
>
> I have two secure Hadoop clusters. Both clusters use cross-realm 
> authentication. 
> [use...@a.com|mailto:use...@a.com] can access the HDFS of the B.COM realm.
> Note that the Hadoop username of use...@a.com in the B.COM realm is 
> cross_realm_a_com_user_a.
> The hdfs dfs command run by use...@a.com against B.COM webhdfs fails.
> The root cause is that webhdfs uses the user.name parameter when connecting 
> to secure HDFS.
> According to the webhdfs spec, insecure webhdfs uses user.name, while secure 
> webhdfs uses SPNEGO for authentication.
> I think webhdfs should not use the user.name parameter when connecting to 
> secure HDFS.
> I will attach a patch.
> The error log is below:
>  
> {noformat}
> $ hdfs dfs -ls  webhdfs://b.com:50070/
> ls: Usernames not matched: name=user_a != expected=cross_realm_a_com_user_a
>  
> # user.name in cross realm webhdfs
> $ curl -u : --negotiate 
> 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=user_a' 
> {"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed
>  to obtain user group information: java.io.IOException: Usernames not 
> matched: name=user_a != expected=cross_realm_a_com_user_a"}}
> # USE SPNEGO
> $ curl -u : --negotiate 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
> {"Token"{"urlString":"XgA."}}
>  
> {noformat}
>  
>  
>  
>  
>  
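
The behaviour the reporter asks for can be illustrated with a small URL-building sketch. This is not the attached patch or the WebHdfsFileSystem code: the class WebHdfsAuthParamSketch and both methods are hypothetical, and URL encoding of the user name is omitted for brevity.

```java
// Hypothetical sketch: only send user.name when the cluster is NOT secure.
final class WebHdfsAuthParamSketch {

  /** Returns the query fragment to append for authentication, if any. */
  static String authQuery(boolean securityEnabled, String shortUserName) {
    // On a secure cluster the identity comes from SPNEGO, so no user.name.
    return securityEnabled ? "" : "&user.name=" + shortUserName;
  }

  /** Builds a GETDELEGATIONTOKEN URL like the curl examples in the report. */
  static String delegationTokenUrl(String hostPort, boolean securityEnabled,
                                   String shortUserName) {
    return "http://" + hostPort + "/webhdfs/v1/?op=GETDELEGATIONTOKEN"
        + authQuery(securityEnabled, shortUserName);
  }
}
```

In the secure case this produces the same URL as the working SPNEGO curl call in the report, so the server derives the user from the Kerberos principal and the name mismatch cannot occur.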




[jira] [Updated] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1475:
-
Labels: newbie  (was: )

> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> In OzoneContainer start() we have 
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);
>  
> Suppose readChannel.start() fails for some reason. From 
> VersionEndPointTask, we then try to start OzoneContainer again. This will cause 
> problems if a service is already started. 
>  
> Fix the logic so that if a service is already started, we don't attempt to start 
> it again. Similar changes need to be made for stop().




[jira] [Assigned] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-1475:


Assignee: (was: Bharat Viswanadham)

> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
>
> In OzoneContainer start() we have 
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);
>  
> Suppose readChannel.start() fails for some reason. From 
> VersionEndPointTask, we then try to start OzoneContainer again. This will cause 
> problems if a service is already started. 
>  
> Fix the logic so that if a service is already started, we don't attempt to start 
> it again. Similar changes need to be made for stop().




[jira] [Commented] (HDFS-14434) webhdfs that connect secure hdfs should not use user.name parameter

2019-04-26 Thread Eric Yang (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827361#comment-16827361
 ] 

Eric Yang commented on HDFS-14434:
--

[~magnum], thank you for the patch.  The patch looks good to me if we can clean 
up the checkstyle problem.  [~kihwal], does it look good on your side?

> webhdfs that connect secure hdfs should not use user.name parameter
> ---
>
> Key: HDFS-14434
> URL: https://issues.apache.org/jira/browse/HDFS-14434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Assignee: KWON BYUNGCHANG
>Priority: Minor
> Attachments: HDFS-14434.001.patch, HDFS-14434.002.patch, 
> HDFS-14434.003.patch, HDFS-14434.004.patch, HDFS-14434.005.patch, 
> HDFS-14434.006.patch, HDFS-14434.007.patch, HDFS-14434.008.patch
>
>
> I have two secure Hadoop clusters. Both clusters use cross-realm 
> authentication. 
> [use...@a.com|mailto:use...@a.com] can access the HDFS of the B.COM realm.
> The Hadoop username of use...@a.com in the B.COM realm is 
> cross_realm_a_com_user_a.
> An hdfs dfs command run by use...@a.com against B.COM webhdfs fails.
> The root cause is that webhdfs, when connecting to secure HDFS, sends the user.name parameter.
> According to the webhdfs spec, insecure webhdfs uses user.name, while secure webhdfs 
> uses SPNEGO for authentication.
> I think webhdfs connecting to secure HDFS should not use the user.name parameter.
> I will attach a patch.
> The error log is below.
>  
> {noformat}
> $ hdfs dfs -ls  webhdfs://b.com:50070/
> ls: Usernames not matched: name=user_a != expected=cross_realm_a_com_user_a
>  
> # user.name in cross realm webhdfs
> $ curl -u : --negotiate 
> 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=user_a' 
> {"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed
>  to obtain user group information: java.io.IOException: Usernames not 
> matched: name=user_a != expected=cross_realm_a_com_user_a"}}
> # USE SPNEGO
> $ curl -u : --negotiate 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
> {"Token"{"urlString":"XgA."}}
>  
> {noformat}
>  
>  
>  
>  
>  
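
The behaviour requested in the quoted description can be sketched roughly as follows. This is only an illustration, not the attached patch: the class WebHdfsUrlSketch, the method delegationTokenUrl and its parameters are hypothetical, and the user name is assumed to be URL-safe already.

```java
/** Hypothetical illustration only; not the HDFS-14434 patch or any Hadoop class. */
final class WebHdfsUrlSketch {

  /**
   * Builds a GETDELEGATIONTOKEN URL. The user.name parameter is appended only
   * when security is disabled; with security enabled the caller is expected
   * to authenticate through SPNEGO instead.
   */
  static String delegationTokenUrl(String hostPort, boolean securityEnabled,
      String shortUserName) {
    StringBuilder url = new StringBuilder("http://")
        .append(hostPort)
        .append("/webhdfs/v1/?op=GETDELEGATIONTOKEN");
    if (!securityEnabled) {
      // Assumption: shortUserName needs no URL encoding.
      url.append("&user.name=").append(shortUserName);
    }
    return url.toString();
  }

  public static void main(String[] args) {
    System.out.println(delegationTokenUrl("b.com:50070", true, "user_a"));
    System.out.println(delegationTokenUrl("b.com:50070", false, "user_a"));
  }
}
```

With security enabled the generated URL matches the second curl call in the log above, which is the one that succeeds.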




[jira] [Assigned] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-1475:


Assignee: Bharat Viswanadham

> Fix OzoneContainer start method
> ---
>
> Key: HDDS-1475
> URL: https://issues.apache.org/jira/browse/HDDS-1475
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> In OzoneContainer start() we have 
> startContainerScrub();
> writeChannel.start();
> readChannel.start();
> hddsDispatcher.init();
> hddsDispatcher.setScmId(scmId);
>  
> Suppose readChannel.start() fails for some reason. From 
> VersionEndPointTask, we then try to start OzoneContainer again. This will cause 
> problems if a service is already started. 
>  
> Fix the logic so that if a service is already started, we don't attempt to start 
> it again. Similar changes need to be made for stop().




[jira] [Updated] (HDDS-1476) Fix logIfNeeded logic in EndPointStateMachine

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1476:
-
Labels: newbie  (was: )

> Fix logIfNeeded logic in EndPointStateMachine
> -
>
> Key: HDDS-1476
> URL: https://issues.apache.org/jira/browse/HDDS-1476
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> {code:java}
> public void logIfNeeded(Exception ex) {
>   LOG.trace("Incrementing the Missed count. Ex : {}", ex);
>   // Reported behaviour: the counter is incremented before the check below.
>   this.incMissed();
>   if (this.getMissedCount() % getLogWarnInterval(conf) == 0) {
>     LOG.error(
>         "Unable to communicate to SCM server at {} for past {} seconds.",
>         this.getAddress().getHostString() + ":" + this.getAddress().getPort(),
>         TimeUnit.MILLISECONDS.toSeconds(
>             this.getMissedCount() * getScmHeartbeatInterval(this.conf)), ex);
>   }
> }
> {code}
> This method is called whenever an exception occurs in the state machine, to log 
> the exception. To avoid logging too aggressively, the 
> ozone.scm.heartbeat.log.warn.interval.count property controls how often we log. 
>  
> There is a small issue here: we don't log the exception the first time it 
> occurs. We need to log the first occurrence and then increment the 
> missed count.
>  
> The fix is to move this.incMissed() to the end of the method, so that we log 
> the first exception and, after that, every 
> log.warn.interval.count-th exception.
>  




[jira] [Created] (HDDS-1476) Fix logIfNeeded logic in EndPointStateMachine

2019-04-26 Thread Bharat Viswanadham (JIRA)

Bharat Viswanadham created HDDS-1476:


 Summary: Fix logIfNeeded logic in EndPointStateMachine
 Key: HDDS-1476
 URL: https://issues.apache.org/jira/browse/HDDS-1476
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


{code:java}
public void logIfNeeded(Exception ex) {
  LOG.trace("Incrementing the Missed count. Ex : {}", ex);
  // Reported behaviour: the counter is incremented before the check below.
  this.incMissed();
  if (this.getMissedCount() % getLogWarnInterval(conf) == 0) {
    LOG.error(
        "Unable to communicate to SCM server at {} for past {} seconds.",
        this.getAddress().getHostString() + ":" + this.getAddress().getPort(),
        TimeUnit.MILLISECONDS.toSeconds(
            this.getMissedCount() * getScmHeartbeatInterval(this.conf)), ex);
  }
}
{code}
This method is called whenever an exception occurs in the state machine, to log 
the exception. To avoid logging too aggressively, the 
ozone.scm.heartbeat.log.warn.interval.count property controls how often we log. 

There is a small issue here: we don't log the exception the first time it 
occurs. We need to log the first occurrence and then increment the 
missed count.

The fix is to move this.incMissed() to the end of the method, so that we log 
the first exception and, after that, every log.warn.interval.count-th 
exception.
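
The proposed ordering can be illustrated with a small self-contained sketch. It is not the real EndPointStateMachine: the class MissedCountLogger and its fields are hypothetical stand-ins, and the list loggedAt replaces the LOG.error call so the effect can be observed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, not the real EndPointStateMachine. */
final class MissedCountLogger {
  // Stands in for ozone.scm.heartbeat.log.warn.interval.count.
  private final int logWarnInterval;
  private long missedCount = 0;
  // Records the missed count at each point where the real code would log.
  private final List<Long> loggedAt = new ArrayList<>();

  MissedCountLogger(int logWarnInterval) {
    if (logWarnInterval <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    this.logWarnInterval = logWarnInterval;
  }

  void logIfNeeded(Exception ex) {
    // Check first: the count is 0 on the first exception, so it is logged.
    if (missedCount % logWarnInterval == 0) {
      loggedAt.add(missedCount); // the real code would call LOG.error(..., ex)
    }
    // Increment last, as the description proposes.
    missedCount++;
  }

  List<Long> loggedAt() {
    return loggedAt;
  }

  public static void main(String[] args) {
    MissedCountLogger logger = new MissedCountLogger(3);
    for (int i = 0; i < 7; i++) {
      logger.logIfNeeded(new RuntimeException("miss " + i));
    }
    System.out.println(logger.loggedAt()); // prints [0, 3, 6]
  }
}
```

One thing to watch with this ordering: if the elapsed-time argument is still computed from the missed count, the first logged message reports 0 seconds.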

 




[jira] [Updated] (HDDS-1474) "ozone.scm.datanode.id" config should take path for a dir an not a file

2019-04-26 Thread Arpit Agarwal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-1474:

Labels: newbie  (was: )

> "ozone.scm.datanode.id" config should take path for a dir an not a file
> ---
>
> Key: HDDS-1474
> URL: https://issues.apache.org/jira/browse/HDDS-1474
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Vivek Ratnavel Subramanian
>Priority: Minor
>  Labels: newbie
>
> Currently, the ozone config "ozone.scm.datanode.id" takes a file path as its 
> value. It should instead take a directory path as its value and assume a standard 
> filename, "datanode.id".




[jira] [Created] (HDDS-1475) Fix OzoneContainer start method

2019-04-26 Thread Bharat Viswanadham (JIRA)

Bharat Viswanadham created HDDS-1475:


 Summary: Fix OzoneContainer start method
 Key: HDDS-1475
 URL: https://issues.apache.org/jira/browse/HDDS-1475
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


In OzoneContainer start() we have 

startContainerScrub();
writeChannel.start();
readChannel.start();
hddsDispatcher.init();
hddsDispatcher.setScmId(scmId);

 

Suppose readChannel.start() fails for some reason. From 
VersionEndPointTask, we then try to start OzoneContainer again. This will cause 
problems if a service is already started. 

Fix the logic so that if a service is already started, we don't attempt to start 
it again. Similar changes need to be made for stop().
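
One possible shape for such a guard, as a sketch under assumptions: each service gets its own started flag, so a retry after a partial failure skips the services that are already running. GuardedService is a hypothetical helper, not an Ozone class, and the start/stop actions are plain Runnables for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper, not part of Ozone: runs start/stop actions at most once. */
final class GuardedService {
  private final Runnable onStart;
  private final Runnable onStop;
  private final AtomicBoolean started = new AtomicBoolean(false);

  GuardedService(Runnable onStart, Runnable onStop) {
    this.onStart = onStart;
    this.onStop = onStop;
  }

  void start() {
    // Only the caller that flips false -> true runs the start action.
    if (!started.compareAndSet(false, true)) {
      return;
    }
    try {
      onStart.run();
    } catch (RuntimeException e) {
      // Start failed: reset the flag so a later retry can try again.
      started.set(false);
      throw e;
    }
  }

  void stop() {
    // Only stop a service that was actually started.
    if (!started.compareAndSet(true, false)) {
      return;
    }
    onStop.run();
  }

  boolean isStarted() {
    return started.get();
  }

  public static void main(String[] args) {
    int[] writeStarts = {0};
    GuardedService writeChannel =
        new GuardedService(() -> writeStarts[0]++, () -> { });
    writeChannel.start();
    writeChannel.start(); // second call is a no-op
    System.out.println(writeStarts[0]); // prints 1
  }
}
```

In this sketch isStarted() already returns true while the start action is still running; a real implementation might need a separate "starting" state.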




[jira] [Created] (HDDS-1474) "ozone.scm.datanode.id" config should take path for a dir an not a file

2019-04-26 Thread Vivek Ratnavel Subramanian (JIRA)

Vivek Ratnavel Subramanian created HDDS-1474:


 Summary: "ozone.scm.datanode.id" config should take path for a dir 
an not a file
 Key: HDDS-1474
 URL: https://issues.apache.org/jira/browse/HDDS-1474
 Project: Hadoop Distributed Data Store
  Issue Type: Task
  Components: Ozone Datanode
Affects Versions: 0.4.0
Reporter: Vivek Ratnavel Subramanian


Currently, the ozone config "ozone.scm.datanode.id" takes a file path as its 
value. It should instead take a directory path as its value and assume a standard 
filename, "datanode.id".
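
A minimal sketch of the proposed resolution, assuming the configured value is a directory. The class DatanodeIdPath and the method resolveIdFile are hypothetical names, not existing Ozone code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not existing Ozone code. */
final class DatanodeIdPath {
  // Standard filename proposed in the description.
  static final String DATANODE_ID_FILE_NAME = "datanode.id";

  /** Treats the configured value as a directory and appends the standard filename. */
  static Path resolveIdFile(String configuredDir) {
    return Paths.get(configuredDir).resolve(DATANODE_ID_FILE_NAME);
  }

  public static void main(String[] args) {
    // "/var/lib/ozone" is an example value, not a documented default.
    System.out.println(resolveIdFile("/var/lib/ozone"));
  }
}
```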




[jira] [Work logged] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1471?focusedWorklogId=233802&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233802
 ]

ASF GitHub Bot logged work on HDDS-1471:


Author: ASF GitHub Bot
Created on: 26/Apr/19 22:51
Start Date: 26/Apr/19 22:51
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #777: HDDS-1471. Update 
ratis dependency to 0.3.0. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/777#issuecomment-487224907
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 90 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 332 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1097 | trunk passed |
   | +1 | compile | 1063 | trunk passed |
   | -1 | mvnsite | 68 | hadoop-ozone in trunk failed. |
   | +1 | shadedclient | 3374 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 231 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | +1 | mvninstall | 392 | the patch passed |
   | +1 | compile | 965 | the patch passed |
   | +1 | javac | 965 | the patch passed |
   | +1 | mvnsite | 246 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 3 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 684 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 153 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 274 | hadoop-hdds in the patch failed. |
   | -1 | unit | 1683 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 52 | The patch does not generate ASF License warnings. |
   | | | 8375 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdds.scm.block.TestBlockManager |
   |   | hadoop.ozone.ozShell.TestOzoneShell |
   |   | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestCommitWatcher |
   |   | hadoop.hdds.scm.pipeline.TestNode2PipelineMap |
   |   | hadoop.ozone.om.TestOzoneManager |
   |   | hadoop.ozone.container.TestContainerReplication |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-777/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/777 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  xml  |
   | uname | Linux 6392affc92e1 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3758270 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-777/1/artifact/out/branch-mvnsite-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-777/1/artifact/out/patch-unit-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-777/1/artifact/out/patch-unit-hadoop-ozone.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-777/1/testReport/ |
   | Max. process+thread count | 3632 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds hadoop-ozone U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-777/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233802)
Time Spent: 0.5h  (was: 20m)

> Update ratis dependency to 0.3.0
> 
>
> Key: HDDS-1471
> URL: https://issues.apache.org/jira/browse/HDDS-1471
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>

[jira] [Created] (HDDS-1473) DataNode ID file should be human readable

2019-04-26 Thread Arpit Agarwal (JIRA)

Arpit Agarwal created HDDS-1473:
---

 Summary: DataNode ID file should be human readable
 Key: HDDS-1473
 URL: https://issues.apache.org/jira/browse/HDDS-1473
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Arpit Agarwal


The DataNode ID file should be human readable to make debugging easier. We 
should use YAML as we have used it elsewhere for meta files.

Currently it is a binary file whose contents are protobuf encoded. This is a 
tiny file read once on startup, so performance is not a concern.




[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827349#comment-16827349
 ] 

Hadoop QA commented on HDFS-14245:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
54s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}114m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}187m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.server.namenode.TestFSImage |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14245 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12967163/HDFS-14245.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c44fb8bf5f8f 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revi

[jira] [Commented] (HDFS-14403) Cost-Based RPC FairCallQueue

2019-04-26 Thread Christopher Gregorian (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827344#comment-16827344
 ] 

Christopher Gregorian commented on HDFS-14403:
--

Posted version 006 (based off of HADOOP-16266) and 006.combined (based off of 
current trunk) :)

> Cost-Based RPC FairCallQueue
> 
>
> Key: HDFS-14403
> URL: https://issues.apache.org/jira/browse/HDFS-14403
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ipc, namenode
>Reporter: Erik Krogen
>Assignee: Christopher Gregorian
>Priority: Major
>  Labels: qos, rpc
> Attachments: CostBasedFairCallQueueDesign_v0.pdf, 
> HDFS-14403.001.patch, HDFS-14403.002.patch, HDFS-14403.003.patch, 
> HDFS-14403.004.patch, HDFS-14403.005.patch, HDFS-14403.006.combined.patch, 
> HDFS-14403.006.patch, HDFS-14403.branch-2.8.patch
>
>
> HADOOP-15016 initially described extensions to the Hadoop FairCallQueue 
> encompassing both cost-based analysis of incoming RPCs, as well as support 
> for reservations of RPC capacity for system/platform users. This JIRA intends 
> to track the former, as HADOOP-15016 was repurposed to more specifically 
> focus on the reservation portion of the work.




[jira] [Updated] (HDFS-14403) Cost-Based RPC FairCallQueue

2019-04-26 Thread Christopher Gregorian (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDFS-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Gregorian updated HDFS-14403:
-
Attachment: HDFS-14403.006.combined.patch
HDFS-14403.006.patch

> Cost-Based RPC FairCallQueue
> 
>
> Key: HDFS-14403
> URL: https://issues.apache.org/jira/browse/HDFS-14403
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ipc, namenode
>Reporter: Erik Krogen
>Assignee: Christopher Gregorian
>Priority: Major
>  Labels: qos, rpc
> Attachments: CostBasedFairCallQueueDesign_v0.pdf, 
> HDFS-14403.001.patch, HDFS-14403.002.patch, HDFS-14403.003.patch, 
> HDFS-14403.004.patch, HDFS-14403.005.patch, HDFS-14403.006.combined.patch, 
> HDFS-14403.006.patch, HDFS-14403.branch-2.8.patch
>
>
> HADOOP-15016 initially described extensions to the Hadoop FairCallQueue 
> encompassing both cost-based analysis of incoming RPCs, as well as support 
> for reservations of RPC capacity for system/platform users. This JIRA intends 
> to track the former, as HADOOP-15016 was repurposed to more specifically 
> focus on the reservation portion of the work.




[jira] [Commented] (HDDS-1201) Reporting Corruptions in Containers to SCM

2019-04-26 Thread Arpit Agarwal (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827342#comment-16827342
 ] 

Arpit Agarwal commented on HDDS-1201:
-

Thanks for looking into this [~hgadre]. You have the right idea - just one 
suggestion. We can send with the next heartbeat instead of the block report, 
since block reports are less frequent.

> Reporting Corruptions in Containers to SCM
> --
>
> Key: HDDS-1201
> URL: https://issues.apache.org/jira/browse/HDDS-1201
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode, SCM
>Reporter: Supratim Deka
>Assignee: Hrishikesh Gadre
>Priority: Major
>
> Add protocol message and handling to report container corruptions to the SCM.
> Also add basic recovery handling in SCM.




[jira] [Commented] (HDDS-1456) Stop the datanode, when any datanode statemachine state is set to shutdown

2019-04-26 Thread Hudson (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827335#comment-16827335
 ] 

Hudson commented on HDDS-1456:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16470 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16470/])
HDDS-1456. Stop the datanode, when any datanode statemachine state is… (github: 
rev 43b2a4b77bfdd7dec66c92bf59a70f0aca437722)
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/VolumeSet.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestMiniOzoneCluster.java
* (add) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeStopService.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/HddsDatanodeService.java
* (edit) 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/DatanodeStateMachine.java
* (edit) 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/volume/TestVolumeSetDiskChecks.java
* (edit) 
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/TestDatanodeStateMachine.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/ozone/container/common/TestEndPoint.java


> Stop the datanode, when any datanode statemachine state is set to shutdown
> --
>
> Key: HDDS-1456
> URL: https://issues.apache.org/jira/browse/HDDS-1456
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Recently we saw an issue in InitDatanodeState where an error occurred while 
> creating the Path for a volume. We set the state to shutdown, which caused 
> DatanodeStateMachine to stop, but the datanode kept running. In this case we 
> should stop the Datanode; otherwise, the user will only learn about this when running ozone 
> commands or when observing metrics such as healthy nodes.
>  
> cc [~vivekratnavel]




[jira] [Work logged] (HDDS-1472) Add retry to kinit command in smoketests

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1472?focusedWorklogId=233776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233776
 ]

ASF GitHub Bot logged work on HDDS-1472:


Author: ASF GitHub Bot
Created on: 26/Apr/19 21:27
Start Date: 26/Apr/19 21:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #778: HDDS-1472. Add 
retry to kinit command in smoketests. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/778#issuecomment-487206968
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 25 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1170 | trunk passed |
   | +1 | compile | 68 | trunk passed |
   | +1 | mvnsite | 24 | trunk passed |
   | +1 | shadedclient | 1966 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 17 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | -1 | mvninstall | 18 | dist in the patch failed. |
   | +1 | compile | 18 | the patch passed |
   | +1 | javac | 18 | the patch passed |
   | +1 | mvnsite | 19 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 796 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 15 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 19 | dist in the patch passed. |
   | +1 | asflicense | 26 | The patch does not generate ASF License warnings. |
   | | | 3047 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-778/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/778 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  |
   | uname | Linux 23c866a28826 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 3758270 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-778/1/artifact/out/patch-mvninstall-hadoop-ozone_dist.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-778/1/testReport/ |
   | Max. process+thread count | 340 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/dist U: hadoop-ozone/dist |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-778/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233776)
Time Spent: 0.5h  (was: 20m)

> Add retry to kinit command in smoketests
> 
>
> Key: HDDS-1472
> URL: https://issues.apache.org/jira/browse/HDDS-1472
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add retry to kinit command in smoketests



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDDS-1456) Stop the datanode, when any datanode statemachine state is set to shutdown

2019-04-26 Thread Bharat Viswanadham (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-1456:
-
   Resolution: Fixed
Fix Version/s: 0.5.0
   Status: Resolved  (was: Patch Available)

Thank You [~arpitagarwal] for the review.

I have committed this to trunk.

> Stop the datanode, when any datanode statemachine state is set to shutdown
> --
>
> Key: HDDS-1456
> URL: https://issues.apache.org/jira/browse/HDDS-1456
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Recently we saw an issue in InitDatanodeState: an error occurred while 
> creating the Path for a volume. We set the state to shutdown, which caused 
> DatanodeStateMachine to stop, but the datanode kept running. In this case we 
> should stop the Datanode; otherwise, the user will only learn about this when 
> running ozone commands or when observing metrics such as healthy nodes.
>  
> cc [~vivekratnavel]




[jira] [Work logged] (HDDS-1456) Stop the datanode, when any datanode statemachine state is set to shutdown

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1456?focusedWorklogId=233775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233775
 ]

ASF GitHub Bot logged work on HDDS-1456:


Author: ASF GitHub Bot
Created on: 26/Apr/19 21:25
Start Date: 26/Apr/19 21:25
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #769: 
HDDS-1456. Stop the datanode, when any datanode statemachine state is…
URL: https://github.com/apache/hadoop/pull/769
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 233775)
Time Spent: 2.5h  (was: 2h 20m)

> Stop the datanode, when any datanode statemachine state is set to shutdown
> --
>
> Key: HDDS-1456
> URL: https://issues.apache.org/jira/browse/HDDS-1456
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Recently we saw an issue in InitDatanodeState: an error occurred while 
> creating the Path for a volume. We set the state to shutdown, which caused 
> DatanodeStateMachine to stop, but the datanode kept running. In this case we 
> should stop the Datanode; otherwise, the user will only learn about this when 
> running ozone commands or when observing metrics such as healthy nodes.
>  
> cc [~vivekratnavel]




[jira] [Work logged] (HDDS-1472) Add retry to kinit command in smoketests

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1472?focusedWorklogId=233744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233744
 ]

ASF GitHub Bot logged work on HDDS-1472:


Author: ASF GitHub Bot
Created on: 26/Apr/19 20:59
Start Date: 26/Apr/19 20:59
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on issue #778: HDDS-1472. Add retry 
to kinit command in smoketests. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/778#issuecomment-487197560
 
 
   +1 pending Jenkins. Thanks for fixing this @ajayydv.
 



Issue Time Tracking
---

Worklog Id: (was: 233744)
Time Spent: 20m  (was: 10m)

> Add retry to kinit command in smoketests
> 
>
> Key: HDDS-1472
> URL: https://issues.apache.org/jira/browse/HDDS-1472
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add retry to kinit command in smoketests




[jira] [Work logged] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1471?focusedWorklogId=233738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233738
 ]

ASF GitHub Bot logged work on HDDS-1471:


Author: ASF GitHub Bot
Created on: 26/Apr/19 20:42
Start Date: 26/Apr/19 20:42
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #777: HDDS-1471. Update 
ratis dependency to 0.3.0. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/777#issuecomment-487195062
 
 
   +1, Thanks for getting this done. Appreciate it.
 



Issue Time Tracking
---

Worklog Id: (was: 233738)
Time Spent: 20m  (was: 10m)

> Update ratis dependency to 0.3.0
> 
>
> Key: HDDS-1471
> URL: https://issues.apache.org/jira/browse/HDDS-1471
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update ratis dependency to 0.3.0




[jira] [Updated] (HDDS-1472) Add retry to kinit command in smoketests

2019-04-26 Thread Ajay Kumar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-1472:
-
Status: Patch Available  (was: Open)

> Add retry to kinit command in smoketests
> 
>
> Key: HDDS-1472
> URL: https://issues.apache.org/jira/browse/HDDS-1472
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add retry to kinit command in smoketests




[jira] [Updated] (HDDS-1472) Add retry to kinit command in smoketests

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1472:
-
Labels: pull-request-available  (was: )

> Add retry to kinit command in smoketests
> 
>
> Key: HDDS-1472
> URL: https://issues.apache.org/jira/browse/HDDS-1472
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>
> Add retry to kinit command in smoketests




[jira] [Work logged] (HDDS-1472) Add retry to kinit command in smoketests

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1472?focusedWorklogId=233732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233732
 ]

ASF GitHub Bot logged work on HDDS-1472:


Author: ASF GitHub Bot
Created on: 26/Apr/19 20:35
Start Date: 26/Apr/19 20:35
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on pull request #778: HDDS-1472. Add 
retry to kinit command in smoketests. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/778
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 233732)
Time Spent: 10m
Remaining Estimate: 0h

> Add retry to kinit command in smoketests
> 
>
> Key: HDDS-1472
> URL: https://issues.apache.org/jira/browse/HDDS-1472
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add retry to kinit command in smoketests




[jira] [Created] (HDDS-1472) Add retry to kinit command in smoketests

2019-04-26 Thread Ajay Kumar (JIRA)

Ajay Kumar created HDDS-1472:


 Summary: Add retry to kinit command in smoketests
 Key: HDDS-1472
 URL: https://issues.apache.org/jira/browse/HDDS-1472
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Ajay Kumar
Assignee: Ajay Kumar


Add retry to kinit command in smoketests




[jira] [Updated] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread Ajay Kumar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDDS-1471:
-
Status: Patch Available  (was: Open)

> Update ratis dependency to 0.3.0
> 
>
> Key: HDDS-1471
> URL: https://issues.apache.org/jira/browse/HDDS-1471
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Update ratis dependency to 0.3.0




[jira] [Work logged] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1471?focusedWorklogId=233729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233729
 ]

ASF GitHub Bot logged work on HDDS-1471:


Author: ASF GitHub Bot
Created on: 26/Apr/19 20:30
Start Date: 26/Apr/19 20:30
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on pull request #777: HDDS-1471. 
Update ratis dependency to 0.3.0. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/777
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 233729)
Time Spent: 10m
Remaining Estimate: 0h

> Update ratis dependency to 0.3.0
> 
>
> Key: HDDS-1471
> URL: https://issues.apache.org/jira/browse/HDDS-1471
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Update ratis dependency to 0.3.0




[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827303#comment-16827303
 ] 

Ayush Saxena commented on HDFS-14440:
-

I have uploaded patch v2, which changes the call to getFileInfo().

I ran a general comparison between getFileInfo() and getBlockLocations(); 
getFileInfo() tends to fare better than getBlockLocations(), by around 
23-28% on average.

Please review.

> RBF: Optimize the file write process in case of multiple destinations.
> --
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch, 
> HDFS-14440-HDFS-13891-02.patch
>
>
> In case of multiple destinations, we need to check whether the file already 
> exists in one of the subclusters, for which we use the existing 
> getBlockLocation() API, which by default is a sequential call.
> In the ideal scenario, where the file needs to be created, each subcluster is 
> checked sequentially; this can be done concurrently to save time.
> In another case, where the file is found and the last block is null, we 
> need to call getFileInfo on all the locations to find the location where the 
> file exists. This can also be avoided by using a ConcurrentCall, since we 
> will then have the remoteLocation for which getBlockLocation returned a 
> non-null entry.
>  
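
The concurrent check described in the issue above can be sketched as follows. This is only an illustration under stated assumptions, not the Router's actual code: the class ConcurrentCheckSketch, the method findExisting, and the existsIn predicate are hypothetical stand-ins for the per-subcluster lookup, and a plain ExecutorService is used in place of whatever concurrent-invocation helper the patch relies on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ConcurrentCheckSketch {

  /**
   * Asks every subcluster concurrently whether the file exists and returns
   * the first subcluster (in list order) that reports it.
   * existsIn is a hypothetical stand-in for the real per-subcluster lookup.
   */
  public static Optional<String> findExisting(
      List<String> subclusters, Predicate<String> existsIn)
      throws InterruptedException, ExecutionException {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.max(1, subclusters.size()));
    try {
      List<Callable<Boolean>> calls = new ArrayList<>();
      for (String ns : subclusters) {
        calls.add(() -> existsIn.test(ns));
      }
      // invokeAll blocks until all lookups finish and returns the futures
      // in the same order as the submitted tasks.
      List<Future<Boolean>> results = pool.invokeAll(calls);
      for (int i = 0; i < results.size(); i++) {
        if (results.get(i).get()) {
          return Optional.of(subclusters.get(i));
        }
      }
      return Optional.empty();
    } finally {
      pool.shutdown();
    }
  }
}
```

Because the matching subcluster is known from the concurrent result, a follow-up lookup across all locations would not be needed in this sketch.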




[jira] [Updated] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1471:
-
Labels: pull-request-available  (was: )

> Update ratis dependency to 0.3.0
> 
>
> Key: HDDS-1471
> URL: https://issues.apache.org/jira/browse/HDDS-1471
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>
> Update ratis dependency to 0.3.0




[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-26 Thread Konstantin Shvachko (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827302#comment-16827302
 ] 

Konstantin Shvachko commented on HDFS-14245:


Great, simple is good.
 # It would be better if {{getProxyAsClientProtocol()}} threw 
{{IOException}} rather than {{RuntimeException}}.
 # It looks like {{getHAServiceState()}} in the current revision assumes 
{{STANDBY}} state no matter what the error is. I think it should only assume 
{{STANDBY}} state when it gets {{StandbyException}}, and re-throw anything 
else. Also use {{LOG.error()}} rather than {{info()}}.
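
The second review point can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the nested StandbyException, the HAServiceState enum, and the StateSource interface are local stand-ins rather than the Hadoop classes, and System.err stands in for the logger. It is not the code from the patch.

```java
import java.io.IOException;

public class HaStateSketch {

  /** Local stand-in for a standby-specific exception (hypothetical). */
  public static class StandbyException extends IOException {
    public StandbyException(String msg) {
      super(msg);
    }
  }

  /** Local stand-in for the HA state enum (hypothetical). */
  public enum HAServiceState { ACTIVE, STANDBY, OBSERVER }

  /** Hypothetical stand-in for the proxy whose state is queried. */
  public interface StateSource {
    HAServiceState getHAServiceState() throws IOException;
  }

  /**
   * Assumes STANDBY only when the proxy reports StandbyException.
   * Any other IOException is not caught here and propagates to the caller.
   */
  public static HAServiceState getHAServiceState(StateSource proxy)
      throws IOException {
    try {
      return proxy.getHAServiceState();
    } catch (StandbyException e) {
      // Stand-in for an error-level log call.
      System.err.println("Proxy is in standby: " + e.getMessage());
      return HAServiceState.STANDBY;
    }
  }
}
```

Catching only the narrow exception type keeps unrelated failures, such as connection errors, visible to the caller instead of being silently mapped to STANDBY.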

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, 
> HDFS-14245.002.patch, HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws this exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> This is similar to HDFS-14116; we did a simple fix.




[jira] [Updated] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Ayush Saxena (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14440:

Attachment: HDFS-14440-HDFS-13891-02.patch

> RBF: Optimize the file write process in case of multiple destinations.
> --
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch, 
> HDFS-14440-HDFS-13891-02.patch
>
>
> In case of multiple destinations, we need to check whether the file already 
> exists in one of the subclusters, for which we use the existing 
> getBlockLocation() API, which by default is a sequential call.
> In the ideal scenario, where the file needs to be created, each subcluster is 
> checked sequentially; this can be done concurrently to save time.
> In another case, where the file is found and the last block is null, we 
> need to call getFileInfo on all the locations to find the location where the 
> file exists. This can also be avoided by using a ConcurrentCall, since we 
> will then have the remoteLocation for which getBlockLocation returned a 
> non-null entry.
>  




[jira] [Created] (HDDS-1471) Update ratis dependency to 0.3.0

2019-04-26 Thread Ajay Kumar (JIRA)

Ajay Kumar created HDDS-1471:


 Summary: Update ratis dependency to 0.3.0
 Key: HDDS-1471
 URL: https://issues.apache.org/jira/browse/HDDS-1471
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Ajay Kumar
Assignee: Ajay Kumar


Update ratis dependency to 0.3.0




[jira] [Work logged] (HDDS-1406) Avoid usage of commonPool in RatisPipelineUtils

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1406?focusedWorklogId=233714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233714
 ]

ASF GitHub Bot logged work on HDDS-1406:


Author: ASF GitHub Bot
Created on: 26/Apr/19 19:47
Start Date: 26/Apr/19 19:47
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on issue #714: HDDS-1406. 
Avoid usage of commonPool in RatisPipelineUtils.
URL: https://github.com/apache/hadoop/pull/714#issuecomment-487179072
 
 
   Test failures are not related to this patch.
   I will commit this shortly.
 



Issue Time Tracking
---

Worklog Id: (was: 233714)
Time Spent: 5h 40m  (was: 5.5h)

> Avoid usage of commonPool in RatisPipelineUtils
> ---
>
> Key: HDDS-1406
> URL: https://issues.apache.org/jira/browse/HDDS-1406
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> We use parallelStream during createPipeline; this internally uses the 
> commonPool. Use our own ForkJoinPool with parallelism set to the number of 
> processors.
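
A common way to keep a parallel stream off the common pool is to start the stream from a task submitted to a dedicated ForkJoinPool. The sketch below assumes that approach. The class DedicatedPoolSketch, the method createAll, and the "pipeline-" naming are hypothetical, and the fact that the stream runs in the submitting pool is widely relied-upon JDK behaviour rather than a documented guarantee. It is not the code from the patch.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

public class DedicatedPoolSketch {

  /**
   * Runs the parallel stream inside a dedicated pool sized to the number of
   * processors, instead of ForkJoinPool.commonPool().
   */
  public static List<String> createAll(List<Integer> ids)
      throws InterruptedException, ExecutionException {
    ForkJoinPool pool =
        new ForkJoinPool(Runtime.getRuntime().availableProcessors());
    try {
      // Hypothetical per-item work: build a name for each id.
      Callable<List<String>> task = () -> ids.parallelStream()
          .map(id -> "pipeline-" + id)
          .collect(Collectors.toList());
      // The stream is started from a worker of this pool, so its subtasks
      // are forked into the same pool.
      return pool.submit(task).get();
    } finally {
      pool.shutdown();
    }
  }
}
```

A dedicated pool means slow or blocking work in this code path does not starve other users of the shared common pool.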




[jira] [Commented] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827250#comment-16827250
 ] 

Ayush Saxena commented on HDFS-14454:
-

Thanks [~elgoiri] for the update.

There is a test failure in the report, but that test passed locally for me.

Can you confirm as well?

> RBF: getContentSummary() should allow non-existing folders
> --
>
> Key: HDFS-14454
> URL: https://issues.apache.org/jira/browse/HDFS-14454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14454-HDFS-13891.000.patch, 
> HDFS-14454-HDFS-13891.001.patch, HDFS-14454-HDFS-13891.002.patch, 
> HDFS-14454-HDFS-13891.003.patch, HDFS-14454-HDFS-13891.004.patch, 
> HDFS-14454-HDFS-13891.005.patch
>
>
> We have a mount point with HASH_ALL and one of the subclusters does not 
> contain the folder.
> In this case, getContentSummary() returns FileNotFoundException.




[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827244#comment-16827244
 ] 

Ayush Saxena commented on HDFS-13522:
-

Thanks everyone for the discussion here. I feel there are three challenges, 
of which [~elgoiri] has already mentioned two:
 * Collecting Observer state
 * Invoking at the observer
 * Handling the state id

The first two seem fairly straightforward. The main challenge seems to be 
handling the state id. In a non-federation scenario, a client gets the state id 
for every operation at the Active, and the client uses that id while invoking 
the call at the observer, which the observer uses to ensure a non-stale read.

The problem with RBF, I feel, is that the Router is mounted to different 
namespaces, and a client call can go to any of the namespaces depending on the 
mount mapping.

So the challenge is handling the state id. There are two approaches that I can 
think of:

First, we store the state id at the Router end and decide on the observer read 
at the Router, making the client independent. For each call we check the state 
id corresponding to the NS and invoke the call accordingly.

Second, we might create a Router State that can be sent to the client, as is 
sent by the NN presently, and that can be decoded back to get each of the 
namespace states, which can be used further.

The first one seems quite easy, but the major problem, I feel, would be syncing 
the state amongst all routers and the overhead that this causes during an 
operation: we have to read the value from the StateStore every time and update 
it on every write (which may be a bottleneck too). With the second, the 
mechanism to wrap the state id remains a challenge.
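
To make the second approach more concrete, here is a minimal sketch of a composite "Router State" that wraps one state id per namespace and can be decoded back. The class RouterStateSketch, its encode and decode methods, and the "ns=id,ns=id" text format are all hypothetical choices for illustration. The sketch assumes namespace names contain neither '=' nor ','.

```java
import java.util.Map;
import java.util.TreeMap;

public class RouterStateSketch {

  /** Packs per-namespace state ids into one string, sorted by namespace. */
  public static String encode(Map<String, Long> stateIds) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Long> e : new TreeMap<>(stateIds).entrySet()) {
      if (sb.length() > 0) {
        sb.append(',');
      }
      sb.append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.toString();
  }

  /** Recovers the per-namespace state ids from an encoded string. */
  public static Map<String, Long> decode(String encoded) {
    Map<String, Long> out = new TreeMap<>();
    if (encoded.isEmpty()) {
      return out;
    }
    for (String pair : encoded.split(",")) {
      String[] kv = pair.split("=", 2);
      out.put(kv[0], Long.parseLong(kv[1]));
    }
    return out;
  }
}
```

With such a wrapper the client would carry a single opaque value, and the Router would pick out the id for whichever namespace a call resolves to.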

 

> Support observer node from Router-Based Federation
> --
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation, namenode
>Reporter: Erik Krogen
>Assignee: Chao Sun
>Priority: Major
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, 
> e.g. {{FederationNamenodeServiceState}}.




[jira] [Updated] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-26 Thread Erik Krogen (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14245:
---
Attachment: HDFS-14245.002.patch

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, 
> HDFS-14245.002.patch, HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws this exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> This is similar to HDFS-14116; we did a simple fix.




[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-26 Thread Erik Krogen (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827239#comment-16827239
 ] 

Erik Krogen commented on HDFS-14245:


Done, thanks for the heads up [~shv]!

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, 
> HDFS-14245.002.patch, HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws this exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> This is similar to HDFS-14116; we did a simple fix.




[jira] [Commented] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827234#comment-16827234
 ] 

Hadoop QA commented on HDFS-14454:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 36s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 54s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 23s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 31s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 53s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m  4s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 23m 12s{color} | {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m 20s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14454 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12967152/HDFS-14454-HDFS-13891.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 30cc245ac20d 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / 55f2f7a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/26712/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26712/testReport/ |
| Max. process+thread count | 1357 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreComm

[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-26 Thread Konstantin Shvachko (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827227#comment-16827227
 ] 

Konstantin Shvachko commented on HDFS-14245:


[~xkrogen] could you please update the patch? It got out of sync after 
HDFS-14435.

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, 
> HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws the following exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> This is similar to HDFS-14116; we did a simple fix.




[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-26 Thread Fengnan Li (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827189#comment-16827189
 ] 

Fengnan Li commented on HDFS-14426:
---

[~ajisakaa] I am not seeing HDFS-14374 in the gitbox repo either: 
[https://gitbox.apache.org/repos/asf?p=hadoop.git;a=shortlog;h=refs/heads/HDFS-13891]
 Is this the right place?

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently the router doesn't report the total number of currently valid 
> delegation tokens it holds, but this information is useful for monitoring and 
> understanding the real-time state of tokens.
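
A minimal sketch of the count described above, assuming the router can see each token's expiry time. The class name, method name, and the map of token id to expiry are hypothetical and are not taken from the RBF code:

```java
import java.util.Map;

/** Hypothetical sketch; the names here are assumptions, not the Router's API. */
final class DelegationTokenCount {

  /**
   * Counts tokens whose expiry time (epoch millis) is still in the future.
   * The input map of token id to expiry is an assumed representation.
   */
  static long currentValid(Map<String, Long> expiryMillisByTokenId,
      long nowMillis) {
    return expiryMillisByTokenId.values().stream()
        .filter(expiry -> expiry > nowMillis)
        .count();
  }
}
```

A value like this could then be exposed as a gauge alongside the other federation metrics.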




[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-26 Thread CR Hota (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827187#comment-16827187
 ] 

CR Hota commented on HDFS-14426:


[~fengnanli]

Thanks for the earlier patch. Please work against the gitbox repo and upload a 
new patch.

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently the router doesn't report the total number of currently valid 
> delegation tokens it holds, but this information is useful for monitoring and 
> understanding the real-time state of tokens.




[jira] [Commented] (HDDS-1458) Create a maven profile to run fault injection tests

2019-04-26 Thread Eric Yang (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827170#comment-16827170
 ] 

Eric Yang commented on HDDS-1458:
-

The second part of the fault injection test design is to test against disk 
failures. We can start simple, using docker-compose and pytest without requiring 
dice or namazu as additional dependencies, and focus on using docker containers 
as the test platform.
 # maven integration-test execution 1
 ## docker-compose up && mount data disk as read/write
 ## run a set of integration tests to make sure happy path works
 ## docker-compose down
 # maven integration-test execution 2
 ## docker-compose up && mount data disk as read only
 ## run a set of smoke tests to ensure data volume in read only mode works
 ## docker-compose down
 # maven integration-test execution 3
 ## docker-compose up && removing/corrupting data disk volume
 ## run another set of smoke tests to ensure missing or corrupted data is handled gracefully
 ## docker-compose down

> Create a maven profile to run fault injection tests
> ---
>
> Key: HDDS-1458
> URL: https://issues.apache.org/jira/browse/HDDS-1458
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: HDDS-1458.001.patch
>
>
> Some fault injection tests have been written using blockade.  It would be 
> nice to have the ability to start docker compose, exercise the blockade test 
> cases against Ozone docker containers, and generate reports.  These are 
> optional integration tests to catch race conditions and fault tolerance 
> defects. 
> We can introduce a profile with id: it (short for integration tests).  This 
> will launch docker compose via maven-exec-plugin and run blockade to simulate 
> container failures and timeouts.
> Usage command:
> {code}
> mvn clean verify -Pit
> {code}




[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread JIRA



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827165#comment-16827165
 ] 

Íñigo Goiri commented on HDFS-14440:


OK, let's try using getFileInfo().

> RBF: Optimize the file write process in case of multiple destinations.
> --
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch
>
>
> In case of multiple destinations, we need to check whether the file already 
> exists in one of the subclusters. For this we use the existing 
> getBlockLocation() API, which by default is a sequential call.
> In the ideal scenario, where the file needs to be created, each subcluster is 
> checked sequentially; this can be done concurrently to save time.
> In another case, where the file is found and the last block is null, we 
> need to call getFileInfo on all the locations to find the location where the 
> file exists. This can also be avoided by using ConcurrentCall, since we 
> will then have the remoteLocation for which getBlockLocation returned a 
> non-null entry.
>  
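
The concurrent check described above can be sketched as follows. This is only an illustration using the JDK executor API: the class name, the `existsAt` predicate, and the use of plain strings for locations are assumptions and do not reflect the Router's actual ConcurrentCall machinery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/** Hypothetical helper; not the Router's actual API. */
final class ConcurrentExistenceCheck {

  /**
   * Probes every location at the same time and returns the first location
   * (in input order) that reports the file, or empty if none does.
   */
  static Optional<String> findExisting(List<String> locations,
      Predicate<String> existsAt) {
    if (locations.isEmpty()) {
      return Optional.empty();
    }
    ExecutorService pool = Executors.newFixedThreadPool(locations.size());
    try {
      List<Callable<Boolean>> probes = new ArrayList<>();
      for (String loc : locations) {
        probes.add(() -> existsAt.test(loc));
      }
      // invokeAll blocks until every probe has completed.
      List<Future<Boolean>> results = pool.invokeAll(probes);
      for (int i = 0; i < results.size(); i++) {
        if (results.get(i).get()) {
          // Remember which location answered, so no second lookup is needed.
          return Optional.of(locations.get(i));
        }
      }
      return Optional.empty();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while probing", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Probe failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

Because the result carries the location that answered, a follow-up call to every location to find where the file lives is not needed in this sketch.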




[jira] [Commented] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread JIRA



[ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827163#comment-16827163
 ] 

Íñigo Goiri commented on HDFS-14454:


I tweaked the timeout while debugging the issue in 
{{TestRouterRPCClientRetries}}.
It was flagged as deprecated; I reverted my change, as it would be best to 
handle that in its own JIRA.
Take a look at [^HDFS-14454-HDFS-13891.005.patch].

> RBF: getContentSummary() should allow non-existing folders
> --
>
> Key: HDFS-14454
> URL: https://issues.apache.org/jira/browse/HDFS-14454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14454-HDFS-13891.000.patch, 
> HDFS-14454-HDFS-13891.001.patch, HDFS-14454-HDFS-13891.002.patch, 
> HDFS-14454-HDFS-13891.003.patch, HDFS-14454-HDFS-13891.004.patch, 
> HDFS-14454-HDFS-13891.005.patch
>
>
> We have a mount point with HASH_ALL and one of the subclusters does not 
> contain the folder.
> In this case, getContentSummary() returns FileNotFoundException.
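
One way to picture the desired behaviour is an aggregation that skips subclusters lacking the folder and fails only when no subcluster has it. The sketch below is an assumption about the intent, not the patch itself; `SummaryCall` and `totalLength` are hypothetical names, and a single length stands in for the full content summary.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.Map;

/** Hypothetical sketch; not the RBF implementation. */
final class TolerantContentSummary {

  /** Stand-in for a per-subcluster getContentSummary() call. */
  interface SummaryCall {
    long lengthOf(String path) throws FileNotFoundException;
  }

  /**
   * Sums the lengths reported by the subclusters that have the path.
   * Fails only if the path is missing from every subcluster.
   */
  static long totalLength(Map<String, SummaryCall> subclusters, String path) {
    long total = 0;
    int found = 0;
    for (SummaryCall call : subclusters.values()) {
      try {
        total += call.lengthOf(path);
        found++;
      } catch (FileNotFoundException missing) {
        // This subcluster does not contain the folder; skip it.
      }
    }
    if (found == 0) {
      throw new UncheckedIOException(
          new FileNotFoundException(path + " not found in any subcluster"));
    }
    return total;
  }
}
```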




[jira] [Updated] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread JIRA



 [ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-14454:
---
Attachment: HDFS-14454-HDFS-13891.005.patch

> RBF: getContentSummary() should allow non-existing folders
> --
>
> Key: HDFS-14454
> URL: https://issues.apache.org/jira/browse/HDFS-14454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14454-HDFS-13891.000.patch, 
> HDFS-14454-HDFS-13891.001.patch, HDFS-14454-HDFS-13891.002.patch, 
> HDFS-14454-HDFS-13891.003.patch, HDFS-14454-HDFS-13891.004.patch, 
> HDFS-14454-HDFS-13891.005.patch
>
>
> We have a mount point with HASH_ALL and one of the subclusters does not 
> contain the folder.
> In this case, getContentSummary() returns FileNotFoundException.




[jira] [Commented] (HDDS-1403) KeyOutputStream writes fails after max retries while writing to a closed container

2019-04-26 Thread Hudson (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827161#comment-16827161
 ] 

Hudson commented on HDDS-1403:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16469 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16469/])
HDDS-1403. KeyOutputStream writes fails after max retries while writing 
(github: rev 37582705fa6697b744b301d999c9952194e9fc40)
* (edit) 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
* (edit) hadoop-hdds/common/src/main/resources/ozone-default.xml
* (edit) 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/web/utils/OzoneUtils.java
* (edit) 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/KeyOutputStream.java
* (edit) 
hadoop-ozone/objectstore-service/src/main/java/org/apache/hadoop/ozone/web/storage/DistributedStorageHandler.java
* (edit) 
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java
* (edit) 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneClientUtils.java


> KeyOutputStream writes fails after max retries while writing to a closed 
> container
> --
>
> Key: HDDS-1403
> URL: https://issues.apache.org/jira/browse/HDDS-1403
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently an Ozone client retries a write operation 5 times. It is possible 
> that the container being written to is already closed by the time it is 
> written to. The key write will fail after retrying multiple times with this 
> error. This needs to be fixed as this is an internal error.
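
As an illustration of treating a closed container as an internal condition rather than a user-visible failure, the sketch below does not charge closed-container failures against the retry budget. Whether the committed fix works this way is not stated in this thread; the exception type, class, and method are hypothetical.

```java
/** Hypothetical sketch; not the Ozone client's retry policy. */
final class ClosedContainerRetry {

  /** Assumed marker for "the target container was already closed". */
  static final class ContainerClosedException extends RuntimeException {
  }

  /**
   * Runs the write until it succeeds. Closed-container failures are retried
   * without consuming the budget; any other failure counts against
   * maxRetries. Returns the total number of attempts made.
   */
  static int write(Runnable attempt, int maxRetries) {
    int chargedFailures = 0;
    int attempts = 0;
    while (true) {
      attempts++;
      try {
        attempt.run();
        return attempts;
      } catch (ContainerClosedException closed) {
        // Internal condition: a real client would pick a new container here.
      } catch (RuntimeException e) {
        if (++chargedFailures > maxRetries) {
          throw e;
        }
      }
    }
  }
}
```

A production version would also need some bound on closed-container retries; the sketch leaves that out for brevity.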




[jira] [Commented] (HDFS-13888) RequestHedgingProxyProvider shows InterruptedException

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827160#comment-16827160
 ] 

Hadoop QA commented on HDFS-13888:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} | {color:red} HDFS-13888 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13888 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12953745/HDFS-13888.004.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/26711/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RequestHedgingProxyProvider shows InterruptedException
> --
>
> Key: HDFS-13888
> URL: https://issues.apache.org/jira/browse/HDFS-13888
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Íñigo Goiri
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-13888.001.patch, HDFS-13888.002.patch, 
> HDFS-13888.003.patch, HDFS-13888.004.patch
>
>
> RequestHedgingProxyProvider shows InterruptedException when running:
> {code}
> 2018-08-30 23:52:48,883 WARN ipc.Client: interrupted waiting to send rpc 
> request to server
> java.lang.InterruptedException
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1142)
> at org.apache.hadoop.ipc.Client.call(Client.java:1395)
> at org.apache.hadoop.ipc.Client.call(Client.java:1353)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:900)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider$RequestHedgingInvocationHandler$1.call(RequestHedgingProxyProvider.java:135)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> It looks like this is the case of the background request that is killed once 
> the main one succeeds. We should not log the full stack trace for this; 
> a debug-level log message would probably be enough.
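
The behaviour described above can be sketched with the JDK's `ExecutorCompletionService`: the first successful call wins, the others are cancelled, and a cancelled call logs a single debug-level line instead of a stack trace. This is a generic illustration, not the code in RequestHedgingProxyProvider; the class and method names are hypothetical and `java.util.logging` stands in for Hadoop's logging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of hedged calls with quiet cancellation. */
final class HedgedCall {
  private static final Logger LOG =
      Logger.getLogger(HedgedCall.class.getName());

  /** Runs all calls concurrently and returns the first successful result. */
  static <T> T firstSuccessful(List<Callable<T>> calls) {
    if (calls.isEmpty()) {
      throw new IllegalArgumentException("No calls to hedge");
    }
    ExecutorService pool = Executors.newFixedThreadPool(calls.size());
    CompletionService<T> done = new ExecutorCompletionService<>(pool);
    List<Future<T>> futures = new ArrayList<>();
    for (Callable<T> call : calls) {
      futures.add(done.submit(() -> {
        try {
          return call.call();
        } catch (InterruptedException e) {
          // Expected for a request that lost the race: debug level, no trace.
          LOG.log(Level.FINE, "Hedged request cancelled; another succeeded");
          throw e;
        }
      }));
    }
    try {
      RuntimeException lastFailure = null;
      for (int i = 0; i < calls.size(); i++) {
        try {
          return done.take().get();
        } catch (ExecutionException e) {
          lastFailure =
              new IllegalStateException("Hedged call failed", e.getCause());
        }
      }
      throw lastFailure;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting", e);
    } finally {
      // Interrupt whichever requests are still running.
      for (Future<T> f : futures) {
        f.cancel(true);
      }
      pool.shutdownNow();
    }
  }
}
```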




[jira] [Commented] (HDDS-1452) All chunk writes should happen to a single file for a block in datanode

2019-04-26 Thread Anu Engineer (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827159#comment-16827159
 ] 

Anu Engineer commented on HDDS-1452:


I agree it is not orthogonal. I was thinking we can skip step one completely if 
we do the second one, since the code changes are in exactly the same place. 
Most object stores and file systems use extent-based allocation and writes. 
Ozone would benefit from moving to some kind of extent-based system. In fact, 
it would be best if we can allocate extents on SSD, keep the data in those 
extents for 24 hours, and move it to spinning disks later. This is similar to 
what ZFS does, and you automatically get SSD caching. If you are writing to a 
spinning disk, all writes are sequential, which increases the write speed.

> All chunk writes should happen to a single file for a block in datanode
> ---
>
> Key: HDDS-1452
> URL: https://issues.apache.org/jira/browse/HDDS-1452
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.5.0
>
>
> Currently, each chunk of a block is written to an individual chunk file in 
> the datanode. The idea here is to write all chunks of a block to a single 
> file in the datanode.
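
A minimal sketch of the single-file idea, assuming each chunk carries its offset within the block. It uses only `java.nio` and is not the datanode's chunk manager; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch; not the Ozone datanode implementation. */
final class SingleBlockFile {

  /**
   * Writes one chunk at its offset inside the block file, creating the
   * file on first use.
   */
  static void writeChunk(Path blockFile, long offset, byte[] chunk) {
    try (FileChannel channel = FileChannel.open(blockFile,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      ByteBuffer buffer = ByteBuffer.wrap(chunk);
      long position = offset;
      // A positional write may be partial, so loop until the buffer drains.
      while (buffer.hasRemaining()) {
        position += channel.write(buffer, position);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this layout a block is one file, and reading a chunk becomes a positional read at the same offset.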




[jira] [Commented] (HDFS-13888) RequestHedgingProxyProvider shows InterruptedException

2019-04-26 Thread JIRA



[ 
https://issues.apache.org/jira/browse/HDFS-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827155#comment-16827155
 ] 

Íñigo Goiri commented on HDFS-13888:


[~LiJinglun] do you mind taking care of the changes?

> RequestHedgingProxyProvider shows InterruptedException
> --
>
> Key: HDFS-13888
> URL: https://issues.apache.org/jira/browse/HDFS-13888
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Íñigo Goiri
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-13888.001.patch, HDFS-13888.002.patch, 
> HDFS-13888.003.patch, HDFS-13888.004.patch
>
>
> RequestHedgingProxyProvider shows InterruptedException when running:
> {code}
> 2018-08-30 23:52:48,883 WARN ipc.Client: interrupted waiting to send rpc 
> request to server
> java.lang.InterruptedException
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1142)
> at org.apache.hadoop.ipc.Client.call(Client.java:1395)
> at org.apache.hadoop.ipc.Client.call(Client.java:1353)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:900)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider$RequestHedgingInvocationHandler$1.call(RequestHedgingProxyProvider.java:135)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> It looks like this is the case of the background request that is killed once 
> the main one succeeds. We should not log the full stack trace for this; 
> a debug-level log message would probably be enough.




[jira] [Updated] (HDDS-1403) KeyOutputStream writes fails after max retries while writing to a closed container

2019-04-26 Thread Hanisha Koneru (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-1403:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> KeyOutputStream writes fails after max retries while writing to a closed 
> container
> --
>
> Key: HDDS-1403
> URL: https://issues.apache.org/jira/browse/HDDS-1403
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently an Ozone client retries a write operation 5 times. It is possible 
> that the container being written to is already closed by the time it is 
> written to. The key write will fail after retrying multiple times with this 
> error. This needs to be fixed as this is an internal error.




[jira] [Work logged] (HDDS-1403) KeyOutputStream writes fails after max retries while writing to a closed container

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1403?focusedWorklogId=233674&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233674
 ]

ASF GitHub Bot logged work on HDDS-1403:


Author: ASF GitHub Bot
Created on: 26/Apr/19 17:39
Start Date: 26/Apr/19 17:39
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #753: 
HDDS-1403. KeyOutputStream writes fails after max retries while writing to a 
closed container
URL: https://github.com/apache/hadoop/pull/753
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233674)
Time Spent: 1h 50m  (was: 1h 40m)

> KeyOutputStream writes fails after max retries while writing to a closed 
> container
> --
>
> Key: HDDS-1403
> URL: https://issues.apache.org/jira/browse/HDDS-1403
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently an Ozone client retries a write operation 5 times. It is possible 
> that the container being written to is already closed by the time it is 
> written to. The key write will fail after retrying multiple times with this 
> error. This needs to be fixed as this is an internal error.




[jira] [Commented] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827150#comment-16827150
 ] 

Ayush Saxena commented on HDFS-14454:
-

Thanks [~elgoiri] for the patch.

One minor doubt: I guess this change in {{TestRouterRPCClientRetries}} is unrelated:
{code:java}
   @Rule
-  public final Timeout testTimeout = new Timeout(10);
+  public final Timeout testTimeout = new Timeout(100, TimeUnit.SECONDS);{code}

Other than this, v004 LGTM; the tests cover all scenarios and the fix looks 
clean.

> RBF: getContentSummary() should allow non-existing folders
> --
>
> Key: HDFS-14454
> URL: https://issues.apache.org/jira/browse/HDFS-14454
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14454-HDFS-13891.000.patch, 
> HDFS-14454-HDFS-13891.001.patch, HDFS-14454-HDFS-13891.002.patch, 
> HDFS-14454-HDFS-13891.003.patch, HDFS-14454-HDFS-13891.004.patch
>
>
> We have a mount point with HASH_ALL and one of the subclusters does not 
> contain the folder.
> In this case, getContentSummary() returns FileNotFoundException.




[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827148#comment-16827148
 ] 

Ayush Saxena commented on HDFS-14440:
-

I am putting it in a table; it might help to understand it better.
|Operation|Comparison (Old/New)|Details|
|Successful write operation|3.83 (approx. 4, equal to the number of namespaces). Expected to increase linearly with the number of NS.|Scenario where all namespaces are checked to confirm the file does not exist, and the file is then written successfully.|
|Failed write, empty file (HASH order)|1.732 (there are always two sequential calls, one to getBlockLocations and then one to getFileInfo)|getBlockLocations expectedly takes more time than getFileInfo; I guess that is the reason the value isn't close to exactly 2.|
|Failed write, non-empty file (HASH)|Approx. 1|All operations took around the same time for both approaches.|
|Operations on non-HASH orders|Constant with the new approach and the same as all other scenarios.|Dynamically increases depending upon the position of the actual location in the results returned. Worst if the location is the last one and it is an empty file: first all locations are sequentially invoked for getBlockLocations() and then for getFileInfo().|

*Scenario: 4 namespaces, each value averaged over 100 write ops.*

bq. I guess it makes sense as the first one actually requires going through the 
block manager while the other is just name space.
We should consider this.

Yes, thanks for bringing up the reason; it seems fair to me. If you agree, we 
can change to getFileInfo(). :)
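
The old/new ratios reported above fit a simple latency model, sketched here in Java. This is only an illustration under assumptions that are not stated in the thread: every namespace check costs a fixed latency, the old path adds those latencies up, and the new concurrent path costs about as much as the slowest single check. The class name, method names, and the latency values are made up for the example.

```java
final class LatencyModel {

  /** Old path: subclusters are checked one after another, so latencies add up. */
  static long sequentialMillis(long[] perNamespaceMillis) {
    long total = 0;
    for (long ms : perNamespaceMillis) {
      total += ms;
    }
    return total;
  }

  /** New path: checks run concurrently, so the slowest check dominates. */
  static long concurrentMillis(long[] perNamespaceMillis) {
    long slowest = 0;
    for (long ms : perNamespaceMillis) {
      slowest = Math.max(slowest, ms);
    }
    return slowest;
  }

  /** The "Comparison (Old/New)" figure under this model. */
  static double oldOverNew(long[] perNamespaceMillis) {
    return (double) sequentialMillis(perNamespaceMillis)
        / concurrentMillis(perNamespaceMillis);
  }

  public static void main(String[] args) {
    // Hypothetical latencies: four namespaces, 10 ms each.
    long[] fourEqualNamespaces = {10, 10, 10, 10};
    System.out.println(oldOverNew(fourEqualNamespaces)); // prints 4.0
  }
}
```

With equal latencies the model gives exactly the number of namespaces, which is in line with the measured 3.83 for four namespaces; uneven latencies or fixed overhead would pull the ratio below that.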

> RBF: Optimize the file write process in case of multiple destinations.
> --
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch
>
>
> In case of multiple destinations, we need to check whether the file already 
> exists in one of the subclusters, for which we use the existing 
> getBlockLocations() API, which by default is a sequential call.
> In the ideal scenario, where the file needs to be created, each subcluster is 
> checked sequentially; this can be done concurrently to save time.
> In the other case, where the file is found and the last block is null, we 
> need to call getFileInfo on all the locations to find the location where the 
> file exists. This can also be avoided by using ConcurrentCall, since we will 
> have the remoteLocation for which getBlockLocations returned a non-null 
> entry.
>  
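
The idea in the description, checking all subclusters at once instead of one by one, can be sketched with plain JDK concurrency. This is a minimal illustration and not the router's actual code: the class name, the `findExisting` helper, the namespace names, and the `existsIn` predicate (standing in for a per-subcluster RPC) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

final class ConcurrentExistenceCheck {

  /**
   * Runs the existence check against every subcluster concurrently and
   * returns the first subcluster (in list order) that reports the file.
   */
  static Optional<String> findExisting(List<String> subclusters,
      Predicate<String> existsIn) {
    if (subclusters.isEmpty()) {
      return Optional.empty();
    }
    ExecutorService pool = Executors.newFixedThreadPool(subclusters.size());
    try {
      // Start all checks before waiting on any of them.
      List<CompletableFuture<Boolean>> checks = new ArrayList<>();
      for (String ns : subclusters) {
        checks.add(CompletableFuture.supplyAsync(() -> existsIn.test(ns), pool));
      }
      // Because each future is paired with its subcluster, the location that
      // answered positively is known without a second round of calls.
      for (int i = 0; i < checks.size(); i++) {
        if (checks.get(i).join()) {
          return Optional.of(subclusters.get(i));
        }
      }
      return Optional.empty();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> namespaces = Arrays.asList("ns0", "ns1", "ns2", "ns3");
    // The predicate stands in for a remote call such as getFileInfo.
    System.out.println(findExisting(namespaces, ns -> ns.equals("ns2")));
    System.out.println(findExisting(namespaces, ns -> false));
  }
}
```

The total wait is bounded by the slowest check rather than the sum of all checks, which is the saving the description is after.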




[jira] [Work logged] (HDDS-1403) KeyOutputStream writes fails after max retries while writing to a closed container

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1403?focusedWorklogId=233671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233671
 ]

ASF GitHub Bot logged work on HDDS-1403:


Author: ASF GitHub Bot
Created on: 26/Apr/19 17:33
Start Date: 26/Apr/19 17:33
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on issue #753: HDDS-1403. 
KeyOutputStream writes fails after max retries while writing to a closed 
container
URL: https://github.com/apache/hadoop/pull/753#issuecomment-487138591
 
 
   The test failures in CI are not related to this PR. Will merge the PR. Thank 
you @arp7 , @bshashikant and @mukul1987 for the reviews.
 



Issue Time Tracking
---

Worklog Id: (was: 233671)
Time Spent: 1h 40m  (was: 1.5h)

> KeyOutputStream writes fails after max retries while writing to a closed 
> container
> --
>
> Key: HDDS-1403
> URL: https://issues.apache.org/jira/browse/HDDS-1403
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: MiniOzoneChaosCluster, pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently an Ozone client retries a write operation 5 times. It is possible 
> that the container being written to is already closed by the time it is 
> written to. The key write will then fail after retrying multiple times with 
> this error. This needs to be fixed, as this is an internal error.




[jira] [Commented] (HDFS-13955) RBF: Support secure Namenode in NamenodeHeartbeatService

2019-04-26 Thread Íñigo Goiri (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827135#comment-16827135
 ] 

Íñigo Goiri commented on HDFS-13955:


[~crh] do you see issues with it right now?

> RBF: Support secure Namenode in NamenodeHeartbeatService
> 
>
> Key: HDFS-13955
> URL: https://issues.apache.org/jira/browse/HDFS-13955
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HDFS-13955-HDFS-13532.000.patch, 
> HDFS-13955-HDFS-13532.001.patch
>
>
> Currently, the NamenodeHeartbeatService uses JMX to get the metrics from the 
> Namenodes. We should support HTTPS.




[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Íñigo Goiri (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827121#comment-16827121
 ] 

Íñigo Goiri commented on HDFS-14440:


Do you mind putting the results in a table? It is hard for me to parse which 
result is for what.
I guess one compromise would be to use the old approach for HASH-based mount 
points and the new one for SPACE?
BTW, the use case for RANDOM is basically read load balancing.
We have files that are read from thousands of containers, so we put those files 
in all subclusters and read from a random one.

Interesting observation on {{getBlockLocations()}} versus {{getFileInfo()}}.
I guess it makes sense, as the first one actually requires going through the 
block manager while the other is just the namespace.
We should consider this.






[jira] [Work logged] (HDDS-1430) NPE if secure ozone if KMS uri is not defined.

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1430?focusedWorklogId=233645&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233645
 ]

ASF GitHub Bot logged work on HDDS-1430:


Author: ASF GitHub Bot
Created on: 26/Apr/19 16:55
Start Date: 26/Apr/19 16:55
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #752: HDDS-1430. 
NPE if secure ozone if KMS uri is not defined. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/752#discussion_r279027806
 
 

 ##
 File path: 
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/OzoneKMSUtil.java
 ##
 @@ -128,6 +128,9 @@ public static URI getKeyProviderUri(UserGroupInformation ugi,
 
    public static KeyProvider getKeyProvider(final Configuration conf,
        final URI serverProviderUri) throws IOException{
 +    if (serverProviderUri == null) {
 
 Review comment:
   We should not call getKeyProvider when the provider URI is not defined. In 
RpcClient#getKeyProvider() and RestClient#getKeyProvider(), we should do
   
   URI kpUri = getKeyProviderUri();
   return kpUri == null ? null : OzoneKMSUtil.getKeyProvider(conf, kpUri);
 



Issue Time Tracking
---

Worklog Id: (was: 233645)
Time Spent: 40m  (was: 0.5h)

> NPE if secure ozone if KMS uri is not defined.
> --
>
> Key: HDDS-1430
> URL: https://issues.apache.org/jira/browse/HDDS-1430
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.4.0
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> OzoneKMSUtil.getKeyProvider throws NPE if KMS uri is not defined. 




[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827112#comment-16827112
 ] 

Ayush Saxena commented on HDFS-14440:
-

Thanks [~elgoiri].

I had a test setup to compare execution, so I tried two test setups: one with 
two NS and one with four NS.

I focused more on the four-NS one. For the execution of only the changed part 
of the method, I recorded the comparison:
 * In the successful write scenario, for 100 file writes, the average 
comparison landed at 3.83, approx. 4 (equal to the number of NS).
 * In the empty file failure scenario, for the same 100 writes, the average 
comparison landed at 1.732, approx. 2 (for HASH, since the older approach makes 
one call for the location and another for fileInfo; I guess getFileInfo takes 
less time than getBlockLocations).
 * In the non-empty file failure scenario, the time was almost the same for 
this part of the method.

For non-HASH orders, with the older approach the time was, as I said, very 
dynamic and sometimes quite high too if the location landed among the last 
locations, so I can't conclude anything from the value. But with the newer 
approach it was constant, like the ones above.

For the RANDOM order, we too don't have much of a use case (but I can't say no 
one has). But order SPACE is widely usable and the change has a good 
performance impact there. Moreover, anything good coming as an extra is always 
good. :)

I didn't have the production network load environment for the test, so I 
didn't capture the time in seconds. The comparison number should stay the same 
at any network performance, and in a test environment I would myself be 
deciding how much latency I want to create for each RPC. So it didn't make 
sense to me to record absolute times, and I judged by the comparison between 
both.

Please review!





[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Íñigo Goiri (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827094#comment-16827094
 ] 

Íñigo Goiri commented on HDFS-14440:


Thanks [~ayushtkn] for checking.
We should focus on the HASH approaches as those are the predictable ones.
Is there any number you can share of latencies?





[jira] [Commented] (HDFS-14459) ClosedChannelException silently ignored in FsVolumeList.addBlockPool()

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827077#comment-16827077
 ] 

Hadoop QA commented on HDFS-14459:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 38s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 5 new + 85 unchanged - 0 fixed = 90 total (was 85) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 47s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}132m 46s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14459 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12967127/HDFS-14459.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0fed8d77fc7f 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 556eafd |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/26710/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/26710/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/26710/testReport/ |
| Max. process+thread count | 4704 (vs. ulimit of 1) |
| m

[jira] [Commented] (HDFS-14447) RBF: RouterAdminServer should support RefreshUserMappingsProtocol

2019-04-26 Thread Íñigo Goiri (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827076#comment-16827076
 ] 

Íñigo Goiri commented on HDFS-14447:


Thanks [~shenyinjie] for [^HDFS-14447-HDFS-13891.03.patch].
* Do you mind fixing the checkstyle issues?
* It would be good if you could add some high-level comments to the tests 
explaining their purpose.
* For the exception you expect, you can use LambdaTestUtils#intercept.
* For the logs, use the logger format with {{LOG.info("Text: {}", var);}}
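
The intercept suggestion above follows a common test pattern: run a lambda, assert that it throws the expected exception type, and hand the exception back for further checks. The sketch below is a simplified, self-contained stand-in written for illustration; it is not the Hadoop LambdaTestUtils implementation, and the class name and the sample message are made up.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class InterceptSketch {

  /**
   * Simplified stand-in for an intercept-style helper: runs the body, returns
   * the exception if it has the expected type, and fails otherwise.
   */
  static <E extends Throwable> E intercept(Class<E> expected, Callable<?> body) {
    try {
      body.call();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t);
      }
      throw new AssertionError(
          "Expected " + expected.getName() + " but caught " + t, t);
    }
    throw new AssertionError(
        "Expected " + expected.getName() + " but nothing was thrown");
  }

  public static void main(String[] args) {
    // The lambda stands in for the call under test.
    IOException e = intercept(IOException.class, () -> {
      throw new IOException("Unknown protocol");
    });
    System.out.println(e.getMessage()); // prints Unknown protocol
  }
}
```

Compared with a try/catch plus fail() in each test, the helper keeps the expected type and the call under test on one line and also fails when nothing is thrown.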

> RBF: RouterAdminServer should support RefreshUserMappingsProtocol
> -
>
> Key: HDFS-14447
> URL: https://issues.apache.org/jira/browse/HDFS-14447
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Affects Versions: 3.1.0
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
> Fix For: HDFS-13891
>
> Attachments: HDFS-14447-HDFS-13891.01.patch, 
> HDFS-14447-HDFS-13891.02.patch, HDFS-14447-HDFS-13891.03.patch, error.png
>
>
> HDFS with RBF
> We configure hadoop.proxyuser.xx.yy, then execute hdfs dfsadmin 
> -Dfs.defaultFS=hdfs://router-fed -refreshSuperUserGroupsConfiguration;
> it throws "Unknown protocol: ...RefreshUserMappingProtocol".
> RouterAdminServer should support RefreshUserMappingsProtocol, or a proxyuser 
> client would be refused impersonation, as shown in the screenshot.




[jira] [Commented] (HDFS-14440) RBF: Optimize the file write process in case of multiple destinations.

2019-04-26 Thread Ayush Saxena (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827072#comment-16827072
 ] 

Ayush Saxena commented on HDFS-14440:
-

I did some analysis in my test setup for just this part of the code, to get 
some confirmation on the number of RPCs and the time:
 * For a successful write, the number of RPCs stayed the same for both 
approaches. Just the time improved, for the obvious reasons we discussed above.
 * For a non-successful write:
 ** For empty files: the minimum RPC count I got was 2 with order HASH and 4 
with the new approach, and the time for checking was half with the new 
approach. But for order RANDOM, I guess the optimization that you talked about 
(the first location always hitting) didn't hold up for me: the RPC count was 
also random, and the time too, with the old approach. With the new approach it 
stayed constant and the same as in the other cases.
 ** For non-empty files: the time was the same with HASH order; for the other 
orders it was higher and more dynamic.

The time difference for the method execution depends on the network state, and 
I can't put the numbers from prod here. It is quite mathematical too.

Let me know if any doubts remain. Moreover, I don't see any threat to 
functionality from this change.





[jira] [Commented] (HDFS-14454) RBF: getContentSummary() should allow non-existing folders

2019-04-26 Thread Íñigo Goiri (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827070#comment-16827070
 ] 

Íñigo Goiri commented on HDFS-14454:


[~ayushtkn], do you mind taking a look?





[jira] [Work logged] (HDDS-1469) Generate default configuration fragments based on annotations

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1469?focusedWorklogId=233578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233578
 ]

ASF GitHub Bot logged work on HDDS-1469:


Author: ASF GitHub Bot
Created on: 26/Apr/19 14:53
Start Date: 26/Apr/19 14:53
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #773: HDDS-1469. 
Generate default configuration fragments based on annotations
URL: https://github.com/apache/hadoop/pull/773#issuecomment-487086122
 
 
   Thank you for your comments and explanations. +1. Please feel free to commit 
this. Thanks for getting this done. We can now add more features into the 
processor, hopefully generating code for get/set and validation methods. At 
some point, it would be nice to have a validation method too. 
 



Issue Time Tracking
---

Worklog Id: (was: 233578)
Time Spent: 4.5h  (was: 4h 20m)

> Generate default configuration fragments based on annotations
> -
>
> Key: HDDS-1469
> URL: https://issues.apache.org/jira/browse/HDDS-1469
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> See the design doc in the parent jira for more details.
> In this jira I introduce a new annotation processor which can generate 
> ozone-default.xml fragments based on the annotations which are introduced by 
> HDDS-1468.
> The ozone-default-generated.xml fragments can be used directly by the 
> OzoneConfiguration, as I added a small piece of code to the constructor to 
> check ALL the available ozone-default-generated.xml files and add them to the 
> available resources.
> With this approach we don't need to edit ozone-default.xml as all the 
> configuration can be defined in java code.
> As a side effect each service will see only the available configuration keys 
> and values based on the classpath. (If the ozone-default-generated.xml file 
> of OzoneManager is not on the classpath of the SCM, SCM doesn't see the 
> available configs.) 
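
To make the idea in the description concrete, here is a small sketch of turning annotations into a Hadoop-style configuration XML fragment. It swaps in runtime reflection for the compile-time annotation processor the jira describes, purely so the example can run on its own. The `@ConfigKey` annotation, its fields, the `ExampleConfig` class, and the `example.client.retries` key are all hypothetical and are not the annotations introduced by HDDS-1468.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class ConfigFragmentSketch {

  /** Hypothetical annotation carrying a key, its default, and a description. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface ConfigKey {
    String key();
    String defaultValue();
    String description();
  }

  /** Hypothetical configuration class whose setter is annotated. */
  static final class ExampleConfig {
    @ConfigKey(key = "example.client.retries", defaultValue = "5",
        description = "Hypothetical retry count.")
    public void setRetries(int retries) {
      // Value handling is omitted; only the annotation matters here.
    }
  }

  /** Builds a configuration fragment from the annotated methods of a class. */
  static String toXmlFragment(Class<?> configClass) {
    StringBuilder xml = new StringBuilder("<configuration>\n");
    for (Method method : configClass.getDeclaredMethods()) {
      ConfigKey config = method.getAnnotation(ConfigKey.class);
      if (config == null) {
        continue;
      }
      // No XML escaping is done; the sketch assumes plain-text values.
      xml.append("  <property>\n")
          .append("    <name>").append(config.key()).append("</name>\n")
          .append("    <value>").append(config.defaultValue()).append("</value>\n")
          .append("    <description>").append(config.description())
          .append("</description>\n")
          .append("  </property>\n");
    }
    return xml.append("</configuration>\n").toString();
  }

  public static void main(String[] args) {
    System.out.print(toXmlFragment(ExampleConfig.class));
  }
}
```

Keeping the key, default, and description next to the Java code that uses them is what removes the need to edit a central defaults file by hand.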




[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=233549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233549
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 26/Apr/19 14:11
Start Date: 26/Apr/19 14:11
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #693: HDDS-1382. Create 
customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#issuecomment-487071089
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 44 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 72 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1137 | trunk passed |
   | +1 | compile | 1359 | trunk passed |
   | +1 | checkstyle | 152 | trunk passed |
   | -1 | mvnsite | 117 | hadoop-ozone in trunk failed. |
   | -1 | mvnsite | 46 | integration-test in trunk failed. |
   | +1 | shadedclient | 781 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone hadoop-ozone/dist hadoop-ozone/integration-test |
   | +1 | findbugs | 163 | trunk passed |
   | +1 | javadoc | 267 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 28 | Maven dependency ordering for patch |
   | -1 | mvninstall | 174 | hadoop-ozone in the patch failed. |
   | -1 | mvninstall | 28 | integration-test in the patch failed. |
   | +1 | compile | 1002 | the patch passed |
   | +1 | cc | 1002 | the patch passed |
   | +1 | javac | 1002 | the patch passed |
   | +1 | checkstyle | 152 | the patch passed |
   | -1 | hadolint | 0 | The patch generated 3 new + 2 unchanged - 0 fixed = 5 total (was 2) |
   | -1 | mvnsite | 123 | hadoop-ozone in the patch failed. |
   | -1 | mvnsite | 47 | integration-test in the patch failed. |
   | +1 | shellcheck | 26 | There were no new shellcheck issues. |
   | +1 | shelldocs | 33 | There were no new shelldocs issues. |
   | +1 | whitespace | 1 | The patch has no whitespace issues. |
   | +1 | xml | 7 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 750 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone hadoop-ozone/dist hadoop-ozone/integration-test |
   | +1 | findbugs | 252 | the patch passed |
   | +1 | javadoc | 315 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 95 | common in the patch passed. |
   | -1 | unit | 173 | hadoop-ozone in the patch failed. |
   | +1 | unit | 53 | common in the patch passed. |
   | +1 | unit | 45 | csi in the patch passed. |
   | +1 | unit | 39 | dist in the patch passed. |
   | -1 | unit | 47 | integration-test in the patch failed. |
   | +1 | asflicense | 51 | The patch does not generate ASF License warnings. |
   | | | 8439 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/693 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  shellcheck  shelldocs  cc  hadolint  |
   | uname | Linux 7ba080e16418 4.4.0-144-generic #170~14.04.1-Ubuntu SMP Mon Mar 18 15:02:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / c35abcd |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/branch-mvnsite-hadoop-ozone.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/branch-mvnsite-hadoop-ozone_integration-test.txt |
   | shellcheck | v0.4.6 |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/patch-mvninstall-hadoop-ozone.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | hadolint | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/diff-patch-hadolint.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact/out/patch-mvnsite-hadoop-ozone.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-693/1/artifact

[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"

2019-04-26 Thread Kitti Nanasi (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826978#comment-16826978
 ] 

Kitti Nanasi commented on HDFS-13933:
-

The affected tests all use the HttpsURLConnection and HttpURLConnection 
classes, for which JDK 11 provides a better alternative. We might need to 
switch to the new HttpClient, but let's first see if we can fix the current 
implementation.

Related article: 
[https://dzone.com/articles/java-11-standardized-http-client-api]
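
For reference, a minimal sketch of what a request looks like with the JDK 11 java.net.http API. This is only an illustration, not a proposed patch: the class name HttpClientSketch, the helper names, and the WebHDFS-style URL (with the port taken from the failure message) are assumptions, and the request is built but never sent.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical illustration class; not part of Hadoop.
class HttpClientSketch {

  // Builds (but does not send) a GET request against a WebHDFS-style URL.
  static HttpRequest buildStatusRequest(String host, int port) {
    URI uri = URI.create(
        "https://" + host + ":" + port + "/webhdfs/v1/?op=GETFILESTATUS");
    return HttpRequest.newBuilder(uri)
        .timeout(Duration.ofSeconds(10))
        .GET()
        .build();
  }

  // Creates a client with the JVM's default SSLContext. A test that uses its
  // own trust store would pass it through the builder's sslContext(...) method.
  static HttpClient newClient() {
    return HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(5))
        .build();
  }

  public static void main(String[] args) {
    HttpRequest request = buildStatusRequest("localhost", 50260);
    // Only prints the request line; nothing goes over the network here.
    System.out.println(request.method() + " " + request.uri());
  }
}
```

Actually sending the request would be a call such as newClient().send(request, HttpResponse.BodyHandlers.ofString()). Whether that path avoids the hostname verification failure is untested.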

> [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification 
> problems for "localhost"
> --
>
> Key: HDFS-13933
> URL: https://issues.apache.org/jira/browse/HDFS-13933
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Andrew Purtell
>Priority: Minor
>
> Tests with issues:
> * TestHttpFSFWithSWebhdfsFileSystem
> * TestWebHdfsTokens
> * TestSWebHdfsFileContextMainOperations
> Possibly others. Failure looks like 
> {noformat}
> java.io.IOException: localhost:50260: HTTPS hostname wrong:  should be <localhost>
> {noformat}
> These tests set up a trust store and use HTTPS connections, and with Java 11 
> the client validation of the server name in the generated self-signed 
> certificate is failing. Exceptions originate in the JRE's HTTP client 
> library. How everything hooks together uses static initializers, static 
> methods, JUnit MethodRules... There's a lot to unpack, not sure how to fix. 
> This is Java 11+28




[jira] [Work logged] (HDDS-1384) TestBlockOutputStreamWithFailures is failing

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1384?focusedWorklogId=233540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233540
 ]

ASF GitHub Bot logged work on HDDS-1384:


Author: ASF GitHub Bot
Created on: 26/Apr/19 13:56
Start Date: 26/Apr/19 13:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #750: HDDS-1384. 
TestBlockOutputStreamWithFailures is failing
URL: https://github.com/apache/hadoop/pull/750#issuecomment-487066181
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 59 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | +1 | mvninstall | 1398 | trunk passed |
   | -1 | compile | 55 | integration-test in trunk failed. |
   | +1 | checkstyle | 27 | trunk passed |
   | -1 | mvnsite | 36 | integration-test in trunk failed. |
   | +1 | shadedclient | 810 | branch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 0 | trunk passed |
   | +1 | javadoc | 21 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | -1 | mvninstall | 27 | integration-test in the patch failed. |
   | -1 | compile | 24 | integration-test in the patch failed. |
   | -1 | javac | 24 | integration-test in the patch failed. |
   | +1 | checkstyle | 17 | the patch passed |
   | -1 | mvnsite | 27 | integration-test in the patch failed. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 812 | patch has no errors when building and testing our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
   | +1 | findbugs | 0 | the patch passed |
   | +1 | javadoc | 17 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 28 | integration-test in the patch failed. |
   | +1 | asflicense | 28 | The patch does not generate ASF License warnings. |
   | | | 3486 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/750 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux 925e63d669e1 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 556eafd |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/branch-compile-hadoop-ozone_integration-test.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/branch-mvnsite-hadoop-ozone_integration-test.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt |
   | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/patch-compile-hadoop-ozone_integration-test.txt |
   | javac | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/patch-compile-hadoop-ozone_integration-test.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/patch-mvnsite-hadoop-ozone_integration-test.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/artifact/out/patch-unit-hadoop-ozone_integration-test.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/testReport/ |
   | Max. process+thread count | 316 (vs. ulimit of 5500) |
   | modules | C: hadoop-ozone/integration-test U: hadoop-ozone/integration-test |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-750/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233540)
Time Spent: 1h  (was: 50m)

> TestB

[jira] [Commented] (HDFS-14459) ClosedChannelException silently ignored in FsVolumeList.addBlockPool()

2019-04-26 Thread Stephen O'Donnell (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826969#comment-16826969
 ] 

Stephen O'Donnell commented on HDFS-14459:
--

I have uploaded a patch that I believe resolves the problem. It removes the 
two relevant catch blocks for ClosedChannelException, so those exceptions are 
handled by the IOException handler and the volume is treated as failed. It 
also refactors addBlockPool to catch any initial exceptions and then throw 
them all at the end.

However, I have not been able to figure out a good way to add a test for the 
change to the addBlockPool method. The change is in FsDatasetImpl, so we cannot 
use the SimulatedFSDataset for this, and there appears to be no way to inject 
an FsVolumeList or FsVolumeImpl object to have it throw an exception. I tested 
manually by adding some temporary code, but I am open to suggestions on how to 
add a test for the changes to this method.

> ClosedChannelException silently ignored in FsVolumeList.addBlockPool()
> --
>
> Key: HDFS-14459
> URL: https://issues.apache.org/jira/browse/HDFS-14459
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14459.001.patch
>
>
> Following on from HDFS-14333, I encountered another scenario: when a volume 
> has some sort of disk-level error, it can silently fail to have the block 
> pool added to itself in FsVolumeList.addBlockPool().
> In the logs for a recent issue we see the following pattern:
> {code}
> 2019-04-24 04:21:27,690 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> volume - /CDH/sdi1/dfs/dn/current, StorageType: DISK
> 2019-04-24 04:21:27,691 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> new volume: DS-694ae931-8a4e-42d5-b2b3-d946e35c6b47
> ...
> 2019-04-24 04:21:27,703 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-936404344-xxx-1426594942733 on volume 
> /CDH/sdi1/dfs/dn/current...
> ...
>  2019-04-24 04:21:27,722 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-936404344-xxx-1426594942733 on 
> /CDH/sdi1/dfs/dn/current: 19ms
> >
> ...
> 2019-04-24 04:21:29,871 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding 
> replicas to map for block pool BP-936404344-xxx-1426594942733 on volume 
> /CDH/sdi1/dfs/dn/current...
> ...
> 2019-04-24 04:21:29,872 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Caught 
> exception while adding replicas from /CDH/sdi1/dfs/dn/current. Will throw 
> later.
> java.io.IOException: block pool BP-936404344-10.7.192.215-1426594942733 is 
> not found
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getBlockPoolSlice(FsVolumeImpl.java:407)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191
> {code}
> The notable point is that the 'scanning block pool' step must not have 
> completed properly for this volume, yet nothing was logged, and then a 
> slightly confusing error is logged when attempting to add the replicas. That 
> error occurs because the block pool was not added to the volume by the 
> addBlockPool step.
> The relevant part of the code in 'addBlockPool()' from current trunk looks 
> like:
> {code}
> for (final FsVolumeImpl v : volumes) {
>   Thread t = new Thread() {
>     public void run() {
>       try (FsVolumeReference ref = v.obtainReference()) {
>         FsDatasetImpl.LOG.info("Scanning block pool " + bpid +
>             " on volume " + v + "...");
>         long startTime = Time.monotonicNow();
>         v.addBlockPool(bpid, conf);
>         long timeTaken = Time.monotonicNow() - startTime;
>         FsDatasetImpl.LOG.info("Time taken to scan block pool " + bpid +
>             " on " + v + ": " + timeTaken + "ms");
>       } catch (ClosedChannelException e) {
>         // ignore.
>       } catch (IOException ioe) {
>         FsDatasetImpl.LOG.info("Caught exception while scanning " + v +
>             ". Will throw later.", ioe);
>         unhealthyDataDirs.put(v, ioe);
>       }
>     }
>   };
>   blockPoolAddingThreads.add(t);
>   t.start();
> }
> {code}
> As we get the first log message (Scanning block pool ... ), but not the 
> second (Time take to s

[jira] [Updated] (HDFS-14459) ClosedChannelException silently ignored in FsVolumeList.addBlockPool()

2019-04-26 Thread Stephen O'Donnell (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDFS-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-14459:
-
Status: Patch Available  (was: Open)


[jira] [Updated] (HDFS-14459) ClosedChannelException silently ignored in FsVolumeList.addBlockPool()

2019-04-26 Thread Stephen O'Donnell (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDFS-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-14459:
-
Attachment: HDFS-14459.001.patch


[jira] [Created] (HDFS-14459) ClosedChannelException silently ignored in FsVolumeList.addBlockPool()

2019-04-26 Thread Stephen O'Donnell (JIRA)

Stephen O'Donnell created HDFS-14459:


 Summary: ClosedChannelException silently ignored in 
FsVolumeList.addBlockPool()
 Key: HDFS-14459
 URL: https://issues.apache.org/jira/browse/HDFS-14459
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.3.0
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell
 Fix For: 3.3.0


Following on from HDFS-14333, I encountered another scenario: when a volume 
has some sort of disk-level error, it can silently fail to have the block pool 
added to itself in FsVolumeList.addBlockPool().

In the logs for a recent issue we see the following pattern:

{code}
2019-04-24 04:21:27,690 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
volume - /CDH/sdi1/dfs/dn/current, StorageType: DISK
2019-04-24 04:21:27,691 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added new 
volume: DS-694ae931-8a4e-42d5-b2b3-d946e35c6b47
...
2019-04-24 04:21:27,703 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
block pool BP-936404344-xxx-1426594942733 on volume /CDH/sdi1/dfs/dn/current...
...


...
2019-04-24 04:21:29,871 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding 
replicas to map for block pool BP-936404344-xxx-1426594942733 on volume 
/CDH/sdi1/dfs/dn/current...
...
2019-04-24 04:21:29,872 INFO 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Caught 
exception while adding replicas from /CDH/sdi1/dfs/dn/current. Will throw later.
java.io.IOException: block pool BP-936404344-10.7.192.215-1426594942733 is not 
found
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getBlockPoolSlice(FsVolumeImpl.java:407)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:864)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191
{code}

The notable point is that the 'scanning block pool' step must not have 
completed properly for this volume, yet nothing was logged, and then a slightly 
confusing error is logged when attempting to add the replicas. That error 
occurs because the block pool was not added to the volume by the addBlockPool 
step.

The relevant part of the code in 'addBlockPool()' from current trunk looks like:

{code}
for (final FsVolumeImpl v : volumes) {
  Thread t = new Thread() {
    public void run() {
      try (FsVolumeReference ref = v.obtainReference()) {
        FsDatasetImpl.LOG.info("Scanning block pool " + bpid +
            " on volume " + v + "...");
        long startTime = Time.monotonicNow();
        v.addBlockPool(bpid, conf);
        long timeTaken = Time.monotonicNow() - startTime;
        FsDatasetImpl.LOG.info("Time taken to scan block pool " + bpid +
            " on " + v + ": " + timeTaken + "ms");
      } catch (ClosedChannelException e) {
        // ignore.
      } catch (IOException ioe) {
        FsDatasetImpl.LOG.info("Caught exception while scanning " + v +
            ". Will throw later.", ioe);
        unhealthyDataDirs.put(v, ioe);
      }
    }
  };
  blockPoolAddingThreads.add(t);
  t.start();
}
{code}

As we get the first log message (Scanning block pool ...) but not the second 
(Time taken to scan block pool ...), and nothing is logged and no exception is 
thrown, the operation must have encountered a ClosedChannelException, which is 
silently ignored.

I am also not sure if we should ignore a ClosedChannelException, as it means 
the volume failed to add fully. As ClosedChannelException is a subclass of 
IOException perhaps we can remove that catch block entirely?
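
To illustrate the subclass point, here is a minimal standalone sketch. It is not Hadoop code: ClosedChannelSketch, scanVolume and the String-keyed map are hypothetical stand-ins for the volume scan and unhealthyDataDirs. It shows that with the dedicated catch block the failure disappears, while without it the generic IOException handler records the volume.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Hadoop.
class ClosedChannelSketch {

  // Stand-in for a volume scan that fails because the volume was closed.
  static void scanVolume(String volume) throws IOException {
    throw new ClosedChannelException();
  }

  // Current shape: the dedicated catch block swallows the failure.
  static Map<String, IOException> scanSwallowing(String volume) {
    Map<String, IOException> unhealthy = new HashMap<>();
    try {
      scanVolume(volume);
    } catch (ClosedChannelException e) {
      // ignore.
    } catch (IOException ioe) {
      unhealthy.put(volume, ioe);
    }
    return unhealthy;
  }

  // Proposed shape: no dedicated block, so the IOException handler sees it.
  static Map<String, IOException> scanRecording(String volume) {
    Map<String, IOException> unhealthy = new HashMap<>();
    try {
      scanVolume(volume);
    } catch (IOException ioe) {
      unhealthy.put(volume, ioe);
    }
    return unhealthy;
  }

  public static void main(String[] args) {
    System.out.println(scanSwallowing("/data/1").size()); // volume not recorded
    System.out.println(scanRecording("/data/1").size());  // volume recorded
  }
}
```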

Finally, HDFS-14333 refactored the above code to allow the DN to better handle 
a disk failure on DN startup. However, if addBlockPool does throw an exception, 
it will mean getAllVolumesMap() will not get called and the DN will end up 
partly initialized.

DataNode.initBlockPool() calls FsDatasetImpl.addBlockPool() which looks like 
the following, calling addBlockPool() and then getAllVolumesMap():

{code}
  public void addBlockPool(String bpid, Configuration conf)
      throws IOException {
    LOG.info("Adding block pool " + bpid);
    try (AutoCloseableLock lock = datasetLock.acquire()) {
      volumes.addBlockPool(bpid, conf);
      volumeMap.initBlockPool(bpid);
    }
    volumes.getAllVolumesMap(bpid, volumeMap, ramDiskReplicaTracker);
  }
{code}

This needs to be refactored to catch any AddBlockPoolException raised in 
addBlockPool, then continue to call getAllVolumesMap() before re-throwing any 
of the caught exceptions, so that the DN can handle the individual volume 
failures.
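
A rough sketch of the control flow I have in mind, again as standalone code rather than the actual patch. DeferredFailureSketch and its methods are hypothetical, a plain IOException stands in for AddBlockPoolException, and the calls list exists only to make the ordering visible.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Hadoop.
class DeferredFailureSketch {

  // Records which steps ran, so the ordering can be inspected.
  static final List<String> calls = new ArrayList<>();

  // Stand-in for volumes.addBlockPool(); fails on request.
  static void addBlockPool(boolean fail) throws IOException {
    calls.add("addBlockPool");
    if (fail) {
      throw new IOException("volume failed during addBlockPool");
    }
  }

  // Stand-in for volumes.getAllVolumesMap().
  static void getAllVolumesMap() {
    calls.add("getAllVolumesMap");
  }

  // Catch the failure, still build the volume map, then re-throw.
  static void addBlockPoolThenMap(boolean fail) throws IOException {
    IOException deferred = null;
    try {
      addBlockPool(fail);
    } catch (IOException e) {
      deferred = e; // remember it, but do not abort yet
    }
    getAllVolumesMap();
    if (deferred != null) {
      throw deferred; // the caller can now handle the failed volumes
    }
  }

  public static void main(String[] args) {
    try {
      addBlockPoolThenMap(true);
    } catch (IOException e) {
      System.out.println("re-thrown after " + calls);
    }
  }
}
```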





[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-04-26 Thread Yuriy Malygin (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826939#comment-16826939
 ] 

Yuriy Malygin commented on HDFS-13596:
--

[~ferhui] thanks for your answer. Today I repeated the test with 
_hadoop-trunk + HDFS-13596.007.patch + HDFS-14396.002.patch_, and the 
rollingUpgrade with rollback completed successfully. 

PS: the cluster was running in Secure Mode with QJM the whole time.

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch
>
>
> After rollingUpgrade of the NN from 2.x to 3.x, if the NN is restarted, it 
> fails while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  a

[jira] [Commented] (HDFS-14401) Refine the implementation for HDFS cache on SCM

2019-04-26 Thread Feilong He (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826935#comment-16826935
 ] 

Feilong He commented on HDFS-14401:
---

HDFS-14401.006.patch has been uploaded. Thanks!

> Refine the implementation for HDFS cache on SCM
> ---
>
> Key: HDFS-14401
> URL: https://issues.apache.org/jira/browse/HDFS-14401
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14401.000.patch, HDFS-14401.001.patch, 
> HDFS-14401.002.patch, HDFS-14401.003.patch, HDFS-14401.004.patch, 
> HDFS-14401.005.patch, HDFS-14401.006.patch
>
>
> In this Jira, we will refine the implementation for HDFS cache on SCM, such 
> as: 1) Handle full pmem volume in VolumeManager; 2) Refine pmem volume 
> selection impl; 3) Clean up MappableBlockLoader interface; etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Work logged] (HDDS-1468) Inject configuration values to Java objects

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1468?focusedWorklogId=233477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233477
 ]

ASF GitHub Bot logged work on HDDS-1468:


Author: ASF GitHub Bot
Created on: 26/Apr/19 12:16
Start Date: 26/Apr/19 12:16
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #772: HDDS-1468. Inject 
configuration values to Java objects
URL: https://github.com/apache/hadoop/pull/772#issuecomment-487036271
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 32 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 4 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 59 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1152 | trunk passed |
   | +1 | compile | 89 | trunk passed |
   | +1 | checkstyle | 31 | trunk passed |
   | +1 | mvnsite | 77 | trunk passed |
   | -1 | shadedclient | 264 | branch has errors when building and testing our client artifacts. |
   | -1 | findbugs | 16 | common in trunk failed. |
   | -1 | findbugs | 26 | server-scm in trunk failed. |
   | -1 | javadoc | 26 | common in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 15 | Maven dependency ordering for patch |
   | -1 | mvninstall | 27 | common in the patch failed. |
   | -1 | mvninstall | 20 | server-scm in the patch failed. |
   | +1 | compile | 80 | the patch passed |
   | +1 | javac | 80 | the patch passed |
   | +1 | checkstyle | 30 | the patch passed |
   | -1 | mvnsite | 23 | server-scm in the patch failed. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 627 | patch has no errors when building and testing our client artifacts. |
   | -1 | findbugs | 19 | server-scm in the patch failed. |
   | -1 | javadoc | 40 | hadoop-hdds_common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
   | -1 | javadoc | 21 | hadoop-hdds_server-scm generated 6 new + 5 unchanged - 0 fixed = 11 total (was 5) |
   ||| _ Other Tests _ |
   | +1 | unit | 78 | common in the patch passed. |
   | -1 | unit | 24 | server-scm in the patch failed. |
   | +1 | asflicense | 34 | The patch does not generate ASF License warnings. |
   | | | 3007 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/772 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 42a19cb93db1 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / c35abcd |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/branch-findbugs-hadoop-hdds_common.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/branch-javadoc-hadoop-hdds_common.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/patch-mvninstall-hadoop-hdds_common.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/patch-mvninstall-hadoop-hdds_server-scm.txt |
   | mvnsite | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/patch-mvnsite-hadoop-hdds_server-scm.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/patch-findbugs-hadoop-hdds_server-scm.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/diff-javadoc-javadoc-hadoop-hdds_common.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/diff-javadoc-javadoc-hadoop-hdds_server-scm.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/artifact/out/patch-unit-hadoop-hdds_server-scm.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/testReport/ |
   | Max. process+thread count | 445 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm U: hadoop-hdds |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-772/2/console |
   | Powere

[jira] [Commented] (HDDS-1460) Add the optmizations of HDDS-1300 to BasicOzoneFileSystem

2019-04-26 Thread Hudson (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826926#comment-16826926
 ] 

Hudson commented on HDDS-1460:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16468 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16468/])
HDDS-1460: Add the optmizations of HDDS-1300 to BasicOzoneFileSystem (github: 
rev 556eafd01a76145e6255b5ff720c80bf8bf7d08b)
* (edit) 
hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java


> Add the optmizations of HDDS-1300 to BasicOzoneFileSystem
> -
>
> Key: HDDS-1460
> URL: https://issues.apache.org/jira/browse/HDDS-1460
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some of the optimizations made in HDDS-1300 were reverted in HDDS-1333. This 
> Jira aims to bring back those optimizations.




[jira] [Commented] (HDDS-1300) Optimize non-recursive ozone filesystem apis

2019-04-26 Thread Hudson (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826927#comment-16826927
 ] 

Hudson commented on HDDS-1300:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16468 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16468/])
HDDS-1460: Add the optmizations of HDDS-1300 to BasicOzoneFileSystem (github: 
rev 556eafd01a76145e6255b5ff720c80bf8bf7d08b)
* (edit) 
hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java


> Optimize non-recursive ozone filesystem apis
> 
>
> Key: HDDS-1300
> URL: https://issues.apache.org/jira/browse/HDDS-1300
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Filesystem, Ozone Manager
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1300.001.patch, HDDS-1300.002.patch, 
> HDDS-1300.003.patch, HDDS-1300.004.patch, HDDS-1300.005.patch, 
> HDDS-1300.006.patch, HDDS-1300.007.patch, HDDS-1300.008.patch
>
>
> This Jira aims to optimise the non-recursive APIs in the Ozone file system. 
> It would add support for such APIs in Ozone Manager in order to reduce the 
> number of RPC calls to Ozone Manager.




[jira] [Updated] (HDDS-1460) Add the optmizations of HDDS-1300 to BasicOzoneFileSystem

2019-04-26 Thread Lokesh Jain (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-1460:
--
   Resolution: Fixed
Fix Version/s: 0.5.0
   Status: Resolved  (was: Patch Available)


[jira] [Work logged] (HDDS-1460) Add the optmizations of HDDS-1300 to BasicOzoneFileSystem

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1460?focusedWorklogId=233469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233469
 ]

ASF GitHub Bot logged work on HDDS-1460:


Author: ASF GitHub Bot
Created on: 26/Apr/19 12:01
Start Date: 26/Apr/19 12:01
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on issue #765: HDDS-1460: Add the 
optmizations of HDDS-1300 to BasicOzoneFileSystem
URL: https://github.com/apache/hadoop/pull/765#issuecomment-487032481
 
 
   @mukul1987  Thanks for reviewing the pull request! I have merged it to trunk.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233469)
Time Spent: 40m  (was: 0.5h)

> Add the optmizations of HDDS-1300 to BasicOzoneFileSystem
> -
>
> Key: HDDS-1460
> URL: https://issues.apache.org/jira/browse/HDDS-1460
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Some of the optimizations made in HDDS-1300 were reverted in HDDS-1333. This 
> Jira aims to bring back those optimizations.




[jira] [Work logged] (HDDS-1460) Add the optmizations of HDDS-1300 to BasicOzoneFileSystem

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1460?focusedWorklogId=233468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233468
 ]

ASF GitHub Bot logged work on HDDS-1460:


Author: ASF GitHub Bot
Created on: 26/Apr/19 11:59
Start Date: 26/Apr/19 11:59
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #765: HDDS-1460: 
Add the optmizations of HDDS-1300 to BasicOzoneFileSystem
URL: https://github.com/apache/hadoop/pull/765
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 233468)
Time Spent: 0.5h  (was: 20m)

> Add the optmizations of HDDS-1300 to BasicOzoneFileSystem
> -
>
> Key: HDDS-1460
> URL: https://issues.apache.org/jira/browse/HDDS-1460
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Some of the optimizations made in HDDS-1300 were reverted in HDDS-1333. This 
> Jira aims to bring back those optimizations.




[jira] [Commented] (HDDS-999) Make the DNS resolution in OzoneManager more resilient

2019-04-26 Thread Hudson (JIRA)



[ 
https://issues.apache.org/jira/browse/HDDS-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826887#comment-16826887
 ] 

Hudson commented on HDDS-999:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16467 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16467/])
HDDS-999. Make the DNS resolution in OzoneManager more resilient (elek: rev 
c35abcd831c5b1c96e8ffa9b3cc64ef2f51fb7e1)
* (edit) hadoop-ozone/dist/src/main/compose/ozone-om-ha/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozoneperf/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozonesecure-mr/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozone/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozones3/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/k8s/ozone/om-statefulset.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozonetrace/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozonefs/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozone-recon/docker-compose.yaml
* (edit) hadoop-ozone/dist/src/main/compose/ozonesecure/docker-compose.yaml
* (edit) 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
* (edit) hadoop-ozone/dist/src/main/compose/ozone-hdfs/docker-compose.yaml


> Make the DNS resolution in OzoneManager more resilient
> --
>
> Key: HDDS-999
> URL: https://issues.apache.org/jira/browse/HDDS-999
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Elek, Marton
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-999.01.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If the OzoneManager is started before SCM, the SCM DNS name may not be 
> resolvable yet. In this case the OM should retry and re-resolve the name, but 
> as of now it throws an exception:
> {code:java}
> 2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
> java.net.SocketException: Call From om-0.om to null:0 failed on socket 
> exception: java.net.SocketException: Unresolved address; For more details 
> see:  http://wiki.apache.org/hadoop/SocketException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:566)
>     at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
>     at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
>     at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
>     at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
>     at 
> org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
>     at 
> org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
>     at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
>     at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
>     at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
> Caused by: java.net.SocketException: Unresolved address
>     at sun.nio.ch.Net.translateToSocketException(Net.java:131)
>     at sun.nio.ch.Net.translateException(Net.java:157)
>     at sun.nio.ch.Net.translateException(Net.java:163)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:549)
>     ... 11 more
> Caused by: java.nio.channels.UnresolvedAddressException
>     at sun.nio.ch.Net.checkAddress(Net.java:101)
>     at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     ... 12 more{code}
> It should be fixed. (See also HDDS-421, which fixed the same problem on the 
> datanode side, and HDDS-907, which is the workaround until this issue is 
> resolved.)
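
The retry-and-re-resolve behaviour requested above could look roughly like the following. This is a minimal sketch under stated assumptions, not the change that was merged for HDDS-999: the class `ReResolveSketch`, the method `resolveWithRetry` and its parameters are hypothetical. It relies only on the fact that `new InetSocketAddress(host, port)` attempts a lookup on construction and that `isUnresolved()` reports whether that lookup failed.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

// Hypothetical helper; names and parameters are illustrative only.
final class ReResolveSketch {
  /**
   * Calls the lookup until it yields a resolved address or maxAttempts
   * lookups have been made. A fresh address object is built on every attempt,
   * because an unresolved InetSocketAddress never re-resolves itself.
   */
  static InetSocketAddress resolveWithRetry(Supplier<InetSocketAddress> lookup,
      int maxAttempts, long sleepMillis) throws InterruptedException {
    InetSocketAddress addr = lookup.get();
    for (int attempt = 1; addr.isUnresolved() && attempt < maxAttempts; attempt++) {
      Thread.sleep(sleepMillis);     // give DNS time to catch up
      addr = lookup.get();           // re-resolve instead of reusing the old object
    }
    return addr;                     // caller must still check isUnresolved()
  }
}
```

A caller might pass `() -> new InetSocketAddress("scm", 9999)` as the lookup, where the host name and port are placeholders. The JVM may cache failed lookups for a short time, so the sleep between attempts should be chosen with that in mind.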




[jira] [Work logged] (HDDS-999) Make the DNS resolution in OzoneManager more resilient

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-999?focusedWorklogId=233438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233438
 ]

ASF GitHub Bot logged work on HDDS-999:
---

Author: ASF GitHub Bot
Created on: 26/Apr/19 10:47
Start Date: 26/Apr/19 10:47
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #758: HDDS-999. Make the 
DNS resolution in OzoneManager more resilient. (swagle)
URL: https://github.com/apache/hadoop/pull/758
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 233438)
Time Spent: 1h 50m  (was: 1h 40m)


[jira] [Work logged] (HDDS-1406) Avoid usage of commonPool in RatisPipelineUtils

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1406?focusedWorklogId=233434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233434
 ]

ASF GitHub Bot logged work on HDDS-1406:


Author: ASF GitHub Bot
Created on: 26/Apr/19 10:46
Start Date: 26/Apr/19 10:46
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #714: HDDS-1406. 
Avoid usage of commonPool in RatisPipelineUtils.
URL: https://github.com/apache/hadoop/pull/714#discussion_r278896085
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/RatisPipelineUtils.java
 ##
 @@ -41,16 +41,35 @@
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.List;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.ForkJoinPool;
+import java.util.concurrent.ForkJoinWorkerThread;
+import java.util.concurrent.RejectedExecutionException;
 
 /**
  * Utility class for Ratis pipelines. Contains methods to create and destroy
  * ratis pipelines.
  */
-final class RatisPipelineUtils {
+public final class RatisPipelineUtils {
 
   private static final Logger LOG =
   LoggerFactory.getLogger(RatisPipelineUtils.class);
 
+  // Set parallelism at 3, as now in Ratis we create 1 and 3 node pipelines.
+  private static final int PARALLELISIM_FOR_POOL = 3;
+
+  private static final ForkJoinPool.ForkJoinWorkerThreadFactory FACTORY =
 
 Review comment:
   Can we avoid making it static? The problem with static occurs in the 
MiniOzoneCluster tests. Once SCM is stopped by one of the tests, the fork join 
pool will be shut down and will not be available again for execution. I think 
this might be a reason for the unit test failures.
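
The concern above can be reproduced in isolation. The following is a sketch with hypothetical names (`PoolLifecycleSketch`, `Owner`), not code from the pull request: it shows that a `ForkJoinPool` rejects work once it has been shut down, and that a pool owned by an instance is recreated with each new owner.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RejectedExecutionException;

final class PoolLifecycleSketch {
  /** Stand-in for a service that owns its pool; the name is hypothetical. */
  static final class Owner implements AutoCloseable {
    private final ForkJoinPool pool = new ForkJoinPool(3);

    int run(Callable<Integer> task) throws Exception {
      return pool.submit(task).get();
    }

    @Override
    public void close() {
      pool.shutdown();               // only this owner's pool is affected
    }
  }

  /** Returns true if a pool that was shut down refuses a new task. */
  static boolean rejectsAfterShutdown() {
    ForkJoinPool shared = new ForkJoinPool(3);   // plays the role of a static pool
    shared.shutdown();                           // e.g. the first test stops SCM
    Callable<Integer> task = () -> 1;
    try {
      shared.submit(task);                       // a later test tries to reuse it
      return false;
    } catch (RejectedExecutionException e) {
      return true;
    }
  }
}
```

With an instance-owned pool, each test that starts a new owner gets a working pool, at the cost of creating and tearing down threads per instance.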
 



Issue Time Tracking
---

Worklog Id: (was: 233434)
Time Spent: 5h 20m  (was: 5h 10m)

> Avoid usage of commonPool in RatisPipelineUtils
> ---
>
> Key: HDDS-1406
> URL: https://issues.apache.org/jira/browse/HDDS-1406
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> We use parallelStream during createPipeline, which internally uses the 
> commonPool. Use our own ForkJoinPool with parallelism set to the number of 
> processors.




[jira] [Work logged] (HDDS-1406) Avoid usage of commonPool in RatisPipelineUtils

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1406?focusedWorklogId=233435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233435
 ]

ASF GitHub Bot logged work on HDDS-1406:


Author: ASF GitHub Bot
Created on: 26/Apr/19 10:46
Start Date: 26/Apr/19 10:46
Worklog Time Spent: 10m 
  Work Description: lokeshj1703 commented on pull request #714: HDDS-1406. 
Avoid usage of commonPool in RatisPipelineUtils.
URL: https://github.com/apache/hadoop/pull/714#discussion_r278895492
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/RatisPipelineUtils.java
 ##
 @@ -146,19 +165,33 @@ private static void callRatisRpc(List 
datanodes,
 SecurityConfig(ozoneConf));
 final TimeDuration requestTimeout =
 RatisHelper.getClientRequestTimeout(ozoneConf);
-datanodes.parallelStream().forEach(d -> {
-  final RaftPeer p = RatisHelper.toRaftPeer(d);
-  try (RaftClient client = RatisHelper
-  .newRaftClient(SupportedRpcType.valueOfIgnoreCase(rpcType), p,
-  retryPolicy, maxOutstandingRequests, tlsConfig, requestTimeout)) 
{
-rpc.accept(client, p);
-  } catch (IOException ioe) {
-String errMsg =
-"Failed invoke Ratis rpc " + rpc + " for " + d.getUuid();
-LOG.error(errMsg, ioe);
-exceptions.add(new IOException(errMsg, ioe));
-  }
-});
+try {
+  POOL.submit(() -> {
 
 Review comment:
   Can you please check whether one of the threads is used up waiting for the 
parallel stream to finish execution? If it is, only two threads are 
available for making an RPC call.
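
One way to investigate the question above is to record which threads actually process the stream elements. The sketch below uses hypothetical names (`PoolThreadsSketch`, `workersUsed`) and a sleep as a stand-in for the RPC. It relies on the commonly observed, but not formally specified, behaviour that a parallel stream started from a task inside a `ForkJoinPool` runs on that pool's workers rather than on the common pool.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;

final class PoolThreadsSketch {
  /** Runs a parallel stream inside the pool and returns the threads that did the work. */
  static Set<Thread> workersUsed(ForkJoinPool pool, List<String> items)
      throws InterruptedException, ExecutionException {
    Set<Thread> seen = ConcurrentHashMap.newKeySet();
    pool.submit(() -> items.parallelStream().forEach(item -> {
      seen.add(Thread.currentThread());
      try {
        Thread.sleep(50);            // stand-in for a blocking RPC
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    })).get();                       // wait for the whole stream to finish
    return seen;
  }

  /** Checks that every recorded thread is a worker of the given pool. */
  static boolean allFromPool(Set<Thread> threads, ForkJoinPool pool) {
    for (Thread t : threads) {
      if (!(t instanceof ForkJoinWorkerThread)
          || ((ForkJoinWorkerThread) t).getPool() != pool) {
        return false;
      }
    }
    return true;
  }
}
```

Printing the size of the returned set for three items and a pool of parallelism 3 is one way to see whether the worker that runs the submitted task also processes elements or only waits.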
 



Issue Time Tracking
---

Worklog Id: (was: 233435)
Time Spent: 5.5h  (was: 5h 20m)

> Avoid usage of commonPool in RatisPipelineUtils
> ---
>
> Key: HDDS-1406
> URL: https://issues.apache.org/jira/browse/HDDS-1406
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> We use parallelStream during createPipeline, which internally uses the 
> commonPool. Use our own ForkJoinPool with parallelism set to the number of 
> processors.




[jira] [Updated] (HDDS-999) Make the DNS resolution in OzoneManager more resilient

2019-04-26 Thread Elek, Marton (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-999:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged. Thanks for the contribution, [~swagle].

> Make the DNS resolution in OzoneManager more resilient
> --
>
> Key: HDDS-999
> URL: https://issues.apache.org/jira/browse/HDDS-999
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Elek, Marton
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDDS-999.01.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> If the OzoneManager is started before SCM, the SCM DNS name may not be 
> resolvable yet. In this case the OM should retry and re-resolve the name, but 
> as of now it throws an exception:
> {code:java}
> 2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
> java.net.SocketException: Call From om-0.om to null:0 failed on socket 
> exception: java.net.SocketException: Unresolved address; For more details 
> see:  http://wiki.apache.org/hadoop/SocketException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:566)
>     at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
>     at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
>     at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
>     at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
>     at 
> org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
>     at 
> org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
>     at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
>     at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
>     at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
> Caused by: java.net.SocketException: Unresolved address
>     at sun.nio.ch.Net.translateToSocketException(Net.java:131)
>     at sun.nio.ch.Net.translateException(Net.java:157)
>     at sun.nio.ch.Net.translateException(Net.java:163)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:549)
>     ... 11 more
> Caused by: java.nio.channels.UnresolvedAddressException
>     at sun.nio.ch.Net.checkAddress(Net.java:101)
>     at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     ... 12 more{code}
> It should be fixed. (See also HDDS-421, which fixed the same problem on the 
> datanode side, and HDDS-907, which is the workaround while this issue is not 
> resolved.)
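
A minimal sketch of the retry-and-re-resolve behaviour the description asks for. It is not the change that was merged for HDDS-999. The class `DnsRetry`, the method `resolveWithRetry`, and the attempt and sleep parameters are hypothetical names chosen for illustration. The only JDK behaviour relied on is that `InetSocketAddress.isUnresolved()` reports a failed lookup and that constructing a new `InetSocketAddress(host, port)` attempts resolution again.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

final class DnsRetry {

    /**
     * Calls the lookup until it yields a resolved address or maxAttempts
     * lookups have been made. The lookup is expected to build a NEW
     * InetSocketAddress each time, because an existing unresolved instance
     * never becomes resolved by itself.
     */
    static InetSocketAddress resolveWithRetry(
            Supplier<InetSocketAddress> lookup, int maxAttempts, long sleepMillis) {
        InetSocketAddress addr = lookup.get();
        for (int attempt = 1; addr.isUnresolved() && attempt < maxAttempts; attempt++) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return addr;
            }
            addr = lookup.get();
        }
        return addr;
    }

    public static void main(String[] args) {
        // Host and port are illustrative values only.
        InetSocketAddress addr =
            resolveWithRetry(() -> new InetSocketAddress("localhost", 9862), 5, 1000L);
        System.out.println(addr + " unresolved=" + addr.isUnresolved());
    }
}
```

The caller still has to check `isUnresolved()` on the result, since the sketch returns the last unresolved address when every attempt fails instead of throwing.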




[jira] [Work logged] (HDDS-1469) Generate default configuration fragments based on annotations

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1469?focusedWorklogId=233429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233429
 ]

ASF GitHub Bot logged work on HDDS-1469:


Author: ASF GitHub Bot
Created on: 26/Apr/19 10:35
Start Date: 26/Apr/19 10:35
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #773: HDDS-1469. 
Generate default configuration fragments based on annotations
URL: https://github.com/apache/hadoop/pull/773#issuecomment-487011937
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 0 | Docker mode activated. |
   | -1 | patch | 7 | https://github.com/apache/hadoop/pull/773 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/773 |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-773/2/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 233429)
Time Spent: 4h 20m  (was: 4h 10m)

> Generate default configuration fragments based on annotations
> -
>
> Key: HDDS-1469
> URL: https://issues.apache.org/jira/browse/HDDS-1469
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> See the design doc in the parent jira for more details.
> In this jira I introduce a new annotation processor which can generate 
> ozone-default.xml fragments based on the annotations which are introduced by 
> HDDS-1468.
> The ozone-default-generated.xml fragments can be used directly by the 
> OzoneConfiguration, as I added a small piece of code to the constructor to 
> find ALL the available ozone-default-generated.xml files and add them to the 
> available resources.
> With this approach we don't need to edit ozone-default.xml as all the 
> configuration can be defined in java code.
> As a side effect each service will see only the available configuration keys 
> and values based on the classpath. (If the ozone-default-generated.xml file 
> of OzoneManager is not on the classpath of the SCM, SCM doesn't see the 
> available configs.) 
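
A minimal sketch of how every `ozone-default-generated.xml` on the classpath could be discovered. This is an illustration under assumptions, not the code from the pull request: the class `ConfigFragments` and the method `find` are hypothetical, and only the standard `ClassLoader.getResources` call is used. In a Hadoop setting each returned URL could then be handed to `Configuration.addResource(URL)`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

final class ConfigFragments {

    /** File name taken from the issue description. */
    static final String FRAGMENT_NAME = "ozone-default-generated.xml";

    /**
     * Returns one URL per classpath entry (jar or directory) that contains
     * a fragment with the expected name. Entries without the file simply
     * contribute nothing, which is why each service only sees the keys of
     * the components on its own classpath.
     */
    static List<URL> find(ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(FRAGMENT_NAME));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the fragments visible to the application class loader.
        for (URL url : find(ClassLoader.getSystemClassLoader())) {
            System.out.println(url);
        }
    }
}
```

Because `getResources` returns all matches and not only the first one, two jars that each ship a fragment both contribute their defaults.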




[jira] [Commented] (HDFS-13933) [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification problems for "localhost"

2019-04-26 Thread Kitti Nanasi (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826866#comment-16826866
 ] 

Kitti Nanasi commented on HDFS-13933:
-

Thanks [~apurtell] for reporting this issue and [~smeng] for the further 
details! It seems like all three tests fail with OpenJDK 11, but they succeed 
with Zulu JDK 11.

> [JDK 11] SWebhdfsFileSystem related tests fail with hostname verification 
> problems for "localhost"
> --
>
> Key: HDFS-13933
> URL: https://issues.apache.org/jira/browse/HDFS-13933
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Andrew Purtell
>Priority: Minor
>
> Tests with issues:
> * TestHttpFSFWithSWebhdfsFileSystem
> * TestWebHdfsTokens
> * TestSWebHdfsFileContextMainOperations
> Possibly others. Failure looks like 
> {noformat}
> java.io.IOException: localhost:50260: HTTPS hostname wrong:  should be 
> 
> {noformat}
> These tests set up a trust store and use HTTPS connections, and with Java 11 
> the client validation of the server name in the generated self-signed 
> certificate is failing. Exceptions originate in the JRE's HTTP client 
> library. How everything hooks together uses static initializers, static 
> methods, JUnit MethodRules... There's a lot to unpack, not sure how to fix. 
> This is Java 11+28




[jira] [Commented] (HDFS-14401) Refine the implementation for HDFS cache on SCM

2019-04-26 Thread Hadoop QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-14401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826846#comment-16826846
 ] 

Hadoop QA commented on HDFS-14401:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 58s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 3 new + 473 unchanged - 
3 fixed = 476 total (was 476) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14401 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12967104/HDFS-14401.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux a50e7f6596f2 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 79d3d35 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26709/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26709/artifact/out/patch-unit-hadoop-hd

[jira] [Work logged] (HDDS-1469) Generate default configuration fragments based on annotations

2019-04-26 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/HDDS-1469?focusedWorklogId=233412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-233412
 ]

ASF GitHub Bot logged work on HDDS-1469:


Author: ASF GitHub Bot
Created on: 26/Apr/19 10:02
Start Date: 26/Apr/19 10:02
Worklog Time Spent: 10m 
  Work Description: elek commented on issue #773: HDDS-1469. Generate 
default configuration fragments based on annotations
URL: https://github.com/apache/hadoop/pull/773#issuecomment-487003326
 
 
   > I am ok with that, but some of the old school people might like a single 
file, and in the deployment phase, don't we need a single file? Or should we 
move away since the code already has the default?
   
   It's a very good question and I don't know the final answer. In fact, we use 
standard Hadoop Configuration features to load all the fragments, so it should 
be fine. I would prefer to try out this approach (with independent config 
fragments), and we can improve or refactor it based on feedback and experience.
   
   My arguments:
   
1. First of all, it's easier to implement. We don't need a final merge.
2. It's way easier to test. To generate the final ozone-default.xml we need 
a project which depends on all the others with config fragments. But in the 
meantime we need the merged ozone-default.xml to test the different components. 
With fragments it just works based on the classpath.
3. The biggest argument for using one ozone-default.xml (IMHO) is that it can 
be used as documentation. But I think we can provide a better documentation 
page (with better structure). It may still be true that we need to generate a 
static doc page about all the configuration settings.
4. It's very interesting that the source of a key is recorded in the 
Configuration class. With fragments we get the source information out of the 
box:
   
   
   ```XML
   <property>
     <name>hdds.scm.replication.event.timeout</name>
     <value>10m</value>
     <final>false</final>
     <source>jar:file:/opt/hadoop/share/ozone/lib/hadoop-hdds-server-scm-0.5.0-SNAPSHOT.jar!/ozone-default-generated.xml</source>
   </property>
   ```
   
   It also means that we don't need to use the SCM, HDDS, OZONE tags any more, as 
they can be added based on the source. With this approach we can also print out 
the configuration by component (e.g. SCM configs, common configs, etc.). It 
would be great to add another piece of information, too: which class defined 
the specific configuration key.
 



Issue Time Tracking
---

Worklog Id: (was: 233412)
Time Spent: 4h 10m  (was: 4h)

> Generate default configuration fragments based on annotations
> -
>
> Key: HDDS-1469
> URL: https://issues.apache.org/jira/browse/HDDS-1469
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> See the design doc in the parent jira for more details.
> In this jira I introduce a new annotation processor which can generate 
> ozone-default.xml fragments based on the annotations which are introduced by 
> HDDS-1468.
> The ozone-default-generated.xml fragments can be used directly by the 
> OzoneConfiguration, as I added a small piece of code to the constructor to 
> find ALL the available ozone-default-generated.xml files and add them to the 
> available resources.
> With this approach we don't need to edit ozone-default.xml as all the 
> configuration can be defined in java code.
> As a side effect each service will see only the available configuration keys 
> and values based on the classpath. (If the ozone-default-generated.xml file 
> of OzoneManager is not on the classpath of the SCM, SCM doesn't see the 
> available configs.) 



