[jira] [Commented] (HDDS-1365) Fix error handling in KeyValueContainerCheck

2019-04-02 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808391#comment-16808391
 ] 

Yiqun Lin commented on HDDS-1365:
-

Hi [~sdeka],
bq. When that happens, we can introduce a specific new exception for every such 
specialised category with its own handler logic.
I agree with this; maybe we can define a new exception similar to 
StorageContainerException.
We can commit this change for now. +1.
 

> Fix error handling in KeyValueContainerCheck
> 
>
> Key: HDDS-1365
> URL: https://issues.apache.org/jira/browse/HDDS-1365
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
> Attachments: HDDS-1365.000.patch
>
>
> Error handling and propagation in KeyValueContainerCheck needs to be based on 
> throwing IOException instead of passing an error code to the calling function.
> HDDS-1163 implemented the basic framework using a mix of error code return 
> and exception handling. There is added complexity because exceptions deep 
> inside the call chain are being caught and translated to error code return 
> values. The goal is to change all error handling in this class to use 
> Exceptions.
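
A minimal sketch of the intended direction, assuming a dedicated IOException subclass along the 
lines suggested in the comments (the class and method names below are made up for illustration 
and are not the actual HDDS-1365 patch):

{code:java}
import java.io.File;
import java.io.IOException;

// Hypothetical exception type, analogous to StorageContainerException as
// discussed above; not the class actually introduced by the patch.
class ContainerScanException extends IOException {
  ContainerScanException(String message) {
    super(message);
  }
}

class MetadataCheckSketch {
  // Before: helpers deep in the call chain returned an error code that every
  // caller had to check and translate.
  // After: helpers throw, so a single top-level handler sees the failure.
  void checkContainerFile(File containerFile) throws ContainerScanException {
    if (!containerFile.exists()) {
      throw new ContainerScanException(
          "Container file not found: " + containerFile.getAbsolutePath());
    }
    // ... further checks also throw instead of returning error codes ...
  }
}
{code}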






[jira] [Comment Edited] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Siddharth Wagle (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808263#comment-16808263
 ] 

Siddharth Wagle edited comment on HDDS-1189 at 4/3/19 4:58 AM:
---

Thanks [~elek] and [~avijayan] for your reviews. Addressed the comments in patch 05 
as below:

_License issue_:
- Removed the HSQLDB dependency and reverted to using in-memory SQLite for code 
generation; the HSQLDB license is actually a copy of BSD, but there was an 
alternative available, so I went with that.
- spring-jdbc is Apache v2.

_Configuration issues_:
- Used recon.db.dir for constructing the default URL.
- There is no password field tag; checked the source tree.
- The findbugs plugin does not apply recursively, hence the need to be explicit.


was (Author: swagle):
Thanks [~elek] and [~avijayan] for your reviews. Addressed the comments in 05 
as below:

_License issue_:
- Removed HSQLDB dependency and reverted to using in-memory sqlite for code 
generation, the hsqldb license is actually a copy of BSD but there was an 
alternate way out so went with that.
- spring-jdbc is Apache v2 

_Configuration issues_:
-  Used the recon.dbdir for constructing default url
- There is no password field tag, checked source tree
- The findbugs plugin does not apply recursively, hence need to be explicit

> Recon Aggregate DB schema and ORM
> -
>
> Key: HDDS-1189
> URL: https://issues.apache.org/jira/browse/HDDS-1189
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1189.01.patch, HDDS-1189.02.patch, 
> HDDS-1189.03.patch, HDDS-1189.04.patch, HDDS-1189.05.patch, HDDS-1189.06.patch
>
>
> _Objectives_
> - Define V1 of the db schema for recon service
> - The current proposal is to use jOOQ as the ORM for SQL interaction, for two 
> main reasons: a) a powerful query DSL that abstracts out SQL dialects, and 
> b) seamless code-to-schema and schema-to-code transitions, which is critical for 
> creating DDL through code and for unit testing across versions of the 
> application (a short sketch follows this list).
> - Add an e2e unit test suite for Recon entities, created based on the design doc
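
As a rough illustration of the jOOQ approach described above (a hedged sketch against an 
in-memory SQLite connection; the table and column names are invented for the example and are 
not the Recon schema):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;

import org.jooq.DSLContext;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;
import org.jooq.impl.SQLDataType;

public class JooqSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:sqlite::memory:")) {
      DSLContext ctx = DSL.using(conn, SQLDialect.SQLITE);

      // Code-to-schema: DDL expressed through the DSL instead of raw SQL.
      ctx.createTableIfNotExists("container_key_count")
          .column("container_id", SQLDataType.BIGINT)
          .column("key_count", SQLDataType.BIGINT)
          .execute();

      // Dialect-agnostic insert and query through the same DSL.
      ctx.insertInto(DSL.table("container_key_count"),
              DSL.field("container_id"), DSL.field("key_count"))
          .values(1L, 42L)
          .execute();

      System.out.println(ctx.selectFrom(DSL.table("container_key_count")).fetch());
    }
  }
}
{code}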






[jira] [Comment Edited] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Siddharth Wagle (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808263#comment-16808263
 ] 

Siddharth Wagle edited comment on HDDS-1189 at 4/3/19 4:58 AM:
---

Thanks [~elek] and [~avijayan] for your reviews. Addressed the comments in 05 
as below:

_License issue_:
- Removed HSQLDB dependency and reverted to using in-memory sqlite for code 
generation, the hsqldb license is actually a copy of BSD but there was an 
alternate way out so went with that.
- spring-jdbc is Apache v2 

_Configuration issues_:
-  Used the recon.dbdir for constructing default url
- There is no password field tag, checked source tree
- The findbugs plugin does not apply recursively, hence need to be explicit


was (Author: swagle):
Thanks [~elek] and [~avijayan] for your reviews. Addressed the comments in 05 
as below:

_License issue_:
- Removed HSQLDB dependency and reverted to using in-memory sqlite for code 
generation, the hsqldb license is actually BSD but there was an alternate way 
out so went with that.
- spring-jdbc is Apache v2 

_Configuration issues_:
-  Used the recon.dbdir for constructing default url
- There is no password field tag, checked source tree
- The findbugs plugin does not apply recursively, hence need to be explicit

> Recon Aggregate DB schema and ORM
> -
>
> Key: HDDS-1189
> URL: https://issues.apache.org/jira/browse/HDDS-1189
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1189.01.patch, HDDS-1189.02.patch, 
> HDDS-1189.03.patch, HDDS-1189.04.patch, HDDS-1189.05.patch, HDDS-1189.06.patch
>
>
> _Objectives_
> - Define V1 of the db schema for recon service
> - The current proposal is to use jOOQ as the ORM for SQL interaction. For two 
> main reasons: a) powerful DSL for querying, that abstracts out SQL dialects, 
> b) Allows code to schema and schema to code seamless transition, critical for 
> creating DDL through the code and unit testing across versions of the 
> application.
> - Add e2e unit tests suite for Recon entities, created based on the design doc






[jira] [Commented] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808364#comment-16808364
 ] 

Hadoop QA commented on HDDS-1189:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
0s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
35s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
7s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 50s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
33s{color} | {color:green} hadoop-hdds in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 48s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.TestMiniChaosOzoneCluster |
|   | hadoop.ozone.om.TestScmChillMode |
|   | hadoop.ozone.container.TestContainerReplication |
|   | hadoop.ozone.om.TestOzoneManagerHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2628/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1189 |
| 

[jira] [Commented] (HDFS-14369) RBF: Fix trailing "/" for webhdfs

2019-04-02 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808348#comment-16808348
 ] 

Akira Ajisaka commented on HDFS-14369:
--

HADOOP-16226 is now fixed in trunk and branch-3.2. Can someone rebase the 
HDFS-13891 branch?

> RBF: Fix trailing "/" for webhdfs
> -
>
> Key: HDFS-14369
> URL: https://issues.apache.org/jira/browse/HDFS-14369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: CR Hota
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-14369-HDFS-13891-regressiontest-001.patch, 
> HDFS-14369-HDFS-13891.001.patch, HDFS-14369-HDFS-13891.002.patch, 
> HDFS-14369-HDFS-13891.003.patch, HDFS-14369-HDFS-13891.004.patch
>
>
> WebHDFS doesn't trim the trailing slash, causing a discrepancy in operations; a 
> short normalization sketch follows the examples below.
> Example below
> --
> Using the HDFS API, two directories are listed.
> {code}
> $ hdfs dfs -ls hdfs://:/tmp/
> Found 2 items
> drwxrwxrwx   - hdfs supergroup  0 2018-11-09 17:50 
> hdfs://:/tmp/tmp1
> drwxrwxrwx   - hdfs supergroup  0 2018-11-09 17:50 
> hdfs://:/tmp/tmp2
> {code}
> Using WebHDFS API, only one directory is listed.
> {code}
> $ curl -u : --negotiate -i 
> "http://:50071/webhdfs/v1/tmp/?op=LISTSTATUS"
> (snip)
> {"FileStatuses":{"FileStatus":[
> {"accessTime":0,"blockSize":0,"childrenNum":0,"fileId":16387,"group":"supergroup","length":0,"modificationTime":1552016766769,"owner":"hdfs","pathSuffix":"tmp1","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"}
> ]}}
> {code}
> The mount table is as follows:
> {code}
> $ hdfs dfsrouteradmin -ls /tmp
> Mount Table Entries:
> SourceDestinations  Owner 
> Group Mode  Quota/Usage  
> /tmp  ns1->/tmp aajisaka  
> users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> /tmp/tmp1 ns1->/tmp/tmp1aajisaka  
> users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> /tmp/tmp2 ns2->/tmp/tmp2aajisaka  
> users rwxr-xr-x [NsQuota: -/-, SsQuota: 
> -/-]
> {code}
> Without the trailing slash, two directories are listed.
> {code}
> $ curl -u : --negotiate -i 
> "http://:50071/webhdfs/v1/tmp?op=LISTSTATUS"
> (snip)
> {"FileStatuses":{"FileStatus":[
> {"accessTime":1541753421917,"blockSize":0,"childrenNum":0,"fileId":0,"group":"supergroup","length":0,"modificationTime":1541753421917,"owner":"hdfs","pathSuffix":"tmp1","permission":"777","replication":0,"storagePolicy":0,"symlink":"","type":"DIRECTORY"},
> {"accessTime":1541753429812,"blockSize":0,"childrenNum":0,"fileId":0,"group":"supergroup","length":0,"modificationTime":1541753429812,"owner":"hdfs","pathSuffix":"tmp2","permission":"777","replication":0,"storagePolicy":0,"symlink":"","type":"DIRECTORY"}
> ]}}
> {code}
> [~ajisakaa] Thanks for reporting this; I borrowed the text from 
> HDFS-13972.
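
A minimal sketch of the kind of normalization involved, with a hypothetical helper name (not the 
actual HDFS-14369 patch): strip a single trailing "/" before the path is resolved, so that /tmp/ 
and /tmp hit the same mount table entries.

{code:java}
// Hypothetical helper, for illustration only.
private static String trimTrailingSlash(String path) {
  // Keep the root path "/" intact; otherwise drop one trailing slash so
  // "/tmp/" and "/tmp" resolve identically.
  if (path.length() > 1 && path.endsWith("/")) {
    return path.substring(0, path.length() - 1);
  }
  return path;
}
{code}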






[jira] [Updated] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Siddharth Wagle (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDDS-1189:
--
Attachment: HDDS-1189.06.patch

> Recon Aggregate DB schema and ORM
> -
>
> Key: HDDS-1189
> URL: https://issues.apache.org/jira/browse/HDDS-1189
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1189.01.patch, HDDS-1189.02.patch, 
> HDDS-1189.03.patch, HDDS-1189.04.patch, HDDS-1189.05.patch, HDDS-1189.06.patch
>
>
> _Objectives_
> - Define V1 of the db schema for recon service
> - The current proposal is to use jOOQ as the ORM for SQL interaction. For two 
> main reasons: a) powerful DSL for querying, that abstracts out SQL dialects, 
> b) Allows code to schema and schema to code seamless transition, critical for 
> creating DDL through the code and unit testing across versions of the 
> application.
> - Add e2e unit tests suite for Recon entities, created based on the design doc






[jira] [Commented] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Siddharth Wagle (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808313#comment-16808313
 ] 

Siddharth Wagle commented on HDDS-1189:
---

Patch 06 vs. patch 05: checkstyle fixes only.

> Recon Aggregate DB schema and ORM
> -
>
> Key: HDDS-1189
> URL: https://issues.apache.org/jira/browse/HDDS-1189
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1189.01.patch, HDDS-1189.02.patch, 
> HDDS-1189.03.patch, HDDS-1189.04.patch, HDDS-1189.05.patch, HDDS-1189.06.patch
>
>
> _Objectives_
> - Define V1 of the db schema for recon service
> - The current proposal is to use jOOQ as the ORM for SQL interaction. For two 
> main reasons: a) powerful DSL for querying, that abstracts out SQL dialects, 
> b) Allows code to schema and schema to code seamless transition, critical for 
> creating DDL through the code and unit testing across versions of the 
> application.
> - Add e2e unit tests suite for Recon entities, created based on the design doc






[jira] [Commented] (HDFS-13853) RBF: RouterAdmin update cmd is overwriting the entry not updating the existing

2019-04-02 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808304#comment-16808304
 ] 

Ayush Saxena commented on HDFS-13853:
-

Thanks [~elgoiri] for the comments.

Have uploaded v5 with said changes.

> RBF: RouterAdmin update cmd is overwriting the entry not updating the existing
> --
>
> Key: HDFS-13853
> URL: https://issues.apache.org/jira/browse/HDFS-13853
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-13853-HDFS-13891-01.patch, 
> HDFS-13853-HDFS-13891-02.patch, HDFS-13853-HDFS-13891-03.patch, 
> HDFS-13853-HDFS-13891-04.patch, HDFS-13853-HDFS-13891-05.patch
>
>
> {code:java}
> // Create a new entry
> Map<String, String> destMap = new LinkedHashMap<>();
> for (String ns : nss) {
>   destMap.put(ns, dest);
> }
> MountTable newEntry = MountTable.newInstance(mount, destMap);
> {code}
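
A hedged sketch of the "update, don't recreate" idea: merge the requested destinations into the 
entry's existing mapping instead of building a brand-new entry that drops prior state. The types 
below are plain Java stand-ins, not the Router's actual MountTable API.

{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MountUpdateSketch {
  // Start from the entry's existing destinations so fields the update
  // command did not mention are preserved rather than overwritten.
  static Map<String, String> mergedDestinations(
      Map<String, String> existing, List<String> nss, String dest) {
    Map<String, String> merged = new LinkedHashMap<>(existing);
    for (String ns : nss) {
      merged.put(ns, dest); // only change what the command actually asked for
    }
    return merged;
  }
}
{code}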






[jira] [Updated] (HDFS-13853) RBF: RouterAdmin update cmd is overwriting the entry not updating the existing

2019-04-02 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13853:

Attachment: HDFS-13853-HDFS-13891-05.patch

> RBF: RouterAdmin update cmd is overwriting the entry not updating the existing
> --
>
> Key: HDFS-13853
> URL: https://issues.apache.org/jira/browse/HDFS-13853
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-13853-HDFS-13891-01.patch, 
> HDFS-13853-HDFS-13891-02.patch, HDFS-13853-HDFS-13891-03.patch, 
> HDFS-13853-HDFS-13891-04.patch, HDFS-13853-HDFS-13891-05.patch
>
>
> {code:java}
> // Create a new entry
> Map<String, String> destMap = new LinkedHashMap<>();
> for (String ns : nss) {
>   destMap.put(ns, dest);
> }
> MountTable newEntry = MountTable.newInstance(mount, destMap);
> {code}






[jira] [Commented] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808290#comment-16808290
 ] 

Hadoop QA commented on HDDS-1189:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
32s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-ozone: The patch generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
8s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 18s{color} 
| {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 29s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2627/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1189 |
| JIRA Patch URL | 

[jira] [Resolved] (HDDS-294) Destroy ratis pipeline on datanode on pipeline close event.

2019-04-02 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh resolved HDDS-294.

Resolution: Implemented

> Destroy ratis pipeline on datanode on pipeline close event.
> ---
>
> Key: HDDS-294
> URL: https://issues.apache.org/jira/browse/HDDS-294
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: alpha2, newbie
>
> Once a ratis pipeline is closed, the corresponding metadata on the datanode 
> should be destroyed as well. This jira proposes to remove the ratis metadata 
> and destroy the ratis ring on datanode.






[jira] [Commented] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808279#comment-16808279
 ] 

Hadoop QA commented on HDFS-13699:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 14m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 10s{color} | {color:orange} root: The patch generated 3 new + 743 unchanged 
- 3 fixed = 746 total (was 746) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  0s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
10s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
4s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 14s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}196m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-13699 |
| JIRA Patch URL | 

[jira] [Work logged] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?focusedWorklogId=222115=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222115
 ]

ASF GitHub Bot logged work on HDDS-1353:


Author: ASF GitHub Bot
Created on: 03/Apr/19 01:03
Start Date: 03/Apr/19 01:03
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #681: HDDS-1353 : 
Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
because of BackgroundPipelineCreator.
URL: https://github.com/apache/hadoop/pull/681#issuecomment-479279985
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 43 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 70 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1182 | trunk passed |
   | +1 | compile | 1148 | trunk passed |
   | +1 | checkstyle | 208 | trunk passed |
   | +1 | mvnsite | 124 | trunk passed |
   | +1 | shadedclient | 1075 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 51 | trunk passed |
   | +1 | javadoc | 56 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 61 | Maven dependency ordering for patch |
   | +1 | mvninstall | 87 | the patch passed |
   | +1 | compile | 1085 | the patch passed |
   | +1 | javac | 1085 | the patch passed |
   | +1 | checkstyle | 220 | the patch passed |
   | +1 | mvnsite | 81 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 713 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 56 | the patch passed |
   | +1 | javadoc | 56 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 126 | server-scm in the patch passed. |
   | -1 | unit | 1110 | integration-test in the patch failed. |
   | +1 | asflicense | 55 | The patch does not generate ASF License warnings. |
   | | | 7526 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdds.scm.pipeline.TestRatisPipelineUtils |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.TestMiniChaosOzoneCluster |
   |   | hadoop.ozone.scm.pipeline.TestSCMPipelineMetrics |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/681 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux b2219f6251b4 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / cf26811 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/3/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/3/testReport/ |
   | Max. process+thread count | 5249 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/server-scm hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222115)
Time Spent: 40m  (was: 0.5h)

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of BackgroundPipelineCreator
> ---
>
> Key: HDDS-1353
> URL: https://issues.apache.org/jira/browse/HDDS-1353
> Project: Hadoop Distributed Data Store
>

[jira] [Commented] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Siddharth Wagle (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808263#comment-16808263
 ] 

Siddharth Wagle commented on HDDS-1189:
---

Thanks [~elek] and [~avijayan] for your reviews. Addressed the comments in 05 
as below:

_License issue_:
- Removed HSQLDB dependency and reverted to using in-memory sqlite for code 
generation, the hsqldb license is actually BSD but there was an alternate way 
out so went with that.
- spring-jdbc is Apache v2 

_Configuration issues_:
-  Used the recon.dbdir for constructing default url
- There is no password field tag, checked source tree
- The findbugs plugin does not apply recursively, hence need to be explicit

> Recon Aggregate DB schema and ORM
> -
>
> Key: HDDS-1189
> URL: https://issues.apache.org/jira/browse/HDDS-1189
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1189.01.patch, HDDS-1189.02.patch, 
> HDDS-1189.03.patch, HDDS-1189.04.patch, HDDS-1189.05.patch
>
>
> _Objectives_
> - Define V1 of the db schema for recon service
> - The current proposal is to use jOOQ as the ORM for SQL interaction. For two 
> main reasons: a) powerful DSL for querying, that abstracts out SQL dialects, 
> b) Allows code to schema and schema to code seamless transition, critical for 
> creating DDL through the code and unit testing across versions of the 
> application.
> - Add e2e unit tests suite for Recon entities, created based on the design doc






[jira] [Updated] (HDDS-1189) Recon Aggregate DB schema and ORM

2019-04-02 Thread Siddharth Wagle (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDDS-1189:
--
Attachment: HDDS-1189.05.patch

> Recon Aggregate DB schema and ORM
> -
>
> Key: HDDS-1189
> URL: https://issues.apache.org/jira/browse/HDDS-1189
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1189.01.patch, HDDS-1189.02.patch, 
> HDDS-1189.03.patch, HDDS-1189.04.patch, HDDS-1189.05.patch
>
>
> _Objectives_
> - Define V1 of the db schema for recon service
> - The current proposal is to use jOOQ as the ORM for SQL interaction. For two 
> main reasons: a) powerful DSL for querying, that abstracts out SQL dialects, 
> b) Allows code to schema and schema to code seamless transition, critical for 
> creating DDL through the code and unit testing across versions of the 
> application.
> - Add e2e unit tests suite for Recon entities, created based on the design doc






[jira] [Commented] (HDDS-1373) KeyOutputStream, close after write request fails after retries, runs into IllegalArgumentException

2019-04-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808239#comment-16808239
 ] 

Hadoop QA commented on HDDS-1373:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  4m 
22s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} hadoop-ozone: The patch generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
55s{color} | {color:green} hadoop-hdds in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m 47s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 95m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.TestContainerReplication |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2626/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1373 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12964622/HDDS-1373.000.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs 

[jira] [Commented] (HDFS-13989) RBF: Add FSCK to the Router

2019-04-02 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808231#comment-16808231
 ] 

Fengnan Li commented on HDFS-13989:
---

Current patch LGTM. 

+1

> RBF: Add FSCK to the Router
> ---
>
> Key: HDFS-13989
> URL: https://issues.apache.org/jira/browse/HDFS-13989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13989.001.patch
>
>
> The namenode supports FSCK.
> The Router should be able to forward FSCK to the right Namenode and aggregate 
> the results.






[jira] [Work logged] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?focusedWorklogId=222067=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222067
 ]

ASF GitHub Bot logged work on HDDS-1353:


Author: ASF GitHub Bot
Created on: 02/Apr/19 22:48
Start Date: 02/Apr/19 22:48
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #681: HDDS-1353 : 
Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
because of BackgroundPipelineCreator.
URL: https://github.com/apache/hadoop/pull/681#issuecomment-479240329
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 34 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 66 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1094 | trunk passed |
   | +1 | compile | 1094 | trunk passed |
   | +1 | checkstyle | 203 | trunk passed |
   | +1 | mvnsite | 113 | trunk passed |
   | +1 | shadedclient | 1000 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 47 | trunk passed |
   | +1 | javadoc | 49 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for patch |
   | +1 | mvninstall | 63 | the patch passed |
   | +1 | compile | 920 | the patch passed |
   | +1 | javac | 920 | the patch passed |
   | +1 | checkstyle | 191 | the patch passed |
   | +1 | mvnsite | 79 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 687 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 57 | the patch passed |
   | +1 | javadoc | 66 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 123 | server-scm in the patch passed. |
   | -1 | unit | 678 | integration-test in the patch failed. |
   | +1 | asflicense | 46 | The patch does not generate ASF License warnings. |
   | | | 6572 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdds.scm.pipeline.TestSCMPipelineManager |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/681 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux d0a668414102 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / cf26811 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/2/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/2/testReport/ |
   | Max. process+thread count | 4493 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/server-scm hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/2/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222067)
Time Spent: 0.5h  (was: 20m)

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of BackgroundPipelineCreator
> ---
>
> Key: HDDS-1353
> URL: https://issues.apache.org/jira/browse/HDDS-1353
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: 

[jira] [Commented] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808206#comment-16808206
 ] 

Hadoop QA commented on HDFS-13699:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 33s{color} | {color:orange} root: The patch generated 4 new + 743 unchanged 
- 3 fixed = 747 total (was 746) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
20s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
6s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 34s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
53s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}233m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-13699 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12964607/HDFS-13699.009.patch |
| 

[jira] [Commented] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808196#comment-16808196
 ] 

Konstantin Shvachko commented on HDFS-13699:


+1 on v010 patch.

> Add DFSClient sending handshake token to DataNode, and allow DataNode 
> overwrite downstream QOP
> --
>
> Key: HDFS-13699
> URL: https://issues.apache.org/jira/browse/HDFS-13699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13699.001.patch, HDFS-13699.002.patch, 
> HDFS-13699.003.patch, HDFS-13699.004.patch, HDFS-13699.005.patch, 
> HDFS-13699.006.patch, HDFS-13699.007.patch, HDFS-13699.008.patch, 
> HDFS-13699.009.patch, HDFS-13699.010.patch, HDFS-13699.WIP.001.patch
>
>
> Given the other Jiras under HDFS-13541, this Jira is to allow the DFSClient to 
> relay the encrypted secret to the DataNode. The encrypted message is the QOP 
> that the client and NameNode have used. The DataNode decrypts the message and 
> enforces that QOP for the client connection. This Jira will also include 
> overwriting the downstream QOP, as mentioned in the HDFS-13541 design doc; 
> namely, allowing an inter-DN QOP that is different from the client-DN QOP.
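
A purely illustrative sketch of the wrap/unwrap step described above, using plain JDK crypto 
rather than Hadoop's SASL machinery (the class, methods, and key handling here are assumptions 
for the example, not the actual HDFS-13699 implementation): the client encrypts the negotiated 
QOP with a secret shared with the DataNode, and the DataNode decrypts it and enforces that QOP 
for the connection.

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustration only: a shared-secret envelope for a QOP string.
public class QopEnvelopeSketch {
  private static final int IV_LEN = 12;
  private static final int TAG_BITS = 128;

  // Client side: encrypt the QOP it negotiated with the NameNode.
  static byte[] wrap(byte[] sharedKey, String qop) throws Exception {
    byte[] iv = new byte[IV_LEN];
    new SecureRandom().nextBytes(iv);
    Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
    c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(sharedKey, "AES"),
        new GCMParameterSpec(TAG_BITS, iv));
    byte[] ct = c.doFinal(qop.getBytes(StandardCharsets.UTF_8));
    byte[] out = new byte[IV_LEN + ct.length];
    System.arraycopy(iv, 0, out, 0, IV_LEN);
    System.arraycopy(ct, 0, out, IV_LEN, ct.length);
    return out;
  }

  // DataNode side: decrypt and enforce the recovered QOP for the connection.
  static String unwrap(byte[] sharedKey, byte[] envelope) throws Exception {
    Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
    c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(sharedKey, "AES"),
        new GCMParameterSpec(TAG_BITS, envelope, 0, IV_LEN));
    byte[] pt = c.doFinal(envelope, IV_LEN, envelope.length - IV_LEN);
    return new String(pt, StandardCharsets.UTF_8);
  }
}
{code}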






[jira] [Updated] (HDFS-14406) Add per user RPC Processing time

2019-04-02 Thread Xue Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated HDFS-14406:
---
Description: 
For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This JIRA aims to add per user RPC processing time metrics and expose it via 
JMX.

  was:
For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This JIRA aims to add per user RPC response time metrics and expose it via JMX.


> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> having our metrics reflecting on the usage, latency, etc, for each user. 
> This JIRA aims to add per user RPC processing time metrics and expose it via 
> JMX.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1373) KeyOutputStream, close after write request fails after retries, runs into IllegalArgumentException

2019-04-02 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-1373:
--
Attachment: HDDS-1373.000.patch

> KeyOutputStream, close after write request fails after retries, runs into 
> IllegalArgumentException
> --
>
> Key: HDDS-1373
> URL: https://issues.apache.org/jira/browse/HDDS-1373
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.4.0, 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster
> Attachments: HDDS-1373.000.patch
>
>
> In this code, the stream is closed via try with resource.
> {code}
>   try (OzoneOutputStream stream = ozoneBucket.createKey(keyName,
>   bufferCapacity, ReplicationType.RATIS, ReplicationFactor.THREE,
>   new HashMap<>())) {
> stream.write(buffer.array());
>   } catch (Exception e) {
> LOG.error("LOADGEN: Create key:{} failed with exception", keyName, e);
> break;
>   }
> {code}
> Here, the write call fails as expected; however, the close doesn't fail with 
> the same exception.
> The exception stack trace is as follows:
> {code}
> 2019-04-03 00:52:54,116 ERROR ozone.MiniOzoneLoadGenerator 
> (MiniOzoneLoadGenerator.java:load(101)) - LOADGEN: Create 
> key:pool-431-thread-9-8126 failed with exception
> java.io.IOException: Retry request failed. retries get failed due to exceeded 
> maximum allowed retries number: 5
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:492)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:287)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
> at java.io.OutputStream.write(OutputStream.java:75)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:99)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:137)
> at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Suppressed: java.lang.IllegalArgumentException
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:643)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:100)
>  
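For illustration only, here is a standalone sketch (not the Ozone code itself; {{FailingStream}} is a made-up stand-in) of why the {{IllegalArgumentException}} above appears under "Suppressed": with try-with-resources, when the body and {{close()}} both throw, the {{close()}} failure is attached to the primary exception as a suppressed exception.
{code}
import java.io.IOException;
import java.io.OutputStream;

public class SuppressedCloseDemo {
  // Hypothetical stream: write() fails, and close() fails differently,
  // mirroring the behaviour described in this issue.
  static class FailingStream extends OutputStream {
    @Override
    public void write(int b) throws IOException {
      throw new IOException("write failed");
    }
    @Override
    public void close() {
      throw new IllegalArgumentException();
    }
  }

  public static void main(String[] args) {
    try (OutputStream stream = new FailingStream()) {
      stream.write(1);
    } catch (Exception e) {
      // The primary exception is the write failure; the close() failure
      // is attached as a suppressed exception, as in the stack trace above.
      System.out.println("Primary: " + e);
      for (Throwable t : e.getSuppressed()) {
        System.out.println("Suppressed: " + t);
      }
    }
  }
}
{code}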

[jira] [Updated] (HDDS-1373) KeyOutputStream, close after write request fails after retries, runs into IllegalArgumentException

2019-04-02 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-1373:
--
Status: Patch Available  (was: Open)

> KeyOutputStream, close after write request fails after retries, runs into 
> IllegalArgumentException
> --
>
> Key: HDDS-1373
> URL: https://issues.apache.org/jira/browse/HDDS-1373
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.4.0, 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster
> Attachments: HDDS-1373.000.patch
>
>
> In this code, the stream is closed via try with resource.
> {code}
>   try (OzoneOutputStream stream = ozoneBucket.createKey(keyName,
>   bufferCapacity, ReplicationType.RATIS, ReplicationFactor.THREE,
>   new HashMap<>())) {
> stream.write(buffer.array());
>   } catch (Exception e) {
> LOG.error("LOADGEN: Create key:{} failed with exception", keyName, e);
> break;
>   }
> {code}
> Here, the write call fails as expected; however, the close doesn't fail with 
> the same exception.
> The exception stack trace is as follows:
> {code}
> 2019-04-03 00:52:54,116 ERROR ozone.MiniOzoneLoadGenerator 
> (MiniOzoneLoadGenerator.java:load(101)) - LOADGEN: Create 
> key:pool-431-thread-9-8126 failed with exception
> java.io.IOException: Retry request failed. retries get failed due to exceeded 
> maximum allowed retries number: 5
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:492)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:287)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
> at java.io.OutputStream.write(OutputStream.java:75)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:99)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:137)
> at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Suppressed: java.lang.IllegalArgumentException
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:643)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> at 
> 

[jira] [Updated] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-13699:
--
Attachment: HDFS-13699.010.patch

> Add DFSClient sending handshake token to DataNode, and allow DataNode 
> overwrite downstream QOP
> --
>
> Key: HDFS-13699
> URL: https://issues.apache.org/jira/browse/HDFS-13699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13699.001.patch, HDFS-13699.002.patch, 
> HDFS-13699.003.patch, HDFS-13699.004.patch, HDFS-13699.005.patch, 
> HDFS-13699.006.patch, HDFS-13699.007.patch, HDFS-13699.008.patch, 
> HDFS-13699.009.patch, HDFS-13699.010.patch, HDFS-13699.WIP.001.patch
>
>
> Given the other Jiras under HDFS-13541, this Jira is to allow DFSClient to 
> redirect the encryption secret to DataNode. The encrypted message is the QOP 
> that the client and NameNode have used. DataNode decrypts the message and 
> enforces the QOP for the client connection. This Jira will also include 
> overwriting the downstream QOP, as mentioned in the HDFS-13541 design doc. 
> Namely, this is to allow an inter-DN QOP that differs from the client-DN QOP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808186#comment-16808186
 ] 

Chen Liang commented on HDFS-13699:
---

Posted the v010 patch with one additional unused-import fix.

> Add DFSClient sending handshake token to DataNode, and allow DataNode 
> overwrite downstream QOP
> --
>
> Key: HDFS-13699
> URL: https://issues.apache.org/jira/browse/HDFS-13699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13699.001.patch, HDFS-13699.002.patch, 
> HDFS-13699.003.patch, HDFS-13699.004.patch, HDFS-13699.005.patch, 
> HDFS-13699.006.patch, HDFS-13699.007.patch, HDFS-13699.008.patch, 
> HDFS-13699.009.patch, HDFS-13699.010.patch, HDFS-13699.WIP.001.patch
>
>
> Given the other Jiras under HDFS-13541, this Jira is to allow DFSClient to 
> redirect the encryption secret to DataNode. The encrypted message is the QOP 
> that the client and NameNode have used. DataNode decrypts the message and 
> enforces the QOP for the client connection. This Jira will also include 
> overwriting the downstream QOP, as mentioned in the HDFS-13541 design doc. 
> Namely, this is to allow an inter-DN QOP that differs from the client-DN QOP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808184#comment-16808184
 ] 

Sahil Takiar commented on HDFS-14394:
-

Thanks for the input, Todd. One last thing I forgot to mention: Hadoop QA didn't 
run the libhdfs tests for whatever reason. I ran them manually against this 
patch and they all passed. For anyone else having trouble getting the tests to 
work reliably on Linux, I was only able to get them to run properly from 
inside the Hadoop Docker image (run {{./start-build-env.sh}}).

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808180#comment-16808180
 ] 

Todd Lipcon commented on HDFS-14394:


One more piece of related info here is that the concern about usage of GCC 
extensions potentially linking in parts of the GCC runtime is explicitly 
addressed by a runtime library exception: 
https://www.gnu.org/licenses/gcc-exception-3.1-faq.en.html so even if we were 
to distribute a binary artifact compiled by GCC that included parts of the GCC 
runtime (libgcc) we'd be ok.

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14406) Add per user RPC Processing time

2019-04-02 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-14406:
---
Description: 
For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This JIRA aims to add per user RPC response time metrics and expose it via JMX.

  was:
For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This Jira aims to add per user RPC response time metrics and expose it via jmx.


> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> having our metrics reflecting on the usage, latency, etc, for each user. 
> This JIRA aims to add per user RPC response time metrics and expose it via 
> JMX.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14406) Add per user RPC Processing time

2019-04-02 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri reassigned HDFS-14406:
--

Assignee: Xue Liu

> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> having our metrics reflecting on the usage, latency, etc, for each user. 
> This Jira aims to add per user RPC response time metrics and expose it via 
> jmx.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222030=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222030
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:47
Start Date: 02/Apr/19 21:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479218191
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 22 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 71 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1133 | trunk passed |
   | +1 | compile | 949 | trunk passed |
   | +1 | checkstyle | 208 | trunk passed |
   | +1 | mvnsite | 204 | trunk passed |
   | +1 | shadedclient | 1166 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 198 | trunk passed |
   | +1 | javadoc | 145 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | -1 | mvninstall | 26 | integration-test in the patch failed. |
   | +1 | compile | 1000 | the patch passed |
   | +1 | javac | 1000 | the patch passed |
   | +1 | checkstyle | 213 | the patch passed |
   | +1 | mvnsite | 163 | the patch passed |
   | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 784 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 227 | the patch passed |
   | +1 | javadoc | 146 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 90 | common in the patch passed. |
   | +1 | unit | 44 | common in the patch passed. |
   | -1 | unit | 754 | integration-test in the patch failed. |
   | +1 | unit | 56 | ozone-manager in the patch passed. |
   | +1 | asflicense | 46 | The patch does not generate ASF License warnings. |
   | | | 7665 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.om.TestScmChillMode |
   |   | hadoop.hdds.scm.pipeline.TestRatisPipelineUtils |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
   | uname | Linux 23206903a05e 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed 
Oct 31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / cf26811 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/patch-mvninstall-hadoop-ozone_integration-test.txt
 |
   | whitespace | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/whitespace-eol.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/testReport/ |
   | Max. process+thread count | 4419 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/common 
hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/4/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog 

[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=222029=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222029
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:47
Start Date: 02/Apr/19 21:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271508687
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -534,4 +536,84 @@ public void testReadRequest() throws Exception {
   proxyProvider.getCurrentProxyOMNodeId());
 }
   }
+
+  @Test
+  public void testOMRatisSnapshot() throws Exception {
+String userName = "user" + RandomStringUtils.randomNumeric(5);
+String adminName = "admin" + RandomStringUtils.randomNumeric(5);
+String volumeName = "volume" + RandomStringUtils.randomNumeric(5);
+String bucketName = "bucket" + RandomStringUtils.randomNumeric(5);
+
+VolumeArgs createVolumeArgs = VolumeArgs.newBuilder()
+.setOwner(userName)
+.setAdmin(adminName)
+.build();
+
+objectStore.createVolume(volumeName, createVolumeArgs);
+OzoneVolume retVolumeinfo = objectStore.getVolume(volumeName);
+
+retVolumeinfo.createBucket(bucketName);
+OzoneBucket ozoneBucket = retVolumeinfo.getBucket(bucketName);
+
+String leaderOMNodeId = objectStore.getClientProxy().getOMProxyProvider()
+.getCurrentProxyOMNodeId();
+OzoneManager ozoneManager = cluster.getOzoneManager(leaderOMNodeId);
+
+// Send commands to ratis to increase the log index so that ratis
+// triggers a snapshot on the state machine.
+
+long appliedLogIndex = 0;
+while (appliedLogIndex <= SNAPSHOT_THRESHOLD) {
+  createKey(ozoneBucket);
+  appliedLogIndex = ozoneManager.getOmRatisServer()
+  .getStateMachineLastAppliedIndex();
+}
+
+GenericTestUtils.waitFor(() -> {
+  if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+return true;
+  }
+  return false;
+}, 1000, 10);
+
+// The current lastAppliedLogIndex on the state machine should be greater
+// than or equal to the saved snapshot index.
+long smLastAppliedIndex =
+ozoneManager.getOmRatisServer().getStateMachineLastAppliedIndex();
+long ratisSnapshotIndex = ozoneManager.loadRatisSnapshotIndex();
+Assert.assertTrue("LastAppliedIndex on OM State Machine ("
++ smLastAppliedIndex + ") is less than the saved snapshot index("
++ ratisSnapshotIndex + ").",
+smLastAppliedIndex >= ratisSnapshotIndex);
+
+// Add more transactions to Ratis to trigger another snapshot
+while (appliedLogIndex <= (smLastAppliedIndex + SNAPSHOT_THRESHOLD)) {
+  createKey(ozoneBucket);
+  appliedLogIndex = ozoneManager.getOmRatisServer()
+  .getStateMachineLastAppliedIndex();
+}
+
+GenericTestUtils.waitFor(() -> {
+  if (ozoneManager.loadRatisSnapshotIndex() > 0) {
+return true;
+  }
+  return false;
+}, 1000, 10);
+
+// The new snapshot index must be greater than the previous snapshot index
+long ratisSnapshotIndexNew = ozoneManager.loadRatisSnapshotIndex();
+Assert.assertTrue("Latest snapshot index must be greater than previous " +
+"snapshot indices", ratisSnapshotIndexNew > ratisSnapshotIndex);  
 
 Review comment:
   whitespace:end of line
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222029)
Time Spent: 3.5h  (was: 3h 20m)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will only preserve, on disk, the last log index applied by the State 
> Machine. This index will be stored in a file in the OM metadata dir.
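As an illustrative sketch only (class and file names here are hypothetical, not the actual OM implementation), persisting and reloading a last-applied index in the metadata dir could look like:
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public final class SnapshotIndexFile {
  private final Path file;

  public SnapshotIndexFile(String metadataDir) {
    // Hypothetical file name; the real OM file name may differ.
    this.file = Paths.get(metadataDir, "om.ratis.snapshot.index");
  }

  // Overwrite the stored index with the latest applied log index.
  public void save(long lastAppliedIndex) throws IOException {
    Files.write(file, Long.toString(lastAppliedIndex)
        .getBytes(StandardCharsets.UTF_8));
  }

  // Return the saved index, or 0 if no snapshot has been taken yet.
  public long load() throws IOException {
    if (!Files.exists(file)) {
      return 0;
    }
    return Long.parseLong(new String(Files.readAllBytes(file),
        StandardCharsets.UTF_8).trim());
  }
}
{code}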



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HDFS-14406) Add per user RPC Processing time

2019-04-02 Thread Xue Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated HDFS-14406:
---
Description: 
For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This Jira aims to add per user RPC response time metrics and expose it via jmx.

  was:
For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This Jira aims to add per user RPC response time metrics and export it via jmx.


> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> having our metrics reflecting on the usage, latency, etc, for each user. 
> This Jira aims to add per user RPC response time metrics and expose it via 
> jmx.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14406) Add per user RPC Processing time

2019-04-02 Thread Xue Liu (JIRA)
Xue Liu created HDFS-14406:
--

 Summary: Add per user RPC Processing time
 Key: HDFS-14406
 URL: https://issues.apache.org/jira/browse/HDFS-14406
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.2.0
Reporter: Xue Liu
 Fix For: 3.2.0


For a shared cluster we would want to separate users' resources, as well as 
having our metrics reflecting on the usage, latency, etc, for each user. 

This Jira aims to add per user RPC response time metrics and export it via jmx.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread Aravindan Vijayan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1353:

Status: Patch Available  (was: In Progress)

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of BackgroundPipelineCreator
> ---
>
> Key: HDDS-1353
> URL: https://issues.apache.org/jira/browse/HDDS-1353
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: newbie, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There is a {{BackgroundPipelineCreator}} thread in SCM which runs at a fixed 
> interval and tries to create pipelines. BackgroundPipelineCreator uses 
> {{IOException}} as its exit criterion ("no more pipelines can be created"): 
> each run exits only when it fails to create any more pipelines, i.e. when it 
> gets an IOException while trying to create a pipeline. This means that the 
> {{scm_pipeline_metrics_num_pipeline_creation_failed}} value gets incremented 
> in every run of BackgroundPipelineCreator.
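A simplified sketch of the pattern described above (names are illustrative; this is not the SCM code): because the loop's only exit path is an {{IOException}}, the failure counter is bumped on every run.
{code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

public class PipelineCreationLoop {
  private final AtomicLong numPipelineCreationFailed = new AtomicLong();

  // Stand-in for the real pipeline factory; throws once no more
  // pipelines can be created.
  interface PipelineFactory {
    void createPipeline() throws IOException;
  }

  // One scheduled run: keep creating pipelines until an IOException
  // signals "no more pipelines can be created". Since that exit path is
  // an exception, the failure metric increments on every run.
  void runOnce(PipelineFactory factory) {
    while (true) {
      try {
        factory.createPipeline();
      } catch (IOException e) {
        numPipelineCreationFailed.incrementAndGet();
        break;
      }
    }
  }

  long failures() {
    return numPipelineCreationFailed.get();
  }
}
{code}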



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808173#comment-16808173
 ] 

Todd Lipcon commented on HDFS-14394:


BTW I should have said I'm +1 on this patch. I'll commit unless there are 
further objections.

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808172#comment-16808172
 ] 

Todd Lipcon commented on HDFS-14394:


I'm pretty certain that depending on GNU extensions to the C language does not 
imply anything about licensing of your code. See the Apache license faq here: 
https://apache.org/legal/resolved.html#prohibited

bq. For example, using a GPL'ed tool during the build is OK, however including 
GPL'ed source code is not.

Note that if language/toolchain licensing were viral to projects built in that 
language, all of our use of Java would be problematic as well :)

Note also that clang supports the --std=gnu99 mode, so even turning this on 
doesn't imply usage of gcc or any other GPL tool. Generally I think the GNU99 
standard is very well supported. As of gcc 5.1 in fact the default is 
--std=gnu11. It seems like icc is also fine with this as of icc 17.0.0.

I don't see any reason to add --pedantic-errors in this patch either -- this is 
just fixing an issue compiling on older compilers to match the standard we are 
_already using_ on newer compilers. If we want to be stricter about our C 
standard adherence, let's do that separately.

On the subject of passing this flag to the C++ code, I agree that it won't have 
any effect, because the CMAKE_C_FLAGS should not affect the C++ code. 

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14405) RBF: Client should be able to renew DT immediately after it fetched the DT

2019-04-02 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-14405:
---
Summary: RBF: Client should be able to renew DT immediately after it 
fetched the DT  (was: RBF: Client should be able to renew dt immediately after 
it fetched the dt)

> RBF: Client should be able to renew DT immediately after it fetched the DT
> --
>
> Key: HDFS-14405
> URL: https://issues.apache.org/jira/browse/HDFS-14405
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Minor
>
> By the current design, once a DT is generated it needs to be synced to other 
> routers as well as backed up in the state store; therefore there is a time 
> gap before other routers are able to know about the existence of this token.
> Ideally, the same client should be able to renew the token it just created 
> through fetchdt even though the two calls hit two distinct routers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14405) RBF: Client should be able to renew dt immediately after it fetched the dt

2019-04-02 Thread Fengnan Li (JIRA)
Fengnan Li created HDFS-14405:
-

 Summary: RBF: Client should be able to renew dt immediately after 
it fetched the dt
 Key: HDFS-14405
 URL: https://issues.apache.org/jira/browse/HDFS-14405
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Fengnan Li
Assignee: Fengnan Li


By the current design, once a DT is generated it needs to be synced to other 
routers as well as backed up in the state store; therefore there is a time gap 
before other routers are able to know about the existence of this token.

Ideally, the same client should be able to renew the token it just created 
through fetchdt even though the two calls hit two distinct routers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222008=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222008
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271495280
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
 ##
 @@ -15,129 +15,94 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.hadoop.hdds.scm.container;
 
-import java.io.IOException;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Set;
-import java.util.stream.Collectors;
+package org.apache.hadoop.hdds.scm.container;
 
 import org.apache.hadoop.hdds.protocol.DatanodeDetails;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReplicaProto;
 import org.apache.hadoop.hdds.protocol.proto
 .StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
 import org.apache.hadoop.hdds.scm.block.PendingDeleteStatusList;
-import org.apache.hadoop.hdds.scm.container.replication
-.ReplicationActivityStatus;
-import org.apache.hadoop.hdds.scm.container.replication.ReplicationRequest;
 import org.apache.hadoop.hdds.scm.events.SCMEvents;
 import org.apache.hadoop.hdds.scm.node.NodeManager;
 import org.apache.hadoop.hdds.scm.node.states.NodeNotFoundException;
-import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
-import org.apache.hadoop.hdds.scm.server
-.SCMDatanodeHeartbeatDispatcher.ContainerReportFromDatanode;
+import org.apache.hadoop.hdds.scm.server.SCMDatanodeHeartbeatDispatcher
+.ContainerReportFromDatanode;
 import org.apache.hadoop.hdds.server.events.EventHandler;
 import org.apache.hadoop.hdds.server.events.EventPublisher;
-
-import com.google.common.base.Preconditions;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
 /**
  * Handles container reports from datanode.
  */
-public class ContainerReportHandler implements
-EventHandler<ContainerReportFromDatanode> {
+public class ContainerReportHandler extends AbstractContainerReportHandler
+implements EventHandler<ContainerReportFromDatanode> {
 
   private static final Logger LOG =
   LoggerFactory.getLogger(ContainerReportHandler.class);
 
   private final NodeManager nodeManager;
-  private final PipelineManager pipelineManager;
   private final ContainerManager containerManager;
-  private final ReplicationActivityStatus replicationStatus;
 
+  /**
+   * Constructs ContainerReportHandler instance with the
+   * given NodeManager and ContainerManager instance.
+   *
+   * @param nodeManager NodeManager instance
+   * @param containerManager ContainerManager instance
+   */
   public ContainerReportHandler(final NodeManager nodeManager,
-  final PipelineManager pipelineManager,
-  final ContainerManager containerManager,
-  final ReplicationActivityStatus replicationActivityStatus) {
-Preconditions.checkNotNull(nodeManager);
-Preconditions.checkNotNull(pipelineManager);
-Preconditions.checkNotNull(containerManager);
-Preconditions.checkNotNull(replicationActivityStatus);
+final ContainerManager containerManager) {
+super(containerManager, LOG);
 this.nodeManager = nodeManager;
-this.pipelineManager = pipelineManager;
 this.containerManager = containerManager;
-this.replicationStatus = replicationActivityStatus;
   }
 
+  /**
+   * Process the container reports from datanodes.
+   *
+   * @param reportFromDatanode Container Report
+   * @param publisher EventPublisher reference
+   */
   @Override
   public void onMessage(final ContainerReportFromDatanode reportFromDatanode,
-  final EventPublisher publisher) {
+final EventPublisher publisher) {
 
 final DatanodeDetails datanodeDetails =
 reportFromDatanode.getDatanodeDetails();
-
 final ContainerReportsProto containerReport =
 reportFromDatanode.getReport();
 
 try {
+  final List<ContainerReplicaProto> replicas =
+  containerReport.getReportsList();
+  final Set<ContainerID> containersInSCM =
+  nodeManager.getContainers(datanodeDetails);

-  final List<ContainerReplicaProto> replicas = containerReport
-  .getReportsList();

-  // ContainerIDs which SCM expects this datanode to have.
-  final Set<ContainerID> expectedContainerIDs = nodeManager
-  .getContainers(datanodeDetails);
-
-  // ContainerIDs that this 

[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222007
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271487673
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
 ##
 @@ -160,15 +161,43 @@ public ReplicationManager(final Configuration conf,
* Starts Replication Monitor thread.
*/
   public synchronized void start() {
+start(0);
+  }
+
+  /**
+   * Starts Replication Monitor thread after the given initial delay.
+   *
+   * @param delay initial delay in milliseconds
+   */
+  public void start(final long delay) {
 if (!running) {
-  LOG.info("Starting Replication Monitor Thread.");
   running = true;
-  replicationMonitor.start();
+  CompletableFuture.runAsync(() -> {
 
 Review comment:
   Don't use the forkJoin commonPool. It has very few threads and can be easily 
exhausted. We saw this to be a frequent issue in unit tests.
   
   Instead use the overload that accepts an Executor.
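A minimal sketch of that suggestion (the executor choice and class name are illustrative, not the patch itself): pass a dedicated Executor to {{CompletableFuture.runAsync}} instead of relying on the shared common pool.
{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DelayedStartExample {
  // Dedicated single-thread executor so the delayed start does not
  // compete for ForkJoinPool.commonPool() threads.
  private final ExecutorService starter =
      Executors.newSingleThreadExecutor();

  public void start(final long delayMillis, final Runnable monitor) {
    CompletableFuture.runAsync(() -> {
      try {
        TimeUnit.MILLISECONDS.sleep(delayMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      monitor.run();
    }, starter); // runAsync overload that accepts an Executor
  }
}
{code}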
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222007)
Time Spent: 0.5h  (was: 20m)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in new ReplicationManager, this Jira is to refactor 
> ContainerReportProcessing logic in SCM so that it complements 
> ReplicationManager and plugin the new ReplicationManager code. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222009=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222009
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271487930
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
 ##
 @@ -160,15 +161,43 @@ public ReplicationManager(final Configuration conf,
* Starts Replication Monitor thread.
*/
   public synchronized void start() {
+start(0);
+  }
+
+  /**
+   * Starts Replication Monitor thread after the given initial delay.
+   *
+   * @param delay initial delay in milliseconds
+   */
+  public void start(final long delay) {
 if (!running) {
-  LOG.info("Starting Replication Monitor Thread.");
   running = true;
-  replicationMonitor.start();
+  CompletableFuture.runAsync(() -> {
+try {
+  LOG.info("Replication Monitor Thread will be started" +
+  " in {} milliseconds.", delay);
+  Thread.sleep(delay);
+} catch (InterruptedException ignored) {
+  // InterruptedException is ignored.
 
 Review comment:
   Set interrupted flag here.
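A small self-contained example of restoring the interrupt status instead of swallowing {{InterruptedException}} (names are illustrative):
{code}
public class InterruptAwareSleep {
  // Sleep helper that preserves the thread's interrupt status instead of
  // silently ignoring InterruptedException.
  static boolean sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // set the interrupted flag back
      return false;
    }
  }

  public static void main(String[] args) {
    Thread t = new Thread(() -> {
      boolean completed = sleepQuietly(10_000);
      System.out.println("completed=" + completed
          + " interrupted=" + Thread.currentThread().isInterrupted());
    });
    t.start();
    t.interrupt();
  }
}
{code}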
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 222009)
Time Spent: 50m  (was: 40m)

> Refactor Container Report Processing logic and plugin new Replication Manager
> -
>
> Key: HDDS-1207
> URL: https://issues.apache.org/jira/browse/HDDS-1207
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HDDS-1205 brings in new ReplicationManager, this Jira is to refactor 
> ContainerReportProcessing logic in SCM so that it complements 
> ReplicationManager and plugin the new ReplicationManager code. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1207) Refactor Container Report Processing logic and plugin new Replication Manager

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1207?focusedWorklogId=222010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222010
 ]

ASF GitHub Bot logged work on HDDS-1207:


Author: ASF GitHub Bot
Created on: 02/Apr/19 21:26
Start Date: 02/Apr/19 21:26
Worklog Time Spent: 10m 
  Work Description: arp7 commented on pull request #662: HDDS-1207. 
Refactor Container Report Processing logic and plugin new Replication Manager.
URL: https://github.com/apache/hadoop/pull/662#discussion_r271502348
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerReportHandler.java
 ##
 @@ -146,68 +111,89 @@ public void onMessage(final ContainerReportFromDatanode 
reportFromDatanode,
 
   }
 
+  /**
+   * Processes the ContainerReport.
+   *
+   * @param datanodeDetails Datanode from which this report was received
+   * @param replicas list of ContainerReplicaProto
+   */
   private void processContainerReplicas(final DatanodeDetails datanodeDetails,
-  final List<ContainerReplicaProto> replicas,
-  final EventPublisher publisher) {
-final PendingDeleteStatusList pendingDeleteStatusList =
-new PendingDeleteStatusList(datanodeDetails);
+  final List<ContainerReplicaProto> replicas) {
 for (ContainerReplicaProto replicaProto : replicas) {
   try {
-final ContainerID containerID = ContainerID.valueof(
-replicaProto.getContainerID());
-
-ReportHandlerHelper.processContainerReplica(containerManager,
-containerID, replicaProto, datanodeDetails, publisher, LOG);
-
-final ContainerInfo containerInfo = containerManager
-.getContainer(containerID);
-
-if (containerInfo.getDeleteTransactionId() >
-replicaProto.getDeleteTransactionId()) {
-  pendingDeleteStatusList
-  .addPendingDeleteStatus(replicaProto.getDeleteTransactionId(),
-  containerInfo.getDeleteTransactionId(),
-  containerInfo.getContainerID());
-}
+processContainerReplica(datanodeDetails, replicaProto);
   } catch (ContainerNotFoundException e) {
-LOG.error("Received container report for an unknown container {} from"
-+ " datanode {} {}", replicaProto.getContainerID(),
+LOG.error("Received container report for an unknown container" +
+" {} from datanode {}.", replicaProto.getContainerID(),
 datanodeDetails, e);
   } catch (IOException e) {
-LOG.error("Exception while processing container report for container"
-+ " {} from datanode {} {}", replicaProto.getContainerID(),
+LOG.error("Exception while processing container report for container" +
+" {} from datanode {}.", replicaProto.getContainerID(),
 datanodeDetails, e);
   }
 }
-if (pendingDeleteStatusList.getNumPendingDeletes() > 0) {
-  publisher.fireEvent(SCMEvents.PENDING_DELETE_STATUS,
-  pendingDeleteStatusList);
-}
   }
 
-  private void checkReplicationState(ContainerID containerID,
-  EventPublisher publisher) {
-try {
-  ContainerInfo container = containerManager.getContainer(containerID);
-  replicateIfNeeded(container, publisher);
-} catch (ContainerNotFoundException ex) {
-  LOG.warn("Container is missing from containerStateManager. Can't request 
"
-  + "replication. {} {}", containerID, ex);
+  /**
+   * Process the missing replica on the given datanode.
+   *
+   * @param datanodeDetails DatanodeDetails
+   * @param missingReplicas ContainerID which are missing on the given datanode
+   */
+  private void processMissingReplicas(final DatanodeDetails datanodeDetails,
+  final Set<ContainerID> missingReplicas) {
+for (ContainerID id : missingReplicas) {
+  try {
+containerManager.getContainerReplicas(id).stream()
+.filter(replica -> replica.getDatanodeDetails()
+.equals(datanodeDetails)).findFirst()
+.ifPresent(replica -> {
+  try {
+containerManager.removeContainerReplica(id, replica);
+  } catch (ContainerNotFoundException |
+  ContainerReplicaNotFoundException ignored) {
+// This should not happen, but even if it happens, not an issue
+  }
+});
+  } catch (ContainerNotFoundException e) {
+LOG.warn("Cannot remove container replica, container {} not found.",
+id, e);
+  }
 }
-
   }
 
-  private void replicateIfNeeded(ContainerInfo container,
-  EventPublisher publisher) throws ContainerNotFoundException {
-if (!container.isOpen() && replicationStatus.isReplicationEnabled()) {
-  final int existingReplicas = containerManager
-  .getContainerReplicas(container.containerID()).size();
-  

[jira] [Commented] (HDDS-1348) Refactor BlockOutputStream Class

2019-04-02 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808154#comment-16808154
 ] 

Hadoop QA commented on HDDS-1348:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
4s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
25s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 27s{color} | {color:orange} hadoop-hdds: The patch generated 5 new + 0 
unchanged - 0 fixed = 5 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 48s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 31s{color} 
| {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 31s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.common.TestDatanodeStateMachine |
|   | hadoop.ozone.TestMiniChaosOzoneCluster |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2625/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1348 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12964611/HDDS-1348.000.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 062a9f7cc0e1 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 

[jira] [Updated] (HDDS-1354) Kerberos principal configuration of OzoneManager doesn't use FQDN

2019-04-02 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-1354:
-
Labels: regression-test  (was: )

> Kerberos principal configuration of OzoneManager doesn't use FQDN
> -
>
> Key: HDDS-1354
> URL: https://issues.apache.org/jira/browse/HDDS-1354
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Elek, Marton
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: regression-test
>
> In the "*.kerberos.principal" settings hadoop supports the _HOST variable 
> which is replaced to the fully qualified domain name.
> For example:
> {code}
> OZONE-SITE.XML_hdds.scm.kerberos.principal: "scm/_h...@example.com"
> {code}
> It works well with SCM, but for OM it uses the hostname instead of the FQDN. 
> (SCM uses HddsServerUtil.getScmBlockClientBindAddress, which uses the _bind_ 
> address, but OM uses the OM RPC address.)
> I would suggest using the same behaviour for both SCM and OM.
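
For illustration, a minimal sketch, assuming Hadoop's SecurityUtil and a 
hypothetical om/_HOST principal value, of how the _HOST token is expanded from 
whichever address is handed in; the choice of address (bind address vs. RPC 
address) is exactly what decides whether the short hostname or the FQDN ends up 
in the principal:

{code}
// Hedged sketch (not the SCM/OM code): expanding _HOST via SecurityUtil.
// The principal value and address below are hypothetical placeholders.
import java.io.IOException;
import java.net.InetAddress;
import org.apache.hadoop.security.SecurityUtil;

public class PrincipalExpansionSketch {
  public static void main(String[] args) throws IOException {
    String principalConfig = "om/_HOST@EXAMPLE.COM";  // hypothetical config value
    InetAddress addr = InetAddress.getLocalHost();    // stand-in for the chosen address
    // _HOST is replaced with the canonical (fully qualified) host name of addr.
    String principal = SecurityUtil.getServerPrincipal(principalConfig, addr);
    System.out.println(principal);  // e.g. om/host1.example.com@EXAMPLE.COM
  }
}
{code}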



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1354) Kerberos principal configuration of OzoneManager doesn't use FQDN

2019-04-02 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-1354:
-
Target Version/s: 0.5.0  (was: 0.4.0)

> Kerberos principal configuration of OzoneManager doesn't use FQDN
> -
>
> Key: HDDS-1354
> URL: https://issues.apache.org/jira/browse/HDDS-1354
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.4.0
>Reporter: Elek, Marton
>Assignee: Ajay Kumar
>Priority: Major
>
> In the "*.kerberos.principal" settings hadoop supports the _HOST variable 
> which is replaced to the fully qualified domain name.
> For example:
> {code}
> OZONE-SITE.XML_hdds.scm.kerberos.principal: "scm/_h...@example.com"
> {code}
> It works well with SCM, but for OM it uses the hostname instead of the FQDN. 
> (SCM uses HddsServerUtil.getScmBlockClientBindAddress, which uses the _bind_ 
> address, but OM uses the OM RPC address.)
> I would suggest using the same behaviour for both SCM and OM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808138#comment-16808138
 ] 

Eric Yang edited comment on HDFS-14394 at 4/2/19 8:43 PM:
--

[~stakiar] the -pedantic-errors flag forces syntax that is not defined by standard C 
to error out.  This means the Hadoop unit tests have already been using the GNU 
dialect without realizing it.  GCC is GPLv3 licensed, and it looks like Apache is not ok with 
GPLv3 according to this statement: 
https://apache.org/licenses/GPL-compatibility.html
This means the problem is concerning based on the latest information.  +0 from 
my side.  Others may provide more insights on how to address this matter.


was (Author: eyang):
[~stakiar] the -pedantic-errors flag forces syntax that is not defined by standard C 
to error out.  This means the Hadoop unit tests have already been using the GNU 
dialect without realizing it.  GCC is GPLv3 licensed, and it looks like Apache is ok with GPLv3 
according to this statement: https://apache.org/licenses/GPL-compatibility.html
This means the patch is probably ok based on the latest information.  +1 from my 
side.

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808138#comment-16808138
 ] 

Eric Yang commented on HDFS-14394:
--

[~stakiar] the -pedantic-errors flag forces syntax that is not defined by standard C 
to error out.  This means the Hadoop unit tests have already been using the GNU 
dialect without realizing it.  GCC is GPLv3 licensed, and it looks like Apache is ok with GPLv3 
according to this statement: https://apache.org/licenses/GPL-compatibility.html
This means the patch is probably ok based on the latest information.  +1 from my 
side.

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1373) KeyOutputStream, close after write request fails after retries, runs into IllegalArgumentException

2019-04-02 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee reassigned HDDS-1373:
-

Assignee: Shashikant Banerjee

> KeyOutputStream, close after write request fails after retries, runs into 
> IllegalArgumentException
> --
>
> Key: HDDS-1373
> URL: https://issues.apache.org/jira/browse/HDDS-1373
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.4.0, 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> In this code, the stream is closed via try-with-resources.
> {code}
>   try (OzoneOutputStream stream = ozoneBucket.createKey(keyName,
>   bufferCapacity, ReplicationType.RATIS, ReplicationFactor.THREE,
>   new HashMap<>())) {
> stream.write(buffer.array());
>   } catch (Exception e) {
> LOG.error("LOADGEN: Create key:{} failed with exception", keyName, e);
> break;
>   }
> {code}
> Here, the write call fails as expected; however, the close doesn't fail with 
> the same exception.
> The exception stack trace is as follows:
> {code}
> 2019-04-03 00:52:54,116 ERROR ozone.MiniOzoneLoadGenerator 
> (MiniOzoneLoadGenerator.java:load(101)) - LOADGEN: Create 
> key:pool-431-thread-9-8126 failed with exception
> java.io.IOException: Retry request failed. retries get failed due to exceeded 
> maximum allowed retries number: 5
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:492)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:287)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
> at java.io.OutputStream.write(OutputStream.java:75)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:99)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:137)
> at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Suppressed: java.lang.IllegalArgumentException
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:643)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:100)
> ... 5 more
> {code}




[jira] [Updated] (HDDS-1370) Command Execution in Datanode fails because of NPE

2019-04-02 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-1370:

Labels: MiniOzoneChaosCluster  (was: )

> Command Execution in Datanode fails because of NPE
> -
>
> Key: HDDS-1370
> URL: https://issues.apache.org/jira/browse/HDDS-1370
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> The command execution on the datanode is failing with the following exception.
> {code}
> 2019-04-02 23:56:30,434 ERROR statemachine.DatanodeStateMachine 
> (DatanodeStateMachine.java:start(196)) - Unable to finish the execution.
> java.lang.NullPointerException
> at 
> java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:179)
> at 
> org.apache.hadoop.ozone.container.common.states.datanode.RunningDatanodeState.execute(RunningDatanodeState.java:89)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.StateContext.execute(StateContext.java:354)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:183)
> at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:338)
> at java.lang.Thread.run(Thread.java:748)
> {code}
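
ExecutorCompletionService.submit rejects a null task up front, so an NPE at that 
frame most likely means the state's task lookup handed a null Callable to the 
completion service. A minimal standalone sketch (not the datanode code) 
reproducing that failure mode:

{code}
// Minimal sketch: submitting a null Callable to an ExecutorCompletionService
// throws NullPointerException immediately, matching the stack trace above.
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NullTaskSubmitSketch {
  public static void main(String[] args) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    ExecutorCompletionService<Void> ecs = new ExecutorCompletionService<>(executor);
    Callable<Void> task = null;  // stands in for a state task lookup that returned null
    try {
      ecs.submit(task);          // throws NullPointerException
    } catch (NullPointerException e) {
      System.out.println("null task rejected: " + e);
    } finally {
      executor.shutdownNow();
    }
  }
}
{code}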



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1372) getContainerWithPipeline for a standalone pipeline fails with ConcurrentModificationException

2019-04-02 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-1372:

Labels: MiniOzoneChaosCluster  (was: )

> getContainerWithPipeline for a standalone pipeline fails with 
> ConcurrentModificationException
> -
>
> Key: HDDS-1372
> URL: https://issues.apache.org/jira/browse/HDDS-1372
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> The exception is hit while fetching a pipeline during a read.
> {code}
> 2019-04-03 00:52:50,125 WARN  ipc.Server (Server.java:logException(2724)) - 
> IPC Server handler 16 on 59758, call Call#2270 Retry#0 
> org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.getC
> ontainerWithPipeline from 192.168.0.108:60011
> java.util.ConcurrentModificationException
> at 
> java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1558)
> at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at 
> org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.getContainerWithPipeline(SCMClientProtocolServer.java:252)
> at 
> org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.getContainerWithPipeline(StorageContainerLocationProtocolServerSideTranslatorPB.java:144)
> at 
> org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16390)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1373) KeyOutputStream, close after write request fails after retries, runs into IllegalArgumentException

2019-04-02 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-1373:

Target Version/s: 0.5.0
  Labels: MiniOzoneChaosCluster  (was: )

> KeyOutputStream, close after write request fails after retries, runs into 
> IllegalArgumentException
> --
>
> Key: HDDS-1373
> URL: https://issues.apache.org/jira/browse/HDDS-1373
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Affects Versions: 0.4.0, 0.5.0
>Reporter: Mukul Kumar Singh
>Priority: Major
>  Labels: MiniOzoneChaosCluster
>
> In this code, the stream is closed via try-with-resources.
> {code}
>   try (OzoneOutputStream stream = ozoneBucket.createKey(keyName,
>   bufferCapacity, ReplicationType.RATIS, ReplicationFactor.THREE,
>   new HashMap<>())) {
> stream.write(buffer.array());
>   } catch (Exception e) {
> LOG.error("LOADGEN: Create key:{} failed with exception", keyName, e);
> break;
>   }
> {code}
> Here, the write call fails as expected; however, the close doesn't fail with 
> the same exception.
> The exception stack trace is as follows:
> {code}
> 2019-04-03 00:52:54,116 ERROR ozone.MiniOzoneLoadGenerator 
> (MiniOzoneLoadGenerator.java:load(101)) - LOADGEN: Create 
> key:pool-431-thread-9-8126 failed with exception
> java.io.IOException: Retry request failed. retries get failed due to exceeded 
> maximum allowed retries number: 5
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:492)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:287)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
> at java.io.OutputStream.write(OutputStream.java:75)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:99)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:137)
> at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Suppressed: java.lang.IllegalArgumentException
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
> at 
> org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:643)
> at 
> org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
> at 
> org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:100)
> ... 5 more
> {code}



--
This 

[jira] [Created] (HDDS-1373) KeyOutputStream, close after write request fails after retries, runs into IllegalArgumentException

2019-04-02 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-1373:
---

 Summary: KeyOutputStream, close after write request fails after 
retries, runs into IllegalArgumentException
 Key: HDDS-1373
 URL: https://issues.apache.org/jira/browse/HDDS-1373
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Affects Versions: 0.4.0, 0.5.0
Reporter: Mukul Kumar Singh


In this code, the stream is closed via try-with-resources.

{code}
  try (OzoneOutputStream stream = ozoneBucket.createKey(keyName,
  bufferCapacity, ReplicationType.RATIS, ReplicationFactor.THREE,
  new HashMap<>())) {
stream.write(buffer.array());
  } catch (Exception e) {
LOG.error("LOADGEN: Create key:{} failed with exception", keyName, e);
break;
  }
{code}

Here, the write call fails as expected; however, the close doesn't fail with the 
same exception.

The exception stack trace is as follows:

{code}
2019-04-03 00:52:54,116 ERROR ozone.MiniOzoneLoadGenerator 
(MiniOzoneLoadGenerator.java:load(101)) - LOADGEN: Create 
key:pool-431-thread-9-8126 failed with exception
java.io.IOException: Retry request failed. retries get failed due to exceeded 
maximum allowed retries number: 5
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:492)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleRetry(KeyOutputStream.java:514)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleException(KeyOutputStream.java:468)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:344)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:287)
at 
org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
at java.io.OutputStream.write(OutputStream.java:75)
at 
org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:99)
at 
org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:137)
at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Suppressed: java.lang.IllegalArgumentException
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
at 
org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:643)
at 
org.apache.hadoop.ozone.client.io.OzoneOutputStream.close(OzoneOutputStream.java:60)
at 
org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:100)
... 5 more
{code}
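
For background on the "Suppressed" frame above: with try-with-resources, once the 
body has thrown, a failure from close() is attached to the primary exception as a 
suppressed exception instead of surfacing on its own. A minimal sketch of that 
mechanism, unrelated to the Ozone classes:

{code}
// Minimal sketch of try-with-resources suppression: the body throws, close()
// also throws, and the close() failure shows up via getSuppressed().
public class SuppressedCloseSketch {
  static class FailingStream implements AutoCloseable {
    void write() throws java.io.IOException {
      throw new java.io.IOException("write failed");
    }
    @Override
    public void close() {
      throw new IllegalArgumentException("close failed");
    }
  }

  public static void main(String[] args) {
    try (FailingStream stream = new FailingStream()) {
      stream.write();
    } catch (Exception e) {
      System.out.println("primary: " + e);
      for (Throwable s : e.getSuppressed()) {
        System.out.println("suppressed: " + s);  // the close() failure appears here
      }
    }
  }
}
{code}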



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14327) Using FQDN instead of IP to access servers with DNS resolving

2019-04-02 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808110#comment-16808110
 ] 

Fengnan Li commented on HDFS-14327:
---

Thanks for the review [~elgoiri]! I will work on other related issues as well.

> Using FQDN instead of IP to access servers with DNS resolving
> -
>
> Key: HDFS-14327
> URL: https://issues.apache.org/jira/browse/HDFS-14327
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14327.001.patch, HDFS-14327.002.patch
>
>
> With [HDFS-14118|https://issues.apache.org/jira/browse/HDFS-14118], clients 
> can get the IP of the servers (NN/Routers) and use the IP addresses to access 
> the machine. This will fail in a secure environment, as Kerberos uses the 
> domain name (FQDN) in the principal and so won't recognize the IP addresses.
> This task mainly adds a reverse lookup on top of the current flow to get the 
> domain name after the IP is fetched. After that, clients will still use the 
> domain name to access the servers.
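
For illustration only, a minimal sketch of the reverse-lookup step using plain 
java.net APIs; the IP literal is a hypothetical placeholder and this is not the 
actual patch:

{code}
// Minimal sketch: resolve the fetched IP back to a fully qualified domain name
// so Kerberos principals (which carry the FQDN) still match.
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReverseLookupSketch {
  public static void main(String[] args) throws UnknownHostException {
    String resolvedIp = "192.0.2.10";           // hypothetical IP obtained from DNS resolving
    InetAddress addr = InetAddress.getByName(resolvedIp);
    String fqdn = addr.getCanonicalHostName();  // reverse lookup; falls back to the IP string
    System.out.println("use " + fqdn + " instead of " + resolvedIp);
  }
}
{code}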



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?focusedWorklogId=221940=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221940
 ]

ASF GitHub Bot logged work on HDDS-1353:


Author: ASF GitHub Bot
Created on: 02/Apr/19 19:45
Start Date: 02/Apr/19 19:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #681: HDDS-1353 
: Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
because of BackgroundPipelineCreator.
URL: https://github.com/apache/hadoop/pull/681#discussion_r271467369
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestSCMPipelineManager.java
 ##
 @@ -208,4 +212,60 @@ public void testPipelineReport() throws IOException {
 // clean up
 pipelineManager.close();
   }
+
+  @Test
+  public void testPipelineCreationFailedMetric() throws Exception {
+SCMPipelineManager pipelineManager =
+new SCMPipelineManager(conf, nodeManager, new EventQueue());
+PipelineProvider mockRatisProvider =
+new MockRatisPipelineProvider(nodeManager,
+pipelineManager.getStateManager(), conf);
+pipelineManager.setPipelineProvider(HddsProtos.ReplicationType.RATIS,
+mockRatisProvider);
+
+MetricsRecordBuilder metrics = getMetrics(
+SCMPipelineMetrics.class.getSimpleName());
+long numPipelineCreated = getLongCounter("NumPipelineCreated",
+metrics);
+Assert.assertTrue(numPipelineCreated == 0);
+
+// 3 DNs are unhealthy.
+// Create 5 pipelines (Use up 15 Datanodes)
+for (int i = 0; i < 5; i++) {
+  Pipeline pipeline = pipelineManager
+  .createPipeline(HddsProtos.ReplicationType.RATIS,
+  HddsProtos.ReplicationFactor.THREE);
+  Assert.assertNotNull(pipeline);
+}
+
+metrics = getMetrics(
+SCMPipelineMetrics.class.getSimpleName());
+numPipelineCreated = getLongCounter("NumPipelineCreated", metrics);
+Assert.assertTrue(numPipelineCreated == 5);
+
+long numPipelineCreateFailed = getLongCounter(
+"NumPipelineCreationFailed", metrics);
+Assert.assertTrue(numPipelineCreateFailed == 0);
+
+//This should fail...
+try {
+  pipelineManager.createPipeline(HddsProtos.ReplicationType.RATIS,
+  HddsProtos.ReplicationFactor.THREE);
+  Assert.fail();
+} catch (InsufficientDatanodesException idEx) {
+  Assert.assertEquals(
+  "Cannot create pipeline of factor 3 using 1 nodes.",
+  idEx.getMessage());
+}
+
+metrics = getMetrics(
+SCMPipelineMetrics.class.getSimpleName());
+numPipelineCreated = getLongCounter("NumPipelineCreated", metrics);
+Assert.assertTrue(numPipelineCreated == 5);
+
+numPipelineCreateFailed = getLongCounter(
+"NumPipelineCreationFailed", metrics);
+Assert.assertTrue(numPipelineCreateFailed == 0);
+  }
+  
 
 Review comment:
   whitespace:end of line
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221940)
Time Spent: 20m  (was: 10m)

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of BackgroundPipelineCreator
> ---
>
> Key: HDDS-1353
> URL: https://issues.apache.org/jira/browse/HDDS-1353
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: newbie, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There is a {{BackgroundPipelineCreator}} thread in SCM which runs at a fixed 
> interval and tries to create pipelines. This BackgroundPipelineCreator uses 
> {{IOException}} as its exit criterion (no more pipelines can be created). In each 
> run of BackgroundPipelineCreator we exit when we are not able to create any 
> more pipelines, i.e. when we get an IOException while trying to create the 
> pipeline. This means that the 
> {{scm_pipeline_metrics_num_pipeline_creation_failed}} value gets 
> incremented in each run of BackgroundPipelineCreator.
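
For illustration, a minimal sketch (not the SCM code) of the pattern described 
above: the IOException from the create call doubles as the loop's exit condition, 
so every scheduled run ends by bumping the failure counter once even when it 
simply ran out of usable datanodes:

{code}
// Minimal sketch of the described pattern; names and behavior are illustrative only.
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

public class BackgroundCreatorSketch {
  private static final AtomicLong NUM_CREATION_FAILED = new AtomicLong();

  // Hypothetical stand-in for the pipeline factory: at some point no more
  // pipelines can be created and an IOException is thrown.
  private static void createPipeline() throws IOException {
    throw new IOException("no more healthy datanodes");
  }

  private static void runOnce() {
    while (true) {
      try {
        createPipeline();
      } catch (IOException e) {
        NUM_CREATION_FAILED.incrementAndGet();  // counted as a failure...
        return;                                 // ...but also the normal exit path
      }
    }
  }

  public static void main(String[] args) {
    for (int run = 0; run < 3; run++) {
      runOnce();  // each scheduled run bumps the counter once
    }
    System.out.println("NumPipelineCreationFailed = " + NUM_CREATION_FAILED.get());
  }
}
{code}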



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?focusedWorklogId=221941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221941
 ]

ASF GitHub Bot logged work on HDDS-1353:


Author: ASF GitHub Bot
Created on: 02/Apr/19 19:45
Start Date: 02/Apr/19 19:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #681: HDDS-1353 : 
Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
because of BackgroundPipelineCreator.
URL: https://github.com/apache/hadoop/pull/681#issuecomment-479168294
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 39 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 94 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1140 | trunk passed |
   | +1 | compile | 1057 | trunk passed |
   | +1 | checkstyle | 213 | trunk passed |
   | +1 | mvnsite | 117 | trunk passed |
   | +1 | shadedclient | 1085 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 49 | trunk passed |
   | +1 | javadoc | 56 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for patch |
   | +1 | mvninstall | 65 | the patch passed |
   | +1 | compile | 1031 | the patch passed |
   | +1 | javac | 1031 | the patch passed |
   | +1 | checkstyle | 209 | the patch passed |
   | +1 | mvnsite | 80 | the patch passed |
   | -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply |
   | +1 | shadedclient | 708 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | +1 | findbugs | 55 | the patch passed |
   | +1 | javadoc | 54 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 119 | server-scm in the patch passed. |
   | -1 | unit | 971 | integration-test in the patch failed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 7130 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.TestMiniOzoneCluster |
   |   | hadoop.ozone.client.rpc.TestOzoneClientRetriesOnException |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.hdds.scm.pipeline.TestSCMPipelineManager |
   |   | hadoop.ozone.TestContainerStateMachineIdempotency |
   |   | hadoop.ozone.om.TestOzoneManagerHA |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/681 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  |
   | uname | Linux b02fcadd3e3e 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri 
Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / bfc90bd |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | whitespace | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/1/artifact/out/whitespace-eol.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/1/artifact/out/patch-unit-hadoop-ozone_integration-test.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/1/testReport/ |
   | Max. process+thread count | 4138 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/server-scm hadoop-ozone/integration-test U: . |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-681/1/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221941)

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of 

[jira] [Created] (HDDS-1372) getContainerWithPipeline for a standalone pipeline fails with ConcurrentModificationException

2019-04-02 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-1372:
---

 Summary: getContainerWithPipeline for a standalone pipeline fails 
with ConcurrentModificationException
 Key: HDDS-1372
 URL: https://issues.apache.org/jira/browse/HDDS-1372
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Affects Versions: 0.5.0
Reporter: Mukul Kumar Singh


The exception is hit while fetching a pipeline during a read.

{code}
2019-04-03 00:52:50,125 WARN  ipc.Server (Server.java:logException(2724)) - IPC 
Server handler 16 on 59758, call Call#2270 Retry#0 
org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.getC
ontainerWithPipeline from 192.168.0.108:60011
java.util.ConcurrentModificationException
at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1558)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at 
java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at 
org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.getContainerWithPipeline(SCMClientProtocolServer.java:252)
at 
org.apache.hadoop.ozone.protocolPB.StorageContainerLocationProtocolServerSideTranslatorPB.getContainerWithPipeline(StorageContainerLocationProtocolServerSideTranslatorPB.java:144)
at 
org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:16390)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
{code}
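
For context, a minimal standalone sketch (not the SCM code) of the underlying 
hazard: collecting a stream over a plain HashMap's key set while another thread 
mutates the map can fail fast with ConcurrentModificationException:

{code}
// Minimal sketch: a reader streams over a HashMap key set while a writer keeps
// adding entries; the fail-fast iterator usually throws, though it is not guaranteed.
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ConcurrentModificationSketch {
  public static void main(String[] args) throws InterruptedException {
    Map<Integer, String> containers = new HashMap<>();
    for (int i = 0; i < 100_000; i++) {
      containers.put(i, "container-" + i);
    }
    Thread writer = new Thread(() -> {
      for (int i = 100_000; i < 200_000; i++) {
        containers.put(i, "container-" + i);  // concurrent structural modification
      }
    });
    writer.start();
    try {
      containers.keySet().stream()            // mirrors the read path in the trace above
          .filter(id -> id % 2 == 0)
          .collect(Collectors.toList());
    } catch (java.util.ConcurrentModificationException e) {
      System.out.println("fail-fast iteration: " + e);
    }
    writer.join();
  }
}
{code}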



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14400) Namenode ExpiredHeartbeats metric

2019-04-02 Thread Karthik Palanisamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-14400:
--
Resolution: Not A Bug
Status: Resolved  (was: Patch Available)

> Namenode ExpiredHeartbeats metric
> -
>
> Key: HDFS-14400
> URL: https://issues.apache.org/jira/browse/HDFS-14400
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Minor
> Attachments: HDFS-14400-001.patch, HDFS-14400-002.patch, 
> HDFS-14400-003.patch
>
>
> Noticed incorrect value in ExpiredHeartbeats metrics under namenode JMX.
> We increment the ExpiredHeartbeats count when a Datanode is dead, but somehow 
> we miss decrementing it when the datanode comes back alive.
> {code}
> { "name" : "Hadoop:service=NameNode,name=FSNamesystem", "modelerType" : 
> "FSNamesystem", "tag.Context" : "dfs", "tag.TotalSyncTimes" : "7 ", 
> "tag.HAState" : "active", ... "ExpiredHeartbeats" : 2, ... }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13960) hdfs dfs -checksum command should optionally show block size in output

2019-04-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808088#comment-16808088
 ] 

Hudson commented on HDFS-13960:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16329 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16329/])
HDFS-13960. hdfs dfs -checksum command should optionally show block size 
(weichiu: rev cf268114c9af2e33f35d0c24b57e31ef4d5e8353)
* (edit) 
hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
* (edit) hadoop-common-project/hadoop-common/src/test/resources/testConf.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Display.java


> hdfs dfs -checksum command should optionally show block size in output
> --
>
> Key: HDFS-13960
> URL: https://issues.apache.org/jira/browse/HDFS-13960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Adam Antal
>Assignee: Lokesh Jain
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-13960.001.patch, HDFS-13960.002.patch, 
> HDFS-13960.003.patch
>
>
> The hdfs checksum command computes the checksum in a distributed manner, 
> which would take into account the block size. In other words, the block size 
> determines how the file will be broken up.
> Therefore it can happen that the checksum command produces different outputs 
> for the exact same file only differing in the block size: 
> checksum(fileABlock1) + checksum(fileABlock2) != checksum(fileABlock1 + 
> fileABlock2)
> I suggest adding an option to the hdfs dfs -checksum command that would 
> display the block size along with the output; that could also be helpful 
> in some other cases where this piece of information is needed.
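
To make the block-size dependence concrete, a small standalone sketch using plain 
CRC32 (not HDFS's actual MD5-of-CRC composite): the same bytes hashed block by 
block produce different composite values when the block size differs:

{code}
// Minimal sketch: a composite checksum built from per-block checksums depends on
// how the data is partitioned, which is why reporting the block size matters.
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class BlockChecksumSketch {
  static long compositeChecksum(byte[] data, int blockSize) {
    CRC32 composite = new CRC32();
    for (int off = 0; off < data.length; off += blockSize) {
      int len = Math.min(blockSize, data.length - off);
      CRC32 block = new CRC32();
      block.update(data, off, len);
      composite.update(ByteBuffer.allocate(8).putLong(block.getValue()).array());
    }
    return composite.getValue();
  }

  public static void main(String[] args) {
    byte[] data = new byte[1 << 20];  // 1 MiB of zeros is enough to show the effect
    System.out.println(compositeChecksum(data, 128 * 1024));
    System.out.println(compositeChecksum(data, 256 * 1024));  // typically a different value
  }
}
{code}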



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1348) Refactor BlockOutputStream Class

2019-04-02 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-1348:
--
Status: Patch Available  (was: Open)

> Refactor BlockOutputStream Class
> ---
>
> Key: HDDS-1348
> URL: https://issues.apache.org/jira/browse/HDDS-1348
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1348.000.patch
>
>
> BlockOutputStream contains functionality for handling write, flush and 
> close, as well as tracking commitIndexes. The idea is to separate all 
> commitIndex tracking and management code out of the BlockOutputStream class.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1348) Refactor BlockOutputStream Class

2019-04-02 Thread Shashikant Banerjee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDDS-1348:
--
Attachment: HDDS-1348.000.patch

> Refactor BlockOutputStream Class
> ---
>
> Key: HDDS-1348
> URL: https://issues.apache.org/jira/browse/HDDS-1348
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Client
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: HDDS-1348.000.patch
>
>
> BlockOutputStream contains functionality for handling write, flush and 
> close, as well as tracking commitIndexes. The idea is to separate all 
> commitIndex tracking and management code out of the BlockOutputStream class.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808067#comment-16808067
 ] 

Sahil Takiar commented on HDFS-14394:
-

I can add the {{-fextended-identifiers}} flag without any issues. I added 
{{-pedantic-errors}} and there are a ton of warnings that have now become errors 
(including errors from unit tests and third-party libraries). As you pointed 
out, I don't see a good way of applying {{CMAKE_C_FLAGS}} to specific projects, 
but if someone has a smart way of doing so I'm open to at least fixing the 
errors in libhdfs.

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14400) Namenode ExpiredHeartbeats metric

2019-04-02 Thread Karthik Palanisamy (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808068#comment-16808068
 ] 

Karthik Palanisamy commented on HDFS-14400:
---

Thanks for the confirmation [~goiri].

I thought it pointed to the current expired heartbeats, but it is meant to be the 
total expired heartbeats.

Let me resolve this JIRA as Invalid. 

> Namenode ExpiredHeartbeats metric
> -
>
> Key: HDFS-14400
> URL: https://issues.apache.org/jira/browse/HDFS-14400
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Minor
> Attachments: HDFS-14400-001.patch, HDFS-14400-002.patch, 
> HDFS-14400-003.patch
>
>
> Noticed incorrect value in ExpiredHeartbeats metrics under namenode JMX.
> We increment the ExpiredHeartbeats count when a Datanode is dead, but somehow 
> we miss decrementing it when the datanode comes back alive.
> {code}
> { "name" : "Hadoop:service=NameNode,name=FSNamesystem", "modelerType" : 
> "FSNamesystem", "tag.Context" : "dfs", "tag.TotalSyncTimes" : "7 ", 
> "tag.HAState" : "active", ... "ExpiredHeartbeats" : 2, ... }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13960) hdfs dfs -checksum command should optionally show block size in output

2019-04-02 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13960:
---
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

Thanks for reviewing the patch [~adam.antal] and contributing the patch 
[~ljain]! Pushed the 003 patch to trunk.

> hdfs dfs -checksum command should optionally show block size in output
> --
>
> Key: HDFS-13960
> URL: https://issues.apache.org/jira/browse/HDFS-13960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Adam Antal
>Assignee: Lokesh Jain
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-13960.001.patch, HDFS-13960.002.patch, 
> HDFS-13960.003.patch
>
>
> The hdfs checksum command computes the checksum in a distributed manner, 
> which would take into account the block size. In other words, the block size 
> determines how the file will be broken up.
> Therefore it can happen that the checksum command produces different outputs 
> for the exact same file only differing in the block size: 
> checksum(fileABlock1) + checksum(fileABlock2) != checksum(fileABlock1 + 
> fileABlock2)
> I suggest adding an option to the hdfs dfs -checksum command that would 
> display the block size along with the output; that could also be helpful 
> in some other cases where this piece of information is needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13960) hdfs dfs -checksum command should optionally show block size in output

2019-04-02 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808062#comment-16808062
 ] 

Wei-Chiu Chuang commented on HDFS-13960:


+1. Failed tests are unrelated. Will commit soon.

> hdfs dfs -checksum command should optionally show block size in output
> --
>
> Key: HDFS-13960
> URL: https://issues.apache.org/jira/browse/HDFS-13960
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Adam Antal
>Assignee: Lokesh Jain
>Priority: Minor
> Attachments: HDFS-13960.001.patch, HDFS-13960.002.patch, 
> HDFS-13960.003.patch
>
>
> The hdfs checksum command computes the checksum in a distributed manner, 
> which would take into account the block size. In other words, the block size 
> determines how the file will be broken up.
> Therefore it can happen that the checksum command produces different outputs 
> for the exact same file only differing in the block size: 
> checksum(fileABlock1) + checksum(fileABlock2) != checksum(fileABlock1 + 
> fileABlock2)
> I suggest adding an option to the hdfs dfs -checksum command that would 
> display the block size along with the output; that could also be helpful 
> in some other cases where this piece of information is needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy

2019-04-02 Thread Karthik Palanisamy (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808050#comment-16808050
 ] 

Karthik Palanisamy commented on HDFS-14383:
---

{quote}I suppose this load issue wouldn't occur if you configure datanodes such 
that they have some HOT and some COLD volumes.
{quote}
Exactly [~jojochuang]. The problem only occurs with datanodes that are dedicated 
to either COLD or HOT storage.

IMO it is typical to see this setup because the hardware and resources are 
specific to COLD/HOT storage.

> Compute datanode load based on StoragePolicy
> 
>
> Key: HDFS-14383
> URL: https://issues.apache.org/jira/browse/HDFS-14383
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.7.3, 3.1.2
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>
> The datanode load check logic needs to be changed because the existing 
> computation does not consider StoragePolicy.
> DatanodeManager#getInServiceXceiverAverage
> {code}
> public double getInServiceXceiverAverage() {
>  double avgLoad = 0;
>  final int nodes = getNumDatanodesInService();
>  if (nodes != 0) {
>  final int xceivers = heartbeatManager
>  .getInServiceXceiverCount();
>  avgLoad = (double)xceivers/nodes;
>  }
>  return avgLoad;
> }
> {code}
>  
> For example: with 10 HOT nodes averaging 50 xceivers and 90 COLD nodes 
> averaging 10 xceivers, the threshold calculated by the NN is 28 
> (((500 + 900)/100)*2), which means those 10 nodes (the whole HOT tier) become 
> unavailable even though the COLD tier nodes are barely in use. Turning this 
> check off helps to mitigate the issue; however, 
> dfs.namenode.replication.considerLoad helps to "balance" the load of the DNs, 
> so turning it off can lead to situations where specific DNs are 
> "overloaded".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1371) Download RocksDB checkpoint from OM Leader to Follower

2019-04-02 Thread Hanisha Koneru (JIRA)
Hanisha Koneru created HDDS-1371:


 Summary: Download RocksDB checkpoint from OM Leader to Follower
 Key: HDDS-1371
 URL: https://issues.apache.org/jira/browse/HDDS-1371
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Hanisha Koneru
Assignee: Hanisha Koneru


If a follower OM is lagging far behind the leader OM, or in case of a restart or 
bootstrapping, the follower OM might need a RocksDB checkpoint from the leader to 
catch up with it. This is because the leader might have purged its logs after 
taking a snapshot.
This Jira aims to add support for downloading a RocksDB checkpoint from the leader 
OM to a follower OM through an HTTP servlet. We reuse the servlet used by the 
Recon server. 
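A minimal, hedged sketch of the follower-side download step using only the JDK; the servlet path, port, and tarball name below are assumptions made for illustration, not the actual OM endpoint defined by the patch:

{code:java}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Illustrative only: pull a RocksDB checkpoint archive from a leader's HTTP servlet. */
public class CheckpointDownloader {

  public static Path download(String leaderHttpAddress, Path targetDir)
      throws Exception {
    // Hypothetical servlet path; the real endpoint is defined by the Recon-style servlet.
    URL url = new URL("http://" + leaderHttpAddress + "/dbCheckpoint");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
      throw new IllegalStateException("Checkpoint download failed: "
          + conn.getResponseCode());
    }
    Files.createDirectories(targetDir);
    Path tarball = targetDir.resolve("om.db.checkpoint.tar.gz"); // assumed name
    try (InputStream in = conn.getInputStream()) {
      Files.copy(in, tarball, StandardCopyOption.REPLACE_EXISTING);
    }
    return tarball; // the caller would untar and swap it in as the new OM DB
  }

  public static void main(String[] args) throws Exception {
    System.out.println(download("om-leader:9874", Paths.get("/tmp/om-follower-db")));
  }
}
{code}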



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1371) Download RocksDB checkpoint from OM Leader to Follower

2019-04-02 Thread Hanisha Koneru (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-1371:
-
Target Version/s: 0.5.0

> Download RocksDB checkpoint from OM Leader to Follower
> --
>
> Key: HDDS-1371
> URL: https://issues.apache.org/jira/browse/HDDS-1371
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>
> If a follower OM is lagging far behind the leader OM, or in case of a restart 
> or bootstrapping, the follower OM might need a RocksDB checkpoint from the leader 
> to catch up with it. This is because the leader might have purged its logs 
> after taking a snapshot.
> This Jira aims to add support for downloading a RocksDB checkpoint from the leader 
> OM to a follower OM through an HTTP servlet. We reuse the servlet used by the 
> Recon server. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221920
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:48
Start Date: 02/Apr/19 18:48
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479143795
 
 
   I think we need to rebase with trunk to get a Yetus run.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221920)
Time Spent: 3h 20m  (was: 3h 10m)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in a file in the OM metadata dir.
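A simplified sketch of persisting and reloading the last applied log index as a plain file in the OM metadata dir; the file name and layout are assumptions for illustration, the actual patch may store it differently:

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative index-file handling for an OM Ratis snapshot. */
public class SnapshotIndexFile {
  private final Path indexFile;

  public SnapshotIndexFile(Path omMetadataDir) {
    // Hypothetical file name inside the OM metadata dir.
    this.indexFile = omMetadataDir.resolve("om.ratis.snapshot.index");
  }

  /** Called at snapshot time: record the last applied log index on disk. */
  public void save(long lastAppliedIndex) throws IOException {
    Files.write(indexFile,
        Long.toString(lastAppliedIndex).getBytes(StandardCharsets.UTF_8));
  }

  /** Called at startup: reload the index, or -1 if no snapshot was ever taken. */
  public long load() throws IOException {
    if (!Files.exists(indexFile)) {
      return -1L;
    }
    String content = new String(Files.readAllBytes(indexFile),
        StandardCharsets.UTF_8).trim();
    return Long.parseLong(content);
  }
}
{code}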



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221918
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:47
Start Date: 02/Apr/19 18:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #651: HDDS-1339. 
Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#issuecomment-479143473
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 0 | Docker mode activated. |
   | -1 | patch | 7 | https://github.com/apache/hadoop/pull/651 does not apply 
to trunk. Rebase required? Wrong Branch? See 
https://wiki.apache.org/hadoop/HowToContribute for help. |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/651 |
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-651/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221918)
Time Spent: 3h 10m  (was: 3h)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in a file in the OM metadata dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221917
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:45
Start Date: 02/Apr/19 18:45
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271446311
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
 ozone.om.ratis.snapshot.auto.trigger.threshold
-40L
+40
 
 Review comment:
   Then I think having a smaller value makes sense. 
   Do we want to revisit this later and, for now, just go with the Ratis default value 
of 400k?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221917)
Time Spent: 3h  (was: 2h 50m)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in a file in the OM metadata dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14327) Using FQDN instead of IP to access servers with DNS resolving

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808020#comment-16808020
 ] 

Íñigo Goiri commented on HDFS-14327:


[^HDFS-14327.002.patch] LGTM.
+1

> Using FQDN instead of IP to access servers with DNS resolving
> -
>
> Key: HDFS-14327
> URL: https://issues.apache.org/jira/browse/HDFS-14327
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14327.001.patch, HDFS-14327.002.patch
>
>
> With [HDFS-14118|https://issues.apache.org/jira/browse/HDFS-14118], clients 
> can get the IP of the servers (NN/Routers) and use the IP addresses to access 
> the machine. This will fail in a secure environment because Kerberos uses the 
> domain name (FQDN) in the principal, so it won't recognize the IP addresses.
> This task mainly adds a reverse lookup on top of the current logic to get the 
> domain name after the IP is fetched. After that, clients will still use the 
> domain name to access the servers.
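A small sketch of the reverse-lookup step described above, using only the JDK; whether the client caches the result or re-resolves per call is up to the actual patch, and the IP used here is an example:

{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class ReverseLookupExample {

  /** Resolve an IP-based address back to an FQDN-based one for Kerberos principals. */
  static InetSocketAddress toFqdnAddress(InetSocketAddress ipAddr) {
    InetAddress addr = ipAddr.getAddress();
    // getCanonicalHostName() performs the reverse DNS lookup; it falls back to
    // the textual IP if no PTR record exists, so callers may want to validate.
    String fqdn = addr.getCanonicalHostName();
    return new InetSocketAddress(fqdn, ipAddr.getPort());
  }

  public static void main(String[] args) throws Exception {
    InetSocketAddress ip = new InetSocketAddress(
        InetAddress.getByName("10.0.0.12"), 8020); // example IP returned by DNS
    System.out.println(toFqdnAddress(ip));
  }
}
{code}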



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221916=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221916
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:39
Start Date: 02/Apr/19 18:39
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271444007
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1617,7 +1617,7 @@
 
   
 ozone.om.ratis.snapshot.auto.trigger.threshold
-40L
+40
 
 Review comment:
   Yes that's right
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221916)
Time Spent: 2h 50m  (was: 2h 40m)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in a file in the OM metadata dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1339) Implement Ratis Snapshots on OM

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1339?focusedWorklogId=221915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221915
 ]

ASF GitHub Bot logged work on HDDS-1339:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:38
Start Date: 02/Apr/19 18:38
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #651: 
HDDS-1339. Implement ratis snapshots on OM
URL: https://github.com/apache/hadoop/pull/651#discussion_r271443794
 
 

 ##
 File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
 ##
 @@ -161,7 +161,10 @@ public TransactionContext startTransaction(
   @Override
   public long takeSnapshot() throws IOException {
 LOG.info("Saving Ratis snapshot on the OM.");
-return ozoneManager.saveRatisSnapshot();
+if (ozoneManager != null) {
+  return ozoneManager.saveRatisSnapshot();
+}
+return 0;
 
 Review comment:
   We do return the last applied index which is being stored on disk. The null 
check is for tests only (TestOzoneManagerRatisServer).
   Ratis currently does not do anything with the returned value.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221915)
Time Spent: 2h 40m  (was: 2.5h)

> Implement Ratis Snapshots on OM
> ---
>
> Key: HDDS-1339
> URL: https://issues.apache.org/jira/browse/HDDS-1339
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> For bootstrapping and restarting OMs, we need to implement snapshots in OM. 
> The OM state maintained by RocksDB will be checkpointed on demand. Ratis 
> snapshots will only preserve the last applied log index by the State Machine 
> on disk. This index will be stored in a file in the OM metadata dir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13248) RBF: Namenode need to choose block location for the client

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808010#comment-16808010
 ] 

Íñigo Goiri commented on HDFS-13248:


I agree on extending the protocol, but we need a larger task for that with a 
design doc and a larger vote.
The doc should have:
* New protocol.
* Impact to current services.
* How to use it for read.
* How to use it for writes.
Can anybody take on this?

> RBF: Namenode need to choose block location for the client
> --
>
> Key: HDFS-13248
> URL: https://issues.apache.org/jira/browse/HDFS-13248
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Weiwei Wu
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch, 
> HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch, 
> HDFS-13248.005.patch, clientMachine-call-path.jpeg, debug-info-1.jpeg, 
> debug-info-2.jpeg
>
>
> When executing a put operation via the router, the NameNode will choose the block 
> location for the router, not for the real client. This affects the file's 
> locality.
> I think on both the NameNode and the Router we should add a new addBlock method, or 
> add a parameter to the current addBlock method, to pass the real client 
> information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808008#comment-16808008
 ] 

Chen Liang commented on HDFS-13699:
---

Thanks for the clarification [~shv]! I've attached v009 patch to address all 
comments.

> Add DFSClient sending handshake token to DataNode, and allow DataNode 
> overwrite downstream QOP
> --
>
> Key: HDFS-13699
> URL: https://issues.apache.org/jira/browse/HDFS-13699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13699.001.patch, HDFS-13699.002.patch, 
> HDFS-13699.003.patch, HDFS-13699.004.patch, HDFS-13699.005.patch, 
> HDFS-13699.006.patch, HDFS-13699.007.patch, HDFS-13699.008.patch, 
> HDFS-13699.009.patch, HDFS-13699.WIP.001.patch
>
>
> Given the other Jiras under HDFS-13541, this Jira is to allow the DFSClient to 
> forward the encrypted secret to the DataNode. The encrypted message is the QOP 
> that the client and NameNode have used. The DataNode decrypts the message and 
> enforces the QOP for the client connection. This Jira also includes 
> overwriting the downstream QOP, as mentioned in the HDFS-13541 design doc. 
> Namely, this is to allow an inter-DN QOP that is different from the client-DN QOP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13699) Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP

2019-04-02 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-13699:
--
Attachment: HDFS-13699.009.patch

> Add DFSClient sending handshake token to DataNode, and allow DataNode 
> overwrite downstream QOP
> --
>
> Key: HDFS-13699
> URL: https://issues.apache.org/jira/browse/HDFS-13699
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13699.001.patch, HDFS-13699.002.patch, 
> HDFS-13699.003.patch, HDFS-13699.004.patch, HDFS-13699.005.patch, 
> HDFS-13699.006.patch, HDFS-13699.007.patch, HDFS-13699.008.patch, 
> HDFS-13699.009.patch, HDFS-13699.WIP.001.patch
>
>
> Given the other Jiras under HDFS-13541, this Jira is to allow the DFSClient to 
> forward the encrypted secret to the DataNode. The encrypted message is the QOP 
> that the client and NameNode have used. The DataNode decrypts the message and 
> enforces the QOP for the client connection. This Jira also includes 
> overwriting the downstream QOP, as mentioned in the HDFS-13541 design doc. 
> Namely, this is to allow an inter-DN QOP that is different from the client-DN QOP.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1370) Command Execution in Datanode fails because of NPE

2019-04-02 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-1370:
---

 Summary: Command Execution in Datanode fails because of NPE
 Key: HDDS-1370
 URL: https://issues.apache.org/jira/browse/HDDS-1370
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode
Affects Versions: 0.5.0
Reporter: Mukul Kumar Singh


The command execution on the datanode is failing with the following exception.

{code}
2019-04-02 23:56:30,434 ERROR statemachine.DatanodeStateMachine 
(DatanodeStateMachine.java:start(196)) - Unable to finish the execution.
java.lang.NullPointerException
at 
java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:179)
at 
org.apache.hadoop.ozone.container.common.states.datanode.RunningDatanodeState.execute(RunningDatanodeState.java:89)
at 
org.apache.hadoop.ozone.container.common.statemachine.StateContext.execute(StateContext.java:354)
at 
org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:183)
at 
org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:338)
at java.lang.Thread.run(Thread.java:748)
{code}
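ExecutorCompletionService.submit throws a NullPointerException when it is handed a null task, so the failure above most likely means a null Callable was produced for one of the endpoint tasks; a minimal, illustrative guard (not the actual HDDS-1370 fix, and using placeholder types) would look like:

{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NullTaskGuardDemo {
  public static void main(String[] args) {
    ExecutorService executor = Executors.newFixedThreadPool(1);
    CompletionService<String> ecs = new ExecutorCompletionService<>(executor);

    // Stand-in for the task the datanode state presumably failed to build.
    Callable<String> task = null;

    // ExecutorCompletionService.submit(null) throws the NullPointerException
    // seen in the stack trace, so guard before submitting.
    if (task == null) {
      System.err.println("Skipping null task instead of crashing the state machine");
    } else {
      ecs.submit(task);
    }
    executor.shutdown();
  }
}
{code}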



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14383) Compute datanode load based on StoragePolicy

2019-04-02 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808004#comment-16808004
 ] 

Wei-Chiu Chuang commented on HDFS-14383:


Seems tricky. I wonder what the typical configuration for a heterogeneous 
storage node is. Is it typical to see nodes with only COLD storage volumes? I 
suppose this load issue wouldn't occur if you configure datanodes such that 
they have some HOT and some COLD volumes.

> Compute datanode load based on StoragePolicy
> 
>
> Key: HDFS-14383
> URL: https://issues.apache.org/jira/browse/HDFS-14383
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Affects Versions: 2.7.3, 3.1.2
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>
> Datanode load check logic needs to be changed because the existing computation 
> does not consider the StoragePolicy.
> DatanodeManager#getInServiceXceiverAverage
> {code}
> public double getInServiceXceiverAverage() {
>   double avgLoad = 0;
>   final int nodes = getNumDatanodesInService();
>   if (nodes != 0) {
>     final int xceivers = heartbeatManager
>         .getInServiceXceiverCount();
>     avgLoad = (double)xceivers/nodes;
>   }
>   return avgLoad;
> }
> {code}
>  
> For example: with 10 nodes (HOT) averaging 50 xceivers and 90 nodes (COLD) 
> averaging 10 xceivers, the threshold calculated by the NN is 28 (((500 + 
> 900)/100)*2), which means those 10 nodes (the whole HOT tier) become 
> unavailable while the COLD tier nodes are barely in use. Turning this check 
> off helps to mitigate the issue, however dfs.namenode.replication.considerLoad 
> is what "balances" the load of the DNs, so turning it off can lead to 
> situations where specific DNs are "overloaded".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13853) RBF: RouterAdmin update cmd is overwriting the entry not updating the existing

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808003#comment-16808003
 ] 

Íñigo Goiri commented on HDFS-13853:


Thanks [~ayushtkn] for the update, minor comments:
* Fix the checkstyle warnings.
* The warning in RouterAdmin#300 could be done without the + operator by moving 
all to the second line.
* Add space between the comment and the text (RouterAdmin#588).
* Remove the empty line in RouterAdmin#686.
* Instead of {{true/false}}, do {{true|false}}.
* I would leave {{testUpdateNonExistingMountTable}} with an update for the new 
behavior (already covered, but better to have smaller tests).
* Actually, it may make sense to split {{testUpdateChangeAttributes()}} a 
little more.

> RBF: RouterAdmin update cmd is overwriting the entry not updating the existing
> --
>
> Key: HDFS-13853
> URL: https://issues.apache.org/jira/browse/HDFS-13853
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-13853-HDFS-13891-01.patch, 
> HDFS-13853-HDFS-13891-02.patch, HDFS-13853-HDFS-13891-03.patch, 
> HDFS-13853-HDFS-13891-04.patch
>
>
> {code:java}
> // Create a new entry
> Map<String, String> destMap = new LinkedHashMap<>();
> for (String ns : nss) {
>   destMap.put(ns, dest);
> }
> MountTable newEntry = MountTable.newInstance(mount, destMap);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1333) OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security classes

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1333?focusedWorklogId=221900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221900
 ]

ASF GitHub Bot logged work on HDDS-1333:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:16
Start Date: 02/Apr/19 18:16
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on pull request #653: HDDS-1333. 
OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
classes
URL: https://github.com/apache/hadoop/pull/653#discussion_r271434920
 
 

 ##
 File path: 
hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneClientAdapterImpl.java
 ##
 @@ -0,0 +1,371 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.ozone;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.util.HashMap;
+import java.util.Iterator;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.crypto.key.KeyProvider;
+import org.apache.hadoop.hdds.client.ReplicationFactor;
+import org.apache.hadoop.hdds.client.ReplicationType;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.security.x509.SecurityConfig;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.ozone.OzoneConfigKeys;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneBucket;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.client.OzoneKey;
+import org.apache.hadoop.ozone.client.OzoneVolume;
+import org.apache.hadoop.ozone.client.io.OzoneOutputStream;
+import org.apache.hadoop.ozone.security.OzoneTokenIdentifier;
+import org.apache.hadoop.security.token.Token;
+import org.apache.hadoop.security.token.TokenRenewer;
+
+import org.apache.commons.lang3.StringUtils;
+import static org.apache.hadoop.ozone.OzoneConsts.OZONE_URI_DELIMITER;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Basic Implementation of the OzoneFileSystem calls.
+ * 
+ * This is the minimal version which doesn't include any statistics.
+ * 
+ * For full featured version use OzoneClientAdapterImpl.
+ */
+public class BasicOzoneClientAdapterImpl implements OzoneClientAdapter {
+
+  static final Logger LOG =
+  LoggerFactory.getLogger(BasicOzoneClientAdapterImpl.class);
+
+  private OzoneClient ozoneClient;
+  private ObjectStore objectStore;
+  private OzoneVolume volume;
+  private OzoneBucket bucket;
+  private ReplicationType replicationType;
+  private ReplicationFactor replicationFactor;
+  private boolean securityEnabled;
+
+  /**
+   * Create new OzoneClientAdapter implementation.
+   *
+   * @param volumeStr Name of the volume to use.
+   * @param bucketStr Name of the bucket to use
+   * @throws IOException In case of a problem.
+   */
+  public BasicOzoneClientAdapterImpl(String volumeStr, String bucketStr)
+  throws IOException {
+this(createConf(), volumeStr, bucketStr);
+  }
+
+  private static OzoneConfiguration createConf() {
+ClassLoader contextClassLoader =
+Thread.currentThread().getContextClassLoader();
+Thread.currentThread().setContextClassLoader(null);
+OzoneConfiguration conf = new OzoneConfiguration();
+Thread.currentThread().setContextClassLoader(contextClassLoader);
+return conf;
+  }
+
+  public BasicOzoneClientAdapterImpl(OzoneConfiguration conf, String volumeStr,
+  String bucketStr)
+  throws IOException {
+this(null, -1, conf, volumeStr, bucketStr);
+  }
+
+  public BasicOzoneClientAdapterImpl(String omHost, int omPort,
+  Configuration hadoopConf, String volumeStr, String bucketStr)
+  throws IOException {
+
+ClassLoader contextClassLoader =
 
 Review comment:
   We can reuse createConf api here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to 

[jira] [Commented] (HDFS-14394) Add -std=c99 / -std=gnu99 to libhdfs compile flags

2019-04-02 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807986#comment-16807986
 ] 

Eric Yang commented on HDFS-14394:
--

[~stakiar] Thanks for the reminder about CMAKE_CXX_FLAGS, not CMAKE_C_FLAGS, being 
used to set the std flag.  I think this patch can work fine, and even better if it 
uses -std=c99 -pedantic-errors -fextended-identifiers to avoid certain code 
scans flagging GNU keywords.

> Add -std=c99 / -std=gnu99 to libhdfs compile flags
> --
>
> Key: HDFS-14394
> URL: https://issues.apache.org/jira/browse/HDFS-14394
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14394.001.patch
>
>
> libhdfs compilation currently does not enforce a minimum required C version. 
> As of today, the libhdfs build on Hadoop QA works, but when built on a 
> machine with an outdated gcc / cc version where C89 is the default, 
> compilation fails due to errors such as:
> {code}
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  error: ‘for’ loop initial declarations are only allowed in C99 mode
> for (int i = 0; i < numCachedClasses; i++) {
> ^
> /build/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/jclasses.c:106:5:
>  note: use option -std=c99 or -std=gnu99 to compile your code
> {code}
> We should add the -std=c99 / -std=gnu99 flags to libhdfs compilation so that 
> we can enforce C99 as the minimum required version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1333) OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security classes

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1333?focusedWorklogId=221893=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221893
 ]

ASF GitHub Bot logged work on HDDS-1333:


Author: ASF GitHub Bot
Created on: 02/Apr/19 18:10
Start Date: 02/Apr/19 18:10
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on pull request #653: HDDS-1333. 
OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
classes
URL: https://github.com/apache/hadoop/pull/653#discussion_r271432487
 
 

 ##
 File path: 
hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneClientAdapterImpl.java
 ##
 @@ -0,0 +1,371 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.ozone;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.util.HashMap;
+import java.util.Iterator;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.crypto.key.KeyProvider;
+import org.apache.hadoop.hdds.client.ReplicationFactor;
+import org.apache.hadoop.hdds.client.ReplicationType;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.hdds.security.x509.SecurityConfig;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.ozone.OzoneConfigKeys;
+import org.apache.hadoop.ozone.client.ObjectStore;
+import org.apache.hadoop.ozone.client.OzoneBucket;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+import org.apache.hadoop.ozone.client.OzoneKey;
+import org.apache.hadoop.ozone.client.OzoneVolume;
+import org.apache.hadoop.ozone.client.io.OzoneOutputStream;
+import org.apache.hadoop.ozone.security.OzoneTokenIdentifier;
+import org.apache.hadoop.security.token.Token;
+import org.apache.hadoop.security.token.TokenRenewer;
+
+import org.apache.commons.lang3.StringUtils;
+import static org.apache.hadoop.ozone.OzoneConsts.OZONE_URI_DELIMITER;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Basic Implementation of the OzoneFileSystem calls.
+ * 
+ * This is the minimal version which doesn't include any statistics.
+ * 
+ * For full featured version use OzoneClientAdapterImpl.
+ */
+public class BasicOzoneClientAdapterImpl implements OzoneClientAdapter {
+
+  static final Logger LOG =
+  LoggerFactory.getLogger(BasicOzoneClientAdapterImpl.class);
+
+  private OzoneClient ozoneClient;
+  private ObjectStore objectStore;
+  private OzoneVolume volume;
+  private OzoneBucket bucket;
+  private ReplicationType replicationType;
+  private ReplicationFactor replicationFactor;
+  private boolean securityEnabled;
+
+  /**
+   * Create new OzoneClientAdapter implementation.
+   *
+   * @param volumeStr Name of the volume to use.
+   * @param bucketStr Name of the bucket to use
+   * @throws IOException In case of a problem.
+   */
+  public BasicOzoneClientAdapterImpl(String volumeStr, String bucketStr)
+  throws IOException {
+this(createConf(), volumeStr, bucketStr);
+  }
+
+  private static OzoneConfiguration createConf() {
+ClassLoader contextClassLoader =
+Thread.currentThread().getContextClassLoader();
+Thread.currentThread().setContextClassLoader(null);
+OzoneConfiguration conf = new OzoneConfiguration();
+Thread.currentThread().setContextClassLoader(contextClassLoader);
+return conf;
+  }
+
+  public BasicOzoneClientAdapterImpl(OzoneConfiguration conf, String volumeStr,
+  String bucketStr)
+  throws IOException {
+this(null, -1, conf, volumeStr, bucketStr);
+  }
+
+  public BasicOzoneClientAdapterImpl(String omHost, int omPort,
+  Configuration hadoopConf, String volumeStr, String bucketStr)
+  throws IOException {
+
+ClassLoader contextClassLoader =
+Thread.currentThread().getContextClassLoader();
+Thread.currentThread().setContextClassLoader(null);
+OzoneConfiguration conf;
+if (hadoopConf instanceof OzoneConfiguration) {
+  conf = (OzoneConfiguration) hadoopConf;
+} else {

[jira] [Created] (HDDS-1369) Containers should be processed by Container Scanner right after close.

2019-04-02 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-1369:
---

 Summary: Containers should be processed by Container Scanner right 
after close.
 Key: HDDS-1369
 URL: https://issues.apache.org/jira/browse/HDDS-1369
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Datanode
Reporter: Mukul Kumar Singh


Containers which have been closed by the datanode should be processed by the 
container scanner immediately. The goal of this proposal is to identify any potential 
problem with the container close or with the container metadata as early as possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10687) Federation Membership State Store internal API

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807969#comment-16807969
 ] 

Íñigo Goiri commented on HDFS-10687:


[~fengnanli], you are correct, that's the approach.
Do you see a need to have an updater?

> Federation Membership State Store internal API
> --
>
> Key: HDFS-10687
> URL: https://issues.apache.org/jira/browse/HDFS-10687
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Íñigo Goiri
>Assignee: Jason Kace
>Priority: Major
> Fix For: 2.9.0, 3.0.0
>
> Attachments: HDFS-10467-HDFS-10687-001.patch, 
> HDFS-10687-HDFS-10467-002.patch, HDFS-10687-HDFS-10467-003.patch, 
> HDFS-10687-HDFS-10467-004.patch, HDFS-10687-HDFS-10467-005.patch, 
> HDFS-10687-HDFS-10467-006.patch, HDFS-10687-HDFS-10467-007.patch, 
> HDFS-10687-HDFS-10467-008.patch
>
>
> The Federation Membership State encapsulates the information about the 
> Namenodes of each sub-cluster that are participating in Federation. The 
> information includes addresses for RPC, Web. This information is stored in 
> the State Store and later used by the Router to find data in the federation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14400) Namenode ExpiredHeartbeats metric

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807967#comment-16807967
 ] 

Íñigo Goiri commented on HDFS-14400:


[~kpalanisamy], my whole point is that I think the existing counter is correct.
That metric represents the number of times we got an expired heartbeat.
Here you are adding a new metric which is the current number of DNs that have an 
expired HB.
We should keep the old one.
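A hedged sketch of the distinction being discussed, using the Hadoop metrics2 helpers; the class and field names here are illustrative placeholders, not the actual FSNamesystem members:

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

@Metrics(context = "dfs")
public class HeartbeatMetricsSketch {

  // Monotonic counter: how many times a heartbeat has ever expired.
  // This is what the existing ExpiredHeartbeats metric represents and
  // it should never be decremented.
  @Metric("Total number of expired heartbeats")
  MutableCounterLong expiredHeartbeats;

  // Gauge: how many datanodes currently have an expired heartbeat.
  // This would be a new, separate metric; it goes up and down.
  @Metric("Datanodes currently with an expired heartbeat")
  MutableGaugeInt datanodesWithExpiredHeartbeat;

  // Fields are instantiated by the metrics system when this source is registered.

  void onHeartbeatExpired() {
    expiredHeartbeats.incr();
    datanodesWithExpiredHeartbeat.incr();
  }

  void onDatanodeBackAlive() {
    // Only the gauge is adjusted when the node comes back.
    datanodesWithExpiredHeartbeat.decr();
  }
}
{code}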

> Namenode ExpiredHeartbeats metric
> -
>
> Key: HDFS-14400
> URL: https://issues.apache.org/jira/browse/HDFS-14400
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Minor
> Attachments: HDFS-14400-001.patch, HDFS-14400-002.patch, 
> HDFS-14400-003.patch
>
>
> Noticed an incorrect value in the ExpiredHeartbeats metric under the namenode JMX.
> We increment the ExpiredHeartbeats count when a Datanode is dead, but we missed 
> decrementing it when the datanode comes back alive.
> {code}
> { "name" : "Hadoop:service=NameNode,name=FSNamesystem", "modelerType" : 
> "FSNamesystem", "tag.Context" : "dfs", "tag.TotalSyncTimes" : "7 ", 
> "tag.HAState" : "active", ... "ExpiredHeartbeats" : 2, ... }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14385) RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807960#comment-16807960
 ] 

Íñigo Goiri commented on HDFS-14385:


There seems to be an issue with HAServiceState.
We should also make some tests use the mock now.

> RBF: Optimize MiniRouterDFSCluster with optional light weight MiniDFSCluster
> 
>
> Key: HDFS-14385
> URL: https://issues.apache.org/jira/browse/HDFS-14385
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14385-HDFS-13891.001.patch
>
>
> MiniRouterDFSCluster mimics a federated HDFS cluster with routers to support RBF 
> tests. In MiniRouterDFSCluster, it starts a MiniDFSCluster with the complete set of 
> HDFS roles, which has a significant time cost. As discussed in HDFS-14351, it is 
> better to provide a mock MiniDFSCluster/Namenodes as one option to support some 
> test cases and reduce the time cost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14390) Provide kerberos support for AliasMap service used by Provided storage

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807955#comment-16807955
 ] 

Íñigo Goiri commented on HDFS-14390:


+1 on  [^HDFS-14390.003.patch].
I'd like [~virajith] to take a look too.

> Provide kerberos support for AliasMap service used by Provided storage
> --
>
> Key: HDFS-14390
> URL: https://issues.apache.org/jira/browse/HDFS-14390
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashvin
>Assignee: Ashvin
>Priority: Major
> Attachments: HDFS-14390.001.patch, HDFS-14390.002.patch, 
> HDFS-14390.003.patch
>
>
> With {{PROVIDED}} storage (HDFS-9806), HDFS can address data stored in 
> external storage systems. This feature is not supported in a secure HDFS 
> cluster. The {{AliasMap}} service does not support kerberos, and as a result 
> the cluster nodes will fail to communicate with it. This JIRA is to enable 
> kerberos support for the {{AliasMap}} service.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13972) RBF: Support for Delegation Token (WebHDFS)

2019-04-02 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807951#comment-16807951
 ] 

Íñigo Goiri commented on HDFS-13972:


Not very convinced that {{getDatanodeReport()}} should be done the way 
[^HDFS-13972-HDFS-13891.011.patch] does.
Can we address this in a follow up?

> RBF: Support for Delegation Token (WebHDFS)
> ---
>
> Key: HDFS-13972
> URL: https://issues.apache.org/jira/browse/HDFS-13972
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-13972-HDFS-13891.001.patch, 
> HDFS-13972-HDFS-13891.002.patch, HDFS-13972-HDFS-13891.003.patch, 
> HDFS-13972-HDFS-13891.004.patch, HDFS-13972-HDFS-13891.005.patch, 
> HDFS-13972-HDFS-13891.006.patch, HDFS-13972-HDFS-13891.007.patch, 
> HDFS-13972-HDFS-13891.008.patch, HDFS-13972-HDFS-13891.009.patch, 
> HDFS-13972-HDFS-13891.010.patch, HDFS-13972-HDFS-13891.011.patch, 
> TestRouterWebHDFSContractTokens.java
>
>
> HDFS Router should support issuing HDFS delegation tokens through WebHDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-1353:
-
Labels: newbie pull-request-available  (was: newbie)

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of BackgroundPipelineCreator
> ---
>
> Key: HDDS-1353
> URL: https://issues.apache.org/jira/browse/HDDS-1353
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: newbie, pull-request-available
>
> There is a {{BackgroundPipelineCreator}} thread in SCM which runs at a fixed 
> interval and tries to create pipelines. This BackgroundPipelineCreator uses 
> {{IOException}} as the exit criterion (no more pipelines can be created). In each 
> run of BackgroundPipelineCreator we exit when we are not able to create any 
> more pipelines, i.e. when we get an IOException while trying to create a 
> pipeline. This means that the 
> {{scm_pipeline_metrics_num_pipeline_creation_failed}} value gets 
> incremented in each run of BackgroundPipelineCreator.
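A rough sketch of the pattern being described and one way to keep the failure metric meaningful; the class, interface, and metric names are placeholders, not the actual SCM code:

{code:java}
import java.io.IOException;

/** Illustrative loop showing why the failure counter rises on every run. */
public class BackgroundCreatorSketch {

  interface PipelineFactory {
    /** Throws IOException when no more pipelines can be created. */
    void createPipeline() throws IOException;
  }

  private long numPipelineCreationFailed; // stands in for the SCM metric

  void runOnce(PipelineFactory factory) {
    while (true) {
      try {
        factory.createPipeline();
      } catch (IOException e) {
        // Today the "cannot create more pipelines" exit condition and a real
        // failure share this path, so the metric is bumped on every run.
        // A fix could distinguish the expected exit (e.g. a dedicated
        // exception type or a pre-check) from genuine creation failures.
        numPipelineCreationFailed++;
        return;
      }
    }
  }
}
{code}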



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1353) Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing because of BackgroundPipelineCreator

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1353?focusedWorklogId=221880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221880
 ]

ASF GitHub Bot logged work on HDDS-1353:


Author: ASF GitHub Bot
Created on: 02/Apr/19 17:44
Start Date: 02/Apr/19 17:44
Worklog Time Spent: 10m 
  Work Description: avijayanhwx commented on pull request #681: HDDS-1353 : 
Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
because of BackgroundPipelineCreator.
URL: https://github.com/apache/hadoop/pull/681
 
 
   There is a BackgroundPipelineCreator thread in SCM which runs at a fixed 
interval and tries to create pipelines. This BackgroundPipelineCreator uses 
IOException as the exit criterion (no more pipelines can be created). In each run of 
BackgroundPipelineCreator we exit when we are not able to create any more 
pipelines, i.e. when we get an IOException while trying to create a pipeline. 
This means that the scm_pipeline_metrics_num_pipeline_creation_failed value 
gets incremented in each run of BackgroundPipelineCreator.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221880)
Time Spent: 10m
Remaining Estimate: 0h

> Metrics scm_pipeline_metrics_num_pipeline_creation_failed keeps increasing 
> because of BackgroundPipelineCreator
> ---
>
> Key: HDDS-1353
> URL: https://issues.apache.org/jira/browse/HDDS-1353
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: newbie, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a {{BackgroundPipelineCreator}} thread in SCM which runs at a fixed 
> interval and tries to create pipelines. This BackgroundPipelineCreator uses 
> {{IOException}} as the exit criterion (no more pipelines can be created). In each 
> run of BackgroundPipelineCreator we exit when we are not able to create any 
> more pipelines, i.e. when we get an IOException while trying to create a 
> pipeline. This means that the 
> {{scm_pipeline_metrics_num_pipeline_creation_failed}} value gets 
> incremented in each run of BackgroundPipelineCreator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1333) OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security classes

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1333?focusedWorklogId=221875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221875
 ]

ASF GitHub Bot logged work on HDDS-1333:


Author: ASF GitHub Bot
Created on: 02/Apr/19 17:36
Start Date: 02/Apr/19 17:36
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on pull request #653: HDDS-1333. 
OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
classes
URL: https://github.com/apache/hadoop/pull/653#discussion_r271418779
 
 

 ##
 File path: hadoop-ozone/dist/src/main/compose/ozonefs/docker-compose.yaml
 ##
 @@ -49,21 +49,53 @@ services:
   environment:
  ENSURE_SCM_INITIALIZED: /data/metadata/scm/current/VERSION
   command: ["/opt/hadoop/bin/ozone","scm"]
-   hadoop3:
+   hadoop32:
 
 Review comment:
   One advantage I can think of is to have separate test suites for hadoop 2 
and 3. With hadoop 2 and 3 being first class we can change their configs 
independently in the future. However I am ok with this approach for the current patch. 
We can split it later as desired. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221875)
Time Spent: 4h 40m  (was: 4.5h)

> OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
> classes
> -
>
> Key: HDDS-1333
> URL: https://issues.apache.org/jira/browse/HDDS-1333
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The current ozonefs compatibility layer is broken by: HDDS-1299.
> The spark jobs (including hadoop 2.7) can't be executed any more:
> {code}
> 2019-03-25 09:50:08 INFO  StateStoreCoordinatorRef:54 - Registered 
> StateStoreCoordinator endpoint
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/crypto/key/KeyProviderTokenIssuer
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
> at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
> at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
> at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
> at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at 
> org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:45)
> at 
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:332)
> at 
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
> at 
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
> at 
> org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:715)
> at 
> org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:757)
> at 
> org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:724)
> at 

[jira] [Commented] (HDDS-1365) Fix error handling in KeyValueContainerCheck

2019-04-02 Thread Supratim Deka (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807897#comment-16807897
 ] 

Supratim Deka commented on HDDS-1365:
-

hello [~linyiqun], I did consider losing the error code. Not every corruption 
will require specific handling. At best there will be a few categories. When 
that happens, we can introduce a specific new exception for every such 
specialised category with its own handler logic. Basically, moving away from 
the error code approach - using exceptions does make the code cleaner and less 
clunky.

And it should be ok to defer this incremental work until actually required. Yes?
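For illustration, a hedged sketch of what such a specialised exception category could look like; the name and fields are hypothetical and not taken from the patch:

{code:java}
import java.io.IOException;

/** Hypothetical example of a specialised corruption category with its own type. */
public class ContainerMetadataCorruptionException extends IOException {

  private final long containerId;

  public ContainerMetadataCorruptionException(long containerId, String message) {
    super("Container " + containerId + ": " + message);
    this.containerId = containerId;
  }

  public long getContainerId() {
    return containerId;
  }
}
{code}

A caller that needs special handling for this category can catch it explicitly, while everything else keeps propagating as a plain IOException.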

> Fix error handling in KeyValueContainerCheck
> 
>
> Key: HDDS-1365
> URL: https://issues.apache.org/jira/browse/HDDS-1365
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
> Attachments: HDDS-1365.000.patch
>
>
> Error handling and propagation in KeyValueContainerCheck needs to be based on 
> throwing IOException instead of passing an error code to the calling function.
> HDDS-1163 implemented the basic framework using a mix of error code return 
> and exception handling. There is added complexity because exceptions deep 
> inside the call chain are being caught and translated to error code return 
> values. The goal is to change all error handling in this class to use 
> Exceptions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1333) OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security classes

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1333?focusedWorklogId=221818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221818
 ]

ASF GitHub Bot logged work on HDDS-1333:


Author: ASF GitHub Bot
Created on: 02/Apr/19 16:10
Start Date: 02/Apr/19 16:10
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #653: HDDS-1333. 
OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
classes
URL: https://github.com/apache/hadoop/pull/653#discussion_r271382877
 
 

 ##
 File path: hadoop-hdds/docs/content/SparkOzoneFSK8S.md
 ##
 @@ -78,11 +78,13 @@ And create a custom `core-site.xml`:
 
 
 fs.o3fs.impl
-org.apache.hadoop.fs.ozone.OzoneFileSystem
+org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
 
 
 ```
 
+_Note_: You may also use `org.apache.hadoop.fs.ozone.OzoneFileSystem` without 
the `Basic` prefix. The `Basic` version doesn't support FS statistics and 
security tokens but can work together with older hadoop versions.
 
 Review comment:
   bq. "The `Basic` version doesn't support FS statistics and security tokens 
but can work together with older hadoop versions."
   
   This is not accurate. If I understand correctly, the BasicOzoneFileSystem 
also supports delegation token APIs but not FS statistics.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221818)
Time Spent: 4.5h  (was: 4h 20m)

> OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
> classes
> -
>
> Key: HDDS-1333
> URL: https://issues.apache.org/jira/browse/HDDS-1333
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The current ozonefs compatibility layer is broken by: HDDS-1299.
> The spark jobs (including hadoop 2.7) can't be executed any more:
> {code}
> 2019-03-25 09:50:08 INFO  StateStoreCoordinatorRef:54 - Registered 
> StateStoreCoordinator endpoint
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/crypto/key/KeyProviderTokenIssuer
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
> at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
> at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
> at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
> at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at 
> org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:45)
> at 
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:332)
> at 
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
> at 
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
> at 
> 

[jira] [Work logged] (HDDS-1333) OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security classes

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1333?focusedWorklogId=221810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221810
 ]

ASF GitHub Bot logged work on HDDS-1333:


Author: ASF GitHub Bot
Created on: 02/Apr/19 16:06
Start Date: 02/Apr/19 16:06
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #653: HDDS-1333. 
OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
classes
URL: https://github.com/apache/hadoop/pull/653#discussion_r271382877
 
 

 ##
 File path: hadoop-hdds/docs/content/SparkOzoneFSK8S.md
 ##
 @@ -78,11 +78,13 @@ And create a custom `core-site.xml`:
 
 
 fs.o3fs.impl
-org.apache.hadoop.fs.ozone.OzoneFileSystem
+org.apache.hadoop.fs.ozone.BasicOzoneFileSystem
 
 
 ```
 
+_Note_: You may also use `org.apache.hadoop.fs.ozone.OzoneFileSystem` without 
the `Basic` prefix. The `Basic` version doesn't support FS statistics and 
security tokens but can work together with older hadoop versions.
 
 Review comment:
   This is not accurate. If I understand correctly, the BasicOzoneFileSystem 
also supports delegation token APIs but not FS statistics.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 221810)
Time Spent: 4h 20m  (was: 4h 10m)

> OzoneFileSystem can't work with spark/hadoop2.7 because incompatible security 
> classes
> -
>
> Key: HDDS-1333
> URL: https://issues.apache.org/jira/browse/HDDS-1333
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The current ozonefs compatibility layer is broken by: HDDS-1299.
> The spark jobs (including hadoop 2.7) can't be executed any more:
> {code}
> 2019-03-25 09:50:08 INFO  StateStoreCoordinatorRef:54 - Registered 
> StateStoreCoordinator endpoint
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/crypto/key/KeyProviderTokenIssuer
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
> at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
> at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
> at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
> at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
> at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at 
> org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:45)
> at 
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:332)
> at 
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
> at 
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
> at 
> org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:715)
> at 
> org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:757)
> at 
> 

[jira] [Updated] (HDDS-1355) Only FQDN is accepted for OM rpc address in secure environment

2019-04-02 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-1355:
-
Fix Version/s: 0.4.0

> Only FQDN is accepted for OM rpc address in secure environment
> --
>
> Key: HDDS-1355
> URL: https://issues.apache.org/jira/browse/HDDS-1355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> While the SCM address can be a host name (relative to the current search 
> domain), an NPE is thrown if the OM address is just a hostname and not an FQDN:
> {code}
>   10   │   OZONE-SITE.XML_ozone.om.address: "om-0.om"
>   11   │   OZONE-SITE.XML_ozone.scm.client.address: "scm-0.scm"
>   12   │   OZONE-SITE.XML_ozone.scm.names: "scm-0.scm"
> {code} 
> {code}
> 2019-03-29 14:37:52 ERROR OzoneManager:865 - Failed to start the OzoneManager.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.ozone.om.OzoneManager.getSCMSignedCert(OzoneManager.java:1372)
> at 
> org.apache.hadoop.ozone.om.OzoneManager.initializeSecurity(OzoneManager.java:1018)
> at org.apache.hadoop.ozone.om.OzoneManager.omInit(OzoneManager.java:971)
> at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:928)
> at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:859)
> {code}
> I don't know what the right validation rule is here, but I am pretty sure 
> that the NPE should be avoided and a meaningful error should be thrown. (And the 
> behaviour should be the same for SCM and OM.)
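
A minimal sketch of the kind of validation asked for here, not the actual fix; the method name, message wording and port are assumptions: resolve the configured address up front and fail with a descriptive error instead of letting a later dereference throw an NPE.
{code:java}
import java.net.InetSocketAddress;

public class AddressValidationSketch {
  // Hypothetical guard: reject unresolvable OM/SCM addresses with a clear message.
  static InetSocketAddress requireResolvable(String host, int port, String what) {
    InetSocketAddress addr = new InetSocketAddress(host, port);
    if (addr.isUnresolved() || addr.getAddress() == null) {
      throw new IllegalArgumentException(
          "Could not resolve " + what + " address '" + host + ":" + port
              + "'. Please configure a resolvable hostname or FQDN.");
    }
    return addr;
  }

  public static void main(String[] args) {
    try {
      // "om-0.om" mirrors the unresolvable value from the issue description above;
      // the port number is only an example.
      requireResolvable("om-0.om", 9862, "OM");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}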



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1355) Only FQDN is accepted for OM rpc address in secure environment

2019-04-02 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-1355:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Only FQDN is accepted for OM rpc address in secure environment
> --
>
> Key: HDDS-1355
> URL: https://issues.apache.org/jira/browse/HDDS-1355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> While the SCM address can be a host name (relative to the current search 
> domain), an NPE is thrown if the OM address is just a hostname and not an FQDN:
> {code}
>   10   │   OZONE-SITE.XML_ozone.om.address: "om-0.om"
>   11   │   OZONE-SITE.XML_ozone.scm.client.address: "scm-0.scm"
>   12   │   OZONE-SITE.XML_ozone.scm.names: "scm-0.scm"
> {code} 
> {code}
> 2019-03-29 14:37:52 ERROR OzoneManager:865 - Failed to start the OzoneManager.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.ozone.om.OzoneManager.getSCMSignedCert(OzoneManager.java:1372)
> at 
> org.apache.hadoop.ozone.om.OzoneManager.initializeSecurity(OzoneManager.java:1018)
> at org.apache.hadoop.ozone.om.OzoneManager.omInit(OzoneManager.java:971)
> at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:928)
> at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:859)
> {code}
> I don't know what the right validation rule is here, but I am pretty sure 
> that the NPE should be avoided and a meaningful error should be thrown. (And the 
> behaviour should be the same for SCM and OM.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14404) Reduce KMS error logging severity from WARN to INFO

2019-04-02 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807870#comment-16807870
 ] 

Xiaoyu Yao commented on HDFS-14404:
---

Thanks [~knanasi] for reporting and fixing the issue. The patch LGTM, +1, and I 
will commit it shortly.

> Reduce KMS error logging severity from WARN to INFO
> ---
>
> Key: HDFS-14404
> URL: https://issues.apache.org/jira/browse/HDFS-14404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: kms
>Affects Versions: 3.2.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Trivial
> Attachments: HDFS-14404.001.patch
>
>
> When the KMS is deployed as an HA service and a failure occurs, the current 
> error severity in the client code appears to be WARN. This can result in 
> excessive error logging even though another instance may succeed.
> Maybe this log level can be adjusted only in the load balancing provider.
> {code}
> 19/02/27 05:10:10 WARN kms.LoadBalancingKMSClientProvider: KMS provider at 
> [https://example.com:16000/kms/v1/] threw an IOException 
> [java.net.ConnectException: Connection refused (Connection refused)]!!
> 19/02/12 20:50:09 WARN kms.LoadBalancingKMSClientProvider: KMS provider at 
> [https://example.com:16000/kms/v1/] threw an IOException:
> java.io.IOException: java.lang.reflect.UndeclaredThrowableException
> {code}
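
To make the proposal concrete, a rough sketch of the intended logging split; the class and helper below are hypothetical stand-ins, not the real LoadBalancingKMSClientProvider code: a single provider failure is logged at INFO because another instance may still succeed, and WARN is reserved for the case where every provider has failed.
{code:java}
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class KmsFailoverLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(KmsFailoverLoggingSketch.class);

  // Hypothetical helper illustrating the severity split.
  static void logProviderFailure(String kmsUrl, IOException ioe,
      boolean allProvidersFailed) {
    if (allProvidersFailed) {
      // Only when no instance can serve the request is it really a warning.
      LOG.warn("All KMS providers failed, last provider [{}] threw:", kmsUrl, ioe);
    } else {
      // A single instance failing is expected in HA; another one may still succeed.
      LOG.info("KMS provider at [{}] threw an IOException, trying the next provider:",
          kmsUrl, ioe);
    }
  }
}
{code}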



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-04-02 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807862#comment-16807862
 ] 

Weiwei Yang commented on HDFS-13596:


This is an important fix, thanks [~ferhui] for working on it. I am not 
familiar with this part, so I am pinging a few folks internally to see if they can 
help review. Meanwhile, glad to see [~xkrogen] is helping on this :)

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch
>
>
> After a rolling upgrade of the NN from 2.x to 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> 

[jira] [Commented] (HDFS-14378) Simplify the design of multiple NN and both logic of edit log roll and checkpoint

2019-04-02 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807844#comment-16807844
 ] 

Erik Krogen commented on HDFS-14378:


Hey [~starphin], I haven't taken a detailed look at the patch. I don't have 
much historical context here so before doing so I'd like to get some opinions 
from older members of the project regarding the overall idea. Having the active 
do more of these things makes sense to me -- I've never really understood why 
the SbNN is the one rolling the edit logs -- but there may be some good 
reasoning that we are missing. [~shv], [~kihwal], [~ajisakaa] -- do any of you 
have opinions on this?

> Simplify the design of multiple NN and both logic of edit log roll and 
> checkpoint
> -
>
> Key: HDFS-14378
> URL: https://issues.apache.org/jira/browse/HDFS-14378
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: star
>Assignee: star
>Priority: Minor
> Attachments: HDFS-14378-trunk.001.patch, HDFS-14378-trunk.002.patch
>
>
>       HDFS-6440 introduced a mechanism to support more than 2 NNs. It 
> implements a first-writer-wins policy to avoid duplicated fsimage downloading. 
> The variable 'isPrimaryCheckPointer' is used to hold the first-writer state, with 
> which the SNN will provide the fsimage for the ANN next time. Then we have three 
> roles in the NN cluster: the ANN, one primary SNN, and one or more normal SNNs.
>       Since HDFS-12248, there may be more than two primary SNNs shortly after 
> an exception occurred. It handles a scenario in which the SNN will not upload the 
> fsimage on IOException and InterruptedException. Though it will not cause any 
> further functional issues, it is inconsistent. 
>       Furthermore, the edit log may be rolled more frequently than necessary with 
> multiple standby NameNodes, HDFS-14349. (I'm not so sure about this; I will 
> verify it with unit tests, or anyone could point it out.)
>       Above all, I'm wondering if we could make this simpler with the following 
> changes:
>  * There are only two roles: ANN, SNN
>  * The ANN will roll its edit log every DFS_HA_LOGROLL_PERIOD_KEY period.
>  * The ANN will select an SNN to download the checkpoint from.
> The SNN will just do log tailing and checkpointing, and provide a servlet for fsimage 
> downloading as normal. The SNN will not try to roll the edit log or send checkpoint 
> requests to the ANN.
> In a word, the ANN will be more active. Suggestions are welcome.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-1365) Fix error handling in KeyValueContainerCheck

2019-04-02 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807836#comment-16807836
 ] 

Yiqun Lin edited comment on HDDS-1365 at 4/2/19 3:06 PM:
-

Hi [~sdeka], the error code can be kept since we can do the corresponding 
handling logic (e.g. notify SCM of corrupted containers) based on the error code 
value in the future, like the following:
{code:java}
KvCheckError error;
switch (error) {
  case FILE_LOAD:
    // Handler logic 1...
    break;
  case METADATA_PATH_ACCESS:
    // Handler logic 2...
    break;
  case CHUNKS_PATH_ACCESS:
    // Handler logic 3...
    break;
  // ... other error categories
  default:
    break;
}
{code}
Based on the current change, the exception is almost swallowed. We only log the 
exception message, so how can we do the handling logic for different error cases?

The other refactoring looks good to me. One nit:
 {{assertTrue(corruption == false);}} can be simplified to 
{{assertFalse(corruption);}}


was (Author: linyiqun):
Hi [~sdeka], the error code can be kept since we can do the corresponding 
handle logic (e.g. notify SCM  corrupted containers)according to error code 
value in the future like following:
{code:java}
   KvCheckError error;  
switch(error) {
case FILE_LOAD:
// Handler logic1..
break;
case METADATA_PATH_ACCESS
// Handler logic2..
break;
case CHUNKS_PATH_ACCESS
// Handler logic3..
break;
...
default
break;
}
{code}
Based on the current change, the exception is almost swallowed. We only log the 
exception message, how can we do the handle logic for different error cases?

Other refactor looks good to me. One nit:
 {{assertTrue(corruption == false);}} can be simplified to 
{{assertFalse(corruption);}}
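
As a further illustration of how the two concerns above could be reconciled, here is a hedged sketch in which an IOException subclass carries the error category, so the caller can still dispatch per category while propagation itself uses exceptions. All class and enum names below are hypothetical, not from the patch.
{code:java}
import java.io.IOException;

// Hypothetical exception carrying the same categories as the old error codes.
class ContainerCheckException extends IOException {
  enum Category { FILE_LOAD, METADATA_PATH_ACCESS, CHUNKS_PATH_ACCESS }

  private final Category category;

  ContainerCheckException(Category category, String message) {
    super(message);
    this.category = category;
  }

  Category getCategory() {
    return category;
  }
}

public class ContainerCheckHandlerSketch {
  // The caller catches the exception once and still gets per-category handling.
  static void handle(ContainerCheckException e) {
    switch (e.getCategory()) {
      case FILE_LOAD:
        // e.g. mark the container unhealthy
        break;
      case METADATA_PATH_ACCESS:
      case CHUNKS_PATH_ACCESS:
        // e.g. notify SCM about the corrupted container
        break;
      default:
        break;
    }
  }
}
{code}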

> Fix error handling in KeyValueContainerCheck
> 
>
> Key: HDDS-1365
> URL: https://issues.apache.org/jira/browse/HDDS-1365
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
> Attachments: HDDS-1365.000.patch
>
>
> Error handling and propagation in KeyValueContainerCheck needs to be based on 
> throwing IOException instead of passing an error code to the calling function.
> HDDS-1163 implemented the basic framework using a mix of error code return 
> and exception handling. There is added complexity because exceptions deep 
> inside the call chain are being caught and translated to error code return 
> values. The goal is to change all error handling in this class to use 
> Exceptions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


