[jira] [Updated] (HDDS-3288) Update default RPC handler SCM/OM count to 100
[ https://issues.apache.org/jira/browse/HDDS-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-3288: Fix Version/s: 0.6.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the contribution [~rakeshr]. I have committed this. > Update default RPC handler SCM/OM count to 100 > --- > > Key: HDDS-3288 > URL: https://issues.apache.org/jira/browse/HDDS-3288 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: om, SCM > Reporter: Rakesh Radhakrishnan > Assignee: Rakesh Radhakrishnan > Priority: Minor > Labels: pull-request-available > Fix For: 0.6.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Presently, the default RPC handler counts of {{ozone.scm.handler.count.key=10}} and > {{ozone.om.handler.count.key=20}} are too small, and it is worth increasing the > defaults to more realistic values. > {code:java} > ozone.om.handler.count.key=100 > ozone.scm.handler.count.key=100 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
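Concretely, the committed change raises both defaults to 100. As a sketch of the settings named in the issue (assuming the standard ozone-site.xml override mechanism; tune the values to the cluster's load):

```xml
<!-- Sketch only: overrides the OM and SCM RPC handler thread counts,
     raising them from the old defaults (20 and 10) to 100 per HDDS-3288. -->
<property>
  <name>ozone.om.handler.count.key</name>
  <value>100</value>
</property>
<property>
  <name>ozone.scm.handler.count.key</name>
  <value>100</value>
</property>
```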
[GitHub] [hadoop-ozone] mukul1987 merged pull request #729: HDDS-3288: Update default RPC handler SCM/OM count to 100
mukul1987 merged pull request #729: HDDS-3288: Update default RPC handler SCM/OM count to 100 URL: https://github.com/apache/hadoop-ozone/pull/729 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-3241) Invalid container reported to SCM should be deleted
[ https://issues.apache.org/jira/browse/HDDS-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17069198#comment-17069198 ] Yiqun Lin edited comment on HDDS-3241 at 3/28/20, 2:55 AM: --- Thanks for the comments, [~elek] / [~msingh]. Actually, the current SCM safemode already ensures this behavior is safe enough when SCM is started with wrong container/pipeline db files, which could otherwise lead to a large number of containers being deleted. That should not happen, because SCM won't exit safemode in the first place: the containers reported by the DNs will not reach the safemode threshold. I have also mentioned another case: in large clusters, a node is sent out for repair and later comes back to the cluster. The SCM deletion behavior helps automatically clean up stale container data on the Datanode. This is also a common case. I have updated the PR to make this configurable and disabled by default. Please take a look, thanks.
> Invalid container reported to SCM should be deleted > --- > > Key: HDDS-3241 > URL: https://issues.apache.org/jira/browse/HDDS-3241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Affects Versions: 0.4.1 > Reporter: Yiqun Lin > Assignee: Yiqun Lin > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > For an invalid or outdated container reported by a Datanode, > ContainerReportHandler in SCM only prints an error log and doesn't > take any action. > {noformat} > 2020-03-15 05:19:41,072 ERROR > org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Received > container report for an unknown container 37 from datanode > 0d98dfab-9d34-46c3-93fd-6b64b65ff543{ip: xx.xx.xx.xx, host: lyq-xx.xx.xx.xx, > networkLocation: /dc2/rack1, certSerialId: null}. > org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: Container > with id #37 not found. > at > org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:542) > at > org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.getContainerInfo(ContainerStateMap.java:188) > at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.getContainer(ContainerStateManager.java:484) > at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.getContainer(SCMContainerManager.java:204) > at > org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:85) > at > org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:126) > at > org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:97) > at > org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:46) > at > org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2020-03-15 05:19:41,073 ERROR > org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Received > container report for an unknown container 38 from datanode > 0d98dfab-9d34-46c3-93fd-6b64b65ff543{ip: xx.xx.xx.xx, host: lyq-xx.xx.xx.xx, > networkLocation: /dc2/rack1, certSerialId: null}. > org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: Container > with id #38 not found. > at > org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:542) > at > org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.getContainerInfo(ContainerStateMap.java:188) > at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.getContainer(ContainerStateManager.java:484) > at > org.apache.hadoop.hdds.scm.cont
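The safemode argument above can be sketched as a simple threshold check. This is an illustrative model only, not the actual Ozone safemode code; the class name, method name, and the 0.99 threshold are assumptions for the example:

```java
// Hedged sketch of the safemode reasoning in the comment above: SCM leaves
// safemode only when the fraction of containers (known to SCM's db) that the
// Datanodes have reported reaches a threshold. If SCM is started with a wrong
// container db, it "knows" containers the DNs never report, the threshold is
// never reached, and SCM stays in safemode -- so no deletion can happen.
public class SafemodeDemo {
    static boolean canExitSafemode(int containersInScmDb,
                                   int containersReportedByDns,
                                   double threshold) {
        if (containersInScmDb == 0) {
            return true; // nothing to wait for
        }
        return (double) containersReportedByDns / containersInScmDb >= threshold;
    }

    public static void main(String[] args) {
        // Correct db: DNs report nearly all containers SCM knows about.
        System.out.println(canExitSafemode(1000, 995, 0.99));    // true
        // Wrong db loaded: SCM expects far more containers than DNs report.
        System.out.println(canExitSafemode(100000, 995, 0.99));  // false
    }
}
```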
[jira] [Created] (HDDS-3296) Ozone admin should always have read/write ACL permission on ozone objects
Xiaoyu Yao created HDDS-3296: Summary: Ozone admin should always have read/write ACL permission on ozone objects Key: HDDS-3296 URL: https://issues.apache.org/jira/browse/HDDS-3296 Project: Hadoop Distributed Data Store Issue Type: Bug Affects Versions: 0.5.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Ozone admins should always have read/write ACL permission on ozone objects. This way, if the owner incorrectly sets the ACLs and loses access, an admin can always help get access back.
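The proposed behavior can be sketched as an authorizer that short-circuits the ACL check for configured administrators, so an admin keeps access even after the owner has removed every ACL on an object. This is a hypothetical illustration, not Ozone's actual ACL implementation; the class, method, and ACL-string format are invented for the example:

```java
import java.util.Set;

// Hedged sketch (not the actual Ozone authorizer): admins listed in
// ozone.administrators bypass object-level ACLs, so a misconfigured owner
// can always be rescued by an admin.
public class OzoneAdminAclDemo {
    private final Set<String> admins;
    private final Set<String> objectAcls; // entries like "user:alice:rw"

    OzoneAdminAclDemo(Set<String> admins, Set<String> objectAcls) {
        this.admins = admins;
        this.objectAcls = objectAcls;
    }

    boolean checkAccess(String user, String right) {
        if (admins.contains(user)) {
            return true; // admins always have read/write access
        }
        return objectAcls.contains("user:" + user + ":" + right);
    }

    public static void main(String[] args) {
        // Owner misconfigured the volume: no ACLs remain on the object.
        OzoneAdminAclDemo authorizer =
            new OzoneAdminAclDemo(Set.of("scm"), Set.of());
        System.out.println(authorizer.checkAccess("scm", "rw"));   // true
        System.out.println(authorizer.checkAccess("alice", "rw")); // false
    }
}
```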
[jira] [Updated] (HDDS-3291) Write operation when both OM followers are shutdown
[ https://issues.apache.org/jira/browse/HDDS-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3291: - Labels: pull-request-available (was: ) > Write operation when both OM followers are shutdown > --- > > Key: HDDS-3291 > URL: https://issues.apache.org/jira/browse/HDDS-3291 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Reporter: Nilotpal Nandi > Assignee: Bharat Viswanadham > Priority: Major > Labels: pull-request-available > > Steps taken: > -- > 1. In an OM HA environment, shut down both OM followers. > 2. Start a PUT key operation. > The PUT key operation hangs. > Cluster details: > https://quasar-vwryte-1.quasar-vwryte.root.hwx.site:7183/cmf/home > Snippet of OM log on the LEADER: > {code:java} > 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,250 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,251 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:48,252 WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:48,752 WARN org.apache.rati
[GitHub] [hadoop-ozone] bharatviswa504 opened a new pull request #733: HDDS-3291. Write operation when both OM followers are shutdown.
bharatviswa504 opened a new pull request #733: HDDS-3291. Write operation when both OM followers are shutdown. URL: https://github.com/apache/hadoop-ozone/pull/733 ## What changes were proposed in this pull request? Add an IPC client timeout, so that the client fails with a socket timeout exception in the case of 2 OM node failures. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3291 ## How was this patch tested? Tested it in a docker-compose cluster with a 1-minute timeout, and saw that it finally failed instead of hanging. To reproduce this, the leader.election.time.out value also needs to be set to a large value: the request must be submitted to Ratis, and the issue only shows up while the Ratis server keeps retrying. 2020-03-27 16:25:27,625 [main] INFO RetryInvocationHandler:411 - com.google.protobuf.ServiceException: java.net.SocketTimeoutException: Call From c5263a1df1ad/172.22.0.3 to om2:9862 failed on socket timeout exception: java.net.SocketTimeoutException: 6 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.22.0.3:56460 remote=om2/172.22.0.4:9862]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout, while invoking $Proxy19.submitRequest over nodeId=om2,nodeAddress=om2:9862 after 15 failover attempts. Trying to failover immediately. 2020-03-27 16:25:27,626 [main] ERROR OMFailoverProxyProvider:285 - Failed to connect to OMs: [nodeId=om1,nodeAddress=om1:9862, nodeId=om3,nodeAddress=om3:9862, nodeId=om2,nodeAddress=om2:9862]. Attempted 15 failovers.
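The fix hinges on a client-side RPC timeout. The exact configuration key used by the patch is not quoted in this thread; as a hedged illustration using Hadoop's generic IPC client timeout key, a non-zero value makes the stuck call fail with a SocketTimeoutException instead of hanging forever:

```xml
<!-- Hypothetical illustration; the PR's actual key is not shown in this
     thread. ipc.client.rpc-timeout.ms is Hadoop's generic client-side RPC
     timeout in milliseconds (0 disables the timeout). -->
<property>
  <name>ipc.client.rpc-timeout.ms</name>
  <value>60000</value> <!-- fail the OM call after 1 minute instead of hanging -->
</property>
```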
[GitHub] [hadoop-ozone] vivekratnavel commented on issue #732: HDDS-3295. Ozone admins getting Permission Denied error while creating volume
vivekratnavel commented on issue #732: HDDS-3295. Ozone admins getting Permission Denied error while creating volume URL: https://github.com/apache/hadoop-ozone/pull/732#issuecomment-605341527 @bharatviswa504 @xiaoyuyao Please review
[jira] [Updated] (HDDS-3295) Ozone admins getting Permission Denied error while creating volume
[ https://issues.apache.org/jira/browse/HDDS-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3295: - Labels: pull-request-available (was: ) > Ozone admins getting Permission Denied error while creating volume > --- > > Key: HDDS-3295 > URL: https://issues.apache.org/jira/browse/HDDS-3295 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Security > Affects Versions: 0.5.0 > Reporter: Vivek Ratnavel Subramanian > Assignee: Vivek Ratnavel Subramanian > Priority: Major > Labels: pull-request-available > > Even when a user is added to ozone.administrators, Permission Denied error > is thrown while creating a new volume.
[GitHub] [hadoop-ozone] vivekratnavel opened a new pull request #732: HDDS-3295. Ozone admins getting Permission Denied error while creating volume
vivekratnavel opened a new pull request #732: HDDS-3295. Ozone admins getting Permission Denied error while creating volume URL: https://github.com/apache/hadoop-ozone/pull/732 ## What changes were proposed in this pull request? - get user information from om request instead of client ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3295 ## How was this patch tested? Tested manually in a cluster by replacing the ozone-manager jar.
[jira] [Created] (HDDS-3295) Ozone admins getting Permission Denied error while creating volume
Vivek Ratnavel Subramanian created HDDS-3295: Summary: Ozone admins getting Permission Denied error while creating volume Key: HDDS-3295 URL: https://issues.apache.org/jira/browse/HDDS-3295 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Security Affects Versions: 0.5.0 Reporter: Vivek Ratnavel Subramanian Assignee: Vivek Ratnavel Subramanian Even when a user is added to ozone.administrators, Permission Denied error is thrown while creating a new volume.
[jira] [Commented] (HDDS-3266) Intermittent integration test failure due to DEADLINE_EXCEEDED
[ https://issues.apache.org/jira/browse/HDDS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17069067#comment-17069067 ] Attila Doroszlai commented on HDDS-3266: {code:title=https://github.com/apache/hadoop-ozone/pull/582/checks?check_run_id=540143086} ERROR freon.RandomKeyGenerator (RandomKeyGenerator.java:run(1064)) - Exception while validating write. ... Caused by: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -0.186633493s from now {code} > Intermittent integration test failure due to DEADLINE_EXCEEDED > -- > > Key: HDDS-3266 > URL: https://issues.apache.org/jira/browse/HDDS-3266 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Priority: Blocker > Attachments: > org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient-output.txt, > org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient.txt, > org.apache.hadoop.ozone.freon.TestRandomKeyGenerator-output.txt > > > {code:title=https://github.com/apache/hadoop-ozone/runs/527778966} > Tests run: 71, Failures: 0, Errors: 1, Skipped: 3, Time elapsed: 85.254 s <<< > FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient > testReadKeyWithCorruptedDataWithMutiNodes(org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient) > Time elapsed: 2.577 s <<< ERROR! 
> java.io.IOException: Unexpected OzoneException: java.io.IOException: > java.util.concurrent.ExecutionException: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: > DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -0.611771733s > from now > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunk(ChunkInputStream.java:341) > {code}
[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #625: HDDS-2980. Delete replayed entry from OpenKeyTable during commit
hanishakoneru commented on a change in pull request #625: HDDS-2980. Delete replayed entry from OpenKeyTable during commit URL: https://github.com/apache/hadoop-ozone/pull/625#discussion_r399554295 ## File path: hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3MultipartUploadCommitPartRequest.java ## @@ -147,11 +146,6 @@ public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager, throw new OMException("Failed to commit Multipart Upload key, as " + openKey + "entry is not found in the openKey table", KEY_NOT_FOUND); - } else { -// Check the OpenKeyTable if this transaction is a replay of ratis logs. Review comment: This check was redundant. Irrespective of whether the KeyCreate request was replayed or not, if key+clientID exists in the openKey table, then the CommitPart request should also be executed (same as we do for the KeyCommit request). If the same key part was created again, the clientID would be different, and hence the openKey would also be different.
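The reasoning above, that a re-created part cannot collide with an older open-key entry, can be sketched with a toy open-key layout. The "/volume/bucket/key/clientID" format used here is illustrative only, not necessarily the exact OM table key format:

```java
// Hedged sketch: the open-key table entry is keyed by the key name *plus* the
// client ID of the session that opened it. Re-creating the same part yields a
// new client ID and therefore a different open key, so a commit can only ever
// match the session that created its entry.
public class OpenKeyDemo {
    static String openKey(String volume, String bucket, String key, long clientId) {
        return "/" + volume + "/" + bucket + "/" + key + "/" + clientId;
    }

    public static void main(String[] args) {
        String firstCreate  = openKey("vol1", "bucket1", "part-0001", 1001L);
        String secondCreate = openKey("vol1", "bucket1", "part-0001", 1002L);
        // Same key name, different client sessions: distinct open-key entries.
        System.out.println(firstCreate.equals(secondCreate)); // false
    }
}
```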
[jira] [Updated] (HDDS-3266) Intermittent integration test failure due to DEADLINE_EXCEEDED
[ https://issues.apache.org/jira/browse/HDDS-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-3266: --- Summary: Intermittent integration test failure due to DEADLINE_EXCEEDED (was: Intermittent TestSecureOzoneRpcClient failure due to DEADLINE_EXCEEDED) > Intermittent integration test failure due to DEADLINE_EXCEEDED > -- > > Key: HDDS-3266 > URL: https://issues.apache.org/jira/browse/HDDS-3266 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test > Reporter: Attila Doroszlai > Priority: Blocker > Attachments: > org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient-output.txt, > org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient.txt, > org.apache.hadoop.ozone.freon.TestRandomKeyGenerator-output.txt > > > {code:title=https://github.com/apache/hadoop-ozone/runs/527778966} > Tests run: 71, Failures: 0, Errors: 1, Skipped: 3, Time elapsed: 85.254 s <<< > FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient > testReadKeyWithCorruptedDataWithMutiNodes(org.apache.hadoop.ozone.client.rpc.TestSecureOzoneRpcClient) > Time elapsed: 2.577 s <<< ERROR! > java.io.IOException: Unexpected OzoneException: java.io.IOException: > java.util.concurrent.ExecutionException: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: > DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -0.611771733s > from now > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunk(ChunkInputStream.java:341) > {code}
[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on issue #728: Master stable
bharatviswa504 edited a comment on issue #728: Master stable URL: https://github.com/apache/hadoop-ozone/pull/728#issuecomment-605312146 > @bharatviswa504 There are multiple problems. The space issue is fixed thanks to @adoroszlai (and I forgot to merge it originally, as he wrote). But the space issue is just one problem. > > HDDS-3234 + HDDS-3064 together can cause timeout problems visible both in integration and acceptance tests (at least this is my understanding) Because I see the HDDS-3234 PR got committed after a clean run. Including both might have caused the problem: one affects write timeouts and the other affects reads. I am fine with reverting, but I just want to note it here. https://github.com/apache/hadoop-ozone/runs/518369082 I will open a new PR to try out HDDS-3234 again.
[GitHub] [hadoop-ozone] bharatviswa504 commented on issue #728: Master stable
bharatviswa504 commented on issue #728: Master stable URL: https://github.com/apache/hadoop-ozone/pull/728#issuecomment-605312146 > @bharatviswa504 There are multiple problems. The space issue is fixed thanks to @adoroszlai (and I forgot to merge it originally, as he wrote). But the space issue is just one problem. > > HDDS-3234 + HDDS-3064 together can cause timeout problems visible both in integration and acceptance tests (at least this is my understanding) Because I see the HDDS-3234 PR got committed after a clean run. https://github.com/apache/hadoop-ozone/runs/518369082 Can we try reverting that change, or I can open a new PR to try it out?
[GitHub] [hadoop-ozone] elek commented on issue #728: Master stable
elek commented on issue #728: Master stable URL: https://github.com/apache/hadoop-ozone/pull/728#issuecomment-605301043 @bharatviswa504 There are multiple problems. The space issue is fixed thanks to @adoroszlai (and I forgot to merge it originally, as he wrote). But the space issue is just one problem. HDDS-3234 + HDDS-3064 together can cause timeout problems visible both in integration and acceptance tests (at least this is my understanding).
[jira] [Commented] (HDDS-2011) TestRandomKeyGenerator fails due to timeout
[ https://issues.apache.org/jira/browse/HDDS-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17069020#comment-17069020 ] Siyao Meng commented on HDDS-2011: -- Also found in: https://github.com/apache/hadoop-ozone/pull/696/checks?check_run_id=540098578 and https://github.com/apache/hadoop-ozone/pull/582/checks?check_run_id=540143086 > TestRandomKeyGenerator fails due to timeout > --- > > Key: HDDS-2011 > URL: https://issues.apache.org/jira/browse/HDDS-2011 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test > Reporter: Attila Doroszlai > Priority: Major > > {{TestRandomKeyGenerator#bigFileThan2GB}} is failing intermittently due to > timeout in Ratis {{appendEntries}}. Commit on pipeline fails, and new > pipeline cannot be created with 2 nodes (there are 5 nodes total). > Most recent one: > https://github.com/elek/ozone-ci/tree/master/trunk/trunk-nightly-pz9vg/integration/hadoop-ozone/tools
[jira] [Assigned] (HDDS-3294) Flaky test TestContainerStateMachineFailureOnRead#testReadStateMachineFailureClosesPipeline
[ https://issues.apache.org/jira/browse/HDDS-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng reassigned HDDS-3294: Assignee: (was: Siyao Meng) > Flaky test > TestContainerStateMachineFailureOnRead#testReadStateMachineFailureClosesPipeline > --- > > Key: HDDS-3294 > URL: https://issues.apache.org/jira/browse/HDDS-3294 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Siyao Meng >Priority: Major > > Shows up in a PR: https://github.com/apache/hadoop-ozone/runs/540133363 > {code:title=log} > [INFO] Running > org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 49.766 s <<< FAILURE! - in > org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead > [ERROR] > testReadStateMachineFailureClosesPipeline(org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead) > Time elapsed: 49.623 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead.testReadStateMachineFailureClosesPipeline(TestContainerStateMachineFailureOnRead.java:204) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} > {code:title=Location of NPE at > 
TestContainerStateMachineFailureOnRead.java:204} > // delete the container dir from leader > FileUtil.fullyDelete(new File( > leaderDn.get().getDatanodeStateMachine() > .getContainer().getContainerSet() > > .getContainer(omKeyLocationInfo.getContainerID()).getContainerData() <-- this > line > .getContainerPath())); > {code}
[jira] [Created] (HDDS-3294) Flaky test TestContainerStateMachineFailureOnRead#testReadStateMachineFailureClosesPipeline
Siyao Meng created HDDS-3294: Summary: Flaky test TestContainerStateMachineFailureOnRead#testReadStateMachineFailureClosesPipeline Key: HDDS-3294 URL: https://issues.apache.org/jira/browse/HDDS-3294 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Siyao Meng Assignee: Siyao Meng Shows up in a PR: https://github.com/apache/hadoop-ozone/runs/540133363 {code:title=log} [INFO] Running org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 49.766 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead [ERROR] testReadStateMachineFailureClosesPipeline(org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead) Time elapsed: 49.623 s <<< ERROR! java.lang.NullPointerException at org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead.testReadStateMachineFailureClosesPipeline(TestContainerStateMachineFailureOnRead.java:204) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) {code} {code:title=Location of NPE at TestContainerStateMachineFailureOnRead.java:204} // delete the container dir from leader FileUtil.fullyDelete(new File( leaderDn.get().getDatanodeStateMachine() .getContainer().getContainerSet() 
.getContainer(omKeyLocationInfo.getContainerID()).getContainerData() <-- this line .getContainerPath())); {code}
[jira] [Resolved] (HDDS-3281) Add timeouts to all robot tests
[ https://issues.apache.org/jira/browse/HDDS-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru resolved HDDS-3281. -- Resolution: Fixed > Add timeouts to all robot tests > --- > > Key: HDDS-3281 > URL: https://issues.apache.org/jira/browse/HDDS-3281 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > We have seen in some CI runs that the acceptance test suite is getting > cancelled as it runs for more than 6 hours. Because of this, the test results > and logs are also not saved. > This Jira aims to add a 5 minute timeout to all robot tests. In case some > tests require more time, we can update the timeout. This would help to > isolate the test which could be causing the whole acceptance test suite to > time out.
[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605255750 Thank you all for the reviews. Will merge this PR.
[GitHub] [hadoop-ozone] hanishakoneru merged pull request #723: HDDS-3281. Add timeouts to all robot tests
hanishakoneru merged pull request #723: HDDS-3281. Add timeouts to all robot tests URL: https://github.com/apache/hadoop-ozone/pull/723
[GitHub] [hadoop-ozone] aryangupta1998 opened a new pull request #730: HDDS-3289. Add a freon generator to create nested directories.
aryangupta1998 opened a new pull request #730: HDDS-3289. Add a freon generator to create nested directories. URL: https://github.com/apache/hadoop-ozone/pull/730 ## What changes were proposed in this pull request? This Jira proposes adding functionality to Freon to create nested directories. Multiple child directories can be created inside the leaf directory, and multiple top-level directories can be created as well. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3289 ## How was this patch tested? Tested manually by running the Freon Directory Generator.
[jira] [Updated] (HDDS-3047) ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get principal name by default
[ https://issues.apache.org/jira/browse/HDDS-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDDS-3047: - Summary: ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get principal name by default (was: ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get full principal name by default) > ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get > principal name by default > --- > > Key: HDDS-3047 > URL: https://issues.apache.org/jira/browse/HDDS-3047 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [{{ObjectStore#listVolumesByUser}}|https://github.com/apache/hadoop-ozone/blob/2fa37ef99b8fb4575169ba8326eeb677b3d2ed74/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/ObjectStore.java#L249-L256] > is using {{getShortUserName()}} by default (when user is empty or null): > {code:java|title=ObjectStore#listVolumesByUser} > public Iterator listVolumesByUser(String user, > String volumePrefix, String prevVolume) > throws IOException { > if(Strings.isNullOrEmpty(user)) { > user = UserGroupInformation.getCurrentUser().getShortUserName(); // <-- > } > return new VolumeIterator(user, volumePrefix, prevVolume); > } > {code} > It should use {{getUserName()}} instead. > For a quick reference for the difference between {{getUserName()}} and > {{getShortUserName()}}: > {code:java|title=UserGroupInformation#getUserName} > /** >* Get the user's full principal name. >* @return the user's full principal name. >*/ > @InterfaceAudience.Public > @InterfaceStability.Evolving > public String getUserName() { > return user.getName(); > } > {code} > {code:java|title=UserGroupInformation#getShortUserName} > /** >* Get the user's login name. >* @return the user's name up to the first '/' or '@'. 
>*/ > public String getShortUserName() { > return user.getShortName(); > } > {code} > This won't cause issues if Kerberos is not in use. However, once Kerberos is > enabled, the {{getUserName()}} and {{getShortUserName()}} results differ, which can > cause some issues. > When Kerberos is enabled, {{getUserName()}} returns the full principal name, e.g. > {{om/o...@example.com}}, but {{getShortUserName()}} will return the login name, > e.g. {{hadoop}}. > If {{hadoop.security.auth_to_local}} is set, the {{getShortUserName()}} result > can become very different from the full principal name. > For example, when {{hadoop.security.auth_to_local = > RULE:[2:$1@$0](.*)s/.*/root/}}, > {{getShortUserName()}} returns {{root}}, while {{getUserName()}} still gives > {{om/o...@example.com}}. > This can lead to a user experience issue (when Kerberos is enabled): the > user creates a volume with ozone shell ([uses > {{getUserName()}}|https://github.com/apache/hadoop-ozone/blob/ecb5bf4df1d80723835a1500d595102f3f861708/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/CreateVolumeHandler.java#L63-L65] > internally), then tries to list it with {{ObjectStore#listVolumesByUser(null, > ...)}} ([uses {{getShortUserName()}} by > default|https://github.com/apache/hadoop-ozone/blob/2fa37ef99b8fb4575169ba8326eeb677b3d2ed74/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/ObjectStore.java#L238-L256] > when the user param is empty or null), and won't see any volumes because of > the mismatch. > We should also double-check *all* usages of {{getShortUserName()}}. > *Update:* > Xiaoyu and I checked that the usage of {{getShortUserName()}} on the server > side shouldn't become a problem, because the server should maintain its own > auth_to_local rules (admins should make sure they map each user to a > different short name; just don't map multiple principal names to the same > one and it won't be a problem).
> The usage in {{BasicOzoneFileSystem}} itself also seems valid because that > {{getShortUserName()}} is only used for client-side purposes (to set > {{workingDir}}, etc.). > But the usage in {{ObjectStore#listVolumesByUser}} is confirmed problematic > at the moment and needs to be fixed. Same for > [{{CreateVolumeHandler#call}}|https://github.com/apache/hadoop-ozone/blob/ecb5bf4df1d80723835a1500d595102f3f861708/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/CreateVolumeHandler.java#L81-L83]: > {code:java|title=CreateVolumeHandler#call} > } else { > rootName = UserGroupInformation.getCurrentUser().getShortUserName(); > } > {code} > It should pa
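The full-principal vs. short-name mismatch described in the issue above can be illustrated with a small self-contained sketch. This only mimics the javadoc contract quoted earlier ("the user's name up to the first '/' or '@'"); it is not the Hadoop `UserGroupInformation` implementation, it ignores `auth_to_local` rules entirely, and the principal `om/host@EXAMPLE.COM` is a made-up example:

```java
// Illustrative sketch only: mimics the getShortUserName() contract quoted
// above ("the user's name up to the first '/' or '@'"). NOT the Hadoop
// implementation; auth_to_local rewriting is deliberately omitted.
public class PrincipalNameDemo {
    // Returns the substring of the principal before the first '/' or '@'.
    static String shortName(String principal) {
        for (int i = 0; i < principal.length(); i++) {
            char c = principal.charAt(i);
            if (c == '/' || c == '@') {
                return principal.substring(0, i);
            }
        }
        return principal; // no delimiter: short name equals the principal
    }

    public static void main(String[] args) {
        String fullPrincipal = "om/host@EXAMPLE.COM"; // hypothetical Kerberos principal
        // Volumes keyed by the full principal are invisible when queried by the
        // short name, which is exactly the listVolumesByUser(null, ...) mismatch.
        System.out.println(fullPrincipal);            // om/host@EXAMPLE.COM
        System.out.println(shortName(fullPrincipal)); // om
    }
}
```

With `auth_to_local` rules in play the divergence can be even larger (e.g. mapping to `root`), which is why the issue recommends `getUserName()` for the client-side default.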
[GitHub] [hadoop-ozone] aryangupta1998 closed pull request #730: HDDS-3289. Add a freon generator to create nested directories.
aryangupta1998 closed pull request #730: HDDS-3289. Add a freon generator to create nested directories. URL: https://github.com/apache/hadoop-ozone/pull/730
[jira] [Updated] (HDDS-3047) ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get full principal name by default
[ https://issues.apache.org/jira/browse/HDDS-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyao Meng updated HDDS-3047: - Summary: ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get full principal name by default (was: ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get user's full principal name instead of login name by default) > ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get full > principal name by default > > > Key: HDDS-3047 > URL: https://issues.apache.org/jira/browse/HDDS-3047 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [{{ObjectStore#listVolumesByUser}}|https://github.com/apache/hadoop-ozone/blob/2fa37ef99b8fb4575169ba8326eeb677b3d2ed74/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/ObjectStore.java#L249-L256] > is using {{getShortUserName()}} by default (when user is empty or null): > {code:java|title=ObjectStore#listVolumesByUser} > public Iterator listVolumesByUser(String user, > String volumePrefix, String prevVolume) > throws IOException { > if(Strings.isNullOrEmpty(user)) { > user = UserGroupInformation.getCurrentUser().getShortUserName(); // <-- > } > return new VolumeIterator(user, volumePrefix, prevVolume); > } > {code} > It should use {{getUserName()}} instead. > For a quick reference for the difference between {{getUserName()}} and > {{getShortUserName()}}: > {code:java|title=UserGroupInformation#getUserName} > /** >* Get the user's full principal name. >* @return the user's full principal name. >*/ > @InterfaceAudience.Public > @InterfaceStability.Evolving > public String getUserName() { > return user.getName(); > } > {code} > {code:java|title=UserGroupInformation#getShortUserName} > /** >* Get the user's login name. 
>* @return the user's name up to the first '/' or '@'. >*/ > public String getShortUserName() { > return user.getShortName(); > } > {code} > This won't cause issues if Kerberos is not in use. However, once Kerberos is > enabled, the {{getUserName()}} and {{getShortUserName()}} results differ, which can > cause some issues. > When Kerberos is enabled, {{getUserName()}} returns the full principal name, e.g. > {{om/o...@example.com}}, but {{getShortUserName()}} will return the login name, > e.g. {{hadoop}}. > If {{hadoop.security.auth_to_local}} is set, the {{getShortUserName()}} result > can become very different from the full principal name. > For example, when {{hadoop.security.auth_to_local = > RULE:[2:$1@$0](.*)s/.*/root/}}, > {{getShortUserName()}} returns {{root}}, while {{getUserName()}} still gives > {{om/o...@example.com}}. > This can lead to a user experience issue (when Kerberos is enabled): the > user creates a volume with ozone shell ([uses > {{getUserName()}}|https://github.com/apache/hadoop-ozone/blob/ecb5bf4df1d80723835a1500d595102f3f861708/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/CreateVolumeHandler.java#L63-L65] > internally), then tries to list it with {{ObjectStore#listVolumesByUser(null, > ...)}} ([uses {{getShortUserName()}} by > default|https://github.com/apache/hadoop-ozone/blob/2fa37ef99b8fb4575169ba8326eeb677b3d2ed74/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/ObjectStore.java#L238-L256] > when the user param is empty or null), and won't see any volumes because of > the mismatch. > We should also double-check *all* usages of {{getShortUserName()}}. > *Update:* > Xiaoyu and I checked that the usage of {{getShortUserName()}} on the server > side shouldn't become a problem, because the server should maintain its own > auth_to_local rules (admins should make sure they map each user to a > different short name;
just don't map multiple principal names to the same > one and it won't be a problem). > The usage in {{BasicOzoneFileSystem}} itself also seems valid because that > {{getShortUserName()}} is only used for client-side purposes (to set > {{workingDir}}, etc.). > But the usage in {{ObjectStore#listVolumesByUser}} is confirmed problematic > at the moment and needs to be fixed. Same for > [{{CreateVolumeHandler#call}}|https://github.com/apache/hadoop-ozone/blob/ecb5bf4df1d80723835a1500d595102f3f861708/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/web/ozShell/volume/CreateVolumeHandler.java#L81-L83]: > {code:java|title=CreateVolumeHandler#call} > } else { > rootName = UserGroupInformation.getCurrentUser().getShort
[GitHub] [hadoop-ozone] aryangupta1998 opened a new pull request #730: HDDS-3289. Add a freon generator to create nested directories.
aryangupta1998 opened a new pull request #730: HDDS-3289. Add a freon generator to create nested directories. URL: https://github.com/apache/hadoop-ozone/pull/730 ## What changes were proposed in this pull request? This Jira proposes adding functionality to Freon to create nested directories. Multiple child directories can be created inside the leaf directory, and multiple top-level directories can be created as well. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3289 ## How was this patch tested? Tested manually by running the Freon Directory Generator.
[jira] [Assigned] (HDDS-2976) Recon throws error while trying to get snapshot in secure environment
[ https://issues.apache.org/jira/browse/HDDS-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-2976: - Assignee: Prashant Pogde (was: Siddharth Wagle) > Recon throws error while trying to get snapshot in secure environment > - > > Key: HDDS-2976 > URL: https://issues.apache.org/jira/browse/HDDS-2976 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Reporter: Vivek Ratnavel Subramanian >Assignee: Prashant Pogde >Priority: Critical > > Recon throws the following exception while trying to get snapshot from OM in > a secure env: > {code:java} > 10:19:24.743 PMINFO OzoneManagerServiceProviderImpl Obtaining full snapshot > from Ozone Manager > 10:19:24.754 PMERROR OzoneManagerServiceProviderImpl Unable to obtain Ozone > Manager DB Snapshot. > javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure > at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) > at sun.security.ssl.Alerts.getSSLException(Alerts.java:154) > at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2020) > at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1127) > at > sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367) > at > sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395) > at > sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379) > at > org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:394) > at > org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:353) > at > org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141) > at > org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353) > at > org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380) > at > 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) > at > org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) > at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88) > at > org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) > at > org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) > at > org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) > at > org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) > at > org.apache.hadoop.ozone.recon.ReconUtils.makeHttpCall(ReconUtils.java:232) > at > org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.getOzoneManagerDBSnapshot(OzoneManagerServiceProviderImpl.java:239) > at > org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.updateReconOmDBWithNewSnapshot(OzoneManagerServiceProviderImpl.java:267) > at > org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.syncDataFromOM(OzoneManagerServiceProviderImpl.java:358) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 10:19:24.755 PMERROR OzoneManagerServiceProviderImpl Null snapshot location > got from OM. 
> {code}
[GitHub] [hadoop-ozone] aryangupta1998 closed pull request #730: HDDS-3289. Add a freon generator to create nested directories.
aryangupta1998 closed pull request #730: HDDS-3289. Add a freon generator to create nested directories. URL: https://github.com/apache/hadoop-ozone/pull/730
[jira] [Created] (HDDS-3293) read operation failing when two container replicas are corrupted
Nilotpal Nandi created HDDS-3293: Summary: read operation failing when two container replicas are corrupted Key: HDDS-3293 URL: https://issues.apache.org/jira/browse/HDDS-3293 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Nilotpal Nandi Steps taken: 1) Mounted noise injection FUSE on all datanodes. 2) Wrote a key (multiple blocks). 3) Selected one of the container IDs and injected errors on 2 container replicas for that container ID. 4) Ran a GET key operation. The GET key operation fails intermittently. Error seen: {noformat} 20/03/27 18:30:40 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-xceiverclientmetrics.properties,hadoop-metrics2.properties E 20/03/27 18:30:40 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s). E 20/03/27 18:30:40 INFO impl.MetricsSystemImpl: XceiverClientMetrics metrics system started E 20/03/27 18:31:12 ERROR scm.XceiverClientGrpc: Failed to execute command cmdType: ReadChunk E traceID: "f80a51eaec481a1c:cbb8e92869015a53:f80a51eaec481a1c:0" E containerID: 67 E datanodeUuid: "96101390-2446-40e6-a54e-36e170497e57" E readChunk { E blockID { E containerID: 67 E localID: 103896435892617248 E blockCommitSequenceId: 1010 E } E chunkData { E chunkName: "103896435892617248_chunk_28" E offset: 113246208 E len: 4194304 E checksumData { E type: CRC32 E bytesPerChecksum: 1048576 E checksums: "\034\376\313\031" E checksums: ";U\225\037" E checksums: "\327m\332."
E checksums: "|\307\004E" E } E } E } E on the pipeline Pipeline[ Id: bce6316c-9690-452b-80e3-0f3590533444, Nodes: 96101390-2446-40e6-a54e-36e170497e57{ip: 172.27.111.129, host: quasar-olrywk-3.quasar-olrywk.root.hwx.site, networkLocation: /default-rack, certSerialId: null}3e85204d-2399-43b5-952a-55b837eb4c1d{ip: 172.27.100.0, host: quasar-olrywk-1.quasar-olrywk.root.hwx.site, networkLocation: /default-rack, certSerialId: null}5af0340a-6fee-4ce8-9f68-37fa35566a5a{ip: 172.27.73.0, host: quasar-olrywk-9.quasar-olrywk.root.hwx.site, networkLocation: /default-rack, certSerialId: null}, Type:STAND_ALONE, Factor:THREE, State:OPEN, leaderId:96101390-2446-40e6-a54e-36e170497e57, CreationTimestamp2020-03-27T03:36:51.880Z]. E Unexpected OzoneException: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 84603913ns. [remote_addr=/172.27.73.0:9859]]{noformat}
[GitHub] [hadoop-ozone] smengcl edited a comment on issue #731: HDDS-3279. Rebase OFS branch
smengcl edited a comment on issue #731: HDDS-3279. Rebase OFS branch URL: https://github.com/apache/hadoop-ozone/pull/731#issuecomment-605202481 Unrelated flaky test `org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead`. Will commit in a min.
[GitHub] [hadoop-ozone] smengcl commented on issue #731: HDDS-3279. Rebase OFS branch
smengcl commented on issue #731: HDDS-3279. Rebase OFS branch URL: https://github.com/apache/hadoop-ozone/pull/731#issuecomment-605202481 Unrelated flaky test `org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead`. Will merge in a min.
[GitHub] [hadoop-ozone] smengcl merged pull request #731: HDDS-3279. Rebase OFS branch
smengcl merged pull request #731: HDDS-3279. Rebase OFS branch URL: https://github.com/apache/hadoop-ozone/pull/731
[GitHub] [hadoop-ozone] xiaoyuyao commented on issue #665: HDDS-3160. Disable index and filter block cache for RocksDB.
xiaoyuyao commented on issue #665: HDDS-3160. Disable index and filter block cache for RocksDB. URL: https://github.com/apache/hadoop-ozone/pull/665#issuecomment-605189323 Note that even though we don't put the filter/index blocks into the block cache after this change, they will still be put into off-heap memory by RocksDB. It would be good to track the OM JVM heap usage with and without this change during compaction to fully understand its impact.
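The option being discussed can be sketched with the RocksDB Java options API. This is a hedged configuration sketch, not the actual HDDS-3160 patch; the class name `RocksDbCacheSketch` and the 256 MB cache size are illustrative assumptions, while `BlockBasedTableConfig`, `LRUCache`, and `setCacheIndexAndFilterBlocks` are from the `org.rocksdb` API:

```java
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;

// Configuration sketch (assumed shape, not the actual Ozone change): keep data
// blocks in the block cache, but do not charge index/filter blocks to it. As
// the comment above notes, RocksDB then still holds index/filter blocks in
// off-heap table-reader memory rather than evicting them from the heap budget.
public class RocksDbCacheSketch {
    static Options buildOptions() {
        BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
        tableConfig.setBlockCache(new LRUCache(256L * 1024 * 1024)); // illustrative 256 MB data-block cache
        tableConfig.setCacheIndexAndFilterBlocks(false); // index/filter blocks bypass the block cache
        return new Options()
            .setCreateIfMissing(true)
            .setTableFormatConfig(tableConfig);
    }
}
```

With `setCacheIndexAndFilterBlocks(true)` the index/filter blocks would instead compete with data blocks for block-cache capacity, which is the trade-off the review comment asks to measure during compaction.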
[GitHub] [hadoop-ozone] adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests
adoroszlai commented on issue #723: HDDS-3281. Add timeouts to all robot tests URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605153928 > @adoroszlai, @elek are we good to merge this patch? Yes, thanks.
[GitHub] [hadoop-ozone] adoroszlai commented on issue #728: Master stable
adoroszlai commented on issue #728: Master stable URL: https://github.com/apache/hadoop-ozone/pull/728#issuecomment-605148945 > Downloaded tarball and see that it failed due to no disk space. The disk space issue was fixed after the `master-stable` branch had been created, so the fix was brought into the branch with the most recent merge from `master`:
```
* 947ca10a1 (origin/master-stable) retrigger build
* f801e60e7 Merge remote-tracking branch 'origin/master' into master-stable
|\
| * 7d132ce38 (origin/master) HDDS-3179. Pipeline placement based on Topology does not have fallback (#678)
| * 3d2856869 HDDS-3074. Make the configuration of container scrub consistent. (#722)
| * 07fcb79e8 HDDS-3284. ozonesecure-mr test fails due to lack of disk space (#725)
| * 4682babb6 HDDS-3164. Add Recon endpoint to serve missing containers and its metadata. (#714)
| * f6be7660a HDDS-3243. Recon should not have the ability to send Create/Close Container commands to Datanode. (#712)
| * 824938534 HDDS-3250. Create a separate log file for Warnings and Errors in MiniOzoneChaosCluster. (#711)
* | 58cdc36c2 Revert "HDDS-3234. Fix retry interval default in Ozone client. (#698)"
* | 1d4227b5d Revert "HDDS-3064. Get Key is hung when READ delay is injected in chunk file path. (#673)"
|/
* 512d607df Revert "HDDS-3142. Create isolated enviornment for OM to test it without SCM. (#656)"
```
[GitHub] [hadoop-ozone] bharatviswa504 edited a comment on issue #728: Master stable
bharatviswa504 edited a comment on issue #728: Master stable URL: https://github.com/apache/hadoop-ozone/pull/728#issuecomment-605143568 @elek Even after reverting HDDS-3234, the MR jobs failed. So I think HDDS-3234 is not the real issue; our underlying CI has a problem. Downloaded the tarball and saw that it failed due to no disk space: https://user-images.githubusercontent.com/8586345/77783575-5708b000-7016-11ea-9233-c228e471be96.png This PR fixed the disk space issue: https://github.com/apache/hadoop-ozone/commit/07fcb79e8253c19d9537772ab8f3d82c51a0220f
[GitHub] [hadoop-ozone] smengcl opened a new pull request #731: HDDS-3279. Rebase OFS branch
smengcl opened a new pull request #731: HDDS-3279. Rebase OFS branch URL: https://github.com/apache/hadoop-ozone/pull/731 ## What changes were proposed in this pull request? Get the necessary changes into the OFS dev branch after the rebase onto the master branch. See the description and comments in https://github.com/apache/hadoop-ozone/pull/721 ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3279 ## How was this patch tested? Tested in https://github.com/apache/hadoop-ozone/pull/721
[GitHub] [hadoop-ozone] smengcl edited a comment on issue #721: HDDS-3279. Rebase OFS branch (Draft)
smengcl edited a comment on issue #721: HDDS-3279. Rebase OFS branch (Draft) URL: https://github.com/apache/hadoop-ozone/pull/721#issuecomment-605136056 Thanks @xiaoyuyao . I am going to do the following: 1. Close this PR; 2. Merge master commits to OFS dev branch manually; 3. Create a new PR https://github.com/apache/hadoop-ozone/pull/731 with only the 3 commits I posted in this PR already; 4. Merge that new PR https://github.com/apache/hadoop-ozone/pull/731.
[jira] [Assigned] (HDDS-3285) MiniOzoneChaosCluster exits because of deadline exceeding
[ https://issues.apache.org/jira/browse/HDDS-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-3285: --- Assignee: Shashikant Banerjee > MiniOzoneChaosCluster exits because of deadline exceeding > - > > Key: HDDS-3285 > URL: https://issues.apache.org/jira/browse/HDDS-3285 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Shashikant Banerjee >Priority: Major > Labels: MiniOzoneChaosCluster > Attachments: complete.log.gz > > > 2020-03-26 21:26:48,869 [pool-326-thread-2] INFO util.ExitUtil > (ExitUtil.java:terminate(210)) - Exiting with status 1: java.io.IOException: > java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io. > grpc.StatusRuntimeException: DEADLINE_EXCEEDED: ClientCall started after > deadline exceeded: -4.330590725s from now > {code} > 2020-03-26 21:26:48,866 [pool-326-thread-2] ERROR > loadgenerators.LoadExecutors (LoadExecutors.java:load(64)) - FileSystem > LOADGEN: null Exiting due to exception > java.io.IOException: java.util.concurrent.ExecutionException: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: > DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s > from now > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:359) > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:281) > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:259) > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:119) > at > org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:199) > at > org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:133) > at > org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:254) > at > 
org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:197) > at > org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:63) > at java.io.DataInputStream.read(DataInputStream.java:100) > at > org.apache.hadoop.ozone.utils.LoadBucket$ReadOp.doPostOp(LoadBucket.java:205) > at > org.apache.hadoop.ozone.utils.LoadBucket$Op.execute(LoadBucket.java:121) > at > org.apache.hadoop.ozone.utils.LoadBucket$ReadOp.execute(LoadBucket.java:180) > at > org.apache.hadoop.ozone.utils.LoadBucket.readKey(LoadBucket.java:82) > at > org.apache.hadoop.ozone.loadgenerators.FilesystemLoadGenerator.generateLoad(FilesystemLoadGenerator.java:54) > at > org.apache.hadoop.ozone.loadgenerators.LoadExecutors.load(LoadExecutors.java:62) > at > org.apache.hadoop.ozone.loadgenerators.LoadExecutors.lambda$startLoad$0(LoadExecutors.java:78) > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.ExecutionException: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: > DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s > from now > at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:336) > ... 
20 more > Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: > DEADLINE_EXCEEDED: ClientCall started after deadline exceeded: -4.330590725s > from now > at > org.apache.ratis.thirdparty.io.grpc.Status.asRuntimeException(Status.java:533) > at > org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:442) > at > org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) >at > org.apache.ratis.thirdparty.io.grpc.internal.CensusStat
[jira] [Updated] (HDDS-3291) Write operation when both OM followers are shutdown
[ https://issues.apache.org/jira/browse/HDDS-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-3291: Reporter: Nilotpal Nandi (was: Bharat Viswanadham) > Write operation when both OM followers are shutdown > --- > > Key: HDDS-3291 > URL: https://issues.apache.org/jira/browse/HDDS-3291 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Bharat Viswanadham >Priority: Major > > steps taken : > -- > 1. In OM HA environment, shutdown both OM followers. > 2. Start PUT key operation. > PUT key operation is hung. > Cluster details : > https://quasar-vwryte-1.quasar-vwryte.root.hwx.site:7183/cmf/home > Snippet of OM log on LEADER: > {code:java} > 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: 
UNAVAILABLE: io > exception > 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,250 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,251 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > 
om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: > om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: > org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io > exception > 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: > om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 > 2020-03-24 04:16:48,752 WARN org.apache.ratis.grpc.server.GrpcLogAppende
[GitHub] [hadoop-ozone] smengcl closed pull request #721: HDDS-3279. Rebase OFS branch
smengcl closed pull request #721: HDDS-3279. Rebase OFS branch URL: https://github.com/apache/hadoop-ozone/pull/721
[jira] [Updated] (HDDS-2964) Fix @Ignore-d integration tests
[ https://issues.apache.org/jira/browse/HDDS-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDDS-2964: -- Status: Open (was: Patch Available) > Fix @Ignore-d integration tests > --- > > Key: HDDS-2964 > URL: https://issues.apache.org/jira/browse/HDDS-2964 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: test >Reporter: Marton Elek >Priority: Major > > We marked all the intermittent unit tests with @Ignore to get reliable > feedback from CI builds. > Before HDDS-2833 we had 21 @Ignore annotations, HDDS-2833 introduced 34 new > ones. > We need to review all of these tests and either fix, delete, or convert > them to real unit tests. > The current list of ignored tests: > {code:java} > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestContainerPlacement.java: @Ignore > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestDeadNodeHandler.java: @Ignore("Tracked > by HDDS-2508.") > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestSCMNodeManager.java: @Ignore > hadoop-hdds/server-scm > org/apache/hadoop/hdds/scm/node/TestSCMNodeManager.java: @Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/container/TestContainerStateManagerIntegration.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/container/TestContainerStateManagerIntegration.java: > @Ignore("TODO:HDDS-1159") > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/pipeline/TestNodeFailure.java: @Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/pipeline/TestNodeFailure.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/pipeline/TestRatisPipelineCreateAndDestroy.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/hdds/scm/safemode/TestSCMSafeModeWithPipelineRules.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/Test2WayCommitInRatis.java:@Ignore > 
hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestBlockOutputStreamWithFailures.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestCloseContainerHandlingByClient.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestCloseContainerHandlingByClient.java: > @Ignore // test needs to be fixed after close container is handled for > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestCommitWatcher.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestContainerReplicationEndToEnd.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestContainerStateMachineFailures.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestContainerStateMachine.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestDeleteWithSlowFollower.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestFailureHandlingByClient.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestMultiBlockWritesWithDnFailures.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneAtRestEncryption.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneClientRetriesOnException.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java: @Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java: > @Ignore("Debug Jenkins Timeout") > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientForAclAuditLog.java:@Ignore("Fix > this after adding audit support for HA Acl code. 
This will be " + > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientWithRatis.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestSecureOzoneRpcClient.java: > @Ignore("Needs to be moved out of this class as client setup is static") > hadoop-ozone/integration-test > org/apache/hadoop/ozone/client/rpc/TestWatchForCommit.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestBlockDeletion.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/common/statemachine/commandhandler/TestCloseContainerByPipeline.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/common/transport/server/ratis/TestCSMMetrics.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainer.java:@Ignore > hadoop-ozone/integration-test > org/apache/hadoop/o
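A list in the format above can be regenerated by grepping the test sources for `@Ignore`. A minimal sketch follows; the sample files created here stand in for the real repo tree, so the paths and counts are illustrative only:

```shell
set -e
# build a tiny stand-in for the hadoop-ozone source tree
tmp=$(mktemp -d)
mkdir -p "$tmp/integration-test/org/apache/hadoop/ozone"
cat > "$tmp/integration-test/org/apache/hadoop/ozone/TestA.java" <<'EOF'
@Ignore("Tracked by HDDS-2508.")
public class TestA {}
EOF
cat > "$tmp/integration-test/org/apache/hadoop/ozone/TestB.java" <<'EOF'
public class TestB {}
EOF

# list every @Ignore annotation with its file, as in the issue description
grep -Rn '@Ignore' "$tmp" --include='*.java'

# count the ignored test files (the issue tracks 21 + 34 in the real tree)
grep -Rl '@Ignore' "$tmp" --include='*.java' | wc -l
```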
[GitHub] [hadoop-ozone] smengcl edited a comment on issue #582: HDDS-3047. ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get user's full principal name instead of login name by default
smengcl edited a comment on issue #582: HDDS-3047. ObjectStore#listVolumesByUser and CreateVolumeHandler#call should get user's full principal name instead of login name by default URL: https://github.com/apache/hadoop-ozone/pull/582#issuecomment-605130860 Rebased to latest master to include (unrelated) test failure fix HDDS-3284.
[GitHub] [hadoop-ozone] smengcl commented on issue #696: HDDS-3056. Allow users to list volumes they have access to, and optionally allow all users to list all volumes
smengcl commented on issue #696: HDDS-3056. Allow users to list volumes they have access to, and optionally allow all users to list all volumes URL: https://github.com/apache/hadoop-ozone/pull/696#issuecomment-605129354 Rebased onto latest master to include the acceptance test failure fix HDDS-3284.
[jira] [Created] (HDDS-3292) Support Hadoop 3.3
Wei-Chiu Chuang created HDDS-3292: - Summary: Support Hadoop 3.3 Key: HDDS-3292 URL: https://issues.apache.org/jira/browse/HDDS-3292 Project: Hadoop Distributed Data Store Issue Type: Task Reporter: Wei-Chiu Chuang Hadoop 3.3.0 is coming out soon. We should start testing Ozone on Hadoop 3.3 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3291) Write operation when both OM followers are shutdown
[ https://issues.apache.org/jira/browse/HDDS-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-3291: - Description: steps taken : -- 1. In OM HA environment, shutdown both OM followers. 2. Start PUT key operation. PUT key operation is hung. Cluster details : https://quasar-vwryte-1.quasar-vwryte.root.hwx.site:7183/cmf/home Snippet of OM log on LEADER: {code:java} 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,250 WARN 
org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,251 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,252 
INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,752 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,752 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,753 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,753
[jira] [Created] (HDDS-3291) Write operation when both OM followers are shutdown
Bharat Viswanadham created HDDS-3291: Summary: Write operation when both OM followers are shutdown Key: HDDS-3291 URL: https://issues.apache.org/jira/browse/HDDS-3291 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham steps taken : -- 1. In OM HA environment, shutdown both OM followers. 2. Start PUT key operation. PUT key operation is hung. Cluster details : https://quasar-vwryte-1.quasar-vwryte.root.hwx.site:7183/cmf/home Snippet of OM log on LEADER: {code:java} 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,249 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:46,250 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,750 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:46,750 INFO org.apache.ratis.server.impl.FollowerInfo: 
om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,250 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,251 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,251 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,751 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:47,752 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,252 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om2: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,252 INFO org.apache.ratis.server.impl.FollowerInfo: om1@group-9F198C4C3682->om3: nextIndex: updateUnconditionally 360 -> 359 2020-03-24 04:16:48,752 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om3-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception 2020-03-24 04:16:48,752 WARN org.apache.ratis.grpc.server.GrpcLogAppender: om1@group-9F198C4C3682->om2-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException:
[GitHub] [hadoop-ozone] hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests
hanishakoneru commented on issue #723: HDDS-3281. Add timeouts to all robot tests URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-605096549 > Note: originally I suggested putting the Timeout in the commonlib.robot to avoid code duplication, but I tested it and it doesn't work. Yes. I learned that Robot Framework does not allow a "global timeout" by design. @adoroszlai, @elek are we good to merge this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-3288) Update default RPC handler SCM/OM count to 100
[ https://issues.apache.org/jira/browse/HDDS-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Radhakrishnan updated HDDS-3288: --- Status: Patch Available (was: Open) > Update default RPC handler SCM/OM count to 100 > --- > > Key: HDDS-3288 > URL: https://issues.apache.org/jira/browse/HDDS-3288 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: om, SCM > Reporter: Rakesh Radhakrishnan > Assignee: Rakesh Radhakrishnan > Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Presently, the default RPC handler counts of {{ozone.scm.handler.count.key=10}} and > {{ozone.om.handler.count.key=20}} are too small, and it is good to > increase them to more realistic values. > {code:java} > ozone.om.handler.count.key=100 > ozone.scm.handler.count.key=100 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[GitHub] [hadoop-ozone] elek commented on issue #687: HDDS-2184. Rename ozone scmcli to ozone admin
elek commented on issue #687: HDDS-2184. Rename ozone scmcli to ozone admin URL: https://github.com/apache/hadoop-ozone/pull/687#issuecomment-605047108 Base branch is changed. Please merge it manually instead of using the GitHub UI button.
[jira] [Created] (HDDS-3290) Remove deprecated RandomKeyGenerator
Marton Elek created HDDS-3290: - Summary: Remove deprecated RandomKeyGenerator Key: HDDS-3290 URL: https://issues.apache.org/jira/browse/HDDS-3290 Project: Hadoop Distributed Data Store Issue Type: Improvement Reporter: Marton Elek Assignee: Marton Elek Our first Freon test (RandomKeyGenerator) is deprecated, as all of its functionality is available with a simplified architecture (BaseFreonGenerator). We can remove it (especially as it's flaky...)
[GitHub] [hadoop-ozone] elek commented on issue #710: HDDS-3173. Provide better default JVM options
elek commented on issue #710: HDDS-3173. Provide better default JVM options URL: https://github.com/apache/hadoop-ozone/pull/710#issuecomment-605005247 > What do you think? I think it's a very good idea to print out the flags when we touch them, but I am not sure what your suggestion is exactly: 1. Print out the JVM settings (and/or a warning) when we set the defaults (which can be unexpected) 2. Print the JVM settings always? 3. Print out a notification when we don't add the defaults (any other -XX options are used)? What is your preference? I am thinking about printing out all the JVM options *always* (similar to the classpath) + a warning that we defined the default GC parameters (2nd option)

```
NOTE: default JVM parameters are applied. Use any -XX: JVM parameter to use your own instead of the defaults.
CLASSPATH: .
HADOOP_OPTS: ...
```

Is it possible that somebody adds secret information to the `HADOOP_OPTS` which should be hidden? (Do we need to use the 1st option?) > and if someone sets something we would like to let him/her know what would have been set, so he/she can review and preserve what is still needed. As a main rule we don't set anything when any of the `-XX` flags are present. But I agree that it's clearer if it's somehow printed out.
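The rule elek describes — apply the default GC flags only when the user has not supplied any `-XX` option, and say so loudly — can be sketched as follows. This is a hypothetical helper, not Ozone's actual shell logic, and the default flag value is illustrative:

```python
def apply_jvm_defaults(user_opts: str, defaults: str = "-XX:+UseG1GC") -> str:
    """Return the JVM options to use: leave user_opts untouched when the
    user already set any -XX flag, otherwise append the defaults."""
    if "-XX" in user_opts:
        # The user tuned the JVM explicitly: do not override anything.
        return user_opts
    # No -XX flag present: add the defaults and make that visible.
    print("NOTE: default JVM parameters are applied. "
          "Use any -XX: JVM parameter to use your own instead of the defaults.")
    return (user_opts + " " + defaults).strip()
```

Either way, printing the effective options (option 2 in the list above) keeps the behavior observable even when the defaults are skipped.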
[jira] [Updated] (HDDS-3289) Add a freon generator to create nested directories
[ https://issues.apache.org/jira/browse/HDDS-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3289: - Labels: pull-request-available (was: ) > Add a freon generator to create nested directories > -- > > Key: HDDS-3289 > URL: https://issues.apache.org/jira/browse/HDDS-3289 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Tools > Reporter: Aryan Gupta > Assignee: Aryan Gupta > Priority: Major > Labels: pull-request-available > > This Jira proposes to add functionality to Freon to create nested > directories. Multiple child directories can be created inside each leaf > directory, and multiple top-level directories can be created as well.
[GitHub] [hadoop-ozone] elek commented on issue #713: HDDS-3251. Bump version to 0.6.0-SNAPSHOT
elek commented on issue #713: HDDS-3251. Bump version to 0.6.0-SNAPSHOT URL: https://github.com/apache/hadoop-ozone/pull/713#issuecomment-604995537 BTW, I changed the base branch to master-stable, so please don't merge it with the GitHub UI, just manually...
[GitHub] [hadoop-ozone] aryangupta1998 opened a new pull request #730: HDDS-3289. Add a freon generator to create nested directories.
aryangupta1998 opened a new pull request #730: HDDS-3289. Add a freon generator to create nested directories. URL: https://github.com/apache/hadoop-ozone/pull/730 ## What changes were proposed in this pull request? This Jira proposes to add functionality to Freon to create nested directories. Multiple child directories can be created inside each leaf directory, and multiple top-level directories can be created as well. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3289 ## How was this patch tested? Tested manually by running the Freon Directory Generator.
[GitHub] [hadoop-ozone] elek commented on issue #713: HDDS-3251. Bump version to 0.6.0-SNAPSHOT
elek commented on issue #713: HDDS-3251. Bump version to 0.6.0-SNAPSHOT URL: https://github.com/apache/hadoop-ozone/pull/713#issuecomment-604995129 Thanks for the review @dineshchitlangia > @elek the failures are unrelated to your proposed change. I am -1 to merge anything without a clean build, even if the failures are unrelated ;-) It's very easy to miss something when one unrelated test hides a related problem. Working on cleaning up master first, and will merge it after a green build.
[GitHub] [hadoop-ozone] elek commented on issue #665: HDDS-3160. Disable index and filter block cache for RocksDB.
elek commented on issue #665: HDDS-3160. Disable index and filter block cache for RocksDB. URL: https://github.com/apache/hadoop-ozone/pull/665#issuecomment-604992384 Base branch is changed. Please don't merge it from the GitHub UI, only manually.
[GitHub] [hadoop-ozone] elek commented on issue #714: HDDS-3164. Add Recon endpoint to serve missing containers and its metadata.
elek commented on issue #714: HDDS-3164. Add Recon endpoint to serve missing containers and its metadata. URL: https://github.com/apache/hadoop-ozone/pull/714#issuecomment-604987204 If you see a flaky test, please disable it (without an issue) + create a new open issue and repeat the build. There is some risk that one flakiness hides another. Merging patches without a green build makes it harder to debug flaky tests.
[GitHub] [hadoop-ozone] elek commented on issue #722: HDDS-3074. Make the configuration of container scrub consistent.
elek commented on issue #722: HDDS-3074. Make the configuration of container scrub consistent. URL: https://github.com/apache/hadoop-ozone/pull/722#issuecomment-604986683 If you see a flaky test, please disable it (without an issue) + create a new open issue and repeat the build. There is some risk that one flakiness hides another. Merging patches without a green build makes it harder to debug flaky tests.
[GitHub] [hadoop-ozone] elek commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
elek commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#issuecomment-604986545 If you see a flaky test, please disable it (without an issue) + create a new open issue and repeat the build. There is some risk that one flakiness hides another. Merging patches without a green build makes it harder to debug flaky tests.
[GitHub] [hadoop-ozone] elek commented on issue #665: HDDS-3160. Disable index and filter block cache for RocksDB.
elek commented on issue #665: HDDS-3160. Disable index and filter block cache for RocksDB. URL: https://github.com/apache/hadoop-ozone/pull/665#issuecomment-604982534 > Given this does not happen during the key creation earlier. This seems very likely from a RocksDB compaction, which updates the filters/indices of the SSTs. Yes, this is after a compaction: ![image](https://user-images.githubusercontent.com/170549/77745765-2a579700-701c-11ea-8d94-74d3193b1be1.png) I think after the compaction both the index/filter sizes and the number of blocks increase, and the fixed amount of cache is no longer enough.
[GitHub] [hadoop-ozone] elek commented on issue #708: HDDS-3239. Provide message-level metrics from the generic protocol dispatch.
elek commented on issue #708: HDDS-3239. Provide message-level metrics from the generic protocol dispatch. URL: https://github.com/apache/hadoop-ozone/pull/708#issuecomment-604981594 Base branch is changed. Please don't merge it from the GitHub web UI, only manually.
[jira] [Created] (HDDS-3289) Add a freon generator to create nested directories
Aryan Gupta created HDDS-3289: - Summary: Add a freon generator to create nested directories Key: HDDS-3289 URL: https://issues.apache.org/jira/browse/HDDS-3289 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Tools Reporter: Aryan Gupta Assignee: Aryan Gupta This Jira proposes to add functionality to Freon to create nested directories. Multiple child directories can be created inside each leaf directory, and multiple top-level directories can be created as well.
[jira] [Updated] (HDDS-3288) Update default RPC handler SCM/OM count to 100
[ https://issues.apache.org/jira/browse/HDDS-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-3288: - Labels: pull-request-available (was: ) > Update default RPC handler SCM/OM count to 100 > --- > > Key: HDDS-3288 > URL: https://issues.apache.org/jira/browse/HDDS-3288 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: om, SCM > Reporter: Rakesh Radhakrishnan > Assignee: Rakesh Radhakrishnan > Priority: Minor > Labels: pull-request-available > > Presently, the default RPC handler counts of {{ozone.scm.handler.count.key=10}} and > {{ozone.om.handler.count.key=20}} are too small, and it is good to > increase them to more realistic values. > {code:java} > ozone.om.handler.count.key=100 > ozone.scm.handler.count.key=100 > {code}
[GitHub] [hadoop-ozone] rakeshadr opened a new pull request #729: HDDS-3288: Update default RPC handler SCM/OM count to 100
rakeshadr opened a new pull request #729: HDDS-3288: Update default RPC handler SCM/OM count to 100 URL: https://github.com/apache/hadoop-ozone/pull/729 ## What changes were proposed in this pull request? Presently, the default RPC handler counts of ozone.scm.handler.count.key=10 and ozone.om.handler.count.key=20 are too small, and it is good to increase them to more realistic values. ``` ozone.om.handler.count.key=100 ozone.scm.handler.count.key=100 ``` ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-3288 ## How was this patch tested? Config changes only; no UTs added.
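For reference, a sketch of how the same values could be set explicitly in `ozone-site.xml` (only needed as an override; once this change ships, 100 is the built-in default — property names are taken from the issue description above):

```xml
<!-- ozone-site.xml: explicit override of the RPC handler counts -->
<property>
  <name>ozone.om.handler.count.key</name>
  <value>100</value>
</property>
<property>
  <name>ozone.scm.handler.count.key</name>
  <value>100</value>
</property>
```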
[GitHub] [hadoop-ozone] runzhiwang closed pull request #709: HDDS-3244. Improve write efficiency by opening RocksDB only once
runzhiwang closed pull request #709: HDDS-3244. Improve write efficiency by opening RocksDB only once URL: https://github.com/apache/hadoop-ozone/pull/709
[GitHub] [hadoop-ozone] runzhiwang opened a new pull request #709: HDDS-3244. Improve write efficiency by opening RocksDB only once
runzhiwang opened a new pull request #709: HDDS-3244. Improve write efficiency by opening RocksDB only once URL: https://github.com/apache/hadoop-ozone/pull/709 ## What changes were proposed in this pull request? What's the problem? 1. This happens when the datanode creates a container. I split `HddsDispatcher.WriteChunk` into `HddsDispatcher.WriteData` and `HddsDispatcher.CommitData` as the code shows, to show their cost in the Jaeger UI. ![image](https://user-images.githubusercontent.com/51938049/77373666-cd51ac00-6da3-11ea-9c77-8d6864f05aac.png) 2. When the datanode creates each container, a new RocksDB instance is [created](https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java#L76) in `HddsDispatcher.WriteData`, but the created RocksDB is then [closed](https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java#L83); not until `HddsDispatcher.PutBlock` is the RocksDB [opened](https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/utils/ContainerCache.java#L123) again, so RocksDB is opened twice on each datanode. And the RocksDB is not used until `HddsDispatcher.PutBlock`. 3. Besides, as the image shows, when the leader datanode opens RocksDB in `HddsDispatcher.WriteData`, the 2 follower datanodes cannot open RocksDB until the leader finishes. So the whole write costs 3 * cost(RocksDB.open) = 600ms to open RocksDB. ![image](https://user-images.githubusercontent.com/51938049/77320657-f0507180-6d4b-11ea-8188-acb26785f608.png) 4. When uploading a 3KB file five times, the average cost is 912ms. ![image](https://user-images.githubusercontent.com/51938049/77320748-170ea800-6d4c-11ea-8131-81a1180a349a.png) How to fix it? 1. Open RocksDB in `HddsDispatcher.CommitData` rather than `HddsDispatcher.WriteData`, because the leader datanode and the 2 follower datanodes can open RocksDB in parallel in `HddsDispatcher.CommitData`. 2. Put the RocksDB handle into the cache after opening it in `HddsDispatcher.CommitData`, to avoid opening it again in `HddsDispatcher.PutBlock`. 3. So the whole write costs 1 * cost(RocksDB.open) = 200ms to open RocksDB. ![image](https://user-images.githubusercontent.com/51938049/77321385-17f40980-6d4d-11ea-925c-d3960342ed06.png) 4. When uploading a 3KB file five times, the average cost is 516ms, an improvement of about 44%. ![image](https://user-images.githubusercontent.com/51938049/77321360-0dd20b00-6d4d-11ea-84b3-efec6734db74.png) ## What is the link to the Apache JIRA https://issues.apache.org/jira/projects/HDDS/issues/HDDS-3244 ## How was this patch tested? I will change the UTs to pass the CI.
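The core of the fix described above — open the database once and reuse the handle from a cache instead of reopening it on every operation — can be sketched like this. This is a minimal stand-in for the idea behind Ozone's `ContainerCache`, using a plain dict and a fake opener instead of a real RocksDB:

```python
import threading

class DBHandleCache:
    """Keep one open handle per container; open each DB at most once."""

    def __init__(self, opener):
        self._opener = opener          # function: container_id -> handle
        self._handles = {}
        self._lock = threading.Lock()
        self.open_count = 0            # how many real opens happened

    def get(self, container_id):
        with self._lock:
            handle = self._handles.get(container_id)
            if handle is None:
                # First access for this container: pay the open cost once.
                self.open_count += 1
                handle = self._opener(container_id)
                self._handles[container_id] = handle
            return handle

# CommitData and PutBlock both go through the cache, so the second
# access reuses the handle instead of reopening the database.
cache = DBHandleCache(lambda cid: f"db-{cid}")
commit_handle = cache.get(1)    # opens once
putblock_handle = cache.get(1)  # cache hit, no reopen
```

Because the open happens lazily on first access, each replica can pay the cost independently, which is what lets the leader and followers open in parallel.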
[jira] [Created] (HDDS-3288) Update default RPC handler SCM/OM count to 100
Rakesh Radhakrishnan created HDDS-3288: -- Summary: Update default RPC handler SCM/OM count to 100 Key: HDDS-3288 URL: https://issues.apache.org/jira/browse/HDDS-3288 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: om, SCM Reporter: Rakesh Radhakrishnan Assignee: Rakesh Radhakrishnan Presently, the default RPC handler counts of {{ozone.scm.handler.count.key=10}} and {{ozone.om.handler.count.key=20}} are too small, and it is good to increase them to more realistic values. {code:java} ozone.om.handler.count.key=100 ozone.scm.handler.count.key=100 {code}
[GitHub] [hadoop-ozone] elek commented on a change in pull request #700: HDDS-3172 Use DBStore instead of MetadataStore in SCM
elek commented on a change in pull request #700: HDDS-3172 Use DBStore instead of MetadataStore in SCM URL: https://github.com/apache/hadoop-ozone/pull/700#discussion_r399153032 ## File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconStorageContainerManagerFacade.java ## @@ -92,14 +95,22 @@ public ReconStorageContainerManagerFacade(OzoneConfiguration conf, this.ozoneConfiguration = getReconScmConfiguration(conf); this.scmStorageConfig = new ReconStorageConfig(conf); this.clusterMap = new NetworkTopologyImpl(conf); +DBStore dbStore = new SCMDBDefinition().createDBStore(conf); Review comment: Good catch, thanks. Fixed.
[GitHub] [hadoop-ozone] elek commented on a change in pull request #700: HDDS-3172 Use DBStore instead of MetadataStore in SCM
elek commented on a change in pull request #700: HDDS-3172 Use DBStore instead of MetadataStore in SCM URL: https://github.com/apache/hadoop-ozone/pull/700#discussion_r399152379 ## File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconNodeManager.java ## @@ -44,6 +40,10 @@ import org.apache.hadoop.ozone.protocol.commands.SCMCommand; import org.apache.hadoop.ozone.recon.ReconUtils; import org.apache.hadoop.util.Time; + +import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_DB_CACHE_SIZE_DEFAULT; +import static org.apache.hadoop.hdds.scm.ScmConfigKeys.OZONE_SCM_DB_CACHE_SIZE_MB; +import static org.apache.hadoop.ozone.recon.ReconConstants.RECON_SCM_NODE_DB; import org.slf4j.Logger; import org.slf4j.LoggerFactory; Review comment: No particular reason, just reducing the size of the patch. I renamed my original Jira to fix SCM only. Recon and Datanode can be fixed in the next two Jiras. (But thanks for pointing it out to me, I was not aware of this. It also can be moved to SCM. AFAIK @Nanda had a plan to persist it on the SCM side, too...)
[GitHub] [hadoop-ozone] elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests
elek commented on issue #723: HDDS-3281. Add timeouts to all robot tests URL: https://github.com/apache/hadoop-ozone/pull/723#issuecomment-604911971 > @adoroszlai, I think even with that limitation the timeout will help us isolate the problem. Let's say the acceptance suite is cancelled, we could still get to know which test contributed to the timeout. There are two timeouts: 1. timeout of the test (measured between two steps) 2. timeout of one test step As far as I understood, @adoroszlai warned us that even if we have a test-level timeout it doesn't help at all if the 2nd is not in place. If one `curl`-based command is hanging (and the robot test doesn't do a `kill`), it won't be stopped (and we won't have any logs / results). But I agree that even without the 2nd, it's good to have this patch. On the other hand, I tested it with sleep, and it seems to be working for me:

```
*** Settings ***
Documentation    Timeout test
Library          OperatingSystem
Test Timeout     20 seconds
#Resource        commonlib.robot

*** Test cases ***
Execute PI calculation
    ${output} =    Run    sleep 60
    Should Contain    ${output}    completed successfully
```

```
time robot test.robot
== Test :: Timeout test ==
Execute PI calculation    | FAIL |
Test timeout 20 seconds exceeded.
--
Test :: Timeout test    | FAIL |
1 critical test, 0 passed, 1 failed
1 test total, 0 passed, 1 failed
==
Output: /home/elek/projects/ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/output.xml
Log: /home/elek/projects/ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/log.html
Report: /home/elek/projects/ozone/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/smoketest/report.html
robot test.robot 0.25s user 0.03s system 1% cpu 20.285 total
```

As you see, my sleep command was killed after 20 seconds. Note: originally I suggested putting the `Timeout` in the `commonlib.robot` to avoid code duplication, but I tested it and it doesn't work.
[GitHub] [hadoop-ozone] elek commented on issue #720: HDDS-3185 Construct a standalone ratis server for SCM.
elek commented on issue #720: HDDS-3185 Construct a standalone ratis server for SCM. URL: https://github.com/apache/hadoop-ozone/pull/720#issuecomment-604904324 Speaking about the code in the patch: I would suggest using the new annotation-based configuration model for new code: https://cwiki.apache.org/confluence/display/HADOOP/Java-based+configuration+API (I know it's not used everywhere, therefore it's hard to notice this movement, and some features may still be missing, but new code seems to be a good opportunity to switch to the new API)
[GitHub] [hadoop-ozone] elek commented on issue #720: HDDS-3185 Construct a standalone ratis server for SCM.
elek commented on issue #720: HDDS-3185 Construct a standalone ratis server for SCM. URL: https://github.com/apache/hadoop-ozone/pull/720#issuecomment-604901383 @timmylicheng Thanks for explaining it. I will add your explanation to the next Community Meeting minutes. ``` SCM-HA: First draft is already available and will be updated soon based on the existing feedback. Prototype implementation has been started at the HDDS-2823 branch. ```
[GitHub] [hadoop-ozone] ChenSammi commented on issue #720: HDDS-3185 Construct a standalone ratis server for SCM.
ChenSammi commented on issue #720: HDDS-3185 Construct a standalone ratis server for SCM. URL: https://github.com/apache/hadoop-ozone/pull/720#issuecomment-604899459 +1
[GitHub] [hadoop-ozone] ChenSammi merged pull request #720: HDDS-3185 Construct a standalone ratis server for SCM.
ChenSammi merged pull request #720: HDDS-3185 Construct a standalone ratis server for SCM. URL: https://github.com/apache/hadoop-ozone/pull/720
[GitHub] [hadoop-ozone] elek opened a new pull request #728: Master stable
elek opened a new pull request #728: Master stable URL: https://github.com/apache/hadoop-ozone/pull/728 ## What changes were proposed in this pull request? Recently we have seen a lot of timeout errors in integration tests and acceptance tests. My theory is that it is caused by the combination of (HDDS-3234. Fix retry interval default in Ozone client) and (HDDS-3064. Get Key is hung when READ delay is injected in chunk file path.) (HDDS-3285, HDDS-3257, ...) ## What is the link to the Apache JIRA We have the original JIRAs. ## How was this patch tested? Full CI + I will use this branch as the target branch of my PRs.
[GitHub] [hadoop-ozone] elek commented on issue #690: HDDS-3221. Refactor SafeModeHandler to use a Notification Interface
elek commented on issue #690: HDDS-3221. Refactor SafeModeHandler to use a Notification Interface URL: https://github.com/apache/hadoop-ozone/pull/690#issuecomment-604894837 I am not fully happy that my suggestion was ignored. I think it would be better to switch to the EventQueue unless we have strong arguments against it (which I would happily accept, but I haven't seen yet). I don't think it's a good idea to develop more complexity on top of a structure that we would like to replace very soon...
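To make the trade-off in the comment above concrete: an event queue decouples the component that announces a state change (e.g. leaving safe mode) from the handlers that react to it. The following is a self-contained toy sketch of that publish/subscribe pattern; it is not the Ozone EventQueue API, and all names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy publish/subscribe queue: publishers fire named events,
// subscribers register handlers, and neither side knows the other.
public class TinyEventQueue {
    private final Map<String, List<Consumer<Object>>> handlers =
        new HashMap<>();

    // Register a handler for a named event.
    public void addHandler(String event, Consumer<Object> handler) {
        handlers.computeIfAbsent(event, e -> new ArrayList<>()).add(handler);
    }

    // Deliver a payload to every handler registered for the event.
    public void fireEvent(String event, Object payload) {
        for (Consumer<Object> h : handlers.getOrDefault(event, List.of())) {
            h.accept(payload);
        }
    }
}
```

With this shape, a safe-mode notifier would only call fireEvent, and new listeners can be added without touching the notifier, which is the decoupling argued for above.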
[jira] [Updated] (HDDS-3257) Intermittent timeout in integration tests
[ https://issues.apache.org/jira/browse/HDDS-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-3257: --- Attachment: org.apache.hadoop.fs.ozone.contract.ITestOzoneContractMkdir-output.txt > Intermittent timeout in integration tests > - > > Key: HDDS-3257 > URL: https://issues.apache.org/jira/browse/HDDS-3257 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Shashikant Banerjee >Priority: Critical > Attachments: > org.apache.hadoop.fs.ozone.contract.ITestOzoneContractMkdir-output.txt, > org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures-output.txt, > org.apache.hadoop.ozone.freon.TestOzoneClientKeyGenerator-output.txt, > org.apache.hadoop.ozone.freon.TestRandomKeyGenerator-output.txt > > > Even after the changes done in HDDS-3086, some integration tests (especially > in it-freon) are intermittently timing out. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-3257) Intermittent timeout in integration tests
[ https://issues.apache.org/jira/browse/HDDS-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068434#comment-17068434 ] Attila Doroszlai commented on HDDS-3257: {code:title=https://github.com/apache/hadoop-ozone/runs/538728907} --- Test set: org.apache.hadoop.fs.ozone.contract.ITestOzoneContractMkdir --- Tests run: 8, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 396.407 s <<< FAILURE! - in org.apache.hadoop.fs.ozone.contract.ITestOzoneContractMkdir testMkdirOverParentFile(org.apache.hadoop.fs.ozone.contract.ITestOzoneContractMkdir) Time elapsed: 180.022 s <<< ERROR! java.lang.Exception: test timed out after 180000 milliseconds ... at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:525) at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:488) at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:503) at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:144) at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleStreamAction(KeyOutputStream.java:482) at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:456) at org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:509) at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.close(OzoneFSOutputStream.java:56) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) at org.apache.hadoop.fs.contract.ContractTestUtils.createFile(ContractTestUtils.java:638) at org.apache.hadoop.fs.contract.AbstractContractMkdirTest.testMkdirOverParentFile(AbstractContractMkdirTest.java:92) testNoMkdirOverFile(org.apache.hadoop.fs.ozone.contract.ITestOzoneContractMkdir) Time elapsed: 180.006 s <<< 
ERROR! java.lang.Exception: test timed out after 180000 milliseconds ... at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.waitOnFlushFutures(BlockOutputStream.java:525) at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.handleFlush(BlockOutputStream.java:488) at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.close(BlockOutputStream.java:503) at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.close(BlockOutputStreamEntry.java:144) at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleStreamAction(KeyOutputStream.java:482) at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleFlushOrClose(KeyOutputStream.java:456) at org.apache.hadoop.ozone.client.io.KeyOutputStream.close(KeyOutputStream.java:509) at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.close(OzoneFSOutputStream.java:56) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) at org.apache.hadoop.fs.contract.ContractTestUtils.createFile(ContractTestUtils.java:638) at org.apache.hadoop.fs.contract.AbstractContractMkdirTest.testNoMkdirOverFile(AbstractContractMkdirTest.java:66) {code} > Intermittent timeout in integration tests > - > > Key: HDDS-3257 > URL: https://issues.apache.org/jira/browse/HDDS-3257 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Shashikant Banerjee >Priority: Critical > Attachments: > org.apache.hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures-output.txt, > org.apache.hadoop.ozone.freon.TestOzoneClientKeyGenerator-output.txt, > org.apache.hadoop.ozone.freon.TestRandomKeyGenerator-output.txt > > > Even after the changes done in HDDS-3086, some integration tests (especially > in it-freon) are intermittently timing out. 
[jira] [Updated] (HDDS-3284) ozonesecure-mr test fails due to lack of disk space
[ https://issues.apache.org/jira/browse/HDDS-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-3284: --- Fix Version/s: 0.6.0 Resolution: Fixed Status: Resolved (was: Patch Available) > ozonesecure-mr test fails due to lack of disk space > --- > > Key: HDDS-3284 > URL: https://issues.apache.org/jira/browse/HDDS-3284 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.6.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{ozonesecure-mr}} acceptance test is failing with {{No space available in > any of the local directories.}}
[jira] [Updated] (HDDS-3287) keyLocationVersions become bigger and bigger when upload the same file many times.
[ https://issues.apache.org/jira/browse/HDDS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-3287: - Summary: keyLocationVersions become bigger and bigger when upload the same file many times. (was: OmKeyInfo become bigger and bigger when upload the same file many times.) > keyLocationVersions become bigger and bigger when upload the same file many > times. > -- > > Key: HDDS-3287 > URL: https://issues.apache.org/jira/browse/HDDS-3287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > > Because keyLocationVersions get bigger and gi
[jira] [Updated] (HDDS-3287) keyLocationVersions become bigger and bigger when upload the same file many times.
[ https://issues.apache.org/jira/browse/HDDS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-3287: - Description: Add a config to define how many versions to keep. (was: Because keyLocationVersions get bigger and gi) > keyLocationVersions become bigger and bigger when upload the same file many > times. > -- > > Key: HDDS-3287 > URL: https://issues.apache.org/jira/browse/HDDS-3287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > > Add a config to define how many versions to keep.
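The proposed improvement amounts to bounding the key's location-version list by a configurable count. A minimal, self-contained sketch of the idea follows; the class and method names are hypothetical and not taken from the Ozone codebase.

```java
import java.util.ArrayList;
import java.util.List;

public class VersionTrimmer {

    // Hypothetical helper illustrating the proposed config's effect:
    // retain only the most recent maxVersions entries of a key's
    // location-version list, dropping the oldest ones.
    static <T> List<T> trimToLatest(List<T> versions, int maxVersions) {
        int from = Math.max(0, versions.size() - maxVersions);
        // Copy into a fresh list so the result does not alias the input.
        return new ArrayList<>(versions.subList(from, versions.size()));
    }
}
```

Applied on each overwrite, this keeps OmKeyInfo bounded regardless of how many times the same key is uploaded, at the cost of losing the ability to read the trimmed older versions.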
[jira] [Updated] (HDDS-3287) OmKeyInfo become bigger and bigger when upload the same file many times.
[ https://issues.apache.org/jira/browse/HDDS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-3287: - Description: Because keyLocationVersions get bigger and gi > OmKeyInfo become bigger and bigger when upload the same file many times. > > > Key: HDDS-3287 > URL: https://issues.apache.org/jira/browse/HDDS-3287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major > > Because keyLocationVersions get bigger and gi
[jira] [Commented] (HDDS-3287) OmKeyInfo become bigger and bigger when upload the same file many times.
[ https://issues.apache.org/jira/browse/HDDS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17068422#comment-17068422 ] runzhiwang commented on HDDS-3287: -- I'm working on it > OmKeyInfo become bigger and bigger when upload the same file many times. > > > Key: HDDS-3287 > URL: https://issues.apache.org/jira/browse/HDDS-3287 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major >
[jira] [Updated] (HDDS-3287) OmKeyInfo become bigger and bigger when upload the same file many times.
[ https://issues.apache.org/jira/browse/HDDS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] runzhiwang updated HDDS-3287: - Issue Type: Improvement (was: Bug) > OmKeyInfo become bigger and bigger when upload the same file many times. > > > Key: HDDS-3287 > URL: https://issues.apache.org/jira/browse/HDDS-3287 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: runzhiwang >Assignee: runzhiwang >Priority: Major >
[jira] [Created] (HDDS-3287) OmKeyInfo become bigger and bigger when upload the same file many times.
runzhiwang created HDDS-3287: Summary: OmKeyInfo become bigger and bigger when upload the same file many times. Key: HDDS-3287 URL: https://issues.apache.org/jira/browse/HDDS-3287 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: runzhiwang Assignee: runzhiwang
[jira] [Resolved] (HDDS-3179) Pipeline placement based on Topology does not have fall back protection
[ https://issues.apache.org/jira/browse/HDDS-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen O'Donnell resolved HDDS-3179. - Resolution: Fixed > Pipeline placement based on Topology does not have fall back protection > --- > > Key: HDDS-3179 > URL: https://issues.apache.org/jira/browse/HDDS-3179 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.4.1 >Reporter: Li Cheng >Assignee: Li Cheng >Priority: Major > Labels: pull-request-available > Fix For: 0.6.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When rack awareness and topology are enabled, pipeline placement can fail when > there is only one node on the rack. > > Should add fallback logic to search for nodes from other racks.
[GitHub] [hadoop-ozone] sodonnel merged pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel merged pull request #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678
[GitHub] [hadoop-ozone] sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback
sodonnel commented on issue #678: HDDS-3179 Pipeline placement based on Topology does not have fallback URL: https://github.com/apache/hadoop-ozone/pull/678#issuecomment-604854734 All tests are green except this one, which I have seen fail in several other PRs, so it is flaky: ``` [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.235 s - in org.apache.hadoop.ozone.freon.TestDataValidateWithUnsafeByteOperations [INFO] Running org.apache.hadoop.ozone.freon.TestRandomKeyGenerator [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 338.009 s <<< FAILURE! - in org.apache.hadoop.ozone.freon.TestRandomKeyGenerator [ERROR] bigFileThan2GB(org.apache.hadoop.ozone.freon.TestRandomKeyGenerator) Time elapsed: 276.915 s <<< FAILURE! java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) ```