[jira] [Updated] (HDDS-4405) Proxy failover is logging with out trying all OMS

2020-10-28 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4405:
--
Reporter: Uma Maheswara Rao G  (was: umamaheswararao)

> Proxy failover is logging with out trying all OMS
> -
>
> Key: HDDS-4405
> URL: https://issues.apache.org/jira/browse/HDDS-4405
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Uma Maheswara Rao G
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> [root@uma-1 ~]# sudo -u hdfs hdfs dfs -ls o3fs://bucket.volume.ozone1/
> 20/10/28 23:37:50 INFO retry.RetryInvocationHandler: com.google.protobuf.ServiceException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException): OM:om2 is not the leader. Suggested leader is OM:om3.
>  at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:198)
>  at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitReadRequestToOM(OzoneManagerProtocolServerSideTranslatorPB.java:186)
>  at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:123)
>  at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:73)
>  at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:113)
>  at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
> , while invoking $Proxy10.submitRequest over {om1=nodeId=om1,nodeAddress=uma-1.uma.root.hwx.site:9862, om3=nodeId=om3,nodeAddress=uma-3.uma.root.hwx.site:9862, om2=nodeId=om2,nodeAddress=uma-2.uma.root.hwx.site:9862} after 1 failover attempts. Trying to failover immediately.
> {code}
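
The log above is printed at INFO after a single failover attempt even though three OMs are configured. A minimal sketch of one possible policy follows, assuming the client knows how many OM nodes are configured and counts how many of them have already failed for the current call: stay quiet (for example at DEBUG) until every OM has been tried once, and only then log at INFO. The names `FailoverLogPolicy` and `shouldLogAtInfo` are hypothetical and exist only for illustration; this is not the Hadoop or Ozone retry code, and not necessarily how the issue was fixed.

```java
/** Hypothetical helper: decides when a failover attempt deserves an INFO log line. */
final class FailoverLogPolicy {
  private final int omNodeCount;

  FailoverLogPolicy(int omNodeCount) {
    if (omNodeCount <= 0) {
      throw new IllegalArgumentException("omNodeCount must be positive");
    }
    this.omNodeCount = omNodeCount;
  }

  /**
   * @param failedAttempts number of OMs that have already failed for this call
   *                       (an assumption about what the caller tracks)
   * @return true once every configured OM has been tried at least once
   */
  boolean shouldLogAtInfo(int failedAttempts) {
    return failedAttempts >= omNodeCount;
  }

  public static void main(String[] args) {
    // Three OMs, as in the report (om1, om2, om3).
    FailoverLogPolicy policy = new FailoverLogPolicy(3);
    for (int attempt = 1; attempt <= 4; attempt++) {
      String level = policy.shouldLogAtInfo(attempt) ? "INFO" : "DEBUG";
      System.out.println(level + ": failover attempt " + attempt);
    }
  }
}
```

With this kind of gate, a "not the leader" answer from the first OM would not surface at INFO while untried OMs remain.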



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-3816) Erasure Coding in Apache Hadoop Ozone

2020-10-13 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDDS-3816:
-

Assignee: Uma Maheswara Rao G

> Erasure Coding in Apache Hadoop Ozone
> -
>
> Key: HDDS-3816
> URL: https://issues.apache.org/jira/browse/HDDS-3816
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: SCM
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Attachments: Erasure Coding in Apache Hadoop Ozone.pdf
>
>
> We propose to implement Erasure Coding (EC) in Apache Hadoop Ozone to provide 
> efficient storage. With EC in place, Ozone can provide the same or better 
> fault tolerance while saving 50% or more storage space. 
> The HDFS project already has native codecs (ISA-L) and Java codecs 
> implemented, so we can leverage the same or a similar codec design.
> However, the critical part, the EC data layout design, is still in progress; 
> we will post the design doc soon.
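
The "50% or more" figure can be checked with simple arithmetic. The sketch below assumes a Reed-Solomon layout with 6 data and 3 parity units, compared against 3-way replication; the RS(6,3) and RS(10,4) schemes are assumptions borrowed from common HDFS EC policies, not something the proposal above specifies. Three-way replication survives the loss of 2 copies, while RS(6,3) survives the loss of any 3 units, which is where "same or better tolerance" comes from. `EcStorageMath` is a hypothetical name.

```java
/** Hypothetical helper for comparing raw storage footprints. */
final class EcStorageMath {
  private EcStorageMath() { }

  /** Raw bytes stored per logical byte under n-way replication. */
  static double replicationFootprint(int replicas) {
    return replicas;
  }

  /** Raw bytes stored per logical byte under RS(dataUnits, parityUnits). */
  static double ecFootprint(int dataUnits, int parityUnits) {
    return (double) (dataUnits + parityUnits) / dataUnits;
  }

  /** Fraction of raw storage saved by EC relative to replication. */
  static double savingsVsReplication(int dataUnits, int parityUnits, int replicas) {
    return 1.0 - ecFootprint(dataUnits, parityUnits) / replicationFootprint(replicas);
  }

  public static void main(String[] args) {
    // RS(6,3) stores 1.5 bytes per logical byte; 3x replication stores 3.
    System.out.println("RS(6,3) vs 3x:  " + savingsVsReplication(6, 3, 3));
    // RS(10,4) stores 1.4 bytes per logical byte.
    System.out.println("RS(10,4) vs 3x: " + savingsVsReplication(10, 4, 3));
  }
}
```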






[jira] [Updated] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-08 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4325:
--
Description: 
It seems that the return codes of ozone getconf -confKey are different in 1.0 
and after 1.0.

Looking at the code:

In the old code:

{code:java}
/** Method to be overridden by sub classes for specific behavior. */
int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
  String value = tool.getConf().getTrimmed(key);
  if (value != null) {
    tool.printOut(value);
    return 0;
  }
  tool.printError("Configuration " + key + " is missing.");
  return -1;
}
{code}

With the 1.0 code:
  @Override
  public Void call() throws Exception {
    String value = tool.getConf().getTrimmed(confKey);
    if (value != null) {
      tool.printOut(value);
    } else {
      tool.printError("Configuration " + confKey + " is missing.");
    }
    return null;
  }

We return null in both cases.
Some applications/tests depend on these return codes.

Thanks [~nmaheshwari] for helping to debug and find the issue.
 

  was:
It seems that the return codes of ozone getconf -confKey are different 
prior to 1.0 and after 1.0.

Looking at the code:

In the old code:

{code:java}
/** Method to be overridden by sub classes for specific behavior. */
int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
  String value = tool.getConf().getTrimmed(key);
  if (value != null) {
    tool.printOut(value);
    return 0;
  }
  tool.printError("Configuration " + key + " is missing.");
  return -1;
}
{code}

With the 1.0 code:
  @Override
  public Void call() throws Exception {
    String value = tool.getConf().getTrimmed(confKey);
    if (value != null) {
      tool.printOut(value);
    } else {
      tool.printError("Configuration " + confKey + " is missing.");
    }
    return null;
  }

We return null in both cases.
Some applications/tests depend on these return codes.

Thanks [~nmaheshwari] for helping to debug and find the issue.
 


> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.1.0
>Reporter: Namit Maheshwari
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> It seems that the return codes of ozone getconf -confKey are different in 1.0 
> and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
> Thanks [~nmaheshwari] for helping to debug and find the issue.
>  
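
The difference described above comes down to whether the command reports success or failure through its return value. A minimal, self-contained sketch of a command that restores the 0 / -1 distinction is shown below. It uses a plain `Map` in place of the Ozone configuration object and `java.util.concurrent.Callable<Integer>` in place of whatever CLI framework Ozone uses; `ConfKeyCommand` and its constructor are hypothetical, and this is not the patch that resolved the issue.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical stand-in for the getconf -confKey handler. */
final class ConfKeyCommand implements Callable<Integer> {
  private final Map<String, String> conf;  // stand-in for the real configuration
  private final String confKey;

  ConfKeyCommand(Map<String, String> conf, String confKey) {
    this.conf = conf;
    this.confKey = confKey;
  }

  /** Returns 0 when the key is present and -1 when it is missing. */
  @Override
  public Integer call() {
    String value = conf.get(confKey);
    if (value != null) {
      System.out.println(value.trim());
      return 0;
    }
    System.err.println("Configuration " + confKey + " is missing.");
    return -1;
  }

  public static void main(String[] args) {
    Map<String, String> conf =
        Collections.singletonMap("ozone.om.service.ids", "ozone1");
    int rc = new ConfKeyCommand(conf, "ozone.om.service.ids").call();
    // The caller, not the command, decides how to turn the code into a process exit status.
    System.exit(rc);
  }
}
```

Returning an `Integer` rather than `Void` lets the caller map the result to a process exit status, which is what scripts and tests typically check.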






[jira] [Updated] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-08 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4325:
--
Affects Version/s: (was: 1.0.0)
   1.1.0

> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.1.0
>Reporter: Namit Maheshwari
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> It seems that the return codes of ozone getconf -confKey are different 
> prior to 1.0 and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
> Thanks [~nmaheshwari] for helping to debug and find the issue.
>  






[jira] [Updated] (HDDS-4310) Ozone getconf broke the compatibility

2020-10-08 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4310:
--
Affects Version/s: (was: 1.0.0)
   1.1.0

> Ozone getconf broke the compatibility
> -
>
> Key: HDDS-4310
> URL: https://issues.apache.org/jira/browse/HDDS-4310
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Fix For: 1.1.0
>
>
> Currently, ozone getconf '-confKey' does not work because 'HDDS-3102' removed the 
> need to prefix options with '-'.
> {code:java}
> RUNNING: ozone getconf -confKey ozone.om.service.ids
> 2020-10-05 19:10:09,110|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Unknown options: '-confKey', 'ozone.om.service.ids'
> 2020-10-05 19:10:09,111|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Possible solutions: -conf
> {code}
> Some users built automation around these commands, and this change 
> broke it.
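
One way to keep older scripts working is to accept the dash-prefixed spelling as an alias for the new one. The sketch below assumes that the dash-less form `confKey` is what the new parser accepts, which is inferred from the description above rather than confirmed, and rewrites only that one legacy token before parsing. `LegacyGetConfArgs` and `normalize` are hypothetical names; the actual fix may have been done differently, for example through aliases in the CLI framework.

```java
import java.util.Arrays;

/** Hypothetical pre-parser that maps the legacy '-confKey' spelling to 'confKey'. */
final class LegacyGetConfArgs {
  private static final String LEGACY = "-confKey";
  private static final String CURRENT = "confKey"; // assumed new spelling

  private LegacyGetConfArgs() { }

  /** Returns a copy of args with a leading '-confKey' rewritten to 'confKey'. */
  static String[] normalize(String[] args) {
    String[] out = Arrays.copyOf(args, args.length);
    if (out.length > 0 && LEGACY.equals(out[0])) {
      out[0] = CURRENT;
    }
    return out;
  }

  public static void main(String[] args) {
    String[] legacy = {"-confKey", "ozone.om.service.ids"};
    System.out.println(Arrays.toString(normalize(legacy)));
  }
}
```

Only the first argument is rewritten, so a configuration value that happens to equal `-confKey` later in the command line is left untouched.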






[jira] [Updated] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-08 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4325:
--
Fix Version/s: 1.1.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~adoroszlai] for the contribution. 

> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.0.0
>Reporter: Namit Maheshwari
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> It seems that the return codes of ozone getconf -confKey are different 
> prior to 1.0 and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
> Thanks [~nmaheshwari] for helping to debug and find the issue.
>  






[jira] [Commented] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-08 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210326#comment-17210326
 ] 

Uma Maheswara Rao G commented on HDDS-4325:
---

Changing the reporter to [~nmaheshwari] thank you!

> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.0.0
>Reporter: Namit Maheshwari
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> It seems that the return codes of ozone getconf -confKey are different 
> prior to 1.0 and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
> Thanks [~nmaheshwari] for helping to debug and find the issue.
>  






[jira] [Updated] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-08 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4325:
--
Description: 
It seems that the return codes of ozone getconf -confKey are different 
prior to 1.0 and after 1.0.

Looking at the code:

In the old code:

{code:java}
/** Method to be overridden by sub classes for specific behavior. */
int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
  String value = tool.getConf().getTrimmed(key);
  if (value != null) {
    tool.printOut(value);
    return 0;
  }
  tool.printError("Configuration " + key + " is missing.");
  return -1;
}
{code}

With the 1.0 code:
  @Override
  public Void call() throws Exception {
    String value = tool.getConf().getTrimmed(confKey);
    if (value != null) {
      tool.printOut(value);
    } else {
      tool.printError("Configuration " + confKey + " is missing.");
    }
    return null;
  }

We return null in both cases.
Some applications/tests depend on these return codes.

Thanks [~nmaheshwari] for helping to debug and find the issue.
 

  was:
It seems that the return codes of ozone getconf -confKey are different 
prior to 1.0 and after 1.0.

Looking at the code:

In the old code:

{code:java}
/** Method to be overridden by sub classes for specific behavior. */
int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
  String value = tool.getConf().getTrimmed(key);
  if (value != null) {
    tool.printOut(value);
    return 0;
  }
  tool.printError("Configuration " + key + " is missing.");
  return -1;
}
{code}

With the 1.0 code:
  @Override
  public Void call() throws Exception {
    String value = tool.getConf().getTrimmed(confKey);
    if (value != null) {
      tool.printOut(value);
    } else {
      tool.printError("Configuration " + confKey + " is missing.");
    }
    return null;
  }

We return null in both cases.
Some applications/tests depend on these return codes.

 


> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.0.0
>Reporter: Namit Maheshwari
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> It seems that the return codes of ozone getconf -confKey are different 
> prior to 1.0 and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
> Thanks [~nmaheshwari] for helping to debug and find the issue.
>  






[jira] [Updated] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-08 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4325:
--
Reporter: Namit Maheshwari  (was: Uma Maheswara Rao G)

> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.0.0
>Reporter: Namit Maheshwari
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> It seems that the return codes of ozone getconf -confKey are different 
> prior to 1.0 and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
>  






[jira] [Updated] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-07 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4325:
--
Affects Version/s: 1.0.0

> Incompatible return codes from Ozone getconf -confKey
> -
>
> Key: HDDS-4325
> URL: https://issues.apache.org/jira/browse/HDDS-4325
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.0.0
>Reporter: Uma Maheswara Rao G
>Priority: Major
>
> It seems that the return codes of ozone getconf -confKey are different 
> prior to 1.0 and after 1.0.
> Looking at the code:
> In the old code:
> {code:java}
> /** Method to be overridden by sub classes for specific behavior. */
> int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
>   String value = tool.getConf().getTrimmed(key);
>   if (value != null) {
>     tool.printOut(value);
>     return 0;
>   }
>   tool.printError("Configuration " + key + " is missing.");
>   return -1;
> }
> {code}
> With the 1.0 code:
>   @Override
>   public Void call() throws Exception {
>     String value = tool.getConf().getTrimmed(confKey);
>     if (value != null) {
>       tool.printOut(value);
>     } else {
>       tool.printError("Configuration " + confKey + " is missing.");
>     }
>     return null;
>   }
> We return null in both cases.
> Some applications/tests depend on these return codes.
>  






[jira] [Created] (HDDS-4325) Incompatible return codes from Ozone getconf -confKey

2020-10-07 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-4325:
-

 Summary: Incompatible return codes from Ozone getconf -confKey
 Key: HDDS-4325
 URL: https://issues.apache.org/jira/browse/HDDS-4325
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone CLI
Reporter: Uma Maheswara Rao G


It seems that the return codes of ozone getconf -confKey are different 
prior to 1.0 and after 1.0.

Looking at the code:

In the old code:

{code:java}
/** Method to be overridden by sub classes for specific behavior. */
int doWorkInternal(OzoneGetConf tool, String[] args) throws Exception {
  String value = tool.getConf().getTrimmed(key);
  if (value != null) {
    tool.printOut(value);
    return 0;
  }
  tool.printError("Configuration " + key + " is missing.");
  return -1;
}
{code}

With the 1.0 code:
  @Override
  public Void call() throws Exception {
    String value = tool.getConf().getTrimmed(confKey);
    if (value != null) {
      tool.printOut(value);
    } else {
      tool.printError("Configuration " + confKey + " is missing.");
    }
    return null;
  }

We return null in both cases.
Some applications/tests depend on these return codes.

 






[jira] [Updated] (HDDS-4310) Ozone getconf broke the compatibility

2020-10-05 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-4310:
--
Description: 
Currently, ozone getconf '-confKey' does not work because 'HDDS-3102' removed the 
need to prefix options with '-'.


{code:java}
RUNNING: /opt/cloudera/parcels/CDH/bin/ozone getconf -confKey ozone.om.service.ids
2020-10-05 19:10:09,110|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Unknown options: '-confKey', 'ozone.om.service.ids'
2020-10-05 19:10:09,111|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Possible solutions: -conf
{code}

Some users built automation around these commands, and this change 
broke it.

  was:
Currently, ozone getconf -confKey does not work because HDDS-3102 removed the need to 
prefix options with '-'.


{code:java}
RUNNING: /opt/cloudera/parcels/CDH/bin/ozone getconf -confKey ozone.om.service.ids
2020-10-05 19:10:09,110|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Unknown options: '-confKey', 'ozone.om.service.ids'
2020-10-05 19:10:09,111|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Possible solutions: -conf
{code}

Some users built automation around these commands, and this change 
broke it.


> Ozone getconf broke the compatibility
> -
>
> Key: HDDS-4310
> URL: https://issues.apache.org/jira/browse/HDDS-4310
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone CLI
>Affects Versions: 1.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>
> Currently, ozone getconf '-confKey' does not work because 'HDDS-3102' removed the 
> need to prefix options with '-'.
> {code:java}
> RUNNING: /opt/cloudera/parcels/CDH/bin/ozone getconf -confKey ozone.om.service.ids
> 2020-10-05 19:10:09,110|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Unknown options: '-confKey', 'ozone.om.service.ids'
> 2020-10-05 19:10:09,111|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Possible solutions: -conf
> {code}
> Some users built automation around these commands, and this change 
> broke it.






[jira] [Created] (HDDS-4310) Ozone getconf broke the compatibility

2020-10-05 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-4310:
-

 Summary: Ozone getconf broke the compatibility
 Key: HDDS-4310
 URL: https://issues.apache.org/jira/browse/HDDS-4310
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone CLI
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Currently, ozone getconf -confKey does not work because HDDS-3102 removed the need to 
prefix options with '-'.


{code:java}
RUNNING: /opt/cloudera/parcels/CDH/bin/ozone getconf -confKey ozone.om.service.ids
2020-10-05 19:10:09,110|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Unknown options: '-confKey', 'ozone.om.service.ids'
2020-10-05 19:10:09,111|INFO|MainThread|machine.py:180 - run()||GUID=8644ce5b-cfe9-4e6b-9b3f-55c29c950489|Possible solutions: -conf
{code}

Some users built automation around these commands, and this change 
broke it.






[jira] [Created] (HDDS-4302) Shade the org.apache.common.lang3 package as this is coming from other hadoop packages as well.

2020-10-01 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-4302:
-

 Summary: Shade the org.apache.common.lang3 package as this is 
coming from other hadoop packages as well.
 Key: HDDS-4302
 URL: https://issues.apache.org/jira/browse/HDDS-4302
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


In one of our duplicate-class tests, we noticed duplicate classes caused by 
commons-lang3. To avoid class collisions, we should shade the 
commons-lang3 package as well.

java.lang.Exception: Duplicate class 
'org.apache.commons.lang3.arch.Processor$Arch.class' detected in 
'/Users/umagangumalla/Work/repos/Gerrit/xxx/xxx/target/xxx-client-x2.3-dependencies/hadoop-ozone-filesystem-hadoop3-.jar',
 class is already present in 
'/Users/umagangumalla/Work/repos/Gerrit/xxx/xxx/target/xxx-client-xxx-dependencies/commons-lang3-3.9.jar'






[jira] [Resolved] (HDDS-4287) Exclude protobuff classes from ozone-filesystem-hadoop3 jars

2020-09-29 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G resolved HDDS-4287.
---
Fix Version/s: 1.1.0
   Resolution: Fixed

Thanks [~bharat] for the review!

> Exclude protobuff classes from ozone-filesystem-hadoop3 jars
> 
>
> Key: HDDS-4287
> URL: https://issues.apache.org/jira/browse/HDDS-4287
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> Currently the Ozone-filesystem-hadoop3 jar includes protobuf classes. We 
> already require the Hadoop jars as a prerequisite, and Hadoop brings the 
> protobuf classes along with its jars. So, shipping the protobuf classes 
> again in the Ozone-filesystem-hadoop3 jar would be just duplication, and 
> we can exclude them.






[jira] [Created] (HDDS-4287) Exclude protobuff classes from ozone-filesystem-hadoop3 jars

2020-09-28 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-4287:
-

 Summary: Exclude protobuff classes from ozone-filesystem-hadoop3 
jars
 Key: HDDS-4287
 URL: https://issues.apache.org/jira/browse/HDDS-4287
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Currently the Ozone-filesystem-hadoop3 jar includes protobuf classes. We 
already require the Hadoop jars as a prerequisite, and Hadoop brings the 
protobuf classes along with its jars. So, shipping the protobuf classes 
again in the Ozone-filesystem-hadoop3 jar would be just duplication, and 
we can exclude them.






[jira] [Commented] (HDDS-3816) Erasure Coding in Apache Hadoop Ozone

2020-06-26 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146621#comment-17146621
 ] 

Uma Maheswara Rao G commented on HDDS-3816:
---

Hi [~linyiqun],

Thank you so much for your review and questions. Yes, this document is still at 
an early stage. We wanted to keep it open to all community members from the 
start of the design, so that all of us can stay in sync on what's going on with EC.

{quote}But it doesn't mentioned which is the final choice? Or that means we 
want to implement both of them and user can chose which way they prefer?{quote}
No decision has been made yet. The goal of this initial doc is to list the 
options we have, with the pros and cons of each.

{quote}For the Container level option, it will be more easier to implement than 
Block Level option. But as design doc also mentioned that, it has more impact 
of this option, for example, delete operation impact(additionally need to write 
remaining active data to convert 3 replica mode), data recovery cost and high 
risk of data loss when some node crashed. From my personal opinion, Block level 
option is a more complete and robust implementation. How do we think of 
this?{quote}
Yes, you are right. Deletes are the biggest concern in the Container-Level 
approach. We considered options such as delaying deletes for a certain time 
period and then processing them by converting the data. This does not seem like 
a very clean approach as of now, but it is doable with some constraints, such 
as delayed deletions. If we need to support EC at the whole-cluster level or 
for any data (which may not be COLD), there may be many more delete ops in the 
cluster; delaying data deletes can then be confusing, and converting data on 
deletes would also be costly.

{quote}For the read/write performance comparison, the Block level EC will have 
a better performance. The block is split into multiple nodes as a striped 
storage. We can parallel read/write the data based on this change. In Container 
level, the block data structure in one Container actually unchanged, it still 
keeps continuous way but just has a striped form in Container level. So the 
read/write rate is exactly not changed under Container level EC. We still need 
to find one specific Container node to read/write for specific block 
data.{quote}
Yes. You are right. To support write-time EC, the better option currently is 
Block Level EC. Its biggest advantage is that we don't have the deletes issue 
here.

One concern (it may not be a real concern, but it is one in comparison with 
Container Level) is that we have to implement a brand new pipeline. We may not 
be able to use RATIS-based pipelines here (this needs a little more analysis). 

{quote}What's the implementation complexity of this two options way? Like can 
we perfectly integrated current HDFS EC algorithm implementation into Ozone? In 
order to support EC, if there will be a large code refactor in current 
read/write implementation?{quote}

At a high level, we may be able to re-use the ideas, but not necessarily the 
exact code. NN-side concepts like block groups will definitely become 
container groups, which may need to be handled by SCM. We still need further 
code-level analysis to see how much code can be re-used if we lean towards 
this option. Ozone uses different protocols from HDFS for client and DN 
communication. We can clearly re-use the CODEC implementations irrespective of 
the option chosen.

 

{quote}I see current EC design depends on the abstraction of storage-class 
implementation. I'm not sure if this is an easy thing to do at the beginning of 
Ozone EC implementation. Storage-class implementation is also a large feature I 
think, we define data storage type, policy and multiple rules to let system do 
the data transform automatically and transparently. This is similar to HDFS 
SSM(smart storage management) feature design in HDFS-7343. I'm not means to 
disagree of storage-class, but have a concern if we let this as one thing we 
must to implement first.{quote}

I mostly agree with you. Our only thought was to lay some foundations here, 
considering the different storage strategy options for users. EC would be one 
of the actions/transitions in Storage-class, and Storage-class itself can cover 
a much broader range of actions/transitions.

I think the core part of this EC implementation (one of the 
transitions/actions) will surely proceed independently of Storage-Class. 
Storage-class may come into the picture when configuring whether and how EC is 
enabled. These decisions still need to be finalized based on the concrete 
options exposed by HDDS-3755. Let us continue the discussion of questions 
specific to Storage-class in HDDS-3755.

To summarize:
Currently we have two options:

A) 
Supporting any kind of DATA (HOT/COLD), even from writing time EC: 
The better option is Stri

[jira] [Commented] (HDDS-3816) Erasure Coding in Apache Hadoop Ozone

2020-06-23 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142711#comment-17142711
 ] 

Uma Maheswara Rao G commented on HDDS-3816:
---

[~maobaolong] Thanks a lot for the interest. Please take a look at the design 
[doc|https://issues.apache.org/jira/secure/attachment/13006245/Erasure%20Coding%20in%20Apache%20Hadoop%20Ozone.pdf]
 above. 

> Erasure Coding in Apache Hadoop Ozone
> -
>
> Key: HDDS-3816
> URL: https://issues.apache.org/jira/browse/HDDS-3816
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: SCM
>Reporter: Uma Maheswara Rao G
>Priority: Major
> Attachments: Erasure Coding in Apache Hadoop Ozone.pdf
>
>
> We propose to implement Erasure Coding in Apache Hadoop Ozone to provide 
> efficient storage. With EC in place, Ozone can provide the same or better 
> fault tolerance while saving 50% or more storage space. 
> In the HDFS project, we already have native codecs (ISA-L) and Java codecs 
> implemented; we can leverage the same or a similar codec design.
> However, the critical part, the EC data layout design, is still in progress; 
> we will post the design doc soon.






[jira] [Created] (HDDS-3816) Erasure Coding in Apache Hadoop Ozone

2020-06-16 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-3816:
-

 Summary: Erasure Coding in Apache Hadoop Ozone
 Key: HDDS-3816
 URL: https://issues.apache.org/jira/browse/HDDS-3816
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
  Components: SCM
Reporter: Uma Maheswara Rao G


We propose to implement Erasure Coding in Apache Hadoop Ozone to provide 
efficient storage. With EC in place, Ozone can provide the same or better fault 
tolerance while saving 50% or more storage space. 
In the HDFS project, we already have native codecs (ISA-L) and Java codecs 
implemented; we can leverage the same or a similar codec design.

However, the critical part, the EC data layout design, is still in progress; we 
will post the design doc soon.
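
The "50% or more" figure can be checked with a little arithmetic. The sketch 
below is only an illustration: it assumes 3-way replication as the baseline and 
a Reed-Solomon RS(6,3) layout (6 data units, 3 parity units), neither of which 
is fixed by this Jira. The class and method names are made up for the example.

```java
/** Storage-overhead arithmetic for replication versus erasure coding. */
final class EcStorageMath {

    /** Raw bytes stored per logical byte under n-way replication. */
    static double replicationFactor(int replicas) {
        return replicas;
    }

    /** Raw bytes stored per logical byte under RS(dataUnits, parityUnits). */
    static double ecFactor(int dataUnits, int parityUnits) {
        return (double) (dataUnits + parityUnits) / dataUnits;
    }

    /** Fraction of raw storage saved by moving from replication to EC. */
    static double savings(int replicas, int dataUnits, int parityUnits) {
        return 1.0 - ecFactor(dataUnits, parityUnits) / replicationFactor(replicas);
    }

    public static void main(String[] args) {
        // Assumed layouts: 3 replicas survive the loss of 2 copies,
        // RS(6,3) survives the loss of any 3 units.
        System.out.println("RS(6,3) raw bytes per logical byte: " + ecFactor(6, 3));
        System.out.println("Fraction saved versus 3 replicas: " + savings(3, 6, 3));
    }
}
```

Under these assumptions EC stores 1.5 raw bytes per logical byte instead of 3, 
which is where a saving of one half comes from, while tolerating one more lost 
unit than 3-way replication.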






[jira] [Updated] (HDDS-3803) [OFS] Add User Guide

2020-06-15 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-3803:
--
Description: 
Need to add a user guide markdown for OFS. Especially the usage for {{/tmp}}.

Thanks [~umamaheswararao] and [~xyao] for the reminder.

  was:
Need to add a user guide markdown for OFS. Especially the usage for {{/tmp}}.

Thanks [~c7...@ercgroup.com] and [~xyao] for the reminder.


> [OFS] Add User Guide
> 
>
> Key: HDDS-3803
> URL: https://issues.apache.org/jira/browse/HDDS-3803
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> Need to add a user guide markdown for OFS. Especially the usage for {{/tmp}}.
> Thanks [~umamaheswararao] and [~xyao] for the reminder.






[jira] [Created] (HDDS-3632) HddsDatanodeService cannot be started if HDFS datanode running in same machine with same user.

2020-05-20 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-3632:
-

 Summary: HddsDatanodeService cannot be started if HDFS datanode 
running in same machine with same user.
 Key: HDDS-3632
 URL: https://issues.apache.org/jira/browse/HDDS-3632
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode
Affects Versions: 0.5.0
Reporter: Uma Maheswara Rao G


Since the service names are the same and both services refer to the same 
location for pid files, we cannot start both services at once.

The workaround is to export HADOOP_PID_DIR to a different location after 
starting one service, and then start the other one.

It would be better to have different pid file names.

 

 
{noformat}
Umas-MacBook-Pro ozone-0.5.0-beta % bin/ozone --daemon start datanode
datanode is running as process 25167.  Stop it first.
{noformat}
 






[jira] [Commented] (HDDS-3604) Support for Hadoop-3.3

2020-05-17 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109586#comment-17109586
 ] 

Uma Maheswara Rao G commented on HDDS-3604:
---

CC [~bharat] 

Bharat recently verified this and hit class cast issues caused by the shaded 
packages of the Message class.

> Support for Hadoop-3.3
> --
>
> Key: HDDS-3604
> URL: https://issues.apache.org/jira/browse/HDDS-3604
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Vinayakumar B
>Priority: Major
>
> Hadoop-3.3 will be released soon, which brings the most important and long 
> awaited Protobuf upgrade to 3.7, by shading the internal protobuf classes in 
> Hadoop-thirdparty library, still keeping the protobuf-2.5.0 as a transitive 
> dependency.
> Unfortunately, there are direct usages of Hadoop's internal protobuf classes. 
> Because of this, Ozone breaks after upgrading the Hadoop dependency to 3.3.0.
> This Jira intends to avoid such direct usages of Hadoop's protobuf classes.






[jira] [Updated] (HDDS-3574) Implement ofs://: Override getTrashRoot

2020-05-12 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-3574:
--
Description: 
[~pifta] found that deleting a file with the Hadoop shell, namely 
{{hadoop fs -rm}}, without the {{-skipTrash}} option fails in OFS: the client 
tries to rename the file to {{/user//.Trash/}}, and renaming across different 
buckets is not allowed in Ozone. (Unless the file happens to be under 
{{/user//}}, apparently.)

We could override {{getTrashRoot()}} in {{BasicOzoneFileSystem}} to a dir under 
the same bucket to mitigate the problem. Thanks [~umamaheswararao] for the 
suggestion.

This raises one more problem though: We need to implement trash clean up on OM. 
Opened HDDS-3575 for this.

CC [~arp] [~bharat]

  was:
[~pifta] found that deleting a file with the Hadoop shell, namely 
{{hadoop fs -rm}}, without the {{-skipTrash}} option fails in OFS: the client 
tries to rename the file to {{/user//.Trash/}}, and renaming across different 
buckets is not allowed in Ozone. (Unless the file happens to be under 
{{/user//}}, apparently.)

We could override {{getTrashRoot()}} in {{BasicOzoneFileSystem}} to a dir under 
the same bucket to mitigate the problem. Thanks [~g.umamah...@gmail.com] for 
the suggestion.

This raises one more problem though: We need to implement trash clean up on OM. 
Opened HDDS-3575 for this.

CC [~arp] [~bharat]


> Implement ofs://: Override getTrashRoot
> ---
>
> Key: HDDS-3574
> URL: https://issues.apache.org/jira/browse/HDDS-3574
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> [~pifta] found that deleting a file with the Hadoop shell, namely 
> {{hadoop fs -rm}}, without the {{-skipTrash}} option fails in OFS: the client 
> tries to rename the file to {{/user//.Trash/}}, and renaming across different 
> buckets is not allowed in Ozone. (Unless the file happens to be under 
> {{/user//}}, apparently.)
> We could override {{getTrashRoot()}} in {{BasicOzoneFileSystem}} to a dir 
> under the same bucket to mitigate the problem. Thanks [~umamaheswararao] for 
> the suggestion.
> This raises one more problem though: We need to implement trash clean up on 
> OM. Opened HDDS-3575 for this.
> CC [~arp] [~bharat]
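
The idea of keeping the trash root inside the same bucket can be sketched 
without any Hadoop classes. This is a rough illustration only: the class name, 
the {{/volume/bucket/...}} path shape and the {{.Trash/user}} layout are 
assumptions made for the example, not what {{BasicOzoneFileSystem}} actually 
does.

```java
import java.util.Arrays;

/** Hypothetical helper: derive a trash root in the same bucket as a path. */
final class TrashRootSketch {

    /**
     * Returns a trash directory under the volume and bucket of the given path.
     * Assumes absolute paths of the form /volume/bucket/key...
     */
    static String trashRootFor(String path, String user) {
        String[] parts = Arrays.stream(path.split("/"))
            .filter(s -> !s.isEmpty())
            .toArray(String[]::new);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Path has no bucket: " + path);
        }
        // Staying under /volume/bucket means the move to trash never
        // crosses a bucket boundary.
        return "/" + parts[0] + "/" + parts[1] + "/.Trash/" + user;
    }

    public static void main(String[] args) {
        System.out.println(trashRootFor("/vol1/bucket1/dir/file.txt", "alice"));
    }
}
```

Because the returned directory shares the bucket of the file being deleted, 
the rename performed by the trash policy stays within one bucket.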






[jira] [Resolved] (HDDS-3380) MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for test failures

2020-04-21 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G resolved HDDS-3380.
---
Resolution: Fixed

> MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for 
> test failures
> --
>
> Key: HDDS-3380
> URL: https://issues.apache.org/jira/browse/HDDS-3380
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: HA, test
>Affects Versions: 0.5.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While I was debugging some code paths using miniOzoneCluster, I noticed the 
> following. For example, TestOzoneHAManager plans to trigger snapshots at a 
> threshold of 50, and that value is configured and passed to 
> MiniOzoneHACluster. But MiniOzoneHAClusterImpl#initOMRatisConf silently 
> resets it to 100L. So the test expects a snapshot to trigger after 50 
> transactions, but it does not.
>  
> The following check keeps waiting even after the log rolls at 50:
> {quote}GenericTestUtils.waitFor(() -> {
>  if (ozoneManager.getRatisSnapshotIndex() > 0) {
>  return true;
>  }
>  return false;
> }, 1000, 10);
> {quote}
>  
> {quote}2020-04-12 03:54:21,296 
> [omNode-1@group-523986131536-SegmentedRaftLogWorker] INFO 
> segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:execute(583)) - 
> omNode-1@group-523986131536-SegmentedRaftLogWorker: created new log segment 
> /Users/ugangumalla/Work/repos/hadoop-ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-fce544cd-3a80-4b0b-ac92-463cf391975c/omNode-1/ratis/c9bc4cf4-3bc3-3c60-a66b-523986131536/current/log_inprogress_49
> {quote}
>  
> So, respecting the user-passed configurations will fix the issue. I will post 
> the patch in a while.
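
The difference between overwriting a setting and respecting a user-passed value 
can be shown with a plain map standing in for the configuration object. This is 
a minimal sketch, not the actual patch: the key name and class are hypothetical, 
and only the values 50 and 100L come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for cluster config initialisation. */
final class ConfDefaultsSketch {
    // Made-up key name, used only for this illustration.
    static final String SNAPSHOT_THRESHOLD_KEY = "example.om.snapshot.threshold";
    static final long DEFAULT_THRESHOLD = 100L;

    /** Problematic pattern: silently replaces whatever the caller configured. */
    static void initOverwriting(Map<String, Long> conf) {
        conf.put(SNAPSHOT_THRESHOLD_KEY, DEFAULT_THRESHOLD);
    }

    /** Applies the default only when the caller has not set a value. */
    static void initRespectingUser(Map<String, Long> conf) {
        conf.putIfAbsent(SNAPSHOT_THRESHOLD_KEY, DEFAULT_THRESHOLD);
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<>();
        conf.put(SNAPSHOT_THRESHOLD_KEY, 50L);   // value chosen by the test
        initRespectingUser(conf);
        System.out.println("threshold seen by the cluster: "
            + conf.get(SNAPSHOT_THRESHOLD_KEY));
    }
}
```

With the second variant a test that configures 50 keeps 50, while a caller that 
sets nothing still gets the default.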






[jira] [Assigned] (HDDS-3465) OM Failover retry happens too quickly when new leader suggested and retrying on same OM

2020-04-21 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDDS-3465:
-

Assignee: Uma Maheswara Rao G

> OM Failover retry happens too quickly when new leader suggested and retrying 
> on same OM
> ---
>
> Key: HDDS-3465
> URL: https://issues.apache.org/jira/browse/HDDS-3465
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: HA
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Blocker
>
> When the OM throws a not-leader exception with a suggested leader, client-side 
> failover happens too quickly.
> Incremental timeouts do not kick in because we don't update the 
> lastOM/currentOM ids in this flow.
>  
>  






[jira] [Created] (HDDS-3467) OM Client RPC failover retries happening more than configured.

2020-04-21 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-3467:
-

 Summary: OM Client RPC failover retries happening more than 
configured. 
 Key: HDDS-3467
 URL: https://issues.apache.org/jira/browse/HDDS-3467
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Affects Versions: 0.5.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Fix For: 0.6.0


Currently the OM client retries more often than configured.

For example, if we configure MaxFailover as 2, it tries 3 times. The following 
log shows this.

{quote}{{2020-04-21 00:12:13,908 [Thread-0] INFO  retry.RetryInvocationHandler 
(RetryInvocationHandler.java:log(411)) - com.google.protobuf.ServiceException: 
java.io.EOFException: End of File Exception between local host is: 
"21637.local/192.168.0.12"; destination host is: "localhost":12944; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException, while invoking 
$Proxy43.submitRequest over nodeId=omNode-3,nodeAddress=127.0.0.1:12944. Trying 
to failover immediately.
2020-04-21 00:12:13,909 [Thread-0] INFO  retry.RetryInvocationHandler 
(RetryInvocationHandler.java:log(411)) - com.google.protobuf.ServiceException: 
java.io.EOFException: End of File Exception between local host is: 
"21637.local/192.168.0.12"; destination host is: "localhost":12932; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException, while invoking 
$Proxy43.submitRequest over nodeId=omNode-1,nodeAddress=127.0.0.1:12932 after 1 
failover attempts. Trying to failover immediately.
2020-04-21 00:12:13,910 [Thread-0] INFO  retry.RetryInvocationHandler 
(RetryInvocationHandler.java:log(411)) - com.google.protobuf.ServiceException: 
java.io.EOFException: End of File Exception between local host is: 
"21637.local/192.168.0.12"; destination host is: "localhost":12938; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException, while invoking 
$Proxy43.submitRequest over nodeId=omNode-2,nodeAddress=127.0.0.1:12938 after 2 
failover attempts. Trying to failover immediately.}}{quote}
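
One way to picture the expected behaviour is a simple failover budget. The 
sketch below assumes that a MaxFailover value of 2 should allow at most two 
failovers; the class and method names are hypothetical and it does not claim to 
show the actual cause of the extra retry in the OM client.

```java
/** Hypothetical failover budget check. */
final class FailoverBudgetSketch {

    /** Returns true if one more failover is allowed after failoversSoFar. */
    static boolean mayFailover(int failoversSoFar, int maxFailovers) {
        // A strict comparison keeps the total at maxFailovers; using <= here
        // would permit one failover more than configured.
        return failoversSoFar < maxFailovers;
    }

    /** Counts the failovers performed when every attempt fails. */
    static int failoversUntilGiveUp(int maxFailovers) {
        int failovers = 0;
        while (mayFailover(failovers, maxFailovers)) {
            failovers++;
        }
        return failovers;
    }

    public static void main(String[] args) {
        System.out.println("failovers with a budget of 2: " + failoversUntilGiveUp(2));
    }
}
```

A boundary slip of this kind (a non-strict comparison, or counting from the 
wrong starting value) is a common source of "one more than configured" 
behaviour, which is why the boundary is spelled out in the comment.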






[jira] [Updated] (HDDS-3465) OM Failover retry happens too quickly when new leader suggested and retrying on same OM

2020-04-20 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-3465:
--
Issue Type: Bug  (was: New Feature)

> OM Failover retry happens too quickly when new leader suggested and retrying 
> on same OM
> ---
>
> Key: HDDS-3465
> URL: https://issues.apache.org/jira/browse/HDDS-3465
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: HA
>Reporter: Uma Maheswara Rao G
>Priority: Major
>
> When the OM throws a not-leader exception with a suggested leader, client-side 
> failover happens too quickly.
> Incremental timeouts do not kick in because we don't update the 
> lastOM/currentOM ids in this flow.
>  
>  






[jira] [Comment Edited] (HDDS-3465) OM Failover retry happens too quickly when new leader suggested and retrying on same OM

2020-04-20 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088157#comment-17088157
 ] 

Uma Maheswara Rao G edited comment on HDDS-3465 at 4/20/20, 11:24 PM:
--

Please find the logs:

{{2020-04-16 06:49:53,779 [IPC Server handler 1 on 11726] INFO  ipc.Server (Server.java:logException(2726)) - IPC Server handler 1 on 11726, call Call#451 Retry#1 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:56424
2020-04-16 06:49:53,779 [IPC Server handler 1 on 11726] INFO  ipc.Server (Server.java:logException(2726)) - IPC Server handler 1 on 11726, call Call#451 Retry#1 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:56424
*org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException: OM:omNode-2 is not the leader. Suggested leader is OM:omNode-3.*
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:185)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:127)
 at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:75)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:99)
 at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
2020-04-16 06:49:53,779 [grpc-default-executor-0] WARN  server.GrpcLogAppender (LogUtils.java:warn(122)) - omNode-3@group-523986131536->omNode-1-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
2020-04-16 06:49:53,781 [grpc-default-executor-0] INFO  impl.FollowerInfo (FollowerInfo.java:lambda$new$0(50)) - omNode-3@group-523986131536->omNode-1: nextIndex: updateUnconditionally 8 -> 1
2020-04-16 06:49:53,787 [Thread-2184] INFO  retry.RetryInvocationHandler (RetryInvocationHandler.java:log(411)) - com.google.protobuf.ServiceException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException): *OM:omNode-2 is not the leader. Suggested leader is OM:omNode-3. *
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:185)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:127)
 at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:75)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:99)
 at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682), while invoking $Proxy43.submitRequest over nodeId=omNode-2,nodeAddress=127.0.0.1:11726 after 1 failover attempts. Trying to failover immediately.
2020-04-16 06:49:53,811 [IPC Server handler 1 on 11732] INFO  ipc.Server (Server.java:logException(2726)) - IPC Server handler 1 on 11732, call Call#451 Retry#2 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:43120
org.apache.hadoop.ozone.om.exceptions.OMLeaderNotReadyException: omNode-3@group-523986131536 is in LEADER state but not ready yet.
 at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.processReply(OzoneManagerRatisServer.java:177)
 at org.apache.h

[jira] [Commented] (HDDS-3465) OM Failover retry happens too quickly when new leader suggested and retrying on same OM

2020-04-20 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088157#comment-17088157
 ] 

Uma Maheswara Rao G commented on HDDS-3465:
---

{quote}{{2020-04-16 06:49:53,779 [IPC Server handler 1 on 11726] INFO  ipc.Server (Server.java:logException(2726)) - IPC Server handler 1 on 11726, call Call#451 Retry#1 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:56424
2020-04-16 06:49:53,779 [IPC Server handler 1 on 11726] INFO  ipc.Server (Server.java:logException(2726)) - IPC Server handler 1 on 11726, call Call#451 Retry#1 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:56424
org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException: OM:omNode-2 is not the leader. Suggested leader is OM:omNode-3.
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:185)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:127)
 at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:75)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:99)
 at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
2020-04-16 06:49:53,779 [grpc-default-executor-0] WARN  server.GrpcLogAppender (LogUtils.java:warn(122)) - omNode-3@group-523986131536->omNode-1-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
2020-04-16 06:49:53,781 [grpc-default-executor-0] INFO  impl.FollowerInfo (FollowerInfo.java:lambda$new$0(50)) - omNode-3@group-523986131536->omNode-1: nextIndex: updateUnconditionally 8 -> 1
2020-04-16 06:49:53,787 [Thread-2184] INFO  retry.RetryInvocationHandler (RetryInvocationHandler.java:log(411)) - com.google.protobuf.ServiceException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException): OM:omNode-2 is not the leader. Suggested leader is OM:omNode-3.
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.createNotLeaderException(OzoneManagerProtocolServerSideTranslatorPB.java:185)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:127)
 at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:75)
 at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:99)
 at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682), while invoking $Proxy43.submitRequest over nodeId=omNode-2,nodeAddress=127.0.0.1:11726 after 1 failover attempts. Trying to failover immediately.
2020-04-16 06:49:53,811 [IPC Server handler 1 on 11732] INFO  ipc.Server (Server.java:logException(2726)) - IPC Server handler 1 on 11732, call Call#451 Retry#2 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 127.0.0.1:43120
org.apache.hadoop.ozone.om.exceptions.OMLeaderNotReadyException: omNode-3@group-523986131536 is in LEADER state but not ready yet.
 at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.processReply(OzoneManagerRatisServer.java:177)
 at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.submitRequest(OzoneManagerR

[jira] [Created] (HDDS-3465) OM Failover retry happens too quickly when new leader suggested and retrying on same OM

2020-04-20 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-3465:
-

 Summary: OM Failover retry happens too quickly when new leader 
suggested and retrying on same OM
 Key: HDDS-3465
 URL: https://issues.apache.org/jira/browse/HDDS-3465
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
  Components: HA
Reporter: Uma Maheswara Rao G


When the OM throws a not-leader exception with a suggested leader, client-side 
failover happens too quickly.

Incremental timeouts do not kick in because we don't update the 
lastOM/currentOM ids in this flow.
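
The bookkeeping that incremental waits depend on can be sketched as follows. 
This is a simplified illustration under assumptions: the class, field and method 
names are hypothetical, and the linear growth of the wait time is chosen only 
for the example; it is not the Ozone client's failover code.

```java
/** Hypothetical tracker for wait times between failover attempts. */
final class FailoverWaitSketch {
    private final long baseWaitMillis;
    private String lastAttemptedNode;   // node used by the previous attempt
    private int attemptsOnSameNode;     // consecutive repeats on that node

    FailoverWaitSketch(long baseWaitMillis) {
        this.baseWaitMillis = baseWaitMillis;
    }

    /**
     * Records an attempt against nodeId and returns how long to wait before it.
     * Repeated attempts on the same node wait progressively longer.
     */
    long recordAttemptAndGetWait(String nodeId) {
        if (nodeId.equals(lastAttemptedNode)) {
            attemptsOnSameNode++;
        } else {
            attemptsOnSameNode = 0;
            lastAttemptedNode = nodeId;
        }
        return attemptsOnSameNode * baseWaitMillis;
    }

    public static void main(String[] args) {
        FailoverWaitSketch tracker = new FailoverWaitSketch(1000L);
        for (int i = 0; i < 3; i++) {
            System.out.println("wait before retrying omNode-3: "
                + tracker.recordAttemptAndGetWait("omNode-3") + " ms");
        }
    }
}
```

If a code path retries on a node without recording it in this kind of state, 
the comparison never matches and the wait stays at zero, which would look like 
the immediate retries described above.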

 

 






[jira] [Assigned] (HDDS-3380) MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for test failures

2020-04-17 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDDS-3380:
-

Assignee: Uma Maheswara Rao G  (was: Uma Maheshwar Rao Gunuganti)

> MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for 
> test failures
> --
>
> Key: HDDS-3380
> URL: https://issues.apache.org/jira/browse/HDDS-3380
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: HA, test
>Affects Versions: 0.5.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While I was debugging some code paths using miniOzoneCluster, I noticed the 
> following. For example, TestOzoneHAManager plans to trigger snapshots at a 
> threshold of 50, and that value is configured and passed to 
> MiniOzoneHACluster. But MiniOzoneHAClusterImpl#initOMRatisConf silently 
> resets it to 100L. So the test expects a snapshot to trigger after 50 
> transactions, but it does not.
>  
> The following check keeps waiting even after the log rolls at 50:
> {quote}GenericTestUtils.waitFor(() -> {
>  if (ozoneManager.getRatisSnapshotIndex() > 0) {
>  return true;
>  }
>  return false;
> }, 1000, 10);
> {quote}
>  
> {quote}2020-04-12 03:54:21,296 
> [omNode-1@group-523986131536-SegmentedRaftLogWorker] INFO 
> segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:execute(583)) - 
> omNode-1@group-523986131536-SegmentedRaftLogWorker: created new log segment 
> /Users/ugangumalla/Work/repos/hadoop-ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-fce544cd-3a80-4b0b-ac92-463cf391975c/omNode-1/ratis/c9bc4cf4-3bc3-3c60-a66b-523986131536/current/log_inprogress_49
> {quote}
>  
> So respecting the user-passed configuration will fix the issue. I will post 
> the patch later.
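
The difference between overwriting a setting and respecting a user-passed value can be sketched as follows. This is only an illustration under stated assumptions: a plain Map stands in for the real configuration object, and the key name example.snapshot.threshold and both method names are hypothetical, not the actual Ozone identifiers.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch (not Ozone code): defaults vs. user-passed values. */
final class ConfDefaultsSketch {
  // Hypothetical key name, used only for this illustration.
  static final String SNAPSHOT_THRESHOLD_KEY = "example.snapshot.threshold";
  static final long DEFAULT_SNAPSHOT_THRESHOLD = 100L;

  /** Problematic pattern: always overwrites whatever the caller configured. */
  static void initOverwriting(Map<String, String> conf) {
    conf.put(SNAPSHOT_THRESHOLD_KEY, Long.toString(DEFAULT_SNAPSHOT_THRESHOLD));
  }

  /** Alternative pattern: applies the default only when no value was set. */
  static void initRespectingUser(Map<String, String> conf) {
    conf.putIfAbsent(SNAPSHOT_THRESHOLD_KEY,
        Long.toString(DEFAULT_SNAPSHOT_THRESHOLD));
  }

  static long threshold(Map<String, String> conf) {
    return Long.parseLong(conf.get(SNAPSHOT_THRESHOLD_KEY));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(SNAPSHOT_THRESHOLD_KEY, "50");   // value chosen by the test

    initRespectingUser(conf);
    System.out.println("respecting user: " + threshold(conf));

    initOverwriting(conf);
    System.out.println("overwriting:     " + threshold(conf));
  }
}
```

With the first pattern a test that configured 50 would end up waiting for a snapshot that is only due at 100, which is the symptom described in this issue.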






[jira] [Updated] (HDDS-3380) MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for test failures

2020-04-12 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-3380:
--
Description: 
I found this while debugging some code paths using miniOzoneCluster.

For example, TestOzoneHAManager plans to trigger snapshots at a threshold of 
50, and that value is configured and passed to MiniOzoneHACluster. But 
MiniOzoneHAClusterImpl#initOMRatisConf silently resets it to 100L. So the 
test expects a snapshot to trigger after 50 transactions, but none is taken.

The test keeps waiting even after the log rolls at 50:
{quote}GenericTestUtils.waitFor(() -> {
  if (ozoneManager.getRatisSnapshotIndex() > 0) {
    return true;
  }
  return false;
}, 1000, 10);
{quote}
 
{quote}2020-04-12 03:54:21,296 
[omNode-1@group-523986131536-SegmentedRaftLogWorker] INFO 
segmented.SegmentedRaftLogWorker (SegmentedRaftLogWorker.java:execute(583)) - 
omNode-1@group-523986131536-SegmentedRaftLogWorker: created new log segment 
/Users/ugangumalla/Work/repos/hadoop-ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-fce544cd-3a80-4b0b-ac92-463cf391975c/omNode-1/ratis/c9bc4cf4-3bc3-3c60-a66b-523986131536/current/log_inprogress_49
{quote}
 

So respecting the user-passed configuration will fix the issue. I will post 
the patch later.

  was:
For example in TechOzoneHAManager:

it plans to trigger snapshots at threshold 50 and same was configured and 
passed to MiniOzoneHACluster. But inside 
MiniOzoneHAClusterImpl#initOMRatisConf, it will silently reset to 100L. So, 
test will expect snapshot to trigger after 50 transactions, but it will not.

So, respecting user passed configurations will fix the issue. I will post the 
patch later in some time.


> MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for 
> test failures
> --
>
> Key: HDDS-3380
> URL: https://issues.apache.org/jira/browse/HDDS-3380
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: HA, test
>Affects Versions: 0.5.0
>Reporter: Uma Maheswara Rao G
>Priority: Major
> Fix For: 0.6.0
>






[jira] [Updated] (HDDS-3380) MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for test failures

2020-04-12 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDDS-3380:
--
Priority: Minor  (was: Major)

> MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for 
> test failures
> --
>
> Key: HDDS-3380
> URL: https://issues.apache.org/jira/browse/HDDS-3380
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: HA, test
>Affects Versions: 0.5.0
>Reporter: Uma Maheswara Rao G
>Priority: Minor
> Fix For: 0.6.0
>






[jira] [Created] (HDDS-3380) MiniOzoneHAClusterImpl#initOMRatisConf will reset the configs and causes for test failures

2020-04-12 Thread Uma Maheswara Rao G (Jira)
Uma Maheswara Rao G created HDDS-3380:
-

 Summary: MiniOzoneHAClusterImpl#initOMRatisConf will reset the 
configs and causes for test failures
 Key: HDDS-3380
 URL: https://issues.apache.org/jira/browse/HDDS-3380
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: HA, test
Affects Versions: 0.5.0
Reporter: Uma Maheswara Rao G
 Fix For: 0.6.0


For example in TechOzoneHAManager:

it plans to trigger snapshots at threshold 50 and same was configured and 
passed to MiniOzoneHACluster. But inside 
MiniOzoneHAClusterImpl#initOMRatisConf, it will silently reset to 100L. So, 
test will expect snapshot to trigger after 50 transactions, but it will not.

So, respecting user passed configurations will fix the issue. I will post the 
patch later in some time.


