[jira] [Commented] (GEODE-10012) Avoid retries for non-HA function execution

2023-11-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787309#comment-17787309
 ] 

ASF GitHub Bot commented on GEODE-10012:


gaussianrecurrence commented on PR #920:
URL: https://github.com/apache/geode-native/pull/920#issuecomment-1816767907

   I haven't had the time to work on this for a long time or I think I'll have 
the opportunity to in the near future, hence I am closing this PR, if anyone 
wants to tackle this issue can take it up from its latest state.




> Avoid retries for non-HA function execution
> ---
>
> Key: GEODE-10012
> URL: https://issues.apache.org/jira/browse/GEODE-10012
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Reporter: Mario Salazar de Torres
>Assignee: Mario Salazar de Torres
>Priority: Major
>  Labels: pull-request-available
>
> *GIVEN* a cluster with 3 servers and 1 locator
> *AND* a PartitionedRegion with redundant-copies="1"
> *AND* a user-defined Function with isHA=false
> *AND* a geode-native client with a pool with single-hop enabled and retry 
> attempts set to 2
> *WHEN* execution fails due to a connectivity issue/connection 
> timeout/internal cluster error
> *THEN* function is retried twice
> 
> *Additional information.* The function is retried at the network code layer, 
> and given whenever there is a timeout/connectivity error, the BSL is removed 
> from the ClientMetadataService, whenever retries happen, you might get a 
> *InternalFunctionInvocationTargetException* exception thrown indicating: 
> {code:java}
> Multiple target nodes found for single hop operation {code}
> Also, have into account that Java client API behaviour never retries a 
> Function Execution if isHA=false for that function, and whenever isHA=true, 
> the function is retried but on the function execution logic and not at the 
> networking layer.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (GEODE-10012) Avoid retries for non-HA function execution

2023-11-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787308#comment-17787308
 ] 

ASF GitHub Bot commented on GEODE-10012:


gaussianrecurrence closed pull request #920:  GEODE-10012: Avoid non-HA 
function retries
URL: https://github.com/apache/geode-native/pull/920




> Avoid retries for non-HA function execution
> ---
>
> Key: GEODE-10012
> URL: https://issues.apache.org/jira/browse/GEODE-10012
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Reporter: Mario Salazar de Torres
>Assignee: Mario Salazar de Torres
>Priority: Major
>  Labels: pull-request-available
>
> *GIVEN* a cluster with 3 servers and 1 locator
> *AND* a PartitionedRegion with redundant-copies="1"
> *AND* a user-defined Function with isHA=false
> *AND* a geode-native client with a pool with single-hop enabled and retry 
> attempts set to 2
> *WHEN* execution fails due to a connectivity issue/connection 
> timeout/internal cluster error
> *THEN* function is retried twice
> 
> *Additional information.* The function is retried at the network code layer, 
> and given whenever there is a timeout/connectivity error, the BSL is removed 
> from the ClientMetadataService, whenever retries happen, you might get a 
> *InternalFunctionInvocationTargetException* exception thrown indicating: 
> {code:java}
> Multiple target nodes found for single hop operation {code}
> Also, have into account that Java client API behaviour never retries a 
> Function Execution if isHA=false for that function, and whenever isHA=true, 
> the function is retried but on the function execution logic and not at the 
> networking layer.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (GEODE-10012) Avoid retries for non-HA function execution

2022-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500310#comment-17500310
 ] 

ASF GitHub Bot commented on GEODE-10012:


pivotal-jbarrett commented on a change in pull request #920:
URL: https://github.com/apache/geode-native/pull/920#discussion_r817929905



##
File path: cppcache/src/ExecutionImpl.hpp
##
@@ -78,6 +78,20 @@ class ExecutionImpl {
   static void addResults(std::shared_ptr& collector,
  const std::shared_ptr& results);
 
+ protected:
+  std::shared_ptr executeOnPool(
+  const std::string& func, FunctionAttributes funcAttrs, int32_t 
retryAttempts,
+  std::chrono::milliseconds timeout = DEFAULT_QUERY_RESPONSE_TIMEOUT);
+
+  void executeOnAllServers(
+  const std::string& func, FunctionAttributes funcAttrs,
+  std::chrono::milliseconds timeout = DEFAULT_QUERY_RESPONSE_TIMEOUT);
+
+  FunctionAttributes getFunctionAttributes(
+  const std::string& func);

Review comment:
   Please use full names, like `functionId` here.

##
File path: cppcache/src/ExecutionImpl.hpp
##
@@ -78,6 +78,20 @@ class ExecutionImpl {
   static void addResults(std::shared_ptr& collector,
  const std::shared_ptr& results);
 
+ protected:
+  std::shared_ptr executeOnPool(
+  const std::string& func, FunctionAttributes funcAttrs, int32_t 
retryAttempts,
+  std::chrono::milliseconds timeout = DEFAULT_QUERY_RESPONSE_TIMEOUT);
+
+  void executeOnAllServers(
+  const std::string& func, FunctionAttributes funcAttrs,
+  std::chrono::milliseconds timeout = DEFAULT_QUERY_RESPONSE_TIMEOUT);
+
+  FunctionAttributes getFunctionAttributes(
+  const std::string& func);
+  FunctionAttributes updateFunctionAttributes(const std::string );
+  GfErrType getFuncAttributes(const std::string& func, FunctionAttributes& 
attr);

Review comment:
   `functionId` and `functionAttributes` please.

##
File path: cppcache/src/ExecutionImpl.cpp
##
@@ -329,26 +281,23 @@ std::shared_ptr ExecutionImpl::execute(
   "Execution::execute: Transaction function execution on pool is not "
   "supported");
 }
-if (m_allServer == false) {
-  executeOnPool(
-  func, isHAHasResultOptimizeForWrite,
-  (isHAHasResultOptimizeForWrite & 1) ? m_pool->getRetryAttempts() : 0,
-  timeout);
-  if (serverHasResult == true) {
-// ExecutionImpl::addResults(m_rc, rs);
+if (!m_allServer) {
+  executeOnPool(func, attrs, attrs.isHA() ? m_pool->getRetryAttempts() : 0,
+timeout);
+  if (attrs.hasResult()) {
 m_rc->endResults();
   }
   return m_rc;
 }
-executeOnAllServers(func, isHAHasResultOptimizeForWrite, timeout);
+executeOnAllServers(func, attrs, timeout);
   } else {
 throw IllegalStateException("Execution::execute: should not be here");
   }
   return m_rc;
 }
 
-GfErrType ExecutionImpl::getFuncAttributes(
-const std::string& func, std::shared_ptr>* attr) {
+GfErrType ExecutionImpl::getFuncAttributes(const std::string& func,
+   FunctionAttributes& attr) {

Review comment:
   Does it make more sense to have the function return `FunctionAttributes` 
and `throw` exceptions?

##
File path: cppcache/src/ExecutionImpl.cpp
##
@@ -406,24 +355,25 @@ void ExecutionImpl::addResults(
 }
 
 void ExecutionImpl::executeOnAllServers(const std::string& func,
-uint8_t getResult,
+FunctionAttributes funcAttrs,

Review comment:
   Is the copy construction cheaper or necessary here rather than a `const 
&`?

##
File path: cppcache/src/ThinClientRegion.cpp
##
@@ -2907,18 +2911,20 @@ void ThinClientRegion::executeFunction(
 rc->clearResults();
 failedNodes->clear();
   } else if (err == GF_TIMEOUT) {
-LOGINFO("function timeout. Name: %s, timeout: %s, params: %" PRIu8
-", "
-"retryAttempts: %d ",
-func.c_str(), to_string(timeout).c_str(), getResult,
-retryAttempts);
+LOGINFO(
+"function timeout. Name: %s, timeout: %s, FunctionState: %" PRIu8
+", "
+"retryAttempts: %d ",

Review comment:
   Cleanup the formatting here.

##
File path: cppcache/src/FunctionExecution.cpp
##
@@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the 

[jira] [Commented] (GEODE-10012) Avoid retries for non-HA function execution

2022-03-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500289#comment-17500289
 ] 

ASF GitHub Bot commented on GEODE-10012:


pdxcodemonkey commented on a change in pull request #920:
URL: https://github.com/apache/geode-native/pull/920#discussion_r803908544



##
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##
@@ -74,6 +99,9 @@ class TestResultCollector : public ResultCollector {
   }
 
   virtual void addResult(const std::shared_ptr ) override {
+LOGINFO("Before mutex lock!");
+std::lock_guard lock{mutex_};
+LOGINFO("Adding a new result!");

Review comment:
   Do we really wish to log this at INFO level?  Looks more like a 
DEBUG-level message to me...

##
File path: cppcache/src/ExecutionImpl.hpp
##
@@ -100,29 +114,18 @@ class ExecutionImpl {
 m_allServer(allServer),
 m_pool(pool),
 m_authenticatedView(authenticatedView) {}
+ protected:
+
   std::shared_ptr m_routingObj;

Review comment:
   You've essentially rewritten the class already, it would be a shame not 
to go ahead and properly rename the member variables.

##
File path: cppcache/src/TcrMessage.cpp
##
@@ -1184,12 +1184,11 @@ void TcrMessage::handleByteArrayResponse(
 input.readInt32();
 input.advanceCursor(1);  // ignore byte
 
-if (!m_functionAttributes) {
-  m_functionAttributes = std::make_shared>();
-}
-m_functionAttributes->push_back(input.read());
-m_functionAttributes->push_back(input.read());
-m_functionAttributes->push_back(input.read());
+bool hasResult = input.read() != 0;
+bool isHA = input.read() != 0;
+bool optimizeForWrite = input.read() != 0;
+m_functionAttributes =
+FunctionAttributes{isHA, hasResult, optimizeForWrite};

Review comment:
   Just pausing for a minute to think about how hateful it was to have this 
encoded as a vector with no indication of meaning whatsoever.  This is 
a huge improvement in readability, thanks!

##
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##
@@ -74,6 +99,9 @@ class TestResultCollector : public ResultCollector {
   }
 
   virtual void addResult(const std::shared_ptr ) override {
+LOGINFO("Before mutex lock!");
+std::lock_guard lock{mutex_};
+LOGINFO("Adding a new result!");

Review comment:
   Actually no, we shouldn't be using any of the LOG* macros in integration 
test code at all.  Please remove.

##
File path: cppcache/src/FunctionAttributes.hpp
##
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#ifndef GEODE_FUNCTIONATTRIBUTES_H_
+#define GEODE_FUNCTIONATTRIBUTES_H_
+
+#include 
+
+namespace apache {
+namespace geode {
+namespace client {
+
+class FunctionAttributes {
+ public:
+  enum : uint8_t {

Review comment:
   Does this enum need to be public?  You have accessors for the individual 
values already, it seems like a thing you'd want to hide.

##
File path: cppcache/src/ExecutionImpl.cpp
##
@@ -434,31 +384,15 @@ void ExecutionImpl::executeOnAllServers(const 
std::string& func,
   }
 }
 std::shared_ptr ExecutionImpl::executeOnPool(
-const std::string& func, uint8_t getResult, int32_t retryAttempts,
-std::chrono::milliseconds timeout) {
+const std::string& func, FunctionAttributes funcAttrs,
+int32_t retryAttempts, std::chrono::milliseconds timeout) {
   ThinClientPoolDM* tcrdm = dynamic_cast(m_pool.get());
   if (tcrdm == nullptr) {
 throw IllegalArgumentException(
 "Execute: pool cast to ThinClientPoolDM failed");
   }
   int32_t attempt = 0;
 
-  // auto csArray = tcrdm->getServers();
-
-  // if (csArray != nullptr && csArray->length() != 0) {
-  //  for (int i = 0; i < csArray->length(); i++)
-  //  {
-  //   auto cs = csArray[i];
-  //TcrEndpoint *ep = nullptr;
-  ///*
-  //std::string endpointStr =
-  //Utils::convertHostToCanonicalForm(cs->value().c_str()
-  //);
-  //*/
-  //ep = tcrdm->addEP(cs->value().c_str());
- 

[jira] [Commented] (GEODE-10012) Avoid retries for non-HA function execution

2022-02-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487523#comment-17487523
 ] 

ASF GitHub Bot commented on GEODE-10012:


gaussianrecurrence opened a new pull request #920:
URL: https://github.com/apache/geode-native/pull/920


- Currently, in the native client, if you configure retryAttemps to be >0 
for your pool, non-HA function executions are retried. 
   This happens as retry are coded in the network layer.
- Geode documentation states that retries should happen for functions
  which isHA=true and in the function execution logic. And this is how
  it works in the Java client API.
- If a function is retried at the network layer, it could happen that 
- So, this commit changes the behavior, so non-HA functions are not
  retried.
- Also, FunctionAttributes were introduced in order to improve code
  readability.
- Moved function execution classes to a file of its own, in order to
  improve readability and modularity.
- Solved a bug in which there could not exist 2 functions with the same
  name, defined on different clusters and with different function
  attributes.
- Implemented several new ITs to test that non-HA function executions are 
not
  retried by the API.
- These are the cases being tested:
   * Function execution with PR enabled:
 - A timeout is forced from the server-side, so the server closes
   the connection towards the client.
 - A timeout is forced in the client.
   * Function execution with PR disabled.
   
   
   Co-authored-by: Jakov Varenina 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@geode.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Avoid retries for non-HA function execution
> ---
>
> Key: GEODE-10012
> URL: https://issues.apache.org/jira/browse/GEODE-10012
> Project: Geode
>  Issue Type: Bug
>  Components: native client
>Reporter: Mario Salazar de Torres
>Assignee: Mario Salazar de Torres
>Priority: Major
>  Labels: needsTriage
>
> *GIVEN* a cluster with 3 servers and 1 locator
> *AND* a PartitionedRegion with redundant-copies="1"
> *AND* a user-defined Function with isHA=false
> *AND* a geode-native client with a pool with single-hop enabled and retry 
> attempts set to 2
> *WHEN* execution fails due to a connectivity issue/connection 
> timeout/internal cluster error
> *THEN* function is retried twice
> 
> *Additional information.* The function is retried at the network code layer, 
> and given whenever there is a timeout/connectivity error, the BSL is removed 
> from the ClientMetadataService, whenever retries happen, you might get a 
> *InternalFunctionInvocationTargetException* exception thrown indicating: 
> {code:java}
> Multiple target nodes found for single hop operation {code}
> Also, have into account that Java client API behaviour never retries a 
> Function Execution if isHA=false for that function, and whenever isHA=true, 
> the function is retried but on the function execution logic and not at the 
> networking layer.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)