I think with the retry one has to distinguish between:

1. currently running tasks that have not yet completed (current timeout
   behavior)
2. tasks that have failed because the server died.

In case #1, retrying can only compound whatever load problem is already present, by forwarding the request to the cluster again and thus just adding unnecessary load. In this case we should NEVER just automatically retry.

In case #2, the client should retry against another server, because the original request might have been lost, never completed, or completed without the client ever being notified of the result. THIS would be the ONLY case where an auto retry makes sense....
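A minimal sketch of that split, purely for illustration -- the exception types and the submit helper below are hypothetical stand-ins, not the actual geode-native API:

#include <stdexcept>

// Hypothetical exception types: "the operation timed out" versus
// "the server we sent the request to went away".
struct OperationTimedOut : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

struct ServerGone : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Case #1: the server is slow -> surface the timeout, never add more load.
// Case #2: the server died -> one resubmit, which the pool can route elsewhere.
template <typename Op>
auto submitOnce(Op op)
{
    try
    {
        return op();
    }
    catch (const OperationTimedOut&)
    {
        throw;          // do not retry; the original request may still be executing
    }
    catch (const ServerGone&)
    {
        return op();    // retry on another server; the original may never have run
    }
}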

--Udo


On 8/31/17 11:10, Mark Hanson wrote:
This basic problem exists with the following cases.
Interval: to do something at an interval
Wait: to wait a certain length of time
Retry: to retry a certain number of times
Attempts: to make a certain number of attempts (similar to retry)
Sets of objects: to iterate through an unscoped set of objects.

On Thu, Aug 31, 2017 at 11:04 AM, Jacob Barrett <jbarr...@pivotal.io> wrote:

I should have scoped it to the native API.

On Aug 31, 2017, at 10:30 AM, Bruce Schuchardt <bschucha...@pivotal.io> wrote:
The DistributedLockService uses -1/0/n


On 8/31/17 10:21 AM, Jacob Barrett wrote:
In relation to this particular example you provided, the discussion of removing it is valid as an alternative to fixing it.

Are there other examples of this -1/0/n parameter style we should discuss?

-Jake


Sent from my iPhone

On Aug 31, 2017, at 10:15 AM, Mark Hanson <mhan...@pivotal.io> wrote:

As I understand it here, the question is: when the first server is no longer available, do we retry on another server? I would say the answer is clearly yes, and in the name of controlling load we want to have an API that controls the timing of how that is done. The customer can say no retries and write their own...

This is a bit of a digression from the much larger topic, though. The reason I was told to send this email was to broach the larger discussion of iteration and the overloading of -1 to mean infinite. At least that is my understanding...


On Thu, Aug 31, 2017 at 9:32 AM, Udo Kohlmeyer <ukohlme...@pivotal.io> wrote:

+1 to removing retry,

Imo, the retry should be made the responsibility of the submitting application. When an operation fails, the user should decide whether or not to retry. It should not be the default behavior of a connection pool.
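For illustration only, application-owned retry might look roughly like the sketch below; nothing here is the real pool API, the helper and its parameters are made up:

#include <chrono>
#include <stdexcept>
#include <thread>

// The pool performs exactly one attempt; the application wraps the call
// with whatever retry policy it chooses.
template <typename Op>
auto retryInApplication(Op op, int maxAttempts, std::chrono::milliseconds backoff)
{
    for (int attempt = 1;; ++attempt)
    {
        try
        {
            return op();                        // the single cache operation
        }
        catch (const std::exception&)
        {
            if (attempt >= maxAttempts) throw;  // the application decides when to give up
            std::this_thread::sleep_for(backoff);
        }
    }
}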

--Udo



On 8/31/17 09:26, Dan Smith wrote:

The java client does still have a retry-attempts setting - it's pretty much the same as the C++ API.

I agree with Bruce though, I think the current retry behavior is not ideal. I think it only really makes sense for the client to retry an operation that it actually sent to the server if the server stops responding to pings. I believe the current retry behavior just waits the read-timeout and then retries the operation on a new server.

-Dan

On Thu, Aug 31, 2017 at 8:08 AM, Bruce Schuchardt <bschucha...@pivotal.io> wrote:

Does anyone have a good argument for clients retrying operations?  I can see doing that if the server has died but otherwise it just overloads the servers.




On 8/30/17 8:36 PM, Dan Smith wrote:

In general, I think we need to make the configuration of geode less complex, not more.

As far as retry-attempts goes, maybe the best thing to do is to get rid of it. The P2P layer has no such concept. I don't think users should really have to care about how many servers an operation is attempted against. A user may want to specify how long an operation is allowed to take, but that could be better specified with an operation timeout rather than the current read-timeout + retry-attempts.
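A sketch of the contrast, using a made-up PoolConfig struct rather than the real PoolFactory (the operationTimeout knob is the hypothetical part):

#include <chrono>

// Hypothetical configuration, only to contrast the two models.
struct PoolConfig
{
    std::chrono::milliseconds readTimeout{10000};
    int retryAttempts = -1;                          // today: per-server knob, -1/0/n
    std::chrono::milliseconds operationTimeout{0};   // proposed: one overall bound
};

int main()
{
    // Today: worst-case latency is roughly read-timeout * (retry-attempts + 1),
    // and it also depends on how many servers happen to exist.
    PoolConfig current;
    current.readTimeout = std::chrono::seconds(10);
    current.retryAttempts = 3;                       // up to ~40s in the worst case

    // Proposed: the user states the only thing they actually care about.
    PoolConfig proposed;
    proposed.operationTimeout = std::chrono::seconds(15);
}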

-Dan



On Wed, Aug 30, 2017 at 2:08 PM, Patrick Rhomberg <prhomb...@pivotal.io> wrote:

Personally, I don't much like sentinel values, even if they have their occasional use.

Do we need to provide an authentic infinite value?  64-bit MAXINT is nearly 10 quintillion.  At 10GHz, counting that high still takes about thirty years.  If each retry takes as much as 10ms, we're looking at "retry for roughly as long as the earth has existed."  32-bit's is much more attainable, of course, but I think the point stands -- if you need to retry that much, something else is very wrong.
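For reference, the arithmetic behind those figures:

2^63 ≈ 9.2 × 10^18 retries
9.2 × 10^18 retries ÷ 10^10 retries/second ≈ 9.2 × 10^8 seconds ≈ 29 years
9.2 × 10^18 retries × 10 ms each ≈ 9.2 × 10^16 seconds ≈ 2.9 billion years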

In the more general sense, I struggle to think of a context where an authentic infinity is meaningfully distinct in application from a massive finite like MAXINT.  But I could be wrong and would love to hear what other people think.

On Wed, Aug 30, 2017 at 1:26 PM, Mark Hanson <mhan...@pivotal.io> wrote:

Hi All,

*Question: how should we deal, in a very forward and clean fashion, with the implicit ambiguity of -1 meaning all, infinite, or forever?*

*Background:*


We are looking to get some feedback on the subject of infinite/all/forever in the geode/geode-native code.

In looking at the code, we see an example function, setRetryAttempts() [1]. Currently, -1 means try all servers before failing, 0 means try 1 server before failing, and a number n greater than 0 means try n+1 servers before failing. In the case of setRetryAttempts, we don't know how many servers there are. This means that -1 for "all" servers has no relation to the actual number of servers that we have. Perhaps setRetryAttempts could be renamed to setNumberOfAttempts to clarify as well, but the problem still stands...
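To make that contract concrete, this is roughly how the client has to interpret the value today (simplified, not the actual geode-native internals):

// -1 => try every server the client currently knows about
//  0 => a single attempt, no retries
//  n => n + 1 attempts in total
int attemptsToMake(int retryAttempts, int knownServerCount)
{
    if (retryAttempts < 0)
    {
        return knownServerCount;  // "all", but the caller never knows this count up front
    }
    return retryAttempts + 1;
}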
*Discussion:*


In an attempt to provide the best code possible to the geode community, there has been some discussion of the use of infinite/all/forever as an overload of a count. Often -1 indicates infinite, while 0 indicates never, and 1 to MAXINT, inclusive, indicates a count.

There are three obvious approaches to solve the problem of the overloading of -1. The first approach is to do nothing… status quo. The second approach, to clarify things, would be to create an enumeration that would be passed in along with the number, or bundled as an object:

struct Retries
{
    typedef enum { eINFINITE, eCOUNT, eNONE } eCount;

    eCount approach;
    unsigned int count;
};
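Used at a call site it might read something like the following; the setter overload taking the struct is hypothetical:

Retries noRetry{Retries::eNONE, 0};
Retries tryThree{Retries::eCOUNT, 3};
Retries forever{Retries::eINFINITE, 0};   // count is ignored for eINFINITE

// poolFactory.setRetryAttempts(tryThree);  // hypothetical overload accepting the struct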



The third approach would be to pass a continue object of some sort, such that it tells you whether it is OK to continue with the algorithm. An example would be:

class Continue
{
public:
    virtual ~Continue() = default;

    // whether another attempt/iteration should be made
    virtual bool shouldContinue() = 0;
};


class InfiniteContinue : public Continue
{
public:
    bool shouldContinue() override
    {
        return true;   // never stop
    }
};


InfiniteContinue co;

while (co.shouldContinue())
{
    // do a thing
}


Another example would be a Continue limited to, let's say, 5:


class CountContinue : public Continue
{
private:
    int count;

public:
    explicit CountContinue(int count)
        : count(count)
    {
    }

    bool shouldContinue() override
    {
        return count-- > 0;   // allow exactly 'count' more iterations
    }
};


In both of these cases, what is happening is that the algorithm is being outsourced.
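The loop itself can then be written once and handed whichever policy the caller wants, along these lines (runWithPolicy and example are names made up for illustration):

void runWithPolicy(Continue& policy)
{
    while (policy.shouldContinue())
    {
        // do a thing; break out on success
    }
}

void example()
{
    InfiniteContinue forever;
    CountContinue fiveTimes(5);

    runWithPolicy(fiveTimes);  // the caller chose a bounded count
    runWithPolicy(forever);    // or "infinite", with the same loop
}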


*Conclusion:*


We are putting this out to start a discussion on the best way to move this forward… *What do people think? What direction would be best going forward?*


[1] https://github.com/apache/geode-native/blob/006df0e70eeb481ef5e9e821dba0050dee9c6893/cppcache/include/geode/PoolFactory.hpp#L327

