The problem is that we don't know whether packets are being lost or whether
there are delays due to a congested network.

The goal is to keep operating in the presence of random network errors.

Although I have not tried it, AMQ apparently has a way to retry
automatically, with a configurable delay and back-off between retries. I
would suggest trying that instead of building recovery into UIMA-AS. My
point is that if AMQ can solve this problem, why do it in UIMA-AS? One
benefit of doing the recovery in UIMA-AS, though, is that we can learn when
the delays occur and roughly how long they last. I think using AMQ for
recovery will hide the delays, as I suspect its recovery is silent (this
may need to be tested).
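
To make the trade-off concrete, here is a rough, untested sketch of both
options. The class and method names are hypothetical. The first helper
assumes the AMQ retry mechanism in question is the failover transport and
only builds a URI with its documented reconnect options; whether that
transport actually retries on a wire format negotiation timeout is exactly
what would need testing. The second helper is the UIMA-AS-side
alternative: it retries the open with back-off and records how long each
failed attempt took, so the delays stay visible. It uses a plain Exception
and a Callable in place of JMSException and the real connection open().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class ConnectRetrySketch {

    /** Result of a retried open: the value plus the duration of each failed attempt. */
    public static final class Result<T> {
        public final T value;
        public final List<Long> failedAttemptMillis;

        public Result(T value, List<Long> failedAttemptMillis) {
            this.value = value;
            this.failedAttemptMillis = failedAttemptMillis;
        }
    }

    /**
     * Option 1 (assumption: AMQ failover transport). Only builds the URI;
     * the retries themselves would happen silently inside AMQ.
     */
    public static String failoverUri(String brokerUri, long initialDelayMs,
                                     long maxDelayMs, int maxAttempts) {
        return "failover:(" + brokerUri + ")"
                + "?initialReconnectDelay=" + initialDelayMs
                + "&maxReconnectDelay=" + maxDelayMs
                + "&useExponentialBackOff=true"
                + "&maxReconnectAttempts=" + maxAttempts;
    }

    /**
     * Option 2 (recovery in the caller). Retries only when the exception
     * message mentions "wire format"; any other failure is rethrown at once.
     */
    public static <T> Result<T> openWithRetry(Callable<T> open, int maxAttempts,
                                              long initialDelayMs, double multiplier)
            throws Exception {
        List<Long> failed = new ArrayList<>();
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            long start = System.nanoTime();
            try {
                return new Result<>(open.call(), failed);
            } catch (Exception e) {
                // Record how long this attempt hung before it failed.
                failed.add((System.nanoTime() - start) / 1_000_000);
                String msg = String.valueOf(e.getMessage()).toLowerCase();
                if (!msg.contains("wire format") || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delay);                 // wait before the next try
                delay = (long) (delay * multiplier); // back off
            }
        }
    }
}
```

With the second approach the recorded durations could be logged or used to
tune the delay before the next retry, which is the information I suspect
we would lose if AMQ does the recovery on its own.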

On Fri, Feb 7, 2014 at 11:52 AM, Marshall Schor (JIRA)
<[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/UIMA-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894717#comment-13894717]
>
> Marshall Schor commented on UIMA-3605:
> --------------------------------------
>
> hmmm, is the design point for uima-as to operate OK in environments where
> tcp/ip packets are lost (I thought tcp/ip has recovery for that:
> http://en.wikipedia.org/wiki/Packet_loss )? or just the network is slow?
> It seems that if the network is so slow that a 10 second timeout pops, then
> maybe the network is too poor to support a UIMA-AS scaleout?
>
> Or, is the goal to keep operating in the presence of random, occasional,
> network hangs?
>
> Do we have any profile of what's going on in networks when this kind
> of problem happens - is it temporary?
>
> The answer could guide what kind of solution is appropriate.  For
> instance, if it is determined that for some reason, the network is usually
> great, but occasionally delays packets for 1 minute, then the recovery
> might want to do something like wait 1 minute before retrying.  Or if the
> desire is to operate in the presence of occasional network hangs, perhaps
> some design which measures the duration of these hangs, on an ongoing
> basis, would be useful - if it found they were 20 seconds, then the delay
> before higher-level retry could be set at 20 + delta seconds.
>
> If it is thought this is too much for UIMA-AS to handle, and it should be
> handled by fixing the networks, then perhaps the current design is OK :-)
>
> > UIMA-AS gets "Wire format negotiation timeout" on connection.open()
> > -------------------------------------------------------------------
> >
> >                 Key: UIMA-3605
> >                 URL: https://issues.apache.org/jira/browse/UIMA-3605
> >             Project: UIMA
> >          Issue Type: Bug
> >          Components: Async Scaleout
> >    Affects Versions: 2.4.2AS
> >            Reporter: Jerry Cwiklik
> >            Assignee: Jerry Cwiklik
> >             Fix For: 2.5.0AS
> >
> >
> > It appears that under heavy network load UIMA-AS is getting a "Wire
> > format negotiation timeout" exception when opening a connection to a
> > broker.
> > The client side of AMQ sends a frame containing its parameters to the
> > server (broker). The broker reconciles the client's params against its
> > own and sends a reply back to the client. The reply apparently never
> > reaches the client, causing the timer to pop (default=10secs), and an
> > exception is thrown.
> > An attempt to extend the client timeout via
> > wireFormat.maxInactivityDurationInitalDelay=60000 doesn't fix the
> > problem. One possible explanation is that either the client's wire
> > format frame is not reaching the server or the server's reply doesn't
> > reach the client. This may be due to a lost TCP packet.
> > Since the low-level AMQ wire negotiation doesn't offer retry, UIMA-AS
> > may need to implement a higher-level retry around the connection open()
> > logic. It should capture the generic JMSException and check the
> > associated description for a "wire format ..." problem. In such a case,
> > the connection should be closed and reopened.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.1.5#6160)
>
