Hello Sheng Wu. After checking, I found some error messages in my 
skywalking-agent logs, such as "Send UpstreamSegment to collector fail with a 
grpc internal exception. 
org.apache.skywalking.apm.dependencies.io.grpc.StatusRuntimeException: 
UNAVAILABLE: Network closed for unknown reason".
What does this error mean?

At 2021-09-13 15:05:24, "Sheng Wu" <[email protected]> wrote:
>(1) Yes, all data in that bulk will be lost ("bulk" is an ElasticSearch
>concept; see their documentation).
>(2) This only means your agent was disconnected from the server
>unexpectedly. The error does not tell you why.
>
>About what you described in Chinese: first of all, it is better to
>keep the Chinese and English consistent. Don't put more information on
>one side; it is confusing.
>Why the agent stays disconnected forever can't be determined from what
>you have provided.
>Auto-reconnecting works normally, AFAIK.
>
>Sheng Wu 吴晟
>Twitter, wusheng1108
>
>dafang <[email protected]> wrote on Mon, Sep 13, 2021 at 2:58 PM:
>>
>> Now I have two questions:
>> 1. If this error occurs, will all trace and JVM metrics be lost?
>> 2. If there are messages in the server logs like:
>> "org.apache.skywalking.oap.server.receiver.trace.provider.handler.v8.grpc.TraceSegmentReportServiceHandler
>>  - 86 [grpcServerPool-1-thread-7] ERROR [] - CANCELLED: cancelled before 
>> receiving half close
>> io.grpc.StatusRuntimeException: CANCELLED: cancelled before receiving half 
>> close"
>> will this cause trace or JVM metrics to be lost?
>>
>>
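>> To explain in Chinese: I have more than 100 machines in production.
>> Some instances are fine, but on others the trace metrics or JVM metrics
>> often go missing and never come back unless the service is restarted.
>> Could the two situations I listed above cause what I am seeing?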
>>
>> At 2021-09-13 14:50:14, "Sheng Wu" <[email protected]> wrote:
>> >That error does matter. An HTTP request that is too large will make
>> >ElasticSearch reject your bulk insert, which causes data loss.
>> >
>> >Sheng Wu 吴晟
>> >Twitter, wusheng1108
>> >
>> >dafang <[email protected]> wrote on Mon, Sep 13, 2021 at 2:23 PM:
>> >>
>> >> Hi SkyWalking dev team:
>> >> In our prod env, I found that trace and JVM metrics are lost some 
>> >> time after some services start. The agent logs show no errors. Only the 
>> >> server log shows: "Es 413 request too large". Will this problem cause 
>> >> complete data loss?
>> >>
>> >>
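>> >> Let me describe it again in Chinese:
>> >> We recently found that our production service cluster originally had
>> >> 15 machines, but after integrating SkyWalking, on some of them (about
>> >> 5-6) the trace metrics, the JVM metrics, or both disappear after a
>> >> while. The service itself keeps serving requests; only the monitoring
>> >> data is gone. After investigating, we found no errors in the agent log,
>> >> only some "413 request too large" ES errors in the server log. I would
>> >> like to ask: once writing trace or JVM metrics to storage fails because
>> >> of this problem, will they never be collected and stored again?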
>> >>
>> >>
>> >> Looking forward to your help.
>> >> Yours,
>> >> 大方
>> >> 2021.09.13
