Well, we don't send ES logs back into ES itself. I think doing that could create a feedback loop that takes the whole thing down. -peter
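Either way, `oc logs` against the ES pod (as Luke suggests below) will get them. A minimal sketch, assuming the standard aggregated-logging deployment; the component=es label is an assumption and <logging-es-pod> is a placeholder for your actual pod name:

    # find the ES pod(s) in the logging project
    oc get pods -n logging -l component=es

    # stream the ES container's logs straight from the node,
    # bypassing ES storage entirely
    oc logs -f <logging-es-pod> -n logging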
On Fri, Jul 15, 2016 at 3:39 PM, Luke Meyer <lme...@redhat.com> wrote:
> They surely do. Although it would probably be easiest here to just get them
> from `oc logs` against the ES pod, especially if we can't trust ES storage.
>
> On Fri, Jul 15, 2016 at 3:26 PM, Peter Portante <pport...@redhat.com> wrote:
>>
>> Eric, Luke,
>>
>> Do the logs from the ES instance itself flow into that ES instance?
>>
>> -peter
>>
>> On Fri, Jul 15, 2016 at 12:14 PM, Alex Wauck <alexwa...@exosite.com> wrote:
>> > I'm not sure that I can. I clicked the "Archive" link for the logging-es
>> > pod and then changed the query in Kibana to "kubernetes_container_name:
>> > logging-es-cycd8veb && kubernetes_namespace_name: logging". I got no
>> > results, instead getting these errors:
>> >
>> > Index: unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.12
>> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution (queue
>> > capacity 1000) on
>> > org.elasticsearch.search.action.SearchServiceTransportAction$23@6b1f2699]
>> >
>> > Index: unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.14
>> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution (queue
>> > capacity 1000) on
>> > org.elasticsearch.search.action.SearchServiceTransportAction$23@66b9a5fb]
>> >
>> > Index: unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.15
>> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution (queue
>> > capacity 1000) on
>> > org.elasticsearch.search.action.SearchServiceTransportAction$23@512820e]
>> >
>> > Index: unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.29
>> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution (queue
>> > capacity 1000) on
>> > org.elasticsearch.search.action.SearchServiceTransportAction$23@3dce96b9]
>> >
>> > Index: unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.30
>> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution (queue
>> > capacity 1000) on
>> > org.elasticsearch.search.action.SearchServiceTransportAction$23@2f774477]
>> >
>> > When I initially clicked the "Archive" link, I saw a lot of messages with
>> > the kubernetes_container_name "logging-fluentd", which is not what I
>> > expected to see.
>> >
>> > On Fri, Jul 15, 2016 at 10:44 AM, Peter Portante <pport...@redhat.com>
>> > wrote:
>> >>
>> >> Can you go back further in the logs to the point where the errors
>> >> started?
>> >>
>> >> I am thinking about possible Java heap issues, or possibly ES
>> >> restarting for some reason.
>> >>
>> >> -peter
>> >>
>> >> On Fri, Jul 15, 2016 at 11:37 AM, Lukáš Vlček <lvl...@redhat.com> wrote:
>> >> > Also looking at this.
>> >> > Alex, is it possible to investigate whether you were having some kind
>> >> > of network connection issues in the ES cluster (I mean between
>> >> > individual cluster nodes)?
>> >> >
>> >> > Regards,
>> >> > Lukáš
>> >> >
>> >> >> On 15 Jul 2016, at 17:08, Peter Portante <pport...@redhat.com> wrote:
>> >> >>
>> >> >> Just catching up on the thread, will get back to you all in a few ...
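A quick way to check Peter's heap/restart theory, and to watch the counters behind those EsRejectedExecutionException errors, is the _cat API. A hedged sketch: the admin certificate paths are assumptions based on the standard origin-aggregated-logging ES image, and <logging-es-pod> is a placeholder:

    # helper: run an authenticated curl from inside the ES pod
    # (<logging-es-pod> is a placeholder for your actual pod name)
    es_curl() {
      oc exec -n logging <logging-es-pod> -- \
        curl -s --cacert /etc/elasticsearch/secret/admin-ca \
             --cert /etc/elasticsearch/secret/admin-cert \
             --key /etc/elasticsearch/secret/admin-key "$@"
    }

    # per-node thread pool stats; a climbing search.rejected matches the
    # rejected-execution errors quoted above
    es_curl 'https://localhost:9200/_cat/thread_pool?v'

    # heap pressure and uptime; an uptime much shorter than the pod's age
    # would mean ES has been restarting (e.g. OOM-killed)
    es_curl 'https://localhost:9200/_cat/nodes?v&h=host,heap.percent,uptime'

If heap.percent sits near 100 while searches are being rejected, that would support the heap theory.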
>> >> >>
>> >> >> On Fri, Jul 15, 2016 at 10:08 AM, Eric Wolinetz <ewoli...@redhat.com>
>> >> >> wrote:
>> >> >>> Adding Lukas and Peter
>> >> >>>
>> >> >>> On Fri, Jul 15, 2016 at 8:07 AM, Luke Meyer <lme...@redhat.com> wrote:
>> >> >>>>
>> >> >>>> I believe the "queue capacity" there is the number of searches that
>> >> >>>> can be queued while the existing search workers operate. It sounds
>> >> >>>> like it has plenty of capacity there, and that it is rejecting the
>> >> >>>> query for a different reason. I would guess the requested data is
>> >> >>>> missing, given that it couldn't fetch shards it expected to.
>> >> >>>>
>> >> >>>> The number of shards is a multiple (for redundancy) of the number of
>> >> >>>> indices, and there is an index created per project per day. So even
>> >> >>>> for a small cluster this doesn't sound out of line.
>> >> >>>>
>> >> >>>> Can you give a little more information about your logging deployment?
>> >> >>>> Have you deployed multiple ES nodes for redundancy, and what are you
>> >> >>>> using for storage? Could you attach full ES logs? How many OpenShift
>> >> >>>> nodes and projects do you have? Any history of events that might have
>> >> >>>> resulted in lost data?
>> >> >>>>
>> >> >>>> On Thu, Jul 14, 2016 at 4:06 PM, Alex Wauck <alexwa...@exosite.com>
>> >> >>>> wrote:
>> >> >>>>>
>> >> >>>>> When doing searches in Kibana, I get error messages similar to
>> >> >>>>> "Courier Fetch: 919 of 2020 shards failed". Deeper inspection
>> >> >>>>> reveals errors like this: "EsRejectedExecutionException[rejected
>> >> >>>>> execution (queue capacity 1000) on
>> >> >>>>> org.elasticsearch.search.action.SearchServiceTransportAction$23@14522b8e]".
>> >> >>>>>
>> >> >>>>> A bit of investigation led me to conclude that our Elasticsearch
>> >> >>>>> server was not sufficiently powerful, so I spun up a new one with
>> >> >>>>> four times the CPU and RAM of the original, but the queue capacity
>> >> >>>>> is still only 1000. Also, 2020 seems like a ridiculous number of
>> >> >>>>> shards. Any idea what's going on here?
>> >> >>>>>
>> >> >>>>> --
>> >> >>>>> Alex Wauck // DevOps Engineer
>> >> >>>>> E X O S I T E
>> >> >>>>> www.exosite.com
>> >> >>>>> Making Machines More Human.

_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
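On Luke's shard math: with the ES 1.x default of 5 primary shards per index and one index per project per day, the totals grow fast; for example, 40 projects logging for 10 days gives 40 × 10 × 5 = 2000 shards, right around the "2020 shards" Kibana reported. A hedged way to verify, reusing the es_curl helper sketched earlier (certificate paths and pod name remain assumptions):

    # one line per index, named like <project>.<uid>.YYYY.MM.DD
    es_curl 'https://localhost:9200/_cat/indices?h=index' | wc -l

    # total shard copies the cluster is tracking
    es_curl 'https://localhost:9200/_cat/shards' | wc -l

    # unassigned shard copies would explain "919 of 2020 shards failed"
    es_curl 'https://localhost:9200/_cat/shards' | grep -c UNASSIGNED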