I'm not sure that I can. I clicked the "Archive" link for the logging-es pod and then changed the query in Kibana to "kubernetes_container_name: logging-es-cycd8veb && kubernetes_namespace_name: logging". I got no results; instead, I got this error:
- *Index:* unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.12 *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@6b1f2699]
- *Index:* unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.14 *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@66b9a5fb]
- *Index:* unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.15 *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@512820e]
- *Index:* unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.29 *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@3dce96b9]
- *Index:* unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.30 *Shard:* 2 *Reason:* EsRejectedExecutionException[rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@2f774477]

When I initially clicked the "Archive" link, I saw a lot of messages with the kubernetes_container_name "logging-fluentd", which is not what I expected to see.

On Fri, Jul 15, 2016 at 10:44 AM, Peter Portante <pport...@redhat.com> wrote:

> Can you go back further in the logs to the point where the errors started?
>
> I am thinking about possible Java HEAP issues, or possibly ES
> restarting for some reason.
>
> -peter
>
> On Fri, Jul 15, 2016 at 11:37 AM, Lukáš Vlček <lvl...@redhat.com> wrote:
> > Also looking at this.
> >
> > Alex, is it possible to investigate if you were having some kind of
> > network connection issues in the ES cluster (I mean between individual
> > cluster nodes)?
> >
> > Regards,
> > Lukáš
> >
> >> On 15 Jul 2016, at 17:08, Peter Portante <pport...@redhat.com> wrote:
> >>
> >> Just catching up on the thread, will get back to you all in a few ...
> >>
> >> On Fri, Jul 15, 2016 at 10:08 AM, Eric Wolinetz <ewoli...@redhat.com> wrote:
> >>> Adding Lukas and Peter
> >>>
> >>> On Fri, Jul 15, 2016 at 8:07 AM, Luke Meyer <lme...@redhat.com> wrote:
> >>>>
> >>>> I believe the "queue capacity" there is the number of parallel searches
> >>>> that can be queued while the existing search workers operate. It sounds
> >>>> like it has plenty of capacity there and it has a different reason for
> >>>> rejecting the query. I would guess the data requested is missing, given
> >>>> it couldn't fetch shards it expected to.
> >>>>
> >>>> The number of shards is a multiple (for redundancy) of the number of
> >>>> indices, and there is an index created per project per day. So even for
> >>>> a small cluster this doesn't sound out of line.
> >>>>
> >>>> Can you give a little more information about your logging deployment?
> >>>> Have you deployed multiple ES nodes for redundancy, and what are you
> >>>> using for storage? Could you attach full ES logs? How many OpenShift
> >>>> nodes and projects do you have? Any history of events that might have
> >>>> resulted in lost data?
> >>>>
> >>>> On Thu, Jul 14, 2016 at 4:06 PM, Alex Wauck <alexwa...@exosite.com> wrote:
> >>>>>
> >>>>> When doing searches in Kibana, I get error messages similar to "Courier
> >>>>> Fetch: 919 of 2020 shards failed". Deeper inspection reveals errors like
> >>>>> this: "EsRejectedExecutionException[rejected execution (queue capacity
> >>>>> 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@14522b8e]".
> >>>>>
> >>>>> A bit of investigation led me to conclude that our Elasticsearch server
> >>>>> was not sufficiently powerful, so I spun up a new one with four times
> >>>>> the CPU and RAM of the original one, but the queue capacity is still
> >>>>> only 1000. Also, 2020 seems like a really ridiculous number of shards.
> >>>>> Any idea what's going on here?
> >>>>>
> >>>>> --
> >>>>>
> >>>>> Alex Wauck // DevOps Engineer
> >>>>>
> >>>>> E X O S I T E
> >>>>> www.exosite.com
> >>>>>
> >>>>> Making Machines More Human.
> >>>>>
> >>>>> _______________________________________________
> >>>>> users mailing list
> >>>>> users@lists.openshift.redhat.com
> >>>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users

--

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com <http://www.exosite.com/>*

Making Machines More Human.
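[Editor's note] Luke's point about the shard count can be sketched with arithmetic. The numbers below are illustrative assumptions, not values from this cluster: OpenShift's logging stack creates one Elasticsearch index per project per day, and each index carries several primary shards plus replica copies, so totals in the low thousands arise quickly.

```python
# Hypothetical arithmetic only: none of these values come from the
# cluster discussed in this thread.
projects = 20            # assumed number of OpenShift projects
days_retained = 10       # assumed days of daily indices kept around
shards_per_index = 5     # Elasticsearch's default primary shard count
replicas = 1             # Elasticsearch's default replica count

indices = projects * days_retained
total_shards = indices * shards_per_index * (1 + replicas)
print(total_shards)  # -> 2000, the same order of magnitude as the 2020 in the error
```

So a modest cluster with a couple dozen projects and a week or two of retention lands near 2020 shards without anything being wrong with the index layout itself.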
_______________________________________________ users mailing list users@lists.openshift.redhat.com http://lists.openshift.redhat.com/openshiftmm/listinfo/users
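[Editor's note] The rejection itself is mechanical: the Elasticsearch search thread pool fronts a bounded queue (capacity 1000 here), and requests arriving while the queue is full are rejected immediately rather than waited on. A minimal stand-in, which is not Elasticsearch code and uses a capacity of 5 purely for illustration:

```python
from queue import Queue, Full

# Toy model of a bounded executor queue: once it is full, further
# submissions are rejected outright -- the same behavior that
# EsRejectedExecutionException[... (queue capacity 1000) ...] reports.
QUEUE_CAPACITY = 5  # stand-in for the search queue capacity of 1000

work_queue = Queue(maxsize=QUEUE_CAPACITY)
rejected = 0
for task_id in range(8):  # 8 submissions, no workers draining the queue
    try:
        work_queue.put_nowait(task_id)
    except Full:
        rejected += 1

print(work_queue.qsize(), rejected)  # -> 5 3
```

Raising the queue size (threadpool.search.queue_size in elasticsearch.yml on the 1.x/2.x line) only hides the back-pressure; sustained rejections usually mean each query is fanning out to too many shards, which is why the per-project-per-day index count matters.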