On 01/29/2014 10:01 AM, Felix Lee wrote:
Dear all,
Just some experiences to share on this.
After I upgraded from Grizzly to Havana, I ran keystone with token expiration = 14400 and the memcached backend, without the patch, for weeks with no problems.

But last week it started hitting the "Unable to add token user list" issue. I tried reducing the token lifetime to 3 hours, then 2 hours, then 1 hour or even less, but none of that solved the issue for good (keystone lasted ~10 minutes at most). Furthermore, after a couple of keystone restarts and memcached flushes, keystone suddenly could not start up properly and kept logging errors like this:

2014-01-24 13:25:52.081 91813 INFO keystone.common.environment.eventlet_server [-] Starting /usr/bin/keystone-all on 0.0.0.0:35357
2014-01-24 13:25:52.081 91813 CRITICAL keystone [-] [Errno 98] Address already in use


So, I checked system with:
netstat -nap
lsof -i :35357


But I saw no process or connection occupying port 35357.
I have never encountered this problem on Linux before; the only way to solve it (other than rebooting the machine :) ) was to flush the caches by hand, like this:

sync; echo 3 > /proc/sys/vm/drop_caches


I suspect that the 35357 socket entry was removed from /proc when the process stopped but somehow remained in the memory cache for an unknown reason; it may be some undiscovered bug in the eventlet server, I don't know. Anyway, after this incident I applied the patch and now use expiration = 3600 for the token lifetime, and everything is working perfectly again. I still have no idea why the problem suddenly escalated into such a terrible condition; it was as if keystone were suddenly under a token DDoS attack from the Neutron agents and other internal OpenStack service components for no reason...
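
As a complement to netstat and lsof, the "Address already in use" condition can be probed by attempting the bind directly. The following is only a minimal sketch: the helper name try_bind is hypothetical, and setting SO_REUSEADDR is an assumption meant to mimic what many servers do before binding, not a statement about how keystone-all itself is configured.

```python
import errno
import socket


def try_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to (host, port).

    Returns None when the bind succeeds, otherwise the errno reported
    by the kernel (for example errno.EADDRINUSE, i.e. Errno 98 on Linux).
    The socket is always closed again, so the port is not kept.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Assumption: allow rebinding over TIME_WAIT leftovers, as many
        # servers do, so only a real conflict is reported.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()
    return None
```

Calling try_bind(35357) on the affected host and comparing the result with errno.EADDRINUSE would show whether the kernel still considers the port taken, independently of what the process listings report.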


35357 is considered an ephemeral port by Linux (although not by POSIX), and I suspect that it is getting "reclaimed" by one subsystem but not released in another: another way of describing your hypothesis.

I read somewhere that there is a way to tell Linux to reserve port 35357, and not treat it as ephemeral.
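
The mechanism I have in mind is, I believe, the net.ipv4.ip_local_reserved_ports sysctl, which excludes listed ports from automatic (ephemeral) assignment, while net.ipv4.ip_local_port_range defines the ephemeral range itself. Below is a rough sketch for checking whether a port is exposed; the function names are hypothetical, and the /proc paths and file formats are what I expect on a reasonably recent Linux kernel, so please verify them on your own system.

```python
from pathlib import Path

# Assumed locations of the two sysctls under /proc (Linux only).
PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")
RESERVED_FILE = Path("/proc/sys/net/ipv4/ip_local_reserved_ports")


def parse_port_range(text):
    """Parse 'LOW HIGH' (whitespace separated) into a (low, high) tuple."""
    low, high = text.split()
    return int(low), int(high)


def parse_reserved_ports(text):
    """Parse a list such as '8080,35357,40000-40010' into (low, high) tuples."""
    ranges = []
    for item in text.strip().split(","):
        item = item.strip()
        if not item:
            continue  # an empty file means nothing is reserved
        low, _, high = item.partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def may_be_used_as_ephemeral(port, port_range, reserved):
    """True if the port lies in the ephemeral range and is not reserved."""
    low, high = port_range
    if not low <= port <= high:
        return False
    return not any(lo <= port <= hi for lo, hi in reserved)


def check_port(port):
    """Read the live sysctl values and apply the check to one port."""
    port_range = parse_port_range(PORT_RANGE_FILE.read_text())
    reserved = parse_reserved_ports(RESERVED_FILE.read_text())
    return may_be_used_as_ephemeral(port, port_range, reserved)
```

If check_port(35357) returns True, the kernel is free to hand 35357 out as a source port for outgoing connections. Adding the port to the reserved list (for example with sysctl -w net.ipv4.ip_local_reserved_ports=35357, made persistent in /etc/sysctl.conf) should prevent that, while still allowing keystone to bind to it explicitly.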


Best regards,
Felix Lee ~


On 2014-01-13 17:25, Morgan Fainberg wrote:
Hi Tim,

The change is being proposed directly to stable/havana.  We have an
alternative implementation for Icehouse as we are refactoring the entire
key-value-store system and making memcache a version of that new
implementation.

Cheers,
Morgan

On January 12, 2014 at 10:14:49, Tim Bell (tim.b...@cern.ch) wrote:

Can we tag this patch for backporting to Havana stable?

We're starting work for the CERN upgrade and this looks like a very
useful patch to be part of the standard Havana offering.

Tim

> -----Original Message-----
> From: Jonathan Proulx [mailto:j...@jonproulx.com]
> Sent: 12 January 2014 18:32
> To: Morgan Fainberg
> Cc: openstack@lists.openstack.org
> Subject: Re: [Openstack] [Keystone] performance issues after havana upgrade
>
> puzzling side effect?
>
> I just made a small change to neutron.conf (adjusted a default quota) and restarted neutron-server, now neutron (but not other services)
> is spewing:
>
> Invalid user token - rejecting request
>
> (quite possibly only from dashboard requests; the CLI seems to work). I've tried restarting keystone (in both wsgi and eventlet modes),
> restarting neutron-server w/ reverted config and flushing/restarting memcached in various combinations.
>
> I don't really see how restarting neutron-server could confuse token validation...
>
>
> On Sun, Jan 12, 2014 at 10:38 AM, Morgan Fainberg <mor...@metacloud.com> wrote:
> > Thanks for confirming this! It also validates my new logic going into
> > icehouse (I might have had some ulterior motives here, or not so
> > ulterior as the case may be).  I'll make sure we resolve the test
> > issues (unrelated to the patch) and get it into the Havana tree so you
> > don't need to maintain it outside of the releases.
> >
> > Cheers,
> > Morgan
> >
> > Sent from my tablet-like-device
> >
> >> On Jan 11, 2014, at 11:01 PM, Jonathan Proulx <j...@jonproulx.com> wrote:
> >>
> >>> On Sat, Jan 11, 2014 at 10:57 PM, Morgan Fainberg <m...@metacloud.com> wrote:
> >>> Sounds good!  Just remember that prior to the fix I posted there,
> >>> for each token in the user's index, it incurred a round-trip to
> >>> memcached to validate the token wasn't expired. This change makes
> >>> it so that there are significantly fewer trips from keystone to memcached.
> >>>
> >>> If this doesn't 100% solve the issue, we should start digging
> >>> further into what is going on, but I am confident this will (at the
> >>> very least) help a reasonable amount.
> >>
> >> You sir are a miracle worker, my hat is off!
> >>
> >> The responsiveness of everything is better than it's ever been, my
> >> users will think this is the best feature of the upgrade.
> >>
> >> For example earlier today I managed to launch 10 VMs in parallel,
> >> eventually, I'd guess on the order of 5-10min. One of my usual
> >> acceptance tests is being able to launch 100 VMs in that time. Just
> >> now I launched 100 in <2min from request until they'd all been
> >> provisioned and were booting. Now there's too many moving pieces and
> >> too few experimental samples to make any publishable claims, but your
> >> patch is the only thing that changed.
> >>
> >> Thanks,
> >> -Jon
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to     : openstack@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

