Re: [pkg-discuss] Code review: sysrepo config cache

Tim Foster Wed, 23 May 2012 22:37:00 -0700

Hi Brock,

On 05/24/12 03:53 PM, Brock Pytlik wrote:

On 05/13/12 22:35, Tim Foster wrote:

https://cr.opensolaris.org/action/browse/pkg/timf/sysrepo-config-cache/sysrepo-config-cache-webrev


Thanks for taking a look.

sysrepo.py
line 317, 365 Nit: use open instead of file (at least I think we settled
on open)

Good point, thanks - I'll switch to open. We seem to use both in thegate, but open is predominant.

line 326: shouldn't this also generate an error? or do we hit this if no
publishers are configured?


You're right, it should report the error, I'll add that.

434,435,437: Why the change from tuples to lists?


The version of simplejson we have always converts tuple to lists:

>>> import simplejson
>>> import StringIO
>>> s = StringIO.StringIO()
>>> o = ["list with tuples", ("foo", "bar")]
>>> simplejson.dump(o, s)
>>> s.seek(0)
>>> s.read()
'["list with tuples", ["foo", "bar"]]'
>>> s.seek(0)
>>> simplejson.load(s)
['list with tuples', ['foo', 'bar']]
>>>

The simplejson.dump() option 'tuple_as_array' comes in version 2.2.0which I think would solve this, but we use 2.1. In this case, switchingto lists is fine I think.

Could you discuss briefly what happens in the situation where a user
adds a (currently) unreachable publisher with --no-refresh? (And perhaps
add a test case for it)

There's basically no change to the current behaviour: the cache filegets deleted, and system repository service will get restarted - ifpkg.sysrepo manages to timeout before the service hits the SMF start (orrefresh) method timeout, then we write a new cache, and continue servingrequests.

That cached configuration may be incomplete (in terms of server-sideredirects being missing because of that unavailable publisher) in whichcase, when the publisher becomes available, we'll get transport errorswhen trying to install packages if they need to go through those redirects.

On the testing side of things, I don't think I see tests for a) an image
with no publishers and no packages installed or b) an image with no
publishers but packages installed. I think it's probably worth just
checking those two corner cases.


Sure, I'll add tests there.

Is it the _follow_redirects bit of sysrepo.py that currently requires
that we have networking available and publishers are reachable? Anything
else?


Right, that's the only part of sysrepo.py that wants networking.

Of course the system repository itself needs networking any time itproxies a request that points to a non-local repository (when all thezones have come up too and are making those requests), but by that time,I would expect networking to be available.

Just to make sure I understand how this works...
When a user adds a publisher or an origin to an existing publisher, that
refreshes/restarts the sysrepo service.


Right (that's what happens in the code currently in the gate too)

When it starts the next time, it
will look at the cached information. realize it's out of date since the
publisher or origin isn't in the cached info, and revert to existing
behavior?

We're being more forceful than that: when publisher information changes,we remove the cache file in pkg.client.image.Image.save_config() toensure the cache gets regenerated (just before we cause asystem-repository service refresh).

We could perhaps have relied on the cache validation code inpkg.sysrepo(1) checking the cache validity, but a total cache rebuild isgoing to be more reliable (in terms of being able to revalidateserver-side redirects are still current)

There are perhaps more efficient ways we could have done this if cacheregeneration was expensive, but it isn't so long as the network is up.

I think though, when a user removes a publisher or origin, after the
restart/refresh, the sysrepo will continue using the previously cached
data? Have I missed something?

Yep, by virtue of completely nuking the cache on publisher configurationchange, we should be ok here.

I have a couple of other concerns, though I'm not sure how major they are...
Is it possible to have the sysrepo use the cache as a fallback if
publishers aren't reachable? That seems like it might be the safer route.

That's the problem in the current gate bits really: given enough originsin our image, and a fixed SMF start/timeout_seconds threshold for thesystem-repository service, any time we're going out to the network tosee if publishers are reachable in sysrepo.py, we risk running into thattimeout.

I'm wary of making changes that dynamically tune thestart/timeout_seconds value, as it could result in a system-repositoryservice that either never comes online, or takes long enough that theother services that depend on it drop to maintenance.

Using the cache seems to be the least worst solution, but I'm open tobetter suggestions :-)

If I'm reading the code right, if a server changes it server-side
redirections, the sysrepo would never pick up on that change and
wouldn't provide the right set of redirects in its configuration,
leading to transport errors. While I think that could happen now, at
least if sysrepo was restarted/refreshed, it would pick up on the new
redirects, with this, it seems like it would take a publisher change by
the user to cause it to pick up the new redirects.

That's correct. I felt that the chances of server-side redirectschanging were low, and decided that the tradeoff of making pkg.sysrepogo faster/depend less on the network during boot was better.

If server-side redirects do change, then yes, we'll see transport errorswhen installing packages through the system repository, so it will beobvious that something is wrong. The fix would be for the GZadministrator to remove the cache, and restart the service, though youhave me wondering whether we try to do this automatically... (though themore I think about it, the less sure I am: it'd mean somehow being ableto pass a message to the system repository that it should rebuild it'scache and restart: a recipe for zone->global-zone DOS attacks I think)

Perhaps we could start up immediately with the cached info then
periodically after revalidate the redirects?

That's an idea, though it would mean the system repository refreshingitself, potentially interrupting package installs. When and how oftendo we check? We could get a bit neurotic here.


I'll work on a few extra tests, but other comments welcome.

        cheers,
                        tim
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Re: [pkg-discuss] Code review: sysrepo config cache

Reply via email to