On 04/17/12 02:31 PM, Brock Pytlik wrote:
On 04/09/12 22:02, Tim Foster wrote:
https://cr.opensolaris.org/action/browse/pkg/timf/granular-proxies/

Thanks for the detailed review!  I've got updated bits at:

https://cr.opensolaris.org/action/browse/pkg/timf/granular-proxies-webrev-v1/

comments below.

--proxy option
--------------

We add a --proxy option, where the combination of -g/G/m/M/p and the
--proxy flag now indicate a unique origin or mirror.  Wildcards like
'-G * --proxy foo' will only work if every origin to be removed has
the same proxy value 'foo'.
Does this mean that -G '*' (without the --proxy) will no longer
unconfigure every origin for a publisher? That would be troubling.

Och, I'm wrong (I was flip-flopping over how to do this originally)

-G '*' still works, it just ignores proxy arguments, which itself may need an explicit error message along the lines of "using wildcards with --proxy is not supported" to avoid being ambiguous.


# pkg image-create /tmp/a
# pkg -R /tmp/a set-publisher -g http://ipkg.us.oracle.com/on-extra
on-extra
# pkg -R /tmp/a set-publisher -G http://ipkg.us.oracle.com/on-extra
--proxy http://kura:9091 on-extra

pkg set-publisher: Unknown repository origin
'http://ipkg.us.oracle.com/on-extra/' with proxy 'http://kura:9091'

# pkg -R /tmp/a set-publisher -G http://ipkg.us.oracle.com/on-extra
on-extra
# pkg -R /tmp/a set-publisher -g http://ipkg.us.oracle.com/on-extra
--proxy http://kura:9091 on-extra
# pkg -R /tmp/a publisher
PUBLISHER                             TYPE     STATUS   URI
on-extra                              origin   online
http://ipkg.us.oracle.com/on-extra/

I thought there was going some indication that this uri was being
proxied? I'm uncomfortable with this being the entire output of pkg
publisher as I think it's actively misleading to people if they're
trying to figure out why they can't reach a publisher. I think there may
need to be an additional column (perhaps Proxied?) which has true/false
values in it. Other display suggestions are definitely welcome.

I hoped that would come along with a -o option at some point with 7133972.

Without the -o support, it feels like changing the default to include a new column for what I suspect won't be a hugely common configuration (using a proxy) would be overkill. The long-form "pkg publisher <publisher>" does include the proxy used, which I thought would be enough for now, with the expectation of getting -o soon.

# pkg -R /tmp/a publisher on-extra

             Publisher: on-extra
                 Alias:
            Origin URI: http://ipkg.us.oracle.com/on-extra/
                 Proxy: http://kura:9091
               SSL Key: None
              SSL Cert: None
           Client UUID: a8cd40a8-8227-11e1-8664-d4bed990482b
       Catalog Updated: April  9, 2012 09:38:59 AM
               Enabled: Yes


Zones pkg publisher changes
---------------------------

We get rid of the "proxy://<uri>" resources for the short-form pkg
publisher output, instead using the more explanatory
"<system-repository>" value, but provide the full URI in both the
long-form pkg publisher output, as well as the --format tsv output.

# pkg -R /zones/kahurangi/root publisher
PUBLISHER                             TYPE     STATUS   URI
solaris                  (syspub)     origin   online <system-repository>
punchin                  (syspub)     origin   online <system-repository>
extra                    (syspub)     origin   online <system-repository>
internal-only            (syspub)     origin   online <system-repository>
solarisstudio            (syspub)     origin   online <system-repository>
on-extra                 (syspub)     origin   online <system-repository>
userland                 (syspub)     origin   online
http://localhost:1008/userland/614bdd284706b4be9743fa5021c76a968192c904/

I don't think that changing the output based on whether the format's tsv
or not makes much sense. Perhaps the tsv version should have another
column which has the full uri and the proxy info?

Do you mean, have tsv format use "URI" (which can contain "<system-repository>" values), and add "Actual URI" (or somesuch) and "Proxy"?

While I'm very enthusiastic about the change to the<system-repository>
display, I'm a little concerned about it taking place now, as opposed to
across a larger change boundary.

Well, we don't document the 'proxy://http://foo' format anywhere as far as I know, and as is, passing one of those URLs to any of the pkg(5) tools will result in errors that are at least as bad as trying to do

pkg set-publisher -G '<system-repository>'

That said, if others feel like we ought to wait for a major release for this change, I can do that, but I'd really rather get rid of proxy:// sooner if at all possible.

The rest of the output looked good to me.

[snip]

client.py:
I noticed changes only on set-publisher, do we need to allow --proxy on
commands which can take a temporary repository? ('pkg list -g<mumble>
--proxy' for example?)

Good point - I forgot that. There's an argument to be made that a $http_proxy environment variable works just as well for something that's supposed to be temporary in the first place, but I can add this support if others feel it's worthwhile (it's a relatively large change though, so would like to get consensus on whether we really need this first)

imageconfig.py:
line 627: nit: couldn't you just put a set() around the list comprehension?
line 627: Could we get a comment added about why origins and mirrors
should be stored as sets, when other properties aren't?

There's no reason - I should have stored them as a list and have fixed that now.

How this code changes from the current gate tip, is that coming into this method, we may have repo.origins containing a bunch of RepositoryURI objects which have the same url but different proxy values. We need to ensure those are unique when writing out the "origins" image property so that older pkg(5) bits won't get upset when trying to read one of these images.

596-627: I'm having a little trouble understanding why, given the new
info stored in lines 596-619, why we also need to store the origins and
mirrors separately?

Good question. I'm trying to be backwards compatible here, to allow older clients to still read these images. Yes, there's a risk that older pkg(5) bits could remove one of those origins, causing the origin_info map to become stale, but I felt that was better than causing an image-format bump.

Lines 643-646: can't this be removed given the change to line 631?

Yep, I've fixed that, thanks.

773-774: this comment seems a little awkward. Perhaps:
"For now, the only property tracked on a per-origin/mirror basis is
which proxy (if any) is used to access it."

Sure, I've made that change.

836-863: This could be condensed into a single loop (like you've done
elsewhere with the info_map dictionary for example). Not sure if that
would make things simpler or not.

Yes, good idea, I've done that. Thanks.

publisher.py:
line 213: does it make sense to have this do url_ffix_trailing_slash as
well (or conversely, to strip the /)?

Good point, I'll strip the '/'

line 325: given how url_affix_trailing_slash is implemented, I don't get
why this line is necessary

It's necessary because it copes with input urls such as

http://foo/bar//

ensuring that we canonicalize them properly to simply

htt://foo/bar/

381-385: why are we stripping the slashes here? as a general comment,
I'd think it'd make sense to decide either "uri/proxy's always have
trailing slashes" or "uri/proxy's never have trailing slashes" and store
and display them in the canonical form rather than stripping and adding
the slashes in various locations. Maybe there's a real need to have
both, which the code suggests might be the case, but even looking at the
code, I have no idea WHY it would be the case.

Groan, yes I agree - I'm a little lost as to our exact policy on trailing slashes too. In this method, I wanted whatever we use to be canonical because they're keys to allow us to look up RepositoryURI objects in a dictionary, and I wanted to choose one representation and stick with it.

More generally, perhaps it's worth revisiting our approach to trailing slashes and demystifying the code a little, but perhaps not this changeset?

engine.py:
224: it might be overkill, but I think I can imagine a way to write this
such that the values it uses for a key are automatically updated when
the RepoUri's key method is changed. If nothing else, there probably
needs to be a comment on the key method mentioning that this needs to
change if it's changed. Let me know if you think the automatic way would
be worth doing, and we can discuss offlist how we might achieve it.

Yep, I thought about that, but wasn't sure whether the effort was worth it. I've got a comment in pkg.client.transport.engine.__cleanup_requests, but should have a comment pointer going in the other direction. I've added that.

repo.py:
2242-2244: two questions: First, does it still make sense to get the
scheme from the repostats object and not the repouri object? if so,
could we add a comment as to why? Second, assuming it does, could we
collapse 2242-2244 into one line of code since origin_url and url_tuple
don't seem to be used anywhere else anymore?
2265-2267: same comments as 2242-2244

Yep, I can't see any reason why not to use repouri.scheme instead.

stats.py:
62-76: This looks an awful lot like the code from
publisher.__get_runtime_proxy. I'm wondering if instead of making
runtime_proxy a property, we should make get_runtime_proxy a function
(since you can't set a runtime_proxy) and allowing it to take a
parameter indicating whether expand_proxy_envvars is used or not (which
I think is what would introduce the cleartext passwords).

Good idea - I've added a function to pkg.misc that does this. I'd like to keep the <RepositoryURI>.runtime_proxy as a property though, since we do still want to cache the result there, but it does now use that common function in pkg.misc.

config.py:
626: a comment about what we care about here would be nice. Because of
setting v[item] = "" above, I initially thought that's why we're
checking for "" in self.allowed, but after looking closely at the code,
I think that's not the case. What we're checking for is whether the list
itself is allowed to be empty, not whether any of the items within the
list are allowed to be empty. Is that correct?

Not quite, v[item] = "" is us encoding 'None' values in the dictionary as an empty string, otherwise we end up with the property:

foo = [ {"this": "bar", "other": None} ]

getting serialised as:

foo = [ {"this": "bar", "other": "None"} ]

line 626 is really checking whether we allow an empty list.

I actually had the _is_allowed() method incorrect here - it wasn't properly checking the dictionary contents against the allowed ones (though we actually don't use that functionality in the code)

795: do we not care if the proxy is a valid uri at least? (or that it
starts with http?)

Nope, libcurl allows proxies that don't specify a scheme.

misc.py:
1396-1397: completely optional alternative way of expressing this:
if not all(item.startswith("$") for item in user_passwd):

Sounds good - thanks!

sysrepo.py
278-284: If I'm reading this correctly, what this means is that if you
have a repository accessed through a proxy where you need to provide a
userid and password, you can't use that repository from the global zone
if you have any non-global zones, is that right?

Yes.

If that's the case, I'm
wondering about the wisdom of allowing the use of those kinds of proxies
at all. I agree it's probably the right decision to go with what you
have, but I am concerned that someone might develop and test their
publisher configuration on a machine that has no zones, then be
unpleasantly surprised when they go to roll out their setup on
production machines and find that it's broken because the production
machines have zones.

Yes, I understand - we've had similar limitations with p5p archives and temporary publishers and zones in the past. Our choice is to:

   a) invest in Apache trying to write code to have it support
      authenticating proxies
   b) document the behaviour (which the code checks for) and move on

I'm not inclined to do a) unless it was something that was a highly requested feature, and so far, this isn't - I'd rather do b) and trust that administrators would have test systems that closely match the environment that they use in production.

If there's consensus that this isn't a good approach, I'll remove support for authenticating proxies altogether.

t_misc.py:
112-116: Might it be worth testing what happens to something like this:
http://$user$password@host";? I'm not sure what should happen, but it
seems like we should define that behavior and test for it since it's an
easy typo to make.

Sure - I'm taking it as invalid, and it'll get passed through without expansion (yes $user could expand to "timf:" and make it valid, but I'd rather be strict in what we accept) I've added a test for this behaviour.

t_pkg_install:
4964-4967: Just something that popped into my head when I was looking at
this. Would it make sense to have self.sysrepo() return some of that
info as part of its return value?

That might be useful alright, but it's not much extra work to leave it the way it is..

5011: Why start the proxy again here?

In this part of the test, we've configured a proxied origin as well as an unproxied one. The proxy isn't configured to access either URI though, so I want to be sure that the transport allows us to access the URI (trying and failing to connect via the proxy, and falling back to the direct route)

test_19_granular_proxy: Just to put a bow on things, it might be worth
testing that once the unproxied origin for test2 is removed, that an
install of a package from test2 fails.

Sure, I've added that.

t_sysrepo:
Can you verify for me, at least using a manual test, that changing the
proxy on an origin induces a system-repository refresh?

Sure - it's actually impossible to change a proxy on an origin, instead you must remove an origin then add a proxied origin. This is by virtue (sin?) of there only being a single --proxy flag that applies to both -g and -G :-/ but yes, the system repository does get refreshed.


I think I've addressed all of your comments here, but Shawn had said he was interested in having a look at the changes too, so I'll hang on for his comments, and any follow-on questions you or others may have.

        cheers,
                        tim
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Reply via email to