On 05/26/12 14:00, Danek Duvall wrote:
So it seems pretty clear to me this needs a bit more thought.

Shawn Walker wrote:

So there are going to be a few very ugly failures possible with that
configuration:

   * implicit refreshes and set-publisher operations will fail, since one
     of the origins is unreachable and the implementation here creates a
     unique RepositoryURI object for each URI/proxy pair

   * unreachable uri/proxy pairs for a given origin will cause connection
     timeouts and could significantly delay initial transfers

Seems like we need better handling for unreachable hosts, then, possibly as
a prerequisite for this change.

We only need better support for unreachable hosts if we allow the URI (origin) to be configured multiple times (or to have multiple proxies). Otherwise, we're no worse off than we already are.

There are three related bugs/RFEs open in this area that would possibly need to be fixed: 7140165, 7140168, 7138102.

   * composition can break because the origin is considered the unique
     key, and now, with the potential for a proxy to interfere with the
     output, catalog responses may not be identical, especially if both
     origins are reachable (that is, same URI, different proxy configs)

How would the catalog responses be different?  Different HTTP headers that
could cause caching to be screwy or different payload?  If the latter, then
how would that happen aside from a broken proxy?  Would this be a
significantly different situation than pulling catalogs through the
different proxies now?

Proxies can choose their own caching schemes within reason; that is, they can choose not to cache at all, or they could choose to cache for up to the time specified in the response headers.

As a result, that could lead to different responses from different proxy sources, which would result in a few possible failure cases:

  * catalog retrieval could fail as some parts might be retrieved
    from one proxied source and some from another, causing a mismatch
    error

  * catalog refresh could fail as a cached response may be older than
    what a client has already retrieved; can't remember what happens
    in that scenario
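
To make those two failure cases concrete, here is a minimal sketch. Everything in it is hypothetical (the FetchedPart class, the integer "stamp", the proxy labels); it is not how the pkg client actually tracks catalog versions, only an illustration of a mixed retrieval and of a cached response that is older than what the client already has.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FetchedPart:
    """One catalog part as received by the client (hypothetical model)."""
    name: str           # illustrative part name
    catalog_stamp: int  # assumed version stamp of the catalog the part belongs to
    via_proxy: str      # label of the proxy the part was fetched through


def find_mismatch(parts: Iterable[FetchedPart]) -> Optional[str]:
    """Return a message if the parts belong to different catalog versions."""
    stamps = {p.catalog_stamp for p in parts}
    if len(stamps) > 1:
        return "parts come from %d different catalog versions" % len(stamps)
    return None


def is_stale(client_stamp: int, response_stamp: int) -> bool:
    """True if a (possibly cached) response is older than the client's copy."""
    return response_stamp < client_stamp


# proxy-a passes through a fresh part, proxy-b serves an older cached one.
parts = [
    FetchedPart("catalog.attrs", 20120526, "proxy-a"),
    FetchedPart("catalog.base.C", 20120520, "proxy-b"),
]
mismatch = find_mismatch(parts)
```

With the sample data, find_mismatch() reports a problem because the two parts carry different stamps, and is_stale(20120526, 20120520) flags the second case.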

If we really wanted to support that sort of configuration, we need to be
able to tie origins to a specific nwam location profile so that when
users move between networks, specific origins can be disabled/enabled
automatically.

That would be pretty cool indeed, though not something I would think
necessary for the first pass.

Agreed; I was just saying that if we actually want to support that sort of configuration, that's the "proper" way to support it. Otherwise, a user is always going to have to wait for transport connections to time out for unreachable sources before transfers begin, potentially adding several minutes to download times or causing complete operation failure (depending on the operation being attempted).
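
As a rough back-of-the-envelope sketch of where "several minutes" comes from: the numbers below are assumed for illustration only (they are not claimed to be the client's defaults), and the model assumes each unreachable URI/proxy pair is tried sequentially and retried a fixed number of times before the transport gives up on it.

```python
def worst_case_delay_s(unreachable_pairs: int,
                       connect_timeout_s: float,
                       attempts: int) -> float:
    """Upper bound on time spent waiting on sources that never answer.

    Assumes sequential attempts and that every attempt runs to the
    full connect timeout.
    """
    return unreachable_pairs * connect_timeout_s * attempts


# Assumed values: 2 unreachable pairs, 60 s connect timeout, 4 attempts each.
delay = worst_case_delay_s(unreachable_pairs=2, connect_timeout_s=60, attempts=4)
# delay is 480 seconds, i.e. 8 minutes, before any transfer starts
```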

We would need to make sure that any other metadata associated with origins
isn't added orthogonally to proxies -- that is, you shouldn't have to name
the full tuple of an origin's metadata in order to refer to it.  That
suggests that users might need a way to name the configured origins.  Right
now, that's effectively its proxy, but that's not a good handle going
forward.

But that's one of my primary issues here; to a user, they aren't managing
unique pairs of (origin, proxy).  They're managing origins, and there's a
set of proxies and/or keys/certs they can use for that origin.

Only in our internal implementation is it practical to view them as pairs
and exposing that through the user interface at the CLI level concerns me.

I still haven't looked at the code, so I may be off here.  But I thought
that Tim's change involved multiple RepositoryURI objects, each with its
own configuration, and the output of pkg publisher displayed them like
that, but that the stored configuration was as you describe -- a single URI
string, with multiple possible configurations.  I agree that the UI (at
least the display) isn't the best, but the output of pkg publisher has
been, IMO, underwhelming for some time, so I'm not especially concerned
about that.

I was aware of Tim's approach using multiple RepositoryURI objects here, and that (as you have explained better than I could) also leads to some unfortunate UI implications.

In some ways, listing each origin/proxy combination in the output of 'pkg publisher' may be a preferable UI since it allows us to cleanly display the state of a particular proxy configuration. It also allows us to eventually leverage persistent transport statistics to provide a 'STATUS' value as it specifically applies to a URI/proxy combination.

It also depends on how much other information we want the user to be able to provide/configure at the origin/proxy combo level as opposed to the publisher or origin levels. (For example, does the SSL key/cert need to be per-origin, or per-origin and proxy?)

However, we could choose to instead only show the list of proxies configured and other information in the 'long-form' output shown by 'pkg publisher $name'.

Certainly, I would expect the packagemanager to only list a unique origin once, and then be able to configure the list of proxies in a new dialog using that entry, as opposed to listing all of the origin/proxy combinations individually.

If I understand correctly, Tim's rationale for the multiple RepositoryURI objects was
so that the statistics collection would allow the ones that weren't
available to automatically fall out of contention.  Would it be reasonable
to have a single RepositoryURI object per origin, with multiple possible
configurations, and let the stats collect at the config level, rather than
at the origin level?  That would let users continue to manage by URI.

Yes, it's possible to do that. It would allow users to continue to manage by URI, with the added benefit of not breaking existing API consumers if tailored a certain way.

However, that approach would potentially make it more difficult to handle per-proxy configuration.
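
For illustration, the "one object per origin, stats per configuration" idea discussed above could look roughly like the sketch below. The class names (Origin, ProxyConfig), the fields, and the failure-rate ranking are all hypothetical; this is not the existing RepositoryURI class or the transport's real statistics code, just one way the shape could work.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProxyConfig:
    """One way of reaching an origin (hypothetical)."""
    proxy: Optional[str]  # None means a direct connection
    successes: int = 0
    failures: int = 0

    def failure_rate(self) -> float:
        total = self.successes + self.failures
        return self.failures / total if total else 0.0


@dataclass
class Origin:
    """A single origin, keyed by URI, with several possible configurations."""
    uri: str
    configs: List[ProxyConfig] = field(default_factory=list)

    def ranked_configs(self) -> List[ProxyConfig]:
        # Stats live on each config, so unreachable proxies sink to the
        # end while the user still manages one entry per URI.
        return sorted(self.configs, key=lambda c: c.failure_rate())


origin = Origin("http://pkg.example.com/", [
    ProxyConfig("http://proxy-a.example.com:3128"),
    ProxyConfig(None),
])
origin.configs[0].failures += 3   # proxy-a keeps timing out
origin.configs[1].successes += 2  # direct connection works
best = origin.ranked_configs()[0]
```

Here the direct connection ranks first because proxy-a has only failures recorded. Anything that needs to be set per proxy (keys, certs) would have to hang off ProxyConfig, which is where the extra complexity mentioned above shows up.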

-Shawn
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
