Thanks for the good responses Danek & Shawn,

On 05/27/12 09:00 AM, Danek Duvall wrote:
So it seems pretty clear to me this needs a bit more thought.

Shawn Walker wrote:

So there are going to be a few very ugly failures possible with that
configuration:

   * implicit refreshes and set-publisher operations will fail since one
     of the origins is unreachable; because the implementation here is
     creating unique RepositoryURI objects for each URI / Proxy pair

   * unreachable uri/proxy pairs for a given origin will cause connection
     timeouts and could significantly delay initial transfers

Seems like we need better handling for unreachable hosts, then, possibly as
a prerequisite for this change.

Yes indeed we do fail as you suggest :-(

To what extent is this composition or redundancy? We currently require every configured resource to be available, when in fact any one of them might be, and we've no means to indicate which we prefer.

Here's how we currently react given an unreachable proxy:

[pasted as a quote because thunderbird sucks]
# pkg refresh --full extra
URL                                       Proxy           Good Err Conn Speed   Size     Used CSpeed   Qual
http://ipkg.us.oracle.com/opensolaris/ext http://enoproxy 0    1   0    0 B/s   0.00 B   True 0.000000 3127
http://ipkg.us.oracle.com/opensolaris/ext -               5    0   3    24 kB/s 58.73 kB True 0.170273 31895
pkg: 0/1 catalogs successfully updated:

Framework error: code: 5 reason: Couldn't resolve proxy 'enoproxy'
URL: 'http://ipkg.us.oracle.com/opensolaris/extra/versions/0/'

which is bad.

   * composition can break because origin is considered the unique key
     and now with the potential for a proxy to interfere with the output
     catalog responses may not be identical, especially if both origins
     are reachable (that is, same URI, different proxy configs)

How would the catalog responses be different?

I guess my example above illustrates an extreme case of composition breaking: the proxied resource is returning no catalog response at all and the client decides that's enough to give up.

We would need to make sure that any other metadata associated with origins
isn't added orthogonally to proxies -- that is, you shouldn't have to name
the full tuple of an origin's metadata in order to refer to it.  That
suggests that users might need a way to name the configured origins.  Right
now, that's effectively its proxy, but that's not a good handle going
forward.

But that's one of my primary issues here; to a user, they aren't managing
unique pairs of (origin, proxy).  They're managing origins, and there's a
set of proxies and/or keys/certs they can use for that origin.

Only in our internal implementation is it practical to view them as pairs
and exposing that through the user interface at the CLI level concerns me.

Ok. So I think this boils down to the CLI being clumsy because it's trying to set per-origin settings in a per-publisher subcommand. In the future, we want to add SSL keys/certs into the mix and we'll have the same problem then (right now, keys and certs are per-publisher).

Throughout my work on this wad, I have been thinking that each combination of uri+proxy+ssl key/cert identifies an origin, but it sounds like this isn't the model we want.

So, is this any better:

 'pkg set-origin --add-proxy http://bar http://foo'
 'pkg set-origin --remove-proxy http://bar http://foo'
 'pkg set-origin --add-ssl \
      --key /foo/ssl-key \
      --cert /foo/ssl-cert https://foo'
 'pkg list-origin http://foo'

if it is, then I'll shelve these bits for now, because I don't believe I'll be able to get this completed by U1 and my time would be better spent fixing bugs elsewhere I think.

I still haven't looked at the code, so I may be off here.  But I thought
that Tim's change involved multiple RepositoryURI objects, each with its
own configuration, and the output of pkg publisher displayed them like
that, but that the stored configuration was as you describe -- a single URI
string, with multiple possible configurations.  I agree that the UI (at
least the display) isn't the best, but the output of pkg publisher has
been, IMO, underwhelming for some time, so I'm not especially concerned
about that.

Right, the stored configuration is a string representing the URI, with multiple possible configurations.

Inside the client, that's represented by a publisher with multiple RepositoryURI objects, each identified by a (uri, proxy) pair and each configured as a separate transport endpoint.

Those transport endpoints (pkg.client.transport.repo.*Repo) map directly to RepositoryURI objects at present.

To change that, we need to modify __gen_repo(..) in pkg.client.transport.transport.Transport and bits of RepoCache too in order to track different configurations properly.

If I understand correctly, Tim's rationale for the multiple RepositoryURI objects was
so that the statistics collection would allow the ones that weren't
available to automatically fall out of contention.

That was the plan, yes.

Would it be reasonable
to have a single RepositoryURI object per origin, with multiple possible
configurations, and let the stats collect at the config level, rather than
at the origin level?  That would let users continue to manage by URI.

That would work, though users would need a means to manage those configurations per-URI (see above for suggestion of 'set-origin'), which probably means introducing a new UI.
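
To make that model concrete, here is a rough sketch of one object per origin URI with several proxy configurations, and statistics kept per configuration. Everything in it is hypothetical: Origin, ConfigStats and the quality formula are illustration only, not the pkg.client.publisher or transport stats API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ConfigStats:
    """Per-configuration counters (hypothetical, not the real transport stats)."""
    good: int = 0
    errors: int = 0

    def quality(self) -> float:
        # Fraction of successful requests; untried configurations score 1.0
        # so that they still get a first attempt.
        total = self.good + self.errors
        return 1.0 if total == 0 else self.good / total


@dataclass
class Origin:
    """One object per origin URI, holding several proxy configurations."""
    uri: str
    # Maps proxy URL (None means a direct connection) to its statistics.
    configs: Dict[Optional[str], ConfigStats] = field(default_factory=dict)

    def add_proxy(self, proxy: Optional[str]) -> None:
        self.configs.setdefault(proxy, ConfigStats())

    def record(self, proxy: Optional[str], ok: bool) -> None:
        stats = self.configs[proxy]
        if ok:
            stats.good += 1
        else:
            stats.errors += 1

    def ranked_proxies(self) -> List[Optional[str]]:
        # Highest quality first; the sort is stable, so ties keep insertion order.
        return sorted(self.configs, key=lambda p: -self.configs[p].quality())


origin = Origin("http://foo")
origin.add_proxy("http://bar")
origin.add_proxy(None)
origin.record("http://bar", ok=False)  # unreachable proxy
origin.record(None, ok=True)           # direct connection works
ranked = origin.ranked_proxies()       # direct connection now ranks first
```

With this shape users still name the origin by its URI alone, and an unreachable proxy falls out of contention at the configuration level.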

I'm also concerned that someone might not fix this in the packagemanager
before U1 ships; the changes made to the pkg.client.publisher API here are
incompatible.  That is, as soon as you add proxied sources, the current
implementation breaks existing API consumers.

Yeah ... the packagemanager can't break because of this change.

Fair enough, so as far as I can tell, the decision we need to take is:

 1. don't allow multiple proxies per-URI, but introduce a precedent for set-publisher to set origin-level properties, which we might break for the next release

 2. allow them, and live with the breakage

 3. don't introduce this wad yet

At this stage, I'm leaning heavily towards 3. Given my question about what it means to have multiple origins configured (that is, when are we doing composition, and when are we allowing redundancy), it might be useful to be able to do:

 pkg set-origin --allow-failure --proxy http://bar http://foo

if we wanted the transport to not throw errors when http://bar was unreachable, but simply move on to the next available URI configured for that publisher.
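
As a rough sketch of the behaviour I have in mind, and not how the transport works today: endpoints marked as allowed to fail are skipped on error, while a failure on any other endpoint is raised as it is now. EndpointError, fetch_first_available and fake_fetch are all made-up names for illustration.

```python
from typing import Callable, List, Optional, Tuple


class EndpointError(Exception):
    """Hypothetical stand-in for a transport failure such as an unresolvable proxy."""


# (uri, proxy, allow_failure); a proxy of None means a direct connection.
Endpoint = Tuple[str, Optional[str], bool]


def fetch_first_available(endpoints: List[Endpoint],
                          fetch: Callable[[str, Optional[str]], str]) -> str:
    """Try each endpoint in order and return the first successful response."""
    skipped = []
    for uri, proxy, allow_failure in endpoints:
        try:
            return fetch(uri, proxy)
        except EndpointError as err:
            if not allow_failure:
                raise  # Strict endpoint: surface the error to the caller.
            skipped.append(f"{uri} via {proxy}: {err}")
    raise EndpointError("no endpoint reachable: " + "; ".join(skipped))


def fake_fetch(uri: str, proxy: Optional[str]) -> str:
    # Simulated transport in which the proxy http://bar is unreachable.
    if proxy == "http://bar":
        raise EndpointError("couldn't resolve proxy")
    return f"catalog from {uri} ({proxy or 'direct'})"


endpoints = [("http://foo", "http://bar", True), ("http://foo", None, False)]
result = fetch_first_available(endpoints, fake_fetch)
```

Here the proxied endpoint is skipped and the direct one answers; without the allow-failure flag on the first entry, the same call would raise straight away.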

        cheers,
                        tim
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss