> On Jan. 23, 2015, 4:16 p.m., Adam B wrote: > > Changing protobufs is tricky when considering backwards-compatibility. > > Looks like include/mesos/fetcher/fetcher.proto is part of the public API, > > it's only used by containerizers, but not just the mesos/docker > > containerizers, but also any external containerizers will need to update to > > stay in sync. Should we be concerned about this? Deimos might be > > deprecated, but I believe @tarnfeld still has an external containerizer > > implementation. > > Timothy Chen wrote: > Hi Adam, FetcherInfo is not released in any version yet, that's why I > left Bernd change whatever he wants.
BTW, there is another incompatible change in a later patch. The URIs field below is only here temporarily to achieve consistency in this intermediate step. > On Jan. 23, 2015, 4:16 p.m., Adam B wrote: > > src/slave/containerizer/fetcher.cpp, line 253 > > <https://reviews.apache.org/r/30036/diff/3/?file=827799#file827799line253> > > > > Do we also need to check for empty-string user here, or valid path > > name, or valid user? Or is that validation guaranteed to happen > > earlier/later on? > > What if I try to launch a task as user "foo|bar?baz", will it get > > caught before the fetcher, or will the fetcher/chown gracefully > > fail/recover (replace with `_`s), or will the fetcher break and cause more > > confusing errors down the line? This has the same behavior as earlier Mesos: - Empty user is allowed. No chown occurs then. - In this context, for cache segregation purposes, "empty user" is one distinct user, with separate caching from "all other" users. - Invalid users have always made chown attempts fail and continue to do so. This happens after downloading. No change in that regard. - This now means there could be items from invalid users in the cache and never get used. Cache eviction will eventually take care of this. - We could file a ticket to check for valid user early. That would be new functionality, also for previous Mesos versions. > On Jan. 23, 2015, 4:16 p.m., Adam B wrote: > > src/slave/flags.hpp, line 89 > > <https://reviews.apache.org/r/30036/diff/3/?file=827802#file827802line89> > > > > Per slaveId? Does my "slave" get a new fetcher cache if it has to > > register as a new slaveId? If it's not per slave ID, then different slaves on the same host would share the same cache on disk. This would require cooperation and trust between them, neither of which is necessary with the current approach. A shared cache could pay off if you ran multiple slaves on the same host in production. That does not seem probable. We might later make the Fetcher and FetcherProcess classes into an extra daemon that serves multiple slaves, handles slave recovery more skillfully, etc. (I still prefer this to the third alternative, which is having the ephemeral mesos-fetcher program do all the cache synchronization and eviction work. See some discussion about this in MESOS-336.) > On Jan. 23, 2015, 4:16 p.m., Adam B wrote: > > include/mesos/fetcher/fetcher.proto, lines 31-38 > > <https://reviews.apache.org/r/30036/diff/3/?file=827793#file827793line31> > > > > Whoa. This looks like a drastic, incompatible protobuf change. I > > understand that it's just a protobuf that serializes into an environment > > variable, and the slave/containerizer and fetcher should be always be > > updated in sync, but it still seems counter to the protobuf philosophy. > > What if your fetcher is older/newer than your version of Mesos somehow? > > According to traditional protobuf updates, the optional fields can be > > removed, and new optional fields can be added, but anything required must > > remain, and fields must keep the same field-id and a compatible datatype. > > This means the proper upgrade for this message would look like: > > ``` > > message FetcherInfo { > > required CommandInfo command_info = 1; // required is forever. > > required string work_directory = 2; // Must stay id 2. > > optional string user = 3; > > optional string frameworks_home = 4; > > optional string hadoop_home = 5; > > // New fields: > > optional string cache_directory = 6; // Could specify [default = > > "/tmp/mesos/fetch"] if you want to guarantee it has a value. Cannot be > > required, since older versions won't know about it. > > repeated CommandInfo.URI uris = 7; // If non-empty, you could ignore > > command_info; if empty, assume an older sender and read from command_info. > > } > > ``` > > Or the (simpler) option is to define a new Protobuf message named > > something like "FetcherFlags" or "FetcherEnvironment" and just start using > > it instead. Maybe even rename the environment variable so you don't have to > > check parsing errors to see which format it's in. You'll have to support > > both protobufs/env_vars for a release, but then we can deprecate the old > > protobuf/env. > > > > Does this concern anybody else? Maybe I'm being too extreme and we can > > break compatibility for internal-only protobufs more liberally. These protobufs are indeed regarded as internal-only and they will only settle down after the whole series of patches for the fetcher cache. But we shall keep them backwards-compatible from then on. - Bernd ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30036/#review69502 ----------------------------------------------------------- On Jan. 25, 2015, 3:27 a.m., Bernd Mathiske wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/30036/ > ----------------------------------------------------------- > > (Updated Jan. 25, 2015, 3:27 a.m.) > > > Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and > Timothy Chen. > > > Bugs: MESO-2069 > https://issues.apache.org/jira/browse/MESO-2069 > > > Repository: mesos-git > > > Description > ------- > > Extends the CommandInfo::URI protobuf with a boolean "caching" field that > will later cause fetcher cache actions. Also introduces the notion of a cache > directory to the fetcher info protobuf. And then propagates these additions > throughout the rest of the code base where applicable. This includes passing > the slave ID all the way down to the place where the cache dir name is > constructed. > > > Diffs > ----- > > include/mesos/fetcher/fetcher.proto > facb87b92bf3194516f636dcc348e136af537721 > include/mesos/mesos.proto 26003fada2e4c4be0e9ebc4663f7ebab858cc48d > src/launcher/fetcher.cpp fed0105946da579a38357a30e7ae56e646e05b89 > src/slave/containerizer/docker.hpp b7bf54ac65d6c61622e485ac253513eaac2e4f88 > src/slave/containerizer/docker.cpp 5f4b4ce49a9523e4743e5c79da4050e6f9e29ed7 > src/slave/containerizer/fetcher.hpp > 1db0eaf002c8d0eaf4e0391858e61e0912b35829 > src/slave/containerizer/fetcher.cpp > d290f95251def3952c5ee34f600e1d71467f6293 > src/slave/containerizer/mesos/containerizer.hpp > 802988c90ac872b0cefa5e28f06e6fec98e8d032 > src/slave/containerizer/mesos/containerizer.cpp > d712278428889ebdfd598537690138329d8464f0 > src/slave/flags.hpp a3c5c68a553b1c88ce6d5177e625079f7cdb2e5f > src/tests/docker_containerizer_tests.cpp > 2105ae2c410f01e7e0d10241d5c00df143fd3439 > src/tests/fetcher_tests.cpp 8c0b0757eb388f1684d8b94393983f1844a769a7 > > Diff: https://reviews.apache.org/r/30036/diff/ > > > Testing > ------- > > make check. > > > Thanks, > > Bernd Mathiske > >