Re: Follow up to discussion regarding use : in paths on Windows (MESOS-9109)
It seems we have the following issues w.r.t path generation: 1. Path separators are disallowed: This is general to all systems, so we'll need to put a platform-independent check. But since no one's doing this we can put this into the backlog. 2. Other invalid characters on different platforms: For now let's just focus on Windows since Un*x doesn't have any restriction other than /, but since we're already working on this issue, how about resolve all of 0x00-0x1F 0x7F " * / : < > ? | at once? This can be a Windows-specific now, as proposed by Andy. 3. Other path constraints, e.g., invalid sequences of valid characters. This is platform-dependent but the problem is there for both Un*x and Windows. We can resolve this along with 1 later. As long as the way we resolve 2 (i.e., the encoding/decoding process) won't introduce any compatibility problem in the future, I'm good at only fixing 2 for now and follow up with a clean up later. To be conservative, if we're sure that there's no existing framework using % in its ID, does it make sense to add a check for now to ensure that? On Tue, Sep 4, 2018 at 2:12 PM Andrew Schwartzmeyer < and...@schwartzmeyer.com> wrote: > I think your approach would be fairly sound. That is, to change the > logic to read the IDs from the info file instead of the paths. But I > also think we can punt this for now (as I do not think a task ID like > 'Hello%3AWorld' is plausibly in use right now), and implement a fix for > colons now that would remain compatible. > > If we add encode/decode logic for colons on Windows, we do not introduce > backward compatibility issues on other platforms (as we'd constrain the > change to Windows), and in the future, we can safely replace the decode > logic with your approach. That is to say, we implement the encoding as > sparingly as possible, but implement it now, because it's kind of > required, and we implement the decoding only as a stop-gap until we > replace this logic with reading from the info file instead. If we later > find another character in use that also needs to be encoded, we can then > abstract the single encoding into a per-platform encoding set. > > Does this seem reasonable? > > Thanks, > > Andy > > P.S. Sorry this took a while to get back to, I was out last week. > > On 08/23/2018 3:34 pm, Chun-Hung Hsiao wrote: > > I'm a bit concerned about the recovery logic and backward > > compatibility: > > The changes we're making shouldn't affect existing users, > > and we should try hard to avoid any future backward compatibility > > problem. > > > > Say if there is already some custom framework using task ID > > 'Hello%3AWorld', > > then if we blindly decode the task path during recovery, we will get > > the > > wrong ID 'Hello:World'. > > On the other hand, if we don't decode the task path during recovery, > > then later on during checkpointing for the same task, > > we shouldn't blindly encode the task ID, because it might create a > > different path, > > and we'll need to introduce some migration code to avoid such > > duplication. > > > > Fortunately, we do checkpoint the executor IDs and task IDs in the info > > files under the meta dir. > > So I'm considering the following design to minimize the backward > > compatibility issue we might have: > > During recovery, we don't decode the recovered task path, > > but get the executor/task ID from the info file instead of relying on > > parsing the executor/task path. > > When checkpointing, we only encode executor/task IDs if they contain > > reserved characters. > > The set of reserved characters could be defined as a platform-dependent > > variable, > > similar to what we have done for `PATH_SEPARATOR`. > > > > The above design would look a bit more complicated then just blindly > > applying percent encoding > > to when constructing checkpoint paths, but it doesn't require extra > > checkpoint migration logic, > > and would keep the exact same behavior we have now for "normal" > > executor/task IDs. > > > > What did you guys think? Please feel free to raise any concern :) > > And we don't need to implement the whole thing for now. > > For example, we could start with just dealing with colons, > > and extend the implementation later on, > > as long as the partial solution we're going to have right now doesn't > > create future tech debts! > > > > Best, > > Chun-Hung > > > > On Thu, Aug 23, 2018 at 1:42 PM Greg Mann wrote: > > > >> Thanks Andy! Responses inlined below. > >> > >> > >> > >>> No: As the only character we've run into a problem with is `:` > >>> (MESOS-9109), it might not be worth it to generalize this to solve a > >>> bunch > >>> of problems that we haven't encountered. > >>> > >>> > >> It's true that I'm not aware of other scenarios where > >> filesystem-disallowed characters in task/executor IDs have caused > >> issues > >> for users, and this issue has existed for a long time. However, when > >> feasible I would like to fix issues that we're aware
Re: [VOTE] Release Apache Mesos 1.7.0 (rc3)
+1 (binding). Internal CI succeeded. The binary deb/rpm packages for this RC can be found here: http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.7.0-rc3 On Fri, Sep 14, 2018 at 4:18 AM Alex Rukletsov wrote: > +1 (binding) > > Mesosphere's internal CI run with the aforementioned tag. Observed 4 flaky > tests, 3 are known: > https://issues.apache.org/jira/browse/MESOS-5048 > https://issues.apache.org/jira/browse/MESOS-8260 > https://issues.apache.org/jira/browse/MESOS-8951 > > One has been introduced as part of adding GC to nested containers > (MESOS-7947), which is disabled in the release: > https://issues.apache.org/jira/browse/MESOS-9217 > > > On Tue, Sep 11, 2018 at 8:09 PM, Gastón Kleiman > wrote: > >> Hi all, >> >> Please vote on releasing the following candidate as Apache Mesos 1.7.0. >> >> >> 1.7.0 includes the following: >> >> >> * Performance Improvements: >> * Master `/state` endpoint: ~130% throughput improvement through >> RapidJSON >> * Allocator: Improved allocator cycle significantly >> * Agent `/containers` endpoint: Fixed a performance issue >> * Agent container launch / destroy throughput is significantly improved >> * Containerization: >> * **Experimental** Supported docker image tarball fetching from HDFS >> * Added new `cgroups/all` and `linux/devices` isolators >> * Added metrics for `network/cni` isolator and docker pull latency >> * Windows: >> * Added support to libprocess for the Windows Thread Pool API >> * Multi-Framework Workloads: >> * **Experimental** Added per-framework metrics to the master >> * A new weighted random sorter was added as an alternative to the DRF >> sorter >> >> The CHANGELOG for the release is available at: >> >> https://gitbox.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.7.0-rc3 >> >> >> >> The candidate for Mesos 1.7.0 release is available at: >> https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc3/mesos-1.7.0.tar.gz >> >> The tag to be voted on is 1.7.0-rc3: >> https://gitbox.apache.org/repos/asf?p=mesos.git;a=commit;h=1.7.0-rc3 >> >> The SHA512 checksum of the tarball can be found at: >> >> https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc3/mesos-1.7.0.tar.gz.sha512 >> >> The signature of the tarball can be found at: >> >> https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc3/mesos-1.7.0.tar.gz.asc >> >> The PGP key used to sign the release is here: >> https://dist.apache.org/repos/dist/release/mesos/KEYS >> >> The JAR is in a staging repository here: >> https://repository.apache.org/content/repositories/orgapachemesos-1234 >> >> Please vote on releasing this package as Apache Mesos 1.7.0! >> >> The vote is open until Fri Sep 14 11:06:30 PDT 2018 and passes if a >> majority of at least 3 +1 PMC votes are cast. >> >> [ ] +1 Release this package as Apache Mesos 1.7.0 >> [ ] -1 Do not release this package because ... >> >> Thanks, >> >> Chun-Hung & Gastón >> > >
Re: [VOTE] Release Apache Mesos 1.7.0 (rc3)
+1 (binding) Mesosphere's internal CI run with the aforementioned tag. Observed 4 flaky tests, 3 are known: https://issues.apache.org/jira/browse/MESOS-5048 https://issues.apache.org/jira/browse/MESOS-8260 https://issues.apache.org/jira/browse/MESOS-8951 One has been introduced as part of adding GC to nested containers (MESOS-7947), which is disabled in the release: https://issues.apache.org/jira/browse/MESOS-9217 On Tue, Sep 11, 2018 at 8:09 PM, Gastón Kleiman wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.7.0. > > > 1.7.0 includes the following: > > > * Performance Improvements: > * Master `/state` endpoint: ~130% throughput improvement through > RapidJSON > * Allocator: Improved allocator cycle significantly > * Agent `/containers` endpoint: Fixed a performance issue > * Agent container launch / destroy throughput is significantly improved > * Containerization: > * **Experimental** Supported docker image tarball fetching from HDFS > * Added new `cgroups/all` and `linux/devices` isolators > * Added metrics for `network/cni` isolator and docker pull latency > * Windows: > * Added support to libprocess for the Windows Thread Pool API > * Multi-Framework Workloads: > * **Experimental** Added per-framework metrics to the master > * A new weighted random sorter was added as an alternative to the DRF > sorter > > The CHANGELOG for the release is available at: > https://gitbox.apache.org/repos/asf?p=mesos.git;a=blob_plain > ;f=CHANGELOG;hb=1.7.0-rc3 > > > > The candidate for Mesos 1.7.0 release is available at: > https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc3/mesos-1.7.0.tar.gz > > The tag to be voted on is 1.7.0-rc3: > https://gitbox.apache.org/repos/asf?p=mesos.git;a=commit;h=1.7.0-rc3 > > The SHA512 checksum of the tarball can be found at: > https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc3/mesos > -1.7.0.tar.gz.sha512 > > The signature of the tarball can be found at: > https://dist.apache.org/repos/dist/dev/mesos/1.7.0-rc3/mesos > -1.7.0.tar.gz.asc > > The PGP key used to sign the release is here: > https://dist.apache.org/repos/dist/release/mesos/KEYS > > The JAR is in a staging repository here: > https://repository.apache.org/content/repositories/orgapachemesos-1234 > > Please vote on releasing this package as Apache Mesos 1.7.0! > > The vote is open until Fri Sep 14 11:06:30 PDT 2018 and passes if a > majority of at least 3 +1 PMC votes are cast. > > [ ] +1 Release this package as Apache Mesos 1.7.0 > [ ] -1 Do not release this package because ... > > Thanks, > > Chun-Hung & Gastón >