It seems we have the following issues w.r.t path generation: 1. Path separators are disallowed: This is general to all systems, so we'll need to put a platform-independent check. But since no one's doing this we can put this into the backlog. 2. Other invalid characters on different platforms: For now let's just focus on Windows since Un*x doesn't have any restriction other than /, but since we're already working on this issue, how about resolve all of 0x00-0x1F 0x7F " * / : < > ? | at once? This can be a Windows-specific now, as proposed by Andy. 3. Other path constraints, e.g., invalid sequences of valid characters. This is platform-dependent but the problem is there for both Un*x and Windows. We can resolve this along with 1 later.
As long as the way we resolve 2 (i.e., the encoding/decoding process) won't introduce any compatibility problem in the future, I'm good at only fixing 2 for now and follow up with a clean up later. To be conservative, if we're sure that there's no existing framework using % in its ID, does it make sense to add a check for now to ensure that? On Tue, Sep 4, 2018 at 2:12 PM Andrew Schwartzmeyer < and...@schwartzmeyer.com> wrote: > I think your approach would be fairly sound. That is, to change the > logic to read the IDs from the info file instead of the paths. But I > also think we can punt this for now (as I do not think a task ID like > 'Hello%3AWorld' is plausibly in use right now), and implement a fix for > colons now that would remain compatible. > > If we add encode/decode logic for colons on Windows, we do not introduce > backward compatibility issues on other platforms (as we'd constrain the > change to Windows), and in the future, we can safely replace the decode > logic with your approach. That is to say, we implement the encoding as > sparingly as possible, but implement it now, because it's kind of > required, and we implement the decoding only as a stop-gap until we > replace this logic with reading from the info file instead. If we later > find another character in use that also needs to be encoded, we can then > abstract the single encoding into a per-platform encoding set. > > Does this seem reasonable? > > Thanks, > > Andy > > P.S. Sorry this took a while to get back to, I was out last week. > > On 08/23/2018 3:34 pm, Chun-Hung Hsiao wrote: > > I'm a bit concerned about the recovery logic and backward > > compatibility: > > The changes we're making shouldn't affect existing users, > > and we should try hard to avoid any future backward compatibility > > problem. > > > > Say if there is already some custom framework using task ID > > 'Hello%3AWorld', > > then if we blindly decode the task path during recovery, we will get > > the > > wrong ID 'Hello:World'. > > On the other hand, if we don't decode the task path during recovery, > > then later on during checkpointing for the same task, > > we shouldn't blindly encode the task ID, because it might create a > > different path, > > and we'll need to introduce some migration code to avoid such > > duplication. > > > > Fortunately, we do checkpoint the executor IDs and task IDs in the info > > files under the meta dir. > > So I'm considering the following design to minimize the backward > > compatibility issue we might have: > > During recovery, we don't decode the recovered task path, > > but get the executor/task ID from the info file instead of relying on > > parsing the executor/task path. > > When checkpointing, we only encode executor/task IDs if they contain > > reserved characters. > > The set of reserved characters could be defined as a platform-dependent > > variable, > > similar to what we have done for `PATH_SEPARATOR`. > > > > The above design would look a bit more complicated then just blindly > > applying percent encoding > > to when constructing checkpoint paths, but it doesn't require extra > > checkpoint migration logic, > > and would keep the exact same behavior we have now for "normal" > > executor/task IDs. > > > > What did you guys think? Please feel free to raise any concern :) > > And we don't need to implement the whole thing for now. > > For example, we could start with just dealing with colons, > > and extend the implementation later on, > > as long as the partial solution we're going to have right now doesn't > > create future tech debts! > > > > Best, > > Chun-Hung > > > > On Thu, Aug 23, 2018 at 1:42 PM Greg Mann <g...@mesosphere.io> wrote: > > > >> Thanks Andy! Responses inlined below. > >> > >> > >> > >>> No: As the only character we've run into a problem with is `:` > >>> (MESOS-9109), it might not be worth it to generalize this to solve a > >>> bunch > >>> of problems that we haven't encountered. > >>> > >>> > >> It's true that I'm not aware of other scenarios where > >> filesystem-disallowed characters in task/executor IDs have caused > >> issues > >> for users, and this issue has existed for a long time. However, when > >> feasible I would like to fix issues that we're aware of before they > >> cause > >> problems for users, rather than after. I would suggest that since we > >> have > >> one compelling case that we need to address now, it's worth > >> formulating an > >> approach for the general case, so that we can be sure any current work > >> doesn't get in our way later on. > >> > >> > >>> I'm somewhat comfortable doing so only for Windows, as we don't > >>> really > >>> need to worry about the recovery scenario; but very uncomfortable > >>> about > >>> doing so for Linux etc., for precisely that reason. > >>> > >>> So expanding this is definitely up for debate; but we must fix the > >>> bug > >>> with `:`. > >>> > >>> > >> Indeed, addressing the general case may prove to be much more complex > >> - I > >> can certainly identify with this situation, where a fix for a smaller > >> issue > >> turns into a big project :) > >> It may turn out to be possible to implement a scoped-down solution for > >> the > >> colon case now, and extend it later on. I think it would be good if we > >> could at least get an idea of how we want to handle the general case > >> now, > >> so that any short-term solutions can be a constructive step toward the > >> long-term. > >> > >> Cheers, > >> G > >> >