Re: Follow up to discussion regarding use : in paths on Windows (MESOS-9109)

Chun-Hung Hsiao Fri, 14 Sep 2018 14:38:52 -0700

It seems we have the following issues w.r.t path generation:

1. Path separators are disallowed:
    This is general to all systems, so we'll need to put a
platform-independent check. But since no one's doing this we can put this
into the backlog.
2. Other invalid characters on different platforms:
    For now let's just focus on Windows since Un*x doesn't have any
restriction other than /,
    but since we're already working on this issue, how about resolve all
of 0x00-0x1F 0x7F " * / : < > ? | at once?
    This can be a Windows-specific now, as proposed by Andy.
3. Other path constraints, e.g., invalid sequences of valid characters.
    This is platform-dependent but the problem is there for both Un*x and
Windows. We can resolve this along with 1 later.


As long as the way we resolve 2 (i.e., the encoding/decoding process) won't
introduce any compatibility problem in the future,
I'm good at only fixing 2 for now and follow up with a clean up later.
To be conservative, if we're sure that there's no existing framework using
% in its ID,
does it make sense to add a check for now to ensure that?

On Tue, Sep 4, 2018 at 2:12 PM Andrew Schwartzmeyer <
and...@schwartzmeyer.com> wrote:

> I think your approach would be fairly sound. That is, to change the
> logic to read the IDs from the info file instead of the paths. But I
> also think we can punt this for now (as I do not think a task ID like
> 'Hello%3AWorld' is plausibly in use right now), and implement a fix for
> colons now that would remain compatible.
>
> If we add encode/decode logic for colons on Windows, we do not introduce
> backward compatibility issues on other platforms (as we'd constrain the
> change to Windows), and in the future, we can safely replace the decode
> logic with your approach. That is to say, we implement the encoding as
> sparingly as possible, but implement it now, because it's kind of
> required, and we implement the decoding only as a stop-gap until we
> replace this logic with reading from the info file instead. If we later
> find another character in use that also needs to be encoded, we can then
> abstract the single encoding into a per-platform encoding set.
>
> Does this seem reasonable?
>
> Thanks,
>
> Andy
>
> P.S. Sorry this took a while to get back to, I was out last week.
>
> On 08/23/2018 3:34 pm, Chun-Hung Hsiao wrote:
> > I'm a bit concerned about the recovery logic and backward
> > compatibility:
> > The changes we're making shouldn't affect existing users,
> > and we should try hard to avoid any future backward compatibility
> > problem.
> >
> > Say if there is already some custom framework using task ID
> > 'Hello%3AWorld',
> > then if we blindly decode the task path during recovery, we will get
> > the
> > wrong ID 'Hello:World'.
> > On the other hand, if we don't decode the task path during recovery,
> > then later on during checkpointing for the same task,
> > we shouldn't blindly encode the task ID, because it might create a
> > different path,
> > and we'll need to introduce some migration code to avoid such
> > duplication.
> >
> > Fortunately, we do checkpoint the executor IDs and task IDs in the info
> > files under the meta dir.
> > So I'm considering the following design to minimize the backward
> > compatibility issue we might have:
> > During recovery, we don't decode the recovered task path,
> > but get the executor/task ID from the info file instead of relying on
> > parsing the executor/task path.
> > When checkpointing, we only encode executor/task IDs if they contain
> > reserved characters.
> > The set of reserved characters could be defined as a platform-dependent
> > variable,
> > similar to what we have done for `PATH_SEPARATOR`.
> >
> > The above design would look a bit more complicated then just blindly
> > applying percent encoding
> > to when constructing checkpoint paths, but it doesn't require extra
> > checkpoint migration logic,
> > and would keep the exact same behavior we have now for "normal"
> > executor/task IDs.
> >
> > What did you guys think? Please feel free to raise any concern :)
> > And we don't need to implement the whole thing for now.
> > For example, we could start with just dealing with colons,
> > and extend the implementation later on,
> > as long as the partial solution we're going to have right now doesn't
> > create future tech debts!
> >
> > Best,
> > Chun-Hung
> >
> > On Thu, Aug 23, 2018 at 1:42 PM Greg Mann <g...@mesosphere.io> wrote:
> >
> >> Thanks Andy! Responses inlined below.
> >>
> >>
> >>
> >>> No: As the only character we've run into a problem with is `:`
> >>> (MESOS-9109), it might not be worth it to generalize this to solve a
> >>> bunch
> >>> of problems that we haven't encountered.
> >>>
> >>>
> >> It's true that I'm not aware of other scenarios where
> >> filesystem-disallowed characters in task/executor IDs have caused
> >> issues
> >> for users, and this issue has existed for a long time. However, when
> >> feasible I would like to fix issues that we're aware of before they
> >> cause
> >> problems for users, rather than after. I would suggest that since we
> >> have
> >> one compelling case that we need to address now, it's worth
> >> formulating an
> >> approach for the general case, so that we can be sure any current work
> >> doesn't get in our way later on.
> >>
> >>
> >>> I'm somewhat comfortable doing so only for Windows, as we don't
> >>> really
> >>> need to worry about the recovery scenario; but very uncomfortable
> >>> about
> >>> doing so for Linux etc., for precisely that reason.
> >>>
> >>> So expanding this is definitely up for debate; but we must fix the
> >>> bug
> >>> with `:`.
> >>>
> >>>
> >> Indeed, addressing the general case may prove to be much more complex
> >> - I
> >> can certainly identify with this situation, where a fix for a smaller
> >> issue
> >> turns into a big project :)
> >> It may turn out to be possible to implement a scoped-down solution for
> >> the
> >> colon case now, and extend it later on. I think it would be good if we
> >> could at least get an idea of how we want to handle the general case
> >> now,
> >> so that any short-term solutions can be a constructive step toward the
> >> long-term.
> >>
> >> Cheers,
> >> G
> >>
>

Re: Follow up to discussion regarding use : in paths on Windows (MESOS-9109)

Reply via email to