instead of modifying http segment fetcher can we introduce a new FS
implementation? SegmentFetcher should purely work on an FS implementation.

On Fri, Jun 12, 2020 at 10:03 AM Subbu Subramaniam <[email protected]>
wrote:

> OK, I think I see jackie's point. An empty URI simply means that we could
> not get it to store in deep store reliably. Architecturally, that makes
> sense. If a SegmentFetcher wants to fetch the segment and finds empty URI,
> then it needs to look at servers with ExrernalView being ONLINE for the
> segment and fetch it from there.
>
> So, let us do this.
> (1) The segment completion protoocol carries a URI as proposed by Ting.
> (We should not be able to distinguish between an invalid request that may
> be missing a URI vs a request that truly says that the URI is only with
> peers at this point in time).
> (2) Controller commits the metadata with an empty URI if it sees a "peer"
> scheme in the committing segment URI
> (3) Segment downloader assumes that an empty URI == download from peer.
>
> The con side of this is that we need to check everywhere to make sure that
> we are handling null URI in the metadata. It is hard to catch this in
> review, so i would ask Ting to do the due diligence here, and also have a
> couple of reviewers just in case.
>
> Regarding segment download, I have a proposal:
> A. Update the SegmentFetcher interface to introduce a new method that
> takes a list of URIs instead of one URI. Default impl for this method can
> be to call the older API with any of the URIs
> B. Modify the http segment fetcher to try a random URL if a list is given
> (or, cycle through, or whatever).
>
> This makes Ting;s code simpler on the caller side
>
> -Subbu
>
> On 2020/06/12 05:26:41, TING CHEN <[email protected]> wrote:
> > Thanks for your comments. The download url zk Metadata format change is
> > part of the deepstore by-pass proposal. The design aims to improve on
> > status quo for two cases w.r.t deep store config:
> >
> >    1. A deep store configured for Pinot cluster
> >    2. No deep store configured for Pinot -- not for heavy users of Pinot
> >    but still valuable for many small/medium size setup as seen from slack
> >    questions.
> >
> > For case 2, because there is no deep store URI, there is a need for a new
> > download URI format in zkMetaData.
> > For case 1, our design proposal of controller and server interaction in
> > case of deep store outage is similar to your proposal. In particular,
> >
> >    1. When the selected commit server fails to upload a segment *S* to
> the
> >    deep store, it will pass a special uri *U* to the controller.
> >    2. The controller will then save *U *in the down url of a segment
> >    metadata.
> >    3. When another server needs to download *S, *if it sees *U *it
> >    directly downloads from peer servers. Otherwise it first tries to
> download
> >    from deep store and if fails trying to download from peer servers.
> >
> > Note that the above steps work for both case 1 and case 2.
> >
> > To me, your proposal simplifies the logic above in step 3 assuming deep
> > store is configured. I.e., everytime the server needs to download *S, *it
> > always downloads from deepstore first if it fails, downloads *S* from
> peer.
> > My proposal adds a check from the uri format but overall it works with
> both
> > cases regardless if the deep store is configured or not.
> >
> > As for segment download cost from peer servers, I agree it is expensive
> and
> > should be avoided for high traffic clusters. But it is not the main focus
> > of this discussion.
> >
> >
> > On Thu, Jun 11, 2020 at 4:18 PM Xiaotian Jiang <[email protected]>
> wrote:
> >
> > > Let me elaborate more on my proposal:
> > >
> > > The problem we are trying to solve here is when deep storage has lower
> > > availability than Pinot, we should be able to leverage the segment
> > > replications on Pinot server to increase the availability of segment
> > > download.
> > > Here we should first try to download from deep storage, and only if it
> > > fails, we use peer download as the backup.
> > > It is possible that the deep storage URL does not exist because the
> > > segment upload failed, in which case we should download from the
> > > peers.
> > > Notice that in both cases, peer download should be modeled as a backup
> > > plan instead of the main way of downloading segments.
> > > Also, downloading segments from a server requires the server to
> > > compress and send the segments, which is not a cheap operation, and
> > > can cause performance impact on the server. So we should only use peer
> > > download if there is no other option, i.e. as a backup.
> > > If we model peer download as a backup plan, we should not overload the
> > > existing downloadUri to trigger it. Instead, we should try to download
> > > with the downloadUri first, and only if it fails (including the case
> > > where downloadUri does not exist), we try to download from peer.
> > >
> > > On Thu, Jun 11, 2020 at 3:26 PM TING CHEN <[email protected]>
> > > wrote:
> > > >
> > > > Our current design
> > > > <
> > >
> https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion
> > > >
> > > > does not add a new pinotFS for *peer://*. This is a very interesting
> > > > question though. The only operations our design uses today for peer
> "FS"
> > > is
> > > > essentially *copyToLocal()* in the pinotFS interface. Our design
> > > basically
> > > > has a class and a few supporting methods
> > > > <
> > >
> https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion#BypassingdeepstorerequirementforRealtimesegmentcompletion-EnablebesteffortsegmentuploadinSplitSegmentCommiteranddownloadsegmentfrompeerservers
> > > .>
> > > > implementing the above method. That is why we do not add a
> full-fledged
> > > > pinotFS subclass for this design.
> > > >
> > > >
> > > > On Thu, Jun 11, 2020 at 3:00 PM kishore g <[email protected]>
> wrote:
> > > >
> > > > > Also, will peer:// have an implementation of pinotFS ?
> > > > >
> > > > > On Thu, Jun 11, 2020 at 2:58 PM kishore g <[email protected]>
> wrote:
> > > > >
> > > > > > +1 peer.
> > > > > >
> > > > > > unrelated to this - do we support multiple URI's?
> > > > > >
> > > > > > On Thu, Jun 11, 2020 at 2:51 PM Subbu Subramaniam <
> > > [email protected]>
> > > > > > wrote:
> > > > > >
> > > > > >> Hey Ting,
> > > > > >>
> > > > > >> I like the URI in metadata as "peer:///segmentName". This way,
> the
> > > URI
> > > > > >> remains parsable and we can use the scheme to check for a
> segment
> > > > > fetcher,
> > > > > >>
> > > > > >> thanks
> > > > > >>
> > > > > >> -Subbu
> > > > > >>
> > > > > >> On 2020/06/10 01:09:25, TING CHEN <[email protected]>
> > > wrote:
> > > > > >> > As part of the deep store by-passing
> > > > > >> > <
> > > > > >>
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion
> > > > > >> >
> > > > > >> > project, a server is allowed to download segments from peer
> Pinot
> > > > > >> servers
> > > > > >> > during Low Level Consumer (LLC) instead of deep segment
> stores. To
> > > > > >> enable
> > > > > >> > this feature, we plan to add a new URI format for the field
> > > > > >> > *segment.realtime.download.url
> > > > > >> > *in LLCRealtimeSegmentZKMetadata.
> > > > > >> >
> > > > > >> > The new URI format serves the purpose of instructing a Pinot
> > > server to
> > > > > >> find
> > > > > >> > and download the segment from a peer server. Controller
> writes it
> > > to
> > > > > >> Helix
> > > > > >> > in case of segment upload failure or no deep store configured
> at
> > > all.
> > > > > We
> > > > > >> > proposed the following format options and want to hear your
> > > feedback:
> > > > > >> >
> > > > > >> >    1. peer:///segmentName; (my preference)
> > > > > >> >    2. simply an empty string *''*
> > > > > >> >
> > > > > >> > Both are in essence specially markers to indicate that the
> > > segment is
> > > > > >> not
> > > > > >> > found in deep store and servers have to download them from
> peer
> > > > > servers.
> > > > > >> > (1) has the benefit of better readability than (2) for
> debugging
> > > > > >> purposes.
> > > > > >> >
> > > > > >> > Please let me know what you think.
> > > > > >> >
> > > > > >> > Ting Chen
> > > > > >> >
> > > > > >>
> > > > > >>
> > > ---------------------------------------------------------------------
> > > > > >> To unsubscribe, e-mail: [email protected]
> > > > > >> For additional commands, e-mail: [email protected]
> > > > > >>
> > > > > >>
> > > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> > >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

Reply via email to