I agree. To answer Kishore's question, then, the peer download is not modeled as a PinotFS download. Instead, it is a backup to PinotFS download.
Whether http/https fetchers should be modelled via PinotFS is another question, and we don't need to answer that to unblock Ting. -Subbu On 2020/06/11 23:18:41, Xiaotian Jiang <[email protected]> wrote: > Let me elaborate more on my proposal: > > The problem we are trying to solve here is when deep storage has lower > availability than Pinot, we should be able to leverage the segment > replications on Pinot server to increase the availability of segment > download. > Here we should first try to download from deep storage, and only if it > fails, we use peer download as the backup. > It is possible that the deep storage URL does not exist because the > segment upload failed, in which case we should download from the > peers. > Notice that in both cases, peer download should be modeled as a backup > plan instead of the main way of downloading segments. > Also, downloading segments from a server requires the server to > compress and send the segments, which is not a cheap operation, and > can cause performance impact on the server. So we should only use peer > download if there is no other option, i.e. as a backup. > If we model peer download as a backup plan, we should not overload the > existing downloadUri to trigger it. Instead, we should try to download > with the downloadUri first, and only if it fails (including the case > where downloadUri does not exist), we try to download from peer. > > On Thu, Jun 11, 2020 at 3:26 PM TING CHEN <[email protected]> wrote: > > > > Our current design > > <https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion> > > does not add a new pinotFS for *peer://*. This is a very interesting > > question though. The only operations our design uses today for peer "FS" is > > essentially *copyToLocal()* in the pinotFS interface. Our design basically > > has a class and a few supporting methods > > <https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion#BypassingdeepstorerequirementforRealtimesegmentcompletion-EnablebesteffortsegmentuploadinSplitSegmentCommiteranddownloadsegmentfrompeerservers.> > > implementing the above method. That is why we do not add a full-fledged > > pinotFS subclass for this design. > > > > > > On Thu, Jun 11, 2020 at 3:00 PM kishore g <[email protected]> wrote: > > > > > Also, will peer:// have an implementation of pinotFS ? > > > > > > On Thu, Jun 11, 2020 at 2:58 PM kishore g <[email protected]> wrote: > > > > > > > +1 peer. > > > > > > > > unrelated to this - do we support multiple URI's? > > > > > > > > On Thu, Jun 11, 2020 at 2:51 PM Subbu Subramaniam <[email protected]> > > > > wrote: > > > > > > > >> Hey Ting, > > > >> > > > >> I like the URI in metadata as "peer:///segmentName". This way, the URI > > > >> remains parsable and we can use the scheme to check for a segment > > > fetcher, > > > >> > > > >> thanks > > > >> > > > >> -Subbu > > > >> > > > >> On 2020/06/10 01:09:25, TING CHEN <[email protected]> wrote: > > > >> > As part of the deep store by-passing > > > >> > < > > > >> > > > https://cwiki.apache.org/confluence/display/PINOT/By-passing+deep-store+requirement+for+Realtime+segment+completion > > > >> > > > > >> > project, a server is allowed to download segments from peer Pinot > > > >> servers > > > >> > during Low Level Consumer (LLC) instead of deep segment stores. To > > > >> enable > > > >> > this feature, we plan to add a new URI format for the field > > > >> > *segment.realtime.download.url > > > >> > *in LLCRealtimeSegmentZKMetadata. > > > >> > > > > >> > The new URI format serves the purpose of instructing a Pinot server > > > >> > to > > > >> find > > > >> > and download the segment from a peer server. Controller writes it to > > > >> Helix > > > >> > in case of segment upload failure or no deep store configured at all. > > > We > > > >> > proposed the following format options and want to hear your feedback: > > > >> > > > > >> > 1. peer:///segmentName; (my preference) > > > >> > 2. simply an empty string *''* > > > >> > > > > >> > Both are in essence specially markers to indicate that the segment is > > > >> not > > > >> > found in deep store and servers have to download them from peer > > > servers. > > > >> > (1) has the benefit of better readability than (2) for debugging > > > >> purposes. > > > >> > > > > >> > Please let me know what you think. > > > >> > > > > >> > Ting Chen > > > >> > > > > >> > > > >> --------------------------------------------------------------------- > > > >> To unsubscribe, e-mail: [email protected] > > > >> For additional commands, e-mail: [email protected] > > > >> > > > >> > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
