Hi Michel, currently Ozone does not support short-circuit reads. It is on the roadmap, but as we transition to a world of segregated storage and compute, it is not the most important item, afaik.
It is a FileSystem-related feature, and in object stores it does not apply at all. As we continue our FS interface developments it will certainly become important, but probably later rather than sooner.

On the other hand, as with any placement policy or any other improvement, you should of course feel free to add this functionality on your own, and the community will be happy to help and review :)

Pifta

Michel Sumbul <[email protected]> wrote (on Mon, Jul 20, 2020, 13:53):

> Thanks Pifta that's really clear!
>
> If you don't mind a last question on data locality, does Ozone support
> short-circuit like HDFS?
> If not, is it something on the roadmap? Short circuit provides a significant
> performance boost in the HDFS world, do you think it will be the same for
> Ozone?
>
> Thanks,
> Michel
>
> On Tue, Jul 14, 2020 at 22:02, István Fajth <[email protected]> wrote:
>
> > Hi Michel,
> >
> > at the moment the placement policy is an interesting topic.
> > In Ozone placement is considered in terms of containers, and not blocks.
> > Blocks are sub-container structures.
> > The container has a lifecycle: while it is open, the pipeline attached
> > to it defines the placement of data. If there are racks and we are
> > talking about replication factor 3 pipelines, the pipeline placement puts
> > two container replicas into one rack and one into another rack.
> > This is a wired behaviour, and pipelines are balanced between DataNodes.
> > If there are no racks defined, or just one rack is defined, pipeline
> > placement falls back to random placement that considers space available
> > on DataNodes and favors nodes with more available space.
> >
> > When a container gets closed, the replicas are managed by the
> > ReplicationManager, which has a configurable policy. There are three
> > policies at the moment: random, available space aware random, and rack
> > aware policy.
> > The closed containers are moved by the ReplicationManager as needed if
> > replication violates the policy, and replicas are created or removed when
> > under- or over-replication occurs.
> >
> > This is because Ozone aims to balance the write I/O by balancing the
> > pipelines. Read I/O is balanced by the random placement within the rules
> > defined by the policy.
> >
> > Ozone needs to harmonize the pipeline placement and the container
> > placement in the future, as we want to add more policies for sure, but at
> > the moment this is how placement works.
> >
> > Regarding balancing, at the moment we do not have balancing logic built
> > in, and we do not have a balancer tool like HDFS; it is part of the
> > roadmap. However, you can bet any balancing logic has to consider the
> > placement policy configured for closed containers at least.
> >
> > If you need to have a policy like the one you mentioned, the closed
> > container policy is pluggable, so you can write your own or even
> > contribute it to the project if you want.
> > But at the moment you need to consider the load this creates: if the
> > custom policy is violated by the pipeline placement, then at container
> > closure containers have to be moved to fit the closed container
> > placement policy.
> >
> > Pifta
> >
> > Michel Sumbul <[email protected]> wrote (on Tue, Jul 14, 2020, 15:38):
> >
> > > Hi Pifta,
> > >
> > > Thanks for your reply.
> > > That's good news! Does Ozone also support other placement policies, like
> > > one replica in each of 3 different racks? That would be super useful
> > > from an operational point of view. It would be possible to put an entire
> > > rack in maintenance (for an update or other task) and be sure that the 2
> > > other replicas are in 2 different racks still up and running, without
> > > losing 2 replicas.
> > >
> > > Is the placement policy also enforced during rebalancing, like in HDFS?
> > >
> > > Thanks,
> > > Michel
> > >
> > > On Thu, Jul 9, 2020 at 13:05, István Fajth <[email protected]> wrote:
> > >
> > > > Hi Michel,
> > > >
> > > > yes, Ozone has topology support (currently 3 levels are supported:
> > > > root, rack, node) to specify cluster topology similarly as in HDFS.
> > > > With replication factor 3 it works similarly as in HDFS and ensures
> > > > that container replicas reside in 2 racks: 2 in one rack, and 1 in
> > > > another rack.
> > > > Also the FileSystem APIs (o3fs:// and ofs://) implement the methods
> > > > required to provide the locality information to the clients similarly
> > > > as in HDFS, so YARN can take advantage of this information, and can
> > > > bring compute to the data as with HDFS.
> > > >
> > > > It is worth noting that there are not too many clusters currently
> > > > using these features, but if any issues arise we are there to react,
> > > > and there are some plans as well to harden the system further. There
> > > > are a couple of items already planned after the soon to be released
> > > > 0.6.0; you can check them in this JIRA (HDDS-3722)
> > > > <https://issues.apache.org/jira/browse/HDDS-3722>.
> > > >
> > > > If you have any questions feel free to ask further :)
> > > > Pifta
> > > >
> > > > Michel Sumbul <[email protected]> wrote (on Thu, Jul 9, 2020, 12:57):
> > > >
> > > > > Hi guys,
> > > > >
> > > > > First thanks for your work on this project, it looks really great as
> > > > > the next evolution of HDFS (if I can say that :-) )
> > > > >
> > > > > I saw in multiple slideshows on the web that Ozone will support data
> > > > > locality like HDFS.
> > > > > What's the status of that? Is it already implemented?
> > > > >
> > > > > Thanks,
> > > > > Michel
> > > > >
> > > >
> > > >
> > > > --
> > > > Pifta
> > > >
> > >
> >
>
