Potentially you could, but I think you would have to update the partitioning code and, correspondingly, the RMContainerAllocator (YARN MapReduce) code. Today all map tasks are requested at one shared priority and all reduce tasks at another (map priority < reduce priority). What you can do is change the map task priorities based on the partition (file) rather than using a single value. When you assign priorities to the container requests, make sure the requests for the corresponding map tasks are ordered apartment > room > villa, and so on.
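To make the ordering idea concrete, here is a minimal sketch in plain Java with no Hadoop dependencies. It only shows how a user-supplied rank per partition (1 = run first, as in Mark's example) would reorder the partitions compared with ordering by size. `PartitionPriorityDemo`, `Partition`, `orderBySize`, and `orderByRank` are hypothetical names made up for illustration; none of them exist in Hadoop, and the file sizes are invented. Turning these ranks into priorities on the actual map container requests inside RMContainerAllocator is the part you would still have to write.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionPriorityDemo {

    /** Hypothetical stand-in for an input split: a file name and its size in bytes. */
    public static final class Partition {
        final String name;
        final long sizeBytes;

        public Partition(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Size-based order as described in the thread: largest partition first. */
    public static List<String> orderBySize(List<Partition> parts) {
        List<Partition> sorted = new ArrayList<>(parts);
        sorted.sort(Comparator.comparingLong((Partition p) -> p.sizeBytes).reversed());
        return names(sorted);
    }

    /**
     * Order by user-assigned rank (1 = run first). Partitions without a rank
     * go last, and ties fall back to largest-first.
     */
    public static List<String> orderByRank(List<Partition> parts, Map<String, Integer> rank) {
        List<Partition> sorted = new ArrayList<>(parts);
        sorted.sort(Comparator
                .comparingInt((Partition p) -> rank.getOrDefault(p.name, Integer.MAX_VALUE))
                .thenComparing(Comparator.comparingLong((Partition p) -> p.sizeBytes).reversed()));
        return names(sorted);
    }

    private static List<String> names(List<Partition> parts) {
        List<String> out = new ArrayList<>();
        for (Partition p : parts) {
            out.add(p.name);
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sizes; only their relative order matters (Villa > Apartment > Room).
        List<Partition> houses = new ArrayList<>();
        houses.add(new Partition("Villa", 300L));
        houses.add(new Partition("Apartment", 200L));
        houses.add(new Partition("Room", 100L));

        // Ranks from Mark's example: Apartment=1, Room=2, Villa=3.
        Map<String, Integer> rank = new HashMap<>();
        rank.put("Apartment", 1);
        rank.put("Room", 2);
        rank.put("Villa", 3);

        System.out.println("by size: " + orderBySize(houses));       // [Villa, Apartment, Room]
        System.out.println("by rank: " + orderByRank(houses, rank)); // [Apartment, Room, Villa]
    }
}
```

Note that this only fixes the order in which requests would be issued or prioritized. It says nothing about the order in which containers are actually granted.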
However, there are a few things you should note, and I have a few questions for you.

1) I don't see why you want to do this. For your job to succeed, all of the map tasks have to finish anyway. Why do you want this ordering? Are there any benefits?
2) Even if you submit all the requests with the specified priorities, you are not guaranteed to get the containers in the same order. Most of these requests are for specific host machines (node managers), and we don't know in advance whether sufficient resources will be available there.

Thanks,
Omkar Joshi
*Hortonworks Inc.*
<http://www.hortonworks.com>

On Wed, Sep 11, 2013 at 4:08 PM, Mark Olimpiati <markq2...@gmail.com> wrote:

> Hi Vinod, I had the node assignment at first, but in my second email I
> explained how I want to change the order of data partition execution. The
> default is to run tasks based on the *size* of the partition assigned to
> them. Now I want to run tasks such that a specific order of partitions is
> executed.
>
> E.g., first assume the input is directory Houses/ with files {Villa,
> Apartment, Room} such that file "Villa" is larger in size than "Apartment",
> which is larger than "Room".
>
> The default Hadoop would run:
> map1 --> Villa
> map2 --> Apartment
> map3 --> Room
>
> I want to assign priorities to the *data partitions* such that
> Apartment=1, Room=2, Villa=3; then the scheduler will run the following in
> this order:
> map1 --> Apartment
> map2 --> Room
> map3 --> Villa
>
> My question is: is that possible? Notice this is regardless of the assigned
> node.
> Thank you,
> Mark
>
>
> On Wed, Sep 11, 2013 at 10:45 AM, Vinod Kumar Vavilapalli <
> vino...@apache.org> wrote:
>
>>
>> I assume you are talking about MapReduce. And 1.x release or 2.x?
>>
>> In either of the releases, this cannot be done directly.
>>
>> In 1.x, the framework doesn't expose a feature like this as it is a
>> shared service, and if enough jobs flock to a node, it will lead to
>> utilization and failure handling issues.
>>
>> In Hadoop 2 YARN, the platform does expose this functionality. But
>> MapReduce framework doesn't yet expose this functionality to the end users.
>>
>> What exactly is your use case? Why are some nodes of higher priority than
>> others?
>>
>> Thanks,
>> +Vinod Kumar Vavilapalli
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>> On Sep 11, 2013, at 10:09 AM, Mark Olimpiati wrote:
>>
>> Thanks for replying Rev, but the link is talking about reducers which
>> seems to be like a similar case but what if I assigned priorities to the
>> data partitions (eg. partition B=1, partition C=2, partition A=3,...) such
>> that first map task is assigned partition B to run first. Then second map
>> is given partition C, .. etc. This is instead of assigning based on
>> partition size. Is that possible?
>>
>> Thanks,
>> Mark
>>
>>
>> On Mon, Sep 9, 2013 at 11:17 AM, Ravi Prakash <ravi...@ymail.com> wrote:
>>
>>> http://lucene.472066.n3.nabble.com/Assigning-reduce-tasks-to-specific-nodes-td4022832.html
>>>
>>> ------------------------------
>>> *From:* Mark Olimpiati <markq2...@gmail.com>
>>> *To:* user@hadoop.apache.org
>>> *Sent:* Friday, September 6, 2013 1:47 PM
>>> *Subject:* assign tasks to specific nodes
>>>
>>> Hi guys,
>>>
>>> I'm wondering if there is a way for me to assign tasks to specific
>>> machines or at least assign priorities to the tasks to be executed in that
>>> order. Any suggestions?
>>>
>>> Thanks,
>>> Mark
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited.
>> If you have received this communication in error, please contact the sender
>> immediately and delete it from your system. Thank You.