> So when the bsp job is in progress, if free slots come into picture, it
is too late for the currently running bsp jobs to consider these new slots.

It's clear that, whether due to the limitations of the BSP model or of the
Hama framework, the free slots that become available after a job has been
launched can't be used by it. For the same reason, I think the tasks in
Hadoop will tend to have better data locality than those in Hama.

It would be good to evaluate this when multiple jobs run in a cluster and
resources free up as other jobs get completed.

> The hama job designer would already have set the optimum number of tasks
for his jobs. This rule even applies to hadoop job designer.

In Hadoop the user sets the number of reducers, while the number of mappers
is not explicitly set by the user; it is determined by the number of
InputSplits. Also, the degree of parallelization is determined by the
limitations of the Hadoop framework, the h/w resources and how much of the
cluster capacity has been assigned.
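For example, in the standard Hadoop MapReduce API only the reduce count is
fixed by the user (this is a sketch; the job name and input path here are
made-up placeholders):

```java
Job job = new Job(conf, "example-job");
job.setNumReduceTasks(16);   // reducer count is set explicitly by the user
// The map task count is NOT set here: it equals the number of InputSplits
// computed by the InputFormat (roughly input size / HDFS block size).
FileInputFormat.addInputPath(job, new Path("/input"));
```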

For TeraSort (with an HDFS block size of 128 MB) around 8000 map tasks need
to be spawned. According to the Yahoo TeraSort document (
sortbenchmark.org/YahooHadoop.pdf) the maximum tasks spawned are around
1500-1600 map and reduce tasks.
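The ~8000 figure is just the input size divided by the block size, since
Hadoop spawns one map task per InputSplit (1 TB being the standard TeraSort
input size):

```java
public class TeraSortTaskEstimate {
    // One map task is spawned per HDFS block (= one InputSplit) of input.
    static long mapTasks(long inputBytes, long blockBytes) {
        return inputBytes / blockBytes;
    }

    public static void main(String[] args) {
        // 1 TB of input with a 128 MB block size -> 8192 map tasks
        System.out.println(mapTasks(1L << 40, 128L << 20)); // prints 8192
    }
}
```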

> I feel optimizing algorithms on both frameworks is an iterative process.

Couldn't agree more. Since I am new to BSP and Hama, I am trying to figure
out how it works and why some decisions were made. Again, as I mentioned
earlier, I don't want to get everything fixed in the next release, but would
like to get a thought process and a discussion going on to make Hama better.

On Fri, Apr 6, 2012 at 2:32 AM, Suraj Menon <[email protected]> wrote:

> First, I would like to explain a BSP task lifecycle. In its first
> superstep, a set of bsp tasks run in parallel. They are frequently called
> peers. The count of peers can be specified by the hama job designer or
> be decided by the hama job scheduler based on the number of input splits or
> the availability of free slots. In the first superstep, all the peers
> typically would be reading from HDFS, or from a socket stream, or making up
> their own inputs for the next superstep. The outputs generated in the first
> superstep would then be given to the intended peers. Every peer then enters
> the sync barrier. This gives a global synchronization state for all peers,
> where each one of them can be sure that the destination peers received the
> messages sent to them. In the subsequent supersteps, every peer reads the
> messages that were sent to it by other peers in the previous superstep.
> This repeats till the maximum allowed superstep count specified by the job
> designer is reached or till the desired results are computed.
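(To make sure I understand the lifecycle described above, here is a rough
sketch against the Hama BSP API as I read it; the class name, the key/value
types, and the message payload are my own made-up example, not from any
actual job:)

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hama.bsp.BSP;
import org.apache.hama.bsp.BSPPeer;
import org.apache.hama.bsp.sync.SyncException;

// Sketch of one superstep cycle: produce output for intended peers,
// enter the sync barrier, then consume received messages.
public class LifecycleSketchBSP extends
    BSP<LongWritable, NullWritable, LongWritable, NullWritable, LongWritable> {

  @Override
  public void bsp(
      BSPPeer<LongWritable, NullWritable, LongWritable, NullWritable, LongWritable> peer)
      throws IOException, SyncException, InterruptedException {
    // First superstep: each peer makes up its own output and sends it
    // to the intended peers.
    for (String other : peer.getAllPeerNames()) {
      peer.send(other, new LongWritable(42));
    }
    peer.sync(); // barrier: after this, all sent messages are delivered

    // Subsequent superstep: read the messages sent to this peer.
    LongWritable msg;
    while ((msg = peer.getCurrentMessage()) != null) {
      // ... compute, possibly send more messages, and sync again,
      // repeating until the superstep limit or the desired result ...
    }
  }
}
```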
>
> > But, the # number of free slots change constantly (as other
> > jobs gets completed and started). The free slots could be assigned to the
> > bsp job in question at a later stage also, similar to an mr job.
>
> >Adding additional tasks might make the job complete faster.
>
> So this is where we have to understand that the lifecycle of a mapper or a
> reducer is equivalent to a single superstep of a bsp task. Unlike a mapper
> or reducer task, a bsp task stays on the same machine until the job
> completes.
>
> So when the bsp job is in progress, if free slots come into picture, it is
> too late for the currently running bsp jobs to consider these new slots.
> The hama job designer would already have set the optimum number of tasks
> for his jobs. This rule even applies to hadoop job designer. We can't have
> 1,000,000 mappers and 1,000,000 reducers as explained on their wiki. A
> careful designer would specify his input partitions and the partitioner for
> reduce such that it is optimum and won't exhaust the cluster resources.
> Having more bsp tasks in parallel need not make things faster. I feel
> optimizing algorithms on both frameworks is an iterative process. In Hama,
> new free slots would be considered if a job is scheduled after the new
> slots come into picture. If some of the jobs don't get scheduled because of
> non-availability of slots, this is a good indication that either the
> current tasks are taking too many slots or your cluster is not big enough
> to carry out all the jobs in parallel.
>
> Hadoop has some very good functionalities implemented by some very good
> engineers and we do try to get our insights from them if not reuse their
> code ;). But the requirements change in certain circumstances when we have
> to implement BSP model. We may have to tweak a lot with HDFS too in future.
>
> To add to this, I remember Chiahung mentioning the CIEL project, which
> schedules tasks dynamically as needed, and I thought it would be cool to
> have this feature in hama, especially for real-time computation. Vaguely
> thinking, we can start tasks with input from checkpointed data. We don't
> have this feature in our immediate roadmap. We are trying to get some
> important features out first.
>
>
> -Suraj
>
