Hi Sylvain,

> > Also, we modified tuned COLL to implement interconnect-and-topology-specific
> > bcast/allgather/alltoall/allreduce algorithm. These algorithm
> > implementations also bypass PML/BML/BTL to eliminate protocol and software
> > overhead.
> This seems perfectly valid to me. The current coll components use normal 
> MPI_Send/Recv semantics, hence the PML/BML/BTL chain, but I always saw the 
> coll framework as a way to be able to integrate smoothly "custom" 
> collective components for a specific interconnect. I think that Mellanox 
> also did a specific collective component using directly their ConnectX HCA 
> capabilities.
> 
> However, modifying the "tuned" component may not be the better way to 
> integrate your collective work. You may consider creating a "tofu" coll 
> component which would only provide the collectives you optimized (and the 
> coll framework will fallback on tuned for the ones you didn't optimize).

Yes. I agree.
But sadly, my colleague implemented it badly.

We also created another COLL component that uses the interconnect's
hardware barrier, like Mellanox FCA.
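
For illustration only, the per-collective fallback you describe can be
sketched roughly as below. This is a simplified model and not Open MPI's
actual coll framework API: every type, function name, and priority value
here is hypothetical. It only shows the idea that a "tofu" component can
provide a few collectives while "tuned" covers the rest.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model only: these types and names are hypothetical and are
 * NOT Open MPI's real coll framework API. */

enum coll_op {
    COLL_BARRIER, COLL_BCAST, COLL_ALLGATHER,
    COLL_ALLTOALL, COLL_ALLREDUCE, COLL_REDUCE,
    COLL_OP_COUNT
};

typedef int (*coll_fn_t)(void);

struct coll_component {
    const char *name;
    int priority;                  /* higher value wins (assumed rule) */
    coll_fn_t fns[COLL_OP_COUNT];  /* NULL means "not provided" */
};

struct coll_table {
    coll_fn_t fns[COLL_OP_COUNT];
    const char *provider[COLL_OP_COUNT];
};

/* Stubs standing in for real algorithms; the return value only
 * identifies which component was called. */
static int tuned_stub(void) { return 0; }
static int tofu_stub(void)  { return 1; }

/* For each collective, pick the highest-priority component that
 * provides it. Collectives nobody provides stay NULL. */
void coll_select(const struct coll_component *comps, size_t n,
                 struct coll_table *out)
{
    assert(comps != NULL && out != NULL);
    for (int op = 0; op < COLL_OP_COUNT; op++) {
        int best_prio = 0;
        out->fns[op] = NULL;
        out->provider[op] = NULL;
        for (size_t i = 0; i < n; i++) {
            if (comps[i].fns[op] == NULL)
                continue;
            if (out->fns[op] == NULL || comps[i].priority > best_prio) {
                best_prio = comps[i].priority;
                out->fns[op] = comps[i].fns[op];
                out->provider[op] = comps[i].name;
            }
        }
    }
}

/* Example setup: "tuned" provides everything, "tofu" only four
 * collectives but with a higher (made-up) priority. */
void build_example_table(struct coll_table *out)
{
    const struct coll_component comps[] = {
        { "tuned", 30, {
            [COLL_BARRIER]   = tuned_stub, [COLL_BCAST]     = tuned_stub,
            [COLL_ALLGATHER] = tuned_stub, [COLL_ALLTOALL]  = tuned_stub,
            [COLL_ALLREDUCE] = tuned_stub, [COLL_REDUCE]    = tuned_stub } },
        { "tofu", 50, {
            [COLL_BCAST]     = tofu_stub,  [COLL_ALLGATHER] = tofu_stub,
            [COLL_ALLTOALL]  = tofu_stub,  [COLL_ALLREDUCE] = tofu_stub } },
    };
    coll_select(comps, sizeof comps / sizeof comps[0], out);
}
```

With this setup, bcast/allgather/alltoall/allreduce resolve to the "tofu"
entries, while barrier and reduce fall back to "tuned".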

> > To achieve above, we created 'tofu COMMON', like sm
> > (ompi/mca/common/sm/).
> > 
> > Is there interesting one?
> It may be interesting, yes. I don't know the tofu model, but if it is not 
> secret, contributing it is usually a good thing.
> 
> Your communication model may be similar to others and portions of code may 
> be shared with other technologies (I'm thinking of IB, MX, PSM,...). 
> People writing new code would also consider your model and let you take 
> advantage of it. Knowing how tofu is integrated into Open MPI may also 
> impact major decisions the open-source community is taking.

The Tofu communication model is similar to that of IB RDMA.
In fact, we use the source code of the openib BTL as a reference.
We'll consider contributing some of the code and joining the discussion.

Regards,

Takahiro Kawashima,
MPI development team,
Fujitsu
