Re: BytePS-MXNet Integration

2019-12-09 Thread Yimin Jiang
Hi Patric,

Sorry for the late reply. I added a link explaining the rationale for BytePS
(why it outperforms Horovod and other libs), which points to our GitHub page:
https://github.com/bytedance/byteps/blob/master/docs/rationale.md
Hopefully it answers your question. Thanks. :)

Best,
Yimin

On Sun, Nov 10, 2019 at 8:04 PM Zhao, Patric wrote:

> I read the proposal, but it gives little technical explanation of why
> BytePS is better than Horovod or other HW-provided libraries.
> It would be better if the proposal included more technical details about
> BytePS.
>
> Thanks,
>
> --Patric
>
> > -----Original Message-----
> > From: Lin Yuan 
> > Sent: Sunday, November 10, 2019 1:58 PM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Re: BytePS-MXNet Integration
> >
> > Very interesting proposal. I have tried BytePS on some examples and did
> > see better performance than Horovod. I look forward to this integration.
> > Feel free to let the community know if any help is needed.
> >
> > Lin
>


Re: BytePS-MXNet Integration

2019-11-06 Thread Yimin Jiang
Hi Zhennan,

Thanks for your interest. To be honest, our team currently does not have a
plan for CPU training. That said, the idea behind BytePS is not GPU-specific
and should also apply to CPU. I do not see a fundamental challenge yet, and
we welcome contributions on this.

Thank you,
Yimin

On Wed, Nov 6, 2019 at 2:59 PM Qin, Zhennan wrote:

> Hi Yimin,
>
> Welcome, and thank you for contributing to the MXNet project!
>
> From https://github.com/bytedance/byteps/blob/master/README.md I found
> another limitation that isn't mentioned in your proposal:
>
> BytePS does not support pure CPU training for now. One reason is that the
> cheap PS assumption<
> https://github.com/bytedance/byteps/blob/master/docs/rationale.md> of
> BytePS do not hold for CPU training. Consequently, you need CUDA and NCCL
> to build and run BytePS.
>
> I have a couple of questions about this: What is the status of CPU training
> support? If CPU training isn't supported yet, what are the challenges in
> supporting it? Do you have a plan to support it?
>
> Thanks,
> Zhennan
>
> On Wed, 2019-11-06 at 12:14 +0800, Yimin Jiang wrote:
>
> Hi MXNet Community,
>
> BytePS (https://github.com/bytedance/byteps) is a high-performance,
> cross-framework architecture for distributed training. BytePS developers
> are planning to integrate a part of BytePS into MXNet. The link below is
> the proposal. Feedback is welcome.
>
> https://cwiki.apache.org/confluence/display/MXNET/BytePS-MXNet+Integration
>
> Thank you,
> Yimin Jiang


BytePS-MXNet Integration

2019-11-05 Thread Yimin Jiang
Hi MXNet Community,

BytePS (https://github.com/bytedance/byteps) is a high-performance,
cross-framework architecture for distributed training. BytePS developers
are planning to integrate a part of BytePS into MXNet. The link below is
the proposal. Feedback is welcome.

https://cwiki.apache.org/confluence/display/MXNET/BytePS-MXNet+Integration


Thank you,
Yimin Jiang