Another interesting opportunity would be the development of a GPU storage
tier based on GPUDirect.

On Feb 17, 2018 4:00 PM, "Patrick Stuedi" <[email protected]> wrote:

> That's great. One of the main goals of Crail being an Apache Incubator
> project is to get more people involved in its development. I've
> been following your contributions to TensorFlow, nice work! Collaborating
> in this context (incl. MXNet) would be very interesting. There are multiple
> ways to go. Once we have the core C++ client, we could use help with the
> development of the various bindings (RDMA, TCP, for storage and RPC). Or we
> could use help leveraging Crail in TensorFlow and MXNet (param server,
> storage of the model > DRAM). Let us know where you see opportunities.
>
> On Feb 17, 2018 3:36 PM, "Bairen YI" <[email protected]> wrote:
>
>> Hi Patrick,
>>
>> That would be fantastic. In fact, we would love to get more involved, as
>> our lab at HKUST has partnered with MLNX to co-develop datacenter-scale AI
>> software solutions (TensorFlow and Apache MXNet), and we could encourage a
>> couple of students to contribute code to Crail at this very stage if we see
>> fit. It could also bring novel system/networking research opportunities to
>> our lab.
>>
>> Let me know how we could better work together.
>>
>> Best,
>> Bairen
>>
>> > On 17 Feb 2018, at 22:19, Patrick Stuedi <[email protected]> wrote:
>> >
>> > Hi Bairen,
>> >
>> > Your comment is spot on. The development of a C++ API for Crail is one
>> > of the top items on the roadmap, in particular to facilitate the integration
>> > into TensorFlow and serverless. In fact, I started drafting a prototype two
>> > weeks ago that I wanted to share soon. If you are interested in helping, let
>> > us know!
>> >
>> >
>> >
>> > On Feb 17, 2018 1:49 PM, "Bairen YI" <[email protected]> wrote:
>> >
>> > Hi folks,
>> >
>> > I have been following your work for a long time, and it is great to
>> > see Crail accepted as an Apache Incubator project.
>> >
>> > I authored the GPU Direct RDMA transport for TensorFlow (
>> > https://github.com/tensorflow/tensorflow/pull/11392), and I would love to
>> > see how we could design an end-to-end zero-copy dataflow from Crail to
>> > various deep learning frameworks such as TensorFlow (
>> > https://dl.acm.org/citation.cfm?doid=3123878.3131975).
>> >
>> > Is there any roadmap for Crail as a standalone, language-independent
>> > FileSystem/Cache service with a C API? That would really ease the integration
>> > into non-JVM-based third-party systems. It does not have to be HDFS
>> > compatible if that brings extra performance cost.
>> >
>> > Best,
>> > Bairen
>>
>
