You mentioned "I’m currently working to integrate the ignite distributed 
dataframes", what dataframes are you referring to? Could you share a link to 
docs, for example? We have no official term "the ignite distributed dataframes".

I guess the term the community uses is shared dataframes (the Ignite Spark DataFrame integration):
https://apacheignite-fs.readme.io/docs/ignite-data-frame

As far as XGBoost goes, it seems the platform isn’t really working on 
training, just inference. My teammate did say he “needs to investigate Ignite 
for distributed training for both XGBoost as well as TF (TensorFlow). And, I 
have been fiddling around with it,” so I will keep you informed.

Regards

~Adam


From: Alexey Zinoviev <zaleslaw....@gmail.com>
Date: Thursday, April 9, 2020 at 5:59 AM
To: "Carbone, Adam" <adam.carb...@bottomline.com>
Cc: "dev@ignite.apache.org" <dev@ignite.apache.org>
Subject: Re: Ignite XGBoost support

Morning!
I'm going to publish the roadmap for Ignite ML 2.9 for wide discussion at the 
end of April, when the 2.9 dates are finalized.

But I suppose we will be closer to the point you raised ("the plan to rely on 
models trained elsewhere and then imported into the platform for scoring?"), 
i.e. distributed inference; I hope to increase the number of integrated models 
and model formats.
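To make the "distributed inference" idea concrete: the pretrained model is shipped to every node, each node scores only the rows it holds locally, and only predictions travel back. The sketch below is illustrative plain Python, not the actual Ignite ML API; all names in it are made up.

```python
# Conceptual sketch of distributed inference (NOT the Ignite ML API):
# a pretrained model is broadcast to each data partition and applied
# locally, so only the predictions are sent back to the caller.

def stump(row):
    """Toy model: a single decision stump, standing in for an
    imported, pretrained XGBoost tree."""
    return 1.0 if row["f0"] > 0.5 else 0.0

def score_partition(model, rows):
    """Local step: apply the model to every row held by one partition."""
    return [model(row) for row in rows]

def distributed_predict(model, partitions):
    """Broadcast the model, score each partition, concatenate results."""
    predictions = []
    for rows in partitions:  # in a real cluster these would run in parallel
        predictions.extend(score_partition(model, rows))
    return predictions

parts = [
    [{"f0": 0.2}, {"f0": 0.9}],  # partition held by node A
    [{"f0": 0.7}],               # partition held by node B
]
print(distributed_predict(stump, parts))  # [0.0, 1.0, 1.0]
```

The point of the pattern is that the training data never leaves the nodes; only the (small) model moves, which is why inference is much easier to distribute than training.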

You mentioned "I’m currently working to integrate the ignite distributed 
dataframes", what dataframes are you referring to? Could you share a link to 
docs, for example? We have no official term "the ignite distributed dataframes".

If you get some results about integration with XGBoost, please let me know!

Sincerely yours,
                 Alexey

Fri, Mar 27, 2020 at 16:29, Carbone, Adam 
<adam.carb...@bottomline.com>:
Good Morning Alexey,

Let me first answer your questions.

1. Are you a member of the XGBoost project, with permission to commit to it? 
(In many cases the collaboration involves changes in both integrated 
frameworks.)
No, I am not personally, nor is our organization.
2. What primitives or integration points are accessible in XGBoost? Could you 
share a paper/article/link to give me a chance to read more?
I'm not sure that the person on my team doing this work has that level of 
understanding yet. As I mentioned in the previous email, we were about to 
embark on this when we saw the 2.8 announcement and decided to look further at 
the level of support.
3. What is the planned architecture with the native C++ libraries? Could you 
share it with me and the Ignite community?
I can only share the higher level on this (again, if it makes sense we could do 
a deeper dive with the developers working on it directly), but currently our 
TensorFlow neural-network modeling is exposed via internal web services written 
in C++ that wrap the TensorFlow libraries. These are called from within our job 
scheduling/runner framework and run on different images within our overall 
system. We were looking to do something similar for XGBoost prior to seeing it 
come up in the announcement.



So it looks like you are using MLeap to import and support external models; we 
have looked at the same approach ourselves. From what you mentioned, it seems 
there are currently no plans to add distributed training of any external 
algorithms to the platform. Are you developing your own algorithms, or is the 
plan to rely on models trained elsewhere and then imported into the platform 
for scoring? I'm just interested in the ways we may be able to leverage the 
platform or help contribute. We are looking to use other features of Ignite, so 
leveraging additional features over time seems like the right approach. I'm 
currently working to integrate the Ignite distributed dataframes.

Regards

Adam

Adam Carbone | Director of Innovation – Intelligent Platform Team | Bottomline 
Technologies
Office: 603-501-6446 | Mobile: 603-570-8418
www.bottomline.com



From: Alexey Zinoviev <zaleslaw....@gmail.com>
Date: Friday, March 27, 2020 at 1:58 AM
To: "Carbone, Adam" <adam.carb...@bottomline.com>
Cc: "dev@ignite.apache.org" <dev@ignite.apache.org>
Subject: Re: Ignite XGBoost support

Morning, Adam, Denis!

Let me describe the current status

1. https://issues.apache.org/jira/browse/IGNITE-10810 is related to MLeap, not 
to XGBoost. The right ticket for XGBoost is 
https://issues.apache.org/jira/browse/IGNITE-10289
2. Currently, we have no plans to add XGBoost or any external ML library for 
distributed training (inference can be supported now, with a few limitations; 
see the XGBoost or H2O examples).
3. We have model storage and partitioned-dataset primitives to keep the data, 
with MapReduce-like operations, but each algorithm has to be implemented 
manually as a sequence of MR operations (we have no MR code generation here).
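To illustrate what "a sequence of MR operations" means for something like XGBoost: the heart of distributed tree boosting is summing gradient/hessian statistics for each candidate split across all partitions, which decomposes naturally into a map over local rows plus a commutative reduce. The sketch below is plain Python pseudocode of that decomposition, not Ignite's partitioned-dataset API; every name in it is invented for illustration.

```python
# Hand-written MapReduce sketch of the kind described above: compute the
# gradient/hessian sums on each side of one candidate split, partition by
# partition, then merge. Illustrative only -- not the Ignite ML dataset API.

from functools import reduce

def map_partition(rows, feature, threshold):
    """Map step (runs locally on one partition): accumulate
    (gradient_sum, hessian_sum) for rows left/right of the split."""
    acc = {"left": (0.0, 0.0), "right": (0.0, 0.0)}
    for row in rows:
        side = "left" if row[feature] < threshold else "right"
        g, h = acc[side]
        acc[side] = (g + row["grad"], h + row["hess"])
    return acc

def combine(a, b):
    """Reduce step: merge per-partition statistics (commutative, associative)."""
    return {side: (a[side][0] + b[side][0], a[side][1] + b[side][1])
            for side in ("left", "right")}

partitions = [
    [{"x": 0.1, "grad": -1.0, "hess": 1.0},   # rows held by node A
     {"x": 0.8, "grad": 0.5, "hess": 1.0}],
    [{"x": 0.4, "grad": -0.5, "hess": 1.0}],  # rows held by node B
]

stats = reduce(combine, (map_partition(p, "x", 0.5) for p in partitions))
print(stats)  # {'left': (-1.5, 2.0), 'right': (0.5, 1.0)}
```

A full training integration would have to express every such step of the algorithm this way by hand, which is why "no MR code generation" makes external-library training a substantial effort.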

I have a few questions, could you please answer them?

1. Are you a member of the XGBoost project, with permission to commit to it? 
(In many cases the collaboration involves changes in both integrated 
frameworks.)
2. What primitives or integration points are accessible in XGBoost? Could you 
share a paper/article/link to give me a chance to read more?
3. What is the planned architecture with the native C++ libraries? Could you 
share it with me and the Ignite community?

P.S. I need to dig deeper to understand which capabilities of Ignite ML could 
be used to make it a platform for distributed training; your answers will be 
helpful.

Sincerely yours,
          Alexey Zinoviev

Fri, Mar 27, 2020 at 01:04, Carbone, Adam 
<adam.carb...@bottomline.com>:
Good afternoon Denis,

Nice to meet you, and hello to you too, Alexey. I'm not sure whether it will be 
me or another member of our team, but I wanted to start the discussion. We are 
investigating integrating Ignite into our ML platform. In addition, we have 
already done a separate TensorFlow implementation for neural networks using the 
C++ libraries, and we were about to take the same approach for XGBoost when we 
saw the 2.8 announcement. So before we went that route, I wanted to do a more 
proper investigation of where things were and where they might be headed.

Regards

Adam

Adam Carbone | Director of Innovation – Intelligent Platform Team | Bottomline 
Technologies
Office: 603-501-6446 | Mobile: 603-570-8418
www.bottomline.com



On 3/26/20, 5:20 PM, "Denis Magda" <dma...@apache.org> wrote:

    Hi Adam, thanks for starting the thread. Contributions are highly
    appreciated, and we'll be glad to see you among our contributors,
    especially if it helps make our ML library stronger.

    But first things first, let me introduce you to @Alexey Zinoviev
    <zaleslaw....@gmail.com>, who is our main ML maintainer.

    -
    Denis


    On Thu, Mar 26, 2020 at 1:49 PM Carbone, Adam <adam.carb...@bottomline.com>
    wrote:

    > Good Afternoon All
    >
    > I was asked to forward this here by Denis Magda. I see in the 2.8 release
    > that you implemented importing of XGBoost models for distributed inference
    > =>
    > https://issues.apache.org/jira/browse/IGNITE-10810?focusedCommentId=16728718&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16728718
    > Are there any plans to add distributed training? We are at a crossroads of
    > building an XGBoost solution on top of the C++ libraries, but if this is on
    > the roadmap, maybe we will go the Ignite direction vs. pure C++, and maybe
    > we might even be able to help and contribute.
    >
    > Regards
    >
    > Adam Carbone
    >
    > Adam Carbone | Director of Innovation – Intelligent Platform Team |
    > Bottomline Technologies
    > Office: 603-501-6446 | Mobile: 603-570-8418
    > www.bottomline.com
    >
    >
    >
