Thank you, Felix, for the provided information. I am currently analyzing the provided integration of Flink with SystemML.
I am also gathering information for the ticket FLINK-1730 <https://issues.apache.org/jira/browse/FLINK-1730>; maybe we will take it on, to unblock the SystemML/Flink integration.

Thu, Feb 9, 2017 at 0:17, Felix Neutatz <neut...@googlemail.com.invalid>:

> Hi Kate,
>
> 1) - Broadcast:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-5%3A+Only+send+data+to+each+taskmanager+once+for+broadcasts
> - Caching: https://issues.apache.org/jira/browse/FLINK-1730
>
> 2) I have no idea about the GPU implementation. The SystemML mailing list will probably help you out there.
>
> Best regards,
> Felix
>
> 2017-02-08 14:33 GMT+01:00 Katherin Eri <katherinm...@gmail.com>:
>
> > Thank you, Felix, for your point; it is quite interesting.
> >
> > I will take a look at the code of the provided Flink integration.
> >
> > 1) You have these problems with Flink: "we realized that the lack of a caching operator and a broadcast issue strongly affects the performance". Have you already asked the community about this? If yes, please provide a reference to the ticket or the subject of the email thread.
> >
> > 2) You said that SystemML provides GPU support. I have seen SystemML's source code and would like to ask: why did you decide to implement your own CUDA integration? Did you consider ND4J, or do you maintain your own implementation because ND4J is younger?
> >
> > Tue, Feb 7, 2017 at 18:35, Felix Neutatz <neut...@googlemail.com>:
> >
> > > Hi Katherin,
> > >
> > > we are also working in a similar direction. We implemented a prototype to integrate with SystemML:
> > > https://github.com/apache/incubator-systemml/pull/119
> > > SystemML provides many different matrix formats, operations, GPU support and a couple of DL algorithms. Unfortunately, we realized that the lack of a caching operator and a broadcast issue strongly affects the performance (e.g. compared to Spark).
> > > At the moment I am trying to tackle the broadcast issue. But caching is still a problem for us.
> > >
> > > Best regards,
> > > Felix
> > >
> > > 2017-02-07 16:22 GMT+01:00 Katherin Eri <katherinm...@gmail.com>:
> > >
> > > > Thank you, Till.
> > > >
> > > > 1) Regarding ND4J, I didn't know about such an unfortunate and critical restriction of it -> the lack of sparsity optimizations, and you are right: this issue is still open for them. I saw that Flink uses Breeze, but I thought its usage was due to historical reasons.
> > > >
> > > > 2) Regarding integration with DL4J, I have read the source code of the DL4J/Spark integration; that's why I have dropped the idea of reusing their word2vec implementation for now, for example. I can investigate this topic more deeply, if required.
> > > >
> > > > So I feel that we have the following picture:
> > > >
> > > > 1) The DL integration investigation could be part of Apache Bahir. I can investigate this topic further, but I think we need a separate ticket to track this activity.
> > > >
> > > > 2) GPU support, required for DL, is interesting, but requires ND4J, for example.
> > > >
> > > > 3) ND4J cannot be incorporated because it doesn't support sparsity <https://deeplearning4j.org/roadmap.html> [1].
> > > >
> > > > Regarding ND4J: is this the single blocker for its incorporation, or are other blockers known?
> > > >
> > > > [1] https://deeplearning4j.org/roadmap.html
> > > >
> > > > Tue, Feb 7, 2017 at 16:26, Till Rohrmann <trohrm...@apache.org>:
> > > >
> > > > > Thanks for initiating this discussion, Katherin. I think you're right that in general it does not make sense to reinvent the wheel over and over again, especially if you only have limited resources at hand.
> > > > > So if we could integrate Flink with some existing library, that would be great.
> > > > >
> > > > > In the past, however, we couldn't find a good library which provided enough freedom to integrate it with Flink. Especially if you want distributed and somewhat high-performance implementations of ML algorithms, you have to take Flink's execution model (capabilities as well as limitations) into account. That is mainly the reason why we started implementing some of the algorithms "natively" on Flink.
> > > > >
> > > > > If I remember correctly, the problem with ND4J was and still is that it does not support sparse matrices, which was a requirement from our side. As far as I know, it is quite common to have sparse data structures when dealing with large-scale problems. That's why we built our own abstraction which can have different implementations. Currently, the default implementation uses Breeze.
> > > > >
> > > > > I think the support for GPU-based operations and the actual resource management are two orthogonal things. The implementation would have to work with no GPUs available anyway. If the system detects that GPUs are available, then ideally it would exploit them. Thus, we could add this feature later and maybe integrate it with FLINK-5131 [1].
> > > > >
> > > > > Concerning the integration with DL4J, I think Theo's proposal to do it in a separate repository (maybe as part of Apache Bahir) is a good idea. We're currently thinking about outsourcing some of Flink's libraries into sub-projects. This could also be an option for the DL4J integration then. In general I think it should be feasible to run DL4J on Flink, given that it also runs on Spark. Have you already looked at it more closely?
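The sparsity point Till raises can be made concrete with a minimal sketch (plain Java with no ND4J, Breeze, or Flink dependencies; the class and method names below are invented for illustration): a dense layout stores every zero of a large, mostly empty matrix, while a compressed sparse row (CSR) layout stores only the non-zero entries.

```java
import java.util.ArrayList;
import java.util.List;

// Toy CSR (compressed sparse row) matrix: only non-zero entries are stored.
// Illustrates why a dense-only library is costly for large, mostly-zero data.
public class Csr {
    final List<Double> values = new ArrayList<>();   // non-zero values, row by row
    final List<Integer> colIdx = new ArrayList<>();  // column index of each stored value
    final int[] rowPtr;                              // rowPtr[i]..rowPtr[i+1] bounds row i's slice

    public Csr(double[][] dense) {
        rowPtr = new int[dense.length + 1];
        for (int i = 0; i < dense.length; i++) {
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    values.add(dense[i][j]);
                    colIdx.add(j);
                }
            }
            rowPtr[i + 1] = values.size();
        }
    }

    public double get(int i, int j) {
        for (int k = rowPtr[i]; k < rowPtr[i + 1]; k++) {
            if (colIdx.get(k) == j) return values.get(k);
        }
        return 0.0; // implicit zero: never stored
    }

    public static void main(String[] args) {
        // A 3x4 matrix with 2 non-zeros: dense stores 12 doubles, CSR stores 2.
        double[][] dense = {{0, 0, 5, 0}, {0, 0, 0, 0}, {7, 0, 0, 0}};
        Csr m = new Csr(dense);
        System.out.println(m.get(0, 2));                  // 5.0
        System.out.println(m.get(1, 1));                  // 0.0
        System.out.println("stored: " + m.values.size()); // stored: 2
    }
}
```

As a rough illustration of scale: for a 1,000,000 x 1,000 feature matrix with 0.1% non-zeros, the dense form needs about 8 GB of doubles while a CSR form needs on the order of megabytes, which is why sparse support was a hard requirement on the FlinkML side.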
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-5131
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Feb 7, 2017 at 11:47 AM, Katherin Eri <katherinm...@gmail.com> wrote:
> > > > >
> > > > > > Thank you, Theodore, for your reply.
> > > > > >
> > > > > > 1) Regarding GPU, your point is clear and I agree with it; ND4J looks appropriate. But my current understanding is that we also need to cover some resource-management questions: when we provide GPU support, we also need to manage the GPU as a resource. For example, Mesos already supports the GPU as a resource item: Initial support for GPU resources <https://issues.apache.org/jira/browse/MESOS-4424?jql=text%20~%20GPU>. Flink uses Mesos as a cluster manager, which means this Mesos feature could be reused. The memory-management questions in Flink regarding GPUs should also be clarified.
> > > > > >
> > > > > > 2) Regarding integration with DL4J: what stops us from creating a ticket and starting the discussion around this topic? Do we need a user story, or is the community not sure that DL is really helpful? Why did the discussion with Adam Gibson end with no implementation of any idea? What concerns do we have?
> > > > > >
> > > > > > Mon, Feb 6, 2017 at 15:01, Theodore Vasiloudis <theodoros.vasilou...@gmail.com>:
> > > > > >
> > > > > > > Hello all,
> > > > > > >
> > > > > > > This is a point that has come up in the past: given the multitude of ML libraries out there, should we have native implementations in FlinkML or try to integrate other libraries instead?
> > > > > > >
> > > > > > > We haven't managed to reach a consensus on this before.
> > > > > > > My opinion is that there is definitely value in having ML algorithms written natively in Flink, both for performance optimization and, more importantly, for engineering simplicity: we don't want to force users to use yet another piece of software to run their ML algorithms (at least for a basic set of algorithms).
> > > > > > >
> > > > > > > We have in the past discussed integrations with DL4J (particularly ND4J) with Adam Gibson, the core developer of the library, but we never got around to implementing anything.
> > > > > > >
> > > > > > > Whether it makes sense to have an integration with DL4J as part of the Flink distribution would be up for discussion. I would suggest making it an independent repo to allow for faster dev/release cycles, and because it wouldn't be directly related to the core of Flink, it would add extra reviewing burden to an already overloaded group of committers.
> > > > > > >
> > > > > > > Natively supporting GPU calculations in Flink would be much better achieved through a library like ND4J; the engineering burden would be too high otherwise.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Theodore
> > > > > > >
> > > > > > > On Mon, Feb 6, 2017 at 11:26 AM, Katherin Eri <katherinm...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hello, guys.
> > > > > > > >
> > > > > > > > Theodore, last week I started the review of the PR https://github.com/apache/flink/pull/2735, related to *word2Vec for Flink*.
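As background on what a word2Vec implementation has to compute, here is a toy sketch of skip-gram (center, context) pair generation, the preprocessing step the training is built on; it is illustrative only and assumes nothing from the PR, FlinkML, or DL4J:

```java
import java.util.ArrayList;
import java.util.List;

// Toy skip-gram pair generation: for each word, emit (center, context)
// pairs for its neighbours within a fixed window. A real word2vec
// implementation additionally builds a vocabulary, subsamples frequent
// words, and trains embeddings on these pairs.
public class SkipGram {
    public static List<String[]> pairs(String[] tokens, int window) {
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(tokens.length - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (j != i) {
                    out.add(new String[]{tokens[i], tokens[j]});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] sentence = {"flink", "runs", "ml", "jobs"};
        for (String[] p : pairs(sentence, 1)) {
            System.out.println(p[0] + " -> " + p[1]);
        }
    }
}
```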
> > > > > > > > During this review I asked myself: why do we need to implement such a very popular algorithm like *word2vec one more time*, when there is already an available Java implementation provided by the deeplearning4j.org <https://deeplearning4j.org/word2vec> library (DL4J -> Apache 2 licence)? This library promotes itself, there is hype around it in the ML sphere, and it has been integrated with Apache Spark to provide scalable deep-learning calculations.
> > > > > > > >
> > > > > > > > *That's why I thought: could we also integrate Flink with this library or not?*
> > > > > > > >
> > > > > > > > 1) Personally, I think providing support for and deployment of *deep-learning (DL) algorithms/models in Flink* is a promising and attractive feature, because:
> > > > > > > >
> > > > > > > > a) during the last two years DL has proved its efficiency, and these algorithms are used in many applications. For example, *Spotify* uses DL-based algorithms for music content extraction: Recommending music on Spotify with deep learning, AUGUST 05, 2014 <http://benanne.github.io/2014/08/05/spotify-cnns.html>, for their music recommendations. Developers need to scale DL up manually, which causes a lot of work; that's why platforms like Flink should support the deployment of these models.
> > > > > > > >
> > > > > > > > b) Here is the scope of deep-learning use cases <https://deeplearning4j.org/use_cases>; many of these scenarios could be supported on Flink.
> > > > > > > > 2) But DL raises questions such as:
> > > > > > > >
> > > > > > > > a) scaling the calculations out over machines;
> > > > > > > >
> > > > > > > > b) performing these calculations both on CPU and GPU. The GPU is required to train big DL models; otherwise the learning process can converge very slowly.
> > > > > > > >
> > > > > > > > 3) I have checked the DL4J library, which already has rich support for many attractive DL models: Recurrent Networks and LSTMs, Convolutional Networks (CNN), Restricted Boltzmann Machines (RBM) and others. So we won't need to implement them independently, but only provide the ability to execute these models over a Flink cluster, quite similarly to the way it was integrated with Apache Spark.
> > > > > > > >
> > > > > > > > Because of all of this, I propose:
> > > > > > > >
> > > > > > > > 1) To create a new ticket in Flink's JIRA for the integration of Flink with DL4J, and to decide on which side this integration should be implemented.
> > > > > > > > 2) To support GPU resources natively in Flink and allow calculations over them, as described in this publication: https://www.oreilly.com/learning/accelerating-spark-workloads-using-gpus
> > > > > > > >
> > > > > > > > *Regarding the original issue Implement Word2Vec <https://issues.apache.org/jira/browse/FLINK-2094> in Flink:* I have investigated its implementation in DL4J and the DL4J/Apache Spark integration, and got several points:
> > > > > > > >
> > > > > > > > It seems that the idea of building our own implementation of word2vec in Flink is not such a bad solution, because DL4J was forced to reimplement its original word2Vec over Spark. I have checked the integration of DL4J with Spark and found that it is too strongly coupled with the Spark API, so it is impossible to just take some DL4J API and reuse it; instead we would need to implement an independent integration for Flink.
> > > > > > > >
> > > > > > > > *That's why we should simply finish the implementation of the current PR **independently** of the DL4J integration.*
> > > > > > > >
> > > > > > > > Could you please share your opinion on my questions and points? What do you think about them?
> > > > > > > >
> > > > > > > > Mon, Feb 6, 2017 at 12:51, Katherin Eri <katherinm...@gmail.com>:
> > > > > > > >
> > > > > > > > > Sorry, guys, I need to finish this letter first.
> > > > > > > > > The full version of it will come shortly.
> > > > > > > > >
> > > > > > > > > Mon, Feb 6, 2017 at 12:49, Katherin Eri <katherinm...@gmail.com>:
> > > > > > > > >
> > > > > > > > > > Hello, guys.
> > > > > > > > > > Theodore, last week I started the review of the PR https://github.com/apache/flink/pull/2735, related to *word2Vec for Flink*.
> > > > > > > > > >
> > > > > > > > > > During this review I asked myself: why do we need to implement such a very popular algorithm like *word2vec one more time*, when there is already an available Java implementation provided by the deeplearning4j.org <https://deeplearning4j.org/word2vec> library (DL4J -> Apache 2 licence)? This library promotes itself, there is hype around it in the ML sphere, and it has been integrated with Apache Spark to provide scalable deep-learning calculations.
> > > > > > > > > > That's why I thought: could we also integrate Flink with this library or not?
> > > > > > > > > > 1) Personally, I think providing support for and deployment of deep-learning algorithms/models in Flink is a promising and attractive feature, because:
> > > > > > > > > > a) during the last two years deep learning has proved its efficiency, and these algorithms are used in many applications. For example, *Spotify* uses DL-based algorithms for music content extraction: Recommending music on Spotify with deep learning, AUGUST 05, 2014 <http://benanne.github.io/2014/08/05/spotify-cnns.html>, for their music recommendations. Doing this in a natively scalable way is very attractive.
> > > > > > > > > >
> > > > > > > > > > I have investigated that implementation of the DL4J/Apache Spark integration, and got several points:
> > > > > > > > > >
> > > > > > > > > > 1) It seems that the idea of building our own word2vec implementation is not such a bad solution, because the integration of DL4J with Spark is too strongly coupled with the Spark API, and it would take time on the DL4J side to adapt this integration to Flink. I had also expected that we would be able to just call some API; it is not such a thing.
> > > > > > > > > > 2)
> > > > > > > > > >
> > > > > > > > > > https://deeplearning4j.org/use_cases
> > > > > > > > > > https://www.analyticsvidhya.com/blog/2017/01/t-sne-implementation-r-python/
> > > > > > > > > >
> > > > > > > > > > Thu, Jan 19, 2017 at 13:29, Till Rohrmann <trohrm...@apache.org>:
> > > > > > > > > >
> > > > > > > > > > > Hi Katherin,
> > > > > > > > > > >
> > > > > > > > > > > welcome to the Flink community. Always great to see new people joining the community :-)
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Till
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Jan 17, 2017 at 1:02 PM, Katherin Sotenko <katherinm...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > ok, I've got it.
> > > > > > > > > > > > I will take a look at https://github.com/apache/flink/pull/2735.
> > > > > > > > > > > >
> > > > > > > > > > > > Tue, Jan 17, 2017 at 14:36, Theodore Vasiloudis <theodoros.vasilou...@gmail.com>:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hello Katherin,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Welcome to the Flink community!
> > > > > > > > > > > > > The ML component definitely needs a lot of work, you are correct; we are facing problems similar to CEP's, which we'll hopefully resolve with the restructuring Stephan has mentioned in that thread.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If you'd like to help out, we have many open PRs; one I have started reviewing but got side-tracked on is the Word2Vec one [1].
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Theodore
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://github.com/apache/flink/pull/2735
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Jan 17, 2017 at 12:17 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Katherin,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > welcome to the Flink community!
> > > > > > > > > > > > > > Help with reviewing PRs is always very welcome and a great way to contribute.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best, Fabian
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-01-17 11:17 GMT+01:00 Katherin Sotenko <katherinm...@gmail.com>:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you, Timo.
> > > > > > > > > > > > > > > I have started the analysis of the topic.
> > > > > > > > > > > > > > > And if necessary, I will try to review the other pull requests.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Tue, Jan 17, 2017 at 13:09, Timo Walther <twal...@apache.org>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Katherin,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > great to hear that you would like to contribute! Welcome!
> > > > > > > > > > > > > > > > I gave you contributor permissions. You can now assign issues to yourself. I assigned FLINK-1750 to you.
> > > > > > > > > > > > > > > > Right now there are many open ML pull requests; you are very welcome to review the code of others, too.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Timo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On 17/01/17 at 10:39, Katherin Sotenko wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hello, All!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I'm Kate Eri, a Java developer with 6 years of enterprise experience; I also have some expertise with Scala (half a year).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > In the last 2 years I have participated in several BigData projects related to Machine Learning (time-series analysis, recommender systems, social networking) and ETL. I have experience with Hadoop, Apache Spark and Hive.
> > > > > > > > > > > > > > > > > I'm fond of the ML topic, and I see that the Flink project requires some work in this area; that's why I would like to join Flink and to ask you to grant me the assignment of the ticket https://issues.apache.org/jira/browse/FLINK-1750.