Hi Maheshakya,
Thank you very much. I am already onto that. I will let you know soon. Thank you.
BR,
Mahesh.

On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Hi Mahesh,
>
>> Is that Scala API included in your current product or repo?
>
> No, we don't have the Scala API included. What we want is to design the
> Java implementations of those algorithms to train with mini-batches of
> streaming data with the help of the aforementioned methods so that we can
> include it as a CEP extension.
>
> To clarify, please try to write a simple Java program using Spark MLLib
> linear regression and k-means clustering with a sample data set (you can
> find a lot of data sets in the UCI repo[1]). You need to break the data set
> into several pieces and train a model repeatedly with those.
> After each training run, save the model information (such as weights and
> intercepts for regression and cluster centers for clustering; please check
> the arguments of the methods I have mentioned and save the required
> information of the model).
> When training a model with a new piece of data, use those methods to
> initialize it, passing the saved values as the arguments. This way you can
> start from where you stopped in the previous run.
>
> Let us know your observations and feel free to ask if you need to know
> anything more on this.
>
> We'll let you know what needs to be done to include this in CEP.
>
> Best regards.
>
> On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Great, thank you. I already have ML and CEP and am working more towards it.
>> Is that Scala API included in your current product or repo? Thank you.
>> BR,
>> Mahesh.
>>
>> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Please find the comments inline.
>>>
>>>> Is the data stream passed to ML in the event publisher's format through
>>>> the event publisher?
>>>> Or can we use the direct traffic that comes to the event receiver, or
>>>> else take it as streams?
>>>>
>>> We intend to use the direct data as event streams.
>>>
>>>> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
>>>>
>>> No, WSO2 ML doesn't use any event streams. The data stored in tables in
>>> DAS is loaded into ML.
>>>
>>>> 2.) Are there any incremental learning algorithms currently active in
>>>> ML? You mentioned that there are and that they come with the Scala API.
>>>> So there is streaming support with that Scala API. In that API, in which
>>>> format is the data acquired by ML?
>>>>
>>> No, there are no incremental learning algorithms in ML. The Scala API
>>> refers to Spark MLLib. MLLib supports streaming k-means and streaming
>>> generalized linear models (linear regression variants and logistic
>>> regression) with the Scala API. What those implementations basically do is
>>> retrain the trained models with mini-batches as data arrives sequentially.
>>> There, the breaking of streaming data into mini-batches is done with the
>>> help of Spark Streaming. But we do not intend to use Spark Streaming in our
>>> implementation. What we need to do is implement a similar behavior for
>>> event streams using the Java API.
>>> The Java API has the following methods:
>>>
>>>    - *createModel*(Vector weights, double intercept) - for GLMs
>>>      <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>
>>>      Vector:
>>>      <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html>
>>>    - *setInitialModel*(KMeansModel model) - for k-means
>>>      <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28org.apache.spark.mllib.clustering.KMeansModel%29>
>>>      KMeansModel:
>>>      <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeansModel.html>
>>>
>>> With the help of these methods, we can train models again with newly
>>> arriving data, keeping the characteristics learned from the previous data.
>>> When implementing this, we need to pay attention to other parameters of
>>> incremental learning such as data horizon and data obsolescence (indicated
>>> on the project ideas page).
>>> We need to discuss how to add these with CEP event streams. I have
>>> added Suho to the thread for more clarification.
>>>
>>> Best regards.
>>>
>>>
>>> On Sat, Mar 5, 2016 at 5:15 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> Since we intend to use WSO2 CEP to handle streaming data and implement
>>>> the machine learning algorithms with Spark MLLib, is the data stream
>>>> passed to ML in the event publisher's format through the event publisher?
>>>> Or can we use the direct traffic that comes to the event receiver, or
>>>> else take it as streams? Referring to
>>>> https://docs.wso2.com/display/CEP410/User+Guide
>>>> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
>>>> 2.) Are there any incremental learning algorithms currently active
>>>> in ML? You mentioned that there are and that they come with the Scala API.
>>>> So there is streaming support with that Scala API. In that API, in which
>>>> format is the data acquired by ML?
>>>>
>>>> Thank you.
>>>> BR,
>>>> Mahesh.
>>>>
>>>> On Fri, Mar 4, 2016 at 2:03 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> We had to modify the project scope a little to best suit the
>>>>> requirements. We will update the project idea with those concerns soon
>>>>> and let you know.
>>>>>
>>>>> We do not support streaming data in WSO2 Machine Learner at the
>>>>> moment. The new concern is to use WSO2 CEP to handle streaming data and
>>>>> implement the machine learning algorithms with Spark MLLib. You can look
>>>>> at the streaming k-means and streaming linear regression implementations
>>>>> in MLLib. Currently, that API is only for Scala. Our need is to get the
>>>>> Java APIs of k-means and generalized linear models to support incremental
>>>>> learning with streaming data. This has to be done as mini-batch learning,
>>>>> since these algorithms operate as stochastic gradient descent, so that
>>>>> any learning with new data can be done on top of the previously learned
>>>>> models.
>>>>> So please go through those APIs[1][2][3] and try to get an idea.
>>>>> Also please try to understand how event streams work in WSO2 CEP
>>>>> [4][5].
>>>>>
>>>>> Best regards.
>>>>>
>>>>> [1] http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html
>>>>> [2] http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html
>>>>> [3] http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/classification/LogisticRegressionWithSGD.html
>>>>> [4] https://docs.wso2.com/display/CEP310/Working+with+Event+Streams
>>>>> [5] https://docs.wso2.com/display/CEP310/Working+with+Execution+Plans
>>>>>
>>>>> On Fri, Mar 4, 2016 at 11:26 AM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> Give me some time to go through your ML package. Does the current
>>>>>> product have any stream data support? I did some university projects
>>>>>> related to machine learning with regression, modelling, factor analysis,
>>>>>> cluster analysis and classification problems (discriminant analysis)
>>>>>> with SVMs (support vector machines), neural networks, LS classification
>>>>>> and ML (maximum likelihood). Give me some time to see how the WSO2
>>>>>> architecture works. Then I can come up with a good architecture.
>>>>>> Thank you.
>>>>>> BR,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Wed, Mar 2, 2016 at 2:41 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Maheshakya,
>>>>>>> Thank you for the resources. I will go through them, and I am looking
>>>>>>> forward to this proposed project. Thank you.
>>>>>>> BR,
>>>>>>> Mahesh.
>>>>>>>
>>>>>>> On Wed, Mar 2, 2016 at 1:52 PM, Maheshakya Wijewardena <
>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>
>>>>>>>> Hi Mahesh,
>>>>>>>>
>>>>>>>> Thank you for your interest in this project.
>>>>>>>>
>>>>>>>> We would like to know what type of similar projects you have worked
>>>>>>>> on. You may have seen that WSO2 Machine Learner supports several
>>>>>>>> learning algorithms at the moment[1].
>>>>>>>> This project intends to leverage the existing algorithms in WSO2
>>>>>>>> Machine Learner to support streaming data. As a first step, you can
>>>>>>>> get an idea of what WSO2 Machine Learner does and how it operates.
>>>>>>>> You can download WSO2 Machine Learner from the product page[2] and
>>>>>>>> the source code[3]. ML uses Apache Spark MLLib[4] for its algorithms,
>>>>>>>> so it's better to read and understand what it does as well.
>>>>>>>>
>>>>>>>> In order to get an idea about the deliverables and the scope of
>>>>>>>> this project, try to understand how Spark Streaming[5] (see examples)
>>>>>>>> handles streaming data. Also, have a look at the streaming
>>>>>>>> algorithms[6][7] supported by MLLib. There are two approaches
>>>>>>>> discussed to employ incremental learning in ML in the project
>>>>>>>> proposals page. These streaming algorithms can be directly used in the
>>>>>>>> first approach. For the other approach, your implementation should
>>>>>>>> contain a procedure to create mini-batches from streaming data with
>>>>>>>> relevant sizes (i.e. a moving window) and do periodic retraining of
>>>>>>>> the same algorithm.
>>>>>>>>
>>>>>>>> To start with the project, you will need to come up with a suitable
>>>>>>>> plan and an architecture first.
>>>>>>>>
>>>>>>>> Please watch the video referenced in the proposal (reference: 5).
>>>>>>>> It will help you get a better idea about machine learning algorithms
>>>>>>>> with streaming data.
>>>>>>>>
>>>>>>>> Let us know if you need any help with these.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> [1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
>>>>>>>> [2] http://wso2.com/products/machine-learner/
>>>>>>>> [3] https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
>>>>>>>> [4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
>>>>>>>> [5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
>>>>>>>> [6] https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
>>>>>>>> [7] https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means
>>>>>>>>
>>>>>>>> On Wed, Mar 2, 2016 at 1:19 PM, Mahesh Dananjaya <
>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>> I am interested in contributing to proposal 6: "Predictive analytic
>>>>>>>>> with online data for WSO2 Machine Learner" for GSOC2 this time. Since
>>>>>>>>> I have been engaged in some similar projects, I think it will be a
>>>>>>>>> great experience for me. Please let me know what you think and what
>>>>>>>>> you suggest. I have been going through your documents. Thank you.
>>>>>>>>> Regards,
>>>>>>>>> Mahesh Dananjaya.
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Dev mailing list
>>>>>>>>> Dev@wso2.org
>>>>>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Pruthuvi Maheshakya Wijewardena
>>>>>>>> mahesha...@wso2.com
>>>>>>>> +94711228855
>>>>>>>>
>>>>>
>>>>> --
>>>>> Pruthuvi Maheshakya Wijewardena
>>>>> mahesha...@wso2.com
>>>>> +94711228855
>>>>>
>>>
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> mahesha...@wso2.com
>>> +94711228855
>>>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
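
The procedure described in the thread (break the data set into pieces, train on each piece, save the weights and intercept, then initialize the next run from the saved values) might be sketched roughly as follows. This is only a plain-Java illustration of the warm-start idea, not Spark MLLib code: the class and method names (IncrementalLinearRegression, trainBatch, trainOnPieces), the synthetic data (y = 2x + 1) and the use of simple batch gradient descent are all assumptions made for the example. With MLLib, the saved values would instead be fed back through the methods named in the thread.

```java
import java.util.Arrays;

// Plain-Java illustration of warm-started training; NOT the Spark MLLib API.
// All names below are hypothetical and chosen only for this sketch.
class IncrementalLinearRegression {
    private final double[] weights;
    private double intercept;

    // Initialize from previously saved values; use zeros for the very first run.
    IncrementalLinearRegression(double[] savedWeights, double savedIntercept) {
        this.weights = savedWeights.clone();
        this.intercept = savedIntercept;
    }

    double predict(double[] x) {
        double sum = intercept;
        for (int j = 0; j < weights.length; j++) {
            sum += weights[j] * x[j];
        }
        return sum;
    }

    // Batch gradient descent on squared error over one piece of data,
    // continuing from the current weights instead of starting over.
    void trainBatch(double[][] features, double[] labels, int iterations, double stepSize) {
        int n = labels.length;
        for (int it = 0; it < iterations; it++) {
            double[] gradW = new double[weights.length];
            double gradB = 0.0;
            for (int i = 0; i < n; i++) {
                double error = predict(features[i]) - labels[i];
                for (int j = 0; j < weights.length; j++) {
                    gradW[j] += error * features[i][j];
                }
                gradB += error;
            }
            for (int j = 0; j < weights.length; j++) {
                weights[j] -= stepSize * gradW[j] / n;
            }
            intercept -= stepSize * gradB / n;
        }
    }

    double[] getWeights() { return weights.clone(); }

    double getIntercept() { return intercept; }

    // Splits a synthetic data set (assumed: y = 2x + 1, x in [0, 1)) into pieces and
    // trains on them one after another, saving and restoring the state in between.
    static IncrementalLinearRegression trainOnPieces(int pieces, int perPiece) {
        double[] savedWeights = {0.0};
        double savedIntercept = 0.0;
        IncrementalLinearRegression model =
                new IncrementalLinearRegression(savedWeights, savedIntercept);
        int total = pieces * perPiece;
        for (int p = 0; p < pieces; p++) {
            double[][] x = new double[perPiece][1];
            double[] y = new double[perPiece];
            for (int i = 0; i < perPiece; i++) {
                x[i][0] = (p * perPiece + i) / (double) total;
                y[i] = 2.0 * x[i][0] + 1.0;
            }
            // Restore the saved state, train on the new piece, then save again.
            model = new IncrementalLinearRegression(savedWeights, savedIntercept);
            model.trainBatch(x, y, 2000, 0.5);
            savedWeights = model.getWeights();
            savedIntercept = model.getIntercept();
        }
        return model;
    }

    public static void main(String[] args) {
        IncrementalLinearRegression model = trainOnPieces(4, 25);
        System.out.println("weights=" + Arrays.toString(model.getWeights())
                + " intercept=" + model.getIntercept());
    }
}
```

Because each run starts from the values saved by the previous one, the model keeps what it learned from earlier pieces; in this noiseless example the weight and intercept move towards 2 and 1 as more pieces are processed. Data horizon and data obsolescence, which the thread also mentions, are not handled here.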