Hi Mahesh,

What's the status of the project?

On Thu, Jul 14, 2016 at 10:28 AM, Mahesh Dananjaya <[email protected]> wrote:

> Hi Maheshakya,
> I am building and running Samoa to see its functionality. Samoa still has
> limited algorithm support: it supports only classification and clustering
> on streams. It also uses a kind of StreamProcessor, like the one we use in
> the StreamProcessor extension. I got started with Samoa by following this
> page [1], then ran a couple of examples to identify the flow. Samoa uses
> the Hadoop framework instead of Spark for distribution, but I am using it
> in local mode. The Samoa core contains only a limited set of algorithms.
> IMO, if we are going to use Samoa we will have to limit the functionality
> and algorithms [2]. The developer corner in [3] looks something like the
> CEP extension that we are currently using. So although the algorithms in
> Samoa are limited, they have implemented streaming support for them.
> Therefore, if we integrate it into CEP we have to look at how to handle
> streams and algorithms on the Samoa side. Is it acceptable on your side to
> have both Hadoop and Spark running in the background? Thank you.
> regards,
> Mahesh.
>
> [1] https://samoa.incubator.apache.org/documentation/Home.html
> [2] https://samoa.incubator.apache.org/documentation/api/current/index.html
>
> On Wed, Jun 22, 2016 at 11:51 AM, Mahesh Dananjaya <[email protected]> wrote:
>
>> Hi Maheshakya,
>> Can I use external data sources, such as data from a database or from
>> HDFS, to generate events in the CEP event simulator rather than giving a
>> file? I saw "Switch to upload file for simulation" in the Input Data By
>> Data Source section of the event simulator. How can I feed data in real
>> time from other sources, or directly as data generated by a remote server
>> as JSON etc.? What format should the database be in? This is just for my
>> knowledge. Thank you.
>> regards,
>> Mahesh.
>>
>> On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <[email protected]> wrote:
>>
>>> Hi Nirmal,
>>> *This is what I have done so far in GSOC2016:*
>>>
>>>    - Prior research on SGD (Stochastic Gradient Descent) optimization
>>>    techniques and mini-batch processing
>>>    - Getting familiar with Siddhi and writing extensions for it
>>>    - Wrote Stream Processor extensions for streaming applications and
>>>    machine learning algorithms (Linear Regression, KMeans & Logistic
>>>    Regression)
>>>    - Developed a Streaming Linear Regression class that periodically
>>>    retrains models using mini-batch processing with SGD
>>>    - Extended the functionality to Moving Window Mini Batch Processing
>>>    with SGD, providing windowShift, which controls the data horizon and
>>>    data obsolescence
>>>    - Performance evaluation of the implementation
>>>    - Adding the Streaming Linear Regression class and Stream Processor
>>>    extension to carbon-ml
>>>
>>> *As a next step:*
>>>
>>>    - Add persisting of temporal models for applications such as
>>>    prediction
>>>    - Complete the Streaming KMeans clustering and Logistic Regression
>>>    classes
>>>    - Improve batching and streaming mechanisms
>>>    - Improve visualization (optional)
>>>    - Write examples and documentation
>>>
>>> regards,
>>> Mahesh.
>>>
>>> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <[email protected]> wrote:
>>>
>>>> Sorry, you need to put the returned values of the function into the
>>>> output stream:
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>> select mse
>>>> insert into LinregOutput;
>>>>
>>>> or
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>> select *
>>>> insert into LinregOutput;
>>>>
>>>> where the LinregOutput stream definition contains all attributes: mse,
>>>> intercept, beta1, ....
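Editor's note: in the streamlinreg(1, 2, 4, ...) call quoted above, the first three arguments are learnType, windowShift and batchSize, according to the 21 June message further down the thread. The sketch below is only an illustration of how a moving window of batchSize events that advances by windowShift events could feed a mini-batch SGD step for linear regression. It is written in plain Python rather than against Siddhi or Spark, and the function names (moving_windows, mse, sgd_step) and the exact window semantics are assumptions, not the carbon-ml implementation.

```python
from typing import Iterator, List, Sequence, Tuple

# One event: (feature values, observed label). Hypothetical layout for illustration.
Row = Tuple[Sequence[float], float]


def moving_windows(events: Sequence[Row], batch_size: int,
                   window_shift: int) -> Iterator[List[Row]]:
    """Yield windows of batch_size events, advancing window_shift events each time.

    Assumed semantics: with window_shift < batch_size consecutive windows overlap,
    so older events stay in the data horizon for a while before becoming obsolete.
    """
    if batch_size <= 0 or window_shift <= 0:
        raise ValueError("batch_size and window_shift must be positive")
    start = 0
    while start + batch_size <= len(events):
        yield list(events[start:start + batch_size])
        start += window_shift


def mse(weights: Sequence[float], intercept: float, batch: Sequence[Row]) -> float:
    """Mean squared error of a linear model on one batch."""
    total = 0.0
    for x, y in batch:
        pred = intercept + sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(batch)


def sgd_step(weights: Sequence[float], intercept: float, batch: Sequence[Row],
             step_size: float) -> Tuple[List[float], float]:
    """Apply one mini-batch gradient step on the squared-error loss."""
    n = len(batch)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in batch:
        err = intercept + sum(w * xi for w, xi in zip(weights, x)) - y
        grad_b += 2.0 * err / n
        for j, xi in enumerate(x):
            grad_w[j] += 2.0 * err * xi / n
    new_weights = [w - step_size * g for w, g in zip(weights, grad_w)]
    return new_weights, intercept - step_size * grad_b
```

With eight events, batch_size=4 and window_shift=2, moving_windows yields three overlapping windows (starting at events 0, 2 and 4). Calling sgd_step once per window and reporting mse for each window mirrors the "retrain periodically, emit mse" flow discussed in the thread.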
>>>>
>>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> In your output stream, you need to list all the attributes that are
>>>>> returned from the streamlinreg function: mse, intercept, beta1, ....
>>>>> Can you try that?
>>>>>
>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> This is the full query I used.
>>>>>>
>>>>>> @Import('LinRegInput:1.0.0')
>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>> strikeouts double, errors double);
>>>>>>
>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>> define stream LinregOutput (mse double);
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>> select *
>>>>>> insert into mse;
>>>>>>
>>>>>> But I am sending [mse,intercept,beta1....betap] as an outputData
>>>>>> Object[]. So how can I publish all this information to the event
>>>>>> publisher?
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> Can you summarize the work we have done so far and the remaining
>>>>>>> work items please?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> I have updated the repo [2], and up-to-date documents can be found
>>>>>>>> at [1]. Thank you.
>>>>>>>> regards,
>>>>>>>> Mahesh.
>>>>>>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>>>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>> From: Mahesh Dananjaya <[email protected]>
>>>>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic
>>>>>>>>> with online data for WSO2 Machine Learner
>>>>>>>>> To: Maheshakya Wijewardena <[email protected]>
>>>>>>>>>
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> The new query looks like this, adding support for moving window
>>>>>>>>> methods.
>>>>>>>>>
>>>>>>>>> @Import('LinRegInput:1.0.1')
>>>>>>>>> define stream LinRegInput (salary double, rbi double, walks
>>>>>>>>> double, strikeouts double, errors double);
>>>>>>>>>
>>>>>>>>> @Export('LinRegOutput:1.0.1')
>>>>>>>>> define stream LinRegOutput (mse double);
>>>>>>>>>
>>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0,
>>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>> select *
>>>>>>>>> insert into mse;
>>>>>>>>>
>>>>>>>>> 1=learnType
>>>>>>>>> 2=windowShift
>>>>>>>>> 4=batchSize.......
>>>>>>>>>
>>>>>>>>> windowShift is added to configure the amount of shift. I have
>>>>>>>>> added log.info(mse) to view the MSE.
>>>>>>>>> Mahesh.
>>>>>>>>>
>>>>>>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>
>>>>>>>>>> If you are installing features from the new p2 repo into a new
>>>>>>>>>> CEP pack, then you won't need to replace those jars.
>>>>>>>>>> If you have already installed those in the CEP from a previous
>>>>>>>>>> p2 repo, then you have to uninstall those features and reinstall
>>>>>>>>>> with the new p2 repo.
>>>>>>>>>> But you don't need to do this because you can just replace the
>>>>>>>>>> jar. It's easy.
>>>>>>>>>>
>>>>>>>>>> Best regards.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>> If I build carbon-ml and then product-ml, and point the new p2
>>>>>>>>>>> repository to the CEP features, do I need to copy that
>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension1.1..... thing into
>>>>>>>>>>> cep_home/repository/component/... place?
>>>>>>>>>>> regards,
>>>>>>>>>>> Mahesh.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In MLModelhandler there's persistModel method
>>>>>>>>>>>> debug that method while trying to train a model from ML
>>>>>>>>>>>> you can see the steps it takes
>>>>>>>>>>>> don't use deep learning algorithm
>>>>>>>>>>>> any other algorithm would work
>>>>>>>>>>>> from line 777 is the section for creating the serializable
>>>>>>>>>>>> object from trained model and saving it
>>>>>>>>>>>>
>>>>>>>>>>>> I think you don't need to directly use ML model handler
>>>>>>>>>>>> you need to use the code in that for persisting models in the
>>>>>>>>>>>> streaming algorithm
>>>>>>>>>>>> so you can add a utils class in the streaming folder
>>>>>>>>>>>> then add the persisting logic there
>>>>>>>>>>>> ignore the deeplearning section in that
>>>>>>>>>>>> only focus on persisting spark mod
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>> I pushed the StreamingLinearRegression modules into my forked
>>>>>>>>>>>>> carbon-ml repo at branch wso2_gsoc_ml6_cml [1]. I am working
>>>>>>>>>>>>> on persisting the model. Thank you.
>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> yes
>>>>>>>>>>>>>> you should develop in the forked repo
>>>>>>>>>>>>>> clone your forked repo
>>>>>>>>>>>>>> then go into that
>>>>>>>>>>>>>> then add upstream repo as original wso2 repo
>>>>>>>>>>>>>> see the remote tracking branches by
>>>>>>>>>>>>>> git remote -v
>>>>>>>>>>>>>> you will see the origin as your forked repo
>>>>>>>>>>>>>> to add upstream
>>>>>>>>>>>>>> git remote add upstream <wso2 repo>
>>>>>>>>>>>>>> when you change something create a new branch by
>>>>>>>>>>>>>> git checkout -b new_branch_name
>>>>>>>>>>>>>> then add and commit to this branch
>>>>>>>>>>>>>> after that push to the fork by
>>>>>>>>>>>>>> git push origin new_branch_name
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>> The above error was due to a simple mistake: not providing
>>>>>>>>>>>>>>> my local p2 repo. Now it is working and I debugged the
>>>>>>>>>>>>>>> StreamingLinearRegression model in CEP.
>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>> I did what you recommended. But when I add the query,
>>>>>>>>>>>>>>>> the following error appears.
>>>>>>>>>>>>>>>> No extension exist for
>>>>>>>>>>>>>>>> StreamFunctionExtension{namespace='ml'} in execution plan
>>>>>>>>>>>>>>>> "NewExecutionPlan"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My query is as follows:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>>>>>>>>>>> define stream LinRegInput (salary double, rbi double, walks
>>>>>>>>>>>>>>>> double, strikeouts double, errors double);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>>>>>>>>>>> define stream LinRegOutput (mse double);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.00000001,
>>>>>>>>>>>>>>>> 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>>>>>>>>> select *
>>>>>>>>>>>>>>>> insert into mse;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have added my files as follows:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and added the following line to ml.siddhiext:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Then I built carbon-ml and replaced the jar file you asked
>>>>>>>>>>>>>>>> me to replace, with the name changed. Any thoughts?
>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You don't need to add a new p2 repo.
>>>>>>>>>>>>>>>>> In the <CEP_HOME>/repository/components/plugins folder,
>>>>>>>>>>>>>>>>> you will find
>>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension_some_version.jar.
>>>>>>>>>>>>>>>>> Replace this with
>>>>>>>>>>>>>>>>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>>>>>>>>>>>>>>>>> First rename this jar in the target folder to the jar name
>>>>>>>>>>>>>>>>> in the plugins folder, then replace it (make sure you do
>>>>>>>>>>>>>>>>> this, otherwise it will not work).
>>>>>>>>>>>>>>>>> Your updates will be there in the CEP after this.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>> Do I need to add the local p2 repos of ML into CEP after I
>>>>>>>>>>>>>>>>>> make changes to the ML extensions, or will it be updated
>>>>>>>>>>>>>>>>>> automatically? I am trying to debug my extension with the
>>>>>>>>>>>>>>>>>> CEP. Thank you.
>>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Mahesh, when you add your work to carbon-ml follow the
>>>>>>>>>>>>>>>>>>> guidelines below; it will help to keep the code clean.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - Add only the source code files you have newly
>>>>>>>>>>>>>>>>>>>    added or changed.
>>>>>>>>>>>>>>>>>>>    - Do not use the add . (add all) command in git. Only
>>>>>>>>>>>>>>>>>>>    use add filename
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have seen in your gsoc repo that there are gitignore
>>>>>>>>>>>>>>>>>>> files, idea related files and the target folder is there.
>>>>>>>>>>>>>>>>>>> These should not be in the source code, only the source
>>>>>>>>>>>>>>>>>>> files you add.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - Commit when you have done some major activity. Do
>>>>>>>>>>>>>>>>>>>    not add commits always when you make a change.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>>>> May I put the classes separately into ml and extensions
>>>>>>>>>>>>>>>>>>>> in carbon-core? I can put the Streaming Extensions into
>>>>>>>>>>>>>>>>>>>> extensions, and Algorithms/StreamingLinear Regression
>>>>>>>>>>>>>>>>>>>> and StreamingKMeans into ml core. What is the suitable
>>>>>>>>>>>>>>>>>>>> format? I will commit my changes today as a separate
>>>>>>>>>>>>>>>>>>>> branch in my forked carbon-ml local repo. Thank you.
>>>>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>>> p.s: better if you can meet me via hangout.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Pruthuvi Maheshakya Wijewardena
>>>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>>> +94711228855
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Dev mailing list
>>>>>>>> [email protected]
>>>>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>>>>
>>>>>>> --
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Team Lead - WSO2 Machine Learner
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/

--
Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
