Hi Mahesh,

What's the status of the project?

On Thu, Jul 14, 2016 at 10:28 AM, Mahesh Dananjaya <
[email protected]> wrote:

> Hi Maheshakya,
> I am building and running Samoa to see its functionality. Samoa still has
> limited algorithm support: it supports only classification and clustering
> on streams. It also uses a kind of StreamProcessor, like the one we use in
> the StreamProcessor extension. I got started with Samoa by referring to
> this page [1]. Then I ran a couple of examples to identify the flow. Samoa
> uses the Hadoop framework instead of Spark for distribution, but I am using
> it in local mode. Looking at the Samoa core, there are only a limited
> number of algorithms. IMO, if we are going to use Samoa we have to limit
> the functionality and algorithms [2]. The developer corner in [3] seems to
> describe something like the CEP extension that we are currently using. So
> although the algorithms in Samoa are limited, they have implemented
> streaming support for them. Therefore, if we integrate it into CEP, we
> have to look at how to handle streams and algorithms on the Samoa side. Is
> it acceptable on your side to have both Hadoop and Spark running in the
> background? Thank you.
> regards,
> Mahesh.
>
> [1] https://samoa.incubator.apache.org/documentation/Home.html
> [2]
> https://samoa.incubator.apache.org/documentation/api/current/index.html
>
>
> On Wed, Jun 22, 2016 at 11:51 AM, Mahesh Dananjaya <
> [email protected]> wrote:
>
>> Hi Maheshakya,
>> Can I give external data sources, such as data from a database or from
>> HDFS, to generate events in the CEP event simulator rather than giving a
>> file? I saw "Switch to upload file for simulation" under Input Data By
>> Data Source in the event simulator. How can I feed data in real time from
>> other sources, or directly as data generated by a remote server as JSON,
>> etc.? What format should the database be in? This is just for my
>> knowledge. Thank you.
>> regards,
>> Mahesh.
>>
>> On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <
>> [email protected]> wrote:
>>
>>> Hi Nirmal,
>>> *This is what I have done so far in GSOC2016:*
>>>
>>>    - Prior research on SGD (Stochastic Gradient Descent) optimization
>>>    techniques and mini-batch processing
>>>    - Getting familiar with Siddhi and writing extensions for it
>>>    - Wrote Stream Processor extensions for streaming applications and
>>>    machine learning algorithms (Linear Regression, KMeans & Logistic
>>>    Regression)
>>>    - Developed a Streaming Linear Regression class that periodically
>>>    retrains models as mini-batch processing with SGD
>>>    - Extended the functionality to Moving Window Mini Batch Processing
>>>    with SGD, providing windowShift, which controls the data horizon and
>>>    data obsolescence
>>>    - Performance evaluation of the implementation
>>>    - Adding the Streaming Linear Regression class and Stream Processor
>>>    extension to carbon-ml
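
For readers following the thread, the moving-window mini-batch idea in the list above can be sketched roughly as follows. This is a minimal Python illustration, not the carbon-ml implementation: the class name `MovingWindowSGDRegressor`, its parameters, and the exact retraining rule (one SGD pass over the buffered window once `batch_size` events have arrived, then dropping the oldest `window_shift` events) are all assumptions made for the example.

```python
from collections import deque


class MovingWindowSGDRegressor:
    """Illustrative moving-window mini-batch SGD for linear regression.

    Assumption (not taken from carbon-ml): when the window holds
    `batch_size` events, the model makes one SGD pass over the window,
    then the oldest `window_shift` events are discarded.
    """

    def __init__(self, num_features, batch_size, window_shift, step_size=0.01):
        if not 0 < window_shift <= batch_size:
            raise ValueError("window_shift must be between 1 and batch_size")
        self.weights = [0.0] * num_features
        self.intercept = 0.0
        self.batch_size = batch_size
        self.window_shift = window_shift
        self.step_size = step_size
        self.window = deque()

    def predict(self, features):
        return self.intercept + sum(w * x for w, x in zip(self.weights, features))

    def _window_mse(self):
        errors = [(self.predict(x) - y) ** 2 for x, y in self.window]
        return sum(errors) / len(errors)

    def add_event(self, features, label):
        """Buffer one event. Returns the window MSE after retraining,
        or None while the window is still filling up."""
        self.window.append((list(features), label))
        if len(self.window) < self.batch_size:
            return None
        # One SGD pass over the current window (squared-error loss).
        for x, y in self.window:
            error = self.predict(x) - y
            self.intercept -= self.step_size * error
            self.weights = [w - self.step_size * error * xi
                            for w, xi in zip(self.weights, x)]
        mse = self._window_mse()
        # Slide the window: the oldest events become obsolete.
        for _ in range(self.window_shift):
            self.window.popleft()
        return mse
```

With `batch_size=4` and `window_shift=2`, each retraining reuses the two most recent events from the previous window, so a smaller shift keeps older data in play for longer.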
>>>
>>>
>>> *As a next step,*
>>>
>>>    - Persisting temporal models for applications such as
>>>    prediction
>>>    - Complete the Streaming KMeans clustering and Logistic Regression
>>>    classes
>>>    - Improve batching and streaming mechanisms
>>>    - Improve visualization (optional)
>>>    - Write examples and documentation
>>>
>>> regards,
>>>
>>> Mahesh.
>>>
>>> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
>>> [email protected]> wrote:
>>>
>>>> Sorry, you need to put the returned values of the function into the
>>>> output stream
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>> select mse
>>>> insert into LinregOutput;
>>>> or
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>> select *
>>>> insert into LinregOutput;
>>>>
>>>> where LinregOutput stream definition contains all attributes: mse,
>>>> intercept, beta1, ....
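
As an illustration of the point above, the sketch below builds a stream definition that lists every value returned by the function. It is a hypothetical Python helper, not part of Siddhi or carbon-ml: the function name `output_stream_definition` is made up here, and it assumes every returned attribute is of type double, as mse is in the thread.

```python
def output_stream_definition(stream_name, num_coefficients):
    """Build a Siddhi-style stream definition covering mse, intercept,
    and beta1..betaN. Assumes every attribute is a double (hypothetical)."""
    attributes = ["mse", "intercept"]
    attributes += [f"beta{i}" for i in range(1, num_coefficients + 1)]
    columns = ", ".join(f"{name} double" for name in attributes)
    return f"define stream {stream_name} ({columns});"


# Example: a model with two coefficients.
print(output_stream_definition("LinregOutput", 2))
```

The number of coefficients has to match what the extension actually emits, so the definition should be checked against the outputData array rather than guessed.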
>>>>
>>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> In your output stream, you need to list all the attributes that are
>>>>> returned from the streamlinreg function: mse, intercept, beta1, ....
>>>>> Can you try that?
>>>>>
>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> This is the full query i used.
>>>>>>
>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>
>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>> strikeouts double, errors double);
>>>>>>
>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>
>>>>>> define stream LinregOutput (mse double);
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>>
>>>>>> select *
>>>>>> insert into mse;
>>>>>>
>>>>>> But I am sending [mse,intercept,beta1....betap] as an outputData
>>>>>> Object[]. So how can I publish all this information to the event
>>>>>> publisher?
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> Can you summarize the work we have done so far and the remaining
>>>>>>> work items please?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> I have updated the repo [2], and up-to-date documents can be found
>>>>>>>> at [1]. Thank you.
>>>>>>>> regards,
>>>>>>>> Mahesh.
>>>>>>>> [1]
>>>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>>>> [2]
>>>>>>>> https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>> From: Mahesh Dananjaya <[email protected]>
>>>>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic
>>>>>>>>> with online data for WSO2 Machine Learner
>>>>>>>>> To: Maheshakya Wijewardena <[email protected]>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> The new query looks like this, adding support for moving window
>>>>>>>>> methods.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> @Import('LinRegInput:1.0.1')
>>>>>>>>> define stream LinRegInput (salary double, rbi double, walks
>>>>>>>>> double, strikeouts double, errors double);
>>>>>>>>>
>>>>>>>>> @Export('LinRegOutput:1.0.1')
>>>>>>>>> define stream LinRegOutput (mse double);
>>>>>>>>>
>>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.00000001, 1.0,
>>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>> select *
>>>>>>>>> insert into mse;
>>>>>>>>> 1=learnType
>>>>>>>>> 2=windowShift
>>>>>>>>> 4=batchSize.......
>>>>>>>>>
>>>>>>>>> windowShift is added to configure the amount of shift. I have
>>>>>>>>> added log.info(mse) to view the MSE.
>>>>>>>>> Mahesh.
>>>>>>>>>
>>>>>>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>
>>>>>>>>>> If you are installing features from the new p2 repo into a new CEP
>>>>>>>>>> pack, then you won't need to replace those jars.
>>>>>>>>>> If you have already installed those in the CEP from a previous
>>>>>>>>>> p2 repo, then you have to uninstall those features and reinstall
>>>>>>>>>> them with the new p2 repo. But you don't need to do this, because
>>>>>>>>>> you can just replace the jar. It's easy.
>>>>>>>>>>
>>>>>>>>>> Best regards.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>> If I build carbon-ml and then product-ml, and point the new p2
>>>>>>>>>>> repository to the CEP features, do I need to copy that
>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension1.1..... thing into the
>>>>>>>>>>> cep_home/repository/component/... place?
>>>>>>>>>>> regards,
>>>>>>>>>>> Mahesh.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In MLModelhandler there's persistModel method
>>>>>>>>>>>> debug that method while trying to train a model from ML
>>>>>>>>>>>> you can see the steps it takes
>>>>>>>>>>>> don't use deep learning algorithm
>>>>>>>>>>>> any other algorithm would work
>>>>>>>>>>>> from line 777 is the section for creating the serializable
>>>>>>>>>>>> object from trained model and saving it
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think you don't need to directly use ML model handler
>>>>>>>>>>>> you need to use the code in that for persisting models in the
>>>>>>>>>>>> streaming algorithm
>>>>>>>>>>>> so you can add a utils class in the streaming folder
>>>>>>>>>>>> then add the persisting logic there
>>>>>>>>>>>> ignore the deep learning section in that
>>>>>>>>>>>> only focus on persisting spark mod
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>> I pushed the StreamingLinearRegression modules into my forked
>>>>>>>>>>>>> carbon-ml repo at branch wso2_gsoc_ml6_cml [1]. I am working on
>>>>>>>>>>>>> persisting the model. Thank you.
>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> yes
>>>>>>>>>>>>>> you should develop in the forked repo
>>>>>>>>>>>>>> clone your forked repo
>>>>>>>>>>>>>> then go into that
>>>>>>>>>>>>>> then add upstream repo as original wso2 repo
>>>>>>>>>>>>>> see the remote tracking branches by
>>>>>>>>>>>>>> git remote -v
>>>>>>>>>>>>>> you will see the origin as your forked repo
>>>>>>>>>>>>>> to add upstream
>>>>>>>>>>>>>> git remote add upstream <wso2 repo>
>>>>>>>>>>>>>> when you change something create a new branch by
>>>>>>>>>>>>>> git checkout -b new_branch_name
>>>>>>>>>>>>>> then add and commit to this branch
>>>>>>>>>>>>>> after that push to the forked repo by
>>>>>>>>>>>>>> git push origin new_branch_name
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>> The above error was due to a simple mistake: I had not provided
>>>>>>>>>>>>>>> my local p2 repo. Now it is working, and I debugged the
>>>>>>>>>>>>>>> StreamingLinearRegression model in CEP.
>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>> I did what you recommended. But when I add the query,
>>>>>>>>>>>>>>>> the following error appears:
>>>>>>>>>>>>>>>> No extension exist for
>>>>>>>>>>>>>>>> StreamFunctionExtension{namespace='ml'} in execution plan
>>>>>>>>>>>>>>>> "NewExecutionPlan"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My query is as follows:
>>>>>>>>>>>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>>>>>>>>>>> define stream LinRegInput (salary double, rbi double, walks
>>>>>>>>>>>>>>>> double, strikeouts double, errors double);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>>>>>>>>>>> define stream LinRegOutput (mse double);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.00000001,
>>>>>>>>>>>>>>>> 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>>>>>>>>> select *
>>>>>>>>>>>>>>>> insert into mse;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have added my files as follows,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and added the following line to ml.siddhiext:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Then I built carbon-ml and replaced the jar file you asked me
>>>>>>>>>>>>>>>> to replace, with the name changed. Any thoughts?
>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You don't need to add a new p2 repo.
>>>>>>>>>>>>>>>>> In the <CEP_HOME>/repository/components/plugins folder,
>>>>>>>>>>>>>>>>> you will find
>>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace
>>>>>>>>>>>>>>>>> this with
>>>>>>>>>>>>>>>>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>>>>>>>>>>>>>>>>> First rename this jar in the target folder to the jar name in
>>>>>>>>>>>>>>>>> the plugins folder, then replace it (make sure you rename it,
>>>>>>>>>>>>>>>>> otherwise it will not work).
>>>>>>>>>>>>>>>>> Your updates will be there in the CEP after this.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>> Do I need to add the local p2 repos of ML into CEP after I
>>>>>>>>>>>>>>>>>> make changes to the ML extensions, or will it be updated
>>>>>>>>>>>>>>>>>> automatically? I am trying to debug my extension with the
>>>>>>>>>>>>>>>>>> CEP. Thank you.
>>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Mahesh, when you add your work to carbon-ml, follow the
>>>>>>>>>>>>>>>>>>> guidelines below; it will help to keep the code clean.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - Add only the source code files you have newly
>>>>>>>>>>>>>>>>>>>    added or changed.
>>>>>>>>>>>>>>>>>>>    - Do not use the add . (add all) command in git. Only
>>>>>>>>>>>>>>>>>>>    use add filename.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have seen in your gsoc repo that there are gitignore
>>>>>>>>>>>>>>>>>>> files, IDEA-related files, and the target folder. These
>>>>>>>>>>>>>>>>>>> should not be in the source code, only the source files
>>>>>>>>>>>>>>>>>>> you add.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>    - Commit when you have completed some major activity. Do
>>>>>>>>>>>>>>>>>>>    not add a commit every time you make a change.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>>>> May I put the classes separately into ml and extensions
>>>>>>>>>>>>>>>>>>>> in carbon-core? I can put the Streaming Extensions in
>>>>>>>>>>>>>>>>>>>> extensions, and Algorithms/StreamingLinear Regression and
>>>>>>>>>>>>>>>>>>>> StreamingKMeans in ml core. What is the suitable format?
>>>>>>>>>>>>>>>>>>>> I will commit my changes today as a separate branch
>>>>>>>>>>>>>>>>>>>> in my forked carbon-ml local repo. Thank you.
>>>>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>>> P.S. It would be better if you could meet me via Hangout.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Pruthuvi Maheshakya Wijewardena
>>>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>>> +94711228855
>>>>>>>> _______________________________________________
>>>>>>>> Dev mailing list
>>>>>>>> [email protected]
>>>>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Team Lead - WSO2 Machine Learner
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>>


-- 

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
