Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-26 Thread Maheshakya Wijewardena
> wrote:
>>>>
>>>>> Hi Nirmal,
>>>>> *This is what I have done so far in GSOC2016:*
>>>>>
>>>>>- Prior research on SGD (Stochastic Gradient Descent)
>>>>>optimization techniques and mini-batch processing
>>>>>- Getting familiar with Siddhi and writing extensions for it
>>>>>- Wrote Stream Processor extensions for streaming applications
>>>>>and machine learning algorithms (Linear Regression, KMeans & Logistic
>>>>>Regression)
>>>>>- Developed a Streaming Linear Regression class that periodically
>>>>>retrains models using mini-batch processing with SGD
>>>>>    - Extended the functionality to Moving Window Mini Batch Processing
>>>>>with SGD, providing windowShift, which controls the data horizon and data
>>>>>obsolescence
>>>>>- Performance evaluation of the implementation
>>>>>- Adding the Streaming Linear Regression class and Stream Processor
>>>>>extension to carbon-ml
>>>>>
>>>>>
>>>>> *As a next step,*
>>>>>
>>>>>- Add persistence of temporal models for applications such as
>>>>>prediction
>>>>>- Complete the Streaming KMeans clustering and Logistic Regression
>>>>>classes
>>>>>- Improve batching and streaming mechanisms
>>>>>- Improve visualization (optional)
>>>>>- Write examples and documentation
>>>>>
>>>>> regards,
>>>>>
>>>>> Mahesh.
>>>>>
>>>>> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Sorry, you need to put the returned values of the function into the
>>>>>> output stream
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>>
>>>>>>
>>>>>>
>>>>>> select mse
>>>>>> insert into LinregOutput;
>>>>>> or
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>> select *
>>>>>> insert into LinregOutput;
>>>>>>
>>>>>> where LinregOutput stream definition contains all attributes: mse,
>>>>>> intercept, beta1, 
>>>>>>
>>>>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> In your output stream, you need to list all the attributes that are
>>>>>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>>>>>> Can you try that?
>>>>>>>
>>>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> This is the full query i used.
>>>>>>>>
>>>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>>>
>>>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>>>> strikeouts double, errors double);
>>>>>>>>
>>>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>>>
>>>>>>>> define stream LinregOutput (mse double);
>>>>>>>>
>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0,
>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>
>>>>>>>> select *
>>>>>>>> insert into mse;
>>>>>>>>
>>>>>>>> But I am sending [mse,intercept,beta1betap] as the outputData
>>>>>>>> Object[]. So how can I publish all this information through the
>>>>>>>> event publisher?
>>>>>>>> regards,
>>>>>>>> Mahesh.
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-17 Thread Maheshakya Wijewardena
Hi Mahesh,

Can you please share your SAMOA project?

On Sun, Jul 17, 2016 at 11:19 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
> Date: Sun, Jul 17, 2016 at 11:18 AM
> Subject: Re: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online
> data for WSO2 Machine Learner-Samoa Integration
> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>
>
> Hi Maheshakya,
> I just need a little help. In SAMOA, when we want to run a task, we use
> commands like these [1]:
> 1. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
> "ClusteringEvaluation"
> 2. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
> "PrequentialEvaluation -d /tmp/dump.csv -i 100 -f 10 -l
> (classifiers.trees.VerticalHoeffdingTree -p 4) -s
> (generators.RandomTreeGenerator -c 2 -o 10 -u 10)"
>
> What this does is call a class named LocalDoTask [4] and pass this string as
> an argument. After that, LocalDoTask calls the relevant task, such as
> ClusteringEvaluation or PrequentialEvaluation [2].
>
> Now I have added the SAMOA dependencies to my new Maven project. Earlier I
> used the original SAMOA source to write and test examples; now I want to
> move them into my new Java project with SAMOA dependencies. I added the
> dependency and it built fine. Now I am calling my local DoTask.java [3]
> the same way I did with SAMOA:
> java -cp target/streaming-1.0-SNAPSHOT.jar org.gsoc.samoa.streaming.DoTask
> "org.gsoc.samoa.streaming.ClusteringEvaluation"
> But it seems I have gone wrong somewhere:
> Error: A JNI error has occurred, please check your installation and try
> again
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/samoa/topology/ComponentFactory
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at
> sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> at
> sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.samoa.topology.ComponentFactory
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 7 more
>
>
> Can I actually call the task like this?
>
> BR,
> Mahesh.
>
> [1]
> https://samoa.incubator.apache.org/documentation/Prequential-Evaluation-Task.html
> [2]
> https://github.com/apache/incubator-samoa/blob/releases/0.4.0-incubating-RC0/samoa-api/src/main/java/org/apache/samoa/tasks/ClusteringEvaluation.java
> [3]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
> [4]
> https://github.com/apache/incubator-samoa/tree/releases/0.4.0-incubating-RC0/samoa-local/src/main/java/org/apache/samoa
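
A note on the stack trace above: `NoClassDefFoundError: org/apache/samoa/topology/ComponentFactory` means the JVM could not find that class on the classpath given by `-cp`, which here lists only `target/streaming-1.0-SNAPSHOT.jar`. A plain Maven jar usually does not bundle its dependencies, so one quick check is whether the class is inside the jar at all. The sketch below is a minimal Python helper for that check (jar files are zip archives). The function name `jar_contains_class` is hypothetical and not part of SAMOA or the project.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds a .class entry for the given class name.

    class_name uses dots, e.g. "org.apache.samoa.topology.ComponentFactory".
    This helper is illustrative only; it is not part of SAMOA.
    """
    # Class files are stored in the jar under a slash-separated path.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

If `jar_contains_class("target/streaming-1.0-SNAPSHOT.jar", "org.apache.samoa.topology.ComponentFactory")` returns False, the SAMOA jars most likely also need to be listed on the classpath (or bundled into the jar) before the task can be launched this way.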
>
>
> On Thu, Jul 14, 2016 at 3:47 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi srinath,
>> Sure, I am working on it. Thank you.
>> regards,
>> Mahesh.
>>
>> On Thu, Jul 14, 2016 at 11:12 AM, Srinath Perera <srin...@wso2.com>
>> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Let's focus on getting SAMOA to work with CEP. It is OK to be limited to
>>> a few algorithms.
>>>
>>> --Srinath
>>>
>>> On Thu, Jul 14, 2016 at 10:49 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I think we can build new SAMOA tasks [1], similar to an execution plan
>>>> in CEP. I will try to build one.
>>>> regards,
>>>> Mahesh.
>>>> [1]
>>>> https://samoa.incubator.apache.org/documentation/Developing-New-Tasks-in-SAMOA.html
>>>>
>>>>
>>>> On Thu, Jul 14, 2016 at 10:35 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I am building and running SAMOA to see its functionality. SAMOA
>>>>> still has limited algorithm support: it supports only
>>>>> classification and clustering with streams. It also uses a kind of
>>>>> StreamProcessor, 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Maheshakya Wijewardena
Sorry, you need to put the returned values of the function into the output
stream

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)



select mse
insert into LinregOutput;
or

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)
select *
insert into LinregOutput;

where LinregOutput stream definition contains all attributes: mse,
intercept, beta1, 
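
To make the point about the output stream concrete, here is a small Python sketch that builds a Siddhi `define stream` line listing one `double` attribute per returned value. It assumes the function returns `mse`, `intercept`, and then one coefficient per feature named `beta1`, `beta2`, and so on. That naming and the helper `output_stream_definition` are assumptions for illustration, not something confirmed by the extension's code.

```python
def output_stream_definition(stream_name, num_features):
    """Build a Siddhi stream definition with one double attribute per output.

    Assumed output order: mse, intercept, beta1 .. beta<num_features>.
    """
    attributes = ["mse", "intercept"]
    attributes += [f"beta{i}" for i in range(1, num_features + 1)]
    columns = ", ".join(f"{name} double" for name in attributes)
    return f"define stream {stream_name} ({columns});"


# Example: a model with two features.
print(output_stream_definition("LinregOutput", 2))
```

Whether the input attributes must also appear in the output stream depends on how the extension populates events, which this sketch does not cover.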

On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> In your output stream, you need to list all the attributes that are
> returned from the streamlinreg function: mse, intercept, beta1, 
> Can you try that?
>
> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> This is the full query i used.
>>
>> @Import('LinRegInput:1.0.0')
>>
>> define stream LinRegInput (salary double, rbi double, walks double,
>> strikeouts double, errors double);
>>
>> @Export('LinRegOutput:1.0.0')
>>
>> define stream LinregOutput (mse double);
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>>
>> select *
>> insert into mse;
>>
>> But I am sending [mse,intercept,beta1betap] as the outputData Object[].
>> So how can I publish all this information through the event publisher?
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Can you summarize the work we have done so far and the remaining work
>>> items please?
>>>
>>> Thanks.
>>>
>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I have updated the repo [2], and up-to-date documents can be found at
>>>> [1]. Thank you.
>>>> regards,
>>>> Mahesh.
>>>> [1]
>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>
>>>>
>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>>>> online data for WSO2 Machine Learner
>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>
>>>>>
>>>>> Hi Maheshakya,
>>>>> The new query looks like this, adding support for moving window methods.
>>>>>
>>>>>
>>>>> @Import('LinRegInput:1.0.1')
>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>> strikeouts double, errors double);
>>>>>
>>>>> @Export('LinRegOutput:1.0.1')
>>>>> define stream LinRegOutput (mse double);
>>>>>
>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>> salary, rbi, walks, strikeouts, errors)
>>>>> select *
>>>>> insert into mse;
>>>>> 1=learnType
>>>>> 2=windowShift
>>>>> 4=batchSize...
>>>>>
>>>>> windowShift is added to configure the amount of shift. I have added
>>>>> log.info(mse) to view the MSE.
>>>>> Mahesh.
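
The interplay of `batchSize` and `windowShift` in the query above can be sketched in a few lines of Python. The sketch assumes that a model is retrained each time `batch_size` events are buffered, and that the oldest `window_shift` events are then dropped so consecutive windows overlap. This is an interpretation of the description in this thread, not the extension's actual code, and `moving_windows` is a hypothetical name.

```python
def moving_windows(events, batch_size, window_shift):
    """Yield overlapping mini-batches from a stream of events.

    Each yielded window holds batch_size events; after a window is emitted,
    the oldest window_shift events are discarded before buffering continues.
    """
    if batch_size < 1 or window_shift < 1:
        raise ValueError("batch_size and window_shift must be at least 1")
    window = []
    for event in events:
        window.append(event)
        if len(window) == batch_size:
            yield list(window)
            # Drop the oldest events; the rest are reused by the next window.
            window = window[window_shift:]


# batchSize = 4, windowShift = 2, as in the query above.
for batch in moving_windows(range(1, 9), 4, 2):
    print(batch)
```

With `window_shift` equal to `batch_size` the windows do not overlap, which corresponds to plain mini-batch processing.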
>>>>>
>>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> If you are installing features from the new p2 repo into a new CEP pack,
>>>>>> then you won't need to replace those jars.
>>>>>> If you have already installed those in the CEP from a previous
>>>>>> p2 repo, then you would have to uninstall those features and reinstall
>>>>>> them from the new p2 repo. But you don't need to do this because you can
>>>>>> just replace the jar. It's easy.
>>>>>>
>>>>>> Best regards.

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Maheshakya Wijewardena
Hi Mahesh,

In your output stream, you need to list all the attributes that are
returned from the streamlinreg function: mse, intercept, beta1, 
Can you try that?

On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> This is the full query i used.
>
> @Import('LinRegInput:1.0.0')
>
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
>
> define stream LinregOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
>
> select *
> insert into mse;
>
> But I am sending [mse,intercept,beta1betap] as the outputData Object[].
> So how can I publish all this information through the event publisher?
> regards,
> Mahesh.
>
> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Can you summarize the work we have done so far and the remaining work
>> items please?
>>
>> Thanks.
>>
>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have updated the repo [2], and up-to-date documents can be found at
>>> [1]. Thank you.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>
>>>
>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>>
>>>> -- Forwarded message --
>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>>> online data for WSO2 Machine Learner
>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>
>>>>
>>>> Hi Maheshakya,
>>>> The new query looks like this, adding support for moving window methods.
>>>>
>>>>
>>>> @Import('LinRegInput:1.0.1')
>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>> strikeouts double, errors double);
>>>>
>>>> @Export('LinRegOutput:1.0.1')
>>>> define stream LinRegOutput (mse double);
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>> select *
>>>> insert into mse;
>>>> 1=learnType
>>>> 2=windowShift
>>>> 4=batchSize...
>>>>
>>>> windowShift is added to configure the amount of shift. I have added
>>>> log.info(mse) to view the MSE.
>>>> Mahesh.
>>>>
>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> If you are installing features from the new p2 repo into a new CEP pack,
>>>>> then you won't need to replace those jars.
>>>>> If you have already installed those in the CEP from a previous
>>>>> p2 repo, then you would have to uninstall those features and reinstall
>>>>> them from the new p2 repo. But you don't need to do this because you can
>>>>> just replace the jar. It's easy.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> If I build carbon-ml and then product-ml, and point the new p2 repository
>>>>>> to the CEP features, do I need to copy that
>>>>>> org.wso2.carbon.ml.siddhi.extension1.1. jar into
>>>>>> cep_home/repository/component/...?
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> In MLModelhandler there's persistModel method
>>>>>>> debug that method while trying to train a model from ML
>>>>>>> you can see the steps it takes
>>>>>>> don't use deep

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Maheshakya Wijewardena
Hi Mahesh,

If you are installing features from the new p2 repo into a new CEP pack, then
you won't need to replace those jars.
If you have already installed those in the CEP from a previous p2 repo,
then you would have to uninstall those features and reinstall them from the
new p2 repo. But you don't need to do this because you can just replace the
jar. It's easy.

Best regards.

On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> If I build carbon-ml and then product-ml, and point the new p2 repository to
> the CEP features, do I need to copy that
> org.wso2.carbon.ml.siddhi.extension1.1. jar into
> cep_home/repository/component/...?
> regards,
> Mahesh.
>
> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> In MLModelhandler there's persistModel method
>> debug that method while trying to train a model from ML
>> you can see the steps it takes
>> don't use deep learning algorithm
>> any other algorithm would work
>> from line 777 is the section for creating the serializable object from
>> trained model and saving it
>>
>>
>> I think you don't need to directly use ML model handler
>> you need to use the code in that for persisting models in the streaming
>> algorithm
>> so you can add a utils class in the streaming folder
>> then add the persisting logic there
>> ignore the deeplearning section in that
>> only focus on persisting spark mod
>>
>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I pushed the StreamingLinearRegression modules into my forked carbon-ml
>>> repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>>> model.thank you.
>>> Mahesh.
>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>
>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> yes
>>>> you should develop in the forked repo
>>>> clone your forked repo
>>>> then go into that
>>>> then add the original wso2 repo as the upstream repo
>>>> see the remote tracking branches by
>>>> git remote -v
>>>> you will see the origin as your forked repo
>>>> to add upstream
>>>> git remote add upstream 
>>>> when you change something create a new branch by
>>>> git checkout -b new_branch_name
>>>> then add and commit to this branch
>>>> after that push to the forked by
>>>> git push origin new_branch_name
>>>>
>>>> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> The above error was due to a simple mistake: I had not provided my local
>>>>> p2 repo. Now it is working, and I debugged the StreamingLinearRegression
>>>>> model in CEP.
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I did what you recommended. But when I add the query, the
>>>>>> following error appears:
>>>>>> No extension exist for StreamFunctionExtension{namespace='ml'} in
>>>>>> execution plan "NewExecutionPlan"
>>>>>>
>>>>>> My query is as follows:
>>>>>> @Import('LinRegInput:1.0.0')
>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>> strikeouts double, errors double);
>>>>>>
>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>> define stream LinRegOutput (mse double);
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>> select *
>>>>>> insert into mse;
>>>>>>
>>>>>> I have added my files as follows,
>>>>>>
>>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>>>>>
>>>>>> and add following lines to ml.siddhiext
>>>>>>
>>>>>> streamlinreg=org

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Maheshakya Wijewardena
Hi Mahesh,

You don't need to add a new p2 repo.
In the /repository/components/plugins folder, you will find
org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
First rename the jar from the target folder to the name of the jar in the
plugins folder, then replace it (make sure the names match, otherwise it will
not work).
Your updates will be in the CEP after this.
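
The rename-then-replace step can be scripted. The following Python sketch copies the freshly built jar over the installed plugin jar while keeping the installed file name, which is the part that must match. The helper name `replace_plugin_jar` and the assumption that exactly one jar in the plugins folder starts with `org.wso2.carbon.ml.siddhi.extension_` are illustrative, so adjust the paths for your own CEP pack.

```python
import glob
import os
import shutil


def replace_plugin_jar(built_jar, plugins_dir,
                       prefix="org.wso2.carbon.ml.siddhi.extension"):
    """Overwrite the installed extension jar with a newly built one.

    The installed file keeps its original name (prefix_<version>.jar), so the
    CEP still finds it after the contents are replaced. Returns the path of
    the jar that was overwritten.
    """
    matches = glob.glob(os.path.join(plugins_dir, prefix + "_*.jar"))
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one installed jar, found {len(matches)}")
    installed_jar = matches[0]
    # Copy contents only; the destination name is the installed jar's name.
    shutil.copyfile(built_jar, installed_jar)
    return installed_jar
```

Restarting the CEP server after the copy is likely needed for the new classes to be loaded.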

Best regards.

On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Do I need to add the local p2 repos of ML into CEP after I make changes to
> the ML extensions, or will they be updated automatically? I am trying to
> debug my extension with the CEP. Thank you.
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Mahesh, when you add your work to carbon-ml, follow the guidelines below;
>> they will help to keep the code clean.
>>
>>
>>- Add only the source code files you have newly added or changed.
>>- Do not use the add . (add all) command in git. Only use add filename.
>>
>> I have seen in your gsoc repo that there are gitignore files, IDEA-related
>> files, and the target folder. These should not be in the source code; add
>> only the source files.
>>
>>- Commit when you have completed some major activity. Do not commit
>>every time you make a change.
>>
>>
>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> May I put the classes separately into ml and extensions in carbon-core? I
>>> can put the Streaming Extensions in extensions and
>>> Algorithms/StreamingLinearRegression and StreamingKMeans in ml core. What
>>> is the suitable layout? I will commit my changes today as a separate
>>> branch in my forked carbon-ml local repo. Thank you.
>>> regards,
>>> Mahesh.
>>> p.s: better if you can meet me via hangout.
>>>
>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Maheshakya Wijewardena
Hi Mahesh,

You can add a new folder for streaming algorithms in the siddhi extension.
There, keep stream processors and the algorithms classes separately.

We can arrange a hangout tomorrow.

Best regards.

On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> May I put the classes separately into ml and extensions in carbon-core? I
> can put the Streaming Extensions in extensions and
> Algorithms/StreamingLinearRegression and StreamingKMeans in ml core. What
> is the suitable layout? I will commit my changes today as a separate
> branch in my forked carbon-ml local repo. Thank you.
> regards,
> Mahesh.
> p.s: better if you can meet me via hangout.
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-11 Thread Maheshakya Wijewardena
Hi Mahesh,

Regarding your question:

> my outputData Object[] array is in the format of
> [mse,beta0,beta1,betap]. But it seems that CEP does not understand it.


Did you create an output stream first for the publisher? You need to create
a stream with attributes: mse double, beta1 double, ... and
point to that from the publisher.
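
One way to picture why the stream definition matters: the values in the `outputData` array carry no names, so they can only be matched to stream attributes by position. The Python sketch below pairs a list of attribute names with an output array and fails when the counts differ. It assumes the mapping is purely positional; the helper `label_output` and the sample values are hypothetical.

```python
def label_output(attribute_names, output_data):
    """Pair each output value with the stream attribute at the same position."""
    if len(attribute_names) != len(output_data):
        raise ValueError(
            f"stream defines {len(attribute_names)} attributes "
            f"but the function returned {len(output_data)} values")
    return dict(zip(attribute_names, output_data))


# Illustrative values only: an output stream with three attributes.
print(label_output(["mse", "beta0", "beta1"], [0.5, 1.0, 2.0]))
```

A stream defined with only `mse double` would fail this check for any output array longer than one element, which matches the symptom described in the thread.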



On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshakya,
> You can find the details of the queries in this README [1]. I have made
> some changes, so the previous queries may not be valid. Please use the new
> queries in the README.
> *1.Streaming Linear regression*
> from LinRegInputStream#streaming:streaminglr((learnType),
> (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
> (ci), salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> *2.Streaming KMeans Clustering*
> from LinRegInputStream#streaming:streamingkm((learnType),
> (batchSize/timeFrame), (numClusters), (numIterations),(alpha), (ci),
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
>
>
> from
> KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)
> select *
> insert into regResults;
>
> I also need help returning the outputData of my program back to CEP.
> Therefore you currently may not find the stream output in the event
> publisher, but you can see the output in the console. I want to understand
> the final step of putting the output data back into the output stream after
> the batch is complete and the algorithm has finished. You may find that the
> following line throws an exception. I actually have no clue about the
> outputData format that needs to be given to the output stream.
>
> Object[] outputData = streamingLinearRegression.regress(eventData);
>
>
> if (outputData == null) {
>     streamEventChunk.remove();
> } else {
>     complexEventPopulater.populateComplexEvent(complexEvent, outputData);
> }
>
> My outputData Object[] array is in the format of
> [mse,beta0,beta1,betap], but it seems that CEP does not understand
> it. I wrote it by looking at the time series stream processor extension at
> [2]. Can you please help me with this?
> regards,
> Mahesh.
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
>
> On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Great work so far.
>>
>> Regarding the queries:
>>
>> streamingkm(0, 2,2,20,1,0.95, salary, rbi, walks, strikeouts, errors)
>>
>>
>> Can you give me the definitions of the first few parameters, in order?
>> Also, in the previous supervised case (linear regression), what is the
>> response variable, etc.?
>> I'll go through the code and give you feedback.
>>
>> After this, we need to move this implementation into the carbon-ml siddhi
>> extension. Please also do a similar implementation for logistic regression,
>> because we need to have a streaming version for classification as well.
>>
>> Best regards.
>>
>>
>>
>> On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> I have changed the siddhi query for our StreamingKMeansClustering by
>>> adding alpha, which we can use to control the data horizon (how
>>> quickly the most recent data point becomes a part of the model) and data
>>> obsolescence (how long it takes a past data point to become irrelevant
>>> to the model) in the streaming clustering algorithms. I have added the new
>>> changes to the repo [1], introducing the StreamingKMeansClusteringModel and
>>> StreamingKMeansClustering classes to the project. The new siddhi query is
>>> as follows.
>>>
>>> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95, salary, rbi,
>>> walks, strikeouts, errors)
>>>
>>> select *
>>> insert into regResults;
>>>
>>> regards,
>>> Mahesh.
>>>
>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
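
For readers unfamiliar with how a single alpha can govern both data horizon and data obsolescence, here is a minimal Python sketch of a "forgetful" cluster-center update, similar in spirit to the decay factor used in Spark MLlib's streaming k-means. Whether the project's alpha is wired in exactly this way is an assumption; `update_center` and its arguments are hypothetical names.

```python
def update_center(center, weight, batch_points, alpha):
    """Blend an existing cluster center with the mean of a new mini-batch.

    center       : list of floats, the current center
    weight       : effective number of points the center already represents
    batch_points : list of points (lists of floats) assigned to this cluster
    alpha        : decay in [0, 1]; 1 keeps all history, 0 forgets it
    Returns (new_center, new_weight).
    """
    m = len(batch_points)
    discounted = weight * alpha  # old data counts for less when alpha < 1
    if m == 0:
        return list(center), discounted
    dim = len(center)
    batch_mean = [sum(p[d] for p in batch_points) / m for d in range(dim)]
    total = discounted + m
    new_center = [
        (center[d] * discounted + batch_mean[d] * m) / total
        for d in range(dim)
    ]
    return new_center, total
```

With alpha = 1 every past point keeps its full weight, so new data moves the center slowly; with alpha = 0 the center jumps straight to the mean of the latest batch.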
>>>
>>> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-07 Thread Maheshakya Wijewardena
gKMeansClustering [1] for our
>>>> purposes and debugged them.thank you.
>>>> regards,
>>>> Mahesh.
>>>> [1]
>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>>>>
>>>> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga <sup...@wso2.com> wrote:
>>>>
>>>>> Thanks Mahesh! The graphs look promising! :)
>>>>>
>>>>> So by looking at the graph, LR with SGD can train a model within 60 secs
>>>>> (6*10^10 nanosecs), using about 900,000 data points. This means the online
>>>>> training can handle events/data points arriving at a rate of 15,000 per
>>>>> second (or more), if the batch size is set to 900,000 (or less) or the
>>>>> window size is set to 60 secs (or less). This is great IMO!
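
The arithmetic behind the 15,000 events per second figure can be checked with a short Python sketch. It uses only the numbers quoted in this message (about 900,000 points trained in 6*10^10 ns); the function names are illustrative.

```python
def ns_to_seconds(nanoseconds):
    """Convert a duration in nanoseconds to seconds."""
    return nanoseconds / 1e9


def sustainable_rate(points_per_batch, training_seconds):
    """Events per second that can arrive while one batch is being trained."""
    return points_per_batch / training_seconds


training_time = ns_to_seconds(6 * 10**10)          # 60.0 seconds
print(sustainable_rate(900_000, training_time))    # events per second
```

The estimate assumes training time is the only bottleneck and that it scales no worse than the measured point on the graph.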
>>>>>
>>>>> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> As you requested, I can change other parameters as well, such as
>>>>>> feature size (p). Initially I did it with p=3. Anyway, you can see
>>>>>> and run the code if you want. The source is at [1]. The timing test is
>>>>>> run with random data, as you requested, if you set args[0] to 1. And you
>>>>>> can find the extension and streaming algorithms in the gsoc/ directory
>>>>>> [2]. Thank you.
>>>>>> BR,
>>>>>> Mahesh.
>>>>>> [1]
>>>>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
>>>>>> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>>>>
>>>>>> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi supun,
>>>>>>> Though I pushed it yesterday, there were some problems with the
>>>>>>> network. Now you can see them in the repo location [1]. I added some
>>>>>>> Matlab plots; you can see the pattern there. You can use ml also. OK,
>>>>>>> sure thing. I can prepare a report or a blog post if you want. The files
>>>>>>> are as follows. The y axis is in ns and the x axis is the batch size. I
>>>>>>> also added two plots as JPEGs [2], so you can easily compare.
>>>>>>> lr_timing_1000.txt -> batch size incremented by 1000
>>>>>>> lr_timing_1.txt -> batch size incremented by 1
>>>>>>> lr_timing_power10.txt -> batch size incremented by power of 10
>>>>>>>
>>>>>>> Here the only independent variable is the batch size. If you want, I
>>>>>>> can send you results with other parameters, such as step size, number of
>>>>>>> iterations, and feature vector size, as independent variables. Please let
>>>>>>> me know if you want further info. Thank you.
>>>>>>> regards,
>>>>>>> Mahesh.
>>>>>>>
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>>>>>> [2]
>>>>>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>>>>>>
>>>>>>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga <sup...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Mahesh,
>>>>>>>>
>>>>>>>> I have added those timing reports to my repo [1].
>>>>>>>>
>>>>>>>> Whats the file name? :)
>>>>>>>>
>>>>>>>> Btw, can you compile a simple doc (gdoc) with the above results, and
>>>>>>>> bring everything to one place? That way it is easy to compare and keep
>>>>>>>> track.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Supun
>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-29 Thread Maheshakya Wijewardena
Hi Mahesh,

Thank you for the update. I will look into your implementation.

> And I will be able to send you the timing/performance analysis report
> tomorrow for the SGD functions
>

Great. Send those ASAP so that we can proceed.

Best regards.

On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

>
> Hi Maheshakya,
> I have implemented linear regression on a CEP siddhi event stream,
> taking batch sizes as parameters from CEP. Now we can try the
> moving window method too. Before that, I think I should get your opinion on
> data structures for saving the streaming data. Please check the /gsoc/
> folder in my repo [1]; there you can find all the new things I added, and
> in the extension folder you can find those extensions. And I will be able
> to send you the timing/performance analysis report tomorrow for the SGD
> functions. Thank you.
> regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
>
> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have written some Siddhi extensions and am trying to develop one of my
>> own. In the time series example in [1], can you please explain the input
>> format and the query lines in that example for my understanding?
>>
>> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks,
>> strikeouts, errors)
>> select *
>> insert into regResults;
>>
>> I just want to know how I give a set of data to this extension and what
>> baseballData is. Is it an input stream as usual, or a data file? How can I
>> find that data set to create a dummy input stream like baseballData?
>>
>> thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>
>> On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Today I got Siddhi and debugged the math extension. Then I made some
>>> changes and checked them. Now I am trying to write the same kind of
>>> extension in my code base, so I added the dependencies and it built fine.
>>> Now I am trying to debug my extension the same way I did in the previous
>>> case. CEP is sending data, but my extension is not hitting the relevant
>>> breakpoint.
>>> 1. So how can I debug my new Siddhi extension? (You can see it in my
>>> example repo.)
>>>
>>> I think if I do it correctly we can build the extension for our purpose.
>>> And I will send the relevant timing report of the SGD algorithms very soon,
>>> as Supun was asking. Thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Also note that there is a calculation interval in the Siddhi time
>>>> series regression function[1]. You may be able to get some insight into
>>>> this from that as well.
>>>>
>>>> [1] https://docs.wso2.com/display/CEP400/Regression
>>>>
>>>> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> As we discussed offline, we can use a similar mechanism to train linear
>>>>> regression models, logistic regression models and k-means clustering
>>>>> models.
>>>>>
>>>>>> It is very interesting that I have found something that we can make
>>>>>> use of in our work. In the CEP 4.0 documentation there is a Custom Stream
>>>>>> Processor Extension program [1]. There is an example of
>>>>>> LinearRegressionStreamProcessor [1].
>>>>>>
>>>>>
>>>>> As we have to train predictive models with Spark, you can write
>>>>> wrappers around the regression/clustering models of Spark. Refer to the
>>>>> Siddhi time series regression source code[1][2]. You can write a
>>>>> streaming linear regression class for ML in a similar fashion by wrapping
>>>>> Spark mllib implementations. You can use the methods "addEvent",
>>>>> "removeEvent", etc. (may have to be changed according to requirements)
>>>>> for a similar purpose.
>>>>> You can introduce trainLinearRegression/LogisticRegression/Kmeans which
>>>>> does a similar thing 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-23 Thread Maheshakya Wijewardena
Also note that there is a calculation interval in the Siddhi time series
regression function[1]. You may be able to get some insight into this from
that as well.

[1] https://docs.wso2.com/display/CEP400/Regression

On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> As we discussed offline, we can use a similar mechanism to train linear
> regression models, logistic regression models and k-means clustering models.
>
>> It is very interesting that I have found something that we can make use
>> of in our work. In the CEP 4.0 documentation there is a Custom Stream
>> Processor Extension program [1]. There is an example of
>> LinearRegressionStreamProcessor [1].
>>
>
> As we have to train predictive models with Spark, you can write wrappers
> around the regression/clustering models of Spark. Refer to the Siddhi time
> series regression source code[1][2]. You can write a streaming linear
> regression class for ML in a similar fashion by wrapping Spark mllib
> implementations. You can use the methods "addEvent", "removeEvent", etc.
> (may have to be changed according to requirements) for a similar purpose.
> You can introduce trainLinearRegression/LogisticRegression/Kmeans, which do
> a similar thing to createLinearRegression in those time series functions.
> In the processData method you can use Spark mllib classes to actually train
> models and return the model weights and evaluation metrics. So, converting
> streams into RDDs and retrieving information from the trained models shall
> happen in this method.
>
> In the stream processor extension example, you can retrieve those values and
> then use them to train new models with new batches. Weights/cluster centers
> may be passed as initialization parameters for the wrappers.
>
> Please note that we have to figure out the best Siddhi extension type for
> this process. In the Siddhi query, we define the batch size, the type of
> algorithm and the number of features (there can be more). Once batch-size
> events have been received, train a model, save its parameters and return the
> evaluation metric. With the next batch, retrain the model initialized with
> the previously learned parameters.
>
> We may also need to test the same scenario with a moving window, but I
> suspect that approach may become too slow, as a model is trained each
> time an event is received. So, we may have to change the number of slots
> the moving window moves at a time (e.g. not one by one, but ten by ten).
>
> Once this is resolved, the majority of the research part will be finished and
> all we will be left to do is implement wrappers around the 3 learning
> algorithms we consider.
>
> Best regards.
>
> [1]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java
>
>
> On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Shall we use [1] for our work? I am checking the possibility.
>> BR,
>> Mahesh.
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>> [2]
>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>> [3]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>
>> On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> It is very interesting that I have found something that we can make
>>> use of in our work. In the CEP 4.0 documentation there is a Custom Stream
>>> Processor Extension program [1]. There is an example of
>>> LinearRegressionStreamProcessor [1], and I also saw
>>>  private int batchSize = 10; I am going through this one.
>>> Please check whether we can use it. Will there be any compatibility or
>>> support issues?
>>> regards,
>>> Mahesh.
>>>
>>>
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>>
>>> On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> Anyway, how can I test a Siddhi extension after writing it, without
>>>> integrating it into CEP? Can you please explain the procedure? I am
>>>> referring to [1] [2] [3] [4].  thank y

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-23 Thread Maheshakya Wijewardena
Hi Mahesh,

As we discussed offline, we can use a similar mechanism to train linear
regression models, logistic regression models and k-means clustering models.

> It is very interesting that I have found something that we can make use
> of in our work. In the CEP 4.0 documentation there is a Custom Stream
> Processor Extension program [1]. There is an example of
> LinearRegressionStreamProcessor [1].
>

As we have to train predictive models with Spark, you can write wrappers
around the regression/clustering models of Spark. Refer to the Siddhi time
series regression source code[1][2]. You can write a streaming linear
regression class for ML in a similar fashion by wrapping Spark mllib
implementations. You can use the methods "addEvent", "removeEvent", etc.
(may have to be changed according to requirements) for a similar purpose.
You can introduce trainLinearRegression/LogisticRegression/Kmeans, which do
a similar thing to createLinearRegression in those time series functions.
In the processData method you can use Spark mllib classes to actually train
models and return the model weights and evaluation metrics. So, converting
streams into RDDs and retrieving information from the trained models shall
happen in this method.
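
The conversion step mentioned above can be sketched in plain Java. This is only an illustration under stated assumptions: the class names LabeledSample and EventConverter are hypothetical (not part of Siddhi or carbon-ml), each event is assumed to arrive as an Object[] of numeric attributes, and attribute 0 is assumed to be the dependent variable followed by the features. In a Spark-backed version, a list like this could presumably be mapped to LabeledPoint objects and handed to JavaSparkContext.parallelize to obtain an RDD.

```java
import java.util.ArrayList;
import java.util.List;

/** One training sample: a label (dependent variable) plus its feature values. */
class LabeledSample {
    final double label;
    final double[] features;

    LabeledSample(double label, double[] features) {
        this.label = label;
        this.features = features;
    }
}

/** Hypothetical helper for illustration; not a Siddhi or carbon-ml class. */
class EventConverter {

    /** Converts one event payload. Assumption: attribute 0 is the label. */
    static LabeledSample toSample(Object[] eventData) {
        if (eventData.length < 2) {
            throw new IllegalArgumentException("need a label and at least one feature");
        }
        double label = ((Number) eventData[0]).doubleValue();
        double[] features = new double[eventData.length - 1];
        for (int i = 1; i < eventData.length; i++) {
            // Accept any numeric attribute type (int, long, float, double).
            features[i - 1] = ((Number) eventData[i]).doubleValue();
        }
        return new LabeledSample(label, features);
    }

    /** Converts a whole batch of buffered events, preserving arrival order. */
    static List<LabeledSample> toBatch(List<Object[]> events) {
        List<LabeledSample> batch = new ArrayList<>(events.size());
        for (Object[] e : events) {
            batch.add(toSample(e));
        }
        return batch;
    }
}
```

Keeping the conversion separate from training means the same buffered batch can be reused by any of the three algorithms.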

In the stream processor extension example, you can retrieve those values and
then use them to train new models with new batches. Weights/cluster centers
may be passed as initialization parameters for the wrappers.

Please note that we have to figure out the best Siddhi extension type for
this process. In the Siddhi query, we define the batch size, the type of
algorithm and the number of features (there can be more). Once batch-size
events have been received, train a model, save its parameters and return the
evaluation metric. With the next batch, retrain the model initialized with
the previously learned parameters.
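
A minimal sketch of that batch-then-retrain cycle, in plain Java, might look as follows. It is not the carbon-ml implementation: the class name MiniBatchLinearRegression and its parameters are hypothetical, and plain gradient descent over each mini-batch stands in for Spark MLlib's SGD. The point it illustrates is the warm start: weights survive between batches, and the batch MSE is returned as the evaluation metric.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: buffers events and retrains a linear model per batch. */
class MiniBatchLinearRegression {
    private final int batchSize;
    private final int iterations;
    private final double stepSize;
    private final double[] weights;   // kept between batches (warm start)
    private double intercept;         // kept between batches as well
    private final List<double[]> xs = new ArrayList<>();
    private final List<Double> ys = new ArrayList<>();

    MiniBatchLinearRegression(int batchSize, int numFeatures, int iterations, double stepSize) {
        this.batchSize = batchSize;
        this.iterations = iterations;
        this.stepSize = stepSize;
        this.weights = new double[numFeatures];
    }

    /** Buffers one event; returns the batch MSE once the batch is full, else null. */
    Double addEvent(double label, double[] features) {
        xs.add(features.clone());
        ys.add(label);
        if (xs.size() < batchSize) {
            return null;
        }
        double mse = trainOnBatch();
        xs.clear();
        ys.clear();
        return mse;
    }

    /** Gradient descent on the buffered batch, starting from the current weights. */
    private double trainOnBatch() {
        int n = xs.size();
        for (int it = 0; it < iterations; it++) {
            double[] grad = new double[weights.length];
            double gradIntercept = 0.0;
            for (int i = 0; i < n; i++) {
                double err = predict(xs.get(i)) - ys.get(i);
                for (int j = 0; j < weights.length; j++) {
                    grad[j] += err * xs.get(i)[j];
                }
                gradIntercept += err;
            }
            for (int j = 0; j < weights.length; j++) {
                weights[j] -= stepSize * grad[j] / n;
            }
            intercept -= stepSize * gradIntercept / n;
        }
        // Mean squared error of the updated model on this batch.
        double sse = 0.0;
        for (int i = 0; i < n; i++) {
            double err = predict(xs.get(i)) - ys.get(i);
            sse += err * err;
        }
        return sse / n;
    }

    double predict(double[] features) {
        double y = intercept;
        for (int j = 0; j < weights.length; j++) {
            y += weights[j] * features[j];
        }
        return y;
    }

    double[] getWeights() {
        return weights.clone();
    }
}
```

Returning null until the batch fills mirrors the idea that the extension only emits an output event when a model has actually been retrained.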

We may also need to test the same scenario with a moving window, but I
suspect that approach may become too slow, as a model is trained each
time an event is received. So, we may have to change the number of slots
the moving window moves at a time (e.g. not one by one, but ten by ten).
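
The "ten by ten" idea can be sketched as a window buffer with a configurable shift. This is a hedged illustration only: the class ShiftedWindow and the parameter names windowSize and windowShift are hypothetical here, chosen to show how a shift larger than one reduces how often a model would be retrained.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical moving window that advances windowShift events at a time. */
class ShiftedWindow<T> {
    private final int windowSize;
    private final int windowShift;
    private final Deque<T> buffer = new ArrayDeque<>();

    ShiftedWindow(int windowSize, int windowShift) {
        if (windowShift < 1 || windowShift > windowSize) {
            throw new IllegalArgumentException("require 1 <= windowShift <= windowSize");
        }
        this.windowSize = windowSize;
        this.windowShift = windowShift;
    }

    /**
     * Adds one event. Returns a snapshot of the full window when a retrain is
     * due, otherwise null. After a snapshot the oldest windowShift events are
     * dropped, so the next snapshot is due after windowShift further events.
     */
    List<T> add(T event) {
        buffer.addLast(event);
        if (buffer.size() < windowSize) {
            return null;
        }
        List<T> snapshot = new ArrayList<>(buffer);
        for (int i = 0; i < windowShift; i++) {
            buffer.removeFirst();
        }
        return snapshot;
    }
}
```

With windowShift equal to 1 this retrains on every event once the window is full; with windowShift equal to windowSize it degenerates to plain batch processing.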

Once this is resolved, the majority of the research part will be finished and
all we will be left to do is implement wrappers around the 3 learning
algorithms we consider.

Best regards.

[1]
https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
[2]
https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java


On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Shall we use [1] for our work? I am checking the possibility.
> BR,
> Mahesh.
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
> [2]
> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [3]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>
> On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> It is very interesting that I have found something that we can make
>> use of in our work. In the CEP 4.0 documentation there is a Custom Stream
>> Processor Extension program [1]. There is an example of
>> LinearRegressionStreamProcessor [1], and I also saw
>>  private int batchSize = 10; I am going through this one.
>> Please check whether we can use it. Will there be any compatibility or
>> support issues?
>> regards,
>> Mahesh.
>>
>>
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>
>> On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Anyway, how can I test a Siddhi extension after writing it, without
>>> integrating it into CEP? Can you please explain the procedure? I am
>>> referring to [1] [2] [3] [4]. Thank you.
>>> BR,
>>> Mahesh.
>>>
>>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>>> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
>>> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>>>
>>> On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> thank you for the feedback. I have add data-sets into repo.
>>>> data-sets/lr. I am all right with next week.Now i am writing som

Re: [Dev] [cep][ml][gsoc-6]Capturing event stream with a specified window size for ml

2016-05-23 Thread Maheshakya Wijewardena
Hi Mahesh,

Actually, IMO, there should be an output for the trained model, which is the
evaluation metric: for linear regression, MSE, and for logistic regression,
accuracy. For clustering, it could be the cluster centers.
That way, it's possible to examine how the model behaves with the data.
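
For reference, the two metrics named above are simple to compute per batch. The sketch below uses a hypothetical StreamingMetrics helper (not an existing carbon-ml class) and assumes predictions and actual values are already available as arrays of equal length.

```java
/** Hypothetical helper computing per-batch evaluation metrics. */
class StreamingMetrics {

    /** Mean squared error: average of squared differences. */
    static double meanSquaredError(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            sum += diff * diff;
        }
        return sum / predicted.length;
    }

    /** Accuracy: fraction of predicted class labels that match the actual labels. */
    static double accuracy(int[] predictedLabels, int[] actualLabels) {
        if (predictedLabels.length != actualLabels.length || predictedLabels.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < predictedLabels.length; i++) {
            if (predictedLabels[i] == actualLabels[i]) {
                correct++;
            }
        }
        return (double) correct / predictedLabels.length;
    }
}
```

Emitting one such value per batch to the output stream would let a user watch how the metric moves as new data arrives.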

Best regards.

On Mon, May 23, 2016 at 3:06 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Suho,
> In my project, machine learning models are trained incrementally.
> Therefore real-time data is taken into my design as mini-batches of data
> sample points. Since we are developing these models for the CEP Siddhi
> processor, we want to get the mini-batches of data points into my algorithms
> from the Siddhi processor. In my case the input is a stream of K
> (batch/window size) sample points bundled together: for linear regression,
> all the independent and dependent variables; for k-means, the whole feature
> vector (data sample). The input can arrive not only as single sample points
> (window size = 1) but also as mini-batches (window size = N) of sample
> points (stream data). In my case there won't be an output stream. The model
> will be there, so even predict can be used with those models. I looked into
> [1] also. Thank you.
> regards,
> Mahesh.
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>
> On Mon, May 23, 2016 at 2:48 PM, Sriskandarajah Suhothayan <s...@wso2.com>
> wrote:
>
>> Hi Mahesh
>>
>> Can you explain the expected input to your extension and the expected
>> output. Then we can help you to find the proper Siddhi extension to use.
>>
>> Regards
>> Suho
>>
>> On Sat, May 21, 2016 at 11:44 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi all,
>>> I am currently working on the GSoC project "Predictive analytics with
>>> online data for WSO2 Machine Learner
>>> <https://docs.wso2.com/display/GSoC/Project+Proposals+for+2016#ProjectProposalsfor2016-Proposal6:[ML]PredictiveanalyticswithonlinedataforWSO2MachineLearner>"
>>> with WSO2 ML and a CEP extension for acquiring a stream of events (sample
>>> data points in terms of ML) from the CEP Siddhi processor. I am trying to
>>> write a CEP extension to get the stream of events as windows with a
>>> "specified window size". Then I use those data sets to incrementally and
>>> periodically train the ML model, which stores the model information
>>> to use with the current window of event data samples. I am having
>>> trouble writing a Siddhi extension that gets a stream of data windows
>>> from the CEP Siddhi processor. Please help me with the following.
>>> 1. I have been referring to [1] [2] [3] for writing Siddhi extensions. In
>>> my case, what is the most suitable option among the set of Siddhi
>>> extensions given?
>>>
>>> 2. I am currently working on carbon-ml, product-ml and the ML CEP
>>> extensions currently built [6]. In this case, what is the best way to
>>> write a simple CEP extension to check the functionality?
>>>
>>> 3. I have gone through [5] [4] for the CEP inbuilt windows. How can I
>>> effectively incorporate those features into my case?
>>>
>>> In case I need to look into other areas of CEP for this purpose,
>>> please let me know. Thank you very much.
>>> regards,
>>> Mahesh.
>>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>>> [2] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>>> [4]
>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>> [5]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>> [6]
>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>
>>
>>
>>
>> --
>>
>> *S. Suhothayan*
>> Technical Lead & Team Lead of WSO2 Complex Event Pr

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-18 Thread Maheshakya Wijewardena
Hi Mahesh,

I've looked into your code sample of streaming linear regression. Looks good
to me, apart from a few issues in coding practices which we can improve when
you're doing the implementations in carbon-ml and during the code reviews.
You are using a set of files as mini-batches of data, right? Can you also
send us the datasets you've been using? I'd like to run this.

> Is that CEP problem we were trying to fix all right now? I am
> still using those pre-built versions. If so, I can merge with the latest one.


I'll check this and let you know.

Can we arrange a meeting (preferably in WSO2 offices) next week with ML
team members as well? The coding period begins next Monday, so it's better
to get overall feedback from others and discuss more about the project. Let
me know convenient time slots for you. I'll arrange a meeting with the ML team.

Best regards.

On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> OK, I will check it. You have sent me the relevant references and I am
> working on that. Thank you. Is that CEP problem we were trying to fix all
> right now? I am still using those pre-built versions. If so, I can merge
> with the latest one. Thanks.
> BR,
> Mahesh.
>
> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't actually have to implement anything in Spark streaming. Try to
>> understand how streaming data is handled in it and the specifics of the
>> underlying streaming algorithms.
>> What we want is to have similar algorithms that support CEP
>> event streams with Siddhi.
>>
>> Best regards.
>>
>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Did you check the repo? I will add my recent work today. I was also
>>> going through the Java docs related to the Spark streaming work. It is
>>> with that Scala API. Thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I have gone through the Java docs and run some of the Spark examples on
>>>> the Spark shell, which are of paramount importance for our work. Then I
>>>> have been writing my code to check linear regression and k-means for
>>>> streaming. Please check my git repo [1]. I think I now have to ask on dev
>>>> about capturing event streams for our work. I will update the recent
>>>> things on git. Check the spark-examples directory for the Java code;
>>>> examples run on git shell are not included there. In my case I think I
>>>> have to build mini batches from data streams that come as individual
>>>> samples. Now I am working on some code to collect mini batches from data
>>>> streams. Thank you.
>>>> regards,
>>>> Mahesh.
>>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>>
>>>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I have gone through the Java docs and run some of the Spark examples on
>>>>> the Spark shell, which are of paramount importance for our work. Then I
>>>>> have been writing my code to check linear regression and k-means for
>>>>> streaming. Please check my git repo [1]. I think I now have to ask on dev
>>>>> about capturing event streams for our work. I will update the recent
>>>>> things on git. Check the spark-examples directory for the Java code;
>>>>> examples run on git shell are not included there. In my case I think I
>>>>> have to build mini batches from data streams that come as individual
>>>>> samples. Now I am working on some code to collect mini batches from data
>>>>> streams. Thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>>>
>>>>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> thank you. i will update the repo today.thank you.i changed the
>>>>>> carbon ml siddhi extention and see how the changes are effecting. i will
>>>>>> update the progress as soon as possible.thank you. i had some pro

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Maheshakya Wijewardena
Hi Mahesh,

You don't actually have to implement anything in Spark streaming. Try to
understand how streaming data is handled in it and the specifics of the
underlying streaming algorithms.
What we want is to have similar algorithms that support CEP event
streams with Siddhi.

Best regards.

On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Did you check the repo? I will add my recent work today. I was also going
> through the Java docs related to the Spark streaming work. It is with that
> Scala API. Thank you.
> regards,
> Mahesh.
>
> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java docs and run some of the Spark examples on
>> the Spark shell, which are of paramount importance for our work. Then I
>> have been writing my code to check linear regression and k-means for
>> streaming. Please check my git repo [1]. I think I now have to ask on dev
>> about capturing event streams for our work. I will update the recent
>> things on git. Check the spark-examples directory for the Java code;
>> examples run on git shell are not included there. In my case I think I
>> have to build mini batches from data streams that come as individual
>> samples. Now I am working on some code to collect mini batches from data
>> streams. Thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have gone through the Java docs and run some of the Spark examples on
>>> the Spark shell, which are of paramount importance for our work. Then I
>>> have been writing my code to check linear regression and k-means for
>>> streaming. Please check my git repo [1]. I think I now have to ask on dev
>>> about capturing event streams for our work. I will update the recent
>>> things on git. Check the spark-examples directory for the Java code;
>>> examples run on git shell are not included there. In my case I think I
>>> have to build mini batches from data streams that come as individual
>>> samples. Now I am working on some code to collect mini batches from data
>>> streams. Thank you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>
>>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> Thank you. I will update the repo today. I changed the carbon-ml
>>>> Siddhi extension to see how the changes take effect. I will update
>>>> the progress as soon as possible. Thank you. I had some problems with the
>>>> Spark MLlib dependency; I was fixing that.
>>>> regards,
>>>> Mahesh.
>>>> p.s: do i need to maintain a blog?
>>>>
>>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Sorry for replying late.
>>>>>
>>>>> Thank you for the update. I believe you have done some implementations
>>>>> with Spark MLlib algorithms in streaming fashion as we have discussed.
>>>>> If so, can you please share your code in a GitHub repo?
>>>>>
>>>>>> Now I want to implement some machine learning algorithms by
>>>>>> importing mllib and run them within your code base
>>>>>>
>>>>>
>>>>> For the moment you can try out editing the same class
>>>>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
>>>>> add this separately. You should be able to add org.apache.spark.mllib.*
>>>>> classes there.
>>>>>
>>>>>> And I want to see how event streams are coming from CEP. I think they
>>>>>> are not in RDD format since they arrive as individual samples. I
>>>>>> will send an email to dev asking how to get the streams.
>>>>>
>>>>>
>>>>> Please pay attention to the length[1] and lengthBatch[1] inbuilt windows
>>>>> in Siddhi. What you need to write are functions similar to a custom
>>>>> aggregate function[2].
>>>>> When you send the email to dev list, explain your requirement. You
>>>>> need to get a se

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Maheshakya Wijewardena
Hi Mahesh,

I'll review your code sample and give you our feedback asap.
In the meantime, please go through the documentation for writing siddhi
extensions and get some idea. It's better if you can try writing some
simple siddhi extensions your self and test them to get a good
understanding.

Best regards.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java docs and run some of the Spark examples on
> the Spark shell, which are of paramount importance for our work. Then I
> have been writing my code to check linear regression and k-means for
> streaming. Please check my git repo [1]. I think I now have to ask on dev
> about capturing event streams for our work. I will update the recent
> things on git. Check the spark-examples directory for the Java code;
> examples run on git shell are not included there. In my case I think I
> have to build mini batches from data streams that come as individual
> samples. Now I am working on some code to collect mini batches from data
> streams. Thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java docs and run some of the Spark examples on
>> the Spark shell, which are of paramount importance for our work. Then I
>> have been writing my code to check linear regression and k-means for
>> streaming. Please check my git repo [1]. I think I now have to ask on dev
>> about capturing event streams for our work. I will update the recent
>> things on git. Check the spark-examples directory for the Java code;
>> examples run on git shell are not included there. In my case I think I
>> have to build mini batches from data streams that come as individual
>> samples. Now I am working on some code to collect mini batches from data
>> streams. Thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Thank you. I will update the repo today. I changed the carbon-ml
>>> Siddhi extension to see how the changes take effect. I will update
>>> the progress as soon as possible. Thank you. I had some problems with the
>>> Spark MLlib dependency; I was fixing that.
>>> regards,
>>> Mahesh.
>>> p.s: do i need to maintain a blog?
>>>
>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Sorry for replying late.
>>>>
>>>> Thank you for the update. I believe you have done some implementations
>>>> with Spark MLlib algorithms in streaming fashion as we have discussed.
>>>> If so, can you please share your code in a GitHub repo?
>>>>
>>>>> Now I want to implement some machine learning algorithms by
>>>>> importing mllib and run them within your code base
>>>>>
>>>>
>>>> For the moment you can try out editing the same class
>>>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
>>>> add this separately. You should be able to add org.apache.spark.mllib.*
>>>> classes there.
>>>>
>>>>> And I want to see how event streams are coming from CEP. I think they
>>>>> are not in RDD format since they arrive as individual samples. I
>>>>> will send an email to dev asking how to get the streams.
>>>>
>>>>
>>>> Please pay attention to the length[1] and lengthBatch[1] inbuilt windows in
>>>> Siddhi. What you need to write are functions similar to a custom aggregate
>>>> function[2].
>>>> When you send the email to the dev list, explain your requirement. You need
>>>> to get a set of events from a stream with a specified window size
>>>> (number of events). Then build a model within that function. You also need
>>>> to retain the data (learned weights, cluster centers, etc.) from the
>>>> previous window to use in the current window. Ask what the most
>>>> suitable option for this is among the set of Siddhi extensions given.
>>>>
>>>> Best regards.
>>>>
>>>> [1]
>>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>>> [2]
>>>> https:/

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-15 Thread Maheshakya Wijewardena
Hi Mahesh,

Sorry for replying late.

Thank you for the update. I believe you have done some implementations with
Spark MLlib algorithms in streaming fashion as we have discussed. If
so, can you please share your code in a GitHub repo?

> Now I want to implement some machine learning algorithms by importing
> mllib and run them within your code base
>

For the moment you can try out editing the same class
PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
add this separately. You should be able to add org.apache.spark.mllib.*
classes there.

> And I want to see how event streams are coming from CEP. I think they are
> not in RDD format since they arrive as individual samples. I will
> send an email to dev asking how to get the streams.


Please pay attention to the length[1] and lengthBatch[1] inbuilt windows in
Siddhi. What you need to write are functions similar to a custom aggregate
function[2].
When you send the email to the dev list, explain your requirement. You need
to get a set of events from a stream with a specified window size (number
of events). Then build a model within that function. You also need to
retain the data (learned weights, cluster centers, etc.) from the previous
window to use in the current window. Ask what the most suitable
option for this is among the set of Siddhi extensions given.
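
To illustrate what retaining cluster centers across windows could mean, here is a small plain-Java sketch of sequential (online) k-means. It is an assumption-laden stand-in, not Spark's StreamingKMeans and not carbon-ml code: the class OnlineKMeans is hypothetical, each center is updated as the running mean of the points assigned to it, and the initial centers are treated as guesses with zero weight.

```java
/** Hypothetical online k-means whose centers persist from window to window. */
class OnlineKMeans {
    private final double[][] centers; // retained between windows
    private final long[] counts;      // points absorbed by each center so far

    OnlineKMeans(double[][] initialCenters) {
        centers = new double[initialCenters.length][];
        for (int c = 0; c < initialCenters.length; c++) {
            centers[c] = initialCenters[c].clone();
        }
        counts = new long[initialCenters.length];
    }

    /** Updates the retained centers with every point of the current window. */
    void updateWithWindow(double[][] window) {
        for (double[] point : window) {
            int c = nearest(point);
            counts[c]++;
            double rate = 1.0 / counts[c]; // running-mean step size
            for (int j = 0; j < point.length; j++) {
                centers[c][j] += rate * (point[j] - centers[c][j]);
            }
        }
    }

    /** Index of the center with the smallest squared Euclidean distance. */
    int nearest(double[] point) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int j = 0; j < point.length; j++) {
                double d = point[j] - centers[c][j];
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    double[][] getCenters() {
        double[][] copy = new double[centers.length][];
        for (int c = 0; c < centers.length; c++) {
            copy[c] = centers[c].clone();
        }
        return copy;
    }
}
```

Because the object lives across calls to updateWithWindow, the second window starts from what the first one learned, which is the behaviour the requirement above asks the extension to support.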

Best regards.

[1]
https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[2] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
> Date: Wed, May 11, 2016 at 1:43 PM
> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
> data for WSO2 Machine Learner
> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>
>
> Hi Maheshakya,
> Sorry for not updating. I did what you wanted me to do. I checked the code
> base and the train functions. I went through the Java docs. I went through
> the current carbon-ml implementation of LG and K-Means. I also set up Apache
> Spark and tried several examples. Now I want to implement some
> machine learning algorithms by importing mllib and run them within
> your code base. Can you help me with that?
> And I want to see how event streams arrive from CEP. I think they are
> not in RDD format, since they arrive as individual samples. I will
> send an email to dev asking how to get the streams. I debugged many of
> those functions in the code base, so I need further instructions to
> proceed. Thank you.
> regards,
> Mahesh.
>
> On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Any update on your progress?
>>
>> Best regards.
>>
>> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>>> Does "Put break points in train methods in Linear Regression class"
>>>> mean the spark/algorithms/LinearRegression.java class in
>>>> org.wso2.carbon.ml.core? Is that the correct file?
>>>
>>>
>>> Yes, this is the correct place.
>>>
>>> You can refer to spark programming guide[1][2] as well as our ML code
>>> base when you try those algorithms out. Please try to do rough
>>> implementations of the streaming versions of linear regression, logistic
>>> regression and k-means clustering as we have discussed in the proposal in
>>> plain Java. It's better if you can create a git repo and share your code
>>> once you have made some progress.
>>>
>>> Were you able to debug and understand the flow of the ML siddhi extension?
>>> I hope you haven't encountered more errors after switching to the released
>>> version of CEP.
>>>
>>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>>
>>> Best regards.
>>>
>>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>>
>>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I have been looking into some stochastic gradient descent based
>>>> algorithms. If there is anything I should focus on, please let me know.
>>>> I will also be available for a call this week and next week. Thank you.
>>>> BR,

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-10 Thread Maheshakya Wijewardena
Hi Mahesh,

Any update on your progress?

Best regards.

On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Hi Mahesh,
>
>> Does "Put break points in train methods in Linear Regression class"
>> mean the spark/algorithms/LinearRegression.java class in
>> org.wso2.carbon.ml.core? Is that the correct file?
>
>
> Yes, this is the correct place.
>
> You can refer to spark programming guide[1][2] as well as our ML code base
> when you try those algorithms out. Please try to do rough implementations
> of the streaming versions of linear regression, logistic regression and
> k-means clustering as we have discussed in the proposal in plain Java. It's
> better if you can create a git repo and share your code once you have made
> some progress.
>
> Were you able to debug and understand the flow of the ML siddhi extension? I
> hope you haven't encountered more errors after switching to the released
> version of CEP.
>
> Is this Friday okay for you? Afternoon at 2:00 pm?
>
> Best regards.
>
> [1] http://spark.apache.org/docs/latest/programming-guide.html
> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>
> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have been looking into some stochastic gradient descent based
>> algorithms. If there is anything I should focus on, please let me know.
>> I will also be available for a call this week and next week. Thank you.
>> BR,
>> Mahesh.
>>
>> On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Thank you, that's good. I have been trying to fix that for a couple of
>>> days. Please inform me when it is fixed. I have now been testing the ML
>>> algorithms and trying to identify the flow and the hierarchy. Does "Put
>>> break points in train methods in Linear Regression class" mean the
>>> spark/algorithms/LinearRegression.java class in
>>> org.wso2.carbon.ml.core? Is that the correct file?
>>> I am also planning to write some programs that use Apache Spark MLlib
>>> algorithms. I am referring to [1] and some WSO2 documentation to get an
>>> idea of the ML structure. Thank you.
>>>
>>> BR,
>>> Mahesh.
>>>
>>> [1]nirmalfdo.blogspot.com
>>>
>>> On Tue, May 3, 2016 at 4:36 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> I have checked. It seems the issue you have encountered occurs only
>>>> in the current development branch of product-cep. It doesn't identify
>>>> the ML siddhi extension as an extension. The ML siddhi extension works
>>>> fine in the latest release of CEP (4.1.0) [1].
>>>> Until we figure out the reason and come up with a solution, can you use
>>>> the latest CEP release for your work? It's fine to use that since you
>>>> haven't started actual development yet.
>>>>
>>>> Best regards.
>>>>
>>>> [1] http://wso2.com/products/complex-event-processor/
>>>>
>>>> On Tue, May 3, 2016 at 3:19 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>>
>>>>>> Is it vital to use those local repos in my upcoming implementation?
>>>>>
>>>>>
>>>>> Yes. The remote p2-repo contains the p2-repos of released versions.
>>>>> What you have to develop on is the current master of the carbon-ml and
>>>>> product-ml. You can try out with the modification I have suggested. In the
>>>>> meantime, I'll verify whether the current repos are working as expected.
>>>>>
>>>>>> And also I am trying to debug the carbon-ml org.wso2.carbon.ml.core by
>>>>>> putting some break points in the spark/algorithms/Linear Regression
>>>>>
>>>>>
>>>>> It's great that you have started looking at the implementation of
>>>>> linear regression as well. Put break points in train methods in
>>>>> LinearRegression class. This is being used when you run linear regression
>>>>> from UI.
>>>>>
>>>>>> I can see some comments left behind for streaming algorithms as well.
>>>>>> Thank you.
>>>>>
>>>>

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-28 Thread Maheshakya Wijewardena
Hi Mahesh,

The link was an example of remote debugging a WSO2 server. What you need to
debug is org.wso2.carbon.ml.siddhi.extension in carbon-ml.

Best regards.

On Thu, Apr 28, 2016 at 4:52 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Thank you for your help. I have already built all three sources and now I
> am trying to get familiar with your code base. I even built the
> carbon-kernel from source.
> As you mentioned, [1] is about debugging the kernel. Do I really need to
> debug the carbon kernel in my case? I am trying to remotely debug ML, and if
> I understand correctly it works the same way as in reference [1], just not
> on the kernel. I can proceed with the others.
> BR,
> mahesh.
>
> [1] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>
> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Congratulations and welcome to GSoC 2016. You did a great job in
>> preparing the proposal. Now it's time to dig deep and get started with the
>> project.
>>
>> First of all you need to familiarize yourself with the code base. We have
>> agreed to implement this with CEP event streams. We already have a CEP
>> extension for predictions [1][2]. Go through this implementation and
>> familiarize yourself with it. You need to understand how:
>>
>>    1. Event streams are consumed
>>    2. Predictions are made from individual events
>>    3. Results are sent back
>>
>> Get WSO2 ML and CEP sources (You may use latest released version of CEP)
>> and build the products. Get both carbon-ml[3] and product-ml[4] masters and
>> create new branches for your work from masters.
>>
>> After you build the products, you may need to do remote debugging[5] to
>> understand the flow. So please follow an example of real time prediction
>> with ML with debugging and get some idea. The component you need to debug
>> is org.wso2.carbon.ml.siddhi.extension.
>>
>> Next tasks would be implementing online learning algorithms in plain java
>> with spark ml lib and integrating those to ML. We also need to come up with
>> a proper and detailed architecture to employ those algorithms in ML.
>> Getting familiar with the aforementioned sections would give you some
>> insight on how this should be implemented.
>>
>> So please try to get a quick grasp then you can start the implementation.
>> Let us know if you have any questions or you get stuck somewhere.
>>
>> Also, please always add WSO2 developer's list as well when you
>> communicate with us regarding the project so that you can get opinions and
>> feedback from others as well.
>>
>> Best regards.
>>
>> [1]
>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>
>> [2]
>> https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>>
>> [3] https://github.com/wso2/carbon-ml
>>
>> [4] https://github.com/wso2/product-ml
>>
>> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>>
>>
>> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi,
>>> Thank you for accepting my GSoC 2016 proposal. I am looking forward
>>> to further instructions and to continuing the project. Thank you very much.
>>> regards,
>>> Mahesh.
>>>
>>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-27 Thread Maheshakya Wijewardena
The error you see usually indicates that the OS is actually running out of
memory, rather than a Java heap limit issue.
Can you try closing every other process while building these products to
keep memory usage low, or add more swap space[1]?

Are you using OpenJDK? If so, and the above doesn't work, can you switch to
the Oracle JDK and try again?

Best regards.
[1] http://askubuntu.com/questions/178712/how-to-increase-swap-space

On Wed, Apr 27, 2016 at 2:32 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Suho,
> Thank you for the information. In the initial build I used "mvn clean
> install -Dmaven.test.skip=true", which is why I did not get errors. But this
> time I built with mvn clean build and I got some errors in the test stage. I
> have already set MAVEN_OPTS as MAVEN_OPTS="-Xms768m -Xms3072m
> -XX:MaxPermSize=1200m". But there seems to be some memory constraint. I got
> the following.
>
>
> ERROR
> [org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader] -
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> MaxPermSize=256m; support was removed in 8.0
> ERROR
> [org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader] -
> Java HotSpot(TM) 64-Bit Server VM warning: INFO:
> os::commit_memory(0xf400, 157286400, 0) failed; error='Cannot
> allocate memory' (errno=12)
>
> I have been using Ubuntu 14.04 LTS with 4GB RAM. How can I fix this
> issue? I got a similar kind of error when I was trying to build WSO2
> product-ml. I have attached the details of the error to this mail.
> Do I need to set up some additional environment variables to fix this?
> BR,
> Mahesh.
>
>
>
>
> On Wed, Apr 27, 2016 at 1:51 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> As Suho mentioned, if you have successfully built with tests, then there
>> shouldn't be an issue.
>>
>> However, in the error you've stated, it seems there's problem with carbon
>> home:
>>
>>> CARBON_HOME environment variable is set to
>>> /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
>>>
>> Can you make sure that you extract
>> product-cep/modules/distribution/target/wso2cep-4.1.1-SNAPSHOT.zip and run
>> the server in wso2cep-4.1.1-SNAPSHOT/bin/ with ./wso2server.sh
>>
>> Best regards.
>>
>> On Wed, Apr 27, 2016 at 12:43 PM, Sriskandarajah Suhothayan <
>> s...@wso2.com> wrote:
>>
>>> If your build has passed, then it should not be an issue. When building,
>>> the tests should have run.
>>> Is that so? Please verify.
>>> During that, the server should have started and stopped.
>>>
>>> I think there is some issue in the way you have started the CEP.
>>>
>>> You should be able to build the products as you will be working with
>>> components having snapshots versions.
>>>
>>> Regards
>>> Suho
>>>
>>> On Wed, Apr 27, 2016 at 12:33 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I am trying to build the CEP from source as in [1]. It was built without
>>>> errors. But when I run ./wso2server.sh I get this error:
>>>>
>>>> JAVA_HOME environment variable is set to /usr/local/java/jdk1.8.0_51
>>>> CARBON_HOME environment variable is set to
>>>> /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
>>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>>> MaxPermSize=256m; support was removed in 8.0
>>>> Could not load Logmanager "org.apache.juli.ClassLoaderLogManager"
>>>> java.lang.ClassNotFoundException: org.apache.juli.ClassLoaderLogManager
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> at java.util.logging.LogManager$1.run(LogManager.java:195)
>>>> at java.util.logging.LogManager$1.run(LogManager.java:181)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at java.util.logging.LogManager.(LogManager.java:181)
>>>> at java.util.logging.Logger.demandLogger(Logger.java:448)
>>>> at java.util.logging.Logger.getLogger(Logger.java:502)
>>>> at com.sun.jmx.remote.util.ClassLogger.(ClassLogger.java:55)
>>>> at
>>>> sun.management.j

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-27 Thread Maheshakya Wijewardena
Hi Mahesh,

As Suho mentioned, if you have successfully built with tests, then there
shouldn't be an issue.

However, in the error you've stated, it seems there's a problem with the
carbon home:

> CARBON_HOME environment variable is set to
> /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
>
Can you make sure that you extract
product-cep/modules/distribution/target/wso2cep-4.1.1-SNAPSHOT.zip and run
the server in wso2cep-4.1.1-SNAPSHOT/bin/ with ./wso2server.sh?

Best regards.

On Wed, Apr 27, 2016 at 12:43 PM, Sriskandarajah Suhothayan <s...@wso2.com>
wrote:

> If your build has passed, then it should not be an issue. When building,
> the tests should have run.
> Is that so? Please verify.
> During that, the server should have started and stopped.
>
> I think there is some issue in the way you have started the CEP.
>
> You should be able to build the products as you will be working with
> components having snapshots versions.
>
> Regards
> Suho
>
> On Wed, Apr 27, 2016 at 12:33 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I am trying to build the CEP from source as in [1]. It was built without
>> errors. But when I run ./wso2server.sh I get this error:
>>
>> JAVA_HOME environment variable is set to /usr/local/java/jdk1.8.0_51
>> CARBON_HOME environment variable is set to
>> /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>> MaxPermSize=256m; support was removed in 8.0
>> Could not load Logmanager "org.apache.juli.ClassLoaderLogManager"
>> java.lang.ClassNotFoundException: org.apache.juli.ClassLoaderLogManager
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at java.util.logging.LogManager$1.run(LogManager.java:195)
>> at java.util.logging.LogManager$1.run(LogManager.java:181)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.util.logging.LogManager.(LogManager.java:181)
>> at java.util.logging.Logger.demandLogger(Logger.java:448)
>> at java.util.logging.Logger.getLogger(Logger.java:502)
>> at com.sun.jmx.remote.util.ClassLogger.(ClassLogger.java:55)
>> at
>> sun.management.jmxremote.ConnectorBootstrap.(ConnectorBootstrap.java:814)
>> at sun.management.Agent.startLocalManagementAgent(Agent.java:138)
>> at sun.management.Agent.startAgent(Agent.java:260)
>> at sun.management.Agent.startAgent(Agent.java:447)
>> Error: Could not find or load main class
>> org.wso2.carbon.bootstrap.Bootstrap
>>
>> Do I need some additional libraries there? Is it all right to go with
>> [2], given that we will be making changes to the source?
>> BR,
>> Mahesh.
>>
>>
>> On Wed, Apr 27, 2016 at 12:17 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> You don't need to build the kernel. You can build either current master
>>> of product-cep[1] or you can download the latest release from [2].
>>>
>>> Best regards.
>>>
>>> [1] https://github.com/wso2/product-cep
>>> [2] http://wso2.com/products/complex-event-processor/
>>>
>>> On Wed, Apr 27, 2016 at 12:09 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi maheshakya,
>>>> Do we need to build the carbon kernel from source before we build CEP
>>>> from source (https://github.com/wso2/carbon-kernel), or is it included in
>>>> those sources? I am trying to build all three sources after forking
>>>> them. Thank you.
>>>> regards,
>>>> Mahesh
>>>>
>>>> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Congratulations and welcome to GSoC 2016. You did a great job in
>>>>> preparing the proposal. Now it's time to dig deep and get started with the
>>>>> project.
>>>>>
>>>>> First of all you need to familiarize with the code base. We have
>>>>> agreed to implement this with CEP event streams. We already have a CEP
>>>>> extension for predictions [1][2]. Go through this implementation and
>>>>> familiarize your self with that. You need to understand how:
>>>>>
>>>>>1. Even streams are cons

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-27 Thread Maheshakya Wijewardena
You don't need to build the kernel. You can build either current master of
product-cep[1] or you can download the latest release from [2].

Best regards.

[1] https://github.com/wso2/product-cep
[2] http://wso2.com/products/complex-event-processor/

On Wed, Apr 27, 2016 at 12:09 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi maheshakya,
> Do we need to build the carbon kernel from source before we build CEP from
> source (https://github.com/wso2/carbon-kernel), or is it included in those
> sources? I am trying to build all three sources after forking them. Thank you.
> regards,
> Mahesh
>
> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Congratulations and welcome to GSoC 2016. You did a great job in
>> preparing the proposal. Now it's time to dig deep and get started with the
>> project.
>>
>> First of all you need to familiarize yourself with the code base. We have
>> agreed to implement this with CEP event streams. We already have a CEP
>> extension for predictions [1][2]. Go through this implementation and
>> familiarize yourself with it. You need to understand how:
>>
>>    1. Event streams are consumed
>>    2. Predictions are made from individual events
>>    3. Results are sent back
>>
>> Get WSO2 ML and CEP sources (You may use latest released version of CEP)
>> and build the products. Get both carbon-ml[3] and product-ml[4] masters and
>> create new branches for your work from masters.
>>
>> After you build the products, you may need to do remote debugging[5] to
>> understand the flow. So please follow an example of real time prediction
>> with ML with debugging and get some idea. The component you need to debug
>> is org.wso2.carbon.ml.siddhi.extension.
>>
>> Next tasks would be implementing online learning algorithms in plain java
>> with spark ml lib and integrating those to ML. We also need to come up with
>> a proper and detailed architecture to employ those algorithms in ML.
>> Getting familiar with the aforementioned sections would give you some
>> insight on how this should be implemented.
>>
>> So please try to get a quick grasp then you can start the implementation.
>> Let us know if you have any questions or you get stuck somewhere.
>>
>> Also, please always add WSO2 developer's list as well when you
>> communicate with us regarding the project so that you can get opinions and
>> feedback from others as well.
>>
>> Best regards.
>>
>> [1]
>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>
>> [2]
>> https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>>
>> [3] https://github.com/wso2/carbon-ml
>>
>> [4] https://github.com/wso2/product-ml
>>
>> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>>
>>
>> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi,
>>> Thank you for accepting my GSoC 2016 proposal. I am looking forward
>>> to further instructions and to continuing the project. Thank you very much.
>>> regards,
>>> Mahesh.
>>>
>>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


[Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-25 Thread Maheshakya Wijewardena
Hi Mahesh,

Congratulations and welcome to GSoC 2016. You did a great job in preparing
the proposal. Now it's time to dig deep and get started with the project.

First of all you need to familiarize yourself with the code base. We have
agreed to implement this with CEP event streams. We already have a CEP
extension for predictions [1][2]. Go through this implementation and
familiarize yourself with it. You need to understand how:

   1. Event streams are consumed
   2. Predictions are made from individual events
   3. Results are sent back

Get WSO2 ML and CEP sources (You may use latest released version of CEP)
and build the products. Get both carbon-ml[3] and product-ml[4] masters and
create new branches for your work from masters.

After you build the products, you may need to do remote debugging[5] to
understand the flow. So please follow an example of real time prediction
with ML with debugging and get some idea. The component you need to debug
is org.wso2.carbon.ml.siddhi.extension.

The next tasks would be implementing online learning algorithms in plain Java
with Spark MLlib and integrating those into ML. We also need to come up with
a proper and detailed architecture to employ those algorithms in ML.
Getting familiar with the aforementioned sections would give you some
insight on how this should be implemented.

So please try to get a quick grasp then you can start the implementation.
Let us know if you have any questions or you get stuck somewhere.

Also, please always add WSO2 developer's list as well when you communicate
with us regarding the project so that you can get opinions and feedback
from others as well.

Best regards.

[1]
https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension

[2]
https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension

[3] https://github.com/wso2/carbon-ml

[4] https://github.com/wso2/product-ml

[5] https://dzone.com/articles/how-debug-wso2-carbon-kernel


On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi,
> Thank you for accepting my GSoC 2016 proposal. I am looking forward to
> further instructions and to continuing the project. Thank you very much.
> regards,
> Mahesh.
>
> --
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] Changed MSF4J parent poms to WSO2 pom v4

2016-04-19 Thread Maheshakya Wijewardena
Noted.
Will do.

Best regards.

On Tue, Apr 19, 2016 at 8:50 PM, Afkham Azeez <az...@wso2.com> wrote:

> Please note $subject
>
> This uncovered a whole lot of FindBugs & checkstyles issues since moving
> to wso2 pom version 2, these checks have not run. I fixed as many issues as
> possible but there are more, so I had to remove the recently added
> jwt-claims, spring, & JPA samples.
>
> Owners please fix the FindBugs & checkstyles issues and commit your
> changes.
>
> Thanks
> Azeez
>
> --
> Afkham Azeez
> Director of Architecture; WSO2, Inc.; http://wso2.com
> Member; Apache Software Foundation; http://www.apache.org/
> email: az...@wso2.com
> cell: +94 77 3320919
> blog: http://blog.afkham.org
> twitter: http://twitter.com/afkham_azeez
> linked-in: http://lk.linkedin.com/in/afkhamazeez
>
> *Lean . Enterprise . Middleware*
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-25 Thread Maheshakya Wijewardena
Hi Mahesh,

Can you add the timeline of the project as I've mentioned? It's one of the
crucial parts of the proposal, as it allows us to evaluate the feasibility of
the project within the time period given by Google.

Best regards.

On Fri, Mar 25, 2016 at 6:53 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
> Date: Fri, Mar 25, 2016 at 7:02 PM
> Subject: Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]
> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>
>
> Hi maheshakya,
> I have uploaded my final submission. Here it is. Please check it and let me
> know if there is anything I need to change. Thank you.
> BR,
> Mahesh.
>
> On Fri, Mar 25, 2016 at 6:28 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Thank you very much. I will update the proposal with those changes
>> and submit it now. Thank you.
>> regards,
>> Mahesh.
>>
>> On Fri, Mar 25, 2016 at 6:07 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> In the title, please include both tags [ML] and [CEP]
>>>
>>> Best regards.
>>>
>>> On Fri, Mar 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Also, please include an introduction to yourself (University,
>>>> department), past experience in machine learning, language proficiency, etc
>>>> at the beginning of the proposal.
>>>>
>>>> Best regards.
>>>>
>>>> On Fri, Mar 25, 2016 at 5:47 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Thank you for sending the draft. Please submit it as soon as possible.
>>>>>
>>>>> A few high-level comments:
>>>>>
>>>>> In the proposal, you must specifically mention that this will be
>>>>> implemented as a Siddhi extension that can operate directly on incoming
>>>>> streams.
>>>>>
>>>>> Also, you need to have a timeline for the project. A sample looks
>>>>> like:
>>>>>
>>>>> May 1 - May 20 - Community bonding period - Getting familiar with the
>>>>> platform and discussing implementation methods.
>>>>> May 20 - May 30 - Implementing streaming k-means,
>>>>> -
>>>>> -
>>>>> July 20-24 - Writing examples
>>>>> July 24-28 - Documentation
>>>>>
>>>>> This should end before the pencils-down date. Refer to the correct
>>>>> timeline given on the GSoC site.
>>>>>
>>>>> The implementation details of the streaming algorithms look fine.
>>>>>
>>>>> Best regards.
>>>>>
>>>>>
>>>>> On Fri, Mar 25, 2016 at 5:23 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> this is my draft proposal.
>>>>>>
>>>>>> https://docs.google.com/document/d/1apZfEXZXEH5GwSwS7hARINbGw5_zinxWdZjEmyqfKu4/edit?usp=sharing
>>>>>> Can you please check this and see whether it is correct? Thank you.
>>>>>> BR,
>>>>>> Mahesh
>>>>>>
>>>>>>
>>>>>> On Mon, Mar 21, 2016 at 1:15 PM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> The deadline for submitting your proposals is on March 25th, 2016,
>>>>>>> therefore please start writing the proposal and get feedback.
>>>>>>>
>>>>>>> Best regards.
>>>>>>>
>>>>>>> On Tue, Mar 15, 2016 at 4:14 PM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakaya,
>>>>>>>> OK. I have been trying some examples, splitting the data and training
>>>>>>>> incrementally. I am still doing that. I have been adding them to my github

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-25 Thread Maheshakya Wijewardena
Also, please include an introduction to yourself (university, department),
past experience in machine learning, language proficiency, etc. at the
beginning of the proposal.

Best regards.

On Fri, Mar 25, 2016 at 5:47 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Thank you for sending the draft. Please submit it as soon as possible.
>
> A few high-level comments:
>
> In the proposal, you must specifically mention that this will be
> implemented as a Siddhi extension that can operate directly on incoming
> streams.
>
> Also, you need to have a timeline for the project. A sample looks like:
>
> May 1 - May 20 - Community bonding period - Getting familiar with the
> platform and discussing implementation methods.
> May 20 - May 30 - Implementing streaming k-means,
> -
> -
> July 20-24 - Writing examples
> July 24-28 - Documentation
>
> This should end before the pencils-down date. Refer to the correct timeline
> given on the GSoC site.
>
> The implementation details of the streaming algorithms look fine.
>
> Best regards.
>
>
> On Fri, Mar 25, 2016 at 5:23 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> this is my draft proposal.
>>
>> https://docs.google.com/document/d/1apZfEXZXEH5GwSwS7hARINbGw5_zinxWdZjEmyqfKu4/edit?usp=sharing
>> Can you please check this and see whether it is correct? Thank you.
>> BR,
>> Mahesh
>>
>>
>> On Mon, Mar 21, 2016 at 1:15 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> The deadline for submitting your proposals is on March 25th, 2016,
>>> therefore please start writing the proposal and get feedback.
>>>
>>> Best regards.
>>>
>>> On Tue, Mar 15, 2016 at 4:14 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> OK. I have been trying some examples, splitting the datasets and training
>>>> incrementally. I am still doing that, and I have been adding them to my
>>>> GitHub repo too: https://github.com/dananjayamahesh/GSOC2016 . I saw that
>>>> Spark only has Scala API support for those streaming algorithms, so my task
>>>> is to develop a Java API. I will let you know my progress. Thank you very much.
>>>> BR,
>>>> Mahesh
>>>>
>>>> On Tue, Mar 15, 2016 at 3:21 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> No, you don't need to use Hadoop at any stage in this project.
>>>>> Everything you need is in Spark (regarding ML algorithms).
>>>>> You can also use Spark MLLib's methods to randomly split datasets.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> On Mon, Mar 14, 2016 at 1:28 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I am writing some Java programs and trying to break the dataset into
>>>>>> several pieces and train a model repeatedly with those pieces using
>>>>>> Spark MLLib. Do I have to do anything with Hadoop at this stage, given that
>>>>>> I am working in standalone mode? Thank you.
>>>>>> BR,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Sun, Mar 13, 2016 at 6:30 PM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> You don't have to look into carbon-ml.
>>>>>>>
>>>>>>> Best regards.
>>>>>>>
>>>>>>> On Sun, Mar 13, 2016 at 5:49 PM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> I am working on some examples related to Spark and ML. Is there
>>>>>>>> anything to do with carbon-ml? I think I don't need to look into that
>>>>>>>> one, do I?
>>>>>>>> BR,
>>>>>>>> Mahesh
>>>>>>>>
>>>>>>>> On Tue, Mar 8, 2016 at 11:55 AM, M

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-25 Thread Maheshakya Wijewardena
Hi Mahesh,

Thank you for sending the draft. Please submit it as soon as possible.

A few high-level comments:

In the proposal, you must specifically mention that this will be
implemented as a Siddhi extension that can operate directly on incoming
streams.

Also, you need to have a timeline for the project. A sample looks like:

May 1- May 20 - Community bonding period - Getting familiar with the
platform and discussing implementation methods.
May 20 - May 30 - Implementing streaming k-means,
-
-
July 20-24 - Writing examples
July 24-18 - Documentation

This should end before the pencils-down date. Refer to the correct timeline
given on the GSoC site.

The implementation details of the streaming algorithms look fine.

Best regards.


On Fri, Mar 25, 2016 at 5:23 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> This is my draft proposal:
>
> https://docs.google.com/document/d/1apZfEXZXEH5GwSwS7hARINbGw5_zinxWdZjEmyqfKu4/edit?usp=sharing
> Can you please check this and see whether it is correct? Thank you.
> BR,
> Mahesh
>
>
> On Mon, Mar 21, 2016 at 1:15 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> The deadline for submitting your proposals is on March 25th, 2016,
>> therefore please start writing the proposal and get feedback.
>>
>> Best regards.
>>
>> On Tue, Mar 15, 2016 at 4:14 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> OK. I have been trying some examples, splitting the datasets and training
>>> incrementally. I am still doing that, and I have been adding them to my
>>> GitHub repo too: https://github.com/dananjayamahesh/GSOC2016 . I saw that
>>> Spark only has Scala API support for those streaming algorithms, so my task
>>> is to develop a Java API. I will let you know my progress. Thank you very much.
>>> BR,
>>> Mahesh
>>>
>>> On Tue, Mar 15, 2016 at 3:21 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> No, you don't need to use Hadoop at any stage in this project.
>>>> Everything you need is in Spark (regarding ML algorithms).
>>>> You can also use Spark MLLib's methods to randomly split datasets.
>>>>
>>>> Best regards.
>>>>
>>>> On Mon, Mar 14, 2016 at 1:28 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I am writing some Java programs and trying to break the dataset into
>>>>> several pieces and train a model repeatedly with those pieces using
>>>>> Spark MLLib. Do I have to do anything with Hadoop at this stage, given that
>>>>> I am working in standalone mode? Thank you.
>>>>> BR,
>>>>> Mahesh.
>>>>>
>>>>> On Sun, Mar 13, 2016 at 6:30 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> You don't have to look into carbon-ml.
>>>>>>
>>>>>> Best regards.
>>>>>>
>>>>>> On Sun, Mar 13, 2016 at 5:49 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Maheshakya,
>>>>>>> I am working on some examples related to Spark and ML. Is there
>>>>>>> anything to do with carbon-ml? I think I don't need to look into that
>>>>>>> one, do I?
>>>>>>> BR,
>>>>>>> Mahesh
>>>>>>>
>>>>>>> On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <
>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>
>>>>>>>> Hi Mahesh,
>>>>>>>>
>>>>>>>> does that Scala API is with your current product or repo?
>>>>>>>>
>>>>>>>>
>>>>>>>> No, we don't have the Scala API included. What we want is to design
>>>>>>>> the Java implementations of those algorithms to train with mini-batches of
>>>>>>>> streaming data with the help of the aforementioned methods so that we can

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-15 Thread Maheshakya Wijewardena
Hi Mahesh,

No, you don't need to use Hadoop at any stage in this project. Everything
you need is in Spark (regarding ML algorithms).
You can also use Spark MLLib's methods to randomly split datasets.

Best regards.

On Mon, Mar 14, 2016 at 1:28 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> I am writing some Java programs and trying to break the dataset into several
> pieces and train a model repeatedly with those pieces using Spark MLLib.
> Do I have to do anything with Hadoop at this stage, given that I am working
> in standalone mode? Thank you.
> BR,
> Mahesh.
>
> On Sun, Mar 13, 2016 at 6:30 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't have to look into carbon-ml.
>>
>> Best regards.
>>
>> On Sun, Mar 13, 2016 at 5:49 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I am working on some examples related to Spark and ML. Is there anything
>>> to do with carbon-ml? I think I don't need to look into that one, do I?
>>> BR,
>>> Mahesh
>>>
>>> On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> does that Scala API is with your current product or repo?
>>>>
>>>>
>>>> No, we don't have the Scala API included. What we want is to design the
>>>> Java implementations of those algorithms to train with mini-batches of
>>>> streaming data with the help of the aforementioned methods so that we can
>>>> include it as a CEP extension.
>>>>
>>>> To clarify, please try to write a simple Java program using Spark
>>>> MLLib linear regression and k-means clustering with a sample data set (you
>>>> can find a lot of data sets in the UCI repo[1]). You need to break the
>>>> dataset into several pieces and train a model repeatedly with those.
>>>> After each training run, save the model information (such as weights and
>>>> intercepts for regression and cluster centers for clustering; please check
>>>> the arguments of the methods I have mentioned and save the required
>>>> information of the model).
>>>> When training a model with a new piece of data, use those methods to
>>>> initialize it and pass the saved values as the arguments. This way you can
>>>> start from where you stopped in the previous run.
>>>>
>>>> Let us know your observations and feel free to ask if you need to know
>>>> anything more on this.
>>>>
>>>> We'll let you know what needs to be done to include this in CEP.
>>>>
>>>> Best regards.
>>>>
>>>> On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> Great, thank you. I already have ML and CEP and am working more with them.
>>>>> Is that Scala API part of your current product or repo? Thank you.
>>>>> BR,
>>>>> Mahesh.
>>>>>
>>>>> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> Please find the comments inline.
>>>>>>
>>>>>>> Is the data stream taken into ML in the event publisher's format,
>>>>>>> through the event publisher? Or can we use the direct traffic that comes
>>>>>>> to the event receiver, or else as streams?
>>>>>>>
>>>>>> We intend to use the direct data as event streams.
>>>>>>
>>>>>>> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
>>>>>>>
>>>>>> No, WSO2 ML doesn't use any event stream. The data stored in tables in
>>>>>> DAS is loaded into ML.
>>>>>>
>>>>>>> 2.) Are there any incremental learning algorithms currently active in
>>>>>>> ML? You mentioned that there are, and that they use the Scala API, so
>>>>>>> there is streaming support with that Scala API. In which format does
>>>>>>> that API acquire the data into ML?
>>>>>>>
>>>>>> No, there are no incremental learning algorithms 

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-13 Thread Maheshakya Wijewardena
Hi Mahesh,

You don't have to look into carbon-ml.

Best regards.

On Sun, Mar 13, 2016 at 5:49 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> I am working on some examples related to Spark and ML. Is there anything to
> do with carbon-ml? I think I don't need to look into that one, do I?
> BR,
> Mahesh
>
> On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> does that Scala API is with your current product or repo?
>>
>>
>> No, we don't have the Scala API included. What we want is to design the
>> Java implementations of those algorithms to train with mini-batches of
>> streaming data with the help of the aforementioned methods so that we can
>> include it as a CEP extension.
>>
>> To clarify, please try to write a simple Java program using Spark
>> MLLib linear regression and k-means clustering with a sample data set (you
>> can find a lot of data sets in the UCI repo[1]). You need to break the
>> dataset into several pieces and train a model repeatedly with those.
>> After each training run, save the model information (such as weights and
>> intercepts for regression and cluster centers for clustering; please check
>> the arguments of the methods I have mentioned and save the required
>> information of the model).
>> When training a model with a new piece of data, use those methods to
>> initialize it and pass the saved values as the arguments. This way you can
>> start from where you stopped in the previous run.
>>
>> Let us know your observations and feel free to ask if you need to know
>> anything more on this.
>>
>> We'll let you know what needs to be done to include this in CEP.
>>
>> Best regards.
>>
>> On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Great, thank you. I already have ML and CEP and am working more with them.
>>> Is that Scala API part of your current product or repo? Thank you.
>>> BR,
>>> Mahesh.
>>>
>>> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Please find the comments inline.
>>>>
>>>>> Is the data stream taken into ML in the event publisher's format, through
>>>>> the event publisher? Or can we use the direct traffic that comes to the
>>>>> event receiver, or else as streams?
>>>>>
>>>> We intend to use the direct data as event streams.
>>>>
>>>>> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
>>>>>
>>>> No, WSO2 ML doesn't use any event stream. The data stored in tables in
>>>> DAS is loaded into ML.
>>>>
>>>>> 2.) Are there any incremental learning algorithms currently active in
>>>>> ML? You mentioned that there are, and that they use the Scala API, so
>>>>> there is streaming support with that Scala API. In which format does
>>>>> that API acquire the data into ML?
>>>>>
>>>> No, there are no incremental learning algorithms in ML. The scala API
>>>> is about Spark MLLib. MLLib supports streaming k-means and other
>>>> generalized linear models (linear regression variants and logistic
>>>> regression) with Scala API. What they basically do in those implementations
>>>> is retraining the trained models with mini batches when data sequentially
>>>> arrives. There, the breaking of streaming data into mini batches is done
>>>> with the help of Spark Streaming. But we do not intend to use Spark
>>>> streaming in our implementation. What we need to do is implement a similar
>>>> behavior for event streams using the Java API.  The Java API has the
>>>> following methods:
>>>>
>>>>- *createModel*(Vector weights, double intercept) - for GLMs
>>>>  <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>
>>>>  Vector: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html>
>>>>- *setInitialModel
>>>>
>>>> <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-07 Thread Maheshakya Wijewardena
Hi Mahesh,

> Does that Scala API come with your current product or repo?


No, we don't have the Scala API included. What we want is to design the
Java implementations of those algorithms to train with mini-batches of
streaming data with the help of the aforementioned methods so that we can
include it as a CEP extension.

To clarify, please try to write a simple Java program using Spark MLLib
linear regression and k-means clustering with a sample data set (you can
find a lot of data sets in the UCI repo[1]). You need to break the dataset
into several pieces and train a model repeatedly with those.
After each training run, save the model information (such as weights and
intercepts for regression and cluster centers for clustering; please check
the arguments of the methods I have mentioned and save the required
information of the model).
When training a model with a new piece of data, use those methods to
initialize it and pass the saved values as the arguments. This way you can
start from where you stopped in the previous run.
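
The save-and-resume loop described above can be sketched without Spark. The
following is a minimal, framework-free Python sketch and not the Spark MLLib
Java API that this thread targets; the names train_piece and train_in_pieces,
the step size and the epoch count are all hypothetical choices made for
illustration. Each call starts from the previously saved weights and
intercept instead of from zero.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (features, label)


def train_piece(rows: List[Row], weights: Sequence[float], intercept: float,
                step_size: float = 0.5, epochs: int = 200):
    """Run gradient descent on one piece, starting from a saved model."""
    w = list(weights)          # copy so the saved model is not mutated
    b = intercept
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in rows:
            # Prediction error of the current model on this row.
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        # Step along the averaged squared-error gradient.
        w = [wi - step_size * g / n for wi, g in zip(w, grad_w)]
        b -= step_size * grad_b / n
    return w, b


def train_in_pieces(pieces: List[List[Row]], n_features: int):
    """Train on each piece in turn, carrying the model over between runs."""
    w, b = [0.0] * n_features, 0.0
    history = []               # the "saved" model after each training run
    for piece in pieces:
        w, b = train_piece(piece, w, b)
        history.append((list(w), b))
    return w, b, history
```

In a real extension the entries of history would be persisted between runs;
here they are only kept in memory to show that resuming from a saved
(weights, intercept) pair gives the same result as continuous training.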

Let us know your observations and feel free to ask if you need to know
anything more on this.

We'll let you know what needs to be done to include this in CEP.

Best regards.

On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Great, thank you. I already have ML and CEP and am working more with them.
> Is that Scala API part of your current product or repo? Thank you.
> BR,
> Mahesh.
>
> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Please find the comments inline.
>>
>>> Is the data stream taken into ML in the event publisher's format, through
>>> the event publisher? Or can we use the direct traffic that comes to the
>>> event receiver, or else as streams?
>>>
>> We intend to use the direct data as event streams.
>>
>>> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
>>>
>> No, WSO2 ML doesn't use any event stream. The data stored in tables in DAS
>> is loaded into ML.
>>
>>> 2.) Are there any incremental learning algorithms currently active in
>>> ML? You mentioned that there are, and that they use the Scala API, so there
>>> is streaming support with that Scala API. In which format does that API
>>> acquire the data into ML?
>>>
>> No, there are no incremental learning algorithms in ML. The scala API is
>> about Spark MLLib. MLLib supports streaming k-means and other generalized
>> linear models (linear regression variants and logistic regression) with
>> Scala API. What they basically do in those implementations is retraining
>> the trained models with mini batches when data sequentially arrives. There,
>> the breaking of streaming data into mini batches is done with the help of
>> Spark Streaming. But we do not intend to use Spark streaming in our
>> implementation. What we need to do is implement a similar behavior for
>> event streams using the Java API.  The Java API has the following methods:
>>
>>- *createModel*(Vector weights, double intercept) - for GLMs
>>  <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>
>>  Vector: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html>
>>- *setInitialModel*(KMeansModel model) - for k-means
>>  <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28org.apache.spark.mllib.clustering.KMeansModel%29>
>>  KMeansModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeansModel.html>
>>
>> With the help of these methods, we can train models again with newly
>> arriving data, keeping the characteristics learned with the previous data.
>> When implementing this, we need to pay attention to other parameters of
>> incremental learning such as data horizon and data obsolescence (indicated
>> in the project ideas page).
>> We need to discuss on how to add these with CEP event streams. I have
>> added Suho into the thread for more clarification.
>>
>> Best regards.
>>
>>
>> On Sat, Mar 5, 2016 at 5:15 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi maheshakya,
>>> as we concerned to use WSO2 CEP to handle streaming data and implement
>>> the machine learning algorithms with Spark MLLib, does data stream is taken
>>> to ML as the event publisher's format 

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-06 Thread Maheshakya Wijewardena
Hi Mahesh,

Please find the comments inline.

> Is the data stream taken into ML in the event publisher's format, through
> the event publisher? Or can we use the direct traffic that comes to the
> event receiver, or else as streams?
>
We intend to use the direct data as event streams.

> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
>
No, WSO2 ML doesn't use any event stream. The data stored in tables in DAS
is loaded into ML.

> 2.) Are there any incremental learning algorithms currently active in
> ML? You mentioned that there are, and that they use the Scala API, so there
> is streaming support with that Scala API. In which format does that API
> acquire the data into ML?
>
No, there are no incremental learning algorithms in ML. The Scala API
belongs to Spark MLLib. MLLib supports streaming k-means and streaming
generalized linear models (linear regression variants and logistic
regression) with its Scala API. What those implementations basically do is
retrain the trained models with mini-batches as data arrives sequentially.
There, the breaking of streaming data into mini-batches is done with the
help of Spark Streaming. But we do not intend to use Spark Streaming in our
implementation. What we need to do is implement similar behavior for
event streams using the Java API. The Java API has the following methods:

   - *createModel*(Vector weights, double intercept) - for GLMs
     <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>
     Vector: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html>
   - *setInitialModel*(KMeansModel model) - for k-means
     <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28org.apache.spark.mllib.clustering.KMeansModel%29>
     KMeansModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeansModel.html>

With the help of these methods, we can train models again with newly
arriving data, keeping the characteristics learned with the previous data.
When implementing this, we need to pay attention to other parameters of
incremental learning such as data horizon and data obsolescence (indicated
in the project ideas page).
We need to discuss how to integrate these with CEP event streams. I have
added Suho to the thread for more clarification.
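
To make the idea of continuing from previously learned cluster centers, with
some control over data obsolescence, more concrete, here is a minimal Python
sketch. It is an illustration only and does not use Spark: the function
update_centers, its decay parameter and the discounted-count bookkeeping are
assumptions for this sketch, loosely in the spirit of decay-weighted
streaming k-means rather than a copy of any library's update rule.

```python
def nearest(centers, point):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((c - p) ** 2
                                 for c, p in zip(centers[k], point)))


def update_centers(centers, counts, batch, decay=1.0):
    """Fold one mini-batch into saved centers.

    centers: saved cluster centers from the previous run
    counts:  how many (possibly discounted) points back each center
    decay:   1.0 keeps all history, 0.0 forgets it (assumed knob for
             data obsolescence)
    """
    dims = len(centers[0])
    sums = [[0.0] * dims for _ in centers]
    hits = [0] * len(centers)
    for point in batch:
        k = nearest(centers, point)
        hits[k] += 1
        for d in range(dims):
            sums[k][d] += point[d]

    new_centers, new_counts = [], []
    for k, center in enumerate(centers):
        old_weight = counts[k] * decay      # discount the old evidence
        total = old_weight + hits[k]
        if total == 0:
            # Nothing old or new supports this center: leave it in place.
            new_centers.append(list(center))
            new_counts.append(0.0)
            continue
        # Weighted average of the old center and the new batch points.
        new_centers.append([(center[d] * old_weight + sums[k][d]) / total
                            for d in range(dims)])
        new_counts.append(total)
    return new_centers, new_counts
```

With decay=1.0 every past point keeps its full weight; with decay=0.0 each
center is recomputed from the newest batch alone, which is one simple way to
think about the data horizon mentioned above.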

Best regards.


On Sat, Mar 5, 2016 at 5:15 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshakya,
> Since we intend to use WSO2 CEP to handle streaming data and implement the
> machine learning algorithms with Spark MLLib, is the data stream taken into
> ML in the event publisher's format, through the event publisher? Or can we
> use the direct traffic that comes to the event receiver, or else as streams?
> I am referring to https://docs.wso2.com/display/CEP410/User+Guide
> 1.) Does the data coming from WSO2 DAS to ML arrive as streams?
> 2.) Are there any incremental learning algorithms currently active in
> ML? You mentioned that there are, and that they use the Scala API, so there
> is streaming support with that Scala API. In which format does that API
> acquire the data into ML?
>
> thank you.
> BR,
> Mahesh.
>
> On Fri, Mar 4, 2016 at 2:03 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> We had to modify the project scope a little to best suit the
>> requirements. We will update the project idea with those concerns soon and
>> let you know.
>>
>> We do not support streaming data in WSO2 Machine Learner at the moment.
>> The new plan is to use WSO2 CEP to handle streaming data and implement
>> the machine learning algorithms with Spark MLLib. You can look at the
>> streaming k-means and streaming linear regression implementations in MLLib.
>> Currently, that API is only available for Scala. Our need is to get the Java
>> APIs of k-means and generalized linear models to support incremental
>> learning with streaming data. This has to be done as mini-batch learning:
>> since these algorithms operate as stochastic gradient descent, any learning
>> with new data can be done on top of the previously learned models. So
>> please go through those APIs[1][2][3] and try to get an idea.
>> Also, please try to understand how event streams work in WSO2 CEP [4][5].
>>
>> Best regards.
>>
>> [1]
>> http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html
>> [2]
>> http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html
>> [3]
>> http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/classification/Logist

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-04 Thread Maheshakya Wijewardena
Hi Mahesh,

We had to modify the project scope a little to best suit the
requirements. We will update the project idea with those concerns soon and
let you know.

We do not support streaming data in WSO2 Machine Learner at the moment. The
new plan is to use WSO2 CEP to handle streaming data and implement the
machine learning algorithms with Spark MLLib. You can look at the streaming
k-means and streaming linear regression implementations in MLLib.
Currently, that API is only available for Scala. Our need is to get the Java
APIs of k-means and generalized linear models to support incremental learning
with streaming data. This has to be done as mini-batch learning: since these
algorithms operate as stochastic gradient descent, any learning
with new data can be done on top of the previously learned models. So
please go through those APIs[1][2][3] and try to get an idea.
Also, please try to understand how event streams work in WSO2 CEP [4][5].
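
As a rough illustration of how a stream of events could be cut into
mini-batches for this kind of retraining, here is a small Python sketch of a
moving-window batcher. It is not part of WSO2 CEP, Siddhi or Spark; the class
name MovingWindowBatcher and the parameters batch_size and window_shift are
hypothetical names chosen for this example.

```python
from collections import deque


class MovingWindowBatcher:
    """Group incoming events into overlapping mini-batches.

    batch_size:   number of events in each emitted mini-batch
    window_shift: how many of the oldest events are dropped after a batch
                  is emitted (equal to batch_size means no overlap)
    """

    def __init__(self, batch_size, window_shift):
        if not 0 < window_shift <= batch_size:
            raise ValueError("window_shift must be in 1..batch_size")
        self.batch_size = batch_size
        self.window_shift = window_shift
        self._window = deque()

    def add(self, event):
        """Buffer one event; return a full mini-batch when ready, else None."""
        self._window.append(event)
        if len(self._window) < self.batch_size:
            return None
        batch = list(self._window)
        # Slide the window: forget the oldest events, keep the rest so the
        # next mini-batch overlaps with this one.
        for _ in range(self.window_shift):
            self._window.popleft()
        return batch
```

Each non-None return value would then be handed to a training step that
starts from the previously learned model, so a smaller window_shift retrains
more often on more heavily overlapping data.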

Best regards.

[1]
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html
[2]
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html
[3]
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/classification/LogisticRegressionWithSGD.html
[4] https://docs.wso2.com/display/CEP310/Working+with+Event+Streams
[5] https://docs.wso2.com/display/CEP310/Working+with+Execution+Plans

On Fri, Mar 4, 2016 at 11:26 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Give me some time to go through your ML package. Does the current product have
> any streaming data support? I did some university projects related to machine
> learning with regression, modelling, factor analysis, cluster analysis and
> classification problems (discriminant analysis) with SVMs (support vector
> machines), neural networks, LS classification and ML (maximum likelihood).
> Give me some time to see how the WSO2 architecture works. Then I can come up
> with a good architecture. Thank you.
> BR,
> Mahesh.
>
> On Wed, Mar 2, 2016 at 2:41 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Thank you for the resources. I will go through them, and I am looking
>> forward to this proposed project. Thank you.
>> BR,
>> Mahesh.
>>
>> On Wed, Mar 2, 2016 at 1:52 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Thank you for your interest in this project.
>>>
>>> We would like to know what type of similar projects you have worked on.
>>> You may have seen that WSO2 Machine Learner supports several learning
>>> algorithms at the moment[1]. This project intends to leverage the existing
>>> algorithms in WSO2 Machine Learner to support streaming data. As a
>>> first step, you can get an idea about what WSO2 Machine Learner does
>>> and how it operates. You can download WSO2 Machine Learner from the product
>>> page[2] and the source code [3]. ML uses Apache Spark MLLib[4] for
>>> its algorithms, so it's better to read and understand what it does as well.
>>>
>>> In order to get an idea about the deliverables and the scope of this
>>> project, try to understand how Spark streaming[5] (see examples) handles
>>> streaming data. Also, have a look in the streaming algorithms[6][7]
>>> supported by MLLib. There are two approaches discussed to employ
>>> incremental learning in ML in the project proposals page. These streaming
>>> algorithms can be directly used in the first approach. For the other
>>> approach, your implementation should contain a procedure to create mini
>>> batches from streaming data with relevant sizes (i.e. a moving window) and
>>> do periodic retraining of the same algorithm.
>>>
>>> To start with the project, you will need to come up with a suitable plan
>>> and an architecture first.
>>>
>>> Please watch the video referenced in the proposal (reference: 5). It
>>> will help you getting a better idea about machine learning algorithms with
>>> streaming data.
>>>
>>> Let us know if you need any help with these.
>>>
>>> Best regards
>>>
>>> [1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
>>> [2] http://wso2.com/products/machine-learner/
>>> [3]
>>> https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
>>> [4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
>>> [5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
>>> [6]
>>> https:/

Re: [Dev] Regarding Proposal 6: [ML] Predictive analytics with online data for WSO2 Machine Learner

2016-03-03 Thread Maheshakya Wijewardena
Hi Heshani,

Please add WSO2 developers list as well when you are inquiring about
projects because the community is interested in these projects and their
input is vital and the decisions are made collaboratively.

That said, about your request: this project is targeted to be a GSoC
project, so the scope and the timeline of the project should comply with
the official GSoC timelines. So there should be a concrete goal and
associated tasks that can be completed within that time frame. These tasks
should be properly discussed with the community and finalized, so that they
can go into the project proposal.
If you want to continue this as your final-year project, you should discuss
what more can be done (in addition to the components added in the GSoC
project) and also find a supervisor. But this should be done independently
from the GSoC and you need to get approval from WSO2 for that.

As a starting point (considering GSoC), you can download the Machine Learner
product and go through the examples and quick start guide.
Then you can try developing small ML applications with Spark MLLib and
play around with Spark Streaming as well.
After getting some concrete idea, try to write a program using Spark
Streaming and MLLib's streaming linear regression and streaming k-means to
train models with streaming data, and share your experience with us.

Best regards.

On Thu, Mar 3, 2016 at 12:23 PM, Heshani Herath <heshani7.her...@gmail.com>
wrote:

> Hi Maheshakya,
>
> I went through the WSO2 ML details. As I'm doing this as my 4th-year research
> project, I need more requirements to fill the whole year. If there are
> any further requirements, please be kind enough to share them with me.
>
> Thank you!
>
> On Mon, Feb 29, 2016 at 3:19 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Heshani,
>>
>> Thank you for your interest in this project.
>>
>> WSO2 Machine Learner supports several learning algorithms at the
>> moment[1]. This project intends to leverage the existing algorithms in
>> WSO2 Machine Learner to support streaming data. As a first step, you
>> can get an idea about what WSO2 Machine Learner does and how it operates.
>> You can download WSO2 Machine Learner from the product page[2] and the
>> source code [3]. ML uses Apache Spark MLLib[4] for its algorithms, so
>> it's better to read and understand what it does as well.
>>
>> In order to get an idea about the deliverables and the scope of this
>> project, try to understand how Spark streaming[5] (see examples) handles
>> streaming data. Also, have a look in the streaming algorithms[6][7]
>> supported by MLLib. There are two approaches discussed to employ
>> incremental learning in ML in the project proposals page. These streaming
>> algorithms can be directly used in the first approach. For the other
>> approach, your implementation should contain a procedure to create mini
>> batches from streaming data with relevant sizes (i.e. a moving window) and
>> do periodic retraining of the same algorithm.
>>
>> To start with the project, you will need to come up with a suitable plan
>> and an architecture first.
>>
>> Please watch the video referenced in the proposal (reference: 5). It will
>> help you getting a better idea about machine learning algorithms with
>> streaming data.
>>
>> Let us know if you need any help with these.
>>
>> Best regards
>>
>> [1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
>> [2] http://wso2.com/products/machine-learner/
>> [3]
>> https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
>> [4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
>> [5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
>> [6]
>> https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
>> [7]
>> https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means
>>
>> On Mon, Feb 29, 2016 at 2:54 PM, Heshani Herath <
>> heshani7.her...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>>
>>> https://docs.wso2.com/display/GSoC/Project+Proposals+for+2016#ProjectProposalsfor2016-Proposal6:[ML]PredictiveanalyticswithonlinedataforWSO2MachineLearner
>>>
>>> I'm a 4th year undergraduate from SLIIT faculty of
>>> computing (specializing in Software Engineering) who is interested in doing
>>> the aforementioned project as the final year research. I would like to know
>>> more details on this topic and the procedure to be followed when
>>> implementing it. Please be kind enough to reply as soon as possible.
>>>
>>> Thank you
>>>
>>> --
>>> Best Regards,
>>> Heshani Herath
>>>
>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>
>
> --
> Best Regards,
> Heshani Herath
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-02 Thread Maheshakya Wijewardena
Hi Mahesh,

Thank you for your interest in this project.

We would like to know what type of similar projects you have worked on. You
may have seen that WSO2 Machine Learner supports several learning
algorithms at the moment[1]. This project intends to leverage the existing
algorithms in WSO2 Machine Learner to support streaming data. As a first
step, you can get an idea about what WSO2 Machine Learner does
and how it operates. You can download WSO2 Machine Learner from the product
page[2] and get the source code from [3]. ML uses Apache Spark MLlib[4] for
its algorithms, so it's better to read and understand what it does as well.

In order to get an idea about the deliverables and the scope of this
project, try to understand how Spark streaming[5] (see examples) handles
streaming data. Also, have a look at the streaming algorithms[6][7]
supported by MLlib. The project proposals page discusses two approaches
to employing incremental learning in ML. These streaming
algorithms can be directly used in the first approach. For the other
approach, your implementation should contain a procedure to create mini
batches from streaming data with relevant sizes (i.e. a moving window) and
do periodic retraining of the same algorithm.
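
As a rough illustration of the second approach, here is a minimal plain-Java
sketch. It is not WSO2 ML or Spark MLlib code: the class name, the
windowSize/windowShift/stepSize/iterations parameters, and the choice of
linear regression trained by gradient descent are all assumptions made for
the example. It buffers incoming events in a moving window, retrains
whenever the window fills, and then drops the oldest windowShift samples.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/**
 * Illustrative only: buffers streaming samples in a moving window and
 * retrains a linear model each time the window fills. Every name here is
 * hypothetical; this is not the WSO2 ML or Spark MLlib API.
 */
final class MovingWindowRegressor {
    private final int windowSize;   // samples per mini batch
    private final int windowShift;  // oldest samples dropped after a retrain
    private final double stepSize;  // gradient-descent learning rate
    private final int iterations;   // gradient steps per retrain
    private final Deque<double[]> features = new ArrayDeque<>();
    private final Deque<Double> labels = new ArrayDeque<>();
    private final double[] weights; // last entry is the intercept
    private int retrainCount = 0;

    MovingWindowRegressor(int numFeatures, int windowSize, int windowShift,
                          double stepSize, int iterations) {
        if (windowSize < 1 || windowShift < 1 || windowShift > windowSize) {
            throw new IllegalArgumentException("need 1 <= windowShift <= windowSize");
        }
        this.windowSize = windowSize;
        this.windowShift = windowShift;
        this.stepSize = stepSize;
        this.iterations = iterations;
        this.weights = new double[numFeatures + 1];
    }

    /** Adds one streaming event; returns true if it triggered a retrain. */
    boolean onEvent(double[] x, double y) {
        features.addLast(x.clone());
        labels.addLast(y);
        if (features.size() < windowSize) {
            return false;
        }
        retrain();
        // Slide the window: forget the oldest windowShift samples.
        for (int i = 0; i < windowShift; i++) {
            features.removeFirst();
            labels.removeFirst();
        }
        return true;
    }

    /** Runs gradient descent over the current window, starting from the previous weights. */
    private void retrain() {
        int n = features.size();
        for (int it = 0; it < iterations; it++) {
            double[] gradient = new double[weights.length];
            Iterator<double[]> xs = features.iterator();
            Iterator<Double> ys = labels.iterator();
            while (xs.hasNext()) {
                double[] x = xs.next();
                double error = predict(x) - ys.next();
                for (int j = 0; j < x.length; j++) {
                    gradient[j] += error * x[j];
                }
                gradient[x.length] += error; // intercept term
            }
            for (int j = 0; j < weights.length; j++) {
                weights[j] -= stepSize * gradient[j] / n;
            }
        }
        retrainCount++;
    }

    double predict(double[] x) {
        double sum = weights[weights.length - 1];
        for (int j = 0; j < x.length; j++) {
            sum += weights[j] * x[j];
        }
        return sum;
    }

    int retrainCount() {
        return retrainCount;
    }
}
```

With windowShift smaller than windowSize, consecutive batches overlap, so
older data fades out gradually; with windowShift equal to windowSize, each
batch is disjoint from the previous one.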

To start with the project, you will need to come up with a suitable plan
and an architecture first.

Please watch the video referenced in the proposal (reference: 5). It will
help you get a better idea about machine learning algorithms with
streaming data.

Let us know if you need any help with these.

Best regards

[1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
[2] http://wso2.com/products/machine-learner/
[3]
https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
[4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
[5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
[6]
https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
[7]
https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means

On Wed, Mar 2, 2016 at 1:19 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi all,
> I am interested in contributing to proposal 6: "Predictive analytic with
> online data for WSO2 Machine Learner" for GSOC2 this time. Since I have
> been engaged in some similar projects, I think it will be a great
> experience for me. Please let me know what you think and what you suggest.
> I have been going through your documents. Thank you.
> regards,
> Mahesh Dananjaya.
>
>
> ___
> Dev mailing list
> Dev@wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/dev
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] Regarding Proposal 6: [ML] Predictive analytics with online data for WSO2 Machine Learner

2016-02-29 Thread Maheshakya Wijewardena
Hi Heshani,

Thank you for your interest in this project.

WSO2 Machine Learner supports several learning algorithms at the moment[1].
This project intends to leverage the existing algorithms in WSO2 Machine
Learner to support streaming data. As a first step, you can get an
idea about what WSO2 Machine Learner does and how it operates. You can
download WSO2 Machine Learner from the product page[2] and get the source
code from [3]. ML uses Apache Spark MLlib[4] for its algorithms, so it's
better to read and understand what it does as well.

In order to get an idea about the deliverables and the scope of this
project, try to understand how Spark streaming[5] (see examples) handles
streaming data. Also, have a look at the streaming algorithms[6][7]
supported by MLlib. The project proposals page discusses two approaches
to employing incremental learning in ML. These streaming
algorithms can be directly used in the first approach. For the other
approach, your implementation should contain a procedure to create mini
batches from streaming data with relevant sizes (i.e. a moving window) and
do periodic retraining of the same algorithm.

To start with the project, you will need to come up with a suitable plan
and an architecture first.

Please watch the video referenced in the proposal (reference: 5). It will
help you get a better idea about machine learning algorithms with
streaming data.

Let us know if you need any help with these.

Best regards

[1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
[2] http://wso2.com/products/machine-learner/
[3]
https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
[4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
[5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
[6]
https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
[7]
https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means

On Mon, Feb 29, 2016 at 2:54 PM, Heshani Herath <heshani7.her...@gmail.com>
wrote:

> Hi,
>
>
> https://docs.wso2.com/display/GSoC/Project+Proposals+for+2016#ProjectProposalsfor2016-Proposal6:[ML]PredictiveanalyticswithonlinedataforWSO2MachineLearner
>
> I'm a 4th year undergraduate from SLIIT faculty of computing (specializing
> in Software Engineering) who is interested in doing the aforementioned
> project as the final year research. I would like to know more details on
> this topic and the procedure to be followed when implementing it. Please be
> kind enough to reply as soon as possible.
>
> Thank you
>
> --
> Best Regards,
> Heshani Herath
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [ML][GSOC 2016] Proposal6- Predictive analytics with online data for WSO2 Machine Learner

2016-02-24 Thread Maheshakya Wijewardena
Hi Randika,

Thank you for showing interest in this project.

I've checked the SPMF library. What it supports is sequential
pattern mining, which is quite different from the machine learning algorithms
used in WSO2 ML. What this project intends to achieve is to leverage the
existing algorithms to support streaming data. As a first step, you
can get an idea about the architecture of WSO2 ML[1]. CEP event streams[2]
/ publishers[3] may be used for feeding data streams into ML. Since ML
uses Apache Spark MLlib[4] for its algorithms, you might want to read
about that.

To get an idea about an architecture, try to understand how Spark
streaming[5] (see examples) handles input data streams. Also, have a look
at the streaming algorithms[6][7] supported. In order to use these
algorithms, you may have to use the Scala APIs (since Spark does not have Java
implementations yet). There are two approaches indicated in the project
proposals page. These streaming algorithms can be directly used in the
first approach. For the other approach, the architecture should contain a
procedure to create mini batches from streaming data with relevant sizes
(i.e. a moving window) and do periodic retraining of the same algorithm.
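
For contrast with periodic retraining, the incremental idea behind the first
approach can be sketched in plain Java as a model that is nudged by every
incoming event. This is only a stand-in to show the principle, not Spark's
StreamingLinearRegressionWithSGD; the class name, the stepSize parameter,
and the per-event update rule are assumptions made for the example.

```java
/**
 * Illustrative only: a linear model updated one event at a time with a
 * stochastic gradient step. All names are hypothetical; this is not the
 * Spark MLlib or WSO2 ML API.
 */
final class OnlineLinearRegression {
    private final double[] weights; // last entry is the intercept
    private final double stepSize;  // learning rate for each update

    OnlineLinearRegression(int numFeatures, double stepSize) {
        this.weights = new double[numFeatures + 1];
        this.stepSize = stepSize;
    }

    /** Applies one gradient step for the squared error of a single event. */
    void update(double[] x, double y) {
        double error = predict(x) - y;
        for (int j = 0; j < x.length; j++) {
            weights[j] -= stepSize * error * x[j];
        }
        weights[weights.length - 1] -= stepSize * error;
    }

    double predict(double[] x) {
        double sum = weights[weights.length - 1];
        for (int j = 0; j < x.length; j++) {
            sum += weights[j] * x[j];
        }
        return sum;
    }
}
```

No events are stored here, so memory use stays constant, but the model
cannot explicitly forget old data the way a moving window can.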

BTW, watching the video referenced in the proposal (reference: 5) will help
you get a better idea about machine learning algorithms with streaming
data.

Let us know if you need any help with these.

Best regards

[1] https://docs.wso2.com/display/ML110/Architecture
[2] https://docs.wso2.com/display/CEP400/Understanding+Event+Streams
[3] https://docs.wso2.com/display/CEP400/HTTP+Event+Publisher
[4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
[5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
[6]
https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
[7]
https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means

On Thu, Feb 25, 2016 at 10:40 AM, Randika Navagamuwa <
randika...@cse.mrt.ac.lk> wrote:

> Hi,
>  I'm a 3rd year undergraduate from Department of Computer Science and
> Engineering, University of Moratuwa. I went through the project proposals
> and I want to clarify some things regarding this project.
>
>    - I've seen that two approaches are mentioned, but other than those two
>      methods, can the objectives be achieved using this approach?
>        - The SPMF[1] library can be used for pattern analysis.
>        - Then, if a data set has the same pattern as a previously modeled
>          data set, the same algorithm can be used.
>
> According to the deliverables, the first step is to come up with an
> architecture. Is there any online material to refer to before starting
> this project?
>
> [1]http://www.philippe-fournier-viger.com/spmf/
>
>
> *Best Regards*
>
> *Randika Navagamuwa,*
>
> *Department of Computer Science & Engineering,*
>
> *University of Moratuwa,*
> *Sri Lanka.*
>
> *www.rnavagamuwa.com <http://www.rnavagamuwa.com>*
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [ML] Are we using correct names for feature types?

2016-01-18 Thread Maheshakya Wijewardena
Should there be an issue, since we are already considering these types of
"numerical" features as categorical?

On Tue, Jan 19, 2016 at 10:57 AM, Nirmal Fernando <nir...@wso2.com> wrote:

>
> On Tue, Jan 19, 2016 at 10:53 AM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> IMO, 0 & 1 is still categorical. It's equivalent to an encoded string
>> variable, isn't it?
>>
>
> Question is about what we should name current 'numerical' type. 0, 1 is of
> course categorical :)
>
>
>> On Tue, Jan 19, 2016 at 10:49 AM, Nirmal Fernando <nir...@wso2.com>
>> wrote:
>>
>>> All,
>>>
>>> Currently we have numerical and categorical. But there can be
>>> categorical features with numerical values; for example, if a feature has
>>> only 0 & 1 we say it's a categorical feature. What should be the
>>> appropriate names?
>>>
>>> --
>>>
>>> Thanks & regards,
>>> Nirmal
>>>
>>> Team Lead - WSO2 Machine Learner
>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>> Mobile: +94715779733
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>>
>>>
>>
>>
>> --
>> *Supun Sethunga*
>> Software Engineer
>> WSO2, Inc.
>> http://wso2.com/
>> lean | enterprise | middleware
>> Mobile : +94 716546324
>>
>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [ML] Are we using correct names for feature types?

2016-01-18 Thread Maheshakya Wijewardena
Since the concern is about the suitability of the term "numerical", what if
we add two new feature types:

   1. Continuous numerical
   2. Discrete numerical

to represent the features that actually have some numerical meaning and
then eliminate "numerical". That way we can get rid of the confusion when
there are categorical features with numerical values.
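
To make the proposed naming concrete, here is a small plain-Java sketch of
the three feature types. The enum, the helper class, and especially the
guessing heuristic (integer-valued columns with at most maxCategories
distinct values are treated as categorical) are assumptions made for
illustration only; nothing here reflects how WSO2 ML actually decides
feature types.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical names for the feature types discussed in this thread. */
enum FeatureType {
    CATEGORICAL,
    DISCRETE_NUMERICAL,
    CONTINUOUS_NUMERICAL
}

/** Illustrative heuristic only; the threshold is an assumption. */
final class FeatureTypeGuesser {
    private FeatureTypeGuesser() {
    }

    static FeatureType guess(double[] values, int maxCategories) {
        Set<Double> distinct = new HashSet<>();
        boolean allIntegers = true;
        for (double v : values) {
            distinct.add(v);
            if (v != Math.rint(v)) {
                allIntegers = false; // a fractional value rules out discrete types
            }
        }
        if (!allIntegers) {
            return FeatureType.CONTINUOUS_NUMERICAL;
        }
        // Few distinct integer codes (e.g. 0 and 1) look like encoded categories;
        // many distinct integers (e.g. number of wins) look like counts.
        return distinct.size() <= maxCategories
                ? FeatureType.CATEGORICAL
                : FeatureType.DISCRETE_NUMERICAL;
    }
}
```

A heuristic like this can only suggest a type; a column of small integer
counts would be misread as categorical, so the user would still need to be
able to override the guess.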

On Tue, Jan 19, 2016 at 11:03 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Should there be an issue, since we are already considering these type of
> "numerical" features as categorical?
>
> On Tue, Jan 19, 2016 at 10:57 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>>
>> On Tue, Jan 19, 2016 at 10:53 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>
>>> IMO, 0 & 1 is still categorical. Its an equivalent to a encoded string
>>> variable, isn't it?
>>>
>>
>> Question is about what we should name current 'numerical' type. 0, 1 is
>> of course categorical :)
>>
>>
>>> On Tue, Jan 19, 2016 at 10:49 AM, Nirmal Fernando <nir...@wso2.com>
>>> wrote:
>>>
>>>> All,
>>>>
>>>> Currently we have; numerical and categorical. But there can be
>>>> categorical features with numerical values for an example if a feature has
>>>> only 0 & 1 we say it's a categorical feature. What should be the
>>>> appropriate names?
>>>>
>>>> --
>>>>
>>>> Thanks & regards,
>>>> Nirmal
>>>>
>>>> Team Lead - WSO2 Machine Learner
>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>> Mobile: +94715779733
>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> *Supun Sethunga*
>>> Software Engineer
>>> WSO2, Inc.
>>> http://wso2.com/
>>> lean | enterprise | middleware
>>> Mobile : +94 716546324
>>>
>>
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [ML] Are we using correct names for feature types?

2016-01-18 Thread Maheshakya Wijewardena
Note that "continuous" alone is not sufficient, since there can be numerical
attributes that are not continuous, such as number of wins.

On Tue, Jan 19, 2016 at 11:57 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Since the concern is about the suitability of the term "numerical", what
> if we add two new feature types:
>
>    1. Continuous numerical
>    2. Discrete numerical
>
> to represent the features that actually have some numerical meaning and
> then eliminate "numerical". That way we can get rid of the confusion when
> there are categorical features with numerical values.
>
> On Tue, Jan 19, 2016 at 11:03 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Should there be an issue, since we are already considering these type of
>> "numerical" features as categorical?
>>
>> On Tue, Jan 19, 2016 at 10:57 AM, Nirmal Fernando <nir...@wso2.com>
>> wrote:
>>
>>>
>>> On Tue, Jan 19, 2016 at 10:53 AM, Supun Sethunga <sup...@wso2.com>
>>> wrote:
>>>
>>>> IMO, 0 & 1 is still categorical. Its an equivalent to a encoded string
>>>> variable, isn't it?
>>>>
>>>
>>> Question is about what we should name current 'numerical' type. 0, 1 is
>>> of course categorical :)
>>>
>>>
>>>> On Tue, Jan 19, 2016 at 10:49 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> All,
>>>>>
>>>>> Currently we have; numerical and categorical. But there can be
>>>>> categorical features with numerical values for an example if a feature has
>>>>> only 0 & 1 we say it's a categorical feature. What should be the
>>>>> appropriate names?
>>>>>
>>>>> --
>>>>>
>>>>> Thanks & regards,
>>>>> Nirmal
>>>>>
>>>>> Team Lead - WSO2 Machine Learner
>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>> Mobile: +94715779733
>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Supun Sethunga*
>>>> Software Engineer
>>>> WSO2, Inc.
>>>> http://wso2.com/
>>>> lean | enterprise | middleware
>>>> Mobile : +94 716546324
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Thanks & regards,
>>> Nirmal
>>>
>>> Team Lead - WSO2 Machine Learner
>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>> Mobile: +94715779733
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>>
>>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [ML] Are we using correct names for feature types?

2016-01-18 Thread Maheshakya Wijewardena
No.
The only reason is that we need a name for the features that actually have a
numerical meaning. "Numerical" may mean "numbers" (which can be categorical
as well) to users. In that case we need an alternative. Is there a better
name we can use?
Otherwise, we need to keep "numerical" as it is and assume it means the
features that have some numerical meaning.

On Tue, Jan 19, 2016 at 12:04 PM, CD Athuraliya <chathur...@wso2.com> wrote:

>
>
> On Tue, Jan 19, 2016 at 11:58 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Note that only continuous is not sufficient since there can be numerical
>> attributes that are not continuous like number of wins, etc.
>>
>> On Tue, Jan 19, 2016 at 11:57 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Since the concern is about the suitability of the term "numerical", what
>>> if we add two new feature types:
>>>
>>>    1. Continuous numerical
>>>    2. Discrete numerical
>>>
>>> to represent the features that actually have some numerical meaning and
>>> then eliminate "numerical". That way we can get rid of the confusion when
>>> there are categorical features with numerical values.
>>>
>>
> What are we trying to achieve by introducing these new types? Do we introduce
> anything new to our workflow depending on these new types?
>
>>
>>> On Tue, Jan 19, 2016 at 11:03 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Should there be an issue, since we are already considering these type
>>>> of "numerical" features as categorical?
>>>>
>>>> On Tue, Jan 19, 2016 at 10:57 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>>
>>>>> On Tue, Jan 19, 2016 at 10:53 AM, Supun Sethunga <sup...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> IMO, 0 & 1 is still categorical. Its an equivalent to a encoded
>>>>>> string variable, isn't it?
>>>>>>
>>>>>
>>>>> Question is about what we should name current 'numerical' type. 0, 1
>>>>> is of course categorical :)
>>>>>
>>>>>
>>>>>> On Tue, Jan 19, 2016 at 10:49 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> Currently we have; numerical and categorical. But there can be
>>>>>>> categorical features with numerical values for an example if a feature has
>>>>>>> only 0 & 1 we say it's a categorical feature. What should be the
>>>>>>> appropriate names?
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Team Lead - WSO2 Machine Learner
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Supun Sethunga*
>>>>>> Software Engineer
>>>>>> WSO2, Inc.
>>>>>> http://wso2.com/
>>>>>> lean | enterprise | middleware
>>>>>> Mobile : +94 716546324
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Thanks & regards,
>>>>> Nirmal
>>>>>
>>>>> Team Lead - WSO2 Machine Learner
>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>> Mobile: +94715779733
>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Pruthuvi Maheshakya Wijewardena
>>>> mahesha...@wso2.com
>>>> +94711228855
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> mahesha...@wso2.com
>>> +94711228855
>>>
>>>
>>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>
>
> --
> *CD Athuraliya*
> Software Engineer
> WSO2, Inc.
> lean . enterprise . middleware
> Mobile: +94 716288847 <94716288847>
> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
> <https://twitter.com/cdathuraliya> | Blog
> <https://cdathuraliya.wordpress.com/>
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [DEV][VOTE] Release WSO2 Data Analytics Server 3.0.1 RC2

2016-01-17 Thread Maheshakya Wijewardena
Hi

I've tested ML 1.1.0 with DAS 3.0.1 RC2 spark cluster.
Works fine.

[x] Stable - Go ahead and release

Best regards.


On Mon, Jan 18, 2016 at 11:10 AM, Seshika Fernando <sesh...@wso2.com> wrote:

> Hi,
>
> I've tested DAS 3.0.1 RC2 with the following
>
>    - complex execution plans that test event tables, indexing,
>      patterns, windows, joins, etc.
>    - data persistence
>    - event tracing
>    - data explorer
>    - event publisher
>    - stream simulation
>
>
> All works well.
>
> My vote - [x] Stable - Go ahead and release.
>
> seshi
>
> On Mon, Jan 18, 2016 at 10:59 AM, Anjana Fernando <anj...@wso2.com> wrote:
>
>> Hi,
>>
>> * Tested the basic indexing functionality in a standalone server
>>
>> * Tested Spark analytics with a 2 node cluster with a subset of the
>> Wikipedia data set (4 Million records).
>>
>> [X] Stable - Go ahead and release
>>
>> Cheers,
>> Anjana.
>>
>> On Sun, Jan 17, 2016 at 3:33 PM, Sachith Withana <sach...@wso2.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> This is the second release candidate of WSO2 DAS 3.0.1. Please download,
>>> test and vote.
>>>
>>> The vote will be open for 72 hours or as needed.
>>>
>>> This release fixes the following issues:
>>> https://wso2.org/jira/issues/?filter=12622
>>>
>>> Binary distribution file:
>>> https://svn.wso2.org/repos/wso2/people/sachith/rc2/
>>>
>>>
>>> [ ] Broken - Do not release (explain why)
>>> [ ] Stable - Go ahead and release
>>>
>>>
>>> Thanks,
>>> WSO2 DAS Team.
>>>
>>> --
>>> Sachith Withana
>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>> E-mail: sachith AT wso2.com
>>> M: +94715518127
>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>
>>> ___
>>> Dev mailing list
>>> Dev@wso2.org
>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>
>>>
>>
>>
>> --
>> *Anjana Fernando*
>> Senior Technical Lead
>> WSO2 Inc. | http://wso2.com
>> lean . enterprise . middleware
>>
>> ___
>> Dev mailing list
>> Dev@wso2.org
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>>
>
> ___
> Dev mailing list
> Dev@wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/dev
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


[Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC5

2015-12-22 Thread Maheshakya Wijewardena
Hi Devs,

This is the 5th Release Candidate of WSO2 Machine Learner 1.1.0.

This release fixes the following issues:
https://wso2.org/jira/issues/?filter=12601

Please download, test and vote. The vote will be open for 72 hours or as long
as needed.

*Binary distribution files:*
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc5/wso2ml-1.1.0.zip

*P2 repository*:
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc5/p2-repo.zip


*Maven staging repository:*
http://maven.wso2.org/nexus/content/repositories/orgwso2ml-230/

*The tag to be voted upon:*
https://github.com/wso2/product-ml/tree/v1.1.0-rc5


[ ] Broken - do not release (explain why)
[ ] Stable - go ahead and release

Thank you,
Machine Learner Team

-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC5

2015-12-22 Thread Maheshakya Wijewardena
Tested ESB predict mediator.

[ x ] Stable - go ahead and release

On Tue, Dec 22, 2015 at 4:57 PM, Ashen Weerathunga <as...@wso2.com> wrote:

> Hi all,
>
> I've tested the following,
>
>    - Verified both v10 APIs and v11 APIs.
>    - Anomaly detection algorithms tested via ML wizard.
>    - Default and tuned samples of Anomaly detection algorithms.
>
> Works fine. I didn't encounter any issues.
>
> [x] Stable - go ahead and release
>
> Thanks and Regards,
> Ashen
>
> On Tue, Dec 22, 2015 at 4:55 PM, Upul Bandara <u...@wso2.com> wrote:
>
>> Tested CEP extension, samples and ML Wizard.
>>
>> Looks OK
>>
>> [ x ] Stable - go ahead and release
>>
>>
>> On Tue, Dec 22, 2015 at 4:47 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> Tested samples and looks good!
>>>
>>> [ x ] Stable - go ahead and release
>>>
>>>
>>> On Tue, Dec 22, 2015 at 3:52 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Devs,
>>>>
>>>> This is the 5th Release Candidate of WSO2 Machine Learner 1.1.0.
>>>>
>>>> This release fixes the following issues:
>>>> https://wso2.org/jira/issues/?filter=12601
>>>>
>>>> Please download, test and vote. The vote will be open for 72 hours or as
>>>> long as needed.
>>>>
>>>> *Binary distribution files:*
>>>>
>>>> https://github.com/wso2/product-ml/releases/download/v1.1.0-rc5/wso2ml-1.1.0.zip
>>>>
>>>> *P2 repository*:
>>>>
>>>> https://github.com/wso2/product-ml/releases/download/v1.1.0-rc5/p2-repo.zip
>>>>
>>>>
>>>> *Maven staging repository:*
>>>> http://maven.wso2.org/nexus/content/repositories/orgwso2ml-230/
>>>>
>>>> *The tag to be voted upon:*
>>>> https://github.com/wso2/product-ml/tree/v1.1.0-rc5
>>>>
>>>>
>>>> [ ] Broken - do not release (explain why)
>>>> [ ] Stable - go ahead and release
>>>>
>>>> Thank you,
>>>> Machine Learner Team
>>>>
>>>> --
>>>> Pruthuvi Maheshakya Wijewardena
>>>> mahesha...@wso2.com
>>>> +94711228855
>>>>
>>>>
>>>>
>>>> ___
>>>> Dev mailing list
>>>> Dev@wso2.org
>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Thanks & regards,
>>> Nirmal
>>>
>>> Team Lead - WSO2 Machine Learner
>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>> Mobile: +94715779733
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>>
>>>
>>> ___
>>> Dev mailing list
>>> Dev@wso2.org
>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>
>>>
>>
>>
>> --
>> Upul Bandara,
>> Associate Technical Lead, WSO2, Inc.,
>> Mob: +94 715 468 345.
>>
>> ___
>> Dev mailing list
>> Dev@wso2.org
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>>
>
>
> --
> *Ashen Weerathunga*
> Software Engineer - Intern
> WSO2 Inc.: http://wso2.com
> lean.enterprise.middleware
>
> Email: as...@wso2.com
> Mobile: +94 716042995 <94716042995>
> LinkedIn:
> *http://lk.linkedin.com/in/ashenweerathunga
> <http://lk.linkedin.com/in/ashenweerathunga>*
>
> ___
> Dev mailing list
> Dev@wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/dev
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


[Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC4

2015-12-21 Thread Maheshakya Wijewardena
Hi Devs,

This is the 4th Release Candidate of WSO2 Machine Learner 1.1.0.

This release fixes the following issues:
https://wso2.org/jira/issues/?filter=12600

Please download, test and vote. The vote will be open for 72 hours or as long
as needed.

*Binary distribution files:*
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc4/wso2ml-1.1.0.zip

*P2 repository*:
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc4/p2-repo.zip


*Maven staging repository:*
http://maven.wso2.org/nexus/content/repositories/orgwso2ml-224/

*The tag to be voted upon:*
https://github.com/wso2/product-ml/tree/v1.1.0-rc4


[ ] Broken - do not release (explain why)
[ ] Stable - go ahead and release

Thank you,
Machine Learner Team

-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [ML] Getting Error on executing sample of Java Client

2015-12-20 Thread Maheshakya Wijewardena
Hi,

The InvalidClassException means the model bundled with the sample was
serialized with a different build of the ML classes than the one on your
classpath. In order to run the model usage sample, you first have to create
a model in Machine Learner using the dataset (pima-indians-diabetes dataset),
the algorithm (Logistic-Regression), and the response variable (Class)
indicated in the sample. Then replace the downloaded-ml-model in the
src/main/resources/ folder with the model you have generated. That way you
can use a model built with the ML version you currently have.
We have also updated the model in the resources folder[1].
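
If you want to confirm this kind of mismatch yourself, the following
plain-Java sketch compares the serialVersionUID recorded in a serialized
stream with the one computed for the local class. It uses only standard
java.io classes; SerialVersionCheck and DemoModel are hypothetical names
made up for the example and are not part of carbon-ml.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative diagnostic helper; all names are hypothetical. */
final class SerialVersionCheck {

    /** Stand-in for a serialized model class. */
    static final class DemoModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights = {0.5, -1.25};
    }

    private SerialVersionCheck() {
    }

    /** Returns the serialVersionUID of the class as it exists on the local classpath. */
    static long localUid(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type.getName() + " is not Serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    /** Collects class name to serialVersionUID for every descriptor read from the stream. */
    static Map<String, Long> streamUids(byte[] serialized) {
        final Map<String, Long> uids = new LinkedHashMap<>();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(serialized)) {
            @Override
            protected ObjectStreamClass readClassDescriptor()
                    throws IOException, ClassNotFoundException {
                ObjectStreamClass descriptor = super.readClassDescriptor();
                uids.putIfAbsent(descriptor.getName(), descriptor.getSerialVersionUID());
                return descriptor;
            }
        }) {
            in.readObject();
        } catch (InvalidClassException | ClassNotFoundException e) {
            // Expected when a local class differs or is missing; descriptors
            // read before the failure have already been recorded.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return uids;
    }

    /** Serializes a value to bytes so the example is self-contained. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Reading the bytes of a downloaded model file and passing them to
streamUids would list the UIDs the file was written with; any class whose
value differs from localUid is one that changed between versions.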

Best regards.
[1]
https://github.com/wso2/carbon-ml/tree/master/samples/model-usage/src/main/resources

On Sun, Dec 20, 2015 at 11:09 AM, NIFRAS ISMAIL <nifrasism...@gmail.com>
wrote:

> Hi,
>
> I have used WSO2 ML 1.0.1 to create the model. Now I need to build a Java
> client to read the model in my project.
>
> I have tried to execute the Java client sample provided in the carbon-ml
> git repo of WSO2.
> https://github.com/wso2/carbon-ml/tree/master/samples/model-usage
>
> The project builds successfully using mvn clean install.
>
> On execution, however, it gives this error. I am using ideaJ to execute the
> MLModelUsageSample class.
>
> Exception in thread "main" java.io.InvalidClassException:
> org.wso2.carbon.ml.core.spark.models.MLClassificationModel; local class
> incompatible: stream classdesc serialVersionUID = -5405685346474685046,
> local class serialVersionUID = -4575625595152525041
> at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
> at MLModelUsageSample.deserializeMLModel(MLModelUsageSample.java:70)
> at MLModelUsageSample.main(MLModelUsageSample.java:48)
>
>
> Regards.
> *M. Nifras Ismail*
> [image: LinkedIn] <http://lk.linkedin.com/pub/nifras-ismail/54/343/94b>
>
>
>
> _______
> Dev mailing list
> Dev@wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/dev
>
>


-- 
Pruthuvi Maheshakya Wijewardena

mahesha...@wso2.com
+94711228855


Re: [Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC3

2015-12-19 Thread Maheshakya Wijewardena
Hi,

I tried to test the ML predict mediator with ESB. I was unable to install the
ML features in ESB 4.9.0 due to a Carbon Kernel version incompatibility (ML
1.1.0 RC3 uses 4.4.3, while the latest released ESB version, 4.9.0, uses
4.4.1). Voting down because of this.

[x] Broken - do not release

Best regards.

On Sat, Dec 19, 2015 at 11:28 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Devs,
>
> This is the 3rd Release Candidate of WSO2 Machine Learner 1.1.0.
>
> This release fixes the following issues:
> https://wso2.org/jira/issues/?filter=12596
>
> Please download, test and vote. Vote will be open for 72 hours or as
> long as needed.
>
> *Binary distribution files:*
>
> https://github.com/wso2/product-ml/releases/download/v1.1.0-rc3/wso2ml-1.1.0.zip
>
> *P2 repository*:
> https://github.com/wso2/product-ml/releases/download/v1.1.0-rc3/p2-repo.zip
>
>
> *Maven staging repository:*
> http://maven.wso2.org/nexus/content/repositories/orgwso2ml-211/
>
> *The tag to be voted upon:*
> https://github.com/wso2/product-ml/tree/v1.1.0-rc3
>
>
> [ ] Broken - do not release (explain why)
> [ ] Stable - go ahead and release
>
> Thank you,
> Machine Learner Team
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena

mahesha...@wso2.com
+94711228855


[Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC2

2015-12-18 Thread Maheshakya Wijewardena
Hi Devs,

This is the 2nd Release Candidate of WSO2 Machine Learner 1.1.0.

This release fixes the following issues:
https://wso2.org/jira/issues/?filter=12594

Please download, test and vote. Vote will be open for 72 hours or as long
as needed.

*Binary distribution files:*
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc2/wso2ml-1.1.0.zip

*P2 repository*:
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc2/p2-repo.zip


*Maven staging repository:*
http://maven.wso2.org/nexus/content/repositories/orgwso2ml-208/

*The tag to be voted upon:*
https://github.com/wso2/product-ml/tree/v1.1.0-rc2


[ ] Broken - do not release (explain why)
[ ] Stable - go ahead and release

Thank you,
Machine Learner Team

-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


[Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC3

2015-12-18 Thread Maheshakya Wijewardena
Hi Devs,

This is the 3rd Release Candidate of WSO2 Machine Learner 1.1.0.

This release fixes the following issues:
https://wso2.org/jira/issues/?filter=12596

Please download, test and vote. Vote will be open for 72 hours or as long
as needed.

*Binary distribution files:*
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc3/wso2ml-1.1.0.zip

*P2 repository*:
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc3/p2-repo.zip


*Maven staging repository:*
http://maven.wso2.org/nexus/content/repositories/orgwso2ml-211/

*The tag to be voted upon:*
https://github.com/wso2/product-ml/tree/v1.1.0-rc3


[ ] Broken - do not release (explain why)
[ ] Stable - go ahead and release

Thank you,
Machine Learner Team

-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC1

2015-12-17 Thread Maheshakya Wijewardena
Hi all,

We are calling off the VOTE due to an issue and will start a new Vote for
Machine Learner 1.1.0 RC2.

Best regards.

On Fri, Dec 18, 2015 at 10:42 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Devs,
>
> This is the 1st Release Candidate of WSO2 Machine Learner 1.1.0.
>
> This release fixes the following issues:
> https://wso2.org/jira/issues/?filter=12589
>
> Please download, test and vote. Vote will be open for 72 hours or as
> long as needed.
>
> *Binary distribution files:*
>
> https://github.com/wso2/product-ml/releases/download/v1.1.0-rc1/wso2ml-1.1.0.zip
>
> *P2 repository*:
> https://github.com/wso2/product-ml/releases/download/v1.1.0-rc1/p2-repo.zip
>
> *Maven staging repository:*
> http://maven.wso2.org/nexus/content/repositories/orgwso2ml-199/
>
> *The tag to be voted upon:*
> https://github.com/wso2/product-ml/tree/v1.1.0-rc1
>
>
> [ ] Broken - do not release (explain why)
> [ ] Stable - go ahead and release
>
> Thank you,
> Machine Learner Team
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


[Dev] [VOTE] Release WSO2 Machine Learner 1.1.0 RC1

2015-12-17 Thread Maheshakya Wijewardena
Hi Devs,

This is the 1st Release Candidate of WSO2 Machine Learner 1.1.0.

This release fixes the following issues:
https://wso2.org/jira/issues/?filter=12589

Please download, test and vote. Vote will be open for 72 hours or as long
as needed.

*Binary distribution files:*
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc1/wso2ml-1.1.0.zip

*P2 repository*:
https://github.com/wso2/product-ml/releases/download/v1.1.0-rc1/p2-repo.zip

*Maven staging repository:*
http://maven.wso2.org/nexus/content/repositories/orgwso2ml-199/

*The tag to be voted upon:*
https://github.com/wso2/product-ml/tree/v1.1.0-rc1


[ ] Broken - do not release (explain why)
[ ] Stable - go ahead and release

Thank you,
Machine Learner Team


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] [VOTE] Release WSO2 Carbon Kernel 4.4.3 RC3

2015-12-13 Thread Maheshakya Wijewardena
Hi,

I have built the ML 1.1.0 Beta pack, including all tests, with kernel 4.4.3
RC3. No issues found.

[x] Stable - Go ahead and release.

Best regards.

On Mon, Dec 14, 2015 at 8:21 AM, Hasitha Aravinda <hasi...@wso2.com> wrote:

> Hi
>
> I have tested BPS 3.5.1-SNAPSHOT with kernel 4.4.3 RC3. No issue found.
>
> [x] Stable - Go ahead and release.
>
> Thanks
> Hasitha.
>
> On Fri, Dec 11, 2015 at 8:15 PM, Nipuni Perera <nip...@wso2.com> wrote:
>
>>
>> Hi Devs,
>>
>> This is the RC3 release candidate of WSO2 Carbon Kernel 4.4.3.
>>
>> This release fixes the following issues:
>> https://wso2.org/jira/issues/?filter=12540
>>
>> Please download and test your products with kernel 4.4.3 RC3
>> and vote. Vote will be open for 72 hours or as long as needed.
>>
>> *​Source and binary distribution files:*
>> http://svn.wso2.org/repos/wso2/people/nipuni/4.4.3-rc3/
>>
>> *Maven staging repository:*
>> http://maven.wso2.org/nexus/content/repositories/orgwso2carbon-168/
>>
>> *The tag to be voted upon:*
>> https://github.com/wso2/carbon-kernel/releases/tag/v4.4.3-RC3
>>
>>
>> [ ] Broken - do not release (explain why)
>> [ ] Stable - go ahead and release
>>
>> Thank you
>> Carbon Team
>>
>> --
>> Nipuni Perera
>> Software Engineer; WSO2 Inc.; http://wso2.com
>> Email: nip...@wso2.com
>> Git hub profile: https://github.com/nipuni
>> Blog : http://nipunipererablog.blogspot.com/
>> Mobile: +94 (71) 5626680
>> <http://wso2.com>
>>
>>
>> ___
>> Dev mailing list
>> Dev@wso2.org
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>>
>
>
> --
> --
> Hasitha Aravinda,
> Senior Software Engineer,
> WSO2 Inc.
> Email: hasi...@wso2.com
> Mobile : +94 718 210 200
>
> ___
> Dev mailing list
> Dev@wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/dev
>
>


-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] Fwd: WSO2 Machine Learner 1.1.0 - Beta Released!

2015-12-11 Thread Maheshakya Wijewardena
ML-272 <https://wso2.org/jira/browse/ML-272>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
In breadcrumbs (in models, analyses), nothing is linked and only shows path.
There is no point of showing only the path. Users should be able to navigate
using that.

[Sub-task] ML-269 <https://wso2.org/jira/browse/ML-269>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
If there are no datasets, show a message saying you don't have datasets,
please create and there should be a link to dataset creating page.

[Sub-task] ML-268 <https://wso2.org/jira/browse/ML-268>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
In "Create project", dataset selecting field doesn't show error message when
not selected - needs and error message.

[Sub-task] ML-270 <https://wso2.org/jira/browse/ML-270>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
When there are only failed models in an analysis, compare models should be
disabled.

[Sub-task] ML-271 <https://wso2.org/jira/browse/ML-271>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
No need of 2 horizontal bars in home screen - no content in that.

[Sub-task] ML-267 <https://wso2.org/jira/browse/ML-267>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
Allow space to be included in project and analysis names

[Sub-task] ML-266 <https://wso2.org/jira/browse/ML-266>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
Remove unused tooltip icons in explore view

[Sub-task] ML-265 <https://wso2.org/jira/browse/ML-265>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
Fix order of appearance of datasets and projects in menus

[Sub-task] ML-264 <https://wso2.org/jira/browse/ML-264>
(parent: ML-260 <https://wso2.org/jira/browse/ML-260>)
At login, avoid asking for "Machine Learner" credentials

[Improvement] ML-252 <https://wso2.org/jira/browse/ML-252>
Add a legend to the cluster diagram of anomaly detection model summary

[Bug] ML-240 <https://wso2.org/jira/browse/ML-240>
Tooltip/info icons in dataset explorer view do not show anything

*Documentation*:

WSO2 Machine Learner 1.1.0 release documentation can be found at

https://docs.wso2.com/display/ML110/WSO2+Machine+Learner+Documentation

*How You Can Contribute*

*Mailing Lists*

Join our mailing list and correspond with the developers directly.

Developer List : dev@wso2.org | Mail Archive
<http://mail.wso2.org/mailarchive/dev/>

*Reporting Issues*

We encourage you to report issues, documentation faults and feature
requests regarding WSO2 Machine Learner through the public JIRA
<https://wso2.org/jira/browse/ML>. You can use the Carbon JIRA
<https://wso2.org/jira/browse/CARBON> to report any issues related to the
Carbon framework or associated Carbon components.


*~~~ WSO2 Machine Learner Team ~~~*




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] Machine Learning Help Needed

2015-12-09 Thread Maheshakya Wijewardena
Hi,

You can use the multi-class classification algorithms in WSO2 Machine
Learner for your task. The quick start guide (
https://docs.wso2.com/display/ML100/Quick+Start+Guide) describes the steps
to follow, from uploading a dataset to predicting with a trained model.
If you need to predict from data online (for example, when a new customer
arrives), use the WSO2 CEP extension for ML predictions (
https://docs.wso2.com/display/ML100/WSO2+CEP+Extension+for+ML+Predictions).

Best regards.

On Wed, Dec 9, 2015 at 4:41 PM, NIFRAS ISMAIL <nifrasism...@gmail.com>
wrote:

> Hi Machine Learners,
>
> Again I need your help with data analysis for my final year project.
>
> My Meta Data Labels are
>
>
> "city","country","birthdate","education","gender","houseowner","marital_status","member_card","num_cars_owned","num_children_at_home","occupation","postal_code","state_province","total_children","yearly_income","product_name"
>
> You may notice that this is a customer and his transaction mapping data of
> a shopping mall.
>
> Class Label is  : product_name
>
> Problem: I need to find the interest classes in my data, which contains
> the above headers.
>
> Then, if a new customer comes to the mall, I could predict which item he
> will be looking for using multi-class, multi-label classification.
>
>
> Could anyone help me? Looking forward to your reply, ML Team.
>
> Thank you
>
>
> Regards.
> *M. Nifras Ismail*
> [image: LinkedIn] <http://lk.linkedin.com/pub/nifras-ismail/54/343/94b>
>
>
>
>
>
>
> ___
> Dev mailing list
> Dev@wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/dev
>
>


-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [Architecture] WSO2 Machine Learner 1.1.0 - Milestone 2 Released!

2015-11-26 Thread Maheshakya Wijewardena
*WSO2 Machine Learner 1.1.0 - Milestone 2 Released!*

We are pleased to announce that the second milestone release of WSO2
Machine Learner 1.1.0 is now available to download from here
<https://github.com/wso2/product-ml/releases/download/v1.1.0-m2/wso2ml-1.1.0-m2.zip>.
Source and tag location for this release is available here
<https://github.com/wso2/product-ml/releases/tag/v1.1.0-m2>.

Machine learning has emerged as a key component of the big data analytics
space. The goal of WSO2 Machine Learner is to make machine learning
accessible to the WSO2 big data platform. WSO2 Machine Learner (
http://wso2.com/products/machine-learner/) provides a user-friendly,
wizard-like interface that guides users through a set of steps to find and
configure machine learning algorithms. The outcome of this process is a
model that can be deployed in multiple WSO2 products, such as WSO2
Enterprise Service Bus (ESB), WSO2 Complex Event Processor (CEP) and WSO2
Data Analytics Server (DAS).

The novice-friendly machine learning analysis allows developers, data
scientists and database administrators to quickly implement machine
learning methods. If you are familiar with WSO2 products, you can use
WSO2 Machine Learner to build machine learning models for various tasks,
such as fraud detection, anomaly detection and classification. WSO2
Machine Learner is built on the award-winning WSO2 Carbon platform,
which is based on the OSGi framework, enabling better modularity for your
service-oriented architecture (SOA).

The second milestone release of WSO2 Machine Learner (ML) 1.1.0 contains a
new feature: Deep Learning Algorithm Support.

*Documentation*:
1.1.0 M2 release specific documentation can be downloaded from
https://github.com/wso2/product-ml/files/44928/WSO2MachineLearner1.1.0-MilestoneRelease2.pdf

For general information on WSO2 Machine Learner 1.1.0 release, please visit
our documentation

https://docs.wso2.com/display/ML110/WSO2+Machine+Learner+Documentation


*Fixed Issues*: https://wso2.org/jira/browse/ML-253?filter=12530


*How You Can Contribute*

*Mailing Lists*

Join our mailing list and correspond with the developers directly.
Developer List : dev@wso2.org | Mail Archive
<http://mail.wso2.org/mailarchive/dev/>

*Reporting Issues*
We encourage you to report issues, documentation faults and feature
requests regarding WSO2 Machine Learner through the public JIRA
<https://wso2.org/jira/browse/ML>. You can use the Carbon JIRA
<https://wso2.org/jira/browse/CARBON> to report any issues related to the
Carbon framework or associated Carbon components.


*~~~ WSO2 Machine Learner Team ~~~*

Best regards
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [Orbit] Adding H2O AI 3.2.0.9

2015-11-25 Thread Maheshakya Wijewardena
Hi Kalpa,

I have made the changes according to those comments.

Best regards.

On Wed, Nov 25, 2015 at 9:39 PM, Kalpa Welivitigoda <kal...@wso2.com> wrote:

> Hi Maheshakya,
>
> I added some comments to the PR, would you please attend to them?
>
> On Wed, Nov 25, 2015 at 1:06 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> [Looping Kernel team]
>>
>> On Wed, Nov 25, 2015 at 1:03 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi,
>>>
>>> Can you please review and merge this PR[1] for wso2v1 of the H2O AI
>>> 3.2.0.9.
>>> This is for deep learning integration of WSO2 ML.
>>>
>>> Best regards,
>>>
>>> [1] https://github.com/wso2/orbit/pull/150
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> Software Engineer
>>> WSO2 : http://wso2.com/
>>> Email: mahesha...@wso2.com
>>> Mobile: +94711228855
>>>
>>>
>>>
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>
>
> --
> Best Regards,
>
> Kalpa Welivitigoda
> Software Engineer, WSO2 Inc. http://wso2.com
> Email: kal...@wso2.com
> Mobile: +94776509215
>



-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [Orbit] Adding H2O AI 3.2.0.9

2015-11-24 Thread Maheshakya Wijewardena
Hi,

Can you please review and merge this PR [1] for the wso2v1 version of H2O AI
3.2.0.9?
This is for the deep learning integration in WSO2 ML.

Best regards,

[1] https://github.com/wso2/orbit/pull/150
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [ML] Getting error while creating a dataset

2015-10-01 Thread Maheshakya Wijewardena
Hi Malintha,

Here is an excerpt of the first few lines of the digits.csv dataset (from the
sloan-school-of-management course page):

 *88, 92,  2, 99, 16, 66, 94, 37, 70,  0,  0, 24, 42, 65,100,100, 8*
 80,100, 18, 98, 60, 66,100, 29, 42,  0,  0, 23, 42, 61, 56, 98, 8
  0, 94,  9, 57, 20, 19,  7,  0, 20, 36, 70, 68,100,100, 18, 92, 8
 95, 82, 71,100, 27, 77, 77, 73,100, 80, 93, 42, 56, 13,  0,  0, 9
 68,100,  6, 88, 47, 75, 87, 82, 85, 56,100, 29, 75,  6,  0,  0, 9
 70,100,100, 97, 70, 81, 45, 65, 30, 49, 20, 33,  0, 16,  0,  0, 1
 40,100,  0, 81, 15, 58,100, 57, 47, 87, 50, 88, 40, 42, 36,  0, 4
  3, 71,  0, 95, 45,100,100, 99, 79, 78, 48, 53, 31, 24, 54,  0, 7

As you can see, there is no header row (a row with feature names) in this
CSV file. If you do not specify at dataset creation that the file has no
header row, ML automatically takes the first row as the header row and
derives the feature names from it.
If the first row is taken as the header row, it contains duplicate
entries: 0 and 100.
In ML, there cannot be multiple features with the same name.

At dataset creation, please select "No" for "Column header available", or
add a header row manually into the data file before uploading.
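
To illustrate the duplicate-name problem, here is a minimal Python sketch. It
is not ML's actual code; the function name duplicate_header_names is made up
for illustration. It treats the first row of the excerpt above as a header
and lists the names that occur more than once:

```python
import csv
import io
from collections import Counter

# First row of the excerpt, which would be used as the header row
# when "Column header available" is left as "Yes".
FIRST_ROW = " 88, 92,  2, 99, 16, 66, 94, 37, 70,  0,  0, 24, 42, 65,100,100, 8"


def duplicate_header_names(header_line):
    """Return the column names that occur more than once, in first-seen order."""
    # Parse the line as CSV and strip the padding spaces around each value.
    names = [cell.strip() for cell in next(csv.reader(io.StringIO(header_line)))]
    counts = Counter(names)
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


if __name__ == "__main__":
    print(duplicate_header_names(FIRST_ROW))  # prints ['0', '100']
```

Running a check like this on a file before uploading would show whether its
first row can safely serve as a header.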

Best regards.

On Fri, Oct 2, 2015 at 8:54 AM, Nirmal Fernando <nir...@wso2.com> wrote:

> Hi Malintha,
>
> Thanks for trying ML. @Wije can you please check?
>
> On Fri, Oct 2, 2015 at 1:09 AM, Malintha Adikari <malin...@wso2.com>
> wrote:
>
>> Hi,
>>
>> I am trying to create a dataset from 748KB sized data file [1] and
>> getting following error.
>>
>> [2015-10-02 01:03:38,769]  INFO
>> {org.wso2.carbon.ml.core.impl.MLDatasetProcessor} -  [Created] MLDataset
>> [id=1, name=digitdd, tenantId=-1234, userName=admin, dataSourceType=file,
>> dataTargetType=file, sourcePath=null, dataType=csv, comments=,
>> version=1.0.0, containsHeader=true, status=null]
>> [2015-10-02 01:03:40,537]  WARN
>> {org.wso2.carbon.ml.database.internal.MLDatabaseUtils} -  An error occurred
>> while enabling autocommit: PooledConnection has already been closed.
>> java.sql.SQLException: PooledConnection has already been closed.
>> at
>> org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:86)
>> at com.sun.proxy.$Proxy16.setAutoCommit(Unknown Source)
>> at
>> org.wso2.carbon.ml.database.internal.MLDatabaseUtils.enableAutoCommit(MLDatabaseUtils.java:153)
>> at
>> org.wso2.carbon.ml.database.internal.MLDatabaseService.updateSummaryStatistics(MLDatabaseService.java:2370)
>> at
>> org.wso2.carbon.ml.core.impl.SummaryStatsGenerator.run(SummaryStatsGenerator.java:130)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> [2015-10-02 01:03:40,550] ERROR
>> {org.wso2.carbon.ml.core.impl.SummaryStatsGenerator} -  Error occurred
>> while calculating summary statistics for dataset version 1: An error
>> occurred while updating the database with summary statistics of the dataset
>> 1: 16
>> org.wso2.carbon.ml.database.exceptions.DatabaseHandlerException: An error
>> occurred while updating the database with summary statistics of the dataset
>> 1: 16
>> at
>> org.wso2.carbon.ml.database.internal.MLDatabaseService.updateSummaryStatistics(MLDatabaseService.java:2366)
>> at
>> org.wso2.carbon.ml.core.impl.SummaryStatsGenerator.run(SummaryStatsGenerator.java:130)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
>> at
>> org.wso2.carbon.ml.database.internal.MLDatabaseService.updateSummaryStatistics(MLDatabaseService.java:2329)
>> ... 4 more
>>
>> What could be the possible reason for this error ?
>>
>> [1]
>> http://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/datasets/digits.csv
>>
>> Regards,
>> Malintha
>>
>> --
>> *Malintha Adikari*
>> Software Engineer
>> WSO2 Inc.; http://wso2.com
>> lean.enterprise.middleware
>>
>> Mobile: +94 71 2312958
>> Blog:http://malinthas.blogspot.com
>> Page:   http://about.me/malintha
>>
>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [CEP] "order by" clause in SiddhiQL

2015-09-24 Thread Maheshakya Wijewardena
Hi,

Does SiddhiQL support an "ORDER BY" clause? I couldn't find one in the
documentation[1].

Best regards,

[1] https://docs.wso2.com/display/CEP400/SiddhiQL+Guide+3.0

-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [CEP] "order by" clause in SiddhiQL

2015-09-24 Thread Maheshakya Wijewardena
Hi Lasantha,

Thank you for the references.

What I want to do is something similar to the following:

Suppose you have an input stream and several pieces of processing logic.
Each one generates a result after consuming the events from the stream.
What I'm trying to do is obtain the most frequent outcome among these
results, i.e. the result that has been produced most often.

An approach to do this is: first group by result, then order by the count
of each result in descending order, and finally retrieve the first entry
from that.
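
Outside Siddhi, the idea can be sketched in plain Python. This is only an
illustration of the logic, not SiddhiQL; the function name and the sample
results are hypothetical:

```python
from collections import Counter


def most_frequent_result(results):
    """Group by result, order by count descending, and return the top entry."""
    counts = Counter(results)        # group by result and count each group
    ranked = counts.most_common()    # order by count, highest first
    return ranked[0] if ranked else None  # first entry as (result, count)


if __name__ == "__main__":
    # Results produced by several processing logics over the same events.
    sample = ["A", "B", "A", "C", "A", "B"]
    print(most_frequent_result(sample))  # prints ('A', 3)
```

In a streaming setting the counts would have to be kept per window rather
than over a finite list, which is the part I am unsure how to express.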

I was wondering whether this type of task can be done with Siddhi.


Best regards.

On Thu, Sep 24, 2015 at 8:34 PM, Lasantha Fernando <lasan...@wso2.com>
wrote:

> Hi Maheshakya,
>
> Ordering of events for real-time analytics need to be done within a time
> frame or an event frame. Siddhi does have a sort window processor that can
> be used to sort events within the window itself.
>
> You can find the documentation at [1] or refer to our test cases at [2].
> If you can describe your use case in more detail, we might be able to point
> you to some constructs in Siddhi language that would let you achieve the
> 'order by' characteristics of a standard SQL query.
>
> [1]
> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-sortsort
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-core/src/test/java/org/wso2/siddhi/core/query/window/SortWindowTestCase.java
>
> Thanks,
> Lasantha
>
> On 24 September 2015 at 18:13, Maheshakya Wijewardena <mahesha...@wso2.com
> > wrote:
>
>> Hi,
>>
>> Does SiddhiQL support "ORDER BY" statement? I couldn't find that in the
>> documentation[1].
>>
>> Best regards,
>>
>> [1] https://docs.wso2.com/display/CEP400/SiddhiQL+Guide+3.0
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> Software Engineer
>> WSO2 : http://wso2.com/
>> Email: mahesha...@wso2.com
>> Mobile: +94711228855
>>
>>
>>
>> ___
>> Dev mailing list
>> Dev@wso2.org
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>>
>
>
> --
> *Lasantha Fernando*
> Senior Software Engineer - Data Technologies Team
> WSO2 Inc. http://wso2.com
>
> email: lasan...@wso2.com
> mobile: (+94) 71 5247551
>



-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [CEP] "order by" clause in SiddhiQL

2015-09-24 Thread Maheshakya Wijewardena
>
> An approach to do this is: first group by each result, then order by the
> descending order of the count of each result and finally retrieving the
> first entry from that.
>

Sorry, the approach should be as follows:
First group by result, then order by the count of each result in descending
order, and finally retrieve the first entry from that.

On Thu, Sep 24, 2015 at 9:50 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Lasantha,
>
> Thank you for the references.
>
> What I want to is something similar to the following:
>
> Suppose you have an input stream and there are multiple number of
> processing logics. Each logic will generate a result after consuming the
> events the stream. What I'm trying to do is obtaining the most frequent
> outcome out of these multiple processing logics. i.e. what result has been
> produced most.
>
> An approach to do this is: first group by each result, then order by the
> descending order of the count of each result and finally retrieving the
> first entry from that.
>
> I was wondering whether this type of task can be done with Siddhi.
>
>
> Best regards.
>
> On Thu, Sep 24, 2015 at 8:34 PM, Lasantha Fernando <lasan...@wso2.com>
> wrote:
>
>> Hi Maheshakya,
>>
>> Ordering of events for real-time analytics need to be done within a time
>> frame or an event frame. Siddhi does have a sort window processor that can
>> be used to sort events within the window itself.
>>
>> You can find the documentation at [1] or refer to our test cases at [2].
>> If you can describe your use case in more detail, we might be able to point
>> you to some constructs in Siddhi language that would let you achieve the
>> 'order by' characteristics of a standard SQL query.
>>
>> [1]
>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-sortsort
>> [2]
>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-core/src/test/java/org/wso2/siddhi/core/query/window/SortWindowTestCase.java
>>
>> Thanks,
>> Lasantha
>>
>> On 24 September 2015 at 18:13, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi,
>>>
>>> Does SiddhiQL support "ORDER BY" statement? I couldn't find that in the
>>> documentation[1].
>>>
>>> Best regards,
>>>
>>> [1] https://docs.wso2.com/display/CEP400/SiddhiQL+Guide+3.0
>>>
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> Software Engineer
>>> WSO2 : http://wso2.com/
>>> Email: mahesha...@wso2.com
>>> Mobile: +94711228855
>>>
>>>
>>>
>>> ___
>>> Dev mailing list
>>> Dev@wso2.org
>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>
>>>
>>
>>
>> --
>> *Lasantha Fernando*
>> Senior Software Engineer - Data Technologies Team
>> WSO2 Inc. http://wso2.com
>>
>> email: lasan...@wso2.com
>> mobile: (+94) 71 5247551
>>
>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> Software Engineer
> WSO2 : http://wso2.com/
> Email: mahesha...@wso2.com
> Mobile: +94711228855
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [VOTE] Release WSO2 ML 1.0.0 RC2

2015-09-21 Thread Maheshakya Wijewardena
Hi,

I've tested the following. No issues encountered.


   - DAS table import
   - DAS as a Spark cluster
   - External Spark cluster
   - ML with MySQL db


[x] Stable - go ahead and release

Best regards,

On Mon, Sep 21, 2015 at 5:30 PM, Nirmal Fernando <nir...@wso2.com> wrote:

>
>
> On Mon, Sep 21, 2015 at 4:12 PM, Manorama Perera <manor...@wso2.com>
> wrote:
>
>> I've tested the following and found no issues.
>>
>>- HDFS support + HDFS sample
>>- CEP ML extension
>>- ESB Predict mediator
>>- Model usage Java sample
>>
>> [x] Stable - go ahead and release
>>
>> Thanks.
>>
>> On Mon, Sep 21, 2015 at 4:08 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> [x] Stable
>>>
>>> On Mon, Sep 21, 2015 at 12:11 PM, CD Athuraliya <chathur...@wso2.com>
>>> wrote:
>>>
>>>> Hi Devs,
>>>>
>>>> This is the second release candidate of WSO2 ML 1.0.0.
>>>>
>>>> This release fixes the following issues:
>>>>
>>>> ML-1.0.0-FixedIssues <https://wso2.org/jira/issues/?filter=12390>
>>>>
>>>> Please download, test and vote. Vote will be open for 72 hours or as
>>>> needed.
>>>>
>>>> *Binary distribution files:*
>>>> Product:
>>>> *https://github.com/wso2/product-ml/releases/download/v1.0.0-rc2/wso2ml-1.0.0.zip
>>>> <https://github.com/wso2/product-ml/releases/download/v1.0.0-rc2/wso2ml-1.0.0.zip>*
>>>>
>>>> P2 repository:
>>>> *https://github.com/wso2/product-ml/releases/download/v1.0.0-rc2/p2-repo.zip
>>>> <https://github.com/wso2/product-ml/releases/download/v1.0.0-rc2/p2-repo.zip>*
>>>>
>>>> *Maven staging repo:*
>>>> *http://maven.wso2.org/nexus/content/repositories/orgwso2ml-133/
>>>> <http://maven.wso2.org/nexus/content/repositories/orgwso2ml-133/>*
>>>>
>>>> *The tag to be voted upon:*
>>>> *https://github.com/wso2/product-ml/tree/v1.0.0-rc2
>>>> <https://github.com/wso2/product-ml/tree/v1.0.0-rc2>*
>>>>
>>>> [ ] Broken - do not release (explain why)
>>>> [ ] Stable - go ahead and release
>>>>
>>>> Thanks and Regards,
>>>> ~ WSO2 Machine Learner Team ~
>>>>
>>>>
>>>> --
>>>> *CD Athuraliya*
>>>> Software Engineer
>>>> WSO2, Inc.
>>>> lean . enterprise . middleware
>>>> Mobile: +94 716288847 <94716288847>
>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>> <https://twitter.com/cdathuraliya> | Blog
>>>> <http://cdathuraliya.tumblr.com/>
>>>>
>>>> ___
>>>> Dev mailing list
>>>> Dev@wso2.org
>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Thanks & regards,
>>> Nirmal
>>>
>>> Team Lead - WSO2 Machine Learner
>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>> Mobile: +94715779733
>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Manorama Perera
>> Software Engineer
>> WSO2, Inc.;  http://wso2.com/
>> Mobile : +94716436216
>>
>>
>>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [DAS] Unable to add external jars to Spark class path in DAS cluster

2015-09-07 Thread Maheshakya Wijewardena
Hi,

I tried to submit the ML app as an external app to the Spark cluster of DAS.
DAS was started without initializing the CarbonAnalytics Spark context. I
tried to add the additional ML jars to the class path of the DAS Spark
cluster by setting *spark.executor.extraClassPath* and
*spark.driver.extraClassPath* from the ML app. But this fails with the
following error, shown continuously on all executors in the Spark cluster:

*Error: Could not find or load main class
org.apache.spark.executor.CoarseGrainedExecutorBackend*

It's not possible to add those jars from DAS side either.

Any thoughts on how to fix this?

Best regards.
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [DAS] Unable to add external jars to Spark class path in DAS cluster

2015-09-07 Thread Maheshakya Wijewardena
Thanks Niranda for the help.


Best regards.

On Mon, Sep 7, 2015 at 3:55 PM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Hi,
>
> I tried to submit the ML app as an external app to the Spark cluster of DAS.
> DAS was started without initializing the CarbonAnalytics Spark context. I
> tried to add the additional ML jars to the class path of the DAS Spark
> cluster by setting *spark.executor.extraClassPath* and
> *spark.driver.extraClassPath* from the ML app. But this fails with the
> following error, shown continuously on all executors in the Spark cluster:
>
> *Error: Could not find or load main class
> org.apache.spark.executor.CoarseGrainedExecutorBackend*
>
> It's not possible to add those jars from DAS side either.
>
> Any thoughts on how to fix this?
>
> Best regards.
> --
> Pruthuvi Maheshakya Wijewardena
> Software Engineer
> WSO2 : http://wso2.com/
> Email: mahesha...@wso2.com
> Mobile: +94711228855
>
>
>


-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [ML] Spark K-means clustering on KDD cup 99 dataset

2015-08-25 Thread Maheshakya Wijewardena
Is there any particular reason why you are putting aside 65% of the anomalous
data at the evaluation? Since there is an obvious imbalance between the
numbers of normal and abnormal cases, you will get a high accuracy at the
evaluation, because a model tends to produce more accurate results for the
larger class. That is not the case for the smaller class, but with so few
records its errors won't have much impact on the overall accuracy. Hence IMO,
it would be better if you could evaluate with more anomalous data,
i.e. the number of records of each class needs to be roughly equal.
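
To make the imbalance point concrete, here is a minimal Python sketch. The numbers are made up for illustration (they are not taken from Ashen's results), and the helper names are hypothetical. It shows that a model which labels every record as normal still scores a high overall accuracy while catching no anomalies at all.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of records of the given class that were predicted correctly."""
    indices = [i for i, t in enumerate(y_true) if t == label]
    return sum(y_pred[i] == label for i in indices) / len(indices)


# Hypothetical evaluation set: 980 normal records and only 20 anomalies.
y_true = ["normal"] * 980 + ["anomaly"] * 20
# A trivial "model" that labels everything as normal.
y_pred = ["normal"] * 1000

overall = accuracy(y_true, y_pred)
normal_recall = recall(y_true, y_pred, "normal")
anomaly_recall = recall(y_true, y_pred, "anomaly")

print(f"overall accuracy: {overall:.2f}")
print(f"recall on normal: {normal_recall:.2f}")
print(f"recall on anomaly: {anomaly_recall:.2f}")
```

Reporting the per-class figures next to the overall accuracy, or evaluating on roughly equal class sizes, keeps the smaller class from being hidden.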

Best regards

On Tue, Aug 25, 2015 at 12:05 PM, CD Athuraliya chathur...@wso2.com wrote:

 Hi Ashen,

 It would be better if you can add the assumptions you make in this process
 (uniform clusters etc). It will make the process more clear IMO.

 Regards,
 CD

 On Tue, Aug 25, 2015 at 11:39 AM, Nirmal Fernando nir...@wso2.com wrote:

 Can we see the code too?

 On Tue, Aug 25, 2015 at 11:36 AM, Ashen Weerathunga as...@wso2.com
 wrote:

 Hi all,

 I am currently working on a fraud detection project. I was able to cluster
 the KDD Cup 99 network anomaly detection dataset using the Apache Spark
 k-means algorithm. So far I have been able to achieve a 99% accuracy rate on
 this dataset. The steps I followed during the process are listed below.

- Separate the dataset into two parts (normal data and anomaly data)
  by filtering on the label
- Split each of the two parts as follows
   - normal data
      - 65% - to train the model
      - 15% - to optimize the model by adjusting hyperparameters
      - 20% - to evaluate the model
   - anomaly data
      - 65% - not used
      - 15% - to optimize the model by adjusting hyperparameters
      - 20% - to evaluate the model
- Preprocess the dataset
   - Drop non-numerical features, since k-means can only handle
     numerical values
   - Normalize all the values to the 0-1 range
- Cluster the 65% of normal data using Apache Spark k-means and
  build the model (15% of both normal and anomaly data were used to tune
  hyperparameters such as k, the percentile etc. to get an optimized model)
- Finally, evaluate the model using 20% of both normal and anomaly
  data.
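
The split and normalization steps above could be sketched in plain Python as follows. This is only an illustration under assumptions: min-max scaling is assumed for the 0-1 normalization, the split is assumed to be a random shuffle, and the function names are hypothetical (this is not the code used in the experiment).

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows to the 0-1 range."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column has no range; map it to 0.0 to avoid dividing by zero.
        [(v - low) / (high - low) if high > low else 0.0
         for v, low, high in zip(row, lows, highs)]
        for row in rows
    ]


def split_65_15_20(rows, seed=42):
    """Shuffle the rows and split them into train (65%), tune (15%), test (rest)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 65 // 100
    n_tune = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_tune]
    evaluate = shuffled[n_train + n_tune:]
    return train, tune, evaluate


normalized = min_max_normalize([[0, 10], [5, 20], [10, 30]])
train, tune, evaluate = split_65_15_20(list(range(100)))
print(normalized)
print(len(train), len(tune), len(evaluate))
```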

 The method of identifying a fraud is as follows:

- When a new data point comes, get the closest cluster center by
using the k-means predict function.
- I have calculated the 98th percentile distance for each cluster. (98
was the best value I got by tuning the model with different values.)
- Then I check whether the distance of the new data point from the
given cluster center is less than or greater than the 98th percentile of
that cluster. If it is less than the percentile, it is considered
normal data. If it is greater than the percentile, it is considered a
fraud, since it lies outside the cluster.
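
A minimal sketch of this percentile rule in plain Python, assuming the cluster centers have already been produced by k-means. The helper names, the Euclidean distance and the nearest-rank percentile definition are assumptions made for illustration; they are not taken from the actual implementation.

```python
import math


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_center(point, centers):
    """Return (index, distance) of the closest cluster center."""
    distances = [distance(point, c) for c in centers]
    index = min(range(len(centers)), key=lambda i: distances[i])
    return index, distances[index]


def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def fit_thresholds(train_points, centers, q=98):
    """Per cluster, the q-th percentile of training distances to its center."""
    per_cluster = {}
    for point in train_points:
        index, dist = nearest_center(point, centers)
        per_cluster.setdefault(index, []).append(dist)
    return {index: percentile(dists, q) for index, dists in per_cluster.items()}


def is_anomaly(point, centers, thresholds):
    """Flag a point that lies beyond the threshold of its closest cluster."""
    index, dist = nearest_center(point, centers)
    # A cluster that received no training points has no threshold;
    # this sketch treats such points as anomalies.
    return index not in thresholds or dist > thresholds[index]
```

With centers from a trained model, `fit_thresholds` would be run once on the normal training data, and `is_anomaly` would then be called for each incoming point.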

 Our next step is to integrate this feature into the ML product and try it out
 with a more realistic dataset. A summary of the results I have obtained using
 the 98th percentile during the process is attached with this.


 https://docs.google.com/a/wso2.com/spreadsheets/d/1E5fXk9CM31QEkyFCIEongh8KAa6jPeoY7OM3HraGPd4/edit?usp=sharing

 Thanks and Regards,
 Ashen
 --
 *Ashen Weerathunga*
 Software Engineer - Intern
 WSO2 Inc.: http://wso2.com
 lean.enterprise.middleware

 Email: as...@wso2.com
 Mobile: +94 716042995
 LinkedIn: http://lk.linkedin.com/in/ashenweerathunga




 --

 Thanks & regards,
 Nirmal

 Team Lead - WSO2 Machine Learner
 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --
 *CD Athuraliya*
 Software Engineer
 WSO2, Inc.
 lean . enterprise . middleware
 Mobile: +94 716288847
 LinkedIn http://lk.linkedin.com/in/cdathuraliya | Twitter
 https://twitter.com/cdathuraliya | Blog
 http://cdathuraliya.tumblr.com/




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [DAS 3.0.0 Beta] No FileSystem for scheme: file

2015-08-14 Thread Maheshakya Wijewardena
(SecretManagerInitializerComponent.java:48)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.eclipse.equinox.internal.ds.model.ServiceComponent.activate(ServiceComponent.java:260)
 at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.activate(ServiceComponentProp.java:146)
 at org.eclipse.equinox.internal.ds.model.ServiceComponentProp.build(ServiceComponentProp.java:345)
 at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponent(InstanceProcess.java:620)
 at org.eclipse.equinox.internal.ds.InstanceProcess.buildComponents(InstanceProcess.java:197)
 at org.eclipse.equinox.internal.ds.Resolver.buildNewlySatisfied(Resolver.java:473)
 at org.eclipse.equinox.internal.ds.Resolver.enableComponents(Resolver.java:217)
 at org.eclipse.equinox.internal.ds.SCRManager.performWork(SCRManager.java:816)
 at org.eclipse.equinox.internal.ds.SCRManager$QueuedJob.dispatch(SCRManager.java:783)
 at org.eclipse.equinox.internal.ds.WorkThread.run(WorkThread.java:89)
 at java.lang.Thread.run(Thread.java:745)



 [1]
 https://docs.google.com/a/wso2.com/drawings/d/1t4NVfkMIeCpRxVZBqr-HeMsg3Qh5mo8huVXED4sYoHE/edit?usp=sharing

 --
 *Thanks and Regards,*
 Anuruddha Lanka Liyanarachchi
 Software Engineer - WSO2
 Mobile : +94 (0) 712762611
 Tel  : +94 112 145 345
 a thili...@wso2.comnurudd...@wso2.com




 --
 Gokul Balakrishnan
 Senior Software Engineer,
 WSO2, Inc. http://wso2.com
 Mob: +94 77 593 5789 | +1 650 272 9927




 --
 *Thanks and Regards,*
 Anuruddha Lanka Liyanarachchi
 Software Engineer - WSO2
 Mobile : +94 (0) 712762611
 Tel  : +94 112 145 345
 a thili...@wso2.comnurudd...@wso2.com




 --
 Gokul Balakrishnan
 Senior Software Engineer,
 WSO2, Inc. http://wso2.com
 Mob: +94 77 593 5789 | +1 650 272 9927





-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [ML] Categorical or Numerical column?

2015-08-13 Thread Maheshakya Wijewardena
Another approach to distinguish between categorical and numerical features
is as follows:

First, we take the unique values from the column and sort them. If it's
a categorical feature, then the gaps between the elements of this sorted
list should be equal. In a numerical feature, this is extremely unlikely to
happen. This behavior is valid in most scenarios, but there are a few
exceptions as well, e.g. when a numerical ID is used as the categorical
label - 19933, 19913, 18832, ...

This is a very simple hack that can be easily implemented, but not a
standard technique.
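
A rough Python sketch of the idea, just to show how small it is. The function name, the tolerance and the handling of columns with fewer than three distinct values are my assumptions, not something that exists in ML today.

```python
def looks_categorical(values, tolerance=1e-9):
    """Suggest 'categorical' when the sorted unique values are equally spaced."""
    unique = sorted(set(values))
    if len(unique) < 3:
        # With one or two distinct values there are no gaps to compare;
        # this sketch assumes such a column is categorical.
        return True
    gaps = [b - a for a, b in zip(unique, unique[1:])]
    return max(gaps) - min(gaps) <= tolerance


print(looks_categorical([0, 1, 2, 1, 0, 2]))         # encoded categories
print(looks_categorical([0.455, 0.35, 0.53, 0.44]))  # continuous measurements
print(looks_categorical([19933, 19913, 18832]))      # numerical IDs used as labels
```

The last call is the exception mentioned above: the IDs are categorical labels, but their gaps are unequal, so the heuristic would suggest a numerical feature.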

WDYT?

On Fri, Aug 14, 2015 at 8:55 AM, Srinath Perera srin...@wso2.com wrote:

 I mean current approach and skewness?

 On Fri, Aug 14, 2015 at 8:54 AM, Srinath Perera srin...@wso2.com wrote:

 Can we use a combination of both?

 On Thu, Aug 13, 2015 at 8:46 PM, Supun Sethunga sup...@wso2.com wrote:

 When a dataset is large, in general it's said to approximate a
 Normal Distribution. :)  True, it's hypothetical, but the point they make is
 that when the datasets are large, properties of a distribution such as
 skewness, variance etc. become closer to the properties of a Normal
 Distribution in most cases.

 On Thu, Aug 13, 2015 at 11:07 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Supun,

 Thanks for the reply.

 On Thu, Aug 13, 2015 at 8:09 PM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi Nirmal,

 IMO I don't think we would be able to use skewness in this case.
 Skewness says how symmetric the distribution is. For example, if we
 consider a numerical/continuous feature (not categorical) which is
 Normally Distributed, then the skewness would be 0. Also, for a categorical
 (encoded) feature having a symmetric distribution, the skewness would
 again be 0.


 What's the probability that you'd see a normal distribution in a real
 dataset? IMO it's very low, and since what we're doing here is only a
 suggestion, do you see it as an issue?



 We did have this concern at the beginning as well, regarding how we
 could determine whether a feature is categorical or continuous. Usually
 this is strictly dependent on the domain of the dataset (i.e. the user has to
 decide this with knowledge about the data). That was the idea behind
 letting the user change the data type. But since we needed a default option,
 we had to go for the threshold approach, which was the only option we could
 come up with. I did a bit of research on this too, but only to find no
 other solution :(

 Thanks,
 Supun

 On Thu, Aug 13, 2015 at 1:49 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi All,

 We have a feature in ML where we suggest whether a given data column of a
 dataset is categorical or numerical. Currently, we determine this by
 using a threshold value (the maximum number of categories that a
 non-string categorical feature can have; if it is exceeded, the feature
 is treated as a numerical feature). But this is not a
 successful measure for most datasets.

 Can we use the 'skewness' of a distribution as a measure to determine
 this? Can we say a column is numerical if the modulus of the skewness of
 the distribution is less than a certain threshold (say 0.01)?
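
For reference, the proposed rule could be sketched in Python like this. It assumes the population (moment-based) definition of skewness; the function names are hypothetical and only the 0.01 threshold comes from the question above.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # a constant column has no spread, hence no skew
    return m3 / m2 ** 1.5


def suggest_numerical(values, threshold=0.01):
    """Proposed rule: numerical if the absolute skewness is below the threshold."""
    return abs(skewness(values)) < threshold


print(suggest_numerical([1, 2, 3, 4, 5]))     # symmetric values
print(suggest_numerical([1, 1, 1, 1, 10]))    # heavily right-skewed values
print(suggest_numerical([0, 1, 2, 0, 1, 2]))  # balanced encoded categories
```

Note that the balanced encoded-category column is symmetric too, so it has zero skewness and the rule alone would suggest it is numerical.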

 *References*:

 http://www.itrcweb.org/gsmc-1/Content/GW%20Stats/5%20Methods%20in%20indiv%20Topics/5%206%20Distributional%20Tests.htm

 --

 Thanks & regards,
 Nirmal

 Team Lead - WSO2 Machine Learner
 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --
 *Supun Sethunga*
 Software Engineer
 WSO2, Inc.
 http://wso2.com/
 lean | enterprise | middleware
 Mobile : +94 716546324




 --

 Thanks & regards,
 Nirmal

 Team Lead - WSO2 Machine Learner
 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --
 *Supun Sethunga*
 Software Engineer
 WSO2, Inc.
 http://wso2.com/
 lean | enterprise | middleware
 Mobile : +94 716546324




 --
 
 Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
 Site: http://people.apache.org/~hemapani/
 Photos: http://www.flickr.com/photos/hemapani/
 Phone: 0772360902




 --
 
 Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
 Site: http://people.apache.org/~hemapani/
 Photos: http://www.flickr.com/photos/hemapani/
 Phone: 0772360902




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [Orbit] Adding Spark MLlib 1.4.1

2015-08-06 Thread Maheshakya Wijewardena
Thanks Niranjan.

On Thu, Aug 6, 2015 at 3:04 PM, Niranjan Karunanandham niran...@wso2.com
wrote:

 Hi Maheshakya,

 I have merged the PR.

 Regards,
 Nira

 On Thu, Aug 6, 2015 at 2:57 PM, Niranjan Karunanandham niran...@wso2.com
 wrote:

 Noted!

 On Thu, Aug 6, 2015 at 2:13 PM, KasunG Gajasinghe kas...@wso2.com
 wrote:

 Hi Niranjan,

 Can you review this?

 Thanks.

 On Thu, Aug 6, 2015 at 1:59 PM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi,

 Can you please review and merge this PR [1] for wso2v1 of Spark
 MLlib 1.4.1?
 The latest patch release addition of Spark 1.4.1 did not include
 MLlib, so I have added it.

 Best regards,

 [1] https://github.com/wso2/orbit/pull/116
 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 : http://wso2.com/
 Email: mahesha...@wso2.com
 Mobile: +94711228855





 --

 *Kasun Gajasinghe*
 Senior Software Engineer, WSO2 Inc.
 email: kasung AT spamfree wso2.com
 linked-in: http://lk.linkedin.com/in/gajasinghe
 blog: http://kasunbg.org






 --

 *Niranjan Karunanandham*
 Senior Software Engineer - WSO2 Inc.
 WSO2 Inc.: http://www.wso2.com




 --

 *Niranjan Karunanandham*
 Senior Software Engineer - WSO2 Inc.
 WSO2 Inc.: http://www.wso2.com




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [Orbit] Adding Spark MLlib 1.4.1

2015-08-06 Thread Maheshakya Wijewardena
Hi,

Can you please review and merge this PR [1] for wso2v1 of Spark MLlib
1.4.1?
The latest patch release addition of Spark 1.4.1 did not include MLlib, so
I have added it.

Best regards,

[1] https://github.com/wso2/orbit/pull/116
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] WSO2 Committers += Anuruddha Liyanarachchi

2015-08-05 Thread Maheshakya Wijewardena
Congratulations!

On Wed, Aug 5, 2015 at 2:03 PM, Dilan Udara Ariyaratne dil...@wso2.com
wrote:

 Congratulations, Anuruddha !!!


 *Dilan U. Ariyaratne*
 Software Engineer
 WSO2 Inc. http://wso2.com/
 Mobile: +94775149066
 lean . enterprise . middleware

 On Wed, Aug 5, 2015 at 11:12 AM, Lalanke Athauda lala...@wso2.com wrote:

 Congratulations Anuruddha..

 On Wed, Aug 5, 2015 at 11:09 AM, Sajith Abeywardhana saji...@wso2.com
 wrote:

 Congratulations Anuruddha !!!

 *Sajith Abeywardhana* | Software Engineer
 WSO2, Inc | lean. enterprise. middleware.
 #20, Palm Grove, Colombo 03, Sri Lanka.
 Mobile: +94772260485
 Email: saji...@wso2.com | Web: www.wso2.com

 On Mon, Aug 3, 2015 at 11:37 PM, Imesh Gunaratne im...@wso2.com wrote:

 s/great contributors/great contributions/g

 On Mon, Aug 3, 2015 at 11:36 PM, Imesh Gunaratne im...@wso2.com
 wrote:

 Hi Devs,

 It's my pleasure to welcome Anuruddha Liyanarachchi as a WSO2
 Committer.

 Anuruddha has done great contributors to the WSO2 Private PaaS
 project. As a recognition of his work he has been voted as a WSO2
 committer.

 Anuruddha, welcome aboard! Keep up the good work!

 Thanks

 --
 *Imesh Gunaratne*
 Senior Technical Lead
 WSO2 Inc: http://wso2.com
 T: +94 11 214 5345 M: +94 77 374 2057
 W: http://imesh.gunaratne.org
 Lean . Enterprise . Middleware




 --
 *Imesh Gunaratne*
 Senior Technical Lead
 WSO2 Inc: http://wso2.com
 T: +94 11 214 5345 M: +94 77 374 2057
 W: http://imesh.gunaratne.org
 Lean . Enterprise . Middleware









 --
 Lalanke Athauda
 Software Engineer
 Mobile: 0772264301








-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] WSO2 Committers += CD Athuraliya

2015-07-31 Thread Maheshakya Wijewardena
Congratulations !!! :D


On Fri, Jul 31, 2015 at 3:52 PM, Chamin Nalinda chm...@gmail.com wrote:

 Congratulations CD

 On Fri, Jul 31, 2015 at 2:26 PM, Madhawa Gunasekara madha...@wso2.com
 wrote:

 Congratulations CD !!!

 On Fri, Jul 31, 2015 at 2:04 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi All,

 It's my pleasure to announce *CD Athuraliya* as a *WSO2 Committer*. He
 has been a key contributor to the *WSO2 Machine Learner* product and, in
 recognition of his excellent work, he has been voted in as a WSO2 Committer.

 Congratulations CD and keep up the good work!

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/







 --
 *Madhawa Gunasekara*
 Software Engineer
 WSO2 Inc.; http://wso2.com
 lean.enterprise.middleware

 mobile: +94 719411002
 blog: http://madhawa-gunasekara.blogspot.com
 linkedin: http://lk.linkedin.com/in/mgunasekara





 --
 Chamin Nalinda
 Research Undergraduate
 University of Colombo School of Computing (UCSC).
 Student Member IEEE (92387118)
 Student Member ACM (5654073)

 LinkedIn: https://www.linkedin.com/in/chaminnalinda
 GitHub: https://github.com/CoolCK
 SlideShare: http://www.slideshare.net/ChaminNalindaLokuGam/
 Blog: http://techspiro.blogspot.com/







-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [DEV][DAS] Spark cluster in DAS does not have worker nodes

2015-07-27 Thread Maheshakya Wijewardena
Thanks Niranda.


Best regards.

On Mon, Jul 27, 2015 at 5:40 PM, Niranda Perera nira...@wso2.com wrote:

 Hi Maheshakya,

 I've fixed this issue. Pls take an update of the carbon-analytics repo

 rgds

 On Mon, Jul 27, 2015 at 12:12 PM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi Niranda,

 I started those one by one.

 On Mon, Jul 27, 2015 at 12:07 PM, Niranda Perera nira...@wso2.com
 wrote:

 Hi Maheshakya,

 I will look into this.

 According to your setting, the ideal scenario is,
 node1 -  spark master (active) + worker
 node2 - spark master (standby) + worker
 node3 - worker

 did you start the servers all at once or one by one?

 rgds

 On Mon, Jul 27, 2015 at 11:07 AM, Anjana Fernando anj...@wso2.com
 wrote:

 Hi,

 Actually, when the 3rd server has started up, all 3 servers should have
 worker instances. This seems to be a bug. @Niranda, please check it out
 ASAP.

 Cheers,
 Anjana.

 On Mon, Jul 27, 2015 at 11:01 AM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi,

 I have tried to create a Spark cluster with DAS using Carbon
 clustering. 3 DAS nodes are configured and number of Spark masters is set
 to 2. In this setting, one of the 3 nodes should have a Spark worker node,
 but all 3 nodes are starting as Spark masters. What can be the reason for
 this?

 Configuration files (one of *axis2.xml* files of DAS clusters and
 *spark-defaults.conf*) of DAS are attached herewith.

 Best regards.
 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 : http://wso2.com/
 Email: mahesha...@wso2.com
 Mobile: +94711228855





 --
 *Anjana Fernando*
 Senior Technical Lead
 WSO2 Inc. | http://wso2.com
 lean . enterprise . middleware




 --
 *Niranda Perera*
 Software Engineer, WSO2 Inc.
 Mobile: +94-71-554-8430
 Twitter: @n1r44 https://twitter.com/N1R44
 https://pythagoreanscript.wordpress.com/




 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 : http://wso2.com/
 Email: mahesha...@wso2.com
 Mobile: +94711228855





 --
 *Niranda Perera*
 Software Engineer, WSO2 Inc.
 Mobile: +94-71-554-8430
 Twitter: @n1r44 https://twitter.com/N1R44
 https://pythagoreanscript.wordpress.com/




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [DEV][DAS] Spark cluster in DAS does not have worker nodes

2015-07-27 Thread Maheshakya Wijewardena
Hi Niranda,

I started those one by one.

On Mon, Jul 27, 2015 at 12:07 PM, Niranda Perera nira...@wso2.com wrote:

 Hi Maheshakya,

 I will look into this.

 According to your setting, the ideal scenario is,
 node1 -  spark master (active) + worker
 node2 - spark master (standby) + worker
 node3 - worker

 did you start the servers all at once or one by one?

 rgds

 On Mon, Jul 27, 2015 at 11:07 AM, Anjana Fernando anj...@wso2.com wrote:

 Hi,

 Actually, when the 3rd server has started up, all 3 servers should have
 worker instances. This seems to be a bug. @Niranda, please check it out
 ASAP.

 Cheers,
 Anjana.

 On Mon, Jul 27, 2015 at 11:01 AM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi,

 I have tried to create a Spark cluster with DAS using Carbon clustering.
 3 DAS nodes are configured and number of Spark masters is set to 2. In this
 setting, one of the 3 nodes should have a Spark worker node, but all 3
 nodes are starting as Spark masters. What can be the reason for this?

 Configuration files (one of *axis2.xml* files of DAS clusters and
 *spark-defaults.conf*) of DAS are attached herewith.

 Best regards.
 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 : http://wso2.com/
 Email: mahesha...@wso2.com
 Mobile: +94711228855





 --
 *Anjana Fernando*
 Senior Technical Lead
 WSO2 Inc. | http://wso2.com
 lean . enterprise . middleware




 --
 *Niranda Perera*
 Software Engineer, WSO2 Inc.
 Mobile: +94-71-554-8430
 Twitter: @n1r44 https://twitter.com/N1R44
 https://pythagoreanscript.wordpress.com/




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [DEV][DAS] Spark cluster in DAS does not have worker nodes

2015-07-26 Thread Maheshakya Wijewardena
Hi,

I have tried to create a Spark cluster with DAS using Carbon clustering. 3
DAS nodes are configured and number of Spark masters is set to 2. In this
setting, one of the 3 nodes should have a Spark worker node, but all 3
nodes are starting as Spark masters. What can be the reason for this?

Configuration files (one of *axis2.xml* files of DAS clusters and
*spark-defaults.conf*) of DAS are attached herewith.

Best regards.
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855
<!--
  ~ Copyright 2005-2011 WSO2, Inc. (http://wso2.com)
  ~
  ~ Licensed under the Apache License, Version 2.0 (the "License");
  ~ you may not use this file except in compliance with the License.
  ~ You may obtain a copy of the License at
  ~
  ~ http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->

<axisconfig name="AxisJava2.0">

<!-- ================================================= -->
<!-- Globally engaged modules -->
<!-- ================================================= -->
<module ref="addressing"/>

<!-- ================================================= -->
<!-- Parameters -->
<!-- ================================================= -->
<parameter name="hotdeployment">true</parameter>
<parameter name="hotupdate">true</parameter>
<parameter name="enableMTOM" locked="false">optional</parameter>
<parameter name="cacheAttachments">true</parameter>
<parameter name="attachmentDIR">work/mtom</parameter>
<parameter name="sizeThreshold">4000</parameter>

<parameter name="EnableChildFirstClassLoading">${childfirstCL}</parameter>

<!--
The exposeServiceMetadata parameter decides whether the metadata (WSDL, schema, policy) of
the services deployed on Axis2 should be visible when ?wsdl, ?wsdl2, ?xsd, ?policy requests
are received.
This parameter can be defined in the axi2.xml file, in which case this will be applicable
globally, or in the services.xml files, in which case, it will be applicable to the
Service groups and/or services, depending on the level at which the parameter is declared.
This value of this parameter defaults to true.
-->
<parameter name="exposeServiceMetadata">true</parameter>

<!--If turned on with use the Accept header of the request to determine the contentType of the
response-->
<parameter name="httpContentNegotiation">true</parameter>

<!--
Defines how the persistence of WS-ReliableMessaging is handled

Possible value are: inmemory  persistent
-->
<!-- Following parameter will completely disable REST handling in both the servlets-->
<parameter name="disableREST" locked="false">false</parameter>

<parameter name="Sandesha2StorageManager">inmemory</parameter>

<!-- This deployment interceptor will be called whenever before a module is initialized or
 service is deployed -->
<listener class="org.wso2.carbon.core.deployment.DeploymentInterceptor"/>

<!-- setting servicePath. contextRoot is defined in the carbon.xml file -->
<!-- modification of this variable should be accompanied by the change in 'ServerURL' in carbon.xml file -->
<parameter name="servicePath">services</parameter>

<!--the directory in which .aar services are deployed inside axis2 repository-->
<parameter name="ServicesDirectory">axis2services</parameter>

<!--the directory in which modules are deployed inside axis2 repository-->
<parameter name="ModulesDirectory">axis2modules</parameter>

<parameter name="userAgent" locked="true">
WSO2 Data Analytics Server-3.0.0
</parameter>
<parameter name="server" locked="true">
WSO2 Data Analytics Server-3.0.0
</parameter>

<!-- -->

<!--During a fault, stacktrace can be sent with the fault message. The following flag will control -->
<!--that behaviour.-->
<parameter name="sendStacktraceDetailsWithFaults">false</parameter>

<!--If there aren't any information available to find out the fault reason, we set the message of the expcetion-->
<!--as the faultreason/Reason. But when a fault is thrown from a service or some where, it will be -->
<!--wrapped by different levels. Due to this the initial exception message can be lost. If this flag-->
<!--is set then, Axis2 tries to get the first exception and set its message as the faultreason/Reason.-->
<parameter name="DrillDownToRootCauseForFaultReason">false</parameter>

<!--Set the flag to true if you want to enable transport level session mangment-->
<parameter name="manageTransportSession">true</parameter>

<!-- Synapse Configuration file -->
<parameter name="SynapseConfig.ConfigurationFile" locked="false">
./repository

Re: [Dev] ML Error with Abalone Dataset

2015-07-14 Thread Maheshakya Wijewardena
This is because the Abalone dataset is a multi-class classification dataset
(the response variable has the values M, F, and I) and SVM supports only binary
classification. There should be no more than 2 distinct values in the response
variable.

Maybe we need to tell this in a nicer way rather than just saying that
input validation failed.
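
As an illustration of what a friendlier check could look like, here is a small Python sketch. It is not the ML product's validation code; the function name and the message wording are hypothetical.

```python
def validate_binary_response(values, algorithm="SVM"):
    """Raise a descriptive error if the response has more than 2 distinct values."""
    distinct = sorted(set(values))
    if len(distinct) > 2:
        raise ValueError(
            f"{algorithm} supports only binary classification, but the response "
            f"variable has {len(distinct)} distinct values: "
            f"{', '.join(map(str, distinct))}"
        )
    return distinct


try:
    validate_binary_response(["M", "F", "I", "M"])
except ValueError as error:
    print(error)
```

A message of this kind names both the algorithm's limitation and the offending values, which is what the generic "Input validation failed" text leaves out.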

On Tue, Jul 14, 2015 at 4:56 PM, Srinath Perera srin...@wso2.com wrote:

 Got the following error trying the dataset
 https://archive.ics.uci.edu/ml/datasets/Abalone. Please check.
 --Srinath

 An error occurred while building supervised machine learning model: An
 error occurred while building SVM model: Input validation failed. Model
 Configuration [algorithmName=SVM, algorithmClass=Classification,
 responseVariable=M, trainDataFraction=0.7, hyperParameters={Iterations=100,
 SGD_Data_Fraction=1, Reg_Type=L1, Reg_Parameter=0.001,
 Learning_Rate=0.001}, features=[Feature [name=0.095, index=3,
 type=NUMERICAL, imputeOption=DISCARD, include=true], Feature [name=0.101,
 index=6, type=NUMERICAL, imputeOption=DISCARD, include=true], Feature
 [name=0.15, index=7, type=NUMERICAL, imputeOption=DISCARD, include=true],
 Feature [name=0.2245, index=5, type=NUMERICAL, imputeOption=DISCARD,
 include=true], Feature [name=0.365, index=2, type=NUMERICAL,
 imputeOption=DISCARD, include=true], Feature [name=0.455, index=1,
 type=NUMERICAL, imputeOption=DISCARD, include=true], Feature [name=0.514,
 index=4, type=NUMERICAL, imputeOption=DISCARD, include=true], Feature
 [name=15, index=8, type=NUMERICAL, imputeOption=DISCARD, include=true],
 Feature [name=M, index=0, type=CATEGORICAL, imputeOption=DISCARD,
 include=true]]]

 --
 
 Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
 Site: http://people.apache.org/~hemapani/
 Photos: http://www.flickr.com/photos/hemapani/
 Phone: 0772360902





-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [HACKATHON][DAS] Working with the Analytics Dashboard

2015-07-02 Thread Maheshakya Wijewardena
Hi DAS team,

The current documentation of the analytics dashboard [1] seems
obsolete (it describes an older dashboard).

I'm trying to create a dashboard for an event stream stored in a record
store. With the current dashboard settings, it's not possible to provide
event stream info or any other table info from the record stores. How do I
create a dashboard for such a scenario?

[1] https://docs.wso2.com/display/DAS300/Adding+a+Dashboard

Best regards.
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [HACKATHON][DAS] Working with the Analytics Dashboard

2015-07-02 Thread Maheshakya Wijewardena
This works. Thanks.
Earlier, the 'Create Gadget' wizard raised an exception due to an HBase DAL
configuration error. After fixing that, it works fine.

Best regards.

On Thu, Jul 2, 2015 at 1:32 PM, Dunith Dhanushka dun...@wso2.com wrote:

 Hi Maheshakya,

 What's the version of DAS you are testing?

 The Analytics Dashboard has undergone some UX changes since the alpha release,
 and those changes are being documented at the moment. I assume your
 requirement is to create a gadget from an event stream or a table stored in
 the DAL layer and add it to a dashboard (correct me if I'm wrong). In
 that case, you can use the gadget generation wizard.

 After the UX changes, the gadget generation wizard link was relocated to the
 user's welcome page, which can be accessed using the URL [1]. There you'll see
 a link named CREATE GADGET to start the wizard.


 [1] https://hostName:9443/portal/

 Regards,
 Dunith



 On Thu, Jul 2, 2015 at 12:26 PM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi DAS team,

 The current documentation of the analytics dashboard [1] seems
 obsolete (it describes an older dashboard).

 I'm trying to create a dashboard for an event stream stored in a record
 store. With the current dashboard settings, it's not possible to provide
 event stream info or any other table info from the record stores. How do I
 create a dashboard for such a scenario?

 [1] https://docs.wso2.com/display/DAS300/Adding+a+Dashboard

 Best regards.
 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 Lanka (Pvt) Ltd
 Email: mahesha...@wso2.com
 Mobile: +94711228855







 --
 Regards,

 Dunith Dhanushka,
 Senior Software Engineer - BAM,
 WSO2 Inc,

 Mobile - +94 71 8615744
 Blog - dunithd.wordpress.com http://blog.dunith.com
 Twitter - @dunithd http://twitter.com/dunithd




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [CEP] Unable to create query in Siddhi Query Language

2015-06-22 Thread Maheshakya Wijewardena
It's working now. Thanks.

On Mon, Jun 22, 2015 at 8:15 PM, Tharik Kanaka tha...@wso2.com wrote:

 Change the averageLoad attribute of the processed_data export stream from
 float to double and try again.
 The avg function returns a double value.

 On Mon, Jun 22, 2015 at 7:17 PM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi,

 I'm trying to create an execution plan similar to the following:

 It has a data receiver for an event stream called `streaming_data` and a
 data publisher for an event stream called `processed_data`.
 I have created an execution plan as follows:

 /* Enter a unique ExecutionPlan */
 @Plan:name('ExecutionPlan')

 /* Enter a unique description for ExecutionPlan */
 -- @Plan:description('ExecutionPlan')

 /* define streams/tables and write queries here ... */

 @Import('streaming_data:1.0.0')
 define stream streaming_data (meta_type string, id int, timeStamp int,
 value float, property bool, plugId int, householdId int, houseId string);

 @Export('processed_data:1.0.0')
 define stream processed_data (averageLoad float);

 from streaming_data#window.length(5)
 select avg(value) as averageLoad
 insert into processed_data;


 When I try to validate my query, I get the following error:

 Different definition same as output stream definition
 :StreamDefinition{id='processed_data',
 attributeList=[Attribute{id='averageLoad', type=DOUBLE}], annotations=[]}
 already exist as:StreamDefinition{id='processed_data',
 attributeList=[Attribute{id='averageLoad', type=FLOAT}],
 annotations=[Annotation{name='Export', elements=[Element{key='null',
 value='processed_data:1.0.0'}]}]} in execution plan ExecutionPlan


 What might be the reason for this?

 What I want to do is read a moving window from the `streaming_data`
 event stream, compute the average of the attribute `value` within that
 window, and send it to the `processed_data` event stream.

 (I've built product-cep from the master sources and am running CEP in
 distributed mode with Storm.)

 Best regards.
 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 Lanka (Pvt) Ltd
 Email: mahesha...@wso2.com
 Mobile: +94711228855







 --

 *Tharik Kanaka*

 WSO2, Inc |#20, Palm Grove, Colombo 03, Sri Lanka

 Email: tha...@wso2.com | Web: www.wso2.com




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [CEP] Unable to create query in Siddhi Query Language

2015-06-22 Thread Maheshakya Wijewardena
Hi,

I'm trying to create an execution plan similar to the following:

It has a data receiver for an event stream called `streaming_data` and a data
publisher for an event stream called `processed_data`.
I have created an execution plan as follows:

/* Enter a unique ExecutionPlan */
 @Plan:name('ExecutionPlan')

 /* Enter a unique description for ExecutionPlan */
 -- @Plan:description('ExecutionPlan')

 /* define streams/tables and write queries here ... */

 @Import('streaming_data:1.0.0')
 define stream streaming_data (meta_type string, id int, timeStamp int,
 value float, property bool, plugId int, householdId int, houseId string);

 @Export('processed_data:1.0.0')
 define stream processed_data (averageLoad float);

 from streaming_data#window.length(5)
 select avg(value) as averageLoad
 insert into processed_data;


When I try to validate my query, I get the following error:

 Different definition same as output stream definition
 :StreamDefinition{id='processed_data',
 attributeList=[Attribute{id='averageLoad', type=DOUBLE}], annotations=[]}
 already exist as:StreamDefinition{id='processed_data',
 attributeList=[Attribute{id='averageLoad', type=FLOAT}],
 annotations=[Annotation{name='Export', elements=[Element{key='null',
 value='processed_data:1.0.0'}]}]} in execution plan ExecutionPlan


What might be the reason for this?

What I want to do is read a moving window from the `streaming_data`
event stream, compute the average of the attribute `value` within that
window, and send it to the `processed_data` event stream.
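
The intended behaviour (a sliding window over the last 5 events, emitting the average after each arrival) can be sketched in plain Python. This is an illustration of the idea only, not Siddhi code; the function name length_window_average is hypothetical, and it assumes the window emits one result per incoming event.

```python
from collections import deque


def length_window_average(values, length=5):
    """Yield the average of the last `length` values after each new event."""
    window = deque(maxlen=length)  # the oldest event is dropped automatically
    for value in values:
        window.append(float(value))
        # Python floats are double precision, which parallels an average
        # function that returns a double even for float inputs.
        yield sum(window) / len(window)


print(list(length_window_average([1, 2, 3, 4, 5, 6], length=5)))
```

The result is always a double-precision number even when the inputs are narrower, which is the same kind of mismatch the validation error reports between DOUBLE and FLOAT.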

(I've built product-cep from the master sources and am running CEP in
distributed mode with Storm.)

Best regards.
--
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] Error in initializing HBase analytics data source in DAS due to missing configuration file

2015-06-10 Thread Maheshakya Wijewardena
Hi,

When DAS (cloned from master) is set up with HBase as its data access
layer (as described in [1]) and the server is started, the following error occurs:

 Error in creating analytics data service from configuration: Cannot
 initialize HBase analytics data source the configuration file cannot be
 found at:path_to_DAS
 /wso2das-3.0.0-SNAPSHOT/repository/conf/analytics/hbase-analytics-config.xml


This happens because the *hbase-analytics-config.xml* file is not available
at the specified location. The config file is currently located at:
path_to_DAS/wso2das-3.0.0-SNAPSHOT/repository/components/features/org.wso2.carbon.analytics.datasource.hbase.server_1.0.3.SNAPSHOT/conf/analytics/hbase-analytics-config.xml.

This can be solved by copying that file into the first location (a step which
is not documented or stated anywhere). Maybe it is supposed to be copied to
that location in the first place. Please have a look into this. Thanks.
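
The manual workaround amounts to "copy the file if the target is missing". A minimal sketch of that step in Python follows; the helper name ensure_config is hypothetical, and the paths in the comment simply repeat the two locations mentioned above, relative to the DAS home directory.

```python
import shutil
from pathlib import Path


def ensure_config(source: Path, target: Path) -> bool:
    """Copy `source` to `target` if `target` is missing.

    Returns True if a copy was made, False if the target already existed.
    """
    if target.exists():
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return True


# Example call (das_home is wherever wso2das-3.0.0-SNAPSHOT was unpacked):
# ensure_config(
#     das_home / "repository/components/features"
#              / "org.wso2.carbon.analytics.datasource.hbase.server_1.0.3.SNAPSHOT"
#              / "conf/analytics/hbase-analytics-config.xml",
#     das_home / "repository/conf/analytics/hbase-analytics-config.xml",
# )
```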

Best regards.

[1] https://docs.wso2.com/display/DAS300/HBase
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [Spark Logs] Move spark logs to spark.log file and skip wso2carbon.log

2015-06-08 Thread Maheshakya Wijewardena
+1
It sure is cluttering things up.

But won't it be inconvenient when we need to relate an issue in a
carbon component to the actual Spark issue?

On Mon, Jun 8, 2015 at 12:24 PM, Niranda Perera nira...@wso2.com wrote:

 +1 for this.

 Even in DAS this is an issue, and I think this would be a better option.


 On Mon, Jun 8, 2015 at 12:22 PM, Nirmal Fernando nir...@wso2.com wrote:

 All,

 What do you think about https://wso2.org/jira/browse/ML-68 ?

 Reason:

 Spark logs are cluttering the wso2carbon.log and it'll be convenient to
 have the spark logs in a different file.

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/







 --
 *Niranda Perera*
 Software Engineer, WSO2 Inc.
 Mobile: +94-71-554-8430
 Twitter: @n1r44 https://twitter.com/N1R44
 https://pythagoreanscript.wordpress.com/




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [Spark Logs] Move spark logs to spark.log file and skip wso2carbon.log

2015-06-08 Thread Maheshakya Wijewardena
Yes, that's a passable solution.

On Mon, Jun 8, 2015 at 1:14 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi,

 So, since the ML components invoke Spark, if there's a SparkException, the ML
 component should log it properly. And if you need to see an internal Spark
 error, you have to correlate the two using the timestamp. Will that be
 sufficient?

 On Mon, Jun 8, 2015 at 12:38 PM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 +1
 It sure is cluttering things up.

 But won't it be inconvenient when we need to relate an issue in a
 carbon component to the actual Spark issue?

 On Mon, Jun 8, 2015 at 12:24 PM, Niranda Perera nira...@wso2.com wrote:

 +1 for this.

 Even in DAS this is an issue, and I think this would be a better option.


 On Mon, Jun 8, 2015 at 12:22 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 All,

 What do you think about https://wso2.org/jira/browse/ML-68 ?

 Reason:

 Spark logs are cluttering the wso2carbon.log and it'll be convenient to
 have the spark logs in a different file.

 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/







 --
 *Niranda Perera*
 Software Engineer, WSO2 Inc.
 Mobile: +94-71-554-8430
 Twitter: @n1r44 https://twitter.com/N1R44
 https://pythagoreanscript.wordpress.com/




 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 Lanka (Pvt) Ltd
 Email: mahesha...@wso2.com
 Mobile: +94711228855





 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] [ML] Decision tree fails when a feature contains only a single value

2015-06-03 Thread Maheshakya Wijewardena
Spark's decision tree does not accept datasets in which a feature has only a
single value. It produces the following error:

 requirement failed: DecisionTree Strategy given invalid
 categoricalFeaturesInfo setting: feature 645 has 1 categories.  The number
 of categories should be >= 2


This is not an uncommon scenario, since large datasets can contain features
with only a single value (see the training data in [1] for an example). As this
is a Spark error, there should be a way to handle such datasets externally.

One possible solution is to allow the user to discard features (columns), so
that they can discard features with a single value before training a
decision tree. Please suggest any other feasible solutions.
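
The proposed pre-processing step could look roughly like this. It is a minimal sketch in plain Python rather than the ML product's code; the function name drop_constant_columns and the tiny sample data are made up for illustration.

```python
def drop_constant_columns(rows, header):
    """Remove columns that hold a single distinct value across all rows."""
    keep = [
        i for i in range(len(header))
        if len({row[i] for row in rows}) > 1
    ]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


# pixel0 is 0 in every row, so it carries no information and is dropped.
header = ["pixel0", "pixel1", "label"]
rows = [[0, 12, 3], [0, 40, 7], [0, 12, 7]]
print(drop_constant_columns(rows, header))
```

Whether this happens automatically or is left to the user as a feature-selection option is a design decision; the sketch only shows the detection itself.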

Best regards,

[1] https://www.kaggle.com/c/digit-recognizer
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] Deprecate import model and add export model in rest API

2015-06-01 Thread Maheshakya Wijewardena
Fixed in the PR: https://github.com/wso2/product-ml/pull/157

On Mon, Jun 1, 2015 at 2:46 PM, Nirmal Fernando nir...@wso2.com wrote:

 Can you please fix UI's download operation to use the new API?

 On Mon, Jun 1, 2015 at 2:42 PM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Hi,

 Task done for the JIRA[1] in the PR[2]

 [1] https://wso2.org/jira/browse/ML-58
 [2] https://github.com/wso2/carbon-ml/pull/13
 --
 Pruthuvi Maheshakya Wijewardena
 Software Engineer
 WSO2 Lanka (Pvt) Ltd
 Email: mahesha...@wso2.com
 Mobile: +94711228855





 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


[Dev] Deprecate import model and add export model in rest API

2015-06-01 Thread Maheshakya Wijewardena
Hi,

Task done for the JIRA[1] in the PR[2]

[1] https://wso2.org/jira/browse/ML-58
[2] https://github.com/wso2/carbon-ml/pull/13
-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] Algo to use for Logistic regression

2015-05-31 Thread Maheshakya Wijewardena
I agree with Upul about adding both if possible. Mini-batch raises the
question of choosing the batch size, but finding the right batch size may
greatly improve our results as well as the time to convergence. Still, it
can depend heavily on the dataset.

Have you tried different datasets? Different in terms of size as well
as other statistical properties of the features (such as standard deviation,
skewness, etc.)?

On Sun, May 31, 2015 at 10:28 PM, Nirmal Fernando nir...@wso2.com wrote:

 yes.. but from the simple test I did, I felt L-BFGS is faster. Will
 confirm anyway.

 On Sun, May 31, 2015 at 10:13 PM, Upul Bandara u...@wso2.com wrote:

 Actually, I'm thinking in terms of training time, even for large data
 sets prediction accuracy of L-BFGS will outperform SGD. But its training
 time would be considerably bigger than the training time of SGD.
 On the other hand, SGD model gives a decent prediction accuracy in
 relatively short period of training time.


 On Sun, May 31, 2015 at 9:52 PM, Nirmal Fernando nir...@wso2.com wrote:

 Thanks Upul. So, are you thinking along the lines of performance? Sure,
 I'll run a test.

 On Sun, May 31, 2015 at 9:50 PM, Upul Bandara u...@wso2.com wrote:

 If it is possible, I would like to have both.

 L-BFGS converges faster than SGD. But it goes through the entire data
 set before moving from one iteration to the next.
 Whereas SGD uses a mini-batch of the training data set for
 calculating and updating its gradient.
 Hence, for large data sets SGD is more practical than L-BFGS.

 I think we can test this scenario by running these two algorithms
 against a large data set (~ 1GB)

 Thanks,
 Upul

 On Sun, May 31, 2015 at 8:02 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 One other benefit of switching is, this API supports multi-class
 classification too. I've tested this API with Iris dataset.

 On Sun, May 31, 2015 at 7:33 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi,

 Currently in ML, we use mini-batch gradient descent algorithm when
 running logistic regression. But Spark-mllib recommends L-BFGS over
 mini-batch gradient descent for faster convergence [1].

 I tested both implementations with the same dataset and got better
 accuracy with L-BFGS (80% vs 67% for SGD).

 Shall we switch?

 [1]
 https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression


 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --
 Upul Bandara,
 Associate Technical Lead, WSO2, Inc.,
 Mob: +94 715 468 345.




 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





 --
 Upul Bandara,
 Associate Technical Lead, WSO2, Inc.,
 Mob: +94 715 468 345.




 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [ML] Predicted vs. actuals chart in model summary

2015-05-28 Thread Maheshakya Wijewardena
 http://cdathuraliya.tumblr.com/




 --
 
 Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
 Site: http://people.apache.org/~hemapani/
 Photos: http://www.flickr.com/photos/hemapani/
 Phone: 0772360902




 --
 *CD Athuraliya*
 Software Engineer
 WSO2, Inc.
 lean . enterprise . middleware
 Mobile: +94 716288847 94716288847
 LinkedIn http://lk.linkedin.com/in/cdathuraliya | Twitter
 https://twitter.com/cdathuraliya | Blog
 http://cdathuraliya.tumblr.com/




 --
 *Supun Sethunga*
 Software Engineer
 WSO2, Inc.
 http://wso2.com/
 lean | enterprise | middleware
 Mobile : +94 716546324




 --
 Upul Bandara,
 Associate Technical Lead, WSO2, Inc.,
 Mob: +94 715 468 345.




 --
 *CD Athuraliya*
 Software Engineer
 WSO2, Inc.
 lean . enterprise . middleware
 Mobile: +94 716288847 94716288847
 LinkedIn http://lk.linkedin.com/in/cdathuraliya | Twitter
 https://twitter.com/cdathuraliya | Blog
 http://cdathuraliya.tumblr.com/




-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com
Mobile: +94711228855


Re: [Dev] [ML] Predicted vs. actuals chart in model summary

2015-05-28 Thread Maheshakya Wijewardena
Hi CD,

Two widely used evaluation metrics are the Rand index [1] and mutual
information [2]. In addition, there are homogeneity, completeness and
V-measure [3]. One issue with these external indices is that they require
the ground truth of the cluster assignments, so without the true class
labels these metrics are not usable. There are also several internal indices,
such as the Silhouette Coefficient [4], which do not need ground truth. Some
of those methods are discussed in [5][6][7]. I think internal indices will be
more useful, since ground-truth cluster labels are not always available.
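
To make the dependence on ground truth concrete, here is a minimal sketch of the plain (unadjusted) Rand index in Python. The function name rand_index is just illustrative; note that it needs the true labels as an input, which is exactly the limitation of external indices.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which two clusterings agree.

    A pair counts as an agreement if both clusterings put the two points
    together, or both put them apart. Requires at least two points.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Same grouping under different cluster names counts as perfect agreement.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

The adjusted variant linked in [1] additionally corrects this value for chance agreement.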

For visualization, only 2D (or maybe 3D) plots can be used, even when there
is a large number of features. So the available options are:

   1. Allow the user to choose 2 or 3 features.
   2. Use data reduced to 2 or 3 components with PCA. Here, PCA may need
   to be implemented separately, so this option can be quite tedious.

It would be nice if the Voronoi diagram for the data spread could also be
shown in the same diagram. See [8].

[1] http://en.wikipedia.org/wiki/Rand_index#Adjusted_Rand_index
[2] http://en.wikipedia.org/wiki/Adjusted_mutual_information
[3] http://aclweb.org/anthology/D/D07/D07-1043.pdf
[4] http://en.wikipedia.org/wiki/Silhouette_%28clustering%29
[5]
http://stats.stackexchange.com/questions/21807/evaluation-measure-of-clustering-without-having-truth-labels
[6] https://web.njit.edu/~yl473/papers/ICDM10CLU.pdf
[7] http://shonen.naun.org/multimedia/UPress/cc/20-463.pdf
[8] http://www.naftaliharris.com/blog/visualizing-k-means-clustering/

Best regards.

On Thu, May 28, 2015 at 12:24 PM, CD Athuraliya chathur...@wso2.com wrote:

 Hi Maheshakya,

 We'll be adding cluster diagram in model summary for clustering
 algorithms. Please suggest if there exist any other useful evaluation
 metrics.

 Thanks

 On Thu, May 28, 2015 at 11:58 AM, Maheshakya Wijewardena 
 mahesha...@wso2.com wrote:

 Nice.

 In addition to the charts for classification, I think we need some
 visualization method for clustering as well, since there's nothing to show
 after clustering models are trained. Maybe a chart with respect to two
 selected attributes.

 On Thu, May 28, 2015 at 11:46 AM, CD Athuraliya chathur...@wso2.com
 wrote:

 Hi all,

 Residual plot has been added for numerical prediction algorithms. Using
 standard chart types as much as possible is better IMO. It will reduce user
 confusion in understanding visualizations. I think we need to look for some
 standard chart types for classification algorithms (both binary and
 multiclass) as well [1].

 [1] http://oobaloo.co.uk/visualising-classifier-results-with-ggplot2

 Thanks

 On Wed, May 27, 2015 at 5:38 AM, Srinath Perera srin...@wso2.com
 wrote:

 +1 shall we try those?
 On 26 May 2015 22:52, Upul Bandara u...@wso2.com wrote:

 +1 for residual plots.

 Though I haven't used it myself, the residual plot is a useful diagnostic
 tool for regression models.
 In particular, non-linearity in regression models can be easily
 identified using it.

 An Introduction to Statistical Learning book [1] ( page 92-96)
 contains some useful information about residual plots.

 [1]. http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing.pdf

 On Tue, May 26, 2015 at 8:47 PM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi CD,

 As came up in the offline discussion as well, IMHO, for classification
 this plot may not be the best option. But for regression, we can actually
 use this plot with a slight modification: take the difference between the
 predicted and actual values (rather than the values themselves) and plot
 that against a predictor variable (just as is done at the moment). We can
 also add a third variable (a categorical feature) to color the points. This
 is a standard plot (AKA residual plot) which is usually used to evaluate
 regression models.

 One other thing we can try is doing the same for classification as well,
 i.e. taking the difference between the actual probability (0 or 1) and the
 predicted probability, plotting that, and seeing whether it gives a better
 overall picture. Not sure how it will come out though :) If it comes out
 right, then any point that lies above 0.5 (or the threshold we used) is
 wrongly classified, and hence we can get a rough idea of the values of the
 x-axis feature for which points get wrongly classified. I mean, we should
 be able to see any pattern, if one exists.
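
The classification variant described above can be sketched as follows. This is a small Python illustration under the assumption of a 0.5 decision threshold; the function name classification_residuals and the sample numbers are made up.

```python
def classification_residuals(actual, predicted_prob):
    """Return (residual, misclassified) for each point.

    residual = |actual - predicted probability|, with actual in {0, 1}.
    With a 0.5 decision threshold, a residual above 0.5 means the point
    was assigned to the wrong class.
    """
    results = []
    for y, p in zip(actual, predicted_prob):
        residual = abs(y - p)
        results.append((residual, residual > 0.5))
    return results


actual = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.3, 0.7]
for residual, wrong in classification_residuals(actual, predicted):
    print(round(residual, 2), "misclassified" if wrong else "ok")
```

Plotting these residuals against a chosen feature would then show whether the misclassified points cluster in some range of that feature.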

 Thanks,
 Supun

 On Tue, May 26, 2015 at 6:08 PM, CD Athuraliya chathur...@wso2.com
 wrote:

 Hi,

 Plotting predicted and actual values against a feature doesn't look
 very intuitive, especially for non-probabilistic models. Please check the
 attachments. Any thoughts on making this visualization better?

 Thanks

 On Fri, May 22, 2015 at 3:27 PM, Srinath Perera srin...@wso2.com
 wrote:

 yes, rerun using a random sample from test data is OK.

 --Srinath

 On Fri, May 22, 2015 at 2:28 PM, CD Athuraliya chathur...@wso2.com
  wrote:

 Hi Srinath

[Dev] Aggregate operations support for life cycles - Issue

2014-04-22 Thread Maheshakya Wijewardena
Hi,
I have encountered an issue during the implementation. The initial plan was
as follows:

governance.api contains the following modules:

   - GovernanceBatchValidate interface: This interface contains the
   validate method.
   - GovernanceAggregateOperations: This uses the OSGi service tracker to
   retrieve the correct validator, which is registered as an OSGi service.
   (This registration happens in governance.registry.extensions, and thus
   requires that dependency.)

governance.registry.extensions contains the following modules:

   - LifecycleValidateUtil: At the moment, only this is implemented as a
   custom validator. It checks whether lifecycle states are consistent. This
   class implements the GovernanceBatchValidate interface mentioned above (so
   that dependency is required here).
   - BatchValidateServiceComponent: This registers the above implementation
   as an OSGi service. It will be used in the GovernanceAggregateOperations
   class mentioned above.


This design has a flaw, as there are cyclic references in the
dependencies (governance.api uses governance.registry.extensions and vice
versa). What are the possible solutions to this issue?

Currently we are thinking of moving GovernanceAggregateOperations into
governance.registry.extensions.

Best regards,
-- 
Maheshakya Wijewardena,
Software Engineering Intern.


[Dev] Aggregate operations support for life cycles

2014-04-17 Thread Maheshakya Wijewardena
Hi,
This is a basic description of the proposed design of the new feature in
Governance Registry (mentioned in the subject).
A separate API will be provided in the Governance API to create a batch and
perform other batch operations. This API contains the following components:

   1. An interface, BatchValidate: This provides a common interface for
   implementing custom validation classes according to the user requirement.
   It contains a validate method.
   2. An aggregate operations manager, which contains the following methods:

   - createBatch(String[] paths, String batchID, *validation parameters):
   This uses an OSGi service tracker to determine the appropriate OSGi service
   created with the implementation of the above interface to validate the
   batch of resources.
   - invokeCheckItem(String batchID, String action, Map<String, String>
   parameterMap): This will validate the batch using the same method in order
   to ensure that the properties of the batch resources have not been changed.
   Then it will perform the operation. If any of the resources fail during the
   process, the entire batch operation will fail.
   - invokeStateTransition(String batchID, String action): This acts
   similarly to the above method.

Custom validations will be in Registry extensions. They will implement the
above mentioned interface and will be created as OSGi services. Some of the
validation criteria can be as follows:

   - LC state should be same
   - Checklist item values should be same, etc.

Service tracker will select the corresponding validator.
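
The overall shape of this design can be sketched in a few lines. The sketch below is in Python purely for brevity and is not the Governance Registry API: the class and method names mirror the description above, a plain dictionary stands in for the OSGi service lookup, resources are simple dictionaries, and rollback on partial failure is left out.

```python
from abc import ABC, abstractmethod


class BatchValidate(ABC):
    """Common interface for custom batch validators."""

    @abstractmethod
    def validate(self, resources):
        """Return True if the batch of resources is consistent."""


class SameLifecycleState(BatchValidate):
    """Passes when every resource is in the same lifecycle state."""

    def validate(self, resources):
        return len({r["state"] for r in resources}) <= 1


class AggregateOperationsManager:
    def __init__(self, validators):
        self._validators = validators  # stand-in for an OSGi service lookup
        self._batches = {}

    def create_batch(self, resources, batch_id, validator_name):
        validator = self._validators[validator_name]
        if not validator.validate(resources):
            raise ValueError(f"batch {batch_id!r} failed validation")
        self._batches[batch_id] = (resources, validator)

    def invoke_state_transition(self, batch_id, new_state):
        resources, validator = self._batches[batch_id]
        # Re-validate so that changes made since batch creation are detected.
        if not validator.validate(resources):
            raise ValueError(f"batch {batch_id!r} is no longer consistent")
        for resource in resources:
            resource["state"] = new_state
```

Because the manager only depends on the BatchValidate interface, concrete validators can live in another module and be looked up by name at run time.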

There is a slight issue regarding life cycles which involve transition
UIs. These are operated at the UI level for individual resources, so the
concept is not applicable when a batch is considered. Hence, the temporary
solution for this issue is to fail the validation if the life cycle
involves transition UIs.

Note: This is a tentative plan.

Best regards,
-- 
Maheshakya Wijewardena,
Software Engineering Intern.


[Dev] Feature: Supporting aggregate operations in lifecycles

2014-03-28 Thread Maheshakya Wijewardena
Hi,

Currently, WSO2 Governance Registry supports life cycle operations only for
individual resources. This feature will enable it to support them for a batch
of resources. Following is a concise description of the use cases for
aggregate operations in life cycles. To create a batch of resources, users
will have to tick the check box for each resource (to be implemented in
the UI).

   1. Upper bound on the number of resources in a batch: there is a
   convenient maximum number of selections a user can make.
   2. Select/deselect resources during multiple operations.
   3. After selecting resources for a batch, get the possible operations on the
   entire batch.
   4. If an operation cannot be performed, provide an informative
   message.
   5. Roll back operations on a batch (allow deselecting during rollback?).
   6. Aggregate deletions of resources.

Suggestions for improvements are welcome.

Best regards,

-- 
Maheshakya Wijewardena,
Software Engineering Intern.