Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-11-23 Thread Mahesh Dananjaya
I think this is for StreamingRegression, right? You have to check the
query. A simple mistake such as a stray space can cause this. As
I remember, this comes up at execution time, right? First check in your local Maven
repository that the relevant class is there. Also, can you explain how you
are going to run this? With CEP? I will check.
BR,
Mahesh.
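
The check suggested above can be scripted. The following is a minimal sketch, not part of the original extension: it pulls every dotted class name out of a task query string and tries to load each one, which would expose both a stray space inside a class name and a class missing from the classpath. The class name QueryClassCheck, its methods, and the sample query are hypothetical; only JDK calls are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QueryClassCheck {

    // Matches dotted names such as org.example.pkg.ClassName (at least one dot).
    private static final Pattern FQCN =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)+");

    /** Returns every dotted identifier found in the query, in order of appearance. */
    public static List<String> classNamesIn(String query) {
        List<String> names = new ArrayList<>();
        Matcher m = FQCN.matcher(query);
        while (m.find()) {
            names.add(m.group());
        }
        return names;
    }

    /** True if the current thread's context class loader can load the class. */
    public static boolean isLoadable(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            // 'false' avoids running static initializers; we only test visibility.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical query in the same shape as the task query in this thread.
        String query = "java.lang.String -f 1000 -s (java.util.ArrayList -A 5 ) "
                + "-l (org.example.missing.Learner -p 1)";
        for (String name : classNamesIn(query)) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Running this inside the same runtime that executes the extension matters, because a class that loads from a standalone main method may still be invisible inside the server.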

On Thu, Nov 24, 2016 at 12:11 PM, Jayan Vidanapathirana <jay...@wso2.com>
wrote:

> Hi Ayya,
>
> This is the exception,
>
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
> Exception in thread "Thread-1" 
> org.wso2.siddhi.core.exception.ExecutionPlanRuntimeException:
> Fail to initialize the task :
> at org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.
> utils.regression.StreamingRegressionTaskBuilder.initRegressionTask(
> StreamingRegressionTaskBuilder.java:91)
> at org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.
> utils.regression.StreamingRegressionTaskBuilder.initTask(
> StreamingRegressionTaskBuilder.java:64)
> at org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.
> utils.regression.StreamingRegression.run(StreamingRegression.java:57)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Problem creating instance of class:
> org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.utils.regression.
> StreamingRegressionTask
> at com.github.javacliparser.ClassOption.cliStringToObject(
> ClassOption.java:143)
> at org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.
> utils.regression.StreamingRegressionTaskBuilder.initRegressionTask(
> StreamingRegressionTaskBuilder.java:89)
> ... 3 more
> Caused by: java.lang.IllegalArgumentException: Problems with option:
> trainStream
> at com.github.javacliparser.ClassOption.setValueViaCLIString(
> ClassOption.java:64)
> at com.github.javacliparser.AbstractOption.resetToDefault(
> AbstractOption.java:90)
> at com.github.javacliparser.AbstractClassOption.<init>(
> AbstractClassOption.java:84)
> at com.github.javacliparser.AbstractClassOption.<init>(
> AbstractClassOption.java:63)
> at com.github.javacliparser.ClassOption.<init>(
> ClassOption.java:38)
> at org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.
> utils.ProcessTask.<init>(ProcessTask.java:67)
> at org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.
> utils.regression.StreamingRegressionTask.<init>
> (StreamingRegressionTask.java:32)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at java.lang.Class.newInstance(Class.java:442)
> at com.github.javacliparser.ClassOption.cliStringToObject(
> ClassOption.java:141)
> ... 4 more
> Caused by: java.lang.Exception: Class not found: org.wso2.carbon.ml.siddhi.
> extension.streamingml.samoa.utils.regression.StreamingRegressionStream
> at com.github.javacliparser.ClassOption.cliStringToObject(
> ClassOption.java:136)
> at com.github.javacliparser.ClassOption.setValueViaCLIString(
> ClassOption.java:61)
> ... 16 more
>
> 
> 
> 
> TaskBuilder query
>
> query = "org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.utils.regression.StreamingRegressionTask"
>     + " -f " + batchSize + " -i " + maxInstances
>     + " -s (org.wso2.carbon.ml.siddhi.extension.streamingml.samoa.utils.regression.StreamingRegressionStream"
>     + " -A " + numberOfAttributes + " ) "
>     + "-l (org.apache.samoa.learners.classifiers.rules.HorizontalAMRulesRegressor "
>     + "-r 9 -p " + parallelism + ")";
>
>
>
> On Wed, Nov 23, 2016 at 2:40 PM, Jayan Vidanapathirana <jay...@wso2.com>
> wrote:
>
>> Hi ayya,
>>
>> I got the same issue you got here. I need some help solving this if you have
>> free time.
>>
>>
>> -- Forwarded message --
>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>> Date: Fri, Aug 5, 2016 at 12:20 PM
>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic
>> with online data for WSO2 Machine Learner-Samoa Integration
>


2016-08-10 Thread Mahesh Dananjaya
   at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:151)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:159)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:164)
at
com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:206)
at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:169)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:505)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
at
org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
at
org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1988)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1979)
at
org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
at
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at
org.gsoc.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearRegression.<init>(StreamingLinearRegression.java:60)
at
org.gsoc.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor.init(StreamingLinearRegressionStreamProcessor.java:83)
at
org.wso2.siddhi.core.query.processor.stream.AbstractStreamProcessor.initProcessor(AbstractStreamProcessor.java:65)
... 82 more


Thank you.
BR,
Mahesh.
[1]
https://github.com/dananjayamahesh/streaming/blob/master/src/main/java/org/gsoc/carbon/ml/siddhi/extension/streaming/algorithm/StreamingLinearRegression.java


On Mon, Aug 8, 2016 at 4:00 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi Mahesh,
>
> Couple of issues I noticed:
>
> - Your siddhi-extension is using the Spark 1.6.1 dependency, but the
>   logs say the running Spark version is 1.4.1. You can see the following line
>   in the logs.
>
> *TID: [-1234] [] [2016-08-08 13:05:11,699]  INFO
> {org.apache.spark.SparkContext} -  Running Spark version 1.4.1 *
>
>
> - In each of the extensions, you are starting a Spark context and a
>   new Spark application. But Spark only allows one Spark context
>   per JVM.
>
> Can you fix those and check whether the issue still exists? Also, please
> use a fresh CEP pack when testing. For now, you can avoid the second issue
> by creating only one execution plan and calling only one algorithm at a
> time.
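
One way to respect the one-context-per-JVM limit without restricting a deployment to a single execution plan is to let every extension obtain the context from a shared, lazily initialized holder. The sketch below is an illustration only: SharedContext is a hypothetical helper, and the factory passed to it stands in for whatever builds the real JavaSparkContext.

```java
import java.util.function.Supplier;

public final class SharedContext<T> {

    private final Supplier<T> factory;
    private T instance;
    private int created;

    public SharedContext(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the single shared instance, creating it on first use. */
    public synchronized T get() {
        if (instance == null) {
            instance = factory.get();
            created++;
        }
        return instance;
    }

    /** Number of times the factory actually produced an instance. */
    public synchronized int timesCreated() {
        return created;
    }

    public static void main(String[] args) {
        // A plain Object stands in for the expensive context here.
        SharedContext<Object> holder = new SharedContext<>(Object::new);
        Object first = holder.get();
        Object second = holder.get();
        System.out.println("same instance: " + (first == second));
        System.out.println("created: " + holder.timesCreated());
    }
}
```

In an extension, a single static holder would be created once, and each stream processor would call get() instead of constructing its own context with new JavaSparkContext(conf).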
>
> By the way, it would be easier for us to reproduce the issue and check what's
> happening if you could include all dependencies inside the jar itself.
> Otherwise, it's a nightmare to find and add the missing dependencies one by
> one.
>
> Regards,
> Supun
>
>
> On Mon, Aug 8, 2016 at 3:43 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi supun
>> This is the log file. This happens only when I use CEP to invoke the
>> extension. Thank you.
>> regards,
>> Mahesh.
>>
>> On Mon, Aug 8, 2016 at 1:32 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi supun,
>>> This is the CEP log file. Thank you.
>>> Mahesh.
>>>
>>> On Mon, Aug 8, 2016 at 11:03 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi supun,
>>>> I think I have fixed a couple of the previous errors. Now my Samoa extension
>>>> is working fine with CEP. But I am getting an exception with my Spark
>>>> streaming ML extensions. They were working fine, and I did not make any changes
>>>> to my previously developed streaming extensions. The error occurs at
>>>> these lines:
>>>>
>>>> conf = new SparkConf().setMaster("local[*]").setAppName("Linear
>>>> Regression Example").set("spark.driver.allowMultipleContexts", "true")
>>>> ;
>>>> sc = new JavaSparkContext(conf);
>>>>
>>>> This is what I get, and the CEP server crashes.
>>>> ERROR {org.apache.spark.ui.SparkUI} -  Failed to bind SparkUI
>>>> javax.servlet.UnavailableException: Servlet class
>>>> com.sun.jersey.spi.container.servlet.ServletContainer is not a
>>>> javax.servlet.Servlet
>>>> at org.spark-project.jetty.servlet.ServletHolder.checkS


2016-08-07 Thread Mahesh Dananjaya
s.CarbonStuckThreadDetectionValve.invoke(CarbonStuckThreadDetectionValve.java:159)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at
org.wso2.carbon.tomcat.ext.valves.CarbonContextCreatorValve.invoke(CarbonContextCreatorValve.java:57)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1074)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1739)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1698)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)


thank you.
Mahesh.

On Mon, Aug 8, 2016 at 10:49 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Srinath and Nirmal,
> I have integrated my Samoa predictive analysis topologies with CEP
> using the Siddhi extension I developed. As you asked, I have verified the
> output of my Samoa streaming clustering topology against the previously built
> Spark streaming clustering analysis. The results are almost the same and are
> converging. As an example, using two algorithms, Spark mini-batch
> clustering and Samoa kernel clustering, for streaming clustering analysis, the
> outcomes in my streaming extensions are as follows.
>
> Spark Streaming Clustering
> center 0: 440.873301276124,25.477847235065635,64.20812280377294,
> 1010.9613963380791,67.80943221749587
> center 1: 470.25742216416273,12.785760940561731,42.586181145220955,
> 1015.9444850860008,79.6946897452645
>
> Samoa Streaming Clustering
> Center :0: 442.06527377405365,25.51594930115884,60.35312500444376,
> 1010.624677139,61.262047085828065
> Center :1: 469.66365199261037,12.6577054812157,43.59210147812603,
> 1014.3543441369945,76.81973235658958
>
> You can see that the cluster centers are almost the same. This is with 5 attributes
> and two clusters. As long as we have a large batch size and an
> increasing number of retraining rounds, the results converge. Now I am
> trying to integrate this with carbon-ml and fix a couple of issues
> in the integration. I will also prepare the documentation. Thank you. I
> have also created a new repository [1] for the essential classes.
> regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/streaming
>
>
> On Fri, Aug 5, 2016 at 6:47 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> Mahesh,
>>
>> Your issue is not clear. What're you exactly trying ? carbon-ml is our
>> git repo and if you mean to say about the server you're running, please use
>> the term 'ML server'. Please summarize the problem and the steps you've
>> done. Please use the point form.
>>
>> Thanks.
>>
>> On Fri, Aug 5, 2016 at 6:43 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Nirmal,
>>> can those things be caused by permission requirements?
>>> BR,
>>> Mahesh.
>>>
>>> On Fri, Aug 5, 2016 at 5:39 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Nirmal,
>>>> Yes i have added samoa dependencies in carbon-ml and it is working
>>>> fine. These exceptions are coming for the my classes that are inside ml.
>>>> when i used classes in samoa like 
>>>> org.apache.samoa.tasks.ClusteringEvaluation
>>>> it is working fine and run. And also same classes that i developed are
>>>> running fine in my other extensions outside carbon-ml, which are developed
>>>> as regular extensions. i am checking this further. thank you.
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Fri, Aug 5, 2016 at 5:31 PM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Did you add samoa jars in ML?
>>>>>
>>>>> On Fri, Aug 5, 2016 at 12:20 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Supun,
>>>>>> This is the error I am getting while running the extension on the
>>>>>> carbon-ml side.
>>>>>> Please refer to link [1] for the class.
>>>>>>
>>>>>> ERROR 


2016-08-07 Thread Mahesh Dananjaya
Hi Srinath and Nirmal,
I have integrated my Samoa predictive analysis topologies with CEP
using the Siddhi extension I developed. As you asked, I have verified the
output of my Samoa streaming clustering topology against the previously built
Spark streaming clustering analysis. The results are almost the same and are
converging. As an example, using two algorithms, Spark mini-batch
clustering and Samoa kernel clustering, for streaming clustering analysis, the
outcomes in my streaming extensions are as follows.

Spark Streaming Clustering
center 0:
440.873301276124,25.477847235065635,64.20812280377294,1010.9613963380791,67.80943221749587
center 1:
470.25742216416273,12.785760940561731,42.586181145220955,1015.9444850860008,79.6946897452645

Samoa Streaming Clustering
Center :0:
442.06527377405365,25.51594930115884,60.35312500444376,1010.624677139,61.262047085828065
Center :1:
469.66365199261037,12.6577054812157,43.59210147812603,1014.3543441369945,76.81973235658958

You can see that the cluster centers are almost the same. This is with 5 attributes
and two clusters. As long as we have a large batch size and an
increasing number of retraining rounds, the results converge. Now I am
trying to integrate this with carbon-ml and fix a couple of issues
in the integration. I will also prepare the documentation. Thank you. I
have also created a new repository [1] for the essential classes.
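
To put a number on how close the two sets of centers are, the Euclidean distance between corresponding centers can be computed. The snippet below is a small sketch using the values reported above; the class name CenterDistance is hypothetical, and it assumes center 0 and center 1 correspond across the two runs, which the listed values suggest.

```java
public class CenterDistance {

    // Centers copied from the Spark result listed above.
    public static final double[][] SPARK = {
        {440.873301276124, 25.477847235065635, 64.20812280377294, 1010.9613963380791, 67.80943221749587},
        {470.25742216416273, 12.785760940561731, 42.586181145220955, 1015.9444850860008, 79.6946897452645}
    };

    // Centers copied from the Samoa result listed above.
    public static final double[][] SAMOA = {
        {442.06527377405365, 25.51594930115884, 60.35312500444376, 1010.624677139, 61.262047085828065},
        {469.66365199261037, 12.6577054812157, 43.59210147812603, 1014.3543441369945, 76.81973235658958}
    };

    /** Euclidean distance between two points of equal dimension. */
    public static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        for (int k = 0; k < SPARK.length; k++) {
            System.out.printf("center %d distance: %.3f%n", k, distance(SPARK[k], SAMOA[k]));
        }
    }
}
```

Tracking this distance across retraining rounds would give a simple convergence measure, though the attributes are on different scales, so normalizing them first may be more informative.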
regards,
Mahesh.
[1] https://github.com/dananjayamahesh/streaming


On Fri, Aug 5, 2016 at 6:47 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> Mahesh,
>
> Your issue is not clear. What're you exactly trying ? carbon-ml is our git
> repo and if you mean to say about the server you're running, please use the
> term 'ML server'. Please summarize the problem and the steps you've done.
> Please use the point form.
>
> Thanks.
>
> On Fri, Aug 5, 2016 at 6:43 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Nirmal,
>> can those things be caused by permission requirements?
>> BR,
>> Mahesh.
>>
>> On Fri, Aug 5, 2016 at 5:39 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Nirmal,
>>> Yes i have added samoa dependencies in carbon-ml and it is working fine.
>>> These exceptions are coming for the my classes that are inside ml. when i
>>> used classes in samoa like org.apache.samoa.tasks.ClusteringEvaluation
>>> it is working fine and run. And also same classes that i developed are
>>> running fine in my other extensions outside carbon-ml, which are developed
>>> as regular extensions. i am checking this further. thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Fri, Aug 5, 2016 at 5:31 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>>
>>>> Did you add samoa jars in ML?
>>>>
>>>> On Fri, Aug 5, 2016 at 12:20 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Supun,
>>>>> This is the error I am getting while running the extension on the carbon-ml
>>>>> side.
>>>>> Please refer to link [1] for the class.
>>>>>
>>>>> ERROR 
>>>>> {org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder}
>>>>> -  Fail to initialize the task
>>>>>
>>>>> java.lang.Exception: Class not found: StreamingClusteringTask
>>>>>
>>>>>at com.github.javacliparser.ClassOption.cliStringToObject(Class
>>>>> Option.java:136)
>>>>>
>>>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>>>> gClusteringTaskBuilder.initClusteringTask(StreamingClusterin
>>>>> gTaskBuilder.java:129)
>>>>>
>>>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>>>> gClusteringTaskBuilder.initTask(StreamingClusteringTaskBuild
>>>>> er.java:100)
>>>>>
>>>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>>>> gClustering.run(StreamingClustering.java:77)
>>>>>
>>>>>at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> [2016-08-04 16:23:07,437]  INFO {org.wso2.carbon.ml.siddhi.ext
>>>>> ension.streaming.samoa.StreamingClusteringTaskBuilder} -  Fail to
>>>>> initialize the taskjava.lang.Exception: Class not found:
>>>>> StreamingClusteringTask
>>>>>
>>>>> +++Please refeer link [1] for the StreamingClusteringTask.
>>>>>
>>>>> Then again for StreamingClusteringStrea


2016-08-05 Thread Mahesh Dananjaya
Hi Nirmal,
Can those exceptions be caused by permission requirements?
BR,
Mahesh.

On Fri, Aug 5, 2016 at 5:39 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Nirmal,
> Yes i have added samoa dependencies in carbon-ml and it is working fine.
> These exceptions are coming for the my classes that are inside ml. when i
> used classes in samoa like org.apache.samoa.tasks.ClusteringEvaluation it
> is working fine and run. And also same classes that i developed are running
> fine in my other extensions outside carbon-ml, which are developed as
> regular extensions. i am checking this further. thank you.
> regards,
> Mahesh.
>
> On Fri, Aug 5, 2016 at 5:31 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> Did you add samoa jars in ML?
>>
>> On Fri, Aug 5, 2016 at 12:20 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Supun,
>>> This is the error I am getting while running the extension on the carbon-ml
>>> side.
>>> Please refer to link [1] for the class.
>>>
>>> ERROR 
>>> {org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder}
>>> -  Fail to initialize the task
>>>
>>> java.lang.Exception: Class not found: StreamingClusteringTask
>>>
>>>at com.github.javacliparser.ClassOption.cliStringToObject(Class
>>> Option.java:136)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClusteringTaskBuilder.initClusteringTask(StreamingClusterin
>>> gTaskBuilder.java:129)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClusteringTaskBuilder.initTask(StreamingClusteringTaskBuilder.java:100)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClustering.run(StreamingClustering.java:77)
>>>
>>>at java.lang.Thread.run(Thread.java:745)
>>>
>>> [2016-08-04 16:23:07,437]  INFO {org.wso2.carbon.ml.siddhi.ext
>>> ension.streaming.samoa.StreamingClusteringTaskBuilder} -  Fail to
>>> initialize the taskjava.lang.Exception: Class not found:
>>> StreamingClusteringTask
>>>
>>> +++Please refeer link [1] for the StreamingClusteringTask.
>>>
>>> Then again for StreamingClusteringStream class while i bypass the String
>>> query in the initTask(). please refer link [2] for the class.
>>>
>>> Exception in thread "Thread-60" java.lang.IllegalArgumentException:
>>> Problems with option: streamTrain
>>>
>>>at com.github.javacliparser.ClassOption.setValueViaCLIString(Cl
>>> assOption.java:64)
>>>
>>>at com.github.javacliparser.AbstractOption.resetToDefault(Abstr
>>> actOption.java:90)
>>>
>>>at com.github.javacliparser.AbstractClassOption.(Abstract
>>> ClassOption.java:84)
>>>
>>>at com.github.javacliparser.AbstractClassOption.(Abstract
>>> ClassOption.java:63)
>>>
>>>at com.github.javacliparser.ClassOption.(ClassOption.java:38)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClusteringTask.(StreamingClusteringTask.java:52)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClusteringTaskBuilder.initClusteringTask(StreamingClusterin
>>> gTaskBuilder.java:140)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClusteringTaskBuilder.initTask(StreamingClusteringTaskBuilder.java:103)
>>>
>>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClustering.run(StreamingClustering.java:77)
>>>
>>>at java.lang.Thread.run(Thread.java:745)
>>>
>>> Caused by: java.lang.Exception: Class not found:
>>> org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>>> gClusteringStream
>>>
>>>at com.github.javacliparser.ClassOption.cliStringToObject(Class
>>> Option.java:136)
>>>
>>>at com.github.javacliparser.ClassOption.setValueViaCLIString(Cl
>>> assOption.java:61)
>>>
>>>... 9 more
>>>
>>>
>>> Those Class Not Found Exceptions are at runtime. Do i have do anything
>>> on the carbon-ml side for this.My samoa cores can be found in [3]. Those
>>> are working fine and no Class Not Found Exceptions arrive when running
>>> outside the carbon-ml.
>>>
>>>
>>> thank you.
>>>
>>


2016-08-05 Thread Mahesh Dananjaya
Hi Nirmal,
Yes, I have added the Samoa dependencies in carbon-ml and that part is working fine.
These exceptions occur only for my own classes that are inside ML. When I
use classes shipped with Samoa, such as org.apache.samoa.tasks.ClusteringEvaluation, it
works and runs fine. The same classes that I developed also run
fine in my other extensions outside carbon-ml, which are developed as
regular extensions. I am checking this further. Thank you.
regards,
Mahesh.

On Fri, Aug 5, 2016 at 5:31 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> Did you add samoa jars in ML?
>
> On Fri, Aug 5, 2016 at 12:20 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Supun,
>> This is the error I am getting while running the extension on the carbon-ml
>> side.
>> Please refer to link [1] for the class.
>>
>> ERROR 
>> {org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder}
>> -  Fail to initialize the task
>>
>> java.lang.Exception: Class not found: StreamingClusteringTask
>>
>>at com.github.javacliparser.ClassOption.cliStringToObject(Class
>> Option.java:136)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClusteringTaskBuilder.initClusteringTask(StreamingCl
>> usteringTaskBuilder.java:129)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClusteringTaskBuilder.initTask(StreamingClusteringTaskBuilder.java:100)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClustering.run(StreamingClustering.java:77)
>>
>>at java.lang.Thread.run(Thread.java:745)
>>
>> [2016-08-04 16:23:07,437]  INFO {org.wso2.carbon.ml.siddhi.ext
>> ension.streaming.samoa.StreamingClusteringTaskBuilder} -  Fail to
>> initialize the taskjava.lang.Exception: Class not found:
>> StreamingClusteringTask
>>
>> +++Please refeer link [1] for the StreamingClusteringTask.
>>
>> Then again for StreamingClusteringStream class while i bypass the String
>> query in the initTask(). please refer link [2] for the class.
>>
>> Exception in thread "Thread-60" java.lang.IllegalArgumentException:
>> Problems with option: streamTrain
>>
>>at com.github.javacliparser.ClassOption.setValueViaCLIString(Cl
>> assOption.java:64)
>>
>>at com.github.javacliparser.AbstractOption.resetToDefault(Abstr
>> actOption.java:90)
>>
>>at com.github.javacliparser.AbstractClassOption.(Abstract
>> ClassOption.java:84)
>>
>>at com.github.javacliparser.AbstractClassOption.(Abstract
>> ClassOption.java:63)
>>
>>at com.github.javacliparser.ClassOption.(ClassOption.java:38)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClusteringTask.(StreamingClusteringTask.java:52)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClusteringTaskBuilder.initClusteringTask(StreamingCl
>> usteringTaskBuilder.java:140)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClusteringTaskBuilder.initTask(StreamingClusteringTaskBuilder.java:103)
>>
>>at org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClustering.run(StreamingClustering.java:77)
>>
>>at java.lang.Thread.run(Thread.java:745)
>>
>> Caused by: java.lang.Exception: Class not found:
>> org.wso2.carbon.ml.siddhi.extension.streaming.samoa.Streamin
>> gClusteringStream
>>
>>at com.github.javacliparser.ClassOption.cliStringToObject(Class
>> Option.java:136)
>>
>>at com.github.javacliparser.ClassOption.setValueViaCLIString(Cl
>> assOption.java:61)
>>
>>... 9 more
>>
>>
>> Those Class Not Found Exceptions are at runtime. Do i have do anything on
>> the carbon-ml side for this.My samoa cores can be found in [3]. Those are
>> working fine and no Class Not Found Exceptions arrive when running outside
>> the carbon-ml.
>>
>>
>> thank you.
>>
>> regards,
>>
>> Mahesh.
>>
>>
>>
>> [1] https://github.com/dananjayamahesh/carbon-ml/blob/wso2_gsoc_
>> ml6_cml/components/extensions/org.wso2.carbon.ml.siddhi.exte
>> nsion/src/main/java/org/wso2/carbon/ml/siddhi/extension/
>> streaming/samoa/StreamingClusteringTask.java
>>
>> [2]  https://github.com/dananjayamahesh/carbon-ml/blob/wso2_gsoc_
>> ml6_cml/components/extensions/org.wso2.carbon.ml.siddhi.exte
>> nsion/src/main/java/org/wso2/carbon/ml/siddhi/extension/
>> streaming/samoa/StreamingClusteringTask.java
>>


2016-08-05 Thread Mahesh Dananjaya
Hi Supun,
This is the error I am getting while running the extension on the carbon-ml side.
Please refer to link [1] for the class.

ERROR
{org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder}
-  Fail to initialize the task

java.lang.Exception: Class not found: StreamingClusteringTask

   at
com.github.javacliparser.ClassOption.cliStringToObject(ClassOption.java:136)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder.initClusteringTask(StreamingClusteringTaskBuilder.java:129)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder.initTask(StreamingClusteringTaskBuilder.java:100)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClustering.run(StreamingClustering.java:77)

   at java.lang.Thread.run(Thread.java:745)

[2016-08-04 16:23:07,437]  INFO
{org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder}
-  Fail to initialize the taskjava.lang.Exception: Class not found:
StreamingClusteringTask

Please refer to link [1] for the StreamingClusteringTask.

Then it fails again for the StreamingClusteringStream class when I bypass the String
query in initTask(). Please refer to link [2] for the class.

Exception in thread "Thread-60" java.lang.IllegalArgumentException:
Problems with option: streamTrain

   at
com.github.javacliparser.ClassOption.setValueViaCLIString(ClassOption.java:64)

   at
com.github.javacliparser.AbstractOption.resetToDefault(AbstractOption.java:90)

   at
com.github.javacliparser.AbstractClassOption.<init>(AbstractClassOption.java:84)

   at
com.github.javacliparser.AbstractClassOption.<init>(AbstractClassOption.java:63)

   at com.github.javacliparser.ClassOption.<init>(ClassOption.java:38)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTask.<init>(StreamingClusteringTask.java:52)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder.initClusteringTask(StreamingClusteringTaskBuilder.java:140)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringTaskBuilder.initTask(StreamingClusteringTaskBuilder.java:103)

   at
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClustering.run(StreamingClustering.java:77)

   at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.Exception: Class not found:
org.wso2.carbon.ml.siddhi.extension.streaming.samoa.StreamingClusteringStream

   at
com.github.javacliparser.ClassOption.cliStringToObject(ClassOption.java:136)

   at
com.github.javacliparser.ClassOption.setValueViaCLIString(ClassOption.java:61)

   ... 9 more


These class-not-found exceptions occur at runtime. Do I have to do anything on
the carbon-ml side for this? My Samoa core classes can be found in [3]. They
work fine, and no class-not-found exceptions occur, when running outside
carbon-ml.
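
The thread does not establish the cause. One hypothesis consistent with the symptoms (Samoa's own classes load, the extension's classes do not, and everything works outside carbon-ml) is a class loader visibility difference inside the server: the code resolving the class name may be using a loader that cannot see the extension's classes. This is an assumption, not a confirmed diagnosis. A minimal sketch to test it, using only JDK calls and a hypothetical helper named LoaderVisibility, would be:

```java
public class LoaderVisibility {

    /** Tries to load className through the given loader without initializing it. */
    public static boolean visibleTo(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Reports whether the class is visible to the thread context loader and to
     * the loader that loaded the given anchor class.
     */
    public static String report(String className, Class<?> anchor) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader own = anchor.getClassLoader();
        return className
                + " contextLoader=" + visibleTo(context, className)
                + " anchorLoader=" + visibleTo(own, className);
    }

    public static void main(String[] args) {
        System.out.println(report("java.util.ArrayList", LoaderVisibility.class));
        // Hypothetical name standing in for a class that is not on the classpath.
        System.out.println(report("org.example.NoSuchTask", LoaderVisibility.class));
    }
}
```

Calling report from inside the extension, with the task builder class as the anchor and the failing class name as the argument, would show whether the two loaders disagree.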


thank you.

regards,

Mahesh.



[1]
https://github.com/dananjayamahesh/carbon-ml/blob/wso2_gsoc_ml6_cml/components/extensions/org.wso2.carbon.ml.siddhi.extension/src/main/java/org/wso2/carbon/ml/siddhi/extension/streaming/samoa/StreamingClusteringTask.java

[2]
https://github.com/dananjayamahesh/carbon-ml/blob/wso2_gsoc_ml6_cml/components/extensions/org.wso2.carbon.ml.siddhi.extension/src/main/java/org/wso2/carbon/ml/siddhi/extension/streaming/samoa/StreamingClusteringTask.java

[3]
https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml/components/extensions/org.wso2.carbon.ml.siddhi.extension/src/main/java/org/wso2/carbon/ml/siddhi/extension/streaming/samoa

On Thu, Aug 4, 2016 at 12:30 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi Mahesh,
>
> samoa dependency version in siddhi-extension should be
> *0.4.0-incubating-SNAPSHOT*. That should solve the issue
>
> Regards,
> Supun
>
> On Thu, Aug 4, 2016 at 11:50 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi all,
>> samoa is in my local repository and dependencies works fine with all
>> other extensions that add samoa 0.4.0-incubator dependencies. But only when
>> i build carbon-ml, it gives priority for the remote repository for looking
>> samoa. SO any help with this to give priority for local m2 repo before
>> carbon-ml building is looking for the remote one. I am getting the error
>> because of this. maven option -U also not seems to be working here. any
>> help please.
>>
>> [ERROR] Failed to execute goal on project 
>> org.wso2.carbon.ml.siddhi.extension:
>> Could not resolve dependencies for project org.wso2.carbon.ml:org.wso2.
>> carbon.ml.siddhi.extension:bundle:1.1.2-SNAPSHOT: The following
>> artifacts could not be resolved: 
>> org.apache.samoa:samoa-api:jar:0.4.0-incubating,
>> org.apache.samoa:samoa-local:jar:0.4.0-incubating: Could not find
>> artifact org.apache.samoa:samoa-api:jar:0.4.0-incubating in wso2-nexus (


2016-08-04 Thread Mahesh Dananjaya
Hi all,
Samoa is in my local repository, and the dependency works fine with all other
extensions that add the Samoa 0.4.0-incubating dependencies. Only when I
build carbon-ml does Maven give priority to the remote repository when looking for
Samoa. Is there any way to make the carbon-ml build check the local m2 repo before
the remote one? I am getting the error below
because of this. The Maven -U option also does not seem to help here. Any
help please.

[ERROR] Failed to execute goal on project
org.wso2.carbon.ml.siddhi.extension: Could not resolve dependencies for
project 
org.wso2.carbon.ml:org.wso2.carbon.ml.siddhi.extension:bundle:1.1.2-SNAPSHOT:
The following artifacts could not be resolved:
org.apache.samoa:samoa-api:jar:0.4.0-incubating,
org.apache.samoa:samoa-local:jar:0.4.0-incubating: Could not find artifact
org.apache.samoa:samoa-api:jar:0.4.0-incubating in wso2-nexus (
http://maven.wso2.org/nexus/content/groups/wso2-public/)

Since SAMOA will be used for future work, is it possible to add it to the
relevant WSO2 repo? There is still no Maven repo that hosts SAMOA
0.4.0-incubating, only 0.3.0, and we cannot continue our work with 0.3.0
since it is outdated.

Thank you.
Regards,
Mahesh.
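
A quick way to see which SAMOA version the local build actually installed
is to look for the jars under the local repository. The error above asks
for 0.4.0-incubating, while a reply in this thread suggests the dependency
version should be 0.4.0-incubating-SNAPSHOT. The sketch below only builds
the expected jar paths and checks whether they exist. It assumes the
default ~/.m2/repository location (a custom settings.xml may move it), and
the class and method names are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper, not part of carbon-ml or SAMOA.
final class LocalRepoCheck {

    /** Relative path of an artifact's jar in the standard Maven repository layout. */
    static String jarPath(String groupId, String artifactId, String version) {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version
                + "/" + artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        // Assumption: the local repository is in its default location.
        Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        String[] artifacts = {"samoa-api", "samoa-local"};
        String[] versions = {"0.4.0-incubating", "0.4.0-incubating-SNAPSHOT"};
        for (String artifact : artifacts) {
            for (String version : versions) {
                Path jar = repo.resolve(jarPath("org.apache.samoa", artifact, version));
                System.out.println((Files.exists(jar) ? "found   " : "missing ") + jar);
            }
        }
    }
}
```

If only the -SNAPSHOT jars are reported as found, the version declared in
the carbon-ml pom and the version installed locally do not match, which
would explain why Maven goes on to ask wso2-nexus for the artifact.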

On Wed, Aug 3, 2016 at 4:29 PM, Miyuru Dayarathna <miyu...@wso2.com> wrote:

> Adding Jayan to this email thread.
>
> --
> Thanks,
> Miyuru Dayarathna
> Senior Technical Lead
> Mobile: +94713527783
> Blog: http://miyurublog.blogspot.com
>
> On Wed, Aug 3, 2016 at 3:11 PM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> you should build carbon-ml *without *-U option. -U means you force mvn
>> to look for updates in remote repo. Rather run it with -o option. Also, can
>> you double check whether the dependencies are defined correctly (group
>> Id's, versions etc).
>>
>> Alternatively, it seems there is a samoa released version in mvn repo.
>> Maybe you could try that one as well. But that's v0.3.0..
>>
>> [1] https://mvnrepository.com/artifact/org.apache.samoa
>>
>> On Wed, Aug 3, 2016 at 2:54 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Supun,
>>> I just neeed a little help. I am in the process of integrating my samoa
>>> core functions and extension into carbon-ml's siddhi extension. For samoa i
>>> am using locally built samoa project to provide samoa 0.4.0
>>> dependencies,since we dont have it in the maven repo or else where. But
>>> when i build carbon-ml by adding samoa dependencies, it seems to be maven
>>> search for remote location, not the local maven repo first. I am running
>>> maven with -U option. But still the problems occurs. Is there any specific
>>> thing in carbon-ml like settings to search remote before local one? I just
>>> need to give local maven repo for the dependency. My extension seperately
>>> working fine, so there is no problem wihat the local dependencies outside
>>> carbon-ml. So can you please help me with this.thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Fri, Jul 22, 2016 at 3:17 PM, Srinath Perera <srin...@wso2.com>
>>> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> On Thu, Jul 21, 2016 at 2:10 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>> I am onto connecting cep streams with samoa streams to data analysis
>>>>> using samoa framework. To connect samoa with cep siddhi event streams what
>>>>> i we can do is that try to convert cep streams into samoa streams or else
>>>>> writing wrpper for samoa for cep  streasm to be used. In both cases i have
>>>>> to covert siddhi cep streasm into samoa streams. Samoa is using MOA to
>>>>> analyse data. Moo contains ML framework to analyse stream data. Samoa is
>>>>> wrapping MOA withsome of its classes.
>>>>>
>>>>> Samoa streams is based on MOA, Instance and InstanceStreams. Samoa see
>>>>> streams as a stream of instances [1]. So if we are going to convert cep
>>>>> events into samoa instances , it will take time. But if we have some
>>>>> similarity between cep siddhi streams and samoa streasm we can reduce the
>>>>> time.
>>>>> 1. What is the underlying infrastructure for cep siddhi streasm.?
>>>>> 2. Are there anything as Instances or InstanceStreams kind of
>>>>> implmentation underlying cep streams?
>>>>> 3. How can i get more underestanding on CEP siddhi streams.
>>>>>
>>>>> On the other hand i can us

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-08-03 Thread Mahesh Dananjaya
Hi Supun,
We have been using v0.4.0 for the CEP integration, since the last released
version is pretty outdated and lacks streaming algorithms. So I am now
using my locally built v0.4.0 for the project. The dependencies work fine:
they are already used in the extensions that we develop outside carbon-ml
as regular extensions. The same error appeared when I was not using -U. It
seems that Maven searches the remote repo rather than the local one. This
is the error I got.

[ERROR] Failed to execute goal on project
org.wso2.carbon.ml.siddhi.extension: Could not resolve dependencies for
project 
org.wso2.carbon.ml:org.wso2.carbon.ml.siddhi.extension:bundle:1.1.2-SNAPSHOT:
The following artifacts could not be resolved:
org.apache.samoa:samoa-api:jar:0.4.0-incubating,
org.apache.samoa:samoa-local:jar:0.4.0-incubating: Could not find artifact
org.apache.samoa:samoa-api:jar:0.4.0-incubating in wso2-nexus (
http://maven.wso2.org/nexus/content/groups/wso2-public/)

It seems that Maven prefers the remote repo over the local one for SAMOA.
This happens only when I build carbon-ml with the local SAMOA dependencies
added. Other extensions that use SAMOA, which are outside carbon-ml, work
fine.
Thank you.
BR,
Mahesh.

On Wed, Aug 3, 2016 at 3:11 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi Mahesh,
>
> you should build carbon-ml *without *-U option. -U means you force mvn to
> look for updates in remote repo. Rather run it with -o option. Also, can
> you double check whether the dependencies are defined correctly (group
> Id's, versions etc).
>
> Alternatively, it seems there is a samoa released version in mvn repo.
> Maybe you could try that one as well. But that's v0.3.0..
>
> [1] https://mvnrepository.com/artifact/org.apache.samoa
>
> On Wed, Aug 3, 2016 at 2:54 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Supun,
>> I just neeed a little help. I am in the process of integrating my samoa
>> core functions and extension into carbon-ml's siddhi extension. For samoa i
>> am using locally built samoa project to provide samoa 0.4.0
>> dependencies,since we dont have it in the maven repo or else where. But
>> when i build carbon-ml by adding samoa dependencies, it seems to be maven
>> search for remote location, not the local maven repo first. I am running
>> maven with -U option. But still the problems occurs. Is there any specific
>> thing in carbon-ml like settings to search remote before local one? I just
>> need to give local maven repo for the dependency. My extension seperately
>> working fine, so there is no problem wihat the local dependencies outside
>> carbon-ml. So can you please help me with this.thank you.
>> regards,
>> Mahesh.
>>
>> On Fri, Jul 22, 2016 at 3:17 PM, Srinath Perera <srin...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> On Thu, Jul 21, 2016 at 2:10 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>> I am onto connecting cep streams with samoa streams to data analysis
>>>> using samoa framework. To connect samoa with cep siddhi event streams what
>>>> i we can do is that try to convert cep streams into samoa streams or else
>>>> writing wrpper for samoa for cep  streasm to be used. In both cases i have
>>>> to covert siddhi cep streasm into samoa streams. Samoa is using MOA to
>>>> analyse data. Moo contains ML framework to analyse stream data. Samoa is
>>>> wrapping MOA withsome of its classes.
>>>>
>>>> Samoa streams is based on MOA, Instance and InstanceStreams. Samoa see
>>>> streams as a stream of instances [1]. So if we are going to convert cep
>>>> events into samoa instances , it will take time. But if we have some
>>>> similarity between cep siddhi streams and samoa streasm we can reduce the
>>>> time.
>>>> 1. What is the underlying infrastructure for cep siddhi streasm.?
>>>> 2. Are there anything as Instances or InstanceStreams kind of
>>>> implmentation underlying cep streams?
>>>> 3. How can i get more underestanding on CEP siddhi streams.
>>>>
>>>> On the other hand i can use my cep siddhi extension and put those
>>>> events into event queue and convert them into samoa instances and feed them
>>>> into samoa streaming ml topologies.
>>>>
>>> I think this is OK. I assume this is much easier. Let's do this and
>>> check the performance.
>>>
>>>
>>>> There is another option. In Samoa what they are basically doing is that
>>>> wrapping MOA ML framework and write some classes for build streaming ml

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-08-03 Thread Mahesh Dananjaya
Hi Supun,
I just need a little help. I am integrating my SAMOA core functions and
extension into carbon-ml's Siddhi extension. For SAMOA I am using a locally
built SAMOA project to provide the 0.4.0 dependencies, since they are not
available in the Maven repo or elsewhere. But when I build carbon-ml with
the SAMOA dependencies added, Maven seems to search the remote location
rather than the local Maven repo first. I am running Maven with the -U
option, but the problem still occurs. Is there any setting in carbon-ml
that makes it search the remote repo before the local one? I just need to
point the dependency at the local Maven repo. My extension works fine on
its own, so there is no problem with the local dependencies outside
carbon-ml. Can you please help me with this? Thank you.
Regards,
Mahesh.

On Fri, Jul 22, 2016 at 3:17 PM, Srinath Perera <srin...@wso2.com> wrote:

> Hi Mahesh,
>
> On Thu, Jul 21, 2016 at 2:10 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi All,
>> I am onto connecting cep streams with samoa streams to data analysis
>> using samoa framework. To connect samoa with cep siddhi event streams what
>> i we can do is that try to convert cep streams into samoa streams or else
>> writing wrpper for samoa for cep  streasm to be used. In both cases i have
>> to covert siddhi cep streasm into samoa streams. Samoa is using MOA to
>> analyse data. Moo contains ML framework to analyse stream data. Samoa is
>> wrapping MOA withsome of its classes.
>>
>> Samoa streams is based on MOA, Instance and InstanceStreams. Samoa see
>> streams as a stream of instances [1]. So if we are going to convert cep
>> events into samoa instances , it will take time. But if we have some
>> similarity between cep siddhi streams and samoa streasm we can reduce the
>> time.
>> 1. What is the underlying infrastructure for cep siddhi streasm.?
>> 2. Are there anything as Instances or InstanceStreams kind of
>> implmentation underlying cep streams?
>> 3. How can i get more underestanding on CEP siddhi streams.
>>
>> On the other hand i can use my cep siddhi extension and put those events
>> into event queue and convert them into samoa instances and feed them into
>> samoa streaming ml topologies.
>>
> I think this is OK. I assume this is much easier. Let's do this and check
> the performance.
>
>
>> There is another option. In Samoa what they are basically doing is that
>> wrapping MOA ML framework and write some classes for build streaming ml
>> topologies. So as the other option i can wrap samoa moa with my design and
>> use moa ml framework directly. (No need for Samoa extension). I have
>> building some topologies to streaming data analysis [2]. Main problem is
>> that lack of documentation. Anyway i had go through their whole samoa
>> design.thank you.
>>
>
> If we use MOA directly, would we loose the distributed support in SAOMA.
> Let's do a call when you can, so we can dsicuss this in detail.
>
> --Srinath
>
>
>
>> regards,
>> Mahesh.
>>
>> [1]
>> https://github.com/apache/incubator-samoa/blob/master/samoa-api/src/main/java/org/apache/samoa/streams/clustering/ClusteringStream.java
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
>>
>> On Mon, Jul 18, 2016 at 11:40 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Samoa modules built as topologies that connect streams with the internal
>>> processors. I have already written some examples to test the ML algorithms
>>> and samoa analysis topologies. What we need to done is mostly developing a
>>> wrapper around samoa topologies to connect their input and output streams
>>> with our cep streams. So i am currently going through their stream
>>> architecture to connect our streams with their streams. Couple of examples
>>> exapaining samoa ml topologies and streaming can be found in my git hub
>>> repo [1]. Samoa using MOA ml algorithms by wrapping them with their
>>> classes. Initailly i am trying to develop a KMeansClustering analysis with
>>> cep streams with samoa ml topologies.
>>> And also i could not find a maven repo for samoa 0.4.0 incubating. So i
>>> am currently using my local m2 repo's samoa 0..4.0 incubating for my
>>> dependencies to work. The local one is built by original samoa source.thank
>>> you.
>>>
>>> regards,
>>> Mahesh.
>>>
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streamin

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-31 Thread Mahesh Dananjaya
Hi Maheshakya,
I have a small problem. As we have done all along, I built a SAMOA
streaming extension for CEP with the SAMOA dependencies and put the jar
into the cep_home/repository/component/lib/ folder. Then I started CEP and
tried to create the execution plan using my extension. I am remotely
debugging the SAMOA streaming extension, which is connected to our SAMOA
predictive learning topology. At run time I get this exception:
java.lang.ClassNotFoundException:
org.apache.samoa.topology.ComponentFactory cannot be found by
streaming_1.0_SNAPSHOT_1.0.0.

Without CEP my SAMOA core runs fine. It seems that SAMOA is not recognized
on the CEP side. Is this some kind of dependency problem? I do not think
anything is wrong on the extension side: when I replace the StreamProcessor
extension with a locally built Java class that emulates the CEP event
stream, everything runs fine. The problem occurs only when it runs inside
CEP. Do I need to add anything on the CEP side so that it can see SAMOA?
Can you please help me with this? Thank you.
Regards,
Mahesh.
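
One way to narrow this down is to ask, from inside the extension, whether
the class loader that loaded the extension can see the SAMOA class at all.
The "cannot be found by <bundle name>" wording looks like an OSGi bundle
class loader reporting that the package is not visible to that bundle, so
the SAMOA jars probably also have to be made available to the CEP runtime
and not only bundled at build time. That reading is an assumption. The
sketch below is a plain Java probe; the class and method names are made up
for illustration.

```java
// Illustrative diagnostic, not part of CEP, Siddhi or SAMOA.
final class ClassVisibility {

    /** Returns true if the given class loader can load the named class. */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            // 'false' avoids running static initializers; we only test visibility.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Inside the extension, pass the loader of one of the extension's own classes.
        ClassLoader loader = ClassVisibility.class.getClassLoader();
        String name = "org.apache.samoa.topology.ComponentFactory";
        System.out.println(name + " visible: " + isVisible(name, loader));
    }
}
```

Calling isVisible from the StreamProcessor's init code, with the extension
class's own loader, should print false in the failing setup and true once
the SAMOA packages are visible to the bundle.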


On Wed, Jul 27, 2016 at 3:59 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Srinath,
>
> "I think this is OK. I assume this is much easier. Let's do this and check
> the performance",
> I also think so. I am currently on this and have a progress in this.for
> your question,
> "If we use MOA directly, would we loose the distributed support in SAOMA.
> Let's do a call when you can, so we can dsicuss this in detail."
>  I have to check for that.i think  if we are using MOA, we can use
> distributed clusters. As i wen through their documentation MOA itself
> cannot support distribution. But samoa can. What samoa does is providing
> streaming and clustering support by wrapping MOA algorithms. So i think we
> dont need to go for that option,directly MOA. Because now we can handle
> samoa building blocks.So we had 2 options for integrating it with cep
> without exploiting samoa architecture which is highly modular,scalable and
> flexible.
> 1. Develop Samoa topologies with basic samoa building blocks which make
> use of MOA algorithms.
> 2. Creating New streaming options with samoa stream building blocks which
> can feed cep siddhi events into samoa streams and get results samoa streams
> to cep back.
>
> As 2nd option is easy and  take reasonable time i am currently developing
> some modules to integrate cep streams into samoa which can be easily
> further extended to 1 option as well. So i had to modify stream and
> entrance modules for that and i think is has good progress.
>  So now i can feed my custom input stream to samoa topologies. That means
> i can easily integrate cep event stream into samoa instance stream.
> currently i am verifying the streaming clustering algorithms and its
> results with my custom input input streams which can be connected to samoa
> instance streams. As i have already developed siddhi extension for
> streaming, i can use them to feed my custom input streams now.  As the
> initial step i am go with the streaming clustering algorithms. Those are in
> my GSOC github repo [1].
> clustering - Streaming Clustering Support with samoa and CEP
> streaming - Streaming extension for samoa for cep evet streams
>
> i am currenlty working on the verification of results with some of our
> custom streams and then we will just have to integrate it with my
> extensions, which are already developed for cep as my first part.thank you.
> regards,
> Mahesh.
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
>
> On Fri, Jul 22, 2016 at 3:17 PM, Srinath Perera <srin...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> On Thu, Jul 21, 2016 at 2:10 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi All,
>>> I am onto connecting cep streams with samoa streams to data analysis
>>> using samoa framework. To connect samoa with cep siddhi event streams what
>>> i we can do is that try to convert cep streams into samoa streams or else
>>> writing wrpper for samoa for cep  streasm to be used. In both cases i have
>>> to covert siddhi cep streasm into samoa streams. Samoa is using MOA to
>>> analyse data. Moo contains ML framework to analyse stream data. Samoa is
>>> wrapping MOA withsome of its classes.
>>>
>>> Samoa streams is based on MOA, Instance and InstanceStreams. Samoa see
>>> streams as a stream of instances [1]. So if we are going to convert cep
>>> events into samoa instances , it will take time. But if we have some
>>> similarity between cep siddhi streams and samoa streasm we can reduce the
>>> time.

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-27 Thread Mahesh Dananjaya
Hi Srinath,

"I think this is OK. I assume this is much easier. Let's do this and check
the performance",
I think so too. I am currently working on this and have made some progress.
As for your question,
"If we use MOA directly, would we loose the distributed support in SAOMA.
Let's do a call when you can, so we can dsicuss this in detail."
I have to check on that. I think that if we use MOA directly we cannot use
distributed clusters: as far as I could tell from their documentation, MOA
itself does not support distribution, but SAMOA does. What SAMOA does is
provide streaming and clustering support by wrapping MOA algorithms. So I
think we do not need to go for the direct MOA option, because we can now
work with the SAMOA building blocks. That leaves 2 options for integrating
it with CEP without disturbing the SAMOA architecture, which is highly
modular, scalable and flexible.
1. Develop SAMOA topologies from the basic SAMOA building blocks, which
make use of MOA algorithms.
2. Create new streaming options from the SAMOA stream building blocks,
which can feed CEP Siddhi events into SAMOA streams and pass the resulting
SAMOA streams back to CEP.

As the 2nd option is easier and takes a reasonable amount of time, I am
currently developing some modules to integrate CEP streams into SAMOA,
which can easily be extended to the 1st option as well. I had to modify the
stream and entrance modules for that, and I think it is making good
progress.
So now I can feed my custom input stream to SAMOA topologies, which means I
can easily integrate a CEP event stream into a SAMOA instance stream.
Currently I am verifying the streaming clustering algorithms and their
results with my custom input streams, which can be connected to SAMOA
instance streams. As I have already developed a Siddhi extension for
streaming, I can use it to feed my custom input streams now. As the initial
step I am going with the streaming clustering algorithms. These are in my
GSoC GitHub repo [1]:
clustering - Streaming Clustering Support with samoa and CEP
streaming - Streaming extension for samoa for cep event streams

I am currently working on verifying the results with some of our custom
streams. After that we will just have to integrate it with my extensions,
which were already developed for CEP as the first part of my project.
Thank you.
Regards,
Mahesh.

[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming

On Fri, Jul 22, 2016 at 3:17 PM, Srinath Perera <srin...@wso2.com> wrote:

> Hi Mahesh,
>
> On Thu, Jul 21, 2016 at 2:10 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi All,
>> I am onto connecting cep streams with samoa streams to data analysis
>> using samoa framework. To connect samoa with cep siddhi event streams what
>> i we can do is that try to convert cep streams into samoa streams or else
>> writing wrpper for samoa for cep  streasm to be used. In both cases i have
>> to covert siddhi cep streasm into samoa streams. Samoa is using MOA to
>> analyse data. Moo contains ML framework to analyse stream data. Samoa is
>> wrapping MOA withsome of its classes.
>>
>> Samoa streams is based on MOA, Instance and InstanceStreams. Samoa see
>> streams as a stream of instances [1]. So if we are going to convert cep
>> events into samoa instances , it will take time. But if we have some
>> similarity between cep siddhi streams and samoa streasm we can reduce the
>> time.
>> 1. What is the underlying infrastructure for cep siddhi streasm.?
>> 2. Are there anything as Instances or InstanceStreams kind of
>> implmentation underlying cep streams?
>> 3. How can i get more underestanding on CEP siddhi streams.
>>
>> On the other hand i can use my cep siddhi extension and put those events
>> into event queue and convert them into samoa instances and feed them into
>> samoa streaming ml topologies.
>>
> I think this is OK. I assume this is much easier. Let's do this and check
> the performance.
>
>
>> There is another option. In Samoa what they are basically doing is that
>> wrapping MOA ML framework and write some classes for build streaming ml
>> topologies. So as the other option i can wrap samoa moa with my design and
>> use moa ml framework directly. (No need for Samoa extension). I have
>> building some topologies to streaming data analysis [2]. Main problem is
>> that lack of documentation. Anyway i had go through their whole samoa
>> design.thank you.
>>
>
> If we use MOA directly, would we loose the distributed support in SAOMA.
> Let's do a call when you can, so we can dsicuss this in detail.
>
> --Srinath
>
>
>
>> regards,
>> Mahesh.
>>
>> [1]
>> https://github.com/apache/incubator-samoa/blob/master/samoa-api/src/main/java/org/apache/samoa/streams/clustering/Clus

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-26 Thread Mahesh Dananjaya
Hi Nirmal,
I am in the middle of trying some options to connect SAMOA with CEP for
prediction. There are a couple of options. Since documentation is lacking
and there is little support from the SAMOA dev mailing list, I have to go
through the whole project to implement even a small topology on my own. I
am implementing SAMOA example topologies for our exact goal, but I think
this is quite large in scope. I am in the middle of converting CEP Siddhi
event streams into SAMOA streams and converting the resulting SAMOA streams
back into CEP streams, and I am writing a couple of examples for that.
Sorry for not updating by email, but I have been putting my examples into
my GSoC repository [1]. I will push the latest changes to a couple of
modules. I was discussing with Maheshakya arranging a meeting with the team
next Thursday or Friday. Please let me know a convenient day. I am looking
forward to your advice there. This integration can be done, but it will
take some time, because I have to go through all of SAMOA.
Anyway, I was able to connect SAMOA output streams to a custom stream
outside it [2]. Connecting custom streams to SAMOA streams is challenging.
I have a couple of options that can be used without disturbing the SAMOA
architecture. I will discuss them with you when we meet. I think I have
already implemented our initial goal with Spark, and I will try to finish
the SAMOA part as soon as I can. Thank you.
Regards,
Mahesh.

P.S. If anyone has knowledge of WEKA or MOA streams and instances, it would
be very helpful, because what I am doing now is converting CEP events into
SAMOA Instances and feeding them to the stream.
[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming
[2]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming

On Tue, Jul 26, 2016 at 10:25 AM, Nirmal Fernando <nir...@wso2.com> wrote:

> Hi Mahesh,
>
> What's the status of the project?
>
> On Thu, Jul 14, 2016 at 10:28 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I am building and running samoa to see its functionality. In samoa still
>> we have limited supports in algorithms. Samoa supports only classification
>> and clustering with streams. It also use kind of StreamProcessor, like the
>> one we use in StreamProcessor extension.  I was getting started with Samoa
>> referring to this page [1]. Then i ran couple of examples to identified the
>> flow. Samoa use hadoop framework instead spark for distribution. But i am
>> using it in a local mode. When i see the Samoa core there is only limited
>> algorithms. IMO if we are going to use Samoa we  have to limit the
>> functionality and algorithms [2]. When i go to developer corner in [3], it
>> seems to be something like CEP extension that we are using currenlty. SO in
>> Samoa though the algorihtms are limited, they have implemented streaming
>> support for them. Therefore if we integrate it into CEP we have to look for
>> how to handle streams and algorithms in Samoa side. Is it good for your
>> side to have both hadoop and spark running background.thank you.
>> regards,
>> Mahesh.
>>
>> [1] https://samoa.incubator.apache.org/documentation/Home.html
>> [2]
>> https://samoa.incubator.apache.org/documentation/api/current/index.html
>>
>>
>> On Wed, Jun 22, 2016 at 11:51 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> can i give external data sources like data from database , data from
>>> HDFS to generate events in the cep event simulator rather than giving a
>>> file. i saw "Switch to upload file for simulation" in the input Data By
>>> Data Source in  the event simulator. How can i feed data real time from
>>> other sources or directly as data generating from remote server as JSON or
>>> etc... What format the database should be.This is just for my
>>> knowledge.thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Nirmal,
>>>> *This is what i have done so far in the GSOC2016,*
>>>>
>>>>- prior research before SGD (Stochastic Gradient Descent)
>>>>optimization techniques and mini-batch processing
>>>>- Getting familiar and writing extensions to siddhi
>>>>- Wrote a Stream Processor extensions for streaming application and
>>>>machine learning algorithms (Linear Regression,KMeans & Logistic 
>>>> Regression)
>>>>- Developed a Streami

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-21 Thread Mahesh Dananjaya
Hi All,
I am working on connecting CEP streams with SAMOA streams so that data can
be analysed using the SAMOA framework. To connect SAMOA with CEP Siddhi
event streams, we can either convert CEP streams into SAMOA streams or
write a wrapper around SAMOA so that CEP streams can be used. In both cases
I have to convert Siddhi CEP streams into SAMOA streams. SAMOA uses MOA to
analyse data. MOA contains an ML framework for analysing stream data, and
SAMOA wraps MOA with some of its own classes.

SAMOA streams are based on MOA, Instance and InstanceStream. SAMOA sees a
stream as a stream of instances [1]. So if we convert CEP events into SAMOA
instances, it will take time. But if there is some similarity between CEP
Siddhi streams and SAMOA streams, we can reduce that time.
1. What is the underlying infrastructure for CEP Siddhi streams?
2. Is there anything like an Instance or InstanceStream implementation
underlying CEP streams?
3. How can I get a better understanding of CEP Siddhi streams?

On the other hand, I can use my CEP Siddhi extension to put those events
into an event queue, convert them into SAMOA instances, and feed them into
SAMOA streaming ML topologies. There is another option. What SAMOA
basically does is wrap the MOA ML framework and add some classes for
building streaming ML topologies. So as the other option I can wrap MOA
with my own design and use the MOA ML framework directly (no need for a
SAMOA extension). I have been building some topologies for streaming data
analysis [2]. The main problem is the lack of documentation. Anyway, I have
gone through their whole SAMOA design. Thank you.
Regards,
Mahesh.

[1]
https://github.com/apache/incubator-samoa/blob/master/samoa-api/src/main/java/org/apache/samoa/streams/clustering/ClusteringStream.java
[2]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
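
The event-queue option described above can be sketched roughly as follows.
This is only an illustration of the idea, not the code in the repository:
SimpleInstance is a hypothetical stand-in for a SAMOA Instance, the class
and method names are invented, and events are assumed to arrive as Object[]
attribute arrays that hold only numeric values.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative bridge between a CEP-side producer and a SAMOA-side consumer.
final class EventQueueBridge {

    /** Hypothetical stand-in for a SAMOA Instance: numeric attributes only. */
    static final class SimpleInstance {
        final double[] values;
        SimpleInstance(double[] values) { this.values = values; }
    }

    private final BlockingQueue<Object[]> queue;

    EventQueueBridge(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Called on the CEP side for each event; returns false if the queue is full. */
    boolean offer(Object[] eventData) {
        return queue.offer(eventData);
    }

    /** Called on the SAMOA side; waits up to timeoutMillis, returns null if no event arrived. */
    SimpleInstance nextInstance(long timeoutMillis) {
        Object[] data;
        try {
            data = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
        if (data == null) {
            return null;
        }
        double[] values = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            values[i] = ((Number) data[i]).doubleValue(); // assumes numeric attributes
        }
        return new SimpleInstance(values);
    }

    public static void main(String[] args) {
        EventQueueBridge bridge = new EventQueueBridge(1000);
        bridge.offer(new Object[] {1.5, 2, 3.0f});
        SimpleInstance instance = bridge.nextInstance(100);
        System.out.println(Arrays.toString(instance.values));
    }
}
```

A bounded queue keeps a slow topology from exhausting memory: when offer
returns false the extension can drop the event or retry, which is one of
the things worth measuring when checking the performance of this approach.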

On Mon, Jul 18, 2016 at 11:40 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> Samoa modules built as topologies that connect streams with the internal
> processors. I have already written some examples to test the ML algorithms
> and samoa analysis topologies. What we need to done is mostly developing a
> wrapper around samoa topologies to connect their input and output streams
> with our cep streams. So i am currently going through their stream
> architecture to connect our streams with their streams. Couple of examples
> exapaining samoa ml topologies and streaming can be found in my git hub
> repo [1]. Samoa using MOA ml algorithms by wrapping them with their
> classes. Initailly i am trying to develop a KMeansClustering analysis with
> cep streams with samoa ml topologies.
> And also i could not find a maven repo for samoa 0.4.0 incubating. So i am
> currently using my local m2 repo's samoa 0..4.0 incubating for my
> dependencies to work. The local one is built by original samoa source.thank
> you.
>
> regards,
> Mahesh.
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
>
>
> On Mon, Jul 18, 2016 at 8:32 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Can you  please share your samoa project?
>>
>> On Sun, Jul 17, 2016 at 11:19 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>>
>>> -- Forwarded message --
>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>> Date: Sun, Jul 17, 2016 at 11:18 AM
>>> Subject: Re: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online
>>> data for WSO2 Machine Learner-Samoa Integration
>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>
>>>
>>> Hi Maheshakaya,
>>> just need a little help. In Samoa when we want to run a class what is
>>> does it used this commands [1],
>>> 1. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
>>> "ClusteringEvaluation"
>>> 2. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
>>> "PrequentialEvaluation -d /tmp/dump.csv -i 100 -f 10 -l
>>> (classifiers.trees.VerticalHoeffdingTree -p 4) -s
>>> (generators.RandomTreeGenerator -c 2 -o 10 -u 10)"
>>>
>>> what is does is call a class named LocalDoTask [4] and pass this string
>>> as argument.After that that LocalDoTask call the relevent Tasks such as
>>> ClusteringEvaluation or PrequentialEvaluation. [2].
>>>
>>> Now i have add samoa dependencies to my new maven project, where i used
>>> original samoa source to write examples and test then earlier.Now i want to
>>> push them into my new java project with samoa dependencie

Re: [Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-18 Thread Mahesh Dananjaya
Hi Maheshakya,
Samoa modules are built as topologies that connect streams to the internal
processors. I have already written some examples to test the ML algorithms
and Samoa analysis topologies. What we need to do is mostly develop a
wrapper around Samoa topologies to connect their input and output streams
with our CEP streams. So I am currently going through their stream
architecture to connect our streams with theirs. A couple of examples
explaining Samoa ML topologies and streaming can be found in my GitHub
repo [1]. Samoa uses MOA ML algorithms by wrapping them with its own
classes. Initially I am trying to develop a KMeansClustering analysis that
connects CEP streams with Samoa ML topologies.
Also, I could not find a Maven repo for Samoa 0.4.0-incubating, so I am
currently using Samoa 0.4.0-incubating from my local m2 repo for my
dependencies to work. The local one is built from the original Samoa source.
Thank you.
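
The wrapper idea above could be sketched roughly as follows. This is only an
illustration in plain Java and uses no real Samoa or Siddhi API: StreamBridge,
publish and poll are hypothetical names, and a bounded queue stands in for the
hand-off between a CEP-side callback and a Samoa-side stream source.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical bridge: the CEP side pushes events in, the Samoa side pulls
// them out. No real Siddhi or Samoa classes are involved in this sketch.
final class StreamBridge {
    private final BlockingQueue<double[]> queue;

    StreamBridge(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called from the CEP side for each incoming event.
    // Returns false (the event is dropped) when the buffer is full.
    boolean publish(double[] event) {
        return queue.offer(event.clone());
    }

    // Called from the Samoa side, e.g. from inside a stream source.
    // Non-blocking: returns null when no event is waiting.
    double[] poll() {
        return queue.poll();
    }

    // Number of events buffered but not yet consumed.
    int pending() {
        return queue.size();
    }
}
```

A bounded queue is used here so that a slow consumer shows up as dropped
events rather than unbounded memory growth; whether dropping is acceptable
would depend on the use case.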

regards,
Mahesh.

[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming


On Mon, Jul 18, 2016 at 8:32 AM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Can you  please share your samoa project?
>
> On Sun, Jul 17, 2016 at 11:19 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> -- Forwarded message --
>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>> Date: Sun, Jul 17, 2016 at 11:18 AM
>> Subject: Re: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online
>> data for WSO2 Machine Learner-Samoa Integration
>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>
>>
>> Hi Maheshakaya,
>> just need a little help. In Samoa when we want to run a class what is
>> does it used this commands [1],
>> 1. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
>> "ClusteringEvaluation"
>> 2. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
>> "PrequentialEvaluation -d /tmp/dump.csv -i 100 -f 10 -l
>> (classifiers.trees.VerticalHoeffdingTree -p 4) -s
>> (generators.RandomTreeGenerator -c 2 -o 10 -u 10)"
>>
>> what is does is call a class named LocalDoTask [4] and pass this string
>> as argument.After that that LocalDoTask call the relevent Tasks such as
>> ClusteringEvaluation or PrequentialEvaluation. [2].
>>
>> Now i have add samoa dependencies to my new maven project, where i used
>> original samoa source to write examples and test then earlier.Now i want to
>> push them into my new java project with samoa dependencies. I added
>> dependency and it was built fine. Now i am calling my local DoTask.java [3]
>> file as same as i did with samoa with,
>> java -cp target/streaming-1.0-SNAPSHOT.jar
>> org.gsoc.samoa.streaming.DoTask
>> "org.gsoc.samoa.streaming.ClusteringEvaluation"
>> But seems to be i am incorrect in some place.
>> Error: A JNI error has occurred, please check your installation and try
>> again
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/samoa/topology/ComponentFactory
>> at java.lang.Class.getDeclaredMethods0(Native Method)
>> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
>> at java.lang.Class.getMethod0(Class.java:3018)
>> at java.lang.Class.getMethod(Class.java:1784)
>> at
>> sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
>> at
>> sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.samoa.topology.ComponentFactory
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> ... 7 more
>>
>>
>> can i actually call the Task like this.
>>
>> BR,
>> Mahesh.
>>
>> [1]
>> https://samoa.incubator.apache.org/documentation/Prequential-Evaluation-Task.html
>> [2]
>> https://github.com/apache/incubator-samoa/blob/releases/0.4.0-incubating-RC0/samoa-api/src/main/java/org/apache/samoa/tasks/ClusteringEvaluation.java
>> [3]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
>> [4]
>> https://github.com/apache/incubator-samoa/tree/releases/0.4.0-incubating-RC0/samoa-local/src/main/java/org/apache/samoa
>>
>>
>> On Thu, Jul 14, 2016 at 3:47 PM, Mah

[Dev] Fwd: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-16 Thread Mahesh Dananjaya
-- Forwarded message --
From: Mahesh Dananjaya <dananjayamah...@gmail.com>
Date: Sun, Jul 17, 2016 at 11:18 AM
Subject: Re: GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online
data for WSO2 Machine Learner-Samoa Integration
To: Maheshakya Wijewardena <mahesha...@wso2.com>


Hi Maheshakya,
I just need a little help. In Samoa, when we want to run a task, we use
commands like these [1]:
1. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
"ClusteringEvaluation"
2. bin/samoa storm target/SAMOA-Storm-0.0.1-SNAPSHOT.jar
"PrequentialEvaluation -d /tmp/dump.csv -i 100 -f 10 -l
(classifiers.trees.VerticalHoeffdingTree -p 4) -s
(generators.RandomTreeGenerator -c 2 -o 10 -u 10)"

What these do is call a class named LocalDoTask [4] and pass this string as
an argument. After that, LocalDoTask calls the relevant task, such as
ClusteringEvaluation or PrequentialEvaluation [2].

I have now added the Samoa dependencies to my new Maven project. Earlier I
used the original Samoa source to write and test examples; now I want to
move them into my new Java project with the Samoa dependencies. I added the
dependency and it built fine. Now I am calling my local DoTask.java [3]
the same way I did with Samoa:
java -cp target/streaming-1.0-SNAPSHOT.jar org.gsoc.samoa.streaming.DoTask
"org.gsoc.samoa.streaming.ClusteringEvaluation"
But it seems I am doing something wrong somewhere:
Error: A JNI error has occurred, please check your installation and try
again
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/samoa/topology/ComponentFactory
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at
sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException:
org.apache.samoa.topology.ComponentFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more


Can I actually call the task like this?
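
A NoClassDefFoundError like the one above usually means a class that was
available at compile time is missing at run time. Since -cp here lists only
the project jar, the Samoa jars may simply not be on the runtime classpath
(this is an assumption, as the build setup is not shown). A small check along
the following lines can confirm whether a given class is visible to the JVM;
ClasspathCheck and isLoadable are made-up names for this sketch.

```java
// Hypothetical helper: reports whether a class can be found on the
// classpath of the running JVM.
final class ClasspathCheck {
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.samoa.topology.ComponentFactory";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

If the check reports that the class is not loadable under the same classpath
as the failing command, the usual next step would be to add the dependency
jars to -cp or to build a jar that bundles them.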

BR,
Mahesh.

[1]
https://samoa.incubator.apache.org/documentation/Prequential-Evaluation-Task.html
[2]
https://github.com/apache/incubator-samoa/blob/releases/0.4.0-incubating-RC0/samoa-api/src/main/java/org/apache/samoa/tasks/ClusteringEvaluation.java
[3]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/samoa/streaming/src/main/java/org/gsoc/samoa/streaming
[4]
https://github.com/apache/incubator-samoa/tree/releases/0.4.0-incubating-RC0/samoa-local/src/main/java/org/apache/samoa


On Thu, Jul 14, 2016 at 3:47 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi srinath,
> sure.i am working on it.thank you.
> regards,
> Mahesh.
>
> On Thu, Jul 14, 2016 at 11:12 AM, Srinath Perera <srin...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Let's focus on getting SAOMA work with CEP. It is OK to be limited to few
>> algorithms.
>>
>> --Srinath
>>
>> On Thu, Jul 14, 2016 at 10:49 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I think we can build new tasks [1] like the one in execution plan in cep
>>> with samoa. I will try to build a one.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://samoa.incubator.apache.org/documentation/Developing-New-Tasks-in-SAMOA.html
>>>
>>>
>>> On Thu, Jul 14, 2016 at 10:35 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I am building and running samoa to see its functionality. In samoa
>>>> still we have limited supports in algorithms. Samoa supports only
>>>> classification and clustering with streams. It also use kind of
>>>> StreamProcessor, like the one we use in StreamProcessor extension.  I was
>>>> getting started with Samoa referring to this page [1]. Then i ran couple of
>>>> examples to identified the flow. Samoa use hadoop framework instead spark
>>>> for distribution. But i am using it in a local mode. When i see the Samoa
>>>> core there is only limited algorithms. IMO if we are going to use Samoa we
>>>> have to limit the functionality and algorithms [2]. When i go to developer
>>>> corner in [3], it seems to be something like CEP extension that we are
>>>> using currenlty. SO

Re: [Dev] GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-13 Thread Mahesh Dananjaya
Hi Maheshakya,
I am building and running Samoa to see its functionality. Samoa still has
limited algorithm support: it supports only classification and clustering
on streams. It also uses a kind of StreamProcessor, like the one we use in
the StreamProcessor extension. I got started with Samoa by referring to
this page [1]. Then I ran a couple of examples to identify the flow. Samoa
uses the Hadoop framework instead of Spark for distribution, but I am using
it in local mode. When I look at the Samoa core, there are only a limited
number of algorithms. IMO, if we are going to use Samoa, we have to limit
the functionality and algorithms [2]. When I go to the developer corner
in [3], it seems to be something like the CEP extension that we are
currently using. So in Samoa, though the algorithms are limited, streaming
support has been implemented for them. Therefore, if we integrate it into
CEP, we have to look at how to handle streams and algorithms on the Samoa
side. Is it OK on your side to have both Hadoop and Spark running in the
background? Thank you.
regards,
Mahesh.

[1] https://samoa.incubator.apache.org/documentation/Home.html
[2] https://samoa.incubator.apache.org/documentation/api/current/index.html
[3] https://samoa.incubator.apache.org/documentation/SAMOA-Topology.html
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


[Dev] GSOC2016: [ML][CEP] [SAMOA]Predictive analytic with online data for WSO2 Machine Learner-Samoa Integration

2016-07-13 Thread Mahesh Dananjaya
Hi Maheshakya,
IMO the internal structure of Samoa is much like CEP in the way it handles
streams [1]. Thank you.
BR,
Mahesh.

[1] https://samoa.incubator.apache.org/documentation/SAMOA-Topology.html


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-13 Thread Mahesh Dananjaya
Hi Maheshakya,
I am building and running Samoa to see its functionality. Samoa still has
limited algorithm support: it supports only classification and clustering
on streams. It also uses a kind of StreamProcessor, like the one we use in
the StreamProcessor extension. I got started with Samoa by referring to
this page [1]. Then I ran a couple of examples to identify the flow. Samoa
uses the Hadoop framework instead of Spark for distribution, but I am using
it in local mode. When I look at the Samoa core, there are only a limited
number of algorithms. IMO, if we are going to use Samoa, we have to limit
the functionality and algorithms [2]. When I go to the developer corner
in [3], it seems to be something like the CEP extension that we are
currently using. So in Samoa, though the algorithms are limited, streaming
support has been implemented for them. Therefore, if we integrate it into
CEP, we have to look at how to handle streams and algorithms on the Samoa
side. Is it OK on your side to have both Hadoop and Spark running in the
background? Thank you.
regards,
Mahesh.

[1] https://samoa.incubator.apache.org/documentation/Home.html
[2] https://samoa.incubator.apache.org/documentation/api/current/index.html


On Wed, Jun 22, 2016 at 11:51 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> can i give external data sources like data from database , data from HDFS
> to generate events in the cep event simulator rather than giving a file. i
> saw "Switch to upload file for simulation" in the input Data By Data Source
> in  the event simulator. How can i feed data real time from other sources
> or directly as data generating from remote server as JSON or etc... What
> format the database should be.This is just for my knowledge.thank you.
> regards,
> Mahesh.
>
> On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Nirmal,
>> *This is what i have done so far in the GSOC2016,*
>>
>>    - Prior research on SGD (Stochastic Gradient Descent)
>>    optimization techniques and mini-batch processing
>>    - Got familiar with Siddhi and wrote extensions for it
>>    - Wrote Stream Processor extensions for streaming applications and
>>    machine learning algorithms (Linear Regression, KMeans & Logistic
>>    Regression)
>>    - Developed a Streaming Linear Regression class that periodically
>>    retrains models using mini-batch processing with SGD
>>    - Extended the functionality to Moving Window Mini-Batch Processing
>>    with SGD, adding windowShift, which controls the data horizon and data
>>    obsolescence
>>    - Evaluated the performance of the implementation
>>    - Added the Streaming Linear Regression class and Stream Processor
>>    extension to carbon-ml
>>
>>
>> *As a next step,*
>>
>>    - Add persistence of temporal models for applications such as
>>    prediction
>>    - Complete the Streaming KMeans clustering and Logistic Regression classes
>>    - Improve batching and streaming mechanisms
>>    - Improve visualization (optional)
>>    - Write examples and documentation
>>
>> regards,
>>
>> Mahesh.
>>
>> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Sorry, you need to put the returned values of the function into the
>>> output stream
>>>
>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select mse
>>> insert into LinregOutput;
>>> or
>>>
>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into LinregOutput;
>>>
>>> where LinregOutput stream definition contains all attributes: mse,
>>> intercept, beta1, 
>>>
>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> In your output stream, you need to list all the attributes that are
>>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>>> Can you try that?
>>>>
>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> This is the full query i used.
>>>>>
>>>>> @Import('LinRegInput:1.0.0')
>>>>>
>>>>> define stream LinRegInput (salary double, rbi double, walks double,

Re: [Dev] "[ml][cep][gsoc-2016]Samoa vs Spark.Which is good for streaming?"

2016-07-04 Thread Mahesh Dananjaya
Hi Srinath,
Yes, Mahout has algorithms that can be used for streaming applications,
such as SGD. They have listed their algorithms in [1]. And IMO Samoa has
implemented some streaming applications based on those algorithms. It seems
they are still working on it. You can see a couple of such applications
in [2]. I have been going through it recently. Thank you.
regards,
Mahesh.

[1] http://mahout.apache.org/users/basics/algorithms.html
[2]
https://samoa.incubator.apache.org/documentation/SAMOA-and-Machine-Learning.html


[Dev] "[ml][cep][gsoc-2016]Samoa vs Spark.Which is good for streaming?"

2016-07-04 Thread Mahesh Dananjaya
Hi Maheshakya,
As I sent you before, there are several options other than Spark that we can
use for our purpose [1]. I went through Samoa to some extent [2].
Samoa is the streaming extension for the ML framework Mahout [4]. Mahout
basically uses the Hadoop framework and has lots of counterparts to Apache
Spark. There are pros and cons to both platforms [5]. Apache MLlib is based
on the Apache Spark [3] platform. Apache Spark has its own streaming
extension. As I see it, Samoa and Mahout have lots of options and algorithms
for machine learning and data analytics. Mahout also seems more advanced
than Spark, as far as I can tell. So what are we going to do? Samoa and
Mahout have lots of options for streaming analytics.
regards,
Mahesh.

[1] https://github.com/dananjayamahesh/awesome-machine-learning
[2] https://samoa.incubator.apache.org/
[3] http://spark.apache.org/
[4] http://mahout.apache.org/users/basics/algorithms.html
[5]
http://stackoverflow.com/questions/23511459/what-is-the-difference-between-apache-mahout-and-apache-sparks-mllib


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-29 Thread Mahesh Dananjaya
Hi Nirmal,
These are the options we can use instead of Spark for streaming. I am also
going through Samoa. All the options are listed in the Java section at this
link [1]. Thank you.
regards,
Mahesh.
[1] https://github.com/dananjayamahesh/awesome-machine-learning

On Wed, Jun 29, 2016 at 3:32 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> Thanks Mahesh, could you post the same to dev@ and loop Srinath too.
> srin...@wso2.com
>
> On Wed, Jun 29, 2016 at 3:23 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi NIrmal,
>> These are the options we can use instead of spark for streaming.I am
>> going through Samoa also. all the options are listed in JAVA section in
>> this line [1].thank you.
>> regards,
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/awesome-machine-learning
>>
>> On Wed, Jun 29, 2016 at 11:14 AM, Nirmal Fernando <nir...@wso2.com>
>> wrote:
>>
>>> *Notes from Srinath.*
>>>
>>> https://en.wikipedia.org/wiki/Online_machine_learning
>>> Challenges
>>> 1. Concept Drift
>>> 2. Too Much data (scale to many nodes)
>>> 3. Cost of learning is too high
>>> 4. Handling Imprecise and incomplete data
>>>
>>> http://www.scribd.com/doc/218572800/Big-Data-Beer-Apr2014
>>>  -
>>> Stream Drill, doing ML via sketches and counters
>>>
>>> Clustering
>>> Goals
>>> 1. Compactness of representation,
>>> 2. Fast, incremental processing of new data points
>>> 3. Clear and fast identification of “outliers”.
>>> (e.g. D-Stream Clustering [2] - have a offline process that adjust the
>>> clusters)
>>>
>>> http://datasciencecmu.wordpress.com/2014/04/11/data-stream-mining-techniques-and-challenges/
>>>
>>> Classification
>>> Goals (Processing an example at a time, and inspect it only once (at
>>> most), using a limited amount of memory, work in a limited amount of time
>>> and being ready to predict at any point. )
>>> Hoeffding Trees [2] - build the tree, and nodes are split when needed
>>>
>>> http://datasciencecmu.wordpress.com/2014/04/11/data-stream-mining-techniques-and-challenges/
>>>
>>>
>>> Grouping Methods for Pattern Matching in Probabilistic Data Streams <
>>> http://scholar.google.com/scholar_url?url=http://link.springer.com/chapter/10.1007/978-3-319-18120-2_6hl=ensa=Xscisig=AAGBfm1G-VxvT5Xt-4x3XiG01p0sI_xxdwnossl=1oi=scholaralrt
>>> >
>>> K Sugiura, Y Ishikawa, Y Sasaki - Database Systems for Advanced
>>> Applications, 2015
>>> ... Abstract. In recent years, complex event processing has attracted
>>> con- siderable interest in
>>> research and industry.Pattern matching is used to find complex events in
>>> data streams. ...
>>>
>>> A Platform for Detecting Height-Level Contexts from Complex Event
>>> Streams in Pervasive Environment <
>>> http://scholar.google.com/scholar_url?url=http://ieeexplore.ieee.org/xpls/abs_all.jsp%3Farnumber%3D7079615hl=ensa=Xscisig=AAGBfm1lE9JDeTyoeeG5O5vNsSDScidlkAnossl=1oi=scholaralrt
>>> >
>>> CF Liao, K Chen, CT Cheng, TY Weng, WC Lu - Platform Technology and
>>> Service
>>>
>>>
>>>
>>>
>>> Charu Aggarwal’s work
>>>
>>> Book: "Data streams: models and algorithms” ,
>>> http://charuaggarwal.net/streambook.pdf
>>>
>>>
>>> 1. On Clustering Massive Data Streams
>>>
>>> 2. A Survey of Classification Methods in Data Streams
>>>
>>> 3. Frequent Pattern Mining in Data Streams
>>>
>>> 4. A Survey of Change Diagnosis
>>>
>>> 5. Multi-Dimensional Analysis of Data
>>>
>>> 6. Streams Using Stream Cubes
>>>
>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-22 Thread Mahesh Dananjaya
Hi Maheshakya,
Can I use external data sources, such as data from a database or HDFS, to
generate events in the CEP event simulator rather than supplying a file? I
saw "Switch to upload file for simulation" under Input Data By Data Source
in the event simulator. How can I feed data in real time from other sources,
or directly as data generated by a remote server as JSON, etc.? What format
should the database be in? This is just for my knowledge. Thank you.
regards,
Mahesh.

On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Nirmal,
> *This is what i have done so far in the GSOC2016,*
>
>    - Prior research on SGD (Stochastic Gradient Descent) optimization
>    techniques and mini-batch processing
>    - Got familiar with Siddhi and wrote extensions for it
>    - Wrote Stream Processor extensions for streaming applications and
>    machine learning algorithms (Linear Regression, KMeans & Logistic
>    Regression)
>    - Developed a Streaming Linear Regression class that periodically
>    retrains models using mini-batch processing with SGD
>    - Extended the functionality to Moving Window Mini-Batch Processing
>    with SGD, adding windowShift, which controls the data horizon and data
>    obsolescence
>    - Evaluated the performance of the implementation
>    - Added the Streaming Linear Regression class and Stream Processor
>    extension to carbon-ml
>
>
> *As a next step,*
>
>    - Add persistence of temporal models for applications such as prediction
>    - Complete the Streaming KMeans clustering and Logistic Regression classes
>    - Improve batching and streaming mechanisms
>    - Improve visualization (optional)
>    - Write examples and documentation
>
> regards,
>
> Mahesh.
>
> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Sorry, you need to put the returned values of the function into the
>> output stream
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>> select mse
>> insert into LinregOutput;
>> or
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>> select *
>> insert into LinregOutput;
>>
>> where LinregOutput stream definition contains all attributes: mse,
>> intercept, beta1, 
>>
>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> In your output stream, you need to list all the attributes that are
>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>> Can you try that?
>>>
>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> This is the full query i used.
>>>>
>>>> @Import('LinRegInput:1.0.0')
>>>>
>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>> strikeouts double, errors double);
>>>>
>>>> @Export('LinRegOutput:1.0.0')
>>>>
>>>> define stream LinregOutput (mse double);
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>>
>>>> select *
>>>> insert into mse;
>>>>
>>>> but i am sending [mse,intercept,beta1betap] as a outputData
>>>> Object[]. SO how can i publish all these infomation on event publisher.
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Can you summarize the work we have done so far and the remaining work
>>>>> items please?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I have updated the repo [2] and upto date documents can be found at
>>>>>> [1].thank you.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>> [1]
>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>> [2]

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Nirmal,
*This is what I have done so far in GSOC2016:*

   - Prior research on SGD (Stochastic Gradient Descent) optimization
   techniques and mini-batch processing
   - Got familiar with Siddhi and wrote extensions for it
   - Wrote Stream Processor extensions for streaming applications and
   machine learning algorithms (Linear Regression, KMeans & Logistic Regression)
   - Developed a Streaming Linear Regression class that periodically retrains
   models using mini-batch processing with SGD
   - Extended the functionality to Moving Window Mini-Batch Processing with
   SGD, adding windowShift, which controls the data horizon and data obsolescence
   - Evaluated the performance of the implementation
   - Added the Streaming Linear Regression class and Stream Processor
   extension to carbon-ml
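
To make the moving-window idea concrete, here is a minimal sketch. It is not
the carbon-ml implementation: MovingWindowSgd and all of its parameters are
assumptions for illustration. Each full window is treated as one mini-batch,
the model takes a few gradient steps on it, and then the oldest windowShift
points are dropped so that newer data gradually replaces older data.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative moving-window mini-batch gradient descent for linear regression.
// This is NOT the carbon-ml code; every name and default here is an assumption.
final class MovingWindowSgd {
    private final int batchSize;     // window length that triggers a retrain
    private final int windowShift;   // oldest points dropped after each retrain
    private final double stepSize;
    private final int iterations;    // gradient steps per retrain
    private final double[] weights;  // weights[0] is the intercept
    private final List<double[]> window = new ArrayList<>(); // rows: {label, x1..xp}

    MovingWindowSgd(int numFeatures, int batchSize, int windowShift,
                    double stepSize, int iterations) {
        if (windowShift < 1 || windowShift > batchSize) {
            throw new IllegalArgumentException("windowShift must be in [1, batchSize]");
        }
        this.batchSize = batchSize;
        this.windowShift = windowShift;
        this.stepSize = stepSize;
        this.iterations = iterations;
        this.weights = new double[numFeatures + 1];
    }

    // Buffers one event. Returns the window MSE after a retrain, or null
    // while the window is still filling up.
    Double add(double label, double... features) {
        if (features.length != weights.length - 1) {
            throw new IllegalArgumentException("unexpected number of features");
        }
        double[] row = new double[weights.length];
        row[0] = label;
        System.arraycopy(features, 0, row, 1, features.length);
        window.add(row);
        if (window.size() < batchSize) {
            return null;
        }
        for (int i = 0; i < iterations; i++) {
            gradientStep();
        }
        double mse = meanSquaredError();
        window.subList(0, windowShift).clear(); // slide the window forward
        return mse;
    }

    double[] weights() {
        return weights.clone();
    }

    private double predict(double[] row) {
        double y = weights[0];
        for (int j = 1; j < row.length; j++) {
            y += weights[j] * row[j];
        }
        return y;
    }

    // One step along the gradient of half the mean squared error over the window.
    private void gradientStep() {
        double[] grad = new double[weights.length];
        for (double[] row : window) {
            double err = predict(row) - row[0];
            grad[0] += err;
            for (int j = 1; j < row.length; j++) {
                grad[j] += err * row[j];
            }
        }
        for (int j = 0; j < weights.length; j++) {
            weights[j] -= stepSize * grad[j] / window.size();
        }
    }

    private double meanSquaredError() {
        double sum = 0.0;
        for (double[] row : window) {
            double err = predict(row) - row[0];
            sum += err * err;
        }
        return sum / window.size();
    }
}
```

With windowShift equal to batchSize this degenerates to plain batch-by-batch
processing; smaller values keep more history between retrains.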


*As a next step:*

   - Add persistence of temporal models for applications such as prediction
   - Complete the Streaming KMeans clustering and Logistic Regression classes
   - Improve batching and streaming mechanisms
   - Improve visualization (optional)
   - Write examples and documentation

regards,

Mahesh.

On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Sorry, you need to put the returned values of the function into the output
> stream
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select mse
> insert into LinregOutput;
> or
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into LinregOutput;
>
> where LinregOutput stream definition contains all attributes: mse,
> intercept, beta1, 
>
> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> In your output stream, you need to list all the attributes that are
>> returned from the streamlinreg function: mse, intercept, beta1, 
>> Can you try that?
>>
>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> This is the full query i used.
>>>
>>> @Import('LinRegInput:1.0.0')
>>>
>>> define stream LinRegInput (salary double, rbi double, walks double,
>>> strikeouts double, errors double);
>>>
>>> @Export('LinRegOutput:1.0.0')
>>>
>>> define stream LinregOutput (mse double);
>>>
>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>>
>>> select *
>>> insert into mse;
>>>
>>> but i am sending [mse,intercept,beta1betap] as a outputData
>>> Object[]. SO how can i publish all these infomation on event publisher.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Can you summarize the work we have done so far and the remaining work
>>>> items please?
>>>>
>>>> Thanks.
>>>>
>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I have updated the repo [2] and upto date documents can be found at
>>>>> [1].thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1]
>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>> [2]
>>>>> https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>
>>>>>
>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> -- Forwarded message --
>>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>>>>> online data for WSO2 Machine Learner
>>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>>
>>>>>>
>>>>>> Hi Maheshakya,
>>>>>> new query is like this adding spport for moving window methods.
>>>>>>
>>>>>>
>>>>>> @Import('LinRegInput:1.0.1')
>>>>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Maheshakya,
This is the full query i used.

@Import('LinRegInput:1.0.0')

define stream LinRegInput (salary double, rbi double, walks double,
strikeouts double, errors double);

@Export('LinRegOutput:1.0.0')

define stream LinregOutput (mse double);

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)

select *
insert into mse;

But I am sending [mse, intercept, beta1 ... betap] as the outputData Object[].
So how can I publish all of this information through the event publisher?
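
One way to read the replies quoted in this thread is that every value in the
outputData array needs a matching attribute in the output stream definition.
A rough sketch of generating such a definition for p features follows.
OutputStreamDef and define are hypothetical helper names, and the attribute
names mse, intercept and beta1..betap simply mirror the array above.

```java
// Hypothetical helper that builds a Siddhi stream definition string with one
// attribute per value in the output array: mse, intercept, beta1..betaP.
final class OutputStreamDef {
    static String define(String streamName, int numFeatures) {
        StringBuilder sb = new StringBuilder("define stream ")
                .append(streamName)
                .append(" (mse double, intercept double");
        for (int i = 1; i <= numFeatures; i++) {
            sb.append(", beta").append(i).append(" double");
        }
        return sb.append(");").toString();
    }
}
```

For example, OutputStreamDef.define("LinregOutput", 4) yields a definition
with mse, intercept and beta1 to beta4, which the query could then target
with "insert into LinregOutput".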
regards,
Mahesh.

On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> Hi Mahesh,
>
> Can you summarize the work we have done so far and the remaining work
> items please?
>
> Thanks.
>
> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have updated the repo [2] and upto date documents can be found at
>> [1].thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>
>>
>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>>
>>> -- Forwarded message --
>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>> online data for WSO2 Machine Learner
>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>
>>>
>>> Hi Maheshakya,
>>> new query is like this adding spport for moving window methods.
>>>
>>>
>>> @Import('LinRegInput:1.0.1')
>>> define stream LinRegInput (salary double, rbi double, walks double,
>>> strikeouts double, errors double);
>>>
>>> @Export('LinRegOutput:1.0.1')
>>> define stream LinRegOutput (mse double);
>>>
>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into mse;
>>> 1=learnType
>>> 2=windowShift
>>> 4=batchSize...
>>>
>>> windowShift is added to configure the amount of shift. i have added
>>> log.infe(mse) to view the MSE.
>>> Mahesh.
>>>
>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> If you are installing features  from new p2 repo into a new CEP pack,
>>>> then you wont need to replace those jars.
>>>> If you have already installed those in the CEP from a previous p2-repo,
>>>> then you have to un-install those features and reinstall with new p2 repo.
>>>> But you don't need to do this because you can just replace the jar. It's
>>>> easy.
>>>>
>>>> Best regards.
>>>>
>>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> If i built the carbon-ml then product-ml and point new p2 repository
>>>>> to cep features, do i need to copy that
>>>>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>>>>> cep_home/repository/component/... place.
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> In MLModelhandler there's persistModel method
>>>>>> debug that method while trying to train a model from ML
>>>>>> you can see the steps it takes
>>>>>> don't use deep learning algorithm
>>>>>> any other algorithm would work
>>>>>> from line 777 is the section for creating the serializable object
>>>>>> from trained model and saving it
>>>>>>
>>>>>>
>>>>>> I think you don't need to directly use ML model handler
>>>>>> you need to use the code in that for persisting models in the
>>>>>> streaming algorithm
>>>>>> so you can add a utils class in the streaming folder
>>>>>> then add the persisting logic there
>>>>>> ignore the deeplearning section in that
>>>>>> only forcus

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Maheshakya,
I have updated the repo [2], and up-to-date documents can be found at
[1]. Thank you.
regards,
Mahesh.
[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
[2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml


On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:

>
> -- Forwarded message ------
> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
> Date: Tue, Jun 21, 2016 at 5:08 PM
> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
> online data for WSO2 Machine Learner
> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>
>
> Hi Maheshakya,
> new query is like this adding spport for moving window methods.
>
>
> @Import('LinRegInput:1.0.1')
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.1')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into mse;
> 1=learnType
> 2=windowShift
> 4=batchSize...
>
> windowShift is added to configure the amount of shift. i have added
> log.infe(mse) to view the MSE.
> Mahesh.
>
> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> If you are installing features  from new p2 repo into a new CEP pack,
>> then you wont need to replace those jars.
>> If you have already installed those in the CEP from a previous p2-repo,
>> then you have to un-install those features and reinstall with new p2 repo.
>> But you don't need to do this because you can just replace the jar. It's
>> easy.
>>
>> Best regards.
>>
>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> If i built the carbon-ml then product-ml and point new p2 repository to
>>> cep features, do i need to copy that
>>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>>> cep_home/repository/component/... place.
>>> regards,
>>> Mahesh.
>>>
>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> In MLModelhandler there's persistModel method
>>>> debug that method while trying to train a model from ML
>>>> you can see the steps it takes
>>>> don't use deep learning algorithm
>>>> any other algorithm would work
>>>> from line 777 is the section for creating the serializable object from
>>>> trained model and saving it
>>>>
>>>>
>>>> I think you don't need to directly use ML model handler
>>>> you need to use the code in that for persisting models in the streaming
>>>> algorithm
>>>> so you can add a utils class in the streaming folder
>>>> then add the persisting logic there
>>>> ignore the deeplearning section in that
>>>> only forcus on persisting spark mod
>>>>
>>>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I pushed the StreamingLinearRegression modules into my forked
>>>>> carbon-ml repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>>>>> model.thank you.
>>>>> Mahesh.
>>>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>>>
>>>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> yes
>>>>>> you should develop in tha fork repo
>>>>>> clone your forked repo
>>>>>> then go into that
>>>>>> then add upstream repo as original wso2 repo
>>>>>> see the remote tracking branchs by
>>>>>> git remote -v
>>>>>> you will see the origin as your forked repo
>>>>>> to add upstream
>>>>>> git remote add upstream 
>>>>>> when you change something create a new branch by
>>>>>> git checkout -b new_branch_name
>>>>>> then add and commit to this branch
>>>>>> after that push to the forked by
>>>>>> git push origin new_branch_name
>>>>>>
>>>>>> On

[Dev] Fwd: Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
-- Forwarded message --
From: Mahesh Dananjaya <dananjayamah...@gmail.com>
Date: Tue, Jun 21, 2016 at 5:08 PM
Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online
data for WSO2 Machine Learner
To: Maheshakya Wijewardena <mahesha...@wso2.com>


Hi Maheshakya,
The new query looks like this, adding support for moving-window methods.


@Import('LinRegInput:1.0.1')
define stream LinRegInput (salary double, rbi double, walks double,
strikeouts double, errors double);

@Export('LinRegOutput:1.0.1')
define stream LinRegOutput (mse double);

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)
select *
insert into mse;
In the query above, 1 = learnType, 2 = windowShift, 4 = batchSize, and so on.

windowShift is added to configure the amount of shift. I have added
log.info(mse) to view the MSE.
Mahesh.
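
A minimal sketch of how batchSize and windowShift could interact, assuming that a model is trained whenever batchSize events are buffered and that the window then slides forward by windowShift events. These semantics are an assumption (the message only says that windowShift configures the amount of shift), and MovingWindowBuffer is a hypothetical class, not the carbon-ml implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical moving-window buffer; not the carbon-ml implementation. */
final class MovingWindowBuffer {
    private final int batchSize;
    private final int windowShift;
    private final Deque<double[]> window = new ArrayDeque<>();

    MovingWindowBuffer(int batchSize, int windowShift) {
        if (batchSize <= 0 || windowShift <= 0 || windowShift > batchSize) {
            throw new IllegalArgumentException("need 0 < windowShift <= batchSize");
        }
        this.batchSize = batchSize;
        this.windowShift = windowShift;
    }

    /**
     * Adds one event. When batchSize events are buffered, returns a copy of
     * the full window and drops the oldest windowShift events; otherwise
     * returns null.
     */
    List<double[]> add(double[] event) {
        window.addLast(event);
        if (window.size() < batchSize) {
            return null;
        }
        List<double[]> batch = new ArrayList<>(window);
        for (int i = 0; i < windowShift; i++) {
            window.removeFirst();
        }
        return batch;
    }
}
```

With batchSize = 4 and windowShift = 2, as in the query above, this buffer would hand out events 1 to 4 first and events 3 to 6 next, so consecutive training windows overlap by two events.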

On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> If you are installing features from the new p2 repo into a new CEP pack,
> then you won't need to replace those jars.
> If you have already installed those in the CEP from a previous p2 repo,
> then you have to uninstall those features and reinstall with the new p2 repo.
> But you don't need to do this because you can just replace the jar. It's
> easy.
>
> Best regards.
>
> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> If i built the carbon-ml then product-ml and point new p2 repository to
>> cep features, do i need to copy that
>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>> cep_home/repository/component/... place.
>> regards,
>> Mahesh.
>>
>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> In MLModelhandler there's persistModel method
>>> debug that method while trying to train a model from ML
>>> you can see the steps it takes
>>> don't use deep learning algorithm
>>> any other algorithm would work
>>> from line 777 is the section for creating the serializable object from
>>> trained model and saving it
>>>
>>>
>>> I think you don't need to directly use ML model handler
>>> you need to use the code in that for persisting models in the streaming
>>> algorithm
>>> so you can add a utils class in the streaming folder
>>> then add the persisting logic there
>>> ignore the deeplearning section in that
>>> only forcus on persisting spark mod
>>>
>>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I pushed the StreamingLinearRegression modules into my forked carbon-ml
>>>> repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>>>> model.thank you.
>>>> Mahesh.
>>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>>
>>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> yes
>>>>> you should develop in tha fork repo
>>>>> clone your forked repo
>>>>> then go into that
>>>>> then add upstream repo as original wso2 repo
>>>>> see the remote tracking branchs by
>>>>> git remote -v
>>>>> you will see the origin as your forked repo
>>>>> to add upstream
>>>>> git remote add upstream 
>>>>> when you change something create a new branch by
>>>>> git checkout -b new_branch_name
>>>>> then add and commit to this branch
>>>>> after that push to the forked by
>>>>> git push origin new_branch_name
>>>>>
>>>>> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> the above error is due to a simple mistake of not providing my local
>>>>>> p2 repo.Now it is working and i debugged the StreamingLinearRegression
>>>>>> model cep.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Maheshakya,
>>>>>>> I did what you recommend. But when i am adding the query the
>>>>>>> following error i

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Maheshakya,
If I build carbon-ml and then product-ml, and point the CEP features to the
new p2 repository, do I still need to copy that
org.wso2.carbon.ml.siddhi.extension1.1. jar into
cep_home/repository/component/...?
regards,
Mahesh.

On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:

> In MLModelhandler there's persistModel method
> debug that method while trying to train a model from ML
> you can see the steps it takes
> don't use deep learning algorithm
> any other algorithm would work
> from line 777 is the section for creating the serializable object from
> trained model and saving it
>
>
> I think you don't need to directly use ML model handler
> you need to use the code in that for persisting models in the streaming
> algorithm
> so you can add a utils class in the streaming folder
> then add the persisting logic there
> ignore the deeplearning section in that
> only focus on persisting spark mod
>
> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I pushed the StreamingLinearRegression modules into my forked carbon-ml
>> repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>> model.thank you.
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/carbon-ml
>>
>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> yes
>>> you should develop in tha fork repo
>>> clone your forked repo
>>> then go into that
>>> then add upstream repo as original wso2 repo
>>> see the remote tracking branchs by
>>> git remote -v
>>> you will see the origin as your forked repo
>>> to add upstream
>>> git remote add upstream 
>>> when you change something create a new branch by
>>> git checkout -b new_branch_name
>>> then add and commit to this branch
>>> after that push to the forked by
>>> git push origin new_branch_name
>>>
>>> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> the above error is due to a simple mistake of not providing my local p2
>>>> repo.Now it is working and i debugged the StreamingLinearRegression model
>>>> cep.
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I did what you recommend. But when i am adding the query the following
>>>>> error is appearing.
>>>>> No extension exist for StreamFunctionExtension{namespace='ml'} in
>>>>> execution plan "NewExecutionPlan"
>>>>>
>>>>> *My query is as follows,
>>>>> @Import('LinRegInput:1.0.0')
>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>> strikeouts double, errors double);
>>>>>
>>>>> @Export('LinRegOutput:1.0.0')
>>>>> define stream LinRegOutput (mse double);
>>>>>
>>>>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95,
>>>>> salary, rbi, walks, strikeouts, errors)
>>>>> select *
>>>>> insert into mse;
>>>>>
>>>>> I have added my files as follows,
>>>>>
>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>>>>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>>>>
>>>>> and add following lines to ml.siddhiext
>>>>>
>>>>> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>>>>>
>>>>> .Then i build the carbon-ml. The replace the jar file you asked me
>>>>> replace with the name changed.any thoughts?
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> You don't need to add new p2 repo.
>>>>>> In the /repository/components/plugins folder, you will find
>>>>>> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
>>>>>> carbon-ml/components/extensions/or

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-15 Thread Mahesh Dananjaya
Hi Maheshakya,
I pushed the StreamingLinearRegression modules into my forked carbon-ml
repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting the
model. Thank you.
Mahesh.
[1] https://github.com/dananjayamahesh/carbon-ml

On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:

> yes
> you should develop in the fork repo
> clone your forked repo
> then go into that
> then add upstream repo as original wso2 repo
> see the remote tracking branches by
> git remote -v
> you will see the origin as your forked repo
> to add upstream
> git remote add upstream 
> when you change something create a new branch by
> git checkout -b new_branch_name
> then add and commit to this branch
> after that push to the forked by
> git push origin new_branch_name
>
> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> the above error is due to a simple mistake of not providing my local p2
>> repo.Now it is working and i debugged the StreamingLinearRegression model
>> cep.
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I did what you recommend. But when i am adding the query the following
>>> error is appearing.
>>> No extension exist for StreamFunctionExtension{namespace='ml'} in
>>> execution plan "NewExecutionPlan"
>>>
>>> *My query is as follows,
>>> @Import('LinRegInput:1.0.0')
>>> define stream LinRegInput (salary double, rbi double, walks double,
>>> strikeouts double, errors double);
>>>
>>> @Export('LinRegOutput:1.0.0')
>>> define stream LinRegOutput (mse double);
>>>
>>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into mse;
>>>
>>> I have added my files as follows,
>>>
>>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>>
>>> and add following lines to ml.siddhiext
>>>
>>> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>>>
>>> .Then i build the carbon-ml. The replace the jar file you asked me
>>> replace with the name changed.any thoughts?
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> You don't need to add new p2 repo.
>>>> In the /repository/components/plugins folder, you will find
>>>> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
>>>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>>>> First rename this jar in the target folder to the jar name in the plugins
>>>> folder then replace (Make sure, otherwise will not work).
>>>> Your updates will be there in the CEP after this.
>>>>
>>>> Best regards.
>>>>
>>>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> Do i need to add p2 local repos of ML into CEP after i made changes to
>>>>> ml extensions. Or will it be automatically updated. I am trying to debug 
>>>>> my
>>>>> extension with the cep.thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Mahesh when you add your work to carbon-ml follow the bellow
>>>>>> guidelines, it will help to keep the code clean.
>>>>>>
>>>>>>
>>>>>>- Add only the sources code file you have newly added or changed.
>>>>>>- Do not use add . (add all) command in git. Only use add filename
>>>>>>
>>>>>> I have seen in your gsoc repo that there are gitignore files, idea
>>>>>> related files and the target folder is there. These should not be in the
>>>>>> source c

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
The above error was due to a simple mistake: I had not provided my local p2
repo. Now it is working, and I have debugged the StreamingLinearRegression
model in CEP.
regards,
Mahesh.

On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I did what you recommend. But when i am adding the query the following
> error is appearing.
> No extension exist for StreamFunctionExtension{namespace='ml'} in
> execution plan "NewExecutionPlan"
>
> *My query is as follows,
> @Import('LinRegInput:1.0.0')
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95, salary,
> rbi, walks, strikeouts, errors)
> select *
> insert into mse;
>
> I have added my files as follows,
>
> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>
> and add following lines to ml.siddhiext
>
> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>
> .Then i build the carbon-ml. The replace the jar file you asked me replace
> with the name changed.any thoughts?
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't need to add new p2 repo.
>> In the /repository/components/plugins folder, you will find
>> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>> First rename this jar in the target folder to the jar name in the plugins
>> folder then replace (Make sure, otherwise will not work).
>> Your updates will be there in the CEP after this.
>>
>> Best regards.
>>
>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Do i need to add p2 local repos of ML into CEP after i made changes to
>>> ml extensions. Or will it be automatically updated. I am trying to debug my
>>> extension with the cep.thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Mahesh when you add your work to carbon-ml follow the bellow
>>>> guidelines, it will help to keep the code clean.
>>>>
>>>>
>>>>- Add only the sources code file you have newly added or changed.
>>>>- Do not use add . (add all) command in git. Only use add filename
>>>>
>>>> I have seen in your gsoc repo that there are gitignore files, idea
>>>> related files and the target folder is there. These should not be in the
>>>> source code, only the source files you add.
>>>>
>>>>- Commit when you have done some major activity. Do not add commits
>>>>always when you make a change.
>>>>
>>>>
>>>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> May i seperately put the classes to ml and extensions in carbon-core.
>>>>> I can put Streaming Extensions to extensions and 
>>>>> Algorithms/StreamingLinear
>>>>> Regression and StreamingKMeans in ml core. what is the suitable format. I
>>>>> will commit my changes today as seperate branch in my forked carbon-ml
>>>>> local repo.thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> p.s: better if you can meet me via hangout.
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Pruthuvi Maheshakya Wijewardena
>>>> mahesha...@wso2.com
>>>> +94711228855
>>>>
>>>>
>>>>
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
I did what you recommended, but when I add the query the following
error appears:
No extension exist for StreamFunctionExtension{namespace='ml'} in execution
plan "NewExecutionPlan"

My query is as follows:
@Import('LinRegInput:1.0.0')
define stream LinRegInput (salary double, rbi double, walks double,
strikeouts double, errors double);

@Export('LinRegOutput:1.0.0')
define stream LinRegOutput (mse double);

from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95, salary,
rbi, walks, strikeouts, errors)
select *
insert into mse;

I have added my files as follows,

org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;

and added the following line to ml.siddhiext:

streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor

Then I built carbon-ml and replaced the jar file you asked me to replace,
with the name changed. Any thoughts?
regards,
Mahesh.
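
A minimal sketch of the jar replacement step described in this thread: the freshly built extension jar is copied over the existing plugin jar so that it keeps the file name already present in the plugins folder. PluginJarReplacer is a hypothetical helper, and both paths passed to it are assumptions that depend on the local CEP and carbon-ml checkouts.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper; the paths passed in depend on the local setup. */
final class PluginJarReplacer {

    /**
     * Copies builtJar over pluginJar. The target keeps the plugin jar's file
     * name, which corresponds to "rename, then replace" in a single step.
     */
    static Path replace(Path builtJar, Path pluginJar) throws IOException {
        if (!Files.isRegularFile(pluginJar)) {
            // Refuse to create a new file: the point is to overwrite the
            // jar that is already installed under its existing name.
            throw new IOException("No existing plugin jar at " + pluginJar);
        }
        return Files.copy(builtJar, pluginJar, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Failing when the target does not exist is a deliberate choice in this sketch: a jar copied under a different name than the installed one would sit next to the old jar instead of replacing it.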

On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> You don't need to add new p2 repo.
> In the /repository/components/plugins folder, you will find
> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
> First rename this jar in the target folder to the jar name in the plugins
> folder then replace (Make sure, otherwise will not work).
> Your updates will be there in the CEP after this.
>
> Best regards.
>
> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Do i need to add p2 local repos of ML into CEP after i made changes to ml
>> extensions. Or will it be automatically updated. I am trying to debug my
>> extension with the cep.thank you.
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Mahesh when you add your work to carbon-ml follow the bellow guidelines,
>>> it will help to keep the code clean.
>>>
>>>
>>>- Add only the sources code file you have newly added or changed.
>>>- Do not use add . (add all) command in git. Only use add filename
>>>
>>> I have seen in your gsoc repo that there are gitignore files, idea
>>> related files and the target folder is there. These should not be in the
>>> source code, only the source files you add.
>>>
>>>- Commit when you have done some major activity. Do not add commits
>>>always when you make a change.
>>>
>>>
>>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> May i seperately put the classes to ml and extensions in carbon-core. I
>>>> can put Streaming Extensions to extensions and Algorithms/StreamingLinear
>>>> Regression and StreamingKMeans in ml core. what is the suitable format. I
>>>> will commit my changes today as seperate branch in my forked carbon-ml
>>>> local repo.thank you.
>>>> regards,
>>>> Mahesh.
>>>> p.s: better if you can meet me via hangout.
>>>>
>>>
>>>
>>>
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> mahesha...@wso2.com
>>> +94711228855
>>>
>>>
>>>


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
Do I need to add the local ML p2 repo to CEP after making changes to the ML
extensions, or will it be updated automatically? I am trying to debug my
extension with CEP. Thank you.
regards,
Mahesh.

On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:

> Mahesh, when you add your work to carbon-ml, follow the guidelines below;
> they will help to keep the code clean.
>
>
>    - Add only the source code files you have newly added or changed.
>    - Do not use the add . (add all) command in git. Only use add filename.
>
> I have seen in your gsoc repo that there are gitignore files, IDEA-related
> files and the target folder. These should not be in the source
> code, only the source files you add.
>
>    - Commit when you have completed some major activity. Do not add a
>    commit every time you make a change.
>
>
> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> May i seperately put the classes to ml and extensions in carbon-core. I
>> can put Streaming Extensions to extensions and Algorithms/StreamingLinear
>> Regression and StreamingKMeans in ml core. what is the suitable format. I
>> will commit my changes today as seperate branch in my forked carbon-ml
>> local repo.thank you.
>> regards,
>> Mahesh.
>> p.s: better if you can meet me via hangout.
>>
>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
May I put the classes separately into ml and extensions in carbon-core? I can
put the streaming extensions in extensions, and the algorithms
(StreamingLinearRegression and StreamingKMeans) in ml core. What is the
suitable layout? I will commit my changes today as a separate branch in my
forked carbon-ml local repo. Thank you.
regards,
Mahesh.
P.S. It would be better if we could meet via Hangout.


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-12 Thread Mahesh Dananjaya
Hi Maheshakya,
OK. I have spent the last couple of days on implementing streaming
clustering in an efficient way. I have found a couple of methods. Initially
I am developing k-batch k-means for streaming. I will let you know the
progress within the next couple of days. I have already added a parameter to
the query for the window shift. I will add it to the repo tomorrow morning.
Thank you.
Mahesh.

On 6/12/16, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Hi Mahesh,
>
> Since you have already implemented the streaming algorithms as separate
> siddhi extensions, our next task is to include them in the carbon-ml siddhi
> extensions. Please start that by adding streaming linear regression first.
> You also need to persist models that are trained.
> Refer to method [1] in carbon-ml to see how model persistence is done.
>
> Best regards.
>
> [1]
> https://github.com/wso2/carbon-ml/blob/5211f8b1d662778af832c54fbbcc81fe4aa78e1e/components/ml/org.wso2.carbon.ml.core/src/main/java/org/wso2/carbon/ml/core/impl/MLModelHandler.java#L727
>
> On Sat, Jun 11, 2016 at 10:58 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Regarding your question:
>>
>> my outputData Object[]array is in the format of
>>> [mse,beta0,beta1,betap].But seems to be that cep does not understand
>>> it.
>>
>>
>> Did you create an output stream first for the publisher? You need to
>> create a stream with attributes: mse double, beta1 double, ...
>> and point to that from the publisher.
>>
>>
>>
>> On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> you can find the details of the queries in this README [1]. I have made
>>> some changes, so previous queries may not be valid. Please use these new
>>> queries in the README.
>>> *1. Streaming Linear Regression*
>>> from LinRegInputStream#streaming:streaminglr((learnType),
>>> (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
>>> (ci), salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into regResults;
>>>
>>> from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into regResults;
>>>
>>> *2. Streaming KMeans Clustering*
>>> from LinRegInputStream#streaming:streamingkm((learnType),
>>> (batchSize/timeFrame), (numClusters), (numIterations),(alpha), (ci),
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into regResults;
>>>
>>> from
>>> KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)
>>> select *
>>> insert into regResults;
>>>
>>> I also need help in returning the outputData of my program back to CEP.
>>> Currently you may not find the stream output in the event publisher, but
>>> you can see the output in the console. I want to understand the final step
>>> of putting the output data back into the output stream after the batch is
>>> complete and the algorithm has finished. You may find that the following
>>> line throws an exception. I actually have no clue about the outputData
>>> format that needs to be given to the output stream.
>>>
>>> Object[] outputData = streamingLinearRegression.regress(eventData);
>>>
>>> if (outputData == null) {
>>>     streamEventChunk.remove();
>>> } else {
>>>     complexEventPopulater.populateComplexEvent(complexEvent, outputData);
>>> }
>>>
>>> My outputData Object[] array is in the format
>>> [mse, beta0, beta1, betap], but it seems that CEP does not understand
>>> it. I did this by looking at the time series stream processor extension at
>>> [2]. Can you please help me with this?
>>> regards,
>>> Mahesh.
>>>
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>> [2]
>>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
>>>
>>> On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Great work so far.
>>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-08 Thread Mahesh Dananjaya
Hi Maheshakya,
The example query for streaming linear regression in my last mail should
be:

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

miniBatchFraction must be given as a double (1.0, not 1). I wrote it wrong
when I documented it. Thank you.
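
A minimal sketch of why the literal matters, assuming the extension
receives each query constant as a boxed Java object and casts it directly
to Double (the mail does not show that code, so this is only a guess at the
cause). ParamReader and asDouble are hypothetical names, not part of the
extension.

```java
// Hypothetical helper; not taken from the extension source.
final class ParamReader {

    /** Reads a numeric query constant as a double, whether it was written as 1 or 1.0. */
    static double asDouble(Object constant) {
        if (constant instanceof Number) {
            return ((Number) constant).doubleValue();
        }
        throw new IllegalArgumentException("Expected a numeric constant but got: " + constant);
    }

    public static void main(String[] args) {
        Object written = 1; // autoboxed to Integer, as if the query said "1"
        try {
            double value = (Double) written; // a direct cast fails for an Integer
            System.out.println(value);
        } catch (ClassCastException e) {
            System.out.println("direct cast to Double failed");
        }
        // Going through Number accepts both the int and the double spelling.
        System.out.println(asDouble(written));
    }
}
```

Reading the constant through Number would let both 1 and 1.0 work, at the
cost of silently accepting an integer where a fraction was intended.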


On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshakya,
> you can find the details of the queries in this ReadMe [1]. i have add
> some changes . so previous querirs may not valid.please use these new
> queries in the README.
> *1.Streaming Linear regression*
> from LinRegInputStream#streaming:streaminglr((learnType),
> (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
> (ci), salary, rbi, walks, strikeouts, errors)
> select *
>
>
>
>
> insert into regResults;
>
> from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> *2.Streaming KMeans Clustering*
> from LinRegInputStream#streaming:streamingkm((learnType),
> (batchSize/timeFrame), (numClusters), (numIterations),(alpha), (ci),
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
>
>
> from
> KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)
> select *
> insert into regResults;
>
>  And i need a help in returning the outputData of my program back to cep.
> therefore currenlt you may not find the stream output in event publish.but
> you can see the output in the console. i want to understand the final stepd
> of putting the output data back to output stream after the batch size is
> completed and the algorithms is completed. you may find that following line
> passes an exception. Thats have actually no clue of outputData format that
> need to give for Output stream.
>
> Object[] outputData = streamingLinearRegression.regress(eventData);
>
>
> if (outputData == null) {
> streamEventChunk.remove();
> } else {
> complexEventPopulater.populateComplexEvent(complexEvent, outputData);
> }
>
> my outputData Object[]array is in the format of
> [mse,beta0,beta1,betap].But seems to be that cep does not understand
> it. i do it by looking at the time series stream rpocessor extension at
> [2].can you please help me with this.
> regards,
> Mahesh.
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
>
> On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Great work so far.
>>
>> Regarding the queries:
>>
>> streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)
>>
>>
>> Can you give me the definitions of the first few entities in the order.
>> Also in previous supervised cases (linear regression), what is the response
>> variable, etc.
>> I'll go through the code and give you a feedback.
>>
>>  After this, we need to me this implementation into carbon-ml siddhi
>> extension. Please also do a similar implementation for logistic regression
>> as well because we need to have a streaming version for classification as
>> well.
>>
>> Best regards.
>>
>>
>>
>> On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> I have changed the siddhi query for our StreamingKMeansClustering by
>>> adding Alpha into the picture which we can use to make data horizon (how
>>> quickly a most recent data point becomes a part of the model) and data
>>> obsolescence (how long does it take a past data point to become irrelevant
>>> to the model)in the streaming clustering algorithms.i have added new
>>> changes to repo [1] introducing StreamingKMeansClusteringModel and
>>> StreamingKMeansCLustering classes to project.new siddhi query is as follows.
>>>
>>> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95 salary, rbi,
>>> walks, strikeouts, errors)
>>>
>>> select *
>>> insert into regResults;
>>>
>>> regrads,
>>> Mahesh.
>>>
>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>
>>> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-08 Thread Mahesh Dananjaya
Hi Maheshakya,
You can find the details of the queries in this README [1]. I have made
some changes, so the previous queries may no longer be valid. Please use
the new queries in the README.
*1. Streaming Linear Regression*
from LinRegInputStream#streaming:streaminglr((learnType),
(batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
(ci), salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

Example:

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95,
salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

*2. Streaming KMeans Clustering*
from LinRegInputStream#streaming:streamingkm((learnType),
(batchSize/timeFrame), (numClusters), (numIterations), (alpha), (ci),
salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

Example:

from
KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)
select *
insert into regResults;

I also need help returning the outputData of my program back to CEP.
Currently you will not find the stream output in the published events, but
you can see the output in the console. I want to understand the final step
of putting the output data back into the output stream once the batch is
full and the algorithm has finished. The following code throws an
exception; I have no clue what outputData format the output stream
expects.

Object[] outputData = streamingLinearRegression.regress(eventData);

if (outputData == null) {
    streamEventChunk.remove();
} else {
    complexEventPopulater.populateComplexEvent(complexEvent, outputData);
}

My outputData Object[] array is in the format
[mse, beta0, beta1, ..., betap], but CEP does not seem to understand it.
I wrote it by following the time series stream processor extension at [2].
Can you please help me with this?
Regards,
Mahesh.

[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
[2]
https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
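
One thing that may be worth checking, sketched here under assumptions: as
far as the extension linked at [2] suggests, a Siddhi StreamProcessor
declares its extra output attributes once, and the Object[] handed to
populateComplexEvent is expected to line up with those attributes one to
one. The sketch below uses no Siddhi classes; RegressionOutputShape and its
methods are hypothetical names that only illustrate the length check for an
array shaped like [mse, beta0, ..., betap].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it does not call any Siddhi API.
final class RegressionOutputShape {

    /** Attribute names for [mse, beta0, ..., betaP], where p is the number of predictors. */
    static List<String> outputAttributeNames(int p) {
        List<String> names = new ArrayList<>();
        names.add("mse");
        for (int i = 0; i <= p; i++) {
            names.add("beta" + i);
        }
        return names;
    }

    /** True when outputData has exactly one value per declared attribute. */
    static boolean matches(List<String> declared, Object[] outputData) {
        return outputData != null && outputData.length == declared.size();
    }

    public static void main(String[] args) {
        List<String> declared = outputAttributeNames(4);
        Object[] outputData = {0.12, 1.0, 2.0, 3.0, 4.0, 5.0};
        System.out.println(declared + " matches: " + matches(declared, outputData));
    }
}
```

If the declared attribute list and the array length differ, that mismatch
would be the first place to look before suspecting the value types.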

On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Great work so far.
>
> Regarding the queries:
>
> streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)
>
>
> Can you give me the definitions of the first few entities in the order.
> Also in previous supervised cases (linear regression), what is the response
> variable, etc.
> I'll go through the code and give you a feedback.
>
>  After this, we need to me this implementation into carbon-ml siddhi
> extension. Please also do a similar implementation for logistic regression
> as well because we need to have a streaming version for classification as
> well.
>
> Best regards.
>
>
>
> On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> I have changed the siddhi query for our StreamingKMeansClustering by
>> adding Alpha into the picture which we can use to make data horizon (how
>> quickly a most recent data point becomes a part of the model) and data
>> obsolescence (how long does it take a past data point to become irrelevant
>> to the model)in the streaming clustering algorithms.i have added new
>> changes to repo [1] introducing StreamingKMeansClusteringModel and
>> StreamingKMeansCLustering classes to project.new siddhi query is as follows.
>>
>> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95 salary, rbi,
>> walks, strikeouts, errors)
>>
>> select *
>> insert into regResults;
>>
>> regrads,
>> Mahesh.
>>
>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> As we have discussed the architecture of the project i have already
>>> developed a couple of essential components for our project. During last
>>> week i completed the writing cep siddhi extension for our streaming
>>> algorithms which are developed to learn incrementally with past
>>> experiences. I have written the siddhi extensions with StreamProcessor
>>> extension for StreamingLinearRegerssion and StreamingKMeansClustering with
>>> the relevant parameters to call it as siddhi query. On the other hand i did
>>> some research on developing Mini Batch KMeans clustering for our
>>> StreamingKMeansClustering. And also i added the moving window addition to
>>> usual batch processing. And currently i am working on the time based
>>> incremental  re-trainign method for siddhi streams. On t

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-07 Thread Mahesh Dananjaya
Hi Maheshakya,
I have changed the Siddhi query for our StreamingKMeansClustering by adding
alpha, which we can use to control the data horizon (how quickly the most
recent data point becomes part of the model) and data obsolescence (how
long it takes a past data point to become irrelevant to the model) in the
streaming clustering algorithm. I have pushed the new changes to the repo
[1], introducing the StreamingKMeansClusteringModel and
StreamingKMeansClustering classes to the project. The new Siddhi query is
as follows.

from Stream8Input#streaming:streamingkm(0, 2, 2, 20, 1, 0.95, salary, rbi,
walks, strikeouts, errors)
select *
insert into regResults;

Regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
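
To make the role of alpha concrete, here is a minimal sketch of one common
decay rule, the one described in the Spark MLlib streaming k-means
documentation. The thread does not say whether the classes in [1] use
exactly this rule, so treat it as an assumption; DecayedCentroid and update
are hypothetical names.

```java
// Hypothetical sketch of a decayed centroid update; not code from the repository.
final class DecayedCentroid {

    /**
     * c' = (c * n * alpha + x * m) / (n * alpha + m)
     *
     * c: current centroid, n: weight of the points already assigned to it,
     * x: mean of the new batch points assigned to it, m: number of those points,
     * alpha: decay factor in [0, 1]; 1 keeps all history, 0 forgets it completely.
     */
    static double[] update(double[] c, double n, double[] x, double m, double alpha) {
        double denominator = n * alpha + m;
        if (denominator == 0.0) {
            return c.clone(); // no history kept and no new points: leave the centroid unchanged
        }
        double[] updated = new double[c.length];
        for (int i = 0; i < c.length; i++) {
            updated[i] = (c[i] * n * alpha + x[i] * m) / denominator;
        }
        return updated;
    }

    public static void main(String[] args) {
        double[] centroid = {0.0};
        double[] batchMean = {2.0};
        System.out.println(update(centroid, 1, batchMean, 1, 1.0)[0]); // history and batch weighted equally
        System.out.println(update(centroid, 1, batchMean, 1, 0.0)[0]); // only the new batch counts
    }
}
```

With alpha close to 1 old points fade slowly (long data horizon); with
alpha close to 0 the model follows the latest batch almost entirely.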

On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshakya,
> As we have discussed the architecture of the project i have already
> developed a couple of essential components for our project. During last
> week i completed the writing cep siddhi extension for our streaming
> algorithms which are developed to learn incrementally with past
> experiences. I have written the siddhi extensions with StreamProcessor
> extension for StreamingLinearRegerssion and StreamingKMeansClustering with
> the relevant parameters to call it as siddhi query. On the other hand i did
> some research on developing Mini Batch KMeans clustering for our
> StreamingKMeansClustering. And also i added the moving window addition to
> usual batch processing. And currently i am working on the time based
> incremental  re-trainign method for siddhi streams. On the
> StreamingClustering side i have already part of th
> StreamingKMeansClustering with the mini batch KMeans clustering. All the
> work i did were pushed to my repo in github [1]. you can find the
> development on gsoc/ directory.
>  And also as the ml team and supun was asked, i have did some timing and
> performance analysis for our SGD (Stochastic Gradient Descent) algorithms
> for LinearRegression. Those results also add to my repo in [2]. Now i am
> developing the rest for our purpose and trying to looked into other
> researches on predictive analysis for online big data. Ans also doing some
> work related to mini batch KMEans Clustering. And also i have been working
> on the performance analysis, accuracy and basic comparison between mini
> batch algorithms and moving window algorithms for streaming and periodic
> re-training of ML model. thank you.
> BR,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016
> [2]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>
>
> On Sat, Jun 4, 2016 at 8:50 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> If you want to run it please use following queries.
>> 1. StreamingLInearRegression
>>
>> from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
>> walks, strikeouts, errors)
>>
>> select *
>>
>> insert into regResults;
>>
>> from Stream8Input#streaming:streamingkm(0, 2, 0.95,2,20, salary, rbi,
>> walks, strikeouts, errors)
>>
>> select *
>> insert into regResults;
>>
>> in both case the first parameter let you to decide which learning methos
>> you want, moving window, batch processing or time based model learning.
>> BR,
>> Mahesh.
>>
>> On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkaya,
>>> I have added the moving window method and update the previos
>>> StreamingLinearRegression [1] which only performed batch processing with
>>> streaming data. and also i added the StreamingKMeansClustering [1] for our
>>> purposes and debugged them.thank you.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>>>
>>> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga <sup...@wso2.com> wrote:
>>>
>>>> Thanks Mahesh! The graphs look promising! :)
>>>>
>>>> So by looking at graph, LR with SGD can train  a model within 60 secs
>>>> (6*10^10 nano sec), using about 900,000 data points . Means, this online
>>>> training can handle events/data points coming at rate of 15,000 per second
>>>> (or more) , if the batch size is set to 900,000 (or less) or window size is
>>>> set to 60 secs (or less). This is great IMO!
>>>>
>>>> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.c

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-06 Thread Mahesh Dananjaya
Hi Maheshakya,
As we discussed when going over the architecture of the project, I have
already developed a couple of essential components. Last week I finished
writing the CEP Siddhi extensions for our streaming algorithms, which are
designed to learn incrementally from past experience. I wrote them as
StreamProcessor extensions for StreamingLinearRegression and
StreamingKMeansClustering, with the relevant parameters so that they can be
called from a Siddhi query. I also did some research on developing
mini-batch KMeans clustering for our StreamingKMeansClustering, and added a
moving window option alongside the usual batch processing. I am currently
working on the time-based incremental re-training method for Siddhi
streams. On the StreamingClustering side, I have already implemented part
of StreamingKMeansClustering with mini-batch KMeans clustering. All the
work so far has been pushed to my GitHub repo [1]; you can find the
development in the gsoc/ directory.

As the ML team and Supun asked, I have also done some timing and
performance analysis of our SGD (Stochastic Gradient Descent) algorithm
for LinearRegression. Those results are also in my repo at [2]. I am now
developing the rest and looking into other research on predictive
analytics for online big data, and doing some work related to mini-batch
KMeans clustering. I have also been working on performance analysis,
accuracy, and a basic comparison between mini-batch and moving window
algorithms for streaming and periodic re-training of the ML model. Thank
you.
BR,
Mahesh.
[1] https://github.com/dananjayamahesh/GSOC2016
[2]
https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg


On Sat, Jun 4, 2016 at 8:50 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshkya,
> If you want to run it please use following queries.
> 1. StreamingLInearRegression
>
> from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
> walks, strikeouts, errors)
>
> select *
>
> insert into regResults;
>
> from Stream8Input#streaming:streamingkm(0, 2, 0.95,2,20, salary, rbi,
> walks, strikeouts, errors)
>
> select *
> insert into regResults;
>
> in both case the first parameter let you to decide which learning methos
> you want, moving window, batch processing or time based model learning.
> BR,
> Mahesh.
>
> On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkaya,
>> I have added the moving window method and update the previos
>> StreamingLinearRegression [1] which only performed batch processing with
>> streaming data. and also i added the StreamingKMeansClustering [1] for our
>> purposes and debugged them.thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>>
>> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga <sup...@wso2.com> wrote:
>>
>>> Thanks Mahesh! The graphs look promising! :)
>>>
>>> So by looking at graph, LR with SGD can train  a model within 60 secs
>>> (6*10^10 nano sec), using about 900,000 data points . Means, this online
>>> training can handle events/data points coming at rate of 15,000 per second
>>> (or more) , if the batch size is set to 900,000 (or less) or window size is
>>> set to 60 secs (or less). This is great IMO!
>>>
>>> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> As you requested i can change other parameters as well such as feature
>>>> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
>>>> the code if you want. source is at [1]. the test timing is called with
>>>> random data as you requested if you set args[0] to 1. And you can find the
>>>> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
>>>> BR,
>>>> Mahesh.
>>>> [1]
>>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
>>>> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>>
>>>> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi supun,
>>>>> Though i pushed it yesterday, there was some problems with the
>>>>> network. now you can see them in the repo location [1].I a

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Mahesh Dananjaya
Hi Maheshakya,
If you want to run it, please use the following queries.
1. StreamingLinearRegression

from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
walks, strikeouts, errors)
select *
insert into regResults;

2. StreamingKMeansClustering

from Stream8Input#streaming:streamingkm(0, 2, 0.95, 2, 20, salary, rbi,
walks, strikeouts, errors)
select *
insert into regResults;

In both cases the first parameter lets you choose the learning method:
moving window, batch processing, or time-based model learning.
BR,
Mahesh.
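
The difference between two of these learning methods can be sketched as
follows. This is an illustration, not the extension's code: TrainingBuffer,
its Mode names and the choice to return null until enough events have
arrived are all assumptions, and the thread does not say which integer
value of the first parameter selects which method. Time-based learning is
left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical buffer showing batch processing versus a moving window.
final class TrainingBuffer {

    enum Mode { BATCH, MOVING_WINDOW } // hypothetical names

    private final Mode mode;
    private final int size;
    private final Deque<double[]> events = new ArrayDeque<>();

    TrainingBuffer(Mode mode, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        this.mode = mode;
        this.size = size;
    }

    /** Adds an event and returns the events to train on, or null if training should wait. */
    List<double[]> add(double[] event) {
        events.addLast(event);
        if (events.size() < size) {
            return null;
        }
        List<double[]> snapshot = new ArrayList<>(events);
        if (mode == Mode.BATCH) {
            events.clear();       // start collecting a fresh batch
        } else {
            events.removeFirst(); // slide the window forward by one event
        }
        return snapshot;
    }
}
```

In batch mode a model update happens once every `size` events; with a
moving window it happens on every event once the window is full, always
over the latest `size` events.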

On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshkaya,
> I have added the moving window method and update the previos
> StreamingLinearRegression [1] which only performed batch processing with
> streaming data. and also i added the StreamingKMeansClustering [1] for our
> purposes and debugged them.thank you.
> regards,
> Mahesh.
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>
> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> Thanks Mahesh! The graphs look promising! :)
>>
>> So by looking at graph, LR with SGD can train  a model within 60 secs
>> (6*10^10 nano sec), using about 900,000 data points . Means, this online
>> training can handle events/data points coming at rate of 15,000 per second
>> (or more) , if the batch size is set to 900,000 (or less) or window size is
>> set to 60 secs (or less). This is great IMO!
>>
>> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> As you requested i can change other parameters as well such as feature
>>> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
>>> the code if you want. source is at [1]. the test timing is called with
>>> random data as you requested if you set args[0] to 1. And you can find the
>>> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
>>> BR,
>>> Mahesh.
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
>>> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>
>>> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi supun,
>>>> Though i pushed it yesterday, there was some problems with the network.
>>>> now you can see them in the repo location [1].I added some Matlab plot you
>>>> can see the patter there.you can use ml also. Ok sure thing. I can prepare
>>>> a report or else blog if you want. files are as follows. The y axis is in
>>>> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
>>>> you can easily compare.
>>>> lr_timing_1000.txt -> batch size incremented by 1000
>>>> lr_timing_1.txt -> batch size incremented by 1
>>>> lr_timing_power10.txt -> batch size incremented by power of 10
>>>>
>>>> In here independent variable is only tha batch size.If you want i can
>>>> send you making other parameters such as step size, number of iteration,
>>>> feature vector size as independent variables. please let me know if you
>>>> want further info. thank you.
>>>> regards,
>>>> Mahesh.
>>>>
>>>>
>>>> [1]
>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>>> [2]
>>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>>>
>>>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> I have added those timing reports to my repo [1].
>>>>>
>>>>> Whats the file name? :)
>>>>>
>>>>> Btw, can you compile simple doc (gdoc) with the above results, and
>>>>> bring everything to one place? That way it is easy to compare, and keep
>>>>> track.
>>>>>
>>>>> Thanks,
>>>>> Supun
>>>>>
>>>>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshkya,
>>>>>> I have added those timing reports to my

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Mahesh Dananjaya
Hi Maheshakya,
I have added the moving window method and updated the previous
StreamingLinearRegression [1], which only performed batch processing on
streaming data. I also added StreamingKMeansClustering [1] for our
purposes and debugged both. Thank you.
Regards,
Mahesh.
[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming

On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Thanks Mahesh! The graphs look promising! :)
>
> So by looking at graph, LR with SGD can train  a model within 60 secs
> (6*10^10 nano sec), using about 900,000 data points . Means, this online
> training can handle events/data points coming at rate of 15,000 per second
> (or more) , if the batch size is set to 900,000 (or less) or window size is
> set to 60 secs (or less). This is great IMO!
>
> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> As you requested i can change other parameters as well such as feature
>> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
>> the code if you want. source is at [1]. the test timing is called with
>> random data as you requested if you set args[0] to 1. And you can find the
>> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
>> BR,
>> Mahesh.
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
>> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi supun,
>>> Though i pushed it yesterday, there was some problems with the network.
>>> now you can see them in the repo location [1].I added some Matlab plot you
>>> can see the patter there.you can use ml also. Ok sure thing. I can prepare
>>> a report or else blog if you want. files are as follows. The y axis is in
>>> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
>>> you can easily compare.
>>> lr_timing_1000.txt -> batch size incremented by 1000
>>> lr_timing_1.txt -> batch size incremented by 1
>>> lr_timing_power10.txt -> batch size incremented by power of 10
>>>
>>> In here independent variable is only tha batch size.If you want i can
>>> send you making other parameters such as step size, number of iteration,
>>> feature vector size as independent variables. please let me know if you
>>> want further info. thank you.
>>> regards,
>>> Mahesh.
>>>
>>>
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>> [2]
>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>>
>>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> I have added those timing reports to my repo [1].
>>>>
>>>> Whats the file name? :)
>>>>
>>>> Btw, can you compile simple doc (gdoc) with the above results, and
>>>> bring everything to one place? That way it is easy to compare, and keep
>>>> track.
>>>>
>>>> Thanks,
>>>> Supun
>>>>
>>>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshkya,
>>>>> I have added those timing reports to my repo [1].please have a look
>>>>> at. three files are there. one is using incremet as 1000 for batch sizes
>>>>> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
>>>>> up to 1 million in both scenarios. You can see the reports and figures
>>>>> in the
>>>>> location [2] in the repo. i also added the streaminglinearregression
>>>>> classes in the repo gsoc folder.thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>>> [2]
>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>>>>
>>>>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Mahesh Dananjaya
Hi Maheshakya,
I have looked into the Spark Streaming fundamentals and k-means clustering
in order to develop streaming k-means clustering for stream data. The
references are at [1] and [2]. I will commit new changes to my repo today,
including the basic implementation of streaming k-means clustering. Thank
you.
Regards,
Mahesh.
[1] http://spark.apache.org/docs/latest/streaming-programming-guide.html
[2] http://spark.apache.org/docs/latest/mllib-clustering.html

On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> As you requested i can change other parameters as well such as feature
> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
> the code if you want. source is at [1]. the test timing is called with
> random data as you requested if you set args[0] to 1. And you can find the
> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
> BR,
> Mahesh.
> [1]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi supun,
>> Though i pushed it yesterday, there was some problems with the network.
>> now you can see them in the repo location [1].I added some Matlab plot you
>> can see the patter there.you can use ml also. Ok sure thing. I can prepare
>> a report or else blog if you want. files are as follows. The y axis is in
>> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
>> you can easily compare.
>> lr_timing_1000.txt -> batch size incremented by 1000
>> lr_timing_1.txt -> batch size incremented by 1
>> lr_timing_power10.txt -> batch size incremented by power of 10
>>
>> In here independent variable is only tha batch size.If you want i can
>> send you making other parameters such as step size, number of iteration,
>> feature vector size as independent variables. please let me know if you
>> want further info. thank you.
>> regards,
>> Mahesh.
>>
>>
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>
>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga <sup...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> I have added those timing reports to my repo [1].
>>>
>>> Whats the file name? :)
>>>
>>> Btw, can you compile simple doc (gdoc) with the above results, and bring
>>> everything to one place? That way it is easy to compare, and keep track.
>>>
>>> Thanks,
>>> Supun
>>>
>>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshkya,
>>>> I have added those timing reports to my repo [1].please have a look at.
>>>> three files are there. one is using incremet as 1000 for batch sizes
>>>> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
>>>> upto 1 million in both scenarios.you can see the reports and figures in the
>>>> location [2] in the repo. i also added the streaminglinearregression
>>>> classes in the repo gsoc folder.thank you.
>>>> regards,
>>>> Mahesh.
>>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>> [2]
>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>>>
>>>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Thank you for the update. I will look into your implementation.
>>>>>
>>>>> And i will be able to send you the timing/performances analysis report
>>>>>> tomorrow for the SGD functions
>>>>>>
>>>>>
>>>>> Great. Sent those asap so that we can proceed.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> Hi maheshakay,
>>>>>> I have implemented the linear regression with cep siddhi event stream

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Mahesh Dananjaya
Hi Maheshakya,
Sure thing. As you requested, I can vary other parameters as well, such as
the feature size (p); initially I did it with p=3. You can look at and run
the code if you want; the source is at [1]. The timing test runs with
random data, as you requested, if you set args[0] to 1. You can find the
extension and streaming algorithms in the gsoc/ directory [2]. Thank you.
BR,
Mahesh.
[1]
https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
[2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc

On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Supun,
> Although I pushed it yesterday, there were some problems with the network.
> Now you can see the files in the repo location [1]. I added some Matlab
> plots, so you can see the pattern there. You can use ML as well. OK, sure
> thing: I can prepare a report or a blog post if you want. The files are
> listed below. The y axis is time in ns and the x axis is the batch size. I
> also added two plots as JPEGs [2], so you can compare easily.
> lr_timing_1000.txt -> batch size incremented by 1000
> lr_timing_1.txt -> batch size incremented by 1
> lr_timing_power10.txt -> batch size incremented by powers of 10
>
> Here the only independent variable is the batch size. If you want, I can
> also send results with other parameters, such as step size, number of
> iterations, and feature vector size, as the independent variables. Please
> let me know if you want further information. Thank you.
> regards,
> Mahesh.
>
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
> [2]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>
> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga <sup...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> I have added those timing reports to my repo [1].
>>
>> What's the file name? :)
>>
>> Btw, can you compile a simple doc (gdoc) with the above results, and bring
>> everything to one place? That way it is easy to compare and keep track.
>>
>> Thanks,
>> Supun
>>
>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have added those timing reports to my repo [1]. Please have a look.
>>> There are three files. One uses an increment of 1000 for the batch size
>>> (lr_timing_1000). Another uses an increment of 1 (lr_timing_1). Both go
>>> up to 1 million. You can see the reports and figures in
>>> location [2] in the repo. I also added the streaming linear regression
>>> classes to the gsoc folder in the repo. Thank you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>> [2]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>>
>>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Thank you for the update. I will look into your implementation.
>>>>
>>>> And I will be able to send you the timing/performance analysis report
>>>>> tomorrow for the SGD functions
>>>>>
>>>>
>>>> Great. Send those ASAP so that we can proceed.
>>>>
>>>> Best regards.
>>>>
>>>> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>>
>>>>> Hi Maheshakya,
>>>>> I have implemented linear regression on the CEP Siddhi event stream,
>>>>> taking the batch size as a parameter from CEP. Now we can try the
>>>>> moving window method too. Before that, I think I should get your opinion
>>>>> on the data structures for storing the streaming data. Please check the
>>>>> /gsoc/ folder in my repo [1]; you can find everything new that I added
>>>>> there. The extensions are in the extension folder. I will be able to
>>>>> send you the timing/performance analysis report for the SGD functions
>>>>> tomorrow. Thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>>>
>>>>>
>>>>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Mahesh Dananjaya
Hi Supun,
Although I pushed it yesterday, there were some problems with the network. Now
you can see the files in the repo location [1]. I added some Matlab plots, so
you can see the pattern there. You can use ML as well. OK, sure thing: I can
prepare a report or a blog post if you want. The files are listed below. The
y axis is time in ns and the x axis is the batch size. I also added two plots
as JPEGs [2], so you can compare easily.
lr_timing_1000.txt -> batch size incremented by 1000
lr_timing_1.txt -> batch size incremented by 1
lr_timing_power10.txt -> batch size incremented by powers of 10

Here the only independent variable is the batch size. If you want, I can also
send results with other parameters, such as step size, number of iterations,
and feature vector size, as the independent variables. Please let me know if
you want further information. Thank you.
regards,
Mahesh.


[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
[2]
https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
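
For readers without access to the repo, the kind of measurement described in this message (batch size as the only independent variable, elapsed time in ns as the result) can be sketched roughly as below. This is only an illustrative Python sketch under stated assumptions, not the Java/Spark code from the repo: the function names (make_batch, sgd_epoch, time_batch_sizes), the random data and the default parameters are all hypothetical.

```python
import random
import time


def make_batch(batch_size, num_features, rng):
    """Build synthetic (features, label) pairs; the data are placeholders."""
    return [([rng.random() for _ in range(num_features)], rng.random())
            for _ in range(batch_size)]


def sgd_epoch(weights, batch, step_size):
    """One pass of plain SGD for least-squares linear regression."""
    for x, y in batch:
        error = sum(w * xi for w, xi in zip(weights, x)) - y
        weights = [w - step_size * error * xi for w, xi in zip(weights, x)]
    return weights


def time_batch_sizes(batch_sizes, num_features=3, step_size=0.01, seed=0):
    """Return one (batch_size, elapsed_ns) row per batch size."""
    rng = random.Random(seed)
    rows = []
    for size in batch_sizes:
        batch = make_batch(size, num_features, rng)
        start = time.perf_counter_ns()
        sgd_epoch([0.0] * num_features, batch, step_size)
        rows.append((size, time.perf_counter_ns() - start))
    return rows


if __name__ == "__main__":
    # Batch sizes growing by powers of 10 (an assumed range for illustration).
    for size, ns in time_batch_sizes([10 ** k for k in range(1, 5)]):
        print(f"{size}\t{ns}")
```

Making step size, iteration count or feature vector size the independent variable would only require looping over that parameter instead of the batch size.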

On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi Mahesh,
>
> I have added those timing reports to my repo [1].
>
> What's the file name? :)
>
> Btw, can you compile a simple doc (gdoc) with the above results, and bring
> everything to one place? That way it is easy to compare and keep track.
>
> Thanks,
> Supun
>
> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have added those timing reports to my repo [1]. Please have a look.
>> There are three files. One uses an increment of 1000 for the batch size
>> (lr_timing_1000). Another uses an increment of 1 (lr_timing_1). Both go
>> up to 1 million. You can see the reports and figures in
>> location [2] in the repo. I also added the streaming linear regression
>> classes to the gsoc folder in the repo. Thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>
>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Thank you for the update. I will look into your implementation.
>>>
>>> And I will be able to send you the timing/performance analysis report
>>>> tomorrow for the SGD functions
>>>>
>>>
>>> Great. Send those ASAP so that we can proceed.
>>>
>>> Best regards.
>>>
>>> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>>
>>>> Hi Maheshakya,
>>>> I have implemented linear regression on the CEP Siddhi event stream,
>>>> taking the batch size as a parameter from CEP. Now we can try the
>>>> moving window method too. Before that, I think I should get your opinion on
>>>> the data structures for storing the streaming data. Please check the /gsoc/
>>>> folder in my repo [1]; you can find everything new that I added there. The
>>>> extensions are in the extension folder. I will be able to send you the
>>>> timing/performance analysis report for the SGD functions tomorrow. Thank
>>>> you.
>>>> regards,
>>>> Mahesh.
>>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>>
>>>>
>>>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I have written some Siddhi extensions and am trying to develop one of
>>>>> my own. For the time series example in [1], can you please explain the
>>>>> input format and the query lines so that I can understand them?
>>>>>
>>>>> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi,
>>>>> walks, strikeouts, errors)
>>>>> select *
>>>>> insert into regResults;
>>>>>
>>>>> I just want to know how to feed a set of data into this extension and
>>>>> what baseballData is. Is it an ordinary input stream, or a data file? How
>>>>> can I find that data set to create a dummy input stream like baseballData?
>>>>>
>>>>> thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1]
>>>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Mahesh Dananjaya
Hi Maheshakya,
I have added those timing reports to my repo [1]. Please have a look.
There are three files. One uses an increment of 1000 for the batch size
(lr_timing_1000). Another uses an increment of 1 (lr_timing_1). Both go
up to 1 million. You can see the reports and figures in
location [2] in the repo. I also added the streaming linear regression
classes to the gsoc folder in the repo. Thank you.
regards,
Mahesh.
[1]https://github.com/dananjayamahesh/GSOC2016
[2]
https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output

On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Thank you for the update. I will look into your implementation.
>
> And I will be able to send you the timing/performance analysis report
>> tomorrow for the SGD functions
>>
>
> Great. Send those ASAP so that we can proceed.
>
> Best regards.
>
> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> Hi Maheshakya,
>> I have implemented linear regression on the CEP Siddhi event stream,
>> taking the batch size as a parameter from CEP. Now we can try the
>> moving window method too. Before that, I think I should get your opinion on
>> the data structures for storing the streaming data. Please check the /gsoc/
>> folder in my repo [1]; you can find everything new that I added there. The
>> extensions are in the extension folder. I will be able to send you the
>> timing/performance analysis report for the SGD functions tomorrow. Thank
>> you.
>> regards,
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>>
>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have written some Siddhi extensions and am trying to develop one of my
>>> own. For the time series example in [1], can you please explain the input
>>> format and the query lines so that I can understand them?
>>>
>>> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi,
>>> walks, strikeouts, errors)
>>> select *
>>> insert into regResults;
>>>
>>> I just want to know how to feed a set of data into this extension and
>>> what baseballData is. Is it an ordinary input stream, or a data file? How
>>> can I find that data set to create a dummy input stream like baseballData?
>>>
>>> thank you.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>>
>>> On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> Today I got the Siddhi source and debugged the math extension. Then I
>>>> made some changes and checked them. Now I am trying to write the same
>>>> kind of extension in my own code base. I added the dependencies and it
>>>> built fine. I am now trying to debug my extension the same way as in the
>>>> previous case. CEP is sending data, but my extension does not hit the
>>>> relevant breakpoint.
>>>> 1. So how can I debug my new Siddhi extension? (You can see it in my
>>>> example repo.)
>>>>
>>>> I think that if I do this correctly we can build the extension for our
>>>> purpose. I will also send the timing report for the SGD algorithms very
>>>> soon, as Supun asked. Thank you.
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Also note that there is a calculation interval in the Siddhi time
>>>>> series regression function [1]. You may be able to get some insight
>>>>> into this from that as well.
>>>>>
>>>>> [1] https://docs.wso2.com/display/CEP400/Regression
>>>>>
>>>>> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> As we discussed offline, we can use similar mechanism to train linear
>>>>>> regression models, logistic regression models and k-means clustering 
>>>>>> models.
>>>>>>
>>>>>> It is very interesting that i have found that somethings that c

[Dev] Fwd: Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-29 Thread Mahesh Dananjaya
Hi Maheshakya,
I have implemented linear regression on the CEP Siddhi event stream, taking
the batch size as a parameter from CEP. Now we can try the moving window
method too. Before that, I think I should get your opinion on the data
structures for storing the streaming data. Please check the /gsoc/ folder in
my repo [1]; you can find everything new that I added there. The extensions
are in the extension folder. I will be able to send you the
timing/performance analysis report for the SGD functions tomorrow. Thank
you.
regards,
Mahesh.
[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc


On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have written some Siddhi extensions and am trying to develop one of my
> own. For the time series example in [1], can you please explain the input
> format and the query lines so that I can understand them?
>
> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks,
> strikeouts, errors)
> select *
> insert into regResults;
>
> I just want to know how to feed a set of data into this extension and what
> baseballData is. Is it an ordinary input stream, or a data file? How can I
> find that data set to create a dummy input stream like baseballData?
>
> thank you.
> regards,
> Mahesh.
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>
> On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Today I got the Siddhi source and debugged the math extension. Then I made
>> some changes and checked them. Now I am trying to write the same kind of
>> extension in my own code base. I added the dependencies and it built fine.
>> I am now trying to debug my extension the same way as in the previous case.
>> CEP is sending data, but my extension does not hit the relevant breakpoint.
>> 1. So how can I debug my new Siddhi extension? (You can see it in my
>> example repo.)
>>
>> I think that if I do this correctly we can build the extension for our
>> purpose. I will also send the timing report for the SGD algorithms very
>> soon, as Supun asked. Thank you.
>> regards,
>> Mahesh.
>>
>> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Also note that there is a calculation interval in the Siddhi time series
>>> regression function [1]. You may be able to get some insight into this
>>> from that as well.
>>>
>>> [1] https://docs.wso2.com/display/CEP400/Regression
>>>
>>> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> As we discussed offline, we can use similar mechanism to train linear
>>>> regression models, logistic regression models and k-means clustering 
>>>> models.
>>>>
>>>> It is very interesting that I have found something that we can make
>>>>> use of in our work. In the CEP 4.0 documentation there is a Custom Stream
>>>>> Processor Extension program [1]. There is an example of
>>>>> LinearRegressionStreamProcessor [1].
>>>>>
>>>>
>>>> As we have to train predictive models with Spark, you can write
>>>> wrappers around regression/clustering models of Spark. Refer to Siddhi time
>>>> series regression source codes[1][2]. You can write a streaming linear
>>>> regression class for ML in a similar fashion by wrapping Spark mllib
>>>> implementations. You can use the methods "addEvent", "removeEvent", etc.
>>>> (may have to be changed according to requirements) for the similar purpose.
>>>> You can introduce trainLinearRegression/LogisticRegression/Kmeans which
>>>> does a similar thing as in createLinearRegression in those time series
>>>> functions. In the processData method you can use Spark mllib classes to
>>>> actually train models and return the model weights, evaluation metrics. So,
>>>> converting streams into RDDs and retrieving information from the trained
>>>> models shall happen in this method.
>>>>
>>>> In the stream processor extension example, you can retrieve those
>>>> values and then use them to train new models with new batches.
>>>> Weights/cluster centers may be passed as initialization parameters for
>>>> the wrappers.
>>>>
>>>> Please note that we have to figure out the best siddhi extension type
>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-27 Thread Mahesh Dananjaya
Hi Maheshakya,
I have written some Siddhi extensions and am trying to develop one of my
own. For the time series example in [1], can you please explain the input
format and the query lines so that I can understand them?

from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks,
strikeouts, errors)
select *
insert into regResults;

I just want to know how to feed a set of data into this extension and what
baseballData is. Is it an ordinary input stream, or a data file? How can I
find that data set to create a dummy input stream like baseballData?

thank you.
regards,
Mahesh.
[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension

On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Today I got the Siddhi source and debugged the math extension. Then I made
> some changes and checked them. Now I am trying to write the same kind of
> extension in my own code base. I added the dependencies and it built fine.
> I am now trying to debug my extension the same way as in the previous case.
> CEP is sending data, but my extension does not hit the relevant breakpoint.
> 1. So how can I debug my new Siddhi extension? (You can see it in my
> example repo.)
>
> I think that if I do this correctly we can build the extension for our
> purpose. I will also send the timing report for the SGD algorithms very
> soon, as Supun asked. Thank you.
> regards,
> Mahesh.
>
> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Also note that there is a calculation interval in the Siddhi time series
>> regression function [1]. You may be able to get some insight into this
>> from that as well.
>>
>> [1] https://docs.wso2.com/display/CEP400/Regression
>>
>> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> As we discussed offline, we can use similar mechanism to train linear
>>> regression models, logistic regression models and k-means clustering models.
>>>
>>> It is very interesting that I have found something that we can make
>>>> use of in our work. In the CEP 4.0 documentation there is a Custom Stream
>>>> Processor Extension program [1]. There is an example of
>>>> LinearRegressionStreamProcessor [1].
>>>>
>>>
>>> As we have to train predictive models with Spark, you can write wrappers
>>> around regression/clustering models of Spark. Refer to Siddhi time series
>>> regression source codes[1][2]. You can write a streaming linear regression
>>> class for ML in a similar fashion by wrapping Spark mllib implementations.
>>> You can use the methods "addEvent", "removeEvent", etc. (may have to be
>>> changed according to requirements) for the similar purpose. You can
>>> introduce trainLinearRegression/LogisticRegression/Kmeans which does a
>>> similar thing as in createLinearRegression in those time series functions.
>>> In the processData method you can use Spark mllib classes to actually train
>>> models and return the model weights, evaluation metrics. So, converting
>>> streams into RDDs and retrieving information from the trained models shall
>>> happen in this method.
>>>
>>> In the stream processor extension example, you can retrieve those values
>>> and then use them to train new models with new batches. Weights/cluster
>>> centers may be passed as initialization parameters for the wrappers.
>>>
>>> Please note that we have to figure out the best siddhi extension type
>>> for this process. In the siddhi query, we define batch size, type of
>>> algorithm and number of features (there can be more). After batch size
>>> number of events received, train a model and save parameters, return
>>> evaluation metric. With the next batch, retrain the model initialized with
>>> previously learned parameters.
>>>
>>> We may also need to test the same scenario with a moving window, but I
>>> suspect that approach may become too slow, since a model is trained each
>>> time an event is received. So, we may have to change the number of slots
>>> the moving window moves at a time (e.g. not one by one, but ten by ten).
>>>
>>> Once this is resolved, the majority of the research part will be finished
>>> and all that will be left to do is implement wrappers around the 3
>>> learning algorithms we consider.
>>>
>>> Best regards.
>>>
>>> [1]
>>> https://github.com/wso2/si

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-26 Thread Mahesh Dananjaya
Hi Maheshakya,
Today I got the Siddhi source and debugged the math extension. Then I made
some changes and checked them. Now I am trying to write the same kind of
extension in my own code base. I added the dependencies and it built fine.
I am now trying to debug my extension the same way as in the previous case.
CEP is sending data, but my extension does not hit the relevant breakpoint.
1. So how can I debug my new Siddhi extension? (You can see it in my
example repo.)

I think that if I do this correctly we can build the extension for our purpose.
I will also send the timing report for the SGD algorithms very soon, as
Supun asked. Thank you.
regards,
Mahesh.

On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Also note that there is a calculation interval in the Siddhi time series
> regression function [1]. You may be able to get some insight into this
> from that as well.
>
> [1] https://docs.wso2.com/display/CEP400/Regression
>
> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> As we discussed offline, we can use similar mechanism to train linear
>> regression models, logistic regression models and k-means clustering models.
>>
>> It is very interesting that I have found something that we can make
>>> use of in our work. In the CEP 4.0 documentation there is a Custom Stream
>>> Processor Extension program [1]. There is an example of
>>> LinearRegressionStreamProcessor [1].
>>>
>>
>> As we have to train predictive models with Spark, you can write wrappers
>> around regression/clustering models of Spark. Refer to Siddhi time series
>> regression source codes[1][2]. You can write a streaming linear regression
>> class for ML in a similar fashion by wrapping Spark mllib implementations.
>> You can use the methods "addEvent", "removeEvent", etc. (may have to be
>> changed according to requirements) for the similar purpose. You can
>> introduce trainLinearRegression/LogisticRegression/Kmeans which does a
>> similar thing as in createLinearRegression in those time series functions.
>> In the processData method you can use Spark mllib classes to actually train
>> models and return the model weights, evaluation metrics. So, converting
>> streams into RDDs and retrieving information from the trained models shall
>> happen in this method.
>>
>> In the stream processor extension example, you can retrieve those values
>> and then use them to train new models with new batches. Weights/cluster
>> centers may be passed as initialization parameters for the wrappers.
>>
>> Please note that we have to figure out the best siddhi extension type for
>> this process. In the siddhi query, we define batch size, type of algorithm
>> and number of features (there can be more). After batch size number of
>> events received, train a model and save parameters, return evaluation
>> metric. With the next batch, retrain the model initialized with previously
>> learned parameters.
>>
>> We may also need to test the same scenario with a moving window, but I
>> suspect that approach may become too slow, since a model is trained each
>> time an event is received. So, we may have to change the number of slots
>> the moving window moves at a time (e.g. not one by one, but ten by ten).
>>
>> Once this is resolved, the majority of the research part will be finished
>> and all that will be left to do is implement wrappers around the 3 learning
>> algorithms we consider.
>>
>> Best regards.
>>
>> [1]
>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
>> [2]
>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java
>>
>>
>> On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Shall we use [1] for our work? I am checking the possibility.
>>> BR,
>>> Mahesh.
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>> [2]
>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>> [3]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>
>>> On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>

[Dev] [ml][cep][gsoc-6] Initial Project Meeting-Predictive Analysis with online data

2016-05-25 Thread Mahesh Dananjaya
Hi Maheshakya,
This is the one I need to post. Please check.

Today we had an initial project meeting with the WSO2 ML team. We discussed
the architecture, the best approaches and the scope of the entire project. As
I understood it, there will be three main components in the design:
1. CEP Siddhi extension for the CEP streaming data interface
2. Core with sample data points and the most recently saved model
3. Apache Spark MLlib algorithm wrapper

Since we need to support streaming data from CEP Siddhi, the first thing I
need to do is develop a Siddhi extension that brings event streams into my
program. As we discussed, the best approach is a CEP Siddhi extension of the
Stream Processor type, which feeds CEP event streams into the program so
that it can learn the model incrementally for prediction and analysis [1].
The extension will be the interface through which CEP sends data to the
program. There can be different interfaces for different applications that
use the program. The stream data from CEP is taken in batches rather than
as single events. There will also be output from my program back to CEP:
the latest model information and parameters such as MSE, cluster centers,
etc. Beyond that, there are two approaches that we have not yet chosen
between.

1. Collect a batch of size K from the incoming data, learn the model with
that mini-batch and store the model. In this case the memory requirement
depends on K and the number of features per event. This way we can also
handle streaming concerns such as data obsolescence and the data horizon,
keeping relevant data in model training and dropping irrelevant data.
2. Collect data into a large memory buffer and use a moving window of size K
with shift n, where n is the number of points by which the window moves
after each calculation. In this case we need a large amount of memory.
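
The difference between the two approaches above can be illustrated with a small sketch. This is plain Python rather than Siddhi or Java, and the function names (fixed_batches, sliding_windows) are hypothetical; it only shows which events each training call would see, under the assumption that n is no larger than K.

```python
from collections import deque


def fixed_batches(events, k):
    """Approach 1: yield disjoint batches of k events.

    A trailing batch with fewer than k events is not emitted.
    """
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == k:
            yield batch
            batch = []


def sliding_windows(events, k, n):
    """Approach 2: yield the last k events after every n new events.

    Assumes n <= k, so the first window is emitted as soon as it is full.
    """
    window = deque(maxlen=k)
    since_last = 0
    for event in events:
        window.append(event)
        since_last += 1
        if len(window) == k and since_last >= n:
            yield list(window)
            since_last = 0


if __name__ == "__main__":
    print(list(fixed_batches(range(7), 3)))
    print(list(sliding_windows(range(10), 4, 2)))
```

Note that this sketch keeps only the last K events in a deque, which is one possible way to bound memory; the meeting notes above assume a larger in-memory buffer for the moving-window approach.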

Another approach that was raised is to store the events/data points in a
database and use them later. As we discussed, there can be two approaches to
sending the updated/learned model to the customer side: time based and size
based. That is, there can be a time window (one day, one week, etc.) or a
batch size (or both) in the K-size batch approach.

Then there is the other component, the wrapper around the Spark MLlib
SGD-based algorithms for incremental learning. As I understand it, there will
be memory constraints and other constraints when we incrementally learn models
from the stream data coming out of CEP, mainly from the machine on which CEP
is deployed. Therefore we need to look into timing and performance when these
algorithms are run frequently on large data sets over time.
Initially we are supposed to develop the extension for CEP/Siddhi to get the
stream data/events/sample points. After that we can move on to the MLlib
libraries. We now have three algorithms (linear regression, k-means
clustering and logistic regression), though initially we will look only at
the first two. So this week will be spent developing that extension. Thank you.
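
The idea of incremental learning with an SGD-based algorithm, where each new mini-batch continues from the previously learned weights and an evaluation metric such as MSE is reported back, can be sketched as follows. This is a minimal plain-Python stand-in, not the Spark MLlib API and not the planned wrapper; the class name IncrementalLinearRegression, its parameters and the synthetic data are assumptions made for illustration.

```python
class IncrementalLinearRegression:
    """Least-squares linear model trained by SGD, one mini-batch at a time.

    Hypothetical stand-in for a wrapper around an SGD-based learner; it does
    not use or mimic the Spark MLlib API.
    """

    def __init__(self, num_features, step_size=0.1, iterations=50):
        self.weights = [0.0] * num_features
        self.step_size = step_size
        self.iterations = iterations

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x))

    def mse(self, batch):
        return sum((self.predict(x) - y) ** 2 for x, y in batch) / len(batch)

    def train_batch(self, batch):
        """Continue training from the current weights; return MSE on the batch."""
        for _ in range(self.iterations):
            for x, y in batch:
                error = self.predict(x) - y
                self.weights = [w - self.step_size * error * xi
                                for w, xi in zip(self.weights, x)]
        return self.mse(batch)


if __name__ == "__main__":
    model = IncrementalLinearRegression(num_features=2)
    # Synthetic data generated from y = 2*x0 - 1*x1, split into two mini-batches.
    batches = [
        [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)],
        [([1.0, 1.0], 1.0), ([0.5, 0.5], 0.5)],
    ]
    for batch in batches:
        print(model.train_batch(batch), model.weights)
```

Because the weights persist between calls, only the current mini-batch has to be held in memory, which is the property the K-size batch approach relies on.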

regards,
Mahesh.

[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] [cep][ml][gsoc-6]Capturing event stream with a specified window size for ml

2016-05-23 Thread Mahesh Dananjaya
Hi Maheshakya,
OK, then we have output data too. I thought that data would not be sent
back to CEP. In that case we can easily send that information back. Thank you
for the correction.
regards,
Mahesh.

On Mon, May 23, 2016 at 3:09 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Actually, IMO, there should be an output for trained model, which is the
> evaluation metric; for linear regression, MSE and for logistic regression,
> accuracy. For clustering, it could be cluster centers.
> That way, it's possible to examine how model behaves with data.
>
> Best regards.
>
> On Mon, May 23, 2016 at 3:06 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Suho,
>> In my project, machine learning models are trained incrementally.
>> Therefore real-time data enters my design as mini-batches of data
>> sample points. Since we are developing these models for the CEP Siddhi
>> processor, we want to get the mini-batches of data points into my
>> algorithms from the Siddhi processor. In my case the input is a stream of
>> K (batch/window size) sample points bundled together: in the linear
>> regression case, all the independent and dependent variables; in the
>> k-means case, the whole feature vector (data sample). These arrive not
>> only as single sample points (window size = 1) but also as mini-batches
>> (window size = N) of sample points (stream data). In my case there won't
>> be an output stream. The model will be kept, so prediction can still be
>> done with it. I also looked into [1]. Thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>
>> On Mon, May 23, 2016 at 2:48 PM, Sriskandarajah Suhothayan <s...@wso2.com
>> > wrote:
>>
>>> Hi Mahesh
>>>
>>> Can you explain the expected input to your extension and the expected
>>> output. Then we can help you to find the proper Siddhi extension to use.
>>>
>>> Regards
>>> Suho
>>>
>>> On Sat, May 21, 2016 at 11:44 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>> I am currently working on the GSoC project "Predictive analytics with
>>>> online data for WSO2 Machine Learner
>>>> <https://docs.wso2.com/display/GSoC/Project+Proposals+for+2016#ProjectProposalsfor2016-Proposal6:[ML]PredictiveanalyticswithonlinedataforWSO2MachineLearner>"
>>>> with WSO2 ML and a CEP extension for acquiring streams of events (sample
>>>> data points, in ML terms) from the CEP Siddhi processor. I am trying to
>>>> write a CEP extension that gets the stream of events as windows with a
>>>> specified window size. I then use those data sets to learn the ML model
>>>> incrementally and periodically, storing the model information so that it
>>>> can be used with the current window of event data samples. I am
>>>> having trouble writing a Siddhi extension that gets a stream of data
>>>> windows from the CEP Siddhi processor. Please help me with the following.
>>>> 1. I have been referring to [1] [2] [3] for writing Siddhi extensions.
>>>> In my case, which of the available Siddhi extension types is the most
>>>> suitable?
>>>>
>>>> 2. I am currently working with carbon-ml, product-ml and the existing ML
>>>> CEP extensions [6]. What is the best way to write a
>>>> simple CEP extension to check the functionality?
>>>>
>>>> 3. I have gone through [5] [4] for the CEP inbuilt windows. How can I
>>>> effectively make use of those features in my case?
>>>>
>>>> If I need to look into other areas of CEP for this purpose,
>>>> please let me know. Thank you very much.
>>>> Mahesh.
>>>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>>>> [2] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>>>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>>>> [4] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>>> [5]
>>>> h

Re: [Dev] [cep][ml][gsoc-6]Capturing event stream with a specified window size for ml

2016-05-23 Thread Mahesh Dananjaya
Hi Suho,
In my project the machine learning models are trained incrementally, so
real-time data enters my design as mini-batches of sample points. Since we
are developing these models for the CEP Siddhi processor, we want the
Siddhi processor to deliver each mini-batch of data points to my
algorithms. The input is a stream of sample points bundled together in
batches of size K (the batch/window size). In the linear regression case
each sample carries all the independent and dependent variables; in the
k-means case it is the whole feature vector. The samples should arrive not
only as single points (window size = 1) but also as mini-batches (window
size = N). In my case there won't be an output stream. The trained model
will be retained, so it can also be used for prediction. I have looked
into [1] as well. Thank you.
Regards,
Mahesh.
[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
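
The batching requirement described in this message can be sketched in plain
Java, independent of Siddhi. This is only a minimal illustration under
stated assumptions: the class name MiniBatchCollector and its add() method
are hypothetical and are not part of Siddhi or carbon-ml, and each event is
assumed to have already been converted to a double[] feature vector.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups single events into mini-batches of size K. */
class MiniBatchCollector {
    private final int batchSize;                       // K, the batch/window size
    private List<double[]> buffer = new ArrayList<>(); // events seen since the last batch

    MiniBatchCollector(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
    }

    /** Adds one event; returns a full batch once K events have arrived, else null. */
    List<double[]> add(double[] event) {
        buffer.add(event);
        if (buffer.size() < batchSize) {
            return null;
        }
        List<double[]> full = buffer;
        buffer = new ArrayList<>();                    // start collecting the next batch
        return full;
    }

    public static void main(String[] args) {
        MiniBatchCollector collector = new MiniBatchCollector(3);
        for (int i = 1; i <= 7; i++) {
            // Each event here is {x, y}; in the k-means case it would be a feature vector.
            List<double[]> batch = collector.add(new double[] {i, 2.0 * i});
            if (batch != null) {
                System.out.println("batch of " + batch.size() + " ready after event " + i);
            }
        }
    }
}
```

With a window size of 1 every event is handed over immediately, which
covers the single-sample case mentioned above; with a larger K the
collector behaves like a non-overlapping (tumbling) batch.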

On Mon, May 23, 2016 at 2:48 PM, Sriskandarajah Suhothayan <s...@wso2.com>
wrote:

> Hi Mahesh
>
> Can you explain the expected input to your extension and the expected
> output. Then we can help you to find the proper Siddhi extension to use.
>
> Regards
> Suho
>
> On Sat, May 21, 2016 at 11:44 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi all,
>> i am currenl working on the gsoc project "Predictive analytics with
>> online data for WSO2 Machine Learner
>> <https://docs.wso2.com/display/GSoC/Project+Proposals+for+2016#ProjectProposalsfor2016-Proposal6:[ML]PredictiveanalyticswithonlinedataforWSO2MachineLearner>"
>> with wso2 ML and cep extention for acquiring stream of events (sample data
>> points interms of ml) from cep siddhi porocessor. I am trying to write a
>> cep extention to get the stream of events as windows with a "Specified
>> window size". Then i am using those data sets to incrementally and
>> periodically learn the ML model which store the specific ml model
>> information to use with the current window of event data samples. I am
>> facing problem of writing a siddhi extention for my purpose to get stream
>> of data windows from cep siddhi rpocessor. Please help me with followings.
>> 1. I have been referring to [1] [2] [3] for writing siddhi extention. In
>> my case,what can be the most suitable option for this among the set of
>> siddhi extensions given?
>>
>> 2. I am currently working on carbon-ml and product-ml and ml cep
>> extentions currently built [6]. In case what is the best way to write
>> simple cep extention to check the functionality.?
>>
>> 3. I have gone through the [5] [4] for cep inbuilt windows. How can i
>> effectivey aggregate those features into my case?
>>
>> In case, if i need to look into other areas of cep for this purpose
>> please let me know. Thank you very much.
>> regards,
>> Mahesh.
>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>> [2] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>> [4] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>> [5] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>> [6] https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>
>
>
>
> --
>
> S. Suhothayan
> Technical Lead & Team Lead of WSO2 Complex Event Processor
> WSO2 Inc. http://wso2.com
> lean . enterprise . middleware
>
> cell: (+94) 779 756 757 | blog: http://suhothayan.blogspot.com/
> twitter: http://twitter.com/suhothayan
> linked-in: http://lk.linkedin.com/in/suhothayan
>
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-21 Thread Mahesh Dananjaya
Hi Maheshakya,
Shall we use [1] for our work? I am checking the possibility.
BR,
Mahesh.
[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
[2]
https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[3]https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> It is very interesting that i have found that somethings that can make use
> of our work. In the cep 4.0 documentation there is a Custom Stream
> Processor Extention program [1]. There is a example of
> LinearRegressionStreamProcessor [1] and also i saw
>  private int batchSize = 10; i am going through this one.
> Please check whether we can use. WIll there be any compatibility or
> support issue?
> regards,
> Mahesh.
>
>
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>
> On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> anyway how can test any siddhi extention after write it without
>> integrating it to cep.can you please explain me the procedure. i am
>> referring to [1] [2] [3] [4].  thank you.
>> BR,
>> Mahesh.
>>
>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
>> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>>
>> On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> thank you for the feedback. I have add data-sets into repo.
>>> data-sets/lr. I am all right with next week.Now i am writing some examples
>>> to collect samples and build mini batches and run the algorithms on those
>>> mini-batches. thank you. will add those into repo soon.I am still working
>>> on that siddhi extention.i will let you know the progress.
>>> BR,
>>> mahesh.
>>>
>>> On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> I've look into your code sample of streaming linear regression. Looks
>>>> good to me, apart from few issues in coding practices which we can improve
>>>> when you're doing the implementations in carbon-ml and during the code
>>>> reviews. You are using a set of files as mini-batches of data, right? Can
>>>> you also send us the datasets you've been using. I'd like to run this.
>>>>
>>>> does that cep problem is now all right that we were trying to fix. I am
>>>>> still using those pre-build versions. If so i can merge with the latest 
>>>>> one.
>>>>
>>>>
>>>> I'll check this and let you know.
>>>>
>>>> Can we arrange a meeting (preferably in WSO2 offices) in next week with
>>>> ML team members as well. Coding period begins on next Monday, so it's
>>>> better to get overall feedback from others and discuss more about the
>>>> project. Let me know convenient time slots for you. I'll arrange a meeting
>>>> with ML team.
>>>>
>>>> Best regards.
>>>>
>>>> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> Ok. I will check it.you have sent me those relevant references and i
>>>>> am working on that thing.thank you. does that cep problem is now all right
>>>>> that we were trying to fix. I am still using those pre-build versions. If
>>>>> so i can merge with the latest one.thanks.
>>>>> BR,
>>>>> Mahesh.
>>>>>
>>>>> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> You don't actually have to implement anything in spark streaming. Try
>>>>>> to understand how streaming data is handled in and the specifics of the
>>>>>> underlying algorithms in streaming.
>>>>>> What we want to do is having the similar algorithms that support CEP
>>>>>> event streams with siddhi.
>>>>>>
>>>>>> Best regards.

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-21 Thread Mahesh Dananjaya
Hi Maheshakya,
Interestingly, I have found something that we can make use of in our work.
The CEP 4.0 documentation describes a Custom Stream Processor Extension
[1]. It includes an example, LinearRegressionStreamProcessor [1], in which
I also saw the field
 private int batchSize = 10;
I am going through this one. Please check whether we can use it. Will
there be any compatibility or support issues?
Regards,
Mahesh.


[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
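
To make the idea of training on batches of 10 events concrete, here is a
minimal sketch of a linear regression model that is updated one mini-batch
at a time and keeps its weights between batches. It is not the
LinearRegressionStreamProcessor from the CEP documentation; the class name,
the learning-rate parameter and the use of plain mini-batch gradient descent
are all assumptions made for illustration.

```java
/** Hypothetical sketch: linear regression trained one mini-batch at a time. */
class IncrementalLinearRegression {
    private final double[] weights;    // retained between batches
    private double intercept;          // retained between batches
    private final double learningRate; // step size (assumed tuning parameter)

    IncrementalLinearRegression(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures];
        this.learningRate = learningRate;
    }

    double predict(double[] x) {
        double y = intercept;
        for (int j = 0; j < weights.length; j++) {
            y += weights[j] * x[j];
        }
        return y;
    }

    /** One gradient step on the mean squared error of a single mini-batch. */
    void update(double[][] xs, double[] ys) {
        double[] grad = new double[weights.length];
        double gradIntercept = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double err = predict(xs[i]) - ys[i];   // uses the weights from before this batch
            for (int j = 0; j < weights.length; j++) {
                grad[j] += err * xs[i][j];
            }
            gradIntercept += err;
        }
        for (int j = 0; j < weights.length; j++) {
            weights[j] -= learningRate * grad[j] / xs.length;
        }
        intercept -= learningRate * gradIntercept / xs.length;
    }

    public static void main(String[] args) {
        int batchSize = 10;
        IncrementalLinearRegression model = new IncrementalLinearRegression(1, 0.5);
        double[][] xs = new double[batchSize][1];
        double[] ys = new double[batchSize];
        for (int i = 0; i < batchSize; i++) {
            xs[i][0] = i / 10.0;
            ys[i] = 2.0 * xs[i][0] + 1.0;          // synthetic data from y = 2x + 1
        }
        for (int step = 0; step < 5000; step++) {
            model.update(xs, ys);                  // same batch replayed, as a stand-in for a stream
        }
        System.out.println("prediction at x = 0.5: " + model.predict(new double[] {0.5}));
    }
}
```

The point of the sketch is the state kept in the fields: each call to
update() starts from the weights left by the previous batch, which is what
distinguishes incremental training from refitting on every window.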

On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi maheshakya,
> anyway how can test any siddhi extention after write it without
> integrating it to cep.can you please explain me the procedure. i am
> referring to [1] [2] [3] [4].  thank you.
> BR,
> Mahesh.
>
> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>
> On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> thank you for the feedback. I have add data-sets into repo. data-sets/lr.
>> I am all right with next week.Now i am writing some examples to collect
>> samples and build mini batches and run the algorithms on those
>> mini-batches. thank you. will add those into repo soon.I am still working
>> on that siddhi extention.i will let you know the progress.
>> BR,
>> mahesh.
>>
>> On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> I've look into your code sample of streaming linear regression. Looks
>>> good to me, apart from few issues in coding practices which we can improve
>>> when you're doing the implementations in carbon-ml and during the code
>>> reviews. You are using a set of files as mini-batches of data, right? Can
>>> you also send us the datasets you've been using. I'd like to run this.
>>>
>>> does that cep problem is now all right that we were trying to fix. I am
>>>> still using those pre-build versions. If so i can merge with the latest 
>>>> one.
>>>
>>>
>>> I'll check this and let you know.
>>>
>>> Can we arrange a meeting (preferably in WSO2 offices) in next week with
>>> ML team members as well. Coding period begins on next Monday, so it's
>>> better to get overall feedback from others and discuss more about the
>>> project. Let me know convenient time slots for you. I'll arrange a meeting
>>> with ML team.
>>>
>>> Best regards.
>>>
>>> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> Ok. I will check it.you have sent me those relevant references and i am
>>>> working on that thing.thank you. does that cep problem is now all right
>>>> that we were trying to fix. I am still using those pre-build versions. If
>>>> so i can merge with the latest one.thanks.
>>>> BR,
>>>> Mahesh.
>>>>
>>>> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> You don't actually have to implement anything in spark streaming. Try
>>>>> to understand how streaming data is handled in and the specifics of the
>>>>> underlying algorithms in streaming.
>>>>> What we want to do is having the similar algorithms that support CEP
>>>>> event streams with siddhi.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> Did you check the repo. I will add recent works today.And also i was
>>>>>> going through the Java docs related to spark streaming work. It is with
>>>>>> that scala API. thank you.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Maheshakya,
>>>>>>> I have gon

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-21 Thread Mahesh Dananjaya
Hi Maheshakya,
By the way, how can I test a Siddhi extension after writing it, without
integrating it into CEP? Can you please explain the procedure? I am
referring to [1] [2] [3] [4]. Thank you.
BR,
Mahesh.

[1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
[2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
[3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
[4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi

On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> thank you for the feedback. I have add data-sets into repo. data-sets/lr.
> I am all right with next week.Now i am writing some examples to collect
> samples and build mini batches and run the algorithms on those
> mini-batches. thank you. will add those into repo soon.I am still working
> on that siddhi extention.i will let you know the progress.
> BR,
> mahesh.
>
> On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> I've look into your code sample of streaming linear regression. Looks
>> good to me, apart from few issues in coding practices which we can improve
>> when you're doing the implementations in carbon-ml and during the code
>> reviews. You are using a set of files as mini-batches of data, right? Can
>> you also send us the datasets you've been using. I'd like to run this.
>>
>> does that cep problem is now all right that we were trying to fix. I am
>>> still using those pre-build versions. If so i can merge with the latest one.
>>
>>
>> I'll check this and let you know.
>>
>> Can we arrange a meeting (preferably in WSO2 offices) in next week with
>> ML team members as well. Coding period begins on next Monday, so it's
>> better to get overall feedback from others and discuss more about the
>> project. Let me know convenient time slots for you. I'll arrange a meeting
>> with ML team.
>>
>> Best regards.
>>
>> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Ok. I will check it.you have sent me those relevant references and i am
>>> working on that thing.thank you. does that cep problem is now all right
>>> that we were trying to fix. I am still using those pre-build versions. If
>>> so i can merge with the latest one.thanks.
>>> BR,
>>> Mahesh.
>>>
>>> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> You don't actually have to implement anything in spark streaming. Try
>>>> to understand how streaming data is handled in and the specifics of the
>>>> underlying algorithms in streaming.
>>>> What we want to do is having the similar algorithms that support CEP
>>>> event streams with siddhi.
>>>>
>>>> Best regards.
>>>>
>>>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> Did you check the repo. I will add recent works today.And also i was
>>>>> going through the Java docs related to spark streaming work. It is with
>>>>> that scala API. thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I have gone through the Java Docs and run some of the Spark examples
>>>>>> on spark shell which are paramount improtant for our work. Then i have 
>>>>>> been
>>>>>> writing my codes to check the Linear regression, K means for streaming.
>>>>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>>>>> the capturing event streams for our work. I will update the recent things
>>>>>> on git. check the park-example directory for java. examples run on git
>>>>>> shell is not included there. In my case i think i have to build mini
>>>>>> batches from data streams that comes as individual samples. Now i am
>>>>>> working on some coding to collect mini batches from data streams.thank 
>>>>>> you.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>> [1]https://github.

[Dev] [cep][ml][gsoc-6]Capturing event stream with a specified window size for ml

2016-05-21 Thread Mahesh Dananjaya
Hi all,
I am currently working on the GSoC project "Predictive analytics with
online data for WSO2 Machine Learner", using WSO2 ML and a CEP extension
to acquire a stream of events (sample data points, in ML terms) from the
CEP Siddhi processor. I am trying to write a CEP extension that delivers
the stream of events as windows with a specified window size. I then use
those data sets to train the ML model incrementally and periodically; the
model stores its state so that it can be updated with the current window
of event data samples. I am having trouble writing a Siddhi extension that
gets a stream of data windows from the CEP Siddhi processor. Please help
me with the following.
1. I have been referring to [1] [2] [3] for writing Siddhi extensions. In
my case, which of the available Siddhi extension types is the most
suitable?

2. I am currently working with carbon-ml, product-ml and the existing ML
CEP extension [6]. What is the best way to write a simple CEP extension to
check the functionality?

3. I have gone through [5] [4] for the CEP inbuilt windows. How can I
effectively incorporate those features into my case?

If I need to look into other areas of CEP for this purpose, please let me
know. Thank you very much.
Regards,
Mahesh.
[1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
[2] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
[3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
[4] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[5] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
[6] https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-19 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you for the feedback. I have added the data sets to the repo under
data-sets/lr. Next week is fine for me. I am now writing some examples
that collect samples, build mini-batches and run the algorithms on those
mini-batches; I will add them to the repo soon. I am still working on the
Siddhi extension and will let you know how it progresses.
BR,
mahesh.
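
For the k-means side of running the algorithms on mini-batches, one
possible shape is a decayed-average update of the cluster centers, in the
spirit of the streaming k-means described in the Spark MLlib documentation.
The sketch below is an assumption-laden illustration, not code from the
project repo: the class name, the decay parameter and the exact count
bookkeeping are all hypothetical choices.

```java
/** Hypothetical sketch: k-means centers updated one mini-batch at a time. */
class StreamingKMeansSketch {
    private final double[][] centers; // cluster centers kept across batches
    private final double[] counts;    // (decayed) number of points seen per cluster
    private final double decay;       // 1.0 = remember all history, 0.0 = latest batch only

    StreamingKMeansSketch(double[][] initialCenters, double decay) {
        this.centers = new double[initialCenters.length][];
        for (int c = 0; c < initialCenters.length; c++) {
            this.centers[c] = initialCenters[c].clone();
        }
        this.counts = new double[initialCenters.length];
        this.decay = decay;
    }

    /** Index of the center closest to x (squared Euclidean distance). */
    int nearest(double[] x) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int j = 0; j < x.length; j++) {
                double diff = x[j] - centers[c][j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Assigns each point of the batch to its nearest center, then moves the centers. */
    void update(double[][] batch) {
        int k = centers.length;
        int d = centers[0].length;
        double[][] sums = new double[k][d];
        int[] assigned = new int[k];
        for (double[] x : batch) {
            int c = nearest(x);
            assigned[c]++;
            for (int j = 0; j < d; j++) {
                sums[c][j] += x[j];
            }
        }
        for (int c = 0; c < k; c++) {
            if (assigned[c] == 0) {
                continue;                          // no new points: leave this center as it is
            }
            double old = counts[c] * decay;        // weight given to the history
            for (int j = 0; j < d; j++) {
                centers[c][j] = (centers[c][j] * old + sums[c][j]) / (old + assigned[c]);
            }
            counts[c] = old + assigned[c];
        }
    }

    double[][] getCenters() {
        return centers;
    }
}
```

As with the regression case, the centers and counts are the state carried
from one window to the next; only the current mini-batch has to be held in
memory.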

On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> I've look into your code sample of streaming linear regression. Looks good
> to me, apart from few issues in coding practices which we can improve when
> you're doing the implementations in carbon-ml and during the code reviews.
> You are using a set of files as mini-batches of data, right? Can you also
> send us the datasets you've been using. I'd like to run this.
>
> does that cep problem is now all right that we were trying to fix. I am
>> still using those pre-build versions. If so i can merge with the latest one.
>
>
> I'll check this and let you know.
>
> Can we arrange a meeting (preferably in WSO2 offices) in next week with ML
> team members as well. Coding period begins on next Monday, so it's better
> to get overall feedback from others and discuss more about the project. Let
> me know convenient time slots for you. I'll arrange a meeting with ML team.
>
> Best regards.
>
> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Ok. I will check it.you have sent me those relevant references and i am
>> working on that thing.thank you. does that cep problem is now all right
>> that we were trying to fix. I am still using those pre-build versions. If
>> so i can merge with the latest one.thanks.
>> BR,
>> Mahesh.
>>
>> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> You don't actually have to implement anything in spark streaming. Try to
>>> understand how streaming data is handled in and the specifics of the
>>> underlying algorithms in streaming.
>>> What we want to do is having the similar algorithms that support CEP
>>> event streams with siddhi.
>>>
>>> Best regards.
>>>
>>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> Did you check the repo. I will add recent works today.And also i was
>>>> going through the Java docs related to spark streaming work. It is with
>>>> that scala API. thank you.
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I have gone through the Java Docs and run some of the Spark examples
>>>>> on spark shell which are paramount improtant for our work. Then i have 
>>>>> been
>>>>> writing my codes to check the Linear regression, K means for streaming.
>>>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>>>> the capturing event streams for our work. I will update the recent things
>>>>> on git. check the park-example directory for java. examples run on git
>>>>> shell is not included there. In my case i think i have to build mini
>>>>> batches from data streams that comes as individual samples. Now i am
>>>>> working on some coding to collect mini batches from data streams.thank 
>>>>> you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>>>
>>>>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I have gone through the Java Docs and run some of the Spark examples
>>>>>> on spark shell which are paramount improtant for our work. Then i have 
>>>>>> been
>>>>>> writing my codes to check the Linear regression, K means for streaming.
>>>>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>>>>> the capturing event streams for our work. I will update the recent things
>>>>>> on git. check the park-example directory for java. examples run on git
>>>>>> shell is not included there. In my case i think 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Mahesh Dananjaya
Hi Maheshakya,
OK, I will check it. You have sent me the relevant references and I am
working on that. Thank you. Is the CEP problem that we were trying to fix
resolved now? I am still using the pre-built versions. If so, I can merge
with the latest one. Thanks.
BR,
Mahesh.

On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> You don't actually have to implement anything in spark streaming. Try to
> understand how streaming data is handled in and the specifics of the
> underlying algorithms in streaming.
> What we want to do is having the similar algorithms that support CEP event
> streams with siddhi.
>
> Best regards.
>
> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Did you check the repo. I will add recent works today.And also i was
>> going through the Java docs related to spark streaming work. It is with
>> that scala API. thank you.
>> regards,
>> Mahesh.
>>
>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have gone through the Java Docs and run some of the Spark examples on
>>> spark shell which are paramount improtant for our work. Then i have been
>>> writing my codes to check the Linear regression, K means for streaming.
>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>> the capturing event streams for our work. I will update the recent things
>>> on git. check the park-example directory for java. examples run on git
>>> shell is not included there. In my case i think i have to build mini
>>> batches from data streams that comes as individual samples. Now i am
>>> working on some coding to collect mini batches from data streams.thank you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>
>>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I have gone through the Java Docs and run some of the Spark examples on
>>>> spark shell which are paramount improtant for our work. Then i have been
>>>> writing my codes to check the Linear regression, K means for streaming.
>>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>>> the capturing event streams for our work. I will update the recent things
>>>> on git. check the park-example directory for java. examples run on git
>>>> shell is not included there. In my case i think i have to build mini
>>>> batches from data streams that comes as individual samples. Now i am
>>>> working on some coding to collect mini batches from data streams.thank you.
>>>> regards,
>>>> Mahesh.
>>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>>
>>>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> thank you. i will update the repo today.thank you.i changed the carbon
>>>>> ml siddhi extention and see how the changes are effecting. i will update
>>>>> the progress as soon as possible.thank you. i had some problem in spark
>>>>> mllib dependency. i was fixing that.
>>>>> regards,
>>>>> Mahesh.
>>>>> p.s: do i need to maintain a blog?
>>>>>
>>>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> Sorry for replying late.
>>>>>>
>>>>>> Thank you for the update. I believe you have done some
>>>>>> implementations with with Spark MLLIb algorithms in streaming fashion as 
>>>>>> we
>>>>>> have discussed. If so, can you please share your code in a Github repo.
>>>>>>
>>>>>> Now i want to implements some machine learning algorithms with
>>>>>>> importing mllib and want to run within your code base
>>>>>>>
>>>>>>
>>>>>> For the moment you can try out editing the same class
>>>>>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we 
>>>>>> will
>>>>>> add this separately. You should 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Mahesh Dananjaya
Hi Maheshakya,
Did you check the repo? I will add my recent work today. I was also going
through the Java docs related to the Spark Streaming work. They are
provided alongside the Scala API. Thank you.
regards,
Mahesh.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples on
> spark shell which are paramount improtant for our work. Then i have been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java Docs and run some of the Spark examples on
>> spark shell which are paramount improtant for our work. Then i have been
>> writing my codes to check the Linear regression, K means for streaming.
>> please check my git repo [1]. I think now i have to ask on dev regarding
>> the capturing event streams for our work. I will update the recent things
>> on git. check the park-example directory for java. examples run on git
>> shell is not included there. In my case i think i have to build mini
>> batches from data streams that comes as individual samples. Now i am
>> working on some coding to collect mini batches from data streams.thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> thank you. i will update the repo today.thank you.i changed the carbon
>>> ml siddhi extention and see how the changes are effecting. i will update
>>> the progress as soon as possible.thank you. i had some problem in spark
>>> mllib dependency. i was fixing that.
>>> regards,
>>> Mahesh.
>>> p.s: do i need to maintain a blog?
>>>
>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Sorry for replying late.
>>>>
>>>> Thank you for the update. I believe you have done some implementations
>>>> with with Spark MLLIb algorithms in streaming fashion as we have discussed.
>>>> If so, can you please share your code in a Github repo.
>>>>
>>>> Now i want to implements some machine learning algorithms with
>>>>> importing mllib and want to run within your code base
>>>>>
>>>>
>>>> For the moment you can try out editing the same class
>>>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
>>>> add this separately. You should be able to add org.apache.spark.mllib.
>>>> classes to there.
>>>>
>>>> And i want to see how event streams are coming from cep. As i think it
>>>>> is not in a RDD format since it is arriving as the individual samples. I
>>>>> will send a email to dev asking about how to get the streams.
>>>>
>>>>
>>>> Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
>>>> siddhi. What you need to write are functions similar to a custom aggregate
>>>> function[2].
>>>> When you send the email to dev list, explain your requirement. You need
>>>> to get a set of event with from a stream with a specified window size
>>>> (number of events). Then build a model within that function. You also need
>>>> to retain the data (learned weights, cluster centers, etc.) from the
>>>> previous window to use in the current window. Ask what can be the most
>>>> suitable option for this among the set of siddhi extensions given.
>>>>
>>>> Best regards.
>>>>
>>>> [1]
>>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>>> [2]
>>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>>
>>>> On Wed, May 11, 20

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-16 Thread Mahesh Dananjaya
Hi Maheshakya,
I have gone through the Javadocs and run some of the Spark examples in the
Spark shell, which are very important for our work. Since then I have been
writing code to try out streaming linear regression and k-means.
Please check my git repo [1]. I think I now have to ask on dev about
capturing event streams for our work. I will push the recent changes to
git. Check the park-example directory for the Java code; the examples I ran
on the git shell are not included there. In my case I think I have to build
mini-batches from data streams that arrive as individual samples. I am now
working on code to collect mini-batches from the data streams. Thank you.
Regards,
Mahesh.
[1] https://github.com/dananjayamahesh/GSOC2016
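
A minimal sketch of the mini-batch idea described above, in plain Java with
no Spark or Siddhi dependency. The class name MiniBatchBuffer and its
methods are hypothetical and are not taken from carbon-ml or from the repo
in [1]; it only assumes that each event arrives as one array of doubles.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups single samples into fixed-size mini-batches. */
class MiniBatchBuffer {
    private final int batchSize;
    private List<double[]> current = new ArrayList<>();

    MiniBatchBuffer(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Adds one sample; returns a full batch once batchSize samples are held, otherwise null. */
    List<double[]> add(double[] sample) {
        current.add(sample.clone());      // copy so later changes by the caller do not leak in
        if (current.size() < batchSize) {
            return null;
        }
        List<double[]> full = current;
        current = new ArrayList<>();      // start collecting the next batch
        return full;
    }

    /** Number of samples waiting for the current batch to fill up. */
    int pending() {
        return current.size();
    }
}
```

Each returned batch could then be handed to whatever training step is
chosen; how the samples are actually pulled out of CEP is still the open
question for the dev list.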

On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples on
> spark shell which are paramount improtant for our work. Then i have been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> thank you. i will update the repo today.thank you.i changed the carbon ml
>> siddhi extention and see how the changes are effecting. i will update the
>> progress as soon as possible.thank you. i had some problem in spark mllib
>> dependency. i was fixing that.
>> regards,
>> Mahesh.
>> p.s: do i need to maintain a blog?
>>
>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Sorry for replying late.
>>>
>>> Thank you for the update. I believe you have done some implementations
>>> with with Spark MLLIb algorithms in streaming fashion as we have discussed.
>>> If so, can you please share your code in a Github repo.
>>>
>>> Now i want to implements some machine learning algorithms with importing
>>>> mllib and want to run within your code base
>>>>
>>>
>>> For the moment you can try out editing the same class
>>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
>>> add this separately. You should be able to add org.apache.spark.mllib.
>>> classes to there.
>>>
>>> And i want to see how event streams are coming from cep. As i think it
>>>> is not in a RDD format since it is arriving as the individual samples. I
>>>> will send a email to dev asking about how to get the streams.
>>>
>>>
>>> Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
>>> siddhi. What you need to write are functions similar to a custom aggregate
>>> function[2].
>>> When you send the email to dev list, explain your requirement. You need
>>> to get a set of event with from a stream with a specified window size
>>> (number of events). Then build a model within that function. You also need
>>> to retain the data (learned weights, cluster centers, etc.) from the
>>> previous window to use in the current window. Ask what can be the most
>>> suitable option for this among the set of siddhi extensions given.
>>>
>>> Best regards.
>>>
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>> [2]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>
>>> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>>
>>>> -- Forwarded message --
>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>> Date: Wed, May 11, 2016 at 1:43 PM
>>>> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
>>>> data for WSO2 Machine Learner
>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>
>>>>
>>>> Hi Maheshakya,
>>>> sorry for not updating. I did what you wanted me to do. I

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-16 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you. I will update the repo today. I changed the carbon-ml Siddhi
extension to see what effect the changes have. I will report my progress
as soon as possible. I had a problem with the Spark MLlib dependency and
was fixing that.
Regards,
Mahesh.
P.S. Do I need to maintain a blog?

On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Sorry for replying late.
>
> Thank you for the update. I believe you have done some implementations
> with Spark MLlib algorithms in a streaming fashion, as we discussed.
> If so, can you please share your code in a GitHub repo?
>
> Now i want to implements some machine learning algorithms with importing
>> mllib and want to run within your code base
>>
>
> For the moment you can try out editing the same class
> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
> add this separately. You should be able to add org.apache.spark.mllib.
> classes to there.
>
> And i want to see how event streams are coming from cep. As i think it is
>> not in a RDD format since it is arriving as the individual samples. I will
>> send a email to dev asking about how to get the streams.
>
>
> Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
> siddhi. What you need to write are functions similar to a custom aggregate
> function[2].
> When you send the email to dev list, explain your requirement. You need to
> get a set of events from a stream with a specified window size (number
> of events). Then build a model within that function. You also need to
> retain the data (learned weights, cluster centers, etc.) from the previous
> window to use in the current window. Ask what can be the most suitable
> option for this among the set of siddhi extensions given.
>
> Best regards.
>
> [1]
> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [2]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>
> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> -- Forwarded message --
>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>> Date: Wed, May 11, 2016 at 1:43 PM
>> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
>> data for WSO2 Machine Learner
>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>
>>
>> Hi Maheshakya,
>> sorry for not updating. I did what you wanted me to do. I checked the
>> code base and train functions. I went through those java docs. I went
>> through the carbon-ml current implementation of LG and K-Mean. And i had
>> Apache Spark and i tried with several examples. Now i want to implements
>> some machine learning algorithms with importing mllib and want to run
>> within your code base. Can you help me with that.
>> And i want to see how event streams are coming from cep. As i think it is
>> not in a RDD format since it is arriving as the individual samples. I will
>> send a email to dev asking about how to get the streams. I debugged many of
>> those functions in the code base. So need further instructions to
>> proceed.thank you.
>> regards,
>> Mahesh.
>>
>> On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Any update on your progress?
>>>
>>> Best regards.
>>>
>>> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> is that "Put break points in train methods in Linear Regression class"
>>>>> means the spark/algorithms/ LinearRegrassion.java class in the
>>>>> org.wso2.carbon.ml.core? is that the correct file?
>>>>
>>>>
>>>> Yes, this is the correct place.
>>>>
>>>> You can refer to spark programming guide[1][2] as well as our ML code
>>>> base when you try those algorithms out. Please try to do rough
>>>> implementations of the streaming versions of linear regression, logistic
>>>> regression and k-means clustering as we have discussed in the proposal in
>>>> plain Java. It's better if you can create a git repo and share your code
>>>> once you have made some progress.
>>>>
>>>> Were you able debug and understand the flow of the ML siddhi extension?
>>>> I hope you haven't encountered more errors after switching the released

[Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-11 Thread Mahesh Dananjaya
-- Forwarded message --
From: Mahesh Dananjaya <dananjayamah...@gmail.com>
Date: Wed, May 11, 2016 at 1:43 PM
Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data
for WSO2 Machine Learner
To: Maheshakya Wijewardena <mahesha...@wso2.com>


Hi Maheshakya,
Sorry for not updating. I did what you wanted me to do. I checked the code
base and the train functions. I went through those Javadocs. I went through
the current carbon-ml implementation of LG and k-means. I also have Apache
Spark and tried several examples with it. Now I want to implement some
machine learning algorithms by importing MLlib and run them within
your code base. Can you help me with that?
I also want to see how event streams arrive from CEP. I think they are
not in RDD format, since they arrive as individual samples. I will
send an email to dev asking how to get the streams. I debugged many of
those functions in the code base, so I need further instructions to
proceed. Thank you.
Regards,
Mahesh.
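
For samples that arrive a few at a time rather than as an RDD, one option
is to keep the model weights in memory and update them batch by batch. The
sketch below is plain Java and does not use Spark MLlib or carbon-ml; the
class OnlineLinearRegression and its methods are hypothetical names chosen
for illustration, and a fixed learning rate is assumed.

```java
/** Hypothetical sketch: linear regression updated one mini-batch at a time. */
class OnlineLinearRegression {
    private final double[] weights;   // retained between batches
    private double bias;              // retained between batches
    private final double learningRate;

    OnlineLinearRegression(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures];
        this.learningRate = learningRate;
    }

    /** Prediction with the weights learned so far. */
    double predict(double[] x) {
        double y = bias;
        for (int i = 0; i < weights.length; i++) {
            y += weights[i] * x[i];
        }
        return y;
    }

    /** One gradient step on half the mean squared error of this batch. */
    void trainOnBatch(double[][] features, double[] labels) {
        int n = features.length;
        if (n == 0) {
            return;
        }
        double[] grad = new double[weights.length];
        double gradBias = 0.0;
        for (int r = 0; r < n; r++) {
            double err = predict(features[r]) - labels[r];
            for (int i = 0; i < weights.length; i++) {
                grad[i] += err * features[r][i];
            }
            gradBias += err;
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] -= learningRate * grad[i] / n;
        }
        bias -= learningRate * gradBias / n;
    }
}
```

Because the weights live in the object, every new batch starts from what
the earlier batches learned instead of from scratch.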

On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Any update on your progress?
>
> Best regards.
>
> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> is that "Put break points in train methods in Linear Regression class"
>>> means the spark/algorithms/ LinearRegrassion.java class in the
>>> org.wso2.carbon.ml.core? is that the correct file?
>>
>>
>> Yes, this is the correct place.
>>
>> You can refer to spark programming guide[1][2] as well as our ML code
>> base when you try those algorithms out. Please try to do rough
>> implementations of the streaming versions of linear regression, logistic
>> regression and k-means clustering as we have discussed in the proposal in
>> plain Java. It's better if you can create a git repo and share your code
>> once you have made some progress.
>>
>> Were you able debug and understand the flow of the ML siddhi extension? I
>> hope you haven't encountered more errors after switching the released
>> version of CEP.
>>
>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>
>> Best regards.
>>
>>
>> Best regards.
>>
>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>
>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have been looking into some algorithms related to stochastic gradient
>>> descent based algorithms.anything i should focus please let me know.Ans
>>> also i will be available for calling this week and next week.thank you.
>>> BR,
>>> Mahesh.
>>>
>>> On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> thank you.that's good. i have been trying to fix that for couple of
>>>> days. please inform me when it will be fixed.now i have been testing the ML
>>>> algorithms and trying to identify the flow and the hierarchy. is that "Put
>>>> break points in train methods in Linear Regression class" means the
>>>> spark/algorithms/ LinearRegrassion.java class in the
>>>> org.wso2.carbon.ml.core? is that the correct file?
>>>> And also i am planning to write some programs to use apache spark mllib
>>>> algorithms. and i refer to [1] and some wso2 documentations to get some
>>>> idea about ML structure.thank you.
>>>>
>>>> BR,
>>>> Mahesh.
>>>>
>>>> [1]nirmalfdo.blogspot.com
>>>>
>>>> On Tue, May 3, 2016 at 4:36 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> I have checked. It seems the issue you have encountered is cause only
>>>>> in the current development branch of the product-cep. It doesn't identify
>>>>> the ML siddhi extension as an extension. ML siddhi extension works fine in
>>>>> the latest release of CEP (4.1.0) [1].
>>>>> Until we figure out the reason and come up with a solution, can you
>>>>> use the latest CEP release for your work. It's fine to use that since you
>>>>> haven't started actual development yet.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> [1] http://wso2.com/products/complex-event-p

[Dev] GSoC 2016: [ML][CEP] Proposal 6, discussion @ Fri May 6, 2016

2016-05-07 Thread Mahesh Dananjaya
Hi Maheshakya,
First of all, thank you for the discussion. It resolved a couple
of doubts I had about the project. I think I now have some idea of the
workflow of carbon-ml, product-ml, product-cep and the Apache Spark MLlib
algorithms. It was a good discussion, and thank you for the invitation to
visit WSO2. I will report my progress as we move on. Thank you.
Regards,
Mahesh.
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


[Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-02 Thread Mahesh Dananjaya
Hi Maheshakya,
I have installed them correctly. Now I am trying to debug the Siddhi
extension with CEP as [1] describes. I created an input stream and a
PredictionStream (output stream), but when I tried to create a new
execution plan with those streams, I got an error on clicking "Validate
Query Expression". The error was:
Error:
No extension exist for StreamFunctionExtension{namespace='ml'} in execution
plan "ExecutionPlan"

My expression is as follows:

/* Enter a unique ExecutionPlan */
@Plan:name('ExecutionPlan')

/* Enter a unique description for ExecutionPlan */
-- @Plan:description('ExecutionPlan')

/* define streams/tables and write queries here ... */

@Import('InputStream:1.0.0')
define stream InputStream (NumPregnancies double, TSFT double, DPF double,
BMI double, DBP double, PG2 double, Age double, SI2 double);

@Export('PredictionStream:1.0.0')
define stream PredictionStream (NumPregnancies double, TSFT double, DPF
double, BMI double, DBP double, PG2 double, Age double, SI2 double, Class
double);

from
InputStream#ml:predict('file:///home/mahesh/GSOC/WSO2/data-set/pima-indian-diabetes.data','double')
select *
insert into PredictionStream


I used a file instead of the registry. I also referred to [2], where they
mention a fix for CEP running in distributed mode with an Apache Storm
cluster.

1. Does the CEP I built run in distributed mode by default?
2. Is this caused by the current user not having sudo privileges when
installing the ML features into CEP?
3. Is this the correct way to give a file to CEP?

[1]
https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension

[2]https://wso2.org/jira/browse/CEP-1400

BR,
Mahesh.


On Mon, May 2, 2016 at 12:35 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> If you have built product-ml, you can find the P2-repo at
> product-ml/modules/p2-profile/target/p2-repo
> Add this folder as a local repository.
> After that, you should be able to see the ML features.
>
> Best regards.
>
> On Mon, May 2, 2016 at 12:24 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Since i already have carbon-ml built in my pc can i use my local
>> repository to install those features in to CEP.is that correct.thank you.
>> regards,
>> Mahesh.
>>
>> On Mon, May 2, 2016 at 12:20 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Can you please tell me how to find the most recent p2 repository URL to
>>> add machine learner Core, Machine learner commons, Machine learner database
>>> service and ML Siddhi extension to add as features in CEP as describes in
>>> the [1]. When i use
>>> http://product-dist.wso2.com/p2/carbon/releases/4.2.0/ URL those
>>> features are not visible in the CEP.Is that not he most recent one.
>>> BR,
>>> Mahesh.
>>>
>>> [1]
>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>
>>> On Mon, May 2, 2016 at 11:28 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> sorry for the incomplete message.I have set up the dev environment and
>>>> now i am trying to remotely debug. The following steps were done.
>>>> 1. build product-cep, carbon-ml and product-ml by source.
>>>> 2. go through their code bases and trying to understand the way and the
>>>> flow you developed.
>>>> 3. i have set up break point in org.wso2.carbon.ml.siddhi.extension in
>>>> carbon-ml
>>>> 4. start the ./wso2server.sh debug 5005 in the SNAPSHOT directory of
>>>> product-ml
>>>> 5. trying to trigger the break points with the [1] reference.break
>>>> points are placed in the PredictStreamProcessor.java file within the
>>>> extention.
>>>>
>>>> This is the way i followed. I was trying to remotely debug the ML core
>>>> by putting break-points in ml core.(org.wso2.carbon.ml.core) in spark java
>>>> files. Is this the right way to do those things.
>>>>
>>>> [1]
>>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>>
>>>> On Mon, May 2, 2016 at 11:19 AM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi maheshakya,
>>>>> I have set up the dev environment and now i 

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-02 Thread Mahesh Dananjaya
Hi Maheshakya,
Can you please tell me how to find the most recent p2 repository URL for
adding the Machine Learner Core, Machine Learner Commons, Machine Learner
Database Service and ML Siddhi Extension features to CEP, as described in
[1]? When I use the http://product-dist.wso2.com/p2/carbon/releases/4.2.0/
URL, those features are not visible in CEP. Is that not the most recent
one?
BR,
Mahesh.

[1]
https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension

On Mon, May 2, 2016 at 11:28 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> sorry for the incomplete message.I have set up the dev environment and now
> i am trying to remotely debug. The following steps were done.
> 1. build product-cep, carbon-ml and product-ml by source.
> 2. go through their code bases and trying to understand the way and the
> flow you developed.
> 3. i have set up break point in org.wso2.carbon.ml.siddhi.extension in
> carbon-ml
> 4. start the ./wso2server.sh debug 5005 in the SNAPSHOT directory of
> product-ml
> 5. trying to trigger the break points with the [1] reference.break points
> are placed in the PredictStreamProcessor.java file within the extention.
>
> This is the way i followed. I was trying to remotely debug the ML core by
> putting break-points in ml core.(org.wso2.carbon.ml.core) in spark java
> files. Is this the right way to do those things.
>
> [1]
> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>
> On Mon, May 2, 2016 at 11:19 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> I have set up the dev environment and now i am trying to remotely debug.
>> The following steps were done.
>> 1. build product-cep, carbon-ml and product-ml by source.
>> 2. go through their code bases and trying to understand the way and the
>> flow you developed.
>> 3. i have set up break point in
>>
>>
>> On Thu, Apr 28, 2016 at 7:05 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> ok.i got it.thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Thu, Apr 28, 2016 at 6:56 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> The links was an example of remote debugging WSO2 server. What you need
>>>> to debug is org.wso2.carbon.ml.siddhi.extension in carbon-ml.
>>>>
>>>> Best regards.
>>>>
>>>> On Thu, Apr 28, 2016 at 4:52 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> thank you for your help.i have already built all three sources and
>>>>> now i am trying to get familiar with your code base. i even build the
>>>>> carbon-kernel by source.
>>>>>  As you mentioned [1] is related to debug the kernel, do i really need
>>>>> to debug the carbon kernel in my case. I am trying to remotely debug ml 
>>>>> and
>>>>> as i got it correct it is the same way as reference[1, but not the 
>>>>> kernel.I
>>>>> can go with others.
>>>>> BR,
>>>>> mahesh.
>>>>>
>>>>> [1] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>>>>>
>>>>> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> Congratulations and welcome to GSoC 2016. You did a great job in
>>>>>> preparing the proposal. Now it's time to dig deep and get started with 
>>>>>> the
>>>>>> project.
>>>>>>
>>>>>> First of all you need to familiarize with the code base. We have
>>>>>> agreed to implement this with CEP event streams. We already have a CEP
>>>>>> extension for predictions [1][2]. Go through this implementation and
>>>>>> familiarize your self with that. You need to understand how:
>>>>>>
>>>>>>1. Even streams are consumed
>>>>>>2. predictions are made from individual event
>>>>>>3. Results are sent back
>>>>>>
>>>>>> Get WSO2 ML and CEP sources (You may use latest released version of
>>>>>> CEP) a

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-28 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you for your help. I have already built all three from source and am
now trying to get familiar with your code base. I even built
carbon-kernel from source.
The article you mentioned [1] is about debugging the kernel; do I really
need to debug the Carbon kernel in my case? I am trying to remotely debug
ML and, if I understood correctly, it works the same way as in reference
[1], just not on the kernel. I can go ahead with the others.
BR,
Mahesh.

[1] https://dzone.com/articles/how-debug-wso2-carbon-kernel

On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Congratulations and welcome to GSoC 2016. You did a great job in preparing
> the proposal. Now it's time to dig deep and get started with the project.
>
> First of all you need to familiarize with the code base. We have agreed to
> implement this with CEP event streams. We already have a CEP extension for
> predictions [1][2]. Go through this implementation and familiarize your
> self with that. You need to understand how:
>
> 1. Event streams are consumed
> 2. Predictions are made from individual events
> 3. Results are sent back
>
> Get WSO2 ML and CEP sources (You may use latest released version of CEP)
> and build the products. Get both carbon-ml[3] and product-ml[4] masters and
> create new branches for your work from masters.
>
> After you build the products, you may need to do remote debugging[5] to
> understand the flow. So please follow an example of real time prediction
> with ML with debugging and get some idea. The component you need to debug
> is org.wso2.carbon.ml.siddhi.extension.
>
> Next tasks would be implementing online learning algorithms in plain java
> with spark ml lib and integrating those to ML. We also need to come up with
> a proper and detailed architecture to employ those algorithms in ML.
> Getting familiar with the aforementioned sections would give you some
> insight on how this should be implemented.
>
> So please try to get a quick grasp then you can start the implementation.
> Let us know if you have any questions or you get stuck somewhere.
>
> Also, please always add WSO2 developer's list as well when you communicate
> with us regarding the project so that you can get opinions and feedback
> from others as well.
>
> Best regards.
>
> [1]
> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>
> [2]
> https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>
> [3] https://github.com/wso2/carbon-ml
>
> [4] https://github.com/wso2/product-ml
>
> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>
>
> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi,
>> thank you for accepting my GSOC 2016 proposal and i am looking forward
>> for the further instruction and project continuation. thank you very much.
>> regards,
>> Mahesh.
>>
>> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-27 Thread Mahesh Dananjaya
Hi Suho,
Thank you for the information. In the initial build I used "mvn clean
install -Dmaven.test.skip=true", which is why I did not get errors. This
time I built with mvn clean build and got some errors in the test stage. I
have already set MAVEN_OPTS="-Xms768m -Xms3072m -XX:MaxPermSize=1200m", but
there still seems to be a memory constraint. I got the following:


ERROR [org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader]
- Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
MaxPermSize=256m; support was removed in 8.0
ERROR [org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader]
- Java HotSpot(TM) 64-Bit Server VM warning: INFO:
os::commit_memory(0xf400, 157286400, 0) failed; error='Cannot
allocate memory' (errno=12)

I am using Ubuntu 14.04 LTS with 4 GB of RAM. How can I fix this issue? I
got a similar error when I was trying to build WSO2 product-ml. I have
attached the details of the error to this mail. Do I need to set some
additional environment variables to fix this?
BR,
Mahesh.




On Wed, Apr 27, 2016 at 1:51 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> As Suho mentioned, if you have successfully built with tests, then there
> shouldn't be an issue.
>
> However, in the error you've stated, it seems there's problem with carbon
> home:
>
>> CARBON_HOME environment variable is set to
>> /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
>>
> Can you make sure that you extract
> product-cep/modules/distribution/target/wso2cep-4.1.1-SNAPSHOT.zip and run
> the server in wso2cep-4.1.1-SNAPSHOT/bin/ with ./wso2server.sh
>
> Best regards.
>
> On Wed, Apr 27, 2016 at 12:43 PM, Sriskandarajah Suhothayan <s...@wso2.com
> > wrote:
>
>> If your build has passed, then it should not be an issue. When building
>> the the tests should have ran.
>> Is that so? please verify.
>> During that server should have started and stopped.
>>
>> I think there is some issue in the way you have started the CEP.
>>
>> You should be able to build the products as you will be working with
>> components having snapshots versions.
>>
>> Regards
>> Suho
>>
>> On Wed, Apr 27, 2016 at 12:33 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I am trying to build the CEP by sourceas [1].it was built without
>>> errors.But when i run the ./wso2server.sh  i got his error
>>>
>>> JAVA_HOME environment variable is set to /usr/local/java/jdk1.8.0_51
>>> CARBON_HOME environment variable is set to
>>> /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
>>> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
>>> MaxPermSize=256m; support was removed in 8.0
>>> Could not load Logmanager "org.apache.juli.ClassLoaderLogManager"
>>> java.lang.ClassNotFoundException: org.apache.juli.ClassLoaderLogManager
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> at java.util.logging.LogManager$1.run(LogManager.java:195)
>>> at java.util.logging.LogManager$1.run(LogManager.java:181)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at java.util.logging.LogManager.<clinit>(LogManager.java:181)
>>> at java.util.logging.Logger.demandLogger(Logger.java:448)
>>> at java.util.logging.Logger.getLogger(Logger.java:502)
>>> at com.sun.jmx.remote.util.ClassLogger.<init>(ClassLogger.java:55)
>>> at sun.management.jmxremote.ConnectorBootstrap.<clinit>(ConnectorBootstrap.java:814)
>>> at sun.management.Agent.startLocalManagementAgent(Agent.java:138)
>>> at sun.management.Agent.startAgent(Agent.java:260)
>>> at sun.management.Agent.startAgent(Agent.java:447)
>>> Error: Could not find or load main class
>>> org.wso2.carbon.bootstrap.Bootstrap
>>>
>>> do i need some additional libraries there?Is it allright to go wit the
>>> [2] as we will be doing changes to source.
>>> BR,
>>> Mahesh.
>>>
>>>
>>> On Wed, Apr 27, 2016 at 12:17 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> You don't need to build the kernel. You can build either current master
>>>> of product-cep[1] or you can download the latest release from [2].
>>>>

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-27 Thread Mahesh Dananjaya
Hi Maheshakya,
I am trying to build CEP from source as in [1]. It built without
errors, but when I run ./wso2server.sh I get this error:

JAVA_HOME environment variable is set to /usr/local/java/jdk1.8.0_51
CARBON_HOME environment variable is set to /home/mahesh/GSOC/WSO2/product-cep/modules/distribution
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Could not load Logmanager "org.apache.juli.ClassLoaderLogManager"
java.lang.ClassNotFoundException: org.apache.juli.ClassLoaderLogManager
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.util.logging.LogManager$1.run(LogManager.java:195)
at java.util.logging.LogManager$1.run(LogManager.java:181)
at java.security.AccessController.doPrivileged(Native Method)
at java.util.logging.LogManager.(LogManager.java:181)
at java.util.logging.Logger.demandLogger(Logger.java:448)
at java.util.logging.Logger.getLogger(Logger.java:502)
at com.sun.jmx.remote.util.ClassLogger.(ClassLogger.java:55)
at sun.management.jmxremote.ConnectorBootstrap.(ConnectorBootstrap.java:814)
at sun.management.Agent.startLocalManagementAgent(Agent.java:138)
at sun.management.Agent.startAgent(Agent.java:260)
at sun.management.Agent.startAgent(Agent.java:447)
Error: Could not find or load main class org.wso2.carbon.bootstrap.Bootstrap

Do I need some additional libraries there? Is it all right to go with [2],
given that we will be making changes to the source?
BR,
Mahesh.


On Wed, Apr 27, 2016 at 12:17 PM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> You don't need to build the kernel. You can build either current master of
> product-cep[1] or you can download the latest release from [2].
>
> Best regards.
>
> [1] https://github.com/wso2/product-cep
> [2] http://wso2.com/products/complex-event-processor/
>
> On Wed, Apr 27, 2016 at 12:09 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> Do we need to build carbon kernal by source before we build  CEP by
>> source (https://github.com/wso2/carbon-kernel )  .Or is it inside those
>> sources.i am trying to build all three sources after forked them.thank you.
>> regards,
>> Mahesh
>>
>> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Congratulations and welcome to GSoC 2016. You did a great job in
>>> preparing the proposal. Now it's time to dig deep and get started with the
>>> project.
>>>
>>> First of all you need to familiarize with the code base. We have agreed
>>> to implement this with CEP event streams. We already have a CEP extension
>>> for predictions [1][2]. Go through this implementation and familiarize your
>>> self with that. You need to understand how:
>>>
>>>1. Even streams are consumed
>>>2. predictions are made from individual event
>>>3. Results are sent back
>>>
>>> Get WSO2 ML and CEP sources (You may use latest released version of CEP)
>>> and build the products. Get both carbon-ml[3] and product-ml[4] masters and
>>> create new branches for your work from masters.
>>>
>>> After you build the products, you may need to do remote debugging[5] to
>>> understand the flow. So please follow an example of real time prediction
>>> with ML with debugging and get some idea. The component you need to debug
>>> is org.wso2.carbon.ml.siddhi.extension.
>>>
>>> Next tasks would be implementing online learning algorithms in plain
>>> java with spark ml lib and integrating those to ML. We also need to come up
>>> with a proper and detailed architecture to employ those algorithms in ML.
>>> Getting familiar with the aforementioned sections would give you some
>>> insight on how this should be implemented.
>>>
>>> So please try to get a quick grasp then you can start the
>>> implementation. Let us know if you have any questions or you get stuck
>>> somewhere.
>>>
>>> Also, please always add WSO2 developer's list as well when you
>>> communicate with us regarding the project so that you can get opinions and
>>> feedback from others as well.
>>>
>>> Best regards.
>>>
>>> [1]
>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>
>

Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-27 Thread Mahesh Dananjaya
Hi Maheshakya,
Do we need to build the Carbon kernel (https://github.com/wso2/carbon-kernel)
from source before we build CEP from source, or is it already included in
those sources? I am trying to build all three after forking them. Thank you.
regards,
Mahesh

On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Congratulations and welcome to GSoC 2016. You did a great job in preparing
> the proposal. Now it's time to dig deep and get started with the project.
>
> First of all, you need to familiarize yourself with the code base. We have
> agreed to implement this with CEP event streams. We already have a CEP
> extension for predictions [1][2]. Go through this implementation and
> familiarize yourself with it. You need to understand how:
>
>    1. Event streams are consumed
>    2. Predictions are made from individual events
>    3. Results are sent back
>
> Get the WSO2 ML and CEP sources (you may use the latest released version of
> CEP) and build the products. Get both the carbon-ml[3] and product-ml[4]
> masters and create new branches for your work from them.
>
> After you build the products, you may need to do remote debugging[5] to
> understand the flow. Please run an example of real-time prediction with ML
> under the debugger to get an idea of how it works. The component you need
> to debug is org.wso2.carbon.ml.siddhi.extension.
>
> The next tasks would be implementing online learning algorithms in plain
> Java with Spark MLlib and integrating them into ML. We also need to come up
> with a proper and detailed architecture for employing those algorithms in
> ML. Getting familiar with the components mentioned above should give you
> some insight into how this should be implemented.
>
> So please try to get a quick grasp of these, and then you can start the
> implementation. Let us know if you have any questions or get stuck
> somewhere.
>
> Also, please always include the WSO2 developers' list when you communicate
> with us about the project, so that you can get opinions and feedback from
> others as well.
>
> Best regards.
>
> [1]
> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>
> [2]
> https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>
> [3] https://github.com/wso2/carbon-ml
>
> [4] https://github.com/wso2/product-ml
>
> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>
>
> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi,
>> thank you for accepting my GSOC 2016 proposal and i am looking forward
>> for the further instruction and project continuation. thank you very much.
>> regards,
>> Mahesh.
>>
>> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-04-25 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you. Give me a couple of days to set up the environment. I already
have all three; I will get the latest versions and start working on the
tasks above. I will let you know how it goes. Meanwhile, if you need to
reach me, I am available on both Skype and Hangouts. Thank you.
regards,
Mahesh.

On Mon, Apr 25, 2016 at 6:04 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> Congratulations and welcome to GSoC 2016. You did a great job in preparing
> the proposal. Now it's time to dig deep and get started with the project.
>
> First of all, you need to familiarize yourself with the code base. We have
> agreed to implement this with CEP event streams. We already have a CEP
> extension for predictions [1][2]. Go through this implementation and
> familiarize yourself with it. You need to understand how:
>
>    1. Event streams are consumed
>    2. Predictions are made from individual events
>    3. Results are sent back
>
> Get the WSO2 ML and CEP sources (you may use the latest released version of
> CEP) and build the products. Get both the carbon-ml[3] and product-ml[4]
> masters and create new branches for your work from them.
>
> After you build the products, you may need to do remote debugging[5] to
> understand the flow. Please run an example of real-time prediction with ML
> under the debugger to get an idea of how it works. The component you need
> to debug is org.wso2.carbon.ml.siddhi.extension.
>
> The next tasks would be implementing online learning algorithms in plain
> Java with Spark MLlib and integrating them into ML. We also need to come up
> with a proper and detailed architecture for employing those algorithms in
> ML. Getting familiar with the components mentioned above should give you
> some insight into how this should be implemented.
>
> So please try to get a quick grasp of these, and then you can start the
> implementation. Let us know if you have any questions or get stuck
> somewhere.
>
> Also, please always include the WSO2 developers' list when you communicate
> with us about the project, so that you can get opinions and feedback from
> others as well.
>
> Best regards.
>
> [1]
> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>
> [2]
> https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>
> [3] https://github.com/wso2/carbon-ml
>
> [4] https://github.com/wso2/product-ml
>
> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>
>
> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi,
>> thank you for accepting my GSOC 2016 proposal and i am looking forward
>> for the further instruction and project continuation. thank you very much.
>> regards,
>> Mahesh.
>>
>> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-31 Thread Mahesh Dananjaya
Hi Maheshakya,
Google has accepted my proof of enrollment. Should I proceed further with
the project now? I have been working with Spark MLlib and trying to
implement those two algorithms. Can you please tell me what the next step
should be, or do I need to wait? Thank you.
regards,
Mahesh.

On Fri, Mar 25, 2016 at 10:40 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> Thank you very much for the support you have given me during the last couple
> of weeks. I have finally submitted the proposal to the site, and I am looking
> forward to contributing to WSO2 ML. Thank you.
> regards,
> Mahesh.
>
> On Fri, Mar 25, 2016 at 7:49 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> i added the timeline according to my knowledge and uploaded.pls
>> check.thank you.
>> regards,
>> Mahesh.
>>
>> On Fri, Mar 25, 2016 at 7:09 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Can you add the time line of the project as I've mentioned. It's one of
>>> the crucial parts of the proposal that allows us to evaluate feasibility of
>>> the project in accordance with the given time period by Google.
>>>
>>> Best regards.
>>>
>>> On Fri, Mar 25, 2016 at 6:53 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>>
>>>> -- Forwarded message --
>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>> Date: Fri, Mar 25, 2016 at 7:02 PM
>>>> Subject: Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]
>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>
>>>>
>>>> Hi maheshakya,
>>>> I have uploaded my final submission.here it is. pls check it and inform
>>>> me anything i need to change.thank you.
>>>> BR,
>>>> Mahesh.
>>>>
>>>> On Fri, Mar 25, 2016 at 6:28 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> thank you very much. I will be updating the proposal with those
>>>>> changes and i will submit it by now.thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Fri, Mar 25, 2016 at 6:07 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> In the title, please include both tags [ML] and [CEP]
>>>>>>
>>>>>> Best regards.
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Also, please include an introduction to yourself (University,
>>>>>>> department), past experience in machine learning, language proficiency, 
>>>>>>> etc
>>>>>>> at the beginning of the proposal.
>>>>>>>
>>>>>>> Best regards.
>>>>>>>
>>>>>>> On Fri, Mar 25, 2016 at 5:47 PM, Maheshakya Wijewardena <
>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>
>>>>>>>> Hi Mahesh,
>>>>>>>>
>>>>>>>> Thank you for sending the draft. Please submit it as soon as
>>>>>>>> possible.
>>>>>>>>
>>>>>>>> Few high level comments:
>>>>>>>>
>>>>>>>> In the proposal, you must specifically mention that this will be
>>>>>>>> implemented as a Siddhi extension that can operate directly on incoming
>>>>>>>> streams.
>>>>>>>>
>>>>>>>> Also, you need to have a time line for the project, A sample looks
>>>>>>>> like:
>>>>>>>>
>>>>>>>> May 1- May 20 - Community bonding period - Getting familiar with
>>>>>>>> the platform and discussing implementation methods.
>>>>>>>> May 20 - May 30 - Implementing streaming k-means,
>>>>>>>> -
>>>>>>>> -
>>>>>>>> July 20-24 - Writing examples
>>>>>>>> July 24-18 - Documentation
>>>>>>>>
>>>>>>>> This s

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-14 Thread Mahesh Dananjaya
Hi Maheshakya,
I am writing some Java programs that break the dataset into several pieces
and train a model repeatedly on those pieces using Spark MLlib. Do I have to
do anything with Hadoop at this stage, given that I am working in standalone
mode? Thank you.
BR,
Mahesh.
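
The splitting step mentioned in this message can be sketched in plain Java as
follows. This is only an illustration under stated assumptions: it uses no
Spark classes, and the class name MiniBatchSplitter and its split method are
hypothetical names, not part of Spark MLlib or WSO2 ML.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniBatchSplitter {

    /** Splits rows into consecutive batches of at most batchSize rows each. */
    public static <T> List<List<T>> split(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(i);
        }
        // Each batch would be handed to one training run in turn.
        for (List<Integer> batch : split(rows, 4)) {
            System.out.println(batch);
        }
    }
}
```

Each returned batch stands in for one "piece" of the dataset; the last batch
may be smaller than the others when the row count is not a multiple of the
batch size.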

On Sun, Mar 13, 2016 at 6:30 PM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> You don't have to look into carbon-ml.
>
> Best regards.
>
> On Sun, Mar 13, 2016 at 5:49 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> i am working on some examples related to Spark and ML.is there anything
>> to do with carbon-ml. I think i dont need to look into that one.do i?
>> BR,
>> Mahesh
>>
>> On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> does that Scala API is with your current product or repo?
>>>
>>>
>>> No, we don't have the Scala API included. What we want is to design the
>>> Java implementations of those algorithms to train with mini-batches of
>>> streaming data with the help of the aforementioned methods so that we can
>>> include in as a CEP extension.
>>>
>>> As to clarify, please try to write a simple Java program using Spark
>>> MLLib linear regression and k-means clustering with a sample data set (You
>>> can find alot of data sets from UCI repo[1]).  You need to break the
>>> dataset into several pieces and train a model repeatedly with those.
>>> After each training run, save the model information (such as weights,
>>> intercepts for regression and cluster centers for clustering - please check
>>> the arguments of those methods I have mentioned and save the required
>>> information of the model)
>>> When training a model we a new piece of data, use those methods to
>>> initialize and put the save values for the arguments. This way you can
>>> start from where you stopped in the previous run.
>>>
>>> Let us know your observations and feel free to ask if you need to know
>>> anything more on this.
>>>
>>> We'll let you know what needs to be done to include this in CEP.
>>>
>>> Best regards.
>>>
>>> On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> great.thank you.i already have ML and CEP and working more towards it.
>>>> does that Scala API is with your current product or repo?.  thank you.
>>>> BR,
>>>> Mahesh.
>>>>
>>>> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Please find the comments inline.
>>>>>
>>>>> does data stream is taken to ML as the event publisher's format
>>>>>> through event publisher. Or  we can use direct traffic that comes to 
>>>>>> event
>>>>>> receiver, or else as streams
>>>>>>
>>>>> We intend to use the direct data as even streams.
>>>>>
>>>>> 1.) Those data coming from wso2 DAS to ML are coming as streams?
>>>>>>
>>>>> No, WSO2 ML doesn't use any even stream. The data stored in tables in
>>>>> DAS is loaded into ML.
>>>>>
>>>>> 2.) Are there any incremental learning algorithms currently active in
>>>>>> ML?you mentioned that there are and they are with scala API. So there is 
>>>>>> a
>>>>>> streaming support with that Scala API. In that API which format the data 
>>>>>> is
>>>>>> aquired to ML?
>>>>>>
>>>>> No, there are no incremental learning algorithms in ML. The scala API
>>>>> is about Spark MLLib. MLLib supports streaming k-means and other
>>>>> generalized linear models (linear regression variants and logistic
>>>>> regression) with Scala API. What they basically do in those 
>>>>> implementations
>>>>> is retraining the trained models with mini batches when data sequentially
>>>>> arrives. There, the breaking of streaming data into mini batches is done
>>>>> with the help of Spark Streaming. But we do not intend to use Spark
>>>>> streaming in our implementation. What we need to do is implement a similar
>>&g

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-08 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you very much. I am already on to that and will let you know soon. Thank you.
BR,
Mahesh.

On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <mahesha...@wso2.com
> wrote:

> Hi Mahesh,
>
> does that Scala API is with your current product or repo?
>
>
> No, we don't have the Scala API included. What we want is to design Java
> implementations of those algorithms that train on mini-batches of streaming
> data with the help of the aforementioned methods, so that we can include
> them as a CEP extension.
>
> To clarify, please try to write a simple Java program using Spark MLlib
> linear regression and k-means clustering with a sample data set (you can
> find a lot of data sets in the UCI repo[1]). You need to break the dataset
> into several pieces and train a model repeatedly with those.
> After each training run, save the model information (such as weights and
> intercepts for regression, and cluster centers for clustering; please check
> the arguments of the methods I have mentioned and save the information they
> require).
> When training a model with a new piece of data, use those methods to
> initialize it, passing the saved values as the arguments. This way you can
> start from where you stopped in the previous run.
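
A minimal sketch of the "start from where you stopped" idea for linear
regression, in plain Java. It is an illustration under assumptions, not the
Spark MLlib implementation: the class WarmStartRegression, its Model type and
the train method are hypothetical names, and each piece is fitted with simple
full-batch gradient descent on the squared error rather than with MLlib's
optimizer. The saved state is exactly the weights and the intercept.

```java
public class WarmStartRegression {

    /** Model state saved between training runs: weights plus intercept. */
    public static final class Model {
        public final double[] weights;
        public final double intercept;

        public Model(double[] weights, double intercept) {
            this.weights = weights.clone();
            this.intercept = intercept;
        }

        public double predict(double[] x) {
            double y = intercept;
            for (int i = 0; i < weights.length; i++) {
                y += weights[i] * x[i];
            }
            return y;
        }
    }

    /** Runs gradient-descent epochs on one piece of data, starting from the given model. */
    public static Model train(Model start, double[][] xs, double[] ys,
                              double stepSize, int epochs) {
        double[] w = start.weights.clone();
        double b = start.intercept;
        for (int e = 0; e < epochs; e++) {
            double[] gradW = new double[w.length];
            double gradB = 0.0;
            for (int n = 0; n < xs.length; n++) {
                // Prediction error of the current model on sample n.
                double err = b - ys[n];
                for (int i = 0; i < w.length; i++) {
                    err += w[i] * xs[n][i];
                }
                for (int i = 0; i < w.length; i++) {
                    gradW[i] += err * xs[n][i];
                }
                gradB += err;
            }
            // Step against the mean gradient of the squared error.
            for (int i = 0; i < w.length; i++) {
                w[i] -= stepSize * gradW[i] / xs.length;
            }
            b -= stepSize * gradB / xs.length;
        }
        return new Model(w, b);
    }

    public static void main(String[] args) {
        // Two pieces of a data set generated from y = 2x + 1.
        Model model = new Model(new double[] {0.0}, 0.0);
        model = train(model, new double[][] {{-1}, {0}, {1}}, new double[] {-1, 1, 3}, 0.5, 200);
        // The second run is initialized with the weights saved from the first.
        model = train(model, new double[][] {{2}, {3}}, new double[] {5, 7}, 0.05, 1);
        System.out.println(model.weights[0] + " " + model.intercept);
    }
}
```

The point of the design is that train never resets the model: whatever was
learned from earlier pieces is carried into the next run through the Model
object, which could equally be written to disk between runs.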
>
> Let us know your observations and feel free to ask if you need to know
> anything more on this.
>
> We'll let you know what needs to be done to include this in CEP.
>
> Best regards.
>
> On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> great.thank you.i already have ML and CEP and working more towards it.
>> does that Scala API is with your current product or repo?.  thank you.
>> BR,
>> Mahesh.
>>
>> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Please find the comments inline.
>>>
>>> does data stream is taken to ML as the event publisher's format through
>>>> event publisher. Or  we can use direct traffic that comes to event
>>>> receiver, or else as streams
>>>>
>>> We intend to use the direct data as event streams.
>>>
>>> 1.) Those data coming from wso2 DAS to ML are coming as streams?
>>>>
>>> No, WSO2 ML doesn't use any event stream. The data stored in tables in
>>> DAS is loaded into ML.
>>>
>>> 2.) Are there any incremental learning algorithms currently active in
>>>> ML?you mentioned that there are and they are with scala API. So there is a
>>>> streaming support with that Scala API. In that API which format the data is
>>>> aquired to ML?
>>>>
>>> No, there are no incremental learning algorithms in ML. The scala API is
>>> about Spark MLLib. MLLib supports streaming k-means and other generalized
>>> linear models (linear regression variants and logistic regression) with
>>> Scala API. What they basically do in those implementations is retraining
>>> the trained models with mini batches when data sequentially arrives. There,
>>> the breaking of streaming data into mini batches is done with the help of
>>> Spark Streaming. But we do not intend to use Spark streaming in our
>>> implementation. What we need to do is implement a similar behavior for
>>> event streams using the Java API.  The Java API has the following methods:
>>>
>>>    - createModel(Vector weights, double intercept) - for GLMs
>>>      createModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>
>>>      Vector: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html>
>>>    - setInitialModel(KMeansModel model) - for K means
>>>      setInitialModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28org.apache.spark.mllib.clustering.KMeansModel%29>
>>>      KMeansModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeansModel.html>
>>>
>>> With the help of these methods, we can train models again with newly
>>> arriving data, keeping the characteristics learned with the previous data.
>>> When implementing this, we need to pay attention to other parameters of
>>> incremental learning such as data horizon and data obsolescence (indicated
>>

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-07 Thread Mahesh Dananjaya
Hi Maheshakya,
Great, thank you. I already have ML and CEP and am working more with them.
Is that Scala API included in your current product or repo? Thank you.
BR,
Mahesh.

On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Hi Mahesh,
>
> Please find the comments inline.
>
> does data stream is taken to ML as the event publisher's format through
>> event publisher. Or  we can use direct traffic that comes to event
>> receiver, or else as streams
>>
> We intend to use the direct data as event streams.
>
> 1.) Those data coming from wso2 DAS to ML are coming as streams?
>>
> No, WSO2 ML doesn't use any event stream. The data stored in tables in DAS
> is loaded into ML.
>
> 2.) Are there any incremental learning algorithms currently active in
>> ML?you mentioned that there are and they are with scala API. So there is a
>> streaming support with that Scala API. In that API which format the data is
>> aquired to ML?
>>
> No, there are no incremental learning algorithms in ML. The scala API is
> about Spark MLLib. MLLib supports streaming k-means and other generalized
> linear models (linear regression variants and logistic regression) with
> Scala API. What they basically do in those implementations is retraining
> the trained models with mini batches when data sequentially arrives. There,
> the breaking of streaming data into mini batches is done with the help of
> Spark Streaming. But we do not intend to use Spark streaming in our
> implementation. What we need to do is implement a similar behavior for
> event streams using the Java API.  The Java API has the following methods:
>
>    - createModel(Vector weights, double intercept) - for GLMs
>      createModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>
>      Vector: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html>
>    - setInitialModel(KMeansModel model) - for K means
>      setInitialModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28org.apache.spark.mllib.clustering.KMeansModel%29>
>      KMeansModel: <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeansModel.html>
>
> With the help of these methods, we can train models again with newly
> arriving data, keeping the characteristics learned with the previous data.
> When implementing this, we need to pay attention to other parameters of
> incremental learning such as data horizon and data obsolescence (indicated
> in the project ideas page).
> We need to discuss how to combine these with CEP event streams. I have
> added Suho to the thread for more clarification.
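
One way to picture retraining k-means on newly arriving data while discounting
older data (the "data obsolescence" concern above) is a decay-weighted center
update. The sketch below is plain Java written for illustration only: the
class DecayedKMeans, its decay parameter and the update rule are assumptions
in the spirit of streaming k-means, not the Spark MLlib code. The saved model
state is the cluster centers plus an effective point count per center.

```java
public class DecayedKMeans {
    private final double[][] centers;
    private final double[] counts;  // effective number of points behind each center
    private final double decay;     // 1.0 keeps all history, 0.0 forgets it

    public DecayedKMeans(double[][] initialCenters, double[] initialCounts, double decay) {
        this.centers = new double[initialCenters.length][];
        for (int c = 0; c < initialCenters.length; c++) {
            this.centers[c] = initialCenters[c].clone();
        }
        this.counts = initialCounts.clone();
        this.decay = decay;
    }

    /** Index of the center closest to the point (squared Euclidean distance). */
    public int nearest(double[] point) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - centers[c][j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Folds one mini-batch into the centers, discounting old data by the decay factor. */
    public void update(double[][] batch) {
        int dims = centers[0].length;
        double[][] sums = new double[centers.length][dims];
        double[] batchCounts = new double[centers.length];
        for (double[] point : batch) {
            int c = nearest(point);
            batchCounts[c] += 1.0;
            for (int j = 0; j < dims; j++) {
                sums[c][j] += point[j];
            }
        }
        for (int c = 0; c < centers.length; c++) {
            double oldWeight = counts[c] * decay;
            double total = oldWeight + batchCounts[c];
            if (total == 0.0) {
                continue; // no old or new evidence for this cluster; keep its center
            }
            for (int j = 0; j < dims; j++) {
                centers[c][j] = (centers[c][j] * oldWeight + sums[c][j]) / total;
            }
            counts[c] = total;
        }
    }

    public double[] center(int c) {
        return centers[c].clone();
    }
}
```

With decay set to 1.0 every past point counts as much as a new one; values
closer to 0.0 let the centers follow recent batches more quickly.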
>
> Best regards.
>
>
> On Sat, Mar 5, 2016 at 5:15 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> as we concerned to use WSO2 CEP to handle streaming data and implement
>> the machine learning algorithms with Spark MLLib, does data stream is taken
>> to ML as the event publisher's format through event publisher. Or  we can
>> use direct traffic that comes to event receiver, or else as streams.
>> referring to https://docs.wso2.com/display/CEP410/User+Guide
>> 1.) Those data coming from wso2 DAS to ML are coming as streams?
>> 2.) Are there any incremental learning algorithms currently active in
>> ML?you mentioned that there are and they are with scala API. So there is a
>> streaming support with that Scala API. In that API which format the data is
>> aquired to ML?
>>
>> thank you.
>> BR,
>> Mahesh.
>>
>> On Fri, Mar 4, 2016 at 2:03 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> We had to modify a the project scope a little to suit best for the
>>> requirements. We will update the project idea with those concerns soon and
>>> let you know.
>>>
>>> We do not support streaming data in WSO2 Machine learner at the moment.
>>> The new concern is to use WSO2 CEP to handle streaming data and implement
>>> the machine learning algorithms with Spark MLLib. You can look at the
>>> streaming k-means and streaming linear regression implementations in MLLib.
>>> Currently, the API is only for scala. Our need is to get the Java APIs of
>>> k-means and generalized linear models to support incremental learning with
>>> streaming data. This has to be done as mini-batch learning since these
>>> algorithms operates as s

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-05 Thread Mahesh Dananjaya
Hi Maheshakya,
Since we plan to use WSO2 CEP to handle streaming data and implement the
machine learning algorithms with Spark MLlib, is the data stream passed to
ML in the event publisher's format through an event publisher? Or can we use
the direct traffic that arrives at the event receiver, or take it as streams?
I am referring to https://docs.wso2.com/display/CEP410/User+Guide
1.) Does the data coming from WSO2 DAS to ML arrive as streams?
2.) Are there any incremental learning algorithms currently active in ML?
You mentioned that there are, and that they use the Scala API, so there is
streaming support in that Scala API. In which format does that API acquire
the data for ML?

thank you.
BR,
Mahesh.

On Fri, Mar 4, 2016 at 2:03 PM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Hi Mahesh,
>
> We had to modify the project scope a little to best suit the
> requirements. We will update the project idea with those concerns soon and
> let you know.
>
> We do not support streaming data in WSO2 Machine learner at the moment.
> The new concern is to use WSO2 CEP to handle streaming data and implement
> the machine learning algorithms with Spark MLLib. You can look at the
> streaming k-means and streaming linear regression implementations in MLLib.
> Currently, the API is available only for Scala. Our need is to get the Java
> APIs of k-means and generalized linear models to support incremental
> learning with streaming data. This has to be done as mini-batch learning,
> since these algorithms operate as stochastic gradient descent, so that any
> learning with new data can be done on top of the previously learned models.
> So please go through those APIs[1][2][3] and try to get an idea.
> Also, please try to understand how event streams work in WSO2 CEP [4][5].
>
> Best regards.
>
> [1]
> http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html
> [2]
> http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html
> [3]
> http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/classification/LogisticRegressionWithSGD.html
> [4] https://docs.wso2.com/display/CEP310/Working+with+Event+Streams
> [5] https://docs.wso2.com/display/CEP310/Working+with+Execution+Plans
>
> On Fri, Mar 4, 2016 at 11:26 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> give me sometime to go through your ML package. Do current product have
>> any stream data support?. i did some university projects related to machine
>> learning with regressions,modelling, factor analysis, cluster analysis and
>> classification problems (Discriminant Analysis) with SVM (Support Vector
>> machines), Neural networks, LS classification and ML(Maximum likelihood).
>> give me sometime to see how wso2 architecture works.then i can come up with
>> good architecture.thank you.
>> BR,
>> Mahesh.
>>
>> On Wed, Mar 2, 2016 at 2:41 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Thank you for the resources. I will go through this and looking forward
>>> to this proposed project.Thank you.
>>> BR,
>>> Mahesh.
>>>
>>> On Wed, Mar 2, 2016 at 1:52 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Thank you for the interest for this project.
>>>>
>>>> We would like to know what type of similar projects you have worked on.
>>>> You may have seen that WSO2 Machine Learner supports several learning
>>>> algorithms at the moment[1]. This project intends to leverage the existing
>>>> algorithms in WSO2 Machine Learner to support streaming data. As an
>>>> initiative, first you can get an idea about what WSO2 Machine Learner does
>>>> and how it operates. You can download WSO2 Machine Learner from product
>>>> page[2] and the the source code [3]. ML is using Apache Spark MLLib[4] for
>>>> its' algorithms so it's better to read and understand what it does as well.
>>>>
>>>> In order to get an idea about the deliverables and the scope of this
>>>> project, try to understand how Spark streaming[5] (see examples) handles
>>>> streaming data. Also, have a look in the streaming algorithms[6][7]
>>>> supported by MLLib. There are two approaches discussed to employ
>>>> incremental learning in ML in the project proposals page. These streaming
>>>> algorithms can be directly used in the first approach. For the other
>>>> approach, the your implementation should contain a pro

Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-03 Thread Mahesh Dananjaya
Hi Maheshakya,
Give me some time to go through your ML package. Does the current product
have any support for streaming data? I have done some university projects
related to machine learning: regression, modelling, factor analysis, cluster
analysis, and classification problems (discriminant analysis) with SVMs
(support vector machines), neural networks, LS classification, and ML
(maximum likelihood). Give me some time to see how the WSO2 architecture
works; then I can come up with a good architecture. Thank you.
BR,
Mahesh.

On Wed, Mar 2, 2016 at 2:41 PM, Mahesh Dananjaya <dananjayamah...@gmail.com>
wrote:

> Hi Maheshakya,
> Thank you for the resources. I will go through them and am looking forward to
> this proposed project. Thank you.
> BR,
> Mahesh.
>
> On Wed, Mar 2, 2016 at 1:52 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Thank you for your interest in this project.
>>
>> We would like to know what type of similar projects you have worked on.
>> You may have seen that WSO2 Machine Learner supports several learning
>> algorithms at the moment [1]. This project intends to leverage the existing
>> algorithms in WSO2 Machine Learner to support streaming data. As a first
>> step, you can get an idea of what WSO2 Machine Learner does and how it
>> operates. You can download WSO2 Machine Learner from the product page [2]
>> and get the source code from [3]. ML uses Apache Spark MLlib [4] for its
>> algorithms, so it is worth reading up on what MLlib does as well.
>>
>> In order to get an idea about the deliverables and the scope of this
>> project, try to understand how Spark Streaming [5] (see the examples) handles
>> streaming data. Also, have a look at the streaming algorithms [6][7]
>> supported by MLlib. The project proposals page discusses two approaches to
>> incremental learning in ML. These streaming algorithms can be used directly
>> in the first approach. For the other approach, your implementation should
>> contain a procedure to create mini-batches of a suitable size from the
>> streaming data (i.e. a moving window) and periodically retrain the same
>> algorithm on them.
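The second approach described above (a moving window of mini-batches with periodic retraining) could be sketched roughly as follows. This is only an illustrative sketch, not WSO2 Machine Learner or Spark MLlib code: `WindowedRetrainer` and `fit_line` are hypothetical names, and a plain least-squares line fit stands in for whichever batch algorithm would actually be retrained.

```python
from collections import deque


def fit_line(batch):
    """Ordinary least squares for y = slope * x + intercept over (x, y) pairs."""
    n = len(batch)
    mean_x = sum(x for x, _ in batch) / n
    mean_y = sum(y for _, y in batch) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in batch)
    if var_x == 0:
        return 0.0, mean_y  # all x values equal: fall back to a constant model
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in batch)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


class WindowedRetrainer:
    """Hypothetical helper: keeps the newest `window_size` events and
    refits the model from scratch every `retrain_every` events."""

    def __init__(self, window_size, retrain_every):
        self.window = deque(maxlen=window_size)  # oldest events drop out automatically
        self.retrain_every = retrain_every
        self.seen = 0
        self.model = None  # (slope, intercept) once the first fit has run

    def on_event(self, x, y):
        """Add one streamed event; retrain when the period is reached."""
        self.window.append((x, y))
        self.seen += 1
        if self.seen % self.retrain_every == 0:
            self.model = fit_line(self.window)
        return self.model

    def predict(self, x):
        if self.model is None:
            raise RuntimeError("model has not been trained yet")
        slope, intercept = self.model
        return slope * x + intercept


if __name__ == "__main__":
    retrainer = WindowedRetrainer(window_size=4, retrain_every=2)
    for i in range(10):
        retrainer.on_event(float(i), 2.0 * i + 1.0)  # stream follows y = 2x + 1
    print(retrainer.model, retrainer.predict(10.0))
```

Because the window is bounded, events older than `window_size` stop influencing the model, so it can follow a stream whose behaviour changes over time. The window size and retraining period here are arbitrary values chosen for the example.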
>>
>> To start with the project, you will need to come up with a suitable plan
>> and an architecture first.
>>
>> Please watch the video referenced in the proposal (reference: 5). It will
>> help you get a better idea of machine learning algorithms with
>> streaming data.
>>
>> Let us know if you need any help with these.
>>
>> Best regards
>>
>> [1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
>> [2] http://wso2.com/products/machine-learner/
>> [3]
>> https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
>> [4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
>> [5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
>> [6]
>> https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
>> [7]
>> https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means
>>
>> On Wed, Mar 2, 2016 at 1:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi all,
>>> I am interested in contributing to proposal 6: "Predictive analytic with
>>> online data for WSO2 Machine Learner" for GSOC2016 this time. Since I have
>>> been involved in some similar projects, I think it will be a great
>>> experience for me. Please let me know what you think and what you suggest.
>>> I have been going through your documents. Thank you.
>>> Regards,
>>> Mahesh Dananjaya.
>>>
>>>
>>> ___
>>> Dev mailing list
>>> Dev@wso2.org
>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>
>>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-02 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you for the resources. I will go through them and am looking forward to
this proposed project. Thank you.
BR,
Mahesh.

On Wed, Mar 2, 2016 at 1:52 PM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Hi Mahesh,
>
> Thank you for your interest in this project.
>
> We would like to know what type of similar projects you have worked on.
> You may have seen that WSO2 Machine Learner supports several learning
> algorithms at the moment [1]. This project intends to leverage the existing
> algorithms in WSO2 Machine Learner to support streaming data. As a first
> step, you can get an idea of what WSO2 Machine Learner does and how it
> operates. You can download WSO2 Machine Learner from the product page [2]
> and get the source code from [3]. ML uses Apache Spark MLlib [4] for its
> algorithms, so it is worth reading up on what MLlib does as well.
>
> In order to get an idea about the deliverables and the scope of this
> project, try to understand how Spark Streaming [5] (see the examples) handles
> streaming data. Also, have a look at the streaming algorithms [6][7]
> supported by MLlib. The project proposals page discusses two approaches to
> incremental learning in ML. These streaming algorithms can be used directly
> in the first approach. For the other approach, your implementation should
> contain a procedure to create mini-batches of a suitable size from the
> streaming data (i.e. a moving window) and periodically retrain the same
> algorithm on them.
>
> To start with the project, you will need to come up with a suitable plan
> and an architecture first.
>
> Please watch the video referenced in the proposal (reference: 5). It will
> help you get a better idea of machine learning algorithms with
> streaming data.
>
> Let us know if you need any help with these.
>
> Best regards
>
> [1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
> [2] http://wso2.com/products/machine-learner/
> [3]
> https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
> [4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
> [5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
> [6]
> https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
> [7]
> https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means
>
> On Wed, Mar 2, 2016 at 1:19 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi all,
>> I am interested in contributing to proposal 6: "Predictive analytic with
>> online data for WSO2 Machine Learner" for GSOC2016 this time. Since I have
>> been involved in some similar projects, I think it will be a great
>> experience for me. Please let me know what you think and what you suggest.
>> I have been going through your documents. Thank you.
>> Regards,
>> Mahesh Dananjaya.
>>
>>
>> ___
>> Dev mailing list
>> Dev@wso2.org
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


[Dev] GSOC2016: Proposal 6: [ML]

2016-03-01 Thread Mahesh Dananjaya
Hi all,
I am interested in contributing to proposal 6: "Predictive analytic with
online data for WSO2 Machine Learner" for GSOC2016 this time. Since I have
been involved in some similar projects, I think it will be a great
experience for me. Please let me know what you think and what you suggest.
I have been going through your documents. Thank you.
Regards,
Mahesh Dananjaya.


[Dev] Fwd: GSOC2016: Proposal 6: [ML]

2016-03-01 Thread Mahesh Dananjaya
Hi all,
I am interested in contributing to proposal 6: "Predictive analytic with
online data for WSO2 Machine Learner" for GSOC2016 this time. Since I have
been involved in some similar projects, I think it will be a great
experience for me. Please let me know what you think and what you suggest.
I have been going through your documents. Thank you.
Regards,
Mahesh Dananjaya.