Hi all,

I'm running tests with several datasets, and most of them seem to work fine. However, I stumbled upon the Leaf dataset (https://archive.ics.uci.edu/ml/datasets/Leaf), which does not seem to work well with the deep learning algorithm. Since the dataset works fine with other algorithms (e.g. Logistic Regression with L-BFGS), I suspect this is due to some sort of malformed data format. I'm looking into that now.
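One quick check I'm running is a small script that scans the CSV for rows with an inconsistent column count or non-numeric values. This is just a rough standalone sketch (plain Python, not WSO2 ML code; the function name is mine):

```python
import csv
import io

def find_malformed_rows(csv_text, has_header=True):
    """Return (row_number, reason) pairs for rows that look malformed:
    wrong column count, or values that fail to parse as numbers."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return []
    start = 1 if has_header else 0
    # Take the first data row as the reference for the expected width.
    expected_cols = len(rows[start]) if len(rows) > start else len(rows[0])
    problems = []
    for i, row in enumerate(rows[start:], start=start + 1):
        if len(row) != expected_cols:
            problems.append((i, "expected %d columns, got %d" % (expected_cols, len(row))))
            continue
        for value in row:
            try:
                float(value)
            except ValueError:
                problems.append((i, "non-numeric value: %r" % value))
                break
    return problems
```

Running something like this over the Leaf CSV should quickly tell us whether the failure is caused by row shape or by unexpected tokens.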
Furthermore, I am thinking of starting on the D3 visualization for the parameter setting stage. Should we move forward with that idea?

Finally, I would like to remind everyone that we haven't decided on a date for the code review. Should we do that?

Thank you

On Wed, Jul 22, 2015 at 11:13 PM, Thushan Ganegedara <thu...@gmail.com> wrote:

> Hi,
>
> Apologies about the late reply.
>
> Notes of the demonstration
>
> Time duration: approximately 30 mins
>
> The demonstration showcased the deep learning feature implemented for WSO2 ML. The demo started by explaining the dataset used (i.e. MNIST), a CSV file with approximately 30000 rows and 784 features.
>
> Next, the dataset was loaded into WSO2 ML. Here a concern was raised regarding selecting the type of data in the preprocessing phase (i.e. categorical vs. numerical). The suggestion was that there should be a UI feature to change the data type of all the variables at once (very useful for large numbers of features).
>
> Next, the deep learning algorithm was demonstrated on the MNIST dataset and achieved approximately 95% accuracy. Regarding the deep learning algorithms, H2O doesn't seem to offer different deep learning architectures at the moment, only a general deep network + classifier (probably an autoencoder). So the idea was to ask the H2O team whether they are planning to implement different networks in the future.
>
> Also, it was suggested to add a visualization feature to the parameter setting stage to provide a summarized visualization of the network to the user.
>
> Furthermore, another suggestion was to test the deep network on real-world datasets and see how it performs. For this, datasets from Kaggle will be used.
>
> About progress:
>
> I'm currently testing the algorithm against different datasets, and I'll provide a detailed report on that in the near future.
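On the bulk data-type suggestion above: the "change all variables at once" idea amounts to a one-line bulk cast. A rough sketch using pandas purely for illustration (pandas is not part of WSO2 ML, and the column names are made up):

```python
import pandas as pd

# Illustrative frame: in the MNIST case there would be 784 pixel columns.
df = pd.DataFrame({"pix0": [0, 1, 2], "pix1": [3, 4, 5], "label": [0, 1, 0]})

# Mark every feature column numerical in one step...
feature_cols = [c for c in df.columns if c != "label"]
df[feature_cols] = df[feature_cols].astype("float64")

# ...and the target categorical, instead of clicking through 784 dropdowns.
df["label"] = df["label"].astype("category")
```

A "set all to numerical / set all to categorical" button in the preprocessing UI could behave exactly like this bulk cast.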
>
> Thank you
>
> On Wed, Jul 22, 2015 at 2:00 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> @Thushan how are you progressing? Could you please send the notes of our last review?
>>
>> On Thu, Jul 16, 2015 at 10:43 AM, CD Athuraliya <chathur...@wso2.com> wrote:
>>
>>> On Mon, Jul 13, 2015 at 11:12 AM, Thushan Ganegedara <thu...@gmail.com> wrote:
>>>
>>>> Hello CD,
>>>>
>>>> Yes, it seems to be working fine now. But why does it show the axes in meters? Is this a d3-specific thing?
>>>
>>> I think *m* stands for *milli* here.
>>>
>>>> On Mon, Jul 13, 2015 at 3:17 PM, Thushan Ganegedara <thu...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Thank you very much for pointing that out. I'll get the latest update and see.
>>>>>
>>>>> On Mon, Jul 13, 2015 at 3:03 PM, CD Athuraliya <chathur...@wso2.com> wrote:
>>>>>
>>>>>> Hi Thushan,
>>>>>>
>>>>>> That method has been updated. Please get the latest. You might have to define your own case depending on predicted values.
>>>>>>
>>>>>> CD Athuraliya
>>>>>> Sent from my mobile device
>>>>>>
>>>>>> On Jul 13, 2015 10:24 AM, "Nirmal Fernando" <nir...@wso2.com> wrote:
>>>>>>
>>>>>>> Great work Thushan! On the UI issues, @CD could help you. AFAIK, actual keeps the pointer to the actual label, predicted is the probability, and predictedLabel is the result of rounding it using a threshold.
>>>>>>>
>>>>>>> On Mon, Jul 13, 2015 at 7:14 AM, Thushan Ganegedara <thu...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I have integrated H2O deep learning into WSO2 ML successfully. Following are the stats for the 2 tests conducted (screenshots attached):
>>>>>>>>
>>>>>>>> Iris dataset - 93.62% accuracy
>>>>>>>> MNIST (small) dataset - 94.94% accuracy
>>>>>>>>
>>>>>>>> However, there were a few unusual issues that took a lot of time to identify.
>>>>>>>>
>>>>>>>> *FrameSplitter does not work for any value other than 0.5.
>>>>>>>> For any other value, the following error is returned:*
>>>>>>>> (FrameSplitter is used to split the training data into train and validation sets.)
>>>>>>>>
>>>>>>>> barrier onExCompletion for hex.deeplearning.DeepLearning$DeepLearningDriver@25e994ae
>>>>>>>> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.NullPointerException
>>>>>>>>     at hex.deeplearning.DeepLearning$DeepLearningDriver.trainModel(DeepLearning.java:382)
>>>>>>>>
>>>>>>>> *DeepLearningModel.score(double[] vec) doesn't work.*
>>>>>>>> The predictions obtained with score(Frame f) and score(double[] v) are shown below:
>>>>>>>>
>>>>>>>> *Actual, score(Frame f), score(double[] v)*
>>>>>>>> 0.0, 0.0, 1.0
>>>>>>>> 1.0, 1.0, 2.0
>>>>>>>> 2.0, 2.0, 2.0
>>>>>>>> 2.0, 1.0, 2.0
>>>>>>>> 1.0, 1.0, 2.0
>>>>>>>>
>>>>>>>> As you can see, score(double[] v) is quite poor.
>>>>>>>>
>>>>>>>> After fixing the above issues, everything seems to be working fine at the moment.
>>>>>>>>
>>>>>>>> However, I have a concern regarding the following code in view-model.jag -> function drawPredictedVsActualChart(testResultDataPointsSample):
>>>>>>>>
>>>>>>>> var actual = testResultDataPointsSample[i].predictedVsActual.actual;
>>>>>>>> var predicted = testResultDataPointsSample[i].predictedVsActual.predicted;
>>>>>>>> var labeledPredicted = labelPredicted(predicted, 0.5);
>>>>>>>>
>>>>>>>> if (actual == labeledPredicted) {
>>>>>>>>     predictedVsActualPoint[2] = 'Correct';
>>>>>>>> } else {
>>>>>>>>     predictedVsActualPoint[2] = 'Incorrect';
>>>>>>>> }
>>>>>>>>
>>>>>>>> Why does it compare *actual and labeledPredicted* where it should be comparing *actual and predicted*?
>>>>>>>>
>>>>>>>> Also, the *Actual vs Predicted graph for MNIST shows the axes in "meters"* (mnist.png), which doesn't make sense. I'm still looking into this.
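On the labeledPredicted question above: per Nirmal's note earlier in the thread, predicted is a probability, so it has to be thresholded before it can be compared with a class label; the raw probability would almost never equal the label exactly. A rough Python sketch of the snippet's logic (the function names here are mine and only mirror the labelPredicted helper in view-model.jag):

```python
def label_predicted(predicted, threshold=0.5):
    """Turn a predicted probability into a 0/1 class label,
    mirroring labelPredicted(predicted, 0.5) in view-model.jag."""
    return 1.0 if predicted >= threshold else 0.0

def mark_point(actual, predicted, threshold=0.5):
    """Compare the actual label against the thresholded label.
    Comparing actual against the raw probability (e.g. 1.0 == 0.73)
    would mark almost every point 'Incorrect'."""
    if actual == label_predicted(predicted, threshold):
        return "Correct"
    return "Incorrect"
```

For example, with actual = 1.0 and predicted = 0.73, comparing the raw values fails even though the model is right, while the thresholded comparison marks the point correctly.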
>>>>>>>>
>>>>>>>> Thank you
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Thushan Ganegedara
>>>>>>>> School of IT
>>>>>>>> University of Sydney, Australia
>>>>>>>
>>>>>>> --
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>
>>> --
>>> *CD Athuraliya*
>>> Software Engineer
>>> WSO2, Inc.
>>> lean . enterprise . middleware
>>> Mobile: +94 716288847
>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter <https://twitter.com/cdathuraliya> | Blog <http://cdathuraliya.tumblr.com/>

--
Regards,

Thushan Ganegedara
School of IT
University of Sydney, Australia
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev