Hi,

> This was mainly due to the detection of a numerical feature as a
> categorical one.
Oh, that makes sense now. Why don't we take a sample of the data and, if the
sample contains only integers (or doubles with no fractional part) or strings,
treat the feature as a categorical variable?
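
A minimal sketch of that sampling idea, not code from WSO2 ML. The class and
method names (SampleTypeCheck, hasFraction, looksCategorical) are hypothetical,
and it assumes the sampled column values are available as raw strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of WSO2 ML: guesses the feature type from a sample.
class SampleTypeCheck {

    /** True when the token parses as a number that is not a whole value, e.g. "0.72". */
    static boolean hasFraction(String token) {
        try {
            double value = Double.parseDouble(token.trim());
            return value != Math.rint(value);
        } catch (NumberFormatException e) {
            return false; // not a number, so it is treated as a string value
        }
    }

    /** Categorical when no sampled value has a fractional part (whole numbers or strings only). */
    static boolean looksCategorical(List<String> sample) {
        if (sample.isEmpty()) {
            return false; // nothing to judge from
        }
        for (String token : sample) {
            if (hasFraction(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksCategorical(Arrays.asList("1.0", "12.0", "36.0"))); // true
        System.out.println(looksCategorical(Arrays.asList("0.72", "1.5")));         // false
    }
}
```

Note that a genuinely numerical feature holding only whole values (a count, for
example) would also be flagged, so a check like this may still need to be
combined with the threshold.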

> We suggested increasing the categorical threshold as a work-around.
> @thushan did it work?
Yes, it worked after I increased the threshold to 40.
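
For reference, a sketch of how a distinct-value threshold could behave. It
assumes the threshold is the maximum number of distinct values a column may
have and still be treated as categorical; the thread does not state the exact
rule, and the class and method names are hypothetical. The value 20 is an
arbitrary lower threshold chosen for comparison, not a documented default;
36 and 40 are the numbers mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not the WSO2 ML implementation.
class CategoricalThresholdSketch {

    /** Assumed rule: categorical when the number of distinct values is at most the threshold. */
    static boolean isCategorical(List<String> column, int threshold) {
        Set<String> distinct = new HashSet<>(column);
        return distinct.size() <= threshold;
    }

    public static void main(String[] args) {
        // Build a label column with classes 1..36, ten rows per class.
        List<String> labels = new ArrayList<>();
        for (int rep = 0; rep < 10; rep++) {
            for (int cls = 1; cls <= 36; cls++) {
                labels.add(String.valueOf(cls));
            }
        }
        System.out.println(isCategorical(labels, 20)); // false: 36 distinct values exceed 20
        System.out.println(isCategorical(labels, 40)); // true: 36 distinct values fit within 40
    }
}
```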

On Fri, Aug 14, 2015 at 2:21 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> This was mainly due to the detection of a numerical feature as a
> categorical one.
>
> We suggested increasing the categorical threshold as a work-around.
> @thushan did it work?
>
> On Tue, Aug 11, 2015 at 5:50 PM, Thushan Ganegedara <thu...@gmail.com>
> wrote:
>
>> This issue occurs if I turn the response variable into a categorical
>> variable. If I treat it as a numerical variable, the values are
>> read correctly.
>>
>> So I presume there is a fault in the categorical conversion of the variable.
>>
>> On Tue, Aug 11, 2015 at 7:11 PM, Thushan Ganegedara <thu...@gmail.com>
>> wrote:
>>
>>> I still get the same result:
>>>
>>> 1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
>>> 1.0     1.0     1.0     12.0    12.0    12.0    12.0    12.0    12.0
>>> 12.0    12.0    12.0    12.0    13.0    13.0    13.0    13.0    13.0    13.0
>>> 13.0    13.0    13.0    13.0    14.0    14.0    14.0    14.0    14.0
>>> 14.0    14.0    14.0    15.0    15.0    15.0    15.0    15.0    15.0
>>> 15.0    15.0    15.0    15.0    15.0    15.0    16.0    16.0    16.0    16.0
>>> 16.0    16.0    16.0    16.0    17.0    17.0    17.0    17.0    17.0
>>> 17.0    17.0    17.0    17.0    17.0    18.0    18.0    18.0    18.0
>>> 18.0    18.0    18.0    18.0    18.0    18.0    18.0    19.0    19.0    19.0
>>> 19.0    19.0    19.0    19.0    19.0    19.0    19.0    19.0    19.0
>>> 19.0    19.0    2.0     2.0     2.0     2.0     2.0     2.0     2.0
>>> 2.0     2.0     2.0     2.0     2.0     2.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     4.0     4.0     4.0     4.0     4.0     4.0
>>> 4.0     4.0     4.0     4.0     4.0     4.0     5.0     5.0     5.0     5.0
>>> 5.0     5.0     5.0     5.0     5.0     5.0     5.0     5.0     5.0
>>> 6.0     6.0     6.0     6.0     6.0     6.0     6.0     6.0     6.0
>>> 6.0     6.0     6.0     7.0     7.0     7.0     7.0     7.0     7.0     7.0
>>> 7.0     7.0     7.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>> 3.0     3.0     3.0     3.0
>>>
>>> On Tue, Aug 11, 2015 at 7:05 PM, Nirmal Fernando <nir...@wso2.com>
>>> wrote:
>>>
>>>> Can you try the following code?
>>>>
>>>> // Collect the RDD once, then print every label separated by tabs.
>>>> List<LabeledPoint> points = labeledPoints.collect();
>>>> for (int i = 0; i < points.size(); i++) {
>>>>     System.out.print(points.get(i).label() + "\t");
>>>> }
>>>>
>>>> On Tue, Aug 11, 2015 at 2:30 PM, Thushan Ganegedara <thu...@gmail.com>
>>>> wrote:
>>>>
>>>>> I used the following snippet
>>>>>
>>>>> for(int i=0;i<labeledPoints.collect().size();i++){
>>>>>     System.out.print(labeledPoints.collect().get(i).label() + "\t");
>>>>> }
>>>>>
>>>>> inside the method public MLModel build() throws MLModelBuilderException
>>>>> in DeeplearningModelBuilder.java
>>>>>
>>>>>
>>>>> On Tue, Aug 11, 2015 at 6:17 PM, Nirmal Fernando <nir...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi thushan,
>>>>>>
>>>>>> We need more info. What exactly did you print, and where?
>>>>>>
>>>>>> On Tue, Aug 11, 2015 at 12:47 PM, Thushan Ganegedara <
>>>>>> thu...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I found the potential cause of the poor accuracy for the leaf
>>>>>>> dataset. It seems the data read into ML is wrong.
>>>>>>>
>>>>>>> I have attached the data file as a CSV (classes are in the last
>>>>>>> column)
>>>>>>>
>>>>>>> However, when I print out the labels of the read data (classes), it
>>>>>>> looks something like below. Clearly there aren't this many "3.0" classes
>>>>>>> and there should be classes up to 36.0.
>>>>>>>
>>>>>>> Is this caused by a bug?
>>>>>>>
>>>>>>> 1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
>>>>>>> 1.0     1.0     1.0     1.0     12.0    12.0    12.0    12.0    12.0
>>>>>>> 12.0    12.0    12.0    12.0    12.0    13.0    13.0    13.0    13.0
>>>>>>> 13.0    13.0
>>>>>>> 13.0    13.0    13.0    13.0    14.0    14.0    14.0    14.0
>>>>>>> 14.0    14.0    14.0    14.0    15.0    15.0    15.0    15.0    15.0
>>>>>>> 15.0    15.0    15.0    15.0    15.0    15.0    15.0    16.0    16.0
>>>>>>> 16.0    16.0
>>>>>>> 16.0    16.0    16.0    16.0    17.0    17.0    17.0    17.0
>>>>>>> 17.0    17.0    17.0    17.0    17.0    17.0    18.0    18.0    18.0
>>>>>>> 18.0    18.0    18.0    18.0    18.0    18.0    18.0    18.0    19.0
>>>>>>> 19.0    19.0
>>>>>>> 19.0    19.0    19.0    19.0    19.0    19.0    19.0    19.0
>>>>>>> 19.0    19.0    19.0    2.0     2.0     2.0     2.0     2.0     2.0
>>>>>>> 2.0     2.0     2.0     2.0     2.0     2.0     2.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     4.0     4.0     4.0     4.0     4.0
>>>>>>> 4.0     4.0     4.0     4.0     4.0     4.0     4.0     5.0     5.0
>>>>>>> 5.0     5.0
>>>>>>> 5.0     5.0     5.0     5.0     5.0     5.0     5.0     5.0
>>>>>>> 5.0     6.0     6.0     6.0     6.0     6.0     6.0     6.0     6.0
>>>>>>> 6.0     6.0     6.0     6.0     7.0     7.0     7.0     7.0     7.0
>>>>>>> 7.0     7.0
>>>>>>> 7.0     7.0     7.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0     3.0
>>>>>>> 3.0     3.0
>>>>>>> 3.0     3.0     3.0     3.0
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>>
>>>>>>> Thushan Ganegedara
>>>>>>> School of IT
>>>>>>> University of Sydney, Australia
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Thanks & regards,
>>>>>> Nirmal
>>>>>>
>>>>>> Team Lead - WSO2 Machine Learner
>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>> Mobile: +94715779733
>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>
>>>>>>
>>>>>>


-- 
Regards,

Thushan Ganegedara
School of IT
University of Sydney, Australia
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
