Hi,

> This was mainly due to the detection of a numerical feature as a
> categorical one.

Oh, it makes sense now. Why don't we try taking a sample of the data and, if the sample contains only integers (or doubles with no fractional part) or strings, treat the feature as a categorical variable?
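To make the sampling idea above concrete, here is a rough sketch of the check I have in mind. It is not taken from the WSO2 ML code base: the names `CategoricalHeuristic`, `isFractional` and `looksCategorical` are made up for illustration, and it assumes the sampled values arrive as non-null strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of WSO2 ML: guesses whether a feature
// is categorical by inspecting a small sample of its raw values.
class CategoricalHeuristic {

    // True when the value parses as a number that has a fractional part.
    // NaN is also reported as fractional, because NaN != NaN.
    static boolean isFractional(String value) {
        try {
            double d = Double.parseDouble(value.trim());
            return d != Math.rint(d);
        } catch (NumberFormatException e) {
            // Not a number at all, so treat it as a plain string value.
            return false;
        }
    }

    // A sample looks categorical when every value is either a string or a
    // whole number such as "3" or "3.0". An empty sample is not categorical.
    static boolean looksCategorical(List<String> sample) {
        if (sample.isEmpty()) {
            return false;
        }
        for (String value : sample) {
            if (isFractional(value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("1.0", "12.0", "36.0");
        List<String> measurements = Arrays.asList("0.53", "1.0", "2.75");
        List<String> names = Arrays.asList("oak", "maple");
        System.out.println(looksCategorical(labels));
        System.out.println(looksCategorical(measurements));
        System.out.println(looksCategorical(names));
    }
}
```

One caveat with this approach: a genuinely numerical feature that happens to hold only whole numbers (counts, ages and so on) would also be flagged as categorical, so it would probably need to be combined with the existing threshold rather than replace it.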
> We suggested increasing the categorical threshold as a work-around.
> @thushan did it work?

Yes, it worked. After increasing the threshold to 40.

On Fri, Aug 14, 2015 at 2:21 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> This was mainly due to the detection of a numerical feature as a
> categorical one.
>
> We suggested increasing the categorical threshold as a work-around.
> @thushan did it work?
>
> On Tue, Aug 11, 2015 at 5:50 PM, Thushan Ganegedara <thu...@gmail.com>
> wrote:
>
>> This issue occurs, if I turn the response variable to a categorical
>> variable. If I get the variable as a numerical variable, the values are
>> read correctly.
>>
>> So I presume there is a fault in categorical conversion of the variable.
>>
>> On Tue, Aug 11, 2015 at 7:11 PM, Thushan Ganegedara <thu...@gmail.com>
>> wrote:
>>
>>> I still get the same result
>>>
>>> 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>>> 1.0 1.0 1.0 12.0 12.0 12.0 12.0 12.0 12.0
>>> 12.0 12.0 12.0 12.0 13.0 13.0 13.0 13.0 13.0 13.0
>>> 13.0 13.0 13.0 13.0 14.0 14.0 14.0 14.0 14.0
>>> 14.0 14.0 14.0 15.0 15.0 15.0 15.0 15.0 15.0
>>> 15.0 15.0 15.0 15.0 15.0 15.0 16.0 16.0 16.0 16.0
>>> 16.0 16.0 16.0 16.0 17.0 17.0 17.0 17.0 17.0
>>> 17.0 17.0 17.0 17.0 17.0 18.0 18.0 18.0 18.0
>>> 18.0 18.0 18.0 18.0 18.0 18.0 18.0 19.0 19.0 19.0
>>> 19.0 19.0 19.0 19.0 19.0 19.0 19.0 19.0 19.0
>>> 19.0 19.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
>>> 2.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0 4.0
>>> 4.0 4.0 4.0 4.0 4.0 4.0 5.0 5.0 5.0 5.0
>>> 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0
>>> 6.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0
>>> 6.0 6.0 6.0 7.0 7.0 7.0 7.0 7.0 7.0 7.0
>>> 7.0 7.0 7.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>> 3.0 3.0 3.0 3.0
>>>
>>> On Tue, Aug 11, 2015 at 7:05 PM, Nirmal Fernando <nir...@wso2.com>
>>> wrote:
>>>
>>>> Can you use following code and try;
>>>>
>>>> List<LabeledPoint> points = labeledPoints.collect();
>>>> for(int i=0;i<points.size();i++){
>>>>     System.out.print(points.get(i).label() + "\t");
>>>> }
>>>>
>>>> On Tue, Aug 11, 2015 at 2:30 PM, Thushan Ganegedara <thu...@gmail.com>
>>>> wrote:
>>>>
>>>>> I used the following snippet
>>>>>
>>>>> for(int i=0;i<labeledPoints.collect().size();i++){
>>>>>     System.out.print(labeledPoints.collect().get(i).label() + "\t");
>>>>> }
>>>>>
>>>>> in the public MLModel build() throws MLModelBuilderException in
>>>>> DeeplearningModelBuilder.java
>>>>>
>>>>> On Tue, Aug 11, 2015 at 6:17 PM, Nirmal Fernando <nir...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi thushan,
>>>>>>
>>>>>> We need more info. What did you exactly print and where?
>>>>>>
>>>>>> On Tue, Aug 11, 2015 at 12:47 PM, Thushan Ganegedara <thu...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I found the potential cause of the poor accuracy for the leaf
>>>>>>> dataset. It seems the data read into ML is wrong.
>>>>>>>
>>>>>>> I have attached the data file as a CSV (classes are in the last
>>>>>>> column)
>>>>>>>
>>>>>>> However, when I print out the labels of the read data (classes), it
>>>>>>> looks something like below. Clearly there aren't this many "3.0" classes
>>>>>>> and there should be classes up to 36.0.
>>>>>>>
>>>>>>> Is this caused by a bug?
>>>>>>>
>>>>>>> 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>>>>>>> 1.0 1.0 1.0 1.0 12.0 12.0 12.0 12.0 12.0
>>>>>>> 12.0 12.0 12.0 12.0 12.0 13.0 13.0 13.0 13.0
>>>>>>> 13.0 13.0
>>>>>>> 13.0 13.0 13.0 13.0 14.0 14.0 14.0 14.0
>>>>>>> 14.0 14.0 14.0 14.0 15.0 15.0 15.0 15.0 15.0
>>>>>>> 15.0 15.0 15.0 15.0 15.0 15.0 15.0 16.0 16.0
>>>>>>> 16.0 16.0
>>>>>>> 16.0 16.0 16.0 16.0 17.0 17.0 17.0 17.0
>>>>>>> 17.0 17.0 17.0 17.0 17.0 17.0 18.0 18.0 18.0
>>>>>>> 18.0 18.0 18.0 18.0 18.0 18.0 18.0 18.0 19.0
>>>>>>> 19.0 19.0
>>>>>>> 19.0 19.0 19.0 19.0 19.0 19.0 19.0 19.0
>>>>>>> 19.0 19.0 19.0 2.0 2.0 2.0 2.0 2.0 2.0
>>>>>>> 2.0 2.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0
>>>>>>> 4.0 4.0 4.0 4.0 4.0 4.0 4.0 5.0 5.0
>>>>>>> 5.0 5.0
>>>>>>> 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0
>>>>>>> 5.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0
>>>>>>> 6.0 6.0 6.0 6.0 7.0 7.0 7.0 7.0 7.0
>>>>>>> 7.0 7.0
>>>>>>> 7.0 7.0 7.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>> 3.0 3.0
>>>>>>> 3.0 3.0 3.0 3.0
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>>
>>>>>>> Thushan Ganegedara
>>>>>>> School of IT
>>>>>>> University of Sydney, Australia
>>>>>> --
>>>>>>
>>>>>> Thanks & regards,
>>>>>> Nirmal
>>>>>>
>>>>>> Team Lead - WSO2 Machine Learner
>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>> Mobile: +94715779733
>>>>>> Blog: http://nirmalfdo.blogspot.com/

--
Regards,

Thushan Ganegedara
School of IT
University of Sydney, Australia
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev