Hi Malintha,

Here is an excerpt of the first few lines of the sloan-school-of-management dataset (digits.csv):
*88, 92, 2, 99, 16, 66, 94, 37, 70, 0, 0, 24, 42, 65,100,100, 8*
80,100, 18, 98, 60, 66,100, 29, 42, 0, 0, 23, 42, 61, 56, 98, 8
0, 94, 9, 57, 20, 19, 7, 0, 20, 36, 70, 68,100,100, 18, 92, 8
95, 82, 71,100, 27, 77, 77, 73,100, 80, 93, 42, 56, 13, 0, 0, 9
68,100, 6, 88, 47, 75, 87, 82, 85, 56,100, 29, 75, 6, 0, 0, 9
70,100,100, 97, 70, 81, 45, 65, 30, 49, 20, 33, 0, 16, 0, 0, 1
40,100, 0, 81, 15, 58,100, 57, 47, 87, 50, 88, 40, 42, 36, 0, 4
3, 71, 0, 95, 45,100,100, 99, 79, 78, 48, 53, 31, 24, 54, 0, 7

As you can see, this CSV file has no header row (a row of feature names). If you do not specify at dataset creation that the file has no header row, ML automatically treats the first row as the header and derives the feature names from it (your log shows containsHeader=true). If the first row above is taken as the header, it contains duplicate entries: 0 and 100. ML does not allow multiple features with the same name.

At dataset creation, please select "No" for "Column header available", or add a header row manually to the data file before uploading.

Best regards.

On Fri, Oct 2, 2015 at 8:54 AM, Nirmal Fernando <nir...@wso2.com> wrote:

> Hi Malintha,
>
> Thanks for trying ML. @Wije can you please check?
>
> On Fri, Oct 2, 2015 at 1:09 AM, Malintha Adikari <malin...@wso2.com>
> wrote:
>
>> Hi,
>>
>> I am trying to create a dataset from 748KB sized data file [1] and
>> getting following error.
>>
>> [2015-10-02 01:03:38,769] INFO
>> {org.wso2.carbon.ml.core.impl.MLDatasetProcessor} - [Created] MLDataset
>> [id=1, name=digitdd, tenantId=-1234, userName=admin, dataSourceType=file,
>> dataTargetType=file, sourcePath=null, dataType=csv, comments=,
>> version=1.0.0, containsHeader=true, status=null]
>> [2015-10-02 01:03:40,537] WARN
>> {org.wso2.carbon.ml.database.internal.MLDatabaseUtils} - An error occurred
>> while enabling autocommit: PooledConnection has already been closed.
>> java.sql.SQLException: PooledConnection has already been closed.
>>     at org.apache.tomcat.jdbc.pool.DisposableConnectionFacade.invoke(DisposableConnectionFacade.java:86)
>>     at com.sun.proxy.$Proxy16.setAutoCommit(Unknown Source)
>>     at org.wso2.carbon.ml.database.internal.MLDatabaseUtils.enableAutoCommit(MLDatabaseUtils.java:153)
>>     at org.wso2.carbon.ml.database.internal.MLDatabaseService.updateSummaryStatistics(MLDatabaseService.java:2370)
>>     at org.wso2.carbon.ml.core.impl.SummaryStatsGenerator.run(SummaryStatsGenerator.java:130)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>> [2015-10-02 01:03:40,550] ERROR
>> {org.wso2.carbon.ml.core.impl.SummaryStatsGenerator} - Error occurred
>> while calculating summary statistics for dataset version 1: An error
>> occurred while updating the database with summary statistics of the dataset
>> 1: 16
>> org.wso2.carbon.ml.database.exceptions.DatabaseHandlerException: An error
>> occurred while updating the database with summary statistics of the dataset
>> 1: 16
>>     at org.wso2.carbon.ml.database.internal.MLDatabaseService.updateSummaryStatistics(MLDatabaseService.java:2366)
>>     at org.wso2.carbon.ml.core.impl.SummaryStatsGenerator.run(SummaryStatsGenerator.java:130)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 16
>>     at org.wso2.carbon.ml.database.internal.MLDatabaseService.updateSummaryStatistics(MLDatabaseService.java:2329)
>>     ... 4 more
>>
>> What could be the possible reason for this error ?
>>
>> [1]
>> http://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/datasets/digits.csv
>>
>> Regards,
>> Malintha
>>
>> --
>> *Malintha Adikari*
>> Software Engineer
>> WSO2 Inc.; http://wso2.com
>> lean.enterprise.middleware
>>
>> Mobile: +94 71 2312958
>> Blog: http://malinthas.blogspot.com
>> Page: http://about.me/malintha
>>
>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Team Lead - WSO2 Machine Learner
> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>

--
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855
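
P.S. In case it helps with the second option (adding a header row before uploading), here is a minimal Python sketch. It is not part of ML: the function names `first_row_report` and `with_generated_header` and the `V1..Vn` column names are made up for illustration. It checks whether the first row of a CSV looks like data rather than feature names, lists the values that would collide if that row were used as a header, and prepends a header with unique names.

```python
import csv
import io


def first_row_report(csv_text):
    """Return (looks_numeric, duplicates) for the first row of csv_text."""
    first_row = next(csv.reader(io.StringIO(csv_text), skipinitialspace=True))
    cells = [cell.strip() for cell in first_row]

    def is_number(cell):
        try:
            float(cell)
            return True
        except ValueError:
            return False

    # A first row made up only of numbers is probably data, not a header.
    looks_numeric = all(is_number(cell) for cell in cells)

    # Values that repeat would become duplicate feature names.
    seen, duplicates = set(), []
    for cell in cells:
        if cell in seen and cell not in duplicates:
            duplicates.append(cell)
        seen.add(cell)
    return looks_numeric, duplicates


def with_generated_header(csv_text, prefix="V"):
    """Prepend a header row with unique, hypothetical names V1..Vn."""
    first_row = next(csv.reader(io.StringIO(csv_text)))
    header = ",".join(f"{prefix}{i}" for i in range(1, len(first_row) + 1))
    return header + "\n" + csv_text


# First two rows of the excerpt quoted in this thread.
sample = (
    "88, 92, 2, 99, 16, 66, 94, 37, 70, 0, 0, 24, 42, 65,100,100, 8\n"
    "80,100, 18, 98, 60, 66,100, 29, 42, 0, 0, 23, 42, 61, 56, 98, 8\n"
)
numeric, dupes = first_row_report(sample)
print(numeric, dupes)  # True ['0', '100']
print(with_generated_header(sample).splitlines()[0])
```

For the real file you would read digits.csv from disk and write the result of `with_generated_header` back out before uploading; selecting "No" for "Column header available" avoids the need for this entirely.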
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev