My mistake. You should use a label value that appears in the training set. In the previous example, putting "normal" in every test record should be fine.
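In case it helps, here is a minimal sketch of how the test file could be prepared before running TestForest. It is not part of Mahout: the function name `set_placeholder_label` is hypothetical, and it assumes the label is the last comma-separated field of each record, as in the KDD record quoted below. The placeholder has to be one of the labels seen in training ([normal, anomaly] in your log). The predictions should still be usable, but the reported accuracy and confusion matrix will be meaningless because the labels are not real.

```python
def set_placeholder_label(records, known_labels, placeholder):
    """Return copies of the records with the last field set to `placeholder`.

    Assumes comma-separated records whose last field is the label.
    `placeholder` must be a label that was present in the training set,
    otherwise the classifier's data converter would reject it.
    """
    if placeholder not in known_labels:
        raise ValueError(
            "placeholder %r is not a training label: %r" % (placeholder, known_labels)
        )

    result = []
    for record in records:
        record = record.strip()
        if not record:
            continue  # skip blank lines
        fields = record.split(",")
        fields[-1] = placeholder  # overwrite the unknown or missing label value
        result.append(",".join(fields))
    return result


if __name__ == "__main__":
    # Shortened, made-up records for illustration only.
    sample = ["13,tcp,telnet,SF,1", "0,udp,other,SF,unknown"]
    for line in set_placeholder_label(sample, ["normal", "anomaly"], "normal"):
        print(line)
```

Checking the placeholder against the known labels up front mirrors the failure in the log below, where the label token 1 was rejected because it was not in dataset.labels.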
On Fri, Jan 18, 2013 at 7:26 AM, Ranjitha Chandrashekar <ranjitha...@hcl.com> wrote:

> Hi Deneche
>
> Thank you for your quick response.
>
> I tried using the numerical value in the label attribute in the test data.
>
> Original Record in KDDTest:
> 13,tcp,telnet,SF,118,2425,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,26,10,0.38,0.12,0.04,0.00,0.00,0.00,0.12,0.30,normal
>
> Replaced Record:
> 13,tcp,telnet,SF,118,2425,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,26,10,0.38,0.12,0.04,0.00,0.00,0.00,0.12,0.30,1
>
> (normal class replaced with numerical value 1)
>
> Ran TestForest on KDDTest dataset. Following is the error that I get.
> Sequential and map reduce classification give the same error.
>
> Command --> hadoop jar /usr/lib/mahout-0.5/mahout-examples-0.5-cdh3u5-job.jar
>     org.apache.mahout.df.mapreduce.TestForest
>     -i /user/ranjitha/input/KDDTest+.arff.txt_withnum
>     -ds /user/ranjitha/input/KDDTrain+.info
>     -m /user/ranjitha/KDDForest
>     -o /user/ranjitha/KDDResult
>
> 13/01/18 11:29:24 INFO mapreduce.TestForest: Loading the forest...
> 13/01/18 11:29:24 INFO mapreduce.TestForest: Sequential classification...
> 13/01/18 11:29:24 ERROR data.DataConverter: label token: 1 dataset.labels: [normal, anomaly]
> Exception in thread "main" java.lang.IllegalStateException: Label value (1) not known
>     at org.apache.mahout.df.data.DataConverter.convert(DataConverter.java:71)
>     at org.apache.mahout.df.mapreduce.TestForest.testFile(TestForest.java:256)
>     at org.apache.mahout.df.mapreduce.TestForest.sequential(TestForest.java:216)
>     at org.apache.mahout.df.mapreduce.TestForest.testForest(TestForest.java:172)
>     at org.apache.mahout.df.mapreduce.TestForest.run(TestForest.java:142)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.mahout.df.mapreduce.TestForest.main(TestForest.java:275)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:616)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> Looking forward to your reply
>
> Thanks
> Ranjitha.
>
> -----Original Message-----
> From: deneche abdelhakim [mailto:adene...@gmail.com]
> Sent: 17 January 2013 18:20
> To: user@mahout.apache.org
> Subject: Re: Issue with Partial Implementation Problem
>
> Hi Ranjitha,
>
> just put any numerical value in the label attribute. You should be able to
> classify the data, but you won't be able to compute the confusion matrix or
> the accuracy.
>
> On Thu, Jan 17, 2013 at 12:15 PM, Ranjitha Chandrashekar <ranjitha...@hcl.com> wrote:
>
> > Hi
> >
> > I am using Partial Implementation for Random Forest classification.
> >
> > I have a training dataset with labels class0, class 1, class 2. The
> > decision forest is built on this training dataset. The classification for
> > the test dataset is computed using the same data descriptor generated for
> > the training dataset.
> > I am able to generate confusion matrix, accuracy
> > details with the test data set with class variable.
> >
> > However I also need to make a classification for a scenario, where test
> > data may not have the class variable or class values are not known. For
> > example, assume test data is about future data points, for which class
> > values will have to be computed only in the future.
> >
> > * How is it possible to classify the test data set, where the class
> >   label is not defined or not known? I have tried using default labels
> >   like "unknown", "NO_LABEL". It doesn't seem to work.
> >
> > * How to set the class label as "unknown" in the testing dataset?
> >
> > Looking forward to your reply,
> >
> > Thanks
> > Ranjitha.