Thank you, Yexi. Thanks for spending your valuable time.

On Mon, Dec 2, 2013 at 8:22 PM, Yexi Jiang <yexiji...@gmail.com> wrote:

> Yes, the user is responsible for using the correct model for a given piece
> of testing (or unlabeled) data.
>
>
> 2013/12/2 unmesha sreeveni <unmeshab...@gmail.com>
>
>> To make it more general, it's better to separate them, since there might
>> be multiple batches of test (or to-be-labeled) data, and you only need to
>> train the model once (if your data is stable).
>>
>> OK, I will go for the second one.
>> So if we keep them separate, they will not have any connection with
>> each other, and we should tell which test data belongs to which training data.
>> We then load the corresponding playtennnis_tree.txt for the training data
>> (so the result file should be named in a way that the training data can be
>> identified from its file name) and predict the test data.
>>
>>
>> On Mon, Dec 2, 2013 at 10:29 AM, Yexi Jiang <yexiji...@gmail.com> wrote:
>>
>>> Actually, training and testing (or prediction) do not need to
>>> be done in one shot. If you need to do them consecutively in your
>>> particular scenario, you can do it the way you described.
>>>
>>> To make it more general, it's better to separate them, since there might
>>> be multiple batches of test (or to-be-labeled) data, and you only need to
>>> train the model once (if your data is stable).
>>>
>>>
>>> 2013/12/1 unmesha sreeveni <unmeshab...@gmail.com>
>>>
>>>> 1. I just thought of building a model using a project named, say, DT, and
>>>> when a huge input comes, running another MR job, test.java, within DT.
>>>> If we are not chaining jobs, we need to create separate projects, right:
>>>> DT_build and DT_test?
>>>> Or is there no need for a separate project?
>>>>
>>>> 2. M1_train - dataset for training.
>>>>
>>>> M1_test - test data or prediction.
>>>> 1. Will the input for prediction be a single record, or a set of records
>>>> given at once?
>>>> 2. We also need to ensure in our program that M1_test belongs to M1_train
>>>> only. We should check that too, right? If M1_test is given to
>>>> M2_train, it should show an error, shouldn't it?
>>>>
>>>> Is anything wrong in my inference?
>>>> Are you able to guess what I am trying to accomplish?
>>>> I am confused about whether I need to create only one project that
>>>> includes train and test, or two projects.
>>>>
>>>>
>>>> On Mon, Dec 2, 2013 at 9:54 AM, Yexi Jiang <yexiji...@gmail.com> wrote:
>>>>
>>>>> What is your motivation for chaining jobs?
>>>>>
>>>>>
>>>>> 2013/12/1 unmesha sreeveni <unmeshab...@gmail.com>
>>>>>
>>>>>> Thanks, Yexi. A very nice explanation, thanks a lot.
>>>>>> It is explained in a very simple way that is really understandable for
>>>>>> beginners.
>>>>>> I can go for chaining jobs, right?
>>>>>>
>>>>>>
>>>>>> On Sun, Dec 1, 2013 at 8:55 PM, Yexi Jiang <yexiji...@gmail.com> wrote:
>>>>>>
>>>>>>> In my opinion:
>>>>>>>
>>>>>>> 1. Build the decision tree model with the training data.
>>>>>>> 2. Store it somewhere.
>>>>>>> 3. When the unlabeled data is available:
>>>>>>> 3.1 If the unlabeled data is huge, write another MapReduce job to
>>>>>>> process it: load the model at the setup stage, then use the model to
>>>>>>> label the data one record at a time in the map stage. There is no need
>>>>>>> for a reducer.
>>>>>>> 3.2 If the unlabeled data is small, it is trivial.
>>>>>>>
>>>>>>>
>>>>>>> 2013/12/1 unmesha sreeveni <unmeshab...@gmail.com>
>>>>>>>
>>>>>>>> Thanks, Yexi.
>>>>>>>>
>>>>>>>> But how can it be accomplished?
>>>>>>>> The input to the decision tree MR job will be a set of data. But when
>>>>>>>> predicting, the input will be a single line of data without a class
>>>>>>>> label, right?
>>>>>>>> So what changes will there be in the MR job? Should we design it like this?
>>>>>>>> 1. When a set of data comes in, build the decision tree.
>>>>>>>> 2. Else, if a single line of data comes in, check the output of the
>>>>>>>> decision tree (the decision tree generated by the MR job) and predict
>>>>>>>> the class label.
>>>>>>>>
>>>>>>>> -------
>>>>>>>>
>>>>>>>> M1_train - dataset for training.
>>>>>>>> M1_test - test data or prediction.
>>>>>>>> 1. Will the input for prediction be a single record, or a set of records
>>>>>>>> given at once?
>>>>>>>> 2. We also need to ensure in our program that M1_test belongs to M1_train
>>>>>>>> only. We should check that too, right? If M1_test is given to
>>>>>>>> M2_train, it should show an error, shouldn't it?
>>>>>>>>
>>>>>>>> Please tell me if my thoughts are wrong.
>>>>>>>>
>>>>>>>> On 11/30/13, Yexi Jiang <yexiji...@gmail.com> wrote:
>>>>>>>> > I watched the video in it, but I cannot access its source code due to
>>>>>>>> > a permission issue.
>>>>>>>> > In my opinion, once the decision tree model is built, the model is small
>>>>>>>> > enough to be loaded into memory and can be used directly without another
>>>>>>>> > MapReduce job for prediction. The prediction can be conducted in a
>>>>>>>> > streaming way.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > 2013/11/30 unmesha sreeveni <unmeshab...@gmail.com>
>>>>>>>> >
>>>>>>>> >> I have gone through a MapReduce implementation of C4.5 in
>>>>>>>> >> http://btechfreakz.blogspot.in/2013/04/implementation-of-c45-algorithm-using.html
>>>>>>>> >>
>>>>>>>> >> Here a decision tree is built. So my doubt is:
>>>>>>>> >> can we also include the prediction along with that?
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> On Tue, Nov 26, 2013 at 8:52 AM, Yexi Jiang <yexiji...@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >>> You are welcome :)
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>> 2013/11/25 unmesha sreeveni <unmeshab...@gmail.com>
>>>>>>>> >>>
>>>>>>>> >>>> OK. Thanks, Yexi.
>>>>>>>> >>>>
>>>>>>>> >>>>
>>>>>>>> >>>> On Tue, Nov 26, 2013 at 1:41 AM, Yexi Jiang <yexiji...@gmail.com> wrote:
>>>>>>>> >>>>
>>>>>>>> >>>>> As far as I know, there is no ID3 implementation in Mahout currently,
>>>>>>>> >>>>> but you can use the decision forest instead.
>>>>>>>> >>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Breiman+Example.
>>>>>>>> >>>>>
>>>>>>>> >>>>>
>>>>>>>> >>>>> 2013/11/25 unmesha sreeveni <unmeshab...@gmail.com>
>>>>>>>> >>>>>
>>>>>>>> >>>>>> Is that ID3 classification?
>>>>>>>> >>>>>> Does it include prediction as well?
>>>>>>>> >>>>>>
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> On Sat, Nov 23, 2013 at 9:01 PM, Yexi Jiang <yexiji...@gmail.com> wrote:
>>>>>>>> >>>>>>
>>>>>>>> >>>>>>> You can find it directly at https://github.com/apache/mahout, or you
>>>>>>>> >>>>>>> can check it out from svn by following
>>>>>>>> >>>>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Version+Control.
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> 2013/11/23 unmesha sreeveni <unmeshab...@gmail.com>
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>>> I want to go through the decision tree implementation in Mahout.
>>>>>>>> >>>>>>>> I referred to Apache Mahout <http://mahout.apache.org/>:
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> 6 Feb 2012 - Apache Mahout 0.6 released
>>>>>>>> >>>>>>>> Apache Mahout has reached version 0.6. All developers are encouraged
>>>>>>>> >>>>>>>> to begin using version 0.6. Highlights include:
>>>>>>>> >>>>>>>> Improved Decision Tree performance and added support for regression
>>>>>>>> >>>>>>>> problems
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> Where can I find its source code and documentation?
>>>>>>>> >>>>>>>> Should I download Mahout?
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> --
>>>>>>>> >>>>>>>> *Thanks & Regards*
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> Unmesha Sreeveni U.B
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> *Junior Developer*
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> --
>>>>>>>> >>>>>>> ------
>>>>>>>> >>>>>>> Yexi Jiang,
>>>>>>>> >>>>>>> ECS 251, yjian...@cs.fiu.edu
>>>>>>>> >>>>>>> School of Computer and Information Science,
>>>>>>>> >>>>>>> Florida International University
>>>>>>>> >>>>>>> Homepage: http://users.cis.fiu.edu/~yjian004/

--
*Thanks & Regards*

Unmesha Sreeveni U.B

*Junior Developer*
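
For readers following the quoted thread, the suggested workflow (build the model once, store it, load it at the setup stage, then label records one at a time in the map stage with no reducer) could be sketched roughly as below. This is a minimal, plain-Java illustration rather than Hadoop or Mahout code. The class TreeLabeler, its methods, the "_tree.txt" naming helper, and the model format (one root-to-leaf path per line, written as attr=value,attr=value:label) are all assumptions made for this example; the thread does not say how playtennnis_tree.txt is actually laid out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and file format are assumptions, not Mahout/Hadoop API.
class TreeLabeler {

    // One root-to-leaf path of the stored tree: attribute tests plus a class label.
    static final class Rule {
        final Map<String, String> conditions = new LinkedHashMap<>();
        final String label;
        Rule(String label) { this.label = label; }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Assumed naming convention tying a model file to its training set,
    // e.g. "M1_train" -> "M1_train_tree.txt".
    static String modelFileFor(String trainingSetName) {
        return trainingSetName + "_tree.txt";
    }

    // Parses "attr=value,attr=value" into a map; rejects malformed fields.
    private static Map<String, String> parsePairs(String text) {
        Map<String, String> pairs = new LinkedHashMap<>();
        if (text.trim().isEmpty()) return pairs;
        for (String field : text.split(",")) {
            String[] kv = field.split("=", 2);
            if (kv.length != 2) throw new IllegalArgumentException("bad field: " + field);
            pairs.put(kv[0].trim(), kv[1].trim());
        }
        return pairs;
    }

    // "Setup" stage: parse the stored model once, before any record is seen.
    TreeLabeler(List<String> modelLines) {
        for (String line : modelLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue;
            int sep = trimmed.lastIndexOf(':');
            if (sep < 0) throw new IllegalArgumentException("bad model line: " + line);
            Rule rule = new Rule(trimmed.substring(sep + 1).trim());
            rule.conditions.putAll(parsePairs(trimmed.substring(0, sep)));
            rules.add(rule);
        }
    }

    // "Map" stage: label a single unlabeled record. Records are independent,
    // which is why no reducer is required.
    String label(String record) {
        Map<String, String> attrs = parsePairs(record);
        for (Rule rule : rules) {
            // A rule matches when every one of its tests holds for the record.
            if (attrs.entrySet().containsAll(rule.conditions.entrySet())) {
                return rule.label;
            }
        }
        return "unknown"; // no stored path matches this record
    }
}
```

In an actual Hadoop job, the constructor logic would presumably run inside the mapper's setup() method and label() inside map(), with the number of reduce tasks set to zero so the job is map-only. Deriving the model file name from the training set name is one simple way to address the question of which test data belongs to which training data, though the user still has to pick the right model, as noted in the thread.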