Hi Ilias,

Thanks for your interest in SAMOA.
In theory SAMOA should be able to parse the same formats as MOA.
Could you provide a link to your input file?

I am not sure I understand your second question. To use the VHT classifier
you need to provide labels, as it is a supervised learning method.

Cheers,

-- Gianmarco

On 13 May 2015 at 16:03, Bertsimas Ilias <[email protected]> wrote:

> Hi all!
>
> I am in the process of running some tests for online machine learning in
> data streams from social media. I came across apache-SAMOA and it seemed like
> a very interesting framework.
> However, I could not figure out how to get it to test and train
> using a sparse array of tf-idf feature vectors. I provide the data in the
> standard WEKA arff format and although it ran, the output is something
> along the lines of:
>
> > 2015-05-12 22:58:58,993 [main] INFO
> > com.yahoo.labs.samoa.evaluation.EvaluatorProcessor
> > (EvaluatorProcessor.java:189) -
> > com.yahoo.labs.samoa.evaluation.EvaluatorProcessorid = 0
> > evaluation instances,classified instances,classifications correct
> > (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent)
> > 100.0,100.0,100.0,100.0,?
> > 200.0,200.0,100.0,100.0,?
> > 300.0,300.0,100.0,100.0,?
> > 400.0,400.0,100.0,100.0,?
> > 500.0,500.0,100.0,100.0,?
> > 600.0,600.0,100.0,100.0,?
> > 700.0,700.0,100.0,100.0,?
> > 800.0,800.0,100.0,100.0,?
> > 900.0,900.0,100.0,100.0,?
> > 1000.0,1000.0,100.0,100.0,?
> > 1100.0,1100.0,100.0,100.0,?
> > 1200.0,1200.0,100.0,100.0,?
> > 1300.0,1300.0,100.0,100.0,?
> > 1400.0,1400.0,100.0,100.0,?
> > 1500.0,1500.0,100.0,100.0,?
> > 1600.0,1600.0,100.0,100.0,?
> > 1700.0,1700.0,100.0,100.0,?
> > 1800.0,1800.0,100.0,100.0,?
> > 1900.0,1900.0,100.0,100.0,?
> >
>
> I have read the documentation on the SAMOA project page but I wasn't able
> to figure out how to get classification results per instance.
> Could you please point me in the right direction in terms of acceptable
> formats SAMOA can use as stream input? Is there a need for a labeled
> training set to be included in the data?
>
> Any examples you could provide me with that are not already in the
> documentation would be most welcome!
>
>
> Kind Regards,
>
> Ilias Bertsimas.
>
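
For reference, here is a minimal sketch of how labeled tf-idf vectors could be
written out in WEKA's sparse ARFF layout, with the class label included on
every instance as a supervised learner requires. The function name
`to_sparse_arff`, the attribute names `f0`, `f1`, ... and the labels `neg` and
`pos` are made up for illustration and are not part of SAMOA or MOA. Whether
SAMOA's reader accepts this exact file has not been checked here; it only
follows the sparse ARFF convention of 0-based `{index value, ...}` pairs.

```python
def to_sparse_arff(relation, n_features, classes, rows):
    """Build a sparse ARFF document as a string.

    rows is an iterable of (weights, label) pairs, where weights maps a
    0-based feature index to its tf-idf weight and label is one of classes.
    All names here are hypothetical; only the file layout follows WEKA.
    """
    lines = ["@relation " + relation, ""]
    for i in range(n_features):
        lines.append("@attribute f%d numeric" % i)
    # The nominal class attribute comes last, so its index is n_features.
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    class_index = n_features
    for weights, label in rows:
        if label not in classes:
            raise ValueError("unknown label: %r" % (label,))
        # Sparse rows list only the non-zero cells, in index order.
        cells = ["%d %s" % (i, w) for i, w in sorted(weights.items()) if w != 0]
        # Write the label explicitly on every row so no instance is unlabeled.
        cells.append("%d %s" % (class_index, label))
        lines.append("{" + ", ".join(cells) + "}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [({0: 0.5, 2: 1.25}, "pos"), ({1: 0.75}, "neg")]
    print(to_sparse_arff("tweets", 3, ["neg", "pos"], example))
```

The label is written on every row rather than relying on the sparse default,
since omitted cells in a sparse ARFF file are read as the zero value.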
