I collaborate with a professor from my faculty whose PhD thesis was about machine learning in search engines (SE-s). He uses a combination of Naive Bayes and SVM. I do not understand his solution well enough yet, but I think that SVM is a very useful and deployable algorithm for SE-s. Do you think that I should change anything in my application?
Greetings

--- Ted Dunning <[EMAIL PROTECTED]> wrote:

> SVM is not the only solution to these problems. For many search engine
> applications, it isn't even likely to be the best. Regularized logistic
> regression is a strong candidate, as are random forests and boosted trees.
>
> Beware of any author who claims that their machine learning algorithm is
> better than all others. The algorithm may well have some virtues, but it
> is unlikely to be universal. It is more likely that the author who claims
> this simply has a limited view of the range of things that might need to
> be done.
>
>
> On 3/29/08 10:23 AM, "Marko Novakovic" <[EMAIL PROTECTED]> wrote:
>
> > The implementation of the SVM algorithm on the Hadoop platform
> >
> > Abstract:
> >
> > I have been researching search engine functionality, such as ranking,
> > presenting relevant pages to users, etc.
> > I noted that the most usable solution for search engines is the
> > Support Vector Machine.
> > The best solution for determining relevant page ranking for
> > user-based search results is SVM.
> > A reference for this problem is the article:
> > T. Joachims, F. Radlinski: "Search Engines that Learn from Implicit
> > Feedback," IEEE Computer, August 2007, pp 38
> > Because SVM is a very complex algorithm with a lot of operations,
> > I decided to implement the SVM algorithm on the Hadoop platform.
> >
> > Dear Apache,
> >
> > My Idea:
> >
> > I have an idea to implement a model and solution for retrieving
> > relevance-ranked Web pages driven by the user's past behavior.
> > Because SE-s have a lot of crawled Web pages, this operation must be
> > distributed if we want to obtain results in real time and keep a
> > freshly learned database.
> > So we should parallelize all algorithms that are used for processing
> > Web pages.
> > So I decided to implement the most used and exploited algorithm in
> > machine learning that is deployed for processing Web pages.
> > I also chose the SVM algorithm because it is a very complex algorithm
> > to implement, and I like challenges and am not afraid of hard tasks.
> > I intend to achieve the highest possible performance through
> > parallelization.
> > I will use my work on this project to write a new article about the
> > deployment of clustering in SE-s.
> > I have prepared for this project by reading these articles:
> > [1] C. Burges, "A Tutorial on Support Vector Machines for Pattern
> > Recognition," Kluwer Academic Publishers, Boston
> > [2] R.E. Fan, P.H. Chen, C.J. Lin, "Working Set Selection Using Second
> > Order Information for Training Support Vector Machines," Journal of
> > Machine Learning Research 6 (2005), pp 1889-1918
> > I have also read the Hadoop documentation and examined your
> > implementation of the kMeans algorithm on this platform.
> >
> > Methodologies of Development:
> >
> > - Test Driven Development
> > - Deployment: Ant and JUnit
> > - SDK: Eclipse
> > - SVN System for Versioning
> > - Javadoc
> >
> > About Me:
> >
> > You can see my resume at http://atisha34.googlepages.com/.
> > I also participate in some academic projects at my college:
> > - Working on a topic-based search engine, called Grain, which is
> > under construction at my faculty.
> > - Tutorial about SE-s, mentored by professor Veljko Milutinovic:
> > "The New Avenues in Search Engines"
> > presentation: http://atisha34.googlepages.com/Searchengines.ppt
> > abstract: http://atisha34.googlepages.com/TheNewAvenuesinWebSearch.docx
> > I should publish an article based on this presentation in IPSI
> > Magazine.
> > - Other projects in which I participate aren't related to machine
> > learning and search engines.
> >
> > My Interests:
> > - Search Engines
> > - Software Engineering and Test Driven Development
> > - Machine Learning
> > - Database Modeling and OO Design
> > - ERP and Business Processes
> >
> > Sincerely Yours,
> > Marko Novakovic
> >
> > --- Karl Wettin <[EMAIL PROTECTED]> wrote:
> >
> >> Marko Novakovic wrote:
> >>
> >> Hi Marko,
> >>
> >>> I am applying for the SVM algorithm on the Hadoop platform.
> >>> I hope that I will be accepted by Google and Apache.
> >>> I seriously intend to do this job well.
> >>
> >> great news! Feel free to post your proposal here too.
> >>
> >>
> >> karl