Re: GSOC

Marko Novakovic Sat, 29 Mar 2008 10:47:44 -0700

I collaborate with a professor from my faculty
whose PhD thesis was about machine learning in SE-s.
He uses a combination of Naive Bayes and SVM, but I do not
understand his solution well enough yet.
Still, I think that SVM is a very useful and deployable
algorithm for SE-s.
Do you think that I should change anything in my
application?


Greetings

--- Ted Dunning <[EMAIL PROTECTED]> wrote:

> 
> SVM is not the only solution to these problems. For many search engine
> applications, it isn't even likely to be the best. Regularized logistic
> regression is a strong candidate, as are random forests and boosted trees.
>
> Beware of any author who claims that their machine learning algorithm
> is better than all others. The algorithm may well have some
> virtues, but it is unlikely to be universal. It is more likely that the
> author who claims this simply has a limited view of the range of things that
> might need to be done.
> 
> 
> On 3/29/08 10:23 AM, "Marko Novakovic"
> <[EMAIL PROTECTED]> wrote:
> 
> > The implementation of the SVM algorithm on the Hadoop platform
> > 
> > Abstract:
> > 
> > I have been researching search engine
> > functionality, such as ranking and presenting relevant
> > pages to users.
> > I noted that the most usable solution for search
> > engines is the Support Vector Machine (SVM).
> > The best solution for determining the relevance ranking of pages
> > in search results based on user behavior is SVM.
> > A reference for this problem is the article:
> > T. Joachims, F. Radlinski: "Search Engines that
> > Learn from Implicit Feedback," IEEE Computer,
> > August 2007, pp 38
> > Because SVM is a very complex algorithm that involves
> > a lot of operations,
> > I decided to implement the SVM algorithm on the Hadoop
> > platform.
> > 
> > Dear Apache,
> > 
> > My Idea:
> > 
> > My idea is to implement a model and solution for
> > retrieving a relevance ranking of Web pages driven by the
> > user's past behavior.
> > Because SE-s have a lot of crawled Web pages,
> > this operation must be distributed if we want
> > to obtain results in real time and keep the learned
> > database fresh.
> > So we should parallelize all algorithms that are used
> > for processing Web pages.
> > I therefore decided to implement the most used and exploited
> > machine learning algorithm that is applied to
> > Web pages.
> > I also chose the SVM algorithm because it is a very
> > complex algorithm to implement,
> > and I like challenges and am not afraid of hard
> > tasks.
> > I aim to achieve the highest possible performance
> > through parallelization.
> > I will use the work on this project to write a new
> > article about the deployment of clustering in SE-s.
> > I have prepared for this project by reading these articles:
> > [1] C. Burges, "A Tutorial on Support Vector Machines
> > for Pattern Recognition," Kluwer Academic Publishers,
> > Boston
> > [2] R.E. Fan, P.H. Chen, C.J. Lin, "Working Set
> > Selection Using Second Order Information for Training
> > Support Vector Machines," Journal of Machine Learning
> > Research 6 (2005), pp 1889-1918
> > I have also read the Hadoop documentation and examined
> > your implementation of the k-means algorithm on this
> > platform.
> > 
> > Development Methodologies:
> > 
> > - Test Driven Development
> > - Build and testing with Ant and JUnit
> > - IDE: Eclipse
> > - SVN for version control
> > - Javadoc
> > 
> > About Me:
> > 
> > You can see my resume at
> > http://atisha34.googlepages.com/.
> > I also participate in some academic projects at my
> > college:
> > - Working on a topic-based search engine called Grain,
> > which is under construction at my faculty.
> > - A tutorial about SE-s, mentored by Professor Veljko
> > Milutinovic: "The New Avenues in Search Engines"
> > presentation:
> > http://atisha34.googlepages.com/Searchengines.ppt
> > abstract:
> > http://atisha34.googlepages.com/TheNewAvenuesinWebSearch.docx
> > I plan to publish an article based on this presentation
> > in IPSI Magazine.
> > - The other projects in which I participate are not related
> > to machine learning or search engines.
> > 
> > My Interests:
> > - Search Engines
> > - Software Engineering and Test Driven Development
> > - Machine Learning
> > - Database Modeling and OO Design
> > - ERP and Business Processes
> > 
> > Sincerely Yours,
> > Marko Novakovic
> > 
> > --- Karl Wettin <[EMAIL PROTECTED]> wrote:
> > 
> >> Marko Novakovic wrote:
> >> 
> >> Hi Marko,
> >> 
> >>> I am applying for the SVM algorithm on the Hadoop platform.
> >>> I hope that I will be accepted by Google and Apache;
> >>> I am serious about doing this job well.
> >> 
> >> great news! Feel free to post your proposal here
> >> too.
> >> 
> >> 
> >>      karl
> >> 



      