The RHadoop package allows you to do statistical analysis. We were able to
build a word cloud from text files using the rmr and rhdfs packages.
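
For readers who want to see what sits behind a word cloud, here is a minimal
sketch of the word-frequency step in map/reduce style. It is plain Python run
locally, not the rmr or rhdfs API; the function names and the small stopword
list are assumptions made up for illustration.

```python
import re
from collections import defaultdict
from itertools import chain

# Hypothetical, deliberately tiny stopword list; a real job would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def map_words(line):
    """Map step: emit (word, 1) for every non-stopword token in one line."""
    for word in re.findall(r"[a-z']+", line.lower()):
        if word not in STOPWORDS:
            yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_frequencies(lines):
    """Run the map step over all lines, then reduce everything in one pass."""
    return reduce_counts(chain.from_iterable(map_words(line) for line in lines))


if __name__ == "__main__":
    sample = [
        "Hadoop stores the data in HDFS",
        "R and Hadoop analyse the data",
    ]
    freqs = word_frequencies(sample)
    # Most frequent words first; these counts would set the word sizes in the cloud.
    for word, n in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
        print(word, n)
```

On a cluster the map and reduce functions would run as separate tasks over
files in HDFS; the resulting counts are what a word cloud plot takes as input.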

Installation details for these packages are available at the following link.

https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr

Devi



________________________________
From: Charles Earl <charles.ce...@gmail.com>
To: common-user@hadoop.apache.org
Sent: Wed, April 25, 2012 12:20:36 PM
Subject: Re: Text Analysis

If you've got existing R code, you might want to look at this Quora posting,
also by Cloudera:
http://www.quora.com/How-can-R-and-Hadoop-be-used-together
or the RHIPE R Hadoop package:
https://github.com/saptarshiguha/RHIPE/wiki
Mahout and Lucene/Solr offer some level of text analysis, although I would not 
call these complete text analysis packages.
What I've found are specific algorithms as opposed to a complete package: for
example, LDA for topic discovery -- Mahout and Yahoo Research
(https://github.com/shravanmn/Yahoo_LDA) have Hadoop-based implementations --
in the case of Yahoo_LDA the data is stored in HDFS, while the computation is
essentially MPI based. Whether an algorithm reads its data from HDFS but uses
an approach other than MapReduce for the computation is another question.
C

On Apr 25, 2012, at 12:47 PM, Jagat wrote:

> There are APIs which you can use; of course, they are third party.
> 
> -----------
> Sent from Mobile , short and crisp.
> On 25-Apr-2012 8:57 PM, "Robert Evans" <ev...@yahoo-inc.com> wrote:
> 
>> Hadoop itself is the core Map/Reduce and HDFS functionality.  The higher
>> level algorithms like sentiment analysis are often done by others.
>> Cloudera has a video from HadoopWorld 2010 about it
>> 
>> 
>> http://www.cloudera.com/resource/hw10_video_sentiment_analysis_powered_by_hadoop/
>> 
>> And there are likely to be other tools like R that can help you out with
>> it.  I am not really sure if Mahout offers sentiment analysis or not, but
>> you might want to look there too http://mahout.apache.org/
>> 
>> --Bobby Evans
>> 
>> 
>> On 4/25/12 7:50 AM, "karanveer.si...@barclays.com" <
>> karanveer.si...@barclays.com> wrote:
>> 
>> Hi,
>> 
>> I wanted to know if there are any existing APIs within Hadoop for us to
>> do some text analysis like sentiment analysis, etc., or are we to rely on
>> tools like R, etc. for this?
>> 
>> 
>> Regards,
>> Karanveer
