Re: UIMA

2014-01-15 Thread Jens Grivolla

Hello Burcu,

UIMA has an entirely different purpose actually, and doesn't do 
classification or clustering.  You would rather use UIMA to enrich 
documents (individually) through text analysis and then use the result 
to create better feature vectors to use with Solr, Mahout, etc.


We typically use UIMA to do named entity recognition, sentiment 
analysis, chunking, etc. and then index the result in Solr. From there 
you can either use it for retrieval (i.e. use the enriched 
representation to get a better document similarity measure) or extract 
the vectors to use with Mahout/Weka/Cluto/...
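As a toy sketch of that enrich-then-vectorize flow (plain Python with invented field names, purely for illustration — a real UIMA pipeline produces a CAS with typed annotations, not a dict):

```python
# Toy sketch: "enrich" a document with annotations, then flatten the
# annotations into a sparse feature vector. The annotation fields and
# the naive NER/sentiment logic are made up for illustration only.
from collections import Counter

def enrich(text):
    """Stand-in for a UIMA pipeline: pretend we ran NER and sentiment."""
    return {
        "tokens": text.lower().split(),
        "entities": [w for w in text.split() if w[:1].isupper()],  # naive "NER"
        "sentiment": "positive" if "good" in text.lower() else "neutral",
    }

def to_features(annotations):
    """Flatten annotations into a sparse feature vector (a dict)."""
    features = Counter(annotations["tokens"])
    for e in annotations["entities"]:
        features["ENT=" + e] += 1          # entity features alongside tokens
    features["SENT=" + annotations["sentiment"]] += 1
    return dict(features)

doc = "Apache Mahout is good for clustering"
vec = to_features(enrich(doc))
```

The resulting vector is what you would hand to Solr for indexing or to Mahout/Weka for clustering and classification.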


HTH,
Jens

On 14/01/14 16:25, Burcu B wrote:

Hi,

I'd like to know why someone would prefer UIMA when developing an
application for end users to classify and cluster general-purpose
documents.

I have two options:
1- integrating Mahout, SOLR, R, Hadoop and other file sources such as
  document management systems or file systems.
2- or doing all of this using UIMA.

Intuitively, I think that UIMA should be preferred, but I could not justify
my feeling. I need a list of pros and cons.

If you could suggest some resources, that would be great.

Thank you.






Simple Recommendation System with Mahout and Netbeans

2014-01-15 Thread Chameera Wijebandara
Hi,

I have just published a simple blog post which describes a simple
recommendation system with Mahout.

Can you please tell me if there is anything to improve in this blog post:

http://chameerawijebandara.wordpress.com/2014/01/15/simple-recommendation-system-with-mahout-and-netbeans/

-- 
Thanks,
Chameera


Re: Mahout 0.9 Release Candidate - VOTE

2014-01-15 Thread Chameera Wijebandara
Hi Tharindu,

I am still not able to download the artifacts. Could you please help me
to test the release?

Thanks
Chameera


On Wed, Jan 15, 2014 at 12:21 PM, Suneel Marthi suneel_mar...@yahoo.com wrote:

 Thanks Tharindu.





 On Tuesday, January 14, 2014 11:30 PM, Tharindu Rusira 
 tharindurus...@gmail.com wrote:

 Hi Suneel,
 I tested the installation process with unit tests and everything went well.
 (Ubuntu 12.10 32bit, Java 1.7.0_40).
 Please note that I did not clean my local maven repository before the
 installation, so I assumed maven dependencies are all available.


 On Tue, Jan 14, 2014 at 7:03 PM, Suneel Marthi suneel_mar...@yahoo.com
 wrote:

  Here's the link to Release artifacts for Mahout 0.9:
  https://repository.apache.org/content/repositories/orgapachemahout-1000/
 
  For those volunteering to test this, some of the stuff to look out for:
 a)  Verify you can unpack the Release tar.
 
 Verified


 b)  Verify you are able to compile the distribution
 
 Verified

 [INFO] ------------------------------------------------------------------------
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Mahout Build Tools ................................ SUCCESS [4.380s]
 [INFO] Apache Mahout ..................................... SUCCESS [0.965s]
 [INFO] Mahout Math ....................................... SUCCESS [2:07.687s]
 [INFO] Mahout Core ....................................... SUCCESS [10:34.651s]
 [INFO] Mahout Integration ................................ SUCCESS [1:03.250s]
 [INFO] Mahout Examples ................................... SUCCESS [16.607s]
 [INFO] Mahout Release Package ............................ SUCCESS [0.469s]
 [INFO] Mahout Math/Scala wrappers ........................ SUCCESS [35.562s]
 [INFO] ------------------------------------------------------------------------
 [INFO] BUILD SUCCESS
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 14:44.158s
 [INFO] Finished at: Wed Jan 15 09:06:26 IST 2014
 [INFO] Final Memory: 41M/252M
 [INFO] ------------------------------------------------------------------------

 c) Run through the unit tests: mvn clean test
 
 Verified.

 
  d)  Run the example scripts under $MAHOUT_HOME/examples/bin.
 

 I'm yet to test the example scripts and I will give an update soon.

 Regards,



 
  See http://incubator.apache.org/guides/releasemanagement.html#check-list
  for more details.
 
 
 
  On Tuesday, January 14, 2014 8:26 AM, spa...@gmail.com 
 spa...@gmail.com
  wrote:
 
  I want to volunteer to test this release. What is the procedure/steps to
  get started and what pre-reqs I need to have?
 
  Cheers
  .S
 
 
 
  On Tue, Jan 14, 2014 at 6:52 PM, Suneel Marthi suneel_mar...@yahoo.com
  wrote:
 
   Calling for volunteers to test this Release.
  
  
  
  
   On Friday, January 10, 2014 7:39 PM, Suneel Marthi 
   suneel_mar...@yahoo.com wrote:
  
   Pushed the Mahout 0.9 Release candidate.
   See
  
 https://repository.apache.org/content/repositories/orgapachemahout-1000/
  
   This is a call for Vote.
  
 
 
 
  --
  http://spawgi.wordpress.com
  We can do it and do it better.
 



 --
 M.P. Tharindu Rusira Kumara

 Department of Computer Science and Engineering,
 University of Moratuwa,
 Sri Lanka.
 +94757033733
 www.tharindu-rusira.blogspot.com




-- 
Thanks,
Chameera


Re: UIMA

2014-01-15 Thread Burcu B
Hi,

Thank you, Jens. I was planning to use OpenNLP for named entity
recognition directly for the analysis you've mentioned, and Lucene for
tokenization. However, UIMA has an OpenNLP component, too. What is the reason
to use UIMA instead of using OpenNLP and SOLR together?

I am planning to use Mahout and R together in the application, but later
other libraries or algorithms could be added to it. Also, the program
should be extensible via plugins, like Atlassian's JIRA. Does UIMA's
component architecture make this easier compared to other options?

Where does UIMA fit in a system that reads documents from different
sources; removes stop words; identifies named entities; indexes them; and
then classifies and clusters the text and indexes topics/labels? I am
confused about whether UIMA should be used or not.

Regards,





On Wed, Jan 15, 2014 at 1:15 PM, Jens Grivolla j+...@grivolla.net wrote:

 Hello Burcu,

 UIMA has an entirely different purpose actually, and doesn't do
 classification or clustering.  You would rather use UIMA to enrich
 documents (individually) through text analysis and then use the result to
 create better feature vectors to use with Solr, Mahout, etc.

 We typically use UIMA to do named entity recognition, sentiment analysis,
 chunking, etc. and then index the result in Solr. From there you can either
 use it for retrieval (i.e. use the enriched representation to get a better
 document similarity measure) or extract the vectors to use with
 Mahout/Weka/Cluto/...

 HTH,
 Jens


 On 14/01/14 16:25, Burcu B wrote:

 Hi,

 I'd like to know why someone would prefer UIMA when developing an
 application for end users to classify and cluster general-purpose
 documents.

 I have two options:
 1- integrating Mahout, SOLR, R, Hadoop and other file sources such as
   document management systems or file systems.
 2- or doing all of this using UIMA.

 Intuitively, I think that UIMA should be preferred, but I could not justify
 my feeling. I need a list of pros and cons.

 If you could suggest some resources, that would be great.

 Thank you.






Re: Simple Recommendation System with Mahout and Netbeans

2014-01-15 Thread Manuel Blechschmidt
Hi Chameera,

On 15.01.2014, at 13:59, Chameera Wijebandara wrote:

 Hi,
 
 I have just published a simple blog post which describes a simple
 recommendation system with Mahout.

 Can you please tell me if there is anything to improve in this blog post:

 http://chameerawijebandara.wordpress.com/2014/01/15/simple-recommendation-system-with-mahout-and-netbeans/
 

You should use the released maven artifacts instead of a custom build of Mahout.

It is undefined whether a current trunk build will succeed.

An example of an eclipse workspace based on maven and with mahout can be found 
here:
https://raw2.github.com/ManuelB/facebook-recommender-demo/master/docs/EclipseWorkspace.png

The only dependency that you need is the following:

<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-core</artifactId>
  <version>0.8</version>
</dependency>

All other transitive dependencies will be inherited.

/Manuel

 -- 
 Thanks,
Chameera

-- 
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B



Re: UIMA

2014-01-15 Thread Jens Grivolla

Hi,

The advantage of using UIMA over plain OpenNLP is that it allows you
to more easily combine components from different sources, e.g. a
tokenizer and POS tagger from OpenNLP, a parser from Stanford, etc.


You then have components for input that deal with the different sources, 
and for output to e.g. index the results in Solr. Within the processing 
pipeline you have one consistent data representation so you don't have 
to worry about writing glue code to do format conversions.
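The shared-representation idea can be sketched in a few lines (illustrative Python with invented component names — real UIMA components operate on a CAS with typed annotations):

```python
# Minimal sketch of "one consistent data representation": every
# component reads from and writes to the same analysis object, so no
# per-pair format-conversion glue code is needed.
def tokenizer(cas):
    cas["tokens"] = cas["text"].split()
    return cas

def pos_tagger(cas):
    # Fake tagger: tags everything as NN, just to show the data flow.
    cas["pos"] = [(t, "NN") for t in cas["tokens"]]
    return cas

def run_pipeline(text, components):
    cas = {"text": text}          # the shared representation
    for component in components:
        cas = component(cas)      # each stage enriches the same object
    return cas

result = run_pipeline("UIMA glues components together", [tokenizer, pos_tagger])
```

Swapping in a component from another toolkit only requires it to read and write the shared representation, which is the point Jens makes above.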


Take a look at DKpro (https://code.google.com/p/dkpro-core-asl/) which 
has UIMA wrappers for many different components.


HTH,
Jens

On 15/01/14 14:52, Burcu B wrote:

Hi,

Thank you, Jens. I was planning to use OpenNLP for named entity
recognition directly for the analysis you've mentioned, and Lucene for
tokenization. However, UIMA has an OpenNLP component, too. What is the reason
to use UIMA instead of using OpenNLP and SOLR together?

I am planning to use Mahout and R together in the application, but later
other libraries or algorithms could be added to it. Also, the program
should be extensible via plugins, like Atlassian's JIRA. Does UIMA's
component architecture make this easier compared to other options?

Where does UIMA fit in a system that reads documents from different
sources; removes stop words; identifies named entities; indexes them; and
then classifies and clusters the text and indexes topics/labels? I am
confused about whether UIMA should be used or not.

Regards,





On Wed, Jan 15, 2014 at 1:15 PM, Jens Grivolla j+...@grivolla.net wrote:


Hello Burcu,

UIMA has an entirely different purpose actually, and doesn't do
classification or clustering.  You would rather use UIMA to enrich
documents (individually) through text analysis and then use the result to
create better feature vectors to use with Solr, Mahout, etc.

We typically use UIMA to do named entity recognition, sentiment analysis,
chunking, etc. and then index the result in Solr. From there you can either
use it for retrieval (i.e. use the enriched representation to get a better
document similarity measure) or extract the vectors to use with
Mahout/Weka/Cluto/...

HTH,
Jens


On 14/01/14 16:25, Burcu B wrote:


Hi,

I'd like to know why someone would prefer UIMA when developing an
application for end users to classify and cluster general-purpose
documents.

I have two options:
1- integrating Mahout, SOLR, R, Hadoop and other file sources such as
   document management systems or file systems.
2- or doing all of this using UIMA.

Intuitively, I think that UIMA should be preferred, but I could not justify
my feeling. I need a list of pros and cons.

If you could suggest some resources, that would be great.

Thank you.












Re: Simple Recommendation System with Mahout and Netbeans

2014-01-15 Thread Chameera Wijebandara
Hi Manuel,

Sorry, I didn't get your comment. Could you please explain further?

Thanks,
Chameera


On Wed, Jan 15, 2014 at 7:24 PM, Manuel Blechschmidt 
manuel.blechschm...@gmx.de wrote:

 Hi Chameera,

 On 15.01.2014, at 13:59, Chameera Wijebandara wrote:

  Hi,
 
  I have just published a simple blog post which describes a simple
  recommendation system with Mahout.
 
  Can you please tell me if there is anything to improve in this blog post:
 
  http://chameerawijebandara.wordpress.com/2014/01/15/simple-recommendation-system-with-mahout-and-netbeans/
 

 You should use the released maven artifacts instead of a custom build of
 Mahout.

 It is undefined whether a current trunk build will succeed.

 An example of an eclipse workspace based on maven and with mahout can be
 found here:

 https://raw2.github.com/ManuelB/facebook-recommender-demo/master/docs/EclipseWorkspace.png

 The only dependency that you need is the following:

 <dependency>
   <groupId>org.apache.mahout</groupId>
   <artifactId>mahout-core</artifactId>
   <version>0.8</version>
 </dependency>

 All other transitive dependencies will be inherited.

 /Manuel

  --
  Thanks,
 Chameera

 --
 Manuel Blechschmidt
 M.Sc. IT Systems Engineering
 Dortustr. 57
 14467 Potsdam
 Mobil: 0173/6322621
 Twitter: http://twitter.com/Manuel_B




-- 
Thanks,
Chameera


Re: Simple Recommendation System with Mahout and Netbeans

2014-01-15 Thread Manuel Blechschmidt
Hi Chameera,
Netbeans has excellent maven support http://wiki.netbeans.org/Maven

Mahout uses maven to build their artifacts a.k.a. jars. They get deployed to 
the central repositories. There is no need to create custom builds.

On the following screenshot you already see the wizard for a maven project:

http://chameerawijebandara.files.wordpress.com/2014/01/screenshot-from-2014-01-15-155415.png?w=600&h=400

Create a new maven project. Afterwards add a dependency:

http://mrhaki.blogspot.de/2009/07/add-maven-dependency-in-netbeans.html

group id: org.apache.mahout
artifact id: mahout-core
version: 0.8

There is no need to create custom builds. The maven artifacts will contain 
everything (classes, sources, tests, javadocs)

/Manuel

On 15.01.2014, at 15:08, Chameera Wijebandara wrote:

 Hi Manuel,
 
 Sorry, I didn't get your comment. Could you please explain further?
 
 Thanks,
Chameera
 
 
 On Wed, Jan 15, 2014 at 7:24 PM, Manuel Blechschmidt 
 manuel.blechschm...@gmx.de wrote:
 
 Hi Chameera,
 
 On 15.01.2014, at 13:59, Chameera Wijebandara wrote:
 
 Hi,
 
  I have just published a simple blog post which describes a simple
  recommendation system with Mahout.
  
  Can you please tell me if there is anything to improve in this blog post:
  
  http://chameerawijebandara.wordpress.com/2014/01/15/simple-recommendation-system-with-mahout-and-netbeans/
 
 
  You should use the released maven artifacts instead of a custom build of
  Mahout.
  
  It is undefined whether a current trunk build will succeed.
 
 An example of an eclipse workspace based on maven and with mahout can be
 found here:
 
 https://raw2.github.com/ManuelB/facebook-recommender-demo/master/docs/EclipseWorkspace.png
 
 The only dependency that you need is the following:
 
 <dependency>
   <groupId>org.apache.mahout</groupId>
   <artifactId>mahout-core</artifactId>
   <version>0.8</version>
 </dependency>
 
 All other transitive dependencies will be inherited.
 
 /Manuel
 
 --
 Thanks,
   Chameera
 
 --
 Manuel Blechschmidt
 M.Sc. IT Systems Engineering
 Dortustr. 57
 14467 Potsdam
 Mobil: 0173/6322621
 Twitter: http://twitter.com/Manuel_B
 
 
 
 
 -- 
 Thanks,
Chameera

-- 
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B



Re: Simple Recommendation System with Mahout and Netbeans

2014-01-15 Thread Chameera Wijebandara
Hi Manuel,

Thank you very much. I got it; I'll correct it.

Thanks,
Chameera


On Wed, Jan 15, 2014 at 7:47 PM, Manuel Blechschmidt 
manuel.blechschm...@gmx.de wrote:

 Hi Chameera,
 Netbeans has excellent maven support http://wiki.netbeans.org/Maven

 Mahout uses maven to build their artifacts a.k.a. jars. They get deployed
 to the central repositories. There is no need to create custom builds.

 On the following screenshot you already see the wizard for a maven project:


 http://chameerawijebandara.files.wordpress.com/2014/01/screenshot-from-2014-01-15-155415.png?w=600&h=400

 Create a new maven project. Afterwards add a dependency:

 http://mrhaki.blogspot.de/2009/07/add-maven-dependency-in-netbeans.html

 group id: org.apache.mahout
 artifact id: mahout-core
 version: 0.8

 There is no need to create custom builds. The maven artifacts will contain
 everything (classes, sources, tests, javadocs)

 /Manuel

 On 15.01.2014, at 15:08, Chameera Wijebandara wrote:

  Hi Manuel,
 
  Sorry, I didn't get your comment. Could you please explain further?
 
  Thanks,
 Chameera
 
 
  On Wed, Jan 15, 2014 at 7:24 PM, Manuel Blechschmidt 
  manuel.blechschm...@gmx.de wrote:
 
  Hi Chameera,
 
  On 15.01.2014, at 13:59, Chameera Wijebandara wrote:
 
  Hi,
 
  I have just published a simple blog post which describes a simple
  recommendation system with Mahout.
 
  Can you please tell me if there is anything to improve in this blog post:
 
  http://chameerawijebandara.wordpress.com/2014/01/15/simple-recommendation-system-with-mahout-and-netbeans/
 
 
  You should use the released maven artifacts instead of a custom build of
  Mahout.
 
  It is undefined whether a current trunk build will succeed.
 
  An example of an eclipse workspace based on maven and with mahout can be
  found here:
 
 
 https://raw2.github.com/ManuelB/facebook-recommender-demo/master/docs/EclipseWorkspace.png
 
  The only dependency that you need is the following:
 
 <dependency>
   <groupId>org.apache.mahout</groupId>
   <artifactId>mahout-core</artifactId>
   <version>0.8</version>
 </dependency>
 
  All other transitive dependencies will be inherited.
 
  /Manuel
 
  --
  Thanks,
Chameera
 
  --
  Manuel Blechschmidt
  M.Sc. IT Systems Engineering
  Dortustr. 57
  14467 Potsdam
  Mobil: 0173/6322621
  Twitter: http://twitter.com/Manuel_B
 
 
 
 
  --
  Thanks,
 Chameera

 --
 Manuel Blechschmidt
 M.Sc. IT Systems Engineering
 Dortustr. 57
 14467 Potsdam
 Mobil: 0173/6322621
 Twitter: http://twitter.com/Manuel_B




-- 
Thanks,
Chameera


Re: Item recommendation w/o users or preferences

2014-01-15 Thread Pat Ferrel
Haven’t read the whole thread but it sounds like you just need some simple 
start-here info…

To do collaborative filtering you must have user-id, item-id, action/weight.

For a minimal commerce CF cooccurrence recommender this is typically something 
like: 
user-id, item-id, 1=purchase

To use Mahout you will have to translate the ids into positive integers. Treat 
these like keys to an item-id or user-id lookup. So input into Mahout will be:
user-id-key, item-id-key, 1
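The id translation can be sketched roughly like this (illustrative Python; the helper names are made up — Mahout itself just consumes the numeric triples and you keep the lookup tables to translate recommendations back):

```python
# Sketch: map arbitrary user/item ids to positive integer keys, keeping
# reverse lookup tables so recommended item keys can be translated back.
def make_key_mapper(start=1):
    forward, backward = {}, {}
    def key_of(raw_id):
        if raw_id not in forward:
            k = start + len(forward)   # next unused positive integer
            forward[raw_id] = k
            backward[k] = raw_id
        return forward[raw_id]
    return key_of, backward

user_key, users_by_key = make_key_mapper()
item_key, items_by_key = make_key_mapper()

# Raw (user-id, item-id) purchase events with string ids.
raw = [("alice", "sku-42"), ("bob", "sku-42"), ("alice", "sku-7")]
triples = [(user_key(u), item_key(i), 1) for u, i in raw]
```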

You can do CF with anonymous user-ids, meaning an individual took the action 
but you don’t know who. However to use this data you will have to have some way 
of tying an id to a real person. Using transactions ids as a proxy for user-ids 
will work in the training data but once you want to make a recommendation you 
will have to know some real user history to allow the recommender to compare it 
with transactions.

Then you calculate recommendations using some Mahout recommender. If you are 
using the hadoop version the output will be a row per user-id-key that will 
contain some number of recommendation item-id-keys and their recommendation 
weight for sorting purposes. You then write your own retrieval code to get the 
recs for a given user-id-key, since they are all pre-calculated and in a 
Sequence File. If you are using the in-memory recommender you can ask for recs 
for a given user-id-key and get the list returned.

You can also use transaction data alone to make anonymous recommendations, but 
that is market basket analysis. In that case you have:
transaction-id-key, item-id-key, 1

Then at recommendation time you have a list of items in a single basket. There 
are several ways to get this to work so I’ll stop here unless it’s what you 
need, in which case let us know.


On Jan 11, 2014, at 1:38 PM, Tim Smith timsmit...@hotmail.com wrote:

 Is it about how to arrange your data to use this computation?  The
 references below might help with that.

Yes, I read and tried the recommendation examples from MIA and there is a
mention of item-to-item similarity, but I am not sure what form the file
should take. The examples are along the lines of userid,itemid,value.

In section 6.2 of MIA we are multiplying the co-occurrence matrix by the user
preferences to get the recommendations (top of page 97), so if I do not have
preferences, should I just default them all to the same value? Taken together
with your previous comments, is this how I should be preparing my data?

Raw Sample Data (format: Transaction|Item)
123|Sun Glasses
124|Sun Glasses
124|Sun Glass Case
125|Sun Glass Case
126|Sun Glasses
126|Glass Repair Kit
127|Glass Repair Kit

Are you suggesting that I just simply use (format:  userid|item|value)
123|Sun Glasses|1
124|Sun Glasses|1
124|Sun Glass Case|1
125|Sun Glass Case|1
126|Sun Glasses|1
126|Glass Repair Kit|1
127|Glass Repair Kit|1
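As a rough sketch of the MIA section 6.2 idea applied to this sample data (plain Python for illustration, with a uniform preference of 1 per basket item as discussed above — Mahout's item-similarity jobs do the equivalent at scale):

```python
# Build a co-occurrence matrix from the sample transactions, then score
# items against a basket by summing co-occurrence counts (a uniform
# preference of 1 per basket item).
from collections import defaultdict
from itertools import combinations

transactions = {
    "123": ["Sun Glasses"],
    "124": ["Sun Glasses", "Sun Glass Case"],
    "125": ["Sun Glass Case"],
    "126": ["Sun Glasses", "Glass Repair Kit"],
    "127": ["Glass Repair Kit"],
}

cooccur = defaultdict(int)
for items in transactions.values():
    for a, b in combinations(sorted(items), 2):
        cooccur[(a, b)] += 1   # symmetric counts
        cooccur[(b, a)] += 1

def recommend(basket):
    scores = defaultdict(int)
    for item in basket:
        for (a, b), count in cooccur.items():
            if a == item and b not in basket:
                scores[b] += count
    return sorted(scores, key=scores.get, reverse=True)

recs = recommend(["Sun Glasses"])
```

For a basket containing only "Sun Glasses", both "Sun Glass Case" and "Glass Repair Kit" come back, each having co-occurred once with it.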

 Is it regarding the specifics of how you do the computation?  I can help
 with that, but would need a pointer to the difficulty.

Not quite yet. I am working through the intuition first; I'll fight through
the math once, if ever, the fog clears.
  



seqdirectory error on mapreduce

2014-01-15 Thread pzhang
My set-up seems to work well:
$hadoop fs -copyFromLocal reuters_text reuters_text
$hadoop fs -ls reuters_text/
(many files in the directory)

$mahout seqdirectory -i reuters_text -o reuters_seq -c UTF-8

The reuters_seq directory is generated but empty.

The mapreduce job failed with the message:

14/01/15 11:27:36 INFO mapreduce.Job: Task Id :
attempt_1389802273858_0007_m_01_2, Status : FAILED
Error: java.lang.RuntimeException:
java.lang.reflect.InvocationTargetException
at
org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
at
org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
at
org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
at
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native
Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.reflect.InvocationTargetException
at
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at
java.lang.reflect.Constructor.newInstance(Constructor.java:534)
at
org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
... 10 more
Caused by: java.lang.IncompatibleClassChangeError: Found interface

Some suggestions?

I am running Hadoop 2.2.0, Mahout 0.8.

Thanks.

Pengchu




--
View this message in context: 
http://lucene.472066.n3.nabble.com/seqdirectory-error-on-mapreduce-tp4111508.html
Sent from the Mahout User List mailing list archive at Nabble.com.


Re: Mahout 0.9 Release Candidate - VOTE

2014-01-15 Thread Suneel Marthi
The Release has been rolled back to include a few last minute fixes, will be 
sending out a new Release link in a day.





On Wednesday, January 15, 2014 1:33 PM, spa...@gmail.com spa...@gmail.com 
wrote:
 
I am getting a 404 Item not found exception for this link. Complete stack
trace is -
404 - ItemNotFoundException

Repository with ID=pathPrefixOrId: 'orgapachemahout-1000' not found

org.sonatype.nexus.proxy.ItemNotFoundException: Repository with
ID=pathPrefixOrId: 'orgapachemahout-1000' not found
    at 
org.sonatype.nexus.proxy.router.DefaultRepositoryRouter.getRequestRouteForRequest(DefaultRepositoryRouter.java:456)
    at 
org.sonatype.nexus.proxy.router.DefaultRepositoryRouter.retrieveItem(DefaultRepositoryRouter.java:147)
    at 
org.sonatype.nexus.web.content.NexusContentServlet.doGet(NexusContentServlet.java:359)
    at 
org.sonatype.nexus.web.content.NexusContentServlet.service(NexusContentServlet.java:331)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:278)
    at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:268)
    at 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:180)
    at 
com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:93)
    at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
    at 
com.yammer.metrics.web.WebappMetricsFilter.doFilter(WebappMetricsFilter.java:76)
    at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at 
org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
    at 
org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
    at 
org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
    at 
org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
    at 
org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
    at 
org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
    at 
org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
    at 
org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
    at 
org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
    at 
org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
    at 
org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
    at 
org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
    at 
org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
    at 
org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
    at 
org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
    at 
org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
    at 
org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
    at 
org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:383)
    at 
org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
    at 
org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
    at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at 
org.sonatype.nexus.web.NexusGuiceFilter$MultiFilterChain.doFilter(NexusGuiceFilter.java:83)
    at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
    at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at 
org.sonatype.nexus.web.NexusGuiceFilter$MultiFilterChain.doFilter(NexusGuiceFilter.java:83)
    at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
    at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at 
org.sonatype.nexus.web.NexusGuiceFilter$MultiFilterChain.doFilter(NexusGuiceFilter.java:83)
    at 
org.sonatype.nexus.web.NexusGuiceFilter$MultiFilterPipeline.dispatch(NexusGuiceFilter.java:57)
    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:132)
    at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:129)
    at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:206)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:129)
    at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at 

Classification of books

2014-01-15 Thread Suresh M
Hi,
Our application will be getting books from different users.
We have to classify them accordingly.
Could someone please tell me how to do this using Apache Mahout and Java?
Is Hadoop necessary for that?


--
Thanks & Regards
Suresh


Re: Classification of books

2014-01-15 Thread Suresh M
Hi,

Can you please tell me what that pre-processing means? Is it
vectorization (as explained in the Mahout in Action book)?
Can it be done using Java and the Mahout API?
And the model: is it a class?




On 16 January 2014 11:38, KK R kirubakumar...@gmail.com wrote:

 Hi Suresh,

 Apache Mahout has certain classification algorithms which you can use to do
 the classification.

 Step 1: Your data may require some pre-processing. If so, it can be done
 using Hadoop / Hive / Mahout utilities.

 Step 2: Run classification algorithm on your training data and build your
 model using Mahout classification algorithms.

 Step 3: When the actual data comes, it needs to be classified with the help
 of the trained model. This can be done sequentially in Java, or mapreduce can
 be used if the size of the data is huge and scalability is a requirement.
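To make the three steps concrete, here is a toy multinomial Naive Bayes classifier in plain Python (illustrative only — in practice Mahout's own classification jobs or its Java API do this at scale, and the example books/labels are invented):

```python
# Toy Naive Bayes: preprocess (step 1), train a model (step 2),
# classify new data (step 3).
import math
from collections import Counter, defaultdict

def tokenize(text):                 # step 1: minimal preprocessing
    return text.lower().split()

def train(labeled_docs):            # step 2: build the model
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for label, text in labeled_docs:
        tokens = tokenize(text)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def classify(model, text):          # step 3: score a new document
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)   # class prior
        n = sum(word_counts[label].values())
        for token in tokenize(text):
            # Laplace smoothing so unseen words don't zero out the score.
            score += math.log((word_counts[label][token] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

books = [
    ("fiction", "dragons and a quest through a magic kingdom"),
    ("science", "experiments data and the physics of motion"),
]
model = train(books)
label = classify(model, "a magic quest with dragons")
```

Hadoop only becomes necessary when the collection is too large for this kind of in-memory, single-machine approach.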

 Thanks,
 Kirubakumaresh
 @http://www.linkedin.com/pub/kirubakumaresh-rajendran/66/411/305


 On Thu, Jan 16, 2014 at 11:28 AM, Suresh M suresh4mas...@gmail.com
 wrote:

  Hi,
  Our application will be getting books from different users.
  We have to classify them accordingly.
  Could someone please tell me how to do this using Apache Mahout and Java?
  Is Hadoop necessary for that?
 
 
  --
  Thanks & Regards
  Suresh