Re: Welcome Pat Ferrel as new committer on Mahout

2014-04-24 Thread Kevin Moulart
Congratulations Pat! Wish you the best! Kévin Moulart 2014-04-24 13:52 GMT+02:00 Martin, Nick nimar...@pssd.com: Awesome Pat congrats!!! Very well deserved. Sent from my iPhone On Apr 24, 2014, at 6:20 AM, Sebastian Schelter s...@apache.org wrote: Hi, this is to announce that the

Re: Command line vector to sequence file

2014-03-18 Thread Kevin Moulart
On 18/03/14 10:58, Kevin Moulart wrote: Hi, I did the same search a few weeks back and found that there is nothing in the current API to do that from command line. However I did write a java program

Re: Compiling Mahout with maven in Eclipse

2014-03-13 Thread Kevin Moulart
, because I can't find those maps/sets/lists in the math package? (I have the same problem on my Windows, CentOS and Mac OS machines.) Kévin Moulart 2014-03-12 17:00 GMT+01:00 Kevin Moulart kevinmoul...@gmail.com: Never mind, I found where the problem lay: I deleted the full content of .m2

Re: Compiling Mahout with maven in Eclipse

2014-03-13 Thread Kevin Moulart
How can I generate them to make these errors go away then ? Or don't I have to ? Kévin Moulart 2014-03-13 9:17 GMT+01:00 Sebastian Schelter ssc.o...@googlemail.com: Those are autogenerated. On 03/13/2014 09:05 AM, Kevin Moulart wrote: Ok it does compile with maven in eclipse as well

Re: Compiling Mahout with maven in Eclipse

2014-03-13 Thread Kevin Moulart
on the commandline? On 03/13/2014 09:50 AM, Kevin Moulart wrote: How can I generate them to make these errors go away then ? Or don't I have to ? Kévin Moulart 2014-03-13 9:17 GMT+01:00 Sebastian Schelter ssc.o...@googlemail.com: Those are autogenerated. On 03/13/2014 09:05 AM, Kevin Moulart

Fwd: Compiling Mahout with maven in Eclipse

2014-03-13 Thread Kevin Moulart
in Eclipse To: Kevin Moulart kevinmoul...@gmail.com I use IntelliJ IDEA. Its support for Maven projects is very nice. You should be able to simply import Mahout as a Maven project there and everything should work fine. --sebastian On 03/13/2014 10:24 AM, Kevin Moulart wrote: Actually I pretty

Re: Website, urgent help needed

2014-03-12 Thread Kevin Moulart
I can confirm what Sebastian said. I'm fairly new to this and I did find myself so desperate at some point that I almost gave up on Mahout due to the lack of documentation, but my feeling is that it doesn't only concern the website: the API is too sparsely documented as well. At this point there are no

Re: Website, urgent help needed

2014-03-12 Thread Kevin Moulart
it. If you start working on the javadoc, please create a jira issue for that work before you start. Best, Sebastian On 03/12/2014 09:30 AM, Kevin Moulart wrote: I can confirm what Sebastian said. I'm fairly new to this and I did find myself so desperate at some point that I almost gave up

Compiling Mahout with maven in Eclipse

2014-03-12 Thread Kevin Moulart
Hi, I tried to fix all the problems I had configuring Eclipse to compile Mahout in it using maven clean package as the goal. First I had to make a change in mahout core in the class GroupTree.java, line 171: stack = new ArrayDeque<GroupTree>(); Then I tried compiling with Eclipse (I

Re: Compiling Mahout with maven in Eclipse

2014-03-12 Thread Kevin Moulart
Never mind, I found where the problem lay: I deleted the full content of .m2 and retried it as a non-root user and it worked. Trying in Eclipse now, with tests. I'll let you know if it doesn't work. Kévin Moulart 2014-03-12 16:45 GMT+01:00 Kevin Moulart kevinmoul...@gmail.com: Hi, I tried

Re: PCA to improve classification performances

2014-03-10 Thread Kevin Moulart
reduction), followed by train Naive Bayes and test Naive Bayes. On Friday, March 7, 2014 10:01 AM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi again, I'm now using Mahout 0.9, and I'm trying to use PCA (via the SSVD) to reduce the dimension of a dataset from 1600+ features to ~100

Re: PCA to improve classification performances

2014-03-10 Thread Kevin Moulart
algorithm used. Kévin Moulart 2014-03-10 9:45 GMT+01:00 Suneel Marthi suneel_mar...@yahoo.com: On Monday, March 10, 2014 4:21 AM, Kevin Moulart kevinmoul...@gmail.com wrote: It's not clear to me from your description what exact sequence of steps you are running through, but an SSVD job

Re: PCA to improve classification performances

2014-03-10 Thread Kevin Moulart
Dmitriy Lyubimov dlie...@gmail.com: PCA and SSVD propagate the exact row keys given in the input. If you give it text keys, U and Usigma will have text keys. It doesn't change that. On Mar 10, 2014 3:39 AM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi and thanks, I'll try that, but I'd like to do so

Re: Fwd: PCA with ssvd leads to StackOverFlowError

2014-03-07 Thread Kevin Moulart
Perfect! It works like a charm now! I'll still be testing after lunch, and let you know if any new problem comes up, but it looks promising! Thank you very much! Kévin Moulart 2014-03-06 19:31 GMT+01:00 Ted Dunning ted.dunn...@gmail.com: On Thu, Mar 6, 2014 at 7:46 AM, Kevin Moulart

PCA to improve classification performances

2014-03-07 Thread Kevin Moulart
Hi again, I'm now using Mahout 0.9, and I'm trying to use PCA (via the SSVD) to reduce the dimension of a dataset from 1600+ features to ~100 and then to use the reduced dataset to train a naive bayes model and test it. So here is my workflow: - Transform my CSV into a SequenceFile with

Re: Welcome Andrew Musselman as new comitter

2014-03-07 Thread Kevin Moulart
Congratulations Andrew! Sent from Mailbox for iPhone On Fri, Mar 7, 2014 at 6:26 PM, Frank Scholten fr...@frankscholten.nl wrote: Congratulations Andrew! On Fri, Mar 7, 2014 at 6:12 PM, Sebastian Schelter s...@apache.org wrote: Hi, this is to announce that the Project Management Committee

Re: Fwd: PCA with ssvd leads to StackOverFlowError

2014-03-06 Thread Kevin Moulart
Hi again, and thanks for the enthusiasm! I did compile the trunk with the hadoop2 profile and, although it didn't work at first because of some Canopy tests not passing, when I skipped the tests it compiled and when I tested it afterward it passed. I used the version I have installed, so I just

Re: Rework our website

2014-03-06 Thread Kevin Moulart
Hi, I also prefer the second one. While I'm at it, there are several links that point to missing pages. I just clicked on all the links present on the page: http://mahout.apache.org/users/basics/quickstart.html And those links are broken:

Re: Rework our website

2014-03-06 Thread Kevin Moulart
and post the links there? That would be awesome, then we can track that this stuff gets fixed. Best, Sebastian On 03/06/2014 02:58 PM, Kevin Moulart wrote: Hi, I also prefer the second one. While I'm at it, there are several links that point to missing pages. I just clicked on all the links

Re: Fwd: PCA with ssvd leads to StackOverFlowError

2014-03-06 Thread Kevin Moulart
verify that you have the right hadoop jars with the following command: find . -name "hadoop*.jar" Gokhan On Thu, Mar 6, 2014 at 3:26 PM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi again, and thanks for the enthusiasm! I did compile the trunk with the hadoop2 profile and, although

Re: Fwd: PCA with ssvd leads to StackOverFlowError

2014-03-06 Thread Kevin Moulart
this has come up before too. On Thu, Mar 6, 2014 at 3:23 PM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi thanks very much it seems to have worked ! Compiling with mvn clean package -Dhadoop2.version=2.0.0-cdh4.6.0 works and I no longer have the error, but then when running tests

Re: Fwd: PCA with ssvd leads to StackOverFlowError

2014-03-06 Thread Kevin Moulart
someone cleaned that up... On Thu, Mar 6, 2014 at 3:34 PM, Kevin Moulart kevinmoul...@gmail.com wrote: Ok so should I try and recompile and change the guava version to 11.0.2 in the pom ? Kévin Moulart 2014-03-06 16:26 GMT+01:00 Sean Owen sro...@gmail.com: That's gonna

Re: PCA with ssvd leads to StackOverFlowError

2014-03-05 Thread Kevin Moulart
. On Tuesday, March 4, 2014 8:54 AM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi, I'm trying to apply a PCA to reduce the dimension of a matrix of 1603 columns and 100,000 to 30,000,000 rows using ssvd with the pca option, and I always get a StackOverflowError: Here is my command line

PCA with ssvd leads to StackOverFlowError

2014-03-04 Thread Kevin Moulart
Hi, I'm trying to apply a PCA to reduce the dimension of a matrix of 1603 columns and 100,000 to 30,000,000 rows using ssvd with the pca option, and I always get a StackOverflowError: Here is my command line: mahout ssvd -i /user/myUser/Echant100k -o /user/myUser/Echant/SVD100 -k 100 -pca

Re: Use Naïve Bayes on a large CSV

2014-02-25 Thread Kevin Moulart
= 1 I did the testnb with the exact same file I used to train the model. Any idea? 2014-02-25 11:33 GMT+01:00 Kevin Moulart kevinmoul...@gmail.com: All right, I've managed to narrow it down to the LabelIndex. I went to see the code but it isn't really clear to me at all. What exactly should I

Re: Use Naïve Bayes on a large CSV

2014-02-25 Thread Kevin Moulart
); // Create a label with a / and the class label String label = c[0] + "/" + c[0]; // Write all in the seqfile writer.append(new Text(label), writable); } } catch (NumberFormatException e) { continue; } } writer.close(); reader.close(); } 2014-02-25 16:25 GMT+01:00 Kevin Moulart kevinmoul
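For readers following the fragment above, here is a minimal, dependency-free sketch of its per-line parsing step. It assumes that the first CSV column is the class label (as `c[0]` suggests) and that every other column is a numeric feature; neither is confirmed by the truncated message. The names `CsvRow` and `parse` are hypothetical, and the Hadoop and Mahout calls from the original (`writer.append`, `Text`, the vector writable) are deliberately left out so the sketch runs on its own.

```java
import java.util.Arrays;

public class CsvRow {
    final String key;        // key in the "label/label" form built in the fragment
    final double[] features; // numeric feature columns (assumed: every column after the first)

    CsvRow(String key, double[] features) {
        this.key = key;
        this.features = features;
    }

    // Parse one CSV line. Returns null when the line has no feature column
    // or a feature is not numeric, which mirrors the
    // "catch (NumberFormatException e) { continue; }" in the fragment.
    static CsvRow parse(String line) {
        String[] c = line.split(",");
        if (c.length < 2) {
            return null;
        }
        double[] f = new double[c.length - 1];
        try {
            for (int i = 1; i < c.length; i++) {
                f[i - 1] = Double.parseDouble(c[i].trim());
            }
        } catch (NumberFormatException e) {
            return null;
        }
        String label = c[0].trim();
        return new CsvRow(label + "/" + label, f);
    }

    public static void main(String[] args) {
        CsvRow row = parse("1,0.5,2.0,3.25");
        System.out.println(row.key + " " + Arrays.toString(row.features));
    }
}
```

In the original program each parsed row would presumably be wrapped in a vector and appended to the sequence file under `row.key`; that step depends on library versions not shown in the snippet, so it is not reproduced here.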

Re: Use Naïve Bayes on a large CSV

2014-02-24 Thread Kevin Moulart
have: Does Mahout have a concept of adapters which allow us to read CSV-style data with filters to create the exact format for its various inputs (i.e. Recommender three-column format)? If not, is it worth a jira? On Feb 20, 2014, at 7:50 AM, Kevin Moulart kevinmoul...@gmail.com wrote

Re: Use Naïve Bayes on a large CSV

2014-02-24 Thread Kevin Moulart
: This relates to a previous question I have: Does Mahout have a concept of adapters which allow us to read CSV-style data with filters to create the exact format for its various inputs (i.e. Recommender three-column format)? If not, is it worth a jira? On Feb 20, 2014, at 7:50 AM, Kevin Moulart

Re: Use Naïve Bayes on a large CSV

2014-02-24 Thread Kevin Moulart
on it for the moment. 2014-02-24 15:37 GMT+01:00 Ted Dunning ted.dunn...@gmail.com: Kevin, While this is fresh in your mind can you prepare a javadoc patch that would have helped you out? And suggest other doc patches as well? On Mon, Feb 24, 2014 at 3:00 AM, Kevin Moulart kevinmoul...@gmail.com

Use Naïve Bayes on a large CSV

2014-02-20 Thread Kevin Moulart
Hi, I'm trying to apply a Naive Bayes Classifier to a large CSV file from the command line. I know I have to feed the classifier with a seq file, so I tried to put my csv into one using the command seqdirectory, but even when I try with a really small csv (less than 100 MB) I instantly get an

Re: Use Naïve Bayes on a large CSV

2014-02-20 Thread Kevin Moulart
generate vectors from input CSV that could then be fed into Mahout classifier/clustering jobs. On Thursday, February 20, 2014 5:57 AM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi I'm trying to apply a Naive Bayes Classifier to a large CSV file from the command line. I know I have

Mahout 0.9 with cloudera

2014-02-06 Thread Kevin Moulart
Hi everyone, Is there a simple way to install Mahout 0.9 on a cluster running Cloudera's CDH 4.5 ? When I try what they advise on their doc (yum install mahout on my CentOS 6 node), it wants to install mahout version 0.7+22-1.cdh4.5.0.p0.14.el6. Thanks in advance ! -- Kévin Moulart GSM France

Problem with mahout classpath after update

2013-12-23 Thread Kevin Moulart
Hi I had mahout working on my cluster of VMs running cloudera CDH 4.4 and ever since I updated to cloudera 4.5, it stopped working, as if it could not find its classpath, despite all my attempts to fix it : http://pastebin.com/5nxpqZEC Can anyone tell me what I did wrong ? Here is my .bashrc

Re: Problem with mahout classpath after update

2013-12-23 Thread Kevin Moulart
version). Please upgrade to the latest version of Mahout. On Monday, December 23, 2013 8:59 AM, Kevin Moulart kevinmoul...@gmail.com wrote: Hi I had mahout working on my cluster of VMs running cloudera CDH 4.4 and ever since I updated to cloudera 4.5, it stopped working, as if it could