from:"Mahmood Naderan"

Re: deprecated command

2016-04-11 Thread Mahmood Naderan

Hi, Any feedback is appreciated. Regards, Mahmood On Sat, Apr 9, 2016 at 3:36 PM, Mahmood Naderan <mahmood...@gmail.com> wrote: > Hi, > As I run the command > > mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4 > > I get this error: > >

deprecated command

2016-04-09 Thread Mahmood Naderan

Hi, As I run the command mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4 I get this error: Running on hadoop, using /opt/new_analytic/hadoop-2.7.1/bin/hadoop and HADOOP_CONF_DIR= MAHOUT-JOB:

Re: Code execution path of mahout

2016-02-03 Thread Mahmood Naderan

Many thanks for that. I am getting closer to what I was searching for... Is there any high-level document describing what the classifier does (using MapReduce) after the training phase? For example: 1- Reading chunks 2- Sorting each chunk 3-... I didn't find such an example on the web.

NegativeArraySizeException in seqdirectory

2014-04-01 Thread Mahmood Naderan

Hello, Running seqdirectory (from Mahout 0.9) on a large input file gives the exception shown below. Any idea? MAHOUT_LOCAL is set, running locally 14/04/01 12:15:17 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647],

Re: Profiling with visualvm

2014-03-31 Thread Mahmood Naderan

Whoever tried mahout/hadoop profiling, please let us know Regards, Mahmood On Sunday, March 30, 2014 2:30 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Profiled what exactly, a Hadoop job? As soon as I run /mahout testclassifier -m wikipediamodel -d wikipediainput I see

Using split without partitioning the data to train/test

2014-03-31 Thread Mahmood Naderan

Hi, In an older Mahout release, I used wikipediaDataSetCreator on an input to create the training data: mahout wikipediaDataSetCreator -i wiki-tr/chunks -o tr-input -c labels.txt and then fed tr-input to trainclassifier using: mahout trainclassifier -i tr-input -o wikimodel Now, in

Re: Using split without partitioning the data to train/test

2014-03-31 Thread Mahmood Naderan

Yeah you are right. I have to ignore that command Regards, Mahmood On Monday, March 31, 2014 6:56 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Sent from my iPhone On Mar 31, 2014, at 4:20 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi, In an old Mahout, I used

Profiling with visualvm

2014-03-30 Thread Mahmood Naderan

Hi, I profiled the Mahout command with visualvm and saw many threads. Some of them are related to the profiler and some others are communication threads. Interestingly, the main thread is always in the sleep state! From the thread dump (which has been attached), the owner is Mahout.

Re: Profiling with visualvm

2014-03-30 Thread Mahmood Naderan

Profiled what exactly, a Hadoop job? As soon as I run /mahout testclassifier -m wikipediamodel -d wikipediainput I see a org.apache.mahout.driver.MahoutDriver in the visualvm and then I open it. Regards, Mahmood

Re: Question about Mahout/Hadoop

2014-03-29 Thread Mahmood Naderan

` commands are; that's where things start to be executed. On Fri, Mar 28, 2014 at 12:34 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi, I want to know: when I run a command like mahout trainnb -i -o ..., am I running Mahout code or Hadoop code? In other words, which one is dominant

Question about Mahout/Hadoop

2014-03-28 Thread Mahmood Naderan

Hi, I want to know: when I run a command like mahout trainnb -i -o ..., am I running Mahout code or Hadoop code? In other words, which one is dominant? Regards, Mahmood

Re: debug mode

2014-03-26 Thread Mahmood Naderan

at 2:47 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hello, 1- Can we debug Hadoop/Mahout with the available bin files, or do we have to rebuild them? Probably they are compiled with optimizations. 2- I cannot find a procedure on how to build Hadoop/Mahout in debug mode. If anyone knows

Re: debug mode

2014-03-26 Thread Mahmood Naderan

Excuse me, I forgot to ask: should I use mvnDebug for debugging purposes? Regards, Mahmood On , Mahmood Naderan nt_mahm...@yahoo.com wrote: Let me put it this way. With GNU Make, we pass -g -ggdb to insert debug symbols into the object file. On the other hand, if we use -O3

trainclassifier/trainnb

2014-03-25 Thread Mahmood Naderan

Hi, What is the correct syntax for this old command? mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4 It seems that trainclassifier is replaced by trainnb but this one has no -mf option. Regards, Mahmood

Re: trainclassifier/trainnb

2014-03-25 Thread Mahmood Naderan

, 2014 at 3:17 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi, What is the correct syntax for this old command? mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4 It seems that trainclassifier is replaced by trainnb but this one has no -mf option. Regards

Multiple errors and messages

2014-03-18 Thread Mahmood Naderan

Hello, When I run the following command on Mahout-0.9 and Hadoop-1.2.1, I get multiple errors and I cannot figure out what the problem is. Sorry for the long post. [hadoop@solaris ~]$ mahout wikipediaDataSetCreator -i wikipedia/chunks -o wikipediainput -c ~/categories.txt Running on hadoop,

debug mode

2014-03-14 Thread Mahmood Naderan

Hello, 1- Can we debug Hadoop/Mahout with the available bin files, or do we have to rebuild them? Probably they are compiled with optimizations. 2- I cannot find a procedure on how to build Hadoop/Mahout in debug mode. If anyone knows, please let me know. Since I am a gcc/make guy, maybe these
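
For readers coming from gcc/make, a rough mapping may help. javac has a -g flag that writes debug information into class files, and Maven's compiler plugin enables debug information by default. There is no build-time counterpart to -O3, since optimization happens in the JIT at run time, so the released jars can usually be debugged as shipped. The sketch below builds the standard JDWP agent option so a debugger can attach to a locally run Mahout JVM. The port 8000 and the use of MAHOUT_OPTS are assumptions; check that the bin/mahout script in your distribution passes that variable to java.

```shell
#!/bin/sh
# Build a JDWP agent option string for a given port.
# suspend=y makes the JVM wait until a debugger attaches.
jdwp_opts() {
    port="$1"
    printf '%s' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=${port}"
}

# Assumption: bin/mahout forwards MAHOUT_OPTS to the local JVM.
MAHOUT_OPTS="$(jdwp_opts 8000)"
export MAHOUT_OPTS

# Run in-process rather than submitting to Hadoop, so there is one JVM to attach to.
MAHOUT_LOCAL=true
export MAHOUT_LOCAL

echo "MAHOUT_OPTS=$MAHOUT_OPTS"
# Then start the command to debug, for example:
#   mahout testclassifier -m wikipediamodel -d wikipediainput
```

With suspend=y the JVM pauses at startup until an IDE or jdb connects to the chosen port; switching to suspend=n lets it run immediately.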

verbose output

2014-03-13 Thread Mahmood Naderan

Hi, Is there any verbosity flag for hadoop and mahout commands? I cannot find such a thing in the command line. Regards, Mahmood
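
No verbosity flag is named in the part of the thread shown here. For the hadoop command, one commonly used route is the HADOOP_ROOT_LOGGER environment variable, which the hadoop launcher script reads to set the log4j root logger. Whether a mahout command inherits it when it delegates to hadoop is an assumption, so treat this as a sketch to try rather than a confirmed answer.

```shell
# Raise the Hadoop client log level to DEBUG for this shell session.
HADOOP_ROOT_LOGGER="DEBUG,console"
export HADOOP_ROOT_LOGGER

echo "HADOOP_ROOT_LOGGER=$HADOOP_ROOT_LOGGER"
# Hadoop commands started from this shell should now log at DEBUG, e.g.:
#   hadoop fs -ls /
```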

Re: verbose output

2014-03-13 Thread Mahmood Naderan

. On 03/13/2014 10:21 AM, Mahmood Naderan wrote: Hi, Is there any verbosity flag for hadoop and mahout commands? I cannot find such a thing in the command line. Regards, Mahmood

Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan

On Tuesday, March 11, 2014 11:57 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: As I posted earlier, here is the result of a successful test: a 5.4GB XML file (which is larger than enwiki-latest-pages-articles10.xml) with 4GB of RAM and -Xmx128m took 5 minutes to complete. I didn't find a larger

bug report

2014-03-13 Thread Mahmood Naderan

Hi Where can I submit a mahout bug? I am not familiar with JIRA and I see issues and agile. Regards, Mahmood

Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan

. Regards, Mahmood On Thursday, March 13, 2014 2:31 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: The strange thing is that whether I use -Xmx128m or -Xmx16384m, the process stops at chunk #571 (571*64MB ≈ 36.5GB). I still haven't figured out whether this is a problem with the JVM, Hadoop, or Mahout. I have
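
The arithmetic in that message can be checked quickly. This is a small sketch that assumes the splitter's -c 64 option means 64 MB per chunk, which is what the 571*64 calculation implies; the function name is made up for illustration.

```shell
# Data covered by a number of chunks, in MB.
# $1 = number of chunks, $2 = chunk size in MB (the -c value).
chunks_to_mb() {
    echo $(( $1 * $2 ))
}

chunks_to_mb 571 64   # prints 36544 (about 36.5 GB in decimal units)
chunks_to_mb 500 64   # prints 32000 (the 500-chunk point mentioned elsewhere in the thread)
```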

Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan

splitter class does but if you're interested in isolating the issue you could find out what is happening data-wise and see if there is some very large grouping on a pathologically frequent key for instance. On Thu, Mar 13, 2014 at 11:31 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: I am

Re: bug report

2014-03-13 Thread Mahmood Naderan

. On Thu, Mar 13, 2014 at 11:29 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi Where can I submit a mahout bug? I am not familiar with JIRA and I see issues and agile. Regards, Mahmood

Re: Heap space

2014-03-11 Thread Mahmood Naderan

needs 2GB. If my system supports 10GB of heap, then I will feed 5 threads at one time. When the first 5 threads are done (the chunks) then I will feed the next 5 threads and so on. Regards, Mahmood On Monday, March 10, 2014 9:42 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: UPDATE: I

Re: Heap space

2014-03-11 Thread Mahmood Naderan

Suneel, One more thing: right now it has created 500 chunks, so 32GB out of 48GB (the original size of the XML file) has been processed. Is it possible to resume that? Regards, Mahmood On Tuesday, March 11, 2014 9:47 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Suneel, Is it possible

Re: Heap space

2014-03-11 Thread Mahmood Naderan

be a question for a Hadoop list, unless I'm misunderstanding. When you say resume what do you mean? On Tue, Mar 11, 2014 at 11:46 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Suneel, One more thing Right now it has created 500 chunks. So 32GB out of 48GB (the original size of XML file) has

Solving heap size error

2014-03-11 Thread Mahmood Naderan

Hi, Recently I have faced a heap size error when I run $MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64 Here are the specs: 1- XML file size = 44GB 2- System memory = 54GB (on virtualbox) 3- Heap size = 51GB

Re: Solving heap size error

2014-03-11 Thread Mahmood Naderan

to running on the entire English Wikipedia. On Tue, Mar 11, 2014 at 12:56 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi, Recently I have faced a heap size error when I run $MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml -o

Re: Heap space

2014-03-10 Thread Mahmood Naderan

. On Sunday, March 9, 2014 2:25 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi Suneel, Do you have any idea? Searching the web shows many questions regarding the heap size for wikipediaXMLSplitter. I have increased the memory size to 16GB and still get that error. I have to say

Re: Heap space

2014-03-10 Thread Mahmood Naderan

on the entire English Wikipedia. On Monday, March 10, 2014 2:59 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Thanks for the update. The thing is, while that command is running, I run the 'top' command in another terminal and see that the java process takes less than 1GB of memory. As another test

Re: Heap space

2014-03-10 Thread Mahmood Naderan

UPDATE: I split another 5.4GB XML file with 4GB of RAM and -Xmx128m and it took 5 minutes Regards, Mahmood On Monday, March 10, 2014 7:16 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: The extracted size is about 960MB (enwiki-latest-pages-articles10.xml). With 4GB of RAM set for the OS

Heap space

2014-03-09 Thread Mahmood Naderan

Hello, I ran this command: ./bin/mahout wikipediaXMLSplitter -d examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64 but got this error: Exception in thread "main" java.lang.OutOfMemoryError: Java heap space There are many web pages regarding this and the solution is

Re: Heap space

2014-03-09 Thread Mahmood Naderan

OK, I found that I have to add this property to mapred-site.xml: <property> <name>mapred.child.java.opts</name> <value>-Xmx2048m</value> </property> Regards, Mahmood On Sunday, March 9, 2014 11:39 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hello, I ran this command ./bin/mahout

Re: Heap space

2014-03-09 Thread Mahmood Naderan

Excuse me, I added the -Xmx option and restarted the Hadoop services using sbin/stop-all.sh and sbin/start-all.sh; however, I still get the heap size error. How can I find the heap size that is actually needed? Regards, Mahmood On Sunday, March 9, 2014 1:37 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote
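
One possible reason the mapred-site.xml change had no effect: mapred.child.java.opts sizes the JVMs of MapReduce tasks, while a command started through bin/mahout also runs a client JVM of its own. As far as I know, the bin/mahout launcher reads an optional MAHOUT_HEAPSIZE variable (in MB) and turns it into an -Xmx flag; treat that as an assumption and confirm it in your copy of the script. A sketch with an arbitrary 4096 MB value:

```shell
# Assumption: bin/mahout converts MAHOUT_HEAPSIZE (MB) into -Xmx<value>m.
MAHOUT_HEAPSIZE=4096
export MAHOUT_HEAPSIZE

# Show the flag the launcher would be expected to derive from it.
heap_flag="-Xmx${MAHOUT_HEAPSIZE}m"
echo "expected JVM heap flag: $heap_flag"

# Then rerun the splitter from the Mahout directory, for example:
#   ./bin/mahout wikipediaXMLSplitter -d examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64
```

Watching the java process with top while the command runs, as described elsewhere in the thread, is a reasonable way to see whether the new limit is actually in effect.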

Re: mahout command

2014-03-08 Thread Mahmood Naderan

: wikipedia splitter hadoop@solaris:~/mahout-distribution-0.9$ bin/mahout wikipediaXMLSplitter -d examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64 On Sat, Mar 8, 2014 at 11:11 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: No success Suneel... Please see the attachment

Re: mahout command

2014-03-08 Thread Mahmood Naderan

What a fast reply... Thanks a lot Suneel, Regards, Mahmood On Saturday, March 8, 2014 11:29 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: You can ignore the warnings. On Saturday, March 8, 2014 2:58 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Oh yes... Thanks Andrew you

mahout command

2014-03-07 Thread Mahmood Naderan

Hi When I run mahout wikipediaXMLSplitter -d examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64 I get this error 14/03/07 16:24:13 WARN driver.MahoutDriver: Unable to add class: wikipediaXMLSplitter java.lang.ClassNotFoundException: wikipediaXMLSplitter at
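
The warning says the driver could not map the short name wikipediaXMLSplitter to a class. Mahout's driver resolves short names through a properties file, so one way to diagnose this is to search that file case-insensitively for the name and compare the spelling. The helper below is a sketch: the function name is made up, the props file path in the usage comment is an assumption, and the sample entry in the demo is hypothetical rather than copied from a real distribution.

```shell
# Print lines of a driver props file that mention a program name, ignoring case.
find_driver_entry() {
    name="$1"
    props="$2"
    grep -i -- "$name" "$props"
}

# Demo on a throwaway file with a hypothetical entry.
demo="$(mktemp)"
echo "com.example.WikipediaXmlSplitter = wikipediaXMLSplitter : wikipedia splitter" > "$demo"
find_driver_entry "wikipediaxmlsplitter" "$demo"
rm -f "$demo"

# Against a real install (path is an assumption):
#   find_driver_entry wikipedia "$MAHOUT_HOME/conf/driver.classes.default.props"
```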

Re: mahout command

2014-03-07 Thread Mahmood Naderan

to invoke via: mahout wikipediaXmlSplitter -dpath -opath -c64 please give that a try. On Friday, March 7, 2014 8:11 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi When I run mahout wikipediaXMLSplitter -d examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64 I get

Re: mahout command

2014-03-07 Thread Mahmood Naderan

FYI, I am trying to complete the wikipedia example from Apache's document https://cwiki.apache.org/confluence/display/MAHOUT/Wikipedia+Bayes+Example Regards, Mahmood On Friday, March 7, 2014 5:23 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: In fact, see this file src/conf

Re: mahout command

2014-03-07 Thread Mahmood Naderan

for Hadoop 2.x. On Friday, March 7, 2014 11:16 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote: FYI, I am trying to complete the wikipedia example from Apache's document https://cwiki.apache.org/confluence/display/MAHOUT/Wikipedia+Bayes+Example Regards, Mahmood On Friday, March 7, 2014 5:23

Installation question

2014-02-23 Thread Mahmood Naderan

Hi, I have followed the steps stated in https://cwiki.apache.org/confluence/display/MAHOUT/BuildingMahout to install Mahout. However I get an error at mvn install hadoop@solaris:~/mahout-distribution-0.9$ mvn install [INFO] Scanning for projects... [INFO]

Re: Installation question

2014-02-23 Thread Mahmood Naderan

Lequeux On Sunday, 23 February 2014 at 20:08, Mahmood Naderan nt_mahm...@yahoo.com wrote: Hi, I have followed the steps stated in https://cwiki.apache.org/confluence/display/MAHOUT/BuildingMahout to install Mahout. However I get an error at mvn install hadoop@solaris:~/mahout-distribution-0.9

Re: Installation question

2014-02-23 Thread Mahmood Naderan

you have contains all the jars needed to run jobs but not the source. On Sun, Feb 23, 2014 at 9:47 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote: I have downloaded mahout-distribution-0.9.tar.gz and here is the content. hadoop@solaris:~/mahout-distribution-0.9$ ls bin

44 matches
