Hi,
Any feedback is appreciated.
Regards,
Mahmood
On Sat, Apr 9, 2016 at 3:36 PM, Mahmood Naderan <mahmood...@gmail.com>
wrote:
> Hi,
> When I run the command
>
> mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4
>
> I get this error:
>
>
Hi,
When I run the command
mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4
I get this error:
Running on hadoop, using /opt/new_analytic/hadoop-2.7.1/bin/hadoop and
HADOOP_CONF_DIR=
MAHOUT-JOB:
Thanks a lot for that. I am getting closer to what I was searching for...
Is there any high-level document about the procedure the classifier follows (using
MapReduce) after the training phase? For example:
1- Reading chunks
2- Sorting each chunk
3-...
I didn't find such an example on the web.
Hello,
Running seqdirectory (from Mahout 0.9) on a large input file throws the exception
shown below. Any idea?
MAHOUT_LOCAL is set, running locally
14/04/01 12:15:17 INFO common.AbstractJob: Command line arguments:
{--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647],
Whoever tried mahout/hadoop profiling, please let us know
Regards,
Mahmood
On Sunday, March 30, 2014 2:30 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Profiled what exactly, a Hadoop job?
As soon as I run
/mahout testclassifier -m
wikipediamodel -d wikipediainput I see
Hi,
In an old Mahout, I used wikipediaDataSetCreator on an input to create the
training data
mahout wikipediaDataSetCreator -i
wiki-tr/chunks -o tr-input -c labels.txt
and then fed the tr-input to the trainclassifier using
mahout trainclassifier -i tr-input -o wikimodel
Now, in
Yeah, you are right. I have to ignore that command.
Regards,
Mahmood
On Monday, March 31, 2014 6:56 PM, Suneel Marthi suneel_mar...@yahoo.com
wrote:
Sent from my iPhone
On Mar 31, 2014, at 4:20 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hi,
In an old Mahout, I used
Hi,
I profiled the Mahout command with VisualVM and saw many threads. Some of them
are related to the profiler, and some others are communication threads.
The interesting thing is that the main thread is always in the sleep state!
From the thread dump (which has been attached), the owner is Mahout.
Profiled what exactly, a Hadoop job?
As soon as I run
/mahout testclassifier -m
wikipediamodel -d wikipediainput I see an org.apache.mahout.driver.MahoutDriver
entry in VisualVM, and then I open it.
Regards,
Mahmood
` commands are; that's where things start to be executed.
On Fri, Mar 28, 2014 at 12:34 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hi
I want to know: when I run a command like
mahout trainnb -i -o ...
am I running Mahout code or Hadoop code?
In other words, which one is dominant?
Hi
I want to know: when I run a command like
mahout trainnb -i -o ...
am I running Mahout code or Hadoop code?
In other words, which one is dominant?
Regards,
Mahmood
at 2:47 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hello,
1- Can we debug Hadoop/Mahout with the available bin files, or do we have to
rebuild them? They are probably compiled with optimizations.
2- I cannot find a procedure on how to build Hadoop/Mahout in debug
mode. If anyone knows
Excuse me, I forgot to ask: should I use mvnDebug for debugging purposes?
Regards,
Mahmood
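For what it's worth, mvnDebug (which ships with Maven) suspends Maven itself on port 8000 so you can step through the build, not through a Mahout job. A sketch of both directions follows; note that bin/mahout honoring MAHOUT_OPTS is an assumption from my reading of the 0.9 wrapper script, so check your copy before relying on it:

```shell
# Debug the Maven build itself: mvnDebug waits on port 8000 for a debugger.
mvnDebug install

# Debug a Mahout driver run instead: ask its JVM to wait for a debugger attach.
# NOTE: MAHOUT_OPTS being passed through by bin/mahout is an assumption here.
export MAHOUT_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"
bin/mahout wikipediaXMLSplitter \
  -d examples/temp/enwiki-latest-pages-articles.xml \
  -o wikipedia/chunks -c 64
```

With suspend=y the JVM blocks until a remote debugger (e.g. from Eclipse or IntelliJ) attaches to port 8000. Also, Maven's compiler plugin emits javac debug info (-g) by default, so a special "debug mode" rebuild is normally not needed just to get line numbers in stack traces.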
On , Mahmood Naderan nt_mahm...@yahoo.com wrote:
Let me state it this way: with GNU Make, we use -g/-ggdb to insert debug
symbols into the object file. On the other hand, if we use -O3
Hi,
What is the correct syntax for this old command?
mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4
It seems that trainclassifier has been replaced by trainnb, but this one has no -mf
option.
Regards,
Mahmood
, 2014 at 3:17 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hi,
What is the correct syntax for this old command?
mahout trainclassifier -i traininginput -o wikipediamodel -mf 4 -ms 4
It seems that trainclassifier has been replaced by trainnb, but this one has no
-mf option.
Regards
Hello
When I run the following command on Mahout 0.9 and Hadoop 1.2.1, I get multiple
errors and I cannot figure out what the problem is. Sorry for the long post.
[hadoop@solaris ~]$ mahout wikipediaDataSetCreator -i wikipedia/chunks -o
wikipediainput -c ~/categories.txt
Running on hadoop,
Hello,
1- Can we debug Hadoop/Mahout with the available bin files, or do we have to
rebuild them? They are probably compiled with optimizations.
2- I cannot find a procedure on how to build Hadoop/Mahout in debug mode. If
anyone knows, please let me know.
Since I am a gcc/make guy, maybe these
Hi,
Is there any verbosity flag for the hadoop and mahout commands? I cannot find
such a thing in the command line.
Regards,
Mahmood
On 03/13/2014 10:21 AM, Mahmood Naderan wrote:
Hi,
Is there any verbosity flag for the hadoop and mahout commands? I cannot find
such a thing in the command line.
Regards,
Mahmood
On Tuesday, March 11, 2014 11:57 PM, Mahmood Naderan nt_mahm...@yahoo.com
wrote:
As I posted earlier, here is the result of a successful test:
a 5.4GB XML file (which is larger than enwiki-latest-pages-articles10.xml) with
4GB of RAM and -Xmx128m took 5 minutes to complete.
I didn't find a larger
Hi
Where can I submit a Mahout bug? I am not familiar with JIRA, and all I see
there is 'issues' and 'agile'.
Regards,
Mahmood
Regards,
Mahmood
On Thursday, March 13, 2014 2:31 PM, Mahmood Naderan nt_mahm...@yahoo.com
wrote:
The strange thing is that whether I use -Xmx128m or -Xmx16384m, the process stops
at chunk #571 (571*64MB = 36.5GB).
I still haven't figured out whether this is a problem with the JVM, Hadoop, or Mahout.
I have
splitter class does but if you're interested
in isolating the issue you could find out what is happening data-wise and
see if there is some very large grouping on a pathologically frequent key
for instance.
On Thu, Mar 13, 2014 at 11:31 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
I am
On Thu, Mar 13, 2014 at 11:29 AM, Mahmood Naderan nt_mahm...@yahoo.com
wrote:
Hi
Where can I submit a Mahout bug? I am not familiar with JIRA, and all I see
there is 'issues' and 'agile'.
Regards,
Mahmood
needs 2GB.
If my system supports 10GB of heap, then I will feed 5 threads at a time.
When the first 5 threads (the chunks) are done, I will feed the next 5
threads, and so on.
Regards,
Mahmood
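The wave scheme described above (at most 5 heap-hungry workers running at once) can be sketched with plain shell job control. Here process_chunk is a hypothetical stand-in for whatever per-chunk command you actually run:

```shell
#!/bin/sh
# Stand-in for the real per-chunk work (e.g. a JVM that needs ~2GB of heap).
process_chunk() {
  echo "processing $1"
}

WAVE=5   # 10GB heap budget / 2GB per worker = 5 concurrent workers
i=0
for f in chunk-01 chunk-02 chunk-03 chunk-04 chunk-05 chunk-06 chunk-07; do
  process_chunk "$f" &
  i=$((i + 1))
  # Wave boundary: wait for all 5 background jobs before starting the next 5.
  [ $((i % WAVE)) -eq 0 ] && wait
done
wait   # drain the final, possibly partial, wave
```

The wait at each wave boundary is what caps peak memory: a new wave only starts once every job in the previous one has exited and released its heap.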
On Monday, March 10, 2014 9:42 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
UPDATE:
I
Suneel,
One more thing: right now it has created 500 chunks, so 32GB out of 48GB
(the original size of the XML file) has been processed. Is it possible to resume
from there?
Regards,
Mahmood
On Tuesday, March 11, 2014 9:47 PM, Mahmood Naderan nt_mahm...@yahoo.com
wrote:
Suneel,
Is it possible
be a question for a Hadoop list, unless I'm misunderstanding.
When you say resume what do you mean?
On Tue, Mar 11, 2014 at 11:46 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Suneel,
One more thing: right now it has created 500 chunks. So 32GB out of
48GB (the original size of the XML file) has
Hi,
Recently I have faced a heap size error when I run
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d
$MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml -o
wikipedia/chunks -c 64
Here are the specs:
1- XML file size = 44GB
2- System memory = 54GB (on virtualbox)
3- Heap size = 51GB
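Since wikipediaXMLSplitter runs in the client JVM that bin/mahout launches (not in a mapper), the heap that matters is the driver's. In my reading of the 0.9 wrapper, MAHOUT_HEAPSIZE (in megabytes) becomes that JVM's -Xmx; treat the variable name as an assumption and verify it in your copy of the script:

```shell
# Raise the client-JVM heap for wikipediaXMLSplitter.
# ASSUMPTION: the 0.9 bin/mahout wrapper turns MAHOUT_HEAPSIZE (MB) into -Xmx.
export MAHOUT_HEAPSIZE=8192
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter \
  -d $MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml \
  -o wikipedia/chunks -c 64
```

If MAHOUT_LOCAL is set, everything runs in this one local JVM, so a mapred-site.xml setting like mapred.child.java.opts would not help here.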
to running on the entire
English Wikipedia.
On Tue, Mar 11, 2014 at 12:56 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hi,
Recently I have faced a heap size error when I run
$MAHOUT_HOME/bin/mahout wikipediaXMLSplitter -d
$MAHOUT_HOME/examples/temp/enwiki-latest-pages-articles.xml -o
On Sunday, March 9, 2014 2:25 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hi Suneel,
Do you have any idea? Searching the web shows many questions regarding the heap
size for wikipediaXMLSplitter. I have increased the memory size to 16GB and
still get that error. I have to say
on the entire
English Wikipedia.
On Monday, March 10, 2014 2:59 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Thanks for the update.
The thing is, when that command is running, I run the 'top' command in another
terminal and see that the java process takes less than 1GB of memory. As another test
UPDATE:
I split another 5.4GB XML file with 4GB of RAM and -Xmx128m, and it took 5
minutes.
Regards,
Mahmood
On Monday, March 10, 2014 7:16 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
The extracted size is about 960MB (enwiki-latest-pages-articles10.xml).
With 4GB of RAM set for the OS
Hello,
I ran this command
./bin/mahout wikipediaXMLSplitter -d
examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64
but got this error
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
There are many web pages regarding this and the solution is
OK, I found that I have to add this property to mapred-site.xml:
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
Regards,
Mahmood
On Sunday, March 9, 2014 11:39 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hello,
I ran this command
./bin/mahout
Excuse me, I added the -Xmx option and restarted the Hadoop services using
sbin/stop-all.sh and sbin/start-all.sh; however, I still get the heap size error.
How can I find the correct heap size that is actually needed?
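One way (my suggestion, not from this thread) to find the needed heap size empirically is to watch the driver JVM while the job runs, using jps and jstat, both of which ship with the JDK:

```shell
# Find the Mahout driver's pid; jps -l prints fully qualified main-class names,
# so org.apache.mahout.driver.MahoutDriver matches /MahoutDriver/.
pid=$(jps -l | awk '/MahoutDriver/ {print $1}')

# Sample heap/GC utilization every 5 seconds. An old generation (column O)
# pinned near 100% right before the OutOfMemoryError marks the ceiling
# you need to raise -Xmx above.
jstat -gcutil "$pid" 5000
```

Raising -Xmx a bit above the observed peak, rather than guessing 16GB, narrows down whether the job is genuinely huge or leaking.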
Regards,
Mahmood
On Sunday, March 9, 2014 1:37 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
wikipedia splitter
hadoop@solaris:~/mahout-distribution-0.9$ bin/mahout wikipediaXMLSplitter
-d examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64
On Sat, Mar 8, 2014 at 11:11 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
No success Suneel...
Please see the attachment
What a fast reply... Thanks a lot Suneel,
Regards,
Mahmood
On Saturday, March 8, 2014 11:29 PM, Suneel Marthi suneel_mar...@yahoo.com
wrote:
You can ignore the warnings.
On Saturday, March 8, 2014 2:58 PM, Mahmood Naderan nt_mahm...@yahoo.com
wrote:
Oh yes... Thanks Andrew you
Hi
When I run
mahout wikipediaXMLSplitter -d
examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64
I get this error
14/03/07 16:24:13 WARN driver.MahoutDriver: Unable to add class:
wikipediaXMLSplitter
java.lang.ClassNotFoundException: wikipediaXMLSplitter
at
to invoke via:
mahout wikipediaXmlSplitter -d <path> -o <path> -c 64
please give that a try.
On Friday, March 7, 2014 8:11 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
Hi
When I run
mahout wikipediaXMLSplitter -d
examples/temp/enwiki-latest-pages-articles.xml -o wikipedia/chunks -c 64
I get
FYI, I am trying to complete the wikipedia example from Apache's document
https://cwiki.apache.org/confluence/display/MAHOUT/Wikipedia+Bayes+Example
Regards,
Mahmood
On Friday, March 7, 2014 5:23 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
In fact, see this file
src/conf
for Hadoop 2.x.
On Friday, March 7, 2014 11:16 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
FYI, I am trying to complete the wikipedia example from Apache's document
https://cwiki.apache.org/confluence/display/MAHOUT/Wikipedia+Bayes+Example
Regards,
Mahmood
On Friday, March 7, 2014 5:23
Hi,
I have followed the steps stated in
https://cwiki.apache.org/confluence/display/MAHOUT/BuildingMahout to
install Mahout. However I get an error at mvn install
hadoop@solaris:~/mahout-distribution-0.9$ mvn install
[INFO] Scanning for projects...
[INFO]
Lequeux
On Sunday, February 23, 2014 8:08 PM, Mahmood Naderan nt_mahm...@yahoo.com
wrote:
Hi,
I have followed the steps stated in
https://cwiki.apache.org/confluence/display/MAHOUT/BuildingMahout to
install Mahout. However I get an error at mvn install
hadoop@solaris:~/mahout-distribution-0.9
you have contains all the jars
needed to run jobs but not the source.
On Sun, Feb 23, 2014 at 9:47 PM, Mahmood Naderan nt_mahm...@yahoo.com wrote:
I have downloaded mahout-distribution-0.9.tar.gz and here is the content.
hadoop@solaris:~/mahout-distribution-0.9$ ls
bin