To: user@mahout.apache.org
There is no Random Forest implementation on Spark in Mahout yet. MLlib has a
Random Forest implementation; why can't you use that instead?
On Tue, Aug 12, 2014 at 2:19 AM, Sameer Tilak ssti...@live.com wrote:
Hi All,
We are currently using Weka. I looked
, Sameer Tilak ssti...@live.com wrote:
Hi All,
We are currently using Weka. I looked at the site and read briefly about the
experimental nature of Mahout on Spark. I was wondering how much of Mahout's
functionality is currently available. For example, I am
Hi All, my script runs fine in MapReduce mode, but I get the following error
when I run it in local mode. I have made sure that the input file exists. I
am not sure why MapReduce comes into the picture when it is in local mode.
pig -x local myscript.pig
2014-01-22 16:14:02,771 [main] INFO
Hi All, I am getting the following error while executing this job:
-bash-4.1$ ./mahout itemsimilarity -i /scratch/SimilartyInput -o
/scratch/SimilartyOutput -s SIMILARITY_COOCCURRENCE --maxSimilaritiesPerItem 10
13/12/30 10:30:29 INFO common.AbstractJob: Command line arguments:
Hi All,
I was able to successfully run the item similarity algorithm on my dataset. My
input data had the following format:
userid itemid……
I used the following command:
./mahout itemsimilarity -i /scratch/SimilartyInput -o /scratch/SimilartyOutput
-s SIMILARITY_COOCCURRENCE --maxSimilaritiesPerItem
Hi All, I am running the following command to process a fairly large dataset. I
want to mention upfront that my input file does contain a few blank lines. Any
thoughts on why this might be happening?
./mahout itemsimilarity -i /scratch/SimilartyInput -o /scratch/SimilartyOutput
-s
Hi All, I am having another issue with item similarity. For some reason the
numUsers.bin file does not get generated. I am copying the command here:
./mahout itemsimilarity -i /scratch/SimilartyInput -o /scratch/SimilartyOutput
--tempDir /scratch/Similartytemp -s SIMILARITY_COOCCURRENCE
Hi all,
I get the following problem when I run k-means clustering on my real data. Any
help with this would be great!
Here is the data that I read out of the SequenceFile:
022960 value:
Hi everyone,
My Pig script generates the following -- results are stored in part-m-0 to
part-m-4 files.
-bash-4.1$ hadoop dfs -ls /scratch/ItemIds
Found 7 items
-rw-r--r-- 1 userid supergroup 0 2013-12-23 11:13
/scratch/ItemIds/_SUCCESS
drwxr-xr-x - userid supergroup
Hi All,
I was able to do the clustering and need some help with viewing the result. I
get the following problem.
./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final -d
/scratch/dummyvectorfinalclusters
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Warning:
Hi All,
I was able to resolve this issue by adding the following to my code:
DistributedCache.addFileToClassPath(
    new Path("/scratch/mahout-math-0.9-SNAPSHOT.jar"), conf, fs);
DistributedCache.addFileToClassPath(
    new Path("/scratch/mahout-core-0.9-SNAPSHOT.jar"), conf, fs);
, December 20, 2013 5:33 PM, Sameer Tilak ssti...@live.com wrote:
Hi All,
I was able to do the clustering and need some help with viewing the result. I
get the following problem.
./mahout clusterdump -i /scratch/dummyvectoroutput/clusters-*-final -d
/scratch/dummyvectorfinalclusters
. However, using
DistributedCache.addFileToClassPath I was able to have them seen in worker
nodes.
From: kkrugler_li...@transpac.com
Subject: Re: KMeansDriver and distributed cache
Date: Fri, 20 Dec 2013 14:47:13 -0800
To: user@mahout.apache.org
On Dec 20, 2013, at 2:35pm, Sameer Tilak
@mahout.apache.org
I would investigate all of those 'Unable to add .' messages first.
Checkout the latest code and run a clean build.
On Friday, December 20, 2013 5:58 PM, Sameer Tilak ssti...@live.com wrote:
Suneel:
Yes, I am working off of trunk. I saw that example. In my case the data
, 2013 1:04 PM, Sameer Tilak ssti...@live.com wrote:
Hi everyone,
I used the following commands to generate the jar file:
javac -d /apps/analytics/myanalytics -classpath
.:/apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar:/users/p529444/software/hadoop-1.0.3/hadoop-core-1.0.3.jar:/apps
Hi Everyone,
I have added the math jar file to my javac command. As per the documentation,
Include the JAR in the “-libjars” command line option of the `hadoop jar …`
command. The jar will be placed in distributed cache and will be made available
to all of the job’s task attempts. Ideally, this
Hi, I was able to resolve this problem by adding the jar files to
HADOOP_CLASSPATH. Here is a command sequence:
export
Hi All,
I am trying to execute the following command:
hadoop jar /apps/analytics/myanalytics.jar myanalytics.SimpleKMeansClustering
-libjars /apps/mahout/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar
Hi All,
I have some questions regarding vectorization.
Here is my Pig script snippet.
AU = FOREACH A GENERATE myparser.myUDF(param1, param2); STORE AU into
'/scratch/AU';
AU has the following format:
(userid, (item_view_history))
(27,(0,1,1,0,0))
(28,(0,0,1,0,0))
(29,(0,0,1,0,1))
(30,(1,0,1,0,1))
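As a rough illustration of the vectorization question above, the following is a minimal sketch of how one record in the (userid, (item_view_history)) text form could be split into a user ID and an array of numeric features. The class name ViewHistoryParser and its methods are hypothetical, it assumes one record per string in exactly the printed form shown, and it is not taken from the original script or from any Mahout API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Mahout or Pig): parses "(userid,(v1,v2,...))". */
public class ViewHistoryParser {
    // Assumed record layout: a numeric user ID followed by one nested tuple of numbers.
    private static final Pattern RECORD =
        Pattern.compile("\\((\\d+),\\(([^()]*)\\)\\)");

    private static Matcher match(String record) {
        Matcher m = RECORD.matcher(record.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized record: " + record);
        }
        return m;
    }

    /** Returns the user ID at the front of the record. */
    public static long userId(String record) {
        return Long.parseLong(match(record).group(1));
    }

    /** Returns the item view history as a dense array of doubles. */
    public static double[] features(String record) {
        String body = match(record).group(2).trim();
        if (body.isEmpty()) {
            return new double[0];
        }
        String[] fields = body.split(",");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        return values;
    }
}
```

For example, ViewHistoryParser.features("(27,(0,1,1,0,0))") yields a five-element array, which could then be wrapped in whatever vector type the clustering job expects.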
Trying to figure that out now. Will keep you posted.
Date: Mon, 16 Dec 2013 12:13:52 -0800
Subject: Re: Data Vectorization
From: andrew.mussel...@gmail.com
To: user@mahout.apache.org
Looks reasonable. Does it work?
On Mon, Dec 16, 2013 at 12:09 PM, Sameer Tilak ssti...@live.com wrote
Dec 2013 12:13:52 -0800
Subject: Re: Data Vectorization
From: andrew.mussel...@gmail.com
To: user@mahout.apache.org
Looks reasonable. Does it work?
On Mon, Dec 16, 2013 at 12:09 PM, Sameer Tilak ssti...@live.com wrote:
Hi All,
I have some questions regarding vectorization.
Here
.
On Monday, December 16, 2013 3:58 PM, Sameer Tilak ssti...@live.com wrote:
It does not seem to work :(.
Here is how I use the generated sequence file (described in my last email)
for clustering.
./mahout seqdirectory -i /scratch/VectorizedInput -o
/scratch/VectorizedOutputSeqdir -c UTF-8
9.288379669189453
Date: Mon, 16 Dec 2013 12:13:52 -0800
Subject: Re: Data Vectorization
From: andrew.mussel...@gmail.com
To: user@mahout.apache.org
Looks reasonable. Does it work?
On Mon, Dec 16, 2013 at 12:09 PM, Sameer Tilak ssti...@live.com wrote:
Hi All,
I have
Hi,
I am running K-means clustering following the script on Wiki:
https://cwiki.apache.org/confluence/download/attachments/75159/quickstart-kmeans.sh?version=2modificationDate=1286718326000
It looks like the command-line options have changed with the newer version of
Mahout. For example I get the
Hi All,
I have some questions about using EB's VectorWritableConverter in my Pig script
for data vectorization.
I am generating the tuples using a UDF, however for
simplicity I am loading the data from a file in the following code. My
UDF returns tuples of the form (1,0,1,1...) etc.
My map.dat
Hi All, we are using Pig to build our data pipeline.
I came across the following: https://github.com/tdunning/pig-vector
The last commit was 2 years ago. Is there any information on whether there will
be any further work on this project?
Hi All, we are using Apache Pig for building our data pipeline. We have data in
the following fashion:
userid, age, items {code 1, code 2, ….}, few other features...
Each item has a unique alphanumeric code. I would like to use Mahout for
clustering it. Based on my current reading I see
.
https://github.com/kevinweil/elephant-bird
On Mon, Dec 2, 2013 at 4:10 PM, Sameer Tilak ssti...@live.com wrote:
Hi All, we are using Pig to build our data pipeline.
I came across the following: https://github.com/tdunning/pig-vector
The last commit was 2 years ago. Any information
I am looking for some input on how to vectorize my data.
From: ssti...@live.com
To: user@mahout.apache.org
Subject: Mahout for clustering
Date: Mon, 2 Dec 2013 16:22:03 -0800
Hi All, we are using Apache Pig for building our data pipeline. We have data
in the following fashion:
.
--sebastian
On 21.11.2013 00:28, Sameer Tilak wrote:
Yes, changing A1234567 to 1234567 resolves that issue trivially. However,
(input: userid, itemcode) itemcode is alphanumeric and not just numeric. I
am sure ItemSimilarityJob will be able to handle that case, however I need
to know to supply
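One common way to handle alphanumeric item codes when a job expects numeric item IDs is to keep a dictionary that assigns each distinct code a number and to rewrite the input before running the job. The sketch below illustrates that idea only; the class ItemCodeIndex is hypothetical, the comma-separated "userid,itemcode" layout is an assumption, and it is not the mechanism discussed in the truncated message above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: assigns a sequential numeric ID to each distinct item code. */
public class ItemCodeIndex {
    private final Map<String, Long> codeToId = new HashMap<>();
    private final List<String> idToCode = new ArrayList<>();

    /** Returns the numeric ID for a code, assigning the next free ID on first sight. */
    public long idFor(String code) {
        Long existing = codeToId.get(code);
        if (existing != null) {
            return existing;
        }
        long id = idToCode.size();
        codeToId.put(code, id);
        idToCode.add(code);
        return id;
    }

    /** Returns the original code for an ID previously returned by idFor. */
    public String codeFor(long id) {
        return idToCode.get((int) id);
    }

    /** Rewrites one "userid,itemcode" line (assumed layout) as "userid,numericId". */
    public String rewriteLine(String line) {
        String[] parts = line.split(",", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected userid,itemcode: " + line);
        }
        return parts[0].trim() + "," + idFor(parts[1].trim());
    }
}
```

With this sketch, the line "42,A1234567" becomes "42,0" on first sight of that code, and codeFor can translate IDs in the job output back to the original codes. An in-memory map like this only suits catalogs that fit on one machine.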
To: user@mahout.apache.org
Subject: Re: Mahout fpg
You can use ItemSimilarityJob to find sets of items that cooccur
together in your users' interactions.
--sebastian
On 20.11.2013 08:11, Sameer Tilak wrote:
Hi Sunil,
Thanks for your reply. We can benefit a lot from
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
Obviously, the input's incorrect.
On Wednesday, November 20, 2013 6:02 PM, Sameer Tilak ssti...@live.com
wrote:
Dear Sebastian, I tried using ItemSimilarityJob. My data has the following
format
Each line
Hi everyone,
I am interested in using Mahout for analyzing data -- in particular frequent
pattern mining using Mahout's FPG algorithm. My data can be expressed as an M×N
matrix. Each row represents a given user, whereas the columns represent the
items (1 if a given user has viewed a particular item
Hi everyone, I downloaded the latest version of Mahout and did mvn install. When
I try to run fpg, I get the following errors. Do I need to download and compile
FPG separately? It looks like it has somehow not been included in the list of
valid programs.
13/11/19 17:49:19 WARN driver.MahoutDriver:
...@yahoo.com
Subject: Re: Mahout fpg
To: user@mahout.apache.org
Fpg has been removed from the codebase as it will not be supported.
On Tuesday, November 19, 2013 8:56 PM, Sameer Tilak ssti...@live.com wrote:
Hi everyone, I downloaded the latest version of Mahout and did mvn install.
When