On Thu, Dec 4, 2014 at 5:38 AM, Shahid Shaikh shaikhshah...@gmail.com
wrote:
i see the problem is with the way data is written
What exactly do you mean by this?
Hi All,
I have been trying Mahout clustering on unstructured data, i.e.
human-written data. I have tried Mahout clustering algorithms like
Kmeans, Canopy+Kmeans and LDA, but the results produced are not helpful.
I see the problem is with the way the data is written. Can someone please
provide
Shaikh shaikhshah...@gmail.com
wrote:
Hi All,
I have been trying Mahout clustering on unstructured data, i.e.
human-written data. I have tried Mahout clustering algorithms like
Kmeans, Canopy+Kmeans and LDA, but the results produced are not helpful.
I see the problem is with the way the data
PM, Shahid Shaikh shaikhshah...@gmail.com
wrote:
Hi All,
I have been trying Mahout clustering on unstructured data, i.e.
human-written data. I have tried Mahout clustering algorithms like
Kmeans, Canopy+Kmeans and LDA, but the results produced are not helpful.
I see the problem
parameters to the clustering algorithm like number of topics or
number of clusters.
Cheers,
Donni
On Thu, Dec 4, 2014 at 2:38 PM, Shahid Shaikh shaikhshah...@gmail.com
wrote:
Hi All,
I have been trying Mahout clustering on unstructured data, i.e.
human-written data. I have tried Mahout
Hi,
I am following the book Mahout In Action.
I downloaded sources and I am trying to run this piece of code:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
You are using version 0.9 of Mahout and version 1.0 of
mahout-collections. The API might have changed considerably.
I suggest you check out the code from here:
https://github.com/tdunning/MiA/tree/mahout-0.7
This code works with mahout-0.7
Regards,
Saleem
On Wed, May 21, 2014 at 4:49 PM,
Hi,
Thank you for your answer.
I changed my pom.xml:
<mahout.version>0.7</mahout.version>
<mahout.groupid>org.apache.mahout</mahout.groupid>
<dependency>
<groupId>${mahout.groupid}</groupId>
Hi All,
We are using Apache Pig for building our data pipeline. We have data in
the following fashion:
userid, age, items {code 1, code 2, ….}, few other features...
Each item has a unique alphanumeric code. I would like to use mahout for
clustering it. Based on my current reading I see
I am looking for some input on how to vectorize my data.
From: ssti...@live.com
To: user@mahout.apache.org
Subject: Mahout for clustering
Date: Mon, 2 Dec 2013 16:22:03 -0800
Hi All,
We are using Apache Pig for building our data pipeline. We have data
in the following fashion
@mahout.apache.org
Subject: Mahout for clustering
Date: Mon, 2 Dec 2013 16:22:03 -0800
Hi All,
We are using Apache Pig for building our data pipeline. We have
data in the following fashion:
userid, age, items {code 1, code 2, ….}, few other features...
Each item has a unique
in the following fashion:
userid, age, items {code 1, code 2, ….}, few other features...
Each item has a unique alphanumeric code. I would like to use mahout for
clustering it. Based on my current reading I see the following few options:
1. Map each alphanumeric item code to a numeric code -- A1 - 0
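The first option above, mapping each alphanumeric item code to a numeric index, could be sketched roughly as follows in plain Java. This is only an illustration and does not use the Mahout API: the class name `ItemCodeIndexer`, its methods, and the sample codes `A1`, `B7`, `C3` are all hypothetical. It assumes each user's items are encoded as a binary presence vector, which is one possible choice among several.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Mahout): assigns each distinct item code
// a dense integer index and encodes item sets as binary vectors.
final class ItemCodeIndexer {
    private final Map<String, Integer> indexByCode = new LinkedHashMap<>();

    // Returns the index for a code, assigning the next free index on first sight.
    int indexOf(String code) {
        Integer existing = indexByCode.get(code);
        if (existing != null) {
            return existing;
        }
        int next = indexByCode.size();
        indexByCode.put(code, next);
        return next;
    }

    // Number of distinct codes seen so far, i.e. the vector dimension.
    int size() {
        return indexByCode.size();
    }

    // Encodes one user's items as a 0/1 vector; codes never indexed are ignored.
    double[] encode(List<String> codes) {
        double[] vector = new double[indexByCode.size()];
        for (String code : codes) {
            Integer index = indexByCode.get(code);
            if (index != null) {
                vector[index] = 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        // Made-up sample data: two users with overlapping item codes.
        List<List<String>> users = Arrays.asList(
                Arrays.asList("A1", "B7"),
                Arrays.asList("B7", "C3"));

        ItemCodeIndexer indexer = new ItemCodeIndexer();
        // First pass: build the code-to-index dictionary.
        for (List<String> items : users) {
            for (String code : items) {
                indexer.indexOf(code);
            }
        }
        // Second pass: encode every user against the full dictionary.
        for (List<String> items : users) {
            System.out.println(Arrays.toString(indexer.encode(items)));
        }
    }
}
```

The two passes matter: the dimension of every vector is only known once all codes have been seen, so the dictionary is built before any user is encoded.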
I'm currently using KMeans with canopy and Cosine as the measure. The data I'm
using has been somewhat curated into categories so I expected them to cluster
alongside the other documents in their respective categories. Some of them fall
nicely into clusters I'd expect but others are like the
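For readers unfamiliar with the cosine measure mentioned in this message, here is a minimal plain-Java sketch of cosine distance between two dense vectors. It is not Mahout's own implementation. The class and method names are hypothetical, and the handling of zero vectors (returning the maximum distance of 1.0) is an assumption made for this sketch.

```java
// Hypothetical illustration (not Mahout code): cosine distance on dense vectors.
final class CosineDistanceSketch {

    // Returns 1 - cos(angle between a and b); 0 means same direction.
    static double cosineDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Assumption for this sketch: a zero vector is treated as maximally distant.
        if (normA == 0.0 || normB == 0.0) {
            return 1.0;
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up term-weight vectors for two short documents.
        double[] doc1 = {1.0, 2.0, 0.0};
        double[] doc2 = {2.0, 4.0, 0.0};
        double[] doc3 = {0.0, 0.0, 3.0};
        System.out.println(cosineDistance(doc1, doc2)); // same direction, distance near 0
        System.out.println(cosineDistance(doc1, doc3)); // no shared terms, distance 1
    }
}
```

Because the measure depends only on direction, two documents with the same term proportions but different lengths come out as near neighbours, while documents sharing no terms are at distance 1.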
I was wondering if there was an explain feature in Mahout, something that gives
the reason why it did what it did, shows the values of the various features it
used to evaluate and choose the result, etc.
Because I have some wildly different text data being clustered together, for
example it
Sent from phone
On 4 Feb 2013, at 18:57, Chris Harrington ch...@heystaks.com wrote:
I was wondering if there was an explain feature in Mahout, something that
gives the reason why it did what it did, shows the values of the various
features it used to evaluate and choose the result, etc.
That's a really good question. Mahout does not have an explain
feature; however, you can use the ClusterDumper to print out the cluster
centers and vectors clustered within each cluster. Output is pretty
verbose and, with large text vectors being truncated, might not be that
useful. You might
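As a rough illustration of inspecting cluster centers to see why documents landed together, the plain-Java sketch below lists the highest-weighted terms of one centroid. It does not call ClusterDumper or any other Mahout class. The class name `TopTermsSketch`, the `topTerms` method, the dictionary, and the weights are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration (not Mahout code): top-weighted terms of a centroid.
final class TopTermsSketch {

    // Returns up to n dictionary terms, ordered by descending centroid weight.
    static List<String> topTerms(double[] centroid, String[] dictionary, int n) {
        if (centroid.length != dictionary.length) {
            throw new IllegalArgumentException("centroid and dictionary sizes differ");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < centroid.length; i++) {
            indices.add(i);
        }
        // Sort term indices so the heaviest weights come first.
        indices.sort(Comparator.comparingDouble((Integer i) -> centroid[i]).reversed());

        List<String> terms = new ArrayList<>();
        for (int k = 0; k < Math.min(n, indices.size()); k++) {
            terms.add(dictionary[indices.get(k)]);
        }
        return terms;
    }

    public static void main(String[] args) {
        // Made-up dictionary and centroid weights.
        String[] dictionary = {"the", "mahout", "cluster"};
        double[] centroid = {0.1, 0.9, 0.5};
        System.out.println(topTerms(centroid, dictionary, 2));
    }
}
```

Comparing the top terms of the cluster a document was assigned to with the top terms of the cluster it was expected in can hint at which features drove the assignment.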