Re: using Mahout to classify customer service and sales emails?

2014-10-23 Thread Mahesh Balija
Hi Ted, What are MapR classifiers? Do you mean MapReduce? Since the data is streaming data, should we store it in a database such as a NoSQL DB, export it to Hadoop (if the data is huge), build the model, and deploy the model in production to classify the streaming data in real time? But

Re: Mahout Vs Spark

2014-10-22 Thread Mahesh Balija
in the MLlib package, support for in-memory computation, rich scientific libraries through Scala, and support for languages like Java/Scala/Python, will the survival of Mahout be questionable?* Best! Mahesh Balija. On Wed, Oct 22, 2014 at 1:26 PM, Martin, Nick nimar...@pssd.com wrote: I

Re: Mahout Vs Spark

2014-10-22 Thread Mahesh Balija
, just wanted to take some inputs from the active contributors. Best! Mahesh Balija. On Wed, Oct 22, 2014 at 6:57 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: For the record, this is all false dilemma (at least w.r.t. spark vs mahout spark bindings). The spark bindings have never been

Mahout Vs Spark

2014-10-21 Thread Mahesh Balija
Hi Team, As the Spark framework is gaining more attention among Big Data open source frameworks, with support for a variety of applications like 1) Shark 2) GraphX 3) MLlib 4) Streaming. With the rapid development of algorithms supporting Clustering, Classification, Regression etc. in the MLlib

Re: Upgrade to Spark 1.1.0?

2014-10-20 Thread Mahesh Balija
Hi Pat, Can you please give detailed steps to build Mahout against Spark 1.1.0? I built against 1.1.0 but still had class-not-found errors, which is why I reverted to Spark 1.0.2. Although the first few steps are successful, I am still facing some issues in running the Mahout spark-shell sample

Re: problems with running K-means on hadoop's pseudo-distributed mode

2014-04-04 Thread Mahesh Balija
Hi Wei Zhang, Can you check whether this path exists in your HDFS: /tmp/mahout-work-weiz/reuters-kmeans-clusters/part-randomSeed? Instead of using the cluster_reuters.sh script file, can you run K-Means manually on your cluster? BTW, what is the command you are using for running

Re: SLF4J Error

2014-01-06 Thread Mahesh Balija
if you are running it on Hadoop, please check whether the SLF4J jars are part of the Hadoop lib folder; otherwise you are supposed to pass them with the -libjars option, with the right usage of GenericOptionsParser in your source code. On Mon, Jan 6, 2014 at 12:13 PM, Chameera Wijebandara

Re: Network Traffic and Security Analysis

2013-02-21 Thread Mahesh Balija
. Your suggestions are most welcome. Thanks, Mahesh Balija, CalsoftLabs. On Wed, Jan 30, 2013 at 1:25 PM, Ted Dunning ted.dunn...@gmail.com wrote: I don't have any such references. It would actually be interesting if you could summarize some of the white papers you have read to the list

Re: how to use a custom distance measure with kmeans?

2013-02-12 Thread Mahesh Balija
Are you getting any errors? Can you specify the fully qualified class name of your distance measure (like com.xxx.MyDistanceMeasure) and check? Best, Mahesh Balija, Calsoft Labs. On Tue, Feb 12, 2013 at 2:28 PM, Mihai Josan mihai.jo...@iquestgroup.com wrote: Hello, Can you please tell me how can
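The reason a fully qualified name matters can be illustrated by analogy. Mahout itself is Java, so the sketch below is not Mahout code; it is a small Python illustration of how a dotted name is resolved at runtime, and why a bare name (or a module that is not on the path) fails. The helper name `resolve` is hypothetical, and `math.dist` (Python 3.8+) stands in for a custom distance measure.

```python
import importlib


def resolve(qualified_name):
    """Look up an object from a dotted 'package.module.Name' string.

    Hypothetical helper for illustration only; it mirrors the idea of
    loading a class from its fully qualified name.
    """
    module_name, _, attr = qualified_name.rpartition(".")
    if not module_name:
        # A bare name carries no package information, so there is
        # nothing to import it from.
        raise ValueError("expected a fully qualified name, got %r" % qualified_name)
    # Raises ImportError if the module cannot be found on the path.
    module = importlib.import_module(module_name)
    return getattr(module, attr)


if __name__ == "__main__":
    # math.dist is the standard-library Euclidean distance.
    euclidean = resolve("math.dist")
    print(euclidean([0.0, 0.0], [3.0, 4.0]))
```

In the Java case the same two failure modes apply: the name must include the package, and the JAR containing the class must be visible on the classpath.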

Re: how to use a custom distance measure with kmeans?

2013-02-12 Thread Mahesh Balija
better way to add the custom classes to classpath rather than users modifying the script file. Thanks, Mahesh Balija, Calsoft Labs. On Tue, Feb 12, 2013 at 10:18 PM, Dan Filimon dangeorge.fili...@gmail.com wrote: You need to add the JAR containing the distance measure you want to the classpath

Re: Mahout KMeans - java.lang.NoSuchMethodError: com.google.common.collect.Iterators.forArray([Ljava/lang/Object;)Lcom/google/common/collect/UnmodifiableIterator

2013-02-04 Thread Mahesh Balija
Hi Murthy, It does NOT seem to be a Mahout issue to me; rather, I suspect that you compiled your source with some latest/legacy jars and are running it in a legacy environment where you don't have the latest API. Best, Mahesh Balija, CalsoftLabs. On Mon, Feb 4, 2013 at 8:48 PM

Re: can i run mahout algorithms on mobile device..

2013-01-30 Thread Mahesh Balija
AFAIK it is NOT possible, as Mahout runs on top of Hadoop. Hadoop is a distributed computing framework that runs on a cluster of machines, so it may NOT be possible to run it on a mobile device. On Wed, Jan 30, 2013 at 8:46 PM, VIGNESH S vigneshkln...@gmail.com wrote: I am trying to

Re: Question on kmeans

2013-01-29 Thread Mahesh Balija
information other than the actual clusters (e.g., while doing text clustering there can be top terms associated with a specific cluster). Can you attach your output.txt file? Best, Mahesh Balija, CalsoftLabs. On Tue, Jan 29, 2013 at 5:44 AM, jamal sasha jamalsha...@gmail.com wrote: Hi

Re: Heirarch clustering

2013-01-29 Thread Mahesh Balija
get the sub-clusters for the big one. The Mahout in Action book is the best reference for getting familiar with Mahout, along with some nice examples. Best, Mahesh Balija, CalsoftLabs. On Tue, Jan 29, 2013 at 4:11 AM, jamal sasha jamalsha...@gmail.com wrote: Sorry.. accidental

Re: How to make normal Text suitable for Kmeans using mahout

2013-01-25 Thread Mahesh Balija
features. Finally, you can pass these vectors to the K-Means clustering algorithm in order to get keyword clusters. You can find better documentation on clustering documents in Mahout in Action. Best, Mahesh Balija, CalsoftLabs. On Sat, May 19, 2012 at 1:16 PM
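The flow described here (text to weighted term vectors, then vectors to K-Means) can be sketched in a few lines. This is a plain-Python illustration, not Mahout code and not what Mahout's own tools do internally: the function names `tfidf_vectors` and `kmeans`, the whitespace tokenizer, the smoothed IDF formula, and the use of cosine similarity are all assumptions chosen to keep the example short.

```python
import math
import random
from collections import Counter


def normalize(vec):
    """Scale a sparse vector (term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def dot(a, b):
    """Dot product of two sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def tfidf_vectors(docs):
    """Turn raw strings into unit-length sparse TF-IDF vectors."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption; other weightings work too).
        vec = {t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in tf.items()}
        vectors.append(normalize(vec))
    return vectors


def kmeans(vectors, k, iterations=20, seed=0):
    """Cluster unit vectors by cosine similarity; returns one label per vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # start from k of the input vectors
    assignment = None
    for _ in range(iterations):
        # Assign each vector to the most similar centroid.
        new_assignment = [max(range(k), key=lambda c: dot(v, centroids[c]))
                          for v in vectors]
        if new_assignment == assignment:
            break  # no vector changed cluster, so we have converged
        assignment = new_assignment
        # Recompute each centroid as the normalized sum of its members.
        for c in range(k):
            total = Counter()
            for v, a in zip(vectors, assignment):
                if a == c:
                    for t, w in v.items():
                        total[t] += w
            if total:  # keep the old centroid if the cluster is empty
                centroids[c] = normalize(total)
    return assignment


if __name__ == "__main__":
    docs = [
        "refund invoice payment",
        "refund invoice billing",
        "login password reset",
        "login password account",
    ]
    print(kmeans(tfidf_vectors(docs), k=2))
```

The printed labels are arbitrary cluster indices; only which documents share a label is meaningful. On real data, tokenization, stop-word removal, and the choice of k matter far more than in this toy input.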

Re: Mahout getting started confusion

2013-01-10 Thread Mahesh Balija
Hi Peter, Can you check whether your HDFS is up and running separately? Also check whether this is helpful for you; it is associated with the Cloudera version of Hadoop: https://groups.google.com/forum/?fromgroups=#!topic/spark-users/cSefxQ6HIVU Best, Mahesh Balija, Calsoft Labs