Did you build Spark from source and deploy it to the cluster?
When you build Mahout, it runs its tests against the artifacts it gets from
Maven repos. When you run Mahout on a cluster, it runs against the artifacts
on the cluster. These may not be the same, and there have been problems that
So, when I follow the examples from Hortonworks and run the Spark Pi example
using spark-submit, everything works.
I can run mahout spark-itemsimilarity without specifying the master parameter,
which means it is running in local mode (right?), and it works. But if I try to
run mahout using -ma (master
Thanks, Pat.
I am using HDP with Spark 1.1.0:
http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
Spark examples run without issues. For Mahout I had to set three environment
variables: HADOOP_HOME, SPARK_HOME, MAHOUT_HOME. Also, to run using a YARN cluster
with HDP -ma yarn-cluster
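A small pre-flight check along these lines can confirm that the three variables named above are set before launching mahout. This is only a sketch: the helper name `missing_env_vars` is hypothetical, and it checks for presence only, not whether the paths point at working installs.

```python
import os

# The three variables mentioned in the message above.
REQUIRED_VARS = ("HADOOP_HOME", "SPARK_HOME", "MAHOUT_HOME")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty.

    Hypothetical helper: `env` defaults to the real process environment,
    but any dict can be passed in for testing.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```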
I believe I may have found a solution to this problem, which I will try to
put on GitHub eventually, but for now I am not sure how to run it on the
cluster. I created the code in my Eclipse IDE as a Maven project and
then copied the jar file (vectorCode-1.0.jar) to the Hadoop cluster.
I now try
Just a guess
itemsimilarity takes a csv of
The IDs must be non-negative row and column numbers. The Hadoop version of this
job expects you to translate your IDs into row and column numbers in the
overall matrix it will create from the individual lines of the csv.
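The ID translation described above could be sketched as follows. This is an illustration, not part of Mahout: the function name `index_ids` is hypothetical, and the column layout (user ID first, item ID second, anything else passed through) is an assumption, since the format description above is cut off.

```python
import csv
import io


def index_ids(rows):
    """Map arbitrary user/item IDs to dense non-negative integers.

    Assumed layout (not confirmed by the thread): row[0] is the user ID,
    row[1] is the item ID, and any remaining fields are passed through.
    Returns (translated_rows, user_index, item_index); the two dicts can
    be inverted later to translate results back to the original IDs.
    """
    user_index, item_index = {}, {}
    translated = []
    for row in rows:
        user, item, rest = row[0], row[1], list(row[2:])
        # setdefault assigns the next free index the first time an ID is seen.
        u = user_index.setdefault(user, len(user_index))
        i = item_index.setdefault(item, len(item_index))
        translated.append([u, i] + rest)
    return translated, user_index, item_index


# Made-up sample data, for illustration only.
raw = "alice,book-7,1.0\nbob,book-7,1.0\nalice,book-9,1.0\n"
rows = list(csv.reader(io.StringIO(raw)))
translated, users, items = index_ids(rows)
for line in translated:
    print(",".join(str(field) for field in line))
```

Keeping `users` and `items` around matters because the job's output will be expressed in the translated numbers, not the original IDs.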
On Jan 6, 2015, at 1:36 AM,
There are some issues with using Mahout on Windows, so you'll have to run on a
*nix machine or VM. There shouldn't be any problem with using VMs as long as
your Spark install is set up correctly.
Currently you have to build Spark first and then Mahout from source. Mahout
uses Spark 1.1. You’ll ne
Hi, I've been trying to run spark-itemsimilarity against the Hortonworks Sandbox
with Spark running in a VM, but have not succeeded yet.
Do I need to install Mahout and run it within the VM, or is there a way to run
remotely against a VM where Spark and Hadoop are running?
I tried running a Scala ItemSim
Does anyone know what the reason could be?
On Tue, Jan 6, 2015 at 2:08 PM, unmesha sreeveni
wrote:
> I am trying to run itemsimilarity in Mahout instead of FP-Growth.
>
> But once I run it, I get
> java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 1
> at
> org.apache.hadoop.mapred.L
I am trying to run itemsimilarity in Mahout instead of FP-Growth.
But once I run it, I get
java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 1
at
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJ
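An ArrayIndexOutOfBoundsException: 1 is what Java raises when code reads index 1 of an array that has fewer than two elements. One possible cause, offered only as a guess, is an input line that splits into a single field. The sketch below scans input lines for that condition before the job is run. The helper name `short_lines` is hypothetical, and the comma delimiter is an assumption about the input file.

```python
def short_lines(lines, delimiter=",", min_fields=2):
    """Return (line_number, text) for every line with too few fields.

    Hypothetical helper. Blank lines are reported too, because an empty
    string splits into a single empty field.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if len(text.split(delimiter)) < min_fields:
            bad.append((number, text))
    return bad


# Made-up sample: line 2 uses a space instead of a comma, line 3 is blank.
sample = ["1,2,3.0\n", "4 5\n", "\n", "6,7\n"]
for number, text in short_lines(sample):
    print(f"line {number}: {text!r}")
```

If nothing is reported, the input format is probably not the cause and the rest of the stack trace is the place to look.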