val doc3s = new IndexedRow(3L, new SSV(22, Array(10, 14, 20, 21), Array(2.0, 0.0, 2.0, 1.0)))
val doc4s = new IndexedRow(4L, new SSV(22, Array(3, 7, 13, 20), Array(2.0, 0.0, 2.0, 1.0)))
2014-11-26 10:09 GMT+08:00 Shivani Rao raoshiv...@gmail.com:
Hello Spark fans,
I am trying to use the IDF model available in Spark MLlib to create a
tf-idf representation of an RDD[Vector]. Below I have attached my MWE.
I get the following error:
java.lang.IndexOutOfBoundsException: 7 not in [-4,4)
at
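(The stack trace is truncated in the archive.) For readers landing here, below is a minimal sketch of the IDF workflow on an RDD[Vector] using the Spark 1.x MLlib API, assuming sc is the usual shell SparkContext; the sizes and indices are illustrative. Note that every vector must declare the same dimension and use only indices below it; a mismatch there is one plausible source of an IndexOutOfBoundsException like the one above.

import org.apache.spark.mllib.feature.IDF
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// term-frequency vectors: every document declares the same size (22),
// and all indices are strictly below that size
val tf: RDD[Vector] = sc.parallelize(Seq(
  Vectors.sparse(22, Array(10, 14, 20, 21), Array(2.0, 0.0, 2.0, 1.0)),
  Vectors.sparse(22, Array(3, 7, 13, 20), Array(2.0, 0.0, 2.0, 1.0))
))

tf.cache()                                       // IDF.fit traverses the data
val idfModel = new IDF().fit(tf)                 // learn inverse document frequencies
val tfidf: RDD[Vector] = idfModel.transform(tf)  // tf-idf representation
tfidf.collect().foreach(println)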
Hello Spark aficionados,
We upgraded from Spark 1.0.0 to 1.0.1 when the new release came out and
started noticing some weird errors. Even a simple operation like
reduceByKey or count on an RDD gets stuck in cluster mode. This issue
does not occur with Spark 1.0.0 (in cluster or local mode) or
Hello Spark fans,
I am unable to figure out how Spark decides which logger to use. I know
that Spark makes this decision at the time the Spark Context is
initialized. From the Spark documentation it is clear that Spark uses
log4j, and not slf4j, but I have been able to successfully get Spark
I have two jars with the following packages
package a.b.c.d.z found in jar1
package a.b.e found in jar2
In the Scala REPL (no Spark) both imports work just fine, but in the Spark
REPL, I found that
import a.b.c.d.z gives me the following error:
object c is not a member of package a.b
Has
Actually I figured it out. The problem was that I was loading the sbt
package-ed jar into the class path and not the sbt assembly-ed jar.
Once I put the right jar in for package a.b.c.d.z, everything worked
thanks
shivani
On Mon, Jun 23, 2014 at 4:38 PM, Shivani Rao raoshiv...@gmail.com
Hello Abhi, I did try that and it did not work.
And Eugene, yes, I am assembling the argonaut libraries in the fat jar. So
how did you overcome this problem?
Shivani
On Fri, Jun 20, 2014 at 1:59 AM, Eugen Cepoi cepoi.eu...@gmail.com wrote:
On 20 June 2014 at 01:46, Shivani Rao raoshiv
Hello Michael,
I have a quick question for you. Can you clarify the statement "build fat
JARs and build dist-style TAR.GZ packages with launch scripts, JARs and
everything needed to run a Job"? Can you give an example?
I am using sbt assembly as well to create a fat jar, and supplying the
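(The question above is cut off.) For readers following along, a minimal sbt-assembly setup looks roughly like this; the plugin version is illustrative and the exact keys vary a little between plugin releases:

// project/plugins.sbt -- pull in the sbt-assembly plugin (version illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

Running "sbt assembly" then produces a single jar under target/ (typically named <project>-assembly-<version>.jar) containing your classes plus every dependency not scoped as "provided"; that is the fat jar you hand to java -cp or to the launch scripts of a dist-style package. If two dependencies ship the same file (usually something under META-INF), you will also need an assemblyMergeStrategy rule to resolve the conflict.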
Hope this helps.
Thanks,
Shrikar
On Fri, Jun 20, 2014 at 9:16 AM, Shivani Rao raoshiv...@gmail.com wrote
That error typically means that there is a communication error (wrong
ports) between the master and the worker. Also check whether the worker has
write permissions to create the work directory. We were getting this error
due to one of the above two reasons.
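For the ports side of this, one quick sanity check (a sketch, not from the original thread; the host name is illustrative) is to make sure the driver uses exactly the spark:// URL that the master prints in its log and web UI:

import org.apache.spark.{SparkConf, SparkContext}

// the host and port must match the master's advertised URL
// (7077 is the default standalone master port)
val conf = new SparkConf()
  .setAppName("connectivity-check")
  .setMaster("spark://master-host:7077")
val sc = new SparkContext(conf)

// a trivial action that needs driver <-> master <-> worker communication
println(sc.parallelize(1 to 100).count())

For the permissions side, the directory to check is the worker's work directory (SPARK_WORKER_DIR, which defaults to SPARK_HOME/work).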
On Tue, Jun 17, 2014 at 10:04 AM, Luis Ángel Vicente
to disk sounds very lightweight.
I
On Wed, Jun 18, 2014 at 5:17 PM, Shivani Rao raoshiv...@gmail.com wrote:
I am trying to process a file that contains 4 log lines (not very long) and
then write my parsed out case classes to a destination folder, and I get
the following error:
java.lang.OutOfMemoryError: Java heap space
at
I learned this from my co-worker, but it is relevant here.
Spark uses lazy evaluation by default, which means that none of your code
actually executes until you run your saveAsTextFile, so the failure does not
tell you much about where the problem is occurring. In order to debug this
better, you might
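The reply is truncated above; one common way to finish that thought (a sketch, with parseLogLine standing in for your own parsing function and the paths purely illustrative) is to force a small action after each transformation, so the failure surfaces at the step that causes it rather than only at saveAsTextFile:

// force evaluation step by step to localize the failing stage
val raw = sc.textFile("hdfs:///path/to/logs")        // input path illustrative
println(s"raw lines: ${raw.count()}")                // does reading work at all?

val parsed = raw.map(parseLogLine)                   // parseLogLine: your parser
parsed.cache()
println(s"parsed records: ${parsed.count()}")        // parsing errors surface here
parsed.take(5).foreach(println)                      // eyeball a few records

parsed.saveAsTextFile("hdfs:///path/to/output")      // output path illustrative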
@Marcelo: The command ./bin/spark-shell --jars jar1,jar2,etc,etc did not
work for me on a Linux machine.
What I did was append the jars to the class path in the bin/compute-classpath.sh
file. I ran the script, then started the Spark shell, and that worked.
Thanks
Shivani
On Wed, Jun 11, 2014 at 10:52 AM,
Hello Spark fans,
I am trying to log messages from my Spark application. When the main()
function attempts to log using log.info(), it works great, but when I try
the same command from the code that probably runs on the worker, I
initially got a serialization error. To solve that, I created a
I am having trouble adding logging to the class that does serialization and
deserialization. Where is the code for org.apache.spark.Logging located?
And is it serializable?
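One pattern that usually avoids the serialization problem (a sketch, not necessarily what the original poster ended up doing; LineParser is a made-up name) is to keep the logger behind a @transient lazy val, so it is rebuilt on whichever JVM uses it instead of being serialized with the closure:

import org.apache.log4j.Logger

class LineParser extends Serializable {
  // @transient: never serialize the logger with the object;
  // lazy: re-create it on the driver or executor when first used
  @transient lazy val log: Logger = Logger.getLogger(getClass.getName)

  def parse(line: String): Array[String] = {
    log.info(s"parsing: $line")
    line.split("\\s+")
  }
}

val parser = new LineParser                              // shipped to the workers
val parsed = sc.textFile("input.txt").map(parser.parse)  // logs on the executors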
On Mon, May 12, 2014 at 10:02 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Ah, yes, that is correct. You
Hello All,
I am learning that there are certain imports done by the Spark REPL that are
used to invoke and run code in a Spark shell, which I would have to import
explicitly if I need the same functionality in a Spark jar run from the
command line.
I keep running into the same serialization error of an
This is something that I have bumped into time and again: the object that
contains your main() should also be serializable; then you won't have this
issue.
For example:
object Test extends Serializable {
  def main(args: Array[String]): Unit = {
    // set up the spark context
    // read your data
    // create your RDDs (grouped by key)
  }
}
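Fleshed out into something runnable (names and paths are illustrative, and the pair-RDD import is what Spark 1.x needs), that suggestion looks roughly like:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._   // pair-RDD functions such as groupByKey

object Test extends Serializable {
  val delimiter = ","   // referenced from a closure shipped to the workers

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("test"))
    val grouped = sc.textFile("input.txt")               // read your data
      .map(line => (line.split(delimiter)(0), line))     // key by first field
      .groupByKey()                                      // RDD grouped by key
    grouped.saveAsTextFile("output")
    sc.stop()
  }
}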
Hello Sophia,
You are only providing the Spark jar here (admittedly, a Spark jar that
contains Hadoop libraries in it, but that is not sufficient). Where is your
Hadoop installed? (Most probably /usr/lib/hadoop/*.)
You need to add that to your class path (using -cp), I guess. Let me
know
had the same situation for a while without issues.
On May 1, 2014 8:46 PM, Shivani Rao raoshiv...@gmail.com wrote:
Hello Koert,
That did not work. I specified it in my email already. But I figured a
way around it by excluding akka dependencies
Shivani
On Tue, Apr 29, 2014 at 12:37 PM
I have mucked around with this a little bit. The first step to make this
happen is to build a fat jar. I wrote a quick blog post
(http://myresearchdiaries.blogspot.com/2014/05/building-apache-spark-jars.html)
documenting my learning curve w.r.t. that.
The next step is to schedule this as a java action. Since
Hello Spark fans,
I am trying to run a Spark job via Oozie as a java action. The Spark code
is packaged as MySparkJob.jar, compiled using sbt assembly (excluding
Spark and Hadoop dependencies).
I am able to invoke the Spark job from any client using
java -cp
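(The command above is cut off.) For the record, excluding Spark and Hadoop from the assembly usually comes down to scoping them as "provided" in build.sbt; the versions below are illustrative:

// build.sbt -- "provided" keeps these out of MySparkJob.jar, since the
// cluster supplies its own Spark and Hadoop jars at runtime
libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"    % "1.0.1" % "provided",
  "org.apache.hadoop" %  "hadoop-client" % "2.4.0" % "provided"
)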
From what I understand, getting Spark to run alongside a Hadoop cluster
requires the following:
a) a working Hadoop installation
b) a compiled Spark
c) configuration parameters that point Spark to the right Hadoop conf files
   (see the sketch after this list)
i) Can you let me know the specific steps to take after Spark was compiled
(via sbt
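On point (c), here is a rough sketch of pointing a job at an existing Hadoop configuration; the paths are illustrative, and exporting HADOOP_CONF_DIR before launching achieves the same thing and is the more common route:

import org.apache.hadoop.fs.Path
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("hdfs-check"))

// point Spark's Hadoop configuration at the cluster's conf files
sc.hadoopConfiguration.addResource(new Path("/etc/hadoop/conf/core-site.xml"))
sc.hadoopConfiguration.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"))

// sanity check: read something through the configured namenode
println(sc.textFile("hdfs:///tmp/sample.txt").count())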