Thanks. But that did not work.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/convert-List-to-RDD-tp7606p7609.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi,
My unit test is failing (the output does not match the expected output). I
would like to print out the value of the output, but
rdd.foreach(r => println(r)) does not work from the unit test. How can I print
or write the output to a file or to the screen?
thanks.
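A sketch of one common fix, assuming the RDD is small enough to collect to the driver: foreach runs on the executors, so its println output never reaches the driver's console in a test; bring the data back with collect() first.

```scala
// Sketch, assuming "sc" is the test's SparkContext and the RDD is small:
// foreach(println) prints on the executors, not in the driver's console,
// so collect() the data to the driver before printing or asserting.
val rdd = sc.parallelize(Seq(1, 2, 3))
rdd.collect().foreach(println)            // now visible in the test output
assert(rdd.collect().toSeq == Seq(1, 2, 3))
```

Alternatively, saveAsTextFile writes each partition to a file in an output directory that can be inspected after the test.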
Hi,
When we have multiple runs of a program writing to the same output file, the
execution fails if the output directory already exists from a previous run.
Is there some way we can have it overwrite the existing directory, so that
we don't have to delete it manually after each run?
Thanks.
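A hedged sketch of one workaround (the path here is a placeholder): Hadoop's output committer refuses to write into an existing directory, so delete it through the Hadoop FileSystem API before saving.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch: delete a previous run's output directory before writing again.
// "output" is a placeholder path; "rdd" is the RDD being saved.
val outputPath = new Path("output")
val fs = FileSystem.get(new Configuration())
if (fs.exists(outputPath)) {
  fs.delete(outputPath, true)   // recursive delete
}
rdd.saveAsTextFile(outputPath.toString)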
This issue is resolved.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/specifying-fields-for-join-tp7528p7544.html
After doing a groupBy operation, I have the following result:
val res =
(ID1,ArrayBuffer((145804601,ID1,japan)))
(ID3,ArrayBuffer((145865080,ID3,canada),
(145899640,ID3,china)))
(ID2,ArrayBuffer((145752760,ID2,usa),
(145934200,ID2,usa)))
Now I need to output for each group,
My output is a set of tuples, and when I write it out using saveAsTextFile, my
file looks as follows:
(field1_tup1, field2_tup1, field3_tup1,...)
(field1_tup2, field2_tup2, field3_tup2,...)
In Spark, is there some way I can simply have it output in CSV format (i.e.
without the parentheses)?
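A minimal sketch (the sample data and output path are placeholders): a Scala tuple's toString is what adds the parentheses, so map each tuple to a comma-joined string before saving. productIterator walks a tuple's fields in order.

```scala
// Sketch: format each tuple as a CSV line before saveAsTextFile.
// "sc" is an existing SparkContext; "output" is a placeholder path.
val tuples = sc.parallelize(Seq((145804601, "ID1", "japan"),
                                (145865080, "ID3", "canada")))
val csvLines = tuples.map(_.productIterator.mkString(","))
csvLines.saveAsTextFile("output")
```

The same trick works on a plain Scala tuple: (1, "ID1", "japan").productIterator.mkString(",") yields 1,ID1,japan.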
I tried to use sbt/sbt assembly to build spark-1.0.0. I get a lot of
Server access error: Connection refused
errors when it tries to download from repo.eclipse.org and
repository.jboss.org. I tried to navigate to these links manually, and some
of them are obsolete (Error 404).
Thank you very much. Making the trait serializable worked.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Task-not-serializable-collect-take-tp5193p5236.html
I am using Spark 0.9.1 in standalone mode. In the
SPARK_HOME/examples/src/main/scala/org/apache/spark/ folder, I created a
directory called mycode, in which I placed some standalone Scala code.
I was able to compile it. I ran the code using:
./bin/run-example org.apache.spark.mycode.MyClass
Hi,
I have the following code structure. It compiles OK, but at runtime it aborts
with the error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted:
Task not serializable: java.io.NotSerializableException:
I am running in local (standalone) mode.
trait A {
  def input(...):
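Another message in this digest notes that making the trait serializable resolved this. A minimal sketch of that fix (the member names and signature here are assumptions):

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Sketch: Spark serializes closures, and the objects they capture, to
// ship them to executors. Extending Serializable avoids the
// NotSerializableException for instances of this trait.
trait A extends Serializable {
  def input(master: String): Seq[Int]   // placeholder signature
}

class B extends A {
  override def input(master: String): Seq[Int] = Seq(1, 2, 3)
}

// Quick local check that an instance actually serializes:
val out = new ObjectOutputStream(new ByteArrayOutputStream())
out.writeObject(new B)   // would throw NotSerializableException otherwise
```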
Hi,
I am a new user of Spark. I have a class that defines a function as follows.
It returns a tuple: (Int, Int, Int).
class Sim extends VectorSim {
  override def input(master: String): (Int, Int, Int) = {
    sc = new SparkContext(master, "Test")
    val ratings =
Each time I run sbt/sbt assembly to compile my program, the packaging time
takes about 370 sec (about 6 min). How can I reduce this time?
thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/packaging-time-tp5048.html