Hi folks,
I couldn't find much literature on this, so I figured I would ask here.
Does anyone have experience tuning the memory settings and update
interval of the Spark History Server?
Let's say I have 500 applications at 0.5 GB each with a
*spark.history.fs.update.interval* of 400s.
Is there a
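For context, these are the knobs I am looking at; the values below are placeholders I am experimenting with, not recommendations:

```
# spark-env.sh -- heap for the History Server daemon (placeholder value)
export SPARK_DAEMON_MEMORY=4g

# spark-defaults.conf -- how often the provider rescans the log directory,
# and how many applications' UIs are cached in memory at once
spark.history.fs.update.interval      400s
spark.history.retainedApplications    50
```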
Hi Yin,
Thanks for your reply. I didn't realize there was a specific mailing list
for the spark-cassandra-connector. I will ask there. Thanks!
-Sa
On Tuesday, February 23, 2016, Yin Yang wrote:
> Hi, Sa:
> Have you asked on the spark-cassandra-connector mailing list?
>
> Seems you
didn't see the metrics on the Spark UI either. I didn't
find any relevant error or info in the logs indicating that the
CassandraConnectorSource is actually registered with the Spark metrics
system. Any pointers would be very much appreciated!
Thanks,
Sa
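For anyone trying to reproduce this: one way to see which sources the Spark metrics system actually registered is a console sink in conf/metrics.properties (the ConsoleSink class below is Spark's own; whether CassandraConnectorSource shows up in its output is exactly what is in question here):

```
# conf/metrics.properties -- print all registered metrics to stdout
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds
```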
to provide other information or run other experiments to help diagnose
the problem.
Thanks a lot in advance!
Sa
Note that I am running pyspark in local mode (I do not have a Hadoop cluster
connected), as I want to be able to work with the Avro file outside of
Hadoop.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-load-avro-file-into-spark-not-on-Hadoop-in-pyspa
I have 2 questions related to pyspark:
1. How do I load an Avro file that is on the local filesystem, as opposed to
Hadoop? I tried the following and just get NullPointerExceptions:
avro_rdd = sc.newAPIHadoopFile(
    "file:///c:/my-file.avro",
    "org.apache.avro.mapreduce.AvroKeyInputFormat",
    "org.apache.avro.mapred.AvroKey",
    "org.apache.hadoop.io.NullWritable",
    # converter class shipped with Spark's examples (avro_inputformat.py)
    keyConverter="org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter")
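An alternative worth trying is the spark-avro package through the DataFrame API; the package coordinates below are an assumption for a Scala 2.10 / Spark 1.x build, and the path is the same hypothetical file as above:

```python
# Start the shell with the package pulled in, e.g.:
#   pyspark --packages com.databricks:spark-avro_2.10:2.0.1
# then read the local Avro file without any Hadoop cluster:
df = sqlContext.read.format("com.databricks.spark.avro") \
                    .load("file:///c:/my-file.avro")
df.printSchema()
```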
Can Spark be built with Hadoop 2.6? All the instructions I see go up to 2.4,
and there does not seem to be a hadoop-2.6 profile. If it works with Hadoop
2.6, can anyone recommend how to build it?
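Per the Spark 1.x building docs, there is no dedicated hadoop-2.6 profile; the hadoop-2.4 profile is reused with hadoop.version overridden. A sketch (untested here):

```shell
# build against Hadoop 2.6 with YARN support
build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests clean package
```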
Thanks.
SA