Spark Metrics for Ganglia
I built Apache Spark on Ubuntu 14.04 LTS with the following command:

mvn -Pspark-ganglia-lgpl -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -DskipTests clean package

The build was successful. I then made the following modifications.

1. I added "SPARK_LOCAL_IP=127.0.0.1" to $SPARK_HOME/conf/spark-env.sh to avoid the following warnings:

16/04/27 17:45:54 WARN Utils: Your hostname, ganglia resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface eth0)
16/04/27 17:45:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

After that, Spark started without these warnings.

2. To enable Ganglia metrics, I added the following lines to $SPARK_HOME/conf/metrics.properties:

# Enable GangliaSink for all instances
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=SparkCluster
*.sink.ganglia.host=XYZ.XYZ.XYZ.XYZ (replaced by the real IP)
*.sink.ganglia.port=8649
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
*.sink.ganglia.ttl=1
*.sink.ganglia.mode=multicast

The following errors were displayed, but Spark still started:

16/04/27 17:45:59 ERROR MetricsSystem: Sink class org.apache.spark.metrics.sink.GangliaSink cannot be instantiated
16/04/27 17:45:59 ERROR SparkContext: Error initializing SparkContext.
java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink
    at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    ...

The GangliaSink class can be found at:

$SPARK_HOME/external/spark-ganglia-lgpl/target/classes/org/apache/spark/metrics/sink/GangliaSink.class

I can see previous threads about the same problem, but I cannot find any solution. Any ideas?
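A ClassNotFoundException means the JVM could not load GangliaSink from its classpath. A class file under external/spark-ganglia-lgpl/target/classes shows that the class was compiled, not necessarily that it was packaged into the jar Spark runs from. One way to narrow this down is to look inside the jar. The following is a minimal sketch, assuming Spark loads its classes from a single assembly jar; the jar path in the usage comment is hypothetical and has to be replaced with the real one.

```python
import zipfile

# Entry name of the class inside a jar, taken from the class name in the error.
GANGLIA_SINK_ENTRY = "org/apache/spark/metrics/sink/GangliaSink.class"


def jar_contains_class(jar_path, entry=GANGLIA_SINK_ENTRY):
    """Return True if the jar (a zip archive) contains the given class entry."""
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Usage (hypothetical path, adjust to your build output):
#   jar_contains_class("/path/to/spark-assembly.jar")
```

If this returns False for the jar that Spark actually uses, the sink was not bundled into it, which would be consistent with the error above.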
Re: Can not see any spark metrics on ganglia-web
Hi,

Should gmond be installed on all the Spark nodes? What should the host and port be? Should they be the host and port of gmetad?

# Enable GangliaSink for all instances
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=hadoop_cluster1
*.sink.ganglia.host=localhost
*.sink.ganglia.port=8653
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
*.sink.ganglia.ttl=1
*.sink.ganglia.mode=multicast

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p25636.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: Can not see any spark metrics on ganglia-web
Hi,

Where does the *.sink.csv.directory directory get created? I cannot see any metrics in the logs. How did you verify ConsoleSink and CSVSink?

Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p25643.html
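A quick way to check the CSV sink is to list what has been written to the directory named by *.sink.csv.directory. The following is a minimal sketch, assuming the sink writes files with a .csv extension directly into that directory; the directory in the usage comment is hypothetical and must match whatever metrics.properties sets.

```python
from pathlib import Path


def list_metric_files(directory):
    """Return the sorted names of the .csv files found directly in directory."""
    return sorted(p.name for p in Path(directory).glob("*.csv"))


# Usage (hypothetical directory, use the value of *.sink.csv.directory):
#   list_metric_files("/path/to/csv/metrics")
```

An empty list after the application has run for longer than the configured period suggests the sink is not active or is writing somewhere else.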
Re: Spark metrics for ganglia
Hi,

How can I verify whether the GangliaSink directory was created?

Thanks,
Swetha

On Mon, Dec 15, 2014 at 11:29 AM, danilopds <danilob...@gmail.com> wrote:

> Thanks tsingfu,
>
> I used this configuration based on your post (with Ganglia unicast mode):
>
> # Enable GangliaSink for all instances
> *.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
> *.sink.ganglia.host=10.0.0.7
> *.sink.ganglia.port=8649
> *.sink.ganglia.period=15
> *.sink.ganglia.unit=seconds
> *.sink.ganglia.ttl=1
> *.sink.ganglia.mode=unicast
>
> Then I got the following error:
>
> ERROR metrics.MetricsSystem: Sink class org.apache.spark.metrics.sink.GangliaSink cannot be instantialized
> java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink
Re: Can not see any spark metrics on ganglia-web
Hi,

I cannot see any metrics either. How did you verify that ConsoleSink and CSVSink work OK? Where does *.sink.csv.directory get created?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p25644.html
Re: Spark metrics for ganglia
Did you get past this issue? I'm trying to get this to work as well.

You have to compile the spark-ganglia-lgpl artifact into your application:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-ganglia-lgpl_2.10</artifactId>
  <version>${project.version}</version>
</dependency>

So I added the above snippet to the examples project, and it finds the class now when I try to run the Pi example, but I get this problem instead:

14/12/24 11:47:23 ERROR metrics.MetricsSystem: Sink class org.apache.spark.metrics.sink.GangliaSink cannot be instantialized
...SNIP...
Caused by: java.lang.NumberFormatException: For input string: 1

On 12/15/14, 11:29 AM, danilopds danilob...@gmail.com wrote:

> Thanks tsingfu,
>
> I used this configuration based on your post (with Ganglia unicast mode):
>
> # Enable GangliaSink for all instances
> *.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
> *.sink.ganglia.host=10.0.0.7
> *.sink.ganglia.port=8649
> *.sink.ganglia.period=15
> *.sink.ganglia.unit=seconds
> *.sink.ganglia.ttl=1
> *.sink.ganglia.mode=unicast
>
> Then I got the following error:
>
> ERROR metrics.MetricsSystem: Sink class org.apache.spark.metrics.sink.GangliaSink cannot be instantialized
> java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink
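The archived message does not show the offending value in full, so the cause of the NumberFormatException cannot be read off the log. One possible cause, stated here purely as an assumption, is invisible trailing whitespace after a numeric value such as ttl=1 in metrics.properties: Java's properties loader keeps trailing whitespace in values, and integer parsing then rejects it. The sketch below only flags such lines; the function name is hypothetical.

```python
def find_trailing_whitespace_values(text):
    """Return (line_number, key) pairs whose value ends in whitespace.

    Comment lines (starting with '#' or '!') and blank lines are skipped.
    Only simple key=value lines are handled.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip()
        if not stripped or stripped.startswith(("#", "!")):
            continue
        key, sep, value = stripped.partition("=")
        # A value that changes when right-stripped carries trailing whitespace.
        if sep and value != value.rstrip():
            hits.append((number, key.strip()))
    return hits


# Usage: pass the contents of metrics.properties, for example
#   with open("conf/metrics.properties") as f:
#       print(find_trailing_whitespace_values(f.read()))
```

If nothing is flagged, this assumption does not apply and the cause lies elsewhere.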
Re: Spark metrics for ganglia
Thanks tsingfu,

I used this configuration based on your post (with Ganglia unicast mode):

# Enable GangliaSink for all instances
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.host=10.0.0.7
*.sink.ganglia.port=8649
*.sink.ganglia.period=15
*.sink.ganglia.unit=seconds
*.sink.ganglia.ttl=1
*.sink.ganglia.mode=unicast

Then I got the following error:

ERROR metrics.MetricsSystem: Sink class org.apache.spark.metrics.sink.GangliaSink cannot be instantialized
java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-metrics-for-ganglia-tp14335p20690.html
Re: Can not see any spark metrics on ganglia-web
I used the command below because I'm using Spark 1.0.2 built with SBT, and it worked.

SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true SPARK_GANGLIA_LGPL=true sbt/sbt assembly

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p20384.html
Re: Spark metrics for ganglia
Hello Samudrala,

Did you solve this issue with viewing metrics in Ganglia? I have the same problem.

Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-metrics-for-ganglia-tp14335p20385.html
Re: Can not see any spark metrics on ganglia-web
Hi tsingfu,

I want to see metrics in Ganglia too, but I don't understand this step:

./make-distribution.sh --tgz --skip-java-test -Phadoop-2.3 -Pyarn -Phive -Pspark-ganglia-lgpl

Are you installing Hadoop, YARN, Hive AND Ganglia? What if I want to install just Ganglia? Can you suggest something?

Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p15631.html
Re: Can not see any spark metrics on ganglia-web
Hi,

I am sure you can use the -Pspark-ganglia-lgpl switch to enable Ganglia. This step only adds the support for Hadoop, YARN, Hive et al. in the Spark executable. There is no need to run it if you are not using them.

Cheers
k/

On Thu, Oct 2, 2014 at 12:29 PM, danilopds danilob...@gmail.com wrote:

> Hi tsingfu,
>
> I want to see metrics in ganglia too. But I don't understand this step:
>
> ./make-distribution.sh --tgz --skip-java-test -Phadoop-2.3 -Pyarn -Phive -Pspark-ganglia-lgpl
>
> Are you installing the hadoop, yarn, hive AND ganglia?? If I want to install just ganglia? Can you suggest me something?
>
> Thanks!
Re: Can not see any spark metrics on ganglia-web
OK Krishna Sankar,

Regarding this information on the Spark monitoring web page:

For sbt users, set the SPARK_GANGLIA_LGPL environment variable before building. For Maven users, enable the -Pspark-ganglia-lgpl profile.

Do you know what I need to do to install with sbt?

Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p15636.html
Re: Can not see any spark metrics on ganglia-web
Hi,

I found the problem. By default, gmond listens on the multicast IP 239.2.11.71, while I had set *.sink.ganglia.host=localhost.

The correct configuration in metrics.properties is:

# Enable GangliaSink for all instances
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
#*.sink.ganglia.host=localhost
*.sink.ganglia.host=239.2.11.71
*.sink.ganglia.port=8653
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
*.sink.ganglia.ttl=1
*.sink.ganglia.mode=multicast

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981p15128.html
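The mismatch described above (mode=multicast combined with a host that is not a multicast address) can be caught before restarting Spark. The following is a minimal sketch of such a check. The function names are hypothetical, the parser handles only simple key=value lines, and the rule that a multicast sink needs a multicast host is inferred from this thread rather than taken from the Spark documentation.

```python
import ipaddress


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_ganglia_sink(props, prefix="*.sink.ganglia."):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    mode = props.get(prefix + "mode", "")
    host = props.get(prefix + "host", "")
    if mode == "multicast":
        try:
            if not ipaddress.ip_address(host).is_multicast:
                problems.append("mode is multicast but %s is not a multicast address" % host)
        except ValueError:
            # Host names such as "localhost" are not IP literals at all.
            problems.append("mode is multicast but host %r is not an IP address" % host)
    return problems


# Usage: check_ganglia_sink(parse_properties(open("conf/metrics.properties").read()))
```

With host=localhost and mode=multicast the check reports one problem; with host=239.2.11.71 it reports none.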
Can not see any spark metrics on ganglia-web
I installed Ganglia, and I think it works well for Hadoop and HBase, because I can see Hadoop/HBase metrics on ganglia-web. I want to use Ganglia to monitor Spark, and I followed these steps:

1) First I did a custom build with -Pspark-ganglia-lgpl, and it succeeded without warnings.

./make-distribution.sh --tgz --skip-java-test -Phadoop-2.3 -Pyarn -Phive -Pspark-ganglia-lgpl

2) I configured conf/metrics.properties (8653 is the port I set for gmond) and restarted the Spark Master and Worker.

vi conf/metrics.properties

# Enable GangliaSink for all instances
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=hadoop_cluster1
*.sink.ganglia.host=localhost
*.sink.ganglia.port=8653
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
*.sink.ganglia.ttl=1
*.sink.ganglia.mode=multicast

sbin/stop-all.sh
sbin/start-all.sh

3) I refreshed ganglia-web, but I cannot see any Spark metrics.

4) I ran a test to verify that ConsoleSink and CSVSink work, and they do: I found metrics in the logs and in *.sink.csv.directory.

I searched for topics about Ganglia and metrics on http://apache-spark-user-list.1001560.n3.nabble.com, the Spark JIRA, and Google, but found nothing useful. Could anyone give me some help or suggestions?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-see-any-spark-metrics-on-ganglia-web-tp14981.html