I built Apache Spark on Ubuntu 14.04 LTS with the following command:

mvn -Pspark-ganglia-lgpl -Pyarn -Phadoop-2.4 -Dhadoop.version=2.5.0 -DskipTests clean package

The build was successful.
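As a quick sanity check that the -Pspark-ganglia-lgpl profile actually produced the module (the path is from my build tree; the jar name varies with the Spark version):

    find $SPARK_HOME/external/spark-ganglia-lgpl/target -name "*.jar"

Then, the following modifications were made: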

1. Added "SPARK_LOCAL_IP=127.0.0.1" to the file $SPARK_HOME/conf/spark-env.sh to avoid the following warnings:

16/04/27 17:45:54 WARN Utils: Your hostname, ganglia resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface eth0)
16/04/27 17:45:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

        After that, Spark started without these warnings.
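        For reference, the exact line as it now appears in $SPARK_HOME/conf/spark-env.sh:

            SPARK_LOCAL_IP=127.0.0.1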

2. To enable Ganglia metrics, the following lines were added to the file $SPARK_HOME/conf/metrics.properties:

        # Enable GangliaSink for all instances
        *.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
        *.sink.ganglia.name=SparkCluster
        *.sink.ganglia.host=XYZ.XYZ.XYZ.XYZ (Replaced by real IP)
        *.sink.ganglia.port=8649
        *.sink.ganglia.period=10
        *.sink.ganglia.unit=seconds
        *.sink.ganglia.ttl=1
        *.sink.ganglia.mode=multicast

        The following errors were displayed, but Spark started anyway:

16/04/27 17:45:59 ERROR MetricsSystem: Sink class org.apache.spark.metrics.sink.GangliaSink cannot be instantiated
16/04/27 17:45:59 ERROR SparkContext: Error initializing SparkContext.
java.lang.ClassNotFoundException: org.apache.spark.metrics.sink.GangliaSink
            at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
            at java.lang.Class.forName0(Native Method)
            .......


"GangliaSink" class can be found at: $SPARK_HOME/external/spark-ganglia-lgpl/target/classes/org/apache/spark/metrics/sink/GangliaSink.class

I have seen previous threads about the same problem, but I could not find a solution. Any ideas?


