Cassandra and Spark checkpoints
According to the "DataStax Brings Spark To Cassandra" press release: "DataStax has partnered with Databricks, the company founded by the creators of Apache Spark, to build a supported, open source integration between the two platforms. The partners expect to have the integration ready by this summer."

How far does this integration go? For example, is it possible to use Cassandra as distributed checkpoint storage? Is currently only HDFS supported?

Thanks
Toivo
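A note for context, since the question is about where checkpoint data can live: in current Spark the checkpoint directory is set on the SparkContext and is written through the Hadoop FileSystem API, which is why HDFS (or another Hadoop-compatible filesystem) is what works today. A minimal sketch; the HDFS URL, application name, and master are placeholders:

import org.apache.spark.{SparkConf, SparkContext}

// Checkpoint data goes through the Hadoop FileSystem API, so the target
// directory must live on a Hadoop-compatible store such as HDFS.
val sc = new SparkContext(new SparkConf().setAppName("CheckpointDemo").setMaster("local"))
sc.setCheckpointDir("hdfs://namenode:8020/spark/checkpoints") // hypothetical path

val rdd = sc.parallelize(1 to 100)
rdd.checkpoint() // materialized under the directory above on the next action
println(rdd.count())

A Cassandra-backed checkpoint store would presumably need either a Hadoop FileSystem implementation on top of Cassandra or new pluggable checkpoint support in Spark itself.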
Re: Spark 1.0.0 Maven dependencies problems.
Thanks for the hint. I removed the signature info from that jar and the JVM is happy now. But the underlying problem remains: several copies of the same jar in different versions, which is not good.

Spark itself is very, very promising; I am very excited.

Thank you all
toivo
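For anyone finding this thread later: when an assembly is built with the maven-shade-plugin (not necessarily toivo's setup here), signature files can be kept out of the merged jar with a filter, which avoids exactly this kind of signer mismatch. A sketch of the relevant plugin configuration:

<filter>
    <!-- Strip signature files from every merged artifact so the combined
         jar does not trip signer verification at runtime. -->
    <artifact>*:*</artifact>
    <excludes>
        <exclude>META-INF/*.SF</exclude>
        <exclude>META-INF/*.DSA</exclude>
        <exclude>META-INF/*.RSA</exclude>
    </excludes>
</filter>

This only suppresses the symptom; aligning the duplicate jars on a single version is still the real fix.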
Re: Spark 1.0.0 Maven dependencies problems.
I am using Maven from Eclipse. dependency:tree shows:

[INFO] +- org.apache.spark:spark-core_2.10:jar:1.0.0:compile
[INFO] |  +- net.java.dev.jets3t:jets3t:jar:0.7.1:runtime
[INFO] |  +- org.apache.curator:curator-recipes:jar:2.4.0:compile
[INFO] |  |  +- org.apache.curator:curator-framework:jar:2.4.0:compile
[INFO] |  |  |  \- org.apache.curator:curator-client:jar:2.4.0:compile
[INFO] |  |  \- org.apache.zookeeper:zookeeper:jar:3.4.5:compile
[INFO] |  |     \- jline:jline:jar:0.9.94:compile
[INFO] |  +- org.eclipse.jetty:jetty-plus:jar:8.1.14.v20131031:compile
[INFO] |  |  +- org.eclipse.jetty.orbit:javax.transaction:jar:1.1.1.v201105210645:compile
[INFO] |  |  +- org.eclipse.jetty:jetty-webapp:jar:8.1.14.v20131031:compile
[INFO] |  |  |  +- org.eclipse.jetty:jetty-xml:jar:8.1.14.v20131031:compile
[INFO] |  |  |  \- org.eclipse.jetty:jetty-servlet:jar:8.1.14.v20131031:compile
[INFO] |  |  \- org.eclipse.jetty:jetty-jndi:jar:8.1.14.v20131031:compile
[INFO] |  |     \- org.eclipse.jetty.orbit:javax.mail.glassfish:jar:1.4.1.v201005082020:compile
[INFO] |  |        \- org.eclipse.jetty.orbit:javax.activation:jar:1.1.0.v201105071233:compile
[INFO] |  +- org.eclipse.jetty:jetty-security:jar:8.1.14.v20131031:compile
[INFO] |  +- org.eclipse.jetty:jetty-server:jar:8.1.14.v20131031:compile
[INFO] |  |  +- org.eclipse.jetty.orbit:javax.servlet:jar:3.0.0.v201112011016:compile
[INFO] |  |  +- org.eclipse.jetty:jetty-continuation:jar:8.1.14.v20131031:compile
[INFO] |  |  \- org.eclipse.jetty:jetty-http:jar:8.1.14.v20131031:compile
[INFO] |  |     \- org.eclipse.jetty:jetty-io:jar:8.1.14.v20131031:compile
[INFO] |  +- com.google.guava:guava:jar:14.0.1:compile
[INFO] |  +- org.apache.commons:commons-lang3:jar:3.3.2:compile
[INFO] |  +- com.google.code.findbugs:jsr305:jar:1.3.9:compile
[INFO] |  +- org.slf4j:slf4j-api:jar:1.7.5:compile
[INFO] |  +- org.slf4j:jul-to-slf4j:jar:1.7.5:compile
[INFO] |  +- org.slf4j:jcl-over-slf4j:jar:1.7.5:compile
[INFO] |  +- log4j:log4j:jar:1.2.17:compile
[INFO] |  +- org.slf4j:slf4j-log4j12:jar:1.7.5:compile
[INFO] |  +- com.ning:compress-lzf:jar:1.0.0:compile
[INFO] |  +- org.xerial.snappy:snappy-java:jar:1.0.5:compile
[INFO] |  +- com.twitter:chill_2.10:jar:0.3.6:compile
[INFO] |  |  \- com.esotericsoftware.kryo:kryo:jar:2.21:compile
[INFO] |  |     +- com.esotericsoftware.reflectasm:reflectasm:jar:shaded:1.07:compile
[INFO] |  |     +- com.esotericsoftware.minlog:minlog:jar:1.2:compile
[INFO] |  |     \- org.objenesis:objenesis:jar:1.2:compile
[INFO] |  +- com.twitter:chill-java:jar:0.3.6:compile
[INFO] |  +- commons-net:commons-net:jar:2.2:compile
[INFO] |  +- org.spark-project.akka:akka-remote_2.10:jar:2.2.3-shaded-protobuf:compile
[INFO] |  |  +- org.spark-project.akka:akka-actor_2.10:jar:2.2.3-shaded-protobuf:compile
[INFO] |  |  |  \- com.typesafe:config:jar:1.0.2:compile
[INFO] |  |  +- io.netty:netty:jar:3.6.6.Final:compile
[INFO] |  |  +- org.spark-project.protobuf:protobuf-java:jar:2.4.1-shaded:compile
[INFO] |  |  \- org.uncommons.maths:uncommons-maths:jar:1.2.2a:compile
[INFO] |  +- org.spark-project.akka:akka-slf4j_2.10:jar:2.2.3-shaded-protobuf:compile
[INFO] |  +- org.scala-lang:scala-library:jar:2.10.4:compile
[INFO] |  +- org.json4s:json4s-jackson_2.10:jar:3.2.6:compile
[INFO] |  |  +- org.json4s:json4s-core_2.10:jar:3.2.6:compile
[INFO] |  |  |  +- org.json4s:json4s-ast_2.10:jar:3.2.6:compile
[INFO] |  |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.6:compile
[INFO] |  |  |  \- org.scala-lang:scalap:jar:2.10.0:compile
[INFO] |  |  |     \- org.scala-lang:scala-compiler:jar:2.10.0:compile
[INFO] |  |  |        \- org.scala-lang:scala-reflect:jar:2.10.0:compile
[INFO] |  |  \- com.fasterxml.jackson.core:jackson-databind:jar:2.3.0:compile
[INFO] |  |     +- com.fasterxml.jackson.core:jackson-annotations:jar:2.3.0:compile
[INFO] |  |     \- com.fasterxml.jackson.core:jackson-core:jar:2.3.0:compile
[INFO] |  +- colt:colt:jar:1.2.0:compile
[INFO] |  |  \- concurrent:concurrent:jar:1.3.4:compile
[INFO] |  +- org.apache.mesos:mesos:jar:shaded-protobuf:0.18.1:compile
[INFO] |  +- io.netty:netty-all:jar:4.0.17.Final:compile
[INFO] |  +- com.clearspring.analytics:stream:jar:2.5.1:compile
[INFO] |  +- com.codahale.metrics:metrics-core:jar:3.0.0:compile
[INFO] |  +- com.codahale.metrics:metrics-jvm:jar:3.0.0:compile
[INFO] |  +- com.codahale.metrics:metrics-json:jar:3.0.0:compile
[INFO] |  +- com.codahale.metrics:metrics-graphite:jar:3.0.0:compile
[INFO] |  +- org.tachyonproject:tachyon:jar:0.4.1-thrift:compile
[INFO] |  |  +- org.apache.ant:ant:jar:1.9.0:compile
[INFO] |  |  |  \- org.apache.ant:ant-launcher:jar:1.9.0:compile
[INFO] |  |  \- commons-io:commons-io:jar:2.4:compile
[INFO] |  +- org.spark-project:pyrolite:jar:2.0.1:compile
[INFO] |  \- net.sf.py4j:py4j:jar:0.8.1:compile
[INFO] +- org.apache.hadoop:hadoop-client:jar:2.4.0:compile
[INFO] |  +- org.apache.hadoop:hadoop-common:jar:2.4.0:compile
[INFO] |  |  +- org.apache.commons:commons-math3:jar:3.1.1:compile
Spark 1.0.0 Maven dependencies problems.
Using

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.0.0</version>
</dependency>

I can create a simple test and run it under Eclipse. But when I try to deploy on the test server, I have dependency problems.

1. Spark requires akka-remote_2.10 2.2.3-shaded-protobuf, and this in turn requires io.netty:netty 3.6.6.Final.
2. At the same time, Spark itself requires netty-parent 4.0.17.Final.

So now I have two different Netty versions, and I get either

Exception in thread "main" java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package

when using 3.6.6.Final, or

14/06/09 16:08:10 ERROR ActorSystemImpl: Uncaught fatal error from thread [spark-akka.actor.default-dispatcher-4] shutting down ActorSystem [spark]
java.lang.NoClassDefFoundError: org/jboss/netty/util/Timer

when using 4.0.17.Final.

What am I doing wrong, and how can I solve this?

Thanks
toivo
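For the archives, two observations that may help. Netty 3.x and 4.x live under different root packages (org.jboss.netty vs. io.netty), so the two jars can normally coexist; the NoClassDefFoundError for org/jboss/netty/util/Timer appears when the 3.x jar is removed while akka-remote still needs it. The FilterRegistration SecurityException usually means two javax.servlet jars, one of them signed, ended up on the same classpath. Maven's exclusion mechanism can drop whichever duplicate turns out to be unwanted; a sketch, assuming (not confirmed in this thread) that the signed Jetty orbit servlet jar is the one to remove:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.0.0</version>
    <exclusions>
        <!-- Drop the signed servlet jar pulled in transitively via Jetty,
             leaving a single unsigned javax.servlet on the classpath. -->
        <exclusion>
            <groupId>org.eclipse.jetty.orbit</groupId>
            <artifactId>javax.servlet</artifactId>
        </exclusion>
    </exclusions>
</dependency>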
Re: wholeTextFiles() : java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
Wow! What a quick reply! Adding

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.4.0</version>
</dependency>

solved the problem. But now I get

14/06/03 19:52:50 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
	at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
	at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
	at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
	at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
	at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)

thanks
toivo
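On the winutils error: on Windows, Hadoop's Shell class looks for bin\winutils.exe under the hadoop.home.dir system property or the HADOOP_HOME environment variable, neither of which is set here (hence the null\bin\winutils.exe path in the exception). A common workaround, sketched under the assumption that winutils.exe has been unpacked to a hypothetical local directory C:\hadoop\bin:

// "C:\\hadoop" is a placeholder for wherever bin\winutils.exe actually lives.
// This must run before the SparkContext is created, i.e. before Hadoop's
// Shell class is loaded and caches the (missing) path.
System.setProperty("hadoop.home.dir", "C:\\hadoop")
val sc = new SparkContext(conf)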
wholeTextFiles() : java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
Hi

I set up a project under Eclipse using Maven:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.0.0</version>
</dependency>

A simple example fails:

import org.apache.spark.{SparkConf, SparkContext}

def main(args: Array[String]): Unit = {
  val conf = new SparkConf()
    .setMaster("local")
    .setAppName("CountingSheep")
    .set("spark.executor.memory", "1g")
  val sc = new SparkContext(conf)
  val indir = "src/main/resources/testdata"
  val files = sc.wholeTextFiles(indir, 10)
  for (pair <- files) println(pair._1 + " = " + pair._2)
}

14/06/03 19:20:34 ERROR executor.Executor: Exception in task ID 0
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
	at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
	at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
	at org.apache.spark.input.WholeTextFileInputFormat.createRecordReader(WholeTextFileInputFormat.scala:44)
	at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
	at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
	at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
	at org.apache.spark.scheduler.Task.run(Task.scala:51)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
	... 13 more
Caused by: java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
	at org.apache.spark.input.WholeTextFileRecordReader.<init>(WholeTextFileRecordReader.scala:40)
	... 18 more

Any ideas?

thanks
toivo