Thanks, will try this out and get back...

On Tue, Jun 23, 2015 at 2:30 AM, Tathagata Das <t...@databricks.com> wrote:
> Try adding the provided scopes:
>
>     <dependency> <!-- Spark dependency -->
>         <groupId>org.apache.spark</groupId>
>         <artifactId>spark-core_2.10</artifactId>
>         <version>1.4.0</version>
>         <scope>provided</scope>
>     </dependency>
>     <dependency> <!-- Spark Streaming dependency -->
>         <groupId>org.apache.spark</groupId>
>         <artifactId>spark-streaming_2.10</artifactId>
>         <version>1.4.0</version>
>         <scope>provided</scope>
>     </dependency>
>
> This prevents these artifacts from being included in the assembly JARs.
>
> See scope:
> https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Scope
>
> On Mon, Jun 22, 2015 at 10:28 AM, Nipun Arora <nipunarora2...@gmail.com> wrote:
>
>> Hi Tathagata,
>>
>> I am attaching a snapshot of my pom.xml. It would help immensely if I
>> could include max and min values in my mapper phase.
>>
>> The question is still open at:
>> http://stackoverflow.com/questions/30902090/adding-max-and-min-in-spark-stream-in-java/30909796#30909796
>>
>> I see that there is a bug report filed for a similar error as well:
>> https://issues.apache.org/jira/browse/SPARK-3266
>>
>> Please let me know how I can get the same version of Spark Streaming in
>> my assembly.
>> I am using the following Spark version:
>> http://www.apache.org/dyn/closer.cgi/spark/spark-1.4.0/spark-1.4.0-bin-hadoop2.6.tgz
>> -- no compilation, just an untar, then the spark-submit script in a
>> local install.
>>
>> I still get the same error.
>>
>> Exception in thread "JobGenerator" java.lang.NoSuchMethodError:
>> org.apache.spark.api.java.JavaPairRDD.max(Ljava/util/Comparator;)Lscala/Tuple2;
>>
>>     <dependencies>
>>         <dependency> <!-- Spark dependency -->
>>             <groupId>org.apache.spark</groupId>
>>             <artifactId>spark-core_2.10</artifactId>
>>             <version>1.4.0</version>
>>         </dependency>
>>         <dependency> <!-- Spark Streaming dependency -->
>>             <groupId>org.apache.spark</groupId>
>>             <artifactId>spark-streaming_2.10</artifactId>
>>             <version>1.4.0</version>
>>         </dependency>
>>
>> Thanks
>> Nipun
>>
>> On Thu, Jun 18, 2015 at 11:16 PM, Nipun Arora <nipunarora2...@gmail.com> wrote:
>>
>>> Hi Tathagata,
>>>
>>> When you say "please mark spark-core and spark-streaming as dependencies",
>>> what do you mean?
>>> I have installed the pre-built spark-1.4 for Hadoop 2.6 from the Spark
>>> downloads. In my Maven pom.xml, I am using version 1.4 as described.
>>>
>>> Please let me know how I can fix that.
>>>
>>> Thanks
>>> Nipun
>>>
>>> On Thu, Jun 18, 2015 at 4:22 PM, Tathagata Das <t...@databricks.com> wrote:
>>>
>>>> I think you may be including a different version of Spark Streaming in
>>>> your assembly. Please mark spark-core and spark-streaming as provided
>>>> dependencies. Any installation of Spark will automatically provide Spark
>>>> in the classpath, so you do not have to bundle it.
>>>>
>>>> On Thu, Jun 18, 2015 at 8:44 AM, Nipun Arora <nipunarora2...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have the following piece of code, where I am trying to transform a
>>>>> Spark stream and add the min and max of each RDD to it. However, I get
>>>>> an error at run time saying the max call does not exist (it compiles
>>>>> properly).
>>>>> I am using spark-1.4.
>>>>>
>>>>> I have added the question to Stack Overflow as well:
>>>>> http://stackoverflow.com/questions/30902090/adding-max-and-min-in-spark-stream-in-java/30909796#30909796
>>>>>
>>>>> Any help is greatly appreciated :)
>>>>>
>>>>> Thanks
>>>>> Nipun
>>>>>
>>>>> JavaPairDStream<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>> sortedtsStream =
>>>>>         transformedMaxMintsStream.transformToPair(new Sort2());
>>>>>
>>>>> sortedtsStream.foreach(
>>>>>         new Function<JavaPairRDD<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>>, Void>() {
>>>>>             @Override
>>>>>             public Void call(JavaPairRDD<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>> tuple2Tuple3JavaPairRDD) throws Exception {
>>>>>                 List<Tuple2<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>>> templist = tuple2Tuple3JavaPairRDD.collect();
>>>>>                 for (Tuple2<Tuple2<Long, Integer>, Tuple3<Integer, Long, Long>> tuple : templist) {
>>>>>                     Date date = new Date(tuple._1._1);
>>>>>                     int pattern = tuple._1._2;
>>>>>                     int count = tuple._2._1();
>>>>>                     Date maxDate = new Date(tuple._2._2());
>>>>>                     Date minDate = new Date(tuple._2._3()); // min is the third field; reading _2() here would just repeat the max
>>>>>                     System.out.println("TimeSlot: " + date.toString()
>>>>>                             + " Pattern: " + pattern + " Count: " + count
>>>>>                             + " Max: " + maxDate.toString() + " Min: " + minDate.toString());
>>>>>                 }
>>>>>                 return null;
>>>>>             }
>>>>>         }
>>>>> );
>>>>>
>>>>> Error:
>>>>>
>>>>> 15/06/18 11:05:06 INFO BlockManagerInfo: Added input-0-1434639906000 in memory on localhost:42829 (size: 464.0 KB, free: 264.9 MB)
>>>>> 15/06/18 11:05:06 INFO BlockGenerator: Pushed block input-0-1434639906000
>>>>> Exception in thread "JobGenerator" java.lang.NoSuchMethodError: org.apache.spark.api.java.JavaPairRDD.max(Ljava/util/Comparator;)Lscala/Tuple2;
>>>>>     at org.necla.ngla.spark_streaming.MinMax.call(Type4ViolationChecker.java:346)
>>>>>     at org.necla.ngla.spark_streaming.MinMax.call(Type4ViolationChecker.java:340)
>>>>>     at org.apache.spark.streaming.api.java.JavaDStreamLike$class.scalaTransform$3(JavaDStreamLike.scala:360)
>>>>>     at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$transformToPair$1.apply(JavaDStreamLike.scala:361)
>>>>>     at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$transformToPair$1.apply(JavaDStreamLike.scala:361)
>>>>>     at org.apache.spark.streaming.dstream.DStream$$anonfun$transform$1$$anonf
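The "provided" scope advice at the top of the thread assumes the assembly (fat) JAR is built by a Maven plugin that honors dependency scopes. The thread does not say which plugin the project uses; as one common setup, a minimal maven-shade-plugin sketch that pairs with those provided scopes might look like this (plugin version and phase are illustrative):

```xml
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.3</version>
            <executions>
                <execution>
                    <!-- Build the shaded (assembly) JAR during `mvn package` -->
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```

The shade plugin bundles only compile/runtime-scope dependencies, so marking spark-core and spark-streaming as provided keeps the cluster's own Spark classes out of the fat JAR. A quick sanity check is `jar tf target/yourapp.jar | grep org/apache/spark` (jar name hypothetical): if Spark is correctly excluded, it prints nothing.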
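For context on the error itself: NoSuchMethodError means the JVM could not find that exact `max(Comparator)` signature in the Spark jars on the runtime classpath, which is exactly what a mismatch between a bundled Spark and the installed Spark produces. The comparator that `JavaPairRDD.max(Comparator)` takes follows the ordinary `java.util.Comparator` contract, with the extra requirement that Spark ships it to executors, so it must also be `Serializable`. Below is a plain-Java sketch of that comparator logic, exercised with `Collections.max` instead of Spark; `CountedKey` and the sample data are made up for illustration, not from the thread:

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical pair type standing in for the thread's Tuple2<key, count>.
class CountedKey {
    final String key;
    final int count;
    CountedKey(String key, int count) { this.key = key; this.count = count; }
}

// Spark serializes the comparator to every executor, so it must implement
// Serializable as well as Comparator; forgetting this fails at run time.
class CountComparator implements Comparator<CountedKey>, Serializable {
    @Override
    public int compare(CountedKey a, CountedKey b) {
        return Integer.compare(a.count, b.count);
    }
}

public class MaxByCount {
    static CountedKey maxByCount(List<CountedKey> records) {
        // Collections.max uses the same Comparator contract that
        // JavaPairRDD.max(Comparator) expects in the Spark Java API.
        return Collections.max(records, new CountComparator());
    }

    public static void main(String[] args) {
        List<CountedKey> records = Arrays.asList(
                new CountedKey("patternA", 3),
                new CountedKey("patternB", 7),
                new CountedKey("patternC", 5));
        System.out.println(maxByCount(records).key); // prints "patternB"
    }
}
```

The same comparator instance, unchanged, is what the thread's `MinMax` function would pass to `rdd.max(...)` and `rdd.min(...)` once the classpath issue is fixed.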