RE: Spark-Kafka Connector issue

2015-09-28 Thread Ratika Prasad
Thanks for your reply.

I invoked my program with the broker IP and port and it launched as expected, 
but I see the error below

./bin/spark-submit --class org.stream.processing.JavaKafkaStreamEventProcessing 
--master local spark-stream-processing-0.0.1-SNAPSHOT-jar-with-dependencies.jar 
172.28.161.32:9092 TestTopic
15/09/28 17:45:09 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
15/09/28 17:45:11 WARN StreamingContext: spark.master should be set as 
local[n], n > 1 in local mode if you have receivers to get data, otherwise 
Spark jobs will not get resources to process the received data.
Exception in thread "main" org.apache.spark.SparkException: 
java.nio.channels.ClosedChannelException
org.apache.spark.SparkException: Couldn't find leader offsets for 
Set([TestTopic,0])
at 
org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
at 
org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
at scala.util.Either.fold(Either.scala:97)
at 
org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:365)
at 
org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:422)
at 
org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:532)
at 
org.apache.spark.streaming.kafka.KafkaUtils.createDirectStream(KafkaUtils.scala)
at 
org.stream.processing.JavaKafkaStreamEventProcessing.main(JavaKafkaStreamEventProcessing.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

When I ran the command below to check the offsets, I got this:

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic TestTopic 
--group test-consumer-group --zookeeper localhost:2181
Exiting due to: org.apache.zookeeper.KeeperException$NoNodeException: 
KeeperErrorCode = NoNode for /consumers/test-consumer-group/offsets/TestTopic/0.

Also, I just added the configs below to my kafka/config/consumer.properties 
and restarted Kafka

auto.offset.reset=smallest
offsets.storage=zookeeper
offsets.channel.backoff.ms=1000
offsets.channel.socket.timeout.ms=1
offsets.commit.max.retries=5
dual.commit.enabled=true
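
Note: the direct stream does not read these broker-side consumer.properties 
settings; it takes its Kafka configuration from the kafkaParams map passed to 
createDirectStream on the driver. A minimal illustrative fragment, assuming the 
broker address from the command above:

import java.util.HashMap;

// Fragment only: these entries go into the map handed to KafkaUtils.createDirectStream.
HashMap<String, String> kafkaParams = new HashMap<String, String>();
// Broker host:port, not the ZooKeeper address.
kafkaParams.put("metadata.broker.list", "172.28.161.32:9092");
// Start from the earliest offsets available on the brokers.
kafkaParams.put("auto.offset.reset", "smallest");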

From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: Monday, September 28, 2015 7:56 PM
To: Ratika Prasad 
Cc: dev@spark.apache.org
Subject: Re: Spark-Kafka Connector issue

This is a user list question not a dev list question.

Looks like your driver is having trouble communicating to the kafka brokers.  
Make sure the broker host and port is available from the driver host (using nc 
or telnet); make sure that you're providing the _broker_ host and port to 
createDirectStream, not the zookeeper host; make sure the topics in question 
actually exist on kafka and the names match what you're providing to 
createDirectStream.





On Sat, Sep 26, 2015 at 11:50 PM, Ratika Prasad wrote:
Hi All,

I am trying out Spark Streaming, reading messages from Kafka topics that are 
then turned into streams as below. I have Kafka set up on a VM and the topics 
created; however, when I run the program below from my Spark VM, I get an error 
even though the Kafka server and ZooKeeper are up and running.

./bin/spark-submit --class org.stream.processing.JavaKafkaStreamEventProcessing 
--master local spark-stream-processing-0.0.1-SNAPSHOT-jar-with-dependencies.jar 
172.28.161.32:2181 redemption_inbound

Exception in thread "main" org.apache.spark.SparkException: 
java.io.EOFException: Received -1 when reading from channel, socket has likely 
been closed.
at 
org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
at 
org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
at scala.util.Either.fold(Either.scala:97)
at 
org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:365)
at 
org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:422)
at 

using JavaRDD in spark-redis connector

2015-09-28 Thread Rohith P
Hi all,
  I am trying to work with the spark-redis connector (redislabs), which
requires all transactions between Redis and Spark to be in RDDs. The language
I am using is Java, but the connector does not accept JavaRDDs, so I tried
using SparkContext in my code instead of JavaSparkContext. But when I wanted
to create an RDD using sc.parallelize, it asked for Scala-related parameters
rather than Java lists. When I tried to have both a JavaSparkContext and a
SparkContext (for the connector), the error was that multiple contexts
cannot be opened.
 The code that I have been trying:


// initialize spark context
private static RedisContext config() {
    conf = new SparkConf().setAppName("redis-jedis");
    sc2 = new SparkContext(conf);
    RedisContext rc = new RedisContext(sc2);
    return rc;
}

// write to redis, which requires the data to be in an RDD
private static void WriteUserTacticData(RedisContext rc, String userid,
        String tacticsId, String value) {
    hostTup = calling(redisHost, redisPort);
    String key = userid + "-" + tacticsId;
    RDD<Tuple2<String, String>> newTup = createTuple(key, value);
    rc.toRedisKV(newTup, hostTup);
}

// the createTuple where the RDD is to be created which will be inserted into redis
private static RDD<Tuple2<String, String>> createTuple(String key,
        String value) {
    sc = new JavaSparkContext(conf);
    ArrayList<Tuple2<String, String>> list = new ArrayList<Tuple2<String, String>>();
    Tuple2<String, String> e = new Tuple2<String, String>(key, value);
    list.add(e);
    JavaRDD<Tuple2<String, String>> javardd = sc.parallelize(list);
    RDD<Tuple2<String, String>> newTupRdd = JavaRDD.toRDD(javardd);
    sc.close();
    return newTupRdd;
}



How would I create an RDD (not a JavaRDD) in Java that will be accepted by the
redis connector? Any kind of help related to the topic would be
appreciated.
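
One possible way to get a plain Scala RDD from Java while keeping a single 
context is sketched below. This is a minimal sketch, assuming the connector only 
needs an RDD<Tuple2<String, String>> built on the same SparkContext that backs 
the RedisContext; the class name and key/value strings are made up for 
illustration:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.rdd.RDD;

import scala.Tuple2;

public class RedisRddSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("redis-jedis").setMaster("local[2]");

        // One SparkContext for everything, so "multiple contexts" never comes up.
        SparkContext sc2 = new SparkContext(conf);
        // Wrap the existing context instead of creating a second one.
        JavaSparkContext jsc = new JavaSparkContext(sc2);

        JavaRDD<Tuple2<String, String>> javardd = jsc.parallelize(
                Arrays.asList(new Tuple2<String, String>("user1-tactic1", "value1")));

        // Unwrap to the Scala RDD that the connector expects.
        RDD<Tuple2<String, String>> scalaRdd = javardd.rdd();

        // scalaRdd can now be handed to the connector (e.g. rc.toRedisKV(...)),
        // as in the original snippet; stop the context only after that.
        sc2.stop();
    }
}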





--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/using-JavaRDD-in-spark-redis-connector-tp14391.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.




Re: [Discuss] NOTICE file for transitive "NOTICE"s

2015-09-28 Thread Richard Hillegas
Thanks, Sean!

Sean Owen  wrote on 09/25/2015 06:35:46 AM:

> From: Sean Owen 
> To: Reynold Xin , Richard Hillegas/San
> Francisco/IBM@IBMUS
> Cc: "dev@spark.apache.org" 
> Date: 09/25/2015 07:21 PM
> Subject: Re: [Discuss] NOTICE file for transitive "NOTICE"s
>
> Work underway at ...
>
> https://issues.apache.org/jira/browse/SPARK-10833
> https://github.com/apache/spark/pull/8919
>
>
>
> On Fri, Sep 25, 2015 at 8:54 AM, Sean Owen  wrote:
> > Update: I *think* the conclusion was indeed that nothing needs to
> > happen with NOTICE.
> > However, along the way in
> > https://issues.apache.org/jira/browse/LEGAL-226 it emerged that the
> > BSD/MIT licenses should be inlined into LICENSE (or copied in the
> > distro somewhere). I can get on that -- just some grunt work to copy
> > and paste it all.
> >
> > On Thu, Sep 24, 2015 at 6:55 PM, Reynold Xin wrote:
> >> Richard,
> >>
> >> Thanks for bringing this up and this is a great point. Let's start another
> >> thread for it so we don't hijack the release thread.
> >>
> >>
> >>
> >> On Thu, Sep 24, 2015 at 10:51 AM, Sean Owen wrote:
> >>>
> >>> On Thu, Sep 24, 2015 at 6:45 PM, Richard Hillegas wrote:
> >>> > Under your guidance, I would be happy to help compile a NOTICE file which
> >>> > follows the pattern used by Derby and the JDK. This effort might proceed in
> >>> > parallel with vetting 1.5.1 and could be targeted at a later release
> >>> > vehicle. I don't think that the ASF's exposure is greatly increased by one
> >>> > more release which follows the old pattern.
> >>>
> >>> I'd prefer to use the ASF's preferred pattern, no? That's what we've
> >>> been trying to do and seems like we're even required to do so, not
> >>> follow a different convention. There is some specific guidance there
> >>> about what to add, and not add, to these files. Specifically, because
> >>> the AL2 requires downstream projects to embed the contents of NOTICE,
> >>> the guidance is to only include elements in NOTICE that must appear
> >>> there.
> >>>
> >>> Put it this way -- what would you like to change specifically? (you
> >>> can start another thread for that)
> >>>
> >>> >> My assessment (just looked before I saw Sean's email) is the same as
> >>> >> his. The NOTICE file embeds other projects' licenses.
> >>> >
> >>> > This may be where our perspectives diverge. I did not find those
> >>> > licenses
> >>> > embedded in the NOTICE file. As I see it, the licenses are cited but not
> >>> > included.
> >>>
> >>> Pretty sure that was meant to say that NOTICE embeds other projects'
> >>> "notices", not licenses. And those notices can have all kinds of
> >>> stuff, including licenses.
> >>>
> >>> -
> >>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> >>> For additional commands, e-mail: dev-h...@spark.apache.org
> >>>
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>

Re: Spark-Kafka Connector issue

2015-09-28 Thread Cody Koeninger
This is a user list question not a dev list question.

Looks like your driver is having trouble communicating to the kafka
brokers.  Make sure the broker host and port is available from the driver
host (using nc or telnet); make sure that you're providing the _broker_
host and port to createDirectStream, not the zookeeper host; make sure the
topics in question actually exist on kafka and the names match what you're
providing to createDirectStream.





On Sat, Sep 26, 2015 at 11:50 PM, Ratika Prasad wrote:

> Hi All,
>
>
>
> I am trying out the spark streaming and reading the messages from kafka
> topics which later would be created into streams as below…I have the kafka
> setup on a vm and topics created however when I try to run the program
> below from my spark vm as below I get an error even though the kafka server
> and zookeeper are up and running
>
>
>
> ./bin/spark-submit --class
> org.stream.processing.JavaKafkaStreamEventProcessing --master local
> spark-stream-processing-0.0.1-SNAPSHOT-jar-with-dependencies.jar
> 172.28.161.32:2181 redemption_inbound
>
>
>
> Exception in thread "main" org.apache.spark.SparkException:
> java.io.EOFException: Received -1 when reading from channel, socket has
> likely been closed.
>
> at
> org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
>
> at
> org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
>
> at scala.util.Either.fold(Either.scala:97)
>
> at
> org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:365)
>
> at
> org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:422)
>
> at
> org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:532)
>
> at
> org.apache.spark.streaming.kafka.KafkaUtils.createDirectStream(KafkaUtils.scala)
>
> at
> org.stream.processing.JavaKafkaStreamEventProcessing.main(JavaKafkaStreamEventProcessing.java:52)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> at java.lang.reflect.Method.invoke(Method.java:497)
>
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
>
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
>
> at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
>
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
>
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
>
> Program
>
>
>
> public static void main(String[] args) {
>
>     if (args.length < 2) {
>       System.err.println("Usage: DirectKafkaWordCount <brokers> <topics>\n" +
>           "  <brokers> is a list of one or more Kafka brokers\n" +
>           "  <topics> is a list of one or more kafka topics to consume from\n\n");
>       System.exit(1);
>     }
>
>     String brokers = args[0];
>     String topics = args[1];
>
>     // Create context with 2 second batch interval
>     SparkConf sparkConf = new SparkConf().setAppName("JavaKafkaStreamEventProcessing");
>     JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(2));
>
>     HashSet<String> topicsSet = new HashSet<String>(Arrays.asList(topics.split(",")));
>     HashMap<String, String> kafkaParams = new HashMap<String, String>();
>     kafkaParams.put("metadata.broker.list", brokers);
>
>     // Create direct kafka stream with brokers and topics
>     JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
>         jssc,
>         String.class,
>         String.class,
>         StringDecoder.class,
>         StringDecoder.class,
>         kafkaParams,
>         topicsSet
>     );
>
>     // Get the lines, split them into words, count the words and print
>     JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
>       public String call(Tuple2<String, String> tuple2) {
>         return tuple2._2();
>       }
>     });
>
>     JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
>       public Iterable<String> call(String x) {
>         return Lists.newArrayList(SPACE.split(x));
>       }
>     });
>
>     JavaPairDStream<String, Integer> wordCounts = words.mapToPair(
>       new PairFunction<String, String, Integer>() {
>         public Tuple2<String, Integer> call(String s) {
>           return new Tuple2<String, Integer>(s, 1);
>         }
>       }).reduceByKey(
>         new Function2<Integer, Integer, Integer>() {
>

failed to run spark sample on windows

2015-09-28 Thread Renyi Xiong
I tried to run the HdfsTest sample on Windows with spark-1.4.0

bin\run-example org.apache.spark.examples.HdfsTest 

but got the exception below. Does anybody have any idea what went wrong here?

15/09/28 16:33:56.565 ERROR SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
at org.apache.hadoop.util.Shell.run(Shell.java:418)
at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
at
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
at
org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:467)
at
org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:130)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:515)
at org.apache.spark.examples.HdfsTest$.main(HdfsTest.scala:32)
at org.apache.spark.examples.HdfsTest.main(HdfsTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


Monitoring tools for spark streaming

2015-09-28 Thread Siva
Hi,

Could someone recommend monitoring tools for Spark Streaming?

By extending StreamingListener we can dump the delay in processing of
batches and some alert messages.

But are there any web UI tools where we can monitor failures, see delays in
processing, view error messages, and set up alerts, etc.?

Thanks


Re: failed to run spark sample on windows

2015-09-28 Thread Ted Yu
What version of Hadoop are you using?

Is that version consistent with the one which was used to build Spark 1.4.0?

Cheers

On Mon, Sep 28, 2015 at 4:36 PM, Renyi Xiong  wrote:

> I tried to run HdfsTest sample on windows spark-1.4.0
>
> bin\run-example org.apache.spark.examples.HdfsTest 
>
> but got below exception, any body any idea what was wrong here?
>
> 15/09/28 16:33:56.565 ERROR SparkContext: Error initializing SparkContext.
> java.lang.NullPointerException
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:445)
> at org.apache.hadoop.util.Shell.run(Shell.java:418)
> at
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:739)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:722)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:633)
> at
> org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:467)
> at
> org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:130)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:515)
> at org.apache.spark.examples.HdfsTest$.main(HdfsTest.scala:32)
> at org.apache.spark.examples.HdfsTest.main(HdfsTest.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
> at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>


spark-submit classloader issue...

2015-09-28 Thread Rachana Srivastava
Hello all,

Goal: I want to use APIs from the HttpClient library 4.4.1. I am using the Maven 
Shade plugin to generate the JAR.



Findings: When I run my program as a Java application within Eclipse, everything 
works fine. But when I run the program using spark-submit, I get the 
following error:

URL content Could not initialize class 
org.apache.http.conn.ssl.SSLConnectionSocketFactory

java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.http.conn.ssl.SSLConnectionSocketFactory



When I tried to find which JAR the class is loaded from, it points to a Hadoop 
JAR; I am assuming this is something set by spark-submit.



ClassLoader classLoader = HttpEndPointClient.class.getClassLoader();

URL resource = 
classLoader.getResource("org/apache/http/message/BasicLineFormatter.class");

Prints following jar:

jar:file:/usr/lib/hadoop/lib/httpcore-4.2.5.jar!/org/apache/http/message/BasicLineFormatter.class
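
The same getResource check can be pointed at the class that actually fails to 
initialize; a small sketch reusing the snippet above (HttpEndPointClient is the 
class name taken from that snippet):

// Sketch: print which jars the conflicting classes are loaded from at runtime,
// to see whether the bundled httpclient/httpcore 4.4.1 or the Hadoop-provided
// copies win on the classpath.
ClassLoader classLoader = HttpEndPointClient.class.getClassLoader();
URL sslFactory = classLoader.getResource(
        "org/apache/http/conn/ssl/SSLConnectionSocketFactory.class");
URL lineFormatter = classLoader.getResource(
        "org/apache/http/message/BasicLineFormatter.class");
System.out.println("SSLConnectionSocketFactory loaded from: " + sslFactory);
System.out.println("BasicLineFormatter loaded from: " + lineFormatter);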



After research I found that I can override --conf 
spark.files.userClassPathFirst=true --conf spark.yarn.user.classpath.first=true



But when I do that I get the following error:

ERROR: org.apache.spark.executor.Executor - Exception in task 0.0 in stage 0.0 
(TID 0)

java.io.InvalidClassException: org.apache.spark.scheduler.Task; local class 
incompatible: stream classdesc serialVersionUID = -4703555755588060120, local 
class serialVersionUID = -1589734467697262504

at 
java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)

at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)

at 
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)

at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)

at 
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)

at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)

at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)

at 
java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)

at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)

at 
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)

at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)



I am running on CDH 5.4. Here is my complete pom file:



<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>test</groupId>
  <artifactId>test</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpcore</artifactId>
      <version>4.4.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.4.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.5.0</version>
      <exclusions>
        <exclusion>
          <artifactId>httpcore</artifactId>
          <groupId>org.apache.httpcomponents</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.5.0</version>
      <exclusions>
        <exclusion>
          <artifactId>httpcore</artifactId>
          <groupId>org.apache.httpcomponents</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.0</version>
      <exclusions>
        <exclusion>
          <artifactId>httpcore</artifactId>
          <groupId>org.apache.httpcomponents</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>

Re: Using scala-2.11 when making changes to spark source

2015-09-28 Thread Stephen Boesch
The effects of changing the pom.xml extend beyond cases in which we wish to
modify Spark itself. In addition, when git pull'ing from trunk we need to
either stash or roll back the changes before rebasing.

An effort to look into a better solution (possibly including evaluating Ted
Yu's suggested approach) might be considered?

2015-09-20 9:12 GMT-07:00 Ted Yu :

> Maybe the following can be used for changing Scala version:
> http://maven.apache.org/archetype/maven-archetype-plugin/
>
> I played with it a little bit but didn't get far.
>
> FYI
>
> On Sun, Sep 20, 2015 at 6:18 AM, Stephen Boesch  wrote:
>
>>
>> The dev/change-scala-version.sh [2.11]  script modifies in-place  the
>> pom.xml files across all of the modules.  This is a git-visible change.  So
>> if we wish to make changes to spark source in our own fork's - while
>> developing with scala 2.11 - we would end up conflating those updates with
>> our own.
>>
>> A possible scenario would be to update .gitignore - by adding pom.xml.
>> However I can not get that to work: .gitignore is tricky.
>>
>> Suggestions appreciated.
>>
>
>


Re: [VOTE] Release Apache Spark 1.5.1 (RC1)

2015-09-28 Thread james
+1 

1) Build binary instruction: ./make-distribution.sh --tgz --skip-java-test
-Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver
-DskipTests
2) Run Spark SQL with YARN client mode

This 1.5.1 RC1 package has better test results than the previous 1.5.0, except
for the open issues SPARK-10484 and SPARK-4266.





--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tp14310p14388.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.




Re: [VOTE] Release Apache Spark 1.5.1 (RC1)

2015-09-28 Thread Jerry Lam
Hi Spark Developers,

The Spark 1.5.1 documentation is already publicly accessible (
https://spark.apache.org/docs/latest/index.html) but the release is not. Is
it intentional?

Best Regards,

Jerry

On Mon, Sep 28, 2015 at 9:21 AM, james  wrote:

> +1
>
> 1) Build binary instruction: ./make-distribution.sh --tgz --skip-java-test
> -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver
> -DskipTests
> 2) Run Spark SQL with YARN client mode
>
> This 1.5.1 RC1 package have better test results than previous 1.5.0 except
> for Spark-10484,Spark-4266 open issue.
>
>
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tp14310p14388.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: [VOTE] Release Apache Spark 1.5.1 (RC1)

2015-09-28 Thread Sean Owen
It's on Maven Central already. These various updates have to happen in
some order, and you'll probably see an inconsistent state for a day or
so while things get slowly updated. Consider it released when there's
an announcement, I suppose.

On Mon, Sep 28, 2015 at 11:07 PM, Jerry Lam  wrote:
> Hi Spark Developers,
>
> The Spark 1.5.1 documentation is already publicly accessible
> (https://spark.apache.org/docs/latest/index.html) but the release is not. Is
> it intentional?
>
> Best Regards,
>
> Jerry
>
> On Mon, Sep 28, 2015 at 9:21 AM, james  wrote:
>>
>> +1
>>
>> 1) Build binary instruction: ./make-distribution.sh --tgz --skip-java-test
>> -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver
>> -DskipTests
>> 2) Run Spark SQL with YARN client mode
>>
>> This 1.5.1 RC1 package have better test results than previous 1.5.0 except
>> for Spark-10484,Spark-4266 open issue.
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tp14310p14388.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>
