[ANNOUNCE] Announcing Apache Spark 2.1.0

2016-12-29 Thread Yin Huai
Hi all, Apache Spark 2.1.0 is the second release of the Spark 2.x line. This release makes significant strides in the production readiness of Structured Streaming, with added support for event time watermarks <https://spark.apache.org/docs/2.1.0/structured-streaming-programming-guide.html#handl

Re: Apache Spark or Spark-Cassandra-Connector doesn't look like it is reading multiple partitions in parallel.

2016-11-26 Thread kant kodali
https://spark.apache.org/docs/2.0.0/api/java/org/apache/spark/sql/DataFrameReader.html#json(org.apache.spark.rdd.RDD) You can pass an RDD to spark.read.json. // Spark here is SparkSession Also, it works completely fine with a smaller dataset in a table, but with 1B records it takes forever and more
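
For reference, a minimal sketch of the spark.read.json(RDD[String]) overload linked above, runnable in the 2.0 spark-shell (the sample records are made up):

    import org.apache.spark.rdd.RDD
    // `spark` is the SparkSession provided by spark-shell
    val jsonRdd: RDD[String] = spark.sparkContext.parallelize(Seq(
      """{"id": 1, "test": "a"}""",
      """{"id": 2, "test": "b"}"""))
    val df = spark.read.json(jsonRdd)   // parses each string as one JSON record
    df.show()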

Re: Apache Spark or Spark-Cassandra-Connector doesn't look like it is reading multiple partitions in parallel.

2016-11-26 Thread Anastasios Zouzias
I guess you might want to first store the RDD as a text file on HDFS and then read it using spark.read.json. Cheers, Anastasios On Sat, Nov 26, 2016 at 9:34 AM, kant kodali <kanth...@gmail.com> wrote: > <http://stackoverflow.com/questions/40797231/apa
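
A sketch of the suggested detour through HDFS, assuming the same spark-shell session (the path is illustrative):

    // write the JSON strings out as text, then let spark.read.json
    // infer the schema from the files
    jsonRdd.saveAsTextFile("hdfs:///tmp/blocks-json")
    val df = spark.read.json("hdfs:///tmp/blocks-json")
    df.printSchema()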

Apache Spark or Spark-Cassandra-Connector doesn't look like it is reading multiple partitions in parallel.

2016-11-26 Thread kant kodali
<http://stackoverflow.com/questions/40797231/apache-spark-or-spark-cassandra-connector-doesnt-look-like-it-is-reading-multipl?noredirect=1#> Apache Spark or Spark-Cassandra-Connector doesn't look like it is reading multiple partitions in parallel. Here is my code

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread kant kodali
We have an 8-node Cassandra cluster. Replication factor: 3; consistency level: QUORUM. Data spread: I can let you know once I get access to our production cluster. The use case for a simple count is more for internal use than for end clients/customers; however, there are many use cases from customers

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread Jörn Franke
I am not sure what use case you want to demonstrate with select count in general. Maybe you can elaborate more on what your use case is. Aside from this, this is a Cassandra issue. What is the setup of Cassandra? Dedicated nodes? How many? Replication strategy? Consistency configuration? How is

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread kant kodali
Some accurate numbers here: it took me 1 hr 30 mins to count 698,705,723 rows (~700 million), and my code is just this: sc.cassandraTable("cuneiform", "blocks").cassandraCount On Thu, Nov 24, 2016 at 10:48 AM, kant kodali wrote: > Take a look at this
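
For context: with the connector's implicits imported, sc.cassandraTable returns a CassandraRDD whose cassandraCount() pushes the counting down to Cassandra instead of materializing every row in Spark. A minimal sketch using the keyspace and table named in the thread:

    import com.datastax.spark.connector._   // adds sc.cassandraTable and cassandraCount

    val rows = sc.cassandraTable("cuneiform", "blocks").cassandraCount()
    println(s"row count: $rows")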

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread kant kodali
Take a look at this https://github.com/brianmhess/cassandra-count Now it is just a matter of incorporating it into spark-cassandra-connector, I guess. On Thu, Nov 24, 2016 at 1:01 AM, kant kodali wrote: > According to this link https://github.com/datastax/ >

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread kant kodali
According to this link https://github.com/datastax/spark-cassandra-connector/blob/master/doc/3_selection.md I tried the following but it still looks like it is taking forever sc.cassandraTable(keyspace, table).cassandraCount On Thu, Nov 24, 2016 at 12:56 AM, kant kodali

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread kant kodali
I would be glad if SELECT COUNT(*) FROM hello could return any value for that size :) I can say for sure it didn't return anything for 30 mins, and I probably need to build more patience to sit for a few more hours after that! Cassandra recommends using ColumnFamilyStats via nodetool cfstats, which

Re: Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread Anastasios Zouzias
How fast is Cassandra without Spark on the count operation? cqlsh> SELECT COUNT(*) FROM hello (this is not equivalent to what you are doing, but might help you find the root cause) On Thu, Nov 24, 2016 at 9:03 AM, kant kodali wrote: > I have the following code > > I

Apache Spark SQL is taking forever to count billion rows from Cassandra?

2016-11-24 Thread kant kodali
I have the following code. I invoke spark-shell as follows: ./spark-shell --conf spark.cassandra.connection.host=170.99.99.134 --executor-memory 15G --executor-cores 12 --conf spark.cassandra.input.split.size_in_mb=67108864 code scala> val df = spark.sql("SELECT test from hello") //
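
Worth noting: spark.cassandra.input.split.size_in_mb is measured in megabytes, and 67108864 is 64 MB expressed in bytes, so the setting above asks for ~64 TB per split and would collapse the table into very few partitions. A sketch of the same invocation with a conventional split size (64 MB is the connector default):

    ./spark-shell --conf spark.cassandra.connection.host=170.99.99.134 \
      --executor-memory 15G --executor-cores 12 \
      --conf spark.cassandra.input.split.size_in_mb=64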

RE: submitting a spark job using yarn-client and getting NoClassDefFoundError: org/apache/spark/Logging

2016-11-16 Thread David Robison
I’ve gotten a little further along. It now submits the job via Yarn, but the jobs exit immediately with the following error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging at java.lang.ClassLoader.defineClass1(Nat

[ANNOUNCE] Apache Spark 2.0.2

2016-11-14 Thread Reynold Xin
We are happy to announce the availability of Spark 2.0.2! Apache Spark 2.0.2 is a maintenance release containing 90 bug fixes along with Kafka 0.10 support and runtime metrics for Structured Streaming. This release is based on the branch-2.0 maintenance branch of Spark. We strongly recommend all

Re: Using Apache Spark Streaming - how to handle changing data format within stream

2016-11-09 Thread coolgar
").foreach { line => if // it's a header parser = someParserBasedOn(line) else items += parser.parse(line) } items.iterator } -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-Apache-Spark-Streaming-how-to-handle-changing-da

[ANNOUNCE] Announcing Apache Spark 1.6.3

2016-11-07 Thread Reynold Xin
We are happy to announce the availability of Spark 1.6.3! This maintenance release includes fixes across several areas of Spark, and we encourage users on the 1.6.x line to upgrade to 1.6.3. Head to the project's download page to download the new version: http://spark.apache.org/downloads.html

Re: Using Apache Spark Streaming - how to handle changing data format within stream

2016-11-07 Thread Cody Koeninger
if // it's a header parser = someParserBasedOn(line) else items += parser.parse(line) } items.iterator } On Mon, Nov 7, 2016 at 4:22 PM, coolgar <karllbunn...@gmail.com> wrote: > I'm using apache spark streaming with the kafka direct consumer. The data > stre
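
A fleshed-out sketch of this per-partition approach; the Record type, header test, and parser factory are hypothetical stand-ins for the real log format:

    case class Record(fields: Array[String])
    trait LineParser extends Serializable { def parse(line: String): Record }

    // assumptions: headers start with '#' and data lines are comma-separated
    def isHeader(line: String): Boolean = line.startsWith("#")
    def someParserBasedOn(header: String): LineParser = new LineParser {
      def parse(line: String): Record = Record(line.split(","))
    }

    val logLines = sc.parallelize(Seq("#header-v1", "a,b", "c,d"))
    val parsed = logLines.mapPartitions { lines =>
      var parser: LineParser = null
      val items = scala.collection.mutable.ArrayBuffer.empty[Record]
      lines.foreach { line =>
        if (isHeader(line)) parser = someParserBasedOn(line)
        else if (parser != null) items += parser.parse(line) // header must precede its block
      }
      items.iterator
    }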

Using Apache Spark Streaming - how to handle changing data format within stream

2016-11-07 Thread coolgar
I'm using Apache Spark Streaming with the Kafka direct consumer. The data stream I'm receiving is log data that includes a header with each block of messages. Each DStream can therefore have many blocks of messages, each with its own header. The header is used to know how to interpret

why is that two stages in apache spark are computing same thing?

2016-10-24 Thread maitraythaker
I have a Spark optimization query that I have posted on StackOverflow; any guidance on this would be appreciated. Please follow the link below; I have explained the problem in depth there, with code. http://stackoverflow.com/questions/40192302/why-is-that-two-stages-in-apache-spark-are-computing

why is that two stages in apache spark are computing same thing?

2016-10-22 Thread maitraythaker
I have a Spark optimization query that I have posted on StackOverflow; any guidance on this would be appreciated. Please follow the link below; I have explained the problem in depth there, with code. http://stackoverflow.com/questions/40192302/why-is-that-two-stages-in-apache-spark-are-computing

Re: NoClassDefFoundError: org/apache/spark/Logging in SparkSession.getOrCreate

2016-10-17 Thread Saisai Shao
Not sure why your code searches for the Logging class under org/apache/spark; it should be “org/apache/spark/internal/Logging”, and it changed a long time ago. On Sun, Oct 16, 2016 at 3:25 AM, Brad Cox <bradj...@gmail.com> wrote: > I'm experimenting with Spark 2.0.1 for the first time an
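
In practice this error usually means a library compiled against Spark 1.x (where org.apache.spark.Logging was still public) is running on a Spark 2.x classpath. A minimal build sanity check, assuming sbt (the version number is illustrative):

    // build.sbt: pin every Spark artifact to the same 2.x version
    val sparkVersion = "2.0.1"
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
      "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
    )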

NoClassDefFoundError: org/apache/spark/Logging in SparkSession.getOrCreate

2016-10-15 Thread Brad Cox
() .master("local") .appName("DecisionTreeExample") .getOrCreate(); Running this in the eclipse debugger, execution fails in getOrCreate() with this exception Exception in thread "main" java.lang.NoC

Re: java.lang.NoClassDefFoundError: org/apache/spark/sql/Dataset

2016-10-07 Thread kant kodali
perfect! That fixes it all! On Fri, Oct 7, 2016 1:29 AM, Denis Bolshakov bolshakov.de...@gmail.com wrote: You need spark-sql; right now you are missing it. On Oct 7, 2016, 11:12, "kant kodali" wrote: Here are the jar files on my classpath after doing a
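
For anyone hitting the same error: org.apache.spark.sql.Dataset ships in the spark-sql artifact, so the fix is a one-line dependency, sketched here for sbt (the version is illustrative):

    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"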

Re: java.lang.NoClassDefFoundError: org/apache/spark/sql/Dataset

2016-10-05 Thread kant kodali
I am running locally, so they are all on one host. On Wed, Oct 5, 2016 3:12 PM, Jakob Odersky ja...@odersky.com wrote: Are all spark and scala versions the same? By "all" I mean the master, worker and driver instances.

Re: Apache Spark JavaRDD pipe() need help

2016-09-23 Thread शशिकांत कुलकर्णी
s sparkContext > > object as app will throw error of "task not serializable". If there is a > way > > let me know. > > > > Now I am not able to achieve STEP 3 above. How can I pass a String to C > > binary and get the output back in my program. The C binar
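
A minimal sketch of the pipe() pattern under discussion; the binary path is illustrative, and the executable is assumed to read lines from STDIN and write one result line per input to STDOUT:

    val input = sc.parallelize(Seq("record-1", "record-2"))
    val piped = input.pipe("/usr/local/bin/analyzer")   // forks one process per partition
    piped.collect().foreach(println)

Because pipe() launches the external process on each executor and streams the partition's elements through it, nothing from the driver (such as the SparkContext) needs to be serialized into the closure.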

Re: Apache Spark JavaRDD pipe() need help

2016-09-22 Thread Jakob Odersky
get the output back in my program. The C binary reads data from > STDIN and outputs data to STDOUT. It is working from other part of > application from PHP. I want to reuse the same C binary in my Apache SPARK > application for some background processing and analysis using JavaRDD

Re: Apache Spark JavaRDD pipe() need help

2016-09-22 Thread शशिकांत कुलकर्णी
am. The C binary reads data from STDIN and outputs data to STDOUT. It is working from other part of application from PHP. I want to reuse the same C binary in my Apache SPARK application for some background processing and analysis using JavaRDD.pipe() API. If there is any other way let me know. T

Re: Apache Spark JavaRDD pipe() need help

2016-09-21 Thread Jakob Odersky
Can you provide more details? It's unclear what you're asking On Wed, Sep 21, 2016 at 10:14 AM, shashikant.kulka...@gmail.com wrote: > Hi All, > > I am trying to use the JavaRDD.pipe() API. > > I have one object with me from the JavaRDD

Apache Spark JavaRDD pipe() need help

2016-09-21 Thread shashikant.kulka...@gmail.com
w if you need more inputs from me. Thanks in advance. Shashi -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-JavaRDD-pipe-need-help-tp27772.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Apache Spark 2.0.0 on Microsoft Windows Create Dataframe

2016-09-16 Thread Jacek Laskowski
Apache Spark 2.0 http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski On Fri, Sep 16, 2016 at 9:47 PM, Advait Mohan Raut <adv...@essexlg.com> wrote: > Hi > > I am trying to run Spark 2.0.0 in the Microsoft Windows environment without > hadoop or

Apache Spark 2.0.0 on Microsoft Windows Create Dataframe

2016-09-16 Thread Advait Mohan Raut
ailable to configure Spark 2.0.0 on Windows for the above-mentioned environment? Or is it not supported? Your inputs will be appreciated. Source: http://stackoverflow.com/questions/39538544/apache-spark-2-0-0-on-microsoft-windows-create-dataframe Similar Post: http://stackoverflow.com/questions/3

How do we process/scale variable size batches in Apache Spark Streaming

2016-08-23 Thread Rachana Srivastava
I am running a spark streaming process where I am getting a batch of data after n seconds. I am using repartition to scale the application. Since the repartition size is fixed, we are getting lots of small files when the batch size is very small. Is there any way I can change the partitioner logic
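
One option, sketched here rather than taken from the poster's code: derive the partition count from each batch's size so that small batches produce few files. `stream` stands for the existing DStream[String]; the target and output path are illustrative:

    val targetRecordsPerPartition = 10000L
    stream.foreachRDD { (rdd, time) =>
      // rdd.count() costs one extra pass over the batch
      val parts = math.max(1, (rdd.count() / targetRecordsPerPartition).toInt)
      rdd.coalesce(parts).saveAsTextFile(s"/data/batches/${time.milliseconds}")
    }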

Things to do learn Cassandra in Apache Spark Environment

2016-08-23 Thread Gokula Krishnan D
Hello all - hope you are doing well. I have a general question. I am working on Hadoop using Apache Spark. At this moment we are not using Cassandra, but I would like to know the scope for learning and using it in the Hadoop environment. It would be great if you could provide the use

Re: Apache Spark toDebugString producing different output for python and scala repl

2016-08-15 Thread Saisai Shao
The implementations behind the Python and Scala RDD APIs differ slightly, so the difference in the RDD lineage you printed is expected. On Tue, Aug 16, 2016 at 10:58 AM, DEEPAK SHARMA wrote: > Hi All, > > > Below is the small piece of code in scala and

Re: Apache Spark toDebugString producing different output for python and scala repl

2016-08-15 Thread DEEPAK SHARMA
Hi All, Below is a small piece of code in the Scala and Python REPLs in Apache Spark. However, I am getting different output in the two languages when I execute toDebugString. I am using the Cloudera QuickStart VM. PYTHON rdd2 =

Re: Source format for Apache Spark logo

2016-08-08 Thread Sean Owen
n S > > > > > > On 08-Aug-2016, at 5:24 PM, michael.ar...@gdata-adan.de wrote: > > Hi, > > for a presentation I’d apreciate a vector version of the Apache Spark logo, > unfortunately I cannot find it. Is the Logo available in a vector format > somewhere? >

Source format for Apache Spark logo

2016-08-08 Thread Michael.Arndt
Hi, for a presentation I'd appreciate a vector version of the Apache Spark logo; unfortunately I cannot find it. Is the logo available in a vector format somewhere?

How to connect Power BI to Apache Spark on local machine?

2016-08-04 Thread Devi P.V
Hi all, I am a newbie to Power BI. What configuration is needed to connect Power BI to Spark on my local machine? I found some documents that mention Spark over Azure's HDInsight, but didn't find any reference material for connecting to Spark on a remote machine. Is it possible? Following is the

FW: [jupyter] newbie. apache spark python3 'Jupyter' data frame problem with auto completion and accessing documentation

2016-08-02 Thread Andy Davidson
FYI From: <jupy...@googlegroups.com> on behalf of Thomas Kluyver <tak...@gmail.com> Reply-To: <jupy...@googlegroups.com> Date: Tuesday, August 2, 2016 at 3:26 AM To: Project Jupyter <jupy...@googlegroups.com> Subject: Re: [jupyter] newbie. apache spark python3 'Jup

Re:Re:Re: [ANNOUNCE] Announcing Apache Spark 2.0.0

2016-07-27 Thread prosp4300
Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Wed, Jul 27, 2016 at 9:00 AM, Reynold Xin <r...@databricks.com> wrote: Hi all, Apache Spark 2.0.0 is the first release of Spark 2.x line. It includes 2500+ patches from 300+ contributors. To download Spark 2.0, head over

Re:Re: [ANNOUNCE] Announcing Apache Spark 2.0.0

2016-07-27 Thread prosp4300
red-streaming-programming-guide.html Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Wed, Jul 27, 2016 at 9:00 AM, Reynold Xin <r...@databricks.com> wrote: Hi all, Apache Spark 2.0.0 is the first release of Spark 2.x line. It

Re: [ANNOUNCE] Announcing Apache Spark 2.0.0

2016-07-27 Thread Ofir Manor
ml Ofir Manor Co-Founder & CTO | Equalum Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io On Wed, Jul 27, 2016 at 9:00 AM, Reynold Xin <r...@databricks.com> wrote: > Hi all, > > Apache Spark 2.0.0 is the first release of Spark 2.x line. It includes > 2500+ patches fro

Re:[ANNOUNCE] Announcing Apache Spark 2.0.0

2016-07-27 Thread prosp4300
Congratulations! On 2016-07-27 14:00:22, "Reynold Xin" <r...@databricks.com> wrote: Hi all, Apache Spark 2.0.0 is the first release of the Spark 2.x line. It includes 2500+ patches from 300+ contributors. To download Spark 2.0, head over to the download page: http://

[ANNOUNCE] Announcing Apache Spark 2.0.0

2016-07-27 Thread Reynold Xin
Hi all, Apache Spark 2.0.0 is the first release of the Spark 2.x line. It includes 2500+ patches from 300+ contributors. To download Spark 2.0, head over to the download page: http://spark.apache.org/downloads.html To view the release notes: http://spark.apache.org/releases/spark-release-2-0-0.html

What is the maximum number of column being supported by apache spark dataframe

2016-07-11 Thread Zijing Guo
Hi all, Spark version: 1.5.2 with YARN 2.7.1.2.3.0.0-2557. I'm running into a problem while exploring the data through spark-shell: I'm trying to create a really fat dataframe with 3000 columns. Code as below: val valueFunctionUDF = udf((valMap: Map[String, String], dataItemId:
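
The DataFrame API does not document a fixed column cap; wide schemas usually run into plan-size and code-generation overhead instead. A quick probe for 1.5.x, assuming nothing about the poster's data (building all columns in one select avoids growing the plan column by column):

    import org.apache.spark.sql.functions.lit

    val cols = (1 to 3000).map(i => lit(i).as(s"c$i"))
    val wide = sqlContext.range(0, 10).select(cols: _*)
    println(wide.columns.length)   // 3000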

RE: AMQP extension for Apache Spark Streaming (messaging/IoT)

2016-07-03 Thread Darren Govoni
This is fantastic news. Sent from my Verizon 4G LTE smartphone Original message From: Paolo Patierno <ppatie...@live.com> Date: 7/3/16 4:41 AM (GMT-05:00) To: user@spark.apache.org Subject: AMQP extension for Apache Spark Streaming (messaging/IoT) Hi all

AMQP extension for Apache Spark Streaming (messaging/IoT)

2016-07-03 Thread Paolo Patierno
Hi all, I'm working on an AMQP extension for Apache Spark Streaming, developing a reliable receiver for that. After MQTT support (I see it in the Apache Bahir repository), another messaging/IoT protocol could be very useful for the Apache Spark Streaming ecosystem. Out there a lot

Apache Spark Is Hanging when fetch data from SQL Server 2008

2016-06-29 Thread Gastón Schabas
Hi everyone. I'm experiencing an issue when I try to fetch data from SQL Server. This is my context: Ubuntu 14.04 LTS, Apache Spark 1.4.0, SQL Server 2008, Scala 2.10.5, sbt 0.13.11. I'm trying to fetch data from a table in SQL Server 2008 that has 85.000.000 records. I only need around 200.000

Databricks' 2016 Survey on Apache Spark

2016-06-23 Thread Jules Damji
Hi All, We at Databricks are running a short survey to understand users’ needs and usage of Apache Spark. Because we value community feedback, this survey will help us both to understand usage of Spark and to direct our future contributions to it. If you have a moment, please take some time

Re: Kerberos setup in Apache spark connecting to remote HDFS/Yarn

2016-06-17 Thread Sudarshan Rangarajan
> renewer -> [Help 1] > > > > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Kerberos-setup-in-Apache-spark-connecting-to-remote-HDFS-Yarn-tp27181p27189.html > Sent from the

Re: Kerberos setup in Apache spark connecting to remote HDFS/Yarn

2016-06-17 Thread akhandeshi
A little more progress... I added a few environment variables; now I get the following error message: InvocationTargetException: Can't get Master Kerberos principal for use as renewer -> [Help 1] -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kerberos-se

Re: Kerberos setup in Apache spark connecting to remote HDFS/Yarn

2016-06-16 Thread akhandeshi
by: KrbException: Cannot locate default realm at sun.security.krb5.Config.getDefaultRealm(Config.java:1029) I did add krb5.config to classpath as well as define KRB5_CONFIG -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kerberos-setup-in-Apache-spark
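
One assumption worth checking: the KRB5_CONFIG environment variable is read by the MIT Kerberos C library, not by the JVM, which looks at the java.security.krb5.conf system property instead. A sketch of passing it explicitly (paths, class, and jar are illustrative):

    ./spark-submit \
      --conf spark.driver.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf \
      --conf spark.executor.extraJavaOptions=-Djava.security.krb5.conf=/etc/krb5.conf \
      --class com.example.SparkYarn app.jar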

Re: Kerberos setup in Apache spark connecting to remote HDFS/Yarn

2016-06-16 Thread Ami Khandeshi
at org.apache.spark.examples.SparkYarn.main(SparkYarn.scala) >> ... 6 more >> Caused by: java.lang.reflect.InvocationTargetException >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>

Re: Kerberos setup in Apache spark connecting to remote HDFS/Yarn

2016-06-16 Thread Ted Yu
un.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > > org.apache.hadoop.security.authentication.uti

Kerberos setup in Apache spark connecting to remote HDFS/Yarn

2016-06-16 Thread akhandeshi
at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:63) ... 11 more Caused by: KrbException: Cannot locate default realm -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Kerberos-setup-in-A

Re: RESOLVED - Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Mich Talebzadeh
anonfun$process$1.apply(ILoop.scala:837) >>>> at >>>> >>>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) >>>> at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837) >>>> at s

RESOLVED - Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
7) >>>> at >>>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) >>>> at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837) >>>> at scala.tools.nsc.interpreter.ILoop.main(ILoop.s

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
soleRunner.java:64) >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>> at >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMeth

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Todd Nist
tbrains.plugins.scala.compiler.rt.ConsoleRunner.main(ConsoleRunner.java:64) >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>> at >>> >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>> at >>

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
va:483) >> at >> com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) >> >> As for the Spark configuration: >> >>val conf: SparkConf = new >> SparkConf().setAppName("AppName").setMaster("local[2]") >>

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Todd Nist
>> As for the Spark configuration: >> >>val conf: SparkConf = new >> SparkConf().setAppName("AppName").setMaster("local[2]") >> >> val confParams: Map[String, String] = Map( >> "metadata.broker.list" -> ":

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
(conf, Seconds(1)) > val kafkaStream = KafkaUtils.createDirectStream(context,confParams, > topics) > > kafkaStream.foreachRDD(rdd => { > rdd.collect().foreach(println) > }) > > context.awaitTermination() > context.start() > > The

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Jacek Laskowski
KafkaUtils.createDirectStream(context,confParams, > topics) > > kafkaStream.foreachRDD(rdd => { > rdd.collect().foreach(println) > }) > > context.awaitTermination() > context.start() > > The Kafka topic does exist, Kafka server is up and running a
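
One detail visible in the quoted code, independent of the leader-offset error: awaitTermination() is called before start(), but a StreamingContext must be started first or the job never consumes anything. A corrected minimal sketch against the 0.8 direct-stream API (broker and topic names are illustrative):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils
    import kafka.serializer.StringDecoder

    val conf = new SparkConf().setAppName("AppName").setMaster("local[2]")
    val context = new StreamingContext(conf, Seconds(1))
    val confParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics = Set("mytopic")
    val kafkaStream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](context, confParams, topics)
    kafkaStream.foreachRDD(rdd => rdd.collect().foreach(println))
    context.start()              // start first...
    context.awaitTermination()   // ...then block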

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Mich Talebzadeh
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>> at >>>> >>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>>> at >>>> >>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(Delegati

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
17) >>> at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581) >>> at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588) >>> at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591) >>> at >>> scala.tools.nsc.inte

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Mich Talebzadeh
terpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882) >>> at >>> >>> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837) >>> at >>> >>> scala.tools.nsc.interpreter.ILoop$$anonfun$process$

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
ect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.l

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Mich Talebzadeh
ke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:483) >> at >> com.intellij.rt.execution.application.AppMain.main(AppMain.java:144) >> >> As for the Spark configuration: >> >> val conf: Spark

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
ontext: StreamingContext = new StreamingContext(conf, Seconds(1)) > val kafkaStream = KafkaUtils.createDirectStream(context,confParams, > topics) > > kafkaStream.foreachRDD(rdd => { > rdd.collect().foreach(println) > }) > > context.awaitTermin

Re: Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Mich Talebzadeh
reamingContext = new StreamingContext(conf, Seconds(1)) > val kafkaStream = KafkaUtils.createDirectStream(context,confParams, > topics) > > kafkaStream.foreachRDD(rdd => { > rdd.collect().foreach(println) > }) > > context.awaitTermination() > c

Apache Spark Kafka Integration - org.apache.spark.SparkException: Couldn't find leader offsets for Set()

2016-06-07 Thread Dominik Safaric
t the problem actually be? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Kafka-Integration-org-apache-spark-SparkException-Couldn-t-find-leader-offsets-for-Set-tp271

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread Marcelo Vanzin
On Mon, Jun 6, 2016 at 12:31 PM, verylucky Man <verylucky...@gmail.com> >> wrote: >> > Hi, >> > >> > I have a cluster (Hortonworks supported system) running Apache spark on >> > 1.5.2 on Java 7, installed by admin. Java 8 is also installed. >>

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread verylucky...@gmail.com
Man <verylucky...@gmail.com> > wrote: > > Hi, > > > > I have a cluster (Hortonworks supported system) running Apache spark on > > 1.5.2 on Java 7, installed by admin. Java 8 is also installed. > > > > I don't have admin access to this cluster and would lik

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread Marcelo Vanzin
algorithms supported by jdk8. On Mon, Jun 6, 2016 at 12:31 PM, verylucky Man <verylucky...@gmail.com> wrote: > Hi, > > I have a cluster (Hortonworks supported system) running Apache spark on > 1.5.2 on Java 7, installed by admin. Java 8 is also installed. > > I don't have admin acc

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread verylucky...@gmail.com
e understand how >>>> security is managed in Spark and how changing from java 7 to 8 can mess up >>>> these configurations? >>>> >>>> >>>> Thank you! >>>> >>>> On Mon, Jun 6, 2016 at 2:37 PM, Ted Yu <yuzhi

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread Marco Mistroni
;>> these configurations? >>> >>> >>> Thank you! >>> >>> On Mon, Jun 6, 2016 at 2:37 PM, Ted Yu <yuzhih...@gmail.com> wrote: >>> >>>> Have you seen this ? >>>> >>>> >>>> http://stackov

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread Koert Kuipers
> >> On Mon, Jun 6, 2016 at 2:37 PM, Ted Yu <yuzhih...@gmail.com> wrote: >> >>> Have you seen this ? >>> >>> >>> http://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation >>> >>> On Mon, Jun 6, 2016 at

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread Ted Yu
2:37 PM, Ted Yu <yuzhih...@gmail.com> wrote: > >> Have you seen this ? >> >> >> http://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation >> >> On Mon, Jun 6, 2016 at 12:31 PM, verylucky Man <verylucky...@gmail.com> >> wrote: &

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread verylucky...@gmail.com
hih...@gmail.com> wrote: > Have you seen this ? > > > http://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation > > On Mon, Jun 6, 2016 at 12:31 PM, verylucky Man <verylucky...@gmail.com> > wrote: > >> Hi, >> >> I have a clust

Re: Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread Ted Yu
Have you seen this ? http://stackoverflow.com/questions/22423063/java-exception-on-sslsocket-creation On Mon, Jun 6, 2016 at 12:31 PM, verylucky Man <verylucky...@gmail.com> wrote: > Hi, > > I have a cluster (Hortonworks supported system) running Apache spark on > 1.5.2 on

Apache Spark security.NosuchAlgorithm exception on changing from java 7 to java 8

2016-06-06 Thread verylucky Man
Hi, I have a cluster (Hortonworks supported system) running Apache Spark 1.5.2 on Java 7, installed by the admin. Java 8 is also installed. I don't have admin access to this cluster and would like to run Spark (1.5.2 and later versions) on Java 8. I come from an HPC/MPI background, so I naively

Apache Spark Video Processing from NFS Shared storage: Advise needed

2016-05-26 Thread mobcdi
. Is that to be expected, and if I do need to spin up Hadoop, can I double up the 20 VMs by running both Hadoop and Spark on all 20 machines, or would the recommendation be to split them into separate Hadoop and Spark clusters? Michael -- View this message in context: http://apache-spark-user-list.1001560.n3

Re: Apache Spark Slack

2016-05-16 Thread Matei Zaharia
mailing list and JIRA so that they can easily be archived and found afterward. Matei > On May 16, 2016, at 1:06 PM, Dood@ODDO <oddodao...@gmail.com> wrote: > > On 5/16/2016 9:52 AM, Xinh Huynh wrote: >> I just went to IRC. It looks like the correct channel is #apache-spark. >&g

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 9:52 AM, Xinh Huynh wrote: I just went to IRC. It looks like the correct channel is #apache-spark. So, is this an "official" chat room for Spark? Ah yes, my apologies, it is #apache-spark indeed. Not sure if there is an official channel on IRC

Re: Apache Spark Slack

2016-05-16 Thread Xinh Huynh
I just went to IRC. It looks like the correct channel is #apache-spark. So, is this an "official" chat room for Spark? Xinh On Mon, May 16, 2016 at 9:35 AM, Dood@ODDO <oddodao...@gmail.com> wrote: > On 5/16/2016 9:30 AM, Paweł Szulc wrote: > >> >> Just reali

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 9:30 AM, Paweł Szulc wrote: Just realized that people have to be invited to this thing. You see, that's why Gitter is just simpler. I will try to figure it out ASAP You don't need invitations to IRC and it has been around for decades. You can just go to webchat.freenode.net

Re: Apache Spark Slack

2016-05-16 Thread Paweł Szulc
Just realized that people have to be invited to this thing. You see, that's why Gitter is just simpler. I will try to figure it out ASAP. On May 16, 2016, 15:40, "Paweł Szulc" wrote: > I've just created this https://apache-spark.slack.com for ad-hoc > communications within

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 6:40 AM, Paweł Szulc wrote: I've just created this https://apache-spark.slack.com for ad-hoc communications within the community. Everybody's welcome! Why not just IRC? Slack is yet another place to create an account etc. - IRC is much easier. What does Slack give you that's so

Re: apache spark on gitter?

2016-05-16 Thread Sean Owen
, even as it is possible to make it clear it does refer to Apache Spark. Since this has come up in recent memory, I have a good link handy for the interested: http://www.apache.org/foundation/marks/ On Mon, May 16, 2016 at 2:41 PM, Paweł Szulc <paul.sz...@gmail.com> wrote: > I've jus

Apache Spark Slack

2016-05-16 Thread Paweł Szulc
I've just created this https://apache-spark.slack.com for ad-hoc communications within the community. Everybody's welcome! -- Regards, Paul Szulc twitter: @rabbitonweb blog: www.rabbitonweb.com

Re: apache spark on gitter?

2016-05-16 Thread Paweł Szulc
I've just created https://apache-spark.slack.com On Thu, May 12, 2016 at 9:28 AM, Paweł Szulc wrote: > Hi, > > well, I guess the advantage of Gitter over the mailing list is the same as with > IRC. It's not actually a replacement, because the mailing list is also important. > But it is

Re: apache spark on gitter?

2016-05-12 Thread Xinh Huynh
I agree that it can help build a community and be a place for real-time conversations. Xinh On Thu, May 12, 2016 at 12:28 AM, Paweł Szulc wrote: > Hi, > > well, I guess the advantage of Gitter over the mailing list is the same as with > IRC. It's not actually a replacement, because

Re: apache spark on gitter?

2016-05-12 Thread Paweł Szulc
Hi, well, I guess the advantage of Gitter over the mailing list is the same as with IRC. It's not actually a replacement, because the mailing list is also important. But it is a lot easier to build a community around a tool with the ad-hoc ability to connect with each other. I have Gitter running constantly; I

Re: apache spark on gitter?

2016-05-11 Thread Xinh Huynh
Hi Pawel, I'd like to hear more about your idea. Could you explain more why you would like to have a gitter channel? What are the advantages over a mailing list (like this one)? Have you had good experiences using gitter on other open source projects? Xinh On Wed, May 11, 2016 at 11:10 AM, Sean

Re: apache spark on gitter?

2016-05-11 Thread Sean Owen
I don't know of a gitter channel and I don't use it myself, FWIW. I think anyone's welcome to start one. I hesitate to recommend this, simply because it's preferable to have one place for discussion rather than split it over several, and, we have to keep the @spark.apache.org mailing lists as the

Re: apache spark on gitter?

2016-05-11 Thread Paweł Szulc
No answer, but maybe one more time: a Gitter channel for Spark users would be a good idea! On Mon, May 9, 2016 at 1:45 PM, Paweł Szulc wrote: > Hi, > > I was wondering - why does Spark not have a Gitter channel? > > -- > Regards, > Paul Szulc > > twitter: @rabbitonweb >

apache spark on gitter?

2016-05-09 Thread Paweł Szulc
Hi, I was wondering - why does Spark not have a Gitter channel? -- Regards, Paul Szulc twitter: @rabbitonweb blog: www.rabbitonweb.com

Re: Kafka exception in Apache Spark

2016-04-26 Thread Cody Koeninger
That error indicates a message bigger than the buffer's capacity https://issues.apache.org/jira/browse/KAFKA-1196 On Tue, Apr 26, 2016 at 3:07 AM, Michel Hubert wrote: > Hi, > > > > > > I use a Kafka direct stream approach. > > My Spark application was running ok. > > This
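
A sketch of the usual remedy: raise the consumer's fetch buffer above the largest expected message via the direct stream's kafkaParams (the 8 MB value and broker address are illustrative):

    val kafkaParams = Map(
      "metadata.broker.list"    -> "broker1:9092",
      "fetch.message.max.bytes" -> (8 * 1024 * 1024).toString)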

RE: Kafka exception in Apache Spark

2016-04-26 Thread Michel Hubert
This is production. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Tuesday, April 26, 2016 12:01 To: Michel Hubert <mich...@phact.nl> CC: user@spark.apache.org Subject: Re: Kafka exception in Apache Spark Hi Michael, Is this production or test? Dr Mich Tale

Re: Kafka exception in Apache Spark

2016-04-26 Thread Mich Talebzadeh
Hi Michael, Is this production or test? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * http://talebzadehmich.wordpress.com On 26 April
