[pyspark] java.lang.NoSuchMethodError: net.jpountz.util.Utils.checkRange error

2017-08-03 Thread kulas...@gmail.com
Hi all, I use spark-streaming 2.2.0 with Python and read data from a Kafka (2.11-0.10.0.0) cluster, following the Kafka integration guide http://spark.apache.org/docs/latest/streaming-kafka-0-8-integration.html, and I submit a python script with spark-submit --jars
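For reference, a minimal sketch of the 0-8 integration path from that guide; the ZooKeeper address, group id and topic name are placeholders, and the assembly jar version passed to --jars must match the Spark/Scala build in use (the class in the error belongs to the lz4 library that both Spark and the Kafka client depend on, so conflicting lz4 jar versions on the classpath are worth checking):

    # Submit with the matching assembly jar, e.g.:
    #   spark-submit --jars spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar my_stream.py
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="KafkaWordCount")
    ssc = StreamingContext(sc, 10)  # 10-second batches

    # "zk-host:2181", "my-group" and "my-topic" are placeholder names
    stream = KafkaUtils.createStream(ssc, "zk-host:2181", "my-group", {"my-topic": 1})
    stream.pprint()

    ssc.start()
    ssc.awaitTermination()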

Unsubscribe

2017-08-03 Thread Parijat Mazumdar
Unsubscribe - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: PySpark Streaming S3 checkpointing

2017-08-03 Thread Riccardo Ferrari
Hi Steve, Thank you for your answer, much appreciated. Reading the code, it seems that: - Python StreamingContext.getOrCreate calls Scala StreamingContextPythonHelper().tryRecoverFromCheckpoint(
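For context, a minimal sketch of the getOrCreate pattern being discussed, with an S3 checkpoint directory; the bucket name and batch interval are placeholders:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    CHECKPOINT = "s3a://my-bucket/streaming-checkpoint"  # placeholder bucket/path

    def create_context():
        sc = SparkContext(appName="S3CheckpointedStream")
        ssc = StreamingContext(sc, 30)
        # ... build the DStream graph here ...
        ssc.checkpoint(CHECKPOINT)
        return ssc

    # On restart this tries to recover the context from the checkpoint directory;
    # otherwise it calls create_context() to build a fresh one.
    ssc = StreamingContext.getOrCreate(CHECKPOINT, create_context)
    ssc.start()
    ssc.awaitTermination()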

Re: SPARK Issue in Standalone cluster

2017-08-03 Thread Marco Mistroni
Hello, my 2 cents here, hope it helps. If you want to just play around with Spark, I'd leave Hadoop out; it's an unnecessary dependency that you don't need for just running a python script. Instead do the following: - go to the root of your master / slave node. create a directory /root/pyscripts -
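As an illustration of that suggestion, a self-contained script that runs on a bare standalone cluster with no Hadoop installation; the paths and master URL are placeholders:

    # /root/pyscripts/wordcount.py -- placeholder path, per the suggestion above
    # Submit with e.g.: spark-submit --master spark://master-host:7077 /root/pyscripts/wordcount.py
    from pyspark import SparkContext

    sc = SparkContext(appName="StandaloneWordCount")
    # Local file, no HDFS needed; note it must exist at the same path on every worker node
    lines = sc.textFile("file:///root/pyscripts/sample.txt")
    counts = (lines.flatMap(lambda l: l.split())
                   .map(lambda w: (w, 1))
                   .reduceByKey(lambda a, b: a + b))
    print(counts.take(10))
    sc.stop()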

SparkEventListener dropping events

2017-08-03 Thread Miles Crawford
We are seeing lots of stability problems with Spark 2.1.1 as a result of dropped events. We disabled the event log, which seemed to help, but many events are still being dropped, as in the example log below. Is there any way for me to see which listener is backing up the queue? Is there any
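One knob that is often suggested for dropped events (headroom, not a guaranteed fix) is the listener bus queue size; a sketch of setting it, assuming the Spark 2.x property name spark.scheduler.listenerbus.eventqueue.size:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("LargerListenerQueue")
            # the default has historically been 10000; raising it only buys headroom,
            # a listener that is consistently too slow will still fall behind
            .set("spark.scheduler.listenerbus.eventqueue.size", "100000"))
    sc = SparkContext(conf=conf)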

Re: DataSet creation not working Spark 1.6.0 , populating wrong data CDH 5.7.1

2017-08-03 Thread Rabin Banerjee
Unfortunately I can't use scala/python. Any solution in Java? On Thu, Aug 3, 2017 at 6:04 PM, Gourav Sengupta wrote: > Guru, > > Anyway you can pick up SCALA or Python. Makes things way easier. The > performance, maintainability, visibility, and minimum translation

Re: Repartitioning Hive tables - Container killed by YARN for exceeding memory limits

2017-08-03 Thread Chetan Khatri
Thanks Holden! On Thu, Aug 3, 2017 at 4:02 AM, Holden Karau wrote: > The memory overhead is based less on the total amount of data and more on > what you end up doing with the data (e.g. if you're doing a lot of off-heap > processing or using Python you need to increase
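For reference, a sketch of raising the overhead being discussed, using the pre-2.3 property name spark.yarn.executor.memoryOverhead; the sizes are placeholders:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("RepartitionHiveTable")
            .set("spark.executor.memory", "8g")                  # placeholder heap size
            .set("spark.yarn.executor.memoryOverhead", "2048"))  # MB of off-heap headroom per executor
    sc = SparkContext(conf=conf)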

mapPartitionsWithIndex in Dataframe

2017-08-03 Thread Lalwani, Jayesh
Are there any plans to add mapPartitionsWithIndex to the Dataframe API? Or is there any way to implement my own mapPartitionsWithIndex for a Dataframe? I am implementing something which is logically similar to the randomSplit function. In 2.1, randomSplit internally does
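Until something like this exists in the Dataframe API, one common workaround is to drop to the underlying RDD, use RDD.mapPartitionsWithIndex, and rebuild a Dataframe; a rough Python sketch with placeholder column names (note this detours through an RDD, so Catalyst optimizations are lost for that step):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("PartitionIndexDemo").getOrCreate()
    df = spark.range(0, 100)  # toy Dataframe with a single "id" column

    def tag_partition(index, rows):
        # emit (partition_index, id) pairs, e.g. as input to a custom split
        for row in rows:
            yield (index, row.id)

    tagged = spark.createDataFrame(df.rdd.mapPartitionsWithIndex(tag_partition),
                                   ["partition", "id"])
    tagged.show(5)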

Re: DataSet creation not working Spark 1.6.0 , populating wrong data CDH 5.7.1

2017-08-03 Thread Abdallah Mahmoud
Unsubscribe me please. Sent with Mailtrack. Abdallah Mahmoud Zidan, Junior Data Scientist & Software Developer, (+20) 01027228807. On 3 August 2017 at 14:39, Jörn Franke

Re: DataSet creation not working Spark 1.6.0 , populating wrong data CDH 5.7.1

2017-08-03 Thread Jörn Franke
You need to create a schema for person. https://spark.apache.org/docs/latest/sql-programming-guide.html#programmatically-specifying-the-schema > On 3. Aug 2017, at 12:09, Rabin Banerjee wrote: > > Hi All, > > I am trying to create a DataSet from DataFrame, where
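For completeness, a minimal sketch of the programmatic-schema approach from that section of the guide, shown here in Python against the 1.6-era SQLContext entry point (the thread asks about Java, where the equivalent builders live in org.apache.spark.sql.types.DataTypes); the field names and types are placeholders standing in for the Person bean's attributes:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    sc = SparkContext(appName="ExplicitSchema")
    sqlContext = SQLContext(sc)  # Spark 1.6-style entry point

    # Placeholder fields standing in for the Person bean's attributes
    schema = StructType([
        StructField("name", StringType(), True),
        StructField("age", IntegerType(), True),
    ])

    df = sqlContext.createDataFrame([("alice", 30), ("bob", 25)], schema)
    df.show()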

Re: DataSet creation not working Spark 1.6.0 , populating wrong data CDH 5.7.1

2017-08-03 Thread Gourav Sengupta
Guru, Anyway you can pick up SCALA or Python. Makes things way easier. The performance, maintainability, visibility, and minimum translation loss make things better. Regards, Gourav On Thu, Aug 3, 2017 at 11:09 AM, Rabin Banerjee < dev.rabin.baner...@gmail.com> wrote: > Hi All, > > I am

DataSet creation not working Spark 1.6.0 , populating wrong data CDH 5.7.1

2017-08-03 Thread Rabin Banerjee
Hi All, I am trying to create a DataSet from a DataFrame; the DataFrame has been created successfully, and using the same bean I am trying to create the Dataset. When I run it, the DataFrame is created as expected and I am able to print its content as well, but not the Dataset. The DataSet is

Re: Quick one on evaluation

2017-08-03 Thread Daniel Darabos
On Wed, Aug 2, 2017 at 2:16 PM, Jean Georges Perrin wrote: > Hi Sparkians, > > I understand the lazy evaluation mechanism with transformations and > actions. My question is simpler: 1) are show() and/or printSchema() > actions? I would assume so... > show() is an action (it prints
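A quick way to see the difference for yourself: printSchema() only inspects the schema and does not launch a job, while show() has to actually compute rows, so an error that only occurs during row evaluation surfaces at show(). A small sketch using a deliberately failing UDF (names are placeholders):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    spark = SparkSession.builder.appName("LazyEvalDemo").getOrCreate()

    # This UDF fails only when it is actually evaluated against rows
    explode_on_use = udf(lambda x: 1 // 0, IntegerType())
    df = spark.range(5).withColumn("boom", explode_on_use("id"))

    df.printSchema()   # no job runs; only the schema is printed
    df.show()          # forces evaluation of rows, so the division error surfaces here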

Re: SPARK Issue in Standalone cluster

2017-08-03 Thread Gourav Sengupta
Hi Steve, I love you mate, thanks a ton once again for ACTUALLY RESPONDING. I am now going through the documentation ( https://github.com/steveloughran/hadoop/blob/s3guard/HADOOP-13786-committer/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3a_committer_architecture.md) and it

Re: SPARK Issue in Standalone cluster

2017-08-03 Thread Steve Loughran
On 2 Aug 2017, at 20:05, Gourav Sengupta wrote: Hi Steve, I have written a sincere note of apology to everyone in a separate email. I sincerely request your kind forgiveness beforehand if anything does sound impolite in my