Re:

2023-02-10 Thread Sunil Prabhakara
unsubscribe On Tue, Feb 7, 2023 at 5:19 AM Tang Jinxin wrote: > unsubscribe >

Re: [Spark SQL]: Aggregate Push Down / Spark 3.2

2021-11-04 Thread Sunil Prabhakara
Unsubscribe. On Mon, Nov 1, 2021 at 6:57 PM Kapoor, Rohit wrote: > Hi, > > > > I am testing the aggregate push down for JDBC after going through the JIRA > - https://issues.apache.org/jira/browse/SPARK-34952 > > I have the latest Spark 3.2 setup in local mode (laptop). > > > > I have PostgreSQL

Unsubscribe

2021-06-19 Thread Sunil Prabhakara

Unsubscribe

2021-02-11 Thread Sunil Prabhakara

Out of memory caused by a high number of spark submissions in FIFO mode

2020-06-09 Thread Sunil Pasumarthi
Hi all, I have written a small ETL spark application which takes data from GCS, transforms it, and saves it into some other GCS bucket. I am trying to run this application for different ids using a spark cluster in Google's Dataproc, just tweaking the default configuration to use a

Spark: Update record in partition.

2020-06-07 Thread Sunil Kalra
Hi All, If I have to update a record in a partition using Spark, do I have to read the whole partition, update the row, and overwrite the partition? Is there a way to update only one row, as in a DBMS? Otherwise a one-row update takes a long time because it rewrites the whole partition. Thanks Sunil
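The question above is about the mechanics of file-based storage: a row inside an immutable partition file cannot be changed in place, so updating one row means rewriting every row of that partition. (Spark 2.3+ can at least limit the rewrite to the touched partitions via `spark.sql.sources.partitionOverwriteMode=dynamic`.) A minimal plain-Python sketch of that read-modify-rewrite pattern — the function and data are hypothetical, for illustration only:

```python
# Sketch (no Spark needed): a "partition" is an immutable batch of rows,
# so changing one row produces a full replacement of the partition.
def overwrite_partition(partition_rows, key, new_value):
    """Return a fully rewritten partition with one row updated.

    partition_rows: list of (key, value) tuples for ONE partition.
    Every row is rewritten even though only one changes -- which is
    why single-row updates are slow compared with an in-place DBMS.
    """
    return [(k, new_value if k == key else v) for k, v in partition_rows]

part = [("a", 1), ("b", 2), ("c", 3)]
updated = overwrite_partition(part, "b", 20)
```

This is why answers to this question usually suggest either a mutable table format or partitioning finely enough that the rewrite is cheap.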

Unsubscribe

2020-06-06 Thread Sunil Prabhakara

Unsubscribe

2020-05-30 Thread Sunil Prabhakara

Re: unsubscribe

2020-05-24 Thread Sunil Prabhakara
On Sat, 16 May 2020, 22:34 Punna Yenumala, wrote: >

Need help with Application Detail UI link

2019-12-05 Thread Sunil Patil
Thanks Sunil

Re: OOM Error

2019-09-07 Thread Sunil Kalra
Ankit, can you try reducing the number of cores or increasing memory? With the configuration below, each core gets ~3.5 GB. Otherwise your data is skewed, such that one core gets too much data for a given key. spark.executor.cores 6 spark.executor.memory 36g On Sat, Sep 7, 2019 at 6:35
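The ~3.5 GB figure in the reply follows from Spark's unified memory model: only a fraction of executor heap (default `spark.memory.fraction = 0.6`) is available for execution and storage, split across the concurrent tasks. A quick back-of-the-envelope check, ignoring the ~300 MB reserved memory for simplicity (so the real number is slightly lower):

```python
# Per-core usable memory for the configuration quoted in the thread,
# assuming the default spark.memory.fraction of 0.6.
executor_memory_gb = 36   # spark.executor.memory
cores_per_executor = 6    # spark.executor.cores
memory_fraction = 0.6     # default spark.memory.fraction

usable_gb = executor_memory_gb * memory_fraction
per_core_gb = usable_gb / cores_per_executor
print(per_core_gb)  # ≈ 3.6 GB, close to the "~3.5 GB" quoted above
```

Dropping to 4 cores with the same 36g raises the per-task share to ~5.4 GB, which is the lever the reply suggests.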

Re: Unsubscribe

2019-02-04 Thread Sunil Prabhakara
Unsubscribe

Re: Unsubscribe

2018-09-05 Thread Sunil Prabhakara

Re: retention policy for spark structured streaming dataset

2018-03-14 Thread Sunil Parmar
Can you use partitioning (by day)? That will make it easier to drop data older than x days outside the streaming job. Sunil Parmar On Wed, Mar 14, 2018 at 11:36 AM, Lian Jiang <jiangok2...@gmail.com> wrote: > I have a spark structured streaming job which dump data into a parqu
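With day-based partitioning, retention outside the streaming job reduces to "list the partition directories, drop the ones older than N days". A plain-Python sketch of that selection logic — the `dt=YYYY-MM-DD` directory naming is an assumption for illustration, not something from the original thread:

```python
# Pick day-partitions older than a retention window.
# Partition names are assumed to look like "dt=2018-03-01".
from datetime import date, timedelta

def partitions_to_drop(partition_names, today, retain_days):
    cutoff = today - timedelta(days=retain_days)
    drop = []
    for name in partition_names:
        day = date.fromisoformat(name.split("=", 1)[1])
        if day < cutoff:
            drop.append(name)
    return drop

parts = ["dt=2018-03-01", "dt=2018-03-10", "dt=2018-03-14"]
old = partitions_to_drop(parts, date(2018, 3, 14), retain_days=7)
```

The selected directories can then be deleted (and, for a Hive-style table, the partitions dropped from the metastore) without touching the streaming job itself.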

Broadcast variables: destroy/unpersist unexpected behaviour

2018-03-13 Thread Sunil
I experienced the two cases below when unpersisting or destroying broadcast variables in pyspark, but the same works fine in the Spark Scala shell. Any clue why this happens? Is it a bug in pyspark? ***Case 1:*** >>> b1 = sc.broadcast([1,2,3]) >>> b1.value [1, 2, 3] >>> b1.destroy()

Re: [Beginner] How to save Kafka Dstream data to parquet ?

2018-03-05 Thread Sunil Parmar
We use Impala to access parquet files in the directories. Any pointers on achieving at least once semantic with spark streaming or partial files ? Sunil Parmar On Fri, Mar 2, 2018 at 2:57 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote: > Structured Streaming's file si

Re: [Beginner] How to save Kafka Dstream data to parquet ?

2018-03-02 Thread Sunil Parmar
to deal with partial files by writing .tmp files and renaming them as the last step. We only commit offsets after the rename is successful. This way we get at-least-once semantics and avoid the partial-file-write issue. Thoughts? Sunil Parmar On Wed, Feb 28, 2018 at 1:59 PM, Tathagata Das <tathagata.d
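The write-then-rename commit described in this message can be sketched with plain Python file APIs. `os.replace` is atomic on POSIX filesystems (HDFS rename is similarly atomic; S3-style object stores generally are not), so a reader never observes a half-written final file; in the thread's protocol, Kafka offsets would be committed only after this rename succeeds. Names and layout here are illustrative:

```python
# Write-then-rename commit: write to name.tmp, then atomically rename
# to the final name as the last step.
import os
import tempfile

def commit_file(directory, final_name, data):
    tmp_path = os.path.join(directory, final_name + ".tmp")
    with open(tmp_path, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())          # make sure bytes hit disk first
    final_path = os.path.join(directory, final_name)
    os.replace(tmp_path, final_path)  # last step: atomic rename
    return final_path

d = tempfile.mkdtemp()
path = commit_file(d, "part-00000", "row1\nrow2\n")
```

If the process dies before the rename, only a `.tmp` file is left behind, which downstream readers (Impala, in this thread) can be configured to ignore.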

Spark UI port

2017-09-10 Thread Sunil Kalyanpur
. Is there a way we can make the Spark job use the port assigned by Marathon instead of picking the configuration from the checkpoint? -- Thanks, Sunil

Spark UI to use Marathon assigned port

2017-09-07 Thread Sunil Kalyanpur
. Is there a way we can make the Spark job use the port assigned by Marathon instead of picking the configuration from the checkpoint? Please let me know if you need any information. Thank you, Sunil

Spark Memory Allocation Exception

2016-09-09 Thread Sunil Tripathy
Hi, I am using spark 1.6 to load a historical activity dataset for the last 3/4 years and write it to a parquet file partitioned by day. I am getting the following exception when the insert command runs to insert the data into the parquet partitions.

Re: Unable to create a dataframe from json dstream using pyspark

2016-07-28 Thread Sunil Kumar Chinnamgari
unds. now my recordings are much more crisp. it is one of the lowest prices pop filters on amazon so might as well buy it, they honestly work the same despite their pricing,", "overall": 5.0, "summary": "good", "unixReviewTime": 1393545600, "reviewTime": "02 28, 2014"} I am an absolute newcomer to Spark Streaming and started working on pet projects by reading the documentation. Any help and guidance is greatly appreciated. Best Regards, Sunil Kumar Chinnamgari

spark-ec2 scripts with spark-2.0.0-preview

2016-06-14 Thread Sunil Kumar
Hi, The spark-ec2 scripts are missing from spark-2.0.0-preview. Is there a workaround available? I tried to change the ec2 scripts to accommodate spark-2.0.0... If I call the release spark-2.0.0-preview, then it barfs because the command line argument: --spark-version=spark-2.0.0-preview gets

Re: JIRA SPARK-2984

2016-06-09 Thread Sunil Kumar
. Even if it is an intrinsic problem with s3 (and I'm not super sure since I'm just reading this on mobile) - it would maybe be a good thing for us to document. On Thursday, June 9, 2016, Sunil Kumar <parvat_2...@yahoo.com> wrote: Holden Thanks for your prompt reply... Any suggestions on the next

Re: JIRA SPARK-2984

2016-06-09 Thread Sunil Kumar
Holden Thanks for your prompt reply... Any suggestions on the next step ? Does this call for a new spark jira ticket or is this an issue for s3? Thx

JIRA SPARK-2984

2016-06-09 Thread Sunil Kumar
Hi, I am running into SPARK-2984 while running my spark 1.6.1 jobs over yarn in AWS. I have tried with spark.speculation=false but still see the same failure with _temporary file missing for task_xxx... This ticket is in the resolved state. How can it be reopened? Is there a workaround? thanks

Re: is there any way to submit spark application from outside of spark cluster

2016-03-25 Thread sunil m
Hi Prateek! You might want to have a look at spark job server: https://github.com/spark-jobserver/spark-jobserver Warm regards, Sunil Manikani. On 25 March 2016 at 23:34, Ted Yu <yuzhih...@gmail.com> wrote: > Do you run YARN in your production environment (and plan to run S

connecting beeline to spark sql thrift server

2016-01-06 Thread Sunil Kumar
Hi, I have an AWS spark EMR cluster running with spark 1.5.2, hadoop 2.6 and hive 1.0.0. I brought up the spark sql thriftserver on this cluster with spark.sql.hive.metastore version set to 1.0. When I try to connect to this thriftserver remotely using beeline packaged with

Error while running a job in yarn-client mode

2015-12-16 Thread sunil m
parkSubmit$.main(SparkSubmit.scala:120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Thanks in advance. Warm regards, Sunil M.

Logging spark output to hdfs file

2015-12-08 Thread sunil m
regards, Sunil M

Associating spark jobs with logs

2015-12-08 Thread sunil m
are welcome! Thanks in advance. Warm regards, Sunil M.

Re: NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable

2015-12-08 Thread Sunil Tripathy
rebuilding spark help? From: Fengdong Yu <fengdo...@everstring.com> Sent: Monday, December 7, 2015 10:31 PM To: Sunil Tripathy Cc: user@spark.apache.org Subject: Re: NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable Can you try like

Re: Associating spark jobs with logs

2015-12-08 Thread sunil m
to YARN. I am looking for a feature like this, but we need to get logs irrespective of the master being YARN, Mesos, or stand-alone Spark. Warm regards, Sunil M. On 9 December 2015 at 00:48, Ted Yu <yuzhih...@gmail.com> wrote: > Have you looked at the REST API section of:

NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable

2015-12-07 Thread Sunil Tripathy
I am getting the following exception when I use spark-submit to submit a spark streaming job. Exception in thread "main" java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable([Lcom/fasterxml/jackson/core/JsonParser$Feature;)Lcom/fasterxml/jackson/databind/ObjectMapper;

Available options for Spark REST API

2015-12-07 Thread sunil m
an example. Are there any known disadvantages of using it? Is there anything better available that is used in a production environment? Any advice is appreciated. We are using Spark 1.5.1. Thanks in advance. Warm regards, Sunil M.

Queue in Spark standalone mode

2015-11-25 Thread sunil m
. Warm regards, Sunil

Spark JDBCRDD query

2015-11-18 Thread sunil m
... Warm regards, Sunil M.

Re: Web UI is not showing up

2015-09-01 Thread Sunil Rathee
story server. On Sep 1, 2015 8:57 PM, "Sunil Rathee" <ratheesunil...@gmail.com> wrote: > Hi, localhost:4040 is not showing anything on the browser. Do we have to start some service? -- Sunil Rathee

Web UI is not showing up

2015-09-01 Thread Sunil Rathee
Hi, localhost:4040 is not showing anything on the browser. Do we have to start some service? -- Sunil Rathee

Data locality with HDFS not being seen

2015-08-20 Thread Sunil
no difference. Thanks and regards, Sunil -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Data-locality-with-HDFS-not-being-seen-tp24361.html Sent from the Apache Spark User List mailing list archive at Nabble.com

clean up of state in State Dstream

2014-12-12 Thread Sunil Yarram
I am using *updateStateByKey* to maintain state in my streaming application, and the state accumulates over time. Is there a way I can delete old state data or put a limit on the amount of state the State DStream can keep in the system? Thanks, Sunil.
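In the DStream API, a key's state is removed when the update function returns None (Scala: `None`), which is the usual way to cap state growth with updateStateByKey. The update function itself is plain Python with the `(new_values, old_state)` signature, so the eviction logic can be sketched and tested without a cluster; the "drop idle keys" policy below is an example choice, not something from the original thread:

```python
# updateStateByKey-style update function: maintain a running count per
# key, and drop a key's state (return None) when it saw no new values
# in the current batch.
def update_func(new_values, old_state):
    if not new_values and old_state is not None:
        return None  # idle key: returning None removes its state
    return (old_state or 0) + len(new_values)

examples = [
    update_func([1, 2], None),  # new key
    update_func([3], 2),        # existing key updated
    update_func([], 5),         # idle key: state dropped
]
```

It would be wired in as `stream.updateStateByKey(update_func)`; for time-based expiry, keeping a last-seen timestamp in the state and comparing it inside the function is a common variant.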

updateStateByKey

2014-11-26 Thread Sunil Yarram
rapidly. Please share your thoughts/experience. Thanks, Sunil.