Re: Querying on multiple Hive stores using Apache Spark

2015-09-24 Thread Karthik
Any ideas or suggestions? Thanks, Karthik.

Querying on multiple Hive stores using Apache Spark

2015-09-22 Thread Karthik
I have a Spark application that successfully connects to Hive and queries Hive tables using the Spark engine. To build this, I just added hive-site.xml to the application's classpath, and Spark reads the hive-site.xml to connect to its metastore. This method was suggested on Spark's mailing list.
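A minimal sketch of the setup described above, assuming Spark 1.x and a metastore URI that is purely hypothetical. hive-site.xml on the classpath supplies the default metastore, and hive.metastore.uris can be overridden on the context; whether one application can cleanly switch between two metastores this way is exactly the open question in this thread:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("hive-query"))
    val hiveContext = new HiveContext(sc)  // reads hive-site.xml from the classpath

    // Hypothetical override: point this context at a different metastore.
    hiveContext.setConf("hive.metastore.uris", "thrift://metastore-host:9083")
    hiveContext.sql("SELECT count(*) FROM some_db.some_table").show()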

unsubscribe

2018-01-16 Thread karthik
unsubscribe

Incorrect ACL checking for partitioned table in Spark SQL-1.4

2015-06-16 Thread Karthik Subramanian
*Problem Statement:* While querying a partitioned table using Spark SQL (version 1.4.0), an access-denied exception is observed on partitions the user does not have access to (user permissions are controlled using HDFS ACLs). The same query works correctly in Hive. *Use case:* To address
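A sketch of the reported scenario, with hypothetical table and partition names: the filter targets only the partition the user can read, yet Spark SQL 1.4 reportedly fails where Hive correctly prunes to the accessible partition:

    // User has HDFS ACL access only to the region=us partition.
    val df = sqlContext.sql("SELECT * FROM sales WHERE region = 'us'")
    df.show()  // Hive reads only region=us; Spark 1.4 reportedly touches the other partitions too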

Issue on spark.driver.maxResultSize

2015-10-29 Thread karthik kadiyam
Hi, in a Spark Streaming job I had the following setting: this.jsc.getConf().set("spark.driver.maxResultSize", "0"); and I got the error below in the job: User class threw exception: Job aborted due to stage failure: Total size of serialized results of 120 tasks (1082.2 MB) is
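One hedged observation on the snippet above: spark.driver.maxResultSize is read when the SparkContext starts, so setting it on the conf of an already-running context may have no effect. A sketch of setting it before the streaming context is built ("0" means unlimited):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Set before any context is created, not on jsc.getConf() afterwards.
    val conf = new SparkConf()
      .setAppName("streaming-job")
      .set("spark.driver.maxResultSize", "0")
    val ssc = new StreamingContext(conf, Seconds(10))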

Re: issue with spark.driver.maxResultSize parameter in spark 1.3

2015-11-01 Thread karthik kadiyam
Did anyone have an issue setting the spark.driver.maxResultSize value? On Friday, October 30, 2015, karthik kadiyam <karthik.kadiyam...@gmail.com> wrote: > Hi Shahid, > > I played around with Spark driver memory too. In the conf file it was set > to " --driver-memory 20G

Re: issue with spark.driver.maxResultSize parameter in spark 1.3

2015-10-30 Thread karthik kadiyam
wrote: > Hi > I guess you need to increase Spark driver memory as well, but that should > be set in the conf files > Let me know if that resolves it > On Oct 30, 2015 7:33 AM, "karthik kadiyam" <karthik.kadiyam...@gmail.com> > wrote: >
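For reference, a sketch of the conf-file route suggested above; file location and values are illustrative. spark.driver.memory must be set before the driver JVM launches, so a conf file or spark-submit flag works where application code does not:

    # conf/spark-defaults.conf -- read at spark-submit time
    spark.driver.memory        20g
    spark.driver.maxResultSize 4g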

issue with spark.driver.maxResultSize parameter in spark 1.3

2015-10-29 Thread karthik kadiyam
Hi, in a Spark Streaming job I had the following setting: this.jsc.getConf().set("spark.driver.maxResultSize", "0"); and I got the error below in the job: User class threw exception: Job aborted due to stage failure: Total size of serialized results of 120 tasks (1082.2 MB) is bigger

Spark Streaming join

2016-06-02 Thread karthik tunga
and reload this RDD every, say, 10 minutes. Is this possible? Apologies if this has been asked before. Cheers, Karthik
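One common sketch for this pattern, assuming a pair DStream named stream and a hypothetical loadReference() helper that builds the lookup RDD: reload the reference RDD inside transform on a timer, then join each batch against the latest copy:

    import org.apache.spark.rdd.RDD

    // Hypothetical: refresh the reference RDD at most every 10 minutes.
    var refRdd: RDD[(String, String)] = loadReference()
    var lastLoad = System.currentTimeMillis()

    val joined = stream.transform { batchRdd =>
      if (System.currentTimeMillis() - lastLoad > 10 * 60 * 1000) {
        refRdd = loadReference()
        lastLoad = System.currentTimeMillis()
      }
      batchRdd.join(refRdd)  // batchRdd must also be a pair RDD keyed the same way
    }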

using matrix as column datatype in SparkSQL Dataframe

2016-08-08 Thread Vadla, Karthik
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/linalg/MatrixUDT.scala Can anyone help me with this? I really appreciate your help. Thanks, Karthik Vadla
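A sketch that should work in Spark 2.x, assuming a SparkSession named spark; ml.linalg.Matrix is backed by the MatrixUDT linked above, so it can be used as a DataFrame column type directly (names illustrative):

    import org.apache.spark.ml.linalg.{Matrices, Matrix}
    import spark.implicits._

    val df = Seq(
      ("a", Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))),
      ("b", Matrices.sparse(2, 2, Array(0, 1, 2), Array(0, 1), Array(9.0, 8.0)))
    ).toDF("id", "matrix")
    df.printSchema()  // the matrix column is stored via MatrixUDT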

Re: What are using Spark for

2016-08-02 Thread Karthik Ramakrishnan
We used Storm for ETL; now we are thinking Spark might be advantageous since some ML is also coming our way. - Karthik On Tue, Aug 2, 2016 at 1:10 PM, Rohit L <rohitfor...@gmail.com> wrote: > Does anyone use Spark for ETL? > > On Tue, Aug 2, 2016 at 1:24 PM, Sonal

io.netty.handler.codec.EncoderException: java.lang.NoSuchMethodError:

2016-11-24 Thread Karthik Shyamsunder
:18:24 ERROR client.TransportResponseHandler: Still have 1 requests outstanding when connection from /10.0.2.15:54561 is closed PLEASE ADVISE. Sincerely, Karthik
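This class of NoSuchMethodError inside Netty usually points at two incompatible Netty versions on the classpath (for example, one pulled in by Hadoop and one by Spark). A hedged sbt sketch of pinning a single version; the exact version to pin is an assumption and must match your Spark build:

    // build.sbt -- force one Netty version across the dependency tree (version illustrative)
    dependencyOverrides += "io.netty" % "netty-all" % "4.0.43.Final"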

[Streaming][Structured Streaming] Understanding dynamic allocation in streaming jobs

2017-08-22 Thread Karthik Palaniappan
SPARK-12133. Is that actually a supported feature, or was it just an experiment? I had trouble getting it to work, but I'll follow up in a different thread. Also, does Structured Streaming have its own dynamic allocation algorithm? Thanks, Karthik Palaniappan
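For reference, a sketch of the streaming-specific dynamic allocation properties involved; values are illustrative. Note that the batch variant (spark.dynamicAllocation.enabled) must be off for the streaming variant to take effect:

    spark.dynamicAllocation.enabled                 false
    spark.streaming.dynamicAllocation.enabled       true
    spark.streaming.dynamicAllocation.minExecutors  1
    spark.streaming.dynamicAllocation.maxExecutors  20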

[Spark Streaming] Streaming Dynamic Allocation is broken (at least on YARN)

2017-08-22 Thread Karthik Palaniappan
I ran the HdfsWordCount example using this command: spark-submit run-example \ --conf spark.streaming.dynamicAllocation.enabled=true \ --conf spark.executor.instances=0 \ --conf spark.dynamicAllocation.enabled=false \ --conf spark.master=yarn \ --conf spark.submit.deployMode=client \

Re: [Spark Streaming] Streaming Dynamic Allocation is broken (at least on YARN)

2017-09-08 Thread Karthik Palaniappan
For posterity, I found the root cause and filed a JIRA: https://issues.apache.org/jira/browse/SPARK-21960. I plan to open a pull request with the minor fix.

Re: [Streaming][Structured Streaming] Understanding dynamic allocation in streaming jobs

2017-08-25 Thread Karthik Palaniappan
I definitely agree that dynamic allocation is useful; that's why I asked the question :p More specifically, does Spark plan to solve the problems with DRA for Structured Streaming mentioned in that Cloudera article? If folks can give me pointers on where to start, I'd be happy to implement

RE: [Spark Streaming] Streaming Dynamic Allocation is broken (at least on YARN)

2017-08-25 Thread Karthik Palaniappan
, and explicitly set it to 0 after hitting that error. Setting executor cores > 1 seems like reasonable advice in general, but that shouldn’t be my issue here, right?

Re: [Spark Streaming] Streaming Dynamic Allocation is broken (at least on YARN)

2017-09-01 Thread Karthik Palaniappan
Any ideas @Tathagata? I'd be happy to contribute a patch if you can point me in the right direction.

[Beginner] Kafka 0.11 header support in Spark Structured Streaming

2018-02-27 Thread Karthik Jayaraman
https://issues.apache.org/jira/browse/KAFKA-4208), is it supported in Spark? If yes, can anyone point me to an example? - Karthik
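For what it's worth, at the time of this thread the Kafka source did not expose record headers; support arrived later in Spark 3.0 via an includeHeaders option (SPARK-23539). A sketch, assuming a SparkSession named spark and hypothetical broker/topic names:

    val df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")  // hypothetical
      .option("subscribe", "events")                     // hypothetical topic
      .option("includeHeaders", "true")                  // Spark 3.0+
      .load()
    // Exposes a "headers" column: array<struct<key: string, value: binary>>
    df.selectExpr("CAST(value AS STRING)", "headers")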

[ spark-streaming ] - Data Locality issue

2020-02-04 Thread Karthik Srinivas
Hi, I am using Spark 2.3.2 and I am facing issues due to data locality: even after setting spark.locality.wait.rack=200, the locality level is always RACK_LOCAL. Can someone help me with this? Thank you
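One hedged observation: the locality wait settings are time properties, and a bare "200" is likely parsed as 200 milliseconds, well below the 3s default, so the scheduler gives up on better-than-rack slots almost immediately. Also, spark.locality.wait.rack governs the fallback from rack-local to ANY, not the wait for node-local. A sketch with explicit units (values illustrative):

    spark.locality.wait       30s
    spark.locality.wait.node  30s
    spark.locality.wait.rack  30s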

Data locality

2020-02-04 Thread Karthik Srinivas
Hi all, I am using Spark 2.3.2 and I am facing issues due to data locality: even after setting spark.locality.wait.rack=200, the locality level is always RACK_LOCAL. Can someone help me with this? Thank you

Unsubscribe

2022-07-28 Thread Karthik Jayaraman

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-24 Thread Karthik Reddy Vadde
On Thu, Jul 12, 2018 at 10:23 AM Arun Mahadevan wrote: > Yes, ForeachWriter [1] could be an option if you want to write to different > sinks. You can put in your custom logic to split the data into different sinks. > > The drawback here is that you cannot plug in existing sinks like Kafka and > you
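A minimal sketch of the ForeachWriter route described above, assuming Spark 2.x and an untyped (Row-based) streaming DataFrame named df; the sink-splitting logic in process() is hypothetical:

    import org.apache.spark.sql.{ForeachWriter, Row}

    val splitWriter = new ForeachWriter[Row] {
      def open(partitionId: Long, epochId: Long): Boolean = {
        true  // open connections to the downstream sinks here; false skips the partition
      }
      def process(value: Row): Unit = {
        // Hypothetical routing: send each row to one of several sinks.
        if (value.getAs[String]("kind") == "a") { /* write to sink A */ }
        else { /* write to sink B */ }
      }
      def close(errorOrNull: Throwable): Unit = {
        // Flush and close connections, inspect errorOrNull.
      }
    }

    df.writeStream.foreach(splitWriter).start()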