Re: Vulnerabilities in htrace-core4-4.1.0-incubating.jar jar used in spark.

2022-05-01 Thread HARSH TAKKAR
We scanned 3 versions of Spark: 3.0.0, 3.1.3, and 3.2.1. On Tue, 26 Apr, 2022, 18:46 Bjørn Jørgensen, wrote: > What version of spark is it that you have scanned? > > > > Tue, 26 Apr 2022 at 12:48, HARSH TAKKAR wrote: > >> Hello, >> >> Please let me know if th

Vulnerabilities in htrace-core4-4.1.0-incubating.jar jar used in spark.

2022-04-26 Thread HARSH TAKKAR
-14379 CVE-2019-12086 CVE-2018-7489 CVE-2018-5968 CVE-2018-14719 CVE-2018-14718 CVE-2018-12022 CVE-2018-11307 CVE-2017-7525 CVE-2017-17485 CVE-2017-15095 Kind Regards Harsh Takkar

Unsubscribe

2021-11-17 Thread HARSH TAKKAR
Unsubscribe

Re: Using Custom Scala Spark ML Estimator in PySpark

2021-02-16 Thread HARSH TAKKAR
Hello Sean, Thanks for the advice. Can you please point me to an example of a custom wrapper for Python? Kind Regards Harsh Takkar On Tue, 16 Feb, 2021, 8:25 pm Sean Owen, wrote: > You won't be able to use it in python if it is implemented in Java - needs > a python wrapp

Using Custom Scala Spark ML Estimator in PySpark

2021-02-15 Thread HARSH TAKKAR
in the class pass using "spark.jars" Can you please help, if i am missing something. Kind Regards Harsh Takkar

Re: How to enable hive support on an existing Spark session?

2020-05-27 Thread HARSH TAKKAR
Hi Kun, You can set the following Spark property when launching the app instead of manually enabling Hive support in the code: spark.sql.catalogImplementation=hive Kind Regards Harsh On Tue, May 26, 2020 at 9:55 PM Kun Huang (COSMOS) wrote: > > Hi Spark experts, > > I am seeking for an approach
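
A minimal sketch of passing that property at launch time, assuming the application is started with spark-submit. The helper name `build_submit_command` and the script name `app.py` are hypothetical; only the property key and value come from the message above.

```python
def build_submit_command(app_path, conf):
    """Build a spark-submit argument list with one --conf flag per property."""
    cmd = ["spark-submit"]
    # Sort the keys so the generated command is deterministic.
    for key in sorted(conf):
        cmd += ["--conf", f"{key}={conf[key]}"]
    cmd.append(app_path)
    return cmd


# "app.py" is a placeholder for the real application script.
command = build_submit_command(
    "app.py", {"spark.sql.catalogImplementation": "hive"}
)
print(" ".join(command))
```

The list form can be handed to `subprocess.run` unchanged, which avoids shell-quoting problems with property values.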

Structured Streaming using Kafka Avro Record in 2.3.0

2020-04-28 Thread HARSH TAKKAR
Hi, how can we deserialise Avro records read from Kafka in Spark 2.3.0 in an optimised manner? I could see that native support for Avro was added in 2.4.x. Currently I am using the following library, which is very slow: com.twitter bijection-avro_2.11 Kind Regards Harsh Takkar

Reading 7z file in spark

2020-01-13 Thread HARSH TAKKAR
Hi, Is it possible to read a 7z-compressed file in Spark? Kind Regards Harsh Takkar

Re: Hive External Table Partiton Data Type.

2019-12-16 Thread HARSH TAKKAR
Hi 10 Time taken: 0.356 seconds, Fetched: 2 row(s) hive> describe longpartition; OK b string a bigint # Partition Information # col_name data_type comment a bigint On Mon, Dec 16, 2019 at 11:05 AM SB M wrote: > spark version 2

Re: Hive External Table Partiton Data Type.

2019-12-15 Thread HARSH TAKKAR
Please share the spark version you are using . On Fri, 13 Dec, 2019, 4:02 PM SB M, wrote: > Hi All, >Am trying to create a dynamic partition with external table on hive > metastore using spark sql. > > when am trying to create a partition column data type as bigint, partition > is not

Re: Unable to write data from Spark into a Hive Managed table

2019-08-20 Thread HARSH TAKKAR
Please refer to the following documentation on how to write data into Hive in HDP 3.1: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_hivewarehouseconnector_for_handling_apache_spark_data.html Harsh On Fri, 9 Aug, 2019, 10:21 PM Mich Talebzadeh, wrote:

Re: Back pressure not working on streaming

2019-01-01 Thread HARSH TAKKAR
There is a separate property for the max rate. By default it is not set, so if you want to limit the max rate you should give that property a value. Initial rate = 10 means it will pick only 10 records per receiver in the batch interval when you start the process. Depending upon the consumption
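
As a rough illustration of the settings mentioned in this reply, the sketch below lists the receiver-based rate properties and flags the case where backpressure is enabled without a maximum rate. The numeric values and the helper name `missing_rate_cap` are assumptions for illustration only. For the Kafka direct stream, the corresponding cap is `spark.streaming.kafka.maxRatePerPartition` rather than the receiver property.

```python
# Illustrative values only; tune them for the actual workload.
streaming_conf = {
    "spark.streaming.backpressure.enabled": "true",
    "spark.streaming.backpressure.initialRate": "10",
    "spark.streaming.receiver.maxRate": "1000",
}


def missing_rate_cap(conf):
    """Return True when backpressure is on but no receiver max rate is set."""
    enabled = conf.get("spark.streaming.backpressure.enabled", "false") == "true"
    return enabled and "spark.streaming.receiver.maxRate" not in conf


print(missing_rate_cap(streaming_conf))
```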

Re: executing stored procedure through spark

2018-08-13 Thread HARSH TAKKAR
Hi, You can call the Java program directly through PySpark. The following code will help: sc._jvm.. Harsh Takkar On Sun, Aug 12, 2018 at 9:27 PM amit kumar singh wrote: > Hi /team, > > The way we call java program to executed stored procedure > is there any way we

Re: Pyspark access to scala/java libraries

2018-07-18 Thread HARSH TAKKAR
Hi, You can access your Java packages in PySpark using the following: obj = sc._jvm.yourPackage.className() Kind Regards Harsh Takkar On Wed, Jul 18, 2018 at 4:00 AM Mohit Jaggi wrote: > Thanks 0xF0F0F0 and Ashutosh for the pointers. > > Holden, > I am trying to look into sparklingml
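
The attribute chain in this reply can also be walked programmatically when the class name is only known as a string. The sketch below assumes `sc` is a live SparkContext whose `_jvm` gateway is available and that the class is on the driver classpath. The helper name `resolve_jvm_class` and the class name `com.example.Greeter` are hypothetical. Note that `_jvm` is a private attribute, so its behaviour may change between Spark versions.

```python
def resolve_jvm_class(sc, dotted_name):
    """Follow a dotted JVM name, one attribute at a time, starting at sc._jvm."""
    target = sc._jvm
    for part in dotted_name.split("."):
        target = getattr(target, part)
    return target


# Usage with a real SparkContext (class name is a placeholder):
#   greeter_cls = resolve_jvm_class(sc, "com.example.Greeter")
#   obj = greeter_cls()
```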

Sklearn model in pyspark prediction

2018-05-15 Thread HARSH TAKKAR
Hi, Is there a way to load a model saved using the sklearn library in PySpark or Scala Spark for prediction? Thanks

Data of ArrayType field getting truncated when saving to parquet

2018-01-31 Thread HARSH TAKKAR
Hi, I have a DataFrame with a large array-typed field. When I save the data to a Parquet file and read it back, the array field comes out as an empty array. Please help. Harsh

Does Random Forest in spark ML supports multi label classification in scala

2017-11-07 Thread HARSH TAKKAR
Hi, Does Random Forest in Spark ML support multi-label classification in Scala? I found that sklearn provides sklearn.ensemble.RandomForestClassifier in Python. Do we have similar functionality in Scala?

Building Spark with hive 1.1.0

2017-11-06 Thread HARSH TAKKAR
Hi, I am using the Cloudera (cdh5.11.0) setup, which has Hive version 1.1.0, but when I build Spark with Hive and Thrift support it packs the Hive version as 1.6.0. Please let me know how I can build Spark with Hive 1.1.0. Command I am using to build: ./dev/make-distribution.sh --name

Re: ConcurrentModificationException using Kafka Direct Stream

2017-09-19 Thread HARSH TAKKAR
; > On Mon, Sep 18, 2017 at 1:56 AM, HARSH TAKKAR <takkarha...@gmail.com> > wrote: > > Hi > > > > Changing spark version if my last resort, is there any other workaround > for > > this problem. > > > > > > On Mon, Sep 18, 2017 at 11:43 AM pandees

Re: ConcurrentModificationException using Kafka Direct Stream

2017-09-18 Thread HARSH TAKKAR
n Sep 17, 2017, at 11:08 PM, Anastasios Zouzias <zouz...@gmail.com> > wrote: > > Hi, > > I had a similar issue using 2.1.0 but not with Kafka. Updating to 2.1.1 > solved my issue. Can you try with 2.1.1 as well and report back? > > Best, > Anastasios > > Am 17

Re: ConcurrentModificationException using Kafka Direct Stream

2017-09-17 Thread HARSH TAKKAR
s in your application? > > On Sun, Sep 17, 2017 at 7:48 AM, HARSH TAKKAR <takkarha...@gmail.com> > wrote: > >> >> Hi >> >> I am using spark 2.1.0 with scala 2.11.8, and while iterating over the >> partitions of each rdd in a dStream formed using KafkaUtils,

ConcurrentModificationException using Kafka Direct Stream

2017-09-17 Thread HARSH TAKKAR
Hi I am using spark 2.1.0 with scala 2.11.8, and while iterating over the partitions of each rdd in a dStream formed using KafkaUtils, i am getting the below exception, please suggest a fix. I have following config kafka : enable.auto.commit:"true", auto.commit.interval.ms:"1000",

update hive metastore in spark session at runtime

2017-09-01 Thread HARSH TAKKAR
Hi, I have just started using Spark session with Hive enabled, but I am facing an issue while updating the Hive warehouse directory after the Spark session is created. Use case: I want to read data from Hive on one cluster and write to Hive on another cluster. Please suggest if this can be done.

Reading parquet file in stream

2017-08-16 Thread HARSH TAKKAR
Hi, I want to read an HDFS directory which contains Parquet files. How can I stream data from this directory using the streaming context (ssc.fileStream)? Harsh

Re: Getting a TreeNode Exception while saving into Hadoop

2016-08-17 Thread HARSH TAKKAR
Hi, I can see that the exception is caused by the following. Can you check where in your code you are using this path? Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://testcluster:8020/experiments/vol/spark_chomp_data/bak/restaurants-bak/latest On Wed, 17 Aug

Re: Updating Values Inside Foreach Rdd loop

2016-05-09 Thread HARSH TAKKAR
Hi Please help. On Sat, 7 May 2016, 11:43 p.m. HARSH TAKKAR, <takkarha...@gmail.com> wrote: > Hi Ted > > Following is my use case. > > I have a prediction algorithm where i need to update some records to > predict the target. > > For eg. > I have an eq. Y= mX

Re: Updating Values Inside Foreach Rdd loop

2016-05-07 Thread HARSH TAKKAR
L allows you to leverage existing code. > > If you can share some more of your use case, that would help other people > provide suggestions. > > Thanks > > On May 6, 2016, at 6:57 PM, HARSH TAKKAR <takkarha...@gmail.com> wrote: > > Hi Ted > > I am aware that rdd are im

Re: Updating Values Inside Foreach Rdd loop

2016-05-06 Thread HARSH TAKKAR
> On Fri, May 6, 2016 at 5:25 AM, HARSH TAKKAR <takkarha...@gmail.com> > wrote: > >> Hi >> >> Is there a way i can modify a RDD, in for-each loop, >> >> Basically, i have a use case in which i need to perform multiple >> iteration over data and modify few values in each iteration. >> >> >> Please help. >> > >

Updating Values Inside Foreach Rdd loop

2016-05-06 Thread HARSH TAKKAR
Hi, Is there a way I can modify an RDD in a for-each loop? Basically, I have a use case in which I need to perform multiple iterations over the data and modify a few values in each iteration. Please help.
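
Since RDDs are immutable, one common pattern for this kind of use case is to derive a new RDD on every pass and rebind the variable, rather than mutating records in place. The sketch below is only an illustration of that pattern, not the answer given in the thread. The helper name `apply_iterations` is hypothetical, and it works with any object that exposes an RDD-style `map` method.

```python
def apply_iterations(rdd, update, iterations):
    """Apply `update` to every record, `iterations` times, without mutation."""
    current = rdd
    for _ in range(iterations):
        # map() returns a new dataset; the previous one is left unchanged.
        current = current.map(update)
    return current


# Usage with a real SparkContext `sc` (values are placeholders):
#   result = apply_iterations(sc.parallelize([1, 2, 3]), lambda x: x * 2, 3)
#   result.collect()
```

With many iterations the lineage grows on each pass, so caching or checkpointing the intermediate RDD every few rounds may be worth considering.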

Re: [Please Help] Log redirection on EMR

2016-02-23 Thread HARSH TAKKAR
dha...@manthan.com> wrote: > Your logs are getting archived in your logs bucket in S3. > > > http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-debugging.html > > Regards > Sab > > On Mon, Feb 22, 2016 at 12:14 PM, HARSH TAKKAR <tak

[Please Help] Log redirection on EMR

2016-02-21 Thread HARSH TAKKAR
Hi, I am using an EMR cluster for running my Spark jobs, but after the job finishes the logs disappear. I have added a log4j.properties in my jar, but all the logs are still redirected to the EMR resource manager, which vanishes after the job completes. Is there a way I could redirect the logs to a location in

Using Java spring injection with spark

2016-02-01 Thread HARSH TAKKAR
> > Hi > > I am new to apache spark and big data analytics, before starting to code > on spark data frames and rdd, i just wanted to confirm following > > 1. Can we create an implementation of java.api.Function as a singleton > bean using the spring frameworks and, can it be injected using

Re: Using Java spring injection with spark

2016-02-01 Thread HARSH TAKKAR
Hi Please can anyone reply on this. On Mon, 1 Feb 2016, 4:28 p.m. HARSH TAKKAR <takkarha...@gmail.com> wrote: > Hi >> >> I am new to apache spark and big data analytics, before starting to code >> on spark data frames and rdd, i just wanted to confirm follo

Re: Using Java spring injection with spark

2016-02-01 Thread HARSH TAKKAR
ite your code using Scala/ Python using the spark shell > or a notebook like Ipython, zeppelin or if you have written a application > using Scala/Java using the Spark API you can create a jar and run it using > spark-submit. > > *From:* HARSH TAKKAR [mailto:takkarha...@gmail.com] >