Re: Got Error Creating permanent view in Postgresql through Pyspark code

2023-01-05 Thread Stelios Philippou
them to do batch insertion using spark.sql("INSERT ..."); Hope this helps Stelios -- Hi Stelios Philippou, I need to create a view table in Postgresql DB using pyspark code. But I'm unable to create a view table; I am able to create a table through pyspark code. I need to
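A minimal sketch of the batch-insertion route suggested above, using stdlib sqlite3 in place of Postgres purely so the example is self-contained and runnable (an assumption; against Postgres the same SQL would go over JDBC or psycopg2, or via spark.sql("INSERT ...")):

```python
import sqlite3

# sqlite3 stands in for Postgres here (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, amount REAL)")

# Batch insertion via plain SQL, the approach the reply recommends;
# in Spark this would be spark.sql("INSERT INTO ...") or df.write.jdbc.
rows = [("alice", 100.0), ("bob", 200.0)]
conn.executemany("INSERT INTO salaries VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM salaries").fetchone()[0]
print(count)  # 2
```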

Re: Got Error Creating permanent view in Postgresql through Pyspark code

2023-01-04 Thread Stelios Philippou
Vajiha, I believe that you might be confusing things. A permanent view in PSQL is a standard database view; a temp view or global view is the Spark view that is internal to Spark. Can we get a snippet of the code please? On Wed, 4 Jan 2023 at 15:10, Vajiha Begum S A wrote: > > I have tried to Create a
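A self-contained sketch of the distinction drawn above, with stdlib sqlite3 standing in for Postgres (an assumption for runnability; against Postgres you would issue the same DDL over psycopg2 or JDBC):

```python
import sqlite3

# A "permanent" view is a catalog object inside the database itself,
# created with plain SQL DDL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.execute("INSERT INTO t VALUES (1), (2)")
conn.execute("CREATE VIEW v AS SELECT id FROM t WHERE id > 1")

# Unlike a Spark temp view (df.createOrReplaceTempView), which lives
# only inside the Spark session, this view survives in the database.
result = [r[0] for r in conn.execute("SELECT id FROM v")]
print(result)  # [2]
```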

Re: Spark migration from 2.3 to 3.0.1

2023-01-02 Thread Stelios Philippou
Can we see your Spark configuration parameters? The master URL refers to, in Java, new SparkConf().setMaster("local[*]"), according to where you want to run this. On Mon, 2 Jan 2023 at 14:38, Shrikant Prasad wrote: > Hi, > > I am trying to migrate one spark application from Spark 2.3 to

Re: RDD block has negative value in Spark UI

2022-12-07 Thread Stelios Philippou
Already a known minor issue: https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-10141 On Wed, 7 Dec 2022, 15:09 K B M Kaala Subhikshan, < kbmkaalasubhiks...@gmail.com> wrote: > Could you explain why the RDD block has a negative value? > >

Re: [pyspark delta] [delta][Spark SQL]: Getting an Analysis Exception. The associated location (path) is not empty

2022-08-02 Thread Stelios Philippou
Hi Kumba. The SQL structure is a bit different for CREATE OR REPLACE TABLE. You can only do the following: CREATE TABLE IF NOT EXISTS https://spark.apache.org/docs/3.3.0/sql-ref-syntax-ddl-create-table-datasource.html On Tue, 2 Aug 2022 at 14:38, Sean Owen wrote: > I don't think "CREATE OR
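A runnable sketch of the IF NOT EXISTS behaviour referenced above, using stdlib sqlite3 in place of Spark SQL (an assumption for runnability; the IF NOT EXISTS clause reads the same in spark.sql DDL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE TABLE IF NOT EXISTS is a no-op when the table already exists,
# so running it twice does not raise.
conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER)")
conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER)")  # no error

# A bare CREATE TABLE, by contrast, fails the second time around.
try:
    conn.execute("CREATE TABLE events (id INTEGER)")
    second_create_failed = False
except sqlite3.OperationalError:
    second_create_failed = True
print(second_create_failed)  # True
```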

Re: how to properly filter a dataset by dates ?

2022-06-17 Thread Stelios Philippou
"02-03-2012").cast("date")); > ? > This is returning an empty dataset. > > On Fri, 17 Jun 2022 at 21:34, Stelios Philippou > wrote: > >> You are already doing it once. >> to_date the second part and don't forget to cast it as well >> >> On Fr

Re: how to properly filter a dataset by dates ?

2022-06-17 Thread Stelios Philippou
You are already doing it once. to_date the second part and don't forget to cast it as well. On Fri, 17 Jun 2022, 22:08 marc nicole, wrote: > should I cast the target date to a date then? for example maybe: > > dataset = >>
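A stdlib sketch of the advice in this thread (an assumption: plain Python in place of the Dataset API): parse both sides of the comparison to real dates. In Spark this corresponds to comparing a date column against to_date(lit("02-03-2012"), "dd-MM-yyyy") rather than against the raw string, which yields an empty result:

```python
from datetime import datetime

def to_date(s):
    # Same role as Spark's to_date with pattern "dd-MM-yyyy".
    return datetime.strptime(s, "%d-%m-%Y").date()

rows = ["01-01-2012", "02-03-2012", "15-06-2012"]
target = to_date("02-03-2012")

# Compare parsed dates, not strings: string comparison of "dd-MM-yyyy"
# values orders by day first and silently gives wrong (or empty) results.
after_target = [s for s in rows if to_date(s) > target]
print(after_target)  # ['15-06-2012']
```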

Re: API Problem

2022-06-10 Thread Stelios Philippou
>> On Thu, Jun 9, 2022, 9:41 PM Stelios Philippou >> wrote: >> >>> Perhaps >>> >>> >>> finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn("status_for_batch >>> >>> To >>> >>> finalDF.repartiti

Re: API Problem

2022-06-09 Thread Stelios Philippou
Perhaps change finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn("status_for_batch to finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn(col("status_for_batch") On Thu, 9 Jun 2022, 22:32 Sid, wrote: > Hi Experts, > > I am facing one problem while passing a column to the

Re: How to convert a Dataset to a Dataset?

2022-06-06 Thread Stelios Philippou
Hi All, Simple in Java as well. You can get the Dataset directly: Dataset encodedString = df.select("Column") .where("") .as(Encoders.STRING()) .toDF(); On Mon, 6 Jun 2022 at 15:26, Christophe Préaud < christophe.pre...@kelkoogroup.com> wrote: > Hi Marc, > > I'm not much familiar with Spark on

Re: Unable to format timestamp values in pyspark

2022-05-30 Thread Stelios Philippou
Sid, According to the error that I am seeing there, this is a date format issue: Text '5/1/2019 1:02:16' could not be parsed. But your time format is specified as 'M/dd/ H:mm:ss'. You can see that the day value is /1/ but your format is dd, which expects two digits. Please try
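A stdlib sketch of the mismatch described above (an assumption: Python's strptime in place of Spark's to_timestamp, whose Java-style pattern for this input would use single-letter fields such as "M/d/yyyy H:mm:ss"). '5/1/2019 1:02:16' has single-digit month, day and hour, so a pattern demanding two-digit fields cannot parse it:

```python
from datetime import datetime

# Python's %m/%d/%H accept non-zero-padded values, mirroring Spark's
# lenient single-letter pattern fields (M, d, H) rather than dd.
raw = "5/1/2019 1:02:16"
ts = datetime.strptime(raw, "%m/%d/%Y %H:%M:%S")
print(ts.year, ts.month, ts.day)  # 2019 5 1
```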

Re: Unable to convert double values

2022-05-29 Thread Stelios Philippou
Hi Sid, df = df.withColumn("annual_salary", regexp_replace(col("annual_salary"), "\.", "")) The value 125.06 becomes 12506, which when cast to double is 12506.00. Have you tried without removing the . ? df.withColumn("annual_salary", round(col("annual_salary").cast("double"),
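A plain-Python sketch of why stripping the decimal point corrupts the value (an assumption: stdlib string operations stand in for the DataFrame expressions regexp_replace and cast):

```python
raw = "125.06"

# Removing the "." first (as regexp_replace(col, "\\.", "") does)
# turns 125.06 into 12506 -- a different number entirely.
stripped = float(raw.replace(".", ""))

# Casting directly, then rounding, preserves the value.
direct = round(float(raw), 2)

print(stripped, direct)  # 12506.0 125.06
```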

Re: protobuf data as input to spark streaming

2022-04-06 Thread Stelios Philippou
Yes, we are currently using it as such. The code is in Java, will that work? On Wed, 6 Apr 2022 at 00:51, Kiran Biswal wrote: > Hello Experts > > Has anyone used protobuf (proto3) encoded data (from kafka) as input > source and been able to do spark structured streaming? > > I would appreciate if

Re: add an auto_increment column

2022-02-07 Thread Stelios Philippou
https://stackoverflow.com/a/51854022/299676 On Tue, 8 Feb 2022 at 09:25, Stelios Philippou wrote: > This has the information that you require in order to add an extra column > with a sequence to it. > > > On Tue, 8 Feb 2022 at 09:11, wrote: > >> >> Hello Gourav &g

Re: add an auto_increment column

2022-02-07 Thread Stelios Philippou
This has the information that you require in order to add an extra column with a sequence to it. On Tue, 8 Feb 2022 at 09:11, wrote: > > Hello Gourav > > > As you see here orderBy has already give the solution for "equal > amount": > > >>> df = > >>> >
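A stdlib sketch of the approach in the linked answer (an assumption: plain Python in place of Spark's row_number().over(Window.orderBy(...))): order the rows by the chosen column, then attach a 1-based sequence:

```python
# Hypothetical sample rows for illustration.
rows = [{"name": "b", "amount": 30},
        {"name": "a", "amount": 10},
        {"name": "c", "amount": 20}]

# Sort by the ordering column (the Window.orderBy step), then number
# the rows (the row_number step).
ordered = sorted(rows, key=lambda r: r["amount"])
for seq, row in enumerate(ordered, start=1):
    row["seq"] = seq

print([(r["name"], r["seq"]) for r in ordered])  # [('a', 1), ('c', 2), ('b', 3)]
```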

Re: Unable to use WriteStream to write to delta file.

2021-12-17 Thread Stelios Philippou
Hi Abhinav, Using ReadStream or Read will not matter. The following error java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.checkFieldNames( states that you are using a different version of Spark somewhere in your project, or you are using an

Re: Running spark Kafka streaming jo in Azure HDInsight

2021-10-06 Thread Stelios Philippou
Hi Favas, The error states that you are using different library versions. Exception in thread "streaming-start" java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V Have in mind that Spark uses its internal libraries for the majority

Re: 21/09/27 23:34:03 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

2021-09-27 Thread Stelios Philippou
It might be possible that you do not have the resources on the cluster, so your job will remain waiting for them, as they cannot be provided. On Tue, 28 Sep 2021, 04:26 davvy benny, wrote: > How can I solve the problem? > > On 2021/09/27 23:05:41, Thejdeep G wrote: > > Hi, > > > > That would
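A hedged sketch of checking the resource ask against cluster capacity (the flags are standard spark-submit options; the values are illustrative placeholders, not a recommendation):

```shell
# Request less than the cluster can actually grant; if the total of
# num-executors * (cores, memory) exceeds free capacity, the job sits
# in the "Initial job has not accepted any resources" state.
spark-submit \
  --master yarn \
  --num-executors 2 \
  --executor-cores 2 \
  --executor-memory 2g \
  your_app.py
```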

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-09-06 Thread Stelios Philippou
My local spark-submit: ~/development/SimpleKafkaStream $ spark-submit --version reports [Spark banner] version 3.1.2 Using Scala

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-09-06 Thread Stelios Philippou
perty which may arise > from relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Mon, 6 Sept 2021 at 11:16, Stelios Philippou > wrot

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-09-06 Thread Stelios Philippou
not really be doing such risky config changes (unless > you've got no other choice and you know what you're doing). > > Regards, > Jacek Laskowski > > https://about.me/JacekLaskowski > "The Internals Of" Online Books <https://books.japila.pl/> > Follow me on

Re: Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-08-31 Thread Stelios Philippou
to share as I think it's worth investigating. > > Regards, > Jacek Laskowski > > https://about.me/JacekLaskowski > "The Internals Of" Online Books <https://books.japila.pl/> > Follow me on https://twitter.com/jaceklaskowski > > <https://twitter.com

Spark Stream on Kubernetes Cannot Set up JavaSparkContext

2021-08-31 Thread Stelios Philippou
Hello, I have been facing the current issue for some time now and I was wondering if someone might have some insight on how I can resolve the following. The code (Java 11) is working correctly on my local machine, but whenever I try to launch it on K8s I am getting the following error.