Re: Spark reading from HBase using hbase-connectors - any benefit from localization?

2023-01-05 Thread Mich Talebzadeh
Hi Aaron, Thanks for the details. It is general practice, when running Spark on premises, to co-locate it with Hadoop clusters. This comes from the notion of data locality. Data locality in
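A minimal sketch (not from the thread; the app name is illustrative) of where data locality surfaces as a tunable in Spark: spark.locality.wait controls how long the scheduler holds a task for a data-local slot before scheduling it on a less-local node.

    from pyspark.sql import SparkSession

    # the scheduler prefers PROCESS_LOCAL/NODE_LOCAL placements and waits
    # this long before falling back to RACK_LOCAL or ANY (default: 3s)
    spark = (SparkSession.builder
             .appName("locality-sketch")
             .config("spark.locality.wait", "3s")
             .getOrCreate())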

Re: Spark reading from HBase using hbase-connectors - any benefit from localization?

2023-01-05 Thread Aaron Grubb
Hi Mich, Thanks for your reply. In hindsight I realize I didn't provide enough information about the infrastructure for the question to be answered properly. We are currently running a Hadoop cluster with nodes that have the following services:
- HDFS NameNode (3.3.4)
- YARN NodeManager

Re: Spark reading from HBase using hbase-connectors - any benefit from localization?

2023-01-05 Thread Mich Talebzadeh
Few questions:
- As I understand it, you already have a Hadoop cluster. Are you going to run Spark on the Hadoop nodes?
- Where is your HBase cluster? Is it sharing nodes with Hadoop, or does it have its own cluster?
I looked at that link and it does not say much. Essentially you want to use HBase

Re: Got Error Creating permanent view in Postgresql through Pyspark code

2023-01-05 Thread ayan guha
Hi, What you are trying to do does not make sense. I suggest you look into how views work in SQL. IMHO you are better off creating a table. Ayan On Fri, 6 Jan 2023 at 12:20 am, Stelios Philippou wrote: > Vajiha, > > I don't see your query working as you hope it will. > > spark.sql will
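A hedged sketch of the suggested alternative: instead of a permanent view, write the query result to Postgres as a real table over JDBC (URL, table name and credentials are placeholders):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-write-sketch").getOrCreate()
    result = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])

    # persist the result as a Postgres table rather than a view
    (result.write.format("jdbc")
        .option("url", "jdbc:postgresql://host:5432/mydb")
        .option("dbtable", "public.my_result_table")
        .option("user", "username")
        .option("password", "password")
        .option("driver", "org.postgresql.Driver")
        .mode("overwrite")
        .save())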

Re: GPU Support

2023-01-05 Thread Sean Owen
Spark itself does not use GPUs, but you can write and run code on Spark that uses GPUs. You'd typically use software like TensorFlow, which uses CUDA to access the GPU. On Thu, Jan 5, 2023 at 7:05 AM K B M Kaala Subhikshan < kbmkaalasubhiks...@gmail.com> wrote: > Is Gigabyte GeForce RTX 3080 GPU
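A minimal sketch of the pattern Sean describes, assuming TensorFlow with CUDA support is installed on the executors (names and data are illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gpu-sketch").getOrCreate()

    def square_on_gpu(rows):
        import tensorflow as tf  # imported on the executor
        vals = [float(r) for r in rows]
        if not vals:
            return iter([])
        # TensorFlow places this op on a GPU when CUDA can see one
        t = tf.constant(vals)
        return iter((t * t).numpy().tolist())

    rdd = spark.sparkContext.parallelize(range(8), 2)
    print(rdd.mapPartitions(square_on_gpu).collect())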

Re: Got Error Creating permanent view in Postgresql through Pyspark code

2023-01-05 Thread Stelios Philippou
Vajiha, I don't see your query working as you hope it will. spark.sql executes the query at the database level; to retrieve the temp view you need to go through the session, i.e. session.sql("SELECT * FROM TEP_VIEW"). You might need to retrieve the data into a collection and iterate over them to do
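A minimal sketch of the session-scoped access Stelios describes (the view name is taken from the thread; the data is illustrative):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    spark.range(3).createOrReplaceTempView("TEP_VIEW")  # session-scoped

    # query through the same session, then collect and iterate
    for row in spark.sql("SELECT * FROM TEP_VIEW").collect():
        print(row.id)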

GPU Support

2023-01-05 Thread K B M Kaala Subhikshan
Does the Gigabyte GeForce RTX 3080 GPU support running machine learning in Spark?

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-05 Thread Saurabh Gulati
and two single quotes together ('') look like a single double quote ("). Mvg/Regards Saurabh Gulati From: Saurabh Gulati Sent: 05 January 2023 12:24 To: Sean Owen Cc: User Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data
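A hedged sketch of the quoting behaviour under discussion, assuming local mode so the sample file is visible to the executors: per RFC 4180 a doubled quote "" inside a quoted field stands for one literal quote, and setting escape to the quote character makes Spark's CSV reader honour that.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-quote-sketch").getOrCreate()

    # illustrative input: a delimiter inside a quoted field, and ""
    # standing for a literal quote character
    with open("/tmp/sample.csv", "w") as f:
        f.write('id,text\n1,"a,b"\n2,"he said ""hi"""\n')

    df = (spark.read
          .option("header", True)
          .option("quote", '"')
          .option("escape", '"')  # treat "" as an escaped quote
          .csv("/tmp/sample.csv"))
    df.show(truncate=False)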

Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

2023-01-05 Thread Saurabh Gulati
It's the same input, except that the headers are also being read with the CSV reader. Mvg/Regards Saurabh Gulati From: Sean Owen Sent: 04 January 2023 15:12 To: Saurabh Gulati Cc: User Subject: Re: [EXTERNAL] Re: Incorrect csv parsing when delimiter used within the data

Re: [EXTERNAL] Re: Re: Incorrect csv parsing when delimiter used within the data

2023-01-05 Thread Saurabh Gulati
Yes, there are other ways to solve this, but I'm trying to understand why there is a difference in behaviour between df.show() and df.select("c").show(). Mvg/Regards Saurabh Gulati From: Shay Elbaz Sent: 04 January 2023 14:54 To: Saurabh Gulati ; Sean Owen Cc: Mich

Spark reading from HBase using hbase-connectors - any benefit from localization?

2023-01-05 Thread Aaron Grubb
(cross-posting from the HBase user list as I didn't receive a reply there) Hello, I'm completely new to Spark and am evaluating setting up a cluster either on YARN or standalone. Our idea for the general workflow is to create a concatenated dataframe using historical pickle/parquet files (whichever
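A hedged sketch of the read path being evaluated, combining historical parquet with an hbase-connectors scan (table name, column mapping and paths are illustrative; the connector jars must be on the classpath, and the union assumes matching schemas):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hbase-read-sketch").getOrCreate()

    # scan HBase through the hbase-connectors data source
    live = (spark.read
            .format("org.apache.hadoop.hbase.spark")
            .option("hbase.columns.mapping",
                    "id STRING :key, value DOUBLE cf:value")
            .option("hbase.table", "my_table")
            .load())

    # concatenate with the historical parquet data
    historical = spark.read.parquet("/path/to/historical/")
    combined = historical.unionByName(live)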

Re: How to set a config for a single query?

2023-01-05 Thread Khalid Mammadov
Hi, I believe there is a feature in Spark specifically for this purpose: you can create a new Spark session and set those configs on it. Note that it's not the same as creating separate driver processes with separate sessions; here you will still have the same SparkContext that works as a backend for
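A minimal sketch of that feature (the config value is illustrative): newSession() shares the SparkContext but keeps its own SQL conf, temp views and UDF registrations, so the setting is scoped to one session.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    scoped = spark.newSession()  # same SparkContext, isolated SQL conf
    scoped.conf.set("spark.sql.shuffle.partitions", "8")
    scoped.sql("SELECT 1 AS x").show()

    # the original session's conf is unchanged
    print(spark.conf.get("spark.sql.shuffle.partitions"))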