Re: Log file location in Spark on K8s

2023-10-09 Thread Prashant Sharma
Hi Sanket, Driver and executor logs are written to stdout by default; this can be configured using the SPARK_HOME/conf/log4j.properties file. The file, along with the entire SPARK_HOME/conf directory, is automatically propagated to all driver and executor containers and mounted as a volume. Thanks On Mon, 9 Oct, 2023, 5:37

Re: Clarification with Spark Structured Streaming

2023-10-09 Thread Danilo Sousa
Unsubscribe > On 9 Oct 2023, at 07:03, Mich Talebzadeh > wrote: > > Hi, > > Please see my responses below: > > 1) In Spark Structured Streaming does commit mean streaming data has been > delivered to the sink like Snowflake? > > No, a commit does not refer to data being

Re: Clarification with Spark Structured Streaming

2023-10-09 Thread Mich Talebzadeh
Your mileage varies. Often there is a flavour of cloud data warehouse (CDW) already in place, such as BigQuery, Redshift, Snowflake and so forth. They can all do a good job to varying degrees. - Use efficient data types. Choose data types that are efficient for Spark to process. For example, use

Re: Clarification with Spark Structured Streaming

2023-10-09 Thread ashok34...@yahoo.com.INVALID
Thank you for your feedback, Mich. In general, how can one optimise the cloud data warehouses (the sink part) to handle streaming Spark data efficiently, avoiding the bottlenecks that were discussed? AK On Monday, 9 October 2023 at 11:04:41 BST, Mich Talebzadeh wrote: Hi, Please see my

Re: Updating delta file column data

2023-10-09 Thread Mich Talebzadeh
In a nutshell, is this what you are trying to do? 1. Read the Delta table into a Spark DataFrame. 2. Explode the string column into a struct column. 3. Convert the hexadecimal field to an integer. 4. Write the DataFrame back to the Delta table in merge mode with a unique key. Is
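Steps 2 and 3 above can be sketched per row in plain Python. This is only an illustration under stated assumptions, not the method from the thread: it assumes the string column holds JSON-encoded struct data, and the names `payload`, `code`, and `update_hex_field` are hypothetical. Reading the Delta table and merging the result back (steps 1 and 4) are not shown.

```python
import json


def update_hex_field(raw: str, key: str) -> str:
    """Parse a JSON-encoded struct, convert one hexadecimal field to an int,
    and serialize the struct back to a string."""
    record = json.loads(raw)            # step 2: string -> struct (assumed JSON)
    record[key] = int(record[key], 16)  # step 3: hexadecimal string -> integer
    return json.dumps(record)


# Hypothetical rows standing in for the table contents; "id" plays the unique key.
rows = [
    {"id": 1, "payload": '{"name": "a", "code": "1F"}'},
    {"id": 2, "payload": '{"name": "b", "code": "0x10"}'},
]

# Apply the transformation to the string column of every row.
updated = [
    {**row, "payload": update_hex_field(row["payload"], "code")} for row in rows
]
```

In a Spark job the same per-row logic would normally be expressed with DataFrame column functions rather than a Python loop; the loop here only makes the transformation itself concrete.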

Log file location in Spark on K8s

2023-10-09 Thread Agrawal, Sanket
Hi All, We are trying to send the Spark logs using fluent-bit. We validated that fluent-bit is able to move logs of all other pods except the driver/executor pods. It would be great if someone could guide us on where we should look for Spark logs in Spark on Kubernetes with client/cluster mode

Re: Clarification with Spark Structured Streaming

2023-10-09 Thread Mich Talebzadeh
Hi, Please see my responses below: 1) In Spark Structured Streaming does commit mean streaming data has been delivered to the sink like Snowflake? No, a commit does not refer to data being delivered to a sink like Snowflake or BigQuery. The term commit refers to Spark Structured Streaming (SS)

Re: Updating delta file column data

2023-10-09 Thread Karthick Nk
Hi All, I have included the sample data below along with the operation I need to perform on it. I have Delta tables with columns whose data is stored as a string data type (containing struct data), so I need to update one key's value in the struct field data in the string column