unsubscribe

2024-05-03 Thread Bing
Replied Message | From: Wood Super | Date: 05/01/2024 07:49 | To: user | Subject: unsubscribe | unsubscribe

Spark Materialized Views: Improve Query Performance and Data Management

2024-05-03 Thread Mich Talebzadeh
Hi, I have raised ticket SPARK-48117 to enhance Spark's capabilities with Materialised Views (MVs). Currently both Hive and Databricks support them. I have added these potential benefits to the ticket: * Improved Query Performance

Re: Issue with Materialized Views in Spark SQL

2024-05-03 Thread Mich Talebzadeh
Sadly, it seems Apache Spark has nothing to do with materialised views. I was hoping it could read them! >>> spark.sql("SELECT * FROM test.mv ").show() Traceback (most recent call last): File "", line 1, in File "/opt/spark/python/pyspark/sql/session.py", line 1440, in

Help needed optimize spark history server performance

2024-05-03 Thread Vikas Tharyani
Dear Spark Community, I'm writing to seek your expertise in optimizing the performance of our Spark History Server (SHS) deployed on Amazon EKS. We're encountering timeouts (HTTP 504) when loading large event logs exceeding 5 GB. *Our Setup:* - Deployment: SHS on EKS with Nginx ingress (idle

Re: ********Spark streaming issue to Elastic data**********

2024-05-03 Thread Mich Talebzadeh
My recommendation: using materialized views (MVs) created in Hive together with Spark Structured Streaming and Change Data Capture (CDC) is a good combination for efficiently streaming view data updates in your scenario. HTH Mich Talebzadeh, Technologist | Architect | Data Engineer | Generative AI |

Re: Issue with Materialized Views in Spark SQL

2024-05-03 Thread Mich Talebzadeh
Thanks for the comments I received. So in summary, Apache Spark itself doesn't directly manage materialized views (MVs), but it can work with them through integration with underlying data storage systems like Hive, or through Iceberg. I believe Databricks, through Unity Catalog, supports MVs as