Re: Performance Issue

2019-01-08 Thread Gourav Sengupta
Hi, can you please let us know the Spark version, the query, whether the data is in Parquet format, and where it is stored? Regards, Gourav Sengupta On Wed, Jan 9, 2019 at 1:53 AM 大啊 wrote: > What is your performance issue? > At 2019-01-08 22:09:24, "Tzahi File" wrote

[Spark SQL] Failure Scenarios involving JDBC and SQL databases

2019-01-08 Thread Ramon Tuason
Hi all, I'm writing a data source that shares similarities with Spark's own JDBC implementation, and I'd like to ask a question about how Spark handles failure scenarios involving JDBC and SQL databases. To my understanding, if an executor dies while it's running a task, Spark will revive the e

Re:Performance Issue

2019-01-08 Thread 大啊
What is your performance issue? At 2019-01-08 22:09:24, "Tzahi File" wrote: Hello, I have a performance issue running a SQL query on Spark. The query involves one Parquet table partitioned by date, where each partition is about 200 GB, and a simple table with about 100 records.

Is it possible to rate limit an UDP?

2019-01-08 Thread email
I have a data frame to which I apply a UDF that calls a REST web service. This web service is distributed across only a few nodes, and it won't be able to handle a massive load from Spark. Is it possible to rate limit this UDF? For example, something like 100 op/s. If not, what are the opt
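A minimal sketch of the kind of throttling the question asks about, not an answer taken from the thread. It paces calls to a fixed rate inside a single Python process; the names `RateLimiter` and `acquire` are hypothetical and are not part of Spark. Because each worker process would hold its own limiter, the total load on the service would be roughly the per-process rate times the number of concurrently running tasks, so the per-process rate would have to be chosen with the cluster's parallelism in mind.

```python
import time


class RateLimiter:
    """Paces callers so that at most `rate` calls start per second.

    Hypothetical helper for illustration; it limits one process only.
    """

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate      # minimum spacing between calls
        self.clock = clock              # injectable for testing
        self.sleep = sleep              # injectable for testing
        self.next_allowed = clock()     # earliest start time of the next call

    def acquire(self):
        """Block until the next call may start; return the time waited."""
        now = self.clock()
        start = max(now, self.next_allowed)
        wait = start - now
        if wait > 0:
            self.sleep(wait)
        self.next_allowed = start + self.interval
        return wait
```

A caller would invoke `limiter.acquire()` immediately before each request to the web service, for example once per row inside the function that wraps the REST call.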

Performance Issue

2019-01-08 Thread Tzahi File
Hello, I have a performance issue running a SQL query on Spark. The query involves one Parquet table partitioned by date, where each partition is about 200 GB, and a simple table with about 100 records. The Spark cluster is of type m5.2xlarge (8 cores). I'm using the Qubole interface for runnin