The Dataset unit test is much slower than the RDD unit test (in Scala)

2022-10-25 Thread Tanin Na Nakorn
Hi All, Our data job is very complex (e.g. 100+ joins), and we recently switched from RDD to Dataset. We've found that the unit test takes much longer. We profiled it and found that it's the planning phase that is slow, not execution. I wonder if anyone has encountered this issue

Dynamic allocation on K8

2022-10-25 Thread Nikhil Goyal
Hi folks, When running Spark on Kubernetes, is it possible to use dynamic allocation? Some blog posts mention that dynamic allocation is available; however, I am not sure how it works. Spark official docs

Re: Prometheus with spark

2022-10-25 Thread Raja bhupati
We have a use case where we would like to process Prometheus metrics data with Spark.
On Tue, Oct 25, 2022, 19:49 Jacek Laskowski wrote:
> Hi Raj,
>
> Do you want to do the following?
>
> spark.read.format("prometheus").load...
>
> I haven't heard of such a data source / format before.
>
> What would

Re: Prometheus with spark

2022-10-25 Thread Jacek Laskowski
Hi Raj,
Do you want to do the following?
spark.read.format("prometheus").load...
I haven't heard of such a data source / format before. What would you like it for?
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
"The Internals Of" Online Books