Maybe something like Livy; otherwise, roll your own REST API and have it
start a Spark job.
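
For the Livy route, here is a rough, untested sketch of what the client side
could look like. It simply POSTs a batch to Livy's /batches endpoint; the
Livy host/port, jar path, class name and argument are placeholders you would
need to replace for your own cluster.

import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object LivySubmit {
  def main(args: Array[String]): Unit = {
    // Placeholder Livy endpoint and application jar -- adjust for your setup.
    val livyUrl = "http://livy-host:8998/batches"
    val payload =
      """{
        |  "file": "hdfs:///jars/my-spark-app.jar",
        |  "className": "com.example.MySparkJob",
        |  "args": ["some-request-parameter"]
        |}""".stripMargin

    val request = HttpRequest.newBuilder()
      .uri(URI.create(livyUrl))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(payload))
      .build()

    // Livy answers with the batch id and state, e.g. {"id":0,"state":"starting",...},
    // which you can then poll at /batches/{id} to track the job.
    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
    println(response.body())
  }
}
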
On Mon, 20 Jan 2020 at 06:55, wrote:
> I am new to Spark. The task I want to accomplish is to let a client send HTTP
> requests, and then have Spark process those requests for further operations. However
> searching Spark
I am new to Spark. The task I want to accomplish is to let a client send HTTP
requests, and then have Spark process those requests for further operations. However,
searching Spark's website docs
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package
https://spark.apache.or
Hi Anbutech,
If I am not mistaken, you are trying to read multiple dataframes from
around 150 different paths (in your case the Kafka topics) to count their
records, and you have all these paths stored in a CSV with columns year,
month, day and hour.
Here is what I came up with; I have been
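
(The code in this message is cut off here, so the following is not the
original author's version, just a minimal sketch of one way that loop could
look. It assumes the CSV also carries a topic column and that each hour's
data sits under a Parquet path built from topic/year/month/day/hour -- both
are assumptions to adjust to the real layout.)

import org.apache.spark.sql.SparkSession

object CountPerPath {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("count-per-path").getOrCreate()

    // CSV listing the ~150 partitions to scan; path and column names are assumed.
    val meta = spark.read.option("header", "true").csv("/data/meta/topic_hours.csv")

    val counts = meta.collect().map { row =>
      // Build the data path for that hour; this layout is only an example.
      val p = s"/data/${row.getAs[String]("topic")}" +
        s"/year=${row.getAs[String]("year")}/month=${row.getAs[String]("month")}" +
        s"/day=${row.getAs[String]("day")}/hour=${row.getAs[String]("hour")}"
      (p, spark.read.parquet(p).count())
    }

    counts.foreach { case (p, n) => println(s"$p -> $n") }
  }
}
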
It depends on the use case: if you have to join, you're saving a join and a
shuffle by having the data already in an array.
If you explode, at least sort within partitions so you get predicate
pushdown when you read the data next time.
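
To make the second point concrete, here is a minimal sketch of sorting within
partitions before the write. The dataframe, the column name "id" and the
paths are only illustrative, not anything from the original thread.

import org.apache.spark.sql.SparkSession

object SortedWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sorted-write").getOrCreate()

    val events = spark.read.parquet("/data/events_exploded") // assumed input path

    events
      .sortWithinPartitions("id") // clusters values of "id" inside each output file
      .write
      .mode("overwrite")
      .parquet("/data/events_sorted")

    // A later read that filters on "id" can skip Parquet row groups whose min/max
    // statistics exclude the value, which is the benefit mentioned above.
    spark.read.parquet("/data/events_sorted").filter("id = 'abc'").show()
  }
}

Sorting within partitions avoids the extra shuffle a global sort would cost,
while still clustering values enough for Parquet's per-row-group statistics
to be useful on the next read.
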
On Sun, 19 Jan 2020, 1:19 pm Jörn Franke, wrote:
> Why not two tab