Thank you, Sanchay, and thank you for the blog.

On Tuesday, 19 July 2022, sanchay javeria <sanchay.jave...@gmail.com> wrote:
> I ran into this issue and solved it roughly the way you described your
> second approach. You can modify the SQLInterpreter
> <https://github.com/apache/incubator-livy/blob/master/repl/src/main/scala/org/apache/livy/repl/SQLInterpreter.scala#L97>
> to write the output dataframe to S3 instead, and on your client you can
> retrieve the results in a paginated manner from S3. I wrote about this
> problem in a blog post
> <https://medium.com/pinterest-engineering/interactive-querying-with-apache-spark-sql-at-pinterest-2a3eaf60ac1b>
> last year (see "Large Result Handling and Status Tracking") if you're
> interested.
>
> Best
>
> On Mon, 18 Jul 2022 at 11:38, Gos Os <goosro...@gmail.com> wrote:
>
>> Hello folks,
>>
>> I am new to Apache Livy and am currently trying to understand how
>> feasible Livy would be for an interactive-query app with 300 users.
>>
>> Latency to first results is critical for the customer experience.
>>
>> My biggest concern is the 1000-row limit associated with take/collect.
>> Most of the ad-hoc queries will easily return more than 10k rows.
>>
>> In my view there are two options:
>>
>> 1- Livy batch submission with S3 as the destination, then read the
>> results from the app from S3. This will not be the best experience, as
>> customers can't see results right away.
>>
>> 2- Interactive query submission via Livy, plus a mechanism to perform
>> pagination or to write results to S3 when more than 1000 rows are
>> returned. The app would know the query has more than 1000 rows and
>> automatically start paginating from S3 after the first 1000.
>>
>> My question: how have other Livy users with a requirement of low latency
>> to first result solved this?
>>
>> Thank you,
>> Gos.
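
To make Sanchay's suggestion concrete: a minimal sketch of the server-side change, in the spirit of replacing the take/collect that caps interactive results at 1000 rows with a spill to S3. The bucket path, the page size, and the `executeToS3` helper are illustrative assumptions, not the actual Pinterest patch; a real change inside `SQLInterpreter.execute` would also need to return the result prefix (and the schema) in the statement's JSON response.

```scala
import java.util.UUID
import org.apache.spark.sql.SparkSession

// Sketch only: write the full result set to S3 instead of take(1000).
object LargeResultHandler {
  val resultRoot = "s3a://my-bucket/livy-results" // hypothetical bucket

  def executeToS3(spark: SparkSession, statement: String): String = {
    val df = spark.sql(statement)
    val dest = s"$resultRoot/${UUID.randomUUID()}"
    df.write
      .option("maxRecordsPerFile", 10000L) // one ~10k-row part file per "page"
      .json(dest)
    dest // the client paginates over the part files under this prefix
  }
}
```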
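On the client side (Gos's option 2), pagination can then be as simple as listing the part files under that prefix and fetching them one at a time. This sketch assumes the AWS SDK for Java v1 and Scala 2.13 collection converters; the object and method names local to the sketch are again assumptions:

```scala
import scala.io.Source
import scala.jdk.CollectionConverters._ // use JavaConverters on Scala 2.12
import com.amazonaws.services.s3.AmazonS3ClientBuilder

object ResultPaginator {
  private val s3 = AmazonS3ClientBuilder.defaultClient()

  // Each JSON part file written by the sketch above is one page.
  // Note: listObjectsV2 itself returns at most 1000 keys per call, so a
  // result with more part files than that needs continuation-token handling.
  def pages(bucket: String, prefix: String): Seq[String] =
    s3.listObjectsV2(bucket, prefix)
      .getObjectSummaries.asScala
      .map(_.getKey)
      .filter(_.endsWith(".json")) // skip _SUCCESS and other markers
      .sorted
      .toSeq

  // Fetch one page eagerly so the S3 connection can be closed promptly.
  def fetchPage(bucket: String, key: String): Vector[String] = {
    val obj = s3.getObject(bucket, key)
    try Source.fromInputStream(obj.getObjectContent).getLines().toVector
    finally obj.close()
  }
}
```

With this shape, the app can render the first page as soon as the write completes and fetch later pages only on demand, which keeps the amount of data moved per interaction small.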