Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Lalwani, Jayesh
Presto has slightly lower latency than Spark, but I've found that it gets stuck on some edge cases. If you are on AWS, then the simplest solution is to use Athena. Athena is built on Presto, has a JDBC driver, and is serverless, so you avoid the operational headaches.

On 2/18/21, 3:32 PM,
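For anyone following the Athena route: the connection is configured through the Athena JDBC driver's URL. A rough sketch (the region, bucket, and result prefix here are placeholders — check the driver documentation for the full set of properties):

```
jdbc:awsathena://AwsRegion=us-east-1;S3OutputLocation=s3://my-bucket/athena-query-results/
```

Athena writes query results to the given S3 location, so that bucket must exist and be writable by the caller's credentials.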

Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Scott Ribe
> On Feb 18, 2021, at 12:52 PM, Jeff Evans wrote:
>
> It sounds like the tool you're after, then, is a distributed SQL engine like Presto. But I could be totally misunderstanding what you're trying to do.

Presto may well be a longer-term solution as our use grows. For now, a simple data

Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Scott Ribe
> On Feb 18, 2021, at 1:13 PM, Lalwani, Jayesh wrote:
>
> Have you tried any of those? Where are you getting stuck?

Thanks! The 3rd one in your list I had not found, and it seems to fill in what I was missing (CREATE EXTERNAL TABLE). I'd found the first two, but they only got me creating

Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Lalwani, Jayesh
There are several step-by-step guides that you can find online by googling:

https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-thrift-server.html
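Condensed from those guides, the basic flow looks roughly like this (assuming a local Spark installation with $SPARK_HOME set; the port is the default, adjust to taste):

```shell
# Start the Spark Thrift Server, a HiveServer2-compatible JDBC endpoint
# (listens on port 10000 by default).
$SPARK_HOME/sbin/start-thriftserver.sh

# Connect with beeline (or any JDBC client) and run SQL against it,
# e.g. to register Parquet files as an external table.
$SPARK_HOME/bin/beeline -u jdbc:hive2://localhost:10000
```

Any JDBC client that speaks the HiveServer2 protocol can connect the same way beeline does.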

Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Jeff Evans
It sounds like the tool you're after, then, is a distributed SQL engine like Presto. But I could be totally misunderstanding what you're trying to do.

On Thu, Feb 18, 2021 at 1:48 PM Scott Ribe wrote:
> I have a client side piece that needs access via JDBC.
>
> > On Feb 18, 2021, at 12:45 PM,

Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Scott Ribe
I have a client side piece that needs access via JDBC.

> On Feb 18, 2021, at 12:45 PM, Jeff Evans wrote:
>
> If the data is already in Parquet files, I don't see any reason to involve JDBC at all. You can read Parquet files directly into a DataFrame.

Re: how to serve data over JDBC using simplest setup

2021-02-18 Thread Jeff Evans
If the data is already in Parquet files, I don't see any reason to involve JDBC at all. You can read Parquet files directly into a DataFrame.

https://spark.apache.org/docs/latest/sql-data-sources-parquet.html

On Thu, Feb 18, 2021 at 1:42 PM Scott Ribe wrote:
> I need a little help figuring out

how to serve data over JDBC using simplest setup

2021-02-18 Thread Scott Ribe
I need a little help figuring out how some pieces fit together. I have some tables in parquet files, and I want to access them using SQL over JDBC. I gather that I need to run the thrift server, but how do I configure it to load my files into datasets and expose views? The context is this: