Thanks, that's what I was thinking. But how do I set up a connection per worker?
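For the per-worker setup, the usual Spark pattern is a lazily initialized singleton: the pool is created once per executor process the first time a task needs it, and reused by every task on that executor. A minimal sketch, with `PgPool` and its contents as hypothetical placeholders (in practice you'd wrap something like psycopg2's pool on the Python side, or HikariCP on the JVM side):

```python
import threading

# Hedged sketch of a per-process (per-executor) connection pool singleton.
# "PgPool" and its connection list are placeholders, not a real driver API.
class PgPool:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get(cls):
        # Double-checked locking: the pool is built once per worker process,
        # no matter how many tasks call get().
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance

    def __init__(self):
        # Placeholder for opening real database connections.
        self.connections = ["conn-%d" % i for i in range(4)]

# In Spark you would call this from mapPartitions, so the pool is shared
# across rows and across tasks running on the same executor:
#   rdd.mapPartitions(lambda rows: process(PgPool.get(), rows))
print(PgPool.get() is PgPool.get())  # True: same pool on every call
```

Because the class lives at module level on each worker, it is not serialized with the task closure, which is what makes the once-per-executor behaviour work.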
On Monday, 25 July 2016, ayan guha <guha.a...@gmail.com> wrote:

> In order to use an existing pg UDF, you may create a view in pg and expose
> the view to Hive. The Spark-to-database connection happens from each
> executor, so you must have a connection or a pool of connections per
> worker. Executors on the same worker can share a connection pool.
>
> Best
> Ayan
>
> On 25 Jul 2016 16:48, "Marco Colombo" <ing.marco.colo...@gmail.com> wrote:
>
>> Hi all!
>> Among other use cases, I want to use Spark as a distributed SQL engine
>> via the Thrift server.
>> I have some tables in Postgres and Cassandra: I need to expose them via
>> Hive for custom reporting.
>> The basic implementation is simple and works, but I have some concerns
>> and open questions:
>> - Is there a better approach rather than mapping a temp table as a
>>   select of the full table?
>> - What about query setup cost? I mean, is there a way to avoid the DB
>>   connection setup cost by using a pre-created connection pool?
>> - Is it possible from HiveQL to use functions defined in the pg database,
>>   or do I have to rewrite them as UDAFs?
>>
>> Thanks!
>>
>> --
>> Ing. Marco Colombo

-- 
Ing. Marco Colombo
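The database-side view Ayan describes can be sketched without Spark at all: the logic (which in Postgres could call existing pg UDFs) lives in the view, and Spark/Hive then maps the view instead of a select over the raw table. Here sqlite3 stands in for Postgres, and the table and view names are hypothetical:

```python
import sqlite3

# Stand-in for the "create a view in pg, expose the view" approach.
# sqlite3 replaces Postgres here; "orders"/"order_totals" are made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 32.5)])

# The view encapsulates the in-database computation; the reporting layer
# only ever sees the view.
conn.execute(
    "CREATE VIEW order_totals AS "
    "SELECT COUNT(*) AS n, SUM(amount) AS total FROM orders"
)
row = conn.execute("SELECT n, total FROM order_totals").fetchone()
print(row)  # (2, 42.5)
```

In the real setup, Spark would read `order_totals` over JDBC and register it for the Thrift server, so the pg UDF logic never needs to be rewritten as a Hive UDAF.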