Thanks, that's what I was thinking. But how do I set up a connection per worker?
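For the per-worker setup, the usual Spark pattern is a lazily initialized singleton: the pool is created once per executor process the first time a task needs it, and reused by every task on that executor. A minimal sketch, with `PgPool` and its contents as hypothetical placeholders (in practice you'd wrap something like psycopg2's pool on the Python side, or HikariCP on the JVM side):

```python
import threading

# Hedged sketch of a per-process (per-executor) connection pool singleton.
# "PgPool" and its connection list are placeholders, not a real driver API.
class PgPool:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get(cls):
        # Double-checked locking: the pool is built once per worker process,
        # no matter how many tasks call get().
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance

    def __init__(self):
        # Placeholder for opening real database connections.
        self.connections = ["conn-%d" % i for i in range(4)]

# In Spark you would call this from mapPartitions, so the pool is shared
# across rows and across tasks running on the same executor:
#   rdd.mapPartitions(lambda rows: process(PgPool.get(), rows))
print(PgPool.get() is PgPool.get())  # True: same pool on every call
```

Because the class lives at module level on each worker, it is not serialized with the task closure, which is what makes the once-per-executor behaviour work.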
On Monday, 25 July 2016, ayan guha <guha.a...@gmail.com> wrote:

> In order to use an existing pg UDF, you may create a view in pg and expose
> the view to Hive. The Spark-to-database connection happens from each
> executor, so you must have a connection or a pool of connections per
> worker. Executors on the same worker can share a connection pool.
>
> Best
> Ayan
>
> On 25 Jul 2016 16:48, "Marco Colombo" <ing.marco.colo...@gmail.com> wrote:
>
>> Hi all!
>> Among other use cases, I want to use Spark as a distributed SQL engine
>> via the Thrift server.
>> I have some tables in Postgres and Cassandra: I need to expose them via
>> Hive for custom reporting.
>> The basic implementation is simple and works, but I have some concerns
>> and open questions:
>> - Is there a better approach rather than mapping a temp table as a
>>   select of the full table?
>> - What about query setup cost? I mean, is there a way to avoid the DB
>>   connection setup cost by using a pre-created connection pool?
>> - Is it possible from HiveQL to use functions defined in the pg database,
>>   or do I have to rewrite them as UDAFs?
>>
>> Thanks!
>>
>> --
>> Ing. Marco Colombo

-- 
Ing. Marco Colombo
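The database-side view Ayan describes can be sketched without Spark at all: the logic (which in Postgres could call existing pg UDFs) lives in the view, and Spark/Hive then maps the view instead of a select over the raw table. Here sqlite3 stands in for Postgres, and the table and view names are hypothetical:

```python
import sqlite3

# Stand-in for the "create a view in pg, expose the view" approach.
# sqlite3 replaces Postgres here; "orders"/"order_totals" are made-up names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 32.5)])

# The view encapsulates the in-database computation; the reporting layer
# only ever sees the view.
conn.execute(
    "CREATE VIEW order_totals AS "
    "SELECT COUNT(*) AS n, SUM(amount) AS total FROM orders"
)
row = conn.execute("SELECT n, total FROM order_totals").fetchone()
print(row)  # (2, 42.5)
```

In the real setup, Spark would read `order_totals` over JDBC and register it for the Thrift server, so the pg UDF logic never needs to be rewritten as a Hive UDAF.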