Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-05 Thread Muthu Jayakumar
I run a spark-submit (https://spark.apache.org/docs/latest/spark-standalone.html#launching-spark-applications) in client mode that starts the microservice. If you keep the event loop going, the Spark context remains active. Thanks, Muthu On Mon, Jun 5, 2017 at 2:44 PM, kant kodali
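
A minimal sketch of what such a client-mode launch might look like, written as a Python helper that only assembles the spark-submit argument list. The master URL, main class, and jar name are hypothetical placeholders and do not come from this thread; `--master`, `--deploy-mode`, and `--class` are standard spark-submit options.

```python
import shlex


def client_mode_submit_args(master, main_class, app_jar):
    """Assemble a spark-submit command that runs the driver in client mode."""
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "client",  # driver runs inside the launching process
        "--class", main_class,
        app_jar,
    ]


# Hypothetical values, for illustration only.
args = client_mode_submit_args(
    "spark://spark-master:7077",
    "com.example.QueryService",
    "query-service.jar",
)
print(shlex.join(args))
```

In client mode the driver lives in the process that spark-submit starts, so the Spark context should stay usable for as long as the service's event loop keeps that process alive.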

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-05 Thread kant kodali
Are you launching SparkSession from a microservice or through spark-submit? On Sun, Jun 4, 2017 at 11:52 PM, Muthu Jayakumar wrote: > Hello Kant, > > >I still don't understand How SparkSession can use Akka to communicate > with SparkCluster? > Let me use your initial

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-05 Thread Muthu Jayakumar
Hello Kant, >I still don't understand How SparkSession can use Akka to communicate with SparkCluster? Let me use your initial requirement to illustrate what I mean, i.e., "I want my microservice app to be able to query and access data on HDFS". In order to run a query say a DF query

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread kant kodali
Hi Muthu, I am actually using the Play framework for my microservice, which uses Akka, but I still don't understand how SparkSession can use Akka to communicate with the Spark cluster. SparkPi or SparkPl? Any link? Thanks!

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread Muthu Jayakumar
One drastic suggestion would be to write a simple microservice using Akka, create a SparkSession (when the VM starts), and pass it around. You can look at SparkPi for sample source code to start writing your microservice. In my case, I used Akka HTTP to wrap my business requests and transform
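
The idea of creating one SparkSession at startup and passing it around can be sketched as follows. This is an illustration in Python with PySpark, not the Scala/Akka HTTP service described above; the application name and the helper names `get_session` and `count_rows` are hypothetical. The PySpark import is deferred so the module loads even where PySpark is not installed.

```python
import threading

_lock = threading.Lock()
_session = None


def get_session(factory=None):
    """Return the process-wide session, creating it once on first use."""
    global _session
    with _lock:
        if _session is None:
            if factory is None:
                # Assumes PySpark is installed; the app name is a placeholder.
                from pyspark.sql import SparkSession
                factory = SparkSession.builder.appName(
                    "parquet-query-service"
                ).getOrCreate
            _session = factory()
        return _session


def count_rows(path, session=None):
    """Count the rows of a Parquet dataset using the shared session."""
    if session is None:
        session = get_session()
    return session.read.parquet(path).count()
```

A request handler in the service would call something like `count_rows("hdfs:///path/to/aggregates")` (a hypothetical path) and reuse the same session on every request instead of starting a new Spark application each time.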

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread Sandeep Nemuri
Well, if you are using the Hortonworks distribution, there is Livy2, which is compatible with Spark 2 and Scala 2.11. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_command-line-installation/content/install_configure_livy2.html On Sun, Jun 4, 2017 at 1:55 PM, kant kodali
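
For readers weighing the Livy route, here is a rough sketch of how a service might prepare requests for Livy's REST API (`POST /sessions` to open an interactive session, `POST /sessions/{id}/statements` to run code in it). The host name, session id, and HDFS path are hypothetical; 8998 is Livy's default port. The helpers only build the requests and do not send them.

```python
import json
import urllib.request

LIVY_URL = "http://livy-host:8998"  # hypothetical host, default Livy port


def _post(url, payload):
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def new_session_request(base_url, kind="spark"):
    """Request that asks Livy to start an interactive session."""
    return _post(f"{base_url}/sessions", {"kind": kind})


def statement_request(base_url, session_id, code):
    """Request that submits a code snippet to an existing session."""
    return _post(f"{base_url}/sessions/{session_id}/statements", {"code": code})


# Scala snippet to run in the session; the path is a placeholder.
req = statement_request(
    LIVY_URL, 0, 'spark.read.parquet("hdfs:///path/to/aggregates").count()'
)
```

Sending a request is then a matter of `urllib.request.urlopen(req)` and polling the returned statement until its result is available.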

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread kant kodali
Hi, thanks for this, but here is what the documentation says: "To run the Livy server, you will also need an Apache Spark installation. You can get Spark releases at https://spark.apache.org/downloads.html. Livy requires at least Spark 1.4 and currently only supports Scala 2.10 builds of Spark.

Re: What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread Sandeep Nemuri
Check out http://livy.io/ On Sun, Jun 4, 2017 at 11:59 AM, kant kodali wrote: > Hi All, > > I am wondering what is the easiest way for a Micro service to query data > on HDFS? By easiest way I mean using minimal number of tools. > > Currently I use spark structured

What is the easiest way for an application to Query parquet data on HDFS?

2017-06-04 Thread kant kodali
Hi All, I am wondering what is the easiest way for a microservice to query data on HDFS. By easiest way I mean using a minimal number of tools. Currently I use Spark Structured Streaming to do some real-time aggregations and store the results in HDFS. But now, I want my microservice app to be able to