Re: Problems with reading data from parquet files in a HDFS remotely

2016-01-08 Thread Henrik Baastrup
> *Date:* Thu, 7 Jan 2016 17:54
> *To:* user@spark.apache.org
> *Cc:* Baastrup, Henrik
> *Subject:* Problems with reading data from parquet files in a HDFS remotely
>
> Hi All,
>
> I have a small Hadoop cluster where I have stored a lot of data in parquet [...]

Re: Problems with reading data from parquet files in a HDFS remotely

2016-01-08 Thread Henrik Baastrup
> *From:* Henrik Baastrup
> *Date:* Thu, 7 Jan 2016 17:54
> *To:* user@spark.apache.org
> *Cc:* Baastrup, Henrik
> *Subject:* Problems with reading data from parquet files in a HDFS remotely
>
> Hi All,
>
> I have a small Hadoop cluster [...]

Re: Problems with reading data from parquet files in a HDFS remotely

2016-01-07 Thread Prem Sure
You may need to add a createDataFrame (for Python, inferSchema) call before registerTempTable.

Thanks,
Prem

On Thu, Jan 7, 2016 at 12:53 PM, Henrik Baastrup <henrik.baast...@netscout.com> wrote:
> Hi All,
>
> I have a small Hadoop cluster where I have stored a lot of data in parquet [...]

Problems with reading data from parquet files in a HDFS remotely

2016-01-07 Thread Henrik Baastrup
Hi All,

I have a small Hadoop cluster where I have stored a lot of data in parquet files. I have installed a Spark master service on one of the nodes and would now like to query my parquet files from a Spark client. When I run the following program from the spark-shell on the Spark Master node [...]