Re: Problem while loading saved data

2015-09-04 Thread Amila De Silva
Hi Ewan,

To start up the cluster I simply ran ./sbin/start-master.sh from the master
node and ./sbin/start-slave.sh from the slave. I didn't configure HDFS
explicitly.
Is there something additional that has to be done?
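
A quick way to check which filesystem Spark writes are defaulting to, run from
the same notebook (a generic PySpark snippet, not something from the original
thread; 'sc' is the notebook's existing SparkContext):

hadoop_conf = sc._jsc.hadoopConfiguration()
print(hadoop_conf.get("fs.defaultFS"))  # file:/// means each node writes to its own local disk
# (on older Hadoop builds the key is fs.default.name)

If this prints file:/// or nothing at all, nothing is pointing Spark at HDFS.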


On Fri, Sep 4, 2015 at 12:42 AM, Ewan Leith <ewan.le...@realitymine.com>
wrote:

> From that, I'd guess that HDFS isn't set up between the nodes, or for some
> reason writes are defaulting to file:///path/ rather than hdfs:///path/.

Re: Problem while loading saved data

2015-09-03 Thread Ewan Leith
From that, I'd guess that HDFS isn't set up between the nodes, or for some
reason writes are defaulting to file:///path/ rather than hdfs:///path/.
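
One way to test that theory is to bypass the default filesystem entirely and
use fully qualified URIs for both the write and the read. A minimal sketch,
assuming a namenode at namenode:9000 (a placeholder, not an address from this
thread):

# substitute the cluster's real namenode host:port
df.write.parquet("hdfs://namenode:9000/user/ubuntu/people.parquet")
df2 = sqlContext.read.parquet("hdfs://namenode:9000/user/ubuntu/people.parquet")

If the hdfs:// write itself fails, HDFS isn't reachable from the workers, which
would confirm the guess above.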




-- Original message--

From: Amila De Silva

Date: Thu, 3 Sep 2015 17:12

To: Ewan Leith;

Cc: user@spark.apache.org;

Subject: Re: Problem while loading saved data


Hi Ewan,

Yes, 'people.parquet' is from the first attempt; in that attempt it tried to
save the same people.json.

It seems that the same folder is created on both nodes, and the contents of the
files are distributed between the two servers.

On the master node (this is the same node which runs the IPython Notebook) this
is what I have:

people.parquet
└── _SUCCESS

On the slave I get,
people.parquet
└── _temporary
    └── 0
        ├── task_201509030057_4699_m_00
        │   └── part-r-0-b921ed54-53fa-459b-881c-cccde7f79320.gz.parquet
        ├── task_201509030057_4699_m_01
        │   └── part-r-1-b921ed54-53fa-459b-881c-cccde7f79320.gz.parquet
        └── _temporary

I have zipped and attached both the folders.


RE: Problem while loading saved data

2015-09-03 Thread Ewan Leith
Your error log shows you attempting to read from 'people.parquet2', not
'people.parquet' as you've written below; is that just from a different attempt?

Otherwise, it's an odd one! There are no _SUCCESS, _common_metadata or
_metadata files under the people.parquet you've listed below, which would
normally be created when the write completes. Can you show us your write
output?
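
For reference, a completed parquet write from this era of Spark normally
leaves a layout along these lines at the top level (an illustrative listing,
not taken from the attached folders):

people.parquet
├── _SUCCESS
├── _common_metadata
├── _metadata
└── part-r-00000-<uuid>.gz.parquet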


Thanks,
Ewan




Re: Problem while loading saved data

2015-09-02 Thread Guru Medasani
Hi Amila,

Error says that the ‘people.parquet’ file does not exist. Can you manually 
check to see if that file exists?

> Py4JJavaError: An error occurred while calling o53840.parquet.
> : java.lang.AssertionError: assertion failed: No schema defined, and no 
> Parquet data file or summary file found under 
> file:/home/ubuntu/ipython/people.parquet2.
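
One way to do that manual check from inside the notebook, against the same
filesystem Spark resolves relative paths on (a generic py4j sketch, not part
of the original exchange):

# reach Hadoop's FileSystem API through the py4j gateway
Path = sc._jvm.org.apache.hadoop.fs.Path
fs = sc._jvm.org.apache.hadoop.fs.FileSystem.get(sc._jsc.hadoopConfiguration())
print(fs.exists(Path("people.parquet")))   # does Spark see the directory at all?
for status in fs.listStatus(Path("people.parquet")):
    print(status.getPath())                # and what does it actually contain?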


Guru Medasani
gdm...@gmail.com



> On Sep 2, 2015, at 8:25 PM, Amila De Silva  wrote:
> 
> Hi All,
> 
> I have a two-node Spark cluster, to which I'm connecting using an IPython
> notebook.
> To see how data saving/loading works, I simply created a dataframe from
> people.json using the code below:
> 
> df = sqlContext.read.json("examples/src/main/resources/people.json")
> 
> Then called the following to save the dataframe as a parquet file:
> df.write.save("people.parquet")
> 
> Tried loading the saved dataframe using:
> df2 = sqlContext.read.parquet('people.parquet');
> 
> But this simply fails, giving the following exception:
> 
> ---
> Py4JJavaError Traceback (most recent call last)
> <ipython-input-...> in <module>()
> ----> 1 df2 = sqlContext.read.parquet('people.parquet2');
> 
> /srv/spark/python/pyspark/sql/readwriter.pyc in parquet(self, *path)
> 154 [('name', 'string'), ('year', 'int'), ('month', 'int'), 
> ('day', 'int')]
> 155 """
> --> 156 return 
> self._df(self._jreader.parquet(_to_seq(self._sqlContext._sc, path)))
> 157 
> 158 @since(1.4)
> 
> /srv/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in 
> __call__(self, *args)
> 536 answer = self.gateway_client.send_command(command)
> 537 return_value = get_return_value(answer, self.gateway_client,
> --> 538 self.target_id, self.name )
> 539 
> 540 for temp_arg in temp_args:
> 
> /srv/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py in 
> get_return_value(answer, gateway_client, target_id, name)
> 298 raise Py4JJavaError(
> 299 'An error occurred while calling {0}{1}{2}.\n'.
> --> 300 format(target_id, '.', name), value)
> 301 else:
> 302 raise Py4JError(
> 
> Py4JJavaError: An error occurred while calling o53840.parquet.
> : java.lang.AssertionError: assertion failed: No schema defined, and no 
> Parquet data file or summary file found under 
> file:/home/ubuntu/ipython/people.parquet2.
>   at scala.Predef$.assert(Predef.scala:179)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.org$apache$spark$sql$parquet$ParquetRelation2$MetadataCache$$readSchema(newParquet.scala:429)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$11.apply(newParquet.scala:369)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$11.apply(newParquet.scala:369)
>   at scala.Option.orElse(Option.scala:257)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.refresh(newParquet.scala:369)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$metadataCache$lzycompute(newParquet.scala:126)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$metadataCache(newParquet.scala:124)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$dataSchema$1.apply(newParquet.scala:165)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$dataSchema$1.apply(newParquet.scala:165)
>   at scala.Option.getOrElse(Option.scala:120)
>   at 
> org.apache.spark.sql.parquet.ParquetRelation2.dataSchema(newParquet.scala:165)
>   at 
> org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:506)
>   at 
> org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:505)
>   at 
> org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
>   at 
> org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:438)
>   at 
> org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:264)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:601)
>   at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>   at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>   at py4j.Gateway.invoke(Gateway.java:259)
>   at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> 

Re: Problem while loading saved data

2015-09-02 Thread Amila De Silva
Hi Guru,

Thanks for the reply.

Yes, I checked if the file exists. But instead of a single file, what I
found was a directory with the following structure.

people.parquet
└── _temporary
    └── 0
        ├── task_201509030057_4699_m_00
        │   └── part-r-0-b921ed54-53fa-459b-881c-cccde7f79320.gz.parquet
        ├── task_201509030057_4699_m_01
        │   └── part-r-1-b921ed54-53fa-459b-881c-cccde7f79320.gz.parquet
        └── _temporary
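
A _temporary subdirectory with no part files beside it means the write was
never committed on this node's filesystem. For comparison, the same two lines
run against a single machine (a local[*] master, an assumption rather than
this cluster's setup) produce a committed folder, with _SUCCESS and the
part-r-* files at the top level:

df = sqlContext.read.json("examples/src/main/resources/people.json")
df.write.save("people.parquet")  # one filesystem: the commit step moves files out of _temporary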

