Hi,

I am seeing strange behavior. When I create a SchemaRDD from a directory that
contains a nested directory, it sometimes works and sometimes does not. My
question is whether nested directories are supported or not.

My code is as below.

val fileLocation = "hdfs://localhost:9000/apps/hive/warehouse/hl7"
val parquetRDD = sqlContext.parquetFile(fileLocation)

My HDFS directories are as below.

/apps/hive/warehouse/hl7/_SUCCESS
-rw-r--r--   1 hdfs supergroup       5809 2015-02-05 10:44 /apps/hive/warehouse/hl7/_common_metadata
-rw-r--r--   1 hdfs supergroup      15127 2015-02-05 10:44 /apps/hive/warehouse/hl7/_metadata
-rw-r--r--   1 hdfs supergroup     174044 2015-02-03 10:51 /apps/hive/warehouse/hl7/part-r-1.parquet
-rw-r--r--   1 hdfs supergroup     190220 2015-02-03 10:51 /apps/hive/warehouse/hl7/part-r-2.parquet
drwxr-xr-x   - hdfs supergroup          0 2015-02-05 15:35 /apps/hive/warehouse/hl7/111


I get the following error:

Exception in thread "main" java.io.FileNotFoundException: Path is not a file: /apps/hive/warehouse/hl7/111

111 is a directory containing more Parquet files.

After renaming the 111 directory to people4a, it works without issue and I am
able to fetch data from the nested directory. I tried different directory
names, but it failed for all of them except people4a.
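
For reference, a minimal sketch of how the Parquet data files under the path
could be listed recursively, to check which entries are files and which are
directories. It is an illustration only, not a fix for the exception: the
object and helper names (ListParquetFiles, isParquetDataFile,
parquetFilesUnder) are hypothetical, and it assumes the Hadoop FileSystem API
(listFiles with the recursive flag) is on the classpath.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper object, for illustration only.
object ListParquetFiles {
  // Keep only data files; skip markers such as _SUCCESS and _metadata.
  def isParquetDataFile(name: String): Boolean =
    name.endsWith(".parquet") && !name.startsWith("_") && !name.startsWith(".")

  // Recursively collect the Parquet data files under `root`.
  def parquetFilesUnder(fs: FileSystem, root: Path): Seq[String] = {
    val found = ArrayBuffer.empty[String]
    val it = fs.listFiles(root, true) // true = descend into subdirectories
    while (it.hasNext) {
      val status = it.next()
      if (isParquetDataFile(status.getPath.getName)) {
        found += status.getPath.toString
      }
    }
    found.toList
  }

  def main(args: Array[String]): Unit = {
    // Same location as in the code above.
    val root = new Path("hdfs://localhost:9000/apps/hive/warehouse/hl7")
    val fs = root.getFileSystem(new Configuration())
    parquetFilesUnder(fs, root).foreach(println)
  }
}
```

The output would show whether the files inside 111 are visible under the same
root as part-r-1.parquet and part-r-2.parquet.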

Am I missing anything?

Regards,
Nishant

