vinothchandar commented on a change in pull request #2475: URL: https://github.com/apache/hudi/pull/2475#discussion_r577755265
########## File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala ########## @@ -74,6 +78,19 @@ class DefaultSource extends RelationProvider val tablePath = DataSourceUtils.getTablePath(fs, globPaths.toArray) log.info("Obtained hudi table path: " + tablePath) + val sparkEngineContext = new HoodieSparkEngineContext(sqlContext.sparkContext) + val fsBackedTableMetadata = + new FileSystemBackedTableMetadata(sparkEngineContext, new SerializableConfiguration(fs.getConf), tablePath, false) + val partitionPaths = fsBackedTableMetadata.getAllPartitionPaths + val onePartitionPath = if(!partitionPaths.isEmpty && !StringUtils.isEmpty(partitionPaths.get(0))) { + tablePath + "/" + partitionPaths.get(0) + } else { + tablePath + } + val dataPath = DataSourceUtils.getDataPath(tablePath, onePartitionPath) + log.info("Obtained hudi data path: " + dataPath) + parameters += "path" -> dataPath Review comment: @teeyog Sorry it's not still clear to me. I supplied a globbed path `2015/*/*/*` and even that overrides `path -> tablePath/*/*/*/*` Won't this incur reading all partitions in the tablePath as opposed only 2015's? ![image](https://user-images.githubusercontent.com/1179324/108234541-bde4fb00-70f9-11eb-8611-58579636b51b.png) ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org