Hi,

Is there any way to discard files whose names start with a dot (.) or end with .tmp in a Hive partition when reading a Hive table with the spark.read.table method?
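To make the rule concrete, here is a minimal plain-Python sketch of the file-name check I would like applied. The function name `is_data_file` is hypothetical and this does not hook into Spark or Hive by itself; it only illustrates which files should be kept (it also treats names starting with an underscore as hidden). The sample paths are the ones from my debug logs.

```python
import posixpath


def is_data_file(path: str) -> bool:
    """Return True if the file should be read, False if it should be skipped.

    A file is skipped when its name starts with "." or "_" (hidden files)
    or ends with ".tmp" (temporary files still being written).
    """
    name = posixpath.basename(path.rstrip("/"))
    return not (name.startswith((".", "_")) or name.endswith(".tmp"))


paths = [
    "maprfs:///a/hour=05/host=abc/FlumeData.1587559137715",
    "maprfs:///a/hour=05/host=abc/FlumeData.1587556815621",
    "maprfs:///a/hour=05/host=abc/.FlumeData.1587560277337.tmp",
]

# Only the two completed FlumeData files should survive the check.
kept = [p for p in paths if is_data_file(p)]
```

With this check, the in-progress `.FlumeData.1587560277337.tmp` file would be dropped and the two completed files would be kept.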
I tried using a PathFilter, but it had no effect. I am using spark-submit and passing a Python file (PySpark) containing the source code. I set the filter class on the Hadoop configuration:

    spark.sparkContext._jsc.hadoopConfiguration().set("mapreduce.input.pathFilter.class", "com.abc.hadoop.utility.TmpFileFilter")

The filter itself is written in Scala:

    import org.apache.hadoop.fs.{Path, PathFilter}

    // Accept every path except those whose file name ends with ".tmp".
    class TmpFileFilter extends PathFilter {
      override def accept(path: Path): Boolean = !path.getName.endsWith(".tmp")
    }

Still, in the detailed (DEBUG) logs I can see that the .tmp files are being considered:

    20/04/22 12:58:44 DEBUG MapRFileSystem: getMapRFileStatus maprfs:///a/hour=05/host=abc/FlumeData.1587559137715
    20/04/22 12:58:44 DEBUG MapRFileSystem: getMapRFileStatus maprfs:///a/hour=05/host=abc/FlumeData.1587556815621
    20/04/22 12:58:44 DEBUG MapRFileSystem: getMapRFileStatus maprfs:///a/hour=05/host=abc/.FlumeData.1587560277337.tmp

Is there any way to discard temporary files (.tmp) or hidden files (file names starting with a dot or an underscore) in Hive partitions while reading from Spark?

Regards,
Dhrubajyoti Hati
Mob No: 9886428028/9652029028