Have a look at Spark Streaming. You can make use of ssc.fileStream.
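A minimal sketch of wiring this up (the application name, batch interval, input directory, and the .avro suffix filter are illustrative assumptions, not part of your setup):

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.NullWritable
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Illustrative setup: a streaming context with a 60-second batch interval.
val conf = new SparkConf().setAppName("AvroDirStream")
val ssc  = new StreamingContext(conf, Seconds(60))

// fileStream picks up files newly added to the directory; the filter
// function decides which candidate paths to include (here: *.avro only),
// and newFilesOnly = true skips files already present at startup.
val avroStream = ssc.fileStream[AvroKey[GenericRecord], NullWritable,
    AvroKeyInputFormat[GenericRecord]](
  "/data/incoming",
  (path: Path) => path.getName.endsWith(".avro"),
  newFilesOnly = true)

avroStream.map(_._1.datum().toString).print()

ssc.start()
ssc.awaitTermination()
```

Because fileStream tracks which files it has already seen within each batch window, every file dropped into the directory is processed once without having to move or rename it, which matches the constraint below.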
Eg:

val avroStream = ssc.fileStream[AvroKey[GenericRecord], NullWritable, AvroKeyInputFormat[GenericRecord]](input)

You can also specify a filter function as the second argument:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.StreamingContext

Thanks
Best Regards

On Wed, Aug 19, 2015 at 10:46 PM, Masf <masfwo...@gmail.com> wrote:

> Hi.
>
> I'd like to read Avro files using this library:
> https://github.com/databricks/spark-avro
>
> I need to load several files from a folder, but not all of them. Is there
> some functionality to filter the files to load?
>
> And is it possible to know the names of the files loaded from a folder?
>
> My problem is that I have a folder where an external process is inserting
> files every X minutes, and I need to process these files exactly once; I
> can't move, rename, or copy the source files.
>
> Thanks
>
> --
> Regards
> Miguel Ángel