What about /blah/*/blah/out*.avro?
On 27 May 2015 18:08, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I am doing that now.
Is there no other way?
On Wed, May 27, 2015 at 12:40 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
How about creating two RDDs and unioning them [ sc.union(first, second) ]?
Try something like this:

def readGenericRecords(sc: SparkContext, inputDir: String, startDate: Date, endDate: Date) = {
  // assuming getInputPaths returns one path per day in the range
  val paths: Seq[String] = getInputPaths(inputDir, startDate, endDate)
  val job = Job.getInstance(new Configuration())
  paths.foreach(p => FileInputFormat.addInputPath(job, new Path(p)))
  sc.newAPIHadoopRDD(job.getConfiguration,
    classOf[AvroKeyInputFormat[GenericRecord]],
    classOf[AvroKey[GenericRecord]],
    classOf[NullWritable])
}
You can do that using FileInputFormat.addInputPath
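An alternative to the Job-based configuration: sc.newAPIHadoopFile also accepts a comma-separated list of paths in its path argument. A minimal sketch of building that list (the inputPaths helper, the yyyy/MM/dd layout, and the out-r-*.avro file name are assumptions taken from the glob in the original snippet):

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// hypothetical helper: one glob per day in [start, end], inclusive
def inputPaths(inputDir: String, start: LocalDate, end: LocalDate): Seq[String] =
  Iterator.iterate(start)(_.plusDays(1))
    .takeWhile(!_.isAfter(end))
    .map(d => s"$inputDir/${d.format(DateTimeFormatter.ofPattern("yyyy/MM/dd"))}/out-r-*.avro")
    .toSeq

// newAPIHadoopFile accepts a comma-separated path list, so a single
// call can read every day at once instead of union-ing per-day RDDs
val combined =
  inputPaths("/A/B/C/D/D", LocalDate.of(2015, 5, 21), LocalDate.of(2015, 5, 22))
    .mkString(",")
// combined == "/A/B/C/D/D/2015/05/21/out-r-*.avro,/A/B/C/D/D/2015/05/22/out-r-*.avro"
```

The resulting string would then be passed where the single glob path goes today, e.g. sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable, AvroKeyInputFormat[GenericRecord]](combined).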
2015-05-27 10:41 GMT+02:00 ayan guha guha.a...@gmail.com:
What about /blah/*/blah/out*.avro?
Thanks
Best Regards
On Wed, May 27, 2015 at 11:51 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I have this piece
sc.newAPIHadoopFile[AvroKey[GenericRecord], NullWritable,
AvroKeyInputFormat[GenericRecord]]("/A/B/C/D/D/2015/05/22/out-r-*.avro")