Hi,
   Let's say I have millions of files in a binary format, and a Java (or
Python) library that reads and parses these binary-formatted files.

Say:
import foo
f = foo.open(filename)
header = f.get_header()
...plus some other methods.

What I was thinking was to write a Hadoop InputFormat to read and parse
these files. But since I am a newbie in Spark, I was wondering if I can
instead open these files directly and use the existing library to process
them. I don't necessarily have a requirement to keep the files in HDFS;
even NFS will work.
Is there a way I can use Spark without having to write the parser again?
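What I'm imagining is roughly the pattern below. This is only a minimal sketch: `FooFile`, `foo_open`, and the 8-byte header layout are hypothetical stand-ins for the real library, and the Spark call itself is only shown in a comment (`sc.parallelize(paths).map(parse_header)`), with a plain local `map` used so the sketch runs end to end:

```python
import os
import struct
import tempfile

class FooFile:
    """Stand-in for the existing binary-format reader (hypothetical)."""
    def __init__(self, path):
        self.path = path

    def get_header(self):
        # Hypothetical layout: 4-byte magic + little-endian uint32 version.
        with open(self.path, "rb") as fh:
            magic, version = struct.unpack("<4sI", fh.read(8))
        return {"magic": magic, "version": version}

def foo_open(path):
    # Mirrors the library's foo.open(filename) entry point.
    return FooFile(path)

def parse_header(path):
    # This is the function that would be shipped to the executors, e.g.:
    #   sc.parallelize(paths).map(parse_header).collect()
    # Each worker only needs the library importable and the files reachable
    # (an NFS mount visible on every node, or HDFS paths).
    return foo_open(path).get_header()

# Create two tiny sample files so the sketch runs end to end.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(2):
    p = os.path.join(tmpdir, f"sample{i}.bin")
    with open(p, "wb") as fh:
        fh.write(struct.pack("<4sI", b"FOO!", i))
    paths.append(p)

# Locally a plain map; on a cluster this is the .map() on the RDD above.
headers = list(map(parse_header, paths))
print(headers)
```

In other words: distribute the list of file *paths*, not the file contents, and let each task call the existing parser on its own paths.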
Thanks