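Try sc.wholeTextFiles(). It reads each file in full as a single record, a (path, content) pair, so an XML document is never split across partition boundaries. Each file does have to fit in memory as one string, though. -Xiangrui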
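
A minimal PySpark sketch of the sc.wholeTextFiles() approach, in case it helps. It assumes each GraphML file is small enough to parse in memory on a single worker and that you only need the edge list. The function names (edges_from_graphml, load_edges) and the path argument are made up for illustration. The parser is plain Python, so you can try it without a cluster.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    # ElementTree reports namespaced tags as "{namespace}name"; keep the name only,
    # so files with and without the GraphML namespace are handled the same way.
    return tag.rsplit("}", 1)[-1]


def edges_from_graphml(path_and_content):
    """Turn one (path, content) record into a list of (source, target) pairs."""
    _path, content = path_and_content
    root = ET.fromstring(content)
    return [
        (elem.get("source"), elem.get("target"))
        for elem in root.iter()
        if local_name(elem.tag) == "edge"
    ]


def load_edges(sc, path):
    # Hypothetical helper: sc is an existing SparkContext, path a directory or glob.
    # wholeTextFiles yields one (file path, file content) pair per file,
    # so each XML document reaches the parser in one piece.
    return sc.wholeTextFiles(path).flatMap(edges_from_graphml)
```

With a live SparkContext, load_edges(sc, "hdfs:///some/dir/*.graphml") (placeholder path) would give an RDD of (source, target) pairs. If a single file is too large for one worker, this will not be enough and you would need a splittable input format instead.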

On Tue, May 20, 2014 at 8:25 AM, Nathan Kronenfeld <nkronenf...@oculusinfo.com> wrote:
> We are trying to read some large GraphML files to use in spark.
>
> Is there an easy way to read XML-based files like this that accounts for
> partition boundaries and the like?
>
> Thanks,
> Nathan
>
>
> --
> Nathan Kronenfeld
> Senior Visualization Developer
> Oculus Info Inc
> 2 Berkeley Street, Suite 600,
> Toronto, Ontario M5A 4J5
> Phone: +1-416-203-3003 x 238
> Email: nkronenf...@oculusinfo.com