Unfortunately, I don't have a bunch of moderately big XML files; I have one really big file, big enough that reading it into memory as a single string is not feasible.
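To make the boundary problem concrete, here is a minimal single-machine sketch in Python of the kind of record splitting in question. It reads the file in fixed-size chunks and emits one complete element at a time, carrying any partial element over to the next chunk. This is not a Spark API: the function name iter_elements, the <node>...</node> element shape, and the chunk size are assumptions made for illustration, and self-closing elements such as <node id="n0"/> are not handled. In Spark, something similar might be achievable with a Hadoop input format that splits on a custom record delimiter, but that is an assumption and is not shown here.

```python
import io


def iter_elements(stream, start_tag="<node", end_tag="</node>", chunk_size=65536):
    """Yield one start_tag...end_tag element at a time from a text stream.

    Hypothetical helper: only one chunk plus one pending element is held
    in memory, so the file is never loaded as a single string.
    """
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete element currently sitting in the buffer.
        while True:
            end = buffer.find(end_tag)
            if end == -1:
                break  # element continues in the next chunk
            end += len(end_tag)
            record, buffer = buffer[:end], buffer[end:]
            # Drop anything before the element (e.g. the file preamble).
            start = record.find(start_tag)
            if start != -1:
                yield record[start:]
    # Trailing text without an end_tag (closing wrappers) is discarded.


# Tiny in-memory stand-in for a large GraphML file; the small chunk size
# forces elements to straddle chunk boundaries.
sample = (
    '<graphml><graph>'
    '<node id="a"><data>1</data></node>'
    '<node id="b"><data>2</data></node>'
    '</graph></graphml>'
)
records = list(iter_elements(io.StringIO(sample), chunk_size=7))
```

Each yielded string is a small, self-contained XML fragment that can be handed to an ordinary XML parser on its own.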
On Tue, May 20, 2014 at 1:24 PM, Xiangrui Meng <men...@gmail.com> wrote:
> Try sc.wholeTextFiles(). It reads the entire file into a string
> record. -Xiangrui
>
> On Tue, May 20, 2014 at 8:25 AM, Nathan Kronenfeld
> <nkronenf...@oculusinfo.com> wrote:
> > We are trying to read some large GraphML files to use in spark.
> >
> > Is there an easy way to read XML-based files like this that accounts for
> > partition boundaries and the like?
> >
> > Thanks,
> > Nathan
> >
> >
> > --
> > Nathan Kronenfeld
> > Senior Visualization Developer
> > Oculus Info Inc
> > 2 Berkeley Street, Suite 600,
> > Toronto, Ontario M5A 4J5
> > Phone: +1-416-203-3003 x 238
> > Email: nkronenf...@oculusinfo.com
>

--
Nathan Kronenfeld
Senior Visualization Developer
Oculus Info Inc
2 Berkeley Street, Suite 600,
Toronto, Ontario M5A 4J5
Phone: +1-416-203-3003 x 238
Email: nkronenf...@oculusinfo.com