When you refer to "filesystem," do you mean HDFS? It's very common to store lots of text files in HDFS and run multiple jobs to process / learn about those text files. As for XML support, you can use Java libraries (or Python libraries if you're using Hadoop streaming) to parse the XML inside your map and reduce tasks; Hadoop itself has little built-in XML support. I hope this answers your question.
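In case it helps, here is a rough sketch of what a Hadoop streaming mapper in Python might look like. It is only an illustration under some assumptions of mine: each input line holds one complete, well-formed XML document, and the job simply counts element tags. The function name map_xml_records is made up for this example; the parsing uses the standard library's xml.etree.ElementTree.

```python
#!/usr/bin/env python
"""Sketch of a Hadoop streaming mapper that counts XML element tags.

Assumption (not from the original thread): every input line is one
complete XML document. Multi-line XML files would need to be
pre-processed or read with a different input format.
"""
import sys
import xml.etree.ElementTree as ET


def map_xml_records(lines):
    """Yield 'tag<TAB>1' for every element found in each one-line XML record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            root = ET.fromstring(line)
        except ET.ParseError:
            # Skip malformed records instead of failing the whole task.
            continue
        # iter() walks the root element and all of its descendants.
        for element in root.iter():
            yield "%s\t1" % element.tag


if __name__ == "__main__":
    # Hadoop streaming feeds records on stdin and reads key/value pairs
    # from stdout, separated by a tab.
    for record in map_xml_records(sys.stdin):
        print(record)
```

You would pass a script like this to the streaming jar with the -mapper option and pair it with a reducer that sums the counts per tag. If your XML documents span several lines, the one-document-per-line assumption does not hold and you would need to handle record boundaries some other way.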
Alex

On Fri, Jun 12, 2009 at 1:31 PM, Alexandre Jaquet <alexjaq...@gmail.com> wrote:

> Hi,
>
> Does hadoop and map / reduce will allow me to parse large quantity of open
> xml files distributed inside the same filesystem but using multipe jobs ?
>
> Thx
>
> Alexandre Jaquet