Hi,

You should keep replies on the list - you never know when someone will swoop in with the right answer to make your life easier.

Below is a simple example that uses xpath syntax to identify (and in this case retrieve) children that match your xpath expression. xpath expressions are sort of like /a/directory/structure/description so you can visualize elements of XML like nested folders or subdirectories. Hopefully this will get you started. A lot more on xpath here: http://www.w3schools.com/xml/xml_xpath.asp

There are other extraction tools in xml2 - just type ?xml2 at the command prompt to see more. Since you have more deeply nested elements you'll need to play with this a bit first.

library(xml2)

# read the example XML document
uri = 'http://www.w3schools.com/xml/simple.xml'
x = read_xml(uri)

# "//name" matches every <name> node anywhere in the document
name_nodes = xml_find_all(x, "//name")
name = xml_text(name_nodes)
price_nodes = xml_find_all(x, "//price")
price = xml_text(price_nodes)
# xml_double() converts the node contents to numeric
calories_nodes = xml_find_all(x, "//calories")
calories = xml_double(calories_nodes)

# one column per element type, then write to CSV
X = data.frame(name, price, calories, stringsAsFactors = FALSE)
write.csv(X, file = 'foo.csv')

Cheers,
Ben

> On Jan 4, 2017, at 2:13 PM, Andrew Lachance <alach...@bates.edu> wrote:
>
> Hello Ben,
>
> Thank you for the advice. I am extremely new to any sort of coding so I
> have learned a lot already. Essentially, I was given an XML file and was
> told to convert all of it to a csv so that it could be uploaded into a
> database. Unfortunately the information I am working with is medical
> information and I can't really share it. I initially tried to convert it
> using online programs, however that ended up with a large number of blank
> spaces that weren't useful for uploading into the database.
>
> So essentially, my goal is to parse all the data in the XML to a coherent,
> succinct CSV that could be uploaded. In the document, there are 361 patient
> files with 13 subcategories for each patient which further branch off to
> around 150 categories total.
> Since I am so new, I have been having a hard time seeing the bigger
> picture or knowing if there are any intermediary steps that will prevent
> all the blank spaces that the online conversion programs created.
>
> I will look through the information on the xml2 package. Any advice or
> recommendations would be greatly appreciated as I have felt fairly stuck.
> Once again, thank you very much for your help.
>
> Best,
> Andrew
>
> On Tue, Jan 3, 2017 at 2:29 PM, Ben Tupper <btup...@bigelow.org> wrote:
> Hi,
>
> It's hard to know what to advise - much depends upon the XML data you have
> and what you want to extract from it. Without knowing about those two
> things there is little anyone could do to help. Can you post some example
> data to the internet and provide the link here? Then state explicitly what
> you want to have in hand at the end.
>
> If you are just starting out I suggest that you try the xml2 package
> ( https://cran.r-project.org/web/packages/xml2/ ) rather than the XML
> package ( https://cran.r-project.org/web/packages/XML/ ). I have been using
> it much more since the authors added the ability to create xml nodes
> (rather than just extracting data from existing xml nodes).
>
> Cheers,
> Ben
>
> P.S. Hello to my niece Olivia S on the Bates EMS team.
>
> > On Jan 3, 2017, at 11:27 AM, Andrew Lachance <alach...@bates.edu> wrote:
> >
> > <http://stats.stackexchange.com/questions/254328/how-to-convert-a-large-xml-file-to-a-csv-file-using-r?noredirect=1#>
> >
> > I am completely new to R and have tried to use several functions within
> > the xml packages to convert an XML to a csv and have had little success.
> > Since I am so new, I am not sure what the necessary steps are to complete
> > this conversion without a lot of NA.
> >
> > --
> > Andrew D. Lachance
> > Chief of Service, Bates Emergency Medical Service
> > Residence Coordinator, Hopkins House
> > Bates College Class of 2017
> > alach...@bates.edu
> > (207) 620-4854
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 60 Bigelow Drive, P.O. Box 380
> East Boothbay, Maine 04544
> http://www.bigelow.org
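
P.S. Since the goal is one CSV row per patient without stray blanks, here is a rough sketch of one way to extend the xpath example at the top of this message. It is only an illustration under assumptions: the inline XML is made up (loosely modeled on the w3schools menu file) and stands in for the real data, and get_text() is a hypothetical helper defined here, not an xml2 function. The idea is to find each record node first and then look up each field inside it with xml_find_first(), which gives one result per record and a missing node (read back as NA) where a field is absent, so the columns always line up.

```r
library(xml2)

# Made-up stand-in for the real file: the second record has no <calories>
doc <- read_xml("<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <calories>650</calories>
  </food>
  <food>
    <name>French Toast</name>
    <price>$4.50</price>
  </food>
</breakfast_menu>")

# One node per record; each record will become one row of the CSV
records <- xml_find_all(doc, "//food")

# Hypothetical helper: text of the first child matching `xpath` in each record.
# xml_find_first() returns one entry per input node, so a record that lacks
# the child yields NA instead of shortening the vector.
get_text <- function(nodes, xpath) {
  xml_text(xml_find_first(nodes, xpath))
}

X <- data.frame(
  name     = get_text(records, "./name"),
  price    = get_text(records, "./price"),
  calories = as.numeric(get_text(records, "./calories")),
  stringsAsFactors = FALSE
)

# row.names = FALSE keeps R's row numbers out of the file
write.csv(X, file = "foo.csv", row.names = FALSE)
```

With deeper nesting the same pattern should still apply: pick the node that represents one patient, then use relative paths such as "./visit/date" (a hypothetical path) for each column you need.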