Hadley,

Thank you. I am able to get the xml_ns_strip() function to work with my file 
directly so I will likely be able to reach my immediate goal.

However, I still have had no success understanding the namespace problem. 
I am not able to call read_xml() on the object I generated for the 
reproducible example, which is simply a character vector of length 4 holding 
the contents of the XML file as produced by readLines(). I then used dput() to 
define the structure. The resulting structure apparently is not to the liking 
of read_xml(). I have reproduced the necessary code here for your convenience. 
The error is below.

##
library(xml2)
library(stringr)
with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")
## Without the str_c() collapse, it also complains of a vector of length > 1.
read_xml(str_c(with_ns_xml, collapse = TRUE))

## produces the following error message.
Error in doc_parse_raw(x, encoding = encoding, base_url = base_url, as_html = 
as_html,  :
  Start tag expected, '<' not found [4]
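
One possible explanation for the error above, offered as an assumption rather 
than a confirmed diagnosis: the collapse argument of str_c() is documented as a 
separator string, so collapse = TRUE may be coerced to the literal text "TRUE" 
(older stringr) or rejected (newer stringr). Either way the parser would not 
receive clean XML. A minimal sketch that joins the lines with an explicit 
separator instead, using base paste() and the same example data (the variable 
name xml_string is only illustrative):

```r
library(xml2)

with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")

# Join the four lines into a single string; collapse takes a separator string.
xml_string <- paste(with_ns_xml, collapse = "\n")

# read_xml() accepts a single string containing literal XML.
doc <- read_xml(xml_string)

# Name of the root element of the parsed document.
root_name <- xml_name(doc)
print(root_name)
```

str_c(with_ns_xml, collapse = "\n") should behave the same way; paste() is used 
here only to avoid depending on a particular stringr version.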

I have a similar issue with xml2::xml_find_all():
xml_find_all(str_c(with_ns_xml, collapse = TRUE), "/WorkSet//Description")

## Produces the following error message.
Error in UseMethod("xml_find_all") :
  no applicable method for 'xml_find_all' applied to an object of class 
"character"
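
For reference, a sketch of how this call might be made to work, under two 
assumptions: that xml_find_all() needs a parsed document from read_xml() rather 
than a character string, and that xml2 labels the default namespace with the 
prefix d1 in the result of xml_ns(). The second document (doc_stripped) is 
parsed separately because xml_ns_strip() modifies its argument in place.

```r
library(xml2)

with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")
xml_string <- paste(with_ns_xml, collapse = "\n")

# Option 1: keep the namespace and use its prefix in the XPath expression.
doc <- read_xml(xml_string)
ns <- xml_ns(doc)   # assumed to map the default namespace to the prefix d1
hits_prefixed <- xml_find_all(doc, "/d1:WorkSet//d1:Description", ns)

# Option 2: remove the default namespace, then use the plain XPath.
doc_stripped <- read_xml(xml_string)
xml_ns_strip(doc_stripped)   # modifies doc_stripped in place
hits_stripped <- xml_find_all(doc_stripped, "/WorkSet//Description")

print(xml_text(hits_prefixed))
print(xml_text(hits_stripped))
```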



R. Mark Sharp, Ph.D.
msh...@txbiomed.org





> On Jan 31, 2017, at 4:27 PM, Hadley Wickham <h.wick...@gmail.com> wrote:
>
> See the last example in ?xml2::xml_find_all or use xml2::xml_ns_strip()
>
> Hadley
>
> On Tue, Jan 31, 2017 at 9:43 AM, Mark Sharp <msh...@txbiomed.org> wrote:
>> I am trying to read a series of XML files that use a namespace and I have 
>> failed, thus far, to discover the proper syntax. I have a reproducible 
>> example below. I have two XML character strings defined: one without a 
>> namespace and one with. I show that I can successfully extract the node 
>> using the XML string without the namespace and fail when using the XML 
>> string with the namespace.
>>
>> Mark
>> PS I am having the same problem with the xml2 package and am hoping that 
>> understanding one will help with the other.
>>
>> ##
>> library(XML)
>> ## The first XML text (no_ns_xml) does not have a namespace defined
>> no_ns_xml <- c("<?xml version=\"1.0\" ?>", "<WorkSet>",
>>               "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>>               "</WorkSet>")
>> l_no_ns_xml <-xmlTreeParse(no_ns_xml, asText = TRUE, getDTD = FALSE,
>>                           useInternalNodes = TRUE)
>> ## The node is found
>> getNodeSet(l_no_ns_xml, "/WorkSet//Description")
>>
>> ## The second XML text (with_ns_xml) has a namespace defined
>> with_ns_xml <- c("<?xml version=\"1.0\" ?>",
>>                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
>>                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>>                 "</WorkSet>")
>>
>> l_with_ns_xml <-xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
>>                               useInternalNodes = TRUE)
>> ## The node is not found
>> getNodeSet(l_with_ns_xml, "/WorkSet//Description")
>> ## I attempt to provide the namespace, but fail.
>> ns <- "http://labkey.org/etl/xml"
>> names(ns)[1] <- "xmlns"
>> getNodeSet(l_with_ns_xml, "/WorkSet//Description", namespaces = ns)
>>
>> R. Mark Sharp, Ph.D.
>> Director of Data Science Core
>> Southwest National Primate Research Center
>> Texas Biomedical Research Institute
>> P.O. Box 760549
>> San Antonio, TX 78245-0549
>> Telephone: (210)258-9476
>> e-mail: msh...@txbiomed.org
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> http://hadley.nz
