Is it worthwhile to write a conversion script from XML to something easier to work with, like a database or a JSON object?
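To make the question concrete, here is a minimal sketch of what such a conversion script might look like in Python, using only the standard library. The sample record and its element names (clinical_study, nct_id, brief_title, condition) are hypothetical placeholders and not taken from the real files; the "@" and "#text" key conventions are also just an assumption for illustration.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an ElementTree element into dicts, lists, and strings."""
    children = list(elem)
    text = (elem.text or "").strip()
    # A leaf element without attributes becomes a plain string.
    if not children and not elem.attrib:
        return text
    node = {}
    # Attributes go under "@name" keys so they cannot clash with child tags.
    for name, value in elem.attrib.items():
        node["@" + name] = value
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated child tags are collected into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    # Text next to attributes or children is kept under "#text".
    if text:
        node["#text"] = text
    return node


def xml_to_json(xml_string):
    """Parse an XML string and return an indented JSON string."""
    root = ET.fromstring(xml_string)
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)


# Hypothetical sample record; element names are illustrative only.
SAMPLE = """
<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <brief_title>Example trial</brief_title>
  <condition>Asthma</condition>
  <condition>Allergy</condition>
</clinical_study>
"""

if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

A generic conversion like this keeps the structure but drops any knowledge of what the fields mean, so a format-specific reader may still be needed further down the pipeline.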
Thanks,
~Ben

On Fri, Feb 22, 2019, 7:17 AM Bonnie MacKellar <bkmackel...@gmail.com> wrote:

> Thanks so much!
>
> Bonnie MacKellar
>
> On Fri, Feb 22, 2019 at 7:03 AM Erik Fäßler <erik.faess...@uni-jena.de> wrote:
>
> > Hey,
> >
> > just wanted to say that I didn’t get around to making the component
> > available yet, will do first thing next week!
> >
> > Best,
> >
> > Erik
> >
> > > On 20. Feb 2019, at 19:47, Bonnie MacKellar <bkmackel...@gmail.com> wrote:
> > >
> > > Hi,
> > > Yes, we are using that format. I have a parser that I wrote, but it
> > > isn't integrated into UIMA. It runs separately and loads the full
> > > clinical trial data into a triplestore (Stardog). I would be
> > > interested in your system since I am not really familiar with how
> > > to write file readers in the UIMA framework. Perhaps I can merge my
> > > parser into it and end up with just the right thing. If you can
> > > make it available, I would definitely be interested. I will take a
> > > look at the other links as well. Thanks!!
> > >
> > > Bonnie MacKellar
> > >
> > > On Wed, Feb 20, 2019 at 3:54 AM Erik Fäßler <erik.faess...@uni-jena.de> wrote:
> > >
> > >> Dear Bonnie,
> > >>
> > >> are you talking about the clinical trial XML format used by
> > >> ClinicalTrials.gov <http://clinicaltrials.org/> by any chance?
> > >> If so, I did create a UIMA reader for these data. It's not perfect
> > >> but perhaps enough for your purposes, and you might want to
> > >> enhance it. Please let me know if you would be interested in that;
> > >> I did not get around to making it publicly available yet but could
> > >> do so quickly.
> > >>
> > >> To answer the general question to the best of my knowledge:
> > >> There is no such thing as a general XML reader built into the UIMA
> > >> framework. For all non-trivial formats, a specific reader is
> > >> necessary. This also holds true with regard to the employed type
> > >> system.
> > >> That being said, there are UIMA readers that try to serve as a
> > >> general XML reading facility, e.g. the “XML Reader” from our lab
> > >> (JULIELab,
> > >> https://github.com/JULIELab/jcore-base/tree/master/jcore-xml-reader).
> > >> However, in my experience XML inputs come in a lot of different
> > >> forms which might often not be suitable to a generic approach,
> > >> which is why I wrote quite a few UIMA readers for specific XML
> > >> formats in the past.
> > >>
> > >> Hope that helps,
> > >>
> > >> Erik
> > >>
> > >>> On 20. Feb 2019, at 01:13, Bonnie MacKellar <bkmackel...@gmail.com> wrote:
> > >>>
> > >>> This is probably a very naive question, but I can't seem to find
> > >>> anything about this. I currently have a lot of XML files
> > >>> (clinical trial descriptions). My current workflow is to run a
> > >>> preprocessor that parses the XML and generates text files in a
> > >>> simple format. I then run these files in a UIMA pipeline, using
> > >>> FileCollectionReader to load the text files, RUTA to parse the
> > >>> simple format, the Metamap annotator to do some UMLS annotations,
> > >>> and finally I have a writer that generates RDF triples from the
> > >>> UIMA annotations and loads the triples into a database. This has
> > >>> worked but is clunky, especially the preprocessing. I feel like
> > >>> there has to be a better way. Is there any support for reading
> > >>> XML files or do I need to write my own CollectionReader? Are
> > >>> there any other tools within UIMA for handling XML text?
> > >>>
> > >>> thanks,
> > >>> Bonnie MacKellar