Re: [basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-27 Thread thufir
You write: "I would like to preprocess the xml before entering postgres, and stream it with the copy command." But why? I'm inferring that you want to dynamically generate XML as it's queried by Postgres? Just curious, Thufir On 2020-02-23 7:31 a.m., maxzor wrote: Hello, Thank you for

Re: [basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-27 Thread Majewski, Steven Dennis (sdm7g)
If you really want to read in all of the data as a single stream, I would suggest writing a preprocessor using a SAX library (from Python, Java, or whatever language you want to use) to break the Wikimedia stream into separate XML files for each page element, or else use the same language to do
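
A minimal sketch of the SAX approach suggested above, in Python, using only the standard library. It assumes the dump nests `title` and `text` elements inside each `page` element; the names `PageHandler` and `dump_to_csv` are hypothetical. Rather than writing one XML file per page, this variant emits one CSV row per page, which is closer to the original goal of feeding Postgres COPY. If a page carries several revisions, only the last `text` is kept.

```python
import csv
import io
import xml.sax


class PageHandler(xml.sax.ContentHandler):
    """Collects <title> and <text> for each <page>; writes one CSV row per page."""

    def __init__(self, writer):
        super().__init__()
        self.writer = writer
        self.fields = {}
        self.current = None  # name of the field currently being read, if any
        self.buffer = []

    def startElement(self, name, attrs):
        if name == "page":
            # Start a fresh record for every page element.
            self.fields = {"title": "", "text": ""}
        elif name in ("title", "text"):
            self.current = name
            self.buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self.current is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == self.current:
            self.fields[name] = "".join(self.buffer)
            self.current = None
        elif name == "page":
            # The page is complete: emit it and keep nothing else in memory.
            self.writer.writerow([self.fields["title"], self.fields["text"]])


def dump_to_csv(source, out):
    """Stream an XML dump (file name or binary file object) into CSV rows on `out`."""
    xml.sax.parse(source, PageHandler(csv.writer(out)))
```

Because the handler holds only one page at a time, memory use does not grow with the size of the dump. The resulting CSV could then be loaded with something like `COPY pages FROM STDIN WITH (FORMAT csv)`, where the table name `pages` is an assumption.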

Re: [basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-24 Thread Christian Grün
Hi Maxime, BaseX provides no streaming facilities for large XML instances. However, if you have enough disk space left, you can create a database instance from your XML dump. We have already done this for Wiki dumps up to 420 GB [1]. You should disable the text and attribute index; database

Re: [basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-23 Thread maxzor
Do you mean stream a single large XML file? A series of XML files, or stream a file through a series of XQuery|XSLT|XPath transforms? Possibly poor wording; I meant read a large XML file and produce, e.g., a CSV file. I don’t believe BaseX uses a streaming XML parser, so probably can’t handle

Re: [basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-23 Thread Majewski, Steven Dennis (sdm7g)
What do you mean by “stream xml transforms”? Do you mean stream a single large XML file? A series of XML files, or stream a file through a series of XQuery|XSLT|XPath transforms? Depends on what you mean by “stream”. I don’t believe BaseX uses a streaming XML parser, so probably can’t

[basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-23 Thread maxzor
Hello, Thank you for your software, whose GUI has been my savior every time I needed to deal with XML. I would like to know if I can stream XML transforms, to pipe Wikimedia XML dumps into a format accepted by Postgres COPY? I know SQL very well, but nothing about XPath or XQuery I