I hacked the Confluence exporter a bit to grab the page content in different formats so we can see whether one of them is easier to migrate. I just pushed 3 new branches to this repo:

body.only - this is basically the same HTML, but with all the “wrapper” stuff removed. Just the HTML of the body content, so no navigation, header/footer, etc.

body.storage - this is the raw storage format of the data from Confluence. Things like code snippets are in storage format (<structured-macro name="code" …>), etc.

body.view - Confluence has a “body.view” mode that sits somewhere between the “storage” format and the exported HTML. The structured macros are expanded a bit (<script class="brush: java"…), but they are not really in final HTML form either.

Anyway, if interested in converting content, one of the above might be a better starting point.

Dan

> On Dec 13, 2017, at 4:49 PM, Bruce Snyder <[email protected]> wrote:
>
> The HTML is available in the repo now. This HTML is just what I grabbed
> from the public directory. Even if it's not comprehensive, it's good enough
> for hacking around to figure out what we'd like to do.
>
> In the meantime, I'm still working with ASF Infra to figure out why the
> Confluence export is failing.
>
> Bruce
>
> On Wed, Dec 13, 2017 at 12:51 PM, Bruce Snyder <[email protected]>
> wrote:
>
>> Yep, that is correct, Dan:
>>
>> https://git-wip-us.apache.org/repos/asf/activemq-web.git
>>
>> I pushed my changes to the repo. Now they just need to propagate to the
>> Github web UI.
>>
>> Bruce
>>
>> On Wed, Dec 13, 2017 at 11:28 AM, Daniel Kulp <[email protected]> wrote:
>>
>>>
>>> Isn’t the push address supposed to be:
>>>
>>> https://git-wip-us.apache.org/repos/asf/activemq-web.git
>>>
>>>
>>> Dan
>>>
>>>
>>>
>>>> On Dec 13, 2017, at 1:15 PM, Bruce Snyder <[email protected]> wrote:
>>>>
>>>> Thank you for the suggestion, but it looks like I do not have permissions
>>>> either. I will contact ASF Infra for assistance.
>>>>
>>>> Bruce
>>>>
>>>> On Wed, Dec 13, 2017 at 2:41 AM, Martyn Taylor <[email protected]> wrote:
>>>>
>>>>> On Wed, Dec 13, 2017 at 4:00 AM, Bruce Snyder <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I had the following empty git repo created to hold the HTML from the
>>>>>> current website:
>>>>>>
>>>>>> https://github.com/apache/activemq-web
>>>>>>
>>>>>> However, I have a conundrum -- content cannot be pushed directly to a
>>>>>> Github ASF repo. Content can only be added via pull request but Github
>>>>>> does not allow a pull request on an empty repo.
>>>>>>
>>>>> Bruce, have you tried pushing directly to the ASF repo. i.e.
>>>>> git://git.apache.org/activemq-web.git
>>>>>
>>>>> This is the workflow we currently use, we push directly to the ASF repo.
>>>>> PRs are really only used for review and discussion. I tried to push
>>>>> directly this morning but looks like I don't have write permissions.
>>>>>
>>>>>>
>>>>>> Any ideas on how to get the HTML into the repo? I guess I could ask ASF
>>>>>> Infra to push it.
>>>>>>
>>>>>> Bruce
>>>>>>
>>>>>> On Tue, Dec 12, 2017 at 8:17 PM, Bruce Snyder <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm going to address all the questions to me in this single reply.
>>>>>>>
>>>>>>> My original suggestion was that we export the HTML from Confluence,
>>>>>>> convert to Markdown and put the Markdown and the images in a git repo.
>>>>>>> Markdown is much easier to edit than raw HTML, especially the HTML exported
>>>>>>> from Confluence (blech!). The idea was that we could use Jekyll + SAAS to
>>>>>>> craft a new website. In fact, Michael Andre Pearce produced a mockup of
>>>>>>> this using the Apache Metro website as an example (because it already makes
>>>>>>> use of Jekyll + SAAS). It was enough to convince me that we should take
>>>>>>> this path, so I started looking into doing a full, new export of Confluence
>>>>>>> pages to HTML.
>>>>>>> If you have not seen Michael's mockup, you should really
>>>>>>> take a look.
>>>>>>>
>>>>>>> So, I manually grabbed the raw HTML that is automagically exported from
>>>>>>> Confluence and is hosting the current site that we see at
>>>>>>> http://activemq.apache.org. I did some testing on it using text2html and
>>>>>>> the conversion it does is pretty awful and would require a lot of hand work
>>>>>>> to fix it. So, we discussed the point that there are 1600+ pages of HTML to
>>>>>>> manually edit. But I later realized that it was only about 950 HTML pages
>>>>>>> (from what I can tell so far).
>>>>>>>
>>>>>>> Then, Dan Kulp found a Confluence HTML to raw HTML converter built on top
>>>>>>> of PanDoc. So, I have also been trying to export the HTML from Confluence
>>>>>>> in order to try out the PanDoc converter (it works based on the Confluence
>>>>>>> export function which is different from how the HTML is automagically
>>>>>>> converted). Unfortunately, I am running into a NullPointerException from
>>>>>>> Confluence. ASF Infra is telling me that the NPE is due to the CDATA in the
>>>>>>> search function on the Navigation page and is suggesting that the solution
>>>>>>> is to remove the Navigation page. The problem with this suggestion is that
>>>>>>> it would fundamentally remove all the navigation on the right-hand side of
>>>>>>> the site -- not what we want.
>>>>>>>
>>>>>>> I have also given some thought to the idea that removing the current site
>>>>>>> will break all links to old site. This is something that cannot be
>>>>>>> overlooked and must be prevented as we do not want to leave users who have
>>>>>>> bookmarked a page high and dry. This is a fairly easy problem to solve this
>>>>>>> using some mod_rewrite rules, the question is if ASF Infra is willing to
>>>>>>> allow us to deploy such custom rules.
>>>>>>> This should be investigated when we
>>>>>>> get to that point, but we are not there yet. First, we need to decide the
>>>>>>> best path forward based on what I have described above in the preceding
>>>>>>> paragraphs.
>>>>>>>
>>>>>>>
>>>>>>> Bruce
>>>>>>>
>>>>>>> On Tue, Dec 12, 2017 at 11:39 AM, Martyn Taylor <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I was thinking there would be a single css file for all the pages. But I
>>>>>>>> haven't seen the files yet. Let's have a play around when Bruce pushes the
>>>>>>>> export.
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> On 12 Dec 2017 5:30 pm, "Michael André Pearce" <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> What’s 1600 pages between friends....
>>>>>>>>>
>>>>>>>>> I agree it will be easier to covert to md than to start doing css styles.
>>>>>>>>> It’s all from a wiki anyhow so it’s can’t be that far off.
>>>>>>>>>
>>>>>>>>> It be good to get some samples (eg 50 pages) if not all just to try and
>>>>>>>>> see what it is like.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12 Dec 2017, at 17:04, Clebert Suconic <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>> Exporting to MD and creating a gitbook seems like a big task, I suspect any
>>>>>>>>>>> tool we use will cause a bunch of styling/content issues.
>>>>>>>>>>>
>>>>>>>>>>> At least initially, how about we just create a nice landing page that
>>>>>>>>>>> brings the ActiveMQ site and Artemis site together, and refresh/align the
>>>>>>>>>>> existing content with some CSS?
>>>>>>>>>>
>>>>>>>>>> I was just looking for the minimal effort task. I thought that
>>>>>>>>>> converting these pages into a doc would be easier than converting them
>>>>>>>>>> to another .css...
>>>>>>>>>>
>>>>>>>>>> if the conversion needed to be done anyways... I thought .md would be
>>>>>>>>>> easier and having a better final presentation.
>>>>>>>
>>>>>>> --
>>>>>>> perl -e 'print unpack("u30","D0G)U8V4\@4VYY9&5R\"F)R=6-E+G-N>61E<D\!G;6%I;\"YC;VT*" );'
>>>>>>>
>>>>>>> ActiveMQ in Action: http://bit.ly/2je6cQ
>>>>>>> Blog: http://bsnyder.org/ <http://bruceblog.org/>
>>>>>>> Twitter: http://twitter.com/brucesnyder

--
Daniel Kulp
[email protected] - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com
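
For anyone starting from the body.storage branch described at the top of this message, the following is a minimal Python sketch of how storage-format code macros could be turned into Markdown fenced blocks. It assumes the macros look like <ac:structured-macro ac:name="code"> with an optional "language" parameter and a CDATA plain-text body; that layout is an assumption about the export and has not been checked against the branch. The function name code_macros_to_markdown is hypothetical, and nested macros are not handled.

```python
import re

# Markdown code fence, built from single backticks so this listing is easy to embed.
FENCE = "`" * 3

# ASSUMPTION: code macros in the storage export use the "ac:" namespace prefix
# and carry their source text in a CDATA section, as sketched in the lead-in.
CODE_MACRO = re.compile(
    r'<ac:structured-macro\b[^>]*\bac:name="code"[^>]*>(.*?)</ac:structured-macro>',
    re.DOTALL,
)
LANGUAGE = re.compile(r'<ac:parameter\s+ac:name="language">([^<]*)</ac:parameter>')
BODY = re.compile(
    r'<ac:plain-text-body>\s*<!\[CDATA\[(.*?)\]\]>\s*</ac:plain-text-body>',
    re.DOTALL,
)


def code_macros_to_markdown(storage: str) -> str:
    """Replace each code macro with a Markdown fenced block; leave the rest as-is."""

    def to_fence(match):
        inner = match.group(1)
        body = BODY.search(inner)
        if body is None:
            # Unexpected macro shape: keep the original markup untouched.
            return match.group(0)
        language = LANGUAGE.search(inner)
        lang = language.group(1).strip() if language else ""
        code = body.group(1).strip("\n")
        return f"\n{FENCE}{lang}\n{code}\n{FENCE}\n"

    return CODE_MACRO.sub(to_fence, storage)


if __name__ == "__main__":
    # Hypothetical page fragment, made up for illustration.
    sample = (
        '<p>Example:</p>'
        '<ac:structured-macro ac:name="code">'
        '<ac:parameter ac:name="language">java</ac:parameter>'
        '<ac:plain-text-body><![CDATA[int x = 1;]]></ac:plain-text-body>'
        '</ac:structured-macro>'
    )
    print(code_macros_to_markdown(sample))
```

Only the code macros are rewritten; the surrounding HTML would still need a general HTML-to-Markdown pass, which is out of scope for this sketch.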
