Re: [NTG-context] Html to Context using Wiki + hpricot
Dear All,

I have added an article to the ConTeXt wiki on using Wiki as a collaborative medium for making ConTeXt documents. It is a preliminary version. I will continue to polish it in my spare time. Meanwhile comments/suggestions are welcome.

http://wiki.contextgarden.net/HTML_and_ConTeXt

saji

* luigi scarso [EMAIL PROTECTED] [2007-07-11 10:27:12 +0200]:
> On 7/11/07, Saji Njarackalazhikam Hameed [EMAIL PROTECTED] wrote:
> > Hello All, I wanted to share my recent experience
>
> Really interesting. Please, put all these on wiki.contextgarden.net
> -- luigi

--
Saji N. Hameed
APEC Climate Center                               +82 51 668 7470
National Pension Corporation Busan Building 12F
Yeonsan 2-dong, Yeonje-gu, BUSAN 611705           [EMAIL PROTECTED]
KOREA
___
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___
[NTG-context] Html to Context using Wiki + hpricot
Hello All,

I wanted to share my recent experience in co-ordinated document development. In our office we have to make annual reports, each part of which is contributed by a member. Previously everybody wrote 'Word' documents, which were compiled into a larger report. Recently we had the idea to use a Wiki to take the pain out of this process and to make it enjoyable for everyone involved.

After looking at various wiki software we decided to install a brand-new one called Informl (http://informl.folklogic.net). One nice feature of it is that the edit mode uses twin windows, one for input and the other a realtime preview window. In any case, this approach could work with any Wiki software.

One motivation behind using a Wiki as the front-end was to involve people who knew nothing about TeX or ConTeXt. Secondly, it allowed any person to participate in the process from anywhere.

Next we used the Ruby library hpricot to retrieve the web document and filter it into a ConTeXt document. This step was interesting and I would like to share the code with anybody interested. I am a novice Ruby programmer, so the code may be far from perfect .. nevertheless.
saji

% scan_page.rb = Retrieves the HTML page of interest from the server, navigates to the links within the main page, and constructs a ConTeXt document

#!/usr/bin/ruby
require 'rubygems'
require 'open-uri'
require 'hpricot'
require 'scrape_page'   # provides scrape_the_page (scrape_page.rb must be on the load path)

# scans the home page and lists
# all the directories and subdirectories
doc=Hpricot(open("http://190.1.1.24:3010/AnnRep07"))

# (re)create the main ConTeXt file and write its preamble
mainfil="annrep.tex"
`rm #{mainfil}`
fil=File.new(mainfil,"a")
fil.write "\\input context_styles \n"
fil.write "\\starttext \n"
fil.write "\\leftaligned{\\BigFontOne Contents} \n"
fil.write "\\vfill \n"
fil.write "{ \\switchtobodyfont[10pt] "
fil.write "\\startcolumns[n=2,balance=no,rule=off,option=background,frame=off,background=color,backgroundcolor=blue:1] \n"
fil.write "\\placecontent \n"
fil.write "\\stopcolumns \n"
fil.write "}"

# every wiki link on the home page is a chapter
chapters= (doc/"p/a.existingWikiWord")

# we need to navigate one more level into the web page
# let us discover the links for that
chapters.each {|ch|
  chap_link = ch.attributes['href']
  # using inner_html we can create subdirectories
  chap_name = ch.inner_html.gsub(/\s*/,"")
  chap_name_org = ch.inner_html
  # We create chapter directories
  system("mkdir -p #{chap_name}")
  puts chap_name
  # if chapter name starts with underscore (_) skip it
  if chap_name.match(/^\_/)
    puts chap_name
    next
  end
  fil.write "\\input #{chap_name} \n"
  chapFil="#{chap_name}.tex"
  `rm #{chapFil}`
  cFil=File.new(chapFil,"a")
  cFil.write "\\chapter{ #{chap_name_org} } \n"
  # We navigate to sections now
  doc2=Hpricot(open(chap_link))
  sections= (doc2/"p/a.existingWikiWord")
  sections.each {|sc|
    sec_link = sc.attributes['href']
    sec_name = sc.inner_html.gsub(/\s*/,"")
    # one .tex file and one .html file per section
    secFil="#{chap_name}/#{sec_name}.tex"
    `rm #{secFil}`
    sFil=File.new(secFil,"a")
    sechFil="#{chap_name}/#{sec_name}.html"
    `rm #{sechFil}`
    shFil=File.new(sechFil,"a")
    # scrape_the_page(sec_link,"#{chap_name}/#{sec_name}")
    scrape_the_page(sec_link,sFil,shFil)
    cFil.write "\\input #{chap_name}/#{sec_name} \n"
  }
}
fil.write "\\stoptext \n"

% The program calls scrape_page.rb, a function that does most of the filtering
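The chapter-naming and skip rule used in the loop above can be tried on its own, without a wiki server or hpricot. The following is only a minimal sketch: the helper names (chapter_file_name, skip_chapter?, chapter_inputs) and the sample titles are hypothetical and do not appear in the original script.

```ruby
# Hypothetical helpers illustrating the naming rule from scan_page.rb.

# Strip all whitespace from a wiki link title to get a file/directory name.
def chapter_file_name(title)
  title.gsub(/\s+/, "")
end

# Pages whose name starts with an underscore are left out of the report.
def skip_chapter?(title)
  chapter_file_name(title).start_with?("_")
end

# Build the \input lines that would go into the main ConTeXt file.
def chapter_inputs(titles)
  titles.reject { |t| skip_chapter?(t) }
        .map { |t| "\\input #{chapter_file_name(t)}" }
end

# Made-up titles, standing in for the links found on the wiki home page.
titles = ["Climate Prediction", "_Scratch Page", "Training  Programs"]
puts chapter_inputs(titles)
```

Note that the original script creates the chapter directory before it tests for the underscore, so skipped pages still get an (empty) directory; the sketch filters first.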
Function: scrape_page.rb

def scrape_the_page(pagePath,oFile,hFile)
  items_to_remove = [
    "#menus",          #menus notice
    "div.markedup",
    "div.navigation",
    "head",            #table of contents
    "hr"
  ]
  doc=Hpricot(open(pagePath))
  @article = (doc/"#container").each do |content|
    #remove unnecessary content and edit links
    items_to_remove.each { |x| (content/x).remove }
  end
  # Write HTML content to file
  hFile.write @article.inner_html

  # How to replace various syntactic elements using Hpricot
  # replace b elements nested one level below p (p/*/b) with \bf
  (@article/"p/*/b").each do |pb|
    pb.swap("{\\bf #{pb.inner_html}}")
  end
  # replace p/b element with \bf
  (@article/"p/b").each do |pb|
    pb.swap("{\\bf #{pb.inner_html}}")
  end
  # replace strong element with \bf
  (@article/"strong").each do |ps|
    ps.swap("{\\bf #{ps.inner_html}}")
  end
  # replace h1 element with section
  (@article/"h1").each do |h1|
    h1.swap("\\section{#{h1.inner_html}}")
  end
  # replace h2 element with subsection
  (@article/"h2").each do |h2|
    h2.swap("\\subsection{#{h2.inner_html}}")
  end
  # replace h3 element with subsubsection
  (@article/"h3").each do |h3|
    h3.swap("\\subsubsection{#{h3.inner_html}}")
  end
  # replace h4 element with subsubsubsection
  (@article/"h4").each do |h4|
    h4.swap("\\subsubsubsection{#{h4.inner_html}}")
  end
  # replace h5 element with subsubsubsubsection
  (@article/"h5").each do |h5|
    h5.swap("\\subsubsubsubsection{#{h5.inner_html}}")
  end
  # replace pre/code by the equivalent command in ConTeXt
  (@article/"pre").each do |pre|
    pre.swap("\\startcode \n #{pre.at("code").inner_html} \n \\stopcode")
  end
  # when we encounter a reference to a figure inside the html
  #
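The repeated heading blocks in the function above all follow one pattern, which can be expressed as a lookup table. The sketch below is a simplified stand-in, not the code from the post: it works on a plain HTML string with regular expressions instead of an Hpricot document, so that it runs with the Ruby standard library alone. The names HEADING_COMMANDS, headings_to_context and bold_to_context, and the sample string, are assumptions made for illustration.

```ruby
# Simplified stand-in for the swap calls: plain strings and regular
# expressions instead of Hpricot elements (an assumption of this sketch).

# HTML heading tag => ConTeXt sectioning command, as in scrape_page.rb.
HEADING_COMMANDS = {
  "h1" => "section",
  "h2" => "subsection",
  "h3" => "subsubsection",
  "h4" => "subsubsubsection",
  "h5" => "subsubsubsubsection"
}.freeze

# Replace each <hN>...</hN> pair with the matching ConTeXt command.
def headings_to_context(html)
  HEADING_COMMANDS.reduce(html) do |text, (tag, command)|
    text.gsub(%r{<#{tag}[^>]*>(.*?)</#{tag}>}m) { "\\#{command}{#{$1}}" }
  end
end

# Replace <b>...</b> and <strong>...</strong> with {\bf ...}.
def bold_to_context(html)
  html.gsub(%r{<(b|strong)>(.*?)</\1>}m) { "{\\bf #{$2}}" }
end

# Made-up sample input.
sample = "<h1>Overview</h1><p>A <strong>key</strong> result</p><h3>Details</h3>"
puts bold_to_context(headings_to_context(sample))
```

Regular expressions are fragile on real-world HTML (nested or malformed tags), which is a good reason to use a parser as the original code does; the table only shows how the five near-identical blocks could be folded into one loop.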
Re: [NTG-context] Html to Context using Wiki + hpricot
On 7/11/07, Saji Njarackalazhikam Hameed [EMAIL PROTECTED] wrote:
> Hello All, I wanted to share my recent experience

Really interesting. Please, put all these on wiki.contextgarden.net

-- luigi
Re: [NTG-context] Html to Context using Wiki + hpricot
Thanks, Luigi. I will do so. Would it be appropriate to put it under the section General ConTeXt Documents? If not, please let me know where would be a good place to add this article.

saji

* luigi scarso [EMAIL PROTECTED] [2007-07-11 10:27:12 +0200]:
> On 7/11/07, Saji Njarackalazhikam Hameed [EMAIL PROTECTED] wrote:
> > Hello All, I wanted to share my recent experience
>
> Really interesting. Please, put all these on wiki.contextgarden.net
> -- luigi
Re: [NTG-context] Html to Context using Wiki + hpricot
> Would it be appropriate to put it under the section General ConTeXt Documents?

Yes. (You can always move it to another category later.)

-- luigi