As Bruce and Randall said, there is no perfect solution if the structure of the parsed page changes, so you need a control point before parsing to be sure that you get a good result at the end. If the control shows a structure change, inform the user that the parser needs to be reviewed.

There are two kinds of parsing. Manual: using InStr and other common text manipulation tools. I use it when I need to find one piece of data on one line, because it is quicker than a DOM tool, which has to parse the whole HTML structure first. But if you need many pieces of information in many places on the web page, the DOM is better and tolerates more changes in the page before you need to change the parser, simply because the searches use tags and attributes.

I will send you an example tomorrow.

I too have done a lot of data scraping over the past few years. I think picking the right tool for the job will ease your development. I have many Python and Java tools and have played with the Gambas parser, but I have found very little that matches the ease of development I find with Python and Selenium. Selenium is not just a scraping tool. In fact, it wasn't meant for that at all. It is a browser automation tool and website test framework. With it, I've had little problem dealing with typical changes in content. It is also great for comparing the page code sent to different browsers. BeautifulSoup is great for well-structured pages, but once that structure is lost it often fails. XMLlib and HTMLlib and other Python modules just don't seem to match the productivity I find with Selenium.

It all comes down to how general you need your solution to be. Is this a one-off scrape or something that you intend to do over a long period of time? Do you know Python or Java, or can you learn it quickly? Must your solution scale to large projects, or just this one use? Answer these questions and then review the options. If it is something the Gambas parser can handle, then use it. Or, if the page is very stable and well structured, then write a parser.
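As a rough illustration of the two kinds of parsing and the control point, here is a minimal Python sketch against the markup from the original question. It uses only the standard library's `html.parser` (not gb.xml.html, BeautifulSoup, or Selenium). The names `ResultParser`, `parse_results`, `first_col_manual`, and `EXPECTED_COLS` are made up for this example, and the "three columns per result" rule is an assumption taken from the sample, not something the real page guarantees.

```python
from html.parser import HTMLParser

# Sample markup copied from the original question.
SAMPLE = """
<div class="result">
  <div class="col">Text I Want</div>
  <div class="col">
    And Some More i Want
  </div>
  <div class="col">
    And The last bit
  </div>
</div>
"""

EXPECTED_COLS = 3  # assumption: every result block has exactly three columns


class ResultParser(HTMLParser):
    """Collect the text of each <div class="col"> inside <div class="result">."""

    def __init__(self):
        super().__init__()
        self.rows = []    # one list of column texts per result block
        self._stack = []  # role of each currently open <div>
        self._buf = None  # text fragments of the column being read, if any

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if "result" in classes:
            self.rows.append([])
            self._stack.append("result")
        elif "col" in classes and "result" in self._stack:
            self._buf = []
            self._stack.append("col")
        else:
            self._stack.append("")  # unrelated <div>, tracked only for nesting

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != "div" or not self._stack:
            return
        if self._stack.pop() == "col":
            # Collapse line breaks and indentation into single spaces.
            self.rows[-1].append(" ".join("".join(self._buf).split()))
            self._buf = None


def parse_results(html):
    """Tag-based extraction with a control point on the expected structure."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    # Control point: report a structure change instead of returning bad data.
    if not parser.rows or any(len(r) != EXPECTED_COLS for r in parser.rows):
        raise ValueError("page structure changed; the parser needs to be reviewed")
    return parser.rows


def first_col_manual(html):
    """Manual extraction of one value by string search, in the spirit of InStr."""
    marker = '<div class="col">'
    start = html.find(marker)
    end = html.find("</div>", start + len(marker)) if start != -1 else -1
    if end == -1:
        raise ValueError("marker not found; the parser needs to be reviewed")
    return html[start + len(marker):end].strip()


if __name__ == "__main__":
    print(parse_results(SAMPLE))
    print(first_col_manual(SAMPLE))
```

The manual function breaks as soon as the `class` attribute gains a second value or changes its quoting, while the tag-based version keeps working; that is the trade-off described above.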
A basic parser is not difficult to write. Search the internet for Jack Crenshaw's article on building a simple parser. However, if the page is complex and this is a long-term project, you may want to consider a more powerful and stable solution. As Bruce said, put in lots of tests along the way, because some pages do change constantly. Having a reporting system that allows you to locate such changes is very helpful in a high-volume production environment.

Hope this helps.

On Sat, Jul 13, 2013 at 2:33 AM, Fabien Bodard <gambas...@gmail.com> wrote:
> Send me an example URL for the page
> On 13 Jul 2013 10:52, "Shane" <shanep1...@tpg.com.au> wrote:
> >
> > On 13/07/13 18:33, Fabien Bodard wrote:
> > > There is a parsing tool in Gambas for HTML.
> > >
> > > Gb.xml.html
> > >
> > > It's our own HTML DOM parser. It allows generating well-formatted HTML5
> > > pages and/or parsing existing HTML pages.
> > >
> > > It's one of the fastest parsers I know.
> > >
> > > Look at that ... And if you need, I can show you some examples.
> > > On 13 Jul 2013 08:20, "Caveat" <gam...@caveat.demon.co.uk> wrote:
> > >
> > >> You need to use the right tool for the job. I find the Python tool
> > >> BeautifulSoup one of the best for parsing and extracting data from web
> > >> pages.
> > >>
> > >> http://www.crummy.com/software/BeautifulSoup/
> > >>
> > >> Kind regards,
> > >> Caveat
> > >>
> > >> On 12/07/13 09:01, Shane wrote:
> > >>> Hi everyone
> > >>>
> > >>> I am trying to get some info from a web page in the format of
> > >>>
> > >>> <div class="result">
> > >>>   <div class="col">Text I Want</div>
> > >>>   <div class="col">
> > >>>     And Some More i Want
> > >>>   </div>
> > >>>   <div class="col">
> > >>>     And The last bit
> > >>>   </div>
> > >>> </div>
> > >>>
> > >>> What would be the best way to go about this? I have tried a few ways, but
> > >>> I feel there must be an easy way to do this.
> > >>>
> > >>> Thanks, Shane
> > >>>
> > >>> ------------------------------------------------------------------------------
> > >>> See everything from the browser to the database with AppDynamics
> > >>> Get end-to-end visibility with application monitoring from AppDynamics
> > >>> Isolate bottlenecks and diagnose root cause in seconds.
> > >>> Start your free trial of AppDynamics Pro today!
> > >>> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
> > >>> _______________________________________________
> > >>> Gambas-user mailing list
> > >>> Gambas-user@lists.sourceforge.net
> > >>> https://lists.sourceforge.net/lists/listinfo/gambas-user
> >
> > Thanks everyone for the replies,
> > and I would like some examples, thanks Fabien.

--
If you ask me if it can be done. The answer is YES, it can always be done. The correct questions however are... What will it cost, and how long will it take?