There are many HTML reader libraries out there. I have used one for Perl, for example. Such a library lets you find your data by some other marker (e.g. the CSS class name of that particular element) and pull the data out by walking the HTML tree.

Pick a language and let us know; I am sure you will get specific recommendations on HTML reader/parser libraries (e.g. HTML Agility Pack for C#).
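
As a rough illustration of the idea (not a recommendation of a particular library), here is a minimal sketch in Python using only the standard-library html.parser module. The class name "item-id" and the sample markup are invented for the example, since the real page was not posted; ClassTextCollector and extract_by_class are hypothetical helper names.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.parts = []     # text fragments of the current match
        self.results = []   # finished matches, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested element inside a match
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.depth = 1
            self.parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self.parts).strip())

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_by_class(html, css_class):
    """Return the text of all elements whose class list contains css_class."""
    parser = ClassTextCollector(css_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup; the real page structure is an assumption.
sample_html = """
<div class="item"><span class="item-id">3137973</span> stf, Stuff</div>
<div class="item"><span class="item-id">3137966</span> stf, Stuff</div>
"""

print(extract_by_class(sample_html, "item-id"))
```

The point of the design is that the lookup depends on the markup around the value rather than on what the value looks like, so it keeps working if the numbers stop starting with 313.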

Ta
Ken


-----Original Message-----
From: li...@sbt.net.au
Sent: Tuesday, January 14, 2014 12:35 PM
To: slug@slug.org.au
Subject: [SLUG] script help with grep or regex ?

I have a shell script that gets a web page; after around half a dozen
sed/awk one-liners I end up with something like [1]:

I would like to extract all the 7 digit numeric values, currently starting
with '313....', to use them further in the script

I'm hoping there is some better way? (rather than what I'm doing, 'grep 313')

[1]-------
Page: 1 Items: 1 - 2

Items(Total: 2 displaying 2)

3137973
stf, Stuff, morestuff, Stuff2:stuff3,
Date: 14/01/2014, Time: 09:30 AM
14/01/2014 09:30
Notes(0)

3137966
stf, Stuff, morestuff, Stuff2:stuff3,
Date: 06/02/2014, Time: 09:30 AM
06/02/2014 09:30
Notes(0)
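
For text that has already been reduced to the form shown in [1], one possible alternative to 'grep 313' is to match on the shape of the value (exactly seven digits alone on a line) instead of its current prefix. The sketch below is in Python and assumes each ID sits on its own line, as in the sample; extract_ids is a hypothetical helper name.

```python
import re

# Exactly seven digits alone on a line, allowing surrounding spaces or tabs.
ID_PATTERN = re.compile(r"^[ \t]*(\d{7})[ \t]*$", re.MULTILINE)


def extract_ids(text):
    """Return every 7-digit value that occupies a whole line of text."""
    return ID_PATTERN.findall(text)


# The sample output from [1] above.
sample = """\
Page: 1 Items: 1 - 2

Items(Total: 2 displaying 2)

3137973
stf, Stuff, morestuff, Stuff2:stuff3,
Date: 14/01/2014, Time: 09:30 AM
14/01/2014 09:30
Notes(0)

3137966
stf, Stuff, morestuff, Stuff2:stuff3,
Date: 06/02/2014, Time: 09:30 AM
06/02/2014 09:30
Notes(0)
"""

print(extract_ids(sample))  # ['3137973', '3137966']
```

Inside the shell script itself, a close equivalent (assuming no whitespace around the numbers) would be grep -Ex '[0-9]{7}', which keeps only lines consisting of exactly seven digits.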



--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html



-----
No virus found in this message.
Checked by AVG - www.avg.com
Version: 2014.0.4259 / Virus Database: 3658/6997 - Release Date: 01/12/14

