Hi,
or in perl :)
If there is other data in front
| perl -ne 'if (<> =~ /(? wrote:
> > "lists" == lists writes:
>
> lists> I have a shell script that gets a web page, after around half
> lists> dozen sed/awk one liners I end up with like[1]:
>
> lists> I would like to extract all the 7 dig
> "lists" == lists writes:
lists> I have a shell script that gets a web page, after around half
lists> dozen sed/awk one liners I end up with like[1]:
lists> I would like to extract all the 7 digit numeric values,
lists> currently starting with '313', to use them further in the
lists> s
Hi,
> the page is reasonably constant (until any fixes?)
Yeah, that's always a danger...
> all the values I want are in a single 6000 char line, how do I break the
> 6000 char line into individual values, 'grep any_value file' gives me the
> whole 6000 chars ?
Several ways, probably the easiest
On Tue, January 14, 2014 1:31 pm, Jiří Baum wrote:
> Ah, skip the lynx step. Just work with the html directly.
> All the tools (sed, awk, grep) can work directly with html.
> To some extent it depends on how variable the original page is.
> Once you skip the lynx step, you might even find that
Hi,
> thanks. I think you might have answered my next question already:
> what I'm doing is like: wget url > html; lynx html > text
Ah, skip the lynx step. Just work with the html directly.
All the tools (sed, awk, grep) can work directly with html.
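
A minimal sketch of what working on the HTML directly could look like. It assumes a grep that supports -o and -w (GNU and BSD grep both do, but neither flag is POSIX) and that the wanted values are standalone 7-digit numbers beginning with 313. The function name extract_ids and the sample markup are invented for illustration; in the real script the input would come from something like wget -qO- "$url".

```shell
#!/bin/sh
# extract_ids: hypothetical helper. Reads HTML on stdin and prints each
# 7-digit number starting with 313 on its own line.
extract_ids() {
    # -o prints only the matched text instead of the whole (6000 char) line
    # -w requires the match to be a whole word, so longer digit runs are skipped
    # -E enables the {4} repetition syntax
    grep -owE '313[0-9]{4}'
}

# Invented sample input standing in for the fetched page.
printf '<td>3131234</td><td>3135678</td><a href="x">31399990</a>\n' | extract_ids
```

The -o flag is what addresses the "grep gives me the whole line" problem: each match is printed separately, however long the input line is.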
> even it's somewhat outside of my abilities
On Tue, January 14, 2014 12:45 pm, kfos...@tpg.com.au wrote:
> There are many html reader libraries out there. I have used one for perl
> for example. It enables you to look for some other tag to find your data
> (eg
> the css class name of that particular element) and rip the data by walking
>
will get specific
recommendations on html reader / parser libraries. (eg html agility for C#)
Ta
Ken
-----Original Message-----
From: li...@sbt.net.au
Sent: Tuesday, January 14, 2014 12:35 PM
To: slug@slug.org.au
Subject: [SLUG] script help with grep or regex ?
I have a shell script that gets a web page; after around half a dozen
sed/awk one-liners I end up with something like [1]:
I would like to extract all the 7 digit numeric values, currently starting
with '313', to use them further in the script
I'm hoping there is some better way ? (rather what I'm doing,
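
One possible way to collect the values and use them further in a script, sketched with only tr, grep and sort. The name collect_ids and the sample text are hypothetical, and the sketch assumes each wanted value is a digit run of exactly seven digits.

```shell
#!/bin/sh
# collect_ids: hypothetical helper. Turns every run of non-digits into a
# single newline, then keeps only lines that are exactly 313 plus four
# more digits, and drops repeated values.
collect_ids() {
    tr -cs '0-9' '\n' | grep -xE '313[0-9]{4}' | sort -u
}

# Invented sample text standing in for the output of the earlier sed/awk steps.
sample='ref 3131234, ref 3135678; again 3131234 and 31399990'

# Loop over the values one at a time for further processing.
printf '%s\n' "$sample" | collect_ids | while read -r id; do
    echo "processing $id"
done
```

Splitting on non-digits first means the pattern can be anchored to the whole line with -x, so an 8-digit run such as 31399990 is not picked up by accident.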