
Re: [Hampshire] extracting phrases from a file.

2011-09-14 Thread David Anderson

On Mon, 12 Sep 2011 10:17:44 +0100 James Courtier-Dutton wrote:
> Hi.
>
> I have a large file that contains snips of http pages.
> Each line is like this:
> some junk.
>
> I want to extract the "some url" bits. I.e. Remove the href.
> You can probably do this quite easily in perl.
> Are th

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread Jeremy Hooks

Just lurking and I saw this. A simple technique might be to insert a new line before each href then use grep and cut.

e.g. open it in vim and do:

    :%s/href=/^Mhref=/gc
    :%s/HREF=/^Mhref=/gc

(where ^M is ctrl+v followed by the return key)

Then

    grep href filename.html|cut -d '"' -f 2

and option
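For anyone who would rather script it than run it through vim, here is a rough Python sketch of the same split-then-cut idea. The function name `hrefs_by_splitting` is made up for illustration, and it assumes every URL is wrapped in double quotes directly after `href=`, as the cut command above does. Unlike cut, it skips a piece that has no closing quote.

```python
def hrefs_by_splitting(text):
    """Roughly mimic: newline before each href, then grep href | cut -d '"' -f 2."""
    # Normalise the upper-case spelling, as the second vim substitution does.
    text = text.replace("HREF=", "href=")
    urls = []
    # Every piece after the first begins immediately after an 'href='.
    for piece in text.split("href=")[1:]:
        fields = piece.split('"')
        # Field 2 in cut's terms is the text between the first pair of quotes.
        # Require a closing quote (assumption: unquoted values are ignored).
        if len(fields) >= 3:
            urls.append(fields[1])
    return urls
```

Mixed-case spellings such as `Href=` are not handled here, just as they are not handled by the two substitutions above.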

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread Vic

> You can probably do this quite easily in perl.

You can.

> Are there any nice short programs to do this?

Something like this?

    #! /usr/bin/perl
    my $fname = $ARGV[0];
    die "need a filename" unless defined ($fname);
    open INFILE, "<$fname" or die "Can't open $fname for reading";
    while (<INFILE>) {

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread Alan Pope

On 12 September 2011 10:54, James Courtier-Dutton wrote:
>> lynx -dump --hiddenlinks=ignore foo.html
>>
>> Will dump it to stdout in plain text form with URLs removed.
>>
>
> Sorry, I was not very clear.
> I wish to keep the "some url" bits, and get rid of all the "some junk" bits.
> I.e. I wish t

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread Bob Dunlop

On Mon, Sep 12 at 10:17, James Courtier-Dutton wrote:
> Hi.
>
> I have a large file that contains snips of http pages.
> Each line is like this:
> some junk.
>
> I want to extract the "some url" bits. I.e. Remove the href.
> You can probably do this quite easily in perl.
> Are there any nice

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread James Courtier-Dutton

On 12 September 2011 10:37, Alan Pope wrote:
> On 12 September 2011 10:17, James Courtier-Dutton
> wrote:
>> I want to extract the "some url" bits. I.e. Remove the href.
>> You can probably do this quite easily in perl.
>> Are there any nice short programs to do this?
>> Is it easier to do in some o

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread Alan Pope

On 12 September 2011 10:17, James Courtier-Dutton wrote:
> I want to extract the "some url" bits. I.e. Remove the href.
> You can probably do this quite easily in perl.
> Are there any nice short programs to do this?
> Is it easier to do in some other language?
>

lynx -dump --hiddenlinks=ignore foo.

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread James Courtier-Dutton

Hi,

I forgot to mention, my starting document is not a valid http document so probably will not load into a web browser. Would what you have said still work?
I need this to be run as a cron job, so use of a web browser is probably not the best solution.

On 12 September 2011 10:21, Benjie Gillam
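One browser-free option that copes with fragments rather than whole documents is Python's standard `html.parser` module, which does not insist on well-formed input. The following is only a sketch under that assumption. The names `HrefCollector` and `collect_hrefs` are hypothetical, and it takes the file name as its first argument so it can be called from a cron job.

```python
import sys
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> start tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def collect_hrefs(text):
    """Return the href values found in a (possibly incomplete) HTML snippet."""
    parser = HrefCollector()
    parser.feed(text)
    parser.close()
    return parser.hrefs


# Only runs when invoked as a script with a file name argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for url in collect_hrefs(handle.read()):
            print(url)
```

It prints one URL per line, so the output can be redirected to a file from the crontab entry.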

Re: [Hampshire] extracting phrases from a file.

2011-09-12 Thread Benjie Gillam

Or, alternatively, open it into a decent web browser and type this into the JavaScript console:

    var as = document.getElementsByTagName('a');
    var hrefs=[];
    for (var i = 0, l = as.length; i

> Hi.
>
> I have a large file that contains snips of http pages.
> Each line is like this:
> some junk...

[Hampshire] extracting phrases from a file.

2011-09-12 Thread James Courtier-Dutton

Hi.

I have a large file that contains snips of http pages.
Each line is like this:
some junk.

I want to extract the "some url" bits. I.e. Remove the href.
You can probably do this quite easily in perl.
Are there any nice short programs to do this?
Is it easier to do in some other language?
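As a minimal sketch in another language, a regular expression in Python can pull out the quoted value after each `href=`. The example markup in the archived message has been lost, so this assumes each line looks something like `<a href="some url">some junk</a>`. The names `HREF_RE` and `extract_hrefs` are made up for illustration.

```python
import re

# Match href="..." or href='...', in any letter case, allowing spaces around '='.
HREF_RE = re.compile(r'''href\s*=\s*(?:"([^"]*)"|'([^']*)')''', re.IGNORECASE)


def extract_hrefs(line):
    """Return the quoted href values found in one line of text."""
    # findall yields (double_quoted, single_quoted) pairs; one of them is empty.
    return [double or single for double, single in HREF_RE.findall(line)]
```

For example, `extract_hrefs('<a href="some url">some junk</a>')` returns `['some url']`. Unquoted attribute values are not matched by this pattern.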
