Getting the text content of a HTML page

2008-08-02 Thread H Baric
Hi again *blush* Okay, this is no doubt something very simple even though I've searched through the docs but can't find exactly how to do this seemingly straightforward task: * Get the text only from a web page - no html tags, no formatting etc. I can get the html doc to appear in my field by u

Re: Getting the text content of a HTML page

2008-08-02 Thread Eric Chatonet
Re, On 2 Aug 2008 at 16:31, H Baric wrote: * Get the text only from a web page - no html tags, no formatting etc. LOL This is a case that needs an additional code snippet, as I said in a previous email :-) put StripTags(thePage) into field "The Page"

Re: Getting the text content of a HTML page

2008-08-02 Thread J. Landman Gay
H Baric wrote: Hi again *blush* Okay, this is no doubt something very simple even though I've searched through the docs but can't find exactly how to do this seemingly straightforward task: * Get the text only from a web page - no html tags, no formatting etc. One simple way is: set the ht
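
The preview above is cut off, but it appears to point at the htmlText property. A minimal sketch of that idea follows, assuming a field named "Page Text" and a placeholder URL (both hypothetical, not from the thread): render the markup into a field, then read the field back as plain text.

```livecode
-- Sketch only: the URL and the field name "Page Text" are placeholders.
on mouseUp
   put url "http://www.example.com/" into tPage
   if the result is not empty then
      answer "Could not load the page:" && the result
      exit mouseUp
   end if
   -- Let the engine interpret the markup by rendering it into the field...
   set the htmlText of field "Page Text" to tPage
   -- ...then read the field back as unstyled text and store that instead.
   put the text of field "Page Text" into tPlain
   put tPlain into field "Page Text"
end mouseUp
```

The htmlText property understands only a subset of HTML, so the contents of script or style blocks may still show up as ordinary text and need separate cleanup.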

Re: Getting the text content of a HTML page

2008-08-02 Thread H Baric
t chunk of wool ; tie it in a knot ; create noose ; end knitOne - Original Message - From: "Eric Chatonet" <[EMAIL PROTECTED]> To: "How to use Revolution" Sent: Sunday, August 03, 2008 12:33 AM Subject: Re: Getting the text content of a HTML page Re, L

Re: Getting the text content of a HTML page

2008-08-02 Thread H Baric
August 03, 2008 12:59 AM Subject: Re: Getting the text content of a HTML page H Baric wrote: > Hi again *blush* > > Okay, this is no doubt something very simple even though I've searched > through the docs but can't find exactly how to do this seemingly > straightf

Re: Getting the text content of a HTML page

2008-08-02 Thread Eric Chatonet
elect chunk of wool ; tie it in a knot ; create noose ; end knitOne - Original Message - From: "Eric Chatonet" <[EMAIL PROTECTED]> To: "How to use Revolution" Sent: Sunday, August 03, 2008 12:33 AM Subject: Re: Getting the text content of a HTML page Re, Le 2

Re: Getting the text content of a HTML page

2008-08-02 Thread Andres Martinez
Hello Jacqueline I am using this "htmltext" for the first time because I want to have links that work in a text field. Links are shown with a different color that changes when clicked. But no browser window opens and no page is retrieved. Any thoughts? Regards, Andres Martinez www.baKno.

Re: Getting the text content of a HTML page

2008-08-02 Thread Andres Martinez
Nevermind I found it on the forum... on linkClicked theLink revGoUrl theLink end linkClicked Regards, Andres Martinez www.baKno.com On Aug 2, 2008, at 1:28 PM, Andres Martinez wrote: Hello Jacqueline I am using this "htmltext" for the first time because I want to have links that work in
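
Laid out as a script, the handler quoted in the message above would look like this. It is assumed to live in the script of the field (or somewhere later in the message path), and the field needs its lockText set to true for link clicks to be reported.

```livecode
-- Sent when the user clicks text whose textStyle includes "link".
-- theLink holds the linkText of the clicked text (or the text itself
-- when no linkText has been set).
on linkClicked theLink
   revGoURL theLink -- open the address in the default browser
end linkClicked
```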

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
Okay, can someone please tell me if this (or something remotely like it) would be possible (if script was written correctly, which I can't for the life of me work out how): put url "somewebpage" into field "theField" put return before all the "<" put return after all the ">" replace lines con

Re: Getting the text content of a HTML page

2008-08-03 Thread Mark Schonewille
Hi, I tried the following: put replacetext("aaa<bbb>ccc","<*>","") While put replacetext("aaaxbbbxccc","x*x","") works, the former doesn't. Does anyone know why? I tried escaping the < and >, but that didn't work either. Bug? -- Best regards, Mark Schonewille Economy-x-Talk Consulting and Soft

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
(Re my last post), I guess what I want to know is, how to use the info in the FILTER and REGULAR EXPRESSIONS etc for Processing Text / Data, when there are other characters ***besides A-Z and 0-9*** I can't find any examples that use other characters, they all use letters and numbers! Everythin

Re: Getting the text content of a HTML page

2008-08-03 Thread Mark Schonewille
Hi Heather, This ought to be filter myLines without "*<*>*" -- Best regards, Mark Schonewille Economy-x-Talk Consulting and Software Engineering http://economy-x-talk.com http://www.salery.biz Benefit from our inexpensive hosting services. See http://economy-x-talk.com/server.html for more

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
Thanks for helping me make sense of that Mark :) I tested it out on my project, and it filters out everything in the entire field! I thought I knew why, as the field displays: This is a table. Can you see this text? Everything IS in between the < and >. But then I tried adding text outside the

Re: Getting the text content of a HTML page

2008-08-03 Thread Sarah Reichelt
> But, just out of interest, is there a way to script "if there are more than > one blank line together, get rid of the extras and just have one" ? repeat while tString contains cr & cr replace cr & cr with cr in tString end repeat That should do it. Cheers, Sarah _
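
Sarah's loop, written out as a function (the function names here are illustrative, not from the thread). Note that replacing every pair of returns with one return removes all blank lines. If a single blank line should survive between paragraphs, a variant that collapses three returns into two is one possible adjustment.

```livecode
-- As posted: ends up with no blank lines at all.
function removeBlankLines tString
   repeat while tString contains (cr & cr)
      replace cr & cr with cr in tString
   end repeat
   return tString
end removeBlankLines

-- Variant (an assumption about the intent): keep one blank line per run.
function collapseBlankLines tString
   repeat while tString contains (cr & cr & cr)
      replace cr & cr & cr with cr & cr in tString
   end repeat
   return tString
end collapseBlankLines
```

Neither version treats a line that holds only spaces or tabs as blank.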

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
heers, Heather :) - Original Message - From: "Scott Morrow" <[EMAIL PROTECTED]> To: "How to use Revolution" Sent: Sunday, August 03, 2008 8:17 PM Subject: Re: Getting the text content of a HTML page Hello Heather, Mark's use of filter answered your qu

Re: Getting the text content of a HTML page

2008-08-03 Thread Scott Morrow
Hello Heather, Mark's use of filter answered your question nicely. This chunking method illustrates another, less elegant, way of doing what your pseudo-code below describes. put field "theField" into tText replace "<" with (CR&"<") in tText replace ">" with (">"&CR) in tText -- bu

Re: Getting the text content of a HTML page

2008-08-03 Thread Sarah Reichelt
On Sun, Aug 3, 2008 at 12:31 AM, H Baric <[EMAIL PROTECTED]> wrote: > Hi again *blush* > > Okay, this is no doubt something very simple even though I've searched > through the docs but can't find exactly how to do this seemingly > straightforward task: > > * Get the text only from a web page - no

Re: Getting the text content of a HTML page

2008-08-03 Thread Mark Schonewille
Heather, My little example filters out all lines that have a < followed by a >. I thought you were putting returns before the <'s and after the >'s, which should make the example work, since all tags would be on their own separate line, if the site doesn't contain any mistakes. -- Best rega
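
Putting the two steps Mark describes together gives roughly the following. This is a sketch under the assumption that every tag opens and closes on the same line of the source; the function name is made up for illustration.

```livecode
function stripTagLines pHtml
   -- Put every tag on a line of its own...
   replace "<" with (cr & "<") in pHtml
   replace ">" with (">" & cr) in pHtml
   -- ...then drop each line that contains a "<" followed later by a ">".
   filter pHtml without "*<*>*"
   return pHtml
end stripTagLines
```

A tag that is broken across two lines in the page source is not removed by this approach, because neither half matches the filter pattern on its own. The result also tends to contain many empty lines, which can be collapsed afterwards.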

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
Thanks Sarah! I have to say I feel a bit like a *duhh* dog-paddling around in brain soup here. But I'm determined, and definitely progressing daily thanks to all the wonderful readily available documents and examples, as well as all the friendly live help on demand here. :) Wow, your script ev

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
Excellent! Thanks so much Sarah! - Original Message - From: "Sarah Reichelt" <[EMAIL PROTECTED]> To: "How to use Revolution" Sent: Sunday, August 03, 2008 8:32 PM Subject: Re: Getting the text content of a HTML page > But, just out of interest, is there

Re: Getting the text content of a HTML page

2008-08-03 Thread H Baric
A, thanks Mark. A bit scattered am I at times... okay most times... but your script makes perfect sense now and I'm sure it will work. Haven't tried it again yet though - next on Experiments-To-Do! I've been having fun using bits and pieces from everyone's offerings the last couple of days,

Re: Getting the text content of a HTML page

2008-08-03 Thread Sarah Reichelt
> Another question though before I close this bloomin HTML thing and do > something different (like my abandoned Notepad app). > > How do I delete lines (in the resulting text extracted from the html page) > that contain a url in them? They appear by themselves on separate lines... > > I'm currentl
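
The reply above is truncated. One way to approach the question it quotes is the filter command again, sketched here under the assumption that the unwanted lines can be recognised by an "http://" or "https://" prefix somewhere in the line (the function name is hypothetical).

```livecode
function removeUrlLines pText
   -- Drop every line that contains a web address.
   filter pText without "*http://*"
   filter pText without "*https://*"
   return pText
end removeUrlLines
```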

Re: Getting the text content of a HTML page

2008-08-04 Thread viktoras didziulis
one more way to do things using regular expressions: put the replaceText(myText,"","") into myText will simply replace all tags with empty string. Where myText is the text where replacements have to be made. is a regular expression matching most html tags and "" is empty replacement string.

Re: Getting the text content of a HTML page

2008-08-04 Thread H Baric
nderstand the hows and whys of it. If that makes any sense! Which is why RevOnline is great, as are the forums, the examples and workshops, and of course this group! :) Cheers, Heather ----- Original Message ----- From: "viktoras didziulis" <[EMAIL PROTECTED]> Subject: Re:

Re: Getting the text content of a HTML page

2008-08-04 Thread Eric Chatonet
Hello Heather, On 4 Aug 2008 at 13:01, H Baric wrote: Can you (or if anyone else reading has a moment) please help me understand more about what each part of the is the "" is about? I assume you use Rev 2.9: there is a search tool named 'Rev Search Engine' in the IDE that could help yo

Re: Getting the text content of a HTML page

2008-08-04 Thread H Baric
o explore more of the Rev universe... :D Thanks again :) Kindest regards, Heather - Original Message - From: "Eric Chatonet" <[EMAIL PROTECTED]> To: "How to use Revolution" Sent: Monday, August 04, 2008 9:56 PM Subject: Re: Getting the text content of a HTML pag

Re: Getting the text content of a HTML page

2008-08-04 Thread Eric Chatonet
Thanks for the kind words: I wrote the Rev Search Engine to help all and very often it helps me too :-) I'm really sorry about using 'regex' in my last post but you guessed it: 'regular expression'. I'm sure you'll become a respected contributor shortly: you make really quick progress :-) An

Re: Getting the text content of a HTML page

2008-08-04 Thread Richard Gaskin
viktoras didziulis wrote: one more way to do things using regular expressions: put the replaceText(myText,"","") into myText will simply replace all tags with empty string. Where myText is the text where replacements have to be made. is a regular expression matching most html tags and "" is

Re: Getting the text content of a HTML page

2008-08-04 Thread Jim Ault
On 8/4/08 8:25 AM, "Richard Gaskin" <[EMAIL PROTECTED]> wrote: > viktoras didziulis wrote: >> one more way to do things using regular expressions: >> >> put the replaceText(myText,"","") into myText >> >> will simply replace all tags with empty string. Where myText is the text >> where replacem

Re: Getting the text content of a HTML page

2008-08-04 Thread Richard Gaskin
ev.com?Subject=Getting%20the%20text%20content%20of%20a%20HTML%20page&In-Reply-To=f99b52860808031334l44f6cd1by6ed2444fb32560ac%40mail.gmail.com"; TITLE="Getting the text content of a HTML page"> Presumably this is because that tag is broken onto two lines. Thi

Re: Getting the text content of a HTML page

2008-08-04 Thread viktoras didziulis
whoops sorry, I tested this with basic tags like jsajka. The next 'thing' seems to work OK (the text is in fText field): put replaceText(fld "fText","","") into fld "fText" A small explanation: /? means zero or 1 occurrence of / - because tags may be either opening (without /) or closing (with
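
The regular expression in the message above did not survive in the archive, so it cannot be quoted here. As a stand-in, the sketch below uses a different and simpler pattern, "<[^>]+>", which is an assumption and not the one viktoras posted: a "<", then one or more characters that are not ">", then a ">".

```livecode
function stripTagsRegex pHtml
   -- [^>] also matches line breaks, so tags split across lines are caught.
   return replaceText(pHtml, "<[^>]+>", empty)
end stripTagsRegex
```

For example, `put stripTagsRegex("aaa<b>bbb</b>ccc")` puts aaabbbccc into the message box. A ">" inside an attribute value would end the match early, so this is a rough tool for simple pages and not a full HTML parser.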

Re: Getting the text content of a HTML page

2008-08-04 Thread Jim Ault
;HREF="mailto:use-revolution%40lists.runrev.com?Subject=Getting%20the%20text%20 > content%20of%20a%20HTML%20page&In-Reply-To=f99b52860808031334l44f6cd1by6ed2444 > fb32560ac%40mail.gmail.com" > TITLE="Getting the text content of a HTML page"> > >

Re: Getting the text content of a HTML page

2008-08-05 Thread H Baric
Wow Eric, great work! A valuable contribution to be sure :) I have found gold in the conference stacks especially! Thanks s much. Woohoo stop me now :-D I'm in Australia by the way (anyone else from Oz here?) I've noticed it gets busy at the time I should be Zzz-ing. I don't mind if I have

Re: Getting the text content of a HTML page

2008-08-05 Thread Richard Gaskin
Jim Ault wrote: > Richard wrote: >> This function takes care of that, and thus far benchmarks about >> an order of magnitude faster: >> >> function HtmlTextMethod pHtml >>put the properties of the templateField into tSaveProps >>set the htmlText of the templateField to pHtml >>get the
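
Richard's function is cut off in the preview above, so its remaining lines are unknown. The sketch below is not his code. It is a separate, minimal take on the same templateField idea, with a made-up function name, and it uses reset where his version saves and restores the properties.

```livecode
function plainTextFromHtml pHtml
   -- Let the engine interpret the markup without creating a real field.
   set the htmlText of the templateField to pHtml
   put the text of the templateField into tPlain
   -- Return the templateField to its defaults so later fields are unaffected.
   reset the templateField
   return tPlain
end plainTextFromHtml
```

It could be called as `put plainTextFromHtml(url "http://www.example.com/") into field "Page Text"`, where the URL and field name are placeholders. Unlike saving and restoring the properties, reset discards any templateField settings the caller had made beforehand.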

Re: Getting the text content of a HTML page

2008-08-05 Thread Jim Ault
On 8/5/08 10:18 AM, "Richard Gaskin" <[EMAIL PROTECTED]> wrote: > But given its blindingly fast performance and the scope of things it > handles in well-optimized machine-compiled code in the engine, it seems > a good starting point for a more complete function which would have > relatively littl

Re: Getting the text content of a HTML page

2008-08-05 Thread Sarah Reichelt
On Tue, Aug 5, 2008 at 7:37 PM, H Baric <[EMAIL PROTECTED]> wrote: > Wow Eric, great work! A valuable contribution to be sure :) > > I have found gold in the conference stacks especially! Thanks s much. > Woohoo stop me now :-D > > I'm in Australia by the way (anyone else from Oz here?) Yes,

Re: Getting the text content of a HTML page

2008-08-06 Thread H Baric
Wonderful Sarah! :) Cheers, Heather - Original Message - From: "Sarah Reichelt" <[EMAIL PROTECTED]> To: "How to use Revolution" Sent: Wednesday, August 06, 2008 7:49 AM Subject: Re: Getting the text content of a HTML page On Tue, Aug 5, 2008 at 7:37 PM,