Re: Getting the web page language
Hi,

I am wondering how search engines such as Google and AltaVista find out the language used in a web page. They can find pages written in Romanian, for example, even though the header of the file is the same as for English ones.

Do they search for certain words? If you think this is the only solution, is there a Perl module that can do that?

Thanks.

Teddy's Center: http://teddy.fcc.ro/
Mail: [EMAIL PROTECTED]

- Original Message -
From: "Kevin Meltzer" <[EMAIL PROTECTED]>
To: "Octavian Rasnita" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Sunday, September 08, 2002 8:54 PM
Subject: Re: Getting the web page language

Does it matter what language, or what charset? You can always look at the charset the page declares and the lang="foo" attributes to try to determine what language, or at least what charset, the page is in. Of course, a charset (like iso-8859-1) can cover many languages, but at least you can narrow it down a little if you don't find a lang="foo" attribute.

Cheers,
Kevin

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
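One way to "search for some words" is to count how many very common function words of each language appear in the page text. The following is only a minimal sketch of that idea: the word lists are tiny, hypothetical samples chosen for illustration, diacritics are ignored, and nothing here is claimed to be what Google or AltaVista actually do.

```perl
use strict;
use warnings;

# Hypothetical, very small stopword lists (illustration only). A usable
# detector would need far larger lists or character n-gram statistics.
my %STOPWORDS = (
    en => [qw(the and of to in is that for with this)],
    ro => [qw(si este pentru care sau din nu cu mai sunt)],
    fr => [qw(le les et des est pour dans que une pas)],
);

# Return the language code whose stopwords occur most often in $text,
# or undef when no stopword from any list is found.
sub guess_language {
    my ($text) = @_;

    # Count every ASCII word, lower-cased (accented letters are ignored).
    my %seen;
    $seen{ lc($_) }++ for $text =~ /([A-Za-z]+)/g;

    my ($best, $best_score) = (undef, 0);
    for my $lang (sort keys %STOPWORDS) {
        my $score = 0;
        $score += ($seen{$_} || 0) for @{ $STOPWORDS{$lang} };
        ($best, $best_score) = ($lang, $score) if $score > $best_score;
    }
    return $best;
}
```

Calling guess_language() on the visible text of a page (with the HTML tags removed first) gives a rough guess. Short pages and pages that mix languages will often be misclassified with lists this small.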
output pushing
Hi guys,

Normally, output reaches the client browser only when the whole program has finished. Is there a way to push output to the browser in a non-buffered way, line by line as in IRC clients, without having to print the whole page again? Something like appending?

--
Hytham Shehab
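A common approach to this, sketched below under some assumptions, is to turn on autoflush for the output handle so that each print is handed to the web server immediately (the same effect as setting $| = 1 for STDOUT). The helper name push_lines and the sample messages are hypothetical. Whether the browser really shows the lines one by one also depends on the web server and the browser, which may buffer on their own.

```perl
use strict;
use warnings;
use IO::Handle;

# Print each line to $out and flush it at once, so the script does not
# hold the output back until it exits.
sub push_lines {
    my ($out, @lines) = @_;
    $out->autoflush(1);              # per-handle equivalent of $| = 1
    for my $line (@lines) {
        print {$out} "$line<br>\n";  # each print is flushed immediately
    }
    return scalar @lines;
}

# Typical CGI use (hypothetical messages):
#   print "Content-Type: text/html\n\n";
#   push_lines(\*STDOUT, "step 1 done");
#   ... long-running work ...
#   push_lines(\*STDOUT, "step 2 done");
```

Because the page is simply appended to as the script runs, nothing has to be printed twice; the browser keeps the connection open until the script ends.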
Re: Getting the web page language
Does it matter what language, or what charset? You can always look at the charset the page declares and the lang="foo" attributes to try to determine what language, or at least what charset, the page is in. Of course, a charset (like iso-8859-1) can cover many languages, but at least you can narrow it down a little if you don't find a lang="foo" attribute.

Cheers,
Kevin

On Sun, Sep 08, 2002 at 08:05:18AM +0300, Octavian Rasnita ([EMAIL PROTECTED]) said something similar to:
> Hi all,
>
> I want to create a search engine. Please tell me how I can find out the
> languages used in a web page.
> I know that HTML 4.01 has a tag for this, but most web
> pages don't use it.
>
> What should I test to find the language used?
>
> Thank you.
>
> Teddy's Center: http://teddy.fcc.ro/
> Mail: [EMAIL PROTECTED]

--
[Writing CGI Applications with Perl - http://perlcgi-book.com]
You are all the Buddha. -- Buddha (last words)
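As a rough illustration of checking those hints, the sketch below pulls a lang value from the html element and a charset value from a meta element using regular expressions. The function name page_hints is hypothetical, and regexes are only an approximation here; a real crawler would more likely use an HTML parser and also look at the HTTP Content-Type header.

```perl
use strict;
use warnings;

# Return a hash ref with the 'lang' and 'charset' hints found in $html.
# Either value is undef when the page does not declare it.
sub page_hints {
    my ($html) = @_;
    my %hint = (lang => undef, charset => undef);

    # lang="..." on the <html> element, e.g. <html lang="ro">
    if ($html =~ /<html\b[^>]*\blang\s*=\s*["']?([A-Za-z-]+)/i) {
        $hint{lang} = lc $1;
    }

    # charset=... inside a <meta> element, e.g.
    # <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">
    if ($html =~ /<meta\b[^>]*\bcharset\s*=\s*["']?([\w.:-]+)/i) {
        $hint{charset} = lc $1;
    }

    return \%hint;
}
```

When both values come back undef, the page gives no explicit hint and the language has to be guessed from the text itself.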
Getting the web page language
Hi all,

I want to create a search engine. Please tell me how I can find out the languages used in a web page. I know that HTML 4.01 has a tag for this, but most web pages don't use it.

What should I test to find the language used?

Thank you.

Teddy's Center: http://teddy.fcc.ro/
Mail: [EMAIL PROTECTED]
Sendback programs output as Perl/CGI
Hi all,

Could anyone assist with the following?

Let me explain: when you run the exe below from your C: drive, the program reads the two input text files and produces two output files. In addition, the program displays output on the screen.

My request:

1) I want this screen output to be sent back to the user as a CGI / HTML page.
2) On the same page I want an email prompt that lets the user enter his/her email address. The idea is for the output files (train_out.txt and nnet_out.txt) to be emailed to the user.

I currently have this Perl script [start_nn1.pl], which does not really do what I want.

Will you be able to do this for me?

Thanks in advance.

Bruce

pogram_info.zip
Description: Zip compressed data
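A minimal sketch of the first request might look like the following: run the program, capture what it prints to the screen, escape it for HTML, and return it together with a form asking for the e-mail address. The helper names, the form field name "email" and the form target send_results.pl are all hypothetical, and the actual mailing of train_out.txt and nnet_out.txt is not shown. The list form of a piped open is assumed to be available; on older Windows builds of Perl the string form or backticks may be needed instead.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub html_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Run a command (list form, no shell involved) and return everything it
# prints to standard output.
sub capture_output {
    my (@cmd) = @_;
    open(my $pipe, '-|', @cmd) or die "cannot run $cmd[0]: $!";
    local $/;                         # slurp mode: read all output at once
    my $out = <$pipe>;
    close $pipe;
    return defined $out ? $out : '';
}

# Build the CGI response: the program output in <pre>, followed by a form
# asking for the address the result files should be mailed to.
sub result_page {
    my ($output) = @_;
    return "Content-Type: text/html\n\n"
         . "<html><body><pre>" . html_escape($output) . "</pre>\n"
         . qq{<form method="post" action="send_results.pl">\n}   # hypothetical script
         . qq{E-mail: <input type="text" name="email">\n}        # hypothetical field
         . qq{<input type="submit" value="Send files">\n}
         . "</form></body></html>\n";
}

# In the CGI script itself (the program path is a placeholder):
#   print result_page(capture_output('C:/path/to/program.exe'));
```

The second script would then read the submitted address, validate it, and mail the two output files as attachments.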
Re: Regular expression
Octavian Rasnita wrote at Sat, 07 Sep 2002 11:03:39 +0200:

> I am trying to match a word boundary or an end of string.
> I would like something like:
>
> /$word[\bX]/
>
> where X is the symbol used for end of string. I know that I can use $ but I
> don't think I can use it between brackets.
>
> I've seen that \b doesn't match the end or beginning of a string.
> I would like to know if there is another symbol that can match both these.

From perldoc perlre:

    \Z  Match only at end of string, or before newline at the end
    \z  Match only at end of string

So you can write

    /$word(\b|\Z)/

Please note that you can't use \b in a character class. From perldoc perlre:

    (Within character classes "\b" represents backspace rather than a word
    boundary, just as it normally does in any double-quoted string.)

Greetings,
Janek
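The pattern above can be tried out with a small sketch like this one. The function name ends_word is hypothetical, and \Q...\E is added as an assumption so that a $word containing regex metacharacters (such as "c++") is matched literally.

```perl
use strict;
use warnings;

# True (1) when $word occurs in $text and is followed by a word boundary
# or by the end of the string, false (0) otherwise.
sub ends_word {
    my ($text, $word) = @_;
    return $text =~ /\Q$word\E(?:\b|\Z)/ ? 1 : 0;
}

# ends_word("perl is fun", "perl")  # 1: \b matches between "l" and " "
# ends_word("perlish",     "perl")  # 0: neither \b nor \Z matches after "perl"
# ends_word("I use c++",   "c++")   # 1: \b fails after "+", but \Z matches

# Inside a character class, \b means backspace (chr 8), not a boundary:
# "a\x08" =~ /a[\b]/                # matches
# "a b"   =~ /a[\b]/                # does not match
```

The "c++" case shows where the alternation helps: \b needs a word character on one side, so it cannot match between a trailing "+" and the end of the string, while \Z can.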
Re: Splitting a string
Janek Schleicher wrote at Sat, 07 Sep 2002 10:06:34 +0200:

>>> ...
>>> print join "\n", grep defined, ($string =~ /"(.*?)"|(\w+)/g);
>>> ...
>
> The regexp matches either "(.*?)" in $1 or (\w+) in $1.

Of course, I meant that (\w+) is captured in $2.

> But the regexp always makes a list of both ($1,$2) foreach match.
> That's why half of the list consists of undefined elements,
> that have to be grepped out.
>
> It's not really nice, but it works.
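To make the effect of the undefined elements concrete, here is a small sketch wrapping the one-liner in a function (the names split_tokens and split_tokens_br and the sample string are hypothetical). The second function shows an alternative that is not from the thread: the branch-reset group (?|...) of Perl 5.10 and later, which lets both alternatives capture into $1 so no grep is needed.

```perl
use strict;
use warnings;

# Split a string into bare words and double-quoted phrases.
# Each match returns ($1, $2) with one of them undef, so the undefined
# half has to be filtered out.
sub split_tokens {
    my ($string) = @_;
    return grep { defined $_ } $string =~ /"(.*?)"|(\w+)/g;
}

# Same result with a branch-reset group (needs Perl 5.10 or later):
# both alternatives number their capture group as $1.
sub split_tokens_br {
    my ($string) = @_;
    return $string =~ /(?|"(.*?)"|(\w+))/g;
}

# split_tokens('foo "bar baz" qux') returns ('foo', 'bar baz', 'qux')
```

Without the grep, the first version would return (undef, 'foo', 'bar baz', undef, undef, 'qux') for the sample string, which is the "half undefined" list described above.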