Re: Hi, how to extract five texts on each side of an URI? I post my own perl script and its use.
From: 辉 王 [EMAIL PROTECTED]

> my $text;
> for my $left_index (1..WIDTH) {
>     last if $start_index < $left_index;
>     $text .= $texts_arr[$start_index - $left_index] . ' ';
> }
> $text .= join(' ', @texts_arr[$start_index..$end_index]) . ' ';
> for my $right_index (1..WIDTH) {
>     last if $end_index + $right_index > $#texts_arr;
>     $text .= $texts_arr[$end_index + $right_index] . ' ';
> }
> $text_hash{$url} = $text;

As far as I can tell, this could easily be rewritten with no loops. If I understand it correctly, you want to get all the texts from $start_index - WIDTH to $end_index + WIDTH, so something like:

    my $left_index = $start_index - WIDTH;
    $left_index = 0 if $left_index < 0;
    my $right_index = $end_index + WIDTH;
    $right_index = $#texts_arr if $right_index > $#texts_arr;
    my $text = join(' ', @texts_arr[$left_index .. $right_index]);

should do what you are after. There are probably other things, but this caught my eye.

Jenda
= [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =
When it comes to wine, women and song, wizards are allowed to get drunk and croon as much as they like.
  -- Terry Pratchett in Sourcery

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/ http://learn.perl.org/first-response
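The slice-based rewrite can be tried in isolation with a small script. This is only a sketch: the helper name context_window, its argument order, and the sample data are made up for illustration and do not come from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the texts from
# $start - $width to $end + $width, clamped to the array bounds.
sub context_window {
    my ($texts, $start, $end, $width) = @_;

    my $left = $start - $width;
    $left = 0 if $left < 0;

    my $right = $end + $width;
    $right = $#{$texts} if $right > $#{$texts};

    return join ' ', @{$texts}[$left .. $right];
}

# Made-up sample data: the anchor text sits at index 3.
my @texts_arr = qw(a b c LINK d e f);

print context_window(\@texts_arr, 3, 3, 2),  "\n";   # window fits inside the array
print context_window(\@texts_arr, 0, 0, 2),  "\n";   # clamped on the left
print context_window(\@texts_arr, 6, 6, 10), "\n";   # clamped on both sides
```

One difference worth checking against the intended output: the slice keeps the left-hand texts in document order, whereas the quoted loop appends them nearest-first.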
Re: Hi, how to extract five texts on each side of an URI? I post my own perl script and its use.
On Sunday 12 November 2006 13:17, 辉 王 wrote:
> I can make my program do its job at last, but it runs slowly. Can anybody tell me how to improve the running speed of this program? Thanks.

Have you had a look with the Perl profiler to see which bits are slow? That way you know where to look to make them run faster. See perldoc Devel::DProf for more information.

--
Robin [EMAIL PROTECTED] JabberID: [EMAIL PROTECTED]
Hostes alienigeni me abduxerunt. Qui annus est?
PGP Key 0xA99CEB6D = 5957 6D23 8B16 EFAB FEF8 7175 14D3 6485 A99C EB6D
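Once a profiler has pointed at a slow section, candidate rewrites of that section can be timed against each other. The sketch below uses the core Benchmark module, which is a timing tool and not the profiler recommended above. It compares a loop-based extraction in the style of the original script with a slice-based one. The sub names, the sample data, and the indexes are assumptions for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample data; real input would come from the parsed page.
my @texts_arr = map { "text$_" } 1 .. 1000;
my ($start_index, $end_index, $width) = (500, 502, 5);

# Loop-based extraction, in the style of the original script.
sub with_loops {
    my $text = '';
    for my $i (1 .. $width) {
        last if $start_index < $i;
        $text .= $texts_arr[$start_index - $i] . ' ';
    }
    $text .= join(' ', @texts_arr[$start_index .. $end_index]) . ' ';
    for my $i (1 .. $width) {
        last if $end_index + $i > $#texts_arr;
        $text .= $texts_arr[$end_index + $i] . ' ';
    }
    return $text;
}

# Slice-based extraction with clamped bounds.
sub with_slice {
    my $left = $start_index - $width;
    $left = 0 if $left < 0;
    my $right = $end_index + $width;
    $right = $#texts_arr if $right > $#texts_arr;
    return join ' ', @texts_arr[$left .. $right];
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, { loops => \&with_loops, slice => \&with_slice });
```

The numbers depend on the machine and the data, so the table is only meaningful when run on input resembling the real pages.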
RE: Hi, how to extract five texts on each side of an URI? I post my own perl script and its use.
Hui Wang mailto:[EMAIL PROTECTED] wrote:

: Can anybody tell me how to improve the running speed of this
: program? Thanks.

I don't know if this is faster, but it is a more accurate solution. Your submitted code failed under some untested circumstances. I created another page similar to the CPAN page you used and fed it more complicated tests.

Chakrabarti placed relevance on distance from the link. I changed your report to reflect this relevance. Instead of squashing all text together, it now shows a report of text token relevance. This change allowed me to test more thoroughly as well.

Here is the sample report for one link with multiple texts inside the anchor.

http://www.clarksonenergyhomes.com/scripts/index.html
    -5: 3401 MB 280 mirrors
    -4: 5501 authors 10789 modules
    -3: Welcome to CPAN! Here you will find All Things Perl.
    -2: Browsing
    -1: Perl modules
     0: Perl
     0: scripts
    +1: Perl binary distributions (ports)
    +2: Perl source code
    +3: Perl recent arrivals
    +4: recent
    +5: Perl modules

You can find the modified code here (for a short time):

Script: http://www.clarksonenergyhomes.com/chakrabarti.txt
Module: http://www.clarksonenergyhomes.com/chakrabarti.pm

HTH,

Charles K. Clarkson
--
Mobile Homes Specialist
Free Market Advocate
Web Programmer

254 968-8328
http://www.clarksonenergyhomes.com/

Don't tread on my bandwidth. Trim your posts.
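A report of this shape can be produced by labelling each text with its distance from the anchor. The following is a minimal sketch and not the code from the linked script or module: the helper name distance_report and its arguments are hypothetical, and the sample texts are borrowed from the report above.

```perl
use strict;
use warnings;

# Hypothetical helper: build "offset: text" report lines for an anchor
# spanning indexes $start..$end, with up to $width texts on each side.
sub distance_report {
    my ($texts, $start, $end, $width) = @_;

    my $first = $start - $width;
    $first = 0 if $first < 0;
    my $last = $end + $width;
    $last = $#{$texts} if $last > $#{$texts};

    my @lines;
    for my $i ($first .. $last) {
        # Negative offsets lie before the anchor, positive ones after it,
        # and every text inside the anchor gets offset 0.
        my $offset = $i < $start ? $i - $start
                   : $i > $end   ? $i - $end
                   :               0;
        my $label = $offset == 0 ? ' 0' : sprintf '%+d', $offset;
        push @lines, "$label: $texts->[$i]";
    }
    return @lines;
}

my @texts_arr = (
    'Browsing', 'Perl modules', 'Perl', 'scripts',
    'Perl binary distributions (ports)', 'Perl source code',
);

# The anchor covers indexes 2 and 3 ('Perl' and 'scripts').
print "$_\n" for distance_report(\@texts_arr, 2, 3, 5);
```

Keeping the offset next to each text, instead of joining everything into one string, lets a later step weight texts by their distance from the link.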